Poster

ProTransformer: Robustify Transformers via Plug-and-Play Paradigm

Zhichao Hou · Weizhi Gao · Yuchen Shen · Feiyi Wang · Xiaorui Liu

Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Transformer-based architectures have dominated various areas of machine learning in recent years. In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. Crucially, this technique can be integrated into existing transformers as a plug-and-play layer, improving their robustness without the need for additional training or fine-tuning. Through comprehensive experiments and ablation studies, we demonstrate that our ProTransformer significantly enhances the robustness of transformer models across a variety of prediction tasks, attack mechanisms, backbone architectures, and data domains. Notably, without further fine-tuning, the ProTransformer consistently improves the performance of vanilla transformers by 19.5%, 28.3%, 16.1%, and 11.4% for BERT, ALBERT, DistilBERT, and RoBERTa, respectively, under the classical TextFooler attack. Furthermore, ProTransformer shows promising resilience in large language models (LLMs) against prompting-based attacks, improving the performance of T5 and LLaMA by 24.8% and 17.8%, respectively, and enhancing Vicuna by an average of 10.4% against the Jailbreaking attack. Beyond the language domain, ProTransformer also demonstrates outstanding robustness in both vision and graph domains.
