

Poster

Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner

Hanwen Zhong · Jiaxin Chen · Yutong Zhang · Di Huang · Yunhong Wang

Thu 12 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Multi-Task Learning (MTL) aims to enhance model capability by tackling multiple tasks simultaneously. Recent MTL research has predominantly focused on designing Mixture-of-Experts (MoE) structures and integrating Low-Rank Adaptation (LoRA) for efficient multi-task learning. However, their rigid combination hampers both the optimization of MoE and the reparameterization capability of LoRA, leading to sub-optimal performance and lower inference speed than the original Transformer. In this work, we propose Efficient Multi-Task Learning (EMTAL), a novel approach that transforms a pre-trained Transformer into an efficient multi-task learner during training and reparameterizes the learned knowledge back into the original Transformer for efficient inference. Specifically, we decompose the pre-trained Transformer into a low-rank MoE structure and employ LoRA to fine-tune the experts, termed MoEfied LoRA. We then account for the intrinsically asynchronous nature of task optimization by designing a Quality Retaining (QR) multi-task optimization strategy, which uses historical high-quality class logits to prevent well-trained tasks from degrading. Finally, we introduce a fading strategy that integrates the learned LoRA parameters back into the original Transformer for efficient inference. Extensive experiments on public benchmarks demonstrate the superiority of our method over existing multi-task learning approaches.
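The reparameterization idea central to the abstract — training with an added low-rank branch, then folding it back so inference uses only the original dense weights — can be illustrated with a minimal sketch. This is not the authors' code; the dimensions, rank, and variable names are assumed for illustration, and the full MoEfied LoRA structure (expert routing, fading) is omitted:

```python
import numpy as np

# Illustrative sketch (assumed setup, not EMTAL itself): LoRA trains a
# low-rank update B @ A alongside a frozen weight W; at inference the
# update is folded back into W, so the deployed model retains the original
# Transformer's structure and speed.
rng = np.random.default_rng(0)
d, r = 8, 2                      # hidden size and LoRA rank (assumed values)
W = rng.standard_normal((d, d))  # pre-trained weight (frozen during tuning)
A = rng.standard_normal((r, d))  # LoRA down-projection (trained)
B = rng.standard_normal((d, r))  # LoRA up-projection (trained)
x = rng.standard_normal(d)

y_train = W @ x + B @ (A @ x)    # training-time path: base + low-rank branch
W_merged = W + B @ A             # fold the low-rank update into the base weight
y_infer = W_merged @ x           # inference path: a single dense matmul

assert np.allclose(y_train, y_infer)
```

Because the merged weight is exactly equivalent to the two-branch computation, inference incurs no extra cost over the original Transformer, which is the efficiency property the abstract contrasts against rigid MoE + LoRA combinations.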
