Skip to yearly menu bar Skip to main content


Poster

Balancing Multimodal Learning with Classifier-guided Gradient Modulation

Zirun Guo · Tao Jin · Jingyuan Chen · Zhou Zhao

[ ] [ Project Page ]
Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Multimodal learning has developed very fast in recent years. However, during the multimodal training process, the model tends to rely on only one modality based on which it could learn faster, thus leading to inadequate use of other modalities. Existing methods to balance the training process always have some limitations on the loss function, optimizers and the number of modalities and only consider modulating the magnitude of the gradients while ignoring the directions of the gradients. To solve these problems, in this paper, we present a novel method to balance multimodal learning with Classifier-Guided Gradient Modulation (CGGM), considering both the magnitude and directions of the gradients. We conduct extensive experiments on four multimodal datasets: UPMC-Food 101, CMU-MOSI, IEMOCAP and BraTS 2021, covering classification, regression and segmentation tasks. The results show that CGGM outperforms all the baselines and other gradient modulation methods consistently, demonstrating its effectiveness and versatility. Our codes will be publicly available.

Live content is unavailable. Log in and register to view live content