Poster
in
Workshop: UniReps: Unifying Representations in Neural Models

Understanding Permutation Based Model Merging with Feature Visualizations

Congshu Zou · Geraldin Nanfack · Stefan Horoi · Eugene Belilovsky

Keywords: [ Linear Mode Connectivity ] [ Feature Visualization ] [ Convolutional Neural Network ] [ Deep Learning ] [ Model Merging ]


Abstract:

Linear mode connectivity (LMC) has become a topic of great interest in recent years. It has been empirically demonstrated that popular deep learning models trained from different initializations exhibit linear mode connectivity up to permutation. Building on this, several approaches have been proposed for finding a permutation of a model's features or weights, leading to popular model merging methods. These methods enable the simple averaging of two models' weights to create a new high-performance model. However, beyond accuracy, the properties of merged models and their relationship to the representations of the models they derive from are poorly understood. In this work, we study the inner mechanisms behind LMC in model merging through the lens of classic feature visualization methods. Focusing on convolutional neural networks (CNNs), we make several observations that shed light on the underlying mechanisms of model merging by permutation and averaging.
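The "permute and average" idea the abstract refers to can be illustrated with a minimal sketch. The code below is a hypothetical toy example (not the authors' implementation): for a single hidden layer of two two-layer networks A and B, it finds the permutation of B's hidden units that best aligns them with A's units (by brute force over permutations, viable only for a handful of units), applies that permutation consistently to the weights on both sides of the layer, and then averages the aligned weights.

```python
# Toy sketch of "permute then average" merging for one hidden layer of
# two two-layer nets A and B. All names and shapes are illustrative.
import itertools
import numpy as np

def best_permutation(w_a, w_b):
    """Find the permutation of B's hidden units that best aligns them with
    A's, maximizing the total dot product between matched input-weight rows
    (brute-force search; fine for a handful of units)."""
    n = w_a.shape[0]
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(n)):
        score = sum(np.dot(w_a[i], w_b[p]) for i, p in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, list(perm)
    return best_perm

def permute_and_average(w_a, v_a, w_b, v_b):
    """Permute B's hidden units to match A's, then average the weights.
    w_*: (hidden, in) first-layer weights; v_*: (out, hidden) second-layer."""
    perm = best_permutation(w_a, w_b)
    w_b_aligned = w_b[perm]      # reorder hidden-unit rows
    v_b_aligned = v_b[:, perm]   # reorder the matching columns downstream
    return 0.5 * (w_a + w_b_aligned), 0.5 * (v_a + v_b_aligned)

# Sanity check: if B is an exact hidden-unit permutation of A, aligning
# and averaging recovers A's weights exactly.
rng = np.random.default_rng(0)
w_a = rng.normal(size=(4, 3))
v_a = rng.normal(size=(2, 4))
perm = [2, 0, 3, 1]
w_b, v_b = w_a[perm], v_a[:, perm]
w_m, v_m = permute_and_average(w_a, v_a, w_b, v_b)
assert np.allclose(w_m, w_a) and np.allclose(v_m, v_a)
```

In practice, merging methods in this line of work replace the brute-force search with a scalable assignment solver over weight or activation similarities, but the permute-both-sides-then-average structure is the same.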
