Poster in Workshop: Mathematics of Modern Machine Learning (M3L)
Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu · Da Kuang · Surbhi Goel
Existing research often posits that spurious features are "easier" to learn than core features in neural network optimization, but the nuanced impact of their relative simplicity remains under-explored. In this paper, we propose a theoretical framework and an associated synthetic dataset grounded in boolean function analysis. Our framework allows fine-grained control over both the relative complexity (compared to core features) and the correlation strength (with respect to the label) of spurious features. Experimentally, we observe that stronger spurious correlations or simpler spurious features slow the rate at which networks trained with (stochastic) gradient descent learn the core features. Perhaps surprisingly, we also observe that spurious features are not forgotten even after the network has perfectly learned the core features. We provide theoretical justification for these observations in the special case of learning parity features with a one-hidden-layer network. Our findings explain the success of last-layer retraining for accelerating core-feature convergence and identify limitations of debiasing algorithms that exploit early learning of spurious features. We corroborate our findings through experiments on real-world vision datasets, thereby validating the practical relevance of our framework.
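To make the setup concrete, below is a minimal sketch (not the authors' released code) of how such a boolean dataset could be generated: the label is the parity over a set of "core" coordinates whose size sets the core feature's complexity, while a parity over a (typically smaller, hence simpler) set of "spurious" coordinates is forced to agree with the label with a chosen probability. The function name `make_boolean_dataset` and all parameter names are illustrative assumptions.

```python
import numpy as np

def make_boolean_dataset(n, d, core_idx, spurious_idx, corr, seed=0):
    """Sample n points x in {-1,+1}^d with a parity core feature and a
    correlated parity spurious feature (assumed setup, for illustration).

    - The label y is the parity of the coordinates in core_idx; the size
      of core_idx sets the core feature's complexity.
    - The parity over spurious_idx (assumed disjoint from core_idx and
      typically of lower degree, i.e. "simpler") is set to agree with y
      with probability `corr`, controlling the spurious correlation.
    """
    rng = np.random.default_rng(seed)
    x = rng.choice([-1, 1], size=(n, d))
    y = np.prod(x[:, core_idx], axis=1)

    # Target value of the spurious parity: +y with prob corr, -y otherwise.
    target = np.where(rng.random(n) < corr, y, -y)

    # Flip one spurious coordinate wherever the spurious parity
    # disagrees with its target, so P(spurious parity == y) = corr.
    spur = np.prod(x[:, spurious_idx], axis=1)
    x[:, spurious_idx[0]] *= target * spur
    return x.astype(np.float32), y.astype(np.float32)

# Example: a degree-3 core parity with a simpler degree-1 spurious feature
# that matches the label on 90% of examples.
X, y = make_boolean_dataset(n=10_000, d=20, core_idx=[0, 1, 2],
                            spurious_idx=[3], corr=0.9)
```

Varying the sizes of `core_idx` and `spurious_idx` adjusts the relative complexity of the two features, while `corr` adjusts the correlation strength, mirroring the two knobs described in the abstract.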