Poster Session in Workshop: Scientific Methods for Understanding Neural Networks
Training Dynamics of Convolutional Neural Networks for Learning the Derivative Operator
Erik Wang · Yongji Wang · Ching-Yao Lai
Despite significant interest in developing new methods for scientific machine learning, the research community at large lacks a thorough understanding of how existing methods behave. For instance, while deep multi-layer perceptrons are known to exhibit a bias toward low-frequency data, it is less certain whether this phenomenon holds for other methods and how it manifests. We investigate the training dynamics of convolutional neural networks in the context of operator learning and find that high-frequency components of the input signal are generally learned before the low-frequency components, followed by the amplitudes of the frequency distribution. Our results also show that networks trained on a range of frequencies tend to perform better on high-frequency data, but that increasing the model's kernel size can narrow this accuracy gap and generally improve performance. This trend does not hold for deeper models, suggesting that larger kernel sizes may have a stronger influence on training stability and model accuracy.
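As a rough illustration of the problem setup, and not the authors' code or architecture, the sketch below treats the derivative operator as a 1D convolution and measures error per input frequency. It makes several assumptions that do not come from the abstract: a single linear convolution layer fitted in closed form by least squares stands in for a trained CNN, signals are periodic sinusoids on a uniform grid, and the function names, grid size, and frequencies are all hypothetical.

```python
import numpy as np

def make_data(freqs, n=128):
    """Sample sin(2*pi*k*x) on a periodic grid over [0, 1) with exact derivatives."""
    x = np.arange(n) / n
    u = np.stack([np.sin(2 * np.pi * k * x) for k in freqs])
    du = np.stack([2 * np.pi * k * np.cos(2 * np.pi * k * x) for k in freqs])
    return u, du

def shifted_features(u, kernel_size):
    """Stack circular shifts so that a 1D cross-correlation becomes a matrix product.

    Output shape is (signals, n, kernel_size); column j holds u[i + j - half].
    """
    half = kernel_size // 2
    shifts = range(-half, half + 1)
    return np.stack([np.roll(u, -s, axis=1) for s in shifts], axis=-1)

def fit_kernel(u, du, kernel_size):
    """Least-squares fit of one convolution kernel mapping u to du (assumed stand-in for training)."""
    A = shifted_features(u, kernel_size).reshape(-1, kernel_size)
    w, *_ = np.linalg.lstsq(A, du.reshape(-1), rcond=None)
    return w

def relative_error(w, u, du):
    """Per-signal relative L2 error of the fitted kernel."""
    pred = shifted_features(u, len(w)) @ w
    return np.linalg.norm(pred - du, axis=1) / np.linalg.norm(du, axis=1)

# Hypothetical experiment: fit on a range of frequencies, compare two kernel sizes.
freqs = [1, 2, 4, 8, 16]
u, du = make_data(freqs)
for kernel_size in (3, 7):
    w = fit_kernel(u, du, kernel_size)
    errs = relative_error(w, u, du)
    print(kernel_size, dict(zip(freqs, np.round(errs, 4))))
```

Because this stand-in is linear and solved in closed form, it has no training dynamics and says nothing about the order in which frequencies are learned; it only shows how a per-frequency accuracy comparison across kernel sizes could be organized.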