Poster in NeurIPS 2024 Workshop: Machine Learning and the Physical Sciences
Uncertainty Quantification From Scaling Laws in Deep Neural Networks
Ibrahim Elsharkawy · Yonatan Kahn · Benjamin Hooberman
Abstract:
Quantifying the uncertainty from machine learning analyses is critical to their use in the physical sciences. In this work we focus on uncertainty inherited from the initialization distribution of neural networks. We compute the mean $\mu_{\mathcal{L}}$ and variance $\sigma_{\mathcal{L}}^2$ of the test loss $\mathcal{L}$ for an ensemble of multi-layer perceptrons (MLPs) with neural tangent kernel (NTK) initialization in the infinite-width limit, and compare empirically to the results from finite-width networks for two example tasks: MNIST classification and calorimeter energy regression. We observe scaling laws as a function of training set size $N_{\mathcal{D}}$ for both $\mu_{\mathcal{L}}$ and $\sigma_{\mathcal{L}}$, but find that the relative variance $\epsilon_{\mathcal{L}} \equiv \sigma_{\mathcal{L}}/\mu_{\mathcal{L}}$ becomes independent of $N_{\mathcal{D}}$ at both infinite and finite width for sufficiently large $N_{\mathcal{D}}$. This implies that the relative variance of a finite-width network may be approximated by its infinite-width value, and may be calculable using finite-width perturbation theory.
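The ensemble-based procedure the abstract describes can be sketched in a few lines of code. The following is a minimal NumPy illustration, not the paper's code: it trains an ensemble of one-hidden-layer MLPs in NTK parameterization (unit-variance weights with explicit $1/\sqrt{\text{fan-in}}$ factors in the forward pass) on a toy regression task standing in for MNIST or calorimeter energy regression, records each member's test loss $\mathcal{L}$, and reports the ensemble estimates of $\mu_{\mathcal{L}}$, $\sigma_{\mathcal{L}}$, and $\epsilon_{\mathcal{L}} = \sigma_{\mathcal{L}}/\mu_{\mathcal{L}}$. The task, architecture depth, and all hyperparameters (width, learning rate, ensemble size) are illustrative assumptions.

```python
# Hedged sketch: estimate mu_L, sigma_L, and eps_L = sigma_L / mu_L over an
# ensemble of finite-width MLPs at NTK initialization, on a toy regression
# task (a stand-in for the paper's MNIST / calorimeter tasks).
import numpy as np

rng = np.random.default_rng(0)

d, width = 8, 256                      # input dimension, hidden width
n_train, n_test = 256, 512             # training / test set sizes (N_D = n_train)
n_ensemble, lr, n_steps = 16, 1.0, 1000  # illustrative choices

# Toy target: a fixed random "teacher" direction (placeholder for a real task).
w_teacher = rng.normal(size=d) / np.sqrt(d)
def target(x):
    return np.tanh(x @ w_teacher)

X_tr = rng.normal(size=(n_train, d)); y_tr = target(X_tr)
X_te = rng.normal(size=(n_test, d));  y_te = target(X_te)

def init_params():
    # NTK parameterization: O(1) weights; 1/sqrt(fan_in) appears in the forward pass.
    return {"W1": rng.normal(size=(d, width)),
            "b1": np.zeros(width),
            "W2": rng.normal(size=(width, 1))}

def forward(p, X):
    h = X @ p["W1"] / np.sqrt(d) + p["b1"]      # pre-activations
    a = np.maximum(h, 0.0)                       # ReLU activations
    return (a @ p["W2"]).ravel() / np.sqrt(width), h, a

def train(p):
    # Full-batch gradient descent on the half-MSE training loss.
    for _ in range(n_steps):
        f, h, a = forward(p, X_tr)
        err = (f - y_tr) / n_train               # dL/df for L = (1/2n) * sum (f - y)^2
        gW2 = a.T @ err[:, None] / np.sqrt(width)
        dh = (err[:, None] * p["W2"].T / np.sqrt(width)) * (h > 0)  # dL/dh
        gW1 = X_tr.T @ dh / np.sqrt(d)
        gb1 = dh.sum(axis=0)
        p["W1"] -= lr * gW1; p["b1"] -= lr * gb1; p["W2"] -= lr * gW2
    return p

def test_loss(p):
    f, _, _ = forward(p, X_te)
    return np.mean((f - y_te) ** 2)              # test MSE

# Ensemble over initialization seeds: each member is independently initialized and trained.
losses = np.array([test_loss(train(init_params())) for _ in range(n_ensemble)])
mu_L, sigma_L = losses.mean(), losses.std(ddof=1)
print(f"mu_L = {mu_L:.4g}, sigma_L = {sigma_L:.4g}, eps_L = {sigma_L / mu_L:.3f}")
```

Repeating this estimate while sweeping `n_train` would trace out the scaling laws in $N_{\mathcal{D}}$ and the plateau of $\epsilon_{\mathcal{L}}$ that the abstract reports.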