Poster
Trade-Offs of Diagonal Fisher Information Matrix Estimators
Alexander Soen · Ke Sun
West Ballroom A-D #5710
The Fisher information matrix can be used to characterize the local geometry ofthe parameter space of neural networks. It elucidates insightful theories anduseful tools to understand and optimize neural networks. Given its highcomputational cost, practitioners often use random estimators and evaluate onlythe diagonal entries. We examine two popular estimators whose accuracy and samplecomplexity depend on their associated variances. We derive bounds of thevariances and instantiate them in neural networks for regression andclassification. We navigate trade-offs for both estimators based on analyticaland numerical studies. We find that the variance quantities depend on thenon-linearity w.r.t. different parameter groups and should not be neglected whenestimating the Fisher information.