In-person presentation in Workshop: Attributing Model Behavior at Scale (ATTRIB)
What Neural Networks Memorize and Why (Vitaly Feldman)
Deep learning algorithms tend to fit the entire training dataset (nearly) perfectly, including mislabeled examples and outliers. In extreme cases, complex models even appear to memorize entire input examples, including seemingly irrelevant information (for example, social security numbers appearing in text). We provide a simple conceptual explanation and a theoretical model demonstrating that memorization of labels is necessary for achieving close-to-optimal generalization error when learning from long-tailed data distributions. We also describe natural prediction problems for which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples; this remains true even when most of that information is ultimately irrelevant to the task at hand. Finally, we demonstrate the utility of memorization and support our explanation empirically. These results rely on a new technique for efficiently estimating the memorization and influence of individual training data points.
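As a rough illustration of the kind of subsampling-based estimator the abstract alludes to, the sketch below estimates a per-example memorization score as the gap between a model's accuracy on example i when i was included in its training subset and when it was held out, averaged over many models trained on independent random subsets. The `train_fn` callable and the model's `predict` method are hypothetical placeholders for whatever training pipeline is used; this is a minimal sketch of a leave-one-out-style estimator, not necessarily the talk's exact method.

```python
import numpy as np

def estimate_memorization(train_fn, dataset, n_models=100, subset_frac=0.7, seed=0):
    """Estimate per-example memorization scores via subsampled training.

    Trains `n_models` models on independent random subsets of `dataset`
    (a list of (x, y) pairs). For each example i, the score is
        P[model correct on (x_i, y_i) | i in training subset]
      - P[model correct on (x_i, y_i) | i not in training subset],
    with both probabilities estimated empirically over the trained models.
    `train_fn(examples)` is a hypothetical callable returning a model
    with a `predict(x)` method.
    """
    rng = np.random.default_rng(seed)
    n = len(dataset)
    m = int(subset_frac * n)
    correct = np.zeros((2, n))  # row 1: example in subset, row 0: held out
    counts = np.zeros((2, n))   # models contributing to each case per example
    for _ in range(n_models):
        subset = rng.choice(n, size=m, replace=False)
        included = np.zeros(n, dtype=bool)
        included[subset] = True
        model = train_fn([dataset[i] for i in subset])
        for i, (x, y) in enumerate(dataset):
            row = int(included[i])
            correct[row, i] += float(model.predict(x) == y)
            counts[row, i] += 1
    # Accuracy gap; np.maximum guards against an example never landing in a case.
    return (correct[1] / np.maximum(counts[1], 1)
            - correct[0] / np.maximum(counts[0], 1))
```

The influence of a training example on a particular test example can be estimated in the same way, by replacing the accuracy on (x_i, y_i) with the accuracy on the test example of interest.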