

Talk in Workshop: I Can’t Believe It’s Not Better! Bridging the gap between theory and empiricism in probabilistic machine learning

Invited Talk: Mike Hughes - The Case for Prediction Constrained Training

Michael Hughes


Abstract:

This talk considers adding supervision to well-known generative latent variable models (LVMs), including both classic LVMs (e.g. mixture models, topic models) and more recent “deep” flavors (e.g. variational autoencoders). The standard way to add supervision to LVMs would be to treat the added label as another observed variable generated by the graphical model, and then maximize the joint likelihood of both labels and features. We find that across many models, this standard supervision leads to surprisingly negligible improvement in prediction quality over a more naive baseline that first fits an unsupervised model, and then makes predictions given that model’s learned low-dimensional representation. We can’t believe it is not better!

Further, this problem is not properly solved by previous approaches that just upweight or “replicate” labels in the generative model (the problem is not just that we have more observed features than labels). Instead, we suggest the problem is related to model misspecification, and that the joint likelihood objective does not properly encode the desired performance goals at test time (we care about predicting labels from features, but not features from labels).

This motivates a new training objective we call prediction-constrained training, which can prioritize the label-from-feature prediction task while still delivering reasonable generative models for the observed features. We highlight promising results of our proposed prediction-constrained framework, including recent extensions to semi-supervised VAEs and model-based reinforcement learning.
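The abstract describes the objective only in words. As a rough illustration, the sketch below contrasts a symmetric joint-likelihood loss with a loss that heavily weights the label-from-feature prediction term. The Lagrangian form, the weight `lam`, and the toy numbers are assumptions made for illustration, not the speaker’s implementation, and the sketch deliberately glosses over how the conditional p(y | x) is evaluated, which is where the real asymmetry lives.

```python
# A minimal sketch (not the speaker's code) contrasting a standard joint-likelihood
# objective with a prediction-constrained (PC) style objective. Assumption: PC training
# can be written as a Lagrangian relaxation of "maximize log p(x) subject to the
# label-prediction loss staying below a tolerance". The weight `lam` and the toy
# numbers below are illustrative only.

import numpy as np

def joint_likelihood_loss(log_px, log_py_given_x):
    """Standard supervised LVM training: maximize log p(x, y) = log p(x) + log p(y | x).
    The label enters with the same weight as every other observed variable."""
    return -(log_px + log_py_given_x)

def prediction_constrained_loss(log_px, log_py_given_x, lam=10.0):
    """PC-style training: keep the generative term for x, but weight the
    label-from-feature prediction term so it is prioritized (lam >> 1)."""
    return -(log_px + lam * log_py_given_x)

# Toy per-example terms: the many observed feature dimensions carry far more
# log-likelihood mass than the single label.
log_px = np.array([-120.0, -95.0])        # log p(x | theta), generative term
log_py_given_x = np.array([-0.7, -2.3])   # log p(y | x, theta), prediction term

print("joint-likelihood loss:", joint_likelihood_loss(log_px, log_py_given_x).mean())
print("prediction-constrained loss:", prediction_constrained_loss(log_px, log_py_given_x).mean())
```

Per the abstract, the point is not merely the weight: the prediction term is meant to be evaluated the way labels are predicted at test time, from features alone, which is what separates this from simply replicating labels inside the generative model.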