NIPS Poster Probabilistic Topic Coding for Superset Label Learning

Poster

Probabilistic Topic Coding for Superset Label Learning

Liping Liu · Thomas Dietterich

Harrah’s Special Events Center 2nd Floor

[ Abstract ]

Abstract:

In the superset label learning problem, each training instance provides a set of candidate labels of which one is the true label of the instance. Most approaches learn a discriminative classifier that tries to minimize an upper bound of the unobserved 0/1 loss. In this work, we propose a probabilistic model, Probabilistic Topic Coding (PTC), for the superset label learning problem. The PTC model is derived from logistic stick breaking process. It first maps the data to ``topics'', and then assigns to each topic a label drawn from a multinomial distribution. The layer of topics can capture underlying structure in the data, which is very useful when the model is weakly supervised. This advantage comes at little cost, since the model introduces few additional parameters. Experimental tests on several real-world problems with superset labels show results that are competitive or superior to the state of the art. The discovered underlying structures also provide improved explanations of the classification predictions.

Live content is unavailable. Log in and register to view live content