Skip to yearly menu bar Skip to main content


Poster

Optimal Algorithms for Learning Partitions with Faulty Oracles

Adela DePavia · Olga Medrano Martin del Campo · Erasmo Tani

Poster Room - TBD
[ ]
Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract: We consider a clustering problem where a learner seeks to partition a finite set by querying a faulty oracle. This models applications where learners crowdsource information from non-expert human workers or conduct noisy experiments to determine group structure. The learner aims to exactly recover a partition by submitting queries of the form ``are $u$ and $v$ in the same group?'' for any pair of elements $u$ and $v$ in the set. Moreover, because the learner only has access to faulty sources of information, they require an error-tolerant algorithm for this task: i.e. they must fully recover the correct partition, even if up to $\ell$ answers are incorrect, for some error-tolerance parameter $\ell$. We study the question: for any given error-tolerance $\ell$, what is the minimum number of queries needed to learn a finite set partition of $n$ elements into $k$ groups? We design algorithms for this task and prove that they achieve optimal query complexity. To analyze our algorithms, we first highlight a novel connection between this task and correlation clustering. We then use this connection to build a R\'enyi-Ulam style analytical framework for this problem, which yields matching lower bounds. Our analysis also reveals an inherent asymmetry between the query complexity necessary to be robust against false negative errors as opposed to false positive errors.

Live content is unavailable. Log in and register to view live content