Invited Talk
in
Workshop: Mathematics of Modern Machine Learning (M3L)
Adaptivity in Domain Adaptation and Friends
Samory Kpotufe
Domain adaptation, transfer, multitask, meta, few-shot, or lifelong learning … these are all important recent directions in ML that touch on the core of what we might mean by 'AI'. As these directions all concern learning in heterogeneous and ever-changing environments, they share a central question: what information a 'source' distribution may carry about a 'target' distribution, or put differently, which measures of discrepancy between distributions properly model such information.
Our understanding of this central question is still rather fledgling, with both positive and negative results. On one hand, we show that traditional notions of distance and divergence between distributions (e.g., Wasserstein, TV, KL, Renyi) are in fact too conservative: a source may be 'far' from a target under such traditional notions, yet still carry much useful information about the target distribution. We then turn to the existence of 'adaptive' procedures, i.e., procedures that can optimally leverage such information in the source data without any prior distributional knowledge. Here the picture is quite nuanced: while various existing approaches turn out to be adaptive in the usual setting with a single source and a single hypothesis class, no procedure can guarantee optimal rates adaptively in more general settings, e.g., settings with multiple source datasets (as in multitask learning) or with multiple hypothesis classes (as in model selection or hyperparameter tuning).
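As a minimal numerical sketch of the first point (an illustration under assumed distributions and classifiers, not an example taken from the talk itself): take a shared labeling function y = 1{x > 0} under covariate shift, with source marginal N(0, 1) and target marginal N(0, 0.1^2). The two marginals are far apart under traditional divergences (KL roughly 47, TV roughly 0.8), yet a threshold classifier fit on source data alone nearly recovers the target-optimal rule, because the source places ample mass around the decision boundary. In Python:

import numpy as np

rng = np.random.default_rng(0)

def label(x):
    # Shared (noiseless) labeling function under covariate shift.
    return (x > 0.0).astype(int)

# Source marginal P = N(0, 1); target marginal Q = N(0, 0.1^2).
# KL(P || Q) = log(sigma_Q/sigma_P) + sigma_P^2 / (2 sigma_Q^2) - 1/2 ~ 47,
# and TV(P, Q) ~ 0.8, so P is "far" from Q under traditional divergences.
n_src = 2000
x_src = rng.normal(0.0, 1.0, size=n_src)
y_src = label(x_src)

# ERM over threshold classifiers h_t(x) = 1{x > t}, fit on source data only:
# any t between the largest negative example and the smallest positive
# example attains zero training error on the source sample.
t_hat = 0.5 * (x_src[y_src == 0].max() + x_src[y_src == 1].min())

# Evaluate the source-trained threshold on the target distribution.
x_tgt = rng.normal(0.0, 0.1, size=100_000)
target_err = np.mean((x_tgt > t_hat).astype(int) != label(x_tgt))
print(f"learned threshold: {t_hat:.4f}, target error: {target_err:.5f}")

The target error comes out essentially zero because the source, though 'far' from the target in KL and TV, samples the boundary region densely; a source whose mass avoided the boundary would show the opposite behavior with the same code.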
Such negative results raise new questions, as they suggest that domain adaptation and related problems may benefit from more structure in practice than captured by current formalisms.
The talk is based on joint work with collaborators over the last few years, namely G. Martinet, S. Hanneke, J. Suk, Y. Mahdaviyeh, and N. Galbraith.