Projection Robust Wasserstein Distance and Riemannian Optimization
Darren Lin, Chenyou Fan, Nhat Ho, Marco Cuturi, Michael Jordan
Spotlight presentation: Orals & Spotlights Track 32: Optimization
on 2020-12-10T19:50:00-08:00 - 2020-12-10T20:00:00-08:00
Poster Session 7
on 2020-12-10T21:00:00-08:00 - 2020-12-10T23:00:00-08:00
GatherTown: Theory ( Town E1 - Spot D2 )
Abstract: Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance. Recent work suggests that this quantity is more robust than the standard Wasserstein distance, in particular when comparing probability measures in high dimensions. However, it has been ruled out for practical applications because the underlying optimization problem is essentially non-convex and non-smooth, which makes the computation intractable. Our contribution in this paper is to revisit the original motivation behind WPP/PRW, but take the hard route of showing that, despite its non-convexity and nonsmoothness, and even despite some hardness results proved by Niles-Weed and Rigollet (2019) in a minimax sense, the original formulation for PRW/WPP can be efficiently computed in practice using Riemannian optimization, yielding in relevant cases better behavior than its convex relaxation. More specifically, we provide three simple algorithms with solid theoretical guarantees on their complexity bounds (one in the appendix), and demonstrate their effectiveness and efficiency by conducting extensive experiments on synthetic and real data. This paper provides a first step toward a computational theory of the PRW distance and links optimal transport with Riemannian optimization.
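To make the idea in the abstract concrete, here is a minimal NumPy sketch of one way to approximate a PRW-style quantity by Riemannian gradient ascent over matrices U with orthonormal columns (the Stiefel manifold). It is an illustration under stated assumptions, not the authors' implementation: the function names (sinkhorn, second_moment, prw_sketch), the uniform weights, the entropic regularization strength eta, the fixed step size, and the QR retraction are all choices made for this example.

```python
import numpy as np

def sinkhorn(a, b, C, eta, n_iter=200):
    """Entropic-regularized transport plan between weights a and b for cost C."""
    K = np.exp(-C / eta)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def second_moment(X, Y, P):
    """V = sum_ij P_ij (x_i - y_j)(x_i - y_j)^T, computed without an n*m*d*d tensor."""
    r, c = P.sum(axis=1), P.sum(axis=0)
    XPY = X.T @ P @ Y
    return (X.T * r) @ X + (Y.T * c) @ Y - XPY - XPY.T

def projected_cost(X, Y, U):
    """Pairwise squared distances between the projected points U^T x_i and U^T y_j."""
    XU, YU = X @ U, Y @ U
    return ((XU[:, None, :] - YU[None, :, :]) ** 2).sum(axis=-1)

def prw_sketch(X, Y, k, eta=1.0, step=0.05, n_outer=100, seed=0):
    """Hypothetical helper: alternate a Sinkhorn solve with a Riemannian ascent step on U."""
    rng = np.random.default_rng(seed)
    (n, d), m = X.shape, Y.shape[0]
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # assumption: uniform weights
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random point on the Stiefel manifold
    for _ in range(n_outer):
        P = sinkhorn(a, b, projected_cost(X, Y, U), eta)
        G = 2.0 * second_moment(X, Y, P) @ U          # Euclidean gradient of tr(U^T V U)
        UtG = U.T @ G
        xi = G - U @ (UtG + UtG.T) / 2.0              # projection onto the tangent space at U
        # QR retraction; column sign flips are harmless because the cost depends only on U U^T.
        U, _ = np.linalg.qr(U + step * xi)
    C = projected_cost(X, Y, U)
    P = sinkhorn(a, b, C, eta)
    return float((P * C).sum()), U, P

# Small synthetic usage: two point clouds that differ mainly along one coordinate.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
Y = rng.standard_normal((30, 5))
Y[:, 0] += 2.0
value, U, P = prw_sketch(X, Y, k=2, n_outer=50)
print("approximate projected transport cost:", value)
```

The inner Sinkhorn solve stands in for the minimization over transport plans, and the outer loop handles the maximization over projections. With a small eta or large costs, the plain exponential kernel can underflow, so a log-domain Sinkhorn would be the safer choice outside of toy data.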