
Poster

Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise

Shuyao Li · Sushrut Karmalkar · Ilias Diakonikolas · Jelena Diakonikolas

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract: We study the problem of learning a single neuron with respect to the $L_2^2$-loss in the presence of adversarial distribution shifts, where the labels can be arbitrary, and the goal is to find a "best-fit" function. More precisely, given training samples from a reference distribution $p_0$, the goal is to approximate the vector $\mathbf{w}^*$ which minimizes the squared loss with respect to the worst-case distribution that is close in $\chi^2$-divergence to $p_0$. We design a computationally efficient algorithm that recovers a vector $\hat{\mathbf{w}}$ satisfying $\mathbb{E}_{p^*} (\sigma(\hat{\mathbf{w}} \cdot \mathbf{x}) - y)^2 \leq C \, \mathbb{E}_{p^*} (\sigma(\mathbf{w}^* \cdot \mathbf{x}) - y)^2 + \epsilon$, where $C > 1$ is a dimension-independent constant and $(\mathbf{w}^*, p^*)$ is the witness attaining the min-max risk $\min_{\mathbf{w}: \|\mathbf{w}\| \leq W} \max_{p} \mathbb{E}_{(\mathbf{x}, y) \sim p} (\sigma(\mathbf{w} \cdot \mathbf{x}) - y)^2 - \nu \chi^2(p, p_0)$. Our algorithm follows the primal-dual framework and is designed by directly bounding the risk with respect to the original, nonconvex $L_2^2$ loss. From an optimization standpoint, our work opens new avenues for the design of primal-dual algorithms under structured nonconvexity.
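To make the min-max objective above concrete, the following is a minimal primal-dual sketch, not the paper's algorithm: it runs gradient descent on $\mathbf{w}$ against gradient ascent on the adversary. The assumptions here are ours: $\sigma$ is taken to be ReLU, the worst-case distribution $p$ is modeled as a reweighting $r$ (the density ratio $dp/dp_0$) of the empirical sample, and simple projection steps enforce the norm constraint on $\mathbf{w}$ and keep $r$ approximately a probability reweighting.

```python
import numpy as np

# Illustrative primal-dual sketch for the chi^2-penalized min-max risk
#   min_{||w|| <= W}  max_p  E_p[(sigma(w.x) - y)^2] - nu * chi^2(p, p0),
# with p represented as per-sample weights r >= 0, mean(r) = 1.
# This is a hedged sketch under our own assumptions, not the authors' method.

def relu(z):
    return np.maximum(z, 0.0)

def primal_dual_sketch(X, y, W=1.0, nu=1.0, eta_w=1e-2, eta_r=1e-2, iters=2000):
    n, d = X.shape
    w = np.zeros(d)
    r = np.ones(n)  # density ratio dp/dp0 on the sample; r = 1 recovers p0
    for _ in range(iters):
        z = X @ w
        pred = relu(z)
        loss = (pred - y) ** 2  # per-sample squared loss

        # Adversary ascent on mean(r * loss) - nu * mean((r - 1)^2):
        r += eta_r * (loss - 2.0 * nu * (r - 1.0))
        # Project r back to (approximately) a probability reweighting.
        r = np.maximum(r, 0.0)
        if r.mean() > 0:
            r /= r.mean()

        # Learner descent on the reweighted squared loss; ReLU is
        # nondifferentiable at 0, so we use the subgradient 1{z > 0}.
        grad_w = 2.0 * X.T @ (r * (pred - y) * (z > 0).astype(float)) / n
        w -= eta_w * grad_w

        # Project w onto the norm ball of radius W.
        norm = np.linalg.norm(w)
        if norm > W:
            w *= W / norm
    return w
```

Because the squared loss through $\sigma$ is nonconvex in $\mathbf{w}$, plain alternating gradient steps like these carry no guarantee of converging to the min-max witness; obtaining a constant-factor approximation despite this structured nonconvexity is precisely the technical contribution the abstract describes.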
