Robust Learning of a Single Neuron: Bridging Computational Gaps Using Insights from Optimization

Jelena Diakonikolas
University of Wisconsin-Madison

I will discuss recent results on learning a single neuron with ReLU and other activation functions, in the agnostic (i.e., adversarial label noise) setting, with the goal of achieving near-optimal mean square loss. The key ingredient of this work is a surrogate stochastic convex optimization problem that we show can be solved with low sample and computational complexity while achieving target error guarantees. Our results are enabled by local error bounds from optimization theory that we establish for this problem under mild distributional assumptions that capture sub-exponential, heavy-tailed (with polynomial tails), and even some discrete distributions. Surprisingly, for all these distributional families, the constant in the error bound is an absolute constant, independent of the problem dimension, and this is crucial for establishing our results. We then discuss generalizations of this result to other activation functions, including the much more challenging case in which the activation function is unknown.
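For orientation, a minimal sketch of what a convex surrogate of this kind can look like for a monotone activation sigma (e.g., ReLU): the construction below is the standard "matching loss" surrogate and is shown purely as an illustration under that assumption; the surrogate problem studied in the talk may differ in its details.

```latex
% Illustration (assumption): the standard "matching loss" convex surrogate for a
% monotone activation \sigma (e.g., \sigma(t) = \max\{0, t\} for ReLU); the talk's
% surrogate may differ in its details.
\[
  \mathcal{L}(w) \;=\; \mathbb{E}_{(x,y)}\big[(\sigma(w^\top x) - y)^2\big]
  \qquad \text{(agnostic square-loss objective; non-convex in } w \text{ for ReLU)}
\]
\[
  \mathcal{L}_{\mathrm{sur}}(w) \;=\; \mathbb{E}_{(x,y)}\!\left[\int_{0}^{w^\top x} \big(\sigma(t) - y\big)\, dt\right],
  \qquad
  \nabla^2 \mathcal{L}_{\mathrm{sur}}(w) \;=\; \mathbb{E}\big[\sigma'(w^\top x)\, x x^\top\big] \succeq 0,
\]
% so the surrogate is convex whenever \sigma is non-decreasing, and its stationary
% points satisfy \mathbb{E}[(\sigma(w^\top x) - y)\, x] = 0.
```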

