Controlling overfitting (i.e., when the decision rules obtained fit the training samples extremely well, but fail to "generalize" well and perform poorly on the true distribution) is a long standing topic of study in machine learning. Regularization is a widely used technique to control overfitting where a penalty is added to the cost function (typically the classification or regression error). The success of regularization in a host of different algorithms is usually interpreted as coming from penalizing the complexity of the resulting decision rules favoring "simple" rules. In this talk we propose a different perspective to learning base on robust optimization. That is, assuming that each sample corrupted by a certain disturbance, we find the best decision under the most adversarial disturbance. We show that a special choice of the disturbance exactly recovers the solution obtained by penalizing complexity via regularization. Both Support Vector Machines and Lasso can be re-derived from a robust optimization perspective.
The equivalence relationship between regularization and robustness gives a physical interpretation of the regularization process. Moreover, it helps us explain from a robustness point of view why support vector machines are consistent, and why Lasso produces sparse solutions.
Generalizing these results we use the robustness perspective to derive new algorithms in new domains that have both favorable statistical and computational properties.
Back to Workshop IV: Robust Optimization