Offline and Online Learning for Contextual Stochastic Optimization

Paul Grigas
University of California, Berkeley (UC Berkeley)

Often the parameters of an optimization task are predicted based on contextual features, and it is desirable to leverage the structure of the underlying optimization task when training a machine learning model. A natural loss function in this setting is based on considering the cost of the decisions induced by the predicted parameters, in contrast to standard measures of prediction error. We propose several practical and theoretically sound ways of overcoming the computational challenge of directly optimizing this loss function. In the case of linear objectives, we propose the use of a novel convex surrogate loss function, called the “Smart Predict-then-Optimize+ (SPO+)” loss function. In the offline learning situation, we prove that the SPO+ loss function is statistically consistent and develop corresponding quantitative risk bounds under mild conditions. In the case of nonlinear objectives, we consider an integrated approach for estimating the conditional distribution of parameters given the features that incorporates the downstream optimization task. We then consider an online variant of our setting with resource constraints, where a decision-maker first predicts a reward vector and resource consumption matrix based on a given context vector and then makes a decision. We prove regret bounds that are sublinear with rate depending on the corresponding offline risk bounds of the surrogate loss used to learn the prediction model. We also conduct numerical experiments to empirically demonstrate the strength of our proposed methods in the online setting. This talk is based on a series of papers jointly with Adam Elmachtoub, Othman El Balghiti, Ambuj Tewari, Heyuan Liu, Meng Qi, and Max Shen.

Back to Long Programs