Proper regularizers for semi-supervised learning

Dejan Slepcev
Carnegie Mellon University

We consider a standard problem of semi-supervised learning: given a data set (regarded as a point cloud in a Euclidean space) with a small number of labeled points, the task is to extrapolate the label values to the whole data set. To utilize the geometry of the data set, one creates a graph by connecting nodes that are sufficiently close. Many standard approaches rely on minimizing graph-based functionals that reward agreement with the labels and regularity of the estimator. Choosing a good regularization leads to questions about the relation between discrete functionals in the random setting and continuum nonlocal and differential functionals. We will discuss how insights about this relation, together with results about the functionals, provide ways to properly choose the functionals for semi-supervised learning and to appropriately set the weights of the graph so that information is propagated consistently from the labeled points. In particular, we propose a consistent way to overcome the issues observed in Laplacian learning. This talk is based on joint works with Calder, Dunlop, Stuart, and Thorpe.


Workshop III: Geometry of Big Data