Statistics, Geometry, Computation: Searching for Low Dimensional Structure in High Dimensional Data

Lawrence Saul
University of California, San Diego (UCSD)

How can we detect low dimensional structure in high dimensional data? If the data is mainly confined to a low dimensional subspace, then simple linear methods can be used to discover the subspace and estimate its dimensionality. However, other richer types of structure are also possible. For example, the data may lie on or near a low dimensional manifold, it may group into clusters of unknown shapes and sizes, or it may have a rich hierarchical structure. In these cases, the overall structure will be nonlinear, and linear methods are bound to fail.
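As a rough illustration of the linear case, and not necessarily the method used in the talk, the sketch below uses principal component analysis via NumPy's SVD to estimate the dimensionality of a subspace. The function name `pca_dimension`, the 99% variance threshold, and the synthetic data are all assumptions made for this example. The second data set lies on a circle, a one dimensional manifold, and shows how a linear method overestimates the dimensionality of nonlinear structure.

```python
import numpy as np

def pca_dimension(X, var_threshold=0.99):
    """Smallest number of principal components whose cumulative
    variance reaches var_threshold (a hypothetical cutoff)."""
    Xc = X - X.mean(axis=0)                  # center the data
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
    var = s ** 2                             # proportional to component variances
    ratio = np.cumsum(var) / var.sum()       # cumulative explained variance
    return int(np.searchsorted(ratio, var_threshold) + 1)

rng = np.random.default_rng(0)

# Synthetic data confined (up to small noise) to a 3-dimensional
# subspace of a 50-dimensional space.
latent = rng.normal(size=(500, 3))
basis = rng.normal(size=(3, 50))
X_linear = latent @ basis + 1e-3 * rng.normal(size=(500, 50))
d_linear = pca_dimension(X_linear)

# Synthetic data on a circle: intrinsically 1-dimensional, but it
# spans a 2-dimensional plane, so PCA needs two components.
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
plane = rng.normal(size=(2, 50))
X_circle = circle @ plane + 1e-3 * rng.normal(size=(500, 50))
d_circle = pca_dimension(X_circle)

print("estimated dimension, linear subspace:", d_linear)
print("estimated dimension, circle:", d_circle)
```

The variance threshold is a design choice: a lower value tolerates more noise but may discard weak directions of real variation.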

The last few years have witnessed several advances in high dimensional data analysis. We now have many compelling frameworks for discovering nonlinear structure in high dimensional data. Surprisingly, the main computations in these frameworks are highly tractable: nearest-neighbor searches, eigenvalue problems, and semidefinite programming. These computations draw on ideas from convex optimization, spectral graph theory, differential geometry, and kernel methods. I will talk about our recent explorations in these areas for nonlinear dimensionality reduction and deep learning.
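To make the combination of nearest-neighbor search and an eigenvalue problem concrete, here is a minimal sketch of one well known spectral embedding, Laplacian eigenmaps. It is offered as an example of this family of methods, not as the specific algorithm presented in the talk. The function name `laplacian_eigenmap`, the neighbor count, the unweighted graph, and the synthetic spiral data are assumptions chosen for illustration, and the sketch assumes the neighbor graph is connected.

```python
import numpy as np

def laplacian_eigenmap(X, n_neighbors=10, n_components=2):
    """Embed the rows of X using the bottom nontrivial eigenvectors
    of the normalized graph Laplacian of a k-nearest-neighbor graph."""
    n = X.shape[0]

    # Step 1: brute-force nearest-neighbor search on squared distances.
    sq = np.sum(X ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist2, np.inf)          # exclude self-matches
    idx = np.argsort(dist2, axis=1)[:, :n_neighbors]

    # Step 2: symmetric, unweighted adjacency matrix.
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_neighbors)
    W[rows, idx.ravel()] = 1.0
    W = np.maximum(W, W.T)

    # Step 3: normalized Laplacian D^{-1/2} (D - W) D^{-1/2}.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.diag(deg) - W
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

    # Step 4: symmetric eigenvalue problem; eigenvalues come back ascending.
    vals, vecs = np.linalg.eigh(L_sym)

    # Skip the trivial eigenvector (eigenvalue near zero) and map back
    # to solutions of the generalized problem (D - W) y = lambda D y.
    Y = vecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]
    return Y, vals

# Synthetic example: a noisy spiral (a 1-dimensional curve) in 3 dimensions.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 4.0 * np.pi, size=300))
X = np.column_stack([t * np.cos(t), t * np.sin(t), rng.normal(scale=0.1, size=300)])
Y, eigenvalues = laplacian_eigenmap(X, n_neighbors=10, n_components=2)
print(Y.shape)
```

The dense distance matrix and full eigendecomposition keep the sketch short but scale poorly; larger data sets would typically call for a spatial index for the neighbor search and a sparse eigensolver.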

Back to Machine Reasoning Workshops I & II: Mission-Focused Representation & Understanding of Complex Real-World Data