Manifolds model wide classes of physical systems. This talk is concerned with fitting non-parametric manifold models to large scientific data sets, motivated by the need to relieve the scientist of the responsibility for selecting parameters and validating models by visual inspection. I will present components of a coherent framework that allows a user to semi-automatically select the neighborhood scale, compute an embedding of the data, estimate and correct its distortions, and find physical interpretations of the embedding coordinates.
The last task is solved by a novel method called ManifoldLasso, which selects, from a dictionary of functions, a subset that can *non-linearly* parametrize the manifold.
The entire pipeline is implemented in megaman, an open-source Python package for scalable manifold learning.
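To make the embedding step of the pipeline concrete, here is a minimal sketch of a Laplacian-eigenmaps-style embedding written with NumPy only. It is not megaman's API and not necessarily the construction used in the talk: the function name `laplacian_eigenmap`, the parameter `radius` (standing in for the neighborhood scale), the Gaussian kernel, and the noisy-circle test data are all assumptions chosen for illustration. The dense eigendecomposition used here would not scale to large data sets.

```python
import numpy as np


def laplacian_eigenmap(X, radius, n_components=2):
    """Embed the rows of X with the bottom eigenvectors of a graph Laplacian.

    Hypothetical helper for illustration; `radius` is the assumed
    neighborhood scale of the Gaussian kernel.
    """
    # Pairwise squared Euclidean distances (clipped to avoid tiny negatives).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # Gaussian affinities with no self-loops.
    W = np.exp(-d2 / radius**2)
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order.
    evals, evecs = np.linalg.eigh(L)

    # Drop the trivial eigenvector (eigenvalue near 0) and rescale by
    # D^{-1/2} to obtain the random-walk Laplacian eigenvectors.
    Y = evecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]
    return Y, evals[:n_components + 1]


# Synthetic data: a noisy circle embedded in three dimensions.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 300)
X = np.column_stack([np.cos(theta), np.sin(theta), np.zeros(300)])
X += 0.01 * rng.normal(size=X.shape)

Y, evals = laplacian_eigenmap(X, radius=0.5)
print(Y.shape)
```

In this sketch the choice of `radius` is made by hand, which is exactly the kind of manual parameter selection the framework described above aims to semi-automate.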
Joint work with Dominique Perrault-Joncas, James McQueen, Jacob VanderPlas, Zhongyue Zhang, Grace Telford, Yu-chia Chen, Samson Koelle, and Hanyu Zhang.