Multiscale Geometry and Analysis in High Dimensions

September 7 - December 17, 2004

Overview

In the past decade, a number of independent developments in mathematical analysis, in computer vision, in pattern recognition, and in statistical analysis have independently produced tools and theories which can now be seen to be closely related as parts of an area which should be called Multiscale Geometric Analysis (MGA). One of the most exciting developments of this discipline is a now-emerging theory of analysis and geometry in high dimensions. The purpose of this program is to crystallize this emerging field and to stimulate cross-disciplinary exchanges which will accelerate its formation and development.

In MGA, the goal is to detect, organize, represent, and manipulate data which nominally span a high-dimensional space but which contain important features which are approximately concentrated on lower-dimensional subsets (curves, sheets, etc.). This is a problem that comes up in image analysis where edges are important features, in volumetric 3D imaging where filaments and tubes are important, and in high-dimensional statistical data analysis where the data cloud concentrates near special low-dimensional subsets. The tools of MGA range from multiscale approximation of data in dyadic cubes by k-dimensional flats, as in Jones’ traveling salesman theorem; to multiscale radon transformation, as in beamlet analysis; to special space-frequency tilings, as in curvelet analysis. In many situations where natural operators of physics arise (e.g. scattering) these analytic methods are combined with a detailed multiscale geometric decomposition of the underlying set. Computational tools and mathematical theories are under development, and some initial results are very impressive. For example, MGA provides in mathematical analysis, a complete understanding of when a pointset actually lies on a lower-dimensional rectifiable set a corresponding understanding of what portion of a probability measure lies within a small neighborhood of a lower-dimensional rectifiable set; data compression via multiscale least squares approximation in approximation theory, data compression of objects with edges which achieves performance levels far beyond existing wavelet coders, and which are in fact asymptotically optimal; and in statistical signal processing, detectors for faint curvilinear structures in noisy images which are orders of magnitude more sensitive than previously proposed methods based e.g. on edge detectors, and again are provably asymptotically optimal. There are exciting emerging applications of these ideas in particle physics data analysis, in computer vision, and, most recently, in the analysis of massive digital astronomical catalogs.

NEW DIRECTIONS: While some of the above topics are in a certain sense well understood from a mathematical perspective, several are not and the potential applications are still relatively undeveloped. Thus this proposal includes members of scientific disciplines where we believe the chances for interaction are greatest. For example in materials science one potential application is the development of a theory of the geometry of microstructures. In medical imaging there have been many recent breakthroughs, and there are hints that these methods may also prove valuable in materials science. One of the fundamental tools in high dimensional analysis is the Singular Value Decomposition, where new algorithms have been introduced by Rokhlin. The use of SVD in the study of languages has gained acceptance over the past decade and is now a central tool. A speculative area that has been proposed is the study of texts as high dimensional objects, where massive amounts could be scanned for certain pre-selected structures. In Computer Science, ideas of this type already appear in the search algorithm used by Google, which uses the Perron-Frobenius Theorem (philosophically akin to the SVD) to weight edges of a graph. In hydrodynamics the velocity field is a six dimensional object whose geometry has only begun to be explored. In high dimensional analysis the manner in which a data set can be explored is by no means obvious, and innovative methods, such as those introduced by Inselberg, require further exploration.

On the side of theory there are two areas the program will stress. First, the development of multiscale techniques in relatively small dimensions (two to six, for example) is very far from over. We believe there are a host of new structures and basis functions waiting to be developed. As an example one can take a gray-scale image, which is inherently two dimensional, and instead of looking at the value at each pixel, consider also the values at the eight neighboring pixels. This gives a vector valued function where the values lie in a nine dimensional space. The geometry of the image is now rather different from what one started with and the investigation of these structures has only begun. The use of special bases in three dimensions is another new area, where beamlets provide the beginning of a theory of geometrically based functions. Yet to be developed are bases that provide useful description of surfaces in the range from smooth to more fractal-like, and versions that also take into account related functions (e.g. a velocity field). This could lead to a new theory of correlations between geometric structures arising in areas of physics. The development of this general area of bases has been driven by applications and there are certain to be many more applications in the coming years before a program in 2005.

Our knowledge of geometric structures in high (e.g. ten) dimensions or in very high (e.g. one thousand) dimensions has only begun. The basic tool used here is dimensional reduction via SVD. One finds the first few principal components of a data set using SVD, and then projects on those coordinates. What has been realized in the past few years is that one can continue this construction in a multiscale manner to elucidate smaller structures in the data set. But the number of ways of continuing or using stopping time rules is very large and a general methodology is lacking. The point here is that the underlying “physics” of the data set determine the proper way to compress or analyze. Another area where more results can be anticipated is the proper way of using methods to sample data sets while respecting geometry. A sample application would be to investigate the parameter space of metallic alloys. The parameters here include the elements used as well as the conditions of heating and cooling. Another example would be the analysis of texts where one searches for searches for several structures which also have a special “geometric” relation to each other. The search for relations in data sets arising in genomics is another area with high promise for this methodology, and we plan to include specialists in our program. This area of high dimensional analysis is so large that one could easily devote an entire program to it. We believe that the correct way to start is to use the hard-won results from function theory in low dimensions as a guide, and build on that experience. The program would provide an outstanding meeting ground for specialists from several areas.

A final area of interest is the development of methodologies for problems where one must use simultaneous multiscale function techniques and multiscale geometry to study natural operators defined on sets. Here the physics of the problem dictates which expansions should be used for the functions, and which decomposition of the set at various scales should be used to turn the operator in question into one of low rank. Scattering provides a natural example of such a problem. The mathematics encountered here can be viewed as the new (numerical version of) Calderon-Zygmund Theory. The most classical aspects of that theory use “only” projections of functions defined on cubes to functions of mean value zero. While CZ theory has been heavily developed since the 1950’s to introduce any number of variants (including wavelet theory) the correct computational version has yet to be developed in the context of function (plus operator) theory on sets. Here we note that this is the case even for oscillatory integrals on the line, where the functions are supported on a given collection of intervals. An appealing aspect of IPAM is that UCLA has been a center of Fourier analysis in the CZ style, and this could provide outstanding chances for new contributions by pure mathematicians.

We expect this workshop to lead to a new synthesis of ideas and numerous valuable collaborations and initiatives. Participants will have the opportunity to interact with researchers from pure math, applied math, statistics, computer science, industry, and government labs.

Organizing Committee

Emmanuel Candes (California Institute of Technology)

Ronald Coifman (Yale University)

David Donoho (Stanford University)

Peter Jones (Yale University)

Francois Meyer (University of Colorado)

Naoki Saito (University of California at Davis)

Jean-Luc Starck (CEA Saclay, France)