Graduate Summer School: Intelligent Extraction of Information from Graphs and High Dimensional Data

July 11 - 29, 2005

Overview

In recent years, there has been a rapidly increasing demand for targeted analysis of large data streams and large networks. One of the goals has been identification of key features: face recognition in video streams and voice recognition in audio streams are two examples. Another goal has been inference of relationships: pattern discovery in large databases and determination of key links in social networks. At the same time, a number of scientific disciplines have come together to develop a theory for the analysis of high-dimensional data, as well as for the analysis of dynamic processes on massive graphs. The new techniques and new mathematics coming out of this line of research are ideally suited to a wide range of applications.

Applications and connections to real challenges will be drawn from: data fusion, automated feature extraction, face and shape recognition, spectral and hyperspectral image analysis, relational data mining, link analysis and discovery, graph mining, social and transactional networks, robust network design (making networks hard to break), optimal epidemic intervention (making networks easy to break), and hidden state inference (locating targets from indirect measurements).

The summer school is intended for graduate students and postdocs, as well as more senior researchers interested in focusing their efforts on these mathematical challenges and applications. The program is organized as follows.

  • Week 1: High-dimensional data, relational data and kernel methods. The ubiquitous nature of high-dimensional data, combined with the difficulties presented by them, argues for the importance of finding models for their analysis. At the same time, large collections of relational data present the challenge of detecting and inferring factual information from sparse evidence. This week will highlight research in dimensionality reduction, as well as methods of graph mining and relational data mining.
  • Week 2: Image analysis and machine learning. The importance of image data for the validation of scientific theories in the form of large-scale computations underscores the need for principled metrics on data in those image spaces. This week will explore topics involving image detection as well as learning from image, voice and text data. Such problems are integral to building efficient algorithms for automatic detection of targets (such as faces), classification of patterns (face recognition) and prediction of important events (extreme event prediction).
  • Week 3: Streaming data and networks. There is a rapidly growing need for effective methods in addressing problems on large distributed networks. Problems associated with dynamics on and of networks are largely unexplored. This week will provide a further focus on graph mining as well as on analysis of streaming data, and will involve such topics as network tomography, moving neighborhood networks, dynamic network analysis and social networks.

Organizing Committee

Edmond Chow (D.E. Shaw Research & Development)
Tina Eliassi-Rad (Lawrence Livermore National Laboratory)
Yann LeCun (New York University)
Carey Priebe (Johns Hopkins University)
Kevin Vixie, Chair (Los Alamos National Laboratory, T-7, Mathematical Modeling and Analysis)