TUTORIAL - Regularized Kernel Estimation (RKE) From Dissimilarity Data

Grace Wahba
University of Wisconsin
Statistics

In many problems, approximate "distances" or dissimilarity data are available between pairs of objects. Protein sequences are one example: several methods exist for computing similarity measures between them. The goal of this work is to take noisy, incomplete "distance" data and obtain a "best-fitting" positive definite function.
Since this positive definite function provides a consistent (Euclidean) set of coordinates, the result can be used to cluster unlabeled objects. Furthermore, if a labeled training set is available, new unlabeled objects can be placed in the coordinate system already obtained via a "newbie" algorithm, to be described. The (multicategory) support vector machine or a penalized likelihood estimate can then be used to classify the new object or to estimate the probability that it belongs to each class.
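
As a rough illustration of how a positive definite function yields Euclidean coordinates, the sketch below builds a positive semidefinite kernel matrix K from squared distances and reads coordinates off its eigendecomposition. It is not the RKE procedure itself, which fits noisy, incomplete data with regularization. The sketch assumes complete, noise-free squared distances and swaps in classical multidimensional scaling as a stand-in. The function names are hypothetical and NumPy is assumed to be available. The identity it relies on is that a kernel K induces squared distances K[i,i] + K[j,j] - 2*K[i,j].

```python
import numpy as np

def kernel_from_sq_distances(D2):
    """Classical-MDS stand-in (not RKE): PSD kernel from complete squared distances."""
    D2 = np.asarray(D2, dtype=float)
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ D2 @ J                 # double-centered Gram matrix
    B = (B + B.T) / 2                     # enforce exact symmetry
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    w = np.clip(w, 0.0, None)             # drop negative eigenvalues so K is PSD
    K = (V * w) @ V.T                     # V diag(w) V^T
    return K, w, V

def coordinates(w, V, dim):
    """Euclidean coordinates from the `dim` largest eigenpairs of the kernel."""
    idx = np.argsort(w)[::-1][:dim]
    return V[:, idx] * np.sqrt(w[idx])

def sq_distances_from_kernel(K):
    """Squared distances induced by a kernel: K_ii + K_jj - 2 K_ij."""
    d = np.diag(K)
    return d[:, None] + d[None, :] - 2 * K

# Usage: four corners of a 3-by-4 rectangle, known only through squared distances.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K, w, V = kernel_from_sq_distances(D2)
Y = coordinates(w, V, dim=2)              # recovered up to rotation and translation
print(np.allclose(sq_distances_from_kernel(Y @ Y.T), D2))
```

The recovered coordinates Y match the original points only up to a rigid motion, which is all that clustering or classification on the coordinates requires.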



Back to Graduate Summer School: Intelligent Extraction of Information from Graphs and High Dimensional Data