Archetype analysis: a framework for selecting representative objects

Volker Roth
Universität Basel

Archetype analysis identifies representative objects within a set of multivariate data such that the data can be expressed as convex combinations of these representative objects. (What counts as an "object" depends on the context: it might be a biological species, a particle, a molecule, etc.) In contrast to mixture-based cluster analysis, where the focus is on selecting cluster centroids that serve as prototypes (i.e. summary descriptors) of their respective clusters, archetype analysis attempts to select the "purest" or "most extreme" objects. In this talk I will address many facets of this very generic analysis method, ranging from a biological motivation to questions of model selection, outlier sensitivity, data representation, algorithms, and applications. On the technical side, I will discuss some aspects of "forward-stagewise" and "Frank-Wolfe" algorithms, convex hull approximations, and semi-parametric copulas.
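
To make the idea of a convex combination concrete, here is a minimal Python sketch of one ingredient only: given a fixed set of archetypes, it uses a Frank-Wolfe iteration with exact line search to find non-negative weights summing to one that best reconstruct a single data point. The function name `convex_weights`, its parameters, and the toy triangle data are assumptions made for illustration and are not taken from the talk. Full archetype analysis also has to select the archetypes themselves, which this sketch does not attempt.

```python
import numpy as np

def convex_weights(x, archetypes, n_iter=500):
    """Illustrative Frank-Wolfe sketch (hypothetical helper, not the talk's method).

    Minimises ||x - w @ archetypes||^2 over weights w on the probability simplex.
    archetypes has shape (k, d); x has shape (d,).
    """
    k = archetypes.shape[0]
    w = np.full(k, 1.0 / k)                      # start at the simplex barycentre
    for _ in range(n_iter):
        residual = w @ archetypes - x            # current reconstruction error
        grad = 2.0 * archetypes @ residual       # gradient with respect to w
        i = int(np.argmin(grad))                 # best simplex vertex (linear minimisation step)
        direction = archetypes[i] - w @ archetypes
        denom = direction @ direction
        if denom == 0.0:                         # already sitting on the chosen archetype
            break
        # Exact line search for the quadratic objective, clipped to [0, 1]
        gamma = np.clip(-(residual @ direction) / denom, 0.0, 1.0)
        w = (1.0 - gamma) * w                    # shrink all weights ...
        w[i] += gamma                            # ... and move mass to vertex i
    return w

# Toy data (assumed): three archetypes forming a triangle in the plane
archetypes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.2, 0.3])
w = convex_weights(x, archetypes)
```

Because each update is itself a convex combination of the current weights and a simplex vertex, the weights stay non-negative and sum to one throughout, so no separate projection step is needed.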


Workshop I: Machine Learning Meets Many-Particle Problems