From gene expression data to cancer class discovery

Amir Ben-Dor
Agilent Technologies
Agilent Laboratories

Recent studies have demonstrated that putative disease subtypes can be discovered from gene expression data. The underlying computational problem is to partition the set of sample tissues into statistically meaningful classes. We approach this problem by statistically scoring candidate partitions according to the overabundance of genes that separate the different classes, where overabundance is measured against a stochastic null model. Using simulated annealing, we explore the space of all possible partitions of the set of samples, seeking high-scoring partitions. We demonstrate the performance of our methods both on synthetic data, where we recover planted partitions, and on tumor expression datasets, where we find several highly pronounced partitions. Joint work with Nir Friedman and Zohar Yakhini.
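
The search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: partitions are restricted to two classes, a gene is said to "separate" the classes when a single expression threshold misclassifies at most max_errors samples, and the score is simply the count of such genes. The calibration against a stochastic null model mentioned in the abstract is omitted here. The names tnom, score, and anneal, and the cooling parameters, are hypothetical.

```python
import math
import random


def tnom(values, labels):
    """Fewest misclassified samples over all single-threshold rules for one gene.

    Illustrative simplification: assumes no tied expression values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_ones = sum(labels)
    total_zeros = len(labels) - total_ones
    best = min(total_ones, total_zeros)  # threshold below every sample
    ones_left = zeros_left = 0
    for i in order:
        if labels[i] == 1:
            ones_left += 1
        else:
            zeros_left += 1
        # Rule A predicts class 0 left of the threshold, rule B predicts class 1.
        errors_a = ones_left + (total_zeros - zeros_left)
        errors_b = zeros_left + (total_ones - ones_left)
        best = min(best, errors_a, errors_b)
    return best


def score(data, labels, max_errors=1):
    """Count genes (rows of data) that separate the two classes almost perfectly.

    A crude stand-in for an overabundance score: no null model is applied.
    """
    return sum(1 for gene in data if tnom(gene, labels) <= max_errors)


def anneal(data, n_samples, steps=2000, t_start=2.0, cooling=0.995, seed=0):
    """Simulated annealing over two-class partitions; requires n_samples >= 2."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    if len(set(labels)) < 2:
        labels[0] = 1 - labels[0]  # make sure both classes are present
    current = score(data, labels)
    best_labels, best = labels[:], current
    temp = t_start
    for _ in range(steps):
        i = rng.randrange(n_samples)
        labels[i] = 1 - labels[i]  # propose moving one sample to the other class
        if len(set(labels)) < 2:
            labels[i] = 1 - labels[i]  # reject proposals that empty a class
            continue
        candidate = score(data, labels)
        delta = candidate - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best:
                best_labels, best = labels[:], current
        else:
            labels[i] = 1 - labels[i]  # undo the rejected move
        temp *= cooling  # geometric cooling schedule
    return best_labels, best
```

Here data is a list of genes, each a list of expression values with one entry per sample, and anneal(data, n_samples) returns the best partition seen together with its score. Because this simplified score is flat over most of the partition space, a practical version would likely need a smoother, null-model-calibrated score of the kind the abstract describes.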
