Statistical methods for cancer cluster validation in microarray experiments

Sandrine Dudoit
Stanford University
Biochemistry

A reliable and precise classification of tumors is essential for successful diagnosis and treatment of cancer. cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies that are increasingly used in cancer research. By monitoring the expression levels of thousands of genes simultaneously, these techniques may lead to a more complete understanding of the molecular variation among tumors and hence to a finer and more informative classification.

We discuss applications of re-sampling methods to the cluster analysis (unsupervised learning) of tumors using gene expression data from microarray experiments. A re-sampling method, known as bagging in discriminant analysis, is applied to increase clustering accuracy and to assess the confidence of cluster assignments for individual observations. A second re-sampling method is proposed to estimate the number of tumor clusters, if any. We compare the performance of these approaches with that of existing methods on simulated data and cancer microarray data.
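
To make the bagging idea concrete, the following is a minimal sketch of one way bootstrap aggregation could be applied to clustering. It is an illustration under stated assumptions, not the procedure presented in the talk: the abstract does not name a base clustering procedure, so plain k-means is assumed here, and the function names (kmeans, align, bagged_clustering) and the parameters n_boot and seed are hypothetical. Each bootstrap sample is clustered, every original observation is assigned to its nearest bootstrap centroid, the labels are permuted to agree as closely as possible with a reference clustering of the full data, and the votes are tallied. The majority vote gives the bagged assignment and the share of votes for the winning label serves as a per-observation confidence.

```python
import itertools
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(points, k, rng, iters=20):
    """Lloyd's k-means with farthest-point seeding (assumed base clusterer)."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # New centroid is the group mean; an empty group keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def align(reference, labels, k):
    """Relabel `labels` with the permutation that best agrees with `reference`."""
    best = max(
        itertools.permutations(range(k)),
        key=lambda perm: sum(perm[l] == r for l, r in zip(labels, reference)),
    )
    return [best[l] for l in labels]


def bagged_clustering(points, k, n_boot=50, seed=0):
    """Return (bagged labels, per-observation confidence in [0, 1])."""
    rng = random.Random(seed)
    n = len(points)
    # Reference clustering of the full data, used only to align label names.
    ref_centroids = kmeans(points, k, rng)
    reference = [nearest(p, ref_centroids) for p in points]
    votes = [[0] * k for _ in range(n)]
    for _ in range(n_boot):
        # Bootstrap: draw n observations with replacement.
        sample = [points[rng.randrange(n)] for _ in range(n)]
        centroids = kmeans(sample, k, rng)
        labels = align(reference, [nearest(p, centroids) for p in points], k)
        for i, label in enumerate(labels):
            votes[i][label] += 1
    final = [max(range(k), key=v.__getitem__) for v in votes]
    confidence = [max(v) / n_boot for v in votes]
    return final, confidence
```

In this sketch, points would be a list of expression profiles, one tuple of numbers per tumor sample. Observations whose confidence is well below 1 are those whose assignment changes across bootstrap samples. The exhaustive search over label permutations is only practical for a small number of clusters.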

