Statistical modeling of microarray data to discover cell cycle dependent transcripts

Lue Ping Zhao
University of Washington/Fred Hutchinson Cancer Research Center
Public Health Sciences

Functional genomics, aided by microarray and oligonucleotide technologies, has been increasingly used in biomedical research and has generated an impressive amount of gene expression data, which may hold secrets of life and disease progression. Uncovering such secrets from this "ocean of genomic data", however, appears to be challenging without the aid of computational tools. In this talk, I will introduce a statistical modeling approach for analyzing genomic expression data while simultaneously normalizing them. Specifically, we postulate a regression-type model to depict the relationship of averaged expression levels with a covariate, which may be binary in a two-group comparison or continuous in a time-course study. Further, this model is indexed by a set of additive and multiplicative factors to account for heterogeneity between multiple chips. Under this model, we describe an estimating equation technique to estimate the relevant parameters, and use bootstrap and permutation techniques to make robust inferences while controlling the genome-wide significance level at, for example, 5%. To illustrate this approach, we apply it to an expression study on leukemia, with the primary objective of discovering genes that are differentially expressed between patients with acute lymphoblastic leukemia (ALL) and 11 patients with acute myeloid leukemia (AML). The analysis yielded a list of 35 genes that are significantly differentially expressed between the two groups at the genome-wide significance level of 5%.
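
As a rough illustration of the two ideas in the abstract (chip-level additive and multiplicative factors, and permutation-based control of a genome-wide significance level), the following Python sketch may help. It is not the estimating-equation method of the talk: it assumes a simpler stand-in in which each chip is centered by its median and scaled by its interquartile range, and it uses a max-statistic permutation test for a binary covariate. All function names, the simulated data, and the parameter values are hypothetical.

```python
import numpy as np

def normalize_chips(expr):
    """Remove a per-chip additive shift and multiplicative scale.

    expr is a genes x chips array. Median centering and IQR scaling are
    an assumed stand-in for the additive and multiplicative chip factors.
    """
    center = np.median(expr, axis=0)
    q75, q25 = np.percentile(expr, [75, 25], axis=0)
    return (expr - center) / (q75 - q25)

def group_diff_stats(expr, groups):
    """Welch-type t-statistic per gene, group 1 versus group 0."""
    a = expr[:, groups == 0]
    b = expr[:, groups == 1]
    diff = b.mean(axis=1) - a.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return diff / se

def max_stat_permutation(expr, groups, n_perm=500, seed=0):
    """Genome-wide adjusted p-values from the permutation null of the
    maximum absolute statistic over all genes."""
    rng = np.random.default_rng(seed)
    observed = np.abs(group_diff_stats(expr, groups))
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(groups)  # break the gene-group link
        max_null[i] = np.abs(group_diff_stats(expr, shuffled)).max()
    # Share of permutations whose largest statistic reaches the observed one
    exceed = (max_null[None, :] >= observed[:, None]).sum(axis=1)
    return observed, (1 + exceed) / (n_perm + 1)

# Simulated example (hypothetical data): 200 genes on 20 chips,
# 10 chips per group, with the first 5 genes shifted in group 1.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1], 10)
signal = rng.normal(size=(200, 20))
signal[:5, groups == 1] += 5.0
chip_shift = rng.normal(size=20)             # additive chip factor
chip_scale = rng.uniform(0.5, 2.0, size=20)  # multiplicative chip factor
raw = chip_shift + chip_scale * signal

normalized = normalize_chips(raw)
stats, adj_p = max_stat_permutation(normalized, groups)
selected = np.flatnonzero(adj_p <= 0.05)     # genome-wide 5% level
print("genes selected:", selected)
```

Comparing each gene against the permutation distribution of the maximum statistic, rather than its own null, is one common way to hold the genome-wide error rate at the chosen level. The talk's approach instead estimates the chip factors and the regression parameters jointly.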

