Transitive Functional Annotation by Shortest Path Analysis of Gene Expression Data

Xianghong Zhou
Harvard

We propose that transitive expression similarity among genes can be used as an important attribute for clustering genes of the same biological pathway. Using large-scale yeast microarray expression data, we assigned functions to 146 yeast genes that are considered unknown by both SGD and YPD.

Current methods for the functional analysis of microarray gene expression data implicitly assume that genes with similar expression profiles have similar cellular functions. However, among genes involved in the same biological pathway, not all gene pairs show high expression similarity. Here, we propose that transitive expression similarity among genes can be used as an important attribute for clustering genes of the same biological pathway. Using large-scale yeast microarray expression data, we apply shortest-path analysis to identify transitive genes between two given genes from the same biological process. We find that this approach identifies not only functionally related genes with correlated expression profiles, but also functionally related genes whose profiles are not correlated. For the latter case, we compare our method to hierarchical clustering and show that our method reveals functional relationships among genes more precisely. Finally, we show that our method can reliably predict the function of unknown genes from known genes lying on the same shortest path. We assigned functions to 146 yeast genes that are considered unknown by both SGD and YPD. These constitute around 5% of the unknown yeast ORFome.
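
To make the idea of transitive expression similarity concrete, the following minimal Python sketch builds a graph from expression profiles and finds the shortest path between two genes. It is an illustration under stated assumptions, not the procedure used in this work: the use of Pearson correlation, the edge weight 1 - |r|, the 0.6 threshold, the helper names (pearson, build_graph, shortest_path), and the toy genes geneA, geneB, and geneC are all hypothetical choices made for the example.

```python
import heapq
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def build_graph(profiles, min_abs_corr=0.6):
    """Connect gene pairs whose |r| reaches the threshold (assumed value).

    The edge weight 1 - |r| is an assumption: stronger correlation means
    a shorter edge.
    """
    genes = list(profiles)
    graph = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = abs(pearson(profiles[g], profiles[h]))
            if r >= min_abs_corr:
                graph[g][h] = graph[h][g] = 1.0 - r
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the list of genes on the path, or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy profiles: geneA and geneC are uncorrelated with each other (r = 0),
# but each correlates with geneB (r is about 0.71).
profiles = {
    "geneA": [1, 0, -1, 0],
    "geneB": [1, 1, -1, -1],
    "geneC": [0, 1, 0, -1],
}
graph = build_graph(profiles)
path = shortest_path(graph, "geneA", "geneC")
print(path)  # ['geneA', 'geneB', 'geneC']
```

In this toy case geneA and geneC share no direct edge, yet they are linked through the transitive gene geneB, which is the kind of relationship a direct pairwise similarity measure would miss.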