Exploiting contextual similarity for semantic prediction in the Biology domain

Karin Verspoor
Los Alamos National Laboratory

Current biological analysis techniques have achieved great success in gene sequencing for a multitude of organisms. The Human Genome Project, for instance, has established the complete sequence of each human chromosome, identifying approximately 30,000 genes. But what do all of these genes do? Determining the function of each of them is a daunting task. Most automated methods consider only syntactic similarity, assuming that genes with similar amino acid sequences also have similar functions. We introduce a method that takes advantage of both syntactic similarity and semantic similarity within the target function space, represented by the Gene Ontology. The system we have developed is called POSOLE, the POSet Ontology Laboratory Environment. POSOLE consists of a set of modules supporting ontology representation, categorization of nodes in the ontology, and analysis. We will discuss how we have taken advantage of the structure of function space to enable better gene function prediction than purely syntactic methods.
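
To make the general idea concrete, here is a minimal sketch of how ontology structure can sharpen a prediction drawn from sequence neighbours. It is not POSOLE's algorithm, which the abstract does not specify. The toy ontology, the term names, the hit weights, and the functions `ancestors`, `depth`, `score_terms`, and `predict` are all assumptions made for illustration. Each sequence-similar gene contributes its similarity weight to every term at or above its own annotations, and the prediction is the most specific term that still collects a chosen share of the total weight.

```python
from collections import defaultdict

# Hypothetical toy ontology: each term maps to its parent terms (a DAG / poset).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
}

def ancestors(term, parents=PARENTS):
    """Return the term together with every term above it in the ontology."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents[t])
    return seen

def depth(term, parents=PARENTS):
    """Length of the longest path from the term up to a root."""
    ps = parents[term]
    return 0 if not ps else 1 + max(depth(p, parents) for p in ps)

def score_terms(hits, parents=PARENTS):
    """hits: list of (similarity_weight, annotated_terms) for sequence neighbours.
    Each hit adds its weight once to every term at or above its annotations."""
    scores = defaultdict(float)
    for weight, terms in hits:
        covered = set()
        for t in terms:
            covered |= ancestors(t, parents)
        for t in covered:
            scores[t] += weight
    return dict(scores)

def predict(hits, min_support=0.6, parents=PARENTS):
    """Pick the most specific term backed by at least min_support of the total weight."""
    total = sum(w for w, _ in hits)
    scores = score_terms(hits, parents)
    supported = [t for t, s in scores.items() if s >= min_support * total]
    return max(supported, key=lambda t: (depth(t, parents), scores[t]))

# Assumed example data: three sequence neighbours with made-up similarity weights.
hits = [(0.9, ["dna_binding"]), (0.7, ["rna_binding"]), (0.4, ["kinase"])]
print(predict(hits))
```

In this toy case no single leaf term carries enough weight, but the two strongest neighbours agree on their shared parent, so the sketch returns `binding` rather than committing to the best single hit's annotation. A purely syntactic method that copies the top hit would have reported `dna_binding`.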

Back to Workshop I: Dynamic Searches and Knowledge Building