Prediction of Protein Structure and Function on a Proteomic Scale

Jeffrey Skolnick
State University of New York, Buffalo

A novel, proteomic-scale, protein sequence-to-structure-to-function approach has been developed. For single-domain proteins, we demonstrate that at the level of low-to-moderate resolution structures, the library of already solved protein structures in the Protein Data Bank (PDB) is essentially complete. That is, the global fold of essentially all single-domain proteins is found among the already solved structures. Exploiting this information and a unified approach to protein structure prediction, TASSER, we have developed a methodology that can consistently bring template-based alignments closer to native. We have applied TASSER to a comprehensive, representative benchmark of small- to medium-sized target proteins in the PDB where the target and template proteins are at best weakly homologous. Our results suggest that about 2/3 of proteins lacking any significant homology to a known template protein can be successfully folded. We further demonstrate that TASSER also yields good results on membrane proteins. Next, we investigated the utility of such predicted structures for biochemical function inference and conclude that structures whose backbone root mean square deviation from native is below 4 Å are quite useful. We then applied the threading component of TASSER to a large number of genomes; over 70% of the ORFs in a typical genome have at least one domain that can be assigned to a known fold. We next describe the extension of threading to quaternary structure prediction, MULTIPROSPECTOR, present results for the S. cerevisiae genome, and show that MULTIPROSPECTOR is competitive with large-scale experimental approaches. Finally, we conclude with a discussion of an approach to the automated assignment of proteins to pathways.

Back to Workshop III: Structural Proteomics