Finding family-specific protein structural motifs is an important problem in computational biology because these motifs are useful for protein function prediction and classification. Discriminative subgraph mining methods have been shown to be effective at finding these motifs in protein graphs when knowledge of functional sites is unavailable. We will present the state-of-the-art methods developed in our lab and demonstrate their advantages over approaches based on frequent subgraphs. We have also investigated the benefit of allowing approximate matches in edges and nodes. Our experiments show that these methods discover high-quality motifs and improve both classification accuracy and runtime efficiency across all protein families of sufficient sample size. They complement methods based on sequence and global structure alignment for protein function prediction when no significantly similar proteins are present in the database.
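To illustrate the distinction between frequent and discriminative subgraphs, the following is a minimal sketch and not the methods presented in the talk. It assumes a toy graph representation (a dict of node labels plus a set of undirected edges), exact label matching, a brute-force containment test that is only practical for very small graphs, and a simple support-difference score. The helper names `make_graph`, `contains`, `support`, and `discriminative_score`, as well as the residue labels in the example data, are hypothetical.

```python
from itertools import permutations


def make_graph(labels, edges):
    """Toy labeled graph: `labels` maps node id -> label, `edges` are undirected pairs."""
    return {"labels": labels, "edges": {frozenset(e) for e in edges}}


def contains(graph, pattern):
    """Brute-force test that `pattern` occurs in `graph` as a label-preserving subgraph.

    Tries every injective mapping of pattern nodes onto graph nodes, so it is
    exponential in the pattern size and suitable only for tiny examples.
    """
    g_nodes = list(graph["labels"])
    p_nodes = list(pattern["labels"])
    for image in permutations(g_nodes, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        # Node labels must match exactly (no approximate matching in this sketch).
        if any(pattern["labels"][p] != graph["labels"][mapping[p]] for p in p_nodes):
            continue
        # Every pattern edge must map onto an edge of the graph.
        if all(frozenset(mapping[u] for u in e) in graph["edges"] for e in pattern["edges"]):
            return True
    return False


def support(graphs, pattern):
    """Fraction of graphs that contain the pattern."""
    return sum(contains(g, pattern) for g in graphs) / len(graphs)


def discriminative_score(family, background, pattern):
    """Assumed score: support in the family minus support in the background set."""
    return support(family, pattern) - support(background, pattern)


# Hypothetical data: two "family" graphs and two "background" graphs.
family = [
    make_graph({0: "H", 1: "D", 2: "S", 3: "A", 4: "G"}, [(0, 1), (1, 2), (3, 4)]),
    make_graph({0: "A", 1: "G", 2: "H", 3: "D", 4: "S"}, [(0, 1), (1, 2), (2, 3), (3, 4)]),
]
background = [
    make_graph({0: "A", 1: "G", 2: "L"}, [(0, 1), (1, 2)]),
    make_graph({0: "G", 1: "A", 2: "H"}, [(0, 1), (1, 2)]),
]

common = make_graph({0: "A", 1: "G"}, [(0, 1)])                   # occurs in every graph
motif = make_graph({0: "H", 1: "D", 2: "S"}, [(0, 1), (1, 2)])    # occurs only in the family

print(support(family, common), discriminative_score(family, background, common))
print(support(family, motif), discriminative_score(family, background, motif))
```

In this toy data both patterns are equally frequent within the family, but only the second separates the family from the background, which is the property a discriminative miner selects for. Practical miners replace the brute-force test and the simple score with pruned search and statistically grounded measures.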
Back to Workshop II: Optimization, Search and Graph-Theoretical Algorithms for Chemical Compound Space