Pushing the limits of sequence profile similarity search and alignment

Nick Grishin
UT Southwestern Medical School

Traditionally, sequence similarity search has relied on negatives, deriving its power from random models that must be rejected before a hit is called positive. Exploring the other side, we show that the search can be significantly improved by considering the positives, i.e., known homology relationships in a database of sequence profiles. Similar strategies are widely used by the most successful web search engines, such as Google. This algorithm re-ranks hits but does not correct faulty alignments. The main focus in the sequence alignment field has been on alignment construction. However, many alignments are reasonably accurate except for several mildly misaligned regions. We propose new approaches to the refinement of existing alignments and show that successful a posteriori detection and correction of misaligned regions improves the alignments.
