Evolutionary analysis of genome variation

Shamil Sunyaev
Harvard Medical School
Computational Biology

Five complete mammalian genomes are now available for comparative analysis, and genomic sequence from as many as 22 mammals has been determined in regions selected by the ENCODE project. This sequencing effort has been accompanied by the accumulation of data on human genetic variation.
Multiple potentially functional sites within non-protein-coding regions of these genomes have been identified with a variety of functional genomics assays.


New methods for detecting conservation at single-base-pair resolution will be presented, and the feasibility of conservation analysis at individual positions will be discussed. The methods account for insertions and deletions together with substitutions and incorporate the context dependency of mutation rates. A comparison of sequence conservation with human polymorphism led us to hypothesize that the fraction of the genome under weak purifying selection is larger than previously estimated. Regions involved in transcriptional regulation, as evidenced by chromatin analysis, are generally less conserved than expected. It is likely that the pattern rather than the strength of conservation is indicative of the regulatory role of a DNA sequence.
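
To make the idea of a per-position, context-aware conservation score concrete, here is a minimal toy sketch in Python. It is not the method presented in the talk: the function names, the two rate constants, and the scoring rule (expected minus observed differences) are all assumptions chosen for illustration. The only ideas it borrows from the abstract are that gaps (insertions and deletions) are counted alongside substitutions and that the neutral expectation depends on sequence context, here a simple CpG versus non-CpG distinction.

```python
# Toy illustration only; all names and numeric values are hypothetical.
BASE_RATE = 0.1  # assumed neutral chance that a species differs at a non-CpG site
CPG_RATE = 0.5   # assumed higher chance at CpG sites (illustrative value)


def neutral_rate(ref, i):
    """Assumed neutral divergence probability at position i of the reference."""
    base = ref[i]
    prev_base = ref[i - 1] if i > 0 else ""
    next_base = ref[i + 1] if i + 1 < len(ref) else ""
    # A site is in a CpG context if it is the C or the G of a "CG" dinucleotide.
    in_cpg = (base == "C" and next_base == "G") or (base == "G" and prev_base == "C")
    return CPG_RATE if in_cpg else BASE_RATE


def column_score(ref, others, i):
    """Expected minus observed number of aligned species differing from ref at column i.

    A gap character ('-') differs from any base, so indels are counted
    together with substitutions. Positive scores mean fewer changes than
    the assumed neutral expectation, i.e. apparent conservation.
    """
    observed = sum(1 for seq in others if seq[i] != ref[i])
    expected = neutral_rate(ref, i) * len(others)
    return expected - observed


if __name__ == "__main__":
    reference = "ACGT"
    aligned = ["ACGT", "ACGT", "A-GT", "ACAT"]  # pre-aligned, equal length
    for pos in range(len(reference)):
        print(pos, reference[pos], column_score(reference, aligned, pos))
```

In this toy, a column inside a CpG that shows one change can still score as more conserved than a non-CpG column with no changes, because more changes were expected there; this is the sense in which context-dependent mutation rates alter the reading of single-position conservation.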

