Ultraconserved elements, living fossil transposons, distal enhancers, and mysterious RNA genes: reconstructing the detailed evolutionary history of the human genome.

David Haussler
UC Santa Cruz
Computer Science

Comparison of the human genome with the genomes of the mouse, rat, dog, cow, and other mammals reveals that at least 5% of the human genome is under negative selection. Negative selection occurs in important functional segments of the
genome where random (mostly deleterious) mutations are rejected by natural selection, leaving the orthologous segments in different species more similar than would be expected under a neutral substitution model. Protein coding regions account for at most 1/3 of the segments that are under negative selection. In fact, the most conserved segments of the human genome do not appear to code for protein. These “ultraconserved” elements, 200 to 800 bp long, are completely unchanged between human, mouse, and rat, and are on average
96% identical in chicken. The function of most is currently unknown, but we have evidence that many may be distal enhancers controlling the expression of genes involved in embryonic development. Some of these distal enhancers make mysterious short RNA transcripts as well. Other ultraconserved elements appear to be involved in the regulation of alternative splicing. Evolutionary analysis
indicates that many of these elements date from a period very early in the evolution of vertebrates, as they have no orthologous counterparts in sea
squirts, flies, or worms. At least one group, involving a conserved enhancer of one gene and an ultraconserved alternatively spliced exon of another, evolved from a novel retrotransposon family that was active in lobe-finned fishes, and is still
active today in the “living fossil” coelacanth, the ancient link between marine and land vertebrates.
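
The definition of an ultraconserved element given above (a stretch that is identical in human, mouse, and rat) can be illustrated with a minimal sketch. It assumes the three sequences have already been aligned to equal length, with "-" marking gaps; the function name, the `min_len` parameter, and the toy sequences are hypothetical and chosen for illustration only, and this is not the pipeline used in the work described.

```python
from typing import Dict, List, Tuple


def ultraconserved_runs(aligned: Dict[str, str], min_len: int = 200) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of an alignment in which
    every species carries the same ungapped base for at least min_len columns."""
    seqs = [s.upper() for s in aligned.values()]
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")

    runs: List[Tuple[int, int]] = []
    run_start = None
    # Iterate one column past the end so that a run reaching the last column is closed.
    for col in range(length + 1):
        identical = (
            col < length
            and seqs[0][col] in "ACGT"
            and all(s[col] == seqs[0][col] for s in seqs)
        )
        if identical:
            if run_start is None:
                run_start = col
        else:
            if run_start is not None and col - run_start >= min_len:
                runs.append((run_start, col))
            run_start = None
    return runs


# Toy alignment (made-up sequences); a real scan would use min_len=200.
toy = {
    "human": "ACGTACGTTTGACCA",
    "mouse": "ACGTACGTTAGACCA",
    "rat":   "ACGTACGTTTGACCA",
}
print(ultraconserved_runs(toy, min_len=5))
```

On the toy input the single mismatch in mouse splits the alignment into two perfectly conserved runs; raising `min_len` drops the shorter one.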


In contrast with the slowly changing ultraconserved regions, other areas of
the genome have changed relatively rapidly, as recent genetic innovations specific to primates or to humans were fixed through positive selection. Via simulation, we estimate that most of the DNA sequence
of the common ancestor of all placental mammals, which lived in the last part of the Cretaceous period about 80-100 million years ago, can be predicted with 98% accuracy. We recently reconstructed an entire chromosome arm from the genome of this ancient species, and are currently working on a full genome reconstruction. Given this as a basis, and enough well-placed primate genomes to reconstruct intermediate states, we should eventually be able to document most of the
genomic changes that occurred in the evolution of the human lineage from the placental ancestor over the last 100 million years, including innovations that arose by positive selection.
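
To give a flavor of what predicting an ancestral base from living species involves, here is a minimal sketch using Fitch parsimony on a single alignment column. This is a deliberately simple stand-in and is not claimed to be the method behind the reconstruction described above, which must also handle insertions, deletions, and rearrangements. The tree shape, species set, sequences, and function names are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (hypothetical representation).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_set(tree: Tree, leaf_base: Dict[str, str]) -> Set[str]:
    """Bottom-up Fitch pass: return the set of most parsimonious states at this node."""
    if isinstance(tree, str):
        return {leaf_base[tree]}
    left, right = tree
    a = fitch_set(left, leaf_base)
    b = fitch_set(right, leaf_base)
    # Intersection if the children agree on some state, otherwise the union.
    return (a & b) or (a | b)


def reconstruct_root(tree: Tree, alignment: Dict[str, str]) -> str:
    """Predict the root sequence column by column; ambiguous columns become 'N'."""
    length = len(next(iter(alignment.values())))
    out = []
    for col in range(length):
        column = {name: seq[col] for name, seq in alignment.items()}
        states = fitch_set(tree, column)
        out.append(next(iter(states)) if len(states) == 1 else "N")
    return "".join(out)


# Toy tree and made-up, gap-free aligned sequences.
toy_tree: Tree = (("human", ("mouse", "rat")), ("dog", "cow"))
toy_alignment = {
    "human": "ACGT",
    "mouse": "ACTT",
    "rat":   "ACTT",
    "dog":   "ACGA",
    "cow":   "ACGT",
}
print(reconstruct_root(toy_tree, toy_alignment))
```

In the third column the rodents share a T while the other species carry G, and parsimony assigns G to the root; this is the sense in which well-placed outgroup genomes help resolve ancestral states.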


References: see http://genome.ucsc.edu/goldenPath/pubs.html
Credits: UCSC Genome Bioinformatics Group and Genome Laboratory; the enhancer work is a collaboration with the Eddy Rubin lab at Berkeley; the reconstruction project is a collaboration with the Webb Miller group at Penn State, Mathieu Blanchette at McGill, and the Eric Lander group at the Broad.

