The evolution of non-coding functional elements in our genome

David Haussler
University of California, Santa Cruz (UC Santa Cruz)
Computer Science

Comparing the genomes of present-day species allows us to computationally reconstruct most of the DNA bases in the genome of the common ancestor of placental mammals, which lived approximately 100 million years ago. In many important functional regions, we can reconstruct genome sequences from even further back in history. We can then deduce the genetic changes on the evolutionary path from those ancient species to humans. In so doing, we discover how natural selection has shaped us at the molecular level. About five percent of the human genome consists of conserved elements that have remained surprisingly unchanged across millions of years of evolution, which suggests that they have important functions. Only one third of these code for protein; the rest are likely to be gene regulatory elements and non-coding RNAs.
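The abstract does not say which reconstruction method is used. As a rough illustration of the general idea only, the sketch below applies Fitch parsimony, a deliberately simple stand-in technique, to a toy alignment. The four-species tree, the species names, and the sequences are all hypothetical, and real ancestral reconstruction relies on far richer probabilistic models that also handle insertions and deletions.

```python
from typing import Dict, Set, Tuple, Union

# A rooted binary tree: a leaf is a species name, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_set(tree: Tree, column: Dict[str, str]) -> Set[str]:
    """Bottom-up Fitch pass for one alignment column.

    Returns the set of bases that are most parsimonious at this node.
    """
    if isinstance(tree, str):
        return {column[tree]}
    left, right = tree
    a = fitch_set(left, column)
    b = fitch_set(right, column)
    shared = a & b
    # Keep the intersection when the children agree; otherwise take the union.
    return shared if shared else a | b


def reconstruct_root(tree: Tree, alignment: Dict[str, str]) -> str:
    """Reconstruct the root sequence column by column.

    Positions with more than one equally parsimonious base are reported as 'N'.
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    root = []
    for i in range(length):
        column = {species: seq[i] for species, seq in alignment.items()}
        candidates = fitch_set(tree, column)
        root.append(next(iter(candidates)) if len(candidates) == 1 else "N")
    return "".join(root)


# Hypothetical toy data, chosen only to exercise the code.
TOY_TREE: Tree = (("human", "mouse"), ("dog", "cow"))
TOY_ALIGNMENT = {
    "human": "ACGT",
    "mouse": "ACGA",
    "dog": "ACTT",
    "cow": "GCTT",
}

if __name__ == "__main__":
    print(reconstruct_root(TOY_TREE, TOY_ALIGNMENT))
```

In this toy case the third position stays ambiguous because the two halves of the tree disagree, which shows why confident reconstruction depends on dense species sampling and on sites that change slowly.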


Across most of the genome, protein-coding sequences have changed relatively little during the last ~500 million years of vertebrate evolution, and many have orthologous counterparts in invertebrates. But these sequences represent less than 1.5% of the genome. In contrast, even the “ultraconserved” vertebrate non-coding elements for the most part bear little resemblance to anything we find in invertebrates. Apart from a few highly conserved elements, the non-coding sequences in introns and in the intergenic regions between genes appear, in most regions, to have undergone a complete turnover due to the activity of transposons. These mobile elements, which in mammals are primarily retrotransposons that copy themselves via an RNA intermediate, add new DNA by inserting into novel locations. They also facilitate the removal of DNA through the process of non-homologous recombination. We conjecture that most of the non-coding regulatory elements in the genomes of living vertebrates derive from DNA that was put into place by transposons during vertebrate evolution, and hence that these vertebrate regulatory elements have no orthologous counterparts in invertebrates. Examples in which ancient transposons have contributed functional regulatory sequences are being discovered at an increasing rate. Since many transposons derive from defective viruses, this suggests that viruses have played a fundamental role in our molecular evolution.


Recently we used more than 3 million conserved non-exonic elements (CNEEs) in vertebrate genomes to track evolutionary changes over the full range of vertebrate evolution. This analysis pointed to three periods in the evolution of gene regulatory networks, each marked by the introduction of new CNEEs near a different class of genes: first a period of innovations near fundamental transcription factors and developmental genes, then an increase in innovations near genes involved in cell-cell communication, and finally, in the last 100 million years or so, an increase in innovations near genes involved in key signaling pathways within cells. These results illustrate new aspects of vertebrate genome evolution revealed by powerful new comparative genomics methodologies.


Workshop III: Evolutionary Genomics