High-throughput sequencing of metagenomes provides access to genome-wide heterogeneity within environmental populations, and current computational strategies can reveal associations between ecological parameters and micro-diversity patterns at various levels of resolution. However, teasing apart the evolutionary processes that maintain genomic heterogeneity and shape the biogeography of naturally occurring microbes is difficult because we often lack de novo insights into the role of individual variants. In this talk, I will describe (1) our recent effort to ameliorate this bottleneck by accurately characterizing non-synonymous variation within environmental populations via single-amino acid variants (SAAVs) and placing SAAVs on predicted protein structures, and (2) the current computational and biological challenges in making sense of this multi-dimensional mess.