University of Rochester
Department of Biology
River Campus Box 270211
Rochester, New York 14627-0211
Hutchison 319 (office)
Hutchison 313 (lab)
(585) 276-4816 (office)
(585) 276-5073 (lab)
Coalescent theory has revolutionized the analysis of DNA sequences that are sampled from within a species. The coalescent provides a mathematical framework for describing the evolutionary history of chromosomes within populations. Because a coalescent history can be easily simulated on a computer, it is amenable to large-scale genomic analysis. My laboratory focuses on using the coalescent to test evolutionary models of population structure with genome-wide DNA sequence data. This work incorporates aspects of both theoretical and empirical population genetics, computational biology and bioinformatics, genomics and evolutionary biology.
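To illustrate why a coalescent history is easy to simulate, here is a minimal sketch of the standard (Kingman) coalescent for a single non-recombining locus in a constant-size population. It is not the laboratory's software; the function name, sample size and replicate count are illustrative assumptions. Time is measured in coalescent units, and while k ancestral lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2.

```python
import random

def simulate_coalescent_times(n, rng):
    """Return waiting times for k = n, n-1, ..., 2 ancestral lineages.

    Times are in coalescent units (an assumption of this sketch);
    each waiting time is exponential with rate k*(k-1)/2.
    """
    times = []
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs that could coalesce
        times.append(rng.expovariate(rate))
    return times

rng = random.Random(1)   # fixed seed so the sketch is repeatable
n = 10                   # hypothetical sample size
replicates = 20000       # hypothetical number of simulated histories

# Time to the most recent common ancestor (TMRCA) is the sum of waiting times.
mean_tmrca = sum(
    sum(simulate_coalescent_times(n, rng)) for _ in range(replicates)
) / replicates

# Theoretical expectation under the same model: 2 * (1 - 1/n).
expected = 2.0 * (1.0 - 1.0 / n)
print(mean_tmrca, expected)
```

Each simulated history costs only n-1 random draws, which is why many thousands of replicates can be generated for model testing.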
The coalescent and human origins
One goal of my laboratory is to use coalescent theory to understand human origins and evolution. This work is motivated by the discovery of regions of the human genome with an unexpected pattern of ancestry. In particular, some human populations have X chromosomes containing regions that are polymorphic for two highly divergent sequence types. One type of sequence resembles the “normal” pattern of human variation, but a rarer type appears to have diverged from the “normal” types about 2 million years ago. Interestingly, there appears to be little recombination between the two types of sequences, which is not expected if the two types have co-existed within the same population for the last 2 million years. By simulating coalescent ancestries under a wide variety of evolutionary models, I have found that this pattern can be most readily explained by an “isolation-and-admixture” (IAA) model. The IAA model includes a prolonged period of isolation between two ancestral populations, during which many sequence differences are allowed to accumulate. Then, a recent admixture (or interbreeding) event brings the two divergent types of sequence into the modern population. In terms of human evolution, this suggests that, as anatomically modern human populations began to expand 100,000 years ago, they interbred with archaic forms of the genus Homo, resulting in the observed “mosaic” pattern of DNA sequence variation. This conclusion has paved the way for investigation of a large number of possible human origins models that incorporate ancestral population structure.
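The isolation-and-admixture idea described above can be sketched for the simplest case, a pair of sequences sampled from the modern population. This is a toy illustration under stated assumptions, not the published analysis: all populations are assumed to have the same size, time is in coalescent units, and the admixture fraction `f`, the admixture time `t_admix` and the split time `t_split` are hypothetical values chosen only to show the qualitative effect.

```python
import random

def pair_tmrca_iaa(f, t_admix, t_split, rng):
    """Coalescence time of two lineages under a toy isolation-and-admixture model.

    Going backward in time: both lineages sit in the modern population until
    t_admix; at t_admix each lineage independently traces to the archaic
    population with probability f; the two populations merge at t_split.
    """
    # Phase 1: both lineages in the modern population.
    t = rng.expovariate(1.0)
    if t < t_admix:
        return t
    # Admixture event: assign each lineage to a source population.
    in_archaic = [rng.random() < f for _ in range(2)]
    if in_archaic[0] == in_archaic[1]:
        # Phase 2: same population, so coalescence is possible before t_split.
        t = t_admix + rng.expovariate(1.0)
        if t < t_split:
            return t
    # Phase 3: single ancestral population older than t_split.
    return t_split + rng.expovariate(1.0)

def deep_fraction(f, t_admix, t_split, replicates, rng):
    """Fraction of sampled pairs whose common ancestor predates the split."""
    deep = sum(
        pair_tmrca_iaa(f, t_admix, t_split, rng) > t_split
        for _ in range(replicates)
    )
    return deep / replicates

rng = random.Random(7)
t_admix, t_split = 0.2, 4.0   # hypothetical times in coalescent units
deep_no_admixture = deep_fraction(0.0, t_admix, t_split, 20000, rng)
deep_with_admixture = deep_fraction(0.05, t_admix, t_split, 20000, rng)
print(deep_no_admixture, deep_with_admixture)
```

With no admixture the model reduces to a single randomly mating population, so comparing the two fractions shows how even a small archaic contribution increases the share of very old, highly divergent sequence pairs.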
Computational challenges for whole-genome analysis
My laboratory also develops the statistical and computational methods that will be needed to analyze whole-genome datasets. This work combines the coalescent with statistical techniques such as Markov chain Monte Carlo (MCMC) and importance sampling (IS) to identify the evolutionary models that best fit data from a sample of entire genomes. Such large datasets present considerable computing challenges, and efficient analysis requires creative solutions, including parallel programming.
One project aims to analyze whole-genome DNA sequence data from two species of Drosophila, D. simulans and D. melanogaster. Multiple strains from each species were resequenced via 454 pyrosequencing, and an unusually high number of nucleotide sites were found to be polymorphic in both species. A novel coalescent-based MCMC technique is being developed to assess the most likely evolutionary explanation for such a large degree of shared variation between the two species. One of the main theoretical challenges of whole-genome analysis is accounting for the correlation of ancestries that naturally arises between adjacent genomic regions. Therefore, another focus of the laboratory is to incorporate the process of recombination into coalescent-based algorithms.
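As a small illustration of how MCMC can be paired with a coalescent likelihood, the sketch below runs a random-walk Metropolis sampler for the scaled mutation rate theta. It is a toy example, not the method under development in the laboratory. It assumes independent, non-recombining loci, each sampled as a pair of sequences, for which the probability of observing k segregating sites is theta^k / (1 + theta)^(k + 1). The site counts, step size and chain length are made up for illustration, and a flat prior on theta > 0 is assumed.

```python
import math
import random

# Hypothetical counts of segregating sites at 20 independent loci.
segregating_sites = [2, 0, 5, 1, 3, 0, 4, 2, 1, 7, 0, 3, 2, 6, 1, 0, 2, 4, 3, 1]

def log_likelihood(theta, counts):
    """Log-likelihood of theta for pairs of sequences at independent loci."""
    return sum(k * math.log(theta) - (k + 1) * math.log(1.0 + theta)
               for k in counts)

def metropolis_theta(counts, n_steps, step_sd, rng, theta0=1.0):
    """Random-walk Metropolis sampler for theta with a flat prior on theta > 0."""
    theta = theta0
    current = log_likelihood(theta, counts)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        if proposal > 0.0:  # proposals outside the support are rejected
            candidate = log_likelihood(proposal, counts)
            accept_prob = math.exp(min(0.0, candidate - current))
            if rng.random() < accept_prob:
                theta, current = proposal, candidate
        samples.append(theta)
    return samples

rng = random.Random(11)
chain = metropolis_theta(segregating_sites, n_steps=50000, step_sd=0.8, rng=rng)
kept = chain[5000:]  # discard an initial burn-in period
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

Because the loci are treated as independent, the per-locus likelihood terms could be evaluated in parallel, which is one place where parallel programming helps at genome scale.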
- 2008. On the timing and magnitude of the "Out-of-Africa" human population bottleneck. Mol. Cell Biol. in press.
- 2008. Ancient lineages in the genome: a response to Fagundes et al. Proc. Natl. Acad. Sci. USA 105: E3.
- 2007. Archaic human admixture: a view from the genome. Current Anthropology. 48: 895-902.
- 2007. Inferring human population sizes, divergence times and rates of gene flow from mitochondrial, X and Y chromosome resequencing data. Genetics. 177: 2195-2207.
- 2006. Reconstructing human origins in the genomic era. Nature Reviews Genetics. 7: 669-680.
- 2005. Deep haplotype divergence and long-range linkage disequilibrium at Xp21.1 provide evidence that humans descend from a structured ancestral population. Genetics. 170: 1849-1856.
- 2005. Evidence for archaic Asian ancestry on the human X chromosome. Molecular Biology and Evolution. 22: 189-192.
- 2003. Perspective: Detecting adaptive molecular polymorphism, lessons from the MHC. Evolution. 57: 1707-1722.