Beneficial mutations: models and inference from experimental evolution data
Adaptive divergence of a population from the ancestral state through natural selection depends ultimately on variation supplied by beneficial mutations. Yet virtually nothing is known about this class of mutation because the focus has been on deleterious mutations, which are far more abundant. We thus lack a working theory of adaptation. I will review recent attempts to develop such a theory by Orr (Genetics 2003; Evolution 2002), based on Gillespie's mutational landscape model (Evolution 1984) and Fisher's geometric model (Martin and Lenormand, Evolution 2006, Genetics 2008). I will concentrate on the predictions these theories make about the distribution of fitness effects among beneficial mutations exposed to, and ultimately fixed by, selection. Some of the challenges inherent in testing these predictions will be illustrated using the filamentous fungus Aspergillus nidulans. I will show that (i) 112 independent adaptive walks in A. nidulans tend to be short, with fitness increases becoming exponentially smaller as successive mutations fix, and (ii) the supply of beneficial mutations decreases as populations adapt. I will also provide empirical distributions of fitness effects among the mutations fixed at each step.
Approximate Bayesian Computation in a hierarchical model: applications to detecting selection
Recently a group of techniques, variously called likelihood-free inference, or Approximate Bayesian Computation (ABC), have been quite widely applied in population genetics. These methods typically require the data to be compressed into summary statistics. In a hierarchical setting one may be interested both in hyper-parameters and parameters, and there may be very many of the latter - for example, in a genetic model, these may be parameters describing each of many loci or populations. This poses a problem for ABC in that one then requires summary statistics for each locus, and, if used naively, a consequent problem in conditional density estimation. We develop a general method for addressing these problems efficiently, and we describe recent work in which the ABC method can be used to detect loci under local selection.
Authorship: Eric Bazin (CIRAD, Montpellier), Kevin Dawson (Rothamsted Research), Mark Beaumont (University of Reading)
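As a rough illustration of the likelihood-free idea mentioned in the abstract above, the following is a minimal sketch of plain ABC rejection sampling on a toy problem. It is not the hierarchical method of the talk. The simulator, the flat prior, the choice of the sample mean as summary statistic, the tolerance, and all function names are assumptions made for this example only.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumed): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Compress a data set into one summary statistic, the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies close to the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed flat prior on [-5, 5]
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = simulate(2.0, 50, rng)  # pseudo-observed data with theta = 2
posterior_sample = abc_rejection(observed, 20000, 0.1, rng)
```

The accepted values in `posterior_sample` approximate the posterior of the toy parameter. In a hierarchical setting with many loci, one summary statistic per locus would be needed, and a naive acceptance step of this kind quickly becomes inefficient, which is the problem the abstract refers to.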
Inference for Lambda-coalescents
Multiple merger coalescents, also known as Lambda-coalescents, are generalisations of Kingman's coalescent that have been proposed as models for genealogies in species with highly variable offspring numbers. We extend to this setting the methods of Griffiths & Tavaré, which allow the likelihood of sequence observations to be estimated using a Monte Carlo approach, and illustrate our method using simulated datasets and a cod dataset from Árnason (2004).
Joint work with Jochen Blath and Matthias Steinrücken, TU Berlin
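To make the notion of a multiple merger concrete, here is a minimal sketch that simulates only the number of ancestral lineages over time in one particular Lambda-coalescent, the Beta(2 - alpha, alpha)-coalescent with 1 < alpha < 2. The choice of this family, the parameter values and the function names are assumptions for illustration. The sketch does not implement the likelihood estimation described in the abstract.

```python
import math
import random


def beta_fn(a, b):
    """Euler beta function B(a, b), computed via log-gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def merger_rate(b, k, alpha):
    """Rate at which one particular set of k out of b lineages merges
    when Lambda is the Beta(2 - alpha, alpha) distribution."""
    return beta_fn(k - alpha, b - k + alpha) / beta_fn(2.0 - alpha, alpha)


def simulate_block_counts(n, alpha, rng):
    """Return a list of (time, number of lineages), from n lineages down to 1."""
    t, b = 0.0, n
    path = [(t, b)]
    while b > 1:
        sizes = range(2, b + 1)
        # Total rate of a k-merger: number of k-subsets times the per-subset rate.
        rates = [math.comb(b, k) * merger_rate(b, k, alpha) for k in sizes]
        t += rng.expovariate(sum(rates))          # waiting time to the next merger
        k = rng.choices(sizes, weights=rates)[0]  # size of the merger
        b -= k - 1                                # k lineages collapse into one
        path.append((t, b))
    return path


rng = random.Random(7)
path = simulate_block_counts(20, 1.5, rng)
```

Unlike in Kingman's coalescent, where the lineage count always drops by exactly one, a single event here can merge three or more lineages at once, so `path` may reach one lineage in fewer than n - 1 steps.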
New methods for genome-wide population genetics
The recent availability of closely related, fully sequenced genomes provides us with an unprecedented amount of data, enabling new insights into the genetics of ancestral populations. These new data, consisting of billions of sites for a few species, bring new statistical and computational challenges, since traditional analyses involved only about a hundred sites, albeit for dozens of individuals. The emerging "population genomics" hence differs from traditional "population genetics" in its methods and, to some extent, in the questions it can address. I will present recent developments in the field and their applications to Drosophila and primate genomes.
Simulation models of the immune system
So far, we have focused on modelling different parts of the immune system using bioinformatic methods. These models can be used to answer relatively simple questions, such as whether a given peptide from a virus will bind to a given molecule of the immune system. More complex and potentially more interesting questions include: which of the thousands of different proteins in a given microbe will the immune system of a person with a given genetic background respond to; how does the outcome of the infection depend on which infections or vaccinations this person has previously received; and under which circumstances will such an infection lead to autoimmune responses. To answer these types of questions, the different sub-models of the immune system must be put together into an integrated model. This virtual immune system model can then be used to test hypotheses, and in silico screening can be used to select the most interesting experiments to carry out in vitro and in vivo.
Understanding human admixture, and association mapping in admixed populations
Historical, archaeological and linguistic data demonstrate that many modern-day human populations represent admixtures of genetically distinct groups. New fine-scale genome-wide data sets offer the opportunity to understand admixture events that occurred much further in the past, or between more closely related populations, than has previously been possible. However, there is a lack of applicable methods to extract such information. Here we discuss the development and application of a novel approach to help improve our understanding of how historical admixture has influenced human genomes. The approach aims to date admixture events, as well as to identify the contributing populations. Application to data for 53 human populations suggests that admixture has affected almost all human populations. For example, multiple admixture events within Central and East Asia date back to the time of the Mongol Empire, and even earlier events are detected in some African, Middle Eastern and Melanesian populations. We also discuss how utilizing admixed populations can aid disease mapping by association, and apply these ideas to real case-control association data, revealing a number of variants that independently influence the risk of developing prostate cancer in African-Americans.