Featured downloads: dadi-1.3.3.zip
Project owners: rgutenkunst

Diffusion Approximation for Demographic Inference

∂a∂i implements a method for demographic inference from genetic data, based on a diffusion approximation to the allele frequency spectrum. One of ∂a∂i's main benefits is speed: fitting a two-population model typically takes around 10 minutes, and run time is independent of the number of SNPs in your data set. ∂a∂i is also flexible, handling up to three simultaneous populations, with arbitrary timecourses for population size and migration, plus the possibility of admixture and population-specific selection. The code is young but has already been used in several publications.
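Run time can be independent of the number of SNPs because the data are summarized as a frequency spectrum before any fitting takes place. The following is a minimal sketch in plain Python that does not use ∂a∂i itself; the function name `build_spectrum` and the input layout (one derived-allele count per SNP) are assumptions made for illustration only.

```python
def build_spectrum(derived_counts, n_samples):
    """Histogram of derived-allele counts for a single population.

    Entry i is the number of SNPs at which the derived allele was
    observed in exactly i of the n_samples sampled chromosomes.
    """
    spectrum = [0] * (n_samples + 1)
    for count in derived_counts:
        if not 0 <= count <= n_samples:
            raise ValueError("count %r outside 0..%d" % (count, n_samples))
        spectrum[count] += 1
    return spectrum


# Hypothetical data: derived-allele counts at 8 SNPs in 4 chromosomes.
small = build_spectrum([1, 1, 2, 3, 1, 2, 1, 3], n_samples=4)

# 1000 times as many SNPs still give a spectrum with n_samples + 1 entries.
large = build_spectrum([1, 1, 2, 3, 1, 2, 1, 3] * 1000, n_samples=4)

print(small, len(large))
```

The length of the spectrum depends only on the sample size, so a model is compared against the same number of bins no matter how many SNPs went into them.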

Originally ∂a∂i was developed by Ryan Gutenkunst in the labs of Scott Williamson and Carlos Bustamante in Cornell's Department of Biological Statistics and Computational Biology. Ryan is now a postdoc at Los Alamos National Lab, in the Theoretical Biology and Biophysics group and the Center for Nonlinear Studies, and he is continuing to work on ∂a∂i.

News

PLoS Genetics has published the paper describing ∂a∂i.
Also, I have posted source code for ∂a∂i 1.3.3. This release fixes a minor bug (which did not affect computation results), and hopefully enables Python 2.4 compatibility.
This release of ∂a∂i expands the test suite and includes two substantial changes to improve ease of use:
  • The Misc.make_data_dict function has been added to ease import of data into ∂a∂i. See DataFormats for details on the file format for this method.
  • Spectrum objects now track whether or not they are folded. This reduces the potential for mistakes, and it allows operations like projection and marginalization of folded spectra to be handled automatically and correctly.
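To illustrate what folding means, here is a rough sketch of folding a one-dimensional spectrum in plain Python. It is not ∂a∂i's implementation: the `fold` function is hypothetical, and entries above n/2 are simply left at zero here, which may differ from how Spectrum objects represent them.

```python
def fold(spectrum):
    """Fold a 1D spectrum by combining entry i with entry n - i.

    When the ancestral allele is unknown, only the minor-allele count
    min(i, n - i) is observable, so the two entries are summed.
    """
    n = len(spectrum) - 1
    folded = [0] * len(spectrum)
    for i, value in enumerate(spectrum):
        folded[min(i, n - i)] += value
    return folded


unfolded = [0, 4, 2, 2, 0]   # hypothetical spectrum for 4 chromosomes
folded = fold(unfolded)
print(folded)
```

A folded and an unfolded spectrum have the same shape but different meanings, which is why carrying a folded flag alongside the data helps prevent applying the wrong operation to it.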
The paper on ∂a∂i has been accepted for publication in PLoS Genetics. It has also been posted on the arXiv.
The paper describes the method and infers demographic models for human expansion out of Africa and the settlement of the New World. The out of Africa model is then combined with a previously estimated distribution of selection coefficients to accurately predict the distribution of segregating nonsynonymous variation between populations.

Example

The above plot summarizes the result of fitting a model to the joint frequency spectrum of genetic variation between the Yoruba (YRI) and CEPH European (CEU) populations. The data are derived from the Environmental Genome Project SNPs database. The upper left panel is the data, and the upper right is the result of a demographic model whose parameters have been optimized using ∂a∂i. The lower left panel shows the residuals between model and data (red means the model predicts too many SNPs in that bin), and the lower right is a histogram of the residuals. The model involves population growth in the ancestral population, followed by divergence of the CEU population with a bottleneck and exponential growth. It contains 7 free parameters and took a few minutes to fit using ∂a∂i.
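As a rough illustration of per-bin residuals with the sign convention described above (positive when the model predicts too many SNPs), the sketch below scales the difference by the square root of the model value, as is common for Poisson-distributed counts. This normalization is an assumption; the exact residual definition used for the plot is not stated here, and `poisson_residuals` is a hypothetical helper, not part of ∂a∂i.

```python
import math


def poisson_residuals(model, data):
    """Per-bin (model - data) / sqrt(model) for two 2D spectra.

    Bins where the model predicts zero SNPs are reported as None.
    """
    residuals = []
    for model_row, data_row in zip(model, data):
        row = []
        for m, d in zip(model_row, data_row):
            row.append((m - d) / math.sqrt(m) if m > 0 else None)
        residuals.append(row)
    return residuals


# Hypothetical 2x2 joint spectra (expected and observed SNP counts).
model = [[4.0, 9.0], [16.0, 25.0]]
data = [[2, 9], [20, 20]]
res = poisson_residuals(model, data)
print(res)
```

If the model fit the data well, residuals of this kind would be expected to scatter around zero without large systematic patches of one sign.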








