G-PhoCS: A Generalized Phylogenetic Coalescent Sampler




G-PhoCS is a software package for inferring ancestral population sizes, population divergence times, and migration rates from individual genome sequences. G-PhoCS accepts as input a set of multiple sequence alignments from separate neutrally evolving loci along the genome. Parameter inference is done in a Bayesian manner, using a Markov chain Monte Carlo (MCMC) algorithm to jointly sample model parameters and genealogies at the input loci.

G-PhoCS is inspired by and derived from MCMCcoal, developed by Ziheng Yang. Two main conceptual differences separate G-PhoCS from MCMCcoal:
  1. G-PhoCS models gene flow between populations along user-defined migration bands.
  2. G-PhoCS analyzes unphased diploid genotypes using a novel method for integrating over all possible phases.
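
To give a sense of what "all possible phases" means in item 2, here is a minimal Python sketch that lists every pair of haplotypes consistent with one unphased diploid genotype. It is only an illustration of the size of the phase space, not the method G-PhoCS uses to integrate over it; the function name `enumerate_phasings` and the input layout (one pair of alleles per site) are assumptions made for this example.

```python
from itertools import product


def enumerate_phasings(genotype):
    """Yield each unordered (hap1, hap2) pair consistent with an unphased genotype.

    genotype: list of (allele_a, allele_b) tuples, one per site.
    Hypothetical helper for illustration; not part of G-PhoCS.
    """
    # Only heterozygous sites are ambiguous with respect to phase.
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    # Keep the first heterozygous site fixed so that each unordered
    # haplotype pair is produced exactly once.
    free_sites = het_sites[1:]
    for flips in product((False, True), repeat=len(free_sites)):
        flipped = {site for site, flip in zip(free_sites, flips) if flip}
        hap1 = "".join(b if i in flipped else a for i, (a, b) in enumerate(genotype))
        hap2 = "".join(a if i in flipped else b for i, (a, b) in enumerate(genotype))
        yield hap1, hap2


if __name__ == "__main__":
    # Three sites: one homozygous (A/A) and two heterozygous (C/T, G/A).
    for hap1, hap2 in enumerate_phasings([("A", "A"), ("C", "T"), ("G", "A")]):
        print(hap1, hap2)
```

A genotype with h heterozygous sites has 2^(h-1) distinct phasings when h is at least 1, so explicit enumeration like this is feasible only for short loci with few heterozygous sites.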

Additional adjustments were made to the C implementation of MCMCcoal in order to make it more efficient and reduce running time.

More information on G-PhoCS can be found in Section 4 of the supplement to our paper (Gronau et al., Nature Genetics 2011) and in the G-PhoCS user manual.


G-PhoCS is still at an experimental stage. We are working on cleaning up the code and improving the user interface. We encourage people to try G-PhoCS out on their favorite data sets, and appreciate any feedback we get. We will post updates and fixes periodically on this website.

Thanks for your understanding.


Useful Links

G-PhoCS code     Updated code repository on GitHub.
BSNP     A Bayesian SNP caller from short read data with reduced reference bias.
Data     Sequence data from seven individual human genomes used in the demographic analysis of Gronau et al. Nat Genet 2011.

Cite

Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A. Bayesian inference of ancient human demography from individual genome sequences. Nature Genetics 43:1031–1034, 2011.


Contact

Problems, questions, and feature requests should be directed to ilan.gronau@idc.ac.il.