ARGweaver-D documentation

Introduction

ARGweaver-D is an extension of ARGweaver that allows ARGs to be sampled conditional on a user-defined demographic model that includes population splits and migration events. The ARGweaver-D code is built into the ARGweaver software base, and only requires extra options to the command-line programs to invoke. It is recommended that you first familiarize yourself with basic ARGweaver usage before reading this manual. However, note that ARGweaver-D has only been added to the C++ component of the ARGweaver codebase, and has not been incorporated into ARGweaver's python library.

Sampling an ARG under ARGweaver-D using arg-sample

Demography-aware ARGs can be sampled with the command arg-sample. Here, we document the options specific to ARGweaver-D:

Overview

Getting help

Specifying a population tree

npop 5

## Neanderthal(1)/Denisova(2) div at 415kya
div 14310 2 1

## Archaic(1)/African(0) div at 575kya
div 19827 1 0

## Super-archaic(3)/Hominin(0) div at 1.0mya
div 34483 3 0

## Hominin(0)/Chimp(4) div at 11Mya
div 379310 4 0

## Afr-Nea migration @ 250kya
mig 8621 1 0 0.01
mig 8621 2 0 0.01

## Super-archaic migrations @ 250kya
mig 8621 0 3 0.01
mig 8621 1 3 0.01
mig 8621 2 3 0.01

Figure: Example population tree file. The first line starts with the tag "npop" and is followed by the number of populations at present time. The populations are numbered 0,...,(npop-1).

Each "div" line specifies a population divergence time. It takes three arguments: a time (in generations), and two population numbers. Looking backwards in time, the second population merges with the first population at this time. If there are n populations, there must be n-1 "div" lines, specified in such a way so that only one population remains by the final discrete time in the ARGweaver model (see arg-sample --maxtime or arg-sample --times-file).

Migrations are optionally specified with "mig" lines. They have the same format as the "div" lines, but with an additional probability parameter at the end, specifying the probability that a lineage sampled from the receiving population just after the migration event came from the donor population. Looking backwards in time, individuals from the first population move to the second population with the specified probability at the specified time.

Comment lines (starting with #) are allowed.

One important note is that all times in this file will be rounded to the nearest half-time in the discretized ARGweaver model.

Specifying population assignments

chimp	4
Mandenka_2F	0
Khomani_San_1F	0
Vindija	1
Altai	1
Denisova	2

Figure: Example population file. The population file is simply a tab-delimited file with two columns: an individual name and a population assignment. The population assignment numbers correspond to the numbers in the population tree file (described above). The first column can give the names of haploid lineages (as used in SITES file format). Or, they can be diploid names, if VCF file input was used, or if the haploid lineages are named with the convention ind_1 ind_2, where "ind" is the name of the diploid individual.

Other arg-sample options

Parsing demographic ARGs with arg-summarize

hToN 1 0 8621
hToD 2 0 8621
sToH 0 3 8621
sToN 1 3 8621
sToD 2 3 8621

Figure: Example migration file for --mig-file option to arg-summarizeEach row describes a migration event that we want to summarize. The first column names the event; this name will be used in the arg-summarize output file. The next three columns give the receiving population, the source population, and the time (in generations) of the event.

hToN.alt_1 1 0 8621 Altai_1 hToN.alt_2 1 0 8621 Altai_2 hToN.vin_1 1 0 8621 Vindija_1 hToN.vin_2 1 0 8621 Vindija_2 hToD.den_1 2 0 8621 Denisova_1 hToD.den_2 2 0 8621 Denisova_2 sToA.afr1_1 0 3 8621 Khomani_San_1F_1 sToA.afr1_2 0 3 8621 Khomani_San_1F_2 sToA.afr2_1 0 3 8621 Mandenka_2F_1 sToA.afr2_2 0 3 8621 Mandenka_2F_2 sToN.alt_1 1 3 8621 Altai_1 sToN.alt_2 1 3 8621 Altai_2 sToN.vin_1 1 3 8621 Vindija_1 sToN.vin_2 1 3 8621 Vindija_2 sToD.den_1 2 3 8621 Denisova_1 sToD.den_2 2 3 8621 Denisova_2

ARGweaver-D: Sampling ancestral recombination graphs conditional on a demographic model

Documentation for ARGweaver-D

Updated: June, 2019