Filippo Utro, Marc Pybus, Laxmi Parida.
Abstract
BACKGROUND: Reconstructability of population history from the genetic information of extant individuals is studied in a simulation setting. We do not address the accuracy of reconstruction algorithms; instead, we assume the availability of the theoretical best algorithm and focus on the fraction (1 - f) of the common genetic history that is irreconstructible, or impenetrable. The fraction f thus gives an upper bound on the extent of estimability: no method can reconstruct a fraction larger than f of the entire common genetic history. To carry out such a study, we first define a natural measure of the amount of genetic history. Next, we use a population simulator from the literature that has at least two features. First, it can provide samples from different demographies, to reflect reality effectively. Second, it provides the underlying relevant genetic history, captured in its entirety, to which such a measure is applicable. Finally, to compute f, we use an information content measure of the relevant genetic history. The simulator of choice provides the following demographies: Africans, Europeans, Asians and Afro-Americans.
Year: 2013 PMID: 23369091 PMCID: PMC3549811 DOI: 10.1186/1471-2164-14-S1-S10
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Experimental set-up
| Parameters | Values |
|---|---|
| Mutation rate (bp/gen × 10⁻⁸) | 0.7, 1.5, 3.0 |
| Sequence length (Kb) | 5, 10, 30, 50, 75, 100, 150, 200 |
| Sample size | 5, 10, 30, 60, 120 |
| Recombination rate (cM/Mb) | 0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9, 2.1, 2.3, 2.5, 2.8, 3.1, 3.5, 3.9, 4.5, 5.1 |
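
The table above can be read as a parameter grid. The following is a minimal sketch of enumerating it, assuming every combination of the listed values is simulated (a full factorial design, which this section does not state explicitly). The constant names and the `parameter_grid` helper are illustrative and not part of the simulator.

```python
from itertools import product

# Values copied from the experimental set-up table.
MUTATION_RATES = [0.7, 1.5, 3.0]  # x 10^-8 per bp per generation
SEQUENCE_LENGTHS_KB = [5, 10, 30, 50, 75, 100, 150, 200]
SAMPLE_SIZES = [5, 10, 30, 60, 120]
RECOMBINATION_RATES = [  # cM/Mb
    0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9,
    2.1, 2.3, 2.5, 2.8, 3.1, 3.5, 3.9, 4.5, 5.1,
]


def parameter_grid():
    """Yield one dict per parameter combination.

    Assumption: the design is full factorial over the four parameters.
    """
    for mu, length, n, rho in product(
        MUTATION_RATES, SEQUENCE_LENGTHS_KB, SAMPLE_SIZES, RECOMBINATION_RATES
    ):
        yield {
            "mutation_rate": mu,
            "seq_len_kb": length,
            "sample_size": n,
            "recombination_rate": rho,
        }


grid = list(parameter_grid())
print(len(grid))  # 3 * 8 * 5 * 19 = 2280 combinations per demography
```

Under the same assumption, repeating the grid for each of the four demographies would give 9120 simulation settings.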
Figure 1. The topology of an ARG. (a) The topology of an ARG G with three extant samples marked 1, 2 and 3. The dashed horizontal lines mark the age or depth of the nodes, which are the same in all four panels. (b) The three nonmixing segments are red, green and blue, in that order. Each node displays the nonmixing segments. A white rectangle indicates the absence of that segment in that node. The three embedded trees, corresponding to each segment, are shown in the same color as that of the segment. The edge labels (mutation events) are not shown to avoid clutter. (c) The same as (b), with the two nodes that are not t-coalescent marked. (d) Removing the marked nodes to obtain a minimal descriptor G'.
Figure 2. A simple example using the output of COSI. Each horizontal line corresponds to the age or depth of the node that it intersects. Edge lengths are not drawn to scale: for instance, edge AB has length 34.163732 while edge BD has length only 7.645987, yet both are rendered at the same size. (a) The topology of the ARG: node D is not t-coalescent. (b) The ARG in (a) after removal of node D. (c) The ARG in (b) after removal of chain node E. (d) The ARG in (c) after removal of chain node B.
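
Panels (c) and (d) of Figure 2 remove chain nodes. The following is a rough sketch of that step, assuming a chain node is a node with exactly one parent and one child (the term is not defined in this section) and that removing it merges its two incident edges into a single edge whose length is their sum. The edge-dictionary representation, the function name `remove_chain_node` and the toy graph are hypothetical; they are not COSI output and do not reproduce the topology in the figure.

```python
def remove_chain_node(edges, node):
    """Return a copy of `edges` with the chain node `node` removed.

    `edges` maps (parent, child) -> edge length.
    Assumption: a chain node has exactly one parent and one child, and
    its two incident edges merge into one edge of summed length.
    """
    parents = [p for (p, c) in edges if c == node]
    children = [c for (p, c) in edges if p == node]
    if len(parents) != 1 or len(children) != 1:
        raise ValueError(f"{node!r} is not a chain node")
    parent, child = parents[0], children[0]
    if (parent, child) in edges:
        # A direct parent-child edge already exists; this sketch does not
        # model parallel edges, so it refuses rather than overwrite.
        raise ValueError("parallel edges are not handled in this sketch")
    # Keep every edge that does not touch the removed node.
    reduced = {e: w for e, w in edges.items() if node not in e}
    reduced[(parent, child)] = edges[(parent, node)] + edges[(node, child)]
    return reduced


# Toy graph with made-up lengths: root -> x -> leaf.
toy_edges = {("root", "x"): 2.0, ("x", "leaf"): 3.0}
print(remove_chain_node(toy_edges, "x"))  # {('root', 'leaf'): 5.0}
```

Summing the two lengths keeps the age difference between the surviving endpoints unchanged, which matches the caption's convention that node depth is fixed across panels.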
Figure 3. The values of (a) N and (b) f of relevant genetic events against depth, for the four demographies. The data are over all sequence lengths, with mutation rate 1.5 × 10⁻⁸ bp/gen, recombination rate 2.1 cM/Mb and sample size 60. The four vertical lines in (b) correspond to the four events incorporated in the simulator COSI: (1) increase in effective population size, (2) bottleneck event, (3) out of Africa event and (4) increase in effective population size.
Figure 4. Summary plots of f for each demography, with mutation rate 1.5 × 10⁻⁸ bp/gen, for different values of (a) sequence length, (b) sample size and (c) recombination rate. For each value on the x-axis, the plotted point is the average over all possible values of the other parameters.
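
The averaging described in the Figure 4 caption can be sketched as a marginal mean: group the per-run values of f by one parameter and average over everything else. The record layout, the `marginal_mean` helper and the f values below are placeholders chosen for illustration; they are not results from the study.

```python
from collections import defaultdict
from statistics import mean


def marginal_mean(records, param, value_key="f"):
    """Average `value_key` for each level of `param`, over all other parameters."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[param]].append(rec[value_key])
    return {level: mean(values) for level, values in sorted(groups.items())}


# Placeholder records: the f values are made up, not taken from the paper.
records = [
    {"sample_size": 5, "seq_len_kb": 5, "f": 0.25},
    {"sample_size": 5, "seq_len_kb": 10, "f": 0.75},
    {"sample_size": 10, "seq_len_kb": 5, "f": 0.5},
    {"sample_size": 10, "seq_len_kb": 10, "f": 1.0},
]

by_sample_size = marginal_mean(records, "sample_size")
print(by_sample_size)  # {5: 0.5, 10: 0.75}
```

Calling the same helper with `"seq_len_kb"` gives the analogous summary against sequence length.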