Paul R Staab, Sha Zhu, Dirk Metzler, Gerton Lunter.
Abstract
MOTIVATION: Coalescent-based simulation software for genomic sequences allows the efficient in silico generation of short- and medium-sized genetic sequences. However, the simulation of genome-size datasets as produced by next-generation sequencing is currently only possible using fairly crude approximations.
Year: 2015 PMID: 25596205 PMCID: PMC4426833 DOI: 10.1093/bioinformatics/btu861
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Approximation of genetic linkage. Shown is the correlation ρ (y-axis) of the total local branch length at two sites δ base pairs apart (x-axis). The linkage in the coalescent with recombination (CWR; ms, options 20 1 -r 4000 10000001 -T) is indicated in black. Results for scrm using different exact window sizes (see legend) are indicated in colour.
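The statistic plotted in Fig. 1 can be illustrated with a minimal sketch. It assumes the total local branch lengths have already been extracted from the simulated trees and sampled on an evenly spaced grid of sites, so that a distance δ corresponds to a whole number of grid steps. The function names `pearson` and `lagged_correlation` are hypothetical and are not part of ms or scrm.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(branch_lengths, lag):
    """Correlation of total local branch length at sites `lag` grid steps apart.

    Assumption: `branch_lengths[i]` is the total branch length of the local
    tree at the i-th grid position, with positions evenly spaced.
    """
    if lag <= 0 or lag >= len(branch_lengths) - 1:
        raise ValueError("lag must leave at least two paired sites")
    # Pair every site with the site `lag` steps to its right.
    return pearson(branch_lengths[:-lag], branch_lengths[lag:])
```

Evaluating `lagged_correlation` over a range of lags would give one curve of the kind shown in the figure; in practice the estimate would be averaged over many simulated loci.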
Fig. 2. Efficiency for different approximations. Shown is the deviation (y-axis) against run-time (x-axis) for simulating 10 Mb with a recombination rate of per base per generation. The deviation of the approximation from the correct values is measured as the square root of the area between the correlation curve for the approximately simulated data and the curve for the ms-generated data (see Fig. 1). For scrm and MaCS, multiple approximation levels are drawn using different exact window sizes or history parameters. The recently published Cosi2 (Shlyakhter ) does not output trees and could not be included in this figure; for a comparison of Cosi2 and scrm using different summary statistics, see Supplementary Figure S5.
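The deviation measure described in the caption of Fig. 2 can be sketched as follows. This is one plausible reading rather than the authors' implementation: it assumes both correlation curves are sampled at the same distances δ, takes "area between the curves" to mean the integral of the absolute difference, and approximates that integral with the trapezoidal rule applied to the absolute differences at the sampled distances. The function name `curve_deviation` is hypothetical.

```python
from math import sqrt

def curve_deviation(deltas, rho_approx, rho_reference):
    """Square root of the area between two correlation curves.

    Assumptions: `deltas` is increasing, both curves are sampled at the
    same `deltas`, and the area is the trapezoidal integral of the
    absolute difference between the curves.
    """
    if not (len(deltas) == len(rho_approx) == len(rho_reference)):
        raise ValueError("all inputs must have the same length")
    gaps = [abs(a - b) for a, b in zip(rho_approx, rho_reference)]
    area = 0.0
    for i in range(1, len(deltas)):
        width = deltas[i] - deltas[i - 1]
        # Trapezoid between neighbouring sample points.
        area += 0.5 * (gaps[i - 1] + gaps[i]) * width
    return sqrt(area)
```

A deviation of zero means the approximate curve coincides with the reference curve at every sampled distance; larger values indicate a poorer approximation of linkage.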