Abstract
MOTIVATION: The Li and Stephens model, which approximates the coalescent describing the pattern of variation in a population, underpins a range of key tools and results in genetics. Although highly efficient compared to the coalescent, standard implementations of this model still cannot deal with the very large reference cohorts that are starting to become available, and practical implementations use heuristics to achieve reasonable runtimes.
Year: 2019 PMID: 30165547 PMCID: PMC6394399 DOI: 10.1093/bioinformatics/bty735
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Structure of the matrix M. The rows of M are sorted lexicographically; in particular . The Burrows-Wheeler transform of X (see text) is the rightmost column of M, while the positional BWT of the sequences is the upper half of the same column (see text). The column indices are determined by , the allele frequency of symbol a at locus i, and , the cumulative frequency of symbol a across loci . Note that the ordering of rows to is determined by the special position symbols , but to avoid cluttering the notation these are all written as
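The relationship the caption describes, a lexicographically sorted matrix whose rightmost column is the Burrows-Wheeler transform, can be illustrated with the textbook construction from sorted rotations. This is only a minimal sketch of the general idea, not the construction used in the paper: it assumes a single end-of-text sentinel "$" in place of the special position symbols, and the names `sorted_rotations` and `bwt` are hypothetical.

```python
def sorted_rotations(text: str, sentinel: str = "$") -> list[str]:
    """Return all rotations of text + sentinel in lexicographic order.

    Assumption: one sentinel that sorts before every other symbol stands in
    for the position symbols mentioned in the caption.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    # Row i of the unsorted matrix is s rotated left by i characters.
    return sorted(s[i:] + s[:i] for i in range(len(s)))


def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler transform: the last column of the sorted matrix."""
    return "".join(row[-1] for row in sorted_rotations(text, sentinel))


if __name__ == "__main__":
    for row in sorted_rotations("banana"):
        print(row)
    print(bwt("banana"))
```

Sorting every rotation explicitly takes quadratic space, so this form is useful only for seeing the structure on small inputs; practical implementations build the transform without materialising the matrix.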
Fig. 2. Running time for inferring inheritance patterns under the haploid (dashes) and diploid Li and Stephens model over a simulated reference set of n (horizontal axis) haploid sequences, using the Viterbi (red) and fastLS (green) algorithms, using . Dots represent measurements, curves show quadratic fits. (a) Results for a simulated reference population of n samples. (b) Results for a fixed simulated reference population of 100 000, subsampled to n samples