Jorge Duitama, Justin Kennedy, Sanjiv Dinakar, Yözen Hernández, Yufeng Wu, Ion I Măndoiu.
Abstract
BACKGROUND: Recent technology advances have enabled sequencing of individual genomes, promising to revolutionize biomedical research. However, deep sequencing remains more expensive than microarrays for performing whole-genome SNP genotyping.
Year: 2011 PMID: 21342586 PMCID: PMC3044311 DOI: 10.1186/1471-2105-12-S1-S53
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. HF-HMM model for multilocus genotype inference.
Figure 2. Schematic state diagram for the HMMs M and M′ used in the reduction of the consensus string problem to MMGPP.
Summary statistics for the three datasets used in evaluation
| Dataset | Test SNPs | Raw Reads | Raw Sequence | Mapped Reads | Avg. Mapped SNP coverage |
|---|---|---|---|---|---|
| Watson 454 | 443 | 74.2 | 19.7 | 49.8 | 5.85× |
| NA18507 Illumina | 2.85 | 525 | 18.9 | 397 | 6.10× |
| NA18507 SOLiD | 2.85 | 2.45 | 75 | 900 | 9.85× |
Figure 3. Genotype calling accuracy of compared methods for homozygous (a) and heterozygous (b) SNPs of the NA18507 Illumina dataset.
Figure 4. HMM posterior decoding accuracy (a) and distribution of reference allele coverage ratios for heterozygous SNPs (b) on the Watson 454, NA18507 Illumina, and NA18507 SOLiD datasets.
Figure 5. Effect of local recombination rate (a) and minor allele frequency (b) on concordance of genotypes called by the HMM posterior decoding algorithm on the NA18507 Illumina dataset.
Figure 6. Effect of the reference panel size (a) and tradeoff between concordance and calling rate (b) for genotypes called by the HMM posterior decoding algorithm on the Watson 454 dataset.