Todd E Druley, Francesco L M Vallania, Daniel J Wegner, Katherine E Varley, Olivia L Knowles, Jacqueline A Bonds, Sarah W Robison, Scott W Doniger, Aaron Hamvas, F Sessions Cole, Justin C Fay, Robi D Mitra.
Abstract
We report a targeted, cost-effective method to quantify rare single-nucleotide polymorphisms from pooled human genomic DNA using second-generation sequencing. We pooled DNA from 1,111 individuals and targeted four genes to identify rare germline variants. Our base-calling algorithm, SNPSeeker, derived from large deviation theory, detected single-nucleotide polymorphisms present at frequencies below the raw error rate of the sequencing platform.
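To give a sense of the frequencies involved, the sketch below computes the lowest non-zero allele frequency possible in a pool of 1,111 individuals. It assumes diploid, autosomal loci (so 2,222 chromosomes in the pool), which the abstract does not state explicitly. The function names and the coverage value are hypothetical and used only for illustration; this is not part of SNPSeeker.

```python
def min_allele_frequency(n_individuals: int, ploidy: int = 2) -> float:
    """Frequency of an allele carried on exactly one chromosome in the pool.

    Assumption: every individual contributes `ploidy` chromosomes equally.
    """
    if n_individuals <= 0 or ploidy <= 0:
        raise ValueError("n_individuals and ploidy must be positive")
    return 1.0 / (n_individuals * ploidy)


def expected_variant_reads(allele_frequency: float, coverage: int) -> float:
    """Expected number of reads carrying the variant at a given read depth."""
    return allele_frequency * coverage


if __name__ == "__main__":
    freq = min_allele_frequency(1111)  # 1 / 2222, roughly 0.045%
    print(f"singleton allele frequency: {freq:.6f}")

    # Hypothetical read depth, chosen only to illustrate the calculation.
    hypothetical_coverage = 100_000
    reads = expected_variant_reads(freq, hypothetical_coverage)
    print(f"expected variant reads at {hypothetical_coverage}x: {reads:.1f}")
```

A singleton in this pool therefore sits near 0.045%, which illustrates why a variant caller for pooled data has to separate true variants from sequencing errors that occur at comparable or higher rates.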
Year: 2009 PMID: 19252504 PMCID: PMC2776647 DOI: 10.1038/nmeth.1307
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. Error modeling. (a) The cumulative likelihood of every possible misincorporation event for sequencing cycles 1–32 is depicted for both the sense (+) and antisense (−) strands. The Illumina data filtering process truncated the data from two dates at 32 bases instead of 36, which is why only 32 cycles are represented here. Inset: higher resolution of the error probability across cycles 1–12. (b) The intra- and inter-day variability for the A→C misincorporation event from four different sequencing dates. The error bars represent the standard deviation between different flowcell channels from the same date. Inset: higher resolution of cycles 1–12.
Figure 2. Allele frequency by sequencing vs. genotyping. The allele frequency as determined by sequencing is plotted against the actual frequency as determined by individual TaqMan assay for each of the 14 validated SNPs in our dataset (correlation coefficient r² = 0.96).
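The r² value quoted for Figure 2 is taken here to be the squared Pearson correlation between the two sets of frequency estimates. A minimal sketch of that calculation follows; the function name is hypothetical and the input values are placeholders, not the study's measurements.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between paired values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Placeholder frequencies for illustration only (NOT data from the paper).
    sequencing_freq = [0.001, 0.004, 0.010, 0.050, 0.120]
    genotyping_freq = [0.001, 0.005, 0.009, 0.055, 0.110]
    print(f"r^2 = {r_squared(sequencing_freq, genotyping_freq):.3f}")
```

With the study's data, `x` would hold the 14 sequencing-derived frequencies and `y` the matching TaqMan-derived frequencies.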