Analysis of error profiles in deep next-generation sequencing data.

Xiaotu Ma¹, Ying Shao², Liqing Tian², Diane A Flasch², Heather L Mulder², Michael N Edmonson², Yu Liu², Xiang Chen², Scott Newman², Joy Nakitandwe³, Yongjin Li², Benshang Li⁴, Shuhong Shen⁴, Zhaoming Wang²,⁵, Sheila Shurtleff³, Leslie L Robison⁵, Shawn Levy⁶, John Easton², Jinghui Zhang⁷.

Abstract

BACKGROUND: Sequencing errors are key confounding factors for detecting low-frequency genetic variants that are important for cancer molecular diagnosis, treatment, and surveillance using deep next-generation sequencing (NGS). However, there is a lack of comprehensive understanding of errors introduced at various steps of a conventional NGS workflow, such as sample handling, library preparation, PCR enrichment, and sequencing. In this study, we use current NGS technology to systematically investigate these questions.
RESULTS: By evaluating read-specific error distributions, we discover that the substitution error rate can be computationally suppressed to 10⁻⁵ to 10⁻⁴, which is 10- to 100-fold lower than generally considered achievable (10⁻³) in the current literature. We then quantify substitution errors attributable to sample handling, library preparation, enrichment PCR, and sequencing by using multiple deep sequencing datasets. We find that error rates differ by nucleotide substitution type, ranging from 10⁻⁵ for A>C/T>G, C>A/G>T, and C>G/G>C changes to 10⁻⁴ for A>G/T>C changes. Furthermore, C>T/G>A errors exhibit strong sequence context dependency, sample-specific effects dominate elevated C>A/G>T errors, and target-enrichment PCR leads to an approximately 6-fold increase in the overall error rate. We also find that more than 70% of hotspot variants can be detected at 0.1% to 0.01% frequency with the current NGS technology by applying in silico error suppression.
CONCLUSIONS: We present the first comprehensive analysis of sequencing error sources in conventional NGS workflows. The error profiles revealed by our study highlight new directions for further improving NGS analysis accuracy both experimentally and computationally, ultimately enhancing the precision of deep sequencing.
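
To make the substitution classes in the abstract concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes mismatch counts and reference-base counts have already been tallied from aligned reads; the function names (`substitution_class`, `error_rates`, `binom_tail`) and the toy counts are hypothetical. It collapses each substitution with its reverse-complement partner (for example, T>G is reported as A>C/T>G), computes a per-class error rate, and uses a plain binomial tail as one simple way to ask whether an observed alternate-read count exceeds a given background error rate.

```python
from collections import Counter
from math import exp, lgamma, log

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Collapse a substitution with its reverse-complement partner.

    For example, both A>C and T>G map to the label 'A>C/T>G'.
    """
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = sorted([f"{ref}>{alt}", f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"])
    return "/".join(pair)


def error_rates(mismatch_counts, ref_base_counts):
    """Per-class substitution error rate.

    mismatch_counts: {(ref, alt): number of mismatching base calls}
    ref_base_counts: {base: number of base calls overlapping that reference base}
    The denominator for a class is the number of calls at either reference
    base of the pair (e.g. A plus T for 'A>C/T>G').
    """
    errors = Counter()
    for (ref, alt), n in mismatch_counts.items():
        errors[substitution_class(ref, alt)] += n
    rates = {}
    for cls, n in errors.items():
        first_ref = cls[0]  # reference base of the first member of the pair
        denom = ref_base_counts[first_ref] + ref_base_counts[COMPLEMENT[first_ref]]
        rates[cls] = n / denom
    return rates


def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X < k).

    Adequate for a sketch; very small tails lose precision in the subtraction.
    """
    if k <= 0:
        return 1.0
    cdf = 0.0
    for i in range(k):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1.0 - p))
        cdf += exp(log_pmf)
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    mismatches = {("A", "C"): 3, ("T", "G"): 2, ("A", "G"): 40, ("T", "C"): 60}
    ref_bases = {"A": 250_000, "C": 250_000, "G": 250_000, "T": 250_000}
    for cls, rate in sorted(error_rates(mismatches, ref_bases).items()):
        print(f"{cls}: {rate:.1e}")
    # Chance of seeing 10 or more error reads at depth 10,000
    # if the background error rate were 1e-4.
    print(binom_tail(10, 10_000, 1e-4))
```

Under these assumptions, a variant at 0.1% frequency sequenced to a depth of 10,000 would be expected to contribute about 10 supporting reads, while a background error rate of 10⁻⁴ would contribute about 1. This is why the class-specific background rate matters for low-frequency detection.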

Keywords: Deep sequencing; Detection; Error rate; Hotspot mutation; Subclonal; Substitution

Year: 2019 PMID: 30867008 PMCID: PMC6417284 DOI: 10.1186/s13059-019-1659-6

Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583

Cited by (67 in total)

1. Distinct error rates for reference and nonreference genotypes estimated by pedigree analysis.

Authors: Richard J Wang; Predrag Radivojac; Matthew W Hahn
Journal: Genetics Date: 2021-03-03 Impact factor: 4.562

2. Stability of SARS-CoV-2 phylogenies.

Authors: Yatish Turakhia; Nicola De Maio; Bryan Thornlow; Landen Gozashti; Robert Lanfear; Conor R Walker; Angie S Hinrichs; Jason D Fernandes; Rui Borges; Greg Slodkowicz; Lukas Weilguny; David Haussler; Nick Goldman; Russell Corbett-Detig
Journal: PLoS Genet Date: 2020-11-18 Impact factor: 5.917

3. Sensitive Identification of Bacterial DNA in Clinical Specimens by Broad-Range 16S rRNA Gene Enrichment.

Authors: Sara Rassoulian Barrett; Noah G Hoffman; Christopher Rosenthal; Andrew Bryan; Desiree A Marshall; Joshua Lieberman; Alexander L Greninger; Vikas Peddu; Brad T Cookson; Stephen J Salipante
Journal: J Clin Microbiol Date: 2020-11-18 Impact factor: 5.948

4. A Zipf-plot based normalization method for high-throughput RNA-seq data.

Authors: Bin Wang
Journal: PLoS One Date: 2020-04-09 Impact factor: 3.240

5. The NSD2 p.E1099K Mutation Is Enriched at Relapse and Confers Drug Resistance in a Cell Context-Dependent Manner in Pediatric Acute Lymphoblastic Leukemia.

Authors: Joanna Pierro; Jason Saliba; Sonali Narang; Gunjan Sethia; Shella Saint Fleur-Lominy; Ashfiyah Chowdhury; Anita Qualls; Hannah Fay; Harrison L Kilberg; Takaya Moriyama; Tori J Fuller; David T Teachey; Kjeld Schmiegelow; Jun J Yang; Mignon L Loh; Patrick A Brown; Jinghui Zhang; Xiaotu Ma; Aristotelis Tsirigos; Nikki A Evensen; William L Carroll
Journal: Mol Cancer Res Date: 2020-04-24 Impact factor: 5.852

6. Fitness selection of hyperfusogenic measles virus F proteins associated with neuropathogenic phenotypes.

Authors: Satoshi Ikegame; Takao Hashiguchi; Chuan-Tien Hung; Kristina Dobrindt; Kristen J Brennand; Makoto Takeda; Benhur Lee
Journal: Proc Natl Acad Sci U S A Date: 2021-05-04 Impact factor: 11.205

7. St. Jude Cloud: A Pediatric Cancer Genomic Data-Sharing Ecosystem.

Authors: Clay McLeod; Alexander M Gout; Xin Zhou; Andrew Thrasher; Delaram Rahbarinia; Samuel W Brady; Michael Macias; Kirby Birch; David Finkelstein; Jobin Sunny; Rahul Mudunuri; Brent A Orr; Madison Treadway; Bob Davidson; Tracy K Ard; Arthur Chiao; Andrew Swistak; Stephanie Wiggins; Scott Foy; Jian Wang; Edgar Sioson; Shuoguo Wang; J Robert Michael; Yu Liu; Xiaotu Ma; Aman Patel; Michael N Edmonson; Mark R Wilkinson; Andrew M Frantz; Ti-Cheng Chang; Liqing Tian; Shaohua Lei; S M Ashiqul Islam; Christopher Meyer; Naina Thangaraj; Pamella Tater; Vijay Kandali; Singer Ma; Tuan Nguyen; Omar Serang; Irina McGuire; Nedra Robison; Darrell Gentry; Xing Tang; Lance E Palmer; Gang Wu; Ed Suh; Leigh Tanner; James McMurry; Matthew Lear; Alberto S Pappo; Zhaoming Wang; Carmen L Wilson; Yong Cheng; Soheil Meshinchi; Ludmil B Alexandrov; Mitchell J Weiss; Gregory T Armstrong; Leslie L Robison; Yutaka Yasui; Kim E Nichols; David W Ellison; Chaitanya Bangur; Charles G Mullighan; Suzanne J Baker; Michael A Dyer; Geralyn Miller; Scott Newman; Michael Rusch; Richard Daly; Keith Perry; James R Downing; Jinghui Zhang
Journal: Cancer Discov Date: 2021-01-06 Impact factor: 39.397

8. SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data.

Authors: Eric M Davis; Yu Sun; Yanling Liu; Pandurang Kolekar; Ying Shao; Karol Szlachta; Heather L Mulder; Dongren Ren; Stephen V Rice; Zhaoming Wang; Joy Nakitandwe; Alexander M Gout; Bridget Shaner; Salina Hall; Leslie L Robison; Stanley Pounds; Jeffery M Klco; John Easton; Xiaotu Ma
Journal: Genome Biol Date: 2021-01-25 Impact factor: 13.583

9. Precision genome editing using cytosine and adenine base editors in mammalian cells.

Authors: Tony P Huang; Gregory A Newby; David R Liu
Journal: Nat Protoc Date: 2021-01-18 Impact factor: 13.491

10. Prediction and validation of hematopoietic stem and progenitor cell off-target editing in transplanted rhesus macaques.

Authors: Aisha A AlJanahi; Cicera R Lazzarotto; Shirley Chen; Tae-Hoon Shin; Stefan Cordes; Xing Fan; Isabel Jabara; Yifan Zhou; David J Young; Byung-Chul Lee; Kyung-Rok Yu; Yuesheng Li; Bradley Toms; Ilker Tunc; So Gun Hong; Lauren L Truitt; Julia Klermund; Geoffroy Andrieux; Miriam Y Kim; Toni Cathomen; Saar Gill; Shengdar Q Tsai; Cynthia E Dunbar
Journal: Mol Ther Date: 2021-06-24 Impact factor: 11.454