SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data.

Eric M Davis, Yu Sun, Yanling Liu, Pandurang Kolekar, Ying Shao, Karol Szlachta, Heather L Mulder, Dongren Ren, Stephen V Rice, Zhaoming Wang, Joy Nakitandwe, Alexander M Gout, Bridget Shaner, Salina Hall, Leslie L Robison, Stanley Pounds, Jeffery M Klco, John Easton, Xiaotu Ma.

Abstract

BACKGROUND: No method currently exists to precisely measure the errors that arise in the sequencing instrument (sequencer) itself. Such a measurement is critical for next-generation sequencing applications that aim to resolve the genetic makeup of heterogeneous cellular populations.
RESULTS: We propose a novel computational method, SequencErr, to address this challenge by measuring the base correspondence between overlapping regions in forward and reverse reads. An analysis of 3777 public datasets from 75 research institutions in 18 countries revealed a sequencer error rate of ~10 per million (pm); 1.4% of sequencers and 2.7% of flow cells have error rates >100 pm. At the flow cell level, error rates are elevated on the bottom surfaces, and >90% of HiSeq and NovaSeq flow cells have at least one outlier error-prone tile. By sequencing a common DNA library on different sequencers, we demonstrate that sequencers with high error rates have reduced overall sequencing accuracy, and that removing outlier error-prone tiles improves sequencing accuracy. We demonstrate that SequencErr can reveal novel insights relative to the popular quality control method FastQC and achieve a 10-fold lower error rate than popular error correction methods, including Lighter and Musket.
CONCLUSIONS: Our study reveals novel insights into the nature of DNA sequencing errors incurred on DNA sequencers. Our method can be used to assess, calibrate, and monitor sequencer accuracy, and to computationally suppress sequencer errors in existing datasets.
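
The overlap comparison described under RESULTS can be illustrated with a minimal Python sketch. This is not the SequencErr implementation: it assumes the overlap length of each read pair is already known (for example from alignment), that the fragment is at least as long as a read, and it ignores base qualities and indels. All function names are hypothetical, and the per-tile grouping assumes Illumina-style read names in which the fifth colon-separated field is the tile.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_mismatches(fwd, rev, overlap_len):
    """Count disagreements in the overlap of a read pair.

    Assumes the last `overlap_len` bases of the forward read cover the same
    fragment positions as the first `overlap_len` bases of the
    reverse-complemented reverse read. Returns (mismatches, bases_compared).
    """
    if overlap_len <= 0:
        return 0, 0
    fwd_part = fwd[-overlap_len:]
    rev_part = reverse_complement(rev)[:overlap_len]
    mismatches = compared = 0
    for a, b in zip(fwd_part, rev_part):
        if a == "N" or b == "N":  # skip uncalled bases
            continue
        compared += 1
        mismatches += a != b
    return mismatches, compared


def tile_of(read_name):
    """Extract the tile field from an Illumina-style read name (assumed layout:
    instrument:run:flowcell:lane:tile:x:y)."""
    return read_name.lstrip("@").split(":")[4]


def per_tile_rates(pairs):
    """Aggregate overlap mismatches per tile, in errors per million bases.

    `pairs` yields (read_name, fwd_seq, rev_seq, overlap_len) tuples.
    """
    counts = defaultdict(lambda: [0, 0])
    for name, fwd, rev, overlap_len in pairs:
        m, n = overlap_mismatches(fwd, rev, overlap_len)
        tile_counts = counts[tile_of(name)]
        tile_counts[0] += m
        tile_counts[1] += n
    return {tile: 1e6 * m / n for tile, (m, n) in counts.items() if n}
```

A disagreement in the overlap indicates that at least one of the two reads miscalled a base that came from the same DNA fragment, which is why the comparison isolates instrument error from library or biological variation. How an "outlier" tile is defined is not stated in the abstract, so the sketch stops at per-tile rates.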

Keywords:  DNA sequencing; Error suppression; Sequencer/instrument error

Year:  2021        PMID: 33487172      PMCID: PMC7829059          DOI: 10.1186/s13059-020-02254-2

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583



