Family reunion via error correction: an efficient analysis of duplex sequencing data.

Nicholas Stoler, Barbara Arbeithuber, Gundula Povysil, Monika Heinzl, Renato Salazar, Kateryna D Makova, Irene Tiemann-Boege, Anton Nekrutenko.

Abstract

BACKGROUND: Duplex sequencing is the most accurate approach for identification of sequence variants present at very low frequencies. Its power comes from pooling together multiple descendants of both strands of original DNA molecules, which allows true nucleotide substitutions to be distinguished from PCR amplification and sequencing artifacts. This strategy comes at a cost: sequencing the same molecule multiple times increases dynamic range but significantly diminishes coverage, making whole-genome duplex sequencing prohibitively expensive. Furthermore, every duplex experiment produces a substantial proportion of singleton reads that cannot be used in the analysis and are thrown away.
RESULTS: In this paper we demonstrate that a significant fraction of these reads contain PCR or sequencing errors within duplex tags. Correcting such errors allows "reuniting" these reads with their respective families, increasing the output of the method and making it more cost-effective.
CONCLUSIONS: We combine an error correction strategy with a number of algorithmic improvements in a new version of the duplex analysis software, Du Novo 2.0. It is written in Python, C, AWK, and Bash. It is open source and readily available through Galaxy, Bioconda, and GitHub: https://github.com/galaxyproject/dunovo.
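
The idea of reuniting singleton reads with their families can be illustrated with a minimal sketch. It is not the Du Novo 2.0 implementation, whose actual procedure may differ. It assumes that a singleton tag is reassigned when exactly one larger family lies within a small Hamming distance of it. The function names (`hamming`, `reunite_singletons`, `corrected_family_sizes`) and the `max_mismatches` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two equal-length tags."""
    return sum(x != y for x, y in zip(a, b))


def reunite_singletons(tag_counts, max_mismatches=1):
    """Map every tag to a (possibly corrected) tag.

    Assumption for this sketch: a tag seen once is treated as an
    erroneous copy of a larger family if exactly one such family is
    within `max_mismatches` substitutions. Ambiguous or unmatched
    singletons are left unchanged.
    """
    families = {tag: n for tag, n in tag_counts.items() if n > 1}
    corrections = {}
    for tag, n in tag_counts.items():
        if n > 1:
            corrections[tag] = tag
            continue
        candidates = [
            fam for fam in families
            if len(fam) == len(tag) and hamming(fam, tag) <= max_mismatches
        ]
        corrections[tag] = candidates[0] if len(candidates) == 1 else tag
    return corrections


def corrected_family_sizes(tag_counts, max_mismatches=1):
    """Family sizes after singletons are merged into their families."""
    corrections = reunite_singletons(tag_counts, max_mismatches)
    sizes = Counter()
    for tag, n in tag_counts.items():
        sizes[corrections[tag]] += n
    return dict(sizes)


# Toy input: three reads share one tag, one read carries a single
# substitution in that tag, and one read has an unrelated tag.
tags = ["ACGTACGT"] * 3 + ["ACGTACGA", "TTTTGGGG"]
sizes = corrected_family_sizes(Counter(tags))
print(sizes)
```

In this toy input the read tagged `ACGTACGA` differs from the three-read family `ACGTACGT` at one position, so it is merged into that family, while `TTTTGGGG` has no close neighbor and remains a singleton.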

Keywords:  Barcodes; Duplex sequence; Error correction; Low frequency variants

Year:  2020        PMID: 32131723      PMCID: PMC7057607          DOI: 10.1186/s12859-020-3419-8

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  12 in total

1.  Detection of ultra-rare mutations by next-generation sequencing.

Authors:  Michael W Schmitt; Scott R Kennedy; Jesse J Salk; Edward J Fox; Joseph B Hiatt; Lawrence A Loeb
Journal:  Proc Natl Acad Sci U S A       Date:  2012-08-01       Impact factor: 11.205

2.  Towards error-free profiling of immune repertoires.

Authors:  Mikhail Shugay; Olga V Britanova; Ekaterina M Merzlyak; Maria A Turchaninova; Ilgar Z Mamedov; Timur R Tuganbaev; Dmitriy A Bolotin; Dmitry B Staroverov; Ekaterina V Putintseva; Karla Plevova; Carsten Linnemann; Dmitriy Shagin; Sarka Pospisilova; Sergey Lukyanov; Ton N Schumacher; Dmitriy M Chudakov
Journal:  Nat Methods       Date:  2014-05-04       Impact factor: 28.547

Review 3.  Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations.

Authors:  Jesse J Salk; Michael W Schmitt; Lawrence A Loeb
Journal:  Nat Rev Genet       Date:  2018-03-26       Impact factor: 53.242

4.  Maternal age effect and severe germ-line bottleneck in the inheritance of human mitochondrial DNA.

Authors:  Boris Rebolledo-Jaramillo; Marcia Shu-Wei Su; Nicholas Stoler; Jennifer A McElhoe; Benjamin Dickins; Daniel Blankenberg; Thorfinn S Korneliussen; Francesca Chiaromonte; Rasmus Nielsen; Mitchell M Holland; Ian M Paul; Anton Nekrutenko; Kateryna D Makova
Journal:  Proc Natl Acad Sci U S A       Date:  2014-10-13       Impact factor: 11.205

5.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.

Authors:  Ben Langmead; Cole Trapnell; Mihai Pop; Steven L Salzberg
Journal:  Genome Biol       Date:  2009-03-04       Impact factor: 13.583

6.  A standard curve based method for relative real time PCR data processing.

Authors:  Alexey Larionov; Andreas Krause; William Miller
Journal:  BMC Bioinformatics       Date:  2005-03-21       Impact factor: 3.169

7.  MAGERI: Computational pipeline for molecular-barcoded targeted resequencing.

Authors:  Mikhail Shugay; Andrew R Zaretsky; Dmitriy A Shagin; Irina A Shagina; Ivan A Volchenkov; Andrew A Shelenkov; Mikhail Y Lebedin; Dmitriy V Bagaev; Sergey Lukyanov; Dmitriy M Chudakov
Journal:  PLoS Comput Biol       Date:  2017-05-05       Impact factor: 4.475

8.  smCounter2: an accurate low-frequency variant caller for targeted sequencing data with unique molecular identifiers.

Authors:  Chang Xu; Xiujing Gu; Raghavendra Padmanabhan; Zhong Wu; Quan Peng; John DiCarlo; Yexun Wang
Journal:  Bioinformatics       Date:  2019-04-15       Impact factor: 6.937

9.  Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features.

Authors:  Timo Lassmann; Oliver Frings; Erik L L Sonnhammer
Journal:  Nucleic Acids Res       Date:  2008-12-22       Impact factor: 16.971

10.  Streamlined analysis of duplex sequencing data with Du Novo.

Authors:  Nicholas Stoler; Barbara Arbeithuber; Wilfried Guiblet; Kateryna D Makova; Anton Nekrutenko
Journal:  Genome Biol       Date:  2016-08-26       Impact factor: 13.583

  4 in total

1.  Age-related accumulation of de novo mitochondrial mutations in mammalian oocytes and somatic tissues.

Authors:  Barbara Arbeithuber; James Hester; Marzia A Cremona; Nicholas Stoler; Arslan Zaidi; Bonnie Higgins; Kate Anthony; Francesca Chiaromonte; Francisco J Diaz; Kateryna D Makova
Journal:  PLoS Biol       Date:  2020-07-15       Impact factor: 8.029

2.  High prevalence of somatic PIK3CA and TP53 pathogenic variants in the normal mammary gland tissue of sporadic breast cancer patients revealed by duplex sequencing.

Authors:  Anna Kostecka; Tomasz Nowikiewicz; Paweł Olszewski; Magdalena Koczkowska; Monika Horbacz; Monika Heinzl; Maria Andreou; Renato Salazar; Theresa Mair; Piotr Madanecki; Magdalena Gucwa; Hanna Davies; Jarosław Skokowski; Patrick G Buckley; Rafał Pęksa; Ewa Śrutek; Łukasz Szylberg; Johan Hartman; Michał Jankowski; Wojciech Zegarski; Irene Tiemann-Boege; Jan P Dumanski; Arkadiusz Piotrowski
Journal:  NPJ Breast Cancer       Date:  2022-06-29

3.  Discovery of an unusually high number of de novo mutations in sperm of older men using duplex sequencing.

Authors:  Renato Salazar; Barbara Arbeithuber; Maja Ivankovic; Monika Heinzl; Sofia Moura; Ingrid Hartl; Theresa Mair; Angelika Lahnsteiner; Thomas Ebner; Omar Shebl; Johannes Pröll; Irene Tiemann-Boege
Journal:  Genome Res       Date:  2022-02-24       Impact factor: 9.043

4.  Advanced age increases frequencies of de novo mitochondrial mutations in macaque oocytes and somatic tissues.

Authors:  Barbara Arbeithuber; Marzia A Cremona; James Hester; Alison Barrett; Bonnie Higgins; Kate Anthony; Francesca Chiaromonte; Francisco J Diaz; Kateryna D Makova
Journal:  Proc Natl Acad Sci U S A       Date:  2022-04-08       Impact factor: 12.779
