Robert C Edgar (1), Henrik Flyvbjerg (2)
(1) Tiburon, CA 94920, USA
(2) Department of Micro- and Nanotechnology, Technical University of Denmark, DK-2800 Lyngby, Denmark
Abstract
MOTIVATION: Next-generation sequencing produces vast amounts of data with errors that are difficult to distinguish from true biological variation when coverage is low.
RESULTS: We demonstrate large reductions in error frequencies, especially for high-error-rate reads, by three independent means: (i) filtering reads according to their expected number of errors, (ii) assembling overlapping read pairs and (iii) for amplicon reads, exploiting unique sequence abundances to perform error correction. We also show that most published paired read assemblers calculate incorrect posterior quality scores.
AVAILABILITY AND IMPLEMENTATION: These methods are implemented in the USEARCH package. Binaries are freely available at http://drive5.com/usearch.
CONTACT: robert@drive5.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
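As an illustration of method (i), the expected number of errors in a read can be computed from its quality string by summing the per-base error probabilities implied by the Phred scores, p = 10^(-Q/10). The sketch below is not the USEARCH implementation; it assumes Phred+33 ASCII encoding, and the function names and the `max_ee` threshold of 1.0 are hypothetical choices made for the example.

```python
def expected_errors(quality, offset=33):
    """Sum of per-base error probabilities for one read.

    quality: FASTQ quality string (assumed Phred+33 unless offset is changed).
    Each character encodes Q = ord(char) - offset, and p = 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_filter(quality, max_ee=1.0, offset=33):
    """Keep a read only if its expected error count is at most max_ee.

    max_ee is an illustrative threshold, not a value taken from the paper.
    """
    return expected_errors(quality, offset) <= max_ee


if __name__ == "__main__":
    # Hypothetical quality strings: one high-quality read, one low-quality read.
    for q in ["IIIIIIIIII", "++++++++++"]:
        print(q, expected_errors(q), passes_filter(q))
```

Filtering on the summed probabilities rather than on the mean quality score takes read length into account: a long read with moderate scores can accumulate more expected errors than a short read with the same average.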