L Schaeffer (1), H Pimentel (2), N Bray (3), P Melsted (4), L Pachter (1, 5)

1. Department of Molecular and Cell Biology, UC Berkeley, Berkeley, CA, USA.
2. Department of Genetics, Stanford University, Stanford, CA, USA.
3. Department of Molecular and Cell Biology and Innovative Genomics Institute, UC Berkeley, Berkeley, CA, USA.
4. Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavik, Iceland.
5. Departments of Mathematics and Computer Science, UC Berkeley, Berkeley, CA, USA.
Abstract
MOTIVATION: Read assignment is an important first step in many metagenomic analysis workflows, providing the basis for identification and quantification of species. However, ambiguity among the sequences of many strains makes it difficult to assign reads at the lowest level of taxonomy, so reads are typically assigned to the taxonomic levels at which they are unambiguous. We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data in order to develop novel methods for rapid and accurate quantification of metagenomic strains.

RESULTS: We find that the recent idea of pseudoalignment, introduced in the RNA-Seq context, is highly applicable in the metagenomics setting. When pseudoalignment is coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible with state-of-the-art software, making it possible and practical for the first time to analyze abundances of individual genomes in metagenomics projects.

AVAILABILITY AND IMPLEMENTATION: Pipeline and analysis code can be downloaded from http://github.com/pachterlab/metakallisto.

CONTACT: lpachter@math.berkeley.edu.
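The coupling of pseudoalignment with EM mentioned in the RESULTS can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that pseudoalignment has already grouped reads into equivalence classes (sets of genomes each read is compatible with), and that reads are drawn from a genome in proportion to its abundance divided by its length. The function name `em_abundances`, its parameters, and the toy counts are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List


def em_abundances(
    eq_class_counts: Dict[FrozenSet[int], int],
    genome_lengths: List[float],
    n_iter: int = 200,
) -> List[float]:
    """Estimate the fraction of reads originating from each genome.

    eq_class_counts maps a set of genome indices (an equivalence class)
    to the number of reads compatible with exactly that set.
    This is an illustrative sketch, not the metakallisto code.
    """
    n = len(genome_lengths)
    # Start from a uniform guess over all genomes.
    alpha = [1.0 / n] * n
    for _ in range(n_iter):
        assigned = [0.0] * n
        # E-step: split each class's reads among its genomes in
        # proportion to current abundance divided by genome length.
        for genomes, count in eq_class_counts.items():
            weights = {g: alpha[g] / genome_lengths[g] for g in genomes}
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # no compatible genome has support left
            for g, w in weights.items():
                assigned[g] += count * w / norm
        # M-step: renormalize the fractional read counts.
        total = sum(assigned)
        alpha = [a / total for a in assigned]
    return alpha


# Toy example (assumed data): two equal-length strains, with 30 reads
# unique to strain 0, 10 unique to strain 1, and 60 ambiguous reads.
toy_counts = {
    frozenset({0}): 30,
    frozenset({1}): 10,
    frozenset({0, 1}): 60,
}
toy_result = em_abundances(toy_counts, genome_lengths=[1000.0, 1000.0])
```

In this toy case the ambiguous reads are shared out according to the evidence from the unique reads, so the estimates converge to approximately 0.75 and 0.25 instead of leaving the 60 shared reads unassigned at a higher taxonomic level.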