
Unbiased Taxonomic Annotation of Metagenomic Samples.

Bruno Fosso, Graziano Pesole, Francesc Rosselló, Gabriel Valiente.

Abstract

Reads from a metagenomic sample are usually classified against a reference taxonomy by first mapping the reads to the reference sequences and then assigning each read to the node, under the lowest common ancestor of its candidate sequences in the reference taxonomy, that has the least classification error. However, this taxonomic annotation can be biased by an imbalanced taxonomy and by the presence of multiple nodes in the taxonomy with the least classification error for a given read. In this article, we show that the Rand index is a better indicator of classification error than the often-used area under the receiver operating characteristic (ROC) curve and F-measure, for both balanced and imbalanced reference taxonomies. We address the second source of bias by reducing the taxonomic annotation problem for a whole metagenomic sample to a set cover problem, for which a logarithmic approximation can be obtained in linear time and an exact solution can be obtained by integer linear programming. Experimental results with a proof-of-concept implementation of the set cover approach to taxonomic annotation, in an upcoming release of the TANGO software, show that the approach further reduces ambiguity in the taxonomic annotation obtained with TANGO without distorting the relative abundance profile of the metagenomic sample.
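To make the set cover formulation concrete, the following is a minimal sketch of the textbook greedy heuristic, which is the standard way to obtain a logarithmic approximation. It is not the TANGO implementation. The function name `greedy_set_cover`, the taxon labels, and the read identifiers are all hypothetical, and it assumes each candidate taxon comes with the set of reads that could be assigned to it. For brevity this version rescans every candidate on each round, so it does not achieve the linear running time mentioned in the abstract.

```python
from typing import Dict, List, Set


def greedy_set_cover(candidates: Dict[str, Set[str]]) -> List[str]:
    """Choose taxa greedily until every read is explained by a chosen taxon.

    `candidates` maps a taxon label (hypothetical) to the set of read
    identifiers (hypothetical) that could be assigned to that taxon.
    """
    # The universe is every read that appears under at least one taxon.
    uncovered: Set[str] = set().union(*candidates.values())
    chosen: List[str] = []
    while uncovered:
        # Pick the taxon that explains the most still-uncovered reads.
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda taxon: len(candidates[taxon] & uncovered))
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen


# Illustrative input only: five reads, four candidate taxa.
example = {
    "taxonA": {"r1", "r2", "r3"},
    "taxonB": {"r3", "r4"},
    "taxonC": {"r4", "r5"},
    "taxonD": {"r5"},
}
print(greedy_set_cover(example))
```

In this toy input, two taxa suffice to explain all five reads, so the ambiguous reads r3, r4, and r5 each end up with a single annotation. An exact minimum cover would instead require an integer linear programming solver, as the abstract notes.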

Keywords:  classification; correlation; metagenomics; set cover; taxonomic annotation.

Year:  2017        PMID: 29028181      PMCID: PMC5865273          DOI: 10.1089/cmb.2017.0144

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


