
Sparse distance-based learning for simultaneous multiclass classification and feature selection of metagenomic data.

Zhenqiu Liu, William Hsiao, Brandi L Cantarel, Elliott Franco Drábek, Claire Fraser-Liggett.

Abstract

MOTIVATION: Direct sequencing of microbes in human ecosystems (the human microbiome) has complemented single genome cultivation and sequencing to understand and explore the impact of commensal microbes on human health. As sequencing technologies improve and costs decline, the sophistication of data has outgrown available computational methods. While several existing machine learning methods have been adapted for analyzing microbiome data recently, there is not yet an efficient and dedicated algorithm available for multiclass classification of human microbiota.
RESULTS: By combining instance-based and model-based learning, we propose a novel sparse distance-based learning method for simultaneous class prediction and feature selection (the terms feature, variable and taxon are used interchangeably) from multiple treatment populations on the basis of 16S rRNA sequence count data. Our proposed method simultaneously minimizes the intraclass distance and maximizes the interclass distance with far fewer estimated parameters than other methods. It is very efficient for problems with small sample sizes and unbalanced classes, which are common in metagenomic studies. We implemented this method in a MATLAB toolbox called MetaDistance. We also propose several approaches for data normalization and variance stabilization transformation in MetaDistance. We validate this method on several real and simulated 16S rRNA datasets to show that it outperforms existing methods for classifying metagenomic data. This article is the first to address simultaneous multifeature selection and class prediction with metagenomic count data.
AVAILABILITY: The MATLAB toolbox is freely available online at http://metadistance.igs.umaryland.edu/.
CONTACT: zliu@umm.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
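
The idea of weighting taxa so that samples sit close to their own class and far from other classes can be illustrated with a minimal Python sketch. This is not the MetaDistance implementation: the function names, the square-root transform (one common variance-stabilizing choice for count data), the centroid-based distances and the hand-set weights are all assumptions made for illustration. In the method described above the weights would be learned under a sparsity constraint rather than fixed by hand.

```python
import math


def normalize(counts):
    """Scale one sample's raw counts to proportions, then take square roots.

    The square-root transform is an assumed variance-stabilizing step.
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [math.sqrt(c / total) for c in counts]


def fit_centroids(samples, labels):
    """Return a dict mapping each class label to its mean feature vector."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in by_class.items()
    }


def weighted_sq_dist(x, y, weights):
    """Squared Euclidean distance with one non-negative weight per taxon."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def distance_criterion(samples, labels, weights):
    """Return (intraclass, interclass) mean weighted squared distances.

    Intraclass: sample to its own class centroid.
    Interclass: between every pair of class centroids.
    """
    cents = fit_centroids(samples, labels)
    intra = sum(
        weighted_sq_dist(x, cents[y], weights) for x, y in zip(samples, labels)
    ) / len(samples)
    keys = sorted(cents)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    inter = sum(
        weighted_sq_dist(cents[a], cents[b], weights) for a, b in pairs
    ) / len(pairs)
    return intra, inter


def predict(x, centroids, weights):
    """Assign x to the class whose centroid is nearest under the weights."""
    return min(centroids, key=lambda k: weighted_sq_dist(x, centroids[k], weights))


# Toy count table (made-up numbers): 6 samples x 3 taxa, 3 classes.
raw = [[90, 5, 5], [80, 10, 10], [5, 90, 5], [10, 80, 10], [5, 5, 90], [10, 10, 80]]
labels = ["A", "A", "B", "B", "C", "C"]
samples = [normalize(r) for r in raw]

sparse_weights = [1.0, 1.0, 0.0]  # a zero weight drops the third taxon entirely
intra, inter = distance_criterion(samples, labels, sparse_weights)
centroids = fit_centroids(samples, labels)
predicted = predict(normalize([70, 20, 10]), centroids, sparse_weights)
```

A weight of zero removes a taxon from every distance computation, which is how a sparse weight vector performs feature selection and classification at the same time in this sketch.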

Year:  2011        PMID: 21984758      PMCID: PMC3223360          DOI: 10.1093/bioinformatics/btr547

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  16 in total

1.  Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy.

Authors:  Qiong Wang; George M Garrity; James M Tiedje; James R Cole
Journal:  Appl Environ Microbiol       Date:  2007-06-22       Impact factor: 4.792

2.  The human microbiome project.

Authors:  Peter J Turnbaugh; Ruth E Ley; Micah Hamady; Claire M Fraser-Liggett; Rob Knight; Jeffrey I Gordon
Journal:  Nature       Date:  2007-10-18       Impact factor: 49.962

3.  Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities.

Authors:  Patrick D Schloss; Sarah L Westcott; Thomas Ryabin; Justine R Hall; Martin Hartmann; Emily B Hollister; Ryan A Lesniewski; Brian B Oakley; Donovan H Parks; Courtney J Robinson; Jason W Sahl; Blaz Stres; Gerhard G Thallinger; David J Van Horn; Carolyn F Weber
Journal:  Appl Environ Microbiol       Date:  2009-10-02       Impact factor: 4.792

4.  Visual and statistical comparison of metagenomes.

Authors:  Suparna Mitra; Bernhard Klar; Daniel H Huson
Journal:  Bioinformatics       Date:  2009-06-10       Impact factor: 6.937

5.  Outcomes in patients infected with carbapenem-resistant Acinetobacter baumannii and treated with tigecycline alone or in combination therapy.

Authors:  R Guner; I Hasanoglu; S Keske; A K Kalem; M A Tasyaran
Journal:  Infection       Date:  2011-07-26       Impact factor: 3.553

6.  Bacterial community variation in human body habitats across space and time.

Authors:  Elizabeth K Costello; Christian L Lauber; Micah Hamady; Noah Fierer; Jeffrey I Gordon; Rob Knight
Journal:  Science       Date:  2009-11-05       Impact factor: 47.728

7.  Human gut microbiota in obesity and after gastric bypass.

Authors:  Husen Zhang; John K DiBaise; Andrea Zuccolo; Dave Kudrna; Michele Braidotti; Yeisoo Yu; Prathap Parameswaran; Michael D Crowell; Rod Wing; Bruce E Rittmann; Rosa Krajmalnik-Brown
Journal:  Proc Natl Acad Sci U S A       Date:  2009-01-21       Impact factor: 11.205

8.  UniFrac: a new phylogenetic method for comparing microbial communities.

Authors:  Catherine Lozupone; Rob Knight
Journal:  Appl Environ Microbiol       Date:  2005-12       Impact factor: 4.792

9.  Statistical methods for detecting differentially abundant features in clinical metagenomic samples.

Authors:  James Robert White; Niranjan Nagarajan; Mihai Pop
Journal:  PLoS Comput Biol       Date:  2009-04-10       Impact factor: 4.475

10.  Methods for comparative metagenomics.

Authors:  Daniel H Huson; Daniel C Richter; Suparna Mitra; Alexander F Auch; Stephan C Schuster
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169

  18 in total

1.  Zero-Inflated Beta Regression for Differential Abundance Analysis with Metagenomics Data.

Authors:  Xiaoling Peng; Gang Li; Zhenqiu Liu
Journal:  J Comput Biol       Date:  2015-12-16       Impact factor: 1.479

Review 2.  The Role of the Gut Microbiome in Predicting Response to Diet and the Development of Precision Nutrition Models-Part I: Overview of Current Methods.

Authors:  Riley L Hughes; Maria L Marco; James P Hughes; Nancy L Keim; Mary E Kable
Journal:  Adv Nutr       Date:  2019-11-01       Impact factor: 8.701

3.  Multilevel regularized regression for simultaneous taxa selection and network construction with metagenomic count data.

Authors:  Zhenqiu Liu; Fengzhu Sun; Jonathan Braun; Dermot P B McGovern; Steven Piantadosi
Journal:  Bioinformatics       Date:  2014-11-20       Impact factor: 6.937

4.  Pattern and synchrony of gene expression among sympatric marine microbial populations.

Authors:  Elizabeth A Ottesen; Curtis R Young; John M Eppley; John P Ryan; Francisco P Chavez; Christopher A Scholin; Edward F DeLong
Journal:  Proc Natl Acad Sci U S A       Date:  2013-01-23       Impact factor: 11.205

5.  Phylogenetic approaches to microbial community classification.

Authors:  Jie Ning; Robert G Beiko
Journal:  Microbiome       Date:  2015-10-05       Impact factor: 14.650

6.  Alignment-free supervised classification of metagenomes by recursive SVM.

Authors:  Hongfei Cui; Xuegong Zhang
Journal:  BMC Genomics       Date:  2013-09-22       Impact factor: 3.969

7.  Exploration and retrieval of whole-metagenome sequencing samples.

Authors:  Sohan Seth; Niko Välimäki; Samuel Kaski; Antti Honkela
Journal:  Bioinformatics       Date:  2014-05-19       Impact factor: 6.937

8.  Microbial transformation from normal oral microbiota to acute endodontic infections.

Authors:  William Wl Hsiao; Kevin L Li; Zhenqiu Liu; Cheron Jones; Claire M Fraser-Liggett; Ashraf F Fouad
Journal:  BMC Genomics       Date:  2012-07-28       Impact factor: 3.969

9.  Class prediction and feature selection with linear optimization for metagenomic count data.

Authors:  Zhenqiu Liu; Dechang Chen; Li Sheng; Amy Y Liu
Journal:  PLoS One       Date:  2013-03-26       Impact factor: 3.240

10.  Efficient feature selection and multiclass classification with integrated instance and model based learning.

Authors:  Zhenqiu Liu; Halima Bensmail; Ming Tan
Journal:  Evol Bioinform Online       Date:  2012-04-30       Impact factor: 1.625
