
DCMD: Distance-based classification using mixture distributions on microbiome data.

Konstantin Shestopaloff, Mei Dong, Fan Gao, Wei Xu.

Abstract

Recent advances in next-generation sequencing techniques have allowed researchers to conduct comprehensive research on the microbiome and human diseases, with recent studies identifying associations between the human microbiome and health outcomes for a number of chronic conditions. However, the structure of microbiome data, characterized by sparsity and skewness, presents challenges to building effective classifiers. To address this, we present an innovative approach for distance-based classification using mixture distributions (DCMD). The method aims to improve classification performance using microbiome community data, where the predictors are composed of sparse and heterogeneous count data. This approach models the inherent uncertainty in sparse counts by estimating a mixture distribution for the sample data and representing each observation as a distribution, conditional on the observed counts and the estimated mixture; these distributions are then used as inputs for distance-based classification. The method is implemented within k-means classification and k-nearest neighbours frameworks. We develop two distance metrics that produce optimal results. The performance of the model is assessed using simulated and human microbiome study data, with results compared against a number of existing machine learning and distance-based classification approaches. The proposed method is competitive when compared to the other machine learning approaches, and shows a clear improvement over commonly used distance-based classifiers, underscoring the importance of modelling sparsity for achieving optimal results. Its range of applicability and robustness make the proposed method a viable alternative for classification using sparse microbiome count data. The source code is available at https://github.com/kshestop/DCMD for academic use.
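
The general idea in the abstract (estimate a mixture, represent each observation as a conditional distribution, classify by distance) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository above. It assumes a single taxon, a Poisson mixture over a fixed grid of rates, an L1 distance between posterior CDFs, and a plain k-nearest neighbours vote; every function name and modelling choice here is hypothetical.

```python
import math
import numpy as np

def poisson_logpmf(x, lam):
    # log P(X = x) for a Poisson(lam) variable, lam > 0
    return x * math.log(lam) - lam - math.lgamma(x + 1)

def fit_grid_mixture(counts, grid, n_iter=200):
    """EM for the mixing weights of a Poisson mixture with fixed rates `grid`.
    (Assumed model for illustration; the paper's mixture may differ.)"""
    lik = np.array([[math.exp(poisson_logpmf(x, lam)) for lam in grid]
                    for x in counts])
    w = np.full(len(grid), 1.0 / len(grid))
    for _ in range(n_iter):
        resp = lik * w                            # E-step: unnormalised responsibilities
        resp /= resp.sum(axis=1, keepdims=True)
        w = resp.mean(axis=0)                     # M-step: update mixing weights
    return w

def posterior(x, grid, w):
    """Distribution over grid rates, conditional on count x and the fitted mixture."""
    p = np.array([math.exp(poisson_logpmf(x, lam)) for lam in grid]) * w
    return p / p.sum()

def cdf_distance(p, q):
    # Hypothetical metric: L1 distance between CDFs on the shared grid
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

def knn_predict(train_counts, train_labels, x_new, grid, w, k=3):
    """Majority vote among the k training observations nearest to x_new."""
    p_new = posterior(x_new, grid, w)
    d = [cdf_distance(posterior(x, grid, w), p_new) for x in train_counts]
    nearest = np.argsort(d)[:k]
    labels, votes = np.unique(np.asarray(train_labels)[nearest],
                              return_counts=True)
    return labels[np.argmax(votes)]

# Toy data for one taxon: sparse counts in one class, abundant in the other
train_counts = [0, 1, 0, 2, 30, 25, 40, 35]
train_labels = ["low"] * 4 + ["high"] * 4
grid = [0.5, 2.0, 10.0, 30.0]
w = fit_grid_mixture(train_counts, grid)
pred_sparse = knn_predict(train_counts, train_labels, 1, grid, w)
pred_abundant = knn_predict(train_counts, train_labels, 28, grid, w)
```

Comparing posterior distributions rather than raw counts means that two observations of 0 and 1 are treated as close when the fitted mixture says both are plausible under the same low rate, which is the kind of uncertainty in sparse counts that the abstract describes.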

Year: 2021; PMID: 33711013; PMCID: PMC7990174; DOI: 10.1371/journal.pcbi.1008799

Source DB: PubMed; Journal: PLoS Comput Biol; ISSN: 1553-734X; Impact factor: 4.475



