
Transformation and differential abundance analysis of microbiome data incorporating phylogeny.

Chao Zhou, Hongyu Zhao, Tao Wang.

Abstract

MOTIVATION: Microbiome data have proven extremely useful for understanding microbial communities and their impacts in health and disease. Although microbiome analysis methods and standards are evolving rapidly, obtaining meaningful and interpretable results from microbiome studies still requires careful statistical treatment. In particular, many existing and emerging methods for differential abundance analysis fail to account for the fact that microbiome data are high-dimensional and sparse, compositional, negatively and positively correlated, and phylogenetically structured. To better describe microbiome data and improve the power of differential abundance testing, there is still a great need for the continued development of appropriate statistical methodology.
RESULTS: In this paper, we propose a model-based approach for microbiome data transformation, and a phylogenetically informed procedure for differential abundance (DA) testing based on the transformed data. First, we extend the Dirichlet-tree multinomial (DTM) to zero-inflated DTM (ZIDTM) for multivariate modeling of microbial counts, addressing data sparsity as well as correlation and phylogeny among bacterial taxa. Then, within this framework and using a Bayesian formulation, we introduce posterior mean transformation to convert raw counts into nonzero relative abundances that sum to one, accounting for the compositional nature of microbiome data. Second, using the transformed data, we propose adaptive analysis of composition of microbiomes (adaANCOM) for DA testing by constructing log-ratios adaptively on the tree for each taxon, greatly reducing the computational complexity of ANCOM in high dimensions. Finally, we present extensive simulation studies, an analysis of HMP data across 18 body sites and 2 visits, and an application to a gut microbiome and malnutrition study, to investigate the performance of posterior mean transformation and adaANCOM. Comparisons with ANCOM and other DA testing procedures show that adaANCOM controls the false discovery rate well, allows for easy interpretation of the results, and is computationally efficient for high-dimensional problems.
AVAILABILITY: The developed R package is available at https://github.com/ZRChao/adaANCOM. For replicability purposes, scripts for our simulations and data analysis are available at https://github.com/ZRChao/Papers_supplementary.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
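The idea behind the posterior mean transformation mentioned in RESULTS can be illustrated with a much simpler model than the one the paper uses. The sketch below assumes a plain Dirichlet-multinomial prior over taxa, with no tree structure and no zero inflation, so it is not the ZIDTM procedure itself and not the adaANCOM package API. The function name `posterior_mean_transform` and the fixed prior value `alpha = 0.5` are hypothetical choices for illustration only; in the paper the model parameters are estimated from the data. What the sketch does show is why a Bayesian posterior mean turns raw counts, including zeros, into strictly positive proportions that sum to one.

```python
import numpy as np


def posterior_mean_transform(counts, alpha):
    """Posterior mean of taxon proportions under a flat Dirichlet-multinomial.

    Simplifying assumption: a single Dirichlet prior over all taxa
    (no phylogenetic tree, no zero inflation), unlike the ZIDTM model.
    counts: (n_samples, n_taxa) array of raw counts.
    alpha:  (n_taxa,) array of positive Dirichlet parameters (hypothetical values).
    """
    counts = np.asarray(counts, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if np.any(alpha <= 0):
        raise ValueError("Dirichlet parameters must be strictly positive")
    # The Dirichlet prior is conjugate to the multinomial, so the
    # posterior for each sample is Dirichlet(counts + alpha).
    posterior = counts + alpha
    # The mean of a Dirichlet is its parameter vector divided by its sum.
    return posterior / posterior.sum(axis=-1, keepdims=True)


# Two samples, four taxa, with several zero counts.
counts = np.array([[10, 0, 5, 0],
                   [0, 3, 0, 7]])
alpha = np.full(4, 0.5)  # hypothetical prior, chosen only for this example
props = posterior_mean_transform(counts, alpha)
print(props)
```

Because every prior parameter is positive, each zero count is mapped to a small positive proportion, which is what makes downstream log-ratios well defined.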
© The Author(s) (2021). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Year:  2021        PMID: 34302462     DOI: 10.1093/bioinformatics/btab543

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2 in total

1.  phyloMDA: an R package for phylogeny-aware microbiome data analysis.

Authors:  Tiantian Liu; Chao Zhou; Huimin Wang; Hongyu Zhao; Tao Wang
Journal:  BMC Bioinformatics       Date:  2022-06-06       Impact factor: 3.307

2.  tascCODA: Bayesian Tree-Aggregated Analysis of Compositional Amplicon and Single-Cell Data.

Authors:  Johannes Ostner; Salomé Carcy; Christian L Müller
Journal:  Front Genet       Date:  2021-12-07       Impact factor: 4.599

