Tao Wang, Hongyu Zhao.
Abstract
Understanding the factors that alter the composition of the human microbiota may help inform personalized healthcare strategies and identify therapeutic drug targets. In many sequencing studies, microbial communities are characterized by a list of taxa, their counts, and their evolutionary relationships represented by a phylogenetic tree. In this article, we consider an extension of the Dirichlet multinomial distribution, called the Dirichlet-tree multinomial distribution, for multivariate, over-dispersed, and tree-structured count data. To address the relationships between these counts and a set of covariates, we propose the Dirichlet-tree multinomial regression model, for which we develop a penalized likelihood method for estimating parameters and selecting covariates. For efficient optimization, we adopt the accelerated proximal gradient approach. Simulation studies demonstrate the good performance of the proposed procedure. An analysis of a data set relating dietary nutrients to bacterial counts shows that incorporating the tree structure into the model helps increase prediction power.
Keywords: Dirichlet distributions; Over-dispersion; Sparse group lasso; Tree-structured learning
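As a rough illustration of the tree-structured, over-dispersed counts the abstract refers to, the sketch below draws one sample from a Dirichlet-tree multinomial: at each internal node of the tree, branch probabilities are drawn from a Dirichlet distribution and the reads reaching that node are split multinomially among its children. This is a minimal sketch and not the authors' code. The toy tree, the Dirichlet parameters, and the function names `sample_dirichlet` and `sample_dtm` are all hypothetical, and the sketch omits the covariates that the regression model would tie to these parameters.

```python
import random
from collections import Counter


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet vector by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_dtm(tree, alpha, root, n_reads, rng):
    """Draw leaf counts from a Dirichlet-tree multinomial.

    tree:  dict mapping each internal node to the list of its children
    alpha: dict mapping each internal node to positive Dirichlet
           parameters, one per child (hypothetical values here)
    """
    counts = {root: n_reads}
    leaf_counts = {}
    stack = [root]
    while stack:
        node = stack.pop()
        children = tree.get(node, [])
        if not children:
            # A node without children is a leaf taxon: record its count.
            leaf_counts[node] = counts[node]
            continue
        # Node-specific branch probabilities, then a multinomial split
        # of the reads that reached this node among its children.
        probs = sample_dirichlet(alpha[node], rng)
        picks = Counter(rng.choices(children, weights=probs, k=counts[node]))
        for child in children:
            counts[child] = picks[child]
            stack.append(child)
    return leaf_counts


# Hypothetical three-taxon tree: root -> (A, clade), clade -> (B, C).
tree = {"root": ["A", "clade"], "clade": ["B", "C"]}
alpha = {"root": [2.0, 3.0], "clade": [1.0, 1.0]}
rng = random.Random(0)
leaf_counts = sample_dtm(tree, alpha, "root", 1000, rng)
print(leaf_counts)
```

Because every split conserves the reads arriving at a node, the leaf counts always sum to the total number of reads. Repeated draws vary more than a plain multinomial with fixed probabilities would, which is the over-dispersion the distribution is meant to capture.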
Year: 2017 PMID: 28112797 PMCID: PMC5587402 DOI: 10.1111/biom.12654
Source DB: PubMed Journal: Biometrics ISSN: 0006-341X Impact factor: 2.571