Xinyan Zhang1, Yu-Fang Pei2, Lei Zhang2, Boyi Guo3, Amanda H Pendegraft3, Wenzhuo Zhuang4, Nengjun Yi3.
Abstract
Metagenomic sequencing data provide valuable resources for investigating associations between the microbiome and host environmental or clinical factors, and for tracking dynamic changes in microbial abundance over time. Microbiome measurements have distinctive properties: total sequence reads vary across samples, and the counts are over-dispersed and zero-inflated. Microbiome studies also usually collect samples longitudinally, which introduces time-dependent and correlation structures among the samples and further complicates the analysis and interpretation of microbiome count data. In this article, we propose negative binomial mixed models (NBMMs) for longitudinal microbiome studies. The proposed NBMMs efficiently handle over-dispersion and varying total reads, and account for the dynamic trend and correlation among longitudinal samples. We develop an efficient and stable algorithm to fit the NBMMs. We evaluate and demonstrate the NBMM method through extensive simulation studies and an application to a longitudinal microbiome dataset. The results show that the proposed method has desirable properties and outperforms previously used methods in providing a flexible framework for modeling correlation structures and in detecting dynamic effects. We have developed an R package, NBZIMM, to implement the proposed method; it is freely available from the public GitHub repository http://github.com/nyiuab/NBZIMM and provides a useful tool for analyzing longitudinal microbiome data.
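As a rough illustration of the kind of model the abstract describes, one common way to write a negative binomial mixed model with a subject-level random intercept is sketched below. The notation is an assumption made for this sketch, not quoted from the article: $y_{ij}$ is the count of a taxon in sample $j$ of subject $i$, $T_{ij}$ the total reads of that sample, $g_i$ a group indicator, $t_{ij}$ the sampling time, and $b_i$ the random intercept. The article's exact specification may differ.

$$
\begin{aligned}
y_{ij} \mid \mu_{ij} &\sim \mathrm{NB}(\mu_{ij}, \theta), \qquad \operatorname{Var}(y_{ij} \mid \mu_{ij}) = \mu_{ij} + \mu_{ij}^{2}/\theta, \\
\log \mu_{ij} &= \log T_{ij} + \beta_0 + \beta_1 g_i + \beta_2 t_{ij} + \beta_3 g_i t_{ij} + b_i, \qquad b_i \sim N(0, \tau^{2}).
\end{aligned}
$$

In this form the offset $\log T_{ij}$ absorbs the varying total reads, $\theta$ governs over-dispersion, $\beta_2$ and $\beta_3$ describe the time trend and how it differs between groups, and the shared $b_i$ induces correlation among samples from the same subject.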
Keywords: count data; longitudinal study; metagenomics; microbiome; negative binomial mixed model
Year: 2018 PMID: 30093893 PMCID: PMC6070621 DOI: 10.3389/fmicb.2018.01683
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Longitudinal microbiome data structure.
| Subject 1 | | | | | | | |
| Subject 1 | | | | | | | |
| Subject 1 | | | | | | | |
| Subject 2 | | | | | | | |
| · | · | · | · | · | · | · | |
| Subject n | | | | | | | |
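The column contents of this table did not survive in the source, but the repeated subject rows suggest a long layout with one row per sample and several rows per subject. The sketch below shows such a layout under that assumption. Every column name (`subject`, `time`, `group`, `total_reads`, `taxon_count`) and every value is hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per sample, several rows per subject.
# Column names and values are illustrative assumptions, not the study's data.
rows = [
    {"subject": 1, "time": 0, "group": 0, "total_reads": 12000, "taxon_count": 35},
    {"subject": 1, "time": 1, "group": 0, "total_reads": 9500, "taxon_count": 0},
    {"subject": 1, "time": 2, "group": 0, "total_reads": 15200, "taxon_count": 48},
    {"subject": 2, "time": 0, "group": 1, "total_reads": 8000, "taxon_count": 3},
    {"subject": 2, "time": 1, "group": 1, "total_reads": 11000, "taxon_count": 0},
]

# Group the samples by subject so each within-subject trajectory stays together.
by_subject = defaultdict(list)
for row in rows:
    by_subject[row["subject"]].append(row)

# Relative abundance adjusts each count for that sample's sequencing depth.
for subject, samples in sorted(by_subject.items()):
    abundances = [s["taxon_count"] / s["total_reads"] for s in samples]
    print(subject, [round(a, 5) for a in abundances])
```

Keeping the subject identifier on every row is what lets a mixed model attach a random effect to each subject.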
Parameter ranges in simulation studies.
| log( | Unif(0.1, 3.5) |
| Dispersion parameter θ | Unif(0.1, 5) |
| Fixed effects β1, β2, β3 (false positive rate) | 0, 0, 0 |
| Fixed effects β1, β2, β3 (power of interaction) | 0, 0, Unif(0.2, 0.35) or Unif(0.35, 0.8) |
| Fixed effects β1, β2, β3 (power of both β1 and β3) | All from Unif(0.2, 0.35) or Unif(0.35, 0.8) |
| Standard deviation τ | Unif(0.5, 1) |
| Correlation ρ | Unif(0.1, 0.5) |
| Standard deviation σ | Unif(0.1, 0.5) |
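To make the parameter ranges concrete, the following sketch draws one set of parameters and simulates counts for a single taxon under the "power of interaction" scenario. It is a minimal illustration under stated assumptions, not the authors' simulation code: it assumes β1, β2 and β3 are the group, time and group-by-time effects, that τ is the standard deviation of a subject-level random intercept, and that the truncated first row of the table gives the baseline log mean. The design (20 subjects, 4 time points, alternating group assignment) is invented for the example, and the settings involving ρ and σ are left out.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before lam."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def neg_binomial(mu, theta, rng):
    """Gamma-Poisson mixture with mean mu and variance mu + mu**2 / theta."""
    lam = rng.gammavariate(theta, mu / theta)  # shape theta, scale mu / theta
    return poisson(lam, rng)


def simulate_taxon(n_subjects=20, n_times=4, effect_range=(0.2, 0.35), seed=1):
    """Simulate (subject, group, time, count) rows for one taxon."""
    rng = random.Random(seed)
    base = rng.uniform(0.1, 3.5)     # ASSUMED baseline log mean (row truncated in source)
    theta = rng.uniform(0.1, 5.0)    # dispersion parameter
    b1, b2 = 0.0, 0.0                # group and time effects set to zero
    b3 = rng.uniform(*effect_range)  # group-by-time interaction effect
    tau = rng.uniform(0.5, 1.0)      # ASSUMED: SD of the subject random intercept
    rows = []
    for subject in range(n_subjects):
        group = subject % 2          # hypothetical balanced two-group design
        intercept = rng.gauss(0.0, tau)
        for time in range(n_times):
            log_mu = base + b1 * group + b2 * time + b3 * group * time + intercept
            count = neg_binomial(math.exp(log_mu), theta, rng)
            rows.append((subject, group, time, count))
    return rows


rows = simulate_taxon()
print(rows[:4])
```

Passing `effect_range=(0.35, 0.8)` switches to the larger effect range listed in the table, and setting `b3` to zero as well would give the null scenario used for false positive rates.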
Figure 1. Empirical power of the interaction term and false positive rates of the main effect in all four simulation settings.
Figure 2. Empirical power of both the interaction term and the main effect in all four simulation settings.
Figure 3. False positive rates of both the interaction term and the main effect in all four simulation settings.
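Empirical power and empirical false positive rate are both rejection rates over repeated simulations: the share of simulated datasets in which a test gives a p-value below the significance level. The sketch below shows that calculation. The 0.05 level and the p-values are assumptions made for illustration, not results from the article.

```python
def rejection_rate(p_values, alpha=0.05):
    """Share of tests with p < alpha.

    This is empirical power when the simulated effect is non-zero and the
    empirical false positive rate when the simulated effect is zero.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical p-values from repeated simulations, for illustration only.
p_under_alternative = [0.001, 0.03, 0.20, 0.04, 0.008]
p_under_null = [0.60, 0.03, 0.45, 0.91, 0.12]

print(rejection_rate(p_under_alternative))  # → 0.8
print(rejection_rate(p_under_null))         # → 0.2
```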
Proportion of taxa detected as significant by LMMs and NBMMs under four models.
| Model 1 | Test of β1 | LMMs | 0.034483 |
| Model 1 | Test of β1 | NBMMs | 0.068966 |
| Model 2 | Test of β1 | LMMs | 0.034483 |
| Model 2 | Test of β1 | NBMMs | 0.12069 |
| Model 3 | Test of β1 | LMMs | 0.12069 |
| Model 3 | Test of β1 | NBMMs | 0.224138 |
| Model 3 | Test of β3 | LMMs | 0.137931 |
| Model 3 | Test of β3 | NBMMs | 0.275862 |
| Model 4 | Test of β1 | LMMs | 0.137931 |
| Model 4 | Test of β1 | NBMMs | 0.206897 |
| Model 4 | Test of β3 | LMMs | 0.137931 |
| Model 4 | Test of β3 | NBMMs | 0.293103 |
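The rates in this table are easier to read as counts. The source does not state how many taxa were tested, so the sketch below infers a candidate denominator by searching for the smallest number of taxa at which every rounded rate corresponds to a whole number of taxa. The result is an inference from the rounded values, not a figure given in the article, and any multiple of it would fit equally well.

```python
# Rates copied from the table, keyed by (model, tested effect, method).
rates = {
    ("Model 1", "beta1", "LMMs"): 0.034483, ("Model 1", "beta1", "NBMMs"): 0.068966,
    ("Model 2", "beta1", "LMMs"): 0.034483, ("Model 2", "beta1", "NBMMs"): 0.12069,
    ("Model 3", "beta1", "LMMs"): 0.12069, ("Model 3", "beta1", "NBMMs"): 0.224138,
    ("Model 3", "beta3", "LMMs"): 0.137931, ("Model 3", "beta3", "NBMMs"): 0.275862,
    ("Model 4", "beta1", "LMMs"): 0.137931, ("Model 4", "beta1", "NBMMs"): 0.206897,
    ("Model 4", "beta3", "LMMs"): 0.137931, ("Model 4", "beta3", "NBMMs"): 0.293103,
}


def smallest_common_denominator(values, max_n=1000, decimals=6):
    """Smallest n for which every value * n is an integer up to rounding error."""
    for n in range(1, max_n + 1):
        tol = n * 10 ** -decimals  # rounding error of each rate, scaled by n
        if all(abs(v * n - round(v * n)) <= tol for v in values):
            return n
    return None


n_taxa = smallest_common_denominator(rates.values())
counts = {key: round(rate * n_taxa) for key, rate in rates.items()}

print(n_taxa)  # → 58
for key, count in counts.items():
    print(key, f"{count}/{n_taxa}")
```

Under that inferred denominator, the NBMM rows correspond to 4, 7, 13, 16, 12 and 17 taxa, against 2, 2, 7, 8, 8 and 8 for the matching LMM rows.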
Figure 4. Analyses with NBMMs and LMMs: minus log-transformed p-values for the taxa that are significantly differentially abundant between the term and preterm groups at the 5% significance threshold, at the species level. The left panel shows the values for the group main effect, and the right panel shows the values for the group-by-time interaction.
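The transformation used on the figure's axis can be sketched as follows. The caption does not give the base of the logarithm, so base 10 is an assumption here, and the taxon names and p-values are hypothetical.

```python
import math


def minus_log10(p):
    """Minus log10 of a p-value, so smaller p-values map to larger scores."""
    return -math.log10(p)


# The 5% significance threshold on the transformed scale (about 1.301).
threshold = minus_log10(0.05)

# Hypothetical p-values for three taxa, for illustration only.
p_values = {"taxon_a": 0.001, "taxon_b": 0.04, "taxon_c": 0.30}

# Keep only the taxa that pass the 5% threshold, as the figure does.
significant = {
    name: round(minus_log10(p), 3) for name, p in p_values.items() if p < 0.05
}
print(round(threshold, 3), significant)
```

On this scale a taxon is significant at the 5% level when its transformed value exceeds the threshold.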