Xinyan Zhang1, Boyi Guo2, Nengjun Yi2.
Abstract
MOTIVATION: The human microbiome is variable and dynamic in nature. Longitudinal studies could explain the mechanisms that maintain the microbiome in health or cause dysbiosis in disease. However, it remains challenging to properly analyze longitudinal microbiome data from either 16S rRNA or metagenome shotgun sequencing studies, which are output as proportions or counts. Most microbiome data are sparse, requiring statistical models that handle zero-inflation. Moreover, the longitudinal design induces correlation among the samples and thus further complicates the analysis and interpretation of the microbiome data.
Year: 2020 PMID: 33166356 PMCID: PMC7652264 DOI: 10.1371/journal.pone.0242073
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
False positive rate and power for testing H0: β1 = 0 based on ZIGMMs and ZIBR at a significance level of 0.05 for various sample sizes.
| Sample Size | False Positive Rate: ZIGMMs (arcsine)† | False Positive Rate: ZIBR‡ | Power (Low Effect Setting): ZIGMMs (arcsine) | Power (Low Effect Setting): ZIBR | Power (High Effect Setting): ZIGMMs (arcsine) | Power (High Effect Setting): ZIBR |
|---|---|---|---|---|---|---|
| n = 50 | 0.0681 | 0.0577 | 0.1937 | 0.1438 | 0.4100 | 0.3022 |
| n = 100 | 0.0554 | 0.0578 | 0.3025 | 0.2218 | 0.6592 | 0.5135 |
| n = 150 | 0.0563 | 0.0533 | 0.4308 | 0.3031 | 0.8296 | 0.6906 |
ZIBR‡: Zero-inflated beta mixed model.
ZIGMMs(arcsine)†: Zero-inflated Gaussian mixed models with arcsine transformation.
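The false positive rates and powers in the table above are most naturally read as empirical rejection proportions: the fraction of simulated data sets in which the test of H0: β1 = 0 returns a p-value below 0.05. A minimal sketch of that calculation follows, assuming the per-replicate p-values are already available. The function name `rejection_rate` and the example p-values are illustrative and are not taken from the paper.

```python
def rejection_rate(p_values, alpha=0.05):
    """Share of replicates whose p-value falls below alpha.

    When the null hypothesis is true this estimates the false positive
    rate; when a real effect was simulated it estimates the power.
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p < alpha for p in p_values) / len(p_values)


# Made-up p-values from four hypothetical simulation replicates.
example_p_values = [0.01, 0.20, 0.04, 0.50]
print(rejection_rate(example_p_values))
```

The same function serves both columns of the table: only the data-generating setting (null versus non-null effect) changes.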
Parameter ranges in simulation studies.
| Parameter | Range |
|---|---|
| log( | Unif(0.1, 3.5) |
| dispersion parameter | Unif(0.1, 5) |
| Fixed effects | 0 |
| Fixed effects | Unif(0.2, 0.3) |
| | Unif(0.3, 0.4) |
| Fixed effect | Unif(0.2, 0.3) |
| | Unif(0.3, 0.4) |
| standard deviation | Unif(0.5, 1) |
| correlation | Unif(0.1, 0.5) |
| standard deviation | Unif(0.1, 0.5) |
| Overall zero-inflation proportion | Unif(0.0, 0.2) |
| | Unif(0.2, 0.4) |
| | Unif(0.4, 0.6) |
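The ranges above suggest that simulation parameters were drawn from uniform distributions. The sketch below shows one way to sample such a configuration. The dictionary keys are hypothetical labels chosen for illustration, because the parameter symbols in the table did not survive extraction, and only rows with an unambiguous label are included.

```python
import random

# Ranges copied from the table; the keys are hypothetical labels,
# not the paper's notation.
PARAMETER_RANGES = {
    "dispersion": (0.1, 5.0),
    "correlation": (0.1, 0.5),
    "zero_inflation_low": (0.0, 0.2),
    "zero_inflation_mid": (0.2, 0.4),
    "zero_inflation_high": (0.4, 0.6),
}


def draw_parameters(ranges, seed=None):
    """Draw one value per parameter from Unif(low, high)."""
    rng = random.Random(seed)  # local generator keeps draws reproducible
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


params = draw_parameters(PARAMETER_RANGES, seed=1)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

Passing a seed makes a given configuration reproducible, which is useful when the same parameter draw must be reused across competing models.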
Fig 1. Empirical powers in four simulation settings under the low effect scenario.
Fig 2. False positive rates in all four simulation settings.
False positive rate and power for testing H0: β1 = 0 and H0: β3 = 0 in setting E at a significance level of 0.05 for various sample sizes.
| Sample Size | LMMs§ | NBMMs¶ | ZIGMMs(log)! | LMMs§ | NBMMs¶ | ZIGMMs(log)! |
|---|---|---|---|---|---|---|
| False Positive Rate | | | | | | |
| n = 50 | 0.045 | 0.053 | 0.065 | 0.045 | 0.064 | 0.084 |
| n = 100 | 0.050 | 0.061 | 0.067 | 0.054 | 0.072 | 0.082 |
| n = 150 | 0.047 | 0.061 | 0.071 | 0.050 | 0.068 | 0.082 |
| Power (Low Effect Setting) | | | | | | |
| n = 50 | 0.082 | 0.158 | 0.187 | 0.172 | 0.251 | 0.334 |
| n = 100 | 0.148 | 0.265 | 0.325 | 0.295 | 0.425 | 0.563 |
| n = 150 | 0.204 | 0.360 | 0.439 | 0.405 | 0.562 | 0.720 |
| Power (High Effect Setting) | | | | | | |
| n = 50 | 0.121 | 0.252 | 0.304 | 0.303 | 0.418 | 0.558 |
| n = 100 | 0.224 | 0.439 | 0.522 | 0.507 | 0.654 | 0.815 |
| n = 150 | 0.340 | 0.602 | 0.699 | 0.628 | 0.769 | 0.920 |
LMMs§: Linear mixed models.
NBMMs¶: Negative Binomial mixed models.
ZIGMMs(log)!: Zero-inflated Gaussian mixed models with log transformation.
Proportions of significant taxa detected in four models with LMMs, NBMMs and ZIGMMs.
| Model A | Model B | Model C | Model D | ||||
|---|---|---|---|---|---|---|---|
| Test of | Test of | Test of | Test of | Test of | Test of | Test of | |
| LMMs§ | 0.29 | 0.03 | 0.15 | 0.03 | 0.12 | 0.07 | 0.10 |
| NBMMs¶ | 0.49 | 0.12 | 0.25 | 0.12 | 0.25 | 0.12 | 0.25 |
| ZIGMMs(log)! | 0.63 | 0.34 | 0.24 | 0.39 | 0.27 | 0.36 | 0.24 |
| Model E | Model F | Model G | Model H | ||||
| Test of | Test of | Test of | Test of | Test of | Test of | Test of | |
| ZIGMMs(log) | 0.54 | 0.19 | 0.31 | 0.20 | 0.24 | 0.20 | 0.20 |
LMMs§: Linear mixed models.
NBMMs¶: Negative Binomial mixed models.
ZIGMMs(log)!: Zero-inflated Gaussian mixed models with log transformation.
Fig 3. Analyses with ZIGMMs(log), NBMMs and LMMs: minus log-transformed p-values for the taxa that are significantly differentially abundant at the 5% significance threshold between pregnancy and non-pregnancy groups, for the host factor effect (left panel) and the interaction effect (right panel) from Model C.
Proportions of significant taxa detected in four models with LMMs and ZIGMMs.
| Model A | Model B | Model C | Model D | ||||
|---|---|---|---|---|---|---|---|
| Test of | Test of | Test of | Test of | Test of | Test of | Test of | |
| LMMs§ | 0.11 | 0.13 | 0.12 | 0.11 | 0.11 | 0.10 | 0.06 |
| ZIGMMs (arcsine)† | 0.12 | 0.12 | 0.19 | 0.17 | 0.18 | 0.11 | 0.10 |
| Model E | Model F | Model G | Model H | ||||
| Test of | Test of | Test of | Test of | Test of | Test of | Test of | |
| ZIGMMs (arcsine) | 0.15 | 0.14 | 0.21 | 0.14 | 0.23 | 0.14 | 0.10 |
ZIGMMs(arcsine)†: Zero-inflated Gaussian mixed models with arcsine transformation.
LMMs§: Linear mixed models.
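The footnotes name two transformations applied before fitting ZIGMMs: an arcsine transformation for proportions and a log transformation for counts. The sketch below assumes the commonly used forms arcsin(sqrt(p)) and log(y + 1). The exact forms used in the paper are not stated in this section, so both should be treated as assumptions, and the function names are illustrative.

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine square-root transform for a proportion in [0, 1] (assumed form)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(proportion))


def log_plus_one(count):
    """log(count + 1) transform for a non-negative count (assumed form)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return math.log1p(count)


# Zeros map to zero under both transforms, so excess zeros stay
# identifiable for the zero-inflation part of a model.
print(arcsine_sqrt(0.0), log_plus_one(0))
print(arcsine_sqrt(0.25), log_plus_one(9))
```

Both transforms leave zeros at zero, which is why a separate zero-inflation component is still needed for sparse microbiome data after transformation.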