Kalins Banerjee, Ni Zhao, Arun Srinivasan, Lingzhou Xue, Steven D Hicks, Frank A Middleton, Rongling Wu, Xiang Zhan.
Abstract
Differential abundance analysis is a crucial task in many microbiome studies, where the central goal is to identify microbiome taxa associated with certain biological or clinical conditions. There are two different modes of microbiome differential abundance analysis: the individual-based univariate differential abundance analysis and the group-based multivariate differential abundance analysis. The univariate analysis identifies differentially abundant microbiome taxa subject to multiple-testing correction under a statistical error measure such as the false discovery rate, a task typically complicated by the high dimensionality of taxa and the complex correlation structure among them. The multivariate analysis evaluates the overall shift in microbiome composition between two conditions, which provides useful preliminary information on whether follow-up validation studies are needed. In this paper, we present a novel Adaptive multivariate two-sample test for Microbiome Differential Analysis (AMDA) to examine whether the composition of a taxa-set differs between two conditions. Our simulation studies and real data applications demonstrated that the AMDA test was often more powerful than several competing methods while preserving the correct type I error rate. A free implementation of our AMDA method in R is available at https://github.com/xyz5074/AMDA.
Keywords: adaptive microbiome differential analysis (AMDA); maximum mean discrepancy (MMD); multivariate two-sample test; permutation; subset testing; taxa-set
Year: 2019 PMID: 31068967 PMCID: PMC6491633 DOI: 10.3389/fgene.2019.00350
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
An adaptive two-sample test for microbiome differential abundance analysis
1. Apply the centered log-ratio transformation, Equation (2), to the microbiome composition matrix. Without loss of generality, we still use [...]
2. Use the testing subset selection procedure described in Section 2.4 to select a testing subset [...]
3. For [...]
4. Calculate the final [...]
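The transformation and testing steps above can be loosely sketched in Python. This is not the AMDA implementation: the subset selection step of Section 2.4 is omitted because its details are not reproduced here, and the pseudocount, the Gaussian kernel, the bandwidth `sigma`, and all function names are assumptions made for illustration. The sketch only shows a centered log-ratio transform followed by a generic maximum mean discrepancy (MMD) permutation test, two ingredients named in the keywords.

```python
import math
import random


def clr(sample, pseudocount=0.5):
    """Centered log-ratio transform of one sample (counts or proportions).

    The pseudocount guards against log(0); its value is an assumption.
    """
    logs = [math.log(x + pseudocount) for x in sample]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def gaussian_kernel(u, v, sigma):
    """Gaussian (RBF) kernel between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2(xs, ys, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_k(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


def permutation_pvalue(xs, ys, n_perm=200, sigma=1.0, seed=0):
    """Permutation p-value: share of label shuffles with MMD >= observed."""
    rng = random.Random(seed)
    observed = mmd2(xs, ys, sigma)
    pooled = list(xs) + list(ys)
    n_x = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mmd2(pooled[:n_x], pooled[n_x:], sigma) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Toy usage with made-up counts for three taxa (not data from the paper).
group_a = [clr([10 + i, 20, 70]) for i in range(8)]
group_b = [clr([70, 20, 10 + i]) for i in range(8)]
p_value = permutation_pvalue(group_a, group_b)
```

With `n_perm` permutations the smallest attainable p-value is `1 / (n_perm + 1)`, so the number of permutations should be chosen with the intended significance threshold in mind.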
Empirical type I errors of different tests for microbiome differential abundance analysis under nominal significance level α = 0.05.
| p | n | | | | | | |
|---|---|---|---|---|---|---|---|
| 50 | 50 | 0.0478 | 0.0478 | 0.0506 | 0.0516 | 0.0508 | 0.0436 |
| 50 | 100 | 0.0464 | 0.0458 | 0.0492 | 0.0536 | 0.0540 | 0.0488 |
| 50 | 200 | 0.0504 | 0.0542 | 0.0530 | 0.0534 | 0.0548 | 0.0480 |
| 100 | 50 | 0.0486 | 0.0478 | 0.0490 | 0.0434 | 0.0424 | 0.0532 |
| 100 | 100 | 0.0464 | 0.0494 | 0.0492 | 0.0544 | 0.0542 | 0.0478 |
| 100 | 200 | 0.0524 | 0.0558 | 0.0514 | 0.0440 | 0.0424 | 0.0470 |
| 200 | 50 | 0.0454 | 0.0498 | 0.0492 | 0.0438 | 0.0400 | 0.0490 |
| 200 | 100 | 0.0514 | 0.0476 | 0.0464 | 0.0530 | 0.0516 | 0.0538 |
| 200 | 200 | 0.0464 | 0.0510 | 0.0506 | 0.0542 | 0.0530 | 0.0476 |
| 500 | 50 | 0.0480 | 0.0464 | 0.0504 | 0.0556 | 0.0442 | 0.0474 |
| 500 | 100 | 0.0540 | 0.0544 | 0.0566 | 0.0570 | 0.0498 | 0.0468 |
| 500 | 200 | 0.0556 | 0.0576 | 0.0456 | 0.0490 | 0.0442 | 0.0336 |
Results are averaged over 5,000 replicates.
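As a reading aid for the empirical type I error rates, the sampling noise expected from 5,000 replicates can be estimated with a short calculation. The sketch below assumes independent replicates and a rejection rule of p ≤ α; both are assumptions rather than details taken from the paper, and the function names are hypothetical.

```python
import math


def monte_carlo_se(alpha, n_replicates):
    """Standard error of an empirical rejection rate when the true rate is alpha."""
    return math.sqrt(alpha * (1.0 - alpha) / n_replicates)


def empirical_rate(p_values, alpha=0.05):
    """Fraction of replicates rejected at level alpha (rule p <= alpha assumed)."""
    return sum(p <= alpha for p in p_values) / len(p_values)


se = monte_carlo_se(0.05, 5000)
# Approximate 95% band around the nominal level under a normal approximation.
band = (0.05 - 1.96 * se, 0.05 + 1.96 * se)
```

Under these assumptions the standard error is about 0.0031, so an empirical rate between roughly 0.044 and 0.056 is within ordinary Monte Carlo variation of the nominal 0.05.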
Figure 1. Empirical power of different tests under p = 50 (first row) and p = 100 (second row). The Y-axis represents the power and the X-axis represents the sparsity level (10%, 30%, and 50%).
Figure 2. Empirical power of different tests under p = 200 (first row) and p = 500 (second row). The Y-axis represents the power and the X-axis represents the sparsity level (10%, 30%, and 50%).
Number of significantly differentially abundant taxa-sets at each taxonomic rank detected by different methods under a family-wise error rate of 0.05.
| Comparison | Taxonomic rank (number of tests) | | | | | | |
|---|---|---|---|---|---|---|---|
| ASD vs. DD | Phylum (10) | 3 | 1 | 0 | 0 | 0 | 1 |
| ASD vs. DD | Class (18) | 3 | 1 | 1 | 2 | 2 | 1 |
| ASD vs. DD | Order (34) | 2 | 0 | 0 | 1 | 1 | 2 |
| ASD vs. DD | Family (52) | 1 | 0 | 0 | 0 | 0 | 1 |
| ASD vs. TD | Phylum (10) | 2 | 2 | 2 | 0 | 0 | 1 |
| ASD vs. TD | Class (18) | 4 | 3 | 3 | 2 | 2 | 5 |
| ASD vs. TD | Order (34) | 3 | 2 | 1 | 1 | 1 | 2 |
| ASD vs. TD | Family (52) | 2 | 2 | 1 | 2 | 2 | 2 |
Number in parentheses denotes the total number of tests conducted at that rank.
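To illustrate what a family-wise error rate of 0.05 implies at each rank, the sketch below computes per-test thresholds under a Bonferroni correction. The text here does not state which correction was used, so Bonferroni is an assumption chosen because it is a common way to control the family-wise error rate; the function names are hypothetical, and only the test counts (10, 18, 34, 52) come from the table.

```python
def bonferroni_threshold(fwer, n_tests):
    """Per-test p-value threshold that bounds the family-wise error rate."""
    return fwer / n_tests


def count_significant(p_values, fwer=0.05):
    """Number of tests whose p-value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(fwer, len(p_values))
    return sum(p <= threshold for p in p_values)


# Number of taxa-set tests conducted at each rank, as given in parentheses.
tests_per_rank = {"Phylum": 10, "Class": 18, "Order": 34, "Family": 52}
thresholds = {
    rank: bonferroni_threshold(0.05, m) for rank, m in tests_per_rank.items()
}
```

Under this assumption the per-test threshold shrinks from 0.005 at the phylum level to just under 0.001 at the family level, so a taxa-set needs stronger evidence to be counted at finer ranks.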
Figure 3. P-values of AMDA, MAX, OMiAT, MMD, MiRKAT, and QCAT for family-level differential abundance analysis. The left panel corresponds to the comparison between ASD and DD, and the right panel corresponds to the comparison between ASD and TD.