Caizhi Huang, Benjamin J Callahan, Michael C Wu, Shannon T Holloway, Hayden Brochu, Wenbin Lu, Xinxia Peng, Jung-Ying Tzeng.
Abstract
BACKGROUND: The relationship between host conditions and microbiome profiles, typically characterized by operational taxonomic units (OTUs), contains important information about the microbial role in human health. Traditional association testing frameworks are challenged by the high dimensionality and sparsity of typical microbiome profiles. Phylogenetic information is often incorporated to address these challenges with the assumption that evolutionarily similar taxa tend to behave similarly. However, this assumption may not always be valid due to the complex effects of microbes, and phylogenetic information should be incorporated in a data-supervised fashion.
Keywords: Association test; Kernel machine regression; Phylogenetic tree
Year: 2022 PMID: 35668471 PMCID: PMC9171974 DOI: 10.1186/s40168-022-01266-3
Source DB: PubMed Journal: Microbiome ISSN: 2049-2618 Impact factor: 16.837
Fig. 1 Overview of phylogeny-guided microbiome OTU-specific association test (POST)
Fig. 2 A Phylogenetic relationship between the target OTU (OTU3438) and its neighboring OTUs. B OTU phylogenetic correlation r under different values of the tuning parameter c, illustrated by setting m = OTU3438 as the target OTU. The x-axis is the OTUs shown in the same order as the tree tips. The y-axis is r, the correlation between OTU3438 and the remaining OTUs
Fig. 3 Illustration of the five causal OTU scenarios considered in the simulation. Scenarios 1 to 3 consider larger “causal hubs,” each containing about 7–10 causal OTUs; scenario 4 considers smaller causal hubs of 2–3 causal OTUs; scenario 5 considers causal OTUs with random positions in the phylogenetic tree. Red (blue) circles indicate that causal effect size tends to be positive (negative)
Type I error rates at the significance levels of 0.05, 0.01, and 0.001 for POST
| Simulation | Outcome | ||||
|---|---|---|---|---|---|
| A | Binary | 0.047 | 0.007 | 0.0006 | 0.00005 |
| B | Continuous | 0.051 | 0.010 | 0.0010 | 0.00008 |
| B | Binary | 0.047 | 0.008 | 0.0006 | 0.00007 |
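Entries like those in the table above are usually obtained as the fraction of null-simulation p-values that fall at or below each significance level. The following is a minimal sketch of that calculation, not the authors' code: the function name `empirical_type1_error` and the simulated uniform p-values are assumptions introduced only for illustration.

```python
import random


def empirical_type1_error(p_values, alpha):
    """Fraction of null p-values at or below the significance level alpha."""
    if not p_values:
        raise ValueError("p_values must be non-empty")
    return sum(p <= alpha for p in p_values) / len(p_values)


# Hypothetical stand-in for p-values from null simulations (assumption):
# a well-calibrated test yields Uniform(0, 1) p-values under the null.
rng = random.Random(0)
null_p = [rng.random() for _ in range(200_000)]

for alpha in (0.05, 0.01, 0.001):
    print(alpha, empirical_type1_error(null_p, alpha))
```

A well-calibrated test gives rates close to each nominal level; rates clearly above the level indicate inflation, and rates well below it indicate a conservative test.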
AUC of different methods in simulations A and B. Methods considered include POST, TreeFDR (TF), Single-OTU test (SO) implemented by POST with c=0, DESeq2 (DE), ANCOM-BC (AB) and LinDA (LD), Wilcoxon rank-sum test with proportional data (WR-P) or CLR transformed data (WR-R) for binary outcomes, and Spearman correlation test with proportional data (SC-P) or CLR transformed data (SC-R) for continuous outcomes. The outcome values are simulated assuming no covariate effects and 5 different causal OTU scenarios. Scenarios 1 to 3 consider larger “causal hubs,” each containing about 7–10 causal OTUs; scenario 4 considers smaller causal hubs of 2–3 causal OTUs; scenario 5 considers causal OTUs with random positions in the phylogenetic tree. Methods with the highest AUC are shown in bold
| Simulation | Simulation A | Simulation B | ||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Outcome | Binary outcome | Continuous outcome | Binary outcome | |||||||||||||||||||||
| Method | POST | TF | SO | DE | AB | LD | WR-P | WR-R | POST | TF | SO | DE | AB | LD | SC-P | SC-R | POST | TF | SO | DE | AB | LD | WR-P | WR-R |
| Large effect size* | ||||||||||||||||||||||||
| Scenario1 | 0.78 | 0.62 | 0.69 | 0.54 | 0.56 | 0.51 | 0.51 | 0.52 | 0.58 | 0.61 | 0.57 | 0.61 | 0.57 | 0.61 | 0.58 | 0.53 | 0.55 | 0.52 | 0.57 | 0.54 | 0.55 | |||
| Scenario2 | 0.77 | 0.71 | 0.71 | 0.70 | 0.67 | 0.62 | 0.58 | 0.50 | 0.62 | 0.65 | 0.62 | 0.62 | 0.57 | 0.51 | 0.60 | 0.48 | 0.56 | 0.57 | 0.56 | 0.52 | 0.51 | |||
| Scenario3 | 0.78 | 0.72 | 0.70 | 0.64 | 0.64 | 0.60 | 0.60 | 0.49 | 0.62 | 0.63 | 0.61 | 0.62 | 0.58 | 0.51 | 0.50 | 0.59 | 0.55 | 0.58 | 0.58 | 0.57 | 0.51 | |||
| Scenario4 | 0.75 | 0.70 | 0.69 | 0.70 | 0.67 | 0.62 | 0.58 | 0.48 | 0.61 | 0.58 | 0.61 | 0.58 | 0.52 | 0.49 | 0.54 | 0.55 | 0.55 | 0.55 | 0.51 | |||||
| Scenario5 | 0.66 | 0.67 | 0.64 | 0.65 | 0.63 | 0.57 | 0.55 | 0.59 | 0.48 | 0.60 | 0.59 | 0.60 | 0.54 | 0.51 | 0.54 | 0.50 | 0.54 | 0.52 | 0.55 | 0.53 | 0.50 | |||
| Small effect size* | ||||||||||||||||||||||||
| Scenario1 | 0.65 | 0.59 | 0.62 | 0.57 | 0.57 | 0.53 | 0.51 | 0.48 | 0.56 | 0.58 | 0.56 | 0.59 | 0.55 | 0.56 | 0.56 | 0.49 | 0.53 | 0.52 | 0.54 | 0.52 | 0.52 | |||
| Scenario2 | 0.65 | 0.61 | 0.62 | 0.63 | 0.62 | 0.58 | 0.55 | 0.49 | 0.59 | 0.61 | 0.59 | 0.59 | 0.54 | 0.51 | 0.54 | 0.49 | 0.53 | 0.55 | 0.53 | 0.51 | 0.50 | |||
| Scenario3 | 0.65 | 0.63 | 0.62 | 0.62 | 0.60 | 0.57 | 0.55 | 0.49 | 0.60 | 0.60 | 0.58 | 0.60 | 0.56 | 0.50 | 0.50 | 0.56 | 0.55 | 0.55 | 0.56 | 0.54 | 0.50 | |||
| Scenario4 | 0.65 | 0.62 | 0.62 | 0.64 | 0.62 | 0.58 | 0.55 | 0.48 | 0.58 | 0.57 | 0.58 | 0.56 | 0.51 | 0.50 | 0.53 | 0.53 | 0.52 | 0.53 | 0.51 | |||||
| Scenario5 | 0.59 | 0.58 | 0.59 | 0.58 | 0.58 | 0.54 | 0.52 | 0.56 | 0.50 | 0.56 | 0.56 | 0.53 | 0.50 | 0.50 | 0.52 | 0.52 | 0.52 | 0.50 | ||||||
*For simulation A, small OTU effect size is from N(±1,1) and large OTU effect size is from N(±2,1). For simulation B, small OTU effect size is simulated from N(±0.2, 0.04) and N(±0.3,0.06) for continuous and binary outcomes, respectively; large OTU effect size is simulated from N(±0.5, 0.1) and N(±1,0.2) for continuous and binary outcomes, respectively
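For readers unfamiliar with the AUC values reported above, the sketch below shows one common way to compute such a number: score each OTU by its evidence of association, then estimate the probability that a causal OTU outranks a non-causal one (the Mann-Whitney form of the AUC). The function name, the use of negated p-values as scores, and the toy data are assumptions for illustration; the source does not describe the authors' exact procedure.

```python
def auc_from_scores(scores, is_causal):
    """AUC as P(causal score > non-causal score), with ties counted as 1/2."""
    pos = [s for s, y in zip(scores, is_causal) if y]
    neg = [s for s, y in zip(scores, is_causal) if not y]
    if not pos or not neg:
        raise ValueError("need at least one causal and one non-causal OTU")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical per-OTU p-values and causal labels (illustrative only).
p_values = [0.001, 0.20, 0.03, 0.50, 0.04, 0.90]
is_causal = [True, False, False, False, True, False]

# Smaller p-values mean stronger evidence, so negate them to get scores.
scores = [-p for p in p_values]
auc = auc_from_scores(scores, is_causal)
print(auc)
```

An AUC of 0.5 corresponds to random ranking of causal and non-causal OTUs, which is why values near 0.50 in the table indicate little ability to detect the causal OTUs.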
Fig. 4 ROC curves of simulation A with large OTU effect size (top row) and small OTU effect size (bottom row) for POST, Single-OTU test (SO), TreeFDR (TF), DESeq2 (DE), ANCOM-BC (AB), LinDA (LD), and Wilcoxon rank-sum test (WR) under the 5 causal scenarios. Scenarios 1 to 3 consider larger “causal hubs,” each containing about 7–10 causal OTUs; scenario 4 considers smaller causal hubs of 2–3 causal OTUs; scenario 5 considers causal OTUs with random positions in the phylogenetic tree
Fig. 5 Upset plot of detected OTUs at FDR level of 0.05 and phylogenetic trees of the analyzed OTUs with detected OTUs for the bacterial vaginosis study. SO, single-OTU test implemented by POST with c = 0; TF, TreeFDR; DE, DESeq2; AB, ANCOM-BC; LD, LinDA; WR-P, Wilcoxon rank-sum test using proportional data; WR-R, Wilcoxon rank-sum test using CLR transformed data
OTUs significantly associated with bacterial vaginosis (BV) at FDR level of 0.05. TF, TreeFDR; SO, Single-OTU test implemented by POST with c = 0; DE, DESeq2; AB, ANCOM-BC; LD, LinDA; WR-P, Wilcoxon rank-sum test using proportional data; WR-R, Wilcoxon rank-sum test using the CLR transformed data
| OTU | POST (FDR-adjusted) | TF (FDR-adjusted) | SO (FDR-adjusted) | DE (FDR-adjusted) | AB (FDR-adjusted) | LD (FDR-adjusted) | WR-P* (FDR-adjusted) | WR-R* (FDR-adjusted) | Detected method | Genus/species | Direction** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| OTU3 | 0.039 | 0.089 | 0.045 | 0.002 | 0.006 | 0.041 | 0.178 | 0.025 | POST/SO/DE/AB/LD/WR-R | | − |
| OTU90 | 0.039 | 0.245 | 0.953 | 0.950 | 0.983 | 0.961 | 0.375 | 0.989 | POST | | + |
| OTU7 | 0.039 | 0.653 | 0.881 | 0.147 | 0.766 | 0.729 | 0.760 | 0.932 | POST | | − |
| OTU82 | 0.039 | 0.517 | 0.985 | 0.970 | 0.963 | 0.976 | 0.258 | 0.989 | POST | | − |
| OTU66 | 0.039 | 0.544 | 0.701 | 0.027 | 0.379 | 0.615 | 0.378 | 0.932 | POST/DE | | − |
| OTU2 | 0.039 | 0.180 | 0.917 | 0.876 | 0.766 | 0.923 | 0.235 | 0.989 | POST | | + |
| OTU58 | 0.039 | 0.839 | 0.917 | 0.876 | 0.871 | 0.923 | 0.791 | 0.989 | POST | | + |
| OTU112 | 0.411 | 0.046 | 0.701 | 0.422 | 0.379 | 0.678 | 0.040 | 0.932 | TF/WR-P | | + |
| OTU85 | 0.470 | 0.046 | 0.906 | 0.372 | 0.766 | 0.870 | 0.097 | 0.989 | TF | | + |
| OTU11 | 0.918 | 0.286 | 0.943 | 0.000 | 0.869 | 0.923 | 0.220 | 0.989 | DE | | + |
| OTU12 | 0.391 | 0.089 | 0.701 | 0.013 | 0.338 | 0.615 | 0.040 | 0.942 | DE/WR-P | | + |
| OTU16 | 0.680 | 0.155 | 0.881 | 0.001 | 0.766 | 0.918 | 0.097 | 0.989 | DE | | + |
| OTU91 | 0.680 | 0.092 | 0.881 | 0.505 | 0.766 | 0.852 | 0.040 | 0.989 | WR-P | | + |
*WR-P and WR-R did not adjust for race
**+ (and −) indicates that the OTU is positively (and negatively) associated with BV risk from a logistic regression
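The table above reports FDR-adjusted values and flags OTUs at an FDR level of 0.05. The source does not state which adjustment was used; the sketch below assumes the widely used Benjamini-Hochberg step-up procedure purely for illustration, with a hypothetical function name `bh_adjust` and made-up raw p-values.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed procedure, see lead-in)."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five OTUs (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = bh_adjust(raw)

# OTUs declared significant at an FDR level of 0.05.
detected = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj, detected)
```

Because the step-up rule takes a running minimum, several OTUs can share the same adjusted value, which is one plausible reason for repeated entries within a column.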
Fig. 6 Upset plot of detected OTUs at FDR level of 0.05 and phylogenetic trees of the analyzed OTUs with detected OTUs for the preterm birth study. SO, Single-OTU test implemented by POST with c = 0; TF, TreeFDR; DE, DESeq2; AB, ANCOM-BC; LD, LinDA; WR-P, Wilcoxon rank-sum test using proportional data; WR-R, Wilcoxon rank-sum test using the CLR transformed data