Wei Dai, Cai Li, Ting Li, Jianchang Hu, Heping Zhang.
Abstract
BACKGROUND: Microbial communities in the human body, also known as the human microbiota, affect human health and have been linked to diseases such as colorectal cancer (CRC). However, the different roles that microbial communities play in healthy and diseased hosts remain largely unknown. Microbial communities are typically recorded through the taxa counts of operational taxonomic units (OTUs). The sparsity of these counts and the high correlations among OTUs pose major challenges for understanding the microbiota-disease relation. Furthermore, the taxa data are structured in the sense that OTUs are related evolutionarily by a hierarchical structure.
Keywords: Colorectal cancer; Microbiome joint effects; Microbiota-disease association studies; Super-Taxon
Year: 2022 PMID: 35729515 PMCID: PMC9215102 DOI: 10.1186/s12859-022-04786-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 Method overview. (a) OTUs in a sample are displayed. (b) OTUs are divided into sets/blocks by biological group (Genus, Family, Order, Class). (c) Within each set, a tree-based method is used to obtain an importance measure for each OTU and to rank the OTUs by their marginal contribution to disease status. (d) The number of top OTUs that form a super-taxon is determined empirically. (e) The top OTUs within each set/block are then aggregated into a super-taxon (STB and STC are both considered for OTU presence or abundance).
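The steps in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the importance measure here is the Gini impurity decrease of a single presence/absence split (a one-node stand-in for the tree-based method), the number of top OTUs `k` is fixed by hand rather than determined empirically, and the two aggregation rules (any top OTU present; sum of top-OTU counts) are hypothetical choices for a presence-based and an abundance-based super-taxon. All function and variable names are invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 disease labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def stump_importance(counts, y):
    """Impurity decrease from splitting samples on OTU presence (count > 0).

    Assumption: a single split stands in for a full tree-based importance.
    """
    present = [yi for c, yi in zip(counts, y) if c > 0]
    absent = [yi for c, yi in zip(counts, y) if c == 0]
    child = (len(present) * gini(present) + len(absent) * gini(absent)) / len(y)
    return gini(y) - child


def build_super_taxon(block, y, k):
    """Rank the OTUs of one block and aggregate the top k.

    block: dict mapping OTU name -> list of per-sample counts.
    Returns (top OTU names, presence-based aggregate, abundance-based aggregate).
    """
    ranked = sorted(block, key=lambda otu: stump_importance(block[otu], y),
                    reverse=True)
    top = ranked[:k]
    n = len(y)
    # Hypothetical aggregation rules: "any top OTU present" and "sum of counts".
    presence = [int(any(block[otu][i] > 0 for otu in top)) for i in range(n)]
    abundance = [sum(block[otu][i] for otu in top) for i in range(n)]
    return top, presence, abundance


# Toy block with three OTUs measured on six samples (1 = case, 0 = control).
y = [1, 1, 1, 0, 0, 0]
block = {
    "otu_a": [5, 3, 0, 0, 0, 0],
    "otu_b": [0, 0, 2, 0, 0, 0],
    "otu_c": [1, 0, 1, 1, 0, 1],
}
top, presence, abundance = build_super_taxon(block, y, k=2)
print(top, presence, abundance)
```

In the toy data, `otu_c` is equally common in cases and controls, so it ranks last and is left out of the aggregate.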
Block-level identification rate over 500 replications for four true blocks under three graph structures
| Method | Graph type | Block 1 | Block 2 | Block 3 | Block 4 | Average (SD) |
|---|---|---|---|---|---|---|
| STB | Random | 0.8289 | 0.9918 | 1 | 0.8598 | 0.9201 (0.0885) |
| STC | Random | 0.9401 | 1 | 0.6736 | 1 | 0.9034 (0.1558) |
| ZIBR | Random | 0.8223 | 0.8595 | 0.9814 | 0.9835 | 0.9117 (0.0831) |
| ANCOM-BC | Random | 1 | 1 | 1 | 1 | 1 (0) |
| Hu et al. (2020) | Random | 0.0020 | 0.4633 | 0.1388 | 0 | 0.1510 (0.2181) |
| STB | Hub | 0.7478 | 0.7917 | 1 | 0.9364 | 0.8690 (0.1188) |
| STC | Hub | 0.9670 | 1 | 1 | 1 | 0.9917 (0.0165) |
| ZIBR | Hub | 0.6865 | 0.7726 | 1 | 1 | 0.8648 (0.1601) |
| ANCOM-BC | Hub | 1 | 1 | 1 | 1 | 1 (0) |
| Hu et al. (2020) | Hub | 0.0082 | 0.2238 | 0.1458 | 0 | 0.0944 (0.1091) |
| STB | Cluster | 0.7868 | 1 | 1 | 1 | 0.9467 (0.1066) |
| STC | Cluster | 0.9510 | 1 | 0.9787 | 0.7975 | 0.9318 (0.0918) |
| ZIBR | Cluster | 0.7361 | 0.7961 | 0.7682 | 0.9506 | 0.8128 (0.0951) |
| ANCOM-BC | Cluster | 1 | 1 | 1 | 1 | 1 (0) |
| Hu et al. (2020) | Cluster | 0.0040 | 0.3300 | 0.1360 | 0 | 0.1175 (0.1551) |
Graph type is set to reflect the correlations among OTUs. Random graph structure indicates that OTUs are correlated with each other randomly. The hub and cluster graphs capture some aspects of biological networks, such as highly connected nodes and community structure. More details can be found in Osborne et al. [20].
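As a reading aid for the "Average (SD)" column, the short check below recomputes it for the STB row under the random graph. It assumes, as the numbers suggest but the table does not state, that the SD is the sample standard deviation (denominator n - 1) taken across the four block-level rates rather than across replications.

```python
import statistics

# Block-level identification rates for STB under the random graph (table above).
stb_random = [0.8289, 0.9918, 1.0, 0.8598]

avg = statistics.mean(stb_random)
sd = statistics.stdev(stb_random)  # sample SD, n - 1 in the denominator

print(f"{avg:.4f} ({sd:.4f})")
```

Under this assumption the result agrees with the tabulated 0.9201 (0.0885); the population SD (denominator n) would be noticeably smaller.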
Average of sensitivity, specificity, precision and standard deviations for block-level identification over 500 replications
| Method | Graph type | Sensitivity (SD) | Specificity (SD) | Precision (SD) |
|---|---|---|---|---|
| STB | Random | 0.9201 (0.1178) | 0.9996 (0.0049) | 0.9988 (0.0157) |
| STC | Random | 0.9034 (0.1219) | 1 (0) | 1 (0) |
| ZIBR | Random | 0.9117 (0.1228) | 1 (0) | 1 (0) |
| ANCOM-BC | Random | 1 (0) | 0 (0) | 0.2 (0) |
| Hu et al. | Random | 0.1510 (0.1255) | 0.9719 (0.0326) | 0.5806 (0.4815) |
| STB | Hub | 0.8690 (0.1250) | 1 (0) | 1 (0) |
| STC | Hub | 0.9917 (0.0447) | 0.9869 (0.0255) | 0.9581 (0.0814) |
| ZIBR | Hub | 0.8648 (0.1404) | 1 (0) | 1 (0) |
| ANCOM-BC | Hub | 1 (0) | 0 (0) | 0.2 (0) |
| Hu et al. | Hub | 0.0945 (0.1213) | 0.9588 (0.0307) | 0.3634 (0.4739) |
| STB | Cluster | 0.9467 (0.1025) | 0.9975 (0.0123) | 0.9919 (0.0395) |
| STC | Cluster | 0.9318 (0.1115) | 1 (0) | 1 (0) |
| ZIBR | Cluster | 0.8128 (0.1340) | 1 (0) | 1 (0) |
| ANCOM-BC | Cluster | 1 (0) | 0 (0) | 0.2 (0) |
| Hu et al. | Cluster | 0.1175 (0.1259) | 0.9655 (0.0323) | 0.4630 (0.4966) |
Different graph types are set to reflect various correlation structures among OTUs. Random graph structure indicates that OTUs are correlated with each other randomly. The hub and cluster graphs capture some aspects of biological networks, such as highly connected nodes and community structure. More details can be found in Osborne et al. [20]
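The sketch below shows how sensitivity, specificity and precision follow from a set of selected blocks and a set of true blocks. The total of 20 blocks is a hypothetical value, not taken from the source; it is chosen only because, with four true blocks, selecting every block then gives precision 4/20 = 0.2, the pattern seen in the ANCOM-BC rows. The function name and block labels are likewise illustrative.

```python
def block_metrics(selected, truth, n_blocks):
    """Sensitivity, specificity and precision for block-level selection.

    selected, truth: sets of block indices; n_blocks: total number of blocks.
    """
    tp = len(selected & truth)   # true blocks that were selected
    fp = len(selected - truth)   # null blocks that were selected
    fn = len(truth - selected)   # true blocks that were missed
    tn = n_blocks - tp - fp - fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, specificity, precision


N_BLOCKS = 20                    # hypothetical total, see the note above
truth = {1, 2, 3, 4}             # four true blocks

# A method that flags every block: perfect sensitivity, zero specificity.
select_all = set(range(1, N_BLOCKS + 1))
print(block_metrics(select_all, truth, N_BLOCKS))

# A conservative method that finds three true blocks and nothing else.
print(block_metrics({1, 2, 4}, truth, N_BLOCKS))
```

The contrast illustrates why sensitivity alone is not enough to compare methods: a rule that selects everything has sensitivity 1 but carries no information about which blocks matter.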
Average of sensitivity, specificity, precision and standard deviations for OTU-level identification over 500 replications
| Method | Graph type | Sensitivity (SD) | Specificity (SD) | Precision (SD) |
|---|---|---|---|---|
| STB | Random | 0.2815 (0.0609) | 0.9979 (0.0015) | 0.9073 (0.0438) |
| STC | Random | 0.1377 (0.0724) | 0.9993 (0.0011) | 0.9528 (0.0637) |
| ZIBR | Random | 0.1463 (0.0486) | 1 (0) | 0.5083 (0.2771) |
| ANCOM-BC | Random | 0.7225 (0.0152) | 0.6084 (0.0078) | 0.1702 (0.0048) |
| Hu et al. | Random | 0.0047 (0.0005) | 0.9993 (0.0006) | 0.4520 (0.4915) |
| STB | Hub | 0.3067 (0.0425) | 0.9986 (0.0009) | 0.9218 (0.0557) |
| STC | Hub | 0.1183 (0.0310) | 0.9991 (0.0017) | 0.9198 (0.1473) |
| ZIBR | Hub | 0.1439 (0.0910) | 0.9999 (0.0012) | 0.4827 (0.2760) |
| ANCOM-BC | Hub | 0.7927 (0.0154) | 0.5204 (0.0168) | 0.1465 (0.0032) |
| Hu et al. | Hub | 0.0025 (0.0046) | 0.9991 (0.0005) | 0.2194 (0.4068) |
| STB | Cluster | 0.3255 (0.0513) | 0.9974 (0.0018) | 0.9065 (0.0533) |
| STC | Cluster | 0.1339 (0.0425) | 0.9998 (0.0004) | 0.9851 (0.0354) |
| ZIBR | Cluster | 0.1160 (0.0811) | 0.9993 (0.0033) | 0.3760 (0.2261) |
| ANCOM-BC | Cluster | 0.8107 (0.0155) | 0.5789 (0.0228) | 0.1749 (0.0073) |
| Hu et al. | Cluster | 0.0034 (0.0048) | 0.9992 (0.0005) | 0.3280 (0.4667) |
Different graph types are set to reflect various correlation structures among OTUs. Random graph structure indicates that OTUs are correlated with each other randomly. The hub and cluster graphs capture some aspects of biological networks, such as highly connected nodes and community structure. More details can be found in Osborne et al. [20]
Fig. 2 Average block-level identification rate of first four blocks between STB and STC
Real data results for STB
(a)

| Super-taxon | Discovery: odds ratio (95% CI) | Discovery: P-value | Verification: odds ratio (95% CI) | Verification: P-value |
|---|---|---|---|---|
| Block 107 | 8.4851 (4.3838, 16.4234) | 2.21e-10 | 4.2538 (2.0093, 9.0057) | 1.55e-04 |
| Block 84 | 5.5873 (3.2172, 9.7034) | 1e-09 | 1.7073 (0.9901, 2.9441) | 0.0543 |
Marginal effects of the two super-taxa on Baxter’s dataset (discovery set) and Zeller’s dataset (verification set) are displayed in Table 4. The selected OTUs and their mapping to species are in Table 4.
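The odds ratios, intervals and P-values in Table 4 can be cross-checked against one another. The sketch below assumes, without confirmation from the source, that each 95% CI is a symmetric Wald interval on the log-odds scale and that the P-value comes from the corresponding two-sided z-test. The helper name `wald_from_ci` is made up for this example.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover the log-odds standard error, z statistic and two-sided p-value.

    Assumes a symmetric Wald interval on the log-odds scale:
    log(OR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided standard normal tail
    return se, z, p


# Block 107, discovery set (values from Table 4).
se, z, p = wald_from_ci(8.4851, 4.3838, 16.4234)
print(f"SE = {se:.4f}, z = {z:.3f}, p = {p:.2e}")
```

For this row the recovered P-value is on the order of 2e-10, in line with the reported 2.21e-10, which suggests the Wald assumption is reasonable. The same check on the Block 84 verification row, whose interval contains 1, gives a P-value just above 0.05.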
Real data results for STC
(a)

| Super-taxon | Discovery: odds ratio (95% CI) | Discovery: P-value | Verification: odds ratio (95% CI) | Verification: P-value |
|---|---|---|---|---|
| Block 107 | 9.2060 (4.4911, 18.8705) | 1.35e-09 | 9.0996 (3.4478, 24.0160) | 8.21e-06 |
| Block 84 | 5.5723 (3.1147, 9.9690) | 7.11e-09 | 3.4140 (1.3696, 8.5101) | 8.42e-03 |
Marginal effects of the two super-taxa on Baxter’s dataset (discovery set) and Zeller’s dataset (verification set) are displayed in Table 5. The selected OTUs and their mapping to species are in Table 5.