Sean M Devlin, Axel Martin, Irina Ostrovnaya.
Abstract
In recent literature, the human microbiome has been shown to have a major influence on human health. To investigate this impact, scientists study the composition and abundance of bacterial species, commonly using 16S rRNA gene sequencing, among patients with and without a disease or condition. Methods for such investigations have so far focused on the association between individual bacteria and an outcome. Higher-order pairwise relationships or interactions among bacteria are often avoided because of the substantial increase in dimension and the potential for spurious correlations. However, overlooking such relationships ignores the environment of the microbiome, where there is dynamic cooperation and competition among bacteria. We present a method for identifying and ranking pairs of bacteria that have a differential dichotomized relationship across outcomes. Our approach, implemented in the R package PairSeek, uses the stability selection framework with data-driven dichotomized forms of the pairwise relationships. We illustrate the properties of the proposed method using a published oral cancer data set and a simulation study.
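The combination of dichotomized pair features and stability selection described above can be illustrated with a minimal Python sketch. It is not the PairSeek implementation (PairSeek is an R package). The median split of the log-ratio used to dichotomize each pair, the pseudocount, the top-k univariate screen that stands in for a penalized regression as the base selector, and all function names and toy data are assumptions made for illustration only.

```python
import math
import random
from itertools import combinations
from statistics import median


def dichotomize_pairs(abund, pseudo=1e-6):
    """Map each taxon pair (i, j) to a 0/1 indicator per sample.

    ASSUMPTION: the cutpoint is the pooled median of the log-ratio; this is a
    stand-in for a data-driven cutpoint, not PairSeek's actual rule.
    """
    n_taxa = len(abund[0])
    feats = {}
    for i, j in combinations(range(n_taxa), 2):
        lr = [math.log((row[i] + pseudo) / (row[j] + pseudo)) for row in abund]
        cut = median(lr)
        feats[(i, j)] = [1 if v > cut else 0 for v in lr]
    return feats


def top_k_pairs(feats, y, idx, k):
    """Hypothetical base selector: the k pairs whose indicator rate differs
    most between cases (y == 1) and controls (y == 0) within subsample idx."""
    cases = [s for s in idx if y[s] == 1]
    ctrls = [s for s in idx if y[s] == 0]
    if not cases or not ctrls:
        return []

    def gap(z):
        return abs(sum(z[s] for s in cases) / len(cases)
                   - sum(z[s] for s in ctrls) / len(ctrls))

    return sorted(feats, key=lambda p: gap(feats[p]), reverse=True)[:k]


def selection_frequencies(abund, y, k=2, n_subsamples=100, seed=0):
    """Fraction of random half-subsamples in which each pair is selected."""
    rng = random.Random(seed)
    feats = dichotomize_pairs(abund)
    counts = dict.fromkeys(feats, 0)
    n = len(y)
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        for p in top_k_pairs(feats, y, idx, k):
            counts[p] += 1
    return {p: c / n_subsamples for p, c in counts.items()}


# Toy data (hypothetical): taxon 0 exceeds taxon 1 in cases, and the
# ordering is reversed in controls; taxa 2 and 3 are unrelated noise.
toy_rng = random.Random(1)
abund, y = [], []
for s in range(80):
    lo = toy_rng.uniform(0.01, 0.1)
    hi = lo + toy_rng.uniform(0.05, 0.2)
    noise = [toy_rng.uniform(0.01, 0.3), toy_rng.uniform(0.01, 0.3)]
    is_case = s % 2
    abund.append(([hi, lo] if is_case else [lo, hi]) + noise)
    y.append(is_case)

freqs = selection_frequencies(abund, y, k=2, n_subsamples=50, seed=0)
for pair, f in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(f, 2))
```

Pairs are ranked by how often they are selected across subsamples, so a pair that only looks associated in a few lucky subsamples ends up with a low frequency. In the toy data the planted pair (0, 1) separates cases from controls perfectly, which is far cleaner than real microbiome data would be.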
Year: 2021 PMID: 34752448 PMCID: PMC8631663 DOI: 10.1371/journal.pcbi.1009501
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Two pairs of bacteria with a differential relationship between oral cancer cases (red) and healthy controls (black).
Black and red lines represent linear regression lines fitted within cases and controls separately.
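The group-wise fits described in this caption can be sketched as follows, assuming ordinary least squares with one taxon's abundance regressed on the other. The function name and the numbers are hypothetical and are not taken from the oral cancer data set.

```python
def fit_line(x, y):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical abundances of two taxa (x, y), split by outcome group.
groups = {
    "case": ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]),
    "control": ([1, 2, 3, 4], [7.9, 6.1, 3.8, 2.2]),
}

# Fit one line per group, as in the figure, rather than one pooled line.
lines = {g: fit_line(x, y) for g, (x, y) in groups.items()}
for g, (slope, intercept) in lines.items():
    print(g, round(slope, 2), round(intercept, 2))
```

Slopes of opposite sign in the two groups are one simple way a differential relationship can show up; a single regression pooled over both groups would largely average it away.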
Fig 2. Operating characteristics of PairSeek and the FDR-adjusted single pair screens, which adjusted for main effects using either relative abundance (RA) or the centered log-ratio transformation (CLR).
The top row shows the average number of true and false positives (TP and FP, respectively) for M = 2 under the Alternative scenario, across values of the selection threshold for PairSeek and across FDR thresholds for the two single pair screens. The second row shows the parallel results for M = 6. The last row shows the performance of the three methods under scenarios Null 1 (M0 = 2 or 6 main effects only) and Null 2 (no association between the microbiome and the generated outcome).
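Two quantities named in this caption can be made concrete with a short Python sketch: the centered log-ratio transformation, and the TP and FP counts at a given selection threshold. The pseudocount for zero counts, the function names, and the example scores are assumptions for illustration, not values or code from the study.

```python
import math


def clr(counts, pseudo=0.5):
    """Centered log-ratio: log of each part minus the mean log over all parts.

    ASSUMPTION: a pseudocount of 0.5 is added so that zero counts are defined.
    """
    logs = [math.log(c + pseudo) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def tp_fp(scores, true_pairs, threshold):
    """Count true and false positives among pairs scoring >= threshold."""
    selected = {p for p, s in scores.items() if s >= threshold}
    tp = len(selected & true_pairs)
    return tp, len(selected) - tp


# Hypothetical selection frequencies for four candidate pairs.
scores = {("A", "B"): 0.95, ("A", "C"): 0.70, ("B", "C"): 0.40, ("C", "D"): 0.10}
truth = {("A", "B"), ("B", "C")}

for t in (0.3, 0.6, 0.9):
    print(t, tp_fp(scores, truth, t))

print([round(v, 3) for v in clr([10, 0, 5, 85])])
```

Raising the threshold trades true positives for fewer false positives, which is the trade-off each panel traces. The CLR values of a sample always sum to zero, because the mean log is subtracted from every component.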
| Method | Scenario | 1st True Hit | 2nd True Hit | Median Rank |
| --- | --- | --- | --- | --- |
| PairSeek | 2 Pairs | 1.0 | 2.0 | |
| PairSeek | 6 Pairs | 1.0 | 2.0 | 3.8 |
| Single Pair Screen, RA Main Effects | 2 Pairs | 1.0 | 2.5 | |
| Single Pair Screen, RA Main Effects | 6 Pairs | 2.0 | 9.2 | 283.6 |
| Single Pair Screen, CLR Main Effects | 2 Pairs | 1.1 | 3.1 | |
| Single Pair Screen, CLR Main Effects | 6 Pairs | 1.3 | 5.1 | 89.5 |
| Random Forests | 2 Pairs | 11.5 | 222.9 | |
| Random Forests | 6 Pairs | 24.7 | 71.6 | 250.1 |