Stephan C Bischoff, Nguyen K Nguyen, Benjamin Seethaler, Julia Beisner, Philipp Kügler, Thorsten Stefan.
Abstract
Background: The long-term success of nonsurgical weight reduction programs is variable; thus, predictors of outcome are of major interest. We hypothesized that the intestinal microbiota, which is known to be linked with diet and obesity, contains such predictive elements.
Keywords: machine learning; microbiome; microbiota; obesity; prediction; weight loss
Year: 2022 PMID: 35956358 PMCID: PMC9370776 DOI: 10.3390/nu14153182
Source DB: PubMed Journal: Nutrients ISSN: 2072-6643 Impact factor: 6.706
Table 1. Characteristics of the study population.
| Parameter | Hypothesis-Generating Cohort |
|---|---|
| n | 15 |
| Weight loss success (n) | 8 |
| Age (years) | 40 ± 6 |
| Blood pressure systolic (mmHg) | 126 ± 15 |
| Blood pressure diastolic (mmHg) | 86 ± 11 |
| Fasting blood glucose (mg/dL) | 105 ± 619 |
| Cholesterol (mg/dL) | 203 ± 641 |
| HDL cholesterol (mg/dL) | 47 ± 610 |
| LDL cholesterol (mg/dL) | 130 ± 27 |
| Triglycerides (mg/dL) | 181 ± 154 |
| WC (cm) | 123 ± 15 |
| Weight (kg) | 128 ± 20 |
| BMI (kg/m2) | 42.4 ± 6 |
Weight loss success is defined as relative weight loss at T24 > 10%. Data are presented as mean ± standard deviation. Abbreviations: T24 = 24 months after baseline, HDL = high-density lipoprotein, LDL = low-density lipoprotein, WC = waist circumference, BMI = body mass index.
Table 2. Selection of genera and species predictor variables according to elastic net regularization.
| Predictor Variable | Occurrence (Delta BMI) | Occurrence (RWL) |
|---|---|---|
| Genera | | |
| | 10.40% | 46.10% |
| | 3.40% | 51.80% |
| | 10.40% | 26.40% |
| | 6.90% | 15.90% |
| Species | | |
| | 100.00% | 22.00% |
| | 10.10% | 19.10% |
| | 0.30% | 17.10% |
| | 0.20% | 1.30% |
| | 0.00% | 1.00% |
Table 3. Selection of genera predictor variables using the Monte Carlo approach.
| Genera (Weight 1) (A 60%, B 20%, C 20%) | Full List (102 Terms) | Reduced List (44 Terms) | Reduced List (23 Terms) | Reduced List (15 Terms) | Reduced List (7 Terms) |
|---|---|---|---|---|---|
| | 0.9957 | 0.9570 | 0.9515 | 0.9397 | 0.9551 |
| | 0.8742 | 0.7755 | 0.6916 | 0.6964 | 1.0000 |
| | 0.7179 | 0.7126 | 0.7805 | 0.8054 | 0.7513 |
| | 0.5285 | 0.5802 | 0.5475 | 0.5781 | 0.6515 |
| | 0.3212 | 0.6202 | 0.7111 | 0.7311 | 0.3895 |
| | 0.3818 | 0.5506 | 0.5359 | 0.5225 | 0.5057 |
| | 0.4032 | 0.4131 | 0.4091 | 0.4474 | |
| Burkholderia | 0.3751 | 0.5381 | 0.5932 | 0.6016 | |
| Treponema | 0.3780 | 0.5865 | 0.6172 | 0.6445 | |
| Aeromonas | 0.3029 | 0.4871 | 0.5573 | 0.5955 | |
| Gordonibacter | 0.3518 | 0.3991 | | | |
| Streptococcus | 0.3041 | 0.4350 | 0.4969 | 0.5200 | |
| Alistipes | 0.5047 | 0.4468 | 0.4249 | | |
| Haemophilus | 0.3812 | 0.4531 | 0.4335 | | |
| Bordetella | 0.3540 | 0.4177 | 0.4631 | 0.4894 | |
| Genera (Weight 2) | | | | | |
| | 0.9973 | 0.9981 | 0.9686 | 0.9936 | 0.9609 |
| | 0.9162 | 0.8792 | 0.8201 | 0.8379 | 1.0000 |
| | 0.6671 | 0.6583 | 0.7758 | 0.7302 | 0.7180 |
| | 0.6315 | 0.6632 | 0.6414 | 0.6044 | 0.6844 |
| | 0.3434 | 0.5460 | 0.7008 | 0.7817 | 0.4112 |
| | 0.4706 | 0.5025 | 0.5379 | 0.5652 | 0.5645 |
| | 0.4946 | 0.4854 | 0.4529 | 0.5026 | |
| Burkholderia | 0.4192 | 0.5271 | 0.6486 | 0.6774 | |
| Treponema | 0.4055 | 0.4918 | 0.5667 | 0.5731 | |
| Aeromonas | 0.3337 | 0.4694 | 0.5539 | 0.6356 | |
| Gordonibacter | 0.4607 | 0.4597 | | | |
| Streptococcus | 0.3485 | 0.5570 | 0.5711 | 0.6459 | |
| Alistipes | 0.6135 | 0.5296 | 0.4784 | | |
| Haemophilus | 0.5137 | 0.5667 | 0.5167 | 0.5037 | |
| Bordetella | 0.3599 | 0.4661 | 0.5346 | 0.6305 | |
Predictor variables for the genera data set and weightings 1 and 2 are shown in the table. Genera that were eventually selected are in boldface. Italics represent genera that were already selected in the previous step (elastic net regularization). * Oscillibacter was selected as it was one of the top predictor variables (together with Blautia) in a reduced data set that only tested the 46 most common genera (data not shown). Detailed results including all 102 genera, as well as the calculation of the index values for the full list (102 terms), and weighting 2 can be found in Supplementary Table S3.
Table 4. Selection of species predictor variables using the Monte Carlo approach.
| Species (Weight 1) (A: 60%, B: 20%, C: 20%) | Full List (106 Terms) | Reduced List (28 Terms) | Reduced List (16 Terms) | Reduced List (8 Terms) | |
|---|---|---|---|---|---|
| | 0.9043 | 0.6981 | 0.7663 | 0.9521 | |
| | 0.6873 | 0.9467 | 0.9814 | 0.8594 | |
| | 0.6824 | 0.8763 | 0.9004 | 0.6096 | |
| | 0.6871 | 0.6270 | 0.6522 | 0.8361 | |
| | 0.5777 | 0.7872 | 0.8143 | 0.6423 | |
| | 0.4309 | 0.7694 | 0.8154 | 0.6728 | |
| | 0.3782 | 0.4657 | 0.4623 | 0.4312 | |
| | 0.4986 | 0.4376 | 0.4079 | 0.4461 | |
| Prevotella dentalis | 0.3424 | 0.5949 | 0.6055 | | |
| Bifidobacterium bifidum | 0.4589 | 0.5286 | 0.5876 | | |
| Pseudoflavonifractor capillosus | 0.3900 | 0.5629 | 0.5214 | | |
| Bacteroides ovatus | 0.3736 | 0.5577 | 0.5795 | | |
| Clostridium hathewayi | 0.3342 | | | | |
| Species (Weight 2) | | | | | |
| | 0.9522 | 0.8361 | 0.8729 | 0.8527 | 0.9760 |
| | 0.6167 | 0.8676 | 0.9245 | 0.9609 | 0.8501 |
| | 0.6871 | 0.8249 | 0.8489 | 0.8990 | 0.6666 |
| | 0.7308 | 0.7343 | 0.7362 | 0.7914 | 0.8713 |
| | 0.5581 | 0.7228 | 0.7384 | 0.7799 | 0.7011 |
| | 0.4554 | 0.6997 | 0.7370 | 0.7894 | 0.6564 |
| | 0.5017 | 0.5354 | 0.5162 | 0.5548 | 0.4834 |
| | 0.6040 | 0.5443 | 0.5195 | 0.4576 | 0.5029 |
| Prevotella dentalis | 0.3577 | 0.6112 | 0.6224 | 0.6351 | |
| Bifidobacterium bifidum | 0.4012 | 0.5426 | 0.5808 | 0.6033 | |
| Pseudoflavonifractor capillosus | 0.5032 | 0.6374 | 0.5877 | 0.5341 | |
| Bacteroides ovatus | 0.3828 | 0.5227 | 0.5328 | 0.5515 | |
| Clostridium hathewayi | 0.2808 | | | | |
Predictor variables for the species data set and weightings 1 and 2 are shown in the table. Species that were eventually selected are in boldface. Italics represent species that were already selected in the previous step (elastic net regularization). Detailed results including all 106 species, as well as the calculation of the index values for the full list (106 terms), and weighting 2 can be found in Supplementary Table S3.
Table 5. Most-relevant candidates on the genera and species level for weight loss success predictor variables.
| Predictor variable | MC rank (full list) | MC rank (final list) | Elastic net | Correlation | Final score |
|---|---|---|---|---|---|
| Weight (Importance) | 1 | 1 | 3 | 1 | |
| Genera | | | | | |
| | 2 | 1 | 1 | 1.5 | 15.5 |
| | 1 | 2 | - | −0.5 | 11.5 |
| | 3 | 3 | - | - | 8 |
| | 5 | - | 1 | 1.5 | 6.5 |
| | 4 | 4 | - | - | 6 |
| | - | - | 1 | 1.5 | 4.5 |
| | - | - | 1 | 0.5 | 3.5 |
| Species | | | | | |
| | 1 | 1 | 1 | 2 | 17 |
| | 3 | 2.5 | - | −0.5 | 9 |
| | 3 | 2.5 | - | - | 8.5 |
| | 5 | - | 1 | 1.5 | 6.5 |
| | 3 | 5.5 | - | - | 5.5 |
| | 5 | 4.5 | - | - | 4.5 |
| | - | - | 1 | 1 | 4 |
| | - | 5 | - | - | 2 |
Final sets of selected genera and species are shown in the table. A final score was calculated based on a set of weights (top line) for the different approaches used to identify predictor variables. The first column contains the names of the selected genera and species; the next two columns show the ranks of the predictor variables in the full (second column) and final (third column) lists of the Monte Carlo (MC) approach. The fourth column marks the predictor variables selected by the elastic net approach (1), while the fifth column identifies predictor variables with strong correlations (positive or negative) to both delta BMI and RWL. Negative correlations are marked with a minus sign, and the absolute values are assigned by grouping the predictor variables into four classes by strength of correlation, with the absolute correlation rounded to full percent: between 45% and 50%, a value of 0.5 is assigned; between 50% and 55%, a value of 1; between 55% and 60%, a value of 1.5; and 60% and above, a value of 2. The final score for each predictor variable is calculated by subtracting each MC rank (columns 2 and 3) from 7, multiplying each difference by its weight (1 in both cases), and adding the value of column 4 multiplied by its weight (3) and the absolute value of column 5 multiplied by its weight (1). Only predictor variables with final scores of 2 and above are shown.
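The scoring rule described in this caption can be illustrated with a short Python sketch. The function names are hypothetical and not taken from the study. Two points are assumptions: a missing entry ("-") is treated as contributing nothing to the score, which reproduces the final scores listed in the table, and the correlation classes are treated as half-open intervals because the caption does not say which class a boundary value such as 50% belongs to.

```python
from typing import Optional

# Weights from the table's top line: MC full list, MC final list,
# elastic net, correlation.
WEIGHTS = (1, 1, 3, 1)


def correlation_class(abs_corr_percent: float) -> float:
    """Map an absolute correlation (in percent) to its class value.

    Assumption: boundaries are half-open, e.g. 50% falls in the 50-55% class.
    Correlations below 45% get 0, matching a "-" entry in the table.
    """
    pct = round(abs_corr_percent)  # caption: rounded to full percent
    if pct >= 60:
        return 2.0
    if pct >= 55:
        return 1.5
    if pct >= 50:
        return 1.0
    if pct >= 45:
        return 0.5
    return 0.0


def final_score(rank_full: Optional[float],
                rank_final: Optional[float],
                elastic_net: int,
                corr_value: float,
                weights=WEIGHTS) -> float:
    """Final score as described in the caption.

    rank_full / rank_final: MC ranks, or None where the table shows "-".
    elastic_net: 1 if selected by elastic net, else 0.
    corr_value: signed class value from column 5 (0 where the table shows "-").
    """
    w_full, w_final, w_en, w_corr = weights
    score = 0.0
    if rank_full is not None:
        score += (7 - rank_full) * w_full
    if rank_final is not None:
        score += (7 - rank_final) * w_final
    score += elastic_net * w_en
    score += abs(corr_value) * w_corr  # only the absolute value enters
    return score


# First genus row of the table: ranks 2 and 1, elastic net 1, correlation 1.5.
print(final_score(2, 1, 1, 1.5))
```

For the first genus row this gives (7 − 2) + (7 − 1) + 3 × 1 + 1.5 = 15.5, which matches the table's final score.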
Table 6. Classification of relative weight-loss (RWL) predictions for each patient under different models.
| DS | 1 Term Model | 2 Terms Model | 3 Terms Model | 4 Terms Model | 5 Terms Model |
|---|---|---|---|---|---|
| 1 | correct(−) | correct(−) | correct(−) | correct(−) | correct(−) |
| 2 | correct(−) | correct(−) | correct(−) | correct(−) | correct(−) |
| 3 | underpredicted | underpredicted | correct(+) | correct(+) | correct(+) |
| 4 | correct(−) | correct(−) | correct(−) | correct(−) | correct(−) |
| 5 | correct(−) | correct(−) | correct(−) | correct(−) | correct(−) |
| 6 | correct(+) | correct(+) | correct(+) | correct(+) | correct(+) |
| 7 | correct(−) | correct(−) | correct(−) | overpredicted | correct(−) |
| 8 | correct(+) | correct(+) | correct(+) | correct(+) | correct(+) |
| 9 | correct(+) | correct(+) | correct(+) | correct(+) | correct(+) |
| 10 | underpredicted | correct(+) | correct(+) | correct(+) | correct(+) |
| 11 | overpredicted | correct(−) | correct(−) | correct(−) | correct(−) |
| 12 | underpredicted | correct(+) | correct(+) | underpredicted | underpredicted |
| 13 | correct(−) | correct(−) | correct(−) | correct(−) | correct(−) |
| 14 | underpredicted | underpredicted | underpredicted | correct(+) | correct(+) |
| 15 | correct(+) | correct(+) | correct(+) | correct(+) | correct(+) |
| Total correct(+) | 4 | 6 | 7 | 7 | 7 |
| Total correct(−) | 6 | 7 | 7 | 6 | 7 |
| Total underpredicted | 4 | 2 | 1 | 1 | 1 |
| Total overpredicted | 1 | 0 | 0 | 1 | 0 |
Classification of the predicted relative weight loss (RWL) for each patient under the optimal linear regression models with one to five linear terms, each containing predictor variables from the final set of selected species in Table 5. Predictions for patients, numbered from 1 to 15 (column “DS”), were classified as “correct(+)” where significant weight loss occurred and was predicted, as “correct(−)” where significant weight loss did not occur and was not predicted, as “overpredicted” where a model predicted significant weight loss but it did not occur, and as “underpredicted” where a model predicted no significant weight loss but significant weight loss occurred. The threshold for a significant relative weight loss was chosen as 10%. Analogous tables for delta BMI, and for predictor variables from the final set of selected genera of Table 5 for both RWL and delta BMI, can be found in Supplementary Tables S5 and S6.
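The four classification labels defined in this caption can be sketched in Python as follows. The function name and the example values are hypothetical and are not study data. The strict comparison (greater than the threshold) follows the definition of weight loss success given with the study population table. The code writes the minus sign in "correct(-)" as an ASCII hyphen.

```python
from collections import Counter

THRESHOLD = 10.0  # percent relative weight loss counted as significant


def classify(actual_rwl: float, predicted_rwl: float,
             threshold: float = THRESHOLD) -> str:
    """Label one prediction using the four categories from the caption."""
    lost = actual_rwl > threshold          # significant weight loss occurred
    predicted = predicted_rwl > threshold  # model predicted significant loss
    if lost and predicted:
        return "correct(+)"
    if not lost and not predicted:
        return "correct(-)"
    if predicted:
        return "overpredicted"   # predicted, but did not occur
    return "underpredicted"      # occurred, but was not predicted


# Hypothetical (actual RWL %, predicted RWL %) pairs, for illustration only.
patients = [(14.2, 12.5), (3.1, 4.0), (6.0, 11.3), (18.7, 8.9)]
labels = [classify(actual, predicted) for actual, predicted in patients]

# Tally the labels per model, as in the summary rows of the table.
print(Counter(labels))
```

Counting the labels per model in this way gives the kind of summary totals reported at the bottom of the classification table.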
Figure 1. Microbial differences between individuals from the persistent success and the nonpersistent success groups on genus level. (A) This principal component analysis plot shows the 7 weight-loss predictive genera as well as the 15 individuals, color-coded by their persistent/nonpersistent weight-loss status. Distances between the dots (representing individuals) were computed by the Euclidean index on centered-log ratio transformed data of the seven weight-loss predictive genera presented in Table 5. Dim, dimension/principal component; PERMANOVA, permutational multivariate analysis of variance. (B) Boxplots showing the comparisons of bacterial abundance (Z score) between the two groups for each genus using Mann–Whitney U tests with the respective p-values being adjusted with the false discovery rate method (q-value). Only comparisons with q < 0.1 are shown.
Figure 2. Microbial differences between individuals from the persistent success and the nonpersistent success groups on species level. (A) This principal component analysis plot shows the 8 weight-loss predictive species as well as the 15 individuals, color-coded by their persistent/nonpersistent weight-loss status. Distances between the dots (representing individuals) were computed by the Euclidean index on centered-log ratio transformed data of the eight weight-loss predictive species shown in Table 5. Dim, dimension/principal component; PERMANOVA, permutational multivariate analysis of variance. (B) Boxplots showing the comparisons of bacterial abundance (Z score) between the two groups for each species using Mann–Whitney U tests with the respective p-values being adjusted with the false discovery rate method (q-value). Only comparisons with q < 0.1 are shown.
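Both figure captions mention Euclidean distances on centered-log ratio (CLR) transformed abundance data. The sketch below shows one way such a distance could be computed in plain Python. The pseudocount used to handle zero counts is an assumption, since the captions do not state how zeros were treated, and the abundance values are hypothetical.

```python
import math


def clr(abundances, pseudocount=0.5):
    """Centered-log ratio transform of one sample's abundances.

    Each value becomes log(x) minus the mean of the logs, so the
    transformed values sum to zero. The pseudocount (an assumption,
    not taken from the study) avoids log(0) for absent taxa.
    """
    logs = [math.log(a + pseudocount) for a in abundances]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.dist(u, v)


# Hypothetical counts for four taxa in two individuals.
sample_a = [120, 30, 0, 50]
sample_b = [80, 45, 10, 65]

distance = euclidean(clr(sample_a), clr(sample_b))
print(round(distance, 3))
```

The Euclidean distance between CLR-transformed samples is often called the Aitchison distance. Without a pseudocount, it does not change when all abundances of a sample are scaled by a constant, which is why it suits compositional data.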