Hirokazu Fukui, Akifumi Nishida, Satoshi Matsuda, Fumitaka Kira, Satoshi Watanabe, Minoru Kuriyama, Kazuhiko Kawakami, Yoshiko Aikawa, Noritaka Oda, Kenichiro Arai, Atsushi Matsunaga, Masahiko Nonaka, Katsuhiko Nakai, Wahei Shinmura, Masao Matsumoto, Shinji Morishita, Aya K Takeda, Hiroto Miwa.
Abstract
Irritable bowel syndrome (IBS) is diagnosed on the basis of subjective clinical symptoms. We aimed to establish an objective IBS prediction model based on gut microbiome analyses employing machine learning. We collected fecal samples and clinical data from 85 adult patients who met the Rome III criteria for IBS, as well as from 26 healthy controls. The fecal gut microbiome profiles were analyzed by 16S ribosomal RNA sequencing, and short-chain fatty acids were quantified by gas chromatography-mass spectrometry. The IBS prediction model, built by machine learning on the gut microbiome data, was validated for consistency with the clinical diagnosis. The fecal microbiome alpha-diversity indices were significantly lower in the IBS group than in the healthy controls. The amount of propionic acid and the difference between butyric acid and valerate were significantly higher in the IBS group than in the healthy controls (p < 0.05). Using LASSO logistic regression, we extracted a featured group of bacteria that distinguishes IBS patients from healthy controls. Using the data for these featured bacteria, we established a machine learning prediction model for identifying IBS patients (sensitivity >80%; specificity >90%). Gut microbiome analysis using machine learning is useful for identifying patients with IBS.
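To illustrate how LASSO logistic regression can act as a feature selector, here is a minimal sketch in plain Python. It is not the authors' implementation, which is not described in this record: the solver (proximal gradient descent), the penalty strength `lam`, the step size, and the toy data are all assumptions chosen for illustration. The L1 penalty drives the coefficient of an uninformative feature to exactly zero, so only the informative feature is "selected".

```python
import math


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward 0 by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    No intercept is fitted; the toy data below is balanced and centered.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        # Predicted probability of the positive class for each sample.
        p = [1.0 / (1.0 + math.exp(-sum(wj * xj for wj, xj in zip(w, row))))
             for row in X]
        # Gradient of the mean logistic loss.
        grad = [sum((p[i] - y[i]) * X[i][j] for i in range(n)) / n
                for j in range(d)]
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [soft_threshold(w[j] - lr * grad[j], lr * lam) for j in range(d)]
    return w


# Hypothetical toy data: feature 0 tracks the label, feature 1 is pure noise.
X, y = [], []
for x_informative, label in [(1.0, 1), (2.0, 1), (-0.5, 1),
                             (-1.0, 0), (-2.0, 0), (0.5, 0)]:
    for x_noise in (1.0, -1.0):
        X.append([x_informative, x_noise])
        y.append(label)

w = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("coefficients:", w, "selected features:", selected)
```

In practice the penalty strength is usually tuned by cross-validation, and a library implementation would replace this hand-written loop.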
Keywords: IBS; gut microbiome; machine learning; short-chain fatty acids
Year: 2020 PMID: 32727141 PMCID: PMC7464323 DOI: 10.3390/jcm9082403
Source DB: PubMed Journal: J Clin Med ISSN: 2077-0383 Impact factor: 4.241
Clinical and demographic characteristics for healthy subjects and irritable bowel syndrome (IBS) patients.
| Factors | Healthy | IBS | p | IBS-C | p | IBS-D | p | IBS-M | p | IBS-U |
|---|---|---|---|---|---|---|---|---|---|---|
| Age | 46.2 ± 10.6 | 51.3 ± 15.3 | 0.058 | 56.3 ± 15.2 | 0.007 | 49.7 ± 14.0 | 0.274 | 46.8 ± 16.7 | 0.873 | 56.7 ± 5.8 |
| Sex (M/F) | 9/17 | 37/48 | 0.562 | 8/19 | 0.925 | 17/16 | 0.301 | 11/11 | 0.433 | 1/2 |
| BMI | 22.2 ± 3.6 | 22.2 ± 4.2 | 0.987 | 20.9 ± 3.5 | 0.182 | 21.9 ± 3.2 | 0.742 | 24.6 ± 5.3 | 0.079 | 18.1 ± 1.1 |
| IBS-symptom frequency (1/2/3) | NA | 34/17/33 | NA | 13/3/11 | NA | 12/7/13 | NA | 9/7/6 | NA | 0/0/3 |
| Stool frequency | 1.48 ± 0.56 | 1.40 ± 0.91 | 0.614 | 0.88 ± 0.72 | 0.003 | 2.03 ± 0.82 | 0.005 | 1.15 ± 0.80 | 0.124 | 1.23 ± 0.46 |
| Stool consistency | 4.15 ± 0.46 | 4.41 ± 1.61 | 0.223 | 3.67 ± 1.88 | 0.259 | 5.31 ± 0.71 | <0.001 | 3.95 ± 1.82 | 0.630 | 4.00 ± 0.00 |
Data are shown as mean ± SD. The frequency of IBS symptoms was graded as 1, 3–9 days/month; 2, 10–19 days/month; 3, 20 days/month to every day. NA, not available. t, t value. BMI, body mass index. p-values for IBS-U were not indicated because of low numbers. IBS-C, IBS with constipation; IBS-D, IBS with diarrhea; IBS-M, mixed IBS; IBS-U, unsubtyped IBS.
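The group comparisons in the table can be approximated from the summary statistics alone. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for the Age row. Two assumptions apply: the table does not name the test (Welch's test is taken from the figure captions), and the group sizes (26 healthy controls, 85 IBS patients) are taken from the abstract.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and degrees of freedom from mean, SD, and n."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Age row: healthy 46.2 ± 10.6 (n = 26) versus IBS 51.3 ± 15.3 (n = 85).
t_age, df_age = welch_from_summary(46.2, 10.6, 26, 51.3, 15.3, 85)
print(f"t = {t_age:.3f}, df = {df_age:.1f}")
```

The two-sided p-value then follows from Student's t distribution with that many degrees of freedom; with SciPy available, `scipy.stats.ttest_ind_from_stats(..., equal_var=False)` performs the whole calculation from the same inputs.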
Figure 1. Microbiota α-diversity (Shannon index, observed operational taxonomic units (OTUs) and phylogenetic diversity (PD) whole tree) of (A) IBS (Welch’s test, * p < 0.05) and (B) IBS types (Welch’s test, * p < 0.05). HC, D, C, and M/U indicate healthy control, IBS-D, IBS-C, and mixture of IBS-M and IBS-U, respectively.
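Two of the α-diversity indices named in the caption can be computed directly from a per-sample vector of OTU counts. The sketch below uses hypothetical counts. The logarithm base is an assumption, since the caption does not state one; some tools report the Shannon index in bits (base 2) and others in nats (base e). PD whole tree is omitted because it requires a phylogenetic tree.

```python
import math


def shannon_index(counts, base=math.e):
    """Shannon diversity H = -sum(p_i * log(p_i)) over non-zero OTUs."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def observed_otus(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


# Hypothetical samples: an even community and a skewed one.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]

h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
print(h_even, h_skewed, observed_otus(even_sample), observed_otus(skewed_sample))
```

Both samples contain four observed OTUs, but the skewed sample has a lower Shannon index because one OTU dominates the reads.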
Short-chain fatty acids in feces of healthy subjects and IBS patients.
| Factors | Healthy | IBS | p | IBS-C | p | IBS-D | p | IBS-M | p | IBS-U |
|---|---|---|---|---|---|---|---|---|---|---|
| acetic acid | 42.0 ± 8.7 | 36.9 ± 12.9 | 0.025 | 37.5 ± 11.2 | 0.114 | 34.5 ± 13.0 | 0.011 | 40.0 ± 14.2 | 0.516 | 39.8 ± 17.5 |
| propionic acid | 8.3 ± 4.4 | 11.6 ± 6.4 | 0.004 | 11.0 ± 6.6 | 0.096 | 10.8 ± 6.7 | 0.095 | 12.9 ± 5.5 | 0.004 | 17.1 ± 4.0 |
| butyric acid | 7.0 ± 3.4 | 6.5 ± 3.2 | 0.574 | 6.6 ± 3.1 | 0.668 | 5.6 ± 2.8 | 0.109 | 7.9 ± 3.9 | 0.402 | 7.3 ± 1.5 |
| valerate | 1.0 ± 0.9 | 1.1 ± 1.0 | 0.593 | 0.9 ± 0.8 | 0.797 | 1.1 ± 1.2 | 0.628 | 1.1 ± 0.9 | 0.726 | 2.4 ± 0.7 |
| iso-butyric acid | 0.7 ± 0.6 | 0.8 ± 0.5 | 0.302 | 0.9 ± 0.5 | 0.123 | 0.7 ± 0.5 | 0.970 | 0.8 ± 0.4 | 0.300 | 1.1 ± 0.8 |
| iso-valerate | 0.7 ± 0.5 | 0.7 ± 0.5 | 0.821 | 0.8 ± 0.6 | 0.456 | 0.6 ± 0.5 | 0.639 | 0.7 ± 0.4 | 0.873 | 0.9 ± 0.8 |
| butyric acid - valerate | 1.4 ± 2.6 | 5.1 ± 5.8 | <0.001 | 4.4 ± 6.9 | 0.044 | 5.2 ± 5.7 | 0.001 | 5.0 ± 4.8 | 0.004 | 9.9 ± 4.3 |
Data are shown as mean ± SD. The p value for IBS-U was not indicated because of low numbers. t, t value. † Short-chain fatty acid data were missing for 4 IBS samples (2 IBS-C and 2 IBS-M).
Figure 2. Distance of microbial composition between IBS patients and healthy controls. (A) Principal coordinate analysis (PCoA) of the unweighted UniFrac distance matrix from taxa-assigned data. Blue marks indicate healthy controls and orange marks indicate IBS types. Each orange letter indicates IBS subtype: D, IBS-D; C, IBS-C; M, IBS-M; and U, IBS-U. Green rectangles indicate the samples that belong to a green cluster coming from IBS samples alone in panel B. (B) Hierarchical clustering of unweighted UniFrac distance using Ward’s method for visualizing the relationship between IBS patients and healthy controls. A green cluster consisting of only IBS samples is shown as green rectangles in (A). (C) Unweighted UniFrac distance among healthy controls and IBS samples (Welch’s test, * p < 0.05). (D) Unweighted UniFrac distance of each IBS subtype from healthy controls (Welch’s test, * p < 0.05).
Differences in abundance of single taxa between healthy controls and IBS patients.
| Taxon genus level | Healthy Group | IBS Group | p | IBS-C | p | IBS-D | p | IBS-M | p | IBS-U |
|---|---|---|---|---|---|---|---|---|---|---|
| f_Halomonadaceae; | 0.00 ± 0.00 | 0.12 ± 0.18 | <0.001 | 0.07 ± 0.10 | <0.001 | 0.18 ± 0.24 | <0.001 | 0.12 ± 0.13 | <0.001 | 0.04 ± 0.04 |
| f_Lachnospiraceae; | 0.41 ± 0.39 | 0.23 ± 0.45 | <0.001 | 0.42 ± 0.70 | 0.008 | 0.08 ± 0.12 | <0.001 | 0.21 ± 0.28 | 0.005 | 0.24 ± 0.17 |
| f_Ruminococcaceae; | 4.41 ± 3.26 | 2.64 ± 2.93 | <0.001 | 3.37 ± 2.99 | 0.120 | 1.72 ± 2.56 | <0.001 | 2.94 ± 2.89 | 0.045 | 4.00 ± 5.05 |
| f_Enterobacteriaceae; | 0.02 ± 0.04 | 0.16 ± 0.57 | 0.001 | 0.12 ± 0.32 | 0.206 | 0.13 ± 0.18 | 0.002 | 0.28 ± 1.04 | 0.223 | 0.06 ± 0.07 |
| f_Coriobacteriaceae; | 1.76 ± 1.36 | 1.23 ± 1.59 | 0.05 | 1.16 ± 1.50 | 0.022 | 1.01 ± 1.45 | 0.004 | 1.67 ± 1.95 | 0.304 | 0.98 ± 0.95 |
Data are shown as mean ± SD. The p value for IBS-U was not indicated because of low numbers. t, t value.
Figure 3. Effect plot examining univariate differences between IBS and healthy control groups. (A) The plot shows effect size versus the expected p-value of Welch’s test. (B) The volcano plot shows the difference between groups versus the expected p-value of Welch’s test.
Figure 4. Area under the curve (AUC) scores and receiver-operating characteristic (ROC) curves for IBS prediction using taxa and short-chain fatty acid (SCFA) data. (A) Boxplots of AUC scores describing the prediction performance for IBS using taxa-assigned data, short-chain fatty acid data, and both. Solid lines in boxes indicate median and dashed lines indicate mean. (B) ROC curves describing specificity and sensitivity using taxa-assigned data, short-chain fatty acid data, and both. The gray shadow indicates the standard deviation of the ROC curve obtained using taxa-assigned data.
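For readers unfamiliar with the quantities plotted in this figure, the sketch below shows how an AUC and a single sensitivity/specificity pair can be derived from predicted scores. The scores and the 0.5 threshold are hypothetical and are not taken from the study. The AUC is computed as the fraction of (IBS, control) pairs in which the IBS sample receives the higher score, which is equivalent to the area under the ROC curve.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    true_pos = sum(1 for s in pos_scores if s >= threshold)
    true_neg = sum(1 for s in neg_scores if s < threshold)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


# Hypothetical predicted probabilities of IBS for each sample.
ibs_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.1]

auc = auc_from_scores(ibs_scores, control_scores)
sens, spec = sensitivity_specificity(ibs_scores, control_scores, threshold=0.5)
print(f"AUC = {auc:.3f}, sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Sweeping the threshold across all observed scores and plotting sensitivity against 1 − specificity traces out the ROC curve itself.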