Daniele Pietrucci, Adelaide Teofani, Valeria Unida, Rocco Cerroni, Silvia Biocca, Alessandro Stefani, Alessandro Desideri.
Abstract
Several studies investigating the involvement of the gut microbiota in Parkinson's disease (PD) have identified some common alterations of the microbial community, such as a decrease in the Lachnospiraceae and an increase in the Verrucomicrobiaceae families in PD patients. However, the results for other bacterial families are often contradictory. Machine learning is a promising tool for building predictive models for the classification of biological data, such as those produced in metagenomic studies. We tested three machine learning algorithms (random forest, neural networks and support vector machines) on 846 metagenomic samples (472 from PD patients and 374 from healthy controls), including our published data and data downloaded from public databases. Prediction performance was evaluated using the area under the curve, accuracy, precision, recall and F-score metrics. The random forest algorithm provided the best results. Bacterial families were sorted according to their importance in the classification, and a subset of 22 families was identified for the prediction of patient status. Although the results are promising, the algorithm needs to be trained on a larger number of samples in order to increase the accuracy of the procedure.
Keywords: Parkinson’s disease; gut microbiota; gut–brain axis; machine learning; predictor
Year: 2020; PMID: 32325848; PMCID: PMC7226159; DOI: 10.3390/brainsci10040242
Source DB: PubMed; Journal: Brain Sci; ISSN: 2076-3425
List of references, number of samples, methodological approaches and nationality of studies considered in this analysis.
| Reference | PD Samples | HC Samples | Sample Transport | DNA Extraction | 16S Region | Nationality |
|---|---|---|---|---|---|---|
| [ | 34 | 31 | BD Gaspak | FastDNA Spin Kit for Soil | V4 | United States |
| [ | 65 | 68 | NR | PSP Spin Stool Kit | V3-V4 | Finland |
| [ | 116 | 82 | Stabilizer PSP | PSP Spin Stool Kit | V3-V4 | Italy |
| [ | 206 | 133 | Ambient temp | Earth microbiome project protocol | V4 | United States |
| [ | 22 | 34 | Stabilizer PSP | PSP Spin Stool Kit | V3-V4 | Germany |
| [ | 29 | 26 | Immediate freezing | Custom Protocol Hopfner | V4 | Russia |
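The per-study counts in the table can be checked against the totals quoted in the abstract. The short sketch below only re-adds the numbers listed above; the dictionary keys are informal labels chosen here for readability, not identifiers from the source.

```python
# (PD samples, HC samples) per study, copied from the table above.
# Keys are informal labels introduced for this sketch only.
cohorts = {
    "United States (BD Gaspak)": (34, 31),
    "Finland": (65, 68),
    "Italy": (116, 82),
    "United States (ambient temp)": (206, 133),
    "Germany": (22, 34),
    "Russia": (29, 26),
}

total_pd = sum(n_pd for n_pd, _ in cohorts.values())
total_hc = sum(n_hc for _, n_hc in cohorts.values())
total = total_pd + total_hc

print(total_pd, total_hc, total)  # 472 374 846

# Fraction of PD samples: the accuracy a classifier would reach by always
# predicting "PD", a useful baseline when reading accuracy figures.
majority_baseline = total_pd / total
print(f"{majority_baseline:.3f}")
```

The sums match the 472 PD and 374 HC samples (846 in total) stated in the abstract. Because the classes are only mildly imbalanced, the always-predict-PD baseline sits a little above one half.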
Figure 1. (A) Average ROC curves (over 5 folds) with confidence intervals and (B) prediction performance metrics with their respective margins of error. Results for random forests are shown in green, for neural networks in purple and for support vector machines in orange.
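For readers unfamiliar with the metrics in panel (B), the following is a minimal sketch of how accuracy, precision, recall and F-score can be computed for a binary PD/HC classifier and then averaged over folds. The labels and per-fold values are toy placeholders, not data from the study, and the margin of error uses a normal approximation (z = 1.96), which is an assumption: the record does not state how the margins were calculated.

```python
from math import sqrt
from statistics import mean, stdev


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-score, with 1 = PD and 0 = HC."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / denom if denom else 0.0,
    }


def mean_with_margin(values, z=1.96):
    """Mean and normal-approximation margin of error (assumed definition)."""
    return mean(values), z * stdev(values) / sqrt(len(values))


# Toy labels for a single fold (placeholders, not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)

# Hypothetical per-fold accuracies for a 5-fold cross-validation.
fold_accuracy = [0.70, 0.72, 0.68, 0.71, 0.69]
avg, margin = mean_with_margin(fold_accuracy)
print(f"{avg:.3f} +/- {margin:.3f}")
```

With only five folds, a t-distribution quantile would give a wider interval than z = 1.96; the normal approximation is used here purely to keep the sketch dependency-free.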
Ranking of the importance of the bacterial families in discriminating between healthy controls and Parkinson’s disease (PD) patients.
| Bacterial Family | Ranking of Importance | Higher (−) or Lower (+) Abundance in PD Patients from RF Algorithm | References in the Literature Reporting Overabundance in PD Patients | References in the Literature Reporting Lower Abundance in PD Patients |
|---|---|---|---|---|
| | 1 | − | [ | [ |
| | 2 | − | [ | [ |
| | 3 | − | [ | [ |
| | 4 | + | [ | |
| | 5 | + | [ | |
| | 6 | + | [ | [ |
| | 7 | + | [ | |
| | 8 | + | | |
| | 9 | + | [ | |
| | 10 | − | | |
| | 11 | + | [ | |
| | 12 | + | [ | |
| | 13 | + | [ | |
| | 14 | + | | |
| | 15 | − | [ | |
| | 16 | + | [ | |
| | 17 | − | [ | |
| | 18 | − | [ | [ |
| | 19 | + | | |
| | 20 | + | [ | |
| | 21 | + | | |
| | 22 | − | [ | [ |
| | 23 | + | [ | |
| | 24 | − | | |
| | 25 | + | | |
| | 26 | − | | |
| | 27 | + | | |
| | 28 | − | | |
| | 29 | + | | |
| | 30 | + | | |
| | 31 | − | [ | [ |
| | 32 | + | | |
| | 33 | − | | |
| | 34 | + | [ | |
| | 35 | + | | |
| | 36 | − | | |
| | 37 | − | | |
| | 38 | + | | |
| | 39 | − | | |
| | 40 | − | | |
| | 41 | − | | |
| | 42 | − | | |
| | 43 | − | | |
| | 44 | + | | |
| | 45 | − | | |
| | 46 | + | | |
| | 47 | − | | |
| | 48 | + | | |
| | 49 | − | | |
| | 50 | + | | |
| | 51 | + | | |
| | 52 | + | | |
Figure 2. List of the 22 bacterial families required for discriminating between HC and PD patients. For each family, the average percentage abundance is shown as a bar, orange for HC and blue for PD patients (left scale). The importance of the family in discriminating patient status is shown as a red dot (right scale).
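The ranking-then-subset idea can be illustrated with a small sketch. It assumes the subset is simply the top k features by importance score, which this record does not confirm; the function name, family names and scores below are all hypothetical placeholders and not values from the study.

```python
def top_k_by_importance(importance, k):
    """Return the k (name, score) pairs with the highest importance, best first."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Placeholder names and scores, NOT values from the study.
importance = {
    "family_A": 0.05,
    "family_B": 0.21,
    "family_C": 0.12,
    "family_D": 0.02,
}

subset = top_k_by_importance(importance, k=2)
for rank, (name, score) in enumerate(subset, start=1):
    print(rank, name, score)
```

In practice the cut-off k would be chosen by checking how prediction performance changes as lower-ranked features are dropped, rather than being fixed in advance.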