Md Zakir Hossain, Elena Daskalaki, Anne Brüstle, Jane Desborough, Christian J Lueck, Hanna Suominen.
Abstract
BACKGROUND: Multiple sclerosis (MS) is a neurological condition whose symptoms, severity, and progression over time vary enormously among individuals. Ideally, each person living with MS should be provided with an accurate prognosis at the time of diagnosis, precision in initial and subsequent treatment decisions, and improved timeliness in detecting the need to reassess treatment regimens. To manage these three components, discovering an accurate, objective measure of overall disease severity is essential. Machine learning (ML) algorithms can contribute to finding such a clinically useful biomarker of MS through their ability to search and analyze datasets about potential biomarkers at scale. Our aim was to conduct a systematic review to determine how ML has been applied to the study of MS biomarkers on data from sources other than magnetic resonance imaging.
Keywords: Deep learning; Disease progression; Medical informatics; Multiple sclerosis; Prognosis; Supervised machine learning; Systematic review
Year: 2022 PMID: 36109726 PMCID: PMC9476596 DOI: 10.1186/s12911-022-01985-5
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 3.298
Searches combining “Multiple Sclerosis” with specific machine learning terms returned 1,052 studies from eight search resources
| Search terms | Search resource | Number of returned studies |
|---|---|---|
| “Multiple Sclerosis” AND (“Machine Learning” OR “Machine Intelligence” OR “Deep Learning” OR “Decision Tree*” OR “Random Forest*” OR “Pattern Recognition” OR “Genetic Algorithm*” OR “Supervised Algorithm*” OR “Decision Support System*” OR “Evolutionary Computation*” OR “Neural Network*” OR “Support Vector Machine*” OR “Autoencoder*” OR “Deep Belief Network*” OR “Adversarial Network*” OR “Self Organizing Map*” OR “Self Organising Map*”) | PubMed | 75 |
| | Cochrane | 25 |
| | Google Scholar | 100 # |
| “Multiple Sclerosis” AND (“machine learning” OR “machine intelligence”) | ScienceDirect | 340 # |
| | Scopus | 169 |
| | Web of Science | 179 # |
| | Lens | 160 |
| “Multiple sclerosis” AND “machine learning” | dblp | 4 # |
| Total count | | 1,052 |

Note: # marks searches whose results were sorted by relevance.
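The search strategy in the table above can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the helper `build_query` and the dictionary `returned` are hypothetical names, the counts are copied from the table, and the query uses straight quotes where the table uses typographic ones. The sketch shows how the shorter Boolean query is assembled and checks that the per-resource counts add up to the reported total.

```python
# Number of returned studies per search resource, copied from the table.
returned = {
    "PubMed": 75,
    "Cochrane": 25,
    "Google Scholar": 100,
    "ScienceDirect": 340,
    "Scopus": 169,
    "Web of Science": 179,
    "Lens": 160,
    "dblp": 4,
}


def build_query(disease, terms):
    """Require the disease phrase (AND) plus any one of the ML terms (OR)."""
    alternatives = " OR ".join(f'"{term}"' for term in terms)
    return f'"{disease}" AND ({alternatives})'


# The shorter query from the table (hypothetical reconstruction of its form).
query = build_query("Multiple Sclerosis", ["machine learning", "machine intelligence"])

# Sum of all per-resource counts; should match the reported total.
total = sum(returned.values())
```

Here `total` evaluates to 1052, which matches the total count row, and `query` reproduces the form of the second search string.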
Fig. 1 Flow chart of the systematic review process
Fig. 2 Distribution of manuscripts by publication year. The total number of publications adds up to 68 because, out of the 66 included publications, one discussed both diagnosis and MS sub-types and another discussed both diagnosis and prognosis
Summary of the included papers that reported on applications towards evaluating response to treatment, symptoms, or underlying pathophysiology, together with those for improving measurement tools or support groups. Abbreviations as in Table 2
| Author | Data sources | ML methods | Outcomes |
|---|---|---|---|
| Baranzini et al. [ | INF- | RF; | Accuracy in [75.0%, 82.0%]; CASP2 / IL10 / IL12Rb1. |
| Ebrahimkhani et al. [ | microRNA | LR; RF; | AUC in [65.2%, 91.1%]. |
| Fagone et al. [ | Genomics | UCSC; | Accuracy = 89.2%. |
| Karim et al. [ | INF- | CART; LASSO; SVM; LR; | Hazard Ratio[4] in [1.359, 1.372]. |
| Kasatkin et al. [ | Flu-like symptoms | NN; Static Model; | Sensitivity in [73.4%, 81.2%]; Specificity in [71.6%, 80.6%]. |
| Li et al. [ | Cardiac data | DT; | Baseline heart rate (HR). |
| Üçer et al. [ | INF- | SNAc; SVM; KNN; RF; NB; LR; DT; | Accuracy in [63.1%, 64.5%]; F1 score in [77.4%, 78.3%]; |
| Walter et al. [ | Costing data | DT; | NAb is cheaper than other tests. |
| Patrick et al. [ | RNAs | GB; LR; RF; LASSO; DA; Nearest SC; WE; | AUC in [72.1%, 89.9%]; |
| Bhattacharya et al. [ | Daily activities | NN; | Fatigue. |
| Papakostas et al. [ | EMG | SVM; RF; ET; Gradient-Boosting; | F1 Score in [75.1%, 77.8%]. |
| Chi et al. [ | Genetic ancestry | LR; RF | HLA-DRB1*15:01 and HLA-DRB1*03:01 alleles. |
| Forbes et al. [ | Gut microbiota | RF; | Accuracy in [82.0%, 84.0%]; AUC in [91.0%, 94.0%]. |
| Sébastien et al. [ | Gait analysis | ET; | Accuracy in [70.9%, 91.7%]. |
| Michel et al. [ | Quality of life | DT; IRT; | Accuracy in [96.0%, 98.0%]. |
| Rezaallah et al. [ | Social media text | NLP; NB; | 6 topics related to MS medication. |
| Deetjen et al. [ | Text data | LR; NB; | Accuracy in [91.6%, 96.0%]; 56% informational and 44% emotional for MS. |
Summary of 49 included papers that reported on applications towards supporting diagnosis, disease status assessment, MS sub-typing, and prognosis. See Table 3 for a summary of 17 included papers that reported on other applications. Abbreviations are listed below the table
| Author | Data sources | ML methods | Outcomes |
|---|---|---|---|
| | | | |
| Ahmadi et al. [ | EEG | OS-ELM; | Accuracy in [90.0%, 91.0%]. |
| Andersen et al. [ | Metabolomics | LR; RF; | AUC in [81.0%, 86.0%]. |
| Bertolazzi et al. [ | Genes | KNN; SVM; DT; | Accuracy in [92.0%, 95.0%]. |
| Broza et al. [ | Breath markers | NN; | Accuracy in [72.0%, 90.0%]; AUC in [79.0%, 87.0%]. |
| Chase et al. [ | Medical records | NB; NLP; | AUC in [90.0%, 94.0%]. |
| deAndrés-G. et al. [ | Genetic pathways | Distance-based classifier; Minimum spanning tree; | Accuracy in [93.8%, 98.2%]; Neurogenesis and Hemoglobin related genes. |
| Galli et al. [ | Lymphocytes | NN; | TNF, GM-CSF, IFN- |
| Goldstein et al. [ | SNP | RF; LASSO; GLM; KNN; LR; | CRHR1. |
| Goyal et al. [ | Cytokines | SVM; NN; DT; RF; | Accuracy = 90.9%; AUC = 95.7%. |
| Lötsch et al. [ | Lipid markers | SOM; AdaBoost; KNN; RF; | Accuracy in [92.5%, 100%]; AUC in [92.5%, 100%]. |
| Lötsch et al. [ | Lipid markers | SOM; | Accuracy in [77.0%, 94.6%]; Ceramides. |
| Perera et al. [ | Tremor | Linear Regression; SVR; RF; | Accuracy in [84.2%, 90.8%]; Velocity of index finger. |
| Prabahar et al. [ | MicroRNA | SVM; | Accuracy in [87.8%, 90.1%]. |
| Severini et al. [ | Balance board | SVM; | Accuracy in [83.3%, 85.5%]. |
| Telalovic et al. [ | lncRNAs | RF; | Accuracy in [61.5%, 84.6%]. |
| Torabi et al. [ | EEG | SVM; KNN; | Accuracy in [79.8%, 93.1%]. |
| Zhang et al. [ | Genetic pathways | SVM; | Accuracy in [61.2%, 70.3%]. |
| Kiiski et al. [ | ERPs | Linear Regression; | Visual task is better than auditory task. |
| Saroukolaei et al. [ | Enzymes | Linear Regression; NN; | Higher CA. |
| Sun et al. [ | Postural sway | RF; | Accuracy in [92.3%, 95.6%]. |
| | | | |
| Bang et al. [ | Gut microbial | SVM; KNN; LogitBoost; Logistic Tree; | Accuracy in [96.4%, 98.3%]. |
| Guo et al. [ | Transcriptomics | KNN; SVM; NB; NN; LR; RF; | Accuracy in [77.2%, 86.4%]; TNFSF10 is associated with PwMS. |
| Ohanian et al. [ | Key symptoms | DT; | Accuracy in [79.2%, 81.2%]; Immune domain is useful in this case. |
| Ostmeyer et al. [ | B-cell receptor | Optimize Log Likelihood; | Accuracy in [72.0%, 87.0%]. |
| | | | |
| Azrour et al. [ | Gait analysis | DT; | EDSS score in [< 0.97 (No MS), >4.15 (MS)]. |
| Fritz et al. [ | Falls risk | LR; | Fallers and near-fallers are at similar risks. |
| Gudesblatt et al. [ | Falls risk | RF; | Accuracy in [82.9%, 91.2%]; F1 score in [78.9%, 91.3%]. |
| Haider et al. [ | Body movements | SVM; KNN; RF; | Accuracy in [95.5%, 100%]. |
| Jackson et al. [ | Genetic markers | RF; | 19 genetic variants. |
| Kosa et al. [ | Clinical data, MEP | GA; | CombiWISE is better than MRI measures. |
| McGinnis et al. [ | Gait speeds | SVR; | RMSE speed in [0.12 m/s, 0.14 m/s]. |
| Morrison et al. [ | Motor assessment | DT; SVM; | Visualisation reduces the gap between human and ML. |
| Shahid et al. [ | Clinical data | KNN; SVM; RF; Rough Set; | Accuracy in [79.7%, 84.0%]. |
| Supratak et al. [ | Walking speed | SVR; | Walking speed in [0.57 m/s, 1.22 m/s]. |
| | | | |
| Acquarelli et al. [ | Pathology | NLP; Clustering; | Pathological profiles and disease duration. |
| Fiorini et al. [ | Clinical data | LS; LR; SVM; KNN; | Accuracy in [75.0%, 78.3%]; F1 score in [62.3%, 70.2%]. |
| Gronsbell et al. [ | EMR | SSL; | Accuracy in [92.9%, 93.9%]. |
| Gupta et al. [ | Microbiomics | RF; | Specificity = 86.4%; Sensitivity = 45.4%. |
| Lim et al. [ | Kynurenine | DT; DA; CART; SVM; | Accuracy in [83.0%, 91.0%]. |
| Lopez et al. [ | Genetic signatures | Clustering; | CD69, CCR5, IL13, and STAT3. |
| | | | |
| Bejarano et al. [ | Clinical, MEP | NB; NN; LR; DT; Linear Regression; | Accuracy in [67.0%, 80.0%]; AUC in [65%, 76.0%]. |
| Brichetto et al. [ | Clinical data | Supervised Algorithms; | Accuracy in [82.6%, 86.0%]. |
| Briggs et al. [ | Clinical data | LASSO; | Obesity and smoking. |
| Flauzino et al. [ | Clinical data | LR; NN; | AUC = 84.2%; Lower IL4. |
| Pruenza et al. [ | Clinical data | RF; | AUC in [80.0%, 82.0%]. |
| Tacchella et al. [ | Clinical data | RF; | AUC in [69.6%, 72.5%]. |
| Yperman et al. [ | MEP | RF; LR; | AUC in [72.0%, 75.0%]. |
| Zhao et al. [ | Clinical data | SVM; LR; | Accuracy in [68.0%, 73.0%]. |
| Zhao et al. [ | Clinical data | SVM; KNN; AdaBoost; | Accuracy in [76.0%, 90.0%]. |
Accuracy = (TP + TN) / (TP + TN + FP + FN); FPR = FP / (FP + TN); Precision = TP / (TP + FP); F1 Score = 2 * (Recall * Precision) / (Recall + Precision); Sensitivity / Recall / TPR = TP / (TP + FN); Specificity = TN / (TN + FP); AUC = Area Under the ROC curve, calculated from the plot of TPR vs. FPR;
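As a minimal sketch of how the metric definitions above translate into code, the Python below computes them from confusion-matrix counts and approximates AUC with the trapezoidal rule over (FPR, TPR) points. The function names `confusion_metrics` and `auc_trapezoid` and the example counts are hypothetical and do not come from any reviewed study. The sketch assumes all denominators are non-zero and that the ROC points are sorted by increasing FPR.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute the listed metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (no empty class or prediction set).
    """
    recall = tp / (tp + fn)        # Sensitivity / Recall / TPR
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "fpr": fp / (fp + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * (recall * precision) / (recall + precision),
    }


def auc_trapezoid(fpr, tpr):
    """Approximate the area under an ROC curve with the trapezoidal rule.

    `fpr` and `tpr` are equal-length sequences sorted by increasing FPR,
    normally starting at (0, 0) and ending at (1, 1).
    """
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2
        area += width * mean_height
    return area


# Illustrative counts only (not taken from the review).
metrics = confusion_metrics(tp=40, tn=45, fp=5, fn=10)

# A three-point ROC curve through the single operating point above.
auc = auc_trapezoid([0.0, metrics["fpr"], 1.0], [0.0, metrics["recall"], 1.0])
```

With these illustrative counts, accuracy is 0.85, recall is 0.8, and specificity is 0.9. A single operating point gives only a coarse AUC estimate; a full ROC curve needs the classifier's scores at many thresholds.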
CART = Classification and Regression Tree; DA = Discriminant Analysis; DT = Decision Tree; ET = Extra-Trees; FN = False Negatives; FP = False Positives; FPR = False Positive Rate; GA = Genetic Algorithm; GAIMS = Gait Analysis Imaging System; GB = Gradient Boosting; GLM = Generalized Linear Model; IP-GRASP = A Greedy Randomized Adaptive Search Procedure with memory; IRT = Item Response Theory; KNN = k-nearest Neighbour; LASSO = Least absolute shrinkage and selection operator; LR = Logistic Regression; LS = Least Squares; ML = Machine Learning; MRI = Magnetic Resonance Imaging; NB = Naïve Bayes; NLP = Natural Language Processing; NN = Neural Network; OS-ELM = Online Sequential Extreme Learning Machine; QoL = Quality of Life; RF = Random Forest; RMSE = Root Mean Square Error; ROC = Receiver Operating Characteristic; RR = Relapsing-Remitting Multiple Sclerosis; SC = Shrunken Centroid; SOM = Self-Organising Map; SNAc = Social Network Analysis-based Classifier; SSL = Semi-supervised Learning; SVM = Support Vector Machines; TN = True Negatives; TP = True Positives; TPR = True Positive Rate;
CA = Candida Albicans; CAO = Clinician Assessed Outcomes; CFS = Chronic Fatigue Syndrome; CIS = Clinically Isolated Syndrome; EDSS = Expanded Disability Status Scale; EEG = Electroencephalogram; EMG = Electromyogram; EMR = Electronic Medical Record; ERPs = Event Related Potentials; HC = Healthy Controls; IM & NO = Immune-inflammatory, Metabolic, and Nitro-Oxidative; KP = Kynurenine Pathway; lncRNAs = long non-coding RNAs; ME = Myalgic Encephalomyelitis; MEP = Motor Evoked Potentials; MS = Multiple Sclerosis; NAb = Neutralising Antibodies; PP = Primary-Progressive Multiple Sclerosis; PRO = Patient Reported Outcomes; PwMS = people living with MS; rRNA = Ribosomal Ribonucleic Acid; SP = Secondary-Progressive Multiple Sclerosis; without MS = people living without Multiple Sclerosis; WE = Word Embedding;
C6ORF10 = Chromosome 6 Open Reading Frame 10; CASP2 = Caspase 2, Apoptosis-Related Cysteine Peptidase; CCR5 = C-C Chemokine Receptor Type 5; CD69 = CD69 Antigen (P60, Early T-Cell Activation Antigen); CRHR1 = Corticotropin Releasing Hormone Receptor 1; CXCR4 = C-X-C Motif Chemokine Receptor 4; GM-CSF = Granulocyte-Macrophage Colony-Stimulating Factor; HLA-DRB1 = Human Leukocyte Antigen haplotype, DR beta 1; IFN-β = Interferon beta; IFN-γ = Interferon Gamma; IL2 = Interleukin 2, T Cell Growth Factor; IL4 = Interleukin 4; IL10 = Interleukin 10; IL12Rb1 = Interleukin 12 Receptor Subunit Beta 1; IL13 = Interleukin 13; TAP2 = Transporter 2, ATP Binding Cassette Subfamily B Member; TNF = Tumor Necrosis Factor; TNFSF10 = Tumor Necrosis Factor (ligand) superfamily, member 10; STAT3 = Signal Transducer and Activator Of Transcription 3;
Fig. 3 Sunburst chart of machine learning algorithms applicable to multiple sclerosis studies
Fig. 4 Histogram of machine learning algorithms in multiple sclerosis studies. The y-axis refers to the number of studies
Fig. 5 Sunburst chart of machine learning applications and data in multiple sclerosis studies
Fig. 6 Histogram of data for ML applications. The y-axis refers to the number of studies