Alaa Abd-Alrazaq, Dari Alhuwail, Jens Schneider, Carla T Toro, Arfan Ahmed, Mahmood Alzubaidi, Mohannad Alajlani, Mowafa Househ.
Abstract
Artificial intelligence (AI) has been successfully exploited in diagnosing many mental disorders. Numerous systematic reviews summarize the evidence on the accuracy of AI models in diagnosing different mental disorders. This umbrella review aims to synthesize the results of previous systematic reviews on the performance of AI models in diagnosing mental disorders. To identify relevant systematic reviews, we searched 11 electronic databases, checked the reference lists of the included reviews, and checked the reviews that cited the included reviews. Two reviewers independently selected the relevant reviews, extracted the data from them, and appraised their quality. We synthesized the extracted data using the narrative approach. We included 15 systematic reviews out of the 852 citations identified. The included reviews assessed the performance of AI models in diagnosing Alzheimer's disease (n = 7), mild cognitive impairment (n = 6), schizophrenia (n = 3), bipolar disease (n = 2), autism spectrum disorder (n = 1), obsessive-compulsive disorder (n = 1), post-traumatic stress disorder (n = 1), and psychotic disorders (n = 1). The performance of the AI models in diagnosing these mental disorders ranged between 21% and 100%. AI technologies offer great promise in diagnosing mental health disorders. The reported performance metrics paint a vivid picture of a bright future for AI in this field. Healthcare professionals in the field should cautiously and consciously begin to explore the opportunities of AI-based tools for their daily routine. It would also be encouraging to see a greater number of meta-analyses and further systematic reviews on the performance of AI models in diagnosing other common mental disorders such as depression and anxiety.
Year: 2022 PMID: 35798934 PMCID: PMC9262920 DOI: 10.1038/s41746-022-00631-8
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Fig. 1 Flow chart of the study selection process: 852 citations were retrieved from searching the databases.
Of these, 344 duplicates were removed. Screening the titles and abstracts of the remaining citations led to the exclusion of 446 citations. After reading the full text of the remaining 62 publications, we excluded 48. One additional systematic review was identified by checking the reference lists of the included reviews. In total, 15 systematic reviews were included in the current review.
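The counts in the flow chart can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers stated in the caption; the variable names are illustrative and not taken from the review.

```python
# Counts copied from the Fig. 1 caption; variable names are illustrative.
retrieved = 852
duplicates_removed = 344
excluded_title_abstract = 446
excluded_full_text = 48
added_from_reference_lists = 1

# Each stage subtracts the records excluded at that stage.
after_dedup = retrieved - duplicates_removed
full_text_read = after_dedup - excluded_title_abstract
included_from_search = full_text_read - excluded_full_text
included_total = included_from_search + added_from_reference_lists

print(after_dedup, full_text_read, included_from_search, included_total)
# prints: 508 62 14 15
```

The intermediate value of 62 full-text publications and the final total of 15 reviews both agree with the caption.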
Meta-data of the included reviews.
| Study | Year | Country | Publication type | Registered protocol | Followed guidelines |
|---|---|---|---|---|---|
| Pellegrini | 2018 | UK | Journal article | Yes | PRISMA |
| Billeci | 2020 | Italy | Journal article | No | PRISMA |
| Sarica | 2017 | Italy | Journal article | No | PRISMA |
| Ebrahimighahnavieh | 2020 | Australia | Journal article | No | No |
| Petti | 2020 | UK | Journal article | No | PRISMA |
| Battista | 2020 | Italy | Journal article | No | PRISMA |
| Law | 2020 | UK | Journal article | No | PRISMA |
| de Filippis | 2019 | Italy | Journal article | No | PRISMA |
| Steardo | 2020 | Italy | Journal article | No | PRISMA |
| Bracher-Smith | 2020 | UK | Journal article | Yes | PRISMA |
| Librenza-Garcia | 2017 | Brazil | Journal article | No | PRISMA |
| Moon | 2019 | South Korea | Journal article | Yes | PRISMA |
| Ramos-Lima | 2019 | Brazil | Journal article | Yes | PRISMA |
| Bruin | 2019 | Netherlands | Journal article | No | PRISMA |
| Sanfelici | 2020 | Germany | Journal article | No | PRISMA |
PRISMA Preferred Reporting Items for Systematic Reviews and Meta-Analyses, UK United Kingdom.
Eligibility criteria of the included reviews.
| Study | Target disorder | AI approach | Type of data | Language restrictions | Time limit |
|---|---|---|---|---|---|
| Pellegrini | AD & MCI | UML, SML, DL | Neuroimaging data | No restriction | January 1, 2006–September 30, 2016 |
| Billeci | AD & MCI | SML | Neuroimaging data | NR | January 1, 2010–2019 |
| Sarica | AD & MCI | SML | Neuroimaging data | English | January 1, 2007–May 1, 2017 |
| Ebrahimighahnavieh | AD & MCI | DL | Neuroimaging data | English | No restriction |
| Petti | AD & MCI | UML, SML, DL | Neuropsychological tests | English | January 1, 2013–August 8, 2019 |
| Battista | AD & MCI | SML | Neuropsychological tests | English | January 1, 2010–July 15, 2018 |
| Law | AD & DLB | SML | EEG measures | English | No restriction |
| de Filippis | SCZ | UML, SML, DL | Neuroimaging data | No restriction | No restriction |
| Steardo | SCZ | SML | Neuroimaging data | No restriction | No restriction |
| Bracher-Smith | SCZ, BD, ASD, AN | UML, SML, DL | Genetic data | English | No restriction |
| Librenza-Garcia | BD | UML, SML, DL | No restriction | No restriction | January 1, 1960–January 1, 2017 |
| Moon | ASD | UML, SML, DL | Neuroimaging data | No restriction | No restriction |
| Ramos-Lima | PTSD | UML, SML, DL | No restriction | No restriction | January 1, 1960–May 1, 2019 |
| Bruin | OCD | SML | Neuroimaging data | NR | No restriction |
| Sanfelici | Psychotic disorders | SML | No restriction | English | No restriction |
AD Alzheimer’s disease, AI Artificial intelligence, AN Anorexia nervosa, ASD Autism spectrum disorder, BD Bipolar disease, DL Deep learning, DLB Dementia with Lewy bodies, EEG Electroencephalography, MCI Mild cognitive impairment, NR Not reported, OCD Obsessive-compulsive disorder, PTSD Post-traumatic stress disorder, SCZ Schizophrenia, SML Supervised machine learning, UML Unsupervised machine learning.
Search sources, study selection, data extraction, quality assessment, and data synthesis in the included reviews.
| Study | Databases searched | Reference list checking | Reviewers: study selection | Reviewers: data extraction | Reviewers: quality assessment | Quality assessment tool | Meta-analysis |
|---|---|---|---|---|---|---|---|
| Pellegrini | MEDLINE, Elsevier, IEEE Xplore, Science Direct, ACM Digital Library, Web of Science | No | 2 | NR | NR | QUADAS-2 | No |
| Billeci | MEDLINE | No | 2 | NR | NA | No | No |
| Sarica | MEDLINE, Scopus, Web of Science, Google Scholar | No | 2 | NR | NA | No | No |
| Ebrahimighahnavieh | IEEE Xplore, ScienceDirect, SpringerLink, ACM Digital Library, Web of Science, Scopus | Forward | 1 | NR | NR | Tool developed by the authors | No |
| Petti | MEDLINE, Web of Science, Ovid | No | 2 | NR | NA | No | No |
| Battista | NR | Backward | 2 | NR | NR | QUADAS-2 | Yes |
| Law | MEDLINE, EMBASE, PsycINFO | Backward | 2 | 2 | NR | Joanna Briggs Institute | No |
| de Filippis | MEDLINE, EMBASE, PsycINFO, Cochrane Library | Backward | 2 | NR | NR | Jadad rating system | No |
| Steardo | MEDLINE, EMBASE, PsycINFO, Cochrane Library | Backward | 2 | 2 | NR | Jadad rating system | No |
| Bracher-Smith | MEDLINE, PsycINFO, Web of Science, Scopus | No | 2 | 2 | 2 | PROBAST | No |
| Librenza-Garcia | MEDLINE, EMBASE, Web of Science | Backward | 2 | NR | NA | No | Yes |
| Moon | MEDLINE, EMBASE, CINAHL, PsycINFO, IEEE Xplore | No | 2 | 2 | 2 | QUADAS-2 | Yes |
| Ramos-Lima | MEDLINE, EMBASE, Web of Science | Backward | 2 | NR | NR | Tool developed by the authors | No |
| Bruin | MEDLINE | Backward | NR | NR | NA | No | No |
| Sanfelici | MEDLINE, Scopus | Backward | NR | 2 | NA | No | Yes |
NA Not applicable, NR Not reported, PROBAST Prediction model risk of bias assessment tool, QUADAS-2 Revised tool for Quality Assessment of Diagnostic Accuracy Studies.
Search results and dataset features of the studies included in the reviews.
| Study | # of retrieved studies | # of included studies | Dataset size | Data type |
|---|---|---|---|---|
| Pellegrini | 7,991 | 111 | 100–902 | Neuroimaging data, CSF biomarkers, Demographic data, Genetic data, Biological data |
| Billeci | 52 | 21 | 31–330 | Neuroimaging data |
| Sarica | 70 | 12 | 26–870 | Neuroimaging data |
| Ebrahimighahnavieh | NR | 114 | 43–2,464 | Neuroimaging data, Genetic data, Demographical data, Clinical data, CSF biomarkers, Neuropsychological tests |
| Petti | 2,447 | 33 | 10–484 | Neuropsychological tests (Speech and language data) |
| Battista | 203 | 59 | 22–7,026 | Neuropsychological data, Biological data, Neuroimaging data, Demographical data, Clinical data |
| Law | 1,264 | 43 | 61–654 | EEG measures, Neuroimaging data, CSF biomarkers |
| de Filippis | 2,386 | 35 | 34–734 | Neuroimaging data |
| Steardo | 660 | 22 | 40–737 | Neuroimaging data |
| Bracher-Smith | 1,241 | 13 | 20–5,554 | Genetic data |
| Librenza-Garcia | 757 | 51 | 42–4,488 | Neuroimaging data, Genetic data, EEG measures, Neuropsychological tests, Serum biomarkers |
| Moon | 348 | 43 | 20–2,686 | Neuroimaging data, EEG measures, Neuropsychological tests, Biochemical data |
| Ramos-Lima | 806 | 49 | 25–391 | Neuroimaging data, Neuropsychological data, EEG measures, Biological data, Clinical data |
| Bruin | 170 | 12 | 20–172 | Neuroimaging data |
| Sanfelici | 1,103 | 44 | 38–202 | Neuroimaging data, Clinical data |
CSF Cerebrospinal Fluid, EEG Electroencephalography.
Features of the models in the studies included in the reviews.
| Study | Classification algorithm type | Type of validation |
|---|---|---|
| Pellegrini | Fuzzy, HMM, k-NN, LASSO, LBP, LDA, MIL, NN, PNN, QDA, QDC, RF, RLR, SRC, SVM, ν-MKL | NR |
| Billeci | AdaBoost, DS, EGB, LDA, LinReg, LogReg, NB, PLSDA, RF, SVM | Internal validation (K-fold cross-validation & Leave One Out cross validation) |
| Sarica | RF | Internal validation (K-fold cross-validation, Leave One Out cross validation) & External validation |
| Ebrahimighahnavieh | AE, CNN, DNN, MLP, RNN, DBN, DBM, DPN | Internal validation (K-fold cross-validation, Train-and-test, Leave One Out cross validation) |
| Petti | DT, LogReg, NB, SVM | NR |
| Battista | BN, GC, LDA, LinReg, LogReg, NB, NN, RF, SVM | Internal validation (K-fold cross-validation, Train-and-test, Nested Cross Validation, Leave One Out cross validation) |
| Law | RF, SVM | NR |
| de Filippis | AE, DBN, DNN, ENet, GC, GNet, LASSO, LDA, LogReg, MPA, RDA, RF, Ridge, SRBVS, SVM, TBMFA, ν-MKL | Internal validation (K-fold cross-validation, Leave One Out cross validation) |
| Steardo | SVM | NR |
| Bracher-Smith | AdaBoost, BFT, BN, DT, DTNB, EC, GBM, k-NN, LASSO, NB, MDR, NN, RF, Ridge, SVM | Internal validation (K-fold cross-validation, Train-and-test, Leave One Out cross validation, Apparent validation) & External validation |
| Librenza-Garcia | ANN, BN, CRT, DT, k-NN, LASSO, LR, MFA, MLR, MDL, NB, NN, NSC, RBFN, RF, SVM | NR |
| Moon | ANN, DNN, DT, Fuzzy, GBM, k-NN, LDA, LogReg, MLP, NB, PLSDA, RF, SVM | Internal validation, External validation, and both |
| Ramos-Lima | SVM, DBN, k-NN, MLP, NB, SMO, TL | NR |
| Bruin | LogReg, SVM | Internal validation (Leave One Out cross validation & Train-and-test) |
| Sanfelici | RF, SVM | Internal validation (K-fold cross-validation, Leave One Out cross validation) |
AE Auto-Encoder, AN Anorexia nervosa, ANN Artificial Neural Network, BFT best-first tree, BN Bayesian Network, CHR clinical high risk; CIF Conditional Inference Forests, CNN Convolutional Neural Networks, CRT Classification and Regression tree, DBM Deep Boltzmann Machine, DBN Deep Belief Network, DNN Deep Neural Network, DPN Deep Polynomial Network, DS Decision Stump, DTNB Decision Table Naïve Bayes, EC Evolutionary Computation, EGB Extreme Gradient Boosting, ENet Elastic Net, GBM Gradient Boosting Machine, GC Gaussian Classifier, GNet Graph Net, HMM Hidden Markov Model, k-NN K-Nearest Neighbors, LASSO Least Absolute Shrinkage and Selection Operator, LBP Local Binary Patterns, LDA Linear Discriminant Analysis, LinReg Linear Regression, LogReg Logistic Regression, MDL Minimum Description Length, MDR Multifactor Dimensionality Reduction, MFA Mixture Factor Analysis, MIL Multiple Instance Learning, MLR Multivariate Logistic Regressions, NSC Nearest Shrunken Centroids, MLP Multi-Layer Perceptron, MPA Multivariate Pattern Analysis, NB Naïve Bayes, NN Neural Networks, NR Not reported, OPLS Orthogonal Projections to Latent Structures, PLSDA Partial Least Squares Discrimination Analysis, PNN Probabilistic Neural Network, QDA Quadratic Discriminant Analysis, QDC Quadratic Discriminant Classifier, RBFN Radial Basis Function Network, RDA Regularized Discriminant Analysis, RF Random Forest, Ridge Ridge Regression, RLR Regularized Logistic Regression, RNN Recurrent Neural Network, SMO Sequential Minimal Optimization, SRBVS Sparse-Representation-Based Variable Selection, SRC Sparse Representation Classification, SVM Support Vector Machine, TBMFA Translation Based Multimodal Fusion Approach, TC trauma-exposed controls, TL Transfer Learning, ν-MKL Multiple Kernel Learning.
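The two internal validation schemes that recur in the table, K-fold and leave-one-out cross-validation, differ only in how the samples are partitioned. The sketch below is a generic, dependency-free illustration of that partitioning and is not drawn from any of the included reviews; the function names are hypothetical, and the folds are contiguous and unshuffled for clarity, whereas real studies typically shuffle or stratify.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_out_indices(n_samples):
    """Leave-one-out is K-fold with k equal to the number of samples."""
    return k_fold_indices(n_samples, n_samples)


folds = list(k_fold_indices(10, 5))    # 5 folds, 2 test samples each
loo = list(leave_one_out_indices(4))   # 4 folds, 1 test sample each
print(folds[0])
print(loo[0])
```

Every sample appears in exactly one test fold, so each model is always evaluated on data it was not trained on. External validation, by contrast, tests on a dataset that took no part in training at all.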
Fig. 2 Review authors’ judgments about each appraisal item: The quality of the included reviews was assessed against appraisal items.
Yes (green) means that the review meets the item and is therefore of good quality with respect to that item. No (red) means that the review does not meet the item and is therefore of poor quality with respect to that item. Unclear (yellow) means that we could not appraise the review on that item because too little information was reported. Not applicable (gray) means that the item does not apply to the systematic review because the review lacks the feature that the item assesses.
Classifier performance in differentiating AD from HC.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| Pellegrini | UML, SML, DL | 71–98.1 (68) | 60–99.2 (68) | 75.9–98.3 (68) | NR |
| Billeci | SML | 56–100 (21) | 37.3–100 (14) | 55–100 (14) | NR |
| Sarica | SML | 87–98 (4) | NR | NR | NR |
| Ebrahimighahnavieh | DL | 75–100 (83) | 73–100 (52) | 80–100 (52) | NR |
| Neuropsychological data | |||||
| Petti | UML, SML, DL | 68–95 (17) | NR | NR | NR |
| Battista | DL | 72–100 (18) | 73–100 (13) | 77–100 (13) | 79–98 (5) |
AD Alzheimer’s disease, AI Artificial intelligence, AUC Area under the Curve, DL Deep learning, HC Healthy controls, n number of studies that reported the corresponding measure, SML Supervised machine learning, NR Not reported, UML Unsupervised machine learning.
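For readers less familiar with the measures tabulated here, accuracy, sensitivity, and specificity all derive from the four cells of a confusion matrix. The sketch below uses the standard definitions; the function name and the patient counts are hypothetical and do not come from any included study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity as percentages.

    tp/fn: patients correctly/incorrectly classified;
    tn/fp: healthy controls correctly/incorrectly classified.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),   # share of patients detected
        "specificity": 100.0 * tn / (tn + fp),   # share of controls cleared
    }


# Hypothetical example: 50 patients and 50 healthy controls.
metrics = diagnostic_metrics(tp=45, fp=10, tn=40, fn=5)
print(metrics)
# prints: {'accuracy': 85.0, 'sensitivity': 90.0, 'specificity': 80.0}
```

Because accuracy pools both groups, a model can reach high accuracy on an imbalanced sample while one of sensitivity or specificity stays low, which is one reason the tables report the three measures separately.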
Classifier performance in differentiating AD from MCI.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| Pellegrini | UML, SML, DL | 64.8–85.6 (8) | 40.3–87 (8) | 67–94.1 (8) | NR |
| Billeci | SML | 56–92 (6) | NR | NR | NR |
| Ebrahimighahnavieh | DL | 62.5–100 (27) | 62.3–100 (15) | 67.2–100 (15) | NR |
| Neuropsychological data | |||||
| Petti | UML, SML, DL | 68–86 (3) | NR | NR | NR |
AD Alzheimer’s disease, AI Artificial intelligence, AUC Area under the Curve, DL Deep learning, MCI Mild cognitive impairment, n number of studies that reported the corresponding measure, SML Supervised machine learning, NR Not reported, UML Unsupervised machine learning.
Classifier performance in differentiating MCI from HC.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| Pellegrini | UML, SML, DL | 61.8–92.7 (30) | 49.5–94.8 (30) | 47.3–90.8 (30) | NR |
| Billeci | SML | 47–97.7 (10) | 24.3–95 (4) | 66.4–97 (4) | NR |
| Sarica | SML | 58.4–82.3 (3) | NR | NR | NR |
| Ebrahimighahnavieh | DL | 55.2–99.2 (53) | 52–98.3 (34) | 47.1–95 (32) | NR |
| Neuropsychological data | |||||
| Petti | UML, SML, DL | 73–88.1 (4) | NR | NR | NR |
| Battista | DL | 60–98 (16) | 45–97 (13) | 67–100 (14) | 63–99 (7) |
AI Artificial intelligence, AUC Area under the Curve, DL Deep learning, HC Healthy controls, MCI Mild cognitive impairment, n number of studies that reported the corresponding measure, SML Supervised machine learning, NR Not reported, UML Unsupervised machine learning.
Classifier performance in differentiating MCIc from MCInc.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| Pellegrini | UML, SML, DL | 56.1–82.5 (38) | 56.2–94.2 (38) | 51.2–89 (38) | NR |
| Sarica | SML | 58.4–82.3 (4) | NR | NR | NR |
| Ebrahimighahnavieh | DL | 47–96.2 (27) | 42.1–99 (19) | 53–95.2 (19) | NR |
| Neuropsychological data | |||||
| Battista | DL | 61–85 (19) | 50–91 (16) | 48–91 (16) | 67–93 (14) |
AI Artificial intelligence, AUC Area under the Curve, DL Deep learning, MCIc MCI converting, MCInc MCI non-converting, n number of studies that reported the corresponding measure, SML Supervised machine learning, NR Not reported, UML Unsupervised machine learning.
Classifier performance in differentiating SCZ from HC.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| de Filippis | UML, SML, DL | 61–99.3 (28) | 57.9–100 (20) | 40.9–98.6 (20) | NR |
| Steardo | SML | 61–99.3 (22) | 65–100 (17) | 40.9–98.6 (17) | 61–91.4 (3) |
| Genetic data | |||||
| Bracher-Smith | DL | 40–86 (5) | NR | NR | 54–95 (5) |
AI Artificial intelligence, AUC Area under the Curve, DL Deep learning, HC Healthy controls, n number of studies that reported the corresponding measure, SCZ Schizophrenia, SML Supervised machine learning, NR Not reported, UML Unsupervised machine learning.
Classifier performance in differentiating BD from HC.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| Librenza-Garcia | UML, SML, DL | 55–100 (8) | 40–100 (12) | 49–100 (12) | NR |
| Neuropsychological data | |||||
| Librenza-Garcia | UML, SML, DL | 71–96.4 (3) | NR | NR | NR |
| Genetic data | |||||
| Bracher-Smith | UML, SML, DL | 54–77 (4) | NR | NR | 48–65 (3) |
AI Artificial intelligence, AUC Area under the Curve, BD Bipolar disorder, DL Deep learning, HC Healthy controls, n number of studies that reported the corresponding measure, SML Supervised machine learning, NR Not reported, UML Unsupervised machine learning.
Classifier performance in differentiating ASD from HC.
| Study | AI approach | Accuracy, % (n) | Sensitivity, % (n) | Specificity, % (n) | AUC, % (n) |
|---|---|---|---|---|---|
| Neuroimaging data | |||||
| Moon | UML, SML, DL | 45–97 (20) | 24–100 (20) | 21–99 (20) | NR |
| Neuropsychological tests | |||||
| Moon | UML, SML, DL | 78.1–100 (9) | 64–100 (9) | 48–97 (9) | NR |
| Biochemical features | |||||
| Moon | UML, SML, DL | 75–94 (5) | 77–94 (5) | 67–93 (5) | NR |
| EEG measures | |||||
| Moon | UML, SML, DL | 85–100 (4) | 94–97 (4) | 81–94 (4) | NR |
AI Artificial intelligence, ASD Autism spectrum disorder, AUC Area under the Curve, DL Deep learning, HC Healthy controls, SML Supervised machine learning, n number of studies that reported the corresponding measure, NR Not reported, UML Unsupervised machine learning.
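Each performance cell in these tables packs a range and a study count into one string, such as `45–97 (20)`. A minimal sketch of how such cells could be parsed and summarized follows; the regular expression and function name are illustrative assumptions, and the sample cells are copied from the ASD versus HC neuroimaging row.

```python
import re

# Matches "low–high (n)" with either an en dash or a hyphen as the separator.
CELL = re.compile(r"^\s*([\d.]+)\s*[–-]\s*([\d.]+)\s*\((\d+)\)\s*$")


def parse_cell(cell):
    """Return (low, high, n) for a cell such as '45–97 (20)', or None for 'NR'."""
    match = CELL.match(cell)
    if match is None:
        return None
    low, high, n = match.groups()
    return float(low), float(high), int(n)


# Accuracy, sensitivity, specificity, and AUC cells of the neuroimaging row.
cells = ["45–97 (20)", "24–100 (20)", "21–99 (20)", "NR"]
parsed = [p for p in (parse_cell(c) for c in cells) if p is not None]
lowest = min(low for low, _, _ in parsed)
highest = max(high for _, high, _ in parsed)
print(lowest, highest)
# prints: 21.0 100.0
```

For this single row the extremes already span 21% to 100%, the same bounds the abstract gives for performance across all disorders.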