Gyeongcheol Cho, Jinyeong Yim, Younyoung Choi, Jungmin Ko, Seoung-Hwan Lee.
Abstract
OBJECTIVE: Advances in computing and internet technology have improved the scale and quality of data in many areas, including healthcare. Machine learning (ML) has played a pivotal role in analyzing such big data efficiently, but general misunderstandings about ML algorithms persist when they are applied (e.g., that ML techniques can solve the problem of small sample size, or that deep learning is the ML algorithm). This paper reviews research on diagnosing mental illness using ML algorithms and suggests how ML techniques can be employed in practice.
Keywords: Big data; Machine learning; Mental illness
Year: 2019; PMID: 30947496; PMCID: PMC6504772; DOI: 10.30773/pi.2018.12.21.2
Source DB: PubMed; Journal: Psychiatry Investig; ISSN: 1738-3684; Impact factor: 2.505
Summary of the characteristics of five machine learning algorithms frequently used in the field of mental health
| Algorithm | Advantage | Disadvantage |
|---|---|---|
| SVM | - Generally relatively accurate in practice | - Requires an appropriate kernel and careful data preprocessing to analyze complicated data precisely |
| GBM | - Generally relatively accurate in practice | - Training takes relatively more time than in other algorithms |
| | - Automatically compensates for its earlier weaknesses in handling some attributes by adding more weak learners | - Difficult to interpret the relationship between attributes and the target |
| Random forest | | - Prediction can be relatively slow for a complex random forest |
| | | - Difficult to interpret the relationship between attributes and the target |
| Naïve Bayes | - Requires relatively little time for training and prediction | - Unrealistic assumption: independence of all attributes |
| | - Can provide intermediate conditional probabilities of attributes | |
| KNN | - No training process is required | - Prediction time increases rapidly as the number of data points or attributes increases |
| | | - A relatively large amount of additional data is required for even a small increase in the number of features considered |
| | | - Highly dependent on the distance metric used |
SVM: Support Vector Machine, GBM: Gradient Boosting Machine, KNN: K-Nearest Neighbors
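The KNN entries in the table (no training step, all work done at prediction time, strong dependence on the distance metric) can be illustrated with a minimal sketch. The function names and the toy data points below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    """Largest absolute difference along any single attribute."""
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train, query, k, distance):
    """Classify `query` by majority vote among its k nearest training points.

    There is no training step: the stored data is scanned in full for
    every prediction, which is why prediction time grows with data size.
    """
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (attributes, class label)
train = [
    ((0.0, 3.0), "A"),
    ((2.5, 2.5), "B"),
    ((6.0, 6.0), "A"),
    ((7.0, 1.0), "B"),
]
query = (0.0, 0.0)

# The same query can receive different labels under different metrics.
label_euclidean = knn_predict(train, query, k=1, distance=euclidean)
label_chebyshev = knn_predict(train, query, k=1, distance=chebyshev)
print(label_euclidean, label_chebyshev)
```

In this toy case the nearest neighbor of the query changes with the metric, so the predicted class changes too, which is the sensitivity the table refers to.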
Summary of mental health research using ML techniques
| Disease | Machine learning | Data type | Sample size | Accuracy | Author | Year |
|---|---|---|---|---|---|---|
| PTSD | Stochastic GBM | Balloon Analogue Risk Task (BART), questionnaires in semi-structured interviews | 56 | | Augsburger and Elbert | 2017 |
| | SVM | Longitudinal data (evaluations during emergency room admission) | 152 | 64% (using background information alone) | Galatzer-Levy et al. | 2017 |
| | | | | 82% (using background and ER variables) | | |
| | | | | 88% (using data through 1 week) | | |
| | | | | 93% (using data through 1 month) | | |
| | SVM, target information equivalence algorithm | Event characteristics, emergency department records, early symptoms | 957 | 75% | Karstoft et al. | 2015 |
| Schizophrenia | SVM | MRI | 198 | 90% (differentiating schizophrenia patients from healthy controls) | Schnack et al. | 2014 |
| | | | | 88% (differentiating schizophrenia patients from bipolar disorder patients) | | |
| | | | | 53%, 67% (differentiating bipolar disorder patients from healthy controls) | | |
| | Random forest | MRI scans (anatomic brain measures), clinical and genetic information | 197 | 73.7% | Greenstein et al. | 2012 |
| | SVM with an ensemble (SVME) method | fMRI, single nucleotide polymorphism (SNP) data | 40 | 74% (applying SVM to fMRI and SNP data) | Yang et al. | 2010 |
| | | | | 82% (building SVME with SNP data) | | |
| | | | | 83% (building SVME with fMRI data) | | |
| | | | | 87% (combining the above three models) | | |
| Depression | GBM | Patient-reported data | 1,949 | 64.6% (predicting clinical remission) | Chekroud et al. | 2016 |
| | | | | 59.6% (predicting escitalopram treatment group in cross-validation test) | | |
| | | | | 59.7% (predicting combined escitalopram-bupropion treatment group in cross-validation test) | | |
| ASD | SVM | Single-word utterance data | 81 | 76% | Nakai et al. | 2017 |
| | Random forest | Words and phrases from the evaluations | 1,162 | 86.5% | Maenner et al. | 2016 |
| | KNN, SVM, naïve Bayes | Modified multiscale entropy (mMSE) computed from resting-state EEG data | 79 | 77% (applying KNN and SVM to data from 9-month-old infants) | Bosl et al. | 2011 |
| | | | | 80% (applying naïve Bayes to data from 18-month-old infants) | | |
ASD: Autism Spectrum Disorders, EEG: electroencephalogram, MRI: magnetic resonance imaging, fMRI: functional magnetic resonance imaging, PTSD: post-traumatic stress disorder, SVM: support vector machine, GBM: gradient boosting machine, KNN: K-Nearest Neighbors
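The abstract notes that ML techniques do not settle the problem of small sample size. One rough way to see why sample size matters when reading the accuracies in the table is to attach an approximate uncertainty interval to a reported accuracy. The sketch below uses the Wilson score interval and assumes, purely for illustration, that each accuracy is a simple proportion over n independent cases; the studies may have evaluated their models differently (e.g., by cross-validation), so these intervals are not results from the paper. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(accuracy, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumes `accuracy` is the fraction of n independent cases classified
    correctly, which is an illustrative simplification.
    """
    denom = 1 + z ** 2 / n
    center = (accuracy + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(
        accuracy * (1 - accuracy) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return center - half_width, center + half_width

# Accuracy and sample size pairs as listed in the table.
small_low, small_high = wilson_interval(0.87, 40)      # Yang et al., n = 40
large_low, large_high = wilson_interval(0.646, 1949)   # Chekroud et al., n = 1,949

print(f"n=40:   {small_low:.3f} to {small_high:.3f}")
print(f"n=1949: {large_low:.3f} to {large_high:.3f}")
```

Under these assumptions, the interval for the 40-case study is several times wider than the one for the 1,949-case study, so a high accuracy reported on a small sample carries considerably more uncertainty.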