| Literature DB >> 28407777 |
Yolanda Garcia-Chimeno1,2, Begonya Garcia-Zapirain3,4, Marian Gomez-Beldarrain5, Begonya Fernandez-Ruanova5, Juan Carlos Garcia-Monco6.
Abstract
BACKGROUND: Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of classification models, yet little is known about how these methods perform on diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating the diagnosis of migraine using both DTIs and questionnaire answers related to emotion and cognition, factors that influence pain perception.
Keywords: Boosting(adaboost); Classification; Committee; DTI; Feature selection; Migraine; Naive bayes; SVM
Mesh:
Year: 2017 PMID: 28407777 PMCID: PMC5390380 DOI: 10.1186/s12911-017-0434-4
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Description of the sample
| | Total | Chronic migraine | Episodic migraine | Controls | p |
|---|---|---|---|---|---|
| N (%) | 52 | 18 (34.6) | 19 (36.5) | 15 (28.9) | |
| Socio-demographic and clinical data | | | | | |
| Age | 43.5 (7.7) | 43.8 (7.1) | 41.4 (7.9) | 45.7 (6.8) | 0.2 |
| Sex (Female) | 47 (90.4) | 14 (77.8) | 19 (100) | 14 (93.3) | 0.05 |
| Cognitive reserve | | | | | |
| ≤11 (Low) | 13 (25) | 9 (50) | 3 (15.8) | 1 (6.7) | |
| (Low-Medium) | 13 (25) | 4 (22.2) | 5 (26.3) | 4 (26.7) | |
| 16-18 (Medium-High) | 15 (28.9) | 2 (11.1) | 6 (31.6) | 7 (46.7) | |
| >18 (High) | 11 (21.2) | 3 (16.7) | 5 (26.3) | 3 (20) | |
Fig. 1 Block diagram showing the feature selection process and classification. Each selection method is applied to the initial full dataset with all features, and the result is classified to quantify whether any improvement exists. The feature selection committee is then applied and the resulting reduced set is classified once again
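The selection-then-classification loop described in the figure can be sketched with scikit-learn. The synthetic data, the random-forest selector and the SVM settings below are stand-ins for illustration; the study's actual DTI and questionnaire features are not part of this record.

```python
# Sketch of the Fig. 1 pipeline: classify the full feature set, apply one
# feature selection method, then classify the reduced set to quantify any
# improvement. Synthetic data only (the study's dataset is not available).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# 52 samples echoes the study's sample size; 40 features is illustrative.
X, y = make_classification(n_samples=52, n_features=40, n_informative=6,
                           random_state=0)

# Baseline: cross-validated accuracy on the full feature set.
baseline = cross_val_score(SVC(), X, y, cv=5).mean()

# One selection method (here a random-forest importance filter), then
# classification of the reduced set.
selector = SelectFromModel(RandomForestClassifier(random_state=0)).fit(X, y)
X_reduced = selector.transform(X)
reduced = cross_val_score(SVC(), X_reduced, y, cv=5).mean()

print(X.shape[1], X_reduced.shape[1])
```

In the paper this loop is repeated for each of the four selection methods before the committee step combines their choices.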
Accuracy, precision, recall and F1 score of the initial full set with all features using the SVM, Boosting (Adaboost) and Naive Bayes classifiers
| | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| SVM | 90% | 85% | 91% | 87% |
| Boosting (Adaboost) | 93% | 100% | 89% | 93% |
| Naive Bayes | 67% | 66% | 76% | 69% |
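The four metrics in these tables are linked: the F1 score is the harmonic mean of precision and recall. A minimal check with scikit-learn on toy labels (illustrative only, not the study's data):

```python
# Precision, recall and F1 relate as F1 = 2PR / (P + R).
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical ground truth
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]  # hypothetical classifier output

p = precision_score(y_true, y_pred)   # 3 TP / (3 TP + 1 FP) = 0.75
r = recall_score(y_true, y_pred)      # 3 TP / (3 TP + 1 FN) = 0.75
f1 = f1_score(y_true, y_pred)

# F1 equals the harmonic mean of precision and recall.
assert abs(f1 - 2 * p * r / (p + r)) < 1e-9
print(accuracy_score(y_true, y_pred), p, r, f1)  # → 0.75 0.75 0.75 0.75
```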
Number of remaining features, and the percentage decrease relative to the number of initial features, for each feature selection method
| | Number of features | Percentage decrease |
|---|---|---|
| Gradient Tree Boosting | 4 | 90% |
| L1-based | 21 | 48% |
| Random Forest | 12 | 70% |
| Univariate | 10 | 75% |
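The percentage-decrease column is consistent with an initial set of about 40 features; that count is inferred from the table itself, not stated elsewhere in this record:

```python
# Percentage decrease = 1 - remaining / initial, with initial ≈ 40
# (inferred: 4/40, 21/40, 12/40 and 10/40 reproduce the table's column).
initial = 40  # assumption inferred from the table, not given in the record

for method, remaining in [("Gradient Tree Boosting", 4),
                          ("L1-based", 21),
                          ("Random Forest", 12),
                          ("Univariate", 10)]:
    decrease = round(100 * (1 - remaining / initial))
    print(f"{method}: {remaining} features, {decrease}% decrease")
```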
Accuracy, precision, recall and F1 score of the resulting datasets after applying each feature selection method, using the SVM, Boosting (Adaboost) and Naive Bayes classifiers
| | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| SVM | ||||
| Gradient Tree Boosting | 98% | 100% | 94% | 96% |
| L1-based | 88% | 86% | 85% | 84% |
| Random Forest | 91% | 91% | 85% | 87% |
| Univariate | 78% | 80% | 78% | 78% |
| Boosting (Adaboost) | ||||
| Gradient Tree Boosting | 94% | 100% | 92% | 96% |
| L1-based | 91% | 91% | 91% | 89% |
| Random Forest | 95% | 100% | 94% | 96% |
| Univariate | 87% | 87% | 87% | 85% |
| Naive Bayes | ||||
| Gradient Tree Boosting | 98% | 96% | 100% | 98% |
| L1-based | 61% | 60% | 78% | 67% |
| Random Forest | 92% | 86% | 98% | 91% |
| Univariate | 60% | 53% | 87% | 66% |
Features retained at the chosen threshold, with the frequency with which each was selected across the four feature selection methods
| | Frequency (All = 4) |
|---|---|
| Total Pain days | 4 |
| Total Analgesics | 4 |
| Score MSQoL | 3 |
| Left Uncinate | 3 |
| Left Cingulate Gyrus | 2 |
| Score MDQ-H | 2 |
| Total Pain Month 1 | 2 |
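The committee keeps a feature when at least a threshold number of the four selection methods chose it; with the frequencies listed above and the threshold of 2 used for the final results, all seven listed features survive. A minimal sketch:

```python
# Committee feature selection: keep features chosen by >= threshold
# of the four selection methods (frequencies from the table above).
frequencies = {
    "Total Pain days": 4,
    "Total Analgesics": 4,
    "Score MSQoL": 3,
    "Left Uncinate": 3,
    "Left Cingulate Gyrus": 2,
    "Score MDQ-H": 2,
    "Total Pain Month 1": 2,
}
threshold = 2
committee = [feat for feat, n in frequencies.items() if n >= threshold]
print(len(committee))  # → 7: every listed feature survives at threshold 2
```

Raising the threshold to 3 would keep only the four features selected by at least three methods.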
Accuracy, precision, recall and F1 score with the dataset resulting from the feature selection method committee (threshold ≥ 2), using the SVM, Boosting (Adaboost) and Naive Bayes classifiers
| | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| SVM | 95% | 93% | 92% | 92% |
| Boosting (Adaboost) | 94% | 96% | 89% | 92% |
| Naive Bayes | 93% | 90% | 98% | 92% |