T R Sivapriya, A R Nadira Banu Kamal, P Ranjit Jeba Thangaiah.
Abstract
The objective of this study is to develop an ensemble classifier with Merit Merge feature selection that improves classification efficiency on multivariate, multiclass medical data for effective disease diagnosis. The large number of features extracted from brain Magnetic Resonance Images and neuropsychological tests makes classification more complex. Reliable dementia diagnosis requires a higher level of objectivity than human readers can provide. An ensemble approach trained on features selected from multiple biomarkers classified more accurately than conventional classification techniques. Ensemble feature selection was evaluated with Naïve Bayes, Random Forest, Support Vector Machine, and C4.5 classifiers. Particle Swarm Optimisation searches for a candidate subset of features, which the ensemble classifier then refines. Features selected by the proposed C4.5 ensemble classifier with Particle Swarm Optimisation search, coupled with the Merit Merge technique (CPEMM), outperformed bagging feature selection with SVM, NB, and Random Forest classifiers. The proposed CPEMM feature selection found the subset of features that best discriminated normal individuals from patients with Mild Cognitive Impairment and Alzheimer's Dementia, with 98.7% accuracy.
Year: 2015 PMID: 26576199 PMCID: PMC4632180 DOI: 10.1155/2015/676129
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
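The abstract names Particle Swarm Optimisation as the feature search, but this record gives no parameters for it. As a rough illustration only, the following is a minimal binary PSO sketch: each particle is a 0/1 mask over the features, and a sigmoid of the velocity gives the probability that a feature is selected. The function name `binary_pso`, the inertia and acceleration constants, and the toy fitness (which rewards a hypothetical set of informative feature indices) are all assumptions. In the study, the fitness would presumably be classifier performance on the selected subset.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 feature mask that maximises `fitness` (illustrative)."""
    rng = random.Random(seed)
    # Positions are binary masks; velocities are real-valued.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp so the sigmoid never saturates completely.
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical toy problem: features 0, 3, and 5 are "informative".
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    """Reward informative features, penalise subset size (assumed fitness)."""
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


mask, score = binary_pso(toy_fitness, n_features=8)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected feature indices:", selected, "fitness:", score)
```

The size penalty in the toy fitness is one simple way to favour small subsets; the weighting actually used in the study is not given in this record.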
Datasets used in the study.
| Dataset | Number of instances | Number of attributes | Number of classes | AD | Normal | MCI |
|---|---|---|---|---|---|---|
| Neuropsychological dataset | 750 | 48 | 3 | 150 | 200 | 400 |
| Neuroimaging dataset | 650 | 108 | 2 | 250 | 250 | 200 |
| Baseline combined data | 870 | 65 | 3 | 140 | 280 | 450 |
| Combined dataset | 750 | 40 | 2 | 150 | 200 | 400 |
List of attributes derived from neuropsychological test and neuroimaging measures.
| Neuropsychological and neuroimaging measures | |
|---|---|
| Average FDG-PET of angular, temporal, and posterior cingulate | Mini Mental State Examination-baseline |
| Average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex | Ventricles measure |
| Average AV45 SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum | Hippocampus-baseline, volume |
| Clinical Dementia Rating-SB | Whole brain-baseline, volume |
| ADAS 11 | UCSF entorhinal-baseline, volume |
| ADAS 13 | UCSF fusiform-baseline, volume |
| Mini Mental State Examination score | UCSF Med Temp-baseline |
| RAVLT (forgetting) | UCSF ICV-baseline |
| RAVLT (5 sum) | MOCA-baseline |
| Functional Assessment Questionnaire | Pt ECog-Memory-baseline |
| MOCA | Pt ECog-Language-baseline |
| Pt ECog-Memory | Pt ECog-Vis/Spat-baseline |
| Pt ECog-Language | Pt ECog-Plan-baseline |
| Pt ECog-Visual | Pt ECog-Organ-baseline |
| Pt ECog-Plan | Pt ECog-Div atten-baseline |
| Pt ECog-Organ | Pt ECog-Total-baseline |
| Pt ECog-Div atten | SP ECog-Mem-baseline |
| Pt ECog-Total | SP ECog-Lang-baseline |
| SP ECog-Memory | SP ECog-Vis/Spat-baseline |
| SP ECog-Language | SP ECog-Plan-baseline |
| SP ECog-Visual | SP ECog-Organ-baseline |
| SP ECog-Plan | SP ECog-Div atten-baseline |
| SP ECog-Organ | SP ECog-Total-baseline |
| SP ECog-Attention | Average FDG-PET of angular, temporal, and posterior cingulate at baseline |
| SP ECog-Total | Average PIB SUVR of frontal cortex, anterior cingulate, precuneus cortex, and parietal cortex at baseline |
| UCSF ventricles measures | Average AV45 (PET ligand) SUVR of frontal, anterior cingulate, precuneus, and parietal cortex relative to the cerebellum at baseline |
| UCSF hippocampus measure | CDR-SB |
| UCSF whole brain measure | ADAS 11, baseline |
| UCSF entorhinal measure | ADAS 13, baseline |
| UCSF fusiform measure | |
| UCSF temporal measure | RAVLT (forgetting), baseline |
| UCSF ICV | RAVLT (5 sum), baseline |
Pt: patient, ECog: everyday cognition test, SP: study partner, ADAS: Alzheimer's Disease Assessment Scale, MOCA: Montreal Cognitive Assessment, RAVLT: Rey Auditory Verbal Learning Test, ICV: intracranial volume, SUVR: standardized uptake value ratio, and CDR-SB: Clinical Dementia Rating Sum of Boxes.
Figure 1: Selection of base classifier for ensemble feature selection.
Figure 2: Steps in Phase II and Phase III.
Algorithm 1
Results for the 3-class dataset with ensembles of NB, J48, RF, and SVM and with the CPEMM ensemble.
| Dataset | NB Ensemble Acc | NB Ensemble Pre | NB Ensemble Rec | J48 Ensemble Acc | J48 Ensemble Pre | J48 Ensemble Rec | RF Ensemble Acc | RF Ensemble Pre | RF Ensemble Rec | SVM Ensemble Acc | SVM Ensemble Pre | SVM Ensemble Rec | NB-CPEMM Acc | NB-CPEMM Pre | NB-CPEMM Rec | J48-CPEMM Acc | J48-CPEMM Pre | J48-CPEMM Rec | RF-CPEMM Acc | RF-CPEMM Pre | RF-CPEMM Rec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NSDS | 0.838 | 0.854 | 0.838 | 0.911 | 0.895 | 0.889 | 0.936 | 0.941 | 0.799 | 0.647 | 0.564 | 0.677 | 0.899 | 0.904 | 0.912 | 0.932 | 0.958 | 0.933 | 0.936 | 0.941 | 0.799 |
| NIDS | 0.756 | 0.737 | 0.556 | 0.936 | 0.941 | 0.937 | 0.545 | 0.645 | 0.545 | 0.667 | 0.685 | 0.734 | 0.834 | 0.895 | 0.738 | 0.904 | 0.934 | 0.907 | 0.545 | 0.645 | 0.545 |
| CIDS-I | 0.863 | 0.833 | 0.823 | 0.916 | 0.933 | 0.923 | 0.963 | 0.963 | 0.963 | 0.673 | 0.699 | 0.788 | 0.896 | 0.812 | 0.822 | 0.987 | 0.955 | 0.963 | 0.963 | 0.963 | 0.963 |
| CIDS-II | 0.756 | 0.654 | 0.563 | 0.945 | 0.931 | 0.924 | 0.797 | 0.723 | 0.634 | 0.685 | 0.676 | 0.657 | 0.885 | 0.823 | 0.813 | 0.971 | 0.965 | 0.923 | 0.888 | 0.856 | 0.833 |
(NSDS: neuropsychological dataset, NIDS: neuroimaging dataset, CIDS: combined dataset).
Acc: Accuracy, Pre: precision, Rec: Recall.
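For readers less familiar with these three measures, the sketch below shows how accuracy, precision, and recall follow from a multiclass confusion matrix. The confusion matrix is invented for illustration and is not taken from the study. The table above reports a single precision and recall per dataset; how those values were averaged over classes is not stated in this record, so the macro average shown here is only one possibility.

```python
def classification_metrics(confusion, labels):
    """Return overall accuracy and per-class (precision, recall).

    `confusion[r][c]` counts instances of true class r predicted as class c.
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    per_class = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        predicted_k = sum(confusion[r][k] for r in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        per_class[label] = (precision, recall)
    return accuracy, per_class


# Hypothetical counts (rows: true class, columns: predicted class).
labels = ["Normal", "MCI", "AD"]
confusion = [
    [45, 5, 0],   # true Normal
    [4, 90, 6],   # true MCI
    [1, 5, 44],   # true AD
]

accuracy, per_class = classification_metrics(confusion, labels)
macro_precision = sum(p for p, _ in per_class.values()) / len(labels)
macro_recall = sum(r for _, r in per_class.values()) / len(labels)

print(f"accuracy = {accuracy:.3f}")
for label, (p, r) in per_class.items():
    print(f"{label}: precision = {p:.3f}, recall = {r:.3f}")
```

With these made-up counts, 179 of 200 instances lie on the diagonal, so the accuracy is 0.895.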
Figure 3: Accuracy of classifiers with features selected by PSO and CPEMM methods.
Results with the maximal feature subset obtained by divide and merge feature selection technique.
| Classifier | Normal Acc | Normal Pre | Normal Rec | Dementia Acc | Dementia Pre | Dementia Rec | MCI Acc | MCI Pre | MCI Rec |
|---|---|---|---|---|---|---|---|---|---|
| J48 | 0.963 | 0.963 | 0.963 | | | | 0.966 | 0.968 | 0.987 |
| J48-TT | 0.983 | 0.972 | 0.977 | | | | 0.977 | 0.976 | 0.977 |
| J48-CV | 0.965 | 0.954 | 0.945 | | | | 0.964 | 0.954 | 0.945 |
TT: training and testing; CV: cross validation.
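The two evaluation protocols named in this footnote differ in how instances are assigned to training and testing. The following sketch generates index splits for both: a single hold-out split (TT) and k-fold cross validation (CV). The fold count and split ratio are not given in this record, so `k=10` and a 30% test fraction are assumptions, as are the function names. The value 870 is the instance count listed for the baseline combined data.

```python
import random


def holdout_split(n, test_fraction=0.3, seed=0):
    """Single train/test split: shuffle indices, reserve a fraction for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def k_fold_splits(n, k=10, seed=0):
    """Yield (train, test) index lists; every instance is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


n_instances = 870  # baseline combined data, per the datasets table

train_idx, test_idx = holdout_split(n_instances)
print("hold-out:", len(train_idx), "train /", len(test_idx), "test")

folds = list(k_fold_splits(n_instances, k=10))
for fold_no, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_no}: {len(train)} train / {len(test)} test")
```

Cross validation averages performance over all folds, which usually gives a less optimistic estimate than a single favourable split.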
Figure 4: Comparison of area under the curve obtained by ordinary ensemble vs CPEMM for the four datasets.
Description of the J48 ensemble model used for the multiclass classification.
| Details | Value |
|---|---|
| Split method | Binary split |
| Cross validation accuracy | 0.976 |
| AUC with CV | 0.971 |
| Train and test accuracy | 0.986 |
| AUC with train and test | 0.987 |
| Common features selected by all methods | MMSE, CDR, hippocampus volume, and everyday cognition measures |
| Features added by CPEMM | Entorhinal measures, CDR-SB, and Rey Auditory Verbal Learning Test-immediate |
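The AUC values in this table can be read as a ranking measure: for a binary problem, AUC equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. How the study aggregated AUC across the three classes is not stated in this record, so the example covers only the binary case (for instance, one class against the rest), and the scores and labels are invented.

```python
def binary_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six instances (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc_value = binary_auc(scores, labels)
print(f"AUC = {auc_value:.3f}")
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9.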