Jonathan Christopher Taylor, John Wesley Fenner.
Abstract
BACKGROUND: Semi-quantification methods are well established in the clinic for assisted reporting of (I123) Ioflupane images. Arguably, these are limited diagnostic tools. Recent research has demonstrated the potential for improved classification performance offered by machine learning algorithms. A direct comparison between methods is required to establish whether a move towards widespread clinical adoption of machine learning algorithms is justified. This study compared three machine learning algorithms with a range of semi-quantification methods, using the Parkinson's Progression Markers Initiative (PPMI) research database and a locally derived clinical database for validation. Machine learning algorithms were based on support vector machine classifiers with three different sets of features: (1) voxel intensities; (2) principal components of image voxel intensities; (3) striatal binding ratios from the putamen and caudate. Semi-quantification methods were based on striatal binding ratios (SBRs) from both putamina, with and without consideration of the caudates. Normal limits for the SBRs were defined through four different methods: (1) minimum of age-matched controls; (2) mean minus 1/1.5/2 standard deviations from age-matched controls; (3) linear regression of normal patient data against age (minus 1/1.5/2 standard errors); (4) selection of the optimum operating point on the receiver operator characteristic curve from normal and abnormal training data. Each machine learning and semi-quantification technique was evaluated with stratified, nested 10-fold cross-validation, repeated 10 times.
Keywords: 123I-FP; DaTSCAN; Machine learning; Parkinson’s disease; Semi-quantification; Support vector machine
Year: 2017 PMID: 29188397 PMCID: PMC5707214 DOI: 10.1186/s40658-017-0196-1
Source DB: PubMed Journal: EJNMMI Phys ISSN: 2197-7364
Summary of the available literature on machine learning algorithms for binary classification of (I123)FP-CIT images since 2010, ordered according to maximum accuracy (where available)
| Authors | Image features | Classifier | Validation data + method | Results |
|---|---|---|---|---|
| Augimeri et al. 2016 | Mean ellipsoid uptake, dysmorphic index (ellipsoid orientation) | SVM | 43 local images (12 normal, 31 Parkinson’s disease (PD)), no cross-validation mentioned | Up to 100% accuracy, specificity and sensitivity |
| Bhalchandra et al. 2015 | Analysis of 42nd slice only. Striatal binding ratios in both caudates and putamina, radial features and gradient features. Features are tested for statistical significance (Wilcoxon rank) before use in the classifier | Linear SVM and SVM with Radial Basis Function (RBF) kernel, Linear Discriminant Analysis (LDA) | 350 images from PPMI database (187 healthy controls (HC), 163 PD). 5 fold cross-validation (CV), repeated 100 times | Linear SVM: maximum of accuracy = 99.4% |
| Oliveira and Castelo-Branco 2015 | Image voxels within striatal region of interest | Linear SVM | 654 images from PPMI database (209 HC, 445 PD). Leave-one-out CV | Maximum of accuracy = 97.9% |
| Prashanth et al. 2017 | 16 shape and 14 surface fitting features of selected slices, following thresholding. Striatal binding ratios of both caudates and putamina and asymmetry indices were also considered. Features are tested for statistical significance (Wilcoxon rank) before use in the classifier | SVM with RBF kernel, boosted trees, random forests, naive Bayes | 715 images from PPMI database (208 HC, 427 PD, 80 scans without evidence of dopaminergic deficit (SWEDD)). 10 fold CV, repeated 100 times. Hyperparameters for SVM chosen through 10 fold CV | SVM: accuracy = 97.3 ± 0.1% |
| Tagare et al. 2017 | Voxel intensities within a region of interest | Logistic lasso | 658 images from PPMI database (210 HC, 448 PD). 3 fold CV for performance assessment. Parameters chosen through 10 fold CV (nested within outer 3 fold CV) | Maximum of accuracy = 96.5 ± 1.3% |
| Palumbo et al. 2014 | Striatal binding ratios for both caudates and putamina (and a subset of these 4 features), patient age | SVM with RBF kernel | 90 local images from patients with ‘mild’ symptoms (34 non-PD, 56 PD). Leave-one-out and 5 fold CV | Maximum of accuracy = 96.4% |
| Prashanth et al. 2014 | Striatal binding ratio for both caudates and putamina | SVM, linear and RBF kernel | 493 images from PPMI database (181 HC, 369 early PD), 10 fold CV, no repeats | RBF kernel: accuracy = 96.1%, sensitivity = 96.6%, specificity = 95.0% |
| Martinez-Murcia et al. 2013 | 12 Haralick texture features within a brain region of interest | Linear SVM | ‘Whole’ PPMI database. Leave-one-out CV | Maximum of accuracy = 95.9%, sensitivity = 97.3%, specificity = 94.9% |
| Zhang and Kagen 2016 | Voxel intensities from a single axial slice, repeated for 3 different slices | Single-layer neural network | 1513 images from PPMI database (baseline and follow-up, 1171 PD, 211 HC, 131 SWEDD). 1189 images for training, 108 for validation, 216 for testing. 10 fold CV | Maximum of accuracy = 95.6 ± 1.5%, sensitivity = 97.4 ± 4.3%, specificity = 93.1 ± 3.6% |
| Rojas et al. 2013 | Voxel intensities, independent component analysis (ICA) & principal component analysis (PCA) decomposition of voxel data (after applying empirical mode decomposition) within regions of interest | Linear SVM | 80 local images (39 non-pre-synaptic dopaminergic deficit (non-PDD), 41 PDD). Leave-one-out CV | Raw voxels: accuracy = 87.5%, sensitivity = 90.2%, specificity = 84.6% |
| Towey et al. 2011 | PCA decomposition of voxels within striatal region of interest | Naïve-Bayes, Group prototype | 116 local images (37 non-PDD, 79 PDD). Leave-one-out CV | Naïve-Bayes: accuracy = 94.8%, sensitivity = 93.7%, specificity = 97.3% |
| Segovia et al. 2012 | Partial least squares decomposition of voxels within striatal regions | SVM applied to hemispheres separately. RBF kernel | 189 local images (94 non-PDD, 95 PDD). Leave-one-out CV | Features varied from 1 to 20. Maximum of accuracy = 94.7%, sensitivity = 93.2%, specificity = 93.6% |
| Martinez-Murcia et al. 2014 | ICA decomposition of selected voxels | SVM, linear and RBF kernel | 208 local images (100 non-PDD, 108 PDD), 289 images from PPMI database (114 normal, 175 PD). 30 fold CV | RBF kernel: maximum of accuracy = 94.7%, sensitivity = 98.1%, specificity = 92.0% |
| Illan et al. 2012 | Image voxel intensities and image voxels within striatal region of interest | Nearest mean, k-nearest neighbour (k-NN), linear SVM | 208 local images (108 non-PDD, 108 PDD). 30 random permutations CV, with 1/3 data held out for testing | SVM: maximum of sensitivity = 89.0%, specificity = 93.2% |
| Palumbo et al. 2010 | Striatal binding ratios for caudate and putamina on 3 slices | Probabilistic neural network (PNN), Classification tree (CT) | 216 local images (89 non-PDD, 127 PD). Two fold CV, repeated 1000 times | PNN: for patients with essential tremor mean probability of correct classification = 96.6 ± 2.6% |
Only algorithms using (I123)FP-CIT SPECT data alone are considered; multimodal inputs are excluded. Literature lacking accuracy data is grouped at the bottom of the table
Summary of patient preparation and image acquisition parameters
| Parameter | Local database | PPMI database |
|---|---|---|
| Administered activity | 167–185 MBq | 111–185 MBq |
| Injection-to-scan delay | 3–6 h | 3.5–4.5 h |
| Acquisition time | 30 min | 30–45 min |
| Acquisition pixel size | 3.68 mm | Variable (scanner dependent) |
| Number of projections | 60 per head (over 180°) | 120 per head (over 360°) |
| Energy window | 159 keV ± 10% | 159 keV ± 10% and 122 keV ± 10% |
| Acquisition matrix size | 128 × 128 | 128 × 128 |
Summary of patient demographics
| Database | Diagnosis | Sex (total male/total female) | Age (years) |
|---|---|---|---|
| Local | Non-PDD | 61/52 | 68.7 (12.4) |
| Local | PDD | 132/59 | 68.7 (13.3) |
| PPMI | HC | 73/136 | 60.8 (11.3) |
| PPMI | PD | 289/159 | 61.6 (9.8) |
Summary of the semi-quantitative methods investigated. Methods are principally grouped according to the particular technique for defining the SBR cut-off
| Semi-quantification method | Comparison data | SBRs considered | Cut-offs defined by |
|---|---|---|---|
| SQ 1 | Age-matched normals | Left and right putamen | Mean − 2SD |
| SQ 2 | Age-matched normals | Left and right putamen and caudate | Mean − 2SD |
| SQ 3 | Age-matched normals | Left and right putamen | Mean − 1.5SD |
| SQ 4 | Age-matched normals | Left and right putamen and caudate | Mean − 1.5SD |
| SQ 5 | Age-matched normals | Left and right putamen | Mean − 1SD |
| SQ 6 | Age-matched normals | Left and right putamen and caudate | Mean − 1SD |
| SQ 7 | Age-matched normals | Left and right putamen | Minimum |
| SQ 8 | Age-matched normals | Left and right putamen and caudate | Minimum |
| SQ 9 | All normals | Left and right putamen | Linear regression − 2SE |
| SQ 10 | All normals | Left and right putamen and caudate | Linear regression − 2SE |
| SQ 11 | All normals | Left and right putamen | Linear regression − 1.5SE |
| SQ 12 | All normals | Left and right putamen and caudate | Linear regression − 1.5SE |
| SQ 13 | All normals | Left and right putamen | Linear regression − 1SE |
| SQ 14 | All normals | Left and right putamen and caudate | Linear regression − 1SE |
| SQ 15 | All normals and abnormals | Lowest putamen | Optimal point on ROC curve |
| SQ 16 | All normals and abnormals | Lowest putamen and lowest caudate | Optimal point on ROC curve |
| SQ 17 | Age-matched normals and abnormals | Lowest putamen | Optimal point on ROC curve |
| SQ 18 | Age-matched normals and abnormals | Lowest putamen and lowest caudate | Optimal point on ROC curve |
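The four families of cut-off in the table above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the SBR and age values are hypothetical, the standard deviation is assumed to be the sample SD, the "standard error" is assumed to be the residual standard error of the regression, and the ROC operating point is assumed to maximise Youden's J (sensitivity + specificity − 1), since the criterion is not stated here. Flagging a scan as abnormal when any considered SBR falls below its cut-off is likewise an assumption.

```python
import math
import statistics


def sd_cutoff(normal_sbrs, k):
    """Mean of the normal SBRs minus k sample standard deviations."""
    return statistics.mean(normal_sbrs) - k * statistics.stdev(normal_sbrs)


def min_cutoff(normal_sbrs):
    """Lowest SBR seen in the normal comparison group."""
    return min(normal_sbrs)


def regression_cutoff(ages, normal_sbrs, k):
    """Return a function age -> cut-off: least-squares line minus k residual SEs."""
    n = len(ages)
    mean_age = statistics.mean(ages)
    mean_sbr = statistics.mean(normal_sbrs)
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (s - mean_sbr) for a, s in zip(ages, normal_sbrs))
    slope = sxy / sxx
    intercept = mean_sbr - slope * mean_age
    residual_ss = sum(
        (s - (intercept + slope * a)) ** 2 for a, s in zip(ages, normal_sbrs)
    )
    se = math.sqrt(residual_ss / (n - 2))  # assumed meaning of "SE"
    return lambda age: intercept + slope * age - k * se


def roc_cutoff(normal_sbrs, abnormal_sbrs):
    """Cut-off maximising Youden's J (assumed criterion); values below it are abnormal."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(normal_sbrs) | set(abnormal_sbrs)):
        sensitivity = sum(s < cut for s in abnormal_sbrs) / len(abnormal_sbrs)
        specificity = sum(s >= cut for s in normal_sbrs) / len(normal_sbrs)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


def is_abnormal(sbrs, cutoffs):
    """Assumed decision rule: abnormal if any considered SBR is below its cut-off."""
    return any(sbrs[region] < cutoffs[region] for region in cutoffs)


# Hypothetical putamen SBRs and ages, for illustration only.
normal_putamen = [2.9, 2.5, 2.5, 2.1, 2.0]
normal_ages = [50, 60, 70, 80, 90]
abnormal_putamen = [1.0, 1.2, 1.5, 1.9, 2.3]

age_cutoff = regression_cutoff(normal_ages, normal_putamen, k=1.5)
print("Mean - 2SD:", round(sd_cutoff(normal_putamen, 2), 3))
print("Minimum:", min_cutoff(normal_putamen))
print("Regression - 1.5SE at age 65:", round(age_cutoff(65), 3))
print("ROC operating point:", roc_cutoff(normal_putamen, abnormal_putamen))
```

Passing only the putamen regions in `cutoffs` corresponds to the odd-numbered methods; adding caudate entries corresponds to the even-numbered ones. For the ROC-based methods, the input per patient would be the lower of the left and right SBR.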
Fig. 1 Summary of the machine learning pipelines investigated
Fig. 2 Overview of performance comparison method
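The abstract describes the evaluation as stratified, nested 10-fold cross-validation repeated 10 times. A rough sketch of how such splits could be generated is given below, using only the Python standard library. The function names are hypothetical, the inner loop is assumed to use 10 folds as well (the number of inner folds is not stated here), and the class counts in the example (113 non-PDD, 191 PDD) are the local database totals from the demographics table.

```python
import random


def stratified_folds(labels, n_folds, rng):
    """Split sample positions into folds with roughly equal class proportions."""
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled members of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def nested_cv_splits(labels, n_outer=10, n_inner=10, n_repeats=10, seed=0):
    """Yield (repeat, train_idx, test_idx, inner_splits) for each outer fold.

    inner_splits is a list of (inner_train_idx, inner_val_idx) pairs drawn only
    from the outer training set, for tuning hyperparameters or cut-offs.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        for test_idx in stratified_folds(labels, n_outer, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            train_labels = [labels[i] for i in train_idx]
            inner_splits = []
            for val_positions in stratified_folds(train_labels, n_inner, rng):
                val_set = set(val_positions)
                inner_train = [
                    train_idx[p] for p in range(len(train_idx)) if p not in val_set
                ]
                inner_val = [train_idx[p] for p in val_positions]
                inner_splits.append((inner_train, inner_val))
            yield repeat, train_idx, test_idx, inner_splits


# Local database class sizes: 0 = non-PDD, 1 = PDD.
labels = [0] * 113 + [1] * 191
n_outer_folds = sum(1 for _ in nested_cv_splits(labels))
print("Outer test folds evaluated:", n_outer_folds)
```

Each outer test fold is scored once by a model (or cut-off) fitted on the corresponding training indices, and the mean and standard deviation of accuracy, sensitivity and specificity are then taken over all outer folds and repeats. Keeping the inner splits inside the outer training set is what prevents the tuning step from seeing the test cases.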
Semi-quantitative results for local clinical data
| Method number | Cut-offs defined by | SBRs | Accuracy | SD | Sensitivity | SD | Specificity | SD |
|---|---|---|---|---|---|---|---|---|
| SQ 1 | Mean − 2SD | L + R putamen | 0.79 | 0.08 | 0.68 | 0.12 | 0.97 | 0.05 |
| SQ 2 | Mean − 2SD | L + R putamen, L + R caudate | 0.78 | 0.08 | 0.68 | 0.11 | 0.96 | 0.06 |
| SQ 3 | Mean − 1.5SD | L + R putamen | 0.85 | 0.06 | 0.82 | 0.09 | 0.90 | 0.10 |
| SQ 4 | Mean − 1.5SD | L + R putamen, L + R caudate | 0.85 | 0.06 | 0.83 | 0.08 | 0.88 | 0.11 |
| SQ 5 | Mean − 1SD | L + R putamen | 0.86 | 0.06 | 0.91 | 0.06 | 0.77 | 0.12 |
| SQ 6 | Mean − 1SD | L + R putamen, L + R caudate | 0.86 | 0.05 | 0.92 | 0.06 | 0.75 | 0.13 |
| SQ 7 | Minimum | L + R putamen | 0.83 | 0.06 | 0.78 | 0.08 | 0.92 | 0.08 |
| SQ 8 | Minimum | L + R putamen, L + R caudate | 0.84 | 0.07 | 0.81 | 0.09 | 0.89 | 0.10 |
| SQ 9 | Regression − 2SE | L + R putamen | 0.82 | 0.07 | 0.72 | 0.11 | 0.99 | 0.03 |
| SQ 10 | Regression − 2SE | L + R putamen, L + R caudate | 0.82 | 0.06 | 0.72 | 0.10 | 0.98 | 0.04 |
| SQ 11 | Regression − 1.5SE | L + R putamen | 0.86 | 0.06 | 0.82 | 0.09 | 0.93 | 0.09 |
| SQ 12 | Regression − 1.5SE | L + R putamen, L + R caudate | 0.86 | 0.06 | 0.83 | 0.08 | 0.91 | 0.10 |
| SQ 13 | Regression − 1SE | L + R putamen | 0.87 | 0.06 | 0.92 | 0.06 | 0.78 | 0.12 |
| SQ 14 | Regression − 1SE | L + R putamen, L + R caudate | 0.87 | 0.06 | 0.93 | 0.06 | 0.77 | 0.12 |
| SQ 15 | ROC age-matched | Lowest putamen | 0.87 | 0.05 | 0.89 | 0.06 | 0.83 | 0.11 |
| SQ 16 | ROC age-matched | Lowest putamen, lowest caudate | 0.83 | 0.07 | 0.92 | 0.07 | 0.67 | 0.16 |
| SQ 17 | ROC | Lowest putamen | 0.86 | 0.06 | 0.86 | 0.08 | 0.86 | 0.13 |
| SQ 18 | ROC | Lowest putamen, lowest caudate | 0.84 | 0.06 | 0.90 | 0.07 | 0.74 | 0.14 |
Semi-quantitative results for PPMI database
| Method number | Method | SBRs | Accuracy | SD | Sensitivity | SD | Specificity | SD |
|---|---|---|---|---|---|---|---|---|
| SQ 1 | Mean − 2SD | L + R putamen | 0.93 | 0.03 | 0.92 | 0.04 | 0.97 | 0.04 |
| SQ 2 | Mean − 2SD | L + R putamen, L + R caudate | 0.93 | 0.03 | 0.92 | 0.04 | 0.96 | 0.04 |
| SQ 3 | Mean − 1.5SD | L + R putamen | 0.94 | 0.03 | 0.95 | 0.03 | 0.92 | 0.06 |
| SQ 4 | Mean − 1.5SD | L + R putamen, L + R caudate | 0.94 | 0.03 | 0.95 | 0.03 | 0.90 | 0.07 |
| SQ 5 | Mean − 1SD | L + R putamen | 0.92 | 0.03 | 0.98 | 0.02 | 0.78 | 0.09 |
| SQ 6 | Mean − 1SD | L + R putamen, L + R caudate | 0.89 | 0.04 | 0.98 | 0.02 | 0.71 | 0.11 |
| SQ 7 | Minimum | L + R putamen | 0.90 | 0.04 | 0.87 | 0.05 | 0.96 | 0.04 |
| SQ 8 | Minimum | L + R putamen, L + R caudate | 0.90 | 0.03 | 0.88 | 0.05 | 0.94 | 0.05 |
| SQ 9 | Regression − 2SE | L + R putamen | 0.93 | 0.03 | 0.91 | 0.04 | 0.97 | 0.04 |
| SQ 10 | Regression − 2SE | L + R putamen, L + R caudate | 0.93 | 0.03 | 0.91 | 0.04 | 0.97 | 0.04 |
| SQ 11 | Regression − 1.5SE | L + R putamen | 0.94 | 0.03 | 0.95 | 0.03 | 0.92 | 0.05 |
| SQ 12 | Regression − 1.5SE | L + R putamen, L + R caudate | 0.94 | 0.03 | 0.95 | 0.03 | 0.90 | 0.07 |
| SQ 13 | Regression − 1SE | L + R putamen | 0.92 | 0.03 | 0.98 | 0.02 | 0.80 | 0.08 |
| SQ 14 | Regression − 1SE | L + R putamen, L + R caudate | 0.89 | 0.04 | 0.98 | 0.02 | 0.71 | 0.11 |
| SQ 15 | ROC age-matched | Lowest putamen | 0.94 | 0.03 | 0.96 | 0.03 | 0.91 | 0.07 |
| SQ 16 | ROC age-matched | Lowest putamen, lowest caudate | 0.89 | 0.03 | 0.97 | 0.03 | 0.73 | 0.09 |
| SQ 17 | ROC | Lowest putamen | 0.95 | 0.03 | 0.96 | 0.03 | 0.92 | 0.06 |
| SQ 18 | ROC | Lowest putamen, lowest caudate | 0.89 | 0.03 | 0.97 | 0.03 | 0.71 | 0.10 |
Machine learning results for local clinical data
| Method number | Feature | No. PCs | Kernel | Mean accuracy | SD | Sensitivity | SD | Specificity | SD |
|---|---|---|---|---|---|---|---|---|---|
| ML 1 | PCs | 3 | Linear | 0.91 | 0.05 | 0.93 | 0.05 | 0.88 | 0.10 |
| ML 2 | PCs | 5 | Linear | 0.92 | 0.05 | 0.94 | 0.06 | 0.88 | 0.10 |
| ML 3 | PCs | 10 | Linear | 0.91 | 0.05 | 0.93 | 0.06 | 0.86 | 0.10 |
| ML 4 | PCs | 15 | Linear | 0.89 | 0.05 | 0.92 | 0.06 | 0.83 | 0.11 |
| ML 5 | PCs | 20 | Linear | 0.89 | 0.05 | 0.92 | 0.07 | 0.83 | 0.12 |
| ML 6 | PCs | 3 | RBF | 0.91 | 0.05 | 0.91 | 0.07 | 0.89 | 0.09 |
| ML 7 | PCs | 5 | RBF | 0.91 | 0.06 | 0.92 | 0.06 | 0.89 | 0.10 |
| ML 8 | PCs | 10 | RBF | 0.90 | 0.05 | 0.91 | 0.07 | 0.88 | 0.09 |
| ML 9 | PCs | 15 | RBF | 0.89 | 0.05 | 0.91 | 0.07 | 0.87 | 0.10 |
| ML 10 | PCs | 20 | RBF | 0.90 | 0.05 | 0.90 | 0.07 | 0.89 | 0.10 |
| ML 11 | Voxels | N/A | Linear | 0.88 | 0.05 | 0.91 | 0.06 | 0.84 | 0.11 |
| ML 12 | SBRs | N/A | Linear | 0.89 | 0.05 | 0.92 | 0.06 | 0.82 | 0.10 |
| ML 13 | SBRs | N/A | RBF | 0.89 | 0.06 | 0.91 | 0.07 | 0.85 | 0.10 |
Machine learning results for PPMI data
| Method number | Feature | No. PCs | Kernel | Mean accuracy | SD | Sensitivity | SD | Specificity | SD |
|---|---|---|---|---|---|---|---|---|---|
| ML 1 | PCs | 3 | Linear | 0.97 | 0.02 | 0.98 | 0.02 | 0.96 | 0.04 |
| ML 2 | PCs | 5 | Linear | 0.97 | 0.02 | 0.98 | 0.02 | 0.96 | 0.05 |
| ML 3 | PCs | 10 | Linear | 0.97 | 0.02 | 0.98 | 0.02 | 0.96 | 0.04 |
| ML 4 | PCs | 15 | Linear | 0.97 | 0.02 | 0.97 | 0.02 | 0.95 | 0.04 |
| ML 5 | PCs | 20 | Linear | 0.97 | 0.02 | 0.98 | 0.02 | 0.96 | 0.05 |
| ML 6 | PCs | 3 | RBF | 0.97 | 0.02 | 0.98 | 0.02 | 0.97 | 0.04 |
| ML 7 | PCs | 5 | RBF | 0.97 | 0.02 | 0.97 | 0.02 | 0.97 | 0.03 |
| ML 8 | PCs | 10 | RBF | 0.97 | 0.02 | 0.97 | 0.02 | 0.97 | 0.04 |
| ML 9 | PCs | 15 | RBF | 0.97 | 0.02 | 0.97 | 0.02 | 0.97 | 0.04 |
| ML 10 | PCs | 20 | RBF | 0.97 | 0.02 | 0.97 | 0.02 | 0.97 | 0.04 |
| ML 11 | Voxels | N/A | Linear | 0.95 | 0.02 | 0.97 | 0.03 | 0.92 | 0.06 |
| ML 12 | SBRs | N/A | Linear | 0.95 | 0.03 | 0.97 | 0.03 | 0.91 | 0.06 |
| ML 13 | SBRs | N/A | RBF | 0.95 | 0.02 | 0.96 | 0.03 | 0.93 | 0.06 |
Fig. 3 Accuracy results for all semi-quantification and machine learning methods applied to local data. Semi-quantification results are grouped to the left of the graph and machine learning algorithms to the right. Whiskers represent one standard deviation
Fig. 4 Accuracy results for all semi-quantification and machine learning methods applied to PPMI data. Semi-quantification results are grouped to the left of the graph and machine learning algorithms to the right. Whiskers represent one standard deviation