Jihye Yun, Ji Eun Park, Hyunna Lee, Sungwon Ham, Namkug Kim, Ho Sung Kim.
Abstract
We aimed to establish a high-performing and robust classification strategy for magnetic resonance imaging (MRI) on a small dataset by comparing combinations of feature extraction and selection in human and machine learning, using radiomics or deep features. Using diffusion and contrast-enhanced T1-weighted MR images obtained from patients with glioblastomas and primary central nervous system lymphomas, the classification task was assigned to a combination of radiomic features and (1) supervised machine learning after feature selection or (2) a multilayer perceptron (MLP) network, or to MR image input without radiomic feature extraction given to (3) two neuroradiologists or (4) an end-to-end convolutional neural network (CNN). The generalized linear model (GLM) classifier and the MLP using radiomic features showed similarly high performance in the internal validation set, but the MLP network remained robust in the external validation set, which was obtained using different MRI protocols. The CNN showed the lowest performance in both validation sets. Our results indicate that a combination of radiomic features and an MLP network classifier serves as a high-performing and generalizable model for classification tasks on a small dataset with heterogeneous MRI protocols.
Year: 2019 PMID: 30952930 PMCID: PMC6451024 DOI: 10.1038/s41598-019-42276-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Demographics of the Patients and Protocols in the Image Datasets.
| Group | Dataset origin | No. of glioblastoma | No. of PCNSL | Age (y) | Female (%) | MR field strength | TR/TE (ms), CE-T1WI | Slice thickness (mm), CE-T1WI | TR/TE (ms), DWI | Slice thickness (mm), DWI |
|---|---|---|---|---|---|---|---|---|---|---|
| Training set | AMC | 73 | 50 | 63.0 ± 11.7 | 35.9 | 3 T | 9.0–10.1/4.4–4.8 | 0.5 | 3000–4000/56–61.7 | 4 |
| Internal validation set | AMC | 18 | 12 | 58.8 ± 13.0 | 50.0 | 3 T | 9.0–10.1/4.4–4.8 | 0.5 | 3000–4000/56–61.7 | 4 |
| External validation set | SMC | 28 | 14 | 56.3 ± 10.6 | 50.0 | 3 T | 8.6–10.4/3.5–4.7 | 1 | 6900–12,000/55–81 | 3–5 |
Data are expressed as mean ± standard deviation for age and as ranges for imaging protocols. Abbreviations: PCNSL = primary central nervous system lymphoma, TR = repetition time, TE = echo time, CE-T1WI = contrast-enhanced T1-weighted imaging, DWI = diffusion-weighted imaging, AMC = Asan Medical Center, Seoul, Korea, SMC = Samsung Medical Center, Seoul, Korea.
Figure 1. Study design. Metric 1: extraction of radiomic features, followed by machine learning with a support vector machine (SVM), generalized linear model (GLM), or random forest (RF); Metric 2: extraction of radiomic features, followed by deep learning-based classification with a multilayer perceptron network; Metric 3: assessment by two neuroradiologists; Metric 4: end-to-end classification using a convolutional neural network (CNN) without feature extraction.
Figure 2. Optimization of metric 1 in the training set using contrast-enhanced T1-weighted (CE-T1WI) and ADC images. (A) Heatmap depicting the diagnostic performance (area under the receiver operating characteristic curve, AUC) of 3 feature selection methods (in rows) and 3 classification methods (in columns) in the training set. Color scale: from orange (AUC, 0.60) to red (AUC, 1.00). (B) Stability of the AUCs using 10-fold cross-validation in the training set. Color scale: from sky blue (stability, 3%) to dark blue (stability, 6.5%).
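The 10-fold cross-validation in panel (B) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names are hypothetical, the per-fold AUCs are made-up numbers, and expressing stability as the relative standard deviation of the fold AUCs is an assumption, since the caption does not define the stability measure.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives fold sizes that differ by at most one sample.
    return [indices[i::k] for i in range(k)]


def relative_std_percent(values):
    """Relative standard deviation in percent (assumed stability measure)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# 123 = 73 glioblastomas + 50 PCNSLs in the training set.
folds = k_fold_indices(123, k=10)

# Hypothetical per-fold AUCs, for illustration only (not values from the study).
fold_aucs = [0.95, 0.93, 0.96, 0.94, 0.92, 0.97, 0.95, 0.93, 0.94, 0.96]
stability = relative_std_percent(fold_aucs)
print(f"fold sizes: {[len(f) for f in folds]}, stability: {stability:.2f}%")
```

In each round, one fold would be held out for evaluation and the other nine used for feature selection and training; a lower percentage means the AUC varies less from fold to fold.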
Optimization of the best classifier in multilayer perceptron network (metric 2) and CNN (metric 4) by comparing diagnostic performance in the internal validation set.
| Imaging data | CE-T1WI + ADC | CE-T1WI | ADC |
|---|---|---|---|
| MLP network | AUC (95% CI) | | |
| 100-10 | 0.991 (0.987–0.994) | 0.965 (0.959–0.972) | 0.969 (0.960–0.978) |
| 500-100-10 | 0.990 (0.987–0.993) | 0.965 (0.960–0.971) | 0.968 (0.956–0.979) |
| 500-100-50-10 | 0.989 (0.985–0.993) | 0.956 (0.949–0.962) | 0.964 (0.953–0.975) |
| 500-250-100-50-10 | 0.988 (0.982–0.995) | 0.968 (0.964–0.982) | 0.965 (0.952–0.977) |
| 750-500-250-100-50-10 | 0.986 (0.978–0.993) | 0.971 (0.967–0.975) | 0.960 (0.951–0.969) |
| CNN | Accuracy, % (Sensitivity/Specificity, %) | | |
| Inception v3 | 80.0 (50.0/100) | 76.7 (41.6/100) | 86.7 (83.3/88.9) |
Abbreviations: MLP = multilayer perceptron, CNN = convolutional neural network, CE-T1WI = contrast-enhanced T1-weighted imaging, ADC = apparent diffusion coefficient, DWI = diffusion-weighted imaging, AUC = area under the receiver operating characteristic curve.
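The CNN row reports accuracy with sensitivity and specificity in parentheses. The sketch below shows how these three figures follow from a confusion matrix. The counts are not reported in the source; they are inferred under the assumption that PCNSL is the positive class and that the internal validation set (12 PCNSL, 18 glioblastomas) was used, which reproduces the ADC column.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)  # correctly identified positives
    specificity = 100.0 * tn / (tn + fp)  # correctly identified negatives
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Inferred counts (an assumption, not reported in the source): with PCNSL as
# the positive class, 10 of 12 PCNSLs and 16 of 18 glioblastomas are
# classified correctly.
acc, sens, spec = diagnostic_metrics(tp=10, fn=2, tn=16, fp=2)
print(f"{acc:.1f} ({sens:.1f}/{spec:.1f})")  # prints "86.7 (83.3/88.9)"
```

Under the same assumption, 6 of 12 PCNSLs and 18 of 18 glioblastomas give 80.0 (50.0/100), which matches the CE-T1WI + ADC column.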
Figure 3. Differences in the receiver operating characteristic curves of the convolutional neural network (CNN) classifier (metric 4) according to image set. The CNN performed best with apparent diffusion coefficient (ADC) images, compared with a combination of contrast-enhanced T1-weighted (CE-T1WI) and ADC images or CE-T1WI images alone.
Validation Datasets Showing the Area Under the Curve, Sensitivity, and Specificity of the Classification Metrics to Distinguish Primary Central Nervous System Lymphoma and Glioblastoma, With Reference to a Histopathology Finding.
| Metric | Classifier | Input | Training set: AUC (95% CI) | Training set: Sens (%) | Training set: Spec (%) | Internal validation set: AUC (95% CI) | Internal validation set: Sens (%) | Internal validation set: Spec (%) | External validation set: AUC (95% CI) | External validation set: Sens (%) | External validation set: Spec (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Backward feature elimination with GLM boosting classifier | Radiomics features | 0.943 (0.927–0.978) | 96.3 | 92.3 | 0.931 (0.914–0.941) | 98.8 | 92.3 | 0.811 (0.795–0.835) | 85.5 | 78.9 |
| 2 | MLP network classifier (100-10) | Radiomics features | 0.994 (0.994–0.995) | 100 | 100 | 0.991 (0.987–0.994) | 100 | 100 | 0.947 (0.937–0.956) | 92.9 | 82.1 |
| 3 | Human readers | Images | 0.825–0.908 (0.755–0.949) | 69.4–83.9 | 95.6–97.8 | 0.833–0.875 (0.653–0.940) | 75.0–83.3 | 83.3–100 | 0.913–0.932 (0.808–0.981) | 86.2–89.7 | 96.4–96.4 |
| 4 | CNN based end-to-end classifier | Images | 0.973 (0.966–0.980) | 100 | 94.51 | 0.879 (0.856–0.902) | 83.3 | 83.3 | 0.486 (0.468–0.503) | 100 | 35.7 |
Abbreviations: GLM = Generalized linear model, MLP = multilayer perceptron, CNN = convolutional neural network, AUC = area under the receiver operating characteristic curve, Sens = Sensitivity, Spec = Specificity.
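For readers less familiar with the AUC, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is equivalent to the area under the ROC curve. The function name and the scores are hypothetical, and treating PCNSL as the positive class is an assumption. Under this reading, an AUC near 0.5, such as the CNN's 0.486 in the external validation set, corresponds to chance-level ranking.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs, for illustration only.
pcnsl_scores = [0.9, 0.8, 0.6, 0.4]
glioblastoma_scores = [0.7, 0.5, 0.3, 0.2, 0.1]
auc = auc_from_scores(pcnsl_scores, glioblastoma_scores)
print(f"AUC = {auc:.2f}")  # prints "AUC = 0.85"
```

Here 17 of the 20 positive/negative pairs are ordered correctly, so the AUC is 0.85. The pairwise loop is quadratic in the number of cases, which is acceptable for datasets of this size.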
Figure 4. Diagnostic performance of metric 2 in the internal and external validation sets. The area under the receiver operating characteristic curve (AUC) remained robust in the external validation set.