| Literature DB >> 33037266 |
Eric W Prince1,2,3, Ros Whelan4, David M Mirsky5, Nicholas Stence5, Susan Staulcup4, Paul Klimo6,7, Richard C E Anderson8, Toba N Niazi9, Gerald Grant10, Mark Souweidane11,12, James M Johnston13, Eric M Jackson14, David D Limbrick15, Amy Smith16, Annie Drapeau17, Joshua J Chern18, Lindsay Kilburn19, Kevin Ginn20, Robert Naftel21, Roy Dudley22, Elizabeth Tyler-Kabara23, George Jallo24, Michael H Handler25,4, Kenneth Jones26, Andrew M Donson27,28, Nicholas K Foreman27,28, Todd C Hankinson25,4,27.
Abstract
Deep learning (DL) is a widely applied mathematical modeling technique. Classically, DL models utilize large volumes of training data, which are not available in many healthcare contexts. For patients with brain tumors, non-invasive diagnosis would represent a substantial clinical advance, potentially sparing patients from the risks associated with surgical intervention on the brain. Such an approach will depend upon highly accurate models built using the limited datasets that are available. Herein, we present a novel genetic algorithm (GA) that identifies optimal architecture parameters using feature embeddings from state-of-the-art image classification networks to identify the pediatric brain tumor, adamantinomatous craniopharyngioma (ACP). We optimized classification models for preoperative Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and combined CT and MRI datasets with demonstrated test accuracies of 85.3%, 83.3%, and 87.8%, respectively. Notably, our GA improved baseline model performance by up to 38%. This work advances DL and its applications within healthcare by identifying optimized networks in small-scale data contexts. The proposed system is easily implementable and scalable for non-invasive computer-aided diagnosis, even for uncommon diseases.Entities:
Mesh:
Year: 2020 PMID: 33037266 PMCID: PMC7547020 DOI: 10.1038/s41598-020-73278-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1 Transfer learning networks, feature embeddings, and baseline results. (A) ILSVRC network models utilized, with their top-1 and top-5 accuracy in the ILSVRC competition noted. (B) Example CT and MRI images for both ACP and NOTACP. (C) ROC curves (left) and AUC values (right) for all twelve networks and both imaging modalities (CT top, MRI bottom). The diagonal dashed line represents the performance of a random guess.
Figure 2 Genetic algorithm optimization of model parameters. (A) General process schematic for genetic algorithm parameter optimization. Moving from left to right, a feature variant is selected for each model feature to create individual networks (Step 1; individuals are highlighted in unique colors). Individuals are trained and evaluated to determine fitness, then ranked accordingly (Step 2). Two networks are chosen from the fittest population and a new network is derived by selecting from the feature variants of these two networks; variants are occasionally mutated (i.e., randomly selected from the population pool; Step 3). (B) Model features and respective feature variants explored in the first phase of genetic algorithm optimization. Each column represents a model feature to be optimized and each row is a possible feature variant for the GA to select from. This table reflects the "Population Pool" in (A). (C) Top-5 performing networks for independent CT and MRI networks after 10 generations of 100-solution populations, ranked according to test accuracy. (D) Top-5 performing networks for combined CT-MRI networks after 10 generations of 100-solution populations, ranked according to test accuracy.
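The select-rank-breed-mutate loop that Figure 2A describes can be sketched as a generic genetic algorithm. The sketch below is illustrative only: the feature names, variants, and toy fitness function are hypothetical stand-ins, not the paper's actual "Population Pool" or train-and-evaluate fitness (which scored real networks by test accuracy).

```python
import random

# Hypothetical search space: each model feature maps to its candidate variants.
# These names and values are illustrative, not the paper's actual population pool.
SEARCH_SPACE = {
    "optimizer": ["adam", "sgd", "rmsprop"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_units": [64, 128, 256],
    "dropout": [0.0, 0.25, 0.5],
}

def random_individual(rng):
    """Step 1: pick one variant per model feature."""
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def crossover(parent_a, parent_b, rng, mutation_rate=0.1):
    """Step 3: inherit each feature from either parent; occasionally mutate
    by drawing a fresh variant from the population pool."""
    child = {}
    for k in SEARCH_SPACE:
        child[k] = rng.choice([parent_a[k], parent_b[k]])
        if rng.random() < mutation_rate:
            child[k] = rng.choice(SEARCH_SPACE[k])
    return child

def evolve(fitness_fn, generations=10, pop_size=100, elite_frac=0.2, seed=0):
    """Run the select-train-rank-breed loop of Figure 2A."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate fitness (stand-in for train-and-test accuracy) and rank
        ranked = sorted(population, key=fitness_fn, reverse=True)
        elites = ranked[: max(2, int(elite_frac * pop_size))]
        # Step 3: breed the next generation from the fittest individuals
        population = elites + [
            crossover(rng.choice(elites), rng.choice(elites), rng)
            for _ in range(pop_size - len(elites))
        ]
    return max(population, key=fitness_fn)

# Toy fitness: count how many features match a fixed "ideal" configuration.
TARGET = {"optimizer": "adam", "learning_rate": 1e-3,
          "hidden_units": 256, "dropout": 0.25}
best = evolve(lambda ind: sum(ind[k] == TARGET[k] for k in TARGET))
```

In the paper's setting, evaluating `fitness_fn` meant training and testing a full network per individual, which is why restricting the search to a small discrete variant pool (Figure 2B) keeps the GA tractable.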
Figure 3 Further optimization with the GA. (A) Model features and variants available in the solution search space for the second phase of GA optimization. Each column represents a model feature to be optimized and each row is a possible feature variant for the GA to select from. (B) Top-5 performing networks from CT- and MRI-trained networks as optimized by the GA over 10 generations of 100-solution populations. (C) Top-5 performing networks for combined CT-MRI-trained networks as optimized by the GA over 10 generations of 100-solution populations.
Figure 4 Test performance of models trained on stochastically augmented and GAN-LSTM-augmented images. (A) Exemplar original training CT (top) and MRI (bottom) images, with randomly augmented variants and TANDA-augmented variants. (B) ROC curves for CT- and MRI-trained networks comparing the top results for supervised augmented images and TANDA-generated images. Dashed lines represent the ROC curve of random-chance classification.
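Stochastic augmentation of the kind compared in Figure 4 expands a small training set by pairing each original image with several randomly transformed copies. The sketch below uses generic label-preserving transforms (flips and intensity jitter) on images represented as 2D lists; it is a minimal stand-in, not the paper's actual augmentation pipeline or the TANDA generator.

```python
import random

def augment(image, rng):
    """Apply a random combination of simple label-preserving transforms.
    `image` is a 2D list of pixel intensities; these particular transforms
    are generic examples, not the paper's exact augmentation set."""
    out = [row[:] for row in image]
    if rng.random() < 0.5:                       # horizontal flip
        out = [row[::-1] for row in out]
    if rng.random() < 0.5:                       # vertical flip
        out = out[::-1]
    scale = rng.uniform(0.9, 1.1)                # mild intensity jitter
    out = [[px * scale for px in row] for row in out]
    return out

def expand_dataset(images, labels, copies=4, seed=0):
    """Grow a small training set by adding `copies` randomly augmented
    variants of each original image; labels are unchanged."""
    rng = random.Random(seed)
    aug_images, aug_labels = [], []
    for img, lab in zip(images, labels):
        aug_images.append(img)
        aug_labels.append(lab)
        for _ in range(copies):
            aug_images.append(augment(img, rng))
            aug_labels.append(lab)
    return aug_images, aug_labels
```

For medical images, transforms must be chosen so the diagnostic content survives; that is the motivation for comparing hand-chosen stochastic transforms against learned (TANDA/GAN-LSTM) augmentation.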
Figure 5 Pituitary obfuscation reveals that latent features exist outside the canonical ROI in CT scans. (A) Example original and obfuscated images for both data classes and both imaging modalities. (B) ROC curves for networks trained on obfuscated and original data; the original data was the 'Augmented ()' variant. (C) Baseline ROC curves for all twelve networks trained on original (left) and obfuscated (right) CT images. (D) Baseline ROC curves for all twelve networks trained on original (left) and obfuscated (right) MRI images.
Figure 6 Optimized network classification performance versus human specialists, and 5-fold cross-validation evaluation. (A) Radiologists averaged auROC values of 89.4%, 83.3%, and 93.8% for CT, MRI, and CT-MRI, respectively; GA-optimized models achieved auROC values of 85.3%, 83.3%, and 87.8% for CT, MRI, and CT-MRI, respectively. (B) Schematics of the 5-fold cross-validation (5F-CV) approaches used to verify the perceived improvement yielded by augmented training data (scenario 3 vs. scenarios 1 and 2). Additionally, scenarios 1 and 2 investigate the effect of mixing augmented data into the overall data pool versus augmenting only the training data. (C) Performance metrics (AUC: area under the ROC curve; Accuracy: standard accuracy metric) for 5F-CV across all three scenarios. Peak performance was achieved via scenario 2 for CT (AUC = 88.0%, accuracy = 89.0%) and MRI (AUC = 97.5%, accuracy = 97.4%). For CT-MRI, peak performance was attained in scenario 3 (AUC = 97.8%, accuracy = 97.9%).
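The distinction Figure 6B draws (augmenting only the training folds versus mixing augmented copies into the overall pool) matters because augmented variants of a test image leaking into training inflates scores. A minimal leakage-free 5F-CV loop, under the assumption that augmentation is applied per fold to the training portion only, might look like this (the `train_and_score` callable and its signature are illustrative, not the authors' code):

```python
import random

def kfold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(data, labels, train_and_score, augment=None, k=5):
    """Leakage-free variant: augmentation (if given) is applied to the
    training portion of each fold only, never to the held-out fold."""
    folds = kfold_indices(len(data), k)
    scores = []
    for i in range(k):
        held_out = set(folds[i])
        train = [(data[j], labels[j]) for j in range(len(data)) if j not in held_out]
        test = [(data[j], labels[j]) for j in folds[i]]
        if augment is not None:
            # Augmented copies are derived from training examples only.
            train = train + [(augment(x), y) for x, y in train]
        scores.append(train_and_score(train, test))
    return sum(scores) / k
```

Mixing augmented images into the pool before splitting (as in one of the compared scenarios) would instead let near-duplicates of test images appear in training, which is exactly the confound the three-scenario comparison is designed to expose.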
Pre-trained networks utilized.
| Network | Source | Feature vector size |
|---|---|---|
| Inception V1 | | 1024 |
| Inception V2 | | 1024 |
| Inception V3 | | 2048 |
| Inception ResNet V2 | | 1536 |
| ResNet V1 50 | | 2048 |
| ResNet V1 101 | | 2048 |
| ResNet V1 152 | | 2048 |
| ResNet V2 50 | | 2048 |
| ResNet V2 101 | | 2048 |
| ResNet V2 152 | | 2048 |
| NASNet-A Large | | 4032 |
| PNASNet-5 Large | | 4320 |
Modules were accessed using the respective URL and standard TensorFlow Hub methods.
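Each listed module emits a fixed-length feature vector (e.g. 2048 values for Inception V3), and the paper's approach trains only a small classifier head on top of those frozen embeddings. As a self-contained stand-in for that downstream step (the embeddings here are plain float lists rather than TensorFlow Hub outputs, and this logistic-regression head is an illustrative choice, not the authors' exact architecture):

```python
import math
import random

def train_head(embeddings, labels, dim, epochs=200, lr=0.1, seed=0):
    """Train a logistic-regression 'head' on frozen feature embeddings.
    In the transfer-learning setting, `embeddings` would be the fixed-size
    vectors produced by a pre-trained network from the table above; here
    any list of equal-length float vectors works. Labels are 0/1."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.01, 0.01) for _ in range(dim)]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # sigmoid probability
            g = p - y                           # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Classify an embedding: 1 if the decision score is positive."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0
```

Because only the head is trained while the embedding network stays frozen, the number of learned parameters stays proportional to the feature vector size, which is what makes this approach viable on small clinical datasets.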