Peiyi Gao1,2, Wei Shan1,3, Yue Guo1,2, Yinyan Wang1,4, Rujing Sun1,2, Jinxiu Cai1,2, Hao Li1,2, Wei Sheng Chan4, Pan Liu4, Lei Yi5, Shaosen Zhang1,2, Weihua Li5, Tao Jiang1,4, Kunlun He6,7, Zhenzhou Wu1,4.
Abstract
Importance: Deep learning may be able to use patient magnetic resonance imaging (MRI) data to aid in brain tumor classification and diagnosis.

Objective: To develop and clinically validate a deep learning system for automated identification and classification of 18 types of brain tumors from patient MRI data.

Design, Setting, and Participants: This diagnostic study was conducted using MRI data collected between 2000 and 2019 from 37 871 patients. A deep learning system for segmentation and classification of 18 types of intracranial tumors based on T1- and T2-weighted images and T2 contrast MRI sequences was developed and tested. The diagnostic accuracy of the system was tested using 1 internal and 3 external independent data sets. The clinical value of the system was assessed by comparing the tumor diagnostic accuracy of neuroradiologists with vs without assistance of the proposed system using a separate internal test data set. Data were analyzed from March 2019 through February 2020.

Main Outcomes and Measures: Changes in neuroradiologist clinical diagnostic accuracy in brain MRI scans with vs without the deep learning system were evaluated.
Year: 2022 PMID: 35939301 PMCID: PMC9361083 DOI: 10.1001/jamanetworkopen.2022.25608
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Test Set Evaluation of Deep Learning System
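| Characteristic | Training set (N = 37 871) | Pathology-confirmed data set: Tiantan Hospital (n = 300), outcome (95% CI) | Pathology-confirmed data set: Jilin University-Affiliated Hospital (n = 71), outcome (95% CI) | Pathology-confirmed data set: Shanxi Hospital (n = 99), outcome (95% CI) | Pathology-confirmed data set: 301 Hospital (n = 869), outcome (95% CI) |
|---|---|---|---|---|---|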
| Tumor classification, No. | 18 | 18 | 12 | 12 | 16 |
| Images, No. | 2 693 856 | 21 288 | 2832 | 5062 | 56 520 |
| Sex, No. (%) | | | | | |
| Women | 19 186 (50.66) | 145 (48.33) | 36 (50.70) | 66 (66.67) | 498 (57.31) |
| Men | 18 685 (49.34) | 155 (51.67) | 35 (49.30) | 33 (33.33) | 371 (42.69) |
| Age, mean (SD), y | 41.58 (11.4) | 35.97 (13.1) | 49.38 (14.2) | 51.56 (16.8) | 38.72 (18.1) |
| Accuracy, % | NA | 73.00 (67.7-77.7) | 74.20 (62.1-83.4) | 81.20 (70.4-88.6) | 73.60 (70.5-76.4) |
| Sensitivity | NA | 0.889 (0.853-0.924) | 0.727 (0.621-0.834) | 0.876 (0.810-0.942) | 0.735 (0.619-0.553) |
| Specificity | NA | 0.963 (0.942-0.984) | 0.849 (0.763-0.935) | 0.968 (0.93-1.00) | 0.941 (0.925-0.985) |
| Precision | NA | 0.747 (0.698-0.795) | 0.688 (0.577-0.799) | 0.912 (0.857-0.969) | 0.644 (0.613-0.674) |
| F1 score | NA | 0.796 (0.751-0.841) | 0.618 (0.502-0.735) | 0.887 (0.824-0.951) | 0.545 (0.513-0.577) |
| Recall | NA | 0.889 (0.853-0.924) | 0.727 (0.621-0.834) | 0.876 (0.810-0.942) | 0.735 (0.619-0.553) |
Abbreviation: NA, not applicable.
Figure. Study Design Overview
DLS, deep learning system; MRI, magnetic resonance imaging; T1C, T1-weighted contrast-enhanced images; T1WI, T1-weighted images; T2WI, T2-weighted images.
Accuracy of Professional Evaluators vs DLS
| Evaluator index | Years of experience, No. | Accuracy, % (95% CI) |
|---|---|---|
| A | 15 | 73.0 (68.0-78.0) |
| B | 11 | 71.3 (66.2-76.5) |
| C | 13 | 68.3 (63.1-73.6) |
| D | 19 | 64.7 (59.3-70.1) |
| E | 15 | 64.7 (59.3-70.1) |
| F | 30 | 63.0 (57.5-68.5) |
| G | 10 | 60.3 (54.8-65.9) |
| H | 9 | 60.0 (54.5-65.5) |
| I | 10 | 59.7 (54.1-65.2) |
| M | 9 | 50.7 (45.0-56.3) |
| N | 24 | 48.7 (43.0-54.3) |
| Mean | 15 | 60.9 (46.8-75.1) |
| DLS | NA | 73.0 (67.7-77.7) |
Abbreviations: DLS, deep learning system; NA, not applicable.
Among 300 patients from Tiantan Hospital.
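
The intervals in this table can be approximately reproduced from the reported accuracy and the 300-patient denominator. The sketch below is a check under assumptions, not the authors' stated method: the correct-case counts (219 for 73.0%, 214 for 71.3%) are inferred from the rounded percentages, and the function names are illustrative. Under those assumptions, a normal-approximation (Wald) interval agrees with the evaluator rows, while a Wilson score interval agrees with the DLS row.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def wald_interval(correct, n, z=Z95):
    """Normal-approximation (Wald) interval for a proportion."""
    p = correct / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def wilson_interval(correct, n, z=Z95):
    """Wilson score interval for a proportion."""
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def as_percent(interval):
    """Convert a (low, high) proportion pair to percentages with 1 decimal."""
    return tuple(round(100 * bound, 1) for bound in interval)


N_PATIENTS = 300  # Tiantan Hospital test set, from the table footnote

# Counts below are ASSUMED from the rounded accuracies (0.730 * 300, 0.713 * 300).
evaluator_a = as_percent(wald_interval(219, N_PATIENTS))
evaluator_b = as_percent(wald_interval(214, N_PATIENTS))
dls = as_percent(wilson_interval(219, N_PATIENTS))

print("Evaluator A (Wald):", evaluator_a)
print("Evaluator B (Wald):", evaluator_b)
print("DLS (Wilson):", dls)
```

Evaluator A and the DLS share the same point estimate (73.0%), so the difference between their reported intervals is consistent with two different interval methods rather than different data.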
Accuracy in DLS-Assistance Evaluation
| Evaluator index | Accuracy without DLS assistance, % (95% CI) | Accuracy with DLS assistance, % (95% CI) |
|---|---|---|
| A | 67.3 (58.3-76.3) | 77.9 (69.9-85.9) |
| B | 51.9 (42.4-61.4) | 77.4 (69.4-85.3) |
| C | 61.5 (52.2-70.9) | 67.3 (58.3-76.3) |
| D | 42.9 (33.4-52.3) | 73.3 (64.9-81.8) |
| E | 59.0 (49.6-68.5) | 74.3 (65.9-82.6) |
| F | 81.1 (73.7-88.6) | 84.9 (78.1-91.7) |
| G | 66.9 (59.9-73.8) | 77.0 (70.8-83.2) |
| H | 68.6 (61.7-75.4) | 78.3 (72.2-84.4) |
| I | 65.0 (58.1-71.9) | 69.9 (63.3-76.6) |
| Mean | 63.5 (60.7-66.2) | 75.5 (73.0-77.9) |
Abbreviation: DLS, deep learning system.
Among 1166 patients in independent internal test data set.
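
The change in accuracy per evaluator, which is the study's main outcome measure, can be read off the table with simple arithmetic. The sketch below copies the point estimates from the table and computes the gain in percentage points for each evaluator; the variable names are illustrative. The unweighted average of these gains need not equal the difference between the two values in the Mean row (75.5 - 63.5 = 12.0), which may have been pooled over cases rather than over evaluators; the source does not say how the Mean row was computed.

```python
# Accuracy (%) without and with DLS assistance, copied from the table above.
accuracy = {
    "A": (67.3, 77.9),
    "B": (51.9, 77.4),
    "C": (61.5, 67.3),
    "D": (42.9, 73.3),
    "E": (59.0, 74.3),
    "F": (81.1, 84.9),
    "G": (66.9, 77.0),
    "H": (68.6, 78.3),
    "I": (65.0, 69.9),
}

# Gain in percentage points for each evaluator, rounded to 1 decimal.
gains = {
    name: round(with_dls - without, 1)
    for name, (without, with_dls) in accuracy.items()
}

# Unweighted mean gain across the 9 evaluators.
mean_gain = round(sum(gains.values()) / len(gains), 1)

largest = max(gains, key=gains.get)
smallest = min(gains, key=gains.get)

for name, gain in gains.items():
    print(f"Evaluator {name}: +{gain} percentage points")
print(f"Unweighted mean gain: {mean_gain}")
print(f"Largest gain: {largest}; smallest gain: {smallest}")
```

Every evaluator in the table improved with assistance, and the evaluators with the lowest unassisted accuracy (D and B) show the largest gains.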
Top 2 Validation Tests in Multiple Tumors
| Tumor type | Patients, No. | Sensitivity (95% CI) | Specificity (95% CI) | Precision (95% CI) | F1 score (95% CI) | Recall (95% CI) |
|---|---|---|---|---|---|---|
| Acoustic neuroma | 205 | 0.89 (0.84-0.93) | 0.99 (0.98-1.00) | 0.95 (0.90-0.97) | 0.92 (0.88-0.94) | 0.89 (0.84-0.93) |
| Pituitary tumor | 234 | 0.90 (0.86-0.94) | 0.96 (0.94-0.97) | 0.82 (0.77-0.87) | 0.86 (0.83-0.89) | 0.90 (0.86-0.94) |
| Epidermoid cyst | 21 | 0.82 (0.61-0.93) | 1.00 (1.00-1.00) | 0.95 (0.75-0.99) | 0.88 (0.75-0.95) | 0.82 (0.61-0.93) |
| Meningioma | 280 | 0.93 (0.90-0.96) | 0.94 (0.92-0.95) | 0.81 (0.76-0.85) | 0.87 (0.84-0.89) | 0.93 (0.90-0.96) |
| Paraganglioma | 37 | 0.37 (0.24-0.52) | 1.00 (0.99-1.00) | 1.00 (0.80-1.00) | 0.54 (0.41-0.66) | 0.37 (0.24-0.52) |
| Craniopharyngioma | 47 | 0.93 (0.82-0.98) | 0.99 (0.99-1.00) | 0.82 (0.70-0.90) | 0.88 (0.79-0.93) | 0.93 (0.82-0.98) |
| Glioma | 262 | 0.92 (0.87-0.95) | 0.98 (0.97-0.99) | 0.91 (0.86-0.94) | 0.91 (0.88-0.93) | 0.92 (0.87-0.95) |
| Hemangioblastoma | 27 | 0.88 (0.70-0.96) | 1.00 (0.99-1.00) | 0.92 (0.74-0.98) | 0.90 (0.78-0.96) | 0.88 (0.70-0.96) |
| Metastatic tumor | 44 | 0.74 (0.59-0.85) | 0.99 (0.90-0.99) | 0.72 (0.57-0.83) | 0.73 (0.63-0.81) | 0.74 (0.59-0.85) |
| Germ cell tumor | 24 | 0.67 (0.47-0.82) | 1.00 (0.99-1.00) | 0.73 (0.52-0.87) | 0.70 (0.55-0.81) | 0.67 (0.47-0.82) |
| Medulloblastoma | 25 | 0.86 (0.67-0.95) | 1.00 (1.00-1.00) | 0.95 (0.76-0.99) | 0.90 (0.78-0.96) | 0.86 (0.67-0.95) |
| DNET | 14 | 0.64 (0.38-0.84) | 1.00 (0.99-1.00) | 1.00 (0.70-1.00) | 0.70 (0.46-0.94) | 0.64 (0.38-0.84) |
| Chordoma | 22 | 0.86 (0.67-0.95) | 1.00 (1.00-1.00) | 0.95 (0.76-0.99) | 0.90 (0.78-0.96) | 0.86 (0.67-0.95) |
| Lymphomas | 43 | 0.56 (0.41-0.71) | 0.99 (0.99-1.00) | 0.85 (0.66-0.94) | 0.68 (0.56-0.77) | 0.56 (0.41-0.71) |
| Choroid plexus papilloma | 18 | 0.61 (0.39-0.80) | 1.00 (0.99-1.00) | 0.85 (0.58-0.96) | 0.71 (0.53-0.84) | 0.61 (0.39-0.80) |
| Gangliocytoma | 13 | 0.50 (0.25-0.75) | 0.99 (0.99-1.00) | 0.40 (0.20-0.64) | 0.44 (0.28-0.63) | 0.50 (0.25-0.75) |
| Hemangiopericytoma | 13 | 0.44 (0.19-0.73) | 1.00 (0.99-1.00) | 0.50 (0.22-0.78) | 0.47 (0.26-0.69) | 0.44 (0.19-0.73) |
| Other | 11 | 0.25 (0.09-0.53) | 1.00 (0.98-1.00) | 0.60 (0.23-0.88) | 0.35 (0.17-0.59) | 0.25 (0.09-0.53) |
Abbreviation: DNET, dysembryoplastic neuroepithelial tumor.
Top 2 results output by the deep learning system among 1339 patients from 4 centers.
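
The Sensitivity and Recall columns in this table are identical because both measure the same quantity, TP / (TP + FN), for a given tumor type. The F1 score is the harmonic mean of precision and recall. The sketch below, with an illustrative helper name, recomputes F1 for a few rows from the published precision and recall; because those inputs are already rounded to two decimals, agreement is expected only to within about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
rows = {
    "Acoustic neuroma": (0.95, 0.89, 0.92),
    "Pituitary tumor": (0.82, 0.90, 0.86),
    "Meningioma": (0.81, 0.93, 0.87),
    "Paraganglioma": (1.00, 0.37, 0.54),
    "Glioma": (0.91, 0.92, 0.91),
}

for tumor, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{tumor}: computed F1 {computed:.3f}, reported {reported:.2f}")
```

The paraganglioma row shows why F1 is reported alongside precision: a precision of 1.00 paired with a recall of 0.37 yields an F1 of only about 0.54.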