Jiajun Deng1, Mengmeng Zhao1, Qiuyuan Li1, Yikai Zhang2, Minjie Ma3, Chuanyi Li4, Jun Wang5, Yunlang She1, Yan Jiang1, Yunzeng Zhang6, Tingting Wang7, Chunyan Wu8, Likun Hou8, Sheng Zhong9, Shengxi Jin10, Dahong Qian5, Dong Xie1, Yuming Zhu1, Yasmeen K Tandon11, Annemiek Snoeckx12,13, Feng Jin14, Bentong Yu15, Guofang Zhao16,17, Chang Chen1,3,18.
Abstract
BACKGROUND: Clinical management of subsolid nodules (SSNs) is defined by the suspicion of tumor invasiveness. We sought to develop an artificial intelligence (AI) algorithm for invasiveness assessment of lung adenocarcinoma manifesting as radiological SSNs. We investigated the performance of this algorithm in classification of SSNs related to invasiveness.
Keywords: Artificial intelligence (AI); computed tomography (CT); lung adenocarcinoma; pulmonary subsolid nodules (SSNs)
Year: 2021 PMID: 35070762 PMCID: PMC8743520 DOI: 10.21037/tlcr-21-971
Source DB: PubMed Journal: Transl Lung Cancer Res ISSN: 2218-6751
Figure 1. Flowchart of the study design. The artificial intelligence diagnostic tool, SSNet, was first developed and validated using retrospective datasets, then evaluated in an external dataset for its clinical utility. SSNs, subsolid nodules; ROI, region of interest.
Figure 2. ROC curves showing the diagnostic performance in binary (A,D), 3-category (B,E), and 4-category (C,F) classifications. (A-C) ROC curves measuring performance at the methodology level, including practicing doctors with and without SSNet serving as a second reader. (D-F) ROC curves measuring performance at the participant level for practicing doctors, indicating the performance improvement with the assistance of SSNet. ROC, receiver-operating characteristic; AUC, area under ROC curve.
Diagnostic performance and clinical utility in the internal and external tests
| Tasks | AUC | 95% CI | Difference (Bonferroni corrected CI) | Advantage |
|---|---|---|---|---|
| Internal test | ||||
| Two classifications | ||||
| SSNet | 0.914 | 0.813–0.987 | – | – |
| Human (unassisted) | 0.900 | 0.867–0.922 | –0.014 (–0.090 to 0.060)* | No difference |
| Human (assisted) | 0.937 | 0.911–0.970 | 0.037 (–0.078 to –0.014)† | Human (assisted) |
| Radiomics | 0.845 | 0.806–0.883 | 0.067 (–0.034 to 0.145)* | No difference |
| | | | 0.071 (0.032–0.110)‡ | Human (unassisted) |
| Three classifications | ||||
| SSNet | 0.874 | 0.832–0.909 | – | – |
| Human (unassisted) | 0.844 | 0.816–0.864 | –0.030 (0.000–0.087)* | SSNet |
| Human (assisted) | 0.852 | 0.825–0.882 | 0.008 (–0.015 to 0.042)† | No difference |
| Four classifications | ||||
| SSNet | 0.869 | 0.824–0.892 | – | – |
| Human (unassisted) | 0.835 | 0.817–0.862 | –0.034 (0.012–0.098)* | SSNet |
| Human (assisted) | 0.836 | 0.811–0.862 | 0.001 (–0.030 to 0.036)† | Human (assisted) |
| External test | ||||
| Two classifications | ||||
| SSNet | 0.949 | 0.884–1.000 | – | – |
| Human (unassisted) | 0.883 | 0.826–0.939 | –0.066 (0.037–0.212)* | SSNet |
| Human (assisted) | 0.908 | 0.847–0.982 | 0.025 (–0.092 to 0.029)† | Human (assisted) |
*, AUC difference was calculated as the AUC of the algorithm minus the AUC of the doctors (unassisted) or the AUC of radiomics. †, AUC difference was calculated as the AUC of the doctors (assisted) minus the AUC of the doctors (unassisted). ‡, AUC difference was calculated as the AUC of the doctors (unassisted) minus the AUC of the radiomics. To account for multiple hypothesis testing, the Bonferroni corrected CI (1−0.05/n, 97.5% for 2 classifications; 98.3% for 3 classifications; 98.8% for 4 classifications) around the difference was computed. AUC, area under the receiver-operating characteristic curve; CI, confidence interval.
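The Bonferroni-corrected confidence levels quoted in the footnote follow directly from the stated formula, 1 − 0.05/n. The short sketch below reproduces that arithmetic; the function name `bonferroni_ci_level` is hypothetical and chosen only for illustration, and treating n as the number of classification categories is an assumption read from the footnote's own examples.

```python
def bonferroni_ci_level(alpha: float, n_comparisons: int) -> float:
    """Confidence level 1 - alpha/n, as given in the table footnote.

    `n_comparisons` is assumed to be the n in the footnote
    (2, 3, or 4 for the two-, three-, and four-class tasks).
    """
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be a positive integer")
    return 1.0 - alpha / n_comparisons


if __name__ == "__main__":
    for n in (2, 3, 4):
        level = bonferroni_ci_level(0.05, n)
        # Print with two decimals so the unrounded percentage is visible.
        print(f"n = {n}: {level * 100:.2f}% confidence interval")
```

For n = 4 the unrounded level is 98.75%, which the footnote reports as 98.8%.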
Figure 3. Box plots showing the evaluation metrics in binary (A), 3-category (B-D), and 4-category (E-H) classifications. 1, performance of SSNet; 2, performance of previously constructed radiomic signature; 3–8, performance of practicing doctors without artificial intelligence interpretation; and 9–14, performance of practicing doctors with artificial intelligence interpretation. NPV, negative predictive value; PPV, positive predictive value.
Comparison of SSNet, radiomic signature, and practicing doctors to differentiate invasive adenocarcinoma in the internal and external tests
| Performance metrics | SSNet | Radiomics | Unassisted: Junior 1 | Unassisted: Junior 2 | Unassisted: Middle 1 | Unassisted: Middle 2 | Unassisted: Senior 1 | Unassisted: Senior 2 | Unassisted: Micro average | Assisted: Junior 1 | Assisted: Junior 2 | Assisted: Middle 1 | Assisted: Middle 2 | Assisted: Senior 1 | Assisted: Senior 2 | Assisted: Micro average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Retrospective | | | | | | | | | | | | | | | | |
| Sensitivity | 0.933 | 0.885 | 0.750 | 0.875 | 0.702 | 0.894 | 0.933 | 0.923 | 0.846 | 0.885 | 0.885 | 0.798 | 0.731 | 1.000 | 0.769 | 0.845 |
| McNemar’s test | | | <0.001* | 0.146* | <0.001* | 0.424* | 1.000* | 1.000* | | 0.004† | 1.000† | 0.052† | 0.001† | 0.016† | 0.002† | |
| Specificity | 0.794 | 0.673 | 0.860 | 0.831 | 0.934 | 0.794 | 0.816 | 0.897 | 0.855 | 0.816 | 0.831 | 0.912 | 0.941 | 0.772 | 0.949 | 0.870 |
| McNemar’s test | | | 0.049* | 0.302* | <0.001* | 1.000* | 0.250* | 0.003* | | 0.146† | 1.000† | 0.508† | <0.001† | 0.180† | 0.039† | |
| Accuracy | 0.921 | 0.866 | 0.897 | 0.919 | 0.909 | 0.912 | 0.929 | 0.952 | 0.919 | 0.916 | 0.921 | 0.926 | 0.919 | 0.931 | 0.931 | 0.880 |
| McNemar’s test | | | <0.001* | 0.054 | <0.001* | 0.596* | 0.250* | 0.012* | | 0.001† | 1.000† | 0.031† | <0.001† | 0.007† | <0.001† | |
| Kappa‡ | | | | | | | | | 0.718 | | | | | | | 0.701 |
| Prospective | | | | | | | | | | | | | | | | |
| Sensitivity | 0.958 | | 0.708 | 0.854 | 0.625 | 0.875 | 0.958 | 0.896 | 0.819 | 0.875 | 0.813 | 0.729 | 0.667 | 1.000 | 0.729 | 0.802 |
| McNemar’s test | | | <0.001* | 0.063* | <0.001* | 0.289* | 1.000* | 0.375* | | 0.039† | 0.688† | 0.227† | 0.006† | 0.500† | 0.039† | |
| Specificity | 0.796 | | 0.815 | 0.852 | 0.907 | 0.759 | 0.815 | 0.833 | 0.830 | 0.759 | 0.778 | 0.870 | 0.944 | 0.778 | 0.944 | 0.846 |
| McNemar’s test | | | 1.000* | 0.453* | 0.031* | 0.727* | 1.000* | 0.688* | | 0.453† | 0.289† | 0.727† | 0.006† | 0.727† | 0.031† | |
| Accuracy | 0.932 | | 0.867 | 0.921 | 0.873 | 0.897 | 0.938 | 0.926 | 0.904 | 0.897 | 0.885 | 0.891 | 0.897 | 0.938 | 0.915 | 0.904 |
| McNemar’s test | | | 0.004* | 0.039* | 0.001* | 0.804* | 1.000* | 0.227* | | 0.019† | 0.791† | 0.167† | <0.001† | 0.344† | 0.001† | |
| Kappa‡ | | | | | | | | | 0.632 | | | | | | | 0.649 |
1 and 2 denote doctors 1 and 2. *, McNemar’s test P value for comparison of evaluation metrics between SSNet and practicing doctors alone; †, McNemar’s test P value for comparison of evaluation metrics between practicing doctors with and without the assistance of SSNet; ‡, Kappa value was calculated as Fleiss’ kappa for the 6 readers.
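McNemar’s test compares two readings of the same cases by looking only at the discordant pairs, that is, cases one reading got right and the other got wrong. The sketch below shows an exact (binomial) two-sided version. The source does not state which variant of the test was used, so this is an illustration under that assumption, and the function names and example counts are hypothetical rather than study data.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(correct_a: Sequence[bool],
                      correct_b: Sequence[bool]) -> Tuple[int, int]:
    """Count cases where only reading A, or only reading B, was correct."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both readings must cover the same cases")
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    return only_a, only_b


def mcnemar_exact(only_a: int, only_b: int) -> float:
    """Two-sided exact McNemar P value from the two discordant counts.

    Under the null hypothesis each discordant pair is equally likely to
    favour either reading, so the smaller count follows Binomial(n, 0.5).
    """
    n = only_a + only_b
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical paired results for 8 cases (True = correct call).
    unassisted = [True, False, False, True, False, True, False, False]
    assisted = [True, True, True, True, True, True, False, True]
    b, c = discordant_counts(unassisted, assisted)
    print(b, c, mcnemar_exact(b, c))
```

With small discordant counts the exact test can only take a limited set of values; for example, 0 versus 6 discordant pairs gives 2/64 = 0.03125 and 1 versus 8 gives 20/512 ≈ 0.039, figures of the same form as several P values in the table.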
Figure 4. ROC curves showing the diagnostic performance (A) for invasive adenocarcinoma discrimination in prospective validation by SSNet and practicing doctors (B). ROC, receiver-operating characteristic; AUC, area under ROC curve.