E J Yates (1), L C Yates (2), H Harvey (3).
(1) Foundation Doctor, West Midlands, England, UK. Electronic address: elliotyatesj@gmail.com.
(2) Foundation Doctor, West Midlands, England, UK.
(3) Kheiron Medical Technologies, Rocketspace, 40 Islington High St, London N1 8EQ, UK.
Abstract
AIM: To develop a machine learning-based model for the binary classification of chest radiography abnormalities, to serve as a retrospective tool for guiding clinicians' reporting prioritisation.
MATERIALS AND METHODS: The open-source machine learning library TensorFlow was used to retrain the final layer of the deep convolutional neural network Inception to perform binary normality classification on two anonymised, public image datasets. Retraining was performed on 47,644 images using commodity hardware, with validation testing on 5,505 previously unseen radiographs. Confusion matrix analysis was performed to derive diagnostic utility metrics.
RESULTS: A final model accuracy of 94.6% (95% confidence interval [CI]: 94.3-94.7%) was obtained on the unseen testing subset (n=5,505), yielding a sensitivity of 94.6% (95% CI: 94.4-94.7%) and a specificity of 93.4% (95% CI: 87.2-96.9%), with a positive predictive value (PPV) of 99.8% (95% CI: 99.7-99.9%) and an area under the curve (AUC) of 0.98 (95% CI: 0.97-0.99).
CONCLUSION: This study demonstrates the application of a machine learning-based approach to classify chest radiographs as normal or abnormal. Its application to real-world datasets may be warranted to help optimise clinician workload.
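The confusion matrix analysis described in the methods can be illustrated with a minimal Python sketch. The abstract does not report the underlying confusion matrix, which class was treated as positive, or how the confidence intervals were computed, so everything here is an assumption: the counts are hypothetical placeholders, the function names `diagnostic_metrics` and `wilson_interval` are invented for illustration, and the Wilson score interval is only one common choice for a binomial proportion.

```python
import math


def diagnostic_metrics(tp, fp, tn, fn):
    """Derive standard diagnostic utility metrics from confusion matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the abstract does not state which interval method was used.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts for illustration only; these are NOT the study's data.
counts = {"tp": 90, "fp": 20, "tn": 80, "fn": 10}
metrics = diagnostic_metrics(**counts)

# Specificity is estimated from the negative cases only (tn + fp), so a
# small negative class produces a wide interval.
spec_low, spec_high = wilson_interval(counts["tn"], counts["tn"] + counts["fp"])

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"specificity 95% CI (Wilson): {spec_low:.3f}-{spec_high:.3f}")
```

Because each interval depends on the number of cases in its own denominator, a metric estimated from a small class has a wider interval than one estimated from a large class. This is consistent with the specificity interval in the results being noticeably wider than the sensitivity interval.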