Matthew Seidler1, Behzad Forghani2, Caroline Reinhold1,2, Almudena Pérez-Lara1, Griselda Romero-Sanchez1, Nikesh Muthukrishnan3, Julian L Wichmann4, Gabriel Melki3, Eugene Yu5, Reza Forghani1,2,3,6,7.
Abstract
PURPOSE: To determine whether machine learning-assisted texture analysis of multi-energy virtual monochromatic image (VMI) datasets from dual-energy CT (DECT) can be used to differentiate metastatic head and neck squamous cell carcinoma (HNSCC) lymph nodes from lymphoma, inflammatory, or normal lymph nodes.
Keywords: AUC, Area under the receiver operating curve; Artificial intelligence; DECT, Dual-energy CT; Dual energy CT; HNSCC, Head and neck squamous cell carcinoma; Lymph nodes; Machine learning; NPV, Negative predictive value; PPV, Positive predictive value; Radiomics; Texture analysis; VMI, Virtual monochromatic image
Year: 2019 PMID: 31406557 PMCID: PMC6682309 DOI: 10.1016/j.csbj.2019.07.004
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Flowchart of patient and node selection for texture analysis, with inclusion and exclusion criteria. For HNSCC, only patients with pathological confirmation based on lymph node dissection, either positive or negative, were included. Because lymphoma patients or patients with inflammatory disease do not undergo cervical lymph node dissection as part of their treatment, the above strict criteria were used for selection of nodes in these patient groups (please see text and Supplemental Data for additional details). Two groups of controls were used: (1) pathologically proven benign nodes from HNSCC patients (group 1, controls for metastatic HNSCC nodes) and (2) nodes from normal neck scans of patients without history of cancer or other systemic disease (group 2, controls for lymphoma and inflammatory groups).
Fig. 2. Flowchart of node selection and groupings for construction of prediction models using machine learning. Because of the large imbalance between the number of nodes in the control groups compared to metastatic HNSCC or inflammatory lymph node groups, for pairwise comparison of these groups and controls, a maximum of 4 lymph nodes per control patient were randomly selected for use by the machine learning algorithm. This resulted in a subset of 39 nodes for control group 1 and 40 nodes for control group 2. For all other groupings and comparisons, the full complement of control nodes was used.
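The per-patient cap described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `(patient_id, node_id)` representation, the fixed seed, and the example data are all assumptions made for the sketch.

```python
import random
from collections import defaultdict


def cap_nodes_per_patient(nodes, max_per_patient=4, seed=0):
    """Randomly keep at most `max_per_patient` nodes for each patient.

    `nodes` is a list of (patient_id, node_id) pairs. Both identifiers are
    hypothetical placeholders, not the study's actual data layout.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_patient = defaultdict(list)
    for patient_id, node_id in nodes:
        by_patient[patient_id].append(node_id)

    selected = []
    for patient_id in sorted(by_patient):
        candidates = by_patient[patient_id]
        k = min(max_per_patient, len(candidates))
        # rng.sample draws k distinct nodes without replacement
        for node_id in rng.sample(candidates, k):
            selected.append((patient_id, node_id))
    return selected


# Made-up example: patient "A" has 7 nodes, "B" has 2, "C" has 4.
nodes = (
    [("A", i) for i in range(7)]
    + [("B", i) for i in range(2)]
    + [("C", i) for i in range(4)]
)
subset = cap_nodes_per_patient(nodes)
print(len(subset))  # 4 + 2 + 4 = 10
```

Patients with four or fewer nodes contribute all of them, so only patients with many nodes are downsampled, which is what reduces the imbalance against the smaller disease groups.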
Prediction accuracy for distinction of metastatic HNSCC from normal nodes.
| Dataset | ML approach | Acc | Sens | Spec | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|
| Training set | RF | 48/50 (96%) | 21/22 (95%) | 27/28 (96%) | 21/22 (95%) | 27/28 (96%) | 0.95 |
| Training set | GBM | 49/50 (98%) | 21/22 (95%) | 28/28 (100%) | 21/21 (100%) | 28/29 (97%) | 1.00 |
| Testing set | RF | 17/20 (85%; 69, 100) | 8/9 (89%; 68, 100) | 9/11 (82%; 59, 100) | 8/10 (80%; 55, 100) | 9/10 (90%; 71, 100) | 0.97 (0.89, 1.00) |
| Testing set | GBM | 18/20 (90%; 77, 100) | 8/9 (89%; 68, 100) | 10/11 (91%; 74, 100) | 8/9 (89%; 68, 100) | 10/11 (91%; 74, 100) | 0.96 (0.87, 1.00) |
Models constructed are based on analysis of VMIs at 65 keV using texture data extracted from the first set of independent contours.
For the testing sets (prediction models), the lower and upper limits of the 95% confidence interval are provided after the percentage value or the AUC.
ML: Machine Learning; RF: Random Forests; GBM: Gradient Boosting Machine.
Fig. 3. Example of ROC curve analysis of the diagnostic performance of A, Random Forests (RF) and B, Gradient Boosting Machine (GBM) method for distinguishing metastatic HNSCC from benign lymph nodes. The testing (prediction) sets (red) perform similarly to the training sets (blue), as expected, suggesting that the models are reliable and that there is no overfitting. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
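For readers unfamiliar with the AUC values summarised by these ROC curves, the following sketch computes AUC directly from its rank interpretation: the probability that a randomly chosen positive node receives a higher classifier score than a randomly chosen negative node. The function name and the scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (probability of "metastatic") per node.
metastatic = [0.9, 0.8, 0.7, 0.4]
benign = [0.6, 0.3, 0.2, 0.1]

auc = roc_auc(metastatic, benign)
print(auc)  # 15 of 16 pairs are ranked correctly
```

An AUC of 1.00 therefore means every positive node outscored every negative node, and 0.5 corresponds to chance-level ranking.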
Prediction accuracy for distinction of lymphoma from normal nodes.
| Dataset | ML approach | Acc | Sens | Spec | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|
| Training set | RF | 129/146 (88%) | 38/46 (83%) | 91/100 (91%) | 38/47 (81%) | 91/99 (92%) | 0.95 |
| Training set | GBM | 138/146 (95%) | 42/46 (91%) | 96/100 (96%) | 42/46 (91%) | 96/100 (96%) | 0.99 |
| Testing set | RF | 55/61 (90%; 83, 98) | 19/19 (100%; 100, 100) | 36/42 (86%; 75, 96) | 19/25 (76%; 59, 93) | 36/36 (100%; 100, 100) | 0.95 (0.90, 1.00) |
| Testing set | GBM | 57/61 (93%; 87, 100) | 19/19 (100%; 100, 100) | 38/42 (90%; 82, 99) | 19/23 (83%; 67, 98) | 38/38 (100%; 100, 100) | 0.96 (0.91, 1.00) |
Models constructed are based on analysis of VMIs at 65 keV using texture data extracted from the first set of independent contours.
For the testing sets (prediction models), the lower and upper limits of the 95% confidence interval are provided after the percentage value or the AUC.
ML: Machine Learning; RF: Random Forests; GBM: Gradient Boosting Machine.
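The source does not state how the 95% confidence intervals were computed. As an illustration only, the sketch below uses the normal-approximation (Wald) interval for a proportion, truncated to the range 0 to 100%. That choice is an assumption, although it happens to reproduce the RF testing-set intervals reported for the lymphoma comparison. The function names are hypothetical.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Proportion with a Wald interval, clipped to [0, 1].

    Assumption: the Wald method is used here for illustration; the source
    does not name the interval method.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def as_percent(triple):
    """Round (estimate, lower, upper) to whole percentages."""
    return tuple(round(100 * v) for v in triple)


# RF testing-set counts from the lymphoma vs. normal comparison.
print(as_percent(wald_ci(55, 61)))  # accuracy
print(as_percent(wald_ci(19, 19)))  # sensitivity
print(as_percent(wald_ci(36, 42)))  # specificity
```

Note that the Wald interval collapses to a single point when the observed proportion is 100%, which would explain entries such as "100%; 100, 100". Interval methods designed for small samples or extreme proportions (for example Wilson or exact intervals) would give a wider range in that case.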
Prediction accuracy for distinction of inflammatory from normal nodes.
| Dataset | ML approach | Acc | Sens | Spec | PPV | NPV | AUC |
|---|---|---|---|---|---|---|---|
| Training set | RF | 44/49 (90%) | 18/21 (86%) | 26/28 (93%) | 18/20 (90%) | 26/29 (90%) | 0.97 |
| Training set | GBM | 49/49 (100%) | 21/21 (100%) | 28/28 (100%) | 21/21 (100%) | 28/28 (100%) | 1.00 |
| Testing set | RF | 16/20 (80%; 62, 98) | 7/8 (88%; 65, 100) | 9/12 (75%; 50, 100) | 7/10 (70%; 42, 98) | 9/10 (90%; 71, 100) | 0.97 (0.89, 1.00) |
| Testing set | GBM | 16/20 (80%; 62, 98) | 7/8 (88%; 65, 100) | 9/12 (75%; 50, 100) | 7/10 (70%; 42, 98) | 9/10 (90%; 71, 100) | 0.96 (0.87, 1.00) |
Models constructed are based on analysis of VMIs at 65 keV using texture data extracted from the first set of independent contours.
For the testing sets (prediction models), the lower and upper limits of the 95% confidence interval are provided after the percentage value or the AUC.
ML: Machine Learning; RF: Random Forests; GBM: Gradient Boosting Machine.
Prediction accuracy for lymph node classification as malignant versus benign.
| Dataset | Acc | Sens | Spec | PPV | NPV |
|---|---|---|---|---|---|
| Training set | 155/170 (91%) | 95/102 (93%) | 60/68 (88%) | 95/103 (92%) | 60/67 (90%) |
| Testing set | 65/71 (92%; 85, 98) | 39/43 (91%; 82, 99) | 26/28 (93%; 83, 100) | 39/41 (95%; 89, 100) | 26/30 (87%; 75, 99) |
Models were constructed with Random Forests from analysis of VMIs at 65 keV, using texture data extracted from the first set of independent contours of histopathology-proven benign lymph nodes (negative node dissection in HNSCC patients), metastatic HNSCC nodes, and lymphoma nodes.
For the testing sets (prediction models), the lower and upper limits of the 95% confidence interval are provided after the percentage value.
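The five metrics in this table all follow from a single 2×2 confusion matrix. The sketch below shows the relationships, using confusion counts inferred from the testing-set fractions above (39 true positives, 4 false negatives, 26 true negatives, 2 false positives). The function name is hypothetical, and the counts are derived from the reported fractions rather than stated directly in the source.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return each metric as a (numerator, denominator) pair."""
    return {
        "Acc": (tp + tn, tp + fn + tn + fp),  # all correct / all nodes
        "Sens": (tp, tp + fn),                # detected / truly malignant
        "Spec": (tn, tn + fp),                # cleared / truly benign
        "PPV": (tp, tp + fp),                 # true malignant / called malignant
        "NPV": (tn, tn + fn),                 # true benign / called benign
    }


# Counts inferred from the malignant-versus-benign testing-set row.
metrics = diagnostic_metrics(tp=39, fn=4, tn=26, fp=2)
for name, (num, den) in metrics.items():
    print(f"{name}: {num}/{den} ({100 * num / den:.0f}%)")
```

Sensitivity and specificity share denominators with the true class sizes (43 malignant, 28 benign), whereas PPV and NPV use the numbers of nodes the model called malignant (41) or benign (30), which is why the same numerators appear over different denominators.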
Comparison of testing (prediction) model performance using texture data extracted from manual contouring by two different readers.
| ML approach | Testing paradigm | Acc (first reader) | Acc (second reader) | Cohen's kappa coefficient (κ) |
|---|---|---|---|---|
| RF | Distinction of metastatic HNSCC from normal nodes | 17/20 (85%; 69, 100) | 19/20 (95%; 85, 100) | 0.8 |
| RF | Distinction of lymphoma from normal nodes | 55/61 (90%; 83, 98) | 55/61 (90%; 83, 98) | 0.86 |
| RF | Distinction of inflammatory from normal nodes | 16/20 (80%; 62, 98) | 17/20 (85%; 69, 100) | 0.9 |
| RF | Lymph node classification as malignant versus benign | 65/71 (92%; 85, 98) | 66/71 (93%; 87, 99) | 0.91 |
| GBM | Distinction of metastatic HNSCC from normal nodes | 18/20 (90%; 77, 100) | 18/20 (90%; 77, 100) | 0.8 |
| GBM | Distinction of lymphoma from normal nodes | 57/61 (93%; 87, 100) | 56/61 (92%; 85, 99) | 0.96 |
| GBM | Distinction of inflammatory from normal nodes | 16/20 (80%; 62, 98) | 18/20 (90%; 77, 100) | 0.8 |
Models constructed are based on analysis of VMIs at 65 keV.
For the testing sets (prediction models), the lower and upper limits of the 95% confidence interval are provided after the percentage value.
RF: Random Forests; GBM: Gradient Boosting Machine.
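Cohen's kappa measures agreement between two sets of categorical labels after correcting for the agreement expected by chance. The sketch below assumes that kappa here compares the class predicted for each node from the first reader's contours with the class predicted from the second reader's contours; the source table does not spell this out. The function name and the label lists are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of categorical labels.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both lists contain a single identical class.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: product of marginal class frequencies, summed over classes.
    classes = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in classes) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical predictions for 20 nodes (1 = abnormal, 0 = normal).
# The two lists disagree on exactly 2 nodes.
from_reader_1 = [1] * 10 + [0] * 10
from_reader_2 = [1] * 9 + [0] + [0] * 9 + [1]

kappa = cohens_kappa(from_reader_1, from_reader_2)
print(round(kappa, 2))
```

In this made-up example the raw agreement is 18/20 (90%) and chance agreement is 50%, so kappa is (0.9 − 0.5) / (1 − 0.5) = 0.8.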