Andrew Lagree1,2,3,4, Audrey Shiner1,2,4, Marie Angeli Alera1,2,4, Lauren Fleshner1,2,4, Ethan Law1,2,4, Brianna Law1,2,4, Fang-I Lu4,5, David Dodington5, Sonal Gandhi4,6, Elzbieta A Slodkowska5, Alex Shenfield7, Katarzyna J Jerzak6, Ali Sadeghi-Naini1,8, William T Tran1,2,3,4,9.
Abstract
BACKGROUND: Evaluating histologic grade is standard in breast cancer diagnosis and is associated with prognostic outcomes. Current challenges include the time required for manual microscopic evaluation and interobserver variability. This study proposes a computer-aided diagnostic (CAD) pipeline for grading tumors using artificial intelligence.
Keywords: Nottingham grade; biopsy; breast cancer; computational oncology; imaging biomarkers; tumor
Year: 2021 PMID: 34898544 PMCID: PMC8628688 DOI: 10.3390/curroncol28060366
Source DB: PubMed Journal: Curr Oncol ISSN: 1198-0052 Impact factor: 3.677
Figure 1. Nottingham grade classification pipeline. (a) (i) A representative H&E stained CNB section is first tiled, followed by stain normalization (ii). The tiles are then used as input to a CNN (modified VGG19), which predicts the tumor bed probabilities (iii). A heatmap is generated using the tumor bed probabilities (iv). Tiles from the tumor bed are then used as input for the Mask R-CNN, which segments the malignant nuclei (v). (b) Spatial and clinical features are extracted. Spatial features were extracted using the centroids of the segmented nuclei and included density features (vi), graph features (vii), and nuclei count. Clinicopathological features included patient age (years) and receptor status (ER, PR, HER2). (c) Separate machine learning models were trained for spatial and clinical features. The clinical and spatial models were then combined to create an ensemble model. The ensemble model was evaluated on the hold-out (test) set.
Clinicopathological characteristics of the patients with G1, 2 and G3 breast cancer tumors. Bolded values represent statistical significance (p < 0.05). Abbreviations: G1, 2, Nottingham grade 1 and 2; G3, Nottingham grade 3; SD, standard deviation; y, years; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2.
| Patient Clinicopathological Characteristics | Study Cohort | | p-value |
|---|---|---|---|
| | G1, 2 | G3 | |
| Age | |||
| Mean Age ± SD (y) | 51.6 ± 10.8 | 50.3 ± 9.2 | 0.423 |
| ≤50 years | 23 (39.7) | 39 (48.8) | 0.289 |
| >50 years | 35 (60.3) | 41 (51.3) | |
| Menopausal Status | |||
| Pre | 30 (51.7) | 38 (47.5) | 0.624 |
| Post | 28 (48.3) | 42 (52.5) | |
| Tumor Laterality | |||
| Left | 25 (43.1) | 38 (47.5) | 0.609 |
| Right | 33 (56.9) | 42 (52.5) | |
| Receptor Status | |||
| Median ER ± SD (%) | 90 ± 44.1 | 0 ± 43.9 | |
| ER-positive | 43 (74.1) | 34 (42.5) | |
| ER-negative | 15 (25.9) | 46 (57.5) | |
| Median PR ± SD (%) | 4 ± 42.8 | 0 ± 36.3 | |
| PR-positive | 35 (60.3) | 30 (37.5) | |
| PR-negative | 23 (39.7) | 50 (62.5) | |
| HER2-positive | 26 (44.8) | 39 (48.8) | 0.649 |
| HER2-negative | 32 (55.2) | 41 (51.3) | |
| Tumor Size | | | |
| Mean Size ± SD (mm) | 48.3 ± 27.8 | 44.5 ± 25.1 | 0.445 |
| Clinical T Stage | |||
| 1 | 5 (8.6) | 4 (5.0) | 0.314 |
| 2 | 32 (55.2) | 54 (67.5) | |
| 3 | 21 (36.2) | 22 (27.5) | |
| 4 | 0 (0.0) | 0 (0.0) | |
| Clinical N Stage | |||
| 0 | 12 (20.7) | 28 (35.0) | 0.183 |
| 1 | 40 (69.0) | 46 (57.5) | |
| 2 | 6 (10.3) | 6 (7.5) | |
| 3 | 0 (0.0) | 0 (0.0) | |
| Node Status | |||
| Node-positive | 46 (79.3) | 52 (65.0) | 0.067 |
| Node-negative | 12 (20.7) | 28 (35.0) | |
| Inflammatory Breast Cancer | |||
| Yes | 5 (8.6) | 8 (10.0) | 0.784 |
| No | 53 (91.4) | 72 (90.0) | |
Figure 2. Instance segmentation of malignant nuclei by Mask regional convolutional neural network (Mask R-CNN) and representative feature extraction. (a,b) Mask R-CNN performance, evaluated on the hold-out (test) set. The highest, median, and lowest scoring AJI images from the Post-NAT-BRCA dataset are displayed. The predicted cells are color-coded such that green denotes true-positive, blue false-positive, and red false-negative pixels. Average precision over ten intersection over union (IoU) thresholds is also displayed. (c) Representative H&E images from five patients and their respective malignant nuclei masks are displayed. (d) The Delaunay triangulation features, Voronoi diagram features, and density features were calculated using the centroids of the segmented malignant nuclei. Abbreviations: H&E, hematoxylin and eosin; AJI, Aggregated Jaccard Index.
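As a rough illustration of how graph and density features can be derived from nucleus centroids, the sketch below builds a minimum spanning tree (MST) with Prim's algorithm and counts neighbors within a radius. It is not the authors' implementation: the function names, the example centroids, and the radius are hypothetical, and only two of the feature families named in this record (MST branch statistics and a neighbors-in-distance density measure) are shown.

```python
import math
from statistics import pstdev


def mst_edge_lengths(points):
    """Return the branch lengths of the Euclidean MST (Prim's algorithm, O(n^2))."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from the tree to each node
    best[0] = 0.0          # start the tree at the first centroid
    lengths = []
    for step in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            lengths.append(best[u])
        # Relax distances of the remaining nodes against the new tree node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def neighbors_within(points, radius):
    """For each centroid, count the other centroids lying within `radius`."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def summarize(values):
    """Min, max, min/max ratio, and population standard deviation of a feature."""
    lo, hi = min(values), max(values)
    return {"min": lo, "max": hi, "min_max_ratio": lo / hi, "stddev": pstdev(values)}


# Hypothetical centroids (pixel coordinates) standing in for segmented nuclei.
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
branches = mst_edge_lengths(centroids)
mst_summary = summarize(branches)
density = neighbors_within(centroids, radius=1.5)
```

Delaunay and Voronoi features would need a computational geometry library; they are left out here to keep the sketch dependency-free.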
Figure 3. Combined box and whisker and swarm plots of the statistically significant (p < 0.05) spatial features. Abbreviations: stddev, standard deviation; Min, minimum; Max, maximum; MST, Minimum Spanning Tree; a.u., Arbitrary units; G1, 2, Nottingham grade 1 and 2; G3, Nottingham grade 3.
Most frequently occurring spatial and clinical feature sets. One hundred iterations of sequential forward feature selection were performed per model. The most frequently occurring clinical and spatial feature sets are reported. Abbreviations: K-NN, K-nearest neighbor; LR, logistic regression; RF, random forest classifier; SVM, support vector machine; XGBoost, extreme gradient boost; #, number; ρ, Density; V, Voronoi; D, Delaunay; MST, Minimum Spanning Tree; Med, median; dist, distance; ƒ, frequency.
| Model | Feature Type | Selected Features (in feature-index order) | ƒ |
|---|---|---|---|
| Naïve Bayes | Spatial | V max dist disorder; MST branches min max ratio | 33 |
| | Clinical | | |
| K-NN | Spatial | # of nuclei; V max dist disorder; ρ neighbors in dist 1 disorder; ρ neighbors in dist 4 stddev; ρ min | 16 |
| | Clinical | Age (years); ER (%); PR (%); HER2 status | 63 |
| LR | Spatial | V max dist stddev; V max dist disorder; MST branches min max ratio; ρ neighbors in dist 1 disorder; ρ dist for neighbors 2 min max ratio | 43 |
| | Clinical | ER (%); PR (%) | 81 |
| RF | Spatial | # of nuclei; V max dist stddev; V max dist disorder; ρ neighbors in dist 1 disorder; ρ neighbors in dist 4 stddev; ρ dist for neighbors 2 min max ratio; ρ dist for neighbors 2 disorder; ρ min; ρ med | 15 |
| | Clinical | PR (%); HER2 status | 62 |
| SVM | Spatial | # of nuclei; V max dist disorder; MST branches min max ratio; ρ neighbors in dist 1 disorder; ρ neighbors in dist 4 stddev; ρ dist for neighbors 2 disorder; ρ min; ρ med | 25 |
| | Clinical | Age (years); ER (%); PR (%); HER2 status | 84 |
| XGBoost | Spatial | # of nuclei; V max dist stddev; V max dist disorder; MST branches min max ratio; ρ neighbors in dist 1 disorder; ρ neighbors in dist 4 stddev; ρ dist for neighbors 2 min max ratio; ρ dist for neighbors 2 disorder; ρ min; ρ med | 7 |
| | Clinical | ER (%); PR (%) | 22 |
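The frequency ƒ in the table above counts how often a feature set was selected across repeated runs of sequential forward selection. The sketch below shows one way such a count could be produced. It is an illustration under stated assumptions, not the study's code: `forward_select`, `most_frequent_set`, and `toy_score` are hypothetical names, the stopping rule (stop when no candidate improves the score) is assumed, and `toy_score` is a made-up stand-in for a cross-validated model score that varies with the random split.

```python
from collections import Counter


def forward_select(candidates, score, seed):
    """Greedy forward selection: add the best feature until the score stops improving."""
    selected, best_score = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score each candidate when added to the current selection.
        trial = {f: score(selected + [f], seed) for f in remaining}
        best_f = max(trial, key=trial.get)
        if trial[best_f] <= best_score:
            break  # no remaining feature improves the score
        selected.append(best_f)
        remaining.remove(best_f)
        best_score = trial[best_f]
    return frozenset(selected)


def most_frequent_set(candidates, score, n_iterations=100):
    """Repeat selection n_iterations times; return (most common set, its frequency)."""
    counts = Counter(forward_select(candidates, score, s) for s in range(n_iterations))
    return counts.most_common(1)[0]


def toy_score(features, seed):
    """Made-up scoring function (assumption): stands in for a cross-validated AUC."""
    useful = {"ER (%)": 0.10, "PR (%)": 0.05}
    gain = sum(useful.get(f, -0.01) for f in features)
    # Pretend that on every fourth split, age happens to look informative.
    if seed % 4 == 0 and "Age (years)" in features:
        gain += 0.03
    return 0.5 + gain


clinical = ["Age (years)", "ER (%)", "PR (%)", "HER2 status"]
best_set, frequency = most_frequent_set(clinical, toy_score)
```

With this toy scorer, the pair ER (%) and PR (%) is the most frequent set; the real frequencies depend on the data and on how each iteration was randomized, which this record does not state.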
Performance measures of machine learning models trained using clinical and spatial feature sets. All models were trained using 10-fold cross-validation and tested on an independent hold-out set. The three highest performing ensemble models are reported. All performance measures are reported at the patient level. Abbreviations: K-NN, K-nearest neighbor; LR, logistic regression; RF, random forest classifier; SVM, support vector machine; XGBoost, extreme gradient boost; AUC, area under the curve; SD, standard deviation; ACC, accuracy; Sn, sensitivity; Sp, specificity; Prev, prevalence; FNR, false-negative rate; FPR, false-positive rate; PPV, positive predictive value; NPV, negative predictive value; FDR, false discovery rate; FOR, false omission rate; LR+, positive likelihood ratio; LR-, negative likelihood ratio; DOR, diagnostic odds ratio; f1, F1 score.
| Feature Set | Model | Training Set | | Testing Set | | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Mean AUC ± SD | Mean ACC ± SD (%) | AUC | Acc (%) | Sn (%) | Sp (%) | Prev (%) | FNR (%) | FPR (%) | PPV (%) | NPV (%) | FDR (%) | FOR (%) | LR+ | LR- | DOR | f1 |
| Clinical | K-NN | 0.66 ± 0.16 | 62 ± 14 | 0.62 | 64.29 | 75.00 | 50.00 | 57.14 | 25.00 | 50.00 | 66.67 | 60.00 | 33.33 | 40.00 | 1.50 | 0.50 | 3.00 | 0.71 |
| | LR | 0.66 ± 0.22 | 64 ± 19 | 0.77 | 73.81 | 75.00 | 72.22 | 57.14 | 25.00 | 27.78 | 78.26 | 68.42 | 21.74 | 31.58 | 2.70 | 0.35 | 7.80 | 0.77 |
| | RF | 0.68 ± 0.15 | 59 ± 14 | 0.56 | 66.67 | 83.33 | 44.44 | 57.14 | 16.67 | 55.56 | 66.67 | 66.67 | 33.33 | 33.33 | 1.50 | 0.38 | 4.00 | 0.74 |
| | SVM | 0.64 ± 0.25 | 52 ± 16 | 0.50 | 66.67 | 75.00 | 55.56 | 57.14 | 25.00 | 44.44 | 69.23 | 62.50 | 30.77 | 37.50 | 1.69 | 0.45 | 3.75 | 0.72 |
| | XGBoost | 0.63 ± 0.23 | 59 ± 16 | 0.77 | 73.81 | 75.00 | 72.22 | 57.14 | 25.00 | 27.78 | 78.26 | 68.42 | 21.74 | 31.58 | 2.70 | 0.35 | 7.80 | 0.77 |
| Spatial | Naïve Bayes | 0.65 ± 0.07 | 59 ± 6 | 0.68 | 64.29 | 87.50 | 33.33 | 57.14 | 12.50 | 66.67 | 63.64 | 66.67 | 36.36 | 33.33 | 1.31 | 0.38 | 3.50 | 0.74 |
| | K-NN | 0.87 ± 0.03 | 76 ± 3 | 0.64 | 66.67 | 66.67 | 66.67 | 57.14 | 33.33 | 33.33 | 72.73 | 60.00 | 27.27 | 40.00 | 2.00 | 0.50 | 4.00 | 0.70 |
| | LR | 0.67 ± 0.04 | 62 ± 5 | 0.73 | 66.67 | 62.50 | 72.22 | 57.14 | 37.50 | 27.78 | 75.00 | 59.09 | 25.00 | 40.91 | 2.25 | 0.52 | 4.33 | 0.68 |
| | RF | 0.88 ± 0.04 | 79 ± 5 | 0.75 | 64.29 | 79.17 | 44.44 | 57.14 | 20.83 | 55.56 | 65.52 | 61.54 | 34.48 | 38.46 | 1.43 | 0.47 | 3.04 | 0.72 |
| | SVM | 0.79 ± 0.06 | 77 ± 5 | 0.69 | 69.05 | 75.00 | 61.11 | 57.14 | 25.00 | 38.89 | 72.00 | 64.71 | 28.00 | 35.29 | 1.93 | 0.41 | 4.71 | 0.73 |
| | XGBoost | 0.88 ± 0.03 | 79 ± 3 | 0.78 | 71.43 | 87.50 | 50.00 | 57.14 | 12.50 | 50.00 | 70.00 | 75.00 | 30.00 | 25.00 | 1.75 | 0.25 | 7.00 | 0.78 |
| Ensemble | LR + RF | 0.96 ± 0.12 | 88 ± 14 | 0.84 | 78.57 | 83.33 | 72.22 | 57.14 | 16.67 | 27.78 | 80.00 | 76.47 | 20.00 | 23.53 | 3.00 | 0.23 | 13.00 | 0.82 |
| | LR + XGBoost | 0.70 ± 0.23 | 56 ± 14 | 0.84 | 73.81 | 75.00 | 72.22 | 57.14 | 25.00 | 27.78 | 78.26 | 68.42 | 21.74 | 31.58 | 2.70 | 0.35 | 7.80 | 0.77 |
| | XGBoost + RF | 0.96 ± 0.13 | 92 ± 13 | 0.83 | 73.81 | 87.50 | 55.56 | 57.14 | 12.50 | 44.44 | 72.41 | 76.92 | 27.59 | 23.08 | 1.97 | 0.23 | 8.75 | 0.79 |
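The test-set measures in the table above all follow from a single 2×2 confusion matrix. The sketch below shows those standard definitions, with G3 treated as the positive class. The counts used in the example (20 true positives, 4 false negatives, 13 true negatives, 5 false positives) are not stated in this record; they are an inference, chosen because they reproduce the reported LR + RF ensemble row (for example, Sn 83.33%, Sp 72.22%, Prev 57.14%, DOR 13.00). The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Diagnostic measures from a 2x2 confusion matrix.

    Assumes every denominator is non-zero (e.g. fp and fn both > 0 for DOR).
    """
    total = tp + fn + tn + fp
    sn = tp / (tp + fn)  # sensitivity (true-positive rate)
    sp = tn / (tn + fp)  # specificity (true-negative rate)
    return {
        "Acc": (tp + tn) / total,
        "Sn": sn,
        "Sp": sp,
        "Prev": (tp + fn) / total,
        "FNR": 1 - sn,
        "FPR": 1 - sp,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "FDR": fp / (tp + fp),
        "FOR": fn / (tn + fn),
        "LR+": sn / (1 - sp),
        "LR-": (1 - sn) / sp,
        "DOR": (tp * tn) / (fp * fn),
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Inferred counts (assumption): consistent with the LR + RF ensemble row.
metrics = diagnostic_metrics(tp=20, fn=4, tn=13, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the hold-out set appears to be small (these counts sum to 42 patients), a single reclassified patient shifts sensitivity by about four percentage points, which is worth keeping in mind when comparing rows.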
Figure 4. Receiver operating characteristic (ROC) curves and area under the curve (AUC) of the top performing machine learning models trained with clinical and spatial feature sets. (a left) ROC and AUC of XGBoost, the top performing classifier using clinical features. (a right) ROC and AUC of XGBoost, the top performing classifier using spatial features. (b–d) The top three performing ensemble models. (b) AUC vs. Threshold of LR+RF, with an optimal threshold of 37% (left). ROC and AUC of LR+RF (right). (c) AUC vs. Threshold of LR+XGBoost, with an optimal threshold of 10% (left). ROC and AUC of LR+XGBoost (right). (d) AUC vs. Threshold of XGBoost+RF, with an optimal threshold of 46% (left). ROC and AUC of XGBoost+RF (right). Abbreviations: ROC, Receiver Operating Characteristic curve; AUC, Area Under the Curve; LR, Logistic regression; RF, random forest classifier; XGBoost, Extreme Gradient Boost.
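This record does not say how the clinical and spatial models were combined or exactly what the reported thresholds act on. One common reading is soft voting: average the two models' predicted G3 probabilities and call a patient G3 when the average reaches the threshold. The sketch below illustrates that reading only; the function name, the equal weighting, and the example probabilities are assumptions, and the default threshold of 0.37 simply mirrors the 37% reported for LR+RF.

```python
def ensemble_predict(p_clinical, p_spatial, threshold=0.37, w_clinical=0.5):
    """Soft-voting sketch: weighted mean of two models' G3 probabilities, then threshold.

    Returns (combined probabilities, predicted labels) with 1 = G3 and 0 = G1, 2.
    """
    if len(p_clinical) != len(p_spatial):
        raise ValueError("both models must score the same patients")
    combined = [
        w_clinical * pc + (1.0 - w_clinical) * ps
        for pc, ps in zip(p_clinical, p_spatial)
    ]
    labels = [int(p >= threshold) for p in combined]
    return combined, labels


# Hypothetical per-patient probabilities from a clinical and a spatial model.
combined, labels = ensemble_predict([0.2, 0.6, 0.9], [0.4, 0.2, 0.7])
```

Lowering the threshold trades specificity for sensitivity, which is why an ensemble's operating point can differ from the default 50% cut-off.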