Benjamin Hinton^1,2, Lin Ma^3, Amir Pasha Mahmoudzadeh^4, Serghei Malkov^5, Bo Fan^6, Heather Greenwood^7, Bonnie Joe^7, Vivian Lee^8, Karla Kerlikowske^9, John Shepherd^10.
Abstract
BACKGROUND: To determine whether mammographic features derived from deep learning networks can identify groups of women at risk of interval invasive breast cancer due to masking, beyond what traditional breast density measures provide.
Keywords: Breast Cancer; Breast density; Deep learning; Interval Cancer; Mammography; Masking; Neural network; Transfer learning
Year: 2019 PMID: 31228956 PMCID: PMC6589178 DOI: 10.1186/s40644-019-0227-3
Source DB: PubMed Journal: Cancer Imaging ISSN: 1470-7330 Impact factor: 3.909
Fig. 1 Schematic of the architecture of the deep learning network used in this study. YxY conv, M/N = M kernels of size YxYx3 with stride length N (N = 1 if only M is listed). Fully Connected (FC) Layer = Dense (256), Dropout, Dense (1)
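The fully connected head described in the caption (Dense(256) → Dropout → Dense(1)) can be sketched as an inference-time forward pass. This is a minimal numpy illustration, not the authors' code: the incoming feature dimension, the ReLU activation on the hidden layer, and the sigmoid output are assumptions consistent with a standard binary-classification head.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """Affine layer: x @ w + b."""
    return x @ w + b

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fc_head(features, w1, b1, w2, b2):
    """FC head per Fig. 1: Dense(256) -> Dropout -> Dense(1).
    Dropout is active only during training; at inference it is a pass-through."""
    h = relu(dense(features, w1, b1))
    return sigmoid(dense(h, w2, b2))

# Hypothetical feature vectors from the convolutional base (dimension assumed).
feat_dim = 512
x = rng.normal(size=(4, feat_dim))
w1 = rng.normal(scale=0.05, size=(feat_dim, 256)); b1 = np.zeros(256)
w2 = rng.normal(scale=0.05, size=(256, 1)); b2 = np.zeros(1)

probs = fc_head(x, w1, b1, w2, b2)  # shape (4, 1), each value in (0, 1)
```

At training time the Dropout layer would randomly zero a fraction of the hidden activations (0.95 per the hyperparameter table below) and rescale the rest; at inference it is the identity, as modeled here.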
Descriptive statistics of the screen-detected and interval cancer groups. Percentages in each BI-RADS category are calculated excluding the missing/unknown groups
| | Screen-Detected Group | Interval Group | P-value |
|---|---|---|---|
| N | 173 | 182 | |
| Age, years (Standard Deviation) | 57.8 (10.9) | 56.8 (11.8) | 0.28 |
| BMI, kg/m2 (Standard Deviation) | 24.9 (4.7) | 23.5 (4.3) | < 0.0001 |
| Time to Detection, days (Standard Deviation) | 56.3 (81.4) | 239.8 (94.6) | < 0.0001 |
| Race: | 0.88 | ||
| White | 127 | 129 | |
| African American | 3 | 4 | |
| Chinese | 25 | 27 | |
| Filipina | 3 | 3 | |
| Hispanic | 0 | 2 | |
| Japanese | 5 | 8 | |
| Mixed | 5 | 5 | |
| Other Asian | 2 | 1 | |
| Other Non-Asian | 3 | 3 | |
| Menopausal status | 119 (69%) | 123 (68%) | 0.69 |
| Family history of breast cancer | 47 (23%) | 60 (33%) | 0.25 |
| Previous history of breast biopsy | 55 (32%) | 68 (37%) | 0.33 |
| BI-RADS Frequency: | 0.008 | ||
| A: Almost Entirely Fatty | 11 (7.8%) | 3 (1.8%) | |
| B: Scattered Fibroglandular Densities | 50 (35.5%) | 33 (19.7%) | |
| C: Heterogeneously Dense | 61 (43.3%) | 78 (46.7%) | |
| D: Extremely Dense | 19 (13.5%) | 53 (31.7%) | |
| Missing Data | 19 | 7 | |
| Unknown | 13 | 8 |
Chosen hyperparameters with brief descriptions. The hyperparameter sweep covered a feasible range for each hyperparameter; individual values were chosen to optimize training ability or to minimize overfitting, depending on the parameter
| Hyperparameter (Range) | Hyperparameter Type | Interpretation | Chosen Value |
|---|---|---|---|
| Rotation (0–90) | Data Augmentation | Range for a random rotation | 20 |
| Zoom (0–1) | Data Augmentation | Range for a random zoom | 0.5 |
| Shear (0–1) | Data Augmentation | Range for a random shear | 0.3 |
| Vertical/Horizontal Flip (Yes/No) | Data Augmentation | Random chance of flip in respective direction | Yes/Yes |
| Momentum (0–1) | Optimizer Parameter | Accelerates or dampens oscillations in a given direction | 0.3 |
| Regularization (0–1) | Optimizer Parameter | Penalty applied to large image weights | 0 |
| Decay (0–1) | Optimizer Parameter | Learning Rate decay over each update. | 1e-5 |
| Dropout (0–1) | Fully-connected Layer | Percent of weights dropped out between dense layers in the FC layer. | 0.95 |
| Learning Rate (0–1) | Training Parameter | Importance attributed to weight updates. | 1e-3 |
| Epochs (Integer) | Training Parameter | Number of epochs performed | 1000 |
| Batch Size (2^n, any n) | Training Parameter | Number of samples per gradient update | 16 |
| Image Size (Minimum 224) | Training Parameter | Input image size in pixels | 224 |
| nLayersRetrain (Fully Connected only – All Layers) | Training Parameter | Number of layers allowed to have their weights altered. | All Layers (173) |
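The chosen values in the table map naturally onto a training configuration plus a random-augmentation sampler. The sketch below is an assumption-laden illustration (the key names and the Keras-style augmentation semantics — symmetric rotation/shear ranges, zoom drawn from [1 − z, 1 + z], 50% flip chance — are mine, not the paper's):

```python
import random

# Chosen hyperparameter values, transcribed from the table above.
HPARAMS = {
    "rotation_range": 20,       # degrees, sampled symmetrically
    "zoom_range": 0.5,
    "shear_range": 0.3,
    "horizontal_flip": True,
    "vertical_flip": True,
    "momentum": 0.3,
    "regularization": 0.0,
    "decay": 1e-5,
    "dropout": 0.95,
    "learning_rate": 1e-3,
    "epochs": 1000,
    "batch_size": 16,
    "image_size": 224,
}

def sample_augmentation(hp, rng=random):
    """Draw one random augmentation, assuming Keras-style range semantics."""
    return {
        "rotation": rng.uniform(-hp["rotation_range"], hp["rotation_range"]),
        "zoom": rng.uniform(1 - hp["zoom_range"], 1 + hp["zoom_range"]),
        "shear": rng.uniform(-hp["shear_range"], hp["shear_range"]),
        "flip_h": hp["horizontal_flip"] and rng.random() < 0.5,
        "flip_v": hp["vertical_flip"] and rng.random() < 0.5,
    }

aug = sample_augmentation(HPARAMS)
```

Each training image would receive a fresh draw like `aug`, so the network rarely sees the exact same pixels twice — the usual rationale for augmentation when, as here, only a few hundred cases are available.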
Fig. 2 Loss and accuracy curves per epoch for the training and test sets of the deep learning network. The best test loss occurred at epoch 482; at that epoch, training loss and accuracy were 0.58 and 67.4%, and test loss and accuracy were 0.499 and 75.2%, respectively
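Selecting the model at the epoch with minimal test loss (epoch 482 in Fig. 2) is a one-liner; this small sketch, with invented toy losses, shows the selection rule:

```python
def best_epoch(test_losses):
    """Return the 1-indexed epoch with the lowest test loss."""
    return min(range(len(test_losses)), key=test_losses.__getitem__) + 1

# Toy loss curve (hypothetical values, not the paper's data):
chosen = best_epoch([0.70, 0.55, 0.60, 0.58])  # -> 2
```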
Fig. 3 ROC curves for interval vs. screen-detected cancer classification using BI-RADS density alone (Only BI-RADS), the deep learning predictions alone (Deep Learning), and both as predictors (Combined). Classification accuracy was 63% using BI-RADS density alone and 75% using deep learning alone
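The area under each ROC curve in Fig. 3 can be computed without any plotting library via the rank-sum (Mann–Whitney U) identity: AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. A sketch with made-up toy scores (label 1 = interval, label 0 = screen-detected):

```python
def auc(labels, scores):
    """ROC AUC via the rank-sum identity; ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 7 of the 9 positive/negative pairs are correctly ordered.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.2, 0.4, 0.5, 0.3, 0.7, 0.9]
toy_auc = auc(labels, scores)  # -> 7/9 ≈ 0.778
```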
Contingency table of the number of correctly and incorrectly classified images from the deep learning network
| Number (Percent) | Predicted Screened | Predicted Interval | Total |
|---|---|---|---|
| Actual Screened | 134/173 (77.4%) | 39/173 (22.5%) | 173 |
| Actual Interval | 48/182 (26.4%) | 134/182 (73.6%) | 182 |
| Total | 182 | 173 | 355 |
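The summary metrics follow directly from the four cells of the contingency table, treating interval cancer as the positive class:

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity from a 2x2 contingency table.
    Positive class = interval cancer."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,       # all correct / all cases
        "sensitivity": tp / (tp + fn),       # interval cases flagged as interval
        "specificity": tn / (tn + fp),       # screen-detected flagged as screened
    }

# Counts from the contingency table above:
# tp = 134 interval predicted interval, fn = 48 interval predicted screened,
# fp = 39 screened predicted interval, tn = 134 screened predicted screened.
m = binary_metrics(tp=134, fn=48, fp=39, tn=134)
# accuracy = 268/355 ≈ 0.755, sensitivity = 134/182 ≈ 0.736,
# specificity = 134/173 ≈ 0.775
```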
Fig. 4 Saliency maps of sample screen-detected and interval images (both correctly classified). For each row, the pseudo-presentation image is shown (left) along with the saliency map (middle), which highlights the pixels that carried more than 50% weight in classifying the image into its respective category (i.e., the first-row saliency map highlights weights that push toward a screen-detected classification). At right, the saliency map is overlaid on the image