Lothar Häberle, Carolin C Hack, Katharina Heusinger, Florian Wagner, Sebastian M Jud, Michael Uder, Matthias W Beckmann, Rüdiger Schulz-Wendtland, Thomas Wittenberg, Peter A Fasching.
Abstract
BACKGROUND: Tumors in radiologically dense breasts are overlooked on mammograms more often than tumors in low-density breasts. A fast, reproducible, and automated method of assessing percentage mammographic density (PMD) would be desirable to support decisions on whether ultrasonography should be provided in addition to mammography for women in diagnostic mammography units. PMD assessment has still not been included in routine clinical work because of interobserver variability and because the procedure is quite time-consuming. This study investigated whether fully automatically generated texture features of mammograms can replace time-consuming semi-automatic PMD assessment to predict a patient's risk of having an invasive breast tumor that is visible on ultrasound but masked on mammography (mammography failure).
Keywords: Mammographic density; Mammography screening; Masking; Risk prediction; Sensitivity; Texture analysis; Variable selection
Year: 2017; PMID: 28854966; PMCID: PMC5577694; DOI: 10.1186/s40001-017-0270-0
Source DB: PubMed; Journal: Eur J Med Res; ISSN: 0949-2321; Impact factor: 2.175
Patient characteristics in relation to mammography failure (yes/no)
| Characteristic | Visible on mammography and US: mean or frequency | Visible on mammography and US: SD or % | Visible only on US (mammography failure): mean or frequency | Visible only on US (mammography failure): SD or % |
|---|---|---|---|---|
| Age (years) | 60.2 | 12.5 | 52.5 | 12.1 |
| BMI (kg/m²) | 26.4 | 4.7 | 23.7 | 3.5 |
| PMD (%) | 34.5 | 18.3 | 51.3 | 20.5 |
| Previous breast surgery | | | | |
| No | 1080 | 87.4 | 79 | 80.6 |
| Yes | 156 | 12.6 | 19 | 19.4 |
| Menopausal and HRT status | | | | |
| Premenopausal | 269 | 21.8 | 46 | 46.9 |
| Postmenopausal and no HRT | 721 | 58.3 | 28 | 28.6 |
| Postmenopausal and HRT | 246 | 19.9 | 24 | 24.5 |
| Imaging technique | | | | |
| Analog | 761 | 61.6 | 55 | 56.1 |
| Digital | 475 | 38.4 | 43 | 43.9 |
Mean and standard deviation (SD) are shown for continuous characteristics, and frequency and percentage for categorical characteristics
BMI body mass index, HRT hormone replacement therapy, PMD percentage mammographic density, US ultrasonography
Prediction of PMD
| Method | MSE | R² | N |
|---|---|---|---|
| Univariate selection | 117.0 (8.6) | 0.67 (0.02) | 132.5 (9.7) |
| Lasso | 111.9 (8.4) | 0.69 (0.02) | 108.8 (12.9) |
| Boosting | 113.0 (8.6) | 0.68 (0.02) | 126.1 (8.8) |
| Random forest | 120.2 (9.7) | 0.66 (0.03) | –ᵃ |
Summary statistics (mean and standard deviation) of mean squared error (MSE) and R² obtained from (linear) regression models with selected features, as well as the number of selected features N, are shown. All measurements were obtained by 3-fold cross-validation with 100 repetitions
MSE mean squared error, PMD percentage mammographic density
ᵃ There was no variable selection with random forest
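The two accuracy measures reported in this table can be illustrated with a minimal sketch. It assumes R² is computed as one minus the ratio of MSE to the variance of the observed values, which is a common definition but is not stated in the source. The function name `mse_and_r2` and the toy values are hypothetical and are not study data.

```python
def mse_and_r2(observed, predicted):
    """Return (MSE, R^2) for paired observed and predicted PMD values.

    Assumption: R^2 = 1 - MSE / variance of the observed values.
    The source does not state which R^2 definition was used.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mean_obs = sum(observed) / n
    variance = sum((o - mean_obs) ** 2 for o in observed) / n
    return mse, 1.0 - mse / variance


# Toy values for illustration only (not taken from the study).
toy_observed = [20.0, 40.0, 60.0, 80.0]
toy_predicted = [25.0, 35.0, 65.0, 75.0]
toy_mse, toy_r2 = mse_and_r2(toy_observed, toy_predicted)
print(toy_mse, toy_r2)
```

In a 3-fold cross-validation, a function like this would be evaluated on each held-out fold and the results summarized over folds and repetitions.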
Fig. 1 Predicted and observed percentage mammographic density (PMD) values on a validation dataset (one-third of the patients), based on linear regression models fitted on a training dataset (two-thirds of the patients) using lasso (a) and boosting (b)
Selected texture features for predicting percentage mammographic density (PMD)
| Feature family | Number of features: all | Number of features: lassoᵃ | Number of features: boostingᵃ | Number of features: commonᵇ | Correlation with PMDᶜ: all | Correlation with PMDᶜ: lasso | Correlation with PMDᶜ: boosting |
|---|---|---|---|---|---|---|---|
| Fourier | 12 | 9 | 9 | 6 | 0.16 (0.03, 0.28) | 0.10 (0.03, 0.28) | 0.12 (0.03, 0.28) |
| Histogram | 14 | 13 | 13 | 12 | 0.18 (0.00, 0.25) | 0.19 (0.00, 0.25) | 0.17 (0.00, 0.25) |
| Markovian | 37 | 24 | 24 | 20 | 0.44 (0.00, 0.72) | 0.39 (0.00, 0.72) | 0.43 (0.00, 0.72) |
| Moment-based | 70 | 54 | 33 | 32 | 0.21 (0.00, 0.61) | 0.21 (0.00, 0.61) | 0.21 (0.00, 0.61) |
| Regional | 45 | 36 | 32 | 28 | 0.22 (0.01, 0.52) | 0.23 (0.03, 0.52) | 0.20 (0.01, 0.52) |
| Run length | 28 | 18 | 15 | 12 | 0.59 (0.04, 0.71) | 0.60 (0.15, 0.70) | 0.62 (0.04, 0.71) |
| Wavelet | 12 | 10 | 8 | 8 | 0.24 (0.01, 0.42) | 0.26 (0.06, 0.42) | 0.26 (0.06, 0.31) |
| Total | 218 | 164 | 134 | 118 | | | |
ᵃ Number of features selected by the lasso and boosting methods, respectively, to predict PMD. Prediction models were fitted on the complete dataset. The tuning parameters were estimated by cross-validation
ᵇ Number of features selected by both lasso and boosting
ᶜ Each feature was correlated with PMD. Summary statistics (median, minimum, maximum) of Spearman correlation coefficients between (all and selected) features and PMD are shown
Prediction of masking
| Method | MSE | AUC | NRI (%) | Reclassification: correctly upwards | Reclassification: correctly downwards |
|---|---|---|---|---|---|
| Nullᵃ | 0.0682 (0.0095) | 0.500 (0.000) | | | |
| Clinical findingsᵇ | 0.0657 (0.0085) | 0.734 (0.037) | | | |
| Univariate selectionᶜ | 0.0656 (0.0085) | 0.743 (0.036) | 27.9 (16.2) | 57.9 (9.1) | 56.1 (3.0) |
| Lassoᶜ | 0.0655 (0.0084) | 0.747 (0.036) | 33.1 (15.6) | 60.0 (8.7) | 56.6 (3.0) |
| Boostingᶜ | 0.0654 (0.0084) | 0.747 (0.036) | 32.5 (15.5) | 59.8 (8.6) | 56.5 (3.0) |
| Random forestᶜ | 0.0656 (0.0087) | 0.739 (0.035) | 4.4 (16.3) | 45.1 (9.0) | 57.1 (3.4) |
| Observed PMDᵈ | 0.0645 (0.0082) | 0.753 (0.036) | 35.7 (14.4) | 58.5 (8.2) | 59.4 (2.9) |
Summary statistics (mean and standard deviation) of MSE, AUC, and the net reclassification improvement (NRI) in percentages obtained from logistic regression models with clinical predictors and the observed or predicted PMD using various regression methods. All measurements were obtained by 3-fold cross-validation with 100 repetitions
AUC area under the curve, BMI body mass index, HRT hormone replacement therapy, MSE mean squared error, NRI net reclassification improvement, PMD percentage mammographic density
ᵃ Logistic regression model without any predictors
ᵇ Logistic regression model with clinical predictors (age, BMI, prior breast surgery, menopausal and HRT status, imaging technique) but without PMD
ᶜ Logistic regression model with clinical predictors and PMD predicted from texture features using univariate selection, lasso, boosting, or random forest
ᵈ Logistic regression model with clinical predictors and the original PMD values (“observed PMD”)
Discovery rates for three models and different cut-off points
| Cut-off point for predicted masking risk (%)ᵃ | Frequency above cut-off point (%)ᵇ | Discovery rate for tumors not seen on mammography (%): clinical modelᶜ | Discovery rate for tumors not seen on mammography (%): boosting PMD modelᵈ | Discovery rate for tumors not seen on mammography (%): observed PMD modelᵉ |
|---|---|---|---|---|
| 5 | 47.5 | 81.8 | 80.9 | 78.9 |
| 10 | 26.4 | 54.5 | 57.7 | 55.6 |
| 12 | 20.0 | 44.8 | 47.4 | 48.7 |
| 15 | 13.6 | 32.9 | 35.1 | 39.7 |
| 20 | 7.5 | 16.2 | 21.0 | 25.4 |
All measurements were obtained by 3-fold cross-validation with 100 repetitions
BMI body mass index, HRT hormone replacement therapy, PMD percentage mammographic density
ᵃ Patients were classified into a “high-risk” group if the prediction model assigned a masking risk above the cut-off point. Discovery rates are defined as the proportion of all masked tumors that fall into the “high-risk” group
ᵇ Proportion of patients classified as “high risk” in the total study population, using the boosting-based prediction model
ᶜ Logistic regression model with the clinical predictors age, BMI, previous breast surgery, menopausal and HRT status, and imaging technique
ᵈ Logistic regression model with the same clinical predictors and additionally PMD predicted by a boosting regression model beforehand
ᵉ Logistic regression model with the clinical predictors and the observed PMD
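The two quantities in this table can be sketched as follows. The sketch reads the discovery rate as the share of all masked tumors whose predicted masking risk lies above the cut-off point, which is the reading consistent with the reported values (for example, 47.5% of patients above the 5% cut-off but a discovery rate near 80%). The function name `discovery_summary` and the toy risks are hypothetical and are not study data.

```python
def discovery_summary(risks, masked, cutoff):
    """Return (frequency above cut-off, discovery rate) as fractions.

    risks  : predicted masking risks between 0 and 1, one per patient
    masked : booleans, True if the tumor was visible only on ultrasound
    cutoff : risk threshold as a fraction (0.10 for a 10% cut-off)
    """
    if len(risks) != len(masked) or not risks:
        raise ValueError("inputs must be non-empty and of equal length")
    high_risk = [r > cutoff for r in risks]
    frequency_above = sum(high_risk) / len(risks)
    n_masked = sum(masked)
    if n_masked == 0:
        raise ValueError("no masked tumors in the data")
    discovered = sum(1 for h, m in zip(high_risk, masked) if h and m)
    return frequency_above, discovered / n_masked


# Toy values for illustration only (not taken from the study).
toy_risks = [0.02, 0.06, 0.11, 0.16, 0.25, 0.04, 0.08, 0.30]
toy_masked = [False, False, True, False, True, False, True, True]
toy_freq, toy_rate = discovery_summary(toy_risks, toy_masked, cutoff=0.10)
print(toy_freq, toy_rate)
```

Raising the cut-off shrinks the “high-risk” group and lowers the discovery rate, which is the trade-off the table rows show.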
Fig. 2 Cross-validated receiver operating characteristic (ROC) curves, showing the discriminative value of logistic regression models, each with clinical predictors but with different percentage mammographic density (PMD) measures (without PMD, with observed PMD, and with predicted PMD using boosting)
Logistic regression model for predicting masking with predicted PMD based on boosting
| Variable | Coefficient (standard error) |
|---|---|
| Baseline | –0.906 (1.308) |
| Age (year) | –0.018 (0.014) |
| BMI (kg/m²) | –0.080 (0.033) |
| Previous breast surgery | |
| Noᵃ | 0 |
| Yes | 0.502 (0.286) |
| Menopausal and HRT status | |
| Premenopausalᵃ | 0 |
| Postmenopausal and no HRT | –0.530 (0.357) |
| Postmenopausal and HRT | 0.208 (0.355) |
| Imaging technique | |
| Analogᵃ | 0 |
| Digital | 0.416 (0.223) |
| Predicted PMD | 0.032 (0.009) |
The model was fitted on the complete dataset. To estimate a patient’s risk of masking, three steps are necessary: (1) texture feature values are calculated from the mammogram; (2) the boosting regression model is applied to obtain the predicted PMD; and (3) patient characteristics and predicted PMD are linearly combined with the logistic regression coefficients to obtain an interim value z. The predicted risk of masking is then exp(z)/(1 + exp(z))
ᵃ Reference category
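The last step of the procedure described under the table (linear combination followed by the logistic transform) can be sketched as below, using the coefficients from the table. This is an illustration only: the function name `masking_risk`, its argument names, and the status labels are hypothetical, and the predicted PMD must come from the boosting regression model, which is not reproduced here. The example call uses the Table 1 group means of the mammography-failure group as stand-in inputs, including the observed mean PMD in place of a predicted one.

```python
import math

# Coefficients copied from the table above (model with boosting-predicted PMD).
# Reference categories (no previous surgery, premenopausal, analog) contribute 0.
BASELINE = -0.906
COEF_AGE = -0.018            # per year
COEF_BMI = -0.080            # per kg/m^2
COEF_SURGERY_YES = 0.502
COEF_DIGITAL = 0.416
COEF_PREDICTED_PMD = 0.032   # per percentage point of predicted PMD
MENOPAUSAL_TERMS = {         # hypothetical labels for the three categories
    "premenopausal": 0.0,
    "postmenopausal_no_hrt": -0.530,
    "postmenopausal_hrt": 0.208,
}


def masking_risk(age, bmi, previous_surgery, menopausal_status,
                 digital, predicted_pmd):
    """Return the predicted masking risk as a fraction between 0 and 1."""
    if menopausal_status not in MENOPAUSAL_TERMS:
        raise ValueError(f"unknown menopausal status: {menopausal_status!r}")
    # Interim value z: linear combination of predictors and coefficients.
    z = (BASELINE
         + COEF_AGE * age
         + COEF_BMI * bmi
         + (COEF_SURGERY_YES if previous_surgery else 0.0)
         + MENOPAUSAL_TERMS[menopausal_status]
         + (COEF_DIGITAL if digital else 0.0)
         + COEF_PREDICTED_PMD * predicted_pmd)
    # Logistic transform, as given in the table note.
    return math.exp(z) / (1.0 + math.exp(z))


# Stand-in inputs (Table 1 means of the mammography-failure group).
example_risk = masking_risk(age=52.5, bmi=23.7, previous_surgery=False,
                            menopausal_status="premenopausal",
                            digital=False, predicted_pmd=51.3)
print(f"{example_risk:.3f}")
```

The resulting fraction can be compared with the cut-off points listed in the discovery-rate table (for example, 0.10 for the 10% cut-off) to decide whether a patient falls into the “high-risk” group.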