Ryan W Woods, Louis Oliphant, Kazuhiko Shinki, David Page, Jude Shavlik, Elizabeth Burnside.
Abstract
The purpose of our study is to identify and quantify the association between high breast mass density and breast malignancy using inductive logic programming (ILP) and conditional probabilities, and to validate this association in an independent dataset. We ran our ILP algorithm on 62,219 mammographic abnormalities. We set the Aleph ILP system to generate 10,000 rules per malignant finding with a recall >5% and precision >25%. Aleph reported the best rule for each malignant finding. A total of 80 unique rules were learned. A radiologist reviewed all rules and identified potentially interesting rules. High breast mass density appeared in 24% of the learned rules. We confirmed each interesting rule by calculating the probability of malignancy given each mammographic descriptor. High mass density was the fifth highest ranked predictor. To validate the association between mass density and malignancy in an independent dataset, we collected data from 180 consecutive breast biopsies performed between 2005 and 2007. We created a logistic model with benign or malignant outcome as the dependent variable while controlling for potentially confounding factors. We calculated odds ratios based on dichotomized variables. In our logistic regression model, the independent predictors high breast mass density (OR 6.6, CI 2.5-17.6), irregular mass shape (OR 10.0, CI 3.4-29.5), spiculated mass margin (OR 20.4, CI 1.9-222.8), and subject age (β = 0.09, p < 0.0001) significantly predicted malignancy. Both ILP and conditional probabilities show that high breast mass density is an important adjunct predictor of malignancy, and this association is confirmed in an independent dataset of prospectively collected mammographic findings.
Year: 2009 PMID: 19760292 PMCID: PMC2950275 DOI: 10.1007/s10278-009-9235-3
Source DB: PubMed Journal: J Digit Imaging ISSN: 0897-1889 Impact factor: 4.056
Fig 1. BI-RADS descriptors.
Exploratory Dataset Background Knowledge used for ILP
| Category | Descriptor |
|---|---|
| Patient descriptors | Age |
| | Hormone therapy |
| | Personal history of breast cancer |
| | Family history of breast cancer |
| | Prior surgery |
| | Overall breast composition (breast density) |
| Mass descriptors | Shape |
| | Margin |
| | Density |
| | Size |
| | Stability (increasing, stable, or decreasing) |
| Calcification descriptors | Shape |
| | Distribution |
| | Stability (increasing, stable, or decreasing) |
| Associated findings | Skin thickening |
| | Skin lesion |
| | Trabecular thickening |
| | Nipple retraction, skin retraction |
| | Axillary adenopathy, lymph node |
| | Architectural distortion |
| Special cases | Tubular density |
| | Asymmetric breast tissue |
| | Focal asymmetric density |
| Additional background knowledge | BI-RADS final assessment category |
| | Reason for the mammogram (screening, diagnostic) |
| | Other findings on the same mammogram |
| | Other findings on a prior mammogram |
Background knowledge based on the BI-RADS lexicon [7]. “Stability,” though not defined in the BI-RADS lexicon, is frequently used when describing a mass or calcifications.
Exploratory Dataset Showing Examples of Rules Learned by ILP, Including Total, Benign, and Malignant Cases for which the Rule is True
| Rule | Total Cases | Benign (%) | Malignant (%) | Precision | Recall |
|---|---|---|---|---|---|
| Finding is malignant if: Mass margins = spiculated and Mass density = high and Reason for mammogram = diagnostic | 79 | 21 (26.6) | 58 (73.4) | 0.73 | 0.11 |
| Finding is malignant if: Mass density = high and Age >65 and Mass size >10 mm and Reason for mammogram = diagnostic | 66 | 29 (43.9) | 37 (56.1) | 0.56 | 0.07 |
| Finding is malignant if: Mass density = high and Personal history of breast cancer = yes and Mass stability = increasing | 78 | 41 (52.6) | 37 (47.4) | 0.47 | 0.07 |
Also shown are the precision and recall for each rule
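The precision column can be checked directly from the benign and malignant counts, and each rule can be tested against the thresholds quoted in the abstract (recall >5%, precision >25%). The following is a minimal sketch, not the Aleph implementation: the class and function names are hypothetical, and recall is taken as reported because the total number of malignant findings is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStats:
    """Counts for one learned rule (names here are illustrative, not from Aleph)."""
    label: str
    benign: int             # benign findings covered by the rule
    malignant: int          # malignant findings covered by the rule
    reported_recall: float  # copied from the table; not recomputed

    @property
    def total(self) -> int:
        return self.benign + self.malignant

    @property
    def precision(self) -> float:
        # Fraction of covered findings that are malignant.
        return self.malignant / self.total


def passes_thresholds(rule: RuleStats,
                      min_precision: float = 0.25,
                      min_recall: float = 0.05) -> bool:
    """Apply the recall >5% and precision >25% cut-offs from the abstract."""
    return rule.precision > min_precision and rule.reported_recall > min_recall


# Counts copied from the rules table above.
RULES = [
    RuleStats("spiculated margin, high density, diagnostic", 21, 58, 0.11),
    RuleStats("high density, age >65, size >10 mm, diagnostic", 29, 37, 0.07),
    RuleStats("high density, personal history, increasing", 41, 37, 0.07),
]

for rule in RULES:
    print(f"{rule.label}: precision={rule.precision:.2f}, "
          f"passes={passes_thresholds(rule)}")
```

Recomputing recall would additionally require the number of malignant findings in the exploratory dataset, which the table does not list.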
Exploratory Dataset Showing the Probability of Malignancy Given Individual Descriptors
| Descriptor | P(malignancy \| descriptor) | Benign | Malignant |
|---|---|---|---|
| BI-RADS category = 5 | 0.62 | 109 | 181 |
| Mass margin = spiculated | 0.44 | 147 | 114 |
| Regional pleomorphic calcifications | 0.31 | 19 | 8 |
| BI-RADS category = 4 | 0.28 | 403 | 160 |
| Mass density = high | 0.18 | 489 | 108 |
| Mass shape = irregular | 0.12 | 842 | 118 |
| Nipple retraction = present | 0.10 | 72 | 7 |
| Mass size >30 mm | 0.07 | 426 | 30 |
| Clustered pleomorphic calcifications | 0.07 | 937 | 67 |
| Prior surgery = true | 0.06 | 1,609 | 106 |
The ten descriptors with the highest probability of malignancy are shown. P(malignancy | descriptor) = the probability of malignancy given the finding on the mammogram
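The conditional probability in this table is the malignant count divided by all findings carrying the descriptor. As a rough sketch, using a few rows copied from the table (the function and variable names are hypothetical):

```python
def p_malignancy(benign: int, malignant: int) -> float:
    """P(malignancy | descriptor) = malignant / (benign + malignant)."""
    return malignant / (benign + malignant)


# (benign, malignant) counts copied from selected rows of the table above.
COUNTS = {
    "BI-RADS category = 5": (109, 181),
    "Mass margin = spiculated": (147, 114),
    "Mass density = high": (489, 108),
    "Mass shape = irregular": (842, 118),
}

# Rank descriptors from highest to lowest probability of malignancy.
ranked = sorted(COUNTS, key=lambda name: p_malignancy(*COUNTS[name]), reverse=True)

for name in ranked:
    print(f"{name}: {p_malignancy(*COUNTS[name]):.2f}")
```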
Univariate Analysis Results
| Predictor | Level | Total | Benign, n (%) or mean | Malignant, n (%) or mean | P value | OR (95% CI) |
|---|---|---|---|---|---|---|
| Mass density | High | 81 | 30 (37) | 51 (63) | <0.0001 | 6.7 (3.5–13.9) |
| | Iso/low | 99 | 79 (79.8) | 20 (20.2) | | 1.0 (ref) |
| Overall breast composition | Lower density breasts | 94 | 47 (50) | 47 (50) | 0.0003 | 2.6 (1.4–5.3) |
| | Higher density breasts | 86 | 62 (72.1) | 24 (27.9) | | 1.0 (ref) |
| Image acquisition | Digital | 109 | 70 (64.2) | 39 (35.8) | 0.2345 | 0.7 (0.4–1.3) |
| | Analog | 71 | 39 (54.9) | 32 (45.1) | | 1.0 (ref) |
| Age | | 180 | 49.7 | 62.8 | <0.0001 | – |
| Size | | 180 | 13.3 | 15.9 | 0.023 | – |
(ref) signifies the reference level in the odds ratio calculation. (–) signifies that odds ratios could not be calculated on continuous variables. Lower density breasts = “almost entirely fat” and “scattered fibroglandular densities.” Higher density breasts = “heterogeneously dense” and “extremely dense” breasts
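The univariate odds ratios for the dichotomized predictors follow from each 2×2 table of counts. Below is a minimal sketch with hypothetical function names. The Wald interval is an assumption: the source does not state which interval method was used, so its endpoints need not match the published ones.

```python
import math


def odds_ratio(exp_malignant: int, exp_benign: int,
               ref_malignant: int, ref_benign: int) -> float:
    """Odds of malignancy at the exposed level divided by odds at the reference level."""
    return (exp_malignant * ref_benign) / (exp_benign * ref_malignant)


def wald_ci(exp_malignant: int, exp_benign: int,
            ref_malignant: int, ref_benign: int,
            z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wald interval on the log-odds scale (assumed method)."""
    log_or = math.log(odds_ratio(exp_malignant, exp_benign, ref_malignant, ref_benign))
    se = math.sqrt(1 / exp_malignant + 1 / exp_benign
                   + 1 / ref_malignant + 1 / ref_benign)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Counts copied from the univariate table: (malignant, benign) per level.
density = odds_ratio(51, 30, 20, 79)       # high vs. iso/low mass density
composition = odds_ratio(47, 47, 24, 62)   # lower vs. higher density breasts
acquisition = odds_ratio(39, 70, 32, 39)   # digital vs. analog

print(f"Mass density OR: {density:.1f}")
print(f"Breast composition OR: {composition:.1f}")
print(f"Image acquisition OR: {acquisition:.1f}")
print("Wald interval for mass density:", wald_ci(51, 30, 20, 79))
```

The point estimates reproduce the table to one decimal place; the Wald interval for mass density comes out somewhat narrower at the upper end than the reported 3.5–13.9.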
Logistic Regression Model Results
| Predictor | Level | Coefficient (β) | Std. Error | P value | Estimated OR (95% CI) |
|---|---|---|---|---|---|
| Mass density | High | 1.88 | 0.5 | 0.0002 | 6.6 (2.5–17.6) |
| | Iso/low | – | – | – | 1.0 (ref) |
| Mass shape | Irregular | 2.3 | 0.6 | <0.0001 | 10.0 (3.4–29.5) |
| | All others | – | – | – | 1.0 (ref) |
| Mass margin | Spiculated | 3.01 | 1.22 | 0.01 | 20.4 (1.9–222.8) |
| | All others | – | – | – | 1.0 (ref) |
(ref) signifies the reference level in the odds ratio calculation
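In a logistic regression model, the estimated odds ratio is the exponential of the coefficient, and an approximate 95% interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below assumes the first numeric column of the table is the coefficient and uses a hypothetical helper name. Because the tabulated coefficients and standard errors are rounded, the results only approximately reproduce the published values.

```python
import math


def or_with_ci(beta: float, std_error: float,
               z: float = 1.96) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) from a logistic coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * std_error),
            math.exp(beta + z * std_error))


# (coefficient, standard error) pairs copied from the table above.
MODEL = {
    "Mass density = high": (1.88, 0.5),
    "Mass shape = irregular": (2.3, 0.6),
    "Mass margin = spiculated": (3.01, 1.22),
}

for name, (beta, se) in MODEL.items():
    estimate, lower, upper = or_with_ci(beta, se)
    print(f"{name}: OR {estimate:.1f} ({lower:.1f} to {upper:.1f})")
```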