Kawthar Al-Ajmi, Artitaya Lophatananon, Martin Yuille, William Ollier, Kenneth R Muir.
Abstract
A disease risk model is a statistical method that estimates the probability that an individual will develop one or more diseases within a stated period of time. Such models take into account the presence or absence of specific epidemiological risk factors associated with the disease and can thereby identify individuals at higher risk. They are currently used clinically, for example to identify women at increased risk of developing breast cancer. Many genetic and non-genetic breast cancer risk models have been developed. We evaluated existing non-genetic/non-clinical models for breast cancer that incorporate modifiable risk factors. This review focuses on risk models that women themselves can use in the community, without clinical characterization of risk factors. Because these models include modifiable factors, they can be used to improve primary prevention and health education for breast cancer. Literature searches were conducted using PubMed, ScienceDirect and the Cochrane Database of Systematic Reviews. Fourteen studies were eligible for review, with sample sizes ranging from 654 to 248,407 participants. All models reviewed had acceptable calibration, with expected/observed (E/O) ratios ranging from 0.79 to 1.17. However, discrimination varied across studies, with concordance statistics (C-statistics) ranging from 0.56 to 0.89. We conclude that breast cancer risk models that include modifiable risk factors are well calibrated but discriminate less well. The latter may result from the omission of some significant risk factors or from applying models to studies with limited sample sizes. More importantly, external validation is missing for most of the models. Generalization across models is also problematic, as some variables may not be applicable to some populations and each model's performance is conditioned by particular population characteristics. In conclusion, there is still a need to develop a more reliable model for estimating breast cancer risk that has good calibration, can accurately discriminate women at high risk, and generalizes better across populations.
Keywords: Assessment risk tool; Calibration; Concordance and E/O statistics; Discrimination; Risk factors; Risk prediction
Year: 2018 PMID: 30178398 PMCID: PMC6182451 DOI: 10.1007/s10552-018-1072-6
Source DB: PubMed Journal: Cancer Causes Control ISSN: 0957-5243 Impact factor: 2.506
Fig. 1 Identification of eligible risk models using PRISMA flowchart
Breast cancer risk factors included in the 14 models
| Name of model | Gail [ | Rosner [ | Rosner [ | Colditz [ | Ueda [ | Boyle [ | Lee [ | Novotny [ | Gail [ | Matsuno [ | Banegas [ | Pfeiffer [ | Park [ | Lee [ | Effect | Level of evidence | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Basic characteristics | |||||||||||||||||
| Age | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Increased risk | Definite | |||||
| Ethnicity | Yes | Jewish increased risk | Definite | ||||||||||||||
| Height | Yes | Increased risk | Definite | ||||||||||||||
| Weight | Yes | Increased risk in post-menopausal | Probable | ||||||||||||||
| BMI | Yes | Yes | Yes | Yes | Yes | Yes | Probable | ||||||||||
| Alcohol intake | Yes | Yes | Yes | Yes | Yes | Increased risk | Probable | ||||||||||
| Smoking | Yes | Yes | Increased risk | Possible | |||||||||||||
| Physical activity | Yes | Yes | Yes | Decreased risk | Possible | ||||||||||||
| Diet | Yes | Decreased risk | Probable | ||||||||||||||
| Hormonal and reproductive factors | |||||||||||||||||
| Age at menarche | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Increased risk | Definite | |||
| Age at first live birth | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Increases risk | Definite | |
| Age at subsequent birth | Yes | Yes | Increases risk | Definite | |||||||||||||
| Age at menopause | Yes | Yes | Yes | Yes | Yes | Yes | Increased risk | Definite | |||||||||
| Hormone replacement therapy use | Yes | Yes | Yes | Yes | Increases risk | Definite | |||||||||||
| Oral contraceptive use | Yes | Yes | Yes | Increases risk | Definite | ||||||||||||
| Breast feeding | Yes | Yes | Decreases risk | Probable | |||||||||||||
| Pregnancy | Yes | Decreases risk | Possible | ||||||||||||||
| Parity | Yes | Yes | Decreases risk | Definite | |||||||||||||
| Children number | Yes | Yes | Decreases risk | Possible | |||||||||||||
| Menopause type | Yes | Surgical menopause reduces risk | Possible | ||||||||||||||
| Menstrual regularity | Yes | Menstrual regularity and duration—inconsistent results | Possible | ||||||||||||||
| Menstrual duration | Yes | Yes | Possible | ||||||||||||||
| Menopausal status | Yes | Yes | Post-menopause increases risk | Possible | |||||||||||||
| Gestation period | Yes | Increases risk | Possible | ||||||||||||||
| Family history of breast and/or ovarian cancer or diseases | |||||||||||||||||
| Family history of breast cancer | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Increases risk | Definite | |||||
| First-degree relatives with breast cancer | Yes | Yes | Yes | Yes | Increases risk | Definite | |||||||||||
| Age of onset of breast cancer in a relative | Yes | Increases risk | Probable | ||||||||||||||
| Benign breast disease | Yes | Yes | Yes | Increases risk | Probable | ||||||||||||
| History of breast biopsies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Increases risk | Definite | ||||||||
| Mammogram | Yes | Increases risk | Probable | ||||||||||||||
| Summary of risk factors included in each model | |||||||||||||||||
| Definite factors | 5 | 5 | 6 | 10 | 3 | 5 | 3 | 6 | 3 | 5 | 5 | 5 | 7 | 5 | Max of 10 and min of 3 factors | ||
| Probable factors | 0 | 0 | 0 | 4 | 1 | 3 | 2 | 1 | 0 | 1 | 0 | 3 | 3 | 2 | Max of 4 and min of 0 factors | ||
| Possible factors | 0 | 0 | 0 | 2 | 0 | 2 | 3 | 0 | 0 | 0 | 0 | 1 | 3 | 5 | Max of 5 and min of 0 factors | ||
| Total factors | 5 | 5 | 6 | 16 | 4 | 8 | 8 | 7 | 3 | 5 | 5 | 9 | 13 | 12 | Max of 16 and min of 3 factors | ||
Formulas used to calculate the accuracy of the model
| Term | Definition | Equation |
|---|---|---|
| Sensitivity | Probability that a test indicates ‘positive’ among those with the disease | (TP)/(TP + FN) |
| Specificity | Probability that a test indicates ‘negative’ among those without the disease | (TN)/(TN + FP) |
| Positive predictive value | Probability of a patient having disease when test is positive | (TP)/(TP + FP) |
| Negative predictive value | Probability of a patient not having disease when test is negative | (TN)/(FN + TN) |
TP True positive, TN true negative, FP false positive, FN false negative
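The four formulas above can be sketched in a few lines of Python. This is a minimal illustration, not code from the reviewed studies: the `ConfusionCounts` class, the `accuracy_measures` function and the example counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the four cells of a 2x2 table."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def accuracy_measures(c: ConfusionCounts) -> dict:
    """Apply the table's formulas to one set of counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.fn + c.tn),
    }


# Illustrative counts only (not taken from any reviewed model).
counts = ConfusionCounts(tp=80, tn=900, fp=100, fn=20)
measures = accuracy_measures(counts)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, sensitivity is 80/100 and specificity is 900/1000, while the positive predictive value is much lower (80/180) because false positives outnumber true positives.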
Fig. 2 Calibration and discrimination performances of the 13 breast cancer risk models
Summary of the evaluation measures of the risk models
| Model | Calibration (derived model) | Calibration (internal) | Calibration (external) | Discrimination (derived model) | Discrimination (internal) | Discrimination (external) | Accuracy (sensitivity, specificity, PPV, NPV) | Utility |
|---|---|---|---|---|---|---|---|---|
| Gail [ | 0.79–1.12 | 0.58–0.67 | ||||||
| Rosner [ | – | – | – | – | – | – | – | |
| Rosner [ | – | 1.00 (0.93–1.07) d | – | 0.57 (0.55–0.59) d | ||||
| Colditz [ | – | – | 1.01 (0.94–1.09) d | – | 0.64 (0.62–0.66) d | – | Good e | |
| Ueda [ | – | – | – | – | – | – | – | |
| Boyle [ | (a) 0.96 (0.75–1.16) cohort1 | – | 0.59 | – | – | |||
| Lee [ | – | – | – | – | – | – | – | |
| Novotny [ | – | – | – | – | – | – | – | |
| Gail [ | – | 1.08 (0.97–1.20) | 0.93 (0.97–1.20) f | – | 0.56 (0.54–0.58) f | – | – | |
| Matsuno [ | 1.17 (0.99–1.38) | 0.614 (0.59–0.64) | – | – | ||||
| Banegas [ | – | (a) 1.08 (0.91–1.28); Hispanic | – | – | – | – | – | – |
| Pfeiffer [ | 1.00 (0.96–1.04) | 0.58 (0.57–0.59) | ||||||
| Park [ | – | – | (a) 0.97(0.67–1.40); KMCC | – | (a) 0.63 (0.61–0.65) < 50 years (KMCC) | (a) 0.61(0.49–0.72); KMCC | – | – |
| Lee [ | Overall: 0.62 | (a) Sensitivity | – | |||||
a Boyle [39] used two cohorts for calibration: (1) a cohort with complete follow-up and (2) a cohort with at most 5 years of follow-up
b Banegas [40] used two cohorts for calibration: (1) Hispanic and (2) non-Hispanic white (NHW)
c Park [23] used two Korean cohorts for calibration and discrimination: (1) the Korean Multi-center Cancer Cohort (KMCC) and (2) the National Cancer Centre (NCC) cohort
d [49]
e [52]
f [11]
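The two headline measures in the table above, the expected/observed (E/O) ratio for calibration and the C-statistic for discrimination, can be illustrated with a short Python sketch. It assumes a simple binary outcome with one predicted risk per woman and ignores censoring and follow-up time, which the reviewed studies may have handled differently. The function names and the toy data are hypothetical.

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """E/O ratio: sum of predicted risks divided by the number of observed cases."""
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    return expected / observed


def c_statistic(predicted_risks, outcomes):
    """Share of case/non-case pairs in which the case has the higher predicted risk.

    Ties count as half a concordant pair. For a binary outcome this equals
    the area under the ROC curve.
    """
    case_risks = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    control_risks = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    concordant = 0.0
    for case in case_risks:
        for control in control_risks:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_risks) * len(control_risks))


# Toy data for illustration only: six women, three of whom developed disease.
risks = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
outcomes = [0, 0, 1, 0, 1, 1]

eo = expected_observed_ratio(risks, outcomes)
c_stat = c_statistic(risks, outcomes)
print(f"E/O = {eo:.2f}, C-statistic = {c_stat:.2f}")
```

An E/O ratio near 1 indicates that the model predicts about as many cases as were observed. A C-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.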
Characteristic summary of the reviewed breast cancer risk models
| Author/model | Study design | Participants | Ethnicity | Outcome | Statistical method | Effect estimates | Sample size | Risk factors considered in the models | Age target | Stratification |
|---|---|---|---|---|---|---|---|---|---|---|
| Gail [ | Case–control | White American females from the Breast Cancer Detection Demonstration Project (BCDDP) | American–Caucasian | Invasive breast cancer + in situ carcinoma | Unconditional logistic regression | Relative risk | 2,852 cases | Age at menarche, age at first live birth, number of previous biopsies, and number of first-degree relatives with breast cancer | Any age | None |
| Rosner [ | Cohort | Registered nurses | American–Caucasian | Invasive breast cancer | Poisson regression | Cumulative incidence | 2,341 cases, 91,523 controls | Age, age at all births, menopause age, menarche age | 30–55 years | Number of births |
| Rosner [ | Cohort | Registered nurses | American–Caucasian | Invasive breast cancer | Poisson regression | Relative risk | 2,249 cases, 89,132 controls | Menarche age, first live birth age, subsequent births age, menopause age | Any age | None |
| Colditz [ | Cohort | General women | American–Caucasian | Invasive breast cancer | Poisson regression | Cumulative incidence | 1,761 cases | Benign breast disease, use of HRT, weight, height, menopausal type, and alcohol intake | Women aged 30–55 years | None |
| Ueda [ | Case–control | General women | Japanese–Asian | Invasive breast cancer | Conditional logistic regression | Relative risk | 376 cases | Menarche, first birth age, family history, and BMI in post-menopausal women | Any age | Menopausal status |
| Boyle [ | Case–control | General women | Italian–Caucasian | Invasive breast cancer | Conditional logistic regression | Absolute + relative risk | 2,569 cases | Menarche age, first birth age, alcohol intake, family history, age of diagnosis in relatives, and one of the two diet scores. BMI and HRT were included only for women older than 50 | 23–74 years (cases) | Age (< 50 and > 50) |
| Lee [ | Case–control | 1-General women | Korean–Asian | Invasive breast cancer | Hosmer–Lemeshow goodness of fit | Probability | 384 cases | With hospitalized controls: family history, menstrual regularity, total menstrual duration, first full-term pregnancy age, breastfeeding duration. With nurse/teacher controls: age, menstrual regularity, drinking status, smoking status | Age at least 20 years | None |
| Novotny [ | Case–control | General women | Czech females–Caucasian | Invasive breast cancer | Unconditional logistic regression | Relative risk | 4,598 matched pairs | Age at birth of first child, family history of breast cancer, No. of previous breast biopsies, menarche age, parity, history of benign breast disease | Age matched | None |
| Gail [ | Case–control | General women | African American | Invasive breast cancer | Conditional logistic regression | Absolute + relative risk | 1,607 cases | Menarche age, No. of affected mother or sisters, No. of benign biopsies | 35–64 years | Age (< 50 and > 50) |
| Matsuno [ | Case–control | General women | Asian and Pacific Islander American | Invasive breast cancer | Conditional logistic regression | Absolute + relative + attributable | 589 cases | Menarche age, age at first live birth, No. of biopsies, family history, ethnicity | Any age | Ethnicity |
| Banegas [ | Longitudinal study | General women | Hispanic | Invasive breast cancer | Cox proportional hazards regression | Relative risk | 6,353 cases | Age, age at first live birth, menarche age, No. of first-degree relatives with breast cancer, No. of breast biopsies | Post-menopausal participants aged ≥ 50 | None |
| Pfeiffer [ | Prospective study | White over 50 years old | White and non-Hispanic Caucasian | Invasive breast cancer | Cox proportional hazards regression | Relative and attributable risks | 7,695 cases | BMI, oestrogen and progestin MHT use, other MHT use, parity, age at first birth, pre-menopausal, age at menopause, benign breast diseases, family history of breast or ovarian cancer, and alcohol consumption | 50 and above | None |
| Park [ | Case–control | General women | Korean–Asian | Invasive breast cancer | Unconditional logistic regression | Absolute risk | 3,789 cases | Family history, menarche age, menopausal status, menopause age, pregnancy, first full-term pregnancy age, No. of pregnancies, breastfeeding duration, OC usage, HRT, exercise, BMI, smoking, drinking, No. of breast examinations | Any age | Age (< 50 and > 50) |
| Lee [ | Case–control | General women | Asian | Invasive breast cancer | Conditional logistic regression | | 2,291 cases and 2,283 controls | First full-term pregnancy age, children No., menarche age, BMI, family history, menopausal status, regular mammography, exercises, oestrogen exposure duration, gestation period, menopause age | Any age | Age (< 50 and > 50) |
Models reviewed in this article
| Title | Size of study | Population | First author | References | |
|---|---|---|---|---|---|
| Included in this review | |||||
| Projecting individualized probabilities of developing breast cancer for white females who are being examined annually | 2,852 cases | Caucasian | Gail 1989 | [ | |
| Reproductive risk factors in a prospective study of breast cancer: the Nurses’ Health Study | 2,341 cases, 91,523 controls | Caucasian | Rosner 1994 | [ | |
| Nurses’ health study: log-incidence mathematical model of breast cancer incidence | 2,249 cases, 89,132 controls | Caucasian | Rosner 1996 | [ | |
| Cumulative risk of breast cancer to age 70 years according to risk factor status: data from the Nurses’ Health Study | 1,761 cases | Caucasian | Colditz | [ | |
| Estimation of individualized probabilities of developing breast cancer for Japanese women | 376 cases | Asian | Ueda | [ | |
| Contribution of three components to individual cancer risk predicting breast cancer risk in Italy | 2,569 cases | Caucasian | Boyle | [ | |
| Determining the Main Risk Factors and High-risk Groups of Breast Cancer Using a Predictive Model for Breast Cancer Risk Assessment in South Korea | 384 cases | Asian | Lee | [ | |
| Breast cancer risk assessment in the Czech female population–an adjustment of the original Gail model | 4,598 matched pairs | Caucasian | Novotny | [ | |
| Projecting individualized absolute invasive breast cancer risk in African American women | 1,607 cases | African | Gail | [ | |
| Projecting individualized absolute invasive breast cancer risk in Asian and Pacific Islander American women | 589 cases | Asian | Matsuno | [ | |
| Evaluating breast cancer risk projections for Hispanic women | 6,353 cases | Hispanic | Banegas | [ | |
| Risk Prediction for Breast, Endometrial, and Ovarian Cancer in White Women Aged 50 y or Older: Derivation and Validation from Population-Based Cohort Studies | 42,821 cases | White, non-Hispanic women aged 50+ | Pfeiffer | [ | |
| Korean risk assessment model for breast cancer risk prediction | 3,789 cases | Asian | Park | [ | |
| Computational Discrimination of Breast Cancer for Korean Women Based on Epidemiologic Data Only | 2,291 cases and 2,283 controls | Asian | Lee | [ | |
| Excluded from this review | [ | ||||