Stephen R Piccolo, Irene L Andrulis, Adam L Cohen, Thomas Conner, Philip J Moos, Avrum E Spira, Saundra S Buys, W Evan Johnson, Andrea H Bild.
Abstract
BACKGROUND: Women with a family history of breast cancer face considerable uncertainty about whether to pursue standard screening, intensive screening, or prophylactic surgery. Accurate and individualized risk-estimation approaches may help these women make more informed decisions. Although highly penetrant genetic variants have been associated with familial breast cancer (FBC) risk, many individuals do not carry these variants, and many carriers never develop breast cancer. Common risk variants have a relatively modest effect on risk and show limited potential for predicting FBC development. As an alternative, we hypothesized that additional genomic data types, such as gene-expression levels, which can reflect genetic and epigenetic variation, could contribute to classifying a person's risk status. Specifically, we aimed to identify common patterns in gene-expression levels across individuals who develop FBC.
Year: 2015 PMID: 26538066 PMCID: PMC4634735 DOI: 10.1186/s12920-015-0145-6
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Summary of patient subgroups in the Utah and Ontario populations
| Category | Utah | Ontario |
|---|---|---|
| Family history, BRCA1/2, Cancer | 16 | 11 |
| Family history, BRCAX, Cancer | 23 | 17 |
| Family history, BRCA1/2, No Cancer | 18 | 14 |
| Family history, BRCAX, No Cancer | 26 | 18 |
| No family history, Cancer | 22 | 8 |
| No family history, No Cancer | 19 | 5 |
| Total | 124 | 73 |
Patients fell into one of six groups, depending on whether 1) they had a family history of breast cancer, 2) had been diagnosed with breast cancer previously, or 3) were known to carry a pathogenic variant in BRCA1 or BRCA2. The number of patients in each group is listed for each cohort
Summary of ages at which blood samples were acquired
| Description | Minimum | Median | Maximum |
|---|---|---|---|
| Family history, BRCA1/2, Cancer | 45 | 59.0 | 77 |
| Family history, BRCAX, Cancer | 56 | 58.5 | 78 |
| Family history, BRCA1/2, No Cancer | 46 | 60.0 | 78 |
| Family history, BRCAX, No Cancer | 55 | 63.0 | 83 |
| No family history, Cancer | 53 | 65.5 | 79 |
| No family history, No Cancer | 51 | 58.0 | 86 |
For participants within each group, this table indicates the minimum, median, and maximum age at which blood samples were drawn. These data represent 117 participants from the Utah cohort. The remaining 8 Utah participants were at least 55 years old; however, it was not feasible to collect their exact ages in retrospect. The median age at which blood samples were acquired was consistent across the groups (p = 0.064)
Fig. 1 Predictions of familial breast cancer status in two independent cohorts. a In a cross-validated design, we predicted familial breast cancer status for 124 women from Utah. This cohort included women who did or did not have a family history (FH) of breast cancer, who did or did not carry a BRCA1 or BRCA2 mutation (BRCAX if not), and who had or had not developed breast cancer. The “Genomic model score” values represent probabilistic predictions made by the support vector machines algorithm. Higher values indicate a higher probability that a given individual had developed familial breast cancer. These scores were much higher for individuals who had a family history of breast cancer and developed a breast tumor, irrespective of BRCA1/BRCA2 mutation status. b In a training/testing design, we predicted whether individuals in the independent Ontario cohort had developed familial breast cancer. The support vector machines algorithm was trained on the full Utah data set. Again, the scores were considerably higher for women with a family history of breast cancer who had developed a breast tumor
Fig. 2 Cross-validation performance of the gene-expression biomarker with different quantities of genes. For the gene-expression biomarker, we used the SVM-RFE method to identify genes whose expression differed most consistently between individuals who developed familial breast cancer and individuals who did not. These gene subsets ranged in size from 25 to 300 genes. In repeated cross-validation (1,000 iterations), predictive accuracy peaked at 250 genes and was consistent when the number of genes was 150 or higher
Fig. 3 Sensitivity and specificity of biomarker predictions. Because the support vector machines predictions (genomic model score) are probabilistic, we evaluated various cutoff thresholds at which patients could be considered to have had a “high” probability of developing familial breast cancer. a-b Receiver operating characteristic curves illustrate the balance between sensitivity and specificity across many probability thresholds for the Utah and Ontario cohorts. c-d As the genomic model scores increase, a larger proportion of patients who fell above the threshold would have been predicted accurately to develop familial breast cancer. As the threshold approaches its maximum, the predictive accuracy for patients above the threshold is nearly perfect; however, such high thresholds would result in low sensitivity levels. A threshold near 0.2 may be optimal. Panel C represents predictions for Utah participants who had a family history of breast cancer; Panel D represents the Ontario cohort. The dashed lines represent predictions for individuals who carried a BRCA1 or BRCA2 mutation. The dotted lines represent predictions for BRCAX individuals. (Plotted lines were fitted using a LOESS model [span = 0.5] for smoothing)
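The trade-off described in this legend can be illustrated with a minimal sketch, assuming that a score at or above the cutoff counts as a "high" prediction. The function names and the scores and labels below are made up for illustration and are not study data; the sketch only shows how sensitivity, specificity, and the proportion of correct "high" calls might be computed at a given cutoff.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count (TP, FP, TN, FN) when scores >= threshold are called 'high'."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_high = score >= threshold
        if predicted_high and label:
            tp += 1
        elif predicted_high and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def threshold_metrics(scores, labels, threshold):
    """Sensitivity, specificity, and positive predictive value at one cutoff."""
    tp, fp, tn, fn = confusion_at_threshold(scores, labels, threshold)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Proportion of patients above the cutoff who were true cases.
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
    }


# Hypothetical scores and labels (1 = developed familial breast cancer).
example_scores = [0.05, 0.10, 0.15, 0.25, 0.30, 0.45, 0.60, 0.80]
example_labels = [0, 0, 1, 0, 1, 1, 1, 1]

for cutoff in (0.2, 0.5):
    print(cutoff, threshold_metrics(example_scores, example_labels, cutoff))
```

In this toy example, raising the cutoff from 0.2 to 0.5 increases the proportion of correct "high" calls while lowering sensitivity, which is the pattern the legend describes.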
Top pathway results from GATHER analysis
| Term ID | Term | p-value |
|---|---|---|
| hsa04520 | Adherens junction | 0.00149 |
| hsa00590 | Prostaglandin and leukotriene metabolism | 0.00775 |
| hsa04350 | TGF-beta signaling pathway | 0.0132 |
| hsa04510 | Focal adhesion | 0.014 |
For genes that exhibited consistent fold-change directions in the Utah and Ontario gene-expression data (Additional files 7 and 8), we sorted the genes by average rank of fold change and t-test p-values. The 250 top-ranked genes were used to query GATHER [39] for KEGG pathways most strongly associated with this gene list. Pathways that attained a p-value less than 0.05 are shown
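The gene-ranking step above can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes fold changes are on a log scale (so the sign gives the direction), that fold-change magnitude is averaged across the two cohorts, and that a single p-value per gene is available. The gene names, values, and function names are hypothetical.

```python
def ranks(values, reverse=False):
    """Return 1-based ranks (1 = best); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def top_genes(records, n_top):
    """records: (gene, log_fc_cohort_1, log_fc_cohort_2, p_value) tuples."""
    # Keep genes whose fold change has the same sign in both cohorts.
    consistent = [r for r in records if r[1] * r[2] > 0]
    # Assumption: larger mean absolute fold change ranks first.
    fc_rank = ranks([(abs(r[1]) + abs(r[2])) / 2 for r in consistent], reverse=True)
    # Smaller p-value ranks first.
    p_rank = ranks([r[3] for r in consistent])
    # Sort by the average of the two ranks; break ties by gene name.
    scored = sorted(
        zip(consistent, fc_rank, p_rank),
        key=lambda t: ((t[1] + t[2]) / 2, t[0][0]),
    )
    return [rec[0] for rec, _, _ in scored[:n_top]]


# Hypothetical input; not data from the study.
example_records = [
    ("GENE_A", 1.2, 0.9, 0.001),
    ("GENE_B", -0.8, -1.1, 0.020),
    ("GENE_C", 0.5, -0.4, 0.0005),  # opposite directions, so excluded
    ("GENE_D", 0.3, 0.2, 0.300),
    ("GENE_E", -1.5, -1.4, 0.010),
]

print(top_genes(example_records, 3))
```

With real data, `n_top` would be 250 to match the gene list described above; the resulting list would then be submitted to a pathway tool such as GATHER, a step that is outside the scope of this sketch.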