Mahalakshmi Kumaran, Carol E Cass, Kathryn Graham, John R Mackey, Roland Hubaux, Wan Lam, Yutaka Yasui, Sambasivarao Damaraju.
Abstract
Breast cancer is one of the most common cancers among women, and susceptibility is explained by genetic, lifestyle and environmental components. Copy number variants (CNVs) are structural DNA variations that contribute to diverse phenotypes via gene-dosage effects or cis-regulation. In this study, we aimed to identify germline CNVs associated with breast cancer susceptibility and to assess their relevance to prognosis. We performed whole-genome CNV genotyping in 422 cases and 348 controls using the Human Affymetrix SNP 6 array. Principal component analysis for population stratification identified 84 outliers, leaving 366 cases and 320 controls of Caucasian ancestry for association analysis. CNVs with frequency > 10% that overlapped protein-coding genes were evaluated for breast cancer risk and prognostic relevance. Coding genes within the identified CNVs were interrogated for gene-dosage effects by correlating copy number status with gene expression profiles in breast tumor tissue. We identified 200 CNVs associated with breast cancer (q-value < 0.05). Of these, 21 CNV regions (overlapping 22 genes) were also associated with prognosis. We validated representative CNVs overlapping the APOBEC3B and GSTM1 genes using the TaqMan assay. Germline CNVs conferred dosage effects on gene expression in breast tissue. The candidate CNVs identified in this study warrant independent replication.
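The abstract reports associations at a q-value threshold without stating how the q-values were computed. As a minimal sketch, assuming a Benjamini-Hochberg false discovery rate adjustment (an assumption, not the authors' documented method), per-CNV p-values could be converted as follows. The function name and the input p-values are hypothetical.

```python
from typing import List

def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: this stands in for the study's q-value calculation,
    which the source does not describe.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four CNVs (illustration only).
example_p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(example_p)
significant = [i for i, value in enumerate(q) if value < 0.05]
print(q, significant)
```

With these illustrative inputs only the first CNV passes the 0.05 threshold after adjustment, although three of the four raw p-values are below 0.05.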
Year: 2017 PMID: 29116104 PMCID: PMC5677082 DOI: 10.1038/s41598-017-14799-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Study overview. The figure outlines the study design with a brief description of methods and data filters. The key result of each analysis is summarized, indicating the number of CNVs at each stage. OS, overall survival; RFS, recurrence-free survival. + Time-to-event analysis based on cases (n = 366).
Top germline CNVs/CNVRs associated with breast cancer risk.
| CNV region | Cytoband | Size (bp) | Total CNV/CNVR frequency in cohort (%) | Average CNV frequency in cases, % (Gain/Loss) | Average CNV frequency in controls, % (Gain/Loss) | q-value | Overlapping gene | Mapping |
|---|---|---|---|---|---|---|---|---|
| Chr5-69784291-70254895 | 5q13.2 | 470605 | 44 | 31 (13/18) | 59 (3/56) | 1.46 × 10⁻²¹ | | 1000 g, DGV |
| Chr5-70254905-70328368 | 5q13.2 | 73469 | 31 | 26 (11/15) | 37 (7/30) | 3 × 10⁻² to 1.76 × 10⁻¹³ | | 1000 g, DGV |
| Chr21-40184963-40190820 | 21q22.2 | 2792 | 15 | 7 (3/4) | 24 (0/24) | 1.58 × 10⁻¹⁰ to 4.3 × 10⁻¹² | | - |
| Chr9-40784158-40800446 | 9p13.1 | 60428 | 19 | 12 (5/7) | 28 (3/25) | 1.09 × 10⁻¹¹ to 5.23 × 10⁻¹² | | DGV |
| Chr8-7827144-7831849 | 8p23.1 | 4707 | 24 | 15 (7/8) | 33 (4/29) | 1.02 × 10⁻⁹ to 1.65 × 10⁻⁹ | | DGV |
| Chr9-67899911-68067313 | 9q13 | 167404 | 18 | 8 (2/6) | 29 (4/25) | 1.86 × 10⁻⁸ to 1.52 × 10⁻⁹ | | DGV |
| Chr1-248683401-248687808 | 1q44 | 4409 | 29 | 23 (8/15) | 35 (1/34) | 2.38 × 10⁻⁸ to 6.47 × 10⁻⁹ | | DGV |
| Chr11-55418110-55421252 | 11q11 | 3143 | 85 | 94 (49/45) | 76 (32/44) | 1.21 × 10⁻⁸ | | 1000 g, DGV |
| Chr8-93005629-93015066 | 8q21.3 | 9444 | 11 | 5 (2/3) | 18 (0/18) | 7.69 × 10⁻⁸ to 5.94 × 10⁻⁹ | | - |
| Chr6-34516636-34517772 | 6p21.31 | 1143 | 11 | 17 (13/4) | 6 (0/6) | 1.34 × 10⁻⁷ to 1.02 × 10⁻⁸ | | DGV |
| Chr11-55403771-55407672 | 11q11 | 3902 | 85 | 93 (49/44) | 77 (33/44) | 4.18 × 10⁻⁸ | | 1000 g, DGV |
| Chr1-149548719-149563724 | 1q21.2 | 15005 | 30 | 26 (10/16) | 35 (2/33) | 6.61 × 10⁻⁸ | | 1000 g, DGV |
| Chr10-123346484-123348045 | 10q26.13 | 1569 | 11 | 7 (3/4) | 15 (0/15) | 6.04 × 10⁻⁷ to 1.05 × 10⁻⁷ | | - |
| Chr16-10788745-10790882 | 16p13.13 | 2137 | 10 | 7 (4/3) | 14 (0/14) | 4.24 × 10⁻⁷ | | 1000 g, DGV |
| Chr1-356492-380356 | 1p36.33 | 23865 | 21 | 16 (8/8) | 28 (4/24) | 5.62 × 10⁻⁷ | | 1000 g, DGV |
| Chr9-67789400-67808579 | 9q13 | 19180 | 19 | 10 (2/8) | 28 (3/25) | 7.98 × 10⁻⁷ | | 1000 g, DGV |
| Chr4-144288613-144293270 | 4q31.21 | 4667 | 18 | 11 (5/6) | 26 (2/24) | 1.5 × 10⁻⁵ to 2.4 × 10⁻¹¹ | | DGV |
| Chr4-69505724-69536970 | 4q13.2 | 31250 | 32 | 29 (12/17) | 35 (5/30) | 1.29 × 10⁻³ to 1.10 × 10⁻⁶ | | 1000 g, DGV |
| Chr11-55430518-55436423 | 11q11 | 5907 | 81 | 87 (46/41) | 73 (30/43) | 1.68 × 10⁻⁵ to 2.79 × 10⁻⁸ | | DGV |
| Chr9-67753281-67808579 | 9q13 | 55300 | 19 | 11 (2/9) | 28 (3/25) | 1.46 × 10⁻⁶ to 7.87 × 10⁻⁷ | | 1000 g, DGV |
| Chr13-67509369-67513167 | 13q21.32 | 3811 | 11 | 7 (3/4) | 14 (1/14) | 1.24 × 10⁻³ to 2.07 × 10⁻⁶ | | DGV |
| Chr7-75044860-75062133 | 7q11.23 | 17277 | 12 | 7 (3/4) | 17 (0/17) | 2.09 × 10⁻⁶ to 1.76 × 10⁻⁷ | | DGV |
| Chr17-20346165-20366887 | 17p11.2 | 20725 | 11 | 7 (3/4) | 15 (0/15) | 2.08 × 10⁻⁶ to 6.78 × 10⁻⁷ | | 1000 g, DGV |
| Chr4-55106768-55120708 | 4q12 | 13940 | 17 | 15 (6/9) | 19 (0/19) | 5.21 × 10⁻³ to 6.14 × 10⁻⁸ | | - |
| Chr13-48968806-48977635 | 13q14.2 | 8835 | 11 | 7 (3/4) | 17 (0/17) | 1.53 × 10⁻⁶ to 6.19 × 10⁻⁷ | | 1000 g |
| Chr3-127422064-127423993 | 3q21.3 | 1931 | 10 | 6 (2/4) | 15 (0/15) | 6.29 × 10⁻⁶ to 4.01 × 10⁻⁶ | | 1000 g, DGV |
| Chr5-180425664-180437832 | 5q35.3 | 12170 | 19 | 19 (9/10) | 18 (1/17) | 4.71 × 10⁻⁵ to 2.62 × 10⁻⁵ | | 1000 g, DGV |
| Chr1-152572873-152574332 | 1q21.3 | 2728 | 75 | 83 (40/43) | 67 (24/43) | 4.71 × 10⁻⁵ to 2.64 × 10⁻⁵ | | 1000 g, DGV |
| Chr22-39363651-39371629 | 22q13.1 | 1119 | 19 | 21 (3/18) | 17 (3/14) | 3.65 × 10⁻² to 2.73 × 10⁻² | | 1000 g, DGV |
List of CNVs/CNVRs identified in the CNV-GWAS that were associated with breast cancer (q-value < 5 × 10⁻⁵). For CNVRs, the range of q-values across the constituent CNVs is given (Supplementary 1, Table S1). The last row shows the CNVR at APOBEC3A_B (fusion gene) reported in the literature[47] and its association with breast cancer risk in the current study, included as an independent validation of findings.
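The total cohort frequency in the table can be cross-checked against the case and control columns. The sketch below assumes that the frequency columns are percentages and that the total is the sample-size-weighted mean of the two groups (an interpretation, not something the source states), using the 366 cases and 320 controls retained after ancestry filtering. The function name is hypothetical.

```python
# Cohort sizes after ancestry filtering, taken from the abstract.
N_CASES = 366
N_CONTROLS = 320

def pooled_frequency(case_pct: float, control_pct: float) -> float:
    """Sample-size-weighted mean of case and control CNV frequencies (%).

    Assumption: the table's total frequency is computed this way.
    """
    total = N_CASES + N_CONTROLS
    return (case_pct * N_CASES + control_pct * N_CONTROLS) / total

# First two rows of the table (both at 5q13.2).
print(round(pooled_frequency(31, 59)))  # table reports 44
print(round(pooled_frequency(26, 37)))  # table reports 31
```

Because the tabulated case and control frequencies are themselves rounded, a difference of one percentage point from the reported total can occur for some rows.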
CNVRs associated with breast cancer risk and OS.
| CNVR region | Gene name | CNVR size (kb) | Copy number status | P-value | Hazard ratio [95% CI] |
|---|---|---|---|---|---|
| chr19:36846012-36847567* | | 1.55 | gain | 4.78 × 10⁻³ | 2.38 [1.3–4.36] |
| chr1:65393459-65410228* | | 16.77 | gain | 1.07 × 10⁻² | 3.24 [1.31–8.01] |
| chr1:110225034-110226615 | | 1.58 | gain | 1.30 × 10⁻² | 1.81 [1.13–2.89] |
| chr17:80646036-80647251 | | 1.21 | gain | 1.60 × 10⁻² | 2.57 [1.19–5.52] |
| chr6:32487136-32497161 | | 10.02 | gain | 2.25 × 10⁻² | 0.59 [0.38–0.93] |
| chr8:72213838-72215337 | | 1.49 | gain | 3.09 × 10⁻² | 1.59 [1.04–2.43] |
| chr6:161032642-161068568* | | 35.92 | gain | 3.13 × 10⁻² | 0.37 [0.15–0.91] |
| chr3:50951343-50960775 | | 9.43 | gain | 3.18 × 10⁻² | 2.20 [1.07–4.52] |
| chr12:99796328-99797863 | | 1.53 | gain | 3.35 × 10⁻² | 1.94 [1.05–3.57] |
| chr12:2254285-2256046 | | 1.76 | gain | 3.49 × 10⁻² | 0.48 [0.24–0.95] |
| chr4:55111660-55120708* | | 9.05 | loss | 6.58 × 10⁻³ | 0.35 [0.16–0.74] |
| chr16:515664-536683 | | 21.02 | loss | 1.66 × 10⁻² | 0.43 [0.22–0.86] |
| chr21:11053457-11069332 | | 15.87 | loss | 2.01 × 10⁻² | 0.40 [0.19–0.87] |
| chr8:14284477-14288732 | | 4.25 | loss | 2.41 × 10⁻² | 0.27 [0.08–0.84] |
| chr7:75044860-75054268 | | 9.41 | loss | 4.77 × 10⁻² | 0.20 [0.06–0.98] |
List of CNVRs associated with both risk and overall survival, identified using the Cox proportional hazards model. Only the associated copy number status (either loss or gain), compared with diploid, is indicated in the table. CNVR regions marked with "*" are common to OS and RFS. Abbreviation: CI, confidence interval.
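The hazard ratios, confidence intervals and P-values in the table are linked on the log scale. As a rough consistency check, and assuming the reported intervals are symmetric Wald intervals on the log hazard ratio (an assumption; the source does not say how the intervals or P-values were derived), an approximate two-sided P-value can be recovered from a hazard ratio and its 95% CI. The function name is hypothetical.

```python
import math

def wald_p_from_hr(hr: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.96) -> float:
    """Approximate two-sided Wald P-value from a hazard ratio and its CI.

    Assumption: the CI is symmetric on the log scale, i.e.
    log(HR) +/- z_crit * SE.
    """
    # Standard error of log(HR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# First row of the OS table: HR 2.38, 95% CI 1.3 to 4.36.
p_approx = wald_p_from_hr(2.38, 1.3, 4.36)
print(f"{p_approx:.2e}")
```

The recovered value should fall near the tabulated P-value of 4.78 × 10⁻³ for this row; exact agreement is not expected because the published hazard ratio and interval are rounded and the original test statistic may differ.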
CNVRs associated with breast cancer risk and RFS.
| CNVR region | Gene name | CNVR size (kb) | CNV type | Cox P-value | Hazard ratio [95% CI] |
|---|---|---|---|---|---|
| chr19:36846012-36847567* | | 1.55 | Gain | 3.82 × 10⁻⁴ | 2.89 [1.61–5.19] |
| chr4:186629984-186634169 | | 4.18 | Gain | 1.35 × 10⁻² | 3.54 [1.3–9.64] |
| chr1:152572873-152574332 | | 1.46 | Gain | 1.94 × 10⁻² | 1.75 [1.1–2.81] |
| chr1:248787969-248794876 | | 6.91 | Gain | 2.64 × 10⁻² | 2.09 [1.09–4] |
| chr3:195456468-195461506 | | 5.04 | Gain | 3.46 × 10⁻² | 0.62 [0.39–0.97] |
| chr1:65393459-65410228* | | 16.77 | Gain | 3.47 × 10⁻² | 2.6 [1.07–6.47] |
| chr6:161032642-161068568* | | 35.92 | Gain | 5.08 × 10⁻³ | 0.31 [0.13–0.70] |
| chr17:20346165-20366887 | | 20.72 | Gain | 3.52 × 10⁻² | 2.27 [1.06–4.87] |
| chr4:55111660-55120708* | | 9.05 | Loss | 7.92 × 10⁻³ | 0.42 [0.22–0.8] |
| chr6:53931117-53933601 | | 2.48 | Loss | 2.53 × 10⁻² | 0.62 [0.4–0.94] |
| chr4:186629984-186634169 | | 4.18 | Loss | 3.65 × 10⁻² | 1.93 [1.04–3.58] |
List of CNVRs associated with both risk and RFS, identified using the Cox proportional hazards model. Only the associated copy number status (either loss or gain), compared with diploid, is indicated in the table. CNVR regions marked with "*" are common to OS and RFS. A "+" indicates a gene for which both gain and loss were associated with recurrence-free survival when compared with diploid. Abbreviation: CI, confidence interval.
Figure 2. Kaplan-Meier plots for CNVRs associated with overall survival. KM plots were constructed based on the copy number status of each gene to determine the difference in overall survival (OS) between cases with genes harbouring copy number variation (gain/loss) and cases with diploid status. Blue indicates diploid copy number; green indicates copy number gain; red indicates copy number loss. "+" indicates censored events. The number of cases, n, in the analysis is indicated, and the number of events for each survival curve is given in parentheses. The log-rank p-value for the difference between the curves is shown at the bottom of each panel.
Figure 3. Kaplan-Meier plots for CNVRs associated with recurrence-free survival. KM plots were constructed based on the copy number status of each gene to determine the difference in recurrence-free survival (RFS) between cases with genes harbouring copy number variation (gain/loss) and cases with diploid status. Blue indicates diploid copy number; green indicates copy number gain; red indicates copy number loss. "+" indicates censored events. The number of cases, n, in the analysis is indicated, and the number of events for each survival curve is given in parentheses. The log-rank p-value for the difference between the curves is shown at the bottom of each panel.
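The survival curves in Figures 2 and 3 are Kaplan-Meier estimates computed separately for each copy number group. The following is a minimal sketch of the product-limit estimator for one group; the function name and the follow-up times are hypothetical and are not study data.

```python
from typing import List, Tuple

def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate at each distinct event time.

    events[i] is 1 if the event (death or recurrence) was observed at
    times[i], and 0 if the observation was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored cases both leave the risk set after time t.
        at_risk -= n_removed
    return curve

# Hypothetical follow-up times (years) for one copy number group.
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored observations (the "+" marks in the figures) do not lower the curve; they only shrink the risk set for later event times. Comparing groups, as the log-rank p-values in the panels do, requires an additional test that is not sketched here.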
Figure 4. Copy number status estimated in study samples using the TaqMan assay. The copy number status of APOBEC3B (a) and GSTM1 (b) is shown for each sample. Human RNase P was used as the internal reference for normalization, and the Coriell sample NA18635, which is diploid for both genes, was also used in copy number estimation.
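The caption names a reference gene (RNase P) and a diploid sample (NA18635) but does not give the calculation. A common approach for TaqMan copy number assays is the comparative Ct method, sketched below under that assumption. The function name and all Ct values are hypothetical; only the two-copy status of the calibrator comes from the caption.

```python
def copy_number(ct_target: float, ct_reference: float,
                ct_target_cal: float, ct_reference_cal: float,
                calibrator_copies: int = 2) -> float:
    """Estimate copy number by the comparative Ct (delta-delta Ct) method.

    Assumptions: amplification efficiency is close to 100% for both
    assays, and the calibrator sample carries `calibrator_copies`
    copies of the target (2 for a diploid sample such as NA18635).
    """
    delta_sample = ct_target - ct_reference              # target vs RNase P, test sample
    delta_calibrator = ct_target_cal - ct_reference_cal  # same, calibrator sample
    delta_delta_ct = delta_sample - delta_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)

# Hypothetical Ct values: the target crosses threshold one cycle later
# than in the calibrator, which suggests a single copy (heterozygous loss).
estimate = copy_number(ct_target=27.0, ct_reference=26.0,
                       ct_target_cal=26.0, ct_reference_cal=26.0)
print(round(estimate))
```

In practice the estimate is rounded to the nearest integer, and replicate wells are averaged before the calculation. A homozygous deletion gives no target amplification at all, so it has to be handled separately from this formula.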
Figure 5. Association of germline copy number status with gene expression in breast tumor tissue. Germline copy number status of individual genes was plotted against gene expression in breast tumors from matched samples. Green, grey and red represent gain, diploid and deletion, respectively.