Guimin Gao, Brandon L Pierce, Olufunmilayo I Olopade, Hae Kyung Im, Dezheng Huo.
Abstract
Genome-wide association studies (GWAS) have identified more than 90 susceptibility loci for breast cancer, but the underlying biology of those associations needs to be further elucidated. More genetic factors for breast cancer are yet to be identified, but sample size constraints preclude the identification of individual genetic variants with weak effects using traditional GWAS methods. To address this challenge, we utilized a gene-level expression-based method, implemented in the MetaXcan software, to predict gene expression levels for 11,536 genes using expression quantitative trait loci and to examine the genetically predicted expression of specific genes for association with overall breast cancer risk and estrogen receptor (ER)-negative breast cancer risk. Using GWAS datasets from a Challenge launched by the National Cancer Institute, we identified TP53INP2 (tumor protein p53-inducible nuclear protein 2) at 20q11.22 to be significantly associated with ER-negative breast cancer (Z = -5.013, p = 5.35×10−7, Bonferroni threshold = 4.33×10−6). The association was consistent across four GWAS datasets, representing European, African and Asian ancestry populations. Six single nucleotide polymorphisms (SNPs) were included in the prediction of TP53INP2 expression, and five of them were associated with ER-negative breast cancer, although none of the SNP-level associations reached genome-wide significance. We conducted a replication study using a dataset outside of the Challenge and found that the association between TP53INP2 and ER-negative breast cancer was significant (p = 5.07×10−3). Expression of HP (16q22.2) showed a suggestive association with ER-negative breast cancer in the discovery phase (Z = 4.30, p = 1.70×10−5), although the association was not significant after Bonferroni adjustment.
Of the 249 genes located within 250 kb of known breast cancer susceptibility loci identified in previous GWAS, 20 genes (8.0%) were statistically significantly associated with ER-negative breast cancer (p < 0.05), compared to 582 (5.2%) of 11,287 genes that are not close to previous GWAS loci. This study demonstrated that expression-based gene mapping is a promising approach for identifying cancer susceptibility genes.
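As a quick consistency check on the figures quoted in the abstract, the sketch below recomputes the Bonferroni threshold, the two-sided p-values implied by the reported Z scores, and the two proportions in the enrichment comparison. It assumes a family-wise alpha of 0.05 and a two-sided normal test, neither of which is stated explicitly in the abstract, and all variable and function names are illustrative.

```python
import math

N_GENES = 11_536   # genes with predicted expression (from the abstract)
ALPHA = 0.05       # assumed family-wise error rate


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Bonferroni threshold: alpha divided by the number of genes tested.
bonferroni = ALPHA / N_GENES

# Z scores reported for ER-negative breast cancer.
p_tp53inp2 = two_sided_p(-5.013)
p_hp = two_sided_p(4.30)

# Share of genes with p < 0.05 near vs. away from known GWAS loci.
share_near = 20 / 249
share_far = 582 / 11_287

print(f"Bonferroni threshold: {bonferroni:.3g}")
print(f"TP53INP2 p: {p_tp53inp2:.3g}, passes: {p_tp53inp2 < bonferroni}")
print(f"HP p: {p_hp:.3g}, passes: {p_hp < bonferroni}")
print(f"near loci: {share_near:.1%}, elsewhere: {share_far:.1%}")
```

Under these assumptions the TP53INP2 p-value falls below the threshold while the HP p-value does not, which matches the abstract's description of HP as suggestive only.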
Year: 2017 PMID: 28957356 PMCID: PMC5619687 DOI: 10.1371/journal.pgen.1006727
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Top genes with P-values < 10−3 in analyses of association between predicted gene expressions and overall breast cancer risk*.
| Gene | Cytoband | SNPs in predictor | AABC Z score | AABC P value | CGEMS Z score | CGEMS P value | ROOT Z score | ROOT P value | SBCGS Z score | SBCGS P value | Total Z score | Total P value | FDR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 6 | | | | | | | | | | | 0.34 |
| BAG3 | 10q26.11 | 18 | -2.109 | 3.49E-02 | -1.145 | 2.52E-01 | -3.003 | 2.67E-03 | -1.074 | 2.83E-01 | -3.660 | 2.52E-04 | 0.77 |
| POLN | 4p16.3 | 39 | -2.291 | 2.20E-02 | -2.614 | 8.96E-03 | -0.942 | 3.46E-01 | -1.624 | 1.04E-01 | -3.644 | 2.68E-04 | 0.77 |
| WDR37 | 10p15.3 | 9 | -1.637 | 1.02E-01 | -0.747 | 4.55E-01 | -0.292 | 7.70E-01 | -4.144 | 3.42E-05 | -3.629 | 2.84E-04 | 0.77 |
| TTLL5 | 14q24.3 | 26 | 2.717 | 6.59E-03 | 2.087 | 3.69E-02 | 1.189 | 2.35E-01 | 1.206 | 2.28E-01 | 3.588 | 3.33E-04 | 0.77 |
| HP | 16q22.2 | 19 | 2.424 | 1.53E-02 | 1.961 | 4.99E-02 | 1.598 | 1.10E-01 | 1.147 | 2.52E-01 | 3.529 | 4.18E-04 | 0.77 |
| VTI1B | 14q24.1 | 1 | 1.433 | 1.52E-01 | 2.790 | 5.28E-03 | 0.902 | 3.67E-01 | 2.151 | 3.15E-02 | 3.471 | 5.18E-04 | 0.77 |
| HLA-DMA | 6p21.32 | 30 | -2.001 | 4.54E-02 | -0.756 | 4.50E-01 | -1.603 | 1.09E-01 | -2.293 | 2.19E-02 | -3.456 | 5.47E-04 | 0.77 |
| MYOM2 | 8p23.3 | 109 | 2.338 | 1.94E-02 | -0.051 | 9.59E-01 | 2.153 | 3.14E-02 | 1.955 | 5.06E-02 | 3.430 | 6.04E-04 | 0.77 |
| MYO9B | 19p13.11 | 6 | 1.643 | 1.00E-01 | 0.887 | 3.75E-01 | 1.473 | 1.41E-01 | 2.549 | 1.08E-02 | 3.373 | 7.44E-04 | 0.81 |
| ZNF202 | 11q24.1 | 15 | 2.214 | 2.69E-02 | 1.675 | 9.39E-02 | 0.003 | 9.98E-01 | 2.644 | 8.20E-03 | 3.363 | 7.71E-04 | 0.81 |
*Bonferroni threshold = 4.33×10−6.
FDR, false discovery rate.
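The table does not state how the Total Z score is obtained from the four study-level Z scores. One common choice, shown here purely as an assumption, is a sample-size-weighted (Stouffer-type) combination with weights proportional to the square root of each study's size. Taking cases plus controls from the dbGaP dataset table as the study sizes, this sketch comes within about 0.01 of the Total Z reported for BAG3 and POLN. The function and variable names are hypothetical.

```python
import math

# Cases + controls per study, taken from the dbGaP dataset table.
SAMPLE_SIZE = {
    "AABC": 3016 + 2745,
    "CGEMS": 1142 + 1145,
    "ROOT": 1657 + 2029,
    "SBCGS": 2790 + 2176,
}


def combine_z(z_by_study: dict) -> float:
    """Weighted Stouffer combination with weights sqrt(N) (an assumption)."""
    numerator = sum(math.sqrt(SAMPLE_SIZE[s]) * z for s, z in z_by_study.items())
    denominator = math.sqrt(sum(SAMPLE_SIZE[s] for s in z_by_study))
    return numerator / denominator


# Study-level Z scores copied from the table rows for BAG3 and POLN.
bag3 = {"AABC": -2.109, "CGEMS": -1.145, "ROOT": -3.003, "SBCGS": -1.074}
poln = {"AABC": -2.291, "CGEMS": -2.614, "ROOT": -0.942, "SBCGS": -1.624}

z_bag3 = combine_z(bag3)
z_poln = combine_z(poln)
print(f"BAG3 combined Z: {z_bag3:.3f} (reported -3.660)")
print(f"POLN combined Z: {z_poln:.3f} (reported -3.644)")
```

The small residual differences could come from rounding or from a different weighting (for example, effective sample size), so the agreement should be read as suggestive rather than as confirmation of the authors' procedure.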
Top genes with P-values < 10−3 in analyses of association between predicted gene expressions and ER-negative breast cancer risk*.
| Gene | Cytoband | SNPs in predictor | AABC Z score | AABC P value | BPC3 Z score | BPC3 P value | ROOT Z score | ROOT P value | SBCGS Z score | SBCGS P value | Total Z score | Total P value | FDR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | | 0.0062 |
| | | | | | | | | | | | | | 0.098 |
| DHODH | 16q22.2 | 58 | -1.121 | 2.62E-01 | -4.700 | 2.61E-06 | 1.020 | 3.08E-01 | -1.859 | 6.31E-02 | -4.119 | 3.80E-05 | 0.15 |
| YJEFN3 | 19p13.11 | 20 | -2.650 | 8.05E-03 | -2.797 | 5.16E-03 | 0.154 | 8.78E-01 | -1.549 | 1.21E-01 | -3.810 | 1.39E-04 | 0.34 |
| MAP1LC3A | 20q11.22 | 49 | -2.077 | 3.78E-02 | -2.922 | 3.48E-03 | -1.751 | 7.99E-02 | -0.157 | 8.75E-01 | -3.734 | 1.88E-04 | 0.34 |
| DPY19L1 | 7p14.2 | 24 | 2.188 | 2.87E-02 | 3.035 | 2.41E-03 | 0.672 | 5.01E-01 | 0.791 | 4.29E-01 | 3.731 | 1.91E-04 | 0.34 |
| GCOM1 | 15q21.3 | 75 | -1.841 | 6.56E-02 | -3.295 | 9.85E-04 | -0.854 | 3.93E-01 | -0.525 | 5.99E-01 | -3.689 | 2.25E-04 | 0.34 |
| AMOTL1 | 11q21 | 14 | 2.155 | 3.12E-02 | 2.118 | 3.42E-02 | 0.448 | 6.54E-01 | 2.509 | 1.21E-02 | 3.675 | 2.38E-04 | 0.34 |
| ITCH | 20q11.22 | 12 | -1.597 | 1.10E-01 | -3.861 | 1.13E-04 | 0.203 | 8.39E-01 | -0.318 | 7.51E-01 | -3.494 | 4.77E-04 | 0.61 |
| TRPC4AP | 20q11.22 | 26 | 2.385 | 1.71E-02 | 1.899 | 5.76E-02 | 2.536 | 1.12E-02 | 0.127 | 8.99E-01 | 3.466 | 5.28E-04 | 0.61 |
| SNX24 | 5q23.2 | 3 | -0.902 | 3.67E-01 | -2.235 | 2.54E-02 | -1.612 | 1.07E-01 | -2.022 | 4.32E-02 | -3.327 | 8.77E-04 | 0.91 |
*Bonferroni threshold = 4.33×10−6
FDR, false discovery rate
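The FDR column is consistent with a Benjamini-Hochberg adjustment over all 11,536 genes, although the method is not named here. The sketch below applies that adjustment to the Total p-values in the table, with the TP53INP2 and HP p-values taken from the abstract as the two smallest. Because only the smallest p-values are listed, the step-up minimum can only look at the listed values, so each result is an upper bound on the full-list value. The function name is hypothetical.

```python
N_GENES = 11_536  # total number of genes tested


def bh_adjust_smallest(p_sorted: list, m: int) -> list:
    """Benjamini-Hochberg adjusted p-values for the k smallest of m tests.

    p_sorted must be in ascending order and hold the k smallest p-values.
    """
    raw = [min(1.0, p * m / rank) for rank, p in enumerate(p_sorted, start=1)]
    adjusted = raw[:]
    # Enforce monotonicity: each value is the minimum of itself and all later ones.
    for i in range(len(adjusted) - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    return adjusted


# Total p-values, ascending: TP53INP2 and HP (abstract), then the table rows.
p_values = [
    5.35e-7, 1.70e-5, 3.80e-5, 1.39e-4, 1.88e-4, 1.91e-4,
    2.25e-4, 2.38e-4, 4.77e-4, 5.28e-4, 8.77e-4,
]

fdr = bh_adjust_smallest(p_values, N_GENES)
for rank, (p, q) in enumerate(zip(p_values, fdr), start=1):
    print(f"rank {rank:2d}: p = {p:.2e}, BH-adjusted = {q:.2g}")
```

For the last listed gene the truncated calculation gives a slightly larger value than the table, which is expected when genes beyond the listed ones are unavailable to the step-up minimum.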
TP53INP2-related SNPs and their association with breast cancer risk.
| SNP | Pos. at chr20* | Test/ref allele | Study | Overall OR (95% CI) | Overall P | ER-negative OR (95% CI) | ER-negative P |
|---|---|---|---|---|---|---|---|
| rs1205339 | 32,924,967 | G/A | BPC3 | | | 1.17 (1.05–1.31) | 5.9E-03 |
| | -367,127 | | CGEMS | 1.05 (0.91–1.23) | 0.49 | | |
| | | | AABC | 1.14 (1.03–1.27) | 0.013 | 1.29 (1.12–1.50) | 5.6E-04 |
| | | | ROOT | 1.19 (1.04–1.36) | 0.012 | 1.39 (1.10–1.76) | 5.6E-03 |
| | | | SBCGS | 1.07 (0.97–1.18) | 0.18 | 1.06 (0.89–1.26) | 0.51 |
| | | | meta | 1.11 (1.05–1.18) | 4.2E-04 | 1.20 (1.11–1.29) | 1.8E-06 |
| rs4911154 | 32,996,101 | A/G | BPC3 | | | 1.16 (1.04–1.30) | 0.01 |
| | -295,993 | | CGEMS | 1.04 (0.89–1.21) | 0.65 | | |
| | | | AABC | 1.15 (1.03–1.28) | 0.014 | 1.32 (1.14–1.54) | 2.9E-04 |
| | | | ROOT | 1.23 (1.07–1.42) | 3.5E-03 | 1.39 (1.09–1.78) | 8.3E-03 |
| | | | SBCGS | 1.09 (0.98–1.21) | 0.11 | 1.06 (0.88–1.28) | 0.54 |
| | | | meta | 1.13 (1.06–1.20) | 1.6E-04 | 1.20 (1.11–1.30) | 2.6E-06 |
| rs8116198† | 33,114,201 | G/A | BPC3 | | | 0.92 (0.80–1.05) | 0.21 |
| | -177,893 | | CGEMS | 0.94 (0.78–1.13) | 0.5 | | |
| | | | AABC | 1.04 (0.82–1.33) | 0.73 | 1.11 (0.79–1.57) | 0.54 |
| | | | ROOT | 0.67 (0.45–1.00) | 0.052 | 0.87 (0.49–1.54) | 0.63 |
| | | | meta | 0.93 (0.81–1.07) | 0.33 | 0.92 (0.80–1.04) | 0.30 |
| rs6058107 | 33,288,546 | C/T | BPC3 | | | 0.87 (0.78–0.97) | 8.7E-03 |
| | -3,548 | | CGEMS | 0.92 (0.80–1.06) | 0.27 | | |
| | | | AABC | 0.91 (0.83–1.01) | 0.072 | 0.84 (0.73–0.96) | 0.014 |
| | | | ROOT | 0.83 (0.73–0.94) | 4.00E-03 | 0.80 (0.64–0.99) | 0.041 |
| | | | SBCGS | 0.92 (0.84–1.00) | 0.057 | 1.03 (0.88–1.20) | 0.71 |
| | | | meta | 0.90 (0.85–0.95) | 1.4E-04 | 0.90 (0.83–0.97) | 5.0E-04 |
| rs6060047 | 33,367,400 | G/T | BPC3 | | | 0.87 (0.77–0.97) | 0.017 |
| | 75,306 | | CGEMS | 0.91 (0.78–1.07) | 0.27 | | |
| | | | AABC | 0.88 (0.79–0.98) | 0.016 | 0.75 (0.65–0.86) | 7.5E-05 |
| | | | ROOT | 0.83 (0.73–0.95) | 6.10E-03 | 0.76 (0.60–0.96) | 0.019 |
| | | | SBCGS | 0.94 (0.86–1.03) | 0.21 | 0.98 (0.83–1.16) | 0.81 |
| | | | meta | 0.90 (0.85–0.95) | 2.1E-04 | 0.84 (0.78–0.91) | 7.3E-06 |
| rs11546155 | 33,451,148 | A/G | BPC3 | | | 1.19 (1.04–1.35) | 9.1E-03 |
| | 159,054 | | CGEMS | 1.12 (0.94–1.34) | 0.21 | | |
| | | | AABC | 1.14 (1.02–1.27) | 0.023 | 1.32 (1.14–1.54) | 3.3E-04 |
| | | | ROOT | 1.11 (0.96–1.28) | 0.15 | 1.18 (0.93–1.51) | 0.18 |
| | | | SBCGS | 1.16 (0.98–1.38) | 0.089 | 1.26 (0.94–1.70) | 0.13 |
| | | | meta | 1.13 (1.05–1.21) | 7.0E-04 | 1.23 (1.13–1.35) | 2.0E-06 |
* Genomic position (NCBI build 37) and distance in base pairs from the transcription start site of TP53INP2.
† rs8116198 is monomorphic in the Asian population.
None of the tests for heterogeneity across studies was significant.
OR, odds ratio; CI, confidence intervals; ER, estrogen receptor
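The "meta" rows can be approximated from the study-level odds ratios. The sketch below uses a fixed-effect, inverse-variance meta-analysis on the log odds ratio scale, recovering each standard error from the reported 95% confidence interval. The choice of a fixed-effect model is an assumption, since the table does not name the method, and the function names are hypothetical. For the overall association of rs1205339 it reproduces the reported pooled estimate of 1.11 (1.05–1.18) after rounding.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_or_and_se(odds_ratio: float, ci_low: float, ci_high: float):
    """Log odds ratio and its standard error recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(odds_ratio), se


def fixed_effect_meta(studies):
    """Inverse-variance pooled OR, 95% CI and two-sided p-value."""
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        beta, se = log_or_and_se(odds_ratio, ci_low, ci_high)
        weight = 1.0 / se**2
        weighted_sum += weight * beta
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    p_value = math.erfc(abs(pooled / pooled_se) / math.sqrt(2.0))
    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * pooled_se),
        math.exp(pooled + Z95 * pooled_se),
        p_value,
    )


# rs1205339, overall breast cancer: CGEMS, AABC, ROOT, SBCGS rows of the table.
overall_rs1205339 = [
    (1.05, 0.91, 1.23),
    (1.14, 1.03, 1.27),
    (1.19, 1.04, 1.36),
    (1.07, 0.97, 1.18),
]

pooled_or, ci_low, ci_high, p_value = fixed_effect_meta(overall_rs1205339)
print(f"pooled OR = {pooled_or:.2f} ({ci_low:.2f}-{ci_high:.2f}), p = {p_value:.1e}")
```

Because the inputs are odds ratios rounded to two decimals, the recomputed p-value differs slightly from the tabulated one; the pooled estimate and interval are less sensitive to that rounding.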
Regulatory element annotation of variants that predicted expression of TP53INP2 using HaploReg [33].
| Variant | Position | Promoter | Enhancer | DNAse hypersensitivity | Proteins | Motifs |
|---|---|---|---|---|---|---|
| rs1205339 | -367,127 | 6 tissues including breast and blood (HMEC, MYO, HMEC.35) | ATF2, Mef2, Pax-4, Pou1f1, TATA | |||
| rs4911154 | -295,993 | Liver | Blood, liver | TCF4 | RFX5 | |
| rs8116198 | -177,893 | 24 tissues including breast and blood (HMEC, MYO, HMEC.35) | POL2, TBP, TR4 | Rad21 | ||
| rs6058107 | -3,548 | 24 tissues including breast and blood (HMEC, MYO, HMEC.35) | 28 tissues including breast and blood | AP-1,NF-E2 | ||
| rs6060047 | 75,306 | Multiple tissue types including blood and breast (HMEC, MYO, HMEC.35) | Multiple tissue types including blood | BATF, GCNF, HNF1, Irf, STAT | ||
| rs11546155 | 159,054 | 2 tissue types | 4 tissue types | NRSF, Pou5f1, RXRA, Sin3Ak-20 |
* Distance in base pairs from the transcription start site of TP53INP2
† Normal mammary or breast cancer cell lines are indicated in parentheses. HMEC.35: breast variant human mammary epithelial cells; MYO: breast myoepithelial primary cells; HMEC: mammary epithelial primary cells (vMHEC)
‡ Variants in strong linkage disequilibrium
GAME-ON replication for SNPs related to the TP53INP2 gene.
| SNP | Test/ref allele | Study phase | Overall OR (95% CI) | Overall P | ER-negative OR (95% CI) | ER-negative P |
|---|---|---|---|---|---|---|
| rs11546155 | A/G | U4C | 1.13 (1.05–1.21) | 7.0E-04 | 1.28 (1.14–1.44) | 5.0E-05 |
| | | GAME-ON | 1.05 (0.99–1.10) | 0.11 | 1.13 (1.03–1.25) | 0.013 |
| rs1205339 | G/A | U4C | 1.11 (1.05–1.18) | 4.1E-04 | 1.22 (1.11–1.35) | 8.0E-05 |
| | | GAME-ON | 1.02 (0.98–1.07) | 0.3 | 1.09 (1.01–1.17) | 0.021 |
| rs4911154 | A/G | U4C | 1.13 (1.06–1.20) | 1.6E-04 | 1.25 (1.12–1.39) | 5.5E-05 |
| | | GAME-ON | 1.02 (0.98–1.07) | 0.31 | 1.09 (1.01–1.17) | 0.02 |
| rs6058107 | C/T | U4C | 0.90 (0.85–0.95) | 1.5E-04 | 0.90 (0.82–0.98) | 0.021 |
| | | GAME-ON | 0.96 (0.92–1.00) | 0.045 | 0.91 (0.85–0.97) | 5.0E-03 |
| rs6060047 | G/T | U4C | 0.90 (0.85–0.95) | 2.1E-04 | 0.82 (0.75–0.91) | 1.2E-04 |
| | | GAME-ON | 0.96 (0.91–1.00) | 0.066 | 0.90 (0.84–0.97) | 7.7E-03 |
| rs8116198 | G/A | U4C | 0.93 (0.81–1.07) | 0.33 | 1.04 (0.78–1.40) | 0.79 |
| | | GAME-ON | 0.97 (0.91–1.03) | 0.28 | 0.94 (0.84–1.05) | 0.28 |
*The overlapping study (BPC3) was removed from the meta-analysis in the discovery phase (U4C).
OR, odds ratio; CI, confidence interval; ER, estrogen receptor
dbGaP datasets used in our gene-level expression-based GWAS analysis.
| Accession Number | Study Name | Acronym | Breast Cancer Phenotype | Population |
|---|---|---|---|---|
| phs000851 | African American Breast Cancer GWAS | AABC | 3016 cases, 988 ER- cases, 2745 controls | African American |
| phs000812 | Breast and Prostate Cancer Cohort Consortium GWAS | BPC3 | 1998 ER- cases, 3263 controls | European American |
| phs000147 | Cancer Genetic Markers of Susceptibility Breast Cancer GWAS | CGEMS | 1142 cases, 1145 controls | European American |
| phs000383 | GWAS of Breast Cancer in the African Diaspora | ROOT | 1657 cases, 403 ER- cases, 2029 controls | African American, African, African Barbadian |
| phs000799 | Shanghai Breast Cancer Genetic Study | SBCGS | 2790 cases, 490 ER- cases, and 2176 controls | Asian (Chinese) |