Shengfeng Wang1,2, Jason J Pitt3,4, Yonglan Zheng2, Toshio F Yoshimatsu2, Guimin Gao5, Ayodele Sanni6, Olayiwola Oluwasola7, Mustapha Ajani7, Dominic Fitzgerald3, Abayomi Odetunde8, Galina Khramtsova2, Ian Hurley2, Abiodun Popoola9, Adeyinka Falusi8, Temidayo Ogundiran10, John Obafunwa6, Oladosu Ojengbede11, Nasiru Ibrahim8, Jordi Barretina12, Kevin P White13, Dezheng Huo5, Olufunmilayo I Olopade2.
Abstract
Somatic mutation signatures may represent footprints of genetic and environmental exposures that cause different cancers. Few studies have comprehensively examined their association with germline variants, and none has done so in an indigenous African population. SomaticSignatures was used to extract mutation signatures from whole-genome or whole-exome sequencing data from female patients with breast cancer (TCGA, training set, n = 1,011; Nigerian samples, validation set, n = 170) and to estimate the contribution of each signature in each sample. Associations between somatic signatures and common single nucleotide polymorphisms (SNPs) or rare deleterious variants were examined using linear regression. Nine stable signatures were inferred, and four signatures (APOBEC C>T, APOBEC C>G, aging and homologous recombination deficiency) were highly similar to known COSMIC signatures and explained the majority (60-85%) of signature contributions. There were significant heritable components associated with the APOBEC C>T signature (h² = 0.575, p = 0.010) and the combined APOBEC signatures (h² = 0.432, p = 0.042). In the TCGA dataset, seven common SNPs within or near GNB5 were associated at genome-wide significance with an increased proportion of APOBEC signature contribution (beta = 0.33, 95% CI = 0.21-0.45), while rare germline mutations in MTCL1 were also significantly associated with a higher contribution of this signature (p = 6.1 × 10⁻⁶). This is the first study to identify associations between germline variants and mutational patterns in breast cancer across diverse populations and geography. The findings provide evidence to substantiate causal links between germline genetic risk variants and carcinogenesis.
Keywords: breast cancer; rare deleterious variants; single-nucleotide polymorphisms; somatic mutation signatures
Year: 2019 PMID: 31173346 PMCID: PMC6851589 DOI: 10.1002/ijc.32498
Source DB: PubMed Journal: Int J Cancer ISSN: 0020-7136 Impact factor: 7.396
Figure 1. Flow chart of this study. [Color figure can be viewed at http://wileyonlinelibrary.com]
GWAS estimates of signature heritability for signature contributions in TCGA dataset
Adjusted PCA, 733 samples (excluding relatedness)

| Signature | Description | Value | SE | Value | SE | h² | SE | p |
|---|---|---|---|---|---|---|---|---|
| 1 | APOBEC of cytidine deaminases (C>T) | 0.011 | 0.006 | 0.008 | 0.004 | 0.575 | 0.254 | **0.010** |
| 2 | APOBEC (C>G) | 0.001 | 0.004 | 0.010 | 0.002 | 0.108 | 0.307 | 0.350 |
| 1 + 2 | APOBEC | 0.022 | 0.016 | 0.028 | 0.010 | 0.432 | 0.272 | **0.042** |
| 6 | Spontaneous deamination of 5‐methylcytosine | 0.001 | 0.011 | 0.034 | 0.007 | 0.040 | 0.313 | 0.445 |
| 9 | Failure of DNA double‐strand break‐repair by homologous recombination | 0.004 | 0.012 | 0.026 | 0.007 | 0.154 | 0.358 | 0.349 |
Note: In TCGA, among 24,570,114 SNPs, 1,573,566 variants were removed due to missing genotype data (geno = 0.05), 5,301,200 variants were removed due to the minor allele frequency threshold (MAF, 0.001) and 188,864 variants were removed due to the Hardy–Weinberg exact test. A total of 17,506,484 SNPs were kept in the end. Among 1,054 samples with GWAS information, 178 samples were removed due to missing genotype data (mind = 0.01), 77 were removed when pruning the GRM (relatedness = 0.05) and 66 samples were excluded because their mutation signatures were missing. In the end, 733 samples remained in the model. Bold p values denote statistical significance at p < 0.05.
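The filtering counts in the note above can be checked with simple subtraction. The following is a minimal sketch: the dictionary keys and the helper name `remaining` are illustrative labels chosen here, not identifiers from the study's pipeline.

```python
def remaining(total, removed):
    """Return how many items are left after subtracting every removal count."""
    return total - sum(removed.values())

# SNP-level filters reported in the table note (labels are illustrative).
snp_removed = {
    "missing_genotype": 1_573_566,   # geno = 0.05
    "minor_allele_freq": 5_301_200,  # MAF threshold 0.001
    "hardy_weinberg": 188_864,       # Hardy-Weinberg exact test
}

# Sample-level filters reported in the table note.
sample_removed = {
    "missing_genotype": 178,         # mind = 0.01
    "grm_pruning": 77,               # relatedness = 0.05
    "missing_signature": 66,         # no mutation signature available
}

snps_kept = remaining(24_570_114, snp_removed)
samples_kept = remaining(1_054, sample_removed)

print(snps_kept, samples_kept)
```

Both results agree with the totals stated in the note (17,506,484 SNPs and 733 samples).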
Figure 2. Common SNP rs66866642/rs578194564 in GNB5 is associated with APOBEC signatures. (a) Manhattan plot for associations between common SNPs and APOBEC signatures. (b) Plot of log‐transformed p values from single marker analysis for the GNB5 gene. The labeled marker, a purple diamond (rs66866642/rs578194564), is the most significant SNP (index SNP). The LD between the index SNP and other markers in the region is color coded, with red indicating strong LD (r² > 0.8) and blue indicating weak LD (r² < 0.2). (c) Genotype of rs66866642/rs578194564 among different ethnic groups. (d) Functional elements at the GNB5 gene region. Data are from ENCODE through the UCSC Genome Browser, including histone modification marks for H3K4Me1 and H3K27Ac, transcription factor binding sites and DNase hypersensitivity sites of human mammary epithelial cells (HMEC) and breast cancer cells (MCF7).
The top 30 signals for common SNPs in association tests with combined APOBEC signature contributions
| ID | SNP | Chromosome | Position (base) | Ref allele | ALT allele | TCGA information score | TCGA MAF | TCGA p | TCGA beta | TCGA standard error | Nigeria MAF | Nigeria p | Nigeria beta | Nigeria standard error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | rs66866642/rs578194564 | 15 | 52,444,832 | C | CT | 0.948 | 0.169 | | 0.353 | 0.062 | – | – | – | – |
| 2 | rs12902073 | 15 | 52,447,508 | G | A | 0.986 | 0.170 | | 0.334 | 0.060 | – | – | – | – |
| 3 | rs12440354 | 15 | 52,452,434 | T | C | 0.983 | 0.175 | | 0.328 | 0.060 | – | – | – | – |
| 4 | rs12901730 | 15 | 52,452,697 | T | A | 0.983 | 0.171 | | 0.330 | 0.060 | – | – | – | – |
| 5 | rs12901891 | 15 | 52,452,834 | T | C | 0.983 | 0.171 | | 0.330 | 0.060 | – | – | – | – |
| 6 | rs12438743 | 15 | 52,454,098 | C | T | 0.982 | 0.171 | | 0.330 | 0.060 | – | – | – | – |
| 7 | rs373292393 | 15 | 52,440,257 | C | CAA | 0.983 | 0.155 | | 0.340 | 0.062 | – | – | – | – |
| 8 | rs12910398 | 15 | 52,445,477 | C | T | 0.987 | 0.176 | 5.57E‐08 | 0.322 | 0.059 | 0.051 | 0.239 | 0.390 | 0.331 |
| 9 | rs12905698 | 15 | 52,445,090 | A | G | 0.986 | 0.176 | 6.16E‐08 | 0.321 | 0.059 | 0.051 | 0.239 | 0.390 | 0.331 |
| 10 | rs12909880 | 15 | 52,449,293 | T | C | 0.982 | 0.173 | 6.76E‐08 | 0.323 | 0.060 | – | – | – | – |
| 11 | rs12902522 | 15 | 52,448,036 | A | G | 0.986 | 0.177 | 7.59E‐08 | 0.319 | 0.059 | 0.051 | 0.239 | 0.390 | 0.331 |
| 12 | rs12909184 | 15 | 52,450,617 | G | T | 0.983 | 0.178 | 7.91E‐08 | 0.319 | 0.059 | 0.051 | 0.239 | 0.390 | 0.331 |
| 13 | rs12910052 | 15 | 52,449,360 | T | C | 0.983 | 0.178 | 7.98E‐08 | 0.318 | 0.059 | 0.051 | 0.239 | 0.390 | 0.331 |
| 14 | rs12903769 | 15 | 52,449,733 | G | A | 0.982 | 0.167 | 8.20E‐08 | 0.327 | 0.061 | – | – | – | – |
| 15 | rs12908833 | 15 | 52,449,195 | G | C | 0.982 | 0.174 | 8.80E‐08 | 0.320 | 0.060 | – | – | – | – |
| 16 | rs12908742 | 15 | 52,449,379 | A | C | 0.982 | 0.174 | 8.82E‐08 | 0.320 | 0.060 | 0.051 | 0.239 | 0.390 | 0.331 |
| 17 | rs3794543 | 15 | 52,441,524 | C | T | 0.993 | 0.158 | 1.07E‐07 | 0.327 | 0.061 | – | – | – | – |
| 18 | rs4776007 | 15 | 52,447,299 | A | G | 0.987 | 0.185 | 1.29E‐07 | 0.308 | 0.058 | 0.097 | 0.203 | 0.323 | 0.254 |
| 19 | rs8034097 | 15 | 52,437,278 | A | G | 0.995 | 0.151 | 1.32E‐07 | 0.330 | 0.063 | – | – | – | – |
| 20 | rs12438274 | 15 | 52,439,368 | T | G | 0.971 | 0.188 | 1.33E‐07 | 0.311 | 0.059 | – | – | – | – |
| 21 | rs12438194 | 15 | 52,439,096 | T | A | 0.993 | 0.152 | 1.35E‐07 | 0.330 | 0.063 | – | – | – | – |
| 22 | rs4636859 | 15 | 52,435,696 | T | C | 0.996 | 0.151 | 1.45E‐07 | 0.329 | 0.063 | – | – | – | – |
| 23 | rs12438937 | 15 | 52,441,686 | C | T | 0.992 | 0.170 | 1.79E‐07 | 0.313 | 0.060 | 0.092 | 0.216 | 0.320 | 0.259 |
| 24 | rs35709612 | 15 | 52,439,940 | A | C | 0.994 | 0.169 | 1.80E‐07 | 0.313 | 0.060 | 0.092 | 0.208 | 0.326 | 0.259 |
| 25 | rs35448038 | 15 | 52,434,758 | GT | G | 0.981 | 0.163 | 1.89E‐07 | 0.317 | 0.061 | – | – | – | – |
| 26 | rs12593041 | 15 | 52,443,107 | C | T | 0.988 | 0.171 | 2.49E‐07 | 0.310 | 0.060 | 0.097 | 0.203 | 0.323 | 0.254 |
| 27 | rs62014722 | 15 | 52,434,338 | C | G | 0.995 | 0.163 | 2.52E‐07 | 0.313 | 0.061 | 0.056 | 0.130 | 0.481 | 0.318 |
| 28 | rs62014721 | 15 | 52,434,284 | C | G | 0.994 | 0.163 | 2.53E‐07 | 0.313 | 0.061 | 0.056 | 0.130 | 0.481 | 0.318 |
| 29 | rs4238384 | 15 | 52,436,334 | C | T | 1.000 | 0.222 | 2.75E‐07 | 0.281 | 0.055 | 0.276 | 0.227 | 0.212 | 0.175 |
| 30 | rs71130134 | 15 | 52,437,673 | CA | C | 0.980 | 0.157 | 4.73E‐07 | 0.313 | 0.062 | – | – | – | – |
Bold p values denote statistical significance at p < 5 × 10⁻⁸ (genome‐wide association study threshold).
A total of 17 SNPs were not covered in the WES samples, so the corresponding cells are filled with “–”.
Abbreviation: MAF, minor allele frequency.
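The effect size quoted in the abstract (beta = 0.33, 95% CI = 0.21-0.45) can be related to the beta and standard error columns of this table. The sketch below assumes a normal-approximation (Wald) interval with a critical value of 1.96; the source does not state how the interval was computed, so this is an illustration rather than the authors' procedure. The function name `wald_ci` is hypothetical.

```python
def wald_ci(beta, se, z=1.96):
    """Normal-approximation confidence interval: beta +/- z * se.

    z = 1.96 corresponds to a two-sided 95% interval (assumed here).
    """
    return beta - z * se, beta + z * se

# TCGA beta and standard error as listed for rs12901730 (row 4 of the table).
lower, upper = wald_ci(0.330, 0.060)

print(f"beta = 0.330, 95% CI = {lower:.2f}-{upper:.2f}")
```

Under that assumption the interval rounds to the range given in the abstract.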
Figure 3. Associations between rare deleterious variants and HRD signature. (a) Plot of log‐transformed p values from gene‐based analysis for rare deleterious variants. (b–d) Comparison of contribution of HRD signature, grouped by breast cancer tissue subgroups, datasets, and the number of BRCA1 (b), BRCA2 (c) and BRCA1/BRCA2 mutations (d).
Significant results of gene‐based analysis in the training and validation datasets
| Signature | Gene | TCGA nominal p | TCGA false discovery rate | TCGA theta | TCGA # of variants in the test | Nigeria p | Nigeria # of variants in the test | Pooled p |
|---|---|---|---|---|---|---|---|---|
| APOBEC C>T | | 9.04 × 10⁻⁶ | 0.006 | 1 | 5 | 0.580 | 5 | 0.0089 |
| APOBEC C>T | | 0.186 | 1.000 | 1 | 9 | 0.442 | 1 | 0.282 |
| APOBEC C>T | | 0.773 | 1.000 | 1 | 13 | 0.494 | 2 | 1.000 |
| APOBEC C>T | | 0.0023 | 1.154 | 0 | 6 | 0.434 | 1 | 0.0036 |
| APOBEC C>T | | 5.21 × 10⁻⁵ | 0.053 | 0 | 16 | 0.538 | 2 | 0.0005 |
| APOBEC C>T | | 0.149 | 1.000 | 1 | 20 | 0.0657 | 9 | 0.0798 |
| APOBEC C>T | | 0.0002 | 0.061 | 0 | 15 | 0.156 | 7 | 0.0609 |
| APOBEC C>G | | 0.356 | 1.000 | 1 | 5 | 0.737 | 5 | 0.580 |
| APOBEC C>G | | 4.45 × 10⁻⁷ | 0.0005 | 0 | 9 | 0.444 | 1 | |
| APOBEC C>G | | 3.46 × 10⁻⁶ | 0.005 | 0 | 13 | 0.219 | 2 | |
| APOBEC C>G | | 6.46 × 10⁻⁷ | 0.001 | 0 | 6 | 0.740 | 1 | 2.86 × 10⁻⁵ |
| APOBEC C>G | | 1.35 × 10⁻⁶ | 0.001 | 0 | 16 | 0.360 | 2 | |
| APOBEC C>G | | 2.23 × 10⁻⁵ | 0.015 | 0 | 20 | 0.119 | 9 | 2.57 × 10⁻⁵ |
| APOBEC C>G | | 0.0095 | 0.417 | 1 | 15 | 0.0238 | 7 | 0.0445 |
| APOBEC | | 0.1274 | 0.883 | 1 | 5 | 0.611 | 5 | 0.507 |
| APOBEC | | 0.0006 | 0.165 | 0 | 9 | 0.957 | 1 | 0.0014 |
| APOBEC | | 0.0154 | 0.452 | 0 | 13 | 0.808 | 2 | 0.0075 |
| APOBEC | | 1.50 × 10⁻⁵ | 0.056 | 0 | 6 | 0.520 | 1 | 0.0002 |
| APOBEC | | 1.91 × 10⁻⁶ | 0.012 | 1 | 16 | 0.393 | 2 | |
| APOBEC | | 0.0024 | 0.253 | 1 | 20 | 0.054 | 9 | 0.0009 |
| APOBEC | | 0.0004 | 0.189 | 1 | 15 | 0.041 | 7 | 0.079 |
| HRD | | 1.03 × 10⁻⁸ | 0.0001 | 1 | 25 | 0.139 | 6 | |
| HRD | | 1.05 × 10⁻⁵ | 0.008 | 1 | 50 | 0.014 | 8 | |
| HRD | | 5.15 × 10⁻⁶ | 0.005 | 0 | 10 | 0.391 | 1 | |
| HRD | | 1.80 × 10⁻⁵ | 0.009 | 0 | 5 | – | 0 | |
Bold p values denote statistical significance at p < 9.69 × 10⁻⁶ (Bonferroni corrected alpha level).
The weighting parameter theta indicates whether the SKAT test (theta = 0) or burden test (theta = 1) gave the smallest p value.
p value from optimal sequence kernel association test (SKAT‐O).
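A Bonferroni-corrected threshold is the family-wise alpha divided by the number of tests. As a rough sketch, and assuming a family-wise alpha of 0.05 (the footnotes do not state the alpha or the number of tests), the reported threshold of 9.69 × 10⁻⁶ can be inverted to estimate how many tests it implies. The helper name `passes_bonferroni` and the choice of example p values are illustrative.

```python
ASSUMED_ALPHA = 0.05           # assumption: family-wise alpha is not given in the footnotes
REPORTED_THRESHOLD = 9.69e-6   # Bonferroni corrected alpha level from the footnote

# Bonferroni: threshold = alpha / n_tests, so n_tests = alpha / threshold.
implied_tests = ASSUMED_ALPHA / REPORTED_THRESHOLD

def passes_bonferroni(p_value, threshold=REPORTED_THRESHOLD):
    """Return True when a p value falls below the corrected threshold."""
    return p_value < threshold

# Two TCGA nominal p values from the HRD rows of the table, used as examples.
first_hrd = passes_bonferroni(1.03e-8)
second_hrd = passes_bonferroni(1.05e-5)

print(round(implied_tests), first_hrd, second_hrd)
```

Under the assumed alpha, the threshold corresponds to roughly 5,160 tests. The second example shows that a nominal p value of 1.05 × 10⁻⁵ sits just above the corrected level.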