You Li, Xiaosheng Wang, Suleyman Vural, Nitish K Mishra, Kenneth H Cowan, Chittibabu Guda.
Abstract
Breast cancers exhibit highly heterogeneous molecular profiles. Although gene expression profiles have been used to predict the risks and prognostic outcomes of breast cancers, the high variability of gene expression limits their clinical application. Genetic mutation profiles may be more advantageous because genetic mutations can be stably detected and mutational heterogeneity is widespread in breast cancer genomes. We analyzed 98 breast cancer whole-exome samples that were sorted into three subtypes, two grades and two stages. For this case-control study, the summed deleterious effect of all mutations in each gene was scored to identify differentially mutated genes (DMGs). DMGs were corroborated against extensive published knowledge. The functional consequences of deleterious SNVs on protein structure and function were also investigated. Genes such as ERBB2, EPS8, PPP2R4, KIAA0922, SP4, CENPJ, PRCP and SELP, which have been experimentally or clinically verified to be tightly associated with breast cancer prognosis, are among the DMGs identified in this study. We also identified genes such as ARL6IP5, RAET1E, and ANO7 that could be crucial for breast cancer development and prognosis. Further, SNVs such as rs1058808, rs2480452, rs61751507, rs79167802, rs11540666, and rs2229437, which potentially influence protein function, were observed at significantly different frequencies between the classes of a comparison group. Protein structure modeling revealed that many non-synonymous SNVs have a deleterious effect on protein stability, structure and function. Mutational profiling at the gene and SNV levels revealed differential patterns within each breast cancer comparison group, and the gene signatures correlate with the expected prognostic characteristics of breast cancer classes. Some of the genes and SNVs identified in this study show high promise and merit further experimental investigation.
Year: 2015 PMID: 25803781 PMCID: PMC4372331 DOI: 10.1371/journal.pone.0119383
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
A summary of the five comparison groups of breast cancers used in this study.
| Race | ER+ | ER- | PR+ | PR- | HER2+ | HER2- | Grade II c | Grade III c | Stage II | Stage III |
|---|---|---|---|---|---|---|---|---|---|---|
| Mexican | 35 | 14 | 31 | 18 | 8 | 41 | 25 | 13 | 32 | 10 |
| Vietnamese | 5 | 13 | 6 | 12 | 0 | 1 | 0 | 13 | 38 | 8 |
| p-value b | 0.001892 | | 0.05079 | | 1 | | 1 | | 0.598 | |
| Total | 40 | 27 | 37 | 30 | 8 | 42 | 25 | 13 | 70 | 18 |
a Sample sizes for these classes are too small (<10) for a separate class comparison within each race.
b Fisher's exact tests were conducted to check whether the distribution of Mexican and Vietnamese patients differs between the two classes of each comparison group. Only the ER comparison group has a significantly different race composition (p<0.05).
c All 25 Grade II patients are Mexican, compared with 13 Mexican and 13 Vietnamese patients in Grade III. We therefore excluded the 13 Vietnamese Grade III samples, so the Grade II vs. Grade III comparison in this study uses 25 and 13 samples, respectively (all Mexican patients). The reported Fisher's exact test statistic for this comparison group is also based on the exclusion of the Vietnamese samples.
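The race-composition check described in footnote b can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `fisher_exact_two_sided` is hypothetical, the counts are taken from the summary table above (rows Mexican and Vietnamese, columns the two classes of a group), and the two-sided convention used here (summing every table that is no more probable than the observed one) is an assumption, since the source does not say which software or convention was used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed convention: the p-value is the total probability of all tables
    with the same margins that are no more probable than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    col2 = n - col1
    lo, hi = max(0, row1 - col2), min(row1, col1)
    # Unnormalised hypergeometric weights, kept as exact integers so the
    # "no more probable than observed" comparison has no rounding issues.
    weights = {x: comb(col1, x) * comb(col2, row1 - x) for x in range(lo, hi + 1)}
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / sum(weights.values())


# Counts from the summary table: (Mexican class 1, Mexican class 2,
# Vietnamese class 1, Vietnamese class 2).
groups = {
    "ER+ vs. ER-": (35, 14, 5, 13),
    "HER2+ vs. HER2-": (8, 41, 0, 1),
    "Grade II vs. Grade III (Vietnamese excluded)": (25, 13, 0, 0),
    "Stage II vs. Stage III": (32, 10, 38, 8),
}
for name, table in groups.items():
    print(f"{name}: p = {fisher_exact_two_sided(*table):.6f}")
```

Under these assumptions the ER group falls below 0.05 while the HER2 and Grade groups give p = 1, in line with the values reported in the table.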
Fig 1. The differentially mutated genes between breast cancer subtypes.
A total of 50 differentially mutated genes were identified by comparing ER+ vs. ER-, PR+ vs. PR-, HER2+ vs. HER2-, grade II vs. grade III, and stage II vs. stage III breast cancer classes. Each class comparison is shown as a layered circle. The differentially mutated genes are shown in the outer layer, positioned according to their chromosome coordinates and subtype comparisons. The genes are sorted into four categories based on their relevance to breast cancer or other types of cancer. Category 1 includes genes that are directly related to breast cancer (in dark red). Category 2 includes genes that are related to other types of cancer (in green). Category 3 includes genes whose family members are related to cancer (in blue). Category 4 includes genes whose relatedness to cancer remains to be studied (in gray). The mean deleterious mutation score of each gene in each class comparison is shown as a thin colored bar (red and blue refer to the two classes being compared). The length of each bar is proportional to the mean deleterious score.
Fig 2. The deleterious mutation scores for the differentially mutated genes across the compared samples.
Five heatmaps show the deleterious mutation scores across the compared samples for the differentially mutated genes identified by comparing ER+ vs. ER-, PR+ vs. PR-, HER2+ vs. HER2-, grade II vs. grade III, and stage II vs. stage III breast cancer classes, respectively. A higher score implies that a gene carries more deleterious mutations. Groups with a better prognosis (ER+, PR+, HER2-, Stage II and Grade II) tend to have fewer deleteriously mutated genes.
Differentially mutated genes between ER+ and ER- breast cancer subtypes.
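| Gene Symbol | p-value | FDR a | Mean of mutation score in ER- | Mean of mutation score in ER+ | FC b | Gene Name | Category c |
|---|---|---|---|---|---|---|---|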
| CSN3 | 7.06E-05 | 0.090 | 0.61 | 0.16 | 3.75 | casein kappa | 1 |
| ERBB2 | 1.57E-04 | 0.099 | 0.32 | 0.81 | 0.40 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (avian) | 1 |
| PPP2R4 | 2.09E-04 | 0.099 | 0.44 | 0.04 | 10.40 | protein phosphatase 2A activator, regulatory subunit 4 | 1 |
| CAPZA2 | 4.02E-04 | 0.128 | 0.75 | 0.33 | 2.24 | capping protein (actin filament) muscle Z-line, alpha 2 | 1 |
| SKOR1 | 7.56E-04 | 0.181 | 0.41 | 0.80 | 0.51 | SKI family transcriptional corepressor 1 | 1 |
| ARL6IP5 | 1.72E-04 | 0.099 | 0.40 | 0.04 | 9.39 | ADP-ribosylation-like factor 6 interacting protein 5 | 2 |
| RAET1E | 2.28E-04 | 0.099 | 0.63 | 0.20 | 3.14 | retinoic acid early transcript 1E | 2 |
| DPP3 | 2.54E-04 | 0.099 | 0.26 | 0.70 | 0.38 | dipeptidyl-peptidase 3 | 2 |
| OR1J2 | 4.04E-05 | 0.090 | 0.33 | 0.00 | INF | olfactory receptor, family 1, subfamily J, member 2 | 3 |
| OR52E6 | 1.68E-04 | 0.099 | 0.86 | 0.43 | 2.00 | olfactory receptor, family 52, subfamily E, member 6 | 3 |
| GPR157 | 5.57E-04 | 0.142 | 0.18 | 0.58 | 0.30 | G protein-coupled receptor 157 | 3 |
| SLC24A1 | 6.30E-05 | 0.090 | 0.85 | 0.34 | 2.46 | solute carrier family 24 (sodium/potassium/calcium exchanger), member 1 | 4 |
| KRT74 | 2.59E-04 | 0.099 | 0.59 | 0.18 | 3.37 | keratin 74 | 4 |
| DIS3L | 2.85E-04 | 0.099 | 0.49 | 0.10 | 4.97 | DIS3 mitotic control homolog (S. cerevisiae)-like | 4 |
| OC90 | 4.68E-04 | 0.138 | 0.26 | 0.00 | INF | otoconin 90 | 4 |
| DYNC2LI1 | 5.56E-04 | 0.142 | 0.37 | 0.80 | 0.46 | dynein, cytoplasmic 2, light intermediate chain 1 | 4 |
| GLYATL3 | 8.67E-04 | 0.192 | 0.44 | 0.82 | 0.54 | chromosome 6 open reading frame 140 | 4 |
| FAM209B | 9.02E-04 | 0.192 | 0.43 | 0.10 | 4.44 | family with sequence similarity 209, member B | 4 |
a FDR: False Discovery Rate
b FC: fold change (ER-/ER+); INF: infinite
c Category 1: directly related to breast cancer;
Category 2: related to other types of cancer, but not to breast cancer;
Category 3: other members of the same family (but not by itself) are related to cancer;
Category 4: not belonging to any of the former three categories.
*All the above notations apply to Tables 3, 4, 5 and 6.
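The FC column defined in footnote b is a ratio of the two class means, reported as INF when the denominator class has a mean score of zero. A minimal sketch of that arithmetic follows; the function name `fold_change` is hypothetical, and the NaN returned for 0/0 is an assumption because the tables contain no such case.

```python
import math


def fold_change(mean_first: float, mean_second: float) -> float:
    """Ratio of mean mutation scores, e.g. ER- over ER+.

    Returns infinity when only the denominator is zero (shown as INF in
    the tables). The 0/0 case is undefined; NaN here is an assumption.
    """
    if mean_second == 0:
        return math.inf if mean_first > 0 else math.nan
    return mean_first / mean_second


# Means as printed in the ER table (ER- first, ER+ second).
for gene, er_neg, er_pos in [("ERBB2", 0.32, 0.81), ("SKOR1", 0.41, 0.80), ("OR1J2", 0.33, 0.00)]:
    print(gene, round(fold_change(er_neg, er_pos), 2))
```

Because the published means are rounded to two decimals, a ratio recomputed from them can differ slightly from the printed FC (for example, CSN3 gives 0.61/0.16 ≈ 3.81 against a reported 3.75).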
Differentially mutated genes between Stage II and Stage III breast cancer classes.
| Gene Symbol | p-value | FDR a | Mean of mutation score in Stage II | Mean of mutation score in Stage III | FC b | Gene Name | Category c |
|---|---|---|---|---|---|---|---|
| CPZ | 4.24E-05 | 0.055 | 0.01 | 0.39 | 0.04 | carboxypeptidase Z | 1 |
| LPPR2 | 4.29E-05 | 0.055 | 0.01 | 0.26 | 0.05 | lipid phosphate phosphatase-related protein type 2 | 1 |
| PRCP | 1.44E-04 | 0.138 | 0.08 | 0.43 | 0.19 | prolylcarboxypeptidase (angiotensinase C) | 1 |
| UNC45A | 5.03E-04 | 0.297 | 0.01 | 0.23 | 0.06 | unc-45 homolog A (C. elegans) | 1 |
| PLEKHG6 | 5.44E-04 | 0.297 | 0.38 | 0.82 | 0.46 | pleckstrin homology domain containing, family G (with RhoGef domain) member 6 | 1 |
| MMP20 | 7.99E-04 | 0.339 | 0.06 | 0.33 | 0.17 | matrix metallopeptidase 20 | 2 |
| CDH26 | 4.21E-05 | 0.055 | 0.01 | 0.28 | 0.05 | cadherin-like 26 | 3 |
| GSTO1 | 2.93E-04 | 0.224 | 0.21 | 0.66 | 0.32 | glutathione S-transferase omega 1 | 3 |
| AGL | 6.42E-04 | 0.307 | 0.37 | 0.83 | 0.44 | amylo-1, 6-glucosidase, 4-alpha-glucanotransferase | 4 |
| OGFOD3 | 9.77E-04 | 0.374 | 0.07 | 0.39 | 0.18 | 2-oxoglutarate and iron-dependent oxygenase domain containing 3 | 4 |
*All the notations are the same as in Table 2.
Differentially mutated genes between PR+ and PR- breast cancer subtypes.
| Gene Symbol | p-value | FDR a | Mean of mutation score in PR- | Mean of mutation score in PR+ | FC b | Gene Name | Category c |
|---|---|---|---|---|---|---|---|
| SKOR1 | 1.14E-04 | 0.297 | 0.40 | 0.84 | 0.48 | SKI family transcriptional corepressor 1 | 1 |
| CPN1 | 5.47E-04 | 0.425 | 0.32 | 0.03 | 11.27 | carboxypeptidase N, polypeptide 1 | 1 |
| ARID5A | 1.00E-03 | 0.425 | 0.37 | 0.06 | 6.49 | AT rich interactive domain 5A (MRF1-like) | 1 |
| DPP3 | 7.74E-04 | 0.425 | 0.30 | 0.70 | 0.42 | dipeptidyl-peptidase 3 | 2 |
| OR1J2 | 2.14E-04 | 0.297 | 0.30 | 0.00 | INF | olfactory receptor, family 1, subfamily J, member 2 | 3 |
| HKR1 | 8.95E-04 | 0.425 | 0.50 | 0.14 | 3.60 | GLI-Kruppel family member HKR1 | 3 |
| KIAA1377 | 2.33E-04 | 0.297 | 0.56 | 0.11 | 5.01 | KIAA1377 | 4 |
| RBM46 | 5.91E-04 | 0.425 | 0.26 | 0.00 | INF | RNA binding motif protein 46 | 4 |
| WDR87 | 8.72E-04 | 0.425 | 0.14 | 0.54 | 0.26 | WD repeat domain 87 | 4 |
*All the notations are the same as in Table 2.
Differentially mutated genes between HER2+ and HER2- breast cancer subtypes.
| Gene Symbol | p-value | FDR a | Mean of mutation score in HER2- | Mean of mutation score in HER2+ | FC b | Gene Name | Category c |
|---|---|---|---|---|---|---|---|
| BCAR1 | 1.06E-05 | 0.020 | 0.00 | 0.37 | 0.00 | similar to breast cancer anti-estrogen resistance 1; breast cancer anti-estrogen resistance 1 | 1 |
| CENPJ | 2.59E-04 | 0.248 | 0.56 | 1.46 | 0.38 | centromere protein J | 1 |
| EPS8 | 4.61E-04 | 0.294 | 0.03 | 0.37 | 0.08 | epidermal growth factor receptor pathway substrate 8 | 1 |
| KIAA0922 | 6.15E-04 | 0.332 | 0.00 | 0.25 | 0.00 | KIAA0922 | 1 |
| SP4 | 9.61E-04 | 0.380 | 0.07 | 0.48 | 0.15 | Sp4 transcription factor | 1 |
| GABRE | 3.89E-04 | 0.294 | 0.93 | 0.48 | 1.95 | gamma-aminobutyric acid (GABA) A receptor, epsilon | 3 |
| TTC7A | 1.06E-05 | 0.020 | 0.00 | 0.38 | 0.00 | tetratricopeptide repeat domain 7A | 3 |
| DIS3L | 1.00E-03 | 0.380 | 0.07 | 0.50 | 0.14 | DIS3 mitotic control homolog (S. cerevisiae)-like | 4 |
| ZCWPW1 | 1.39E-04 | 0.177 | 0.04 | 0.49 | 0.09 | zinc finger, CW type with PWWP domain 1 | 4 |
| ZNF233 | 6.95E-04 | 0.332 | 0.12 | 0.62 | 0.20 | zinc finger protein 233 | 4 |
*All the notations are the same as in Table 2.
Differentially mutated genes between Grade II and Grade III breast cancer classes.
| Gene Symbol | p-value | FDR a | Mean of mutation score in Grade II | Mean of mutation score in Grade III | FC b | Gene Name | Category c |
|---|---|---|---|---|---|---|---|
| SELP | 6.73E-05 | 0.230 | 0.00 | 0.45 | 0.00 | selectin P (granule membrane protein 140kDa, antigen CD62) | 1 |
| ANO7 | 6.32E-04 | 0.460 | 0.16 | 0.68 | 0.24 | anoctamin 7 | 2 |
| ANKRD18B | 1.22E-04 | 0.230 | 0.12 | 0.69 | 0.18 | ankyrin repeat domain 18B | 3 |
| ANKRD32 | 4.70E-04 | 0.450 | 0.00 | 0.38 | 0.00 | ankyrin repeat domain 32 | 3 |
| THAP8 | 8.19E-04 | 0.460 | 0.51 | 0.00 | INF | THAP domain containing 8 | 3 |
| ADD1 | 3.63E-04 | 0.450 | 0.31 | 1.33 | 0.23 | adducin 1 (alpha) | 4 |
| GFM2 | 8.37E-04 | 0.460 | 0.12 | 0.61 | 0.20 | G elongation factor, mitochondrial 2 | 4 |
*All the notations are the same as in Table 2.
Fig 3. The distribution of deleterious SNVs across the compared patient samples.
Five charts illustrate the distribution of deleterious mutations in the different breast cancer classes. A red dot indicates the presence of an SNV in the corresponding gene in a given sample.
Differentially occurring SNVs with deleterious mutations in domain regions.
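| SNV | Mutation ratio a | Mutation ratio b | OR c | p-value d | dbSNP ID | Amino acid change | Domain |
|---|---|---|---|---|---|---|---|
| ER- vs. ER+ | | | | | | | |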
| SLC24A1_chr15_65916527_65916527_A_T | 23/27 | 13/40 | 11.94 | 2.11E-05 | rs3743171 | p.T37S | PfamB PB047652 |
| CSN3_chr4_71114956_71114956_G_T | 15/27 | 7/40 | 5.89 | 1.59E-03 | rs1048152 | p.R110L | Kappa casein |
| ERBB2_chr17_37884037_37884037_C_G | 8/27 | 26/40 | 0.23 | 6.24E-03 | rs1058808 | p.P1140A | PfamB PB015832 |
| PPP2R4_chr9_131909736_131909736_C_T | 11/27 | 2/40 | 13.06 | 4.29E-04 | rs2480452 | p.S287L | Phosphotyrosyl phosphate activator (PTPA) protein |
| DPP3_chr11_66276576_66276576_G_A | 7/27 | 28/40 | 0.15 | 1.41E-03 | rs12421620 | p.E690K | Peptidase family M49 |
| KRT74_chr12_52966428_52966428_G_C | 12/27 | 5/40 | 5.60 | 4.53E-03 | rs11170177 | p.N165K | Intermediate filament |
| GPR157_chr1_9165685_9165685_G_A | 5/27 | 24/40 | 0.15 | 1.01E-03 | rs72637739 | p.R218C | Secretin receptor family |
| FAM209B_chr20_55111364_55111364_A_C | 12/27 | 4/40 | 7.20 | 2.63E-03 | rs2296129 | p.E129A | FAM209 family |
| PR- vs. PR+ | | | | | | | |
| KIAA1377_chr11_101832590_101832590_C_A | 15/30 | 4/37 | 8.25 | 7.84E-04 | rs11225089 | p.S275Y | Susceptibility to monomelic amyotrophy |
| CPN1_chr10_101829514_101829514_C_T | 10/30 | 1/37 | 18.00 | 1.57E-03 | rs61751507 | p.G178D | Zinc carboxypeptidase (Peptidase_M14) |
| RBM46_chr4_155719189_155719189_T_G | 8/30 | 0/37 | 0.36/0 | 8.97E-04 | rs79167802 | p.I126M | RNA recognition motif (RRM_1) |
| DPP3_chr11_66276576_66276576_G_A | 9/30 | 26/37 | 0.18 | 1.41E-03 | rs12421620 | p.E690K | Peptidase family M49 |
| HKR1_chr19_37854040_37854040_G_A | 12/30 | 4/37 | 5.50 | 8.63E-03 | rs2921563 | p.R448H | Zinc-finger double domain (zf-H2C2_2) |
| HER2- vs. HER2+ | | | | | | | |
| CENPJ_chr13_25486911_25486911_G_T | 5/42 | 4/8 | 0.14 | 2.64E-02 | rs9511510 | p.P85T | PfamB PB003077 |
| GABRE_chrX_151138179_151138179_A_C | 40/42 | 4/8 | 20.00 | 3.94E-03 | rs1139916 | p.S102A | Neurotransmitter-gated ion-channel ligand binding domain |
| SP4_chr7_21469504_21469504_C_G | 3/42 | 4/8 | 0.08 | 8.54E-03 | rs139491266 | p.L241V | PfamB PB022696 |
| Grade II vs. Grade III | | | | | | | |
| ANKRD32_chr5_94030818_94030818_G_T | 0/25 | 4/13 | 0.00 | 9.69E-03 | rs76504036 | p.C993F | PfamB PB101142 |
| GFM2_chr5_74037386_74037386_T_A | 2/25 | 5/13 | 0.14 | 3.41E-02 | rs16872235 | p.S300C | Elongation factor Tu GTP binding domain |
| Stage II vs. Stage III | | | | | | | |
| LPPR2_chr19_11473358_11473358_C_G | 1/70 | 5/18 | 0.04 | 1.14E-03 | rs11540666 | p.T253S | PAP2 superfamily |
| PRCP_chr11_82564294_82564294_T_G | 5/70 | 8/18 | 0.10 | 4.80E-04 | rs2229437 | p.E112D | Serine carboxypeptidase S28 |
| GSTO1_chr10_106022789_106022789_C_A | 13/70 | 11/18 | 0.15 | 7.33E-04 | rs4925 | p.A140D | Glutathione S-transferase, C-terminal domain |
| PLEKHG6_chr12_6421495_6421495_G_A | 26/70 | 15/18 | 0.12 | 5.23E-04 | rs740842 | p.A35T | PfamB PB015161 |
| AGL_chr1_100358103_100358103_C_T | 5/70 | 5/18 | 0.20 | 2.72E-02 | rs3753494 | p.P1051S | Amylo-alpha-1,6-glucosidase |
| AGL_chr1_100361925_100361925_G_A | 20/70 | 10/18 | 0.32 | 4.93E-02 | rs2230307 | p.G1115R | Amylo-alpha-1,6-glucosidase |
| MMP20_chr11_102482504_102482504_T_G | 3/70 | 5/18 | 0.12 | 8.03E-03 | rs17099008 | p.I169L | Matrixin (Peptidase_M10) |
a SNV mutation ratio in ER-, PR-, HER2-, Grade II, and Stage II (number of patients with the mutation in the class/total number of patients in the class).
b SNV mutation ratio in ER+, PR+, HER2+, Grade III, and Stage III (number of patients with the mutation in the class/total number of patients in the class).
c OR: odds ratio (ER-/ER+; PR-/PR+; HER2-/HER2+; Grade II/Grade III; Stage II/Stage III)
d Fisher’s exact test
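The OR column can be recomputed from the two mutation ratios defined in footnotes a and b. The sketch below is illustrative only: the function name `odds_ratio` is hypothetical, and returning infinity when no patient in the second class carries the mutation is an assumption about how such rows (for example RBM46, 8/30 vs. 0/37) should be read.

```python
import math


def odds_ratio(mut_first: int, total_first: int, mut_second: int, total_second: int) -> float:
    """Odds of carrying the SNV in the first class divided by the odds in the second.

    Each class is given as (patients with the mutation, total patients),
    matching the "x/y" ratios in the table.
    """
    wt_first = total_first - mut_first      # patients without the mutation, first class
    wt_second = total_second - mut_second   # patients without the mutation, second class
    numerator = mut_first * wt_second
    denominator = wt_first * mut_second
    if denominator == 0:
        # Undefined or infinite odds ratio; this handling is an assumption.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


# Rows from the table: (SNV gene, ratio a, ratio b).
rows = [
    ("SLC24A1", (23, 27), (13, 40)),
    ("ERBB2", (8, 27), (26, 40)),
    ("GABRE", (40, 42), (4, 8)),
    ("RBM46", (8, 30), (0, 37)),
]
for gene, first, second in rows:
    print(gene, round(odds_ratio(*first, *second), 2))
```

An odds ratio above 1 means the SNV is more common in the first-listed class (ER-, PR-, HER2-, Grade II or Stage II), and a value below 1 means it is more common in the second.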
Pfam and Panther motif analysis for breast cancer related mutated genes and overall impact of mutation in protein stability.
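| Gene | dbSNP ID | UniProt ID | Amino acid change | Pfam ID a | Panther family b | Effect on protein stability c |
|---|---|---|---|---|---|---|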
| CPN1 | rs61751507 | P15169 | p.G178D | Peptidase M14 (PF00246) | Protease M14 Carboxypeptidase (PTHR11532) | Destabilizing |
| AGL | rs2230307 | P35573 | p.G1115R | GDE_C (PF06202) | Glycogen Debranching Enzyme (PTHR10569) | Destabilizing |
| PPP2R4 | rs2480452 | Q15257 | p.S287L | PTPA (PF03095) | Serine/Threonine-Protein Phosphatase 2A Regulatory Subunit B (PTHR10012) | Destabilizing |
| GPR157 | rs72637739 | Q5UAW9 | p.R218C | 7tm_2 (PF00002) | G Protein-Coupled Receptor 157 (PTHR23112) | Destabilizing |
| GFM2 | rs16872235 | Q969S9 | p.S300C | GTP_EFTU (PF00009) | Translation elongation factor G (PTHR23115) | Stabilizing |
| CENPJ | rs9511510 | Q9HC77 | p.P85T | - | T complex protein 10 (PTHR10331) | Destabilizing |
| DPP3 | rs12421620 | Q9NY33 | p.E690K | PeptidaseM49 (PF03571) | Dipeptidyl peptidase III (PTHR23422) | Destabilizing |
| ANKRD32 | rs76504036 | Q9BQI6 | p.C993F | - | - | Stabilizing |
| KIAA1377 | rs11225089 | Q9P2H0 | p.S275Y | K1377 (PF15352) | Pthr31191 family not named (PTHR31191) | Destabilizing |
a Pfam ID (Pfam accession ID)
b Panther family (Panther accession ID)
c Test scores for stabilizing/destabilizing are shown in S5 Table
Fig 4. Superimposed structures of normal (green) and mutated (yellow) DPP3 protein chains.
The amino acid change at position 690 of DPP3 leads to structural changes in the C-terminal region of the mutant protein (red square). The normal residue (E) at position 690 is shown in blue and the mutated residue (K) is shown in red.