Byungkyu Park, Jinho Im, Kyungsook Han.
Abstract
Breast cancer is one of the most prevalent cancers in females, causing more than 450,000 deaths each year worldwide. Among the subtypes of breast cancer, basal-like breast cancer, also known as triple-negative breast cancer, shows the lowest survival rate and still lacks effective treatments. Somatic mutations in the TP53 gene occur frequently across all breast cancer subtypes, but a comparative analysis of gene correlations with respect to TP53 mutations has not yet been performed. The primary goal of this study is to identify gene correlations in two groups of breast cancer patients and to derive potential prognostic gene pairs for breast cancer. We partitioned breast cancer patients into two groups: one with a mutated TP53 gene (mTP53) and the other with a wild-type TP53 gene (wtTP53). For every gene pair, we computed the hazard ratio using the Cox proportional hazards model and constructed gene correlation networks (GCNs) enriched with prognostic information. Our GCN is more informative than typical GCNs in that it indicates the type of correlation between genes, the concordance index, and the prognostic type of a gene. Comparative analysis of the correlation patterns and survival times of the two groups revealed several findings. First, we found several new gene pairs with opposite correlations in the two GCNs, and the difference in their correlation patterns was most prominent in the basal-like subtype of breast cancer. Second, we obtained potential prognostic genes for breast cancer patients with a wild-type TP53 gene. From a comparative analysis of GCNs of mTP53 and wtTP53, we found several gene pairs that show significantly different correlation patterns in the basal-like breast cancer subtype and obtained prognostic genes for patients with a wild-type TP53 gene.
The GCNs and prognostic genes identified in this study will be informative for the prognosis of survival and for selecting a drug target for breast cancer, in particular for basal-like breast cancer. To the best of our knowledge, this is the first attempt to construct GCNs for breast cancer patients with or without mutations in the TP53 gene and to find prognostic genes accordingly.
Keywords: TP53 mutation; breast cancer; gene correlation network; prognosis
Year: 2022 PMID: 35883535 PMCID: PMC9313229 DOI: 10.3390/biom12070979
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Figure 1. Schematic overview of the framework for constructing gene correlation networks (GCNs) and obtaining prognostic gene pairs for two groups of breast cancer patients. wtTP53: a group of breast cancer patients with a wild-type TP53 gene. mTP53: a group of breast cancer patients with somatic mutations in the TP53 gene.
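The correlation step of this framework can be illustrated with a minimal sketch. It assumes that the correlation between two genes is measured with the Pearson correlation coefficient (PCC) computed separately in each patient group, and that a pair counts as "opposite" when the two coefficients differ in sign. The expression values and variable names below are hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical expression values of two genes in each patient group.
wt_gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
wt_gene_b = [2.1, 3.9, 6.2, 8.0, 9.8]   # rises with gene A in wtTP53
mt_gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
mt_gene_b = [9.5, 8.1, 6.0, 4.2, 1.9]   # falls with gene A in mTP53

r_wt = pearson(wt_gene_a, wt_gene_b)
r_mt = pearson(mt_gene_a, mt_gene_b)

# Assumption: a pair is flagged when the sign of the PCC flips between groups.
opposite = r_wt * r_mt < 0
print(f"wtTP53 r={r_wt:.3f}, mTP53 r={r_mt:.3f}, opposite={opposite}")
```

In practice a significance or magnitude threshold on each coefficient would likely be applied before calling a pair opposite; the sign test alone is only the simplest possible criterion.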
The number and proportion (in parentheses) of breast cancer cases from TCGA and their subtypes based on the PAM50 classification [19]. In each subtype row, the proportion is the share of that subtype among all cases in the same column (for example, 41 of the 254 mTP53 cases are Luminal A, or 16.1%); in the Total row, it is the share of the column among all 750 cases. mTP53: breast cancer patients with somatic mutations in the TP53 gene. wtTP53: breast cancer patients with a wild-type TP53 gene.
| Breast Cancer Subtype | mTP53 | wtTP53 | Total |
|---|---|---|---|
| Luminal A | 41 (16.1%) | 369 (74.4%) | 410 (54.7%) |
| Luminal B | 50 (19.7%) | 95 (19.2%) | 145 (19.3%) |
| HER2-enriched | 38 (15.0%) | 14 (2.8%) | 52 (6.9%) |
| Basal-like | 125 (49.2%) | 18 (3.6%) | 143 (19.1%) |
| Total | 254 (33.9%) | 496 (66.1%) | 750 (100.0%) |
Figure 2. (A) Volcano plot comparing differentially expressed genes in the two groups, wtTP53 and mTP53. The horizontal axis represents the fold change (FC) between the two groups on a log2 scale, and the vertical axis shows the negative base-10 logarithm of the p-values from the t-test. A gene with a higher expression level in mTP53 than in wtTP53 has a positive FC and is shown as a red dot. A gene with a lower expression level in mTP53 than in wtTP53 has a negative FC and is shown as a blue dot. The top 10 genes with the lowest p-values adjusted by the Benjamini–Hochberg procedure are labeled with their names. (B) GO circle plot and GO terms for genes with significant FC. Genes with higher expression levels in mTP53 than in wtTP53 are shown as red dots, and genes with lower expression levels in mTP53 are shown as blue dots. The z-score is the number of overexpressed genes minus the number of underexpressed genes, divided by the square root of the count.
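The quantities named in this caption can be sketched in a few lines. The example below is illustrative only: it assumes the fold change is a ratio of mean expression levels (mTP53 over wtTP53) and that "the count" in the z-score is the number of overexpressed plus underexpressed genes in a GO term. The function names and input numbers are hypothetical.

```python
from math import log2, log10, sqrt

def volcano_point(mean_mtp53, mean_wttp53, p_value):
    """Coordinates of one gene: (log2 fold change, -log10 p)."""
    return log2(mean_mtp53 / mean_wttp53), -log10(p_value)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def go_z_score(n_up, n_down):
    """(up - down) / sqrt(count), assuming count = up + down."""
    return (n_up - n_down) / sqrt(n_up + n_down)

x, y = volcano_point(8.0, 2.0, 0.001)      # hypothetical gene
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
z = go_z_score(12, 4)                      # hypothetical GO term
print(x, y, adj, z)
```

A positive z-score under this definition means that a GO term contains more genes with higher expression in mTP53 than genes with lower expression.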
Gene pairs with the highest hazard ratio in wtTP53.
| Gene Pair | Large | Small | Log-Rank p-value | Log-Rank adj. p-value | Cox PH Hazard Ratio | Cox PH p-value |
|---|---|---|---|---|---|---|
| MAPK10_PTK6 | 22 | 474 | 4.13E−08 | 1.80E−04 | 9.254 | 1.02E−06 |
| ECT2_HIF1A-AS2 | 15 | 481 | 2.11E−05 | 1.48E−02 | 8.616 | 9.47E−05 |
| HIF1A-AS2_KIF15 | 16 | 480 | 2.11E−05 | 1.48E−02 | 8.616 | 9.47E−05 |
| CLDN7_MAPK10 | 16 | 480 | 1.75E−06 | 3.31E−03 | 8.611 | 1.02E−05 |
| CDH3_FGFR2 | 13 | 483 | 3.88E−05 | 2.18E−02 | 7.950 | 9.13E−05 |
| PTGS2_SUSD2 | 20 | 476 | 8.62E−09 | 5.36E−05 | 7.695 | 5.77E−06 |
| LSINCT5_SUSD2 | 14 | 482 | 1.01E−05 | 9.59E−03 | 6.786 | 6.11E−05 |
| AHR_SUSD2 | 29 | 467 | 5.06E−10 | 1.10E−05 | 6.557 | 7.85E−07 |
| GLI1_RMST | 14 | 482 | 1.51E−06 | 3.14E−03 | 6.059 | 3.60E−05 |
| CDH3_GLI1 | 32 | 464 | 1.38E−09 | 1.88E−05 | 5.945 | 1.72E−07 |
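To make the Cox PH hazard ratio column concrete, the following is a minimal sketch of a Cox proportional hazards fit with a single binary covariate, maximising the Breslow partial likelihood by Newton-Raphson. It is not the authors' implementation: the function name, the encoding of the two patient groups as 0 and 1, and the survival data are all hypothetical.

```python
from math import exp

def cox_hazard_ratio(times, events, group, iterations=50, tol=1e-10):
    """Hazard ratio of group 1 versus group 0 from a one-covariate Cox model.

    times:  follow-up time per patient
    events: 1 if the event (death) was observed, 0 if censored
    group:  0 or 1 per patient
    Assumes both groups appear in at least one risk set; otherwise the
    information term is zero and no finite estimate exists.
    """
    beta = 0.0
    event_rows = [(t, g) for t, e, g in zip(times, events, group) if e]
    for _ in range(iterations):
        score = information = 0.0
        for t, g in event_rows:
            # Patients still at risk at time t, split by group.
            n1 = sum(1 for x, h in zip(times, group) if x >= t and h == 1)
            n0 = sum(1 for x, h in zip(times, group) if x >= t and h == 0)
            p = n1 * exp(beta) / (n0 + n1 * exp(beta))
            score += g - p               # first derivative of log-likelihood
            information += p * (1 - p)   # negative second derivative
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    return exp(beta)

# Hypothetical data: group 1 tends to die earlier than group 0.
times = [2, 4, 6, 8, 3, 7, 9, 12]
events = [1, 1, 1, 1, 1, 1, 1, 1]
group = [1, 1, 1, 1, 0, 0, 0, 0]

hr = cox_hazard_ratio(times, events, group)
hr_swapped = cox_hazard_ratio(times, events, [1 - g for g in group])
print(f"hazard ratio = {hr:.2f}")
```

A hazard ratio above 1 indicates a higher risk in the group coded 1. Swapping the group labels inverts the ratio, which is a quick sanity check on any implementation.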
Gene pairs with the highest hazard ratio in mTP53.
| Gene Pair | Large | Small | Log-Rank p-value | Log-Rank adj. p-value | Cox PH Hazard Ratio | Cox PH p-value |
|---|---|---|---|---|---|---|
| KDM5B_ST14 | 18 | 243 | 6.19E−06 | 4.07E−02 | 6.713 | 7.51E−06 |
| NAT2_PBOV1 | 27 | 234 | 7.69E−06 | 4.07E−02 | 5.868 | 3.49E−05 |
| KIT_RHOBTB2 | 24 | 237 | 5.24E−06 | 4.07E−02 | 5.703 | 1.09E−05 |
| PBOV1_TWIST1 | 49 | 212 | 2.80E−07 | 1.21E−02 | 5.680 | 1.51E−06 |
| FLT1_MDM2 | 30 | 231 | 5.81E−07 | 1.25E−02 | 5.247 | 7.97E−06 |
| PIK3CA_PRLR | 33 | 228 | 1.13E−06 | 1.63E−02 | 5.139 | 6.96E−06 |
| EPCAM_SERPINE1 | 28 | 233 | 6.46E−06 | 4.07E−02 | 4.714 | 1.85E−05 |
| CDC27_MDM2 | 25 | 236 | 6.92E−06 | 4.07E−02 | 4.644 | 9.70E−05 |
| CLDN7_PIK3CA | 38 | 223 | 8.50E−06 | 4.07E−02 | 4.082 | 6.79E−05 |
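The log-rank p-values in these tables compare survival between two patient groups. As a rough illustration, the sketch below implements the standard two-group log-rank test and converts the chi-square statistic (1 degree of freedom) to a p-value with the complementary error function. How the study splits patients into two groups for each gene pair is not described here, so the grouping, the function name, and the data are assumptions.

```python
from math import erfc, sqrt

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of the chi-square distribution with 1 df.
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical data: every patient in group A dies before anyone in group B.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

When many gene pairs are tested, raw p-values such as this one would then be corrected for multiple testing, which is what the adjusted column in the tables reports.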
Figure 3. Kaplan–Meier plots for the gene pairs selected by the log-rank test [24]. (A) MAPK10_PTK6, the gene pair with the highest hazard ratio for prognosis of survival in wtTP53 (left plot). (B) KDM5B_ST14, the gene pair with the highest hazard ratio for prognosis of survival in mTP53 (left plot). The Kaplan–Meier plots of the same gene pairs in the other group are shown for comparison (right plots).
Figure 4. Kaplan–Meier plots for the gene pair MAPK10_PTK6. (A) The gene pair for prognosis of survival of wtTP53 in TCGA-BRCA. (B) The gene pair for prognosis of survival of wtTP53 in METABRIC.
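For readers unfamiliar with how such curves are built, here is a minimal sketch of the Kaplan–Meier product-limit estimator. The function name and the small data set are hypothetical; real analyses would normally rely on a survival-analysis library rather than this hand-written version.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time per patient
    events: 1 if the event (death) was observed, 0 if censored
    """
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if e and x == t)
        survival *= 1 - deaths / at_risk   # product-limit update
        curve.append((t, survival))
    return curve

# Hypothetical data: the patient followed until time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for t, s in curve:
    print(t, s)
```

The censored patient never causes a drop in the curve but does leave the risk set, which is why the step at time 3 is computed over two patients instead of three.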
Comparison of two models with different features in classifying subtypes of breast cancer. A better performance is shown in bold. AC: accuracy, SE: sensitivity, SP: specificity, PPV: positive predictive value, NPV: negative predictive value, MCC: Matthews correlation coefficient.
| Feature | Subtype | AC | SE | SP | PPV | NPV | F-Score | MCC |
|---|---|---|---|---|---|---|---|---|
| PCCs & | Basal-like | | | 100.0 | 100.00 | | | |
| | HER2-enriched | | | | | | | |
| | Luminal A | | 94.48 | | | 93.94 | | |
| | Luminal B | | | 97.14 | 87.50 | | | |
| gene expressions | Basal-like | 99.10 | 96.15 | 100.00 | 100.00 | 98.84 | 0.980 | 0.975 |
| | HER2-enriched | 96.40 | 93.33 | 96.88 | 82.35 | 98.94 | 0.875 | 0.856 |
| | Luminal A | 94.59 | | 92.31 | 90.00 | | 0.938 | 0.892 |
| | Luminal B | 91.89 | 70.83 | | | 92.39 | 0.791 | 0.749 |
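The metrics in this table follow from a per-subtype confusion matrix. The sketch below computes them under the assumption of a one-vs-rest evaluation for each subtype. The source does not report confusion matrices: the counts used here (25 true positives, 0 false positives, 85 true negatives, 1 false negative) are inferred for illustration because they reproduce the Basal-like row of the gene-expression model, and should be read as hypothetical.

```python
from math import sqrt

def binary_metrics(tp, fp, tn, fn):
    """One-vs-rest classification metrics from confusion-matrix counts."""
    se = tp / (tp + fn)                  # sensitivity (recall)
    sp = tn / (tn + fp)                  # specificity
    ppv = tp / (tp + fp)                 # positive predictive value
    npv = tn / (tn + fn)                 # negative predictive value
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "AC": (tp + tn) / (tp + fp + tn + fn),
        "SE": se,
        "SP": sp,
        "PPV": ppv,
        "NPV": npv,
        "F": 2 * ppv * se / (ppv + se),
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
    }

# Inferred, hypothetical counts for the Basal-like subtype.
m = binary_metrics(tp=25, fp=0, tn=85, fn=1)
print({k: round(v, 4) for k, v in m.items()})
```

Expressed as percentages, these counts give AC 99.10, SE 96.15, SP 100.00, PPV 100.00, and NPV 98.84, with an F-score of 0.980 and an MCC of 0.975.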
Figure 5. Visualization of four subtypes of breast cancer with respect to PCCs of gene pairs. Principal component analysis (PCA) was used for the visualization. PCCs of gene pairs in the basal-like subtype (violet dots) are very different from those in the other subtypes.