Faeze Keshavarz-Rahaghi, Erin Pleasance, Tyler Kolisnik, Steven J M Jones.
Abstract
The tumor suppressor gene TP53 has the highest rate of mutation among all genes in human cancer. The transcription factor it encodes, p53, plays an essential role in the regulation of many cellular processes. Mutations in TP53 result in loss of wild-type p53 function in a dominant-negative manner. Although TP53 is a well-studied gene, the transcriptome modifications caused by mutations in this gene have not yet been explored in a pan-cancer study using both primary and metastatic samples. In this work, we used a random forest model to stratify tumor samples based on TP53 mutational status and detected a p53 transcriptional signature. We hypothesize that this transcriptional signature arises from the loss of wild-type p53 function and is universal across primary and metastatic tumors as well as different tumor types. Additionally, we showed that the algorithm successfully detected this signature in samples with apparently silent mutations that affect correct mRNA splicing. Furthermore, we observed that most of the highly ranked genes contributing to the classification, as extracted from the random forest, have known associations with p53 in the literature. We suggest that other genes found in this list, including the protein-coding genes GPSM2, OR4N2, CTSL2, SPERT, and RPE65, have as yet undiscovered links to p53 function. Our analysis of time on different therapies also revealed that this signature is more effective than the recorded TP53 status in detecting patients who can benefit from platinum therapies and taxanes. Our findings delineate a p53 transcriptional signature, expand the knowledge of p53 biology, and further identify genes important in p53-related pathways.
Keywords: ensemble classifier; machine learning; p53 pathway activation; pan-cancer; random forest; transcriptome
Year: 2022 PMID: 36134028 PMCID: PMC9483853 DOI: 10.3389/fgene.2022.987238
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Plots of area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) for TCGA, POG, and merged (set of all impactful and wild-type samples) data sets.
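
The two metrics named in the Figure 1 caption can be computed directly from per-sample labels and classifier scores. The following is a minimal sketch, not the authors' code: the function names `auroc` and `average_precision` and the toy labels and scores are illustrative assumptions, and average precision is used here as a common step-wise estimate of AUPRC that assumes no tied scores.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive sample outscores a
    random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values observed at each positive sample when
    walking from the highest to the lowest score (assumes no tied scores)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 1 = TP53 mutant, 0 = wild-type.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
print(average_precision(labels, scores))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable for cohort-sized data.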
FIGURE 2. Bar plot of the importance (Gini) scores of the top 67 genes extracted from the random forest. Red indicates downregulation and blue indicates upregulation of a gene upon loss of wild-type p53 function. Genes are grouped based on their known link to p53.
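
For readers unfamiliar with Gini importance, the sketch below shows the quantity a random forest accumulates per feature: the weighted decrease in Gini impurity produced by a split. It is an illustration under stated assumptions rather than the authors' implementation; the function names, the toy expression values, and the threshold are hypothetical, and a real forest sums these decreases over every split on a gene and averages them across trees.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting samples at `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Hypothetical expression values for one gene and the sample statuses.
expression = [1.2, 0.8, 5.1, 4.7]
status = ["wt", "wt", "mut", "mut"]
print(gini_decrease(expression, status, threshold=3.0))
```

A split that separates the two classes perfectly removes all impurity, so genes that support many such splits rank highly.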
Silent mutations classified as p53 mutant, the number of samples containing each mutation, and their reported consequences and interpretation based on the ClinVar database (nucleotide variations marked with * are not present in the general population according to ClinVar evidence).
| Nucleotide variation | Amino acid variation | Number of samples | Exon location | Disease | ClinVar pathogenicity | ClinVar record |
|---|---|---|---|---|---|---|
| c.375G>T* | p.T125T | 20 | Last nucleotide of exon 4 | Li-Fraumeni syndrome | Likely pathogenic | VCV000237948.3 ( |
| c.375G>A* | p.T125T | 3 | Last nucleotide of exon 4 | Li-Fraumeni syndrome; Li-Fraumeni-like/Chompret criteria; Rhabdomyosarcoma; Breast and/or ovarian cancer; Malignant tumour of prostate | Pathogenic | VCV000177825.18 ( |
| c.375G>C | p.T125T | 1 | Last nucleotide of exon 4 | Ependymoma; Early-onset breast cancer | Likely pathogenic | VCV000480746.3 (NM_000546a) |
| c.672G>A* | p.E224E | 3 | Last nucleotide of exon 5 | Li-Fraumeni syndrome; Chompret criteria | Pathogenic/Likely pathogenic | VCV000080709.6 (NM_000546b) |
| c.993G>A* | p.Q331Q | 2 | Last nucleotide of exon 8 | Adrenocortical carcinoma; Suspected Li-Fraumeni syndrome | Likely pathogenic | VCV000428868.7 ( |
| c.207T>C | p.A69A | 1 | Exon 4 | Li-Fraumeni syndrome; Hereditary cancer-predisposing syndromes | Likely benign | VCV000219841.7 ( |
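
As a quick consistency check on the table, the coding position in each nucleotide variation should map to the codon named in the amino acid variation, and a silent change should keep the same residue. The sketch below is illustrative only: the regular expressions and function names are assumptions, and they handle only simple substitutions written as `c.<pos><ref>><alt>` and `p.<aa><codon><aa>`, with an optional trailing `*` as used in the table.

```python
import re


def parse_cdna(hgvs_c):
    """Return (position, ref_base, alt_base) for a simple c. substitution."""
    m = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])\*?", hgvs_c)
    if m is None:
        raise ValueError(f"unsupported nucleotide variation: {hgvs_c}")
    return int(m.group(1)), m.group(2), m.group(3)


def parse_protein(hgvs_p):
    """Return (ref_residue, codon_number, alt_residue) for a p. change."""
    m = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z])", hgvs_p)
    if m is None:
        raise ValueError(f"unsupported amino acid variation: {hgvs_p}")
    return m.group(1), int(m.group(2)), m.group(3)


def codon_of(cdna_pos):
    """Map a 1-based coding position to (codon number, position in codon)."""
    return (cdna_pos + 2) // 3, (cdna_pos - 1) % 3 + 1


# Variant pairs copied from the table above.
rows = [
    ("c.375G>T*", "p.T125T"),
    ("c.375G>A*", "p.T125T"),
    ("c.375G>C", "p.T125T"),
    ("c.672G>A*", "p.E224E"),
    ("c.993G>A*", "p.Q331Q"),
    ("c.207T>C", "p.A69A"),
]

checks = []
for hgvs_c, hgvs_p in rows:
    pos, _ref, _alt = parse_cdna(hgvs_c)
    aa_ref, codon, aa_alt = parse_protein(hgvs_p)
    codon_matches = codon_of(pos)[0] == codon
    is_silent = aa_ref == aa_alt
    checks.append((hgvs_c, codon_matches, is_silent))
    print(hgvs_c, hgvs_p, codon_matches, is_silent)
```

Every listed change falls on the third base of its codon, which is consistent with the amino acid being unchanged; the check says nothing about splicing effects, which depend on the exon boundaries given in the table.
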
FIGURE 3. RNA-seq data of lung adenocarcinoma (LUAD) samples with p53 silent mutations at threonine 125 carrying the specific nucleotide change G>T. The last two tracks are from LUAD samples with wild-type p53 copies.
FIGURE 4. The number of days on platinum therapies, taxanes, and epothilones divided by TP53 mutation status and the predicted status by the random forest (the p-values are found in a Mann-Whitney-Wilcoxon two-sided test with Bonferroni correction; p-value annotation guide: ns: 5.00e-02 < p ≤ 1.00, *: 1.00e-02 < p ≤ 5.00e-02, **: 1.00e-03 < p ≤ 1.00e-02). (A) The boxplots of log10 of the number of days on platinum and taxanes; the difference between p53 wild-type and mutant sets is statistically more significant when data points are divided by the random forest predictions (blue) than when they are divided by the true p53 status (red). (B) The boxplots of log10 of the number of days on epothilones (represented only by the drug eribulin); the difference between p53 wild-type and mutant sets is statistically more significant when data points are divided by the true p53 status (red) than when they are divided by the random forest predictions (blue).
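
The p-value annotation guide in the Figure 4 caption amounts to a small lookup applied after Bonferroni correction. The sketch below illustrates that mapping under stated assumptions: the function names `bonferroni` and `annotate` and the example p-values are hypothetical, and because the caption lists no symbol for p ≤ 1.00e-03, the function reports that case explicitly instead of guessing one.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def annotate(p):
    """Map a corrected p-value to the symbols listed in the caption."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p > 5e-2:
        return "ns"
    if p > 1e-2:
        return "*"
    if p > 1e-3:
        return "**"
    # The caption defines no symbol below this threshold.
    return "p <= 1.00e-03 (not in caption guide)"


# Hypothetical raw p-values from three comparisons.
raw = [0.004, 0.03, 0.2]
adjusted = bonferroni(raw)
print([annotate(p) for p in adjusted])
```

Note that a raw p-value of 0.004 would earn `**` on its own but drops to `*` after correction for three tests, which is why the correction step matters when comparing panels.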