Pramod Chandrashekar, Navid Ahmadinejad, Junwen Wang, Aleksandar Sekulic, Jan B Egan, Yan W Asmann, Sudhir Kumar, Carlo Maley, Li Liu.
Abstract
MOTIVATION: Functions of cancer driver genes vary substantially across tissues and organs. Distinguishing passenger genes (PGs), oncogenes (OGs) and tumor-suppressor genes (TSGs) for each cancer type is critical for understanding tumor biology and identifying clinically actionable targets. Although many computational tools are available to predict putative cancer driver genes, resources for context-aware classifications of OGs and TSGs are limited.
Year: 2020 PMID: 32176769 PMCID: PMC7703750 DOI: 10.1093/bioinformatics/btz851
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. The distribution of selection coefficients of the curated genes. (A) Split violin plot showing densities of log(ω) and log(φ) values for PGs, TSGs and OGs. (B) Positional distribution of somatic mutations of the BCOR gene in stomach cancer and in melanoma. Vertical lines represent frequencies of various types of mutations at a given position. Synonymous, missense and truncating mutations are represented by green, blue and red lines, respectively. Gray lines are density curves. (C) Scatter plot of log(ω) and log(φ) values. Shades of hexagon bins represent the number of observations. (D) Positional distribution of somatic mutations of the FBXW7 gene in uterine carcinosarcoma. (Color version of this figure is available at Bioinformatics online.)
Performance of GUST and 20/20+
| Method | Metric | Binary classes | Three classes | Three classes | Three classes |
|---|---|---|---|---|---|
| | Positive | OG, TSG | OG | TSG | |
| | Negative | PG | PG, TSG | PG, OG | |
| GUST | TPR | 0.93 | 0.84 | 0.93 | — |
| GUST | TNR | 0.94 | 0.98 | 0.95 | — |
| GUST | PPV | 0.92 | 0.85 | 0.90 | — |
| GUST | NPV | 0.95 | 0.98 | 0.96 | — |
| GUST | ACC | 0.94 | 0.97 | 0.94 | 0.92 |
| GUST | AUC | 0.97 | 0.99 | 0.97 | 0.98 |
| 20/20+ | TPR | 0.90 | 0.95 | 0.86 | — |
| 20/20+ | TNR | 0.85 | 0.97 | 0.90 | — |
| 20/20+ | PPV | 0.82 | 0.78 | 0.81 | — |
| 20/20+ | NPV | 0.92 | 0.99 | 0.93 | — |
| 20/20+ | ACC | 0.88 | 0.97 | 0.89 | 0.86 |
| 20/20+ | AUC | 0.94 | 0.97 | 0.93 | 0.95 |
Macro-AUC values were calculated by averaging the three one-vs-rest ROC curves. Linear interpolation was used between points of each ROC curve (Wei and Wang, 2018).
TPR, true positive rate, sensitivity; TNR, true negative rate, specificity; PPV, positive predictive value, precision; NPV, negative predictive value; ACC, accuracy; AUC, area under the ROC curve.
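The metric definitions and the macro-AUC footnote can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy labels and scores are hypothetical, and the procedure of Wei and Wang (2018) may differ in detail. The sketch assumes vertical averaging of the one-vs-rest ROC curves. Because integration is linear, the area under such an averaged curve equals the mean of the per-class areas, so the code averages per-class trapezoidal AUCs directly.

```python
def binary_metrics(tp, fp, tn, fn):
    """Confusion-matrix metrics named in the table footnote."""
    return {
        "TPR": tp / (tp + fn),                   # sensitivity
        "TNR": tn / (tn + fp),                   # specificity
        "PPV": tp / (tp + fp),                   # precision
        "NPV": tn / (tn + fn),
        "ACC": (tp + tn) / (tp + fp + tn + fn),
    }


def roc_points(labels, scores):
    """ROC points (FPR list, TPR list) for 0/1 labels.

    Assumes at least one positive and one negative label.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume tied scores together so they form a single ROC point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the ROC curve with linear interpolation between points."""
    return sum(
        (fpr[k + 1] - fpr[k]) * (tpr[k + 1] + tpr[k]) / 2
        for k in range(len(fpr) - 1)
    )


def macro_auc(y_true, prob, classes):
    """Mean of the one-vs-rest AUCs over all classes."""
    aucs = []
    for c in classes:
        labels = [1 if y == c else 0 for y in y_true]
        scores = [p[c] for p in prob]
        aucs.append(trapezoid_auc(*roc_points(labels, scores)))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: four genes with made-up class probabilities.
y_true = ["OG", "TSG", "PG", "PG"]
prob = [
    {"OG": 0.8, "TSG": 0.1, "PG": 0.1},
    {"OG": 0.2, "TSG": 0.7, "PG": 0.1},
    {"OG": 0.1, "TSG": 0.2, "PG": 0.7},
    {"OG": 0.3, "TSG": 0.1, "PG": 0.6},
]
print(round(macro_auc(y_true, prob, ["OG", "TSG", "PG"]), 6))
print(binary_metrics(tp=8, fp=2, tn=18, fn=2))
```

The toy data are perfectly separated, so the macro-AUC is close to 1; real predictions would produce per-class curves like those summarized in the table.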
Fig. 2. The GUST method. (A) ROC curves of one-vs-rest predictions for GUST and for 20/20+. (B) Variable importance of each feature in the random forest model. (C) Positional distribution of somatic mutations of the MB21D2 gene. Mutations were combined from tumor samples of bladder cancer, cervical cancer, head and neck cancer, lung adenocarcinoma and lung squamous cell carcinoma. A mutation hotspot is located at coding position 931, which corresponds to protein position 311. (D) Selection coefficients estimated for the MB21D2 gene in individual cancer types (dots) and for combined samples (cross). Broken lines show the mean selection coefficients of all genes analyzed using all TCGA samples. Shaded areas are the 95% confidence intervals of the mean selection coefficients.
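The correspondence between coding position 931 and protein position 311 in panel (C) follows from the three-nucleotide codon length. A minimal sketch of that arithmetic, assuming 1-based positions counted from the first base of the start codon (the function names are illustrative, not from the paper):

```python
def protein_position(coding_pos: int) -> int:
    """Map a 1-based coding-sequence position to its 1-based codon number."""
    if coding_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (coding_pos - 1) // 3 + 1


def codon_offset(coding_pos: int) -> int:
    """Position of the base within its codon (1, 2 or 3)."""
    if coding_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (coding_pos - 1) % 3 + 1


print(protein_position(931))  # 311
print(codon_offset(931))      # 1
```

Under these assumptions, position 931 is the first base of codon 311.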
Fig. 3. GUST analysis of the TCGA samples. (A) Number of common and rare OGs and TSGs found in each cancer type. Abbreviations of cancer types are listed in Supplementary Table S3. (B–E) Positional distributions of somatic mutations in novel OGs and TSGs. Evolutionary conservation of each position, measured as the number of substitutions per billion years, is displayed above each plot. (F) Distribution of driver genes with different spectra of tissue specificity. (G) Positional distribution of mutations in the EGFR gene in lung adenocarcinoma and glioma (low-grade glioma and glioblastoma combined). (H) Two-way clustering of driver genes and cancer types. Only driver genes found in more than one cancer type are included (OGs in red and TSGs in blue). (Color version of this figure is available at Bioinformatics online.)