Alice Djotsa Nono, Ken Chen, Xiaoming Liu.
Abstract
BACKGROUND: Identifying cancer driver genes (CDGs) is a crucial step in cancer genomics toward the advancement of precision medicine. However, driver gene discovery is very challenging: it involves not only huge amounts of data but also the complexity of the disease, including the heterogeneity of the background somatic mutation rate in each cancer patient. It is generally accepted that CDGs harbor positively selected variants that confer a growth advantage on the malignant cell and are critical to cancer development, whereas non-driver genes harbor random mutations with no functional consequence for cancer. On this basis, function-prediction-based approaches for identifying CDGs have been proposed that interrogate the distribution of functional predictions among mutations in cancer genomes (eLS 1-16, 2016). Assuming that most observed mutations are passenger mutations, and given quantitative predictions of the functional impact of the mutations, genes enriched for functional or deleterious mutations are more likely to be drivers. These methods have been continually refined and show promise for increasing accuracy in detecting new candidate CDGs. However, current function-prediction-based approaches focus only on coding mutations and lack a systematic way to choose the best mutation deleteriousness prediction algorithms.
Keywords: Bioinformatics; Cancer genomics; Computational evaluation; Driver genes; Function prediction method; Whole genome sequencing
Year: 2019 PMID: 30704472 PMCID: PMC6357357 DOI: 10.1186/s12920-018-0452-9
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 SMDS, a gene-based permutation method for the detection of candidate driver genes. The steps are shown from a to e
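
Fig. 1 itself is not reproduced here, so steps a to e cannot be restated. Purely as an illustration of what a gene-based permutation test on functional scores can look like, the sketch below compares the mean deleteriousness score of the mutations observed in one gene with the means of equally sized random draws from a background pool of scores. The function name, the mean as test statistic, the background pool, and the toy scores are all assumptions made for this example; this is not the SMDS implementation.

```python
import random

def permutation_p_value(gene_scores, background_scores, n_perm=10_000, seed=0):
    """Empirical one-sided p-value that a gene's mean score exceeds chance.

    Assumptions: a higher score means more deleterious, and the background
    pool stands in for passenger mutations. Not the published SMDS code.
    """
    rng = random.Random(seed)
    k = len(gene_scores)
    observed = sum(gene_scores) / k
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(background_scores, k)  # k scores, without replacement
        if sum(draw) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Toy data (made up for illustration): a uniform background and one gene
# whose three mutations all score near the top of the range.
toy_rng = random.Random(1)
background = [toy_rng.random() for _ in range(5_000)]
p = permutation_p_value([0.97, 0.99, 0.98], background)
print(p <= 0.01)
```

With a threshold such as the p-value ≤ 0.01 used in the tables below, a gene like the toy one would be flagged as a candidate, while a gene whose scores resemble the background would not.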
Candidate driver genes positively selected (p-value ≤0.01) by each permutation model and their percentage (in brackets) of all genes tested for breast and lung cancer data
| Cancer type | CADD | DANN | Fathmm-MKL coding | Fathmm-MKL noncoding | MetaLR | SPIDEX | VEST3 | Unique genes | Total genes |
|---|---|---|---|---|---|---|---|---|---|
| Breast | 263 (1.3) | 158 (0.8) | 178 (0.9) | 184 (0.9) | 174 (0.9) | 178 (0.9) | 171 (0.9) | 942 (4.7) | 19,835 |
| Lung | 121 (0.6) | 142 (0.7) | 138 (0.7) | 149 (0.7) | 178 (0.9) | 171 (0.9) | 164 (0.8) | 796 (4.0) | 20,047 |
Fig. 2 Quantile–quantile (QQ) plots of p-values comparing the observed distribution of p-values (y-axis) to the expected p-values of a null distribution (x-axis) for 19,835 breast cancer genes (Panel a) and 20,047 lung cancer genes (Panel b). The red line represents the expectation under the null. The grey area depicts the 95% confidence
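
The coordinates behind a QQ plot of this kind can be sketched as follows. The plotting position `(i - 0.5) / n` and the `-log10` scale are common conventions assumed here, not details taken from the figure, and the p-values are made up.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10 p-value pairs for a QQ plot.

    Assumption: under the null the n p-values are uniform on (0, 1), so the
    i-th smallest is expected near (i - 0.5) / n. Other plotting positions,
    such as i / (n + 1), are also in common use. All p-values must be > 0.
    """
    n = len(p_values)
    points = []
    for i, p in enumerate(sorted(p_values), start=1):
        expected = (i - 0.5) / n
        points.append((-math.log10(expected), -math.log10(p)))
    return points

# Toy p-values (made up): x is the expected value, y the observed one.
pts = qq_points([0.001, 0.20, 0.45, 0.70, 0.95])
for exp, obs in pts:
    print(f"expected={exp:.2f} observed={obs:.2f}")
```

Points lying above the diagonal correspond to genes whose p-values are smaller than the null distribution predicts.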
Fig. 3 Proportion of breast cancer candidate driver genes (Panel a) and lung cancer genes (Panel b) predicted by one, two to three or more than three permutation models
Fig. 4 Comparison of the candidate driver genes predicted by seven models with 119 breast cancer samples (Panel a) and 24 lung cancer samples (Panel b). UpSet plot showing the intersection of overlap sets between the CDGs predicted by 7 permutation models (sets). Horizontal bars (Set Size) next to the tool names represent the total number of CDGs selected by each set. Vertical bars (Intersection Size) are annotated by connected blue dots reflecting the number of common CDGs detected by a specific combination of sets sorted by size
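
The quantities summarized in Fig. 3 and Fig. 4 (set sizes, intersection sizes, and the number of models supporting each gene) can be derived from per-model gene sets as in the sketch below. The model and gene names are placeholders, not results, and the intersections are assumed to be exclusive (each gene counted once, under the exact combination of models that selected it).

```python
from collections import Counter

# Toy predictions (placeholder names and memberships, not study results).
predictions = {
    "model_1": {"GENE_A", "GENE_B", "GENE_C"},
    "model_2": {"GENE_A", "GENE_C"},
    "model_3": {"GENE_A", "GENE_D"},
}

# Set size: genes selected by each model (horizontal bars in an UpSet plot).
set_sizes = {model: len(genes) for model, genes in predictions.items()}

# For each gene, the exact combination of models that selected it.
membership = {}
for model, genes in predictions.items():
    for gene in genes:
        membership.setdefault(gene, set()).add(model)

# Intersection size: genes per exact combination of models (vertical bars).
intersections = Counter(frozenset(models) for models in membership.values())

# Number of genes supported by 1, 2, 3, ... models (the Fig. 3 breakdown).
support = Counter(len(models) for models in membership.values())

for combo, size in sorted(intersections.items(),
                          key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(combo), size)
print(dict(support))
```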
Candidate driver genes predicted by five or more permutation models for breast and lung cancer
| Number of shared predictive models | Number of genes predicted | Gene names overlapping with breast or lung cancer genes in CGC | Gene names overlapping with other cancer genes in CGC | Gene names not in CGC |
|---|---|---|---|---|
| Breast | | | | |
| 7 | 1 | TP53 | | |
| 6 | 2 | GRIN1, XG | | |
| 5 | 6 | PIK3CA, MAP3K1 | MAP2K4 | TAF1L, OTOP1, PSMA4, FZD3 |
| Lung | | | | |
| 7 | | | | |
| 6 | 1 | DLX4 | | |
| 5 | 4 | TP53, RBM10 | CCT7, ST6GAL2 | |
Fig. 5 Proportion of 534 genes in the Cancer Gene Census predicted by each permutation model from breast cancer (Panel a) and lung cancer (Panel b). The number of predicted driver genes is on top of each bar
Performance comparison of the seven permutation models on breast and lung cancer
| Method | Number of significant genes | Overlap with CGC | Method consensus | CGC Rank | Consensus Rank | Average Rank |
|---|---|---|---|---|---|---|
| Breast cancer | | | | | | |
| CADD | 263 | 0.041 | 0.45 | 1 | 4 | 2.5 |
| DANN | 158 | 0.011 | 0.57 | 6 | 2 | 4 |
| Fathmm-MKL coding | 178 | 0.022 | 0.58 | 2 | 2 | 2 |
| Fathmm-MKL noncoding | 184 | 0.021 | 0.49 | 3 | 3 | 3 |
| MetaLR | 174 | 0.013 | 0.5 | 5 | 3 | 4 |
| SPIDEX | 178 | 0.015 | 0.12 | 4 | 5 | 4.5 |
| VEST3 | 171 | 0.021 | 0.62 | 3 | 1 | 2 |
| Lung cancer | | | | | | |
| CADD | 121 | 0.017 | 0.48 | 2 | 4 | 3 |
| DANN | 142 | 0.013 | 0.25 | 3 | 6 | 4.5 |
| Fathmm-MKL coding | 138 | 0.021 | 0.66 | 1 | 1 | 1 |
| Fathmm-MKL noncoding | 149 | 0.011 | 0.54 | 4 | 3 | 3.5 |
| MetaLR | 178 | 0.007 | 0.46 | 5 | 5 | 5 |
| SPIDEX | 171 | 0.002 | 0.11 | 6 | 7 | 6.5 |
| VEST3 | 164 | 0.011 | 0.57 | 4 | 2 | 3 |
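
In every row of this table, the Average Rank value equals the arithmetic mean of the CGC Rank and the Consensus Rank. The short sketch below recomputes it for the breast cancer rows. The method labels in the code are an assumption: rows are matched to the models of the first table by their number of significant genes (263, 158, 178, 184, 174, 178, 171), which appear in the same order there.

```python
# (CGC rank, consensus rank) per row, copied from the breast cancer block.
# Method labels are assumed from the column order of the first table.
breast_ranks = {
    "CADD": (1, 4),
    "DANN": (6, 2),
    "Fathmm-MKL coding": (2, 2),
    "Fathmm-MKL noncoding": (3, 3),
    "MetaLR": (5, 3),
    "SPIDEX": (4, 5),
    "VEST3": (3, 1),
}

# Average rank = mean of the two ranks; a lower value is better.
average_rank = {m: (cgc + cons) / 2 for m, (cgc, cons) in breast_ranks.items()}

# sorted() is stable, so tied methods keep the table's row order.
for method, avg in sorted(average_rank.items(), key=lambda kv: kv[1]):
    print(f"{method}: {avg}")
```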