Rui Tian, Malay K Basu, Emidio Capriotti.
Abstract
MOTIVATION: Recent advances in high-throughput sequencing technologies are generating a huge amount of data, which is becoming an important resource for deciphering the genotype underlying a given phenotype. Genome sequencing has been extensively applied to the study of cancer genomes. Although a few methods have already been proposed for the detection of cancer-related genes, their automatic identification is still a challenging task. Using the genomic data made available by The Cancer Genome Atlas (TCGA) Consortium, we propose a new prioritization approach based on the analysis of the distribution of putative deleterious variants in a large cohort of cancer samples.
Year: 2014 PMID: 25161249 PMCID: PMC4147919 DOI: 10.1093/bioinformatics/btu466
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (A–C) Scatter plots of tumor versus background PDRs for all PIGs. The x-axis reports the background PDR of each gene, which corresponds to the maximum PDR in normal and 1000 Genomes samples. The y-axis reports the PDRs calculated on the tumor samples. The gray scale on the side assigns darker colors to higher-scoring PIGs.
Performance of the method in discriminating tumor types
| Tumor | Q2 | PPV | TPR | NPV | TNR | C | AUC | NG |
|---|---|---|---|---|---|---|---|---|
| COAD | 0.98 | 0.97 | 0.99 | 0.99 | 0.97 | 0.96 | 0.99 | 107/128 |
| LUAD | 0.77 | 0.74 | 0.83 | 0.82 | 0.71 | 0.55 | 0.83 | 274/28 |
| PRAD | 0.84 | 0.78 | 0.95 | 0.94 | 0.72 | 0.69 | 0.89 | 59/199 |
Note: Q2, overall accuracy; PPV and NPV, positive and negative predictive values; TPR and TNR, true positive and true negative rates; C, Matthews correlation coefficient; AUC, area under the ROC curve. NG is the number of top positive/lowest negative genes with score higher than 3/lower than −3.
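The measures listed in the note can be illustrated with a minimal sketch. It assumes the standard textbook definitions based on confusion-matrix counts, and it computes AUC as the fraction of positive/negative pairs ranked correctly, which is equivalent to the area under the ROC curve. The function names `binary_metrics` and `auc` and the example counts are hypothetical; nothing here is taken from the authors' code.

```python
from math import sqrt, nan


def _ratio(num, den):
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else nan


def binary_metrics(tp, fp, tn, fn):
    """Standard two-class measures from confusion-matrix counts.

    Hypothetical helper: keys follow the abbreviations used in the table note.
    """
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Q2": _ratio(tp + tn, tp + fp + tn + fn),  # overall accuracy
        "PPV": _ratio(tp, tp + fp),                # positive predictive value
        "TPR": _ratio(tp, tp + fn),                # true positive rate
        "NPV": _ratio(tn, tn + fn),                # negative predictive value
        "TNR": _ratio(tn, tn + fp),                # true negative rate
        "C": _ratio(tp * tn - fp * fn, mcc_den),   # Matthews correlation
    }


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win; both score lists must be non-empty.
    """
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    for name, value in binary_metrics(tp=90, fp=10, tn=80, fn=20).items():
        print(f"{name}: {value:.2f}")
    print(f"AUC: {auc([0.9, 0.8], [0.1, 0.8]):.3f}")
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a few hundred samples; a rank-based computation would be preferable for larger sets.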
Performance of the methods on different tumor types
| Tumor | Method | Q2 | PPV | TPR | NPV | TNR | C | AUC |
|---|---|---|---|---|---|---|---|---|
| COAD | CRank | 0.92 | 0.97 | 0.86 | 0.87 | 0.97 | 0.84 | 0.94 |
| COAD | CLow | 0.72 | 0.78 | 0.66 | 0.72 | 0.78 | 0.47 | 0.79 |
| COAD | CDiff | 0.76 | 0.78 | 0.77 | 0.80 | 0.74 | 0.55 | 0.83 |
| LUAD | CRank | 0.97 | 0.97 | 0.96 | 0.96 | 0.97 | 0.93 | 0.99 |
| LUAD | CLow | 0.84 | 0.88 | 0.79 | 0.81 | 0.89 | 0.69 | 0.91 |
| LUAD | CDiff | 0.89 | 0.89 | 0.90 | 0.90 | 0.88 | 0.79 | 0.96 |
| PRAD | CRank | 0.91 | 0.92 | 0.91 | 0.91 | 0.92 | 0.83 | 0.97 |
| PRAD | CLow | 0.70 | 0.66 | 0.75 | 0.72 | 0.74 | 0.43 | 0.77 |
| PRAD | CDiff | 0.79 | 0.83 | 0.74 | 0.77 | 0.83 | 0.58 | 0.87 |
Note: Performance of ContrastRank, ContrastLow and ContrastDiff (CRank, CLow and CDiff, respectively) calculated using an average of 239, 494 and 127 genes with score >3 for COAD, LUAD and PRAD, respectively. Q2, overall accuracy; PPV and NPV, positive and negative predictive values; TPR and TNR, true positive and true negative rates; C, Matthews correlation coefficient; AUC, area under the ROC curve.
Fig. 2. Venn diagram representing the number of PIGs with average scores >3 for colon, lung and prostate adenocarcinomas (COAD, LUAD and PRAD, respectively). Among those genes, TP53, BRAF, NBEA, AR and RNF145 are common to all three adenocarcinomas.