Zhuo Zhang, Hao Li, Shuai Jiang, Ruijiang Li, Wanying Li, Hebing Chen, Xiaochen Bo.
Abstract
The Cancer Genome Atlas (TCGA) is a publicly funded project that aims to catalog and discover major cancer-causing genomic alterations with the goal of creating a comprehensive 'atlas' of cancer genomic profiles. The availability of this genome-wide information provides an unprecedented opportunity to expand our knowledge of tumourigenesis. Computational analytics and mining are frequently used as effective tools for exploring this byzantine series of biological and biomedical data. However, some of the more advanced computational tools are often difficult to understand or use, which limits their application by scientists who do not have a strong computational background. Hence, it is of great importance to build user-friendly interfaces that allow both computational scientists and life scientists without a computational background to gain greater biological and medical insights. To that end, this survey was designed to systematically present available Web-based tools and facilitate the use of TCGA data for cancer research.
Keywords: The Cancer Genome Atlas; bioinformatics tools; cancer; databases; survey
Year: 2019 PMID: 29617727 PMCID: PMC6781580 DOI: 10.1093/bib/bby023
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. Overview of common analyses and some applications for multidimensional data available from TCGA.
ID types within TCGA
| ID type | Description | Example |
|---|---|---|
| File UUID | ID of data in TCGA | 00a2364d-7385-4fa8-8562-b4f19548505a |
| File Submitted ID | ID of data uploaded to TCGA | 147f470-7440-42b8-8e3a-4e28b654916e-beta-value |
| Case UUID | Sample/case ID in TCGA | 942c0088-c9a0-428c-a879-e16f8c5bfdb8 |
| Case Submitted ID | ID of sample/case uploaded to TCGA, which is commonly used to represent sample/case | TCGA-CJ-4642 |
| Project ID | Project ID which sample/case belongs to | TCGA-BRCA |
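The ID shapes in the table above can be told apart mechanically. The following is a minimal sketch, assuming the patterns can be inferred from the example column alone; the regular expressions and the function name `guess_id_type` are illustrative assumptions, not an official TCGA specification. File and case UUIDs share the same format, so shape alone cannot separate them.

```python
import re

# Shapes inferred from the "Example" column above; these patterns are
# illustrative assumptions, not an official TCGA specification.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
CASE_SUBMITTED_RE = re.compile(r"^TCGA-[A-Z0-9]{2}-[A-Z0-9]{4}$")
PROJECT_RE = re.compile(r"^TCGA-[A-Z]+$")


def guess_id_type(identifier: str) -> str:
    """Return a coarse guess of the ID type based only on its shape."""
    if UUID_RE.match(identifier):
        # File UUIDs and case UUIDs look identical, so report both as "uuid".
        return "uuid"
    if CASE_SUBMITTED_RE.match(identifier):
        return "case_submitted_id"
    if PROJECT_RE.match(identifier):
        return "project_id"
    return "unknown"


if __name__ == "__main__":
    for example in ("942c0088-c9a0-428c-a879-e16f8c5bfdb8", "TCGA-CJ-4642", "TCGA-BRCA"):
        print(example, "->", guess_id_type(example))
```

A check like this is only a first filter; whether a UUID refers to a file or a case has to be resolved against the data portal itself.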
Description of data types and their access level
| Data type | Description | Access Level |
|---|---|---|
| Aligned Reads | Raw sequencing data | Controlled |
| Raw Simple Somatic Mutation | Raw mutation information data | Controlled |
| Annotated Somatic Mutation | Annotated mutation information data | Controlled |
| Aggregated Somatic Mutation | Aggregated mutation information data | Controlled |
| Masked Somatic Mutation | Transformed mutation information data | Open |
| Gene Expression Quantification | Gene expression data | Open |
| Copy Number Segment | Copy number information data | Open |
| Masked Copy Number Segment | Transformed copy number information data | Open |
| Methylation Beta Value | Methylation data | Open |
| Isoform Expression Quantification | Mature miRNA expression data | Open |
| miRNA Expression Quantification | miRNA expression data | Open |
| Biospecimen Supplement | Biospecimen information | Open |
| Clinical Supplement | Clinical information | Open |
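Before planning a download, it can help to check which of the requested data types need controlled-access authorization. The sketch below simply transcribes the table above into a dictionary; the function name `split_by_access` is a hypothetical helper introduced for illustration.

```python
# Access levels transcribed from the table above.
ACCESS_LEVEL = {
    "Aligned Reads": "Controlled",
    "Raw Simple Somatic Mutation": "Controlled",
    "Annotated Somatic Mutation": "Controlled",
    "Aggregated Somatic Mutation": "Controlled",
    "Masked Somatic Mutation": "Open",
    "Gene Expression Quantification": "Open",
    "Copy Number Segment": "Open",
    "Masked Copy Number Segment": "Open",
    "Methylation Beta Value": "Open",
    "Isoform Expression Quantification": "Open",
    "miRNA Expression Quantification": "Open",
    "Biospecimen Supplement": "Open",
    "Clinical Supplement": "Open",
}


def split_by_access(data_types):
    """Partition requested data types into (open, controlled) lists."""
    open_types, controlled_types = [], []
    for name in data_types:
        level = ACCESS_LEVEL[name]  # raises KeyError for names not in the table
        if level == "Open":
            open_types.append(name)
        else:
            controlled_types.append(name)
    return open_types, controlled_types


if __name__ == "__main__":
    wanted = ["Aligned Reads", "Clinical Supplement", "Masked Somatic Mutation"]
    open_types, controlled_types = split_by_access(wanted)
    print("Open:", open_types)
    print("Controlled:", controlled_types)
```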
List of Web servers and databases
| Name | Databases | Batch queries | Mutation analysis | Correlation analysis | Differential expression analysis | Pathway analysis | Kaplan–Meier plots | Pan-cancer analysis | Visualization type | Download | API | URL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BCMD | TCGA | No | No | No | No | No | No | No | Image | No | No | |
| Broad GDAC Firehose | TCGA | No | Yes | Yes | Yes | Yes | Yes | Yes | Matrix, Histogram | Yes | Yes | |
| Cancer Landscapes | TCGA | No | No | Yes | No | Yes | Yes | Yes | Networks, Matrix | Yes | No | |
| Cancer3D | TCGA, CCLE | No | Yes | No | No | No | No | No | Genomic coordinates, Network, Scatter plots/box plots, 3D structure | Yes | No | |
| canEvolve | TCGA, ICGC, GEO | Yes | No | Yes | Yes | Yes | Yes | No | Heatmap, Network, Plots | Yes | No | |
| cbioportal | TCGA, CCLE | Yes | Yes | Yes | Yes | No | Yes | Yes | Networks, Matrix, Heatmaps | Yes | Yes | |
| CDSA | TCGA | No | No | No | No | No | No | No | Image | No | No | |
| CELLX | TCGA, CCLE, GEO, GSK, GTEx | Yes | Yes | Yes | Yes | No | Yes | No | Heatmap, Matrix | Yes | No | |
| GDISC | TCGA | No | No | Yes | No | No | Yes | No | Matrix, Box plots | Yes | No | |
| GEPIA | TCGA, GTEx | Yes | No | Yes | Yes | No | Yes | No | Matrix, Bar graph, Box plots/violin plots/dot plots | Yes | Yes | |
| IntOGen | TCGA, ICGC | Yes | Yes | No | No | No | No | Yes | Heatmap, Matrix, Histogram | Yes | No | |
| KMplotter | TCGA, GEO, EGA | Yes | No | No | No | No | Yes | No | Linear plots | Yes | No | |
| MethHC | TCGA | Yes | No | Yes | No | Yes | No | No | Matrix, Heatmaps | Yes | No | |
| MEXPRESS | TCGA | No | No | Yes | Yes | No | No | No | Genomic coordinates | Yes | Yes | |
| OASISPRO | TCGA | No | No | Yes | No | No | Yes | No | Histogram, linear plots/box plots | Yes | No | |
| OncoScape | TCGA, CCLE | Yes | No | No | Yes | Yes | No | No | Heatmap, Pathway maps, Matrix, Scatter plot | Yes | No | |
| PathwayMapper | TCGA | No | No | No | No | Yes | No | No | Pathway maps | Yes | Yes | |
| PROGgeneV2 | TCGA, GEO, NKI | Yes | No | No | No | No | Yes | No | Linear plots | Yes | No | |
| Regulome Explorer | TCGA | No | No | Yes | No | Yes | No | Yes | Circos, Genomic coordinates, Network, Matrix | Yes | No | |
| TANRIC | TCGA, CCLE | No | Yes | Yes | Yes | No | Yes | No | Heatmaps | Yes | No | |
| TCGA Clinical Explorer | TCGA | No | Yes | Yes | No | No | Yes | No | Matrix, Histogram | Yes | No | |
| TCGA Mbatch | TCGA | No | No | No | No | No | No | No | Matrix, PCA diagrams, Hierarchical clustering diagrams | Yes | No | |
| TCGA NG-CHM | TCGA | No | No | Yes | No | Yes | No | Yes | Heatmaps | Yes | No | |
| TCGA SpliceSeq | TCGA | No | No | No | No | No | No | No | Matrix | Yes | No | |
| TCGA4U | TCGA | Yes | Yes | No | Yes | No | Yes | No | Heatmap, Matrix, Histogram | Yes | No | |
| TCIA | TCGA | No | No | No | No | No | No | No | Image | Yes | Yes | |
| TCPA | TCGA | No | No | Yes | Yes | No | Yes | No | Networks, Heatmaps | Yes | No | |
| UALCAN | TCGA | Yes | No | No | Yes | No | Yes | No | Heatmap, Boxplots, Linear plots | Yes | No | |
| UCSC Xena | TCGA, GDC, ICGC, GTEx, TARGET, TOIL | No | Yes | No | No | No | Yes | Yes | Heatmaps, Scatter plot, Histogram | Yes | Yes | |
| Vanno | TCGA | No | Yes | No | No | No | No | No | Circos, Matrix, 3D structure, Heatmap | Yes | No | |
| Wanderer | TCGA | No | No | Yes | Yes | No | No | No | Genomic coordinates, Scatter plot | Yes | Yes | |
| Zodiac | TCGA | Yes | No | Yes | No | No | No | Yes | Matrix, Circular network | No | No | |
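A feature matrix like the one above lends itself to simple filtering when choosing a tool. The sketch below is illustrative only: it hand-transcribes a small subset of rows and two of the columns (Kaplan–Meier plots and API), and the `Tool` class and `tools_with` helper are hypothetical names, not part of any of the listed tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    kaplan_meier: bool  # "Kaplan–Meier plots" column
    api: bool           # "API" column


# A hand-transcribed subset of the table above, not the full list.
TOOLS = [
    Tool("Broad GDAC Firehose", kaplan_meier=True, api=True),
    Tool("cbioportal", kaplan_meier=True, api=True),
    Tool("GEPIA", kaplan_meier=True, api=True),
    Tool("KMplotter", kaplan_meier=True, api=False),
    Tool("MEXPRESS", kaplan_meier=False, api=True),
    Tool("PROGgeneV2", kaplan_meier=True, api=False),
    Tool("UCSC Xena", kaplan_meier=True, api=True),
]


def tools_with(tools, **required):
    """Return names of tools whose attributes match every keyword given."""
    return [
        tool.name
        for tool in tools
        if all(getattr(tool, key) == value for key, value in required.items())
    ]


if __name__ == "__main__":
    # Tools in this subset that offer both survival plots and an API.
    print(tools_with(TOOLS, kaplan_meier=True, api=True))
```

Extending the sketch to the full table would mean adding one field per column; a tabular library would be a natural choice at that point.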
Additional databases and Web servers
| Name | Content | URL |
|---|---|---|
| AnimalTFDB 2.0 | Animal transcription factors | |
| ArrayMap | A resource for genomic copy number profiles of human tumors | |
| BloodSpot | Gene expression profiles and transcriptional programs for healthy and malignant hematopoiesis | |
| BreCAN-DB | Break point profiles of cancer genomes | |
| Cancer RNA-Seq Nexus | Phenotype-specific transcriptome profiling | |
| canSAR | Cancer research and drug discovery | |
| ccmGDB | Cancer cell metabolism gene | |
| CGWB | A computational platform to integrate clinical tumor mutation profiles with the reference human genome | |
| ChimerDB 3.0 | Fusion gene | |
| ChIPBase v2.0 | Transcriptional regulatory networks of noncoding RNAs and protein-coding genes | |
| CMPD | Cancer mutant proteome database | |
| COSMIC | Somatic mutations in human cancer | |
| dbDEMC 2.0 | Differentially expressed miRNAs in human cancer | |
| DBTSS | Transcriptome, epigenome and genome sequence variation data | |
| DiseaseMeth | Human disease methylation database | |
| DriverDBv2 | Human cancer driver gene | |
| LNCediting | A database for functional effects of RNA editing in lncRNAs | |
| lncRNASNP | SNPs in lncRNAs | |
| miRTarBase 2016 | MiRNA database | |
| Mutagene | Cancer genetic heterogeneity | |
| MutationAligner | Recurrent mutation hot spots | |
| mutLBSgeneDB | Mutated ligand-binding site gene DataBase | |
| NetGestalt | Multidimensional omics data | |
| Oncotator | Cancer variant annotation tool | |
| PhosphoSitePlus | Protein posttranslational modifications | |
| POSTAR | Posttranscriptional regulation | |
| RBP-Var | Functional variants involved in regulation mediated by RNA-binding proteins | |
| WebGestalt 2017 | Enrichment analysis | |
| YM500v2 | MiRNAs for human cancer | |
Figure 2. Two explorations of global alteration profile patterns as provided by the publicly accessible Broad GDAC Firehose and Cancer Landscapes Web tools. (A) This window view displays the user interface of Broad GDAC Firehose, where users can choose a specific mutation analysis method. (B) This window provides network modeling of multiple cancers and data sets, as indicated by the data sets and data types selected at the far right in Cancer Landscapes.
Figure 3. An exploration of driver genes associated with lung adenocarcinoma was conducted in OncoScape (A) and IntOGen (B). The two windows display different formats for the results obtained.
Figure 4. Views of interface windows in OASISPRO. (A) The stepwise selection of parameters for conducting a classification of clinical phenotypes is shown. (B) This window presents the input variables and results obtained from a representative analysis.
Figure 5. A representative window of the results provided by Regulome Explorer for a correlation analysis. This figure displays the main user interface, including the option for using multiple data types.
Figure 6. A representative survival plot generated with PROGgeneV2. TP53 gene expression was applied to a lung adenocarcinoma data set from TCGA.