Michael Ryan, Wing Chung Wong, Robert Brown, Rehan Akbani, Xiaoping Su, Bradley Broom, James Melott, John Weinstein.
Abstract
TCGA's RNASeq data represent one of the largest collections of cancer transcriptomes ever assembled. RNASeq technology, combined with computational tools like our SpliceSeq package, provides a comprehensive, detailed view of alternative mRNA splicing. Aberrant splicing patterns in cancers have been implicated in such processes as carcinogenesis, de-differentiation and metastasis. TCGA SpliceSeq (http://bioinformatics.mdanderson.org/TCGASpliceSeq) is a web-based resource that provides a quick, user-friendly, highly visual interface for exploring the alternative splicing patterns of TCGA tumors. Percent Spliced In (PSI) values for splice events on samples from 33 different tumor types, including available adjacent normal samples, have been loaded into TCGA SpliceSeq. Investigators can interrogate genes of interest, search for the genes that show the strongest variation between or among selected tumor types, or explore splicing pattern changes between tumor and adjacent normal samples. The interface presents intuitive graphical representations of splicing patterns, read counts and various statistical summaries, including percent spliced in. Splicing data can also be downloaded for inclusion in integrative analyses. TCGA SpliceSeq is freely available for academic, government or commercial use.
Year: 2015 PMID: 26602693 PMCID: PMC4702910 DOI: 10.1093/nar/gkv1288
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. SpliceSeq analysis of mRNA data. (A) Ensembl coding transcripts for each gene are assembled into a unified splice graph. (B) Reads are aligned to the splice graphs and totals for each exon and splice are calculated. Read counts are normalized to the length of each exon and the number of aligned reads in the sample. (C) Percent Spliced In values are computed for each possible splice event in each gene. PSI is the ratio of reads indicating the presence of a transcript element to the total reads covering the event. In this example, the 8 yellow reads (exon 3 body reads, exon 2–3 junction reads and exon 3–4 junction reads) indicate that exon 3 is present. The red junction 2–4 reads indicate that exon 3 was spliced out. The PSI is therefore 8/10 or 0.8, indicating that 80% of the transcripts in the sample include exon 3. (D) SpliceSeq evaluates all of the listed types of splice events.
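The PSI arithmetic in panel (C) can be sketched in a few lines of Python. This is a minimal illustration, not the SpliceSeq implementation: the function name `percent_spliced_in` and its parameters are hypothetical, and it works on raw read counts as in the figure's example rather than on the length-normalized counts described in panel (B).

```python
def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> float:
    """Return PSI: reads supporting the element over all reads covering the event.

    Hypothetical helper for illustration only; raw counts are assumed,
    with no exon-length or library-size normalization.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        # PSI is undefined when no reads cover the event.
        raise ValueError("no reads cover this splice event")
    return inclusion_reads / total


# Figure 1(C) example: 8 reads support exon 3, and the 8/10 ratio implies
# 2 junction 2-4 reads that skip it.
psi = percent_spliced_in(inclusion_reads=8, exclusion_reads=2)
print(psi)  # 0.8
```

Raising an error for zero coverage, rather than returning 0, keeps "exon never included" distinct from "event not observed in this sample".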
Figure 2. TCGA SpliceSeq gene symbol search results. (A) Results table in the TCGA SpliceSeq gene symbol search. Each row contains data for a splice event in the gene. The table columns show the average PSI values for specific tumor types. Selecting a cell for a splice event/tumor type (e.g. exon 4 skip event in COAD) causes all other display panels to present data for the selected event or tissue. (B) The gene information tab provides a general description of the selected gene and the hg19 coordinates and sequence for each exon. The exon information can be used to identify exons mentioned in literature with different naming schemes. (C) A splice graph of the gene's exons is shaded based on expression level and shows the selected splice event outlined in red. (D) The PSI plot for the exon 4 skip event in RAC1 shows the expected inclusion of exon 4 in COAD but also strong use of exon 4 in LUAD and several other tissues. (E) Selecting ‘Show Normal’ in the plot panel shows PSI values for the same splice event in adjacent normal tissue. Normal values show that the exon is expressed in normal tissues but that COAD and LUAD tumor samples have higher exon 4 expression than adjacent normal. (F) Selecting ‘Show Scatter Plot’ in the plot panel displays individual sample PSI values, providing insight into the range/consistency of PSI values for a group of samples. LUAD samples demonstrate a much wider range of exon 4 inclusion than do COAD samples.
Figure 3. TCGA SpliceSeq Top Hits results. (A) PSI values for an exon 11 skip event in the CLSTN1 gene. This gene is a top hit in the tumor difference query for all tissues because the PSI values for exon 11 inclusion vary dramatically across different cancer types. (B) Splice graph of the CLSTN1 exon 11 skip event (highlighted in red) showing average expression in BLCA tumor samples. (C) Portion of the UniProt tab for CLSTN1. Red/green sections with numbers indicate the exon that codes for each section of the protein. Tracks below the amino acid sequence show functional and structural annotations. The exon 11 skip event falls in the extracellular region of the protein, so it may affect protein–protein interactions. (D) PSI values of an exon 11 skip event in the extracellular matrix gene FBLN2. The event ranked high in the tumor-normal results for the subset of tumor types displayed because the PSI shifts strikingly and consistently between tumor and adjacent normal values.
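The two Top Hits ideas in this caption, variation across tumor types and shift between tumor and adjacent normal, can be illustrated with simple summary statistics. The record does not state which statistic TCGA SpliceSeq uses to rank events, so the sketch below is a stand-in under stated assumptions: the function names are hypothetical, the spread of per-type mean PSI and the difference of means are chosen only for simplicity, and the PSI values are made-up placeholders, not TCGA data.

```python
from statistics import mean
from typing import Mapping, Sequence


def psi_spread_across_tumors(psi_by_tumor: Mapping[str, Sequence[float]]) -> float:
    """Spread (max minus min) of mean PSI across tumor types.

    Hypothetical ranking score; not the statistic used by TCGA SpliceSeq.
    """
    means = [mean(values) for values in psi_by_tumor.values() if values]
    if len(means) < 2:
        raise ValueError("need PSI values for at least two tumor types")
    return max(means) - min(means)


def mean_delta_psi(tumor_psi: Sequence[float], normal_psi: Sequence[float]) -> float:
    """Mean tumor PSI minus mean adjacent-normal PSI (hypothetical score)."""
    if not tumor_psi or not normal_psi:
        raise ValueError("need at least one tumor and one normal PSI value")
    return mean(tumor_psi) - mean(normal_psi)


# Made-up PSI values for illustration only.
example = {"BLCA": [0.2, 0.3], "COAD": [0.8, 0.9], "LUAD": [0.5, 0.6]}
print(psi_spread_across_tumors(example))
print(mean_delta_psi(tumor_psi=[0.8, 0.9], normal_psi=[0.3, 0.4]))
```

A difference of means says nothing about consistency; the per-sample scatter described for Figure 2(F) is what shows whether a shift holds across samples.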