Bjarne Johannessen, Anita Sveen, Rolf I Skotheim.
Abstract
Alternative splicing is a key regulatory mechanism for gene expression, vital for the proper functioning of eukaryotic cells. Disruption of normal pre-mRNA splicing has the potential to cause and reinforce human disease. Owing to rapid advances in high-throughput technologies, it is now possible to identify novel mRNA isoforms and detect aberrant splicing patterns on a genome scale, across large data sets. Analogous to the genomic types of instability describing cancer genomes (e.g., chromosomal instability and microsatellite instability), transcriptome instability (TIN) has recently been proposed as a splicing-related genome-wide characteristic of certain solid cancers. We present the R package TIN, available from Bioconductor, which implements a set of methods for TIN analysis based on exon-level microarray expression profiles. TIN provides tools for estimating aberrant exon usage across samples and for analyzing correlation patterns between TIN and splicing factor expression levels.
Keywords: R software; alternative splicing; exon microarray; splicing factor; transcriptome instability
Year: 2015 PMID: 26448683 PMCID: PMC4578549 DOI: 10.4137/CIN.S31363
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Pipeline for investigating TIN in tumor samples based on exon-level microarray data. CEL files with raw expression data are taken as input, along with gene-level expression data. The FIRMA algorithm is used to identify exon skipping and inclusion events, and user-defined thresholds (such as the upper and lower first percentiles) are used to denote exons as aberrantly spliced. The correlation between aberrant exon usage and splicing factor gene expression is evaluated and tested against random associations in two ways. First, the correlation step is repeated using permutations of the expression data at each probe set. Second, the correlation is calculated using random gene sets instead of known splicing factor genes.
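The thresholding step in this pipeline can be illustrated with a minimal sketch. It is written in plain Python and is not the TIN package's R API; the function names (`percentile`, `aberrant_counts`) are hypothetical. It assumes that the percentile thresholds are computed over all FIRMA scores pooled across probe sets and samples, and that low scores indicate exon skipping while high scores indicate exon inclusion.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a pre-sorted list (pct between 0 and 100)."""
    if not sorted_values:
        raise ValueError("no values given")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def aberrant_counts(scores, pct=1.0):
    """Count aberrant exon usage events per sample.

    scores: list of rows, one per probe set; each row holds one
            FIRMA-like score per sample (hypothetical input layout).
    pct:    user-defined tail size in percent (1.0 = first percentile).

    Assumption: thresholds are taken from the pooled score distribution.
    Returns (skipping, inclusion), two lists of per-sample counts.
    """
    pooled = sorted(v for row in scores for v in row)
    low = percentile(pooled, pct)           # lower-tail threshold
    high = percentile(pooled, 100 - pct)    # upper-tail threshold

    n_samples = len(scores[0])
    skipping = [0] * n_samples
    inclusion = [0] * n_samples
    for row in scores:
        for j, value in enumerate(row):
            if value <= low:        # assumed: low score = exon skipping
                skipping[j] += 1
            elif value > high:      # assumed: high score = exon inclusion
                inclusion[j] += 1
    return skipping, inclusion
```

Dividing each count by the number of probe sets would give sample-wise relative amounts of the kind plotted per sample.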
Figure 2. (A) Sample-wise relative amounts (blue dots) of aberrant exon inclusion (horizontal axis) and exon skipping (vertical axis) events for the 131 prostate cancers in the worked example, compared with random sample-wise amounts calculated from permuted FIRMA scores (yellow dots). (B) Correlation between estimated aberrant exon usage and splicing factor expression, compared with random gene sets and permuted TIN-estimates. In the example cancer dataset, 195 of the 280 (70%) splicing factor genes had expression levels that were significantly correlated with the TIN-estimates (P < 0.05; Pearson correlation; red dot; horizontal axis). This is more than expected by chance, as shown by comparison first with 1,000 random permutations of the TIN-estimates (bar graphs in dark blue) and second with 1,000 random sets of 280 genes (bar graphs in light blue). (C) Negative correlation between TIN-estimates and splicing factor expression in the example prostate cancer dataset. TIN-estimates and expression levels of splicing factors (n = 280) were strongly and inversely associated, with a much higher percentage of splicing factor genes significantly negatively correlated (horizontal axis) than positively correlated (vertical axis) (red). The shift was larger than expected by chance, as demonstrated by comparing first with each of 1,000 permutations of the TIN-estimates (dark blue) and second with genes in each of 1,000 random sets of 280 genes (light blue). (D) Unsupervised hierarchical clustering analysis (Euclidean distance metric; complete linkage) of all 131 samples based on the expression levels of all 280 splicing factor genes. The example prostate series is separated into clusters, with some samples having predominantly lower (blue) or higher (red) relative amounts of deviating exon usage than samples closer to the average (black).
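The comparison against permuted TIN-estimates in panel (B) can be sketched as below. This is an illustrative Python outline, not the TIN package's implementation, and every name in it is hypothetical. As a simplifying assumption, a fixed cutoff on the absolute Pearson correlation (`r_cut`) stands in for the P < 0.05 significance test.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_correlated(tin, expr_by_gene, r_cut):
    """Number of genes whose expression has |r| >= r_cut with the TIN-estimates.

    Assumption: r_cut replaces a proper significance test.
    """
    return sum(
        1 for values in expr_by_gene.values()
        if abs(pearson(tin, values)) >= r_cut
    )


def permutation_null(tin, expr_by_gene, r_cut, n_perm=1000, seed=0):
    """Counts obtained after shuffling the TIN-estimates across samples."""
    rng = random.Random(seed)
    shuffled = list(tin)
    counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        counts.append(count_correlated(shuffled, expr_by_gene, r_cut))
    return counts


def empirical_p(observed, null_counts):
    """Share of permutations with a count at least as large as observed."""
    hits = sum(1 for c in null_counts if c >= observed)
    return (1 + hits) / (1 + len(null_counts))
```

The second control, random gene sets, would follow the same pattern: instead of shuffling the TIN-estimates, one would repeatedly draw a random set of genes of the same size as the splicing factor set and apply `count_correlated` to each draw.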