Cristina Della Beffa, Francesca Cordero, Raffaele A Calogero.
Abstract
BACKGROUND: A new microarray platform (GeneChip Exon 1.0 ST) has recently been developed by Affymetrix (http://www.affymetrix.com). This platform changes the conventional view of transcript analysis, since it allows the expression level of a transcript to be evaluated by querying each of its exons. The Exon 1.0 ST platform does, however, raise some issues regarding the approaches to be used in identifying genome-wide alternative splicing events (ASEs). In this study, an exon-level data analysis workflow is dissected to identify the limits and strengths of each step, so that the overall workflow can be modified to optimize the detection of ASEs.
Year: 2008 PMID: 19040723 PMCID: PMC2612032 DOI: 10.1186/1471-2164-9-571
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Example of a set of exon skipping events and exon-skipping cleaning procedure. A) Example of a set of exon skipping events. The gene-level probe set (gene) G1 is made of 5 exon-level probe sets (exons): E1, E2, E3, E4, E5. Exon-level probe set signals associated with the 128 pM spike-in are black, whereas signals associated with the 32 pM spike-in are grey. New genes are created by combining exon-level expression values derived from different spike-in concentrations. In this specific case, the 128 and 32 pM spike-in signals for gene G1 are combined to generate 5 new genes (G1skipE1, G1skipE2, etc.), each characterized by a skipping event, given by the 32 pM spike-in, in one of the 5 exons of gene G1. The unspliced exons are instead given by the 128 pM spike-in. For the sake of simplicity, only one of the three technical replicates is shown. B) Exon-skipping cleaning procedure. The cleaning procedure, applied to all new genes characterized by a skipping event, retains only those in which the synthetic skipping event has the smallest intensity or SI value among the exons belonging to the gene. The example shown is gene G5, which is made of 7 exons and therefore produces 7 new genes (G5skipE1, G5skipE2, etc.). In the G5skipE3 gene, exon E3 should be the only exon with the smallest SI. The G5skipE3 gene is retained in the set 128-32, since E3 (grey) has the smallest SI of all 7 exons (black). The gene is instead removed from the set 2-0, since exon E5 has a smaller SI than exon E3.
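The construction and cleaning steps described in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `make_skip_genes` and `clean`, the use of plain intensities rather than SI values, and the numeric signals are all assumptions made for the example.

```python
def make_skip_genes(high, low):
    """Build one synthetic gene per exon (hypothetical helper).

    Exon i takes the low-concentration signal (the simulated skipping event);
    every other exon keeps the high-concentration signal.
    """
    genes = {}
    for i in range(len(high)):
        signal = list(high)
        signal[i] = low[i]
        genes[i] = signal
    return genes


def clean(genes):
    """Keep only genes whose skipped exon is the strict minimum of the gene."""
    kept = {}
    for i, signal in genes.items():
        others = signal[:i] + signal[i + 1:]
        if signal[i] < min(others):
            kept[i] = signal
    return kept


# Made-up log2 intensities for a 5-exon gene (not data from the paper).
high = [9.0, 8.5, 9.2, 8.0, 8.8]   # e.g. 128 pM spike-in
low = [6.0, 5.5, 6.1, 8.6, 5.9]    # e.g. 32 pM spike-in

retained = clean(make_skip_genes(high, low))
print(sorted(retained))  # [0, 1, 2, 4]
```

In this toy gene, the synthetic gene that skips the fourth exon is discarded because another exon has a lower signal than the simulated skip, which mirrors the removal of G5skipE3 from the set 2-0.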
Figure 2. MiDAS exon skipping detection using RMA or PLIER summarization. ROC curves were used to identify the effect of data summarization on the detection of ASEs. ASEs were detected using MiDAS on the full core Exon 1.0 ST data set (continuous lines) using RMA (red line) or PLIER (black line). The same analysis was also applied to a subset of the core Exon 1.0 ST data set containing only those gene/exon-level probe sets that pass the multiple RNAs filter (dashed lines), i.e. those exons of genes associated with more than one mRNA isoform in the ENSEMBL database.
Effect of annotation and intensity based filters on the selection of TP and reduction of unspliced exon set (TN).
| Filter | TP (128.32 vs 512) | TN (128.32 vs 512) | TP (32.2 vs 128) | TN (32.2 vs 128) | TP (2.0 vs 32) | TN (2.0 vs 32) |
|---|---|---|---|---|---|---|
| Cross Hybridization filter | 172 | 228264 | 195 | 228264 | 179 | 228264 |
| Multiple mRNAs filter | 172 | 71037 | 195 | 71037 | 179 | 71037 |
| DABG filter | 172 | 197951 | 185 | 197951 | 170 | 197951 |
The effects of filtering by means of annotation (Cross Hybridization/Multiple mRNAs filters) or intensity signal (DABG filter) are evaluated using exon-skipping events at various concentrations.
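To make the effect of each filter easier to compare, the counts in the table can be expressed as retained fractions. The sketch below uses only the numbers in the table. Treating the Cross Hybridization row as the reference is an assumption made here for illustration, as is the helper name `retained_fraction`.

```python
def retained_fraction(after, before):
    """Fraction of probe sets that survive a filter (hypothetical helper)."""
    return after / before


# Counts copied from the table; the Cross Hybridization row is used as reference.
reference = {"tn": 228264, "tp": [172, 195, 179]}
filters = {
    "Multiple mRNAs": {"tn": 71037, "tp": [172, 195, 179]},
    "DABG": {"tn": 197951, "tp": [172, 185, 170]},
}

for name, counts in filters.items():
    tn_kept = retained_fraction(counts["tn"], reference["tn"])
    tp_kept = [
        retained_fraction(tp, ref_tp)
        for tp, ref_tp in zip(counts["tp"], reference["tp"])
    ]
    print(name, round(tn_kept, 3), [round(x, 3) for x in tp_kept])
```

On these counts, the multiple mRNAs filter keeps every TP while retaining roughly a third of the TN set, whereas the DABG filter keeps most of the TN set and loses a few TPs in two of the three splicing sets.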
Figure 3. Efficacy of MiDAS and RP in the detection of ASEs. ROC curves were used to assess the efficacy of MiDAS and RP in the detection of ASEs. A) ROC curves for ASE detection using MiDAS. B) ROC curves for ASE detection using RPSI. RPSI is the RP calculated using the exon signal normalized with respect to the gene signal, i.e. SI. C) ROC curves for ASE detection using RPI. RPI is the RP calculated using the exon intensity signal without any further normalization.
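The two quantities named in Figure 3 can be illustrated as follows. This is a simplified sketch under stated assumptions: SI is taken as the difference between log2 exon and log2 gene signal (a common definition, not confirmed by this record), and the rank product is reduced to the geometric mean of per-replicate ranks, without the permutation step normally used to obtain p-values. The function names and input values are hypothetical.

```python
import math


def splicing_index(exon_log2, gene_log2):
    """Exon signal normalized to gene signal, on the log2 scale (assumed form)."""
    return exon_log2 - gene_log2


def rank_product(values_per_replicate):
    """Geometric mean of ranks across replicates, one value per exon.

    values_per_replicate: list of lists, one inner list per replicate comparison,
    each holding one value per exon (e.g. a log2 difference in intensity or SI).
    Rank 1 is the smallest value; ties are broken by exon order.
    """
    n_exons = len(values_per_replicate[0])
    k = len(values_per_replicate)
    log_sum = [0.0] * n_exons
    for values in values_per_replicate:
        order = sorted(range(n_exons), key=lambda j: values[j])
        for rank, j in enumerate(order, start=1):
            log_sum[j] += math.log(rank)
    return [math.exp(s / k) for s in log_sum]


# Made-up log2 differences for 4 exons in 3 replicate comparisons.
replicates = [
    [-2.1, 0.1, 0.3, -0.2],
    [-1.8, 0.2, -0.1, 0.0],
    [-2.4, -0.3, 0.2, 0.1],
]
rp = rank_product(replicates)
print(rp.index(min(rp)))  # 0: the first exon ranks lowest in every replicate
```

Passing raw intensity differences to `rank_product` corresponds to the RPI variant, while passing differences of `splicing_index` values corresponds to RPSI.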
MiDAS and RP alternative splicing detection.
| Method | TP (128.32 vs 512) | FP (128.32 vs 512) | TP (32.2 vs 128) | FP (32.2 vs 128) | TP (2.0 vs 32) | FP (2.0 vs 32) |
|---|---|---|---|---|---|---|
| MiDAS | 119 | 2416 | 176 | 2319 | 138 | 2338 |
| RPI | 174 | 12941 | 193 | 11883 | 164 | 9989 |
| MiDAS & RPI intersection | 119 | 436 | 176 | 424 | 138 | 375 |
RPI is the Rank Product calculated using the intensity signals without SI calculation. Statistical analyses done using MiDAS or RPI at p-value ≤ 0.05 are contaminated by a large number of FPs because of the multiple testing problem. Intersecting the results of the two methods substantially reduces the number of FPs, while retaining every TP detected by MiDAS.
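The gain from intersecting the two methods can be quantified from the table as precision, TP / (TP + FP). Precision is not reported in this record. It is derived here from the tabulated counts as a worked example, and the helper name `precision` is an assumption.

```python
def precision(tp, fp):
    """Fraction of detected events that are true positives."""
    return tp / (tp + fp)


# (TP, FP) pairs copied from the table for the 128.32 vs 512 splicing set.
results = {
    "MiDAS": (119, 2416),
    "RPI": (174, 12941),
    "MiDAS & RPI intersection": (119, 436),
}

for method, (tp, fp) in results.items():
    print(method, round(precision(tp, fp), 3))
```

For this splicing set the intersection keeps all 119 MiDAS TPs while cutting FPs from 2416 to 436, which raises precision from roughly 5% to roughly 21%.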
Figure 4. Workflow for exon-level analysis. Workflow proposed for the detection of ASEs. a) The number of probe sets to be considered for the analysis is reduced on the basis of ENSEMBL isoform knowledge (multiple RNAs filter). Optionally, a filter based on the quality of the intensity signal (DABG filter) can be applied in addition. b-c) Statistical analysis is done using a model-based algorithm (MiDAS) and a non-parametric algorithm (RP). d) The results of the two statistical analyses are intersected, using a common arbitrary p-value threshold (e.g. 0.05), to reduce the number of FPs.
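Steps a) to d) of the workflow can be sketched as a single function operating on precomputed p-values. This is an illustrative outline, not the authors' pipeline: the function name `detect_ases`, its parameters, and the toy exon identifiers and p-values are all hypothetical, and the MiDAS and RP p-values are assumed to have been computed elsewhere.

```python
def detect_ases(exon_ids, multi_isoform, midas_p, rp_p, alpha=0.05, dabg_pass=None):
    """Return exons called as ASEs by both methods after filtering.

    exon_ids:      iterable of exon-level probe set identifiers
    multi_isoform: set of exons whose gene has more than one annotated isoform
    midas_p, rp_p: dicts mapping exon id to the p-value from each method
    dabg_pass:     optional set of exons passing the DABG filter
    """
    # a) annotation filter, plus the optional intensity-quality filter
    candidates = [e for e in exon_ids if e in multi_isoform]
    if dabg_pass is not None:
        candidates = [e for e in candidates if e in dabg_pass]
    # b-c) apply the common threshold to each method separately
    midas_hits = {e for e in candidates if midas_p[e] <= alpha}
    rp_hits = {e for e in candidates if rp_p[e] <= alpha}
    # d) keep only exons detected by both methods
    return midas_hits & rp_hits


# Toy inputs (made up for illustration).
exon_ids = ["e1", "e2", "e3", "e4"]
multi_isoform = {"e1", "e2", "e3"}
midas_p = {"e1": 0.01, "e2": 0.03, "e3": 0.20, "e4": 0.001}
rp_p = {"e1": 0.02, "e2": 0.30, "e3": 0.01, "e4": 0.001}

print(detect_ases(exon_ids, multi_isoform, midas_p, rp_p))  # {'e1'}
```

Here `e4` is significant for both methods but is dropped by the annotation filter, while `e2` and `e3` are each supported by only one method and are removed by the intersection.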