Matthew Dapas, Manoj Kandpal, Yingtao Bi, Ramana V Davuluri.
Abstract
Given that the majority of multi-exon genes generate diverse functional products, it is important to evaluate expression at the isoform level. Previous studies have demonstrated strong gene-level correlations between RNA sequencing (RNA-seq) and microarray platforms, but have not studied their concordance at the isoform level. We performed transcript abundance estimation on raw RNA-seq and exon-array expression profiles available for common glioblastoma multiforme samples from The Cancer Genome Atlas using different analysis pipelines, and compared both the isoform- and gene-level expression estimates between programs and platforms. The results showed better concordance between RNA-seq/exon-array and reverse transcription-quantitative polymerase chain reaction (RT-qPCR) platforms for fold change estimates than for raw abundance estimates, suggesting that fold change normalization against a control is an important step for integrating expression data across platforms. Based on RT-qPCR validations, eXpress and Multi-Mapping Bayesian Gene eXpression (MMBGX) programs achieved the best performance for RNA-seq and exon-array platforms, respectively, for deriving the isoform-level fold change values. While eXpress achieved the highest correlation with the RT-qPCR and exon-array (MMBGX) results overall, RSEM was more highly correlated with MMBGX for the subset of transcripts that are highly variable across the samples. eXpress appears to be most successful in discriminating lowly expressed transcripts, but IsoformEx and RSEM correlate more strongly with MMBGX for highly expressed transcripts. The results also reinforce how potentially important isoform-level expression changes can be masked by gene-level estimates, and demonstrate that exon arrays yield comparable results to RNA-seq for evaluating isoform-level expression changes.
Keywords: Exon-array; RNA-seq; alternative splicing; cross-platform integration; gene expression; isoform-level expression
Year: 2017 PMID: 26944083 PMCID: PMC5444266 DOI: 10.1093/bib/bbw016
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Correlations between RNA-seq abundance estimates
Number of overlapping resolved isoforms per sample.
Expression estimates from each of the tested RNA-seq quantification methods were compared with one another. The number of resolved transcripts shared between each pair of methods is shown in the lower-left. The Spearman correlation between each pair of methods is shown in the upper right.
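The pairwise comparison described in this caption can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the input layout (one dictionary of transcript abundances per method, with unresolved transcripts simply absent) and the function names are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    if len(x) < 2:
        return float("nan")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / sqrt(vx * vy)


def pairwise_summary(estimates):
    """estimates: {method: {transcript_id: abundance}} (hypothetical layout).

    Returns {(method_a, method_b): (n_shared_transcripts, spearman_rho)},
    using only transcripts resolved by both methods in the pair.
    """
    summary = {}
    for a, b in combinations(sorted(estimates), 2):
        shared = sorted(set(estimates[a]) & set(estimates[b]))
        xs = [estimates[a][t] for t in shared]
        ys = [estimates[b][t] for t in shared]
        summary[(a, b)] = (len(shared), spearman(xs, ys))
    return summary
```

Each pair yields both numbers the table reports: the shared-transcript count (lower-left) and the rank correlation (upper-right).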
Figure 1. Scatter plots of average expression and fold change (tumor versus normal) estimates between exon array and RNA-seq at both gene and isoform levels. Normalized expression estimates were averaged across samples for each program. RNA-seq estimates were then averaged across programs. Transcripts that were not resolved by a majority of RNA-seq programs were excluded. Transcripts with an average expression TPMadj < log2(0.001) were considered to be not expressed.
Figure 2. Spearman correlation coefficients between MMBGX and different RNA-seq quantification methods. (A) Box plots summarize the distribution of individual sample correlations with MMBGX estimates according to each RNA-seq tool tested. Median correlation values are shown (n = 102). For each method, correlations were calculated for both raw expression values and fold change values relative to the normal-tissue samples. (B) Average number of commonly resolved isoforms between MMBGX and each RNA-seq method. MMBGX-only transcripts (yellow) included only if in top 50% of transcripts. (C) Correlations between MMBGX and each RNA-seq method for relatively highly expressed (75–100%) and lowly expressed (0–25%) isoforms.
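As a rough illustration of fold change values relative to normal-tissue samples, the sketch below divides each tumor abundance by the mean abundance of the same transcript across the normal samples and takes log2. The data layout, the function name, and the `pseudocount` parameter are assumptions for this example; the source does not state how the authors handled zero abundances.

```python
from math import log2
from statistics import mean


def log2_fold_changes(tumor, normal, pseudocount=1e-3):
    """Log2 fold change of each tumor sample against the mean of the normals.

    tumor, normal: {sample_id: {transcript_id: linear-scale abundance}}
    (hypothetical layout; `normal` must contain at least one sample).
    pseudocount: assumed guard against division by zero, not from the source.
    """
    # Use only transcripts present in every normal sample as the baseline.
    transcripts = set.intersection(*(set(v) for v in normal.values()))
    baseline = {t: mean(s[t] for s in normal.values()) for t in transcripts}
    return {
        sample: {
            t: log2((values[t] + pseudocount) / (baseline[t] + pseudocount))
            for t in values
            if t in baseline
        }
        for sample, values in tumor.items()
    }
```

Because every platform is scaled against its own control, platform-specific offsets in raw abundance largely cancel, which is consistent with the abstract's observation that fold changes agree better across platforms than raw estimates do.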
Figure 3. Scatter plots of fold changes labeled according to differential expression. Average fold changes (tumor versus normal) between exon array and RNA-seq are plotted and labeled according to whether they were identified as DE. Genes/isoforms identified as DE by both platforms with consistent direction of change are plotted in green. Genes/isoforms identified as DE by only RNA-seq or exon array are plotted in blue and yellow, respectively. Genes/isoforms not identified as DE by either platform are plotted in gray. Genes/isoforms identified as DE by both platforms but with inconsistent directions of change are plotted in red. (A) Gene-level DE. (B) Isoform-level DE.
Differential gene expression versus isoform dynamics
| RNA-seq | 35 441 | 8434 | 11 921 | 8071 (67.7%) | 6353 | 5040 (79.3%) | 128 | 26 (20.3%) |
| Exon array | 49 039 | 6876 | 8560 | 5986 (69.9%) | 3379 | 2841 (84.1%) | 173 | 45 (26.0%) |
| RNA-seq ∩ Exon array | 33 514 | 4371 | 5001 | 3858 (77.1%) | 2486 | 2203 (88.6%) | 5 | 0 (0.0%) |
| RNA-seq ∪ Exon array | 50 963 | 10 939 | 15 480 | 10 199 (65.9%) | 7246 | 5678 (78.4%) | 296 | 71 (24.0%) |
Differential expression was measured in genes (DEG) with one or more DE isoforms (DEI) using the eXpress and MMBGX data, and in genes with two or more DEI according to whether or not any pairs of differentially expressed isoforms featured opposite directions of change within the same gene. The percentages describe the proportion of each category of gene called as DE.
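The grouping described in this caption (genes with a single DEI, and genes with two or more DEI split by whether any pair changes in opposite directions) could be expressed roughly as follows. The input tuple layout, the function name, and the category labels are hypothetical; the isoforms are assumed to have already been called DE by an upstream test.

```python
from collections import defaultdict


def classify_genes(dei_calls):
    """Group genes by the direction of change of their DE isoforms.

    dei_calls: iterable of (gene_id, isoform_id, log2_fold_change) for
    isoforms already called DE (hypothetical input layout).
    Returns {gene_id: "single DEI" | "same direction" | "opposite directions"}.
    """
    directions = defaultdict(list)
    for gene, _isoform, lfc in dei_calls:
        # True for up-regulated; a DE isoform is assumed to have non-zero change.
        directions[gene].append(lfc > 0)

    categories = {}
    for gene, ups in directions.items():
        if len(ups) < 2:
            categories[gene] = "single DEI"
        elif any(ups) and not all(ups):
            categories[gene] = "opposite directions"
        else:
            categories[gene] = "same direction"
    return categories
```

Genes in the "opposite directions" group are the ones where a gene-level estimate is most likely to hide isoform-level changes, since increases and decreases can offset each other when summed.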
Figure 4. RT-qPCR correlations. (A) The transcripts included in RT-qPCR analysis, according to their average expression estimates from the RNA-seq and MMBGX exon-array tumor results. (B) The transcripts included in RT-qPCR analysis, according to their average fold change estimates relative to normal brain from the RNA-seq and MMBGX exon-array results. (C) The Spearman correlations and number of shared, resolved transcripts between the various programs tested and the RT-qPCR estimates.