MAJIQ-SPEL: web-tool to interrogate classical and complex splicing variations from RNA-Seq data.

Christopher J Green, Matthew R Gazzara, Yoseph Barash.

Abstract

SUMMARY: Analysis of RNA sequencing (RNA-Seq) data has highlighted the fact that most genes undergo alternative splicing (AS) and that these patterns are tightly regulated. Many of these events are complex, resulting in numerous possible isoforms that quickly become difficult to visualize, interpret and experimentally validate. To address these challenges we developed MAJIQ-SPEL, a web-tool that takes as input local splicing variations (LSVs) quantified from RNA-Seq data and provides users with visualization and quantification of the gene isoforms associated with them. Importantly, MAJIQ-SPEL is able to handle both classical (binary) and complex (non-binary) splicing variations. Using a matching primer design algorithm, it also suggests possible primers for experimental validation by RT-PCR and displays them, along with the protein domains affected by the LSV, on the UCSC Genome Browser for further downstream analysis.
AVAILABILITY AND IMPLEMENTATION: Program and code will be available at http://majiq.biociphers.org/majiq-spel.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2018        PMID: 28968636      PMCID: PMC7263396          DOI: 10.1093/bioinformatics/btx565

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Advances in RNA-Seq technology have led to improved detection and quantification of splicing variations through the use of short reads that span spliced junctions. The most commonly used AS analysis tools focus exclusively on classical, binary AS events (e.g. cassette exons, alternative 5′ or 3′ splice sites, intron retention). Recently, we formulated local splicing variations (LSVs), which capture both classical and complex splicing patterns (i.e. those involving three or more junctions). Briefly, LSVs can be thought of as splits in a gene’s splice graph, where exons are nodes and the splicing of pre-mRNA segments forms the edges. In this formulation, an LSV captures several optional (alternative) pre-mRNA segments that the spliceosome may splice to a reference exon upstream or downstream. Figure 1A illustrates such an LSV, with the reference exon marked in gray and several downstream alternative exons, along with the matching LSV edges, colored in red, blue and green. Such an LSV is considered complex because it involves more than two alternative junctions. Importantly, we found that over 30% of the splicing variations in the extensive human and mouse RNA-Seq experiments we interrogated are complex (Vaquero-Garcia et al., 2016).
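The splice-graph view of LSVs described above can be illustrated with a minimal sketch. It is not MAJIQ's implementation: the function name `find_lsvs`, the exon labels and the toy junction list are hypothetical, the labels "source" and "target" are just this sketch's names for a reference exon with alternative downstream or upstream partners, and each edge is simplified to an exon pair (so alternative splice sites between the same two exons are not distinguished).

```python
from collections import defaultdict


def find_lsvs(junctions):
    """Return LSVs in a splice graph given as (upstream_exon, downstream_exon) edges."""
    outgoing = defaultdict(set)  # reference exon -> exons it can splice to downstream
    incoming = defaultdict(set)  # reference exon -> exons that can splice into it
    for upstream, downstream in junctions:
        outgoing[upstream].add(downstream)
        incoming[downstream].add(upstream)

    lsvs = []
    for kind, table in (("source", outgoing), ("target", incoming)):
        for reference in sorted(table):
            partners = sorted(table[reference])
            if len(partners) < 2:
                continue  # a single edge is not a split, so it is not an LSV
            lsvs.append({
                "reference": reference,
                "kind": kind,
                "partners": partners,
                # three or more alternative junctions: complex (non-binary)
                "complex": len(partners) >= 3,
            })
    return lsvs


# Hypothetical toy gene with four exons (not the Clta event from Figure 1).
toy_junctions = [("E1", "E2"), ("E1", "E3"), ("E1", "E4"), ("E2", "E3"), ("E3", "E4")]
for lsv in find_lsvs(toy_junctions):
    print(lsv)
```

In this toy graph, E1 is the reference exon of a complex LSV with three downstream edges, while E3 and E4 each anchor a binary LSV formed by two incoming edges.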
Fig. 1

(A) Splice graph representation of an LSV within Clta from mouse cerebellum (top) and adrenal gland (bottom), with reads detected from RNA-Seq data displayed above each junction. Junctions quantified directly within the LSV are colored. (B) Primer table (bottom) that suggests possible forward (left) and reverse (right) primers. Additional information for each primer can be displayed by clicking the ‘i’ icon, as shown below the black cursor. Various primer filters can be applied (top) for additional, on-the-fly filtering. (C) UCSC Genome Browser snapshot with custom tracks produced by MAJIQ-SPEL, as labeled. SPEL opens these when the Genome Browser logo shown on the left is clicked. (D) Isoform table that displays PSI (Ψ) quantifications (left) and possible isoforms associated with each LSV edge (right). Note that, as illustrated here, complex LSVs may have a single PSI capturing multiple isoforms, and that PSI captures the fraction of a splicing event (edge), not necessarily the fraction of each colored exon. Nucleotide sizes correspond to products produced using the primers selected in (B). (E) Representative RT-PCR validation of predicted product sizes and quantification using the primers selected in (B) on total RNA from mouse cerebellum (left) and adrenal gland (right).

The pervasiveness of complex splicing variations suggests that accurate interpretation of the underlying isoforms is crucial for experimentally interrogating and understanding the consequences of these splicing changes. We therefore developed MAJIQ and VOILA (http://majiq.biociphers.org) to define, quantify and visualize LSVs (Vaquero-Garcia et al., 2016). LSV visualization is based on segments of splice graphs, as shown in Figure 1A, while quantification is based on PSI (Percent Selected Index, Ψ), which captures the marginal fraction of each LSV edge (i.e. the fraction of isoforms that utilize this splicing junction). Similarly, changes between experimental conditions are measured by dPSI (ΔΨ).
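To make the PSI and dPSI definitions concrete, the sketch below computes naive point estimates from junction-spanning read counts. This is only an illustration of the "marginal fraction of each LSV edge" idea: MAJIQ's actual quantification applies normalization and a statistical model that are not reproduced here. The function names, the read counts and the sign convention (second condition minus first) are assumptions made for this example.

```python
def naive_psi(read_counts):
    """Fraction of junction-spanning reads supporting each LSV edge."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads span any junction in this LSV")
    return {edge: count / total for edge, count in read_counts.items()}


def naive_dpsi(counts_a, counts_b):
    """Per-edge change in naive PSI, computed as condition B minus condition A."""
    psi_a = naive_psi(counts_a)
    psi_b = naive_psi(counts_b)
    return {edge: psi_b[edge] - psi_a[edge] for edge in psi_a}


# Hypothetical read counts for a three-edge (complex) LSV in two tissues.
tissue_a = {"red": 60, "blue": 30, "green": 10}
tissue_b = {"red": 20, "blue": 30, "green": 50}

for edge, delta in naive_dpsi(tissue_a, tissue_b).items():
    print(edge, round(delta, 3))
```

Because the PSI values of one LSV sum to one, the dPSI values across its edges sum to zero: a gain on one edge is always offset by losses on the others.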
However, no current tool offers a user-friendly interface to connect LSVs, whether simple or complex, to the underlying known gene isoforms and affected protein domains. There is also a clear need for automated design and visualization of potential primers that flank an LSV for experimental validation via RT-PCR, the gold standard in the field. Specifically, previous work only allows for the design of a single primer pair and focuses on classical, binary AS events (Tokheim et al., 2014).

2 Results

We developed the web-tool MAJIQ-SPEL (MAJIQ for Sampling Primers and Evaluating LSVs; SPEL for short) to aid in the visualization, interpretation and experimental validation of both classical and complex splicing variations. Typically, MAJIQ and its visualization package VOILA (Vaquero-Garcia et al., 2016) are run on datasets ranging from just a few to hundreds or thousands of RNA-Seq samples to detect LSVs of interest. SPEL can then analyze LSVs of interest from such large runs, which quantify either PSI (a single experimental group) or delta PSI (splicing changes between two experimental groups). SPEL is implemented on a Galaxy web server (Afgan et al., 2016) and takes as input the output of VOILA. Specifically, users can now click a button to copy a splice graph and LSV quantification of interest, then paste it into SPEL’s Galaxy input form and run the analysis. SPEL is intended to be used primarily as a Galaxy web-tool, but we also made a stand-alone version available. The stand-alone version is light on memory and CPU, taking about 0.5 s and 24 MB of memory per job on a standard laptop.
MAJIQ-SPEL output contains several components, which we highlight in Figure 1 using a complex LSV within Clta, generated by comparing RNA-Seq from mouse cerebellum and adrenal gland (Zhang et al., 2014). First, colorized representations of the LSV are displayed with junction-spanning read counts for each junction quantified directly in the LSV (colored arcs in Fig. 1A). Also shown are counts for junction-spanning reads that occur within the boundaries of the event but are not part of the quantified LSV (dashed grey arcs). This visualization allows for quick interpretation of which paths are commonly utilized in each sample.
We note that the ratio of the colored read counts usually corresponds approximately to the expected PSI (E[Ψ]) but may deviate from it because of various normalization factors applied during quantification (GC content, stack removal, etc.). Second, SPEL produces a table of putative forward and reverse primers for the 5′- and 3′-most exons within the LSV (Fig. 1B). The primers are optimized for validating the given LSV via low-cycle RT-PCR, based on the experimental protocols and primer design factors described in (Smith and Lynch, 2014). In brief, to allow for a stringent RT-PCR assay, each primer must have a minimal melting temperature (Tm) of 76 °C by the Marmur formula, a GC content of between 50 and 60%, and between 2 and 4 G or C nucleotides at the 3′ end. Additionally, the selected primer pair should produce expected products within a certain size range to allow visualization via gel electrophoresis and to reduce bias during reverse transcription (Smith and Lynch, 2014). Importantly, design considerations such as minimum and maximum primer length, product length, GC content, Tm and Tm estimation method can be adjusted under ‘Advanced Options’ in the submission form. All primers that meet these criteria are displayed so that users can sample them and select a pair that best meets their experimental needs. For ease of use, the primer table is searchable and key summary information for each primer can also be displayed. Additionally, a number of filters can be applied to the primer table to further reduce the number of primers shown, on the fly, without re-executing SPEL (Fig. 1B). MAJIQ-SPEL also offers UCSC Genome Browser (Kent et al., 2002) connectivity. Clicking the browser’s logo brings up custom tracks that display the exons, junctions and locations of putative primers to aid in the selection of primers for validation (Fig. 1C). These tracks also include known isoforms and annotated protein domains [Pfam (Finn et al., 2016)], which can aid in examining the functions of the alternative isoforms produced.
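The default primer criteria described above (minimal Tm, GC content and G/C count at the 3′ end) can be sketched as a simple filter. This is an illustration rather than SPEL's code, and it rests on two labeled assumptions: the "Marmur formula" is taken to be the common rule of thumb Tm = 2(A+T) + 4(G+C), and the G/C count at the 3′ end is taken over the last five bases. The function names and test sequences are hypothetical, and only the forward strand is scanned.

```python
def marmur_tm(primer):
    """Tm in degrees C, assuming Tm = 2*(A+T) + 4*(G+C)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def gc_fraction(primer):
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)


def passes_filters(primer, min_tm=76, gc_range=(0.50, 0.60),
                   clamp_window=5, clamp_range=(2, 4)):
    """Apply the three default criteria; clamp_window=5 is an assumption."""
    tail = primer.upper()[-clamp_window:]
    clamp = tail.count("G") + tail.count("C")
    return (marmur_tm(primer) >= min_tm
            and gc_range[0] <= gc_fraction(primer) <= gc_range[1]
            and clamp_range[0] <= clamp <= clamp_range[1])


def candidate_primers(exon_seq, min_len=18, max_len=30, **filters):
    """Slide windows over an exon and keep (start, sequence) pairs that pass."""
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(exon_seq) - length + 1):
            window = exon_seq[start:start + length]
            if passes_filters(window, **filters):
                hits.append((start, window))
    return hits


# Hypothetical 25-mer: 12 A/T and 13 G/C, so Tm = 24 + 52 = 76 under this formula.
example = "ATGC" * 5 + "AGCTG"
print(marmur_tm(example), round(gc_fraction(example), 2), passes_filters(example))
```

Exposing the thresholds as keyword arguments mirrors the adjustable settings mentioned above, although the parameter names here are invented for the example.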
Finally, MAJIQ-SPEL traverses all possible paths within the splice graph contained in the LSV region, based on observed and annotated junctions, to create the isoform segments table that links the MAJIQ PSI quantification to the associated isoform(s) (Fig. 1D). Importantly, once the user selects a forward and reverse primer pair, this table updates to display the expected product size for each isoform segment for validation. Additionally, once a primer pair is chosen, the user can run In-Silico PCR through the UCSC Genome Browser (Kent et al., 2002) to further validate and check the specificity of the chosen pair. In the example shown, RT-PCR performed using primers generated by MAJIQ-SPEL demonstrates accurate prediction of both the sizes of all four products and their quantification in cerebellum and adrenal gland (Fig. 1E). Beyond handling classical or complex splicing variations, MAJIQ-SPEL also offers researchers fast and accurate primer design for de novo splicing variations not in the annotated transcriptome, where experimental validation is crucial. Such a case is shown for an event in Fubp3 (Supplementary Fig. S1). Since this LSV involves novel exon skipping, it will likely not be captured by other splicing quantification and visualization tools that rely only on the annotation database. The Fubp3 and Clta splicing variations shown here also highlight how MAJIQ-SPEL can aid in functional analysis of LSVs. The combined UCSC Genome Browser tracks show that the alternative exons overlap annotated protein domains, suggesting a functional effect. The length of the cassette exon in Fubp3 is not a multiple of three, suggesting a frameshift, and the Browser tracks revealed that skipping introduces a premature termination codon (PTC). Future extensions of this work will aim to further integrate these and other functional analyses into MAJIQ-SPEL.
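The path traversal and product-size calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not SPEL's implementation: the splice graph is treated as acyclic with exons ordered 5′ to 3′, the forward primer lies in the first exon and the reverse primer in the last exon of the region, and all exon names, lengths and primer offsets are hypothetical.

```python
from collections import defaultdict


def enumerate_paths(edges, start, end):
    """List every exon path from start to end in an acyclic splice graph."""
    graph = defaultdict(list)
    for upstream, downstream in edges:
        graph[upstream].append(downstream)

    paths = []

    def walk(node, path):
        if node == end:
            paths.append(path)
            return
        for nxt in sorted(graph[node]):
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


def product_size(path, exon_len, fwd_offset, rev_offset):
    """Expected RT-PCR product size for one path (start and end must differ).

    fwd_offset: 0-based position in the first exon where the forward primer starts.
    rev_offset: number of bases of the last exon included in the product.
    """
    first, *middle, last = path
    return (exon_len[first] - fwd_offset) + sum(exon_len[e] for e in middle) + rev_offset


# Hypothetical LSV region with two alternative internal exons.
toy_exon_len = {"E1": 100, "E2": 54, "E3": 36, "E4": 120}
toy_edges = [("E1", "E2"), ("E1", "E3"), ("E1", "E4"),
             ("E2", "E3"), ("E2", "E4"), ("E3", "E4")]

for path in enumerate_paths(toy_edges, "E1", "E4"):
    print("-".join(path), product_size(path, toy_exon_len, fwd_offset=40, rev_offset=50))
```

Each path corresponds to one row of an isoform segments table, and changing the primer offsets shifts every product size by the same amount, so the size differences between isoforms depend only on which internal exons are included.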

Funding

This work has been supported in part by the Penn Institute for Biomedical Informatics Pilot Grant and R01 AG046544 to YB.
Conflict of Interest: none declared.
References

1.  The human genome browser at UCSC.

Authors:  W James Kent; Charles W Sugnet; Terrence S Furey; Krishna M Roskin; Tom H Pringle; Alan M Zahler; David Haussler
Journal:  Genome Res       Date:  2002-06       Impact factor: 9.043

2.  Cell-based splicing of minigenes.

Authors:  Sarah A Smith; Kristen W Lynch
Journal:  Methods Mol Biol       Date:  2014

3.  A circadian gene expression atlas in mammals: implications for biology and medicine.

Authors:  Ray Zhang; Nicholas F Lahens; Heather I Ballance; Michael E Hughes; John B Hogenesch
Journal:  Proc Natl Acad Sci U S A       Date:  2014-10-27       Impact factor: 11.205

4.  The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update.

Authors:  Enis Afgan; Dannon Baker; Marius van den Beek; Daniel Blankenberg; Dave Bouvier; Martin Čech; John Chilton; Dave Clements; Nate Coraor; Carl Eberhard; Björn Grüning; Aysam Guerler; Jennifer Hillman-Jackson; Greg Von Kuster; Eric Rasche; Nicola Soranzo; Nitesh Turaga; James Taylor; Anton Nekrutenko; Jeremy Goecks
Journal:  Nucleic Acids Res       Date:  2016-05-02       Impact factor: 16.971

5.  PrimerSeq: Design and visualization of RT-PCR primers for alternative splicing using RNA-seq data.

Authors:  Collin Tokheim; Juw Won Park; Yi Xing
Journal:  Genomics Proteomics Bioinformatics       Date:  2014-04-18       Impact factor: 7.691

6.  A new view of transcriptome complexity and regulation through the lens of local splicing variations.

Authors:  Jorge Vaquero-Garcia; Alejandro Barrera; Matthew R Gazzara; Juan González-Vallinas; Nicholas F Lahens; John B Hogenesch; Kristen W Lynch; Yoseph Barash
Journal:  Elife       Date:  2016-02-01       Impact factor: 8.140

7.  The Pfam protein families database: towards a more sustainable future.

Authors:  Robert D Finn; Penelope Coggill; Ruth Y Eberhardt; Sean R Eddy; Jaina Mistry; Alex L Mitchell; Simon C Potter; Marco Punta; Matloob Qureshi; Amaia Sangrador-Vegas; Gustavo A Salazar; John Tate; Alex Bateman
Journal:  Nucleic Acids Res       Date:  2015-12-15       Impact factor: 16.971
