
Tracks through the genome to physiological events.

Diane Lipscombe1, Jen Q Pan2, Stephanie Schorge3.   

Abstract

NEW FINDINGS: What is the topic of this review? We discuss tools available to access genome-wide data sets that harbour cell-specific, brain region-specific and tissue-specific information on exon usage for several species, including humans. In this Review, we demonstrate how to access this information in genome databases and its enormous value to physiology. What advances does it highlight? The sheer scale of protein diversity that is possible from complex genes, including those that encode voltage-gated ion channels, is vast, which makes it difficult to select the most physiologically relevant splice isoform to study. This choice is nonetheless critical for a complete understanding of protein function in the most physiologically relevant context.

Many proteins of great interest to physiologists and neuroscientists are structurally complex and located in specialized subcellular domains, such as neuronal synapses and transverse tubules of muscle. Genes that encode these critical signalling molecules (receptors, ion channels, transporters, enzymes, cell adhesion molecules, cell-cell interaction proteins and cytoskeletal proteins) are similarly complex. Typically, these genes are large; human Dystrophin (DMD) encodes a cytoskeletal protein of muscle and it is the largest naturally occurring gene at a staggering 2.3 Mb. Large genes contain many non-coding introns and coding exons; human Titin (TTN), which encodes a protein essential for the assembly and functioning of vertebrate striated muscles, has over 350 exons and consequently has an enormous capacity to generate different forms of Titin mRNAs that have unique exon combinations. Functional and pharmacological differences among protein isoforms originating from the same gene may be subtle but nonetheless of critical physiological significance. Standard functional, immunological and pharmacological approaches, so useful for characterizing proteins encoded by different genes, typically fail to discriminate among splice isoforms of individual genes. 
Tools are now available to access genome-wide data sets that harbour cell-specific, brain region-specific and tissue-specific information on exon usage for several species, including humans. In this Review, we demonstrate how to access this information in genome databases and its enormous value to physiology.
© 2015 The Authors. Experimental Physiology published by John Wiley & Sons Ltd on behalf of The Physiological Society.

Year:  2015        PMID: 26053180      PMCID: PMC5008151          DOI: 10.1113/EP085129

Source DB:  PubMed          Journal:  Exp Physiol        ISSN: 0958-0670            Impact factor:   2.969


Introduction

The combination of cell‐specific use of alternative promoters, alternative polyadenylation sites and cell‐specific alternative splicing of pre‐mRNA in the cell nucleus generates a wide range of mRNAs that may have different stabilities or encode functionally different proteins. During transcription and pre‐mRNA processing, the spliceosome orchestrates intron removal and exon splicing. Appropriate caps (5′‐7‐methylguanylate) and tails [3′ poly‐(A)] are added to form mature mRNAs. Cell‐specific factors recruit transcriptional machinery in the cell nucleus to control the choice of alternative promoters and first exons, promote recognition or skipping of alternatively spliced exons by the spliceosome and control the choice of alternative polyadenylation sites. The large number of RNA variants that can be derived from a single gene is a powerful mechanism to support functional diversity among proteins without increasing genetic load. Cell‐specific use of alternative exons, alternative promoters, RNA editing and alternative sites of polyadenylation in 3′ untranslated regions give rise to multiple mRNA isoforms according to cell and tissue type, developmental stage, activity level and disease state. These forms of RNA processing are no longer considered exceptional, but rather a property that is characteristic of the vast majority of multi‐exon eukaryotic genes. Certain genes are subject to extensive alternative splicing, including Drosophila Dscam and mammalian neurexin (Nrxn) genes that have the potential to generate thousands of functionally different protein isoforms (Schmucker et al. 2000; Aoto et al. 2013; Lah et al. 2014). In our discussions below, we focus on examples of tissue‐specific regulation of exon choice that expand the proteome. However, tissue‐specific control of certain disruptive alternative exons is also a cellular mechanism to turn on or off the expression of proteins downstream of transcriptional activity (Lareau & Brenner, 2015). 
Insertion or skipping of disruptive alternative exons in mRNAs shifts the open reading frame, introduces premature stop codons and results in mRNA degradation by nonsense‐mediated decay (O'Brien et al. 2012; Eom et al. 2013; Yan et al. 2015; Lareau & Brenner, 2015). The functional importance of alternative splicing has already been shown for a wide range of physiological processes, including the following: acquisition of cell phenotype during early differentiation; correct axonal targeting and synaptogenesis during brain development; sex determination in insects; muscle fibre type specification; synaptic scaling; cell signalling; pluripotent stem cell renewal; and, possibly, speciation (Aoto et al. 2013; Han et al. 2013; Jepson et al. 2013; Lipscombe et al. 2013 a; Venables et al. 2013; Lah et al. 2014; Raj et al. 2014; Lu et al. 2014). Cell‐specific pre‐mRNA processing greatly expands the coding capacity of genes, and the resulting pool of mRNA splice isoforms in specific tissues carries a richer source of information about the function and the state of the cell than the genotype alone. The potential to produce functional diversity by regulated alternative splicing is now well appreciated, and the molecular mechanisms that regulate cell‐specific and developmental stage‐specific alternative splicing are under active investigation. Our understanding of where, when, why and how mRNA splice isoforms are expressed is currently incomplete, but the landscape is changing rapidly. Whole‐transcriptome analyses of data sets derived from massively parallel sequencing are added to publicly available databases with high frequency, and contain sequences for expressed RNAs for an expanding number of human and mouse tissues, cells and diseases (Voineagu et al. 2011; Li et al. 2014; Martin et al. 2014; Zhang et al. 2014). 
The sheer scale of protein diversity that is possible from complex genes, including those that encode voltage‐gated ion channels, can be daunting when it comes to selecting the most physiologically relevant protein splice isoform to study. Nonetheless, this choice is likely to be critical for a complete understanding of protein function – including subcellular targeting, activation, modulation and pharmacological sensitivity – in the most physiologically relevant context.

Identifying splice isoforms in specific cells and tissues

Identifying and manipulating protein isoforms in specific tissues is technically challenging because of their structural, pharmacological and functional similarities, the range of tissues and cell types where they are expressed, the short temporal expression pattern of certain splice isoforms and the short length of some alternative exons, which may encode only one or two amino acids. Where traditional physiological and pharmacological tools cannot meet the challenge, large‐scale, deep‐sequencing methods now give unprecedented details about the mRNAs expressed in a variety of human tissues. Critically, these methods now provide comprehensive sequence coverage for complex genes, such as ion channels and other membrane proteins whose mRNAs may be present at relatively low levels. Using available visualization tools, it is now easier to gain substantial information on the transcribed sequences of genes of interest, including the major mRNA isoforms expressed in various mammalian tissues, the location of major splice junctions and the location and type of RNA and DNA regulatory elements. Such information can provide a useful starting point to develop hypotheses about the role of different protein isoforms in specific cells and tissues for direct testing in vitro and in vivo. We demonstrate the utility of these genome‐wide deep‐sequencing data sets by illustrating how to visualize alternatively spliced exons of the Cacna1 genes, which encode the functional core α1 subunits of voltage‐gated calcium (CaV) channels in mammals.

Site of alternative splicing in the Cacna1b gene

Two decades ago, we set out to identify the molecular correlates of different CaV2.2 channel activities (N‐type currents) observed in patch‐clamp recordings from neurons. The CaV2.2 channels support many different cell functions, including excitation–secretion coupling at many different synapses in the nervous system, and are encoded by the Cacna1b gene. At that time, genomic sequence information was limited. We, and others, therefore relied heavily on time‐intensive targeted RT‐PCR amplification across exon junctions, based on the available Cacna1b sequence, and using mRNA isolated from microdissected brain regions, different peripheral ganglia and different tissues. Based on the CaV2.2 mRNA isoforms present, we identified four sites of cell‐specific alternative splicing in Cacna1b, as well as alternative sites of polyadenylation in the 3′ untranslated region that generate CaV2.2 mRNAs with different stabilities in neurons (Schorge et al. 1999). In subsequent studies, we showed that cell‐specific control of alternative pre‐mRNA splicing positions CaV2.2 splice isoforms with unique biophysical properties and sensitivities to modulation by G‐protein‐coupled receptors in specific subpopulations of neurons and brain regions (Lipscombe et al. 2013 b). Today these same CaV2.2 mRNA isoforms can be identified in minutes by viewing CaV2.2 RNA‐seq data alignments and the conservation track of Cacna1b homologues across a range of vertebrates. This type of integrated visualization of data sets is nicely illustrated using the UCSC Genome Browser (Kent et al. 2002) for four cell type‐specific alternatively spliced exons in Cacna1b (Fig. 1A–D).
Figure 1

Visualizing various tracks using UCSC Genome Browser reveals the location of four alternatively spliced exons of mouse Cacna1b (http://genome.ucsc.edu/cgi-bin/hgGateway; Kent et al. 2002)

The genomic regions flanking alternatively spliced exons (18a, 24a, 31a and 37a/37b) of Cacna1b are shown in A–D. In each panel, scale bars indicate the size of the genomic region shown. Horizontal layers I–V represent different display options, as follows: layer I, possible splice patterns in the region of Cacna1b; and layers II–V, tracks of UCSC Genes (blue), Ensembl Genes (maroon), and two different display options for Vertebrate Conservation (grey and green). Exons are shown as rectangles (layer I), and the direction of transcription is indicated by small arrows (layer III). The conservation track (layers IV and V) displays the phastCons scores (from 0 to 1) calculated based on the genome sequence alignment of 60 vertebrates. In each panel, an alternatively spliced exon is captured in at least one Ensembl transcript. Not all alternatively spliced exons appear in mm10 UCSC Genes (or RefSeq Genes track, not shown), but they align perfectly with peaks of highest conservation in the Vertebrate Conservation track. The following steps will recreate the display shown. (i) In the UCSC Genome Browser, choose ‘Genomes’ top left, group ‘mammal’, genome ‘mouse’, assembly ‘GRCm38/mm10’, search term ‘Cacna1b’, location chr2:24603889-24763152, submit. (ii) Scroll down to the ‘Genes and Gene Predictions’ header category, set the ‘Ensembl Genes’ tab to ‘full’ and the ‘UCSC Genes’ tab to ‘dense’ and set all other tabs to ‘hide’. (iii) Scroll down to the ‘Comparative Genomics’ header, set ‘Conservation’ to display ‘full’ and click on the ‘Conservation’ link; here you can select subtracks by clade, select ‘phastCons’ scores in ‘full’ and ‘hide’ other features, further restrict the phastCons scores to ‘60 Vert. Cons’ in the subtrack lists, then ‘submit’. The complete Cacna1b gene is displayed. A–D are zoomed in to resolve regions that contain alternatively spliced exons. 
To recreate each panel, type in the following locations: chr2:24678405-24686581 (18a region; A); chr2:24653647-24657925 (24a region; B); chr2:24638102-24649580 (31a region; C); and chr2:24621473-24632972 (37 region; D). Co‐ordinates of exons are as follows: chr2:24682976-24683038 (18a); chr2:24656711-24656722 (24a); chr2:24642853-24642858 (31a); chr2:24626823-24626919 (37a); and chr2:24625238-24625334 (37b). Alternatively spliced exons are flanked by the consensus dinucleotide splice junction motifs AG and GT. Functions and tissue‐specific distributions of these alternative exons have been described by our laboratory (Lipscombe et al. 2013 a).
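The exon co‐ordinates listed in the legend can be checked with a few lines of code. The following is a minimal sketch, not part of the original analysis: it assumes that the co‐ordinates are 1‐based and inclusive, as displayed in the UCSC Genome Browser position box, and the function names are illustrative. It reports the length of each alternative exon and whether that length is a multiple of three. Exons 37a and 37b are mutually exclusive and of equal length, so exchanging one for the other leaves the reading frame unchanged even though neither length is divisible by three.

```python
import re

# Exon co-ordinates copied from the legend to Fig. 1 (mouse GRCm38/mm10).
EXONS = {
    "18a": "chr2:24682976-24683038",
    "24a": "chr2:24656711-24656722",
    "31a": "chr2:24642853-24642858",
    "37a": "chr2:24626823-24626919",
    "37b": "chr2:24625238-24625334",
}


def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end).

    Accepts an ASCII hyphen or the typographic hyphen (U+2010) that often
    replaces it when co-ordinates are copied from a formatted article.
    """
    match = re.fullmatch(r"(\w+):(\d+)[-\u2010](\d+)", region.replace(",", ""))
    if match is None:
        raise ValueError("unrecognised region: %r" % region)
    return match.group(1), int(match.group(2)), int(match.group(3))


def exon_length(region):
    """Length in nucleotides, assuming 1-based inclusive co-ordinates."""
    _, start, end = parse_region(region)
    return end - start + 1


for name, region in EXONS.items():
    length = exon_length(region)
    frame = "multiple of 3" if length % 3 == 0 else "not a multiple of 3"
    print(name, length, "nt,", frame)
```

Under these assumptions, exon 31a comes out at six nucleotides, in agreement with the description of this exon in the text.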

Tools to visualize tissue‐specific isoforms from genome‐wide expression data sets

Massively parallel sequencing technology (next‐generation sequencing) has reduced the cost of DNA sequencing dramatically (Lander, 2011), and many genome‐wide expression data sets have been produced for a range of tissues and species and deposited in public databases. At the same time, interactive informatics tools have been developed to help researchers access, align and visualize any genome‐wide information that is annotated by genome co‐ordinate positions, for species whose genomes have been sequenced and for which a reference genome assembly and co‐ordinates have been established. The UCSC Genome Browser and the Integrative Genomics Viewer (IGV) visualization tools offer different ways to explore large, heterogeneous, integrated genomic data sets from different species. While the UCSC Genome Browser is a Web application that offers the pre‐integration of many data sets (Kent et al. 2002; http://genome.ucsc.edu/), IGV is optimized to provide high‐performance data visualization as a stand‐alone software application (IGV, http://www.broadinstitute.org/igv/; Thorvaldsdóttir et al. 2013, 2015). Both tools allow users to load custom data sets for visualization and both provide access to the Encyclopedia of DNA Elements (ENCODE; Birney et al. 2007). The Genotype‐Tissue Expression project (GTEx; http://www.gtexportal.org/; GTEx, 2013) is a resource that can be used to visualize human gene expression and regulation for multiple tissues from a database collected and analysed within the GTEx project. The type of information highlighted here represents only a glimpse of what can be accessed from sequence databases using these tools; new sequence information is added regularly and new tools are being developed constantly.

UCSC Genome Browser

In the UCSC Genome Browser, the user can zoom and scroll over genomic locations on the chromosomes and display genome‐wide annotation data sets (‘tracks’), including mRNA alignments, gene‐expression data and epigenetic regulatory markers, viewed beneath genome co‐ordinate positions (Kent et al. 2002). The annotation tracks allow rapid vertical integration to visualize different types of information aligned to the same genome co‐ordinates. The following description is best read while viewing the UCSC Genome Browser (see legend to Fig. 1). The UCSC Genome Browser contains > 10 major categories of genome annotations. For example, ‘Expression and Regulation’ is particularly useful for viewing cell‐ and tissue‐specific events. Under each category, various tracks are available for viewing, and each track is derived from specially formatted files that can be read by the Genome Browser. These large genome‐wide data sets have been preloaded at the UCSC server and can be configured by the user to display individual tracks of interest at several different resolutions (dense, squish, pack and full). For example, the ‘Burge‐RNA seq’ track located under the ‘Expression and Regulation’ tab contains RNA‐seq data from five human cell lines and nine human tissues (Wang et al. 2008) that can be displayed by selecting a view option (dense, squish, pack and full). In Fig. 1, we show an output from the UCSC Genome Browser to demonstrate its value for visualizing tissue‐specific alternative exons in Cacna1b. Previously characterized alternatively spliced exons 18a, 24a, 31a, 37a and 37b in Cacna1b are captured by visualizing UCSC Genes and Ensembl Genes tracks aligned to the 60‐vertebrate Conservation track (Fig. 1). Notably, all five alternative exons are found in all vertebrate Cacna1b genes and captured in tracks such as ‘Other RefSeq’ (non‐human), ‘Human mRNAs’ and ‘Spliced ESTs’, but none is documented in the human reference RNA (human RefSeq) for the human genome (data not shown). 
Even exon 31a, which is only six nucleotides long, located within a 10–12 kb intron and expressed at very low levels in mammalian brains, is captured in at least one Ensembl transcript (Fig. 1). In Fig. 2, we show the 5′ end of Cacna1c, the gene that encodes dihydropyridine‐sensitive CaV1.2 channels (high‐voltage‐activated L‐type currents), to illustrate tissue‐specific choice of promoters and first exons. CaV1.2 channels are expressed throughout the body, including the heart, smooth muscle, endocrine cells and neurons. They couple excitation to muscle contraction in heart and smooth muscle, to vesicle release in endocrine cells and to postsynaptic calcium‐mediated synaptic plasticity and gene expression in neurons. Consistent with their very different cellular roles, unique CaV1.2 channel isoforms and exons dominate in different types of cells and tissues. The tissue‐specific choice of alternative promoters of mouse Cacna1c can be inferred by visualizing tissue‐specific histone modifications, displayed in the ChIP‐seq track ‘ENCODE/Ludwig Institute for Cancer Research’ (LICR; ENCODE, 2012), in parallel with RNA‐seq data from different mouse tissues (Fig. 2). Here, we show the H3K4me3 subtrack, which is predictive of active promoters, but this track can also be used to display an extensive number of cis‐regulatory elements genome wide for a range of mouse tissues and cell lines (ENCODE, 2012).
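The browser views described in this section can also be opened directly from a URL rather than through the menus. The sketch below builds such a link, assuming the `hgTracks` entry point with `db` and `position` parameters; the optional track settings use the internal track names `ensGene` (Ensembl Genes) and `knownGene` (UCSC Genes), which are assumptions here and should be checked against the track configuration page for the assembly in use. The helper name `ucsc_url` is hypothetical.

```python
from urllib.parse import urlencode

# Entry point of the browser display (assumed; verify before relying on it).
UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def ucsc_url(db, position, tracks=None):
    """Return a browser link for an assembly and a 'chrom:start-end' position.

    tracks is an optional mapping of internal track name -> visibility
    ('hide', 'dense', 'squish', 'pack' or 'full').
    """
    params = {"db": db, "position": position}
    params.update(tracks or {})
    # Keep ':' readable in the position; other characters are escaped.
    return UCSC_HGTRACKS + "?" + urlencode(params, safe=":")


# Whole Cacna1b locus on mouse GRCm38/mm10, as in the legend to Fig. 1.
print(ucsc_url("mm10", "chr2:24603889-24763152",
               {"ensGene": "full", "knownGene": "dense"}))

# 5' end of Cacna1c on mouse NCBI37/mm9, as in the legend to Fig. 2.
print(ucsc_url("mm9", "chr6:118534231-119182730"))
```

Links of this kind are convenient for sharing a specific view with colleagues, but track names differ between assemblies, so a link that works for mm10 may need adjusting for mm9 or hg19.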
Figure 2

The use of different first exons in mouse Cacna1c

The mouse July 2007 (NCBI37/mm9) assembly is used for display, with genomic co‐ordinates chr6:118534231-119182730. Genomic location and scale bar indicate the position and coverage of the genome. Each line represents a reference sequence aligned to the mouse genome (mm9) from the UCSC Genes track (blue). Tissue‐specific histone modifications are displayed in the ChIP‐seq subtrack from the ENCODE/Ludwig Institute for Cancer Research (LICR; ENCODE, 2012) in parallel with RNA‐seq data from mouse heart (8 weeks) and mouse embryonic whole brain (day 14.5). Exon 1a is used in heart and is located > 80 kb upstream of exon 1b, which is used in both heart and brain. Three H3K4me3 peaks are shown that correspond to two different transcription start sites for Cacna1c and a single transcription start site for Dcp1b. The Dcp1b gene is transcribed on the reverse strand.

Integrative Genomics Viewer

The IGV is a stand‐alone application that is optimized for high‐performance data visualization and offers functionality such as displaying splicing and exon junctions quantitatively (Robinson et al. 2011; Thorvaldsdóttir et al. 2013, 2015; Katz et al. 2015). The IGV comes with a set of highly used publicly available data sets, including ENCODE. Custom genome annotation data sets can be loaded from a server or local files. Many large‐scale RNA‐seq projects have produced mRNA expression data sets across multiple tissues and multiple cell lines. For example, human BodyMap 2.0 data from Illumina® can be downloaded, and it contains deep‐sequencing alignments and coverage for RNA‐seq derived from 16 human tissues (http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-513/). Human BodyMap 2.0 data can be visualized within IGV. Sashimi plots and junction box displays provide quantitative visualizations of tissue‐specific expression patterns of alternative splice junctions, as shown in Fig. 3.
Figure 3

Visualization of tissue‐specific use of alternative promoters in human CACNA1C (Thorvaldsdóttir et al. 2013)

The 2009 human (GRCh37/hg19) assembly was used in this figure to visualize the two alternative promoters in CACNA1C. The Junction Box option is turned on for quantitative visualization of exon–exon junctions from mRNA sequencing reads aligned to gene annotations. These plots show tissue‐specific usage of exon 1a (heart) and exon 1b (heart and brain). The thickness of the connections indicates the frequency of the reads that connect exon–exon junctions. The sense strand (red) and antisense strand (blue) are shown. Data were loaded from ‘Bodymap 2.0’ from IGV. The DCP1B gene is short and transcribed in the reverse direction. There are two promoters in CACNA1C, marked 1a and 1b.
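The thickness of each junction connection reflects the number of reads spanning that junction, so relative use of alternative first exons can be expressed as a simple fraction of junction reads. The sketch below illustrates the arithmetic only; the read counts are hypothetical values chosen to mirror the qualitative pattern in the figure (exon 1a in heart, exon 1b in heart and brain) and are not taken from BodyMap 2.0, and the function name is illustrative.

```python
def junction_usage(counts):
    """Convert junction read counts into fractions of the total.

    counts maps a junction label (for example '1a-2') to the number of
    reads that span that exon-exon junction.
    """
    total = sum(counts.values())
    if total == 0:
        # No reads cover any of the junctions; usage is undefined, report 0.
        return {label: 0.0 for label in counts}
    return {label: n / total for label, n in counts.items()}


# Hypothetical read counts for the two alternative first-exon junctions.
heart = {"1a-2": 90, "1b-2": 30}
brain = {"1a-2": 0, "1b-2": 45}

for tissue, counts in (("heart", heart), ("brain", brain)):
    usage = junction_usage(counts)
    print(tissue, {label: round(f, 2) for label, f in usage.items()})
```

Fractions of this kind are only as reliable as the read depth behind them; a junction supported by a handful of reads should be interpreted with caution.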

The GTEx portal

The GTEx portal is an atlas of gene expression and regulation across multiple human tissues (GTEx, 2013). Since 2012, 3797 tissues from 150 donors have been collected and genotyped, and there is a target of gathering data from 900 donors and ∼20,000 tissue samples. GTEx is especially rich in mRNA sequences from different brain regions (13 subregions), lending greater resolution to region‐specific expression patterns. Analysed data from GTEx are available to download using a simple user interface that generates an output showing both expression levels and frequency of exon usage across tissues and brain regions. In Fig. 4, we again use CACNA1C to illustrate how GTEx can be used to visualize tissue‐specific use of alternative promoters and exons in human heart and coronary artery smooth muscle. Alternative exons 1a and 8a dominate in CaV1.2 mRNAs expressed in heart, whereas alternative exons 1b and 8b dominate in CaV1.2 mRNAs of coronary artery (Fig. 4). All other CACNA1C exons marked in GTEx appear to be expressed at about the same frequency in these two tissues. BrainSpan is a complementary site that, although not specifically designed to identify alternative splicing, shows a heat map of gene expression levels in different regions of the human nervous system across development (http://www.brainspan.org/; BrainSpan, 2011). The BrainSpan project contains mRNA sequencing data from 42 brain specimens spanning pre‐ and postnatal developmental stages for both sexes. For example, the ‘Developmental transcriptome’ tab can be used to visualize expression levels of CACNA1 genes in different brain regions.
Figure 4

Tissue‐specific use of alternative exons in human CACNA1C (GTEx, 2013)

Using the GTEx portal, the major exons and patterns of alternative splicing are displayed. The blue rectangles represent exons and the red circles the splice options. The colour intensity represents the frequency of sequence reads for the specified tissue. Shown are analyses of RNA‐seq data from human coronary artery and heart ventricles. The majority of CaV1.2 expressed sequence in heart differs from that expressed in coronary artery at two mutually exclusive sites. In heart, exons 1a and 8a are expressed, whereas in coronary artery exons 1b and 8b are expressed.

Splicing factors identify cohorts of synergistically regulated exons across different genes

The new genome‐wide data sets also contain valuable information about correlated exon choice across multiple genes, a process that is essential for acquisition of cell phenotype and is controlled by the action of cell‐specific and tissue‐specific RNA binding proteins (Wang et al. 2008). Nuclear cell‐specific trans‐acting splicing factors bind pre‐mRNAs to promote or inhibit the recognition of certain alternatively spliced exons by the spliceosome during splicing. For example, the RBM20 RNA binding protein regulates alternative splicing of an exon in Titin pre‐mRNA that is essential for adaptive changes in cardiac ventricular filling (Hidalgo & Granzier, 2013). In neurons, Nova RNA binding proteins regulate exon selection during pre‐mRNA splicing of many genes essential for synaptic transmission (Zhang et al. 2010). RbFox RNA binding proteins are critical for exon selection of genes expressed in many tissues and cells, and they are essential for normal cell differentiation, cell survival and cell signalling (Yeo et al. 2009). Analyses of splicing factor binding to RNA, by cross‐linking immunoprecipitation combined with high‐throughput sequencing (CLIP‐HITS), have provided genome‐wide maps of protein binding sites in brain regions (Licatalosi et al. 2008; Lambert et al. 2014). The location of splicing factor binding, in some cases, is predictive of splicing factor action. For example, serine–arginine‐rich (SR) splicing factors Nova, RbFox and PTB act as repressors or enhancers of alternative exon inclusion in RNA in neurons depending on where their consensus binding motifs (cis elements) are relative to the target exon (upstream or downstream; Zhang et al. 2010). Genome‐wide mapping of splicing factor binding represents the starting framework to define a splicing code that may eventually predict certain cell‐specific splicing events across families of genes (Wang et al. 2008; Licatalosi et al. 2008; Yeo et al. 2009). 
Trans‐acting RNA binding proteins are the best studied of the cell‐specific splicing factors, but other molecules regulate cell‐specific splicing, including epigenetic modulators of DNA (Kornblihtt et al. 2013; Fukuda et al. 2013; Yan et al. 2015). For example, a ubiquitous DNA‐binding protein, CCCTC binding factor (CTCF) regulates cell‐specific splicing of alternative exons. In this case, CTCF binding promotes recognition of a weak splice site by the spliceosome, but its binding is inhibited by cell‐specific methylation within the CTCF binding motif (Shukla et al. 2011). These data emphasize that transcription and pre‐mRNA processing should not be viewed as sequential events but rather as processes that are strongly coupled temporally and mechanistically (Shukla & Oberdoerffer, 2012; Kornblihtt et al. 2013).
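Because the action of a splicing factor can depend on whether its binding motif lies upstream or downstream of the target exon, a first computational step is simply to locate candidate motifs relative to that exon. The sketch below is illustrative only: the genomic sequence is invented, the function name is hypothetical, and the motif TGCATG is used as an assumed DNA form of the RbFox consensus element. The code reports position only and does not assign enhancer or repressor activity, which depends on the factor and must be determined experimentally.

```python
import re


def motif_positions(seq, motif, exon_start, exon_end):
    """Locate motif occurrences and classify them relative to an exon.

    seq is the sense-strand sequence, exon_start/exon_end are 0-based,
    half-open indices of the target exon within seq. Overlapping matches
    are reported through a zero-width lookahead.
    """
    hits = []
    pattern = "(?=%s)" % re.escape(motif.upper())
    for match in re.finditer(pattern, seq.upper()):
        pos = match.start()
        if pos + len(motif) <= exon_start:
            where = "upstream"
        elif pos >= exon_end:
            where = "downstream"
        else:
            where = "overlaps exon"
        hits.append((pos, where))
    return hits


# Invented example: 10 nt of intron, a 9 nt exon, then 10 nt of intron.
upstream_intron = "AATGCATGCC"
exon = "GGATCCGGA"
downstream_intron = "TTTGCATGAA"
sequence = upstream_intron + exon + downstream_intron

exon_start = len(upstream_intron)
exon_end = exon_start + len(exon)

print(motif_positions(sequence, "TGCATG", exon_start, exon_end))
```

In practice, motif positions found in this way would be compared with CLIP‐derived binding maps, because the presence of a consensus sequence alone does not show that the factor binds at that site in a given cell type.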

Developmental changes in ion channel splicing

In the examples discussed in the foregoing subsections, we highlight isoforms of voltage‐gated calcium ion channel genes (Cacna1) that vary across different tissues and brain regions, but splicing of certain alternatively spliced exons is also strongly dependent on development. In this section, we turn to studies of closely related voltage‐gated sodium ion channel genes (SCN) to highlight examples of developmentally regulated alternatively spliced exons. Splice isoforms of several neuronal sodium channels have been described, but perhaps most interesting is a splicing event that is highly conserved across several human SCN genes [SCN1A (NaV1.1), SCN2A (NaV1.2), SCN3A (NaV1.3) and SCN8A (NaV1.6)] and that defines ‘neonatal’ (exon N) and ‘adult’ (exon A) forms of human NaV channels (Copley, 2004; Gazina et al. 2015). Mutually exclusive exons (usually exon 5) encode different amino acid sequences in the first (D1) of four canonical structurally homologous domains that typify all tetrameric voltage‐gated ion channels. Interestingly, this conserved neonatal/adult splice site is a modifier of disease severity. For example, a mutation in SCN2A (NaV1.2 L1563V), which is located in a region of SCN2A that is downstream of both the neonatal and adult exons, is causative of benign familial neonatal–infantile seizures. Although this mutation has no functional impact in adult NaV1.2 channel isoforms, it increases the overall activity of the neonatal isoform of NaV1.2. The benign familial neonatal–infantile seizures‐causing mutation alters several biophysical properties of neonatal NaV1.2 channels, including reduced kinetics of inactivation, faster recovery from inactivation and less voltage‐sensitive inactivation. Notably, the mutation makes the neonatal NaV1.2 channels behave like wild‐type adult NaV1.2 channels, perhaps explaining why this mutation is disease causing only in neonates and infants and not in adults (Xu et al. 2007; Gazina et al. 2015). 
Human SCN8A (NaV1.6) contains a different pair of exons (18A and 18N) that are also differentially expressed during development. Exon 18N‐containing mRNAs dominate in neurons at birth, but by 10 months of age in mice there is a complete switch to 18A‐containing mRNAs. Interestingly, exon 18N is disruptive; its inclusion shifts the open reading frame. NaV1.6 mRNAs containing 18N are non‐functional and are degraded by nonsense‐mediated decay (O'Brien et al. 2012). Thus, in this case alternative pre‐mRNA splicing is used as a mechanism for cell‐specific and development‐dependent control of NaV1.6 channel expression post‐transcription.
Subtle changes in splicing regulation at the conserved neonatal/adult splice site of SCN genes have also been proposed to arise from common single nucleotide polymorphisms (SNPs), and these are linked to altered clinical outcomes. For example, common SNPs in SCN1A (NaV1.1) are hypothesized to influence the development of some types of epilepsy (Kasperaviciute et al. 2013) and the dosage of anti‐epileptic drugs (Tate et al. 2005).
The function of voltage‐gated calcium ion channels of skeletal muscle, CaV1.1, is also modified during development as a result of alternative pre‐mRNA splicing. Compared with adult channels, CaV1.1 channels that dominate in embryonic skeletal muscle lack exon 29, an alternatively spliced exon that encodes 19 amino acids in the putative extracellular linker between transmembrane domains IVS3–IVS4. When present in CaV1.1, the exon 29 encoding sequence reduces calcium currents and shifts channel activation thresholds to more positive voltages compared with embryonic CaV1.1 channels that lack exon 29 (Tuluc et al. 2009; Benedetti et al. 2015). The amino acid sequence that links domains IVS3 and IVS4 of CaV1 and CaV2 channels is hypervariable and a hotspot for alternative splicing in Cacna1 genes from Drosophila to humans (see Lipscombe et al. 2013b).
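The reading‐frame argument made above for exon 18N of SCN8A can be illustrated with a minimal Python sketch. It is not taken from the cited studies: the sequences are short, hypothetical toy strings (not real SCN8A sequence), and the function names are our own. The sketch shows only the general principle that including an exon whose length is not a multiple of three shifts the downstream reading frame and can expose a premature stop codon, the kind of event that targets an mRNA for nonsense‐mediated decay.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def preserves_reading_frame(exon_length_nt: int) -> bool:
    """An inserted exon keeps the downstream frame only if its length is a multiple of 3."""
    return exon_length_nt % 3 == 0


def first_stop_codon(cds: str):
    """Return the 0-based index of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical toy sequences, chosen only to illustrate the principle.
upstream = "ATGGCC"          # start codon plus one codon
exon_a = "GGTGCA"            # 6 nt: frame-preserving alternative exon
exon_n = "GGTGC"             # 5 nt: frame-shifting alternative exon
downstream = "CTGAAAGGCTAA"  # ends with the normal stop codon

transcript_a = upstream + exon_a + downstream
transcript_n = upstream + exon_n + downstream

stop_a = first_stop_codon(transcript_a)  # stop at the normal position (last codon)
stop_n = first_stop_codon(transcript_n)  # earlier stop exposed by the frame shift
print(preserves_reading_frame(len(exon_a)), stop_a)
print(preserves_reading_frame(len(exon_n)), stop_n)
```

In the toy example, the transcript carrying the 6‐nucleotide exon terminates at its final codon, whereas the transcript carrying the 5‐nucleotide exon reaches a stop codon earlier. Real analyses would use annotated exon coordinates and transcript sequences taken from a genome browser.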
The neonatal and adult alternative splice isoforms of SCN genes and the developmentally regulated isoforms of CACNA1S, which encodes CaV1.1 channels, can all be identified in the UCSC genome browser or IGV by aligning the mRNAs from all species and cross‐referencing with conservation scores, as shown in Figs 1 and 2 for Cacna1 genes. These examples demonstrate the utility of genome‐wide transcription data in identifying developmentally regulated splicing events.
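The cross‐referencing of exons with conservation scores described above is done visually in a genome browser, but the same idea can be sketched in a few lines of Python. The following is a hedged illustration only: the coordinates, exon names and per‐base scores are invented toy values, and `mean_conservation` is a hypothetical helper, not part of any browser or library. It assumes per‐base conservation scores have already been exported for a region as a simple list.

```python
from statistics import mean


def mean_conservation(intervals, scores, region_start):
    """Average per-base conservation score over each named interval.

    intervals: dict mapping a name to (start, end), half-open genomic coordinates
    scores: list of per-base scores; scores[0] corresponds to region_start
    """
    result = {}
    for name, (start, end) in intervals.items():
        window = scores[start - region_start:end - region_start]
        if not window:
            raise ValueError(f"{name} lies outside the scored region")
        result[name] = mean(window)
    return result


# Hypothetical toy region: low-scoring intron flanking two well-conserved alternative exons.
region_start = 100
scores = [0.1] * 10 + [0.9] * 5 + [0.2] * 5 + [0.8] * 5 + [0.1] * 5
intervals = {
    "exon_N": (110, 115),
    "intron": (115, 120),
    "exon_A": (120, 125),
}

conservation = mean_conservation(intervals, scores, region_start)
for name, value in conservation.items():
    print(f"{name}: {value:.2f}")
```

With these toy values, both alternative exons score well above the intervening intron, which is the pattern one looks for when using conservation tracks to support a candidate alternatively spliced exon.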

Current directions: alternative splicing, disease and therapy

As discussed elsewhere, aberrant patterns of alternative pre‐mRNA splicing can also contribute to disease pathology, as the primary cause of disease, as a consequence of disease pathology or as disease modifiers (Lipscombe et al. 2013b; Singh & Cooper, 2012), while natural genetic variations among individuals might account for differences in disease susceptibility. Consequently, there is interest in therapeutic strategies to regulate the pattern of alternative pre‐mRNA splicing, for example, to correct mis‐splicing that may result from mutations that disrupt pre‐mRNA splicing or to shift the balance of splicing to modify disease severity (Yoshida et al. 2015). Aberrant splicing patterns arising from rare mutations and common nucleotide variants across the human genome contribute to the aetiology of diseases and disorders including cancers, spinal muscular atrophy, autism spectrum disorders and other developmental disorders (Moulton et al. 2013; Tavassoli et al. 2014; Xiong et al. 2015; Zhang et al. 2015). In the human heart, SCN5A pre‐mRNAs (NaV1.5) are mis‐spliced in patients with myotonic dystrophy type 1 (DM1) and Brugada syndrome ventricular tachyarrhythmias (Wahbi et al. 2013). There are abnormally high levels of the neonatal splice isoforms of NaV1.5 mRNA in ventricular myocardial tissue from patients with DM1 compared with control subjects, but this is only one example of a large number of abnormal splicing patterns found in patients with DM1. Myotonic dystrophy type 1 is caused by a heterozygous trinucleotide repeat expansion [(CTG)n] in the 3′ untranslated region of the dystrophia myotonica protein kinase gene. A secondary consequence of this trinucleotide expansion is widespread dysregulation of pre‐mRNA splicing in muscle, because of the sequestration of nuclear trans‐acting splicing factors to the expanded trinucleotide repeat, notably Muscleblind‐like member 1 (MBNL1) and CUG‐binding protein Elav‐like family member 1 (CELF1).
Strategies designed to correct levels of these critical splicing factors are being explored as novel therapies for this devastating disease (Lee et al. 2012). The causal gene for Rett syndrome, MECP2, regulates splicing, and aberrant alternative splicing patterns are observed in mouse models of Rett syndrome (Young et al. 2005). Several other neurological and motor neuron disorders are linked by common dysfunction in RNA processing, including fused in sarcoma (FUS)‐amyotrophic lateral sclerosis (FUS‐ALS), TAR DNA binding protein (TDP43)‐ALS, frontotemporal dementia (FTD), spinal muscular atrophy (SMA) and autism spectrum disorders (Achsel et al. 2013; Ling et al. 2013; Corominas et al. 2014; Xiong et al. 2015). Approaches to normalize RNA splicing events are therefore being explored as therapeutic strategies for diseases such as SMA (Hua et al. 2011) and DM1 (Oana et al. 2013; Wojciechowska et al. 2014).

Conclusion

The importance of individual and correlated RNA splicing events in normal and pathological cell function is indisputable. Here, we demonstrate how to access publicly available genome‐wide sequence databases and to visualize and vertically integrate this information for a given gene or sets of genes using available Web‐based and informatics tools. The value‐content of genome‐wide data sets is already high, but it will continue to grow as massively parallel sequencing of RNA and maps of regulatory elements in RNA and DNA are added from different cell types and tissues, in healthy and disease‐affected individuals. Deep, genome‐wide sequencing of expressed mRNAs at different developmental time points for a range of specific brain regions and tissues will be necessary for an accurate spatial and temporal map of splicing regulation. Knowledge of cell‐ and tissue‐specific exon combination patterns, as well as the cell‐specific regulatory elements that regulate them, will lead to a comprehensive set of data that should be highly predictive of the function and the state of the cell, and may inform strategies aimed at correcting abnormal splicing events associated with disease.

Additional information

Competing interests

None declared.

Author contributions

All authors contributed equally to the compiling of the literature, its analysis and the writing of the paper.

Funding

This work was funded by the National Institute of Neurological Disorders and Stroke (NINDS; grant no. NS055251 to D.L.), the Royal Society (to S.S.), the National Institute of Mental Health (NIMH; grant no. MH099448 to J.Q.P.) and the Stanley Medical Foundation (to J.Q.P.).
References

1.  Systematic discovery of regulated and conserved alternative exons in the mammalian brain reveals NMD modulating chromatin regulators.

Authors:  Qinghong Yan; Sebastien M Weyn-Vanhentenryck; Jie Wu; Steven A Sloan; Ye Zhang; Kenian Chen; Jia Qian Wu; Ben A Barres; Chaolin Zhang
Journal:  Proc Natl Acad Sci U S A       Date:  2015-03-03       Impact factor: 11.205

Review 2.  The emerging era of genomic data integration for analyzing splice isoform function.

Authors:  Hong-Dong Li; Rajasree Menon; Gilbert S Omenn; Yuanfang Guan
Journal:  Trends Genet       Date:  2014-06-17       Impact factor: 11.639

Review 3.  Tuning the molecular giant titin through phosphorylation: role in health and disease.

Authors:  Carlos Hidalgo; Henk Granzier
Journal:  Trends Cardiovasc Med       Date:  2013-01-05       Impact factor: 6.677

4.  RBM20, a gene for hereditary cardiomyopathy, regulates titin splicing.

Authors:  Wei Guo; Sebastian Schafer; Marion L Greaser; Michael H Radke; Martin Liss; Thirupugal Govindarajan; Henrike Maatz; Herbert Schulz; Shijun Li; Amanda M Parrish; Vita Dauksaite; Padmanabhan Vakeel; Sabine Klaassen; Brenda Gerull; Ludwig Thierfelder; Vera Regitz-Zagrosek; Timothy A Hacker; Kurt W Saupe; G William Dec; Patrick T Ellinor; Calum A MacRae; Bastian Spallek; Robert Fischer; Andreas Perrot; Cemil Özcelik; Kathrin Saar; Norbert Hubner; Michael Gotthardt
Journal:  Nat Med       Date:  2012-05       Impact factor: 53.440

5.  'Neonatal' Nav1.2 reduces neuronal excitability and affects seizure susceptibility and behaviour.

Authors:  Elena V Gazina; Bryan T W Leaw; Kay L Richards; Verena C Wimmer; Tae H Kim; Timothy D Aumann; Travis J Featherby; Leonid Churilov; Vicki E Hammond; Christopher A Reid; Steven Petrou
Journal:  Hum Mol Genet       Date:  2014-11-06       Impact factor: 6.150

6.  Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Authors:  Ewan Birney; John A Stamatoyannopoulos; Anindya Dutta; Roderic Guigó; Thomas R Gingeras; Elliott H Margulies; Zhiping Weng; Michael Snyder; Emmanouil T Dermitzakis; Robert E Thurman; Michael S Kuehn; Christopher M Taylor; Shane Neph; Christoph M Koch; Saurabh Asthana; Ankit Malhotra; Ivan Adzhubei; Jason A Greenbaum; Robert M Andrews; Paul Flicek; Patrick J Boyle; Hua Cao; Nigel P Carter; Gayle K Clelland; Sean Davis; Nathan Day; Pawandeep Dhami; Shane C Dillon; Michael O Dorschner; Heike Fiegler; Paul G Giresi; Jeff Goldy; Michael Hawrylycz; Andrew Haydock; Richard Humbert; Keith D James; Brett E Johnson; Ericka M Johnson; Tristan T Frum; Elizabeth R Rosenzweig; Neerja Karnani; Kirsten Lee; Gregory C Lefebvre; Patrick A Navas; Fidencio Neri; Stephen C J Parker; Peter J Sabo; Richard Sandstrom; Anthony Shafer; David Vetrie; Molly Weaver; Sarah Wilcox; Man Yu; Francis S Collins; Job Dekker; Jason D Lieb; Thomas D Tullius; Gregory E Crawford; Shamil Sunyaev; William S Noble; Ian Dunham; France Denoeud; Alexandre Reymond; Philipp Kapranov; Joel Rozowsky; Deyou Zheng; Robert Castelo; Adam Frankish; Jennifer Harrow; Srinka Ghosh; Albin Sandelin; Ivo L Hofacker; Robert Baertsch; Damian Keefe; Sujit Dike; Jill Cheng; Heather A Hirsch; Edward A Sekinger; Julien Lagarde; Josep F Abril; Atif Shahab; Christoph Flamm; Claudia Fried; Jörg Hackermüller; Jana Hertel; Manja Lindemeyer; Kristin Missal; Andrea Tanzer; Stefan Washietl; Jan Korbel; Olof Emanuelsson; Jakob S Pedersen; Nancy Holroyd; Ruth Taylor; David Swarbreck; Nicholas Matthews; Mark C Dickson; Daryl J Thomas; Matthew T Weirauch; James Gilbert; Jorg Drenkow; Ian Bell; XiaoDong Zhao; K G Srinivasan; Wing-Kin Sung; Hong Sain Ooi; Kuo Ping Chiu; Sylvain Foissac; Tyler Alioto; Michael Brent; Lior Pachter; Michael L Tress; Alfonso Valencia; Siew Woh Choo; Chiou Yu Choo; Catherine Ucla; Caroline Manzano; Carine Wyss; Evelyn Cheung; Taane G Clark; James B Brown; Madhavan Ganesh; Sandeep Patel; Hari Tammana; 
Jacqueline Chrast; Charlotte N Henrichsen; Chikatoshi Kai; Jun Kawai; Ugrappa Nagalakshmi; Jiaqian Wu; Zheng Lian; Jin Lian; Peter Newburger; Xueqing Zhang; Peter Bickel; John S Mattick; Piero Carninci; Yoshihide Hayashizaki; Sherman Weissman; Tim Hubbard; Richard M Myers; Jane Rogers; Peter F Stadler; Todd M Lowe; Chia-Lin Wei; Yijun Ruan; Kevin Struhl; Mark Gerstein; Stylianos E Antonarakis; Yutao Fu; Eric D Green; Ulaş Karaöz; Adam Siepel; James Taylor; Laura A Liefer; Kris A Wetterstrand; Peter J Good; Elise A Feingold; Mark S Guyer; Gregory M Cooper; George Asimenos; Colin N Dewey; Minmei Hou; Sergey Nikolaev; Juan I Montoya-Burgos; Ari Löytynoja; Simon Whelan; Fabio Pardi; Tim Massingham; Haiyan Huang; Nancy R Zhang; Ian Holmes; James C Mullikin; Abel Ureta-Vidal; Benedict Paten; Michael Seringhaus; Deanna Church; Kate Rosenbloom; W James Kent; Eric A Stone; Serafim Batzoglou; Nick Goldman; Ross C Hardison; David Haussler; Webb Miller; Arend Sidow; Nathan D Trinklein; Zhengdong D Zhang; Leah Barrera; Rhona Stuart; David C King; Adam Ameur; Stefan Enroth; Mark C Bieda; Jonghwan Kim; Akshay A Bhinge; Nan Jiang; Jun Liu; Fei Yao; Vinsensius B Vega; Charlie W H Lee; Patrick Ng; Atif Shahab; Annie Yang; Zarmik Moqtaderi; Zhou Zhu; Xiaoqin Xu; Sharon Squazzo; Matthew J Oberley; David Inman; Michael A Singer; Todd A Richmond; Kyle J Munn; Alvaro Rada-Iglesias; Ola Wallerman; Jan Komorowski; Joanna C Fowler; Phillippe Couttet; Alexander W Bruce; Oliver M Dovey; Peter D Ellis; Cordelia F Langford; David A Nix; Ghia Euskirchen; Stephen Hartman; Alexander E Urban; Peter Kraus; Sara Van Calcar; Nate Heintzman; Tae Hoon Kim; Kun Wang; Chunxu Qu; Gary Hon; Rosa Luna; Christopher K Glass; M Geoff Rosenfeld; Shelley Force Aldred; Sara J Cooper; Anason Halees; Jane M Lin; Hennady P Shulha; Xiaoling Zhang; Mousheng Xu; Jaafar N S Haidar; Yong Yu; Yijun Ruan; Vishwanath R Iyer; Roland D Green; Claes Wadelius; Peggy J Farnham; Bing Ren; Rachel A Harte; Angie S Hinrichs; Heather 
Trumbower; Hiram Clawson; Jennifer Hillman-Jackson; Ann S Zweig; Kayla Smith; Archana Thakkapallayil; Galt Barber; Robert M Kuhn; Donna Karolchik; Lluis Armengol; Christine P Bird; Paul I W de Bakker; Andrew D Kern; Nuria Lopez-Bigas; Joel D Martin; Barbara E Stranger; Abigail Woodroffe; Eugene Davydov; Antigone Dimas; Eduardo Eyras; Ingileif B Hallgrímsdóttir; Julian Huppert; Michael C Zody; Gonçalo R Abecasis; Xavier Estivill; Gerard G Bouffard; Xiaobin Guan; Nancy F Hansen; Jacquelyn R Idol; Valerie V B Maduro; Baishali Maskeri; Jennifer C McDowell; Morgan Park; Pamela J Thomas; Alice C Young; Robert W Blakesley; Donna M Muzny; Erica Sodergren; David A Wheeler; Kim C Worley; Huaiyang Jiang; George M Weinstock; Richard A Gibbs; Tina Graves; Robert Fulton; Elaine R Mardis; Richard K Wilson; Michele Clamp; James Cuff; Sante Gnerre; David B Jaffe; Jean L Chang; Kerstin Lindblad-Toh; Eric S Lander; Maxim Koriabine; Mikhail Nefedov; Kazutoyo Osoegawa; Yuko Yoshinaga; Baoli Zhu; Pieter J de Jong
Journal:  Nature       Date:  2007-06-14       Impact factor: 49.962

7.  RNA Bind-n-Seq: quantitative assessment of the sequence and structural binding specificity of RNA binding proteins.

Authors:  Nicole Lambert; Alex Robertson; Mohini Jangi; Sean McGeary; Phillip A Sharp; Christopher B Burge
Journal:  Mol Cell       Date:  2014-05-15       Impact factor: 17.970

8.  Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture.

Authors:  Alicia R Martin; Helio A Costa; Tuuli Lappalainen; Brenna M Henn; Jeffrey M Kidd; Muh-Ching Yee; Fabian Grubert; Howard M Cann; Michael Snyder; Stephen B Montgomery; Carlos D Bustamante
Journal:  PLoS Genet       Date:  2014-08-14       Impact factor: 5.917

9.  Epilepsy, hippocampal sclerosis and febrile seizures linked by common genetic variation around SCN1A.

Authors:  Dalia Kasperaviciute; Claudia B Catarino; Mar Matarin; Costin Leu; Jan Novy; Anna Tostevin; Bárbara Leal; Ellen V S Hessel; Kerstin Hallmann; Michael S Hildebrand; Hans-Henrik M Dahl; Mina Ryten; Daniah Trabzuni; Adaikalavan Ramasamy; Saud Alhusaini; Colin P Doherty; Thomas Dorn; Jörg Hansen; Günter Krämer; Bernhard J Steinhoff; Dominik Zumsteg; Susan Duncan; Reetta K Kälviäinen; Kai J Eriksson; Anne-Mari Kantanen; Massimo Pandolfo; Ursula Gruber-Sedlmayr; Kurt Schlachter; Eva M Reinthaler; Elisabeth Stogmann; Fritz Zimprich; Emilie Théâtre; Colin Smith; Terence J O'Brien; K Meng Tan; Slave Petrovski; Angela Robbiano; Roberta Paravidino; Federico Zara; Pasquale Striano; Michael R Sperling; Russell J Buono; Hakon Hakonarson; João Chaves; Paulo P Costa; Berta M Silva; António M da Silva; Pierre N E de Graan; Bobby P C Koeleman; Albert Becker; Susanne Schoch; Marec von Lehe; Philipp S Reif; Felix Rosenow; Felicitas Becker; Yvonne Weber; Holger Lerche; Karl Rössler; Michael Buchfelder; Hajo M Hamer; Katja Kobow; Roland Coras; Ingmar Blumcke; Ingrid E Scheffer; Samuel F Berkovic; Michael E Weale; Norman Delanty; Chantal Depondt; Gianpiero L Cavalleri; Wolfram S Kunz; Sanjay M Sisodiya
Journal:  Brain       Date:  2013-09-06       Impact factor: 13.501

10.  Small molecule kinase inhibitors alleviate different molecular features of myotonic dystrophy type 1.

Authors:  Marzena Wojciechowska; Katarzyna Taylor; Krzysztof Sobczak; Marek Napierala; Wlodzimierz J Krzyzosiak
Journal:  RNA Biol       Date:  2014-04-24       Impact factor: 4.652

