Lorea Blazquez, Warren Emmett, Rupert Faraway, Jose Mario Bello Pineda, Simon Bajew, Andre Gohr, Nejc Haberman, Christopher R Sibley, Robert K Bradley, Manuel Irimia, Jernej Ule.
Abstract
Recursive splicing (RS) starts by defining an "RS-exon," which is then spliced to the preceding exon, thus creating a recursive 5' splice site (RS-5ss). Previous studies focused on cryptic RS-exons, and now we find that the exon junction complex (EJC) represses RS of hundreds of annotated, mainly constitutive RS-exons. The core EJC factors, and the peripheral factors PNN and RNPS1, maintain RS-exon inclusion by repressing spliceosomal assembly on RS-5ss. The EJC also blocks 5ss located near exon-exon junctions, thus repressing inclusion of cryptic microexons. The prevalence of annotated RS-exons is high in deuterostomes, while the cryptic RS-exons are more prevalent in Drosophila, where EJC appears less capable of repressing RS. Notably, incomplete repression of RS also contributes to physiological alternative splicing of several human RS-exons. Finally, haploinsufficiency of the EJC factor Magoh in mice is associated with skipping of RS-exons in the brain, with relevance to the microcephaly phenotype and human diseases.
Keywords: RS exon; alternative splicing mechanisms; evolution; exon junction complex; gene expression; microcephaly; microexon; neurodevelopmental disorders; recursive splicing
Year: 2018 PMID: 30388411 PMCID: PMC6224609 DOI: 10.1016/j.molcel.2018.09.033
Source DB: PubMed Journal: Mol Cell ISSN: 1097-2765 Impact factor: 19.328
Figure 1. Core EJC Components Promote Inclusion of Putative “RS-Exons”
(A) An RS-exon starts with a partial 5ss motif, and after the first step of splicing to the preceding exon, it generates RS-5ss within the part-spliced transcript. The RS-exon will be skipped if the RS-5ss is used for the second step of splicing and included if the canonical 5ss is used.
(B) Pie charts show the prevalence of putative RS-exons in human mRNAs according to ENSEMBL GRCh37 annotation.
(C) RT-PCR analysis of unspliced and part-spliced reporters derived from the alternative CADM2 (aCADM2) isoform after transient transfection into HeLa cells.
(D) aCADM2 unspliced reporter was stably integrated into the genome of HeLa cells, and the splicing pattern of the RS-exon was analyzed by RT-PCR after eIF4A3, RBM8A, MAGOH, and UPF1 KD.
(E) Boxplots show the difference in percentage spliced in (dPSI) of highly included exons (PSI > 90%) after KD of eIF4A3, RBM8A, CASC3, or UPF1. Exons are binned by their RS-5ss score, and dPSI for each bin is calculated by subtracting the PSI in the control experiment from the PSI in each KD. The RS-5ss values on the x axis indicate the midpoint of each group. Negative dPSI values indicate increased exon skipping upon KD.
(F) Same as (E), but for alternative exons with a PSI < 90%.
(G) The statistical significance of RS-exon skipping is assessed by dividing constitutive RS-exons (PSI > 0.98) into two groups based on an RS-5ss score threshold, comparing the dPSI values of the two groups, and calculating a signed p value by testing for a skew in dPSI values between the groups using the Wilcoxon rank-sum test. The analysis is repeated at multiple thresholds, from −40 to 8.
(H) RT-PCR analysis of RS-exon splicing pattern after KD of EJC core factors or UPF1.
Results shown in (C), (D), and (H) derive from a minimum of 3 independent experiments performed in HeLa cells.
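The binning described in (E) and (F) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `dpsi_by_bin`, the bin width of 2 score units, and the choice to summarize each bin by its median (the figure shows full boxplots) are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

def dpsi_by_bin(exons, bin_width=2.0, min_control_psi=90.0):
    """Median dPSI per RS-5ss score bin for highly included exons.

    exons: iterable of (rs5ss_score, psi_control, psi_kd), PSI in percent.
    bin_width and the median summary are illustrative assumptions.
    """
    bins = defaultdict(list)
    for score, psi_ctrl, psi_kd in exons:
        if psi_ctrl <= min_control_psi:
            continue  # keep only exons with control PSI > 90%
        dpsi = psi_kd - psi_ctrl  # negative = more skipping upon KD
        lower = (score // bin_width) * bin_width
        bins[lower + bin_width / 2].append(dpsi)  # label bin by its midpoint
    return {mid: median(vals) for mid, vals in sorted(bins.items())}

# Hypothetical values, for illustration only
example = [(1.0, 95, 90), (1.5, 99, 97), (7.2, 96, 60), (7.9, 98, 70)]
print(dpsi_by_bin(example))
```

Passing a lower `min_control_psi` and inverting the filter would give the complementary analysis of alternative exons in (F).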
Figure 2. PNN and RNPS1 Contribute to the Inclusion of “Annotated RS-Exons”
(A) The statistical significance of RS-exon skipping upon KD of 28 RBPs is assessed by analysis of public RNA-seq data as explained in Figure 1G. Peripheral EJC factors are marked in purple, and experiments that had no effect are marked in gray.
(B) The statistical significance of RS-exon skipping upon KD of 191 RBPs was calculated by analysis of ENCODE consortium RNA-seq data, as in (A). Core EJC components are marked in red, new factors that had an effect are marked in blue, and KD experiments that had no effect are marked in gray.
(C) Distribution of dPSI after KD of different RBPs is shown for highly included exons in control (PSI > 90%), as explained in Figure 1E.
(D) Same as (C), but for alternative exons with PSI < 90%.
(E) RT-PCR validation of RS-exon skipping after KD of eIF4A3, RNPS1, or PNN (n = 3 independent experiments for KLHL20 and TMA16; n = 2 for EGLN1, KPNA1, MRPL3, and SACM1L).
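The threshold scan introduced in Figure 1G and reused in (A) and (B) can be sketched in plain Python. The rank-sum statistic below uses the normal approximation without tie correction, and the sign convention (negative when exons at or above the threshold have lower dPSI, i.e. are more skipped) is an assumption for illustration; the source does not specify either detail. The function names and the minimum group size of 3 are likewise hypothetical.

```python
import math

def rank_sum_z(x, y):
    """z statistic of the Wilcoxon rank-sum test (normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank for tied values
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    return (w - mean) / sd

def signed_log10_p(dpsi_above, dpsi_below):
    """Signed -log10(p); negative when the 'above' group has lower dPSI."""
    z = rank_sum_z(dpsi_above, dpsi_below)
    p = max(math.erfc(abs(z) / math.sqrt(2)), 1e-300)  # two-sided p value
    return math.copysign(-math.log10(p), z)

def threshold_scan(exons, thresholds, min_group=3):
    """exons: list of (rs5ss_score, dpsi). Returns {threshold: signed score}."""
    out = {}
    for t in thresholds:
        above = [d for s, d in exons if s >= t]
        below = [d for s, d in exons if s < t]
        if len(above) >= min_group and len(below) >= min_group:
            out[t] = signed_log10_p(above, below)
    return out
```

Calling `threshold_scan(exons, range(-40, 9, 4))` covers the −40 to 8 range named in the legend; the step of 4 is arbitrary.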
Figure 3. Skipping of Annotated RS-Exons upon EJC KD Results from Recursive Splicing
(A) The count of PRPF8 iCLIP reads that identify crosslinks at each nucleotide upstream of exon-intron junctions is normalized by the total number of evaluated junctions (RS = 4,631; non-RS = 130,410) and the total number of crosslinks in each experiment. HeLa cells treated with either control or eIF4A3 siRNAs were used for iCLIP (n = 4 per group, 2 independent experiments). Reads upstream of RS-exons and non-RS-exons are plotted in orange and gray, respectively. The data were smoothed using locally weighted scatterplot smoothing (LOESS). Shaded regions represent 95% confidence intervals.
(B) Density plot as in (A), but assessing PRPF8 iCLIP upstream of exon-exon junctions. Crosslinks upstream of RS-exons or non-RS-exons are plotted in blue or gray, respectively.
(C) KPNA1 unspliced reporter with a mutant RS-5ss was stably integrated into HeLa cells, and the splicing pattern of endogenous or mutant RS-exon was analyzed by RT-PCR after eIF4A3 KD.
(D) Same analysis as in (C), after transfecting cells with an antisense oligonucleotide complementary to the EJC deposition site.
(E) RT-PCR analysis of wild-type and mutant unspliced and part-spliced reporters derived from the KPNA1 gene after their transient transfection into HeLa cells.
(F) RT-PCR analysis of RS-exon splicing pattern after transient transfection of wild-type or mutant aCADM2, KPNA1, and PSMA3-AS1 reporters into HeLa cells treated with control or eIF4A3 siRNAs. The original RS-5ss score is indicated in bold, and the scores after mutation are in regular font.
(G) Nucleotide sequences at the exon-exon junctions and their associated RS-5ss scores for the splicing reporters used in (C), (F), and (H). The nucleotides before and after the slash sign correspond to the last 3 nt of the preceding exon and the first 6 nt of the exon under study. Mutations are indicated in red.
(H) Wild-type and mutant splicing reporters derived from RPS2 gene were transiently transfected into HeLa cells, and their splicing pattern was analyzed by RT-PCR after control or eIF4A3 KD.
Results shown in (C)–(F) and (H) derive from a minimum of 3 independent replicates.
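The junction sequences in (G) combine the last 3 nt of the preceding exon with the first 6 nt of the exon under study, which is the same 9-nt layout (3 exonic + 6 downstream positions) that MaxEntScan's 5ss scoring takes as input. A minimal sketch of building that 9-mer is shown below; the helper names and the example sequences are hypothetical, and the scoring step itself is left to the external tool.

```python
def rs5ss_ninemer(preceding_exon, exon):
    """Last 3 nt of the preceding exon + first 6 nt of the exon under study."""
    if len(preceding_exon) < 3 or len(exon) < 6:
        raise ValueError("need at least 3 nt upstream and 6 nt downstream")
    return (preceding_exon[-3:] + exon[:6]).upper()

def has_gt_dinucleotide(ninemer):
    """True if the two positions right after the junction are GT (GU in RNA)."""
    return ninemer[3:5] in ("GT", "GU")

# Hypothetical sequences, for illustration only
ninemer = rs5ss_ninemer("ACGTCAG", "GTAAGTCCC")
print(ninemer[:3] + "/" + ninemer[3:], has_gt_dinucleotide(ninemer))
```

Printing the 9-mer with a slash after the third nucleotide mirrors the notation used in panel (G).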
Figure 4. Stable EJC Deposition Is Required to Block Recursive Splicing
(A) KD of EJC components was rescued with siRNA-resistant FLAG-tagged counterparts. The splicing pattern of endogenous KPNA1 or stably integrated aCADM2 RS-exons was monitored by RT-PCR.
(B) Wild-type or mutant SL2 sequence was inserted into KPNA1 and mCADM2 splicing reporters at the expected EJC deposition site, and the splicing pattern was monitored by RT-PCR after co-transfection of SL2 reporters and plasmids expressing the indicated MS2-tagged proteins.
A minimum of 3 independent replicates was performed in HeLa (A) or 293 (B) cells.
Figure 5. EJC Depletion Leads to Inclusion of Cryptic Microexons
(A) The SL2 in mCADM2 splicing reporter was moved upstream of the exon-exon junction as indicated, and the RS-exon splicing pattern was analyzed after co-transfection of SL2 reporters and plasmids expressing MS2-eIF4A3 or GFP proteins.
(B) Six nucleotides (GCACAG) were added at the beginning of the KPNA1 RS-exon to move the RS site to an internal 5ss, thus creating a KPNA1 microexon reporter. Unspliced and part-spliced versions of the reporters were transfected, and their splicing pattern was analyzed by RT-PCR.
(C) The internal or canonical 5ss within the KPNA1 microexon unspliced reporter was mutated, and the splicing pattern was analyzed by RT-PCR after transfection into HeLa cells treated with control or eIF4A3 siRNAs.
(D) The difference in the use of RS-5ss or internal 5ss in EJC KD compared to control (dPURS) is shown in HeLa cells. Cryptic microexons result from the use of an internal 5ss that is located within the first 15 nt of a longer annotated exon. Positive dPURS values indicate increased RS-exon skipping or increased inclusion of cryptic microexons.
(E) RT-PCR validation of 2 microexon inclusion events identified in (D) after KD of EJC core factors or UPF1.
Results shown in (A)–(C) and (E) derive from a minimum of 3 independent replicates performed in HeLa cells.
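The distinction between an RS-5ss (at the exon-exon junction itself) and an internal 5ss within the first 15 nt of an exon, as used in (B) and (D), can be illustrated with a simple scan for GT dinucleotides. This sketch only locates candidates and does not score them or compute dPURS; the function name and all sequences other than the GCACAG insert from (B) are hypothetical.

```python
def candidate_5ss(preceding_exon, exon, window=15):
    """List GT-containing 5ss candidates at or near an exon-exon junction.

    Returns (kind, microexon_length, ninemer) tuples. A length of 0 is an
    RS-5ss; lengths 1..window are internal 5ss that would leave a microexon.
    Assumes preceding_exon has at least 3 nt.
    """
    joined = (preceding_exon[-3:] + exon).upper()
    hits = []
    for i in range(0, window + 1):
        ninemer = joined[i:i + 9]  # 3 nt before the candidate site + 6 nt after
        if len(ninemer) < 9 or ninemer[3:5] != "GT":
            continue
        kind = "RS-5ss" if i == 0 else "internal 5ss"
        hits.append((kind, i, ninemer))
    return hits

# Hypothetical exon sequences; only the GCACAG insert comes from panel (B)
upstream, rs_exon = "TTTCAG", "GTAAGAACCATCATCC"
print(candidate_5ss(upstream, rs_exon))
print(candidate_5ss(upstream, "GCACAG" + rs_exon))
```

In this toy example the 6-nt insert ends in CAG, so the same 9-mer is found 6 nt into the exon rather than at the junction, which is the shift from RS-5ss to internal 5ss that the microexon reporter is described as creating.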
Figure 6. EJC-Mediated Repression of Recursive Splicing in the Brain, and Physiologic Recursive Splicing
(A) The dPSI of exons highly included in wild-type mouse brain (PSI > 90%) is shown after comparison with Magoh haploinsufficient mouse brain as in Figure 1E.
(B) Same as (A), but comparing control Emx1-cre and wild-type mouse brain.
(C) A schematic of the MRPS5 gene with the RS-exon highlighted in blue. Below, PRPF8 iCLIP crosslinking is shown, as identified by either ungapped or gapped reads in control or eIF4A3 KD cells (4 replicates are summed up per condition). Crosslinks upstream of junctions involving the RS-exon are shown in red, and corresponding reads are zoomed into on the right. Further below, a Sashimi plot shows RNA-seq evidence of RS-exon skipping.
(D) The types of possible lariats associated with RS-exon splicing are named. In black are the lariats that were detected for the AP1G2 gene as described in (E).
(E) Sequence of the RS-exon in the AP1G2 gene and its downstream intron. BP annotations based on lariat sequencing are highlighted in red. Arrows indicate primers used to interrogate lariats shown in black in (D). Above, sequences and number of reads supporting RS-lariats (inverted alignment) with alternative BP annotations in cerebellum and K562 cells are shown. Below, sequences and number of reads supporting exon inclusion downstream lariats with alternative BP annotations are shown.
Figure 7. Analysis of RS-Exon Inclusion across Evolution
(A) Density plot showing 5ss score distribution at exon-intron junctions (canonical 5ss, shaded lines) or at exon-exon junctions (RS-5ss, unshaded lines) for all internal annotated exons in human (blue), sea urchin (purple), or fruit fly (green). The red line marks the 5ss score that is lower than the scores of 90% of exon-intron junctions in each species. This value is used as the RS-5ss threshold to quantify the proportion of putative RS-exons across species in (B).
(B) An RS-5ss threshold was calculated as explained in (A) for 7 different species. Putative RS-exons for each species were calculated as the proportion of annotated exons that reconstitute an RS-5ss above the threshold.
(C) Model for EJC-dependent repression of recursive splicing and microexon inclusion. Top panel: annotated putative RS-exons (in blue) are defined and spliced to their preceding exon to reconstitute a 5ss (RS-5ss) at the exon-exon junction, which leads to EJC deposition. Two outcomes can result from the second step of splicing: (1) the EJC does not efficiently repress the RS-5ss, either because it is a very strong RS-5ss or EJC assembly is deficient, and as a result, the RS-exon is recursively spliced; or (2) the EJC efficiently represses recursive splicing at most RS-5ss that are present in human mRNAs, most often leading to constitutive inclusion of RS-exons. Bottom panel: canonical exons that contain an internal 5ss close to their beginning (in white and purple) are defined and spliced to their preceding exon. (3) This splicing event leads to EJC deposition, which normally blocks the internal 5ss recognition, leading to an isoform that includes the full exon (bottom). However, if EJC does not repress the internal 5ss efficiently, its recognition leads to inclusion of a microexon (in white), while the rest of the exon (in purple) is removed in the second splicing step.
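The per-species threshold in (A) and the proportion in (B) can be sketched as below. This reads the threshold as the 10th percentile of canonical 5ss scores, which is an interpretation of the legend; the simple index-based percentile, the function names, and the example scores are assumptions for illustration.

```python
def rs5ss_threshold(canonical_scores, lower_percent=10):
    """Score that roughly 90% of canonical 5ss meet or exceed.

    Uses a plain index into the sorted scores (no interpolation).
    """
    ordered = sorted(canonical_scores)
    k = len(ordered) * lower_percent // 100
    return ordered[k]

def putative_rs_fraction(rs5ss_scores, threshold):
    """Proportion of exons whose reconstituted RS-5ss scores above threshold."""
    return sum(s > threshold for s in rs5ss_scores) / len(rs5ss_scores)

# Hypothetical scores for one species, for illustration only
canonical = [2, 4, 6, 8, 9, 10, 10, 11, 11, 12]
junctions = [-20, -5, 3, 5, 9]
t = rs5ss_threshold(canonical)
print(t, putative_rs_fraction(junctions, t))
```

Deriving the threshold separately for each species, as the legend describes, keeps the comparison relative to that species' own canonical 5ss strength rather than a fixed absolute score.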
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Rabbit anti-PRPF8 Antibody | Bethyl | Cat#A303-922A; RRID: |
| Rabbit anti-eIF4AIII Antibody | Abcam | Cat#ab32485; RRID: |
| Mouse anti-RBM8A Antibody | SCBT | Cat#sc-32312; RRID: |
| Rabbit anti-MAGOH Antibody | Abcam | Cat#ab180505 |
| Rabbit anti-UPF1 Antibody | Abcam | Cat#ab109363; RRID: |
| Mouse anti-FLAG Antibody | Sigma | Cat#F1804; RRID: |
| Mouse anti-GFP Antibody | Santa Cruz | Cat#sc-9996; RRID: |
| Rabbit anti-GAPDH Antibody | Cell signaling | Cat#2118L; RRID: |
| Mouse anti-GAPDH Antibody | Abcam | Cat#ab8245; RRID: |
| NEB 5-alpha Competent | New England Biolabs | Cat#C2987I |
| Human Brain, Cerebellum Total RNA | Takara Bio | Cat#636535 |
| TRIzol Reagent | Thermo Fisher Scientific | Cat#15596026 |
| Phusion High-Fidelity DNA Polymerase (2 U/μL) | Thermo Fisher Scientific | Cat#F530L |
| 10 mM dNTP Mix | Thermo Fisher Scientific | Cat#18427088 |
| UltraPure Agarose | Thermo Fisher Scientific | Cat#16500500 |
| BlueJuice Gel Loading Buffer | Thermo Fisher Scientific | Cat#10816015 |
| Kanamycin Sulfate | Thermo Fisher Scientific | Cat#BP906-5 |
| Zeocin Selection Reagent | Thermo Fisher Scientific | Cat#R25001 |
| ESGRO Recombinant Mouse LIF Protein | Millipore | Cat#ESG1107 |
| PD 0325901 | Sigma | Cat#PZ0162-25MG |
| CHIR99021 | Sigma | Cat#SML1046-25MG |
| Recombinant Mouse FGF basic Protein | R&D Systems | Cat#3139-FB-025 |
| Lipofectamine RNAiMAX Transfection Reagent-1.5 mL | Thermo Fisher Scientific | Cat#13778150 |
| Lipofectamine 2000 Transfection Reagent-1.5 mL | Thermo Fisher Scientific | Cat#11668019 |
| Endoporter | GeneTools | |
| AMPure XP, 5 mL | Agencourt | Cat#A63880 |
| RIPA Buffer | Sigma-Aldrich | Cat#R0278-50ML |
| cOmplete(TM) Protease Inhibitor Cocktail | Sigma-Aldrich | Cat#11697498001 |
| NuPAGE Novex 4-12% Bis-Tris Protein Gels, 1.0 mm, 10 well | Thermo Fisher Scientific | Cat#NP0321BOX |
| Lipofectamine 3000 Transfection Reagent-1.5 mL | Thermo Fisher Scientific | Cat#L3000015 |
| Lenti-X Concentrator, 100 mL | Takara Clontech | Cat#631231 |
| Puromycin | Takara Clontech | Cat#631305 |
| Blasticidin S HCl | Thermo Fisher Scientific | Cat#R21001 |
| Hygromycin B (50 mg/mL) | Thermo Fisher Scientific | Cat#10687010 |
| Doxycycline hyclate | Sigma-Aldrich | Cat#D9891 |
| Novex TBE Gels, 6%, 10 well-1 box | Thermo Fisher Scientific | Cat#EC6265BOX |
| DNA Clean & Concentrator-5 | Zymo Research | Cat#D4014 |
| Zymoclean Gel DNA Recovery | Zymo Research | Cat#D4007 |
| SuperScript IV First-strand synthesis system | Thermo Fisher Scientific | Cat#18091050 |
| Zero Blunt TOPO PCR Cloning Kit for Sequencing, with One Shot TOP10 Chemically Competent E. coli | Thermo Fisher Scientific | Cat#K2875J10 |
| SENSE mRNA-Seq Library Prep Kit v2 for HiSeq, 96 preps | Lexogen | N/A |
| QIAxcel DNA Screening Kit (2400) | QIAGEN | Cat#929004 |
| Fast SYBR Green Master Mix | Thermo Fisher Scientific | Cat#4385612 |
| ImmoMix | BIOLINE | Cat#BIO-25020 |
| Maxwell RSC simplyRNA Cells Kit | Promega | Cat#AS1390 |
| High-Capacity cDNA Reverse Transcription Kit-200 reactions | Thermo Fisher Scientific | Cat#4368814 |
| MEGAclear Transcription Clean-Up Kit | Thermo Fisher Scientific | Cat#AM1908 |
| MEGAscript T7 Transcription Kit | Thermo Fisher Scientific | Cat#AM1333 |
| Original gel and capillary electrophoresis images and quantification for all figures | This study | |
| PRPF8 iCLIP after eIF4A3 knockdown in HeLa cells | This study | Raw data accessible via |
| mRNA-seq after EJC component knockdown in S2 cells | This study | E-MTAB-7271. Accessible via |
| RNA-seq data from ENCODE | ||
| RNA-seq data from K562 cells enriched for lariats | ( | GEO: |
| RNA-seq data from K562 and NALM-6 cells enriched for lariats | ( | SRA: SRP094107 |
| RNA-seq data from HeLa cells after EJC KD | N/A | GEO: |
| RNA-seq data after PNN KD in corneal epithelial cells | N/A | GEO: |
| RNA-seq data after Acinus KD in HeLa cells | N/A | GEO: |
| RNA-seq data after KD of NMD factors in HeLa cells | N/A | GEO: |
| RNA-seq data after KD of EJC auxiliary factors in human lymphoblastoid cell lines | N/A | GEO: |
| RNA-seq data after KD of SR proteins in mouse P19 cells | N/A | GEO: |
| RNA-seq data of EJC haploinsufficient mouse neocortices | N/A | GEO: |
| eIF4A3 and BTZ iCLIP data | ( | ArrayExpress: |
| RNA-seq HeLa control samples | NCBI sequence read archive | SRA: SRR514854 |
| RNA-seq HeLa control samples | NCBI sequence read archive | SRA: SRR514855 |
| Genome Reference Consortium | ||
| Genome Reference Consortium | N/A | |
| Berkeley Drosophila Genome Project | ||
| Ensembl | ||
| Ensembl | ||
| Ensembl | ||
| Ensembl | ||
| Human: K562 cells | ATCC | ATCC number CCL-243 |
| Human: HeLa Flp-In T-Rex cells | N/A | Prof. Stephen Taylor (University of Manchester) |
| Human: HEK293T cells | European Collection of Authenticated Cell Cultures (ECACC) | 12022001 |
| Human: HEK293 cells | ATCC | ATCC number CRL-1573 |
| Mouse: 46C cells | ( | N/A |
| Drosophila: S2 cells | Drosophila Genomics Resource Center (DGRC) | S2-DGRC |
| Stealth RNAi siRNA Negative Control, Med GC | Thermo Fisher Scientific | Cat#12935300 |
| KPNA1 Morpholino | GeneTools | 5′ GAATATCATCCCCTGTGACAATGTT 3′ |
| Control Morpholino | GeneTools | 5′ CCTCTTACCTCAGTTACAATTTATA 3′ |
| EIF4A3 Stealth siRNA | Thermo Fisher Scientific | Cat#HSS114709 |
| UPF1 Stealth siRNA | Thermo Fisher Scientific | Cat#HSS109172 |
| QX DNA Size Marker 50–800 bp (50ul) | QIAGEN | Cat#929561 |
| pcDNA 3.1(+) Mammalian Expression Vector | Thermo Fisher Scientific | Cat#V79020 |
| pOG44 Flp-Recombinase Expression Vector | Thermo Fisher Scientific | Cat#V600520 |
| pcDNA5 FRT/TO Vector Kit | Thermo Fisher Scientific | Cat#V652020 |
| pLKO.1 puro | N/A | Addgene plasmid # 8453 |
| pCMV-VSV-G | N/A | Addgene plasmid # 8454 |
| pMDLg/pRRE | N/A | Addgene plasmid # 12251 |
| pRSV-Rev | N/A | Addgene plasmid # 12253 |
| pCI-neo FLAG eIF4A3 | ( | N/A |
| pCI-neo FLAG MAGOH | ( | N/A |
| pCI-neo FLAG eIF4A3 E188R | ( | N/A |
| pCI-neo FLAG eIF4A3 401/402 | ( | N/A |
| pCI-neo FLAG MAGOH E20R | ( | N/A |
| pCI-neo FLAG eIF4A3 E188Q | This study | Oligos for cloning detailed in |
| pMS2-GFP | This study | Oligos for cloning detailed in |
| pMS2-eIF4A3 | ( | N/A |
| pMS2-eIF4AIII 401/402 | ( | N/A |
| pcDNA3 and pcDNA5 splicing reporters | This study | Sequences detailed in |
| Bowtie2 | N/A | |
| R/Bioconductor | N/A | |
| R/dplyr | N/A | |
| R/ggplot2 | N/A | |
| STAR RNA aligner | N/A | |
| Samtools version 1.3.1 | N/A | |
| Bedtools (v.2.17.0) | N/A | |
| MaxEntScan | ( | |
| MAJIQ: Modeling Alternative Junction Inclusion Quantification | N/A | |
| RSEM and EBSeq | N/A | |
| R | R Project for Statistical Computing | RRID: |
| RNAfold | N/A | |
| iMaps webserver | N/A | |
| Hisat2 version 2.0.5 | N/A | |
| Cutadapt | N/A | |