
The incredible complexity of RNA splicing.

Christelle Robert, Mick Watson.

Abstract

Alternative splice isoforms are common and important, and have been shown to affect many human diseases. A new study by Nellore et al. offers a comprehensive survey of splice junctions in humans by re-analyzing over 21,500 public human RNA sequencing datasets.

Year:  2016        PMID: 28038679      PMCID: PMC5203710          DOI: 10.1186/s13059-016-1121-y

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


Introduction

A newly published study by Nellore et al. in Genome Biology provides us with the most comprehensive view of human transcriptome splicing to date, having (re)analyzed over 21,500 RNA sequencing (RNA-seq) datasets and discovered 56,865 novel splice junctions [1]. RNA splicing is a post-transcriptional RNA processing mechanism occurring in eukaryotic organisms whereby introns are removed from pre-mRNA leading to mature mRNA molecules, or transcripts, consisting of joined exons. The process of RNA splicing generates distinct transcript variants of the same gene, referred to as alternative transcript isoforms, the translation of which leads to distinct protein products. Thus, alternative splicing is a critical process that ensures protein diversity, with most of the multi-exon genes in humans generating multiple alternative transcript isoforms.

Alternative splicing affects human disease

Dysregulation of alternative splicing can have major functional consequences through the expression of abnormal isoforms that contribute to disease progression. Isoform switching, in which the most abundant transcript isoform differs between two conditions (e.g., cancer and normal cells), is a common mechanism. Recently, Sebestyén et al. [2] reported recurrent isoform switches in known tumor-driver genes (e.g., PPARG, MITF, and MYH11) across seven cancer types that resulted in altered gene function. Among many other examples, mutations causing aberrant splicing have been reported in muscular dystrophy [3] and cystic fibrosis [4].
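
The notion of an isoform switch can be made concrete with a minimal sketch. It assumes per-isoform abundance estimates (for example, TPM values) for a single gene in two conditions. The function names, isoform labels, and numbers are hypothetical and are not taken from Sebestyén et al., whose method is considerably more involved.

```python
def dominant_isoform(abundances):
    """Return the most abundant isoform; ties go to the alphabetically first name."""
    return max(sorted(abundances), key=lambda iso: abundances[iso])


def dominant_isoform_switch(cond_a, cond_b):
    """Return (isoform_a, isoform_b) if the dominant isoform differs, else None."""
    a = dominant_isoform(cond_a)
    b = dominant_isoform(cond_b)
    return (a, b) if a != b else None


# Hypothetical abundances for one gene in two conditions.
normal = {"iso1": 40.0, "iso2": 10.0}
tumor = {"iso1": 12.0, "iso2": 35.0}

switch = dominant_isoform_switch(normal, tumor)
print(switch)
```

A real analysis would also need to test whether the change is consistent across replicates rather than compare single abundance values.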

RNA-seq as an incredibly powerful method for splice junction discovery

RNA-seq has become the standard method for analyzing the transcriptome, the complete set of transcripts expressed in a given cell. The approach is commonly used to identify the diverse set of transcript types (e.g., mRNA, noncoding RNAs) and their isoform structures (splicing patterns); to quantify transcript-level expression and its changes under various experimental conditions; and to discover novel transcript isoforms or splice junctions. Care must be taken, however, because accurate alignment and quantification are difficult owing to the high similarity between some transcripts and genes [5]. Remarkably, Nellore et al. have re-analyzed over 21,500 public RNA-seq datasets, producing the most comprehensive catalogue of splice junctions to date and tracking the annotation of human RNA splicing over time [1].

Most common junctions are annotated but many rare junctions are not

Nellore et al. find that most reads that map to splice junctions map to junctions that are already known; specifically, in 10,090 of the 10,311 datasets that met the authors’ filtering criteria, over 95% of junction reads overlap junctions found in the existing annotation. However, although most splice junctions with high read coverage have been documented, a large number of splice junctions that occur across multiple samples have not. For example, in 3389 samples from the same set (n = 10,311), fewer than 80% of the observed junctions are annotated. In total, Nellore et al. report 56,865 novel junctions (18.6%) found in at least 1000 samples. Thus, comparing multiple independent studies can reveal many unannotated junctions.
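
The two kinds of summary described above (the share of junction reads that hit annotated junctions within a sample, and the unannotated junctions that recur across samples) can be sketched as follows. This is an illustration only, not the pipeline of Nellore et al.: the function names, the representation of a junction as a (chromosome, donor, acceptor) tuple, and the toy counts are all assumptions.

```python
from collections import Counter


def annotated_read_fraction(junction_reads, annotated):
    """Fraction of junction-spanning reads in one sample that hit annotated junctions."""
    total = sum(junction_reads.values())
    if total == 0:
        return 0.0
    hits = sum(n for junction, n in junction_reads.items() if junction in annotated)
    return hits / total


def recurrent_unannotated(samples, annotated, min_samples):
    """Junctions absent from the annotation but observed in at least min_samples samples."""
    seen_in = Counter(junction for reads in samples for junction in reads)
    return {j for j, n in seen_in.items() if n >= min_samples and j not in annotated}


# Toy data: each sample maps a junction to its read count (hypothetical values).
annotated = {("chr1", 100, 200), ("chr1", 300, 400)}
samples = [
    {("chr1", 100, 200): 95, ("chr1", 500, 600): 5},
    {("chr1", 100, 200): 48, ("chr1", 300, 400): 50, ("chr1", 500, 600): 2},
    {("chr1", 300, 400): 10, ("chr1", 700, 800): 1},
]

fractions = [annotated_read_fraction(s, annotated) for s in samples]
novel = recurrent_unannotated(samples, annotated, min_samples=2)
print(fractions, novel)
```

For scale, the counts quoted above correspond to roughly 97.9% (10,090 of 10,311) and 32.9% (3389 of 10,311) of the filtered datasets.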

Junction discovery power is influenced by read depth and length

Nellore et al. confirm that variation in unannotated junction expression across samples correlates strongly with both junction sequencing depth and read length. High read coverage across a splice junction provides stronger evidence that the junction is real and expressed, and increased read length allows a larger proportion of reads to be mapped across splice junctions. Thus, both parameters, read depth and read length, strongly influence junction discovery power.
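
Why read length matters can be illustrated with a simple geometric model. The model is an assumption of this sketch and not the analysis performed by Nellore et al.: reads of a fixed length start uniformly at random along a spliced transcript, and a read supports a junction only if at least `min_anchor` bases fall on each side of it. The function name and parameter values are hypothetical.

```python
def junction_spanning_fraction(read_len, transcript_len, junction_pos, min_anchor=8):
    """Fraction of possible read start positions that span the junction.

    junction_pos is the number of transcript bases upstream of the junction.
    A read starting at s covers [s, s + read_len) and needs min_anchor bases
    on both sides of the junction to count.
    """
    n_starts = transcript_len - read_len + 1
    if n_starts <= 0:
        return 0.0
    lo = max(0, junction_pos - read_len + min_anchor)
    hi = min(transcript_len - read_len, junction_pos - min_anchor)
    if hi < lo:
        return 0.0
    return (hi - lo + 1) / n_starts


# Hypothetical 2000-base transcript with one junction in the middle.
short_reads = junction_spanning_fraction(36, 2000, 1000)
long_reads = junction_spanning_fraction(100, 2000, 1000)
print(short_reads, long_reads)
```

Under this model the expected number of reads supporting a junction is the spanning fraction multiplied by the number of reads drawn from the transcript, which is how depth and length both enter discovery power.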

Most junctions have now been discovered…in human

From 2009 to 2013, splice junction discovery increased, with spikes of discovery mostly due to large-scale sequencing projects such as the Human Reference Epigenome Mapping Project [6] (with over 200,000 newly discovered junctions), followed by the ENCODE [7] and Illumina Body Map 2.0 projects. By 2013, splice junction discovery had reached a plateau, at which point 96.1% of annotated junctions had already been discovered. For example, the large-scale GEUVADIS [8] project contributed relatively few novel well-supported splice junctions from lymphoblastoid cell lines, as those cell lines had been well studied by that time.
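
A plateau of this kind can be tracked as the cumulative fraction of annotated junctions first observed by each year. The sketch below assumes a mapping from each junction to the year it was first seen in any sample. The function name, junction labels, and years are hypothetical and do not reproduce the figures reported by Nellore et al.

```python
def cumulative_discovery(first_seen_year, annotated):
    """Map each year to the fraction of annotated junctions first seen in or before it."""
    if not annotated:
        return {}
    years = sorted({y for j, y in first_seen_year.items() if j in annotated})
    curve = {}
    for year in years:
        found = sum(
            1 for j in annotated if first_seen_year.get(j, float("inf")) <= year
        )
        curve[year] = found / len(annotated)
    return curve


# Toy data: j4 is annotated but never observed; j5 is observed but unannotated.
annotated = {"j1", "j2", "j3", "j4"}
first_seen = {"j1": 2009, "j2": 2010, "j3": 2010, "j5": 2012}

curve = cumulative_discovery(first_seen, annotated)
print(curve)
```

A curve that flattens while large new projects are still being added suggests that most annotated junctions have already been observed.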

What this means for studies in other species

Accurate gene-level and transcript-level expression analyses often rely on the completeness of transcript and splice junction annotation, and research suffers if that annotation is incomplete. Unfortunately, such information is less complete for species other than human (beyond human and mouse, other animal genomes can lack up to 20 megabases of annotation [9]), and even for a species as well studied as human, it is now clear that the transcript annotation is not fully complete. The effort of Nellore et al. provides unprecedented insight into splice junction usage in humans through large-scale RNA-seq data analysis and further highlights the need for similar studies in other, less well-characterized species [10]. The data and resource provided by Nellore et al. will be important to anyone studying RNA in humans and will specifically improve our ability to study the effects of splice variation in human disease.

References

1.  Human splicing diversity and the extent of unannotated splice junctions across human RNA-seq samples on the Sequence Read Archive.

Authors:  Abhinav Nellore; Andrew E Jaffe; Jean-Philippe Fortin; José Alquicira-Hernández; Leonardo Collado-Torres; Siruo Wang; Robert A Phillips; Nishika Karbhari; Kasper D Hansen; Ben Langmead; Jeffrey T Leek
Journal:  Genome Biol       Date:  2016-12-30       Impact factor: 13.583

2.  Detection of recurrent alternative splicing switches in tumor samples reveals novel signatures of cancer.

Authors:  Endre Sebestyén; Michał Zawisza; Eduardo Eyras
Journal:  Nucleic Acids Res       Date:  2015-01-10       Impact factor: 16.971

3.  An exon skipping-associated nonsense mutation in the dystrophin gene uncovers a complex interplay between multiple antagonistic splicing elements.

Authors:  A Disset; C F Bourgeois; N Benmalek; M Claustres; J Stevenin; Sylvie Tuffery-Giraud
Journal:  Hum Mol Genet       Date:  2006-02-06       Impact factor: 6.150

4.  Nuclear factor TDP-43 binds to the polymorphic TG repeats in CFTR intron 8 and causes skipping of exon 9: a functional link with disease penetrance.

Authors:  Emanuele Buratti; Antonia Brindisi; Franco Pagani; Francisco E Baralle
Journal:  Am J Hum Genet       Date:  2004-06       Impact factor: 11.025

5.  Errors in RNA-Seq quantification affect genes of relevance to human disease.

Authors:  Christelle Robert; Mick Watson
Journal:  Genome Biol       Date:  2015-09-03       Impact factor: 13.583

6.  The NIH Roadmap Epigenomics Mapping Consortium.

Authors:  Bradley E Bernstein; John A Stamatoyannopoulos; Joseph F Costello; Bing Ren; Aleksandar Milosavljevic; Alexander Meissner; Manolis Kellis; Marco A Marra; Arthur L Beaudet; Joseph R Ecker; Peggy J Farnham; Martin Hirst; Eric S Lander; Tarjei S Mikkelsen; James A Thomson
Journal:  Nat Biotechnol       Date:  2010-10       Impact factor: 54.908

7.  An integrated encyclopedia of DNA elements in the human genome.

Authors:  ENCODE Project Consortium
Journal:  Nature       Date:  2012-09-06       Impact factor: 49.962

8.  Transcriptome and genome sequencing uncovers functional variation in humans.

Authors:  Tuuli Lappalainen; Michael Sammeth; Marc R Friedländer; Peter A C 't Hoen; Jean Monlong; Manuel A Rivas; Mar Gonzàlez-Porta; Natalja Kurbatova; Thasso Griebel; Pedro G Ferreira; Matthias Barann; Thomas Wieland; Liliana Greger; Maarten van Iterson; Jonas Almlöf; Paolo Ribeca; Irina Pulyakhina; Daniela Esser; Thomas Giger; Andrew Tikhonov; Marc Sultan; Gabrielle Bertier; Daniel G MacArthur; Monkol Lek; Esther Lizano; Henk P J Buermans; Ismael Padioleau; Thomas Schwarzmayr; Olof Karlberg; Halit Ongen; Helena Kilpinen; Sergi Beltran; Marta Gut; Katja Kahlem; Vyacheslav Amstislavskiy; Oliver Stegle; Matti Pirinen; Stephen B Montgomery; Peter Donnelly; Mark I McCarthy; Paul Flicek; Tim M Strom; Hans Lehrach; Stefan Schreiber; Ralf Sudbrak; Angel Carracedo; Stylianos E Antonarakis; Robert Häsler; Ann-Christine Syvänen; Gert-Jan van Ommen; Alvis Brazma; Thomas Meitinger; Philip Rosenstiel; Roderic Guigó; Ivo G Gut; Xavier Estivill; Emmanouil T Dermitzakis
Journal:  Nature       Date:  2013-09-15       Impact factor: 49.962

9.  Design and development of exome capture sequencing for the domestic pig (Sus scrofa).

Authors:  Christelle Robert; Pablo Fuentes-Utrilla; Karen Troup; Julia Loecherbach; Frances Turner; Richard Talbot; Alan L Archibald; Alan Mileham; Nader Deeb; David A Hume; Mick Watson
Journal:  BMC Genomics       Date:  2014-07-03       Impact factor: 3.969

10.  GO-FAANG meeting: a Gathering On Functional Annotation of Animal Genomes.

Authors:  Christopher K Tuggle; Elisabetta Giuffra; Stephen N White; Laura Clarke; Huaijun Zhou; Pablo J Ross; Hervé Acloque; James M Reecy; Alan Archibald; Rebecca R Bellone; Michèle Boichard; Amanda Chamberlain; Hans Cheng; Richard P M A Crooijmans; Mary E Delany; Carrie J Finno; Martien A M Groenen; Ben Hayes; Joan K Lunney; Jessica L Petersen; Graham S Plastow; Carl J Schmidt; Jiuzhou Song; Mick Watson
Journal:  Anim Genet       Date:  2016-07-24       Impact factor: 3.169
