
SpliceDisease database: linking RNA splicing and disease.

Juan Wang, Jie Zhang, Kaibo Li, Wei Zhao, Qinghua Cui.

Abstract

RNA splicing is an important aspect of gene regulation in many organisms. Splicing of RNA is regulated by complicated mechanisms involving numerous RNA-binding proteins and the intricate network of interactions among them. Mutations in cis-acting splicing elements or in their regulatory proteins have been shown to be involved in human diseases, and defects in the pre-mRNA splicing process have emerged as a common disease-causing mechanism. Therefore, a database integrating RNA splicing and disease associations would be helpful for understanding not only RNA splicing but also its contribution to disease. In the SpliceDisease database, we manually curated 2337 splicing mutation-disease entries involving 303 genes and 370 diseases, which are supported experimentally in 898 publications. The SpliceDisease database provides information including the nucleotide change in the sequence, the location of the mutation on the gene, the reference PubMed ID and a detailed description of the relationship among gene mutations, splicing defects and diseases. We standardized the names of the diseases and genes and provided links for these genes to NCBI and the UCSC Genome Browser for further annotation and genomic sequences. For the location of the mutation, each entry links directly to the respective position/region in the genome browser. Users can freely browse, search and download the data in SpliceDisease at http://cmbi.bjmu.edu.cn/sdisease.

Year:  2011        PMID: 22139928      PMCID: PMC3245055          DOI: 10.1093/nar/gkr1171

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Cells must regulate the expression of each gene at a specific level, time and place in order to fulfill specific tasks. Gene regulation is a ubiquitous phenomenon and is critical in every biological process (1). Mechanisms of gene regulation include the regulation of transcription, RNA processing and translation. In higher eukaryotes, pre-mRNA splicing plays an important role in gene regulation. The inclusion of different exons in mRNA, termed alternative splicing (AS), enables a single gene to produce multiple different mRNAs, which can be further translated into different proteins called splice variants (2,3). New high-throughput sequencing technology has revealed that >90% of human genes undergo AS, a much higher percentage than anticipated (4). Recent genome-wide analyses have also indicated that almost all primary transcripts from multi-exon human genes undergo alternative pre-mRNA splicing (5). RNA splicing therefore greatly increases the genomic complexity of higher eukaryotes (6). RNA splicing is tissue specific, and studies highlight differences in the types of AS that commonly occur in different tissues. For example, the frequencies of alternative 3′ splice site and alternative 5′ splice site usage are ∼50–100% higher in liver than in other investigated tissues (7). The importance of splicing is emphasized by its presence in species throughout the phylogenetic tree. Evolutionary studies, which have revealed the formation of de novo alternative exons and the evolution of exon–intron architecture, highlight the importance of AS in the diversification of the transcriptome, especially in humans (8). As stated above, RNA splicing is critical in many biological processes. Splicing of RNA is regulated by complicated mechanisms involving numerous RNA-binding proteins and the intricate network of interactions among them. Splicing in general, and AS in particular, can lead to disease if disrupted.
Therefore, mutations in cis-acting splicing elements, in the splicing machinery or in the regulatory proteins, any of which could compromise the accuracy of either constitutive or alternative splicing, would have a profound impact on human pathogenesis. Defects in pre-mRNA splicing have been shown to be a common disease-causing mechanism in several studies (9–11). As an example, a point mutation in exon 7 of the SMN2 gene leads to exon 7 skipping and a truncated protein, which reduces the effective rate of SMN protein production and causes motor neuron degenerative disease (12). Other studies indicate that trans-acting mutations affect RNA-dependent functions and cause disease (9,13). A number of bioinformatics resources for RNA splicing, including databases and tools, have been developed during the past decade (Table 1). For example, Human Splicing Finder is a tool that predicts the effects of mutations on splicing signals and can identify splicing motifs in human sequences (14). These resources have been of great help in the study and analysis of RNA splicing.
Table 1.

Databases and tools of splicing mutation and alternative splicing

Resource | Description | URL
HGMD (15) | The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease | www.hgmd.org
DBASS5 (16,17) | A database of aberrant 5′ splice sites | http://www.dbass.org.uk/
DBASS3 (17,18) | A database of aberrant 3′ splice sites | http://www.dbass.org.uk/
ASDB (19) | Database of alternatively spliced genes | http://cbcg.nersc.gov/asdb
ssSNPTarget (20) | A genome-wide splice-site Single Nucleotide Polymorphism database | http://ssSNPTarget.org
EuSplice (21) | A splice-centric database which provides reliable splice signal and AS information for 23 eukaryotes | http://66.170.16.154/EuSplice
AsMamDB (22) | An alternative splice database of mammals | http://166.111.30.65/ASMAMDB.html
Alternative Splicing Database (23) | An alternative splicing database based on publications | http://cgsigma.cshl.org/new_alt_exon_db2/
TassDB2 (24) | A database of subtle alternative splicing events | http://www.tassdb.info
ISIS (25) | An intron information system | http://isis.bit.uq.edu.au/
ASPicDB (26) | A database of annotated transcript and protein variants generated by alternative splicing | http://www.caspur.it/ASPicDB/
STEPs (27) | A database of splice translational efficiency polymorphisms | http://dbstep.genes.org.uk/
Human Splicing Finder (14) | A tool to predict the effects of mutations on splicing signals or to identify splicing motifs in any human sequence | http://www.umd.be/HSF/
SpliceMiner (28) | A high-throughput database implementation of the NCBI Evidence Viewer for microarray splice variant analysis | http://discover.nci.nih.gov/spliceminer
Intronerator (29) | Exploring introns and alternative splicing in Caenorhabditis elegans | http://www.cse.ucsc.edu/~kent/intronerator
WebScipio (30) | Tool for predicting mutually exclusive spliced exons based on exon length, splice site and reading frame conservation, and exon sequence homology | http://www.webscipio.org
IsoEM (31) | Tool for the estimation of alternative splicing isoform frequencies from RNA-Seq data | http://dna.engr.uconn.edu/software/IsoEM/
MAISTAS (32) | A tool for automatic structural evaluation of alternative splicing products | http://maistas.bioinformatica.crs4.it/
HMMSplicer (33) | A tool for efficient and sensitive discovery of known and novel splice junctions in RNA-Seq data | http://derisilab.ucsf.edu/software/hmmsplicer
SFmap (34) | A web server for motif analysis and prediction of splicing factor binding sites | http://sfmap.technion.ac.il
The evidence above shows the growing importance of connecting RNA splicing with disease. For this reason, a high-quality database linking RNA splicing and splicing mutations with disease is urgently needed and would be of great help in the study of both RNA splicing and disease. Although the Human Gene Mutation Database (HGMD, http://www.hgmd.org/) integrates this kind of data, there are major differences between HGMD and the SpliceDisease database (15). Firstly, HGMD is not free and provides only a ‘search’ function to registered users for a limited number of days. Secondly, for splicing mutations, HGMD provides information only on point mutations in intronic sequence. Thirdly, HGMD does not provide detailed descriptions of the relationship among gene mutations, splicing defects and diseases. In contrast, SpliceDisease is a free and comprehensive database containing both cis-acting splicing sequence mutations and trans-acting splicing mutations that cause disease. SpliceDisease integrates detailed descriptions of the relationship among gene mutation, splicing defect and disease. It also provides direct links to Entrez Gene, to the genome browser, to the respective location of the mutation on the gene and to PubMed for each publication. At present, the SpliceDisease database is at an early stage; it will be a valuable ongoing resource for the study of RNA splicing and disease.

DATA SOURCES AND IMPLEMENTATION

Literature on RNA splicing and disease was acquired by searching PubMed with the keywords ‘splice’, ‘splicing’ and ‘spliced’. Publications with titles including ‘mutation spectrum’, ‘mutational spectrum’, ‘mutation analysis’ and ‘mutation screening’ were also obtained. We then curated the data manually and extracted the association between RNA splicing, splicing mutations in the gene and the disease of interest. The data were double-checked by different curators. We standardized the disease names and gene names based on the NLM MeSH Browser and Entrez Gene. Each gene was linked to NCBI for comprehensive annotation and to the UCSC Genome Browser for genomic sequence. The mutations were annotated as well, including the nucleotide change and its location on the sequence. We used the nomenclature for the description of sequence variants and exon/intron numbering according to den Dunnen and Antonarakis (35). For example, ‘c.’ denotes a cDNA sequence, ‘IVS’ denotes an intron sequence and substitutions are designated by a ‘>’ character. Each entry also links directly to the respective position of the mutation. In the sequence file, the intron/exon containing the mutation is highlighted in yellow and the specific nucleotide is marked in red. PubMed IDs and hyperlinks to PubMed are provided for each publication. More importantly, we curated a detailed description of the relationship among gene mutation, splicing defect and disease. As a result, we manually curated 2337 splicing mutation-disease entries covering 303 genes and 370 diseases from 898 publications. Of the 2337 entries, ∼89% are point mutations (Figure 1A), among which >50% are mutations between G and A (36.5% G > A and 14.6% A > G) (Figure 1B).
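The nomenclature conventions mentioned above (‘c.’, ‘IVS’ and ‘>’) can be illustrated with a small parser. This is only a sketch under stated assumptions: it handles single-nucleotide substitutions written in the style c.76A>C or IVS2+1G>T, and the names `Substitution` and `parse_substitution` are hypothetical, not part of SpliceDisease. The exact strings stored in the database may differ.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    region: str            # "cDNA" for 'c.' notation, "intron" for 'IVS' notation
    intron: Optional[int]  # intron number for IVS notation, otherwise None
    position: str          # position exactly as written, e.g. "76" or "+1"
    ref: str               # original nucleotide
    alt: str               # substituted nucleotide


# Assumed patterns for single-nucleotide substitutions only.
_CDNA = re.compile(r"^c\.(-?\d+(?:[+-]\d+)?)([ACGT])>([ACGT])$")
_IVS = re.compile(r"^IVS(\d+)([+-]\d+)([ACGT])>([ACGT])$")


def parse_substitution(text: str) -> Optional[Substitution]:
    """Parse a substitution string; return None if it is not recognized."""
    text = text.strip().replace(" ", "")  # tolerate "G > A" style spacing
    m = _CDNA.match(text)
    if m:
        return Substitution("cDNA", None, m.group(1), m.group(2), m.group(3))
    m = _IVS.match(text)
    if m:
        return Substitution("intron", int(m.group(1)), m.group(2),
                            m.group(3), m.group(4))
    return None


if __name__ == "__main__":
    for example in ("c.76A>C", "IVS2+1G>T", "c.76_78del"):
        print(example, "->", parse_substitution(example))
```

Insertions, deletions and other mutation types are deliberately left unrecognized here; a fuller parser would need the complete nomenclature recommendations.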
Figure 1.

Distribution of mutation types and of point mutation types in the SpliceDisease database. (A) Splicing mutation type: point, point mutation; ins, insertion mutation; del, deletion mutation; other, other types. (B) The histogram shows the proportion of each nucleotide substitution among all point mutations.
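The proportions summarized in Figure 1B amount to a simple tally of substitution types (in the reported data, 36.5% G > A plus 14.6% A > G gives 51.1% of point mutations). A minimal sketch of such a tally follows; the function name `substitution_proportions` is hypothetical and the toy input is invented for illustration, not taken from the database.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def substitution_proportions(
    substitutions: Iterable[Tuple[str, str]]
) -> Dict[str, float]:
    """Return the fraction of each 'ref>alt' substitution type."""
    counts = Counter(f"{ref}>{alt}" for ref, alt in substitutions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}


if __name__ == "__main__":
    # Invented toy data: 4 G>A, 2 A>G and 2 C>T substitutions.
    toy = [("G", "A")] * 4 + [("A", "G")] * 2 + [("C", "T")] * 2
    for kind, fraction in sorted(substitution_proportions(toy).items()):
        print(f"{kind}: {fraction:.1%}")
```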

All data were organized in the SpliceDisease database using PostgreSQL 9.0, an open-source relational database management system. The website is built with JSP and Java and served by Apache Tomcat 7.0, and is available at http://cmbi.bjmu.edu.cn/sdisease/.

USING SpliceDisease

SpliceDisease is designed to be user-friendly. The homepage provides an organized entry point to all data. The top banner of the homepage has tabs for ‘Browser’, ‘Search’, ‘Submit’, ‘Download’ and ‘Help’. To find an entry, a user can either select the disease or gene of interest through ‘Browser’ or use ‘Search’, which supports fuzzy queries. The results page contains nine items: disease name, gene symbol, gene Entrez ID (linked to the NCBI Gene database), chromosome location of the genomic sequence (linked to the UCSC Genome Browser), mutation, mutation location (linked directly to the respective position of the mutation in the genome browser), organism, description and reference (linked to the PubMed database) (Figure 2).
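The nine result items and their outgoing links can be pictured as a simple record with link builders. The sketch below is an assumption-laden illustration, not the site's implementation: the `Entry` class, its field names, the `hg19` assembly default and the placeholder values are all hypothetical, the chromosome location is split into three fields for convenience, and the URL patterns are the current public NCBI Gene, UCSC Genome Browser and PubMed forms, which may differ from those the site used.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Entry:
    disease: str
    gene_symbol: str
    entrez_gene_id: int
    chrom: str              # chromosome location, split into chrom/start/end
    start: int
    end: int
    mutation: str
    mutation_location: str
    organism: str
    description: str
    pmid: int


def ncbi_gene_url(entry: Entry) -> str:
    """Link to the NCBI Gene record for the entry's gene."""
    return f"https://www.ncbi.nlm.nih.gov/gene/{entry.entrez_gene_id}"


def ucsc_url(entry: Entry, db: str = "hg19") -> str:
    """Link to the UCSC Genome Browser at the entry's region (assembly assumed)."""
    query = urlencode({"db": db,
                       "position": f"{entry.chrom}:{entry.start}-{entry.end}"})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


def pubmed_url(entry: Entry) -> str:
    """Link to the PubMed record of the supporting publication."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{entry.pmid}/"


if __name__ == "__main__":
    # Placeholder values only; this is not a real SpliceDisease entry.
    example = Entry("example disease", "GENE1", 12345, "chr1", 1000, 2000,
                    "c.76A>C", "exon 2", "Homo sapiens",
                    "placeholder description", 22139928)
    print(ncbi_gene_url(example))
    print(ucsc_url(example))
    print(pubmed_url(example))
```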
Figure 2.

SpliceDisease results page. (A) After a user runs a search, the result summary page lists the nine items for each entry. (B) The direct link from an entry to the respective position of the mutation. Exon sequence is shown in upper case and intron sequence in lower case, with one FASTA record per region (exon or intron) in the sequence file. The intron/exon containing the mutation is highlighted in yellow and the specific nucleotide is marked in red.
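The sequence-file convention described in the caption (one FASTA record per region, exons in upper case, introns in lower case) could be produced along the following lines. This is a sketch only: the function name `region_fasta`, the record names and the 60-character line width are assumptions, and the yellow/red highlighting, which belongs to the web page rendering, is not reproduced.

```python
def region_fasta(name: str, kind: str, sequence: str, width: int = 60) -> str:
    """Format one region as a FASTA record: exons upper case, introns lower case."""
    if kind not in ("exon", "intron"):
        raise ValueError(f"unknown region kind: {kind!r}")
    seq = sequence.upper() if kind == "exon" else sequence.lower()
    # Wrap the sequence at a fixed width, as is conventional for FASTA.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">{name}"] + lines)


if __name__ == "__main__":
    # Invented toy sequences, one record per region.
    regions = [("exon1", "exon", "atggcc"), ("intron1", "intron", "GTAAGT")]
    print("\n".join(region_fasta(n, k, s) for n, k, s in regions))
```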

All data in SpliceDisease can be downloaded as a CSV file. These data will facilitate the study of splicing mutational mechanisms, improve understanding of RNA biology and help to discover new therapeutic targets.
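Once downloaded, a CSV export can be explored with standard tooling. The following sketch groups entries by gene; the column names (`gene_symbol`, `disease`, `mutation`) and the sample rows are hypothetical, since the actual header of the SpliceDisease download is not described here, so the `gene_column` argument would need to be adjusted to the real file.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def entries_by_gene(csv_text: str,
                    gene_column: str = "gene_symbol") -> Dict[str, List[dict]]:
    """Group CSV rows by the value of the (assumed) gene column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for row in reader:
        grouped[row[gene_column]].append(row)
    return dict(grouped)


if __name__ == "__main__":
    # Invented sample with a hypothetical header; not real database content.
    sample = (
        "gene_symbol,disease,mutation\n"
        "GENE1,example disease A,c.76A>C\n"
        "GENE1,example disease A,IVS2+1G>T\n"
        "GENE2,example disease B,c.5G>A\n"
    )
    for gene, rows in entries_by_gene(sample).items():
        print(gene, len(rows))
```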

FUTURE EXTENSIONS

The SpliceDisease database is the first step of the project, and further extensions will be developed. As described earlier, a number of bioinformatics resources for RNA splicing have already been developed, and we plan to integrate some of these related resources in the near future. We will also incorporate expression data for different mRNA isoforms. As data accumulate, we will add more trans-acting splicing mutations that cause disease. Finally, SpliceDisease will be continuously updated.

FUNDING

Funding for open access charge: National Natural Science Foundation of China (Grant No. 81001481).

Conflict of interest statement. None declared.
  35 in total

1.  ASDB: database of alternatively spliced genes.

Authors:  I Dralyuk; M Brudno; M S Gelfand; M Zorn; I Dubchak
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  A genomic view of alternative splicing.

Authors:  Barmak Modrek; Christopher Lee
Journal:  Nat Genet       Date:  2002-01       Impact factor: 38.330

Review 3.  Pre-mRNA splicing and human disease.

Authors:  Nuno André Faustino; Thomas A Cooper
Journal:  Genes Dev       Date:  2003-02-15       Impact factor: 11.361

4.  ssSNPTarget: genome-wide splice-site Single Nucleotide Polymorphism database.

Authors:  Jin Ok Yang; Woo-Yeon Kim; Jong Bhak
Journal:  Hum Mutat       Date:  2009-12       Impact factor: 4.878

Review 5.  Alternative splicing and disease.

Authors:  Jamal Tazi; Nadia Bakkour; Stefan Stamm
Journal:  Biochim Biophys Acta       Date:  2008-10-17

6.  Genome-wide data-mining of candidate human splice translational efficiency polymorphisms (STEPs) and an online database.

Authors:  Christopher A Raistrick; Ian N M Day; Tom R Gaunt
Journal:  PLoS One       Date:  2010-10-11       Impact factor: 3.240

7.  HMMSplicer: a tool for efficient and sensitive discovery of known and novel splice junctions in RNA-Seq data.

Authors:  Michelle T Dimon; Katherine Sorber; Joseph L DeRisi
Journal:  PLoS One       Date:  2010-11-08       Impact factor: 3.240

8.  The Human Gene Mutation Database (HGMD) and its exploitation in the study of mutational mechanisms.

Authors:  David N Cooper; Peter D Stenson; Nadia A Chuzhanova
Journal:  Curr Protoc Bioinformatics       Date:  2006-01

9.  Predicting mutually exclusive spliced exons based on exon length, splice site and reading frame conservation, and exon sequence homology.

Authors:  Holger Pillmann; Klas Hatje; Florian Odronitz; Björn Hammesfahr; Martin Kollmar
Journal:  BMC Bioinformatics       Date:  2011-06-30       Impact factor: 3.169

10.  SpliceMiner: a high-throughput database implementation of the NCBI Evidence Viewer for microarray splice variant analysis.

Authors:  Ari B Kahn; Michael C Ryan; Hongfang Liu; Barry R Zeeberg; D Curtis Jamison; John N Weinstein
Journal:  BMC Bioinformatics       Date:  2007-03-05       Impact factor: 3.169
