
ASPicDB: a database of annotated transcript and protein variants generated by alternative splicing.

Pier L Martelli¹, Mattia D'Antonio, Paola Bonizzoni, Tiziana Castrignanò, Anna M D'Erchia, Paolo D'Onorio De Meo, Piero Fariselli, Michele Finelli, Flavio Licciulli, Marina Mangiulli, Flavio Mignone, Giulio Pavesi, Ernesto Picardi, Raffaella Rizzi, Ivan Rossi, Alessio Valletti, Andrea Zauli, Federico Zambelli, Rita Casadio, Graziano Pesole.

Abstract

Alternative splicing is emerging as a major mechanism for the expansion of transcriptome and proteome diversity, particularly in human and other vertebrates. However, the proportion of alternative transcripts and proteins actually endowed with functional activity is currently highly debated. We present here a new release of ASPicDB which now provides a unique annotation resource of human protein variants generated by alternative splicing. A total of 256,939 protein variants from 17,191 multi-exon genes have been extensively annotated through state-of-the-art machine learning tools providing information on the protein type (globular and transmembrane), localization, presence of PFAM domains, signal peptides, GPI-anchor propeptides, transmembrane and coiled-coil segments. Furthermore, full-length variants can now be specifically selected based on the annotation of CAGE-tags and polyA signal and/or polyA sites, marking transcription initiation and termination sites, respectively. The retrieval can be carried out at gene, transcript, exon, protein or splice site level, allowing the selection of data sets fulfilling one or more features set by the user. The retrieval interface also enables the selection of protein variants showing specific differences in the annotated features. ASPicDB is available at http://www.caspur.it/ASPicDB/.

Year: 2010 PMID: 21051348 PMCID: PMC3013677 DOI: 10.1093/nar/gkq1073

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Alternative splicing is a well characterized mechanism which, coupled with alternative initiation and termination of transcription (1), may expand the transcriptome and proteome complexity in human and other organisms by over one order of magnitude with respect to the number of annotated genes (2,3). In particular, it is now widely demonstrated that virtually all multi-exon genes may generate multiple transcripts and protein variants (3,4) and that the splicing process is tightly regulated in different physiological conditions, tissues or developmental stages (5). Furthermore, alterations of the splicing process can be observed in several genetic diseases and in cancer (6–10). The huge amount of EST sequences (11) together with the relevant reference genome sequence has been used to carry out an extensive analysis of alternative splicing in human through the ASPIC algorithm (12–14). The alternative splicing pattern of human multi-exon genes, determined by ASPIC, has been collected in ASPicDB, a database resource which presents some unique features with respect to other similar databases (15). The ASPIC algorithm implements an optimization strategy that, performing a multiple alignment of all available transcript data (including full-length cDNA and EST sequences) to the relevant genome sequence, detects the set of introns that minimizes the number of splicing sites. It also generates through a directed-acyclic graph combinatorial procedure the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events (14). The reliability of splicing isoforms detected by ASPIC has been recently established through a comparative assessment (16). 
The advent of massive transcriptome sequence data generated by RNA-Seq (17) is steadily increasing the number of validated splicing sites and isoforms in human and other organisms, thus suggesting that a fraction of alternative splicing events are the result of background noise in the splicing process (18), which generates non-functional isoforms expressed at low levels. Therefore, extensive research efforts are required to distinguish functional species-specific variants from non-functional ones originating from neutral drift in the splicing process, as well as to assess the biological role of functional isoforms. The annotation of the protein variants predicted with ASPIC is an essential step for exploring the functional and structural diversity of the proteins originating from the same gene by means of alternative splicing and therefore for unraveling the complex physiological effects of alternative splicing events (19). Indeed, currently available databases, such as ASD (20), ASAP II (21), ASTALAVISTA (22) and H-DBAS (23), mostly collect information on alternative transcripts at the mRNA level, without considering the effect of alternative splicing on protein structure and function. The ProSAS (24) database contains structural information derived from a comparative modeling procedure, but due to the limitations of the modeling techniques, only ∼15% of the human transcripts are endowed with a reliable protein structure prediction. ASPicDB aims at filling the gap in structural and functional annotation of protein splicing variants by adopting a set of analysis and prediction tools that do not rely only on annotation transfer by sequence similarity. 
It provides a thorough computational annotation of predicted human protein variants including PFAM domains (25), N-terminal signal peptides, GPI-anchor propeptides, transmembrane domains, subcellular localization and other features, also reporting the relevant crosslinks to UniprotKB/Swissprot (26) and PDB databases (27). A comprehensive annotation of the domain architecture and other structural features could also be extremely useful to critically assess the reliability of the functional classification provided by the GO System (25), which still neglects much of the relevant information for alternative splicing products. In addition, in consideration of the fragmented nature of the available transcript data, the new version of ASPicDB includes the annotation of CAGE tags (28) in order to identify true transcription initiation sites and discriminate between full-length isoforms using alternative transcription initiation sites and 5′-partial transcripts for which a full-length CDS and the encoded protein cannot be reliably predicted.

ANNOTATION PIPELINE OF HUMAN PROTEIN VARIANTS

The computational pipeline implemented for supplementing the ASPicDB protein sequences with functional and structural annotations is represented in Figure 1 and integrates several state-of-the-art tools for similarity search and for machine-learning based prediction of protein features starting from residue sequence.

Figure 1.

Pipeline for the annotation of alternative transcripts.

For each of the 256 939 protein variants from 17 191 human genes, a first layer of annotation consists of the retrieval of similar sequences from the two major repositories containing well-characterized proteins, namely: (i) the UniProtKB/SwissProt database (26) (rel. 2010_07, June 2010), which contains 547 011 protein sequences with curated annotations, including 517 802 principal entries and 29 209 splicing variants (UniProt Consortium, 2010); and (ii) the Protein Data Bank (rel. July 2010), which contains resolved three-dimensional structures for 50 171 different protein sequences (29). Similarity searches were performed with BLAST (30), setting the E-value threshold to 10⁻³. A second layer of annotation is obtained by mapping the structural and functional domains collected in the PFAM-A database (rel. 24.0, October 2009), which contains curated multiple sequence alignments based on hidden Markov models (HMM) for 8691 families, 2985 domains, 162 repeats and 74 motifs (25). The PFAM models were mapped on the ASPicDB protein sequences by means of the pfam_scan.pl program (ftp://ftp.sanger.ac.uk/pub/databases/Pfam/Tools/), based on HMMER3.0 (31). The third layer of annotation results from the integration of several predictors based on machine learning tools, such as neural networks, hidden Markov models, support vector machines and conditional random fields. Since most of the methods take advantage of the evolutionary information encoded in sequence profiles, we compiled the profiles from the similar sequences retrieved with two PSI-BLAST iterations (setting the E-value threshold to 10⁻³) from the UniRef90 data set consisting of 6 955 504 sequences (July 2010). The first predicted features are the presence of N-terminal signal peptides and of C-terminal GPI-anchor propeptides, predicted with SPEPlip (32) and PredGPI (33), respectively. 
Both methods are among the best available predictors, with reported accuracies of 95% for SPEPlip and 88% for PredGPI. When present, the signal peptide and the propeptide are cleaved from the protein sequence. The presence of coiled-coil domains is predicted with CCHMM-PROF, which is able to locate coiled-coil segments in protein sequences with 80% accuracy (34). α-Helical transmembrane domains are then predicted with ENSEMBLE (35), which discriminates transmembrane from globular proteins with false positive and false negative rates both equal to 3%. The same tool is adopted for predicting the number and the position of transmembrane segments along the sequence, with a per-protein accuracy of 90%. The subcellular localization of globular proteins is predicted with BaCelLo (36), which discriminates four localizations in animals (secretory pathway, cytoplasm, nucleus and mitochondrion) with 74% accuracy.
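The order of the machine-learning steps described above can be summarized in a short sketch. This is only an illustration of the control flow, not the actual pipeline code: the four predictor arguments are hypothetical stand-ins for SPEPlip, PredGPI, ENSEMBLE and BaCelLo, and passing the cleaved (mature) sequence to the downstream predictors is an assumption. The 50-residue cutoff is the one quoted in the results for proteins that were not predicted.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MIN_LENGTH = 50  # shorter variants are left unpredicted (cutoff quoted in the results)


@dataclass
class Annotation:
    signal_peptide: bool = False
    gpi_anchor: bool = False
    protein_type: Optional[str] = None   # "transmembrane" or "globular"
    localization: Optional[str] = None   # only assigned to globular proteins


def annotate(seq: str,
             signal_len: Callable[[str], int],         # hypothetical stand-in for SPEPlip
             gpi_len: Callable[[str], int],            # hypothetical stand-in for PredGPI
             is_transmembrane: Callable[[str], bool],  # hypothetical stand-in for ENSEMBLE
             localize: Callable[[str], str]            # hypothetical stand-in for BaCelLo
             ) -> Annotation:
    ann = Annotation()
    if len(seq) < MIN_LENGTH:
        return ann  # too short: no prediction is attempted

    n = signal_len(seq)  # length of the N-terminal signal peptide, 0 if absent
    c = gpi_len(seq)     # length of the C-terminal GPI-anchor propeptide, 0 if absent
    ann.signal_peptide = n > 0
    ann.gpi_anchor = c > 0

    # Cleave both peptides. Assumption: downstream tools see the mature sequence.
    mature = seq[n:len(seq) - c]

    if is_transmembrane(mature):
        ann.protein_type = "transmembrane"
    else:
        ann.protein_type = "globular"
        ann.localization = localize(mature)
    return ann
```

Passing the predictors as callables keeps the sketch independent of how each tool is actually invoked, which the text does not specify.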

ASPicDB CONTENT AND ANNOTATION OF PROTEIN VARIANTS

Table 1 reports some statistics on the data contained in the current version of ASPicDB (version 2.0, August 2010) which refers only to human multi-exon genes annotated in NCBI Entrez Gene (37) with at least one RefSeq transcript (38) and the relevant Unigene cluster (39) collecting all available gene-specific cDNA and EST sequences.

Table 1.

Statistics of the ASPicDB content (v2.0, August 2010)

	ASPicDB v2.0
Genes	17 191
Transcripts	319 092
Proteins	256 939
Exons	390 886
Splicing sites	351 345
U2	302 164
U12	1712
Splicing events	233 717

The number of splicing sites belonging to the U2 or U12 class and of splicing events is also reported.

In the current version of ASPicDB some more features are available, including the annotation of the CAGE tags (28), which define true transcription initiation sites, and a comprehensive protein annotation. A total of 12 789 394 CAGE tags have been mapped, thus supporting constitutive or alternative transcription start sites. A ‘unique identifier’ (16) has been associated with each transcript variant in order to allow unambiguous comparison with alternative transcripts collected in other databases. All alternative proteins collected in ASPicDB have been compared with the UniprotKB/SwissProt (26) and PDB (29) databases. The results of similarity searches are reported in Table 2. Only 17% of the ASPicDB protein sequences are identical to proteins deposited in the UniProtKB/SwissProt database. However, 93% of the sequences share significant similarity with proteins annotated in the same database, suggesting the possibility of a reliable annotation transfer. Moreover, 54% of ASPicDB sequences are similar to proteins deposited in the PDB, suggesting that their structures can be modeled, at least partially.

Table 2.

Annotation of human variants upon similarity and PFAM searches

Sequence repository	No. of proteins (%)^a	No. of genes (%)^a
UniProtKB/SwissProt
E-value < 10⁻³	239 814 (93)	17 054 (99)
Identical	42 601 (17)	13 043 (76)
PDB
E-value < 10⁻³	137 528 (54)	11 062 (64)
Identical	1079 (0.4)	316 (2)
PFAM
All matches, E-value < 10⁻⁵	183 483 (71)	14 205 (83)
Complete matches, E-value < 10⁻⁵	46 630 (18)	5621 (33)

^a The percentages are computed with respect to 256 939 protein variants and 17 191 genes.

A considerable number of PFAM models map onto the ASPicDB sequences (Table 2). Overall, 71% of sequences match at least one model. This result is in agreement with the reported sequence coverage of the human proteome by the current PFAM release, which is equal to 72.5% (25). It is worth noting that, although all the models map with an E-value < 10⁻⁵, only 20% of the matches are complete (that is, involve the whole model). Caution is therefore necessary when inferring features from partial matches, and the actual extent of the match has to be evaluated for each instance. Table 3 summarizes the results of the annotation process performed with machine learning-based predictors. Two percent of proteins were not predicted since they are shorter than 50 residues, 16% of proteins are predicted as transmembrane and 82% are predicted as globular. The globular proteins are predicted as secreted (12% of all protein variants), cytoplasmic (35%), nuclear (27%) or mitochondrial (8%). Signal peptides and GPI-anchor propeptides are predicted in 12% and 0.7% of the sequences, respectively. Coiled-coil domains are predicted in 1.3% of the proteins. At the gene level, 32% and 90% of genes encode transmembrane and globular proteins, respectively (Table 3). Since the sum exceeds 100%, it follows that 22% of the genes encode both globular and transmembrane variants. The same consideration holds for the other annotations, as reported in Table 4. The proportion of genes predicted to encode proteins with different subcellular localizations reaches 56%. This is partially explained by the fact that BaCelLo has an accuracy of 74%, which is the lowest among the methods included in the pipeline. 
Indeed the discrimination between the ‘cytoplasmic’ and the ‘nuclear’ classes is still a difficult task for all subcellular localization predictors (40). When the two classes are merged together, the BaCelLo accuracy increases up to 91%, but the rate of genes encoding for proteins with different localizations is still as high as 44%, suggesting that localization diversity is inherent in the ASPicDB protein variants. The structure of PFAM annotations is also highly variable: 38% of genes encode for variants matching with different number and/or type of PFAM models. Altogether, results listed in Table 4 suggest that alternative transcripts can encode for proteins endowed with different structural and functional features. ASPicDB provides a unique resource reporting the annotation of alternative splicing variants at the protein level and an interface enabling the discovery of such differences.
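The gene-level argument (globular and transmembrane percentages summing to more than 100%) can be checked with a few lines of arithmetic, taking the gene counts from Tables 3 and 4. The sketch below is only a consistency check; the helper name `pct` is illustrative.

```python
N_GENES = 17191              # total multi-exon genes
globular_genes = 15513       # Table 3: genes with at least one globular variant
transmembrane_genes = 5439   # Table 3: genes with at least one transmembrane variant
both_reported = 3817         # Table 4: genes differing in type (globular/transmembrane)


def pct(count: int, total: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Inclusion-exclusion: |G and T| = |G| + |T| - |G or T|.
# Since |G or T| cannot exceed the number of genes, this gives a lower bound.
min_both = globular_genes + transmembrane_genes - N_GENES

print(pct(globular_genes, N_GENES), pct(transmembrane_genes, N_GENES))  # 90 32
print(min_both, pct(min_both, N_GENES))                                 # 3761 22
print(both_reported >= min_both, pct(both_reported, N_GENES))           # True 22
```

The reported count of 3817 genes is slightly above the lower bound of 3761, which is what one would expect if a small number of genes have no variant with a type prediction; both values round to 22%.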

Table 3.

Machine learning-based prediction of the human proteins deposited in ASPicDB

Annotation	No. of proteins (%)^a	No. of genes (%)^a
Type
Globular	210 608 (82)	15 513 (90)
Transmembrane	41 561 (16)	5439 (32)
Localization (globular proteins)
Secretory pathway	31 917 (12)	7348 (43)
Cytoplasm	90 046 (35)	10 327 (60)
Nucleus	69 167 (27)	8183 (48)
Mitochondrion	19 478 (8)	4698 (27)
Domains
Signal peptide	30 508 (12)	5153 (30)
GPI-anchor propeptide	1673 (0.7)	629 (4)
Coiled-coil segments	3423 (1.3)	497 (2.8)

^a The percentages are computed with respect to 256 939 protein variants and 17 191 genes.

Table 4.

Differences among alternative proteins encoded by the same human gene

Annotation	No. of genes (%)^a
Type (globular/transmembrane)	3817 (22)
Subcellular localization (globular proteins)	9593 (56)
Presence of signal peptide	3939 (23)
Presence of GPI-anchor propeptide	591 (3.4)
Presence of coiled-coil domains	464 (2.7)
Number of transmembrane helices	2140 (12)
PFAM models (all matches)	6575 (38)

^a The percentages are computed with respect to 17 191 genes.


ASPicDB RETRIEVAL INTERFACE

ASPicDB can be accessed through simple or advanced query forms. The simple query form allows the user to obtain the splicing pattern of one or more genes selected according to several criteria (e.g. HGNC name, RefSeq or Unigene accession IDs, etc.). The advanced query form allows the user to search for (i) genes; (ii) transcripts; (iii) exons; (iv) splicing sites; and (v) proteins, fulfilling different criteria (e.g. exons in a given length range, etc.). Depending on the choice, separate query forms appear. The ‘gene’, ‘transcript’ and ‘splicing sites’ query forms have been described previously (15), whereas the ‘exon’ and ‘protein’ query forms are novel features of this version of ASPicDB. The ‘exon’ query form allows the user to select exons in a given length range, belonging to a specific type (initial, internal or terminal), flanked by specific splicing sites or associated with one or more Affymetrix ExonArray probeset IDs. The ‘protein’ query form allows the retrieval of transcripts encoding protein isoforms of a specific class (e.g. globular or transmembrane), with a specific subcellular localization (e.g. mitochondrion, nucleus, secretory, cytoplasm) or containing one or more features, including occurrence and number of PFAM or transmembrane domains, GPI-anchor propeptides and signal peptides. Finally, it is also possible to retrieve genes encoding alternative proteins that show differences in the above-mentioned features.
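The last type of query, retrieving genes whose protein variants differ in an annotated feature, amounts to grouping variants by gene and checking whether the feature takes more than one value. The following is a minimal sketch of that logic on made-up records; the gene names, tuple layout and function name are hypothetical and do not reflect the ASPicDB schema or its query implementation.

```python
from collections import defaultdict

# Hypothetical records: (gene, protein_id, protein_type, localization).
# Localization is None for transmembrane proteins, which are not localized.
variants = [
    ("GENE_A", "PR1", "globular", "nucleus"),
    ("GENE_A", "PR2", "globular", "cytoplasm"),
    ("GENE_B", "PR1", "transmembrane", None),
    ("GENE_B", "PR2", "transmembrane", None),
    ("GENE_C", "PR1", "globular", "cytoplasm"),
    ("GENE_C", "PR2", "transmembrane", None),
]


def genes_with_differences(records, field_index):
    """Return the genes whose variants show more than one value for a feature."""
    values = defaultdict(set)
    for rec in records:
        value = rec[field_index]
        if value is None:
            continue  # feature not applicable to this variant
        values[rec[0]].add(value)
    return sorted(gene for gene, seen in values.items() if len(seen) > 1)


print(genes_with_differences(variants, 2))  # ['GENE_C']  (differs in protein type)
print(genes_with_differences(variants, 3))  # ['GENE_A']  (differs in localization)
```

Counting the genes returned for each feature, over all genes, would give figures of the kind listed in Table 4.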

ASPicDB OUTPUT

After a simple or advanced query has been submitted, the output for each selected gene is shown, organized in eight panels. Gene information reports a summary of the genomic and transcript data used by ASPIC to generate the prediction, downloadable by the user, and links to other popular prediction programs such as ASAP2 (21), ASD (20) and ACEVIEW (41) as well as to ASPIC results for orthologous genes in other species. Gene structure view provides a schematic graphical view of the gene structure including all predicted exons/introns. Predicted transcripts shows a graphical representation of the assembled transcripts with predicted annotations of 5′-UTR, CDS and 3′-UTR, CAGE tag mapping, Premature Termination Codons (PTC) and polyA sites. Transcript table lists the details of all predicted alternative transcripts including their length, number of exons and presence of a protein coding sequence. The ‘variant type’ column lists all the alternative splicing events using a RefSeq mRNA as the reference transcript. The transcript signature is also reported, which consists of a unique ID for alternatively spliced variants generated according to (16). Predicted proteins shows a graphical representation of the encoded proteins with matching domains (Figure 2). For each mapped domain the sequence coordinates are reported and different symbols indicate whether the mapping involves the complete domain or only a part of it.

Figure 2.

‘Predicted proteins’ panel for gene CSMD3 (CUB and Sushi multiple domains 3). The gene is predicted to encode 12 transcripts and 7 different protein sequences. Variants labeled as PR1, PR2 and PR3 are identical to the isoforms reported in the CSMD3_HUMAN entry of SwissProt/UniprotKB. Two more variants are reported in that entry, although they lack experimental annotations. Several repetitions of Sushi and CUB domains are predicted with PFAM (25) and represented with symbols indicating whether the model is completely or partially mapped to the sequence. The two transmembrane helices are predicted with ENSEMBLE (35).

Protein table lists the predicted features of the alternative proteins, which include: (i) the best hits obtained from the similarity searches against the UniProtKB/SwissProt and PDB databases, along with the identity value and coverage of the alignment with respect to both the query and the subject sequence lengths; (ii) the features predicted by the pipeline based on machine-learning tools. Predicted splice sites shows the multiple alignment between the genomic sequence and the expressed sequences (i.e. mRNAs and ESTs) near the boundaries (splice sites) of all predicted introns. Intron table lists all predicted introns and their relevant features. All results can also be downloaded by the user in textual format following the ‘gene transfer format’ (GTF) (see the Gene Information panel). After a query at the gene, transcript, exon, protein or splice site level has been completed, the user can also download specific sets of sequences in FASTA format for further analyses, e.g. genes, transcripts, exons, proteins, 5′-UTRs, coding sequences, 3′-UTRs, introns as well as sequence regions surrounding splice site boundaries.
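For readers who want to process the GTF downloads, a minimal parsing sketch is given below. It relies only on the generic GTF layout (nine tab-separated columns, with key "value" pairs in the last one). The example line is invented for illustration: the source label, coordinates and attribute values are assumptions, and the attributes actually written by ASPicDB may differ.

```python
import re

GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]

# Matches attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_line(line: str) -> dict:
    """Parse one GTF feature line into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    record["attributes"] = dict(ATTR_RE.findall(record["attributes"]))
    return record


# Invented example line (not taken from an actual ASPicDB download).
example = 'chr8\tASPIC\texon\t1000\t1200\t.\t-\t.\tgene_id "CSMD3"; transcript_id "TR1";'
exon = parse_gtf_line(example)

# GTF coordinates are 1-based and inclusive, hence the +1.
print(exon["attributes"]["gene_id"], exon["end"] - exon["start"] + 1)  # CSMD3 201
```

Comment lines (starting with `#`) and empty lines, if present in a file, would need to be skipped before calling the parser.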

FUTURE PERSPECTIVES

ASPicDB is an ongoing project and we plan to further develop it in the next releases. In particular we plan to add specific annotations on splicing regulatory elements and their interacting RNA-binding proteins located both in exonic and intronic regions. We also plan to update alternative splicing prediction by using the huge amount of RNA-Seq data which are now being produced by next generation sequencing, possibly annotating splicing events as constitutive or tissue-specific. Furthermore, literature-screened splicing patterns related to diseases will be annotated as they represent potential molecular biomarkers and possible targets for therapy. Finally, the inclusion in the database of data related to other organisms will certainly favor a better understanding of the alternative splicing process through comparative analyses.

FUNDING

Ministero dell’Istruzione, dell’Università e della Ricerca: Fondo Italiano Ricerca di Base: ‘Laboratorio Internazionale di Bioinformatica’ (LIBI); Laboratorio di Bioinformatica per la Biodiversità Molecolare (MBLAB) and Telethon (project GGP01658). Funding for open access charge: Ministero dell’Università e della Ricerca: Fondo Italiano Ricerca di Base: ‘Laboratorio Internazionale di Bioinformatica’ (LIBI). Conflict of interest statement. None declared.

41 in total

Review 1. Pre-mRNA splicing and human disease.

Authors: Nuno André Faustino; Thomas A Cooper
Journal: Genes Dev Date: 2003-02-15 Impact factor: 11.361

Review 2. Understanding alternative splicing: towards a cellular code.

Authors: Arianne J Matlin; Francis Clark; Christopher W J Smith
Journal: Nat Rev Mol Cell Biol Date: 2005-05 Impact factor: 94.444

3. A code for transcription initiation in mammalian genomes.

Authors: Martin C Frith; Eivind Valen; Anders Krogh; Yoshihide Hayashizaki; Piero Carninci; Albin Sandelin
Journal: Genome Res Date: 2007-11-21 Impact factor: 9.043

4. UniProtKB/Swiss-Prot.

Authors: Emmanuel Boutet; Damien Lieberherr; Michael Tognolli; Michel Schneider; Amos Bairoch
Journal: Methods Mol Biol Date: 2007

Review 5. The prediction of protein subcellular localization from sequence: a shortcut to functional genome annotation.

Authors: Rita Casadio; Pier Luigi Martelli; Andrea Pierleoni
Journal: Brief Funct Genomic Proteomic Date: 2008-02-18

6. Detecting alternative gene structures from spliced ESTs: a computational approach.

Authors: Paola Bonizzoni; Giancarlo Mauri; Graziano Pesole; Ernesto Picardi; Yuri Pirola; Raffaella Rizzi
Journal: J Comput Biol Date: 2009-01 Impact factor: 1.479

7. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing.

Authors: Qun Pan; Ofer Shai; Leo J Lee; Brendan J Frey; Benjamin J Blencowe
Journal: Nat Genet Date: 2008-11-02 Impact factor: 38.330

8. CCHMM_PROF: a HMM-based coiled-coil predictor with evolutionary information.

Authors: Lisa Bartoli; Piero Fariselli; Anders Krogh; Rita Casadio
Journal: Bioinformatics Date: 2009-09-10 Impact factor: 6.937

9. Assembly, annotation, and integration of UNIGENE clusters into the human genome draft.

Authors: D Zhuo; W D Zhao; F A Wright; H Y Yang; J P Wang; R Sears; T Baer; D H Kwon; D Gordon; S Gibbs; D Dai; Q Yang; J Spitzner; R Krahe; D Stredney; A Stutz; B Yuan
Journal: Genome Res Date: 2001-05 Impact factor: 9.043

10. ProSAS: a database for analyzing alternative splicing in the context of protein structures.

Authors: Fabian Birzele; Robert Küffner; Franziska Meier; Florian Oefinger; Christian Potthast; Ralf Zimmer
Journal: Nucleic Acids Res Date: 2007-10-11 Impact factor: 16.971
