Stefan Stamm, Jean-Jack Riethoven, Vincent Le Texier, Chellappa Gopalakrishnan, Vasudev Kumanduri, Yesheng Tang, Nuno L Barbosa-Morais, Thangavel Alphonse Thanaraj.
Abstract
Alternative splicing is an important regulatory mechanism of mammalian gene expression. The alternative splicing database (ASD) consortium is systematically collecting and annotating data on alternative splicing. We present the continuation and upgrade of the ASD [T. A. Thanaraj, S. Stamm, F. Clark, J. J. Riethoven, V. Le Texier, J. Muilu (2004) Nucleic Acids Res. 32, D64-D69], which consists of computationally and manually generated data. Its largest part is AltSplice, a value-added database of computationally delineated alternative splicing events. AltSplice data include alternatively spliced introns/exons, events, isoform splicing patterns and isoform peptide sequences, and are generated by examining gene-transcript alignments. The data are annotated for various biological features, including splicing signals, expression states, single nucleotide polymorphism (SNP)-mediated splicing and cross-species conservation. AEdb forms the manually curated component of ASD. It is a literature-based data set containing the sequence and properties of alternatively spliced exons, functional enumeration of observed splicing events, characterization of observed splicing regulatory elements, and a collection of experimentally clarified minigene constructs. ASD also includes a workbench, an analysis tool that enables users to carry out splicing-related analyses such as characterization of introns for various splicing signals, identification of splicing regulatory elements on a given RNA sequence, prediction of putative exons and prediction of putative translation start codons. The different ASD modules are integrated and can be accessed through user-friendly interfaces and visualization tools. ASD data have been integrated with the Ensembl genome annotation project as a Distributed Annotation System (DAS) resource and can be viewed in the Ensembl genome browser. The ASD resource is available at http://www.ebi.ac.uk/asd.
Year: 2006 PMID: 16381912 PMCID: PMC1347394 DOI: 10.1093/nar/gkj031
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Structure of ASD. We used computational pipelines and manual curation (top, pink) to create the modular databases of ASD (middle, blue). The individual databases are integrated and cross-linked, and are available through a variety of interface tools (bottom, sky blue). Currently the databases are the computer-generated AltSplice and the manually curated AEdb-Sequence, AEdb-Motif, AEdb-Function and AEdb-Minigene databases. The ASD data are integrated with the Ensembl genome annotation system and are visible from the Ensembl genome browser. Publicly available databases on alternative splicing are accessible from the ASD interfaces (top right, dark green). The databases are connected to analysis tools that are collected in the ASD workbench (middle right, grey).
Statistics on ASD data
| AltSplice data statistics (Release 2 of AltSplice-Human and AltSplice-Mouse) | |
| Genes with alternative splicing | 9929 (61%) of 16 293 human genes |
| | 8211 (50%) of 16 391 mouse genes |
| Alternative splicing events per gene | 2.6 (human); 2.2 (mouse) |
| Alternative splicing patterns per gene | 3.9 (human); 3.7 (mouse) |
| Distribution of the observed event types | Cassette exon (52%) |
| | Alternate acceptor or donor (27%) |
| | Intron retention (17%) |
| | Mutually exclusive (4%) |
| Flanking exons often undergo extension or truncation | 26% of cassette exon events |
| | 23% of retained introns |
| | 19% of mutually exclusive exons |
| Human-mouse conservation | Orthologous gene pairs = 5176 |
| | Number of conserved events = 1177 |
| | Number of conserved exons = 28 276 |
| Entries with peptide sequence annotation for at least 1 splicing pattern | 6848 (human); 4366 (mouse) |
| Entries with peptide sequence annotation for ≥2 splicing patterns | 2896 (human); 977 (mouse) |
| Entries for which variant peptide data are available in UniProt | 2083 (human); 896 (mouse) |
| Entries presenting data for ≥2 isoform peptide sequences (by either of the above two sources) | 3994 (human); 1573 (mouse) |
| Entries associated with AEdb-Sequence database | 367 (human); 118 (mouse) |
| AEdb-Sequence data statistics: 2255 entries | |
| Distribution | Number of entries |
| Organism distribution | Human (1283); mouse (413); rat (232); drosophila (100); others (227) |
| Event type distribution | Cassette exon (1281) |
| | Alternative acceptor or donor (395) |
| | Intron retention (154) |
| | Mutually exclusive exons (130) |
| | Alternative 3′ exon by polyA variant (71) |
| Regulation associated with disease | 295 |
| Regulation associated with development | 282 |
| Regulation associated with tissue type | 312 |
| Regulation causing frameshift | 151 |
| Regulation introducing stop codons | 260 |
| Alternative exon being noncoding exon | 222 |
| Entries associated with AltSplice | 1198 (human and mouse entries) |
| AEdb-Function data statistics: 354 entries | |
| Functional role | Number of entries |
| Modulation of protein interaction | 136 |
| Internal structural change | 119 |
| Novel carboxyl terminus | 87 |
| Novel amino terminus | 38 |
| Association with disease | 81 |
| Intracellular location | 76 |
| Enzymatic activity | 64 |
| Channel activity | 54 |
| Others | 37 |
| AEdb-Motif data statistics: 255 entries | |
| Type of regulator sequence | Number of entries |
| Exon enhancer | 97 |
| Exon silencer | 44 |
| Intron enhancer | 56 |
| Intron silencer | 37 |
| AEdb-Minigene data statistics: 82 entries | |
| Distribution | Number of entries |
| Organism distribution | Human (46); mouse (17); rat (15); others (9) |
| Splicing mechanism distribution | Cassette exon [single exon (45); multiple cassette exons (3); incremental combinatorial exons (2)]; Alternative acceptor or donor sites (17); Mutually exclusive exons (13); Intron retention (2) |
| Reported tissue specificity | 55 |
| Known regulatory factors | 32 |
| Deduced enhancer and silencer sequences | 97 |
| Hyperlinks to AEdb-Sequence database | 78 (to 105 AEdb-Sequence entries) |
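
The percentages and totals in the table above can be rechecked from the raw counts. The following is a minimal sketch in Python; the variable and function names are illustrative and are not part of ASD, and only counts printed in the table are used.

```python
# Counts copied from the "Statistics on ASD data" table:
# (genes with alternative splicing, genes examined).
genes_with_as = {"human": (9929, 16293), "mouse": (8211, 16391)}


def percent(part, total):
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)


as_fraction = {org: percent(n, total) for org, (n, total) in genes_with_as.items()}

# Organism distribution of AEdb-Sequence entries.
aedb_sequence_by_organism = {
    "human": 1283,
    "mouse": 413,
    "rat": 232,
    "drosophila": 100,
    "others": 227,
}
aedb_sequence_total = sum(aedb_sequence_by_organism.values())

# Shares of the observed AltSplice event types, in percent.
event_type_share = {
    "cassette exon": 52,
    "alternate acceptor or donor": 27,
    "intron retention": 17,
    "mutually exclusive": 4,
}

print(as_fraction)                     # {'human': 61, 'mouse': 50}
print(aedb_sequence_total)             # 2255
print(sum(event_type_share.values()))  # 100
```

The recomputed values agree with the reported 61% and 50%, with the 2255 AEdb-Sequence entries, and with event-type shares that sum to 100%.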
Figure 2. Result page of a query to all of the ASD data. The ASD was queried using the wrapper interface with the term ‘tra2a* | tra2b*’, which retrieved data entries from AltSplice-Human (two entries), AEdb-Sequence (seven entries), AEdb-Function (one entry) and AEdb-Motif (five entries). This figure illustrates the integration among the different data sets of ASD: (i) the AltSplice-Human entries are associated with entries from AEdb-Sequence and from AltSplice-Mouse; (ii) the AEdb-Sequence entries are associated with entries from AltSplice-Human. These associations are hyperlinked.
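
The query term in Figure 2 combines two wildcard patterns. The record does not specify the query grammar, so the sketch below is only one plausible reading: it assumes that `*` matches any run of characters, that `|` means OR, and that matching is case-insensitive. The function `matches_query` and the list of identifiers are hypothetical and do not reproduce the ASD wrapper interface.

```python
from fnmatch import fnmatchcase


def matches_query(name, query):
    """Return True if `name` matches any '|'-separated wildcard term in `query`.

    Assumed semantics (not confirmed by the source): '*' is a wildcard,
    '|' separates alternatives, and case is ignored.
    """
    terms = [term.strip().lower() for term in query.split("|")]
    return any(fnmatchcase(name.lower(), term) for term in terms if term)


# Hypothetical identifiers, used only to exercise the matcher.
names = ["TRA2A", "TRA2B", "tra2b1", "SFRS1"]
hits = [n for n in names if matches_query(n, "tra2a* | tra2b*")]
print(hits)  # ['TRA2A', 'TRA2B', 'tra2b1']
```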
Figure 3. Display of data on alternative splicing of the human tra2-beta gene as seen in the AltSplice and AEdb-Function data sets. (a) Display of the data sections on Gene Information, Evidences and Splicing Events as seen in AltSplice. The Gene Information section provides hyperlinks to a page listing the gene entry from the HUGO Gene Nomenclature database and to a page that lists the sequence of the gene. The Evidences section provides hyperlinks to the associated entries from AEdb-Sequence, to pages that list variant peptide sequences for the gene from UniProt, and to pages that list the Ensembl transcript sequences for the gene. The Splicing Events section lists all the splicing events that AltSplice has identified for the gene. Column 1 lists the gene coordinates of alternatively spliced exons/introns. Column 2 indicates whether the event also involves modifications in the flanking exons; entries are hyperlinked to pages listing detailed information on the event. Column 3 gives the identifier of the associated entry from AEdb-Sequence (if any), and the entry is hyperlinked. Column 4 gives the identifier of the orthologous gene (if any) and the coordinates of the exon orthologous to the one presented in column 1; the entry is hyperlinked to the orthologous gene entry. (b) Textual and graphical display of the observed splicing patterns for the tra2-beta gene as seen in AltSplice data. Splice Pattern Table: The entry in column 1 is hyperlinked to a page listing the sequence of the splicing pattern. The entry in column 2 gives the coding start and end positions on the gene and the length of the translated peptide sequence, and is hyperlinked to a page listing the peptide sequence. The entry in column 3 lists the structure of the splicing pattern as a string of exons (exon boundaries are presented in gene coordinates). The entry in column 4 is hyperlinked to pages listing detailed information on the confirming transcript sequences. The entry in column 5 is hyperlinked to pages listing expression states. The entry in column 6 is hyperlinked to pages listing the allele specificity of the splicing pattern. Splice Pattern View: Exons are indicated by boxes and introns by lines. Exons/introns that are variants are shown in blue. Moving the cursor over the elements of the pattern displays pop-ups giving detailed information on those elements. The pop-up displayed in this example shows information on the expression state of Splicing Pattern 6. (c) Data on the functional changes due to alternative splicing in tra2-beta as seen in the AEdb-Function data set. The data are organised into three sections: gene information, bibliography information and functional information. This figure illustrates the wealth of knowledge captured from the literature.
Figure 4. Example output from the workbench tool that detects splicing regulatory sequences. The figure displays a portion of the page that reports splicing regulatory sequences in the tra2-beta gene. The names of identified motifs are hyperlinked to the appropriate entries in the AEdb-Motif database. Matches against tra2-beta regulatory sequences are also seen. It is known that tra2-beta1 autoregulates its protein concentration by influencing alternative splicing of its pre-mRNA (31).
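
The kind of report shown in Figure 4 can be approximated by scanning an RNA sequence for known motifs. The sketch below uses exact string matching, which is an assumption: the record does not describe how the workbench tool scores or matches motifs. The motif names, motif sequences and input sequence are placeholders and are not AEdb-Motif entries.

```python
import re

# Placeholder motif table; names and sequences are illustrative only.
MOTIFS = {"example_enhancer": "GAAGAA", "example_silencer": "UAGG"}


def find_motifs(rna, motifs):
    """Return sorted (start, end, name) tuples for every motif occurrence.

    Coordinates are 0-based with an exclusive end. Overlapping occurrences
    are reported. DNA input is accepted and converted to RNA (T -> U).
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for name, motif in motifs.items():
        motif = motif.upper().replace("T", "U")
        # A zero-width lookahead lets overlapping matches be found.
        for match in re.finditer(f"(?={re.escape(motif)})", rna):
            hits.append((match.start(), match.start() + len(motif), name))
    return sorted(hits)


hits = find_motifs("AUGAAGAAGAAUAGGC", MOTIFS)
for start, end, name in hits:
    print(start, end, name)
# 2 8 example_enhancer
# 5 11 example_enhancer
# 11 15 example_silencer
```

A production tool would more likely use curated motif collections with position weight matrices or degenerate patterns; exact matching is used here only to keep the example short.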