Fabian Birzele, Robert Küffner, Franziska Meier, Florian Oefinger, Christian Potthast, Ralf Zimmer.
Abstract
Alternative splicing is known to be one of the major sources of functional diversity in higher eukaryotes. Several splicing isoforms that play important roles in cellular processes such as apoptosis or signal transduction pathways have been characterized in the literature. Splicing events can often be detected on the mRNA level by large-scale cDNA or EST experiments, and such data is collected and annotated in several databases. Nevertheless, the effects of splicing on the structure of a protein are largely unknown. The ProSAS (Protein Structure and Alternative Splicing) database fills this gap and provides a unified resource for analyzing the effects of alternative splicing events in the context of protein structures. ProSAS comprehensively annotates and models protein structures for several Ensembl genomes as well as SwissProt entries harbouring splicing events. Alternative isoforms annotated in Ensembl or SwissProt can be analyzed on the protein structure and protein function level using an intuitive user interface that provides several features and tools for a structure-based analysis of alternative splicing events. The ProSAS database is freely accessible at http://www.bio.ifi.lmu.de/ProSAS.
Year: 2007 PMID: 17933774 PMCID: PMC2238869 DOI: 10.1093/nar/gkm793
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Database content and annotation pipeline. Gene and alternative transcript information is obtained from the Ensembl database as well as from SwissProt. Additional information comes from InterPro patterns, and orthologs are mapped from the Ensembl database using BioMart. Protein structures from the PDB are annotated to transcripts using SIMAP, and several structural features are computed using SCOP, PDP and DSSP annotations. Other data, such as the (structural) variance in the corresponding protein family, will be added in the future (Other methods). Affymetrix probesets are mapped onto transcripts and genes.
Summary of the content of the ProSAS database.
| Species | Genes | Transcripts | Exons | Structurally modelled transcripts (genes): seq. id > 0.4, coverage > 0.75 | Structurally modelled transcripts (genes): seq. id > 0.4 |
|---|---|---|---|---|---|
| Human | 26 228 | 50 539 | 256 257 | 7601 (4504) | 18 520 (9433) |
| Mouse | 24 423 | 32 041 | 205 865 | 4912 (4167) | 10 673 (8330) |
| Rat | 23 265 | 33 657 | 219 304 | 5348 (4123) | 11 680 (8229) |
| Total | 73 916 | 116 237 | 681 426 | 17 861 (12 794) | 40 873 (25 992) |

| Source | Proteins | Isoforms | Modelled SwissProt proteins: seq. id > 0.4, coverage > 0.75 | Modelled SwissProt proteins: seq. id > 0.4 |
|---|---|---|---|---|
| SwissProt | 12 530 | 33 155 | 1949 | 5767 |

The last two columns give the coverage of human, rat and mouse genes and transcripts, and of SwissProt proteins, under two different criteria. The first of these columns requires a sequence identity between target and template of at least 40% (safe modelling zone) and a structural coverage of the transcript (SwissProt protein) sequence of 75%. The second lists all transcripts and genes (SwissProt proteins) that have at least one template assigned with a sequence identity larger than 40%. In total, the database covers about 17% of the genes and 15% of the SwissProt proteins with high-quality, full-length models, and about 35% of the genes and 46% of all SwissProt entries with high-quality, partial structures.
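The two criteria and the quoted coverage figures can be checked with a few lines of Python. This is only a sketch: the counts are copied from the summary table, while the function names `model_class` and `percent` are hypothetical and not part of ProSAS. The strict `>` comparisons follow the table headers and are an assumption, since the note above says "at least 40%" for the first criterion.

```python
# Counts copied from the summary table (Total row and SwissProt row).
TOTAL_GENES = 73_916
GENES_FULL_LENGTH = 12_794      # seq. id > 0.4 and coverage > 0.75
GENES_PARTIAL = 25_992          # seq. id > 0.4 only
SWISSPROT_PROTEINS = 12_530
SWISSPROT_FULL_LENGTH = 1_949   # seq. id > 0.4 and coverage > 0.75
SWISSPROT_PARTIAL = 5_767       # seq. id > 0.4 only


def model_class(seq_id: float, coverage: float) -> str:
    """Hypothetical helper: bin one target/template assignment.

    seq_id and coverage are fractions in [0, 1]. The strict '>' follows
    the table headers and is an assumption, not ProSAS code.
    """
    if seq_id > 0.4 and coverage > 0.75:
        return "full-length"
    if seq_id > 0.4:
        return "partial"
    return "none"


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"genes, full-length models:     {percent(GENES_FULL_LENGTH, TOTAL_GENES):.1f}%")
    print(f"genes, partial structures:     {percent(GENES_PARTIAL, TOTAL_GENES):.1f}%")
    print(f"SwissProt, full-length models: {percent(SWISSPROT_FULL_LENGTH, SWISSPROT_PROTEINS):.1f}%")
    print(f"SwissProt, partial structures: {percent(SWISSPROT_PARTIAL, SWISSPROT_PROTEINS):.1f}%")
```

The ratios come out at roughly 17.3%, 35.2%, 15.6% and 46.0%, in line with the approximate percentages quoted in the note.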
Figure 2. Details for gene ENSMUSG00000006611_13 from mouse, showing all exons of the gene as well as the different transcripts annotated for the gene in Ensembl. Matches of InterPro patterns and Affymetrix probesets onto exons are also shown.
Figure 3. Transcript details view for transcript ENSMUST00000091706_13 from gene ENSMUSG00000006611_13. The structure of the transcript is visualized with Jmol (Jmol: an open-source Java viewer for chemical structures in 3D. http://www.jmol.org/), and the difference with respect to transcript ENSMUST00000091707_13, namely the deletion of a larger N-terminal part, is shown on the structure. The alternatively spliced region is characterized with respect to different features such as solvent accessibility or secondary structure content.