Jun-ichi Takeda, Yutaka Suzuki, Mitsuteru Nakao, Tsuyoshi Kuroda, Sumio Sugano, Takashi Gojobori, Tadashi Imanishi.
Abstract
The Human-transcriptome DataBase for Alternative Splicing (H-DBAS) is a specialized database of alternatively spliced human transcripts. In this database, each alternative splicing (AS) variant corresponds to a completely sequenced and carefully annotated human full-length cDNA, one of those collected for the H-Invitational human-transcriptome annotation meeting. In total, H-DBAS contains 38,664 representative alternative splicing variants (RASVs) in 11,744 loci. The data are retrievable by various features of AS that were assigned on the basis of manual annotation: AS patterns, the resulting alterations in the encoded amino acids, and affected protein motifs, GO terms, predicted subcellular localization signals and transmembrane domains. The database also records recently identified, very complex patterns of AS in which two distinct genes appear to be bridged, nested or degenerated (multiple CDS): in all three cases, completely unrelated proteins are encoded by a single locus. With AS Viewer, each AS event can be analyzed in the context of full-length cDNAs, helping the user understand the relation between an AS event and the consequent alterations in the encoded amino acid sequence, together with the affected protein motifs. H-DBAS is accessible at http://jbirc.jbic.or.jp/h-dbas/.
Year: 2006 PMID: 17130147 PMCID: PMC1716722 DOI: 10.1093/nar/gkl854
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Statistics of the data processing and of the AS variants and exons identified by genomic structure
| | #Locus | #cDNA | #Total exon | #Alternative exon [a] | #Constitutive exon |
|---|---|---|---|---|---|
| H-Invitational cDNAs | 35 005 | 167 992 | 1 164 482 [b] | 184 649 | 979 833 |
| Successfully mapped | 34 678 | 167 564 | 1 164 482 | 184 649 | 979 833 |
| ≥2 cDNAs per locus | 15 445 | 89 687 | 795 175 | 184 649 | 610 526 |
| Identified AS variants | 11 744 | 74 378 | 687 841 | 184 649 | 503 192 |
| Identified RASVs [c] | 11 744 | 38 664 | 378 024 | 98 156 | 279 868 |
| 5′-end | 7488 | 15 920 | 38 664 | 15 920 | 22 744 |
| Internal | 10 030 | 26 443 | 300 696 | 69 359 | 231 337 |
| 3′-end | 5978 | 12 877 | 38 664 | 12 877 | 25 787 |
| Retrotransposons [d] | 7435 | 14 534 | 22 583 | 12 735 | 9848 |
| LINEs | 3548 | 5360 | 6620 | 3863 | 2757 |
| SINEs | 5849 | 10 188 | 14 114 | 8724 | 5390 |
| Alu elements | 4487 | 7323 | 10 240 | 6379 | 3861 |
| Identified RASVs [c] including full-length ORF | 11 382 | 30 389 | 311 409 | 78 078 | 233 331 |
| 5′-UTR | 6660 | 14 230 | 26 310 | 10 238 | 16 072 |
| CDS | 11 382 | 30 389 | 272 780 | 64 270 | 208 510 |
| 3′-UTR | 3519 | 5259 | 12 319 | 3570 | 8749 |
[a] The number of exons was simply counted, without associating the indicated AS relation.
[b] Exons of unmapped transcripts could not be counted.
[c] Representative AS variants.
[d] Detected by RepeatMasker (A.F.A. Smit, R. Hubley & P. Green, RepeatMasker).
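The exon columns of the table above appear to be related by simple arithmetic: in each row, the total exon count equals the alternative plus the constitutive count. The following Python sketch checks this for the transcribed rows. The helper names (`to_int`, `inconsistent_rows`) are illustrative and not part of H-DBAS, and the check itself is an observation about the numbers, not a rule stated by the authors.

```python
# Rows transcribed from the statistics table:
# (label, total exons, alternative exons, constitutive exons).
ROWS = [
    ("H-Invitational cDNAs", "1 164 482", "184 649", "979 833"),
    ("Successfully mapped", "1 164 482", "184 649", "979 833"),
    (">=2 cDNAs per locus", "795 175", "184 649", "610 526"),
    ("Identified AS variants", "687 841", "184 649", "503 192"),
    ("Identified RASVs", "378 024", "98 156", "279 868"),
    ("5'-end", "38 664", "15 920", "22 744"),
    ("Internal", "300 696", "69 359", "231 337"),
    ("3'-end", "38 664", "12 877", "25 787"),
    ("Retrotransposons", "22 583", "12 735", "9848"),
    ("LINEs", "6620", "3863", "2757"),
    ("SINEs", "14 114", "8724", "5390"),
    ("Alu elements", "10 240", "6379", "3861"),
    ("RASVs with full-length ORF", "311 409", "78 078", "233 331"),
    ("5'-UTR", "26 310", "10 238", "16 072"),
    ("CDS", "272 780", "64 270", "208 510"),
    ("3'-UTR", "12 319", "3570", "8749"),
]


def to_int(text: str) -> int:
    """Parse a count that uses whitespace as a thousands separator."""
    return int("".join(text.split()))


def inconsistent_rows(rows):
    """Return labels of rows where total != alternative + constitutive."""
    return [
        label
        for label, total, alt, const in rows
        if to_int(total) != to_int(alt) + to_int(const)
    ]


if __name__ == "__main__":
    print(inconsistent_rows(ROWS))  # an empty list means every row adds up
```

A check of this kind is a cheap way to catch transcription errors when numbers are copied out of a flattened table.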
Numbers of loci in which AS variants are expected to influence protein function
| | #Locus | #cDNA |
|---|---|---|
| AS affecting function total | 7630 | 24 092 |
| Motif-changed | 4624 | 14 550 |
| GO-changed | 4150 | 14 248 |
| Subcellular localization-changed | 5323 | 17 718 |
| Transmembrane domain-changed | 1248 | 3995 |
| Complex AS pattern total | 1512 | 5394 |
| Bridged | 472 | 2336 |
| Nested | 1223 | 3629 |
| Multiple CDS | 101 | 258 |
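Two things can be read off the locus counts in this table with a little arithmetic. The Python sketch below computes each total as a share of the 11,744 AS loci reported in the abstract (using that figure as the denominator is an assumption made here), and shows that the sub-category counts sum to more than their totals, which suggests that a single locus can fall into more than one category. All names in the code are illustrative.

```python
# Locus counts transcribed from the table; the denominator comes from the abstract.
TOTAL_AS_LOCI = 11_744

FUNCTIONAL_TOTAL = 7630
FUNCTIONAL = {
    "Motif-changed": 4624,
    "GO-changed": 4150,
    "Subcellular localization-changed": 5323,
    "Transmembrane domain-changed": 1248,
}

COMPLEX_TOTAL = 1512
COMPLEX = {"Bridged": 472, "Nested": 1223, "Multiple CDS": 101}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def categories_overlap(categories: dict, total: int) -> bool:
    """True if the category counts sum to more than the stated total,
    meaning some loci must be counted in several categories."""
    return sum(categories.values()) > total


if __name__ == "__main__":
    print(f"function-affecting loci: {share(FUNCTIONAL_TOTAL, TOTAL_AS_LOCI):.1f}%")
    print(f"complex-pattern loci:    {share(COMPLEX_TOTAL, TOTAL_AS_LOCI):.1f}%")
    print("functional categories overlap:", categories_overlap(FUNCTIONAL, FUNCTIONAL_TOTAL))
    print("complex categories overlap:   ", categories_overlap(COMPLEX, COMPLEX_TOTAL))
```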
Figure 1. Search system of H-DBAS. (A) Search form. H-DBAS provides two search forms, simple search and advanced search. The simple search finds AS loci by keyword, H-Inv cluster ID (HIX), H-Inv transcript ID (HIT), accession number, RefSeq ID, Ensembl ID, HUGO gene symbol or definition. The advanced search finds AS loci by combinations of three categories: (i) genomic location, (ii) AS structure and (iii) AS functional annotation. Any combination of criteria, including those of the simple search, can be used. (B) Result summary. The result of the query described in the text and shown in Figure 1A.
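The idea of freely combining search criteria can be illustrated with a minimal Python sketch in which each criterion is a predicate and a search keeps the loci that satisfy all of them. This is not the H-DBAS implementation or schema: the `Locus` record, its field names, the predicate helpers and the sample identifiers are all hypothetical, chosen only to mirror the three categories named in the caption.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Locus:
    """Hypothetical locus record; fields are illustrative, not the H-DBAS schema."""
    locus_id: str
    chromosome: str                 # (i) genomic location
    as_patterns: FrozenSet[str]     # (ii) AS structure
    motif_changed: bool             # (iii) AS functional annotation


Predicate = Callable[[Locus], bool]


def on_chromosome(name: str) -> Predicate:
    """Criterion on genomic location."""
    return lambda locus: locus.chromosome == name


def has_pattern(pattern: str) -> Predicate:
    """Criterion on AS structure."""
    return lambda locus: pattern in locus.as_patterns


def motif_is_changed() -> Predicate:
    """Criterion on functional annotation."""
    return lambda locus: locus.motif_changed


def search(loci: Iterable[Locus], *criteria: Predicate) -> List[Locus]:
    """Keep the loci that satisfy every supplied criterion (logical AND)."""
    return [locus for locus in loci if all(test(locus) for test in criteria)]


# Made-up sample data for demonstration only.
SAMPLE = [
    Locus("locus-A", "chr1", frozenset({"bridged"}), True),
    Locus("locus-B", "chr1", frozenset({"nested"}), False),
    Locus("locus-C", "chr2", frozenset({"bridged", "nested"}), True),
]

if __name__ == "__main__":
    hits = search(SAMPLE, on_chromosome("chr1"), has_pattern("bridged"), motif_is_changed())
    print([locus.locus_id for locus in hits])
```

Representing each criterion as a function keeps the combination logic independent of how many categories exist, so adding a new kind of filter does not require changing `search`.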
Figure 2. AS Viewer of H-DBAS. AS Viewer is a Java applet for exploring AS events and inspecting AS exons and protein features such as protein motifs and transmembrane domains. The user can also compare representative AS variants (RASVs) at the nucleotide and amino acid sequence levels. AS Viewer consists of the following parts: (i) definition and genomic information of the locus; (ii) selection and definition of RASVs, including RefSeq and Ensembl transcripts as references; (iii) selection of protein motifs and transmembrane domains; (iv) all exons of the selected RASVs located on the genome, with AS exons colored red. Clicking an AS exon in the Exonic Segment field shows the AS events at that location in the AS Event Viewer. Max ORF is the total ORF range on the genome of the selected RASVs; (v) selection of AS structure on the genome and on the cDNA. The structures of the selected RASVs are shown, with ORFs colored pink and protein motifs and transmembrane domains colored aqua. Nucleotide and amino acid sequences can also be displayed using the zoom function.
Figure 3. Examples of alternative splicing affecting a motif (A) and a bridged complex AS pattern (B), from AS Viewer in H-DBAS. Exons and introns are represented by boxes and lines, respectively. The ORF region is colored pink and the protein motif region is colored aqua.