Hsien-Da Huang, Jorng-Tzong Horng, Feng-Mao Lin, Yu-Chung Chang, Chen-Chia Huang.
Abstract
We have developed an information repository named SpliceInfo to collect the occurrences of the four major alternative-splicing (AS) modes in the human genome: exon skipping, 5′-alternative splicing, 3′-alternative splicing and intron retention. The dataset is derived by comparing the nucleotide and protein sequences available for a given gene for evidence of AS. Additional features, such as the tissue specificity of the mRNA, the protein domains contained in the exons, the GC-ratio of the exons, the repeats contained within the exons and the Gene Ontology, are annotated computationally for each exonic region that is alternatively spliced. Motivated by previous investigations of AS-related motifs such as exonic splicing enhancers and exonic splicing silencers, the resource also provides a means of identifying motif candidates, which should help to identify potential regulatory mechanisms within a particular exonic sequence set and its two flanking intronic sequence sets. Motif discovery tools are used to identify motif candidates related to alternative splicing regulation and, together with a secondary structure prediction tool, help to identify the structural properties of such regulatory motifs. The integrated resource is now available at http://SpliceInfo.mbc.NCTU.edu.tw/.
Year: 2005 PMID: 15608290 PMCID: PMC540083 DOI: 10.1093/nar/gki129
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Four major alternative splicing modes defined in the SpliceInfo resource. In each case, the constitutively spliced exons are shown in blue and the alternatively spliced exons in orange. In case (A), the middle exon is spliced into the first isoform and spliced out of the second isoform. In cases (B) and (C), the selection of different splice sites gives rise to alternative isoforms. In case (D), the sequence that is left unspliced in the alternative pathway gives rise to intron retention.
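The four modes can be told apart from exon coordinates alone. The following is a minimal sketch, not the SpliceInfo pipeline: it assumes two isoforms of one gene on the forward strand, each given as a sorted list of inclusive `(start, end)` exon intervals with matching transcript ends. The function name `classify_as_modes` and the coordinates in the usage lines are hypothetical.

```python
def introns(exons):
    """Return (last base of exon, first base of next exon) for each gap."""
    return [(left_end, right_start)
            for (_, left_end), (right_start, _) in zip(exons, exons[1:])]


def classify_as_modes(iso_a, iso_b):
    """Name the AS modes that distinguish two isoforms (forward strand assumed)."""
    modes = set()
    retaining = set()  # exons that swallow a whole intron of the other isoform
    pairs = ((iso_a, iso_b), (iso_b, iso_a))

    # Intron retention: an exon of one isoform covers an intron of the other.
    for spliced, other in pairs:
        for i_start, i_end in introns(spliced):
            for exon in other:
                if exon[0] <= i_start and i_end <= exon[1]:
                    modes.add("intron retention")
                    retaining.add(exon)

    # Exon skipping: an exon of one isoform lies inside an intron of the other.
    for included, other in pairs:
        for start, end in included:
            if any(i_start < start and end < i_end
                   for i_start, i_end in introns(other)):
                modes.add("exon skipping")

    # Alternative splice sites: exons sharing exactly one boundary.
    # On the forward strand the exon end is the 5' splice site (donor)
    # and the exon start is the 3' splice site (acceptor).
    for a in iso_a:
        for b in iso_b:
            if a == b or a in retaining or b in retaining:
                continue
            if a[0] == b[0] and a[1] != b[1]:
                modes.add("5'-alternative splicing")
            elif a[1] == b[1] and a[0] != b[0]:
                modes.add("3'-alternative splicing")
    return modes


# Hypothetical coordinates, one pair of isoforms per mode.
skipping = classify_as_modes([(100, 200), (300, 400), (500, 600)],
                             [(100, 200), (500, 600)])
alt_5 = classify_as_modes([(100, 200), (300, 400)], [(100, 220), (300, 400)])
alt_3 = classify_as_modes([(100, 200), (300, 400)], [(100, 200), (280, 400)])
retention = classify_as_modes([(100, 200), (300, 400)], [(100, 400)])
print(skipping, alt_5, alt_3, retention)
```

Isoforms that differ in their first or last exon boundary (alternative transcript start or end) would be misreported by this sketch, so such cases need separate handling.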
The SpliceInfo data statistics for the four major alternative splicing (AS) modes
| Amount | Exon skipping | 5′-alternative splicing | 3′-alternative splicing | Intron retention | Total |
|---|---|---|---|---|---|
| Occurrences | 149 560 | 25 918 | 14 851 | 13 316 | 203 645 |
| Genes containing the occurrences | 3747 | 1622 | 2925 | 1776 | 6309* (at least one AS mode) |
| Average occurrences per gene | 39.91 | 15.9 | 5.07 | 7.50 | 32.28 |
The first row gives the number of occurrences of each alternative splicing mode, the second row gives the number of genes containing occurrences of each AS mode, and the last row gives the average number of occurrences per gene. The symbol ‘*’ indicates that 6309 genes contain at least one occurrence of any mode of alternative splicing. This is not the sum of the per-mode gene counts, because a gene can contain more than one mode.
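The averages in the last row follow from dividing occurrences by genes. A short sketch of that arithmetic, using only the counts in the table (the variable and function names are illustrative):

```python
# Counts copied from the statistics table.
occurrences = {
    "exon skipping": 149_560,
    "5'-alternative splicing": 25_918,
    "3'-alternative splicing": 14_851,
    "intron retention": 13_316,
}
genes = {
    "exon skipping": 3747,
    "5'-alternative splicing": 1622,
    "3'-alternative splicing": 2925,
    "intron retention": 1776,
}
GENES_WITH_ANY_MODE = 6309  # the starred total in the table


def average_per_gene(n_occurrences, n_genes):
    """Average number of AS occurrences per gene."""
    return n_occurrences / n_genes


averages = {mode: average_per_gene(occurrences[mode], genes[mode])
            for mode in occurrences}
total_occurrences = sum(occurrences.values())
overall_average = average_per_gene(total_occurrences, GENES_WITH_ANY_MODE)

# Summing the per-mode gene counts counts multi-mode genes more than once.
naive_gene_sum = sum(genes.values())

for mode, value in averages.items():
    print(f"{mode}: {value:.2f}")
print(f"overall: {overall_average:.2f}")
print(f"sum of per-mode gene counts: {naive_gene_sum} vs {GENES_WITH_ANY_MODE}")
```

The computed values for 5′- and 3′-alternative splicing are about 15.98 and 5.08, so the tabulated 15.9 and 5.07 appear to be truncated rather than rounded.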
The distribution of genes containing the occurrence of a particular AS mode in SpliceInfo
| Exon skipping | 5′-alternative splicing | 3′-alternative splicing | Intron retention | Number of genes containing the occurrences of particular AS modes |
|---|---|---|---|---|
| v | | | | 1818 |
| | | v | | 944 |
| | | | v | 619 |
| v | | v | | 573 |
| v | v | v | | 357 |
| | v | | | 311 |
| | | v | v | 291 |
| v | v | | | 290 |
| | v | v | | 240 |
| v | | | v | 223 |
| v | | v | v | 219 |
| v | v | v | v | 205 |
| | v | v | v | 96 |
| v | v | | v | 62 |
| | v | | v | 61 |
A ‘v’ in one of the first four columns means that the genes contain that particular mode of alternative splicing. For example, 1818 genes contain occurrences of only one mode, namely ‘exon skipping’, whereas the 357 genes in the fifth data row contain three modes, namely ‘exon skipping’, ‘5′-alternative splicing’ and ‘3′-alternative splicing’.
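The distribution can be cross-checked against the statistics table: the 15 rows should sum to 6309 genes, and the rows marked for a given mode should sum to that mode's gene count. The sketch below does this tally. The mode set assigned to each row is an assumption wherever the text does not state it explicitly; it is the assignment under which the per-mode totals come out to 3747, 1622, 2925 and 1776.

```python
from collections import Counter

ES = "exon skipping"
A5 = "5'-alternative splicing"
A3 = "3'-alternative splicing"
IR = "intron retention"

# (modes present, number of genes), in table order.
# ASSUMPTION: apart from the 1818 and 357 rows described in the text,
# the mode sets are inferred so that per-mode totals match the
# "Genes containing the occurrences" row of the statistics table.
rows = [
    ({ES}, 1818),
    ({A3}, 944),
    ({IR}, 619),
    ({ES, A3}, 573),
    ({ES, A5, A3}, 357),
    ({A5}, 311),
    ({A3, IR}, 291),
    ({ES, A5}, 290),
    ({A5, A3}, 240),
    ({ES, IR}, 223),
    ({ES, A3, IR}, 219),
    ({ES, A5, A3, IR}, 205),
    ({A5, A3, IR}, 96),
    ({ES, A5, IR}, 62),
    ({A5, IR}, 61),
]

genes_per_mode = Counter()       # genes containing each mode
genes_by_mode_count = Counter()  # genes with exactly 1, 2, 3 or 4 modes
for modes, n_genes in rows:
    genes_by_mode_count[len(modes)] += n_genes
    for mode in modes:
        genes_per_mode[mode] += n_genes

total_genes = sum(n_genes for _, n_genes in rows)

print("total genes:", total_genes)
for mode in (ES, A5, A3, IR):
    print(f"{mode}: {genes_per_mode[mode]}")
for k in sorted(genes_by_mode_count):
    print(f"genes with exactly {k} mode(s): {genes_by_mode_count[k]}")
```

The count of genes by number of modes depends only on how many marks each row carries, so it holds regardless of the column assignment.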
Figure 2. Data flow of the SpliceInfo resource. In the bottom-right subfigure, the nucleotides indicated by red circles constitute the motif itself, and the nucleotides indicated by black circles are the sequences surrounding the motif.
Figure 3. Gene view in SpliceInfo.