Karol Szafranski, Stefanie Schindler, Stefan Taudien, Michael Hiller, Klaus Huse, Niels Jahn, Stefan Schreiber, Rolf Backofen, Matthias Platzer.
Abstract
BACKGROUND: Despite some degeneracy of sequence signals that govern splicing of eukaryotic pre-mRNAs, it is an accepted rule that U2-dependent introns exhibit the 3' terminal dinucleotide AG. Intrigued by anecdotal evidence for functional non-AG 3' splice sites, we carried out a human genome-wide screen.
Year: 2007 PMID: 17672918 PMCID: PMC2374985 DOI: 10.1186/gb-2007-8-8-r154
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Screening procedure for unusual 3' splice sites found in pairs of 3' splice variants that differ by 3 nt (Δ3SVPs). Processing of AG-AG tandem cases ('NAGNAG', parallel branch on the right) was performed as a comparison to unusual 3' splice site tandems.
Table 1. Unusual TG splice acceptors identified in the human transcriptome
| Gene | Intron: No. | Intron: length (nt) | Splice site pair: distance (nt) | Splice site pair: motif | TG ESTs: fraction | TG ESTs: No. |
|  | 3 | 7843 | 3 | CTG,CAG| | 0.15-0.62*† | 282 |
|  | 1 | 224 | 3 | AAG|ATG, | 0.50 | 4 |
|  | 3 | 168 | 3 | TTGTTG,AAG| | 0.25 | 257 |
|  | 3 | 168 | 6 | TTG,TTGAAG| | 0.01 | 10 |
|  | 3 | 1999 | 3 | CAG|ATG, | 0.14 | 4 |
|  | 3 | 9975 | 3 | CAG|ATG, | 0.09 | 2 |
|  | 6 | 1147 | 3 | CAG|CTG, | 0.07 | 2 |
|  | 2 | 2162 | 3 | CAG|ATG, | 0.04 | 7 |
|  | 1 | 4354 | 3 | TTG,GAG| | 0.04 | 2 |
|  | 9 | 2532 | 3 | TTGTTG,GAG| | ?† | - |
|  | 9 | 2532 | 6 | TTG,TTGGAG| | 0.17† | - |
|  | 9 | 1377 | 4 | CAG|GATG, | 0.03 | 2 |
|  | 1 | 1459 | 5 | TTG,AGCAG| | 0.09 | 2 |
|  | 2 | 36530 | 6 | CTG,TTGTAG| | 0.11 | 2 |
|  | 1 | 1892 | 6 | CTG,TTTCAG| | 0.04 | 2 |
|  | 6 | 1485 | 6 | CTG,GTGCAG| | 0.02† | 5† |
|  | 5 | 134 | 7 | CTG,GCTCCAG| | 0.20 | 3 |
|  | 1 | 489 | 7 | TTG,AATTCAG| | 0.20 | 16 |
|  | 2 | 16849 | 7 | CTG,CCTCCAG| | 0.04 | 2 |
|  | 3 | 2753 | 8 | TTG,ATTTCTAG| | 0.13 | 2 |
|  | 7 | 3107 | 9 | TTG,GCTCCTTAG| | 0.77 | 27 |
|  | 5 | 131 | 9 | CTG,GAGTTGCAG| | 0.62 | 8 |
|  | 29 | 6174 | 9 | TTG,ACCCTGAAG| | 0.41 | 34 |
|  | 15 | 177 | 9 | TTG,GCCTACAAG| | 0.21 | 3 |
|  | 7 | 2839 | 9 | TTG,GTTTAACAG| | 0.13 | 15 |
|  | 1 | 214 | 9 | CTG,ATCCCCTAG| | 0.06 | 2 |
|  | 6 | 454 | 10 | CTG,TCCTGGGCAG| | 0.13 | 2 |
|  | 11 | 1599 | 11 | CTG,TTTCTCCTCAG| | 0.04 | 5 |
|  | 18 | 182 | 12 | TTG,TACTCCCCCCAG| | 0.74 | 75 |
|  | 7 | 1337 | 12 | CTG,ACTCTCTCCCAG| | 0.43 | 169 |
|  | 10 | 4269 | 12 | TTG,GCTCTACTCCAG| | 0.33 | 3 |
|  | 6 | 164 | 12 | CTG,ATCCCCTCCCAG| | 0.25 | 5 |
|  | 9 | 1259 | 13 | TTG,CCCTCCTGAGTAG| | 0.09 | 3 |
|  | 1 | 95 | 16 | CTG,ACCTCTCCCCTAGCAG| | 0.07 | 2 |
|  | 3 | 20478 | 17 | TTG,TTTGTTTTTTTTTTTAG| | 0.07 | 3 |
|  | 6 | 832 | 18 | CTG,ACTCTCCCCTACCTTCAG| | 0.01 | 1 |
|  | 6 | 838 | 21 | TTG,GTTTTGTTTTGGTCTCGTCAG| | 0.07 | 1 |
|  | 1 | 3097 | 27 | CTG,ACCCATGTACCTGAGGCTGATTTCCAG| | 0.60 | 3 |
|  | 10 | 253 | 28 | TTG,TTTCTTGTGTTTTTTCTGAACACTCCAG| | 0.09 | 4 |
Entries in bold have RefSeq transcripts supporting the unusual TG acceptor site. Each TG splice variant is supported by at least two ESTs and at least 3% of all covering ESTs, except for some RefSeq-supported cases, CACNA1A [24,35], DRD2 [19] and BAT3. In the 'Motif' column, a vertical line (|) indicates a canonical splice site, and a comma (,) marks the TG splice site. Splice ratios are given as absolute EST counts (No.) as well as the fraction of TG splice variants. A question mark indicates that an explicit fraction is not given in the referenced article, although the authors performed quantitative experiments. *EST ratio depends on the exon junction; the upstream exon 3 may be skipped. †Splice variants were previously quantified by others: GNAS [16,26], CACNA1A (splice ratio cited from [24,35]), DRD2 (splice ratio cited from [19]). ‡Alternative splicing at FBXO17 intron 3 was not experimentally reproducible in this study.
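The motif notation and the EST support criterion described in the table footnote can be illustrated with a minimal Python sketch. The function names `parse_motif` and `passes_est_filter` are hypothetical and not taken from the article; the sketch assumes each motif string contains exactly one comma (TG site) and one vertical line (canonical site), and it applies only the "at least two ESTs and at least 3%" rule, not the RefSeq-based exceptions.

```python
def parse_motif(motif: str) -> dict:
    """Parse a 'Motif' entry such as 'TTG,GCTCCTTAG|'.

    A comma follows the TG splice site and a vertical line follows the
    canonical splice site (notation from the table footnote).
    """
    bases = []
    marks = {}
    for ch in motif:
        if ch == ",":
            marks["tg"] = len(bases)         # number of bases before the comma
        elif ch == "|":
            marks["canonical"] = len(bases)  # number of bases before the line
        else:
            bases.append(ch)
    if set(marks) != {"tg", "canonical"}:
        raise ValueError("motif needs one ',' and one '|' marker")
    sequence = "".join(bases)
    tg_pos, can_pos = marks["tg"], marks["canonical"]
    return {
        "sequence": sequence,
        "tg_dinucleotide": sequence[max(0, tg_pos - 2):tg_pos],
        "canonical_dinucleotide": sequence[max(0, can_pos - 2):can_pos],
        "distance": abs(can_pos - tg_pos),   # nt between the two splice sites
        "tg_upstream": tg_pos < can_pos,
    }


def passes_est_filter(tg_ests: int, covering_ests: int,
                      min_ests: int = 2, min_fraction: float = 0.03) -> bool:
    """Apply the footnote's rule: >= 2 supporting ESTs and >= 3% of covering ESTs."""
    if covering_ests <= 0:
        return False
    return tg_ests >= min_ests and tg_ests / covering_ests >= min_fraction


if __name__ == "__main__":
    info = parse_motif("CAG|GATG,")
    print(info["distance"], info["tg_dinucleotide"], info["tg_upstream"])
```

For the motifs listed in the table, the distance computed this way matches the 'Distance' column, for example 9 for `TTG,GCTCCTTAG|` and 4 for `CAG|GATG,`.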
Table 2. Validation and quantification of alternative splice variants
| Splice junction | Tissue | Fraction of TG splice variant | Method |
| Intron 3, exon junction 3-4 | Leukocytes | 0.14 | n = 115 |
| Intron 1 | Placenta | 0.99 | n = 89 |
| Intron 3, indel AAG | Leukocytes | 0.52 | n = 69 |
|  | Brain | 0.38 | n = 58 |
|  | Placenta | 0.33 | n = 70 |
| Intron 3, indel TTGAAG | Leukocytes | 0.01 | n = 69 |
| Intron 3 | Leukocytes | - | n = 96 |
|  | Liver | - | n = 151 |
| Intron 3 | Leukocytes | 0.09 | n = 110 |
|  | Brain | - | Direct sequencing |
| Intron 6 | Lung | - | n = 90 |
|  | Brain | 0.19 | n = 92 |
| Intron 2 | Leukocytes | 0.02 | n = 142 |
| Intron 1 | Heart | 0.03 | n = 91 |
| Intron 9, indel GAG | Brain | 0.85 | n = 90 |
| Intron 9, indel TTGGAG | Brain | 0.03 | n = 90 |
| Intron 18 | Leukocytes | 0.55 | n = 37 |
| Intron 7 | Leukocytes | 0.54 | n = 47 |
| Intron 1 | Testis | 0.50 | Direct sequencing |
In the 'Method' column, n represents the number of subclones sequenced.
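Because each fraction rests on a finite number of sequenced subclones, its sampling uncertainty depends on n. The following Python sketch computes a Wilson score interval for a reported fraction and subclone count. This is an illustration only: the article does not state that such intervals were calculated, and the function name `wilson_interval` and the default z = 1.96 (roughly 95% coverage) are assumptions.

```python
import math


def wilson_interval(fraction: float, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion observed in n subclones.

    `fraction` is the observed TG splice fraction, `n` the number of
    subclones sequenced, and `z` the normal quantile (assumed 1.96).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    denom = 1.0 + z * z / n
    centre = (fraction + z * z / (2 * n)) / denom
    half = z * math.sqrt(fraction * (1 - fraction) / n
                         + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Example values taken from the first data row: fraction 0.14, n = 115.
    low, high = wilson_interval(0.14, 115)
    print(f"{low:.3f} to {high:.3f}")
```

The interval narrows as n grows, so a fraction derived from n = 37 subclones is less tightly constrained than one derived from n = 142.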
Figure 2. Tissue-specific fractions of TG-derived splice variants. (a) BRUNOL4 (values are as shown in Table 2); (b) CNBP; and (c) ARS2. Pyrosequencing assays (for (b,c)) were performed multiple times for each sample (two to four times). Error bars depict the standard deviation of individual measurements.
Figure 3. Conservation of the TG splice site found in intron 7 of the RYK gene from human to chicken. (a) Human genomic sequence and derived splice variants. Canonical (filled triangle) and TG 3' splice site (open triangle) are marked. (b) Alignment of orthologous exon-intron boundary regions from several vertebrate genomes, splice sites highlighted as in (a). Numbers on the right display the ratios of species-specific ESTs for the TG and AG splice sites, respectively.
Figure 4. Intron flank conservation of TG-AG splice acceptor tandems. Orthologous human/mouse intron-exon boundaries involving TG splice sites are displayed in a two-dimensional plot according to two properties: horizontal axis = sequence identity of 50 nt sequence upstream of both splice sites; vertical axis = relative abundance of the TG-derived splice variant, as reflected by the fraction of TG-spliced ESTs (except for CACNA1A, where the data are taken from Table 2). Data points are labeled with the gene symbol if the conservation score and/or the fraction of TG-derived splice variant are significantly high. Conservation properties of canonical introns are indicated by shaded intervals: black line = median; dark gray = 66% percentile; light gray = 90% percentile.
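The horizontal axis of Figure 4 is a sequence identity over the 50 nt upstream of the splice sites. A minimal Python sketch of such a score follows. It assumes the two orthologous sequences are already aligned to equal length, end at the upstream splice site, and use '-' for gaps; the function name `upstream_identity` and the choice to count gap columns as mismatches are assumptions, since the caption does not specify how identity was computed.

```python
def upstream_identity(human: str, mouse: str, window: int = 50) -> float:
    """Fraction of identical columns in the last `window` aligned positions.

    Both inputs are assumed to be pre-aligned, equal-length sequences that
    end immediately upstream of the splice site; '-' marks an alignment gap.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or not human:
        raise ValueError("window must be positive and sequences non-empty")
    h = human[-window:].upper()
    m = mouse[-window:].upper()
    # A column counts as identical only if both bases match and are not gaps.
    matches = sum(1 for a, b in zip(h, m) if a == b and a != "-")
    return matches / len(h)


if __name__ == "__main__":
    # Hypothetical 12 nt toy alignment, not taken from the article.
    print(upstream_identity("TTTCCTTTGCAG", "TTTCCTCTGCAG", window=12))
```

Sequences shorter than the window are scored over their full length, which keeps the sketch usable for short introns such as the 95 nt intron listed in Table 1.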