Jun-Ichi Takeda, Yutaka Suzuki, Ryuichi Sakate, Yoshiharu Sato, Masahide Seki, Takuma Irie, Nono Takeuchi, Takuya Ueda, Mitsuteru Nakao, Sumio Sugano, Takashi Gojobori, Tadashi Imanishi.
Abstract
Using full-length cDNA sequences, we compared alternative splicing (AS) in humans and mice. The alignment of the human and mouse genomes showed that 86% of 199 426 total exons in human AS variants were conserved in the mouse genome. Of the 20 392 total human AS variants, however, 59% consisted of all conserved exons. Comparing AS patterns between human and mouse transcripts revealed that only 431 transcripts from 189 loci were perfectly conserved AS variants. To exclude the possibility that the full-length human cDNAs used in the present study, especially those with retained introns, were cloning artefacts or prematurely spliced transcripts, we experimentally validated 34 such cases. Our results indicate that even retained-intron type transcripts are typically expressed in a highly controlled manner and interact with translating ribosomes. We found non-conserved AS exons to be predominantly outside the coding sequences (CDSs). This suggests that non-conserved exons in the CDSs of transcripts cause functional constraint. These findings should enhance our understanding of the relationship between AS and species specificity of human genes.
Year: 2008 PMID: 18838389 PMCID: PMC2582632 DOI: 10.1093/nar/gkn677
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Comparative analysis of human and mouse AS sequences. (A) Categories of conserved exons. Full-length exons and coding regions of exons were included in the analysis, and the highest conservation level was selected. Boxes indicate exons; thin straight lines indicate introns. (B) Categories of conserved AS variants within the categorized exons in (A). Conservation levels were determined from the lowest conservation level of each transcript's exons. (C) Equally spliced variants (ESVs) and conserved AS sequences represent higher categories of transcript-conserved variants. ESVs are transcript-conserved variants with similar combinations of exons in different species. Conserved AS sequences are two or more different ESV pairs at a particular locus in multiple species. Additional details are available in the Results section and the Materials and methods section.
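The two selection rules in panels (A) and (B) can be sketched in a few lines of Python. This is only an illustration of the caption's wording, not the authors' pipeline: the `Level` enum, its ordering (non-conserved < genome-conserved < transcript-conserved) and both function names are assumptions introduced here.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumed ordering, taken from the column order of the conservation table.
    NON_CONSERVED = 0
    GENOME_CONSERVED = 1
    TRANSCRIPT_CONSERVED = 2


def exon_level(full_length: Level, coding_region: Level) -> Level:
    """Panel A rule: keep the highest level found for the exon."""
    return max(full_length, coding_region)


def transcript_level(exon_levels: list[Level]) -> Level:
    """Panel B rule: a transcript is only as conserved as its least conserved exon."""
    return min(exon_levels)


# Hypothetical three-exon transcript: (full-length level, coding-region level) per exon.
exons = [
    exon_level(Level.TRANSCRIPT_CONSERVED, Level.GENOME_CONSERVED),
    exon_level(Level.GENOME_CONSERVED, Level.NON_CONSERVED),
    exon_level(Level.TRANSCRIPT_CONSERVED, Level.TRANSCRIPT_CONSERVED),
]
example_level = transcript_level(exons)
print(example_level.name)
```

Under these assumptions a single weakly conserved exon is enough to lower the category of the whole variant, which is consistent with transcript-level conservation being lower than exon-level conservation.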
Genomic conservation of AS in humans and mice
Panel A: exon-level conservation of human AS variants, compared with the mouse genome

| | Total | Non-conserved | Genome-conserved | Transcript-conserved |
|---|---|---|---|---|
| All exons | 199 426 | 27 879 (14%) | 23 412 (12%) | 148 135 (74%) |
| AS exons | 49 842 | 12 196 (25%) | 9064 (18%) | 28 582 (57%) |

Panel B: transcript-level conservation of human AS variants, compared with the mouse genome and transcripts

| | Total | Non-conserved | Genome-conserved | Transcript-conserved |
|---|---|---|---|---|
| AS variants | 20 392 | 8296 (41%) | 4410 (21%) | 7686 (38%) |
| AS loci | 7601 | 1716 (23%) | 1241 (16%) | 4644 (61%) |

Panel C: higher conservation categories for transcript-conserved AS sequences

| | Transcript-conserved total | Non-equally spliced variant | ESV | Conserved AS |
|---|---|---|---|---|
| AS variants | 7686 | 2631 (34%) | 4624 (60%) | 431 (6%) |
| AS loci | 4644 | 885 (19%) | 3570 (77%) | 189 (4%) |
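The abstract's headline figures (86% of exons and 59% of AS variants) can be recovered from Panels A and B, assuming that "conserved" there means genome-conserved plus transcript-conserved. The short Python check below uses counts copied from the table; the dictionary keys and the helper name are illustrative only.

```python
# Counts copied from Panels A and B of the table above.
all_exons = {"total": 199_426, "genome_conserved": 23_412, "transcript_conserved": 148_135}
as_variants = {"total": 20_392, "genome_conserved": 4_410, "transcript_conserved": 7_686}


def conserved_percent(row: dict) -> float:
    """Share of the row total that is conserved at either level, in percent."""
    conserved = row["genome_conserved"] + row["transcript_conserved"]
    return 100 * conserved / row["total"]


print(f"{conserved_percent(all_exons):.1f}%")    # 86.0%
print(f"{conserved_percent(as_variants):.1f}%")  # 59.3%
```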
Figure 2. Conserved AS sequences, exemplified by (A) PI3-kinase regulatory subunit and (B) CARS. Constitutive introns are shortened here and additional details are available in the Results section. Boxes indicate exons. Filled regions within boxes indicate CDSs (green), protein motifs (red) and untranslated regions (yellow). The ESVs shared by humans and mice are indicated by arrows.
Comparison of AS features in total and conserved AS loci
Panel A: AS patterns

| | Total AS loci | Conserved AS loci |
|---|---|---|
| Total | 7601 | 189 |
| Cassette (Skipped exon) | 3584 (35%) | 66 (42%) |
| Internal acceptor (Alternative 3′ splice) | 1988 (19%) | 30 (19%) |
| Internal donor (Alternative 5′ splice) | 1990 (20%) | 33 (21%) |
| Mutually exclusive | 237 (2%) | 4 (2%) |
| Retained intron | 2477 (24%) | 26 (16%) |

Panel B: Effects of AS on protein function

| | Total AS loci | Conserved AS loci |
|---|---|---|
| Total | 7601 | 189 |
| Average difference in CDS length | 87 aa | 52 aa |
| Total effects of AS on function | 3395 (45%) | 125 (66%) |
| Protein-motif alterations | 2423 | 86 |
| GO alterations | 1078 | 30 |
| Sub-cellular localization changes | 2305 | 75 |
| Transmembrane domain changes | 444 | 16 |
GO terms and protein motifs frequently observed at conserved AS loci
Panel A: GO terms

| GO ID | Definition | Number of conserved AS loci | Total number of AS loci | P-value |
|---|---|---|---|---|
| GO:0003677 | DNA binding | 17 | 333 | 0.0085* |
| GO:0004601 | Peroxidase activity | 3 | 14 | 0.0077* |
| GO:0006979 | Response to oxidative stress | 3 | 14 | 0.0077* |

Panel B: protein motifs

| InterPro ID | Definition | Number of conserved AS loci | Total number of AS loci | P-value |
|---|---|---|---|---|
| IPR004827 | Basic-leucine zipper transcription factor | 3 | 4 | 0.0005* |
| IPR000580 | TSC-22/Dip/Bun | 2 | 3 | 0.0057* |
P-values were calculated using Fisher's exact test. They indicate the significance of the difference between the proportion of conserved AS loci (189) and the proportion of total AS loci (7601) that carry each term or motif. *P < 0.01.
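As an illustration of how such a P-value can be computed, the sketch below implements a two-sided Fisher's exact test with only the Python standard library. The 2×2 layout is an assumption: it treats the conserved AS loci (189) and the total AS loci (7601) as two separate groups, each split into loci with and without the term. The record does not state the layout or whether the test was one- or two-sided, so the results need not match the printed values exactly. The function name is introduced here for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)

    def weight(k: int) -> int:
        # Unnormalised hypergeometric probability, kept as an exact integer.
        return comb(row1, k) * comb(n - row1, col1 - k)

    observed = weight(a)
    tail = sum(weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed)
    return tail / comb(n, col1)


# Assumed layout: [with term, without term] for conserved loci, then for total loci.
CONSERVED, TOTAL = 189, 7601
rows = {
    "GO:0003677 DNA binding": (17, 333),
    "GO:0004601 Peroxidase activity": (3, 14),
    "IPR004827 Basic-leucine zipper": (3, 4),
}
p_values = {
    name: fisher_exact_two_sided(k_cons, CONSERVED - k_cons, k_tot, TOTAL - k_tot)
    for name, (k_cons, k_tot) in rows.items()
}
for name, p in p_values.items():
    print(f"{name}: P = {p:.4f}")
```

Comparing unnormalised integer weights avoids floating-point ties when deciding which tables count as "at least as extreme" as the observed one.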
Figure 3. Experimental validation of AS human transcripts by using the retained-intron pattern. (A) RNA expression of retained-intron AS sequences in 20 human tissues (1, adrenal gland; 2, bone marrow; 3, brain, cerebellum; 4, brain, whole; 5, fetal brain; 6, fetal liver; 7, heart; 8, kidney; 9, liver; 10, lung, whole; 11, placenta; 12, prostate; 13, salivary gland; 14, skeletal muscle; 15, testis; 16, thymus; 17, thyroid gland; 18, trachea; 19, uterus and 20, spinal cord). The upper panel exemplifies ‘ubiquitous’ retained-intron AS sequences and the lower panel exemplifies ‘tissue-preferred’ retained-intron AS sequences. (B) RT–PCR analysis using polysomal fractions isolated from the human promyelocytic leukaemia cell line HL60. (C) RNA expression of retained-intron AS sequences mixed with translating ribosome fractions (i.e. polysome fractions) from the HL60 cell line. (D) Number of expressed transcripts in each conserved category.
Relationship between conservation and splicing in human alternatively spliced variant exons
| | Total | CDS-related | Protein-motif-related | Retrotransposon-related | Exonic-splicing-enhancer-related |
|---|---|---|---|---|---|
| C/CS exons | 133 901 | 95 583 (71%) | 27 805 (21%) | 2523 (2%) | 130 104 (97%) |
| C/AS exons | 37 646 | 20 805 (55%) | 6192 (16%) | 2308 (6%) | 35 702 (95%) |
| NC/CS exons | 15 683 | 7030 (45%) | 1898 (12%) | 2549 (16%) | 14 961 (95%) |
| NC/AS exons | 12 196 | 3516 (29%) | 812 (7%) | 4544 (37%) | 11 701 (96%) |
| All exons | 199 426 | 126 934 (64%) | 36 707 (18%) | 11 924 (6%) | 192 468 (97%) |
C, conserved; NC, non-conserved.
Figure 4. Expression pattern of parent genes in H-ANGEL that have GAGE or T-complex 11 motifs. The expression pattern was examined by iAFLP, GeneChip and BodyMap-EST. The height of the bars indicates the percentage in 10 tissue categories. The names of the tissues are as follows (from the left): neural, blood/spleen/lymph node dissection (LND), dermal/connective, placenta/testis/ovary, muscle/heart, stomach/colon, liver, lung, kidney/bladder and endocrine/exocrine. According to the more detailed categories in H-ANGEL, the tissue with specific expression here was testis.