Wei-Hua Chen, Guanting Lv, Congying Lv, Changqing Zeng, Songnian Hu.
Abstract
BACKGROUND: Alternative splicing (AS) contributes significantly to protein diversity by selectively using different combinations of exons of the same gene under certain circumstances. One particular type of AS is the use of alternative first exons (AFEs), which can have consequences far beyond the fine-tuning of protein functions. For example, AFEs may change the N-termini of proteins and thereby direct them to different cellular compartments. When alternative first exons are distant from each other, they are usually associated with alternative promoters, thereby conferring an extra level of gene expression regulation. However, only a few studies have examined the patterns of AFEs, and these analyses focused mainly on mammalian genomes. Recent studies have shown that AFEs exist in the rice genome and are regulated in a tissue-specific manner. Our current understanding of AFEs in plants is still limited, including important issues such as their regulation, contribution to protein diversity, and evolutionary conservation.
Year: 2007 PMID: 17941993 PMCID: PMC2174465 DOI: 10.1186/1471-2229-7-55
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Acquired data
| Species | Sequence | Datasets | Database |
| Rice | General EST | 1,211,078 | NCBI dbEST |
| | mRNA | 23,309 | NCBI CoreNucleotide |
| | Full-length cDNA | 32,127 | KOME** |
| | Genome | | IRGSP* Release 4.0 |
| Arabidopsis | General EST | 734,275 | NCBI dbEST |
| | mRNA | 30,476 | NCBI CoreNucleotide |
| | Full-length cDNA | 15,294 | RIKEN RAFL*** |
| | Genome | | NCBI Genomes |
*IRGSP stands for International Rice Genome Sequencing Project
**KOME stands for Knowledge-based Oryza Molecular biological Encyclopedia
*** RAFL stands for RIKEN Arabidopsis Full-length cDNA clones
Figure 1. Diagrammatic view of different types of AFE events. Alternative first exons are highlighted in orange and green. Constitutive exons are drawn in dark blue. Other alternatively spliced exons are drawn in brown. (A) Type I AFE clusters. Alternative first exons are mutually exclusive in different gene structures. (B) Type II AFE clusters. The first exon of one transcript is (part of) a downstream exon of other transcripts. (C) Some AFEs are coupled with downstream alternative splicing events.
Results of AFE analysis in rice and Arabidopsis
| | Rice | Arabidopsis |
| Type I AFE | 137 | 99 |
| N-terminal diversification | 53 | 20 |
| Overlapping with functional domain | 5 | 1 |
| Putative alternative promoter (PAP) | 62 | 22 |
| Both N-terminal and PAP | 3 | 7 |
| NMD | 47 | 10 |
| Type II AFE | 1,241 | 546 |
| N-terminal diversification | 213 | 298 |
| Overlapping with functional domain | 56 | 71 |
| Putative alternative promoter (PAP) | 257 | 352 |
| Both N-terminal and PAP | 189 | 244 |
| NMD | 237 | 42 |
| Total | 1,378 | 645 |
Figure 2. Chromosomal distribution of AFE-containing clusters. The distribution of AFEs on Arabidopsis chromosomes was determined using the alignment positions of AFE-clusters.
5' splice site analysis of AFEs
| | Constitutive (± SD) * | AFE Type I Total | AFE Type I Major** | AFE Type I Minor** | AFE Type II Total | AFE Type II Major** | AFE Type II Minor** |
| Rice | 9.310 ± 3.72 | 7.87 ± 4.11 | 7.75 ± 4.23 | 7.75 ± 3.91 | 8.61 ± 4.01 | 7.75 ± 4.03 | 8.98 ± 3.20 |
| Comparison with constitutive sites *** | | 1.3063e-011 | 5.7841e-007 | 1.3907e-006 | 3.1057e-010 | 1.0233e-029 | 0.9846 |
| Arabidopsis | 8.00 ± 2.89 | 7.39 ± 3.23 | 8.20 ± 3.03 | 5.89 ± 3.07 | 8.44 ± 2.93 | 8.42 ± 2.84 | 8.40 ± 3.02 |
| Comparison with constitutive sites *** | | 0.0013 | 0.4077 | 3.2361e-012 | 9.4224e-005 | 0.0062 | 0.0151 |
* The 5' splice site scores were predicted by GeneSplicer. Higher score indicates stronger splicing signal.
** Major and minor types of alternative first exons within each gene cluster were determined as described in the Methods section.
*** P-values were determined using t-tests.
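The comparison rows can be illustrated with a small sketch of a two-sample t-test computed from summary statistics. Everything below is an assumption made for illustration: the table does not report sample sizes or state which t-test variant was used, so the sketch uses Welch's unequal-variance statistic, hypothetical sample sizes, and a normal approximation to the p-value that is only reasonable for large samples. The function names are hypothetical.

```python
from math import erfc, sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic from group means, standard deviations and sizes."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def two_sided_p_normal(t):
    """Two-sided p-value using the normal approximation (large samples only)."""
    return erfc(abs(t) / sqrt(2))


# Means and SDs are taken from the rice row (constitutive vs. AFE Type I total);
# the sample sizes 5000 and 137 are hypothetical placeholders.
t_stat = welch_t(9.31, 3.72, 5000, 7.87, 4.11, 137)
print(t_stat, two_sided_p_normal(t_stat))
```

Because the sample sizes are placeholders, the printed values are not expected to reproduce the p-values in the table.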
Secondary structure formation analysis at 5' splice sites of AFEs
| | Constitutive (± SD) * | AFE Type I Total | AFE Type I Major** | AFE Type I Minor** | AFE Type II Total | AFE Type II Major** | AFE Type II Minor** |
| Rice | -19.22 ± 5.59 | -23.61 ± 8.62 | -24.28 ± 8.37 | -23.00 ± 8.79 | -22.45 ± 7.8 | -24.7 ± 8.51 | -20.37 ± 6.46 |
| Comparison with constitutive sites *** | | 3.2796e-071 | 1.8749e-061 | 9.6957e-035 | 9.6069e-082 | 1.7511e-160 | 3.0208e-012 |
| Arabidopsis | -17.80 ± 4.33 | -15.09 ± 5.10 | -14.59 ± 5.38 | -15.60 ± 4.62 | -16.52 ± 4.98 | -16.47 ± 4.89 | -16.46 ± 5.29 |
| Comparison with constitutive sites *** | | 1.6711e-028 | 4.5892e-022 | 1.3987e-011 | 4.7938e-015 | 1.9863e-009 | 2.9444e-009 |
* Secondary structure formation was measured as Minimal Folding Energy (MFE) by MRNAFOLD. Lower scores indicate a higher likelihood that the input sequence forms a secondary structure.
** Major and minor types of alternative first exons within each gene cluster were determined as described in the Methods section.
*** P-values were determined using t-tests.
Functional categories (GO) significantly biased in AFE-containing clusters in Arabidopsis
| | GO category | AFE-containing clusters | P-value* |
| Enriched** | cellular physiological process | 327 | 0 |
| | metabolism | 297 | 0 |
| | nucleotide binding | 65 | 0 |
| | catalytic activity | 27 | 1.52E-10 |
| | transferase activity | 104 | 1.35E-09 |
| | ligase activity | 25 | 1.73E-08 |
| | hydrolase activity | 89 | 1.20E-07 |
| | ubiquitin ligase complex | 13 | 1.24E-07 |
| | intracellular part | 259 | 1.94E-07 |
| | intracellular | 265 | 2.42E-07 |
| | cell part | 368 | 7.82E-06 |
| | membrane part | 37 | 4.80E-05 |
| | nucleic acid binding | 91 | 0.000128 |
| | lyase activity | 18 | 0.000265 |
| | localization | 51 | 0.000476 |
| Depleted | triplet codon-amino acid adaptor activity | 0 | 5.61E-06 |
* P-value was calculated by the hypergeometric distribution. The cutoff is 1E-5.
** "Enriched" categories refer to those containing significantly more genes (observed) than expected. "Depleted" categories refer to those containing significantly fewer genes (observed) than expected.
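As a rough illustration of the hypergeometric calculation mentioned in the footnote, the sketch below computes one-sided tail probabilities for enrichment and depletion of a GO category. The function names, the choice of one-sided tails, and the example counts are assumptions for illustration; the source does not specify the background gene set or the exact tail definition.

```python
from math import comb


def enrichment_p(k, N, K, n):
    """P(X >= k): chance of seeing at least k category members in the sample.

    N = genes in the background, K = background genes in the GO category,
    n = genes in the sample (e.g. AFE-containing clusters), k = observed hits.
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def depletion_p(k, N, K, n):
    """P(X <= k): chance of seeing at most k category members in the sample."""
    upper = min(k, K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(0, upper + 1))
    return numerator / comb(N, n)


# Hypothetical counts: 2,000 background genes, 100 in the category,
# 200 sampled genes, 25 of which fall in the category (10 expected).
print(enrichment_p(25, 2000, 100, 200))
print(depletion_p(25, 2000, 100, 200))
```

The numerator is summed as exact integers and divided once, which avoids accumulating floating-point error across the tail. Very small probabilities may still underflow to 0 in floating point.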
Functional categories (GO) significantly biased in AFE-containing clusters in Rice.
| | GO category | AFE-containing clusters | P-value |
| Enriched | metabolism | 468 | 0 |
| | cellular physiological process | 595 | 0 |
| | nucleotide binding | 155 | 0 |
| | hydrolase activity | 144 | 0 |
| | transferase activity | 131 | 0 |
| | oxidoreductase activity | 79 | 0 |
| | ion binding | 65 | 0 |
| | nucleic acid binding | 147 | 1.02E-14 |
| | helicase activity | 17 | 2.78E-09 |
| | catalytic activity | 45 | 1.04E-08 |
| | lyase activity | 24 | 1.95E-08 |
| | regulation of cellular process | 50 | 3.95E-08 |
| | regulation of physiological process | 50 | 4.25E-08 |
| | non-membrane-bound organelle | 35 | 4.98E-08 |
| | ligase activity | 32 | 6.29E-08 |
| | ATPase activity, coupled to movement of substances | 20 | 7.01E-08 |
| | organelle part | 35 | 7.38E-08 |
| | intracellular organelle part | 35 | 7.38E-08 |
| | membrane | 208 | 1.32E-07 |
| | carrier activity | 27 | 2.15E-07 |
| | membrane part | 32 | 1.24E-06 |
| | protein binding | 26 | 1.66E-06 |
| | ion transporter activity | 23 | 2.67E-06 |
| | ribonucleoprotein complex | 23 | 1.38E-05 |
| | microtubule associated complex | 7 | 2.78E-05 |
| | cell communication | 22 | 3.91E-05 |
| | amine binding | 6 | 4.49E-05 |
| | protein transporter activity | 9 | 0.000192 |
| | response to endogenous stimulus | 13 | 0.000197 |
| | unlocalized protein complex | 5 | 0.000212 |
| | cofactor binding | 6 | 0.000212 |
| | ATP-binding cassette (ABC) transporter complex | 7 | 0.000245 |
| | ubiquitin ligase complex | 18 | 0.000306 |
| | nuclear pore | 3 | 0.000338 |
| Depleted | membrane-bound organelle | 860 | 1.47E-52 |
| | intracellular organelle | 878 | 9.04E-47 |
| | intracellular part | 905 | 4.36E-39 |
| | intracellular | 911 | 7.83E-38 |
| | cell part | 1,004 | 2.46E-33 |
Figure 3. Gene Ontology (GO) categories of AFE-containing clusters in rice and Arabidopsis. The genes were functionally categorized according to the Gene Ontology Consortium, and level two of the assignment results is plotted here. 87% (1,204 of a total of 1,378) of AFE-containing clusters from rice and 94% (605 of a total of 645) of AFE clusters from Arabidopsis were classified by GO.
Figure 4. Gene Ontology (GO) categories of two types of AFE-containing clusters in rice and Arabidopsis. The genes were functionally categorized according to the Gene Ontology Consortium, and level two of the assignment results is plotted here. GO categories of the two types of AFE-containing clusters are plotted for rice (A) and Arabidopsis (B), respectively.
Tissue- and development stage- specific expression of AFEs in rice and Arabidopsis
| | | Tissue specific* | Development stage specific* | Both |
| Rice | HC** | 390 | 273 | 200 |
| | LC** | 914 | 713 | 624 |
| Arabidopsis | HC | 31 | 44 | 21 |
| | LC | 55 | 113 | 39 |
* Tissue- and development stage-specific gene expression was determined using the methods suggested by Qiang Xu et al.
** High confidence (HC) tissue specificity was defined as TS>50, rTS>0.9 and rTS~>0.9; low confidence (LC) was defined as TS>0, rTS>0.5 and rTS~>0.5 (see Methods).
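The thresholds in the footnote above can be written as a small classifier. This is only a sketch: how TS, rTS and rTS~ are computed is described in the Methods, which are not reproduced here, so the function takes them as precomputed inputs. The function name and the choice to let HC take precedence over LC are assumptions; the footnote does not say whether the LC counts include the HC cases.

```python
def confidence_class(ts, rts, rts_alt):
    """Classify a specificity call from precomputed scores.

    ts      -- TS score
    rts     -- rTS score
    rts_alt -- rTS~ score (named rts_alt here only because ~ is not valid
               in a Python identifier)
    Returns "HC", "LC", or None when neither threshold set is met.
    """
    if ts > 50 and rts > 0.9 and rts_alt > 0.9:
        return "HC"  # high confidence: TS>50, rTS>0.9, rTS~>0.9
    if ts > 0 and rts > 0.5 and rts_alt > 0.5:
        return "LC"  # low confidence: TS>0, rTS>0.5, rTS~>0.5
    return None


# Hypothetical scores, for illustration only.
for scores in [(60, 0.95, 0.95), (10, 0.6, 0.6), (0, 1.0, 1.0)]:
    print(scores, confidence_class(*scores))
```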