Bing-Bing Wang, Mike O'Toole, Volker Brendel, Nevin D Young.
Abstract
BACKGROUND: Although originally thought to be less frequent in plants than in animals, alternative splicing (AS) is now known to be widespread in plants. Here we report the characteristics of AS in legumes, one of the largest and most important plant families, based on EST alignments to the genome sequences of Medicago truncatula (Mt) and Lotus japonicus (Lj).
Year: 2008 PMID: 18282305 PMCID: PMC2277414 DOI: 10.1186/1471-2229-8-17
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Table 1. Transcript alignments, intron and exon features in plants
| | Medicago | Lotus# | Arabidopsis | Rice |
| EST/cDNA total | 225,920 | 150,855 | 691,516 | 1,009,754 |
| Mapped to genome^ | 104,382 (46.2%) | 22,144 (14.7%)* | 589,254 (85.2%) | 916,825 (90.8%) |
| Transcription unit (TU)/Genes | 11,516 | 3,298 | 22,518 | 31,044 |
| MultiEST TU/Genes | 8,544 (74.2%) | 1,879 (57.0%) | 19,857 (88.2%) | 26,859 (86.5%) |
| Average (Median) ESTs/gene | 9.8 (4) | 6.9 (2) | 26.3 (11) | 30.1 (10) |
| Number of Introns | 32,860 | 4,357 | 97,095 | 107,162 |
| Average (Median) intron size | 472 (218) | 458 (215) | 171 (101) | 438 (164) |
| Long intron (>1000 nt) | 12.7% | 10.9% | 0.7% | 10.7% |
| Number of internal exons | 24,600 | 2,717 | 78,911 | 83,668 |
| Average (Median) internal exon | 140 (108) | 127 (100) | 164 (114) | 175 (113) |
^ Transcript sequences are required to have >95% identity and >80% coverage to be considered as mapped.
# Lotus data are based on the ESTs aligned to finished TACs (phase 3).
* A total of 48,691 (32.3% of 150,855) transcript sequences can be mapped to Lj TACs in all phases, including phase 1, phase 2 and phase 3.
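The mapping criterion in the footnote above (>95% identity and >80% coverage) can be illustrated with a minimal sketch. The `Alignment` record, its field names, and the exact definitions of identity and coverage are assumptions made for illustration; the source gives only the two thresholds, not the alignment tool's output format.

```python
from dataclasses import dataclass

MIN_IDENTITY = 0.95  # threshold from the footnote: >95% identity
MIN_COVERAGE = 0.80  # threshold from the footnote: >80% coverage


@dataclass
class Alignment:
    """Hypothetical record for one transcript-to-genome alignment."""
    transcript_id: str
    identity: float  # assumed: fraction of aligned bases that match (0..1)
    coverage: float  # assumed: fraction of the transcript that is aligned (0..1)


def is_mapped(aln: Alignment) -> bool:
    """True when the alignment strictly exceeds both thresholds."""
    return aln.identity > MIN_IDENTITY and aln.coverage > MIN_COVERAGE


def mapped_fraction(alignments, total_transcripts: int) -> float:
    """Fraction of all transcripts with at least one passing alignment."""
    mapped_ids = {a.transcript_id for a in alignments if is_mapped(a)}
    return len(mapped_ids) / total_transcripts
```

Counting distinct transcript identifiers, rather than alignments, keeps a transcript with several passing alignments from being counted more than once.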
Figure 1. Size distributions of introns and internal exons in plants. The x-axis indicates the size of either introns (A) or internal exons (B). Each number except the last one is labeled with the upper bound (e.g., 100 nt comprises size 51–100 nt). The y-axis indicates the fraction of total introns (A) or internal exons (B) in a given size range. The insets show a detailed distribution of smaller (<300 nt) introns (A) or internal exons (B). The bin size is 10, and 100 nt comprises size 91–100 nt for the insets.
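The upper-bound labeling described in the caption (51–100 nt labeled 100 with a bin size of 50; 91–100 nt labeled 100 with a bin size of 10) can be sketched as follows. The function names and the `max_bound` parameter for the open-ended last bin are hypothetical; the caption does not state where the last bin starts.

```python
from collections import Counter


def bin_label(size_nt: int, bin_width: int, max_bound: int) -> str:
    """Label a size by the upper bound of its bin.

    Sizes above max_bound (an assumed cutoff) fall into one open-ended bin.
    """
    if size_nt <= 0:
        raise ValueError("size must be positive")
    if size_nt > max_bound:
        return f">{max_bound}"
    # Integer ceiling division: 51..100 -> 100 for bin_width=50.
    upper = -(-size_nt // bin_width) * bin_width
    return str(upper)


def size_fractions(sizes, bin_width: int, max_bound: int) -> dict:
    """Fraction of all features falling into each labeled bin."""
    counts = Counter(bin_label(s, bin_width, max_bound) for s in sizes)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

Calling `size_fractions(intron_sizes, 50, 1000)` would give values comparable to the y-axis of the main panels, and a bin width of 10 corresponds to the insets; both cutoffs here are illustrative.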
Table 2. Comparison of alternative splicing events and frequencies in plants
| | Medicago | Lotus# | Arabidopsis | Rice |
| AltD | 204 (13.5%) | 18 (15.7%) | 818 (11.3%) | 1,165 (9.6%) |
| AltA | 350 (23.1%) | 37 (32.2%) | 1,785 (24.7%) | 2,377 (19.5%) |
| AltP | 21 (1.4%) | 2 (1.7%) | 106 (1.5%) | 306 (2.5%) |
| ExonS | 162 (10.7%) | 10 (8.7%) | 445 (6.2%) | 1,332 (10.9%) |
| IntronR | 778 (51.3%) | 48 (41.7%) | 4,062 (56.3%) | 7,011 (57.5%) |
| Total | 1,515 | 115 | 7,216 | 12,191 |
| AS genes | 1,107 (9.6%) | 92 (2.8%) | 4,497 (20.0%) | 6,313 (20.3%) |
Percentages in parentheses for each alternative splicing type are the portion relative to the total events. Percentages for AS genes are the portion of alternatively spliced genes relative to the total number of expressed genes (genes/TU) in Table 1.
# Lotus data are based on the ESTs aligned to finished TACs (phase 3).
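The two kinds of percentage described in the table notes use different denominators: event types are divided by the total number of events, while AS genes are divided by the number of expressed genes (TU) from the transcript alignment table. A short sketch using the Medicago column; the helper name `percent` is an assumption, and values rounded this way may differ from the printed ones in the last digit.

```python
def percent(part: int, total: int) -> float:
    """Percentage of part in total, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Medicago event counts as listed in the table above.
medicago_events = {"AltD": 204, "AltA": 350, "AltP": 21, "ExonS": 162, "IntronR": 778}
total_events = sum(medicago_events.values())  # 1,515, matching the Total row

# Share of each AS type relative to all AS events.
event_shares = {name: percent(n, total_events) for name, n in medicago_events.items()}

# Share of AS genes relative to all expressed genes (11,516 TU/genes for Medicago).
as_gene_share = percent(1107, 11516)
```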
Figure 2. Correlation between AS frequency and EST coverage. The x-axis indicates groups of genes with certain numbers of ESTs. The primary y-axis for the bar graph indicates the total number of genes within each group. The secondary y-axis for the line graph indicates the fraction of alternatively spliced genes for the group. Note that different bin sizes were used to keep the number of genes in each group greater than 500 in At and Os. AS data from groups with fewer than 80 genes in Mt were removed to reduce noise. Lj data are not shown as only the first six groups have more than 80 genes.
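The grouping behind this figure can be sketched as: assign each gene to a group by its EST count, count the genes per group, and report the AS fraction only for groups that are large enough. The function name, the input layout, and the group boundaries are assumptions for illustration; only the 80-gene cutoff for Mt comes from the caption.

```python
from bisect import bisect_left


def as_fraction_by_coverage(genes, upper_bounds, min_genes=80):
    """Group genes by EST count and compute the AS fraction per group.

    genes: iterable of (est_count, is_alternatively_spliced) pairs.
    upper_bounds: sorted inclusive upper bounds of the EST-count groups;
        counts above the last bound go into a final open-ended group.
    Returns a list of (group_index, n_genes, fraction), where fraction is
    None for groups with fewer than min_genes genes.
    """
    n_groups = len(upper_bounds) + 1
    n_genes = [0] * n_groups
    n_as = [0] * n_groups
    for est_count, is_as in genes:
        group = bisect_left(upper_bounds, est_count)
        n_genes[group] += 1
        n_as[group] += 1 if is_as else 0
    threshold = max(min_genes, 1)  # also avoids dividing by zero
    return [
        (g, n_genes[g], n_as[g] / n_genes[g] if n_genes[g] >= threshold else None)
        for g in range(n_groups)
    ]
```

Returning `None` for small groups, instead of dropping them, keeps the bar-graph counts available while leaving the noisy fractions out of the line graph.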
Figure 3. Ratio of different AS types in a reliable subset of AS events. The reliable data set consisted of AS events with multiple supporting ESTs for each isoform. IntronR is still the most abundant AS type in the subset. The error bar represents the ratio for each AS type in the full data set described in Table 2.
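The reliable subset described in this caption can be sketched as a filter that keeps an event only when every isoform has multiple supporting ESTs, followed by a recount of the type ratios. The event layout (a dict with `type` and `support` keys) is hypothetical, and reading "multiple" as at least two ESTs is an assumption.

```python
from collections import Counter


def reliable_events(events, min_support=2):
    """Keep events in which every isoform has at least min_support ESTs.

    Each event is a dict with 'type' (e.g. 'IntronR') and 'support',
    a list with the number of supporting ESTs for each isoform.
    """
    return [
        e for e in events
        if len(e["support"]) >= 2 and min(e["support"]) >= min_support
    ]


def type_ratios(events):
    """Fraction of events belonging to each AS type."""
    counts = Counter(e["type"] for e in events)
    total = sum(counts.values())
    return {as_type: n / total for as_type, n in counts.items()}
```

Comparing `type_ratios(reliable_events(events))` with `type_ratios(events)` gives the same kind of contrast the figure draws between the reliable subset and the full data set.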
Table 3. Cross-species EST alignments in Medicago
| Species | EST/cDNA | Mapped to | Genes | Genes without | AS Genes | Novel AS* | Predicted introns | Consistent introns^ |
| Lotus | 150,855 | 15,542 (10.3%) | 2,955 | 367 (12.4%) | 12 (3.3%) | 8 | 5,606 | 4,256 |
| Soybean | 359,834 | 42,665 (11.9%) | 5,810 | 925 (15.9%) | 242 (4.2%) | 201 | 16,758 | 11,420 |
| Other legumes | 127,684 | 26,547 (20.8%) | 5,335 | 700 (13.1%) | 69 (1.3%) | 50 | 13,052 | 9,926 |
| Total | 638,373 | 84,754 (13.3%) | 7,896 | 1,475 (18.7%) | 307 (3.9%) | 248 | 23,179 | 15,506 |
* Novel AS genes are genes not identified as alternatively spliced by Mt ESTs.
^ Consistent introns are introns predicted from cross-species ESTs that are also supported by Mt ESTs.
Table 4. AS events predicted from cross-species EST alignment in Medicago
| Species | AS events | AltD | AltA | AltP | ExonS | IntronR |
| Lotus | 12 | 2 (16.7%) | 6 (50.0%) | 1 (8.3%) | 2 (16.7%) | 1 (8.3%) |
| Soybean | 276 | 40 (14.5%) | 75 (27.2%) | 5 (1.8%) | 53 (19.2%) | 103 (37.3%) |
| Other legumes | 87 | 20 (23.0%) | 26 (29.9%) | 2 (2.3%) | 7 (8.0%) | 32 (36.8%) |
| Total | 367 | 59 (16.1%) | 107 (29.1%) | 8 (2.2%) | 62 (16.9%) | 131 (35.7%) |
Figure 4. Completely conserved ExonS event in plant enoyl-CoA hydratase/isomerase genes. A. Same-species and cross-species EST alignments in Mt gene locus AC145499_47. Filled boxes and arrows indicate exons, and lines indicate introns. Green open or filled boxes indicate exons skipped or retained in certain ESTs. The top black scale indicates coordinates for the gene locus on the BAC (AC145499). The blue bar represents the IMGAG annotated gene model, with the green triangle representing the protein translation start codon and the red triangle representing the stop codon. Red bars represent individual same-species EST alignments. Purple bars represent Lj ESTs, dark yellow bars represent soybean ESTs, and gray bars represent ESTs from other legume species. B. Multiple sequence alignment of the mutually exclusive exons. E3 indicates Exon 3 and E4 indicates Exon 4. At2E3 refers to the exon in the second copy of the At gene (At4g13360). Amino acids encoded by the Mt sequences are listed at the top of the sequence alignment. Degenerate positions (where a nucleotide change does not change the amino acid) that are conserved in all exons are highlighted in color. C. EST alignment in the second copy of the At gene (At4g13360). Only exon E3 exists in this gene and no ExonS can be detected. D, E. EST alignments in At and Os genes where the ExonS pattern is completely conserved.