William Lanier, Ahmed Moustafa, Debashish Bhattacharya, Josep M Comeron.
Abstract
BACKGROUND: The genome of the pico-eukaryotic (bacterial-sized) prasinophyte green alga Ostreococcus lucimarinus has one of the highest gene densities known in eukaryotes, yet it contains many introns. Phylogenetic studies suggest this unusually compact genome (13.2 Mb) is an evolutionarily derived state among prasinophytes. The presence of introns in the highly reduced O. lucimarinus genome appears to be in opposition to simple explanations of genome evolution based on unidirectional tendencies, either neutral or selective. Therefore, patterns of intron retention in this species can potentially provide insights into the forces governing intron evolution.
Year: 2008 PMID: 18478122 PMCID: PMC2367439 DOI: 10.1371/journal.pone.0002171
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Ostreococcus lucimarinus and length distribution of clustered contigs and open reading frames.
The prasinophyte green algal genus Ostreococcus comprises the smallest known free-living eukaryotes, with an average size of 0.8 μm. (a) Image of Ostreococcus strain RCC 143 kindly provided by W. Eikrem and J. Throndsen (University of Oslo) (image also available in Wiki Commons). (b) and (c) Histograms showing the frequency and length distribution for clustered contigs (b) and longest open reading frames (c) for the Ostreococcus lucimarinus EST library.
Summary of genes predicted and EST-annotated (mapped) for O. lucimarinus.
| Chr. # | Genes Predicted | Clusters mapped | Clusters with Introns | % Map/Predicted |
| 1 | 684 | 146 | 19 | 21.4 |
| 2 | 489 | 136 | 46 | 27.8 |
| 3 | 589 | 149 | 19 | 25.2 |
| 4 | 524 | 117 | 14 | 22.3 |
| 5 | 492 | 115 | 17 | 23.4 |
| 6 | 454 | 83 | 12 | 18.3 |
| 7 | 463 | 111 | 10 | 24.0 |
| 8 | 426 | 91 | 15 | 21.4 |
| 9 | 414 | 85 | 13 | 20.5 |
| 10 | 358 | 73 | 17 | 20.4 |
| 11 | 328 | 68 | 10 | 20.7 |
| 12 | 325 | 68 | 10 | 20.9 |
| 13 | 298 | 76 | 7 | 25.5 |
| 14 | 373 | 82 | 14 | 21.9 |
| 15 | 267 | 46 | 7 | 17.2 |
| 16 | 262 | 52 | 11 | 19.9 |
| 17 | 209 | 50 | 10 | 23.9 |
| 18 | 79 | 26 | 1 | 32.9 |
| 19 | 92 | 20 | 4 | 21.7 |
| 20 | 330 | 76 | 9 | 23.0 |
| Totals | 7456 | 1670 | 265 | 22.4 |
Genes predicted for O. lucimarinus Build 2.0 in relation to the unigene clusters mapped, the mapped clusters with introns, and the ratio of unigene clusters mapped to genes predicted in the current genome assembly.
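The "% Map/Predicted" column can be reproduced directly from the two count columns. The following is a minimal sketch, assuming the percentage is simply clusters mapped divided by genes predicted; the helper name `percent_mapped` is hypothetical, and the counts are copied from three rows of the table above.

```python
def percent_mapped(clusters_mapped: int, genes_predicted: int) -> float:
    """Percentage of predicted genes that have a mapped unigene cluster."""
    return 100.0 * clusters_mapped / genes_predicted

# (genes predicted, clusters mapped) copied from the summary table
rows = {
    "Chr. 2": (489, 136),
    "Chr. 18": (79, 26),
    "Totals": (7456, 1670),
}

for label, (predicted, mapped) in rows.items():
    print(f"{label}: {percent_mapped(mapped, predicted):.1f}%")
```

Under this assumption the three rows shown round to the tabulated values (27.8, 32.9 and 22.4).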
Summary statistics of predicted introns in O. lucimarinus.
| Average Intron Size, bp | 188 |
| Longest Intron, bp | 1773 |
| Shortest Intron, bp | 26 |
| Average Intron Number per transcript | 1.74 |
| Maximum Introns In Single transcript | 13 |
Spearman's rank correlation coefficient analysis of EST-annotated introns in O. lucimarinus.
| | All chromosomes | Excluding Chr. 2 | Chr. 2 only | Excluding ribosomal proteins |
| Sequenced Clones | 0.0893 | 0.0972 | 0.0132 n.s. | 0.0970 |
| Sequenced Clones vs. Intron Number/LORF | 0.0753 | 0.0839 | −0.0300 n.s. | 0.0833 |
| Sequenced Clones vs. Mean Intron Length | 0.1008 | 0.0987 | 0.0459 n.s. | 0.1019 |
| Sequenced Clones vs. Cluster Length | 0.4741 | 0.4627 | 0.6049 | 0.4764 |
| Sequenced Clones vs. LORF | 0.4771 | 0.4687 | 0.5757 | 0.4783 |
| Cluster Length vs. Intron Number | −0.0189 n.s. | −0.0130 n.s. | −0.0608 n.s. | −0.0103 n.s. |
Sequenced Clones: number of sequenced clones within assembled unigene clusters, used as a measure of gene expression.
LORF: length of the longest open reading frame.
Significance levels: p<0.05; p<0.005; p<0.0005; n.s., p>0.05.
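For readers unfamiliar with the statistic, Spearman's rank correlation coefficient is the Pearson correlation computed on the ranks of the two variables, with tied values sharing their mean rank. The sketch below is an illustrative standard-library implementation, not the authors' analysis code; the function names and the toy data are hypothetical and do not come from the EST library.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy data: clone counts per cluster and cluster lengths in bp.
clones = [1, 2, 2, 5, 9]
cluster_length = [600, 900, 850, 1400, 2100]
print(spearman_rho(clones, cluster_length))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distribution of clone counts, which is one common reason to prefer it over Pearson's r for expression data of this kind.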
Figure 2. Levels of gene expression for intron-containing and intronless genes in Ostreococcus lucimarinus.
(a) Mean expression values and +/− standard errors for clusters lacking introns and all clusters containing introns. Expression values are given as the number of sequenced clones represented within a single unigene cluster. (b) and (c) Boxplots showing the clustered contig lengths (b) and longest open reading frames (c) for contigs without introns (red, n = 1405), contigs with introns (green, n = 265), the combined data set of intron-containing and intronless contigs (blue, n = 1670), and the complete EST library (yellow, n = 2050).
Summary statistics for cluster lengths and longest open reading frames (LORFs) for all unigene clusters.
| | All Data | Intronless | Intronic | Intronic excluding Chr. 2 |
| n | 1670 | 1405 | 265 | 219 |
| Cluster length | | | | |
| Longest, bp | 4309 | 4309 | 3502 | 3502 |
| Shortest, bp | 344 | 344 | 456 | 456 |
| Mean, bp | 1358 | 1361 | 1340 | 1351 |
| Median, bp | 1308 | 1304 | 1321 | 1335 |
| σ2, bp | 455 | 450 | 479 | 495 |
| LORF | | | | |
| Longest, bp | 4230 | 4230 | 3192 | 3192 |
| Shortest, bp | 201 | 201 | 288 | 288 |
| Mean, bp | 1059 | 1060 | 1052 | 1078 |
| Median, bp | 1007 | 1011 | 1005 | 1017 |
| σ2, bp | 492 | 497 | 463 | 465 |
Putative exon-junction associated proteins.
| EJC Associated Protein | Locus | Model Protein ID | e-value |
| AtUAP56-1 | At5g11220 | 43741 | e-180 |
| AtUAP56-2 | At5g11170 | 43741 | e-147 |
| atALY-1 | At5g59950 | 18962 | 5.00E-19 |
| atALY-2 | At5g02530 | 18962 | 1.00E-19 |
| atALY-3 | At5g66260 | n/a | n/a |
| AtDEK-1 | At3g48710 | 16371 | 3.00E-33 |
| AtDEK-2 | At5g63550 | 16371 | 2.00E-35 |
| AtDEK-3 | At4g26630 | 16371 | 6.00E-27 |
| AtDEK-4 | At5g55660 | 16371 | 1.00E-36 |
| AtP15-1 | At1g11570 | 13809 | 1.00E-15 |
| AtP15-2 | At1g27970 | 13809 | 6.00E-27 |
| AtP15-3 | At1g27310 | 13809 | 8.00E-26 |
| AteIF-4AIII | At3g19760 | 39453 | 0 |
| AtUPF2 | At2g39260 | 37385 | e-103 |
| AtUPF3 | At1g33980 | 16338 | 4.00E-06 |
| AtMagoNashi | At1g02140 | 39734 | 3.00E-60 |
| AtY14 | At1g51510 | 32221 | 2.00E-32 |
| AtRNPS1 | At1g16610 | n/a | n/a |
| AtSRm160 | At2g29210 | n/a | n/a |
Putative exon-junction associated proteins in Arabidopsis, their corresponding loci, the best hit to a protein model in the Ostreococcus lucimarinus Build 2.0, and the respective e-values. The proposed Arabidopsis EJC protein families have been grouped together.