Jean-François Pombert, Erick R James, Jan Janouškovec, Patrick J Keeling.
Abstract
BACKGROUND: Photosynthetic euglenids acquired their plastid by secondary endosymbiosis of a prasinophyte-like green alga. But unlike its prasinophyte counterparts, the plastid genome of the euglenid Euglena gracilis is riddled with introns that interrupt almost every protein-encoding gene. The atypical group II introns and twintrons (introns-within-introns) found in the E. gracilis plastid have been hypothesized to have been acquired late in the evolution of euglenids, implying that massive numbers of introns may be lacking in other taxa. This late emergence was recently corroborated by the plastid genome sequences of the two basal euglenids, Eutreptiella gymnastica and Eutreptia viridis, which were found to contain fewer introns. METHODOLOGY/PRINCIPAL
Year: 2012 PMID: 23300929 PMCID: PMC3534033 DOI: 10.1371/journal.pone.0053433
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Gene map of the Monomorphina aenigmatica (UTEX 1284) plastid genome.
Genes (filled boxes) located inside the map are transcribed clockwise; those located outside are transcribed counterclockwise. Introns are denoted by open boxes, whereas intronic ORFs are illustrated as half-height boxes within the open boxes. tRNA genes are indicated by the one-letter amino acid code followed by the anticodon in parentheses. ORFs smaller than 150 amino acids are not shown. The A+T and G+C contents are shown in the inner circle in light and dark grey, respectively. Conserved gene clusters between M. aenigmatica and E. gracilis are denoted by brackets: rpoB to Y(gua), M(cau) to L(caa), rrs to rrl, and S(gcu) to H(gug).
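The tRNA labelling convention in the caption (one-letter amino acid code, then the anticodon in parentheses) can be unpacked mechanically. The sketch below is illustrative only: the function name `parse_trna_label` is hypothetical, and the codon is derived by plain reverse complementation, which ignores wobble pairing and base modifications.

```python
import re

# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# One uppercase amino acid letter followed by a lowercase anticodon, e.g. "Y(gua)".
LABEL = re.compile(r"^([A-Z])\(([acgu]{3})\)$")


def parse_trna_label(label):
    """Split a label such as 'Y(gua)' into (amino_acid, anticodon, codon)."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised tRNA label: {label!r}")
    amino_acid = match.group(1)
    anticodon = match.group(2).upper()
    # Both strands are written 5'->3', so the codon that pairs with an
    # anticodon is its reverse complement (wobble pairing is ignored here).
    codon = "".join(COMPLEMENT[base] for base in reversed(anticodon))
    return amino_acid, anticodon, codon


# The tRNA genes named as cluster boundaries in the caption.
for label in ["Y(gua)", "M(cau)", "L(caa)", "S(gcu)", "H(gug)"]:
    print(label, parse_trna_label(label))
```

For the five labels in the caption, the derived codons (UAC, AUG, UUG, AGC, CAC) encode Tyr, Met, Leu, Ser and His in the standard genetic code, matching the one-letter codes.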
Main features of euglenid plastid DNAs and their closest known relative.
| Feature | E. gracilis | E. longa | M. aenigmatica | Eta. viridis | Etl. gymnastica | Closest known relative |
|---|---|---|---|---|---|---|
| A+T (%) | 73.9 | 77.6 | 70.6 | 71.4 | 65.7 | 65.3 |
| Size (bp) | | | | | | |
| Total | 143,171 | 73,345 | 74,746 | 65,513 | 67,622 | 101,605 |
| Genes | 62,776 | 49,860 | 45,568 | 44,061 | 50,573 | 80,191 |
| Intergenic | 24,712 | 11,357 | 13,048 | 9,288 | 10,176 | 18,657 |
| Introns | 55,683 | 12,128 | 16,130 | 12,164 | 6,873 | 2,757 |
| Proportion (%) | | | | | | |
| Coding | 43.8 | 68.0 | 61.0 | 67.3 | 74.8 | 78.9 |
| Intergenic | 17.3 | 15.5 | 17.5 | 14.2 | 15.0 | 18.4 |
| Intronic | 38.9 | 16.5 | 21.6 | 18.6 | 10.2 | 2.7 |
| Genes (number) | | | | | | |
| Total | 88 | 57 | 87 | 84 | 86 | 110 |
| Protein-coding | 58 | 27 | 58 | 56 | 58 | 81 |
| tRNAs | 27 | 27 | 27 | 25 | 26 | 27 |
| rRNAs | 3 | 3 | 2 | 3 | 2 | 2 |
| Intron insertion sites | 139 | 60 | 53 | 23 | 7 | 1 |
| | 4 | 0 | 3 | 3 | 4 | 1 |
| | 3 | 3 | 1 | ≥2 | 2 | 2 |
Does not include intronic ORFs and pseudogenes.
Includes intronic ORF.
Duplicated genes were counted only once. Free-standing ORFs are not included.
Twintrons (introns-within-introns) were counted as single insertion sites.
Includes the rpoA gene reported in [15].
The orf103 annotated on the opposite strand of rpl14-rpl16 in Eta. viridis [19] is spurious and was not taken into account.
According to the read coverage described in Wiegert et al. [19].
The Etl. gymnastica cpDNA features one intron in atpA, rps2, rps18 and two introns in psbC that were not reported in Hrdá et al. [18] (see Data S1 for annotation).
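As a consistency check on the table above, the percentage rows can be recomputed from the size rows. The sketch below is a minimal illustration, not the authors' procedure: the column labels are an assumption inferred from the footnotes and abstract (the source lists six unlabelled columns), and `proportions` is a hypothetical helper.

```python
# Sizes in bp copied from the table: (total, genes, intergenic, introns).
# The column labels used as keys are inferred, not stated in the source.
SIZES = {
    "E. gracilis": (143171, 62776, 24712, 55683),
    "E. longa": (73345, 49860, 11357, 12128),
    "M. aenigmatica": (74746, 45568, 13048, 16130),
    "Eta. viridis": (65513, 44061, 9288, 12164),
    "Etl. gymnastica": (67622, 50573, 10176, 6873),
    "closest known relative": (101605, 80191, 18657, 2757),
}

# Percentages as reported in the table: (coding, intergenic, intronic).
REPORTED = {
    "E. gracilis": (43.8, 17.3, 38.9),
    "E. longa": (68.0, 15.5, 16.5),
    "M. aenigmatica": (61.0, 17.5, 21.6),
    "Eta. viridis": (67.3, 14.2, 18.6),
    "Etl. gymnastica": (74.8, 15.0, 10.2),
    "closest known relative": (78.9, 18.4, 2.7),
}


def proportions(total, genes, intergenic, introns):
    """Return (coding, intergenic, intronic) as percentages of the total size."""
    if genes + intergenic + introns != total:
        raise ValueError("the three size classes do not add up to the total")
    return tuple(round(100 * part / total, 1) for part in (genes, intergenic, introns))


for taxon, sizes in SIZES.items():
    computed = proportions(*sizes)
    status = "matches" if computed == REPORTED[taxon] else "differs from"
    print(f"{taxon}: {computed} {status} the reported values")
```

In every column the three size classes sum to the total, and the recomputed percentages agree with the reported ones to one decimal place.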
Intron insertion sites in M. aenigmatica that are shared with other euglenid plastid DNAs.
| Gene | Shared insertion sites |
|---|---|
| | Maen.8, Egra.9, Elon.9 |
| | Maen.1, Egra.2 |
| | Maen.1, Egra.6 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.1, Egra.1 |
| | Maen.2, Egra.2, Elon.2 |
| | Maen.3, Egra.3, Elon.3 |
| | Maen.1, Egra.1 |
| | Maen.2, Egra.2 |
| | Maen.2, Egra.2, Elon.2 |
| | Maen.1, Egra.2 |
| | Maen.1, Egra.2, Elon.2 |
| | Maen.2, Egra.3, Elon.3 |
| | Maen.1, Egra.3 |
| | Maen.2, Egra.5, Elon.4 |
| | Maen.3, Egra.6, Elon.5 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.2, Egra.2, Elon.2 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.1, Egra.1 |
| | Maen.2, Egra.2, Elon.2 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.4, Egra.4, Elon.4 |
| | Maen.1, Egra.1, Elon.1 |
| | Maen.6, Elon.7 |
| | Maen.1, Egra.1 |
| | Maen.7, Egra.8, Elon.8 |
Intron insertion sites reported to display twintrons in Euglena gracilis are highlighted in bold.
Maen, Monomorphina aenigmatica; Egra, Euglena gracilis; Elon, Euglena longa; Evir, Eutreptia viridis; Egym, Eutreptiella gymnastica. The number corresponds to the insertion site on the gene (labelled from 5′ to 3′).
The sequence of the rps18.Egym.1 intron is different from those in M. aenigmatica and E. gracilis and may be unrelated.
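The labelling scheme described in the notes above (taxon abbreviation, then the 5′ to 3′ insertion-site number, optionally prefixed by the gene name as in rps18.Egym.1) is regular enough to parse. The following is a minimal sketch; the names `InsertionSite`, `parse_site` and `parse_row` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional

# Abbreviations as defined in the table notes.
TAXA = {
    "Maen": "Monomorphina aenigmatica",
    "Egra": "Euglena gracilis",
    "Elon": "Euglena longa",
    "Evir": "Eutreptia viridis",
    "Egym": "Eutreptiella gymnastica",
}


class InsertionSite(NamedTuple):
    gene: Optional[str]  # None when the label carries no gene prefix
    taxon: str           # full species name
    site: int            # insertion-site number, counted 5' to 3' on the gene


def parse_site(label):
    """Parse 'Egra.2' or 'rps18.Egym.1' into an InsertionSite."""
    parts = label.strip().split(".")
    if len(parts) == 3:
        gene, abbreviation, number = parts
    elif len(parts) == 2:
        gene = None
        abbreviation, number = parts
    else:
        raise ValueError(f"unrecognised insertion-site label: {label!r}")
    if abbreviation not in TAXA:
        raise ValueError(f"unknown taxon abbreviation: {abbreviation!r}")
    return InsertionSite(gene, TAXA[abbreviation], int(number))


def parse_row(cell):
    """Parse one comma-separated table cell into a list of InsertionSite."""
    return [parse_site(label) for label in cell.split(",")]


print(parse_site("rps18.Egym.1"))
for site in parse_row("Maen.2, Egra.5, Elon.4"):
    print(site)
```

The second example shows why the numbers matter: one shared position can be the second insertion site in M. aenigmatica, the fifth in E. gracilis and the fourth in E. longa, because each species numbers its own introns along the gene.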
Intron insertion sites unique to M. aenigmatica.
The rpoB.Maen.1 insertion site is located in the vicinity of rpoB.Evir.1, but it is unknown if these sites are related given the low conservation of the gene.
Figure 2. Euglena gracilis twintron insertion sites found in M. aenigmatica.
Internal introns not found in M. aenigmatica are indicated by dashed lines. The total length of each insertion site is indicated on the right. Group II (GII) and group III (GIII) introns are indicated below the corresponding introns. Maen, Monomorphina aenigmatica; Egra, Euglena gracilis. The number after the period corresponds to the insertion site on the gene. In psbC.Egra.2, the fragmented ORF spliced together after excision of the internal introns is marked by asterisks.
Figure 3. Phylogenetic mapping of intron insertion sites among euglenid chloroplast genomes.
At the top is a matrix showing the number of insertion sites shared between each pair of taxa (shared twintron insertion sites are indicated in brackets). At the bottom is a cladogram showing the distribution of shared intron insertion sites across the euglenids. The total number of insertion sites for each species is indicated in parentheses below its name. Insertions across the phylogenetic tree are denoted by triangles. For this analysis, the ambiguous rpoB intron insertion sites were considered distinct. The no-longer-photosynthetic E. longa was not included in this figure due to its many gene losses. The phylogenetic relationships described here are schematized from Kim et al. [21].
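A pairwise matrix of the kind described in this caption can be tallied from a list of shared insertion sites, each recorded as the set of taxa that carry it. The sketch below illustrates the counting only and is not the authors' method. The function name `shared_site_matrix` is hypothetical, and the input is four rows copied from the shared-sites table purely as an example, so the output does not reproduce Figure 3 (which also excludes E. longa).

```python
from collections import Counter
from itertools import combinations


def shared_site_matrix(groups):
    """Count, for every pair of taxa, the insertion sites they share.

    `groups` is an iterable of taxon collections, one per insertion site.
    Returns a Counter keyed by alphabetically sorted (taxon_a, taxon_b) pairs.
    """
    counts = Counter()
    for taxa in groups:
        # Each site adds one to every pair of taxa that carries it.
        for pair in combinations(sorted(set(taxa)), 2):
            counts[pair] += 1
    return counts


# Illustrative input only: four rows from the shared-sites table,
# reduced to their taxon abbreviations.
groups = [
    ["Maen", "Egra", "Elon"],  # Maen.8, Egra.9, Elon.9
    ["Maen", "Egra"],          # Maen.1, Egra.2
    ["Maen", "Elon"],          # Maen.6, Elon.7
    ["Maen", "Egra", "Elon"],  # Maen.2, Egra.5, Elon.4
]

matrix = shared_site_matrix(groups)
for (taxon_a, taxon_b), count in sorted(matrix.items()):
    print(f"{taxon_a} and {taxon_b}: {count} shared site(s)")
```

Sorting the taxa before pairing gives each pair a single key, so only one triangle of the symmetric matrix needs to be stored.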