Mitochondrion-to-Chloroplast DNA Transfers and Intragenomic Proliferation of Chloroplast Group II Introns in Gloeotilopsis Green Algae (Ulotrichales, Ulvophyceae).

Monique Turmel, Christian Otis, Claude Lemieux.

Abstract

To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified, for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords:  inverted repeat; mitochondrial genome; plastid genome; promiscuous DNA; repeated sequences; retrohoming

Year:  2016        PMID: 27503298      PMCID: PMC5630911          DOI: 10.1093/gbe/evw190

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

The Ulvophyceae is one of the three major classes of green algae belonging to the core Chlorophyta, a robustly supported assemblage that took root from unicellular planktonic prasinophytes (Leliaert et al. 2012; Fucikova et al. 2014; Turmel et al. 2016). Although the class is best known for its macroscopic marine forms (the green seaweeds), several members live in freshwater or damp subaerial habitats (Friedl and Rybalka 2012; Leliaert et al. 2012). The Ulvophyceae far exceeds the other chlorophyte classes in diversity of morphological complexity (ranging from microscopic unicells to multicellular plants and giant cells) and cellular sophistication (with four cytomorphological types). Numerous orders have been erected to reflect this diversity, but the relationships among them as well as the monophyly of the class remain ambiguous because the phylogenetic analyses carried out to resolve these questions yielded conflicting results. A phylogenetic analysis based on ten genes (eight encoded in the nucleus and two in the chloroplast) recovered the Ulvophyceae as a well-supported monophyletic group composed of two main lineages: the Oltmannsiellopsidales–Ulvales–Ulotrichales (comprising most microscopic forms) and the Trentepohliales–Bryopsidales–Cladophorales–Dasycladales (Cocquyt et al. 2010). In contrast, recent chloroplast phylogenomic studies revealed that the class is not monophyletic (Fucikova et al. 2014; Leliaert and Lopez-Bautista 2015; Melton et al. 2015; Sun et al. 2016; Turmel et al. 2016): the Oltmannsiellopsidales–Ulvales–Ulotrichales was recovered in most studies but the affinity of this clade with the other lineages (Bryopsidales, Trentepohliales, and Dasycladales) received no support. The genomic data currently available for the Ulvophyceae indicate that the chloroplast genome is highly variable in structure, gene density, gene order, and intron content. 
Complete chloroplast DNA (cpDNA) sequences have been reported for only seven ulvophycean taxa: the marine flagellate Oltmannsiellopsis viridis (Oltmannsiellopsidales) (Pombert et al. 2006), the freshwater filamentous and branched Pseudendoclonium akinetum (Ulotrichales) (Pombert et al. 2005), the marine macroalgae Ulva sp. UNA00071828 and Ulva fasciata (Ulvales) (Melton and Lopez-Bautista 2015; Melton et al. 2015), and three marine siphonous species from the Bryopsis and Tydemania genera (Bryopsidales) (Lu et al. 2011; Leliaert and Lopez-Bautista 2015). Their sizes range from 96 kb (in Ulva fasciata) to 196 kb (in Pseudendoclonium), while their gene repertoires contain 100 (in Ulva species) to 105 (in Pseudendoclonium) conserved genes. Only the chloroplast genomes sampled from the Oltmannsiellopsidales (Oltmannsiellopsis) and Ulotrichales (Pseudendoclonium) have retained the large rRNA operon-encoding inverted repeat (IR) that is present in most green plants, but as observed for the Chlorophyceae (de Cambiaire et al. 2006; Brouard et al. 2008), gene partitioning among the single-copy regions differs considerably between these two ulvophycean lineages and is also distinct from the patterns observed in other green algal groups. Another feature shared by ulvophycean and chlorophycean cpDNAs is their highly diverse and variable intron contents, which are generally characterized by a greater proportion of group I rather than group II introns. Twenty-seven introns, all belonging to the group I class, are present in Pseudendoclonium cpDNA, whereas Oltmannsiellopsis and the two Ulva species contain only five introns in their chloroplast. Moreover, a number of chloroplast group I introns are inserted at the same insertion sites in some members of the Ulvophyceae and Chlorophyceae. For instance, 11 of the introns found in Pseudendoclonium and Oedogonium cardiacum share the same positions (Brouard et al. 2008). 
This observation might reflect past events of intercellular horizontal DNA transfers involving these mobile elements, but more data on intron diversity in the Chlorophyceae and the Oltmannsiellopsidales–Ulvales–Ulotrichales clade are needed to provide support for this hypothesis. Unexpectedly, the analysis of the Pseudendoclonium chloroplast and mitochondrial genomes provided indirect evidence for intracellular, inter-organellar DNA exchanges in the Ulotrichales (Pombert et al. 2005, 2004). Two observations support the occurrence of these genetic exchanges: first, the mitochondrial atp1 gene contains a group I intron (site 522) that is highly similar in both sequence and secondary structure to a group I intron inserted at the identical site (site 489) in the chloroplast atpA gene, and second, the chloroplast and mitochondrial genomes contain an identical 15-bp repeat. The direction of transfer of these sequences, however, could not be elucidated. To date, inter-organellar DNA transfers have been documented solely for seed plants, with frequent transfers of cpDNA to the mitochondrion but only rare cases of mitochondrial DNA (mtDNA) migration to the chloroplast (Goremykin et al. 2009; Smith 2011, 2014; Straub et al. 2013). It is thought that the relative absence of mtDNA-derived sequences in the chloroplast might reflect the lack of a DNA uptake system in this organelle (Bock 2010). In the present study, we have analyzed the chloroplast and mitochondrial genomes of two additional ulvophyceans belonging to the Ulotrichales, Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea, and compared these genomes to their previously reported counterparts. The Gloeotilopsis lineage is sister to that comprising Pseudendoclonium akinetum and species from the genus Hazenia (Škaloud et al. 2013). 
The main goals of our study were to enhance our understanding of chloroplast and mitochondrial genome evolution in the Ulvales/Ulotrichales and to gain more information regarding inter-organellar DNA transfer. Our comparative analyses unveiled a number of novel genomic features not previously documented for green algal organelles, including the first case of intracellular transfer of mtDNA to the chloroplast.

Materials and Methods

Strains and Culture Conditions

Gloeotilopsis planctonica SAG 29.93 and Gloeotilopsis sarcinoidea UTEX 1710 were obtained from the culture collections of algae at the University of Göttingen (SAG) and the University of Texas at Austin (UTEX), respectively. Both strains were grown in C medium (Andersen 2005) at 18 °C under alternating 12 h light and 12 h dark periods.

Genome Assemblies, Annotations, and Sequence Analyses

The G. planctonica chloroplast and mitochondrial genomes were sequenced using the Roche 454 method. A shotgun library (700-bp fragments) of A + T-rich organelle DNA, which was obtained by CsCl-bisbenzimide isopycnic centrifugation of total cellular DNA (Turmel et al. 1999), was constructed using the GS-FLX Titanium Rapid Library Preparation Kit of Roche 454 Life Sciences (Branford, CT, USA). Library construction and 454 GS-FLX DNA Titanium pyrosequencing were carried out by the “Plateforme d’Analyses Génomiques de l’Université Laval” (http://pag.ibis.ulaval.ca/seq/en/; last accessed 8 August 2016). Reads were assembled using Newbler v2.5 (Margulies et al. 2005) with default parameters, and contigs of chloroplast and mitochondrial origins were identified by BlastN and BlastX (Altschul et al. 1990) searches against a local database of organelle genomes. Following visualization and editing with the CONSED 22 finishing package (Gordon et al. 1998), the 20 chloroplast contigs comprising a total of 47,577 reads (average coverage depth = 67) were ordered and linked by polymerase chain reaction (PCR) amplification of the regions spanning gaps using a set of 32 oligonucleotides (supplementary table S1, Supplementary Material online). Purified PCR products were sequenced using Sanger chemistry with the PRISM BigDye Terminator Ready Reaction Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA). The five contigs that were recovered for the mitochondrial genome (7172 reads; average coverage depth = 24) were assembled using CONSED. The G. sarcinoidea organelle genomes were sequenced using the Illumina method. Total cellular DNA was isolated using the EZNA HP Plant Mini Kit of Omega Bio-Tek (Norcross, GA, USA). 
A library of 500-bp fragments was constructed using the TruSeq DNA Sample Prep Kit (Illumina, San Diego, CA, USA) and paired-end reads (300-bp) were generated on the MiSeq sequencer by the “Plateforme d’Analyses Génomiques de l’Université Laval” (http://pag.ibis.ulaval.ca/seq/en/index.php; last accessed 8 August 2016). These reads were trimmed to remove adapter and low-quality sequences with CUTADAPT (Martin 2011) and PRINSEQ (Schmieder and Edwards 2011), respectively, and the paired-end sequences were merged using FLASH (Magoc and Salzberg 2011). Reads were then assembled using Ray 2.3.1 (Boisvert et al. 2010) with k-mer values of 61 and 89. Identification, visualization and editing of organellar DNA contigs were performed as described above for the 454 sequence assemblies. A single contig with overlapping terminal sequences that allowed genome circularization was recovered for each organellar genome. The final chloroplast assembly contained a total of 281,367 reads with an average coverage depth of 319, while the final mitochondrial assembly contained 36,523 reads with an average coverage depth of 129.
Genes and open reading frames (ORFs) were identified on the final assemblies using a custom-built suite of bioinformatics tools allowing the automated execution of the following three steps: 1) ORFs were found using GETORF in EMBOSS (Rice et al. 2000), 2) their translated products were identified by BlastP (Altschul et al. 1990) searches against a local database of mtDNA- and cpDNA-encoded proteins or the nr database at the National Center for Biotechnology Information, and 3) consecutive 100 bp segments of the genome sequence were analyzed with BlastN and BlastX (Altschul et al. 1990) to identify gene sequences. Only the ORFs that revealed identities with genes of known functions or previously reported ORFs were annotated. Genes for rRNAs and tRNAs were independently identified and localized using RNAmmer (Lagesen et al. 2007) and tRNAscan-SE (Lowe and Eddy 1997), respectively. Intron boundaries were determined by comparing intron-containing genes with intronless homologs and by modeling intron secondary structures (Michel et al. 1989; Michel and Westhof 1990). Circular genome maps were drawn with OGDraw (Lohse et al. 2007). In the course of our study, we also used the methods mentioned above to verify the gene repertoires and intron contents of the organelle genomes previously reported for Bryopsis and Ulva species (Lu et al. 2011; Leliaert and Lopez-Bautista 2015; Melton and Lopez-Bautista 2015, 2016; Zhou et al. 2016a, 2016b).
Genome-scale sequence comparisons were carried out with LAST v7.1.4 (Frith et al. 2010) to map regions sharing similar sequences within a genome and to identify shared sequences between chloroplast and mitochondrial genomes. To estimate the proportion of small repeated sequences, repeats with a minimal size of 30 bp were retrieved using REPFIND of REPuter v2.74 (Kurtz et al. 2001) with the options -f -p -l -allmax and were then masked on the genome sequence using RepeatMasker (http://www.repeatmasker.org/; last accessed 8 August 2016) running under the Crossmatch search engine (http://www.phrap.org/; last accessed 8 August 2016). Tandem repeats were identified with TRF v4.09 (Benson 1999) and classification of the repeat sequences was performed using RECON (Bao and Eddy 2002).
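The first of the three annotation steps described above (ORF detection) can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not call GETORF; it simply scans the six reading frames for ATG-to-stop ORFs under the assumption of the standard stop codons. The function name `find_orfs`, the `min_codons` threshold, and the coordinate convention (0-based, half-open, reported on the strand that was scanned) are choices made here for illustration only.

```python
from typing import List, Tuple

# Assumption: standard-code stop codons; organelle-specific codes are not handled.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int]]:
    """Scan all six frames for ATG-to-stop ORFs.

    Returns (strand, start, end) tuples; start/end are 0-based, half-open
    coordinates on the scanned strand, and end includes the stop codon.
    min_codons counts the codons from ATG up to, but excluding, the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
    print(find_orfs(toy, min_codons=3))
```

In a real workflow the ORFs returned here would still have to be translated and compared with protein databases (steps 2 and 3 above), which this sketch does not attempt.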

Analyses of Gene Organization

The number of reversals separating the chloroplast or mitochondrial genomes of the two Gloeotilopsis species from one another and from their Pseudendoclonium counterparts was estimated with GRIMM v2.01 (Tesler 2002). We used a custom-built Perl script to identify the regions that display the same gene order in these genomes. This script employs a concatenated list of signed gene orders in the compared genomes as input file (i.e., taking into account gene polarity) and interacts with MySQL database tools (https://www.mysql.com; last accessed 8 August 2016) to identify syntenic regions.
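The idea of finding regions with the same signed gene order can be sketched in a few lines of Python. This is an illustrative stand-in, not the Perl/MySQL script used in the study: it assumes each circular genome is given as a list of (gene, strand) pairs, treats two neighbouring genes as syntenic when the same adjacency occurs in the other genome on either strand, and reports maximal runs of such adjacencies. The names `adjacencies` and `syntenic_blocks` are hypothetical.

```python
from typing import List, Set, Tuple

Gene = Tuple[str, int]  # (gene name, +1 or -1 for the coding strand)


def adjacencies(order: List[Gene]) -> Set[Tuple[Gene, Gene]]:
    """All neighbouring gene pairs of a circular genome, read on both strands."""
    adj = set()
    n = len(order)
    for i in range(n):
        (a, sa), (b, sb) = order[i], order[(i + 1) % n]
        adj.add(((a, sa), (b, sb)))
        adj.add(((b, -sb), (a, -sa)))  # same adjacency seen from the other strand
    return adj


def syntenic_blocks(ref: List[Gene], other: List[Gene]) -> List[List[str]]:
    """Split the reference gene order into blocks colinear with the other genome.

    Only genes present in both genomes are considered; single-gene blocks
    are reported as well.
    """
    shared = {g for g, _ in ref} & {g for g, _ in other}
    ref = [g for g in ref if g[0] in shared]
    other = [g for g in other if g[0] in shared]
    adj = adjacencies(other)
    n = len(ref)
    # conserved[i] is True when ref[i] and its successor are also adjacent in `other`
    conserved = [(ref[i], ref[(i + 1) % n]) in adj for i in range(n)]
    if all(conserved):
        return [[g for g, _ in ref]]
    # Start the walk just after a breakpoint so no block wraps around the origin.
    first = next(i for i in range(n) if not conserved[i]) + 1
    blocks, current = [], []
    for k in range(n):
        i = (first + k) % n
        current.append(ref[i][0])
        if not conserved[i]:
            blocks.append(current)
            current = []
    return blocks


if __name__ == "__main__":
    genome_a = [("a", 1), ("b", 1), ("c", 1), ("d", 1), ("e", 1)]
    genome_b = [("a", 1), ("b", 1), ("d", -1), ("c", -1), ("e", 1)]  # c-d segment inverted
    print(syntenic_blocks(genome_a, genome_b))
```

In the toy example a single inversion of the c-d segment yields two blocks, one of which spans the origin of the circular map. Counting reversals, as done with GRIMM in the study, is a separate and harder problem that this sketch does not address.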

Phylogenetic Analyses

To identify the relationships among Gloeotilopsis group II introns, nucleotides comprising the central wheel of the intron secondary structure as well as adjacent nucleotides from the conserved regions within the six domains radiating from this wheel were aligned based on secondary structure models and the resulting alignment was analyzed using RAxML v8.2.6 (Stamatakis 2014) and the GTR + Γ4 model. Confidence of branch points was estimated by fast-bootstrap analysis (f = a) with 1000 replicates. To determine the timing of inter-organellar DNA transfers in the Ulotrichales, the Gloeotilopsis mitochondrial-like chloroplast sequences were aligned with the corresponding mitochondrial gene sequences of Gloeotilopsis and other green plants using MUSCLE v3.7 (Edgar 2004) and the resulting alignments were analyzed individually using RAxML v8.2.6 (Stamatakis 2014) and the GTR + Γ4 model. Confidence of branch points was estimated by fast-bootstrap analysis (f = a) with 500 replicates.
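For readers wishing to reproduce this type of analysis, the following Python sketch assembles a RAxML v8 command line that, to our understanding, corresponds to the settings described above (rapid bootstrapping combined with an ML search, GTR with gamma-distributed rates). The executable name, random seeds, and file names are assumptions introduced here; they are not reported in the text and should be adapted to the local installation.

```python
import shlex
from typing import List


def raxml_rapid_bootstrap_cmd(
    alignment: str,
    run_name: str,
    replicates: int,
    seed: int = 12345,              # assumption: seeds are not given in the text
    executable: str = "raxmlHPC",   # assumption: binary name varies between builds
) -> List[str]:
    """Build (but do not run) a RAxML v8 rapid-bootstrap command line."""
    return [
        executable,
        "-f", "a",             # rapid bootstrap plus best-scoring ML tree search
        "-m", "GTRGAMMA",      # GTR with gamma-distributed rate heterogeneity
        "-p", str(seed),       # seed for parsimony starting trees
        "-x", str(seed),       # seed for rapid bootstrapping
        "-N", str(replicates), # number of bootstrap replicates
        "-s", alignment,       # input alignment file
        "-n", run_name,        # suffix for output files
    ]


if __name__ == "__main__":
    # Hypothetical file names; 1000 replicates as used for the intron alignment.
    cmd = raxml_rapid_bootstrap_cmd("introns.phy", "gloeotilopsis_introns", 1000)
    print(shlex.join(cmd))
```

The command is only printed, so the sketch can be inspected without RAxML being installed; passing the list to `subprocess.run` would launch the analysis.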

Results and Discussion

The Gloeotilopsis Mitochondrial Genomes Share Several Features with their Pseudendoclonium Counterpart

The G. planctonica and G. sarcinoidea mtDNAs were assembled as circular molecules of 105,236 bp and 85,108 bp, respectively (fig. 1A and supplementary fig. S1, Supplementary Material online). The G. planctonica genome is the largest reported so far among chlorophytes. Although the two newly sequenced ulotrichalean genomes and their Pseudendoclonium counterpart are larger than the mtDNAs currently available for the Oltmannsiellopsidales and Ulvales, their repertoires of conserved genes are very similar to that of the Ulva genomes (table 1 and fig. 1B). The Gloeotilopsis mtDNAs are identical in gene content: their 58 conserved genes code for 30 proteins, two rRNAs, and 26 tRNAs. The only difference relative to the Pseudendoclonium gene repertoire is the presence of trnL(caa), a gene specific to the Gloeotilopsis lineage. The mitochondrial gene distribution currently available for ulvophycean green algae also suggests that trnR(ucg) is a lineage-specific tRNA gene in the Ulvales and that four genes present in the Oltmannsiellopsidales (nad9, rrn5, trnG(gcc) and trnR(acg)) were lost before the divergence of the Ulvales and Ulotrichales (fig. 1B). As reported for other algal lineages (Valach et al. 2014), it is possible that the gene coding for the 5S rRNA (rrn5) went undetected because it is too divergent in sequence. Aside from conserved genes, the two Gloeotilopsis genomes share with Pseudendoclonium mtDNA three free-standing ORFs that code for one LAGLIDADG homing endonuclease (LHE) and two proteins of unknown functions (fig. 1A and supplementary fig. S1, Supplementary Material online); the corresponding Pseudendoclonium ORFs are orf307, orf325, and orf361, respectively (Pombert et al. 2004). In addition, we found 5′ to cox2, one free-standing LAGLIDADG ORF in G. sarcinoidea (orf193) and two in G. planctonica (orf102 and orf174) that are highly similar to the proteins encoded by group I introns in the mitochondrial atp1 and chloroplast atpA genes of Pseudendoclonium.
Fig. 1

Mitochondrial gene content and organization in Gloeotilopsis. (A) Gene map of the G. planctonica mitochondrial genome. Filled boxes on the gene map represent genes, with colors denoting gene categories as indicated in the legend. Genes on the outside are transcribed counterclockwise, whereas those on the inside are transcribed clockwise. Thick lines in the innermost ring represent the gene clusters shared with Pseudendoclonium; those in the second outermost inner ring represent the gene clusters shared with G. sarcinoidea. (B) Comparison of gene content among ulvophycean mtDNAs. The presence of a gene is denoted by a blue box.

Table 1

General Features of Gloeotilopsis and Other Ulvophycean Mitochondrial Genomes

Taxon                                   Accession   Size (bp)   A+T (%)   Genes (no.)a   Introns (no.)b   Repeats (%)c
                                                                                         GI      GII
Oltmannsiellopsidales
  Oltmannsiellopsis viridis NIES 360    NC_008256      56,761      66.6        54         2       1            8.9
Ulvales
  Ulva prolifera                        NC_028538      63,845      66.0        58         7       2            3.0
  Ulva linza                            NC_029701      70,858      65.4        58         9       4            3.5
  Ulva sp. UNA00071828                  KP720617       73,493      67.8        58         8       2            2.3
  Ulva fasciata                         NC_028081      61,614      67.5        58         3       1            1.4
Ulotrichales
  Pseudendoclonium akinetum UTEX 1912   NC_005926      95,880      60.7        57         7       0           13.0
  Gloeotilopsis planctonica SAG 29.93   KX306823      105,236      65.3        58        10       1            6.6
  Gloeotilopsis sarcinoidea UTEX 1710   KX306822       85,108      66.2        58         5       0            3.9

a Intronic genes and freestanding ORFs not usually found in green plant mitochondrial genomes are not included in these values. Duplicated genes were counted only once.

b Number of group I (GI) and group II (GII) introns is given.

c Non-overlapping repeat elements were mapped on each genome with RepeatMasker using as input sequences the repeats of at least 30 bp identified with REPuter.

Unlike the Ulva mtDNAs whose genes are all encoded on the same DNA strand (Melton et al. 2015; Melton and Lopez-Bautista 2016; Zhou et al. 2016a, 2016b), the Gloeotilopsis and Pseudendoclonium genomes have their genes distributed between the two strands. The four Ulva genomes exhibit the same gene order but are substantially rearranged compared to ulotrichalean genomes. The Gloeotilopsis mtDNAs show slight differences in gene organization: nearly 90% of their conserved genes and free-standing ORFs form four syntenic blocks (fig. 1) and as inferred by GRIMM analysis, only four reversals are required to convert gene order in one genome to that of the other. Our comparison of G. planctonica and G. sarcinoidea mtDNAs with their Pseudendoclonium counterpart revealed seven and eight syntenic blocks (fig. 1), respectively, with 11 and 12 reversals inferred using GRIMM.

The Gloeotilopsis and Ulva Mitochondrial Genomes Boast Group II Introns with LAGLIDADG ORFs

Gloeotilopsis planctonica mtDNA contains ten group I introns, seven of which are present in cox1, as well as one group II intron, whereas G. sarcinoidea mtDNA contains five group I introns (fig. 2). The introns uncovered so far in ulvophycean mtDNAs represent 22 distinct insertion sites, half of which have not been identified in other clades of chlorophytes (see arrows in fig. 2). Note that a number of introns were either not annotated or incorrectly annotated in the GenBank accessions of Ulvales; numerous changes related to intron types and insertion positions were introduced in the course of this study (fig. 2). All known ORF-containing group I introns in ulvophycean mitochondria encode putative LAGLIDADG homing endonucleases (LHEs), an observation that contrasts with the finding of intron-encoded homing endonucleases from three distinct families (LAGLIDADG, GIY-YIG, and H-N-H) in the ulvophycean chloroplast (Pombert et al. 2005).
Fig. 2

Group I and group II introns in Gloeotilopsis and other ulvophycean mtDNAs. A grey box represents an intron lacking an ORF; a green box indicates that the intron encodes a LAGLIDADG homing endonuclease, whereas a blue box indicates that the intron encodes a reverse transcriptase (RT) with or without H-N-H endonuclease and/or intron maturase domains. Arrows denote the intron insertion sites that have not been observed in other classes of the Chlorophyta. Intron insertion sites in protein-coding and tRNA genes are given relative to the corresponding genes in Mesostigma viride mtDNA (Turmel et al. 2002); insertion sites in rrs and rrl are given relative to the Escherichia coli 16S and 23S rRNAs, respectively. For each insertion site, the position corresponding to the nucleotide immediately preceding the intron is reported.

Unexpectedly, group II introns containing LAGLIDADG ORFs were identified at three distinct mtDNA sites in the course of our study: site 916 within the cox1 genes of U. prolifera (genomic positions 4994–6502) and U. linza (positions 7516–9024), site 1911 within rnl of G. planctonica, and site 2451 within rnl of U. linza (positions 37,293–38,685). LHEs are generally associated with group I introns, promoting their mobility by introducing double-strand breaks within intronless target sequences (Belfort et al. 2002). The great majority of ORF-containing group II introns encode proteins with reverse transcriptase (RT), intron maturase (X), and H-N-H endonuclease (En) domains that are required for intron splicing and mobility (Zimmerly and Semper 2015). The few LHE-encoding group II introns that have been previously documented are confined to the giant sulfur bacterium Thiomargarita namibiensis (Salman et al. 2012) and the mitochondria of fungi belonging to the Ascomycota and Basidiomycota (Toor and Zimmerly 2002; Monteiro-Vitorello et al. 2009; Mullineux et al. 2010; Pfeifer et al. 2012). 
In fungal mitochondria, these unusual introns also occur in the cox1 and rRNA genes but their insertion sites in these genes (cox1 site 969, rns sites 785 and 952, and rnl site 2059) differ from those we uncovered in ulvophycean mtDNAs. Structural analyses of the four ulvophycean LHE-encoding introns revealed that the ORF is located in domain IV of the secondary structure, i.e., the same domain where RT-related ORFs are generally located in group II introns. The LHEs encoded by the introns inserted at distinct sites do not appear to be closely related, as BlastP searches against the non-redundant database of NCBI using these protein sequences identified different sets of sequences, which originated mostly from fungal mitochondria. Therefore, it is likely that the LAGLIDADG ORFs in ulvophycean group II introns represent three different events of integration which, as demonstrated for the LHEs encoded by the Leptographium rns intron (Mullineux et al. 2010) and the Ustilago rnl intron (Pfeifer et al. 2012), confer intron mobility by allowing the introduction of double-strand breaks at intronless target sites.

The Gloeotilopsis Chloroplast Genomes Are Large, Lack the IR, and Differ in Gene Order

The G. planctonica and G. sarcinoidea cpDNAs are the largest ulvophycean chloroplast genomes analyzed so far (table 2); they were assembled as circular molecules of 221,431 and 262,888 bp, respectively (fig. 3A and supplementary fig. S2, Supplementary Material online). Like their counterparts in the Bryopsidales and Ulvales (Lu et al. 2011; Leliaert and Lopez-Bautista 2015; Melton et al. 2015), they lack a large IR encoding the rRNA genes. Their gene repertoire, which only differs from that of Pseudendoclonium by the absence of trnR(ccu) (fig. 3B), comprises 104 different conserved genes coding for 73 proteins, three rRNAs and 28 tRNAs. As revealed by GRIMM analyses, seven reversals account for the alterations in gene order between the two Gloeotilopsis cpDNAs; 101 of their 104 shared conserved genes form six syntenic blocks, with the longest containing 69 genes (fig. 3A). The G. sarcinoidea and G. planctonica genomes differ from their Pseudendoclonium counterpart by 21 and 23 reversals, respectively; the conserved genes shared by the Gloeotilopsis/Pseudendoclonium cpDNAs form 14 syntenic blocks (fig. 3A). In addition to the ancestral gene clusters that were reported to be fragmented in Pseudendoclonium (Pombert et al. 2005), the Gloeotilopsis genomes display a disrupted rRNA operon.
Table 2

General Features of Gloeotilopsis and Other Ulvophycean Chloroplast Genomes

Taxon                                   Accession   Size (bp)            A+T (%)   Genes (no.)a   Introns (no.)b   Repeats (%)c
                                                    Genome      IR                                GI      GII
Bryopsidales
  Bryopsis hypnoides                    NC_013359   153,429     -          66.9       108d         6       6            9.9
  Bryopsis plumosa West4718             NC_026795   106,859     -          69.2       108          7       6            2.4
  Tydemania expeditionis FL1151         NC_026796   105,200     -          67.2       109          8       3            0.4
Oltmannsiellopsidales
  Oltmannsiellopsis viridis NIES 360    NC_008099   151,933     18,510     59.5       104          5       0           11.1
Ulvales
  Ulva sp. UNA00071828                  KP720616     99,983     -          74.7       100          4       1            0.5
  Ulva fasciata                         NC_029040    96,005     -          75.1       100          4       1            0.5
Ulotrichales
  Pseudendoclonium akinetum UTEX 1912   NC_008114   195,867      6,039     68.5       105         27       0            5.3
  Gloeotilopsis planctonica SAG 29.93   KX306824    221,431     -          68.5       104         14      17            3.7
  Gloeotilopsis sarcinoidea UTEX 1710   KX306821    262,888     -          68.5       104         15      12           11.6

a Intronic genes and freestanding ORFs not usually found in green plant chloroplast genomes are not included in these values. Duplicated genes were counted only once.

b Number of group I (GI) and group II (GII) introns is given.

c Non-overlapping repeat elements were mapped on each genome with RepeatMasker using as input sequences the repeats of at least 30 bp identified with REPuter.

d This value includes four pseudogenes (rpoB, rpoC1, ycf20 and ycf47).

Fig. 3

Chloroplast gene content and organization in Gloeotilopsis. (A) Gene map of the G. sarcinoidea chloroplast genome. Filled boxes on the gene map represent genes, with colors denoting gene categories as indicated in the legend. Duplicated gene sequences are denoted with filled circles and sequences of mitochondrial origin are highlighted in yellow. Genes on the outside are transcribed counterclockwise, whereas those on the inside are transcribed clockwise. Thick lines in the innermost ring represent the gene clusters shared with Pseudendoclonium; those in the second outermost inner ring represent the gene clusters shared with G. planctonica. (B) Comparison of gene content in ulvophycean cpDNAs. Genes and pseudogenes are shown in dark and light blue, respectively. Only the conserved genes showing a variable distribution are indicated. All compared genomes share the following set of 95 genes: accD, atpA, B, E, F, H, I, ccsA, cemA, chlI, clpP, ftsH, infA, petA, B, D, G, L, psaA, B, C, I, J, M, psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z, rbcL, rpl2, 5, 14, 16, 19, 20, 23, 32, 36, rpoA, C2, rps2, 3, 4, 7, 8, 9, 11, 12, 14, 18, 19, rrf, rrl, rrs, tufA, ycf1, 3, 4, 12, trnA(ugc), C(gca), D(guc), E(uuc), F(gaa), G(gcc), G(ucc), H(gug), I(gau), K(uuu), L(uaa), L(uag), Me(cau), Mf(cau), N(guu), P(ugg), Q(uug), R(acg), R(ucu), S(gcu), S(uga), T(ugu), V(uac), W(cca), Y(gua). (C) Alignment of the G. sarcinoidea and G. planctonica cpDNA regions extending from rrs to petD. Sequence identity is denoted by the grey scale.

This value includes four pseudogenes (rpoB, rpoC1, ycf20 and ycf47).

The 41-kb size difference between the G. planctonica and G. sarcinoidea cpDNAs is essentially accounted for by longer intergenic regions in the latter genome (133 kb versus 92 kb). Both intergenic regions and introns, however, contribute to the larger sizes of the Gloeotilopsis cpDNAs relative to previously sequenced ulvophycean genomes. For instance, in the Pseudendoclonium genome, the intergenic regions are 81 kb in length, and the intron sequences span 30 kb as compared to 50 kb in the Gloeotilopsis genomes, even though the total number of introns is comparable in both ulvophycean lineages (table 2). In contrast to the Oltmannsiellopsis and Pseudendoclonium cpDNAs, whose introns all belong to the group I family, the Gloeotilopsis genomes contain many group II introns (representing 14.1% and 11.8% of the G. planctonica and G. sarcinoidea cpDNAs, respectively); the latter introns, in particular those containing ORFs, account for the increased proportion of intron sequences in the Gloeotilopsis lineage.

Considering our current understanding of the phylogenetic relationships among major ulvophycean lineages (Škaloud et al. 2013) and also the fact that there is no solid evidence for de novo gain of the IR by chloroplast genomes, our results indicate that the IR was lost at least once in the Ulotrichales, bringing to three the number of independent IR losses known in the Ulvophyceae. Although multiple IR losses have also been reported in the Trebouxiophyceae (Turmel et al. 2015), Prasinophyceae (Lemieux et al. 2014), and streptophytes (Ruhlman and Jansen 2014; Blazier et al. 2016; Lemieux et al. 2016), the molecular mechanism(s) underlying these events remain(s) unclear. One might envision that multiple rounds of IR contraction by a mechanism involving illegitimate recombination (Goulding et al. 1996; Wang et al. 2008) ultimately lead to IR loss. 
This hypothesis predicts that intermediate IR forms housing partial rRNA operons would be created; however, such observations are extremely rare (Guisinger et al. 2011) and recent reports suggest that the rRNA operon may be disrupted before IR loss (Turmel et al. 2015; Lemieux et al. 2016).

Our comparison of the Gloeotilopsis cpDNA regions corresponding to the IR and adjacent sequences in the Pseudendoclonium genome provides hints about the process of IR loss in the Ulotrichales (fig. 4). The rRNA operon, which exists in its ancestral form in Pseudendoclonium, has been broken into two pieces in the Gloeotilopsis lineage: trnI(gau) and trnA(ugc) are still linked but are far apart from the rrs, rrl and rrf genes, which are also clustered and in the same order as in the ancestral operon (fig. 4A). Importantly, if we exclude a small inversion involving the minD/trnMe(cau) genes, the rRNA gene cluster is bordered on each side by a long segment of conserved genes that is perfectly colinear with the corresponding Pseudendoclonium sequence neighboring the IR. As a possible interpretation of these results, we suggest that the trnI(gau) and trnA(ugc) genes were deleted from the rRNA operon ancestrally located in the large syntenic region and that these two genes were the only genetic elements that were retained from the other copy of the rRNA operon during the process of IR loss in the Gloeotilopsis lineage (fig. 4B). The alternative explanation that the trnI(gau) and trnA(ugc) gene pair in this large syntenic block was translocated to another locus is unlikely, given that this type of gene rearrangement has not been documented for chloroplast genomes. Analyses of additional ulotrichalean chloroplast genomes will be necessary to further support or reject the notion that genes making up the rRNA operon can be differentially lost from the IR copies, thereby leading to disruption of the rRNA operon, during the elimination of the quadripartite structure.
Model of IR loss from the chloroplast genome based on the comparison of the Pseudendoclonium and G. planctonica cpDNAs. (A) Comparison of gene order and gene content in the regions surrounding the genes making up the rRNA operon. Shown in red are the genes from the rRNA operon that are missing in Gloeotilopsis. (B) Model of IR loss proposed for the Ulotrichales. Genes differentially lost from the two copies of the rRNA operon are shown in red.

Intergenic Regions of the G. sarcinoidea cpDNA Include Sequences of Mitochondrial Origin, Gene Fragments, and Small Dispersed Repeats

Remarkably, short sequences of mitochondrial origin were detected at two distinct loci of the G. sarcinoidea cpDNA. The psbC/ftsH spacer includes a 530-bp sequence corresponding to the 5′ part of atp1 and a 408-bp sequence spanning the entire coding region of rps12, whereas the petB/rrf spacer contains four coding sequences with a total length of 1529 bp that originate from nad7 and the adjacent LHE-encoding gene present in ulotrichalean mtDNAs, as well as from the separate locus containing trnS(gcu) (fig. 3A). Although the chloroplast mitochondrial-like rps12 and trnS(gcu) sequences cover the complete coding regions of these genes, they have clearly undergone pseudogenization: the rps12 sequence displays frameshifts at two distinct locations resulting from 4-bp and 5-bp insertions and includes a UGA stop codon, whereas the trnS(gcu) sequence contains a 4-bp deletion within the stem of the T-arm and a C→T substitution in the highly conserved TΨC sequence of the T-loop.

The mitochondrial sequences we uncovered in the G. sarcinoidea cpDNA are unlikely to be the result of a chimeric assembly for the following reasons. First, the assembly of this genome yielded a single contig with overlapping sequences at both ends that allowed circularization of the genome. Second, coverage depth was high (average depth = 319) and uniform throughout the genome, and crucially, several hundred reads spanned the junctions between the mtDNA insertions and adjacent cpDNA sequences. Third, the sequences of the mtDNA insertions differed from their homologs in the mitochondrial genome (see below). To identify the potential donor(s) of the chloroplast mitochondrial-like atp1, rps12, nad7, and trnS(gcu) sequences, phylogenetic trees were inferred with RAxML using individual gene datasets comprising these sequences and the corresponding mitochondrial sequences available for G. sarcinoidea, G. planctonica and other green plants (fig. 5). 
All chloroplast mitochondrial-like sequences were found to be nested within the clade containing the G. sarcinoidea and G. planctonica mitochondrial genes, occupying a position sister to one of the Gloeotilopsis sequences or to both algal sequences. The Gloeotilopsis clade received strong bootstrap support in the nad7 and atp1 trees (bootstrap values of 92% and 94%, respectively). These results support the notion that mtDNA fragments were transferred to the chloroplast during the evolution of the Gloeotilopsis genus. Only the nad7 tree is sufficiently resolved to distinguish unambiguously that the chloroplast captured this gene in the lineage leading to G. sarcinoidea. Although no mitochondrial-like sequences were identified with confidence in the G. planctonica cpDNA (E-value threshold of 1e−20), it is possible that mitochondrion-to-chloroplast DNA transfers occurred in a common ancestor of the two Gloeotilopsis species or in the lineage leading to G. planctonica and that the transferred mitochondrial sequences diverged rapidly following their insertion. In agreement with this hypothesis, we identified a short sequence showing limited similarity to cox1 in the G. planctonica cpDNA (genomic coordinates 48,919–48,980, E-value of 2e−12).
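
The two signs of pseudogenization noted above for the chloroplast copy of rps12 (frameshifting insertions and a premature stop codon) can be illustrated with a minimal Python sketch. It is not the procedure used in the study. The function names and the toy sequences are hypothetical, and the standard stop codons (UAA, UAG, UGA) are assumed.

```python
# Standard stop codons (an assumption of this sketch), written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def premature_stops(cds):
    """Return 0-based codon indices of in-frame stop codons,
    ignoring whatever occupies the final complete codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]

def frame_preserved(cds):
    """True when the sequence length is a multiple of three."""
    return len(cds) % 3 == 0

# Hypothetical toy sequences, not taken from the Gloeotilopsis genomes.
intact = "ATGGCTAAAGGTTAA"
# Same sequence with "TG" inserted after the second codon: the insertion
# shifts the frame and creates an in-frame TGA.
broken = "ATGGCTTGAAAGGTTAA"

print(premature_stops(intact), frame_preserved(intact))
print(premature_stops(broken), frame_preserved(broken))
```

A real analysis would compare the sequence with its functional homolog in an alignment, because an insertion is only recognizable relative to a reference.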
Phylogenetic relationships between the mitochondrial-like sequences in G. sarcinoidea cpDNA and the corresponding genes of green plant mtDNAs. The RAxML trees were inferred using the GTR + Γ4 model. Branches that received ≥ 80% bootstrap support are denoted by thick lines. The names of the G. sarcinoidea mitochondrial-like cpDNA sequences are highlighted in red.

To the best of our knowledge, our study provides the first evidence for the transfer of gene sequences from the mitochondria to the chloroplast in green algae. In flowering plants, only two cases of mitochondrion-to-plastid DNA transfer have been documented (Smith 2014); the first in a carrot ancestor (Goremykin et al. 2009; Iorizzo et al. 2012a, 2012b) and the second in the common milkweed (Straub et al. 2013). The scarcity of inter-organelle DNA exchanges that occurred in this direction in land plants has been suggested to be due to the lack of a specific DNA-uptake mechanism in the chloroplast (Bock 2010); however, it has been shown that under certain conditions that might be encountered during environmental stress, DNA can move across the chloroplast envelope due to transient alteration of its permeability (Cerutti and Jagendorf 1995). The DNA transfer reported for the carrot ancestor was proposed to result from transposition of a non-LTR retrotransposon, while that reported for the milkweed was attributed to homologous recombination events involving sequences of chloroplast origin that were identified near the transferred sequence in milkweed mtDNA. Although we have not detected any sequences of obvious chloroplast origin in Gloeotilopsis mitochondrial genomes, mtDNA fragments might have entered the Gloeotilopsis chloroplast genome through homologous recombination using small repeats shared by the two organelle genomes (see below). Another unusual feature of the G. sarcinoidea cpDNA is the presence of duplicated sequences for psbA, psbC, petD and the three rRNA genes. 
These sequences, which are denoted by large dots in figure 3A, map to six distinct intergenic regions, including the two harboring mtDNA insertions, and except for rrf, which is entirely duplicated, they display either the 5′ or 3′ coding region. While repeats consisting of gene fragments (i.e., pseudogenes) are often found in highly rearranged chloroplast genomes of angiosperms (Guisinger et al. 2011), they have been rarely observed in green algal cpDNAs (Smith et al. 2010). The G. sarcinoidea psbC, petD and rrl pseudogenes correspond to the 3′ coding region and are located immediately downstream and on the same DNA strand as the functional gene copies. Similar arrangements, presumably resulting from the insertion of a mobile element, have been documented for the chloroplast of the chlorophycean green alga Dunaliella salina (Smith et al. 2010) and the mitochondria of the fungus Allomyces macrogynus (Paquin et al. 1994), and in both cases, an ORF encoding a homing endonuclease of the GIY-YIG family separates the functional protein-coding gene from the pseudogene. In this context, it is worth noting that for the G. sarcinoidea and G. planctonica rrf/petB intergenic regions, which differ by an addition/deletion of 15.6 kb, the extra DNA segment of G. sarcinoidea contains at one junction a mtDNA insertion that includes part of a LAGLIDADG ORF and at the other junction the 3′-coding region of rrl and the second rrf copy (fig. 3C). In contrast to the G. sarcinoidea genome, the G. planctonica cpDNA exhibits only repeats of 5′ coding regions (rns, psbA and rpoC2) and the four detected pseudogenes lie within intergenic regions that coincide with endpoints of gene rearrangements between the two Gloeotilopsis genomes (supplementary fig. S2, Supplementary Material online).

The proportion of small repeats in the G. sarcinoidea genome (11.6%) is three-fold higher than in the G. planctonica cpDNA but is comparable to the values observed for Oltmannsiellopsis and Bryopsis hypnoides (table 2). Most of the G. sarcinoidea repeats reside in intergenic regions where they represent a diversified collection of sequences; a small proportion of repeats (spanning 2.0% of the genome) is accounted for by the presence of gene fragments and highly conserved group II intron sequences (see below). We identified many tandem repeats in G. sarcinoidea, several of which are dispersed throughout the genome. Notably, repeat units of 15 and 13 bp were found to be shared with the mtDNAs of G. sarcinoidea (TGGGAAAACTTTTCC, 11 identical copies in cpDNA and 8 in mtDNA) and G. planctonica (TTTATCAAAAAAG, 112 identical copies in cpDNA and 12 in mtDNA), respectively. The 13-bp unit is part of a 20-bp sequence (TTTATCAAAAAAGTTTTTTG) that is repeated in tandem at 19 different sites in the G. sarcinoidea chloroplast genome.
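
Counts of identical copies of a short repeat unit, such as those given above, can in principle be obtained by exact string matching on both strands. The following Python sketch illustrates that idea only. The helper names and the toy genome string are hypothetical, a linear sequence is assumed (a circular genome would also need matches spanning the origin), and the study itself relied on dedicated repeat-finding tools.

```python
UNIT_15 = "TGGGAAAACTTTTCC"       # 15-bp unit quoted in the text
UNIT_13 = "TTTATCAAAAAAG"         # 13-bp unit quoted in the text
UNIT_20 = "TTTATCAAAAAAGTTTTTTG"  # 20-bp tandemly repeated sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_exact(sequence, unit):
    """Count exact (possibly overlapping) occurrences on one strand."""
    count, start = 0, sequence.find(unit)
    while start != -1:
        count += 1
        start = sequence.find(unit, start + 1)
    return count

def count_copies(sequence, unit):
    """Count exact copies of `unit` on both strands of a linear sequence."""
    total = count_exact(sequence, unit)
    rc = reverse_complement(unit)
    if rc != unit:                # avoid double counting palindromic units
        total += count_exact(sequence, rc)
    return total

# Hypothetical toy sequence: one forward and one reverse-strand copy.
toy = "AAA" + UNIT_13 + "CCC" + reverse_complement(UNIT_13) + "GG"
print(count_copies(toy, UNIT_13))
print(UNIT_20.startswith(UNIT_13))   # the 13-bp unit begins the 20-bp sequence
```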

All Gloeotilopsis Chloroplast Group I Introns Occur at Previously Identified Insertion Sites

The group I introns identified so far in ulvophycean cpDNAs are distributed among 15 genes and represent a total of 45 insertion sites, most of which also occur in other groups of chlorophytes (fig. 6). The 14 G. planctonica and 15 G. sarcinoidea introns are inserted at 19 distinct sites. Considering that most of these sites are also found in Pseudendoclonium and/or Ulva species, many group I introns were likely inherited vertically in ulotrichalean and ulvalean lineages. The majority of the Gloeotilopsis introns (those at 17 sites) display an ORF, with the LAGLIDADG family of homing endonucleases being the most represented (nine sites), followed by the GIY-YIG family (five sites). Based on the observation that the Pseudendoclonium chloroplast atpA intron at site 489 and the mitochondrial atp1 intron at site 522 share not only the same insertion site but also highly similar secondary structures and LAGLIDADG ORFs, it was hypothesized earlier that an inter-organellar, lateral DNA transfer event involving a group I intron took place specifically in the Ulvophyceae (Pombert et al. 2005). The direction of this inter-organellar DNA transfer remains unclear owing to the limited information currently available for the distributions of the atpA 489 and atp1 522 introns among chlorophytes.
Group I introns in Gloeotilopsis and other ulvophycean cpDNAs. A grey box represents an intron lacking an ORF, whereas a colored box represents an intron containing an ORF (see the color code for the type of intron-encoded protein). Arrows denote the intron insertion sites that have not been observed in other groups of chlorophytes. Intron insertion sites in protein-coding and tRNA genes are given relative to the corresponding genes in M. viride cpDNA (Lemieux et al. 2000); insertion sites in rrs and rrl are given relative to E. coli 16S and 23S rRNAs, respectively.

Many Group II Introns in the Gloeotilopsis Chloroplast Genomes Arose by Intragenomic Proliferation

Group II introns are scattered among 26 protein-coding genes where they occupy 38 distinct sites, and in contrast to the situation prevailing for group I introns, the great majority of the insertion sites appear to be unique to ulvophycean green algae, with 24 sites specific to Gloeotilopsis species, just one site specific to the Ulvales, and nine to the Bryopsidales (fig. 7A). G. planctonica and G. sarcinoidea share only two insertion sites, one of which is also present in the Bryopsidales (atpA 753). All 12 G. sarcinoidea group II introns encode proteins with RT, intron maturase and/or H-N-H endonuclease domains, whereas most of the 17 G. planctonica introns lack ORFs and thus depend on proteins encoded by other group II introns for splicing and mobility. As reported for the genome of the thermophilic cyanobacterium Thermosynechococcus elongatus in which 20 of the 28 group II introns do not encode any proteins (Mohr et al. 2010), the high proportion of ORF-less introns in G. planctonica may reflect an evolutionary pressure for a smaller and more compact intron structure enabling increased efficiency of splicing and mobility.
Analysis of group II introns in Gloeotilopsis cpDNAs. (A) Comparison of their insertion sites and ORFs with those of group II introns in other ulvophycean cpDNAs. A grey box represents an intron lacking an ORF, whereas a colored box represents an intron containing an ORF (see the color code for the domains present in the encoded protein: RT, reverse transcriptase; X, intron maturase; EN, H-N-H endonuclease). Arrows denote the intron insertion sites that have not been observed in other groups of chlorophytes. Intron insertion sites in protein-coding and tRNA genes are given relative to the corresponding genes in M. viride cpDNA (Lemieux et al. 2000); insertion sites in rrs and rrl are given relative to E. coli 16S and 23S rRNAs, respectively. For each insertion site, the position corresponding to the nucleotide immediately preceding the intron is reported. Note that a few introns were not correctly annotated in the GenBank accessions of Bryopsidales and Ulvales and that changes related to intron types and insertion positions were introduced during this study. (B) Phylogenetic relationships among Gloeotilopsis group II introns. This tree was inferred by RAxML analysis of an alignment of 286 nucleotides corresponding to the core secondary structures of the group II introns. Branches that received ≥ 78% bootstrap support are denoted by thick lines (the actual bootstrap values are shown). Gp, G. planctonica; Gs, G. sarcinoidea.

Remarkably, the first intron in G. planctonica petA as well as the psbZ intron of the same alga are located in 5′ untranslated gene sequences. These observations represent the first cases of introns in non-coding regions of green plant organelle genes. The petA and psbZ introns were detected owing to their high sequence identity with several other group II introns of the G. planctonica and G. sarcinoidea cpDNAs. 
Indeed, in the course of identifying repeats using LAST and REPuter, we found that several chloroplast group II introns at distinct insertion sites share long regions with high nucleotide identities. For instance, the G. sarcinoidea ycf3 and psbH introns are 85.6% identical over an alignment of 2563 bp. Note that LAST analysis also revealed in the 3′ untranslated sequence of the G. planctonica psbZ the remnant sequence of a group II intron (positions 151,853–152,004) that is similar (E-value of 1.3e−31) to the 5′ part of domain I of the ycf12 intron in the same alga. To delineate the relationships among the 29 Gloeotilopsis group II introns, a global alignment of 286 nucleotides corresponding to the core secondary structures of these introns was submitted to phylogenetic analysis using RAxML under the GTR + Γ4 model (fig. 7B). Four clades (I–IV) with a total of 23 introns received bootstrap support ≥79%. The introns in clades I, II, and III occupy distinct insertion sites and form three families of very closely related sequences as judged by the very short lengths of the observed branches. Clade III is unique in comprising only ORF-less introns from G. planctonica; deletion of the ORF probably occurred prior to the divergence of the five introns, with both splicing and dispersal to novel (ectopic) sites being promoted by the protein encoded by a closely related, but not yet identified group II intron. In addition to the rbcL intron unique to G. sarcinoidea, clade IV comprises the Gloeotilopsis atpA and atpB introns sharing common insertion sites; the distinct clades formed by the latter two pairs of introns support the idea that the G. planctonica and G. sarcinoidea introns in each clade are related through vertical descent. The fifth clade that we identified displays the most divergent introns. 
These findings, which highlight the first case of intragenomic proliferation of organellar group II introns in the Viridiplantae, suggest that several independent waves of intron dispersal occurred in the Gloeotilopsis chloroplast. Evidence for group II intron proliferation to high copy number in chloroplast genomes has previously been reported only for euglenids (Hallick et al. 1993; Pombert et al. 2012; Bennett and Triemer 2015) and the red alga Porphyridium purpureum (Perrineau et al. 2015). Different pathways of intron proliferation to ectopic sites have been elucidated for the Lactococcus lactis Ll.LtrB IIA intron in its native host and in Escherichia coli, and for the CL/IIB1 introns in the thermophilic cyanobacterium Thermosynechococcus elongatus (Lambowitz and Belfort 2015). Although these mechanisms have in common a target DNA-primed reverse transcription step in which the excised intron RNA reverse splices into one strand of a DNA target site and is then reverse transcribed by the intron-encoded protein to produce an intron cDNA that is integrated into the genome, they differ in a number of ways, including the recognition of the target site by the ribonucleoprotein complex formed by the group II intron RNA and intron-encoded protein. To gain insights into the pathway underlying the dissemination of group II introns in Gloeotilopsis cpDNAs, we compared the secondary structure models of the introns from clade I as well as the flanking 5′ and 3′ exon sequences that contain the intron-binding sites IBS1, IBS2, and IBS3. The consensus of these intron secondary structures is characteristic of the CL/IIB1 introns (Toor et al. 2001), with 84% of the represented positions (545/651) displaying identical residues in at least seven of the nine introns (fig. 8). 
Remarkably, the single-stranded loops of domain I containing the exon-binding sites EBS1, EBS2, and EBS3, which play a major role in exon recognition during intron splicing and retrohoming, feature highly variable sequences. But the EBS1 and EBS2 sequences of each intron show perfect or nearly perfect complementarity with the IBS sequences in the 5′ exon, and only the EBS3 motif of the G. planctonica intron located in the 5′ untranslated region of petA cannot base pair with IBS3 in the 3′ exon (fig. 8). Essentially the same observations were made regarding the EBS-IBS interactions of the clade-II and clade-III introns (fig. 8).
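
The conservation figure quoted above (identical residues in at least seven of the nine introns at 545 of 651 positions) corresponds to a simple column-wise tally over an alignment. The Python sketch below shows one way such a tally could be computed. The function name and the toy alignment are hypothetical, and the treatment of gaps is an assumption, since the text does not say how gapped positions were handled.

```python
from collections import Counter

def conserved_fraction(aligned, min_count):
    """Fraction of alignment columns in which the most common residue
    (gaps excluded) is present in at least `min_count` sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    conserved = 0
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != "-")
        if counts and counts.most_common(1)[0][1] >= min_count:
            conserved += 1
    return conserved / length

# Hypothetical three-sequence toy alignment (not intron data).
toy_alignment = ["ACGU", "ACGA", "AC-C"]
print(conserved_fraction(toy_alignment, 3))   # columns identical in all three
print(conserved_fraction(toy_alignment, 2))   # columns shared by at least two

# The reported proportion rounds to the stated percentage.
print(round(100 * 545 / 651))
```

For the nine clade-I introns, the equivalent call would use `min_count=7` on the aligned intron sequences.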
Consensus secondary structure model of the nine clade-I group II introns and insertion sites of introns from clades I, II, and III. The consensus secondary structure model is displayed according to Toor et al. (2001). Highly conserved (in seven or more introns) and less conserved (in at least five introns) nucleotide positions are shown in uppercase and lowercase characters respectively; the other residues are denoted by dots. Highly conserved (in seven or more introns) and less conserved base pairings (in at least five introns) are denoted by thick and thin dashes respectively. Roman numbers specify the major structural domains, with long-range tertiary interactions being denoted by Greek letters. Variations in size of peripheral regions are indicated by numbers inside the loops. The 5' and 3' exon sequences flanking the introns from clades I, II, and III are aligned along with the intron sequences spanning the exon-binding sites EBS1, EBS2, and EBS3. Colors highlight EBS sequences and complementary nucleotide residues in the IBS exon sequences.

The above results suggest that the spread of group II introns to ectopic sites in the Gloeotilopsis chloroplast genome occurred through several independent waves of retrohoming (the mobility mechanism used to maintain group II introns at cognate sites) following mutations in the EBS sequences of founding introns that enabled insertions into different target sites. It has been previously shown that retrohoming and divergence of DNA target specificity also contributed to the proliferation of CL/IIB1 introns in the Thermosynechococcus genome (Mohr et al. 2010) and of a subset of group II introns in Wolbachia bacterial endosymbionts (Leclercq et al. 2011). In contrast, as reported for the yeast mitochondrial group II intron aI1 inserted in the cox1 gene (Dickson et al. 2001), the Ll.LtrB intron spreads to ectopic sites in its native host by retrotransposition, a mechanism occurring at lower frequency than retrohoming and involving relaxed EBS-IBS sequence interactions for intron integration (Ichiyanagi et al. 2002).
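
The distinction drawn above between strict and relaxed EBS-IBS interactions rests on how many positions of an exon-binding site can base pair with the matching intron-binding site. The Python sketch below scores that complementarity. The function name and the motifs are hypothetical rather than taken from figure 8, and the sketch assumes that G-U wobble pairs count as pairs and that both motifs have the same length.

```python
# Watson-Crick pairs plus the G-U wobble pair tolerated in RNA helices.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def paired_positions(ebs, ibs):
    """Count base pairs between an EBS and an IBS, both written 5'->3'.
    The strands pair antiparallel, so the IBS is read in reverse."""
    ebs = ebs.upper().replace("T", "U")
    ibs = ibs.upper().replace("T", "U")
    if len(ebs) != len(ibs):
        raise ValueError("this sketch only compares motifs of equal length")
    return sum((e, i) in PAIRS for e, i in zip(ebs, reversed(ibs)))

# Hypothetical motifs for illustration only.
ebs1 = "GACUUA"
cognate_ibs1 = "UAAGUC"   # fully complementary to ebs1
ectopic_ibs1 = "UAAGUA"   # one mismatched position

print(paired_positions(ebs1, cognate_ibs1), "of", len(ebs1))
print(paired_positions(ebs1, ectopic_ibs1), "of", len(ebs1))
```

Under the retrohoming scenario proposed here, each dispersed intron would be expected to score at or near full complementarity with its own flanking exon, which is the pattern reported for the clade-I, clade-II and clade-III introns.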

Conclusion

Prior to this study, Pseudendoclonium was the sole representative of the Ulotrichales that had been sampled for mitochondrial and chloroplast genome sequencing. By incorporating the newly sequenced organelle genomes of two Gloeotilopsis species, our comparative analyses have provided a better understanding of the evolutionary histories of the mitochondrial and chloroplast genomes in the Ulvales–Ulotrichales, revealing novel genomic features in both genomes (LAGLIDADG ORF-containing group II introns, group II introns in non-coding regions, and pseudogenes resulting from duplications) as well as unanticipated phenomena that contributed to the expansion of the chloroplast genome in the Gloeotilopsis lineage. We showed that sequences accumulated in the chloroplast genome in this lineage through intracellular mitochondrion-to-chloroplast DNA transfers and intragenomic proliferation of group II introns through EBS sequence divergence and retrohoming. Moreover, our results revealed that the chloroplast genome experienced a minimum of two independent events of IR loss in the Ulvales–Ulotrichales, one during the evolutionary period separating the Pseudendoclonium and Gloeotilopsis lineages and the other during the evolution of ulvalean green algae. The comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Analysis of chloroplast genomes from additional representatives of the Ulotrichales will be necessary to test the hypothesis that breakage of the rDNA operon was intimately linked with IR loss in this lineage.
References (first 10 of 69 in total)

1. L Dickson; H R Huang; L Liu; M Matsuura; A M Lambowitz; P S Perlman. Retrotransposition of a yeast group II intron occurs by reverse splicing directly into ectopic DNA sites. Proc Natl Acad Sci U S A. 2001-10-30.

2. Sébastien Leclercq; Isabelle Giraud; Richard Cordaux. Remarkable abundance and evolution of mobile group II introns in Wolbachia bacterial endosymbionts. Mol Biol Evol. 2010-09-06.

3. Ralph Bock. The give-and-take of DNA: horizontal gene transfer in plants. Trends Plant Sci. 2009-11-10. (Review)

4. D Gordon; C Abajian; P Green. Consed: a graphical tool for sequence finishing. Genome Res. 1998-03.

5. John C Blazier; Robert K Jansen; Jeffrey P Mower; Madhu Govindu; Jin Zhang; Mao-Lun Weng; Tracey A Ruhlman. Variable presence of the inverted repeat and plastome stability in Erodium. Ann Bot. 2016-04-28.

6. Claudia B Monteiro-Vitorello; Georg Hausner; Denise B Searles; Ewan A Gibb; Dennis W Fulbright; Helmut Bertrand. The Cryphonectria parasitica mitochondrial rns gene: plasmid-like elements, introns and homing endonucleases. Fungal Genet Biol. 2009-07-14.

7. Georg Mohr; Eman Ghanem; Alan M Lambowitz. Mechanisms used for genomic proliferation by thermophilic group II introns. PLoS Biol. 2010-06-08.

8. Shannon C K Straub; Richard C Cronn; Christopher Edwards; Mark Fishbein; Aaron Liston. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 2013.

9. Monique Turmel; Jean-Charles de Cambiaire; Christian Otis; Claude Lemieux. Distinctive Architecture of the Chloroplast Genome in the Chlorodendrophycean Green Algae Scherffelia dubia and Tetraselmis sp. CCMP 881. PLoS One. 2016-02-05.

10. Claude Lemieux; Christian Otis; Monique Turmel. Six newly sequenced chloroplast genomes from prasinophyte green algae provide insights into the relationships among prasinophyte lineages and the diversity of streamlined genome architecture in picoplanktonic species. BMC Genomics. 2014-10-04.
Review 1.  Mobile Group II Introns as Ancestral Eukaryotic Elements.

Authors:  Olga Novikova; Marlene Belfort
Journal:  Trends Genet       Date:  2017-08-14       Impact factor: 11.639

2.  Proliferation of group II introns in the chloroplast genome of the green alga Oedocladium carolinianum (Chlorophyceae).

Authors:  Jean-Simon Brouard; Monique Turmel; Christian Otis; Claude Lemieux
Journal:  PeerJ       Date:  2016-10-25       Impact factor: 2.984

3.  Divergent copies of the large inverted repeat in the chloroplast genomes of ulvophycean green algae.

Authors:  Monique Turmel; Christian Otis; Claude Lemieux
Journal:  Sci Rep       Date:  2017-04-20       Impact factor: 4.379

4.  Tracing the Evolution of the Plastome and Mitogenome in the Chloropicophyceae Uncovered Convergent tRNA Gene Losses and a Variant Plastid Genetic Code.

Authors:  Monique Turmel; Adriana Lopes Dos Santos; Christian Otis; Roxanne Sergerie; Claude Lemieux
Journal:  Genome Biol Evol       Date:  2019-04-01       Impact factor: 3.416

5.  Extensive chloroplast genome rearrangement amongst three closely related Halamphora spp. (Bacillariophyceae), and evidence for rapid evolution as compared to land plants.

Authors:  Sarah E Hamsher; Kyle G Keepers; Cloe S Pogoda; Joshua G Stepanek; Nolan C Kane; J Patrick Kociolek
Journal:  PLoS One       Date:  2019-07-03       Impact factor: 3.240

6.  Common Repeat Elements in the Mitochondrial and Plastid Genomes of Green Algae.

Authors:  David Roy Smith
Journal:  Front Genet       Date:  2020-05-12       Impact factor: 4.599

7.  What Happened before Losses of Photosynthesis in Cryptophyte Algae?

Authors:  Shigekatsu Suzuki; Ryo Matsuzaki; Haruyo Yamaguchi; Masanobu Kawachi
Journal:  Mol Biol Evol       Date:  2022-02-03       Impact factor: 16.240

8.  Chloroplast genome expansion by intron multiplication in the basal psychrophilic euglenoid Eutreptiella pomquetensis.

Authors:  Nadja Dabbagh; Matthew S Bennett; Richard E Triemer; Angelika Preisfeld
Journal:  PeerJ       Date:  2017-08-25       Impact factor: 2.984

9.  Large Diversity of Nonstandard Genes and Dynamic Evolution of Chloroplast Genomes in Siphonous Green Algae (Bryopsidales, Chlorophyta).

Authors:  Ma Chiela M Cremen; Frederik Leliaert; Vanessa R Marcelino; Heroen Verbruggen
Journal:  Genome Biol Evol       Date:  2018-04-01       Impact factor: 3.416

10.  The Complete Chloroplast Genome of Two Important Annual Clover Species, Trifolium alexandrinum and T. resupinatum: Genome Structure, Comparative Analyses and Phylogenetic Relationships with Relatives in Leguminosae.

Authors:  Yanli Xiong; Yi Xiong; Jun He; Qingqing Yu; Junming Zhao; Xiong Lei; Zhixiao Dong; Jian Yang; Yan Peng; Xinquan Zhang; Xiao Ma
Journal:  Plants (Basel)       Date:  2020-04-09