Frederik Leliaert, Juan M Lopez-Bautista.
Abstract
BACKGROUND: Species of Bryopsidales form ecologically important components of seaweed communities worldwide. These siphonous macroalgae are composed of a single giant tubular cell containing millions of nuclei and chloroplasts, and harbor diverse bacterial communities. Little is known about the diversity of chloroplast genomes (cpDNAs) in this group, and about the possible consequences of intracellular bacteria on genome composition of the host. We present the complete cpDNAs of Bryopsis plumosa and Tydemania expeditiones, as well as a re-annotated cpDNA of B. hypnoides, which was shown to contain a higher number of genes than originally published. Chloroplast genomic data were also used to evaluate phylogenetic hypotheses in the Chlorophyta, such as monophyly of the Ulvophyceae (the class in which the order Bryopsidales is currently classified).
Year: 2015 PMID: 25879186 PMCID: PMC4487195 DOI: 10.1186/s12864-015-1418-3
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Phylogenetic relationships among the main clades of green algae, focusing on the core Chlorophyta. The tree is a composite of accepted relationships based on molecular phylogenetic evidence from different studies [17-19]. Uncertain or conflicting relationships are indicated by polytomies or question marks. Numbers in square brackets indicate the number of (nearly) complete cpDNAs available to date. Clades currently classified as Ulvophyceae are in green, and Trebouxiophyceae are in yellow.
Figure 2. Gene map of the chloroplast genome of Bryopsis plumosa. The 106,859 bp genome contains 115 unique genes, including three ribosomal RNA genes, 26 transfer RNA genes, and 86 protein-coding genes. Genes shown on the outside of the circle are transcribed counterclockwise. Annotated genes are colored according to the functional categories shown in the legend at the bottom left. The red arcs indicate gene regions of putative bacterial origin.
Figure 3. Gene map of the chloroplast genome of Tydemania expeditiones. The 105,200 bp genome contains 125 unique genes, including three ribosomal RNA genes, 28 transfer RNA genes, and 94 protein-coding genes. Genes shown on the outside of the circle are transcribed counterclockwise. Annotated genes are colored according to the functional categories shown in the legend at the bottom left. The red arcs indicate gene regions of putative bacterial origin.
Summary of the Bryopsis plumosa and Tydemania expeditiones cpDNAs and comparison with other ulvophycean cpDNAs
| | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 106.9 | 30.8 | 115 | 79/7 | 3 | 26 | 74.4 | 20.1 | 5.5 | 8.3 | 13 | - | This study |
| | 105.2 | 32.8 | 125 | 77/17 | 3 | 28 | 77.5 | 16.3 | 6.3 | 11.2 | 11 | - | This study |
| | 153.4 | 33.1 | 131 | 78/11 | 3 | 39 | 55.9 | 38.9 | 5.2 | 7.1 | 11 | - | [ |
| | 195.8 | 31.5 | 135 | 74/29 | 3 | 29 | 53.9 | 37.5 | 8.5 | 12.3 | 28 | + | [ |
| | 151.9 | 40.5 | 112 | 75/8 | 3 | 26 | 57.9 | 39.5 | 2.5 | 6.8 | 10 | + | [ |
ᵃ Re-annotated in this study (see Additional file 2).
ᵇ Including intronic open reading frames (ORFs).
ᶜ Excluding intronic ORFs.
ᵈ Only ORFs with a significant blastp result (E value < 1e-04) are included.
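The E-value cutoff in the last footnote can be illustrated with a short filter over BLAST tabular output. This is a minimal sketch, not the authors' pipeline: it assumes the default `-outfmt 6` column layout (E value in the eleventh column), and the function name, query IDs, and scores are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-4  # threshold stated in the table footnote


def significant_orfs(blast_tsv, cutoff=E_VALUE_CUTOFF):
    """Return the set of query IDs with at least one hit below the cutoff.

    Assumes BLAST tabular output (-outfmt 6) with default columns,
    where column 11 (index 10) holds the E value.
    """
    keep = set()
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[10]) < cutoff:
            keep.add(row[0])
    return keep


# Hypothetical rows (made-up IDs and scores, not data from the study).
example = (
    "orf_A\thit1\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-39\t150\n"
    "orf_B\thit2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20\n"
)
print(sorted(significant_orfs(example)))
```

With these made-up rows, only the query whose best hit falls below the cutoff is retained.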
Comparison of tRNA genes in Bryopsis hypnoides, B. plumosa, Tydemania expeditiones, and two other species of Ulvophyceae (Pseudendoclonium akinetum and Oltmannsiellopsis viridis)
|
|
|
|
|
| |
|---|---|---|---|---|---|
|
| 1* | ||||
|
| 1* | ||||
|
| 1,1* | 1 | 1 | 2 | 2 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1,1* | 2 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | |
|
| 1 | 1 | 1 | 2 | 2 |
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1* | 1 | 1 | ||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1 | ||||
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1* | 1 | 1 | ||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1* | ||||
|
| 1* | ||||
|
| 1 | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
| 1,1* | 1 | 1 | 1 | 1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ᵃ The trnM(cau) gene situated in the large tRNA region in B. hypnoides showed similarity (blastn E value < 2e-10) with land plant nuclear-encoded tRNA-Met initiator (Met-tRNA-i) genes.
Numbers indicate gene copy number. An asterisk indicates that the gene or gene copy is situated in the large tRNA region in B. hypnoides (Additional file 2). B. hyp = Bryopsis hypnoides, B. plu = B. plumosa, T. exp = Tydemania expeditiones, P. aki = Pseudendoclonium akinetum, O. vir = Oltmannsiellopsis viridis.
Figure 4. Comparison of protein-coding gene content among core Chlorophyta. Pd = Pedinophyceae. 80 genes that are shared among the 20 cpDNAs are not included: atpA, B, E, F, H, cemA, clpP, ftsH, petB, D, G, L, psaA, B, C, J, psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, 23, 36, rps2, 3, 4, 7, 8, 9, 11, 12, 14, 18, 19, tufA, ycf1, ycf3, ycf4, rpoA, B, C1, C2, rrf, rrl, rrs, and 20 tRNA genes (trnA, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y). ftsH and ycf1 are present in C. reinhardtii as ORF2971 and ORF1995, respectively, and ycf1 is present in C. vulgaris as ORF819 ([91,92] (supplementary Table II)). ycf47 is present in C. vulgaris as ORF70 (determined by blastx). Data sources: B. hypnoides [13], P. akinetum [12], O. viridis [11], C. vulgaris [6], P. kessleri, O. solitaria, P. minor [22], C. subellipsoidea [93], L. terrestris [8], A. obliquus [94], D. salina [95], V. carteri [96], G. pectorale [97], C. reinhardtii [98], O. cardiacum [99], F. terrestris [7], S. leibleinii [4], S. helveticum [5].
Freestanding ORFs in Bryopsis plumosa and Tydemania expeditiones cpDNAs
| | | | | |
|---|---|---|---|---|
| | 2,676 | bacterial | <1e-39 | No similar green algal sequencesᵃ (apart from |
| | 212 | hypothetical protein, | 1e-26 | No similar green algal sequencesᵃ |
| | 76 | hypothetical protein, | 4e-19 | No similar green algal sequencesᵃ (apart from |
| | 446 | Bacterial RNA-directed DNA polymerase | <1e-32 | Contains a group II intron, maturase-specific domain. Similarity to a RT group II intron protein of |
| | 375 | Bacterial RNA-directed DNA polymerase | <1e-23 | Contains a group II intron, maturase-specific domain. Similarity with RT in |
| | 96 | bacterial DNA methyltransferase | <1e-22 | Contains Cytosine-C5 specific DNA methylase domain. No similar green algal sequencesᵃ |
| | 202 | bacterial DNA methyltransferase | <1e-56 | Contains Cytosine-C5 specific DNA methylase domain. No similar green algal sequencesᵃ |
| | 287 | bacterial transposase | <1e-23 | Contains an integrase core domain protein. No similar green algal sequencesᵃ |
| | 117, 177 | bacterial DNA polymerase | <1e-46 | Contains a DNA polymerase family A domain. No similar green algal sequencesᵃ |
| | 381 | bacterial DNA primase or phage/plasmid primase | <1e-97 | No similar green algal sequencesᵃ |
ᵃ To verify whether homologous genes were present in green algae, blastp searches were performed with organisms constrained to Viridiplantae (taxid:33090).
Distribution and characteristics of introns in Bryopsis plumosa and Tydemania expeditiones
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| | + | - | 239 | 489 | 69% | - | group I intron |
| | + | - | 728 | 753 | 70% | - | putative group II |
| | + | + | 481/373 | 108/108 | 82/77% | - | undetermined |
| | + | + | 385/370 | 210/210 | 79/78% | - | undetermined |
| | + | - | 2244 | 1797 | 69% | RT, IM | group II |
| | - | + | 3227 | 576 | 67% | RT, IM | group II |
| | + | - | 1103 | 600 | 73% | LHE | group I |
| | - | + | 2491 | 1131 | 57% | RT, IM | group II |
| | + | - | 427 | 27 | 82% | - | undetermined |
| | + | - | 389 | 171 | 77% | - | putative group II |
| | + | - | 351 | 24 | 85% | - | undetermined |
| | + | - | 848 | 141 | 64% | - | putative group II |
| | - | + | 776 | 1917 | 73% | LHE | group I |
| | - | + | 801 | 1923 | 69% | LHE | group I |
| | - | + | 1049 | 1931 | 67% | LHEᵇ | group I |
| | + | + | 1037/730 | 2598/2598 | 63/71% | LHE | group I |
| | - | + | 860 | 510 | 67% | LHE | group I |
| | - | + | 988 | 794 | 71% | LHE | group I |
| | + | + | 205/152 | 35 | 73/65% | - | putative group I |
| | + | - | 373 | 174 | 79% | - | putative group II |
ᵃ Intron insertion site positions correspond to the nucleotide immediately preceding the intron. Insertion sites in genes coding for proteins and the tRNA are given relative to the corresponding genes in Mesostigma viride cpDNA (GenBank NC_002186); insertion sites in the rRNA genes are given relative to the 16S and 23S rRNA genes of Escherichia coli (GenBank NC_004431).
ᵇ Contains two LAGLIDADG homing endonuclease domains.
RT = reverse transcriptase, IM = intron maturase, LHE = LAGLIDADG homing endonuclease.
Figure 5. Sequence repeats (yellow) in Bryopsis plumosa and Tydemania expeditiones cpDNAs. Arrows indicate the direction of the repeats. Protein-coding genes are indicated in green, tRNA genes in purple, and rRNA genes in red.
Figure 6. Alignments between the chloroplast genomes of Bryopsis plumosa, B. hypnoides, and Tydemania expeditiones. The Mauve algorithm [81] was used to align the cpDNAs between B. plumosa and B. hypnoides (A), and between B. plumosa and T. expeditiones (B). Corresponding colored boxes are locally collinear blocks (LCBs), which represent homologous regions of sequences that do not contain any major rearrangements. Inside each block a sequence similarity profile is shown. Inverted LCBs are presented as blocks below the center line. Annotations are shown above and below the LCBs: protein-coding genes as white boxes, tRNA genes in green, rRNA genes in red, and short repeats in pink. A lowered position of a box indicates inverted orientation. The asterisk indicates the large tRNA region in B. hypnoides.
Figure 7. Comparison of conserved gene clusters. (A) Comparison of gene clusters conserved between at least two of the four depicted species, representing different ulvophycean and trebouxiophycean lineages (based on Turmel et al. [22]), with the gene order found in the Bryopsidales (Bryopsis plumosa, B. hypnoides and Tydemania expeditiones). (B) Comparison of gene clusters conserved in Bryopsidales with the gene order found in four species, representing different ulvophycean and trebouxiophycean lineages. Black connected circles indicate gene clusters. Grey circles indicate genes that are located elsewhere on the cpDNA. White circles indicate genes that are missing from the cpDNA.
Figure 8. Phylogenetic trees of Chlorophyta inferred from a 79-gene dataset. (A) Bayesian majority rule tree showing all compatible partitions, inferred from the protein alignment of 79 concatenated chloroplast genes (16,205 amino acid positions and 44 terminal taxa). Node support is given as Bayesian posterior probabilities and maximum-likelihood (ML) bootstrap values of the protein analyses (above branches), and the nucleotide analyses (below branches); values <0.9 and 50, respectively, are not shown; asterisks indicate full support in both the Bayesian and ML analyses. (B) Bayesian tree inferred from the nucleotide alignment (first two codon positions) of 79 concatenated chloroplast genes (32,410 nucleotide positions and 44 terminal taxa). Node support is given as Bayesian posterior probabilities and maximum-likelihood (ML) bootstrap values, and asterisks indicate full support in both analyses.
Figure 9. Phylogenetic trees of Chlorophyta inferred from a 50-gene dataset. (A) Bayesian majority rule tree showing all compatible partitions, inferred from the protein alignment of 50 concatenated chloroplast genes (9,300 amino acid positions and 51 terminal taxa). Node support is given as Bayesian posterior probabilities and maximum-likelihood (ML) bootstrap values of the protein analyses (above branches), and the nucleotide analyses (below branches); values <0.9 and 50, respectively, are not shown; asterisks indicate full support in both the Bayesian and ML analyses. (B) Bayesian tree inferred from the nucleotide alignment (first two codon positions) of 50 concatenated chloroplast genes (18,600 nucleotide positions and 51 terminal taxa). Node support is given as Bayesian posterior probabilities and maximum-likelihood (ML) bootstrap values, and asterisks indicate full support in both analyses.
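The alignment lengths quoted for Figures 8 and 9 are consistent with keeping two nucleotides per aligned codon: 2 × 16,205 = 32,410 and 2 × 9,300 = 18,600. The following is a minimal sketch of removing third codon positions from an in-frame coding sequence; the function name and the toy sequence are hypothetical, and this is not presented as the authors' actual procedure.

```python
def first_two_codon_positions(cds):
    """Drop every third nucleotide from an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Keep positions 1 and 2 of each codon, skip position 3.
    return "".join(cds[i:i + 2] for i in range(0, len(cds), 3))


# Toy sequence (not from the study): ATG GCT TAA -> AT GC TA
print(first_two_codon_positions("ATGGCTTAA"))  # ATGCTA

# Length check against the figures quoted in the captions:
# amino acid positions * 2 = nucleotide positions retained.
assert 2 * 16205 == 32410  # 79-gene dataset
assert 2 * 9300 == 18600   # 50-gene dataset
```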