Claude Lemieux, Christian Otis, Monique Turmel.
Abstract
BACKGROUND: The Viridiplantae comprise two major phyla: the Streptophyta, containing the charophycean green algae and all land plants, and the Chlorophyta, containing the remaining green algae. Despite recent progress in unravelling phylogenetic relationships among major green plant lineages, problematic nodes remain in the green tree of life. One major issue concerns the scaly biflagellate Mesostigma viride, which is regarded either as representing the earliest divergence of the Streptophyta or as a separate lineage that diverged before the split of the Chlorophyta and Streptophyta. Phylogenies based on chloroplast and mitochondrial genomes support the latter view. Because some green plant lineages are not represented in these phylogenies, sparse taxon sampling has been suspected of yielding misleading topologies. Here, we describe the complete chloroplast DNA (cpDNA) sequence of the early-diverging charophycean alga Chlorokybus atmophyticus and present chloroplast genome-based phylogenies with an expanded taxon sampling.
Year: 2007 PMID: 17222354 PMCID: PMC1781420 DOI: 10.1186/1741-7007-5-2
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Figure 1. Gene map of Chlorokybus cpDNA. Genes (closed boxes) shown on the outside of the map are transcribed in a clockwise direction, whereas those on the inside of the map are transcribed counterclockwise. All ORFs larger than 75 codons as well as genes absent from Mesostigma cpDNA are represented in beige. Blocks of gene sequences colinear with Mesostigma cpDNA are shown as alternating series of green and red boxes. Genes present in Mesostigma cpDNA but located outside these conserved blocks are shown in grey. tRNA genes are indicated by the one-letter amino acid code (Me, elongator methionine; Mf, initiator methionine) followed by the anticodon in parentheses. The intron in trnL(uaa) is represented by an open box.
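Conservation of gene order in green algal and land plant cpDNAs relative to their Chlorokybus and Mesostigma homologues
| Conserved clusters (a) | Genes/gene pieces in clusters / shared (b) | % in clusters per cluster (c) | Inversions (d) | |
|---|---|---|---|---|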
| 15 | 126/134 | 6.3 | 12 | |
| 18 | 93/126 | 4.1 | 42 | |
| 21 | 90/124 | 3.5 | 46 | |
| 19 | 87/120 | 3.8 | 42 | |
| 21 | 81/123 | 3.1 | 55 | |
| 15 | 126/134 | 6.3 | 12 | |
| 21 | 101/124 | 3.9 | 36 | |
| 23 | 96/122 | 3.4 | 42 | |
| 22 | 95/118 | 3.7 | 39 | |
| 24 | 89/122 | 3.0 | 50 |
a A conserved cluster was defined as a suite of genes/gene pieces with the same order and polarities in the pair of compared genomes.
b The ratio of these two values represents the fraction of shared genes/gene pieces found in conserved clusters.
c This value was obtained by dividing the percentage of shared genes/gene pieces found in conserved clusters by the number of conserved clusters.
d The GRIMM web server [70] was used to infer the numbers of inversions required to transform the gene order displayed by Chlorokybus or Mesostigma cpDNA into those of the compared genomes. In this analysis, the order of the 109 genes/gene pieces shared by the six compared genomes was examined.
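Footnotes a to c can be illustrated with a minimal Python sketch. It assumes each genome is given as a list of signed integers (gene identity plus polarity), treats the genomes as linear, and drops unshared genes before comparing neighbours; these simplifications and the toy gene orders are illustrative assumptions, not the authors' procedure.

```python
def conserved_clusters(ref, other):
    """Find maximal runs of >= 2 shared genes with the same order and polarities.

    Genes are signed integers: the absolute value is the gene, the sign its strand.
    Assumption: genomes are treated as linear and unshared genes are ignored.
    """
    shared = {abs(g) for g in ref} & {abs(g) for g in other}
    ref = [g for g in ref if abs(g) in shared]
    other = [g for g in other if abs(g) in shared]

    # Record every neighbouring pair of `other`, read on either strand.
    adjacent = set()
    for a, b in zip(other, other[1:]):
        adjacent.add((a, b))
        adjacent.add((-b, -a))

    clusters = []
    run = ref[:1]
    for prev, cur in zip(ref, ref[1:]):
        if (prev, cur) in adjacent:
            run.append(cur)
        else:
            if len(run) > 1:
                clusters.append(run)
            run = [cur]
    if len(run) > 1:
        clusters.append(run)
    return clusters, len(shared)


def percent_per_cluster(n_in_clusters, n_shared, n_clusters):
    """Footnote c: percentage of shared genes in clusters divided by cluster count."""
    if n_shared == 0 or n_clusters == 0:
        return 0.0
    return 100.0 * n_in_clusters / n_shared / n_clusters


# Toy example: genes 4-5-6 are inverted as a block in the second genome.
clusters, n_shared = conserved_clusters([1, 2, 3, 4, 5, 6, 7],
                                        [1, 2, 3, -6, -5, -4, 8])
print(clusters, n_shared)

# Values from the first table row: 126 of 134 shared genes in 15 clusters.
print(round(percent_per_cluster(126, 134, 15), 1))
```

The last call uses the figures from the first row of the table; the inversion counts in footnote d come from GRIMM and are not reproduced by this sketch.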
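Abundance of repeats in Chlorokybus and other green algal cpDNAs
| Non-overlapping repeats (bp) (a) | % of genome | % of non-coding sequences (b) | |
|---|---|---|---|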
| 217 | 0.2 | 0.7 | |
| 873 | 0.6 | 1.4 | |
| 3,008 | 1.6 | 4.2 | |
| 972 | 0.7 | 3.2 | |
| 2,226 | 1.3 | 3.2 | |
| 501 | 0.3 | 0.8 | |
| 1,061 | 0.5 | 1.1 | |
| 11,743 | 7.8 | 14.9 | |
| 18,033 | 11.9 | 29.9 | |
| 10,621 | 5.4 | 12.9 | |
| 39,941 | 17.8 | 38.2 | |
| 4,817 | 3.0 | 8.3 | |
| 34,244 | 16.8 | 34.2 | |
a Non-overlapping repeat elements were mapped on each genome with RepeatMasker using the repeats ≥ 30 bp identified with REPuter [67] as input sequences.
b Unique ORFs were considered to be non-coding sequences, i.e. components of intergenic regions.
Figure 2. Phylogenetic positions of Chlorokybus and Mesostigma. (A) Best ML tree based on 45 proteins from 45 algal/plant taxa. (B) Best ML tree based on 45 genes from 45 algal/plant taxa. The eight non-green algae (bottom of the figure) were used as the outgroup. The nodes that received 100% bootstrap support in ML, MP, ML-distance and LogDet-distance analyses are denoted by asterisks. For the other nodes, only the bootstrap values ≥ 50% are indicated; the values obtained in ML, MP, ML-distance and LogDet-distance analyses are listed in this order from left to right. The detailed names of taxa are reported in the Methods. The 45 genes analyzed are as follows: atpA,B,E,F,H, petB,D,G, psaA,B,C,J, psbA,B,C,D,E,F,H,I,J,K,N,T,Z, rpl2,14,16,20,36, rpoB,C1,C2, rps2,3,4,7,8,11,12,14,18,19, ycf3,4.
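Green algal and land plant chloroplast genomes examined in this study
| Genome size (bp) | IR size (bp) | Genes (a) | Introns, group I/group II (b) | Accession no. | |
|---|---|---|---|---|---|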
| 184,933 | 10,919 | 127 | 2/16 | GenBank: | |
| 131,183 | 12,431 | 125 | 1/17 | GenBank: | |
| 152,254 | 7,640 | 138 | 1/0 | GenBank: | |
| 118,360 | 6,057 | 137 | 0 | GenBank: | |
| 157,089 | - | 121 | 1/7 | GenBank: | |
| 165,372 | - | 125 | 1/12 | GenBank: | |
| 153,821 | 25,697 | 110 | 1/20 | GenBank: | |
| 162,686 | 26,651 | 111 | 1/20 | GenBank: | |
| 161,162 | 15,744 | 120 | 2/20 | GenBank: | |
| 154,478 | 26,264 | 110 | 1/20 | GenBank: | |
| 156,687 | 17,031 | 111 | 1/20 | GenBank: | |
| 153,337 | 23,296 | 111 | 1/20 | GenBank: | |
| 155,293 | 24,753 | 111 | 1/20 | GenBank: | |
| 160,286 | 26,393 | 109 | 1/20 | GenBank: | |
| 154,373 | 15,314 | 119 | 1/20 | GenBank: | |
| 152,765 | 25,033 | 111 | 1/20 | GenBank: | |
| 150,519 | 25,156 | 109 | 1/20 | GenBank: | |
| 121,024 | 10,058 | 120 | 1/19 | GenBank: | |
| 155,939 | 25,341 | 110 | 1/20 | GenBank: | |
| 159,930 | 25,177 | 111 | 1/20 | GenBank: | |
| 163,935 | 27,807 | 109 | 1/18 | GenBank: | |
| 134,525 | 20,799 | 108 | 1/17 | GenBank: | |
| 156,318 | 26,071 | 111 | 1/20 | GenBank: | |
| 148,964 | 23,787 | 99 | 1/18 | GenBank: | |
| 122,890 | 9,589 | 116 | 1/20 | GenBank: | |
| 119,707 | - | 106 | 1/15 | GenBank: | |
| 141,182 | 22,795 | 108 | 1/17 | GenBank: | |
| 150,725 | 25,073 | 110 | 1/19 | GenBank: | |
| 134,545 | 20,703 | 108 | 1/17 | GenBank: | |
| 140,384 | 22,748 | 108 | 1/17 | GenBank: | |
| 200,799 | 46,137 | 128 | 0 | GenBank: | |
| 150,613 | - | 112 | 3/0 | GenBank: | |
| 151,933 | 18,510 | 104 | 5/0 | GenBank: | |
| 195,867 | 6,039 | 105 | 27/0 | GenBank: | |
| 203,827 | 22,211 | 94 | 5/2 | GenBank: | |
| 161,452 | 12,022 | 96 | 7/2 | GenBank: | |
| 223,902 | - | 97 | 16/5 | GenBank: |
a Genes present in the IR were counted only once. Pseudogenes, unique ORFs and intron ORFs were not taken into account.
b Numbers of group I and group II introns are given before and after the slash, respectively.
Figure 3. Influence of removing increasing proportions of fast-evolving sites in the amino acid data set on the robustness of the nodes supporting the Chlorokybus + Mesostigma clade and its placement. A series of data sets lacking 10% to 90% of the 4,179 phylogenetically informative sites present in the original data set analyzed in Figure 2A were subjected to phylogenetic analyses. (A) Fluctuations of bootstrap values for the node supporting the Chlorokybus + Mesostigma clade in ML (left side) and MP (right side) analyses. (B) Fluctuations of bootstrap values for the nodes supporting the T1, T2 and T3 placement of the Chlorokybus + Mesostigma clade in ML (left side) and MP (right side) analyses. The three topologies tested are colour-coded as follows: red, T1 topology, i.e. the placement of the clade before the divergence of the Streptophyta and Chlorophyta; blue, T2 topology, i.e. the placement of the clade as sister to the Streptophyta; and green, T3 topology, i.e. the placement of the clade as sister to the Chlorophyta. (C) Fluctuations in the number of characters unambiguously supporting each of the three topologies identified for the Chlorokybus + Mesostigma clade. The topologies are colour-coded as in (B).
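The site-removal step behind this figure can be sketched as follows. This is a minimal illustration that assumes a per-site rate estimate is already available for every alignment column; how sites were ranked is not stated in this caption, so the `site_rates` input, the function name and the toy alignment are hypothetical. The caption applies removal to the phylogenetically informative sites, whereas the sketch simply ranks whatever columns it is given.

```python
def drop_fastest_sites(alignment, site_rates, percent):
    """Return a copy of the alignment without the fastest `percent` of sites.

    alignment  : dict mapping taxon name -> aligned sequence (equal lengths)
    site_rates : one rate estimate per column (assumed to be precomputed)
    percent    : integer percentage of columns to remove, 0 to 100
    """
    n_sites = len(site_rates)
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("every sequence needs one rate per column")
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")

    n_drop = n_sites * percent // 100
    # Rank column indices from fastest to slowest; ties keep column order.
    ranked = sorted(range(n_sites), key=lambda i: site_rates[i], reverse=True)
    dropped = set(ranked[:n_drop])
    kept = [i for i in range(n_sites) if i not in dropped]
    return {taxon: "".join(seq[i] for i in kept)
            for taxon, seq in alignment.items()}


# Toy alignment with five columns; columns 1 and 3 have the highest rates.
toy = {"taxonA": "MKLVT", "taxonB": "MRLIT"}
rates = [0.1, 2.0, 0.3, 1.5, 0.2]
for pct in (0, 40, 80):
    print(pct, drop_fastest_sites(toy, rates, pct))
```

Calling the function for each percentage from 10 to 90 would yield the series of reduced data sets that are then passed to the ML and MP analyses.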
Figure 4. Influence of removing substantial proportions of fast-evolving sites in the amino acid data set on the resolution of phylogenetic relationships. (A) Best ML tree inferred from the data subset lacking 70% of the original information. (B) Best ML tree inferred from the data subset lacking 90% of the original information. The nodes that received 100% bootstrap support in ML and MP analyses are denoted by asterisks. For the other nodes, only the bootstrap values ≥ 50% are indicated; the values obtained in ML and MP analyses appear on the left and right, respectively.
Figure 5. Phylogenetic positions of Chlorokybus and Mesostigma. The Chlorokybus rRNA gene sequences were added to a data set previously analyzed by Turmel et al. [16], and phylogenetic analyses of the resulting data set (4,072 sites) were carried out essentially as described in the Methods, except that the TrN+Γ+I model was used. The best ML tree is shown. The nodes that received 100% bootstrap support in ML, MP, ML-distance and LogDet-distance analyses are denoted by asterisks. For the other nodes, only the bootstrap values ≥ 50% are indicated; the values obtained in ML, MP, ML-distance and LogDet-distance analyses are listed in this order from left to right.
Figure 6. Phylogenetic positions of Chlorokybus and Mesostigma. (A) MP analysis based on gene content. The strict consensus of 12 equally parsimonious trees is shown. (B) MP analysis based on gene order. The strict consensus of 56 equally parsimonious trees is shown. The nodes that received 100% bootstrap support are denoted by asterisks. For the other nodes, only the bootstrap values ≥ 50% are indicated.
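A gene-content data set for MP analysis is commonly coded as a presence/absence matrix. The sketch below shows one such coding under stated assumptions: the taxon labels and gene sets are invented for illustration, and the exact character coding used for this figure is not given in the caption.

```python
def gene_content_matrix(gene_sets):
    """Code gene content as binary characters (1 = present, 0 = absent).

    gene_sets : dict mapping taxon name -> set of gene names in its cpDNA
    Returns the sorted gene list and a dict of taxon -> '0'/'1' string.
    """
    genes = sorted(set().union(*gene_sets.values()))
    matrix = {
        taxon: "".join("1" if gene in present else "0" for gene in genes)
        for taxon, present in gene_sets.items()
    }
    return genes, matrix


def parsimony_informative(genes, matrix):
    """Keep genes whose presence and absence each occur in at least two taxa."""
    informative = []
    for col, gene in enumerate(genes):
        states = [row[col] for row in matrix.values()]
        if states.count("1") >= 2 and states.count("0") >= 2:
            informative.append(gene)
    return informative


# Hypothetical gene sets for four placeholder taxa.
toy_sets = {
    "taxonA": {"atpA", "psbA", "rpl2"},
    "taxonB": {"atpA", "psbA"},
    "taxonC": {"atpA", "rpl2", "ycf3"},
    "taxonD": {"atpA", "ycf3"},
}
genes, matrix = gene_content_matrix(toy_sets)
print(genes)
print(matrix)
print(parsimony_informative(genes, matrix))
```

Genes present in every taxon, such as `atpA` in the toy data, carry no signal under parsimony and are filtered out by the second function.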
Figure 7. MP analysis based on combined gene content and gene order data derived from the chloroplast genomes. The figure shows the strict consensus of 28 equally most parsimonious trees. The nodes that received 100% bootstrap support are denoted by asterisks. For the other nodes, only the bootstrap values ≥ 50% are indicated.