Jean-Charles de Cambiaire, Christian Otis, Claude Lemieux, Monique Turmel.
Abstract
BACKGROUND: The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the fewest ancestral features. Its two single-copy regions, which are separated from one another by the large inverted repeat (IR), are similar rather than unequal in size and differ radically in both gene content and gene organization relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insight into the changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage.
Year: 2006 PMID: 16638149 PMCID: PMC1513399 DOI: 10.1186/1471-2148-6-37
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Gene map of Scenedesmus cpDNA. The two copies of the rRNA operon-containing IR (IRA and IRB) are represented by thick lines; the transcription direction of the rRNA genes is indicated by arrows. Genes (filled boxes) on the outside of the map are transcribed in a clockwise direction; those on the inside of the map are transcribed counterclockwise. The colour-code denotes the genomic regions containing the homologous genes in Chlamydomonas cpDNA: cyan, SC1; magenta, SC2; yellow, IR. Genes and ORFs absent from Chlamydomonas cpDNA are shown in grey. Labelled brackets denote the gene clusters shared specifically by Scenedesmus and Chlamydomonas cpDNAs (see Table 4 for the gene content of these clusters). tRNA genes are indicated by the one-letter amino acid code followed by the anticodon in parentheses (Me, elongator methionine; Mf, initiator methionine). Identical copies of the trnE(uuc) genes are denoted by asterisks. Introns are represented by open boxes and intron ORFs are denoted by narrow, filled boxes. The intron sequences bordering the psaA exons (psaA exon 1 and psaA exon 2) are spliced in trans at the RNA level. Note that only one of the two isomeric forms of the genome is shown here; these isomers differ with respect to the relative orientation of the single-copy regions.
General features of Scenedesmus and other UTC algal cpDNAs
| Feature | Chlorella | Oltmannsiellopsis | Pseudendoclonium | Scenedesmus | Chlamydomonas |
| Size (bp) | |||||
| Total | 150,613 | 151,933 | 195,867 | 161,452 | 203,827 |
| IR | – a | 18,510 | 6,039 | 12,022 | 22,211 |
| LSC | – a | 33,610 | 140,914 | 72,440 b | 81,307 b |
| SSC | – a | 81,303 | 42,875 | 64,968 c | 78,088 c |
| A+T (%) | 68.4 | 59.5 | 68.5 | 73.1 | 65.5 |
| Coding sequences (%) d | 60.9 | 59.2 | 62.3 | 67.2 | 50.1 |
| Genes (no.) e | 112 | 105 | 105 | 96 | 94 |
| Introns (no.) | |||||
| Group I | 3 | 5 | 27 | 7 | 5 |
| Group II | 0 | 0 | 0 | 2 | 2 |
a Because Chlorella cpDNA lacks an IR, only the total size of this genome is given.
b In this study, this region is designated SC1 rather than LSC because it displays major differences in gene content relative to the LSC region in Mesostigma and Nephroselmis cpDNAs [13].
c This region is designated SC2 rather than SSC because it displays major differences in gene content relative to the SSC region in Mesostigma and Nephroselmis cpDNAs [13].
d Conserved genes, unique ORFs and introns were considered as coding sequences.
e Genes present in the IR were counted only once. Unique ORFs and intron ORFs were not taken into account.
Expanded genes in Scenedesmus and other UTC algal cpDNAs
| 801 | 1.6 | 1059 | 2.2 | 909 | 1.8 | 1278 | 2.6 | 1503 | 3.1 | |
| 606 | 0.9 | 588 | 0.9 | 597 | 0.9 | 1614 | 2.3 | 1575 | 2.3 | |
| 5163 | 1.9 | 6879 | 2.6 | 7791 | 2.9 | 10998 | 4.1 | 8916 | 3.3 | |
| 240 | 1.2 | 222 | 1.1 | 330 | 1.6 | 306 | 1.5 | – c | – c | |
| 837 | 0.9 | 1527 | 1.6 | 1734 | 1.8 | 1437 | 1.5 | 2213 d | 2.3 | |
| 3906 | 1.2 | 4251 | 1.3 | 6537 | 2.0 | 4896 e | 1.5 | 4967 e | 1.5 | |
| 2511 | 1.3 | 3066 | 1.5 | 4737 | 2.4 | 4590 | 2.3 | 5739 e | 2.9 | |
| 4689 | 1.3 | 5580 | 1.5 | 10389 | 2.8 | 7659 | 2.1 | 9363 | 2.5 | |
| 804 | 1.2 | 717 | 1.0 | 777 | 1.1 | 2747 e | 4.0 | 2731 e | 4.0 | |
| 696 | 1.1 | 708 | 1.1 | 690 | 1.1 | 2103 | 3.3 | 2139 | 3.3 | |
| 312 | 1.4 | 237 | 1.1 | 258 | 1.2 | 567 | 2.6 | 414 | 1.9 | |
| 2460 | 2.0 | 2427 | 2.0 | 2505 | 2.0 | 7008 | 5.7 | 5988 | 4.9 | |
a Expansion factor relative to the corresponding Mesostigma gene. Each value was obtained by dividing the size of the UTC algal gene by the size of the corresponding Mesostigma gene.
b Genes also expanded in streptophyte cpDNAs. Note that the ftsH gene is designated as ycf2 in Chlorella, Chlamydomonas and land plant cpDNAs.
c infA is missing from Chlamydomonas cpDNA.
d Size derived from our unpublished sequence data. A portion of the rpoA region is not annotated in GenBank:NC_005353 as a result of a sequencing error introducing a frameshift mutation.
e Gene occurring as two independent ORFs. The indicated size includes the spacer separating the ORFs.
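The expansion factor defined in footnote a is a plain size ratio, and a minimal Python sketch of that calculation follows. The function name `expansion_factor` and the reference size of 202 bp are assumptions made for illustration; the table does not list the Mesostigma gene sizes, so the reference value here is hypothetical and not taken from the source. The four gene sizes are copied from one row of the table.

```python
def expansion_factor(gene_size_bp: int, reference_size_bp: int) -> float:
    """Ratio of a gene's size to the size of its reference homologue, to one decimal."""
    if reference_size_bp <= 0:
        raise ValueError("reference size must be positive")
    return round(gene_size_bp / reference_size_bp, 1)


# Hypothetical reference size (NOT from the table; chosen only for illustration).
reference_bp = 202

# Gene sizes (bp) copied from one row of the table; column labels are placeholders.
sizes_bp = {"genome_1": 240, "genome_2": 222, "genome_3": 330, "genome_4": 306}

factors = {name: expansion_factor(size, reference_bp) for name, size in sizes_bp.items()}
print(factors)
```

A factor above 1 indicates a gene longer than its reference homologue; where a gene occurs as two ORFs (footnote e), the size passed in would include the spacer.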
Figure 2. Conservation of ancestral gene clusters in UTC algal cpDNAs. Black boxes represent the 89 genes found in the 24 clusters shared by Mesostigma and Nephroselmis cpDNAs as well as the genes in UTC algal cpDNAs that have retained the same order as those in these ancestral clusters. For each genome, the set of genes making up each of the identified clusters (either an intact or fragmented ancestral cluster) is shown as black boxes connected by a horizontal line. Black boxes that are contiguous but not linked together indicate that the corresponding genes are not adjacent on the genome. Gray boxes denote genes in UTC algal cpDNAs that have been relocated elsewhere on the chloroplast genome; open boxes denote genes that have disappeared from the chloroplast genome. Although the rpl22 gene is missing from Nephroselmis cpDNA, it is shown as belonging to the large ribosomal protein cluster equivalent to the contiguous S10, spc and α operons of Escherichia coli because it is present in this cluster in the cpDNAs of Mesostigma, streptophytes and algae from other lineages. Note also that the psbB cluster of Oltmannsiellopsis and Pseudendoclonium cpDNAs differs from the ancestral cluster found in other genomes by the presence of psbN on the alternate DNA strand.
Derived gene clusters shared by Scenedesmus and Chlamydomonas cpDNAs
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | |
| 8 | |
| 9 | |
| 10 | |
| 11 |
a Clusters are labelled as in Fig. 1.
b The slash indicates a change in gene polarity.
Introns in Scenedesmus cpDNA and homologous introns at identical gene locations in other green algal cpDNAs
| So. | IC3 | – | – | [GenBank: | |
| [GenBank: | |||||
| So. | IA1 | – | – | [GenBank: | |
| So. | IA1 | L5 | H-N-H | – | - |
| So. | IA1 | – | – | – | - |
| So. | IA1 | L5 | H-N-H | [GenBank: | |
| So. | IB4 | L6 | LAGLIDADG | [GenBank: | |
| So. | IA3 | L6 | LAGLIDADG | [GenBank: | |
| [GenBank: | |||||
| [GenBank: | |||||
| [GenBank: | |||||
| So. | IIA | DIV | RT-X-Zn | – | – |
| So. | IIB | – | – | [GenBank: | |
a Group I introns were classified according to Michel and Westhof [66], whereas classification of group II introns was according to Michel et al. [65].
b L followed by a number refers to the loop extending the base-paired region identified by the number; D refers to a domain of group II intron secondary structure.
c For group I intron ORFs, the conserved motif in the predicted homing endonuclease is given; for the petD intron ORF, RT, X and Zn refer to the reverse transcriptase, maturase and nuclease domains of reverse transcriptases, respectively.
d For the two rrl introns, only the homologues identified in completely sequenced genomes are given. The complete lists of introns homologous to So. rrl.1 and So. rrl.2 can be found in Turmel et al. [35] and Pombert et al. [13], respectively. The letter in parentheses indicates the chlorophyte lineage to which the indicated green alga belongs: U, Ulvophyceae; T, Trebouxiophyceae; C, Chlorophyceae.
e The intron number is given when more than one intron is present.
Figure 3. Secondary structure model of the trans-spliced psaA group II intron. Intron modelling was according to the nomenclature proposed by Michel et al. [65]. Exon sequences are shown in lowercase letters. Roman numerals specify the six major structural domains of group II introns. Tertiary interactions are represented by blocked arrows. EBS and IBS refer to exon-binding and intron-binding sites, respectively. Numbers inside the loops denote the sizes of these regions. The 5' and 3' strand polarities of the psaAa and psaAb transcripts are indicated by arrows.
Abundance of SDRs in Scenedesmus and other UTC algal cpDNAs
| 84 | 269 | 44 | 11,743 | 7.8 | 20.8 | |
| 172 | 1,205 | 161 | 18,033 | 11.9 | 30.1 | |
| 171 | 1,047 | 203 | 10,073 | 5.1 | 13.6 | |
| 112 | 86 | 21 | 4,817 | 3.0 | 8.7 | |
| 221 | 3,247 | 551 | 32,244 | 15.8 | 31.9 | |
a Repeats with identical sequences on the same DNA strand or different DNA strands were identified using REPuter.
b The repeats identified with REPuter were annotated and masked on the genome sequence using RepeatMasker.
SDR repeat units in Scenedesmus cpDNA
| A | 15 | TTTACGCTTTTTTTC | 17 | 10 |
| B | 15 | TTCTTCTTCATTTTT | 22 | 19 |
| C | 16 | TGCTTTGCTGCTTTTT | 16 | 22 |
a Copies of each SDR unit were identified using FINDPATTERNS (one mismatch was allowed).
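The kind of search described in footnote a, locating copies of a repeat unit while tolerating one mismatch, can be sketched as a naive sliding-window scan. This is not FINDPATTERNS itself; the function names, the decision to scan both strands, and the toy sequence are all assumptions made for illustration. Only the repeat unit A sequence is taken from the table.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_copies(genome: str, unit: str, max_mismatches: int = 1):
    """Return (start, strand) for every window within max_mismatches of the unit."""
    hits = []
    # Scan for the unit itself (+) and for its reverse complement (-).
    for strand, pattern in (("+", unit), ("-", reverse_complement(unit))):
        k = len(pattern)
        for start in range(len(genome) - k + 1):
            window = genome[start:start + k]
            mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
            if mismatches <= max_mismatches:
                hits.append((start, strand))
    return hits


unit_a = "TTTACGCTTTTTTTC"      # repeat unit A from the table
unit_a_mut = "TTTACGCATTTTTTC"  # same unit with one substitution

# Toy sequence (hypothetical): an exact copy, a one-mismatch copy,
# and a reverse-complement copy, separated by short spacers.
toy = "GGGG" + unit_a + "CCCC" + unit_a_mut + "GGGG" + reverse_complement(unit_a) + "CCCC"

hits = find_copies(toy, unit_a)
print(hits)
```

The scan costs time proportional to genome length times unit length for each strand, which is acceptable for a single chloroplast genome and a 15 to 16 bp unit.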
Figure 4. Positions of SDR elements in Scenedesmus cpDNA. Lines connect cpDNA loci displaying repeats ≥30 bp with identical sequences either on the same strand or different strands. For this analysis, carried out with REPuter, one copy of the IR sequence (IRA) was deleted; the location of this deleted copy is indicated by the long, vertical arrow. The loci containing repeat units A, B and C are represented by symbols of different shapes outside the gene map: triangles, repeat unit A; squares, repeat unit B; circles, repeat unit C. Filled symbols denote the repeats occupying the + strand; open symbols denote the repeats found on the alternate strand. A symbol accompanied by an asterisk indicates the presence of two or more copies. Small arrows point to gene coding regions containing copies of repeat unit B.
Minimal numbers of inversions accounting for gene rearrangements between green algal cpDNAs
| 43 | 46 | 54 | 54 | 75 | 75 | |
| 47 | 55 | 55 | 73 | 74 | ||
| 50 | 52 | 74 | 75 | |||
| 55 | 78 | 75 | ||||
| 76 | 77 | |||||
| 58 | ||||||
a Numbers of gene permutations by inversions were computed using GRIMM.
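To illustrate what a minimal number of inversions means for two gene orders, the sketch below finds it by breadth-first search over signed gene orders. It is not GRIMM's algorithm: GRIMM relies on efficient methods, whereas this brute-force search is exponential and only usable on toy inputs. Treating the gene order as linear, encoding strand as the sign of an integer, and the function name `inversion_distance` are all assumptions made for illustration.

```python
from collections import deque


def inversion_distance(source, target) -> int:
    """Minimum number of signed inversions turning source into target (toy sizes only)."""
    source, target = tuple(source), tuple(target)
    if sorted(map(abs, source)) != sorted(map(abs, target)):
        raise ValueError("both gene orders must contain the same genes")
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        order, dist = queue.popleft()
        if order == target:
            return dist
        n = len(order)
        for i in range(n):
            for j in range(i, n):
                # Invert segment i..j: reverse its gene order and flip each strand.
                segment = tuple(-g for g in reversed(order[i:j + 1]))
                candidate = order[:i] + segment + order[j + 1:]
                if candidate not in seen:
                    seen.add(candidate)
                    queue.append((candidate, dist + 1))
    raise RuntimeError("unreachable: every signed order can be reached by inversions")


# Genes are integers; a negative sign marks a gene on the opposite strand.
one_step = inversion_distance((1, -3, -2, 4), (1, 2, 3, 4))
swapped = inversion_distance((2, 1), (1, 2))
print(one_step, swapped)
```

Because breadth-first search explores gene orders in order of increasing number of inversions, the first time the target is reached gives the minimum; the state space grows as 2^n · n!, so real genomes require a dedicated tool.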