| Literature DB >> 24260278 |
Fernando Martínez-Alberola, Eva M Del Campo, David Lázaro-Gimeno, Sergio Mezquita-Claramonte, Arantxa Molins, Isabel Mateu-Andrés, Joan Pedrola-Monfort, Leonardo M Casano, Eva Barreno.
Abstract
Completely sequenced plastomes provide a valuable source of information about the duplication, loss, and transfer events of chloroplast genes, as well as phylogenetic data for resolving relationships among major groups of plants. They can also be useful for exploiting chloroplast genetic engineering technology. Ericales account for approximately six per cent of eudicot diversity, with 11,545 species, of which only three complete plastome sequences are currently available. With the aim of increasing the number of complete ericalean plastome sequences, and to open new perspectives in understanding Mediterranean plant adaptations, a genomic study based on the complete chloroplast genome sequence of Arbutus unedo and an updated phylogenomic analysis of Asteridae were carried out. The chloroplast genome of A. unedo shows extensive rearrangements but a medium size (150,897 nt) in comparison to most angiosperms. A number of remarkable features characterize the plastome of A. unedo: a five-fold reduction of the SSC region in relation to most angiosperms; complete loss or pseudogenization of a number of essential genes; duplication of the ndhH-D operon and its location within the two IRs; and the presence of large tandem repeats located near highly rearranged regions and pseudogenes. All these features outline the primary evolutionary split between Ericaceae and other ericalean families. The newly sequenced plastome of A. unedo, together with the available asterid sequences, allowed the resolution of some uncertainties in previous phylogenies of Asteridae.
Year: 2013 PMID: 24260278 PMCID: PMC3832540 DOI: 10.1371/journal.pone.0079685
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Gene map of the Arbutus unedo complete chloroplast genome represented as a circular molecule.
Genes shown inside the circle are transcribed clockwise and genes outside are transcribed counterclockwise. Genes for tRNAs are represented by the one-letter amino acid code followed by the anticodon. Asterisks indicate genes with introns. Pseudogenes are preceded by the Ψ symbol.
Genes found in the Arbutus unedo chloroplast genome.
| Function | Different products | Total genes | Total introns | Gene name |
| Photosystem I | 7 | 8 | 0 | |
| Photosystem II | 15 | 15 | 0 | |
| Cytochrome b6/f complex | 6 | 6 | 2 | |
| ATP synthase | 6 | 6 | 0 | |
| Calvin cycle | 1 | 1 | 0 | |
| C-type cytochrome synthesis | 1 | 2 | 0 | |
| NADH dehydrogenase | 11 | 18 | 4 | |
| RNA polymerase | 4 | 4 | 0 | |
| Maturase K | 1 | 1 | 0 | |
| Translation initiation factor | 1 | 1 | 0 | |
| Large subunit ribosomal proteins | 9 | 9 | 1 | rpl2 |
| Small subunit ribosomal proteins | 12 | 15 | 3 | rps2, 3, 4, 7 |
| Ribosomal RNAs (4) | 4 | 8 | 0 | rrn23 |
| tRNAs | 30 | 39 | 8 | trnA-UGC(b,c), C-GCA, D-GUC, E-UUC, F-GAA, G-GCC, G-UCC |
| Envelope membrane protein | 1 | 1 | 0 | |
| Pseudogenes | 5 | 8 | 0 | |
a Gene containing two introns.
b Gene containing a single intron.
c Two gene copies in the IRs.
d Gene whose transcripts are trans-spliced.
Genes having cis-spliced introns in the Arbutus unedo cpDNA and the lengths of exons and introns.
| Gene | Location | Exon I (nt) | Exon II (nt) | Exon III (nt) | Intron I (nt) | Intron I class | Intron II (nt) | Intron II class |
| | LSC | 145 | 410 | – | 714 | IIA | – | – |
| | IR | 553 | 539 | – | 1073 | IIB | – | – |
| | IR | 777 | 756 | – | 684 | IIB | – | – |
| | LSC | 6 | 642 | – | 736 | IIB | – | – |
| | LSC | 8 | 481 | – | 792 | IIB | – | – |
| | LSC | 391 | 434 | – | 672 | IIA | – | – |
| | LSC | 9 | 408 | – | 10367 | IIB | – | – |
| | LSC | 453 | 1626 | – | 738 | IIB | – | – |
| | LSC | 40 | 188 | – | 857 | IIB | – | – |
| | IR | 37 | 35 | – | 807 | IIA | – | – |
| | LSC | 23 | 48 | – | 692 | IIB | – | – |
| | IR | 37 | 35 | – | 950 | IIA | – | – |
| | LSC | 37 | 35 | – | 2514 | IIA | – | – |
| | LSC | 35 | 50 | – | 521 | I | – | – |
| | LSC | 39 | 35 | – | 620 | IIA | – | – |
| | LSC | 124 | 230 | 153 | 680 | IIB | 722 | IIB |
Codon-anticodon recognition pattern and codon usage for the chloroplast genome of Arbutus unedo.
| Amino acid | tRNA | Codon | No. | Amino acid | tRNA | Codon | No. | Amino acid | tRNA | Codon | No. |
| Ala | trnA-UGC | GCU | 497 | Lys | trnK-UUU | AAA | 678 | Ser | trnS-GCU | AGU | 254 |
| | trnA-UGC | GCA | 305 | | trnK-UUU | AAG | 191 | | trnS-GCU | AGC | 69 |
| | trnA-UGC | GCC | 172 | Leu | trnL-CAA | UUG | 380 | | trnS-GGA | UCU | 402 |
| | trnA-UGC | GCG | 130 | | trnL-UAA | UUA | 685 | | trnS-GGA | UCC | 197 |
| Cys | trnC-GCA | UGU | 160 | | trnL-UAG | CUU | 405 | | trnS-UGA | UCA | 232 |
| | trnC-GCA | UGC | 43 | | trnL-UAG | CUA | 242 | | trnS-UGA | UCG | 96 |
| Asp | trnD-GUC | GAU | 521 | | trnL-UAG | CUC | 116 | Thr | trnT-GGU | ACU | 407 |
| | trnD-GUC | GAC | 134 | | trnL-UAG | CUG | 97 | | trnT-GGU | ACC | 176 |
| Glu | trnE-UUC | GAA | 670 | Met | trnM-CAU | AUG | 441 | | trnT-UGU | ACA | 286 |
| | trnE-UUC | GAG | 220 | Asn | trnN-GUU | AAU | 591 | | trnT-UGU | ACG | 91 |
| Phe | trnF-GAA | UUU | 676 | | trnN-GUU | AAC | 168 | Val | trnV-GAC | GUU | 385 |
| | trnF-GAA | UUC | 311 | Pro | trnP-UGG | CCU-P | 295 | | trnV-GAC | GUC | 133 |
| Gly | trnG-GCC | GGU | 445 | | trnP-UGG | CCA-P | 223 | | trnV-UAC | GUA | 392 |
| | trnG-GCC | GGC | 153 | | trnP-UGG | CCC-P | 145 | | trnV-UAC | GUG | 137 |
| | trnG-UCC | GGA | 549 | | trnP-UGG | CCG-P | 95 | Trp | trnW-CCA | UGG | 317 |
| | trnG-UCC | GGG | 220 | Gln | trnQ-UUG | CAA | 497 | Tyr | trnY-GUA | UAU | 542 |
| His | trnH-GUG | CAU | 341 | | trnQ-UUG | CAG | 136 | | trnY-GUA | UAC | 117 |
| | trnH-GUG | CAC | 87 | Arg | trnR-ACG | CGA | 282 | Stop | – | UAA | 38 |
| Ile | trnI-CAU | AUA | 490 | | trnR-ACG | CGU | 280 | | – | UAG | 17 |
| | trnI-GAU | AUU | 759 | | trnR-ACG | CGG | 65 | | – | UGA | 18 |
| | trnI-GAU | AUC | 292 | | trnR-ACG | CGC | 62 | | | | |
| | | | | | trnR-UCU | AGA | 306 | | | | |
| | | | | | trnR-UCU | AGG | 86 | | | | |
Numerals indicate the frequency of usage of each codon in 17,947 codons in 73 potential protein-coding genes.
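The raw counts in the table can be hard to compare across amino acids with different numbers of synonymous codons. One common way to normalize them is relative synonymous codon usage (RSCU), where a value above 1 means a codon is used more often than expected under equal usage. The following is only a minimal sketch: RSCU is not a metric reported in this record, the function name `rscu` and the dictionary `CODON_COUNTS` are hypothetical, and only three amino acids are copied from the table for illustration.

```python
from typing import Dict

# Codon counts copied from the codon usage table above.
# Only three amino acids are included; this subset is an illustrative choice.
CODON_COUNTS: Dict[str, Dict[str, int]] = {
    "Ala": {"GCU": 497, "GCA": 305, "GCC": 172, "GCG": 130},
    "Cys": {"UGU": 160, "UGC": 43},
    "Lys": {"AAA": 678, "AAG": 191},
}


def rscu(counts: Dict[str, int]) -> Dict[str, float]:
    """Relative synonymous codon usage for one amino acid.

    RSCU = observed count * number of synonymous codons / total count.
    (Hypothetical helper; not taken from the original study.)
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codons observed for this amino acid")
    n_synonymous = len(counts)
    return {codon: n * n_synonymous / total for codon, n in counts.items()}


if __name__ == "__main__":
    for amino_acid, counts in CODON_COUNTS.items():
        for codon, value in rscu(counts).items():
            print(f"{amino_acid} {codon} RSCU={value:.2f}")
```

For every amino acid in this subset, the codon ending in U or A has the higher count, so its RSCU comes out above 1.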
Figure 2. Whole-genome alignment of the Arbutus unedo chloroplast genome with other asterid chloroplast genomes, obtained with MultiPipMaker [32] using Nicotiana tabacum as the reference.
Sequence identity is shown in red (75–100%), green (50–75%), and white (<50%). Positions of some genes in N. tabacum are indicated as a guide (genes encoding proteins and rRNAs are indicated as yellow and red arrows, respectively). The taxonomic classification is indicated on the left (AP: Apiales, AS: Asterales, GE: Gentianales, LA: Lamiales, SO: Solanales, ER: Ericales).
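The colour scheme in this caption amounts to a three-way binning of percent identity. A minimal sketch of that binning follows; the function name `identity_color` is hypothetical, and because the caption gives overlapping boundaries (75% appears in both the red and green ranges), the code assumes that 75% counts as red and 50% as green.

```python
def identity_color(percent_identity: float) -> str:
    """Map a percent identity to the colour class named in the caption.

    Assumption: boundary values go to the higher class
    (75 -> red, 50 -> green); the caption does not specify this.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity >= 75.0:
        return "red"
    if percent_identity >= 50.0:
        return "green"
    return "white"


if __name__ == "__main__":
    for value in (92.5, 75.0, 60.0, 12.0):
        print(value, identity_color(value))
```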
Figure 3. Comparison of the lengths of LSC, SSC and IR regions among Asteridae.
Accession numbers of the corresponding genomes are indicated in Table S2.
Figure 4. Gene map and alignment of the LSC region of three ericalean species in relation to Nicotiana tabacum.
(A) Gene map of the LSC region in the chloroplast genome of Nicotiana tabacum. (B) Gene alignment of the LSC region of Ardisia polysticta, Camellia sinensis, Arbutus unedo, and Vaccinium macrocarpon, belonging to Ericales, and Nicotiana tabacum, belonging to Solanales. The alignment was produced with the MAUVE multiple alignment [33] implemented in Geneious [23]. Colored outlined blocks surround regions of the genome sequence that aligned with part of another genome. The colored bars inside the blocks indicate the level of sequence similarity. Lines link blocks with homology between two genomes. Accession numbers of the corresponding genomes are indicated in Table S2.
Figure 5. Tandem repeats in the Arbutus unedo plastome and other asterids.
(A) Genome sizes, number of repeats found, and maximum consensus size for some asterids, arranged by genome size. (B) Frequency of tandem repeats by length.
Figure 6. Phylogram based on sequence analysis of 83 chloroplast genes from 57 plant species (Table S2).
Asterisks indicate nodes with values of 100 and 1.0 for bootstrap values and posterior probabilities, respectively. The scale bar indicates substitutions/site. The current taxonomic classifications are indicated on the right (i.s., incertae sedis).