Rafaela Jorge Trad, Fernanda Nunes Cabral, Volker Bittrich, Saura Rodrigues da Silva, Maria do Carmo Estanislau do Amaral.
Abstract
A complete chloroplast genome is not yet available for numerous species of plants. Among the groups that lack plastome information is the clusioid clade (Malpighiales), which includes five families: Bonnetiaceae, Calophyllaceae, Clusiaceae, Hypericaceae, and Podostemaceae. With around 2200 species, it has few published plastomes and most of them are from Podostemaceae. Here we assembled and compared six plastomes from members of the clusioids: five from Calophyllaceae (newly sequenced) and one from Clusiaceae. Putative regions for evolutionary studies were identified and the newly assembled chloroplasts were analyzed with other available chloroplasts for the group, focusing on Calophyllaceae. Our results mostly agree with recent studies which found a general conserved structure, except for the two Podostemaceae species that have a large inversion (trnK-UUU-rbcL) and lack one intron from ycf3. Within Calophyllaceae we observed a longer LSC and reduced IRs in Mahurea exstipulata, resulting in some genic rearrangement, and a short inversion (psbJ-psbE) in Kielmeyera coriacea. Phylogenetic analyses recovered the clusioids and the five families as monophyletic and revealed that conflicts in relationships reported in the literature for the group agree with nodes concentrating uninformative or conflicting gene trees. Our study brings new insights about clusioid plastome architecture and its evolution.
Year: 2021 PMID: 34671062 PMCID: PMC8528878 DOI: 10.1038/s41598-021-99178-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Circular map of the five complete clusioid plastomes. (A) Caraipa heterocarpa, (B) Kielmeyera appariciana, (C) K. coriacea, (D) Mahurea exstipulata, (E) Clusia panapanari. The genes represented outside the circle are transcribed counterclockwise and those inside the outer circle are transcribed clockwise. Genes are colored according to their functional groups following the legend. The inner gray graphs indicate the GC content across the plastome.
GenBank accession numbers and comparison of chloroplast genome size and GC content across three different regions (LSC, SSC, and IR) for 12 clusioid species. LSC large single copy, SSC small single copy, IR inverted repeat.
| Species | Family | GenBank | Coverage (mean) | Total reads | Mapped reads | LSC (bp) | LSC GC % | SSC (bp) | SSC GC % | IR (bp) | IR GC % | Full plastome (bp) | Full plastome GC % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Bonnetiaceae | MK995182 | – | – | – | 84,024 | 33.9 | 18,140 | 29.9 | 27,309 | 41.9 | 156,782 | 36.2 |
| | Calophyllaceae | MW853786 | 52.0 | 8,922,582 | 61,669 | 88,119 | 34.2 | 17,464 | 30.6 | 27,334 | 42.2 | 160,253 | 36.5 |
| | Calophyllaceae | MW853787 | 44.1 | 8,486,444 | 49,148 | 86,990 | 34.2 | 18,260 | 30.7 | 26,353 | 42.7 | 157,956 | 36.6 |
| | Calophyllaceae | MW853788 | 447.3 | 8,424,602 | 504,031 | 87,648 | 34.2 | 18,300 | 30.5 | 25,519 | 42.7 | 158,986 | 36.6 |
| | Calophyllaceae | MW853789 | 474.7 | 11,077,404 | 540,562 | 88,263 | 34.3 | 18,219 | 30.6 | 26,490 | 42.7 | 159,470 | 36.6 |
| | Calophyllaceae | MW853790 | 298.2 | 7,486,882 | 333,152 | 98,042 | 34.6 | 18,395 | 30.7 | 16,553 | 45.4 | 149,535 | 36.5 |
| | Calophyllaceae | MK995181 | – | – | – | 88,784 | 34.0 | 17,482 | 30.6 | 27,614 | 42.1 | 161,494 | 36.4 |
| | Clusiaceae | SRR7518735 | 78.6 | 2,452,164 | 89,207 | 85,899 | 33.6 | 19,102 | 29.5 | 26,702 | 42.3 | 158,405 | 36.0 |
| | Clusiaceae | MK995180 | – | – | – | 85,640 | 34.0 | 18,892 | 29.9 | 26,272 | 42.1 | 157,076 | 36.2 |
| | Clusiaceae | NC_047250 | – | – | – | 84,998 | 33.5 | 17,088 | 30.3 | 27,058 | 42.1 | 156,202 | 36.2 |
| | Podostemaceae | MK995178 | – | – | – | 79,506 | 32.2 | 12,262 | 28.0 | 19,916 | 43.0 | 131,600 | 35.1 |
| | Podostemaceae | MK995179 | – | – | – | 79,002 | 33.7 | 12,717 | 30.7 | 19,623 | 43.5 | 130,967 | 36.3 |
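Because a plastome with the usual quadripartite structure consists of one LSC, one SSC, and two copies of the IR, the region lengths in the table can be cross-checked against the reported full-plastome length. The following is a minimal sketch of that arithmetic for a few rows; the names `REGION_LENGTHS`, `expected_total`, and `length_discrepancy` are illustrative and do not come from the study.

```python
# Values copied from the table above:
# accession -> (LSC bp, SSC bp, IR bp, reported full-plastome bp)
REGION_LENGTHS = {
    "MK995182": (84_024, 18_140, 27_309, 156_782),
    "MW853787": (86_990, 18_260, 26_353, 157_956),
    "MW853788": (87_648, 18_300, 25_519, 158_986),
    "MW853790": (98_042, 18_395, 16_553, 149_535),
    "SRR7518735": (85_899, 19_102, 26_702, 158_405),
}


def expected_total(lsc, ssc, ir):
    """Length implied by a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def length_discrepancy(lsc, ssc, ir, reported):
    """Reported full length minus the length implied by the three regions."""
    return reported - expected_total(lsc, ssc, ir)


if __name__ == "__main__":
    for accession, row in REGION_LENGTHS.items():
        print(accession, length_discrepancy(*row))
```

A nonzero difference only flags a row worth re-checking against the original publication. It may reflect how junction coordinates were counted or a transcription error in the table as reproduced here, and it does not indicate which of the four values is off.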
List of genes annotated in the six assembled plastomes: Calophyllum brasiliense, Caraipa heterocarpa, Clusia panapanari, Kielmeyera appariciana, K. coriacea, and Mahurea exstipulata.
| Functional annotation | Name of the gene |
|---|---|
| Photosystem I | psaA, psaB, psaC, psaI, psaJ |
| Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| Cytochrome b/f complex | petA, petBi, petDi, petG, petL, petN |
| ATP synthase | atpA, atpB, atpE, atpFi, atpH, atpI |
| NADH dehydrogenase | ndhAi, ndhBi (× 2), ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK, ndhL |
| RubisCO large subunit | rbcL |
| RNA polymerase | rpoA, rpoB, rpoC1i, rpoC2 |
| Ribosomal proteins (SSU) | rps2, rps3, rps4, rps7 (× 2), rps8, rps11, rps12iT, rps14, rps15, rps16i, rps18, rps19 |
| Ribosomal proteins (LSU) | rpl2i (× 2), rpl14, rpl16i, rpl20, rpl22, rpl23 (× 2), rpl32, rpl33, rpl36 |
| Other genes | ccsA, clpPi, matK, accD, cemA, infA |
| Transfer RNAs | trnA-UGCi (× 2), trnC-ACAi, trnC-GCA, trnD-GUC, trnE-UUCi (× 3), trnF-GAA, trnG-GCC, trnG-UCCi*, trnH-GUG, trnK-UUUi, trnL-CAA (× 2), trnL-UAAi**, trnL-UAG, trnM-CAU (× 4)***, trnN-GUU (× 2), trnP-UGG, trnQ-UUG, trnR-ACG (× 2), trnR-UCU, trnS-AGAi***, trnS-CGAi, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC (× 2), trnW-CCA, trnY-GUA |
| Ribosomal RNAs | rrn4.5 (× 2), rrn5 (× 2), rrn16 (× 2), rrn23 (× 2) |
| Hypothetical chloroplast reading frames | ycf1 (× 2), ycf2 (× 2), ycf3i, ycf4, ycf15 (× 2) |
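i: Genes with introns (marked by a trailing "i" after the gene name).
T: Trans-spliced gene (marked by a trailing "T" after the gene name).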
*Present only in M. exstipulata.
**Absent in C. brasiliense.
***Only three copies in M. exstipulata.
Figure 2. Comparison of the genes flanking the limits of LSC, SSC, and IR regions from the five complete clusioid plastomes assembled. JLB (IRb/LSC), JSB (IRb/SSC), JSA (SSC/IRa) and JLA (IRa/LSC) denote the respective limit in the genome.
Figure 3. Progressive Mauve alignment showing synteny and rearrangements for twelve clusioid plastomes. Brownish-orange bars represent the psbJ–psbE inversion in Kielmeyera coriacea and light blue bars represent the trnK–rbcL inversion in the two Podostemaceae species (Marathrum foeniculaceum and Tristicha trifaria).
Comparison of the number of simple sequence repeats (SSRs) and of long repeats present in 12 clusioid species.
| Species | Mono | Di | Tri | Tetra | Penta | Hexa | Total SSRs | Complementary repeats | Forward repeats | Palindromic repeats | Reverse repeats |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 358 | 49 | 2 | 6 | 0 | 0 | 415 | 0 | 15 | 13 | 4 |
| | 306 | 35 | 6 | 8 | 5 | 1 | 361 | 0 | 19 | 10 | 4 |
| | 297 | 53 | 7 | 5 | 3 | 1 | 366 | 0 | 19 | 14 | 6 |
| | 337 | 42 | 6 | 10 | 1 | 0 | 396 | 0 | 9 | 18 | 1 |
| | 380 | 45 | 3 | 11 | 5 | 0 | 444 | 0 | 40 | 10 | 0 |
| | 321 | 43 | 6 | 9 | 1 | 0 | 380 | 2 | 24 | 21 | 0 |
| | 300 | 53 | 3 | 10 | 1 | 1 | 368 | 0 | 19 | 8 | 4 |
| | 304 | 53 | 2 | 8 | 0 | 0 | 367 | 0 | 26 | 11 | 3 |
| | 301 | 48 | 4 | 6 | 1 | 0 | 360 | 0 | 13 | 13 | 2 |
| | 403 | 28 | 5 | 14 | 2 | 2 | 454 | 0 | 6 | 15 | 4 |
| | 309 | 41 | 7 | 8 | 4 | 1 | 370 | 0 | 17 | 17 | 3 |
| | 373 | 25 | 2 | 2 | 0 | 0 | 402 | 0 | 8 | 10 | 2 |
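To illustrate how SSR counts per motif class (mono- to hexanucleotide) like those in the table can be obtained, here is a minimal regex-based sketch. It is not the tool or the settings used in the study: the minimum repeat counts in `MIN_REPEATS` are hypothetical, and the function names are illustrative. Matches are non-overlapping within each motif class, which is a simplification.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (assumption, not from the study).
MIN_REPEATS = {1: 8, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}
CLASS_NAMES = {1: "Mono", 2: "Di", 3: "Tri", 4: "Tetra", 5: "Penta", 6: "Hexa"}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def count_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Count perfect tandem repeats per motif class in a DNA sequence."""
    seq = sequence.upper()
    counts = Counter()
    for k, n_min in min_repeats.items():
        # A k-mer followed by at least (n_min - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        for match in pattern.finditer(seq):
            # Skip motifs such as 'AA' or 'ATAT' that belong to a shorter class.
            if is_primitive(match.group(1)):
                counts[CLASS_NAMES[k]] += 1
    counts["Total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    toy = "G" + "A" * 10 + "CG" + "AT" * 5 + "CC" + "AGC" * 4 + "T"
    print(dict(count_ssrs(toy)))
```

On the toy sequence this finds one mononucleotide, one dinucleotide, and one trinucleotide repeat. Counts on real plastomes depend strongly on the chosen thresholds, so the numbers in the table should not be expected to match unless the study's own settings are used.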
Figure 4. Network representation of the relationships within the clusioid families. Compatible sets of splits are represented by a single branch, and splits where there is incompatibility are represented by a band of parallel branches, colored in red. Families are colored following the legend; names in black denote the outgroup.
Figure 5. Relationships within the clusioid families and summary of conflicts. (I) Top: gene tree discordance within the clusioids, represented on the coalescent-based species tree. Pie charts summarize the proportion of conflicting (red and green), concordant (blue), and non-informative (gray) genes for each branch. Numbers above branches give the number of concordant genes at that node; numbers below give the number of conflicting genes. The tree on the right was generated with one tree for each of the 82 genes (82), and the tree on the left with 25 independent replicates for each of the 82 genes (2050), both after removal of poorly aligned sequences (gb).
(II) Bottom: summary of the main conflicting relationships recovered using maximum likelihood (values in blue), Bayesian inference (values in red), and multispecies coalescent (values in black) across eight datasets (CU: 82 protein-coding genes concatenated, unpartitioned; CP: 82 protein-coding genes partitioned with individual evolutionary models; 82: a single consensus gene tree per locus used as input; 2050: 25 consensus gene trees from independent runs per locus used as input; each of these four datasets has two versions: nogb, without removal of poorly aligned regions, and gb, after removal using Gblocks [50]). Support values for the branch highlighted in red are shown above each tree. The two upper trees (A and B) represent alternative topologies for the position of the Bonnetiaceae and Clusiaceae families. The two bottom trees (C and D) represent alternative topologies for the position of Mammea americana (Calophyllaceae). Support values are given as ultrafast bootstrap (UB) or posterior probability (PP), with the respective dataset (CUgb, CUnogb, CPgb, CPnogb, 82gb, 82nogb, 2050gb, 2050nogb) in parentheses.
A. Bonnetiaceae sister to Clusiaceae: 76% (UB, CUgb)/58% (UB, CPnogb)/83% (UB, CPgb)/0.47 (PP, 82gb)/1.0 (PP, 2050gb)/0.98 (PP, 2050nogb)/0.71 (PP, CUgb)/0.93 (PP, CPgb)/0.62 (PP, CPnogb).
B. Bonnetiaceae sister to the other clusioid families: 66% (UB, CUnogb)/0.58 (PP, 82nogb)/0.80 (PP, CUnogb).
C. Mammea americana sister to the other Calophyllaceae species: 77% (UB, CUgb)/100% (UB, CUnogb)/87% (UB, CPgb)/100% (UB, CPnogb)/0.7 (PP, CUgb)/1.0 (PP, CUnogb)/0.92 (PP, CPgb)/1.0 (PP, CPnogb).
D. Mammea americana in a clade with Calophyllum brasiliense and Mesua ferrea: 0.76 (PP, 2050gb)/0.22 (PP, 2050nogb)/0.37 (PP, 82gb)/0.44 (PP, 82nogb).