Dhafer A Alzahrani, Enas J Albokhari, Samaila S Yaradua, Abidina Abba.
Abstract
This study presents for the first time the complete chloroplast genomes of four medicinal species in the Capparaceae family belonging to two different genera, Cadaba and Maerua (i.e., C. farinosa, C. glandulosa, M. crassifolia and M. oblongifolia), to investigate their evolutionary process and to infer their phylogenetic positions. The four species are considered important medicinal plants and are used in the treatment of many diseases. In the genus Cadaba, the chloroplast genome ranges from 156,481 bp to 156,560 bp, while that of Maerua ranges from 155,436 bp to 155,685 bp. The chloroplast genomes of C. farinosa, M. crassifolia and M. oblongifolia contain 138 genes, while that of C. glandulosa contains 137 genes, comprising 81 protein-coding genes (80 in C. glandulosa), 31 tRNA genes and 4 rRNA genes. Out of the total genes, 116-117 are unique, while the remaining 19 are replicated in the inverted repeat regions. The psbG gene, which encodes subunit K of NADH dehydrogenase, is absent in C. glandulosa. A total of 249 microsatellites were found in the chloroplast genome of C. farinosa, 251 in C. glandulosa, 227 in M. crassifolia and 233 in M. oblongifolia, the majority of which are mononucleotide A/T repeats located in intergenic spacers. Comparative analysis revealed variable hotspot regions (atpF, rpoC2, rps19 and ycf1), which can be used as molecular markers for species authentication, as regions for inferring phylogenetic relationships among the species, and for evolutionary studies. The monophyly of Capparaceae and of the other families in Brassicales, as well as the phylogenetic positions of the studied species, are highly supported throughout the phylogenetic tree. The cp genomes reported in this study will provide resources for studying the genetic diversity of Capparaceae, as well as for resolving phylogenetic relationships within the family.
Keywords: Cadaba; Capparaceae; Maerua; chloroplast genome; phylogenetic relationships
Year: 2021 PMID: 34204211 PMCID: PMC8234754 DOI: 10.3390/plants10061229
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
Base content in the C. farinosa, C. glandulosa, M. crassifolia and M. oblongifolia chloroplast genomes.
| Species | C. farinosa | C. glandulosa | M. crassifolia | M. oblongifolia |
|---|---|---|---|---|
| Genome size (bp) | 156,481 | 156,560 | 155,685 | 155,436 |
| IR (bp) | 26,430 | 26,424 | 26,294 | 26,401 |
| LSC (bp) | 85,565 | 85,681 | 84,624 | 84,153 |
| SSC (bp) | 18,056 | 18,031 | 18,473 | 18,481 |
| Total number of genes | 138 | 137 | 138 | 138 |
| rRNA | 4 | 4 | 4 | 4 |
| tRNA | 31 | 31 | 31 | 31 |
| Protein-coding genes | 81 | 80 | 81 | 81 |
| A% | 31 | 31 | 31 | 31 |
| C% | 18 | 18 | 18 | 18 |
| T% | 32 | 32 | 32 | 32 |
| G% | 17 | 17 | 17 | 17 |
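The region lengths in the table can be cross-checked against the reported genome sizes. The sketch below assumes the usual quadripartite layout of a circular chloroplast genome (one LSC, one SSC and two IR copies); the names `genomes` and `quadripartite_length` are illustrative and do not come from the paper, and the column order is assumed to follow the species order in the table caption.

```python
# Values copied from the table above (all lengths in bp).
# Assumption: columns follow the caption's species order.
genomes = {
    "C. farinosa":     {"total": 156_481, "IR": 26_430, "LSC": 85_565, "SSC": 18_056},
    "C. glandulosa":   {"total": 156_560, "IR": 26_424, "LSC": 85_681, "SSC": 18_031},
    "M. crassifolia":  {"total": 155_685, "IR": 26_294, "LSC": 84_624, "SSC": 18_473},
    "M. oblongifolia": {"total": 155_436, "IR": 26_401, "LSC": 84_153, "SSC": 18_481},
}


def quadripartite_length(parts):
    """Length implied by one LSC, one SSC and two IR copies."""
    return parts["LSC"] + parts["SSC"] + 2 * parts["IR"]


for species, parts in genomes.items():
    implied = quadripartite_length(parts)
    status = "matches" if implied == parts["total"] else "differs from"
    print(f"{species}: implied {implied:,} bp {status} reported {parts['total']:,} bp")
```

Under this assumption, the implied length equals the reported genome size for all four species.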
Figure 1. Chloroplast genome maps of the four Capparaceae species. Genes drawn inside the circle are transcribed clockwise, while those outside the circle are transcribed counter-clockwise. The inner dark gray circle corresponds to the GC content and the inner light gray circle corresponds to the AT content. Genes belonging to different functional groups are shown in different colors.
Gene contents in the chloroplast genomes of Cadaba and Maerua species.
| Category | Gene Groups | Gene Names |
|---|---|---|
| RNA genes | Ribosomal RNA genes (rRNA) | |
| | Transfer RNA genes (tRNA) | |
| Ribosomal proteins | Small ribosomal subunit | |
| Transcription | Large ribosomal subunit | |
| | DNA dependent RNA polymerase | |
| Protein-coding genes | Photosystem I | |
| | Photosystem II | |
| | Subunit of cytochrome | |
| | Subunit of synthase | |
| | Large subunit of Rubisco | |
| | NADH dehydrogenase | |
| Other genes | ATP dependent protease subunit P | |
| | Chloroplast envelope membrane protein | |
| | Maturase | |
| | Subunit of acetyl-CoA carboxylase | |
| | C-type cytochrome synthesis | |
| | Translation initiation factor | |
| | Hypothetical proteins | |
| | Component of the TIC complex | |
+ Gene with one intron; ++ gene with two introns; a gene with multiple copies; * ndhK is listed in the photosystem II group in C. farinosa and in the NADH dehydrogenase group in C. glandulosa, M. crassifolia and M. oblongifolia.
Figure 2. Amino acid frequencies in the protein-coding sequences of the four Capparaceae chloroplast genomes.
Figure 3. Number of different repeats in the chloroplast genomes of the four Capparaceae species. P = palindromic, F = forward, R = reverse and C = complement.
Simple sequence repeats in the C. farinosa, C. glandulosa, M. crassifolia and M. oblongifolia chloroplast genomes.
| SSR Type | Repeat Unit | C. farinosa | C. glandulosa | M. crassifolia | M. oblongifolia |
|---|---|---|---|---|---|
| Mono | A/T | 220 | 223 | 204 | 208 |
| | C/G | 1 | 1 | 2 | 2 |
| Di | AC/GT | 0 | 1 | 0 | 0 |
| | AG/CT | 2 | 0 | 1 | 1 |
| | AT/AT | 11 | 12 | 7 | 10 |
| Tri | AAT/ATT | 4 | 5 | 2 | 3 |
| Tetra | AAAC/GTTT | 0 | 0 | 1 | 0 |
| | AAAG/CTTT | 0 | 0 | 0 | 1 |
| | AAAT/ATTT | 6 | 6 | 4 | 5 |
| | AATT/AATT | 2 | 2 | 1 | 0 |
| | AACT/AGTT | 0 | 0 | 0 | 1 |
| | AGAT/ATCT | 1 | 1 | 2 | 1 |
| Penta | AAAAT/ATTTT | 1 | 0 | 0 | 0 |
| | AAATT/AATTT | 0 | 0 | 1 | 0 |
| | AACAT/ATGTT | 0 | 0 | 0 | 1 |
| | AAACT/AGTTT | 1 | 0 | 0 | 0 |
| | AATAG/ATTCT | 0 | 0 | 2 | 0 |
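The per-species SSR totals quoted in the abstract can be recomputed from the rows of this table. The following is a minimal sketch, assuming the four count columns follow the species order given in the table caption; the variable names (`ssr_counts`, `totals`, `at_share`) are illustrative only.

```python
species = ("C. farinosa", "C. glandulosa", "M. crassifolia", "M. oblongifolia")

# Rows copied from the table above: repeat unit -> counts per species.
# Assumption: columns follow the caption's species order.
ssr_counts = {
    "A/T": (220, 223, 204, 208),
    "C/G": (1, 1, 2, 2),
    "AC/GT": (0, 1, 0, 0),
    "AG/CT": (2, 0, 1, 1),
    "AT/AT": (11, 12, 7, 10),
    "AAT/ATT": (4, 5, 2, 3),
    "AAAC/GTTT": (0, 0, 1, 0),
    "AAAG/CTTT": (0, 0, 0, 1),
    "AAAT/ATTT": (6, 6, 4, 5),
    "AATT/AATT": (2, 2, 1, 0),
    "AACT/AGTT": (0, 0, 0, 1),
    "AGAT/ATCT": (1, 1, 2, 1),
    "AAAAT/ATTTT": (1, 0, 0, 0),
    "AAATT/AATTT": (0, 0, 1, 0),
    "AACAT/ATGTT": (0, 0, 0, 1),
    "AAACT/AGTTT": (1, 0, 0, 0),
    "AATAG/ATTCT": (0, 0, 2, 0),
}

# Column-wise sums give the total number of SSRs per species.
totals = dict(zip(species, (sum(col) for col in zip(*ssr_counts.values()))))

# Fraction of all SSRs that are A/T mononucleotide repeats.
at_share = {name: ssr_counts["A/T"][i] / totals[name] for i, name in enumerate(species)}

for name in species:
    print(f"{name}: {totals[name]} SSRs, A/T mononucleotide share {at_share[name]:.1%}")
```

The column sums come to 249, 251, 227 and 233, which agree with the totals stated in the abstract, and A/T mononucleotide repeats account for most SSRs in every species.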
Figure 4. Frequency of different SSR motifs in different repeat types in the C. farinosa, C. glandulosa, M. crassifolia and M. oblongifolia chloroplast genomes.
Figure 5. Number of different SSR types in the four chloroplast genomes of Capparaceae.
Figure 6. Alignment of the chloroplast genomes of C. farinosa, C. glandulosa, M. crassifolia, M. oblongifolia and C. versicolor, performed with C. farinosa as reference. Transcription direction is indicated by the gray arrows at the top, protein-coding regions are represented by blue bars, conserved non-coding sequences (CNS) by pink bars, and tRNAs and rRNAs by light green bars. The x-axis gives the coordinates within the cp genome, while the y-axis gives the percentage identity in the range 50–100%.
Figure 7. Comparison of the IR, SSC and LSC junction positions among five chloroplast genomes of Capparaceae.
Figure 8. Synonymous substitution (dS) and dN/dS ratio values of 81 protein-coding genes from the four Capparaceae cp genomes.
Figure 9. Phylogenetic tree reconstruction based on the complete chloroplast genomes of 21 taxa, inferred using Bayesian inference (BI) and showing relationships within Brassicales. Numbers at the clades represent posterior probability (PP) values.