Ming Yan, Xueqing Zhao, Jianqing Zhou, Yan Huo, Yu Ding, Zhaohe Yuan.
Abstract
Pomegranate (Punica granatum L.) is one of the most popular fruit trees cultivated in the arid and semi-arid tropics and subtropics. In this study, we determined and characterized the complete chloroplast (cp) genomes of three P. granatum cultivars with different phenotypes using a genome skimming approach. The three cp genomes displayed the typical quadripartite structure of angiosperms, and their lengths ranged from 156,638 to 156,639 bp. Each encoded 113 unique genes, 17 of which are duplicated in the inverted repeat (IR) regions. We analyzed the sequence diversity of pomegranate cp genomes together with two previously reported genomes. Sequence diversity was extremely low and no informative sites were detected, which suggests that cp genome sequences may not be suitable for investigating the genetic diversity of pomegranate genotypes. Further, we analyzed the codon usage pattern and identified potential RNA editing sites. A comparative cp genome analysis with other species within Lythraceae revealed that gene content and organization are highly conserved. Based on a site-specific model, 11 genes with positively selected sites were detected, most of them photosynthesis-related or genetic system-related genes. Together with previously released cp genomes of the order Myrtales, we determined the taxonomic position of P. granatum based on complete chloroplast genomes. Phylogenetic analysis suggested that P. granatum forms a single clade with other species of Lythraceae with a high support value. The complete cp genomes provide valuable information for understanding the phylogenetic position of P. granatum in the order Myrtales.
Keywords: chloroplast genome; phylogeny; pomegranate; sequence diversity; site-specific selection
Year: 2019 PMID: 31200508 PMCID: PMC6627765 DOI: 10.3390/ijms20122886
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Chloroplast genome maps of P. granatum. Genes drawn outside the outer circle are transcribed clockwise, and those inside are transcribed counter-clockwise. Genes belonging to different functional groups are color-coded.
The groups of genes within the P. granatum chloroplast genome.
| Group of Genes | Gene Names |
|---|---|
| Photosystem I | |
| Photosystem II | |
| Cytochrome b/f complex | |
| ATP synthase | |
| NADP dehydrogenase | |
| RubisCO large subunit | |
| RNA polymerase | |
| Ribosomal proteins (SSU) | |
| Ribosomal proteins (LSU) | |
| Hypothetical chloroplast reading frames | |
| Translation initiation factor IF-1 | |
| Acetyl-CoA carboxylase | |
| Cytochrome c biogenesis Maturase | |
| ATP-dependent protease | |
| Inner membrane protein | |
| Ribosomal RNAs | |
| Transfer RNAs | |
An asterisk indicates genes present in two copies. a and b indicate genes containing one and two introns, respectively.
Figure 2. The codon usage pattern of the pomegranate chloroplast (cp) genome. (A) GC content at the three codon positions. (B) Neutrality plot (GC12 against GC3). (C) The codon adaptation index (CAI) value of gene sets with different functions. (D) Relationship between GC3 and effective number of codons (ENC) (ENC-plot). The expected ENC from GC3 is shown as a solid line.
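The quantities plotted in Figure 2 can be derived directly from coding sequences. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and the expected-ENC curve uses Wright's standard formula, ENC = 2 + s + 29 / (s² + (1 − s)²) with s = GC3, which is assumed here to be the solid line in panel D because the caption does not state the formula.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the GC fraction at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    fractions = []
    for offset in range(3):
        bases = cds[offset:usable:3]  # every third base starting at this offset
        fractions.append(sum(b in "GC" for b in bases) / len(bases))
    return tuple(fractions)


def expected_enc(gc3):
    """Expected ENC when codon usage is shaped by GC3 composition alone."""
    s = gc3
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)
```

GC12 in the neutrality plot is the mean of GC1 and GC2. Genes falling well below the `expected_enc` curve are usually read as showing codon bias beyond what base composition explains.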
Putative preferred codons in the P. granatum cp genome. RSCU = relative synonymous codon usage.
| Amino Acid | Codon | Codon Frequency | RSCU | Amino Acid | Codon | Codon Frequency | RSCU |
|---|---|---|---|---|---|---|---|
| Phe | UUU* | 4551 | 1.18 | Ser | UCU* | 2417 | 1.46 |
| | UUC | 3143 | 0.82 | | UCC | 1577 | 0.96 |
| Leu | UUA* | 3112 | 1.41 | | UCA* | 2278 | 1.38 |
| | UUG* | 2920 | 1.32 | | UCG | 1268 | 0.77 |
| | CUU* | 2586 | 1.17 | Pro | CCU* | 1385 | 1.21 |
| | CUC | 1360 | 0.62 | | CCC | 929 | 0.81 |
| | CUA | 1958 | 0.89 | | CCA* | 1387 | 1.21 |
| | CUG | 1287 | 0.58 | | CCG | 876 | 0.77 |
| Ile | AUU* | 4378 | 1.27 | Thr | ACU* | 1478 | 1.13 |
| | AUC | 2723 | 0.79 | | ACC | 1095 | 0.84 |
| | AUA | 3246 | 0.94 | | ACA* | 1759 | 1.35 |
| Met | AUG | 2760 | 1.00 | | ACG | 886 | 0.68 |
| Val | GUU* | 2045 | 1.34 | Ala | GCU* | 1389 | 1.45 |
| | GUC | 1033 | 0.68 | | GCC | 712 | 0.75 |
| | GUA* | 1891 | 1.24 | | GCA | 1145 | 1.20 |
| | GUG | 1123 | 0.74 | | GCG | 576 | 0.60 |
| Tyr | UAU* | 3606 | 1.37 | Cys | UGU* | 1410 | 1.17 |
| | UAC | 1665 | 0.63 | | UGC | 993 | 0.83 |
| TER | UAA* | 2029 | 1.03 | TER | UGA | 2003 | 1.01 |
| | UAG | 1893 | 0.96 | Trp | UGG | 2392 | 1.00 |
| His | CAU* | 1908 | 1.38 | Arg | CGU | 888 | 0.70 |
| | CAC | 866 | 0.62 | | CGC | 456 | 0.36 |
| Gln | CAA* | 2815 | 1.35 | | CGA* | 1428 | 1.13 |
| | CAG | 1342 | 0.65 | | CGG | 865 | 0.68 |
| Asn | AAU* | 3923 | 1.37 | Ser | AGU | 1441 | 0.87 |
| | AAC | 1800 | 0.63 | | AGC | 923 | 0.56 |
| Lys | AAA* | 4768 | 1.31 | Arg | AGA* | 2560 | 2.02 |
| | AAG | 2538 | 0.69 | | AGG | 1412 | 1.11 |
| Asp | GAU* | 2818 | 1.49 | Gly | GGU | 1642 | 1.01 |
| | GAC | 962 | 0.51 | | GGC | 886 | 0.54 |
| Glu | GAA* | 3632 | 1.37 | | GGA* | 2409 | 1.48 |
| | GAG | 1689 | 0.63 | | GGG | 1569 | 0.96 |
Preferred codons (RSCU value > 1.0) are indicated with (*).
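RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies this standard definition to one synonymous family at a time. The function name is hypothetical, and the counts are the Phe and Asp values from the table above.

```python
def rscu(codon_counts):
    """Map each codon to its RSCU within ONE synonymous family.

    codon_counts: dict of codon -> observed count for a single amino acid.
    """
    total = sum(codon_counts.values())
    if total == 0:
        raise ValueError("no codons observed for this amino acid")
    expected = total / len(codon_counts)  # count under equal synonymous usage
    return {codon: n / expected for codon, n in codon_counts.items()}


phe = rscu({"UUU": 4551, "UUC": 3143})
print({codon: round(value, 2) for codon, value in phe.items()})
# → {'UUU': 1.18, 'UUC': 0.82}
```

Values above 1.0 mean the codon is used more often than expected under equal usage, which is the threshold the table's asterisk refers to.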
Predicted RNA editing sites in the cp genome of P. granatum.
| Gene | Nucleotide Position | Amino Acid Position | Codon Conversion | Score |
|---|---|---|---|---|
| | 644 | 215 | GCA (A) => GTA (V) | 1 |
| | 1177 | 393 | CGG (R) => TGG (W) | 1 |
| | 1187 | 396 | TCA (S) => TTA (L) | 0.86 |
| | 1246 | 416 | CAC (H) => TAC (Y) | 1 |
| | 791 | 264 | CCC (P) => CTC (L) | 1 |
| | 92 | 31 | CCA (P) => CTA (L) | 0.86 |
| | 23 | 8 | ACC (T) => ATC (I) | 1 |
| | 422 | 141 | TCG (S) => TTG (L) | 1 |
| | 3056 | 1019 | GCA (A) => GTA (V) | 0.86 |
| | 3998 | 1333 | GCG (A) => GTG (V) | 0.86 |
| | 41 | 14 | TCA (S) => TTA (L) | 1 |
| | 1171 | 391 | CCA (P) => TCA (S) | 1 |
| | 338 | 113 | TCT (S) => TTT (F) | 1 |
| | 551 | 184 | TCA (S) => TTA (L) | 1 |
| | 566 | 189 | TCG (S) => TTG (L) | 1 |
| | 973 | 325 | CTC (L) => TTC (F) | 0.86 |
| | 80 | 27 | TCA (S) => TTA (L) | 1 |
| | 149 | 50 | TCA (S) => TTA (L) | 1 |
| | 1487 | 496 | TCG (S) => TTG (L) | 1 |
| | 794 | 265 | TCG (S) => TTG (L) | 0.8 |
| | 1403 | 468 | CCT (P) => CTT (L) | 1 |
| | 2 | 1 | ACG (T) => ATG (M) | 1 |
| | 77 | 26 | TCT (S) => TTT (F) | 1 |
| | 559 | 187 | CAT (H) => TAT (Y) | 1 |
| | 28 | 10 | CTC (L) => TTC (F) | 1 |
| | 149 | 50 | TCA (S) => TTA (L) | 1 |
| | 467 | 156 | CCA (P) => CTA (L) | 0.8 |
| | 586 | 196 | CAT (H) => TAT (Y) | 1 |
| | 611 | 204 | TCA (S) => TTA (L) | 1 |
| | 737 | 246 | CCA (P) => CTA (L) | 1 |
| | 746 | 249 | TCT (S) => TTT (F) | 1 |
| | 830 | 277 | TCA (S) => TTA (L) | 1 |
| | 836 | 279 | TCA (S) => TTA (L) | 1 |
| | 1255 | 419 | CAT (H) => TAT (Y) | 1 |
| | 1481 | 494 | CCA (P) => CTA (L) | 1 |
| | 160 | 54 | CTT (L) => TTT (F) | 1 |
| | 586 | 196 | CTT (L) => TTT (F) | 0.8 |
| | 89 | 30 | TCG (S) => TTG (L) | 1 |
| | 2 | 1 | ACG (T) => ATG (M) | 1 |
| | 185 | 62 | ACC (T) => ATC (I) | 1 |
| | 313 | 105 | CGG (R) => TGG (W) | 0.8 |
| | 383 | 128 | TCA (S) => TTA (L) | 1 |
| | 674 | 225 | TCG (S) => TTG (L) | 1 |
| | 845 | 282 | ACA (T) => ATA (I) | 0.8 |
| | 878 | 239 | TCA (S) => TTA (L) | 1 |
| | 887 | 296 | CCA (P) => CTA (L) | 1 |
| | 1405 | 469 | CTT (L) => TTT (F) | 0.8 |
| | 155 | 52 | CCA (P) => CTA (L) | 1 |
| | 166 | 56 | CAT (H) => TAT (Y) | 0.8 |
| | 314 | 105 | ACA (T) => ATA (I) | 0.8 |
| | 341 | 114 | TCA (S) => TTA (L) | 1 |
| | 566 | 189 | TCA (S) => TTA (L) | 1 |
| | 1073 | 358 | TCC (S) => TTC (F) | 1 |
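Each row of the table pairs a nucleotide position with the codon it falls in and the amino acid change caused by a C-to-U edit (written as C => T in the DNA alphabet). The sketch below shows how these columns relate. It assumes positions are 1-based within the coding sequence and uses the standard genetic code; the helper name is hypothetical and this is not the prediction tool behind the Score column.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def apply_c_to_u_edit(codon, nt_position):
    """Return (aa_position, edited_codon, aa_before, aa_after) for one edit.

    nt_position is assumed to be 1-based within the coding sequence.
    """
    aa_position = (nt_position - 1) // 3 + 1
    offset = (nt_position - 1) % 3  # which base of the codon is edited
    if codon[offset] != "C":
        raise ValueError("the addressed base is not a C")
    edited = codon[:offset] + "T" + codon[offset + 1:]
    return aa_position, edited, CODON_TABLE[codon], CODON_TABLE[edited]


print(apply_c_to_u_edit("GCA", 644))
# → (215, 'GTA', 'A', 'V')
```

The printed result matches the first row of the table: nucleotide 644 lies in codon 215, and the edit converts alanine to valine.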
Figure 3. The genetic distance based on Kimura’s two-parameter model. (A) The P-distance value of protein-coding genes. (B) The P-distance value of intergenic regions. (C) Boxplots of P-distance value difference among LSC, SSC, and IRs. (D) Boxplots of P-distance value differences between protein-coding genes and intergenic regions.
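The caption refers to both the P-distance and Kimura's two-parameter (K2P) model. The sketch below computes both for a pair of aligned sequences so the relationship is explicit. It assumes a gap-free comparison in which sites with gaps or ambiguous bases are skipped; the function name is hypothetical, and the software used for the figure is not stated in this record.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def p_and_k2p_distance(seq1, seq2):
    """Return (p_distance, k2p_distance) for two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    P = transitions / compared
    Q = transversions / compared
    # K2P correction: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    k2p = -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
    return P + Q, k2p
```

For closely related sequences such as cp genomes within one family, the two distances are nearly identical, because the K2P correction for multiple substitutions is small when P and Q are small.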
Summary of the complete chloroplast genome characteristics of five species in Lythraceae.
| Species | |||||
|---|---|---|---|---|---|
| Genome size | 158,638 | 152,025 | 153,061 | 155,577 | 159,219 |
| LSC size | 89,021 | 84,046 | 87,226 | 88,528 | 88,571 |
| SSC size | 18,684 | 16,914 | 18,032 | 18,272 | 18,821 |
| IR size | 25,467 | 25,623 | 23,902 | 24,389 | 25,914 |
| Number of genes | 113 | 113 | 107 | 110 | 112 |
| Protein-coding genes | 79 (6) | 79 (7) | 79 (6) | 77 (5) | 78 (7) |
| tRNA genes | 30 (7) | 30 (7) | 24 (5) | 29 (9) | 30 (6) |
| rRNA genes | 4 (4) | 4 (4) | 4 (4) | 4 (4) | 4 (4) |
| Number of genes duplicated in IR | 17 | 18 | 15 | 18 | 17 |
| GC content | 36.92 | 37.59 | 37.29 | 36.4 | 36.95 |
| GenBank accession | MK603511 | NC_030484 | NC_039975 | NC_037023 | MG921615 |
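In a quadripartite cp genome the total length is the LSC plus the SSC plus two copies of the IR. A small sketch of that arithmetic, using the first column of the table above (accession MK603511); the function name is hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length implied by one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


implied = quadripartite_length(lsc=89_021, ssc=18_684, ir=25_467)
print(implied)
# → 158639
```

The implied total is within one base pair of the genome size listed for that column (158,638).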
Figure 4. Comparison of the borders of the LSC, SSC, and IR regions among five Lythraceae cp genomes.
Figure 5. Co-linear analysis of five cp genomes within Lythraceae.
Log-likelihood values of the site-specific models, with detected sites having non-synonymous/synonymous (dN/dS) values > 1.
| Gene Name | Models (Number of Parameters) | lnL | Likelihood Ratio Test | Positively Selected Sites |
|---|---|---|---|---|
| | M8 (12) | −2534.400824 | 0.0962956 | 125 G 0.955 * |
| | M7 (10) | −2536.741156 | | |
| | M8 (12) | −4446.871610 | 0.000000002 | 292 N 0.961 *; 486 R 0.999 **; 487 I 0.975 *; 490 K 0.985 *; 518 N 0.969 *; 648 S 0.983 *; 738 F 0.995 ** |
| | M7 (10) | −4466.787914 | | |
| | M8 (12) | −806.644367 | 0.003291615 | 121 R 0.970 * |
| | M7 (10) | −812.360744 | | |
| | M8 (12) | −139.985819 | 0.031547151 | 26 H 0.959 * |
| | M7 (10) | −143.4420 | | |
| | M8 (12) | −942.833567 | 0.000497160 | 4 L 0.972 *; 5 Y 0.961 *; 73 P 0.962 *; 125 A 0.993 **; 126 R 0.994 ** |
| | M7 (10) | −950.440165 | | |
| | M8 (12) | −529.317591 | 0.002536102 | 117 K 0.974 * |
| | M7 (10) | −535.294718 | | |
| | M8 (12) | −1171.788526 | 0.008240568 | 173 E 0.982 * |
| | M7 (10) | −1176.587212 | | |
| | M8 (12) | −994.666749 | 0.007084882 | 28 P 0.959 * |
| | M7 (10) | −999.616541 | | |
| | M8 (12) | −672.815064 | 0.000000001 | 84 T 1.000 ** |
| | M7 (10) | −693.433775 | | |
| | M8 (12) | −718.655799 | 0.001840922 | 59 L 0.989 * |
| | M7 (10) | −724.953288 | | |
| | M8 (12) | −11,993.590817 | 0.000001936 | 205 V 0.977 *; 206 F 0.975 *; 341 S 0.974 *; 495 S 0.952 *; 534 A 0.951 *; 1073 A 0.963 *; 1290 R 0.978 *; 1446 E 0.963 *; 1701 K 0.976 *; 1728 T 0.950 * |
| | M7 (10) | −12,006.745724 | | |
* p < 0.05; ** p < 0.01.
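The Likelihood Ratio Test column can be reproduced from the two lnL values of each gene. The sketch below assumes the usual comparison of M8 against M7: the statistic 2 × (lnL_M8 − lnL_M7) is referred to a chi-square distribution with 2 degrees of freedom (12 − 10 parameters). For 2 degrees of freedom the chi-square tail probability has the closed form exp(−x/2), so no statistics library is needed. The function name is hypothetical.

```python
import math


def lrt_p_value_df2(lnl_m8, lnl_m7):
    """Return (statistic, p_value) for an M8-vs-M7 likelihood ratio test.

    Assumes a chi-square reference distribution with 2 degrees of freedom,
    for which the survival function is exp(-x / 2).
    """
    statistic = 2.0 * (lnl_m8 - lnl_m7)
    if statistic < 0:
        raise ValueError("the nested model M7 cannot have the higher lnL")
    return statistic, math.exp(-statistic / 2.0)


stat, p = lrt_p_value_df2(-2534.400824, -2536.741156)
print(round(stat, 4), round(p, 4))
# → 4.6807 0.0963
```

The result agrees with the first row of the table (0.0962956), which suggests the column reports p-values computed this way.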
Figure 6. Phylogenetic tree reconstructed using maximum likelihood (ML) and Bayesian inference (BI) methods based on complete cp genomes of the order Myrtales. Only the topology of the ML tree is shown.