Selahattin Baris Cay1, Yusuf Ulas Cinar1, Selim Can Kuralay1, Behcet Inal2, Gokmen Zararsiz3,4, Almila Ciftci1, Rachel Mollman1, Onur Obut1, Vahap Eldem1, Yakup Bakir5, Osman Erol1.
Abstract
Crocus istanbulensis (B.Mathew) Rukšāns is one of the most endangered Crocus species in the world and has an extremely limited distribution range in Istanbul. Our recent field work indicates that no more than one hundred individuals remain in the wild. In the present study, we used genome skimming to determine the complete chloroplast (cp) genome sequences of six C. istanbulensis individuals collected from the locus classicus. The cp genome of C. istanbulensis is 151,199 base pairs (bp) long, comprising a large single-copy (LSC) region (81,197 bp), a small single-copy (SSC) region (17,524 bp) and two inverted repeat (IR) regions of 26,239 bp each. The cp genome contains 132 genes, of which 86 are protein-coding genes (PCGs), 8 are rRNA genes and 38 are tRNA genes. Most of the repeats are found in intergenic spacers of Crocus species. Mononucleotide repeats were the most abundant, accounting for over 80% of total repeats. The cp genome contained four palindromic repeats and one forward repeat. Comparative analyses with other Iridaceae species identified one inversion in the terminal positions of the LSC region and three different gene (psbA, rps3 and rpl22) arrangements in C. istanbulensis that were not reported previously. To measure selective pressure in the exons of chloroplast coding sequences, we performed a sequence analysis of plastome-encoded genes. A total of seven genes (accD, rpoC2, psbK, rps12, ccsA, clpP and ycf2) were detected under positive selection in the cp genome. Alignment-free sequence comparison showed extremely low sequence diversity across naturally occurring C. istanbulensis specimens. All six sequenced individuals shared the same cp haplotype. In summary, this study will aid further research on the molecular evolution of C. istanbulensis and the development of ex situ conservation strategies.
Year: 2022 PMID: 35704623 PMCID: PMC9200356 DOI: 10.1371/journal.pone.0269747
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Chloroplast genome features of seven taxa from the Iridaceae.
| Feature | C. istanbulensis | | | | | | |
|---|---|---|---|---|---|---|---|
| Genome size (bp) | 151,199 | 150,819 | 150,820 | 152,408 | 153,441 | 153,084 | 119,004 |
| LSC length (bp) | 81,197 | 81,309 | 81,310 | 82,340 | 82,659 | 82,484 | 45,795 |
| IR length (bp) | 26,239 | 26,057 | 26,056 | 26,026 | 26,221 | 26,168 | 36,347 |
| SSC length (bp) | 17,524 | 17,396 | 17,396 | 18,016 | 18,376 | 18,264 | 515 |
| Total genes | 132 | 132 | 132 | 133 | 132 | 133 | 111 |
| Protein-coding genes | 86 | 86 | 86 | 87 | 86 | 86 | 39 |
| tRNA genes | 38 | 38 | 38 | 38 | 38 | 38 | 37 |
| rRNA genes | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| GC content, total (%) | 37.6 | 37.5 | 37.5 | 38.5 | 37.9 | 37.9 | 38.5 |
| GC content, LSC (%) | 35.69 | 35.57 | 35.57 | 36.23 | 36.01 | 36.07 | 35.79 |
| GC content, IR (%) | 42.75 | 42.79 | 42.79 | 43.07 | 43.07 | 43.08 | 40.33 |
| GC content, SSC (%) | 30.97 | 30.76 | 30.76 | 31.83 | 31.55 | 31.52 | 31.59 |
| GenBank accession | MN254968 | MH542231 | MH542233 | KT626943 | KM014691 | MH251636 | MH142524 |
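The region lengths in the table can be cross-checked against the reported genome sizes, because a quadripartite plastome consists of one LSC, one SSC and two IR copies. The following is a minimal Python sketch under that assumption; the function name is illustrative and does not come from the study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) reported for C. istanbulensis in the table above.
lsc, ssc, ir = 81_197, 17_524, 26_239
reported_total = 151_199

total = quadripartite_length(lsc, ssc, ir)
print(total, total == reported_total)  # prints: 151199 True
```

The same check holds for the last column of the table (45,795 + 515 + 2 × 36,347 = 119,004).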
Fig 1. Circular visualization of cp genome annotation for C. istanbulensis.
Genes belonging to different functional categories are shown in different colors. Genes drawn inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. GC content is shown in the middle circle.
The functional classification of cp genes annotated in the cp genome of C. istanbulensis.
| Category | Gene group | Gene name |
|---|---|---|
| | Subunits of Photosystem I | |
| | Subunits of Photosystem II | |
| | Large subunit of rubisco | |
| | Subunits of ATP synthase | |
| | Subunits of cytochrome | |
| | Subunits of NADH dehydrogenase | |
| | Small subunit of ribosome | |
| | Large subunit of ribosome | |
| | Transfer RNA genes | |
| | DNA-dependent RNA polymerase | |
| | Translational initiation factor | |
| | Protease | |
| | Maturase | |
| | Envelope membrane protein | |
| | Subunit of acetyl-CoA-carboxylase | |
| | Conserved hypothetical chloroplast reading frames | |

\* indicates a gene containing a single intron; (2X) refers to genes that are located in the IRs and hence are duplicated.
Fig 2. Comparison of the LSC, IR and SSC junction positions among seven chloroplast genomes of Iridaceae.
JLB represents the LSC/IRb junction, JSB represents the IRb/SSC junction, JSA represents the SSC/IRa junction, and JLA represents the IRa/LSC junction. The thin lines mark the junction points between regions, and the number of base pairs (bp) shows the distance from the boundary site to the end of the gene (in the colored box).
Fig 3. Comparative genome map of C. cartwrightianus, C. istanbulensis and C. sativus.
Lines between cp genomes connect matched genes. Genes are colored by structural and functional class. The inverted region is 26,239 bp in length.
The number and distribution of SSRs in Iridaceae cp genomes.
| SSR type | Measure | C. istanbulensis | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Total | Number | 60 | 63 | 65 | 44 | 35 | 38 | 45 |
| | Length (bp) | 699 | 724 | 749 | 507 | 401 | 449 | 516 |
| | Share of genome (%) | 0.46 | 0.48 | 0.49 | 0.33 | 0.26 | 0.29 | 0.43 |
| Monomer | Number | 56 | 60 | 60 | 44 | 32 | 33 | 36 |
| | Length (bp) | 625 | 650 | 651 | 507 | 353 | 371 | 407 |
| | Share of total SSR length (%) | 89.41 | 89.78 | 86.92 | 100.00 | 88.03 | 82.63 | 78.88 |
| Dimer | Number | 4 | 2 | 4 | 0 | 2 | 4 | 8 |
| | Length (bp) | 74 | 38 | 62 | 0 | 30 | 48 | 91 |
| Trimer | Number | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| | Length (bp) | 0 | 0 | 0 | 0 | 18 | 30 | 18 |
| Hexamer | Number | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| | Length (bp) | 0 | 36 | 36 | 0 | 0 | 0 | 0 |
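The two percentage rows in the SSR table are consistent with simple ratios: total SSR length over genome size, and monomer SSR length over total SSR length. The Python sketch below reproduces the first-column values under that reading; the helper name is illustrative, and the genome size is taken from the genome features table.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# First data column of the SSR table (C. istanbulensis).
genome_bp = 151_199     # genome size from the genome features table
ssr_total_bp = 699      # total SSR length
monomer_bp = 625        # mononucleotide SSR length

print(round(percent(ssr_total_bp, genome_bp), 2))   # prints: 0.46
print(round(percent(monomer_bp, ssr_total_bp), 2))  # prints: 89.41
```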
Fig 4. The distribution of SSR motifs in Iridaceae cp genomes.
Green, yellow, red and black circles indicate high, middle, low and lowest SSR numbers, respectively.
Results of the evolutionary analyses for positively selected sites for accD, rpoC2, psbK, rps12, ccsA, clpP and ycf2.
| Gene | | Region | M7 vs M8 (χ2) | M7 vs M8 (p-value) | % sites with dN/dS > 1 | avg(dN/dS) | M8 BEB (PP > 0.95) |
|---|---|---|---|---|---|---|---|
| accD | | Full (aa 1–442) | 26.89 | < 0.001 | 1.12 | 34.13 | R4; M34; L38; L55; A212; |
| | | Full (aa 1–442) | 24.03 | < 0.001 | 1.01 | 41.67 | |
| | | Full (aa 1–442) | 21.47 | < 0.001 | 3.14 | 11.79 | |
| rpoC2 | | Full (aa 1–1355) | 6.46 | 0.04 | 12.65 | 2.24 | I626; |
| | | Full (aa 1–1355) | 2.77 | 0.25 | NA | NA | NA |
| | | Full (aa 1–1355) | 5.33 | 0.07 | 1.15 | 6.38 | Q925; |
| psbK | | Full (aa 1–61) | 6.0 | 0.05 | 23.95 | 6.08 | S17 |
| | | Full (aa 1–61) | 8.19 | 0.017 | 4.71 | 31.82 | S17; H20 |
| | | Full (aa 1–61) | 10.56 | 0.005 | 6.38 | 26.49 | S17; H20 |
| rps12 | | Full (aa 1–116) | 28.54 | < 0.001 | 25.78 | 14.87 | |
| | | Full (aa 1–116) | 33.88 | < 0.001 | 25.57 | 16.27 | |
| | | Full (aa 1–116) | 37.64 | < 0.001 | 25.44 | 20.04 | |
| ccsA | | Full (aa 1–318) | 5.23 | 0.073 | 1.99 | 6.96 | A4; G92; A103 |
| | | Full (aa 1–318) | 9.13 | 0.01 | 1.68 | 10.5 | A4; |
| | | Full (aa 1–318) | 8.44 | 0.015 | 1.86 | 9.71 | A4; |
| clpP | | Full (aa 1–203) | 25.34 | < 0.001 | 0.49 | 104.75 | |
| | | Full (aa 1–203) | 26.96 | < 0.001 | 0.49 | 83.13 | |
| | | Full (aa 1–203) | 27.41 | < 0.001 | 0.5 | 114.9 | |
| ycf2 | | Full (aa 1–2183) | 5.84 | 0.054 | 1.59 | 11.29 | D65; R1147; K1190; N1238; K1571; H1655; L2048; A2155 |
| | | Full (aa 1–2183) | 6.62 | 0.036 | 3.06 | 8.87 | D65; R1147; K1190; N1238; K1571; H1655; L2048; A2155 |
| | | Full (aa 1–2183) | 6.83 | 0.033 | 4.67 | 7.27 | D65; R1147; K1190; N1238; K1571; H1655; L2048; A2155 |

P-values were obtained by performing chi-squared tests on twice the difference of the computed log-likelihood values of the models disallowing (M7) or allowing (M8) dN/dS > 1. The BEB column lists rapidly evolving sites with dN/dS > 1 and a posterior probability > 0.95, determined by the Bayes Empirical Bayes method implemented in Codeml. Amino acids refer to the C. istanbulensis cp exonic sequence. Note that INDELs and the stop codon were removed from the alignment prior to evolutionary analysis, so the positions shown are based on the alignment without gaps (aa = amino acids, PP = posterior probability).
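The p-values in the table are consistent with a likelihood ratio test on 2 degrees of freedom, the usual choice for the M7 versus M8 comparison. The Python sketch below is illustrative rather than the authors' pipeline: the function names are hypothetical and the degrees of freedom are an assumption. It relies on the closed-form upper tail of the chi-squared distribution for df = 2, which is exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between the alternative (M8) and null (M7) model."""
    return 2.0 * (lnl_alt - lnl_null)


def lrt_pvalue_df2(statistic: float) -> float:
    """Upper-tail p-value of a chi-squared distribution with 2 degrees of freedom.

    Assumption: df = 2. For this case the survival function is exp(-x / 2).
    """
    if statistic < 0:
        raise ValueError("the test statistic must be non-negative")
    return math.exp(-statistic / 2.0)


# Test statistics taken from the table above.
for chi2 in (26.89, 6.46, 2.77, 8.19):
    print(chi2, round(lrt_pvalue_df2(chi2), 3))
```

For example, a statistic of 6.46 gives a p-value of about 0.04 and 2.77 gives about 0.25, matching the tabulated values.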
Fig 5. (a) A schematic representation of k-mer-based similarity results for C. istanbulensis individuals using unassembled genome skim data. (b) Sequence similarity results of C. istanbulensis individuals compared with the whole chloroplast genomes of other Iridaceae species. Warm colors (red) represent relatively moderate sequence diversity, whereas cool colors (blue) represent low sequence diversity.
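As a rough illustration of what a k-mer-based, alignment-free comparison involves, the Python sketch below computes a Jaccard distance between the k-mer sets of two sequences. The source does not name the tool or the distance measure used, so the functions, the value of k and the toy sequences are all assumptions made for illustration. Reverse-complement (canonical) k-mers are not handled here.

```python
def kmer_set(seq: str, k: int) -> set:
    """Collect all k-mers of a sequence, skipping any that contain non-ACGT characters."""
    seq = seq.upper()
    valid = set("ACGT")
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    }


def jaccard_distance(seq_a: str, seq_b: str, k: int = 4) -> float:
    """1 minus the Jaccard index of the two k-mer sets (0.0 means identical k-mer content)."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy sequences; sample_3 differs from sample_1 by a single base.
sample_1 = "ATGGCTACCGTTAGCATGGCTA"
sample_2 = "ATGGCTACCGTTAGCATGGCTA"
sample_3 = "ATGGCTACCGTAAGCATGGCTA"

print(jaccard_distance(sample_1, sample_2))        # prints: 0.0
print(jaccard_distance(sample_1, sample_3) > 0.0)  # prints: True
```

Individuals sharing the same haplotype would have identical k-mer sets for the assembled cp genome, giving a distance of 0.0 in this simplified setting.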