Morteza Sheikh-Assadi, Roohangiz Naderi, Mohsen Kafi, Reza Fatahi, Seyed Alireza Salami, Vahid Shariati.
Abstract
Lilium ledebourii (Baker) Boiss is a rare species with valuable traits. However, its wild resources were jeopardized before its genetic diversity and evolutionary history could be uncovered. Moreover, some ambiguities in the phylogenetic relationships of this genus remain unresolved. Therefore, obtaining the whole chloroplast sequence of L. ledebourii and comparing it with those of other Lilium species is crucial to understanding the evolution of this genus as well as its population genetics. A multi-scale genome-level analysis, with particular attention to selection pressure, was conducted. Third-generation sequencing and analysis revealed a whole chloroplast genome of 151,884 bp with a typical, conserved quadripartite structure and a GC content of 37.0%. Overall, 113 different genes were recognized in the chloroplast genome, consisting of 30 distinct tRNA genes, four distinct rRNA genes, and 79 unique protein-coding genes. In total, 3234 SSRs and 2053 complex repeats were identified, and a comprehensive analysis was performed of IR expansion and contraction and of codon usage bias. Moreover, genome-wide sliding window analysis revealed that rpl32-trnL-ccsA, petD-rpoA, ycf1, psbI-trnS-trnG, rps15-ycf1, trnR, trnT-trnL, and trnP-psaJ-rpl33 were the more variable regions among the 48 Lilium cp genomes, with nucleotide variability higher in the SC regions. Across 1128 pairwise comparisons, ndhB, psbJ, psbZ, and ycf2 exhibited zero synonymous substitutions, indicating divergence or genetic restriction. Furthermore, out of 78 protein-coding genes, accD and rpl36 were found to be under positive selection; however, at the scale of the entire chloroplast proteome, the Lilium species have undergone purifying selection. A new phylogenetic tree for Lilium was also reconstructed, and we believe that it makes the Lilium classification clearer than before.
The genetic resources provided here will aid future studies in species identification, population genetics, and Lilium conservation.
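The counts quoted in the abstract can be cross-checked with simple arithmetic. The sketch below assumes that the 1128 pairwise comparisons are all unordered pairs of the 48 genomes; the abstract does not state this explicitly, and the variable names are illustrative.

```python
from math import comb

# Assumption: "pairwise comparisons" means every unordered pair of the 48 cp genomes.
n_genomes = 48
n_pairs = comb(n_genomes, 2)

# Gene tally as reported: tRNA + rRNA + protein-coding genes.
n_trna, n_rrna, n_protein = 30, 4, 79
n_genes = n_trna + n_rrna + n_protein

print(n_pairs)  # 1128, matching the number of comparisons reported
print(n_genes)  # 113, matching the total gene count reported
```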
Year: 2022 PMID: 35672390 PMCID: PMC9174193 DOI: 10.1038/s41598-022-13449-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. The chloroplast genome map of L. ledebourii. Transcriptional directions are represented on the circle's inside (clockwise) and outside (counterclockwise). Genes are color-coded according to their functional groups.
Gene content and functional classification of L. ledebourii chloroplast genome.
| Category | Gene group | Gene name |
|---|---|---|
| Photosynthesis pathways | ATP synthase | |
| | NADH-dehydrogenase | |
| | Cytochrome b/f complex | |
| | Photosystem I | |
| | Photosystem II | |
| | Rubisco | |
| Transcription and translation related genes | DNA-dependent RNA polymerase | |
| Ribosomal proteins | Large subunit of ribosomal proteins | |
| | Small subunit of ribosomal proteins | |
| RNA genes | Ribosomal RNA | |
| | Transfer RNA | |
| Other genes | Maturase K | |
| | Subunit of acetyl-CoA carboxylase | |
| | C-type cytochrome synthesis gene | |
| | Envelope membrane protein | |
| | ATP-dependent protease subunit P | |
| | Translational initiation factor | |
| | Conserved hypothetical open reading frames | |
*: gene with one intron; **: gene with two introns; (×2): duplicated gene; a: trans-spliced gene; ψ: pseudogene.
Figure 2. Comparison of the junction positions of the LSC, SSC, and IR regions among the 48 Lilium cp genomes. The red identifiers represent the GenBank accession number of each species.
Figure 3. Sliding window analysis of 48 Lilium cp genomes (window length: 600 bp; step size: 200 bp). The X-axis and Y-axis represent the position of each window and its nucleotide diversity (Pi), respectively.
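A sliding window scan of this kind can be sketched as follows, using the window length (600 bp) and step size (200 bp) given in the caption. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the input is assumed to be a list of equal-length aligned sequences, and positions with gaps or ambiguous bases are simply skipped pair by pair, which may differ from how dedicated population-genetics software treats them.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total = 0.0
    for a, b in pairs:
        compared = diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            # Assumption: skip any site where either sequence has a gap or ambiguity code.
            if x in "ACGT" and y in "ACGT":
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
    return total / len(pairs) if pairs else 0.0

def sliding_pi(alignment, window=600, step=200):
    """Return (window midpoint, Pi) for each window along the alignment."""
    length = len(alignment[0])
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in alignment]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results

# Toy alignment (made-up sequences) with a small window so the output is easy to follow.
toy = ["AAAAAAAA", "AAAAAAAT", "AAAAAAAA"]
for midpoint, pi in sliding_pi(toy, window=4, step=2):
    print(midpoint, round(pi, 4))
```

Windows with high Pi correspond to peaks in a plot like Figure 3; only the last window of the toy alignment contains a variable site.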
Figure 4. The type and distribution of simple sequence repeats (SSRs) and complex repeats in the 48 Lilium cp genomes. (A) Frequency and type of SSRs in the L. ledebourii cp genome. (B) The number of SSR types discovered in 48 Lilium cp genomes. (C) The percentage of SSR types in 48 Lilium cp genomes. (D) The number of complex repeat types in 48 Lilium cp genomes. (E) Frequency of complex repeats by size.
Dispersed and palindromic repeats by positions in the cp genome of L. ledebourii.
| N | Size (bp) | Start position 1 | Type | Start position 2 | E-value |
|---|---|---|---|---|---|
| 1 | 53 | 36,308 | D | 38,532 | 5.06E−17 |
| 2 | 43 | 43,951 | P | 43,951 | 1.08E−14 |
| 3 | 38 | 27,465 | P | 27,465 | 8.59E−14 |
| 4 | 34 | 112,143 | P | 112,143 | 2.20E−11 |
| 5 | 40 | 88,672 | D | 88,696 | 3.77E−11 |
| 6 | 40 | 88,672 | P | 144,558 | 3.77E−11 |
| 7 | 40 | 88,696 | P | 144,582 | 3.77E−11 |
| 8 | 40 | 144,558 | D | 144,582 | 3.77E−11 |
| 9 | 39 | 41,001 | D | 96,049 | 1.43E−10 |
| 10 | 39 | 41,001 | P | 137,206 | 1.43E−10 |
| 11 | 41 | 71,433 | P | 71,433 | 3.86E−10 |
| 12 | 35 | 91,474 | P | 91,474 | 5.77E−10 |
| 13 | 35 | 91,474 | D | 141,785 | 5.77E−10 |
| 14 | 35 | 141,785 | P | 141,785 | 5.77E−10 |
| 15 | 34 | 53,789 | P | 53,789 | 1.11E−07 |
| 16 | 35 | 33,896 | P | 33,896 | 9.71E−07 |
| 17 | 34 | 75,092 | D | 75,118 | 3.55E−06 |
| 18 | 31 | 7,496 | P | 42,412 | 5.89E−06 |
| 19 | 33 | 36,334 | D | 38,558 | 1.30E−05 |
| 20 | 30 | 3,602 | P | 68,653 | 2.20E−05 |
| 21 | 30 | 5,488 | P | 5,488 | 2.20E−05 |
| 22 | 30 | 120,861 | P | 120,896 | 2.20E−05 |
| 23 | 32 | 33,306 | P | 42,412 | 4.71E−05 |
| 24 | 31 | 7,496 | D | 33,307 | 1.71E−04 |
| 25 | 30 | 9,028 | D | 34,112 | 6.17E−04 |
| 26 | 30 | 29,142 | D | 29,414 | 6.17E−04 |
| 27 | 30 | 110,845 | P | 110,873 | 6.17E−04 |
Relative synonymous codon usage (RSCU) of L. ledebourii protein-coding genes.
| Codon | AA | ObsFreq | RSCU | Codon | AA | ObsFreq | RSCU |
|---|---|---|---|---|---|---|---|
| UAA | a | 68 | 1.468 | AUG | M | 552 | 1 |
| UAG | a | 41 | 0.885 | AAC | N | 216 | 0.422 |
| UGA | a | 30 | 0.647 | AAU | N | 808 | 1.578 |
| GCA | A | 345 | 1.151 | CCA | P | 254 | 1.121 |
| GCC | A | 186 | 0.621 | CCC | P | 184 | 0.812 |
| GCG | A | 125 | 0.417 | CCG | P | 116 | 0.512 |
| GCU | A | 543 | 1.812 | CCU | P | 352 | 1.554 |
| UGC | C | 62 | 0.486 | CAA | Q | 573 | 1.504 |
| UGU | C | 193 | 1.514 | CAG | Q | 189 | 0.496 |
| GAC | D | 175 | 0.411 | AGA | R | 401 | 1.519 |
| GAU | D | 676 | 1.589 | AGG | R | 127 | 0.481 |
| GAA | E | 859 | 1.511 | CGA | R | 298 | 1.53 |
| GAG | E | 278 | 0.489 | CGC | R | 81 | 0.416 |
| UUC | F | 435 | 0.691 | CGG | R | 109 | 0.56 |
| UUU | F | 824 | 1.309 | CGU | R | 291 | 1.494 |
| GGA | G | 588 | 1.581 | AGC | S | 86 | 0.383 |
| GGC | G | 171 | 0.46 | AGU | S | 363 | 1.617 |
| GGG | G | 253 | 0.68 | UCA | S | 352 | 1.149 |
| GGU | G | 476 | 1.28 | UCC | S | 259 | 0.846 |
| CAC | H | 108 | 0.408 | UCG | S | 144 | 0.47 |
| CAU | H | 421 | 1.592 | UCU | S | 470 | 1.535 |
| AUA | I | 647 | 1 | ACA | T | 353 | 1.253 |
| AUC | I | 365 | 0.564 | ACC | T | 205 | 0.728 |
| AUU | I | 929 | 1.436 | ACG | T | 115 | 0.408 |
| AAA | K | 837 | 1.499 | ACU | T | 454 | 1.611 |
| AAG | K | 280 | 0.501 | GUA | V | 456 | 1.482 |
| CUA | L | 289 | 1.115 | GUC | V | 161 | 0.523 |
| CUC | L | 160 | 0.617 | GUG | V | 167 | 0.543 |
| CUG | L | 135 | 0.521 | GUU | V | 447 | 1.452 |
| CUU | L | 453 | 1.747 | UGG | W | 399 | 1 |
| UUA | L | 782 | 1.271 | UAC | Y | 162 | 0.393 |
| UUG | L | 449 | 0.729 | UAU | Y | 662 | 1.607 |
a: stop codon.
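The RSCU values in the table are consistent with the usual definition: a codon's observed count divided by the mean count of the codons in its synonymous group. The sketch below recomputes two groups from the observed frequencies listed above. The function name is illustrative, and the grouping is an inference from the numbers rather than something the source states: the values for Leu only reproduce if the CUN codons are treated as their own four-codon family, separate from UUA and UUG.

```python
def rscu(counts):
    """RSCU for one synonymous group: observed count / mean count in the group."""
    mean = sum(counts.values()) / len(counts)
    return {codon: round(n / mean, 3) for codon, n in counts.items()}

# Observed frequencies copied from the table above.
asn = {"AAC": 216, "AAU": 808}
leu_cun = {"CUA": 289, "CUC": 160, "CUG": 135, "CUU": 453}  # assumed 4-codon family

print(rscu(asn))      # {'AAC': 0.422, 'AAU': 1.578}
print(rscu(leu_cun))  # {'CUA': 1.115, 'CUC': 0.617, 'CUG': 0.521, 'CUU': 1.747}
```

Values above 1 mark codons used more often than expected under equal usage; in both groups the preferred codon ends in U.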
Figure 5. Codon distribution of protein-coding genes among Lilium cp genomes. Color code: red denotes a higher RSCU and blue denotes a lower RSCU.
Figure 6. Ka/Ks ratios between Lilium cp genome pairs. The heatmap depicts the pairwise Ka/Ks ratios computed from the multigene nucleotide alignment of concatenated single-copy CDS sequences.
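Ka/Ks ratios like those in the heatmap are conventionally read against a threshold of 1: above 1 suggests positive selection, below 1 purifying selection, and close to 1 neutral evolution. The ratio is undefined when Ks is zero, which is the situation for genes with no synonymous substitutions. A minimal sketch of that reading follows; the function name and the example values are hypothetical and are not taken from the paper.

```python
def classify_selection(ka, ks):
    """Label a Ka/Ks comparison using the conventional threshold of 1."""
    if ks == 0:
        # No synonymous substitutions: the ratio cannot be computed.
        return "undefined (Ks = 0)"
    ratio = ka / ks
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying selection"
    return "neutral"

# Hypothetical (Ka, Ks) values for illustration only.
for ka, ks in [(0.004, 0.020), (0.030, 0.010), (0.002, 0.0)]:
    print(ka, ks, classify_selection(ka, ks))
```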
Figure 7. The phylogenetic relationships of Lilium species based on whole cp genome sequences. Fritillaria hupehensis and Fritillaria cirrhosa were used as outgroups. The phylogenetic tree was constructed by maximum likelihood (ML). The ML bootstrap values are represented by the numbers above the branches.