Yu Li1,2, Tian-Rui Wang2, Gregor Kozlowski2,3,4, Mei-Hua Liu1, Li-Ta Yi1, Yi-Gang Song1,2.
Abstract
Quercus litseoides, an endangered montane cloud forest species, is endemic to southern China. To understand the genomic features, phylogenetic relationships, and molecular evolution of Q. litseoides, its complete chloroplast (cp) genome was analyzed and compared within Quercus section Cyclobalanopsis. The cp genome of Q. litseoides was 160,782 bp in length, with an overall guanine and cytosine (GC) content of 36.9%. It contained 131 genes, including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. A total of 165 simple sequence repeats (SSRs) and 48 long sequence repeats with A/T bias were identified in the Q. litseoides cp genome; they were mainly distributed in the large single copy (LSC) region and in intergenic spacer regions. The Q. litseoides cp genome was similar in size, gene composition, and linearity of the structural regions to those of other Quercus species. The non-coding regions were more divergent than the coding regions, and the LSC and small single copy (SSC) regions were more divergent than the inverted repeat (IR) regions. Among the 13 divergent regions, 11 were in the LSC region and only two were in the SSC region. Moreover, the coding sequences (CDS) of six protein-coding genes (rps12, matK, atpF, rpoC2, rpoC1, and ndhK) were under positive selection in pairwise comparisons of 16 species of Quercus section Cyclobalanopsis. The phylogenetic analysis of cp genomes showed a close relationship between Q. litseoides and Quercus edithiae. Our study provides highly effective molecular markers for subsequent phylogenetic analysis, species identification, and biogeographic analysis of Quercus.
Keywords: Cyclobalanopsis litseoides; Fagaceae; montane cloud forests; plastome; repeat sequences
Year: 2022 PMID: 35885967 PMCID: PMC9316884 DOI: 10.3390/genes13071184
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Chloroplast genome structure and feature of Q. litseoides. Abbreviations: LSC (Large Single Copy), SSC (Small Single Copy), IR (Inverted Repeat), PCGs (protein-coding genes), tRNA (Transfer RNA genes), and rRNA (Ribosomal RNA genes).
| Genome Feature | | Length (bp)/Numbers | GC Content (%) |
|---|---|---|---|
| Structure length | Total | 160,782 | 36.9 |
| | LSC region | 90,235 | 34.74 |
| | SSC region | 18,867 | 31.13 |
| | IR (a/b) region | 25,840 | 42.77 |
| Gene numbers of different categories | Genes | 131 | 39.5 |
| | PCGs | 86 | 37.88 |
| | tRNA | 37 | 53.2 |
| | rRNA | 8 | 55.49 |
| Gene numbers of different regions | LSC region | 61 (PCGs) and 22 (tRNA) | No information |
| | SSC region | 11 (PCGs) and 1 (tRNA) | No information |
| | IR regions | 14 (PCGs), 14 (tRNA) and 8 (rRNA) | No information |
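The structural figures in the table can be cross-checked with a little arithmetic. The sketch below assumes that both IR copies (IRa and IRb) have the listed length and GC content, which the "IR (a/b) region" row suggests but does not state outright; the variable names are illustrative only.

```python
# Region lengths (bp) and GC percentages as listed in the table.
# Assumption: IRa and IRb share the same length and GC content.
regions = {
    "LSC": (90_235, 34.74),
    "SSC": (18_867, 31.13),
    "IRa": (25_840, 42.77),
    "IRb": (25_840, 42.77),
}

# The four regions should add up to the reported genome length.
total_length = sum(length for length, _ in regions.values())

# Overall GC content is the length-weighted mean of the regional values.
gc_bases = sum(length * gc / 100 for length, gc in regions.values())
overall_gc = 100 * gc_bases / total_length

print(total_length)          # 160782
print(round(overall_gc, 1))  # 36.9
```

Both values agree with the reported total length (160,782 bp) and overall GC content (36.9%).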
Figure 1. Gene map of the chloroplast genome of Q. litseoides. The map has four circles. Outward from the center, the first circle shows forward and reverse repeats connected by red and green arcs, respectively. The second circle shows tandem repeats marked with short bars. The third circle shows the SSRs identified by MISA. The fourth circle, drawn with drawgenemap, displays the gene structure of the chloroplast genome. Genes shown outside the circle are transcribed clockwise, while those inside the circle are transcribed counterclockwise. Genes belonging to different functional groups are shown in different colors.
Genetic classification of the chloroplast genome of Q. litseoides. Genes marked with * or ** contain one or two introns, respectively. Duplicated genes located in the IR regions are marked with (×2).
| Category | Group | Name |
|---|---|---|
| Transcription and translation | Translational initiation factor | |
| | Ribosomal RNAs | |
| | Transfer RNAs | |
| | Small subunit of ribosome (SSU) | |
| | Large subunit of ribosome (LSU) | |
| | DNA-dependent RNA polymerase | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | Subunit of cytochrome | |
| | ATP synthase | |
| | RubisCO large subunit | |
| | NADH dehydrogenase | |
| Biosynthesis | Maturase | |
| | ATP-dependent protease | |
| | Acetyl-CoA carboxylase | |
| | Envelope membrane protein | |
| | C-type cytochrome synthesis | |
| Unknown | Hypothetical chloroplast reading frames( | |
Number of simple sequence repeats (SSRs) in the chloroplast genome of Q. litseoides. Abbreviations: LSC (Large Single Copy), SSC (Small Single Copy), IRs (Inverted Repeats), IGS (Intergenic Spacer Regions), GR (Gene Regions).
| Repeat Type | Repeat Unit | Number (Proportion) of SSRs | Region: LSC | Region: SSC | Region: IRs | Location: IGS | Location: GR |
|---|---|---|---|---|---|---|---|
| Mononucleotides | A/T | 77 (46.67%) | 60 | 11 | 6 | 59 | 18 |
| | C/G | 5 (3.03%) | 5 | 0 | 0 | 3 | 2 |
| Dinucleotides | AG/CT | 19 (11.52%) | 2 | 1 | 16 | 5 | 14 |
| | AT/AT | 43 (26.06%) | 29 | 4 | 10 | 28 | 15 |
| Trinucleotides | AAG/CTT | 1 (0.61%) | 0 | 1 | 0 | 0 | 1 |
| | AAT/ATT | 6 (3.64%) | 4 | 2 | 0 | 3 | 3 |
| Tetranucleotides | AAAT/ATTT | 8 (4.85%) | 7 | 1 | 0 | 5 | 3 |
| | AATG/ATTC | 1 (0.61%) | 1 | 0 | 0 | 1 | 0 |
| | AATT/AATT | 2 (1.21%) | 2 | 0 | 0 | 1 | 1 |
| Pentanucleotides | AAAAT/ATTTT | 1 (0.61%) | 0 | 1 | 0 | 0 | 1 |
| | AATGC/ATTGC | 2 (1.21%) | 0 | 0 | 2 | 2 | 0 |
| Total | | 165 | 110 (66.7%) | 21 (12.7%) | 34 (20.6%) | 107 (64.8%) | 58 (35.2%) |
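To illustrate what an SSR count of this kind measures, here is a minimal sketch of a perfect-repeat scanner. It is not MISA, the tool named in the Figure 1 caption, and the minimum repeat counts in `MIN_REPEATS` are hypothetical values chosen for the example, not thresholds reported in this record. The function names and the toy sequence are likewise illustrative.

```python
# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            copies = 1
            # Extend the run while the next k bases equal the motif.
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            if copies >= n and is_primitive(motif) and set(motif) <= set("ACGT"):
                hits.append((i, motif, copies))
                i += copies * k  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits)


# Toy sequence: a 12-base A run and six copies of AT, separated by spacers.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
print(find_ssrs(toy))  # [(2, 'A', 12), (18, 'AT', 6)]
```

The scanner reports motifs exactly as they occur on the given strand; grouping a motif with its reverse complement (for example A/T or AG/CT, as in the table) would be an extra step.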
The long repeat sequences, including minisatellite sequences (M), forward repeat sequences (F), reverse repeat sequences (R), complementary repeat sequences (C), and palindromic repeat sequences (P), in the cp genome of Q. litseoides.
| No. | Repeat Type | Repeat Length (bp) | Region | Location | No. | Repeat Type | Repeat Length (bp) | Region | Location |
|---|---|---|---|---|---|---|---|---|---|
| 1 | M | 19 | LSC | IGS | 25 | R | 33 | LSC, LSC | |
| 2 | M | 20 | IRa | | 26 | C | 34 | LSC, LSC | IGS |
| 3 | M | 21 | IRa | | 27 | C | 30 | LSC, LSC | IGS ( |
| 4 | M | 31 | IRa | IGS | 28 | P | 56 | SSC, SSC | IGS ( |
| 5 | M | 31 | IRb | IGS | 29 | P | 44 | LSC, LSC | IGS ( |
| 6 | M | 21 | IRb | | 30 | P | 40 | IRa, IRb | |
| 7 | M | 20 | IRb | | 31 | P | 40 | IRa, IRb | |
| 8 | F | 40 | IRa, IRa | | 32 | P | 38 | LSC, LSC | IGS ( |
| 9 | F | 40 | IRb, IRb | | 33 | P | 34 | SSC, SSC | |
| 10 | F | 39 | LSC, IRa | | 34 | P | 39 | LSC, IRb | |
| 11 | F | 40 | IRa, SSC | IGS | 35 | P | 40 | SSC, IRb | |
| 12 | F | 30 | IRa, IRa | IGS | 36 | P | 39 | LSC, LSC | IGS |
| 13 | F | 30 | IRb, IRb | IGS | 37 | P | 30 | LSC, LSC | |
| 14 | F | 30 | LSC, LSC | | 38 | P | 30 | IRa, IRb | IGS |
| 15 | F | 30 | LSC, IRa | | 39 | P | 30 | IRa, IRb | IGS |
| 16 | F | 30 | IRa, SSC | IGS | 40 | P | 32 | LSC, LSC | IGS |
| 17 | F | 30 | IRa, IRb | | 41 | P | 30 | LSC, IRb | |
| 18 | F | 32 | IRa, IRa | | 42 | P | 30 | IRa, IRa | |
| 19 | F | 32 | IRb, IRb | | 43 | P | 30 | SSC, IRb | |
| 20 | F | 30 | LSC, LSC | | 44 | P | 30 | IRb, IRb | |
| 21 | F | 30 | LSC, LSC | | 45 | P | 32 | LSC, LSC | IGS ( |
| 22 | R | 31 | LSC, LSC | IGS | 46 | P | 32 | IRa, IRb | |
| 23 | R | 31 | LSC, LSC | | 47 | P | 32 | IRa, IRb | |
| 24 | R | 31 | LSC, LSC | IGS ( | 48 | P | 30 | LSC, LSC | |
Figure 2. Comparison of the junction regions (JLA, JLB, JSB, JSA) of the chloroplast genomes of Quercus section Cyclobalanopsis.
Figure 3. Sliding window analysis of 16 chloroplast genomes of Quercus section Cyclobalanopsis. The x-axis shows the position of the window midpoint, and the y-axis shows the nucleotide diversity (Pi) per window.
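As a rough illustration of the quantity plotted in Figure 3, the sketch below computes nucleotide diversity (Pi) as the mean proportion of differing sites over all sequence pairs, window by window. It is not the software used in the study. The window and step sizes are hypothetical (none are given in this record), gap characters are treated like any other symbol, and the three-sequence alignment is a toy input.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean per-site pairwise difference among equal-length aligned sequences."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return diffs / (len(pairs) * length)


def sliding_pi(alignment, window, step):
    """Return (window midpoint, Pi) for each full window along the alignment."""
    length = len(alignment[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in alignment]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment: the first window is invariant, the second is variable.
toy_alignment = ["AAAATTTT", "AAAATTTA", "AAAATTAA"]
pi_profile = sliding_pi(toy_alignment, window=4, step=4)
for midpoint, pi in pi_profile:
    print(midpoint, round(pi, 4))
# 2 0.0
# 6 0.3333
```

Windows with high Pi correspond to the peaks that a plot like Figure 3 would flag as divergent regions.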
Figure 4. The Ka/Ks (ω) values of 37 shared functional protein-coding genes of 16 chloroplast genomes of Quercus section Cyclobalanopsis.
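The ω values in Figure 4 are usually read against the conventional threshold of 1: ω > 1 suggests positive selection, ω < 1 purifying selection, and ω = 1 neutrality. The helper below is a small sketch of that reading. The function name and the example Ka and Ks numbers are hypothetical and are not taken from the study.

```python
def classify_omega(ka, ks):
    """Label one pairwise comparison by its Ka/Ks ratio."""
    if ks == 0:
        # Without synonymous substitutions the ratio cannot be computed.
        return "undefined"
    omega = ka / ks
    if omega > 1:
        return "positive"
    if omega < 1:
        return "purifying"
    return "neutral"


# Hypothetical values for illustration only.
print(classify_omega(0.02, 0.01))   # positive
print(classify_omega(0.001, 0.01))  # purifying
print(classify_omega(0.01, 0.0))    # undefined
```

Handling Ks = 0 explicitly matters for chloroplast genes, where closely related species often show no synonymous differences in a given gene.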
Figure 5. Phylogenetic tree of 37 homologous chloroplast genome sequences, inferred with the maximum likelihood (ML) method. Values beside the branches represent bootstrap support (BS). Abbreviations: Quercus (Q.), Trigonobalanus (T.), Fagus (F.), and Juglans (J.).