Xuan Li, Yongfu Li, Mingyue Zang, Mingzhi Li, Yanming Fang.
Abstract
Quercus acutissima, an important endemic and ecological plant of the Quercus genus, is widely distributed throughout China. However, there have been few studies on its chloroplast genome. In this study, the complete chloroplast (cp) genome of Q. acutissima was sequenced, analyzed, and compared to four species in the Fagaceae family. The size of the Q. acutissima chloroplast genome is 161,124 bp, including one large single copy (LSC) region of 90,423 bp and one small single copy (SSC) region of 19,068 bp, separated by two inverted repeat (IR) regions of 51,632 bp. The GC content of the whole genome is 36.08%, while those of LSC, SSC, and IR are 34.62%, 30.84%, and 42.78%, respectively. The Q. acutissima chloroplast genome encodes 136 genes, including 88 protein-coding genes, four ribosomal RNA genes, and 40 transfer RNA genes. In the repeat structure analysis, 31 forward and 22 inverted long repeats and 65 simple-sequence repeat loci were detected in the Q. acutissima cp genome. The abundance of simple-sequence repeat loci in the genome suggests potential for future population genetic work. The genome comparison revealed that the LSC region is more divergent than the SSC and IR regions, and that divergence is higher in noncoding regions than in coding regions. The phylogenetic analysis of 25 species indicated that members of the Quercus genus do not form a clade and that Q. acutissima is closely related to Q. variabilis. This study identified the unique characteristics of the Q. acutissima cp genome, which will provide a theoretical basis for species identification and biological research.
Keywords: Quercus; chloroplast genome; phylogenetic relationship
Year: 2018 PMID: 30126202 PMCID: PMC6121628 DOI: 10.3390/ijms19082443
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Chloroplast genome map of Q. acutissima. Genes inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes of different functions are color-coded. The darker gray in the inner circle shows the GC content, while the lighter gray shows the AT content.
Summary of five Quercus chloroplast genome features.
| Genome Features | Q. acutissima | | | | | |
|---|---|---|---|---|---|---|
| Genome size (bp) | 161,124 | 161,077 | 161,237 | 160,799 | 161,020 | 158,346 |
| LSC length (bp) | 90,423 | 90,387 | 90,461 | 90,432 | 90,596 | 87,667 |
| SSC length (bp) | 19,068 | 19,056 | 19,048 | 18,995 | 19,160 | 18,895 |
| IR length (bp) | 51,632 | 51,634 | 51,728 | 51,372 | 51,264 | 51,784 |
| Number of genes | 136 | 134 | 134 | 130 | 134 | 131 |
| Number of protein–coding genes | 88 | 86 | 86 | 83 | 87 | 83 |
| Number of tRNA genes | 40 | 40 | 40 | 37 | 39 | 40 |
| Number of rRNA genes | 8 | 8 | 8 | 8 | 8 | 8 |
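
The IR row appears to report the combined length of both inverted repeat copies, since adding it to the LSC and SSC lengths reproduces the genome size. A minimal sketch of that check follows; the values are copied from the table above, and the name `region_gap` is introduced here only for illustration. The first column comes out 1 bp short, which the table itself does not explain.

```python
# Values copied from the genome-features table, one tuple per column:
# (genome size, LSC length, SSC length, IR length), all in bp.
columns = [
    (161_124, 90_423, 19_068, 51_632),
    (161_077, 90_387, 19_056, 51_634),
    (161_237, 90_461, 19_048, 51_728),
    (160_799, 90_432, 18_995, 51_372),
    (161_020, 90_596, 19_160, 51_264),
    (158_346, 87_667, 18_895, 51_784),
]


def region_gap(size: int, lsc: int, ssc: int, ir: int) -> int:
    """Return genome size minus the sum of the LSC, SSC, and IR lengths."""
    return size - (lsc + ssc + ir)


# A gap of 0 means the three region lengths account for the whole genome.
gaps = [region_gap(*col) for col in columns]
print(gaps)  # → [1, 0, 0, 0, 0, 0]
```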
Figure A1. BLAST result of the chloroplast genome and the GC skew of Q. acutissima. BLAST 1 represents L. balansae; BLAST 2 represents Q. variabilis; BLAST 3 represents Q. dolicholepis.
Base composition of the Q. acutissima chloroplast genome.
| Region | A (%) | T (U) (%) | C (%) | G (%) | A + T (%) | G + C (%) |
|---|---|---|---|---|---|---|
| LSC | 31.99 | 33.4 | 17.74 | 16.88 | 65.39 | 34.62 |
| SSC | 34.46 | 34.71 | 16.24 | 14.6 | 69.17 | 30.84 |
| IR | 28.61 | 28.61 | 21.39 | 21.39 | 57.22 | 42.78 |
| Total | 31.69 | 32.24 | 18.46 | 17.62 | 63.93 | 36.08 |
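
The A + T and G + C columns are simple sums of the per-base percentages. The sketch below recomputes them from the four base columns of the table above; the helper name `at_gc` is an assumption made for this example, and the results are rounded to two decimals as in the table.

```python
# Per-base percentages copied from the table: (A, T, C, G) for each region.
composition = {
    "LSC": (31.99, 33.40, 17.74, 16.88),
    "SSC": (34.46, 34.71, 16.24, 14.60),
    "IR": (28.61, 28.61, 21.39, 21.39),
    "Total": (31.69, 32.24, 18.46, 17.62),
}


def at_gc(a: float, t: float, c: float, g: float) -> tuple[float, float]:
    """Return (A+T, G+C) percentages rounded to two decimals."""
    return round(a + t, 2), round(c + g, 2)


summary = {region: at_gc(*bases) for region, bases in composition.items()}
for region, (at, gc) in summary.items():
    print(f"{region}: A+T = {at:.2f}%, G+C = {gc:.2f}%")
```

For the whole genome this gives a G + C value of 36.08%, matching the figure quoted in the abstract.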
List of genes annotated in the cp genome of Q. acutissima sequenced in this study.
| Function | Genes |
|---|---|
| RNAs, transfer | |
| RNAs, ribosomal | |
| Transcription and splicing | |
| Translation, ribosomal proteins | |
| Small subunit | |
| Large subunit | |
| Photosynthesis | |
| ATP synthase | |
| Photosystem I | |
| Photosystem II | |
| Calvin cycle | |
| Cytochrome complex | |
| NADH dehydrogenase | |
| Others | |
* Genes containing one intron; ** genes containing two introns.
The number of genes in the Q. acutissima cp genome.
| Region | Number of CDS | Number of tRNA | Number of rRNA | Total |
|---|---|---|---|---|
| LSC region | 62 | 25 | 0 | 87 |
| SSC region | 13 | 1 | 0 | 14 |
| IRA region | 6 | 7 | 4 | 17 |
| IRB region | 7 | 7 | 4 | 18 |
Codon-anticodon recognition patterns and codon usage of the Q. acutissima chloroplast genome.
| Amino Acid | Codon | No. | RSCU | tRNA | Amino Acid | Codon | No. | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Ala | GCG | 164 | 0.47 | | Pro | CCA | 313 | 1.13 | |
| Ala | GCC | 224 | 0.64 | | Pro | CCC | 226 | 0.82 | |
| Ala | GCU | 630 | 1.79 | | Pro | CCU | 409 | 1.48 | |
| Ala | GCA | 388 | 1.1 | | Pro | CCG | 161 | 0.58 | |
| Cys | UGU | 221 | 1.44 | | Gln | CAG | 215 | 0.45 | |
| Cys | UGC | 86 | 0.56 | | Gln | CAA | 731 | 1.55 | |
| Asp | GAC | 209 | 0.39 | | Arg | CGU | 337 | 1.26 | |
| Asp | GAU | 870 | 1.61 | | Arg | AGA | 500 | 1.87 | |
| Glu | GAA | 1064 | 1.5 | | Arg | CGA | 358 | 1.34 | |
| Glu | GAG | 357 | 0.5 | | Arg | AGG | 183 | 0.68 | |
| Phe | UUU | 983 | 1.3 | | Arg | CGG | 118 | 0.44 | |
| Phe | UUC | 535 | 0.7 | | Arg | CGC | 109 | 0.41 | |
| Gly | GGU | 580 | 1.27 | | Ser | AGC | 125 | 0.37 | |
| Gly | GGG | 330 | 0.72 | | Ser | UCU | 557 | 1.66 | |
| Gly | GGA | 706 | 1.55 | | Ser | UCA | 397 | 1.18 | |
| Gly | GGC | 206 | 0.45 | | Ser | UCC | 349 | 1.04 | |
| His | CAU | 486 | 1.54 | | Ser | AGU | 391 | 1.17 | |
| His | CAC | 145 | 0.46 | | Ser | UCG | 193 | 0.58 | |
| Ile | AUC | 458 | 0.58 | | Thr | ACU | 538 | 1.6 | |
| Ile | AUA | 758 | 0.97 | | Thr | ACG | 160 | 0.48 | |
| Ile | AUU | 1139 | 1.45 | | Thr | ACC | 247 | 0.73 | |
| Lys | AAG | 379 | 0.5 | | Thr | ACA | 402 | 1.19 | |
| Lys | AAA | 1062 | 1.4 | | Val | GUU | 508 | 1.41 | |
| Leu | UUG | 572 | 1.22 | | Val | GUC | 181 | 0.5 | |
| Leu | UUA | 894 | 1.9 | | Val | GUA | 547 | 1.52 | |
| Leu | CUU | 583 | 1.24 | | Val | GUG | 207 | 0.57 | |
| Leu | CUA | 373 | 0.79 | | Trp | UGG | 462 | 1 | |
| Leu | CUC | 204 | 0.43 | | Tyr | UAC | 212 | 0.42 | |
| Leu | CUG | 198 | 0.42 | | Tyr | UAU | 792 | 1.58 | |
| Met | AUG | 620 | 1 | | Stop | UAA | 47 | 1.6 | |
| Asn | AAU | 1004 | 1.5 | | Stop | UAG | 22 | 0.75 | |
| Asn | AAC | 304 | 0.46 | | Stop | UGA | 19 | 0.65 | |
RSCU: Relative Synonymous Codon Usage.
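
RSCU is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies that standard definition to the Ala and Pro counts from the table above; the function name `rscu` is chosen here for illustration and is not taken from the study's own pipeline.

```python
def rscu(counts: dict[str, int]) -> dict[str, float]:
    """Relative synonymous codon usage for one amino acid.

    RSCU = observed count / (total count / number of synonymous codons),
    rounded to two decimals to match the table.
    """
    total = sum(counts.values())
    n_codons = len(counts)
    return {codon: round(n * n_codons / total, 2) for codon, n in counts.items()}


# Codon counts copied from the table.
ala = {"GCG": 164, "GCC": 224, "GCU": 630, "GCA": 388}
pro = {"CCA": 313, "CCC": 226, "CCU": 409, "CCG": 161}

print(rscu(ala))  # → {'GCG': 0.47, 'GCC': 0.64, 'GCU': 1.79, 'GCA': 1.1}
print(rscu(pro))  # → {'CCA': 1.13, 'CCC': 0.82, 'CCU': 1.48, 'CCG': 0.58}
```

Values above 1 mark codons used more often than expected under equal usage; for both amino acids the preferred codon ends in U.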
The lengths of exons and introns in genes with introns in the Q. acutissima chloroplast genome.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 42 | 898 | 195 | | |
| | LSC | 144 | 780 | 411 | | |
| | LSC | 432 | 827 | 1626 | | |
| | LSC | 127 | 718 | 228 | 778 | 155 |
| | LSC | 69 | 844 | 294 | 649 | 228 |
| | LSC | 6 | 841 | 642 | | |
| | LSC | 9 | 640 | 474 | | |
| | LSC | 9 | 1102 | 399 | | |
| | RepeatA | 390 | 628 | 471 | | |
| | RepeatA | 777 | 680 | 756 | | |
| | RepeatA | 10 | 537 | 231 | | |
| | SSC | 551 | 1040 | 541 | | |
| | RepeatB | 232 | 536 | 26 | | |
| | RepeatB | 777 | 680 | 756 | | |
| | RepeatB | 390 | 628 | 471 | | |
| | LSC | 23 | 734 | 37 | | |
| | LSC | 37 | 2505 | 35 | | |
| | LSC | 35 | 483 | 50 | | |
| | LSC | 36 | 630 | 37 | | |
| | RepeatA | 42 | 950 | 35 | | |
| | RepeatA | 38 | 800 | 35 | | |
| | RepeatB | 38 | 800 | 35 | | |
| | RepeatB | 42 | 950 | 35 | | |
Figure 2. Complete chloroplast genome comparison of six species using the chloroplast genome of Q. variabilis as a reference. The grey arrows and thick black lines above the alignment indicate the genes’ orientations. The Y-axis represents the identity from 50% to 100%.
Figure A2. Percentage of variation in the complete cp genomes of the six species. The regions are oriented according to their locations in the genome.
Figure 3. Comparison of the large single copy (LSC), small single copy (SSC), and inverted repeat (IR) regions in chloroplast genomes of four species. Genes are denoted by colored boxes. The gaps between the genes and the boundaries are indicated by the base lengths (bp). Extensions of the genes are indicated above the boxes.
Long repeat sequences in the Q. acutissima chloroplast genome.
| ID | Repeat Start 1 | Type | Size (bp) | Repeat Start 2 | Mismatch (bp) | E-Value | Gene | Region |
|---|---|---|---|---|---|---|---|---|
| 1 | 6831 | F | 46 | 6853 | 0 | 1.47 × 10⁻¹⁸ | IGS | LSC |
| 2 | 11,847 | R | 31 | 11,847 | 0 | 1.58 × 10⁻⁹ | IGS | LSC |
| 3 | 6818 | R | 26 | 6818 | 0 | 1.62 × 10⁻⁶ | | LSC |
| 4 | 47,242 | F | 25 | 47,264 | 0 | 6.49 × 10⁻⁶ | IGS | LSC |
| 5 | 6831 | F | 24 | 6875 | 0 | 2.59 × 10⁻⁵ | IGS | LSC |
| 6 | 115,801 | F | 24 | 135,722 | 0 | 2.59 × 10⁻⁵ | | IRA; IRB |
| 7 | 113,545 | F | 23 | 113,576 | 0 | 1.04 × 10⁻⁴ | IGS | IRA |
| 8 | 118,844 | R | 23 | 118,844 | 0 | 1.04 × 10⁻⁴ | IGS | IRA |
| 9 | 137,948 | F | 23 | 137,979 | 0 | 1.04 × 10⁻⁴ | IGS | IRB |
| 10 | 11,371 | F | 22 | 41,193 | 0 | 4.15 × 10⁻⁴ | | LSC |
| 11 | 9536 | F | 21 | 39,849 | 0 | 1.66 × 10⁻³ | | LSC |
| 12 | 10,319 | F | 21 | 18,682 | 0 | 1.66 × 10⁻³ | IGS | LSC |
| 13 | 117,049 | R | 21 | 117,049 | 0 | 1.66 × 10⁻³ | | SSC |
| 14 | 36,478 | F | 20 | 53,719 | 0 | 6.64 × 10⁻³ | IGS | LSC |
| 15 | 53,720 | F | 20 | 130,481 | 0 | 6.64 × 10⁻³ | IGS | LSC; SSC |
| 16 | 55,907 | R | 20 | 55,907 | 0 | 6.64 × 10⁻³ | | LSC |
| 17 | 57,271 | F | 20 | 142,064 | 0 | 6.64 × 10⁻³ | | LSC; IRB |
| 18 | 105,331 | F | 20 | 105,349 | 0 | 6.64 × 10⁻³ | IGS | IRA |
| 19 | 146,178 | F | 20 | 146,196 | 0 | 6.64 × 10⁻³ | IGS | IRB |
| 20 | 4930 | F | 19 | 36,476 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 21 | 8915 | R | 19 | 8915 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 22 | 13,541 | R | 19 | 76,642 | 0 | 2.66 × 10⁻² | | LSC |
| 23 | 18,685 | R | 19 | 118,842 | 0 | 2.66 × 10⁻² | | LSC; SSC |
| 24 | 21,297 | R | 19 | 54,183 | 0 | 2.66 × 10⁻² | | LSC |
| 25 | 36,479 | F | 19 | 130,481 | 0 | 2.66 × 10⁻² | IGS | LSC; SSC |
| 26 | 39,957 | R | 19 | 39,957 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 27 | 62,040 | R | 19 | 62,040 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 28 | 64,751 | R | 19 | 64,751 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 29 | 69,026 | R | 19 | 69,026 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 30 | 71,277 | R | 19 | 71,277 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 31 | 72,561 | R | 19 | 72,561 | 0 | 2.66 × 10⁻² | IGS | LSC |
| 32 | 4430 | R | 18 | 4430 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 33 | 4437 | F | 18 | 24,828 | 0 | 1.06 × 10⁻¹ | | SSC |
| 34 | 4935 | F | 18 | 52,105 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 35 | 4938 | F | 18 | 118,695 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 36 | 6813 | F | 18 | 6847 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 37 | 6813 | F | 18 | 6869 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 38 | 6817 | F | 18 | 127,945 | 0 | 1.06 × 10⁻¹ | | LSC |
| 39 | 7369 | F | 18 | 7387 | 0 | 1.06 × 10⁻¹ | IGS | LSC; SSC |
| 40 | 7465 | R | 18 | 7465 | 0 | 1.06 × 10⁻¹ | IGS | LSC; SSC |
| 41 | 8589 | R | 18 | 34,768 | 0 | 1.06 × 10⁻¹ | IGS | LSC; SSC |
| 42 | 9996 | R | 18 | 9996 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 43 | 10,283 | F | 18 | 31,730 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 44 | 10,322 | R | 18 | 118,843 | 0 | 1.06 × 10⁻¹ | IGS | LSC; IRA |
| 45 | 10,548 | F | 18 | 133,365 | 0 | 1.06 × 10⁻¹ | | LSC |
| 46 | 31,728 | F | 18 | 125,951 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 47 | 39,812 | F | 18 | 40,698 | 0 | 1.06 × 10⁻¹ | | LSC; SSC |
| 48 | 40,022 | R | 18 | 69,093 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 49 | 40,700 | F | 18 | 123,827 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 50 | 43,446 | F | 18 | 45,670 | 0 | 1.06 × 10⁻¹ | | SSC |
| 51 | 40,022 | R | 18 | 69,093 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 52 | 40,700 | F | 18 | 123,827 | 0 | 1.06 × 10⁻¹ | IGS | LSC |
| 53 | 43,446 | F | 18 | 45,670 | 0 | 1.06 × 10⁻¹ | | LSC |
F: forward; I: inverted; IGS: intergenic space.
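
The abstract reports 31 forward and 22 inverted long repeats. As a quick cross-check, the sketch below tallies the Type column of the table above, transcribed in ID order. It yields 31 F and 22 R entries, which suggests (as an assumption, since the legend lists I rather than R) that R marks the inverted class in this table.

```python
from collections import Counter

# Type column of the long-repeat table, IDs 1-53, ten entries per chunk.
type_column = "".join([
    "FRRFFFFRFF",  # IDs 1-10
    "FFRFFRFFFF",  # IDs 11-20
    "RRRRFRRRRR",  # IDs 21-30
    "RRFFFFFFFR",  # IDs 31-40
    "RRFRFFFRFF",  # IDs 41-50
    "RFF",         # IDs 51-53
])

# Count how many repeats carry each type code.
type_counts = Counter(type_column)
print(len(type_column), dict(type_counts))
```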
Simple sequence repeats (SSRs) in the Q. acutissima chloroplast genome.
| ID | Repeat Motif | Length (bp) | Start | End | Region | Gene | ID | Repeat Motif | Length (bp) | Start | End | Region | Gene |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | (A)10 | 9 | 1809 | 1818 | LSC | | 34 | (T)10 | 9 | 55,713 | 55,722 | LSC | |
| 2 | (C)14 | 13 | 4433 | 4446 | LSC | | 35 | (T)10 | 9 | 59,591 | 59,600 | LSC | |
| 3 | (T)11 | 10 | 4697 | 4707 | LSC | | 36 | (T)10 | 9 | 60,063 | 60,072 | LSC | |
| 4 | (A)10 | 9 | 4939 | 4948 | LSC | | 37 | (T)10 | 9 | 64,092 | 64,101 | LSC | |
| 5 | (T)11 | 10 | 7001 | 7011 | LSC | | 38 | (A)11 | 10 | 64,266 | 64,276 | LSC | |
| 6 | (T)10 | 9 | 7746 | 7755 | LSC | | 39 | (AT)7 | 13 | 64,570 | 64,583 | LSC | |
| 7 | (A)10 | 9 | 8174 | 8183 | LSC | | 40 | (T)14 | 13 | 64,945 | 64,958 | LSC | |
| 8 | (A)12 | 11 | 8590 | 8601 | LSC | | 41 | (T)13 | 12 | 66,170 | 66,182 | LSC | |
| 9 | (A)11 | 10 | 8920 | 8930 | LSC | | 42 | (T)11 | 10 | 68,616 | 68,626 | LSC | |
| 10 | (A)10 | 9 | 9465 | 9474 | LSC | | 43 | (T)11 | 10 | 70,730 | 70,740 | LSC | |
| 11 | (A)10 | 9 | 10,161 | 10,170 | LSC | | 44 | (T)11 | 10 | 71,398 | 71,408 | LSC | |
| 12 | (A)11 | 10 | 13,547 | 13,557 | LSC | | 45 | (T)11 | 10 | 73,389 | 73,399 | LSC | |
| 13 | (T)12 | 11 | 15,345 | 15,356 | LSC | | 46 | (AT)6 | 11 | 77,274 | 77,285 | LSC | |
| 14 | (T)10 | 9 | 16,160 | 16,169 | LSC | | 47 | (TA)7 | 13 | 82,928 | 82,941 | LSC | |
| 15 | (A)12 | 11 | 18,692 | 18,703 | LSC | | 48 | (A)11 | 10 | 85,781 | 85,791 | LSC | |
| 16 | (T)12 | 11 | 21,295 | 21,306 | LSC | | 49 | (T)10 | 9 | 86,100 | 86,109 | LSC | |
| 17 | (T)14 | 13 | 25,299 | 25,312 | LSC | | 50 | (T)10 | 9 | 88,820 | 88,829 | LSC | |
| 18 | (T)10 | 9 | 28,563 | 28,572 | LSC | | 51 | (T)11 | 10 | 114,070 | 114,080 | IRA | |
| 19 | (T)10 | 9 | 29,651 | 29,660 | LSC | | 52 | (T)12 | 11 | 118,582 | 118,593 | SSC | |
| 20 | (T)11 | 10 | 30,275 | 30,285 | LSC | | 53 | (A)11 | 10 | 118,695 | 118,705 | SSC | |
| 21 | (C)14 | 13 | 30,428 | 30,441 | LSC | | 54 | (T)11 | 10 | 119,000 | 119,010 | SSC | |
| 22 | (T)11 | 10 | 31,731 | 31,741 | LSC | | 55 | (A)10 | 9 | 119,794 | 119,803 | SSC | |
| 23 | (A)10 | 9 | 32,094 | 32,103 | LSC | | 56 | (T)11 | 10 | 122,199 | 122,209 | SSC | |
| 24 | (A)10 | 9 | 33,986 | 33,995 | LSC | | 57 | (A)10 | 9 | 122,546 | 122,555 | SSC | |
| 25 | (A)13 | 12 | 34,775 | 34,787 | LSC | | 58 | (AT)8 | 15 | 123,832 | 123,847 | SSC | |
| 26 | (A)10 | 9 | 34,955 | 34,964 | LSC | | 59 | (T)11 | 10 | 125,812 | 125,822 | SSC | |
| 27 | (A)10 | 9 | 36,485 | 36,494 | LSC | | 60 | (T)11 | 10 | 125,954 | 125,964 | SSC | |
| 28 | (AT)6 | 11 | 39,819 | 39,830 | LSC | | 61 | (T)11 | 10 | 130,262 | 130,272 | SSC | |
| 29 | (T)10 | 9 | 41,238 | 41,247 | LSC | | 62 | (A)10 | 9 | 130,487 | 130,496 | SSC | |
| 30 | (T)11 | 10 | 53,217 | 53,227 | LSC | | 63 | (T)10 | 9 | 133,465 | 133,474 | SSC | |
| 31 | (A)10 | 9 | 53,726 | 53,735 | LSC | | 64 | (T)13 | 12 | 134,042 | 134,054 | SSC | |
| 32 | (T)15 | 14 | 54,110 | 54,124 | LSC | | 65 | (A)11 | 10 | 137,468 | 137,478 | SSC | |
| 33 | (A)11 | 10 | 54,990 | 55,000 | LSC | | | | | | | | |
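
In the rows sampled below, the Length column equals End minus Start, which is one less than the number of bases implied by the motif notation (unit length times repeat count). The following sketch parses the motif strings and checks that relationship on a few rows copied from the table; the helper `motif_span` and the regular expression are assumptions made for this illustration, not part of the study's SSR detection method.

```python
import re

# Motif notation such as "(AT)7": repeat unit in parentheses, then repeat count.
MOTIF_PATTERN = re.compile(r"\(([ACGT]+)\)(\d+)")


def motif_span(motif: str) -> int:
    """Number of bases implied by a motif string, e.g. "(AT)7" -> 14."""
    match = MOTIF_PATTERN.fullmatch(motif)
    if match is None:
        raise ValueError(f"unrecognised motif: {motif!r}")
    unit, repeats = match.group(1), int(match.group(2))
    return len(unit) * repeats


# (motif, Length, Start, End) copied from SSR IDs 1, 2, 39, 58, and 32.
sample_rows = [
    ("(A)10", 9, 1809, 1818),
    ("(C)14", 13, 4433, 4446),
    ("(AT)7", 13, 64570, 64583),
    ("(AT)8", 15, 123832, 123847),
    ("(T)15", 14, 54110, 54124),
]

for motif, length, start, end in sample_rows:
    # Length matches End - Start, and is one base short of the motif span.
    print(motif, motif_span(motif), end - start, length)
```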
Figure 4. Bayesian inference (BI) phylogenetic tree reconstruction including 25 species based on all chloroplast genomes. Malus prunifolia and Ulmus gaussenii were used as the outgroup.