Xuemin Ye, Dongnan Hu, Yangping Guo, Rongxi Sun.
Abstract
Castanopsis sclerophylla (Lindl.) Schott is an important evergreen broad-leaved tree species in subtropical areas and has high ecological and economic value. However, there are few studies on its chloroplast genome. In this study, the complete chloroplast genome sequence of C. sclerophylla was determined using the Illumina HiSeq 2500 platform. The complete chloroplast genome of C. sclerophylla is 160,497 bp long, including a pair of inverted repeat (IR) regions (25,675 bp each) separated by a large single-copy (LSC) region of 90,255 bp and a small single-copy (SSC) region of 18,892 bp. The overall GC content of the chloroplast genome is 36.82%. A total of 131 genes were found; of these, 111 genes are unique and annotated, including 79 protein-coding genes, 27 transfer RNA genes (tRNAs), and four ribosomal RNA genes (rRNAs). Twenty-one genes were found to be duplicated in the IR regions. Comparative analysis indicated that IR contraction might be the reason for the smaller chloroplast genome of C. sclerophylla compared to three congeneric species. Sequence analysis indicated that the LSC and SSC regions are more divergent than the IR regions within Castanopsis; furthermore, greater divergence was found in noncoding regions than in coding regions. The maximum likelihood phylogenetic analysis showed that four species of the genus Castanopsis form a monophyletic clade and that C. sclerophylla is closely related to Castanopsis hainanensis, with strong bootstrap support. These results not only provide a basic understanding of Castanopsis chloroplast genomes, but also illuminate Castanopsis species evolution within the Fagaceae family. Furthermore, these findings will be valuable for future studies of genetic diversity and enhance our understanding of the phylogenetic evolution of Castanopsis.
Year: 2019 PMID: 31361757 PMCID: PMC6667119 DOI: 10.1371/journal.pone.0212325
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Chloroplast genome annotation map for C. sclerophylla. Genes inside the circle are transcribed in a clockwise direction; genes outside are transcribed in a counterclockwise direction.
Different colors represent different functional genes. The darker gray and lighter gray in the inner circle show the GC and AT contents of the chloroplast genome, respectively.
Summary of the characteristics of four Castanopsis chloroplast genomes.
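| Genome feature | C. sclerophylla | | | |
|---|---|---|---|---|
| Genome size (bp) | 160,497 | 160,631 | 160,647 | 160,606 |
| LSC length (bp) | 90,255 | 90,328 | 90,394 | 90,368 |
| SSC length (bp) | 18,892 | 18,929 | 18,995 | 18,884 |
| IR length (bp) | 25,675 | 25,687 | 25,629 | 25,677 |
| Total genes | 131 | 132 | 132 | 136 |
| Protein-coding genes | 86 | 84 | 84 | 82 |
| tRNA genes | 37 | 40 | 40 | 46 |
| rRNA genes | 8 | 8 | 8 | 8 |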
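The region lengths in this table can be cross-checked against the genome sizes, since a quadripartite chloroplast genome consists of one LSC, one SSC, and two IR copies. The following is a minimal sketch of that arithmetic. The first entry uses the C. sclerophylla values from the abstract; the keys `column 2` to `column 4` are placeholder labels for the remaining table columns, not species names from the source, and the function name is hypothetical.

```python
# Lengths in bp, one entry per genome column of the summary table.
# "column 2".."column 4" are placeholder labels (assumption), not species names.
genomes = {
    "C. sclerophylla": {"total": 160_497, "lsc": 90_255, "ssc": 18_892, "ir": 25_675},
    "column 2": {"total": 160_631, "lsc": 90_328, "ssc": 18_929, "ir": 25_687},
    "column 3": {"total": 160_647, "lsc": 90_394, "ssc": 18_995, "ir": 25_629},
    "column 4": {"total": 160_606, "lsc": 90_368, "ssc": 18_884, "ir": 25_677},
}


def quadripartite_length(lsc, ssc, ir):
    """Length of a circular genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


for name, g in genomes.items():
    computed = quadripartite_length(g["lsc"], g["ssc"], g["ir"])
    # Print the computed size and whether it matches the tabulated size.
    print(name, computed, computed == g["total"])
```

The same kind of check applies to the gene counts: in each column the protein-coding, tRNA, and rRNA rows add up to the total number of genes (for example, 86 + 37 + 8 = 131).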
Base content of the C. sclerophylla chloroplast genome.
| Region | A (%) | T (%) | C (%) | G (%) | A+T (%) | G+C (%) |
|---|---|---|---|---|---|---|
| LSC | 31.94 | 33.40 | 17.74 | 16.91 | 65.34 | 34.65 |
| SSC | 34.40 | 34.66 | 16.29 | 14.65 | 69.06 | 30.94 |
| IR | 28.61 | 28.61 | 21.39 | 21.39 | 57.22 | 42.78 |
| Total | 31.65 | 32.23 | 18.47 | 17.65 | 63.18 | 36.82 |
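The overall G+C content should be the length-weighted mean of the regional values. The sketch below illustrates that calculation under two assumptions: that the first three rows of the table are the LSC, SSC, and IR regions in that order, and that the region lengths are those given in the abstract. The function name is hypothetical.

```python
# (length in bp, G+C %) per region; the IR entry counts both copies.
# Row-to-region assignment is an assumption based on the usual LSC/SSC/IR order.
regions = {
    "LSC": (90_255, 34.65),
    "SSC": (18_892, 30.94),
    "IR": (2 * 25_675, 42.78),
}


def weighted_gc(regions):
    """Length-weighted mean G+C percentage across genome regions."""
    total_len = sum(length for length, _ in regions.values())
    gc_bases = sum(length * gc / 100 for length, gc in regions.values())
    return 100 * gc_bases / total_len


print(f"{weighted_gc(regions):.2f}")
```

Because the regional percentages are themselves rounded to two decimals, the result should agree with the reported 36.82% only to within rounding error.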
List of genes annotated in the sequenced C. sclerophylla chloroplast genome.
| Function | Genes |
|---|---|
| Photosystem I | |
| Photosystem II | |
| Cytochrome b/f complex | |
| ATP synthase | |
| NADH dehydrogenase | |
| Rubisco large subunit | |
| RNA polymerase | rpoA, rpoB, rpoC1 |
| Ribosomal proteins (LSU) | rpl14, rpl16, rpl2 |
| Ribosomal proteins (SSU) | rps11, rps12(×2), rps14, rps15, rps16 |
| Transfer RNAs | |
| Ribosomal RNAs | |
| Hypothetical chloroplast reading frames | |
| Other genes | |
(×n) Number of gene copies in the IR regions.
* Genes containing one intron.
** Genes containing two introns.
Lengths of exons and introns for genes with introns in the C. sclerophylla chloroplast genome.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 70 | 851 | 291 | 654 | 227 |
| | LSC | 36 | 2511 | 34 | | |
| | LSC | 429 | 838 | 1618 | | |
| | LSC | 37 | 610 | 55 | | |
| | IRA | 776 | 681 | 755 | | |
| | SSC | 550 | 1049 | 540 | | |
| | IRB | 390 | 685 | 433 | | |
| | IRB | 36 | 801 | 35 | | |
| | LSC | 34 | 485 | 49 | | |
| | LSC | 31 | 956 | 39 | | |
| | LSC | 34 | 720 | 42 | | |
| | LSC | 41 | 903 | 227 | | |
| | LSC | 125 | 727 | 225 | 768 | 154 |
| | LSC | 144 | 789 | 409 | | |
Fig 2. Codon counts for the twenty amino acids and the stop codons in the protein-coding sequences of the C. sclerophylla chloroplast genome.
Different colors within each histogram bar represent the proportions of the individual codons used for that amino acid or for the stop signal.
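Codon counts of the kind summarized in Fig 2 can be tallied directly from coding sequences. The sketch below is an illustration only, not the authors' pipeline: it builds the standard codon assignments (which the plastid code shares for all 64 codons) and counts codons and amino acids in a made-up toy sequence. All names and the toy sequence are hypothetical.

```python
from collections import Counter

# Standard codon assignments in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_counts(cds):
    """Count in-frame codons of a coding sequence, skipping ambiguous ones."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def amino_acid_counts(cds):
    """Sum codon counts per encoded amino acid (or stop, '*')."""
    totals = Counter()
    for codon, n in codon_counts(cds).items():
        totals[CODON_TABLE[codon]] += n
    return totals


toy_cds = "ATGGCTGCAAAATAA"  # made-up example, not from the genome
print(codon_counts(toy_cds))
print(amino_acid_counts(toy_cds))
```

Dividing each codon's count by the total for its amino acid gives the per-codon proportions that a stacked histogram would display.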
Fig 3. Visualization of the alignment of the complete chloroplast genomes of four species using mVISTA, with C. concinna as the reference.
The gray arrows and thick black lines above the alignment indicate the orientation of genes. Blue bars represent exons, sky-blue bars represent untranslated regions (UTRs), and pink bars represent noncoding sequences (NCSs). The vertical scale represents the percent identity within 50–100%.
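The percent identity plotted on the vertical scale can be thought of as the share of matching positions within a window that slides along the alignment. The following sketch shows that idea for two already-aligned sequences. It is a simplified illustration, not mVISTA's actual algorithm; the function name, window size, step, and toy sequences are all assumptions.

```python
def window_identity(ref, other, window=100, step=50):
    """Percent identity per window for two equal-length aligned sequences.

    Gap characters ('-') never count as matches.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    if not ref:
        return []
    results = []
    last_start = max(len(ref) - window, 0)
    for start in range(0, last_start + 1, step):
        pairs = list(zip(ref[start:start + window], other[start:start + window]))
        matches = sum(a == b and a != "-" for a, b in pairs)
        results.append((start, 100 * matches / len(pairs)))
    return results


# Toy aligned sequences (made up) with one mismatch in the second window.
print(window_identity("ACGTACGTAC", "ACGTACGAAC", window=5, step=5))
```

A plot restricted to 50–100%, as in Fig 3, would simply clip windows whose identity falls below the lower bound.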
Fig 4. Comparison of junctions of large single-copy (LSC), small single-copy (SSC), and inverted repeat (IR) regions among the chloroplast genomes of four congeneric species.
The genes transcribed on the positive strand are depicted on the top of their corresponding locus from right to left; negative strand genes are depicted below from left to right. The arrows indicate the distance between the start or end of a given gene and the corresponding junction site. JLB (LSC/IRb), JSB (IRb/SSC), JSA (SSC/IRa) and JLA (IRa/LSC) denote four junctions in the genome between the two single-copy regions (LSC and SSC) and the two IRs (IRa and IRb).
Fig 5. Maximum likelihood (ML) phylogenetic tree constructed from the chloroplast genomes of 22 species.
C. fargesii and E. umbra were used as outgroups.