Wei Li, Cuiping Zhang, Xiao Guo, Qinghua Liu, Kuiling Wang.
Abstract
Camellia is an economically, ecologically and phylogenetically valuable genus in the family Theaceae. Frequent interspecific hybridization and polyploidization make the phylogeny and taxonomy of this genus controversial and in need of detailed investigation. Chloroplast (cp) genome sequences have been used for cpDNA marker development and genetic diversity evaluation. We newly sequenced the chloroplast genome of Camellia japonica using the Illumina HiSeq X Ten platform and retrieved five previously published Camellia chloroplast genomes for comparative analyses, shedding light on the applicability of chloroplast information. The chloroplast genomes ranged in length from 156,607 to 157,166 bp, and their gene structure resembled that of other higher plants. Four categories of SSRs were detected in the six Camellia cpDNA sequences, with lengths ranging from 10 to 17 bp. The Camellia species exhibited different evolutionary routes: lhbA and orf188, followed by orf42 and psbZ, were readily lost during evolution. Obvious codon preferences were also shown in almost all protein-coding cpDNA and amino acid sequences. Selection pressure analysis revealed the influence of different environmental pressures on different Camellia chloroplast genomes during long-term evolution. All Camellia species, except C. crapnelliana, presented an identical rate of amplification in the IR region. The datasets obtained from the chloroplast genomes strongly support the inference of phylogenetic relationships among the Camellia taxa, indicating that the chloroplast genome can be used for classifying interspecific relationships in this genus.
Year: 2019 PMID: 31071159 PMCID: PMC6508735 DOI: 10.1371/journal.pone.0216645
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Statistics on the basic features of the chloroplast genomes of six Camellia species.
| Length (bp) | 156607 | 156997 | 157039 | 157166 | 156903 | 156865 |
| GC content (%) | 37.32 | 37.30 | 37.30 | 37.30 | 37.32 | 37.32 |
| AT content (%) | 62.68 | 62.70 | 62.70 | 62.70 | 62.68 | 62.68 |
| LSC length (bp) | 86258 | 86655 | 86674 | 86719 | 86568 | 86579 |
| SSC length (bp) | 18415 | 18406 | 18281 | 18293 | 18203 | 18236 |
| IR length (bp) | 25967 | 25968 | 26042 | 26077 | 26066 | 26025 |
| Gene number | 134 | 136 | 135 | 133 | 133 | 133 |
| Gene number in IR regions | 36 | 35 | 36 | 35 | 35 | 35 |
| Pseudogene number | 1 | 0 | 3 | 1 | 1 | 1 |
| Pseudogene (%) | 0.75 | 0 | 2.22 | 0.75 | 0.75 | 0.75 |
| Protein-coding gene number | 89 | 89 | 87 | 87 | 87 | 87 |
| Protein-coding gene (%) | 66.42 | 65.44 | 64.44 | 65.41 | 65.41 | 65.41 |
| rRNA gene number | 8 | 8 | 8 | 8 | 8 | 8 |
| rRNA (%) | 5.97 | 5.88 | 5.93 | 6.02 | 6.02 | 6.02 |
| tRNA gene number | 36 | 39 | 37 | 37 | 37 | 37 |
| tRNA (%) | 26.87 | 28.68 | 27.41 | 27.82 | 27.82 | 27.82 |
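The length rows in this table are tied together by the quadripartite structure of the chloroplast genome: one LSC, one SSC and two IR copies, so the total length should equal LSC + SSC + 2 × IR, and GC and AT content should sum to 100%. The following is a minimal Python sketch of that consistency check. The values are copied column by column from the table; the variable and function names are illustrative and not taken from the paper.

```python
# Values copied from the table above, in the printed column order.
total = [156607, 156997, 157039, 157166, 156903, 156865]
lsc = [86258, 86655, 86674, 86719, 86568, 86579]
ssc = [18415, 18406, 18281, 18293, 18203, 18236]
ir = [25967, 25968, 26042, 26077, 26066, 26025]
gc = [37.32, 37.30, 37.30, 37.30, 37.32, 37.32]
at = [62.68, 62.70, 62.70, 62.70, 62.68, 62.68]


def quadripartite_length(lsc_len, ssc_len, ir_len):
    """Genome length implied by one LSC, one SSC and two IR copies."""
    return lsc_len + ssc_len + 2 * ir_len


# Rebuild each genome length from its regions and compare with the reported total.
reconstructed = [quadripartite_length(l, s, i) for l, s, i in zip(lsc, ssc, ir)]
lengths_consistent = reconstructed == total

# GC and AT percentages should sum to 100 (within floating-point tolerance).
composition_consistent = all(abs(g + a - 100.0) < 1e-6 for g, a in zip(gc, at))

print(lengths_consistent, composition_consistent)
```

The same pattern extends to the percentage rows, for example by dividing each gene class count by the gene number in the same column.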
Fig 1. Gene map of Camellia japonica.
The genes inside and outside of the circle are transcribed in the clockwise and counterclockwise directions, respectively. Genes belonging to different functional groups are shown in different colors. The thick lines indicate the extent of the inverted repeats (IRa and IRb) that separate the genomes into small single-copy (SSC) and large single-copy (LSC) regions.
Genes identified in the chloroplast genome of Camellia species.
| Category of genes | Group of genes | Gene names |
|---|---|---|
| Genes for photosynthesis | ATP synthase | atpA, atpF (pseudogene), atpH, atpI, atpE, atpB |
| | NADH-dehydrogenase | ndhJ, ndhK, ndhC, ndhB, ndhF, ndhD, ndhE, ndhG, ndhI, ndhA, ndhH, ndhB |
| | cytochrome b/f complex | petN, petA, petL, petG, petB, petD |
| | photosystem I | psaB, psaA, psaI, psaJ, psaC |
| | photosystem II | psbA, psbK, psbI, psbM, psbD, psbC, psbZ, psbJ, psbL, psbF, psbE, psbB, psbT, psbN, psbH |
| | Rubisco | rbcL |
| Transcription and translation related genes | transcription | rpoC2, rpoC1, rpoB, rpoA |
| | ribosomal proteins | rps12, rps16, rps2, rps14, rps4, rps18, rps12, rps11, rps8, rps3, rps19, rps7, rps15, rps7, rpl33, rpl20, rpl36, rpl14, rpl16, rpl22, rpl2, rpl23, rpl32, rpl23, rpl2 |
| RNA genes | ribosomal RNA | rrn16S, rrn23S, rrn4.5S, rrn5S, rrn5S, rrn4.5S, rrn23S, rrn16S |
| | transfer RNA | trnH-GUG, trnK-UUU, trnQ-UUG, trnS-GCU, trnR-UCU, trnC-GCA, trnD-GUC, trnY-GUA, trnE-UUC, trnT-GGU, trnS-UGA, trnG-UCC, trnM-CAU, trnS-GGA, trnT-UGU, trnL-UAA, trnF-GAA, trnV-UAC, trnM-CAU, trnW-CCA, trnP-UGG, trnI-CAU, trnL-CAA, trnV-GAC, trnI-GAU, trnA-UGC, trnR-ACG, trnN-GUU, trnL-UAG, trnN-GUU, trnR-ACG, trnA-UGC, trnI-GAU, trnV-GAC, trnL-CAA, trnI-CAU |
| Other genes | RNA processing | matK |
| | carbon metabolism | cemA |
| | fatty acid synthesis | accD |
| | proteolysis | clpP |
| Genes of unknown function | Conserved open reading frames | ycf3, ycf4, ycf2, ycf15, ycf15, ycf1, ycf1, ycf15, ycf15, ycf2 |
Fig 2. Comparison of simple sequence repeats among six chloroplast genomes.
a. Numbers of SSRs detected in ten Camellia chloroplast genomes; b. Frequencies of identified SSRs in LSC, IR and SSC regions; c. Numbers of SSR types detected in ten Camellia chloroplast genomes.
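To illustrate how SSR counts of this kind can be obtained, here is a minimal Python sketch that scans a sequence for perfect mono- to tetranucleotide repeats with a regular expression. The minimum repeat counts (10, 5, 4 and 3) are assumed, commonly used thresholds and are not taken from the paper. The function name `find_ssrs` is hypothetical, and imperfect or compound SSRs are not handled.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the paper).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is a 0-based index into seq.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # since those runs are reported under the shorter motif.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence with one A(12) run and one (AT)6 run.
toy = "GC" + "A" * 12 + "CGG" + "AT" * 6 + "CCG"
print(find_ssrs(toy))
```

Applied to the LSC, SSC and IR coordinate ranges separately, the same scan would give per-region frequencies like those in panel b.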
Pairwise substitution rate between the Camellia chloroplast genomes based on the 78 protein-coding gene sequences.
| - | 35 | 33 | 37 | 34 | 33 |
| 0.4200 | - | 26 | 34 | 58 | 54 |
| 0.4791 | 0.3613 | - | 20 | 44 | 40 |
| 0.3872 | 0.2342 | 0.3663 | - | 17 | 18 |
| 0.4870 | 0.3610 | 0.5096 | 0.4993 | - | 23 |
| 0.4301 | 0.3470 | 0.4450 | 0.4605 | 0.5971 | - |
Genes from the chloroplast genomes of Camellia.
| Name of species | lhbA | orf188 | orf42 | psaJ | psbF | psbH | psbZ | trnfM-CAU |
|---|---|---|---|---|---|---|---|---|
| | - | - | - | + | + | + | + | - |
| | - | - | + | + | + | + | + | + |
| | - | - | - | + | + | + | + | + |
| | - | - | - | + | + | + | + | + |
| | - | - | - | + | + | + | + | + |
| | - | - | + | + | + | + | + | + |
| | - | - | + | + | + | + | + | + |
| | - | - | + | + | + | + | + | + |
| | - | - | + | + | + | + | + | + |
| | - | - | + | + | + | + | + | + |
| | - | + | + | + | + | + | + | + |
| | + | + | + | + | + | + | - | + |
| | + | + | + | + | + | + | - | + |
| | + | + | + | + | + | + | - | + |
| | + | + | + | + | + | + | - | + |
| | + | + | + | - | - | - | - | + |
| | + | + | + | + | + | + | - | + |
| | - | - | + | + | + | + | + | + |
| | - | - | + | + | + | + | + | + |
| Total number of missing gene | 13 | 12 | 4 | 1 | 1 | 1 | 6 | 1 |
Fig 3. Codon distribution of all merged protein-coding genes.
Color key: Red indicates a higher frequency and blue indicates a lower frequency.
Fig 4. Codon content of 20 amino acids and stop codons in all protein-coding genes.
Fig 5. Inverted repeat region contraction analysis of various plant species.
Fig 6. Identity plot comparing the chloroplast genomes of six Camellia taxa.
The vertical scale indicates the percentage of identity, ranging from 50% to 100%. The horizontal axis indicates the coordinates within the chloroplast genome. Genome regions are color coded as protein-coding, rRNA, tRNA, intron, and conserved non-coding sequences.
Fig 7. Sliding window analysis of the whole chloroplast genomes of six Camellia taxa (window length: 600 bp, step size: 200 bp).
X-axis, position of the midpoint of a window; Y-axis, nucleotide diversity of each window.
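The sliding window calculation described in this caption can be sketched in a few lines of Python. This is an illustrative simplification, not the authors' pipeline. It assumes the genomes are already aligned to equal length, treats any differing character (including a gap) as a mismatch, and computes nucleotide diversity as the mean proportion of differing sites over all sequence pairs. The function names are hypothetical; the default window and step come from the caption.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_window_pi(aligned, window=600, step=200):
    """Return (midpoint, pi) for each full window along the alignment.

    The midpoint is start + window // 2, with start counted from 0.
    """
    n = len(aligned[0])
    if any(len(s) != n for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, n - window + 1, step):
        chunk = [s[start:start + window] for s in aligned]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment with a small window so the output is easy to inspect.
toy = ["AAAAAAAA", "AAAATAAA", "AAAATAAA"]
for midpoint, pi in sliding_window_pi(toy, window=4, step=2):
    print(midpoint, round(pi, 4))
```

Windows with high diversity values are the candidates for hyper-variable markers. Dedicated population-genetics tools usually exclude gapped columns, so their values may differ from this sketch.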
Variability of six hyper-variable markers and universal chloroplast DNA barcodes (rbcL and matK) in Camellia.
| Marker | Length (bp) | Variable sites (number) | Variable sites (%) | Informative sites (number) | Informative sites (%) | Mean distance | Discrimination success (%), distance method |
|---|---|---|---|---|---|---|---|
| trnS-trnR | 1024 | 55 | 5.37 | 29 | 2.78 | 0.0172 | 60 |
| petN-psbM | 2038 | 121 | 5.94 | 61 | 2.99 | 0.0157 | 100 |
| trnF-ndhJ | 867 | 48 | 5.53 | 25 | 2.88 | 0.0167 | 80 |
| petA-psbJ | 1547 | 86 | 5.60 | 44 | 2.84 | 0.0165 | 80 |
| rpl32-trnL | 2017 | 108 | 5.35 | 55 | 2.72 | 0.0178 | 80 |
| ycf1 | 2168 | 122 | 5.63 | 62 | 2.86 | 0.0181 | 100 |
| rbcL | 1401 | 31 | 2.22 | 17 | 1.18 | 0.0063 | 40 |
| matK | 1535 | 49 | 3.19 | 26 | 1.66 | 0.0091 | 80 |
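As a rough illustration of how the variable and informative site columns can be derived from a marker alignment, the Python sketch below counts both per alignment column. It uses the usual definitions: a site is variable if at least two different bases occur, and parsimony-informative if at least two different bases each occur in at least two sequences. It assumes uppercase aligned sequences and ignores any character other than A, C, G and T. The function name `site_statistics` is hypothetical.

```python
from collections import Counter


def site_statistics(aligned):
    """Count variable and parsimony-informative sites in an alignment."""
    n = len(aligned[0])
    variable = 0
    informative = 0
    for column in zip(*aligned):
        # Tally only unambiguous bases; gaps and N are ignored.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two bases that each occur twice or more.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return {
        "length": n,
        "variable": variable,
        "informative": informative,
        "variable_pct": 100 * variable / n,
        "informative_pct": 100 * informative / n,
    }


# Toy alignment: column 2 is informative, column 4 is variable only.
toy = ["ACGTA", "ACGTA", "ATGTA", "ATGCA"]
print(site_statistics(toy))
```

The percentages here use the full alignment length as the denominator, which is one possible convention and may not match the one used for the table.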
Fig 8. Phylogenetic relationships of the nineteen Camellia species constructed from the complete chloroplast genome sequences using maximum likelihood (ML) and maximum parsimony (MP) methods.