NingJie Wang1, ShuiFei Chen2, Lei Xie1, Lu Wang1, YueYao Feng1, Ting Lv1, YanMing Fang1, Hui Ding2.
Abstract
Hamamelidaceae is an important group for studying the origin and early evolution of angiosperms. Its members are used for timber, medicine, spices, and ornamental planting. In this study, the complete chloroplast genomes of Loropetalum chinense (R. Br.) Oliver, Corylopsis glandulifera Hemsl., and Corylopsis velutina Hand.-Mazz. were sequenced on the Illumina NovaSeq 6000 platform. The three chloroplast genomes were 159,402 bp (C. glandulifera), 159,414 bp (C. velutina), and 159,444 bp (L. chinense) long. Each had the typical quadripartite structure, with a pair of inverted repeat (IR) regions (26,283, 26,283, and 26,257 bp, respectively), a large single-copy (LSC) region (88,134, 88,146, and 88,160 bp), and a small single-copy (SSC) region (18,702, 18,702, and 18,770 bp). The chloroplast genomes encoded 132–133 genes, including 85–87 protein-coding genes, 37–38 tRNA genes, and 8 rRNA genes. The coding regions comprised 26,797, 26,574, and 26,415 codons, respectively, most of which ended in A/U. A total of 37–43 long repeats and 175–178 simple sequence repeats (SSRs) were identified, and the SSRs contained more A + T than G + C bases. Genome comparison showed that the IR regions were more conserved than the LSC or SSC regions, and that the noncoding regions were more variable than the coding regions. Phylogenetic analyses revealed that species in the same genus tended to cluster together. Chunia Hung T. Chang, Mytilaria Lecomte, and Disanthus Maxim. may have diverged early, and Corylopsis Siebold & Zucc. was closely related to Loropetalum R. Br. This study provides valuable information for further species identification, evolution, and phylogenetic studies of Hamamelidaceae plants.
Keywords: Hamamelidaceae; chloroplast genomes; comparative analysis; phylogenetic relationship
Year: 2022 PMID: 35222983 PMCID: PMC8848467 DOI: 10.1002/ece3.8637
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
FIGURE 1 The chloroplast genome maps of Corylopsis glandulifera, Corylopsis velutina, and Loropetalum chinense. Genes on the inside of the circle are transcribed clockwise and those on the outside are transcribed counter-clockwise. The darker gray inner circle corresponds to the GC content, whereas the lighter gray indicates the AT content. Different colors represent different functional genes
Summary of the complete chloroplast genomes of the three Hamamelidaceae species
| Genome features | Corylopsis velutina | Corylopsis glandulifera | Loropetalum chinense |
|---|---|---|---|
| Total length (bp) | 159,414 | 159,402 | 159,444 |
| LSC length (bp) | 88,146 | 88,134 | 88,160 |
| SSC length (bp) | 18,702 | 18,702 | 18,770 |
| IRa length (bp) | 26,283 | 26,283 | 26,257 |
| IRb length (bp) | 26,283 | 26,283 | 26,257 |
| Genes | 133 | 132 | 132 |
| Protein‐coding genes (CDS) | 87 | 87 | 85 |
| tRNA genes | 37 | 37 | 38 |
| rRNA genes | 8 | 8 | 8 |
| GC% | 38.03 | 38.03 | 37.97 |
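The quadripartite structure implies that each total length should equal LSC + SSC + 2 × IR. The following is a minimal Python sketch of that consistency check using the lengths reported above; the dictionary layout and the function name `quadripartite_length` are illustrative choices, not part of the original study.

```python
# Region lengths in bp, copied from the summary table (one IR copy per entry).
GENOMES = {
    "Corylopsis velutina": {"total": 159414, "LSC": 88146, "SSC": 18702, "IR": 26283},
    "Corylopsis glandulifera": {"total": 159402, "LSC": 88134, "SSC": 18702, "IR": 26283},
    "Loropetalum chinense": {"total": 159444, "LSC": 88160, "SSC": 18770, "IR": 26257},
}


def quadripartite_length(lsc, ssc, ir):
    """Length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


for name, g in GENOMES.items():
    computed = quadripartite_length(g["LSC"], g["SSC"], g["IR"])
    status = "consistent" if computed == g["total"] else "MISMATCH"
    print(f"{name}: computed {computed} bp, reported {g['total']} bp ({status})")
```

All three genomes pass this check, which is a quick way to catch transcription errors when region lengths are copied between tables.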
Base composition of the complete chloroplast genomes of the three Hamamelidaceae species
| Species | Region | A (%) | T (U) (%) | C (%) | G (%) | AT (%) | GC (%) |
|---|---|---|---|---|---|---|---|
| | LSC | 31.26 | 32.60 | 18.61 | 17.53 | 63.86 | 36.14 |
| | SSC | 33.65 | 33.67 | 17.11 | 15.57 | 67.32 | 32.68 |
| | IR | 28.44 | 28.44 | 21.55 | 21.55 | 56.88 | 43.10 |
| | Total | 30.61 | 31.36 | 19.40 | 18.63 | 61.97 | 38.03 |
| | LSC | 31.26 | 32.59 | 18.61 | 17.53 | 63.85 | 36.14 |
| | SSC | 33.69 | 33.67 | 17.11 | 15.54 | 67.36 | 32.64 |
| | IR | 28.45 | 28.45 | 21.55 | 21.55 | 56.90 | 43.10 |
| | Total | 30.62 | 31.35 | 19.41 | 18.62 | 61.97 | 38.03 |
| | LSC | 31.29 | 32.65 | 18.58 | 17.49 | 63.94 | 36.07 |
| | SSC | 33.62 | 33.70 | 17.19 | 15.49 | 67.32 | 32.67 |
| | IR | 28.46 | 28.46 | 21.53 | 21.53 | 56.92 | 43.06 |
| | Total | 30.63 | 31.39 | 19.39 | 18.59 | 62.02 | 37.97 |
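The whole-genome GC content should equal the length-weighted mean of the regional values. The sketch below illustrates this in Python for the last block of the table. It assumes that block belongs to Loropetalum chinense, because its total GC of 37.97% matches that species in the summary table; the species label is not preserved in this table. The names `REGIONS` and `weighted_gc` are illustrative.

```python
# (length in bp, GC %) per region. Lengths are the L. chinense values from the
# summary table; GC values are from the last block of the base composition table.
# Pairing the two is an assumption based on the matching total GC of 37.97%.
REGIONS = {
    "LSC": (88160, 36.07),
    "SSC": (18770, 32.67),
    "IRa": (26257, 43.06),
    "IRb": (26257, 43.06),
}


def weighted_gc(regions):
    """Length-weighted mean GC percentage across genome regions."""
    total_length = sum(length for length, _ in regions.values())
    gc_bases = sum(length * gc for length, gc in regions.values())
    return gc_bases / total_length


print(f"Weighted GC: {weighted_gc(REGIONS):.2f}%")
```

The result agrees with the reported total to two decimal places. It also shows why the IR regions, which carry the GC-rich rRNA genes, raise the genome-wide GC content above that of either single-copy region.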
Lists of genomic genes for Corylopsis velutina, Corylopsis glandulifera, and Loropetalum chinense
| Function | Genes | Genes | Genes |
|---|---|---|---|
| Photosystem I | | | |
| Photosystem II | | | |
| NADH dehydrogenase | | | |
| Cytochrome b/f complex | | | |
| ATP synthase | | | |
| Rubisco | | | |
| Large subunit ribosomal proteins | | | |
| Small subunit ribosomal proteins | | | |
| RNA polymerase | | | |
| Ribosomal RNAs | | | |
| Transfer RNAs | | | |
| Other | | | |
| Unknown function | | | |
*, Gene with one intron; **, Gene with two introns; #, Pseudogene; (2): Gene with two copies.
Exon and intron sizes of the intron-containing genes in the three Hamamelidaceae species
| Species | Gene | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | | 37 | 2,441 | 37 | | |
| | | 39 | 850 | 225 | | |
| | | 34 | 688 | 48 | | |
| | | 159 | 712 | 411 | | |
| | | 435 | 735 | 1,632 | | |
| | | 126 | 746 | 228 | 741 | 153 |
| | | 37 | 515 | 50 | | |
| | | 39 | 574 | 37 | | |
| | | 114 | – | 232 | 538 | 26 |
| | | 69 | 635 | 291 | 812 | 228 |
| | | 6 | 744 | 651 | | |
| | | 9 | 690 | 474 | | |
| | | 9 | 1,001 | 402 | | |
| | | 393 | 653 | 435 | | |
| | | 777 | 682 | 756 | | |
| | | 232 | – | 26 | 538 | 114 |
| | | 42 | 939 | 30 | | |
| | | 38 | 842 | 35 | | |
| | | 552 | 1,073 | 540 | | |
| | | 38 | 842 | 35 | | |
| | | 42 | 939 | 30 | | |
| | | 777 | 682 | 756 | | |
| | | 393 | 653 | 435 | | |
| | | | | | | |
| | | 37 | 2,443 | 35 | | |
| | | 39 | 851 | 225 | | |
| | | 34 | 687 | 48 | | |
| | | 159 | 712 | 411 | | |
| | | 435 | 735 | 1,632 | | |
| | | 126 | 746 | 228 | 741 | 153 |
| | | 37 | 516 | 50 | | |
| | | 39 | 574 | 37 | | |
| | | 114 | – | 232 | 538 | 26 |
| | | 69 | 631 | 291 | 812 | 228 |
| | | 6 | 744 | 651 | | |
| | | 9 | 690 | 474 | | |
| | | 9 | 1,001 | 402 | | |
| | | 393 | 653 | 435 | | |
| | | 777 | 682 | 756 | | |
| | | 232 | – | 26 | 538 | 114 |
| | | 42 | 939 | 30 | | |
| | | 38 | 842 | 35 | | |
| | | 552 | 1,073 | 540 | | |
| | | 38 | 842 | 35 | | |
| | | 42 | 939 | 30 | | |
| | | 777 | 682 | 756 | | |
| | | 393 | 653 | 435 | | |
| | | | | | | |
| | | 37 | 2,457 | 35 | | |
| | | 42 | 853 | 225 | | |
| | | 24 | 699 | 48 | | |
| | | 159 | 697 | 426 | | |
| | | 427 | 752 | 1,625 | | |
| | | 126 | 742 | 228 | 757 | 156 |
| | | 37 | 512 | 50 | | |
| | | 39 | 574 | 32 | | |
| | | 114 | – | 232 | 538 | 26 |
| | | 69 | 644 | 291 | 836 | 228 |
| | | 6 | 781 | 654 | | |
| | | 9 | 690 | 474 | | |
| | | 9 | 1,005 | 402 | | |
| | | 393 | 653 | 435 | | |
| | | 777 | 682 | 756 | | |
| | | 232 | – | 26 | 538 | 114 |
| | | 42 | 890 | 35 | | |
| | | 38 | 842 | 35 | | |
| | | 552 | 1,042 | 540 | | |
| | | 38 | 842 | 35 | | |
| | | 33 | 939 | 41 | | |
| | | 42 | 890 | 35 | | |
| | | 777 | 682 | 756 | | |
| | | 393 | 653 | 435 | | |
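Each row can be read as alternating exon and intron lengths. Summing the exons gives the spliced length, and adding the introns gives the unspliced span. The Python sketch below works through one two-intron row from the table (126, 746, 228, 741, 153 bp). The gene name for that row is not preserved here, so treating it as protein-coding is an assumption. The function names are illustrative.

```python
def spliced_length(exons):
    """Length of the mature transcript: the sum of all exon lengths."""
    return sum(exons)


def unspliced_span(exons, introns):
    """Length from the first exon to the last, introns included."""
    return sum(exons) + sum(introns)


# One two-intron row from the table; the gene it belongs to is not named here.
exons = [126, 228, 153]
introns = [746, 741]

mature = spliced_length(exons)
print("Spliced length (bp):", mature)
print("Unspliced span (bp):", unspliced_span(exons, introns))

# For a protein-coding gene, the spliced length should be a multiple of three.
if mature % 3 == 0:
    print("Codons (including stop, if this is a CDS):", mature // 3)
```

The multiple-of-three test applies only to protein-coding genes. Rows with short exons of roughly 35 to 50 bp are more likely tRNA genes, for which the test does not apply.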
FIGURE 2 Codon content of 20 amino acids and stop codons in the protein-coding genes of the chloroplast genomes of the three Hamamelidaceae species. (a) Loropetalum chinense; (b) Corylopsis glandulifera; (c) Corylopsis velutina
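The observation that most codons end in A/U can be checked by tallying third-position bases across a coding sequence. The following Python sketch is a minimal illustration of that tally, not the authors' pipeline. The toy sequence and the function names are hypothetical, and U is converted to T so that DNA and RNA input are treated alike.

```python
from collections import Counter


def third_position_counts(cds):
    """Count the base at the third position of every codon in a CDS."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return Counter(cds[i + 2] for i in range(0, len(cds), 3))


def at_ending_fraction(cds):
    """Fraction of codons whose third base is A or T (U)."""
    counts = third_position_counts(cds)
    return (counts["A"] + counts["T"]) / sum(counts.values())


# Hypothetical toy CDS: ATG GCT AAA TTT GGC TAA
toy_cds = "ATGGCTAAATTTGGCTAA"
print(third_position_counts(toy_cds))
print(f"A/T-ending codons: {at_ending_fraction(toy_cds):.2%}")
```

In a real analysis the function would be applied to the concatenated protein-coding genes of each genome.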
FIGURE 3 Analysis of repeated sequences in the three Hamamelidaceae chloroplast genomes. (a) Frequency of repeat types; (b) Frequency of repeat sequences by length
FIGURE 4 Frequency of SSRs in the different repeat class types. (a) Loropetalum chinense; (b) Corylopsis glandulifera; (c) Corylopsis velutina
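SSRs are tandem repeats of a 1 to 6 bp motif. A regular expression with a backreference is enough to sketch how they can be located. The minimum repeat counts in the Python example below (10, 5, 4, 3, 3, 3) are a commonly used convention and are an assumption here, since the thresholds used in the study are not stated in this record. The function names and test sequence are hypothetical.

```python
import re

# Minimum number of repeats per motif length (assumed thresholds, not from the study).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. "AA" reported as a dinucleotide
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical test sequence: a 12-base A run and six AT repeats.
seq = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
for start, motif, count in find_ssrs(seq):
    print(f"position {start}: ({motif}){count}")
```

This sketch finds only perfect repeats and does not merge neighboring SSRs into compound repeats, which dedicated SSR tools typically handle.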
FIGURE 5 Complete chloroplast genome alignments of six Hamamelidaceae species using the mVISTA program, with the chloroplast genome of Loropetalum chinense as a reference. The horizontal axis indicates the coordinates within the chloroplast genome. The vertical scale indicates the percent identity within 50–100%. Annotated genes are displayed along the top
FIGURE 6 Nucleotide diversity (Pi) values among the six Hamamelidaceae species. X-axis: the position in the genome; Y-axis: Pi value. Pi, nucleotide diversity
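Nucleotide diversity (Pi) is the average number of pairwise differences per site, usually computed in sliding windows along an alignment. The Python sketch below is a simplified illustration: it skips any alignment column containing a gap or ambiguous base, which may differ from how dedicated population-genetics software treats such sites. The window and step sizes, function names, and toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over gap-free alignment columns."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    pairs = list(combinations(range(len(seqs)), 2))
    if not columns or not pairs:
        return 0.0
    differences = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return differences / (len(pairs) * len(columns))


def sliding_pi(seqs, window, step):
    """Return (window_start, Pi) for each full window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


# Hypothetical toy alignment of three sequences.
alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(f"Overall Pi: {nucleotide_diversity(alignment):.4f}")
for start, pi in sliding_pi(alignment, window=4, step=4):
    print(f"window starting at {start}: Pi = {pi:.4f}")
```

Peaks in the resulting profile mark candidate hypervariable regions, which is how plots such as Figure 6 are typically read.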
FIGURE 7 Comparison of the borders of the large single-copy (LSC), small single-copy (SSC), and inverted repeat (IR) regions among the six Hamamelidaceae chloroplast genomes. Genes are denoted by colored boxes. The gaps between the genes and the boundaries are indicated by the base lengths (bp)
FIGURE 8 Bayesian inference (BI) and maximum likelihood (ML) phylogenetic trees constructed using the general time-reversible (GTR)+F+I+G4 model based on the chloroplast genomes of 31 species. Numbers on the branches are support values for ML SH-aLRT, ML UFBoot, and BI PP (SH-aLRT/UFBoot/PP). The species investigated in this study are colored in red
FIGURE 9 Bayesian inference (BI) and maximum likelihood (ML) phylogenetic trees constructed using the general time-reversible (GTR)+F+I+G4 model based on the LSC regions. Numbers on the branches are support values for ML SH-aLRT, ML UFBoot, and BI PP (SH-aLRT/UFBoot/PP). The species investigated in this study are colored in red
FIGURE 10 Bayesian inference (BI) and maximum likelihood (ML) phylogenetic trees constructed using the general time-reversible (GTR)+F+G4 model based on the SSC regions. Numbers on the branches are support values for ML SH-aLRT, ML UFBoot, and BI PP (SH-aLRT/UFBoot/PP). The species investigated in this study are colored in red
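Branch labels in these trees combine three support values in the form SH-aLRT/UFBoot/PP. The Python sketch below shows one way to parse such a label and flag well-supported branches. The cutoffs (SH-aLRT ≥ 80, UFBoot ≥ 95, PP ≥ 0.95) are commonly used conventions and are assumptions here, not thresholds stated in this record. The example labels and function names are hypothetical.

```python
def parse_support(label):
    """Split an 'SH-aLRT/UFBoot/PP' branch label into three floats."""
    sh_alrt, ufboot, pp = (float(part) for part in label.split("/"))
    return {"SH-aLRT": sh_alrt, "UFBoot": ufboot, "PP": pp}


def is_well_supported(label, min_sh_alrt=80.0, min_ufboot=95.0, min_pp=0.95):
    """True if all three support values meet the (assumed) cutoffs."""
    support = parse_support(label)
    return (
        support["SH-aLRT"] >= min_sh_alrt
        and support["UFBoot"] >= min_ufboot
        and support["PP"] >= min_pp
    )


# Hypothetical branch labels, not taken from the figures.
for label in ["98.5/100/1", "75.2/88/0.91"]:
    print(label, "->", "supported" if is_well_supported(label) else "weak")
```

Labels with a missing value (for example, a branch absent from one of the two trees) would need extra handling that this sketch omits.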