A Comparative Analysis of the Chloroplast Genomes of Four Polygonum Medicinal Plants.

Shuai Guo1,2, Xuejiao Liao1,2, Shiyu Chen1,2, Baosheng Liao2, Yiming Guo3, Ruiyang Cheng2, Shuiming Xiao2, Haoyu Hu2, Jun Chen4, Jin Pei1, Yangjin Chen5, Jiang Xu2, Shilin Chen1,2.   

Abstract

Polygonum is a broadly defined genus of the Polygonaceae family that includes various herbaceous plants. To help clarify the evolutionary and phylogenetic relationships within Polygonum at the chloroplast (cp) genome level, we sequenced and annotated the complete chloroplast genomes of four Polygonum species using next-generation sequencing technology and CpGAVAS. We then analyzed the repeat sequences, the contraction and expansion of the IR regions, and the sequence variation of the four chloroplast genomes, and built a phylogenetic tree from Polygonum chloroplast genomes. The results indicated that the Polygonum chloroplast genomes display the characteristic quadripartite structure, comparable to the published chloroplast genomes of other angiosperms. The chloroplast genomes of the four Polygonum plants are highly consistent in genome size (159,015–163,461 bp), number of genes (112 genes, including 78 protein-coding genes, 30 tRNA genes, and 4 rRNA genes), gene types, gene order, codon usage, and repeat sequence distribution, which indicates that Polygonum chloroplast genomes are highly conserved. The phylogenetic tree reconstructed from the chloroplast genome sequences places P. bistorta, P. orientale, and P. perfoliatum on the same branch, whereas P. aviculare groups with Fallopia. The precise systematic position of some taxa requires further verification, but this study provides a basis for developing the available genetic resources and for understanding the evolutionary relationships of Polygonum.
Copyright © 2022 Guo, Liao, Chen, Liao, Guo, Cheng, Xiao, Hu, Chen, Pei, Chen, Xu and Chen.

Keywords:  Polygonum; comparative analysis; complete chloroplast genome; phylogenetic analysis; repeats analysis

Year:  2022        PMID: 35547259      PMCID: PMC9084321          DOI: 10.3389/fgene.2022.764534

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.772


Introduction

Polygonum is a genus of Polygonaceae whose species are distributed around the world, mostly in the north temperate zone (Macêdo et al., 2021; Mohtashami et al., 2021; Song et al., 2020). About 113 species occur in China. Some species of Polygonaceae have been used in traditional Chinese medicine because of their remarkable effects in treating edema and sores. Polygonum aviculare, Polygonum bistorta, Polygonum cuspidatum, and Polygonum perfoliatum are used clinically as diuretics, and they are reported to dispel wind and dampness, clear heat and toxins, and promote blood circulation (Lin et al., 2015). Phytochemical studies have shown that Polygonum species contain flavonoids, quinones, phenylpropanoids, and terpenoids. Pure compounds isolated from Polygonum may also have a broad range of bioactivities, including anticancer, antitumor, antioxidative, anti-inflammatory, analgesic, antimicrobial, and insecticidal activities. The subdivision of Polygonaceae is unclear and has been controversial. Linnaeus established the genus Polygonum in 1753; in 1826, Meissner carried out an in-depth study of the genus worldwide and established 10 groups within it (Fan et al., 2013). As research progressed, some scholars raised most of these groups to genus level, dividing the genus into 15 genera (Costea and Tardif 2005). The medicinal chemical components and curative effects differ among these genera. Therefore, accurate species identification is the key to ensuring the clinical efficacy and safety of medicinal plants of this genus. The chloroplast is a type of plastid and a vital organelle for energy conversion and photosynthesis; it is found in land plants, algae, and some protists (Asaf et al., 2017).
Chloroplasts are bounded by membranes and contain thylakoids and stroma. The thylakoid membrane holds a large number of pigment molecules that capture and transfer energy during photosynthesis, while the stroma contains various enzymes, inorganic salts, and DNA. Chloroplasts not only synthesize sugars through photosynthesis but also take part in the synthesis of complex organic substances such as amino acids and fatty acids (Yao et al., 2015). The chloroplast genome is conserved in many plants and is typically a covalently closed circular DNA molecule; a few are linear or take other forms (Asaf et al., 2017). For example, in Acetabularia the chloroplast genome has a rare linear structure rather than the conventional closed circular double-stranded DNA, and in dinoflagellates it has a polycyclic structure made up of several independent minicircles (Cheng et al., 2017; Cheon et al., 2019; Yu, Sun, et al., 2020). The size of the chloroplast genome is usually between 100 and 200 kb across plant families and genera (Chen et al., 2019). The chloroplast DNA of most angiosperms is generally between 110 and 160 kb, and that of ferns is about 140–150 kb. The circular double-stranded chloroplast genome is generally divided into four regions: a large single-copy region (LSC), a small single-copy region (SSC), inverted repeat region A (IRa), and inverted repeat region B (IRb). The two IR regions are separated by the LSC and SSC; they have the same length and opposite orientations (Jansen et al., 2011). Studies (Jansen et al., 2011; Cheon et al., 2019) have shown that changes in the IR regions are the main cause of changes in the chloroplast genome.
Among the cp genomes previously reported for Polygonaceae, Yu et al. (2020) and Ye et al. (2021) examined the lengths of the LSC, SSC, and IR regions and characterized the cp genomes of Polygonum chinense and P. cuspidatum. These results are a valuable genetic resource for studying the genetics and evolutionary relationships of Polygonaceae species. Chloroplast DNA encodes about 110–130 genes, comprising rRNA genes, protein-coding genes, and tRNA genes (Cheng et al., 2017; Tan et al., 2020). In general, all rRNA genes, along with some protein-coding and tRNA genes, are duplicated. Based on their functions, chloroplast genes can be divided into three classes: genes for the photosynthetic system, such as petB; genes for the genetic system, related to transcription and translation, such as tRNA-UGC; and genes related to biosynthesis, such as the synthesis of amino acids, together with open reading frames (ORFs), such as accD, matK, and ycf1 (Yang et al., 2013). The chloroplast genome is a valuable multi-level taxonomic resource with rich genetic information and has been widely used in plant phylogeny and evolution, species identification, and taxonomy. One reason for the rapid growth of chloroplast genome research is the advent of high-throughput sequencing technologies. After the complete chloroplast genomes of Nicotiana tabacum (Shinozaki et al., 1986) and Marchantia polymorpha (Kohchi et al., 1988) were reported in 1986, researchers (Daniell et al., 2016) paid more attention to the chloroplast genomes of plants. Since then, 3,721 cp genomes from different plants, including green algae and aquatic species, have been described and are available in the National Center for Biotechnology Information (NCBI) database (Tonti-Filippini et al., 2017).
First-generation sequencing technologies (Pareek et al., 2011; Slatko et al., 2011; Liao et al., 2021) include the traditional “dideoxy” sequencing technique, the chemical degradation method, and the improved automated fluorescence sequencing technology based on them (Dobrogojski et al., 2020). Next-generation sequencing technology, which does not require DNA amplification and cloning, and third-generation sequencing technology, which uses single-molecule real-time (SMRT) sequencing, are now widely used in chloroplast genome sequencing and facilitate de novo genome assembly (Tonti-Filippini et al., 2017). Most Polygonum species have high economic and medicinal value. Here, the chloroplast genomes of four Polygonum species (P. aviculare, P. bistorta, P. orientale, and P. perfoliatum) were sequenced and assembled and then compared with other published Polygonum chloroplast genomes (Yu, Liu, et al., 2020; Ye et al., 2021), yielding considerable biological evidence on cp genome structure, repeat sequences, and other characteristics. This work also provides an essential foundation for a Polygonum cp genome library, supporting progress in phylogenetics, DNA barcoding, and population studies (Lee et al., 2019).

Materials and Methods

DNA Extracting, Sequencing, and Genome Annotation

DNA was extracted from four species of Polygonum, namely P. aviculare, P. bistorta, P. orientale, and P. perfoliatum. Fresh Polygonum leaves were collected from Chengdu (Sichuan Province, China). The voucher specimens are kept in CDCM (Traditional Chinese medicine herbarium of Chengdu University of Traditional Chinese Medicine) (Supplementary Table S1). Whole-genome DNA was obtained from fresh leaves using the cetyltrimethylammonium bromide method and quantified with an ND-2000 spectrometer. A shotgun library (250 bp) was built following the manufacturer's guidelines and sequenced on the X Ten platform (Illumina, San Diego, CA, United States) with paired-end 150 bp reads. The total raw data amounted to about 3.5 Gb, corresponding to about 12 million paired-end reads (Supplementary Table S2). The raw reads were trimmed using Skewer v0.22 (skewer -q 20 -Q 30 -l 100 -t 32) (Jiang et al., 2014). Chloroplast-like reads were identified by BLAST searches of the cleaned reads against the reference sequence of P. cuspidatum (MW411186.1) (Deng et al., 2015; Ye et al., 2021). The chloroplast reads were then assembled using SOAPdenovo-2.04 (SOAPdenovo-127mer all -s config.txt -o out -K 51 -R) (Gogniashvili et al., 2015). The assembled sequences were extended with SSPACE-3.0 (SSPACE_standard_v3.0.pl -l library.txt -s out.config -x 1 -T 4 -b sspace.out), and GapCloser-1.12 (GapCloser -a scaffols.fa -b library.txt -o Fanal.fa) was used to fill the gaps (Boetzer et al., 2011; Acemel et al., 2016). To verify the accuracy of the assembly at the junctions, primers were designed to check the connections of the sequence by polymerase chain reaction (PCR). The PCR primer information and amplification conditions are shown in Supplementary Table S3.
The Sanger sequencing results were compared with the assembled chloroplast genome sequences to verify the accuracy of the genome linkage. CpGAVAS (Liu et al., 2012) was used for sequence annotation. The annotation results were checked with DOGMA (http://dogma.ccbb.utexas.edu/; Wyman et al., 2004) and BLAST. In addition, the tRNA genes were identified using tRNAscan-SE v1.21 (Chan and Lowe 2019). OGDRAW v1.2 (Lohse et al., 2007) was used to plot the structural features of the chloroplast genomes, and MEGA 5.2 (Tamura et al., 2011) was used to analyze the relative synonymous codon usage (RSCU). The assembled chloroplast genome sequences of the four Polygonum species were deposited in NCBI under GenBank accession numbers MZ748474–MZ748477.

Repeats and Comparative Analysis of Chloroplast Genomes

Tandem Repeats Finder (Benson 1999) and REPuter (Kurtz et al., 2001) were used to find the tandem, forward, and palindromic repeats in the four Polygonum chloroplast genomes. MISA (misa.pl; Beier et al., 2017) was used to identify SSRs, with minimum repeat numbers of eight for mononucleotides, four for dinucleotides and trinucleotides, and three for tetranucleotides, pentanucleotides, and hexanucleotides. Primer3 (Untergasser et al., 2012) was used to design the SSR primers. The genome structures of six Polygonum chloroplast genomes, including the four Polygonum species in this study and two Polygonum species available from NCBI (P. cuspidatum and P. chinense), were compared using mVISTA (Shuffle-LAGAN mode) (Frazer et al., 2004) with the genome of P. cuspidatum as the reference. Pi values and sequence polymorphisms of the six Polygonum species were analyzed using DnaSP v6.12.03 (Rozas et al., 2017). The step size was set to 200 bp, and the window length was set to 800 bp.

Phylogenetic Analysis

A total of 19 chloroplast genome sequences (Supplementary Table S4) were used to build the phylogenetic tree. Each of the 67 protein-coding genes shared by all the genomes was aligned individually, and the alignments were then linked end to end to form a supergene for each species. Sequence alignment was carried out using MAFFT v7.309. The best model was determined using ModelTest-NG v0.1.6 with default parameters, and ML analysis was performed using RAxML-NG v0.9.084 (Linux edition) with default parameters. The model parameters were GTR + FU + IU + G4m, noname = 1–51,039. Chrysanthemum x morifolium was set as the outgroup.
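The supergene construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function name, the dictionary layout (gene name mapped to per-species aligned sequences), and the alphabetical gene order are all hypothetical choices.

```python
def concatenate_shared_genes(alignments, species):
    """Concatenate per-gene alignments into one supergene per species.

    alignments: dict mapping gene name -> dict mapping species -> aligned sequence.
    species:    list of species names that must all be present for a gene to be used.
    Returns (supergenes, shared_genes).
    """
    # Keep only genes that are present in every species.
    shared = sorted(
        gene for gene, aln in alignments.items() if all(sp in aln for sp in species)
    )
    # Within one gene, every aligned sequence must have the same length.
    for gene in shared:
        lengths = {len(alignments[gene][sp]) for sp in species}
        if len(lengths) != 1:
            raise ValueError(f"gene {gene} is not aligned to a single length")
    # Link the gene alignments end to end in a fixed gene order.
    supergenes = {
        sp: "".join(alignments[gene][sp] for gene in shared) for sp in species
    }
    return supergenes, shared
```

Using a fixed gene order for every species keeps homologous columns aligned in the concatenated matrix, which is what the downstream ML analysis requires.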

Results

Features of Polygonum Chloroplast DNA

The sizes of the four Polygonum chloroplast genomes are 163,461 bp (P. aviculare), 159,476 bp (P. bistorta), 159,015 bp (P. orientale), and 160,680 bp (P. perfoliatum). The overall GC contents are 37.43% (P. aviculare), 37.37% (P. bistorta), 38.21% (P. orientale), and 37.96% (P. perfoliatum), respectively. Like the chloroplast genomes of other plants (Guo et al., 2018), those of Polygonum consist of an LSC region, an SSC region, and a pair of inverted repeat regions (IRa/IRb). The LSC region is 83,583–88,021 bp long with a GC content of 35.48–36.59%. The SSC region is 12,928–13,306 bp long with a GC content of 32.46–33.19%. Each IR region is 31,067–31,184 bp long with a GC content of 41.27–41.45% (Table 1). GC content is an important indicator of the genetic relationship between species, and the Polygonum species have comparable cpDNA GC contents. As is common in other plants (Guo et al., 2018; Liang et al., 2019), the GC content of the IR regions is higher than that of the LSC and SSC regions. The high GC content of the IR regions is usually attributed to the rRNA and tRNA genes located there (He et al., 2016; Shen et al., 2017). After annotation, the complete chloroplast genome sequences of the four Polygonum species were deposited in the National Center for Biotechnology Information (NCBI) database; the GenBank accession numbers are listed in Table 1.
TABLE 1

Chloroplast genome features of Polygonum.

Species | Whole genome length (bp) | Whole genome GC% | LSC length (bp) | LSC GC% | SSC length (bp) | SSC GC% | IR length (bp) | IR GC% | Accession number
Polygonum aviculare | 163,461 | 37.43 | 88,021 | 35.48 | 13,306 | 32.46 | 31,067 | 41.27 | MZ748474
Polygonum bistorta | 159,476 | 37.37 | 84,360 | 36.07 | 12,968 | 32.99 | 31,074 | 41.35 | MZ748475
Polygonum orientale | 159,015 | 38.21 | 83,583 | 36.59 | 13,154 | 33.19 | 31,139 | 41.45 | MZ748476
Polygonum perfoliatum | 160,680 | 37.96 | 85,384 | 36.20 | 12,928 | 33.09 | 31,184 | 41.39 | MZ748477
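As an illustrative check that is not part of the original analysis, the quadripartite structure implies that the whole genome length equals LSC + SSC + 2 × IR. The short Python sketch below, with the lengths copied from Table 1 and a hypothetical helper name, verifies this relation for the four genomes.

```python
# Region lengths in bp, copied from Table 1: (whole genome, LSC, SSC, one IR copy).
TABLE1_LENGTHS = {
    "P. aviculare": (163461, 88021, 13306, 31067),
    "P. bistorta": (159476, 84360, 12968, 31074),
    "P. orientale": (159015, 83583, 13154, 31139),
    "P. perfoliatum": (160680, 85384, 12928, 31184),
}


def quadripartite_total(lsc, ssc, ir):
    """Genome length implied by one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# True for a species when the reported total matches the sum of its regions.
CONSISTENT = {
    species: total == quadripartite_total(lsc, ssc, ir)
    for species, (total, lsc, ssc, ir) in TABLE1_LENGTHS.items()
}
```

All four entries of `CONSISTENT` evaluate to true, so the region lengths in Table 1 add up to the reported genome sizes.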
The annotation results (GenBank files) of the four Polygonum chloroplast genomes sequenced in this study were submitted to OGDRAW to draw physical maps of the genomes (Figure 1). In total, 112 genes were found in the chloroplast genomes, including 4 rRNA genes, 30 tRNA genes, and 78 protein-coding genes (Figure 1; Table 2). The genes of the four Polygonum chloroplast genomes can be broadly divided into three classes: self-replication-related genes, photosynthesis-related genes, and other genes (Saski et al., 2005).
FIGURE 1

Gene map of the Polygonum chloroplast genome. Genes drawn inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes belonging to different functional groups are color coded. The darker gray in the inner circle corresponds to DNA G + C content, while the lighter gray corresponds to A + T content.

TABLE 2

Gene composition of the chloroplast genome of Polygonum.

Category | Group of genes | Name of genes
Self-replication | Large subunit of ribosomal proteins | rpl2 (a, b), 14, 16 (a), 20, 22 (b), 32, 33, 36
Self-replication | Small subunit of ribosomal proteins | rps2, 3, 4, 7 (b), 8, 11, 12 (a, b), 14, 15, 16 (a), 18, 19 (b)
Self-replication | DNA-dependent RNA polymerase | rpoA, B, C1 (a), C2
Self-replication | rRNA genes | rrn16S (b), rrn23S (b), rrn4.5S (b), rrn5S (b)
Self-replication | tRNA genes | trnA-UGC (a, b), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-UCC, trnG-GCC, trnH-GUG, trnI-CAU, trnI-GAU (a, b), trnK-UUU (a), trnL-CAA, trnL-UAA, trnL-UAG, trnM-CAU, trnN-GUU, trnP-UGG, trnQ-UUG, trnR-ACG, trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC, trnV-UAC (a), trnW-CCA, trnY-GUA
Photosynthesis | Photosystem I | psaA, B, C, I, J
Photosynthesis | Photosystem II | psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z
Photosynthesis | NADH oxidoreductase | ndhA (a), B (a, b), C, D, E, F, G, H, I, J, K
Photosynthesis | Cytochrome b6/f complex | petA, B (a), D (a), G, L, N
Photosynthesis | ATP synthase | atpA, B, E, F (a), H, I
Photosynthesis | Rubisco | rbcL
Other genes | Maturase | matK
Other genes | Protease | clpP (a)
Other genes | Envelope membrane protein | cemA
Other genes | Subunit acetyl-CoA-carboxylase | accD
Other genes | c-type cytochrome synthesis gene | ccsA
Other genes | Conserved open reading frames | ycf1 (b), 2 (b), 3 (a), 4

(a) Genes containing introns.

(b) Duplicated genes (genes present in the IR regions).

There are 16 genes with introns among the 112 genes of the four Polygonum chloroplast genomes (Table 3): 5 tRNA genes and 11 protein-coding genes. The tRNA genes are trnI-GAU, trnL-UAA, trnV-UAC, trnK-UUU, and trnA-UGC. The 11 protein-coding genes are ndhB, ndhA, atpF, petB, petD, rpoC1, ycf3, clpP, rps12, rpl16, and rpl2. The 5' end of the rps12 gene lies in the LSC region of the chloroplast genome, and its 3' end lies in the IR region. As in other angiosperm chloroplasts, rps12 is trans-spliced in the Polygonum chloroplast genome. Three of the 16 intron-containing genes have two introns (rps12, ycf3, and clpP), and the others contain only one intron. The longest intron (2,510 bp) is in trnK-UUU and contains the entire matK gene.
TABLE 3

Length of exons and introns in four Polygonum chloroplast genomes.

GeneRegionExon 1Intron 1Exon 2Intron 2Exon 3
Polygonum aviculare atpF LSC411770144
clpP IR32757629191569
ndhA SSC5411099551
ndhB IR756675777
petB IR6769642
petD IR8759475
rpl16 IR3999319
rpl2 IR435662393
rpoC1 LSC1613780430
rps12 IR1142758723252726
trnA-UGC IR3880135
trnI-GAU IR4294535
trnK-UUU LSC35251037
trnL-UAA LSC3757950
trnV-UAC LSC3758338
ycf3 LSC155744227739128
Polygonum bistorta atpF LSC411770144
clpP LSC32757629191569
ndhA SSC5411099551
ndhB IR756674777
petB LSC6769642
petD LSC8759475
rpl16 LSC3999319
rpl2 IR435662393
rpoC1 LSC1613780430
rps12 LSC+IR1147518423252726
trnA-UGC IR3880135
trnI-GAU IR4294535
trnK-UUU IR35251037
trnL-UAA LSC3757950
trnV-UAC LSC3758338
ycf3 IR155744227739128
Polygonum orientale atpF LSC411770144
clpP LSC32757629191569
ndhA SSC541559551
ndhB IR777675756
petB LSC6796642
petD LSC8759475
rpl16 LSC3999319
rpl2 LSC1613780430
rpoC1 LSC+IR27527231
rps12 LSC1147518423152727
trnA-UGC IR3880135
trnI-GAU IR4294535
trnK-UUU LSC35251037
trnL-UAA LSC3757950
trnV-UAC LSC3758338
ycf3 LSC155744227739128
Polygonum perfoliatum atpF LSC411770144
clpP LSC32757629191569
ndhA SSC5411099551
ndhB IR756675777
petB LSC6769642
petD LSC8759475
rpl16 LSC3999319
rpl2 IR435662393
rpoC1 LSC1613780430
rps12 IR1147508023252726
rps16 LSC22386344
trnA-UGC IR3880135
trnI-GAU IR4294535
trnK-UUU LSC35251037
trnV-UAC LSC3758338
ycf3 LSC155744227739128

Relative Synonymous Codon Usage Analysis

Relative synonymous codon usage (RSCU) is a measure of synonymous codon usage bias, evaluated over the 64 codons (Wu et al., 2007). RSCU is calculated as the ratio of the observed frequency of a codon to the average frequency of all synonymous codons for the same amino acid. RSCU values fall into three cases: greater than 1, less than 1, and equal to 1. A value greater than 1 indicates that the codon is used more frequently than its synonymous codons. A value less than 1 means that the other synonymous codons are used more frequently than this codon. A value equal to 1 indicates no bias in the use of the codon. In the four Polygonum chloroplast genomes, the total CDS length ranges from 80,286 to 83,403 bp, which accounts for about 50% of the total chloroplast genome length. The number of codons ranges from 26,762 to 27,801. The RSCU analysis showed codon usage bias for all amino acids except Trp and Met (Table 4).
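The RSCU definition given above can be made concrete with a minimal Python sketch. MEGA was used in the study; the code below is only an illustration under stated assumptions: the function name is hypothetical, the standard genetic code is assumed, and stop codons are treated as one synonymous family, as in Table 4.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        # Read complete codons only; codons with ambiguous bases are skipped.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group codons into synonymous families by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed; RSCU is undefined
        mean = total / len(codons)  # expected count under equal usage
        for c in codons:
            values[c] = counts[c] / mean
    return values
```

With this definition, the RSCU values within one family sum to the number of synonymous codons in that family, which is why the single-codon amino acids Met and Trp always have an RSCU of 1.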
TABLE 4

Codon–anticodon recognition patterns and codon usage of four Polygonum chloroplast genomes

AACodonRSCU value
P. aviculare P. bistorta P. orientale P. perfoliatum
StopUAA1.661.641.641.66
UAG0.760.730.750.67
UGA0.590.630.610.67
AlaGCA1.131.111.131.13
GCC0.70.740.720.68
GCG0.460.520.510.52
GCU1.71.641.641.68
CysUGC0.520.540.590.55
UGU1.481.461.411.45
AspGAC0.430.430.410.41
GAU1.571.571.591.59
GluGAA1.51.461.461.46
GAG0.50.540.540.54
PheUUC0.660.670.680.65
UUU1.341.331.321.35
GlyGGA1.541.561.531.57
GGC0.50.480.510.48
GGG0.740.740.730.7
GGU1.211.221.231.25
HisCAC0.490.470.480.5
CAU1.511.531.521.5
IleAUA0.930.970.960.95
AUC0.560.550.550.54
AUU1.511.481.491.51
LysAAA1.491.481.481.49
AAG0.510.520.520.51
LeuCUA0.850.850.830.84
CUC0.430.450.420.42
CUG0.380.380.40.39
CUU1.291.261.311.28
UUA1.851.841.771.83
UUG1.211.221.271.24
MetAUG1111
AsnAAC0.510.460.50.51
AAU1.491.541.51.49
ProCCA1.041.121.071.04
CCC0.760.780.820.79
CCG0.690.610.60.62
CCU1.511.491.511.55
GlnCAA1.521.511.51.51
CAG0.480.490.50.49
ArgAGA1.681.761.691.71
AGG0.770.740.80.76
CGA1.431.421.431.4
CGC0.360.40.360.37
CGG0.470.420.470.48
CGU1.291.271.261.29
SerAGC0.420.420.420.44
AGU1.171.21.161.16
UCA1.151.141.161.17
UCC0.971.030.990.97
UCG0.630.610.60.58
UCU1.661.61.661.67
ThrACA1.231.231.271.25
ACC0.730.730.70.7
ACG0.510.550.520.53
ACU1.531.491.511.52
ValGUA1.461.431.441.46
GUC0.560.570.60.58
GUG0.540.550.540.53
GUU1.441.451.421.43
TrpUGG1111
TyrUAC0.380.390.410.41
UAU1.621.611.591.59

Long-Repeat and Simple Sequence Repeat Analysis

We also analyzed the repeat sequences of the four Polygonum chloroplast genomes, including tandem repeats (T), forward repeats (F), reverse repeats (R), and palindromic repeats (P). The results are shown in Figure 2. There are 99 repeats in P. aviculare, including 49 tandem repeats, 22 forward repeats, 26 palindromic repeats, and 2 reverse repeats; 86 repeats in P. bistorta, including 36 tandem repeats, 23 forward repeats, 22 palindromic repeats, and 5 reverse repeats; 72 repeats in P. orientale, including 22 tandem repeats, 22 forward repeats, 22 palindromic repeats, and 6 reverse repeats; and 76 repeats in P. perfoliatum, including 27 tandem repeats, 20 forward repeats, 22 palindromic repeats, and 7 reverse repeats. Across all repeat categories, repeats 20–50 bp long are the most common (Figure 2).
FIGURE 2

Repeat sequences analysis of the four Polygonum cp genomes. (A) Repeat types in the four cp genomes; (B) tandem repeats in the four cp genomes; (C) forward repeats in the four cp genomes; (D) palindromic repeats in the four cp genomes. In (A), different colors show different repeat types; in (B–D), different colors show different lengths. The ordinate represents the number of repeats.

A simple sequence repeat (SSR), also known as a microsatellite, is a repeat sequence composed of tandemly repeated units of one to six bases, and it is of great significance to the study of plant populations. SSRs longer than 10 bp are prone to slipped-strand mispairing, which is considered the principal mutational mechanism of SSR polymorphisms (Asaf et al., 2017). Additionally, SSRs that vary at the intraspecific level in the chloroplast genome are often used as genetic markers in studies of population genetics and evolution (Yang et al., 2016; Zl et al., 2016). There are 228 SSRs in P. aviculare, including 159 mononucleotides, 57 dinucleotides, 3 trinucleotides, 6 tetranucleotides, and 3 pentanucleotides; 225 SSRs in P. bistorta, including 161 mononucleotides, 49 dinucleotides, 8 trinucleotides, and 7 tetranucleotides; 181 SSRs in P. orientale, including 136 mononucleotides, 38 dinucleotides, 3 trinucleotides, and 4 tetranucleotides; and 204 SSRs in P. perfoliatum, including 152 mononucleotides, 44 dinucleotides, 4 trinucleotides, and 4 tetranucleotides (Figure 3). Based on these SSR results, we designed 10 pairs of primers as molecular markers. P. orientale, P. cuspidatum, and P. aviculare were used for molecular marker amplification. The results showed that four pairs of primers could distinguish well between species (Supplementary Figure S1). Although these four primer pairs can distinguish the samples at the genus level, the PCR amplification products may derive from either chloroplast DNA or nuclear DNA, which requires further study.
FIGURE 3

SSR analysis of four cp genomes. The ordinate represents the number of SSRs.


Comparative Chloroplast Genomic Analysis and Sequence Variation

The chloroplast genome sequence of P. cuspidatum was used as the reference to compare the sequences of six Polygonum chloroplast genomes (P. cuspidatum, P. chinense, P. aviculare, P. bistorta, P. orientale, and P. perfoliatum). Figure 4 shows the regions they have in common. In the Polygonum chloroplast genomes, the LSC and SSC regions are visibly more variable than the IR regions. To further clarify the sequence variation, nucleotide diversity (Pi) was also calculated (Figure 5). Six divergent loci (psbI-trnS-GCU, rpoB-trnC-GCA, trnE-UUC-trnT-GGU, trnT-GGU-psbD, trnT-UGU-trnL-UAA, and rpl32-trnL-UAG) had a Pi value greater than 0.12. All six loci were intergenic regions; five were in the LSC region and one (rpl32-trnL-UAG) was in the SSC region, with none detected in the IR regions. These highly variable regions may help resolve the interspecific relationships of Polygonum within Polygonaceae. As chloroplast genome research advances and chloroplast DNA fragments are more widely used, some difficult problems in plant species authentication, phylogenetics, and related research can be solved. Chloroplast DNA fragments offer higher resolution than general DNA fragments, which show low discriminating power and low mutation rates in some closely related groups. These informative chloroplast fragments can therefore support species identification and studies of population diversity through comparison of the chloroplast genome sequences of different groups.
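The sliding-window Pi calculation (window length 800 bp, step size 200 bp, as given in the Methods) can be sketched in Python as follows. DnaSP was used in the study; this is a simplified stand-in with hypothetical function names, and it assumes that alignment columns containing gaps or ambiguous bases are excluded, which may differ from the DnaSP settings actually used.

```python
from itertools import combinations


def window_pi(seqs, start, end):
    """Average pairwise differences per site within seqs[start:end]."""
    # Keep only columns where every sequence has an unambiguous base.
    columns = [
        col for col in zip(*(s[start:end] for s in seqs))
        if all(base in "ACGT" for base in col)
    ]
    if not columns:
        return 0.0
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    diffs = sum(
        sum(1 for a, b in combinations(col, 2) if a != b) for col in columns
    )
    return diffs / (n_pairs * len(columns))


def sliding_pi(seqs, window=800, step=200):
    """Return a list of (start, end, pi) over an alignment of equal-length sequences."""
    seqs = [s.upper() for s in seqs]
    if len(seqs) < 2:
        raise ValueError("at least two aligned sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        end = min(start + window, length)
        results.append((start, end, window_pi(seqs, start, end)))
    return results
```

Windows whose Pi exceeds a chosen cutoff, such as the 0.12 used above, can then be selected with a list comprehension over the returned tuples and mapped back to the annotated intergenic regions.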
FIGURE 4

Comparative analysis of chloroplast genome differences in six Polygonum cp genomes. Gray arrows and thick black lines above the alignment indicate gene orientation. Purple bars represent exons, blue bars represent untranslated regions (UTRs), pink bars represent conserved non-coding sequences (CNS), and gray bars represent mRNA. The y-axis represents the percentage identity (shown: 50–100%).

FIGURE 5

Nucleotide diversity (Pi) among cp genomes of six Polygonum species.

IR Contraction and Expansion in Polygonum Chloroplast Genome

The IR-LSC and IR-SSC border structures and locations of the four Polygonum species were analyzed (Figure 6). The SSC/IRa junction was located within the ndhF gene in the chloroplast genomes of all four species, and the gene extended 60–63 bp into the IRa region. The rps19 gene was located in the IRa region and extended into the LSC region (P. orientale, 21 bp; P. perfoliatum, 40 bp). The ycf1 gene of the four Polygonum chloroplast genomes lies entirely within the IRb region, ending 225 bp (P. aviculare), 275 bp (P. bistorta), 261 bp (P. orientale), and 257 bp (P. perfoliatum) from the SSC/IRa border. The trnH gene was located in the LSC region, 2, 1, 3, and 5 bp from the LSC/IRb border in the chloroplast genomes of P. aviculare, P. bistorta, P. orientale, and P. perfoliatum, respectively.
FIGURE 6

Comparison of the chloroplast genome boundaries in four Polygonum cp genomes.

The expansion mechanism of the IR region is still debated; the double-strand break repair (DSBR) theory is regarded as the prime mechanism for the expansion of the IR region (Machour and Ayoub 2020). Large contractions of the IR region are unlikely. Furthermore, the DSBR model is believed to be the core mechanism not only of IR region expansion but also of IR region contraction.

Phylogenetic Analysis

Phylogenetic analysis was carried out on an alignment of concatenated nucleotide sequences of the shared genes from 19 angiosperm species. We used ML to construct a phylogenetic tree based on these gene data, with C. morifolium set as the outgroup. P. bistorta, P. orientale, P. perfoliatum, and P. chinense form one branch, namely Polygonum. This branch is further divided into two small sub-branches. The upper sub-branch includes P. orientale, P. perfoliatum, and P. chinense, and within it P. orientale and P. perfoliatum cluster together. P. aviculare is closer to F. sachalinensis, F. aubertii, and Polygonum cuspidatum, and is far from the Polygonum branch. Therefore, we placed P. aviculare and P. cuspidatum in the branch of Fallopia (Figure 7).
FIGURE 7

ML phylogenetic tree reconstruction containing the cp genomes of 19 plants. Chrysanthemum x morifolium was set as the out-group.
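
The tree was built from concatenated gene sequences. As an illustration of the concatenation step only, the sketch below joins per-gene alignments into a single supermatrix. It assumes each gene has already been aligned and that a taxon lacking a gene should be padded with gaps; the function name, the padding choice, and the toy data are assumptions, and the ML inference itself would be run in a dedicated phylogenetics program.

```python
def concatenate_alignments(gene_alignments, gap_char="-"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: list of dicts mapping taxon name -> aligned sequence.
    Within one gene, all sequences must have equal length. A taxon missing
    from a gene is padded with gap characters for that gene.
    """
    taxa = sorted({t for aln in gene_alignments for t in aln})
    parts = {t: [] for t in taxa}
    for aln in gene_alignments:
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene alignment differ in length")
        width = lengths.pop()
        for t in taxa:
            parts[t].append(aln.get(t, gap_char * width))
    return {t: "".join(chunks) for t, chunks in parts.items()}


# Toy data (hypothetical taxa and sequences).
genes = [
    {"sp1": "ATGC", "sp2": "ATGA", "sp3": "ATGG"},
    {"sp1": "TTA", "sp3": "TTG"},  # sp2 lacks this gene
]
supermatrix = concatenate_alignments(genes)
for taxon, seq in supermatrix.items():
    print(f">{taxon}\n{seq}")
```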


Discussion

Overall, we examined the chloroplast genomes of four Polygonum species, and the results indicate that they share the typical angiosperm features in both structure and content. The characteristic quadripartite structure of the Polygonum chloroplast genome is consistent with that reported for other medicinal angiosperms (He et al., 2017). As is common among angiosperm chloroplast genomes (Raubeson et al., 2007; Yang et al., 2010; Asaf et al., 2017), the AT content was higher than the GC content in all four Polygonum species, and overall there were no significant differences among their chloroplast genomes. The results also confirmed that the GC content was much higher in the IR regions, probably because of the large number of rRNA genes located there. However, the precise details are still poorly understood. The patterns observed in the coding and highly divergent regions of the Polygonum chloroplast genomes have also been reported in the chloroplast genomes of other plants (Ni et al., 2016; Chen et al., 2017).

Intron and exon lengths are important features of plant chloroplast genomes. Here, only one gene (rps12) contained three exons, while two genes (ycf3 and clpP) had two introns in the four Polygonum chloroplast genomes. The 5' end of rps12 is located in the LSC region, whereas its duplicated 3' end is located in the IR regions, which makes it a trans-spliced gene (Guo et al., 2018). Furthermore, ycf3 has been reported to be a photosynthesis-related gene (Naver et al., 2001). Therefore, the presence of ycf3 may merit further study in the Polygonum chloroplast.
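
The comparison of AT and GC content discussed above rests on a simple base count. The following minimal sketch shows one way to compute it for any sequence or region; the function name is hypothetical, and ignoring ambiguous symbols such as N is an assumption rather than the procedure used in the study.

```python
def gc_content(seq):
    """Return the GC fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # An empty or fully ambiguous sequence has no defined GC fraction; return 0.0.
    return gc / total if total else 0.0


# Toy sequence, not taken from the Polygonum genomes.
print(gc_content("ATGCGCTTAA"))
```

Applying the same function separately to the LSC, SSC, and IR regions would allow the regional GC differences to be compared directly.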
The ycf1 gene also plays an essential role in the chloroplast genome. Related studies of ycf1 function have reported it as an important pseudogene in some chloroplast genomes and as encoding Tic214 in plants (de Cambiaire et al., 2006; Nakai 2015). An earlier study showed that introns play a significant part in regulating gene expression (Xu et al., 2017) and may control expression levels in different spatiotemporal contexts (Le Hir et al., 2003; Niu and Yang 2011). Additionally, the presence of introns and the loss of genes have been observed in chloroplast genomes (Graveley 2001; Ueda et al., 2007), and the regulatory function of introns still needs to be clarified through research across a large number of plant chloroplast genomes (Emami et al., 2013). However, there is currently no research on intron regulation mechanisms in Polygonum. Further studies are needed to determine the functions of introns in these chloroplast genomes.

These chloroplast genome data will provide an important theoretical basis for plant identification, especially of medicinal plants. Identifying long repeat sequences and SSRs in chloroplast genomes is important for characterizing plant germplasm resources and developing molecular markers. Our results suggest that genes containing long repeats could be developed as genetic markers for identifying related species, although their practical use still needs to be verified by further studies. SSRs play a crucial role in chloroplast genomes; because of their high variability, they are widely used in genetic research (Ebert and Peakall 2009; Dong et al., 2013).
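
To make the notion of an SSR concrete, the sketch below scans a sequence for perfect mono-, di-, and trinucleotide repeats with a regular expression. The minimum repeat counts and the function name are assumptions chosen for illustration and are not the thresholds or the software used in the study.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the paper).
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `start` is a 0-based offset into `seq`.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A motif of length k followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if k > 1 and len(set(motif)) == 1:
                continue  # homopolymer run, left to the mononucleotide pass
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence with one mononucleotide and one dinucleotide repeat.
toy = "GC" + "A" * 12 + "GC" + "AT" * 6 + "G"
print(find_ssrs(toy))
```

Counting the returned tuples by motif length gives the kind of repeat-type tally on which comparisons of mononucleotide abundance are based.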
Previous studies showed that SSRs are widely distributed in the genome and, because of their uniparental mode of inheritance, are commonly used to analyze population genetic structure and maternal lineages. Earlier work found that mononucleotide repeats are the most abundant repeat type in A. formosae, and the same pattern was observed in the four Polygonum chloroplast genomes. Consequently, the SSRs identified in these chloroplast genomes should support studies of species identification, genetic diversity, and evolution in Polygonum (Provan 2000; Flannery et al., 2006).

DNA barcoding, proposed by Hebert (Hebert et al., 2003; Hu et al., 2019), identifies species from DNA sequences such as ITS2, matK, psbA-trnH, and rbcL. However, identifying closely related species, particularly morphologically similar species within the same genus, remains problematic. For that reason, finding a suitable DNA marker for such species is essential. The cp genomes have commonly been used for phylogenetic studies and species identification because they evolve more slowly than nuclear genomes (Song et al., 2017). In the current study, alignment of five Polygonum cp genomes revealed an increased number of variable sites in intergenic spacers such as atpI-rps2, atpB-rbcL, psbD-rps14, and ycf4-cemA. Thus, these regions may be used as candidate fragments for identifying Polygonum species. Moreover, ycf1a or ycf1b is reported to be the most variable plastid genome region and can be used as a core barcode for land plants (Dong et al., 2015). However, more Polygonum cp sequence data are needed to test these candidates, which should be addressed in future studies.

Earlier studies showed that the IR regions are the most conserved regions in chloroplast genomes (Daniell et al., 2016).
Contraction and expansion of the IR regions at their borders are common evolutionary events and are the main cause of size variation and rearrangement in the chloroplast genome. Many reports show that chloroplast genes are conserved in most land plants, but others describe sequence rearrangements in the chloroplast genomes of many plant species, involving IR contraction and expansion with inversions, inversions in the LSC region, and re-inversion in the SSC region. Some reports showed that the extensive rearrangements in the chloroplast genome of Trachelium caeruleum are associated with repeats and tRNA genes (Raubeson and Jansen 2005; Haberle et al., 2008). Sequence rearrangements caused by changes in chloroplast genome structure in related species may carry information on plant genetic diversity and can therefore be used for molecular identification and evolutionary studies (Chumley et al., 2006).

The chloroplast genome is now known to serve as a super barcode for identifying plant species (Hernández-León et al., 2013). Based on phylogenetic analysis of the chloroplast genomes of four Polygonum species, we suggest that the chloroplast genome of Polygonum might be a key marker for species identification. Further research is necessary to test this assumption. The results are of value for research on the genetic diversity and phylogeny of Polygonum. Nevertheless, our research did not fully resolve the relationships between genera. Moreover, our phylogenetic study is based only on the chloroplast genome. Fully resolving the phylogeny of Polygonum species may require the study of nuclear genes, and more genera should be included in the future.
Nonetheless, our phylogenetic research provides valuable resources for the classification, phylogeny, and evolutionary history of Polygonum.
References (72 in total)

1.  Alternative splicing: increasing diversity in the proteomic world.

Authors:  B R Graveley
Journal:  Trends Genet       Date:  2001-02       Impact factor: 11.639

2.  OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes.

Authors:  Marc Lohse; Oliver Drechsel; Ralph Bock
Journal:  Curr Genet       Date:  2007-10-24       Impact factor: 3.886

3.  Chloroplast simple sequence repeats (cpSSRs): technical resources and recommendations for expanding cpSSR discovery and applications to a wide array of plant species.

Authors:  Daniel Ebert; Rod Peakall
Journal:  Mol Ecol Resour       Date:  2009-01-28       Impact factor: 7.090

4.  DnaSP 6: DNA Sequence Polymorphism Analysis of Large Data Sets.

Authors:  Julio Rozas; Albert Ferrer-Mata; Juan Carlos Sánchez-DelBarrio; Sara Guirao-Rico; Pablo Librado; Sebastián E Ramos-Onsins; Alejandro Sánchez-Gracia
Journal:  Mol Biol Evol       Date:  2017-12-01       Impact factor: 16.240

5.  Complete plastid genome sequences of three Rosids (Castanea, Prunus, Theobroma): evidence for at least two independent transfers of rpl22 to the nucleus.

Authors:  Robert K Jansen; Christopher Saski; Seung-Bum Lee; Anne K Hansen; Henry Daniell
Journal:  Mol Biol Evol       Date:  2010-10-08       Impact factor: 16.240

6.  VISTA: computational tools for comparative genomics.

Authors:  Kelly A Frazer; Lior Pachter; Alexander Poliakov; Edward M Rubin; Inna Dubchak
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

7.  Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads.

Authors:  Hongshan Jiang; Rong Lei; Shou-Wei Ding; Shuifang Zhu
Journal:  BMC Bioinformatics       Date:  2014-06-12       Impact factor: 3.169

8.  Complete Chloroplast Genome Sequence and Phylogenetic Analysis of the Medicinal Plant Artemisia annua.

Authors:  Xiaofeng Shen; Mingli Wu; Baosheng Liao; Zhixiang Liu; Rui Bai; Shuiming Xiao; Xiwen Li; Boli Zhang; Jiang Xu; Shilin Chen
Journal:  Molecules       Date:  2017-08-11       Impact factor: 4.411

9.  The complete chloroplast genome sequences of four Viola species (Violaceae) and comparative analyses with its congeneric species.

Authors:  Kyeong-Sik Cheon; Kyung-Ah Kim; Myounghai Kwak; Byoungyoon Lee; Ki-Oug Yoo
Journal:  PLoS One       Date:  2019-03-20       Impact factor: 3.240

10.  Complete chloroplast genome sequence and phylogenetic analysis of Camellia fraterna.

Authors:  Bo Yu; Ying-Bo Sun; Xiao-Fei Liu; Li-Li Huang; Ye-Chun Xu; Chao-Yi Zhao
Journal:  Mitochondrial DNA B Resour       Date:  2020-12-24       Impact factor: 0.658

