
Comprehensive comparative analysis of chloroplast genomes from seven Panax species and development of an authentication system based on species-unique single nucleotide polymorphism markers.

Van Binh Nguyen¹, Vo Ngoc Linh Giang¹, Nomar Espinosa Waminal¹, Hyun-Seung Park¹, Nam-Hoon Kim¹, Woojong Jang¹, Junki Lee¹, Tae-Jin Yang^1,2.

Abstract

BACKGROUND: Panax species are important herbal medicinal plants in the Araliaceae family. Recently, we reported the complete chloroplast genomes and 45S nuclear ribosomal DNA sequences from seven Panax species, two (P. quinquefolius and P. trifolius) from North America and five (P. ginseng, P. notoginseng, P. japonicus, P. vietnamensis, and P. stipuleanatus) from Asia.
METHODS: We conducted phylogenetic analysis of these chloroplast sequences with 12 other Araliaceae species and comprehensive comparative analysis among the seven Panax whole chloroplast genomes.
RESULTS: We identified 1,128 single nucleotide polymorphisms (SNPs) in coding gene sequences, distributed among 72 of the 79 protein-coding genes in the chloroplast genomes of the seven Panax species. The other seven genes (psaJ, psbN, rpl23, psbF, psbL, rps18, and rps7) were identical among the Panax species. We also discovered that 12 large chloroplast genome fragments were transferred into the mitochondrial genome, based on more than 90% sequence similarity. The total size of the transferred fragments was 60,331 bp, corresponding to approximately 38.6% of the chloroplast genome. We developed 18 SNP markers from chloroplast coding sequence regions that were not similar to regions in the mitochondrial genome. These markers included two or three species-specific markers for each species and can be used to distinguish each of the seven Panax species from the others.
CONCLUSION: The comparative analysis of chloroplast genomes from seven Panax species elucidated their genetic diversity and evolutionary relationships, and 18 species-specific markers were able to discriminate among these species, thereby furthering efforts to protect the ginseng industry from economically motivated adulteration.


Keywords: Araliaceae evolution; Chloroplast genome; Ginseng authentication; Panax species; dCAPS markers

Year: 2018 PMID: 32148396 PMCID: PMC7033337 DOI: 10.1016/j.jgr.2018.06.003

Source DB: PubMed Journal: J Ginseng Res ISSN: 1226-8453 Impact factor: 6.060

Introduction

Panax (ginseng) species are widely distributed from high-altitude freeze-free regions, including the Eastern Himalayas, the Hoang Lien Son, and the Annamite mountain range, to the freezing winter regions of Northeastern Asia and North America. Ginseng contains many important pharmaceuticals that have been used in traditional medicine for thousands of years. Ginseng is also becoming one of the most important national agricultural commodities, not only in Asian countries such as Korea, China, and Vietnam but also in Russia, Canada, and the United States. Of the 14 known species in the Panax genus, five species (Panax ginseng, P. quinquefolius, P. notoginseng, P. japonicus, and P. vietnamensis) are used as expensive herbal medicines in Korea, the United States, China, Japan, and Vietnam. However, limited genetic information is available on other species such as P. stipuleanatus and P. trifolius. Notable therapeutic effects of ginseng on life-threatening diseases such as neurodegenerative diseases [1], [2], cardiovascular diseases [3], diabetes [4], and cancer [5], [6] are well documented. Owing to the high pharmacological and economic value of ginseng, many economically motivated adulterations (EMAs) of ginseng products have occurred [7]. Traditional methods for authentication of herbal plants primarily depend on morphological and histological characteristics. However, these methods are not precise enough to distinguish among ginseng species because the species look alike and because growing conditions cause morphological differences within a species. Moreover, almost all commercial ginseng products are sold in forms such as dried root, powder, liquid extracts, or other processed products, which are impossible to authenticate based on morphology.
Methods of ginsenoside profiling have been developed for authentication of ginseng [8], [9], [10], [11]; however, their applications are limited because ginsenosides are secondary metabolites and their accumulation varies among different tissues (such as roots, leaves, stems, flower buds, and berries) [12], [13], cultivars [14], age [12], [15], environmental conditions [16], [17], storage conditions, and manufacturing processes [7]. Chloroplasts are multifunctional organelles required for photosynthesis and carbon fixation that contain their own genetic material. Chloroplast genomes are highly conserved in plants, with a quadripartite structure comprising two copies of inverted repeat (IR) regions that separate the large and small single-copy (LSC and SSC, respectively) regions. The chloroplast genome size in angiosperms ranges from 115 to 165 kb [18]. Since the emergence of next-generation sequencing, the number of completely sequenced chloroplast genomes has rapidly increased. As of September 2017, more than 1,541 complete chloroplast genomes from land plants were available in the GenBank Organelle Genome Resources. Of these, five chloroplast genomes from the Panax genus have been sequenced [19]. Sequence-based DNA markers are advantageous and powerful tools for species identification, offering high accuracy, simplicity, and time and cost efficiency [7]. Various types of DNA markers have been applied to the authentication of Panax species, including nuclear genome-derived random amplified polymorphic DNA [20], microsatellites [21], and expressed sequence tag–simple sequence repeats [22], [23]. However, these nuclear genome-derived DNA markers are usually used to analyze intraspecies-level diversity. DNA markers derived from the chloroplast genome have been widely used and are considered to be the best barcoding targets for plant species identification [24] because of their highly conserved structure and high copy numbers that are easily detected.
Chloroplast genome divergence is lower at the intraspecific level and higher at the interspecific level. Recently, chloroplast-derived DNA markers were developed to authenticate ginseng, including markers of single nucleotide polymorphisms (SNPs) and insertions or deletions (InDels) [7], [25], [26], [27]. However, these markers are still of limited use because of the lack of genomic information on intraspecies and interspecies variation. Recently, we obtained complete chloroplast genome and nuclear ribosomal DNA sequences from five major Panax species [19], [25] and two basal Panax species [28] from either Asia or North America by de novo assembly using low-coverage whole-genome shotgun next-generation sequencing (dnaLCW) [29]. Using this information, we previously developed InDel-based authentication markers among the five species [7]. Although these markers are easy to apply, their usefulness is somewhat limited by the relatively rich intraspecies polymorphism at the InDel regions. In this study, we conducted a comprehensive comparative genomics study of the chloroplast genomes from the seven Panax species and identified 18 chloroplast CDS-derived SNP markers that can be used to authenticate each of the seven species. This study provides valuable genetic information as well as a practical marker system for authentication of each Panax species that will be very helpful for regulating the ginseng industry.

Materials and methods

Plant materials and genomic DNA extraction

P. ginseng cultivars and P. quinquefolius plants were collected from the ginseng farm at Seoul National University in Suwon, Korea. P. notoginseng and P. japonicus plants were collected from Dafang County, Guizhou Province, and Enshi County, Hubei Province, China, respectively. P. vietnamensis and P. stipuleanatus plants were collected from Kon Tum and Lao Cai Province, Vietnam, respectively. P. trifolius plants were collected from northeastern North America. DNA was extracted from leaves and roots using a modified cetyltrimethylammonium bromide method [30]. The quality and quantity of extracted genomic DNA were measured using a UV spectrophotometer and agarose gel electrophoresis.

Phylogenetic analysis

Phylogenetic tree construction and the reliability assessment of internal branches were conducted using the maximum likelihood method with 1,000 bootstrap replicates using MEGA 6.0 [31].

Comparative analysis of 79 protein-coding genes between seven Panax species

The chloroplast genome sequences of 11 P. ginseng cultivars (ChP_KM088019, YP_KM088020, GU_KM067388, GO_KM067387, SP_KM067391, SO_KM067390, SU_KM067392, SH_KM067393, CS_KM067386, HS_KM067394, JK_KM067389), two P. quinquefolius (KM088018, KT028714), four P. notoginseng (KP036468, KT001509, NC_026447, KR021381), one P. japonicus (KP036469), two P. vietnamensis (KP036471, KP036470), one P. stipuleanatus (KX247147), and one P. trifolius (MF100782) were obtained from our previous studies [19], [25], [28] and GenBank. Chloroplast protein-coding gene sequences were extracted using Artemis [32] and manually curated. These chloroplast CDS regions were concatenated and aligned using the MAFFT program (http://mafft.cbrc.jp/alignment/server/). The SNPs in the 79 CDSs were identified using MEGA 6 [31] and then plotted on the chloroplast CDS maps of the seven Panax species using Circos v.0.67 [33].
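The SNP identification step can be illustrated with a minimal Python sketch. It is not the MEGA 6 procedure used in this study; it only shows the underlying idea of scanning an alignment column by column. The function name `find_snp_sites` and the toy sequences are hypothetical, and columns containing gaps or ambiguous bases are skipped on the assumption that InDels are handled separately.

```python
from typing import Dict, List, Tuple


def find_snp_sites(alignment: Dict[str, str]) -> List[Tuple[int, Dict[str, str]]]:
    """Return (1-based column, {name: base}) for every column of an
    alignment that carries more than one distinct nucleotide.

    Columns with gaps or ambiguous bases are skipped, so InDels are
    not counted as SNPs in this sketch.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    sites = []
    for col in range(length):
        column = {name: alignment[name][col].upper() for name in names}
        bases = set(column.values())
        if not bases <= set("ACGT"):
            continue  # gap or ambiguity code: not treated as a SNP
        if len(bases) > 1:
            sites.append((col + 1, column))
    return sites


# Hypothetical toy alignment (not real Panax sequence data).
toy = {
    "sp1": "ATGGCA-TTA",
    "sp2": "ATGGCAGTTA",
    "sp3": "ATGACAGTCA",
}
snps = find_snp_sites(toy)
for position, bases in snps:
    print(position, bases)
```

In this toy input the gapped column is ignored and two variable columns are reported; a real analysis would take the concatenated MAFFT alignment of the 79 CDSs as input.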

Identification of chloroplast gene insertion in mitochondria

The mitochondrial genome of P. ginseng was retrieved from GenBank (KF735063) and mapped to chloroplast genomes to eliminate BLAST hits of transferred genes between chloroplast and mitochondrial genomes. The maps of chloroplast and mitochondrial genomes from Panax species as well as the fragments of gene transfers were drawn using Circos v.0.67 [33].

Development and validation of derived cleaved amplified polymorphic sequence markers

To discriminate among the seven Panax species, we used polymorphisms in the chloroplast CDSs. For this, derived cleaved amplified polymorphic sequence (dCAPS) primers were designed based on SNP sites after eliminating intraspecies polymorphic sites and chloroplast gene transfer regions. The dCAPS primers were designed to create restriction enzyme cut sites using dCAPS Finder 2.0 (http://helix.wustl.edu/dcaps/dcaps.html), and the specific primers were designed using the Primer3 program (http://bioinfo.ut.ee/primer3-0.4.0/). Polymerase chain reaction (PCR) was carried out in a 25 μl reaction mixture containing 2.5 μl of 10× reaction buffer, 1.25 mM deoxynucleotide triphosphate, 5 pmol of each primer, 1.25 units of Taq DNA polymerase (Inclone, Korea), and 20 ng of DNA template. The PCR was performed in thermocyclers using the following cycling parameters: 94°C (5 min); 35 cycles of 94°C (30 s), 56–62°C (30 s), and 72°C (30 s); then 72°C (7 min). PCR products were visualized on agarose gels (2.0–3.0%) containing safe gel stain (Inclone, Korea). Analytical restriction enzyme reactions were performed in a volume of 10 μl containing 5 μl of PCR product, 1 μl of 10× restriction enzyme buffer, and 0.3 μl (10 units) of restriction enzyme. The reaction mixtures were incubated at the optimum temperature for 3 hours or overnight, then visualized on agarose gels (2.0–3.0%) containing safe gel stain.
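The dCAPS principle, in which the primer supplies most of a restriction site and the SNP allele completes it, can be sketched as follows. This is an illustration only, not the dCAPS Finder algorithm. The helper `site_created` is hypothetical. The primer is the second Pgdm1 primer listed in Table 3, and ClaI recognizes ATCGAT. Two points are assumptions that the text does not state: that the target SNP lies immediately 3′ of this primer, and that the primer reads the strand complementary to the alleles in Table 3 (so that the A in P. ginseng appears as T and the G in the other species appears as C).

```python
def site_created(primer: str, following: str, site: str) -> bool:
    """Return True if `site` occurs in primer + following bases
    but is absent from the primer alone, i.e. the bases after the
    primer's 3' end are what complete the recognition sequence."""
    primer, following, site = primer.upper(), following.upper(), site.upper()
    return site in (primer + following) and site not in primer


CLAI_SITE = "ATCGAT"  # ClaI recognition sequence

# Second Pgdm1 primer from Table 3; its 3' end is ...ATCGA.
primer = "GTAGCCTATAGTTATAGTAGATTAATCGA"

# Assumed strand orientation: P. ginseng allele read as T, others as C.
cut_with_t = site_created(primer, "T", CLAI_SITE)
cut_with_c = site_created(primer, "C", CLAI_SITE)

print("allele T creates ClaI site:", cut_with_t)
print("allele C creates ClaI site:", cut_with_c)
```

Under these assumptions only the allele that completes ATCGAT would be digested, which matches the expectation that a species-specific marker gives a different band pattern in one species after digestion.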

Results

Characteristics of the complete chloroplast genomes from seven Panax species

Complete chloroplast genome length from the seven Panax species ranged from 155,993 bp to 156,466 bp (Table 1). These chloroplast genomes had a typical quadripartite structure, consisting of a pair of IRs separated by the LSC and SSC regions (Fig. 1). There were no structural variations except for small InDels and SNPs. Each genome contained 113 functional genes, including 79 protein-coding genes, 30 transfer RNA genes, and four ribosomal RNA genes. The gene map for the seven Panax chloroplast genomes is shown in Fig. 1.

Table 1

Chloroplast genome sequences used in this study

Species	Whole genome sequencing data (Mb)	Sequence reads used: amount (Mb)	Sequence reads used: chloroplast coverage (x)	Chloroplast genome length (bp) (GenBank accession)
P. ginseng	10,418	505	97	156,248 (KM088019)
P. quinquefolius	3,557	1,010	127	156,088 (KM088018)
P. notoginseng	5,619	2,811	246	156,466 (KP036468)
P. japonicus	5,738	2,870	237	156,188 (KP036469)
P. vietnamensis	7,541	4,586	1,005	155,993 (KP036470)
P. stipuleanatus	2,218	599	154	156,064 (KX247147)
P. trifolius	14,657	2,300	993	156,157 (MF100782)

Fig. 1

Complete chloroplast genomes from seven Panax species. Colored boxes represent conserved chloroplast genes that were classified based on product function. Genes shown inside the circle are transcribed clockwise, and those outside the circle are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. The dashed area in the inner circle indicates the GC content.


Phylogenomic analysis of 19 complete chloroplast genomes from Araliaceae species

Phylogenetic relationships were inferred using the entire chloroplast genome sequences from 19 species in the Araliaceae family. The results indicate that the nine genera in Araliaceae were divided into two typical monophyletic lineages consisting of the Aralia–Panax group and the other group with the seven remaining genera (Fig. 2). Species from each genus, Panax, Aralia, Schefflera, Dendropanax, Eleutherococcus, Brassaiopsis, Fatsia, Kalopanax, and Metapanax were grouped accordingly. Based on the phylogenetic tree, the seven Panax species were divided among a few subgroups, in which P. stipuleanatus and P. trifolius diverged from the common ancestor earlier than the other five Panax species (Fig. 2).

Fig. 2

ML phylogenetic tree of seven Panax species. Numbers at the nodes are the bootstrap support values from 1,000 replicates. Black triangles indicate tetraploid Panax species. The chloroplast sequence of carrot (Daucus carota) was used as an outgroup. ML, maximum likelihood.

SNPs in chloroplast genomes of seven Panax species

SNPs were identified in the chloroplast genomes from seven Panax species and were compared among them to develop SNP-derived markers for authentication. In total, 1,783 SNP sites were identified in the whole chloroplast genome sequences of seven species, and of these, 1,128 sites were in protein-coding regions, i.e., CDSs. Despite having more SNP sites, the total number of SNP types in CDS regions accounted for less than half of all SNPs in whole chloroplast genome sequences because multiple SNP types were often found in a given site in the noncoding regions (Table 2). The two closest tetraploid species (P. ginseng and P. quinquefolius) had a lower number of SNPs in both CDSs and whole chloroplast genome sequences than any other pair (Table 2). P. trifolius had the highest numbers of SNPs in both CDS and whole chloroplast sequences in comparison with each of the six other species (Table 2). SNPs were distributed in 72 of the 79 protein-coding gene sequences of seven Panax species, the exceptions being seven highly conserved genes including psaJ, psbN, rpl23, psbF, psbL, rps18, and rps7 (Fig. 3). SNP density was lower in IR regions than in LSC and SSC regions (Fig. 3).

Table 2

Number of SNPs among seven Panax chloroplast genomes.

		Whole chloroplast genomes
		PG	PQ	PN	PJ	PV	PS	PT
CDS regions	PG	/	131	460	495	531	1157	1485
	PQ	59	/	493	496	518	1145	1479
	PN	171	210	/	476	535	1159	1514
	PJ	183	220	183	/	316	1150	1513
	PV	246	245	243	157	/	1196	1555
	PS	497	534	522	524	566	/	1484
	PT	594	610	621	624	664	639	/

Number of SNPs from the entire chloroplast genome sequences and number of SNPs from 79 protein coding sequences are shown above and below the self-comparison diagonal, respectively.
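The pairwise counts in Table 2 can be checked programmatically. The following Python sketch transcribes the table and looks up the closest and most divergent species pairs. The variable and function names are illustrative, and the code is not part of the original analysis.

```python
species = ["PG", "PQ", "PN", "PJ", "PV", "PS", "PT"]

# Table 2 as printed: upper triangle = whole chloroplast genome,
# lower triangle = 79 CDSs, None = self-comparison diagonal.
table2 = [
    [None, 131, 460, 495, 531, 1157, 1485],
    [59, None, 493, 496, 518, 1145, 1479],
    [171, 210, None, 476, 535, 1159, 1514],
    [183, 220, 183, None, 316, 1150, 1513],
    [246, 245, 243, 157, None, 1196, 1555],
    [497, 534, 522, 524, 566, None, 1484],
    [594, 610, 621, 624, 664, 639, None],
]


def pairwise(matrix, upper):
    """Collect {(species_i, species_j): SNP count} from one triangle."""
    pairs = {}
    n = len(species)
    for i in range(n):
        for j in range(i + 1, n):
            pairs[(species[i], species[j])] = matrix[i][j] if upper else matrix[j][i]
    return pairs


whole = pairwise(table2, upper=True)
cds = pairwise(table2, upper=False)

closest_whole = min(whole, key=whole.get)
closest_cds = min(cds, key=cds.get)
most_divergent_whole = max(whole, key=whole.get)

print("closest pair (whole genome):", closest_whole, whole[closest_whole])
print("closest pair (CDS):", closest_cds, cds[closest_cds])
print("most divergent pair (whole genome):", most_divergent_whole,
      whole[most_divergent_whole])
```

Both triangles give P. ginseng and P. quinquefolius as the closest pair, and every species has its largest count against P. trifolius, in line with the text.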

Fig. 3

Single nucleotide polymorphic sites in 79 protein-coding genes from seven Panax species. The inner track shows the 79 chloroplast CDS genes. Track A represents the total SNPs in all seven Panax species. Tracks B–G represent SNPs in P. trifolius, P. stipuleanatus, P. vietnamensis, P. japonicus, P. notoginseng, and P. quinquefolius compared to P. ginseng. The red, green, blue, and black lines on each track indicate the four kinds of SNPs (T, A, C, and G nucleotides), respectively. Yellow lines indicate InDel regions.

CDS, coding sequence; InDel, insertion or deletion; SNP, single nucleotide polymorphism.


Characterization of chloroplast genome transfer into the mitochondrial genome

The mitochondrial genome sequence of P. ginseng retrieved from GenBank is 464,680 bp, approximately three times the size of the chloroplast genome, and contains 94 functional genes (Fig. 4). We identified 12 large chloroplast genome fragments in the mitochondrial genome. The fragments ranged from 2,297 to 8,250 bp and retained ≥90% sequence identity with their original chloroplast counterparts (Fig. 4). The combined total size of these fragments was 60,331 bp, which corresponds to approximately 38.6% of the chloroplast genome (Fig. 4). Collectively, the gene transfer regions spanned almost 49 chloroplast genes as well as intergenic regions (Fig. 4).
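The proportions quoted above can be reproduced with simple arithmetic. The sketch below assumes that the denominator is the P. ginseng chloroplast genome length from Table 1 (156,248 bp, KM088019); the text does not state which chloroplast genome length was used.

```python
# Assumed denominator: P. ginseng chloroplast genome (Table 1, KM088019).
CP_LENGTH_BP = 156_248
MT_LENGTH_BP = 464_680    # P. ginseng mitochondrial genome (KF735063)
TRANSFERRED_BP = 60_331   # combined size of the 12 transferred fragments

transferred_pct = 100 * TRANSFERRED_BP / CP_LENGTH_BP
size_ratio = MT_LENGTH_BP / CP_LENGTH_BP

print(f"transferred fraction of chloroplast genome: {transferred_pct:.1f}%")
print(f"mitochondrial / chloroplast genome size: {size_ratio:.2f}x")
```

With this denominator the transferred fraction rounds to 38.6% and the size ratio to about 2.97, consistent with the figures in the text.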

Fig. 4

Schematic representation of gene transfer between the chloroplast and mitochondrial genomes from P. ginseng. Each gray line within the circle shows a region of the chloroplast genome that has been inserted into the indicated location in the mitochondrial genome. Colored boxes show conserved chloroplast genes, classified based on product function. Genes shown inside the circle are transcribed clockwise, and those outside the circle are transcribed counterclockwise.

Identification of species-specific SNP markers for authentication of the seven Panax species

A total of 18 dCAPS markers were selected from the species-specific SNP targets among the seven Panax species. Each of these SNP targets was derived from CDS regions and showed a unique polymorphism in one of the seven Panax species. At least two unique dCAPS markers were selected for each species, for a total of 18 (Table 3). Each of these markers resulted in the expected band sizes before and after restriction enzyme digestion (Fig. 5). Markers Pgdm1–3 that were derived from the rpl20, ndhK, and rps15 gene sequences, respectively, were specific to P. ginseng and resulted in different band sizes when digested compared to other species (Fig. 5). Markers Pqdm4–6 were derived from rpoC1, ndhA, and ndhK sequences, respectively, and resulted in a unique digestion pattern for P. quinquefolius (Fig. 5). Markers Pndm7–9 were derived from rpoC1, rpoC2, and ndhK sequences, respectively, and resulted in a unique digestion pattern for P. notoginseng (Fig. 5). Markers Pjdm10 and 11 were derived from the rpoC2 and rpoB sequences, respectively, and their digestion pattern was unique for P. japonicus, while markers Pvdm12 and 13 were derived from rpoC2 and ndhH genes, respectively, and resulted in a digestion pattern that was unique for P. vietnamensis (Fig. 5). Markers Psdm14–16 were derived from psbB, rpoC1, and rpoB, respectively, and resulted in a digestion pattern that was unique for P. stipuleanatus (Fig. 5). Two markers, Ptdm17 and 18 were derived from ndhA and rpoC1, respectively, and resulted in a unique digestion pattern for P. trifolius (Fig. 5). All 18 markers were practical and successful for distinguishing among the seven Panax species and can therefore be applied to ginseng species authentication.

Table 3

Details for the dCAPS markers developed to authenticate Panax species

Marker ID	Primer sequence (5′-3′)	Location	Tm (°C)	PCR product size (bp)	Digestion enzyme	Target SNP: Pg	Pq	Pn	Pj	Pv	Ps	Pt
Pgdm1	GTTTAAATTATTCCGGTGGATTCTT	rpl20	59.2	170	Cla1	A	G	G	G	G	G	G
Pgdm1	GTAGCCTATAGTTATAGTAGATTAATCGA	rpl20	63.4	170	Cla1	A	G	G	G	G	G	G
Pgdm2	GTCCGCTTGTCTAGGACTCG	ndhK	62.5	177	Cla1	A	G	G	G	G	G	G
Pgdm2	CAAAATTCAGTTATTTCAACTACATCAAT	ndhK	60.5	177	Cla1	A	G	G	G	G	G	G
Pgdm3	ATCCAACCGACCAATTAATTCTTTA	rps15	59.2	219	Sma1	C	T	T	T	T	T	T
Pgdm3	TTGAAAGAGGAAAACAAAGACACCC	rps15	62.5	219	Sma1	C	T	T	T	T	T	T
Pqdm4	TATGACCGTCCCTCATCGGTTGTCG	rpoC1	69.1	212	Sal1	G	A	G	G	G	G	G
Pqdm4	CATTCAGATAGATGGGGGTAAACTA	rpoC1	62.5	212	Sal1	G	A	G	G	G	G	G
Pqdm5	CTCGTAAACCACCTAAAAAGGAAT	ndhA	60.1	206	Cla1	C	T	C	C	C	C	C
Pqdm5	TCGTTTATTCAGTATCGGACCATCG	ndhA	64.2	206	Cla1	C	T	C	C	C	C	C
Pqdm6	TTCCGGCTGTTAAAATTAGGTCAGC	ndhK	63	167	Alu1	T	C	T	T	T	T	T
Pqdm6	TCTTTCAAATTGGTCAAGACTCTCT	ndhK	60.9	167	Alu1	T	C	T	T	T	T	T
Pndm7	CCTATTTACACAAATACCCCGTCGA	rpoC1	64.2	223	Sal1	T	T	C	T	T	T	T
Pndm7	ATTAGTTCGTAAAGGATTCAATGCAG	rpoC1	61.6	223	Sal1	T	T	C	T	T	T	T
Pndm8	TTCATTTGATCTTGATCCTTGTG	rpoC2	57.5	216	HindIII	A	A	G	A	A	A	A
Pndm8	TCCACTTTGAATTTTAAAGAGAAGCT	rpoC2	60.0	216	HindIII	A	A	G	A	A	A	A
Pndm9	ATCGACAGGAATTAGCTTATCGAC	ndhK	61.8	238	Cla1	G	G	A	G	G	G	G
Pndm9	ACGATTCGACTTTGATCGTTATCGA	ndhK	62.5	238	Cla1	G	G	A	G	G	G	G
Pjdm10	TGGATATCTCCAGAAAATATTTTAAGTAC	rpoC2	62.0	250	Sal1	A	A	A	T	G	A	A
Pjdm10	AGGATTTGATTGAGTATCGAGGAG	rpoC2	61.8	250	Sal1	A	A	A	T	G	A	A
Pjdm11	AGTCCGACATTTATTCCTTCAGAC	rpoB	61.8	172	Rsa1	T	T	T	C	T	T	T
Pjdm11	GTTTTGGATCGAACTAATCCATTGGT	rpoB	63.2	172	Rsa1	T	T	T	C	T	T	T
Pvdm12	TGCGCGAATCTCAGCAATCACTAG	rpoC2	65.3	195	Spe1	T	T	T	T	C	T	T
Pvdm12	AAATTCAATGAGGATTTGGTTCAT	rpoC2	56.7	195	Spe1	T	T	T	T	C	T	T
Pvdm13	CATAAGGTAAATACTGTATAATTGATCG	ndhH	61.3	170	Cla1	G	G	G	G	A	G	G
Pvdm13	TATGATAGTCAATCTGGGTCCTCA	ndhH	61.8	170	Cla1	G	G	G	G	A	G	G
Psdm14	AACCTTCTTTGGATTTGCCCAAGCT	psbB	64.2	166	HindIII	C	C	C	C	C	T	C
Psdm14	CACGCTGGATTTACAGATTGTACT	psbB	61.8	166	HindIII	C	C	C	C	C	T	C
Psdm15	GAAGCCACAAAGGACTATCTAAATG	rpoC1	62.5	180	EcoR1	G	G	G	G	G	A	G
Psdm15	GTCGGGGTATTTGTGTAAATAGGT	rpoC1	61.8	180	EcoR1	G	G	G	G	G	A	G
Psdm16	TAAGCTTCCTTCCTATTAATCTGGGAATT	rpoB	64.8	179	EcoR1	C	C	C	C	C	T	C
Psdm16	CATATTAGAGCTCGCCAGGAAGTA	rpoB	63.5	179	EcoR1	C	C	C	C	C	T	C
Ptdm17	TATGTACGGAATAGAAAGATTCCAAGC	ndhA	63.7	187	Alu1	C	C	C	C	C	C	T
Ptdm17	CGAGTGTGAGAGATTACCTTTTGA	ndhA	61.8	187	Alu1	C	C	C	C	C	C	T
Ptdm18	CGCTCTATTTAGCAATACGGGATA	rpoC1	61.8	162	EcoRV	C	C	C	C	C	C	T
Ptdm18	GCAATAGAGCTTTTCCAGACATTT	rpoC1	60.1	162	EcoRV	C	C	C	C	C	C	T

dCAPS, derived cleaved amplified polymorphic sequence; SNP, single nucleotide polymorphism; PCR, polymerase chain reaction; Pg, Panax ginseng; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis; Ps, P. stipuleanatus; Pt, P. trifolius.

Fig. 5

Validation of 18 dCAPS markers derived from CDS SNP regions of seven Panax species. The 18 denoted dCAPS markers, Pgdm1–3, Pqdm4–6, Pndm7–9, Pjdm10 and 11, Pvdm12 and 13, Psdm14–16, and Ptdm17 and 18 are unique for P. ginseng, P. quinquefolius, P. notoginseng, P. japonicus, P. vietnamensis, P. stipuleanatus, and P. trifolius, respectively. Abbreviated species names shown on amplicons are as follows: Pg, P. ginseng; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis; Ps, P. stipuleanatus; Pt, P. trifolius; M, 100-bp DNA ladder.

CDS, coding sequence; dCAPS, derived cleaved amplified polymorphic sequence; SNP, single nucleotide polymorphism.


Discussion

Characterization of complete chloroplast genome structures provides valuable genetic information for Panax species

Chloroplast DNA sequences are useful in genetic engineering [34], DNA barcoding [35], and studying evolutionary relationships among plants [36], [37]. With recent technical advances in DNA sequencing, the number of completely sequenced chloroplast genomes has grown rapidly. However, the complete chloroplast genome sequences for many high-value plant species are not available yet because of the high cost of sequencing [38]. In our previous studies, we applied a de novo assembly method using dnaLCW [29] to obtain complete chloroplast genomes of five Panax species [19]. Here, we added the complete chloroplast genomes of two more basal Panax species [28] for comparative structure analysis. All seven chloroplast genome sequences were supported by an average read-mapping coverage of 97–1,005x (Table 1). Among the seven Panax species examined here, the chloroplast genome structures are identical except for small InDels and different numbers of SNPs. These seven complete chloroplast genomes will provide more valuable genetic information for the study of the evolutionary relationships, breeding, and authentication of ginseng species. The Araliaceae is a family of flowering plants that consists of about 70 genera and approximately 750 species that vary in type from trees and shrubs to lianas and perennial herbs [39]. Araliaceae speciation is predicted to have occurred in two particular regions of North America and South East Asia [39]. Furthermore, the diversification and speciation were associated with whole genome duplication (WGD) or polyploidy events [40], [41], [42]. Previous studies indicated that two tetraploid Panax species, P. ginseng and P. quinquefolius, have undergone two rounds of WGD [43], [44]. These WGD events, along with geographic and ecological isolation, have contributed to the diversification of Panax species [45]. 
The taxonomy of Panax based on morphological characteristics is considered controversial because of complicated intraspecific and interspecific morphological variation that depends on the geographic and ecological environment. Our phylogenetic tree based on whole chloroplast genome sequences clearly showed the evolutionary relationships among Panax species and among genera in the Araliaceae family. In particular, our results indicated that the diploid species P. trifolius, which diverged from the common ancestor earlier and migrated to North America, was not involved in the tetraploidization of P. ginseng and P. quinquefolius. Another diploid species, P. stipuleanatus, which diverged earlier than the five remaining species, has a distribution overlapping that of the group of three diploid species in South East Asia (P. notoginseng, P. vietnamensis, and P. japonicus). The two tetraploid species, P. ginseng and P. quinquefolius, which were involved in the recent second WGD, diverged from the group of three diploid species and are now located in Northeastern Asia and North America, respectively, owing to geographic isolation (Fig. 2).

Comparative analysis of Panax chloroplast genomes

The number of SNPs at the intraspecies level is very low compared to that at the interspecies level. SNPs within the whole chloroplast genomes of P. ginseng are rare, with only six identified among 12 cultivars [25]. By contrast, a total of 1,783 and 1,128 interspecies SNP sites were identified among seven Panax species in the whole chloroplast genome and protein-coding gene sequences, respectively (Table 2). Nevertheless, chloroplast genomes are highly conserved within the Panax genus, displaying high similarity (≥97.6%) at the nucleotide sequence level. In our previous study, we found that some chloroplast protein-coding genes are highly divergent while others are highly conserved among different Araliaceae species. Four genes, infA, rpl22, rps19, and ndhE, were more divergent and displayed large numbers of SNPs between different species. By contrast, atpF, atpE, ycf2, and rps15 had a high number of nonsynonymous mutations, which might be related to evolution under positive selection [19]. However, some genes were highly conserved at the family (Araliaceae) level, such as petN, psaJ, psbN, and rpl23, or even at the order (Apiales) level, such as psbF [19]. The current study is consistent with these findings except for the petN gene, and in addition to four of the five chloroplast-encoded highly conserved genes (psaJ, psbN, rpl23, and psbF), we found three more among the seven Panax species (psbL, rps18, and rps7) (Fig. 3).

Chloroplast genome fragments were found in mitochondrial genomes

The sequencing of different genomes (nuclear, chloroplast, and mitochondrial) has uncovered staggering amounts of intracellular gene transfer between them [46], [47]. Studies have shown that there is a high frequency of organelle DNA transfer to the nucleus in angiosperms [48], [49], [50]. Interorganelle genome transfer from chloroplast to mitochondrial genomes has also been reported recently as a common phenomenon in higher plants in the course of evolution [48], [51]. We identified 12 large fragments of the chloroplast genome (representing 38.6% of the chloroplast genome) in mitochondrial genomes from Panax species, including both genes and intergenic regions (Fig. 4). Genome transfer can result in assembly errors in chloroplast or mitochondrial genomes because of the high sequence similarity between the original chloroplast genome and the transferred chloroplast genome segments in the mitochondrial genome. Moreover, the study of evolution or the development of molecular markers within gene transfer regions can generate confusing or biased results [7]. To counter this limitation, we examined all the gene transfer regions and removed all SNPs in these regions from our analysis before developing SNP-derived markers for authentication.
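The filtering logic described here, together with the requirement that a marker allele be unique to one species, can be sketched in Python. The positions and the transferred interval below are hypothetical, and the function names are illustrative. Only the allele pattern of the first candidate follows a real marker (Pgdm1 in Table 3). Exclusion of intraspecies polymorphic sites is not modeled.

```python
from collections import Counter
from typing import Dict, List, Tuple


def in_any_region(pos: int, regions: List[Tuple[int, int]]) -> bool:
    """True if pos falls inside any inclusive (start, end) interval."""
    return any(start <= pos <= end for start, end in regions)


def private_allele_species(alleles: Dict[str, str]) -> List[str]:
    """Species whose allele at this site is carried by no other species."""
    counts = Counter(alleles.values())
    return [sp for sp, base in alleles.items() if counts[base] == 1]


# Hypothetical chloroplast coordinates of a transferred fragment.
transferred = [(5000, 7500)]

candidates = [
    # Allele pattern of Pgdm1 (Table 3); the position is hypothetical.
    {"pos": 1200, "alleles": {"Pg": "A", "Pq": "G", "Pn": "G", "Pj": "G",
                              "Pv": "G", "Ps": "G", "Pt": "G"}},
    # Species-unique, but inside the transferred region: excluded.
    {"pos": 5300, "alleles": {"Pg": "C", "Pq": "T", "Pn": "C", "Pj": "C",
                              "Pv": "C", "Ps": "C", "Pt": "C"}},
    # Minor allele shared by two species: not species-unique.
    {"pos": 9100, "alleles": {"Pg": "T", "Pq": "T", "Pn": "C", "Pj": "C",
                              "Pv": "C", "Ps": "C", "Pt": "C"}},
]

kept = []
for snp in candidates:
    if in_any_region(snp["pos"], transferred):
        continue
    unique = private_allele_species(snp["alleles"])
    if unique:
        kept.append((snp["pos"], unique))

print(kept)
```

Only the first candidate survives both filters in this toy input. A site with three alleles, such as the one targeted by Pjdm10 in Table 3, would report more than one species with a private allele, so the choice of enzyme would then determine which species the marker distinguishes.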

Use of dCAPS markers for ginseng species authentication

DNA barcoding may be defined as the use of short DNA sequences from either nuclear or organelle genomes to identify a species. It is a relatively new technique that is widely used as a biological tool for species identification, breeding, and evolutionary research [52]. Identification of plant species is important for standardizing the food and herbal medicine industries and for preventing EMA. Because ginseng has high pharmacological and economic value, there is ample potential for EMA of ginseng products. Therefore, easy, reliable, and practical methods that accurately identify the origins of ginseng products play an important role in the development and protection of the ginseng industry. Compared with the nuclear genome, the chloroplast genome is specific to plants, smaller in size, and present in hundreds of copies per cell. Furthermore, because the chloroplast genome combines sufficient interspecific divergence with low intraspecific variation, chloroplast genome-based DNA barcodes are the best targets for methods of species authentication [24]. Recently, chloroplast genome sequences have been used to develop markers for ginseng authentication [7], [26], [27], [53]; however, these markers can be applied only to certain species because of a lack of information about variation at the intraspecific and interspecific levels. In this study, we developed 18 CDS-derived, species-specific SNP markers from chloroplast genomes for the authentication of seven Panax species, including five representative Panax species and two basal Panax species from Asia and North America. We recently developed cultivar-unique markers for P. ginseng based on a comprehensive comparative genomic analysis of the chloroplast genome sequences from 12 ginseng cultivars [25]. We excluded those intraspecific polymorphic sites from this study because its aim is to distinguish among different species. We also excluded the chloroplast genome targets that were transferred into mitochondrial genomes. Each of the 18 dCAPS markers presented in this study is unique to one of the seven species and can be practically applied to species authentication and breeding.
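As a rough illustration of what "species-unique" means for a SNP, the sketch below scans a toy alignment for columns where exactly one sequence carries an allele different from all the others. It is an assumption-laden example, not the procedure used in this study: the function name, the species labels, and the sequences are invented, it uses one sequence per species, and it does not model the exclusion of intraspecific polymorphisms or the design of dCAPS primers.

```python
from collections import Counter


def species_unique_snps(aligned):
    """Find biallelic columns where one species alone carries the minor allele.

    aligned: dict mapping species label -> aligned sequence (equal lengths).
    Returns a list of (position_1based, species, unique_allele, shared_allele).
    """
    names = list(aligned)
    if len(names) < 3:
        raise ValueError("need at least three sequences to call a unique allele")
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("all aligned sequences must have the same length")

    unique = []
    for i in range(length):
        column = {name: aligned[name][i].upper() for name in names}
        # Skip columns with gaps or ambiguous bases.
        if any(base not in "ACGT" for base in column.values()):
            continue
        counts = Counter(column.values())
        if len(counts) != 2:
            continue
        (major, n_major), (minor, n_minor) = counts.most_common(2)
        if n_minor == 1 and n_major == len(names) - 1:
            species = next(n for n, b in column.items() if b == minor)
            unique.append((i + 1, species, minor, major))
    return unique


# Toy alignment with made-up labels and sequences (illustration only).
toy_alignment = {
    "sp_A": "ATGCCGTA",
    "sp_B": "ATGCCGTA",
    "sp_C": "ATGTCATA",
    "sp_D": "ACGCCATT",
}

unique_sites = species_unique_snps(toy_alignment)
for site in unique_sites:
    print(site)
# (2, 'sp_D', 'C', 'T')
# (4, 'sp_C', 'T', 'C')
# (8, 'sp_D', 'T', 'A')
```

Position 6 in the toy data differs between two pairs of sequences, so it is polymorphic but not unique to any one species and is skipped.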

Conflicts of interest

The authors have no conflicts of interest to declare.

47 in total

Review 1. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes.

Authors: Jeremy N Timmis; Michael A Ayliffe; Chun Y Huang; William Martin
Journal: Nat Rev Genet Date: 2004-02 Impact factor: 53.242

Review 2. Roles and mechanisms of ginseng in protecting heart.

Authors: Si-Dao Zheng; Hong-Jin Wu; De-Lin Wu
Journal: Chin J Integr Med Date: 2012-07-07 Impact factor: 1.978

Review 3. Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity.

Authors: Michael Freeling; Brian C Thomas
Journal: Genome Res Date: 2006-07 Impact factor: 9.043

4. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Authors: Koichiro Tamura; Glen Stecher; Daniel Peterson; Alan Filipski; Sudhir Kumar
Journal: Mol Biol Evol Date: 2013-10-16 Impact factor: 16.240

5. Polyploidy and angiosperm diversification.

Authors: Douglas E Soltis; Victor A Albert; Jim Leebens-Mack; Charles D Bell; Andrew H Paterson; Chunfang Zheng; David Sankoff; Claude W Depamphilis; P Kerr Wall; Pamela S Soltis
Journal: Am J Bot Date: 2009-01 Impact factor: 3.844

6. DNA barcoding of Panax species.

Authors: Yunjuan Zuo; Zhongjian Chen; Katsuhiko Kondo; Tsuneo Funamoto; Jun Wen; Shiliang Zhou
Journal: Planta Med Date: 2010-08-27 Impact factor: 3.352

7. Molecular poltergeists: mitochondrial DNA copies (numts) in sequenced nuclear genomes.

Authors: Einat Hazkani-Covo; Raymond M Zeller; William Martin
Journal: PLoS Genet Date: 2010-02-12 Impact factor: 5.917

8. Evolution of the Araliaceae family inferred from complete chloroplast genomes and 45S nrDNAs of 10 Panax-related species.

Authors: Kyunghee Kim; Van Binh Nguyen; Jingzhou Dong; Ying Wang; Jee Young Park; Sang-Choon Lee; Tae-Jin Yang
Journal: Sci Rep Date: 2017-07-07 Impact factor: 4.379

9. Chloroplast transformation of Platymonas (Tetraselmis) subcordiformis with the bar gene as selectable marker.

Authors: Yulin Cui; Song Qin; Peng Jiang
Journal: PLoS One Date: 2014-06-09 Impact factor: 3.240

10. Complete sequences of organelle genomes from the medicinal plant Rhazya stricta (Apocynaceae) and contrasting patterns of mitochondrial genome evolution across asterids.

Authors: Seongjun Park; Tracey A Ruhlman; Jamal S M Sabir; Mohammed H Z Mutwakil; Mohammed N Baeshen; Meshaal J Sabir; Nabih A Baeshen; Robert K Jansen
Journal: BMC Genomics Date: 2014-05-28 Impact factor: 3.969
