
Population genetic structure, migration, and polyploidy origin of a medicinal species Gynostemma pentaphyllum (Cucurbitaceae).

Xiao Zhang¹, Hailun Su¹, Jia Yang¹, Li Feng¹, Zhonghu Li¹, Guifang Zhao¹.

Abstract

Gynostemma pentaphyllum, a member of the family Cucurbitaceae, is a perennial creeping herb used as a traditional medicinal plant in China. In this study, six polymorphic nSSR and four EST-SSR primers were used to genotype 1,020 individuals in 72 wild populations of G. pentaphyllum. The genetic diversity and population structure were investigated, and ecological niche modeling was performed to reveal the evolution and demographic history of its natural populations. The results show that G. pentaphyllum has a low level of genetic diversity and a high level of variation among populations because of pervasive asexual propagation, genetic drift, and long-term habitat isolation. Results of the Mantel test demonstrate that genetic distance and geographic distance are significantly correlated among natural populations of G. pentaphyllum. The populations can be divided into two clusters on the basis of genetic structure. Asymmetrical patterns of historical gene flow were observed among the clusters. Contemporary gene flow, by contrast, was almost always symmetrical between related pairs, with only slight differences between directions. Recent bottlenecks were experienced by 34.72% of the studied populations. The geographic range of G. pentaphyllum continues to expand northward and eastward from the Hengduan Mountains. The present distribution of G. pentaphyllum is a consequence of its complex evolution. Polyploidy in G. pentaphyllum is inferred to be polygenetic. Finally, G. pentaphyllum is a species in need of protection, so in situ and ex situ measures should be considered in the future.


Keywords: Gynostemma pentaphyllum; conservation strategy; genetic diversity; migration; polyploidy origin; population structure

Year: 2019 PMID: 31641462 PMCID: PMC6802062 DOI: 10.1002/ece3.5618

Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912

INTRODUCTION

Population genetic diversity is the product accumulated in the long‐term historical process of evolution in species or populations. It can be used to assess the potential for species survival, adaptation, and development. The evolutionary potential of a species and its ability to mitigate against adverse environmental factors depend not only on the level of genetic variation within the species (genetic polymorphism), but also on the population genetic structure (Li, Liu, Zhao, Su, & Zhao, 2012). Thus, it is necessary to investigate population genetics to evaluate evolutionary processes, and to assess the utilization and conservation of genetic resources. In the past decades, a number of studies of population genetics have used the Himalaya–Hengduan Mountains (HHM) areas and the Qinghai‐Tibetan Plateau (QTP) to examine the effects of orographic uplift and climatic perturbation on plant speciation and population demography (Du, Hou, Wang, Mao, & Hampe, 2017; Liu et al., 2013). In contrast, few studies have been conducted in subtropical China (Sun, Hu, Huang, & Vargas‐Mendoza, 2014; Wang et al., 2015), which consists of the hills and mountains of the Qinling Mountains–Huai River area and the south tropical region of China (Qiu, Fu, & Comes, 2011). Subtropical China is thought to have acted as a refugium for many ancient species during the Pleistocene glacial and interglacial cycles (e.g., Wang, Gao, Kang, Lowe, & Huang, 2009). Many species of this area have unique haplotypes with high levels of genetic diversity. Moreover, the level of genetic differentiation among glacial refugia should be high because of the random fixation of alleles (Hewitt, 2000; Zhang et al., 2015). Gynostemma pentaphyllum is a perennial creeping plant found in subtropical China, Japan, Myanmar, and India (Chen & Gilbert, 2006). In China, it mainly grows near rivers and in the shade of the forests that cover the Yangtze River basin and its southern areas (Chen, 1995). 
Gynostemma pentaphyllum belongs to the Cucurbitaceae family and has 5–7 foliolate leaves. It can reproduce sexually or by clonal growth of rhizomes or bulbils (Gao, Chen, Gu, & Zhao, 1995). Polyploidization is common in G. pentaphyllum, which can be diploid, tetraploid, hexaploid, or octoploid (x = 11, 2n = 22, 44, 66, and 88). However, it is difficult to determine the ploidy based on the morphological features (Gao et al., 1995). At present, it is not known if the polyploid complex of G. pentaphyllum is autopolyploid or allopolyploid, and the genetic signature and origin of populations with different ploidies are still unclear. As a traditional Chinese medicinal herb, G. pentaphyllum is useful in medical science because it can inhibit the reproduction of tumor cells, regulate lipid metabolism, decrease blood sugar, and enhance immunity (Xie et al., 2010). Thus, most studies of this species have focused on the extraction (Yin, Hu, & Pan, 2004), chemistry, and pharmacology (Razmovski‐Naumovski et al., 2005; Tsai, Lin, & Chen, 2010) of its bioactive components. However, the wild populations of G. pentaphyllum have decreased and become fragmented as a consequence of the increased use of natural medicinal herbs and habitat destruction, to the extent that G. pentaphyllum has been listed as a Grade II Key Protected Wild Plant Species by the Chinese Government (Yu, 1999). It is therefore imperative to investigate the wild populations of G. pentaphyllum, including analysis of their genetic diversity and population structure, to formulate an effective conservation strategy. Existing genetic studies of G. pentaphyllum (Jiang, Qian, Guo, Wang, & Zhao, 2009; Pang, Zou, & Xiao, 2006) used RAPD and ISSR molecular markers on relatively small sample sets that did not cover the spatial distribution of G. pentaphyllum in subtropical China. 
The simple sequence repeat (SSR) molecular markers, also known as microsatellites, are codominant molecular markers with putative neutral evolutionary history. They can be used to measure or infer bottlenecks (Spencer, Neigel, & Leberg, 2000), local adaptation (Nielsen, 2005), allelic fixation index (F ST; Slatkin, 1995), population size (Kohn et al., 1999), and gene flow (Waits, Taberlet, Swenson, Sandegren, & Franzén, 2000). Furthermore, while paleoecological reconstructions of forest biomes provide fundamental guidance for testable phylogeographic hypotheses, they cannot provide details of population history (Gavin et al., 2014; Qiu et al., 2011). Ecological niche modeling (ENM), which can determine past species distributions, can be used to augment the limited fossil record in East Asia (Wang et al., 2015). Combined with molecular data, ENM can strengthen our understanding of the temporal dimension of population dynamics (Mellick, Lowe, Allen, Hill, & Rossetto, 2012; Scoble & Lowe, 2010). In the current study, SSR markers were used to investigate the genetic diversity and population structure of G. pentaphyllum, and ENM was used to investigate the history of the evolution and demographic structure of natural G. pentaphyllum populations in subtropical China. The main objectives of our study were to: (a) assess the level of genetic diversity in natural populations; (b) evaluate the degree of differentiation and structure among populations; (c) explore the origins and migration of G. pentaphyllum; (d) speculate on the origin of polyploidy; and (e) provide basic information that can be used to formulate a conservation strategy.

MATERIALS AND METHODS

Plant sampling

Wild G. pentaphyllum samples were collected from most of the georeferenced sampling sites; the sample set covers the full longitudinal and latitudinal extent of G. pentaphyllum in China (Figure 1; Table A1). Five to twenty‐four individuals were collected randomly from each population, with the number of samples depending on population size. A total of 1,093 individuals in 72 wild populations were collected. Five individuals from each of two Gomphogyne populations were selected as outgroups. Fresh leaf materials were dried in silica gel. Root tip samples were immersed in FAA solution (50 ml of 50% alcohol + 5 ml of glacial acetic acid + 5 ml of 37% formaldehyde) and preserved for further laboratory analysis. A handheld GPS (Garmin eTrex Handheld GPS; Garmin) was used to determine the latitude and longitude of each site. Voucher specimens for the samples were deposited at Northwest University (Xi'an, Shaanxi).

Figure 1

Regional and estimated genetic structure for K = 2 for 72 populations of G. pentaphyllum. (a) Individual assignment to two clusters for all 72 populations, visualized as pie charts. Each pie is partitioned into colored segments in proportion to the population's membership in each cluster; colored rings around the pie charts represent the ploidy of each population (gray: diploid; light blue: tetraploid; orange: hexaploid; purple: octoploid). (b) STRUCTURE plot for K = 2. Each vertical bar represents a population and its assignment proportion to each of the two (colored) population clusters. The arrows represent the migration paths.

Table A1

Sampling details of each population of G. pentaphyllum

Pop no.	Pop ID	Location	Ploidy level	Sample size (Di‐,Tetra‐, Hexa‐, Octa‐)	Latitude	Longitude
G. pentaphyllum
1	ST	Shitai, Anhui	2×	19 (19,0,0,0)	30°11′N	117°31′E
2	LA	Lu'an, Anhui	2×	24 (24,0,0,0)	31°45′N	116°31′E
3	XC	Xuancheng, Anhui	2×	16 (16,0,0,0)	30°56′N	118°45′E
4	JY	Beibei, Chongqing	2×	16 (16,2,0,0)	29°50′N	106°23′E
5	SX	Shaxian, Fujian	2×	20 (20,0,0,0)	26°24′N	117°47′E
6	GD	Guangzhou, Guangdong	2×	13 (13,0,0,0)	23°10′N	113°16′E
7	GP	Guiping, Guangxi	2×	15 (15,1,0,0)	23°26′N	110°04′E
8	LZ	Liuzhou, Guangxi	2×	13 (13,0,0,0)	24°17′N	109°38′E
9	LP	Liupanshui, Guizhou	2×	17 (17,0,0,0)	26°29′N	104°46′E
10	HN	Wuzhishan, Hainan	2×	12 (12,0,0,0)	18°46′N	109°31′E
11	XY	Xinyang, Henan	2×	19 (19,0,0,0)	32°08′N	114°05′E
12	ES	Enshi, Hubei	2×	15 (15,0,0,0)	30°16′N	109°29′E
13	HF	Hefeng, Hubei	2×	20 (20,8,0,0)	29°53′N	110°01′E
14	FX	Fangxian, Hubei	2×	9 (9,1,0,0)	32°03′N	110°44′E
15	WD	Wudang, Hubei	2×	14 (14,1,0,0)	32°23′N	111°00′E
16	HB	Zhushan, Hubei	2×	11 (11,3,0,0)	32°13′N	110°13′E
17	GZ	Guzhang, Hunan	2×	12 (12,6,0,0)	28°36′N	109°59′E
18	HS	Zhangjiajie, Hunan	2×	9 (9,4,0,0)	29°13′N	110°27′E
19	ZJ	Zhangjiajie, Hunan	2×	14 (14,0,0,0)	29°13′N	110°27′E
20	ZZ	Zhuzhou, Hunan	2×	17 (17,0,0,0)	27°50′N	113°07′E
21	BH	Jurong, Jiangsu	2×	19 (19,1,0,0)	32°07′N	119°04′E
22	DY	Dayu, Jiangxi	2×	16 (16,0,0,0)	25°23′N	114°05′E
23	RJ	Ruijin, Jiangxi	2×	18 (18,0,0,0)	25°51′N	116°03′E
24	WN	Wuning, Jiangxi	2×	18 (18,0,0,0)	29°19′N	115°05′E
25	SR	Shangrao, Jiangxi	2×	15 (15,0,0,0)	28°27′N	117°56′E
26	YX	Pingli, Shaanxi	2×	13 (13,0,0,0)	32°21′N	109°17′E
27	XA	Xi'an, Shaanxi	2×	20 (20,11,0,0)	33°56′N	108°06′E
28	QC	Chengdu, Sichuan	2×	10 (10,0,0,0)	30°55′N	103°34′E
29	QC2	Chengdu, Sichuan	2×	11 (11,0,0,0)	30°55′N	103°34′E
30	EM	Emeishan, Sichuan	2×	10 (10,0,0,0)	29°33′N	103°25′E
31	GA	Guang'an, Sichuan	2×	12 (12,0,0,0)	30°15′N	106°48′E
32	GY	Guangyuan, Sichuan	2×	14 (14,0,0,0)	32°26′N	105°50′E
33	GM	Yanyuan, Sichuan	2×	15 (15,0,0,0)	27°23′N	101°31′E
34	PZ	Panzhihua, Sichuan	2×	13 (13,0,0,0)	26°36′N	101°43′E
35	WY	Wanyuan, Sichuan	2×	20 (20,2,0,0)	31°47′N	107°41′E
36	LJ	Xichang, Sichuan	2×	15 (15,0,0,0)	27°51′N	102°18′E
37	AL	Jiayi, Taiwan	2×	20 (20,9,0,0)	23°30′N	120°48′E
38	HH	Nantou, Taiwan	2×	20 (20,4,0,0)	23°58′N	120°58′E
39	TC	Tengchong, Yunnan	2×	16 (16,0,0,0)	25°06′N	98°30′E
40	CS	Cangshan, Yunnan	2×	9 (9,0,0,1)	25°50′N	100°10′E
41	CZ	Cizhong, Yunnan	2×	15 (15,0,0,0)	28°01′N	98°54′E
42	JH	Jinghong, Yunnan	2×	11 (11,0,0,0)	21°59′N	100°47′E
43	NN	Jinghong, Yunnan	2×	12 (12,1,0,0)	21°56′N	100°36′E
44	LC	Lincang, Yunnan	2×	18 (18,1,0,0)	23°52′N	100°04′E
45	ML	Mengla, Yunnan	2×	15 (15,0,0,0)	21°33′N	101°34′E
46	TH	Yuxi, Yunnan	2×	11 (11,0,0,0)	24°06′N	102°44′E
47	ZT	Zhaotong, Yunnan	2×	15 (15,0,0,0)	27°21′N	103°43′E
48	WS	Hangzhou, Zhejiang	2×	16 (16,0,0,0)	30°14′N	120°09′E
49	YH	Hangzhou, Zhejiang	2×	14 (14,0,0,0)	30°13′N	120°09′E
50	YN	Ha Giang, Vietnam	2×	9 (9,0,0,0)	22°45′N	104°56′E
51	BS	Baise, Guangxi	4×	15 (0,15,0,0)	23°55′N	106°37′E
52	RH	Renhuai, Guizhou	4×	15 (0,15,0,1)	27°50′N	106°24′E
53	XX	Xixia, Henan	4×	7 (1,7,0,0)	33°17′N	111°28′E
54	XX2	Xixia, Henan	4×	12 (2,12,0,0)	33°17′N	111°28′E
55	WG	Wugang, Henan	4×	15 (1,15,0,0)	33°09′N	113°35′E
56	LH	Linghu, Henan	4×	12 (0,12,0,0)	34°27′N	110°40′E
57	SY	Shiyan, Hubei	4×	12 (0,12,0,0)	32°26′N	110°43′E
58	NJ	Nanjing, Jiangsu	4×	12 (0,12,0,0)	32°06′N	118°48′E
59	ZS	Nanjing, Jiangsu	4×	10 (0,10,0,0)	32°06′N	118°48′E
60	YF	Pingli, Shaanxi	4×	10 (3,10,0,0)	32°21′N	109°17′E
61	YT	Yingtou, Shaanxi	4×	17 (0,17,0,0)	34°09′N	107°45′E
62	WL	Hanzhong, Shaanxi	4×	16 (3,16,0,0)	33°35′N	106°17′E
63	TZ	Shangluo, Shaanxi	4×	15 (2,15,0,0)	33°23′N	110°01′E
64	FH	Emeishan, Sichuan	4×	10 (0,10,0,0)	29°33′N	103°25′E
65	LD	Luding, Sichuan	4×	18 (0,18,0,0)	29°57′N	102°13′E
66	KM	Kunming, Yunnan	4×	10 (0,10,0,0)	24°57′N	102°38′E
67	KZ	Kunming, Yunnan	4×	9 (4,9,0,0)	25°09′N	102°44′E
68	SQ	Kunming, Yunnan	4×	10 (0,10,0,0)	24°57′N	102°38′E
69	JS	Jishou, Hunan	6×	17 (0,0,17,0)	28°17′N	109°42′E
70	YJ	Yingjiang, Yunnan	6×	14 (0,0,14,0)	24°36′N	97°39′E
71	DL	Dali, Yunnan	8×	10 (0,0,0,10)	25°38′N	100°16′E
72	DL2	Dali, Yunnan	8×	9 (0,0,0,9)	25°38′N	100°16′E
Gomphogyne cissiformis
73	O1	Yongde, Yunnan	2×	5 (5,0,0,0)	24°11′N	99°30′E
Gomphogyne cissiformis var. villosa
74	O2	Yongde, Yunnan	2×	5 (5,0,0,0)	24°11′N	99°30′E
Total				1,030 (1,103)

DNA extraction, amplification, and microsatellite genotyping

Total genomic DNA was extracted using the Plant Genomic DNA Kit (TIANGEN Biotech, Beijing Co., Ltd.) following the manufacturer's protocol. Preliminary analyses investigated 14 nSSR and 16 EST‐SSR primers developed in G. pentaphyllum (Liao et al., 2011; Zhao, Zhou, Li, & Zhao, 2015); most of them were monomorphic among the populations. Finally, six polymorphic nSSR and four EST‐SSR primer pairs (Table 1) were used to genotype the samples. Polymerase chain reaction (PCR) amplifications were performed using a MyCycler™ Thermal Cycler (Bio‐Rad). A Biometra Thermocycler was used with the following cycling conditions: 94°C for 5 min; 32 cycles of 94°C for 30 s, annealing temperature (Table 1) for 45 s, and 72°C for 45 s; an extension step of 72°C for 5 min; and a final holding temperature of 10°C.

Table 1

The primary primer sequences and annealing temperatures of SSR used in the study

Name	ID	Sequence (5′–3′)	Repeat motif	Number of repeats	Ta (°C)	No. of alleles	PIC
SSR E1	13343	F: TGACCACTCTCAATCTCATCT R: GTTGAACTATGGGAAGAGAGG	TCT	6	51	8	0.712
SSR E2	2782	F: GGTCGAGACTTTTCAGTTTTG R: CATAATCGTTTTGGTGGAAC	GAA	6	52	7	0.655
SSR E3	10765	F: CTCAAACTTATCACCGTCTGA R: ATTCCCCACTCTGTCTCTATC	GAA	5	51	12	0.803
SSR E4	2369	F: GATGGATGAACTAGGCTGTTT R: GCTTCAGGAAATGATCAACA	TGA	5	52	17	0.821
SSR 1	14	F: AAACTTCAGATCTACGCG R: AGAATGGTAGTAGGTTTTG	CT	16	52	13	0.141
SSR 2	16	F: CTGGAATGGATCTTCTTC R: AGCTGTAGTTCGTGGTTA	AG	9	59	15	0.815
SSR 3	17	F: CATAGGCAGCTGTTATTTC R: TGTTGTCAGAAGCATTGG	CT	7	60	10	0.750
SSR 4	21	F: TTCACCACTTATGTCCTA R: GAAAATGAAGGAATTAAG	CT	11	50	17	0.830
SSR 5	25	F: TAAAAGTATGCTACGAGTTCA R: TTATCCCACCATCAGATT	AG	10	53	12	0.196
SSR 6	139	F: AAATTACCAAAGCTACCCTTCT R: TGTAGATCCCAAGCTCCATG	GCT	6	63	8	0.223
Mean						11.9	0.595

The protocols of Sullivan were employed for the nSSR genotyping (Sullivan, 2013). The PCR products were separated on 12% non‐denaturing polyacrylamide gel (280 V, 50 W, 3 hr) and visualized by staining with 0.1% silver nitrate; a pBR322 DNA marker ladder (TIANGEN Biotech, Beijing Co., Ltd.) was used to assess the length of the DNA bands. Quantity One version 4.6.2 (Bio‐Rad Technical Service Department) was used for quantification. Band sizes were corrected by capillary electrophoresis of several individuals for each primer. Capillary electrophoresis was used for the EST‐SSR genotyping. Sample analyses were carried out using the GeneMarker genotyping software (Hulce, Li, Snyder‐Leiby, & Liu, 2011). The raw data were transformed into binary (1/0) data for further analysis.

Chromosome counts and DNA ploidy‐level estimation

The method of Sang (2002) was used for chromosome counts. However, the chromosomes of this species are very small in size and difficult to identify even under a high‐power microscope. Thus, the PloidyInfer v1.1 (Huang, Ritland, Dunn, & Li, 2019) software was used to confirm and test the ploidy level of every individual of ambiguous genotype in mixed‐ploidy populations. Confounding individuals were removed to make a single ploidy level for each population.

Genetic analysis and population structure

The Micro‐Checker v2.2.3 software (Van Oosterhout, Hutchinson, Wills, & Shipley, 2004) was used to check for large allele dropout and nonamplifying (null) alleles at each microsatellite locus. Hardy–Weinberg equilibrium (HWE) and linkage disequilibrium (LD) were evaluated using FSTAT v2.9.3 (Goudet, 2002). Significance levels were corrected by the sequential Bonferroni method (Rice, 1989), repeated 740 times. The BayeScan v2.1 program (Foll, 2012) was used to detect outlier loci using the data converted by the PGDSpider v2.0.1.3 software (Lischer & Excoffier, 2011). The polymorphism information content (PIC) of each primer was calculated to estimate the allelic variation of SSRs according to the formula of Botstein, White, Skolnick, and Davis (1980), in which Pi is the frequency of the ith allele for a given SSR marker and n is the total number of alleles detected for that SSR marker. The genetic diversity indices (mean number of alleles, Na; number of effective alleles, Nae; allelic richness, Ar; observed heterozygosity, Ho; expected heterozygosity over all loci, He; gene diversity with unordered alleles, h; and individual inbreeding coefficient, Fi) of each locus and population were estimated by SPAGeDi (Hardy & Vekemans, 2002), and GenAlEx 6.5 (Peakall & Smouse, 2006) was used to estimate Shannon's Information Index (I), the percentage of polymorphic loci (PPL), and geographic distance (GGD) among population pairs. IBM SPSS Statistics v21.0 (SPSS Inc.) was used to calculate the bivariate correlations between longitude and latitude and three diversity indices: expected heterozygosity, Shannon's Information Index, and the frequency of private alleles. The Pearson two‐tailed test was used to test correlations, with a significance level of 0.05. The correlation between ploidy and genetic diversity was also calculated.
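As an illustration of the PIC calculation described above, the following minimal Python sketch computes PIC from a list of allele frequencies. It shows two variants: the full form usually attributed to Botstein et al. (1980), PIC = 1 − Σ Pi² − Σ_{i<j} 2 Pi² Pj², and the simplified form 1 − Σ Pi², which equals expected heterozygosity. Which variant was applied in this study is an assumption here, and the function names and the example frequencies are hypothetical.

```python
from itertools import combinations


def pic_botstein(freqs):
    """Full form: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


def pic_simplified(freqs):
    """Simplified form: 1 - sum(p_i^2), identical to expected heterozygosity."""
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical allele frequencies for one marker (not data from the study).
example_freqs = [0.5, 0.3, 0.2]
print(round(pic_botstein(example_freqs), 4))
print(round(pic_simplified(example_freqs), 4))
```

The two variants diverge most for markers with few alleles, so a PIC threshold such as 0.5 can classify a marker differently depending on which one is used.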
The distributions of private allele frequency (Fp) and expected heterozygosity (He) across populations were mapped using ArcGIS (Esri), employing a kriging spherical interpolation method. The program STRUCTURE v2.3.3 (Pritchard, Stephens, & Donnelly, 2000), which employs Bayesian clustering analysis, was used to analyze the genetic structure; the analysis followed the admixture model with independent allele frequencies. Ten independent simulations were run for each K from 1 to 12, with 100,000 burn‐in steps followed by 1,000,000 MCMC steps. Two alternative methods were used to estimate the most likely number (K) of genetic clusters with the online program STRUCTURE HARVESTER (Earl, 2012): tracing the change in the average log‐likelihood L(K), as suggested by Pritchard et al. (2000), and calculating delta K (ΔK) according to Evanno, Regnaut, and Goudet (2005). The ArcMap v10.0 and DISTRUCT v1.1 (Rosenberg, 2004) software packages were used to draw the pie charts and bar charts from the STRUCTURE results. Analysis of molecular variance (AMOVA) and the fixation indices calculated in Arlequin 3.5 (Excoffier & Lischer, 2010) were used to investigate the extent of genetic differentiation among populations. Calculations were made using four levels of data grouping: (1) species level; (2) ploidy level; (3) two clusters; and (4) five clusters, with the cluster groupings based on the results of the STRUCTURE analysis. The significance of the fixation indices was tested using 10⁴ permutations. StAMPP (Pembleton, Cogan, & Forster, 2013), an R package for calculating genetic differentiation and structure in populations of mixed ploidy, was used to calculate the genetic distance and pairwise F_ST.
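The ΔK criterion mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the STRUCTURE HARVESTER implementation: it takes the replicate log‐likelihood values for each K, averages them, and divides the absolute second‐order difference of the means by the standard deviation of L(K) across runs. The function name and the input values are hypothetical, not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map each K (with neighbors K-1 and K+1) to its delta-K value.

    lnp_by_k: dict of K -> list of Ln P(D) values from replicate runs.
    """
    avg = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in avg and k + 1 in avg:
            sd = stdev(lnp_by_k[k])
            if sd > 0:  # delta-K is undefined when replicate runs are identical
                second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
                result[k] = second_diff / sd
    return result


# Hypothetical replicate log-likelihoods for K = 1..4 (three runs each).
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -780.0, -790.0],
}
scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Because ΔK needs both neighbors, it cannot be evaluated at the smallest or largest K tested, which is one reason to inspect the mean L(K) curve alongside it.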
The online software Isolation by Distance Web Service version 3.23 (http://ibdws.sdsu.edu; Bohonak, 2002; Jensen, Bohonak, & Kelley, 2005) was used to perform a Mantel test (Mantel, 1967) with 10,000 permutations to detect the relationship between genetic distance and geographic distance among populations, and to determine the possible role of isolation by distance (IBD) in the formation of the current population structure. Principal coordinate analysis (PCoA) was performed based on the genetic distance between pairwise populations.
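The logic of a Mantel test can be sketched as follows. This is a simplified, standard‐library Python illustration, not the IBDWS implementation: it correlates the upper triangles of a genetic and a geographic distance matrix, then permutes population labels in one matrix to build a null distribution. The function names, the one‐tailed p‐value, and the toy matrices are assumptions made for the example.

```python
import random
from statistics import mean


def _upper(mat, order):
    """Upper-triangle entries of mat after relabeling rows/columns by order."""
    n = len(order)
    return [mat[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, permutations=10000, seed=0):
    """Return (observed r, one-tailed permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    geo_vec = _upper(geo, identity)
    r_obs = _pearson(_upper(gen, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of the genetic matrix
        if _pearson(_upper(gen, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: four hypothetical populations along a line.
pos = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in pos] for a in pos]
gen = [[0.01 * d for d in row] for row in geo]  # perfect isolation by distance
print(mantel(gen, geo, permutations=999))
```

Whole rows and columns are permuted together, rather than individual cells, because the pairwise distances in a matrix are not independent observations.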

Bottlenecks and formation pattern of population structure

The BOTTLENECK v1.2.02 software (Piry, Luikart, & Cornuet, 1999) was used to detect genetic bottlenecks within all populations and to determine whether populations exhibited a significant number of loci with heterozygosity excess. A Wilcoxon signed‐rank test under a two‐phase model of mutation (TPM; Di Rienzo et al., 1994) with 70% stepwise mutations and 30% multistep mutations was used to analyze heterozygosity excess or deficiency. A descriptor of the allele frequency distribution, the “mode‐shift indicator,” was also used; this method can discriminate between bottlenecked and stable populations (Luikart, Allendorf, Cornuet, & Sherwin, 1998). Ten thousand iterations were performed for each mutational model. The program 2MOD v0.2 (Ciofi, Beaumont, Swingland, & Bruford, 1999) was used to estimate the relative likelihoods of immigration–drift equilibrium and drift since a certain time (i.e., the relative effects of gene flow and genetic drift on the current population structure). The program used the settings of Feng et al. (2016): each model was run three times to check whether the MCMC had converged, 100,000 iterations were performed, and the first 10% of iterations in the output were excluded to avoid dependence on initial starting values.

Effective population size and migration

The software Migrate‐n v3.6 (Beerli, 2005) was used to estimate the historical gene flow. The outputs of this software, which calculates the maximum likelihood using the Brownian method and a constant mutation rate (μ), include the effective migration rate (M = m/μ, where m is the migration rate per generation and μ is the mutation rate) in both directions for each pair, and the theta value (Θ = 4N_eμ, where N_e is the effective population size). Uniform priors and Metropolis sampling with 10 short chains and 1 long chain of 50,000 and 500,000 iterations, respectively, were used to investigate genealogies. Genealogies were sampled 100 steps apart, and the first 1,000 were discarded. The gene flow and number of migrants per population (N_m) were estimated from the values of M and Θ. Before running the program, the results of STRUCTURE were used to define 2 and 5 clusters. The effective population size (N_e) per population was estimated using an average mutation rate for microsatellites of 5 × 10⁻⁴ (Schlötterer, 2000; Selkoe & Toonen, 2006). The Bayesian‐based program BAYESASS v3.0 (Wilson & Rannala, 2003) was used to estimate contemporary migration rates among the clusters (over the last few generations, mc), with a sampling frequency of 1,000.
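The conversions from Migrate‐n outputs to effective size and migrant numbers follow directly from the definitions above, and can be sketched in Python as below. The sketch assumes the usual diploid nuclear scaling, in which Θ = 4N_eμ gives N_e = Θ/(4μ) and, with M = m/μ, the number of migrants per generation is N_e·m = Θ·M/4. How the authors handled polyploid populations is not stated here, so that scaling is an assumption, and the function names and input values are hypothetical.

```python
MU = 5e-4  # average microsatellite mutation rate used in the text


def effective_size(theta, mu=MU):
    """N_e from theta = 4 * N_e * mu (diploid scaling assumed)."""
    return theta / (4.0 * mu)


def migrants_per_generation(theta, big_m):
    """N_e * m from theta and M = m / mu; the mutation rate cancels out."""
    return theta * big_m / 4.0


# Hypothetical Migrate-n estimates (not values from the study).
theta_example = 1.0
m_example = 8.0
print(effective_size(theta_example))
print(migrants_per_generation(theta_example, m_example))
```

Note that the migrant number does not depend on the assumed mutation rate, whereas the N_e estimate scales inversely with it.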

Ecological niche modeling

Ecological niche modeling (ENM) was used to predict suitable paleo‐ and current distribution ranges of G. pentaphyllum using the Maxent v.3.3.3k software (Phillips, Anderson, & Schapire, 2006; Phillips & Dudík, 2008). Model inputs included the present geographic distribution and current environmental factors, which were projected back to the Last Glacial Maximum (LGM). The geographic distribution of the species was based on the 72 sample sites in this study and 320 records of the species retrieved from the Chinese Virtual Herbarium website (http://www.cvh.org.cn/cms/). Nineteen bioclimatic variables were taken from the WorldClim website (http://www.worldclim.org/; Hijmans, Cameron, Parra, Jones, & Jarvis, 2005). The LGM data used in this study are from the Community Climate System Model (CCSM; Collins et al., 2006). Pairwise correlations were calculated between the 19 variables. Model goodness of fit was evaluated using the area under the receiver operating characteristic curve (AUC). An AUC score above 0.7 was considered to indicate good model performance (Fielding & Bell, 1997).
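To make the AUC criterion concrete, the sketch below computes AUC as the probability that a randomly chosen presence site receives a higher suitability score than a randomly chosen background site, counting ties as one half. This is a generic rank‐based calculation written in Python for illustration, not Maxent's own code; the function name and the scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Fraction of (presence, background) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores (not model output from the study).
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.1]
score = auc(presence, background)
print(score, score > 0.7)  # 0.7 is the threshold quoted in the text
```

An AUC of 0.5 corresponds to a model that ranks presence and background sites no better than chance.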

RESULTS

Samples and loci assessment

The ploidy of every individual was tested, and mixed ploidies were recognized in a few populations. Confounding individuals were removed to give a single ploidy for each population. Further analyses were performed on the remaining 1,020 individuals from 72 populations. After Bonferroni corrections, significant deviation from HWE induced by homozygote excess was detected in most populations (Table A2), and the excess was mainly distributed in loci SSRE3, SSR2, and SSRE2. There was no evidence for LD, but 27 null alleles were found to exist in all loci. The null alleles were regarded as missing data in subsequent analysis. The PIC values of the 10 loci ranged from 0.141 to 0.830, with an average value of 0.595 (Table 1). Among these, the values of three loci (SSR1, SSR5, and SSR6) were less than 0.5, indicating that the other seven primers are suitable for identification purposes. The mean values of Ar, He, Ho, F_ST, and G_ST were 3.543, 0.595, 0.334, 0.491, and 0.508, respectively (Table 2, Table A2). These loci have a high level of genetic diversity and differentiation.

Table A2

Results of Hardy–Weinberg equilibrium and linkage disequilibria

Pop No.	Pop ID	Hardy–Weinberg equilibrium										Linkage disequilibria
Pop No.	Pop ID	SSRE1	SSRE2	SSRE3	SSRE4	SSR1	SSR2	SSR3	SSR4	SSR5	SSR6	Linkage disequilibria
1	NN	0.010	0.880	0.083	0.062	0.970	0.100	0.085	0.080	0.753	M	0.000*
2	HS	0.367	0.860	0.398	0.052	0.038	0.004*	0.722	0.708	0.860	M	0.000*
3	ZJ	0.943	0.038	0.101	0.533	M	0.773	M	0.773	M	0.308	0.000*
4	GD	M	M	M	0.885	0.000*	M	M	0.002*	M	M	0.000*
5	JY	0.007*	0.790	0.003*	0.837	0.459	0.182	0.908	0.568	M	0.356	0.000*
6	WS	0.000*	M	0.001*	0.000*	0.790	0.086	0.459	0.000*	M	M	0.000*
7	ML	0.000*	0.000*	0.000*	0.000*	M	0.239	0.000*	0.000*	M	M	0.000*
8	WN	0.875	0.494	M	0.007	M	0.001*	0.088	M	M	0.007	0.000*
9	DL	0.000*	0.019	0.175	0.002*	0.868	0.429	0.164	0.002*	0.002*	M	0.000*
10	DL2	0.016	0.003*	0.003*	0.003*	0.860	0.708	0.003*	M	0.029	M	0.000*
11	RH	0.000*	0.519	0.000*	M	0.894	0.002*	M	M	M	M	0.000*
12	ZT	0.667	0.258	0.083	0.551	0.894	0.555	0.847	0.085	0.894	0.894	0.000*
13	NJ	0.842	0.753	0.248	0.392	M	0.001*	M	0.665	0.001*	0.248	0.000*
14	ZS	0.868	M	0.002*	M	0.577	0.002*	M	0.429	0.002*	0.175	0.000*
15	QC	0.217	0.024	0.006*	0.774	0.725	0.868	0.868	0.083	M	0.577	0.000*
16	QC2	0.294	0.740	0.749	0.367	0.875	0.036	0.461	0.906	0.875	0.740	0.000*
17	LC	0.338	0.396	0.601	0.157	0.063	0.494	M	0.060	0.494	0.964	0.000*
18	GP	0.424	0.034	0.005*	M	M	0.000*	0.439	0.001*	M	0.424	0.000*
19	JS	0.901	0.000*	0.000*	0.001*	M	0.690	M	0.690	0.690	M	0.000*
20	CZ	M	0.551	0.930	0.001*	M	0.439	0.053	M	M	0.025	0.000*
21	ES	0.894	0.002*	0.782	M	M	0.000*	0.000*	M	0.894	M	0.000*
22	TC	0.897	0.897	0.356	0.129	0.897	0.094	M	0.033	0.790	M	0.000*
23	GY	0.025	M	0.584	0.180	M	0.773	0.360	0.005*	M	M	0.000*
24	YF	0.577	0.725	0.010	0.003*	M	0.103	0.868	0.002*	M	M	0.000*
25	YX	0.000*	M	0.000*	0.000*	0.885	0.000*	0.000*	M	M	M	0.000*
26	BS	0.000*	0.000*	0.002*	0.000*	0.894	0.005*	0.439	M	M	0.000*	0.000*
27	RJ	M	0.000*	M	0.000*	M	0.596	M	M	M	0.803	0.000*
28	XC	M	M	M	M	M	M	M	M	0.000*	M	0.000*
29	YH	0.584	0.000*	0.852	M	M	0.852	0.890	0.000*	M	M	0.000*
30	SQ	M	0.019	0.125	M	M	M	0.002*	0.002*	0.019	M	0.000*
31	KM	M	M	M	M	M	0.002*	0.002*	M	0.002*	M	0.000*
32	HN	M	M	M	M	0.880	M	M	M	0.753	M	0.000*
33	LH	0.002*	0.035	0.003*	0.229	M	0.126	0.154	0.753	M	M	0.000*
34	YN	0.003*	M	M	0.003*	0.134	0.003*	0.003*	M	M	0.003*	0.000*
35	YJ	0.773	0.000*	0.262	0.773	0.773	0.524	0.000*	0.229	0.229	0.533	0.000*
36	YT	0.000*	0.000*	0.000*	0.000*	0.901	0.000*	M	0.000*	0.000*	M	0.000*
37	LP	0.000*	0.000*	0.000*	M	M	0.000*	0.901	0.901	0.996	M	0.000*
38	LJ	0.001*	0.894	M	0.949	0.894	0.053	0.189	0.894	M	0.894	0.000*
39	TH	0.001*	0.001*	M	M	0.875	M	0.001*	M	M	M	0.000*
40	ZZ	M	0.582	0.690	0.002*	M	0.000*	0.049	0.000*	M	M	0.000*
41	SR	M	M	0.000*	M	0.015	0.894	0.000*	M	M	M	0.000*
42	LD	0.000*	0.000*	0.000*	0.000*	0.007	0.000*	M	0.002*	0.000*	M	0.000*
43	CS	0.056	0.003*	0.003*	0.860	0.948	0.000*	0.708	0.003*	0.708	M	0.000*
44	LA	0.484	0.003*	0.001*	0.917	0.917	0.000*	0.656	M	M	0.484	0.000*
45	PZ	0.011	M	M	0.002*	0.638	0.000*	0.000*	0.764	0.885	M	0.000*
46	FH	0.910	0.325	0.000*	0.035	M	0.577	0.098	0.004*	M	0.577	0.000*
47	EM	0.002*	0.002*	0.002*	0.002*	0.868	0.577	0.002*	0.002*	M	M	0.000*
48	SY	0.106	0.001*	0.003*	0.015	0.753	0.182	0.753	0.397	M	0.880	0.000*
49	JH	M	0.001*	0.058	M	M	M	0.001*	0.014	M	M	0.000*
50	XY	0.809	0.120	0.000*	0.809	M	0.032	0.044	0.245	M	M	0.000*
51	FX	0.003*	M	0.000*	M	M	0.000*	0.134	0.708	0.860	M	0.000*
52	XX	0.839	M	0.072	0.008	M	0.008	0.047	0.008	M	M	0.000*
53	XX2	0.125	0.007	0.000*	0.038	M	0.001*	0.621	0.621	M	M	0.000*
54	GM	0.000*	0.000*	M	0.002*	0.439	0.000*	0.239	0.667	M	M	0.000*
55	WY	0.000*	0.717	0.128	0.000*	M	0.717	0.084	0.492	M	0.717	0.000*
56	DY	0.000*	0.000*	0.000*	0.000*	M	0.000*	0.000*	0.000*	M	M	0.000*
57	KZ	0.391	0.708	0.037	0.096	M	0.124	0.141	0.764	0.860	M	0.000*
58	SX	0.000*	0.000*	0.000*	0.000*	0.047	0.000*	0.000*	0.000*	M	M	0.000*
59	AL	0.614	M	0.136	0.061	0.004*	0.036	0.006*	0.069	M	M	0.000*
60	HH	0.909	0.909	0.264	0.048	0.970	M	0.341	0.051	M	0.343	0.000*
61	XA	0.010	M	0.006*	0.136	0.982	0.001*	0.523	0.017	M	M	0.000*
62	HF	0.325	0.037	0.000*	0.005*	0.012	0.000*	0.080	0.053	0.004*	0.103	0.000*
63	LZ	0.279	0.002*	0.005*	0.364	0.999	0.000*	0.000*	0.764	0.140	0.391	0.000*
64	WG	0.056	0.003*	0.000*	0.047	0.667	0.002*	0.002*	0.000*	M	M	0.000*
65	WL	0.347	0.001*	0.041	0.790	0.871	0.006*	0.002*	0.000*	0.790	0.263	0.000*
66	GZ	0.488	0.488	0.000*	0.248	M	0.006*	0.029	0.488	0.753	M	0.000*
67	BH	0.446	0.414	0.135	0.176	0.325	0.120	0.004*	0.414	M	0.709	0.000*
68	WD	0.791	0.001*	0.038	0.003*	M	0.158	0.015	0.024	0.890	0.890	0.000*
69	ST	0.325	M	0.003*	0.039	0.608	0.046	0.453	0.072	M	0.364	0.000*
70	HB	0.601	0.000*	0.006*	0.000*	0.909	0.009	0.000*	0.012	M	M	0.000*
71	TZ	0.086	0.000*	0.032	0.002*	0.921	0.001*	M	0.000*	0.667	0.980	0.000*
72	GA	0.013	0.013	0.013	0.753	M	0.058	0.001*	M	M	0.392	0.000*

Abbreviation: M, monomorphic site.

*Significant at the corrected significance level (p < .00693) by the sequential Bonferroni method.

Table 2

Summary of F‐statistics for each locus

Locus	F_IT	F_IS	F_ST	G_ST
Loc E1	0.404	−0.090	0.453	0.473
Loc E2	0.426	−0.182	0.514	0.538
Loc E3	0.320	−0.159	0.413	0.434
Loc E4	0.486	−0.087	0.527	0.537
Loc 1	0.196	0.134	0.071	0.083
Loc 2	0.439	−0.074	0.477	0.510
Loc 3	0.500	−0.110	0.550	0.519
Loc 4	0.623	0.057	0.600	0.633
Loc 5	0.263	−0.267	0.418	0.480
Loc 6	0.391	−0.035	0.412	0.448
Mean	0.444	−0.092	0.491	0.508
SE	0.036	0.027	0.025	0.025

G_ST: equivalent to F_ST, but an estimator with different statistical properties.


Genetic diversity of G. pentaphyllum populations

The level of genetic diversity in the 72 G. pentaphyllum populations was relatively low. He ranged from 0.024 to 0.513, with a mean of 0.297, while Ho ranged from 0.100 to 0.710, with a mean of 0.329. The observed gene diversity is significantly higher than the expected equilibrium gene diversity (r = .698, p < .01). The values of Ar, I, and PPL for each population ranged from 1.09 to 2.74, 0.046 to 0.987, and 10% to 100%, respectively (Table A3). The trends in each genetic parameter were consistent across the 72 populations: the HN and HF populations had the lowest and highest genetic diversity, respectively. Private alleles were found in fourteen populations; populations RH and ZT had two private alleles each, and the others had one. The 72 populations were divided into four groups by ploidy, and their genetic diversity indices were compared. The genetic diversity of the polyploid populations is greater than that of the diploid populations; the ranking of diversity is octoploid > tetraploid > hexaploid > diploid. The geographic distribution of population diversity based on Fp and He is shown in Figure 2. It is likely that the Qinling–Daba Mountain areas and southwest China are the center of genetic diversity for G. pentaphyllum. The correlations between genetic diversity indices and ploidy were calculated, and only Nae and He showed a significant positive correlation (Table 3). Correlations were also calculated between three diversity indices (He, I, and Fp) and longitude and latitude. The only significant positive correlations were between latitude and the He and I parameters (Figure 3).

Table A3

Summary of G. pentaphyllum population genetic parameters

Pop No.	Pop ID	Sample size	NA	Nae	AR	He	Ho	h	I	PPL (%)	Fi	Bn	Mode‐shift
Diploid		745	2.08	1.60	1.77	0.277	0.315	0.277	0.443	70.40
1	ST	19	2.50	1.82	2.03	0.337	0.300	0.337	0.570	80.00	0.113	0.125	S
2	LA	24	2.00	1.41	1.62	0.222	0.254	0.222	0.363	80.00	−0.147	0.629	L
3	XC	16	1.10	1.11	1.10	0.052	0.100	0.052	0.069	10.00	−1.000	0.250	S
4	JYa	16	2.40	1.57	1.98	0.305	0.302	0.305	0.517	90.00	0.036	0.875	S
5	SX	20	1.80	1.76	1.76	0.377	0.710	0.377	0.518	80.00	−0.926	0.004	S
6	GD	13	1.50	1.14	1.22	0.082	0.115	0.082	0.134	30.00	−0.44	0.875	L
7	GPa	15	2.00	1.67	1.83	0.321	0.208	0.321	0.492	70.00	0.364	0.012	S
8	LZa	13	2.80	1.92	2.23	0.432	0.569	0.432	0.697	100.00	−0.335	0.348	L
9	LP	17	1.80	1.45	1.49	0.229	0.424	0.229	0.330	70.00	−0.896	0.289	S
10	HN	12	1.20	1.03	1.09	0.024	0.025	0.024	0.046	20.00	−0.031	1.000	L
11	XY	19	2.10	1.59	1.78	0.282	0.321	0.282	0.452	70.00	−0.144	0.188	L
12	ES	15	1.70	1.38	1.45	0.187	0.327	0.187	0.281	60.00	−0.791	0.422	S
13	HF	20	3.90	2.39	2.74	0.513	0.379	0.513	0.987	100.00	0.301	0.138	S
14	FX	9	1.90	1.52	1.69	0.244	0.287	0.244	0.365	60.00	−0.156	0.281	S
15	WD	14	2.80	1.97	2.19	0.396	0.408	0.396	0.669	90.00	−0.039	0.410	S
16	HB	11	2.70	1.99	2.33	0.391	0.305	0.391	0.689	80.00	0.181	0.422	S
17	GZa	12	2.40	1.90	2.07	0.356	0.294	0.356	0.603	80.00	0.195	0.014	S
18	HS	9	3.10	2.17	2.39	0.397	0.369	0.397	0.692	90.00	0.182	0.752	S
19	ZJ	14	1.90	1.35	1.61	0.214	0.257	0.214	0.344	70.00	−0.212	0.711	L
20	ZZ	17	1.70	1.54	1.61	0.253	0.382	0.253	0.375	60.00	−0.538	0.039	S
21	BH	19	2.30	1.71	2.00	0.357	0.438	0.357	0.575	90.00	−0.194	0.019	S
22	DY	16	2.00	1.83	1.83	0.378	0.669	0.378	0.542	70.00	−0.814	0.004	S
23	RJa	18	1.40	1.25	1.30	0.134	0.233	0.134	0.195	40.00	−0.781	0.156	S
24	WN	18	1.60	1.41	1.52	0.228	0.306	0.228	0.328	60.00	−0.357	0.039	S
25	SR	15	1.40	1.24	1.29	0.129	0.213	0.129	0.186	40.00	−0.697	0.156	S
26	YX	13	1.60	1.55	1.53	0.268	0.508	0.268	0.363	60.00	−0.97	0.016	S
27	XA	20	2.30	1.72	1.90	0.332	0.372	0.332	0.568	70.00	−0.053	0.148	S
28	QC	10	2.50	1.77	2.04	0.321	0.330	0.321	0.541	90.00	−0.029	0.875	L
29	QC2a	11	2.70	1.84	2.11	0.371	0.327	0.371	0.611	100.00	0.124	0.615	L
30	FH	10	2.20	1.83	2.04	0.382	0.340	0.382	0.584	80.00	0.116	0.020	S
31	GA	12	2.10	1.68	1.84	0.342	0.475	0.342	0.488	70.00	−0.327	0.020	S
32	GY	14	1.80	1.54	1.65	0.258	0.250	0.258	0.390	60.00	0.032	0.039	S
33	GMa	15	1.80	1.60	1.71	0.301	0.500	0.301	0.438	70.00	−0.699	0.012	S
34	PZ	13	2.10	1.71	1.81	0.280	0.246	0.280	0.454	70.00	0.126	0.289	S
35	WY	20	2.20	1.44	1.82	0.257	0.195	0.257	0.437	80.00	0.217	0.809	L
36	LJ	15	1.90	1.36	1.49	0.201	0.193	0.201	0.307	80.00	0.037	0.629	L
37	AL	20	3.00	2.00	2.22	0.371	0.360	0.371	0.667	70.00	0.106	0.188	L
38	HH	20	2.70	1.80	2.01	0.292	0.343	0.292	0.523	80.00	−0.114	0.680	L
39	TC	16	2.00	1.52	1.65	0.234	0.194	0.234	0.378	80.00	0.177	0.680	L
40	CSa	9	2.40	1.36	1.97	0.224	0.099	0.224	0.508	90.00	0.386	0.367	S
41	CZ	15	1.60	1.42	1.55	0.239	0.333	0.239	0.341	60.00	−0.417	0.023	S
42	JH	11	1.40	1.27	1.37	0.158	0.082	0.158	0.225	40.00	0.494	0.031	S
43	NNa	12	2.70	1.90	2.20	0.390	0.415	0.390	0.640	90.00	0.029	0.590	L
44	LC	18	2.20	1.55	1.86	0.315	0.323	0.315	0.499	90.00	0.004	0.410	S
45	MLa	15	2.00	1.40	1.73	0.250	0.113	0.250	0.405	70.00	0.556	0.594	S
46	TH	11	1.40	1.34	1.34	0.166	0.309	0.166	0.226	40.00	−0.943	0.063	S
47	ZTb	15	2.20	1.71	1.86	0.359	0.373	0.359	0.527	100.00	−0.041	0.138	L
48	WS	16	1.90	1.32	1.54	0.180	0.087	0.180	0.302	70.00	0.522	0.766	L
49	YH	14	1.60	1.40	1.51	0.224	0.243	0.224	0.319	60.00	−0.089	0.078	S
50	YN	9	1.60	1.64	1.60	0.312	0.567	0.312	0.410	60.00	−0.915	0.008	S
Tetraploid		225	2.15	1.81	1.89	0.339	0.377	0.339	0.510	72.78
51	BS	15	2.00	1.75	1.81	0.356	0.434	0.356	0.488	80.00	−0.236	0.010	S
52	RHb	15	1.90	1.60	1.61	0.219	0.254	0.219	0.285	50.00	−0.16	0.078	S
53	XX	7	1.80	1.71	1.68	0.272	0.314	0.272	0.363	60.00	−0.158	0.055	S
54	XX2	12	2.40	1.93	2.07	0.348	0.360	0.348	0.614	70.00	0.016	0.055	S
55	WGa	15	2.40	2.03	2.08	0.419	0.479	0.419	0.652	80.00	−0.142	0.010	S
56	LH	12	2.40	1.92	2.04	0.369	0.412	0.369	0.567	70.00	−0.099	0.020	S
57	SY	12	2.60	2.08	2.15	0.399	0.414	0.399	0.623	90.00	−0.059	0.102	L
58	NJ	12	2.00	1.59	1.79	0.325	0.317	0.325	0.487	80.00	0.026	0.125	S
59	ZS	10	1.70	1.49	1.59	0.266	0.293	0.266	0.381	70.00	−0.114	0.039	S
60	YF	10	2.20	1.70	1.73	0.280	0.282	0.280	0.450	70.00	0.018	0.531	S
61	YT	17	2.10	2.01	1.97	0.399	0.521	0.399	0.498	80.00	−0.323	0.004	S
62	WL	16	2.90	2.02	2.34	0.474	0.447	0.474	0.810	100.00	0.092	0.053	L
63	TZ	15	2.90	2.08	2.26	0.419	0.416	0.419	0.714	90.00	0.03	0.326	L
64	EM	10	1.90	1.75	1.80	0.356	0.447	0.356	0.478	80.00	−0.28	0.010	S
65	LD	18	1.90	1.80	1.83	0.382	0.483	0.382	0.512	80.00	−0.281	0.004	S
66	KM	10	1.40	1.39	1.38	0.167	0.217	0.167	0.208	30.00	−0.333	0.063	S
67	KZ	9	2.30	1.97	2.04	0.361	0.330	0.361	0.583	80.00	0.123	0.156	S
68	SQ	10	1.90	1.82	1.80	0.301	0.363	0.301	0.464	50.00	−0.23	0.016	S
Hexaploid		31	2.20	1.56	1.85	0.284	0.176	0.284	0.476	85.00
69	JS	17	1.90	1.50	1.67	0.226	0.138	0.226	0.376	70.00	0.337	0.469	L
70	YJa	14	2.50	1.62	2.02	0.342	0.214	0.342	0.576	100.00	0.388	0.461	S
Octaploid		19	2.20	1.99	2.05	0.407	0.395	0.407	0.583	85.00
71	DL	10	2.40	2.19	2.23	0.455	0.418	0.455	0.660	90.00	0.088	0.007	S
72	DL2	9	2.00	1.79	1.86	0.360	0.372	0.360	0.506	80.00	−0.04	0.014	S
Mean			2.10	1.66	1.81	0.297	0.329	0.297	0.465	71.81

NA, number of alleles; Nae, effective number of alleles; AR (k = 8), allelic richness (expected number of alleles among eight gene copies); He, gene diversity corrected for sample size; Ho, observed heterozygosity; Fi, individual inbreeding coefficient; h, gene diversity with unordered alleles; I, Shannon's Information Index = −1 × Sum (pi × Ln (pi)), where pi is the frequency of the ith allele for the population; PPL, percentage of polymorphic loci; Bn, bottleneck probability of the Wilcoxon one‐tailed test for heterozygote excess under the TPM; L, normal L‐shaped distribution; S, shifted mode.

Pop ID suffix a: populations with one private allele.

Pop ID suffix b: populations with two private alleles.
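
The single-locus quantities behind several Table A3 columns can be computed directly from allele counts. The sketch below is a minimal illustration, assuming the textbook definitions (Nae = 1 / Σpi², He = n/(n − 1) × (1 − Σpi²) with n gene copies, and I as defined in the footnote); the function name and the allele counts are hypothetical, and the software used in the study may average across loci or handle polyploid genotypes differently.

```python
import math


def diversity_indices(allele_counts):
    """Single-locus diversity statistics from allele counts (gene copies)."""
    n = sum(allele_counts)                      # gene copies sampled
    freqs = [c / n for c in allele_counts if c > 0]
    homozygosity = sum(p * p for p in freqs)    # sum of squared frequencies
    nae = 1.0 / homozygosity                    # effective number of alleles
    he = (n / (n - 1)) * (1.0 - homozygosity)   # sample-size corrected gene diversity
    shannon = -sum(p * math.log(p) for p in freqs)
    return {"NA": len(freqs), "Nae": nae, "He": he, "I": shannon}


# Hypothetical locus: 20 diploid individuals (40 gene copies), two alleles
stats = diversity_indices([30, 10])
print({k: round(v, 3) for k, v in stats.items()})
```

A monomorphic locus gives Nae = 1, He = 0, and I = 0, which is why populations with few polymorphic loci, such as HN, sit at the bottom of every index at once.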

Figure 2

Distribution of G. pentaphyllum population diversity based on frequency of private allele and expected heterozygosity. (a) Frequency of private allele (Fp) for all populations, (b) expected heterozygosity (He) for all populations, (c) expected heterozygosity for diploid populations, (d) expected heterozygosity for polyploidy populations. Red represents the higher level and blue represents the lower. Black dots indicate the sampling sites

Table 3

The correlation between genetic diversity indices and ploidy of each population

	Fp	Nae	AR	He	I	PPL
Ploidy
Pearson's correlation coefficient	−0.009	0.321	0.191	0.291	0.189	0.161
p‐Value	.938	.006*	.108	.013*	.112	.177

*p < .05.

Figure 3

Scatter diagram correlation between the longitude and latitude and three diversity indices, that is, expected heterozygosity, Shannon's Information Index, and private allele frequency, respectively

Genetic structure and divergence

STRUCTURE analysis clearly differentiated the populations into two clusters, a north cluster (N) and a south cluster (S), with little admixture (Figure 1). There is a clear peak in the value of ΔK at K = 2 and a small peak at K = 5 (Figure 4b). Some populations also form clusters at K = 3 or K = 4, but these groupings are inconsistent and show high variance (Figure 4c,d). The south cluster (S) was stable at higher K values, but the north cluster (N) showed some evidence of partitioning into further clusters: a northwest cluster (NW), a north‐central cluster 1 (NC1), a north‐central cluster 2 (NC2), and a northeast cluster (NE). Most of the polyploid populations fell into clusters with similar genotypes rather than grouping with other clusters or forming a single group of their own; the exceptions were two tetraploid populations in the northeast cluster (NJ and ZS).
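
The peak-finding step can be illustrated with a short sketch. It assumes that ΔK here is the commonly used second-order rate-of-change statistic, computed in one common formulation as |mean L(K+1) − 2 mean L(K) + mean L(K−1)| divided by the standard deviation of L(K) across replicates. The function name and the log-likelihood values are hypothetical and are not the study's output.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map K -> delta K, given K -> list of replicate log-likelihoods L(K)."""
    result = {}
    for k in sorted(lnp_by_k):
        # Delta K needs both neighbours, so the smallest and largest K are skipped.
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical replicate log-likelihoods for K = 1..4
runs = {
    1: [-5000, -5002, -4998],
    2: [-4000, -4001, -3999],
    3: [-3900, -3905, -3895],
    4: [-3850, -3860, -3840],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic divides by the between-replicate standard deviation, values of K whose runs disagree with one another are penalized, which fits the observation that the K = 3 and K = 4 groupings are inconsistent.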

Figure 4

Genetic structure for K = 3 to K = 5 for 72 populations of G. pentaphyllum. Bayesian inference analysis for determining the most likely number of clusters (K) for the distribution of (a) the likelihood L(K) values and (b) ΔK values was presented for K = 1–12 (10 replicates per K‐value). (c–e) Individual assignment to 3–5 clusters for all 72 populations was visualized as pie charts. Each population was partitioned into several colored parts proportionally to its membership in a given cluster; colored rings around the pie charts represented the ploidy of each population (gray: diploid; light blue: tetraploid; orange: hexaploid; purple: octaploid). (f) STRUCTURE plots were presented for K = 3 to K = 5, respectively. Each vertical bar represents a population and its assignment proportion into one of three to five (colored) population clusters (K)

The relationships between populations based on PCoA plots of pairwise Euclidean distances are consistent with the results of the STRUCTURE analysis. The first three components account for 31.38% of the variance and separate all populations into their respective groups (Figure 5). Components PC1, PC2, and PC3 account for 13.54%, 11.24%, and 8.60% of the total variance, respectively. The UPGMA tree based on a matrix of Nei's genetic distance among the 72 populations divided the accessions into five main branches, with three additional subclusters within those branches (Figure 6). The branches and clusters are consistent with those of the STRUCTURE analysis.

Figure 5

PCoA graph of G. pentaphyllum. Principal coordinate analysis of pairwise distances between populations of G. pentaphyllum. The percentages of variation explained by the first three axes were 13.54%, 11.24%, and 8.60%, respectively

Figure 6

The UPGMA clustering tree of 72 populations of G. pentaphyllum. Individual assignment to two to five clusters for all 72 populations was visualized as pie charts. Each population was partitioned into several colored parts proportionally to its membership in a given cluster; colored rings around the pie charts represented the ploidy of each population (gray: diploid; light blue: tetraploid; orange: hexaploid; purple: octaploid)

The results of AMOVA reveal that the genetic variation is mostly within populations (Table 4). The percentages of variation within populations at the species level, for two and five clusters, and at the ploidy level are 50.42%, 48.19%, 49.22%, and 49.71%, respectively. The corresponding F_ST values are 0.496, 0.518, 0.508, and 0.503. Results of the Mantel test on the 72 populations (Figure 7) show that there is a significant linear relationship between Nei's genetic distance and geographic distance (r = .1518, p < .001) and between F_ST and geographic distance (r = .1564, p < .001). These results indicate that the genetic diversity and variation are related to geographic distribution.

Table 4

Results of AMOVA for the populations of G. pentaphyllum

Source of variation	df	Sum of squares	Variance components	Percentage of variation	Fixation indices
(1) Populations
Among populations	71	3,053.739	1.46669 Va	49.58	F _ST: 0.496
Within populations	1,968	2,935.5	1.49162 Vb	50.42
Total	2,039	5,989.239	2.95831
(2) Two clusters
Among groups	1	305.057	0.26780 Va	8.65	F _CT: 0.087
Among populations within groups	70	2,748.682	1.33611 Vb	43.16	F _SC: 0.473
Within populations	1,968	2,935.5	1.49162 Vc	48.19	F _ST: 0.518
Total	2,039	5,989.239	3.09553
(3) Five clusters
Among groups	4	677.497	0.33604 Va	11.09	F _CT: 0.111
Among populations within groups	67	2,376.242	1.20302 Vb	39.69	F _SC: 0.446
Within populations	1,968	2,935.5	1.49162 Vc	49.22	F _ST: 0.508
Total	2,039	5,989.239	3.03068
(4) Four ploidies
Among groups	3	179.266	0.07368 Va	2.46	F _CT: 0.025
Among populations within groups	68	2,874.473	1.43553 Vb	47.84	F _SC: 0.490
Within populations	1,968	2,935.5	1.49162 Vc	49.71	F _ST: 0.503
Total	2,039	5,989.239	3.00083

F_ST, F_SC, and F_CT are related by (1 − F_ST) = (1 − F_SC)(1 − F_CT).

Abbreviations: df, degrees of freedom; PV, percentage of variation; SS, sum of squares; VC, variance components; ΦCT, differentiation among groups within three species; ΦSC, differentiation among populations within species; ΦST, differentiation among populations within three species.
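
The fixation indices in Table 4 can be recovered from the variance components. The sketch below is a minimal illustration, assuming the standard hierarchical definitions F_CT = Va / total, F_SC = Vb / (Vb + Vc), and F_ST = (Va + Vb) / total; the function name is illustrative. Under these definitions the indices combine multiplicatively, (1 − F_ST) = (1 − F_SC)(1 − F_CT), rather than as a simple sum, and the two-cluster components reproduce the tabulated values to rounding.

```python
def fixation_indices(va, vb, vc):
    """Hierarchical fixation indices from AMOVA variance components.

    va: among groups, vb: among populations within groups,
    vc: within populations.
    """
    total = va + vb + vc
    f_ct = va / total          # groups relative to the total
    f_sc = vb / (vb + vc)      # populations relative to their group
    f_st = (va + vb) / total   # populations relative to the total
    return f_ct, f_sc, f_st


# Variance components for the two-cluster grouping in Table 4
f_ct, f_sc, f_st = fixation_indices(0.26780, 1.33611, 1.49162)
print(round(f_ct, 3), round(f_sc, 3), round(f_st, 3))
```

The small F_CT relative to F_SC shows that most of the among-population variance lies between populations inside each cluster, not between the clusters themselves.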

Figure 7

Results of Mantel test. (a) Nei's genetic distance versus geographic distance; (b) F_ST values versus geographic distance
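
The logic of a Mantel test can be sketched with a simple permutation scheme: correlate the two distance matrices, then repeatedly shuffle the population labels of one matrix to build a null distribution. This is a minimal one-tailed illustration, not the software used in the study; the function names, the number of permutations, and the five-population example are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Flatten the upper triangle after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-tailed permutation Mantel test; returns (r, p)."""
    rng = random.Random(seed)
    labels = list(range(len(genetic)))
    geo = upper_triangle(geographic, labels)
    observed = pearson(upper_triangle(genetic, labels), geo)
    hits = 0
    for _ in range(permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # permute the population labels of one matrix
        if pearson(upper_triangle(genetic, shuffled), geo) >= observed:
            hits += 1
    # The observed arrangement counts as one member of the null distribution.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical example: five populations along a transect (positions in km)
positions = [0, 1, 3, 6, 10]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[math.sqrt(d) / 4 for d in row] for row in geographic]
r, p = mantel(genetic, geographic)
print(round(r, 3), p)
```

Whole rows and columns are permuted together because the pairwise distances are not independent observations; shuffling individual matrix cells would overstate the significance.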

Effective population size and population history

Our study indicated an asymmetrical pattern of historical gene flow among clusters. When the 72 populations are divided into two clusters, the mean migration rate (M) from the north (N) to the south (S) cluster is 92.507 and from S to N is 11.323. The gene flow (N_m) between the two clusters is also asymmetrical (Tables 5 and 6). When the populations are divided into five clusters, the value of M between pairs of clusters varies from 7.652 to 36.299. The effective population size ranges from 245 individuals in the south cluster (S) to 415 individuals in the northeast cluster (NE; Table 6). Among the five clusters, the highest value of N_m was 4.680, calculated for migration from the south (S) to the north‐central 1 (NC1) cluster. Contemporary gene flow between related pairs of clusters was nearly symmetrical in both directions, with slight differences. The highest contemporary migration rate (0.157) was calculated for migration from the NW cluster to the S cluster; the S cluster provided fewer migrants.

Table 5

Estimates of migration rate (M) among two and five clusters

Two clusters
Migrate‐n	N	S	BAYESASS	N	S
N	(–)	92.507 (91.319–93.702)	N	(–)	0.232
S	11.323 (11.011–11.639)	(–)	S	0.114	(–)
Five clusters
	S	NC1	NC2	NW	NE
Migrate‐n
S	(–)	36.299 (35.614–36.989)	14.163 (13.779–14.551)	9.2169 (8.920–9.518)	10.263 (9.977–10.551)
NC1	11.929 (11.601–12.260)	(–)	12.402 (12.048–12.761)	12.070 (11.731–12.412)	12.021 (11.713–12.332)
NC2	8.885 (8.601–9.173)	9.104 (8.765–9.448)	(–)	13.3614 (13.00–13.726)	13.277 (12.950–13.608)
NW	9.491 (9.196–9.790)	13.804 (13.390–14.226)	10.160 (9.836–10.489)	(–)	12.122 (11.814–12.433)
NE	8.437 (8.160–8.716)	10.765 (10.397–11.137)	7.7411 (7.453–8.033)	7.652 (7.385–7.922)	(–)
BAYESASS
S	(–)	0.041	0.042	0.042	0.065
NC1	0.002	(–)	0.041	0.043	0.053
NC2	0.002	0.042	(–)	0.042	0.075
NW	0.157	0.041	0.042	(–)	0.009
NE	0.002	0.042	0.042	0.040	(–)

Asymmetrical gene flow was shown in bold. Values in parentheses represent the 5% to 95% confidence intervals (CI). Gene flow is read with the clusters on the left as the source populations and the clusters on top as the recipient populations.

Table 6

The effective size of population per cluster and gene flow among all clusters

	Θ	N_e	M	m	N_m
Two clusters
S → N	0.497	248.640	11.323	0.0057	1.408
N → S	0.482	241.210	92.507	0.0463	11.157
Five clusters
NC1 → S	0.491	245.370	11.929	0.0060	1.463
NC2 → S			8.885	0.0044	1.090
NW → S			9.491	0.0047	1.164
NE → S			8.437	0.0042	1.035
S → NC1	0.516	257.840	36.299	0.0181	4.680
NC2 → NC1			9.104	0.0046	1.174
NW → NC1			13.804	0.0069	1.780
NE → NC1			10.765	0.0054	1.388
S → NC2	0.518	258.970	14.163	0.0071	1.834
NC1 → NC2			12.402	0.0062	1.606
NW → NC2			10.160	0.0051	1.316
NE → NC2			7.741	0.0039	1.002
S → NW	0.584	292.240	9.217	0.0046	1.347
NC1 → NW			12.070	0.0060	1.764
NC2 → NW			13.361	0.0067	1.952
NE → NW			7.652	0.0038	1.118
S → NE	0.830	415.140	10.263	0.0051	2.130
NC1 → NE			12.021	0.0060	2.495
NC2 → NE			13.277	0.0066	2.756
NW → NE			12.122	0.0061	2.516

M was mean effective migration rate (M = m/μ); N_e was the effective size of population; m was the migration rate per generation; Θ = 4N_e μ; N_m was the gene flow or number of migrants per population, and here, μ was the mutation rate using the value 5 × 10⁻⁴. Arrows showed the direction from one cluster to the other.
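
The derived columns of Table 6 follow from Θ and M through the definitions in the footnote. The sketch below is a minimal illustration, assuming those definitions and μ = 5 × 10⁻⁴; the function names are illustrative, and small differences from the tabulated values reflect rounding of Θ.

```python
MU = 5e-4  # mutation rate per generation, as stated in the table footnote


def effective_size(theta, mu=MU):
    """N_e from theta = 4 * N_e * mu."""
    return theta / (4.0 * mu)


def migration_rate(big_m, mu=MU):
    """m from M = m / mu."""
    return big_m * mu


def migrants_per_generation(theta, big_m, mu=MU):
    """N_m = N_e * m, which simplifies to theta * M / 4."""
    return effective_size(theta, mu) * migration_rate(big_m, mu)


# Two-cluster rows of Table 6 as (direction, theta, M)
rows = [("S -> N", 0.497, 11.323), ("N -> S", 0.482, 92.507)]
for label, theta, big_m in rows:
    print(
        label,
        round(effective_size(theta), 1),
        round(migration_rate(big_m), 4),
        round(migrants_per_generation(theta, big_m), 2),
    )
```

Because μ cancels in N_m = ΘM/4, the number of migrants does not depend on the assumed mutation rate, whereas N_e and m scale directly with it.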

Results of the genetic bottleneck analysis indicate that 34.72% of the populations (25 out of 72) have a high probability of genetic bottleneck (p < .05) and that 52 populations show a shifted mode in the mode‐shift indicator (Table A3; Figure 8). These populations are inferred to have experienced recent bottlenecks. Results of the 2MOD analysis suggest that a drift model rather than a gene flow–drift model led to the current population structure (p = 1.0, Bayes factor = 100,000).

Figure 8

The distribution of bottlenecked populations with four different methods in the TPM model. (a) Wilcoxon test; (b) mode‐shift indicator. Color scales refer to the significance level of the recent bottleneck inferred for each population and to the results of the mode‐shift indicator

Six bioclimatic variables were selected out of 19 for the ENM (Table 7). Precipitation of the warmest quarter (Bio18) had the highest contribution rate, at 52.4%, and temperature seasonality (Bio4) was the most important variable, with an importance coefficient of 33.9%. Analysis of correlations among the six variables shows no significant correlation, so these variables could be used for further analyses.

Table 7

Information of the six ecological variables

Name	Contribution rate (%)	Significance index	Ecological variables
Bio2	2.0	5.0	Mean monthly temperature range
Bio4	7.6	33.9	Temperature seasonality (STD*100)
Bio9	6.7	4.2	Mean temperature of driest quarter
Bio14	1.0	4.6	Precipitation of driest month
Bio15	6.7	12.9	Precipitation seasonality (CV)
Bio18	52.4	18.8	Precipitation of warmest quarter

The AUC value, based on 10 replicate runs, was 0.987, with a standard deviation of 0.003. The calculated distribution under the current climatic conditions is generally similar to the known distribution (Figure 9a), while the predicted suitable habitats for G. pentaphyllum in the LGM period are limited to the Himalayas and Qinling Mountains in southwest and central China (Figure 9b). It is inferred that G. pentaphyllum has expanded continuously since the LGM, with an increase in the geographic range to the north and east, including expansion onto the Korean Peninsula and the southern islands of Japan.

Figure 9

Species distribution modeling using maximum entropy modeling of G. pentaphyllum. Predicted distributions were shown for two periods, that is, (a) the present time and (b) the LGM (21,000 years before present) periods. Color scales refer to logistic probability of occurrence, and black dots indicate the sampling sites

DISCUSSION

Genetic divergence and diversity

Genetic diversity of a species reflects its evolutionary potential and allows for evolution and adaptation. The more abundant the genetic variation of a species is, the more adaptable it is. Thus, it is necessary to study the genetic diversity of a species to understand its biological properties (Grant, 1985). Subtropical China was a Pleistocene refugium for many ancient species during the Pleistocene glacial and interglacial cycles (e.g., Wang et al., 2009). Species in this region commonly have unique haplotypes, and the level of genetic differentiation among glacial refugia is usually high because of random allele fixation (Hewitt, 2000; Zhang et al., 2015). In the current study, six SSR and four EST‐SSR markers were used to evaluate the population genetics of a large number of G. pentaphyllum populations across its distribution range in subtropical China. The average level of genetic diversity of G. pentaphyllum is relatively low (He = 0.297, Ho = 0.329, I = 0.465, and PPL = 71.81%, Table A3), though the observed level of genetic diversity is significantly higher than the expected value (r = .698, p < .01). The trend of these indexes is fairly similar; the lowest and highest diversity were found in the HN and HF populations, respectively. In contrast, Wang, Zhang, Qian, and Zhao (2008) reported that the genetic diversity of 14 G. pentaphyllum populations was high (PPL = 96.39%, I = 0.407, He = 0.262), based on a study of ISSR markers. The difference may relate to the reproductive attributes of this species, the sample size, and/or the characteristics of the molecular markers. G. pentaphyllum is a perennial dioecious herbaceous plant that can be pollinated by insects or propagate asexually by rhizomes or bulbils. In the long term, asexual propagation would lead to a reduction of genetic differences among individuals within populations and enhance differences between populations. Moreover, insects' activity can increase the gene flow among individuals. 
The results of the AMOVA suggested that the largest share of variation is within populations. However, there was also a high degree of variation among populations (Table 4). Previous studies have suggested that small populations commonly experience serious genetic drift and that long‐term habitat isolation might intensify this effect, leading to genetic differentiation among populations by reducing the level of genetic diversity within populations (Ellstrand & Elam, 1993). A species with low genetic diversity lacks the evolutionary flexibility to cope with a changing ecological environment and is passive in longer‐term evolutionary processes (Genton, Shykoff, & Giraud, 2005). Thus, a species with a low level of genetic diversity is relatively more vulnerable to extinction (Chen et al., 2015; Knox, Bezold, Cabe, Williams, & Simurda, 2016; Nolan, Noyes, Bennett, Hunter, & Hunter, 2010). In contrast, a high level of genetic diversity tends to be associated with successful ecological adaptation (Ortego, Noguerales, Gugger, & Sork, 2015).

Genetic structure of G. pentaphyllum

Genetic distance is commonly used to describe the genetic structure of a population and the differences among populations (Nei, 1972). Among the 72 populations in this study, the highest genetic distance (1.42) was between the GD and YT populations, which lie close to the southernmost and northernmost extremes of the mainland populations. Results of the Mantel test show that genetic distance and geographic distance are significantly correlated (r = .1518, p < .001). It is speculated that the differentiation of populations might be related to the species' asexual reproductive characteristics, geographic isolation, and human activity. G. pentaphyllum has been overexploited in recent years, so the natural resources are becoming scarce. Moreover, many of the populations have a fragmented distribution, which also contributes to the genetic differentiation of G. pentaphyllum. Habitat fragmentation, therefore, has consequences for the genetic structure of species as well as for the ecological processes, abiotic factors, and the quantity and structure of species that make up an ecosystem (Saunders, Hobbs, & Margules, 1991; Templeton, Shaw, Routman, & Davis, 1990; Young, Boyle, & Brown, 1996). Patterns in genetic structure are produced by evolutionary and demographic processes at different temporal scales (Morris, Ickert‐Bond, Brunson, Soltis, & Soltis, 2008). Factors such as mutation, migration, natural selection, and genetic drift, as well as the evolutionary history and biological characteristics of the species, combine to produce a nonrandomly distributed pattern of genetic variation in space and time. The evolutionary potential of a species or populations depends to a large extent on the genetic structure of the population (Loveless & Hamrick, 1984). The results of the STRUCTURE analysis performed for this study indicate that the most likely genetic structure of the 72 studied populations is either two or five clusters (Figures 1 and 4).
With two clusters (K = 2), the populations fall into two groups located in the north and south of the study area, with admixture in the west. Based on the assumption that southwest China is the origin and diversity center of Gynostemma species (Chen, 1995), we speculate that the cradle of diversity for G. pentaphyllum was in southwest China. With five clusters (K = 5), the clusters are not geographically independent, and there is mixing in some areas, for example, the Hengduan Mountains and Qinling–Daba Mountains. An exception is provided by the HN population, which is located at the southernmost extreme of the distribution area but groups with the eastern populations. It is possible that this population originated in southwest China but experienced a similar evolutionary history to the eastern group. Some populations consisting of a single genetic component (i.e., the SX, DL, and LA populations) might have experienced significant bottleneck or founder effects.

Gene flow, migration, and diffusion

Mutation and genetic drift lead to genetic differentiation of local populations, and gene flow might promote evolution by spreading new genes (Slatkin, 1987) and producing changes in the spatial distribution and genetic structure of species. In sessile organisms, such as plants, gene flow occurs mainly through pollen and seed dispersal (Robledo‐Arnuncio, Klein, Muller‐Landau, & Santamaría, 2014; Slatkin, 1985). However, factors related to breeding (i.e., outcrossing and self‐fertilization rates), the mode of reproduction (i.e., biparental inbreeding and clonal propagation), and external factors (i.e., a capricious climate, uplifted mountains, broad rivers, wind direction, and animal activity) can facilitate or hinder the gene flow of plants (Robledo‐Arnuncio et al., 2014). A previous study has suggested that the effective gene flow for G. pentaphyllum (N m = 0.0622) is much less than one successful migrant per generation (Wang et al., 2008). In contrast, the results of this study suggest a higher rate of gene flow (N m > 1). A possible reason for the discrepancy is that the large number of samples and sampling strategy in this study reduced the geographic distances among the populations. G. pentaphyllum is a perennial herbaceous plant that is dioecious and pollinated by insects. Pollen flow mediated by insects can promote gene exchange among adjacent populations and individuals. Moreover, according to the Bayesian clustering results, the clusters were not completely independent, but show mosaic phenomena in some areas where the genetic diversity was also abundant, such as the Hengduan Mountains and Qinling–Daba Mountains. It is, therefore, possible that high genetic diversity can be used as evidence for frequent gene flow. The southwest region of China is the current center of distribution and diversity center for G. pentaphyllum (Chen, 1995), and Wang et al. (2008) suggested that the species originated in the area around the Hengduan Mountains. 
This suggestion is consistent with our results; populations from the Hengduan Mountains displayed components from all Bayesian clusters. Furthermore, the results of ENM suggest that populations from the Hengduan Mountains area are relatively old. We concur with Wang et al. (2008) that G. pentaphyllum originated in the Hengduan Mountains area, and speculate that the species expanded northward and eastward along three trajectories. The first trajectory was along the Hengduan Mountains and the edge of Sichuan Basin to the north, through the Qinling–Daba Mountain area, and then eastward, though the eastward spread was affected by the east–west mountain ranges. The populations of this cluster are mostly distributed through mountain areas with a complex topography and varied climate, so new clusters formed during the migration process. Populations on the plains in the east of China mostly fall into a single cluster. The second trajectory was from the southwest of China toward the east. The populations in this cluster are similar to each other, and their compositions are stable. The landforms of eastern and southern China are mostly plains and low hills; these environments are not conducive to the production of new genotypes, and most of the populations in this region experienced bottlenecks, resulting in the reduction of genetic diversity. The effective size of the populations that experienced bottlenecks is usually small, and the number of alleles and heterozygosity is expected to be correspondingly decreased. However, the observed heterozygosity is greater than the heterozygosity calculated from the number of alleles using the mutation–drift equilibrium; this phenomenon is known as heterozygosity excess (Piry et al., 1999). The third trajectory was from the source to Hainan Island, through northern Vietnam and south China. Results of the ENM analysis also suggest that G. 
pentaphyllum has expanded its distribution range continuously since the Last Glacial Maximum (LGM; Figure 9). Indeed, the genus Gynostemma is thought to have originated in “West Sichuan Central Yunnan old land,” while southwest China is its modern center of distribution and diversity (Chen, 1995). Moreover, the recent expansion of G. pentaphyllum populations in China is from the southwest to the east and north. The historical gene flow from north to south documented by this study is significantly greater than the flow from south to north. The largest recent gene flow was from the NW cluster to the S cluster; this observation suggests that the southern populations are of recent origin.

Origin of polyploidization

Polyploidization is one of the most important evolutionary characteristics of plant species and a major driving force for the high diversity of angiosperms (Otto & Whitton, 2000; Soltis & Soltis, 1999). Approximately 47% of angiosperm species and 80% of ferns have undergone polyploidization in their evolutionary history (Cui et al., 2006; Soltis, 2005). Compared with diploid species, polyploid species may have broader niches and/or larger distribution ranges (Ehrendorfer, 1980; Li, Wan, Guo, Abbott, & Rao, 2014; Parisod & Besnard, 2007; Ramsey & Ramsey, 2014; Tremetsberger, König, Samuel, Pinsker, & Stuessy, 2002), exhibit increased vigor and competitiveness (te Beest et al., 2011; Lumaret, Guillerm, Maillet, & Verlaque, 1997; Maceira, Jacquard, & Lumaret, 1993; Schlaepfer, Edwards, & Billeter, 2010), and show a preference for distinct habitats (McIntyre, 2012; Ramsey, 2011). Polyploid plants often originate from diploid ancestors, and their origin is often associated with dramatic climate fluctuations and changes in the geological environment (Parisod, Holderegger, & Brochmann, 2010). Previous studies suggest that polyploidization occurred throughout the Quaternary period, and many plant groups exhibit high degrees of polyploidy (Brysting, Oxelman, Huber, Moulton, & Brochmann, 2007). For example, a study of five chloroplast DNA sequence fragments from the diploid–tetraploid complex of Allium przewalskianum in the Qinghai–Tibet Plateau and adjacent areas concluded that the tetraploid populations of the species originated from the diploid ancestor at least eight separate times and that the species had undergone at least one geographic expansion during the origin of the polyploid complex (Wu, Cui, Milne, Sun, & Liu, 2010).
In general, the derivation of polyploids from different diploid ancestors induces a high level of genetic variation and population differentiation in the polyploid species, and hybridization and genomic recombination among the resulting autopolyploids further increase their genetic diversity. The results of this study show that polyploid populations have high levels of genetic diversity. The genus Gynostemma might have originated from the “West Sichuan Central Yunnan old land” in the early Tertiary (Chen, 1995). Thus, G. pentaphyllum probably experienced the effects of severe climate instability and changes to its geological environment during the Quaternary glacial–interglacial period. Polyploidization in natural populations of G. pentaphyllum has occurred throughout its long history, including during periods of migration and diffusion. Most of the polyploid populations in this study occur on the edge of the Sichuan Basin and in the Qinling–Daba Mountain area, where the topography and geological history are complex. It is speculated that these populations were affected by changes to the geology and climate. Moreover, the polyploid populations are commonly fragmented and occur in moist forests; it is proposed that environmental changes and migration of the species drove the emergence of polyploids in G. pentaphyllum. The results of this study suggest that G. pentaphyllum is autopolyploid (Jiang et al., 2009). The results of both Bayesian clustering and the UPGMA tree showed that most polyploid populations in this study were placed in the same cluster as their geographically adjacent diploid populations. Most of the polyploid populations have components in common with neighboring diploid populations, rather than forming a single cluster. Therefore, the polyploid populations are likely to have originated from the adjacent diploid populations and to have coexisted with their diploid parents. The origin of polyploidy in G. pentaphyllum is therefore inferred to be polygenetic. A similar result was found in the study of Galax urceolata (Servick, Visger, Gitzendanner, Soltis, & Soltis, 2015). Some polyploid populations are separated from adjacent diploid populations, such as the NJ and ZS tetraploid populations. It is speculated that a primitive genotype was preserved and doubled, and then adapted through the process of polyploidization. Such processes explain the geographic distribution pattern of coexisting polyploid and diploid populations. Polyploid species have inherent advantages because they can adapt readily to environmental changes and/or occupy new environments. However, whether polyploidization occurred once or multiple times has not yet been determined. Therefore, further work on the origin and evolution of polyploids in G. pentaphyllum, using modern phylogeography based on molecular methods, is necessary. The use of plant sequence fragments to construct geographic genetic distribution patterns for the different genetic backgrounds, and simulations of evolution using statistical population analysis, have the potential to provide further information on the origin and evolutionary history of this species and its polyploid complexes in the future.

Implications for conservation

Studies of the genetic diversity and genetic structure of species are important components of biodiversity conservation. Preferential conservation of populations with high diversity optimizes the potential of a species to adapt. However, populations with low genetic diversity should be protected from the threats that arise from evolutionary factors. As a traditional medicinal plant in China, G. pentaphyllum has a high medicinal value, and it has been listed as a Grade II Key Protected Wild Plant Species by the Chinese government. Cultivation sites, such as Pingli Jiaogulan Base (Ankang, China), have been established to breed G. pentaphyllum, but there is still a risk that the wild resource could be depleted. Therefore, both in situ and ex situ measures should be taken to protect G. pentaphyllum resources. Potential measures to protect G. pentaphyllum include: (a) education to enhance public awareness and understanding of the importance of wild plants and to develop a culture of protection; (b) the establishment of demonstration bases to encourage the public to protect G. pentaphyllum; (c) correct usage of G. pentaphyllum resources: natural populations should not be excavated, and their habitats should be protected; (d) prevention of hybridization between cultivated and wild individuals to avoid genomic contamination; and (e) in situ measures to protect populations with high levels of genetic diversity (i.e., the polyploid populations and the diploid HF, WD, and HS populations), together with a combination of in situ and ex situ measures for populations that exhibit specific genotypes and private alleles (i.e., ZT, ML, NN, CS, GM, QC2, RJ, GZ, GP, and JY), for example, removal of plants to a park or botanical garden for protection and scientific study. To summarize, wild G. pentaphyllum resources should be protected and developed sustainably to enable continued utilization of this natural resource.

CONFLICT OF INTEREST

None declared.

AUTHORS' CONTRIBUTIONS

G.Z. and Z.L. conceived the ideas; X.Z., H.S., J.Y., and L.F. contributed to the sample collection; X.Z. did the experiments, analyzed the data, and wrote the manuscript. All authors read and approved the final manuscript.

57 in total

1. Polyploidy: recurrent formation and genome evolution.

Authors: D E Soltis; P S Soltis
Journal: Trends Ecol Evol Date: 1999-09 Impact factor: 17.712

Review 2. The genetic legacy of the Quaternary ice ages.

Authors: G Hewitt
Journal: Nature Date: 2000-06-22 Impact factor: 49.962

Review 3. Utility of low-copy nuclear gene sequences in plant phylogenetics.

Authors: Tao Sang
Journal: Crit Rev Biochem Mol Biol Date: 2002 Impact factor: 8.250

4. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study.

Authors: G Evanno; S Regnaut; J Goudet
Journal: Mol Ecol Date: 2005-07 Impact factor: 6.185

5. Multiple autopolyploidizations and range expansion of Allium przewalskianum Regel. (Alliaceae) in the Qinghai-Tibetan Plateau.

Authors: Li-Li Wu; Xie-Kui Cui; Richard I Milne; Yong-Shuai Sun; Jian-Quan Liu
Journal: Mol Ecol Date: 2010-03-22 Impact factor: 6.185

6. Why only tetraploid Solidago gigantea (Asteraceae) became invasive: a common garden comparison of ploidy levels.

Authors: Daniel R Schlaepfer; Peter J Edwards; Regula Billeter
Journal: Oecologia Date: 2010-03-18 Impact factor: 3.225

7. Chemical composition of five commercial Gynostemma pentaphyllum samples and their radical scavenging, antiproliferative, and anti-inflammatory properties.

Authors: Zhuohong Xie; Wei Liu; Haiqiu Huang; Margaret Slavin; Yang Zhao; Monica Whent; Jessica Blackford; Herman Lutterodt; Huiping Zhou; Pei Chen; Thomas T Y Wang; Shaoke Wang; Liangli Lucy Yu
Journal: J Agric Food Chem Date: 2010-10-12 Impact factor: 5.279

8. Phylogeographical structure and temporal complexity in American sweetgum (Liquidambar styraciflua; Altingiaceae).

Authors: Ashley B Morris; Stefanie M Ickert-Bond; D Burke Brunson; Douglas E Soltis; Pamela S Soltis
Journal: Mol Ecol Date: 2008-07-24 Impact factor: 6.185

Review 9. Construction of a genetic linkage map in man using restriction fragment length polymorphisms.

Authors: D Botstein; R L White; M Skolnick; R W Davis
Journal: Am J Hum Genet Date: 1980-05 Impact factor: 11.025

10. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Authors: Rod Peakall; Peter E Smouse
Journal: Bioinformatics Date: 2012-07-20 Impact factor: 6.937

8 in total

1. Shannon diversity index: a call to replace the original Shannon's formula with unbiased estimator in the population genetics studies.

Authors: Maciej K Konopiński
Journal: PeerJ Date: 2020-06-29 Impact factor: 2.984

2. Characterizing Tetraploid Populations of Actinidia chinensis for Kiwifruit Genetic Improvement.

Authors: Zhi Wang; Guangming Hu; Zuozhou Li; Caihong Zhong; Xiaohong Yao
Journal: Plants (Basel) Date: 2022-04-24

Review 3. Herbal Medicine for Cardiovascular Diseases: Efficacy, Mechanisms, and Safety.

Authors: Abdullah Shaito; Duong Thi Bich Thuan; Hoa Thi Phu; Thi Hieu Dung Nguyen; Hiba Hasan; Sarah Halabi; Samar Abdelhady; Gheyath K Nasrallah; Ali H Eid; Gianfranco Pintus
Journal: Front Pharmacol Date: 2020-04-07 Impact factor: 5.810

4. Protective Effects of Gynostemma pentaphyllum (var. Ginpent) against Lipopolysaccharide-Induced Inflammation and Motor Alteration in Mice.

Authors: Andrea Mastinu; Sara Anna Bonini; Marika Premoli; Giuseppina Maccarinelli; Eileen Mac Sweeney; Leilei Zhang; Luigi Lucini; Maurizio Memo
Journal: Molecules Date: 2021-01-22 Impact factor: 4.411

5. Culturable Yeast Diversity of Grape Berries from Vitis vinifera ssp. sylvestris (Gmelin) Hegi.

Authors: Gustavo Cordero-Bueso; Ileana Vigentini; Roberto Foschino; David Maghradze; Marina Ruiz-Muñoz; Francisco Benitez-Trujillo; Jesús M Cantoral
Journal: J Fungi (Basel) Date: 2022-04-16

6. High genetic diversity and low population differentiation of a medical plant Ficus hirta Vahl., uncovered by microsatellite loci: implications for conservation and breeding.

Authors: Yi Lu; Jianling Chen; Bing Chen; Qianqian Liu; Hanlin Zhang; Liyuan Yang; Zhi Chao; Enwei Tian
Journal: BMC Plant Biol Date: 2022-07-12 Impact factor: 5.260

7. Cytotype distribution and chloroplast phylogeography of the Actinidia chinensis complex.

Authors: Zhi Wang; Caihong Zhong; Dawei Li; Chunlin Yan; Xiaohong Yao; Zuozhou Li
Journal: BMC Plant Biol Date: 2021-07-06 Impact factor: 4.215

8. The level of genetic diversity and differentiation of tropical lotus, Nelumbo nucifera Gaertn. (Nelumbonaceae) from Australia, India, and Thailand.

Authors: Yeshitila Mekbib; Shi-Xu Huang; Boniface K Ngarega; Zhi-Zhong Li; Tao Shi; Ke-Fang Ou; Yu-Ting Liang; Jin-Ming Chen; Xing-Yu Yang
Journal: Bot Stud Date: 2020-05-16 Impact factor: 2.787
