
A simple strategy for development of single nucleotide polymorphisms from non-model species and its application in Panax.

Ming Rui Li, Xin Feng Wang, Cui Zhang, Hua Ying Wang, Feng Xue Shi, Hong Xing Xiao, Lin Feng Li.

Abstract

Single nucleotide polymorphisms (SNPs) are widely employed in studies of population genetics, molecular breeding and conservation genetics. In this study, we explored a simple route to developing SNPs from non-model species based on screening a library of single copy nuclear genes (SCNGs). By applying this strategy to Panax, we identified 160 and 171 SNPs from P. quinquefolium and P. ginseng, respectively. Our results demonstrated that both P. ginseng and P. quinquefolium possess a high level of nucleotide diversity. The number of haplotypes per locus ranged from 1 to 12 for P. ginseng and from 1 to 9 for P. quinquefolium. The nucleotide diversity of total sites (πT) varied between 0.000 and 0.023 for P. ginseng and between 0.000 and 0.035 for P. quinquefolium. These findings suggest that this approach is well suited for SNP discovery in non-model organisms and is easily employed in standard genetics laboratories.


Year:  2013        PMID: 24351835      PMCID: PMC3876129          DOI: 10.3390/ijms141224581

Source DB:  PubMed          Journal:  Int J Mol Sci        ISSN: 1422-0067            Impact factor:   5.923


Introduction

Detecting and assessing the genetic variation of a given species is one of the fundamental issues in biology. Since Mendel first used phenotype-based genetic markers in his experiments, the identification and use of genetic markers have advanced greatly [1]. In particular, advances in molecular technologies have produced a series of molecular markers. For example, restriction fragment length polymorphism (RFLP) was the first DNA marker to provide an efficient molecular tool for evaluating the genetic variation of a species [2]. This hybridization-based technique is widely used to detect DNA polymorphisms because it is relatively highly polymorphic, co-dominantly inherited and highly reproducible. In addition, polymerase chain reaction (PCR)-based molecular markers, such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP) and microsatellites, supply an array of approaches that reveal a large amount of genetic variation in different organisms [3,4]. For instance, microsatellites are broadly employed as reliable DNA markers across a wide range of species for multiple purposes, including QTL tagging, population genetics, molecular breeding and phylogenetic analysis [5-7].

In recent years, however, the availability of abundant genetic resources for numerous organisms has driven a transition to the use of single nucleotide polymorphisms (SNPs) [8]. In particular, recent improvements in the cost and accuracy of high-throughput sequencing technologies are revolutionizing the opportunities for producing genetic resources in different organisms [9]. For example, Geraldes et al. [10] identified 0.5 million putative SNPs in 26,595 genes of the model species black cottonwood (Populus trichocarpa) using high-throughput sequencing technology. Similarly, Howe et al. [11] characterized 278,979 unique SNPs from the non-model species Pseudotsuga menziesii by screening a reference transcriptome. Notably, although next-generation sequencing platforms have generated large numbers of SNPs in both model and non-model organisms, some of these DNA polymorphisms lie in duplicated regions of the genome (i.e., different members of the same gene family). Such sites can yield paralogous sequence variants (PSVs) rather than true allelic SNPs, which limits the utility of the resulting markers. Therefore, there is an urgent need to develop reliable SNPs from single copy nuclear genes (SCNGs) that can be used for applications such as molecular phylogenetics and genetic mapping. To this end, we explored a simple and straightforward approach to characterize SNPs from Panax ginseng C.A. Meyer and P. quinquefolium L. by screening a constructed library of Arabidopsis SCNGs.

Panax L. (Araliaceae), commonly known as ginseng, is a medicinally important genus in the Orient and includes 18 species, 16 from eastern Asia and two from eastern North America [12,13]. P. ginseng is one of the most highly valued medicinal species within Panax. Although P. ginseng was widely distributed in Russia, Korea and China at the beginning of the 20th century, only a few individuals now remain in natural environments because of the overexploitation of wild resources and the destruction of natural habitats [14,15]. P. ginseng is currently listed as a rare and endangered plant in China [16]. Similarly, P. quinquefolium L. (American ginseng) is a medicinal plant that is native to North America and widely cultivated in China [17]. Molecular phylogenetic analyses have revealed that P. ginseng and P. quinquefolium are the most closely related species within the genus [12,13]. To explore reliable SNPs from P. ginseng and P. quinquefolium, we developed 16 single copy nuclear genes (SCNGs) from the Panax dbEST of GenBank (http://www.ncbi.nlm.nih.gov/dbEST/index.html) (accessed on 1 October 2012) [18]. These SCNGs may provide a series of useful molecular markers for future studies of conservation genetics.

Results and Discussion

Development of SNPs from SCNG Library

The predominant type of molecular genetic marker has changed substantially over the past decades [8]. SNP markers have now come to prominence because of their abundance in genomes, low scoring error rates and relative ease of calibration among laboratories [9]. With the advances in DNA sequencing technologies, SNP markers have contributed greatly to the genetic study of model organisms. Large numbers of SNPs have been retrieved from Arabidopsis, Oryza and Populus using high-throughput DNA sequencing platforms [19-22]. Nonetheless, the application of SNPs in non-model species has lagged behind because of the limitations of marker development and the existence of PSVs [23]. Although transcriptome sequencing and the sequencing of reduced representation genomic libraries have also generated a number of SNPs from different non-model organisms, using these SNP discovery approaches as standard tools in non-model species remains challenging [24-27]. The main obstacle to the wide adoption of SNPs in non-model organisms is that approaches based on next-generation sequencing are too expensive for population-level analysis, particularly for studies with large sample sizes, and that it is sometimes impossible to assemble all the short reads without a reference genome. In addition, it is difficult to distinguish sequencing errors and PSVs from true SNPs. In maize, for example, although millions of SNPs were identified, only a small portion of those polymorphisms could be used for the further development of robust and versatile assays [28]. To this end, we explored a simple strategy for developing SNPs from the non-model species P. ginseng by performing a BLAST homology search against a constructed library of Arabidopsis SCNGs.

Accordingly, a total of 22,824 Panax ESTs were analyzed, and 542 of them showed high similarity to the Arabidopsis SCNG references. Forty-five primer pairs were designed from the exon regions of Panax SCNGs, of which 16 produced clear amplicons of the expected size in P. ginseng (Table 1); ten of these were also successfully amplified in P. quinquefolium (Table 2). To verify that the SNPs were retrieved from orthologous genes, we analyzed the genetic divergence of all the clones obtained for each putative SCNG. As expected, only a small number of sites showed single nucleotide variation, and almost all of the retrieved SNPs were found at these sites. These attributes suggest that the 16 nuclear genes are likely to be single copy nuclear genes in P. ginseng and P. quinquefolium. By screening DNA polymorphisms of the 16 SCNGs, we identified 160 and 177 SNPs from P. quinquefolium and P. ginseng, respectively. In addition, the SCNG sequences obtained produced alignments ranging from 278 to 1,339 base pairs (bp) in P. ginseng and from 286 to 853 bp in P. quinquefolium. All of these DNA sequences have been submitted to GenBank under accession numbers KF529139-KF529528.
Table 1. Nucleotide diversity of the 16 single copy nuclear genes in Panax ginseng.

| Locus | Forward primer (5′-3′) | Reverse primer (5′-3′) | Exon (bp) | Intron (bp) | Ta (°C) | S | h | Hd | πT | πNon | Function annotation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PGN7 | CCCAATGCCCCCAGAGTTTT | AGCGAGGTGCTGCTTGAAGT | 441 | 336 | 54 | 21 | 6 | 0.779 | 0.011 | 0.003 | beta-amyrin synthase |
| PW2 | AGCACAAGCTCAAGCGTCTC | CAGTTGGCTGGCATAACACC | 63 | 269 | 48 | 5 | 12 | 0.947 | 0.007 | 0.015 | 40S ribosomal protein S27 |
| PW8 | ATAGCTCGTGTAACTGATGG | TTGAGTGCGGGTGTCTGAAT | 119 | 555 | 64 | 30 | 10 | 0.926 | 0.000 | 0.000 | vesicle transport protein |
| PW16 | ATTGGTGGAGGGAAGGAACT | GAGTGGCATGAGCAGTATGT | 170 | 278 | 52 | 7 | 9 | 0.905 | 0.009 | 0.004 | prolyl-tRNA synthetase |
| PW21 | AAAAGGTTGGCTACGAGTGG | TACATGATGGGTGGAGGAGA | 146 | 140 | 64 | 2 | 3 | 0.658 | 0.003 | 0.000 | photosystem I reaction center subunit N |
| PW28 | GGGGTGGGAATTTGGAAGTA | TGAAGGAGCATCGGAACCAT | 155 | 205 | 60 | 15 | 2 | 0.526 | 0.022 | 0.017 | photosystem I reaction center subunit H-2 |
| PZ7 | ACCTGGTTCGCTGCTATTCC | CAAGCATTGGTTCCCTCTGG | 97 | 304 | 52 | 2 | 3 | 0.468 | 0.001 | 0.000 | PGR5-like protein 1A |
| PZ12 | GAGCGTTCTCAAATGCGGTAG | CTTAGCCTCAAACTGGTCGG | 118 | 830 | 54 | 1 | 2 | 0.100 | 0.001 | 0.000 | 60S ribosomal protein |
| PZ15 | TGAACAGGCATTATTACTCG | ACTCATCCTCCTCTTGAACG | 105 | 653 | 48 | 0 | 1 | 0.000 | 0.000 | 0.000 | 26S proteasome non-ATPase |
| PZ14 | CTTTGTTTCTCCTCCTCCAG | GGATTTCCAGAGCAACCTTT | 178 | 667 | 54 | 11 | 4 | 0.621 | 0.007 | 0.000 | diacylglycerol kinase 1 |
| PZ10 | CTATGATGGGGTCTGGAGGG | AGCAGTGATGGTGGATGAGG | 305 | 445 | 62 | 32 | 8 | 0.853 | 0.023 | 0.013 | glycine decarboxylase |
| PZ13 | AGCAGCCGAGTATGAAACCC | CCTCAGGTAAACGATAACCG | 174 | 845 | 56 | 0 | 1 | 0.000 | 0.000 | 0.000 | signal peptidase complex subunit 3B |
| PZ1 | CACTACCCCGTTCTTTTCCG | CCTTTTGTTCCTCAACCACC | 372 | 967 | 60 | 25 | 6 | 0.768 | 0.010 | 0.009 | glycine decarboxylase P-protein |
| PZ4 | TGTTGACCATCTACTCACCCAG | CCTTCACGCATTCCCACAAT | 207 | 426 | 48 | 7 | 8 | 0.895 | 0.006 | 0.000 | hypothetical protein |
| PZ5 | TGACGGACTTGACCTAACAT | CTTCAGATACAGCCCACAGC | 171 | 707 | 56 | 12 | 4 | 0.621 | 0.007 | 0.000 | ABC transporter F family member 3-like |
| PZ8 | GGGAAGGAAAAGTTGCTCTG | TATTCGTGTTGGGGCATCTG | 196 | 545 | 60 | 7 | 4 | 0.779 | 0.005 | 0.000 | hypothetical protein |

Exon and intron, alignment length in bp; Ta, annealing temperature; S, number of segregating sites; h, number of haplotypes; Hd, haplotype diversity; πT, nucleotide diversity for total sites; πNon, nucleotide diversity for nonsynonymous sites.

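Table 2. Nucleotide diversity of the ten single copy nuclear genes in Panax quinquefolium.

| Locus | Exon (bp) | Intron (bp) | Ta (°C) | S | h | Hd | πT | πNon |
|---|---|---|---|---|---|---|---|---|
| PGN7 | 441 | 338 | 54 | 23 | 7 | 0.964 | 0.010 | 0.001 |
| PW2 | 60 | 266 | 48 | 15 | 8 | 0.956 | 0.027 | 0.079 |
| PW16 | 153 | 290 | 40 | 27 | 9 | 0.978 | 0.035 | 0.033 |
| PW21 | 152 | 134 | 60 | 16 | 2 | 0.556 | 0.031 | 0.047 |
| PZ7 | 98 | 290 | 52 | 11 | 4 | 0.733 | 0.016 | 0.015 |
| PZ15 | 96 | 665 | 48 | 0 | 1 | 0.000 | 0.000 | 0.000 |
| PZ14 | 178 | 675 | 49 | 9 | 5 | 0.800 | 0.005 | 0.004 |
| PZ10 | 302 | 445 | 60 | 26 | 2 | 0.556 | 0.019 | 0.021 |
| PZ5 | 204 | 516 | 46 | 26 | 7 | 0.911 | 0.020 | 0.040 |
| PZ8 | 138 | 477 | 60 | 7 | 4 | 0.822 | 0.006 | 0.000 |

Exon and intron, alignment length in bp; Ta, annealing temperature; S, number of segregating sites; h, number of haplotypes; Hd, haplotype diversity; πT, nucleotide diversity for total sites; πNon, nucleotide diversity for nonsynonymous sites.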
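
The SNP totals quoted in the text can be cross-checked against the S columns of Tables 1 and 2. The short sketch below simply sums the per-locus numbers of segregating sites; the dictionaries hold the S values as read from the two tables, and the names `S_GINSENG`, `S_QUINQUEFOLIUM` and `total_snps` are illustrative only.

```python
# Number of segregating sites (S) per locus, as read from Table 1 (P. ginseng).
S_GINSENG = {
    "PGN7": 21, "PW2": 5, "PW8": 30, "PW16": 7, "PW21": 2, "PW28": 15,
    "PZ7": 2, "PZ12": 1, "PZ15": 0, "PZ14": 11, "PZ10": 32, "PZ13": 0,
    "PZ1": 25, "PZ4": 7, "PZ5": 12, "PZ8": 7,
}

# Number of segregating sites (S) per locus, as read from Table 2 (P. quinquefolium).
S_QUINQUEFOLIUM = {
    "PGN7": 23, "PW2": 15, "PW16": 27, "PW21": 16, "PZ7": 11,
    "PZ15": 0, "PZ14": 9, "PZ10": 26, "PZ5": 26, "PZ8": 7,
}


def total_snps(s_by_locus):
    """Sum the per-locus segregating sites to obtain a species-wide SNP count."""
    return sum(s_by_locus.values())


if __name__ == "__main__":
    print("P. ginseng:", total_snps(S_GINSENG))
    print("P. quinquefolium:", total_snps(S_QUINQUEFOLIUM))
```

Under this reading of the tables, the sums come to 177 for P. ginseng and 160 for P. quinquefolium, which agrees with the counts given in the Results text.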

Nucleotide Diversity in P. ginseng and P. quinquefolium

These SNPs can be employed to investigate the molecular phylogenetics, population genetics and molecular breeding of Panax species. Although several previous studies have used allozyme, random amplified polymorphic DNA (RAPD), inter simple sequence repeat (ISSR), amplified fragment length polymorphism (AFLP) and microsatellite techniques to investigate the genetic diversity of P. ginseng, these genetic markers come largely from unknown regions of the genome and cannot be readily compared among laboratories, which limits their practical value in further studies [14,15,29]. In this study, we applied the SNPs to evaluate the nucleotide diversity of P. ginseng and P. quinquefolium. Results from the polymorphic loci of P. ginseng revealed that nucleotide diversity ranged from 0.001 to 0.023 for total sites (πT) and from 0.000 to 0.017 for nonsynonymous sites (πNon) (Table 1). Similarly, nucleotide diversity of P. quinquefolium varied from 0.005 to 0.035 for total sites and from 0.000 to 0.079 for nonsynonymous sites (Table 2). Genetic diversity based on SNP markers has also been reported in some crop plants. For example, Haudry et al. [30] used 21 nuclear genes to investigate the genetic diversity of Triticum turgidum ssp. dicoccum and revealed that this species possesses low genetic diversity (πT = 0.0008). Likewise, low genetic diversity was found in Zea mays ssp. mays (πT = 0.0064) and Hordeum vulgare (πT = 0.0031) [31,32]. In comparison with these previous studies, our results showed that, although only a small number of individuals of P. ginseng and P. quinquefolium were investigated, both Panax species exhibited a relatively high level of nucleotide diversity at both total sites (πT of 0.007 and 0.017 for P. ginseng and P. quinquefolium, respectively) and nonsynonymous sites (πNon of 0.004 and 0.024 for P. ginseng and P. quinquefolium, respectively). Notably, P. quinquefolium showed higher genetic diversity than P. ginseng at both total and nonsynonymous sites. This indicates that P. ginseng might have undergone a genetic bottleneck during the domestication process. In addition, Wang et al. [33] developed an amplification refractory mutation system (ARMS)-PCR method and successfully applied it to identify ginseng cultivars. Here, our results showed that no haplotypes were shared between P. ginseng and P. quinquefolium, which suggests that these molecular markers could be used to distinguish the two Panax species.

Experimental Section

Samples and DNA Extraction

SNP discovery was assessed in samples from 20 individuals of P. ginseng and ten individuals of P. quinquefolium. Detailed information on the specimens is listed in Table 3. The 20 P. ginseng samples were collected from ten localities, with two individuals from each locality. Similarly, the ten individuals of P. quinquefolium were obtained from two localities. Genomic DNA was extracted from the leaves of each individual using a Plant Genomic DNA kit (TianGen, Beijing, China) following the manufacturer’s protocol.
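Table 3. Details of the localities sampled in the field in this study and the number of individuals sequenced for each locus in each locality.

| Species name | Locality | Country | Latitude/longitude | Elevation (m) | Number of individuals | Sampling date | Voucher specimens |
|---|---|---|---|---|---|---|---|
| P. ginseng | TQL | China | 43°36′129″N, 129°35′807″E | 469 | 2 | 9/2011 | NENU20110902001 |
| P. ginseng | WHL | China | 43°30′181″N, 127°54′193″E | 551 | 2 | 9/2011 | NENU20110903001 |
| P. ginseng | FS | China | 42°24′216″N, 127°12′186″E | 589 | 2 | 7/2011 | NENU20110720001 |
| P. ginseng | JY | China | 42°23′197″N, 126°48′490″E | 612 | 2 | 7/2011 | NENU20110713001 |
| P. ginseng | XJD | China | 42°20′870″N, 128°44′449″E | 845 | 2 | 9/2011 | NENU20110902002 |
| P. ginseng | CB | China | 41°39′442″N, 127°35′229″E | 936 | 2 | 9/2011 | NENU20110802001 |
| P. ginseng | LJ | China | 41°48′432″N, 126°55′530″E | 663 | 2 | 8/2011 | NENU20110801001 |
| P. ginseng | BT | China | 41°18′492″N, 125°49′954″E | 369 | 2 | 7/2011 | NENU20110718006 |
| P. ginseng | SZ | China | 40°45′595″N, 125°20′863″E | 375 | 2 | 7/2011 | NENU20110718001 |
| P. ginseng | GL | China | 41°25′121″N, 128°12′296″E | 765 | 2 | 9/2012 | NENU20130929001 |
| P. quinquefolium | WS | USA | n.a. | n.a. | 5 | 9/2012 | n.a. |
| P. quinquefolium | JY | China | 42°23′197″N, 126°48′490″E | 612 | 5 | 7/2011 | NENU20110713009 |

n.a., no available data.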

SCNG Library Construction and Primer Design

To obtain SNPs from P. ginseng and P. quinquefolium, a library of SCNGs was constructed based on the database of putative SCNGs (see Supplementary File 1). In detail, available Arabidopsis reference sequences were retrieved from GenBank according to the accession numbers of Duarte et al. [34]. All ESTs and genomic sequences of Panax were then downloaded from GenBank and aligned against the constructed Arabidopsis SCNG library using the Basic Local Alignment Search Tool (BLAST). Aligned EST sequences with a minimum matched query length of 200 nucleotides and an identity of at least 80% were considered valid hits. To further identify the gene structures of SCNGs in P. ginseng, we searched these ESTs against GenBank using BLASTX (http://www.ncbi.nlm.nih.gov/dbEST/index.html) [35]. The exon-intron boundaries of the SCNGs were determined from available annotated references. Primers were designed from the identified Panax ESTs using Primer Premier 5.0 (Premier Biosoft International, Palo Alto, CA, USA).
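
The hit-filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the BLAST results were exported in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), and the helper names `Hit`, `parse_tabular` and `valid_hits`, as well as the example records, are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    query: str          # Panax EST identifier
    subject: str        # Arabidopsis SCNG reference identifier
    identity: float     # percent identity of the alignment
    query_span: int     # number of query positions covered by the alignment


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse BLAST tabular lines (assumed 12-column layout) into Hit records."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qstart, qend = int(fields[6]), int(fields[7])
        yield Hit(fields[0], fields[1], float(fields[2]), abs(qend - qstart) + 1)


def valid_hits(hits: Iterable[Hit],
               min_query_span: int = 200,
               min_identity: float = 80.0) -> List[Hit]:
    """Keep hits that meet the thresholds stated in the text (200 nt, 80% identity)."""
    return [h for h in hits
            if h.query_span >= min_query_span and h.identity >= min_identity]


if __name__ == "__main__":
    # Hypothetical records for illustration only; not data from the study.
    example = [
        "EST_A\tAT_REF_1\t85.2\t310\t40\t2\t1\t310\t50\t359\t1e-60\t250",
        "EST_B\tAT_REF_2\t78.0\t400\t80\t4\t1\t400\t10\t409\t1e-40\t180",
        "EST_C\tAT_REF_3\t91.0\t150\t12\t0\t5\t154\t20\t169\t1e-30\t140",
    ]
    for hit in valid_hits(parse_tabular(example)):
        print(hit.query, hit.subject, hit.identity, hit.query_span)
```

In this toy input only the first record passes both thresholds: the second falls below 80% identity and the third covers fewer than 200 query nucleotides.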

PCR, Sequencing and Gene Function Prediction

The designed primer pairs were then used to amplify the target fragments from P. ginseng and P. quinquefolium. PCRs were performed on an ABI 2720 Thermocycler (Applied Biosystems, Foster City, CA, USA) in a 30 μL total volume containing 20–50 ng template DNA, 1× PCR buffer (Mg2+ free), 2.5 mM Mg2+, 0.6 μM of each primer, 0.2 mM of each dNTP and 1 unit of rTaq polymerase (Takara, Dalian, Liaoning, China). Amplification was performed under the following conditions: 94 °C for 5 min; 35 cycles of 30 s at 94 °C, 30 s at the annealing temperature of each specific primer pair (Tables 1 and 2) and 90 s at 72 °C; and a final extension at 72 °C for 8 min. All amplified products were separated by electrophoresis on 1.5% agarose gels, purified with the Gel DNA Recovery Kit (Takara) following the manufacturer’s instructions and sequenced on an ABI3730 sequencer (Beijing Invitrogen Biotechnology Co., Ltd., Beijing, China). Previous studies have documented that P. ginseng and P. quinquefolium are tetraploid species [36-38]. To ensure that all the SNPs were retrieved from orthologous genes, we therefore sequenced more than 10 clones from the same individual for each putative SCNG and analyzed the genomic divergence of the sequences obtained. To further determine the function of the SCNGs, the genomic sequences obtained were searched against the GenBank non-redundant protein database of Arabidopsis thaliana using BLASTX [35] with an expected value <10⁻⁷. The putative functions of these SCNGs are listed in Table 1.
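
As a small worked example of the reaction set-up, the final concentrations above can be converted into absolute amounts per 30 μL reaction, and the programmed cycling time can be added up. The sketch uses only the values stated in the text; the function names are illustrative, and the time estimate ignores temperature ramping, which depends on the instrument.

```python
REACTION_VOLUME_UL = 30.0  # total reaction volume stated in the text


def amount_nmol(conc_mM, volume_uL=REACTION_VOLUME_UL):
    """Amount of a component in nmol: (mmol/L) x (uL) = nmol."""
    return conc_mM * volume_uL


def program_seconds(cycles=35, initial_s=5 * 60, denature_s=30,
                    anneal_s=30, extend_s=90, final_s=8 * 60):
    """Programmed hold time of the thermal profile, excluding ramp times."""
    return initial_s + cycles * (denature_s + anneal_s + extend_s) + final_s


if __name__ == "__main__":
    print("Mg2+ per reaction (nmol):", amount_nmol(2.5))
    print("each dNTP per reaction (nmol):", amount_nmol(0.2))
    # 0.6 uM = 0.0006 mM; multiply by 1000 to report pmol instead of nmol.
    print("each primer per reaction (pmol):", amount_nmol(0.0006) * 1000)
    print("programmed time (min):", program_seconds() / 60)
```

With these inputs each reaction contains 75 nmol Mg2+, 6 nmol of each dNTP and 18 pmol of each primer, and the programmed holds add up to 6,030 s (about 100 min).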

SNP Genotyping and Data Analyses

The DNA sequences obtained from P. ginseng and P. quinquefolium were then screened for SNPs. Initial sequence editing and assembly were performed using ContigExpress (Informax Inc., North Bethesda, MD, USA, 2000). DNA sequences were aligned in ClustalX 1.83 [39] and, where necessary, edited manually in BioEdit 7.0.1 [40]. To evaluate the nucleotide diversity of the two Panax species, nucleotide polymorphisms were analyzed using DnaSP version 5 [41], including the number of segregating sites (S), the number of haplotypes (h) and haplotype diversity (Hd). In addition, we estimated nucleotide diversity π [42] for total and nonsynonymous sites, separately for each locus and for the combined dataset. Insertions/deletions (indels) were not included in these analyses.
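
To make the summary statistics concrete, the sketch below computes S, h, Hd and π for a set of aligned sequences using the standard textbook definitions: Hd = n/(n−1) × (1 − Σ p_i²), and π as the mean number of pairwise differences per site. It is not a reimplementation of DnaSP; the function names and the toy alignment are hypothetical, and alignment columns containing gaps are simply dropped to mirror the exclusion of indels.

```python
from collections import Counter
from itertools import combinations


def drop_gap_columns(seqs, gap="-"):
    """Remove every alignment column in which at least one sequence has a gap."""
    kept = [col for col in zip(*seqs) if gap not in col]
    return ["".join(col[i] for col in kept) for i in range(len(seqs))]


def segregating_sites(seqs):
    """S: number of alignment columns with more than one nucleotide state."""
    return sum(1 for col in zip(*seqs) if len(set(col)) > 1)


def haplotype_count(seqs):
    """h: number of distinct sequences (haplotypes) in the sample."""
    return len(set(seqs))


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences, divided by alignment length."""
    n, length = len(seqs), len(seqs[0])
    diffs = sum(sum(a != b for a, b in zip(s1, s2))
                for s1, s2 in combinations(seqs, 2))
    pairs = n * (n - 1) / 2
    return diffs / pairs / length


if __name__ == "__main__":
    # Toy alignment for illustration only; not data from the study.
    toy = drop_gap_columns(["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"])
    print("S  =", segregating_sites(toy))
    print("h  =", haplotype_count(toy))
    print("Hd =", round(haplotype_diversity(toy), 3))
    print("pi =", round(nucleotide_diversity(toy), 3))
```

The same functions could be applied separately to each locus and to each species. Estimating π for nonsynonymous sites additionally requires the coding frame of each exon, which this sketch does not attempt.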

Conclusions

SNPs are increasingly used as molecular markers in both model and non-model species. Here, we explored an approach for developing SNPs from the SCNGs of the non-model species P. ginseng and P. quinquefolium. Our results suggest that this strategy could also be applied to develop SNPs in other model or non-model species.
References (34 in total)

1.  DnaSP, DNA polymorphism analyses by the coalescent and other methods.

Authors:  Julio Rozas; Juan C Sánchez-DelBarrio; Xavier Messeguer; Ricardo Rozas
Journal:  Bioinformatics       Date:  2003-12-12       Impact factor: 6.937

2.  Haplotype structure at seven barley genes: relevance to gene pool bottlenecks, phylogeny of ear type and site of barley domestication.

Authors:  Benjamin Kilian; Hakan Ozkan; Jochen Kohl; Arndt von Haeseler; Francesca Barale; Oliver Deusch; Andrea Brandolini; Cemal Yucel; William Martin; Francesco Salamini
Journal:  Mol Genet Genomics       Date:  2006-06-07       Impact factor: 3.291

Review 3.  Genic microsatellite markers in plants: features and applications.

Authors:  Rajeev K Varshney; Andreas Graner; Mark E Sorrells
Journal:  Trends Biotechnol       Date:  2005-01       Impact factor: 19.536

Review 4.  Advances in molecular marker techniques and their applications in plant sciences.

Authors:  Milee Agarwal; Neeta Shrivastava; Harish Padh
Journal:  Plant Cell Rep       Date:  2008-02-02       Impact factor: 4.570

5.  DNA polymorphisms amplified by arbitrary primers are useful as genetic markers.

Authors:  J G Williams; A R Kubelik; K J Livak; J A Rafalski; S V Tingey
Journal:  Nucleic Acids Res       Date:  1990-11-25       Impact factor: 16.971

6.  The effects of artificial selection on the maize genome.

Authors:  Stephen I Wright; Irie Vroh Bi; Steve G Schroeder; Masanori Yamasaki; John F Doebley; Michael D McMullen; Brandon S Gaut
Journal:  Science       Date:  2005-05-27       Impact factor: 47.728

7.  A PCR-based SNP marker for specific authentication of Korean ginseng (panax ginseng) cultivar "Chunpoong".

Authors:  Hongtao Wang; Hua Sun; Woo-Saeng Kwon; Haizhu Jin; Deok-Chun Yang
Journal:  Mol Biol Rep       Date:  2010-02       Impact factor: 2.316

8.  Phylogeny of Panax using chloroplast trnC-trnD intergenic region and the utility of trnC-trnD in interspecific studies of plants.

Authors:  Chunghee Lee; Jun Wen
Journal:  Mol Phylogenet Evol       Date:  2004-06       Impact factor: 4.286

9.  Grinding up wheat: a massive loss of nucleotide diversity since domestication.

Authors:  A Haudry; A Cenci; C Ravel; T Bataillon; D Brunel; C Poncet; I Hochu; S Poirier; S Santoni; S Glémin; J David
Journal:  Mol Biol Evol       Date:  2007-04-18       Impact factor: 16.240

10.  GenBank.

Authors:  Dennis A Benson; Ilene Karsch-Mizrachi; David J Lipman; James Ostell; Eric W Sayers
Journal:  Nucleic Acids Res       Date:  2009-11-12       Impact factor: 16.971

