
Geographic origin is not supported by the genetic variability found in a large living collection of Jatropha curcas with accessions from three continents.

Fatemeh Maghuly, Joanna Jankowicz-Cieslak, Stephan Pabinger, Bradley J Till, Margit Laimer.

Abstract

Increasing economic interest in Jatropha curcas requires a major research focus on the genetic background and geographic origin of this non-edible biofuel crop. To determine the worldwide genetic structure of this species, amplified fragment length polymorphisms, inter simple sequence repeats, and novel single nucleotide polymorphisms (SNPs) were employed for a large collection of 907 J. curcas accessions and related species (RS) from three continents, 15 countries and 53 regions. PCoA, phenogram, and cophenetic analyses separated RS from two J. curcas groups. Accessions from Mexico, Bolivia, Paraguay, Kenya, and Ethiopia with unknown origins were found in both groups. In general, there was a considerable overlap between individuals from different regions and countries. The Bayesian approach using STRUCTURE demonstrated two groups with a low genetic variation. Analysis of molecular variance revealed significant variation among individuals within populations. SNPs found by in silico analyses of Δ12 fatty acid desaturase indicated possible changes in gene expression and thus in fatty acid profiles. SNP variation was higher in the curcin gene compared to genes involved in oil production. Novel SNPs allowed separating toxic, non-toxic, and Mexican accessions. The present study confirms that human activities had a major influence on the genetic diversity of J. curcas, not only because of domestication, but also because of biased selection.

© 2015 The Authors. Biotechnology Journal published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.


Keywords: Biofuel; EcoTILLING; Genetic diversity; Molecular markers; Toxicity


Year: 2015 PMID: 25511658 PMCID: PMC4413048 DOI: 10.1002/biot.201400196

Source DB: PubMed Journal: Biotechnol J ISSN: 1860-6768 Impact factor: 4.677

1 Introduction

Jatropha curcas is a perennial, monoecious shrub in the Euphorbiaceae family native to America, but distributed widely in tropical and subtropical areas. Wild or semicultivated genotypes of J. curcas can grow well under unfavorable climatic and soil conditions. J. curcas has attracted a great deal of attention worldwide regarding its potential as a new energy plant [1]. It is planted on an estimated 1.8 million ha in Indonesia, China, Brazil, and Africa, and has a huge potential in India and other tropical countries [2–4]. However, Jatropha breeding material is rather limited due to scarce or incomplete information about germplasm resources. Therefore, for this biofuel crop it is necessary to increase the understanding of intraspecific variability and interspecific relationships, providing insights into population structure or adaptations to develop effective breeding programs and comparisons with related species (RS) [1]. Over recent years, a few genetic characterization studies using molecular markers have been conducted with J. curcas [5–8]. Some investigations revealed relatively high estimates of polymorphism either within or between populations from different eco-geographical regions using random amplified polymorphic DNA (RAPD) and inter simple sequence repeat (ISSR) markers alone or in combination with simple sequence repeat (SSR) markers [7, 9–12]. In contrast, an exceptionally high genetic similarity along with phenotypic diversity in J. curcas germplasm was found using RAPD and amplified fragment length polymorphism (AFLP) markers [6]. With the exception of Mexican ecotypes, a tight clustering of worldwide collected J. curcas accessions was reported based on RAPD and ISSR markers [5]. In addition, a broad genetic diversity was observed in 160 accessions collected from 8 locations in Kenya, which indicates that they might have originated both from Asian and African countries [13].
Despite the low diversity reported worldwide for Jatropha using RAPD, ISSR analysis of 224 accessions from different regions of South China and Myanmar showed high levels of genetic diversity, suggesting that Jatropha could have been introduced into these regions from different places [14]. Further, the Indian accessions could be distinguished from Mexican genotypes by two polymorphic ISSR markers converted to sequence characterized amplified region (SCAR) markers [9]. Analysis of 48 Jatropha accessions from India by AFLP showed a high genetic diversity as well as variation in oil content between accessions [15]. In contrast, 2482 single nucleotide polymorphisms (SNPs) discovered among 148 Jatropha collections revealed a narrow level of diversity among the Indian genotypes compared to American and African genotypes [16]. In addition, Jatropha accessions from Mexico showed a high level of polymorphism using AFLP markers [17, 18]. Further, the Mesoamerican region is considered to be the potential center of origin for Jatropha, since accessions from this region have a higher diversity than in other parts of the world [19, 20]. In order to develop sustainable management strategies, a clear and detailed understanding of the extent and distribution of genetic variation within and among J. curcas accessions between different continents, countries, regions, and individuals is necessary. However, little is known about Jatropha's genetic structure; the genetic diversity in J. curcas has so far been studied mainly with small numbers of individuals, which makes an overall judgment of Jatropha variability difficult. Furthermore, as already stated [21], any attempt to study the genus Jatropha benefits from the use of living plants. Thus, in this study, AFLP and ISSR markers were used in a large living collection of 907 J. curcas accessions from 53 geographical regions covering 15 countries and 3 continents.
In addition, the biochemical composition of J. curcas is of great interest for several reasons [22], and the wish to take this plant into culture has been steadily increasing. Earlier chemical analyses have revealed high concentrations of several interesting compounds [22, 23]. The use of accessions with defined toxin contents would abate the costs associated with conversion of seed cake to animal feed, generating additional revenues from Jatropha [24]. Therefore, there is a need to measure variations in toxic and non-toxic J. curcas accessions to determine the importance of these variations. Since most of the functional variation resides in the coding regions of the genome, the logical first step is to use coding regions for the discovery of the causative genetic variants of traits of interest [25]. Thus, genetic diversity was studied in the coding region of 12 genes involved in toxin and oil production traits by identification of novel SNPs. Selected markers in the current study showed variation among and relationships between J. curcas accessions and other RS of the Plant Biotechnology Unit (PBU) collection, which can be used for future breeding strategies.

2 Materials and methods

2.1 Plant material and DNA extraction

A living reference collection of 907 J. curcas accessions from 3 different continents, 15 countries and 53 regions (Asia: India, Indonesia, China; Africa: Cape Verde, Guinea Bissau, Burkina Faso, Ethiopia, Kenya, Mali, Tanzania; America: Mexico, Brazil, Paraguay, Bolivia) and RS (J. hieronymi, J. macrocarpa, and J. gossypifolia from Argentina, J. podagrica from Brazil, J. multifida from Indonesia, Ricinus communis from Kenya) was established in vivo and in vitro at the PBU (Supporting information, Table S1). Total genomic DNA was extracted from 100 mg of leaf tissue using DNeasy® Plant Mini Kit (QIAGEN) following the supplier's instructions. Gel electrophoresis and spectrophotometry determined the DNA quality and concentration, respectively.

2.2 ISSR

Five polymorphic ISSR primers (ISSR4: (CAC)7G, ISSR5: GTCA(CCA)6C, ISSR6: GT(GGT)6GC, ISSR9: (GA)9AY, ISSR49: CT(CCT)5C) out of 55 originally tested were employed for genomic DNA amplification in the present study. PCR amplifications were conducted in a total volume of 25 μL using 1× PCR buffer (QIAGEN), 2 mM MgCl2, 0.2 mM dNTPs, 10 pmol primer, 0.6 units HotStarTaq Polymerase (QIAGEN), and 20–30 ng of total genomic DNA. PCR-cycling conditions (Biometra) consisted of an initial denaturation step of 95°C for 5 min followed by 30 cycles of 30 s at 94°C, 45 s at an annealing temperature of 54°C, and 1 min at 72°C. A final extension step of 7 min at 72°C completed the program. The amplified products along with Molecular Weight Marker VI (Roche) were size-fractionated on a 1.5% agarose gel at 100 V. DNA fragments were stained with ethidium bromide and visualized under UV light. The patterns photographed using the Geldoc system (Bio-Rad) were stored as digital pictures. The reproducibility of the amplification was confirmed by repeating each experiment three times. Pictures were analyzed with GelAnalyzer2010a software (http://www.gelanalyzer.com/) to calculate the size of each amplified fragment. Fragments with the same size were considered as belonging to the same locus. Binary data were generated and scored for presence (1) and absence (0) of bands.
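The compact primer notation above can be unpacked programmatically. The following is a minimal sketch, assuming the usual reading of the notation (a parenthesized motif followed by its repeat count) and the IUPAC code Y = C or T; the helper names `expand_repeats` and `expand_degenerate` are hypothetical and not part of the original study.

```python
import re
from itertools import product

# Subset of IUPAC nucleotide codes needed for the primers listed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}

def expand_repeats(notation: str) -> str:
    """Expand '(CAC)7G'-style notation into a plain base string."""
    return re.sub(r"\(([A-Z]+)\)(\d+)",
                  lambda m: m.group(1) * int(m.group(2)),
                  notation)

def expand_degenerate(seq: str) -> list[str]:
    """List every concrete sequence encoded by a (possibly degenerate) primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]

# Primer notations as given in Section 2.2.
primers = {
    "ISSR4": "(CAC)7G",
    "ISSR5": "GTCA(CCA)6C",
    "ISSR6": "GT(GGT)6GC",
    "ISSR9": "(GA)9AY",
    "ISSR49": "CT(CCT)5C",
}

for name, notation in primers.items():
    seq = expand_repeats(notation)
    print(name, len(seq), expand_degenerate(seq))
```

Only ISSR9 carries a degenerate position, so it is the only primer that expands to more than one concrete sequence.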

2.3 AFLP

AFLP analysis was performed as described by [26]. DNA was digested with EcoRI and MseI restriction enzymes (New England BioLabs) and double-stranded adaptors were ligated onto the ends of fragments. In the pre-selective PCR amplification, primers with a single-bp extension were used. Three different selective primer pairs out of 63 containing a 3-bp extension (E-AGC/M-GAC, E-AGG/M-CAG, E-AGG/M-CTA) were chosen, which generated a reduced number of amplified fragments. Fluorescently labeled PCR products were purified using Sephadex G-50 Superfine (GE Healthcare Life-Sciences) applied to a MultiScreen-HV 96-Well Plate (Millipore). Eluates were combined with HiDi and GeneScan 500 ROX (Applied Biosystems) and visualized on a capillary sequencer (3130xl Genetic Analyzer, Applied Biosystems). Fragments were analyzed using GeneScan Analysis Software Version 3.1 (Applied Biosystems). Loci ranging between 50 and 500 bp were defined manually with the Genotyper® 2.0 Software (Applied Biosystems). The relative fluorescent unit (RFU) threshold was set at 60. A binary table was constructed recording the presence (1) and absence (0) of bands.
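The scoring rule described above (50 to 500 bp window, RFU threshold of 60, presence/absence coding) can be sketched as follows. This is an illustration only: in the study loci were defined manually in Genotyper, whereas here the locus list, the bin half-width `TOLERANCE_BP`, the inclusive comparison at the threshold, and the sample peaks are all assumptions made for the example.

```python
MIN_BP, MAX_BP = 50, 500   # size window stated in the text
RFU_THRESHOLD = 60         # peak-height threshold stated in the text
TOLERANCE_BP = 0.5         # assumed bin half-width (not from the source)

def score_sample(peaks, loci):
    """Return a 0/1 list with one entry per locus for a single sample.

    peaks: iterable of (size_bp, height_rfu) tuples.
    loci:  list of locus sizes in bp.
    """
    # Keep only peaks inside the size window and at or above the threshold.
    kept = [size for size, height in peaks
            if MIN_BP <= size <= MAX_BP and height >= RFU_THRESHOLD]
    # A locus is scored 1 if any retained peak falls within its bin.
    return [int(any(abs(size - locus) <= TOLERANCE_BP for size in kept))
            for locus in loci]

# Made-up loci and peaks, for illustration only.
loci = [72, 118, 240, 431]
samples = {
    "acc_A": [(72.1, 310), (118.3, 45), (240.0, 95), (612.4, 800)],
    "acc_B": [(71.8, 58), (117.9, 220), (431.2, 61)],
}

matrix = {name: score_sample(peaks, loci) for name, peaks in samples.items()}
for name, row in matrix.items():
    print(name, row)
```

Rows of such a table, concatenated with the ISSR scores, would form the kind of combined binary matrix used in Section 2.5.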

2.4 Primer design, EcoTILLING, and evaluation of natural mutations

Primers were designed to amplify regions of approximately 1.0–2.8 kb in 36 target genes. Sequence information was taken from the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov/). CODDLE (codons optimized to discover deleterious lesions, http://www.proweb.org/input) and Primer3 software [27] were used with a minimum of 67°C and maximum of 73°C melting temperature as previously described [28]. Twelve primers passing the quality check were included in the experiment (see Section 3.1) by labeling forward primers with IRDye700 and reverse primers with IRDye800 dyes. PCR amplification was performed using 0.3 ng of DNA template to uncover homozygous and heterozygous SNPs of J. curcas accessions. After enzymatic mismatch cleavage, using a crude celery juice extract, denaturing polyacrylamide gel electrophoresis (PAGE) and fluorescence detection using LI-COR 4300 DNA analyser was conducted as previously described [29]. Three strategies were chosen for identification of novel SNPs, which were performed (a) within a single tree (heterozygous), (b) between individual trees and a reference (Kemise1 from Ethiopia), and (c) in samples of eightfold pooling strategies. Gel images in TIF format were generated by LI-COR DNA analyser and manually scored using the GelBuddy program [30]. Nucleotide polymorphisms were confirmed by Sanger sequencing and evaluated for the potential effect of SNPs on protein function using SIFT and PARSESNP programs [29, 31, 32]. Sanger sequence trace data was analyzed using Lasergene 8 and compared to reference sequence to determine the precise position and type of nucleotide polymorphism.

2.5 Statistical analyses

To investigate the genetic structure of J. curcas accessions within and among populations from different regions and countries, all analyses were performed on a single matrix combining AFLP and ISSR data. Allele frequencies were calculated using the Bayesian method proposed by [33] for diploid species with the option of non-uniform prior distributions of allele frequencies. They were used to estimate the percentage of polymorphic loci (5% criterion) and Nei's gene diversity (Hj, analogous to expected heterozygosity, He) within and between samples, under the assumption of Hardy–Weinberg equilibrium following [34], using AFLP-SURV 1.0 [35]. Further, three different measures of pairwise genetic distance between populations, fixation index (FST) [36], Nei's D and Reynolds' distance [34, 37], were calculated to estimate the genetic differentiation between populations and individuals; their significance was evaluated with a permutation test (10 000 pseudo-replicates). The robustness of each cluster was evaluated by bootstrapping of 10 000 replicates using PHYLIP package version 3.695 (http://evolution.genetics.washington.edu/phylip.html). Nei's unbiased genetic distances estimated between pairs of populations were used for visualizing cluster analysis and for principal coordinates analysis (PCoA) with the help of BiodiversityR developed for the R statistical software (http://www.Rproject.org) [38]. The proportion of genetic variance for each axis was calculated by dividing eigenvalues by the sum of all eigenvalues, and expressed as percentage [38]. The canonical analysis of principal coordinates (CAP) was used to investigate the influence of geographical location according to Jaccard's measure. The significance of ordination results was investigated by a randomization test with 999 permutations. All the constrained analyses and ordination diagrams were obtained with the BiodiversityR software developed for the R statistical software [38].
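Two of the quantities named above are simple enough to illustrate directly: Jaccard's measure on binary band profiles and the percentage of variation attributed to each ordination axis. The sketch below is not the BiodiversityR implementation; the function names and the example values are hypothetical, and it assumes all eigenvalues are non-negative.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 band profiles.

    Shared absences (0, 0) are ignored, as is usual for dominant markers.
    """
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - both / either

def percent_variance(eigenvalues):
    """Eigenvalue of each axis divided by the sum of all eigenvalues, in percent."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]

# Made-up band profiles and eigenvalues, for illustration only.
profile_1 = [1, 1, 0, 1, 0]
profile_2 = [1, 0, 0, 1, 1]
print(jaccard_distance(profile_1, profile_2))
print(percent_variance([6.0, 3.0, 1.0]))
```

The percentages reported in the figure captions for the first and second principal axes are values of this second kind.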
An analysis of molecular variance (AMOVA) was performed using GenAlEx 6.5 (http://biology.anu.edu.au/GenAlEx/Download.html [39]) to estimate global population structure and to evaluate the significance of population differentiation (ΦPT) with 9999 permutations. ΦPT has been specifically recommended for dominant data [33]. In an alternative approach, the genetic structure was investigated with the Bayesian model-based clustering algorithms implemented in STRUCTURE v2.3.4 [40]. Sampling locations were used as prior information to assist the clustering. However, anticipating that a superstructure might hide other structures at smaller spatial scales, STRUCTURE was run for the whole dataset as well as for the populations within continents, countries, and regions separately. Ten runs with a burn-in period of 100 000 replications and a run length of 100 000 Markov Chain Monte Carlo (MCMC) iterations and a model based on admixture and independent allelic frequencies, without sampling locality information, were performed for a number of clusters (K) ranging from 1 to 12. In addition, the same analyses were carried out with sampling location information for K ranging from 1 to 6 for continents, 1 to 18 for countries, and 1 to 58 for regions. The ad hoc summary statistic ΔK of [41] was determined to select the value of K with the uppermost hierarchical level of population structure using the software STRUCTURE HARVESTER [42]. Production of a combined file from the replicates of each K was performed using CLUMPP v1.1.2 with the full search algorithm [43].
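The ΔK statistic can be illustrated with a short calculation. The sketch below assumes the commonly used form ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean log-likelihood over replicate runs at a given K and sd is the standard deviation across those runs. The function name and the log-likelihood values are invented for the example and are not results of this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to a list of ln P(D) values, one per replicate run.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    for k in ks:
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])
            result[k] = second_diff / sd if sd > 0 else float("inf")
    return result

# Invented log-likelihoods (three replicate runs per K), for illustration only.
runs = {
    1: [-5000, -5002, -4998],
    2: [-4500, -4501, -4499],
    3: [-4480, -4470, -4490],
    4: [-4475, -4455, -4465],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

ΔK is undefined for the smallest and largest K tested, which is why the tested range has to extend beyond the K values of interest.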

3 Results

Polymorphic primers of three AFLP combinations and five ISSRs across 907 J. curcas accessions and RS from 3 continents, 15 countries, and 53 regions yielded 1217 and 84 polymorphic loci, respectively. The largest portion of polymorphic loci within countries was found in Jatropha accessions from Bolivia (40.7%) and the lowest within regions in Icrisat_SBAN from India (17.5%) (Table 1). The maximum number of private bands was found in Africa (159), with the highest number in Ethiopia (115), particularly in the region Kalu_Dure (7), followed by Kenya (4). Astonishingly, there were no private bands in the Mexican group (Table 1). The mean values of the Shannon information index obtained for continents, countries, and regions were 0.178, 0.136, and 0.118, respectively. However, the highest value was found in Ethiopia (0.182) followed by Bolivia and Paraguay (0.169 and 0.160, respectively). Expected heterozygosity Hj or Nei's gene diversity showed the most diverse population in Cape Verde (0.15) and the least diverse population in Icrisat_SBAN (0.0628) from India (Table 1).
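The per-locus quantities summarized in Table 1 can be illustrated with a small calculation. The sketch below assumes a simple square-root estimator of the null-allele frequency under Hardy–Weinberg equilibrium, not the Bayesian estimator used in AFLP-SURV, so it is not expected to reproduce the tabulated values. The formulas for expected heterozygosity (2pq), effective number of alleles (1/(p² + q²)) and Shannon's index (−(p ln p + q ln q)) are the standard ones for a biallelic dominant locus; the function names and the example counts are hypothetical.

```python
import math

def percent_polymorphic(n_polymorphic, n_loci):
    """Percentage of polymorphic loci, rounded to one decimal place."""
    return round(100.0 * n_polymorphic / n_loci, 1)

def locus_diversity(n_absent, n_total):
    """Diversity measures for one dominant locus.

    Assumes q = sqrt(frequency of band-absent individuals), i.e. Hardy-Weinberg
    equilibrium with a recessive null allele (an assumption of this sketch).
    """
    q = math.sqrt(n_absent / n_total)
    p = 1.0 - q
    he = 2.0 * p * q                      # expected heterozygosity
    ne = 1.0 / (p * p + q * q)            # effective number of alleles
    shannon = -sum(f * math.log(f) for f in (p, q) if f > 0)
    return {"p": p, "q": q, "He": he, "Ne": ne, "I": shannon}

# Counts taken from the text: 529 and 228 polymorphic loci out of 1301.
print(percent_polymorphic(529, 1301), percent_polymorphic(228, 1301))

# Invented example: 36 of 100 individuals lack the band.
example = locus_diversity(36, 100)
print(example)
```

The first line of output recovers the 40.7% (Bolivia) and 17.5% (Icrisat_SBAN) quoted in the paragraph above from the locus counts.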

Table 1

Results of 1301 loci of J. curcas accessions and RS from 3 continents, 15 countries, and 53 regions for gene diversity

	Populations	N a)	Polymorphic loci b)	H_j c) ± S.E.	N_e d) ± S.E.	I e) ± S.E.	No. of private bands
Continents	Asia	66	406 (31.2%)	0.10657 ± 0.00430	1.160 ± 0.008	0.167 ± 0.006	2
	Africa	701	431 (33.1%)	0.10958 ± 0.00409	1.165 ± 0.007	0.184 ± 0.006	159
	America	113	408 (31.4%)	0.10408 ± 0.00418	1.157 ± 0.008	0.168 ± 0.006	3
	RS	27	467 (35.9%)	0.12236 ± 0.00418	1.175 ± 0.007	0.193 ± 0.006	31
	Mean ± S.E.				1.164 ± 0.004	0.178 ± 0.003
Countries	India	43	402 (30.9%)	0.10924 ± 0.00442	1.161 ± 0.008	0.161 ± 0.006	1
	Indonesia	16	335 (25.7%)	0.09346 ± 0.00420	1.124 ± 0.007	0.126 ± 0.006	1
	China	7	437 (33.6%)	0.11537 ± 0.00456	1.133 ± 0.007	0.132 ± 0.006	0
	Cape Verde	3	378 (29.1%)	0.15048 ± 0.00545	1.138 ± 0.008	0.118 ± 0.007	0
	Guinea Bissau	4	307 (23.6%)	0.09810 ± 0.00447	1.083 ± 0.006	0.076 ± 0.005	0
	Burkina Faso	6	286 (22.0%)	0.08308 ± 0.00430	1.090 ± 0.007	0.082 ± 0.005	5
	Ethiopia	619	435 (33.4%)	0.10817 ± 0.00406	1.162 ± 0.007	0.182 ± 0.006	115
	Kenya	48	408 (31.4%)	0.10869 ± 0.00434	1.163 ± 0.008	0.169 ± 0.006	4
	Mali	15	352 (27.1%)	0.09516 ± 0.00409	1.122 ± 0.007	0.128 ± 0.006	0
	Tanzania	6	422 (32.4%)	0.12012 ± 0.00474	1.132 ± 0.007	0.125 ± 0.006	0
	Mexico	19	438 (33.7%)	0.10853 ± 0.00436	1.150 ± 0.008	0.154 ± 0.006	0
	Brazil	5	268 (20.6%)	0.07805 ± 0.00406	1.074 ± 0.006	0.071 ± 0.005	0
	Paraguay	83	365 (28.1%)	0.10108 ± 0.00419	1.152 ± 0.007	0.160 ± 0.006	1
	Bolivia	6	529 (40.7%)	0.14027 ± 0.00468	1.171 ± 0.008	0.169 ± 0.007	0
	RS	27	467 (35.9%)	0.12236 ± 0.00418	1.175 ± 0.007	0.193 ± 0.006	31
	Mean ± S.E.				1.135 ± 0.002	0.136 ± 0.002
Regions	India	6	356 (27.4%)	0.10064 ± 0.00451	1.106 ± 0.007	0.098 ± 0.006	0
	Icrisat_BAAS	3	315 (24.2%)	0.11829 ± 0.00497	1.096 ± 0.007	0.085 ± 0.006	0
	Icrisat_ISC	4	303 (23.3%)	0.10246 ± 0.00465	1.094 ± 0.007	0.086 ± 0.006	0
	Icrisat_SBAN	5	228 (17.5%)	0.06277 ± 0.00372	1.058 ± 0.006	0.052 ± 0.004	0
	Icrisat_SNES	25	407 (31.3%)	0.11445 ± 0.00463	1.165 ± 0.008	0.158 ± 0.007	1
	Sleman_D	5	365 (28.1%)	0.10538 ± 0.00449	1.106 ± 0.007	0.104 ± 0.006	1
	Sleman_U	11	291 (22.4%)	0.09202 ± 0.00426	1.113 ± 0.007	0.110 ± 0.006	0
	China	7	437 (33.6%)	0.11537 ± 0.00456	1.133 ± 0.007	0.132 ± 0.006	0
	Cape Verde	3	378 (29.1%)	0.15048 ± 0.00545	1.138 ± 0.008	0.118 ± 0.007	0
	Guinea Bissau	4	307 (23.6%)	0.09810 ± 0.00447	1.083 ± 0.006	0.076 ± 0.005	0
	Sarya	6	286 (22.0%)	0.08308 ± 0.00430	1.090 ± 0.007	0.082 ± 0.005	5
	South Wello	5	304 (23.4%)	0.08616 ± 0.00427	1.082 ± 0.007	0.071 ± 0.005	0
	Ataye_JewihaNegeso	25	325 (25.0%)	0.08802 ± 0.00397	1.119 ± 0.007	0.127 ± 0.006	1
	Kalu_Dure	43	371 (28.5%)	0.09927 ± 0.00416	1.145 ± 0.007	0.152 ± 0.006	7
	Kalu_Degan	11	268 (20.6%)	0.07824 ± 0.00390	1.089 ± 0.006	0.088 ± 0.005	0
	Kewit_Rasa	135	351 (27.0%)	0.08980 ± 0.00408	1.137 ± 0.007	0.142 ± 0.006	2
	Metema_Yohannis	88	375 (28.8%)	0.09852 ± 0.00419	1.148 ± 0.007	0.154 ± 0.006	0
	Metema_Shehedin	36	385 (29.6%)	0.09820 ± 0.00421	1.140 ± 0.007	0.143 ± 0.006	1
	Metema_AgamWuha	44	346 (26.6%)	0.09204 ± 0.00411	1.132 ± 0.007	0.135 ± 0.006	1
	Kemise_Y	29	370 (28.4%)	0.09407 ± 0.00423	1.138 ± 0.008	0.140 ± 0.006	0
	Kemise_M	28	386 (29.7%)	0.09811 ± 0.00408	1.136 ± 0.007	0.146 ± 0.006	3
	Kemise_D	3	309 (23.8%)	0.11149 ± 0.00487	1.089 ± 0.007	0.076 ± 0.005	0
	Woreda	37	318 (24.4%)	0.07887 ± 0.00387	1.111 ± 0.007	0.115 ± 0.006	0
	West Hararghe	3	355 (27.3%)	0.13628 ± 0.00512	1.117 ± 0.007	0.107 ± 0.006	0
	Chreti	8	423 (32.5%)	0.10982 ± 0.00454	1.135 ± 0.007	0.131 ± 0.006	0
	Shoa_Robit	58	428 (32.9%)	0.10578 ± 0.00416	1.154 ± 0.007	0.165 ± 0.006	1
	Shoa_Zuti	3	250 (19.2%)	0.08526 ± 0.00426	1.061 ± 0.005	0.056 ± 0.005	2
	Shoa_Jewha	6	491 (37.7%)	0.13265 ± 0.00465	1.159 ± 0.008	0.157 ± 0.006	0
	Shoa_Fok	5	343 (26.4%)	0.09499 ± 0.00427	1.085 ± 0.006	0.079 ± 0.005	0
	Shoa_Sodere	10	420 (32.3%)	0.11044 ± 0.00460	1.140 ± 0.007	0.134 ± 0.006	0
	Wolita_Sodo	5	460 (35.4%)	0.14627 ± 0.00488	1.169 ± 0.008	0.159 ± 0.007	0
	Goffa	7	339 (26.1%)	0.09735 ± 0.00451	1.107 ± 0.007	0.095 ± 0.006	0
	Wolita_WachigaMalka	8	419 (32.2%)	0.11262 ± 0.00477	1.137 ± 0.008	0.125 ± 0.006	0
	Wolita_Mella	8	385 (29.6%)	0.10352 ± 0.00464	1.128 ± 0.008	0.118 ± 0.006	0
	Wolita_Fango	4	337 (25.9%)	0.12195 ± 0.00514	1.128 ± 0.008	0.109 ± 0.006	1
	SouthOmo_Turmie	3	392 (30.1%)	0.15189 ± 0.00515	1.146 ± 0.008	0.133 ± 0.007	0
	Waretiyo	7	454 (34.9%)	0.12542 ± 0.00484	1.146 ± 0.008	0.137 ± 0.006	0
	Busia	5	311 (23.9%)	0.09299 ± 0.00445	1.093 ± 0.007	0.085 ± 0.005	0
	Kakamega	20	421 (32.4%)	0.10852 ± 0.00427	1.150 ± 0.007	0.156 ± 0.006	3
	Kenya_introduce	16	330 (25.4%)	0.08707 ± 0.00418	1.115 ± 0.007	0.111 ± 0.006	1
	Siaya	7	355 (27.3%)	0.09237 ± 0.00425	1.096 ± 0.007	0.093 ± 0.005	0
	Falou	8	434 (33.4%)	0.11028 ± 0.00448	1.132 ± 0.007	0.130 ± 0.006	0
	Ech	7	301 (23.1%)	0.07897 ± 0.00407	1.086 ± 0.006	0.082 ± 0.005	0
	Arusha	6	422 (32.4%)	0.12012 ± 0.00474	1.132 ± 0.007	0.125 ± 0.006	0
	Las Pilas	6	380 (29.2%)	0.10819 ± 0.00466	1.116 ± 0.007	0.105 ± 0.006	0
	Mexico	3	311 (23.9%)	0.12326 ± 0.00515	1.112 ± 0.007	0.096 ± 0.006	0
	Morelia	3	274 (21.1%)	0.10607 ± 0.00489	1.091 ± 0.007	0.079 ± 0.006	0
	Puerto Escondido	7	414 (31.8%)	0.11081 ± 0.00440	1.127 ± 0.007	0.127 ± 0.006	0
	Brazil	5	268 (20.6%)	0.07805 ± 0.00406	1.074 ± 0.006	0.071 ± 0.005	0
	Paraguay	83	365 (28.1%)	0.10108 ± 0.00419	1.152 ± 0.007	0.160 ± 0.006	1
	Bolivia	6	529 (40.7%)	0.14027 ± 0.00468	1.171 ± 0.008	0.169 ± 0.007	0
	RS_Catamarca	14	445 (34.2%)	0.12489 ± 0.00428	1.171 ± 0.007	0.179 ± 0.006	8
	Rs_La Riova	9	378 (29.1%)	0.09321 ± 0.00443	1.116 ± 0.007	0.108 ± 0.006	0
	Rs_Mix	4	579 (44.5%)	0.17036 ± 0.00409	1.202 ± 0.008	0.204 ± 0.007	15
	Mean ± S.E.				1.122 ± 0.007	0.118 ± 0.006

a) Average number of scored individuals.

b) 5% criterion applied to Bayesian estimates of allele frequencies.

c) Nei's gene diversity.

d) No. of effective alleles.

e) Shannon's information index. S.E., standard error.

Both FST and ΦPT statistics revealed highly significant genetic differences among regions but moderate genetic differentiation among continents or countries (Table 2). Summaries of AMOVAs, based on all surveyed populations from different continents, countries, or regions, are shown in Table 3. AMOVA nested analyses based on continents showed that only 6% (ΦRT = 0.059, p < 0.0001) of the total genetic variation was partitioned between continents, 9% (ΦPR = 0.098, p < 0.0001) among countries within continents and 85% (ΦPT = 0.151, p < 0.0001) among individuals within countries. Further, the component of variability within populations was higher than among populations based on continents, countries, and regions (Supporting information, Table S2).

Table 2

Wright FST and ΦPT values

	F_ST a)	p-value	Φ_PT b)	p-value
Continents	0.1257	<0.0039	0.123	<0.0001
Countries	0.1079	<0.0133	0.138	<0.0001
Regions	0.1782	<0.0070	0.273	<0.0001

a) Based on Bayesian estimates of allele frequencies with 10 000 random permutations.

b) Calculated using Euclidean distance with 9999 permutations.

Table 3

AMOVA based on 1301 loci and 907 individuals of J. curcas accessions and RS from 3 continents, 15 countries, and 53 regions

Source of variation	d.f.	MSD	Variance component	% of total	Φ statistics	p-value*
Between continents	3	1525.425	6.037	6%	Φ_RT = 0.059	<0.0001
Among countries within continents	11	284.234	9.372	9%	Φ_PR = 0.098	<0.0001
Among individuals within countries	892	86.301	86.301	85%	Φ_PT = 0.151	<0.0001
Between countries	14	550.204	4.662	5%	Φ_RT = 0.048	<0.0001
Among regions within countries	39	464.415	23.372	24%	Φ_PR = 0.253	<0.0001
Among individuals within regions	853	69.013	69.013	71%	Φ_PT = 0.289	<0.0001
Between regions	53	487.076	25.940	27%
Among individuals within regions	853	69.013	69.013	73%	Φ_PT = 0.273	<0.0001

The percentage of variance partitioned at each level of structure is shown with the associated significance values, together with the degrees of freedom (d.f.), mean squared deviations (MSD), and the values of the variance components.
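The percentages and Φ statistics in Table 3 follow arithmetically from the variance components. The sketch below assumes the standard three-level AMOVA definitions (ΦRT = among-group variance over total; ΦPR = among-population variance over the sum of among-population and within-population variance; ΦPT = the sum of the two among-level components over total); the function name is hypothetical. The input values are the variance components from the first two nested analyses in Table 3.

```python
def amova_partition(v_among_groups, v_among_pops, v_within):
    """Percent of variance and Phi statistics from three variance components."""
    total = v_among_groups + v_among_pops + v_within
    return {
        "percent": [round(100.0 * v / total)
                    for v in (v_among_groups, v_among_pops, v_within)],
        "phi_rt": round(v_among_groups / total, 3),
        "phi_pr": round(v_among_pops / (v_among_pops + v_within), 3),
        "phi_pt": round((v_among_groups + v_among_pops) / total, 3),
    }

# Continents > countries > individuals (variance components from Table 3).
by_continent = amova_partition(6.037, 9.372, 86.301)
# Countries > regions > individuals (variance components from Table 3).
by_country = amova_partition(4.662, 23.372, 69.013)

print(by_continent)
print(by_country)
```

Under these definitions the recomputed values agree with the tabulated percentages (6%, 9%, 85% and 5%, 24%, 71%) and Φ statistics to the precision reported.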

Different genetic distance measures were tested, but proved to be highly correlated and yielded consistent results (data not shown). For the following analyses the results obtained from Nei's D were used. For the continents, the dendrogram divided the populations into two major groups, one containing American and Asian and the other containing African accessions (Fig. 1A). This result is also consistent with the PCoA, cophenetic, and STRUCTURE analyses (Fig. 1B–D). The phenogram from 15 countries (Fig. 2A) divided the J. curcas accessions into three major clusters. Cluster 1 (green) consists of the population from Burkina Faso, clearly separated from the other countries with 98% bootstrap support. Cluster 2 (violet) includes Indonesia and Brazil, with 99% bootstrap support, together with India (90%) and Guinea Bissau (75%). Cluster 3 (black) is divided into two groups, one comprising Ethiopia, China, and Kenya with 98% bootstrap support, whereas the second group contains three populations from the Americas (Mexico, Bolivia, and Paraguay) and three from Africa (Cape Verde, Tanzania, and Mali) with lower than 70% bootstrap support. In general, populations from different countries are not separated based on their geographic provenance. All RS (red) were separated from J. curcas accessions with 99% bootstrap support. Further, the PCoA based on Nei's unbiased genetic distances for 14 populations of J. curcas accessions and one RS from 15 different countries is consistent with previous cluster analyses (Fig. 2B). Interestingly, RS (red) and accessions from Burkina Faso and Guinea Bissau were located far from other countries, and Mexico and Mali overlap each other (Fig. 2B). Further, the cophenetic analysis is a representation of what happens during the clustering process. The red circle was formed between RS and J. curcas accessions. The green circles summarize the distance between Burkina Faso and other J. curcas accessions (Fig. 2C).

Figure 1

Figure 2

The phenogram, principal coordinates and STRUCTURE assignment analyses for the 1301 loci for 907 J. curcas accessions and RS from 15 countries. (A) Phenogram based on Nei's unbiased genetic distance, with 10000 permutation. The vertical axis shows the genetic distance at which populations cluster. Nodes with >75% bootstrap support were indicated and color indicates similar groups. (B) PCoA based on genetic distances calculated according to Jaccard's method. Horizontal and vertical scales represent the first and second principal axes of variation, respectively. In this instance, the first principal axis represents a large 41.4% of variation, the second a much smaller 15.5% (total eigenvalues = 0.003319). (C) Plot of pairwise genetic distances between 15 countries and RS on the horizontal axis against pairwise cophenetic distances between same populations as suggested by the cluster algorithm on the vertical axis. Mantel statistic r = 0.8545, p < 0.009, based on 100 permutations. (D) Result of the STRUCTURE analysis. The color of the individuals (white and gray) represents the proportion of their genome assigned to the K = 2 inferred clusters in the model-based admixture analyses.

The phenogram, principal coordinates and STRUCTURE assignment analyses for the 1301 loci for 907 J. curcas accessions and related species (RS) from three continents. (A) Phenogram based on Nei's unbiased genetic distance, with 10 000 permutations. The vertical axis shows the genetic distance at which populations cluster. Nodes with >75% bootstrap support are indicated, and color indicates similar groups. (B) PCoA based on genetic distances calculated according to Jaccard's method. In this instance, the first principal axis represents 93.1% of variation and the second 4.1% (total eigenvalues = 0.000687). (C) Plot of pairwise genetic distances between three continents and RS on the horizontal axis against pairwise cophenetic distances between the same populations, as suggested by the cluster algorithm, on the vertical axis. The line shows where values on the horizontal axis are equal to values on the vertical axis (when genetic distance = cophenetic distance). Mantel statistic r = 0.952, p < 0.009, based on 100 permutations. (D) Result of the STRUCTURE analysis. The color of the individuals (white and gray) represents the proportion of their genome assigned to the K = 2 inferred clusters in the model-based admixture analyses.

The phenogram, principal coordinates and STRUCTURE assignment analyses for the 1301 loci for 907 J. curcas accessions and RS from 15 countries. (A) Phenogram based on Nei's unbiased genetic distance, with 10 000 permutations. The vertical axis shows the genetic distance at which populations cluster. Nodes with >75% bootstrap support are indicated, and color indicates similar groups. (B) PCoA based on genetic distances calculated according to Jaccard's method. Horizontal and vertical scales represent the first and second principal axes of variation, respectively. In this instance, the first principal axis represents 41.4% of variation and the second 15.5% (total eigenvalues = 0.003319). (C) Plot of pairwise genetic distances between 15 countries and RS on the horizontal axis against pairwise cophenetic distances between the same populations, as suggested by the cluster algorithm, on the vertical axis. Mantel statistic r = 0.8545, p < 0.009, based on 100 permutations. (D) Result of the STRUCTURE analysis. The color of the individuals (white and gray) represents the proportion of their genome assigned to the K = 2 inferred clusters in the model-based admixture analyses.

Similar results were obtained in populations analyzed from 53 regions (Fig. 3). In the phenogram (Fig. 3A), all RS (red) formed an isolated out-group, while all J. curcas accessions from different regions formed two main clusters. One cluster (green) with 99% bootstrap support contained three populations of Wolita as well as Shoa_Fok, Goffa, and Kemise_D, all originating from Ethiopia. The second cluster contained three subclusters: one (violet) with the populations from Kenya (Busia and Siaya), China and Ethiopia (Shoa-Rabit), and two Ethiopian regions (Metema_Yohannis and Metema_Shehedin), supported in 97, 99, and 97% of random datasets, respectively, which clustered with Kalu_Dure and Metema_AgamWuha from Ethiopia and Kakamega from Kenya with 88.0 and 75%, respectively. Interestingly, Las Pilas from Mexico clustered with J. curcas accessions from other regions. The Mexican populations (Puerto Escondido, Morelia, Mexico) clustered with Arusha from Tanzania, Falou from Mali, Shoa_Jewha and West Hararghe from Ethiopia, and Paraguay. Further, Kalu_Degan and Ataye_JewihaNegeso clustered together with Kemise_Y from Ethiopia with 100 and 99% bootstrap support, respectively. Brazil and Sleman-D from Indonesia clustered together, supported in 96% of random datasets. The same structures could also be recognized by standard PCoA approaches (Fig. 3B). The RS (red), Kemise_D, Goffa, Shoa_Fok as well as three populations of Wolita (green) clearly separated from other J. curcas accessions on the first axis, which was further confirmed by cophenetic analysis (Fig. 3). Mexican accessions (Mexico, Las Pilas, and Morelia) were mixed with accessions from different regions without any correlation with geographical location.
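The figure captions report a Mantel statistic between the genetic and cophenetic distance matrices, evaluated with 100 permutations. A minimal sketch of such a test is given below, assuming square symmetric distance matrices stored as nested lists. The function names, the toy matrices, and the p-value convention (observed arrangement counted as one permutation) are illustrative assumptions and not taken from the study or from the software the authors used.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=100, seed=0):
    """Return the Mantel r and a one-sided permutation p-value.

    Rows and columns of dist_b are permuted together, so every permuted
    matrix is still a valid relabelling of the same distances.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    r_obs = pearson(flat_a, upper_triangle(dist_b))
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= r_obs:
            hits += 1
    # Assumed convention: the observed arrangement counts as one permutation.
    p_value = (hits + 1) / (permutations + 1)
    return r_obs, p_value


# Hypothetical toy data for four populations (not values from the study).
toy_genetic = [
    [0.0, 0.1, 0.5, 0.6],
    [0.1, 0.0, 0.5, 0.6],
    [0.5, 0.5, 0.0, 0.2],
    [0.6, 0.6, 0.2, 0.0],
]
# Cophenetic distances implied by average-linkage clustering of toy_genetic.
toy_cophenetic = [
    [0.0, 0.1, 0.55, 0.55],
    [0.1, 0.0, 0.55, 0.55],
    [0.55, 0.55, 0.0, 0.2],
    [0.55, 0.55, 0.2, 0.0],
]

r, p = mantel(toy_genetic, toy_cophenetic, permutations=100, seed=1)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

A high r indicates that the tree preserves the pairwise distances well, which is how the cophenetic plots in panel (C) are read.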

Figure 3

The phenogram, principal coordinates and STRUCTURE assignment analyses for the 1301 loci for 907 J. curcas accessions and RS (RS_Catamarca, Rs_ La_Riova, RS_Mix) from 53 regions. (A) Phenogram based on Nei's unbiased genetic distance, with 10 000 permutations. The vertical axis shows the genetic distance at which populations cluster. Nodes with >75% bootstrap support are indicated, and color indicates similar groups. (B) PCoA of J. curcas accessions and RS (RS_Catamarca, Rs_ La_Riova, RS_Mix) from 53 different regions, based on genetic distances calculated according to Jaccard's method. In this instance, the first principal axis represents 21.8% of variation and the second 20.4% (total eigenvalues = 0.04497858). (C) Plot of pairwise genetic distances of J. curcas accessions from 53 different regions and RS (RS_Catamarca, Rs_ La_Riova, RS_Mix) on the horizontal axis against pairwise cophenetic distances between populations, as suggested by the cluster algorithm, on the vertical axis. Mantel statistic r = 0.7086, p < 0.009, based on 100 permutations. (D) Result of the STRUCTURE analysis. The color of the individuals (white and gray) represents the proportion of their genome assigned to the K = 2 inferred clusters in the model-based admixture analyses.

The phenogram based on Nei's genetic distance of 1301 loci divided the 907 Jatropha accessions into three clusters (Fig. 4A). Similarly, the standard PCoA showed that individuals could be divided into three groups: RS (red) in the center, separated from the other two groups (green and violet), which contained J. curcas accessions from different regions and countries and overlapped each other in distribution. In this instance, the first principal axis represents 19.6% of variation and the second 7.9%. The first axis separated J. curcas accessions into two groups. Accessions from Ethiopia, Kenya, Paraguay, Bolivia, and Mexico were found in both groups, while accessions from India, Mali, Indonesia, Guinea Bissau, China, Cape Verde, Brazil, Burkina Faso, and Tanzania were located in only one group (violet) (Fig. 4B). Further, results showed that particular individuals (mostly from Africa) mixed with RS (Fig. 4A and B). The overlapping positions of individuals from different continents, countries and/or regions are numerous (Fig. 4B), which can also be seen in the cophenetic diagram (Fig. 4C). Similarly, the CAP results divided individuals from the countries and regions into three groups. In addition, the total squared Jaccard distance showed 33.87% (p < 0.001) differentiation between various populations, while the value for the first two constrained ordination axes was 16.18% (Supporting information, Fig. S1).
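The PCoA and CAP results rest on Jaccard distances computed from dominant marker data. As a minimal sketch, assuming each accession is scored as a presence/absence profile (1 = band present, 0 = band absent) over the same loci, the distance between two profiles can be computed as follows. The function names and the toy profiles are hypothetical and do not reproduce data from the study.

```python
def jaccard_distance(profile_a, profile_b):
    """Jaccard distance between two presence/absence (1/0) marker profiles.

    Shared absences (0/0) are ignored, which is the usual reason for
    choosing this coefficient for dominant markers.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    if either == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    return 1.0 - shared / either


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Jaccard distances for a list of profiles."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = jaccard_distance(profiles[i], profiles[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical band profiles for three accessions over six loci.
toy_profiles = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
]
toy_matrix = distance_matrix(toy_profiles)
```

A matrix of this kind is the input to an ordination such as PCoA; the ordination step itself is not shown here.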

Figure 4

The phenogram, principal coordinates and STRUCTURE assignment analyses for the 1301 loci for 907 J. curcas accessions and RS. (A) Phenogram based on Nei's unbiased genetic distance. The vertical axis shows the genetic distance at which populations cluster. (B) PCoA of 907 J. curcas accessions and RS, based on genetic distances calculated according to Jaccard's method. Individuals from different populations are labeled with different symbols. Horizontal and vertical scales represent the first (20%) and second (8%) principal axes of variation, respectively. Color indicates similar groups. (C) Plot of pairwise genetic distances of 907 J. curcas accessions and RS on the horizontal axis against pairwise cophenetic distances between individuals, as suggested by the cluster algorithm, on the vertical axis. Mantel statistic r = 0.8313, p < 0.009, based on 100 permutations. (D) Result of the STRUCTURE analysis. The color (white and gray) of the individuals, ordered from right to left as described in Supporting information, Table S1, represents the proportion of their genome assigned to the K = 2 inferred clusters in model-based admixture analyses.

3.1 SNP detection in natural accessions

Twelve candidate genes involved in toxin production, oil production, and stress tolerance were screened for SNPs in 907 J. curcas accessions and RS (Table 4). Selected primers amplified homologous sequences in RS within the genus Jatropha; however, a clearly distinct pattern associated with J. curcas was found (data not shown).

Table 4

List of J. curcas genes, sequences of primers, size of amplicons, and the number of SNPs identified in 907 J. curcas by EcoTILLING

Gene name	Function	IRD700 primer	IRD800 primer	Acc. no.	Size (bp)	Common SNPs	Rare SNPs
AF	Curcin	GGGCAGTTTCCCTATAAAAGCAGGTGA	ATTAAAGCCATGGCAGCCACTTTTGGT	AF469003	1267	15	29 a)
EU06	Phosphoenolpyruvate carboxylase	ACGGCAAGTTTCTACCTTTGGGCTTTC	AAGCTCTGCAGCTGGTTTGCTTGATTC	EU069413	2000	NA	NA
EF03	Aquaporin	GCCAAGGAAGTAAGTGAAGAAACGCAAA	TTCTTTCATTTTAGTTGGTGGGGTTGCTG	EF030420	1289	1	2
DQ98	Beta-ketoacyl-ACP synthase I	AAGCCCTCCAATCCCCATCTATACGAG	GCCTCGTTAAACCTGGTCCATAAGCAA	DQ987699	2500	3	0
EU10	Chloroplast acyl-ACP thioesterase (FATB)	TGCTGCTACTTCCTCGTTCTTCCCTGT	GGCACTTTCAACTGGAATCTGACCCATA	EU106891	3664	0	2
EU22	Acyl-ACP thioesterase (FATA)	AACCAGCATCGTAGCAGTTCCCATTTC	TCATCTGGAGGGCTTCTTTCTCCATTC	EU267122	2800	2	0
EU39	Seed maturation protein-like (Smp1)	GTAAGAATGGCTTCCGGCGAGCAG	CTCTGCATAGGCGTTTGTTTCCCAAAG	EU395432	950	0	0
FJ23	Lipase	ACAGCAGGACTTGAAAGGGACAGTTGC	TTCCCTCTCCAACCAGGCTTACAACAG	FJ233094	2800	3	0
EU21	Microsomal omega-3 fatty acid desaturase (FAD3)	AACACAAATGGCGTTAATGGGTTTCACA	GGAATTCTGTATAGCTCAGGGTCTGTTTGA	EU267121	1500	11	0
DQ15	Δ12-Fatty acid desaturase	AGAATGTCTGTTCCTCCTTCCCCCAAG	CCAGAACACACCTCTGCTTTGATCAGC	DQ157776	1152	0	14 b)
DQ66	Chloroplast omega-3 fatty acid desaturase	AGCTGGGTTTTGTCTGAATGTGGCCTA	GGTGATGCAAGTAGGTCACAAGGTCCA	DQ665869	1700	0	0
SUSYII	Sucrose synthase (SuSy)	CGTGTTATCACTCGCGTTCACAGCATC	CAATTTCCTGGAACCTGTGCTCGAACT	KC346252	1500	4	0

a) 20 SNPs were detected by EcoTILLING and 26 additional SNPs were found after sequencing.

b) 15 SNPs were detected by EcoTILLING, while no SNPs were found after sequencing.

Both heterozygous and homozygous variants were screened to obtain an accurate estimate of nucleotide diversity. A total of 62 new SNPs (39 common and 23 minor alleles represented at or below 5%) were detected in 10 of the 12 genes tested by EcoTILLING (Table 4). The SNPs were generated in silico, and predicted by either SIFT or PARSESNP for all individuals. No SNPs were found by EcoTILLING in EU39 and DQ66, a J. curcas seed maturation gene (Smp1) and a chloroplast omega-3 fatty acid (FA) desaturase, respectively, while the most polymorphic locus, AF, a curcin gene coding for a ribosome-inactivating protein from J. curcas, revealed 20 SNPs. Additionally, in the curcin gene, 26 novel SNPs that had not been detected through EcoTILLING were identified upon DNA sequencing. In silico analysis showed that these changes may affect the amino acid composition. However, in gene DQ15, a Δ12-FA desaturase of J. curcas, only 8 of the candidate polymorphic sites could be verified by sequencing. Additionally, in this gene, 14 SNPs were found by EcoTILLING in one accession (R1009-7) from the Shoa_Jewha region of Ethiopia, but could not be confirmed by Sanger sequencing. Previous studies using EcoTILLING in complex genomes showed false positive and false negative error rates of 5% or lower [44]. The cause of errors in the present study remains unknown, but may be due in part to co-amplification of homologous sequences affecting enzymatic mismatch cleavage and sequence analysis. A complete list of polymorphisms after sequencing is provided (Supporting information, Table S3), and the distribution of SNPs in the targeted region of these genes is shown (Supporting information, Fig. S7).
Polymorphisms were found for the AF gene in four Mexican accessions, two from the Morelia region (MexMar1-2), which were described as non-toxic, and two from the Puerto Escondido region (MexPC1, 4), as well as in one accession from Mali in the Ech region (Mal3) (Supporting information, Table S3). The EU10 gene showed SNPs in six Mexican accessions (MexMar1, 2, MexPC1, 3, 4, and Mexico3), in one from India (ISC111_2) and in one from Mali (Mal3). Further, SNPs were found for the EF03 gene in one accession from Mali (Mal3) and in five Mexican accessions: the two non-toxic accessions (MexMar1 and 2), two from the Puerto Escondido region (MexPC1, 4), and one from Las Pilas (J_curc1).
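The text distinguishes common SNPs from minor alleles represented at or below 5%. The sketch below illustrates how such a split could be made from genotype calls, assuming diploid genotypes coded as the number of alternate-allele copies (0, 1 or 2). The coding scheme, the function names, and the example genotypes are assumptions for illustration; only the 5% threshold comes from the text.

```python
def alt_allele_frequency(genotypes):
    """Frequency of the alternate allele among diploid genotypes coded 0, 1 or 2."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return sum(genotypes) / (2 * len(genotypes))


def classify_snp(genotypes, threshold=0.05):
    """Label a SNP 'rare' if its minor allele frequency is at or below the threshold."""
    freq = alt_allele_frequency(genotypes)
    minor = min(freq, 1.0 - freq)
    return "rare" if minor <= threshold else "common"


# Hypothetical example: 100 accessions, one homozygous for the alternate allele.
single_carrier = [2] + [0] * 99
# Hypothetical example: 20 heterozygous accessions among 100.
twenty_hets = [1] * 20 + [0] * 80

print(classify_snp(single_carrier))  # minor allele frequency 0.01
print(classify_snp(twenty_hets))     # minor allele frequency 0.10
```

Counting both heterozygous and homozygous carriers, as the text describes, matters here because each genotype class contributes a different number of allele copies to the frequency.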

4 Discussion

The level of genetic diversity and differentiation of Jatropha is partly attributed to the mode of its introduction in many countries as an exotic species [45]. The precise origin of J. curcas is somewhat disputed, but it is assumed to be native to Central America, and to have been transported via Cape Verde and Guinea Bissau to Africa and Asia by Portuguese ships approximately 200 years ago [46], which represents a very short time frame for new alleles to evolve. If this is correct, the level of genetic variation present in Africa and Asia is determined largely by the material originally exported from America, and possibly also by reciprocal trade movements between these continents. Due to the ability of Jatropha to successfully grow and occupy large areas in a short period, it is possible that African and Asian populations result from a narrow germplasm origin. Human activities are a crucial factor influencing the genetic diversity of J. curcas, which was extensively managed over the last centuries. In addition, the quite common practice of using reproductive material of unknown origin might have resulted in a significant proportion of re-implanted stocks being poorly adapted to site conditions. This complex genetic history might have contributed to a potential genetic bottleneck [19, 45]. Therefore, molecular tools to certify, identify, and understand the genetic variation and relationship within and between J. curcas accessions and RS of suitable autochthonous stands can greatly improve the formulation of appropriate breeding strategies for this species [47]. In general, as well as in this study, all investigated populations showed a low to moderate level of variability [3, 9, 16, 48–50], which could be explained by human impact, intensive selection, industrial management or the Jatropha mating system resulting in a high frequency of homozygosis [50, 51]. On the other hand, clear phenotypic variation among Jatropha collections is well documented.
However, most of the genetic variability in J. curcas was shown to be essentially epigenetic, which could explain the phenotypic variation without requiring changes in DNA sequence [3, 45, 52]. Further, J. curcas is a semi-deciduous tree with male and female flowers. Seeds are produced both asexually by apomixis and sexually by a mixed mating system through self- and cross-pollination [51, 53]. However, a high rate of correlated mating was found, in which open-pollinated seeds were mainly full sibs [51]. In addition, since J. curcas was found as isolated plants, interbreeding is not very frequent, which may further limit its genetic diversity [50]. Jatropha accessions are not naturally distributed, which means that no “true” wild populations exist, but only fence or home garden trees. Therefore, the genetic diversity depends largely on its mode of propagation [46, 54], limiting the interchange of germplasm at the local level [54]. Moreover, the genetic diversity of J. curcas has so far been studied mainly with small numbers of individuals and regions, and with different sampling biases and methods, which could inflate or deflate the estimated variability and affect the interpretation of results, as mentioned before. Investigations indicated a narrow genetic diversity index worldwide [3]; however, some studies reported a higher index value among populations from different countries and continents [55, 56], even higher than those obtained in Mexican populations [54]. Further, [56] reported a high genetic diversity index in accessions from Yunnan (China) and South America (0.502 and 0.492, respectively), using EST-SSRs for 45 J. curcas accessions from Indonesia, South America, Grenada, and China. Only a few studies exist on the degree of genetic diversity in Mexico [17, 18, 55, 57]; they showed that populations of J. curcas from the State of Chiapas (on the border to Guatemala) have a greater genetic diversity (0.267) compared to other parts of Mexico (0.081–0.199) [18], which was also described by [54]. Similarly, in the current study, Mexican accessions not originating from Chiapas showed a lower genetic diversity. Further, Chiapas is extremely different in climate (most humid of the warm sub-humid area with summer rains), and because of the observed toxicity no selection process is believed to have occurred in the Chiapas region. In contrast, the high genetic similarity found in non-toxic accessions from Veracruz, Hidalgo, Puebla, Yucatan, and Morelos could result from the fact that Jatropha was selected and domesticated there approximately 3000 years ago [18]. In the current study, both FST and ΦPT statistics (Table 2) revealed highly significant genetic differences among regions (0.178 and 0.273, respectively) but moderate genetic differentiation among countries (0.107 and 0.138, respectively) or continents (0.125 and 0.123, respectively). The ΦPT value is analogous to FST; values from 0.15 to 0.25 reflect great, values from 0.05 to 0.15 moderate, and values below 0.05 little genetic differentiation [36]. As expected, in all cases (between continents, countries, or regions) ΦPT values exceeded those of FST. Previous studies reported various index values of differentiation among populations [18, 24, 54–58]. An index value of differentiation among populations greater than 0.25 is considered high, and index values greater than 0.5 mean that populations are so different that they could even be in the process of speciation, which is not supposed to be the case with J. curcas [54]. AMOVA showed that most of the genetic variation was preserved within individuals rather than among populations from different countries or regions, a phenomenon observed in tropical tree species [59] and J. curcas [49, 54].
However, this can be explained by the fact that in the current study sampling was performed on a much larger geographical scale, and may be indicative of a recent distribution. Results of these analyses support the former history of geographical distribution of J. curcas, and therefore its genetic mixture with low variation beyond the center of origin. AFLP and ISSR markers revealed that most accessions were divided into two groups, where the accessions from Mexico, Paraguay, Bolivia, Kenya as well as from Ethiopia with unknown origins were found in both groups. All Asian (India, Indonesia, China) and some African (Mali, Guinea Bissau, Cape Verde, Burkina Faso, and Tanzania) accessions were found in only one group. The existence of Central and South American accessions in both groups supports the hypothesis that they were spread in the past into other populations. Further, identification of novel SNPs in Mexican accessions could confirm a broad gene pool of J. curcas. On the other hand, accessions located in only one group might stem from different ancestral populations, might have been under different selection pressures, or might have been recently introduced and rapidly distributed due to their economic relevance [48, 60]. In general, there was a considerable overlap between individuals from different regions and countries (Fig. 4). Further, the clustering of accessions based on molecular information was not related to the geographic origin, indicating widespread seed or vegetative dispersion of particular genotypes. Although accessions from Cape Verde and South Omo-Tumine (Ethiopia) had a high genetic diversity, the Ethiopian accessions showed the most private alleles, making them also important reservoirs for genetic conservation and interesting for any breeding program.
This phenomenon might be explained by the mode of introduction from different genetic backgrounds, prior to anthropogenic management [13, 55], selective environmental pressure [48], or the possible cross hybridization between J. curcas and RS [3]. However, it is clear that further interpretation requires an understanding of the history of its introduction through human activities, which is unfortunately not currently available. The existence of African species of Jatropha may explain the diversification of the genus by vicariance [21]. It was claimed by Dehgan and Schutzman [61] that the subsequent spread of Jatropha spp. in Africa and America was a result of the separation of the continent of Gondwana ∼100 million years ago [17]. Further, findings of the oldest fossil of the Jatropha genus in Peru [62] indicated that the genus radiated from this area [3]. If this observation is correct, Central America might be a center of diversity rather than the center of origin [60]. Further, J. curcas accessions formed a tight cluster with short branches indicating high genetic similarity, whereas populations from RS had much longer branches pointing to higher genetic differentiation (Figs. 1A–4A). Nevertheless, some J. curcas accessions were mixed with RS, indicating possible inbred lines (Fig. 4A and B). Despite the low diversity between J. curcas accessions, the ordination provided additional information to the analysis of populations. STRUCTURE was used to determine the probability that each of the collected individuals belongs to a distinct population [40], using the recessive alleles model for dominant marker data and assuming admixture and independent allele frequencies. However, STRUCTURE was not able to resolve relationships between J. curcas accessions and RS.
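The differentiation thresholds quoted in this section (below 0.05 little, 0.05 to 0.15 moderate, 0.15 to 0.25 great, above 0.25 high) can be applied mechanically to the FST and ΦPT values reported from Table 2. The following sketch does so; the function name, the dictionary layout, and the handling of values that fall exactly on a boundary are assumptions, since the text does not specify them.

```python
def differentiation_level(value):
    """Map an FST or PhiPT value to the qualitative classes quoted in the text.

    Boundary handling (which class owns 0.05, 0.15 and 0.25) is an
    assumption of this sketch.
    """
    if value < 0.05:
        return "little"
    if value < 0.15:
        return "moderate"
    if value <= 0.25:
        return "great"
    return "high"


# Values as reported in the text for Table 2: (FST, PhiPT) per grouping level.
reported = {
    "regions": (0.178, 0.273),
    "countries": (0.107, 0.138),
    "continents": (0.125, 0.123),
}

for level, (fst, phipt) in reported.items():
    print(level, differentiation_level(fst), differentiation_level(phipt))
```

Under these thresholds only the regional comparison leaves the moderate class, which matches the contrast the text draws between regions on the one hand and countries or continents on the other.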

4.1 In silico gene function analysis

A major aim in the genetic improvement of Jatropha is the development of high-yielding varieties in terms of seed yield, high oil content, and low toxin amount. To accelerate this process, the identification of natural variation in genes of interest is important. For this purpose, EcoTILLING, a variation of TILLING, was performed to investigate the function of candidate genes by discovering new SNPs. Identification of SNP markers in coding regions allowed the discovery of alleles that are under selection or affect gene expression. Further, minor alleles are key to most phenotypic variation and could reflect the evolutionary history. In addition, the frequency of different SNPs in different populations may reflect distinctive histories or selective forces affecting individuals in that population [63]. Moreover, looking at variation in as many populations as possible gives a more precise picture of genetic diversity. Therefore, 12 candidate genes were selected which were predicted to be involved in oil and toxin production, as well as seed development. Among different pooling strategies, samples pooled eight times [29] showed clear polymorphic bands. Because SNPs in coding regions are in general less frequent than in non-coding regions, a combination of coding and non-coding regions was used to increase the number of identified SNPs. In general, the number of rare SNPs was very low (Table 4), reflecting a low polymorphism among individuals. However, the chosen strategy could separate Mexican from other J. curcas accessions and distinguish toxic from non-toxic ones. Interestingly, a higher variation was found in the curcin gene than in genes responsible for oil production.
The two selected ancient thioesterase genes (FatA and FatB), encoding 18:1 (oleoyl)-ACP (acyl–acyl carrier protein) and 16:0 (palmitoyl)-ACP thioesterase, respectively, play an essential role in chain termination during de novo FA synthesis and in the channeling of carbon flux between the two lipid biosynthesis pathways in plants [64]. Investigation of seed quality characters in Jatropha germplasm demonstrated a large variation in seed and oil traits related to different geographic locations or different environmental conditions [65, 66]. In fact, selection of a superior genotype for total saturated FA composition may not remain consistent from one environment to another [65]. The ability of biodiesel to meet the special criteria is largely determined by its FA composition, with a high monounsaturated FA (oleic acid) content being favorable [1, 67]. To improve Jatropha biodiesel qualities, higher oleic acid (>70%) and lower saturated FA (<10%) contents are desirable and could be obtained by identifying suitable accessions in the Jatropha germplasm [68]. Δ12-desaturase (FAD2) is the key enzyme responsible for the production of linoleic acid in plants by catalyzing the conversion of oleoyl-ACP (oleic) to linoleoyl-ACP (linoleic), indicating that high oleic acid accessions are always associated with a lesion in one of the FAD2 genes [1, 67]. In conclusion, the polymorphisms identified support the value of SNPs as markers for potential breeding strategies. The data also showed that EcoTILLING, especially when combined with pooling strategies, is a cost-effective, high-throughput, fast and efficient way to analyze genetic variation and to perform functional analyses in species with low variation like J. curcas.

31 in total

1. Inference of population structure using multilocus genotype data.

Authors: J K Pritchard; M Stephens; P Donnelly
Journal: Genetics Date: 2000-06 Impact factor: 4.562

2. Analysis of seed phorbol-ester and curcin content together with genetic diversity in multiple provenances of Jatropha curcas L. from Madagascar and Mexico.

Authors: Wei He; Andrew J King; M Awais Khan; Jesús A Cuevas; Danièle Ramiaramanana; Ian A Graham
Journal: Plant Physiol Biochem Date: 2011-07-23 Impact factor: 4.270

3. The genetical structure of populations.

Authors: S WRIGHT
Journal: Ann Eugen Date: 1951-03

4. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure.

Authors: Mattias Jakobsson; Noah A Rosenberg
Journal: Bioinformatics Date: 2007-05-07 Impact factor: 6.937

5. AFLP: a new technique for DNA fingerprinting.

Authors: P Vos; R Hogers; M Bleeker; M Reijans; T van de Lee; M Hornes; A Frijters; J Pot; J Peleman; M Kuiper
Journal: Nucleic Acids Res Date: 1995-11-11 Impact factor: 16.971

6. Targeted screening for induced mutations.

Authors: C M McCallum; L Comai; E A Greene; S Henikoff
Journal: Nat Biotechnol Date: 2000-04 Impact factor: 54.908

7. Palmitoyl-acyl carrier protein (ACP) thioesterase and the evolutionary origin of plant acyl-ACP thioesterases.

Authors: A Jones; H M Davies; T A Voelker
Journal: Plant Cell Date: 1995-03 Impact factor: 11.277

8. Molecular characterization and identification of markers for toxic and non-toxic varieties of Jatropha curcas L. using RAPD, AFLP and SSR markers.

Authors: D V N Sudheer Pamidimarri; Sweta Singh; Shaik G Mastan; Jalpa Patel; Muppala P Reddy
Journal: Mol Biol Rep Date: 2008-07-19 Impact factor: 2.316

9. Exome-wide DNA capture and next generation sequencing in domestic and wild species.

Authors: Ted Cosart; Albano Beja-Pereira; Shanyuan Chen; Sarah B Ng; Jay Shendure; Gordon Luikart
Journal: BMC Genomics Date: 2011-07-05 Impact factor: 3.969

Review 10. Jatropha curcas, a biofuel crop: functional genomics for understanding metabolic pathways and genetic improvement.

Authors: Fatemeh Maghuly; Margit Laimer
Journal: Biotechnol J Date: 2013-10 Impact factor: 4.677
