Sewalem Tsehay1, Rodomiro Ortiz1, Eva Johansson1, Endashaw Bekele2, Kassahun Tesfaye2,3, Cecilia Hammenhag1, Mulatu Geleta1.
Abstract
The development and use of genomic resources are essential for understanding the population genetics of crops for their efficient conservation and enhancement. Noug (Guizotia abyssinica) is an economically important oilseed crop in Ethiopia and India. The present study sought to develop new DNA markers for this crop. Transcriptome sequencing was conducted on two genotypes, and 628 transcript sequences containing 959 single nucleotide polymorphisms (SNPs) were developed. A competitive allele-specific PCR (KASP) assay was developed for the SNPs and used for genotyping of 24 accessions. A total of 554 loci were successfully genotyped across the accessions, and 202 polymorphic loci were used for population genetics analyses. Polymorphism information content (PIC) of the loci varied from 0.01 to 0.37 with a mean of 0.24, and about 49% of the loci showed significant deviation from the Hardy-Weinberg equilibrium. The mean expected heterozygosity was 0.27, suggesting moderately high genetic variation within accessions. Low but significant differentiation existed among accessions (FST = 0.045, p < 0.0001). Landrace populations from isolated areas may have useful mutations and should be conserved and used in breeding this crop. The genomic resources developed in this study were shown to be useful for population genetics research and can also be used in, e.g., association genetics.
Keywords: Guizotia; Hardy-Weinberg equilibrium; KASP markers; SNPs; genetic diversity; genotyping; heterozygosity; noug; population structure; transcriptome
Year: 2020 PMID: 33233626 PMCID: PMC7709008 DOI: 10.3390/genes11111373
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Maximum, minimum, and mean values of (A) observed number of alleles (Na), effective number of alleles (Ne), allele frequency (Af), expected heterozygosity (He), polymorphism information content (PIC), and fixation indices (FIS, FIT and FST) per locus; (B) sample size (N) per population and estimate of gene flow (Nm) per locus.
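The per-locus quantities named in this caption can be illustrated for a biallelic SNP with a short sketch. It assumes the textbook definitions (He = 2pq, Ne = 1/(p² + q²), the Botstein-type PIC, and Wright's island-model approximation Nm = (1/FST − 1)/4). The function names are hypothetical, and the source does not state which software or estimators the authors actually used.

```python
def locus_stats(p):
    """Diversity statistics for one biallelic SNP with allele frequency p.

    Assumption: textbook formulas, not necessarily the authors' pipeline.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    homozygosity = p * p + q * q
    he = 1.0 - homozygosity            # expected heterozygosity, equal to 2pq
    ne = 1.0 / homozygosity            # effective number of alleles
    pic = he - 2.0 * p * p * q * q     # PIC for the two-allele case
    return {"He": he, "Ne": ne, "PIC": pic}


def gene_flow_nm(fst):
    """Island-model approximation of gene flow: Nm = (1/FST - 1) / 4."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1]")
    return 0.25 * (1.0 / fst - 1.0)


if __name__ == "__main__":
    for freq in (0.05, 0.25, 0.5):
        print(freq, locus_stats(freq))
```

Under these formulas, PIC for a biallelic locus cannot exceed 0.375 (reached at p = 0.5), which is in line with the upper end of the PIC range reported in the abstract.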
Figure 2. Pie chart of the SNP loci that deviated significantly from Hardy-Weinberg equilibrium (HWE), showing their proportions in terms of heterozygote excess and deficiency at different levels of significance.
Description of polymorphic single nucleotide polymorphism (SNP) loci that showed highly significant deviation from Hardy-Weinberg equilibrium (HWE), and the corresponding sunflower (Helianthus annuus) homologs of the noug genes harboring the SNPs.
| Noug Contig | SNP Locus | SNP PINC c | Ref_Alt d,e | Missing Genotype | MAF f | Sunflower Homolog |
|---|---|---|---|---|---|---|
| CL3143Contig1 | 3143A a | 376 | G_C | GG | 0.131 | 17.8 kDa class I heat shock protein-like_LOC110904834: XM_022150704; 212..484 |
| CL3143Contig1 | 3143B a | 388 | C_A | AA | 0.120 | 17.8 kDa class I heat shock protein-like_LOC110904834: XM_022150704; 212..484 |
| CCHT13019.b1F16.ab1 | 13,019A a | 30 | C_A | AA | 0.389 | Two-component response regulator-like PRR73_LOC110868813: XM_022118076; 1133..1903 |
| CCHT13019.b1F16.ab1 | 13,019C b | 372 | T_A | AT | 0.448 | Two-component response regulator-like PRR73_LOC110868813: XM_022118076; 1133..1903 |
| CCHT3719.b1M18.ab1 | 3719A a | 502 | A_G | AA | 0.239 | TPR2-like protein_LOC110930065: XM_022173292; 2716..3466 |
| CCHT4593.b1B22.ab1 | 4593B a | 274 | A_C | None | 0.389 | Calnexin homolog_LOC110865890: XM_022115225; 1115..1583 |
| CCHT4736.b1P07.ab1 | 4736 a | 498 | G_C | GG | 0.293 | Cytochrome P450 CYP82D47-like_LOC110879526: XM_022127994; 1210..1778 |
| CCHT7954.b1C22.ab1 | 7954A a | 387 | T_C | CC | 0.371 | Uncharacterized protein_LOC110878690: XM_022127042; 145..855 |
| CCHT8585.b1B11.ab1 | 8585 a | 280 | C_T | CC | 0.149 | Uncharacterized protein_LOC110927216: XM_022170855; 1193..1751 |
| CCHT10160.b1P19.ab1 | 10,160 a | 328 | G_C | GG | 0.441 | UDP-arabinopyranose mutase 1-like_LOC110915474: XM_022160180; 62..679 |
| CCHT13180.b1H07.ab1 | 13,180 a | 460 | G_A | None | 0.432 | 40S ribosomal protein S13_LOC110937753: XM_022180210; 4..544 |
| CCHT17789.b1J07.ab1 | 17,789 a | 698 | C_A | CC | 0.245 | Uncharacterized 38.1 kDa protein-like_LOC110867109: XM_022116245; 208..521 |
| CCHT20996.b1G18.ab1 | 20,996B a | 636 | C_T | CC | 0.470 | Heat shock 70 kDa protein 14 like_LOC110894223: XM_022141417; 93..831 |
| CCHT17807.b1N11.ab1 | 17,807 b | 350 | T_G | GT | 0.250 | Nucleobase-ascorbate transporter 6-like_LOC110941756: XM_022183418; 123..833 |
| CCHT17571.b1F02.ab1 | 17,571 b | 408 | T_A | AT | 0.470 | Probable ADP-ribosylation factor GTPase-activating protein AGD14_LOC110886873: XM_022134739; 92..748 |
| CCHT4779.b1F19.ab1 | 4779 b | 255 | A_C | AC | 0.185 | Probable E3 ubiquitin-protein ligase ARI1_LOC110929292: XM_022172423; 1130..1867 |
a = heterozygote excess; b = heterozygote deficit. c SNP PINC = SNP position in noug contig. d Ref_Alt = Reference allele_Alternate allele; e All SNPs resulted in synonymous amino acid substitution except SNP at locus 3143A that led to the non-synonymous substitution of Serine vs. Arginine, at position 99 of the amino acid sequence of the 17.8 kDa class I heat shock protein-like protein; f MAF = minor allele frequency.
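To make the "heterozygote excess" and "heterozygote deficit" labels concrete, the sketch below runs a chi-square goodness-of-fit test against Hardy-Weinberg proportions for one biallelic locus. This is an illustration only. The source does not say which HWE test the authors applied, the function name and genotype counts are hypothetical, and an exact test may be preferable for small samples.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square HWE test (1 df) for genotype counts AA, AB, BB.

    Returns (chi2, p_value, f), where f = 1 - Ho/He.
    f < 0 indicates heterozygote excess; f > 0 indicates a deficit.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)    # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("locus is monomorphic; HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    f = 1.0 - n_ab / (2 * n * p * q)
    return chi2, p_value, f


if __name__ == "__main__":
    # Hypothetical counts: many heterozygotes, few homozygotes.
    print(hwe_chi_square(10, 80, 10))
```

A locus with no observed individuals of one homozygous class (the "Missing Genotype" column) tends to produce a negative f under this calculation when heterozygotes are common, which corresponds to the heterozygote-excess category.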
Summary of genetic diversity estimates for 24 noug accessions based on 202 single nucleotide polymorphism (SNP) loci or subsets thereof, and mean values for different groups and all accessions.
| Acc | PPL | I | Ho | He | uHe | F | Theta | EAED1 | EAED2 | EAED3 |
|---|---|---|---|---|---|---|---|---|---|---|
| NG086 a | 75.7 | 0.391 | 0.238 | 0.262 | 0.274 | 0.105 | 2.016 | 0.313 | 0.554 | 0.207 |
| NG088 a | 77.7 | 0.396 | 0.242 | 0.263 | 0.276 | 0.092 | 2.004 | 0.276 | 0.468 | 0.198 |
| NG089 a | 75.7 | 0.387 | 0.248 | 0.258 | 0.270 | 0.042 | 2.038 | 0.294 | 0.528 | 0.201 |
| NG090 a | 80.1 | 0.403 | 0.244 | 0.268 | 0.280 | 0.084 | 1.978 | 0.274 | 0.489 | 0.187 |
| NG092 a | 84.0 | 0.419 | 0.235 | 0.278 | 0.290 | 0.165 | 1.927 | 0.304 | 0.541 | 0.206 |
| NG095 a | 80.6 | 0.415 | 0.256 | 0.277 | 0.289 | 0.087 | 1.934 | 0.285 | 0.497 | 0.196 |
| NG096 a | 73.8 | 0.381 | 0.259 | 0.255 | 0.267 | 0.017 | 2.057 | 0.257 | 0.454 | 0.176 |
| NG097 a | 76.7 | 0.384 | 0.224 | 0.254 | 0.266 | 0.122 | 2.063 | 0.256 | 0.430 | 0.183 |
| NG098 a | 78.6 | 0.403 | 0.256 | 0.267 | 0.279 | 0.060 | 1.984 | 0.280 | 0.436 | 0.216 |
| NG099 a | 76.7 | 0.400 | 0.261 | 0.266 | 0.288 | 0.011 | 1.968 | n/c | n/c | n/c |
| NG101 a | 79.1 | 0.405 | 0.259 | 0.270 | 0.282 | 0.047 | 2.012 | 0.248 | 0.396 | 0.187 |
| NG103 a | 80.1 | 0.396 | 0.245 | 0.263 | 0.275 | 0.067 | 2.076 | 0.218 | 0.380 | 0.146 |
| NG105 a | 78.6 | 0.395 | 0.259 | 0.263 | 0.274 | 0.057 | 1.971 | 0.249 | 0.386 | 0.195 |
| NG106 a | 77.2 | 0.382 | 0.227 | 0.252 | 0.264 | 0.103 | 2.067 | 0.286 | 0.548 | 0.188 |
| NG107 a | 75.7 | 0.402 | 0.246 | 0.270 | 0.282 | 0.086 | 1.959 | 0.286 | 0.496 | 0.201 |
| NG108 a | 70.9 | 0.380 | 0.237 | 0.255 | 0.269 | 0.081 | 1.974 | 0.272 | 0.484 | 0.190 |
| NG109 a | 79.6 | 0.408 | 0.244 | 0.272 | 0.284 | 0.115 | 2.031 | 0.306 | 0.465 | 0.242 |
| NG111 a | 80.1 | 0.405 | 0.234 | 0.269 | 0.281 | 0.134 | 2.045 | 0.278 | 0.568 | 0.165 |
| NG112 a | 79.1 | 0.392 | 0.233 | 0.259 | 0.271 | 0.110 | 1.940 | 0.295 | 0.532 | 0.203 |
| NG113 a | 77.7 | 0.388 | 0.237 | 0.257 | 0.269 | 0.073 | 2.073 | 0.274 | 0.453 | 0.203 |
| NG123 a | 82.5 | 0.414 | 0.256 | 0.275 | 0.288 | 0.090 | 2.008 | 0.283 | 0.462 | 0.214 |
| NG124 b | 72.8 | 0.379 | 0.217 | 0.253 | 0.264 | 0.133 | 1.936 | 0.287 | 0.493 | 0.204 |
| Fogera c | 80.1 | 0.407 | 0.253 | 0.270 | 0.283 | 0.072 | 1.969 | 0.300 | 0.519 | 0.213 |
| Shambu c | 81.1 | 0.398 | 0.238 | 0.262 | 0.274 | 0.091 | 2.011 | 0.284 | 0.487 | 0.205 |
| Mean_Alt-1 | 77.8 | 0.397 | 0.245 | 0.264 | 0.277 | 0.088 | 2.005 | 0.275 | 0.447 | 0.205 |
| Mean_Alt-2 | 77.5 | 0.397 | 0.244 | 0.265 | 0.277 | 0.086 | 2.000 | 0.289 | 0.517 | 0.196 |
| Mean_Alt-3 | 79.0 | 0.398 | 0.245 | 0.265 | 0.278 | 0.075 | 2.013 | 0.265 | 0.470 | 0.182 |
| Mean_Reg-1 | 78.8 | 0.401 | 0.251 | 0.266 | 0.279 | 0.076 | 2.006 | 0.281 | 0.458 | 0.211 |
| Mean_Reg-2 | 73.8 | 0.382 | 0.240 | 0.255 | 0.267 | 0.073 | 2.031 | 0.262 | 0.456 | 0.183 |
| Mean_Reg-3 | 79.7 | 0.405 | 0.244 | 0.270 | 0.282 | 0.100 | 1.983 | 0.285 | 0.495 | 0.197 |
| Mean_Reg-4 | 77.4 | 0.391 | 0.236 | 0.260 | 0.272 | 0.093 | 2.010 | 0.285 | 0.507 | 0.199 |
| Mean_Reg-5 | 80.1 | 0.396 | 0.245 | 0.263 | 0.275 | 0.067 | 2.076 | 0.218 | 0.380 | 0.146 |
| Mean_Reg-6 | 79.1 | 0.407 | 0.250 | 0.271 | 0.286 | 0.077 | 1.982 | 0.282 | 0.533 | 0.181 |
| Mean_Landrace | 78.1 | 0.397 | 0.245 | 0.264 | 0.277 | 0.083 | 2.006 | 0.277 | 0.478 | 0.195 |
| Mean_Cultivar | 80.6 | 0.4025 | 0.245 | 0.266 | 0.278 | 0.081 | 1.99 | 0.292 | 0.503 | 0.209 |
| Mean_all | 78.1 | 0.397 | 0.244 | 0.264 | 0.277 | 0.085 | 2.002 | 0.278 | 0.481 | 0.197 |
| SE_all | 0.006 | 0.004 | 0.003 | 0.003 | 0.003 | 0.006 | 0.050 | 0.005 | 0.011 | 0.004 |
Acc = accessions (a landrace populations, b breeding population, c cultivars); PPL = percent polymorphic loci; I = Shannon’s information index; Ho = observed heterozygosity; He = expected heterozygosity; uHe = unbiased expected heterozygosity; F = fixation index; Theta (H) = Theta from mean heterozygosity under the stepwise mutation model [52]; EAED1, EAED2, and EAED3 = estimates of average evolutionary divergence over sequence pairs within populations for (a) all polymorphic loci, (b) loci with an allele frequency from 0.3 to 0.7, inclusive, and (c) loci with an allele frequency below 0.3 or above 0.7, respectively, estimated based on the Tajima-Nei model [53]. Note: In all cases, genotypes with more than 7% missing data were excluded. The Pearson correlation coefficients for EAED1 vs. EAED2, EAED1 vs. EAED3, and EAED2 vs. EAED3 were 0.79 (p < 0.001), 0.75 (p < 0.001), and 0.20 (p = 0.35), respectively. The 21 landrace accessions were grouped into three altitudinal groups and six regional groups (see Table S1).
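As an illustration of how the column definitions above relate to one another, the following sketch computes Ho, He, uHe, F, and Shannon's I for a single biallelic locus from genotype counts. It assumes the commonly used definitions (uHe = 2N/(2N − 1) × He, F = 1 − Ho/He, I = −Σ pᵢ ln pᵢ). The function name and counts are hypothetical, and the software the authors used is not stated here.

```python
import math


def diversity_from_genotypes(n_aa, n_ab, n_bb):
    """Per-locus Ho, He, uHe, F and Shannon's I from genotype counts.

    Assumption: standard biallelic formulas; not confirmed as the
    authors' exact procedure.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    ho = n_ab / n                               # observed heterozygosity
    he = 2.0 * p * q                            # expected heterozygosity
    uhe = (2.0 * n / (2.0 * n - 1.0)) * he      # small-sample correction
    f = 1.0 - ho / he if he > 0 else float("nan")
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0)
    return {"Ho": ho, "He": he, "uHe": uhe, "F": f, "I": shannon}


if __name__ == "__main__":
    # Hypothetical accession with 12 genotyped plants at one locus.
    print(diversity_from_genotypes(3, 6, 3))
```

Because the table reports means over loci, an accession-level F need not equal 1 − (mean Ho)/(mean He). For NG086, for example, that ratio gives about 0.092, whereas the tabulated F is 0.105.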
Analysis of molecular variance (AMOVA), based on 1023 permutations, for 24 accessions without grouping, and for 21 accessions by grouping them according to altitudinal range or regions of origin.
| Source of Variation | DF | Sum of Squares | Variance Components | Percentage of Variation | Fixation Indices | Probability |
|---|---|---|---|---|---|---|
| Among accessions | 23 | 1078.2 | 1.019 Va | 4.52 | FST = 0.045 | Va & FST < 0.0001 |
| Among individuals within accessions | 259 | 5921.0 | 1.321 Vb | 5.86 | FIS = 0.061 | Vb & FIS < 0.0001 |
| Within individuals | 283 | 5722.0 | 20.219 Vc | 89.63 | FIT = 0.104 | Vc & FIT < 0.0001 |
| Total | 565 | 12,721.3 | 22.556 | |||
| a Among alt groups | 2 | 97.9 | 0.026 Va | 0.12 | FCT = 0.001 | Va & FCT = 0.1935 |
| Among accessions within alt groups | 18 | 800.1 | 1.015 Vb | 4.67 | FSC = 0.047 | Vb & FSC < 0.0001 |
| Within accessions | 471 | 9746.7 | 20.694 Vc | 95.21 | FST = 0.048 | Vc & FST < 0.0001 |
| Total | 491 | 10,644.8 | 21.735 | |||
| b Among regions-I | 5 | 217.0 | −0.027 Va | −0.12 | FCT = −0.001 | Va & FCT = 0.6715 |
| Among accessions within regions-I | 15 | 681.0 | 1.056 Vb | 4.86 | FSC = 0.048 | Vb & FSC < 0.0001 |
| Within accessions | 471 | 9746.7 | 20.694 Vc | 95.26 | FST = 0.047 | Vc & FST < 0.0001 |
| Total | 491 | 10,644.8 | 21.723 | |||
| c Among regions-II | 1 | 55.4 | 0.044 Va | 0.20 | FCT = 0.002 | Va & FCT = 0.047 |
| Among accessions within regions-II | 19 | 842.6 | 1.011 Vb | 4.65 | FSC = 0.047 | Vb & FSC < 0.0001 |
| Within accessions | 471 | 9746.7 | 20.694 Vc | 95.15 | FST = 0.048 | Vc & FST < 0.0001 |
| Total | 491 | 10,644.8 | 21.723 |
Note: 21 of the 24 accessions were grouped according to region or altitudinal (alt) range of origin. The two cultivars and the breeding population were excluded from the grouping, as they cannot be placed in any of the groups. a The 21 accessions were grouped into three altitudinal groups: 1400–1680 m above sea level (masl), 1820–1968 masl, and 2045–2590 masl. b The 21 accessions were grouped into six regions (regions-I), and c the 21 accessions were grouped into two regions (regions-II) (Table S1).
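The fixation indices in the AMOVA table follow from the variance components by the usual ratios, which the sketch below reproduces. The function names are hypothetical, and the formulas are the standard AMOVA definitions rather than anything stated explicitly in this record. Small discrepancies with tabulated values can arise because the published components are rounded.

```python
def fixation_ungrouped(va, vb, vc):
    """Indices for: among accessions (Va), among individuals within
    accessions (Vb), within individuals (Vc)."""
    total = va + vb + vc
    return {
        "FST": va / total,
        "FIS": vb / (vb + vc),
        "FIT": (va + vb) / total,
    }


def fixation_grouped(va, vb, vc):
    """Indices for: among groups (Va), among accessions within groups
    (Vb), within accessions (Vc)."""
    total = va + vb + vc
    return {
        "FCT": va / total,
        "FSC": vb / (vb + vc),
        "FST": (va + vb) / total,
    }


if __name__ == "__main__":
    # Variance components taken from the table above.
    ungrouped = fixation_ungrouped(1.019, 1.321, 20.219)
    altitude = fixation_grouped(0.026, 1.015, 20.694)
    print({k: round(v, 3) for k, v in ungrouped.items()})
    print({k: round(v, 3) for k, v in altitude.items()})
```

With the tabulated components, the ungrouped analysis gives FST ≈ 0.045, FIS ≈ 0.061, and FIT ≈ 0.104, and the altitudinal grouping gives FCT ≈ 0.001, FSC ≈ 0.047, and FST ≈ 0.048, matching the table.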
Analysis of molecular variance (AMOVA)-based pairwise FST between the 24 populations with 1023 permutations (below the diagonal) and mean FST of each accession against all other accessions (diagonal).
| Acc | 086 | 088 | 089 | 090 | 092 | 095 | 096 | 097 | 098 | 099 | 101 | 103 | 105 | 106 | 107 | 108 | 109 | 111 | 112 | 113 | 123 | 124 | Fog | Sha |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 086 |
| 088 | 0.031 |
| 089 | 0.026 | 0.041 |
| 090 | 0.031 | 0.055 | 0.041 |
| 092 | 0.023 | 0.026 | 0.046 | 0.041 |
| 095 | 0.047 | 0.031 | 0.051 | 0.056 | 0.043 |
| 096 | 0.054 | 0.075 | 0.076 | 0.081 | 0.061 | 0.095 |
| 097 | 0.069 | 0.066 | 0.065 | 0.082 | 0.065 | 0.064 | 0.124 |
| 098 | 0.046 | 0.061 | 0.037 | 0.073 | 0.037 | 0.074 | 0.079 | 0.080 |
| 099 | 0.030 | 0.053 | 0.042 | 0.053 | 0.040 | 0.048 | 0.078 | 0.082 | 0.066 |
| 101 | 0.034 | 0.032 | 0.053 | 0.065 | 0.050 | 0.064 | 0.080 | 0.078 | 0.057 | 0.075 |
| 103 | 0.024 | 0.031 | 0.031 | 0.031 | 0.017 | 0.041 | 0.075 | 0.069 | 0.052 | 0.027 | 0.036 |
| 105 | 0.034 | 0.033 | 0.043 | 0.045 | 0.042 | 0.046 | 0.069 | 0.084 | 0.049 | 0.054 | 0.048 | 0.038 |
| 106 | 0.031 | 0.036 | 0.039 | 0.044 | 0.036 | 0.049 | 0.072 | 0.069 | 0.043 | 0.054 | 0.042 | 0.024 | 0.035 |
| 107 | 0.023 | 0.039 | 0.039 | 0.033 | 0.033 | 0.043 | 0.069 | 0.078 | 0.047 | 0.034 | 0.042 | 0.025 | 0.035 | 0.035 |
| 108 | 0.038 | 0.057 | 0.059 | 0.057 | 0.031 | 0.056 | 0.103 | 0.091 | 0.042 | 0.068 | 0.068 | 0.043 | 0.051 | 0.043 | 0.038 |
| 109 | 0.050 | 0.037 | 0.050 | 0.061 | 0.040 | 0.050 | 0.072 | 0.064 | 0.048 | 0.059 | 0.047 | 0.029 | 0.056 | 0.050 | 0.054 | 0.068 |
| 111 | 0.025 | 0.029 | 0.028 | 0.041 | 0.027 | 0.042 | 0.050 | 0.086 | 0.056 | 0.025 | 0.045 | 0.024 | 0.037 | 0.045 | 0.034 | 0.055 | 0.042 |
| 112 | 0.034 | 0.046 | 0.053 | 0.049 | 0.039 | 0.045 | 0.072 | 0.089 | 0.050 | 0.037 | 0.057 | 0.047 | 0.041 | 0.046 | 0.031 | 0.063 | 0.060 | 0.043 |
| 113 | 0.038 | 0.059 | 0.053 | 0.041 | 0.050 | 0.057 | 0.066 | 0.075 | 0.072 | 0.023 | 0.061 | 0.030 | 0.062 | 0.055 | 0.033 | 0.070 | 0.061 | 0.046 | 0.041 |
| 123 | 0.026 | 0.007 * | 0.020 | 0.039 | 0.018 | 0.044 | 0.057 | 0.055 | 0.034 | 0.037 | 0.027 | 0.019 | 0.021 | 0.025 | 0.023 | 0.043 | 0.047 | 0.006 * | 0.037 | 0.043 |
| 124 | 0.065 | 0.056 | 0.077 | 0.091 | 0.068 | 0.069 | 0.090 | 0.083 | 0.086 | 0.062 | 0.078 | 0.072 | 0.084 | 0.074 | 0.065 | 0.093 | 0.076 | 0.060 | 0.083 | 0.065 | 0.058 |
| Fog | 0.028 | 0.019 | 0.040 | 0.042 | 0.025 | 0.033 | 0.055 | 0.058 | 0.038 | 0.031 | 0.026 | 0.021 | 0.046 | 0.032 | 0.031 | 0.038 | 0.043 | 0.028 | 0.031 | 0.035 | 0.011 * | 0.038 |
| Sha | 0.012 * | 0.023 | 0.023 | 0.036 | 0.019 | 0.040 | 0.054 | 0.065 | 0.030 | 0.034 | 0.031 | 0.007 * | 0.028 | 0.020 | 0.020 | 0.023 | 0.021 | 0.011 * | 0.030 | 0.032 | 0.010 * | 0.064 | 0.010 * |
* = No significant differentiation between the pair of accessions. The first column and row are accession names without their two initial letters (NG) for the first 22 accessions. Acc = Accession; Fog = Fogera and Sha = Shambu. Bold: mean FST values of each accession against all other accessions.
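The "mean FST of each accession against all other accessions" described in the caption can be obtained from a lower-triangular matrix by mirroring it and averaging each row without the diagonal. The sketch below shows this on a four-accession excerpt of the table (086, 088, 089, 090), so the resulting means are illustrative and do not equal the values computed over all 24 accessions. The function name is hypothetical.

```python
def mean_fst_per_accession(labels, lower_rows):
    """Mean pairwise FST of each accession against all others.

    lower_rows[i] holds the i values below the diagonal in row i,
    i.e. FST between accession i and accessions 0..i-1.
    """
    n = len(labels)
    if len(lower_rows) != n:
        raise ValueError("one row per label is required")
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower_rows):
        if len(row) != i:
            raise ValueError(f"row {i} must contain {i} values")
        for j, value in enumerate(row):
            full[i][j] = value      # below the diagonal
            full[j][i] = value      # mirrored above the diagonal
    return {
        labels[i]: sum(full[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    }


# Excerpt of the pairwise FST table (first four accessions only).
labels = ["086", "088", "089", "090"]
lower = [[], [0.031], [0.026, 0.041], [0.031, 0.055, 0.041]]
means = mean_fst_per_accession(labels, lower)

if __name__ == "__main__":
    for name, value in means.items():
        print(name, round(value, 3))
```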
Figure 3. Neighbor-joining tree of 126 individuals representing the 24 accessions generated based on loci with a minor allele frequency of <0.3, using evolutionary distances computed by the Tajima-Nei method (Tajima and Nei 1984). The individual samples were coded so that the first two or three digits/letters represent their accessions and the last two-digit numbers represent the codes for the plant in that accession. The accession names are given without the two initial letters (NG), and Fog and Sha represent Fogera and Shambu, respectively. Individuals represented by the same shape and color belong to the same accession.
Figure 4. Neighbor-joining tree of the 24 accessions generated based on (A) all loci, (B) loci with a minor allele frequency of <0.3, and (C) loci with a minor allele frequency of ≥0.3 using evolutionary distances computed by the Tajima-Nei method [53]. The accession names are given without the two initial letters (NG) for 22 of the 24 accessions, and “Fog” and “Sha” represent Fogera and Shambu, respectively. Accessions represented by the same shape and color belong to the same region, and accessions represented with the same color tree-line belong to the same altitudinal range.
Figure 5. (A) Two-dimensional plot based on principal coordinate analysis (PCoA) for the 24 accessions, in which the first and the second axes explained 24% and 20% of the total variation, respectively; (B) ΔK plot showing its maximum value at K = 3, suggesting that the optimal number of genetic clusters (populations) is three; and (C) graphical representation of the population genetic structure of the 24 accessions for K = 3. The three colors represent the three clusters, and the proportion of each color in each accession represents the average proportion of the alleles that placed each accession under the three clusters.