Mukhlesur Rahman1, Ahasanul Hoque1,2, Jayanta Roy1.
Abstract
Estimating genetic diversity in rapeseed is important for a sustainable breeding program because it provides options for developing new breeding lines. The objective of this study was to elucidate the patterns of genetic diversity within and among different structural groups and to measure the extent of linkage disequilibrium (LD) in 383 globally distributed rapeseed germplasm accessions using 8,502 single nucleotide polymorphism (SNP) markers. We divided the germplasm collection into five subpopulations (P1 to P5) according to geographic and growth habit-related patterns. All subpopulations showed moderate genetic diversity (average H = 0.22 and I = 0.34). Pairwise Fst comparisons revealed a high degree of divergence (Fst > 0.24) between most combinations. The rutabaga type showed the highest divergence from the spring and winter types. High divergence was also found between the winter and spring types. Admixture model-based structure analysis, principal component analysis, and neighbor-joining tree analysis placed all subpopulations into three distinct clusters. Admixed genotypes constituted 29.24% of all genotypes, while the remaining 70.76% belonged to the identified clusters. Overall, mean LD was 0.03, and it decayed to half its maximum within < 45 kb across the whole genome. LD decay was slower in the C genome (< 93 kb) than in the A genome (< 21 kb), which was confirmed by the presence of larger haplotype blocks in the C genome than in the A genome. The findings on LD pattern and population structure will help in using the collection as a resource for association mapping to identify genes useful in crop improvement, as well as for selecting parents for hybrid breeding.
Year: 2022 PMID: 35231054 PMCID: PMC8887726 DOI: 10.1371/journal.pone.0250310
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Chromosome-wise distribution of SNP markers.
| Chromosome | No. of SNPs | % SNPs | Start position (bp) a | End position (bp) b | Length (Mb) | Density (kb/SNP) c |
|---|---|---|---|---|---|---|
| A1 | 440 | 5.18 | 149163 | 35806075 | 35.7 | 81.0 |
| A2 | 392 | 4.61 | 13430 | 34692905 | 34.7 | 88.5 |
| A3 | 685 | 8.06 | 2769 | 49103583 | 49.1 | 71.7 |
| A4 | 236 | 2.78 | 32805 | 23517671 | 23.5 | 99.5 |
| A5 | 413 | 4.86 | 18668 | 31435105 | 31.4 | 76.1 |
| A6 | 448 | 5.27 | 120409 | 36005103 | 35.9 | 80.1 |
| A7 | 384 | 4.52 | 85869 | 27388322 | 27.3 | 71.1 |
| A8 | 281 | 3.31 | 231427 | 27734410 | 27.5 | 97.9 |
| A9 | 541 | 6.36 | 81404 | 45841268 | 45.8 | 84.6 |
| A10 | 305 | 3.59 | 133853 | 22085737 | 22.0 | 72.0 |
| C1 | 445 | 5.23 | 86671 | 50660872 | 50.6 | 113.7 |
| C2 | 589 | 6.93 | 92431 | 68260222 | 68.2 | 115.7 |
| C3 | 651 | 7.66 | 3839 | 80365889 | 80.36 | 123.4 |
| C4 | 634 | 7.46 | 138930 | 70507417 | 70.4 | 111.0 |
| C5 | 366 | 4.30 | 26760 | 44124497 | 44.1 | 120.5 |
| C6 | 414 | 4.87 | 275190 | 45479327 | 45.2 | 109.2 |
| C7 | 518 | 6.09 | 271113 | 62304827 | 62.0 | 119.8 |
| C8 | 383 | 4.50 | 57934 | 46317429 | 46.3 | 120.8 |
| C9 | 377 | 4.43 | 920885 | 51627086 | 50.7 | 134.5 |
| Mean | 447.47 | | | | | 99.5 |
a Position of the first marker on a particular chromosome in the reference genome.
b Position of the last marker on a particular chromosome in the reference genome.
c Density was calculated by dividing the chromosome length by the number of markers.
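Footnote c can be checked directly against the table. The sketch below assumes that "length" means the span between the first and last marker positions (in bp) and that density is expressed in kb per SNP; the function name `snp_density_kb` is illustrative and does not come from the study.

```python
def snp_density_kb(start_bp: int, end_bp: int, n_snps: int) -> float:
    """Average marker spacing in kb per SNP (an assumed reading of footnote c)."""
    if n_snps <= 0:
        raise ValueError("n_snps must be positive")
    span_kb = (end_bp - start_bp) / 1000  # covered length in kb
    return span_kb / n_snps


# (start, end, number of SNPs) copied from the table rows for A1, A2 and C9
rows = {
    "A1": (149163, 35806075, 440),
    "A2": (13430, 34692905, 392),
    "C9": (920885, 51627086, 377),
}

for chrom, (start, end, n) in rows.items():
    print(chrom, round(snp_density_kb(start, end, n), 1))
```

For these three rows the sketch gives 81.0, 88.5 and 134.5, which matches the Density column.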
Fig 1. Chromosome-wise SNP density map.
Frequency of SNPs varies according to color gradient.
Transition and transversion SNPs across the genome.
| Genome | SNP type | Model | No. of sites | Frequencies (%) | Total (percentage) |
|---|---|---|---|---|---|
| A | Transitions | A/G | 1195 | 14.06 | 2416 (28.3%) |
| | | C/T | 1221 | 14.36 | |
| | Transversions | A/T | 457 | 5.38 | 1709 (20.1%) |
| | | A/C | 424 | 4.99 | |
| | | G/T | 460 | 5.41 | |
| | | G/C | 368 | 4.33 | |
| C | Transitions | A/G | 1273 | 14.97 | 2540 (29.9%) |
| | | C/T | 1267 | 14.90 | |
| | Transversions | A/T | 496 | 5.83 | 1837 (21.6%) |
| | | A/C | 482 | 5.67 | |
| | | G/T | 494 | 5.81 | |
| | | G/C | 365 | 4.29 | |
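The classification used in this table can be written as a short rule: purine-to-purine (A/G) and pyrimidine-to-pyrimidine (C/T) changes are transitions, and every other pair is a transversion. The sketch below is illustrative only. The transition/transversion ratio it prints is derived here from the table totals and is not a value reported in the source, and the function names are assumptions.

```python
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def snp_class(allele1: str, allele2: str) -> str:
    """Classify a biallelic SNP as a transition or a transversion."""
    pair = frozenset((allele1.upper(), allele2.upper()))
    if len(pair) != 2 or not pair <= set("ACGT"):
        raise ValueError(f"not a biallelic SNP: {allele1}/{allele2}")
    return "transition" if pair in TRANSITIONS else "transversion"


def ts_tv_ratio(n_transitions: int, n_transversions: int) -> float:
    """Ratio of transition to transversion counts."""
    return n_transitions / n_transversions


# Transition and transversion totals taken from the table, per genome
counts = {"A": (2416, 1709), "C": (2540, 1837)}
for genome, (ts, tv) in counts.items():
    print(genome, round(ts_tv_ratio(ts, tv), 2))
```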
Fig 2. Bayesian clustering of the whole collection using 8,502 SNP markers in STRUCTURE v. 2.3.4.
Graphical representation of the optimal number of clusters (K) determined by Evanno’s method [37] with genotypes unassigned (A) and assigned (B) to their respective countries, as well as by the Puechmaille [39] and Li and Liu [40] methods (C). Estimated population structure of 383 rapeseed genotypes at K = 3 (D) using the Puechmaille [39] and Li and Liu [40] methods.
Clustering of core collection based on Evanno et al. (2005) [37] and Puechmaille et al. (2016) [39] methods using different combinations of burn-in lengths and Markov Chain Monte Carlo (MCMC) lengths.
| Run # | Burn-in length | MCMC length | Number of clusters (K) | Number of reps | No. of populations, ΔK (unassigned) α,a | No. of populations, ΔK (assigned) α,b | No. of populations, MedMedK β | No. of populations, MedMeaK β | No. of populations, MaxMedK β | No. of populations, MaxMeaK β |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5000 | 5000 | 10 | 10 | 3 | 6 | 3 | 3 | 3 | 4 |
| 2 | 10000 | 10000 | 10 | 10 | 8 | 8 | 3 | 3 | 3 | 3 |
| 3 | 20000 | 20000 | 10 | 10 | 8 | 3 | 3 | 3 | 3 | 3 |
| 4 | 20000 | 50000 | 10 | 10 | 8 | 3 | 3 | 3 | 3 | 3 |
| 5 | 50000 | 50000 | 10 | 10 | 9 | 6 | 3 | 3 | 3 | 3 |
| 6 | 50000 | 100000 | 10 | 10 | 9 | 3 | 3 | 3 | 3 | 3 |
| 7 | 100000 | 100000 | 10 | 10 | 3 | 7 | 3 | 3 | 3 | 4 |
α The ad hoc ΔK method [31]
a Accessions unassigned to any subpopulation.
b Accessions assigned to subpopulations based on type and origin.
β The median (MedMedK and MaxMedK) or mean (MedMeaK and MaxMeaK) [33] estimators used to determine the number of clusters (K).
Proportion of admixed and non-admixed accessions per subpopulation based on membership coefficients.
| Cluster (rows) / core collection subpopulation based on type and origin (columns) | P1: Winter (151) | P2: Semi-winter (60) | P3: Spring_mixed origin (88) | P4: Spring_NDSU (67) | P5: Rutabaga (17) | Total number |
|---|---|---|---|---|---|---|
| K1 | 3 | 12 | 58 | 39 | 0 | 112 |
| K2 | 15 | 12 | 8 | 0 | 17 | 52 |
| K3 | 99 | 6 | 1 | 1 | 0 | 107 |
| Admixture | 34 | 30 | 21 | 27 | 0 | 112 |
| In-Cluster | 77.48% | 50% | 76.13% | 59.71% | 100% | 70.76% |
| Admixture | 22.52% | 50% | 23.87% | 40.29% | 0% | 29.24% |
a Genotypes with q ≥ 0.7 were assigned to a specific cluster.
b Genotypes with q < 0.7 were considered admixed.
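The assignment rule in footnotes a and b can be sketched as follows. The membership vectors are hypothetical and only illustrate the threshold; the helper name `assign_cluster` is an assumption, and ties between equal q values are resolved in favour of the first cluster, which the source does not specify. The last two lines recompute the overall percentages from the counts in the table (112 admixed out of 383 genotypes).

```python
def assign_cluster(q, threshold=0.7):
    """Return 'K1', 'K2', ... for the largest membership coefficient
    if it reaches the threshold, otherwise 'Admixture'."""
    best = max(range(len(q)), key=lambda i: q[i])
    return f"K{best + 1}" if q[best] >= threshold else "Admixture"


# Hypothetical membership coefficient vectors (not data from the study)
examples = {
    "genotype_1": [0.85, 0.10, 0.05],
    "genotype_2": [0.50, 0.30, 0.20],
    "genotype_3": [0.10, 0.20, 0.70],
}
for name, q in examples.items():
    print(name, assign_cluster(q))

# Cross-check of the overall shares using the counts in the table
n_total, n_admixed = 383, 112
print(round(100 * n_admixed / n_total, 2))              # admixed share (%)
print(round(100 * (n_total - n_admixed) / n_total, 2))  # in-cluster share (%)
```

The recomputed shares are 29.24% admixed and 70.76% in-cluster, in agreement with the figures quoted in the abstract.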
Fig 3. Principal component analysis of SNP diversity based on genetic distance.
Colors represent subpopulations.
Fig 4. Phylogenetic tree (unrooted) based on the neighbor-joining (NJ) algorithm using information from 8,502 SNP markers and 1000 bootstraps.
Each branch is color-coded according to the subpopulation (P1 to P5) to which the genotype belongs. Genotypes were grouped into three clusters by dividing the tree with solid black lines according to the STRUCTURE output.
Subpopulation-wise diversity parameters.
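| Subpopulations | Polymorphic loci (%) | No. of different alleles a | No. of effective alleles b | Shannon’s information index c | Diversity d | Unbiased diversity e | Tajima’s D |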
|---|---|---|---|---|---|---|---|
| P1 | 99.12 | 1.99 | 1.32 | 0.35 | 0.21 | 0.21 | 0.53 |
| P2 | 94.32 | 1.94 | 1.40 | 0.40 | 0.25 | 0.25 | 0.30 |
| P3 | 96.98 | 1.97 | 1.35 | 0.36 | 0.22 | 0.23 | 0.30 |
| P4 | 80.67 | 1.81 | 1.30 | 0.31 | 0.19 | 0.19 | -0.70 |
| P5 | 75.25 | 1.75 | 1.31 | 0.31 | 0.19 | 0.20 | 0.23 |
| Mean | 89.27 | 1.89 | 1.34 | 0.34 | 0.22 | 0.22 | 0.13 |
a No. of different alleles
b No. of effective alleles
c Shannon’s information index
d Diversity
e Unbiased diversity. SE (standard error) was zero in all cases. Indices calculated using 8191 SNPs with GenAlex v. 6.5.
* was calculated with 1000 permutations.
Summary of AMOVA.
| Sources of variation | d.f. | Sum of squares | Variance components | % of variation | Fst | |
|---|---|---|---|---|---|---|
| Among subpopulations | 4 | 130814.6 | 228.5 | 23.5 | 0.24 | 1.28 |
| Within subpopulations | 761 | 565699.6 | 743.4 | 76.5 | | |
| Total | 765 | 696514.1 | 971.9 | | | |
*** indicates p < 0.001 for 1023 permutations.
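The percentages in the AMOVA table follow from the variance components. The sketch below recomputes them, assuming that the 0.24 in the "Among subpopulations" row is the usual AMOVA fixation index, that is, the among-subpopulation share of the total variance. The function name is illustrative.

```python
def variance_shares(among: float, within: float) -> dict:
    """Split the total variance into among- and within-group fractions."""
    total = among + within
    return {
        "total": total,
        "among_fraction": among / total,
        "within_fraction": within / total,
    }


# Variance components taken from the AMOVA table
shares = variance_shares(among=228.5, within=743.4)
print(round(shares["total"], 1))                  # total variance
print(round(100 * shares["among_fraction"], 1))   # % of variation among subpopulations
print(round(100 * shares["within_fraction"], 1))  # % of variation within subpopulations
print(round(shares["among_fraction"], 2))         # among-subpopulation share as a fraction
```

With the tabulated components this gives 971.9, 23.5, 76.5 and 0.24.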
Genetic differentiation among subpopulations.
| Subpopulation | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|
| P1 | 0 | | | | |
| P2 | 0.19 | 0 | | | |
| P3 | 0.25 | 0.24 | 0 | | |
| P4 | 0.21 | 0.24 | 0.11 | 0 | |
| P5 | 0.34 | 0.24 | 0.34 | 0.39 | 0 |
Values below the diagonal are pairwise Fst comparisons between subpopulations, computed with 1000 permutations in Arlequin v. 3.5.
**indicates p < 0.01.
Fig 5. Heatmap of kinship matrix of entire collection.
Summary of subpopulation-wise kinship (IBS) matrix.
| Subpopulations | Whole collection | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|---|
| IBS coefficients range | 1.21–1.94 | 1.40–1.94 | 1.27–1.93 | 1.29–1.93 | 1.46–1.94 | 1.35–1.92 |
| Mean of IBS coefficients | 1.47 | 1.58 | 1.49 | 1.55 | 1.62 | 1.60 |
| Pairs having ≤ 1.50 IBS coefficients (%) | 63.9 | 9.6 | 50.7 | 21.7 | 1.1 | 18.0 |
| Pairs having > 1.50 IBS coefficients (%) | 36.1 | 90.4 | 49.3 | 78.3 | 98.9 | 82.0 |
Linkage disequilibrium in the studied collection.
| Subpopulation | Mean linked LD | Mean unlinked LD | Mean LD | Loci pairs in linked LD (%) | Loci pairs in unlinked LD (%) |
|---|---|---|---|---|---|
| AC_Genome | |||||
| Whole collection | 0.44 | 0.02 | 0.03 | 1.81 | 98.2 |
| P1 | 0.48 | 0.01 | 0.02 | 1.52 | 98.5 |
| P2 | 0.41 | 0.02 | 0.03 | 2.65 | 97.4 |
| P3 | 0.45 | 0.02 | 0.03 | 1.94 | 98.1 |
| P4 | 0.45 | 0.02 | 0.04 | 3.98 | 96.0 |
| P5 | 0.43 | 0.03 | 0.07 | 8.76 | 91.2 |
| A_Genome | |||||
| Whole collection | 0.33 | 0.02 | 0.02 | 1.34 | 98.7 |
| P1 | 0.38 | 0.01 | 0.02 | 1.12 | 98.9 |
| P2 | 0.32 | 0.02 | 0.03 | 2.02 | 98.0 |
| P3 | 0.36 | 0.02 | 0.02 | 1.41 | 98.6 |
| P4 | 0.40 | 0.02 | 0.03 | 3.45 | 96.6 |
| P5 | 0.38 | 0.04 | 0.06 | 7.06 | 92.9 |
| C_Genome | |||||
| Whole collection | 0.50 | 0.02 | 0.03 | 2.21 | 97.8 |
| P1 | 0.52 | 0.01 | 0.02 | 1.83 | 98.2 |
| P2 | 0.46 | 0.02 | 0.04 | 3.27 | 96.7 |
| P3 | 0.50 | 0.02 | 0.03 | 2.35 | 97.7 |
| P4 | 0.48 | 0.02 | 0.04 | 4.41 | 95.6 |
| P5 | 0.46 | 0.03 | 0.08 | 10.57 | 89.4 |
a Mean linked LD was calculated by dividing the sum of r² values (only pairs with r² > 0.2) by the number of corresponding locus pairs.
b Mean unlinked LD was calculated by dividing the sum of r² values (only pairs with r² ≤ 0.2) by the number of corresponding locus pairs.
c Mean LD was calculated by dividing the sum of all r² values by the total number of locus pairs.
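The three footnotes describe a simple split of pairwise LD values at a threshold of 0.2. A minimal sketch of that calculation is shown below, treating the pairwise LD statistic as r² (an assumption) and using hypothetical values rather than data from the study. The function name `ld_summary` is illustrative.

```python
from statistics import fmean


def ld_summary(r2_values, threshold=0.2):
    """Summarise pairwise LD values split at the given threshold."""
    values = list(r2_values)
    if not values:
        raise ValueError("at least one pairwise value is required")
    linked = [r for r in values if r > threshold]      # pairs counted as linked
    unlinked = [r for r in values if r <= threshold]   # pairs counted as unlinked
    return {
        "mean_linked": fmean(linked) if linked else float("nan"),
        "mean_unlinked": fmean(unlinked) if unlinked else float("nan"),
        "mean_all": fmean(values),
        "pct_linked": 100 * len(linked) / len(values),
        "pct_unlinked": 100 * len(unlinked) / len(values),
    }


# Hypothetical pairwise values, not data from the study
summary = ld_summary([0.5, 0.3, 0.1, 0.05, 0.05])
for key, value in summary.items():
    print(key, round(value, 3))
```

Under this reading, the overall mean is the share-weighted combination of the two partial means. For the whole collection (AC genome), 0.44 × 0.0181 + 0.02 × 0.982 ≈ 0.028, which is consistent with the reported mean LD of 0.03.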
Fig 6. Linkage disequilibrium (LD) differences and decay pattern among subpopulations.
Subpopulation-wise number and length of haplotype blocks (HBs) along chromosomes.
| Chr. | Entire panel No. | Entire panel Size (kb) | P1 No. | P1 Size (kb) | P2 No. | P2 Size (kb) | P3 No. | P3 Size (kb) | P4 No. | P4 Size (kb) | P5 No. | P5 Size (kb) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 5 | 24 | 6 | 733 | 2 | 5 | 6 | 557 | 5 | 2080 | 0 | 0 |
| A2 | 8 | 80 | 7 | 96 | 1 | 11 | 8 | 654 | 3 | 15 | 0 | 0 |
| A3 | 6 | 46 | 5 | 29 | 5 | 44 | 8 | 51 | 10 | 927 | 1 | 8 |
| A4 | 5 | 46 | 1 | 27 | 2 | 17 | 0 | 0 | 1 | 6 | 0 | 0 |
| A5 | 7 | 57 | 2 | 13 | 3 | 503 | 5 | 138 | 5 | 427 | 1 | 1 |
| A6 | 9 | 308 | 6 | 564 | 4 | 29 | 5 | 52 | 5 | 51 | 0 | 0 |
| A7 | 8 | 70 | 7 | 72 | 2 | 15 | 7 | 412 | 9 | 1092 | 1 | 13 |
| A8 | 4 | 304 | 1 | 2 | 3 | 21 | 2 | 234 | 6 | 2654 | 0 | 0 |
| A9 | 10 | 901 | 7 | 62 | 14 | 1493 | 6 | 64 | 9 | 1393 | 1 | 21 |
| A10 | 5 | 29 | 6 | 50 | 2 | 21 | 1 | 6 | 5 | 45 | 1 | 1 |
| C1 | 23 | 3099 | 14 | 3314 | 14 | 4638 | 19 | 2947 | 20 | 4295 | 1 | 14 |
| C2 | 26 | 3611 | 19 | 3141 | 9 | 4503 | 22 | 3756 | 16 | 5237 | 7 | 3594 |
| C3 | 14 | 969 | 13 | 930 | 13 | 1480 | 16 | 1603 | 9 | 1070 | 2 | 192 |
| C4 | 16 | 3440 | 15 | 5351 | 15 | 5330 | 14 | 4502 | 13 | 6063 | 4 | 3989 |
| C5 | 9 | 423 | 8 | 410 | 10 | 516 | 7 | 1148 | 7 | 2679 | 1 | 2 |
| C6 | 18 | 1204 | 9 | 972 | 7 | 1191 | 13 | 1206 | 15 | 3357 | 3 | 954 |
| C7 | 17 | 2394 | 17 | 2583 | 7 | 70 | 16 | 2479 | 10 | 1414 | 1 | 13 |
| C8 | 4 | 893 | 5 | 865 | 10 | 1289 | 6 | 941 | 7 | 1143 | 1 | 73 |
| C9 | 6 | 41 | 7 | 49 | 4 | 438 | 7 | 223 | 3 | 11 | 1 | 14 |
| AC Genome | 200 | 17938 | 155 | 19264 | 127 | 21615 | 168 | 20975 | 158 | 33956 | 26 | 8888 |
| A Genome | 67 | 1865 | 48 | 1648 | 38 | 2160 | 48 | 2168 | 58 | 8688 | 5 | 43 |
| C Genome | 133 | 16073 | 107 | 17616 | 89 | 19455 | 120 | 18807 | 100 | 25267 | 21 | 8845 |
a Number of haplotype blocks on each chromosome.
b Total length of haplotype blocks for each chromosome in kb.
Subpopulation specific number and length of haplotype blocks (HBs) along chromosomes.
| Chr. | P1 specific No. | P1 specific Size (kb) | P2 specific No. | P2 specific Size (kb) | P3 specific No. | P3 specific Size (kb) | P4 specific No. | P4 specific Size (kb) | P5 specific No. | P5 specific Size (kb) |
|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 1 | 683.06 | 0 | 0.00 | 0 | 0.00 | 2 | 1491.24 | 0 | 0.00 |
| A2 | 1 | 39.03 | 0 | 0.00 | 2 | 598.97 | 0 | 0.00 | 0 | 0.00 |
| A3 | 0 | 0.00 | 0 | 0.00 | 0 | 0.00 | 2 | 878.65 | 0 | 0.00 |
| A4 | 1 | 27.02 | 0 | 0.00 | 0 | 0.00 | 0 | 0.00 | 0 | 0.00 |
| A5 | 0 | 0.00 | 2 | 491.27 | 1 | 109.37 | 2 | 408.85 | 0 | 0.00 |
| A6 | 1 | 522.43 | 0 | 0.00 | 1 | 23.04 | 0 | 0.00 | 0 | 0.00 |
| A7 | 1 | 28.42 | 0 | 0.00 | 3 | 392.36 | 5 | 1054.82 | 0 | 0.00 |
| A8 | 0 | 0.00 | 0 | 0.00 | 0 | 0.00 | 3 | 2411.28 | 0 | 0.00 |
| A9 | 0 | 0.00 | 4 | 1409.29 | 1 | 36.87 | 3 | 1339.60 | 1 | 20.50 |
| A10 | 0 | 0.00 | 0 | 0.00 | 0 | 0.00 | 1 | 22.15 | 0 | 0.00 |
| C1 | 4 | 1723.88 | 4 | 3236.73 | 1 | 800.29 | 7 | 2334.07 | 0 | 0.00 |
| C2 | 6 | 1537.92 | 7 | 4464.18 | 6 | 2741.63 | 5 | 3464.31 | 5 | 2796.14 |
| C3 | 2 | 616.65 | 4 | 1024.20 | 4 | 724.95 | 2 | 378.76 | 0 | 0.00 |
| C4 | 4 | 1764.07 | 6 | 2844.27 | 2 | 237.73 | 7 | 5148.84 | 1 | 3439.16 |
| C5 | 1 | 29.99 | 2 | 121.08 | 1 | 771.31 | 2 | 2315.20 | 0 | 0.00 |
| C6 | 3 | 795.86 | 1 | 691.72 | 4 | 782.76 | 4 | 2310.66 | 1 | 715.45 |
| C7 | 3 | 2085.43 | 0 | 0.00 | 4 | 767.17 | 2 | 511.88 | 0 | 0.00 |
| C8 | 0 | 0.00 | 1 | 390.00 | 1 | 885.21 | 4 | 1121.11 | 1 | 73.33 |
| C9 | 1 | 27.36 | 2 | 262.00 | 1 | 32.61 | 0 | 0.00 | 0 | 0.00 |
1 Number of specific haplotype blocks longer than 19 kb on each chromosome
2 Total length (kb) of specific haplotype blocks longer than 19 kb on each chromosome.
Shared haplotype blocks (HBs) among subpopulation along chromosomes.
| Chr. | Shared HBs (size and corresponding subpopulation) |
|---|---|
| A1 | 19.991 (P1, P4), 515.231 (P3, P4) |
| A2 | 0 |
| A3 | 0 |
| A4 | 0 |
| A5 | 0 |
| A6 | 0 |
| A7 | 0 |
| A8 | 232.016 (P3, P4) |
| A9 | 19.986 (P1, P2, P4) |
| A10 | 0 |
| C1 | 20.457 (P1, P2, P3, P4), 38.038 (P1, P3), 134.065 (P2, P3, P4), 241.815 (P2, P3), 260.121 (P2, P3, P4), 336.839 (P1, P3), 374.884 (P3, P4), 438.341 (P1, P4), 652.629 (P3, P4), 718.372 (P1, P2) |
| C2 | 20.408 (P1, P2, P4), 28.252 (P1, P4), 164.414 (P3, P4), 729.678 (P1, P3, P4), 781.808 (P1, P4, P5) |
| C3 | 39.715 (P1, P4), 191.098 (P2, P5), 202.569 (P1, P2, P3), 611.558 (P3, P4) |
| C4 | 98.616 (P3, P5), 149.89 (P1, P2, P3), 378.898 (P1, P3), 436.853 (P1, P2, P3), 447 (P1, P3, P5), 601.227 (P2, P3), 867.133 (P1, P3, P4), 1265.92 (P1, P2, P3) |
| C5 | 337.986 (P1, P2, P3, P4) |
| C6 | 96.217 (P4, P5), 136.404 (P3, P4), 142.519 (P2, P5), 237.159 (P3, P4), 308.114 (P2, P4) |
| C7 | 390.085 (P1, P3), 828.502 (P3, P4) |
| C8 | 255.517 (P1, P2), 592.433 (P1, P2) |
| C9 | 171.465 (P2, P3) |
a Length (kb) of shared HBs longer than 19 kb on each chromosome, with the corresponding subpopulations shown in parentheses.