Yunlei Zhao, Hongmei Wang, Wei Chen, Yunhai Li.
Abstract
Understanding the population structure and linkage disequilibrium (LD) in an association panel can effectively avoid spurious associations and improve the accuracy of association mapping. In this study, 158 elite cotton (Gossypium hirsutum L.) germplasm accessions from all over the world, genotyped with 212 genome-wide marker loci and phenotyped with a disease nursery and a greenhouse screening method, were assayed for population structure, linkage disequilibrium, and association mapping of Verticillium wilt resistance. A total of 480 alleles, ranging from 2 to 4 per locus, were identified across all collections. Model-based analysis identified two groups (G1 and G2) and seven subgroups (G1a-c, G2a-d), and differentiation analysis showed that subgroups with a single origin or pedigree tended to differentiate from those with a mixed origin. Only 8.12% of linked marker pairs showed significant LD (P<0.001) in this association panel. The LD level for linked markers was significantly higher than that for unlinked markers, suggesting that physical linkage strongly influences LD in this panel, and the LD level was elevated when the panel was classified into groups and subgroups. LD decay analysis of several chromosomes showed that LD decay distances varied notably among chromosomes within the same gene pool. Based on the disease nursery and greenhouse environments, 42 marker loci associated with Verticillium wilt resistance were identified through association mapping; these were widely distributed across 15 chromosomes. Of these, 10 marker loci were consistent with previously identified QTLs and 32 were new, previously unreported marker loci. QTL clusters for Verticillium wilt resistance on Chr. 16 were also confirmed in our study, consistent with the strong linkage on this chromosome. Our results contribute to association mapping and supply candidate markers for marker-assisted selection of Verticillium wilt resistance in cotton.
Year: 2014 PMID: 24466016 PMCID: PMC3900507 DOI: 10.1371/journal.pone.0086308
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
ANOVA of Verticillium wilt resistance ratings (indicated by RDI) in two environments, Verticillium wilt nursery environment and greenhouse environment.
| Source | DF | Sum of squares | Mean square | F value | P value |
| Environment | 1 | 37.722 | 37.722 | 0.439 | 0.509 |
| Genotype | 157 | 33243.084 | 211.739 | 2.463 | <0.0001 |
| Error | 157 | 13499.436 | 85.984 | | |
| Total | 315 | 46780.242 | | | |
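As a quick consistency check on the ANOVA table, the F ratios can be recomputed from the sums of squares and degrees of freedom. The sketch below assumes that each effect is tested against the error mean square; the variable names are illustrative and do not come from the original analysis.

```python
# Recompute mean squares and F ratios from the ANOVA table.
# Values are copied from the table; testing each effect against the
# Error mean square is an assumption about how the table was built.
anova = {
    "Environment": {"df": 1, "ss": 37.722},
    "Genotype": {"df": 157, "ss": 33243.084},
    "Error": {"df": 157, "ss": 13499.436},
}

# Mean square = sum of squares / degrees of freedom
mean_square = {src: row["ss"] / row["df"] for src, row in anova.items()}

# F ratio = effect mean square / error mean square
f_ratio = {
    src: mean_square[src] / mean_square["Error"]
    for src in ("Environment", "Genotype")
}

for src, f in f_ratio.items():
    print(f"{src}: MS = {mean_square[src]:.3f}, F = {f:.3f}")
```

Under this assumption the recomputed ratios agree with the tabulated F values to three decimals, with a large genotype effect and a negligible environment effect.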
Figure 1. The frequency distribution of Verticillium wilt resistance ratings (indicated by RDI) of 158 accessions in the disease nursery and in the greenhouse environment.
Figure 2. The average LnP(D) and ΔK in the total panel and inferred groups.
A–C LnP(D) with k = 1–15, ΔK with k = 2–15, and ΔK with k = 3–15 for simulations using all 158 accessions; D–E LnP(D) with k = 1–15 and ΔK with k = 2–15 for inferred G1 group; F–G LnP(D) with k = 1–15 and ΔK with k = 2–15 for inferred G2 group.
Figure 3. Relationship between the inferred populations.
The two clusters inferred at k = 2 from the simulation using all 158 accessions correspond to G1 and G2, respectively. Three and four clusters (k = 3 and k = 4) were then inferred independently within G1 and G2, respectively.
Figure 4. Distribution of pairwise relative kinship estimates between 158 cotton accessions.
Values are from SPAGeDi estimates using 212 SSRs. For simplicity, only percentages of relative kinship estimates ranging from 0 to 0.50 are shown.
Analysis of molecular variance (AMOVA) among inferred populations.
| Source of variation | df | Sum of squares | Variance components | Percentage of variation | P-value |
| Among groups | 1 | 398.569 | 1.3784 | 4.46 | 0.03715±0.00557 |
| Among populations | 5 | 889.309 | 5.27178 | 17.08 | <0.0001 |
| Within populations | 96 | 4605.758 | 23.75531 | 76.95 | <0.0001 |
| Within individuals | 103 | 48 | 0.46602 | 1.51 | <0.0001 |
| Total | 205 | 5941.636 | 30.87151 |
Groups were defined by two inferred groups.
Populations were defined by inferred subgroups.
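The "Percentage of variation" column in the AMOVA table follows from the variance components: each component is divided by their sum. A minimal sketch of that arithmetic, using the values from the table (the dictionary and variable names are illustrative):

```python
# Percentage of variation = variance component / total variance * 100.
# Variance components are copied from the AMOVA table.
components = {
    "Among groups": 1.3784,
    "Among populations": 5.27178,
    "Within populations": 23.75531,
    "Within individuals": 0.46602,
}

total_variance = sum(components.values())
percent = {src: 100 * v / total_variance for src, v in components.items()}

for src, p in percent.items():
    print(f"{src}: {p:.2f}%")
print(f"Total variance: {total_variance:.5f}")
```

The four components sum to the tabulated total, and most of the variation lies within populations rather than between the two inferred groups.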
Fst among seven subgroups.
| Subgroup | G1a | G1b | G1c | G2a | G2b | G2c | G2d |
| G1a | | | | | | | |
| G1b | 0.32395 | | | | | | |
| G1c | 0.15490 | 0.20997 | | | | | |
| G2a | 0.24737 | 0.39020 | 0.19033 | | | | |
| G2b | 0.17797 | 0.28545 | 0.15914 | 0.15315 | | | |
| G2c | 0.38511 | 0.57518 | 0.33934 | 0.35653 | 0.28013 | | |
| G2d | 0.27666 | 0.42016 | 0.25889 | 0.24459 | 0.17229 | 0.40944 | |
Significant at P<0.001.
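To read the pairwise Fst matrix more easily, the lower triangle can be flattened into subgroup pairs and ranked. The sketch below simply copies the tabulated values; the data layout and names are illustrative choices, not part of the original analysis.

```python
# Lower triangle of the pairwise Fst matrix, row by row, copied from the table.
subgroups = ["G1a", "G1b", "G1c", "G2a", "G2b", "G2c", "G2d"]
lower_triangle = [
    [],
    [0.32395],
    [0.15490, 0.20997],
    [0.24737, 0.39020, 0.19033],
    [0.17797, 0.28545, 0.15914, 0.15315],
    [0.38511, 0.57518, 0.33934, 0.35653, 0.28013],
    [0.27666, 0.42016, 0.25889, 0.24459, 0.17229, 0.40944],
]

# Map each (row subgroup, column subgroup) pair to its Fst value.
pairs = {
    (subgroups[i], subgroups[j]): fst
    for i, row in enumerate(lower_triangle)
    for j, fst in enumerate(row)
}

most_differentiated = max(pairs, key=pairs.get)
least_differentiated = min(pairs, key=pairs.get)
print("Highest Fst:", most_differentiated, pairs[most_differentiated])
print("Lowest Fst:", least_differentiated, pairs[least_differentiated])
```

With these values the largest differentiation is between G1b and G2c and the smallest between G2a and G2b.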
Summary of genetic diversity for overall panel, groups and subgroups.
| Items | Overall | G1 overall | G1a | G1b | G1c | G1 mixed | G2 overall | G2a | G2b | G2c | G2d | G2 mixed | Mixed |
| sample size | 158 | 58 | 12 | 7 | 26 | 13 | 73 | 10 | 36 | 5 | 11 | 11 | 27 |
| Alleles | 480 | 465 | 385 | 333 | 441 | 423 | 447 | 368 | 418 | 285 | 353 | 377 | 443 |
| Alleles per locus | 2.26 | 2.19 | 1.82 | 1.57 | 2.08 | 2 | 2.11 | 1.74 | 1.97 | 1.34 | 1.67 | 1.78 | 2.09 |
| Gene diversity | 0.34 | 0.35 | 0.27 | 0.19 | 0.34 | 0.33 | 0.3 | 0.24 | 0.28 | 0.12 | 0.19 | 0.26 | 0.33 |
| PIC | 0.28 | 0.28 | 0.22 | 0.16 | 0.28 | 0.27 | 0.24 | 0.2 | 0.23 | 0.09 | 0.16 | 0.21 | 0.27 |
| Allelic richness | 2.26 | 2.19 | 1.82 | 1.57 | 2.08 | 2 | 2.11 | 1.74 | 1.97 | 1.34 | 1.67 | 1.78 | 2.09 |
| Group-specific alleles | 480 | 18 | 4 | 1 | 12 | 5 | 6 | 1 | 2 | 1 | 0 | 3 | 4 |
Note: Groups G1 and G2 were classified based on the results of STRUCTURE analysis of the 158 cotton lines.
The G1 group was further partitioned into G1a, G1b and G1c subgroups, and the G2 group into G2a, G2b, G2c and G2d subgroups.
The intermediates in the total panel, G1 group and G2 group were named as “Mixed”, “G1 mixed” and “G2 mixed”, respectively.
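The "Alleles per locus" row can be reproduced by dividing each allele count by the 212 marker loci used for genotyping. The sketch below checks this for every column of the table; the names are illustrative, and the assumption is that all 212 loci were scored in every group.

```python
# Alleles per locus = total alleles / number of loci.
# Assumption: all 212 loci were scored in every group and subgroup.
N_LOCI = 212

alleles = {
    "Overall": 480, "G1 overall": 465, "G1a": 385, "G1b": 333, "G1c": 441,
    "G1 mixed": 423, "G2 overall": 447, "G2a": 368, "G2b": 418, "G2c": 285,
    "G2d": 353, "G2 mixed": 377, "Mixed": 443,
}
reported = {
    "Overall": 2.26, "G1 overall": 2.19, "G1a": 1.82, "G1b": 1.57, "G1c": 2.08,
    "G1 mixed": 2.0, "G2 overall": 2.11, "G2a": 1.74, "G2b": 1.97, "G2c": 1.34,
    "G2d": 1.67, "G2 mixed": 1.78, "Mixed": 2.09,
}

computed = {group: round(n / N_LOCI, 2) for group, n in alleles.items()}
mismatches = [g for g in alleles if computed[g] != reported[g]]
print("Columns that disagree with the table:", mismatches)
```

Every column agrees with the table after rounding to two decimals, which also shows why the smallest subgroups (G1b, G2c) have the lowest values.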
LD in the entire panel, groups and subgroups at the whole genome level.
| Groups | Global r² | Global significant LD (%) | Unlinked r² | Unlinked significant LD (%) | Linked r² | Linked significant LD (%) |
| G1 overall | 0.0267 | 0.76 | 0.0257 | 0.58 | 0.0469 | 4.47 |
| G1a | 0.1379 | 0.05 | 0.1365 | 0.03 | 0.1706 | 0.41 |
| G1c | 0.0506 | 0.27 | 0.0498 | 0.17 | 0.067 | 2.22 |
| G2 overall | 0.0213 | 0.94 | 0.0202 | 0.71 | 0.0459 | 6.47 |
| G2b | 0.0354 | 0.35 | 0.0343 | 0.21 | 0.0593 | 3.34 |
| Total | 0.0132 | 1.83 | 0.0121 | 0.972 | 0.0362 | 8.33 |
Groups G1 and G2 were classified based on the results of STRUCTURE analysis of the 158 cotton lines.
The G1 group was further partitioned into G1a, G1b and G1c subgroups, and the G2 group into G2a, G2b, G2c and G2d subgroups. The G1b, G2a, G2c and G2d subgroups were excluded from the analysis because of their small population size.
Global: the whole set of marker pairs, including linked and unlinked marker pairs.
Unlinked: pairs of markers from different chromosomes.
Linked: pairs of markers on the same chromosome.
The significance threshold is set to P<0.001, which determines whether a pairwise LD estimate is statistically significant.
LD in the entire panel, groups and subgroups at single chromosome level.
| Chr. | No. of loci | Overall r² | Overall significant (%) | G1 overall r² | G1 overall significant (%) | G1a r² | G1a significant (%) | G1c r² | G1c significant (%) | G2 overall r² | G2 overall significant (%) | G2b r² | G2b significant (%) |
| 11 | 21 | 0.0442 | 16.5 | 0.0461 | 12.42 | 0.1241 | 3.64 | 0.0643 | 7.5 | 0.043 | 9.09 | 0.0546 | 9.26 |
| 16 | 25 | 0.0362 | 26.46 | 0.0494 | 16.27 | 0.1582 | 5.88 | 0.0699 | 6.88 | 0.0613 | 25.83 | 0.0581 | 11.67 |
| 18 | 9 | 0.02212 | 14.29 | 0.0465 | 25 | 0.1807 | 4.76 | 0.0901 | 11.54 | 0.0208 | 20 | 0.0439 | 6.67 |
| 19 | 24 | 0.0286 | 22.5 | 0.0319 | 9.8 | 0.1613 | 12.62 | 0.0581 | 3.95 | 0.0327 | 6.67 | 0.0554 | 8.57 |
| 23 | 17 | 0.02712 | 16.48 | 0.0471 | 16.48 | 0.1277 | 3.85 | 0.0779 | 6.06 | 0.0365 | 15.38 | 0.0499 | 10 |
| Mean | 19.2 | 0.031648 | 19.246 | 0.0442 | 15.994 | 0.1504 | 6.15 | 0.07206 | 7.186 | 0.03886 | 15.394 | 0.05238 | 9.234 |
Overall: the total panel of 158 cotton lines.
The significance threshold is set to P<0.05, which determines whether a pairwise LD estimate is statistically significant.
Groups G1 and G2 were classified based on the results of STRUCTURE analysis of the 158 cotton lines.
The G1 group was further partitioned into G1a, G1b and G1c subgroups, and the G2 group into G2a, G2b, G2c and G2d subgroups. The G1b, G2a, G2c and G2d subgroups were excluded from the analysis because of their small population size.
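The "Mean" row of the chromosome-level table is a plain arithmetic mean over the five chromosomes. The sketch below recomputes it for the first three numeric columns of the total panel, assuming the unlabeled numeric column is the mean r² (the source header for that column is missing) and using illustrative names.

```python
from statistics import fmean

# Per-chromosome values for the total panel, copied from the table:
# (number of loci, LD estimate assumed to be mean r², significant pairs in %)
overall = {
    11: (21, 0.0442, 16.5),
    16: (25, 0.0362, 26.46),
    18: (9, 0.02212, 14.29),
    19: (24, 0.0286, 22.5),
    23: (17, 0.02712, 16.48),
}

# Transpose rows into columns and average each column.
loci_mean, r2_mean, significant_mean = (
    fmean(column) for column in zip(*overall.values())
)
print(f"Mean loci: {loci_mean:.1f}")
print(f"Mean LD estimate: {r2_mean:.6f}")
print(f"Mean significant pairs: {significant_mean:.3f}%")
```

The same calculation applies to the G1 and G2 columns; only the total panel is shown to keep the example short.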
Average LD decay distance (cM) on different chromosomes in the total panel, G1 and G2 groups for locus pairs with r² > 0.1 at P<0.05.
| Chr. | Overall | G1 | G2 |
| 11 | 15–20 | >100 | — |
| 16 | 1–2 | >50 | 40–50 |
| 19 | 1–2 | 10–15 | — |
| 23 | 5–10 | 20–25 | 10–15 |
Overall: the total panel of 158 cotton lines.
A dash indicates that too few marker pairs were in significant LD to fit a regression curve for estimating LD decay.
Marker loci significantly associated with Verticillium wilt resistance and their positions on chromosomes (Chr).
| Marker name | Chr. | Position (cM) | Greenhouse P value | Greenhouse Rsq_Marker | Disease nursery P value | Disease nursery Rsq_Marker |
| BNL2599 | 1 | 1.633 | 0.0221 | 0.0597 | NS | |
| NAU5233 | 3 | 108 | NS | | 0.034 | 0.0287 |
| NAU3592 | 4 | 119.269 | 0.0057 | 0.0488 | | |
| NAU3828 | 5 | 24.1 | 0.0282 | 0.0387 | 6.89E-04 | 0.0727 |
| NAU3212 | 5 | 66 | NS | | 0.0441 | 0.0531 |
| BNL3255 | 8 | 76.5 | 0.0224 | 0.0411 | NS | |
| NAU3201 | 8 | 38.4 | 0.0113 | 0.0524 | NS | |
| NAU3499 | 8 | 65.3 | 0.0037 | 0.0871 | NS | |
| DPL0222 | 9 | 137.829 | 0.0339 | 0.0371 | 0.0014 | 0.0665 |
| NAU3074 | 11 | 183.689 | NS | | 0.0308 | 0.0303 |
| CIR196 | 11 | 145.826 | NS | | 0.0064 | 0.0684 |
| NAU980 | 11 | 169.5 | NS | | 0.0078 | 0.045 |
| NAU5428 | 11 | 32.076 | NS | | 6.86E-04 | 0.1093 |
| BNL1034 | 11 | 184.577 | NS | | 0.0017 | 0.0805 |
| NAU5064 | 11 | 162.6 | NS | | 0.0143 | 0.0543 |
| BNL2646 | 15 | 48.8 | NS | | 0.0067 | 0.0463 |
| BNL2441 | 16 | 76.567 | NS | | 0.0017 | 0.067 |
| NAU2627 | 16 | 62.301 | 0.0349 | 0.037 | NS | |
| BNL3319 | 16 | 57.702 | NS | | 0.0011 | 0.067 |
| TMB1114 | 16 | 41.815 | NS | | 8.39E-04 | 0.0695 |
| NAU2887 | 16 | 60.867 | NS | | 5.66E-04 | 0.0755 |
| NAU5120 | 16 | 47.7 | NS | | 0.0209 | 0.0341 |
| BNL1606 | 17 | 51.762 | 0.0101 | 0.0526 | 0.0098 | 0.0427 |
| NAU2859 | 17 | 86.286 | NS | | 0.0184 | 0.065 |
| JESPR101 | 17 | 71.031 | NS | | 4.50E-04 | 0.0956 |
| BNL4069 | 19 | 36.8 | 0.009 | 0.0619 | NS | |
| JESPR0001 | 19 | 123.567 | 0.0477 | 0.0479 | NS | |
| CIR364 | 19 | 66.663 | NS | | 0.0037 | 0.0708 |
| NAU2894 | 19 | 26.581 | NS | | 0.0184 | 0.0357 |
| BNL3646 | 20 | 3.479 | 0.0465 | 0.0317 | NS | |
| NAU3574 | 20 | 58.598 | 0.0411 | 0.0506 | 0.0443 | 0.0399 |
| BNL3649 | 21 | 10.8 | 0.0076 | 0.0566 | 0.0338 | 0.0291 |
| NAU2954 | 23 | 114.846 | 0.0093 | 0.0542 | NS | |
| JESPR274 | 23 | 52.972 | NS | | 0.017 | 0.0874 |
| NAU1047 | 23 | 97.1 | NS | | 0.0173 | 0.0362 |
| NAU4912 | 26 | | 0.0226 | 0.0594 | NS | |
| NAU5463 | — | — | NS | | 0.009 | 0.0432 |
| Gh268 | — | — | NS | | 0.0066 | 0.0466 |
| Gh454 | — | — | NS | | 0.006 | 0.0509 |
| NAU3563 | — | — | NS | | 0.0353 | 0.0284 |
| w11330 | — | — | NS | | 0.0065 | 0.065 |
| 73686–3 | — | — | NS | | 0.0313 | 0.0461 |
NS, not statistically significant;
Rsq_Marker, total phenotypic variation explained by the marker.
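The chromosome assignments in the association table can be tallied to check the counts quoted in the abstract (42 loci across 15 chromosomes, with a cluster on Chr. 16). The sketch below is illustrative: it copies the "Chr." column in table order and uses "-" as a stand-in for markers without a chromosome assignment.

```python
from collections import Counter

# "Chr." column of the association table, in row order.
# "-" is an illustrative placeholder for markers with no chromosome assignment.
chromosome_column = (
    "1 3 4 5 5 8 8 8 9 "
    "11 11 11 11 11 11 15 "
    "16 16 16 16 16 16 17 17 17 "
    "19 19 19 19 20 20 21 23 23 23 26 "
    "- - - - - -"
).split()

loci_per_chromosome = Counter(c for c in chromosome_column if c != "-")
unmapped = chromosome_column.count("-")

print("Total loci:", len(chromosome_column))
print("Chromosomes represented:", len(loci_per_chromosome))
print("Unmapped loci:", unmapped)
print("Loci per chromosome:", dict(loci_per_chromosome))
```

Chromosomes 11 and 16 carry the most associated loci (six each), in line with the QTL cluster on Chr. 16 noted in the abstract.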