Pablo Federico Roncallo1, Adelina Olga Larsen2, Ana Laura Achilli1, Carolina Saint Pierre3, Cristian Andrés Gallo1, Susanne Dreisigacker3, Viviana Echenique4.
Abstract
BACKGROUND: Durum wheat (Triticum turgidum L. ssp. durum Desf. Husn) is the main staple crop used to make pasta products worldwide. Under the current climate change scenarios, genetic variability within a crop plays a crucial role in the successful release of new varieties with high yields and wide crop adaptation. In this study we evaluated a durum wheat collection consisting of 197 genotypes that mainly comprised a historical set of Argentinian germplasm but also included worldwide accessions.
Keywords: Diversity; Durum; Linkage disequilibrium; Population structure; Rare alleles; SNP
Year: 2021 PMID: 33820546 PMCID: PMC8022437 DOI: 10.1186/s12864-021-07519-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Genome distribution of SNP markers, genetic diversity and linkage disequilibrium indices
| Chr | HF N | HF marker coverage (Mb) | HF MAF | HF Ho | HF He | HF LD (r²) ᵃ | HF % LD ᵇ | HF LD decay (Mb) | HF % | HF ARG LD (r²) ᶜ | HF SNPs on annotated genes ᵈ | HF N (filtered subset) ᵉ | LF N | LF marker coverage (Mb) | LF MAF | LF Ho | LF He | LF SNPs on annotated genes ᵈ | Total SNPs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1A | 305 | 1.92 | 0.237 | 0.021 | 0.324 | 0.177 | 18.5 | 14.7 | 63.3 | 0.294 | 50 | 45 | 221 | 2.64 | 0.013 | 0.002 | 0.026 | 46 | 526 |
| 1B | 542 | 1.26 | 0.264 | 0.020 | 0.350 | 0.162 | 27.0 | 19.1 | 60.5 | 0.272 | 118 | 79 | 337 | 2.02 | 0.012 | 0.002 | 0.023 | 106 | 879 |
| 2A | 365 | 2.13 | 0.220 | 0.018 | 0.303 | 0.220 | 19.5 | 9.8 | 56.2 | 0.433 | 69 | 29 | 178 | 4.36 | 0.020 | 0.003 | 0.039 | 37 | 543 |
| 2B | 427 | 1.85 | 0.240 | 0.021 | 0.328 | 0.151 | 21.8 | 14.2 | 61.6 | 0.284 | 84 | 56 | 251 | 3.13 | 0.017 | 0.002 | 0.034 | 68 | 678 |
| 3A | 275 | 2.72 | 0.242 | 0.015 | 0.329 | 0.153 | 26.5 | 14.9 | 64.3 | 0.287 | 46 | 37 | 184 | 4.07 | 0.015 | 0.002 | 0.029 | 31 | 459 |
| 3B | 288 | 2.91 | 0.281 | 0.020 | 0.361 | 0.157 | 19.2 | 9.8 | 63.3 | 0.274 | 62 | 49 | 271 | 3.09 | 0.016 | 0.002 | 0.031 | 71 | 559 |
| 4A | 231 | 3.19 | 0.237 | 0.019 | 0.328 | 0.177 | 17.2 | 10.8 | 65.6 | 0.304 | 47 | 32 | 100 | 7.41 | 0.015 | 0.003 | 0.030 | 23 | 331 |
| 4B | 246 | 1.90 | 0.253 | 0.017 | 0.347 | 0.192 | 19.4 | 14.9 | 63.0 | 0.300 | 58 | 41 | 70 | 9.76 | 0.014 | 0.002 | 0.028 | 8 | 316 |
| 5A | 284 | 2.35 | 0.237 | 0.019 | 0.320 | 0.153 | 19.6 | 10.5 | 65.1 | 0.287 | 48 | 41 | 165 | 4.07 | 0.013 | 0.003 | 0.026 | 34 | 449 |
| 5B | 344 | 2.04 | 0.261 | 0.017 | 0.345 | 0.155 | 19.3 | 14.4 | 62.5 | 0.288 | 66 | 51 | 161 | 4.36 | 0.016 | 0.003 | 0.031 | 46 | 505 |
| 6A | 277 | 2.23 | 0.239 | 0.019 | 0.320 | 0.290 | 15.1 | 8.6 | 56.2 | 0.465 | 40 | 29 | 104 | 5.95 | 0.021 | 0.003 | 0.041 | 20 | 381 |
| 6B | 413 | 1.69 | 0.241 | 0.019 | 0.327 | 0.153 | 15.9 | 9.5 | 65.8 | 0.268 | 97 | 47 | 188 | 3.65 | 0.016 | 0.002 | 0.030 | 45 | 601 |
| 7A | 380 | 1.91 | 0.240 | 0.019 | 0.331 | 0.173 | 19.7 | 5.6 | 62.9 | 0.322 | 77 | 59 | 153 | 4.74 | 0.021 | 0.003 | 0.040 | 39 | 533 |
| 7B | 362 | 2.00 | 0.239 | 0.019 | 0.335 | 0.137 | 21.4 | 8.7 | 66.6 | 0.288 | 90 | 57 | 100 | 7.28 | 0.023 | 0.003 | 0.044 | 21 | 462 |
| A genome | 2117 | 2.35 | 0.236 | 0.019 | 0.322 | 0.192 | 19.4 | 10.7 | 61.7 | 0.345 | 377 | 272 | 1105 | 4.747 | 0.017 | 0.002 | 0.033 | 230 | 3222 |
| B genome | 2622 | 1.95 | 0.254 | 0.019 | 0.342 | 0.158 | 20.6 | 12.9 | 62.7 | 0.278 | 575 | 380 | 1378 | 4.755 | 0.016 | 0.002 | 0.031 | 365 | 4000 |
| Unmapped | 115 | . | 0.254 | 0.022 | 0.34 | 0.116 | . | . | . | . | . | 23 | 94 | . | 0.018 | 0.003 | 0.035 | . | 209 |
| Whole genome | 4854 | 2.10 | 0.246 | 0.019 | 0.333 | 0.090 | 13.4 | 11.8 | 62.3 | 0.302 | 952 | 675 | 2577 | 4.01 | 0.016 | 0.002 | 0.031 | 595 | 7431 |
HF High frequency, LF low frequency, N number of SNPs, MAF minor allele frequency, Ho observed heterozygosity, He expected heterozygosity (Nei’s gene diversity), LD linkage disequilibrium
a Mean intra-chromosomal LD at p < 0.01
b Percentage of pairwise SNPs in significant LD (p < 0.01)
c Mean LD calculated considering only 85 Argentinian accessions
d SNPs located in annotated genes in the Svevo genome assembly
e Selected SNPs with intra-chromosomal distance > 1 Mb and MAF > 0.3
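The per-SNP statistics listed in the table footnotes (MAF, Ho, He) can be illustrated with a minimal sketch. It assumes biallelic SNPs coded as 0, 1 or 2 copies of the alternate allele, with `None` for missing calls, and uses the uncorrected form of Nei's gene diversity, He = 1 − Σ pᵢ². The function name `snp_summary` and the genotype coding are illustrative assumptions; the source does not state which software or estimator was used.

```python
def snp_summary(genotypes):
    """Return (MAF, Ho, He) for one biallelic SNP.

    genotypes: iterable of 0, 1, 2 (copies of the alternate allele),
    with None marking a missing call. This coding is an assumption.
    """
    calls = [g for g in genotypes if g is not None]
    n = len(calls)
    if n == 0:
        raise ValueError("no non-missing genotype calls")

    # Frequency of the alternate allele among the 2n sampled alleles.
    p_alt = sum(calls) / (2 * n)

    # Minor allele frequency: the smaller of the two allele frequencies.
    maf = min(p_alt, 1 - p_alt)

    # Observed heterozygosity: share of heterozygous individuals.
    ho = sum(1 for g in calls if g == 1) / n

    # Expected heterozygosity (Nei's gene diversity, no sample-size correction).
    he = 1 - (p_alt ** 2 + (1 - p_alt) ** 2)
    return maf, ho, he


if __name__ == "__main__":
    # Nine homozygous reference lines and one homozygous alternate line.
    print(snp_summary([0] * 9 + [2]))
```

Averaging these per-SNP values over all markers on a chromosome would give column means of the kind reported in the table, under the same assumptions.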
Genetic diversity estimated in the whole collection and subgroups
HF columns: 4854 SNPs. LF columns: 2577 SNPs.
| Subgroup ᵃ | N | HF %PL | HF Na | HF I | HF Ho | HF He | HF PA | LF %PL | LF Na | LF I | LF Ho | LF He | LF PA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ARM | 71 | 98.0 | 1.98 | 0.478 | 0.029 | 0.315 | 0 | 45.4 | 1.45 | 0.051 | 0.003 | 0.022 | 200 |
| ART | 14 | 83.8 | 1.84 | 0.416 | 0.016 | 0.273 | 0 | 19.6 | 1.20 | 0.056 | 0.002 | 0.031 | 50 |
| CHI | 26 | 80.6 | 1.81 | 0.390 | 0.008 | 0.257 | 0 | 27.7 | 1.28 | 0.057 | 0.001 | 0.028 | 303 |
| CIM | 10 | 66.3 | 1.66 | 0.348 | 0.003 | 0.231 | 0 | 9.1 | 1.09 | 0.032 | 0.001 | 0.019 | 1 |
| FRA | 22 | 92.4 | 1.92 | 0.462 | 0.024 | 0.306 | 0 | 24.6 | 1.25 | 0.061 | 0.003 | 0.032 | 86 |
| ITM | 16 | 81.4 | 1.81 | 0.423 | 0.008 | 0.282 | 0 | 12.3 | 1.12 | 0.041 | 0.002 | 0.023 | 18 |
| ITT | 17 | 91.7 | 1.92 | 0.457 | 0.020 | 0.301 | 0 | 48.3 | 1.48 | 0.131 | 0.005 | 0.070 | 416 |
| USA | 4 | 53.3 | 1.53 | 0.320 | 0.016 | 0.220 | 0 | 6.8 | 1.07 | 0.039 | 0.002 | 0.026 | 29 |
| WAN | 17 | 84.7 | 1.85 | 0.424 | 0.008 | 0.280 | 0 | 17.4 | 1.17 | 0.048 | 0.001 | 0.026 | 26 |
| 1915–1959 | 6 | 70.3 | 1.70 | 0.382 | 0.015 | 0.255 | 0 | 10.6 | 1.11 | 0.047 | 0.002 | 0.029 | 12 |
| 1960–1969 | 5 | 61.5 | 1.62 | 0.352 | 0.018 | 0.239 | 0 | 19.6 | 1.20 | 0.098 | 0.006 | 0.064 | 33 |
| 1970–1979 | 15 | 91.0 | 1.91 | 0.460 | 0.022 | 0.304 | 0 | 48.1 | 1.48 | 0.137 | 0.005 | 0.074 | 396 |
| 1980–1989 | 22 | 94.9 | 1.95 | 0.474 | 0.017 | 0.314 | 0 | 23.4 | 1.23 | 0.049 | 0.002 | 0.024 | 30 |
| 1990–1999 | 24 | 95.2 | 1.95 | 0.482 | 0.008 | 0.320 | 0 | 22.7 | 1.23 | 0.046 | 0.001 | 0.022 | 32 |
| 2000–2009 | 101 | 99.8 | 2.00 | 0.487 | 0.015 | 0.320 | 0 | 71.1 | 1.71 | 0.067 | 0.002 | 0.028 | 590 |
| 2010–2020 | 24 | 92.5 | 1.93 | 0.459 | 0.048 | 0.303 | 0 | 19.2 | 1.19 | 0.034 | 0.003 | 0.016 | 29 |
| Q1 | 68 | 99.2 | 1.99 | 0.478 | 0.023 | 0.315 | 1 | 54.2 | 1.54 | 0.066 | 0.003 | 0.029 | 313 |
| Q2 | 41 | 97.3 | 1.97 | 0.450 | 0.019 | 0.293 | 0 | 35.2 | 1.35 | 0.050 | 0.002 | 0.022 | 104 |
| Q3 | 36 | 92.0 | 1.92 | 0.419 | 0.014 | 0.271 | 0 | 56.0 | 1.56 | 0.108 | 0.003 | 0.054 | 511 |
| Q4 | 18 | 70.6 | 1.71 | 0.327 | 0.030 | 0.212 | 1 | 8.2 | 1.08 | 0.019 | 0.002 | 0.010 | 6 |
| Q5 | 34 | 83.2 | 1.83 | 0.364 | 0.011 | 0.234 | 0 | 31.2 | 1.31 | 0.056 | 0.001 | 0.027 | 297 |
| Total | 197 | 100 | 2.00 | 0.503 | 0.019 | 0.333 | – | 100 | 2.00 | 0.078 | 0.002 | 0.031 | – |
HF high frequency, LF low frequency, % PL percentage of polymorphic loci, Na average number of alleles, I Shannon’s Information index, Ho observed heterozygosity, He Nei’s gene diversity or heterozygosity, PA number of private alleles
Q1 to Q5 are the sub-populations inferred by DAPC
a Accessions are coded as: ARM modern Argentinian, ART traditional Argentinian, CHI Chile, CIM CIMMYT, FRA France, ITM modern Italian, ITT traditional Italian, USA United States, WAN West Asia/North Africa region. Accessions from Argentina and Italy were divided into two groups according to the breeding period or year of release: 'modern' if after 1985, 'traditional' otherwise
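The subgroup indices defined in the footnotes (%PL, Shannon's I, PA) can be sketched as follows. This is an illustration only: it assumes that I is computed per locus with the natural logarithm as −Σ pᵢ ln pᵢ, and that a private allele is one observed in a single subgroup. The function names and the input layout are hypothetical and are not taken from the source.

```python
import math


def shannon_index(freqs):
    """Shannon's information index for one locus: sum of -p * ln(p)."""
    return sum(-p * math.log(p) for p in freqs if p > 0)


def percent_polymorphic(allele_freqs_per_locus):
    """Percentage of loci at which more than one allele is present."""
    polymorphic = sum(
        1
        for freqs in allele_freqs_per_locus
        if sum(1 for p in freqs if p > 0) > 1
    )
    return 100.0 * polymorphic / len(allele_freqs_per_locus)


def private_alleles(alleles_by_group):
    """Count alleles seen in exactly one group.

    alleles_by_group: dict mapping a group label to a set of
    (locus, allele) tuples observed in that group (assumed layout).
    """
    counts = {}
    for group, alleles in alleles_by_group.items():
        # Union of everything observed in the other groups.
        others = set().union(
            *(a for g, a in alleles_by_group.items() if g != group)
        )
        counts[group] = len(alleles - others)
    return counts


if __name__ == "__main__":
    loci = [[0.5, 0.5], [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print(percent_polymorphic(loci))
    print(sum(shannon_index(f) for f in loci) / len(loci))
```

For a biallelic SNP the index is bounded by ln 2 ≈ 0.693, which is consistent with all I values in the table lying below that bound.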
Fig. 1 Genome-wide linkage disequilibrium (LD) distribution and LD decay. a Scatter plot of LD values of intra-chromosomal pairwise loci against physical distance (Mb). LD decay was fitted with a locally weighted polynomial regression (LOESS) curve by genome and for genome-wide LD. b LOESS curves fitted by chromosome (only distances up to 200 Mb are shown). c Number of SNP pairs in LD distributed along physical distance intervals. d Frequency of LD (r²) values by chromosome, genome and whole genome
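Pairwise LD of the kind plotted in Fig. 1 is often estimated as the squared correlation between genotype codes at two loci. The sketch below makes that assumption explicit: genotypes are coded 0, 1 or 2, missing calls are `None`, and LD is the squared Pearson correlation (r²) over accessions typed at both SNPs. The function name `ld_r2` is hypothetical, and the estimator actually used in the study is not stated in this record.

```python
def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors.

    g1, g2: equal-length sequences of 0/1/2 genotype codes, with None
    for missing calls (assumed coding). Returns NaN if either SNP is
    monomorphic among the jointly typed accessions.
    """
    # Keep only accessions with a call at both loci.
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two jointly typed accessions")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n

    # Sums of squares and cross-products (the 1/n factors cancel in r²).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)

    if var_a == 0 or var_b == 0:
        return float("nan")
    return (cov * cov) / (var_a * var_b)


if __name__ == "__main__":
    print(ld_r2([0, 0, 2, 2], [0, 0, 2, 2]))
    print(ld_r2([0, 2, 0, 2], [0, 0, 2, 2]))
```

Plotting such values against the physical distance between the two SNPs, then smoothing, gives a decay curve of the type described in the caption.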
Mean inter-marker distance for SNP pairs in complete LD (r² = 1)
| Chr. / Genome | Whole collection | Argentinian accessions |
|---|---|---|
| 1A | 7.38 | 6.57 |
| 1B | 14.93 | 10.52 |
| 2A | 69.79 | 81.28 |
| 2B | 5.63 | 5.48 |
| 3A | 8.07 | 14.34 |
| 3B | 3.66 | 17.84 |
| 4A | 6.69 | 8.68 |
| 4B | 3.13 | 3.00 |
| 5A | 5.56 | 10.79 |
| 5B | 0.91 | 2.04 |
| 6A | 29.50 | 57.48 |
| 6B | 9.07 | 19.84 |
| 7A | 41.40 | 51.93 |
| 7B | 23.95 | 30.72 |
| A genome | 35.13 | 53.96 |
| B genome | 7.46 | 13.11 |
| Whole genome | 25.12 | 37.79 |
Chr. chromosome
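As a rough illustration of how a per-chromosome value in this table could be obtained, the sketch below averages the physical distance between SNP pairs whose LD equals 1. The input layout (tuples of two positions and an LD value), the tolerance used to test for complete LD, and the function name are all assumptions made for this example. The distance unit is whatever unit the positions are given in; the table itself does not state one.

```python
def mean_distance_complete_ld(pairs, tol=1e-9):
    """Mean inter-marker distance over SNP pairs in complete LD.

    pairs: iterable of (pos1, pos2, r2) tuples for one chromosome,
    where pos1 and pos2 are physical positions in the same unit.
    Returns None when no pair is in complete LD.
    """
    distances = [
        abs(pos2 - pos1)
        for pos1, pos2, r2 in pairs
        if abs(r2 - 1.0) <= tol  # treat r2 within tol of 1 as complete LD
    ]
    if not distances:
        return None
    return sum(distances) / len(distances)


if __name__ == "__main__":
    example = [(1.0, 4.0, 1.0), (10.0, 12.0, 0.4), (20.0, 25.0, 1.0)]
    print(mean_distance_complete_ld(example))
```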
Fig. 2 Comparison of LD distribution in three breeding periods (a). Changes of average inter-marker distance in significant LD (p < 0.01) over time assessed by chromosome (b) and genome (c)
Fig. 3 Population structure according to the discriminant analysis of principal components (DAPC) using 675 SNPs. The first two components are displayed graphically (each sub-population is differentiated by color) (a). Cluster selection was based on the BIC value (b). Number of PCs retained using the cross-validation test (c)
Fig. 4 Phylogenetic relationships based on genetic distance in 197 durum wheat accessions displayed graphically using a Ward dendrogram. The sub-populations found by DAPC are indicated with colors and named on the external circle