Nan Wang, Yibing Yuan, Hui Wang, Diansi Yu, Yubo Liu, Ao Zhang, Manje Gowda, Sudha K Nair, Zhuanfang Hao, Yanli Lu, Felix San Vicente, Boddupalli M Prasanna, Xinhai Li, Xuecai Zhang.
Abstract
Genotyping-by-sequencing (GBS) is a low-cost, high-throughput genotyping method that relies on restriction enzymes to reduce genome complexity, and it is widely used in genetic and breeding applications. In the present study, 2240 individuals from eight maize populations were genotyped using GBS: two association populations (AM) and one population each of the backcross first generation (BC1), BC1F2, F2, doubled haploid (DH), intermated B73 × Mo17 (IBM), and recombinant inbred line (RIL) types. A total of 955,120 raw SNPs was obtained for each individual, with an average genotyping error of 0.70%. The rate of missing genotypic data for these SNPs was related to the level of multiplex sequencing: ~25% missing data for 96-plex and ~55% for 384-plex. Imputation can greatly reduce the rate of missing genotypes, to 12.65% for AM populations and 3.72% for bi-parental populations, although it increases total genotyping error. For analysis of genetic diversity and linkage mapping, unimputed data with a low rate of genotyping error is beneficial, whereas for association mapping, imputed data gives higher marker density and improves mapping resolution. Because imputation does not influence prediction accuracy, both unimputed and imputed data can be used for genomic prediction. In summary, GBS is a versatile and efficient SNP discovery approach for homozygous materials and can be applied effectively for various purposes in maize genetics and breeding.
Year: 2020 PMID: 33004874 PMCID: PMC7530987 DOI: 10.1038/s41598-020-73321-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Information about maize populations analyzed in the present study.
| Pop | Type | Parent 1 | Parent 2 | Number of samples | Plex | Heterozygosity rate (%)ᵃ | MAFᵇ |
|---|---|---|---|---|---|---|---|
| Pop1 | AMᶜ | – | – | 267 | 96 | 0.00 | 0 ~ 0.50 |
| Pop2 | AM | – | – | 523 | 96 | 0.00 | 0 ~ 0.50 |
| Pop3 | BC1F2 | DTPC9F104 | CML491 | 174 | 96 | 25.00 | 0.25 |
| Pop4 | BC1 | CKL09001 | CML444 | 152 | 384 | 50.00 | 0.25 |
| Pop5 | F2 | CLWN201 | CML494 | 423 | 96 | 50.00 | 0.50 |
| Pop6 | DH | LPSC7F64 | CML495 | 209 | 96 | 0.00 | 0.50 |
| Pop7 | RIL | B73 | CML247 | 207 | 384 | 0.00 | 0.50 |
| Pop8 | IBM | B73 | Mo17 | 285 | 384 | 0.00 | 0.50 |
ᵃ Expected heterozygosity rate of the population.
ᵇ Expected minor allele frequency of the population.
ᶜ Association panel.
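
The expected values in the table for the F2, BC1, and BC1F2 populations follow from single-locus Mendelian segregation. The sketch below is a minimal illustration, not the authors' procedure: it starts from an F1 that is heterozygous at every polymorphic locus and applies one round of selfing or one backcross to the recurrent parent. The function names are hypothetical, and the genotype frequencies are ordered (AA, AB, BB) with A as the recurrent-parent allele.

```python
from fractions import Fraction

def self_once(freqs):
    """One generation of selfing: heterozygotes split 1/4 AA, 1/2 AB, 1/4 BB."""
    aa, ab, bb = freqs
    return (aa + ab / 4, ab / 2, bb + ab / 4)

def backcross_to_a(freqs):
    """Cross every plant to the AA parent: AB gives 1/2 AA + 1/2 AB, BB gives AB."""
    aa, ab, bb = freqs
    return (aa + ab / 2, ab / 2 + bb, Fraction(0))

def summarize(freqs):
    """Return (expected heterozygosity in %, expected minor allele frequency)."""
    aa, ab, bb = freqs
    b_freq = bb + ab / 2
    maf = min(b_freq, 1 - b_freq)
    return float(ab * 100), float(maf)

# F1 between two inbred parents: heterozygous at every polymorphic locus.
F1 = (Fraction(0), Fraction(1), Fraction(0))

f2 = self_once(F1)
bc1 = backcross_to_a(F1)
bc1f2 = self_once(bc1)

for name, freqs in [("F2", f2), ("BC1", bc1), ("BC1F2", bc1f2)]:
    het, maf = summarize(freqs)
    print(f"{name}: heterozygosity {het:.2f}%, MAF {maf:.2f}")
```

Under these assumptions the sketch gives 50% and 0.50 for F2, 50% and 0.25 for BC1, and 25% and 0.25 for BC1F2, which agree with the expected values listed for Pop5, Pop4, and Pop3.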
Information about unimputed SNPs detected in eight maize populations before and after data filtering.
| Pop | Number of taxa, unfiltered | Number of taxa, filtered | SNPs, unfiltered | SNPs, filtered | Insertion/deletion (%), unfiltered | Insertion/deletion (%), filtered | Missing (%)ᵃ, unfiltered | Missing (%)ᵃ, filtered | Het (%)ᵇ, unfiltered | Het (%)ᵇ, filtered | MAFᶜ, unfiltered | MAFᶜ, filtered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pop1 | 267 | 242 | 955,120 | 167,617 | 1.42 | 0.92 | 54.60 | 32.73 | 0.22 | 0.83 | 0.09 | 0.23 |
| Pop2 | 523 | 513 | 955,120 | 115,311 | 1.46 | 0.78 | 60.83 | 33.48 | 0.11 | 0.56 | 0.09 | 0.24 |
| Pop3 | 174 | 161 | 45,098 | 40,491 | 2.32 | 1.97 | 23.56 | 20.37 | 9.50 | 10.50 | 0.25 | 0.25 |
| Pop4 | 152 | 152 | 41,307 | 13,662 | 1.88 | 1.22 | 57.44 | 31.85 | 5.46 | 12.68 | 0.27 | 0.27 |
| Pop5 | 411 | 408 | 66,725 | 57,411 | 2.42 | 2.31 | 24.74 | 18.99 | 20.23 | 23.17 | 0.43 | 0.43 |
| Pop6 | 207 | 177 | 65,814 | 48,985 | 2.73 | 1.99 | 21.03 | 16.36 | 0.86 | 0.87 | 0.35 | 0.43 |
| Pop7 | 207 | 185 | 75,961 | 19,089 | 1.32 | 0.57 | 57.96 | 34.88 | 0.68 | 1.56 | 0.41 | 0.44 |
| Pop8 | 285 | 216 | 73,013 | 36,468 | 1.26 | 0.88 | 49.55 | 36.13 | 0.70 | 1.00 | 0.40 | 0.42 |
ᵃ Percentage of missing SNPs.
ᵇ Percentage of heterozygous SNPs.
ᶜ Minor allele frequency.
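
The abstract's link between multiplex level and missing data can be checked against the unfiltered missing rates in this table. The sketch below is a rough check under one assumption: that the quoted ~25% and ~55% figures refer to the six bi-parental populations, so the two association panels are left out. The plex levels come from the population information table; the function name is hypothetical.

```python
from statistics import mean

# (population, plex level, unfiltered missing %) for the bi-parental populations,
# copied from the tables in this record.
BIPARENTAL_ROWS = [
    ("Pop3", 96, 23.56),
    ("Pop4", 384, 57.44),
    ("Pop5", 96, 24.74),
    ("Pop6", 96, 21.03),
    ("Pop7", 384, 57.96),
    ("Pop8", 384, 49.55),
]

def mean_missing_by_plex(rows):
    """Average the unfiltered missing rate within each multiplex level."""
    groups = {}
    for _, plex, missing in rows:
        groups.setdefault(plex, []).append(missing)
    return {plex: round(mean(values), 2) for plex, values in groups.items()}

averages = mean_missing_by_plex(BIPARENTAL_ROWS)
for plex in sorted(averages):
    print(f"{plex}-plex: {averages[plex]:.2f}% missing on average")
```

With these inputs the averages come out near 23% for 96-plex and near 55% for 384-plex, in line with the rounded figures in the abstract.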
Imputed SNP information for eight populations before and after data filtering.
| Pop | Number of taxa, unfiltered | Number of taxa, filtered | SNPs, unfiltered | SNPs, filtered | Insertion/deletion (%), unfiltered | Insertion/deletion (%), filtered | Missing (%)ᵃ, unfiltered | Missing (%)ᵃ, filtered | Het (%)ᵇ, unfiltered | Het (%)ᵇ, filtered | MAFᶜ, unfiltered | MAFᶜ, filtered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pop1 | 242 | 242 | 955,120 | 341,312 | 0.82 | 0.62 | 15.20 | 12.75 | 0.94 | 2.30 | 0.09 | 0.23 |
| Pop2 | 513 | 513 | 955,120 | 340,177 | 0.92 | 0.47 | 13.49 | 12.17 | 0.64 | 1.58 | 0.09 | 0.24 |
| Pop3 | 161 | 161 | 93,760 | 76,437 | 1.48 | 0.74 | 8.44 | 2.86 | 19.40 | 21.26 | 0.24 | 0.24 |
| Pop4 | 152 | 152 | 92,752 | 90,655 | 0.95 | 0.66 | 9.02 | 7.91 | 34.18 | 34.41 | 0.24 | 0.25 |
| Pop5 | 408 | 408 | 91,564 | 90,637 | 1.88 | 1.74 | 4.07 | 3.66 | 43.10 | 43.67 | 0.45 | 0.45 |
| Pop6 | 177 | 177 | 91,208 | 74,487 | 1.38 | 1.73 | 2.89 | 1.94 | 0.78 | 0.91 | 0.35 | 0.43 |
| Pop7 | 185 | 185 | 121,935 | 121,013 | 1.78 | 0.71 | 2.45 | 2.11 | 5.36 | 5.39 | 0.45 | 0.45 |
| Pop8 | 216 | 216 | 111,568 | 110,422 | 1.02 | 0.76 | 4.17 | 3.84 | 2.78 | 2.80 | 0.42 | 0.42 |
ᵃ Percentage of missing SNPs.
ᵇ Percentage of heterozygous SNPs.
ᶜ Minor allele frequency.
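
The three footnoted metrics can be computed per SNP from a vector of genotype calls. The sketch below is an illustration under stated assumptions, not the pipeline used in the study: calls are coded as 0, 1, or 2 copies of the alternate allele with `None` for a missing call, and the heterozygosity percentage is taken over non-missing calls only. The function name and the example calls are hypothetical.

```python
def site_stats(calls):
    """Return (missing %, heterozygous %, MAF) for one SNP.

    calls: list of 0/1/2 alternate-allele counts, with None for missing calls.
    Heterozygosity and MAF are computed over non-missing calls (an assumption).
    """
    total = len(calls)
    observed = [g for g in calls if g is not None]
    missing_pct = 100.0 * (total - len(observed)) / total
    if not observed:
        return missing_pct, None, None
    het_pct = 100.0 * sum(1 for g in observed if g == 1) / len(observed)
    alt_freq = sum(observed) / (2 * len(observed))
    maf = min(alt_freq, 1 - alt_freq)
    return missing_pct, het_pct, maf

# Hypothetical calls for eight individuals at one SNP.
example_calls = [0, 0, 2, 1, None, None, 2, 0]
missing_pct, het_pct, maf = site_stats(example_calls)
print(f"missing {missing_pct:.2f}%, het {het_pct:.2f}%, MAF {maf:.3f}")
```

Averaging these per-SNP values across all SNPs of a population would give population-level summaries of the kind shown in the table, although the exact denominators used in the study are not stated in this record.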
Genotyping error rate of six bi-parental populations.
| Pop | Line | All loci, unimputed (%) | All loci, imputed (%) | Homozygous loci, unimputed (%) | Homozygous loci, imputed (%) | Heterozygous loci, unimputed (%) | Heterozygous loci, imputed (%) |
|---|---|---|---|---|---|---|---|
| Pop3 | DTPC9F104 | 0.85 | 0.57 | 0.30 | 0.05 | 0.55 | 0.52 |
| Pop3 | CML491 | 0.87 | 0.31 | 0.25 | 0.12 | 0.62 | 0.19 |
| Pop3 | F1 | 8.21 | 10.42 | 3.38 | 0.68 | 4.83 | 9.74 |
| Pop4 | CKL09001 | 0.82 | 0.25 | 0.27 | 0.11 | 0.55 | 0.14 |
| Pop4 | CML444 | 0.67 | 0.18 | 0.18 | 0.08 | 0.49 | 0.10 |
| Pop5 | CLWN201 | 0.85 | 0.39 | 0.14 | 0.04 | 0.71 | 0.35 |
| Pop5 | CML494 | 0.61 | 0.33 | 0.09 | 0.07 | 0.52 | 0.26 |
| Pop5 | F1 | 7.97 | 5.74 | 2.43 | 0.38 | 5.54 | 5.36 |
| Pop6 | LPSC7F64 | 0.63 | 0.46 | 0.07 | 0.08 | 0.56 | 0.38 |
| Pop6 | CML495 | 1.06 | 0.74 | 0.17 | 0.09 | 0.89 | 0.65 |
| Pop7 | B73 | 0.56 | 0.38 | 0.16 | 0.19 | 0.40 | 0.19 |
| Pop7 | CML247 | 0.51 | 0.38 | 0.06 | 0.06 | 0.45 | 0.32 |
| Pop8 | B73 | 0.56 | 0.38 | 0.16 | 0.19 | 0.40 | 0.19 |
| Pop8 | Mo17 | 0.53 | 0.35 | 0.10 | 0.13 | 0.43 | 0.22 |
| Average | Total | 1.76 | 1.49 | 0.55 | 0.16 | 1.21 | 1.33 |
| Average | Parents | 0.70 | 0.45 | 0.16 | 0.11 | 0.54 | 0.33 |
| Average | F1 | 8.09 | 8.08 | 2.91 | 0.53 | 5.19 | 7.55 |
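
This record does not describe how the error rates were computed, but the reported values appear to decompose additively: the all-loci error equals the homozygous-loci error plus the heterozygous-loci error, up to rounding. The sketch below is a small consistency check on a few rows copied from the table, with hypothetical names. It also checks that the F1 average row matches the mean of the two F1 rows.

```python
# Each entry: (all loci, homozygous loci, heterozygous loci) error in %,
# first for unimputed data, then for imputed data, copied from the table.
ERROR_ROWS = {
    "Pop3 DTPC9F104": ((0.85, 0.30, 0.55), (0.57, 0.05, 0.52)),
    "Pop3 F1": ((8.21, 3.38, 4.83), (10.42, 0.68, 9.74)),
    "Pop5 F1": ((7.97, 2.43, 5.54), (5.74, 0.38, 5.36)),
    "Average Parents": ((0.70, 0.16, 0.54), (0.45, 0.11, 0.33)),
    "Average F1": ((8.09, 2.91, 5.19), (8.08, 0.53, 7.55)),
}

def max_additive_gap(rows):
    """Largest |all - (hom + het)| over every row and both data types."""
    return max(
        abs(all_loci - (hom + het))
        for pair in rows.values()
        for all_loci, hom, het in pair
    )

def f1_mean(rows, index):
    """Mean all-loci F1 error over Pop3 and Pop5 (index 0: unimputed, 1: imputed)."""
    return (rows["Pop3 F1"][index][0] + rows["Pop5 F1"][index][0]) / 2

gap = max_additive_gap(ERROR_ROWS)
print(f"largest additive gap: {gap:.3f} percentage points")
print(f"F1 mean, unimputed: {f1_mean(ERROR_ROWS, 0):.2f}")
print(f"F1 mean, imputed: {f1_mean(ERROR_ROWS, 1):.2f}")
```

A gap of about 0.01 percentage points is what two-decimal rounding would produce, so the check supports reading the homozygous and heterozygous columns as components of the all-loci error rather than as independent rates.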
Figure 1. Principal component analysis of Pop1 and Pop2 using unimputed and imputed data. (A) Pop1 using unimputed data; (B) Pop1 using imputed data; (C) Pop2 using unimputed data; (D) Pop2 using imputed data.
Figure 2. Multidimensional scaling for three bi-parental populations with a high heterozygosity rate. (A–C) Pop3–5 using unimputed data; (D–F) Pop3–5 using imputed data.
Figure 3. Multidimensional scaling for three bi-parental populations with a low heterozygosity rate. (A–C) Pop6–8 using unimputed data; (D–F) Pop6–8 using imputed data.
Figure 4. Decrease in linkage disequilibrium (LD) and GWAS for kernel color of Pop1 using filtered unimputed (A, C) and imputed (B, D) SNP data.
Figure 5. Distribution of 49,608 SNPs identified in Pop6 (A) and QTL mapping of TSC resistance in Pop6 (B). The red dot represents the centromere.
Figure 6. Genomic prediction of GY (A, B) and TSC resistance (C, D) using unimputed (A, C) and imputed (B, D) SNP data.