Li Ma, George R Wiggans, Shengwen Wang, Tad S Sonstegard, Jing Yang, Brian A Crooker, John B Cole, Curtis P Van Tassell, Thomas J Lawlor, Yang Da.
Abstract
BACKGROUND: Artificial insemination and genetic selection are major factors contributing to population stratification in dairy cattle. In this study, we analyzed the effect of sample stratification and the effect of stratification correction on results of a dairy genome-wide association study (GWAS). Three methods for stratification correction were used: the efficient mixed-model association expedited (EMMAX) method accounting for correlation among all individuals, a generalized least squares (GLS) method based on half-sib intraclass correlation, and a principal component analysis (PCA) approach.
Year: 2012 PMID: 23039970 PMCID: PMC3496570 DOI: 10.1186/1471-2164-13-536
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Multidimensional scaling (MDS) plots of SNP genotypes of 1,654 contemporary Holstein cows. A) All 45,878 SNP markers: C1 and C2 values were calculated using 1,654 contemporary Holstein cows. B) X chromosome: C1 and C2 values were calculated using 1,654 contemporary Holstein cows. C) All 45,878 SNP markers: C1 and C2 values were calculated using 2,366 Holstein cattle, including the University of Minnesota Holstein control line that remained unselected since 1964. D) X chromosome: C1 and C2 values were calculated using 2,366 Holstein cattle, including the University of Minnesota Holstein control line that remained unselected since 1964. C1 = dimension 1, C2 = dimension 2. Graphs for individual chromosomes are given in Additional file 1: Figure S1.
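As an illustration of how C1 and C2 coordinates of this kind can be obtained, the following is a minimal sketch of classical MDS on an identity-by-state (IBS) distance matrix. It assumes genotypes coded as 0/1/2 allele counts with no missing values; the function names `ibs_distance` and `classical_mds` are hypothetical, and the record does not state which software or distance measure produced the plots.

```python
import numpy as np


def ibs_distance(genotypes):
    """Pairwise distance = 1 - IBS similarity for 0/1/2 genotype codes.

    Assumption: no missing genotypes. Memory grows as n * n * m,
    so this form is only suitable for small illustrative inputs.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    n_snps = genotypes.shape[1]
    # Sum of absolute allele-count differences over SNPs for every pair.
    diff = np.abs(genotypes[:, None, :] - genotypes[None, :, :]).sum(axis=2)
    # Two alleles per SNP, so divide by 2 * n_snps to scale into [0, 1].
    return diff / (2.0 * n_snps)


def classical_mds(distance, k=2):
    """Classical (Torgerson) MDS: return n x k coordinates."""
    n = distance.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centered Gram matrix derived from squared distances.
    gram = -0.5 * centering @ (distance ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the k largest eigenvalues; clip tiny negatives from rounding.
    order = np.argsort(eigvals)[::-1][:k]
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales


# Toy example: two groups of individuals with opposite homozygous genotypes.
toy = np.array([[0, 0, 0], [0, 0, 0], [2, 2, 2], [2, 2, 2]])
coords = classical_mds(ibs_distance(toy), k=2)  # column 0 plays the role of C1
```

The sign of each dimension is arbitrary, so only the relative positions of individuals along C1 and C2 are meaningful.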
Figure 2. Examples of overlap between genome stratification and half-sib family structure. C1 = dimension 1, C2 = dimension 2. C1 and C2 values were calculated using 2,366 Holstein cattle, including the University of Minnesota Holstein control line that remained unselected since 1964. Graphs for more selected families are given in Additional file 3: Figure S3.
Figure 3. Examples of overlap between genome and phenotype stratifications. C1 = dimension 1, C2 = dimension 2. C1 and C2 values were calculated using 2,366 Holstein cattle, including the University of Minnesota Holstein control line that remained unselected since 1964. Graphs for all 31 traits are given in Additional file 4: Figure S4.
Figure 4. Phenotypic distributions of PTA values of 31 dairy traits in 1,654 contemporary Holstein cows.
Figure 5. Global view of test results from four methods for fat percentage. EMMAX-IBS is the EMMAX method using correlation measured by identity by state (IBS) among all individuals. EMMAX-BN is the EMMAX method using the Balding-Nichols kinship matrix among all individuals. GLS is the generalized least squares method accounting for half-sib intraclass correlation. PCA is the method of principal component analysis for stratification correction using the top 20 principal components as covariables. LS is the least squares method without stratification correction reported in Cole et al. [6]. LS_1494 is the LS analysis without the 160 elite cows.
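The PCA correction described in this caption, with top principal components entering the model as covariables, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: genotypes are 0/1/2 allele counts without missing values, each SNP is tested by ordinary least squares, and `top_principal_components` and `snp_test` are hypothetical names.

```python
import numpy as np


def top_principal_components(genotypes, k):
    """Return scores of the top-k principal components (n_individuals x k)."""
    # Center each SNP column; scaling by SNP variance is omitted for brevity.
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def snp_test(phenotype, snp, covariates=None):
    """Least-squares SNP effect and t statistic, optionally adjusted for covariates."""
    n = len(phenotype)
    columns = [np.ones(n), snp]
    if covariates is not None:
        # Each column of the covariate matrix becomes one design column.
        columns.extend(covariates.T)
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    residuals = phenotype - design @ beta
    sigma2 = residuals @ residuals / (n - design.shape[1])
    cov_beta = sigma2 * np.linalg.pinv(design.T @ design)
    return beta[1], beta[1] / np.sqrt(cov_beta[1, 1])


# Toy data (simulated, not from the study): the phenotype depends on one SNP
# and on the first principal component, which stands in for stratification.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(300, 60)).astype(float)
pcs = top_principal_components(genotypes, k=20)
phenotype = 0.5 * genotypes[:, 0] + 2.0 * pcs[:, 0] + rng.normal(0.0, 1.0, 300)

effect_ls, t_ls = snp_test(phenotype, genotypes[:, 0])                     # no correction
effect_pca, t_pca = snp_test(phenotype, genotypes[:, 0], covariates=pcs)  # PCA-adjusted
```

Calling `snp_test` without covariates corresponds to an uncorrected least squares test, which makes it easy to compare the two analyses on the same marker.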
Number of top 100 SNP effects from the least squares analysis without stratification correction that overlapped with the top 100 effects from EMMAX-IBS (E), GLS (G), and PCA (P) methods for stratification correction
| Trait | E | G | P | E and G | E and P | G and P | E, G and P |
|---|---|---|---|---|---|---|---|
| MY | 7 | 15 | 6 | 2 | 2 | 4 | 1 |
| FY | 4 | 2 | 0 | 1 | 0 | 0 | 0 |
| PY | 1 | 5 | 0 | 0 | 0 | 0 | 0 |
| FPC | 28 | 23 | 22 | 23 | 22 | 22 | 22 |
| PPC | 8 | 3 | 3 | 1 | 1 | 3 | 1 |
| PL | 6 | 10 | 1 | 3 | 0 | 1 | 0 |
| SCS | 6 | 7 | 0 | 0 | 0 | 0 | 0 |
| DPR | 7 | 8 | 2 | 3 | 2 | 2 | 2 |
| SCE | 8 | 3 | 2 | 0 | 2 | 0 | 0 |
| DCE | 3 | 3 | 0 | 1 | 0 | 0 | 0 |
| SSB | 8 | 6 | 6 | 2 | 6 | 2 | 2 |
| DSB | 6 | 11 | 2 | 0 | 0 | 0 | 0 |
| NM | 4 | 1 | 0 | 0 | 0 | 0 | 0 |
| Total | 96 | 97 | 44 | 36 | 35 | 34 | 28 |
| STA | 11 | 24 | 1 | 5 | 1 | 0 | 0 |
| STR | 5 | 12 | 0 | 1 | 0 | 0 | 0 |
| BD | 7 | 14 | 0 | 2 | 0 | 0 | 0 |
| RW | 12 | 31 | 2 | 6 | 1 | 1 | 1 |
| DF | 8 | 7 | 1 | 1 | 1 | 1 | 1 |
| RA | 19 | 22 | 6 | 10 | 4 | 6 | 4 |
| FUA | 6 | 13 | 1 | 1 | 1 | 0 | 0 |
| RUH | 9 | 13 | 4 | 2 | 2 | 0 | 0 |
| UD | 5 | 14 | 0 | 3 | 0 | 0 | 0 |
| UC | 13 | 10 | 3 | 2 | 3 | 0 | 0 |
| FTP | 17 | 16 | 5 | 6 | 4 | 2 | 2 |
| RTP | 12 | 13 | 3 | 2 | 3 | 0 | 0 |
| TL | 16 | 8 | 5 | 3 | 5 | 1 | 1 |
| FA | 6 | 15 | 4 | 4 | 1 | 2 | 1 |
| RLS | 2 | 2 | 6 | 2 | 1 | 1 | 1 |
| RLR | 5 | 18 | 3 | 4 | 0 | 1 | 0 |
| FL | 11 | 14 | 3 | 6 | 2 | 2 | 2 |
| FS | 10 | 21 | 4 | 0 | 2 | 0 | 0 |
| Total | 174 | 267 | 51 | 60 | 31 | 17 | 13 |
E, EMMAX-IBS; G, GLS; P, PCA; MY, milk yield; FY, fat yield; PY, protein yield; FPC, fat percentage; PPC, protein percentage; PL, productive life; SCS, somatic cell score; DPR, daughter pregnancy rate; SCE, service-sire calving ease; DCE, daughter calving ease; SSB, service-sire stillbirth; DSB, daughter stillbirth; NM, net merit; STA, stature; STR, strength; BD, body depth; DF, dairy form; RA, rump angle; RW, rump width; FUA, fore udder attachment; RUH, rear udder height; UD, udder depth; UC, udder cleft; FTP, front teat placement; RTP, rear teat placement; TL, teat length; FA, foot angle; RLS, rear legs (side view); RLR, rear legs (rear view); FL, feet/legs score; FS, final score.
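The overlap counts in this table can be illustrated with simple set operations. The sketch below reads the seven count columns as E, G, P, the three pairwise overlaps, and the three-way overlap, which is an interpretation consistent with the row values rather than something stated in this record. The functions `top_n` and `overlap_counts` are hypothetical, and the score dictionaries stand in for any per-SNP ranking statistic where larger means more significant.

```python
from itertools import combinations


def top_n(scores, n):
    """Return the set of SNP ids with the n largest scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n])


def overlap_counts(reference_top, method_tops):
    """Count reference SNPs shared with each method and each combination of methods.

    reference_top: set of SNP ids from the uncorrected analysis (e.g. LS).
    method_tops:   dict mapping a method label to its set of top SNP ids.
    Returns a dict keyed by tuples of method labels.
    """
    counts = {}
    names = sorted(method_tops)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            shared = set(reference_top)
            for name in combo:
                shared &= method_tops[name]
            counts[combo] = len(shared)
    return counts


# Toy example with made-up SNP ids (not data from the study).
ls_top = {"snp_a", "snp_b", "snp_c", "snp_d"}
tops = {
    "E": {"snp_a", "snp_b", "snp_x"},
    "G": {"snp_b", "snp_c", "snp_y"},
    "P": {"snp_b", "snp_z"},
}
counts = overlap_counts(ls_top, tops)
```

In the study's setting, `top_n(scores, 100)` would be applied to each method's results before calling `overlap_counts`.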
Consensus between the top 20 AIPL effects and the top 20 significant effects of the four methods of SNP testing, LS, EMMAX-IBS, GLS, and PCA
| MY | 9.1540 – 10.3909 | - | 1-16 | 2-1, 1-2, 4-4, 7-5, | - | BTA14: |
| FY | 0.3717 – 3.1325 | - | 1-1 | 1-1 | 1-1 | BTA14: |
| PY | 0.2409 – 0.6798 | 5- | 5-5 | 3- | - | BTA18: |
| BTA14: | ||||||
| | | | | | | |
| FPC | 0.0022 – 0.0191 | 1-1, 11-9 | 1-1 | 1-1 | 1-1 | BTA14: |
| | | | | | | BTA5: 93.2 Mb |
| PPC | 9.00E-04 – 0.0047 | - | 1-1 | 1-1 | 1-1 | BTA6: |
| PL | 0.0240 – 0.1608 | 2- | 1-2 | - | 1-1 | BTA18: |
| SCS | 0.0026 – 0.0139 | 1-10 | - | 1-16 | - | BTA6: |
| DPR | 0.0175 – 0.0517 | 3- | - | - | - | BTA18: |
| SCE | 0.0231 – 0.0139 | 1- | 1-4 | - | 3-4 | BTA18: |
| SSB | 0.0096 – 0.0554 | - | - | - | 3-17 | BTA18: |
| DCE | 0.0173 – 0.1061 | 1- | - | - | - | BTA18: |
| NM | 2.2525 – 16.2685 | - | - | - | BTA18: | |
| STA | 0.0160 – 0.0453 | 3-19, | - | 3- | - | |
| 12-12 | | 7,12- | BTA11: | |||
| | | | | |||
| | | | | |||
| BD | 0.0087 – 0.0458 | - | 3-1 | - | 5-2 | BTA18: |
| RW | 0.0144 – 0.0394 | 1- | 5-1 | - | - | |
| BTA18: | ||||||
| DF | 0.0140 – 0.0544 | - | 1-1 | - | - | BTAX: |
| FUA | 0.0146 – 0.0465 | 2-16 | - | - | - | BTAX: |
| RUH | 0.0146 – 0.0767 | 1-6 | - | - | - | BTAX: |
| FTP | 0.0126 – 0.0482 | - | 17-7 | - | 17-2 | BTA7: |
| FS | 0.0100 – 0.0673 | 1-7 | - | - | - | BTAX: |
MY, milk yield; FY, fat yield; PY, protein yield; FPC, fat percentage; PPC, protein percentage; PL, productive life; SCS, somatic cell score; DPR, daughter pregnancy rate; SCE, service-sire calving ease; SSB, service-sire stillbirth; DCE, daughter calving ease; NM, net merit; STA, stature; BD, body depth; DF, dairy form; RW, rump width; FUA, fore udder attachment; RUH, rear udder height; FTP, front teat placement; FS, final score.
1 The number on the left is the AIPL effect rank and the number on the right is the significance rank of this method for the same marker effect.
2 Underlined ranking is not an exact match to an AIPL effect and the exact location is the underlined location in the same row under ‘Gene region’.
Figure 6. SNP effects and allele frequency differences (AFD) between the elite cows and the average cows for four chromosome regions with confirmation between GWAS and effect size distribution from USDA genomic evaluation. A) The DGAT1-NIBP region for production traits, showing that the consensus effect in DGAT1 had a low AFD of 0.05, while a SNP in NIBP had the largest AFD (0.36) in this region and was also highly significant for protein percentage (#2 by EMMAX-IBS and EMMAX-BN, #3 by GLS, #7 by PCA, and #19 by LS). B) The BTA18 region for production, somatic cell score, daughter pregnancy rate and calving traits, showing that the two SNP markers detected by the LS method for many traits had the largest AFD (0.48 at 53.95 Mb and 0.46 at 58.7 Mb) in this region, while the consensus effect in SIGLEC12 had a low AFD of 0.03. In this figure, the vertical red line indicates the significant marker, and the vertical blue line indicates an adjacent marker. C) The BTA6 region for protein percentage, showing that the consensus effect between HERC3 and PIGY had nearly identical frequencies in the elite and the average cows. D) The BTA6 region for somatic cell score, showing that the consensus effect had a low AFD of 0.09, while the upstream marker identified by LS and GLS as highly significant for somatic cell score had a high AFD of 0.37.
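The AFD values quoted in this caption compare allele frequencies between two groups of cows. A minimal sketch of that calculation follows, assuming genotypes coded as 0/1/2 counts of one allele and AFD taken as the absolute difference in frequency. Both are assumptions, since the record does not give the exact definition, and the function names are hypothetical.

```python
import numpy as np


def allele_frequency(genotypes):
    """Per-SNP frequency of the counted allele from 0/1/2 genotype codes.

    Rows are individuals, columns are SNPs. Missing genotypes coded as NaN
    are ignored.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    # Each individual carries two alleles, so divide the mean count by 2.
    return np.nanmean(genotypes, axis=0) / 2.0


def allele_frequency_difference(group_a, group_b):
    """Absolute per-SNP allele frequency difference between two groups."""
    return np.abs(allele_frequency(group_a) - allele_frequency(group_b))


# Toy genotypes (not data from the study): 2 individuals per group, 2 SNPs.
elite = [[2, 0], [2, 1]]
average = [[0, 0], [1, 1]]
afd = allele_frequency_difference(elite, average)
```

A marker with a large AFD differs strongly in frequency between the two groups, which is the pattern the caption associates with the markers detected by the uncorrected LS analysis.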
Predicted transmitting ability (PTA) values (mean ± standard deviation) for three groups of Holstein cattle representing three stages of artificial selection since 1964
| MY | −1118.6±552.3 | −403.5±246.8 | 301.3±239.3 | ↑↑ |
| FY | −39.2±19.4 | −12.2±8.2 | 11.2±10.3 | ↑↑ |
| PY | −31.7±15.7 | −13.6±6.4 | 10.3±7.5 | ↑↑ |
| FPC | 0.015±0.035 | 0.023±0.066 | 0.002±0.065 | ↑↓ |
| PPC | 0.016±0.018 | −0.014±0.037 | 0.011±0.032 | ↓↑ |
| PL | −3.093±1.785 | −0.327±1.403 | 1.227±1.982 | ↑↑ |
| SCS | 2.854±0.075 | 2.948±0.103 | 2.974±0.165 | ↑↑ |
| DPR | 2.909±1.170 | 0.558±0.943 | −0.151±1.322 | ↓↓ |
| SCE | 8.654±0.329 | 8.149±0.999 | 7.508±1.185 | ↓↓ |
| DCE | 7.884±0.175 | 9.015±0.918 | 7.664±1.052 | ↑↓ |
| SSB | 7.860±0.192 | 7.369±0.488 | 7.657±0.648 | ↓↑ |
| DSB | 8.483±0.694 | 9.987±0.708 | 7.661±1.098 | ↑↓ |
| NM | −549.045±257.56 | −211.581±107.18 | 223.822±178.13 | ↑↑ |
| STA | −1.840±1.357 | −1.146±0.775 | 0.413±0.990 | ↑↑ |
| STR | −0.875±0.871 | −0.629±0.853 | 0.198±0.872 | ↑↑ |
| BD | −1.508±1.225 | −0.855±0.866 | 0.271±0.876 | ↑↑ |
| RW | −1.705±1.220 | −0.918±0.837 | 0.276±0.903 | ↑↑ |
| DF | −3.232±2.482 | −1.638±0.957 | 0.703±0.906 | ↑↑ |
| RA | 0.474±0.622 | 0.143±0.688 | 0.069±0.792 | ↓↓ |
| FUA | −1.754±1.278 | −1.479±0.825 | 0.641±1.110 | ↑↑ |
| RUH | −2.749±2.013 | −1.704±0.954 | 0.942±1.171 | ↑↑ |
| UD | −0.782±0.730 | −0.857±0.738 | 0.316±0.962 | ↓↑ |
| UC | −2.114±1.538 | −1.607±0.887 | 0.470±1.013 | ↑↑ |
| FTP | −1.665±1.307 | −1.447±0.906 | 0.564±0.916 | ↑↑ |
| RTP | −1.797±1.461 | −1.607±0.908 | 0.477±0.929 | ↑↑ |
| TL | 0.063±0.555 | 0.251±0.783 | −0.076±0.758 | ↑↓ |
| FA | −0.893±0.866 | −0.867±0.964 | 0.622±1.052 | ↑↑ |
| RLS | −0.447±0.616 | −0.249±0.811 | −0.163±0.769 | ↑↑ |
| RLR | −1.241±1.134 | −0.980±0.911 | 0.611±0.974 | ↑↑ |
| FL | −1.176±1.033 | −1.045±0.916 | 0.677±0.919 | ↑↑ |
| FS | −2.173±1.607 | −1.378±0.766 | 0.695±0.910 | ↑↑ |
MY, milk yield in units of kilograms; FY, fat yield in units of kilograms; PY, protein yield in units of kilograms; FPC, fat percentage; PPC, protein percentage; PL, productive life; SCS, somatic cell score; DPR, daughter pregnancy rate; SCE, service-sire calving ease; DCE, daughter calving ease; SSB, service-sire stillbirth; DSB, daughter stillbirth; NM, net merit; STA, stature; STR, strength; BD, body depth; DF, dairy form; RA, rump angle; RW, rump width; FUA, fore udder attachment; RUH, rear udder height; UD, udder depth; UC, udder cleft; FTP, front teat placement; RTP, rear teat placement; TL, teat length; FA, foot angle; RLS, rear legs (side view); RLR, rear legs (rear view); FL, feet/legs score; FS, final score.
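The arrow column can be read as the direction of change in mean PTA from the first group to the second and from the second to the third. This is an interpretation, not something stated in the record, but it matches rows such as MY, FPC, PPC and DPR. A minimal sketch of that rule, with a hypothetical function name and a "→" symbol for ties that is this example's own convention:

```python
def trend_arrows(group_means):
    """Return one arrow per consecutive pair of group means.

    "↑" if the mean increases, "↓" if it decreases, "→" if unchanged
    (the tie symbol is an assumption of this sketch, not used in the table).
    """
    arrows = []
    for earlier, later in zip(group_means, group_means[1:]):
        if later > earlier:
            arrows.append("↑")
        elif later < earlier:
            arrows.append("↓")
        else:
            arrows.append("→")
    return "".join(arrows)


# Group means copied from four rows of the table above.
examples = {
    "MY": (-1118.6, -403.5, 301.3),
    "FPC": (0.015, 0.023, 0.002),
    "PPC": (0.016, -0.014, 0.011),
    "DPR": (2.909, 0.558, -0.151),
}
arrows = {trait: trend_arrows(means) for trait, means in examples.items()}
```

Because the rule uses only the means, it ignores the standard deviations, so a small change such as the FPC shift from 0.015 to 0.023 is marked the same way as a large one.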