Zexi Cai1, Bernt Guldbrandtsen2, Mogens Sandø Lund2, Goutam Sahana2.
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have been successfully implemented in cattle research and breeding. However, moving from associations to identifying the causal variants and revealing the underlying mechanisms has proven complicated. In dairy cattle populations, a particular challenge is long-range linkage disequilibrium (LD) arising from close familial relationships among the studied individuals. Long-range LD makes it difficult to distinguish whether one or multiple quantitative trait loci (QTL) are segregating in a genomic region showing association with a phenotype. We had two objectives in this study: 1) to distinguish between multiple QTL segregating in a genomic region, and 2) to use external information to prioritize candidate genes for a QTL along with the candidate variants.
Keywords: Candidate genes; Closely linked association signals; Dairy cattle; GWAS; Milk traits
Year: 2019 PMID: 30696404 PMCID: PMC6350337 DOI: 10.1186/s12863-019-0717-0
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Lead SNPs from genome-wide associated regions for fat yield in Nordic Holstein cattle. Base positions are given as position in UMD 3.1.1 [49]
| BTA | base position | Imputation accuracy | Effect | –log10(p) | Region | Gene | Annotation |
|---|---|---|---|---|---|---|---|
| 1 | 71,227,484 | 0.9745 | −1.77 | 9.66 | 70,442,929~71,477,578 | | intron |
| 2 | 126,979,882 | 0.9972 | −1.31 | 11.46 | 126,041,707~127,230,335 | | downstream |
| 2 | 85,991,577ᵇ | 0.9542 | 1.30 | 8.91 | 85,042,155~86,241,732 | | intron |
| 3 | 7,226,390 | 0.9998 | −1.09 | 9.01 | 6,264,604~7,476,497 | | intron |
| 5 | 93,948,357 | 0.9906 | 3.28 | 62.41 | 93,698,481~94,198,670 | | intron |
| 5 | 20,284,735ᵇ | 0.9692 | −1.30 | 9.79 | 20,035,379~20,534,779 | 5S_rRNA (near) | intergenic |
| 6 | 95,497,933 | 0.9996 | −1.45 | 14.76 | 95,248,213~95,747,954 | | intergenic |
| 6 | 32,950,721ᵇ | 0.4975 | 6.33 | 11.39 | 32,367,171~33,200,834 | | intron |
| 7 | 57,287,990 | 0.8807 | −1.66 | 20.11 | 57,038,215~57,538,309 | | intron |
| 9 | 38,715,137 | 0.9809 | −1.47 | 8.89 | 38,345,408~38,965,425 | | intron |
| 11 | 88,771,449 | 0.9876 | 1.16 | 10.43 | 88,521,462~89,021,477 | | intergenic |
| 11 | 15,323,223ᵇ | 0.8962 | −1.32 | 9.81 | 14,855,568~15,573,444 | | intron |
| 12 | 68,965,758 | 0.9957 | −1.10 | 8.93 | 68,502,223~69,216,445 | | intergenic |
| 14ᵃ | 1,802,265 | 0.9398 | −6.93 | 240.56 | 1,549,133~2,049,435 | | missense |
| 14ᵃ | 1,802,266 | 0.9362 | −6.93 | 240.56 | 1,549,133~2,049,435 | | missense |
| 14 | 67,981,742ᵇ | 0.7652 | 1.65 | 8.71 | 67,117,232~68,231,920 | | intron |
| 14 | 1,321,721ᶜ | 0.4442 | 1.46 | 8.82 | 1,087,168~1,583,427 | | missense |
| 15 | 65,891,100 | 0.9992 | 1.50 | 12.99 | 65,641,131~66,141,839 | | intergenic |
| 15 | 25,044,706ᵇ | 0.9908 | −1.17 | 9.80 | 24,795,472~25,295,470 | | intron |
| 16 | 31,496,700 | 0.9501 | −1.37 | 9.32 | 30,519,873~31,746,789 | | intron |
| 17 | 62,543,160 | 0.9898 | 1.14 | 10.49 | 62,224,291~62,793,298 | | intron |
| 18 | 18,970,551 | 0.9442 | −1.19 | 10.30 | 18,341,203~19,220,732 | | intergenic |
| 19 | 27,522,927 | 0.8500 | −1.32 | 10.86 | 26,625,240~27,773,922 | | intergenic |
| 20 | 22,609,736 | 0.9813 | 1.53 | 14.23 | 21,664,412~22,859,809 | | intergenic |
| 20 | 44,186,112ᵇ | 0.9997 | 1.53 | 10.20 | 43,936,468~44,436,133 | | intergenic |
| 26 | 20,547,445 | 0.9993 | −1.76 | 21.46 | 20,299,309~20,797,570 | | intron |
| 26 | 42,408,595ᵇ | 0.9998 | −1.21 | 10.30 | 41,409,014~42,658,925 | | intron |
| 29 | 23,609,412 | 0.7717 | 2.06 | 10.73 | 22,613,737~23,859,451 | | intergenic |
| Total number of significant SNPs | 52,334 | ||||||
ᵃ Fourteen additional SNPs on chromosome 14 located near the DGAT1 gene had the same highest P value (details not presented). ᵇ SNP found in the second round. ᶜ SNP found in the third round.
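The Region column stores each interval as a `start~end` string, and some base positions carry a footnote letter. A minimal sketch of how such cells could be parsed and checked is shown here; the function names are hypothetical helpers introduced for illustration and are not part of the study's pipeline.

```python
from typing import Tuple


def parse_position(text: str) -> int:
    """Parse a base position, ignoring thousands separators and any footnote marker."""
    digits = "".join(ch for ch in text if ch in "0123456789")
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


def parse_region(text: str) -> Tuple[int, int]:
    """Split a 'start~end' region string into integer bounds."""
    start_text, end_text = text.split("~")
    return parse_position(start_text), parse_position(end_text)


def lead_inside_region(position: str, region: str) -> bool:
    """Check that a lead SNP position falls inside its reported region."""
    start, end = parse_region(region)
    return start <= parse_position(position) <= end


# First row of the fat yield table (BTA1).
start, end = parse_region("70,442,929~71,477,578")
print(end - start)  # 1034649
print(lead_inside_region("71,227,484", "70,442,929~71,477,578"))  # True
```

The same helpers accept a position with a trailing footnote letter, because every non-digit character is discarded before conversion.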
Lead SNPs from genome-wide associated regions for protein yield in Nordic Holstein cattle. Base positions are given as position in UMD 3.1.1 [49]
| BTA | base position | Imputation accuracy | Effect | –log10(p) | Region | Gene | Annotation |
|---|---|---|---|---|---|---|---|
| 1 | 63,177,947 | 0.9885 | −1.94 | 12.35 | 62,590,679~63,428,175 | | intergenic |
| 2 | 124,837,669 | 0.9886 | 1.59 | 12.63 | 124,587,873~125,089,732 | | intron |
| 2 | 86,095,020ᵃ | 0.9910 | 1.27 | 9.53 | 85,393,563~86,345,056 | | intron |
| 3 | 17,160,521 | 0.9717 | −1.15 | 8.76 | 16,197,245~17,415,613 | | upstream |
| 4 | 103,211,543 | 0.9321 | −1.06 | 8.74 | 102,341,267~103,461,820 | | intron |
| 5 | 93,511,826 | 0.8626 | −1.37 | 14.25 | 93,087,740~93,762,020 | | intergenic |
| 5 | 21,792,183ᵃ | 0.9813 | −1.37 | 10.39 | 21,542,557~22,042,238 | | intergenic |
| 5 | 87,923,795ᵇ | 0.9926 | 1.50 | 8.97 | 86,950,758~88,173,798 | | intergenic |
| 6 | 88,477,501 | 0.9962 | −2.60 | 25.98 | 88,227,821~88,727,537 | | intron |
| 6 | 48,694,003ᵃ | 0.9785 | 1.89 | 12.16 | 47,720,473~48,944,178 | ENSBTAG00000045570 (near) | intergenic |
| 6 | 88,847,595ᵇ | 0.9009 | −1.82 | 23.84 | 88,477,501~89,097,608 | | intergenic |
| 7 | 41,372,989 | 0.9999 | −1.54 | 18.14 | 41,085,164~41,623,965 | | intergenic |
| 7 | 72,100,619ᵃ | 0.9077 | 1.59 | 13.29 | 71,120,920~72,350,707 | | intergenic |
| 8 | 93,065,787 | 0.8573 | 1.65 | 10.07 | 92,816,321~93,315,869 | | intron |
| 8 | 31,538,155ᵃ | 1.0000 | 1.91 | 9.62 | 30,550,864~31,788,181 | | intergenic |
| 9 | 33,267,855 | 0.8655 | −1.46 | 11.96 | 32,627,954~33,518,971 | | intergenic |
| 10 | 93,933,304 | 0.8370 | −1.36 | 9.90 | 92,933,459~94,183,400 | | intron |
| 11 | 35,512,708 | 0.9999 | −1.45 | 11.82 | 35,189,581~35,762,749 | | intergenic |
| 13 | 37,208,792 | 0.9279 | −1.69 | 10.90 | 36,702,834~37,459,042 | | intergenic |
| 14 | 1,835,440 | 0.7471 | 2.84 | 48.66 | 1,448,510~2,085,468 | | intron |
| 14 | 67,981,742ᵃ | 0.7652 | 1.78 | 11.60 | 67,731,848~68,231,920 | | intron |
| 16 | 32,262,983 | 0.9290 | −1.52 | 12.79 | 31,268,349~32,513,084 | | intron |
| 18 | 57,015,407 | 0.9754 | 2.56 | 17.71 | 56,767,474~57,265,703 | | intron |
| 18 | 15,057,077ᵃ | 0.9934 | 1.27 | 9.99 | 14,811,219~15,308,407 | | intron |
| 19 | 27,522,927 | 0.8500 | −1.42 | 12.55 | 27,156,952~27,773,922 | | intergenic |
| 19 | 61,014,793ᵃ | 0.8505 | −1.08 | 8.65 | 60,313,953~61,265,218 | | intergenic |
| 20 | 69,006,609 | 0.9920 | −1.29 | 11.27 | 68,120,719~69,256,618 | | intergenic |
| 20 | 8,830,351ᵃ | 0.9433 | −1.71 | 10.61 | 8,345,063~9,080,402 | | intergenic |
| 23 | 10,974,968 | 0.9304 | −1.18 | 10.68 | 10,234,192~11,224,969 | | intergenic |
| 25 | 36,403,719 | 1.0000 | 1.33 | 10.25 | 36,112,575~36,654,175 | | intergenic |
| 26 | 37,695,494 | 0.9122 | −1.41 | 14.76 | 36,699,144~37,945,656 | | intergenic |
| 27 | 36,304,978 | 0.9834 | 1.06 | 8.52 | 36,037,123~36,555,106 | | intron |
| 29 | 17,620,617 | 0.9576 | 1.47 | 10.37 | 16,671,270~17,870,637 | | intron |
| 29 | 35,459,126ᵃ | 0.9999 | 1.61 | 10.11 | 34,854,011~35,709,168 | | intron |
| Total number of significant SNPs | 36,644 | ||||||
ᵃ SNP found in the second round. ᵇ SNP found in the third round.
Lead SNPs from genome-wide associated regions for milk yield in Nordic Holstein cattle. Base positions are given as position in UMD 3.1.1 [49]
| BTA | base position | Imputation accuracy | Effect | –log10(p) | Region | Gene | Annotation |
|---|---|---|---|---|---|---|---|
| 2 | 80,753,895 | 0.9454 | 1.13 | 9.95 | 79,777,813~81,003,948 | | intergenic |
| 3 | 56,402,959 | 0.9308 | −1.36 | 11.68 | 56,152,966~56,653,364 | | intergenic |
| 4 | 101,547,644 | 0.7008 | −1.66 | 12.65 | 100,921,921~101,798,041 | | upstream |
| 5 | 93,953,487 | 0.9726 | −2.10 | 29.52 | 93,703,737~94,203,599 | | upstream |
| 5 | 31,005,518ᵇ | 0.9943 | 1.42 | 12.25 | 30,202,453~31,258,920 | | upstream |
| 5 | 85,080,296ᶜ | 0.7619 | −1.28 | 11.24 | 84,425,435~85,330,671 | | intergenic |
| 5 | 20,569,435ᵈ | 0.9944 | 1.23 | 9.37 | 19,600,731~20,820,066 | | intergenic |
| 6 | 88,847,595 | 0.9009 | −1.78 | 21.61 | 88,598,011~89,097,608 | | intergenic |
| 6 | 46,901,490ᵇ | 0.7413 | −1.28 | 11.45 | 46,181,675~47,152,919 | | intergenic |
| 6 | 38,027,010ᶜ | 0.9950 | −4.75 | 9.47 | 37,669,181~38,279,802 | | missense |
| 7 | 65,370,850 | 0.9848 | −1.36 | 13.58 | 65,120,872~65,620,985 | | intergenic |
| 8 | 73,877,814 | 0.8453 | −1.37 | 11.14 | 73,629,406~74,127,901 | | upstream |
| 8 | 42,062,591ᵇ | 0.9595 | −1.27 | 10.07 | 41,064,643~42,313,291 | | intergenic |
| 9 | 33,478,527 | 0.8801 | −1.25 | 9.23 | 32,627,954~33,728,755 | | intergenic |
| 10 | 1,989,907 | 0.9469 | −1.15 | 9.92 | 1,016,031~2,240,288 | | intergenic |
| 13 | 36,822,330 | 0.9933 | −1.66 | 10.74 | 36,572,364~37,072,486 | | intron |
| 14ᵃ | 1,802,667 | 0.7975 | 5.98 | 178.35 | 1,545,264~2,044,412 | | intron |
| 15 | 54,392,611 | 0.9577 | 1.57 | 16.58 | 53,485,007~54,642,856 | | intron |
| 16 | 28,384,260 | 0.9984 | 1.64 | 10.50 | 28,012,864~28,634,313 | | intergenic |
| 17 | 66,510,224 | 0.9438 | 1.83 | 11.63 | 66,119,023~66,760,263 | | intron |
| 18 | 46,583,346 | 0.9829 | 1.86 | 11.97 | 46,333,384~46,833,392 | | upstream |
| 19 | 27,442,452 | 0.7904 | −1.26 | 9.71 | 26,592,355~27,692,965 | bta-mir-497 (near) | downstream |
| 20 | 29,996,719 | 0.9580 | −2.95 | 31.02 | 29,748,423~30,246,822 | | intergenic |
| 23 | 25,076,472 | 0.9797 | −1.34 | 9.23 | 24,219,868~25,326,583 | | intron |
| 26 | 37,716,420 | 0.9790 | −1.43 | 12.28 | 36,730,021~37,966,463 | | intergenic |
| 28 | 34,972,377 | 0.9991 | 1.18 | 9.81 | 34,722,402~35,222,855 | | intergenic |
| Total number of significant SNPs | 55,600 | ||||||
ᵃ Eight additional SNPs on chromosome 14 had the same highest P value. ᵇ SNP found in the second round. ᶜ SNP found in the third round. ᵈ SNP found in the fourth round.
Fig. 1 Manhattan plot for the association of SNPs with fat yield in Nordic Holstein cattle. The red horizontal line indicates the genome-wide significance level [−log10(P) = 8.5]
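To make the significance line concrete, the following minimal sketch converts the −log10(P) threshold of 8.5 back to a P value and applies it to two scores from the fat yield table. Treating the threshold as inclusive is an assumption made here, and the function names are hypothetical.

```python
THRESHOLD = 8.5  # genome-wide significance level on the -log10(P) scale (from the caption)


def neg_log10_to_p(score: float) -> float:
    """Convert a -log10(P) score back to a P value."""
    return 10.0 ** (-score)


def is_significant(score: float, threshold: float = THRESHOLD) -> bool:
    """True when the score reaches the threshold (inclusive by assumption)."""
    return score >= threshold


p_cutoff = neg_log10_to_p(THRESHOLD)
print(f"{p_cutoff:.2e}")  # 3.16e-09

# Scores from the fat yield table: BTA5 lead SNP and the BTA14 second-round SNP.
for score in (62.41, 8.71):
    print(score, is_significant(score))
```

On this scale a threshold of 8.5 corresponds to a P value of roughly 3.16 × 10⁻⁹, so every lead SNP listed in the tables clears it.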
Genetic variance explained by the QTL and by the remaining SNPs
| Traitᵃ | Number of QTL | V(G1)/Vpᵇ (%) | V(G2)/Vpᶜ (%) |
|---|---|---|---|
| Fat1 | 18 | 23.56 | 61.12 |
| Fat2 | 27 | 28.57 | 56.40 |
| Prot1 | 22 | 12.52 | 72.20 |
| Prot2 | 34 | 16.76 | 67.14 |
| Milk1 | 20 | 19.02 | 66.27 |
| Milk2 | 26 | 21.50 | 63.12 |
ᵃ Fat, Prot, and Milk denote fat yield, protein yield, and milk yield. The suffix 1 indicates that the lead SNP list included only the lead SNPs from the first round; the suffix 2 indicates that it included all lead SNPs found by our approach. ᵇ Percentage of genetic variance explained by the QTL. ᶜ Percentage of genetic variance explained by the SNPs other than the QTL.
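The two percentage columns can be combined to see how much of the captured variance is attributed to the QTL. The sketch below copies the values from the table; summing the two columns to obtain a total, and the helper name `summarize`, are assumptions made for illustration rather than calculations reported by the authors.

```python
# (trait label, % explained by QTL, % explained by remaining SNPs), copied from the table.
rows = [
    ("Fat1", 23.56, 61.12),
    ("Fat2", 28.57, 56.40),
    ("Prot1", 12.52, 72.20),
    ("Prot2", 16.76, 67.14),
    ("Milk1", 19.02, 66.27),
    ("Milk2", 21.50, 63.12),
]


def summarize(qtl_pct: float, rest_pct: float):
    """Return (sum of both columns, QTL share of that sum in percent)."""
    total = qtl_pct + rest_pct
    qtl_share = 100.0 * qtl_pct / total
    return round(total, 2), round(qtl_share, 1)


summary = {label: summarize(qtl, rest) for label, qtl, rest in rows}

for label, (total, share) in summary.items():
    print(f"{label}: total {total:.2f}%, QTL share {share:.1f}%")
```

For fat yield, for example, the QTL share rises from about 27.8% with first-round lead SNPs only to about 33.6% when all rounds are included.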
Fig. 2 Manhattan plot for the association of SNPs with protein yield in Nordic Holstein cattle. The red horizontal line indicates the genome-wide significance level [−log10(P) = 8.5]
Fig. 3 Manhattan plot for the association of SNPs with milk yield in Nordic Holstein cattle. The red horizontal line indicates the genome-wide significance level [−log10(P) = 8.5]
Genes related to “abnormal milk composition” phenotype in the mammalian phenotype database [24] overlapped with milk QTL identified in the present study
| Gene name | Location | Phenotype |
|---|---|---|
| | BTA6: 87,141,556-87,159,096 | abnormal milk composition |
| | BTA6: 87,179,502-87,188,025 | abnormal milk composition |
| | BTA6: 87,378,398-87,392,750 | abnormal milk composition |
| | BTA14: 1,795,351-1,804,562 | abnormal milk composition |
| | BTA18: 56,691,667-56,725,849 | abnormal milk composition |
Genes related to “abnormal mammary gland development” in the mammalian phenotype database [24] that overlapped with milk QTL identified in the present study
| Gene name | Location | Phenotype |
|---|---|---|
| | BTA5: 31,000,183-31,003,266 | abnormal mammary gland morphology |
| | BTA15: 65,779,325-65,815,261 | decreased mammary gland tumor incidence |
| | BTA15: 65,824,442-65,854,386 | abnormal mammary gland development |
| | BTA14: 67,677,676-67,987,801 | increased mammary gland tumor incidence |
| | BTA14: 1,795,351-1,804,562 | abnormal mammary gland development |
| | BTA26: 20,966,010-21,008,277 | abnormal mammary gland growth during pregnancy |
Fig. 4 VEP annotation of SNPs in linkage disequilibrium (LD > 0.20) with the lead SNPs. a Summary of all annotations. b Summary of annotations that change the protein-coding sequence
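The caption does not say which LD statistic the 0.20 cutoff refers to. As a rough illustration only, the sketch below assumes the common r² measure, computed as the squared Pearson correlation between two genotype dosage vectors coded 0/1/2. The genotype vectors are made-up toy data and the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (sxy * sxy) / (sxx * syy)


LD_CUTOFF = 0.20  # cutoff from the caption; interpreting it as r^2 is an assumption

# Toy genotypes for a lead SNP and a neighbouring SNP (illustrative, not study data).
lead = [0, 1, 2, 1, 0, 2, 1, 0]
neighbour = [0, 1, 2, 1, 0, 2, 0, 0]

ld = r_squared(lead, neighbour)
print(ld > LD_CUTOFF)  # True
```

SNPs passing such a filter would then be the ones forwarded to annotation, which is one plausible reading of how the set summarized in the figure was assembled.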