Dunfei Pan, Shengli Zhang, Jicai Jiang, Li Jiang, Qin Zhang, Jianfeng Liu.
Abstract
Selection signatures across the whole genome can help us understand the mechanisms of selection and pinpoint causal variants for breeding programs. In the present study, we performed Extended Haplotype Homozygosity (EHH) tests to identify significant core regions harboring such signals in Chinese Holstein, and then verified the biological significance of the identified regions using commonly used bioinformatics analyses. The results revealed a total of 125 significant regions across the genome, containing important functional genes such as LEP, ABCG2, CSN1S1, CSN3 and TNF, as annotated in the Gene Ontology database. Some of the annotated genes in the core regions overlapped with those identified in our previous GWAS, as well as with those in a recently constructed candidate gene database for cattle, further indicating that these genes under positive selection may underlie milk production traits and other important traits in Chinese Holstein. Furthermore, in the enrichment analyses of second-level GO terms and pathways, we observed significant terms that were over-represented in the identified regions compared with the entire bovine genome. This indicates that some functional genes associated with milk production traits, as reflected by GO terms, may be clustered in core regions, which provides promising evidence for the exploitability of the core regions identified by EHH tests. Our findings could help detect functional candidate genes under positive selection for further genetic and breeding research in Chinese Holstein.
Year: 2013 PMID: 23555972 PMCID: PMC3610670 DOI: 10.1371/journal.pone.0060440
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of genome-wide marker and core region (CR) distribution in Chinese Holstein.
| Chr | Chr length (Mbp) | SNP (n) | Mean distance (kb) | CR (n) | Total CR length (kb) | Max CR length (kb) | Mean CR length (kb) | CR SNPs (n)^a | Max CR SNPs (n) |
| 1 | 161.06 | 2610 | 61.7 | 292 | 40305.4 | 628.64 | 138.0±90.2 | 1099 | 11 |
| 2 | 140.63 | 2109 | 66.7 | 228 | 32967.5 | 696.32 | 144.6±102.5 | 839 | 9 |
| 3 | 127.91 | 2025 | 63.2 | 222 | 32430.1 | 1073.97 | 146.1±117.3 | 820 | 9 |
| 4 | 124.13 | 1935 | 64.1 | 196 | 26006.6 | 509.61 | 132.7±90.2 | 709 | 9 |
| 5 | 125.80 | 1681 | 74.8 | 152 | 23513.1 | 561.02 | 154.7±100.6 | 554 | 12 |
| 6 | 122.54 | 1982 | 61.8 | 206 | 28214.8 | 558.29 | 137.0±99.9 | 774 | 10 |
| 7 | 112.06 | 1759 | 63.7 | 176 | 24310.3 | 1131.54 | 138.1±117.7 | 631 | 8 |
| 8 | 116.94 | 1851 | 63.2 | 190 | 26709.9 | 600.82 | 140.6±92.1 | 716 | 9 |
| 9 | 108.07 | 1570 | 68.8 | 155 | 20279.0 | 588.08 | 130.8±85.6 | 563 | 9 |
| 10 | 106.20 | 1683 | 63.1 | 173 | 20536.7 | 347.15 | 118.7±63.2 | 624 | 8 |
| 11 | 110.17 | 1780 | 61.9 | 194 | 23677.6 | 780.69 | 122.0±91.4 | 684 | 10 |
| 12 | 85.28 | 1290 | 66.1 | 120 | 16387.3 | 618.68 | 136.6±104.5 | 421 | 7 |
| 13 | 84.34 | 1373 | 61.4 | 142 | 18376.8 | 517.75 | 129.4±87.1 | 517 | 11 |
| 14 | 81.32 | 1360 | 59.8 | 152 | 21804.0 | 548.68 | 143.4±98.6 | 549 | 10 |
| 15 | 84.60 | 1324 | 63.9 | 136 | 19048.2 | 780.38 | 140.1±100.9 | 484 | 7 |
| 16 | 77.82 | 1231 | 63.2 | 134 | 17809.6 | 730.69 | 132.9±85.0 | 502 | 11 |
| 17 | 76.45 | 1250 | 61.2 | 127 | 15106.7 | 368.96 | 119.1±66.1 | 442 | 8 |
| 18 | 66.12 | 1072 | 61.7 | 84 | 10677.9 | 690.26 | 127.1±89.3 | 289 | 6 |
| 19 | 65.21 | 1106 | 59.0 | 91 | 11736.3 | 622.59 | 129.0±98.6 | 347 | 6 |
| 20 | 75.71 | 1250 | 60.6 | 115 | 14453.6 | 313.29 | 125.7±65.6 | 413 | 8 |
| 21 | 69.17 | 1072 | 64.5 | 100 | 12443.5 | 490.57 | 124.4±75.0 | 347 | 6 |
| 22 | 61.83 | 980 | 63.1 | 108 | 12539.8 | 487.54 | 116.1±66.6 | 364 | 6 |
| 23 | 53.33 | 870 | 61.3 | 79 | 9709.7 | 502.67 | 122.9±87.3 | 267 | 8 |
| 24 | 64.95 | 993 | 65.4 | 89 | 10595.9 | 518.43 | 119.1±86.6 | 302 | 7 |
| 25 | 44.02 | 790 | 55.7 | 76 | 7983.1 | 359.94 | 105.0±61.4 | 254 | 5 |
| 26 | 51.73 | 834 | 62.0 | 83 | 9855.6 | 463.48 | 118.7±67.7 | 288 | 6 |
| 27 | 48.73 | 774 | 63.0 | 64 | 7273.0 | 385.59 | 113.6±66.7 | 210 | 6 |
| 28 | 46.00 | 748 | 61.5 | 50 | 6432.7 | 324.42 | 127.8±61.7 | 167 | 5 |
| 29 | 51.98 | 828 | 62.8 | 62 | 7923.9 | 321.19 | 127.8±61.7 | 215 | 6 |
| Total | 2544.1 | 40130 | 63.4 | 3996 | 529108.6 | 16521.24 | 129.7±85.6 | 14364 | 12 |
a: The number of SNPs involved in the core regions of each chromosome.
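The "Mean distance (kb)" column can be sanity-checked from the other columns. The sketch below assumes the mean spacing is approximated as chromosome length divided by SNP count; the source does not state the exact formula, and the function name `mean_spacing_kb` is introduced here only for illustration.

```python
# Rows copied from the table above: (chromosome, length in Mbp, SNP count).
rows = [(1, 161.06, 2610), (2, 140.63, 2109), (25, 44.02, 790)]


def mean_spacing_kb(length_mbp, n_snps):
    """Approximate mean distance between adjacent markers in kb.

    Assumption: spacing ~ chromosome length / number of SNPs.
    """
    return length_mbp * 1000.0 / n_snps


for chrom, length_mbp, n_snps in rows:
    # Rounded to one decimal, as in the table.
    print(chrom, round(mean_spacing_kb(length_mbp, n_snps), 1))
```

For these three chromosomes the rounded values agree with the tabulated 61.7, 66.7 and 55.7 kb.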
Figure 1. Distribution of the REHH vs. the core haplotype frequency.
Different P-value levels are marked by differently colored symbols, as shown in the legend on the right of the figure.
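To make the REHH statistic in Figure 1 concrete, here is a minimal sketch of EHH and relative EHH using the standard pair-counting definition (the fraction of chromosome pairs carrying a core haplotype that are identical over the extended region). It is not the Sweep implementation used in the study; the toy haplotypes and the function names are invented for illustration, and every core haplotype is assumed to be carried by at least two chromosomes.

```python
from collections import Counter
from math import comb


def pair_counts(extended):
    """Return (identical pairs, all pairs) for a list of extended haplotypes."""
    same = sum(comb(n, 2) for n in Counter(extended).values())
    return same, comb(len(extended), 2)


def ehh(extended):
    """EHH: probability that two chromosomes with the same core are identical."""
    same, total = pair_counts(extended)
    return same / total


def rehh(by_core, target):
    """Relative EHH: EHH of the target core over the pooled EHH of all other cores."""
    others_same = others_total = 0
    for core, extended in by_core.items():
        if core != target:
            same, total = pair_counts(extended)
            others_same += same
            others_total += total
    return ehh(by_core[target]) / (others_same / others_total)


# Toy data (invented): extended haplotypes grouped by their core haplotype.
by_core = {
    "A": ["AAT", "AAT", "AAT", "AAT", "AAC"],
    "B": ["BGT", "BGC", "BCT", "BGT"],
    "C": ["CGT", "CCT", "CGC"],
}

print(ehh(by_core["A"]), rehh(by_core, "A"))
```

In this toy set, core A keeps long-range homozygosity (EHH 0.6) while the other cores have largely decayed, so its REHH is high; a frequent core haplotype with an unusually high REHH is the kind of signal the test looks for.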
Figure 2. The distribution of the P values of haplotypes with frequency ≥ 0.20 across the whole genome.
Summary of whole genome extended haplotype homozygosity tests.
| Chr | Tests on CH (n) | P-value<0.05 (n) | P-value<0.01(n) |
| 1 | 1177 | 29 | 10 |
| 2 | 895 | 30 | 5 |
| 3 | 898 | 36 | 7 |
| 4 | 807 | 27 | 9 |
| 5 | 594 | 19 | 8 |
| 6 | 822 | 31 | 9 |
| 7 | 681 | 17 | 7 |
| 8 | 784 | 32 | 5 |
| 9 | 610 | 19 | 6 |
| 10 | 640 | 21 | 2 |
| 11 | 766 | 27 | 8 |
| 12 | 456 | 22 | 2 |
| 13 | 579 | 20 | 5 |
| 14 | 614 | 15 | 6 |
| 15 | 569 | 25 | 2 |
| 16 | 540 | 12 | 5 |
| 17 | 511 | 14 | 7 |
| 18 | 345 | 15 | 3 |
| 19 | 369 | 14 | 3 |
| 20 | 466 | 14 | 2 |
| 21 | 401 | 16 | 1 |
| 22 | 436 | 16 | 2 |
| 23 | 326 | 15 | 0 |
| 24 | 360 | 15 | 2 |
| 25 | 333 | 11 | 3 |
| 26 | 325 | 8 | 0 |
| 27 | 264 | 6 | 3 |
| 28 | 214 | 7 | 0 |
| 29 | 253 | 8 | 3 |
| Total | 16035 | 541 | 125 |
CH: core haplotypes in each core region, as determined by Sweep across the genome.
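The "Total" row can be checked against the per-chromosome counts, and the share of tests reaching each threshold follows directly. The lists below are copied from the table above in chromosome order; the variable names are illustrative.

```python
# Per-chromosome counts (chromosomes 1 to 29), copied from the table above.
tests = [1177, 895, 898, 807, 594, 822, 681, 784, 610, 640, 766, 456, 579, 614,
         569, 540, 511, 345, 369, 466, 401, 436, 326, 360, 333, 325, 264, 214, 253]
p05 = [29, 30, 36, 27, 19, 31, 17, 32, 19, 21, 27, 22, 20, 15,
       25, 12, 14, 15, 14, 14, 16, 16, 15, 15, 11, 8, 6, 7, 8]
p01 = [10, 5, 7, 9, 8, 9, 7, 5, 6, 2, 8, 2, 5, 6,
       2, 5, 7, 3, 3, 2, 1, 2, 0, 2, 3, 0, 3, 0, 3]

# Column sums should reproduce the "Total" row.
total_tests, total_p05, total_p01 = sum(tests), sum(p05), sum(p01)
print(total_tests, total_p05, total_p01)

# Share of all tests that reach each significance threshold.
print(f"P<0.05: {total_p05 / total_tests:.2%}")
print(f"P<0.01: {total_p01 / total_tests:.2%}")
```

The sums reproduce the reported totals (16035, 541 and 125), which correspond to about 3.37% and 0.78% of all tests.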
Genes within the core regions that overlap with those identified in the genome-wide association study by Jiang et al. (2010).
| Gene Symbol | Chr. | Closest core position (bp) | Hap Freq(%) | P-value |
| LOC614166 | 1 | 148796434–148911817 | 0.286 | 0.0547 |
| | | 149189841–149242164 | 0.209 | 0.0365 |
| DIP2A | 1 | 148796434–148911817 | 0.286 | 0.0547 |
| | | 149189841–149242164 | 0.209 | 0.0365 |
| KBTBD10 | 2 | 27607855–27651437 | 0.382 | 0.0369/0.014 |
| SLC30A7 | 3 | 44239453–44433532 | 0.373 | 0.0211 |
| | | 45775675–45934869 | 0.418 | 0.00681 |
| LOC511240 | 5 | 76788487–76882812 | 0.325 | 0.0154 |
| | | 77546764–77805841 | 0.351 | 0.0587 |
| HERC3 | 6 | 37135013–37231101 | 0.572/0.0479 | 0.00220/0.0378 |
| PKD2 | 6 | 37135013–37231101 | 0.572/0.0479 | 0.00220/0.0378 |
| | | 38479643–38558526 | 0.216 | 0.0562 |
| NFIB | 8 | 31713520–31832256 | 0.235 | 0.00359/0.00573 |
| LOC788012 | 9 | 5992605–6049113 | 0.348/0.267 | 0.0334/0.0596 |
| C14H8orf33 | 14 | 260341–443937 | 0.248 | 0.0568 |
| FOXH1 | 14 | 260341–443937 | 0.248 | 0.0568 |
| CYHR1 | 14 | 260341–443937 | 0.248 | 0.0568 |
| VPS28 | 14 | 260341–443937 | 0.248 | 0.0568 |
| DGAT1 | 14 | 260341–443937 | 0.248 | 0.0568 |
| MAF1 | 14 | 260341–443937 | 0.248 | 0.0568 |
| LOC786966 | 14 | 260341–443937 | 0.248 | 0.0568 |
| GRINA | 14 | 260341–443937 | 0.248 | 0.0568 |
| | | 1889210–1967406 | 0.330 | 0.0454 |
| GML | 14 | 260341–443937 | 0.248 | 0.0568 |
| | | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| GPIHBP1 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| COL22A1 | 14 | 2805785–2849483 | 0.579 | 0.0556 |
| | | 3018726–3099635 | 0.260 | 0.0401/0.0246 |
| NKAIN3 | 14 | 28014144–28185224 | 0.322 | 0.0282 |
| OPLAH | 14 | 260341–443937 | 0.248 | 0.0568 |
| MAPK15 | 14 | 260341–443937 | 0.248 | 0.0568 |
| ZNF623 | 14 | 260341–443937 | 0.248 | 0.0568 |
| EEF1D | 14 | 260341–443937 | 0.248 | 0.0568 |
| | | 1889210–1967406 | 0.330 | 0.0454 |
| ZC3H3 | 14 | 260341–443937 | 0.248 | 0.0568 |
| | | 1889210–1967406 | 0.330 | 0.0454 |
| LYPD2 | 14 | 260341–443937 | 0.248 | 0.0568 |
| | | 1889210–1967406 | 0.330 | 0.0454 |
| RHPN1 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| GPR20 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| | | 2805785–2849483 | 0.579 | 0.0556 |
| PTK2 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| | | 2805785–2849483 | 0.579 | 0.0556 |
| | | 3018726–3099635 | 0.260 | 0.0401/0.0246 |
| EIF2C2 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| | | 2805785–2849483 | 0.579 | 0.0556 |
| | | 3018726–3099635 | 0.260 | 0.0401/0.0246 |
| KCNK9 | 14 | 2163275–2239116 | 0.268 | 0.0554 |
| | | 2805785–2849483 | 0.579 | 0.0556 |
| | | 3018726–3099635 | 0.260 | 0.0401/0.0246 |
| LOC618755 | 14 | 3018726–3099635 | 0.260 | 0.0401/0.0246 |
| ZNF7 | 14 | 260341–443937 | 0.248 | 0.0568 |
| EPPK1 | 14 | 260341–443937 | 0.248 | 0.0568 |
| CYP11B2 | 14 | 260341–443937 | 0.248 | 0.0568 |
| | | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| GLI4 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| TRAPPC9 | 14 | 1889210–1967406 | 0.330 | 0.0454 |
| | | 2163275–2239116 | 0.268 | 0.0554 |
| | | 2805785–2849483 | 0.579 | 0.0556 |
| | | 3018726–3099635 | 0.260 | 0.0401/0.0246 |
| LOC782462 | 20 | 36691324–36837401 | 0.537/0.215 | 0.0187/0.0467 |
| LOC782833 | 20 | 36691324–36837401 | 0.537/0.215 | 0.0187/0.0467 |
| C9 | 20 | 36691324–36837401 | 0.537/0.215 | 0.0187/0.0467 |
| FYB | 20 | 36691324–36837401 | 0.537/0.215 | 0.0187/0.0467 |
| RICTOR | 20 | 36691324–36837401 | 0.537/0.215 | 0.0187/0.0467 |
| LOC100138964 | 20 | 36691324–36837401 | 0.537/0.215 | 0.0187/0.0467 |
| RAI14 | 20 | 40690535–40854433 | 0.341 | 0.0443 |
This table lists genes in or near core regions that were associated with milk production traits in the genome-wide association study. Hap Freq is the frequency of the core haplotype in each core region, as determined by Sweep across the genome.
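The "Closest core position" column implies a nearest-interval lookup between a gene and the core regions on its chromosome. A minimal sketch of such a lookup follows; the core regions are the chromosome 14 intervals from the table above, while the query position and the helper names are hypothetical, and the source does not describe how the authors actually matched genes to core regions.

```python
def distance_to_region(pos, region):
    """Distance in bp from a position to an interval (0 if the position is inside)."""
    start, end = region
    if start <= pos <= end:
        return 0
    return min(abs(pos - start), abs(pos - end))


def closest_core(pos, regions):
    """Return the core region nearest to the given position."""
    return min(regions, key=lambda region: distance_to_region(pos, region))


# Core regions on chromosome 14, copied from the table above (bp).
chr14_cores = [
    (260341, 443937),
    (1889210, 1967406),
    (2163275, 2239116),
    (2805785, 2849483),
    (3018726, 3099635),
    (28014144, 28185224),
]

# Hypothetical query position, not a real gene coordinate.
query_pos = 2_000_000
nearest = closest_core(query_pos, chr14_cores)
print(nearest, distance_to_region(query_pos, nearest))
```

A gene that spans several kilobases may lie close to more than one core region, which is consistent with genes in the table being listed against several intervals.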
Figure 3. Distribution of candidate genes across the whole genome.
Enriched GO terms when comparing candidate genes to the whole genome using GeneMerge1.2.
| GO term | Description | Pop-frec | CR-frec | ratio | P value |
| GO:0000003 | reproduction | 210/27430 | 101/9829 | 1.34 | 9.08E-3 |
| GO:0044421 | extracellular region part | 361/27430 | 171/9829 | 1.32 | 2.12E-4 |
| GO:0016265 | death | 403/27430 | 182/9829 | 1.26 | 3.49E-3 |
| GO:0022414 | reproductive process | 208/27430 | 101/9829 | 1.36 | 5.76E-3 |
| GO:0032991 | macromolecular complex | 1327/27430 | 596/9829 | 1.25 | 1.03E-10 |
| GO:0005623 | cell | 5024/27430 | 2264/9829 | 1.26 | 8.171E-49 |
| GO:0048519 | negative regulation of biological process | 669/27430 | 306/9829 | 1.28 | 3.34E-06 |
| GO:0050896 | response to stimulus | 1500/27430 | 693/9829 | 1.30 | 7.77E-16 |
| GO:0044422 | organelle part | 1953/27430 | 885/9829 | 1.26 | 1.32E-17 |
| GO:0051234 | establishment of localization | 1094/27430 | 494/9829 | 1.26 | 3.30E-09 |
| GO:0031974 | membrane-enclosed lumen | 728/27430 | 340/9829 | 1.30 | 3.68E-08 |
| GO:0022610 | biological adhesion | 198/27430 | 94/9829 | 1.32 | 0.0261 |
| GO:0008152 | metabolic process | 2813/27430 | 1291/9829 | 1.28 | 1.81E-29 |
| GO:0044464 | cell part | 5024/27430 | 2264/9829 | 1.26 | 8.17E-49 |
| GO:0003824 | catalytic activity | 1991/27430 | 919/9829 | 1.29 | 2.99E-21 |
| GO:0005488 | binding | 3393/27430 | 1534/9829 | 1.26 | 1.07E-31 |
| GO:0009987 | cellular process | 3860/27430 | 1739/9829 | 1.26 | 1.49E-35 |
| GO:0005215 | transporter activity | 400/27430 | 180/9829 | 1.26 | 4.84E-3 |
| GO:0032502 | developmental process | 896/27430 | 409/9829 | 1.27 | 2.85E-08 |
| GO:0002376 | immune system process | 390/27430 | 175/9829 | 1.25 | 7.19E-3 |
| GO:0008283 | cell proliferation | 292/27430 | 142/9829 | 1.36 | 2.41E-4 |
| GO:0005576 | extracellular region | 697/27430 | 320/9829 | 1.28 | 1.06E-06 |
| GO:0023052 | signaling | 904/27430 | 424/9829 | 1.31 | 1.34E-10 |
| GO:0048518 | positive regulation of biological process | 785/27430 | 359/9829 | 1.28 | 2.51E-07 |
| GO:0032501 | multicellular organismal process | 1124/27430 | 494/9829 | 1.23 | 3.43E-07 |
| GO:0051704 | multi-organism process | 189/27430 | 92/9829 | 1.36 | 0.0102 |
| GO:0043226 | organelle | 3372/27430 | 1509/9829 | 1.25 | 1.65E-28 |
| GO:0060089 | molecular transducer activity | 309/27430 | 151/9829 | 1.36 | 8.61E-05 |
| GO:0050789 | regulation of biological process | 1900/27430 | 871/9829 | 1.28 | 5.19E-19 |
| GO:0004872 | receptor activity | 347/27430 | 156/9829 | 1.25 | 0.0144 |
| GO:0030234 | enzyme regulator activity | 261/27430 | 122/9829 | 1.30 | 9.57E-3 |
| GO:0040011 | locomotion | 245/27430 | 115/9829 | 1.31 | 0.012 |
| GO:0051179 | localization | 1269/27430 | 575/9829 | 1.26 | 4.01E-11 |
| GO:0040007 | growth | 165/27430 | 81/9829 | 1.37 | 0.0170 |
| GO:0071840 | cellular component organization or biogenesis | 1015/27430 | 474/9829 | 1.30 | 1.57E-11 |
| GO:0065007 | biological regulation | 2029/27430 | 927/9829 | 1.28 | 8.67E-20 |
This table lists the second-level GO terms significantly enriched in core regions according to GeneMerge 1.2; 36 terms were significant. Pop-frec is the frequency of genes carrying the term in the population (the whole genome), and CR-frec is the frequency of genes carrying the term in the core regions. Ratio is CR-frec divided by Pop-frec. The P value is Bonferroni corrected.
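The ratio column, and the kind of test behind the P values, can be sketched as follows. The ratio is CR-frec divided by Pop-frec. The P value part assumes a hypergeometric over-representation test with Bonferroni correction, which is how GeneMerge is generally described; the number of terms tested is not given in the source, so `n_terms` is left as a parameter and the sketch does not attempt to reproduce the tabulated P values.

```python
from math import comb


def enrichment_ratio(k, n, K, N):
    """CR-frec / Pop-frec: (k of n core-region genes) over (K of N genome genes)."""
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the term (exact integer arithmetic, one final division)."""
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(K, n) + 1))
    return favourable / comb(N, n)


def bonferroni(p, n_terms):
    """Bonferroni correction; n_terms (number of terms tested) is an assumption."""
    return min(1.0, p * n_terms)


# GO:0000003 (reproduction), counts taken from the table above.
k, n, K, N = 101, 9829, 210, 27430
ratio = enrichment_ratio(k, n, K, N)
p_raw = hypergeom_upper_tail(k, n, K, N)
print(round(ratio, 2), p_raw)
```

For GO:0000003 the ratio rounds to the tabulated 1.34. The raw tail probability would still need to be multiplied by the number of terms tested before it could be compared with the corrected values in the table.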
Enriched pathway terms when comparing candidate genes to the whole genome using GeneMerge1.2.
| Pathway term | Description | Pop-frec | CR-frec | ratio | P value |
| bta04510 | Focal adhesion | 192/27430 | 93/9829 | 1.35 | 0.0464 |
| bta03040 | Spliceosome | 126/27430 | 67/9829 | 1.48 | 0.0107 |
| bta04270 | Vascular smooth muscle contraction | 122/27430 | 66/9829 | 1.51 | 5.94E-3 |
| bta05322 | Systemic lupus erythematosus | 189/27430 | 92/9829 | 1.36 | 0.0402 |
| bta04080 | Neuroactive ligand-receptor interaction | 318/27430 | 155/9829 | 1.36 | 3.00E-4 |
| bta04010 | MAPK signaling pathway | 266/27430 | 131/9829 | 1.37 | 9.80E-4 |
| bta01100 | Metabolic pathways | 1081/27430 | 471/9829 | 1.22 | 1.00E-5 |
This table lists the pathway terms significantly over-represented in core regions according to GeneMerge 1.2. Pop-frec is the frequency of genes in the pathway in the population (the whole genome), and CR-frec is the frequency of genes in the pathway in the core regions. Ratio is CR-frec divided by Pop-frec. The P value is Bonferroni corrected.