Cheng Zhang, Pan Ni, Hafiz Ishfaq Ahmad, M Gemingguli, A Baizilaitibei, D Gulibaheti, Yaping Fang, Haiyang Wang, Akhtar Rasool Asif, Changyi Xiao, Jianhai Chen, Yunlong Ma, Xiangdong Liu, Xiaoyong Du, Shuhong Zhao.
Abstract
Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage as other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented in GO terms including "negative regulation of canonical Wnt signaling pathway," "muscle contraction," and "axon guidance." Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
Keywords: horse; population genetic structure; selective sweep; single-nucleotide polymorphisms
Year: 2018 PMID: 29899660 PMCID: PMC5990873 DOI: 10.1177/1176934318775106
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Sampling and data source information for horses in the study.
| Breed | No. | BioSample ID |
|---|---|---|
| Arabian | 6 | SAMN02179860, SAMN02439777, SAMEA3475296, SAMN05616420, SAMN05616421, SAMN05616422 |
| Duelmener | 1 | SAMN02422919 |
| Mongolian | 2 | SAMEA3498582, SAMEA3504070 |
| Franches-Montagnes | 12 | SAMEA3498888-SAMEA3498899 |
| Hanoverian | 2 | SAMN02439779, SAMN02439782 |
| Icelandic | 2 | SAMN02179857, SAMEA3355589 |
| Jeju pony | 4 | SAMN01057170-SAMN01057173 |
| Morgan | 1 | SAMEA3499838 |
| Norwegian Fjord | 1 | SAMN02179856 |
| Quarter | 4 | SAMEA3499834-SAMEA3499836; SAMEA34998378 |
| Sorraia | 2 | SAMN02439778, SAMN03955413 |
| Standardbred | 4 | SAMN02179441, SAMEA3499831-SAMEA3499833 |
| Przewalski | 3 | SAMN02179442, SAMN03009555, SAMN03009556 |
| Kazakh | 3 | SAMN07604084, SAMN07604083, SAMN07604081 |
| Lichuan | 1 | SAMN07604082 |
Summary of single-nucleotide polymorphisms in horses.
| Category | Count | Percentage |
|---|---|---|
| Sample size | N = 48 | — |
| SNP | 15 341 213 | — |
| 3_prime_UTR | 172 656 | 0.46 |
| 5_prime_UTR | 46 446 | 0.13 |
| Downstream | 2 488 567 | 6.70 |
| Intergenic_region | 10 032 903 | 27.02 |
| Intragenic | 794 | 0.00 |
| Intron | 10 441 311 | 28.12 |
| Missense | 103 427 | 0.28 |
| Non_coding_transcript_exon | 96 876 | 0.26 |
| Non_coding_transcript | 10 793 318 | 29.07 |
| Splice_acceptor | 384 | 0.00 |
| Splice_donor | 577 | 0.00 |
| Splice_region | 25 231 | 0.07 |
| Start_lost | 283 | 0.00 |
| Stop_gained | 1054 | 0.00 |
| Stop_lost | 224 | 0.00 |
| Stop_retained | 82 | 0.00 |
| Synonymous | 129 097 | 0.35 |
| Upstream | 2 474 202 | 6.66 |
| Exon | 327 360 | 0.88 |
| No. of effects | 37 134 791 | 100 |
Note: “Splice_region” means that a variant is within 2 bp of a splice junction. “Splice_acceptor” means that the variant hits a splice acceptor site (defined as 2 bases before the exon start site, except for the first exon). “Splice_donor” means that the variant hits a splice donor site (defined as 2 bases after the end of the coding exon, except for the last exon). “Upstream/downstream” means that a variant overlaps with the 1-kb region upstream/downstream of the gene end site. The number of effects is larger than the number of SNPs because a variant can be annotated for 2 or more effects.
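The Percentage column in the SNP summary table appears to be each category's count divided by the total number of annotated effects (37 134 791), not by the number of SNPs. The sketch below checks that reading on a few rows; the dictionary and function names are illustrative and not taken from the study.

```python
# Effect counts copied from a few rows of the SNP summary table.
effect_counts = {
    "3_prime_UTR": 172_656,
    "Intergenic_region": 10_032_903,
    "Intron": 10_441_311,
    "Missense": 103_427,
    "Stop_gained": 1_054,
}

# "No. of effects" row: the denominator assumed for the Percentage column.
TOTAL_EFFECTS = 37_134_791


def effect_percentage(count, total=TOTAL_EFFECTS):
    """Return a category's share of all annotated effects, in percent."""
    return round(100.0 * count / total, 2)


percentages = {name: effect_percentage(n) for name, n in effect_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}")
```

Under this assumption the rare categories such as Stop_gained round to 0.00, which matches how they are reported in the table.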
Enriched biological processes from GO analysis of the loss-of-function genes.
| Term | Gene counts (genes without annotation not shown) | P value | Benjamini |
|---|---|---|---|
| GO:0007186~G protein–coupled receptor signaling pathway | 55 genes including | 4.56E−12 | 4.64E−09 |
| GO:0002504~antigen processing and presentation of peptide or polysaccharide antigen via MHC class II | | 8.51E−09 | 4.32E−06 |
| GO:0050907~detection of chemical stimulus involved in sensory perception | 14 genes | 1.28E−04 | 4.25E−02 |
| GO:0006955~immune response | 17 genes including | 1.77E−04 | 4.39E−02 |
| GO:0002474~antigen processing and presentation of peptide antigen via MHC class I | 5 genes including | 7.49E−04 | 1.41E−01 |
| GO:0007165~signal transduction | 21 genes including | 4.33E−03 | 5.20E−01 |
| GO:0007608~sensory perception of smell | 13 genes | 1.15E−02 | 8.12E−01 |
| GO:0035518~histone H2A monoubiquitination | | 3.75E−02 | 9.92E−01 |
| GO:0090179~planar cell polarity pathway involved in neural tube closure | | 4.49E−02 | 9.94E−01 |
| GO:0019882~antigen processing and presentation | 4 genes | 5.16E−02 | 9.95E−01 |
| GO:0006952~defense response | | 5.95E−02 | 9.97E−01 |
| GO:0046322~negative regulation of fatty acid oxidation | | 6.19E−02 | 9.96E−01 |
| GO:0033182~regulation of histone ubiquitination | | 6.19E−02 | 9.96E−01 |
| GO:0031058~positive regulation of histone modification | | 6.19E−02 | 9.96E−01 |
| GO:0055123~digestive system development | | 6.19E−02 | 9.96E−01 |
| GO:0071346~cellular response to interferon gamma | | 8.41E−02 | 9.99E−01 |
| GO:0050821~protein stabilization | | 8.49E−02 | 9.98E−01 |
| GO:0051443~positive regulation of ubiquitin-protein transferase activity | | 8.85E−02 | 9.98E−01 |
| GO:2000360~negative regulation of binding of sperm to zona pellucida | | 9.14E−02 | 9.98E−01 |
| GO:0042991~transcription factor import into nucleus | | 9.14E−02 | 9.98E−01 |
| GO:0016485~protein processing | | 9.40E−02 | 9.97E−01 |
| GO:0035556~intracellular signal transduction | | 9.91E−02 | 9.97E−01 |
Abbreviation: GO, gene ontology.
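The Benjamini column in the GO table reports multiple-testing-adjusted values. As a rough illustration of how a Benjamini-Hochberg adjustment works, the sketch below applies the standard step-up procedure to a small list of hypothetical p-values. It is not expected to reproduce the table: the adjustment depends on every term that was tested, not only the rows shown, and the enrichment tool used in the study may differ in detail.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four tested terms (not from the study).
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
print(example_adjusted)
```

With many terms tested, a raw p-value below 0.05 can still have an adjusted value close to 1, which is the pattern seen in the lower rows of the table.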
Figure 1. Principal component analysis results of all 48 horses. The x-axis denotes the value of PC1, whereas the y-axis denotes the value of PC2. Each dot in the figure represents one individual.
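For readers unfamiliar with how individuals end up as points on PC1 and PC2, the following is a minimal sketch of principal component analysis on a genotype matrix, assuming genotypes are coded as 0/1/2 allele counts. The toy data, the function names, and the power-iteration approach are illustrative assumptions; the record does not say which software produced Figure 1, and only the first axis is computed here.

```python
def center_columns(genotypes):
    """Subtract each SNP's mean allele count so every column has mean zero."""
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
    return [[row[j] - means[j] for j in range(n_snps)] for row in genotypes]


def gram_matrix(centered):
    """Sample-by-sample matrix of dot products between centered genotypes."""
    n = len(centered)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) for j in range(n)]
        for i in range(n)
    ]


def leading_axis(matrix, iterations=200):
    """Power iteration: unit-length leading eigenvector of a symmetric matrix.

    Its entries are the PC1 coordinates of the samples, up to scale and sign.
    """
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in product) ** 0.5
        vec = [v / norm for v in product]
    return vec


# Toy data (hypothetical): rows are individuals, columns are SNPs.
toy_genotypes = [
    [0, 0, 1, 0],  # three individuals from one hypothetical group
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [2, 2, 1, 2],  # three individuals from another hypothetical group
    [2, 1, 2, 2],
    [2, 2, 2, 1],
]

pc1 = leading_axis(gram_matrix(center_columns(toy_genotypes)))
print([round(v, 3) for v in pc1])
```

In this toy case the two groups fall on opposite sides of zero along PC1, which is the kind of separation a PCA scatter plot is meant to reveal.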
Figure 2. Bayesian clustering output for 5 K values from K = 2 to K = 8 in 45 domestic horses. Each individual is represented by a vertical line, which is partitioned into colored segments that represent the proportion of the inferred K clusters.
Figure 3. Circos plot of the global distribution of genes, SNP variants, and signature of selective sweeps along the 31 autosomes. The circles, from outside to inside, illustrate gene density (yellow), SNP density (green), and CLR values (blue). The genes located in regions with significant strong sweep signatures are presented as outliers. High values in each layout (gene density: number per 100 kb; SNP density: number per 200 kb, and CLR value >15) are marked in red. CLR indicates composite likelihood ratio; SNP, single-nucleotide polymorphism.
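The abstract reports 1052 significant 10-kb windows that correspond to 933 distinct core regions. One plausible way to get from windows to regions is to merge significant windows that touch or overlap on the same chromosome. The sketch below does this under stated assumptions: the merging rule, the function name, and the toy CLR values are hypothetical, and the threshold of 15 is borrowed from the value highlighted in Figure 3 rather than from the study's neutral simulations.

```python
def merge_significant_windows(windows, threshold, window_size=10_000):
    """Merge adjacent windows whose CLR exceeds the threshold.

    windows: iterable of (chromosome, start_position, clr_value).
    Returns a list of (chromosome, region_start, region_end) tuples.
    """
    # Keep only windows above the threshold, ordered by chromosome and start.
    hits = sorted((chrom, start) for chrom, start, clr in windows if clr > threshold)
    regions = []
    for chrom, start in hits:
        end = start + window_size
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Window touches or overlaps the previous region: extend it.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            regions.append((chrom, start, end))
    return regions


# Toy CLR scan results (hypothetical values).
toy_windows = [
    ("chr1", 0, 20.0),
    ("chr1", 10_000, 18.0),
    ("chr1", 20_000, 3.0),
    ("chr1", 40_000, 16.5),
    ("chr2", 0, 30.0),
]

core_regions = merge_significant_windows(toy_windows, threshold=15.0)
print(core_regions)
```

Here four significant windows collapse into three regions, which illustrates why the region count can be lower than the window count.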
Enriched biological processes from GO analysis of the genes in the CLR region.
| Term | Genes | P value | Benjamini |
|---|---|---|---|
| GO:0090090~negative regulation of canonical Wnt signaling pathway | | 2.93E−03 | 9.20E−01 |
| GO:0006936~muscle contraction | | 1.22E−02 | 9.95E−01 |
| GO:0007411~axon guidance | | 1.31E−02 | 9.77E−01 |
| GO:0001736~establishment of planar polarity | | 1.48E−02 | 9.59E−01 |
| GO:0008286~insulin receptor signaling pathway | | 2.22E−02 | 9.79E−01 |
| GO:0016192~vesicle-mediated transport | | 2.22E−02 | 9.60E−01 |
| GO:0045773~positive regulation of axon extension | | 2.70E−02 | 9.65E−01 |
| GO:0033539~fatty acid β-oxidation using acyl-CoA dehydrogenase | | 4.20E−02 | 9.90E−01 |
| GO:0050885~neuromuscular process controlling balance | | 4.26E−02 | 9.84E−01 |
| GO:0060541~respiratory system development | | 5.10E−02 | 9.89E−01 |
| GO:0097118~neuroligin clustering involved in postsynaptic membrane assembly | | 5.10E−02 | 9.89E−01 |
| GO:0007156~homophilic cell adhesion via plasma membrane adhesion molecules | | 5.36E−02 | 9.87E−01 |
| GO:0051645~Golgi localization | | 6.74E−02 | 9.93E−01 |
| GO:0042384~cilium assembly | | 7.52E−02 | 9.94E−01 |
| GO:0090630~activation of GTPase activity | | 7.89E−02 | 9.94E−01 |
| GO:0055088~lipid homeostasis | | 8.40E−02 | 9.93E−01 |
| GO:0060070~canonical Wnt signaling pathway | | 8.53E−02 | 9.92E−01 |
| GO:0086010~membrane depolarization during action potential | | 8.92E−02 | 9.91E−01 |
| GO:0045184~establishment of protein localization | | 8.92E−02 | 9.91E−01 |
| GO:0042297~vocal learning | | 9.94E−02 | 9.93E−01 |
Abbreviation: GO, gene ontology.