Mohammed Ali Al Abri, Heather Marie Holl, Sara E Kalla, Nathan B Sutter, Samantha A Brooks.
Abstract
The domesticated horse has played a unique role in human history, serving not just as a source of animal protein, but also as a catalyst for long-distance migration and military conquest. As a result, the horse developed unique physiological adaptations to meet the demands of both its climatic environment and its relationship with man. Completed in 2009, the first domesticated horse reference genome assembly (EquCab 2.0) produced most of the publicly available genetic variation annotations in this species. Yet, there are around 400 geographically and physiologically diverse breeds of horse. To enrich the current collection of genetic variants in the horse, we sequenced whole genomes from six horses of six different breeds: an American Miniature, a Percheron, an Arabian, a Mangalarga Marchador, a Native Mongolian Chakouyi, and a Tennessee Walking Horse, and mapped them to the EquCab3.0 genome. Aside from extreme contrasts in body size, these breeds originate from diverse global locations and each possesses unique adaptive physiology. A total of 1.3 billion reads were generated for the six horses, with coverage between 15x and 24x per horse. After applying rigorous filtration, we identified and functionally annotated 17,514,723 Single Nucleotide Polymorphisms (SNPs) and 1,923,693 Insertions/Deletions (INDELs), as well as an average of 1,540 Copy Number Variations (CNVs) and 3,321 Structural Variations (SVs) per horse. Our results revealed putative functional variants, including variants in genes associated with size variation such as LCORL (found in all horses), ZFAT in the Arabian, American Miniature and Percheron horses, and ANKRD1 in the Native Mongolian Chakouyi horse. We detected a copy number variation in the Latherin gene that may be the result of evolutionary selection impacting thermoregulation by sweating, an important component of athleticism and heat tolerance. The newly discovered variants were formatted into user-friendly browser tracks and will provide a foundational database for future studies of the genetic underpinnings of diverse phenotypes within the horse.
Year: 2020 PMID: 32271776 PMCID: PMC7144971 DOI: 10.1371/journal.pone.0230899
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Yield, filtering and mapping summary of the next generation sequencing data of six horses from different breeds.
The depth of coverage and mapping metrics indicate decent sequencing quality and genome coverage for the six genomes.
| | ARB | PER | AMH | TWH | MM | CH |
|---|---|---|---|---|---|---|
| Number of paired-end reads before trimming | 241,480,555 | 296,460,133 | 324,123,384 | 198,749,393 | 169,680,137 | 142,502,233 |
| Read lengths | 100/100 | 100/100 | 100/100 | 150/150 | 150/150 | 150/150 |
| Estimated average depth of coverage before trimming | 17.8x | 21.96x | 24x | 22.0x | 18.9x | 15.833x |
| Number of paired-end reads after trimming | 165,277,009 | 187,223,705 | 138,772,441 | 161,659,278 | 134,732,394 | 121,744,242 |
| Total number of aligned reads | 330,554,018 | 374,447,410 | 277,544,882 | 323,318,556 | 269,464,788 | 243,488,484 |
| Estimated depth of coverage | 13.35619062 | 15.12972 | 11.21433 | 19.59576 | 16.33178 | 14.7574 |
| Percentage of mapped reads | 99.8% | 99.85% | 99.7% | 99.95% | 99% | 99.14% |
| Percentage of reads where both pairs mapped | 99.83% | 99.84% | 99.71% | 99.54% | 99.01% | 99.12% |
1 Calculated by multiplying the number of paired-end reads after trimming by 2.
2 Estimated from the total number of aligned reads using the formula C = L*N/G, where G is the haploid genome length (2,474,912,402), L is the read length and N is the number of reads.
3 Estimated using the bamtools stats command.
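The coverage formula in footnote 2 can be checked directly against the table. The following is a minimal sketch, not the authors' pipeline: the function name `depth_of_coverage` and the `samples` dictionary are illustrative, and the read lengths and aligned-read counts are copied from the rows above.

```python
# Haploid genome length G given in footnote 2.
GENOME_LENGTH = 2_474_912_402


def depth_of_coverage(read_length: int, n_reads: int,
                      genome_length: int = GENOME_LENGTH) -> float:
    """Estimate average depth of coverage as C = L * N / G."""
    return read_length * n_reads / genome_length


# (read length L, total aligned reads N) per horse, copied from the table.
samples = {
    "ARB": (100, 330_554_018),
    "PER": (100, 374_447_410),
    "AMH": (100, 277_544_882),
    "TWH": (150, 323_318_556),
    "MM": (150, 269_464_788),
    "CH": (150, 243_488_484),
}

coverage = {horse: depth_of_coverage(length, reads)
            for horse, (length, reads) in samples.items()}

for horse, c in coverage.items():
    print(f"{horse}: {c:.2f}x")
```

With these inputs the computed values agree with the "Estimated depth of coverage" row to the precision reported there, which also confirms that L is the length of a single read, not of the read pair.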
Genotype categories of SNPs and INDELs and counts of CNVs and SVs in the six horses.
| | ARB | PER | AMH | TWH | MM | CH |
|---|---|---|---|---|---|---|
| SNPs | | | | | | |
| Homozygous Reference | 11,555,233 | 10,905,683 | 10,908,015 | 10,795,424 | 10,052,722 | 10,259,979 |
| Heterozygous | 3,741,846 | 4,264,207 | 4,182,897 | 4,631,063 | 4,824,665 | 4,872,714 |
| Homozygous Alternative | 1,936,702 | 2,102,585 | 2,067,904 | 1,936,150 | 2,096,698 | 2,089,121 |
| Missing | 280,942 | 242,248 | 355,907 | 152,086 | 540,638 | 292,909 |
| Transitions | 4,936,199 | 5,473,518 | 5,298,618 | 5,331,846 | 5,801,750 | 5,662,053 |
| Transversions | 2,563,411 | 2,871,410 | 2,903,626 | 3,039,726 | 3,089,634 | 3,253,897 |
| INDELs | | | | | | |
| Homozygous Reference | 1,289,788 | 1,256,004 | 1,283,913 | 1,106,243 | 1,039,000 | 1,127,564 |
| Heterozygous | 361,414 | 380,707 | 340,107 | 547,182 | 554,852 | 507,378 |
| Homozygous Alternative | 217,931 | 230,637 | 217,597 | 238,297 | 235,883 | 248,055 |
| Missing | 54,560 | 56,345 | 82,076 | 31,971 | 93,958 | 40,696 |
| CNVs | | | | | | |
| Gains | 825 | 727 | 671 | 694 | 695 | 719 |
| Losses | 837 | 882 | 853 | 800 | 795 | 747 |
| SVs | | | | | | |
| Interchromosomal | 1,298 | 1,644 | 964 | 1,366 | 686 | 2,140 |
| Intrachromosomal | 2,986 | 2,336 | 1,696 | 1,750 | 928 | 263 |
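As a consistency check, the genotype categories for each horse should add up to the total number of SNP sites reported in the abstract (17,514,723). The sketch below assumes that the first four genotype rows of the table are the SNP counts, an interpretation based on the order given in the caption; the variable names are illustrative.

```python
# Total SNPs reported in the abstract.
TOTAL_SNPS = 17_514_723

# Per horse: (homozygous reference, heterozygous, homozygous alternative,
# missing), copied from the first four genotype rows of the table.
snp_genotypes = {
    "ARB": (11_555_233, 3_741_846, 1_936_702, 280_942),
    "PER": (10_905_683, 4_264_207, 2_102_585, 242_248),
    "AMH": (10_908_015, 4_182_897, 2_067_904, 355_907),
    "TWH": (10_795_424, 4_631_063, 1_936_150, 152_086),
    "MM": (10_052_722, 4_824_665, 2_096_698, 540_638),
    "CH": (10_259_979, 4_872_714, 2_089_121, 292_909),
}

# Sum the four categories for each horse and compare with the reported total.
totals = {horse: sum(counts) for horse, counts in snp_genotypes.items()}

for horse, total in totals.items():
    status = "matches" if total == TOTAL_SNPS else "differs from"
    print(f"{horse}: {total:,} {status} the reported total")
```

Every column sums to the reported total, which is consistent with each horse being genotyped at the same joint set of SNP sites.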
Real-time quantitative PCR primers.
| Gene | Forward primer | Reverse primer | Amplicon size (bp) |
|---|---|---|---|
| | | | 112 |
| | | | 207 |
| | | | 198 |
| | | | 157 |