Literature DB >> 32753654

A genome-wide scan for candidate lethal variants in Thoroughbred horses.

Evelyn T Todd1, Peter C Thomson2, Natasha A Hamilton3, Rachel A Ang2, Gabriella Lindgren4,5, Åsa Viklund4, Susanne Eriksson4, Sofia Mikko4, Eric Strand6, Brandon D Velie2.   

Abstract

Domestic animal populations are often characterised by high rates of inbreeding and low effective population sizes due to selective breeding practices. These practices can result in otherwise rare recessive deleterious alleles drifting to high frequencies, resulting in reduced fertility rates. This study aimed to identify potential recessive lethal haplotypes in the Thoroughbred horse breed, a closed population that has been selectively bred for racing performance. In this study, we identified a haplotype in the LY49B gene that shows strong evidence of being homozygous lethal, despite having high frequencies of heterozygotes in Thoroughbreds and other domestic horse breeds. Variant analysis of whole-genome sequence data identified two SNPs in the 3'UTR of the LY49B gene that may result in loss of function. Analysis of transcriptomic data from equine embryonic tissue revealed that LY49B is expressed in the trophoblast during placentation stage of development. These findings suggest that LY49B may have an essential, but as yet unknown function in the implantation stage of equine development. Further investigation of this region may allow for the development of a genetic test to improve fertility rates in horse populations. Identification of other lethal variants could assist in improving natural levels of fertility in horse populations.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 32753654      PMCID: PMC7403398          DOI: 10.1038/s41598-020-68946-8

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

There is estimated to be a high rate of natural embryonic mortality in mammals. A large proportion of these embryonic losses occur soon after fertilisation, such that pregnancies often go undetected, with the only sign being reduced fertility[1]. Mutation screens in mice reveal that many genes are essential for development, with knockout of 29% of genes tested resulting in embryonic death by day 14[2,3]. Although mutations in these genes are expected to be under strong negative selection due to being completely deleterious, many species are estimated to carry between one and two recessive lethal mutations per genome[4]. However, single mutations are often uncommon in a population, such that unrelated individuals are unlikely to carry the same recessive lethal mutations[5-7]. The likelihood of an individual inheriting two copies of the same lethal mutation is dramatically increased by inbreeding events, whereby alleles that are identical by descent are inherited from a common ancestor[8-10]. In recent years, a number of studies in livestock have identified embryonic lethal mutations at high frequencies due to intensive selective breeding practices[11-18]. This is often due to a limited number of sires with desirable characteristics making large genetic contributions to the population[11,19]. Moreover, population bottlenecks due to domestication and breed formation have also resulted in increased deleterious mutation loads and diminished gene pools in many domestic breeds[20-23]. These processes lead to a reduction in genetic diversity indices including effective population size, which is defined as the theoretical population size that shows the same rate of loss in genetic diversity as the study population (N)[24,25]. Reduced genetic diversity can increase the risk of drift and inbreeding events in future generations of a population. Lethal mutations that have reached high frequencies are often detected by deviations from the Hardy–Weinberg equilibrium with a lack of homozygotes for one allele[12]. Characterisation of such mutations can assist in improving breeding decisions to increase fertility rates in these populations and prevent these mutations from drifting to higher frequencies[26,27]. The identification of high frequency lethal variants is of particular interest in domestic horse populations. Although a recent study has identified some candidate mutations[28], to date there has been no published comprehensive characterisation of common embryonic lethal alleles in horse populations. Despite the large variety of domestic horse breeds found throughout the world, many breeds suffer from low within-breed diversity and small N[29-31]. Some horse breeds with large census population sizes also experience low genetic diversity due to intense artificial selective breeding practices and closed population structures[30,31]. Maintaining good fertility rates is particularly important for horse populations due to the seasonal nature of breeding and the low individual fertility output, as mares produce only one foal from an eleven month gestation period[32]. Despite the extensive use of hormonal therapies to increase covering success in many domestic horse populations, per cycle pregnancy rates in some breeds only average around 65%, suggesting the presence of unknown variables that may reduce fertility[33]. In this study, we aimed to characterise variants at high frequencies that may cause lethality in the Thoroughbred horse population. The Thoroughbred breed is of particular interest due to the closed population structure since the foundation of the studbook in the eighteenth century[34]. The population has since been intensely selected for the improvement of athletic abilities[35,36], resulting in contemporary Thoroughbred horses being characterised by high levels of inbreeding and a small N[31, 37–39]. Due to selective breeding practices, all Thoroughbred horses can trace their ancestry back to a small number of individuals from the foundation of the breed[37,38]. Genetic diversity in the Thoroughbred breed has been reduced in recent decades due to the increased commercialisation of popular stallions providing large genetic contributions to the population[40]. Although such practices are in line with selective breeding principles[41], they could also inadvertently increase the frequency of embryonic lethal variants in the population. Reproductive technologies such as artificial insemination are banned in the Thoroughbred population, making the maintenance of high levels of natural fertility imperative. Additionally, Thoroughbred horses have been used as foundation stock for other popular horse breeds including The Quarter Horse, Standardbred, and many Warmblood breeds[30]. Therefore, identification of lethal variants in Thoroughbreds is also likely to assist in the breeding management of these populations. We also aimed to determine the frequency of any potentially lethal variants identified in the Thoroughbred population in other horse breeds and examine their transcriptomic profile in embryonic tissue.

Results

Identifying candidate lethal SNPs at high frequencies in Thoroughbred horses

Analysis of genotype data from Thoroughbred horses (n = 156) identified only two adjacent, linked SNPs that significantly deviated from the Hardy–Weinberg equilibrium with an absence of homozygotes (Table 1). Under Hardy–Weinberg equilibrium, seven minor allele homozygotes were expected for both of these SNPs in the dataset. Genotype data from Japanese Thoroughbred horses (n = 370) also showed an absence of homozygotes for these SNPs (Table 1). In this dataset, the expected number of minor homozygotes for these SNPs under Hardy–Weinberg equilibrium was nine. Despite a complete absence of homozygotes, almost 35% of Thoroughbreds across both of these datasets were heterozygous for this two-SNP haplotype. These SNPs also showed an absence or reduction of minor homozygotes in genotype data for other domestic horse breeds (n = 2030, Table 1).
Table 1

The allele frequencies of two adjacent SNPs with an absence of minor homozygotes in genotype data from two Thoroughbred horse datasets.

PopulationSample sizeReference6:382780976:38278874
Expected GGGGAGAAExpected CCCCACAA
Australian Thoroughbreds156Own data7*066907*06690
Japanese Thoroughbreds370Fawcett et al.[69]9*01172539*0117253
Swedish Warmblood380Privately provided, Ablondi et al.[70]4*0753044*074306
Coldblooded Trotter646Privately provided, Velie et al.[84]26*02583882822226393
Quarter Horse137Petersen et al.[65]17*0974017*09740
Exmoor Pony285Velie et al.[71]001279001282
Various breeds582Petersen et al.[30]1508549715085497

The expected number of minor homozygotes in each population was calculated under Hardy–Weinberg equilibrium. Observed genotype frequencies that significant deviate from Hardy–Weinberg equilibrium frequency expectations (p < 0.05) are denoted with an asterisk.

The allele frequencies of two adjacent SNPs with an absence of minor homozygotes in genotype data from two Thoroughbred horse datasets. The expected number of minor homozygotes in each population was calculated under Hardy–Weinberg equilibrium. Observed genotype frequencies that significant deviate from Hardy–Weinberg equilibrium frequency expectations (p < 0.05) are denoted with an asterisk. These two candidate SNPs mapped to the coordinates of 6:38278097 (rs68661802) and 6:38278874 (rs68663106), which are found in an intronic region of the LY49B gene on chromosome 6. This gene is part of the LY49 gene family, which plays an important role in innate immunity. There are five functional members of the LY49 gene family in Equus caballus, all of which closely grouped together on chromosome 6. Since both of the SNPs mapped to a non-coding region of the LY49B gene, the likelihood of either being a causal variant for lethality is low.

Phylogenetic origin of the candidate SNPs

According to the phylogenetic tree generated by Petersen et al.[29,30], and their associated SNP data, the SNPs of interest were present in heterozygous state across most phylogenetic branches of domestic horse breeds. Of the 32 breeds in this dataset, 23 had at least one heterozygote for both SNPs of interest. Notably, this two-SNP haplotype was not found in genotype data from one branch of the tree which contains the North Swedish Horse (n = 19), Norwegian Fjord Horse (n = 21) and Exmoor Pony (n = 24) (Table S1). A larger sample of Exmoor Pony data (n = 274, Table 1) found only one heterozygote for this haplotype.

Frequency of the candidate SNPs in other breeds

Analysis of SNP data from other domestic breeds showed that heterozygotes for the SNPs of interest were at a particularly high frequency in the Quarter Horse population (71%, n = 137) (Table 1, Table S2). The proportion of heterozygotes was also high in Swedish Warmbloods (n = 380) and Norwegian-Swedish Coldblooded Trotters (n = 641), being 20% and 40% respectively (Table 1). Smaller datasets also revealed that Belgian Draft (n = 19), French Trotter (n = 17), Paint (n = 15), Morgan (n = 19), Mongolian Paulista (n = 19) and Tuva (n = 15) breeds may also have a high proportion of heterozygotes for this haplotype in their populations (Table S1).

Identifying candidate causal variants using whole genome sequence data

To further investigate SNP frequencies in this region, variants were called from whole-genome sequence data of 90 domestic horses. The two SNPs identified in the preliminary analysis showed a complete absence of homozygotes for their minor alleles in these individuals (Table S3). Additionally, a number of variants closely linked to these SNPs were identified (Table S3, Figure S1). Annotation of these loci using SIFT[42] identified three variants that may result in changes to protein structure or expression, so these represent the most likely candidates to cause lethality in homozygous state (Fig. 1).
Figure 1

The equine LY49B gene structure and SNP positions. The two variants in the intronic region (6:38278097 and 6:38278874) were identified in preliminary analysis as showing a significant absence of homozygotes for one allele. The three variants marked with a * are in linkage disequilibrium to these SNPs and may cause a loss of function in homozygous state. The structure of the gene is based on the EquCab 2.0 reference genome where the LY49B gene is on the reverse strand.

The equine LY49B gene structure and SNP positions. The two variants in the intronic region (6:38278097 and 6:38278874) were identified in preliminary analysis as showing a significant absence of homozygotes for one allele. The three variants marked with a * are in linkage disequilibrium to these SNPs and may cause a loss of function in homozygous state. The structure of the gene is based on the EquCab 2.0 reference genome where the LY49B gene is on the reverse strand. The first of these variants, 6:38282610G > A (rs68663123), was located in an exonic region of the LY49B gene and resulted in an amino acid change from a phenylalanine to a serine residue. This substitution is located next to a tryptophan residue that appears to be highly conserved across members of the LY49 family and across species. However, there is little conservation of the phenylalanine residue across taxa; some species have a phenylalanine and others a serine at this position. This SNP is annotated as being “tolerated” in SIFT. Two other variants that were closely linked to the candidate SNPs (6:38276742A > T and 6:38276955G > A) were found within the 3′UTR (3′untranslated region) of the LY49B gene. Alignment of the 3′UTR of the five functional LY49 genes in Equus caballus revealed that the region containing the SNP 6:38276955G > A (rs1139567427) is highly conserved in all members of the LY49 gene family (Table 2). This region may be important for mRNA stability and translation into a functional protein. The other variant 6:38276742A > T (rs1137325172) was found in an AU-rich region at the end of the LY49B mRNA transcript, which is often associated with polyadenylation and post translation stability.
Table 2

Amino acid residue sequence in a conserved area of the 3′UTR found in all Equus caballus LY49 genes as mapped in the EquCab2.0 assembly.

GeneSequence
LY49BAAAGACTTTCTCAGGGCCATTAAAGAGATGGGAAACTGCTTTCCAAAGAC
LY49CAGAGAATTTCCCAGGGCCATTAAAGAGAAGAGCAACTGATTTCCAAAGAC
LY49DAGAGAATTTCTCAGGGCCATTAAAGAGAAGGGCAACTGATTTCCAAAGAC
LY49EAGAGAATTTCTCAGGGCCATTAAAGAGAAGGGCAACTGATTTCCAAAGAC
LY49FAGAGAATTTTGCAGGGTCATTAAAGAGAGGGGTAACTGCTTTCCAAAGAC

The SNP position of 6:38276955G > A is highlighted in bold.

Amino acid residue sequence in a conserved area of the 3′UTR found in all Equus caballus LY49 genes as mapped in the EquCab2.0 assembly. The SNP position of 6:38276955G > A is highlighted in bold.

Transcriptomic analysis of RNA sequence data

Measurable levels of LY49B mRNA were not detected in equine trophectoderm tissue collected on day 16 of development. However, LY49B mRNA was observed in trophectoderm tissue collected on days 23 and 24 of development (Table 3). Additionally, LY49B mRNA transcripts were detected in microarray data from equine chorion and chorionic girdle tissue between days 27 and 34 of development (Table S4). Inner cell mass tissue collected on days 15, 22 and 25 of development did not show any measurable transcription of LY49B (Table 3). The genotypes of the candidate SNPs in the mRNA samples analysed were unknown.
Table 3

Gene counts from RNA sequence data of three trophectoderm and three inner cell mass tissue samples from equine embryos.

TissueGene count (FPKM)
TrophectodermDay 15Day 22Day 25
0.0000.0310.024
Inner cell massDay 16Day 23Day 24
0.000.000.00

Transcript counts are in fragments per kilobase/million (FPKM).

Gene counts from RNA sequence data of three trophectoderm and three inner cell mass tissue samples from equine embryos. Transcript counts are in fragments per kilobase/million (FPKM).

Discussion

Analysis of genotype data identified a two-SNP haplotype as a strong candidate for harbouring a variant that causes lethality in homozygous state. The SNPs identified in this preliminary analysis mapped to an intronic region in the LY49B gene on ECA6 (Table 1). The LY49B gene belongs to the LY49 (Killer cell lectin-like receptor subfamily A) family of receptors, which consists of five functional members in Equus caballus[43]. Other species (including humans) have a functionally similar, but structurally different gene family called KIR (Killer cell immunoglobin line receptors)[44]. The LY49/KIR gene family are expressed across various types of immune cells, and mediate their function through bindings to MHC-1[45]. The LY49B gene is expressed in myeloid cells where it regulates their activity through an inhibitory effect, possibly to prevent their spontaneous activation[46]. Despite the important role that they play in immunity, the function of LY49 genes in development is currently unknown. In humans, incompatibilities between foetal KIR and maternal MHC (HLA) genotypes are associated with an increased risk of miscarriage and preeclampsia[47-49]. Additionally, knockdown of LY49 in mice showed a high rate of implantation failure[50,51]. These findings indicate that LY49B may play an important role in maternal/foetal compatibility and implantation success in horses. Analysis of transcriptomic data found that LY49B was first expressed in equine trophoblast tissue during the placental development stage. The first evidence of LY49B expression was found on day 23–24 of development (Table 3), during which the glycoprotein capsule surrounding the embryo is broken down and placental tissue starts to develop[52]. Measurable expression of LY49B was also found in chorion and chorionic girdle tissues between days 27 and 34 of development (Table S4). During this time, trophoblast cells rapidly proliferate to form the chorionic girdle, which then invades the endometrium to form epithelial cups[53]. It is possible that LY49B is important for successful implantation of the embryo by mediating the action of MHC-1 which is expressed during this time[54,55]. Further investigations into the role of LY49B in equine development would confirm whether impaired function causes lethality and the stage of development at which this occurs. Variant calling in whole-genome sequence data from 90 domestic horses further confirmed an absence of minor homozygotes for the two SNPs of interest. Three variants closely linked to these SNPs were also identified in these data as the most likely candidates to cause loss of function in the LY49B gene and result in lethality in homozygous state (Table S3). One SNP was a missense variant in the coding region of the LY49B gene that results in the substitution of a negatively charged serine for an aromatic phenylalanine residue. However, lack of conservation of this SNP in LY49 genes across taxa makes it seem unlikely to be a causative variant for embryonic lethality. Two other variants found in the 3′UTR of the LY49B gene were also closely linked to the SNPs identified in the preliminary analysis, and seemed more likely candidates to cause embryonic lethality in homozygous state. The 3′UTR of a gene is responsible for transcriptional stability through the binding of miRNAs and RNA binding proteins[56]. The addition of the polyadenylation tail to the 3′UTR is also essential to ensure proper processing and translation of the mRNA strand[57]. Mutations in the 3′UTR can lead to degradation of the mRNA, resulting in reduced or inhibited translation even when the gene is transcribed[58]. Variation in the 3′UTR of genes are associated with a number of diseases including Huntington’s and breast cancer in humans[59,60]. Additionally, SNPs in the 3′UTR are associated with traits in livestock including milk production in cows, muscularity in sheep and obesity in horses[58,61,62]. Despite the importance of the 3′UTR for the mRNA stability and normal expression of a gene, little is known about how specific polymorphisms can affect post-transcriptional processing. This makes it difficult to identify how the 3′UTR variants identified in this study could affect the translation of LY49B mRNA into a functional protein. The 3′UTR variant 6:38276955G > A was identified as a possible candidate for embryonic lethality because it is highly conserved between all members of the LY49B family (Table 2) in horses, so may play an important role in mRNA stability. The other 3′UTR variant (6:38276742A > T) is found in an AU-rich region at the end of the transcript, so may be important for the addition of the polyadenylation tail. Further examination of the effects that these variants have on post-transcriptional processing would determine if they impact the normal expression of LY49B in horses. Despite an absence of homozygotes, the two intronic SNPs identified in this study were found at high heterozygote frequencies in the Thoroughbred population. Mares are often covered multiple times in a season, which may explain why a more discernible reduction in fertility has not been observed as a result of the high frequency of this variant. However, the presence of lethal variants at high frequencies may result in more coverings being required for each mare in a season. Currently, there is no evidence that variation in the LY49B gene is associated with phenotypic advantages in horses. However, it is possible that one of the variants linked to these SNPs results in a phenotypic advantage in heterozygotes, which could explain why they have reached such high frequencies in the breed. It is also possible that selective breeding practices favouring a limited number of stallion bloodlines are responsible for this potentially lethal haplotype drifting to high frequencies in the Thoroughbred population. This would be most likely to occur if a stallion that made a large genetic contribution to the population was a carrier. A similar instance has recently been documented in cattle, where a lethal variant at a high frequency was traced back to a sire with an extensive genetic influence on the population[19]. The presence of this potentially lethal haplotype across many diverse breeds of domestic horses indicates that it may not be the result of a recent mutation present only in the Thoroughbred population. Rather, heterozygotes for this haplotype may have been present in pre-domesticated horses as a rare variant, and have become more frequent in some domestic breeds as the result of population bottlenecks due to breed formations and selective breeding practices. Domestication and breed formation events have been well documented to result in increased deleterious mutation loads in horses and other domestic species[22,31,63,64]. A high proportion of heterozygotes for this haplotype were found in some breeds closely related to the Thoroughbred including the Paint, French Trotter, Morgan and Quarter Horse. Notably, over 70% of Quarter Horse samples included in this study were heterozygous for these SNPs (Table 1). The Quarter Horse has an open stud book, and higher genetic diversity than the Thoroughbred population[65], making the high frequency of a potentially lethal haplotype at first surprising. The Quarter Horse dataset reportedly did not contain full or half siblings[65], but the collection of samples from one geographical area may not fully reflect the diversity of the worldwide population. An average relatedness analysis of these samples noted the large genetic influence of one particular Thoroughbred stallion[65], which may explain the high frequency of heterozygotes observed in this population. However, the extremely high frequency of heterozygotes in this breed may be due to selective breeding favouring these individuals. The Belgian Draft, Mangalarga Paulista and Tuva breeds also show a high proportion of heterozygotes, but are more distantly related to the Thoroughbred and to each other. Therefore, the high frequency of heterozygotes in these breeds may be due to independent genetic drift events. Heterozygotes for this haplotype were notably absent from one branch of the tree containing small heavy horses from Northern Europe, which are more distantly related to the Thoroughbred. A larger dataset of Exmoor Pony samples from this phylogenetic branch revealed one heterozygote for this haplotype (Table 1). This could be due to a calling error, but it is also possible that these SNPs exist at very low frequencies in these breeds. The small sample size of the genotype data for many individual breeds in this study means that heterozygote frequencies across all subpopulations found throughout the world may deviate from that reported here. However, these data provide an indication of breeds with high proportions of heterozygotes for this region. Analysis of SNP data from Northern-Swedish Coldblooded Trotters identified 22 homozygotes for the SNP at position 6:38278874. It is likely that there has been recombination between this SNP and the causal variant, and may appear more frequent in this population due to differences in breed history and recombination patterns[66]. However, additional analyses are required to explore this further. Overall, our findings suggest that this region shows evidence harbouring a homozygous lethal variant, yet a high proportion of heterozygotes are found across many domestic horse breeds. In this study, we identified a haplotype at high heterozygote frequencies in the Thoroughbred horse population that is a strong candidate for harbouring a variant causing lethality in homozygous state. Similar analyses on larger datasets in other livestock populations have identified multiple lethal haplotypes, so it is likely that other such variants are present at high frequencies in the Thoroughbred population but were not captured in this study. Additionally, the use of commercial SNP arrays only allows for the identification of variants with high minor allele frequencies in populations. Analysis of larger sample sizes, and using higher density genotype data could allow for identification of other variants associated with lethality in domestic horses. The identification of this potentially lethal haplotype demonstrates the potential implications of heavily favouring a limited number of bloodlines in selective breeding practices. Further characterisation of lethal haplotypes in other breeds would also assist in breeding management to increase per covering fertility rates in domestic horse populations.

Methods

DNA extractions

DNA was extracted from the hair samples of Australian Thoroughbred horses using the Qiagen Gentra Puregene Tissue Kit (Qiagen, Redwood City, CA, USA). DNA was extracted from the hair samples of Norwegian-Swedish Coldblooded Trotters and Swedish Warmbloods by incubating the samples for 2 h at 56 °C with Chelex 100 Resin (Bio-Rad Laboratories, Hercules, CA) and Proteinase K (20 mg/mL; Merck KgaA, Darmstadt, Germany). The Proteinase K was then inactivated by incubating for 10 min at 95 °C and DNA resuspended in low TE (1 mM Tris, 0.1 mM EDTA). DNA was extracted from blood samples using the Qiasymphony DSP DNA mini kit (Qiagen, Hilden, Germany).

Initial genotyping

Genotype data from a representative sample of Thoroughbreds were used to identify SNPs with a high proportion of heterozygotes, but an absence of homozygotes for one allele. Genome-wide SNP data were generated for 156 Australian Thoroughbred horses by genotyping samples on either the Illumina 70 K Chip (65,102 SNPs) (n = 102) or the Affymetrix 670 K Chip (670,796 SNPs) (n = 54). Common genotyped SNPs between the two arrays were scanned for deviations from the Hardy–Weinberg equilibrium with an absence of homozygotes for one allele using PLINK (version 1.9)[67]. The p values were adjusted using a false discovery rate correction with the R package “qvalue”[68]. Since SNPs with an absence of homozygotes could indicate a calling error, the search was narrowed to only include adjacent SNPs that fit such criteria. The frequencies of the candidate SNPs were then examined in publicly available genotype data from Japanese Thoroughbreds (n = 370) typed on the Affymetrix 670 K Chip[69] and these were added the Thoroughbred sample. The SNP frequencies were then characterised from genotype data from Swedish Warmbloods (n = 380)[70] and Norwegian-Swedish Coldblooded Trotters (n = 646)[71] typed on the Affymetrix 670 K Chip. Publicly available data from Exmoor Ponies (n = 285, typed on the Affymetrix 670 K Chip)[71], Quarter Horses (n = 137, typed on the Illumina 70 K Chip)[72] and horses of 32 different domestic breeds (n = 582, typed on the Illumina 50 K Chip)[29,30] were also included in this preliminary scan for SNP frequencies. In these data, raw intensities were plotted to check for calling errors. If potential calling errors were detected, SNPs were recalled using a mixture model fitted with an expectation–maximization algorithm in R.

Variant discovery and mapping

Publicly available whole-genome sequence data were used to further examine the frequencies of the candidate SNPs identified in the initial genotype analysis, and to identify linked variants. Paired end whole-genome sequence data from 90 horses of different domestic breeds were used in this analysis (Table S5). The whole genome datasets were downloaded from the European Nucleotide Archives (ENA, https://www.ebi.ac.uk/ena) which included horses of different domestic breeds (PRJEB14779, n = 70) and additional Thoroughbred samples (PRJNA168142, n = 16 and PRJNA184688, n = 4) (Table S5). The SNP array used in the initial genotyping analysis was developed based on coordinates of the EquCab2.0 reference genome. For consistency, we used the EquCab2.0 assembly as a reference for the whole-genome sequence analysis. The EquCab2.0 assembly was also used because of an issue with resolution in the area of interest on the newer EquCab3.0 assembly. The raw reads were mapped to the EquCab2.0 reference genome using BWA-MEM algorithm from Burrows-Wheeler Alignment Tool (version 0.7.17)[73]. Duplicate reads were flagged using Samblaster (version 0.1.22)[74], and base recalibration was performed using Genome Analysis Toolkit (GATK) (version 4.0.8.1)[75]. Variants (SNPs and INDELs (insertions and deletions)) were called using Haplotype Caller and then filtered using the standard hard filtering recommendations in GATK[76]. The individual SNPs were then filtered to only include high quality allele calls with an average filtered depth over 10 and a Phred score over 20. Variants that were linked to the SNPs identified from the genotype data were produced using the LD function in PLINK (version 1.9) with a window size of 5 Mb[67]. Only SNPs with an r2 value of over 0.8 and a D′ value > 0.9 were shortlisted. The effects of each SNP on gene structure and function was characterised using SIFT (version 4G)[42]. The conservation of variants across taxa was analysed using the NCBI Conserved Domain Database Search[77].

Transcriptomic analysis

Publicly available RNA sequence data were used to examine expression levels of the genes of interest in embryonic tissue. The data included equine inner cell mass tissue (collected at day 15, 22 and 25, n = 3) and trophectoderm tissue (collected at day 16, 23 and 24, n = 3) from the Functional Annotation of ANimal Genomes (FAANG) equine biobank (available from ENA under the project name PRJNA223157)[78]. Adaptors were trimmed using bbduk from BBtools (version 37.98)[79]. Reads were aligned to the EquCab 2.0 genome using STAR (version 2.7.2b)[80]. Counts were generated using featurecounts from Subread package(version 1.5.1)[81], then quantified in fragments per kilobase/million (FPKM) using the R package “edgeR”[82] with the Equus_caballus_Ensembl_94 file used for annotation. Microarray data for chorion (n = 19) and chorionic girdle (n = 19) tissue collected from horse embryos between days 27–34 of development[83] were also examined for gene expression levels.

Ethics statement

Hair samples from Australian Thoroughbred horses were collected under approval from University of Sydney Ethics Committee (Number: N00-2009-3-5109.) Written informed consent to use the animals in this study was obtained from the owners of the animals. The hair samples from Swedish Warmblood horses were originally collected for parentage testing and stored in the biobank at the Animal Genetics Laboratory, SLU so ethics approval was not applicable. Hair and blood samples of Norwegian-Swedish Coldblooded Trotters were collected under approval from the Ethics Committee for Animal Experiments in Uppsala, Sweden (Number: C 121/14). All the methods were performed in accordance with the guidelines set out by the respective Animal Ethics Committees and the guidelines contained in the Guide for the Care and Use of Laboratory Animals. No experimental procedure was performed on live animals. All other data was downloaded from publicly available repositories.

Data availability

The whole-genome sequence data used in this study is publicly available for The European Nucleotide Archive. Genotype data for Japanese Thoroughbreds, Exmoor Ponies, Quarter Horses and other domestic horse breeds can be found in the supplementary information of their respective papers (https://doi.org/10.1371/journal.pone.0218407, 10.1371/journal.pone.0152966, 10.1093/jhered/est079 and 10.1371/journal.pone.0054997). The exceptions are genotype data from Australian Thoroughbred, Swedish Warmbloods and North Swedish Coldblooded Trotters, which are available on request but restrictions apply to the availability of these data which were used under license for the current study and so are not publicly available. Supplementary information
  68 in total

Review 1.  The genetics of inbreeding depression.

Authors:  Deborah Charlesworth; John H Willis
Journal:  Nat Rev Genet       Date:  2009-11       Impact factor: 53.242

2.  Mutation rate and dominance of genes affecting viability in Drosophila melanogaster.

Authors:  T Mukai; S I Chigusa; L E Mettler; J F Crow
Journal:  Genetics       Date:  1972-10       Impact factor: 4.562

3.  Are Lethal Alleles Too Abundant in Humans?

Authors:  Mallory A Ballinger; Mohamed A F Noor
Journal:  Trends Genet       Date:  2017-12-28       Impact factor: 11.639

4.  Harmful recessive effects on fertility detected by absence of homozygous haplotypes.

Authors:  P M VanRaden; K M Olson; D J Null; J L Hutchison
Journal:  J Dairy Sci       Date:  2011-12       Impact factor: 4.034

5.  Identification of a large set of rare complete human knockouts.

Authors:  Patrick Sulem; Hannes Helgason; Asmundur Oddson; Hreinn Stefansson; Sigurjon A Gudjonsson; Florian Zink; Eirikur Hjartarson; Gunnar Th Sigurdsson; Adalbjorg Jonasdottir; Aslaug Jonasdottir; Asgeir Sigurdsson; Olafur Th Magnusson; Augustine Kong; Agnar Helgason; Hilma Holm; Unnur Thorsteinsdottir; Gisli Masson; Daniel F Gudbjartsson; Kari Stefansson
Journal:  Nat Genet       Date:  2015-03-25       Impact factor: 38.330

6.  An estimate of the average number of recessive lethal mutations carried by humans.

Authors:  Ziyue Gao; Darrel Waggoner; Matthew Stephens; Carole Ober; Molly Przeworski
Journal:  Genetics       Date:  2015-02-18       Impact factor: 4.562

7.  NGS-based reverse genetic screen for common embryonic lethal mutations compromising fertility in livestock.

Authors:  Carole Charlier; Wanbo Li; Chad Harland; Mathew Littlejohn; Wouter Coppieters; Frances Creagh; Steve Davis; Tom Druet; Pierre Faux; François Guillaume; Latifa Karim; Mike Keehan; Naveen Kumar Kadri; Nico Tamma; Richard Spelman; Michel Georges
Journal:  Genome Res       Date:  2016-09-19       Impact factor: 9.043

8.  Balancing selection of a frame-shift mutation in the MRC2 gene accounts for the outbreak of the Crooked Tail Syndrome in Belgian Blue Cattle.

Authors:  Corinne Fasquelle; Arnaud Sartelet; Wanbo Li; Marc Dive; Nico Tamma; Charles Michaux; Tom Druet; Ivo J Huijbers; Clare M Isacke; Wouter Coppieters; Michel Georges; Carole Charlier
Journal:  PLoS Genet       Date:  2009-09-25       Impact factor: 5.917

9.  Identification of a nonsense mutation in CWC15 associated with decreased reproductive efficiency in Jersey cattle.

Authors:  Tad S Sonstegard; John B Cole; Paul M VanRaden; Curtis P Van Tassell; Daniel J Null; Steven G Schroeder; Derek Bickhart; Matthew C McClure
Journal:  PLoS One       Date:  2013-01-22       Impact factor: 3.240

10.  Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes.

Authors:  Jacqueline K White; Anna-Karin Gerdin; Natasha A Karp; Ed Ryder; Marija Buljan; James N Bussell; Jennifer Salisbury; Simon Clare; Neil J Ingham; Christine Podrini; Richard Houghton; Jeanne Estabel; Joanna R Bottomley; David G Melvin; David Sunter; Niels C Adams; David Tannahill; Darren W Logan; Daniel G Macarthur; Jonathan Flint; Vinit B Mahajan; Stephen H Tsang; Ian Smyth; Fiona M Watt; William C Skarnes; Gordon Dougan; David J Adams; Ramiro Ramirez-Solis; Allan Bradley; Karen P Steel
Journal:  Cell       Date:  2013-07-18       Impact factor: 41.582

View more
  3 in total

Review 1.  Impaired Reproductive Function in Equines: From Genetics to Genomics.

Authors:  Nora Laseca; Gabriel Anaya; Zahira Peña; Yamila Pirosanto; Antonio Molina; Sebastián Demyda Peyrás
Journal:  Animals (Basel)       Date:  2021-02-03       Impact factor: 2.752

2.  Development and validation of a horse reference panel for genotype imputation.

Authors:  Paula Reich; Clemens Falker-Gieske; Torsten Pook; Jens Tetens
Journal:  Genet Sel Evol       Date:  2022-07-04       Impact factor: 5.100

3.  Variations in the fecal microbiota and their functions of Thoroughbred, Mongolian, and Hybrid horses.

Authors:  Xiaohui Wen; Shengjun Luo; Dianhong Lv; Chunling Jia; Xiurong Zhou; Qi Zhai; Li Xi; Caijuan Yang
Journal:  Front Vet Sci       Date:  2022-07-28
  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.