Literature DB >> 34944310

Reverse Genetic Screen for Deleterious Recessive Variants in the Local Simmental Cattle Population of Switzerland.

Irene M Häfliger1, Franz R Seefried2, Cord Drögemüller1.   

Abstract

We herein report the result of a large-scale reverse genetic screen in the Swiss Simmental population, a local dual-purpose cattle breed. We aimed to detect possible recessively inherited variants affecting protein-coding genes, as such deleterious variants can impair fertility and rearing success significantly. We used 115,000 phased SNP data of almost 10 thousand cattle with pedigree data. This revealed evidence for 11 genomic regions of 1.17 Mb on average, with haplotypes (SH1 to SH11) showing a significant depletion in homozygosity and an allele frequency between 3.2 and 10.6%. For the proposed haplotypes, it was unfortunately not possible to evaluate associations with fertility traits as no corresponding data were available. For each haplotype region, possible candidate genes were listed based on their known function in development and disease. Subsequent mining of single-nucleotide variants and short indels in the genomes of 23 sequenced haplotype carriers allowed us to identify three perfectly linked candidate causative protein-changing variants: a SH5-related DIS3:p.Ile678fs loss-of-function variant, a SH8-related CYP2B6:p.Ile313Asn missense variant, and a SH9-related NUBPL:p.Ser143Tyr missense variant. None of these variants occurred in homozygous state in any of more than 5200 sequenced cattle of various breeds. Selection against these alleles in order to reduce reproductive failure and animal loss is recommended.

Entities:  

Keywords:  Bos taurus; SNP genotyping; breeding; cattle; embryonic lethality; fertility; loss-of-function variants; reproduction; whole-genome sequencing

Year:  2021        PMID: 34944310      PMCID: PMC8698008          DOI: 10.3390/ani11123535

Source DB:  PubMed          Journal:  Animals (Basel)        ISSN: 2076-2615            Impact factor:   2.752


1. Introduction

Simmental is a globally recognized cattle breed, originating in the Simmental valley in the canton Bern in Switzerland. Autochthonous of Switzerland, today up to 40 million cattle worldwide are designated as members of the Simmental breed. However, Simmental are characterized by very different local breeding objectives (http://wsff.info; accessed on 24 October 2021). In the 19th century, crossbreeding of local cattle with Simmental cattle exported from Switzerland led to Fleckvieh populations in neighboring countries, including Austrian and German Fleckvieh, French Montbeliarde, and Italian Pezzata Rossa cattle. In the 20th century, introduction of animals of the Holstein breed into the Swiss Simmental population led to today’s Fleckvieh population in Switzerland (Swiss Fleckvieh). A recent analysis of the population structure of Swiss cattle showed a clear differentiation between today’s Swiss Simmental cattle and all other Swiss cattle populations, including the modern Swiss Fleckvieh [1,2]. Various historically younger Central European Simmental populations that descended from the Swiss Simmental can also be clearly distinguished from today’s Swiss Simmental animals, which showed the highest inbreeding level [3]. In contrast, recent studies found a comparatively low degree of genomic inbreeding in purebred Swiss Simmental that might be explained by the continued use of natural service sires, which is likely the major reason for their remarkably high level of genetic diversity although this population has been closed for a long time [1,2]. Therefore, the current so-called Original Simmental breed of Switzerland represents a unique purebred population. At the end of 2020, more than 23 thousand dual-purpose animals in the Simmental population were registered in the Swiss herdbook (https://www.swissherdbook.ch/fileadmin/Domain1/PDF_Dokumente/05-Statistiken-Formulare/53-Jahresstatistik/1_Wichtigste_Zahlen/D_sh_JS_2020_HBZahlen_web.pdf; accessed on 24 October 2021). A universal problem in cattle breeding is reproductive failure. It was shown that the reproduction success is negatively associated with production traits [4,5]. These effects have been thoroughly studied and possible reasons are selection programs focusing on production traits coupled with negative correlation with reproductive traits [6]. Regarding early pregnancy loss in Simmental cattle, e.g., a Croatian study described the body condition score, parity, and milk yield as important influencing factors [7]. Lower reproductive success is an economic problem for farmers and a major reason for cattle slaughter [8,9]. To tackle fertility issues, one approach is to apply a so-called reverse genetic screen, where only genomic data is used rather than phenotypic information. A whole-genome sequencing (WGS)-based approach was suggested by Charlier et al. (2016), where sequencing data of entire genomes is mined for variants affecting fertility by causing embryonic lethality [10]. A similar approach proposes scanning the cumulative population-wide SNP genotyping data from genomic selection programs to identify haplotype regions indicative of a depletion in homozygosity [11]. The analysis includes the statistical evaluation of segregating haplotypes regarding the Hardy–Weinberg equilibrium (HWE) [11]. Such genomic regions harboring deficient homozygous haplotypes indicate potential causal variants for embryonic death or unwanted phenotypes in newborns, leading to the exclusion of homozygous animals from breeding programs [11]. This SNP-based reverse genetics approach has been used successfully in diverse breeds and species (e.g., [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]). Haplotypes never observed in homozygous state indicate the presence of recessive, predominantly embryonic lethal variants. Potential causative variants were detected by analyzing whole-genome or whole-exome sequencing data linked to the identified haplotypes (e.g., [10,19,20,21,22,23,24,25,26,27,28,29,30,32,33,34,35]). In general, these approaches are especially interesting for small populations where phenotypic information is sparse. In Simmental-derived breeds in Central Europe, eight causal protein-changing variants for recessively inherited disorders are known so far. In German Fleckvieh cattle, a SNP-based reverse genetic study identified four deficient homozygous haplotypes located on chromosomes 1, 10, and 12 and proposed two candidate causal variants in two genes: the FH2-related frameshift variant in SLC2A2 and the FH4-related missense variant in SUGT1 [21]. In addition, a fifth haplotype is described, affecting the calf survival due to a congenital heart failure and severe liver damage (https://www.lfl.bayern.de/itz/rind/122227/index.php; accessed on 24 October 2021). Based on forward genetic studies, where affected individuals were examined, several causal variants for Mendelian disorders were found. A frameshift variant in GON4L was associated with the autosomal recessively inherited disorder dwarfism (OMIA 001985-9913) [36]. Furthermore, a nonsense variant in PLD4 causing a recessive genodermatosis observed in German Fleckvieh (OMIA 001935-9913) and a missense variant in OPA3 associated with a form of dilated cardiomyopathy predominantly affecting Swiss Fleckvieh (OMIA 000162-9913) were described [37,38]. In German Fleckvieh, a MOCS1-related form of arachnomelia was identified to be due to a frameshift variant (OMIA 001541-9913) [39]. Furthermore, the BH2-related TUBD1 missense variant, known to cause juvenile mortality in Braunvieh cattle (OMIA 001939-9913), was also observed in German and Austrian Fleckvieh [23]. To the best of our knowledge, so far, no carriers for any of these eight deleterious alleles have been found in the purebred Swiss Simmental population. In purebred Swiss Simmental cattle, no genomic analysis has yet been carried out to systematically identify recessively inherited harmful variants that affect female reproduction or calf rearing. Based on a reverse genetic screen, the present study aimed at a comprehensive analysis of available SNP and WGS data to identify genomic regions containing deficient homozygous haplotypes as well as linked candidate causal variants for hidden phenotypes.

2. Materials and Methods

The national breeding association provided us with the available SNP data, including all genotyped purebred Swiss Simmental cattle born after 2009 and their ancestors. The genomic positions of the markers relate to the latest cattle reference sequence ARS-UCD1.2 [40,41]. Due to the application of several routinely available SNP arrays ranging from 9000 to 150,000 SNPs, the data had to be imputed. Therefore, the software Fimpute v2.2 [42] was used with default parameters to increase the number of markers and correct for wrongly called markers. In order to assure the quality, SNPs with a minor allele frequency <0.01 were excluded from the dataset. Furthermore, SNPs that could not pass the quality measures for the call rate per SNP >0.99 were excluded and animals with call rates <0.8 were excluded too. Quality control was applied before and after imputation. This resulted in a final SNP data set of 114,890 markers for 9965 animals. As a first step, haplotypes showing a deviation from the HWE, indicated by depletion of homozygosity, were identified. The first subset of data analyzed included only fully genotyped trios where the complete trio (sire, dam, and offspring) were genotyped (n = 2626), further called “trio” approach. The second dataset analyzed included genotyped trios where an offspring and two paternal animals (sire and maternal grandfather) were genotyped (n = 3969), subsequently called parent–grandparent “pgp” approach. Both data sets, trio and pgp, were used in screening for window size of 50 markers within the software snp1101 [43]. The analyzed haplotypes overlap, as the windows were continuously moved marker by marker. The snp1101 software used the Fisher exact test of HWE [44] to analyze the resulting haplotypes. Furthermore, the p-values were corrected for a false discovery rate with the Benjamini–Yekultieli method and a significance level of 5% was applied [45]. For each significant haplotype region, we selected the most significant haplotype (lowest p-value). Regarding the identification of candidate variants, haplotype regions were defined by the selected haplotypes and increased on both sides by 2 Mb in order to make sure that the regions of overlapping significant haplotypes were included. Within the 11 regions detected, between 14 and 262 genes were annotated, giving a total of 669 genes. To these genes, information regarding associated phenotypes were collected from the various online databases: HUGO Gene Nomenclature Committee (https://www.genenames.org; accessed on 7 October 2021), Mouse Genome Informatics (http://www.informatics.jax.org; accessed on 7 October 2021), International Mouse Phenotyping Consortium (https://www.mousephenotype.org; accessed on 7 October 2021), Online Mendelian Inheritance in Man (https://www.omim.org; accessed on 7 October 2021), Online Mendelian Inheritance in Animals (https://omia.org/home/; accessed on 7 October 2021), and Decipher (https://www.deciphergenomics.org; accessed on 7 October 2021). With the selected haplotypes, we predicted individual diplotypes that represent if an animal carries one, two, or no copies of the haplotype. Based on these diplotypes, we selected three carrier animals for whole-genome sequencing (WGS) for each haplotype region. Therefore, 23 Simmental cattle (1 female and 22 male) carrying one or more significant haplotypes were selected for this study. WGS data were prepared as previously described [46]. However, recalibration was performed with the variant catalogue of the 1000 Bull Genomes Project run 7 (BQSR file version 2) [47,48]. The 23 genomes sequenced in this study were submitted to the 1000 Bull Genomes project [47,48] and are therefore part of the 5116 animals in the recent variant catalogue (run9). This international dataset was used to evaluate candidate causative variants in a larger cohort and across breeds, to evaluate breed specificity as well as reduced homozygosity. Furthermore, we had access to an additional 115 publicly available genomes from the Swiss Comparative Bovine Resequencing project, deposited in the European Nucleotide Archive under project accession PRJEB18113, that were yet not added to the 1000 Bull Genomes project. The combined WGS dataset of 5231 genomes includes 62 purebred Swiss Simmental cattle. In order to identify haplotype-associated candidate causal variants, linkage disequilibrium (LD) analysis was performed using plink v1.9 [49]. LD (r2) was calculated between the diplotype state and the WGS-derived genotypes for the 23 animals that were selected for WGS. Analysis was restricted to protein-changing variants, including variants annotated by SnpEff (version 4.3; [50]) with the following sequence ontology terms: missense_variant, start_lost, stop_gained (nonsense), stop_lost, stop_retained_variant, splice_acceptor_variant, splice_donor_variant, conservative_inframe_deletion, conservative_inframe_insertion, disruptive_inframe_deletion, disruptive_inframe_insertion, exon_loss_variant, frameshift_variant, and gene_fusion. Furthermore, to improve the understanding of candidate causal variants, their base conservation scores from UCSC database called PhyloP and PhastCons were applied [51,52]. To apply these values, firstly, the variants of the bovine genome needed to be mapped to the human genome 38 [53] with the tool LiftOver of the UCSC tools. Secondly, the conservation scores of 99 vertebrates for these human positions were obtained. Additionally, the effects of protein-changing variant were estimated using PROVEAN [54] and PredictSNP [55].

3. Results

In this reverse genetic study, we used SNP data to identify haplotypes showing a depletion in homozygosity and applied WGS data to pinpoint candidate causal variants in purebred Swiss Simmental animals. Reduced homozygosity due to hidden recessive variants in cattle could, so far, only be causally explained by coding variants. Therefore, we focused on variants having a moderate impact, such as missense variants, conservative in-frame insertions and deletions up to a size of 50 bp, as well as on all other protein-changing with high impact including loss-of-function variants, such as stop-gains (nonsense), splice site-disrupting SNVs, frameshift indels in a coding sequence, or deletions that remove coding exons.

3.1. Identification of Deficient Homozygous Haplotypes in Swiss Simmental Cattle

We detected seven haplotype regions with the trio approach and nine haplotype regions applying the pgp approach (Table 1, Figure 1, Table S1). Five of these haplotype regions appear in both analyses. We named the haplotype regions, in accordance to previous studies performed in other cattle breeds, as Simmental Haplotypes (SH) 1 to 11 [22] (Table 1, Table S1). All 11 identified haplotypes show a deficiency of at least 85 percent of the expected homozygous animals within the studied Swiss Simmental population (Table 1, Table S1). Four selected haplotypes (SH5, SH7, SH8, and SH10) presented a complete deficit of observed homozygous animals, whereas the others showed a partial deficiency ranging from 85 to 96% of the expected homozygotes (Table 1; Table S1). The average length of the eleven haplotypes is 1.17 Mb and ranges from 0.73 to 1.94 Mb (Table 1).
Table 1

List of deficient homozygous haplotypes in Swiss Simmental cattle.

Number of Homozygotes
Haplotype aAnalysisChrPosition (Mb) bLength (Mb)ObservedExpectedDeficiency (%)Allele Frequency
SH1trio and pgp165.793–66.8911.10317850.042
SH2trio21.191–1.9200.73116940.040
SH3pgp7104.784–105.6400.86117940.041
SH4pgp1129.389–30.3530.96422850.047
SH5trio1246.828–47.8371.010381000.062
pgp46.831–47.8431.01
SH6pgp1411.304–12.3731.07215880.039
SH7trio166.120–8.0111.890301000.055
pgp6.122–8.0601.945111960.106
SH8pgp1848.763–50.0051.240111000.0337
trio48.806–50.0171.210.0338
SH9trio and pgp2141.855–42.8511.00219960.043
SH10pgp2441.945–43.1401.200101000.032
SH11trio2838.937–40.0691.13112920.035

a SH meaning Simmental Haplotype. b in accordance with the reference sequence ARS-UCD1.2.

Figure 1

Genomic co-localization of the deficient homozygous haplotype regions detected in Swiss Simmental cattle using two different approaches. A typical breed-specific cow is shown in the lower right corner.

3.2. Identification of Candidate Genes in Haplotype Regions

The intensive analysis of all annotated protein-coding genes in the defined haplotype regions, extended by 2 Mb on each side, led to a comprehensive list of candidate genes possibly affecting either prenatal or postnatal lethality or associated sub-lethal phenotypes. We extracted 145 positional candidate genes of special interest, as they are associated with mammalian autosomal recessive disorders in human, mice, or other animals and listed in the consulted databases (Table S2). Loss-of-function mouse models of many of these genes have revealed defects that affect embryonic or perinatal to pre-weaning survival and therefore represent suitable functional candidates for this study. The presented short list of the 43 most probable candidate genes includes all genes that are associated with sub-lethal or lethal phenotypes (Table 2).
Table 2

List of candidate genes known to cause recessive disorders with lethal and sub-lethal phenotypes in human.

HaplotypeGeneGene DescriptionRecessive Disorder
SH1 IQCB1 § IQ motif containing B1Senior-Loken syndrome 5
MYLK § myosin light chain kinaseMegacystis-microcolon-intestinal hypoperistalsis syndrome 1
CASR *calcium sensing receptorHyperparathyroidism, neonatal
POGLUT1 *protein O-glucosyltransferase 1Limb-girdle muscular dystrophy-dystroglycanopathy
TIMMDC1 *translocase of inner mitochondrial membrane domain containing 1Mitochondrial complex I deficiency, nuclear type 31
SH2HERC2 *HECT and RLD domain containing E3 ubiquitin protein ligase 2Mental retardation, MRT 38
OCA2 *OCA2 melanosomal transmembrane proteinAlbinism, brown oculocutaneous
SH4CRIPT *CXXC repeat containing interactor of PDZ3 domainShort stature with microcephaly and distinctive facies
EPCAM *epithelial cell adhesion moleculeCongenital diarrhea with tufting enteropathy
FSHR *follicle stimulating hormone receptorOvarian dysgenesis 1
MSH2 § mutS homolog 2Mismatch repair cancer syndrome 2
MSH6 mutS homolog 6Mismatch repair cancer syndrome 3
PIGF *phosphatidylinositol glycan anchor biosynthesis class FOnychodystrophy, osteodystrophy, impaired intellectual development, and seizures syndrome
PPP1R21 *protein phosphatase 1 regulatory subunit 21Neurodevelopmental disorder with hypotonia, facial dysmorphism, and brain abnormalities
TTC7A *tetratricopeptide repeat domain 7AGastrointestinal defects and immunodeficiency syndrome
SH5PIBF1 *progesterone immunomodulatory binding factor 1Joubert syndrome 33
SH6MYC *MYC proto-oncogene, bHLH transcription factorBurkitt lymphoma, somatic
SH7 CFH complement factor HBasal laminar drusen, complement factor H deficiency
CR2 § complement C3d receptor 2Immunodeficiency, CVID7
IL10 § interleukin 10Critical role in the control of immune responses
SH8 ARHGEF1 § Rho guanine nucleotide exchange factor 1Immunodeficiency 62
BCKDHA *branched chain keto acid dehydrogenase E1 subunit alphaMaple syrup urine disease, type Ia @
B9D2 *B9 domain containing 2Meckel syndrome 10, Joubert syndrome 34
COQ8B *coenzyme Q8BNephrotic syndrome, type 9
DLL3 § delta-like canonical Notch ligand 3Spondylocostal dysostosis 1
ERF *ETS2 repressor factorSpondylocostal dysostosis 1
ETHE1 § ETHE1 persulfide dioxygenaseEthylmalonic encephalopathy
LTBP4 § latent transforming growth factor beta binding protein 4Cutis laxa, autosomal recessive, type IC
MEGF8 *multiple EGF-like domains 8Carpenter syndrome 2
PLEKHG2 pleckstrin homology and RhoGEF domain containing G2Leukodystrophy and acquired microcephaly
RYR1 *ryanodine receptor 1Neuromuscular disease
SMG9 *SMG9 nonsense mediated mRNA decay factorHeart and brain malformation syndrome
SPINT2 *serine peptidase inhibitor, Kunitz type 2Diarrhea 3, secretory sodium, congenital, syndromic
SPTBN4 spectrin beta, non-erythrocytic 4Neurodevelopmental disorder with hypotonia, neuropathy, and deafness
TGFB1 *transforming growth factor beta 1Inflammatory bowel disease, immunodeficiency, and encephalopathy
TIMM50 *translocase of inner mitochondrial membrane 503-methylglutaconic aciduria, type IX
XRCC1 *X-ray repair cross complementing 1Spinocerebellar ataxia, SCAR26
SH9NUBPL *nucleotide binding protein-likeMitochondrial complex I deficiency, nuclear type 21
SH10AFG3L2 *AFG3-like matrix AAA peptidase subunit 2Spastic ataxia 5
LAMA1 *laminin subunit alpha 1Poretti–Boltshauser syndrome
MC2R § melanocortin 2 receptorGlucocorticoid deficiency due to ACTH unresponsiveness
NDUFV2 *NADH:ubiquinone oxidoreductase core subunit V2Mitochondrial complex I deficiency, nuclear type 7
PIEZO2 *piezo type mechanosensitive ion channel component 2Arthrogryposis, distal, with impaired proprioception and touch

* Homozygous lethal with complete penetrance, § homozygous lethal with incomplete penetrance, @ disorder described in cattle (OMIA-ID: 000627-9913).

3.3. Identification of Candidate Causal Variants

For three deficient homozygous haplotypes (SH5, SH8, and SH9), by linkage disequilibrium analysis, we found perfectly linked (r2 = 1) candidate causal variants. These three haplotypes were detected with both the pgp and the trio approach. For each of these haplotypes, we propose a protein changing SNV (Table 3; Table S3). These three variants never occur in homozygous state in the analyzed 5231 bovine genomes of various cattle breeds (Table S3). Interestingly, the SH8-associated variant is apparently specific for Swiss Simmental; however, the variants associated with SH5 and SH9 occur sporadically in single animals of some other breeds (Table S3).
Table 3

Candidate causal variants for deficiency of homozygotes in Swiss Simmental cattle.

HaplotypeGeneOMIMVariantTranscript aCoding DNA ChangePredicted Protein Change
SH5 DIS3 6075331 bp insertion (frameshift)XM_025000110.1c.2032dupAp.Ile678AsnTer2
SH8 CYP2B6 123930SNV (missense)NM_001075173.1c.938T > Ap.Ile313Asn
SH9 NUBPL 613621SNV (missense)NM_001193042.1c.428C > A p.Ser143Tyr

a According to the NCBI Annotation Release 106 (National Center for Biotechnology Information, 2018b).

Among the three proposed non-synonymous variants are two missense variants altering evolutionary conserved residues and a frameshift variant that significantly truncates the encoded protein. The SH8-related SNV in exon 4 of the bovine cytochrome P450 family 2 subfamily B member 6 (CYP2B6) gene on chromosome 18 at position 50296371 is a missense variant (NM_001075173.1: p.Ile313Asn) that was predicted by PROVEAN to have a deleterious effect (Table 3, Table S3). The SH9-related SNV located in exon 6 of the bovine nucleotide binding protein-like (NUBPL) gene on chromosome 21 at position 42154344 represents a missense variant (NM_001193042.1: p.Ser143Tyr) predicted to be deleterious by PROVEAN and PredictSNP (Figure 2, Table 3, Table S3). Both presented missense mutations altering evolutionary conserved amino acids (Figure 2). The SH5-associated 1 bp insertion located in exon 16 of the bovine DIS3 homolog, exosome endoribonuclease and 3’-5’ exoribonuclease (DIS3) gene on chromosome 12 at position 47511687 represents a loss-of-function variant. It was predicted to result in a frameshift after isoleucine 678 with a premature stop codon (NP_025000110.1: p.Ile678AsnTer2), resulting in a significantly shortened amino acid sequence, if expressed, when compared with the wild-type protein (Table 3; Table S3). In addition, the comparative DNA sequence approach (PhyloP and phastCons) showed a high conservation across species for all three variant positions (Table S3).
Figure 2

Multiple sequence alignment of the amino acid sequences of CYP2B6 (A) and NUBPL (B) around the position of the missense variants (shown in red), indicating evolutionary conservation across all species.

4. Discussion

For the first time, the genomic data of the current Swiss Simmental dual-purpose cattle population were analyzed for reduced homozygosity due to hidden recessive monogenic variants and validated using the international variant catalogue of the 1000 Bull Genomes Project. In addition to environmental factors, inherited deleterious variants lead to natural or artificial selection against homozygous individuals, which also explains embryonic lethality, reduce rearing success, or the exclusion from the breeding population due to poor development. Unfortunately, these phenomena are not systematically monitored and are therefore difficult phenotypes to study. Although reduced reproductive success can theoretically be detected in the sires estimated breeding values for certain fertility traits, these effects are only noticeable for deleterious alleles that have reached a high frequency in the population [56]. To overcome this issue, we performed a genome-wide missing homozygosity scan, revealing eleven haplotype regions with considerable homozygous depletion. After subsequent mining of genome sequence data for candidate causal variants, we propose three non-synonymous variants that probably cause the obvious deficiency of homozygous animals. We applied two different approaches, a trio and pgp approach, that include genotyped trios and applies a Fisher exact test. Therefore, we were able to detect haplotypes that segregate at a lower allele frequency. As expected, we found more haplotype regions with the pgp approach in comparison to the trio approach, most likely because more genotyped groups were available. Nevertheless, the trio approach, detecting two regions not found with pgp, appears very powerful, probably because it directly traces the inheritance of the haplotypes. Previous approaches used the assumption of random mating and the deviation from HWE based on allele frequencies or used the deviation from expected number of homozygous offspring based on the haplotype state of the sire and maternal grandsire [14,20]. In contrast, the herein applied trio approach allows performing such an analysis with a small population such as Simmental. The five haplotype regions that were found in both analyses are the most probable genome regions harboring hidden harmful variants. Especially the haplotypes that never occur in homozygous state led to the suspicion of embryonic lethal variants segregating in the population. It is suspected that some haplotypes arose due to imputation errors introduced due to genotyping bias, SNP density/panel, sample size, and a bias introduced by the chosen software [57]. This would explain the haplotype regions SH3 and SH11 that arose in regions for which we could not identify any plausible functional candidate genes. Otherwise, we were able to identify candidate genes within all missing homozygous regions. Unfortunately, effect estimations of the haplotypes towards traits of female fertility and rearing success were not reasonable, as the available phenotypic data is currently limited. To avoid the detection of mostly sporadic associations rather than actual effects, it is planned to conduct such haplotype association studies in the future. Candidate causal variants are proposed for the haplotypes SH5, SH8, and SH9 in the genes DIS3, CYP2B6, and NUBPL, respectively. These variants all show complete depletion in homozygosity, perfect LD to the associated haplotype, and high conservation scores when compared across 99 genomes, indicating their importance in basic biological functions. As in other mammals, it is expected that ~100 harmful recessive variants will be found per individual in cattle, of which up to five of these impact essential genes and cause embryonic lethality or severe disease when homozygous [10]. Nevertheless, it is recognized that it is very difficult to clarify the actual deleterious functions of these variants, although given the genes involved, it is assumed that these variants influence fitness. Reverse genetic screens to identify genes with major effects, as used in the current study, are therefore helpful to assign function to variants in candidate genes and/or so far less characterized genes such as DIS3 and CYP2B6. The SH5-related loss-of-function variant found in bovine DIS3, most likely leading to missing homozygosity, represents the first time a pathogenic variant that most likely causes embryonic lethality has been identified. If the mutant mRNA transcript were to escape nonsense-mediated decay, even if this truncated protein was expressed, it would lack roughly 30 percent of the C-terminal part, and therefore it is not expected to contribute any function. DIS3, also known as ribosomal RNA-processing protein 44 (RRP44), is a RNase II/R-like enzyme located primarily in the nucleus (Table S3) [58]. The protein has catalytic function in the RNA exosome complex, which is responsible for 3′-end processing and RNA degradation of a broad variety of RNAs [58,59]. Biological functions are associated with RNA metabolism, mitotic control, spindle-fiber formation, antibody diversification, microtubule production, and growth and development [60]. Recently, pathogenic variants in genes encoding both structural and catalytic subunits of the RNA exosome have been linked to human disease, such as EXOSC3 and EXOSC8 related forms of pontocerebellar hypoplasia, representing recessive neurodegenerative diseases [61]. To our knowledge, the DIS3 gene has not yet been associated with Mendelian diseases, but variants are reported to be associated with various types of cancers and multiple myeloma [60,61,62]. Interestingly, the book of Fasken et al. (2020) provides a summary of the most common variants in DIS3 which all occur in heterozygous state, are associated with multiple myeloma, and seem to have mild effects only, while Tomecki et al. (2014) suggests the potential lethality of mutations in the PIN domain of DIS3. In Drosophila, a knock-down model led to wingless animals implicating an important role in development [63]. Nevertheless, the inactivation of DIS3 in B cells was shown to lead to an increase in unbalanced DNA translocations [64]. Lastly, public databases for mouse phenotypes indicate a complete pre-weaning lethality of DIS3 knock-out mice (https://www.mousephenotype.org/data/genes/MGI:1919912; http://www.informatics.jax.org/marker/MGI:1919912; accessed on 7 October 2021). Despite a long list of candidate genes located within the SH8-region, we propose a missense variant in the CYP2B6 gene as a candidate causal variant. The main reason for this is the perfect LD to the haplotype, the complete absence of homozygous animals, and the prediction of the DNA position to be highly conserved and the amino acid exchange to be deleterious. Regarding the gene function, CYP2B6 is a protein of the cytochromes P450 subfamily 2B (HGNC: 20604). This enzyme is known to be of importance for drug metabolism, as well as endogenous compounds, environmental toxins, and other substances [65,66]. For example, the susceptibility to Efavirenz depends on the individual CYP2B6 genotype (OMIM: 123930). Several SNV were detected in human that are associated with the expression level and activity (increased and decreased) of CYP2B6 with a population-wide importance [65] (https://www.pharmvar.org/gene/CYP2B6; accessed on 26 October 2021). In monkeys and human, it was shown that CYP2B6 is expressed in the brain and affected by nicotine and alcohol consumption; however, its neurological function remains unclear [65,66,67]. The protein is also expressed in the placenta [68] and it was shown that the pregnancy hormone estradiol induces the expression of CYP2B6 [69]. Nevertheless, what the function and importance of CYP2B6 is in maintaining pregnancy is unclear. As the herein identified bovine variant is predicted to be deleterious, we speculate that function of CYP2B6 might be impaired during development. Lastly, we propose the missense variant in the bovine NUBPL gene, exchanging a strongly conserved residue predicted to be deleterious and affecting a highly conserved nucleotide, to be causal for the deficit of homozygosity of SH9. As we observe few haplotype carriers in our data, we speculate that the effect of the variant is not fully penetrant or that signs of poor development appear later in life, after initial genotyping of young animals. NUBPL, also known as iron-sulfur protein required for NADH dehydrogenase (IND1), is a protein that is vital for the assembly of the respiratory complex I [70]. More precisely, the NUBPL supplies Fe/S clusters to the respiratory complex I and thereby ensures that important subunits are delivered to build the whole complex [70]. In human, pathogenic variants affecting NUBPL are associated with the autosomal recessive mitochondrial complex I deficiency disorder (OMIM: 613621, 618242) [71,72,73]. These variants include missense, frameshift and splice site variants, as well as small and large insertions and deletions. Clinical symptoms of mitochondrial complex I deficiency include, among others, ataxia, dysarthria, hypotonia, nystagmus, spasticity, and tremor [73]. Furthermore, variants in NUBPL were hypothesized as risk factors for Parkinson’s disease [74]. A mouse model identified the necessity of the protein as knock-out alleles led to homozygous lethality (MGI: 1924076) [75]. However, if the proposed variants are indeed depleted in a number of homozygous animals, it needs to be evaluated by further genotyping of larger cohorts before implementation into selection schemes. This is planned by adding the variants to a custom array for genotyping larger cohorts of further animals, both Swiss Simmental as well as other local cattle populations. An alternative approach to confirm the absence of homozygous animals, particularly in the offspring of carrier-by-carrier mating’s, would strongly support the deleterious nature of the variants as shown before [10]. In particular, for the NUBPL-associated variant, the observed homozygous animals should be examined in detail. The chosen approach requires that the haplotype and the causative variant are in near perfect linkage disequilibrium and, obviously, this is not always the case. This could be an explanation for the fact that we could not identify any potentially causal variants in the other haplotype regions. An alternative approach, therefore, is to mine the genome sequence data for candidate variants, e.g., loss-of-function in the listed essential candidate genes, and to genotype these directly in large cohorts of Swiss Simmental cattle. Using this approach, nine causal variants were uncovered in cattle that would not have been detected using SNP-based haplotype approaches [10]. Finally, another drawback of our study was the restriction to protein-changing variants. Moreover, we had no evidence for perfectly linked non-coding regulatory variants, as well as the limitation to consider only SNV and short indels, overlooking possible larger structural variants.

5. Conclusions

In the presented project, we mined SNP and WGS data by applying a reverse genetic approach. Without any phenotypic evidence but mining the data of almost 10 thousand SNP genotyped Swiss Simmental cattle and more than 5200 WGS animals from a variety of breeds, we propose three candidate variants in the genes DIS3, CYP2B6, and NUBPL causing embryonic lethality and/or yet unknown recessive developmental disorders. After phenotypic validation of these variants, selection against these variants is recommended.
  66 in total

1.  The human core exosome interacts with differentially localized processive RNases: hDIS3 and hDIS3L.

Authors:  Rafal Tomecki; Maiken S Kristiansen; Søren Lykke-Andersen; Aleksander Chlebowski; Katja M Larsen; Roman J Szczesny; Karolina Drazkowska; Agnieszka Pastula; Jens S Andersen; Piotr P Stepien; Andrzej Dziembowski; Torben Heick Jensen
Journal:  EMBO J       Date:  2010-06-08       Impact factor: 11.598

2.  Estimating the effect of the deleterious recessive haplotypes AH1 and AH2 on reproduction performance of Ayrshire cattle.

Authors:  A R Guarini; M Sargolzaei; L F Brito; V Kroezen; D A L Lourenco; C F Baes; F Miglior; J B Cole; F S Schenkel
Journal:  J Dairy Sci       Date:  2019-04-04       Impact factor: 4.034

Review 3.  Cytochrome P450 2B6: function, genetics, and clinical relevance.

Authors:  Miia Turpeinen; Ulrich M Zanger
Journal:  Drug Metabol Drug Interact       Date:  2012

4.  A new approach for efficient genotype imputation using information from relatives.

Authors:  Mehdi Sargolzaei; Jacques P Chesnais; Flavio S Schenkel
Journal:  BMC Genomics       Date:  2014-06-17       Impact factor: 3.969

5.  A systematic survey to identify lethal recessive variation in highly managed pig populations.

Authors:  Martijn F L Derks; Hendrik-Jan Megens; Mirte Bosse; Marcos S Lopes; Barbara Harlizius; Martien A M Groenen
Journal:  BMC Genomics       Date:  2017-11-09       Impact factor: 3.969

6.  Identification of homozygous haplotypes carrying putative recessive lethal mutations that compromise fertility traits in French Lacaune dairy sheep.

Authors:  Stéphane Fabre; Carole Moreno-Romieux; Maxime Ben Braiek; Chris Hozé; Jean-Michel Astruc
Journal:  Genet Sel Evol       Date:  2021-05-01       Impact factor: 5.100

7.  Population structure and genomic inbreeding in nine Swiss dairy cattle populations.

Authors:  Heidi Signer-Hasler; Alexander Burren; Markus Neuditschko; Mirjam Frischknecht; Dorian Garrick; Christian Stricker; Birgit Gredler; Beat Bapst; Christine Flury
Journal:  Genet Sel Evol       Date:  2017-11-07       Impact factor: 4.297

8.  High-throughput discovery of novel developmental phenotypes.

Authors:  Mary E Dickinson; Ann M Flenniken; Xiao Ji; Lydia Teboul; Michael D Wong; Jacqueline K White; Terrence F Meehan; Wolfgang J Weninger; Henrik Westerberg; Hibret Adissu; Candice N Baker; Lynette Bower; James M Brown; L Brianna Caddle; Francesco Chiani; Dave Clary; James Cleak; Mark J Daly; James M Denegre; Brendan Doe; Mary E Dolan; Sarah M Edie; Helmut Fuchs; Valerie Gailus-Durner; Antonella Galli; Alessia Gambadoro; Juan Gallegos; Shiying Guo; Neil R Horner; Chih-Wei Hsu; Sara J Johnson; Sowmya Kalaga; Lance C Keith; Louise Lanoue; Thomas N Lawson; Monkol Lek; Manuel Mark; Susan Marschall; Jeremy Mason; Melissa L McElwee; Susan Newbigging; Lauryl M J Nutter; Kevin A Peterson; Ramiro Ramirez-Solis; Douglas J Rowland; Edward Ryder; Kaitlin E Samocha; John R Seavitt; Mohammed Selloum; Zsombor Szoke-Kovacs; Masaru Tamura; Amanda G Trainor; Ilinca Tudose; Shigeharu Wakana; Jonathan Warren; Olivia Wendling; David B West; Leeyean Wong; Atsushi Yoshiki; Daniel G MacArthur; Glauco P Tocchini-Valentini; Xiang Gao; Paul Flicek; Allan Bradley; William C Skarnes; Monica J Justice; Helen E Parkinson; Mark Moore; Sara Wells; Robert E Braun; Karen L Svenson; Martin Hrabe de Angelis; Yann Herault; Tim Mohun; Ann-Marie Mallon; R Mark Henkelman; Steve D M Brown; David J Adams; K C Kent Lloyd; Colin McKerlie; Arthur L Beaudet; Maja Bućan; Stephen A Murray
Journal:  Nature       Date:  2016-09-14       Impact factor: 49.962

9.  Discovering lethal alleles across the turkey genome using a transmission ratio distortion approach.

Authors:  E A Abdalla; S Id-Lahoucine; A Cánovas; J Casellas; F S Schenkel; B J Wood; C F Baes
Journal:  Anim Genet       Date:  2020-10-01       Impact factor: 3.169

Review 10.  NUBPL mitochondrial disease: new patients and review of the genetic and clinical spectrum.

Authors:  Virginia Kimonis; Rehab Al Dubaisi; Andrew E Maclean; Kathy Hall; Lan Weiss; Alexander E Stover; Philip H Schwartz; Bethany Berg; Cheng Cheng; Sumit Parikh; Blair R Conner; Sitao Wu; Anton N Hasso; Daryl A Scott; Mary Kay Koenig; Rachid Karam; Sha Tang; Moyra Smith; Elizabeth Chao; Janneke Balk; Eli Hatchwell; Peggy S Eis
Journal:  J Med Genet       Date:  2020-06-09       Impact factor: 5.941

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.