
Assessing genomic diversity and signatures of selection in Original Braunvieh cattle using whole-genome sequencing data.

Meenu Bhati1, Naveen Kumar Kadri2, Danang Crysnanto2, Hubert Pausch2.   

Abstract

BACKGROUND: Autochthonous cattle breeds are an important source of genetic variation because they might carry alleles that enable them to adapt to local environment and food conditions. Original Braunvieh (OB) is a local cattle breed of Switzerland used for beef and milk production in alpine areas. Using whole-genome sequencing (WGS) data of 49 key ancestors, we characterize genomic diversity, genomic inbreeding, and signatures of selection in Swiss OB cattle at nucleotide resolution.
RESULTS: We annotated 15,722,811 SNPs and 1,580,878 Indels including 10,738 and 2763 missense deleterious and high impact variants, respectively, that were discovered in 49 OB key ancestors. Six Mendelian trait-associated variants that were previously detected in breeds other than OB segregated in the sequenced key ancestors, including variants causal for recessive xanthinuria and albinism. The average nucleotide diversity (1.6 × 10⁻³) was higher in OB than in many mainstream European cattle breeds. Accordingly, the average genomic inbreeding derived from runs of homozygosity (ROH) was relatively low (FROH = 0.14) in the 49 OB key ancestor animals. However, genomic inbreeding was higher in OB cattle of more recent generations (FROH = 0.16) due to a higher number of long (> 1 Mb) runs of homozygosity. Using two complementary approaches, composite likelihood ratio test and integrated haplotype score, we identified 95 and 162 genomic regions encompassing 136 and 157 protein-coding genes, respectively, that showed evidence (P < 0.005) of past and ongoing selection. These selection signals were enriched for quantitative trait loci related to beef traits including meat quality, feed efficiency and body weight and pathways related to blood coagulation, nervous and sensory stimulus.
CONCLUSIONS: We provide a comprehensive overview of sequence variation in Swiss OB cattle genomes. With WGS data, we observe higher genomic diversity and less inbreeding in OB than many European mainstream cattle breeds. Footprints of selection were detected in genomic regions that are possibly relevant for meat quality and adaptation to local environmental conditions. Considering that the population size is low and genomic inbreeding increased in the past generations, the implementation of optimal mating strategies seems warranted to maintain genetic diversity in the Swiss OB cattle population.

Year:  2020        PMID: 31914939      PMCID: PMC6950892          DOI: 10.1186/s12864-020-6446-y

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Introduction

Following the domestication of cattle, both natural and artificial selection led to the formation of breeds with distinct phenotypic characteristics including morphological, physiological and adaptability traits [1]. With an increasing demand for animal-based food products, a few breeds were intensively selected for high milk (e.g., Holstein, Brown Swiss) and beef (e.g., Angus) production. The predominant selection of cattle from specialized breeds caused a sharp decline in the population size of local breeds [2, 3]. Although less productive under intensive production conditions, local breeds of cattle might carry alleles that enable them to adapt to local conditions. Therefore, local breeds represent an important genetic resource to facilitate animal breeding in the future under challenging and changing production conditions [4, 5]. Characterizing the genetic diversity of local cattle breeds is important to optimally manage these genetic resources. The Swiss Original Braunvieh (OB) cattle breed is a dual-purpose taurine cattle breed that is used for beef and milk production in alpine areas [6, 7]. In transhumance, the cattle graze at alpine pastures (between 1000 and 2400 m above sea level) during the summer months and return to the stables for the winter months [7]. Mainly due to their strong and firm legs and claws, OB cattle are well adapted to the alpine terrain. Under extensive farming conditions, OB cattle may outperform specialized dairy breeds in terms of fertility, longevity and health status [8]. However, in the early 1960s, Swiss cattle breeders began inseminating OB cows with semen from US Brown Swiss sires to increase milk yield, reduce calving difficulties and improve mammary gland morphology of the Swiss OB cattle population [9]. The extensive cross-breeding of OB cows with Brown Swiss sires decreased the number of female OB calves entering the herd book to less than 2000 by the mid-1990s [9] (Additional file 1). 
Since then, the Swiss OB population increased steadily, facilitated by governmental subsidies. A number of studies investigated the genomic diversity and population structure of the Swiss OB cattle breed using either pedigree or microarray data [9, 10]. In spite of the small population size, genetic diversity is higher in OB than many commercial breeds likely due to the use of many sires in natural mating and lower use of artificial insemination [9, 10]. Genomic inbreeding and footprints of selection have been compared between OB and other Swiss cattle breeds using SNP microarray-derived genotypes [10]. Because the SNP microarrays were designed in a way that they interrogate genetic markers that are common in the mainstream breeds of cattle, they might be less informative for breeds of cattle that are diverged from the mainstream breeds [11]. Ascertainment bias is inherent in the resulting genotype data because rare, breed-specific, and less-accessible genetic variants are underrepresented among the microarray-derived genotypes [12]. This limitation causes observed allele frequency distributions to deviate from expectations which can distort population genetics estimates [13]. With the availability of whole genome sequencing (WGS), it has become possible to discover sequence variant genotypes at population scale [14]. While sequence variant genotypes might be biased toward the reference allele, this reference bias is less of a concern when the sequencing coverage is high [15]. According to Boitard et al. 2016 [16], WGS data facilitate detecting selection signatures at higher resolution than SNP microarray data. Moreover, the WGS-based detection of runs of homozygosity (ROH) is more sensitive for short ROH that are typically missed using SNP microarray-derived genotypes. In the present study, we analyze more than 17 million WGS variants of 49 key ancestors of the Swiss OB cattle breed that were sequenced to an average fold-coverage of 12.75 per animal [17]. 
These data enabled us to assess genomic diversity and detect signatures of past or ongoing selection in the breed at nucleotide resolution. Moreover, we estimate genomic inbreeding in the population using runs of homozygosity.

Results

Overview of genomic diversity in OB cattle

We annotated 15,722,811 biallelic SNPs and 1,580,878 Indels that were discovered in 49 OB cattle [17]. The average genome-wide nucleotide diversity within the OB breed was 0.001637/bp. Among the detected variants, 546,419 (3.5%) SNPs and 307,847 (19.5%) Indels were novel when compared to the 102,090,847 polymorphic sites of the NCBI bovine dbSNP database version 150. Functional annotation of the polymorphic sites revealed that the vast majority of SNPs were located in either intergenic (73.8%) or intronic regions (25.2%). Only 1% of SNPs (160,707) were located in the exonic regions (Table 1). In protein-coding sequences, we detected 58,387, 47,429 and 1264 synonymous, missense, and high impact SNPs, respectively. According to the SIFT scoring, 10,738 missense SNPs were classified as likely deleterious to protein function (SIFT score < 0.05). Among the high impact variants, we detected 580, 33, 106, 273 and 272 stop gain, stop lost, start lost, splice donor and splice acceptor variants, respectively. Deleterious and high impact variants were more frequent in the low than in the high allele frequency classes (Additional file 2).
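The per-base nucleotide diversity reported above can be illustrated with a minimal sketch. This section does not state which software or estimator was used, so the code below assumes the standard unbiased pairwise estimator for biallelic sites; the function names and the toy allele counts are hypothetical and are not taken from the study.

```python
def site_diversity(alt_count: int, n_chromosomes: int) -> float:
    """Expected number of pairwise differences at one biallelic site.

    Assumed estimator: 2 * k * (n - k) / (n * (n - 1)), where k is the
    alternate allele count and n the number of sampled chromosomes.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two chromosomes")
    k, n = alt_count, n_chromosomes
    return 2.0 * k * (n - k) / (n * (n - 1))


def nucleotide_diversity(alt_counts, n_chromosomes: int, region_length_bp: int) -> float:
    """Average pairwise differences per base pair over a region."""
    total = sum(site_diversity(k, n_chromosomes) for k in alt_counts)
    return total / region_length_bp


# Toy example (hypothetical counts): three variable sites in a 1000 bp
# window, genotyped in 49 diploid animals, i.e. 98 chromosomes.
pi = nucleotide_diversity([1, 49, 20], n_chromosomes=98, region_length_bp=1000)
print(f"pi = {pi:.6f} per bp")
```

Monomorphic positions contribute zero, so only variable sites need to be listed while the denominator remains the full length of the region.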
Table 1

Number of SNPs and Indels in sequence ontology classes annotated using the VEP software

Sequence ontology class             SNP           Indel
splice_acceptor_variant             272           84
splice_donor_variant                273           71
stop_gained                         580           16
frameshift_variant                  0             1324
stop_lost                           33            0
start_lost                          106           4
inframe_insertion                   0             290
inframe_deletion                    0             440
missense_variant                    47,429        0
protein_altering_variant            0             12
splice_region_variant               9553          1059
stop_retained_variant               45            2
synonymous_variant                  58,387        0
coding_sequence_variant             166           125
mature_miRNA_variant                83            23
5_prime_UTR_variant                 6744          600
3_prime_UTR_variant                 30,716        4074
non_coding_transcript_exon_variant  6296          434
intron_variant                      3,960,673     422,764
non_coding_transcript_variant       24            25
upstream_gene_variant               526,000       56,715
downstream_gene_variant             454,753       51,672
intergenic_variant                  10,620,678    1,041,144
Total                               15,722,811    1,580,878
The majority of 1,580,878 Indels were detected in either intergenic (72.7%) or intronic (26.7%) regions. Only 2213 (0.14%) Indels affected coding sequences. Among these, 1499 were classified as high impact variants including 1324, 16, 4, 71 and 84 frameshift, stop gain, start lost, splice donor and splice acceptor variants, respectively. Similar to previous studies in cattle [14, 18], coding regions were enriched for Indels with lengths in multiples of three indicating that they are less likely to be deleterious to protein function than frameshift variants (Additional file 3).
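The distinction between in-frame and frameshift Indels rests on whether the net length change is a multiple of three. A minimal sketch of that rule follows; the function name and the toy alleles are hypothetical, and the rule is only meaningful for Indels that lie entirely within coding sequence.

```python
def indel_frame_effect(ref: str, alt: str) -> str:
    """Classify a coding Indel by its net length change.

    A net change that is a multiple of three keeps the reading frame;
    any other non-zero change shifts it.
    """
    net_change = len(alt) - len(ref)
    if net_change == 0:
        return "not_an_indel"
    return "inframe" if abs(net_change) % 3 == 0 else "frameshift"


# Toy alleles (hypothetical): a 3 bp insertion and a 2 bp deletion.
print(indel_frame_effect("A", "ACGT"))   # net +3
print(indel_frame_effect("ACG", "A"))    # net -2
```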

OMIA variants segregating in the OB population

We obtained genomic coordinates of 155 variants that are associated with Mendelian traits in cattle from the OMIA database to analyze if they segregate among the 49 OB cattle. It turned out that six OMIA variants were also detected in the 49 OB cattle including two variants in the MOCOS and SLC45A2 genes that are associated with severe recessive disorders (Additional file 4). Two OB key ancestor bulls born in 1967 and 1974 (ENA SRA sample accession numbers SAMEA4827662 and SAMEA4827664) were heterozygous carriers of a single base pair deletion (BTA24:g.21222030delC) in the MOCOS gene (OMIA 001819–9913) that causes xanthinuria in the homozygous state in Tyrolean grey cattle [19]. Another two OB key ancestor bulls (sire and son; ENA SRA sample accession numbers SAMEA4827659 and SAMEA4827645) that were born in 1967 and 1973 were heterozygous carriers of two missense variants in SLC45A2 (BTA20:g.39829806G > A and BTA20:g.39864148C > T) that are associated with oculocutaneous albinism (OMIA 001821–9913) in Braunvieh cattle [20].
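To put the carrier counts into perspective, the sketch below converts genotype counts into an allele frequency and into the expected proportion of homozygous offspring. The assumption of random mating (Hardy-Weinberg proportions) is ours and is not made in the text; the function names are hypothetical.

```python
def allele_frequency(n_het: int, n_hom_alt: int, n_animals: int) -> float:
    """Alternate allele frequency from diploid genotype counts."""
    return (n_het + 2 * n_hom_alt) / (2 * n_animals)


def expected_homozygous_fraction(q: float) -> float:
    """Expected fraction of homozygous animals, assuming random mating."""
    return q * q


# Two heterozygous carriers among 49 sequenced animals, as reported for
# the MOCOS deletion.
q = allele_frequency(n_het=2, n_hom_alt=0, n_animals=49)
print(f"allele frequency: {q:.4f}")
print(f"expected homozygous fraction: {expected_homozygous_fraction(q):.6f}")
```

Because the 49 animals are key ancestors rather than a random sample, such a frequency should be read as a rough indication only.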

Runs of homozygosity and genomic inbreeding

Runs of homozygosity were analyzed in 33 OB animals that had an average sequencing depth greater than 10-fold. We found 2044 ± 79 autosomal ROH per individual with an average length of 179 kb ± 17.6 kb. The length of the ROH ranged from 50 kb (minimum size considered, see methods) to 5,025,959 bp. On average, 14.58% of the genome (excluding the sex chromosomes) was in ROH (Additional file 5). Average genomic inbreeding for the 29 autosomes ranged from 11.5% (BTA29) to 18.6% (BTA26) (Fig. 1a).
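Genomic inbreeding from ROH is commonly defined as the fraction of the autosomal genome covered by ROH. The sketch below assumes that definition together with the 50 kb minimum length mentioned above; the function name, the segment coordinates and the genome length in the example are hypothetical.

```python
def froh(roh_segments, autosome_length_bp: int, min_length_bp: int = 50_000) -> float:
    """Fraction of the autosomal genome covered by ROH of sufficient length.

    roh_segments: iterable of (start_bp, end_bp) tuples for one animal,
    assumed to be non-overlapping.
    """
    covered = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return covered / autosome_length_bp


# Toy example (hypothetical coordinates): the 30 kb segment is ignored
# because it is shorter than the 50 kb minimum.
segments = [(0, 100_000), (200_000, 230_000)]
print(froh(segments, autosome_length_bp=1_000_000))
```

As a rough consistency check under this definition, 2044 ROH with an average length of 179 kb cover about 366 Mb, which corresponds to 14.58% of an autosomal genome of roughly 2.5 Gb.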
Fig. 1

ROH in 33 OB cattle with average sequencing depth greater than 10-fold. a Average genomic inbreeding and corresponding standard error for the 29 autosomes. b Average genomic inbreeding (FROH) calculated from short (50–100 kb), medium (0.1–2 Mb) and long (> 2 Mb) ROH. (c) Average number of short, medium and long ROH

In order to study the demography of the OB population, we calculated the contributions of short, medium and long ROH to the total genomic inbreeding (Additional file 5). The medium-sized ROH were the most frequent class (50.46%), and contributed most (75.01%) to the total genomic inbreeding. While short ROH occurred almost as frequently (49.17%) as medium-sized ROH, they contributed only 19.52% to total genomic inbreeding (Fig. 1b & c; Additional file 5). Long ROH were rarely (0.36%) observed among the OB key ancestors and contributed little (5.47%) to total genomic inbreeding. The number of long ROH was correlated (r = 0.77) with genomic inbreeding. Genomic inbreeding (FROH) was significantly (P = 0.0002) higher in 20 animals born between 1990 and 2012 than in 13 animals born between 1965 and 1989 (0.16 vs. 0.14) (Additional file 6). The higher FROH in animals born in more recent generations was mainly due to more long (> 2 Mb; P = 0.00004) and medium-sized ROH (0.1–1 Mb; P = 0.001) (Fig. 2).
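The difference between how often a length class occurs and how much it contributes to inbreeding can be sketched as follows. The class boundaries follow the caption of Fig. 1 (short 50–100 kb, medium 0.1–2 Mb, long > 2 Mb); how exact boundary values are assigned is our assumption, and the function names and toy lengths are hypothetical.

```python
from collections import Counter


def roh_class(length_bp: int):
    """Assign an ROH to a length class; segments below 50 kb are ignored."""
    if length_bp < 50_000:
        return None
    if length_bp < 100_000:
        return "short"
    if length_bp <= 2_000_000:
        return "medium"
    return "long"


def class_shares(lengths_bp):
    """Return {class: (share of ROH count, share of total ROH length)}."""
    counts, covered = Counter(), Counter()
    for length in lengths_bp:
        cls = roh_class(length)
        if cls is None:
            continue
        counts[cls] += 1
        covered[cls] += length
    n_total, bp_total = sum(counts.values()), sum(covered.values())
    if n_total == 0:
        raise ValueError("no ROH of at least 50 kb")
    return {
        cls: (counts[cls] / n_total, covered[cls] / bp_total)
        for cls in ("short", "medium", "long")
    }


# Toy example (hypothetical lengths): short ROH are frequent but cover
# little sequence, whereas a single long ROH covers most of it.
shares = class_shares([60_000, 60_000, 500_000, 3_000_000])
for cls, (count_share, length_share) in shares.items():
    print(f"{cls}: {count_share:.2%} of ROH, {length_share:.2%} of ROH length")
```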
Fig. 2

Cumulative genomic inbreeding (%) in animals born between 1965 and 1989 (blue lines) and 1990–2012 (red lines) from ROH sorted on length and binned in windows of 10 kb. Thin dashed lines represent individuals and thick solid lines represent the average cumulative genomic inbreeding of the two groups of animals


Signatures of selection

We identified candidate signatures of selection using two complementary methods: the composite likelihood ratio (CLR) test and the integrated haplotype score (iHS) (Fig. 3a & b). The CLR test detects ‘hard sweeps’ at genomic regions where beneficial adaptive alleles recently reached fixation [21]. The iHS detects ‘soft sweeps’ at genomic regions where selection for beneficial alleles is still ongoing [22, 23]. We detected 95 and 162 candidate regions of signatures of selection (P < 0.005) using CLR and iHS, respectively, encompassing 12.56 Mb and 12.48 Mb (Additional file 7; Additional file 8). These candidate signatures of selection were not evenly distributed over the genome (Fig. 3c). Functional annotation revealed that 136 and 157 protein-coding genes overlapped with 50 and 86 candidate regions from CLR and iHS analyses, respectively. All other candidate signatures of selection were located in intergenic regions. Closer inspection of the top selection regions of both analyses revealed that 16 CLR candidate regions overlapped with 25 iHS candidate regions on chromosomes 5, 7, 11, 14, 15, 17 and 26 (Fig. 3c) encompassing 35 coding genes (Additional file 9).
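The figure captions describe the candidate regions as the top 0.5% of non-overlapping 40 kb windows. A minimal sketch of that thresholding follows, assuming that the P-values are rank-based (empirical) and that adjacent significant windows are merged into one candidate region; neither detail is stated in this section, and all names and scores are hypothetical. For iHS, the absolute value of the statistic would be passed as the score.

```python
from bisect import bisect_left


def empirical_p_values(scores):
    """Fraction of windows whose score is at least as high as each window's."""
    ranked = sorted(scores)
    n = len(ranked)
    return [(n - bisect_left(ranked, s)) / n for s in scores]


def merge_significant(windows, p_values, alpha=0.005):
    """Merge directly adjacent windows with P < alpha into regions.

    windows: list of (chrom, start_bp, end_bp), sorted by chromosome and
    start position, with end-exclusive coordinates.
    """
    regions = []
    for (chrom, start, end), p in zip(windows, p_values):
        if p >= alpha:
            continue
        if regions and regions[-1][0] == chrom and regions[-1][2] == start:
            # Extend the previous region when the windows touch.
            regions[-1] = (chrom, regions[-1][1], end)
        else:
            regions.append((chrom, start, end))
    return regions


# Toy example (hypothetical windows and P-values).
windows = [("11", 0, 40_000), ("11", 40_000, 80_000),
           ("11", 80_000, 120_000), ("11", 120_000, 160_000)]
print(merge_significant(windows, [0.001, 0.002, 0.5, 0.001]))
```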
Fig. 3

Genome wide distribution of top 0.5% signatures of selection from CLR (a) and iHS (b) analyses and their overlap (c). Each point represents a non-overlapping window of 40 kb along the autosomes


Top candidate signatures of selection

On chromosome 11, we identified 12 and 36 candidate regions of selection using CLR and iHS analyses, respectively. The top CLR candidate region (P = 3.1 × 10⁻⁵) was located on chromosome 11 between 66 Mb and 68.5 Mb (Fig. 4a) and it encompassed 24 protein-coding genes (Additional file 7). The same region was also in ROH in 77% of 33 animals that were sequenced at high coverage. The peak of this top CLR region was located between 67.5 and 68.2 Mb and it contained several adjacent windows with CLR values higher than 5000 (P < 0.003). The top region encompassed 5 genes (Fig. 4a & e). The variant density in the top region was low and SNP allele frequency was skewed, which is typical for the presence of a hard sweep (Fig. 4c). The top iHS candidate region was located on chromosome 11 between 68.4 and 69.2 Mb (P = 3.2 × 10⁻⁵) encompassing 7 genes (Fig. 4b & f). The allele frequencies of the SNPs within the top iHS region are approaching fixation, indicating ongoing selection possibly due to hitchhiking with the neighboring hard sweep (Fig. 4d).
Fig. 4

Detailed view of a top candidate selection region on chromosome 11 in OB that was detected using CLR tests (a) and iHS (b). Each point represents a non-overlapping window of 40 kb. The dotted horizontal lines indicate the cutoff values (top 0.5%) for CLR (210) and iHS (2.13) statistics. The allele frequencies of the derived (red) or alternate alleles (black) (c and d) and genes (e and f) in the peak region (67.5–68.2 Mb) of the top CLR (66–68.5 Mb) and iHS (68.4–69.2 Mb) regions. Green and black colour indicates genes on the forward and reverse strand of DNA, respectively

Another striking CLR signal (P = 0.0012) was detected on chromosome 6 between 38.5 and 39.4 Mb. This genomic region encompasses the DCAF16, FAM184B, LAP3, LCORL, MED28 and NCAPG genes, and the window with the highest CLR value overlapped the NCAPG gene (Fig. 5a & c). This signature of selection coincides with a QTL that is associated with stature, feed efficiency and fetal growth [24-26]. Most SNPs detected within this region were fixed for the alternate allele in the OB key ancestor animals of our study (Fig. 5b). All 49 sequenced OB cattle were homozygous for the Chr6:38777311 G-allele which results in a likely deleterious (SIFT score 0.01) amino acid substitution (p.I442M) in the NCAPG gene that is associated with increased pre- and postnatal growth and calving difficulties [24].
Fig. 5

Top CLR candidate region on chromosome 6 (a). Each point represents a non-overlapping window of 40 kb. The frequencies of the derived (red) or alternate alleles (black) (b) and genes (c) annotated between 38.5 and 39.4 Mb. Green and black colour indicates genes on the forward and reverse strand of DNA, respectively


GO enrichment analysis

Genes within candidate signatures of selection from CLR and iHS analyses were enriched (after correcting for multiple testing) in the PANTHER pathway (P00011) related to “Blood coagulation”. Genes within candidate signatures of selection from CLR tests were also enriched in the pathway “P53 pathway feedback loops 1” (Additional file 10). We did not find any enrichment of GO-slim biological processes after correcting for multiple testing. However, 21 GO-slim biological processes including cellular catabolic processes, oxygen transport and different splicing pathways were nominally enriched for genes within CLR candidate signatures of selection, and 14 GO-slim biological processes including nervous system, sensory perception (olfactory receptors) and multicellular processes were nominally enriched for genes within iHS candidate signatures of selection (Additional file 10).

QTL enrichment analysis

We investigated if candidate selection regions overlapped with trait-associated genomic regions using QTL information curated at the Animal QTL Database (Animal QTLdb). We found that 74.7 and 83.9% of CLR and iHS candidate signatures of selection, respectively, overlapped at least one QTL (Additional file 11). We tested for enrichment of these signatures of selection in QTL for six trait classes: exterior, health, milk, meat, production, and reproduction using permutation. QTL associated with meat quality (P = 0.0004, P = 0.0003) and production traits (P = 0.0027, P = 0.0039) were significantly enriched in both CLR and iHS candidate signatures of selection. We did not detect any enrichment of QTL associated with milk, reproduction, health, or exterior traits in either CLR or iHS candidate signatures of selection.
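The text states only that enrichment was tested "using permutation". One way such a test could be set up is sketched below, assuming that candidate regions are relocated at random while keeping their lengths and that the test statistic is the number of regions overlapping at least one QTL. The sketch is simplified to a single chromosome; the randomization scheme, function names and coordinates are hypothetical and do not describe the procedure used in the study.

```python
import random


def overlaps_any(region, intervals) -> bool:
    """True if the half-open region overlaps at least one interval."""
    start, end = region
    return any(start < q_end and q_start < end for q_start, q_end in intervals)


def count_overlaps(regions, intervals) -> int:
    return sum(overlaps_any(r, intervals) for r in regions)


def permutation_p_value(regions, qtl, chrom_length, n_perm=1000, seed=0) -> float:
    """One-sided permutation P-value for QTL overlap of candidate regions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = count_overlaps(regions, qtl)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in regions:
            length = end - start
            new_start = rng.randint(0, chrom_length - length)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(shuffled, qtl) >= observed:
            hits += 1
    # Add one to numerator and denominator so that P is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy example (hypothetical coordinates): one candidate region that sits
# exactly on a small QTL of a 1000 bp chromosome.
print(permutation_p_value([(0, 10)], [(0, 10)], chrom_length=1000))
```

A genome-wide version would additionally have to respect chromosome boundaries and would be run separately for each trait class.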

Discussion

We discovered 107,291 variants in coding sequences of 49 sequenced OB cattle. In agreement with previous studies in cattle [14, 27], missense deleterious and high impact variants occurred predominantly at low allele frequency, likely indicating that variants which disrupt physiological protein functions are removed from the population through purifying selection [28]. However, deleterious variants may reach high frequency in livestock populations due to the frequent use of individual carrier animals in artificial insemination [29], hitchhiking with favorable alleles under artificial selection [30, 31], or demographic effects such as population bottlenecks [32]. Because we predicted functional consequences of missense variants using computational inference, they have to be treated with caution in the absence of experimental validation [33]. High impact variants that segregated among the 49 sequenced OB key ancestors were also listed as Mendelian trait-associated variants in the OMIA database. For instance, we detected frameshift and missense variants in MOCOS and SLC45A2 that are associated with recessive xanthinuria [19] and oculocutaneous albinism [20], respectively. To the best of our knowledge, no calves with either xanthinuria or oculocutaneous albinism have been reported in the Swiss OB cattle population. The absence of affected calves is likely due to the low frequencies of the deleterious alleles and avoidance of matings between closely related heterozygous carriers. Among 49 sequenced cattle, we detected only two bulls each that carried the disease-associated MOCOS and SLC45A2 alleles in the heterozygous state. However, the frequent use of individual carrier bulls in artificial insemination might result in an accumulation of diseased animals within a short time even when the frequency of the deleterious allele is low in the population [34]. 
Because the deleterious alleles were detected in sequenced key ancestor animals that were born decades ago, we cannot preclude that they were lost due to genetic drift or during the recent population bottleneck in OB (Additional file 1). A frameshift variant in SLC2A2 (NM_001103222:c.771_778delTTGAAAAGinsCATC, rs379675307, OMIA 000366–9913) causes a recessive disorder in cattle that resembles human Fanconi-Bickel syndrome [35-37]. Recently, the disease-causing allele was detected in the homozygous state in an OB calf with retarded growth due to liver and kidney disease [38]. We did not detect the disease-associated allele in our study. This may be because it is located on a rare haplotype that does not segregate in the 49 sequenced cattle. Most of the sequenced animals of the present study were selected for sequencing using the key ancestor approach, as their genes contributed significantly to the current population [17, 39]. More sophisticated methods to select animals for sequencing might prioritize rare haplotypes, thus increasing the likelihood to detect rarer alleles when the sequencing budget is constrained [40-42].

Genomic diversity and genomic inbreeding

Original Braunvieh is a local cattle breed with approximately 10,000 cows registered in the breeding population and 4500 calves entering the herd book every year (Additional file 1). In spite of the small population size, the nucleotide diversity (π = 1.6 × 10⁻³) is higher in OB than in many taurine cattle breeds with considerably more breeding animals including Holstein, Jersey and Fleckvieh (∼1.2–1.4 × 10⁻³) [43, 44]. However, nucleotide diversity is lower in OB than in African indigenous cattle breeds (2.0–4.0 × 10⁻³), New Danish Red (1.7 × 10⁻³) and Yakutian cattle breeds (1.7 × 10⁻³) [44-46]. The average FROH estimated from WGS data was 0.14 in OB. This is lower than WGS-based FROH in Holstein (0.18), Jersey (0.24), Old Danish Red (0.23) and Belgian Blue (0.3) cattle [47, 48]. However, the genomic inbreeding is slightly higher in OB than in New Danish Red cattle (0.11), an admixed breed that contains genes from old Danish and other red breeds [47]. The relatively high genomic diversity of OB cattle is assumed to be the result of many different sires contributing to the gene pool due to frequent use of natural mating [10]. Our WGS-based estimate of FROH (0.14) is substantially higher than previous estimates obtained using 50 K SNP microarray data (FROH = 0.029, [10]) for the same population. Genotype data obtained using SNP microarrays with medium density (e.g., BovineSNP50) facilitate the detection of long ROH (> 1 Mb). However, due to low SNP density (~ 1 SNP per 50 kb), detecting short ROH is not possible using microarray-derived genotype data. In our data, short and medium-sized ROH accounted for 94.53% of total inbreeding. Most short and medium-sized ROH are not reliably detectable with the SNP microarrays that were used to quantify FROH in Signer-Hasler et al. [10], resulting in an underestimation of genomic inbreeding. 
Our estimate of the genomic inbreeding using WGS variants also includes short and medium-sized ROH that were previously missed using SNP array data, thus representing a realistic estimate of total genomic inbreeding in OB cattle. Apart from genomic inbreeding, ROH also provide information about population and individual demography [49-51]. Our findings show that medium-sized ROH that reflect historical inbreeding contribute most to the genomic inbreeding of the current OB population. The minor contribution of long ROH to the genomic inbreeding indicates that recent inbreeding is relatively low in OB, possibly due to use of many sires in natural matings as suggested by Hagger [9] and Signer-Hasler et al. [10]. Our results based on ROH inferred from WGS variants corroborate that genomic inbreeding is lower in OB than in most mainstream breeds [10]. However, comparing the number and size distribution of ROH across studies is subject to bias because misplaced genomic segments might break ROH into multiple small- and medium-sized ROH and different ROH-detection approaches yield results that are not readily comparable [49, 52–55]. Genomic inbreeding has increased in the OB population in recent generations, mainly due to a more frequent occurrence of long ROH. The recent population bottleneck in the OB population (Additional file 1) might have promoted matings between closely related animals, causing inbreeding to increase in recent generations. In this regard, genome-based mating strategies seem to be warranted to achieve sufficient genetic gain while maintaining genetic diversity and avoiding matings between carriers of disease-associated alleles [56, 57]. With WGS data, we were able to identify more signatures of selection compared to SNP array data [10, 58], even though we used only 9 million SNPs for which we could readily assign ancestral and derived alleles [59]. 
Using two complementary approaches, we found several new and known candidate regions that seem to be targets of recent or ongoing selection in OB. Many signatures of selection were located in non-coding regions corroborating that selection frequently acts on regulatory sites [16]. However, it is possible that an improved annotation of the bovine genome might place these regions in yet to be annotated coding regions. We applied methods to detect signatures of selection that depend on frequency changes of alleles (CLR) and haplotypes (iHS). The detected signatures of selection may be confounded by other evolutionary forces including genetic drift and background selection [60-62]. We detected candidate selection regions in OB cattle that harbor genes associated with stature or milk production (NCAPG, LCORL, LAP3), feed efficiency or lipid metabolism (R3HDM1, AOX1), and unknown functions (SLC25A33, TMEM201) that were previously reported to be targets of selection in taurine and indicine cattle breeds [16, 63–65]. The presence of signatures of selection that are common in several breeds indicates that selection at these regions has happened either before the breeds diverged or independently after the formation of breeds [16, 65]. A number of genes that are targets of selection in various cattle breeds are associated with either coat colour (MC1R, KIT), milk production (DGAT1, ABCG2, GHR) or stature (PLAG1) [44, 65–67]. These genes were not detected within the top 0.5% CLR and iHS windows in OB cattle possibly either due to absence of trait-associated genomic variation in our data or because they are not under selection in OB cattle. While some cattle breeds including Holstein and Fleckvieh are selected for particular coat colour patterns [66, 68], animals with variation in coat colour are rarely observed in the OB cattle breed [69]. 
Moreover, due to the use of OB cattle for both milk and beef production under extensive conditions, the milk production-associated variants that are under strong artificial selection in many dairy breeds seem to be less important in OB cattle [64]. Some of the genes (PLAG1, DGAT1, ABCG2, GHR) that have been reported to be targets of selection in specialized breeds contain well-known variants that contribute to the genetic variation of economically important traits. We investigated if these variants segregate in our data although they were not detected in our selection signature analysis. A number of variants in high linkage disequilibrium stimulate the expression of PLAG1, thus increasing pre- and postnatal growth in cattle [25, 26, 70]. Among 14 candidate causal variants for the PLAG1 QTL, six were fixed for the stature-decreasing alleles in our study (Additional file 12). The other candidate causal variants were either fixed for the stature-increasing allele or segregated at low allele frequency. This pattern indicates that a recombinant haplotype might segregate in Swiss OB cattle that could facilitate fine-mapping of this region. Among known mutations affecting milk production traits, a mutation in ABCG2 (p.Y581S, rs43702337 at 38,027,010 bp) [71] did not segregate in our population of Swiss OB cattle, which corroborates previous findings in Brown Swiss cattle [72]. A variant (BTA14, g.1802265_1802266GC > AA, p.A232K) of the DGAT1 gene is associated with milk production traits in cattle [73, 74]. The milk fat-enhancing and milk yield-lowering lysine-allele segregates in OB cattle at low frequency (0.03). A missense variant (BTA20, g.31909478A > T, p.Y279F) in the GHR gene is associated with milk protein percentage [75]. The milk protein percentage-lowering T-allele segregates at low frequency (0.06) in OB cattle. 
We observed a striking signature of selection on chromosome 11 that has previously been detected in the Swiss Fleckvieh, Simmental, Eringer and Evolèner breeds using microarray-called genotypes [10, 58]. Our results in OB cattle indicate that this region harbors a rapid sweep which seems to act on alleles with selection advantage [22]. While large sweeps are easy to detect using dense sequencing data, pinpointing causal alleles underpinning such regions remains challenging. Most of the variants in such regions are either fixed or segregate at very low frequency [16] which we also observed for the signature of selection on chromosome 11. In our study, the signature of selection on chromosome 11 encompassed millions of nucleotides (between 66 and 72 Mb) and many genes, rendering the identification of underpinning genes and variants a difficult task. The windows with the highest CLR and |iHS| values did not encompass PROKR1, which was previously suggested to be the target of selection at this region due to its association with fertility [10, 58]. However, closer inspection of the sequence variants detected in our study revealed that a stop-gained variant in PROKR1 (g.66998234C > A, rs476744845, p.Y293*) segregates at high frequency in the 49 sequenced OB key ancestors. Yet, it remains to be elucidated, if the presence of a high-impact variant in immediate proximity to a massive selective sweep is causal or just due to hitchhiking. The window with the largest |iHS| value on chromosome 11 was right next to the CAPN13 gene which is associated with meat tenderness and was also suggested as a potential target of selection by Signer-Hasler et al. [10]. Genes within the top 0.5% CLR and iHS windows were enriched in pathways related to reactive oxygen species, metabolic process, blood coagulation and nervous system, indicating that the identified regions under selection might harbor genomic variants that confer adaptive advantage to harsh environments. 
Moreover, QTL associated with meat quality and production traits including feed efficiency and body weight were enriched in selection signatures possibly indicating that OB cattle harbor variants that enabled them to adapt to particular feed conditions. Combining results from selection signature and association analyses might reveal phenotypic characteristics associated with genomic regions that showed evidence of past or ongoing selection [66], thus providing additional hints why particular genomic regions are under selection in OB cattle.

Conclusions

We provide a comprehensive overview of genomic variation segregating in the Swiss OB cattle population using sequencing data of 49 key ancestor bulls. In spite of the small population size, genetic diversity is higher and genomic inbreeding is lower in OB than in many other mainstream cattle breeds. However, genomic inbreeding is increasing in recent generations, mainly due to long ROH, which should be considered in future management of this breed. Finally, this study highlights regions that show evidence of past and ongoing selection in OB which are enriched for QTL related to meat quality and production traits and pathways related to blood coagulation, cellular metabolic process, and nervous system.

Methods

Sequence variant genotyping

We considered genotypes at 17,303,689 biallelic variants (15,722,811 SNPs and 1,580,878 Indels) that were discovered and genotyped previously [17] in the autosomes of 49 key ancestors of the OB population using a genome graph-based sequence variant genotyping approach [76]. In brief, 49 OB cattle were sequenced at between 6- and 38-fold genome coverage using either Illumina HiSeq 2500 (30 animals) or Illumina HiSeq 4000 (19 animals) instruments. The sequencing reads were filtered and subsequently aligned to the UMD3.1 assembly of the bovine genome [77] using the mem-algorithm of the Burrows Wheeler Aligner (BWA) software package [78]. Single nucleotide and short insertion and deletion polymorphisms were discovered and genotyped using the Graphtyper software [76]. Following recommended filtration criteria (see Crysnanto et al. [17] for more details), 15,722,811 SNPs and 1,580,878 Indels were retained for subsequent analyses. Beagle [53] phasing and imputation were applied to improve the primary genotype calls from Graphtyper and infer missing genotypes. Unless stated otherwise, imputed genotypes were considered for subsequent analyses, because Beagle imputation considerably improved the primary genotype calls, particularly in samples that had been sequenced at low coverage [17].

Variant annotation and evaluation

Functional consequences of 15,722,811 SNPs and 1,580,878 Indels were predicted according to the Ensembl (release 91) annotation of the bovine genome assembly UMD3.1 using the Variant Effect Predictor tool (VEP v.91.3) [79] with default parameter settings. The impacts of amino acid substitutions on protein function were predicted using the sorting intolerant from tolerant (SIFT) (version 5.2.2) [80] algorithm that has been implemented in the VEP tool. Variants with SIFT scores less than 0.05 were considered to be likely deleterious to protein function. In order to assess if known Mendelian trait-associated variants segregate among 49 sequenced OB cattle, we downloaded genomic coordinates of 155 trait-associated variants that are curated in the Online Mendelian Inheritance in Animals (OMIA) database [81, 82].

Population genetic analysis

Nucleotide diversity (π) quantifies the average number of nucleotide differences per site between two DNA sequences that originated from the studied population [83]. We estimated π of the OB population over the entire autosomal genome using VCFtools v0.1.15 (in windows of 10 kb) [55].
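As an illustration of the definition above, the following minimal Python sketch computes per-site π for biallelic sites from allele counts and sums it over 10 kb windows. It is not the VCFtools implementation; the function names (`site_pi`, `windowed_pi`), the input layout and the toy data are assumptions made for this example, and missing genotypes are not handled.

```python
from collections import defaultdict


def site_pi(alt_count, n_chrom):
    """Expected number of differences between two chromosomes drawn
    without replacement at one biallelic site."""
    if n_chrom < 2:
        return 0.0
    ref_count = n_chrom - alt_count
    return 2.0 * alt_count * ref_count / (n_chrom * (n_chrom - 1))


def windowed_pi(sites, window_size=10_000):
    """sites: iterable of (position, alt_count, n_chrom) with 1-based positions.

    Returns {window_start: pi}, where pi is the sum of per-site values
    divided by the window length (sites absent from the input count as
    monomorphic)."""
    totals = defaultdict(float)
    for pos, alt_count, n_chrom in sites:
        start = (pos - 1) // window_size * window_size + 1
        totals[start] += site_pi(alt_count, n_chrom)
    return {start: total / window_size for start, total in sorted(totals.items())}


# Hypothetical toy data: 49 diploid animals give 98 chromosomes per site.
toy_sites = [(150, 49, 98), (9_000, 1, 98), (12_500, 10, 98)]
pi_by_window = windowed_pi(toy_sites)
```

Averaging the window values over all autosomal windows would give a genome-wide estimate comparable in spirit to the value reported for OB.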

Detection of runs of homozygosity

Runs of homozygosity were identified using a Hidden Markov Model (HMM)-based approach implemented in the BCFtools/RoH software [84, 85]. The recombination rate was assumed to be constant along the genome at 10⁻⁸ per base pair (1 cM/Mb). For the HMM-based detection of ROH, we considered phred-scaled likelihoods (PL) and allele frequencies of 15,722,811 filtered SNPs before Beagle imputation. Because samples that are sequenced at low coverage are enriched for ROH [86], we considered only 33 samples with average sequencing coverage greater than 10-fold for the detection of ROH (Additional file 13). We only considered ROH longer than 50 kb because they were less likely to contain false-positives (Phred-score > 67 in our data, Additional file 13). Genomic inbreeding (FROH) was calculated for each animal as FROH = ∑LROH / LGENOME, where ∑LROH is the total length of all ROH longer than 50 kb and LGENOME is the length of the genome covered by SNPs [87], which is 2,512,054,768 bp in our data. Further, ROH were classified into short (50–100 kb), medium (0.1–2 Mb) and long ROH (> 2 Mb) reflecting ancient, historical, and recent inbreeding, respectively [88]. The contribution of each ROH category to FROH was calculated for each animal. Average genomic inbreeding was compared between animals born before and after 1989 using the two-sample t-test.
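The FROH calculation and the length classes described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names and the toy ROH lengths are hypothetical, and the treatment of ROH lying exactly on a class boundary (100 kb or 2 Mb) is an assumption, since the text does not specify it.

```python
GENOME_LENGTH_BP = 2_512_054_768  # genome length covered by SNPs, as stated in the text
MIN_ROH_BP = 50_000               # only ROH longer than 50 kb are considered


def classify_roh(length_bp):
    """Assign an ROH to a length class; boundary handling is an assumption."""
    if length_bp <= 100_000:
        return "short"
    if length_bp <= 2_000_000:
        return "medium"
    return "long"


def froh_by_class(roh_lengths_bp, genome_length_bp=GENOME_LENGTH_BP):
    """Return the contribution of each ROH class to FROH plus the total."""
    contrib = {"short": 0.0, "medium": 0.0, "long": 0.0}
    for length in roh_lengths_bp:
        if length <= MIN_ROH_BP:
            continue  # discard ROH of 50 kb or less
        contrib[classify_roh(length)] += length / genome_length_bp
    contrib["total"] = contrib["short"] + contrib["medium"] + contrib["long"]
    return contrib


# Hypothetical ROH lengths (bp) for one animal.
example = froh_by_class([40_000, 60_000, 500_000, 3_000_000])
```

For the toy animal, the 40 kb segment is discarded and the remaining three segments contribute to the short, medium and long classes, respectively.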

Detection of signatures of selection

To avoid potential bias arising from extended relationships among the sequenced animals, we did not consider nine sons from sire-son pairs for the detection of signatures of selection. For the remaining 40 cattle, we considered genotypes at 9,051,833 SNPs for which the ancestral allele provided by Rocha et al. [59] was detected in at least two species other than cattle and where it agreed with either the reference or alternate allele in our data. Haplotypes were phased using the Eagle2 software [89] with default parameter settings and assuming a constant recombination rate along the chromosome.

Integrated haplotype score (iHS)

To identify signatures of ongoing selection, integrated haplotype scores (iHS) were calculated for 8,465,912 variants with minor allele frequency (MAF) greater than 0.01 using the R package rehh v.2 [90]. We obtained iHS that ranged from −6.6 to 6.4. Subsequently, |iHS| were averaged for non-overlapping windows of 40 kb over the whole genome. Windows with fewer than 10 SNPs were removed. To test if variants with similar |iHS| properties were pooled in 40 kb windows, we followed the approach of Granka et al. [91]. Specifically, we randomly selected the same number of SNPs that were pooled in 40 kb windows and calculated the mean variance of |iHS| in the true and permuted 40 kb windows for each chromosome. This procedure was repeated for 10,000 randomly selected 40 kb windows. The variance of |iHS| in the non-overlapping 40 kb windows (0.24) was significantly (P < 0.01) less than in windows of randomly selected SNPs (0.37) indicating that SNPs that were grouped in 40 kb windows had |iHS| values that were more alike than random SNPs.
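The windowing step can be illustrated with a short Python sketch that averages |iHS| in non-overlapping 40 kb windows and drops windows with fewer than 10 SNPs. The per-SNP iHS values themselves were computed with rehh in the study; the function name, input layout and toy records below are assumptions made for illustration only.

```python
from collections import defaultdict


def mean_abs_ihs_windows(records, window_size=40_000, min_snps=10):
    """records: iterable of (chrom, pos, ihs) with 1-based positions.

    Returns {(chrom, window_start): mean |iHS|} for windows that contain
    at least `min_snps` SNPs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for chrom, pos, ihs in records:
        key = (chrom, (pos - 1) // window_size * window_size + 1)
        sums[key] += abs(ihs)
        counts[key] += 1
    return {
        key: sums[key] / counts[key]
        for key in sorted(sums)
        if counts[key] >= min_snps
    }


# Hypothetical toy input: 10 SNPs in the first window, 3 in the second.
toy = [("11", 1_000 * i, 2.0 if i % 2 else -2.0) for i in range(1, 11)]
toy += [("11", 40_000 + 500 * i, 1.5) for i in range(1, 4)]
windows = mean_abs_ihs_windows(toy)
```

In the toy input the second window holds only three SNPs and is therefore removed, mirroring the filter described above.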

Composite likelihood ratio (CLR)

Composite likelihood ratio (CLR) tests were carried out to identify alleles that are either close to fixation or already reached fixation due to past selection. Following the recommendation of Huber et al. [92], we removed 118,124 SNPs from the data which were fixed for the ancestral alleles because such sites are not informative for CLR tests. Using a pre-computed empirical allele frequency spectrum of 8,933,709 SNPs for which ancestral and derived alleles were assigned (see above), we calculated CLR statistics in non-overlapping 40 kb windows using SweepFinder2 [93, 94]. A window size of 40 kb was chosen to allow comparison and alignment between |iHS| and CLR values. Empirical P values were calculated for CLR and |iHS| windows [66] and the top 0.5% of windows of each statistic were considered as candidate signatures of selection. Adjacent top 0.5% windows were merged separately for each statistic using BEDTools v2.27.1 [95]. For each merged candidate signature of selection, the lowest P value among the merged windows was retained.
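The post-processing of window statistics (empirical P values, selection of the top 0.5% of windows, and merging of adjacent candidate windows while keeping the lowest P value) can be sketched as follows. The study used BEDTools for merging; this pure-Python stand-in, the rank-based definition of the empirical P value, and all function names are assumptions made for illustration.

```python
def empirical_p_values(values):
    """Rank-based empirical P value: rank / N, with rank 1 assigned to the
    largest statistic (ties are broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    p_values = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p_values[idx] = rank / n
    return p_values


def top_windows(windows, p_values, threshold=0.005):
    """Keep (chrom, start) windows whose empirical P is within the threshold."""
    return [
        (chrom, start, p)
        for (chrom, start), p in zip(windows, p_values)
        if p <= threshold
    ]


def merge_adjacent(candidates, window_size=40_000):
    """Merge directly adjacent candidate windows per chromosome.

    candidates: list of (chrom, start, p_value).
    Returns a list of (chrom, start, end, lowest_p_value)."""
    merged = []
    for chrom, start, p in sorted(candidates):
        end = start + window_size - 1
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + 1:
            _, prev_start, prev_end, prev_p = merged[-1]
            merged[-1] = (chrom, prev_start, max(prev_end, end), min(prev_p, p))
        else:
            merged.append((chrom, start, end, p))
    return merged


# Hypothetical candidate windows that already passed the 0.5% threshold.
regions = merge_adjacent(
    [("11", 1, 0.004), ("11", 40_001, 0.001), ("11", 120_001, 0.003), ("12", 40_001, 0.002)]
)
```

Here the first two windows on chromosome 11 are merged into one region spanning 1 to 80,000 bp with P = 0.001, while the non-adjacent window and the window on chromosome 12 remain separate.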

Characterization of signatures of selection

Genes within candidate signatures of selection were determined based on the Ensembl (release 91) annotation of the UMD3.1 assembly of the bovine genome. Gene-set enrichment analysis of genes within candidate signatures of selection was performed using PANTHER v.14.1 [96]. Specifically, we investigated if these genes were enriched in the functional categories of GO-slim Biological Process and PANTHER pathways using P ≤ 0.05 as significance level. To determine the overlap between QTL and candidate signatures of selection, we downloaded genomic coordinates for 122,893 QTL from the Animal QTL Database [97, 98]. We classified 85,722 unique QTL that were located on the 29 autosomes into six trait categories: exterior, health, milk, meat and carcass, production and reproduction (Additional file 14). QTL with identical genomic coordinates in associated trait categories were considered as one QTL. We used the intersect module of BEDTools v2.27.1 [95] to identify QTL that overlapped with CLR and |iHS| candidate regions for each of the six trait categories separately. To test if QTL were enriched in candidate signatures of selection, we used a permutation test with 10,000 permutations. In each permutation, we randomly sampled the same number of regions of the same size as the candidate signatures of selection from CLR and |iHS| for each chromosome separately, and overlapped them with QTL of the respective trait categories using BEDTools (see above). The number of QTL that overlapped permuted regions was used as the empirical null distribution to calculate P values. P values less than 0.05 were considered as indicators for a significant enrichment of QTL in candidate signatures of selection.

Additional file 1: Original Braunvieh herd book population. Number of female calves entering the OB herd book between 1980 and 2016.
Additional file 2: Allele frequency distribution in different functional annotations. Allele frequency of SNPs with different consequences according to VEP prediction, like high impact, deleterious (missense SNPs with SIFT score < 0.05), tolerated (missense SNPs with SIFT score > 0.05) and synonymous SNPs.
Additional file 3: Distribution of length of Indels. a Number of Indels (× 1000) with size less than 12 bp detected according to the number of affected bases. b Number of Indels detected in coding sequences.
Additional file 4: OMIA variants detected in OB. Six OMIA variants detected in 49 sequenced OB animals with their respective frequency and information.
Additional file 5: ROH statistics of each animal (high coverage). Number, length, average length and genomic fraction of all ROH in each animal. Also categorized in short, medium and long ROH category.
Additional file 6: Genomic inbreeding in OB cattle stratified by birth year. Genomic inbreeding in two groups of animals born either between 1960 and 1989 or between 1990 and 2012.
Additional file 7: Candidate selection signatures detected using CLR. Genomic coordinates, CLR values, P values and encompassed genes for 95 candidate selection signatures.
Additional file 8: Candidate selection signatures detected using iHS. Genomic coordinates, |iHS| values, P values and encompassed genes for 162 candidate selection signatures.
Additional file 9: Overlap between the top CLR and iHS selection signatures. Overlapped regions and genes between CLR and iHS candidate selection regions.
Additional file 10: Summary of Gene Ontology enrichment analysis. PANTHER and GO-Slim pathways enriched using genes encompassed in signatures of selection from CLR and iHS analyses.
Additional file 11: Overlap between QTL and signatures of selection. QTL (from all 6 categories) that overlapped with CLR and iHS selection signatures.
Additional file 12: Frequency of candidate causal variants for a stature QTL on BTA14. Genomic coordinates and allele frequencies of 14 variants nearby bovine PLAG1 that were reported as candidate causal variants for a stature QTL in cattle by Karim et al. [70].
Additional file 13: Runs of homozygosity in 49 OB cattle. a Total genome fraction in ROH in 49 cattle with high (>10x) and low (<10x) coverage. b Phred confidence score for ROH in 33 cattle sequenced at average sequencing depth higher than 10-fold. Red dots indicate mean confidence scores for ROH.
Additional file 14: Bovine QTL information downloaded from the AnimalQTL database. Number of QTL for each trait. Traits are grouped in six trait categories.
  90 in total

1.  Genomic scans for selective sweeps using SNP data.

Authors:  Rasmus Nielsen; Scott Williamson; Yuseob Kim; Melissa J Hubisz; Andrew G Clark; Carlos Bustamante
Journal:  Genome Res       Date:  2005-11       Impact factor: 9.043

2.  Marker-assisted conservation of European cattle breeds: An evaluation.

Authors: 
Journal:  Anim Genet       Date:  2006-10       Impact factor: 3.169

3.  Genetic diversity of European cattle breeds highlights the conservation value of traditional unselected breeds with high effective population size.

Authors:  Ivica Medugorac; Ana Medugorac; Ingolf Russ; Claudia E Veit-Kensch; Pierre Taberlet; Bernhard Luntz; Henry M Mix; Martin Förster
Journal:  Mol Ecol       Date:  2009-07-31       Impact factor: 6.185

4.  Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.

Authors:  Sharon R Browning; Brian L Browning
Journal:  Am J Hum Genet       Date:  2007-09-21       Impact factor: 11.025

5.  Variants modulating the expression of a chromosome domain encompassing PLAG1 influence bovine stature.

Authors:  Latifa Karim; Haruko Takeda; Li Lin; Tom Druet; Juan A C Arias; Denis Baurain; Nadine Cambisano; Stephen R Davis; Frédéric Farnir; Bernard Grisart; Bevin L Harris; Mike D Keehan; Mathew D Littlejohn; Richard J Spelman; Michel Georges; Wouter Coppieters
Journal:  Nat Genet       Date:  2011-04-24       Impact factor: 38.330

6.  Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle.

Authors:  Miri Cohen-Zinder; Eyal Seroussi; Denis M Larkin; Juan J Loor; Annelie Everts-van der Wind; Jun-Heon Lee; James K Drackley; Mark R Band; A G Hernandez; Moshe Shani; Harris A Lewin; Joel I Weller; Micha Ron
Journal:  Genome Res       Date:  2005-07       Impact factor: 9.043

7.  Mutations in GLUT2, the gene for the liver-type glucose transporter, in patients with Fanconi-Bickel syndrome.

Authors:  R Santer; R Schneppenheim; A Dombrowski; H Götze; B Steinmann; J Schaub
Journal:  Nat Genet       Date:  1997-11       Impact factor: 38.330

8.  A nonsense mutation in TMEM95 encoding a nondescript transmembrane protein causes idiopathic male subfertility in cattle.

Authors:  Hubert Pausch; Sabine Kölle; Christine Wurmser; Hermann Schwarzenbacher; Reiner Emmerling; Sandra Jansen; Matthias Trottmann; Christian Fuerst; Kay-Uwe Götz; Ruedi Fries
Journal:  PLoS Genet       Date:  2014-01-02       Impact factor: 5.917

9.  Accurate sequence variant genotyping in cattle using variation-aware genome graphs.

Authors:  Danang Crysnanto; Christine Wurmser; Hubert Pausch
Journal:  Genet Sel Evol       Date:  2019-05-15       Impact factor: 4.297

10.  A frameshift mutation in GON4L is associated with proportionate dwarfism in Fleckvieh cattle.

Authors:  Hermann Schwarzenbacher; Christine Wurmser; Krzysztof Flisikowski; Lubica Misurova; Simone Jung; Martin C Langenmayer; Angelika Schnieke; Gabriela Knubben-Schweizer; Ruedi Fries; Hubert Pausch
Journal:  Genet Sel Evol       Date:  2016-03-31       Impact factor: 4.297
