
Human-Mediated Introgression of Haplotypes in a Modern Dairy Cattle Breed.

Qianqian Zhang, Mario P L Calus, Mirte Bosse, Goutam Sahana, Mogens Sandø Lund, Bernt Guldbrandtsen.

Abstract

Domestic animals can serve as model systems of adaptive introgression and its genomic signatures. In part, their usefulness as model systems is due to their well-known histories. Different breeding strategies such as introgression and artificial selection have generated numerous desirable phenotypes and superior performance in domestic animals. The modern Danish Red Dairy Cattle is studied as an example of an introgressed population. It originates from crossing the traditional Danish Red Dairy Cattle with the Holstein and Brown Swiss breeds, both known for high milk production. This crossing was carried out, partly in response to changes in the production system, to raise milk production and overall performance. The genomes of modern Danish Red Dairy Cattle are heavily influenced by regions introgressed from the Holstein and Brown Swiss breeds and under subsequent selection in the admixed population. The introgressed proportion of the genome was found to be highly variable across the genome. Haplotypes introgressed from Holstein and Brown Swiss contained or overlapped known genes affecting milk production, as well as protein and fat content (CD14, ZNF215, BCL2L12, and THRSP for Holstein origin and ITPR2, BCAT1, LAP3, and MED28 for Brown Swiss origin). Genomic regions with high introgression signals also contained genes and enriched QTL associated with calving traits, body conformation, feed efficiency, carcass, and fertility traits. The introgression signals with relative identity-by-descent scores larger than the median, showing Holstein or Brown Swiss introgression, are mostly significantly correlated with the corresponding test statistics from signatures of selection analyses in modern Danish Red Dairy Cattle. In addition, the putative significant introgression signals show a significant dependency on the putative significant signals from signatures of selection analyses. Artificial selection has played an important role in the genomic footprints of introgression in the genome of modern Danish Red Dairy Cattle. Our study on a modern cattle breed contributes to an understanding of genomic consequences of selective introgression by demonstrating the extent to which adaptive effects contribute to shaping the specific genomic consequences of introgression.
Copyright © 2018 by the Genetics Society of America.

Keywords:  high-yielding cattle breeds; modern dairy cattle breed; selective introgression; signature of selection

Year:  2018        PMID: 29848486      PMCID: PMC6063242          DOI: 10.1534/genetics.118.301143

Source DB:  PubMed          Journal:  Genetics        ISSN: 0016-6731            Impact factor:   4.562


Processes of adaptive introgression are complex, and their genomic signatures in humans and other species have been studied extensively (Hasenkamp ; Deschamps ; Figueiró ; Jagoda ). Genome analysis has enabled an in-depth assessment of the genomic consequences of different demographic processes including introgression, selection, and their interplay in modern species (Deschamps ; Figueiró ). Domestic animals can serve as model organisms for these processes. They have several advantages in understanding the impact of introgression and selection on genomes: first, introgression and selection are known to occur between breeds, and the processes are often well documented by breeders; second, massive data are routinely collected before and after introgression and selection, such as parentage, genotypes, and phenotypes; and third, under controlled, appropriate environmental conditions, a large part of the genomic consequences is caused by human-mediated directional introgression and selection. Artificial selection and different breeding strategies have enabled the generation of numerous desirable phenotypes in domestic animals such as cattle (Hartwig ; Buzanskas ; Davis ), pigs (Bosse ,b; Ai ), and dogs (Galov ; vonHoldt ). Strategies including crossbreeding and introgression have been very successful in improving productivity and performance in domestic animals. For example, Chinese pig breeds were imported to Europe to improve the productivity of European pigs in the late 18th and early 19th centuries (White 2011). Fertility-related traits have been greatly improved by crossbreeding with and introgression from Asian pigs (Merks ; Bosse et al. 2014, 2015). Similarly, in dairy cattle, crossbreeding and subsequent introgression between local and other breeds have been applied to achieve better productivity and performance (Davis ).
The genetic architecture of modern domestic animals, including dairy cattle, is shaped by the interplay between different forces, including the intentional introduction of favorable alleles from other breeds, subsequent selection for favorable introgressed alleles, and demographic processes. Since the introduction of scientific breeding, the main breeding goal in cattle has been to improve milk yield, fat, and protein content, even for dual-purpose breeds in intensive farming systems, especially in Europe (Hartwig ). By the crossing of high-yielding breeds with local breeds, the productivity of local breeds could rapidly be increased at the expense of the genetic distinctiveness of the local breeds. The high productivity of these admixed breeds was further improved by intense selection, resulting in increased frequencies or even fixation of favorable alleles. Many of the variants thus spreading in the population will have been of introgressed origin. With the great success of cattle breeding including crossbreeding and because of the availability of large-scale genomic data sets, analysis of admixed local cattle breeds represents an appealing model to identify the genomic consequences of admixture with the intention of improving traits of interest in local breeds. We expect that the introgressed genomic regions play an important role in improving productivity and performance in the crossbreeds. We hypothesize that: (1) the genome-wide introgressed regions are nonrandomly distributed across the genome with respect to their genomic locations; (2) the majority of highly introgressed regions affect production and performance traits; and (3) that these introduced haplotypes are or have been under selection in the admixed breed. 
To test these hypotheses and identify specific genomic regions and genes of interest involved in the important traits from high-yield breeds such as Holstein (HOL) and Brown Swiss (BSW), we use the hybrid modern Danish Red Dairy Cattle (mRDC) breed as an example. Our analyses illustrate the patterns of introgressed and selected haplotypes in an admixed local breed. The hybrid mRDC originates from traditional Danish Red Dairy Cattle (tRDC). In recent decades, HOL and BSW have been used extensively to improve the milk yield of mRDC (Kantanen ). Years of crossbreeding and selection have led to the differentiation between mRDC and tRDC. The introgression of and selection for haplotypes from HOL and BSW have probably made a significant contribution to the increased milk production level of mRDC. The objective of this study was to examine genomic patterns of introgression from two high-yielding breeds (HOL and BSW) in a modern dairy breed (i.e., the hybrid mRDC), and unravel the consequences of introgression and selection at the genome level using whole-genome sequence data.

Methods

SNP genotyping, sequencing, variant calling, and quality control

Whole-genome sequence data were available for 97 animals from four breeds (32 HOL, 20 BSW, 15 tRDC, and 30 mRDC). All genomes were sequenced to a depth of ∼10× or deeper using Illumina paired-end sequencing. Reads were aligned to the cattle genome assembly UMD3.1 (Zimin ) using bwa (Li and Durbin 2009). Aligned sequences were converted to raw BAM files using samtools (Li ). Duplicate reads were marked using the samtools rmdup option (Li ). The Genome Analysis Toolkit (McKenna ) was used for local realignment around insertion/deletion (indel) regions and recalibration following the 1000 Bull Genomes Project guidelines (Daetwyler ) incorporating information from dbSNP (Sherry ). Finally, variants were called using the Genome Analysis Toolkit’s UnifiedGenotyper (McKenna ), which simultaneously calls short indels and SNPs. Indels and variants on sex chromosomes were excluded from further analyses.

Population structure and admixture

Using PLINK (Chang ), the sequence variants were pruned to remove markers with pairwise linkage disequilibrium (LD) > 0.1 with any other SNP within a 50-SNP sliding window (advancing by 10 SNPs at a time). The SNPs on all the autosomes were used to study the population structure. To get an overview of the population structure of the genotyped animals from different breeds, the whole-genome sequence data were used for principal component (PC) estimation using GCTA (Yang ). Admixture analyses were done using the program Admixture (Alexander ) with values of K between 2 and 10. The K value with a low cross-validation error was chosen. To investigate the statistical significance of admixture among these cattle populations, TreeMix software was used to perform the three-population (f3) test (Pickrell and Pritchard 2012). In an f3 test of the form f3(A; B, C), an extremely negative f3 statistic indicates that there has been significant gene flow to A from populations B and C. Each pair chosen from HOL, BSW, and tRDC was used as the source populations (B and C) and mRDC was used as the admixed population (A) in the f3 test, which results in three different combinations. We also calculated the breed proportions for the sequenced mRDC individuals using the pedigree. The full pedigrees of the sequenced mRDC animals were extracted and used to infer the breed proportions by coding the breed where the ancestors enter as the known parents of the traced animal.
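As an illustration of the three-population test described above, the following minimal Python sketch computes a naive f3 statistic from per-SNP allele frequencies. It is not the TreeMix threepop implementation: the function name `f3` and the toy frequencies are hypothetical, and the finite-sample bias correction and block-jackknife Z score that threepop reports are omitted.

```python
from statistics import mean


def f3(target, source1, source2):
    """Naive f3(target; source1, source2) from per-SNP allele frequencies.

    Averages (c - a) * (c - b) over SNPs, where c is the target frequency
    and a, b are the source frequencies. No bias correction, no jackknife.
    """
    if not (len(target) == len(source1) == len(source2)):
        raise ValueError("frequency lists must have equal length")
    return mean((c - a) * (c - b) for c, a, b in zip(target, source1, source2))


# Toy allele frequencies (hypothetical values, not taken from the study).
hol = [0.9, 0.1, 0.8, 0.2]
trdc = [0.1, 0.9, 0.2, 0.8]
mrdc = [(h + t) / 2 for h, t in zip(hol, trdc)]  # an even mixture of both

print(f3(mrdc, hol, trdc))  # negative: the target lies between the sources
print(f3(hol, mrdc, trdc))  # positive: HOL is not a mixture of the others
```

A negative value arises only when the target's allele frequencies tend to fall between those of the two sources, which is why a strongly negative statistic is read as evidence of admixture.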

Introgression mapping

Calculation of ancestry dosages in mRDC:

A novel two-layer hidden Markov model was implemented in the method developed in Guan (2014) to infer the structure of local haplotypes introgressed from HOL, BSW, and tRDC in mRDC. The software package ELAI developed by Guan (2014) was used to infer the ancestry dosages of the haplotypes from three source populations in mRDC. The SNPs from sequence variants with minor allele frequency < 0.01 or with a missing proportion > 0.05 were removed from subsequent analyses. The option of three-way admixture with admixture generations of 10 was chosen, which approximates the history of admixture in the mRDC population. Thirty steps of an expectation-maximization (EM) algorithm were run to infer the ancestry dosages of HOL, tRDC, and BSW in mRDC.

Relative identity-by-descent scores:

HOL, BSW, and tRDC have made large genetic contributions to the mRDC. Therefore, the sequenced mRDC, tRDC, BSW, and HOL were selected for introgression mapping analysis. The identity-by-descent (IBD) regions shared between mRDC and tRDC were used as a reference to map the regions introgressed from HOL and BSW using pairwise comparisons between these breeds. Following Bosse et al. (2014), sequences for 29 autosomes were first phased using Beagle fastIBD (version 3.3.2) (Browning and Browning 2007). Pairwise comparisons for detecting IBD were performed between mRDC and tRDC, mRDC and HOL, and mRDC and BSW. As recommended in the Beagle documentation (Browning and Browning 2007), 10 independent runs for phasing and pairwise IBD detection were performed. The identified IBD segments were combined from the 10 runs, and the threshold parameter compromising between power and false discovery rate was 10^−10 for identifying the true shared IBD, as suggested by Browning and Browning (2007). We defined the IBD score as the number of recorded true IBD haplotype segments divided by the total number of pairwise comparisons, using windows of 10 kb. The IBD score was calculated for each pairwise comparison using a custom Perl script. To quantify the relative proportion of the genome introgressed from HOL or BSW, we calculated the relative IBD score (rIBD) as follows: IBD score (mRDC and HOL), or IBD score (mRDC and BSW), − IBD score (mRDC and tRDC). Thus, the rIBD has values in the range of −1 to 1. rIBD = 1 signifies that all haplotypes in the target breed originate from the first source breed, while rIBD = −1 signifies 100% from the second source breed. The variance of the rIBD score was calculated using a robust method (Williams 2000). The P-values of rIBD scores were derived under the neutral hypothesis, assuming that rIBD scores are normally distributed with an SD equal to the square root of the robust estimate of the rIBD variance (Bosse et al. 2014).
The significant introgressed haplotypes from HOL and BSW in mRDC were defined as the haplotypes with a corrected P-value using the Benjamini–Hochberg procedure (Benjamini and Yekutieli 2001) lower than 0.02 and 0.04, respectively. Pearson correlations between rIBD scores from HOL or BSW introgression larger than the median of rIBD scores, and the corresponding test statistics from signatures of selection analyses, were calculated. Moreover, the putative significant introgressed signals from HOL and BSW were extracted and compared with the putative significant signals from signatures of selection analyses using a χ² test, with P-values calculated by Monte Carlo simulation (Hope 1968).
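The scoring and testing steps in this subsection can be sketched as follows. This is a hedged illustration, not the authors' Perl script: the function names are hypothetical, the sign convention (positive = more HOL- or BSW-like) follows the figure captions in the Results, the SD is passed in by the caller because the robust variance estimator of Williams (2000) is not reproduced here, and a one-sided upper-tail test is assumed.

```python
import math


def ribd(ibd_donor, ibd_recipient):
    """Per-window rIBD: donor IBD score minus recipient (tRDC) IBD score."""
    return [d - r for d, r in zip(ibd_donor, ibd_recipient)]


def upper_tail_p(scores, sd):
    """One-sided P-values under a zero-mean normal null with the given SD."""
    return [0.5 * math.erfc(s / (sd * math.sqrt(2))) for s in scores]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy IBD scores for four 10-kb windows (hypothetical values).
ibd_hol = [0.60, 0.10, 0.20, 0.05]
ibd_trdc = [0.10, 0.15, 0.20, 0.45]
scores = ribd(ibd_hol, ibd_trdc)
adjusted = benjamini_hochberg(upper_tail_p(scores, sd=0.2))
print(scores)
print(adjusted)
```

Windows whose adjusted P-value falls below the chosen cutoff (0.02 for HOL and 0.04 for BSW in the text) would then be flagged as putatively introgressed.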

Detection of signature of selection

FST analysis:

The genetic differentiation between individuals from tRDC and mRDC was measured by pairwise FST analysis following Weir and Cockerham (1984). Pairwise FST was computed with Genepop 4.2 in bins of 10 kb over the full length of the genome (Weir and Cockerham 1984). The correlations between the FST and rIBD scores for HOL and BSW introgression for the same bins of 10 kb were calculated. The R package FlexMix (Grün and Leisch 2008) was used to fit a series of finite normal mixture models to explore the underlying structure of the distribution of FST, presumably caused by different underlying evolutionary or demographic processes in the populations such as balancing selection, directional selection, and neutrality. This mixture model postulated that $F_{ST,i} \sim \sum_{k=1}^{K} \pi_k \, N(\mu_k, \sigma_k^2)$, where $K$ is the number of components of the mixture, $i$ is the $i$th locus, $\pi_k$ is the probability that $F_{ST,i}$ belongs to cluster $k$, and $\mu_k$ and $\sigma_k^2$ are the expectation and variance of the normal distribution of cluster $k$, respectively. The number of components of the mixture ($K$) was determined by the smallest Akaike’s information criterion among models with different numbers of components. Model parameters were estimated by maximum likelihood via the EM algorithm in FlexMix. Finally, the P-value of each FST value was calculated from the fitted mixture of $K$ components. The regions with corrected P-values (Benjamini–Hochberg procedure; Benjamini and Yekutieli 2001) of < 0.05 were used for testing the dependency of putative significant signals between different test statistics.
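The text does not preserve the exact expression used to turn the fitted mixture into a P-value. One plausible reading, assumed here rather than confirmed by the source, is the upper-tail probability of an observed value under the fitted normal mixture. The sketch below uses hypothetical function names and made-up mixture parameters, and it takes the weights, means, and SDs as already estimated (for example by FlexMix).

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma**2) at x."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2)))


def mixture_upper_tail_p(x, weights, means, sds):
    """P(X >= x) when X follows sum_k weights[k] * N(means[k], sds[k]**2)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    cdf = sum(w * normal_cdf(x, m, s) for w, m, s in zip(weights, means, sds))
    return 1.0 - cdf


# Hypothetical two-component fit: a large low-FST cluster and a small
# high-FST cluster. These numbers are illustrative only.
weights, means, sds = [0.9, 0.1], [0.05, 0.30], [0.03, 0.10]
for value in (0.05, 0.25, 0.60):
    print(value, mixture_upper_tail_p(value, weights, means, sds))
```

Under this reading, windows with unusually high differentiation receive small P-values, which can then be corrected for multiple testing as described in the text.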

Extended haplotype homozygosity test:

The genome-wide scan for integrated haplotype score (iHS) for mRDC was performed using the R package rehh (Sabeti ; Gautier and Vitalis 2012). An |iHS| of 2.5 was used as the cutoff for SNPs to define signatures of selection (Voight ; Gautier and Vitalis 2012). To compare iHS with other test statistics such as rIBD scores at the same scale, we calculated the proportion of SNPs with |iHS| > 2 in windows of 10 kb and identified the windows in the highest 1% of the empirical distribution of the proportion of SNPs with |iHS| > 2, following Voight. The putative significant windows for the proportion of SNPs with |iHS| > 2 were used to test the dependency of putative significant signals between different test statistics.
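The windowing step described above can be sketched in a few lines of Python. The function names and toy values are hypothetical, iHS values are assumed to have been computed beforehand (for example with rehh), and ties at the cutoff are broken arbitrarily in this simplified version.

```python
from collections import defaultdict

WINDOW = 10_000  # 10-kb windows, as in the text


def ihs_window_proportions(positions, ihs, threshold=2.0, window=WINDOW):
    """Map window start -> fraction of SNPs in it with |iHS| > threshold."""
    total = defaultdict(int)
    extreme = defaultdict(int)
    for pos, score in zip(positions, ihs):
        start = (pos // window) * window
        total[start] += 1
        if abs(score) > threshold:
            extreme[start] += 1
    return {start: extreme[start] / total[start] for start in total}


def top_fraction(window_values, fraction=0.01):
    """Window starts whose value lies in the highest `fraction` of windows."""
    ranked = sorted(window_values, key=window_values.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))  # always keep at least one
    return set(ranked[:keep])


# Toy SNP positions (bp) and iHS values on one chromosome (hypothetical).
positions = [100, 5_000, 12_000, 15_000, 19_000]
ihs = [2.5, -0.3, -2.2, 3.0, 0.1]
proportions = ihs_window_proportions(positions, ihs)
print(proportions)
print(top_fraction(proportions, fraction=0.5))
```

Expressing iHS as a per-window proportion puts it on the same 10-kb grid as the rIBD and FST scores, which is what makes the later dependency tests possible.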

Sharing of runs of homozygosity:

Runs of homozygosity (ROH) were computed for the sequenced animals to detect shared ROH regions (minimum 10 kb considered) among individuals. For a description of the procedures used for the calculation of nucleotide diversity and detection of ROH, see Zhang ,b). The sharing of ROH regions was calculated as the number of individuals sharing the same ROH region on a particular segment using a window of 10 kb bin across the whole genome in mRDC. Regions of enrichment of shared ROH regions were defined as regions exceeding the 95th percentile of the empirical distribution of the number of individuals sharing the same ROH regions in any particular segment.
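A minimal sketch of the ROH-sharing count is given below. It assumes that ROH segments have already been called per animal, uses half-open base-pair intervals, and counts an animal for a window if any of its ROH overlaps that window at all. The overlap rule, the nearest-rank percentile, the function names, and the toy data are assumptions made for illustration.

```python
def roh_sharing(roh_by_animal, genome_length, window=10_000):
    """Count, per window, how many animals have an ROH overlapping it."""
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for segments in roh_by_animal.values():
        covered = set()
        for start, end in segments:  # half-open [start, end) in bp
            covered.update(range(start // window, (end - 1) // window + 1))
        for w in covered:  # each animal counts once per window
            counts[w] += 1
    return counts


def percentile_cutoff(values, pct=95):
    """Nearest-rank percentile of `values` (pct given as an integer)."""
    ranked = sorted(values)
    k = max(1, -(-pct * len(ranked) // 100))  # ceil(pct * n / 100)
    return ranked[k - 1]


# Toy ROH calls for three animals on a 40-kb segment (hypothetical).
roh = {
    "animal_a": [(0, 25_000)],
    "animal_b": [(10_000, 20_000)],
    "animal_c": [(15_000, 30_000)],
}
counts = roh_sharing(roh, genome_length=40_000)
print(counts)
print(percentile_cutoff(counts, pct=95))
```

Windows whose count exceeds the 95th-percentile cutoff of the genome-wide distribution would be treated as enriched for shared ROH.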

Gene annotations and enrichment of QTL

Genes in genomic regions showing significant introgression of haplotypes from HOL or BSW in mRDC were annotated. The cattle QTL were extracted from the Animal QTL Database (Hu ). QTL on the X chromosome, or without locations and references in the list from the Animal QTL Database (Hu ), were excluded. The remaining QTL were classified into six groups according to the associated traits: milk, reproduction, production, health, meat and carcass, and exterior. The QTL overlapping with genomic regions with high rIBD scores were identified. When two QTL had the same exact genomic interval and the same associated trait groups, they were counted as one QTL. Gene annotations in these regions were retrieved from the Ensembl Genes 89 Database using BioMart (Kinsella ). To test for enrichment of QTL in the candidate introgressed regions from HOL or BSW, we applied a permutation test. The candidate introgressed regions were randomly distributed across the whole genome. This permutation did not change the relative proportion and length of introgressed regions to preserve their correlation structure. We then computed the number of QTL and the number of QTL in the six groups of associated traits, which overlapped with the permuted, introgressed regions. In total, 10,000 permutations were performed. The distribution of numbers of QTL observed in the permutated regions were treated as the null distribution from which we computed the significance levels of the number of QTL observed in the real data.
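The permutation test described above can be illustrated with the following simplified Python sketch. It treats the genome as a single linear interval, lets permuted regions overlap each other, and adds one to the numerator and denominator of the empirical P-value. These simplifications, the function names, and the toy intervals are assumptions, not details taken from the study.

```python
import random


def count_overlaps(regions, qtl):
    """Number of QTL intervals overlapping at least one region (half-open)."""
    return sum(
        any(q_start < r_end and r_start < q_end for r_start, r_end in regions)
        for q_start, q_end in qtl
    )


def permutation_p(regions, qtl, genome_length, n_perm=10_000, seed=0):
    """Empirical P-value for enrichment of QTL in `regions`.

    Each permutation moves every region to a uniformly random start while
    keeping its length, then recounts the overlapping QTL.
    """
    rng = random.Random(seed)
    observed = count_overlaps(regions, qtl)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in regions:
            length = end - start
            new_start = rng.randrange(0, genome_length - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(shuffled, qtl) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy introgressed regions and QTL intervals in bp (hypothetical).
regions = [(100_000, 150_000), (400_000, 420_000)]
qtl = [(120_000, 130_000), (410_000, 415_000), (800_000, 810_000)]
print(count_overlaps(regions, qtl))
print(permutation_p(regions, qtl, genome_length=1_000_000, n_perm=1_000))
```

Keeping region lengths fixed while moving them preserves the size distribution of the candidate regions, so the null distribution reflects placement alone.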

Data availability

The whole-genome sequence data used in this study originated from the 1000 Bull Genomes Project. Parts of the whole-genome sequence data of individual bulls of the 1000 Bull Genomes Project (Daetwyler ) are available at https://doi.org/10.1038/s41588-018-0056-5 and National Center for Biotechnology Information using Sequence Read Archive numbers SRP039339, SRR1188706, SRR1205973, SRR1205973, SRR1205992, SRR1262533, SRR1262536, SRR1262538, SRR1262539, SRR1262614, SRR1262659, SRR1262660, SRR1262788, SRR1262789, SRR1262846, and SRR1293227. Supplemental material available at Figshare: https://doi.org/10.25386/genetics.6383528.

Results and Discussion

Population structure and evidence of introgression

The population structure of the sampled cattle breeds in this study was analyzed using Admixture and PC analysis (PCA), as shown in Figure 1. mRDC received contributions from tRDC, HOL, and BSW, as observed in the Admixture analysis (Figure 1). Figure 1 clearly demonstrates the hybrid nature of mRDC cattle, which is consistent with recorded pedigree information on the introgression history of mRDC (Andersen ). A large contribution to mRDC was observed from tRDC, the recipient population. It is notable that BSW and HOL are two mainstream breeds that each contributed heavily to the genomes of extant mRDC individuals. The proportion of introgression differs quite extensively among individuals. In the PCA (Figure 1), PC1 (6.23% of variance) separated HOL, BSW, and tRDC. However, mRDC was dispersed between the other breeds, demonstrating admixture of HOL, BSW, and tRDC, which have made contributions to mRDC. Similarly, PC2 separated HOL, BSW, and tRDC. mRDC individuals had a wide range of PC2 values, falling among the individuals from the other breeds.
Figure 1

Population structure for four cattle breeds. (A) Admixture analysis of different cattle breeds with k = 3. (B) PCA plot among different cattle breeds (principal component 1 vs. principal component 2). Blue squares = HOL, green triangles = BSW, red crosses = tRDC, and yellow diamonds = mRDC. BSW, Brown Swiss; HOL, Holstein; mRDC, modern Red Dairy cattle; PCA, Principal component analysis; tRDC, traditional Red Dairy Cattle.

Moreover, the breed proportions of mRDC individuals were derived using the recorded full pedigree: on average, mRDC animals had 0.27 HOL, 0.17 BSW, and 0.29 tRDC ancestry. The statistical significance of admixture in mRDC was measured with the f3 test, using each pair of populations chosen from HOL, BSW, and tRDC, with the program threepop from TreeMix (Pickrell and Pritchard 2012). We observed extreme Z scores from the f3 test for mRDC using any combination of HOL, BSW, and tRDC, i.e., −14.97, −11.72, and −19.24. These results are consistent with what we observed in the Admixture and PCA analyses, and provide statistical support for the admixture of mRDC from HOL, BSW, and tRDC. These results support the notion that mRDC is indeed a composite breed and can be studied further for introgression from HOL and BSW.
Introgression mapping was performed to identify regions in mRDC that contained an excess of introgressed haplotypes. We first examined the local structure of haplotypes in mRDC introgressed from HOL, BSW, and tRDC using a three-way admixture approach (Supplemental Material, Figure S1). We observed average ancestry dosages of 0.78, 0.63, and 0.59 with SDs of 0.24, 0.22, and 0.18 from HOL, tRDC, and BSW, respectively, in mRDC, averaged across the whole genome and all mRDC individuals. Three genomic regions with an ancestry dosage of two were observed from HOL to mRDC (Figure S1). This indicates complete replacement of tRDC genomic material by HOL genomic material.
However, there is only one annotated gene (SNORD116) in these regions. Interestingly, the genomic region with the full HOL ancestry dosage on chromosome 21 (1,348,427–2,107,179 bp) is associated with gestation length and calving ease (Frischknecht ). Other QTL overlapping regions with full HOL ancestry in mRDC were associated with bovine respiratory disease susceptibility, body weight, and udder swelling score (Saatchi ; Keele ; Michenet ).
To address which admixed haplotypes have most influenced the existing mRDC breed, we inferred whether a genomic region was introgressed from high-yield breeds in multiple individuals by examining the contributions proportional to the admixture fraction. The frequencies of all mRDC haplotypes that were of HOL, BSW, or tRDC origin were calculated across the whole genome. The relative fractions of HOL or BSW haplotypes vs. tRDC haplotypes in the mRDC group were calculated as rIBD scores. Shared haplotypes (i.e., haplotypes with shared ancestry) were observed between mRDC on the one hand and HOL, BSW, and tRDC on the other. These findings are in agreement with the results of the population structure and admixture analyses, which showed contributions from HOL, BSW, and tRDC to mRDC. Relative to the tRDC haplotype frequency, the HOL and BSW haplotype frequencies in the mRDC population at a given locus (i.e., the rIBD scores) ranged from 0.73 to −0.74 and from 0.81 to −0.92, respectively, where 1 indicates that all haplotypes were of HOL or BSW origin and none were of tRDC origin, and −1 indicates that all haplotypes were tRDC-like (Figure 2 and Figure 3). The rIBD scores averaged across the genome were negative (−0.06 for HOL introgression and −0.09 for BSW introgression) (Figure 2a and Figure 3a), showing that the majority of the genome displayed more similarity with tRDC than with either HOL or BSW. However, every chromosome contained genomic regions where the signal for the HOL or BSW haplotype was stronger than for tRDC.
Figure 2

Genome-wide pattern of relative identity-by-descent (rIBD) score showing introgressed haplotypes from Holstein (HOL) in modern Danish Red dairy cattle (mRDC). (A) The rIBD score for all 29 autosomes: the positive scores show the signals where they are more HOL-like, whereas the negative scores show the signals where they are more traditional Danish red cattle (tRDC)-like. Chromosomes 1–29 are colored in blue and gray in order. (B) The distribution of rIBD scores: the positive scores show the signals where they are more HOL-like, whereas the negative scores show the signals where they are more tRDC-like.

Figure 3

Genome-wide pattern of relative identity-by-descent (rIBD) score showing introgressed haplotypes from Brown Swiss (BSW) in modern Danish Red dairy cattle. (A) The rIBD score for all 29 autosomes: the positive scores show the signals where they are more BSW-like, whereas the negative scores show the signals where they are more traditional Danish red cattle (tRDC)-like. Chromosomes 1–29 are colored in blue and gray in order. (B) The distribution of rIBD scores: the positive scores show the signals where they are more BSW-like, whereas the negative scores show the signals where they are more tRDC-like.

The distribution of rIBD from the comparison between mRDC and HOL or BSW for IBD haplotypes resembled a normal distribution (Figure 2b and Figure 3b). By taking a cutoff of rIBD values, we were able to identify the regions that were likely to be of HOL or BSW origin. Across the whole genome, many known genes and QTL were located within regions with high rIBD for HOL origin (Table S1). We observed that the QTL are significantly enriched in the HOL-like haplotypes in mRDC (P = 0.025).
The genes and QTL were associated with economic traits including milk-related traits such as milk yield, protein, fat yield, and percentage (CD14, ZNF215, BCL2L12, and THRSP) (Ashwell ; Beecher ; Magee ; Cole ; Cochran ; Fontanesi ; Capomaccio ), calving traits (MYH14, KCNC3, SYT3, and CTU1) (Kolbehdari ; Cole ; Mao ; Parker Gaddis ; Abo-Ismail ), feed efficiency-related traits (LRRIQ3, ATP6V1B2, and CCKBR) (Abo-Ismail , 2014), and carcass traits (ZNF215, INTS4, and RPTOR) (Magee ; Sasago ). The QTL-associated production and reproduction traits were significantly enriched in introgressed regions from HOL showing high rIBD scores (P = 0.0004 for production and P = 0.034 for reproduction traits). The longest continuous introgressed region (defined as the region with rIBD > 0) from HOL to mRDC was on chromosome 18 (56,320,000–61,350,000 bp) (Figure 2a). This region was previously found to be associated with calving traits and young stock survival in Nordic HOL cattle (Cole ; Mao ; Wu ). The recombination rate of this region of chromosome 18 is low (Weng ). The long genomic regions showing signals of introgression probably tend to occur in regions with low recombination rates. Moreover, introgressed haplotypes included numerous annotated genes due to genetic hitchhiking and a short time since introgression. The region with the highest rIBD score from HOL was located on chromosome 4 (120,540,000–120,810,000 bp, average rIBD = 0.449), which is downstream of the gene VIPR2. The VIPR2 gene has been proposed to be a candidate gene affecting fat percentage and to play an important role in milk synthesis (Capomaccio ). Similarly, many known genes and QTL occurred in regions where mRDC shared haplotypes with BSW (Table S2). We observed an enrichment of QTL in the BSW-like haplotypes in mRDC that was close to significance (P = 0.056). 
These genes and QTL mainly affected milk composition including fat and protein percentage and yield (ITPR2, BCAT1, LAP3, and MED28) (Cohen-Zinder ; Pimentel ; Zheng ; Fang ), growth and body conformation traits such as stature (NCAPG, LCORL, PPP2R1A, IGFBP6, and CREBBP) (Kolbehdari ; Baeza ; Lindholm-Perry ; Cole ; Sahana ), calving and fertility traits (EIF4G3, TGFA, and LAP3) (Bongiorni ; Hering ; Höglund ), and feed efficiency-related traits (CLMP and PPP2R1A) (Prakash ; Cole ). We observed a significant enrichment of QTL associated with milk, production, meat, and carcass traits in the introgressed regions from BSW in mRDC (P = 0.045 for milk, P = 0.019 for production, and P = 0.017 for meat and carcass traits). While the longest region introgressed from HOL in mRDC was ∼4 Mb, there was no introgressed region longer than 1.5 Mb from BSW in mRDC. The highest value of rIBD signal was observed on chromosome 17 (35,630,000–35,780,000 bp, average rIBD = 0.563). This region is located downstream of the IL2 gene. Also, there is one unannotated gene in this region. It has previously been shown that the IL2 gene is associated with mastitis, milk yield, and lactation persistency (Alluwaimi ; Prakash ).

Regions of introgression and evidence for selection

We have shown that HOL or BSW haplotypes that were introgressed in mRDC often originated from genomic regions harboring genes associated with milk production, calving traits, feed efficiency, fertility or body conformation, and carcass traits. An introgressed haplotype from HOL or BSW that persists in mRDC has not been removed by negative selection or genetic drift, and its persistence might be a result of positive or balancing selection. Selection changes the structure of an introgressed haplotype, for example its LD and the distribution of its allele frequencies. The length of an introgressed haplotype is affected by a combination of the local recombination rate and the strength of selection. To further infer the introgressed regions that are under selection, we used three independent methods (iHS, FST, and sharing of ROH among individuals) to identify the regions that were under selection in the mRDC population. iHS identified regions with extended homozygosity in mRDC due to selection. Locally high FST statistics reflected the genomic regions that showed strong differentiation between mRDC and tRDC. The sharing of ROH among individuals could differentiate the genomic regions that have been fixed or were close to fixation in mRDC. We observed a significant dependency (P < 0.01) between the putative significant rIBD signals and significant signals from iHS, FST, and sharing of ROH among individuals, except between the significant rIBD signals from HOL introgression and significant iHS signals. This supports the hypothesis that many introgressed regions from HOL or BSW are probably a result of selection. We first observed significant positive correlations between FST and rIBD scores larger than the median from both HOL and BSW (P < 0.001), supporting the idea that at least some of the regions in mRDC showing differentiation from tRDC have been introgressed from HOL or BSW (Figures S2 and S3).
Genes associated with milk yield, protein, and fat yield and percentage, such as CD14 and ZNF215 (Magee ; Cochran ), overlapped with haplotypes introgressed from HOL that showed a high FST (Figure 4e). The longest region introgressed from HOL in mRDC (chromosome 18: 56,320,000–61,350,000 bp) also showed high FST values (Figure 4d). This region included a number of genes, such as MYH14 and ZNF613, that are associated with calving ease and young stock survival (Abo-Ismail ; Wu ) (Figure 4d). The region on chromosome 19 (52,370,000–52,380,000 bp), with a very high FST of 0.748, showed strong differentiation between tRDC and mRDC and overlapped with a haplotype introgressed from HOL into mRDC. The RPTOR gene, associated with carcass traits in cattle (Sasago ), is located here (Figure 4a). The RPTOR gene has also been found to play an important role in regulating cell growth, energy homeostasis, apoptosis, and the immune response during adaptation (Sun ). Similarly, we found that two BSW introgressed haplotypes that overlapped with genes (ITPR2 and BCAT1) associated with milk yield, fat, and protein yield and percentage (Pimentel ; Fang ) showed high FST (Figure 4c). Moreover, the region on chromosome 6 (38,730,000–38,780,000 bp) with an average FST value of 0.251 overlapped with a BSW haplotype introgressed into mRDC. The NCAPG and LCORL genes, which are associated with body conformation traits such as stature and with feed efficiency (Eberlein ; Lindholm-Perry ; Setoguchi ; Xia ), are located in this region.
Figure 4

Examples of genomic regions with relative identity-by-descent (rIBD) scores, FST, integrated haplotype score (iHS), and sharing of runs of homozygosity (ROH). (A) FST and rIBD for the genomic region containing gene RPTOR; (B) rIBD and sharing of ROH for the genomic region containing gene THRSP; (C) rIBD and iHS for the genomic region containing gene BCAT1; (D) rIBD, FST, and sharing of ROH for the genomic region containing genes MYH14 and ZNF613; and (E) rIBD, FST, and iHS for the genomic region containing gene ZNF215.

Regions introgressed from HOL or BSW into mRDC and the significant regions from the iHS test were compared in Figures S4 and S5. There was a significant positive correlation between rIBD scores larger than the median for the regions introgressed from BSW into mRDC and the iHS test (P < 0.001). However, there was no significant correlation between rIBD scores for regions introgressed from HOL into mRDC and the iHS test. For example, the genomic region on chromosome 6 putatively under selection according to the iHS test (38,510,000–38,540,000 bp, 104 out of 108 SNPs with |iHS| > 2, highest |iHS| = 3.8) overlapped with an introgression signal from BSW into mRDC. This region is located upstream of the LAP3 gene, which is associated with milk composition, including fat and protein percentage, and calving traits (Cohen-Zinder ; Zheng ). We observed that the significant SNPs on chromosome 5, with a highest |iHS| of 5.3, overlapped with a high introgression signal from BSW into mRDC. These significant SNPs overlap with the BCAT1 gene, which is associated with milk yield and protein and fat percentage in milk (Pimentel ) (Figure 4c). There were significant SNPs with a highest |iHS| of 4.23 on chromosome 15 that overlapped with an introgressed region from BSW into mRDC. This region lies within the ZNF215 gene, which affects body conformation and milk composition (Magee ) (Figure 4e).
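Region summaries such as "104 out of 108 SNPs with |iHS| > 2, highest |iHS| = 3.8" can be derived from per-SNP scores with a short Python sketch like the one below. The function name, the (position, iHS) input layout, and the inclusive region bounds are assumptions made for illustration; only the |iHS| > 2 threshold comes from the text.

```python
def summarize_ihs_region(snps, start, end, threshold=2.0):
    """Summarize iHS scores inside a genomic region.

    snps: iterable of (position_bp, ihs) pairs for one chromosome
          (hypothetical layout).
    start, end: region bounds in bp, treated as inclusive (an assumption).
    Returns (n_extreme, n_total, max_abs_ihs); max_abs_ihs is None when
    the region contains no SNPs.
    """
    abs_scores = [abs(ihs) for pos, ihs in snps if start <= pos <= end]
    if not abs_scores:
        return 0, 0, None
    n_extreme = sum(1 for value in abs_scores if value > threshold)
    return n_extreme, len(abs_scores), max(abs_scores)


# Toy data, not taken from the study.
toy_snps = [(100, 2.5), (150, -3.8), (200, 1.0), (900, 5.0)]
n_extreme, n_total, max_abs = summarize_ihs_region(toy_snps, 100, 200)
```

Taking the absolute value treats strongly negative and strongly positive iHS scores alike, matching the |iHS| notation used in the text.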
Genomic regions showing signals from both the iHS test and introgression mapping, e.g., the BSW-introgressed region containing BCAT1, are probably under selection but not yet fixed in the population, owing to low selection pressure or recent introduction. There was a significant positive correlation (P < 0.001) across genomic regions between the number of individuals sharing overlapping ROH regions and rIBD scores larger than the median for both HOL and BSW (Figures S6 and S7). Short ROHs shared between individuals indicate selection events that have reached or are close to fixation. Interestingly, many small regions highly enriched for ROH hotspots overlapped with the longest region introgressed from HOL into mRDC, such as the region where the ZNF613 gene, associated with young stock survival (Wu ), is located (Figure 4d). Similarly, we observed that the region introgressed from HOL into mRDC where the THRSP and INTS4 genes are located showed high enrichment of ROH regions among individuals (Figure 4b). Studies have shown that the THRSP gene is associated with milk composition and involved in the regulation of mammary synthesis of milk fat (Fontanesi ), while the INTS4 gene is associated with myristic acid content, a carcass trait (Sasago ). One region introgressed from BSW into mRDC containing the TGFA gene, which is associated with sperm motility (Hering ), also showed high levels of ROH sharing between individuals.
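The "number of individuals sharing overlapping ROH regions" can be pictured as a per-window count along a chromosome. The Python sketch below is one way to compute such a count. The 10-kb window size, the half-open segment coordinates, the function name, and the input layout are all assumptions for illustration and are not taken from the study.

```python
def roh_sharing(roh_by_individual, chrom_length, window=10_000):
    """Count, for each window, how many individuals carry an ROH overlapping it.

    roh_by_individual: dict mapping an individual ID to a list of
        (start_bp, end_bp) ROH segments, half-open [start, end)
        (hypothetical layout).
    chrom_length: chromosome length in bp.
    window: window size in bp (assumed value).
    Returns a list with one count per window.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for segments in roh_by_individual.values():
        covered = set()
        for start, end in segments:
            if end <= start:
                continue  # skip empty or malformed segments
            first = max(start, 0) // window
            last = min(end - 1, chrom_length - 1) // window
            covered.update(range(first, last + 1))
        # Each individual contributes at most once per window.
        for index in covered:
            counts[index] += 1
    return counts


# Toy data, not taken from the study.
toy_roh = {
    "a": [(0, 15_000)],
    "b": [(5_000, 9_000), (9_500, 9_900)],
    "c": [(20_000, 30_000)],
}
toy_counts = roh_sharing(toy_roh, chrom_length=30_000)
```

Collecting the covered windows in a set per individual keeps an animal with several short ROHs in one window from being counted more than once.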

Conclusions

Together, the observed results demonstrate how crossbreeding followed by selection can shape the genomes of a modern breed on a genome-wide scale, using dairy cattle as an example. The well-documented breeding practice provides a robust model system for studying the consequences of adaptive introgression. A key observation was that the proportion of the genome introgressed from the donor breeds was highly unevenly distributed across the genome. Highly introgressed regions contained genes and QTL known to affect traits of interest that have been subjected to active selection by breeders. These traits include milk production, feed efficiency, calving traits, body conformation, carcass, and fertility traits. Artificial selection has played an important role in shaping the genomic footprints of introgression in the genome of a modern dairy cattle breed. These findings contribute to our understanding of the genomic consequences of selective introgression in the genomes of modern species.
References (10 of 73 shown)

1.  Detecting recent positive selection in the human genome from haplotype structure.

Authors:  Pardis C Sabeti; David E Reich; John M Higgins; Haninah Z P Levine; Daniel J Richter; Stephen F Schaffner; Stacey B Gabriel; Jill V Platko; Nick J Patterson; Gavin J McDonald; Hans C Ackerman; Sarah J Campbell; David Altshuler; Richard Cooper; Dominic Kwiatkowski; Ryk Ward; Eric S Lander
Journal:  Nature       Date:  2002-10-09       Impact factor: 49.962

2.  Exploration of relationships between production and fertility traits in dairy cattle via association studies of SNPs within candidate genes derived by expression profiling.

Authors:  E C G Pimentel; S Bauersachs; M Tietze; H Simianer; J Tetens; G Thaller; F Reinhardt; E Wolf; S König
Journal:  Anim Genet       Date:  2010-12-30       Impact factor: 3.169

3.  New phenotypes for new breeding goals in pigs. (Review)

Authors:  J W M Merks; P K Mathur; E F Knol
Journal:  Animal       Date:  2012-04       Impact factor: 3.240

4.  Genetic markers of body composition and carcass quality in grazing Brangus steers.

Authors:  M C Baeza; P M Corva; L A Soria; G Rincon; J F Medrano; E Pavan; E L Villarreal; A Schor; L Melucci; C Mezzadra; M C Miquel
Journal:  Genet Mol Res       Date:  2011-12-19

5.  Identification of single nucleotide polymorphisms in genes involved in digestive and metabolic processes associated with feed efficiency and performance traits in beef cattle.

Authors:  M K Abo-Ismail; M J Kelly; E J Squires; K C Swanson; S Bauck; S P Miller
Journal:  J Anim Sci       Date:  2013-03-18       Impact factor: 3.159

6.  Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene.

Authors:  Annett Eberlein; Akiko Takasuga; Kouji Setoguchi; Ralf Pfuhl; Krzysztof Flisikowski; Ruedi Fries; Norman Klopp; Rainer Fürbass; Rosemarie Weikard; Christa Kühn
Journal:  Genetics       Date:  2009-08-31       Impact factor: 4.562

7.  Disentangling Immediate Adaptive Introgression from Selection on Standing Introgressed Variation in Humans.

Authors:  Evelyn Jagoda; Daniel J Lawson; Jeffrey D Wall; David Lambert; Craig Muller; Michael Westaway; Matthew Leavesley; Terence D Capellini; Marta Mirazón Lahr; Pascale Gerbault; Mark G Thomas; Andrea Bamberg Migliano; Eske Willerslev; Mait Metspalu; Luca Pagani
Journal:  Mol Biol Evol       Date:  2018-03-01       Impact factor: 16.240

8.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

9.  Untangling the hybrid nature of modern pig genomes: a mosaic derived from biogeographically distinct and highly divergent Sus scrofa populations.

Authors:  Mirte Bosse; Hendrik-Jan Megens; Ole Madsen; Laurent A F Frantz; Yogesh Paudel; Richard P M A Crooijmans; Martien A M Groenen
Journal:  Mol Ecol       Date:  2014-06-16       Impact factor: 6.185

10.  A map of recent positive selection in the human genome.

Authors:  Benjamin F Voight; Sridhar Kudaravalli; Xiaoquan Wen; Jonathan K Pritchard
Journal:  PLoS Biol       Date:  2006-03-07       Impact factor: 8.029

