Literature DB >> 29965995

A system-based analysis of the genetic determinism of udder conformation and health phenotypes across three French dairy cattle breeds.

Andrew Marete1,2, Mogens Sandø Lund2, Didier Boichard1, Yuliaxis Ramayo-Caldas1.   

Abstract

Using GWAS to identify candidate genes associated with cattle morphology traits at a functional level is challenging. The main difficulty of identifying candidate genes and gene interactions associated with such complex traits is the long-range linkage disequilibrium (LD) phenomenon reported widely in dairy cattle. Systems biology approaches, such as combining the Association Weight Matrix (AWM) with a Partial Correlation in an Information Theory (PCIT) algorithm, can assist in overcoming this LD. Used in a multi-breed and multi-phenotype context, the AWM-PCIT could aid in identifying udder traits candidate genes and gene networks with regulatory and functional significance. This study aims to use the AWM-PCIT algorithm as a post-GWAS analysis tool with the goal of identifying candidate genes underlying udder morphology. We used data from 78,440 dairy cows from three breeds and with own phenotypes for five udder morphology traits, five production traits, somatic cell score and clinical mastitis. Cows were genotyped with medium (50k) or low-density (7 to 10k) chips and imputed to 50k. We performed a within breed and trait GWAS. The GWAS showed 9,830 significant SNP across the genome (p < 0.05). Five thousand and ten SNP did not map a gene, and 4,820 SNP were within 10-kb of a gene. After accounting for 1SNP:1gene, 3,651 SNP were within 10-kb of a gene (set1), and 2,673 significant SNP were further than 10-kb of a gene (set2). The two SNP sets formed 6,324 SNP matrix, which was fitted in an AWM-PCIT considering udder depth/ development as the key trait resulting in 1,013 genes associated with udder morphology, mastitis and production phenotypes. The AWM-PCIT detected ten potential candidate genes for udder related traits: ESR1, FGF2, FGFR2, GLI2, IQGAP3, PGR, PRLR, RREB1, BTRC, and TGFBR2.

Entities:  

Mesh:

Year:  2018        PMID: 29965995      PMCID: PMC6028091          DOI: 10.1371/journal.pone.0199931

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Genetic architecture of complex phenotypes in cattle includes many loci affecting a given trait [1]. Most of these loci have small effects, but few segregating loci have moderate-to-large effects possibly due to epistatic effects, varying selection goals or recent selection for the favorable mutant allele. Moreover, markers collectively capture most but not all additive genetic variance for phenotypes. The incomplete variance capture may be due to causal mutations with low allele frequencies and therefore in incomplete linkage disequilibrium (LD) with markers [2]. To reduce this LD, we can either do a between breed analysis with a large sample of genotyped cows [3] or combine the results of a within breed GWAS in a multi-breed context. This is possible because the cost of genotyping is decreasing thus allowing many breeds, including those of medium population size, to be genotyped rapidly primarily for genomic selection purpose. These large populations of genotyped cows with own performances allows us to: (1) detect QTL for newly recorded traits or traits previously not studied; (2) carry out large confirmation studies for conventional traits. Previous studies have demonstrated that polymorphic sites that segregate within and across bovine populations can be studied using imputed low-to-dense genotypes [4,5]. Such genotypes have been used in model organisms and dairy cattle leading to the identification of candidate causal variants or closely neighboring variants that control complex phenotypes [6,7]. These studies have been useful for identifying QTL regions and probable genes associated with a phenotype. So far, however, there have been few validation studies of the vast number of putative variants across and between breeds and amongst multiple phenotypes. This study uses 50k SNP data to validate such variants using the Association Weight Matrix (AWM) [8] approach as a post GWAS analysis tool. The AWM is a systems biology approach for the genetic dissection of complex traits based on applying gene network theory to the results from GWAS. Hence, if the AWM SNP matrix is used in combination with a Partial Correlation (PC) in an Information Theory (IT) framework, and for correlated phenotypes, then it is possible to generate gene networks with regulatory and functional significance for udder related phenotypes. Despite the limitations of the chip density, previous studies have shown the usefulness of the AWM to identify candidate genes in cattle, e.g. [9-10] and corroborated across different species, e.g. [11-12] in independent studies using the 50k marker density. In this study, we report results based on GWAS analysis for mammary conformation, milk production and health phenotypes for 78,440 dairy cows. A multi-step validation by combining the results of single SNP, single phenotype, in a multiple-breed context using the AWM-PCIT algorithm was performed. The aim was to identify the genes associated with mammary conformation and health phenotypes in Holstein, Montbeliarde, and Normande breeds, accounting for milk production as supportive traits. We further explored gene networks with the main gene ontology domains including biological processes, cellular component, and molecular functions.

Material and methods

Phenotypes

The cow sample was comprised of 46,732 Holstein, 20,141 Montbeliarde, and 11,965 Normande all with known parents. Phenotypes were yield deviations as produced by French national evaluation system [13]. A yield deviation is a performance adjusted for all non-genetic effects of the model [14]. In case of repeated records, a yield deviation is adjusted for the permanent environmental effect and averaged per animal. The 12 traits were: fore udder attachment (FUA), udder depth or development (UDD), udder cleft (UC), udder balance (UB), front teat placement (FTP), milk (MY), fat (FAT and FAT%), protein (PROT and PROT%) clinical mastitis (CM) and somatic cell score (SCS). SCS are derived from Somatic Cell Counts (SCC). SCC are obtained at each monthly test-day, and SCS are defined as SCS = 3+log2(SCC/100,000). CM events are declared by the farmer and recorded by the technician at each test-day. The analysis trait is defined by 1 if the cow has at least one clinical mastitis event in the lactation, and 0 if no there is no event. SCS and CM are repeated over lactations. The model for obtaining yield deviations was different according to the trait and all models included the genetic additive effect. The model for SCS and CM (recorded in the first three lactations), included the effect of herd x year, age at calving x parity x year, month of calving x parity x year, preceding days dry (for later parities), and a permanent environmental effect. All models assumed heterogeneous variances depending on herd-year. Each cow had one record for each trait. UDD can be either udder depth (Holstein and Normande) or udder development (Montbeliarde), but treated as the same trait because they have similar definitions. Depending on the trait, the number of animals with phenotypes ranged from 7,671 to 11,965 in Normande, 13,879 to 20,141 in Montbeliarde, and 32,491 to 46,732 in Holstein (Table 1). We estimated variance components by fitting a multiple trait REML animal model as implemented in the DMU software [15]. This study is based on already existing data hence we did not require ethical approval.
Table 1

Characteristics of udder conformation, production and health traits in three French dairy breeds.

TraitCows with traitsStandard Deviation
MON1NOR2HOL3MON1NOR2HOL3
Fore Udder Attachment (FUA)17,3307,67132,4911.181.261.19
Udder Depth or Development (UDD)17,3307,67132,4910.980.810.84
Udder Cleft (UC)17,3307,67132,4911.261.280.85
Udder Balance (UB)17,3307,67132,4910.790.821.39
Front Teat Placement (FTP)17,3307,67132,4911.111.140.96
Milk Yield (MY)20,09611,94446,732786742991
Fat Yield (FAT)20,09611,94446,73233.734.740.2
Protein Yield (PROT)20,09611,94446,73227.525.730.1
Fat Percent (FAT %)20,09611,94446,7321.261.280.85
Protein Percent (PROT %)20,09611,94446,7321.281.280.85
Clinical Mastitis (CM)13,8799,01332,4910.240.260.36
Somatic Cell Score (SCS)20,14111,96546,7320.930.900.96

1Montbeliarde;

2Normande;

3Holstein

1Montbeliarde; 2Normande; 3Holstein

Genotyping, quality control, and imputation

All cows were genotyped using Illumina BovineSNP50 BeadChip (50k) or Illumina BovineLD v.2 BeadChip. We used the UMD3.1 assembly of the bovine genome [16] for SNP chromosomal positions. We did not consider mitochondrial, X-chromosomal and Y-chromosomal SNP, as well as unmapped SNP for further analyses. We examined 43,800 SNP currently used in French genomic evaluation procedure [13]. Selection criteria was: call rate higher than 99%, minor allele frequency (MAF) higher than 2% in at least one of the three breeds, lack of Hardy–Weinberg Equilibrium (P < 10−4), technical quality assessed by their very low rate of Mendelian mismatch between parents and progeny and known position of genome assembly. We imputed cows genotyped from BovineLD v.2 BeadChip to 50k to obtain a genotype without missing information. This step was performed within breed in the conventional pipeline for genomic selection [13]. We imputed using FImpute software [17], and reference population included all 50k genotyped male and female animals per breed. Imputation error rate, measured in routine evaluation situation (and not in this study), varied from 0.2 to 2.5% depending on whether parents were genotyped. Across breeds, best imputation accuracies were observed in Montbeliarde which has the highest proportion of sires and dams genotyped with 50k, while accuracies were lower in Holstein, a breed with a smaller portion of genotyped dams, a lower percentage of 50k genotyped parents, and even a non-zero proportion of non-genotyped (usually foreign) sires.

Statistical framework

GWAS was done within breed with the Mixed Linear Model Association (MLMA) method as implemented in GCTA [18]. Because phenotypes were yield deviations already adjusted for non-genetic effects, we did not consider additional fixed effects. For each phenotype and SNP i, the model used in each breed was the following where y is a vector of yield deviations, μ is a mean; u is a vector of random additive polygenic effects and is where G is genomic relationship matrix based on all cows with phenotypes per breed and all autosomes. Z is incidence matrix relating phenotypes y to u, w is a vector of genotypes for SNP i, si is the effect of SNP i, and e is a vector of random residual effects. We calculated the relationship between two individuals’ j and k as , x being the number of alleles for individual j and SNP i and p is the observed allelic frequency, and, w was 43,800. We applied a genome-wide Bonferroni correction on all 43,800 tests to account for multiple testing.

Candidate variant discovery

We used the Association Weight Matrix (AWM) procedure to identify candidate genes per breed [8]. The AWM is a multiple trait approach that considers the genetic contribution of correlated traits allowing selection of pleiotropic SNP associated with numerous traits rather than a single trait. We classified trait information as either key or supportive trait, and the key trait in this study was udder depth or development (UDD) which is the most important type trait with the strongest relationship with mammary health and longevity. In addition, UDD is an aggregate trait, combining size, attachments, balance and strength of support. Populating the AWM starts with the selection of significant SNP from a GWAS [19]. The SNP additive effects are z-scored normalized by deviating the allele substitution effects from their mean and dividing by their standard deviation. We then created two matrices: (a) A z-scored additive values matrix (b) The GWAS p-values matrix. In both cases, rows represent SNP and columns represent traits. We then processed these matrices using the AWM algorithm, which includes five steps: (1) Primary SNP Selection: We select SNP associated with key trait using a P-value threshold (P < 0.05). (2): Exploring the dependency among traits: For the SNP selected in step (1), and, for the same threshold (P < 0.05), we register the average number of non-key traits to which the SNP are associated. In this study, that number was five traits. (3): Secondary SNP Selection: We select SNP from step (1) associated with at least five other traits including at least two udder traits. This step depends on correlation amongst traits and allows capturing most SNP associated with remaining traits. (4): Exploiting the genome map: We annotated the SNP captured in step (1), and step (3) using the UMD3.1 Genome assembly [16]. We classified the SNP that (i) mapped a gene, (ii) <10-kb to known genes, and, (iii) >10-kb to any coding region. For genes represented by more than one SNP, we select the SNP associated with the highest number of traits and has the lowest P-value average across all traits as representative for that gene. (5): Populating the AWM: Each {i,t} cell value in the AWM matrix corresponds to the z-score normalized additive effect of the ith SNP on the tth trait. This allows exploration of trait correlations column-wise, and gene/SNP interactions row-wise. We then calculate the SNP-based correlations and compare the SNP-based and genetic correlations, the latter being calculated as pedigree-based restricted maximum likelihood (REML), established in the same cows’ populations. We then use the AWM SNP matrix as the input for the PCIT algorithm [20] and for any trio of SNP; we estimate the first order partial correlation coefficients to identify meaningful gene-gene interactions. We annotated and clustered gene ontology (GO) annotations for significant PCIT gene-gene interactions using Cytoscape [21]. Finally, we compare the gene clusters amongst the three breeds and plot the most significant cluster.

Results

GWAS for all traits

Collectively for 78,440 cows, imputation resulted in a genotype density of 43,800 SNP. Of these, 38,827, 38,109, and 40,810 SNP had a MAF greater than 0.1 in Montbeliarde, Normande, and Holstein. Distribution of allele frequencies (MAF) of imputed genotypes was almost uniformly distributed across the MAF classes (Fig 1).
Fig 1

SNP variants as expressed in MAF in three French dairy breeds.

The minor allele frequency of SNP variants in three French dairy breeds.

SNP variants as expressed in MAF in three French dairy breeds.

The minor allele frequency of SNP variants in three French dairy breeds. We performed GWAS for real and imputed SNP and yield deviations (YD) for 12 traits: five udder conformation traits, five milk production phenotypes, somatic cell score and clinical mastitis (Table 2). Fig 2 presents Manhattan plots for the key trait (Udder depth or development) for three breeds and S1 Fig for other traits. As expected, the number of significant SNP increased with breeds sample size. Holstein had 7,029 associated SNP, Montbeliarde had 1,762 associated SNP, and, Normande had 1,039 associated SNP (Table 2). There was an overlap of significant SNP across the breeds. Some 483 SNP were common between Holstein and Montbeliarde, 356 SNP were common between Holstein and Normande, 233 SNP were common between Montbeliarde and Normande, and 206 SNP were common among three breeds (Fig 3). We observed overlap of significant SNP between udder and milk production traits. Irrespective of the breed, 15 SNP overlapped in udder related traits, and 205 SNP overlapped in milk production phenotypes. When considering 1SNP:1Gene, 3,651 significant SNP were close to genes (<10-Kb) across three breeds. Of these, 1,017 were highly associated with udder conformation traits, 2,502 with production traits and 132 with somatic cell score and clinical mastitis. The 2,673 additional SNP satisfying step 4 of the AWM algorithm (as described in M&M) was augmented with the 3,651 significant SNP from GWAS forming the AWM matrix with 6,324 SNP (S1 Table). Of these, 1,309 SNP were common across three breeds, and they mapped 1013 genes for 12 traits.
Table 2

Summary of GWAS for udder related traits in three French dairy cattle breeds.

Trait2Significant SNP1Significant SNP close3 to gene
MON4NOR5HOL6MON4NOR5HOL6
Fore Udder Attachment45295583220235
Udder Depth or Development2043857841347
Udder Cleft37244022922237
Udder Balance10017199581698
Front Teat Placement1253745642933
Milk Yield70536843042182
Fat Yield68475153141112
Protein Yield51036226137
Fat Percent4813901,498136105480
Protein Percent6133822,362214168816
Clinical Mastitis141233656113
Somatic Cell Score118

1Significant SNP = A SNP that has satisfied the Bonferroni threshold for a trait;

2Trait = A yield; deviation: phenotype corrected for environmental variances;

3Significant SNP within 10-kb of gene;

4Montbeliarde;

5Normande;

6Holstein

Fig 2

Manhattan plots for the key trait: Udder depth (Holstein and Normande) or development (Montbeliarde breed).

Fig 3

Summary of GWAS significant SNP and breed overlap for 12 traits in three French dairy breeds.

1Significant SNP = A SNP that has satisfied the Bonferroni threshold for a trait; 2Trait = A yield; deviation: phenotype corrected for environmental variances; 3Significant SNP within 10-kb of gene; 4Montbeliarde; 5Normande; 6Holstein Significant SNP associated with the key trait were evident for Holstein and Montbeliarde, and the most significant SNP that mapped a gene for the key trait is presented per chromosome in Table 3. In total, 17 SNP were most significant per chromosome SNP and mapped to a gene in Holstein and Montbeliarde. This included eight for Montbeliarde, nine for Holstein. Two lead SNP in Montbeliarde were rs41640614 (BTA16, SOAT1 gene, p = 3.47x10-15, Effect size = -0.185(0.02), MAF = 0.08) and rs108972236 (BTA19, ABCA5 gene, p = 2.20x10-11, Effect size = -0.112(0.01), MAF = 0.20). The two lead SNP in Holstein were rs41641987 (BTA19, PAFAH1B1 gene, p = 1.51x10-13, Effect size = -0.128(0.02), MAF = 0.04) and rs110651226 (BTA29, FOXRED1 gene, p = 2.30x10-12, Effect size = 0.059(0.01), MAF = 0.45).
Table 3

Most significant SNP per chromosome associated with udder depth/development and mapping a gene.

BreedSNP1BTA2Pos3 (bp)Effect alleleMAF4Effect sizeSE Effect sizep5Gene6
Montbeliarders43293677220760409G0.4510.0840.012.60X10-08HOXD1
Montbeliarders29019267334184021A0.367-0.0920.013.67X10-11SORT1
Montbeliarders437049461269648659G0.226-0.0950.011.97X10-08GPR180
Montbeliarders1090809851340031719A0.460.0750.012.63X10-07CFAP61
Montbeliarders1107616561582317986A0.3640.0660.013.15X10-04CTNND1
Montbeliarders416406141662100110A0.08-0.1850.023.47X10-15SOAT1
Montbeliarders1089722361961919633C0.2010.1120.012.20X10-11ABCA5
Montbeliarders412568812221326038G0.3150.0780.011.49X10-06ARL8B
Montbeliarders420490772431765644T0.1050.1270.021.00X10-07ZNF521
Holsteinrs1090495111367557015T0.272-0.0510.011.07X10-06TTI1
Holsteinrs418080961651621826C0.1990.0590.011.39X10-06PLCH2
Holsteinrs1108591301745680965T0.035-0.1280.025.57X10-07FBRSL1
Holsteinrs416419871924136906A0.13-0.0880.011.51X10-13PAFAH1B1
Holsteinrs420674312528003780C0.413-0.0550.012.08X10-10PHKG1
Holsteinrs415659912727804403A0.194-0.0590.012.66X10-06GGFBPP5
Holsteinrs421471062842881677T0.438-0.0470.011.94X10-06PTPN20
Holsteinrs1106512262930003729A0.450.0590.012.30X10-12FOXRED1

1SNP = Reference SNP ID as assigned by NCBI;

2BTA = Bos taurus autosome;

3Position = base pair position in BTA (UMD3);

4MAF = minor allele frequency;

5p = Bonferroni corrected P-value;

6Gene = SNP annotation mapping a gene

1SNP = Reference SNP ID as assigned by NCBI; 2BTA = Bos taurus autosome; 3Position = base pair position in BTA (UMD3); 4MAF = minor allele frequency; 5p = Bonferroni corrected P-value; 6Gene = SNP annotation mapping a gene We observed SNP associated with FUA and FTP in all breeds. The most significant of these signals were in Holstein and Montbeliarde and they include: BTA17 (rs41609100, FGF2 gene, Effect size = -0.12(0.01), p = 4.95x10-12, MAF = 0.073, Holstein), BTA20 (rs109428015, PRLR gene, Effect size = -0.22(0.03), p = 3.16x10-16, MAF = 0.033, Holstein), and, BTA26 (rs42088948, BTRC gene, Effect size = -0.10(0.01), p = 6.89x10-4, MAF = 0.068, Montbeliarde).

Genetic parameters and genomic AWM-based correlations

Heritability coefficients (diagonal), pedigree-based genetic correlations (upper diagonal), and SNP correlations (lower diagonal) are presented in Table 4. Heritability values are close to reported values for type traits [22], Fore udder attachment, (FUA) is 0.34, 0.26, 0.33 for Montbeliarde, Normande, and Holstein and higher for the other traits. For example, for Milk yield (MY), they reached 0.50, 0.61, and 0.54 in Montbeliarde, Normande and Holstein, respectively. These high values reflect the nature of the yield deviations (YD), which is a mean of records for repeated traits. YD is adjusted for permanent environment effect and thereby has a reduced non-genetic variability. As an example, assuming additive genetic variance of milk yield is 0.3, permanent environment variance is 0.2, and residual variance is 0.5 (we divide the residual variance by 2.5 records on average for production), the heritability of the corresponding YD is 0.3 / (0.3 + 0.5/2.5) = 0.6. These high values are very favorable for GWAS detection power.
Table 4

Heritability (diagonal), genetic (upper-diagonal), and SNP (lower-diagonal) correlations of udder conformation, milk production and health traits in three French dairy breeds.

BreedTrait3FUAUDDFTPUBUCMYFATPROTFAT%PROT%CMSCS
MontbeliardeFUA0.34-0.11-0.280.400.370.420.430.390.03-0.050.19-0.05
UDD0.540.37-0.39-0.380.370.330.320.31-0.02-0.080.090.03
FTP0.04-0.010.43-0.24-0.150.200.220.210.060.040.060.02
UB0.590.66-0.210.36-0.030.020.010.000.00-0.03-0.09-0.02
UC0.650.61-0.050.40.30-0.01-0.01-0.01-0.05-0.060.00-0.05
MY0.330.1-0.050.030.170.500.880.94-0.22-0.130.020.02
FAT-0.02-0.070.08-0.1-0.060.310.440.900.260.09-0.010.02
PROT-0.01-0.090.12-0.05-0.080.420.150.44-0.060.200.010.05
FAT%-0.32-0.350.09-0.41-0.32-0.260.250.110.680.46-0.03-0.02
PROT%0.010.090.190.090.02-0.230.290.150.150.66-0.030.00
CM0.250.180.330.080.180.75-0.1-0.01-0.170.180.030.20
SCS0.540.75-0.080.550.70.03-0.11-0.05-0.260.010.040.39
NormandeFUA0.260.420.390.630.150.280.270.26-0.01-0.03-0.07-0.04
UDD0.210.330.270.510.64-0.53-0.54-0.56-0.07-0.230.00-0.01
FTP-0.040.030.390.380.49-0.13-0.10-0.120.060.02-0.02-0.02
UB0.160.11-0.190.300.350.010.100.060.220.23-0.01-0.02
UC0.20.21-0.080.720.330.030.000.02-0.04-0.310.00-0.01
MY-0.530.160.060.10.060.610.900.95-0.160.040.00-0.10
FAT0.080.120.090.180.150.780.600.930.280.250.000.00
PROT0.010.37-0.10.40.380.270.370.62-0.010.330.02-0.05
FAT%0.280.29-0.210.380.440.210.310.430.630.50-0.030.06
PROT%0.190.190.02-0.090.010.120.120.090.160.61-0.030.03
CM0.070.370.09-0.010.050.01-0.01-0.060.040.300.030.37
SCS-0.010.060.00-0.27-0.250.00-0.1-0.18-0.110.330.100.40
HolsteinFUA0.330.49-0.330.390.090.00-0.02-0.01-0.04-0.05-0.07-0.02
UDD0.210.46-0.160.420.19-0.07-0.09-0.09-0.02-0.06-0.010.01
FTP-0.040.030.48-0.29-0.35-0.34-0.33-0.340.06-0.050.49-0.04
UB0.160.11-0.190.230.270.050.050.07-0.010.090.12-0.02
UC0.20.21-0.080.720.33-0.02-0.010.00-0.050.02-0.050.01
MY0.090.160.060.10.060.540.850.96-0.43-0.03-0.010.13
FAT0.080.120.090.180.150.780.490.900.100.260.00-0.03
PROT0.010.37-0.10.40.380.270.370.48-0.270.25-0.030.07
FAT%0.280.29-0.210.380.440.210.310.430.650.510.01-0.10
PROT%0.190.190.02-0.090.010.120.120.090.160.58-0.01-0.10
CM0.070.370.59-0.010.050.01-0.01-0.060.12-0.030.02-0.07
SCS-0.010.060.00-0.27-0.250.00-0.1-0.18-0.040.040.330.38

1Genetic correlation = correlations based on phenotypic data as estimated using a pedigree based REML;

2SNP correlation = correlations estimated from AWM matrix;

3Trait = A yield deviation: phenotype corrected for environmental variances: FUA: fore udder attachment, UDD: udder depth development, FTP: front teat placement, UB: udder balance, UC: udder cleft, MY: milk yield, FAT and FAT%: fat yield and content, PROT and PROT%: protein yield and content, CM: clinical mastitis, SCS: Somatic cell score

1Genetic correlation = correlations based on phenotypic data as estimated using a pedigree based REML; 2SNP correlation = correlations estimated from AWM matrix; 3Trait = A yield deviation: phenotype corrected for environmental variances: FUA: fore udder attachment, UDD: udder depth development, FTP: front teat placement, UB: udder balance, UC: udder cleft, MY: milk yield, FAT and FAT%: fat yield and content, PROT and PROT%: protein yield and content, CM: clinical mastitis, SCS: Somatic cell score Mammary morphology genetic correlations to production traits ranged from positive for fore udder attachment (FUA) and milk yield (MY) in Montbeliarde (0.42) and Normande (0.28) to medium negative for udder depth or development (UDD) and protein yield (PROT) in Normande (-0.56) and front teat placement (FTP) and fat yield (FAT) in Holstein (-0.33). SNP correlations were numerically different compared to genetic correlations; however, the correlation was generally in the same direction. For instance, the genetic correlation between fore udder attachment (FUA) and udder balance (UB) was 0.40, 0.63 and 0.39, whereas, SNP correlations for these two traits was 0.59, 0.16 and 0.16 for Montbeliarde, Normande and Holstein breeds, respectively. There were moderate to zero genetic correlations between milk yield (MY) and FUA for Montbeliarde (0.42), Normande (0.28), and Holstein (0), and low SNP correlation (Montbeliarde (0.10), Normande (0.16) and Holstein (0.16)) between MY and udder depth or development (UDD). Holstein breed had a highly positive genetic (0.49) and SNP correlation (0.59) between front teat placement (FTP) and clinical mastitis (CM), a trend that was not evident in other two breeds. However, Montbeliarde breed showed a strong SNP correlation between FTP and CM (0.33) and between UDD and CM (0.18). Other trends evident from genetic correlations were between FUA and PROT for Montbeliarde (0.39) and Normande (0.26), FUA and CM for Montbeliarde (0.19) and a moderate genetic correlation between FTP and CM for Holstein (0.49). Genetic and SNP correlations were also comparable between FTP and PROT for all breeds, with genetic/SNP correlation being from medium in Montbeliarde (0.21 / 0.12) and Holstein (-0.34 / -0.1) to low in Normande (-0.12 / -0.1). However, the trend deviated between udder balance (UB) and fat percent (FAT %) with minimal genetic correlation in Normande (0.22) and no genetic correlation in Montbeliarde and Holstein but with moderate SNP correlations in Montbeliarde (-0.41), Normande (0.38) and Holstein (0.38). In general, genetic correlations are in the range of usual values, with high correlations between milk, fat and protein, a moderately negative correlation between production and type traits, moderately positive correlations between conformation traits, and low correlations otherwise. However, though there were deviations between genetic and SNP correlations in some of the traits, most traits correlations were in the same direction thus drawing plausibility of SNP correlated traits.

Gene ontology (GO) for AWM selected genes

We considered main ontology domains including, biological processes, cellular component, and molecular functions. We observed 39 gene clusters (S2 Table) and Table 5 presents the top five biological processes that are relevant for udder morphology and health traits (S3 Table contains complete gene ontology list for this study). Top cluster had eight GO terms with the most significant GO term being “mammary gland epithelium development” (p = 4.64x10-16). Among the AWM-PCIT, genes ten were transcription factors (TF) directly associated with the terms “mammary gland development”, “mammary gland duct morphogenesis", mammary gland alveolus development”, “tissue development”, and, “epithelial tube morphogenesis”. These ten TF include GLI2 (BTA2:72.98Mb), IQGAP3 (BTA3:14.31Mb), PGR (BTA4:62.53Mb), ESR1 (BTA9:89.97Mb), FGF2 (BTA17:35.23Mb), PRLR (BTA20:39.13Mb), TGFBR2 (BTA22:5.14Mb), RREB1 (BTA23:47.90Mb), BTRC (BTA26:22.06Mb), and, FGFR2 (BTA26:41.82Mb). Fig 4 presents their GO term interactions with percentage association. Other GO terms in the top cluster included “gland development” (p = 6.50x10-12) and “system development” (p = 5.38x10-5). Top GO term for the second cluster was “mammary gland epithelium development” (p = 5.16x10-16) while “regulation of intracellular signal transduction” was the least significant term in this cluster (p = 4x10-4). There was an enrichment for “neuropathic pain-signaling in dorsal horn neurons pathway” (p = 2.57x10-8), “G-protein coupled receptor signaling” (p = 1.07x10-7) as well as the “CREB signaling in neurons” (p = 7.24x10-7). Other pathways detected were “cAMP-mediated signaling” (p = 2.88x10-5), “synaptic long-term depression (p = 8.71x10-7), “axonal guidance signaling” (p = 1.15x10-6), and “synaptic long-term potentiation” (p = 1.58 x10-5).
Table 5

Top five clusters of transcription factors/genes and associated GO terms associated with udder morphology in three French dairy breeds.

Transcription factor1/genes GO2 TermGene countpTop Associated Genes3
cluster1 Enrichment Score4: 9.17
mammary gland epithelium development1344.64 x10-16[BTRC, ESR1, FGF2, FGFR2, GLI2, IQGAP3, PGR, PRLR, RREB1, TGFBR2]
regulation of response to stimulus36537.14 x10-6[BMP7, GCH1]
gland development4296.50 x10−12[ROR2, RORA, SYCP2, TCF7L2]
epithelium development10576.64 x10-9[FGFR2, HHIP, RDH10, RSPO2, TNC]
tissue development17142.10 x10-7[AKT3, CTNNA2, CTNNA3, GNAI3, INADL, MAGI3, PARD3, PPP2R2B, PRKCB, PRKCE, PRKCH, PTEN, RAB3B]
animal organ development30222.66 x10-7[ADGRB1, ADGRB3, ANGPT1, BMPER, CALCRL, COL8A1, CSPG4,EPAS1, FGF2, FGFR2, FN1, GJA5, HDAC9, HSPG2, LOXL2, MAP2K5,MTDH, NOV, NRP2, NRXN3, PDCL3, PRKCB, PTEN, PTK2, ROCK2,RORA, SH2D2A, SHB, SPI1, STAB1, TGFBR2, THSD7A, TIE1, VAV3]
system development43095.38 x10-5[HMGCLL1, HMGCS2, OXCT1]
multicellular organism development48552.00 x10-4[GALNT13, GALNT14, GALNT18, GALNTL6, GCNT3]
cluster2 Enrichment Score4: 8.92
mammary gland alveolus development685.16 x10-16[ESR1, PRLR]
intracellular signal transduction26816.97 x10-5[ABCA1, SYCP2]
regulation of intracellular signal transduction16324 x10-4[ANXA4, CD58]
cluster3 Enrichment Score4: 8.41
positive regulation of macromolecule metabolic process28632.77 x10-9[CHRNB2, GPAM, IL7, PELI1, SPTA1, STAT5B, VAV3, VTCN1]
positive regulation of cellular metabolic process28992.62 x10-9[ARPC2, AXIN2, BAIAP2, CDH4, CUX1, CUX2, EPB41L5, EPHA4, FBXW8, FN1, FYCO1, LPAR3,LRP8, LRRC16A, MAP2K2, NTRK2, PTPRD, RREB1, RUFY3, SEMA4D, TCF7L2, TGFB3, TGFBR2]
positive regulation of metabolic process35631.46 x10-7[PIWIL2, PLCB1]
negative regulation of biological process45798.50 x10-6[DAB1, FYCO1, PTEN, PTK2, RTN4, RUFY3, SEMA3B, SEMA4D, SYNGAP1]
cluster4 Enrichment Score4: 8.26
branching morphogenesis of an epithelial tube1465.32 x10-9[CD44, EYA1, FAT4, FGF2]
morphogenesis of a branching epithelium1781.29 x10-8[ADORA1, ARRB1, CACNA1A, CACNA1B, GABBR2, GABRA1, GNAI3,GNB5, GNG7, GNGT2, PDE11A, PDE1A, PDE2A, PRKACB, PRKCB]
epithelial tube morphogenesis3051.65 x10-7[FGFR2, HHIP, RDH10, RSPO2, TNC]
tube morphogenesis and development3412.28 x10-7[AARS2, ALKBH8, CDK5RAP1, CDKAL1, NDC1, NUP210, NUP93, SARS, TSEN54]
organ morphogenesis9052.63 x10-5[CHRNB2, KCNA2]
cluster5 Enrichment Score4: 7.99
cellular response to organic substance23522.89 x10-9[LBP, NDUFA2]
cellular response to chemical stimulus28451.28 x10-7[ABCC9, AKAP6, CACNA1D, CASQ2, HCN1, KCNA2, KCNAB2, KCNB2, KCNC1, KCND3, KCNH1,KCNK2, KCNMA1, KCNMB1, KCNN2, KCNN3, KCNT1, KCNT2, KCNV2, STK39, YWHAE]
response to organic substance29812.47 x10-7[ASS1, SMYD3, SRD5A1, TRIM63]
response to chemical43736.51 x10-5[MCM3, POLD1, POLE3, PRIM2, SMARCAL1]

1Transcription factor = a protein that controls rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. Sometimes its homonymous to gene;

2GO = Gene Ontology as expressed in main domains: Biological, Cellular and Molecular;

3Top Associated Genes = most probable associated candidate gene of interest for udder morphology;

4Enrichment Score = measure of over-represented (or under-represented) GO terms using AWM gene annotations

Fig 4

Ten candidate gene-GO term interactions with percentage association in three French dairy breeds.

1Transcription factor = a protein that controls rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. Sometimes its homonymous to gene; 2GO = Gene Ontology as expressed in main domains: Biological, Cellular and Molecular; 3Top Associated Genes = most probable associated candidate gene of interest for udder morphology; 4Enrichment Score = measure of over-represented (or under-represented) GO terms using AWM gene annotations Pathway enrichments detected (S3 File) included “Calcium: cation antiporter activity” and the “Calcium-activated Potassium channel activity” for molecular functions (p<10−3), whereas, for biological processes, “dendrite development” and “putrescine biosynthetic” process were most represented (P<10−3) while “postsynaptic density” and “presynaptic membrane” were top GO terms for cellular components.

Discussion

Previously, Fortes et al. [8] and Ramayo-Caldas et al. [23] suggested the Association Weight Matrix’s (AWM) as an alternative tool to identify genes that would otherwise be missed by traditional single-trait GWAS. This study further supports that suggestion by focusing on GWAS for other kinds of traits, such as udder morphology and health traits common across three French dairy breeds. Though single-trait-single-SNP GWAS focus is on most significant SNP, they can, aid in identifying lead SNP for QTL associated with a given trait. Our study identified three lead SNP associated with front teat placement (FTP) and fore udder attachment (FUA). These SNP mapped FGF2, PRLR, and BTRC genes. PRLR gene (prolactin receptor) was previously associated with milk production traits in Finnish Ayrshire dairy cows [24]. Wang et al. [25] reported the association of FGF2 gene (Fibroblast growth factor 2) to fat yield and percentage and somatic cell score in US Holstein. Coleman-Krnacik et al. [26] reported the expression of FGF2 gene in the bovine mammary gland and uterine endometrium (UE). In the mammary gland, the FGF2 gene may play a role in development and reorganization of the mammary gland, while in UE, FGF2 gene is mainly expressed throughout estrous cycle and early pregnancy. BTRC gene (Beta-Transducing Repeat Containing E3 Ubiquitin Protein Ligase) is an F-box protein involved in Wnt/β-Catenin signaling pathway and indirectly activates nuclear factor kappa-B (NF-kB)[27]. Raven et al. reported these pathways to be highly relevant during mammary development and pregnancy, and as such, could have a major functional role in lactation [27]. The AWM-PCIT algorithm identified interacting candidate genes for udder conformation traits by first establishing SNP based correlation and fixing udder depth or development (UDD) as the key trait when constructing the AWM SNP matrix. These genes were represented by several biological processes involved in positive regulation of cellular biosynthetic processes and cell development, suggesting an endogenous characterization linked to udder morphology [28]. Fortes et al. [10] reported more similar SNP and genetic correlations for traits with moderate to high heritability and less similar correlations between traits with low heritability (Table 4). In our study, most traits had a heritability >10% thus aiding both GWAS detection power and AWM SNP detection. The AWM was assessed for gene ontology (GO). The top GO terms were Calcium cation antiporter and Calcium-activated Potassium channel activity. As reported by Paulsen et al. [29], the former is a member of the cation diffusion facilitator (CDF) superfamily which are integral membrane proteins that increase tolerance to divalent metal ions, whereas, the latter is involved in ionic signaling in cells, a critical function for hormonal control of cell proliferation and differentiation [30]. Control of calcium signaling is likely to have profound effects on mammary physiology and pathophysiology. Their high significance level can be explained by the fact that mammary glands extract large quantities of calcium from the plasma during lactation [31], to ensure sufficient calcium concentration in milk. Dendrite development and putrescine biosynthetic processes were top biological GO terms. Dendritic cells (DC) are accessory cells of the mammalian immune system whose work is availing antigen material to T cells of the immune system [32-33]. This ability to stimulate native T-cells makes this result significant because it can directly be applied to improve udder health. DC has also been reported to play a pivotal role in the initiation of an adaptive immune response [34]. Putrescine, as a member of polyamine pathway, is regulated by the periovulatory endocrine milieu [35]. The central pathways that showed enrichment were “Neuropathic pain-signaling in dorsal horn neurons pathway," and, “CREB signaling pathway." Neuropathic pain is the pain after nerve injury whereas the dorsal horn (tip of the spinal cord) pathways have been shown to offer substantial overlap with spinal projections from adjacent mammary glands in model organisms [36]. Reflex milk ejection may result from the strong integration of sensory input from mammary glands afferents that terminate in the dorsal horn. CREB (cyclic-AMP response element-binding) protein family of transcription factors (TF—a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence [37]) play a crucial role in supporting the survival of sensory neurons [38]. The intervention of sensory neurons stimulates cells to secrete nerve growth factor (NGF) via the sympathetic nervous system (SNS) that maintains homeostasis [39]. NGF mediate its functions through ligation of tyrosine kinase (Trk) receptors [40]. Regulation of Trk signaling is by a variety of intracellular signaling cascades, which include MAPK pathway, which promotes cell continuation and growth [41]. Previous studies indicate that Trk could be multifunctional growth factors that exert various effects through their receptors on non-neuronal cells such as mammary ducts [30]. Fontanesi et al. [8] reported that long-term transcriptomic adaptations of tissues depend on the action of external stimuli that induce action of cellular functions on transcription factors (TF) [42]. In this study, we identified 39 interacting gene clusters and the most significant cluster had ten TF directly involved with mammary gland development and mammary gland duct morphogenesis: BTRC, ESR1, FGF2, FGFR2, GLI2, IQGAP3, PGR, PRLR, RREB1, and, TGFBR2. These TF are homonymous with genes encoding them. This top cluster is directly involved in mammary gland development, regulation of response to a stimulus, gland development, epithelium development, tissue development, animal organ development, system development and multicellular organism development. This was partly in agreement to works reported by Yang et al. [43] on QTL associated with follicle stimulating hormone production in Chinese Holstein cattle.

Conclusion

Our study suggests the usefulness of system-based approaches to identify candidate genes from interacting gene networks in a multi-breed context. We achieved this by exploiting associations between correlated traits. The reluctant inclusion of intergenic SNP leaves the possibility that the AWM approach was not capturing significant SNP such as trans-activators. Nonetheless, the AWM proved to be more efficient for integrating related complex traits and analyzing thousands of SNP and therefore appropriate for the analysis of these complex traits. When applied to our dataset, it predicted gene interactions that are consistent with the known biology of udder morphology and health captured known TF (e.g., ESR1, FGF2, GLI2, PGR and BTRC), and provided new candidate genes for udder morphology. This experiment may be replicated using whole genome sequence data and other independent datasets.

Manhattan plots for milk production, udder morphology and udder health traits in three French dairy cattle breeds.

(PDF) Click here for additional data file.

Associated Weight Matrix for three French dairy cattle breeds.

(XZ) Click here for additional data file.

Enrichment scores for udder morphology genes.

(XLSX) Click here for additional data file.

Gene ontology for AWM-PCIT selected genes.

(XLSX) Click here for additional data file.
  41 in total

1.  Association weight matrix: a network-based approach towards functional genome-wide association studies.

Authors:  Antonio Reverter; Marina R S Fortes
Journal:  Methods Mol Biol       Date:  2013

2.  Bovine mammary dendritic cells: a heterogeneous population, distinct from macrophages and similar in phenotype to afferent lymph veiled cells.

Authors:  Nicolas G Maxymiv; Mini Bharathan; Isis K Mullarky
Journal:  Comp Immunol Microbiol Infect Dis       Date:  2011-10-21       Impact factor: 2.268

Review 3.  Calcium transport and signaling in the mammary gland: targets for breast cancer.

Authors:  Won Jae Lee; Gregory R Monteith; Sarah J Roberts-Thomson
Journal:  Biochim Biophys Acta       Date:  2005-12-22

4.  PCIT: an R package for weighted gene co-expression networks based on partial correlation and information theory approaches.

Authors:  Nathan S Watson-Haigh; Haja N Kadarmideen; Antonio Reverter
Journal:  Bioinformatics       Date:  2009-12-09       Impact factor: 6.937

5.  Differential temporal and spatial gene expression of fibroblast growth factor family members during mouse mammary gland development.

Authors:  S Coleman-Krnacik; J M Rosen
Journal:  Mol Endocrinol       Date:  1994-02

6.  Imputation of genotypes from different single nucleotide polymorphism panels in dairy cattle.

Authors:  T Druet; C Schrooten; A P W de Roos
Journal:  J Dairy Sci       Date:  2010-11       Impact factor: 4.034

7.  Afferent projections from the mammary glands to the spinal cord in the lactating rat--I. A neuroanatomical study using the transganglionic transport of horseradish peroxidase-wheatgerm agglutinin.

Authors:  J G Tasker; D T Theodosis; D A Poulain
Journal:  Neuroscience       Date:  1986-10       Impact factor: 3.590

8.  Using regulatory and epistatic networks to extend the findings of a genome scan: identifying the gene drivers of pigmentation in merino sheep.

Authors:  Elsa García-Gámez; Antonio Reverter; Vicki Whan; Sean M McWilliam; Juan José Arranz; James Kijas
Journal:  PLoS One       Date:  2011-06-20       Impact factor: 3.240

9.  From SNP co-association to RNA co-expression: novel insights into gene networks for intramuscular fatty acid composition in porcine.

Authors:  Yuliaxis Ramayo-Caldas; Maria Ballester; Marina R S Fortes; Anna Esteve-Codina; Anna Castelló; Jose L Noguera; Ana I Fernández; Miguel Pérez-Enciso; Antonio Reverter; Josep M Folch
Journal:  BMC Genomics       Date:  2014-03-26       Impact factor: 3.969

10.  A multi-trait meta-analysis with imputed sequence variants reveals twelve QTL for mammary gland morphology in Fleckvieh cattle.

Authors:  Hubert Pausch; Reiner Emmerling; Hermann Schwarzenbacher; Ruedi Fries
Journal:  Genet Sel Evol       Date:  2016-02-16       Impact factor: 4.297

View more
  10 in total

1.  Udder health, conceptual construct, and uses of the term: A systematic review from 1962 to 2019.

Authors:  Richard Zapata-Salas; José F Guarín; Leonardo A Ríos-Osorio
Journal:  Vet World       Date:  2022-04-08

2.  Epigenomics and genotype-phenotype association analyses reveal conserved genetic architecture of complex traits in cattle and human.

Authors:  Shuli Liu; Ying Yu; Shengli Zhang; John B Cole; Albert Tenesa; Ting Wang; Tara G McDaneld; Li Ma; George E Liu; Lingzhao Fang
Journal:  BMC Biol       Date:  2020-07-03       Impact factor: 7.431

3.  Direct Phenotyping and Principal Component Analysis of Type Traits Implicate Novel QTL in Bovine Mastitis through Genome-Wide Association.

Authors:  Asha M Miles; Christian J Posbergh; Heather J Huson
Journal:  Animals (Basel)       Date:  2021-04-17       Impact factor: 2.752

4.  A Co-Association Network Analysis Reveals Putative Regulators for Health-Related Traits in Pigs.

Authors:  Daniel Crespo-Piazuelo; Yuliaxis Ramayo-Caldas; Olga González-Rodríguez; Mariam Pascual; Raquel Quintanilla; Maria Ballester
Journal:  Front Immunol       Date:  2021-11-26       Impact factor: 7.561

5.  Genome-Wide Association Study for Udder Conformation Traits in Chinese Holstein Cattle.

Authors:  Mudasir Nazar; Ismail Mohamed Abdalla; Zhi Chen; Numan Ullah; Yan Liang; Shuangfeng Chu; Tianle Xu; Yongjiang Mao; Zhangping Yang; Xubin Lu
Journal:  Animals (Basel)       Date:  2022-09-22       Impact factor: 3.231

6.  Selection Response Due to Different Combination of Antagonistic Milk, Beef, and Morphological Traits in the Alpine Grey Cattle Breed.

Authors:  Enrico Mancin; Cristina Sartori; Nadia Guzzo; Beniamino Tuliozi; Roberto Mantovani
Journal:  Animals (Basel)       Date:  2021-05-08       Impact factor: 2.752

7.  Genome-Wide Association Studies of Somatic Cell Count in the Assaf Breed.

Authors:  Yasemin Öner; Malena Serrano; Pilar Sarto; Laura Pilar Iguácel; María Piquer-Sabanza; Olaia Estrada; Teresa Juan; Jorge Hugo Calvo
Journal:  Animals (Basel)       Date:  2021-05-24       Impact factor: 2.752

8.  A Meta-Analysis Including Pre-selected Sequence Variants Associated With Seven Traits in Three French Dairy Cattle Populations.

Authors:  Andrew G Marete; Bernt Guldbrandtsen; Mogens S Lund; Sébastien Fritz; Goutam Sahana; Didier Boichard
Journal:  Front Genet       Date:  2018-11-06       Impact factor: 4.599

9.  Diversifying selection signatures among divergently selected subpopulations of Polish Red cattle.

Authors:  Artur Gurgul; I Jasielczuk; E Semik-Gurgul; T Szmatoła; A Majewska; E Sosin-Bzducha; M Bugno-Poniewierska
Journal:  J Appl Genet       Date:  2019-01-26       Impact factor: 3.240

10.  Genome-Wide Association Study Candidate Genes on Mammary System-Related Teat-Shape Conformation Traits in Chinese Holstein Cattle.

Authors:  Mudasir Nazar; Xubin Lu; Ismail Mohamed Abdalla; Numan Ullah; Yongliang Fan; Zhi Chen; Abdelaziz Adam Idriss Arbab; Yongjiang Mao; Zhangping Yang
Journal:  Genes (Basel)       Date:  2021-12-19       Impact factor: 4.096

  10 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.