Literature DB >> 35723402

Multi-Omics Integration and Network Analysis Reveal Potential Hub Genes and Genetic Mechanisms Regulating Bovine Mastitis.

Masoumeh Naserkheil1, Farzad Ghafouri2, Sonia Zakizadeh3, Nasrollah Pirany4, Zeinab Manzari2, Sholeh Ghorbani3, Mohammad Hossein Banabazi3, Mohammad Reza Bakhtiarizadeh5, Md Amdadul Huq6, Mi Na Park1, Herman W Barkema7, Deukmin Lee8, Kwan-Sik Min8.   

Abstract

Mastitis, inflammation of the mammary gland, is the most prevalent disease in dairy cattle that has a potential impact on profitability and animal welfare. Specifically designed multi-omics studies can be used to prioritize candidate genes and identify biomarkers and the molecular mechanisms underlying mastitis in dairy cattle. Hence, the present study aimed to explore the genetic basis of bovine mastitis by integrating microarray and RNA-Seq data containing healthy and mastitic samples in comparative transcriptome analysis with the results of published genome-wide association studies (GWAS) using a literature mining approach. The integration of different information sources resulted in the identification of 33 common and relevant genes associated with bovine mastitis. Among these, seven genes-CXCR1, HCK, IL1RN, MMP9, S100A9, GRO1, and SOCS3-were identified as the hub genes (highly connected genes) for mastitis susceptibility and resistance, and were subjected to protein-protein interaction (PPI) network and gene regulatory network construction. Gene ontology annotation and enrichment analysis revealed 23, 7, and 4 GO terms related to mastitis in the biological process, molecular function, and cellular component categories, respectively. Moreover, the main metabolic-signalling pathways responsible for the regulation of immune or inflammatory responses were significantly enriched in cytokine-cytokine-receptor interaction, the IL-17 signaling pathway, viral protein interaction with cytokines and cytokine receptors, and the chemokine signaling pathway. Consequently, the identification of these genes, pathways, and their respective functions could contribute to a better understanding of the genetics and mechanisms regulating mastitis and can be considered a starting point for future studies on bovine mastitis.

Entities:  

Keywords:  bovine; hub genes; mastitis; multi-omics data; regulatory networks; transcriptome sequencing

Year:  2022        PMID: 35723402      PMCID: PMC8928958          DOI: 10.3390/cimb44010023

Source DB:  PubMed          Journal:  Curr Issues Mol Biol        ISSN: 1467-3037            Impact factor:   2.976


1. Introduction

Over the last decade, advances in high-throughput genotyping and sequencing technologies [1], along with progress in developing computational methods [2], have led to a revolution towards a better understanding of the genetic architecture underlying complex traits and diseases, with exceptional depth. To date, several studies have focused on integrating different information sources (“omics” datasets) to create robust insights into complex molecular functional mechanisms by reinforcing complementary evidence from multiple levels [3,4,5]. In this regard, the results of different types of multi-layer studies have been reported, ranging from simple combinations (two different kinds of -omics data) to more comprehensive and computationally demanding ones (multiple kinds of -omics data). Incorporating two layers under a systems biology framework can involve approaches that integrate genomics and transcriptomics [6,7,8], metabolomics and transcriptomics [9,10,11], proteomics and transcriptomics [12,13], and proteomics and metabolomics analyses [14,15] to functionally characterize the interactions at the molecular level for traits of interest in humans and in livestock species. Bovine mastitis is a common and costly disease, which has a considerable effect on the profitability of the production system, owing to its negative impacts on milk yield, quality, and reproductive performance; early culling; animal welfare issues; and the cost of treatment [16,17,18,19]. The inflammation of the mammary gland occurs in response to infection with pathogenic microorganisms or physiological and metabolic changes [20,21]. Although the heritability of mastitis is low [22], and genetic correlations between mastitis and production traits are unfavorable [23,24], genetic improvement in terms of mastitis resistance is a major breeding goal. It is also known that mastitis is highly genetically correlated with somatic cell count (SCC), which consists of macrophages, lymphocytes, and epithelial cells, and consequently, this can be used as an important indicator of udder health [25]. Furthermore, selection for correlated traits, such as reduced SCC (indicating increased mastitis resistance) could be an interesting alternative, allowing scientists to infer and comprehend the genetic and molecular mechanisms underlying these traits. In other words, the discovery of genomic regions, disease-causing genes, and biomarkers associated with mastitis is of essential importance in improving the diagnosis and treatment of the disease. In the literature, numerous studies have been carried out to identify functional candidate genes associated with mastitis based on genome-wide association studies (GWAS) [26,27,28,29,30] and transcriptome studies [6,16,31,32]. On the other hand, concordance among these studies is low, indicating difficulties in identifying reliable candidate genes for mastitis. New approaches integrating GWAS results with additional sources of information can overcome this challenge. Hence, it is worth investigating the molecular regulatory mechanisms through which mastitis can be developed. Therefore, the objective of this study was to use the integration of previously published RNA-Seq and microarray data with GWAS results to identify and prioritize potential hub genes and create reconstructions of the protein–protein interaction (PPI) and gene regulatory networks, as well as modeling of the three-dimensional hub protein structure involved in pathological processes related to mastitis in dairy cattle.

2. Materials and Methods

The overall workflow for the data collection and the analysis of relevant genes related to mastitis in dairy cattle is presented in Figure 1.
Figure 1

Schematic of the workflow used to reconstruct the metabolic pathways of mastitis in dairy cattle. The main gene list was prepared from RNA-Seq and microarray datasets, and literature mining. The protein–protein interaction network (PPI), gene regulatory network (GRN), and interactive bipartite network of gene–miRNA interactions were reconstructed using Cytoscape.

2.1. Data Collection

Collection and evaluation of the available data is the first step in better understanding the reconstruction of molecular networks and the biological basis in terms of the identification of candidate genes, gene regulation, interactions, protein–protein interaction (PPI), and metabolic signaling networks. In this study, the microarray and RNA-sequencing (RNA-Seq) datasets, available in the public repository of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO), were retrieved for samples of mastitis and healthy Bos taurus species. The accession numbers for the RNA-Seq and microarray datasets are shown in Table 1. Six Holstein cows from first to third lactations and days in milk (DIM) ranging from 7 to 236 were included in the GSE131607 dataset. All cows were kept in freestall housing at the University of California–Davis, fed a total mixed ration (TMR), and given ad libitum access to water. Two different samples were taken from each cow after diagnosis using the California Mastitis Test, one sample from the mastitic quarter (n = 6), and the other sample taken diagonally across from the mastitic quarter, which was confirmed as the healthy quarter (n = 6), based on having a somatic cell count (SCC) less than 100,000 cells/mL milk [16]. The GSE15020 and GSE15022 datasets were related to microarray analysis from a study by Mitterhuemer et al. [31]. Fifteen healthy German Holstein Frisian cows in mid-lactation (3 to 6 months postpartum) were included in the study. Quarter milk samples were collected and tested weekly before the trial to ensure that they contained <50,000 somatic cells/mL and were free of mastitis pathogens. The animals were inoculated in one quarter with E. coli and slaughtered after 6 h (n = 5) or 24 h (n = 5) in two different infection methods. Five cows, considered as controls, received no treatment and were slaughtered after 24 h. In total, 89 healthy and 75 diseased German Holstein cows were tested for the GSE93082 dataset. Diseases were distinguished by either systemic (extra-mammary) occurrence or those affecting the mammary gland (mastitis) to account for influences on the milk composition from local inflammatory processes. All cows were examined thoroughly by the dairy herd manager, trained staff, or a veterinarian. Healthy animals (2–4 years old, 1st to 3rd lactation, one animal 4th and one 8th lactation) which had no clinical signs of disease and no abnormalities in the udder or milk, with a reported somatic cell count less than 100,000 cells/mL were chosen as controls. Most of the control samples were taken during early lactation within 10 to 100 days postpartum. Diseased animals were in the 1st to 8th lactation period from 10 to 220 days postpartum. The milk samples of control animals or cows with extra-mammary diseases were collected and tested from one quarter or a composite milk sample (equal volumes from all 4 quarters mixed) [32]. In the GSE75379 dataset, sixteen healthy primiparous Holstein cows were inoculated with live E. coli into one mammary quarter at four to six weeks after parturition. The cows were housed in straw-bedded tie-stalls, where they were individually fed and given free access to water. The animals were fed using a TMR based on corn silage, minerals, and vitamins for ad libitum intake twice daily. Daily feed intake and milk yield at each milking were recorded. Prior to the start of the study period, cows were considered healthy and free of mastitis-causing pathogens based on body temperature, white blood cell count (WBC), California Mastitis Test, glutaraldehyde test, SCC, and bacteriological examinations of milk samples. Control quarters were selected based on bacteriological tests, in which quarter foremilk SCCs were <181,000 cells/mL at 24 h post-intramammary infection (IMI). Biopsy specimens of healthy and diseased udder tissue were performed 24 h post-IMI in infected and non-infected (control) mammary quarters [33].
Table 1

Summary of the GEO accession numbers for RNA-Seq and microarray data sets.

No.Data TypeGEO a AccessionPlatformsSamples (M:H) bCitation
1RNA-SeqGSE131607GPL15749 (Illumina HiSeq 2000)12 (6:6)Asselstine et al. [16]
2RNA-SeqGSE75379GPL15749 (Illumina HiSeq 2000)18 (6:12)Moyes et al. [33]
3MicroarrayGSE93082GPL2112 ((Bovine) Affymetrix Bovine Genome Array)12 (6:6)Zoldan et al. [32]
4MicroarrayGSE15020GPL2112 ((Bovine) Affymetrix Bovine Genome Array)10 (5:5)Mitterhuemer et al. [31]
5MicroarrayGSE15022GPL2112 ((Bovine) Affymetrix Bovine Genome Array)10 (5:5)Mitterhuemer et al. [31]

a GEO, Gene Expression Omnibus; b M, number of mastitis samples, and H, number of healthy samples.

2.2. Differential Gene Expression Analysis

Microarray data were pre-processed and normalized using the Lumi package [34] and the GCRMA algorithm (GeneChip Robust Multi-array Averaging) method, implemented in the Affy package in R software, to remove the variance and to prepare the data for further analysis [35]. Gene expression analysis was performed in R/Bioconductor software to screen the significant differential expression genes (DEGs) according to the comparison of the test and control data using the packages Limma [36], GEOquary [37], Biobase [38], and Umap [39]. Concerning RNA-Seq data, the quality of the raw data was assessed using FastQC software (v0.11.9) [40]. Then, based on the results of the raw data quality control, the sequences were edited to remove the adapters, PCR primers, and low-quality reads using Trimmomatic software (v0.38.0) [41]. Alignment sequences and mapping of reads were conducted on the Bos taurus reference genome (http://ftp.ensembl.org/pub/release-103/fasta/bos_taurus/dna/ (accessed on 20 September 2021)) using HISAT2 software (v2.1.0) [42]. For transcript quantification, featureCounts software (v2.0.1) was employed to measure the total raw counts of mapped reads [43]. DESeq2 software (v2.11.40.6) was applied for the measurement of final differences in gene expression [44]. In addition to DEGs, the identification of miRNAs was also performed in the RNA-Seq datasets, simultaneously. Finally, the threshold for statistical significance of the differential expression of each gene was obtained with the criteria of a |log fold-change (FC)| ≥ 2.0 and a false discovery rate (FDR) ≤ 0.05 in accession numbers related to microarray and RNA-Seq datasets. The gene lists from the differential expression related to microarray and RNA-Seq analysis were considered Gene Sets 1 and 2, respectively (Supplementary Materials 1 and 2).

2.3. Literature Mining to Discover Candidate Genes for Mastitis

Extensive literature surveys were performed to search for candidate genes using keywords related to bovine mastitis in the PubMed and Google Scholar databases without time limitation. GWAS studies were selected for further detailed review. Then, iHOP (iHOP literature server, http://www.ihop.net.org/ (accessed on 9 October 2021)) was used, which is a web-based tool that allows the exploration of a network of gene and protein interactions by directly navigating the pool of published scientific literature [45]. Finally, the candidate gene list extracted through literature mining was mentioned as Gene Set 3 (Supplementary Material 3).

2.4. Determination of Main Gene List

To identify the relevant candidate genes related to mastitis, 3 datasets (microarray, RNA-Seq, and GWAS) from the differential expression and GWAS analyses were integrated. Subsequently, genes that were common to the three gene sets were selected as the main gene list for further analysis. The number of shared DE genes between the 3 datasets was analyzed using the R package VennDiagram v1.6.18 [46].

2.5. Functional Enrichment and KEGG Pathway Analysis

Gene ontology (GO) and enrichment analysis were performed using the online programs DAVID [47] (Database for Annotation, Visualization, and Integrated Discovery), PANTHER [48] (Protein ANalysis THrough Evolutionary Relationships), GeneCards (www.genecards.org/ (accessed on 9 October 2021)), g:Profiler [49] (https://biit.cs.ut.ee/gprofiler/gost (accessed on 9 October 2021)), and the STRING database [50] (https://string-db.org (accessed on 9 October 2021)), which are comprehensive web tools that help to explore the biological process (BP), molecular function (MF), and cellular component (CC) of the mined gene set. The pathway enrichment of the identified genes was provided in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology terms with FDR < 0.05 were considered significantly enriched for the identified genes.

2.6. Identification of miRNAs and Target Gene Prediction

The functional annotation of the expressed miRNAs consisted of the functional annotation of their potential target genes. The potentially targeted genes were predicted using miRBase [51] (https://www.mirbase.org/ (accessed on 9 October 2021)) and Targetscan [52]. The predicted target genes were selected and submitted to DAVID, KEGG, Reactome pathways, and the PANTHER database for the enrichment target genes of each miRNA.

2.7. Reconstruction of Omics Multi-Layers Networks

The miRNA–gene bipartite network was reconstructed based on the master gene list and the molecular interactions documented in related papers and in online interaction databases. Protein–protein interaction (PPI) data were abstracted from the Biomolecular Interaction Network Database (BIND), the Database of Interacting Proteins (DIP), the Biological General Repository for Interaction Datasets (BioGRID), and the Mammalian Protein–Protein Interactions Database (MIPS). Finally, PPI network analysis was performed using the STRING database to explore interactions between genes, specifically in Bos taurus species. Each miRNA and target gene was entered into the database and resulting interactions were imported into the networks using Cytoscape software v3.8.2. (National Institute of General Medical Sciences, Bethesda Softworks, Rockville, MD, USA) [53]. Genes and miRNAs in generated networks are represented as nodes and the interactions between these nodes as edges. Furthermore, the metabolic-signaling pathway enrichment of the PPI network was reconstructed using ClueGO v.2.5.5 [54].

2.8. Modeling of Three-Dimensional (3D) Structure of Hub Proteins

The SWISS-MODEL template-based approach [55] (https://www.swissmodel.expasy.org/interactive (accessed on 15 October 2021)) was used to predict the 3D structures of hub proteins using individual FASTA sequences and reference PDB files. The resulting PDB files are enclosed in Supplementary Material 4.

3. Results

3.1. Transcriptome Analysis for Identifying Differentially Expressed Genes (DEGs)

To obtain better insights into the molecular mechanism and the genetic basis of mastitis, we investigated the pattern of transcriptome profiles of mastitis samples versus healthy samples in dairy cattle. The experimental data used for the study were obtained from the GEO database, consisting of microarray and RNA-Seq datasets, as presented in Table 1. The analysis of differentially expressed genes between mastitis and healthy cows was performed based on a fold change > ±2 and a false discovery rate (FDR) < 0.05. The results of the statistical analysis of the microarray datasets showed a total of 564 significant genes from three datasets as follows. The first dataset (GSE93082): 378 DE genes, the second dataset (GSE15020): 177 DE genes, and the third dataset (GSE15022): nine DE genes. Concerning the analysis of RNA-Seq, a total of 774 genes were differentially expressed in the mastitic versus healthy group comparison, of which 442 and 332 genes were detected from accession numbers GSE131607 and GSE75379, respectively. Gene counts and a detailed summary of the alignments for all GEO accession numbers are provided as Gene Sets 1 and 2 for microarray and RNA-Seq analyses, respectively, in Supplementary Materials 1 and 2.

3.2. Identification of miRNAs

Based on analysis of the RNA-Seq dataset with access number GSE75379, we identified eight miRNAs out of the 332 DE genes with functions related to mastitis: bta-mir-339a, bta-mir-24-2, bta-mir-222, bta-mir-27a, bta-mir-146a, bta-mir-23a, bta-mir-142, and bta-mir-223. Considering the threshold of a fold change > ±2 and FDR < 0.05, all differentially expressed miRNAs were overexpressed in the mastitic cows compared to the healthy cows (Table 2).
Table 2

Information about differentially expressed miRNAs between the mastitis and healthy samples in dairy cattle based on GSE75379.

miRNA NamemiRNA RegionFold Changep-ValueFDR
BTAmiRNA StartmiRNA End
bta-mir-339a2541736134417362112.04720.00320.0490
bta-mir-24-2711839032118391032.53020.00010.0056
bta-mir-222X98125920981260303.21944.88 × 10−60.0002
bta-mir-27a711838877118389493.66471.13 × 10−81.16 × 10−6
bta-mir-146a772071548720716463.96772.26 × 10−82.20 × 10−6
bta-mir-23a711838702118387763.98268.97 × 10−99.60 × 10−7
bta-mir-14219930143293015184.26752.69 × 10−136.83 × 10−11
bta-mir-223X94562822945629294.49833.05 × 10−115.58 × 10−9

3.3. Literature Mining and Identification of Main Gene List

Literature mining was carried out using the iHOP web tool to increase the study’s accuracy and to obtain previous evidence for associations between the identified genes and mastitis in dairy cattle. Based on literature mining, 3217 candidate genes were considered as Gene Set 3 (Supplementary Material 3). The Venn diagram shows the number of genes that are unique or common among the three gene sets for mastitis (Figure 2). Notably, there is a remarkably small overlap between the three datasets. Overall, 33 genes were common in the Gene Sets 1, 2, and 3 relating to microarray, RNA-Seq, and GWAS datasets, respectively, which were named as the main genes involved in mastitis and were considered for subsequent analysis (Table 3). The results of differential expression analysis of 33 common genes based on the RNA-Seq datasets are presented in Table 3. When comparing healthy cows to mastitis cows, of 33 DE genes, nine were underexpressed in the mastitis cows and 24 genes were overexpressed in the mastitis cows, based on their fold-change values (fold change > ±2).
Figure 2

Venn diagram of significant genes among the three types of dataset, including microarray, RNA-Seq, and GWAS data related to mastitis in dairy cattle.

Table 3

Summary list of 33 common genes (main genes) in the integrated studies of gene expression and GWAS associated with mastitis in dairy cattle *.

Gene SymbolGene NameGene RegionFold Changep-ValueFDR
ChrGene StartGene End
CSN3 casein kappa68564585485658926−4.17678.29 × 10−101.55 × 10−6
CSN1S2 casein alpha-S268552990585548556−3.96963.53 × 10−81.85 × 10−5
CSN2 casein beta68544916485457744−3.79162.92 × 10−81.74 × 10−5
RHPN2 rhophilin Rho GTPase binding protein 2184340407443474596−3.45561.38 × 10−60.0002
CSN1S1 casein alpha s168541111885429268−3.45302.94 × 10−77.41 × 10−5
LALBA lactalbumin alpha53118343231213145−3.10991.23 × 10−50.0009
ACSS2 acyl-CoA synthetase short chain family member 2136418674364233568−2.67294.02 × 10−101.05 × 10−6
RHOU ras homolog family member U28697339706882−2.49600.00010.0047
KRT7 keratin 752767485427689030−2.38040.00030.0110
SGK1 serum/glucocorticoid regulated kinase 1972305979724185352.01071.85 × 10−145.70 × 10−12
TRIB1 tribbles pseudokinase 11414779050147872062.06504.36 × 10−50.0024
LYST lysosomal trafficking regulator28837917385231142.10660.00090.0212
VAV1 vav guanine nucleotide exchange factor 1717664498177281632.11383.31 × 10−50.0013
GRO1 chemokine (C-X-C motif) ligand 1 (melanoma growth stimulating activity, alpha)689072611890751332.20620.00030.0105
F5 coagulation factor V1637159073372383062.27293.94 × 10−106.01 × 10−8
SERPINE1 serpin family E member 12535596139356171932.34980.00020.0065
BASP1 brain abundant membrane attached signal protein 12055908762559641452.41743.40 × 10−83.14 × 10−6
CD40 CD40 molecule1374842191748531162.42431.20 × 10−50.0005
TNFRSF6B TNF receptor superfamily member 6b1354054302540558102.44160.00010.0035
SLC16A3 solute carrier family 16 member 31950634317506422042.48218.56 × 10−60.0004
CXCR1 chemokine (C-X-C motif) receptor 121062151311062191582.53320.00020.0065
SOCS3 suppressor of cytokine signaling 31953840159538408582.55890.00010.0047
CCDC88B coiled-coil domain containing 88B2942630756426457502.63924.24 × 10−83.81 × 10−6
TNFAIP6 TNF alpha induced protein 6244747145447642142.73060.00010.0040
S100A9 S100 calcium binding protein A9317115128171179842.94595.39 × 10−60.0005
PSTPIP2 proline-serine-threonine phosphatase interacting protein 22445737786458320602.96853.43 × 10−101.05 × 10−6
ALOX5AP arachidonate 5-lipoxygenase activating protein1230108987301382593.22991.64 × 10−113.14 × 10−9
CCL19 C-C motif chemokine ligand 19876054024760559323.53664.73 × 10−73.31 × 10−5
MMP9 matrix metallopeptidase 91374746976747543033.59218.64 × 10−87.35 × 10−6
HCK HCK proto-onco, Src family tyrosine kinase1361563070616085034.22253.23 × 10−181.69 × 10−15
S100A12 S100 calcium binding protein A12317102722171041734.41334.22 × 10−117.47 × 10−9
S100A8 S100 calcium binding protein A8317085577170868274.71796.81 × 10−121.36 × 10−9
IL1RN interleukin 1 receptor antagonist1146815591468378314.96138.55 × 10−163.36 × 10−13

* Information on common differentially expressed genes between the mastitis and healthy samples in dairy cattle provided based on RNA-Seq datasets.

3.4. Functional Annotation and Pathway Enrichment Analysis

The functional annotation of GO terms was performed based on the biological process (BP), molecular function (MF), and cellular component (CC) to identify the biological meaning and the systematic features of the list of 33 DE genes, using the DAVID, PANTHER, and g:Profiler databases. Twenty-three biological processes were identified, such as response to stimulus, defense response, response to stress, immune response, cellular process, and biological regulation, which were the most significant ones associated with mastitis. The identified DEGs were significantly involved in the seven following functions: antioxidant activity, binding, protein binding, zymogen binding, arachidonic acid binding, Toll-like receptor 4 binding, and RAGE receptor binding for molecular function. Regarding cellular components, four GO terms, including extracellular region, Golgi lumen, extracellular space, and cellular anatomical entity, were identified (Table 4). In addition, KEGG pathway analysis revealed that the identified DE genes involved in mastitis were enriched in cytokine–cytokine-receptor interaction, the IL-17 signaling pathway, viral protein interaction with cytokines and cytokine receptors, and the chemokine signaling pathway (Figure 3).
Table 4

Top significant gene ontology (GO) terms enriched using genes associated with mastitis in dairy cattle.

CategoryTerm_IDTermCountFDRGenes
BP 1_DIRECTGO:0050896Response to stimulus203.26 × 10−9CSN2, RHPN2, SGK1, CSN1S2, LALBA, S100A9, CSN1S1, SOCS3, S100A8, ALOX5AP, LYST, F5, RHOU, IL1RN, MMP9, CD40, CSN3, GRO1, S100A12, CXCR1
BP_DIRECTGO:0032570Response to progesterone55.21 × 10−8CSN2, CSN1S2, LALBA, CSN1S1, CSN3
BP_DIRECTGO:0032355Response to estradiol53.34 × 10−7CSN2, CSN1S2, LALBA, CSN1S1, CSN3
BP_DIRECTGO:0006952Defense response93.93 × 10−6CSN1S2, LALBA, S100A9, S100A8, LYST, IL1RN, CD40, GRO1, S100A12
BP_DIRECTGO:0006950Response to stress127.04 × 10−6CSN2, CSN1S2, LALBA, S100A9, S100A8, LYST, F5, IL1RN, MMP9, CD40, GRO1, S100A12
BP_DIRECTGO:0006955Immune response82.06 × 10−5S100A9, S100A8, LYST, IL1RN, CD40, GRO1, S100A12, CXCR1
BP_DIRECTGO:0051716Cellular response to stimulus142.06 × 10−5CSN2, RHPN2, SGK1, S100A9, CSN1S1, SOCS3, S100A8, ALOX5AP, RHOU, IL1RN, MMP9, CD40, GRO1, CXCR1
BP_DIRECTGO:0030593Neutrophil chemotaxis46.46 × 10−5S100A9, S100A8, GRO1, CXCR1
BP_DIRECTGO:0098542Defense response to other organisms76.46 × 10−5CSN1S2, LALBA, S100A9, S100A8, LYST, CD40, S100A12
BP_DIRECTGO:0006954Inflammatory response66.92 × 10−5S100A9, S100A8, IL1RN, CD40, GRO1, S100A12
BP_DIRECTGO:0033993Response to lipids69.95 × 10−5CSN2, CSN1S2, LALBA, CSN1S1, CSN3, GRO1
BP_DIRECTGO:0065007Biological regulation170.00029CSN2, RHPN2, SGK1, S100A9, CSN1S1, SOCS3, S100A8, ALOX5AP, SERPINE1, F5, RHOU, IL1RN, MMP9, CD40, CSN3, GRO1, CXCR1
BP_DIRECTGO:0052548Regulation of endopeptidase activity50.00063CSN2, S100A9, S100A8, SERPINE1, MMP9
BP_DIRECTGO:0045087Innate immune response50.0028S100A9, S100A8, LYST, CD40, S100A12
BP_DIRECTGO:0023051Regulation of signaling80.0041CSN2, S100A9, CSN1S1, SOCS3, S100A8, IL1RN, MMP9, CD40
BP_DIRECTGO:0042981Regulation of apoptotic process60.0041S100A9, CSN1S1, SOCS3, S100A8, MMP9, CD40
BP_DIRECTGO:0050727Regulation of inflammatory response40.0046CSN2, S100A9, SOCS3, S100A8
BP_DIRECTGO:0070488Neutrophil aggregation20.0046S100A9, S100A8
BP_DIRECTGO:0002523Leukocyte migration involved in inflammatory response20.01S100A9, S100A8
BP_DIRECTGO:0009987Cellular process180.0104CSN2, RHPN2, SGK1, LALBA, S100A9, CSN1S1, SOCS3, S100A8, ALOX5AP, SERPINE1, LYST, RHOU, IL1RN, MMP9, CD40, GRO1, S100A12, CXCR1
BP_DIRECTGO:0032268Regulation of cellular protein metabolic process70.0221CSN2, S100A9, SOCS3, S100A8, SERPINE1, MMP9, CD40
BP_DIRECTGO:0050793Regulation of developmental process60.0276CSN2, CSN1S1, SOCS3, RHOU, MMP9, CD40
BP_DIRECTGO:0071345Cellular response to cytokine stimulus40.0302SOCS3, CD40, GRO1, CXCR1
MF 2_DIRECTGO:0016209Antioxidant activity52.82 × 10−5CSN2, S100A9, CSN1S1, S100A8, ALOX5AP
MF_DIRECTGO:0005488Binding190.00034CSN2, SGK1, CSN1S2, LALBA, S100A9, CSN1S1, SOCS3, S100A8, ALOX5AP, SERPINE1, F5, RHOU, IL1RN, MMP9, CD40, CSN3, GRO1, S100A12, CXCR1
MF_DIRECTGO:0005515Protein binding140.00034CSN2, CSN1S2, LALBA, S100A9, SOCS3, S100A8, SERPINE1, RHOU, IL1RN, MMP9, CD40, CSN3, GRO1, CXCR1
MF_DIRECTGO:0035375Zymogen binding30.00034CSN1S2, SERPINE1, CSN3
MF_DIRECTGO:0050544Arachidonic acid binding30.00034S100A9, S100A8, ALOX5AP
MF_DIRECTGO:0035662Toll-like receptor 4 binding20.007S100A9, S100A8
MF_DIRECTGO:0050786RAGE receptor binding20.033S100A9, S100A8
CC 3_DIRECTGO:0005576Extracellular region131.74 × 10−7CSN2, CSN1S2, LALBA, S100A9, CSN1S1, S100A8, SERPINE1, F5, IL1RN, MMP9, CSN3, GRO1, S100A12
CC_DIRECTGO:0005796Golgi lumen41.38 × 10−6CSN2, CSN1S2, CSN1S1, CSN3
CC_DIRECTGO:0005615Extracellular space102.57 × 10−6CSN2, CSN1S2, LALBA, CSN1S1, S100A8, SERPINE1, IL1RN, MMP9, CSN3, GRO1
CC_DIRECTGO:0110165Cellular anatomical entity220.00069CSN2, RHPN2, SGK1, CSN1S2, LALBA, S100A9, CSN1S1, SOCS3, S100A8, ALOX5AP, SERPINE1, KRT7, LYST, F5, RHOU, IL1RN, MMP9, CD40, CSN3, GRO1, S100A12, CXCR1

1 BP, biological process; 2 MF, molecular function; 3 CC, cellular components.

Figure 3

The KEGG pathway-based network analysis of significant genes related to mastitis in dairy cattle.

3.5. PPI Network and Identification of Hub Genes

Protein–protein interaction (PPI) networks for up- and downregulated genes were reconstructed with the STRING database, which indicated the physical connection between two or more protein molecules related to biochemical functions (Figure 4). Twenty-one nodes with 45 connections (edges) were represented in the PPI network, as presented in Figure 4. Moreover, we considered hub genes based on their higher-degree connectivity values in the PPI network. A total of seven hub genes, including MMP9, HCK, GRO1, SOCS3, CXCR1, IL1RN, and S100A9, were identified, all of which were overexpressed genes. All the hub proteins identified are protein-coding genes. The functional enrichment analysis demonstrated that hub genes were involved in the majority of molecular functions and biological processes (Table 4).
Figure 4

Protein–protein interaction (PPI) network analysis of common differentially expressed genes associated with mastitis in dairy cattle.

3.6. Prediction of miRNA-Target Genes and Gene Regulatory Network Reconstruction

We also aimed to determine whether the expression of miRNAs was associated with that of the 33 DE genes in the mastitic and healthy cows. Among the DE miRNAs, bta-mir-222, bta-mir-27a, bta-mir-23a, and bta-mir-142 suppressed 11 of the identified DE genes as targets of the selected miRNAs. A target gene search using TargetScan demonstrated that bta-mir-142 has seven target genes, namely, RHPN2, LYST, SERPINE1, CD40, SOCS3, TRIB1, and SLC16A3, followed by bta-mir-23a having three target genes, SGK1, TNFAIP6, and TRIB1. In addition, bta-mir-27a suppressed the PSTPIP2 and VAV1 genes, and bta-mir-222 suppressed the SOCS3 gene. The TRIB1 and SOCS3 genes displayed the highest suppression by miRNAs. The identified target genes, associated with their miRNAs, are visualized in Figure 5. For constructing the gene regulatory network, we compiled a list of DE genes and miRNAs (as nodes) involved in mastitis based on literature mining and PPI resources. Briefly, miRNA–gene bipartite networks are commonly represented in an undirected graph format, with nodes representing miRNAs or genes and edges corresponding to interactions (genes–genes and miRNAs–targeted genes). In this network, we identified 30 nodes (26 genes and four miRNAs), with 57 edges interacting with it (Figure 5).
Figure 5

Interactive bipartite network (gene–miRNA) which demonstrates the regulatory effect on mastitis in dairy cattle. In this network, the quadrilateral points represent genes, and the octagonal points represent miRNAs. Regarding miRNAs and target genes, the edges indicate the suppressive role of miRNAs. The edges also represent the gene–gene interactions. The green quadrilateral nodes represent the hub genes. The quadrilateral nodes that have purple around them are the genes showing the highest suppression by miRNAs.

3.7. Three-Dimensional Modeling of Hub Proteins

In the present study, we also modeled the 3-dimensional protein structure of the seven hub genes identified in the PPI network that had the most interaction with other genes involved in the network (Figure 6). 3D modeling revealed that the predicted structures of these seven hub proteins were significantly different from each other. Four hub proteins (MMP9, HCK, CXCR1, and S100A9) had the greatest structural complexity compared to the three other proteins.
Figure 6

Modeling of three-dimensional protein structure for genes with the most interaction (hub genes) in the interactive bipartite network (gene–miRNA) according to the SWISS-MODEL repository.

4. Discussion

Mastitis is a complex trait and is prominent among health-related traits in the cattle industry, exerting a severe impact on profitability and animal welfare. The identification of functional candidate genes and molecular mechanisms involved in mastitis is required, given the persistence of the disease on dairy farms. Moreover, understanding the interplay between molecular and cellular components, with each component interacting at different levels that are entangled in several biological pathways, is important. Hence, the present study provides a general framework to investigate and integrate different sources of transcriptome data and previous results from GWAS studies to identify the genetic basis and key pathways associated with bovine mastitis. Numerous studies have demonstrated that the integration of multiple layers of omics data is a powerful strategy to increase the efficiency and accuracy of candidate gene and biomarker discovery, detecting molecular and biochemical interactions, and the relationships between biological variables in different species [4,7,8,56,57,58]. In this study, the integrative analysis of multiple datasets resulted in prioritizing 33 DE genes as the main gene list, of which nine genes were downregulated and 24 genes were upregulated in the mastitic cows compared with the healthy cows based on their FC values. A list of main detected genes related to mastitis is provided in Table 3, with their main functions (biological process, molecular function, and cellular component) listed in Table 4. Among them, CSN2, CSN3, CSN1S1, CSN1S2, RHPN2, LALBA, ACSS2, RHOU, and KRT7 genes were underexpressed in the mastitic cows, mostly located on chromosomes 6. The most important overexpressed genes in cows with mastitis were GRO1, CXCR1, SOCS3, S100A9, MMP9, HCK, and IL1RN, which were hub genes (highly connected genes) involved in mastitis in this study. The casein cluster is composed of four genes; β-casein (CSN2), κ-casein (CSN3), αs1-casein (CSN1S1), and αs2-casein (CSN1S2), which encode approximately 80% of the protein content of bovine milk [59], and the whey protein gene (LALBA) was downregulated in inflamed mammary glands. LALBA encodes α-lactalbumin and is essential for lactose synthesis, which plays an important role in milk production as an osmotic regulator of milk secretion [60]. A possible explanation for the lower expression levels of these genes could be that the protein content in mastitic milk would decrease due to an antagonistic genetic relationship between mastitis and protein yield [24]. In previous studies, it was also demonstrated that all five genes were observed in enhanced abundance in the mammary glands of lactating dairy cows [61], dairy sheep [62], and lactating dairy goats [63]. Interestingly, as presented in Table 4, these genes were found in a majority of enriched pathways, suggesting possible key regulatory roles for them. Other noteworthy genes (RHPN2, ACSS2, RHOU, and KRT7) showing lower expression in cows with mastitis have critical roles in biological pathways, cellular process, fatty acid synthesis, and metabolism. For instance, ACSS2 (acyl-CoA synthetase short-chain family member 2) is well known to affect mastitis resistance in dairy cows and plays a role in the activation of acetate for de novo fatty acid synthesis [64]. Similarly to our results, Chen et al. [65] reported lower expression for ACSS2 and RHPN2 genes in response to the intramammary infection caused by two different pathogens (Escherichia coli and Streptococcus uberis) in dairy cows. As presented in Table 3, in the mastitis cows, 24 genes were more highly expressed, of which seven genes were considered as hub genes involved in significantly enriched biological processes and KEGG pathways. Subsequently, the PPI networks and gene regulatory networks were constructed based on these hub genes, which showed significant connectivity and which could shed light on the post-transcriptional regulation of gene expression by the identified miRNAs. Furthermore, the functional enrichment analysis resulted in four significant KEGG pathways associated with mastitis, which comprised six hub genes, i.e., GRO1, CXCR1, S100A9, MMP9, HCK, and IL1RN, as presented in Figure 3. Among these genes, GRO1 and CXCR1 were observed in four and three pathways, respectively. GRO1 (melanoma growth stimulating activity, alpha) also known as CXCL1, is a protein-encoding gene and plays an important role in inflammation and immune defense due to the modulation of leukocyte infiltration [66], which has been previously proposed as a biomarker and therapeutic target in mastitis [67]. This gene is also involved in the metabolic pathways of cytokine–cytokine-receptor interaction, the IL-17 signaling pathway, the chemokine signaling pathway, and viral protein interaction with cytokines and cytokine receptors. In the case of the cytokine–cytokine-receptor interaction pathway, other genes, such as CD40, IL1RN, CXCR1, TNFRSF6B, and CCL19, were found to be involved in mastitis defense or immune response, as all these genes were upregulated in the mastitic cows based on their FC values. The significant role of cytokines in the immune response to infectious agents is well known because they are soluble extracellular proteins or glycoproteins that are crucial intercellular regulators and mobilizers of cells engaged in innate as well as adaptive inflammatory host defenses, cell growth, differentiation, cell death, and cell development and repair processes. It was previously reported that cytokines can participate in activation of the host defense mechanisms during mastitis [68,69]. The CXCR1 and CCL19 genes were also enriched in two other pathways of the chemokine signaling pathway and in viral protein interaction with cytokines and cytokine receptors. CXCR1 (chemokine (C-X-C motif) receptor 1), identified as a hub gene, is a protein-encoding gene for major pro-inflammatory cytokine receptors [70] that is introduced as a potential genetic marker for resistance to mastitis in dairy cows [71,72]. In our study, the gene CXCR1 was involved in seven GO terms—response to stimulus, immune response, cellular response to stimulus, neutrophil chemotaxis, biological regulation, cellular process, and cellular response to cytokine stimulus for biological processes (Table 4). In addition, earlier studies have reported that a non-synonymous mutation, c.365C > T, located in exon II of the CXCR1 gene is associated with susceptibility to mastitis in different breeds of cattle [73,74]. The viral protein interaction with cytokines and the cytokine receptor pathway is an immune system pathway which has a key role in the inflammatory responses to infection and may activate or inhibit cytokine signaling and possibly affect different aspects of immunity. Furthermore, the S100A8, S100A9, and MMP9 genes have been recognized as components of the IL-17 signaling pathway. This pathway plays crucial roles in both acute and chronic inflammatory responses. In fact, the interleukin 17 (IL-17) family, as a subset of cytokines, signals via their correspondent receptors and activates downstream pathways that include NF-kappaB, MAPKs, and C/EBPs to induce the expression of antimicrobial peptides, cytokines, and chemokines. S100A9 and MMP9, which were identified as hub genes and which were upregulated in the mastitic cows, play key roles in the regulation of immune response and inflammatory pathways [65,66]. SOCS3 (suppressor of cytokine signalling 3) was another hub gene identified with higher expression levels, which encodes an intracellular inhibitor of cytokine signaling and has a crucial role in the initial steps of the recognition of a pathogen-associated molecular pattern (PAMP) in the innate immune cells [75]. Furthermore, in the regulatory network, SOCS3 is suppressed by bta-mir-142 and bta-mir-222. These two miRNAs showed upregulation and their target gene, SOCS3, showed the lower expression than them in mastitic cows. We characterized bta-mir-222, bta-mir-27a, bta-mir-23a, and bta-mir-142 as the major miRNAs which play a prominent role in regulating this network of genes and these were upregulated in the mastitic cows. The gene regulatory network showed that the greatest target genes (seven genes) were suppressed by bta-mir-142. There is evidence that miRNAs play a critical role in the regulation of inflammation and immune function during infection with mastitis in dairy cattle [76,77,78,79]. Modeling of the 3D protein structure of hub proteins can be an invaluable aid in order to better understand the details of a particular protein because studies of protein structure and function are becoming a promising approach in the field of bioinformatics. Functional characterization of a protein is often facilitated by its 3D structure. Hence, it is necessary that a 3D structure is determined in examining the proteins’ function at the molecular level. Sequence identity, as a measure of the expected accuracy of a model represented, >30% indicates a relatively good predictor of the model [80]. When sequence identity drops below 30%, the main problem becomes the identification of related templates and their alignment with the sequence to be modeled. Based on our results, among the seven hub proteins, HCK, CXCR1, S100A9, and MMP9 showed the highest structural complexity, with sequence identities of 94.6%, 75.4%, 69.9%, and 47.5%, respectively, whereas the proteins of SOCS3, IL1RN, and GRO1 had the lowest structural complexity with sequence identities of 92.6%, 80.1%, and 72.5%, respectively (Figure 6). Consequently, these findings demonstrate the relevance of integrating results from transcriptomic and functional analyses for a better understanding of the function of important genes and molecular mechanisms responsible for mastitis development.

5. Conclusions

The integration of multi-omics data resulted in the identification of 33 common and relevant genes associated with bovine mastitis. Among these, seven genes (CXCR1, HCK, IL1RN, MMP9, S100A9, GRO1, and SOCS3) were identified as the hub genes and these can be explored as potential candidate genes for mastitis susceptibility and resistance. Functional annotation and enrichment analysis identified 23, 7, and 4 GO terms related to mastitis in the biological process, molecular function, and cellular component categories, respectively. We identified eight differentially expressed miRNAs, of which four suppressed 11 of the identified genes as their targets. Furthermore, the reconstruction of the regulatory network of genes associated with their miRNAs sheds light on the post-transcriptional regulation of this network. Therefore, this study provides a general framework to investigate and incorporate multiple layers of omics data from high-throughput technologies or available pathway annotation databases, which has led to the elucidation of molecular networks, the cellular and molecular-level features, and the genetic and biological basis of mastitis in dairy cattle.
  75 in total

Review 1.  Proinflammatory cytokines.

Authors:  C A Dinarello
Journal:  Chest       Date:  2000-08       Impact factor: 9.410

2.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

3.  lumi: a pipeline for processing Illumina microarray.

Authors:  Pan Du; Warren A Kibbe; Simon M Lin
Journal:  Bioinformatics       Date:  2008-05-08       Impact factor: 6.937

4.  Genetic association between susceptibility to clinical mastitis and protein yield in norwegian dairy cattle.

Authors:  B Heringstad; Y M Chang; D Gianola; G Klemetsdal
Journal:  J Dairy Sci       Date:  2005-04       Impact factor: 4.034

Review 5.  Innate immune response of bovine mammary gland to pathogenic bacteria responsible for mastitis.

Authors:  Javier Oviedo-Boyso; Juan J Valdez-Alarcón; Marcos Cajero-Juárez; Alejandra Ochoa-Zarzosa; Joel E López-Meza; Alejandro Bravo-Patiño; Víctor M Baizabal-Aguirre
Journal:  J Infect       Date:  2006-08-01       Impact factor: 6.072

6.  Metabolomic and transcriptomic analysis of the rice response to the bacterial blight pathogen Xanthomonas oryzae pv. oryzae.

Authors:  Theodore R Sana; Steve Fischer; Gert Wohlgemuth; Anjali Katrekar; Ki-Hong Jung; Pam C Ronald; Oliver Fiehn
Journal:  Metabolomics       Date:  2010-05-27       Impact factor: 4.290

7.  miRBase: from microRNA sequences to function.

Authors:  Ana Kozomara; Maria Birgaoanu; Sam Griffiths-Jones
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

8.  ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks.

Authors:  Gabriela Bindea; Bernhard Mlecnik; Hubert Hackl; Pornpimol Charoentong; Marie Tosolini; Amos Kirilovsky; Wolf-Herman Fridman; Franck Pagès; Zlatko Trajanoski; Jérôme Galon
Journal:  Bioinformatics       Date:  2009-02-23       Impact factor: 6.937

9.  PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees.

Authors:  Huaiyu Mi; Anushya Muruganujan; Paul D Thomas
Journal:  Nucleic Acids Res       Date:  2012-11-27       Impact factor: 16.971

10.  Development and characterization of a high density SNP genotyping assay for cattle.

Authors:  Lakshmi K Matukumalli; Cynthia T Lawley; Robert D Schnabel; Jeremy F Taylor; Mark F Allan; Michael P Heaton; Jeff O'Connell; Stephen S Moore; Timothy P L Smith; Tad S Sonstegard; Curtis P Van Tassell
Journal:  PLoS One       Date:  2009-04-24       Impact factor: 3.240

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.