Combining multi-OMICs information to identify key-regulator genes for pleiotropic effect on fertility and production traits in beef cattle.

Pablo Augusto de Souza Fonseca1,2, Samir Id-Lahoucine1, Antonio Reverter3, Juan F Medrano4, Marina S Fortes5, Joaquim Casellas6, Filippo Miglior1,7, Luiz Brito1, Maria Raquel S Carvalho2, Flávio S Schenkel1, Loan T Nguyen5, Laercio R Porto-Neto3, Milton G Thomas8, Angela Cánovas1.   

Abstract

The identification of biological processes related to the regulation of complex traits is a difficult task. Commonly, complex traits are regulated through a multitude of genes contributing each to a small part of the total genetic variance. Additionally, some loci can simultaneously regulate several complex traits, a phenomenon defined as pleiotropy. The lack of understanding on the biological processes responsible for the regulation of these traits results in the decrease of selection efficiency and the selection of undesirable hitchhiking effects. The identification of pleiotropic key-regulator genes can assist in developing important tools for investigating biological processes underlying complex traits. A multi-breed and multi-OMICs approach was applied to study the pleiotropic effects of key-regulator genes using three independent beef cattle populations evaluated for fertility traits. A pleiotropic map for 32 traits related to growth, feed efficiency, carcass and meat quality, and reproduction was used to identify genes shared among the different populations and breeds in pleiotropic regions. Furthermore, data-mining analyses were performed using the Cattle QTL database (CattleQTLdb) to identify the QTL category annotated in the regions around the genes shared among breeds. This approach allowed the identification of a main gene network (composed of 38 genes) shared among breeds. This gene network was significantly associated with thyroid activity, among other biological processes, and displayed a high regulatory potential. In addition, it was possible to identify genes with pleiotropic effects related to crucial biological processes that regulate economically relevant traits associated with fertility, production and health, such as MYC, PPARG, GSK3B, TG and IYD genes. 
These genes will be further investigated to better understand the biological processes involved in the expression of complex traits and assist in the identification of functional variants associated with undesirable phenotypes, such as decreased fertility, poor feed efficiency and negative energetic balance.

Year:  2018        PMID: 30335783      PMCID: PMC6193631          DOI: 10.1371/journal.pone.0205295

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The regulation of complex traits involves multiple loci, each contributing a small proportion of the phenotypic expression, and environmental effects [1, 2]. In addition, loci that regulate complex traits are known to be involved in the regulation of several phenotypes. This phenomenon, known as pleiotropy, is the primary genetic effect that leads to genetic correlation. Currently, the interaction and regulation patterns of these loci are poorly understood, especially in livestock species. However, the use of pleiotropic markers for genomic selection may improve the efficiency of selection. On the other hand, some pleiotropic markers can lead to the indirect selection of undesirable traits. The exclusion of these markers from genetic selection programs may result in the improvement of a specific trait, or of multiple traits simultaneously, without affecting other traits. Indirect selection for undesirable traits has been well described in livestock breeding programs. For example, variants associated with high productive performance can also reduce fertility or affect mortality rates in both beef and dairy production systems [3-6]. The identification of loci with pleiotropic effects and the dissection of the biological processes in which these loci are involved can provide useful information to improve the accuracy of genomic breeding values by providing insights into the best statistical model to be used. More accurate breeding values translate into better breeding decisions, which will lead to a reduced frequency of undesirable genotypes and phenotypes in the population under selection. Additionally, the use of causal and functional variants in selection models may result in higher prediction accuracy and improved prediction persistence over generations [7-9]. The investigation of genomic loci with pleiotropic effects involved in the regulation of economically important traits is a key step in applying the breeding strategies previously mentioned. 
The detection of such genomic regions enables the identification of potential causal variants mapped in key regulator genes. Multivariate trait analysis and subsequent selection indexes may then be enhanced by the identification of these variants, leading to higher genetic improvement [10-12]. The use of functional genomics in a systems biology approach is a powerful tool for the integration and analysis of biological networks using high-throughput OMICs technologies [13, 14]. The integration of results from different breeds and OMICs approaches helps unravel the relationship between candidate genes and several phenotypes to further identify genetic variants associated with changes in the expression of those genes. Fertility and production traits are good examples of phenotypes with distinct genetic architecture, different potential to respond to selection, and with antagonistic genetic relationship. The majority of fertility traits have unfavorable genetic correlations with production traits, which may be explained by the unidirectional selection, “hitchhiking” effect or pleiotropic effects [15-17]. The biological process responsible for sexual maturity, known as puberty, involves complex pathways that are regulated by several genes involved with the development of a wide variety of phenotypes [18]. Crucial biological structures for puberty development, e.g., hypothalamus, pituitary gland and thyroid, are involved in the regulation of synthesis and secretion of several hormones directly related to production traits [19-21]. Therefore, a deep investigation of the biological processes related to genes involved in puberty may result in the identification of key regulator genes for both fertility and production traits. Within this context, the aim of this study was to integrate multi-OMICs data into studies that investigated pubertal status and fertility traits. 
This integration was performed using a systems biology approach to identify candidate genes with pleiotropic effects on economically relevant traits in beef cattle.

Material and methods

Ethics statement

This study was carried out analysing data from previous studies, which have been approved by respective ethics in research committees [13, 20, 22]. Therefore, additional animal welfare and use committee approval was not required.

Data collection

Candidate genes identified in three independent populations (i.e., breeds) of beef cattle were used. For the first population (Brangus cattle, n = 64), genes differentially expressed (DE) between pre- and post- puberty in eight tissues (hypothalamus, pituitary gland, liver, longissimus dorsi muscle, adipose tissue, uterine horn, endometrium, and ovary) were evaluated [13]. The candidate genes for the second population (Tropical Composite cattle, n = 866; Brahman = 843) were identified by a genome-wide association analysis (GWAS) for age at puberty, postpartum anestrous interval, and occurrence of the first postpartum ovulation before weaning in the first rebreeding period [22]. For the third population (Brahman cattle, n = 24), DE genes were identified in pre- and post- puberty stages in the pituitary gland and ovary [20].

Identification of genes near pleiotropic markers

A total of 9,194 genes identified in the three breeds (2,120 in Brangus, 3,714 in Tropical Composite, and 5,269 in Brahman) were mapped against a list of genomic markers (n = 729,068), of which 21,908 showed a pleiotropic effect (P < 0.05 after false discovery rate (FDR) correction), reported by Bolormaa et al. (2014) [10], who analyzed 32 traits including growth, feed intake, carcass and meat quality, and reproduction. Using the list of candidate genes obtained by integrating and combining data from the OMICs technologies described by Cánovas et al. (2014) [13], Hawken et al. (2012) [22] and Nguyen et al. (2017) [20], genes located within 1 Mb (upstream or downstream) of a pleiotropic marker were selected. This interval was selected based on the average recombination block and the linkage disequilibrium (LD) pattern across the cattle genome [23, 24].
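The 1 Mb selection rule described above can be illustrated with a minimal sketch. It assumes that distance is measured from the gene boundaries (the text does not say whether the start coordinate or the whole gene span was used), and the marker positions below are hypothetical; only the gene coordinates are taken from Table 2.

```python
from bisect import bisect_left

WINDOW_BP = 1_000_000  # 1 Mb on each side of the gene, as stated in the text


def genes_near_markers(genes, markers, window=WINDOW_BP):
    """Return names of genes lying within `window` bp of any marker
    on the same chromosome.

    genes:   iterable of (name, chrom, start, end)
    markers: iterable of (chrom, position)
    """
    # Group marker positions by chromosome and sort them for binary search.
    by_chrom = {}
    for chrom, pos in markers:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    selected = []
    for name, chrom, start, end in genes:
        positions = by_chrom.get(chrom, [])
        # First marker at or beyond the left edge of the widened gene interval.
        i = bisect_left(positions, start - window)
        if i < len(positions) and positions[i] <= end + window:
            selected.append(name)
    return selected


# Gene coordinates from Table 2; marker positions are hypothetical.
GENES = [
    ("GSK3B", "1", 65_265_851, 65_292_071),
    ("TG", "14", 9_253_697, 9_263_933),
    ("MYC", "14", 13_769_242, 13_775_688),
]
MARKERS = [("1", 10_000_000), ("14", 14_500_000)]

print(genes_near_markers(GENES, MARKERS))
```

With these toy inputs only MYC is retained, because the hypothetical marker on chromosome 14 lies about 0.72 Mb past the end of the gene, while TG and GSK3B are more than 1 Mb from any marker.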

Genes shared among breeds and QTL mapping

The genes identified within the pleiotropic regions were compared and those shared across the three independent populations were selected to be mapped against QTL regions using resources from the Cattle QTL database (CattleQTLdb, www.animalgenome.org/cgi-bin/QTLdb/BT/index). Using an interval of 2 Mb (1 Mb downstream and 1 Mb upstream) from each one of the shared genes, all the QTLs mapped in this interval were annotated. Those genes mapped in regions where at least five QTL categories were annotated (from six possible categories: exterior appearance, health, reproduction, production, meat and carcass and milk traits) were ranked as the genes with the highest pleiotropic potential.
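As a rough illustration of the two filters in this step (genes shared across the three populations, then genes with at least five QTL categories within 1 Mb on each side), the following sketch uses plain Python sets. The gene names, coordinates and QTL records are hypothetical placeholders, and the overlap test on QTL intervals is an assumption about how "mapped in this interval" was evaluated.

```python
FLANK_BP = 1_000_000  # 1 Mb upstream and 1 Mb downstream (2 Mb interval in total)


def shared_genes(*gene_sets):
    """Genes present in every population-specific gene list."""
    return set.intersection(*map(set, gene_sets))


def qtl_categories_near(gene, qtls, flank=FLANK_BP):
    """Set of QTL categories overlapping the flanked gene interval.

    gene: (name, chrom, start, end)
    qtls: iterable of (chrom, start, end, category)
    """
    _, chrom, start, end = gene
    return {
        category
        for q_chrom, q_start, q_end, category in qtls
        if q_chrom == chrom and q_start <= end + flank and q_end >= start - flank
    }


def high_pleiotropic_potential(genes, qtls, min_categories=5):
    """Names of genes with at least `min_categories` QTL categories nearby."""
    return [g[0] for g in genes if len(qtl_categories_near(g, qtls)) >= min_categories]


# Hypothetical genes and QTL annotations, for illustration only.
genes = [("GENE_A", "1", 5_000_000, 5_010_000), ("GENE_B", "2", 8_000_000, 8_020_000)]
qtls = [
    ("1", 4_200_000, 4_300_000, "reproduction"),
    ("1", 4_900_000, 5_100_000, "production"),
    ("1", 5_500_000, 5_600_000, "milk"),
    ("1", 5_900_000, 6_200_000, "health"),
    ("1", 3_500_000, 4_050_000, "meat and carcass"),
    ("1", 9_000_000, 9_100_000, "exterior appearance"),  # outside the window
    ("2", 8_100_000, 8_200_000, "milk"),
    ("2", 7_500_000, 7_600_000, "production"),
]

print(shared_genes({"A", "B", "C"}, {"B", "C"}, {"C", "B", "D"}))
print(high_pleiotropic_potential(genes, qtls))
```

In this toy example GENE_A has five categories annotated within its window and is kept, whereas GENE_B has only two and is dropped.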

Functional analyses

First, the relationships among pleiotropic genes were estimated from literature text-mining, co-expression, gene fusion, protein homology, gene neighborhood and gene co-occurrence using the STRING database [25]. This estimation used the human database, as the available information is more complete for humans than for cattle. Additionally, using the STRING database, an enrichment analysis was performed to identify associations between the pleiotropic genes and biological processes (BP) and metabolic pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG). From the gene network created using the STRING database, a centrality analysis was performed in order to identify the pleiotropic genes with the highest number of connections in the network. The igraph package [26] was used to calculate the centrality coefficient, defined as the number of connections of each node in a network. The VarElect software [27] was used to match the genes present in this list to the enriched KEGG pathways. In addition, for all the genes present in the main network, BLAST2GO [28] was used to obtain a better description of the BP to which these pleiotropic genes are related. Finally, the NetworkAnalyst tool [29] was used to confirm the genes with the highest regulatory potential based on their interactions with other proteins. The interactions among the genes were identified using IMEx, based on comprehensive literature-curated data from InnateDB [30]. All the analyses described previously are shown in Fig 1.
Fig 1

Flowchart presenting the methodological pipeline performed to identify the potential key-regulatory genes for pleiotropic effect on fertility and production traits in beef cattle.
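The centrality coefficient used in the functional analyses is defined in the text as the number of connections of each node. The authors computed it with the igraph package [26]; the sketch below is only a dependency-free stand-in for that definition, and the edge list is hypothetical rather than taken from the STRING network.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Number of distinct neighbours of each node in an undirected network.

    Duplicate edges and self-loops are ignored, so each connection between
    two genes is counted once.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


def top_connected(edges, n=10):
    """The n most connected nodes, ties broken alphabetically."""
    degree = degree_centrality(edges)
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical edges between genes from the main network, for illustration only.
edges = [
    ("MYC", "GSK3B"),
    ("MYC", "PPARG"),
    ("MYC", "SMAD2"),
    ("GSK3B", "PPARG"),
    ("GSK3B", "MYC"),  # duplicate of the first edge, counted once
]

print(top_connected(edges, n=2))
```

Counting distinct neighbours rather than raw edge records is a design choice made here so that the same interaction reported by several evidence channels is not counted more than once; whether the original analysis did the same is not stated.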

Results and discussion

A total of 108 genes (S1 File) mapped in pleiotropic regions were shared among all three independent populations of the three beef cattle breeds considered (Fig 2). Among them, 89 genes were mapped in regions with five or six QTL categories annotated (S2 File). It is important to highlight that the approach used to define the post-puberty stage differed between Cánovas et al. (2014) [13] and Nguyen et al. (2017) [20]. Cánovas et al. (2014) [13] defined puberty as the second consecutive day with a circulating progesterone value >1 ng/mL. On the other hand, Nguyen et al. (2017) [20] defined puberty as the time when the first corpus luteum was observed. These differences, together with the different genetic backgrounds and environmental effects between these two populations, can result in different expression profiles, even when two similar groups of phenotypes are measured (pre- and post-puberty). However, the same biological processes are expected to underlie pubertal development in both populations. Therefore, it is expected that the key-regulators of these processes maintain a similar DE profile between pre- and post-puberty phenotypes in both populations. Additionally, the consistency of expression among tissues and populations can help to identify these key-regulatory genes.
The results obtained from the BP and KEGG pathway enrichment analyses indicated that these genes are involved in development and maturation of the body (Table 1). From those 89 genes, 38 were grouped in the main network identified by the STRING database (Fig 3 and S1 File). Fig 4 shows the percentage of each type of QTL (exterior appearance, health, reproduction, production, meat and carcass quality and milk traits) present in a 2 Mb interval (1 Mb downstream and 1 Mb upstream) around these 38 genes. The expression pattern across the tissues analyzed by Cánovas et al. (2014) [13] and Nguyen et al. (2017) [20] reinforces the functional relevance of these genes, given their DE across specific tissues (Table 2). The detailed mapping of the BP associated with these 38 genes suggested a strong relationship between these genes and the regulation of biological processes involved in cellular metabolism, cellular growth and development (Fig 5 and S3 File). Fig 6 shows the genomic context of the regions where these 38 genes were mapped as a function of the pleiotropic markers harboring these genes. The evaluation of this list of 38 genes suggested that the identification of key-regulatory genes was an important step to elucidate the relationship among these genes and the regulation of BP, as well as their association with economically important traits.
Fig 2

Venn diagram displaying the comparison of genes among the different independent populations analysed.

In yellow, the genes identified in Brahman and Tropical Composite breeds (Hawken et al., 2012) [22]. In blue, the genes identified in Brahman breed (Nguyen et al., 2017) [20]. In red, the genes identified in Brangus breed (Cánovas et al., 2014) [13].

Table 1

Top 10 enriched biological processes (BP) and enriched KEGG pathways identified by STRING database using the 89 genes shared among the reported results from Cánovas et al. (2014) [13], Hawken et al. (2012) [22] and Nguyen et al. (2017) [20] and mapped in regions with 5 or more categories of QTL.

Top 10 enriched GO (Biological Function)
ID | Description | Bonferroni P-value | Implicated genes
GO:0048513 | Organ development | 8.67E-05 | ABI1, ABL1, ACTB, ACTL6B, ADAMTS18, ATF2, CAPN3, CCDC85C, CHD7, CUX1, DCHS1, EMX2, FKBP1A, FOLR1, FSHR, GSK3B, HHEX, IGF1R, LIN7A, MYC, NTRK1, POU3F1, PPARG, PRKG1, RBM20, RP1, RSPO2, SPINT1, TBL1XR1, TG, ZNF148
GO:0048731 | System development | 0.0002 | ABI1, ACTB, ACTL6B, ADAMTS18, ADCYAP1, ATF2, BARHL2, BMP7, CAPN3, CCDC85C, CHD7, EMX2, FKBP1A, FOLR1, FSHR, GSK3B, HHEX, IGF1R, LGALS3, LIN7A, MYC, OPCML, PDLIM5, PPARG, PRICKLE1, PRKG1, RBM20, RP1, RSPO2, SCG2, SPINT1, TBL1XR1, TG, TRPC7, ZNF148
GO:0007275 | Multicellular organismal development | 0.0003 | ABI1, ACTB, ACTL6B, ADAMTS18, ADCYAP1, ATF2, BARHL2, BMP7, CAPN3, CCDC85C, CHD7, EMX2, FKBP1A, FOLR1, FSHR, GSK3B, HHEX, IGF1R, LGALS3, LIN7A, MYC, OPCML, PDLIM5, PPARG, PRICKLE1, PRKG1, RBM20, RP1, RSPO2, SCG2, SMAD2, SPINT1, TBL1XR1, TG, TLL2, TRPC7, ZNF148
GO:0048856 | Anatomical structure development | 0.0003 | ABI1, ACTB, ACTL6B, ADAMTS18, ADCYAP1, ATF2, BARHL2, BMP7, CAPN3, CASR, CCDC85C, CHD7, EMX2, FKBP1A, FOLR1, FSHR, GSK3B, HHEX, IGF1R, LIN7A, LMOD2, MYC, OPCML, PDLIM5, PPARG, PRICKLE1, PRKG1, RBM20, RP1, RSPO2, SCG2, SMAD2, SPINT1, TBL1XR1, TG, TRPC7, ZNF148
GO:0044707 | Single-multicellular organism process | 0.0004 | ABI1, ACTB, ACTL6B, ADAMTS18, ADCYAP1, ATF2, BARHL2, BMP7, BRPF3, CAPN3, CASR, CCDC85C, CHD7, EMX2, FKBP1A, FOLR1, FSHR, GSK3B, HHEX, IGF1R, LGALS3, LIN7A, LMOD2, MYC, NPAS3, OPCML, PDLIM5, PGAM2, PNKD, PPARG, PRICKLE1, RBM20, RP1, RSPO2, SCG2, SMAD2, SPINT1, SST, TBL1XR1, TG, TLL2, TRPC7, ZNF148
GO:0009888 | Tissue development | 0.0007 | ABI1, ATF2, CHD7, CUX1, DCHS1, FKBP1A, FOLR1, GSK3B, HHEX, IGF1R, LGALS3, MYC, NTRK1, POU3F1, PPARG, PRICKLE1, RP1, RSPO2, SPINT1, TBL1XR1, TDGF1
GO:0032502 | Developmental process | 0.001 | ABI1, ACTB, ACTL6B, ADAMTS18, ADCYAP1, ATF2, BARHL2, BMP7, CASR, CCDC85C, CHD7, EMX2, FKBP1A, FOLR1, FSHR, GSK3B, HHEX, IGF1R, LIN7A, LMOD2, MYC, NPAS3, OPCML, PDLIM5, PPARG, PRICKLE1, PRKG1, RBM20, RP1, RSPO2, SCG2, SMAD2, SPINT1, TBL1XR1, TG, TLL2, TRPC7, ZNF148
GO:0060429 | Epithelium development | 0.001 | ABI1, CHD7, CUX1, DCHS1, FOLR1, HHEX, IGF1R, LGALS3, MYC, NTRK1, POU3F1, PPARG, PRICKLE1, RSPO2, SMAD2, SPINT1
GO:0035239 | Tube morphogenesis | 0.003 | BMP7, CHD7, DCHS1, FOLR1, HHEX, MYC, PRICKLE1, RSPO2, SMAD2, SPINT1
GO:0006468 | Receptor activity | 0.003 | ABI1, ABL1, ATF2, DAPK2, GSK3B, IGF1R, MYC, NTRK1, PRKG1, SCG2, SMAD2, STK32B, STK33, TDGF1

Enriched KEGG Pathways
ID | Description | Bonferroni P-value | Implicated genes
KEGG: 5202 | Transcriptional misregulation in cancer | 0.00183 | HHEX, IGF1R, MYC, NTRK1, PPARG, SPINT1, SUPT3H
KEGG: 5200 | Pathways in cancer | 0.00943 | ABL1, DAPK2, GSK3B, IGF1R, MYC, NTRK1, PPARG, SMAD2
KEGG: 5216 | Thyroid cancer | 0.0193 | MYC, NTRK1, PPARG
KEGG: 4390 | Hippo signaling pathway | 0.0286 | ACTB, BMP7, GSK3B, MYC, SMAD2

Fig 3

Gene network displaying the connections between the markers shared among the three independent populations and mapped in pleiotropic regions.

The nodes represent individual genes. The coloured lines linking the nodes represent the interactions between the genes. The interactions can be divided into three types: 1) Known interactions: from curated databases (light blue) and experimentally determined (purple); 2) Predicted interactions: gene neighbourhood (green), gene fusions (red) and gene co-occurrence (dark blue); 3) Others: text mining (yellow), co-expression (black) and protein homology (violet). The interactions were based on human data, since the human database is better curated and more complete than the bovine database.

Fig 4

Proportion of each QTL category (reproduction, production, milk, meat and carcass quality, exterior appearance, and health) mapped in a 2 Mb interval (1Mb upstream and 1 Mb downstream) from each gene present in the main network (38 genes).

Table 2

Inclusion criteria and/or expression pattern (differentially expressed), in Cánovas et al. (2014) [13] and Nguyen et al. (2017) [20], for the 38 genes mapped in the main network identified by STRING database.

Gene Symbol | Coordinate | Brangus (Cánovas et al., 2014) [13] | Tropical Composite and Brahman (Nguyen et al., 2017) [20]
SNP | TS | TF | Hyp | Pit | Ov | Ut | End | Lv | Ad | ldm | Pit | Ov
GSK3B | 1:65265851–65292071 | XX
CASR | 1:67255165–67344655 | XX
SST | 1:80250205–80251648 | XXXXXXX
TBL1XR1 | 1:90428333–90609529 | XX
ATF2 | 2:21725894–21830562 | XX
SMARCAL1 | 2:105135339–105189135 | XX
SCG2 | 2:112549869–112555375 | XX
NTRK1 | 3:14019229–14037583 | XXX
ADH4 | 6:26853175–26885436 | XXXXX
PDLIM5 | 6:31333660–31568270 | XX
FBP1 | 8:82460863–82491694 | XX
HTR1E | 9:63760625–63852410 | XX
IYD | 9:88612923–88624698 | XXX
LGALS3 | 10:67843328–67861114 | X
FSHR | 11:31110744–31305197 | XX
ABL1 | 11:101011169–101152856 | XX
ABI1 | 13:18094943–18182433 | XX
BMP7 | 13:59424987–59510393 | XX
FKBP1A | 13:60276502–60303717 | XXX
TG | 14:9253697–9263933 | XXX
MYC | 14:13769242–13775688 | XX
TGS1 | 14:24747192–24772996 | XX
CHD7 | 14:28043739–28172246 | XX
ACACB | 17:66101179–66217542 | XX
MYO18B | 17:67768278–68002792 | XX
VAT1L | 18:5045212–5211278 | XX
IGF1R | 21:7967701–8268340 | XX
TDGF1 | 22:53432382–53437166 | XXXXXXX
PPARG | 22:57367072–57432321 | XXX
BRPF3 | 23:10096561–10132302 | XX
SUPT3H | 23:18223167–18623659 | XX
ADCYAP1 | 24:36114443–36121104 | XXX
SMAD2 | 24:47963393–48022086 | XX
ACTL6B | 25:36459674–36469794 | XX
ACTB | 25:39343633–39347047 | X
PRKG1 | 26:6901760–8343635 | XX
HHEX | 26:14120258–14126069 | XX
DPF2 | 29:44205003–44217694 | XXX

SNP: Genes expressed in at least one tissue among the two physiological states (pre- and post-puberty) and mapped near the markers associated with fertility traits by GWAS; TF: transcriptional factor; TS: genes identified with high probability to show a binding site for TF differentially expressed; Hyp: hypothalamus; Pit: pituitary; Ov: ovary; Ut: uterus; End: endometrium; Lv: liver; Ad: adipose tissue; ldm: longissimus dorsi muscle.

Fig 5

Biological processes (BP) significantly enriched in the main network identified by STRING database.

The number in parentheses indicates the number of genes associated with each BP.

Fig 6

Chromosome specific plots displaying pleiotropic effect around the genes shared among all breeds.

The x-axis corresponds to the genomic position in each chromosome and the y-axis to the -log(p-value). The -log(p-value) shown on the y-axis corresponds to the p-values adjusted for multiple testing (FDR<0.01) obtained by Bolormaa et al. (2014) [10] for the pleiotropic analysis. The grey diamond corresponds to the start coordinate of each gene. The horizontal dashed lines indicate the nominal threshold of -log(p-value)>2. All the genes mapped within 1 Mb of a marker with a significant signal for pleiotropic effect were considered to be genes in pleiotropic regions.

Identification of key-regulatory genes through gene-network analysis

The centrality analysis for the main gene network, composed of 38 genes, identified the genes with the largest number of connections in the network. Table 3 shows the number of connections for the top 10 genes with the largest number of interactions. In order to confirm and reinforce the identification of genes with the highest regulatory potential, the interaction pattern of these 38 genes with other proteins was evaluated using IMEx. In this analysis, an overlap was observed between the top 10 genes with the most interactions in the STRING database network and the top 10 genes from the IMEx interactions. In the IMEx analyses, it was also possible to identify, among the top 10 genes, six genes directly related to positive and negative regulation of cellular metabolic processes (red circles in Fig 7). Interestingly, these six genes were also present in the list of top connected genes in the main network identified by the STRING database. These results confirmed the regulatory potential of these genes and highlighted the biological processes in which they are involved. These genes were: MYC proto-oncogene (MYC), Peroxisome Proliferator Activated Receptor Gamma (PPARG), Glycogen Synthase Kinase 3 Beta (GSK3B), SMAD Family Member 2 (SMAD2), ABL Proto-Oncogene 1, Non-Receptor Tyrosine Kinase (ABL1) and Insulin-like growth factor 1 receptor (IGF1R).
Table 3

Top 10 genes based on the number of interactions identified in the STRING database and NetworkAnalyst analyses.

In bold are shown the genes present in both top 10 lists.

Top 10 genes for numbers of interactions with other genes
STRING database: Gene symbol | STRING database: Number of interactions | NetworkAnalyst: Gene symbol | NetworkAnalyst: Number of interactions
MYC | 14 | MYC | 714
ACTB | 9 | SMAD2 | 313
GSK3B | 8 | ABL1 | 276
PPARG | 8 | GSK3B | 243
SST | 7 | ATF2 | 241
ACACB | 7 | ACTB | 198
ABL1 | 5 | PPARG | 124
SMAD2 | 5 | ABI1 | 64
ADCYAP1 | 5 | IGF1R | 58
IGF1R | 5 | LGALS3 | 56
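
The overlap between the two top 10 lists can be checked with a short sketch. The interaction counts are the values as read from Table 3; the variable names are illustrative and not part of the original analysis.

```python
# Interaction counts as read from Table 3.
string_top10 = {
    "MYC": 14, "ACTB": 9, "GSK3B": 8, "PPARG": 8, "SST": 7,
    "ACACB": 7, "ABL1": 5, "SMAD2": 5, "ADCYAP1": 5, "IGF1R": 5,
}
networkanalyst_top10 = {
    "MYC": 714, "SMAD2": 313, "ABL1": 276, "GSK3B": 243, "ATF2": 241,
    "ACTB": 198, "PPARG": 124, "ABI1": 64, "IGF1R": 58, "LGALS3": 56,
}

# Genes present in both lists, kept in STRING rank order.
in_both = [gene for gene in string_top10 if gene in networkanalyst_top10]

# Genes found in only one of the two lists.
string_only = [gene for gene in string_top10 if gene not in networkanalyst_top10]
networkanalyst_only = [gene for gene in networkanalyst_top10 if gene not in string_top10]

print(in_both)
print(string_only)
print(networkanalyst_only)
```

Seven genes appear in both lists (MYC, ACTB, GSK3B, PPARG, ABL1, SMAD2 and IGF1R). The six genes highlighted in the text as regulators of cellular metabolic processes are a subset of this overlap; ACTB is the seventh.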
Fig 7

Interactome displaying the protein-protein interactions for the genes present in the main gene network identified by STRING database with other proteins across the genome.

Larger nodes (highlighted in red and green) represent the genes with the highest number of connections. Genes in red are the genes associated with positive and negative regulation of cellular metabolic processes.

Three of these genes are involved in cell proliferation: MYC, SMAD2 and ABL1. MYC was the gene with the highest number of interactions with other genes in both network analyses. This gene codes for a transcription factor responsible for regulating the transcription of several genes. Consequently, MYC plays a multifunctional role in the control of crucial biological processes, such as cell cycle control and cellular transformation [31]. The expression of MYC was decreased in the muscle tissue of orchidectomized testosterone-treated male mice, indicating that this gene might be involved in the promotion of muscle mass by maintaining myoblasts in the proliferative state, and in the differentiation and growth of muscle tissue in a process mediated by androgen receptors [32]. Additionally, MYC was identified as playing a crucial role in the regulation of gene networks during development of the lactation cycle and for meat and carcass traits in cattle [33-35]. SMAD2 is a member of the TGF-beta-SMAD signaling pathway and is involved in the regulation of several processes associated with female reproduction and embryonic development in cattle [36]. In male rats, SMAD2 is DE between non-sexually mature and sexually mature animals, indicating a relationship with puberty progression [37]. The expression of myostatin, a protein involved in muscle proliferation, is directly regulated by SMAD2 activity [38-40]. Therefore, SMAD2 is a crucial regulator of processes involved in fertility and production traits. ABL1 is a proto-oncogene associated with the regulation of several biological processes related to cellular division and differentiation. ABL1 was observed to be DE in mouse testis after heat shock, indicating a regulatory activity in this tissue [41]. Additionally, homozygous disruptions of ABL1 are associated with neonatal lethality in mice [42]. 
To the best of our knowledge, the association between ABL1 and production traits in cattle has been poorly investigated. However, a significant peak associated with angularity in Brown Swiss cattle was identified close to the ABL1 region. These results suggest a potential association of ABL1 with fertility, production and conformation traits, reinforcing the need to improve our understanding of these relationships. Two other genes among the six identified in this analysis are involved in energy conservation metabolism: PPARG and GSK3B. PPARG belongs to a subfamily of nuclear receptors involved in several crucial biological processes, such as adipogenesis and immune cell activation [43]. Alterations in the expression pattern of PPARG, usually a decrease in expression, have been associated with puberty progression and fertility traits in several species, including humans and cattle [44-46]. Additionally, PPARG has been associated with meat quality (more specifically, intramuscular fat percentage) and with milk synthesis in cattle, which are economically important traits in beef and dairy cattle production, respectively [47-49]. Muscle glycogen is fundamental as an energy reservoir. The amount of liver and muscle glycogen available determines the use of fat and, as a last resort, amino acids for providing energy. GSK3B is a serine-threonine kinase belonging to the glycogen synthase kinase subfamily, and its function is known to be associated with energy metabolism, body pattern formation and neuronal cell development [50]. Additionally, GSK3B has been associated with age at puberty, sperm motility and decreased spermatogenesis. Therefore, its function is directly related to fertility traits [51-53]. Interestingly, GSK3B has also been associated with several economically relevant traits such as skeletal muscle hypertrophy, intramuscular fat, meat quality, milk synthesis and mammary gland proliferation [54-57]. 
The last gene in this list, IGF1R, is the receptor of the Insulin-like growth factor 1 (IGF1), which also binds IGF2 and insulin with lower affinity. IGF1 and IGF2 have important functions in steroidogenic activity and in the regulation of body growth and maturation. In female cattle, IGF1 and IGF2 are associated with the regulation of steroidogenic activity in the granulosa cells, as well as the regulation of mitosis (IGF1 and IGF2) and apoptosis (IGF1) during follicular development [58, 59]. The bovine testis is an important source of IGF1 production [60]. Both circulating and locally produced (testis) IGF1 may play a crucial role in testicular size and testosterone secretion [61, 62]. The IGF1 pathway influences gonadotrophin-releasing hormone (GnRH) neurons during puberty and is directly associated with puberty progression and the development of reproductive traits [52, 63, 64]. Additionally, IGF1 has been associated with the regulation of several production traits, for example, feed intake, feed conversion, body weight, milk protein yield, milk fat yield, milk fat concentration and somatic cell score [65-67]. Variants mapped to IGF1R may change the affinity between IGF1 and its receptor, resulting in a different response to the circulating levels of this hormone. The crucial roles of IGF1 in the regulation of body development, together with the results obtained in the present study, highlight the potential of IGF1R to act as a key-regulator of pleiotropic effects associated with fertility and production traits. All six genes described above are directly related to fertility and economically relevant traits. Additionally, these genes were identified as potential key-regulatory genes due to their number of interactions with other genes and their biological functions. Consequently, these genes are important candidate genes for pleiotropic effects on multiple production and fertility traits. 
Genes that did not appear in both analyses described in Table 3 were not included in the discussion. However, some of these genes are fundamental to the metabolic processes discussed here. For example, somatostatin (SST), also known as growth hormone inhibiting hormone (GHIH), affects energy conservation metabolism by inhibiting insulin and glucagon secretion. In addition, somatostatin produced in the hypothalamus is transported to the anterior pituitary, where it inhibits the release of growth hormone (GH), TSH and prolactin [68]. This gene also encodes neuronostatin, which is also implicated in the release of pituitary hormones [69]. Furthermore, somatostatin is involved in the MAPK, cAMP and PKA pathways, among many others [70]. As a consequence, somatostatin also has a role in energy conservation metabolism and cell proliferation, the two main biological processes detected in the current study. ACACB and ADCYAP1 are involved in energy conservation metabolism. ACACB encodes acetyl-CoA carboxylase, the enzyme that converts acetyl-CoA to malonyl-CoA, a step fundamental for fatty acid biosynthesis (KEGG IDs: 04931 and 00061). ADCYAP1 encodes adenylate cyclase activating polypeptide 1, which, through the cAMP and PKA pathways, is involved in insulin upregulation and in the regulation of insulin levels in insulin secretory granules, respectively, among other functions (KEGG ID: 04911). ABI1 and ATF2 are cell proliferation regulators. ABI1 encodes ABL interacting protein 1, which facilitates the ABL cell proliferation signal. ATF2 encodes activating transcription factor 2, also known as CREB2. CREB2 regulates cell cycle proteins, pro-apoptotic proteins, cell adhesion molecules, and membrane and cytoplasmic signaling proteins. In addition, CREB2 is the final step in many pathways, including cAMP, PKA, estradiol 17-beta (KEGG ID: 04915) and glucagon (KEGG ID: 04922). Therefore, CREB2 is also involved in energy conservation metabolism. 
LGALS3 encodes galectin 3, a carbohydrate-binding protein with affinity for beta-galactosides. As a consequence, galectin 3 is involved in cell proliferation, adhesion, differentiation, angiogenesis and apoptosis [71]. Galectin 3 has been shown to be expressed by trophoblast cells in response to 17β-estradiol, progesterone and human chorionic gonadotropin (hCG). Among many other effects, galectin 3 induces apoptosis in endometrial cells, which would allow embryo implantation [72].

Thyroid function and genes with pleiotropic effect

Among the genes present in the main network identified by the STRING database, two are crucial for the synthesis of thyroid hormones: thyroglobulin (TG) and iodotyrosine deiodinase (IYD). TG is metabolized through the addition of iodine molecules to produce mono- and di-iodotyrosine and the thyroid hormones triiodothyronine (T3) and thyroxine (T4). Consequently, TG is one of the main storage molecules of iodine in the body [73]. IYD is responsible for conserving iodide by recycling iodine from mono- and di-iodotyrosine during the synthesis of T3 and T4. Due to the low availability of iodine in nature, IYD dysfunctions reduce the amount of iodine available for T3 and T4 synthesis [74-76]. In target cells, IYD converts T4 to T3, the active thyroid hormone, and converts T3 to di-iodotyrosine, inactivating T3. These processes also provide iodine to peripheral tissues. The thyroid hormones are related to the control of several crucial biological processes involved in the regulation of basal metabolism. In cattle, genes related to thyroid hormone regulation have been identified as differentially expressed (DE) in studies evaluating feed efficiency, lactation stage, fat deposition and early embryonic development [77-83]. Additionally, some studies suggest a pleiotropic effect of TG polymorphisms on production traits [84, 85]. Furthermore, changes in the levels of thyroid hormones during pregnancy, mainly during the initial development of the embryo, are associated with adverse pregnancy outcomes and embryonic losses [86]. Moreover, alterations in thyroid activity may result in male and female infertility [19, 87]. An interesting link between the thyroid hormones and the selection for production traits in cattle is that TG (BTA14:9,253,697–9,263,933) is mapped in the same core selective sweep (CSS) region as DGAT1 (BTA14:1,795,425–1,804,838) [88]. 
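To give a sense of scale for the two BTA14 intervals quoted above, the distance between them can be computed directly from the reported coordinates. The following is a minimal sketch; the `Interval` type and the `gap_bp` helper are illustrative names introduced here, not part of any pipeline used in the study, and the boundaries of the CSS region itself are not given in this section.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval (hypothetical helper type for this illustration)."""
    chrom: str
    start: int
    end: int


def gap_bp(a: Interval, b: Interval) -> int:
    """Return start-of-downstream minus end-of-upstream, or 0 if the intervals overlap."""
    if a.chrom != b.chrom:
        raise ValueError("intervals are on different chromosomes")
    first, second = sorted([a, b], key=lambda iv: iv.start)
    return max(0, second.start - first.end)


# Coordinates as quoted in the text for TG and DGAT1.
TG = Interval("BTA14", 9_253_697, 9_263_933)
DGAT1 = Interval("BTA14", 1_795_425, 1_804_838)

gap = gap_bp(TG, DGAT1)
print(f"TG lies {gap:,} bp (about {gap / 1e6:.2f} Mb) downstream of DGAT1")
```

With the coordinates as quoted, the two genes are separated by roughly 7.4 Mb, which illustrates how far a selective sweep around a major-effect locus would have to extend to include both.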
The QTL related to DGAT1 is considered to have a major effect on production traits and has been associated with several phenotypes in both dairy and beef cattle breeds [89-91]. Due to this major effect, molecular markers associated with the DGAT1 effect are intensively exploited in genetic improvement programs. Generally, the use of molecular markers in selection programs does not consider the relationship between the target marker and the surrounding markers. However, intensive selection may increase the extent and the overall level of LD in a region [92]. This phenomenon may result in indirect selection (hitchhiking effect) of markers mapped in different genes and with unpredictable effects. It is important to note that the highest signal for pleiotropic effect observed by Bolormaa et al. (2014) [10] lies in the same CSS, as shown in Fig 6. In the same position, three additional genes shared among the three independent populations (MYC, TGS1 and CHD7) were mapped to regions with five or more previously reported QTLs and were present in the main network identified by the STRING database. The thyroid cancer pathway (KEGG ID: 05216) was one of the enriched pathways identified by the STRING database (Table 1). Thyroid hormones and their precursors are not involved in this pathway, but both MYC and PPARG are implicated, reinforcing the association between this main network and the regulation of thyroid activity. Together with the presence of TG and IYD in the main network identified by the STRING database (Fig 3), these results suggest that this network is enriched for genes related to thyroid function. The VarElect software was used to test this hypothesis through an association analysis between the genes that composed the main network and the keywords “thyroid” and “thyroid hormones”. Through data-mining using information available in GeneCards and MalaCards, it was possible to identify that, of the 38 genes (S4 File) present in the main network, 31 are directly related to those keywords. 
This result indicates that the gene network is enriched for genes related to thyroid function. As previously described, thyroid function is related to the regulation of several biological processes associated with economically important traits. Because of this association with several processes and traits, the genes involved in thyroid activity are excellent functional candidate genes for pleiotropic effects. Further functional analyses will be performed to elucidate the relationship between the different traits affected by pleiotropic effects, as well as to identify candidate variants associated with the function of these candidate genes. It is important to highlight that, during the analyses performed here, some differences were observed in the gene expression profile among populations, even when similar groups were compared (pre- and post-puberty). A very common phenomenon in biological analyses that can help to address this issue is Simpson’s paradox, which is observed when results from aggregated data contradict those from separate analyses. There are several reports in the literature discussing the impact of Simpson’s paradox in different fields, such as network analysis and gene expression [93, 94, 95]. The biological bases for Simpson’s paradox in biological analyses are still poorly understood. However, some points can be raised to help in the discussion of this phenomenon. For example, in the data analysed in the present manuscript, the differences between populations in genetic background, environmental effects, the evaluation of puberty and the number of tissues evaluated can help to explain these differences. It is important to highlight that these populations (Brangus, Brahman and Tropical Composite) share the Brahman genetic component. However, the proportion of this component is different in each population. 
For example, in the Brangus population the animals have 3/8 of Brahman component and 5/8 of Angus component. All these differences can be taken together to discuss the possible causes of the phenomenon observed here. Additionally, it is important to highlight that the genes shown in Table 2 are a specific group of genes, namely those with the highest potential to exert pleiotropic effects. The consistency of expression among tissues and populations can also help to identify these key-regulatory genes. The present study focuses precisely on those genes that maintain their expression profile despite all these possible confounding factors, which can be additional evidence of a crucial regulatory role.
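The way Simpson’s paradox can arise from unbalanced sampling across populations can be illustrated with a small numerical sketch. The expression values below are hypothetical toy numbers chosen only to reproduce the effect; they are not data from this study, and the population labels and the `post_minus_pre` helper are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical expression values (arbitrary units); NOT data from the study.
# Population A has a low baseline and mostly post-puberty samples,
# population B has a high baseline and mostly pre-puberty samples.
expression = {
    "population_A": {"pre": [2.0], "post": [2.9, 3.0, 3.1, 3.0]},
    "population_B": {"pre": [7.9, 8.0, 8.1, 8.0], "post": [9.0]},
}


def post_minus_pre(groups):
    """Difference in mean expression between post- and pre-puberty samples."""
    return mean(groups["post"]) - mean(groups["pre"])


# Separate analyses: one contrast per population.
within = {pop: post_minus_pre(groups) for pop, groups in expression.items()}

# Aggregated analysis: pool all samples, ignoring population of origin.
pooled = {
    stage: [value for groups in expression.values() for value in groups[stage]]
    for stage in ("pre", "post")
}
pooled_diff = post_minus_pre(pooled)

for pop, diff in within.items():
    print(f"{pop}: post - pre = {diff:+.2f}")
print(f"pooled: post - pre = {pooled_diff:+.2f}")
```

In this toy example, expression is higher after puberty within each population, yet the pooled contrast has the opposite sign because the populations differ in baseline expression and in the proportion of pre- and post-puberty samples. This is one reason why consistency across separate populations, rather than a single pooled analysis, can be informative.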

Conclusions

The present study described a multi-breed and multi-OMICs approach to identify key-regulatory candidate genes for pleiotropic effects in beef cattle using the results generated by previous studies. Our findings confirm the feasibility of using a systems biology approach to unravel candidate genes regulating complex traits. Genes identified in this study are mainly involved in two biological processes, energy conservation metabolism and cell proliferation. These are probably the most theoretically plausible processes to unify the phenotypes investigated in this study: exterior appearance, health, reproduction, production, meat and carcass, and milk traits. This study contributes to the understanding of the cause-consequence relationships between variants mapped on candidate pleiotropic genes and the complex traits they affect. Additionally, the results obtained here will be useful for better defining statistical models to improve the accuracy of genomic prediction of breeding values and to avoid the simultaneous selection of unfavorable genetically correlated traits in beef cattle and other livestock species.

Ensembl gene ID, official symbol and genomic coordinates for the 108 genes shared among all the populations (tab 1) and 38 genes present in the main network identified by STRING DB (tab 2).

(XLSX)

QTL information for the 89 genes mapped in regions with five or six QTL categories annotated.

(TXT)

Gene ontology terms associated with the 38 genes in the main network identified by STRING DB.

(TXT)

VarElect analysis output for the 38 genes in the main network identified by STRING DB.

(XLSX)