
Identification of key genes and pathways in mild and severe nonalcoholic fatty liver disease by integrative analysis.

Jin Feng¹, Tianjiao Wei¹, Xiaona Cui¹, Rui Wei¹, Tianpei Hong¹.

Abstract

BACKGROUND: The global prevalence of nonalcoholic fatty liver disease (NAFLD) is increasing. The pathogenesis of NAFLD is multifaceted, and the underlying mechanisms are elusive. We conducted data mining analysis to gain a better insight into the disease and to identify the hub genes associated with the progression of NAFLD.
METHODS: The dataset GSE49541, containing the profile of 40 samples representing mild stages of NAFLD and 32 samples representing advanced stages of NAFLD, was acquired from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified using the R programming language. The Database for Annotation, Visualization and Integrated Discovery (DAVID) online tool and Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database were used to perform the enrichment analysis and construct protein-protein interaction (PPI) networks, respectively. Subsequently, transcription factor networks and key modules were identified. The hub genes were validated in a mouse model of high fat diet (HFD)-induced NAFLD and in cultured HepG2 cells by real-time quantitative PCR.
RESULTS: Based on the GSE49541 dataset, 57 DEGs were selected and enriched in chemokine activity and cellular component, including the extracellular region. Twelve transcription factors associated with DEGs were indicated from PPI analysis. Upregulated expression of five hub genes (SOX9, CCL20, CXCL1, CD24, and CHST4), which were identified from the dataset, was also observed in the livers of HFD-induced NAFLD mice and in HepG2 cells exposed to palmitic acid or advanced glycation end products.
CONCLUSION: The hub genes SOX9, CCL20, CXCL1, CD24, and CHST4 are involved in the aggravation of NAFLD. Our results offer new insights into the underlying mechanism of NAFLD progression.


Keywords: Computational biology; Fatty liver; Nonalcoholic fatty liver disease

Year: 2021 PMID: 34786546 PMCID: PMC8579024 DOI: 10.1016/j.cdtm.2021.08.002

Source DB: PubMed Journal: Chronic Dis Transl Med ISSN: 2095-882X

Introduction

The incidence of nonalcoholic fatty liver disease (NAFLD) and its advanced subtypes has been rising rapidly, imposing a health and economic burden on patients. NAFLD is the leading cause of liver disease globally and is associated with several metabolic disorders, such as type 2 diabetes. NAFLD includes a series of conditions from early steatosis to nonalcoholic steatohepatitis (NASH), and even hepatic carcinoma. However, the exact mechanisms of the development and progression of NAFLD are still not completely elucidated. Microarray technology is now widely used in discovery-based biomedical research. The pathogenesis of NAFLD involves a myriad of distinct molecular pathways and cellular changes. Several studies have reported the molecular mechanisms of NAFLD pathogenesis in the liver [8, 9, 10]. However, the key genes associated with the disease progression and the underlying functional pathways remain obscure, and whether the differentially expressed genes (DEGs) are involved in hepatic lipid metabolism is still unclear. In the present study, we have integrated the available microarray datasets of human NAFLD liver tissues to perform comprehensive bioinformatic analysis of DEGs. Moreover, we have verified the expression changes of the hub genes in the livers of high fat diet (HFD)-induced NAFLD mice, as well as in HepG2 cells exposed to glucolipotoxicity. Our results might elucidate potential biomarkers and targets for the diagnosis and treatment of NAFLD.

Methods

Animal experiments ethics

The animal experiments in this study were approved by the Animal Care and Use Committee of Peking University (No. LA2018316). All ethical principles for the care and use of laboratory animals were followed.

Microarray data collection

The Gene Expression Omnibus dataset GSE49541, which was contributed by Moylan et al., was downloaded from the National Center for Biotechnology Information website. The dataset contained a total of 72 RNA profiles from liver samples, including 40 belonging to mild NAFLD (fibrosis stage 0–1) and 32 belonging to advanced NAFLD (fibrosis stage 3–4). The dataset was generated using the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array).

Data preprocessing and DEG screening

The R language (affy package, version 1.64.0) was used to process the raw data. After background correction, standardization, and expression value calculation, the probe IDs were converted into gene symbols based on annotation files, as previously described. Subsequently, DEG screening was performed using the R limma package (version 3.42.2). The screening criteria for the identification of DEGs were defined as |log2(fold change)| > 1 and P < 0.05.
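
The screening rule above can be illustrated with a minimal Python sketch. The gene names and statistics are made-up placeholders, not values from GSE49541, and the function `screen_degs` is a hypothetical helper; the actual analysis was run with limma in R.

```python
# Hypothetical per-gene statistics: (gene, log2 fold change, P value).
# These values are placeholders for illustration, not data from GSE49541.
stats = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -1.3, 0.020),
    ("GENE_C", 0.4, 0.003),   # fold change too small
    ("GENE_D", 2.1, 0.200),   # P value too large
]

def screen_degs(rows, lfc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and P < p_cutoff, split by direction."""
    up = [g for g, lfc, p in rows if lfc > lfc_cutoff and p < p_cutoff]
    down = [g for g, lfc, p in rows if lfc < -lfc_cutoff and p < p_cutoff]
    return up, down

up, down = screen_degs(stats)
print("upregulated:", up)
print("downregulated:", down)
```

A cutoff of |log2(fold change)| > 1 corresponds to a more than two-fold difference in expression between the two groups.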

Enrichment analysis of DEGs

To evaluate the functions of a cluster of DEGs, the Database for Annotation, Visualization and Integrated Discovery (DAVID) tool (https://david.ncifcrf.gov/) was used to perform the Gene Ontology (GO) analysis. Moreover, the Kyoto Encyclopedia of Genes and Genomes (KEGG) was used for the pathway analysis of DEGs. The enrichment analysis of DEGs was regarded as statistically significant at P < 0.05. The GO enrichment package of the R language, gplot, was used to list all the enriched pathways according to the P value.
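
To make the idea of an enrichment P value concrete, the sketch below computes a plain hypergeometric upper-tail probability in Python. This is a stand-in for illustration only: DAVID reports its own modified Fisher's exact statistic, and all counts used here are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical counts: 5 of 57 DEGs carry a term that annotates
# 200 of 20000 background genes.
p = hypergeom_enrichment_p(k=5, n=57, K=200, N=20000)
print(f"P = {p:.3g}; significant at 0.05: {p < 0.05}")
```

With these placeholder counts the expected number of annotated DEGs is below one, so observing five gives a small P value.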

Protein–protein interaction and module analyses

To investigate the connections among the proteins encoded by the identified DEGs, the Search Tool for the Retrieval of Interacting Genes (STRING; https://string-db.org/) was used to establish the protein–protein interaction (PPI) network, with a confidence score >0.4 as the threshold. Cytoscape software (version 3.8.2x; The Cytoscape Consortium, New York City, NY, USA) was used to visualize the PPI network. The Molecular Complex Detection (MCODE) algorithm was used to identify the key modules of the PPI network, with the criteria set at degree = 5, node score = 0.2, k-core = 2, and max depth = 100. The modules obtained with this Cytoscape algorithm were scored and ranked, and the 2 modules with the highest scores were considered significant. To identify hub genes, the top 16 genes were ranked under each topological algorithm, and the genes that appeared in at least 10 of the resulting lists were considered hub genes.
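
The thresholding step described above can be sketched in Python as follows. The edge list is hypothetical (placeholder protein names, scores on a 0 to 1 scale), and `build_network` is an illustrative helper rather than part of STRING or Cytoscape.

```python
from collections import defaultdict

# Hypothetical scored edges: (protein_a, protein_b, confidence score in [0, 1]).
edges = [
    ("P1", "P2", 0.92),
    ("P1", "P3", 0.55),
    ("P2", "P3", 0.41),
    ("P3", "P4", 0.30),  # below the threshold, so this edge is dropped
]

def build_network(scored_edges, min_score=0.4):
    """Return an undirected adjacency map keeping only edges with score > min_score."""
    adjacency = defaultdict(set)
    for a, b, score in scored_edges:
        if score > min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

network = build_network(edges)
# Node degree = number of retained interaction partners.
degree = {node: len(partners) for node, partners in network.items()}
print(sorted(degree.items(), key=lambda kv: (-kv[1], kv[0])))
```

Proteins whose edges all fall below the threshold never enter the adjacency map, which mirrors the removal of isolated genes before visualization.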

Transcription factor analysis

The expression of genes is regulated by transcription factors. To predict and visualize the key transcription factors of the PPI network, the iRegulon plugin of Cytoscape (version 3.8.2x) was used, as previously described. Normalized enrichment score (NES) > 5 was defined to select transcription factors. According to the NES score, the top three modules were ranked and listed.

Establishment of the mouse NAFLD model

All mice were purchased from the Vital River Animal Center (Beijing, China). After one week of acclimatization, twelve 8-week-old male C57BL/6N mice were divided into three groups. Two groups of mice were fed an HFD (dietary fat content of 60%) for 18 weeks (n = 6) and 24 weeks (n = 3), which represented the mild and advanced NAFLD models, respectively. One additional group of mice (n = 3) fed a normal diet (dietary fat content of 4%) served as the control. All the mice were reared in individually ventilated cages located in the same room. Food and water were available ad libitum. Before the animals were sacrificed, magnetic resonance imaging (MRI; Siemens Prisma, Munich, Germany) was used to measure the body fat content.

Oil red O staining

The liver tissues were fixed overnight with 10% (v/v) neutral-buffered formalin at 4 °C and embedded in optimal cutting temperature compound. The 5 μm-thick sections were stained with oil red O solution (Servicebio, Wuhan, China) to assess the accumulation of hepatic fat. Images were obtained using a panoramic section scanner (3Dhistech, Pannoramic, Budapest, Hungary).

Establishment of liver cell glucolipotoxicity models in vitro

The human liver cell line HepG2 (validated for gene expression and checked for mycoplasma contamination before use) was kindly provided by the Medical Research Center, Peking University Third Hospital (Beijing, China). Palmitic acid (PA; Sigma-Aldrich, St. Louis, MO, USA) was used to establish the lipotoxicity-induced hepatic insulin resistance model, and advanced glycation end products (AGEs; Abcam, Cambridge, UK) were used to generate the glucotoxicity-induced hepatic damage model in vitro. HepG2 cells were incubated in Dulbecco's Modified Eagle's Medium (Gibco, Carlsbad, CA, USA) with 10% (v/v) fetal bovine serum (Gibco). PA (256 mg) was dissolved in 5 mL anhydrous ethanol, and then titrated with 5 mL sodium hydroxide (0.1 mol/L). A total of 5 mL PA solution was slowly dripped into 95 mL 10% bovine serum albumin to obtain a complex with a concentration of 5 mmol/L, as previously described. Subsequently, the HepG2 cells were incubated with PA (125, 250, 500, and 1000 nmol/L), or with AGEs (1, 10, and 100 μg/mL) for 24 h. Cells were then collected for RNA extraction.
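
The stated 5 mmol/L concentration of the PA complex can be checked with a short calculation, sketched here in Python. Two assumptions are not given in the text: a molar mass for palmitic acid of about 256.4 g/mol, and that the volumes of the mixed liquids are additive.

```python
PA_MOLAR_MASS_G_PER_MOL = 256.4  # assumed approximate molar mass of palmitic acid

def millimolar(mass_mg, molar_mass_g_per_mol, volume_ml):
    """Concentration in mmol/L from mass (mg), molar mass (g/mol), and volume (mL)."""
    mmol = mass_mg / molar_mass_g_per_mol      # mg / (g/mol) = mmol
    return mmol / (volume_ml / 1000.0)         # mmol per litre

# 256 mg PA in 5 mL ethanol plus 5 mL NaOH (volumes assumed additive).
primary = millimolar(256, PA_MOLAR_MASS_G_PER_MOL, 5 + 5)
# 5 mL of that solution dripped into 95 mL of 10% bovine serum albumin.
complex_conc = primary * 5 / (5 + 95)
print(f"primary solution ~ {primary:.1f} mmol/L")
print(f"PA complex ~ {complex_conc:.2f} mmol/L")
```

Under these assumptions the primary solution is close to 100 mmol/L and the 1:20 dilution gives approximately 5 mmol/L, consistent with the value reported in the text.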

Real-time quantitative PCR

RNA of liver tissues or HepG2 cells was extracted with TRIzol (Thermo Fisher Scientific, Waltham, MA, USA) and reverse transcribed to cDNA using a RevertAid First Strand cDNA Synthesis kit (Fermentas, Vilnius, Lithuania). The cDNA was subjected to quantitative analysis using the SYBR Green supermix (Bio-Rad Laboratories, Hercules, CA, USA) in a real-time quantitative PCR detection system (Bio-Rad Laboratories). The primer sequences synthesized by the Beijing AuGCT DNA-SYN Biotechnology Company (Beijing, China) are summarized in Supplementary Table S1. The housekeeping gene, GAPDH, was used to normalize the expression level of each gene.
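
The text states only that GAPDH was used for normalization and does not name the quantification formula. One common choice is the 2^-ΔΔCt method, sketched below in Python as an assumption; the Ct values are hypothetical and `relative_expression` is an illustrative helper.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus control by the 2^-ddCt method."""
    delta_ct_sample = ct_target - ct_ref              # normalize to the reference gene
    delta_ct_control = ct_target_ctrl - ct_ref_ctrl   # same normalization in control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values (not from the study): target gene and GAPDH,
# in a treated sample and in the control sample.
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_ctrl=26.0, ct_ref_ctrl=18.0)
print(fold)  # 4.0: the target Ct is 2 cycles lower relative to GAPDH
```

The method assumes roughly equal amplification efficiency for the target and reference genes.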

Statistical analysis

All in vivo and in vitro studies were performed as three independent experiments. The experimental data are presented as the means ± standard deviation (SD). Statistical analysis was carried out using one-way ANOVA followed by the post-hoc Tukey–Kramer test. The statistical significance was defined at P < 0.05. All the analyses were performed using the Statistical Product and Service Solutions (SPSS) 22.0 software (IBM SPSS Inc, Chicago, IL, USA).
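
For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the following Python sketch. The measurements are hypothetical placeholders, and the P value and the Tukey–Kramer post-hoc comparison are omitted because they need distribution functions that a statistics package (SPSS in this study) normally supplies.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical measurements for three groups (placeholders, not study data).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

A large F indicates that differences between group means are large relative to the scatter within groups.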

Results

The dataset contained the microarray data of 40 patients with mild NAFLD (fibrosis stage 0–1) and 32 patients with advanced NAFLD (fibrosis stage 3–4). To identify the hub DEGs precisely, statistical significance was defined as |log2(fold change)| > 1 and P < 0.05. A total of 57 DEGs, including 52 upregulated DEGs and 5 downregulated DEGs, were selected (Supplementary Table S2) and displayed in the form of a heat map and a volcano map (Fig. 1). The top 5 upregulated DEGs were EPCAM, STMN2, CTHRC1, EFEMP1, and CD24. The five downregulated DEGs were CYP2C19, DHRS2, MT1M, FITM1, and GNMT (Supplementary Table S2).

Fig. 1

Heat map (A) and volcano map (B) of the identified differentially expressed genes (DEGs) between mild (n = 40) and advanced (n = 32) NAFLD livers based on the GSE49541 dataset.

KEGG pathway and GO enrichment analyses of DEGs

To determine the biological functions of the identified DEGs, enrichment analysis was performed using DAVID. As shown in Fig. 2, in the cellular component class, the upregulated DEGs were enriched in the extracellular region, proteinaceous extracellular matrix, extracellular matrix, extracellular space, and extracellular exosome. In the molecular function class, the DEGs were associated with chemokine activity and extracellular matrix structural constituent. In the biological process class, the DEGs were significantly associated with cell adhesion, cell chemotaxis, and sulfur compound metabolic process. In the KEGG pathway enrichment analysis, the DEGs were solely enriched in the chemokine signaling pathway (Fig. 2).

Fig. 2

Top 11 pathways and biological functions enriched in Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and Gene Ontology (GO) analysis related to DEGs.

PPI network analysis of DEGs

The identified DEGs were submitted to the online database STRING. Subsequently, Cytoscape was used for network visualization analysis, and the isolated genes that showed no interactions were removed. As shown in Fig. 3A, there were 28 nodes and 36 edges in the PPI network. The MCODE plugin of Cytoscape software was further used to identify the densely connected significant modules that met the cutoff criteria. According to their scores, two significant modules were identified from the PPI network. There were 4 nodes and 6 edges in module 1 (score: 4) (Fig. 3B), and 5 nodes and 7 edges in module 2 (score: 3.5) (Fig. 3C). The plugin CytoHubba was used to parse the PPI network. Eleven network measures were used: degree, average shortest path length, eccentricity, betweenness centrality, radiality, neighborhood connectivity, stress, topological coefficient, closeness centrality, clustering coefficient, and the number of directed edges. For each measure, the top 16 genes were regarded as important nodes, and the hub genes were selected with a frequency of occurrence ≥10 (Table 1). Based on the analysis of the 11 topological algorithms, SOX9, CCL20, CXCL1, CD24, and CHST4 were considered the hub genes (Table 1) and were used for the further validation studies.

Fig. 3

Protein–protein interaction (PPI) and module analyses. (A) PPI network and module analyses of DEGs in the GSE49541 dataset. (B–C) Significant modules, module 1 (B) and module 2 (C), selected from PPI network analysis. The color and size of each node reflect its degree (the darker the color and the larger the size, the greater the degree). The confidence score is indicated by the thickness of the line (the thicker the line, the higher the confidence score).

Table 1

Hub genes analyzed by different topological algorithms in the protein−protein interaction network.

Different topological algorithms	Top genes
Degree	LUM, COL1A2, CXCL1, CTHRC1, SOX9, MMP7, CXCL6, CCL20, CCL19, CD24, CHST4, OGN, THBS2, COL15A1, PODXL, EPCAM
Average shortest path length	GAL3ST4, CHI3L1, CXCL6, CCL20, CCL19, PODXL, FZD7, EPCAM, CD24, DPT, CHST4, CXCL1, OGN, THBS2, CTHRC1, SOX9
Eccentricity	CXCL6, CCL20, CCL19, PODXL, GAL3ST4, CHI3L1, CXCL1, CD24, CHST4, EPCAM, DPT, FZD7, LUM, CTHRC1, SOX9, MMP7
Betweenness centrality	MMP7, COL1A2, CXCL1, LUM, SOX9, CHST4, CTHRC1, COL15A1, CD24, THBS2, OGN, PODXL, GAL3ST4, CHI3L1, CXCL6, CCL20
Radiality	GAL3ST4, CHI3L1, CXCL6, CCL20, CCL19, PODXL, FZD7, EPCAM, DPT, CD24, CHST4, CXCL1, OGN, THBS2, CTHRC1, SOX9
Neighborhood connectivity	COL15A1, THBS2, OGN, DPT, CHI3L1, MMP7, SOX9, COL1A2, FZD7, CTHRC1, CXCL6, CCL20, CCL19, LUM, EPCAM, CHST4
Stress	MMP7, COL1A2, LUM, CXCL1, CHST4, SOX9, CTHRC1, COL15A1, CD24, THBS2, OGN, PODXL, DPT, CHI3L1, FZD7, CXCL6
Topological coefficient	CXCL6, CCL20, CCL19, DPT, EPCAM, COL15A1, THBS2, OGN, PODXL, CD24, CTHRC1, CXCL1, SOX9, MMP7, CHST4, COL1A2
Closeness centrality	GAL3ST4, CHI3L1, CXCL6, CCL20, CCL19, PODXL, FZD7, EPCAM, CD24, DPT, CHST4, CXCL1, OGN, THBS2, CTHRC1, SOX9
Clustering coefficient	CXCL6, CCL20, CCL19, EPCAM, DPT, OGN, THBS2, COL15A1, CD24, CTHRC1, SOX9, MMP7, COL1A2, CXCL1, LUM, GAL3ST4
Number of directed edges	COL1A2, LUM, CXCL1, CTHRC1, SOX9, MMP7, CXCL6, CCL20, CCL19, OGN, THBS2, COL15A1, CD24, CHST4, EPCAM, DPT
Common DEGs (≥10 diagrams)	CCL20, CXCL1, CD24, CHST4, SOX9
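
The frequency criterion can be checked against Table 1 with a short tally, sketched here in Python. The snippet only counts how many of the 11 lists, as transcribed here, contain each of the five reported hub genes; it is not the CytoHubba workflow itself and does not attempt to reproduce the full selection procedure.

```python
from collections import Counter

# Top-16 gene lists per topological measure, transcribed from Table 1.
TABLE_1 = {
    "Degree": "LUM, COL1A2, CXCL1, CTHRC1, SOX9, MMP7, CXCL6, CCL20, CCL19, CD24, CHST4, OGN, THBS2, COL15A1, PODXL, EPCAM",
    "Average shortest path length": "GAL3ST4, CHI3L1, CXCL6, CCL20, CCL19, PODXL, FZD7, EPCAM, CD24, DPT, CHST4, CXCL1, OGN, THBS2, CTHRC1, SOX9",
    "Eccentricity": "CXCL6, CCL20, CCL19, PODXL, GAL3ST4, CHI3L1, CXCL1, CD24, CHST4, EPCAM, DPT, FZD7, LUM, CTHRC1, SOX9, MMP7",
    "Betweenness centrality": "MMP7, COL1A2, CXCL1, LUM, SOX9, CHST4, CTHRC1, COL15A1, CD24, THBS2, OGN, PODXL, GAL3ST4, CHI3L1, CXCL6, CCL20",
    "Radiality": "GAL3ST4, CHI3L1, CXCL6, CCL20, CCL19, PODXL, FZD7, EPCAM, DPT, CD24, CHST4, CXCL1, OGN, THBS2, CTHRC1, SOX9",
    "Neighborhood connectivity": "COL15A1, THBS2, OGN, DPT, CHI3L1, MMP7, SOX9, COL1A2, FZD7, CTHRC1, CXCL6, CCL20, CCL19, LUM, EPCAM, CHST4",
    "Stress": "MMP7, COL1A2, LUM, CXCL1, CHST4, SOX9, CTHRC1, COL15A1, CD24, THBS2, OGN, PODXL, DPT, CHI3L1, FZD7, CXCL6",
    "Topological coefficient": "CXCL6, CCL20, CCL19, DPT, EPCAM, COL15A1, THBS2, OGN, PODXL, CD24, CTHRC1, CXCL1, SOX9, MMP7, CHST4, COL1A2",
    "Closeness centrality": "GAL3ST4, CHI3L1, CXCL6, CCL20, CCL19, PODXL, FZD7, EPCAM, CD24, DPT, CHST4, CXCL1, OGN, THBS2, CTHRC1, SOX9",
    "Clustering coefficient": "CXCL6, CCL20, CCL19, EPCAM, DPT, OGN, THBS2, COL15A1, CD24, CTHRC1, SOX9, MMP7, COL1A2, CXCL1, LUM, GAL3ST4",
    "Number of directed edges": "COL1A2, LUM, CXCL1, CTHRC1, SOX9, MMP7, CXCL6, CCL20, CCL19, OGN, THBS2, COL15A1, CD24, CHST4, EPCAM, DPT",
}
REPORTED_HUBS = ["SOX9", "CCL20", "CXCL1", "CD24", "CHST4"]
MIN_LISTS = 10  # frequency-of-occurrence threshold stated in the text

# Count, for every gene, the number of lists in which it appears.
counts = Counter(gene for row in TABLE_1.values() for gene in row.split(", "))
hub_counts = {gene: counts[gene] for gene in REPORTED_HUBS}
print(hub_counts)
print(all(n >= MIN_LISTS for n in hub_counts.values()))
```

Printing `counts` in full shows the tally for every gene in the table.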

Modules of key transcription factors

Transcription factors regulate gene expression and function by binding to a specific DNA sequence. Here, the iRegulon plugin was used to predict the transcription factors and the regulatory network of their target genes. All predicted transcription factor modules with NES >5 are listed in Supplementary Table S3. According to the NES score, the top 3 transcription factor modules are displayed in Fig. 4. In module 1, it was predicted that TEAD1, TEAD2, TEAD3, and TEAD4 transcription factors would regulate LPL, THBS2, GABRP, PLCXD3, FABP4, and GABRB3 (Fig. 4A). In module 2, the transcription factors HIVEP1, HIVEP2, HIVEP3, and ZNF831 would regulate THBS2, GABRB3, CCL20, and PODXL (Fig. 4B). In module 3, it was predicted that ZNF333, RUNX2, CBFB, and HOXA13 would regulate SOX9, PLCXD3, CHST4, GABRB3, and COL15A1 (Fig. 4C).

Fig. 4

Transcription factor target networks in the top 3 modules using the iRegulon plugin. Blue octagon nodes represent the predicted transcription factors. Pink oval nodes represent the transcription factor-regulated genes.

Validation of key genes associated with NAFLD in vivo and in vitro

We screened the possible hub genes of NAFLD based on highly correlated topological algorithms from the PPI networks. The 5 hub genes, SOX9, CCL20, CXCL1, CD24, and CHST4, were identified (Table 1). To confirm the role of these hub genes in the liver during different stages of NAFLD, mice were fed an HFD for different periods. First, we detected fat deposition in the livers of HFD-fed mice. In addition, all the HFD-fed mice developed some form of hepatic steatosis. The oil red O staining showed more fat droplets and hepatocyte ballooning in the livers of the HFD-fed mice compared with the control group. These changes were more severe in mice fed HFD for 24 weeks than in mice fed HFD for 18 weeks (Fig. 5A). The percentage of total adipose tissue (as detected by the MRI scan) to body weight was higher in the 18-week HFD-fed mice than in the control mice [(14.20 ± 0.11) % vs. (4.72 ± 0.99) %, t = 15.95, P < 0.01], and higher still in the 24-week HFD-fed mice than in the 18-week HFD-fed mice [(15.60 ± 0.60) % vs. (14.20 ± 0.11) %, t = 3.791, P < 0.05] (Fig. 5B). These results indicate that HFD could successfully induce fat accumulation and lead to the development and progression of NAFLD in mice. Next, we determined the expression of the key genes in the livers of the HFD-induced NAFLD mice. The mRNA levels of Sox9, Ccl20, Cxcl1, and Chst4 in the liver were higher in mice fed an 18-week HFD than in the control mice, and their levels were further increased after 24-week HFD (Fig. 5C).

Fig. 5

Validation of the potential key genes in the livers of NAFLD mice and in cultured HepG2 cells exposed to glucolipotoxicity. C57BL/6N mice were fed a high fat diet (HFD) for 18 (n = 6) and 24 (n = 3) weeks. Age-matched C57BL/6N mice fed a normal diet (n = 3) were used as the control. (A) Oil red O staining of liver tissues. Scale bar = 50 μm. (B) The percentage of total adipose tissue (as detected by magnetic resonance imaging scan) to body weight. (C) Relative mRNA levels of hub genes in mouse liver tissues detected by real-time quantitative PCR (qPCR). (D) Relative mRNA levels of hub genes determined by qPCR in HepG2 cells cultured with palmitic acid (PA) or vehicle for 24 h (n = 3). (E) Relative mRNA levels of hub genes detected by qPCR in HepG2 cells cultured with advanced glycation end products (AGEs) or vehicle for 24 h (n = 3). Data are expressed as the means ± standard deviation. Statistical analysis was conducted using one-way ANOVA followed by the post-hoc Tukey–Kramer test. aP < 0.05 (vs. control). bP < 0.05 (vs. 18-week HFD exposure).

Given that NAFLD is strongly associated with abnormal metabolism of lipids and glucose, we further explored the expression levels of the hub genes in the cultured liver cell line HepG2 exposed to lipotoxic or glucotoxic conditions, induced by the application of PA or AGEs, respectively. The mRNA levels of SOX9, CCL20, CXCL1, and CHST4 were upregulated by high concentrations (500 and 1000 nmol/L) of PA (Fig. 5D). Furthermore, the mRNA levels of SOX9, CCL20, CD24, and CHST4 were upregulated by 100 μg/mL of AGEs (Fig. 5E). These results indicate that the suggested hub genes might be highly relevant to the development of NAFLD. Interestingly, CD24 and CCL20, the key genes involved in the progression of NAFLD, were also upregulated in the livers of patients with type 2 diabetes when the GSE15653 dataset (including 5 normal liver tissues and 9 liver samples from diabetic patients) was used for validation analysis (Supplementary Fig. S1).

Discussion

The prevalence of NAFLD, one of the most common chronic liver diseases, is increasing at an alarming pace globally. However, the pathogenesis of NAFLD is not completely understood. It has been suggested that NAFLD is strongly correlated with genetic components. In this study, we downloaded the GSE49541 dataset to obtain gene expression data of the advanced NAFLD liver tissues and compared them with mild NAFLD liver tissues. A total of 57 DEGs, 52 upregulated genes and 5 downregulated genes, were selected. Functional and enrichment analyses indicated that the DEGs were mainly enriched in the extracellular region, chemokine activity, and cell adhesion. KEGG pathway analysis demonstrated that the DEGs were only enriched in the chemokine signaling pathway. We identified SOX9, CCL20, CXCL1, CD24, and CHST4 as hub genes based on the PPI network analysis. Furthermore, we validated the upregulated expression of these hub genes in the livers of HFD-induced NAFLD mice and in cultured HepG2 cells exposed to glucolipotoxicity.

A total of 57 DEGs were chosen in this study. As the expression of a single gene is not sufficient to explain an entire biological process or the changes in biological phenotype, it is necessary to study the interaction of a series of genes and proteins. Enrichment analysis is fundamental for biological interpretation of experimental "omics" data. Our enrichment analysis revealed that DEGs were significantly enriched in extracellular process and cell adhesion in the cellular component and biological process classes, respectively. Extracellular processes, such as neutrophil extracellular traps, have been reported to participate in the inflammation associated with NASH. Some adhesion molecules promote leukocyte recruitment in the liver and exacerbate NAFLD. These results suggest that the extracellular region is the main pathological site for the aggravation of fatty liver phenotype and that cell adhesion, especially the adhesion of inflammatory factors, is the main biological process of the disease.

Next, we screened the hub genes associated with the progression of NAFLD. Through the PPI network analysis, SOX9, CCL20, CXCL1, CD24, and CHST4 were selected as the most common genes in 11 topological algorithms. SRY-box transcription factor 9 (SOX9) is mainly expressed in bile duct cells under physiological conditions. During the process of chronic liver injury, SOX9-positive cells act as facultative liver stem cells and are involved in liver regeneration. SOX9 is also highly expressed in hepatocellular carcinoma tissues, which is related to poor prognosis in the patients. In the present study, SOX9 was upregulated in the livers of HFD-induced NAFLD mice and in HepG2 cells exposed to PA or AGEs. These results suggest that SOX9 is involved in metabolic liver diseases and may serve as a potential biomarker to diagnose and assess the severity of NAFLD.

Liver steatosis is associated with the presence of many chemokines and active inflammatory cells, which is a sign of chronic inflammation. Our study showed that the DEGs were enriched in the chemokine pathway and activity in both the KEGG pathway and molecular function analyses. Moreover, C-C motif chemokine ligand 20 (CCL20) and C-X-C motif chemokine ligand 1 (CXCL1) were predicted as hub genes from the PPI network analysis. Furthermore, we found that the expression levels of Ccl20 and Cxcl1 were higher in the livers of HFD-induced NAFLD mice than in the control mice, and the mRNA levels of CCL20 and CXCL1 were upregulated by PA in HepG2 cells. Many studies in rodent models indicate that chemokines play a crucial role in NAFLD. The levels of CCL20 were increased in animal models of liver injury, especially in the acute-on-chronic condition. Results from a network meta-analysis showed that the concentrations of chemokines, including CCL20, in the NASH group were higher than those in the control group. Additionally, the CCL20 gene is one of the most upregulated transcripts observed in fibrosis associated with NAFLD, in comparison to normal conditions, which was further validated in a replication group. These results suggest that the CCL20 chemokine is a potential therapeutic target, and can be regarded as one of the most important chemokines involved in the mechanisms underlying NAFLD.

The cluster of differentiation 24 (CD24) and carbohydrate sulfotransferase 4 (CHST4) were the two other DEGs that we identified and validated in the livers of HFD-induced NAFLD mice and in HepG2 cells exposed to PA or AGEs. A previous study imported three GEO datasets of NAFLD samples (GSE66676, GSE49541, and GSE834521), and found that CD24 was the only gene co-expressed in all three datasets. In a cross-sectional study, liver tissue-transcriptome differences were evaluated in a subset of 25 mild-NAFLD and 20 NASH biopsies. Five identified DEGs, including CD24, were positively associated with disease severity and were found to be important classifiers of mild NAFLD and severe NAFLD. CD24-positive cells isolated from hepatocellular carcinoma cell lines exhibited stemness properties, such as self-renewal, chemotherapy resistance, metastasis, and tumorigenicity. These results indicate that CD24 may play a role in hepatocyte injury and promote regeneration during the development and progression of NAFLD. Another hub gene, CHST4, encodes a sulfotransferase, an enzyme that utilizes 3′-phospho-5′-adenylyl sulfate to catalyze the transfer of sulfate, ultimately generating ligands for L-selectin (SELL, Selectin L, a lymphocyte homing receptor). SELL ligands are highly expressed in endothelial cells and play a central role in lymphocyte homing at sites of inflammation. Therefore, our findings suggest that CHST4 may participate in the inflammation associated with NAFLD. To date, the precise functions and the underlying mechanisms of CD24 and CHST4 in NAFLD progression remain unclear.

All 12 transcription factors identified in the present study using transcription factor analysis were likely to be implicated in the progression of NAFLD. The transcription factors of the transcriptional enhanced associate (TEA) domain DNA-binding family (TEAD1, TEAD2, TEAD3, and TEAD4) regulate gene expression primarily through interaction with the transcriptional co-activator with PDZ-binding motif (TAZ). A previous study demonstrated that inhibiting liver TAZ in murine NASH models prevented or even reversed hepatic inflammation, hepatocyte death, and hepatic fibrosis, but not liver steatosis. Upregulation of Runt-related transcription factor 2 (Runx2) in activated murine hepatic stellate cells promoted hepatic infiltration of macrophages by increasing the expression of monocyte chemotactic protein 1. The involvement of other transcription factors, including HIVEP, ZNF, CBFB, and HOXA13, identified in this study has not been reported in liver diseases. The specific function of these transcription factors in NAFLD, especially in hepatic fibrosis, requires further research.

There are certain limitations in this study. First, the duration of HFD exposure in our animal model may not be long enough to induce severe NAFLD and to successfully compare the different lengths of HFD treatment in mice. Second, the sample size is relatively small. Larger sample sizes obtained from animal studies and prospective clinical cohort studies are warranted to verify the function of these hub genes.

In summary, we used bioinformatics analyses to identify 57 DEGs in mild and advanced NAFLD liver tissues. We identified SOX9, CCL20, CXCL1, CD24, and CHST4 as hub genes, and identified intersecting pathways involved in extracellular space, cell adhesion, and inflammation. Notably, we verified the upregulated expression of these hub genes in the livers of HFD-induced NAFLD mice and in HepG2 cells exposed to PA or AGEs. These hub genes may serve as biomarkers for advanced NAFLD stages and offer new insights into drug discovery. Nevertheless, further studies are required to clarify the detailed function and specific mechanisms of these hub genes in the development and progression of NAFLD.

Funding

This study was supported by grants from the (81830022 and 81970671).

Data supplied

Microarray data are available at NCBI GEO (accession number: GSE49541). Raw code is available at GitHub (https://github.com/JinFeng-bio/NAFLD).

Conflict of interest

None.

37 in total

Review 1. Sox9 and programming of liver and pancreatic progenitors.

Authors: Yoshiya Kawaguchi
Journal: J Clin Invest Date: 2013-05-01 Impact factor: 14.808

2. CCL20 mediates lipopolysaccharide induced liver injury and is a potential driver of inflammation and fibrosis in alcoholic hepatitis.

Authors: Silvia Affò; Oriol Morales-Ibanez; Daniel Rodrigo-Torres; José Altamirano; Delia Blaya; Dianne H Dapito; Cristina Millán; Mar Coll; Jorge M Caviglia; Vicente Arroyo; Juan Caballería; Robert F Schwabe; Pere Ginès; Ramón Bataller; Pau Sancho-Bru
Journal: Gut Date: 2014-01-10 Impact factor: 23.059

3. Non-alcoholic fatty liver disease and risk of incident chronic kidney disease: an updated meta-analysis.

Authors: Alessandro Mantovani; Graziana Petracca; Giorgia Beatrice; Alessandro Csermely; Amedeo Lonardo; Jörn M Schattenberg; Herbert Tilg; Christopher D Byrne; Giovanni Targher
Journal: Gut Date: 2020-12-10 Impact factor: 23.059

4. Thyroid hormone-related regulation of gene expression in human fatty liver.

Authors: Jussi Pihlajamäki; Tanner Boes; Eun-Young Kim; Farrell Dearie; Brian W Kim; Joshua Schroeder; Edward Mun; Imad Nasser; Peter J Park; Antonio C Bianco; Allison B Goldfine; Mary Elizabeth Patti
Journal: J Clin Endocrinol Metab Date: 2009-06-23 Impact factor: 5.958

Review 5. Current Concepts, Opportunities, and Challenges of Gut Microbiome-Based Personalized Medicine in Nonalcoholic Fatty Liver Disease.

Authors: S R Sharpton; B Schnabl; R Knight; R Loomba
Journal: Cell Metab Date: 2020-12-08 Impact factor: 31.373

6. GOMA: functional enrichment analysis tool based on GO modules.

Authors: Qiang Huang; Ling-Yun Wu; Yong Wang; Xiang-Sun Zhang
Journal: Chin J Cancer Date: 2012-12-07

7. Identification of Key Genes and Pathways in Pancreatic Cancer Gene Expression Profile by Integrative Analysis.

Authors: Wenzong Lu; Ning Li; Fuyuan Liao
Journal: Genes (Basel) Date: 2019-08-13 Impact factor: 4.096

Review 8. Implications of hydrogen sulfide in liver pathophysiology: Mechanistic insights and therapeutic potential.

Authors: Hai-Jian Sun; Zhi-Yuan Wu; Xiao-Wei Nie; Xin-Yu Wang; Jin-Song Bian
Journal: J Adv Res Date: 2020-05-17 Impact factor: 10.479

9. Microarray and its applications.

Authors: Rajeshwar Govindarajan; Jeyapradha Duraiyan; Karunakaran Kaliyappan; Murugesan Palanisamy
Journal: J Pharm Bioallied Sci Date: 2012-08

10. Chemokines in Non-alcoholic Fatty Liver Disease: A Systematic Review and Network Meta-Analysis.

Authors: Xiongfeng Pan; Atipatsa Chiwanda Kaminga; Aizhong Liu; Shi Wu Wen; Jihua Chen; Jiayou Luo
Journal: Front Immunol Date: 2020-09-18 Impact factor: 7.561
