Literature DB >> 30424473

Identification of Potential Key Genes Associated with Adipogenesis through Integrated Analysis of Five Mouse Transcriptome Datasets.

Song Zhang¹, Li Wang², Shijun Li³, Wenzhen Zhang⁴, Xueyao Ma⁵, Gong Cheng⁶, Wucai Yang⁷, Linsen Zan^8,9.

Abstract

Adipose tissue is the most important energy metabolism and secretion organ, and these functions are conferred during the adipogenesis process. However, the cause and the molecular events underlying adipogenesis are still unclear. In this study, we performed integrated bioinformatics analyses to identify vital genes involved in adipogenesis and reveal potential molecular mechanisms. Five mouse high-throughput expression profile datasets were downloaded from the Gene Expression Omnibus (GEO) database; these datasets contained 24 samples of 3T3-L1 cells during adipogenesis, including 12 undifferentiated samples and 12 differentiated samples. The five datasets were reanalyzed and integrated to select differentially expressed genes (DEGs) during adipogenesis via the robust rank aggregation (RRA) method. Functional annotation of these DEGs and mining of key genes were then performed. We also verified the expression levels of some potential key genes during adipogenesis. A total of 386 consistent DEGs were identified, with 230 upregulated genes and 156 downregulated genes. Gene Ontology (GO) analysis showed that the biological functions of the DEGs primarily included fat cell differentiation, lipid metabolic processes, and cell adhesion. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed that these DEGs were mainly associated with metabolic pathways, the peroxisome proliferator-activated receptor (PPAR) signaling pathway, regulation of lipolysis in adipocytes, the tumor necrosis factor (TNF) signaling pathway, and the FoxO signaling pathway. The 30 most closely related genes among the DEGs were identified from the protein⁻protein interaction (PPI) network and verified by real-time quantification during 3T3-L1 preadipocyte differentiation. In conclusion, we obtained a list of consistent DEGs during adipogenesis through integrated analysis, which may offer potential targets for the regulation of adipogenesis and treatment of adipose dysfunction.

Entities: CellLine Chemical Disease Gene Species

Keywords: adipogenesis; differentially expressed genes; gene expression omnibus; integrated bioinformatics; robust rank aggregation

Mesh：

Year: 2018 PMID： 30424473 PMCID： PMC6274731 DOI： 10.3390/ijms19113557

Source DB: PubMed Journal: Int J Mol Sci ISSN： 1422-0067 Impact factor: 5.923

1. Introduction

When preadipocytes differentiate into mature adipocytes during adipogenesis, adipose tissue gains its specific physiological functions: being the primary storage site for fatty acids and the largest endocrine organ. Much of our understanding of adipogenesis comes from in vitro studies of preadipocyte models (e.g., the 3T3-L1 and 3T3-F442A cell lines) [1,2]. Adipogenesis (adipocyte differentiation) is a multistep biological process that is highly controlled. The expression of several transcription factors and adipogenic genes during adipogenesis causes adipocyte development. During the terminal differentiation stage, the cell morphology changes dramatically (from fibroblastic to spherical), and preadipocytes synchronously gain the characteristics of mature adipocytes. Glucose transporters, enzymes involved in triglyceride metabolism, insulin receptors, and adipocyte-secreted products all show increased activity and quantity, which is necessary for lipid metabolic balance and the inflammatory response. In the past few decades, a number of regulatory factors related to adipogenic programs have been gradually unearthed. Some studies determined the gene expression profiles during adipogenesis, and a large number of differentially expressed genes (DEGs) related to the initiation of adipogenesis have been identified [3,4,5,6,7]. However, a frequent cause of confusion is the inconsistency of these results due to differences in sample batch, detection platforms, and data processing methods. Therefore, each independent experiment has certain limitations. We need to integrate these results to find credible DEGs that are stable in multiple independent studies using an unbiased approach. In this way, we can fully utilize multiple expression profiles obtained by high-throughput technology to find reliable and effective molecular targets. In this study, to identify DEGs associated with adipogenesis, the robust rank aggregation (RRA) approach was used to integrate multiple ranked gene lists [8]. The RRA method is a powerful ordering algorithm, and for all of the genes in the final ranked gene list, this method assigns a p-value to estimate significance probabilities indicating that the result is better than expected by chance [9]. The size of the p-value corresponds to the position and significance of the corresponding gene in the final ranked list, meaning that the smaller the p-value is, the higher the ranking position of the gene. In this study, to more accurately identify DEGs associated with adipogenesis, we reanalyzed five expression profile datasets from a public expression database and then integrated these results. With these DEGs, a number of subsequent bioinformatic analyses were performed. Here, we aimed to explore the main pathways and processes associated with adipocyte differentiation and provide key targets for regulating adipose dysfunction and metabolic disorders.

2. Results

2.1. Expression Profile Datasets and Identification of Differentially Expressed Genes (DEGs) Associated with Adipogenesis

Five expression profile datasets from the Gene Expression Omnibus (GEO) database were selected. In total, 24 samples were obtained, including 12 undifferentiated 3T3-L1 cell samples and 12 differentiated 3T3-L1 cell samples (Table 1). The GSE20696 [7] and GSE93637 [3] expression microarray datasets were standardized, and DEGs were screened with the limma package (|Log2 fold change (log2FC)| > 1, and corrected p-value < 0.05). The high-throughput sequencing datasets GSE50934 [6], GSE95533 [4] and GSE50612 [5] were mapped, and the expression level was calculated. DEGs were screened by the DESeq2 package (|log2FC| > 1 and corrected p-value < 0.05). The numbers of DEGs in these datasets are shown in Table 1. The cluster heatmaps of the top 40 DEGs from the two sets of sample data included in each of the five expression profile datasets are shown in Figure 1a–e.

Table 1

Undifferentiated and differentiated 3T3-L1 cell samples from the NCBI Gene Expression Omnibus (GEO) database used in this study.

Reference	GEO	Platform	Study Type	Differentiation Stage of Samples		Number of DEGs(Up/Down)
Reference	GEO	Platform	Study Type	Undifferentiated	Differentiated	Number of DEGs(Up/Down)
Mikkelsen, T.S. et al. (2010) [7]	GSE20696	GPL1261	microarray	GSM519581GSM519582	GSM519585GSM519586	2898(1425/1473)
Romero, M. et al. (2018) [3]	GSE93637	GPL1261	microarray	GSM2459304GSM2459305GSM2459306	GSM2459310GSM2459311GSM2459312GSM2459313	1119(591/528)
Duteil, D. et al. (2014) [6]	GSE50934	GPL13112	RNA-seq	GSM1232686GSM1232687GSM1232688	GSM1232689GSM1232690GSM1232691	2685(1552/1133)
Siersbæk, R. et al. (2017) [4]	GSE95533	GPL18480	RNA-seq	GSM2515916GSM2515917	GSM2515922GSM2515923	4416(2197/2219)
Al, A.H. et al. (2015) [5]	GSE50612	GPL13112	RNA-seq	GSM1224678GSM1224679	GSM1224682GSM1224683	3966(2170/1796)

Figure 1

Hierarchical clustering heatmap of the top 20 up- and down-regulated differentially expressed genes (DEGs) screened in each Gene Expression Omnibus (GEO) dataset. (a) GSE20696 data, (b) GSE93637 data, (c) GSE50934 data, (d) GSE95533 data, and (e) GSE50612 data. Each row represents one gene, and each column represents one sample. Red indicates that the expression of genes is relatively upregulated, and green indicates that the expression of genes is relatively downregulated.

2.2. Identification of DEGs Associated with Adipogenesis Using Integrated Bioinformatics

After the above five adipogenesis datasets were reanalyzed by the limma [10] or DESeq2 [11] package, five gene lists ranked according to log2FC value were obtained and then analyzed by RRA (|log2FC| > 1 and corrected p-value < 0.05). Through rank analysis, we identified 386 DEGs, with 230 upregulated genes and 156 downregulated genes. A heatmap of the top 20 up- and down-regulated genes is shown in Figure 2. The integrated analysis results for the top 20 up- and down-regulated DEGs are shown in Table 2.

Figure 2

Heatmap of the top 20 up- and down-regulated DEGs in the integrated analysis. Each row represents one gene, and each column represents one dataset. Red indicates that the expression of genes is relatively upregulated, and green indicates that the expression of genes is relatively downregulated. The number in each rectangle represents the value of log2FC.

Table 2

Top 20 up- and down-regulated DEGs identified during adipogenesis via integrated analysis.

Top 20 Upregulated Genes				Top 20 Downregulated Genes
Symbol	log₂FC	p-Value	Corrected p-Value	Symbol	log₂FC	p-Value	Corrected p-Value
Cidec	8.815805	2.15 × 10⁻¹⁵	5.34 × 10⁻¹¹	Bnc1	−4.27787	7.38 × 10⁻¹²	1.83 × 10⁻⁷
Retn	8.50511	5.14 × 10⁻¹⁵	1.28 × 10⁻¹⁰	Pcdh18	−3.56837	8.39 × 10⁻¹¹	2.09 × 10⁻⁶
G0s2	8.359012	2.39 × 10⁻¹⁴	5.94 × 10⁻¹⁰	Nrg1	−3.52592	1.30 × 10⁻¹⁰	3.24 × 10⁻⁶
Adipoq	9.012189	1.81 × 10⁻¹³	4.51 × 10⁻⁹	Ogn	−3.95531	2.77 × 10⁻¹⁰	6.89 × 10⁻⁶
Cfd	8.441624	1.81 × 10⁻¹³	4.51 × 10⁻⁹	Mmp9	−3.64602	3.46 × 10⁻¹⁰	8.60 × 10⁻⁶
Adig	7.229626	1.17 × 10⁻¹²	2.90 × 10⁻⁸	Tubb2b	−2.96698	3.77 × 10⁻¹⁰	9.37 × 10⁻⁶
Orm1	7.280763	9.69 × 10⁻¹²	2.41 × 10⁻⁷	Plat	−3.95792	4.36 × 10⁻¹⁰	1.09 × 10⁻⁵
Dgat2	6.066281	1.06 × 10⁻¹¹	2.63 × 10⁻⁷	Slit2	−3.3568	7.55 × 10⁻¹⁰	1.88 × 10⁻⁵
Mc2r	6.186546	1.10 × 10⁻¹¹	2.75 × 10⁻⁷	Rerg	−3.95606	1.10 × 10⁻⁹	2.73 × 10⁻⁵
Lgals12	6.873706	1.67 × 10⁻¹¹	4.15 × 10⁻⁷	Thbs1	−3.27764	1.16 × 10⁻⁹	2.87 × 10⁻⁵
Mrap	7.563215	2.17 × 10⁻¹¹	5.39 × 10⁻⁷	Nfkbie	−2.84309	1.82 × 10⁻⁹	4.53 × 10⁻⁵
Acsl1	6.016773	3.73 × 10⁻¹¹	9.29 × 10⁻⁷	Il17rd	−3.02351	2.32 × 10⁻⁹	5.77 × 10⁻⁵
Wfdc21	7.632637	4.86 × 10⁻¹¹	1.21 × 10⁻⁶	Il1rl1	−3.70607	2.50 × 10⁻⁹	6.21 × 10⁻⁵
Syn2	5.030408	8.88 × 10⁻¹¹	2.21 × 10⁻⁶	Tnc	−3.78602	4.22 × 10⁻⁹	1.05 × 10⁻⁴
Car3	5.021359	1.14 × 10⁻¹⁰	2.83 × 10⁻⁶	Ptger4	−3.15885	4.22 × 10⁻⁹	1.05 × 10⁻⁴
Fabp4	6.257684	1.34 × 10⁻¹⁰	3.32 × 10⁻⁶	Fbn1	−2.96534	4.99 × 10⁻⁹	1.24 × 10⁻⁴
Cd36	7.115236	1.48 × 10⁻¹⁰	3.69 × 10⁻⁶	Atp1b1	−4.18269	5.09 × 10⁻⁹	1.27 × 10⁻⁴
Hp	6.709448	1.60 × 10⁻¹⁰	3.98 × 10⁻⁶	Ptprv	−3.41098	6.87 × 10⁻⁹	1.71 × 10⁻⁴
Pnpla2	5.031089	1.77 × 10⁻¹⁰	4.40 × 10⁻⁶	Siglecg	−3.88948	7.62 × 10⁻⁹	1.90 × 10⁻⁴
Gpd1	6.752793	2.03 × 10⁻¹⁰	5.05 × 10⁻⁶	Pla2g7	−3.91415	7.64 × 10⁻⁹	1.90 × 10⁻⁴

2.3. Enrichment Analysis of the Gene Ontology (GO) Terms of the DEGs

Significant results from the Gene Ontology (GO) term analysis of DEGs associated with adipogenesis are shown in Table 3. The enrichment results for the biological process (BP) GO category showed that the upregulated genes were primarily involved in fat cell differentiation, lipid metabolic processes, and oxidation-reduction processes. The downregulated genes were primarily concentrated in cell adhesion, positive regulation of cell-substrate adhesion, and positive regulation of neuron projection development. The enrichment results for cell composition (CC) GO category showed that the upregulated genes were primarily concentrated in mitochondria, lipid particles, and the cytosol. The downregulated genes were mainly enriched in extracellular regions, the proteinaceous extracellular matrix, and the extracellular matrix. The enrichment results for the molecular function (MF) GO category showed that the upregulated genes were mainly enriched in oxidoreductase activity. The downregulated genes were mainly enriched in heparin binding, calcium ion binding, and integrin binding. Moreover, the correspondence between genes and biological processes is shown in Figure 3.

Table 3

Gene Ontology (GO) term enrichment analysis of DEGs associated with adipogenesis.

DEGs	Term	Category	Count	FDR
up	brown fat cell differentiation	BP	19	1.63 × 10⁻²³
	lipid metabolic process	BP	27	3.04 × 10⁻⁸
	oxidation-reduction process	BP	32	6.77 × 10⁻⁸
	metabolic process	BP	23	3.03 × 10⁻⁵
	fat cell differentiation	BP	10	4.76 × 10⁻⁴
	lipid storage	BP	7	8.27 × 10⁻⁴
	triglyceride catabolic process	BP	6	4.12 × 10⁻³
	glucose homeostasis	BP	10	3.70 × 10⁻²
	white fat cell differentiation	BP	5	4.44 × 10⁻²
	mitochondrion	CC	52	9.65 × 10⁻⁸
	lipid particle	CC	12	2.11 × 10⁻⁷
	cytosol	CC	42	8.61 × 10⁻³
	intracellular membrane-bounded organelle	CC	24	1.60 × 10⁻²
	mitochondrial membrane	CC	9	2.25 × 10⁻²
	oxidoreductase activity	MF	27	3.98 × 10⁻⁶
down	cell adhesion	BP	20	8.40 × 10⁻⁶
	positive regulation of cell-substrate adhesion	BP	7	1.19 × 10⁻³
	positive regulation of neuron projection development	BP	9	1.74 × 10⁻²
	extracellular region	CC	51	4.64 × 10⁻¹⁴
	proteinaceous extracellular matrix	CC	24	2.78 × 10⁻¹³
	extracellular matrix	CC	21	1.28 × 10⁻¹⁰
	extracellular space	CC	41	1.82 × 10⁻⁹
	basement membrane	CC	11	4.33 × 10⁻⁶
	calcium ion binding	MF	24	4.46 × 10⁻⁶
	integrin binding	MF	9	1.34 × 10⁻³
	heparin binding	MF	13	2.56 × 10⁻⁶

Figure 3

Distribution of genes that are up- and down-regulated during adipogenesis for different GO biological processes.

2.4. Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Analysis of DEGs

The top 20 significant pathways for the upregulated genes were selected (Table 4). The upregulated genes were mainly enriched in metabolic pathways, the proliferator-activated receptor (PPAR) signaling pathway, and regulation of lipolysis in adipocytes, and were also enriched in some adipogenesis-related pathways, including the AMP-activated protein kinase (AMPK) signaling pathway, fat digestion and absorption, adipocytokine signaling pathway, and insulin signaling pathway. On the other hand, 11 significant pathways for the downregulated genes were filtered out (Table 4). Except for some diseases and cancer-related pathways, the downregulated genes were mainly enriched in the tumor necrosis factor (TNF) signaling pathway, forkhead box O (FoxO) signaling pathway, and some fundamental biochemical processes (such as focal adhesion, endocrine and other factor-regulated calcium reabsorption, and glycosaminoglycan biosynthesis—keratan sulfate). Cytoscape was used to visualize the relationship between genes and pathways. The genes and pathway nodes are represented by circles. The results are shown in Figure 4.

Table 4

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of DEGs in adipogenesis.

DEGs	Term	Count	FDR
up	Metabolic pathways	42	1.34 × 10⁻¹⁶
	PPAR signaling pathway	15	3.33 × 10⁻¹⁵
	Regulation of lipolysis in adipocytes	9	1.20 × 10⁻⁸
	AMPK signaling pathway	10	5.06 × 10⁻⁷
	Valine, leucine and isoleucine degradation	7	3.31 × 10⁻⁶
	Non-alcoholic fatty liver disease (NAFLD)	9	2.05 × 10⁻⁵
	Propanoate metabolism	5	5.31 × 10⁻⁵
	Glycerolipid metabolism	6	5.31 × 10⁻⁵
	Metabolism of xenobiotics by cytochrome P450	6	7.77 × 10⁻⁵
	Glycolysis/Gluconeogenesis	6	7.77 × 10⁻⁵
	Pyruvate metabolism	5	8.57 × 10⁻⁵
	Nitrogen metabolism	4	8.57 × 10⁻⁵
	Insulin resistance	7	8.57 × 10⁻⁵
	Fat digestion and absorption	5	8.57 × 10⁻⁵
	Adipocytokine signaling pathway	6	8.90 × 10⁻⁵
	Carbon metabolism	7	1.09 × 10⁻⁴
	Proximal tubule bicarbonate reclamation	4	1.63 × 10⁻⁴
	Fatty acid degradation	5	1.65 × 10⁻⁴
	Glucagon signaling pathway	6	4.20 × 10⁻⁴
	Biosynthesis of amino acids	5	1.26 × 10⁻³
down	Malaria	4	7.99 × 10⁻³
	Transcriptional misregulation in cancer	6	7.99 × 10⁻³
	Rheumatoid arthritis	4	2.24 × 10⁻²
	Bladder cancer	3	2.67 × 10⁻²
	TNF signaling pathway	4	3.24 × 10⁻²
	Focal adhesion	5	3.24 × 10⁻²
	Endocrine and other factor-regulated calcium reabsorption	3	3.24 × 10⁻²
	Glycosaminoglycan biosynthesis—keratan sulfate	2	3.51 × 10⁻²
	Synaptic vesicle cycle	3	3.51 × 10⁻²
	FoxO signaling pathway	4	3.51 × 10⁻²
	Bile secretion	3	4.35 × 10⁻²

Figure 4

Distribution of up- and down-regulated genes associated with adipogenesis in different KEGG pathways. The upregulated genes (red), downregulated genes (green), and pathway nodes (yellow) are represented by circles.

2.5. Protein–Protein Interaction (PPI) Network Construction and Module Analysis of DEGs

To analyze the interaction among DEG expression products, the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database [12] was used to construct a PPI network. A total of 247 nodes and 751 edges were obtained with a combined score >0.4, as shown in Figure 5a (isolated nodes were ignored). The top 30 highest degree nodes are shown in Figure 5b. Among these genes, adiponectin, C1Q and collagen domain containing (Adipoq) showed the highest node degree, which was 32. Then, nine modules were selected using the plug-in Molecular Complex Detection (MCODE) to screen the above PPI network. The top 4 modules are shown in Figure 5c–f. In addition, functional annotations for these top 30 genes (only leucine-rich repeat kinase 1 (Lrrk1), actin, alpha 2, smooth muscle, aorta (Acta2), matrix metallopeptidase 9 (Mmp9), and tissue inhibitor of metalloproteinase 1 (Timp1) were downregulated) were implemented (Table 5). GO analysis showed that the genes were mainly related to lipid metabolism, oxidation-reduction, and fat cell differentiation. KEGG analysis showed that they were mainly associated with the PPAR signaling pathway, metabolism pathways, and the AMPK signaling pathway.

Figure 5

Protein–protein interaction (PPI) network construction and module analysis of DEGs associated with adipogenesis. (a) Using Cytoscape software, the PPI network was visualized (isolated nodes were removed). The node size represents the node degree (a larger size indicates a higher degree). The width and transparency of the edge indicate the combined score of the edge (a wider or more opaque edge indicates a higher combined score). (b) Top 30 genes with the highest degrees in the PPI network. (c–f) Molecular Complex Detection (MCODE) module screening for the DEGs, including module 1 (score = 11), module 2 (score = 6), module 3 (score = 5.467), and module 4 (score = 4.6).

Table 5

Enrichment analysis of the top 30 genes with the highest degrees.

Method	Term	Count	FDR	Genes
GO	metabolic process	10	3.98 × 10⁻⁵	Dbt, Aldh1l1, Acsl1, Acadm, Bckdhb, Aldh3b2, Pnpla2, Aldh1a7, Pdhb, Lipe
	oxidation-reduction process	9	1.12 × 10⁻²	Ldha, Aldh1l1, Acadm, Adhfe1, Adh1, Bckdhb, Aldh3b2, Aldh1a7, Pdhb
	lipid storage	4	1.51 × 10⁻²	Dgat1, Cd36, Dgat2, Pnpla2
	brown fat cell differentiation	4	3.07 × 10⁻²	Slc2a4, Pparg, Fabp4, Adipoq
	negative regulation of sequestering of triglyceride	3	3.54 × 10⁻²	Pparg, Abhd5, Pnpla2
KEGG	PPAR signaling pathway	8	2.89 × 10⁻¹³	Acadm, Adipoq, Fabp4, Cd36, Acsl1, Pck1, Pparg, Nr1h3
	Metabolic pathways	13	1.61 × 10⁻¹⁰	Acadm, Adh1, Bckdhb, Dbt, Dgat1, Acsl1, Ldha, Pck1, Aldh1a7, Pnpla2, Dgat2, Pdhb, Aldh3b2
	AMPK signaling pathway	6	1.88 × 10⁻⁸	Adipoq, Cd36, Lipe, Pck1, Pparg, Slc2a4
	Glycolysis/Gluconeogenesis	5	3.90 × 10⁻⁸	Adh1, Ldha, Pck1, Pdhb, Aldh3b2
	Adipocytokine signaling pathway	5	5.04 × 10⁻⁸	Adipoq, Cd36, Acsl1, Pck1, Slc2a4

2.6. Verification of Changes in DEG Expression During 3T3-L1 Preadipocyte Differentiation

We believe that the top 30 highest degree genes in the PPI network may play a key regulatory role in the process of adipogenic differentiation. Here, we measured their expression levels before and after 3T3-L1 preadipocyte differentiation. First, we induced differentiation of 3T3-L1 preadipocytes and performed oil red O staining (Figure 6a). The results showed that the lipid droplets were numerous and large after differentiation. Next, we determined the expression levels of 17 potential key genes in undifferentiated and differentiated 3T3-L1 cells through quantitative real-time PCR (qRT-PCR) (Figure 6b). The results showed that their expression varied significantly during adipogenesis. The direction of change was consistent with the results of the previous integrated bioinformatic analysis.

Figure 6

Verification of differential expression of key genes during 3T3-L1 preadipocyte differentiation. (a) Undifferentiated (Day 0) and differentiated (Day 8, Oil red O staining) 3T3-L1 cells under the induced differentiation process. (b) Relative expression levels of potential key genes were measured in undifferentiated and differentiated 3T3-L1 cells by qRT-PCR. Three independent experiments were performed for all of these groups, and the quantitative results are expressed as the mean ± SD of three independent experiments. ** indicates p < 0.01.

3. Discussions

Adipose tissue is composed of many cell types, and mature adipocytes account for only two-thirds of adipose tissue. Undifferentiated cells are also found in adipose tissue, including preadipocytes and stem cells. Stem cells have the potential to differentiate into various types of cells, and the direction of the differentiation of preadipocytes has been determined. Proliferation and differentiation of preadipocytes are essential for the continued development and maintenance of adipose tissue. Spalding et al. found that almost 50% of human subcutaneous fat is renewed every 8 years, suggesting that adipocytes are a dynamic cell type that undergoes constant substitution by newborn adipocytes [13]. In other words, adipogenesis fundamentally determines the expansion and functional properties of adipose tissue. Considering the occurrence of increases in public health problems caused by adipose dysfunction and metabolic disorders (such as obesity, diabetes mellitus, insulin resistance, and cardiovascular disease) on a global scale, it is necessary to explore the fundamental molecular mechanism of adipogenesis [14]. With the development of high-throughput technology, a large amount of transcriptome data is generated and uploaded to a public expression database. Fully exploiting these large datasets can provide good value to life science research. Given the inevitable errors among independent experiments, we urgently need to integrate the results of various experiments to more accurately identify the intrinsic components and elucidate the major molecular mechanisms. In this study, we integrated five gene expression profile datasets of the adipogenesis processes from different independent experiments. A total of 386 DEGs were identified, including 230 upregulated genes and 156 downregulated genes. The upregulated gene list contained many fat marker genes (such as Adipoq, peroxisome proliferator activated receptor gamma (Pparg), and solute carrier family 2 (facilitated glucose transporter), member 4 (Slc2a4)), and a large number of adipogenic differentiation studies have been carried out around them. The downregulated gene list includes the well-known antiadipogenesis gene delta-like 1 (Dlk1), but many of the other genes related to adipogenic differentiation have rarely been reported. In general, the role of these genes with dramatic changes in expression during adipogenesis deserves a deeper understanding. This study provides a reliable collection of DEGs associated with adipogenic processes, providing a large number of potential subjects for subsequent research. In addition, functional annotation of DEGs was performed to fully understand the processes and pathways in which they participate. GO analysis of the upregulated genes revealed that adipocyte differentiation and concomitant activities, such as lipid metabolism and redox activity, were major biological processes in which they were involved. The KEGG analysis of upregulated genes showed consistent results, with the PPAR signaling pathways and lipid metabolism-related pathways (such as metabolic pathways and regulation of lipolysis in adipocytes) playing a dominant role. The PPAR signaling pathway has been extensively studied as a core pathway in the process of adipogenic differentiation. Peroxisome proliferator-activated receptor gamma (PPARG, encoded by Pparg) is the master regulator of adipose differentiation; its expression is necessary for initiating differentiation and maintaining a differentiated state, and a large number of prodifferentiation factors function through its activation [15]. The GO and KEGG analyses of downregulated genes also showed consistent results, with biological processes and pathways involved in adhesion and activity of the extracellular matrix (ECM) predominating. Phenotypic changes in cells during fat differentiation were apparent and were accompanied by changes in the levels and type of ECM components [16], and proteolytic degradation of preadipocytes by ECM components is required for cell-shape changes, adipocyte-specific gene expression, and lipid accumulation [17]. On the other hand, downregulated genes are also enriched in some antiadipogenic differentiation pathways, such as the TNF signaling pathway [18] and the FoxO signaling pathway [19]. A PPI network of DEG-encoded proteins was constructed and the 30 most closely related genes were selected. Further functional annotation of these key genes was performed, and the results were consistent with the functional annotations of the entire set of DEGs. They were mainly concentrated in the same biological processes and pathways involved in adipogenesis. The findings further suggested that the 30 key genes that we screened were credible because the 30 key genes were sufficient to represent all the DEGs and reflect the main activities of adipogenesis. We believe that these genes with the highest node degree play a major regulatory role in multiple processes of adipogenic differentiation. Some of these genes are well-known as key genes in fat differentiation and lipid metabolism functions, such as the transcriptional regulation of adipogenesis (Pparg), insulin sensitive glucose transport (Slc2a4) [20], fatty acid transport (fatty acid binding protein 4 (Fabp4), CD36 Molecule (Cd36)) [21,22], triacylglycerol (TAG) synthesis (diacylglycerol O-acyltransferase 1 (Dgat1), diacylglycerol O-acyltransferase 1 (Dgat2), acyl-CoA synthetase long-chain family member 1 (Acsl1)) [23,24,25,26], lipolysis and its regulation (lipase, hormone sensitive (Lipe), patatin-like phospholipase domain containing 2 (Pnpla2), abhydrolase domain containing 5 (Abhd5)) [27,28], and the endocrine functions of adipocytes (Adipoq, complement factor D (Cfd), resistin (Retn)) [29,30]. In addition, tissue inhibitor of metalloproteinases-1 (TIMP1, encoded by Timp1) is a natural inhibitor of matrix metalloproteinases (MMPs) [31], a cluster of peptidases that are involved in degrading and remodeling the ECM [32,33]. MMP9 (a member of the MMP family, encoded by Mmp9) is secreted by adipocytes, and its proteolytic activity was induced during adipocyte differentiation [34]. The Nr1h3 gene encodes a nuclear receptor, Nuclear Receptor Subfamily 1 Group H Member 3 (NR1H3, also known as LXRα), which is a vital regulator in hepatic de novo lipogenesis and lipid homeostasis [35]. NR1H3 exhibits ligand-dependent activation activity and is activated primarily by cellular cholesterols, leading to the induction of transcription of a downstream nuclear transcriptional factor, sterol regulatory element binding transcription factor 1 (SREBF1) [36]. The Pck1 gene is mainly responsible for the regulation of gluconeogenesis [37] and is closely related to diabetes mellitus and obesity [38]. Cytosolic isozyme of phosphoenolpyruvate carboxykinase (PEPCK-C, encoded by Pck1) stimulates the transformation of oxaloacetate to phosphoenolpyruvate (PEP) in fat cells [39]. As a member of the aldehyde dehydrogenase 3 (ALDH3) family, aldehyde dehydrogenase 3 family member B2 (ALDH3B2, encoded by Aldh3b2) is localized in lipid droplets and is responsible for the removal of lipid aldehydes, exhibiting broad substrate specificity towards medium- and long-chain aldehydes [40]. The remaining genes with high node degrees, such as Lrrk1, aldehyde dehydrogenase 1 family, member L1 (Aldh1l1), and alcohol dehydrogenase, iron containing, 1 (Adhfe1), are very rarely reported to be related to adipogenesis. These genes deserve more attention from frontline researchers, and their potential role in adipogenesis requires further experimental verification. Finally, we verified the expression changes in some key genes before and after 3T3-L1 preadipocyte differentiation. Combining the results of integrated bioinformatic analyses with the results of gene expression measurements, this study has greatly narrowed the range of potential key genes and provides high-value targets for subsequent adipogenic differentiation research.

4. Materials and Methods

4.1. Gene Expression Profile Data

The gene expression profile datasets GSE50934, GSE95533, GSE50612, GSE93637, and GSE20696 were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The platforms of the five GEO datasets were GPL13112 Illumina HiSeq 2000 (Mus musculus) for the GSE50934 and GSE50612 datasets, GPL18480 Illumina HiSeq 1500 (Mus musculus) for the GSE95533 dataset, and GPL1261 [Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array for the GSE93637 and GSE20696 datasets.

4.2. Identification of DEGs

For the raw data of GSE50934, GSE95533, and GSE50612, reads were mapped to the mouse genome (GRCm38) using HiSat2 [41], and annotated genes were quantified with featureCounts [42]. Differential expression analysis was performed using the DESeq2 R package. For the raw data of GSE93637 and GSE20696, normalization and differential expression analysis were performed using the limma R package. DESeq2 and limma R package were all from the Bioconductor project (https://www.bioconductor.org/). All the R packages used in this study were deployed in the programming language R (version 3.3.3, Auckland, New Zealand).

4.3. Integration of Gene Expression Profile Data

The RRA method was based on the assumption that all genes are unordered in each list. In this study, only the overlapping DEGs were used for the integrated analysis, and the five lists of genes were ranked by expression level in the five datasets. The RobustRankAggreg package of programming language R was downloaded from the Comprehensive R Network (https://cran.r-project.org/).

4.4. GO Term and KEGG Pathway Enrichment Analyses

For functional annotation of the DEGs, GO term enrichment analysis was performed using the online Visualization and Integrated Discovery (DAVID) software (https://david.ncifcrf.gov/) [43,44], for the three GO categories (BP, CC and MF). In addition, KEGG pathway enrichment analysis was carried out using KEGG Orthology-Based Annotation System (KOBAS) software (version 3.0, Peking, China (http://kobas.cbi.pku.edu.cn/)) [45,46]. Significant enrichment was considered for a corrected p-value (FDR) < 0.05.

4.5. PPI Network Construction and Module Analysis

The list of genes was mapped to the STRING database (http://www.string-db.org/) to construct a functional protein association network. Then, the PPI network was visualized with Cytoscape software (version 3.6.0, Washington, DC, USA) (http://www.cytoscape.org/). The degree of a node is the number of edges (interactions) incident to that node. A node is important if it links to many other nodes. The genes at the top of the degree distribution (≥90% percentile) in the network were defined as key genes (central genes). The plug-in MCODE (http://apps.cytoscape.org/apps/mcode) was used to scan the PPI network to identify densely connected regions. A p-value < 0.05 was considered statistically significant.

4.6. Cell Culture, Differentiation, and Lipid Droplet Staining

The culture and differentiation of 3T3-L1 preadipocytes were performed as previously described in Reference [47]. For lipid droplet (LD) staining, cells were washed with phosphate buffer saline (PBS) three times and then fixed in 4% paraformaldehyde for 30 min at room temperature. After washing three times with PBS, the cells were stained with a 60% filtered oil red O (Sigma, St. Louis, MO, USA) stock solution (0.3 g/100 mL of isopropanol). The cells were then rinsed three times with PBS. All images were obtained using an inverted fluorescence microscope FV500-IX71 (Olympus, Tokyo, Japan).

4.7. qRT-PCR Validation of Key Genes

Total RNA from undifferentiated and differentiated 3T3 cells was extracted using RNAiso (Takara, Dalian, China). A PrimeScriptTM RT Reagent Kit with gDNA Eraser (Takara) was used to perform reverse transcription of total RNA. A SYBR® Premix Ex TaqTM ΙΙ Kit (Takara) was used to carry out qRT-PCR with an ABI 7500 Real-Time PCR system (Applied Biosystems, Foster, CA, USA). Gapdh served as an internal reference to normalize gene expression levels via the 2−∆∆ method, as in Reference [48]. Some of the key genes whose expression trends during the adipogenesis of 3T3-L1 preadipocytes were known were excluded, and qRT-PCR validation was performed for the remaining genes.

4.8. Statistical Analysis

All the quantitative experiments were performed three times, and the quantitative results were expressed as the mean ± SD. Comparisons between two sets of groups were analyzed using the Student’s two-tailed t-test with GraphPad Prism 7. Statistical significance depended on a value of p < 0.05 (significant) or p < 0.01 (extremely significant).

5. Conclusions

Given the limitations of independent experiments, analyzing different datasets often yields different results, especially when performing differential expression analysis. In this study, we used the RRA method to integrate the results of five high-throughput datasets well, and we obtained consistent DEGs showing an association with adipogenesis. This work will contribute to the study of adipogenesis in the future, thus promoting our understanding of the molecular mechanisms underlying adipogenesis, and it may offer potential targets for the regulation of adipogenesis and the treatment of adipose dysfunction. Further experimental verification is needed to explore the specific functions of these DEGs.

48 in total

1. MicroRNA-139-5p Suppresses 3T3-L1 Preadipocyte Differentiation Through Notch and IRS1/PI3K/Akt Insulin Signaling Pathways.

Authors: Lin Mi; Yaosheng Chen; Xueli Zheng; Youlei Li; Qiangling Zhang; Delin Mo; Gongshe Yang
Journal: J Cell Biochem Date: 2015-07 Impact factor: 4.429

2. A plasma kallikrein-dependent plasminogen cascade required for adipocyte differentiation.

Authors: S Selvarajan; L R Lund; T Takeuchi; C S Craik; Z Werb
Journal: Nat Cell Biol Date: 2001-03 Impact factor: 28.824

3. TP53INP2 regulates adiposity by activating β-catenin through autophagy-dependent sequestration of GSK3β.

Authors: Montserrat Romero; Alba Sabaté-Pérez; Víctor A Francis; Ignacio Castrillón-Rodriguez; Ángels Díaz-Ramos; Manuela Sánchez-Feutrie; Xavier Durán; Manuel Palacín; José María Moreno-Navarrete; Birgit Gustafson; Ann Hammarstedt; José Manuel Fernández-Real; Joan Vendrell; Ulf Smith; Antonio Zorzano
Journal: Nat Cell Biol Date: 2018-03-28 Impact factor: 28.824

4. DGAT2 is a new diacylglycerol acyltransferase gene family: purification, cloning, and expression in insect cells of two polypeptides from Mortierella ramanniana with diacylglycerol acyltransferase activity.

Authors: K D Lardizabal; J T Mai; N W Wagner; A Wyrick; T Voelker; D J Hawkins
Journal: J Biol Chem Date: 2001-07-31 Impact factor: 5.157

5. Cloning of DGAT2, a second mammalian diacylglycerol acyltransferase, and related family members.

Authors: S Cases; S J Stone; P Zhou; E Yen; B Tow; K D Lardizabal; T Voelker; R V Farese
Journal: J Biol Chem Date: 2001-07-31 Impact factor: 5.157

6. FoxO1 antagonist suppresses autophagy and lipid droplet growth in adipocytes.

Authors: Longhua Liu; Louise D Zheng; Peng Zou; Joseph Brooke; Cayleen Smith; Yun Chau Long; Fabio A Almeida; Dongmin Liu; Zhiyong Cheng
Journal: Cell Cycle Date: 2016-06-03 Impact factor: 4.534

7. Oxyresveratrol ameliorates nonalcoholic fatty liver disease by regulating hepatic lipogenesis and fatty acid oxidation through liver kinase B1 and AMP-activated protein kinase.

Authors: Ju-Hee Lee; Su Youn Baek; Eun Jeong Jang; Sae Kwang Ku; Kyu Min Kim; Sung Hwan Ki; Chang-Eop Kim; Kwang Il Park; Sang Chan Kim; Young Woo Kim
Journal: Chem Biol Interact Date: 2018-04-24 Impact factor: 5.192

8. Robust rank aggregation for gene list integration and meta-analysis.

Authors: Raivo Kolde; Sven Laur; Priit Adler; Jaak Vilo
Journal: Bioinformatics Date: 2012-01-12 Impact factor: 6.937

9. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens.

Authors: Wei Li; Han Xu; Tengfei Xiao; Le Cong; Michael I Love; Feng Zhang; Rafael A Irizarry; Jun S Liu; Myles Brown; X Shirley Liu
Journal: Genome Biol Date: 2014 Impact factor: 13.583

10. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible.

Authors: Damian Szklarczyk; John H Morris; Helen Cook; Michael Kuhn; Stefan Wyder; Milan Simonovic; Alberto Santos; Nadezhda T Doncheva; Alexander Roth; Peer Bork; Lars J Jensen; Christian von Mering
Journal: Nucleic Acids Res Date: 2016-10-18 Impact factor: 16.971

3 in total

1. Identification of Robust and Key Differentially Expressed Genes during C2C12 Cell Myogenesis Based on Multiomics Data.

Authors: Song Zhang; Yuanyuan Zhang; Choulin Chen; Qingqing Hu; Yang Fu; Lingna Xu; Chao Wang; Yuwen Liu
Journal: Int J Mol Sci Date: 2022-05-26 Impact factor: 6.208

2. Regulatory network of metformin on adipogenesis determined by combining high-throughput sequencing and GEO database.

Authors: Zhicong Zhao; Chenxi Wang; Jue Jia; Zhaoxiang Wang; Lian Li; Xia Deng; Zhensheng Cai; Ling Yang; Dong Wang; Suxian Ma; Li Zhao; Zhigang Tu; Guoyue Yuan
Journal: Adipocyte Date: 2022-12 Impact factor: 4.534

3. Identification of KIF4A and its effect on the progression of lung adenocarcinoma based on the bioinformatics analysis.

Authors: Yexun Song; Wenfang Tang; Hui Li
Journal: Biosci Rep Date: 2021-01-29 Impact factor: 3.840

3 in total