Machine learning and bioinformatics to identify 8 autophagy-related biomarkers and construct gene regulatory networks in dilated cardiomyopathy.

Fengjun Zhang1, Mingyue Xia2, Yuan Liu3, Min Peng4, Jiarong Jiang5, Shuai Wang6, Qiong Zhao7, Cheng Yu8, Jinzhen Yu9, Dexian Xian1, Xiao Li10, Lin Zhang11.   

Abstract

Dilated cardiomyopathy (DCM) is a condition of adverse ventricular remodeling and impaired systolic and diastolic function that is often complicated by arrhythmias and heart failure and has a poor prognosis. This study attempted to identify autophagy-related genes (ARGs) as diagnostic biomarkers of DCM using machine learning and bioinformatics approaches. Differential analysis of whole-genome microarray data of DCM from the Gene Expression Omnibus (GEO) database was performed using the NetworkAnalyst 3.0 platform. Differentially expressed genes (DEGs) meeting the thresholds (|log2FoldChange| ≥ 0.8, p value < 0.05) were obtained from the GSE4172 dataset and intersected with ARGs merged from the autophagy gene libraries HADb and HAMdb to obtain autophagy-related differentially expressed genes (AR-DEGs) in DCM. Correlation analysis of the AR-DEGs and its visualization were performed in R. Gene Ontology (GO) enrichment analysis and combined multi-database pathway analysis were performed with the Enrichr online enrichment analysis platform. We used machine learning to screen the diagnostic biomarkers of DCM. The transcription factor-gene regulatory network was constructed from the JASPAR database on the NetworkAnalyst 3.0 platform. We also used the Drug Signatures Database (DSigDB) on the Enrichr platform to screen gene-targeted drugs for DCM. Finally, we used the DisGeNET database to analyze the comorbidities associated with DCM. In the present study, we identified 23 AR-DEGs of DCM. Eight molecular markers of DCM (PLEKHF1, HSPG2, HSF1, TRIM65, DICER1, VDAC1, BAD, TFEB) were obtained by two machine learning algorithms. A transcription factor-gene regulatory network was established. Finally, 10 gene-targeted drugs and the comorbidities of DCM were identified.
© 2022. The Author(s).

Year:  2022        PMID: 36056063      PMCID: PMC9440113          DOI: 10.1038/s41598-022-19027-5

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.996


Introduction

Dilated cardiomyopathy (DCM), which manifests clinically as ventricular dilatation and progressive impairment of systolic and diastolic function, is one of the most prevalent diseases worldwide. Its etiology is heterogeneous and includes viral infections, inflammatory reactions, and genetic factors[1]. It can also cause arrhythmias and atrioventricular block, resulting in sudden cardiac death and heart failure, and these complications are often associated with a poor prognosis[2]. Men with DCM are reported to have a higher mortality rate than women[3]. Endomyocardial biopsy (EMB) is the gold standard for the diagnosis of myocarditis and DCM. However, in clinical practice, DCM is often not diagnosed and treated promptly, given the high rate of cardiac complications of EMB and the limitations of treatment[4]. Therefore, developing innovative, non-invasive biomarkers for DCM is essential to improve diagnostic accuracy.
Autophagy is a cellular self-degradation process that removes errant proteins and damaged organelles. It also eliminates intracellular pathogens and is often considered a survival mechanism[5]. Numerous studies have shown that autophagy genes are involved in various phenotypes and human diseases[6], including neurodegenerative diseases[7], liver diseases[8], muscle diseases[9], cancer[10], and cardiac diseases[11]. Evidence shows that autophagy is essential in maintaining cardiomyocyte homeostasis[12] and influences the prognosis of cardiac diseases. In addition, an increasing number of animal models and clinical studies have reported the involvement of autophagy-related genes (ARGs) in ventricular remodeling, which is related to the mechanism of DCM[13,14]. However, the diagnostic performance and prognostic efficacy of ARGs in DCM have not been fully elucidated.
In this study, we downloaded gene expression profile data of DCM from the Gene Expression Omnibus (GEO) database, applied bioinformatics to search for AR-DEGs in DCM, and visualized the correlation between genes. Subsequently, gene enrichment analysis was performed to annotate gene functions and explore pathogenesis. Machine learning algorithms were then used to filter and identify diagnostic biomarkers of DCM. In addition, based on the diagnostic biomarkers of DCM, a transcription factor-gene regulatory network and gene-targeted drugs were predicted to provide ideas for clinical precision therapy and experimental studies. The DisGeNET database was used for association analysis of DCM with other related diseases to provide a transcriptomic basis for further investigation of the potential pathogenesis of the disease. The flow chart of this study is shown in Fig. 1.
Figure 1

Workflow diagram of the current study. GO, Gene Ontology; TFs, transcription factors.

Results

Identification of autophagy-related differentially expressed genes (AR-DEGs) for dilated cardiomyopathy (DCM)

The GSE4172 dataset was used to screen for DEGs in DCM. Using thresholds of |log2FoldChange| ≥ 0.8 and p-value < 0.05, 770 DEGs were identified, comprising 366 up-regulated genes and 404 down-regulated genes. The heatmap (Fig. 2a) shows the expression of the top 60 DEGs, and the asymptotic volcano plot (Fig. 2b) shows the distribution of DEGs.
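The thresholding step can be illustrated with a minimal Python sketch. It assumes each gene already has a log2 fold change and a p-value from the differential analysis. The `GeneStat` record, the `classify_degs` helper, and the toy values are hypothetical and are not taken from GSE4172 or from NetworkAnalyst.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    symbol: str
    log2_fc: float
    p_value: float


def classify_degs(stats, fc_cutoff=0.8, p_cutoff=0.05):
    """Split genes into up- and down-regulated DEGs using the stated thresholds."""
    up, down = [], []
    for gene in stats:
        # A gene is kept only if |log2FC| >= 0.8 and p < 0.05.
        if gene.p_value >= p_cutoff or abs(gene.log2_fc) < fc_cutoff:
            continue
        (up if gene.log2_fc > 0 else down).append(gene.symbol)
    return up, down


# Toy values for illustration only; they are not taken from GSE4172.
toy = [
    GeneStat("GENE_A", 1.20, 0.001),   # up-regulated DEG
    GeneStat("GENE_B", -0.95, 0.020),  # down-regulated DEG
    GeneStat("GENE_C", 0.79, 0.001),   # fold change too small
    GeneStat("GENE_D", -1.50, 0.200),  # not significant
    GeneStat("GENE_E", 0.80, 0.049),   # exactly at the fold-change cutoff, kept
]
up_genes, down_genes = classify_degs(toy)
print(up_genes, down_genes)
```

Applied to the real differential-analysis table, the same rule would be expected to reproduce the split into up- and down-regulated DEGs reported above.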
Figure 2

Differential analysis of DEGs in the GSE4172 dataset. (a) Heatmap of DEGs in the GSE4172 dataset (n = 60, p < 0.05, |log2 FoldChange| ≥ 0.8). (b) Asymptotic volcano plot of gene expression in the GSE4172 dataset. The two vertical lines indicate log2 fold changes of 0.8 and -0.8, respectively, and the horizontal line indicates a p value of 0.05. The color of the dots represents the level of the p value. The top 10 most significant DEGs are labeled on the graph.

A total of 803 ARGs were obtained from two autophagy-related gene databases, HADb and HAMdb. The Venn diagram generated with the Omicshare online tool showed 23 AR-DEGs of DCM (ADIPOQ, TRIM17, PPFIA4, CAPN12, PLEKHF1, RCAN1, RAB12, CXCR4, HSPG2, EIF4EBP1, HSF1, ZC3H12A, PRKAB1, TRIM65, ARSA, GABARAPL1, DICER1, VDAC1, CHMP4B, AGTR1, BAD, TFEB, AP2M1) (Fig. 3). The relevant functions of the 23 AR-DEGs are shown in Supplementary Table S1.
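As a rough sketch of this intersection step, the Python snippet below merges two gene libraries and intersects the result with the DEG lists. The function name and the placeholder symbols are hypothetical. If the reported counts are exact (232 HADb genes, 796 HAMdb genes, 803 merged), they would imply about 225 genes shared between the two libraries.

```python
def autophagy_related_degs(up_degs, down_degs, hadb_genes, hamdb_genes):
    """Return DEGs found in the merged autophagy gene library, plus the library size."""
    merged_args = set(hadb_genes) | set(hamdb_genes)  # union removes shared entries
    all_degs = set(up_degs) | set(down_degs)
    return sorted(all_degs & merged_args), len(merged_args)


# Placeholder symbols for illustration only; they are not real database contents.
hadb = ["AUTO_1", "AUTO_2", "AUTO_3"]
hamdb = ["AUTO_2", "AUTO_3", "AUTO_4", "AUTO_5"]
up = ["AUTO_1", "OTHER_X"]
down = ["AUTO_5", "OTHER_Y"]

ar_degs, library_size = autophagy_related_degs(up, down, hadb, hamdb)
print(ar_degs, library_size)
```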
Figure 3

Venn diagram of the AR-DEGs. The 366 DEGs-Up and 404 DEGs-Down were intersected with 232 and 796 autophagy-associated genes from the HADb and HAMdb autophagy gene pools, respectively, yielding 23 shared genes. The number of intersecting genes is marked in the red box. DEGs-Up, differentially expressed up-regulated genes; DEGs-Down, differentially expressed down-regulated genes.

The expression of the 23 AR-DEGs in the disease and control groups is shown in Fig. 4a, and their correlation matrix is shown in Fig. 4b. Correlation coefficients between genes with absolute values exceeding 0.5 were considered meaningful and were labeled in Fig. 4b. Several genes showed a strong association with others.
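The 0.5 cutoff on absolute correlation can be sketched as follows. The text does not state which correlation coefficient was used, so Pearson correlation is an assumption here. The helper names and the toy expression values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def strong_pairs(expression, cutoff=0.5):
    """Return gene pairs whose absolute correlation exceeds the cutoff."""
    pairs = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) > cutoff:
            pairs[(g1, g2)] = r
    return pairs


# Toy expression profiles across four samples (not real data).
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],    # rises with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],    # falls as GENE_A rises
    "GENE_D": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with the others
}
labeled = strong_pairs(toy_expr)
for pair, r in labeled.items():
    print(pair, round(r, 3))
```

Only the pairs that pass the cutoff would be labeled in a heatmap such as Fig. 4b; GENE_D drops out because its correlation with every other toy gene is zero.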
Figure 4

Expression of the 23 AR-DEGs in the dilated cardiomyopathy (DCM) group and control group and their correlation. (a) Box plot of the expression levels of the 23 AR-DEGs in the DCM and control groups. The blue box plots above the corresponding gene names indicate expression in the control group, whereas the red box plots indicate expression in the DCM group. (b) Correlation heatmap of the 23 AR-DEGs. The color within each circle and the magnitude of the correlation value represent the strength of the correlation; red represents positive correlation and blue represents negative correlation. The darker the color, the larger the absolute value of the correlation and the stronger the correlation.

Gene Ontology (GO), Pathway Enrichment Analysis

GO analysis and pathway analysis across multiple databases (KEGG, WikiPathways, BioPlanet, Reactome) were performed with Enrichr. The GO analysis of the AR-DEGs of DCM covered three categories: biological process (BP), cellular component (CC), and molecular function (MF). The top ten terms of each category are listed in Table 1.
Table 1

GO category, GO pathways, corresponding p-values, and AR-DEGs.

| GO category | GO pathways | GO ID | p-value | AR-DEGs |
|---|---|---|---|---|
| Biological process | Regulation of autophagy | GO:0010506 | 5.24E−09 | PLEKHF1;BAD;ZC3H12A;TFEB;VDAC1;PRKAB1;TRIM65 |
| Biological process | Positive regulation of autophagy | GO:0010508 | 5.2E−08 | PLEKHF1;BAD;ZC3H12A;TFEB;TRIM65 |
| Biological process | Positive regulation of cellular catabolic process | GO:0031331 | 4.93E−07 | PLEKHF1;BAD;ZC3H12A;TFEB;TRIM65 |
| Biological process | Positive regulation of cold-induced thermogenesis | GO:0120162 | 4.29E−06 | ADIPOQ;HSF1;CXCR4;PRKAB1 |
| Biological process | Positive regulation of metabolic process | GO:0009893 | 7.87E−06 | ADIPOQ;HSF1;CXCR4;PRKAB1 |
| Biological process | Macroautophagy | GO:0016236 | 9.99E−06 | GABARAPL1;CHMP4B;VDAC1;PRKAB1 |
| Biological process | Response to sodium arsenite | GO:1903935 | 1.26E−05 | HSF1;ZC3H12A |
| Biological process | Cellular response to sodium arsenite | GO:1903936 | 1.26E−05 | HSF1;ZC3H12A |
| Biological process | Cellular response to salt | GO:1902075 | 1.89E−05 | HSF1;ZC3H12A |
| Biological process | Negative regulation of tumor necrosis factor production | GO:0032720 | 1.95E−05 | ADIPOQ;ZC3H12A;DICER1 |
| Molecular function | Low-density lipoprotein particle receptor binding | GO:0050750 | 0.000315 | HSPG2;AP2M1 |
| Molecular function | Lipoprotein particle receptor binding | GO:0070325 | 0.00047 | HSPG2;AP2M1 |
| Molecular function | Endoribonuclease activity | GO:0004521 | 0.000694 | ZC3H12A;DICER1 |
| Molecular function | Regulatory RNA binding | GO:0061980 | 0.000961 | ZC3H12A;DICER1 |
| Molecular function | Endonuclease activity | GO:0004519 | 0.001559 | ZC3H12A;DICER1 |
| Molecular function | Kinase binding | GO:0019900 | 0.001742 | BAD;HSF1;VDAC1;PRKAB1 |
| Molecular function | Protein kinase binding | GO:0019901 | 0.002446 | BAD;HSF1;VDAC1;PRKAB1 |
| Molecular function | Ribonuclease activity | GO:0004540 | 0.002595 | ZC3H12A;DICER1 |
| Molecular function | C-X-C chemokine receptor activity | GO:0016494 | 0.005737 | CXCR4 |
| Molecular function | Intronic transcription regulatory region sequence-specific DNA binding | GO:0001161 | 0.005737 | HSF1 |
| Cellular component | Lysosome | GO:0005764 | 7.4E−07 | ARSA;PLEKHF1;RAB12;TFEB;CXCR4;HSPG2;AP2M1 |
| Cellular component | Lytic vacuole | GO:0000323 | 0.000105 | ARSA;PLEKHF1;RAB12;CXCR4 |
| Cellular component | Lytic vacuole membrane | GO:0098852 | 0.003419 | PLEKHF1;TFEB;AP2M1 |
| Cellular component | Lysosomal lumen | GO:0043202 | 0.00436 | ARSA;HSPG2 |
| Cellular component | Endosome membrane | GO:0010008 | 0.005917 | PLEKHF1;RAB12;AP2M1 |
| Cellular component | Lysosomal membrane | GO:0005765 | 0.006172 | PLEKHF1;TFEB;AP2M1 |
| Cellular component | Nuclear stress granule | GO:0097165 | 0.006881 | HSF1 |
| Cellular component | AP-2 adaptor complex | GO:0030122 | 0.008023 | AP2M1 |
| Cellular component | Clathrin coat of endocytic vesicle | GO:0030128 | 0.008023 | AP2M1 |
| Cellular component | Mitochondrial outer membrane | GO:0005741 | 0.009136 | BAD;VDAC1 |
Based on the number of genes involved, the BP terms were concentrated in regulation of autophagy, positive regulation of autophagy, positive regulation of cellular catabolic process, and macroautophagy. For cellular components, lysosome and lytic vacuole were significantly associated with the autophagy-related differential genes, ultimately pointing to inflammatory cardiomyopathy in the human heart. Molecular function analysis revealed that AR-DEGs were most significantly enriched in low-density lipoprotein particle receptor binding, followed by lipoprotein particle receptor binding and endoribonuclease activity. Notably, the pathway analysis in this study combined results from several databases (Table 2). Across these databases, the Longevity regulating pathway (KEGG), AMPK signaling (BioPlanet), Macroautophagy (Reactome), and PI3K-AKT-mTOR signaling pathway and therapeutic opportunities (WikiPathways) were identified as the top pathways.
Table 2

Top 10 pathways from KEGG, BioPlanet, Reactome, WikiPathways databases and their corresponding p-values and genes for AR-DEGs.

| Databases | Pathways | p-value | Genes |
|---|---|---|---|
| KEGG | Longevity regulating pathway | 2.12E−04 | ADIPOQ;EIF4EBP1;PRKAB1 |
| KEGG | AMPK signaling pathway | 3.42E−04 | ADIPOQ;EIF4EBP1;PRKAB1 |
| KEGG | Apelin signaling pathway | 5.04E−04 | GABARAPL1;AGTR1;PRKAB1 |
| KEGG | Insulin signaling pathway | 5.04E−04 | BAD;EIF4EBP1;PRKAB1 |
| KEGG | cGMP-PKG signaling pathway | 8.96E−04 | BAD;AGTR1;VDAC1 |
| KEGG | Calcium signaling pathway | 0.002531 | AGTR1;CXCR4;VDAC1 |
| KEGG | Acute myeloid leukemia | 0.002673 | BAD;EIF4EBP1 |
| KEGG | Mitophagy | 0.002752 | GABARAPL1;TFEB |
| KEGG | Adipocytokine signaling pathway | 0.002832 | ADIPOQ;PRKAB1 |
| KEGG | Endocytosis | 0.002905 | CXCR4;CHMP4B;AP2M1 |
| BioPlanet | AMPK signaling | 6.34E−05 | ADIPOQ;EIF4EBP1;PRKAB1 |
| BioPlanet | Mitochondrial pathway of apoptosis: BH3-only Bcl-2 family | 1.83E−04 | CAPN12;BAD;VDAC1 |
| BioPlanet | Phosphoinositides and their downstream targets | 3.73E−04 | BAD;AP2M1 |
| BioPlanet | PKB-mediated events | 4.70E−04 | EIF4EBP1;PRKAB1 |
| BioPlanet | TOR signaling | 6.94E−04 | EIF4EBP1;PRKAB1 |
| BioPlanet | Endocytosis | 0.001527 | CXCR4;CHMP4B;AP2M1 |
| BioPlanet | Calcineurin-dependent NFAT signaling role in lymphocytes | 0.00181 | RCAN1;BAD |
| BioPlanet | Acute myeloid leukemia | 0.001943 | BAD;EIF4EBP1 |
| BioPlanet | ERK1/ERK2 MAPK pathway | 0.002518 | BAD;EIF4EBP1 |
| BioPlanet | Adipocytokine signaling pathway | 0.002673 | ADIPOQ;PRKAB1 |
| Reactome | Macroautophagy Homo sapiens R-HSA-1632852 | 6.07E−05 | GABARAPL1;CHMP4B;PRKAB1 |
| Reactome | Cellular responses to stress Homo sapiens R-HSA-2262752 | 7.49E−04 | GABARAPL1;HSF1;CHMP4B;PRKAB1 |
| Reactome | mTOR signalling Homo sapiens R-HSA-165159 | 9.13E−04 | EIF4EBP1;PRKAB1 |
| Reactome | PKB-mediated events Homo sapiens R-HSA-109703 | 9.61E−04 | EIF4EBP1;PRKAB1 |
| Reactome | Disease Homo sapiens R-HSA-1643685 | 0.001206 | BAD;CXCR4;CHMP4B;HSPG2;AP2M1 |
| Reactome | HIV Infection Homo sapiens R-HSA-162906 | 0.002028 | CXCR4;CHMP4B;AP2M1 |
| Reactome | PI3K Cascade Homo sapiens R-HSA-109704 | 0.003693 | EIF4EBP1;PRKAB1 |
| Reactome | Degradation of the extracellular matrix Homo sapiens R-HSA-1474228 | 0.006547 | CAPN12;HSPG2 |
| Reactome | Infectious disease Homo sapiens R-HSA-5663205 | 0.007144 | CXCR4;CHMP4B;AP2M1 |
| Reactome | Nef Mediated CD8 Down-regulation Homo sapiens R-HSA-182218 | 0.008023 | AP2M1 |
| WikiPathways | PI3K-AKT-mTOR signaling pathway and therapeutic opportunities WP3844 | 5.28E−06 | BAD;TFEB;EIF4EBP1 |
| WikiPathways | Leptin and adiponectin WP3934 | 5.66E−05 | ADIPOQ;PRKAB1 |
| WikiPathways | AMP-activated protein kinase (AMPK) signaling WP1403 | 6.62E−05 | ADIPOQ;EIF4EBP1;PRKAB1 |
| WikiPathways | The influence of laminopathies on Wnt signaling WP4844 | 7.36E−04 | ADIPOQ;DICER1 |
| WikiPathways | Target Of Rapamycin (TOR) Signaling WP1471 | 7.78E−04 | EIF4EBP1;PRKAB1 |
| WikiPathways | Synaptic signaling pathways associated with autism spectrum disorder WP4539 | 0.001499 | EIF4EBP1;PRKAB1 |
| WikiPathways | RAC1/PAK1/p38/MMP2 Pathway WP3303 | 0.002752 | BAD;EIF4EBP1 |
| WikiPathways | Peptide GPCRs WP24 | 0.003249 | AGTR1;CXCR4 |
| WikiPathways | Leptin signaling pathway WP2034 | 0.003424 | BAD;EIF4EBP1 |
| WikiPathways | IL-18 signaling pathway WP4754 | 0.003602 | BAD;ADIPOQ;ZC3H12A |
A comparison of GO terms is presented in Fig. 5a, and Fig. 5b shows the pathway analysis from multiple databases.
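For readers unfamiliar with how an enrichment p-value of this kind arises, the sketch below computes a one-sided over-representation p-value from the hypergeometric distribution, which is equivalent to a one-sided Fisher's exact test. This is a generic illustration rather than Enrichr's code. The function name and all term and background sizes are assumptions, since the text reports only the resulting p-values.

```python
from math import comb


def enrichment_p_value(overlap, list_size, term_size, background_size):
    """P(X >= overlap) when list_size genes are drawn from the background
    and term_size of the background genes belong to the term."""
    total = comb(background_size, list_size)
    upper = min(list_size, term_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Small worked case: 10 background genes, 4 in the term, 3 in the input list,
# 2 of them in the term. Tail = (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120.
small_p = enrichment_p_value(overlap=2, list_size=3, term_size=4, background_size=10)
print(small_p)

# Hypothetical sizes on the scale of this study: 7 of 23 input genes fall in a
# term of 300 genes, with a 20,000-gene background (assumed values).
print(enrichment_p_value(overlap=7, list_size=23, term_size=300, background_size=20000))
```

A smaller p-value means the observed overlap would be less likely under random sampling, which is why terms in Tables 1 and 2 are ordered by increasing p-value within each group.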
Figure 5

(a) Identification results of GO terms related to biological processes, cellular components and molecular functions based on gene enrichment analysis. Higher p value indicated a higher number of genes involved in this GO ontology. (b) Identification of results from combined multi-pathway analysis by KEGG, WikiPathways, BioPlanet and Reactome.

Machine learning screened for autophagy-related biomarkers of DCM

The expression matrices of the 23 AR-DEGs were used to build diagnostic models with two algorithms, least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE), to obtain potential diagnostic biomarkers of DCM. The LASSO regression algorithm narrowed the 23 AR-DEGs down to 9 variables as potential diagnostic biomarkers for DCM (Fig. 6a), and the SVM-RFE algorithm identified 13 signature genes (Fig. 6b).
Figure 6

Screening of diagnostic biomarkers for DCM by machine learning algorithms. (a) Screening of optimal genes by LASSO regression model. (b) Plot of the best gene selected by SVM-RFE algorithm. (c) Venn diagram embodying the eight diagnostic biomarkers common to both machine learning algorithms. LASSO, least absolute shrinkage and selection operator; SVM-RFE, support vector machine-recursive feature elimination.

Finally, 8 overlapping genes (PLEKHF1, HSPG2, HSF1, TRIM65, DICER1, VDAC1, BAD, TFEB) were obtained (Fig. 6c).
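To illustrate why LASSO acts as a feature selector, the sketch below fits a squared-loss LASSO by coordinate descent and then intersects the selected features with a second selection. It is a simplification under stated assumptions: the study's model was presumably a classifier fitted with standard software, whereas this toy uses squared loss on made-up data, and the second gene set is a hypothetical stand-in for an SVM-RFE result.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; small values become exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/2n)*||y - Xb||^2 + penalty*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            coef[j] = soft_threshold(rho / n, penalty) / (z / n)
    return coef


# Toy data (not real expression values): the response depends strongly on the
# first feature, weakly on the second, and not at all on the third.
features = ["GENE_A", "GENE_B", "GENE_C"]
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.05, 1.95, -1.95, -2.05]

coef = lasso_coordinate_descent(X, y, penalty=0.1)
lasso_selected = {name for name, b in zip(features, coef) if b != 0.0}

# Hypothetical second selection standing in for an SVM-RFE result.
svm_rfe_selected = {"GENE_A", "GENE_C"}
shared = sorted(lasso_selected & svm_rfe_selected)
print(coef, shared)
```

The L1 penalty drives the weak and irrelevant coefficients to exactly zero, so the surviving features form the LASSO selection. Taking the overlap with a second method, as in Fig. 6c, keeps only the genes that both procedures retain.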

Construction of transcription factor (TF)-gene regulatory network

Based on the JASPAR TF binding site profile database, a TF-gene regulatory network was constructed with the NetworkAnalyst 3.0 platform from the 8 diagnostic biomarkers of DCM (PLEKHF1, HSPG2, HSF1, TRIM65, DICER1, VDAC1, BAD, TFEB) (Fig. 7). The network included 46 nodes and 76 edges; the nodes comprised 8 seed genes and 38 transcription factors. TFEB was regulated by 19 transcription factors and DICER1 by 15 transcription factors.
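The summary statistics quoted for the network (node count, edge count, regulators per gene) can be derived from an edge list, as in the sketch below. The function name and the `TF_1` to `TF_3` labels are hypothetical, and the toy edges are not the JASPAR-derived edges of Fig. 7.

```python
from collections import defaultdict


def summarize_tf_gene_network(edges):
    """Summarize a TF-gene network given (transcription_factor, target_gene) pairs."""
    unique_edges = set(edges)  # ignore duplicated pairs
    tfs = {tf for tf, _ in unique_edges}
    genes = {gene for _, gene in unique_edges}
    regulators = defaultdict(set)
    for tf, gene in unique_edges:
        regulators[gene].add(tf)
    return {
        "nodes": len(tfs | genes),  # a gene that is also a TF is counted once
        "edges": len(unique_edges),
        "regulators_per_gene": {g: len(t) for g, t in regulators.items()},
    }


# Toy edge list for illustration; TF names are placeholders.
toy_edges = [
    ("TF_1", "TFEB"),
    ("TF_2", "TFEB"),
    ("TF_2", "DICER1"),
    ("TF_3", "DICER1"),
    ("TF_1", "TFEB"),  # duplicate, counted once
]
summary = summarize_tf_gene_network(toy_edges)
print(summary)
```

On the real edge list, the same summary would be expected to give the reported 46 nodes (8 seed genes plus 38 transcription factors) and 76 edges, with 19 regulators for TFEB and 15 for DICER1.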
Figure 7

Network of transcription factors interacting with 8 potential diagnostic biomarkers. The highlighted orange nodes indicated the 8 potential diagnostic biomarkers and the other pink nodes indicated transcription factors. The network consisted of 8 core genes, 46 nodes and 76 edges.

Gene targeted drugs screening

Based on the DSigDB drug database, the Enrichr (https://maayanlab.cloud/Enrichr/) web platform was used to identify drug molecules associated with the 8 diagnostic biomarkers for DCM. Gene-targeted drugs were ranked by p-value. For terms meeting the p-value threshold, a higher combined score indicates a stronger gene-drug association. The analysis showed that melatonin (CTD 00006260) and metformin (CTD 00006282) had strong gene associations in DCM. Table 3 lists the top 10 drugs for DCM from the DSigDB database.
Table 3

Drugs of choice for dilated cardiomyopathy.

| Term | p-value | Combined score | Genes |
|---|---|---|---|
| Arsenenous acid CTD 00000922 | 5.77E−05 | 76.35849 | PLEKHF1;BAD;ADIPOQ;HSF1;TFEB;EIF4EBP1;CXCR4;PPFIA4 |
| Melatonin CTD 00006260 | 0.000058 | 462.545 | BAD;ADIPOQ;EIF4EBP1 |
| Metformin CTD 00006282 | 0.000115 | 338.4126 | BAD;ADIPOQ;EIF4EBP1 |
| Tretinoin HL60 UP | 0.000152 | 154.1536 | RCAN1;GABARAPL1;CXCR4;PPFIA4 |
| Imatinib CTD 00003267 | 0.000231 | 244.7684 | BAD;HSF1;EIF4EBP1 |
| Arsenenous acid CTD 00000922 | 0.000374 | 61.03121 | GABARAPL1;HSF1;EIF4EBP1;CXCR4;VDAC1;TRIM65 |
| Wortmannin CTD 00000504 | 0.000606 | 154.125 | BAD;HSF1;EIF4EBP1 |
| Isoflupredone HL60 UP | 0.000867 | 371.9284 | RCAN1;CXCR4 |
| Telmisartan CTD 00003021 | 0.001059 | 325.1773 | ADIPOQ;AGTR1 |
| Rosiglitazone CTD 00003139 | 0.001142 | 68.5849 | GABARAPL1;BAD;ADIPOQ;CXCR4 |

Genetic disease association analysis

Gene list enrichments were identified in the DisGeNET dataset, using all genes in the genome as the enrichment background. Terms with a p-value < 0.01, a minimum count of 3, and an enrichment factor > 1.5 (the enrichment factor is the ratio between the observed counts and the counts expected by chance) were collected and grouped into clusters based on their membership similarities. The top 10 enriched clusters are shown in Fig. 8. The algorithm used here was the same as that used for pathway and process enrichment analysis. Cyst; Uveal melanoma; Diabetes Mellitus, Experimental; Adult T-Cell Lymphoma/Leukemia; and Amyloidosis were identified as the top 5 comorbidities of DCM.
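The three filters described above can be written out as a short sketch. The enrichment factor follows the definition in the text (observed count divided by the count expected by chance). The function names and the gene-set sizes in the example are assumptions chosen only to make the arithmetic easy to follow.

```python
def enrichment_factor(observed, list_size, term_size, background_size):
    """Observed overlap divided by the overlap expected by chance."""
    expected = list_size * term_size / background_size
    return observed / expected


def passes_filters(p_value, observed, list_size, term_size, background_size,
                   p_cutoff=0.01, min_count=3, min_factor=1.5):
    """Apply the three stated criteria: p < 0.01, count >= 3, enrichment factor > 1.5."""
    factor = enrichment_factor(observed, list_size, term_size, background_size)
    return p_value < p_cutoff and observed >= min_count and factor > min_factor


# Assumed sizes: 8 input genes, a disease term with 500 genes, 20,000 background genes.
# Expected overlap by chance = 8 * 500 / 20000 = 0.2, so 4 observed genes give a factor of 20.
factor = enrichment_factor(observed=4, list_size=8, term_size=500, background_size=20000)
kept = passes_filters(0.0005, 4, 8, 500, 20000)
dropped = passes_filters(0.0005, 2, 8, 500, 20000)  # fails the minimum count of 3
print(factor, kept, dropped)
```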
Figure 8

The process of identifying comorbidities in DCM.

Discussion

It is well known that DCM is impaired ventricular dilation and systolic diastole, leading to arrhythmias and heart failure in severe cases. Unfortunately, with the low prevalence of EMB, most patients with early-stage cardiomyopathy are not effectively treated. The gold standard for myocarditis and DCM is often poor prognosis in cases of concomitant arrhythmias and heart failure[2]. Therefore, early diagnosis, precise evaluation, and therapeutic management of patients with DCM appear crucial. Hence, researchers are increasingly looking for diagnostic markers of DCM. Meanwhile, the molecular pathogenesis of DCM, viral infections, and other factors in disease progression and prognosis are still incompletely studied[15]. It is well known that autophagy plays an important role in cancer, neurodegenerative diseases, inflammatory diseases, and cardiac diseases[5]. Among these, autophagy mechanisms are increasingly studied in cardiac diseases, and autophagy plays a crucial role in maintaining typical cardiac structure, function, and therapy[16,17]. Two key autophagy-related molecules, mTOR and Beclin1, had been shown to play a regulatory role in myocardial ischemia–reperfusion injury[17]. Among them, mTOR is involved in the PI3K and Akt pathway to regulate myocardial ischemia/reperfusion-induced apoptosis and autophagy[18]. In addition, Beclin1 exerts a positive impact on myocardial ischemia and an adverse effect during myocardial ischemia/reperfusion[19]. Currently, studies on the role of autophagy in cardiomyopathy-related diseases are increasing[13], and research has shown that damage to the autophagic lysosomal pathway (ALP) and activation of inflammatory vesicles were important factors contributing to DCM[14]. Improved left ventricular size and cardiac function in mice with DCM deficient in NCOA4 (nuclear receptor coactivator 4, an autophagy-associated gene that mediates ferritin degradation) inhibit free ferrous iron overload and increased lipid peroxidation[20]. 
Carolina et al.[21] found that autophagy-related genes, such as CALCOCO2 and NRBP2, the former of which regulates the expression of the latter, adversely affected left ventricular function parameters in patients with DCM. In recent years, the exploration of the diagnostic and prognostic role of genetic biomarkers targeting DCM has been on the rise. For example, CYR61 and APN were identified as two target genes for DCM by gene expression profiling studies in the GSE4172 dataset raw data[22]. It had been shown that RBM20 induced aberrant TNN splicing as a determinant of DCM and increased the risk of arrhythmias[23]. In previous bioinformatics studies, genes or transcription factors such as CTGF, POSTN, CORIN, and FIGF were closely associated with DCM[24]. However, few studies have been conducted on the value of autophagy-related genes in diagnosing DCM. To the best of our knowledge, this study is the first to investigate the diagnostic role of ARGs in DCM by mining the GEO database and integrating machine learning and bioinformatics approaches. We used the NetworkAnalyst 3.0 platform to deeply analyze the GSE4172 dataset, which compares gene expression in DCM with healthy samples infected by the fine virus B19. Using differential analysis, we obtained 770 DEGs and combined them with the gene set from the autophagy databases to obtain 23 AR-DEGs of DCM. Finally, by machine learning methods such as LASSO regression and SVM-RFE, we obtained 8 (PLEKHF1, HSPG2, HSF1, TRIM65, DICER1, VDAC1, BAD, TFEB) diagnostic biomarkers of DCM. Previous studies showed significant relevance regarding DCM or cardiomyocyte remodeling in the above eight genes. PLEKHF1 (Pleckstrin homology and FYVE domain containing 1) is located in the lysosome and plays a vital role in caspase-independent apoptosis, a process involved in autophagy[25]. In previous studies, PLEKHF1 is a susceptibility gene for several diseases. 
For example, Qi et al., identified PLEKHF1 as a potential biomarker for diabetic atherosclerosis[26]; also, PLEKHF1 was shown to be a potential biomarker for chronic graft-versus-host disease, the accuracy of which was confirmed by several clinical independent validation studies[27]. In addition, it had been shown that levosimendan ameliorated myocardial infarction and ventricular remodeling in diabetic rats, and the expression of the gene Plekhf1 received regulation by levosimendan, showing the potential of Plekhf1 as a target gene for myocardial infarction and diabetic cardiomyopathy[28]. HSPG2 (Heparan sulfate proteoglycan 2) plays an important role in cancer growth, development, and metastasis[29]. Previous studies had shown that HSPG2 was identified in key cardiac-related regions controlled by chromosome 1p36[30], and related studies had demonstrated that chromosome 1p36 deletion was responsible for cardiovascular malformations and cardiomyopathy[31], suggesting an important role for HSPG2 in the pathogenesis and prognostic impact of cardiomyopathy[30]. In addition, HSPG2 also plays an independent predictive role in a variety of diseases. For example, HSPG2 was overexpressed in acute myeloid leukemia and can be used as a prognostic biomarker[32]. Recent studies had shown that HSPG2 deficiency was a risk factor for aortic coarctation[33]. HSF1 (Heat shock transcription factor 1) is a significant heat stress response factor that plays an important role in inhibiting apoptosis and pathological remodeling of cardiomyocytes and is a protective factor for cardiomyocytes. In a previous quantitative transcriptomic analysis, HSF1 was found to be significantly enriched in cardiomyocytes[34]. It had been shown that HSF1 could be isolated by the death trap method, preventing hydrogen peroxide-induced cardiomyocyte death. It was found that overexpression of HSG1 in transgenic mice reduced ischemia–reperfusion-induced cardiomyocyte injury[35]. 
In the present study, HSF1 expression was lower in the DCM group compared with the healthy control group, which was also consistent with the findings of previous studies. In addition, it had been shown that overexpression of HSF1 in BAG mutation-associated DCM helped to attenuate pathological remodeling of cardiomyocytes and alleviate proteostatic stress[36]. In contrast, recent studies had shown that HSF1 overexpression lead to reduced expression of myofilament localization-associated BAG3. Decreased expression of BAG3 was strongly associated with non-inherited heart failure and was more susceptible in male patients with DCM[37]. Therefore, the study of relevant molecules and pathways targeting HSF1 contributes to our understanding of DCM. TRIM65 is an E3 ubiquitin ligase involved in the positive regulation of autophagy and was expressed in vascular endothelial cells, located in the cytoplasmic lysate and nucleoplasm. Unfortunately, there are relatively few studies related to TRIM65. From the available literature, it appeared that TRIM65 was mainly involved in proteopathy and ubiquitination regulation to regulate disease progression and as a target for a variety of diseases[38,39]. Interestingly, although fewer studies are addressing the mechanisms associated with TRIM65 and DCM, according to recent studies, TRIM65 was closely linked to the inflammatory vesicle NRLP3[40], which is known to play a role in a variety of DCM[14]. TRIM65 was associated with antiviral innate immune mechanisms[41]. In addition, it had been shown that TRIM65 regulated VCAM-1 to control inflammatory responses[42]. All these studies point the way to exploring the molecular mechanism of TRIM65 and DCM. DICER1 is a member of the ribonuclease III (RNaseIII) family and is involved in the production of microRNAs, which regulate gene expression at the post-transcriptional level and are more frequently studied in oncological diseases[43]. 
Evidence suggested that DICER deletion resulted in a dramatic decrease in the level of miRNAs it regulates, which led to severe DCM and heart failure in mice, a trend that was also seen in the expression of DICER proteins in diseased populations, implying an important role of DICER family genes in the pathogenesis of DCM[44]. Follow-up studies had shown that microRNAs act as negative regulators of genes and that specific regulation of microRNA expression could inhibit the loss of cardiac function due to DICER deficiency[45,46], leading to cardioprotection. These studies suggested that endogenous microRNA competitive regulation of DICER family genes will be an essential strategy for gene targeting therapy in DCM. VDAC (voltage dependent anion channel), including VDAC1 and VDAC2, is a mitochondrial outer membrane pore-forming protein present in all eukaryotes. As a mitochondrial transporter protein, VDAG is mostly expressed in cardiac tissue and has significant tissue specificity[47,48]. It is well known that Ca2+ played a detrimental role in heart failure and myocardial ischemia/reperfusion, and Ca2+ overload activated the complex matrix chaperone procyclin D (CypD), which regulated the VDAC1, Grp75, and IP3R1 complex and thus caused damage to cardiomyocytes, whereas inhibition of the CypD, VDAC1, Grp75, and IP3R1 complex could protect cardiomyocytes[49]. Numerous studies had shown[50,51] that regulation of VDAC1 expression through microRNA targeting could regulate mitochondrial function and promoted the release of mitochondrial calcium for cell protection. Furthermore, in DCM mice, the lncRNA H19/miR-675 axis competitively downregulated VDAC1, reducing apoptosis. The above report provides a new strategy to explore the role of VDAC1 in DCM. It was shown that VDAC1 expression was upregulated in the hearts of patients with hypertrophic cardiomyopathy[52]. In the present study, the expression of VDAC1 was also upregulated in samples from patients with DCM. 
These findings could explain the unique role played by VDAC1 as a target gene for DCM. BAD (Bcl-2 associated agonist of cell death) often follows Bcl-2 and plays an anti-apoptotic role. In a TNF-α-mediated mouse model of DCM in which apoptosis occurs, the expression of BAD was reduced in association with Bcl-2d[53], which was consistent with the findings of the present study. According to previous studies, BAD played a key role in inducing β-cell apoptosis in Friedreich's ataxia, a neurodegenerative disease closely related to cardiomyopathy and diabetes[54]. It is well known that microRNAs regulate protein expression of mRNAs through negative regulation and play an important role in cardiovascular diseases, especially in heart failure and cardiac remodeling[55]. Studies had shown that multiple microRNAs played a regulatory role on BCL2[56] and all of them were upregulated in heart failure[57]. As an antagonist of apoptosis, the protective role of BAD and Bcl-2 in the pathogenesis of DCM depended on further studies. TFEB (transcription factor EB), a transcription factor located within the cytoplasmic lysosol (cytosol), is the master gene of the autophagic machinery of lysosomal biogenesis and coordinates the autophagic process, including autophagosome formation, autophagosome-lysosome fusion, and substrate degradation by driving the expression of autophagy and lysosomal genes[58]. According to reports, TEFB expression was highest in 18-week-old fetal heart tissue, with significant tissue specificity[59]. There is growing evidence that TFEB plays an important role in various types of DCM. Lysosomal storage disorders (LSD) lead to cardiac involvement in hypertrophic cardiomyopathy and DCM[60]. Further studies had shown that the Yes-associated protein (YAP) and Feb signaling pathway played a role in LSD disease by eliminating autophagic lysosomes, reducing cell death, and restoring cardiac function[61]. 
In addition, TFEB deficiency was found to lead to cardiomyocyte hypertrophy and DCM, causing heart failure[62]. TFEB is therefore a highly significant target in DCM. We also performed a functional enrichment analysis of the pathogenesis of DCM and related molecular pathways and found that the AR-DEGs of DCM were mainly enriched in autophagy regulatory pathways and cell growth signaling, such as regulation of autophagy, macroautophagy, the AMPK signaling pathway, and PKB-mediated events. The AMPK (adenosine monophosphate-activated protein kinase) signaling pathway has been reported to be an important intracellular signaling pathway in the heart[63]. Recognized as an emerging target for the treatment of heart failure[64], AMPK plays an important role in regulating cardiomyocyte growth[65]. Numerous studies have shown that the AMPK pathway and its associated autophagy-related pathways play a protective role in the pathological development of cardiomyopathy[66-69]. These studies suggest directions for mechanistic studies of autophagy-related DCM. PKB (protein kinase B), also known as the serine/threonine kinase Akt, serves as a central node for a variety of biological processes[70]. PKB has been reported to be involved in protective mechanisms against myocardial ischemia/reperfusion[71]. However, relatively few studies have addressed the association of PKB-mediated events with DCM. According to previous studies, pleiotrophin, a pro-angiogenic factor, was significantly expressed in rat models of myocardial infarction and in DCM patients, and it is thought to protect the myocardium by inhibiting endogenous AKT/PKB activity[72]. In contrast, Alexander et al. found that PKB phosphorylation restored cardiac contractility in a zebrafish model of DCM[73]. In addition, we constructed TF-gene regulatory networks based on the 8 autophagy-related genes in DCM and predicted drugs targeting them, such as melatonin and metformin. 
Studies showed that melatonin had a better inhibitory effect on left heart dysfunction and ventricular remodeling in DCM rats with cardiorenal syndrome[74]. Metformin was able to partially reverse ventricular remodeling in mice with DCM through an autophagic mechanism[75]. These studies provide a basis and direction for clinical precision-targeted therapy and novel drug development in DCM. We also explored the comorbidities associated with DCM, such as fatty liver disease. Some researchers found[76] that non-alcoholic fatty liver disease (NAFLD) affected the cardiovascular system through metabolic and inflammatory responses and also increased abnormalities of cardiac anatomy, including cardiomyopathy[77]. The disease pathways linking the two need further investigation. Our study also has certain limitations. First, our DCM dataset was obtained by secondary mining and analysis using bioinformatics methods, and the results need to be validated with external evidence. Second, the results need to be combined with single-cell sequencing as multi-omics research progresses. Finally, the mechanisms of action of, and the interrelationship between, these 8 DCM genes and autophagy-related genes need further investigation.

Methods

Dilated cardiomyopathy dataset acquisition

The DCM dataset GSE4172 was downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo). It was contributed by Wittchen et al.[22] and generated on the GPL570 [HG-U133_Plus_2] platform using the Affymetrix Human Genome U133 Plus 2.0 Array. It contains eight endomyocardial biopsy samples from patients with parvovirus B19-associated cardiac inflammation as the experimental group and four samples from healthy individuals as the control group. Clinical information for the patients in the GSE4172 dataset is presented in Table 4.
Table 4

Clinical information for the GSE4172 dataset.

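| Sample | Group | Age | Gender | Ejection fraction | Left ventricular end diastolic diameter | Inflammation/PVB19 |
|---|---|---|---|---|---|---|
| GSM94836 | DCM | 45 | Male | 34 | 62 | Positive |
| GSM94837 | DCM | 62 | Male | 51 | 73 | Positive |
| GSM94838 | DCM | 31 | Male | 52 | 57 | Positive |
| GSM94839 | DCM | 67 | Male | 43 | 59 | Positive |
| GSM94840 | DCM | 60 | Male | 34 | 76 | Positive |
| GSM94841 | DCM | 69 | Male | 35 | 60 | Positive |
| GSM94842 | DCM | 55 | Female | 31 | 61 | Positive |
| GSM94843 | DCM | 31 | Female | 56 | 71 | Positive |
| GSM94831 | Healthy control | 36 | Female | 68 | 47 | Negative |
| GSM94854 | Healthy control | 46 | Female | 61 | 49 | Negative |
| GSM94855 | Healthy control | 26 | Female | 74 | 47 | Negative |
| GSM94870 | Healthy control | 36 | Male | 64 | 50 | Negative |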

Autophagy genes acquisition

A total of 232 autophagy genes were downloaded from the Human Autophagy Database (HADb, http://autophagy.lu/). Similarly, 796 autophagy genes were obtained from the Human Autophagy Modulator Database (HAMdb, http://hamdb.scbdd.com)[78]. Merging the two lists and removing duplicates (that is, taking their union) gave a total of 803 autophagy-related genes, which served as the autophagy gene set for this study.
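Combining two gene lists into one de-duplicated gene set is a set union. The sketch below illustrates the idea with made-up symbol lists; the list contents and the `normalize` helper are assumptions for illustration and do not reproduce the actual HADb or HAMdb downloads.

```python
# Toy symbol lists (hypothetical, NOT the real HADb / HAMdb contents).
hadb_genes = ["BAD", "TFEB", "VDAC1", " bad "]
hamdb_genes = ["TFEB", "HSF1", "DICER1", "VDAC1"]


def normalize(symbols):
    """Trim whitespace and upper-case symbols so duplicates collapse."""
    return {s.strip().upper() for s in symbols if s.strip()}


hadb = normalize(hadb_genes)
hamdb = normalize(hamdb_genes)

autophagy_genes = hadb | hamdb   # union: genes present in either database
shared = hadb & hamdb            # intersection: genes present in both

# The size of a union follows inclusion-exclusion.
assert len(autophagy_genes) == len(hadb) + len(hamdb) - len(shared)
print(sorted(autophagy_genes))
```

Because a union can never be smaller than its largest input, a merged set that is larger than both input lists can only come from a union, not an intersection.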

Identification of differentially expressed genes (DEGs) in autophagy-related genes (ARGs)

NetworkAnalyst 3.0 is a user-friendly bioinformatics visualization web platform for transcriptome analysis, gene network construction, and meta-analysis of gene expression data[79]. The expression data and grouping information of the GSE4172 dataset were submitted to NetworkAnalyst 3.0 to identify DEGs between the DCM group and the healthy control group. For mRNA in microarrays, the threshold was set to |log2FoldChange| ≥ 0.8 with a p value < 0.05, and genes meeting this criterion were considered DEGs. We used the ggplot2 and pheatmap packages (R version 4.1.3) to draw the volcano plot and heatmap of the DEGs. The autophagy-related genes (ARGs) were intersected with the DEGs from the GSE4172 dataset to obtain the set of autophagy-related differentially expressed genes (AR-DEGs). Venn plots were created with the Omicshare online tool (https://www.omicshare.com/). The expression of the 23 AR-DEGs in GSE4172 was shown as box plots using the ggpubr package and its associated helper R packages. The correlation analysis of AR-DEGs was visualized using the corrplot package (R version 4.1.3).
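The thresholding and intersection steps can be sketched as follows. This is a minimal illustration, not the NetworkAnalyst implementation: the `GeneStat` record, the `select_degs` helper, and all fold-change and p values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    symbol: str
    log2fc: float   # log2 fold change, DCM vs. control
    p_value: float


def select_degs(stats, lfc_cutoff=0.8, p_cutoff=0.05):
    """Keep genes with |log2FC| >= lfc_cutoff and p < p_cutoff."""
    return {
        g.symbol
        for g in stats
        if abs(g.log2fc) >= lfc_cutoff and g.p_value < p_cutoff
    }


# Hypothetical statistics, for illustration only.
stats = [
    GeneStat("VDAC1", 1.10, 0.003),    # passes both cutoffs
    GeneStat("BAD", -0.95, 0.020),     # passes (downregulated)
    GeneStat("GENE_X", 0.40, 0.001),   # fold change too small
    GeneStat("GENE_Y", 1.50, 0.200),   # not significant
    GeneStat("GENE_Z", -0.80, 0.049),  # exactly on the fold-change boundary
]
autophagy_genes = {"VDAC1", "BAD", "TFEB"}  # toy ARG set

degs = select_degs(stats)
ar_degs = degs & autophagy_genes  # AR-DEGs = DEGs that are also ARGs
print(sorted(ar_degs))
```

Note that the fold-change cutoff is inclusive (≥ 0.8) while the p value cutoff is strict (< 0.05), matching the threshold as stated.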

Functional enrichment analysis

Functional enrichment consists of biological process, molecular function, and chromosomal location analysis[80]. Gene annotation uses Gene Ontology (GO) terminology, which covers biological processes, molecular functions, and cellular components. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database was used to understand metabolic pathways and plays an important role in gene annotation[81,82]. In addition, the BioCarta, WikiPathways[83], and Reactome[84] databases were also used for pathway analysis. The Enrichr platform (https://amp.pharm.mssm.edu/Enrichr/) provides comprehensive gene set enrichment analysis over databases with rich gene set annotation, pathway information, and gene target drug screening[85,86]. The GO terms of the AR-DEGs of DCM and all pathway information for this study were obtained from the Enrichr platform.
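Over-representation analysis of this kind is commonly scored with a one-sided hypergeometric (Fisher's exact) test: given a query gene list, how surprising is its overlap with an annotated gene set? The sketch below shows that calculation only. The function name, the background size, the pathway size, and the overlap are assumptions chosen for illustration; only the query size of 23 comes from this study, and the platform's exact scoring may differ.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for the overlap X between a query of n genes and an
    annotated set of K genes, both drawn from a background of N genes."""
    upper = min(n, K)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical example: 23 query genes, a 150-gene pathway,
# a 20,000-gene background, and 4 query genes in the pathway.
p = hypergeom_upper_tail(k=4, n=23, K=150, N=20000)
print(f"enrichment p value: {p:.3g}")
```

With these made-up numbers the expected overlap is well below one gene, so an overlap of four yields a small p value; in practice the p values are also corrected for the number of gene sets tested.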

Machine learning identifies molecular markers of AR-DEGs in DCM

In this study, least absolute shrinkage and selection operator (LASSO) logistic regression was used for feature gene selection, to reduce the number of genes in the disease prediction model, address multicollinearity in the regression analysis, and screen the molecular markers of DCM[87]. The LASSO regression algorithm was implemented with the "glmnet" R package with α set to 1; α controls how the model behaves with highly correlated data, and α = 1 corresponds to the pure LASSO (L1) penalty. In addition, the Support Vector Machine-Recursive Feature Elimination (SVM-RFE) algorithm was used to rank the AR-DEGs and remove irrelevant genes, making the diagnostic prediction model more robust[88]. SVM-RFE was implemented with the e1071 R package.
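To illustrate why an L1 penalty performs feature selection, the sketch below fits a LASSO by coordinate descent with soft-thresholding. It is a deliberately simplified stand-in and not the glmnet procedure used here: it uses a squared-error (linear) loss instead of the logistic loss, has no intercept or standardization, and runs on made-up toy data. All function names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data: y depends only on the first feature; the second is uninformative.
X = [[1.0, 0.5], [-1.0, 0.5], [2.0, -0.5], [-2.0, -0.5]]
y = [2.0, -2.0, 4.0, -4.0]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

The uninformative feature receives a coefficient of exactly zero and drops out of the model, while the informative coefficient is shrunk below its unpenalized value; increasing `lam` removes more features.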

Transcription Factor (TF)-gene regulatory network construction

The JASPAR database (http://jaspar.genereg.net/), which is included in the NetworkAnalyst 3.0 platform, was used to generate and visualize the TF-gene co-regulatory network[89]. Based on the 8 biomarkers of DCM, TFs that regulate the activity of functional pathways and gene expression levels in DCM were identified from the JASPAR database to form the TF-gene regulatory network.

Target drug screening

Gene target-based drug screening has become a new approach to identifying drug molecules, which helps to expand the range of relevant drugs and shorten the drug development process. In this study, drug candidates for the molecular markers of DCM were screened through the Drug Signatures Database (DSigDB), which consists of 17,389 drugs and 19,531 genes associated with those drugs[90]. DSigDB can be accessed through the Enrichr website (https://www.amp.pharm.mssm.edu/Enrichr/) by entering the relevant gene targets and downloading the target drug information. Drugs with p values less than 0.05 and larger combined scores were considered statistically significant. The combined score represents how closely the small-molecule drug is linked to the genes of interest. The DisGeNET database (http://www.disgenet.org) is an open and versatile platform for studying specific human diseases and their comorbidities through genetic and molecular pathways, probing the characteristics of disease genes, and helping to elucidate disease mechanisms[91]. In the present study, the molecular markers of DCM were uploaded to the Metascape platform (https://metascape.org/gp/index.html#/main/step1)[92], which contains the DisGeNET database. DCM-related comorbidities were identified through the DisGeNET database, laying a foundation for mechanistic studies of DCM.
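The drug selection rule (p value below 0.05, then preference for larger combined scores) amounts to a filter followed by a sort. The sketch below assumes the downloaded results have already been parsed into records; the `DrugHit` type, the `rank_candidates` helper, the default of ten candidates, and every drug name and number are hypothetical placeholders rather than values from this study.

```python
from typing import NamedTuple


class DrugHit(NamedTuple):
    name: str
    p_value: float
    combined_score: float


def rank_candidates(hits, p_cutoff=0.05, top_n=10):
    """Keep hits with p < p_cutoff and return the top_n by combined score."""
    significant = [h for h in hits if h.p_value < p_cutoff]
    return sorted(significant, key=lambda h: h.combined_score, reverse=True)[:top_n]


# Placeholder records, for illustration only.
hits = [
    DrugHit("drug_A", 0.001, 310.5),
    DrugHit("drug_B", 0.030, 95.2),
    DrugHit("drug_C", 0.200, 400.0),  # high score but not significant
    DrugHit("drug_D", 0.010, 150.7),
]

candidates = rank_candidates(hits)
print([h.name for h in candidates])
```

Filtering on the p value before ranking keeps a non-significant hit with a large combined score, such as `drug_C` here, out of the candidate list.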

Copyright permission of KEGG

We contacted Kanehisa Laboratories. Because we do not directly use KEGG pathway map “images” in this article, we do not need to obtain copyright permission from KEGG. However, because the article was written using their data, they kindly asked us to cite the following articles[81,93,94].
Supplementary Information.