
Investigation of optimal pathways for preeclampsia using network-based guilt by association algorithm.

Yan Ruan1, Yuan Li1, Yingping Liu1, Jianxin Zhou1, Xin Wang1, Weiyuan Zhang1.   

Abstract

This study investigated optimal pathways for preeclampsia (PE) using the network-based guilt by association (GBA) algorithm. The inference method consisted of four steps: preparing differentially expressed genes (DEGs) between PE patients and normal controls from gene expression data; constructing a co-expression network (CEN) for the DEGs using the Spearman's correlation coefficient (SCC) method; collecting pathway annotation data from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database; and predicting optimal pathways with the network-based GBA algorithm, for which the area under the receiver operating characteristics curve (AUROC) was obtained for each pathway. There were 351 DEGs and 61,425 edges in the CEN for PE. Subsequently, 53 pathways were obtained with a good classification performance (AUROC >0.5). Nine of these had AUROC >0.9 and were defined as optimal pathways, notably microRNAs in cancer (AUROC=0.9966), gap junction (AUROC=0.9922), and pathogenic Escherichia coli infection (AUROC=0.9888). These nine optimal pathways, identified through comprehensive analysis of data from PE patients, might shed new light on the molecular and pathological mechanisms of PE.

Keywords:  co-expression network; guilt by association; pathway; preeclampsia

Year:  2019        PMID: 30988790      PMCID: PMC6447911          DOI: 10.3892/etm.2019.7410

Source DB:  PubMed          Journal:  Exp Ther Med        ISSN: 1792-0981            Impact factor:   2.447


Introduction

With the development of high-throughput technology and gene data analysis over the past decade, rapid progress has been made in discovering genetic associations of diseases (1,2). Generally, genes do not work individually, but co-operate with each other and participate in biological processes as a system. To the best of our knowledge, pathway analysis is the first choice for shedding light on the underlying biology of genes in many diseases (3). In the present study, using pathway annotations and gene expression data, we proposed to predict optimal pathways for PE patients by integrating the guilt by association (GBA) algorithm with a network approach, termed the network-based GBA inference method. A co-expression network (CEN) of differentially expressed genes (DEGs) was constructed by the Spearman's correlation coefficient (SCC) method. Pathway data for PE were collected from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and restricted to pathways containing DEGs. Ultimately, the network-based GBA inference method was implemented to predict optimal pathways, and the area under the receiver operating characteristics curve (AUROC) was obtained for each pathway. The results might provide new insights into the molecular mechanism underlying PE.

Materials and methods

Preparing gene expression data and DEGs

For quality control, the gene expression array dataset E-GEOD-25906 from the ArrayExpress database was used, because it includes a relatively large number of subjects who are less affected by other factors. The clinical inclusion criteria for preeclampsia (PE) were as follows: women were diagnosed with PE if their systolic blood pressure was at least 140 mmHg, their diastolic blood pressure was at least 90 mmHg, and they had proteinuria with an estimated 300 mg or more of protein excreted in 24 h, measured directly or indirectly by the protein creatinine ratio. Standard pretreatments were conducted, including background correction (4), normalization (5), probe match (6) and summarization of expression values (4). After converting the preprocessed probe-level data to the gene symbol level and removing duplicates, we obtained a total of 19,027 genes in the gene expression data. The lmFit function implemented in Limma was used to perform empirical Bayes statistics and false discovery rate (FDR) correction of the P-values (7–9). Only genes that met the thresholds of P<0.01 and |log2FoldChange| >2 were defined as DEGs between PE patients and normal controls.
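The two DEG thresholds can be illustrated with a minimal Python sketch. It only applies the cut-offs stated above (P<0.01 and |log2FoldChange| >2) to a table of per-gene statistics; it does not reproduce the Limma model fit. The `GeneStat` record, the `select_degs` helper and the toy values are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """Per-gene test result (hypothetical record used for illustration)."""
    symbol: str
    p_value: float
    log2_fold_change: float


def select_degs(stats, p_cutoff=0.01, lfc_cutoff=2.0):
    """Return symbols of genes with P < p_cutoff and |log2FC| > lfc_cutoff."""
    return [
        g.symbol
        for g in stats
        if g.p_value < p_cutoff and abs(g.log2_fold_change) > lfc_cutoff
    ]


# Toy values, invented for illustration only.
toy_stats = [
    GeneStat("GENE_A", 1e-6, 2.5),    # passes both thresholds
    GeneStat("GENE_B", 1e-6, 1.2),    # fold change too small
    GeneStat("GENE_C", 0.03, -3.0),   # P-value too large
    GeneStat("GENE_D", 0.004, -2.1),  # passes (down-regulated)
]

degs = select_degs(toy_stats)
print(degs)
```

Both comparisons are strict, so a gene sitting exactly on a threshold is excluded, which mirrors the "<" and ">" signs in the text.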

Constructing CEN

To characterize the relationships among the DEGs of PE samples, the SCC method was used (10). For an interaction between genes x and y, the SCC was computed from their expression profiles, and the absolute SCC value across PE samples and normal controls was taken as the weight of the interaction. The larger the weight, the stronger the interaction between the two genes. Next, the DEGs and weight values were input into the Cytoscape software to visualize the CEN. Consequently, a weighted CEN was obtained for subsequent analysis.
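As a rough illustration of this weighting step, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks and stores its absolute value for every gene pair. This is the standard textbook definition, not necessarily the exact formula used in the study, and the function names and toy expression values are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors.

    Assumes neither profile is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def build_cen(expression):
    """Return {(gene_x, gene_y): |SCC|} for every unordered gene pair."""
    return {
        (g1, g2): abs(spearman(expression[g1], expression[g2]))
        for g1, g2 in combinations(sorted(expression), 2)
    }


# Toy expression profiles across five samples (invented values).
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.1, 9.7],  # same ordering as GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],  # reversed ordering
}
cen = build_cen(toy_expression)
```

Taking the absolute value means that a strong negative correlation (GENE_A versus GENE_C here) receives the same weight as a strong positive one.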

Recruiting pathway annotation data

Metabolism pathways were retrieved from the KEGG pathway database (11). There are 287 pathways covering 6,894 genes in the KEGG pathway database. Subsequently, in an attempt to make these pathways more closely related to PE, all DEGs were mapped to the 287 pathways, and only pathways that had intersections with the DEGs were retained for the remaining analyses; these were termed the pathway annotation data.
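The mapping step amounts to a set intersection between each pathway's gene list and the DEG list. A minimal Python sketch, with hypothetical pathway identifiers and gene symbols, might look as follows:

```python
def annotate_pathways(pathways, degs):
    """Keep only pathways that share at least one gene with the DEG list.

    pathways: dict mapping pathway ID -> iterable of gene symbols
    degs:     iterable of DEG symbols
    Returns a dict mapping pathway ID -> sorted list of DEGs in that pathway.
    """
    deg_set = set(degs)
    annotation = {}
    for pathway_id, genes in pathways.items():
        hits = sorted(deg_set & set(genes))
        if hits:  # drop pathways without any DEG
            annotation[pathway_id] = hits
    return annotation


# Toy inputs, invented for illustration only.
toy_pathways = {
    "pathway_1": ["GENE_A", "GENE_X"],
    "pathway_2": ["GENE_Y", "GENE_Z"],
    "pathway_3": ["GENE_D", "GENE_A"],
}
toy_degs = ["GENE_A", "GENE_D"]

annotation = annotate_pathways(toy_pathways, toy_degs)
print(annotation)
```

Here `pathway_2` is dropped because it contains no DEG, while the other two pathways keep only their DEG members.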

Network-based GBA inference method

In this work, the network-based GBA inference method, which combines the CEN with the GBA algorithm (12), was employed to predict pathway functions in the development of PE. Taking pathways as our source of functional annotations, a multi-functionality score (MFS) was assigned to each gene i in the CEN (13). The score depended on two counts for every pathway group k: the number of genes within pathway group k, whose weighting had the effect of giving contribution to a pathway group, and the number of genes outside pathway group k in the CEN. In subsequent analysis, we computed the AUROC values to assess the classification performance between PE samples and normal controls (14). Consequently, the AUROC for each pathway was obtained; pathways with AUROC >0.5 were regarded as having good classification performance, and those with AUROC >0.9 were selected as optimal pathways for PE.
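The scoring formula itself is not reproduced in this copy of the text. As a hedged illustration only, the sketch below uses one form of multi-functionality score found in the GBA literature, in which each pathway group k that contains gene i contributes 1/(n_in × n_out), with n_in and n_out the numbers of network genes inside and outside the group. Whether the study used exactly this form is an assumption. The AUROC helper uses the standard pairwise (Mann-Whitney) definition; all names and toy data are hypothetical.

```python
def multifunctionality(annotation, network_genes):
    """Assumed MFS: sum over groups containing gene i of 1 / (n_in * n_out)."""
    gene_set = set(network_genes)
    n_total = len(gene_set)
    scores = {g: 0.0 for g in gene_set}
    for genes in annotation.values():
        members = set(genes) & gene_set
        n_in = len(members)
        n_out = n_total - n_in
        if n_in == 0 or n_out == 0:
            continue  # group carries no ranking information
        for g in members:
            scores[g] += 1.0 / (n_in * n_out)
    return scores


def auroc(scores, positives):
    """Probability that a random positive outranks a random negative.

    Ties count as one half (standard Mann-Whitney formulation).
    """
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy network of four genes and two pathway groups (invented).
toy_genes = ["G1", "G2", "G3", "G4"]
toy_annotation = {"P1": ["G1", "G2"], "P2": ["G1"]}

mfs = multifunctionality(toy_annotation, toy_genes)
auc_p1 = auroc(mfs, {"G1", "G2"})
print(mfs, auc_p1)
```

An AUROC of 0.5 corresponds to random ranking, which is why 0.5 serves as the lower bound for useful classification performance in the text.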

Results

DEGs and pathway data

As described above, a total of 19,027 genes were identified in E-GEOD-25906 after standard pretreatments. Using the Limma package, we identified 351 DEGs between PE patients and normal controls that satisfied the thresholds of P<0.01 and |log2FoldChange| >2. Notably, the five most significant genes, in ascending order of their P-values, were SIAE (P=4.59E-10), TRIM24 (P=7.48E-10), PPP1R12C (P=2.90E-09), TUBA1B (P=3.96E-09), and ENG (P=4.23E-09). A total of 287 pathways (involving 6,894 genes) belonging to the metabolism category were collected from the KEGG pathway database. The 351 DEGs were then mapped to the 287 pathways to make these pathways more relevant to PE, and only the intersections were kept. As a result, 81 pathways including 300 DEGs were retained as pathway annotation data for subsequent study (Table I), such as Protein processing in endoplasmic reticulum (ID: hsa04141), Ribosome (ID: hsa03010), and Purine metabolism (ID: hsa00230).

Table I. KEGG pathway annotation data for PE.

Pathway ID | Pathway name | DEGs
hsa00010 | Glycolysis/Gluconeogenesis | PGAM1; HK2
hsa00230 | Purine metabolism | POLR2H; RRM1; DCK; PDE8B; HPRT1
hsa00240 | Pyrimidine metabolism | POLR2H; RRM1; DCK
hsa00270 | Cysteine and methionine metabolism | MAT2B; GOT1
hsa00350 | Tyrosine metabolism | MIF; GOT1
hsa00360 | Phenylalanine metabolism | MIF; GOT1
hsa00480 | Glutathione metabolism | GCLM; TXNDC12; RRM1
hsa00520 | Amino sugar and nucleotide sugar metabolism | HEXB; GNPDA1; HK2
hsa00531 | Glycosaminoglycan degradation | HEXB; GNS
hsa00564 | Glycerophospholipid metabolism | PLA2G16; MBOAT1
hsa00650 | Butanoate metabolism | L2HGDH; HMGCS1
hsa00900 | Terpenoid backbone biosynthesis | HMGCS1; PDSS2
hsa01200 | Carbon metabolism | PGAM1; GPT2; GOT1; HK2
hsa01210 | 2-Oxocarboxylic acid metabolism | GPT2; GOT1
hsa01230 | Biosynthesis of amino acids | PGAM1; MAT2B; GPT2; GOT1
hsa02010 | ABC transporters | ABCA7; ABCB6
hsa03008 | Ribosome biogenesis in eukaryotes | WDR75; MPHOSPH10; NVL
hsa03010 | Ribosome | RPL7A; MRPS5; RPL18A; RPS2; MRPL14
hsa03013 | RNA transport | TPR; ALYREF; UPF3B; SUMO3
hsa03015 | mRNA surveillance pathway | ALYREF; UPF3B
hsa03018 | RNA degradation | BTG1; HSPD1; LSM7
hsa03040 | Spliceosome | SYF2; ALYREF; LSM7
hsa04010 | MAPK signaling pathway | MAP4K3; RRAS2; GNG12
hsa04014 | Ras signaling pathway | RGL2; GNG2; RRAS2; GNG12; PLA2G16
hsa04020 | Calcium signaling pathway | SLC25A5; PHKA2
hsa04062 | Chemokine signaling pathway | GNG2; GNG12
hsa04068 | FoxO signaling pathway | CSNK1E; GABARAPL2; PRKAB2
hsa04141 | Protein processing in endoplasmic reticulum | DNAJC3; OS9; HSP90B1; SSR1; DNAJB11; UGGT2; DNAJB2; SSR4
hsa04142 | Lysosome | GNPTG; CTSC; HEXB; CTSA; GNS
hsa04145 | Phagosome | TUBA1B; ACTG1; TUBA1A
hsa04151 | PI3K-Akt signaling pathway | JAK1; COL27A1; HSP90B1; GNG2; GNG12
hsa04152 | AMPK signaling pathway | LEP; STRADB; ACACB; PRKAB2
hsa04310 | Wnt signaling pathway | CSNK1E; FZD7
hsa04360 | Axon guidance | SEMA4C; SEMA3B
hsa04390 | Hippo signaling pathway | SNAI2; ACTG1; CSNK1E; BMP6; FZD7
hsa04510 | Focal adhesion | PPP1R12C; COL27A1; ACTG1
hsa04520 | Adherens junction | SNAI2; ACTG1; PTPRB
hsa04530 | Tight junction | ACTG1; YBX3; RRAS2
hsa04540 | Gap junction | TUBA1B; TUBA1A
hsa04550 | Signaling pathways regulating pluripotency of stem cells | JAK1; FZD7
hsa04610 | Complement and coagulation cascades | F13A1; CFB; TFPI
hsa04611 | Platelet activation | COL27A1; ACTG1
hsa04614 | Renin-angiotensin system | MME; CTSA; ACE2
hsa04630 | Jak-STAT signaling pathway | JAK1; LEP
hsa04640 | Hematopoietic cell lineage | MME; CD24
hsa04710 | Circadian rhythm | CSNK1E; CLOCK; PRKAB2
hsa04713 | Circadian entrainment | GNG2; GNG12
hsa04723 | Retrograde endocannabinoid signaling | GNG2; GNG12
hsa04724 | Glutamatergic synapse | GNG2; GNG12
hsa04725 | Cholinergic synapse | GNG2; GNG12
hsa04726 | Serotonergic synapse | GNG2; GNG12
hsa04727 | GABAergic synapse | GABARAPL2; GNG2; GNG12
hsa04728 | Dopaminergic synapse | GNG2; CLOCK; GNG12
hsa04810 | Regulation of actin cytoskeleton | PPP1R12C; ACTG1; RRAS2; GNG12
hsa04910 | Insulin signaling pathway | PHKA2; ACACB; HK2; PRKAB2
hsa04913 | Ovarian steroidogenesis | BMP6; HSD17B2
hsa04919 | Thyroid hormone signaling pathway | ACTG1; NCOA2; MED27; RCAN1
hsa04920 | Adipocytokine signaling pathway | LEP; ACACB; PRKAB2
hsa04921 | Oxytocin signaling pathway | PPP1R12C; ACTG1; RCAN1; PRKAB2
hsa04922 | Glucagon signaling pathway | PGAM1; PHKA2; ACACB; PRKAB2
hsa04932 | Non-alcoholic fatty liver disease (NAFLD) | CEBPA; NDUFA12; LEP; PRKAB2
hsa04974 | Protein digestion and absorption | COL27A1; MME; ACE2; KCNN4; COL15A1
hsa05010 | Alzheimer's disease | NDUFA12; MME
hsa05012 | Parkinson's disease | NDUFA12; SLC25A5; UBB
hsa05016 | Huntington's disease | NDUFA12; SLC25A5; POLR2H
hsa05032 | Morphine addiction | GNG2; GNG12; PDE8B
hsa05034 | Alcoholism | H2AFY; HIST2H2AC; GNG2; GNG12
hsa05130 | Pathogenic Escherichia coli infection | TUBA1B; ACTG1; TUBA1A
hsa05152 | Tuberculosis | JAK1; HSPD1; BCL10
hsa05161 | Hepatitis B | JAK1; LAMTOR5
hsa05164 | Influenza A | DNAJC3; JAK1; ACTG1; KPNA2
hsa05166 | HTLV-I infection | JAK1; SLC25A5; RANBP1; RRAS2; FZD7
hsa05168 | Herpes simplex infection | JAK1; ALYREF; CLOCK
hsa05169 | Epstein-Barr virus infection | JAK1; VIM; POLR2H; AKAP8L
hsa05200 | Pathways in cancer | CEBPA; TPR; JAK1; HSP90B1; GNG2; GNG12; FZD7
hsa05203 | Viral carcinogenesis | JAK1; RANBP1
hsa05205 | Proteoglycans in cancer | PPP1R12C; ACTG1; RRAS2; FZD7
hsa05206 | MicroRNAs in cancer | FSCN1; VIM
hsa05230 | Central carbon metabolism in cancer | PGAM1; HK2
hsa05322 | Systemic lupus erythematosus | H2AFY; HIST2H2AC
hsa05410 | Hypertrophic cardiomyopathy (HCM) | ACTG1; PRKAB2

CEN

To describe the relationships among DEGs clearly, the SCC method was used to weight the strength of association between each pair of genes, and the weighted interactions were input into Cytoscape and visualized as the CEN for PE patients. A total of 351 nodes and 61,425 edges were present in the CEN, which indicated that all DEGs were mapped to the network. The edges between KPNA2 and MAT2B (weight=0.9986), FSTL3 and SKIDA1 (weight=0.9984), and SSNA1 and PFDN6 (weight=0.9984) had higher weights than the other interactions. Notably, a good linear correlation was observed among the weights. Additionally, topological centrality analysis of the nodes in the CEN of PE was conducted by counting, for each node, the nodes directly connected to it (its degree). Six nodes had a degree of at least 200: RDH13 (degree=202), SELENOS (degree=201), PAPPA2 (degree=201), RASSF7 (degree=201), DNAJC3 (degree=200) and PPP1R12C (degree=200).
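Two of the quantities in this paragraph are simple to compute, and the Python sketch below shows how. It counts the unordered pairs among n nodes, which for 351 genes gives 61,425 and equals the reported edge count, and it derives node degree from an edge list by counting direct neighbours. The helper names and the toy edge list are hypothetical.

```python
from collections import Counter


def complete_graph_edges(n_nodes):
    """Number of unordered node pairs: n * (n - 1) / 2."""
    return n_nodes * (n_nodes - 1) // 2


def degrees(edges):
    """Degree of each node, i.e. how many edges touch it."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


n_pairs = complete_graph_edges(351)

# Toy edge list, invented for illustration only.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_degrees = degrees(toy_edges)
print(n_pairs, toy_degrees.most_common(1))
```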

Optimal pathways

Using the pathway annotation data, we identified optimal pathways through gene function inference based on the network-based GBA method. During this process, an MFS was produced for each pathway. Importantly, we carried out 3-fold cross-validation on the MFS to calculate the AUROC for each pathway. The AUROC distribution among GO terms is illustrated in Fig. 1. We found that the AUROC values of a large number of pathways fell in the ranges 0.4–0.6 and 0.75–0.9. In total, 53 pathways had AUROC >0.5. Furthermore, 9 of these 53 pathways had AUROC >0.9 and were denoted as optimal pathways: microRNAs in cancer (AUROC=0.9966), gap junction (AUROC=0.9922), pathogenic Escherichia coli infection (AUROC=0.9888), phagosome (AUROC=0.9881), ovarian steroidogenesis (AUROC=0.9821), viral carcinogenesis (AUROC=0.9642), MAPK signaling pathway (AUROC=0.9473), tuberculosis (AUROC=0.9428), and tight junction (AUROC=0.9136).
Figure 1. The AUROC distribution among GO terms. The AUROC values of a large number of pathways fell in the ranges 0.4–0.6 and 0.75–0.9.
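The cross-validated AUROC can be sketched in Python under explicit assumptions. The text does not spell out the prediction rule, so this example uses neighbour voting, a common GBA scheme: in each fold one third of a pathway's genes are hidden, every remaining candidate gene is scored by the fraction of its total edge weight that goes to the still-labelled pathway genes, and the AUROC measures how well the hidden genes rank above non-pathway genes. The fold assignment (striding over the sorted gene list), the function names and the toy network are all hypothetical.

```python
from itertools import combinations


def auroc(scores, positives):
    """Pairwise AUROC; ties between a positive and a negative count as 0.5."""
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def neighbor_voting_auroc(weights, genes, pathway_genes, n_folds=3):
    """Mean AUROC over folds for one pathway (assumed neighbour-voting rule).

    weights: dict mapping frozenset({gene_a, gene_b}) -> edge weight
    """
    members = sorted(pathway_genes)
    fold_aurocs = []
    for fold in range(n_folds):
        held_out = set(members[fold::n_folds])  # labels hidden in this fold
        training = set(members) - held_out      # labels still visible
        if not held_out or not training:
            continue
        scores = {}
        for g in genes:
            if g in training:
                continue  # only unlabelled genes are ranked
            total = sum(weights[frozenset((g, h))] for h in genes if h != g)
            vote = sum(weights[frozenset((g, h))] for h in training)
            scores[g] = vote / total if total else 0.0
        fold_aurocs.append(auroc(scores, held_out))
    if not fold_aurocs:
        raise ValueError("pathway too small for the requested number of folds")
    return sum(fold_aurocs) / len(fold_aurocs)


# Toy network: two tight modules of three genes each (invented weights).
toy_genes = ["A", "B", "C", "D", "E", "F"]
module = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
toy_weights = {
    frozenset((g, h)): 0.9 if module[g] == module[h] else 0.1
    for g, h in combinations(toy_genes, 2)
}

cv_auroc = neighbor_voting_auroc(toy_weights, toy_genes, {"A", "B", "C"})
print(cv_auroc)
```

In this toy network the pathway coincides with one module, so each hidden gene is ranked above all non-pathway genes in every fold.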

Discussion

Our results showed that 53 pathways had a good classification performance (AUROC >0.5), and that 9 of them had AUROC >0.9 and were defined as optimal pathways: microRNAs in cancer, gap junction, pathogenic Escherichia coli infection, phagosome, ovarian steroidogenesis, viral carcinogenesis, MAPK signaling pathway, tuberculosis, and tight junction. The optimal pathway microRNAs in cancer plays a significant role in tumors, but its function in PE patients has also been reported (15). Furthermore, Bird et al focused on pregnancy endothelial adaptive failure in PE (16). Gap junctions are implicated in modulating intercellular communication during gestation, in accordance with the regulation of vascular tone (17). Hence, the gap junction pathway is closely related to PE. BMP6 and HSD17B2 were enriched in the ovarian steroidogenesis pathway, one of the optimal pathways. In previous studies, hydroxysteroid (17-β) dehydrogenase 1, encoded by HSD17B1, was found to be significantly decreased in PE patients and was identified as an independent risk factor for PE (18,19); thus, it may be proposed as a potential prognostic factor for PE. Additionally, the MAPK signaling pathway has received increasing attention, with studies demonstrating that it participates in PE progression as a crucial part of the pathogenesis of PE (20–22). In conclusion, 9 optimal pathways were identified for PE patients by the network-based GBA algorithm, which might shed new light on the molecular and pathological mechanisms of PE. However, these pathways have not yet been validated, and future studies should focus on this aspect.