Prabhash Kumar Jha, Aatira Vijay, Anita Sahu, Mohammad Zahid Ashraf.
Abstract
Thrombosis is a leading cause of morbidity and mortality in patients with myeloproliferative disorders (MPDs), particularly polycythemia vera (PV) and essential thrombocythemia (ET). Despite attempts to establish a link between thrombosis and these disorders, the shared biological mechanisms have yet to be characterized. An integrated gene expression meta-analysis of five independent, publicly available microarray datasets covering the three diseases (thrombosis, PV and ET) was conducted to identify shared gene expression signatures and overlapping biological processes. Using the INMEX bioinformatics tool, based on combined effect size (ES) approaches, we identified a total of 1,157 differentially expressed genes (DEGs), 697 overexpressed and 460 underexpressed, shared between the three diseases. Comprehensive functional enrichment and pathway analysis with the EnrichR tool's gene set libraries revealed "mRNA Splicing" and "SUMO E3 ligases SUMOylate target proteins" among the most enriched terms. Network-based meta-analysis identified MYC and FN1 as the most highly ranked hub genes. Our results reveal that alterations in biomarkers of the coagulation cascade such as F2R, PROS1, SELPLG and ITGB2 were common to the three diseases. The study has generated a novel database of candidate genetic markers, pathways and transcription factors shared between thrombosis and MPDs, which might aid in the development of prognostic therapeutic biomarkers.
Year: 2016 PMID: 27892526 PMCID: PMC5125005 DOI: 10.1038/srep37099
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Workflow of microarray meta-analysis.
(A) Selection process of eligible microarray datasets for the meta-analysis of shared signatures between thrombosis, essential thrombocythemia (ET) and polycythemia vera (PV), following the PRISMA 2009 flow diagram. (B) Flow chart of the integrated meta-analysis of the selected microarray datasets.
Characteristics of individual studies included in the meta-analysis.
| GEO accession no. | Disease | Samples (Ctl/Pt) | Sample source | Platform | Reference |
|---|---|---|---|---|---|
| GSE17078 | Venous Thrombosis (VT) | (n = 30) 27/3 | Blood Outgrowth Endothelial Cells | Affymetrix Human Genome U133A Array | |
| GSE19151 | Venous Thrombosis (VT) | (n = 133) 63/70 | Whole Blood | Affymetrix Human Genome U133A 2.0 Array | |
| GSE2006 | Essential Thrombocythemia (ET) | (n = 14) 8/6 | Platelets | Affymetrix Human Genome U133A Array | |
| GSE26049 | Polycythemia Vera (PV), Essential Thrombocythemia (ET) | (n = 81) 21/41 (PV)/19 (ET) | Whole Blood | Affymetrix Human Genome U133 Plus 2.0 Array | |
| GSE47018 | Polycythemia Vera (PV) | (n = 25) 6/19 | Peripheral blood CD34+ cells | Affymetrix Human Genome U133A Array | |
GEO: Gene Expression Omnibus. GSE26049 was further separated into two subgroups, 19 essential thrombocythemia patients and 41 polycythemia vera patients, sharing 21 common controls; the two subgroups were treated as individual datasets in the meta-analysis.
Figure 2. Gene expression pattern of the DEGs from meta-analysis.
(A) Venn diagram of differentially expressed genes identified from the meta-analysis (Meta-DE) and those from each individual microarray analysis (Individual-DE). (B) Heat-map representation of expression profiles for the top 25 up- and 25 down-regulated DEGs obtained from the meta-analysis. The selected genes were clustered on the heat-map with a hierarchical clustering algorithm using the Euclidean distance measure. Class 1: Control; Class 2: Patient.
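The clustering step named in the caption can be illustrated with a minimal sketch. The caption specifies hierarchical clustering with Euclidean distance but not the linkage rule, so average linkage is an assumption here, and the function name `average_linkage_clusters` and the toy profiles are hypothetical, not data from the study.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def average_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering of expression profiles.

    profiles: dict mapping a gene label to a list of expression values.
    Returns a list of sets of labels. Average linkage is an assumption;
    the source only states that Euclidean distance was used.
    """
    clusters = [{label} for label in profiles]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] | clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy profiles across four samples (not values from the study).
toy = {
    "gene_up_1": [2.0, 2.1, 0.1, 0.0],
    "gene_up_2": [1.9, 2.2, 0.2, 0.1],
    "gene_down_1": [0.1, 0.0, 2.0, 2.1],
    "gene_down_2": [0.0, 0.2, 1.8, 2.0],
}
groups = average_linkage_clusters(toy, n_clusters=2)
```

On the toy input the two genes with similar profiles end up in the same cluster, which is the behavior a heat-map dendrogram visualizes.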
Top 20 shared DEGs identified in the meta-analysis.
| EntrezID | Gene symbol | Gene name | Combined ES | Adjusted p-value |
|---|---|---|---|---|
| 678 | ZFP36L2 | ZFP36 ring finger protein-like 2 | 2.2679 | 8.26E-06 |
| 55692 | LUC7L | LUC7 like | 1.7633 | 9.99E-05 |
| 22861 | NLRP1 | NLR family, pyrin domain containing 1 | 1.6398 | 2.28E-11 |
| 440270 | GOLGA8B | Golgin A8 family member B | 1.543 | 0.001225 |
| 5878 | RAB5C | RAB5C, member RAS oncogene family | 1.5252 | 0.010788 |
| 1606 | DGKA | Diacylglycerol kinase alpha | 1.4978 | 0.000458 |
| 9057 | SLC7A6 | Solute carrier family 7 member 6 | 1.4452 | 0.018486 |
| 81669 | CCNL2 | Cyclin L2 | 1.4413 | 0 |
| 55696 | RBM22 | RNA binding motif protein 22 | 1.4279 | 1.63E-05 |
| 923 | CD6 | CD6 molecule | 1.4242 | 0.00429 |
| 7504 | XK | X-linked Kx blood group | −1.6059 | 9.19E-08 |
| 475 | ATOX1 | Antioxidant 1 copper chaperone | −1.4773 | 1.99E-06 |
| 8804 | CREG1 | Cellular repressor of E1A stimulated genes 1 | −1.4133 | 0.001229 |
| 5627 | PROS1 | Protein S (alpha) | −1.3574 | 0.000767 |
| 10855 | HPSE | Heparanase | −1.336 | 0.022878 |
| 4501 | MT1X | Metallothionein 1X | −1.3259 | 1.85E-06 |
| 9446 | GSTO1 | Glutathione S-transferase omega 1 | −1.2905 | 0 |
| 4860 | PNP | Purine nucleoside phosphorylase | −1.2799 | 0 |
| 2281 | FKBP1B | FK506 binding protein 1B | −1.262 | 0.001107 |
| 51327 | AHSP | Alpha hemoglobin stabilizing protein | −1.2479 | 1.30E-09 |
Genes were ranked according to the standardized difference, also known as effect size. The corresponding p-values were adjusted for the false discovery rate using the Benjamini–Hochberg procedure, which was used to select DE genes in each meta-analysis. Combined ES: combined effect size.
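The two steps named in this footnote, combining per-study effect sizes and adjusting p-values with the Benjamini–Hochberg procedure, can be sketched as follows. The source does not state which effect size model was used, so the inverse-variance fixed-effect combination below is a simplifying assumption; the function names are hypothetical and no study data are used.

```python
from math import erfc, sqrt


def combine_fixed_effect(effects, variances):
    """Inverse-variance weighted (fixed-effect) combination of effect sizes.

    Assumption: a fixed-effect model; the source does not specify the model.
    Returns (combined effect size, standard error, two-sided p-value).
    """
    weights = [1.0 / v for v in variances]
    combined = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    z = combined / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided p-value under a normal approximation
    return combined, se, p


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called differentially expressed when its adjusted p-value falls below the chosen threshold, and genes can be ranked by the absolute combined effect size.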
Figure 3. Network-based meta-analysis of hub genes.
(A) Zero-order interaction network of shared DEGs obtained from the meta-analysis, drawn with a force-directed algorithm using the Fruchterman-Reingold layout; red nodes represent overexpressed DEGs and green nodes represent underexpressed DEGs. (B) PPI subnetwork of the most significant underexpressed DEG with its interacting partners. (C) PPI subnetwork of the most significant overexpressed DEG with its interacting partners.
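Hub genes in an interaction network of this kind are commonly ranked by how many partners each node has. The caption does not state the ranking metric, so node degree is an assumption in the sketch below; `rank_hubs` and the toy edges are hypothetical and do not represent interactions reported by the study.

```python
from collections import Counter


def rank_hubs(edges):
    """Rank nodes of an undirected network by degree (distinct partners).

    Assumption: degree is used as the hub score; the source does not
    name the metric behind its hub ranking.
    """
    # Drop self-loops and treat (a, b) and (b, a) as the same edge.
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter(node for edge in unique_edges for node in edge)
    # Sort by descending degree, then by name for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical edges for illustration only.
toy_edges = [
    ("HUB_A", "G1"), ("HUB_A", "G2"), ("HUB_A", "G3"),
    ("HUB_B", "G3"), ("HUB_B", "G4"), ("G1", "HUB_A"),
]
ranking = rank_hubs(toy_edges)
```

Other centrality measures, such as betweenness, could be substituted for degree without changing the overall shape of the ranking step.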
Top enriched terms and biological pathways identified by functional analysis of the DEGs in the meta-analysis.
| Enrichment Term | Pathway/Term ID | Overlap | GSEA library | Adjusted P-value |
|---|---|---|---|---|
| Processing of Capped Intron-Containing Pre-mRNA | R-HSA-72203 | 33/193 | Reactome | 0.00634 |
| mRNA Splicing - Major Pathway | R-HSA-72163 | 25/134 | Reactome | 0.012207 |
| mRNA Splicing | R-HSA-72172 | 25/144 | Reactome | 0.023195 |
| SUMO E3 ligases SUMOylate target proteins | R-HSA-3108232 | 19/96 | Reactome | 0.029891 |
| mRNA Processing | WP411 | 25/127 | WikiPathways | 0.015246 |
| RNA splicing | GO:0008380 | 45/313 | GO | 0.011692 |
| mRNA processing | GO:0006397 | 50/397 | GO | 0.018005 |
| mRNA splicing, via spliceosome | GO:0000398 | 29/177 | GO | 0.018005 |
| Cytosol | GO:0005829 | 221/2529 | GO | 0.003815 |
| Nucleoplasm | GO:0005654 | 115/1051 | GO | 0.000197 |
Overlap: the number of hits from the meta-analysis relative to the size of each curated gene set. Gene set functional analysis was performed using the extended libraries of the EnrichR tool. Enriched terms and pathways were ranked by adjusted p-value. GO: gene ontology biological process; GSEA: Gene Set Enrichment Analysis.
Figure 4. Overrepresentation of pathways and Gene Ontology categories in biological networks identified from the meta-analysis.
(A) Network representation of enriched pathways, integrating KEGG and Reactome pathways on the DEG list using the ClueGO Cytoscape plug-in. Right-sided hypergeometric enrichment tests were applied at a p-value significance level of ≤0.05, followed by Bonferroni adjustment for the terms, and leading term groups were selected based on the highest significance. Larger node size and deeper color indicate greater significance of the enrichment. Pathways with an adjusted p-value <0.05 are shown in the network. (B) Enrichment network of shared DEGs based on biological processes. Significantly overrepresented biological processes based on GO terms were visualized in Cytoscape. The size of a node is proportional to the number of targets in the GO category. The color represents enrichment significance: the deeper the color on the color scale, the higher the enrichment significance. p-values were adjusted using the Benjamini–Hochberg false discovery rate (FDR) correction.
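The right-sided hypergeometric test and Bonferroni adjustment described in panel (A) can be sketched as below. This is an illustration of the statistical test only, not of ClueGO's internals; the function names are hypothetical, and the background size of 20,000 genes in the usage lines is an assumption that does not come from the source.

```python
from math import comb


def hypergeom_right_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    k: DEGs annotated to the term, K: genes annotated to the term,
    n: size of the DEG list, N: size of the background gene set.
    """
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 when more genes are requested than are available.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Usage with the overlap of one term from the enrichment table (33 of 193)
# and the 1,157 shared DEGs. N = 20000 is an assumed background size.
p_term = hypergeom_right_tail(k=33, K=193, n=1157, N=20000)
p_adjusted = bonferroni([p_term, 0.5])  # 0.5 is a placeholder second test
```

The resulting p-value depends strongly on the assumed background, so it should not be compared directly with the adjusted values reported in the table.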
Top coagulation related genes across the different datasets of meta-analysis.
| Gene | Gene name | Role | Fold change |
|---|---|---|---|
| SELPLG | Selectin P Ligand | Facilitates calcium-dependent interactions with E-, P- and L-selectins; mediates rapid rolling of leukocytes over vascular surfaces during the initial steps in inflammation and coagulation | 0.89846 |
| CPB2 | Carboxypeptidase B2 | Down regulates fibrinolysis by removing C-terminal lysine residues from fibrin that has already been partially degraded by plasmin | 0.5268 |
| F2R | Coagulation factor II (thrombin) receptor | High affinity receptor for activated thrombin. May play a role in platelet activation and in vascular development | −0.66526 |
| PROS1 | Protein S (alpha) | Anticoagulant plasma protein; it is a cofactor to activated protein C in the degradation of coagulation factors Va and VIIIa. It helps to prevent coagulation and stimulate fibrinolysis | −1.3574 |
| ITGB2 | Integrin, beta 2 | Receptor for the iC3b fragment of the third complement component and for fibrinogen | 0.91008 |
| PRKCH | Protein kinase C, eta | Serine/threonine-protein kinase that is involved in the regulation of cell differentiation in keratinocytes and pre-B cell receptor | 1.3991 |
| RAC1 | Ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1) | Plasma membrane-associated small GTPase which cycles between active GTP-bound and inactive GDP-bound states | 0.84209 |
| PDPK1 | 3-phosphoinositide dependent protein kinase-1 | Serine/threonine kinase which acts as a master kinase, phosphorylating and activating a subgroup of the AGC family of protein kinases | 0.62861 |
| GNAI2 | Guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 2 | Guanine nucleotide-binding proteins (G proteins) are involved as modulators or transducers in various transmembrane signaling systems. May play a role in cell division | 0.60468 |
| KDM1A | Lysine (K)-specific demethylase 1A | Component of a RCOR/GFI/KDM1A/HDAC complex that suppresses, via histone deacetylase (HDAC) recruitment, a number of genes implicated in multilineage blood cell development | 0.71635 |
Top differentially expressed coagulation-related genes from the Gene Ontology term “blood coagulation (GO:0007596)” (overlap 45/472), which form a shared signature between thrombosis, PV and ET individuals. Possible roles were extracted from the STRING database, and expression values were taken from the meta-analysis results.