Fawad Ali1,2, Arifullah Khan1, Syed Aun Muhammad3, Syed Qamar Abbas4, Syed Shams Ul Hassan5,6, Simona Bungau7,8. 1. Riphah Institute of Pharmaceutical Sciences, Riphah International University, Islamabad, 44000 Pakistan. 2. Department of Pharmacy, Kohat University of science and technology, Kohat, 26000 Pakistan. 3. Institute of Molecular Biology and Biotechnology, Bahauddin Zakariya University, Multan, 60800 Pakistan. 4. Department of Pharmacy, Sarhad University of Science and Technology, Peshawar 24840, Pakistan. 5. Shanghai Key Laboratory for Molecular Engineering of Chiral Drugs, School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, PR China. 6. Department of Natural Product Chemistry, School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, PR China. 7. Department of Pharmacy, Faculty of Medicine and Pharmacy, University of Oradea, 410028 Oradea, Romania. 8. Doctoral School of Biological and Biomedical Sciences, University of Oradea, 410087 Oradea, Romania.
Abstract
The prevalence of hypertension reported around the world is increasing and is an important public health challenge. This study was designed to explore the disease's genetic variations and to identify new hypertension-related genes and target proteins. We analyzed 22 publicly available Affymetrix cDNA datasets of hypertension using an integrated system-level framework involving differential expression genetic (DEG) analysis, data mining, gene enrichment, protein-protein interaction, microRNA analysis, toxicogenomics, gene regulation, molecular docking, and simulation studies. We found potential DEGs after screening out the extracellular proteins. We studied the functional role of seven shortlisted DEGs (ADM, EDN1, ANGPTL4, NFIL3, MSR1, CEBPD, and USP8) in hypertension after disease gene curation analysis. The expression profiling and cluster analysis showed significant variations and enriched GO terms. hsa-miR-365a-3p, hsa-miR-2052, hsa-miR-3065-3p, hsa-miR-603, hsa-miR-7113-3p, hsa-miR-3923, and hsa-miR-524-5p were identified as hypertension-associated miRNA targets for each gene using computational algorithms. We found functional interactions of source DEGs with target and important gene signatures including EGFR, AGT, AVP, APOE, RHOA, SRC, APOB, STAT3, UBC, LPL, APOA1, and AKT1 associated with the disease. These DEGs are mainly involved in fatty acid metabolism, myometrial pathways, MAPK, and G-alpha signaling pathways linked with hypertension pathogenesis. We predicted significantly disordered regions of 71.2, 48.8, and 45.4% representing the mutation in the sequence of NFIL3, USP8, and ADM, respectively. Regulation of gene expression was performed to find upregulated genes. Molecular docking analysis was used to evaluate Food and Drug Administration-approved medicines against the four DEGs that were overexpressed. For each elevated target protein, the three best drug candidates were chosen. 
Furthermore, molecular dynamics (MD) simulation using the target's active sites for 100 ns was used to validate these 12 complexes after docking. This investigation establishes the worth of systems genetics for finding four possible genes as potential drug targets for hypertension. These network-based approaches are significant for finding genetic variant data, which will advance the understanding of how to hasten the identification of drug targets and improve the understanding regarding the treatment of hypertension.
Hypertension (HTN) is considered a major health problem because of its associated high risk of cardiovascular abnormalities.[1,2] A high prevalence of hypertension has been reported in both economically developed and developing countries, affecting more than one billion individuals worldwide.[3,4] However, the prevalence is not uniform and varies with geographical conditions.[5] The present management protocols of HTN are inadequate to control cardiovascular complications because of shortfalls in prevention, diagnosis, and control of the disorder.[6] The understanding of pharmacoepidemiology plays a significant role in raising awareness and reducing hypertension-associated morbidity and mortality. The etiology of HTN is not clearly understood; however, a number of factors including diet, lifestyle, and genetics may contribute to the pathogenesis.[7] Although there are established recommendations, implementation is difficult because of low awareness, poor patient and physician compliance, and health care issues.[8]
In this study, we carry out the differential analysis of HTN-related Affymetrix datasets to find the genetic reasons for the disease. The most common techniques range from comparative genomics to proteomic-level analysis, genome-wide scanning, differential screening, and systems biology approaches. Many small changes in gene expression and polymorphisms of genes are associated with the progression of the disease.[9] These genetic variations lead toward post-translational modifications (PTM) including more than 400 types of chemical alterations at the amino acid level. PTM sites of the disordered proteins are responsible for altering motifs and ultimately disease development.[10]
Simulation-based analysis improves therapeutic strategies by providing innovative ways for medical sciences to cope with diseases using data in a virtual environment. Therefore, these machine learning programs can help to find the potential target proteins.[11] Meta-genomics covers the broader spectrum of genetic analysis, which results in better outcomes in clinical sciences.[12] The virtual-based approach is an inexpensive, safe, and time-effective way to analyze complex patient samples.[11] The major objectives of our integrative framework are to (i) determine the DEGs associated with the pathogenesis of hypertension, (ii) map their role in physiological and biochemical functions, (iii) carry out expression profiling, (iv) identify functional interactors and gene signatures of the disease, (v) analyze the mutation of core proteins and other regulatory motifs, and (vi) determine potential targets by drug–gene networks through molecular docking and simulation studies (Figure 1).
Figure 1
Schematic diagram and framework of the study.
Results
Differential Analysis and Normalization
We obtained 22 hypertension-related Affymetrix cDNA datasets in CEL format. The AffyBatch objects have array sizes of 712 × 712, 1164 × 1164, 1050 × 1050, and 732 × 732 (Table 1). The density estimation of the data is shown by histograms representing expression after normalization. The array distributions have similar shapes and ranges, indicating that the data are of good quality. The rightward shift of the array distributions reveals a high background level (Figure 2). RNA quality, sequence biases, and RNA degradation were all examined in the cDNA datasets. Using low-quality RNA samples is wasteful in genomic analysis. It is unclear whether transcript degradation happens uniformly in low-quality RNA samples, in which case data normalization can counteract the effects of degradation, or whether different RNA samples degrade at different rates, thereby biasing expression measures. As a result, we examined the RNA quality before differential expression analysis to ensure that the datasets are reliable for detecting transcriptional variation in the original samples. By applying discrimination measures for statistical and algorithmic analysis, the normalization procedure was used to standardize sample handling techniques and to estimate the best RNA variability threshold. Individual probes in each probe set are ordered by their position relative to the 5′-end of the target RNA molecule. A 3′/5′ intensity gradient has been demonstrated to affect the competitive binding of a probe to its target. When the RNA is of poor quality, only a small amount of it hybridizes to the array, and the total signal output is reduced as a result. The 3′/5′ intensity gradient, on the other hand, falls as the saturation level rises. A probe set corresponding to the transcripts is located at the target gene’s 3′-end. “AffyRNAdeg” generates a statistical summary for each array in the batch to assess RNA degradation levels and their significance (Figure 3). Table 2 lists the tools, information, databases, and web servers that were used in this investigation.
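The idea behind an RNA degradation summary can be sketched as follows. This is a minimal illustration, not the AffyRNAdeg implementation: it assumes that the probes of every probe set are already ordered from the 5′-end to the 3′-end, averages the intensity at each probe position, and fits a least-squares slope. The function name and the intensity values are hypothetical.

```python
from statistics import mean

def degradation_slope(probe_sets):
    """Least-squares slope of mean intensity versus probe position (5' to 3')."""
    n_positions = len(probe_sets[0])
    # Average intensity at each probe position across all probe sets.
    position_means = [mean(ps[k] for ps in probe_sets) for k in range(n_positions)]
    xs = list(range(n_positions))
    x_bar, y_bar = mean(xs), mean(position_means)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, position_means))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

# Hypothetical log2 intensities: rows are probe sets, columns are probes ordered 5' -> 3'.
intact = [[8.0, 8.1, 8.0, 8.1], [9.0, 9.0, 9.1, 9.0]]
degraded = [[6.0, 7.0, 8.0, 9.0], [7.0, 8.0, 9.0, 10.0]]

slope_intact = degradation_slope(intact)
slope_degraded = degradation_slope(degraded)
print(f"intact slope: {slope_intact:.3f}, degraded slope: {slope_degraded:.3f}")
```

In this toy example, the array with the steeper 5′ to 3′ slope is the one that would be flagged for closer inspection.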
Table 1
List of Human Affymetrix cDNA Datasets
S. no. | dataset accession | total samples | tissues | conditions | platform | size of arrays | AffyIDs | ref
--- | --- | --- | --- | --- | --- | --- | --- | ---
1 | GSE3356 | 9 | coronary smooth muscle | case vs control | GPL96 [HG-U133A] Affymetrix Human Genome U133A Array | 712 × 712 | 22,283 | (13)
2 | GSE6489 | 6 | endothelial cells | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 | 1164 × 1164 | 54,675 | (14)
3 | GSE6573 | 6 | adipose, decidua, and placenta tissue | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 1164 × 1164 | 54,675 | (15)
4 | GSE10767 | 7 | arterial hypertension | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 1164 × 1164 | 54,675 | (16)
5 | GSE11341 | 12 | endothelial cells | case vs control | GPL96 [HG-U133A] Affymetrix Human Genome U133A Array | 712 × 712 | 22,283 | (17)
6 | GSE17814 | 18 | endothelial cells | case vs control | GPL9099 Affymetrix GeneChip Human Genome U133 Plus 2.0 Array | 1164 × 1164 | 54,675 | (18)
7 | GSE19136 | 12 | human left mammary artery | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 | 1164 × 1164 | 54,675 | (19)
8 | GSE22255 | 40 | blood mononuclear cells | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 | 1164 × 1164 | 54,675 | (20)
9 | GSE22356 | 38 | blood mononuclear cells | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 | 1164 × 1164 | 54,675 | (21)
10 | GSE24752 | 6 | peripheral blood cells | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 | 1164 × 1164 | 54,675 | (22)
11 | GSE28345 | 8 | kidney | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (23)
12 | GSE28360 | 14 | kidney | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (23)
13 | GSE37455 | 41 | kidney | case vs control | GPL11670 Affymetrix Human Genome U133 Plus 2.0 Array | 1164 × 1164 | 54,675 | (24)
14 | GSE38783 | 24 | endothelial cell | case vs control | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 1164 × 1164 | 54,675 | (25)
15 | GSE42955 | 29 | heart | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (25)
16 | GSE67492 | 6 | heart | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (26)
17 | GSE69601 | 6 | blood samples | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (27)
18 | GSE70456 | 16 | endothelial cell | case vs control | GPL15207 Affymetrix | 732 × 732 | 49,495 | (28)
19 | GSE71994 | 40 | peripheral blood mononuclear cells | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (29)
20 | GSE87493 | 32 | blood mononuclear cells | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (30)
21 | GSE113439 | 26 | lung tissue | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix | 1050 × 1050 | 32,321 | (31)
22 | GSE124114 | 18 | trabecular meshwork cells | case vs control | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array | 1050 × 1050 | 32,321 | (32)
Figure 2
Normalization and differential
analysis. The histogram shows the
density of the data analyzed. Normally, the proportions of the clusters
have comparable shapes. Significant levels of background shifted the
intensities of the different arrays toward the right.
Figure 3
RNA degradation plot produced by AffyRNAdeg, showing the 5′ to 3′ intensity pattern used to assess the degree and severity of degradation.
Table 2
Tools, Databases, and Software Used in This Study
databases/software/tools | accessibility | utility
--- | --- | ---
CELLO subcellular localization predictor | http://cello.life.nctu.edu.tw/ | subcellular localization prediction
DAVID bioinformatics tool | http://david.abcc.ncifcrf.gov | functional annotation tool
STRING database | http://string-db.org/ | for known and predicted protein/COGs interaction
NCBI | http://blast.ncbi.nlm.nih.gov/ | biomedical and genomic information source
KEGG database | http://www.genome.jp/ | pathway analysis and comparison
Cytoscape version 3.6.0 | http://www.cytoscape.org/ | for network analysis and visualization
FunRich tool 3.1.3 | http://funrichweb.org/ | for significant association of DEGs in biological pathways
CIMminer | https://discover.nci.nih.gov | for cluster analysis regarding their expression value
ActiveDriverDB | https://www.activedriverdb.org/ | for mutation analysis
WikiPathways | https://www.wikipathways.org/ | pathway analysis and comparison
oPOSSUM version 3.0 | http://opossum.cisreg.ca/oPOSSUM3/ | prediction of regulatory motifs
miRDB | http://mirdb.org./ | to explore the functional annotation
Finding Potential Drug Targets
We identified 30 DEGs in each individual dataset by pairwise comparison. From the lists of top-ranked genes, 18 common DEGs were sorted out (Table S1). Protein subcellular localization prediction indicated that GREM2 and SCN2B are extracellular; TPBG is membrane-bound; DUSP6 and HDDC2 are cytoplasmic; and DDIT4, CEBPD, EDN1, NFIL3, ANGPTL4, BHLHE40, C10orf10, ADM, MSR1, LRR1, C6orf15, ETS2, and USP8 are nuclear (Figure 4). Based on data mining, seven hypertension-related DEGs with a literature count above 200 were shortlisted: ADM, EDN1, ANGPTL4, NFIL3, MSR1, CEBPD, and USP8 (Figure 5).
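Sorting out the genes shared between per-dataset DEG lists can be sketched as below. The text does not state the exact rule used to call a gene "common", so the `min_datasets` threshold, the dataset labels, and the placeholder genes `GENE_X` and `GENE_Y` are assumptions made for illustration only.

```python
from collections import Counter

def common_genes(top_lists, min_datasets):
    """Return genes that appear in at least `min_datasets` of the ranked lists."""
    # set() guards against a gene being counted twice within one dataset.
    counts = Counter(gene for genes in top_lists.values() for gene in set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_datasets)

# Hypothetical top-ranked DEG lists for three datasets.
top_lists = {
    "dataset_A": ["ADM", "EDN1", "USP8", "GENE_X"],
    "dataset_B": ["ADM", "EDN1", "GENE_Y"],
    "dataset_C": ["ADM", "USP8", "EDN1"],
}

shared = common_genes(top_lists, min_datasets=3)
print(shared)
```

Lowering `min_datasets` relaxes the definition of "common" and returns a longer list.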
Figure 4
Subcellular localization of differentially expressed genes and DEG distribution among the cellular compartments.
Figure 5
Disease–gene
curation. The bar graph indicates the disease–gene
mapping (hypertension-potential genes) using online databases.
Cluster and Functional Enrichment Analysis
We evaluated the expression of these seven DEGs to obtain a complete description of their molecular functions. The profiling showed variations in, and the comparative expression of, the genes in hypertension. We assessed the similarity index between disease–gene interactions through cluster analysis and observed hypertension-related enriched terms[33−36] (Figure 6). Pathway-enriched relations indicate a substantial association of hypertension with DEGs in biological pathways involving hypoxia and oxygen homeostasis of HIF-1, an affinity for calcitonin-like ligands, and PI3K, EGF, and ARF6 signaling (Figure 7).
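The Euclidean distance that underlies this kind of cluster analysis can be illustrated with a short sketch. It computes all pairwise distances between expression profiles and reports the closest pair, which is the first pair an agglomerative clustering would merge. The gene labels and expression values are hypothetical, and the sketch does not reproduce the clustering tool's own settings.

```python
from itertools import combinations
from math import dist

# Hypothetical expression profiles (one value per sample) for three genes.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [1.0, 2.0, 4.0],
    "GENE_C": [5.0, 5.0, 5.0],
}

# Pairwise Euclidean distances between every pair of profiles.
distances = {
    (a, b): dist(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

# The pair with the smallest distance is merged first in agglomerative clustering.
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```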
Figure 6
Cluster analysis of seven hypertension-related DEGs with Euclidean
distance (binning method: quantile lines show the limits of the clusters
in the degree of the tree).
Figure 7
Pathway
enrichment analysis indicates the percentage of DEGs in
the biological pathway using FunRich tool.
Identifying Regulatory Motifs and miRNA Targets
The seven hypertension-related DEGs were used for de novo examination to find the regulatory motifs and transcription factors, including Nkx2-5, HOXA5, ARID3A, Pdx1, MZF1_1-4, SPIB, Prrx2, ZEB1, ZNF354C, SRY, and ELF5 (Table S2). The main miRNAs, hsa-miR-365a-3p, hsa-miR-2052, hsa-miR-3065-3p, hsa-miR-603, hsa-miR-7113-3p, hsa-miR-3923, and hsa-miR-524-5p, were predicted for the genes ADM, USP8, ANGPTL4, NFIL3, EDN1, MSR1, and CEBPD, respectively (Table 3).
Table 3
MiRNA Targets of Hypertension-Associated Genes
serial no. | gene symbol | gene description | target score(a) | microRNA name | total hits | miRNA sequence | seed location | 3′-UTR length
--- | --- | --- | --- | --- | --- | --- | --- | ---
1 | ADM | adrenomedullin | 97 | hsa-miR-365a-3p | 71 | UAAUGCCCCUAAAAAUCCUUAU | 531 | 777
2 | USP8 | ubiquitin specific peptidase 8 | 94 | hsa-miR-2052 | 140 | UGUUUUGAUAACAGUAAUGU | 213 | 2008
3 | ANGPTL4 | angiopoietin-like 4 | 85 | hsa-miR-3065-3p | 22 | UCAGCACCAGGAUAUUGUUGGAG | 381, 387 | 489
4 | NFIL3 | nuclear factor, interleukin 3 regulated | 97 | hsa-miR-603 | 49 | CACACACUGCAAUUACUUUUGC | 254 | 320
5 | EDN1 | endothelin 1 | 97 | hsa-miR-7113-3p | 112 | CCUCCCUGCCCGCCUCUCUGCAG | 389 | 1139
6 | MSR1 | macrophage scavenger receptor 1 | 100 | hsa-miR-3923 | 235 | AACUAGUAAUGUUGGAUUAGGG | 1184, 1870, 1888, 1900, 1906 | 2207
7 | CEBPD | CCAAT enhancer binding protein delta | 98 | hsa-miR-524-5p | 37 | CUACAAAGGGAAGCACUUUCUC | 210, 264, 303, 368 | 415
(a) Highly reliable score ≥ 80, least reliable score ≤ 50.
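How a seed location can arise is illustrated by the sketch below, which derives the 3′-UTR site complementary to a miRNA seed (nucleotides 2 to 8) and scans a UTR for it. This is plain seed matching, not the prediction model behind the target scores in Table 3. The helper names, the toy UTR, and the 1-based start-position convention are assumptions; only the miRNA sequence is taken from Table 3.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna, start=1, end=8):
    """Reverse complement of the miRNA seed (nucleotides 2-8 by default)."""
    seed = mirna[start:end]
    return seed.translate(COMPLEMENT)[::-1]

def find_sites(utr, mirna):
    """1-based start positions of perfect seed-complementary sites in a UTR."""
    site = seed_site(mirna)
    width = len(site)
    return [i + 1 for i in range(len(utr) - width + 1) if utr[i:i + width] == site]

mirna = "UAAUGCCCCUAAAAAUCCUUAU"  # hsa-miR-365a-3p, sequence from Table 3
toy_utr = "ACGGGCAUUAGC"          # hypothetical UTR fragment, not the real ADM 3'-UTR

site = seed_site(mirna)
positions = find_sites(toy_utr, mirna)
print(site, positions)
```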
Mutation Analysis
We performed a mutation analysis of the seven DEGs. The NFIL3 protein (462 residues) has 21 PTM sites; the gene lies on the negative strand of chromosome 9 and showed recurrent mutations, with 71.21% of the sequence predicted to be disordered. The NFIL3 N103S variant showed a proximal mutation at position 103, where asparagine (N) is replaced by serine (S). CEBPD, on chromosome 8, has 269 residues, exhibited 52.04% of its sequence as disordered, and has six PTM sites with 18 mutations. The CEBPD G186D, R195Q, and P257A variants were identified by mutational enrichment analysis at positions 186, 195, and 257, respectively. USP8, on chromosome 15, revealed 48.84% of the sequence as disordered, with 42 PTM sites, 1118 protein residues, and 137 mutations. USP8 T351A showed a direct mutation. Other variants, USP8 R638T, R638K, and R638G, showed mutations at position 638. Similarly, ADM, on the positive strand of chromosome 11, has 24 mutations, 185 residues, and 12 PTM regions. The ADM S178N variant showed a direct mutation at position 178, while ANGPTL4 showed 86 mutations on the positive strand of chromosome 19 with 406 protein residues (Figure 8).
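The substitution labels used above (reference residue, position, substituted residue) are easy to handle programmatically. The following sketch parses the labels quoted in the text and groups them by position to show which sites carry more than one substitution. The parsing helper and the grouping are illustrative assumptions and not part of the mutation database's interface.

```python
import re
from collections import defaultdict

VARIANT = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_variant(label):
    """Split a label such as 'R638T' into (reference, position, substitution)."""
    match = VARIANT.match(label)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

# Variant labels as quoted in the text.
variants = {
    "NFIL3": ["N103S"],
    "CEBPD": ["G186D", "R195Q", "P257A"],
    "USP8": ["T351A", "R638T", "R638K", "R638G"],
    "ADM": ["S178N"],
}

by_position = defaultdict(list)
for gene, labels in variants.items():
    for label in labels:
        ref, pos, alt = parse_variant(label)
        by_position[(gene, pos)].append(alt)

# Positions that carry more than one distinct substitution.
recurrent = {site: alts for site, alts in by_position.items() if len(alts) > 1}
print(recurrent)
```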
Figure 8
Mutation analysis indicates the post-translational
change in human
genes/proteins using the ActiveDriverDB database. Needle plots demonstrate
the PTM sites in our proteins (shown in legend color codes). The y axis indicates the mutation count while the x axis demonstrates the position of the amino acid sequence. Pinhead
shading means the mutation effect, and x axis shading
shows the kind of PTM related to the mutation area.
Protein Network Analysis
In the protein–protein network, a total of 72 nodes and 64 edges were retrieved from the STRING database. The PPI network was principally characterized by three node types: source nodes (light pink), target nodes (light gray), and the remaining light-yellow nodes, which represent the other gene signatures (Figure 9A). The topological properties of the network, measured between the nodes and edges, were analyzed with a network analyzer and characterized the qualitative gene pattern (Figure 9B). The shortlisted DEGs interact with other proteins that are important in the disease phenotype, including CALCRL, RAMP2, RAMP1, RAMP3, CALCA, ADM2, AVP, MAPK1, EDNRB, EDNRA, ECE1, AGT, AKT1, RHOA, SRC, LPL, PPARA, PPARGC1A, PPARG, FABP4, RXRA, PER2, ARNTL, BHLHE40, CRY2, CRY1, COL4A2, APOE, APOB, APOA1, COL1A2, CALR, COL3A1, HSP90B1, COL4A1, STAT3, KLF5, STAM2, EGFR, UBC, and HGS. After disease gene mapping using the PubMed database, we estimated that 40 target genes have potential as drug targets in hypertension. Among them, EGFR, AGT, AVP, APOE, RHOA, SRC, APOB, STAT3, UBC, LPL, APOA1, and AKT1 are the enriched terms (Figure 9C).
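One of the simplest topological parameters of such a network is the node degree. The sketch below counts the degree of each node from an edge list and ranks the nodes. The edges shown are hypothetical placeholders chosen from gene names mentioned in the text; the 64 edges of the actual network are not listed here.

```python
from collections import defaultdict

# Hypothetical undirected edges between a few of the named proteins.
edges = [
    ("ADM", "CALCRL"),
    ("ADM", "RAMP2"),
    ("ADM", "EDN1"),
    ("EDN1", "EDNRA"),
    ("EDN1", "EDNRB"),
    ("EDN1", "AGT"),
]

degree = defaultdict(int)
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

# Rank nodes by degree (highest first), breaking ties alphabetically.
ranked = sorted(degree, key=lambda node: (-degree[node], node))
print([(node, degree[node]) for node in ranked])
```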
Figure 9
Gene network analysis. (A) Protein–protein
interaction network.
Interaction of seeder/source nodes (light pink) with target nodes
(light gray). (B) Topological properties of the network were analyzed
by a network analyzer. (C) Disease–gene mapping.
Pathway Analysis and Associated Mechanisms
We analyzed the role of the potential drug targets in the associated pathways to identify the mechanism of action of these molecules in hypertension. These proteins have a major role in the regulation of integrated pathways including fatty acid metabolism, melatonin, the transcriptional cascade regulating adipogenesis, prostaglandin synthesis and regulation, binding and uptake of ligands by scavenger receptors, plasma lipoprotein assembly, myometrial relaxation and contraction pathways, the PPAR signaling pathway, the relaxin signaling pathway, the MAPK signaling pathway, and G alpha (s) signaling events. The synthesis of these biomolecules depends on the regular expression patterns of the respective genes, while dysregulation of these pathways results in anomalies associated with hypertension (Figure 10). We found 31 signaling pathways associated with hypertension.
Figure 10
Pathway analysis and molecular mechanisms in hypertension.
The
pathways have been mapped using KEGG and Wiki Pathways. Color codes
are used to describe the reaction steps of the pathway model.
Toxicogenomics
We examined the adverse effects of environmental and pharmaceutical chemicals on disease progression and ultimately on human health. Based on activity, binding, and expression patterns, we observed the effect of chemicals that either increase or decrease gene activity at the cellular level (Figure 11). Chemicals that increase or decrease expression are shown as light green and yellow nodes, and the effect of cotreatment on expression is indicated in light gray.
Figure 11
Toxicogenomic analysis of differentially expressed
genes by the
Comparative Toxicogenomics Database (CTD) helps to study the chemical
genome to phenome relationships.
Upregulated Genes
The predicted log2 fold changes of the seven shortlisted genes, obtained from Expression Atlas data, are given in Table 4. These results were further supported by a study[2] in which ADM, USP8, ANGPTL4, and EDN1 were upregulated in the peripheral blood samples of hypertensive patients (Figure 12).
Table 4
Regulation of Gene Expression via Expression Atlas
S. no. | gene | log2 fold change | gene regulation
--- | --- | --- | ---
1 | ADM | 1.1 | upregulated
2 | ANGPTL4 | 3.1 | upregulated
3 | EDN1 | 1.1 | upregulated
4 | USP8 | 1.2 | upregulated
5 | MSR1 | –1.4 | downregulated
6 | CEBPD | –1.1 | downregulated
7 | NFIL3 | –2.3 | downregulated
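The regulation calls in Table 4 follow directly from the sign of the log2 fold change, as the short sketch below shows using the tabulated values. The zero cutoff and the conversion to a linear fold change are illustrative choices; the text does not state whether any additional threshold was applied.

```python
# log2 fold changes as listed in Table 4.
log2_fold_change = {
    "ADM": 1.1,
    "ANGPTL4": 3.1,
    "EDN1": 1.1,
    "USP8": 1.2,
    "MSR1": -1.4,
    "CEBPD": -1.1,
    "NFIL3": -2.3,
}

def regulation(log2fc):
    """Label a gene by the sign of its log2 fold change (assumed cutoff: 0)."""
    if log2fc > 0:
        return "upregulated"
    if log2fc < 0:
        return "downregulated"
    return "unchanged"

upregulated = [gene for gene, value in log2_fold_change.items()
               if regulation(value) == "upregulated"]

# A log2 fold change of x corresponds to a linear fold change of 2**x.
linear_fold = {gene: 2 ** value for gene, value in log2_fold_change.items()}
print(upregulated, round(linear_fold["ANGPTL4"], 2))
```

The four genes labeled as upregulated here are the ones carried forward to docking.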
Figure 12
Fold variations in gene expression, showing the abnormal expression levels of differentially expressed genes in hypertension patients and controls.
Molecular Docking
FDA-approved drugs were docked with MOE against a specific binding pocket of the proteins encoded by the four genes upregulated in hypertension, and all of the complexes were ranked based on the energy function score (S-score), RMSD plots, non-covalent interaction strength, hydrogen bonding, and maximal accommodation within the binding pocket. Out of the 99 drugs in the library, 12 were selected as the best candidates for simulation. For ADM (PDB: 4RWF), CID:65999 had a minimum binding score of −10.2 kcal/mol, CID:135409642 had a score of −10.1 kcal/mol, and CID:3749 had a score of −9.9 kcal/mol (Figure 13A–C, respectively). For USP8 (PDB: 2GFO), the top three ranked drugs, CID:2450, CID:65999, and CID:71301, had minimum binding scores of −9.1, −9.1, and −8.6 kcal/mol, respectively (Figure 13D–F). For ANGPTL4 (PDB: 6EUB), the top-ranked drugs, with a minimum binding score of −8 kcal/mol, were CID:110635, CID:135409642, and CID:3157 (Figure 13G–I). For EDN1 (PDB: 6DK5), CID:65999, CID:110635, and CID:3749 had the lowest binding scores, at −9, −8.7, and −8.5 kcal/mol, respectively (Figure 13J–L). All 12 complexes show extensive hydrogen bonding, van der Waals interactions, and other hydrophobic interactions with the binding pocket residues.
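The ranking step can be sketched with the binding scores quoted above, where a more negative score means a better rank. ANGPTL4 is left out because its per-compound scores are not given individually. The data layout and helper name are assumptions; the sketch does not reproduce the docking software's own scoring or ranking procedure.

```python
from collections import defaultdict

# Binding scores (kcal/mol) as quoted in the text; more negative ranks higher.
docking_scores = {
    "ADM": {"CID:65999": -10.2, "CID:135409642": -10.1, "CID:3749": -9.9},
    "USP8": {"CID:2450": -9.1, "CID:65999": -9.1, "CID:71301": -8.6},
    "EDN1": {"CID:65999": -9.0, "CID:110635": -8.7, "CID:3749": -8.5},
}

def rank_compounds(scores):
    """Order compounds from lowest (best) to highest binding score."""
    return sorted(scores, key=scores.get)

best_per_target = {target: rank_compounds(scores)[0]
                   for target, scores in docking_scores.items()}

# Which targets does each compound appear against?
targets_per_compound = defaultdict(list)
for target, scores in docking_scores.items():
    for compound in scores:
        targets_per_compound[compound].append(target)

print(best_per_target)
print(targets_per_compound["CID:65999"])
```

Among the scores listed, CID:65999 is the only compound that appears against all three of these targets.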
Figure 13
(A–C) 2D interactions of CID:65999, CID:135409642, and CID:3749 with 4RWF. (D–F) 2D interactions of CID:2450, CID:65999, and CID:71301 with 2GFO. (G–I) 2D interactions of CID:110635, CID:135409642, and CID:3157 with 6EUB. (J–L) 2D interactions of CID:65999, CID:110635, and CID:3749 with 6DK5.
Molecular Dynamics (MD) Simulation
With the Desmond simulation package, MD simulations were run for 100 ns per complex after docking. The root mean square deviation (RMSD), root mean square fluctuation (RMSF), and protein–ligand interaction (PLI) parameters were determined from the MD trajectories. We ran the MD simulations with similar parameters for each of the twelve complexes. The results of each complex’s simulation were shown to be repeatable and are presented below.
RMSD Analysis
Both the ligand and the protein in the 4RWF–CID:65999 complex reached equilibrium at 40 ns and remained stable for the rest of the simulation. After stability was established, the ligand RMSD fluctuations were maintained within 1.6 Å (Figure 14A). Similarly, in the 4RWF complex with CID:135409642, the ligand RMSD became stable at 10 ns and stayed stable up to 100 ns. After 40 ns, there was considerable variability in the protein RMSD, although it was within acceptable limits (Figure 14B). RMSD indicated good stability up to 80 ns for the 4RWF complex with the CID:3749 ligand (Figure 14C). RMSD changes stayed within 1.5 Å over the simulation period, which is acceptable for small, globular proteins like 4RWF.
The 2GFO–CID:2540 complex (Figure 14D) reached equilibrium within 10 ns and remained stable throughout the simulation: the RMSD fluctuated by 1.6 Å for the first 10 ns and then stayed steady up to 100 ns. Similarly, after 20 ns, the complex of 2GFO and the CID:65999 ligand (Figure 14E) remained stable up to 100 ns, with just a slight fluctuation of 0.15 nm (1.5 Å) between 50 and 60 ns. Furthermore, in the 2GFO–CID:71301 complex (Figure 14F), the RMSD showed stability from 45 ns up to 100 ns. During the simulation period, changes in the RMSD values stayed within 1.5 Å, which is acceptable for small, globular proteins like 2GFO.
The CID:110635–6EUB complex (Figure 15A) was stable up to 50 ns, showed 2 Å variations between 50 and 55 ns, became stable again at 60 ns, and stayed constant up to 100 ns. Similarly, the 6EUB complex with ligand CID:135409642 (Figure 15B) showed minor fluctuations of 1.8 Å between 55 and 60 ns while otherwise remaining stable throughout the simulation time. By comparison, the 6EUB complex with the CID:3157 ligand (Figure 15C) stayed stable for 65 ns. RMSD values stayed within 2 Å over the simulation period, which is acceptable for small, globular proteins such as 6EUB.
Figure 15 also shows the 6D5K RMSD plots, which indicate that the complexes stayed intact throughout the simulation. The CID:65999–6D5K complex (Figure 15D) showed modest variance up to 15 ns but was thereafter stable up to 100 ns. Similarly, the 6D5K complex with ligand CID:110635 (Figure 15E) was stable up to 40 ns, with a flip of the ligand between 40 and 50 ns due to variations in protein structure, after which the ligand remained stable. In the 6D5K complex with the CID:3749 ligand, both the ligand and the protein were initially unstable. They reached equilibrium after 20 ns; however, their RMSD is exceedingly high, indicating relative instability (Figure 15F).
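For readers unfamiliar with the quantity, the RMSD of a frame relative to a reference structure is the square root of the mean squared displacement of its atoms. The sketch below computes it for two toy frames. It assumes the frame has already been superimposed on the reference, which simulation packages normally do before reporting RMSD, and the coordinates are hypothetical values in Å.

```python
from math import sqrt

def rmsd(frame, reference):
    """Root mean square deviation between two equally sized sets of 3D coordinates."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return sqrt(squared / len(frame))

# Hypothetical two-atom structure; every atom is shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

value = rmsd(frame, reference)
print(value)
```

Evaluating this quantity for every frame of a trajectory gives the RMSD-versus-time curves discussed above.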
Figure 14
RMSD plots of the 4RWF protein with three ligands, (A) CID:65999, (B) CID:135409642, and (C) CID:3749, and of the 2GFO protein with three ligands, (D) CID:2540, (E) CID:65999, and (F) CID:71301. The x axis depicts the simulation time (in nanoseconds). The protein RMSD variation is shown on the right y axis, while the RMSD variation of the ligand is shown on the left y axis.
Figure 15
RMSD plots of the 6EUB protein with three ligands, (A) CID:110635, (B) CID:135409642, and (C) CID:3157, and of the 6D5K protein with three ligands, (D) CID:65999, (E) CID:110635, and (F) CID:3749. The x axis depicts the simulation time (in nanoseconds). The protein RMSD variation is shown on the right y axis, while the RMSD variation of the ligand is shown on the left y axis.
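As a rough illustration of the quantity plotted in Figures 14 and 15, the sketch below computes the RMSD of every trajectory frame relative to the first frame. It is not the Desmond implementation: it assumes the frames have already been superimposed on the reference, and the toy coordinates and function names are hypothetical.

```python
import math

def rmsd(frame, reference):
    """RMSD between two equally sized lists of (x, y, z) atom coordinates."""
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))

def rmsd_series(trajectory):
    """RMSD of each frame relative to the first frame (assumed pre-aligned)."""
    reference = trajectory[0]
    return [rmsd(frame, reference) for frame in trajectory]

# Hypothetical two-atom trajectory (coordinates in angstroms), for illustration only.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # both atoms shifted by 1 along z
    [(0.0, 2.0, 0.0), (1.0, 0.0, 0.0)],  # one atom moved by 2 along y
]
series = rmsd_series(toy_trajectory)
```

A flat stretch in such a series corresponds to the plateaus described above, while a jump of a few ångströms corresponds to events such as the ligand flip.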
RMSF Analysis
The root mean square fluctuation (RMSF) measures how far each residue deviates from its average position over the simulation, and so characterizes the flexibility and rigidity of protein structures. The residues with higher peaks are found in loop regions or in the N- and C-terminal zones, as the termini fluctuate the most in MD trajectories. Low RMSF values of binding site residues indicate that the ligand binding to the protein is stable. RMSF values of the protein complexes (4RWF bound with CID:65999, CID:135409642, and CID:3749; 2GFO bound with CID:2540, CID:65999, and CID:71301; 6EUB bound with CID:110635, CID:135409642, and CID:3157; and 6D5K bound with CID:65999, CID:110635, and CID:3749) are shown in Figure 16.
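To make the RMSF described above concrete, the following minimal sketch computes a per-atom RMSF as the square root of the time-averaged squared displacement from each atom's mean position. It assumes pre-aligned frames and uses hypothetical toy coordinates; it is an illustration, not the analysis code used in the study.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF; each frame is a list of (x, y, z) tuples."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values

# Hypothetical trajectory: the first atom is rigid, the second oscillates along x.
toy = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
]
flex = rmsf(toy)
```

In this toy case the rigid atom has an RMSF of zero and the oscillating atom has a higher value, mirroring the contrast between binding-site residues and loop or terminal residues.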
Figure 16
RMSF plots of the protein–ligand complexes. (A) 4RWF–CID:65999, (B) 4RWF–CID:135409642, (C) 4RWF–CID:3749, (D) 2GFO–CID:2540, (E) 2GFO–CID:65999, (F) 2GFO–CID:71301, (G) 6EUB–CID:110635, (H) 6EUB–CID:135409642, (I) 6EUB–CID:3157, (J) 6D5K–CID:65999, (K) 6D5K–CID:110635, and (L) 6D5K–CID:3749.
During the simulation procedure,
the RMSF captured the local fluctuations of individual protein residues. In the simulations of 4RWF with its ligands, substantial fluctuations were detected between residues 480–490 with CID:65999, 170–180 with CID:135409642, and 480–500 with CID:3749 (Figure 16A–C). In the simulations of 2GFO with its ligands, substantial fluctuations were detected between residues 130–135 with CID:2540, 130–135 and 240–255 with CID:65999, and 120–130 and 200–220 with CID:71301 (Figure 16D–F). In the simulations of 6EUB with its ligands, fluctuations were detected between residues 150–155 and 175–185 with CID:110635, 150–160 and 175–185 with CID:135409642, and 150–160 and 175–190 with CID:3157 (Figure 16G–I). Moreover, in the simulations of 6D5K with its ligands, fluctuations were detected between residues 20–25 with CID:65999, 22–27 with CID:110635, and 3–7 and 20–25 with CID:3749 (Figure 16J–L).
Protein–Ligand Interactions
Throughout the simulation, protein interactions with the ligand were monitored. Hydrogen bonds, hydrophobic contacts, ionic interactions, and water bridges are the four types of protein–ligand interactions. The “Simulation Interactions Diagram” tab reveals more specific subtypes for each interaction type. In ligand binding, hydrogen bonding (H-bonds) and hydrophobic interactions are important. Because of their considerable influence on drug selectivity, metabolism, and adsorption, hydrogen-bonding properties should be considered in drug design. Water bridges and ionic interactions are also vital in the formation of the structure of the protein complexes.
LYS-17, ASP-43, GLU-113, and TRP-232 were the most active residues in the 4RWF–CID:65999 complex, forming tight hydrogen bonds, while ASP-16, LYS-44, GLU-155, and ASN-229 formed water bridges. Figure 17A shows the minimal contribution of TRP-64, ALA-65, TYR-212, and TYR-157 to hydrophobic interactions with ligand atoms. In the 4RWF–CID:135409642 complex, strong hydrogen bonds were established by the most active residues, such as ASN-14, LYS-17, GLU-113, and TRP-232, whereas TRP-64, ALA-65, TYR-157, PHE-158, and TYR-212 contributed to strong hydrophobic interactions (Figure 17B). In the 4RWF–CID:3749 complex, ASN-229, GLY-230, and TRP-232 formed strong hydrogen bonds, whereas LEU-115, GLU-115, and TYR-157 formed hydrophobic interactions with the relevant ligand atoms (Figure 17C).
In the protein–ligand complex 2GFO–CID:2540, the most active residues, such as SER-953 and ASN-981, established strong hydrogen bonds, while LYS-996, ILE-998, PHE-971, and PHE-930 contributed to strong hydrophobic contacts (Figure 17D). In the 2GFO–CID:65999 complex, LYS-996 and LYS-1012 displayed hydrogen bonding, PHE-930, VAL-934, PHE-971, and TYR-1016 were crucial in the ligand’s hydrophobic interaction, and water bridges involved SER-953, LYS-976, THR-978, and ASP-979 (Figure 17E). In the 2GFO–CID:71301 complex, ASP-878, ASN-918, and PHE-946 formed strong hydrogen bonds, while LEU-874, ILE-922, VAL-923, and PHE-930 established hydrophobic contacts with the relevant ligand atoms (Figure 17F).
In the 6EUB–CID:110635 complex, the most active residues, LEU-213 and LYS-261, established strong hydrogen bonds, and LEU-201, PHE-212, PRO-251, HIS-252, PHE-255, and LEU-257 contributed hydrophobic interactions with ligand atoms (Figure 18A). In the 6EUB–CID:135409642 complex, strong hydrogen bonds were formed by the most active residues, such as THR-315, THR-316, SER-323, and HIS-333, and LEU-322, PRO-325, LEU-335, and PHE-351 made significant hydrophobic interactions (Figure 18B). In the 6EUB–CID:3157 complex, SER-323, GLN-331, ASP-332, ASP-334, and PHE-351 established strong hydrogen bonds, and ALA-299, PRO-325, LEU-335, and ARG-336,337 established hydrophobic interactions with the relevant ligand atoms (Figure 18C).
In the 6DK5–CID:65999 complex, hydrogen bonds were formed by the most active residues, such as SER-2 and LYS-9, while TRP-21, LEU-6, VAL-12, and HIS-16 contributed hydrophobic interactions with ligand atoms (Figure 18D). In the 6DK5–CID:110635 protein–ligand complex, the most active residues, such as TRP-13 and ILE-19, formed tight hydrogen bonds, while TRP-21, PHE-14, HIS-16, and LEU-17 formed hydrophobic contacts (Figure 18E). In 6DK5–CID:3749, SER-5, ILE-19, and 20 played a significant role in hydrogen bonding, whereas MET-7, PHE-14, ILE-19, TYR-13, and 21 showed hydrophobic interactions (Figure 18F).
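One common way to summarize contacts like those listed above is the fraction of simulation frames in which each residue–ligand contact is present. The sketch below shows that bookkeeping under simplifying assumptions: the per-frame contact lists are hypothetical (only the residue labels are borrowed from the text), the interaction-type labels are made up for illustration, and each contact is counted at most once per frame.

```python
from collections import Counter

def interaction_fractions(frames):
    """Fraction of frames in which each (residue, interaction type) contact occurs."""
    counts = Counter()
    for contacts in frames:
        counts.update(set(contacts))  # count each contact at most once per frame
    return {contact: n / len(frames) for contact, n in counts.items()}

# Hypothetical per-frame contact lists, for illustration only.
frames = [
    [("LYS-17", "hbond"), ("TRP-64", "hydrophobic")],
    [("LYS-17", "hbond")],
    [("LYS-17", "hbond"), ("ASP-16", "water_bridge")],
    [("ASP-16", "water_bridge")],
]
fractions = interaction_fractions(frames)
```

A residue described as "most active" would be one whose fraction is close to 1, while a "minimal contribution" corresponds to a small fraction.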
Figure 17
Protein interaction analysis (PIA). (A) PIA plot of 4RWF–CID:65999, (B) PIA plot of 4RWF–CID:135409642, (C) PIA plot of 4RWF–CID:3749, (D) PIA plot of 2GFO–CID:2540, (E) PIA plot of 2GFO–CID:65999, and (F) PIA plot of 2GFO–CID:71301.
Figure 18
Protein interaction analysis (PIA). (A) PIA plot of 6EUB–CID:110635, (B) PIA plot of 6EUB–CID:135409642, (C) PIA plot of 6EUB–CID:3157, (D) PIA plot of 6DK5–CID:65999, (E) PIA plot of 6DK5–CID:110635, and (F) PIA plot of 6DK5–CID:3749.
Discussion
In recent years, technological developments have brought remarkable changes to the biological sciences. This study shows the association of genetic variation with hypertension. We identified a list of hypertension-related genes based on differential expression and systems biology analysis. The system-level framework provides a consistent approach to the meta-analysis of cDNA microarray datasets for identifying DEGs.[37] From the list of 18 DEGs, we shortlisted seven common genes as potential drug targets, namely ADM, EDN1, ANGPTL4, NFIL3, MSR1, CEBPD, and USP8 (p < 0.05), based on physicochemical and functional analysis. The profiling showed variations and comparative expression of these genes in hypertension. The pathway enrichment study showed a significant association between hypertension and the DEGs. The dysregulation of HIF-1 and calcitonin has been studied in hypertension.[38,39] PI3K and ARF6 signaling is considered a novel target in hypertension.[40,41] Some studies indicate that abnormal EGF signaling is linked to the development of cardiovascular pathology.[42] This study showed significant enrichment of four upregulated (ADM, ANGPTL4, USP8, and EDN1) and three downregulated genes. We found that the regulatory motifs of these genes play a role in heart pathogenesis, control of cell cycle progression, regulation of β-cell development, myocardial fibrosis, hypertensive nephrosclerosis, and blood pressure modulation. MicroRNAs are involved in the regulation of human physiological processes by transcription events, and dysregulation of their expression results in many diseases.[43] miR-365b-3p
appears to play a role in coronary artery smooth muscle cell proliferation
and migration, through its direct target gene ADAMTS1.[44] Altered responses to hsa-miR-3065-3p, hsa-miR-603,
and hsa-miR-3923 may result in cardiovascular pathophysiology.[45,46] hsa-miR-524-5p is a circulating miRNA that has been found to be
considerably downregulated in patients with heart disease as compared
to controls.[47] Inherited mutations encode genetic variations that relate genotype to phenotype. Thousands of SNVs are reported to cause disease. Amino acid substitutions arising from missense mutations may affect post-translational modifications (PTMs) and thereby play a significant role in cell differentiation and growth.[10] Mutations can interfere at different levels, leading to the emergence of toxic conformations, and have a major role in protein function in both normal and pathological states. A change in PTM reveals disease progression[48] and is responsible for functional diversity.[49]
In the PPI analysis, we found that these potential drug targets are functionally connected with other associated protein targets, including EGFR, AGT, AVP, APOE, RHOA, STAT3, SRC, APOB, UBC, LPL, APOA1, and AKT1. Of these proteins, EGFR is involved in the molecular mechanism of Ang II-mediated cerebrovascular remodeling,[50] and AGT polymorphism and APOB are associated with hypertension.[51,52] Vasopressin acts through three distinct receptors, V1a, V1b, and V2. AVP is a potent vasoconstrictor that regulates vascular tone and fluid balance through V1aR and/or V2R. The RhoA/Rho-kinase signaling pathway increases the vasoconstriction characteristic of hypertension.
The increase in vascular peripheral resistance is partly due to vascular
constriction mediated by the calcium-independent activation of the
small G-protein RhoA and a downstream target, Rho-kinase.[53] Src/STAT3, AKT1, and epidermal growth factor (EGF) activation plays important roles in cell proliferation and contributes to the pathobiology of pulmonary arterial hypertension, which is characterized by enhanced proliferation of pulmonary artery smooth muscle cells.[60,61] ApoB and LPL alleles show a strong genetic association with hypertension.[54]
The synthesis of these biomolecules depends on the regular expression patterns of the respective DEGs, while dysregulation of these pathways contributes to hypertension. Melatonin has antioxidant activity and a potential role in mitochondrial physiology in the cardiovascular system. Melatonin regulates blood pressure by central and peripheral mechanisms, in addition to its action on the renin–angiotensin system.[55] The prostacyclin pathway was studied for the treatment
of pulmonary arterial hypertension (PAH), which is a chronic and progressive
disease resulting in right ventricular failure and death.[56] Elevated blood pressure activates mitogen-activated
protein (MAP) kinases, resulting in a rapid and transient induction
of MKP-1 mRNA followed by elevated MKP-1 protein expression in the
aorta.[57] Peroxisome proliferator-activated receptors (PPARs) have reno-protective effects, including ligand-induced blood-pressure-lowering effects, protective effects on endothelial function, and vasodilating effects on glomerular efferent arterioles.[58] Similarly, relaxin contributes to sympathetic overdrive and hypertension via the PI3K-Akt pathway.[59] We find that these genes are linked to blood pressure regulation both directly, by binding with G-protein coupled receptors and exerting a sympathomimetic-like action, and indirectly, through communication with other genes. The toxicogenomic analysis helps to understand gene interactions with chemicals that may drive disease progression. This analysis suggests the mechanism of action of chemicals and their effect on the disease state as influenced by environmental exposure.[60]
We docked FDA-approved antihypertensive medications, at a specific binding site, against four target proteins that have previously been identified as overexpressed DEGs in hypertensive people.[2] These targets were chosen based on several variables, including their relative importance and the availability of the relevant literature. The binding energy was used to estimate the inhibitory potential of the docked ligands; a ligand with low binding energy is favored, as a low binding score corresponds to greater binding affinity.[61] We selected the three best ligands for each target for additional molecular dynamics simulations to determine the stability of the complexes over 100 ns. The stability of all 12 complexes was maintained. Because of loop development, the 6EUB–CID:3157 complex showed higher deviation, but changes in the RMSD value remained within limits. The complex 6D5K–CID:3749 had a high RMSD, indicating relative instability. Furthermore, all protein complexes with their respective ligands had RMSF values of <3. The residues with higher peaks lie in loop regions or in the N- and C-terminal regions. The reduced RMSF values of binding site residues and ligand atoms demonstrate the stability of ligands bound to these proteins. Such analysis not only improves our comprehension of disease pathophysiology but is also useful in drug discovery.[62]
Methods
Accession to cDNA Datasets
The goal of the study was to find differentially expressed genes (DEGs) in hypertension to investigate its cause. The accession numbers, sample numbers, and other characteristics of human expression datasets of hypertension were retrieved from the Gene Expression Omnibus (GEO) database. The DEGs were identified using the Affymetrix U133Plus2.0 array platform and the annotation probe HGU133plus2. Various Bioconductor packages on the R platform were used for the analysis (Affy, AffyQCReport, AffyRNADegradation, AnnotationDbi, Annotate, Biobase, Limma, and HGU133a2cdf).[63]
Normalization and Differential Investigation
For normalization, the pheno-data files were organized in an acceptable format.[64] The Array Quality Metric was used to standardize the dataset to the median level of expression for each gene set. Robust Multi-Array Analysis (RMA) was employed for background correction to quantify perfect matches (PM) and mismatches (MM), using the following equation:[65−67]
$$PM_{ijk} = BG_{ijk} + S_{ijk}$$
where PM is the perfect match, BG is the background caused by optical noise, and S is the nonspecific binding; the subscript ijk denotes the signal for probe j of probe set k on array i.
The PM data show a mix of both “BG” and the expression signal “E”. The dataset was analyzed using the array quality measure, which was normalized to the median level of expression with a cutoff value of p value less than 0.15.[65,68] AffyRNAdeg, summaryAffyRNAdeg, and plotAffyRNAdeg were used to verify the quality of RNA in the samples for RNA degradation analysis.[69,70] Genes were selected based on their p values and scores. The cutoff values were p value = 0.05, absolute log-fold change (log FC) > 1, false discovery rate (FDR) = 0.05, and average expression level (AEL) = 40%.
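The cutoff-based selection described above can be sketched as a simple filter. The example below is an illustration only: it assumes the FDR is computed with the Benjamini-Hochberg procedure (the text does not name the method), it omits the AEL criterion, and the gene names, statistics, and function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, p_cut=0.05, fdr_cut=0.05, lfc_cut=1.0):
    """genes: list of (name, log fold change, raw p value) tuples."""
    fdr = benjamini_hochberg([p for _, _, p in genes])
    return [
        name
        for (name, lfc, p), q in zip(genes, fdr)
        if p <= p_cut and q <= fdr_cut and abs(lfc) > lfc_cut
    ]

# Hypothetical statistics, for illustration only.
toy = [
    ("GENE_A", 2.1, 0.001),
    ("GENE_B", -1.6, 0.010),
    ("GENE_C", 0.4, 0.020),
    ("GENE_D", 1.8, 0.600),
]
adjusted = benjamini_hochberg([p for _, _, p in toy])
selected = select_degs(toy)
```

Here GENE_C passes the significance cutoffs but fails the fold-change cutoff, and GENE_D shows a large fold change without statistical support, so only the first two genes are retained.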
Subcellular Localization Prediction
The top-ranked DEGs of each dataset were compared using the Compare-Two list tool to identify the common genes.[71] Correct subcellular localization of a protein reflects normal cellular function, whereas aberrant localization is associated with the pathogenesis of different human diseases.[72] We predicted the subcellular location of DEGs using CELLO version 2.5.[73]
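Identifying genes common to several top-ranked lists amounts to a set intersection. The sketch below is a generic stand-in for that comparison step, not the Compare-Two list tool itself; the list contents are hypothetical, and only the gene symbols are borrowed from the article.

```python
def common_genes(*gene_lists):
    """Gene symbols present in every list (case-insensitive), sorted alphabetically."""
    sets = [set(g.upper() for g in genes) for genes in gene_lists]
    return sorted(set.intersection(*sets)) if sets else []

# Hypothetical top-ranked lists from two datasets.
dataset_1 = ["ADM", "EDN1", "MSR1", "CEBPD"]
dataset_2 = ["EDN1", "adm", "USP8"]
shared = common_genes(dataset_1, dataset_2)
```

Normalizing the case before intersecting avoids missing a gene only because two datasets spell its symbol differently.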
Data Mining and Disease–Gene Curation
In biomedical research, literature-based text mining is a significant step for extracting information (DEG–disease interactions) about the entities under study.[74] To filter specific genes, we investigated disease interactions of the shortlisted DEGs curated from online data sources including the Comparative Toxicogenomics Database (CTD), Online Mendelian Inheritance in Man (OMIM), PubMed, and MeSH.[75]
Enrichment and Cluster Analysis
Gene function provides the information needed to understand signaling pathways at the cellular level. We performed expression profiling of each dataset to understand gene expression variations across datasets using the online web-based tools DAVID, FunRich, and the Enrich Annotation tool.[76,77] We performed cluster analysis of the hypertension-associated DEGs based on their expression values in each sample using the online one-matrix CIMminer tool.[78]
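Enrichment tools of this kind typically ask whether a gene list overlaps an annotation term more than expected by chance. The sketch below shows a generic hypergeometric over-representation test as an assumption about the underlying idea; it does not reproduce the exact statistics of DAVID, FunRich, or any other tool named above, and the numbers are hypothetical.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes, K of which carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical example: 2 of 2 selected genes carry a term held by 3 of 10 background genes.
p = enrichment_p_value(k=2, n=2, K=3, N=10)
```

A small p value suggests the term is over-represented in the gene list; in practice such values are corrected for the number of terms tested.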
Prediction of Regulatory Motifs and Hypertension-Associated MiRNA Targets
Hypertension mechanisms can be revealed by understanding regulatory functions at the transcription level. Various genes are influenced by miRNAs in the signaling cascade. MiRNAs play an important regulatory role in gene expression and disease etiology. The oPOSSUM tool version 3.0 was used to find the regulatory motifs and transcription factors of target genes with default parameters (matrix threshold of 85% with a cutoff value of 0.4). Hypertension-related miRNA targets were predicted using the MiR database.[79]
Single-nucleotide variations (SNVs) in the human genome are associated with many diseases. We performed mutation analysis of the DEGs to find specific variants using the ActiveDriverDB database, which is used to identify protein post-translational modification (PTM) sites.[10]
Protein–Protein Interaction Analysis
Protein–protein interactions (PPIs) play a key role in cellular functions. Dysregulation of normal protein networks may cause disease.[80] PPIs are functional interactions between proteins and are used to explore variations in biological function.[81,82] Such a biological network shows the difference in activity between physiological and pathological conditions. The functional interactions of the source genes were retrieved from STRING v10.[83] Furthermore, the roles of the target genes in hypertension were curated from different databases including PubMed, CTD, and OMIM. Cytoscape software version 3.6 was used to visualize the network to explore the role of both source (DEGs) and target proteins in hypertension.[84] The entire network is important for finding potential hypertension-linked gene signatures, as their abnormality is directly related to the disease phenotype.
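One simple way to look for prominent nodes in such a network is to rank proteins by their number of interaction partners (degree). The sketch below is an assumption about how an exported edge list might be explored, not a description of the STRING or Cytoscape workflow used here; the node names are placeholders and the edges are hypothetical.

```python
from collections import defaultdict

def degree_ranking(edges):
    """Rank nodes of an undirected network by number of distinct neighbours."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    ranked = [(len(partners), node) for node, partners in neighbours.items()]
    # Highest degree first; ties broken alphabetically by node name.
    return sorted(ranked, key=lambda item: (-item[0], item[1]))

# Hypothetical edge list with placeholder protein names; the duplicate edge is counted once.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P4", "P2"), ("P1", "P2")]
ranking = degree_ranking(toy_edges)
```

Storing neighbours in sets means that repeated edges between the same pair of proteins do not inflate the degree.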
Pathway Analysis
Pathway analysis is important for analyzing metabolic networks and understanding the underlying functional mechanisms. The KEGG and WIKI databases were used to analyze the available pathways of target genes.[85−87] We constructed an integrated network of pathway models using Cytoscape software. This interactive model reveals the function of each gene in the pathways.
Toxicogenomics
Chemicals in the environment are a possible cause of human diseases. We retrieved the available gene–disease information to analyze the chemical–gene interactions using the Comparative Toxicogenomics Database.[88]
Regulation of Gene Expression
The ExAtlas database (https://www.ebi.ac.uk/gxa/home) and hypertensive patient blood samples[2] were used to determine whether each gene was upregulated or downregulated.
FDA-Approved Antihypertensive Drug Interaction with Upregulated Genes
Protein and Ligand Preparation for Docking
The crystal structures of the ADM (PDB: 4RWF), USP8 (PDB: 2GFO), EDN1 (PDB: 6DK5), and ANGPTL4 (PDB: 6EUB) proteins were obtained from the Protein Data Bank (www.rcsb.com).[89] The sequences were obtained from the UniProt database.[90] Using Chimera software,[91] all heteroatoms and water molecules were removed from the PDB data, and the structures were stored as PDB files for docking. DrugBank[92] and the PubChem database[5] were used to obtain a library of 99 FDA-approved antihypertensive medications. The Molecular Operating Environment (MOE) software was used to create the protein and drug library. Proteins and ligands were prepared for docking by adding hydrogens using the Protonate 3D approach in MOE, followed by energy minimization. The AMBER99 force field was used to remove further non-bounded structures after this energy minimization stage.
Molecular Docking
Using MOE, the library of antihypertensive medicines was docked against the four upregulated proteins in hypertension at a specific binding site. The Triangle Matcher algorithm (TMA), the default ligand-placement approach, was used with the London dG scoring function to produce 1000 poses for each docked molecule.[93] Using the force field refinement approach, which determines binding affinity with the generalized Born solvation (GBS) model,[9,10] the top 10 poses were chosen based on the energy function score and root mean square deviation (RMSD).[94,95]
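Selecting the best candidates per target from docking output is essentially a sort on binding energy, with lower (more negative) values ranked first. The sketch below illustrates choosing the three best ligands for each target; the target names, ligand names, and scores are hypothetical placeholders, and the function is not part of MOE.

```python
from collections import defaultdict

def top_ligands(scores, n=3):
    """scores: iterable of (target, ligand, binding energy); lowest energy ranks first."""
    by_target = defaultdict(list)
    for target, ligand, energy in scores:
        by_target[target].append((energy, ligand))
    return {
        target: [ligand for _, ligand in sorted(pairs)[:n]]
        for target, pairs in by_target.items()
    }

# Hypothetical docking scores, for illustration only.
toy_scores = [
    ("TARGET_1", "LIG_A", -7.2),
    ("TARGET_1", "LIG_B", -9.1),
    ("TARGET_1", "LIG_C", -6.0),
    ("TARGET_1", "LIG_D", -8.4),
    ("TARGET_2", "LIG_A", -5.5),
]
best = top_ligands(toy_scores)
```

With four targets and three ligands each, a selection of this kind yields the twelve complexes carried forward to molecular dynamics.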
Molecular Dynamics
Molecular dynamics (MD) simulations provide information about the dynamic behavior of protein–ligand complexes in a virtual environment, sampling a free energy landscape that approximates the native state of the protein in the body. As a result, MD simulation is better suited for checking the precise ligand–protein interaction profile. The top-ranked medicines were tested against the four proteins using MD simulations. The MD simulations for the twelve complexes were run using the Desmond suite, each for 100 ns, with the OPLS-2005 force field on an NVIDIA RTX IO: GPU – Dell Xeon series 6th generation 4 core system. First, protein–ligand complexes were prepared using the Protein Preparation Wizard, which included an optimization and minimization step. Second, each complex was assigned to a grid box, and Na+/Cl− ions were introduced to neutralize the system. Finally, simulations were run at 300 K with 1000 frames recorded, with all other parameters set to default.[96]
Conclusions
This integrative gene expression analysis is significant for understanding the genetic variations underlying hypertension. From Affymetrix cDNA datasets, we found seven DEGs as potential drug targets for hypertension. Functional analysis revealed the significant role of these DEGs in the pathological mechanisms of hypertension. Mutation analysis showed significant disordered regions in these molecules. These genes have functional interactions with target and other gene signatures linked to hypertension, including EGFR, AGT, AVP, APOE, RHOA, SRC, APOB, STAT3, UBC, LPL, and AKT1. The associated pathways involving melatonin, MAPK, PPARs, and relaxin have been implicated in disease etiology. Among the seven DEGs we found, four genes were upregulated. We screened the 99 FDA-approved antihypertensive drugs for repurposing against the upregulated hypertension genes. This system-level genomic analysis helps us to find drug targets and improve the understanding of the treatment of hypertension.