SVMRFE based approach for prediction of most discriminatory gene target for type II diabetes.

Atul Kumar1, D Jeya Sundara Sharmila2, Sachidanand Singh1.   

Abstract

Type II diabetes is a chronic condition that affects the way the body metabolizes sugar, its main source of fuel, and it is becoming increasingly prevalent worldwide. It is therefore necessary to identify new potential drug targets that not only control the disease but can also treat it. Support vector machines (SVMs) are classifiers with the potential to separate discriminatory genes from non-discriminatory genes. SVMRFE, a modification of SVM, ranks genes by their discriminatory power and eliminates the genes that are not involved in causing the disease. A gene regulatory network was constructed from the top-ranked coding genes to identify their role in causing diabetes. To further validate the results, a pathway study was performed to identify the involvement of the coding genes in type II diabetes. The genes obtained from this study showed a significant involvement in causing the disease and may be used as potential drug targets.

Keywords:  Microarray; Protein-protein interaction; SVMRFE; Type II diabetes; t-test

Year:  2017        PMID: 28275550      PMCID: PMC5331150          DOI: 10.1016/j.gdata.2017.02.008

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Introduction

Support Vector Machine (SVM), a machine learning technique applied in the area of time series prediction and classification [31], [36], has been widely used in the life sciences, especially in bioinformatics. It can handle nonlinear classification tasks efficiently by mapping the samples into a higher-dimensional feature space using a nonlinear kernel function. Since the SVM approach is data-driven and model-free, it has considerable discriminating power for classification. This characteristic is most evident when sample sizes are small and numerous variables are involved (a high-dimensional space). Expression profiles, which contain a large number of attributes (genes), fall into this category. This type of expression data is used to predict the type and occurrence of disease in a patient [39]. An important aspect of analyzing such expression data is feature selection, or dimensionality reduction. Most algorithms lose their potency when the number of genes is large, with different time series data or dimensionality [7]. To accomplish dimensionality reduction, a modified version of SVM known as SVMRFE (Support Vector Machine Recursive Feature Elimination) was used in this work. SVMRFE was used to identify the most discriminatory target genes in four different microarray datasets of type II diabetes. These datasets were taken from the Gene Expression Omnibus database (GEO) [13] and the Diabetes Genome Anatomy Project (DGAP) (http://www.diabetesgenome.org/). The idea was to build a model in which the least important features (genes) are eliminated at each iterative step based on the weight assigned to each gene by the SVM. The genes identified through this approach were then classified as essential or non-essential. The protein-protein interactions of the non-essential genes revealed vital information regarding interacting proteins. Functional enrichment of these proteins shed light on their regulatory pathways associated with type II diabetes, which can be further explored and confirmed experimentally.

Materials and methods

Collection of data sample

A total of 71 samples from pancreatic islets and skeletal muscle of Homo sapiens were collected from GEO and DGAP. Of these, 37 samples are from normal subjects and 34 are from diabetic subjects. Table 1 gives a detailed description of each dataset used in the study.
Table 1

Microarray datasets used in the study.

| Source | Data | No. of samples (normal) | No. of samples (diabetic) | No. of genes | Country |
|---|---|---|---|---|---|
| GEO | Effect of insulin infusion on human skeletal muscle [33] | 6 | 6 | 22,215 | Sweden |
| DGAP | Human pancreatic islets from normal and Type 2 diabetic subjects (A) [18] | 7 | 5 | 22,191 | Caucasian and Asian |
| DGAP | Human pancreatic islets from normal and Type 2 diabetic subjects (B) [18] | 7 | 5 | 22,550 | |
| DGAP | Human skeletal muscle - type 2 diabetes [29] | 17 | 18 | 22,177 | Sweden |

The Fisher linear discriminant was applied to all of the above-mentioned datasets to rank the genes by Fisher score [21], followed by a redundancy reduction step to reduce redundant data in the microarray datasets [22]. The number of genes in each dataset was still high. A t-test [3] with a significance level of 0.05 was therefore applied to the datasets to filter out the genes that are not involved in causing type II diabetes. After this reduction step, the SVMRFE approach (with a linear kernel function and 6 subsets of the training data) [24] was applied to train on the data samples for 5 iterations. As a result, discriminatory genes were obtained based on the weighted ranking. The identified genes were classified as essential or non-essential using the Database of Essential Genes. A gene interaction and pathway analysis of the potential non-essential genes was performed to identify novel targets for type II diabetes (Fig. 1).
Fig. 1

Flow chart of the analysis.
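
The Fisher-score ranking step in the methods can be illustrated with a minimal sketch. The paper does not state the exact formula it used, so the two-class Fisher criterion below, (difference of class means)^2 divided by the sum of class variances, is an assumption, and the gene names and expression values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(normal, diabetic):
    """Two-class Fisher criterion for one gene (assumed form):
    (mean_normal - mean_diabetic)^2 / (var_normal + var_diabetic)."""
    num = (mean(normal) - mean(diabetic)) ** 2
    den = pvariance(normal) + pvariance(diabetic)
    if den == 0:
        # Constant within both classes: perfectly separated, or uninformative.
        return float("inf") if num > 0 else 0.0
    return num / den


# Hypothetical expression values: gene -> (normal samples, diabetic samples).
expression = {
    "gene_a": ([1.0, 1.2, 0.8], [3.0, 3.2, 2.8]),
    "gene_b": ([2.0, 2.5, 1.5], [2.1, 2.4, 1.6]),
}

# Rank genes from most to least discriminatory.
ranked = sorted(expression, key=lambda g: fisher_score(*expression[g]), reverse=True)
print(ranked)
```

Genes whose class means differ by much more than their within-class spread, such as the hypothetical gene_a, move to the top of the ranking.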


Result and discussion

t-test analysis

For each of the T2D datasets, a t-test analysis was performed with a significance level of 0.05. This produced a large reduction in dimensionality for each dataset (Table 2). The genes rejecting the null hypothesis were obtained for each dataset. Table 3, Table 4, Table 5 and Table 6 show the p-values of the genes that rejected the null hypothesis at the 0.05 significance level. Fig. 2, Fig. 3, Fig. 4 and Fig. 5 plot the p-values of all the genes in the four datasets under consideration. The p-value for most of the genes was above the significance level of 0.05. This indicates that these genes have almost the same expression value in normal and diseased samples and may not be involved in causing the disease.
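
A minimal sketch of such a per-gene t-test filter is shown here. The paper does not say which variant of the t-test was used; the pooled-variance (Student) two-sample test is assumed, the two-sided p-value is obtained by Simpson integration of the t density so that only the standard library is needed, and the expression values are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|] (Simpson's rule)."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def student_t_test(normal, diabetic):
    """Pooled-variance two-sample t-test; returns (t statistic, two-sided p-value)."""
    n1, n2 = len(normal), len(diabetic)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(normal) + (n2 - 1) * variance(diabetic)) / df
    if pooled == 0:
        raise ValueError("zero variance in both groups; t statistic is undefined")
    t = (mean(normal) - mean(diabetic)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, two_sided_p(t, df)


ALPHA = 0.05  # significance level stated in the text

# Hypothetical expression values: gene -> (normal samples, diabetic samples).
expression = {
    "gene_a": ([1.0, 1.2, 0.8], [3.0, 3.2, 2.8]),
    "gene_b": ([2.0, 2.5, 1.5], [2.1, 2.4, 1.6]),
}

# Keep only genes that reject the null hypothesis of equal means.
kept = [g for g, (n, d) in expression.items() if student_t_test(n, d)[1] < ALPHA]
print(kept)
```

Only the genes in `kept` would be passed on to the SVMRFE step; the rest are treated as having similar expression in normal and diseased samples.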
Table 2

Number of input and output genes from each dataset for t-test analysis.

| Name of dataset | No. of input genes | No. of genes rejecting the null hypothesis |
|---|---|---|
| Effect of insulin infusion on human skeletal muscle | 1223 | 24 |
| Human pancreatic islets from normal and type II diabetic subjects (A) | 1210 | 17 |
| Human pancreatic islets from normal and type II diabetic subjects (B) | 803 | 21 |
| Human skeletal muscle-type II diabetes | 1238 | 28 |

Table 3

p-Value of genes following the alternative hypothesis for the dataset “GSE7146”.

| Probe id | Gene | p-Value |
|---|---|---|
| 213524_s_at | G0/G1switch 2 | 0.00001 |
| 216599_x_at | Solute carrier family 22 (organic anion transporter), member 6 | 0.00005 |
| 207295_at | Sodium channel, non-voltage-gated 1, gamma | 0.0001 |
| 218409_s_at | DnaJ (Hsp40) homolog, subfamily C, member 1 | 0.0003 |
| 203221_at | Transducin-like enhancer of split 1 (E (sp1) homolog, Drosophila) | 0.0004 |
| 210452_x_at | Cytochrome P450, family 4, subfamily F, polypeptide 2 | 0.001 |
| 201630_s_at | Acid phosphatase 1, soluble | 0.001 |
| 207955_at | Chemokine (C-C motif) ligand 27 | 0.002 |
| 208507_at | Olfactory receptor, family 7, subfamily C, member 2 | 0.002 |
| 210889_s_at | Fc fragment of IgG, low affinity IIb, receptor (CD32) | 0.002 |
| 207732_s_at | Discs, large homolog 3 (neuroendocrine-dlg, Drosophila) | 0.002 |
| 220636_at | Dynein, axonemal, intermediate polypeptide 2 | 0.002 |
| 205863_at | S100 calcium binding protein A12 | 0.002 |
| 205603_s_at | Diaphanous homolog 2 (Drosophila) | 0.003 |
| 220979_s_at | ST6 (alpha-N-acetyl-neuraminyl-2, 3-beta-galactosyl-1, 3) -N-acetylgalactosaminide alpha-2, 6-sialyltransferase 5 | 0.003 |
| 206310_at | Serine peptidase inhibitor, Kazal Type II (acrosin-trypsin inhibitor) | 0.004 |
| 210442_at | Interleukin 1 receptor-like 1 | 0.004 |
| 201214_s_at | Protein phosphatase 1, regulatory subunit 7 | 0.004 |
| 220385_at | Junctophilin 2 | 0.004 |
| 205490_x_at | Gap junction protein, beta 3, 31 kDa (connexin 31) | 0.004 |
| 213772_s_at | Golgi-associated, gamma adaptin ear containing, ARF binding protein 2 | 0.004 |
| 213950_s_at | Protein phosphatase 3 (formerly 2B), catalytic subunit, gamma isoform (calcineurin A gamma) | 0.004 |
| 201681_s_at | Discs, large homolog 5 (Drosophila) | 0.004 |
| 220782_x_at | Kallikrein-related peptidase 12 | 0.004 |

Table 4

p-Value of genes following the alternative hypothesis for the dataset “human pancreatic islets from normal and type II diabetic subjects (A)”.

| Probe id | Gene | p-Value |
|---|---|---|
| 207406_at | Cytochrome P450, family 7, subfamily A, polypeptide 1 | 0.0003 |
| 214046_at | Fucosyltransferase 9 (alpha (1,3) fucosyltransferase) | 0.0004 |
| 213980_s_at | C-terminal binding protein 1 | 0.0005 |
| 202854_at | Hypoxanthine phosphoribosyltransferase 1 | 0.0005 |
| 215300_s_at | Flavin containing monooxygenase 5 | 0.0007 |
| 212894_at | Suppressor of var1, 3-like 1 (S. cerevisiae) | 0.0012 |
| 202605_at | Glucuronidase, beta | 0.0017 |
| 203196_at | ATP-binding cassette, sub-family C (CFTR/MRP), member 4 | 0.0021 |
| 205633_s_at | Aminolevulinate, delta-, synthase 1 | 0.0022 |
| 207673_at | Nephrosis 1, congenital, Finnish type (nephrin) | 0.0027 |
| 209759_s_at | Enoyl-CoA delta isomerase 1 | 0.003 |
| 208926_at | Sialidase 1 (lysosomal sialidase) | 0.003 |
| 205627_at | Cytidine deaminase | 0.004 |
| 210284_s_at | TGF-beta activated kinase 1/MAP3K7 binding protein 2 | 0.004 |
| 213931_at | Inhibitor of DNA binding 2, dominant negative helix-loop-helix protein | 0.0043 |
| 213426_s_at | Caveolin 2 | 0.0047 |
| 221572_s_at | Solute carrier family 26, member 6 | 0.0049 |

Table 5

p-Value of genes following the alternative hypothesis for the dataset “human pancreatic islets from normal and type II diabetic subjects (B)”.

| Probe id | Gene | p-Value |
|---|---|---|
| 227787_s_at | Thyroid hormone receptor-associated protein 6 | 0.0001 |
| 222478_at | Vacuolar protein sorting 36 (yeast) | 0.0002 |
| 230329_s_at | Nudix (nucleoside diphosphate linked moiety X)-type motif 6 | 0.0003 |
| 226424_at | Calcyphosine | 0.0003 |
| 225491_at | Solute carrier family 1 (glial high affinity glutamate transporter), member 2 | 0.0004 |
| 225016_at | Adenomatosis polyposis coli down-regulated 1 | 0.0005 |
| 243043_at | RAD50 interactor 1 | 0.0008 |
| 224573_at | Ribonuclease, RNase K | 0.0012 |
| 228133_s_at | Myosin, heavy polypeptide 11, smooth muscle | 0.0013 |
| 225108_at | Alkylglycerone phosphate synthase | 0.0013 |
| 224865_at | Male sterility domain containing 2 | 0.0024 |
| 231880_at | Family with sequence similarity 40, member B | 0.0026 |
| 241739_at | 2-oxoglutarate and iron-dependent oxygenase domain containing 1 | 0.003 |
| 228036_s_at | F-box protein 2 | 0.0031 |
| 223978_s_at | Cardiolipin synthase 1 | 0.0032 |
| 244706_at | Protein-L-isoaspartate (d-aspartate) O-methyltransferase domain containing 1 | 0.0033 |
| 237718_at | Eukaryotic translation initiation factor 4E | 0.0033 |
| 222999_s_at | Cyclin L2 | 0.0038 |
| 230318_at | Serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 | 0.0039 |
| 222408_s_at | Yippee-like 5 (Drosophila) | 0.004 |
| 224954_at | Serine hydroxymethyltransferase 1 (soluble) | 0.0046 |

Table 6

p-Value of genes following the alternative hypothesis for the dataset “human skeletal muscle-type II diabetes”.

| Probe id | Gene | p-Value |
|---|---|---|
| 219572_at | Ca++-dependent secretion activator 2 | 0.0002 |
| 204447_at | Leucine zipper, putative tumor suppressor family member 3 | 0.0002 |
| 221410_x_at | Protocadherin beta 3 | 0.0003 |
| 201764_at | Transmembrane protein 106C | 0.0005 |
| 201429_s_at | Ribosomal protein L37a | 0.0008 |
| 204761_at | USP6 N-terminal like | 0.001 |
| 219642_s_at | Peroxisomal biogenesis factor 5-like | 0.001 |
| 218592_s_at | Cat eye syndrome chromosome region, candidate 5 | 0.001 |
| 210835_s_at | C-terminal binding protein 2 | 0.001 |
| 216695_s_at | Tankyrase, TRF1-interacting ankyrin-related ADP-ribose polymerase | 0.001 |
| 208067_x_at | Ubiquitously transcribed tetratricopeptide repeat containing, Y-linked | 0.001 |
| 209400_at | Solute carrier family 12 (potassium/chloride transporters), member 4 | 0.001 |
| 201262_s_at | Biglycan | 0.001 |
| 203171_s_at | Ribosomal RNA processing 8, methyltransferase, homolog (yeast) | 0.002 |
| 207131_x_at | Gamma-glutamyltransferase 1 | 0.002 |
| 219464_at | Carbonic anhydrase XIV | 0.002 |
| 206345_s_at | Paraoxonase 1 | 0.002 |
| 210907_s_at | Programmed cell death 10 | 0.002 |
| 202641_at | ADP-ribosylation factor-like 3 | 0.002 |
| 204969_s_at | Radixin | 0.003 |
| 222289_at | Potassium voltage-gated channel, Shaw-related subfamily, member 2 | 0.003 |
| 210318_at | Retinol binding protein 3, interstitial | 0.003 |
| 219301_s_at | Contactin associated protein-like 2 | 0.004 |
| 203116_s_at | Ferrochelatase | 0.004 |
| 207242_s_at | Glutamate receptor, ionotropic, kainate 1 | 0.004 |
| 214005_at | Gamma-glutamyl carboxylase | 0.004 |
| 215529_x_at | DIP2 disco-interacting protein 2 homolog A (Drosophila) | 0.004 |

Fig. 2

p-Value corresponding to all the genes in the training set for dataset “GSE7146”.

Fig. 3

p-Value corresponding to all the genes in the training set for dataset “human pancreatic islets from normal and type II diabetic subjects (A)”.

Fig. 4

p-Value corresponding to all the genes in the training set for dataset “human pancreatic islets from normal and type II diabetic subjects (B)”.

Fig. 5

p-Value corresponding to all the genes in the training set for dataset “human skeletal muscle-type II diabetes”.


Identification of best-ranked genes from SVMRFE

The gene subsets selected by p-value were given as input to the support vector machine. Recursive Feature Elimination (RFE) is an iterative procedure built around the SVM classifier. The RFE algorithm assigns a weight to each gene; the weight was calculated from the expression values of the genes in the disease and normal samples for all the datasets. The algorithm ranked the genes (with a classification accuracy of 83.9%) in descending order of weight and generated the list of genes found to be the most discriminatory between normal and disease samples (Table 7, Table 8, Table 9, Table 10). The outline for SVMRFE with a linear kernel is presented below:
Table 7

Best ranked genes for dataset “GSE7146”.

Gene name
G0/G1switch 2
Transducin-like enhancer of split 1 (E (sp1) homolog, Drosophila)
Acid phosphatase 1, soluble
DnaJ (Hsp40) homolog, subfamily C, member 1
Golgi-associated, gamma adaptin ear containing, ARF binding protein 2
Protein phosphatase 1, regulatory subunit 7
Interleukin 1 receptor-like 1
Discs, large homolog 5 (Drosophila)
Cytochrome P450, family 4, subfamily F, polypeptide 2
Protein phosphatase 3 (formerly 2B), catalytic subunit, gamma isoform (calcineurin A gamma)
Gap junction protein, beta 3, 31 kDa (connexin 31)
Diaphanous homolog 2 (Drosophila)
Olfactory receptor, family 7, subfamily C, member 2
Solute carrier family 22 (organic anion transporter), member 6
Serine peptidase inhibitor, Kazal Type II (acrosin-trypsin inhibitor)
Chemokine (C-C motif) ligand 27
Dynein, axonemal, intermediate chain 2
Junctophilin 2
Kallikrein-related peptidase 12
S100 calcium binding protein A12
Discs, large homolog 3 (neuroendocrine-dlg, Drosophila)
Sodium channel, non-voltage-gated 1, gamma subunit
ST6 (alpha-N-acetyl-neuraminyl-2, 3-beta-galactosyl-1, 3) -N- acetylgalactosaminide alpha-2, 6-sialyltransferase 5
Fc fragment of IgG, low affinity IIb, receptor (CD32)
Table 8

Best ranked genes for dataset “human pancreatic islets from normal and type II diabetic subjects (A)”.

Gene name
Glucuronidase, beta
Enoyl-CoA delta isomerase 1
C-terminal binding protein 1
Inhibitor of DNA binding 2, dominant negative helix-loop-helix protein
Hypoxanthine phosphoribosyltransferase 1
Sialidase 1 (lysosomal sialidase)
ATP-binding cassette, sub-family C (CFTR/MRP), member 4
Aminolevulinate, delta-, synthase 1
Suppressor of var1, 3-like 1 (S. cerevisiae)
Flavin-containing monooxygenase 5
Solute carrier family 26, member 6
TGF-beta activated kinase 1/MAP3K7 binding protein 2
Caveolin 2
Nephrosis 1, congenital, Finnish type (nephrin)
Fucosyltransferase 9 (alpha (1,3) fucosyltransferase)
Cytidine deaminase
Cytochrome P450, family 7, subfamily A, polypeptide 1
Table 9

Best ranked genes for dataset “human pancreatic islets from normal and type II diabetic subjects (B)”.

Gene name
Adenomatosis polyposis coli down-regulated 1
Ribonuclease, RNase K
Table 10

Best ranked genes for dataset “human skeletal muscle-type II diabetes”.

Gene name
Protocadherin beta 3
Leucine zipper, putative tumor suppressor family member 3
USP6 N-terminal like
Ubiquitously transcribed tetratricopeptide repeat containing, Y-linked
Outline of SVMRFE with a linear kernel:

Inputs:
  Training samples $X_0 = [x_1, x_2, \ldots, x_n]^T$
  Class labels (1 for normal or 0 for diseased) $y = [y_1, y_2, \ldots, y_n]^T$
Initialize:
  Surviving genes s = [1, 2, …, n]
  Gene-ranking list r = []
Repeat until s = []:
  1. Limit training samples to good genes: X = X0(:, s)
  2. Train the classifier: α = SVM-train(X, y)
  3. Compute the weight of each selected gene: $w = \sum_k \alpha_k y_k x_k$, where k indicates the kth training pattern
  4. Compute the ranking criterion for the ith gene: $R(i) = (w_i)^2$
  5. Mark the gene with the lowest ranking: g = arg min(R)
  6. Renew the gene-ranking list: r = [s(g), r]
  7. Eliminate the gene with the lowest ranking: s = s(1 : g − 1, g + 1 : length(s))
Output:
  A gene-ranking list r
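
The elimination loop in the outline can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: a plain full-batch subgradient-descent trainer for the linear SVM stands in for SVM-train and learns the weight vector w directly (for a linear kernel this plays the role of the weighted sum over training patterns), labels are coded as +1/-1 instead of 1/0 because the hinge loss requires it, and the toy expression matrix and hyperparameters are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Stand-in for SVM-train: full-batch subgradient descent on the
    L2-regularised hinge loss. Labels y must be +1 or -1. Returns (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X0, y):
    """Recursive feature elimination: returns gene indices, best-ranked first."""
    surviving = list(range(len(X0[0])))
    ranking = []
    while surviving:
        # Restrict the training samples to the surviving genes.
        X = [[row[j] for j in surviving] for row in X0]
        w, _ = train_linear_svm(X, y)
        # Ranking criterion R(i) = w_i^2; drop the gene with the smallest value.
        scores = [wi * wi for wi in w]
        g = scores.index(min(scores))
        ranking.insert(0, surviving[g])
        del surviving[g]
    return ranking


# Hypothetical toy data: 6 samples x 3 genes; only gene 0 separates the classes.
X0 = [
    [2.0, 0.1, 0.0],
    [2.1, -0.1, 0.0],
    [1.9, 0.1, 0.0],
    [-2.0, 0.1, 0.0],
    [-2.1, -0.1, 0.0],
    [-1.9, 0.1, 0.0],
]
y = [1, 1, 1, -1, -1, -1]

ranking = svm_rfe(X0, y)
print(ranking)  # the first entry is the most discriminatory gene
```

Because one gene is removed per iteration, the SVM is retrained once per surviving gene; on real microarray data a library SVM and elimination of several genes per step would normally be preferred for speed.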

Identification of degree of essentiality and non-essentiality of genes

To identify significant and reliable targets, the work concentrated on non-essential genes. Essential genes were ruled out based on the hits obtained from the Database of Essential Genes (DEG 10.9) (http://tubic.tju.edu.cn/deg/) [46]. Essential genes sustain an organism, so using them as gene targets may induce drug side effects. Hence, it is important to retain only the non-essential genes as potential drug targets. Table 11, Table 12, Table 13 and Table 14 show the non-essential genes from the microarray datasets under study.
Table 11

Non-essential genes for dataset “GSE7146”.

| Gene symbol | Gene name |
|---|---|
| G0S2 | G0/G1switch 2 |
| ACP1 | Acid phosphatase 1, soluble |
| CCL27 | Chemokine (C-C motif) ligand 27 |
| JPH2 | Junctophilin 2 |
| KLK12 | Kallikrein-related peptidase 12 |
| S100A12 | S100 calcium binding protein A12 |
| DLG3 | Discs, large homolog 3 (neuroendocrine-dlg, Drosophila) |
| SCNN1G | Sodium channel, non-voltage-gated 1, gamma subunit |
| ST6GALNAC5 | ST6 (alpha-N-acetyl-neuraminyl-2, 3-beta-galactosyl-1, 3) -N-acetylgalactosaminide alpha-2, 6-sialyltransferase 5 |
| FCGR2B | Fc fragment of IgG, low-affinity IIb, receptor (CD32) |

Table 12

Non-essential genes for dataset “human pancreatic islets from normal and type II diabetic subjects (A)”.

| Gene symbol | Gene name |
|---|---|
| HPRT1 | Hypoxanthine phosphoribosyltransferase 1 |
| ABCC4 | ATP-binding cassette, sub-family C (CFTR/MRP), member 4 |
| FMO5 | Flavin-containing monooxygenase 5 |
| CAV2 | Caveolin 2 |
| FUT3 | Fucosyltransferase 9 (alpha (1, 3) fucosyltransferase) |
| CDA | Cytidine deaminase |

Table 13

Non-essential genes for dataset “human pancreatic islets from normal and type II diabetic subjects (B)”.

| Gene symbol | Gene name |
|---|---|
| APCDD1 | Adenomatosis polyposis coli down-regulated 1 |
| RNASEK | Ribonuclease, RNase K |

Table 14

Non-essential genes for dataset “human skeletal muscle-type II diabetes”.

| Gene symbol | Gene name |
|---|---|
| USP6NL | USP6 N-terminal like |
| PROSAPIP1 | Leucine zipper, putative tumor suppressor family member 3 |


Gene interaction studies

After obtaining the non-essential genes from the top-ranked coding genes for each dataset, a gene regulatory network was constructed using the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) database [40]. The aim was to observe the interactions between the non-essential protein-coding genes and other proteins, which result from biochemical events and/or electrostatic forces [23]. The function and activity of a protein are often modulated by other proteins with which it interacts.

Gene regulatory network of dataset “GSE7146”

In this dataset, of the ten best coding genes obtained through the SVMRFE approach, only five (ACP1, FCGR2B, SCNN1G, CCL27, and DLG3) showed interaction with other protein-coding genes (Fig. 6). ACP1 showed a direct interaction with EPHA2, which is reported to increase the chance of myocardial infarction and reduce the survival rate of hyperglycemic mice [12]. LYN showed an indirect interaction with ACP1 via EPHA2 and a direct interaction with FCGR2B. Modulation of its kinase activation has been reported to act as a novel insulin receptor-potentiating agent, which produces a rapid-onset and durable blood glucose-lowering activity in diabetic animals [32]. FCGR2B also showed a direct interaction with PTPN6, which has been reported to negatively regulate insulin action on glucose homeostasis in the liver and muscle [44]. DLG3 showed direct interaction with GRIN2A and GRIN2B, both of which have been reported to play a potential role in diabetes [11], [37], [42]. UBC has been reported to play a major role in the diabetes pathway [8], [16], [26], and its direct interaction with SCNN1G suggests that SCNN1G may also play a role in this pathway. CCL27 interacts with CCL25, a protein whose expression was shown to decrease significantly in diabetes [30].
Fig. 6

Gene regulatory network of dataset “GSE7146”.
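
As an illustration of how direct and indirect interactions can be read from such a network, the sketch below encodes only the interactions named in the text for this dataset as an undirected graph and finds a shortest path with breadth-first search. The edge list is a hand-made subset of the STRING network in Fig. 6, not the full network, and the function names are illustrative.

```python
from collections import deque

# Interactions named in the text for dataset GSE7146 (undirected edges).
EDGES = [
    ("ACP1", "EPHA2"),
    ("EPHA2", "LYN"),
    ("LYN", "FCGR2B"),
    ("FCGR2B", "PTPN6"),
    ("DLG3", "GRIN2A"),
    ("DLG3", "GRIN2B"),
    ("SCNN1G", "UBC"),
    ("CCL27", "CCL25"),
]


def build_graph(edges):
    """Adjacency sets for an undirected interaction network."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the nodes on a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
path = shortest_path(graph, "ACP1", "LYN")
print(path)                    # indirect interaction of ACP1 and LYN via EPHA2
print(sorted(graph["DLG3"]))   # direct neighbors of DLG3
```

A path of length one corresponds to a direct interaction, and a longer path to an indirect one; genes in different connected components, such as ACP1 and DLG3 in this subset, have no path.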


Gene regulatory network of dataset “human pancreatic islets from normal and type II diabetic subjects (A)”

Except for ABCC4 and FMO5, the other four proteins showed significant and strong interactions with neighboring proteins (Fig. 7). Purine Nucleoside Phosphorylase (PNP) and Nucleoside Phosphate Kinase (NPK) have been reported to play a major role in diabetes through either positive or negative metabolic regulation [9]. These two molecules also showed interaction with HPRT1 and CDA. Caveolin has been reported to mediate insulin signaling, thereby affecting glucose uptake [6]. In the other subgroup network, FUT3 has three direct neighbors, FUT1, FUT2, and B4GALT1, of which B4GALT1 expression has been shown to be affected by hyperglycemia [25].
Fig. 7

Gene regulatory network of dataset “human pancreatic islets from normal and type II diabetic subjects (A)”.


Gene regulatory network of the dataset “human pancreatic islets from normal and type II diabetic subjects (B)”

Both protein-coding genes in this dataset (RNASEK and APCDD1) showed significant interaction with neighboring proteins (Fig. 8). The involvement of RNASEK in diabetes is still an open question, but the interaction of APCDD1 with its neighbors suggests that it may be involved in the pathophysiology of diabetes. LPAR6 (Lysophosphatidic Acid Receptor 6), which interacts directly with APCDD1, has shown activity with PPARγ, a potential target for diabetes [38]. Aranda et al. (2012) also showed that DM/HG (Diabetes mellitus/High Glucose) reprograms signaling pathways in RECs (Retinal Endothelial Cells) to induce a state of LPA (Lysophosphatidic Acid) resistance. In 2000, Figueroa et al. [14] showed that alterations in LRP5 expression may be responsible for diabetes susceptibility; LRP5 may therefore be a potential target for therapeutic intervention. It has been reported that Wnt/LRP5 (lipoprotein receptor-related protein 5) signaling contributes to glucose-induced insulin secretion in the islets [15].
Fig. 8

Gene regulatory network of dataset “human pancreatic islets from normal and type II diabetic subjects (B)”.


Gene regulatory network of dataset “human skeletal muscle-type II diabetes”

The two prominent protein-coding genes identified by the SVMRFE analysis (USP6NL and ProSAPiP1) interacted with different sets of genes (Fig. 9). The network of ProSAPiP1 has not so far been reported in connection with diabetes. Three genes in the interaction network of USP6NL (SOS1, EGFR, and EGF) have shown significance in connection with diabetes. SOS1 has been associated with insulin action [4], and differential expression of EGFR has a major impact on diabetes and associated diseases [1], [5], [27], [28], [41], [45]. Kasayama et al. [19] reported in 1989 that EGF deficiency occurs in diabetes mellitus; hence insulin may be important in maintaining the normal level of EGF in the submandibular gland and plasma.
Fig. 9

Gene regulatory network of dataset “human skeletal muscle-type II diabetes”.


Functional enrichment of significant genes implying pathway analysis

To further validate the involvement of the identified genes in type II diabetes, pathway enrichment was considered. The analysis was restricted to the proteins interacting with the identified significant protein(s). The study was carried out using Biointerpreter, a web-based biological interpretation tool for microarray data analysis (Genotypic Technology Pvt. Ltd., Bangalore, India). The pathway analysis showed that some of the interacting proteins were involved in pathways that are directly or indirectly associated with type II diabetes.

Pathway enrichment for the interacting proteins of the dataset “effect of insulin infusion on human skeletal muscle”

GRIN2A (Glutamate [NMDA] receptor subunit epsilon-1) and GRIN2B (Glutamate [NMDA] receptor subunit epsilon-2), the two proteins interacting mainly with the identified protein DLG3, are involved in three different pathways, viz. Neuroactive ligand-receptor interaction, circadian entrainment and Long-term potentiation (Fig. 10). The proteins in the Neuroactive ligand-receptor interaction pathway have shown a significant role in the pathobiology of obesity and type II diabetes [10]. The second pathway, circadian entrainment, is the biological process that displays an endogenous oscillation of about 24 h. Studies show that exposure to light at night lowers glucose-stimulated insulin secretion due to a decrease in insulin secretory pulse mass. Potential mechanisms have been identified by which disturbances in circadian rhythms due to modern lifestyle can lead to islet failure in type II diabetes [35]. It has also been reported that impaired energy utilization from insulin deficiency impairs long-term potentiation in diabetes [47].
Fig. 10

Involvement of GRIN2A and GRIN2B in different pathways.


Pathway enrichment for the interacting proteins of the dataset “human pancreatic islets from normal and type II diabetic subjects (A)”

The protein B4GALT1, which interacts with the identified protein FUT3, is involved in several metabolic pathways connected to type II diabetes (Fig. 11). B4GALT1 participates in both glycoconjugate and lactose biosynthesis. It has been shown to be a biomarker in hepatocellular carcinoma, which is mainly caused by the insulin resistance syndrome. Finally, the ailment manifests as obesity and later as diabetes [17].
Fig. 11

Involvement of B4GALT1 in different pathways


Pathway enrichment for the interacting proteins of the dataset “human pancreatic islets from normal and type II diabetic subjects (B)”

The protein PNPT1, which interacts with RNASEK, is reported to be involved in pyrimidine and purine metabolism and in RNA degradation (Fig. 12). The effects of insulin regulation of purine and pyrimidine metabolism have been shown to cause some late complications of diabetes [34]. In 2009, Kocic et al. [20] reported that impaired dsRNA metabolism may lead to increased levels of different-sized RNAs in type II diabetic patients and may contribute to an ineffective response against different pathogens.
Fig. 12

Involvement of PNPT1 in different pathways.


Pathway enrichment for the interacting proteins of dataset “human skeletal muscle-type II diabetes”

The EGFR protein, which interacts with the identified protein USP6NL, has been reported by many researchers to be involved in diabetes [1], [5], [27], [28], [41], [45]. The pathway studies showed that the main pathways in which EGFR is involved also lead directly or indirectly to diabetes (Fig. 13). Hypoxia-inducible factor 1 alpha (HIF-1α) is regulated precisely by hypoxia and hyperglycemia. It has also been shown that HIF-1α and glucose can sometimes influence each other [43]. Components of the MAPK/ERK pathway have been reported to act as modifiers of cellular insulin responsiveness. Insulin resistance was due to downregulation of insulin-like receptor gene expression following persistent MAPK/ERK inhibition. This mechanism permits physiological adjustment of insulin sensitivity and the subsequent maintenance of circulating glucose at appropriate levels [48]. The MAPK and GnRh-Glp-1 pathways in the ileum have also been reported to be involved in improving the blood glucose level [45].
Fig. 13

Involvement of EGFR in different pathways.


Conclusion

Analysis of type II diabetes expression data from two different tissues, skeletal muscle and pancreatic islets, has given insight into genes that may be involved in the pathophysiology of the disease. The most discriminatory genes obtained in each dataset after the complete analysis were found to be associated with diabetes either directly or indirectly. However, the majority of these genes have not previously been reported in association with diabetes. The genes identified in the current study, viz. FCGR2B, DLG3, SCNN1G, FUT3, HPRT1, APCDD1, USP6NL, ProSAPiP1 and RNASEK, may act as potential drug targets. The significant pathways identified through the overall approach were Neuroactive ligand-receptor interaction, circadian entrainment, Long-term potentiation, pyrimidine and purine metabolism, dsRNA metabolism, the MAPK/ERK pathway, and GnRh-Glp-1. These results suggest studying these pathways and the above-reported proteins in pathway models or mouse models to evaluate them as drug targets or markers for type II diabetes.

Conflict of interest

The authors declare that there is no conflict of interest in the present work.

References (49 in total)

Review 1.  Role of caveolin and caveolae in insulin signaling and diabetes.

Authors:  Alex W Cohen; Terry P Combs; Philipp E Scherer; Michael P Lisanti
Journal:  Am J Physiol Endocrinol Metab       Date:  2003-12       Impact factor: 4.310

2.  Evidence for a role of the ubiquitin-proteasome pathway in pancreatic islets.

Authors:  María D López-Avalos; Valérie F Duvivier-Kali; Gang Xu; Susan Bonner-Weir; Arun Sharma; Gordon C Weir
Journal:  Diabetes       Date:  2006-05       Impact factor: 9.461

3.  Epidermal growth factor deficiency associated with diabetes mellitus.

Authors:  S Kasayama; Y Ohba; T Oka
Journal:  Proc Natl Acad Sci U S A       Date:  1989-10       Impact factor: 11.205

4.  Thymic microenvironmental alterations in experimentally induced diabetes.

Authors:  Patrícia R A Nagib; Jacy Gameiro; Luiz Guilherme Stivanin-Silva; Maria Sueli Parreira de Arruda; Déa Maria Serra Villa-Verde; Wilson Savino; Liana Verinaud
Journal:  Immunobiology       Date:  2010-02-16       Impact factor: 3.144

5.  Low-density lipoprotein receptor-related protein 5 (LRP5) is essential for normal cholesterol metabolism and glucose-induced insulin secretion.

Authors:  Takahiro Fujino; Hiroshi Asaba; Man-Jong Kang; Yukio Ikeda; Hideyuki Sone; Shinji Takada; Dong-Ho Kim; Ryoichi X Ioka; Masao Ono; Hiroko Tomoyori; Minoru Okubo; Toshio Murase; Akihisa Kamataki; Joji Yamamoto; Kenta Magoori; Sadao Takahashi; Yoshiharu Miyamoto; Hisashi Oishi; Masato Nose; Mitsuyo Okazaki; Shinichi Usui; Katsumi Imaizumi; Masashi Yanagisawa; Juro Sakai; Tokuo T Yamamoto
Journal:  Proc Natl Acad Sci U S A       Date:  2002-12-30       Impact factor: 11.205

6.  β-cell dysfunctional ERAD/ubiquitin/proteasome system in type 2 diabetes mediated by islet amyloid polypeptide-induced UCH-L1 deficiency.

Authors:  Safia Costes; Chang-jiang Huang; Tatyana Gurlo; Marie Daval; Aleksey V Matveyenko; Robert A Rizza; Alexandra E Butler; Peter C Butler
Journal:  Diabetes       Date:  2010-10-27       Impact factor: 9.461

7.  MAPK/ERK signaling regulates insulin sensitivity to control glucose metabolism in Drosophila.

Authors:  Wei Zhang; Barry J Thompson; Ville Hietakangas; Stephen M Cohen
Journal:  PLoS Genet       Date:  2011-12-29       Impact factor: 5.917

8.  A novel method for prokaryotic promoter prediction based on DNA stability.

Authors:  Aditi Kanhere; Manju Bansal
Journal:  BMC Bioinformatics       Date:  2005-01-05       Impact factor: 3.169

9.  Consequences of exposure to light at night on the pancreatic islet circadian clock and function in rats.

Authors:  Jingyi Qian; Gene D Block; Christopher S Colwell; Aleksey V Matveyenko
Journal:  Diabetes       Date:  2013-06-17       Impact factor: 9.461

Review 10.  The role of ubiquitination and sumoylation in diabetic nephropathy.

Authors:  Chenlin Gao; Wei Huang; Keizo Kanasaki; Yong Xu
Journal:  Biomed Res Int       Date:  2014-06-04       Impact factor: 3.411
