
Computational Analysis of Damaging Single-Nucleotide Polymorphisms and Their Structural and Functional Impact on the Insulin Receptor.

Zabed Mahmud1, Syeda Umme Fahmida Malik2, Jahed Ahmed1, Abul Kalam Azad1.   

Abstract

Single-nucleotide polymorphisms (SNPs) associated with complex disorders can create, destroy, or modify protein coding sites. Single amino acid substitutions in the insulin receptor (INSR) are the most common form of genetic variation underlying diseases such as Donohue syndrome (Leprechaunism), Rabson-Mendenhall syndrome, and type A insulin resistance. We analyzed the deleterious nonsynonymous SNPs (nsSNPs) in the INSR gene using several computational methods. The analysis began with PROVEAN, followed by the PolyPhen and I-Mutant servers, to investigate the effects of 57 nsSNPs retrieved from the database of SNPs (dbSNP). A total of 18 mutations predicted to have damaging effects on INSR protein structure and function were chosen for further analysis. Among these, our computational analysis suggested that 13 nsSNPs decrease protein stability and might result in loss of function, which increases the probability of their involvement in disease predisposition. In the absence of adequate prior reports on the possible deleterious effects of these nsSNPs, we systematically analyzed and characterized the functional variants in the coding region that can alter the expression and function of the INSR gene. In silico characterization of nsSNPs affecting INSR gene function can aid in better understanding of genetic differences in disease susceptibility.

Year:  2016        PMID: 27840822      PMCID: PMC5093252          DOI: 10.1155/2016/2023803

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

The insulin receptor (INSR) is a transmembrane receptor with tyrosine kinase activity that is activated by insulin, insulin-like growth factor I, and insulin-like growth factor II [1]. Metabolically, the INSR plays a crucial role in the regulation of glucose homeostasis, and its dysregulation may result in a range of clinical conditions including diabetes and cancer [2, 3]. The main activity of the INSR is to induce glucose uptake; a decrease in insulin receptor signaling therefore leads to type 2 diabetes mellitus. The cells' inability to take up glucose results in hyperglycemia and all the sequelae of diabetes. Insulin-resistant patients may also display acanthosis nigricans. The presence of mutant receptors in the cell has been shown to have detrimental effects on the activity of the normal receptor. A previous study with kinase-deficient INSRs transfected into cultured cells showed that such receptors suppressed the function of endogenous INSRs and acted as dominant-negative mutants [4]. However, in most cases of insulin resistance, the mutation is recessive. Yamamoto-Honda et al. [5] studied the function and consequences of recessive mutations in the INSR. For example, Donohue syndrome, also known as Leprechaunism, is a rare and severe autosomal recessive disorder caused by defects in the INSR gene. Single-nucleotide polymorphisms (SNPs) are the most common form of human genetic variation, and nearly half a million SNPs reside in the exons of the human genome. Among these, nonsynonymous SNPs (nsSNPs) alter amino acid residues and contribute to the functional diversity of encoded proteins in the human population. The genomic distribution of SNPs is not homogeneous: in general, SNPs occur more frequently in noncoding regions than in coding regions [6]. Genetic recombination and mutation rate are further factors that can determine SNP density [7].
SNPs are usually biallelic, and a single SNP may cause a Mendelian disease [8]. In complex diseases, SNPs do not usually act independently; rather, they act together with other SNPs to produce the disease condition, as has been seen in osteoporosis [9]. A wide range of human diseases, such as sickle-cell anemia, β thalassemia, and cystic fibrosis, result from SNPs [10-12]. For drug discovery, diseases with different SNPs may become crucial pharmacogenomic targets; some SNPs are also associated with the metabolism of different drugs [13-15]. SNPs can also serve as useful genetic markers in genome-wide association studies [16]. The deleterious effects of SNPs are generally attributed to their impact on protein structure and function. However, very few studies have predicted the SNPs of INSR and their impacts. In this study, we identified in silico the deleterious nsSNPs that may affect the structural integrity of the human INSR protein and are involved in several genetic diseases. In silico analysis of SNPs can play a major role in understanding the genetic basis of several complex human genetic diseases. Furthermore, establishing the functions of these SNPs could shed light on the genetics of human phenotypic diversity. Identifying the functional SNPs in a disease-related gene with laboratory techniques remains a major obstacle. With recent advances in in silico techniques and procedures, however, it is now possible to carry out such investigations without extensive laboratory work. The main focus of this work is to investigate the SNP variations in the human INSR gene and their possible effects on the structure and function of INSR using bioinformatics and computational algorithms.

2. Materials and Methods

2.1. Datasets

Data on the human INSR gene were collected from Online Mendelian Inheritance in Man (OMIM) and Entrez Gene on the National Center for Biotechnology Information (NCBI) web sites. The SNP information (protein accession number and SNP ID) for the INSR gene was retrieved from NCBI dbSNP (http://www.ncbi.nlm.nih.gov/snp/).

2.2. Analysis of Protein Variation Effects

A sequence-based predictor estimates the effect of protein sequence variation on protein function. Many web servers are available to predict the effect of single amino acid variations on protein stability and protein binding efficiency. PROVEAN (http://provean.jcvi.org/index.php), I-Mutant 3.0 (http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi), and PolyPhen (http://genetics.bwh.harvard.edu/pph2/) were used in this study. In PROVEAN, protein sequences of BLAST hits with more than 75% global sequence identity were clustered together, and the top clusters formed a supporting sequence set. A delta alignment scoring system was used, in which the scores of each supporting sequence were averaged within and across clusters to generate the final PROVEAN score. A protein variant is predicted to be “deleterious” if the final score is below a certain threshold (default −2.5) and “neutral” if the score is above the threshold [17]. PolyPhen version 2 predicts the influence of an amino acid substitution on the structure and function of a protein using specific empirical rules. Protein sequence, database ID/accession number, amino acid position, and amino acid variant details are the input options for PolyPhen [18]. The tool estimates the position-specific independent count (PSIC) score for every variant and calculates the score difference between variants. I-Mutant 2.0 and I-Mutant 3.0 use a Support Vector Machine (SVM) algorithm to predict the change in protein stability caused by single amino acid variations. They can predict protein stability changes from either protein sequence or structure, with an overall accuracy of 77% when the prediction is based on protein sequence. I-Mutant 2.0 and I-Mutant 3.0 predict the DDG value as a regression estimator together with the sign of the stability change. I-Mutant 3.0 furthermore classifies mutations into three categories: neutral mutation (−0.5 ≤ DDG ≤ 0.5), large decrease (DDG < −0.5), and large increase (DDG > 0.5) [19, 20].
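
The two cut-offs described above can be written down as a short sketch. The Python below is illustrative only: the function names are hypothetical, the thresholds are the ones quoted in this section, and treating a PROVEAN score exactly equal to −2.5 as deleterious is an assumption.

```python
PROVEAN_CUTOFF = -2.5  # default threshold quoted in the text


def classify_provean(score, cutoff=PROVEAN_CUTOFF):
    """Label a PROVEAN score; a score equal to the cutoff is treated as
    deleterious (an assumption, the text only says 'below')."""
    return "Deleterious" if score <= cutoff else "Neutral"


def classify_imutant_ddg(ddg):
    """Three-way I-Mutant 3.0 style classification of a DDG value."""
    if ddg < -0.5:
        return "Large decrease"
    if ddg > 0.5:
        return "Large increase"
    return "Neutral"


# Two PROVEAN scores taken from Table 1
print(classify_provean(-2.418))  # V1012M
print(classify_provean(-6.762))  # G58R
```

With the default cut-off, V1012M (−2.418) is labelled neutral and G58R (−6.762) deleterious, matching Table 1.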

2.3. 3D Modeling and Analysis of Protein Structure

The EMBL-EBI web-based tool PDBsum (http://www.ebi.ac.uk/pdbsum/) was used to find protein structures related to the INSR. PDBsum provides an at-a-glance overview of every macromolecular structure deposited in the Protein Data Bank (PDB). It performs a FASTA search against all sequences in the PDB to obtain a list of the closest matches [21]. LS-SNP/PDB [22] annotates all human SNPs that produce an amino acid change in a protein structure in the PDB [23], using features of their local structural environment, putative binding interactions, and evolutionary conservation. The presence of an nsSNP in a highly conserved surface patch or a charged surface patch suggests possible biological importance. These annotations allow users to quickly scan a large number of nsSNPs of interest and prioritize those with a higher likelihood of affecting normal protein activity. The LS-SNP server is also useful for mapping human nsSNPs onto protein homology models [24]. PyMOL was used to generate the mutant models of each of the selected PDB entries for the corresponding amino acid substitutions. PyMOL allows browsing through a rotamer library to change amino acids. Its “Mutagenesis Wizard” was used to replace the native amino acid with the “best” rotamer of the new amino acid. The “.pdb” files were saved for all the models.

2.4. Structure Validation and Energy Minimization

The Structural Analysis and Verification Server (SAVES) was used to evaluate and validate the quality of the refined 3D structural models. SAVES integrates the PROCHECK, PROVE, and ERRAT programs to check the overall quality of the 3D models obtained from the PyMOL mutagenesis tool. Structure refinement was carried out using KoBaMIN, which implements a knowledge-based potential refinement protocol for proteins [25].

2.5. Protein Stability Validation for Mutant Structure

The approach called Mutation Cutoff Scanning Matrix (mCSM) uses the concept of graph-based structural signatures to study and predict the impact of single-point mutations on protein stability and protein-protein and protein-nucleic acid affinity. The mCSM encodes distance patterns between atoms to represent protein residue environments [26].
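
As a rough illustration of what a distance-pattern signature looks like, the sketch below counts the atom pairs that fall within a series of increasing distance cut-offs. It is a simplified stand-in and not the mCSM implementation: the function name, the cut-off range, and the toy coordinates are all assumptions, and any atom-type labelling is omitted.

```python
from itertools import combinations
from math import dist


def cutoff_scan(coords, d_min=2.0, d_max=10.0, step=2.0):
    """Cumulative count of atom pairs within each distance cutoff (in Å).

    coords is a list of (x, y, z) tuples; the cutoffs are hypothetical
    defaults chosen only for this illustration.
    """
    distances = [dist(a, b) for a, b in combinations(coords, 2)]
    n_steps = int(round((d_max - d_min) / step)) + 1
    cutoffs = [d_min + i * step for i in range(n_steps)]
    return [sum(1 for d in distances if d <= c) for c in cutoffs]


# Toy "residue environment" of four atoms (made-up coordinates)
toy_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 7.0)]
signature = cutoff_scan(toy_atoms)
print(signature)  # [1, 3, 3, 6, 6]
```

The resulting vector is the kind of fixed-length feature that a machine-learning model can be trained on; comparing the vectors of a wild-type and a mutated environment is one way to capture how a substitution changes local packing.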

2.6. Structural Analysis

The predicted structures were viewed in University of California San Francisco (UCSF) Chimera, a computationally intensive program for visualizing molecular models that provides an interactive interface for analyzing the models and model-related data. It offers a platform for analyzing sequence alignments, generating homology models, performing molecular docking, viewing density maps, and comparing different models by superimposition [27]. The mutant and wild-type structures were superimposed, and the effect of the nonsynonymous variation was assessed in terms of steric hindrance arising from changes in the side chain and charge of the amino acid. The degree of change in the hydrophobicity or hydrophilicity of the substituted amino acid and its effect on interacting intrachain and interchain residues were then analyzed. A summary of the in silico approaches used in this study is shown in Figure 1.
Figure 1

Workflow of in silico approaches used in this study.

3. Results

3.1. SNP Dataset from dbSNP

dbSNP contains both validated and nonvalidated polymorphisms. Despite this drawback, we chose dbSNP because it is the most extensive SNP database and records the allelic frequency of most INSR nsSNPs. During our data search, some SNPs previously reported in dbSNP were found to be invalid because of sequencing and alignment errors. These erroneous SNPs have expired or have been merged with other SNPs. Some INSR genes have been renamed. We carefully cross-examined the databases and removed the old and invalid SNPs. In dbSNP, the INSR gene contains data for 4967 SNPs, of which only 57 were nsSNPs in the coding region (Table 1). Our investigation considered only the nsSNPs in the coding region.
Table 1

List of nsSNPs that were predicted by PROVEAN to have functional significance.

| SNP ID | Mutation | PROVEAN result | PROVEAN score |
|---|---|---|---|
| rs1799816 | V1012M | Neutral | −2.418 |
| rs52836744 | G58R | Deleterious | −6.762 |
| rs121913144 | R1027 | | |
| rs121913145 | H236R | Deleterious | −5.616 |
| rs121913156 | R1201Q | Deleterious | −3.701 |
| rs891087 | D261E | Neutral | −0.116 |
| rs2162771 | P830L | Neutral | −1.576 |
| rs13306449 | Y1361C | Deleterious | −4.157 |
| rs35045353 | G811S | Neutral | −2.401 |
| rs1051691 | I448T | Deleterious | −3.774 |
| rs1051692 | Y171H | Neutral | −1.304 |
| rs2229429 | D546E | Neutral | −2.474 |
| rs7508518 | A2G | Neutral | 0.465 |
| rs52800171 | W1220L | Deleterious | −11.648 |
| rs55816055 | S353P | Deleterious | −2.761 |
| rs56395521 | L1065V | Neutral | −1.133 |
| rs72549237 | V362I | Neutral | −0.211 |
| rs76077021 | R889W | Deleterious | −4.254 |
| rs76673783 | E664G | Deleterious | −5.068 |
| rs78433961 | R796S | Neutral | −0.984 |
| rs78827745 | M65K | Deleterious | −3.808 |
| rs79312957 | R413C | Deleterious | −5.623 |
| rs113527718 | S1297G | Deleterious | −2.539 |
| rs138528064 | T320M | Neutral | −1.959 |
| rs140762552 | T107M | Deleterious | −2.922 |
| rs140852238 | E51K | Deleterious | −2.545 |
| rs141484557 | G262S | Deleterious | −3.037 |
| rs142391704 | A706D | Neutral | 0.622 |
| rs142910337 | D75G | Neutral | 1.672 |
| rs143523271 | S748L | Neutral | −0.793 |
| rs143919163 | G192D | Neutral | −1.884 |
| rs144029037 | V900I | Neutral | −0.698 |
| rs146588336 | D946E | Neutral | −0.743 |
| rs147671523 | E517G | Deleterious | −4.293 |
| rs148838377 | P755S | Neutral | −0.076 |
| rs149536206 | H8598 | | |
| rs150114699 | L991I | Neutral | −1.71 |
| rs181150880 | R410Q | Neutral | −2.011 |
| rs182552223 | T858A | Deleterious | −2.612 |
| rs183360558 | D893N | Neutral | −1.752 |
| rs185736681 | R1053C | Deleterious | −5.933 |
| rs187282966 | R889Q | Neutral | −1.492 |
| rs199580495 | S1033F | Deleterious | −5.642 |
| rs199599404 | M1319I | Neutral | −1.196 |
| rs199659271 | C219R | Deleterious | −9.831 |
| rs200059069 | K411Q | Neutral | −1.21 |
| rs200110540 | V866I | Neutral | −0.134 |
| rs200199169 | P271L | Neutral | −2.315 |
| rs200400127 | A1340V | Neutral | −1.519 |
| rs200921389 | G1048D | Neutral | −1.585 |
| rs201147780 | K294R | Neutral | −0.904 |
| rs201466857 | T858M | Deleterious | −3.384 |
| rs201506342 | P1312T | Deleterious | −3.008 |
| rs201978448 | A537V | Deleterious | −3.252 |
| rs201979105 | S1221A | Deleterious | −2.698 |
| rs202160383 | R1128H | Neutral | −2.063 |

Premature stop codon.

3.2. Effects of nsSNPs on INSR Predicted by Different Tools

The PROVEAN algorithm works mainly with the primary sequence, whereas other tools perform a similar task using the structure. Because PROVEAN can score a large number of substitutions and does not require structures, it has an advantage over other tools. PROVEAN predicts the effect of a variant on the biological function of the protein based on sequence homology. PROVEAN scores are classified as “deleterious” below a certain threshold (here −2.5) and “neutral” above it. A .txt file containing the dbSNP rsIDs of all 57 nsSNPs was submitted to the “dbSNP rsIDs” page to calculate the PROVEAN scores. Out of 57 nsSNPs, PROVEAN predicted 24 as deleterious and 33 as neutral (Table 1). Among the 24 deleterious nsSNPs, W1220L and C219R were predicted as highly deleterious, with PROVEAN scores of −11.648 and −9.831, respectively. PolyPhen identifies homologues of the input sequence via BLAST, calculates PSIC scores for every variant, and estimates the difference between the variant scores; a difference of 0.339 is detrimental. Certain empirical rules are applied to the sequences, and the accuracy is approximately 82% with an 8% chance of false-positive prediction. The protein accession number of INSR (P06213) and the amino acid substitutions corresponding to each of the 57 nsSNPs were submitted separately. Table 2 summarizes the results obtained from the PolyPhen server. A PSIC score difference was used to categorize SNPs as benign or damaging. “PolyPhen-2: scores are evaluated as 0.000 (most probably benign) to 0.999 (most probably damaging).” Twenty-one of the 57 nsSNPs were predicted as “damaging,” with PSIC scores in the range of 1.51 to 3.41. The 18 nsSNPs predicted to be deleterious by the SIFT (Sorting Intolerant from Tolerant) program were also predicted to be damaging by the PolyPhen server.
Table 2

Potential effect of amino acid substitution for nsSNPs in human INSR predicted by the PolyPhen algorithm.

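| SNP ID | Mutation | PolyPhen result | Score | Sensitivity | Specificity |
|---|---|---|---|---|---|
| rs1799816 | V1012M | Probably damaging | 0.992 | 0.7 | 0.97 |
| rs52836744 | G58R | Probably damaging | 1 | 0 | 1 |
| rs121913144 | R1027 | | | | |
| rs121913145 | H236R | Probably damaging | 1 | 0 | 1 |
| rs121913156 | R1201Q | Probably damaging | 1 | 0 | 1 |
| rs891087 | D261E | Benign | 0 | 1 | 0 |
| rs2162771 | P830L | Benign | 0 | 1 | 0 |
| rs13306449 | Y1361C | Probably damaging | 1 | 0 | 1 |
| rs35045353 | G811S | Benign | 0.441 | 0.89 | 0.9 |
| rs1051691 | I448T | Probably damaging | 0.996 | 0.55 | 0.98 |
| rs1051692 | Y171H | Benign | 0.024 | 0.95 | 0.81 |
| rs2229429 | D546E | Benign | 0.032 | 0.95 | 0.82 |
| rs7508518 | A2G | Benign | 0 | 1 | 0 |
| rs52800171 | W1220L | Probably damaging | 1 | 0 | 1 |
| rs55816055 | S353P | Possibly damaging | 0.528 | 0.88 | 0.9 |
| rs56395521 | L1065V | Benign | 0 | 1 | 0 |
| rs72549237 | V362I | Benign | 0.003 | 0.98 | 0.44 |
| rs76077021 | R889W | Benign | 0.111 | 0.93 | 0.86 |
| rs76673783 | E664G | Possibly damaging | 0.592 | 0.87 | 0.91 |
| rs78433961 | R796S | Benign | 0.001 | 0.99 | 0.15 |
| rs78827745 | M65K | Possibly damaging | 0.934 | 0.8 | 0.94 |
| rs79312957 | R413C | Probably damaging | 0.999 | 0.14 | 0.99 |
| rs113527718 | S1297G | Benign | 0.004 | 0.97 | 0.59 |
| rs138528064 | T320M | Benign | 0.199 | 0.92 | 0.88 |
| rs140762552 | T107M | Probably damaging | 1 | 0 | 1 |
| rs140852238 | E51K | Benign | 0.003 | 0.98 | 0.44 |
| rs141484557 | G262S | Possibly damaging | 0.939 | 0.8 | 0.94 |
| rs142391704 | A706D | Benign | 0 | 1 | 0 |
| rs142910337 | D75G | Benign | 0 | 1 | 0 |
| rs143523271 | S748L | Benign | 0 | 1 | 0 |
| rs143919163 | G192D | Benign | 0.005 | 0.97 | 0.74 |
| rs144029037 | V900I | Benign | 0.048 | 0.94 | 0.83 |
| rs146588336 | D946E | | | | |
| rs147671523 | E517G | Possibly damaging | 0.726 | 0.86 | 0.92 |
| rs148838377 | P755S | Benign | 0 | 1 | 0 |
| rs149536206 | H8598 | | | | |
| rs150114699 | L991I | Benign | 0.442 | 0.89 | 0.9 |
| rs181150880 | R410Q | Possibly damaging | 0.935 | 0.8 | 0.94 |
| rs182552223 | T858A | Benign | 0.007 | 0.96 | 0.75 |
| rs183360558 | D893N | Benign | 0 | 1 | 0 |
| rs185736681 | R1053C | Benign | 0.223 | 0.91 | 0.88 |
| rs187282966 | R889Q | Benign | 0.252 | 0.91 | 0.88 |
| rs199580495 | S1033F | Probably damaging | 0.968 | 0.77 | 0.95 |
| rs199599404 | M1319I | Benign | 0.007 | 0.96 | 0.75 |
| rs199659271 | C219R | Probably damaging | 1 | 0 | 1 |
| rs200059069 | K411Q | Possibly damaging | 0.75 | 0.85 | 0.92 |
| rs200110540 | V866I | Benign | 0 | 1 | 0 |
| rs200199169 | P271L | Benign | 0 | 1 | 0 |
| rs200400127 | A1340V | Benign | 0.021 | 0.95 | 0.8 |
| rs200921389 | G1048D | Benign | 0.009 | 0.96 | 0.77 |
| rs201147780 | K294R | Benign | 0.008 | 0.96 | 0.76 |
| rs201466857 | T858M | Probably damaging | 0.994 | 0.69 | 0.97 |
| rs201506342 | P1312T | Possibly damaging | 0.616 | 0.87 | 0.91 |
| rs201978448 | A537V | Benign | 0.143 | 0.92 | 0.86 |
| rs201979105 | S1221A | Probably damaging | 0.997 | 0.41 | 0.98 |
| rs202160383 | R1128H | Benign | 0.031 | 0.95 | 0.82 |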

Premature stop codon.

I-Mutant is a support vector machine based tool routinely used to analyze changes in protein stability caused by single-site mutations. I-Mutant also provides scores for free energy changes, calculated with the FOLD-X energy-based web server. By combining the FOLD-X estimates with those of I-Mutant, about 93% precision can be achieved. We used a threshold of −1.5 kcal/mol to predict that an SNP is destabilizing. Forty-six nsSNPs were considered destabilizing based on the DDG values from I-Mutant (Table 3). Finally, we selected 18 significant nsSNPs because they were predicted to be deleterious by the PROVEAN, PolyPhen, and SIFT programs and showed decreased structural stability in the I-Mutant analysis (Table 4).
Table 3

List of nsSNPs' stability predicted by I-MUTANT.

| SNP ID | Mutation | Stability | SNP ID | Mutation | Stability |
|---|---|---|---|---|---|
| rs1799816 | V1012M | Decrease | rs141484557 | G262S | Decrease |
| rs52836744 | G58R | Decrease | rs142391704 | A706D | Decrease |
| rs121913144 | R1027 | | rs142910337 | D75G | Decrease |
| rs121913145 | H236R | Decrease | rs143523271 | S748L | Increase |
| rs121913156 | R1201Q | Decrease | rs143919163 | G192D | Decrease |
| rs891087 | D261E | Increase | rs144029037 | V900I | Decrease |
| rs2162771 | P830L | Decrease | rs146588336 | D946E | Increase |
| rs13306449 | Y1361C | Increase | rs147671523 | E517G | Increase |
| rs35045353 | G811S | Decrease | rs148838377 | P755S | Decrease |
| rs1051691 | I448T | Decrease | rs149536206 | H8598 | |
| rs1051692 | Y171H | Decrease | rs150114699 | L991I | Decrease |
| rs2229429 | D546E | Increase | rs181150880 | R410Q | Decrease |
| rs7508518 | A2G | Decrease | rs182552223 | T858A | Decrease |
| rs52800171 | W1220L | Decrease | rs183360558 | D893N | Decrease |
| rs55816055 | S353P | Increase | rs185736681 | R1053C | Decrease |
| rs56395521 | L1065V | Decrease | rs187282966 | R889Q | Decrease |
| rs72549237 | V362I | Decrease | rs199580495 | S1033F | Increase |
| rs76077021 | R889W | Decrease | rs199599404 | M1319I | Decrease |
| rs76673783 | E664G | Decrease | rs199659271 | C219R | Decrease |
| rs78433961 | R796S | Decrease | rs200059069 | K411Q | Increase |
| rs78827745 | M65K | Decrease | rs200110540 | V866I | Decrease |
| rs79312957 | R413C | Decrease | rs200199169 | P271L | Decrease |
| rs113527718 | S1297G | Decrease | rs200400127 | A1340V | Decrease |
| rs138528064 | T320M | Decrease | rs200921389 | G1048D | Decrease |
| rs140762552 | T107M | Decrease | rs201147780 | K294R | Increase |
| rs140852238 | E51K | Decrease | rs201466857 | T858M | Decrease |
| rs201506342 | P1312T | Decrease | rs201979105 | S1221A | Decrease |
| rs201978448 | A537V | Increase | rs202160383 | R1128H | Decrease |

Premature stop codon.

Table 4

Common amino acid change due to deleterious nsSNPs in human INSR predicted by PROVEAN and PolyPhen algorithms.

| SNP ID | Mutation | PROVEAN result | PROVEAN score | PolyPhen result | PolyPhen score |
|---|---|---|---|---|---|
| rs52836744 | G58R | Deleterious | −6.762 | Probably damaging | 1 |
| rs121913145 | H236R | Deleterious | −5.616 | Probably damaging | 1 |
| rs121913156 | R1201Q | Deleterious | −3.701 | Probably damaging | 1 |
| rs13306449 | Y1361C | Deleterious | −4.157 | Probably damaging | 1 |
| rs1051691 | I448T | Deleterious | −3.774 | Probably damaging | 0.996 |
| rs52800171 | W1220L | Deleterious | −11.648 | Probably damaging | 1 |
| rs55816055 | S353P | Deleterious | −2.761 | Possibly damaging | 0.528 |
| rs76673783 | E664G | Deleterious | −5.068 | Possibly damaging | 0.592 |
| rs78827745 | M65K | Deleterious | −3.808 | Possibly damaging | 0.934 |
| rs79312957 | R413C | Deleterious | −5.623 | Probably damaging | 0.999 |
| rs140762552 | T107M | Deleterious | −2.922 | Probably damaging | 1 |
| rs141484557 | G262S | Deleterious | −3.037 | Possibly damaging | 0.939 |
| rs147671523 | E517G | Deleterious | −4.293 | Possibly damaging | 0.726 |
| rs199580495 | S1033F | Deleterious | −5.642 | Probably damaging | 0.968 |
| rs199659271 | C219R | Deleterious | −9.831 | Probably damaging | 1 |
| rs201466857 | T858M | Deleterious | −3.384 | Probably damaging | 0.994 |
| rs201506342 | P1312T | Deleterious | −3.008 | Possibly damaging | 0.616 |
| rs201979105 | S1221A | Deleterious | −2.698 | Probably damaging | 0.997 |
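
The selection behind Table 4 amounts to taking the variants that both predictors flag. The sketch below shows one way this could be done in Python for a handful of entries copied from Tables 1 and 2; the function name and the dictionary layout are hypothetical, and it is not the procedure the servers themselves run.

```python
# A few PROVEAN scores (Table 1) and PolyPhen calls (Table 2)
provean_scores = {
    "G58R": -6.762, "V1012M": -2.418, "R889W": -4.254,
    "I448T": -3.774, "D261E": -0.116, "T858A": -2.612,
}
polyphen_calls = {
    "G58R": "Probably damaging", "V1012M": "Probably damaging",
    "R889W": "Benign", "I448T": "Probably damaging",
    "D261E": "Benign", "T858A": "Benign",
}


def consensus_deleterious(provean, polyphen, cutoff=-2.5):
    """Return mutations that are deleterious by PROVEAN (score <= cutoff,
    an assumed tie rule) and called probably/possibly damaging by PolyPhen."""
    return sorted(
        mutation
        for mutation, score in provean.items()
        if score <= cutoff and "damaging" in polyphen.get(mutation, "").lower()
    )


print(consensus_deleterious(provean_scores, polyphen_calls))  # ['G58R', 'I448T']
```

V1012M drops out because PROVEAN scores it as neutral, while R889W and T858A drop out because PolyPhen calls them benign, which is why neither pair of variants appears in Table 4.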

3.3. Effects of nsSNPs on Protein Structure

The INSR protein structures were searched using the EMBL-EBI web-based tool PDBsum. Two related protein structures, 2HR7 and 4IBM, were found to share 100% amino acid sequence similarity. The single amino acid polymorphism (SAAP) database server (http://www.bioinf.org.uk/saap/db/) was offline for essential maintenance, so we were unable to map the deleterious nsSNPs onto the protein structures through SAAP. Instead, the deleterious nsSNPs were mapped onto the protein structures through the LS-SNP/PDB server. According to this resource, 2HR7 accounted for 9 nsSNPs and 4IBM for 4 nsSNPs. Apart from SNP scanning, the LS-SNP/PDB server also predicts the solvent accessibility and conservation ratio of the given protein structures. An overview of the mapping of the mutant structures and their solvent accessibility and conservation ratios is given in Table 5.
Table 5

Mapping of nsSNPs in 2HR7 and 4IBM 3D structures.

| PDB ID | SNP ID | Mutation | PDB residue number | Solvent accessibility | Conservation |
|---|---|---|---|---|---|
| 2HR7 | rs52836744 | G58R | 31 | Intermediate 10% | 5% |
| 2HR7 | rs121913145 | H236R | 209 | Intermediate 23% | 10% |
| 2HR7 | rs1051691 | I448T | 421 | Buried 1% | 1% |
| 2HR7 | rs55816055 | S353P | 326 | Buried 5% | 3% |
| 2HR7 | rs78827745 | M65K | 38 | Buried 0% | 8% |
| 2HR7 | rs79312957 | R413C | 386 | Exposed 58% | 3% |
| 2HR7 | rs140762552 | T107M | 80 | Buried 2% | 5% |
| 2HR7 | rs141484557 | G262S | 235 | Exposed 46% | 25% |
| 2HR7 | rs199659271 | C219R | 192 | Buried 5% | 11% |
| 4IBM | rs121913156 | R1201Q | 1174 | Intermediate 10% | 0% |
| 4IBM | rs52800171 | W1220L | 1193 | Buried 1% | 0% |
| 4IBM | rs199580495 | S1033F | 1006 | Intermediate 30% | 0% |
| 4IBM | rs201979105 | S1221A | 1194 | Buried 5% | 0% |

Out of the 18 nsSNPs predicted to be deleterious by both PROVEAN and PolyPhen, a total of 13 were mapped to the native structures with PDB IDs 2HR7 and 4IBM. All the functional nsSNPs predicted by the PROVEAN and PolyPhen tools were subjected to the PyMOL mutation tool. A model for each functional nsSNP was built with the PyMOL mutagenesis tool and visualized in UCSF Chimera for comparison with the native structures (Figure 2; only mutants rs1051691 (I421T) and rs121913156 (R1174Q) are shown).
Figure 2

A comparison of amino acid substitutions due to nsSNPs. Two mutant structures of deleterious nsSNPs rs1051691 (I421T) (c) and rs121913156 (R1174Q) (d) were compared to their native structures 2HR7 (a) and 4IBM (b), respectively. Models were generated by using PYMOL and visualized by UCSF Chimera.
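
The mutation labels in Figure 2 (I421T, R1174Q) use PDB residue numbering, whereas the tables of predictions use sequence numbering. In every row of Table 5 the two differ by 27, so the conversion can be sketched as below. The constant offset is inferred from Table 5 only and should be treated as an assumption; the function names are hypothetical.

```python
import re

PDB_OFFSET = 27  # inferred from Table 5 (e.g. 448 -> 421); an assumption


def parse_mutation(mutation):
    """Split a label such as 'I448T' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation label: {mutation!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def to_pdb_numbering(mutation, offset=PDB_OFFSET):
    """Re-express a sequence-numbered mutation in PDB residue numbering."""
    wild_type, position, mutant = parse_mutation(mutation)
    return f"{wild_type}{position - offset}{mutant}"


for label in ("I448T", "R1201Q", "R413C"):
    print(label, "->", to_pdb_numbering(label))
```

Labels that do not follow the one-letter pattern, such as the premature stop codon entries, raise a `ValueError` rather than being silently converted.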

Energy minimization was performed for the native structures (2HR7 and 4IBM) and the modeled mutant structures. The KoBaMIN web server uses a force field for energy minimization. The total energy of each mutant and native model after minimization is listed in Table 6. The total energies of the native 2HR7 and 4IBM structures are −22087.6969 kJ/mol and −13041.4646 kJ/mol, respectively. A change in total energy due to mutation is noticeable in both the 2HR7 and the 4IBM mutant models. The root-mean-square deviation (RMSD) measures how far a mutant structure deviates from its native configuration: the higher the RMSD value, the greater the deviation between the two structures. Structural changes, in turn, affect functional activity. The RMSDs of all the mutant structures are listed in Table 6. The mutants rs79312957 and rs121913156 have RMSD values of 6.025 and 0.436, higher than the values of 6.019 and 0.404 for their respective native structures. These two nsSNPs may therefore affect the structure of the protein; both were also predicted to be deleterious by the PROVEAN and PolyPhen servers. The 3D structures of the native INSR crystal structures 2HR7 and 4IBM and the predicted mutant structures were superimposed over chain A. The superimposed structures suggested that the mutations might considerably affect the protein structure and thus its function (Figure 3; only rs79312957 is shown). Substituted amino acid residues in the mutants might alter the conformation of the INSR, the network of interactions among neighboring amino acids, or the interaction between the substrate and the receptor [28, 29].
Table 6

RMSD and total energy after energy minimization of native structures and their mutant 3D models.

| Molecule | RMSD (Å) | Total energy after energy minimization (kJ/mol) |
|---|---|---|
| 2HR7 native-type structure | 6.019 | −22087.6969 |
| 2HR7 mutant (rs52836744) | 6.007 | −21968.7347 |
| 2HR7 mutant (rs121913145) | 5.985 | −21815.0585 |
| 2HR7 mutant (rs1051691) | 5.997 | −22160.7304 |
| 2HR7 mutant (rs55816055) | 5.957 | −21903.7516 |
| 2HR7 mutant (rs78827745) | 5.991 | −21816.4353 |
| 2HR7 mutant (rs79312957) | 6.025 | −21962.4576 |
| 2HR7 mutant (rs140762552) | 5.992 | −21823.0448 |
| 2HR7 mutant (rs141484557) | 5.995 | −22076.7362 |
| 2HR7 mutant (rs199659271) | 5.98 | −21736.5417 |
| 4IBM native-type structure | 0.404 | −13041.4646 |
| 4IBM mutant (rs121913156) | 0.436 | −13091.3512 |
| 4IBM mutant (rs52800171) | 0.39 | −12830.5659 |
| 4IBM mutant (rs199580495) | 0.402 | −11940.1628 |
| 4IBM mutant (rs201979105) | 0.376 | −13076.6808 |

Figure 3

Superimposition of native and mutant structures. Native structure (2HR7) shows arginine at position 386 (a) and mutant modeled structure rs79312957 (R386C) shows cysteine residue at the corresponding position (b). (c) shows the superimposition of the native structure (red) with mutant modeled structure (green) in position 386.
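
The comparison drawn from Table 6 is simply each mutant's value minus that of its native structure. A minimal Python sketch, using three 2HR7 rows copied from the table and a hypothetical helper name, makes the arithmetic explicit.

```python
# Native 2HR7 values and three mutants, copied from Table 6
NATIVE_2HR7 = {"rmsd": 6.019, "energy": -22087.6969}
MUTANTS_2HR7 = {
    "rs79312957": {"rmsd": 6.025, "energy": -21962.4576},
    "rs1051691": {"rmsd": 5.997, "energy": -22160.7304},
    "rs199659271": {"rmsd": 5.98, "energy": -21736.5417},
}


def delta_from_native(native, mutant):
    """Return (RMSD change in Å, energy change in kJ/mol) for one mutant."""
    d_rmsd = round(mutant["rmsd"] - native["rmsd"], 3)
    d_energy = round(mutant["energy"] - native["energy"], 4)
    return d_rmsd, d_energy


for snp_id, values in MUTANTS_2HR7.items():
    d_rmsd, d_energy = delta_from_native(NATIVE_2HR7, values)
    print(f"{snp_id}: dRMSD = {d_rmsd:+.3f} Å, dE = {d_energy:+.4f} kJ/mol")
```

A positive energy difference means the minimized mutant model has a higher (less favorable) total energy than the native model; among these three rows only rs79312957 has a positive RMSD difference.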

3.4. Effects of nsSNP on Protein Stability

The effects of the nsSNPs on protein stability were computed with FOLD-X and the mCSM server. FOLD-X uses an empirical energy equation to calculate the change in Gibbs free energy (DDG); the empirical energy terms take into account the location and type of the substituted residue. The mCSM is a structure-based prediction tool. Two analysis protocols were used to obtain maximum information on the effect of the single amino acid substitutions: (1) all the nsSNPs were considered individually, and their effect on protein stability and interaction potential was determined; (2) the nsSNPs were considered according to the allelic sequences. Initially, all the structures were minimized to obtain a stable protein stability value. The structures for each single amino acid variation were then generated using the Build Model feature of FOLD-X 3.0. Finally, the effect of each single amino acid variation on the stability of INSR was determined using the analyzed complex features. In the FOLD-X analysis, a mutation was considered destabilizing when the DDG was >0 and stabilizing when it was <0; the mCSM values in Table 7 follow the opposite sign convention, with a negative ΔΔG indicating destabilization. All the mutant structures ultimately derived from the PROVEAN, PolyPhen, and I-Mutant analyses were submitted to the mCSM server to predict the change in protein stability upon mutation. The mCSM predicted all structures as “Destabilizing,” including two as “Highly Destabilizing” (Table 7).
Table 7

Protein stability upon mutation.

| Molecule | Mutation | PDB residue number | RSA (%) | Predicted ΔΔG | Outcome |
|---|---|---|---|---|---|
| 2HR7 native-type structure | | | | | |
| 2HR7 mutant (rs52836744) | G58R | 31 | 33.2 | −1.11 | Destabilizing |
| 2HR7 mutant (rs121913145) | H236R | 209 | 30.5 | −0.248 | Destabilizing |
| 2HR7 mutant (rs1051691) | I448T | 421 | 0 | −2.403 | Highly Destabilizing |
| 2HR7 mutant (rs55816055) | S353P | 326 | 15.5 | −0.339 | Destabilizing |
| 2HR7 mutant (rs78827745) | M65K | 38 | 0 | −1.568 | Destabilizing |
| 2HR7 mutant (rs79312957) | R413C | 386 | 70.2 | −0.931 | Destabilizing |
| 2HR7 mutant (rs140762552) | T107M | 80 | 2.3 | −0.387 | Destabilizing |
| 2HR7 mutant (rs141484557) | G262S | 235 | 70.3 | −0.775 | Destabilizing |
| 2HR7 mutant (rs199659271) | C219R | 192 | 3.2 | −0.039 | Destabilizing |
| 4IBM native-type structure | | | | | |
| 4IBM mutant (rs121913156) | R1201Q | 1174 | 11.5 | −1.347 | Destabilizing |
| 4IBM mutant (rs52800171) | W1220L | 1193 | 2.1 | −3.223 | Highly Destabilizing |
| 4IBM mutant (rs199580495) | S1033F | 1006 | 16.7 | −0.888 | Destabilizing |
| 4IBM mutant (rs201979105) | S1221A | 1194 | 9.5 | −1.858 | Destabilizing |
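
The outcome labels in Table 7 can be reproduced from the predicted ΔΔG values with a simple rule. In the sketch below, the function name is hypothetical and the −2.0 cut-off for “Highly Destabilizing” is an assumption chosen to be consistent with the table (−1.858 is labelled “Destabilizing”, −2.403 “Highly Destabilizing”); it is not taken from the text.

```python
def classify_mcsm(ddg, high_cutoff=-2.0):
    """Label a predicted ΔΔG, with negative values meaning destabilization.

    high_cutoff is an assumed boundary for 'Highly Destabilizing'.
    """
    if ddg > 0:
        return "Stabilizing"
    if ddg <= high_cutoff:
        return "Highly Destabilizing"
    if ddg < 0:
        return "Destabilizing"
    return "No change"


# Predicted ΔΔG values copied from Table 7
table7 = {"I448T": -2.403, "W1220L": -3.223, "S1221A": -1.858, "C219R": -0.039}
for mutation, ddg in table7.items():
    print(mutation, ddg, classify_mcsm(ddg))
```

Under this rule only I448T and W1220L fall into the “Highly Destabilizing” class, in agreement with the two highlighted rows of Table 7.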

4. Discussion

SNPs in INSR can manifest as several insulin-resistance syndromes, such as Leprechaunism, Rabson-Mendenhall syndrome, and type A insulin resistance [30, 31]. Diagnosis is already established on clinical examination together with laboratory tests, with elevated insulin levels as a constant feature. Functional and DNA analysis can be used for absolute confirmation, but certain mutations do not affect insulin binding, and DNA analysis is still unable to identify all the putative mutations. Although there is no direct genotype-phenotype correlation, mutations in the alpha subunit of the insulin receptor are associated with a more severe phenotype than mutations affecting the beta subunit [32]. Numerous studies have used in silico approaches to predict the functional effects of nsSNPs on genes such as G6PD, BARF, and PTEN [33, 34]. We therefore chose an in silico strategy to analyze and predict the functional effects of SNPs on INSR. We used in silico methods that combine two distinct approaches: sequence-based and structure-based. Compared with structure-based methods, sequence-based prediction methods have an advantage because they can be applied to any protein with known relatives, whereas structure-based approaches cannot be applied to proteins with unknown 3D structures. Software programs and servers that integrate both sequence and structure resources have the advantage of being able to assess the reliability of the predicted results by cross-referencing the results from both methods. Most computational methods use this information for the prediction and analysis of deleterious nsSNPs, and the PROVEAN and PolyPhen algorithms are the main representatives.
Taking a normalized probability score below −2.5 in PROVEAN and a PSIC score above 1.5 in PolyPhen as deleterious, 24 and 21 amino acid substitutions, respectively, were predicted to have a functional impact on the INSR gene. The variation between the PROVEAN and PolyPhen prediction scores arises mainly from differences in sequence alignment and in the values used to classify the variants. Significant similarity was observed between the results obtained by PROVEAN and PolyPhen, and using the two together might be a suitable in silico approach for predicting the effect of nsSNPs on protein function [35]. To predict the impact of nsSNPs on protein structure, I-Mutant 3.0 was used to evaluate the stability change upon single-site mutation. I-Mutant 3.0 was ranked as one of the most reliable predictors in the work of Khan and Vihinen [36]. Based on the difference in Gibbs free energy between mutant and wild-type protein, 45 nsSNPs were found to largely destabilize the protein. Structures of several human INSRs are available in the PDB and have been used to analyze the effect of polymorphisms. A 3D structure is essential for analyzing the impact of SNPs at the structural level. We therefore searched for the 3D structures most similar to human INSR through the EMBL PDBsum program and, on the basis of the highest sequence similarity and alignment, selected 2HR7 and 4IBM from the PDB. The deleterious and disease-related nsSNPs predicted by PROVEAN and PolyPhen were then submitted to the LS-SNP/PDB server to map the SNPs onto the 2HR7 and 4IBM crystal structures. The mutant structures served as a valuable tool for comparing protein stability, RMSD, and energy between the wild-type and mutant structures. Each mutation was considered individually to study the inherent effect of the SNP.
In addition, the allelic sequences were analyzed to investigate whether polymorphisms that occur simultaneously neutralize each other, as a natural means of preserving function. The mCSM server was used to analyze the effects of single amino acid variations on the structure and stability of the protein. All 13 mutant structures of 2HR7 and 4IBM were predicted as “Destabilizing,” which supported the results obtained with PROVEAN and PolyPhen. Among the destabilized mutant structures, two were labelled “Highly Destabilizing”: rs1051691 (I448T) and rs52800171 (W1220L). These polymorphisms should therefore be considered potential targets for future experiments. If a single amino acid variation changes protein stability or a protein-protein interaction, the reverse mutation should give a comparable value with the opposite sign. Such agreement would indicate that the predicted effect of the single amino acid variation on the protein structure or protein-protein interaction might be substantial.
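The sign-reversal expectation for forward and reverse mutations can be expressed as a simple consistency check. The sketch below assumes that the stability changes are given as ΔΔG values in kcal/mol and that a tolerance of 0.5 kcal/mol counts as "comparable"; the function name, the tolerance, and the example numbers are all hypothetical and are not taken from the study or from the mCSM server.

```python
def is_antisymmetric(ddg_forward: float, ddg_reverse: float, tol: float = 0.5) -> bool:
    """Check that forward and reverse stability changes roughly cancel.

    For a consistent predictor, the forward mutation (e.g. A -> B) and the
    reverse mutation (B -> A) should give values of similar magnitude and
    opposite sign, so their sum should be close to zero.
    The tolerance `tol` (kcal/mol) is an illustrative assumption.
    """
    return abs(ddg_forward + ddg_reverse) <= tol


if __name__ == "__main__":
    # Hypothetical values: magnitudes agree and signs are opposite.
    print(is_antisymmetric(-1.8, 1.6))
    # Hypothetical values: both negative, so the pair is inconsistent.
    print(is_antisymmetric(-1.8, -0.3))
```

A pair that fails this check suggests the prediction for that position should be treated with more caution.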

5. Conclusions

This study shows a correlation between SNPs in the INSR gene and insulin-resistance syndromes such as Leprechaunism, Rabson-Mendenhall syndrome, and type A insulin resistance. The present study concludes that 13 nsSNPs, especially rs1051691 and rs52800171, decrease protein stability and are not tolerated or may result in loss of function. Their presence in the INSR increases the possibility of altered transcriptional and cell cycle regulation and of INSR mediated diseases. Therefore, the probability of their involvement in disease predisposition increases. These mutations should thus be given priority in further analysis to obtain detailed information on their effects. To confirm the structures modeled in this study, the actual structures should be determined by X-ray crystallography or nuclear magnetic resonance spectroscopy. We anticipate that the results of our analysis will provide useful information to researchers and can help bridge the gap between biologists and bioinformaticians.
  36 in total

1.  Using SIFT and PolyPhen to predict loss-of-function and gain-of-function mutations.

Authors:  Sarah E Flanagan; Ann-Marie Patch; Sian Ellard
Journal:  Genet Test Mol Biomarkers       Date:  2010-08

2.  Molecular defects in insulin action.

Authors:  C R Kahn; B J Goldstein
Journal:  Science       Date:  1989-07-07       Impact factor: 47.728

3.  LS-SNP/PDB: annotated non-synonymous SNPs mapped to Protein Data Bank structures.

Authors:  Michael Ryan; Mark Diekhans; Stephanie Lien; Yun Liu; Rachel Karchin
Journal:  Bioinformatics       Date:  2009-04-15       Impact factor: 6.937

4.  The human insulin receptor cDNA: the structural basis for hormone-activated transmembrane signalling.

Authors:  Y Ebina; L Ellis; K Jarnagin; M Edery; L Graf; E Clauser; J H Ou; F Masiarz; Y W Kan; I D Goldfine
Journal:  Cell       Date:  1985-04       Impact factor: 41.582

Review 5.  Clinical relevance of genetic polymorphisms in the human CYP2C subfamily.

Authors:  J A Goldstein
Journal:  Br J Clin Pharmacol       Date:  2001-10       Impact factor: 4.335

6.  Genotype-phenotype correlation in inherited severe insulin resistance.

Authors:  Nicola Longo; Yuhuan Wang; Shelley A Smith; Sharon D Langley; Linda A DiMeglio; Daniel Giannella-Neto
Journal:  Hum Mol Genet       Date:  2002-06-01       Impact factor: 6.150

7.  Human non-synonymous SNPs: server and survey.

Authors:  Vasily Ramensky; Peer Bork; Shamil Sunyaev
Journal:  Nucleic Acids Res       Date:  2002-09-01       Impact factor: 16.971

8.  Cystic fibrosis patients bearing both the common missense mutation Gly----Asp at codon 551 and the delta F508 mutation are clinically indistinguishable from delta F508 homozygotes, except for decreased risk of meconium ileus.

Authors:  A Hamosh; T M King; B J Rosenstein; M Corey; H Levison; P Durie; L C Tsui; I McIntosh; M Keston; D J Brock
Journal:  Am J Hum Genet       Date:  1992-08       Impact factor: 11.025

9.  Genome-Wide Characterization of Major Intrinsic Proteins in Four Grass Plants and Their Non-Aqua Transport Selectivity Profiles with Comparative Perspective.

Authors:  Abul Kalam Azad; Jahed Ahmed; Md Asraful Alum; Md Mahbub Hasan; Takahiro Ishikawa; Yoshihiro Sawa; Maki Katsuhara
Journal:  PLoS One       Date:  2016-06-21       Impact factor: 3.240

10.  mCSM: predicting the effects of mutations in proteins using graph-based signatures.

Authors:  Douglas E V Pires; David B Ascher; Tom L Blundell
Journal:  Bioinformatics       Date:  2013-11-26       Impact factor: 6.937

