
In silico identification of genetic variants in glucocerebrosidase (GBA) gene involved in Gaucher's disease using multiple software tools.

Madhumathi Manickam, Palaniyandi Ravanan, Pratibha Singh, Priti Talwar.

Abstract

Gaucher's disease (GD) is an autosomal recessive disorder caused by a deficiency of glucocerebrosidase, a lysosomal enzyme that catalyses the hydrolysis of the glycolipid glucocerebroside to ceramide and glucose. Polymorphisms in the GBA gene have been associated with the development of GD. We hypothesize that predicting SNPs with multiple state-of-the-art software tools will increase confidence in the identification of SNPs involved in GD. Enzyme replacement therapy is the only treatment option for GD. Our goal is to use several state-of-the-art SNP algorithms to predict harmful SNPs through comparative analysis. In this study, seven different algorithms (SIFT, MutPred, nsSNP Analyzer, PANTHER, PMUT, PROVEAN, and SNPs&GO) were used to predict harmful polymorphisms. SIFT identified 47 nsSNPs as deleterious and MutPred identified 46 nsSNPs as harmful. nsSNP Analyzer predicted 43 of the 47 nsSNPs to be disease causing, PANTHER found 32 of the 47 to be highly deleterious, PMUT classified 22 of the 47 as pathological mutations, PROVEAN predicted 44 of the 47 to be deleterious, and SNPs&GO predicted all 47 to be disease-related mutations. Twenty-two nsSNPs were predicted by all seven algorithms. These 22 common mutations are F251L, C342G, W312C, P415R, R463C, D127V, A309V, G46E, G202E, P391L, Y363C, Y205C, W378C, I402T, S366R, F397S, Y418C, P401L, G195E, W184R, R48W, and T43R.


Keywords:  MutPred; PANTHER; PMUT; PROVEAN; SIFT; SNPs&GO; glucocerebrosidase

Year:  2014        PMID: 24904648      PMCID: PMC4034330          DOI: 10.3389/fgene.2014.00148

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

Gaucher's disease (GD) is a rare genetic disease in which fatty substances accumulate in cells and certain organs (James et al., 2006). It is a common lysosomal storage disorder and results from an inborn deficiency of the enzyme glucocerebrosidase (also known as acid β-glucosidase), which is responsible for the degradation of glucocerebroside (glucosylceramide). Because of the enzyme deficiency, undegraded substrate accumulates, mainly within cells of the monocyte/macrophage lineage, and this accumulation is responsible for the clinical manifestations of the disease (Beutler and Grabowski, 2001). The enzyme is encoded by the GBA gene, which is 7.6 kb in length and located at the 1q21 locus. Recessive mutations in the GBA gene affect both males and females (Horowitz et al., 1989; Zimran et al., 1991; Winfield et al., 1997). The GBA protein is 497 amino acids long with a molecular weight of 55.6 kDa. The GBA enzyme catalyses the breakdown of glucosylceramide, a cell membrane constituent of white blood cells and red blood cells. The macrophages fail to eliminate this waste product, which results in the accumulation of lipids in fibrils, and the cells turn into Gaucher cells (Aharon et al., 2004). GD is classified into three types: 1, 2, and 3. In type 1, glucosylceramide accumulates in visceral organs, whereas in types 2 and 3 the accumulation is in the central nervous system (Grabowski, 2008). The international disease frequency of GD is 200,000 except for areas of the world with large Ashkenazi Jewish populations where 60% of the patients are estimated to be homozygous, which accounts for 75% of disease alleles (Pilar et al., 2012). Almost 300 unique mutations have been reported in the GBA gene, with a distribution that spans the entire gene. These include 203 missense mutations, 18 nonsense mutations, 36 small insertions or deletions that lead to frameshift or in-frame alterations, 14 splice junction mutations, and 13 complex alleles carrying two or more mutations (Hruska et al., 2008). Single nucleotide variations in the genome that occur at a frequency of more than 1% are referred to as single nucleotide polymorphisms (SNPs); in the human genome, SNPs occur about once every 3000 base pairs (Cargill et al., 1999). Nearly 200 mutations in the GBA gene have been described in patients with GD types 1, 2, and 3 (Jmoudiak and Futerman, 2005). The L444P mutation was identified in the GBA gene in patients with GD types 1, 2, and 3, and the L444P substitution is one of the major SNPs associated with the GBA gene. D409H, A456P, and V460V mutations were also identified in patients with GD (Tsuji et al., 1987; Latham et al., 1990). Previous findings have shown that, in 60 patients with types 1 and 3, the most common Gaucher mutations identified were N370S, L444P, and R463C (Sidransky et al., 1994). Another mutation, E326K, has been identified in patients with all three types of GD, but in each instance it was found on the same allele as another GBA mutation. Park et al. also identified the E326K allele in 1.3% of patients with GD and in 0.9% of controls, indicating that it is a polymorphism (Park et al., 2002). To date, the harmful SNPs of the GBA gene have not been predicted in silico. We therefore designed a strategy for analyzing the entire GBA coding region. Different algorithms, namely SIFT (Ng and Henikoff, 2001), MutPred (Li et al., 2009), nsSNP Analyzer (Bao et al., 2005), PANTHER (Mi et al., 2012), PMUT (Costa et al., 2002), PROVEAN (Choi et al., 2012), and SNPs&GO (Calabrese et al., 2009), were used to predict high-risk nonsynonymous single nucleotide polymorphisms (nsSNPs) in coding regions that are likely to affect the function and structure of the protein.

Materials and methods

Data set

SNPs associated with the GBA gene were retrieved from the single nucleotide polymorphism database (dbSNP) (http://www.ncbi.nlm.nih.gov/snp/) and are referred to by their reference SNP IDs (rsID) (Wheeler et al., 2005).

Validation of tolerated and deleterious SNPs

A genetic variation that causes a single amino acid substitution (AAS) in a protein sequence is called an nsSNP. An nsSNP could potentially influence the function of the protein, subsequently altering the phenotype of the carrier. The Sorting Intolerant From Tolerant (SIFT) algorithm (http://sift.jcvi.org) was used to predict whether an AAS affects protein function. To assess the effect of a substitution, SIFT assumes that important positions in a protein sequence have been conserved throughout evolution and that substitutions at these positions may therefore affect protein function. Thus, by using sequence homology, SIFT predicts the effects of all possible substitutions at each position in the protein sequence. The protocol typically takes 5–20 min, depending on the input (Kumar et al., 2009).

Prediction of harmful mutations

MutPred (http://mutdb.org/mutpred) models changes in structural features and functional sites between the mutant and wild-type sequences. These changes are expressed as probabilities of gain or loss of structure and function. The MutPred output contains a general score (g), i.e., the probability that the AAS is deleterious/disease-associated, and the top five property scores (p), where p is the P-value that certain structural and functional properties are impacted. Certain combinations of high general scores and low property scores are referred to as hypotheses (Li et al., 2009).

Identifying disease-associated nsSNPs

nsSNP Analyzer (http://snpanalyzer.uthsc.edu) is a tool that predicts whether an nsSNP has a phenotypic effect (disease-associated vs. neutral). It uses a machine learning method called Random Forest and extracts structural and evolutionary information from the query nsSNP (Bao et al., 2005).

Prediction of deleterious nsSNPs

PANTHER (http://pantherdb.org/tools/csnpScoreForm.jsp) estimates the likelihood of a particular nsSNP to cause a functional impact on a protein (Thomas et al., 2003). It calculates the substitution position-specific evolutionary conservation (subPSEC) score based on the alignment of evolutionarily related proteins. The subPSEC score is the negative logarithm of the probability ratio of the wild-type and the mutant amino acids at a particular position. The subPSEC scores are values from 0 (neutral) to about −10 (most likely to be deleterious).
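
The link between subPSEC and the probability of a deleterious effect (Pdeleterious) can be sketched in a few lines of Python. The logistic form below, with its midpoint at subPSEC = −3, is the mapping usually given for PANTHER and is an assumption here, since the text does not state it. It does reproduce the Pdeleterious values listed in Table 4 to the reported precision. The function name is illustrative only.

```python
import math

def p_deleterious(subpsec: float) -> float:
    """Assumed PANTHER mapping: P = 1 / (1 + exp(subPSEC + 3)).

    A subPSEC of -3 gives 0.5; more negative scores approach 1.
    """
    return 1.0 / (1.0 + math.exp(subpsec + 3.0))

# subPSEC values copied from Table 4 for three variants.
for variant, score in [("L371V", -3.34802), ("R48W", -7.03366), ("G202E", -1.32995)]:
    print(variant, round(p_deleterious(score), 5))
```

Under this assumed mapping, a cutoff of subPSEC = −3 is the same as requiring Pdeleterious above 0.5.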

Prediction of pathological mutations on proteins

PMUT (http://mmb2.pcb.ub.es:8080/PMut) predicts disease-associated mutations. The PMUT method is based on neural networks (NNs) trained with a large database of neutral mutations (NEMUs) and pathological mutations of mutational hot spots, which are obtained by alanine scanning, massive mutation, and genetically accessible mutations. The final output is displayed as a pathogenicity index ranging from 0 to 1 (indexes > 0.5 signal pathological mutations) and a confidence index ranging from 0 (low) to 9 (high) (Costa et al., 2005).

Predicting the functional effect of amino acid substitutions

PROVEAN (Protein Variation Effect Analyzer) (http://provean.jcvi.org) is a sequence-based predictor that estimates the effect of protein sequence variation on protein function (Choi et al., 2012). It is based on a clustering method in which BLAST hits with more than 75% global sequence identity are clustered together; the top 30 such clusters form the supporting sequence set, and scores are averaged within and across clusters to generate the final PROVEAN score. A protein variant is predicted to be “deleterious” if the final score is below a certain threshold (default is −2.5), and “neutral” if the score is above the threshold.
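
The two-level averaging and the threshold call can be illustrated with a minimal Python sketch. It is not the PROVEAN implementation: the per-sequence scores are hypothetical numbers chosen for illustration, the function names are invented, and treating a score exactly at the threshold as deleterious is an assumption (the text only distinguishes "below" from "above").

```python
PROVEAN_CUTOFF = -2.5  # default threshold quoted in the text

def provean_score(clusters):
    """Average scores within each cluster, then average the cluster means."""
    cluster_means = [sum(scores) / len(scores) for scores in clusters]
    return sum(cluster_means) / len(cluster_means)

def provean_call(score, cutoff=PROVEAN_CUTOFF):
    """Assumed tie rule: a score equal to the cutoff counts as deleterious."""
    return "deleterious" if score <= cutoff else "neutral"

# Hypothetical per-sequence scores grouped into three clusters.
example_clusters = [[-3.0, -5.0], [-1.0], [-2.0, -2.0, -2.0]]
final_score = provean_score(example_clusters)
print(final_score, provean_call(final_score))
```

Averaging within clusters first keeps a large cluster of near-identical sequences from dominating the final score.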

Prediction of disease related mutations

The SNPs&GO algorithms (http://snps-and-go.biocomp.unibo.it/snps-and-go/) predict the impact of protein variations using functional information encoded by Gene Ontology (GO) terms of the three main roots: Molecular function, Biological process, and Cellular component (Calabrese et al., 2009). SNPs&GO is a support vector machine (SVM) based web server to predict disease related mutations from the protein sequence, scoring with accuracy of 82% and Matthews correlation coefficient equal to 0.63. SNPs&GO collects, in a unique framework, information derived from protein sequence, protein sequence profile and protein functions.

Results

nsSNPs found by SIFT program

The protein sequence, together with the mutational positions and amino acid residue variants of 97 missense nsSNPs, was submitted as input to the SIFT server; the results are shown in Table 1. The lower the tolerance index, the higher the functional impact a particular amino acid substitution is likely to have, and vice versa. Among the 97 nsSNPs analyzed, 47 were identified as deleterious with a tolerance index score ≤0.05 (Kumar et al., 2009). Among these 47 deleterious nsSNPs, 25 were found to be highly deleterious.
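
The ≤0.05 rule can be written as a small Python helper. This is only a sketch of the thresholding step, not of SIFT itself, and the function and variable names are illustrative. Table 1 lists some substitutions with a score of exactly 0.05 as Damaging and others as Tolerated, presumably because the displayed scores are rounded, so the handling of that boundary below is an assumption.

```python
SIFT_CUTOFF = 0.05  # tolerance index threshold quoted in the text

def sift_call(tolerance_index, cutoff=SIFT_CUTOFF):
    """Label a substitution from its tolerance index (lower = more damaging)."""
    return "Damaging" if tolerance_index <= cutoff else "Tolerated"

# A few (variant, score) pairs copied from Table 1.
table1_sample = {"F251L": 0.01, "K79N": 0.52, "L444P": 0.00, "E326K": 0.86}
calls = {variant: sift_call(score) for variant, score in table1_sample.items()}
damaging = sorted(v for v, label in calls.items() if label == "Damaging")
print(damaging)
```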
Table 1

Tolerated and deleterious nsSNPs using SIFT.

S. No | rsID | Alleles | Position | AA change | Prediction | Score
1 | rs121908314 | L/V | 371 | Leu/Val | Damaging | 0.04
2 | rs121908313 | F/L | 251 | Phe/Leu | Damaging | 0.01
3 | rs121908312 | K/N | 79 | Lys/Asn | Tolerated | 0.52
4 | rs121908311 | G/S | 377 | Gly/Ser | Damaging | 0.02
5 | rs121908310 | V/F | 398 | Val/Phe | Damaging | 0.01
6 | rs121908308 | R/G | 353 | Arg/Gly | Tolerated | 0.38
7 | rs121908307 | S/T | 364 | Ser/Thr | Tolerated | 0.12
8 | rs121908306 | C/G | 342 | Cys/Gly | Damaging | 0.01
9 | rs121908305 | G/R | 325 | Gly/Arg | Tolerated | 0.44
10 | rs121908304 | W/C | 312 | Trp/Cys | Damaging | 0.00
11 | rs121908303 | F/V | 216 | Phe/Val | Damaging | 0.00
12 | rs121908302 | V/L | 15 | Val/Leu | Tolerated | 0.07
13 | rs121908301 | G/S | 478 | Gly/Ser | Tolerated | 0.17
14 | rs121908300 | Y/H | 212 | Tyr/His | Damaging | 0.03
15 | rs121908299 | P/S | 122 | Pro/Ser | Tolerated | 0.37
16 | rs121908298 | P/L | 289 | Pro/Leu | Tolerated | 0.48
17 | rs121908297 | K/Q | 157 | Lys/Gln | Tolerated | 0.06
18 | rs121908295 | P/R | 415 | Pro/Arg | Damaging | 0.00
19 | rs80356773 | R/H | 496 | Arg/His | Tolerated | 0.19
20 | rs80356772 | R/H | 463 | Arg/His | Tolerated | 0.06
21 | rs80356771 | R/C | 463 | Arg/Cys | Damaging | 0.02
22 | rs80356769 | V/L | 394 | Val/Leu | Damaging | 0.03
23 | rs80356765 | A/T | 338 | Ala/Thr | Tolerated | 0.39
24 | rs80356763 | R/L | 131 | Arg/Leu | Tolerated | 0.24
25 | rs80205046 | P/L | 182 | Pro/Leu | Damaging | 0.00
26 | rs80116658 | G/D | 265 | Gly/Asp | Damaging | 0.00
27 | rs80020805 | M/I | 416 | Met/Ile | Tolerated | 0.42
28 | rs79945741 | F/L | 213 | Phe/Leu | Tolerated | 0.18
29 | rs79796061 | D/V | 127 | Asp/Val | Damaging | 0.00
30 | rs79696831 | R/H | 285 | Arg/His | Damaging | 0.00
31 | rs79653797 | R/Q | 120 | Arg/Gln | Damaging | 0.00
32 | rs79637617 | P/L | 122 | Pro/Leu | Damaging | 0.02
33 | rs79215220 | P/R | 266 | Pro/Arg | Damaging | 0.00
34 | rs79185870 | F/L | 417 | Phe/Leu | Damaging | 0.01
35 | rs78973108 | R/Q | 257 | Arg/Gln | Tolerated | 0.05
36 | rs78911246 | G/V | 189 | Gly/Val | Damaging | 0.02
37 | rs78802049 | D/E | 409 | Asp/Glu | Tolerated | 0.32
38 | rs78769774 | R/Q | 48 | Arg/Gln | Tolerated | 0.06
39 | rs78715199 | D/E | 380 | Asp/Glu | Damaging | 0.00
40 | rs78396650 | A/V | 309 | Ala/Val | Damaging | 0.00
41 | rs78198234 | H/R | 311 | His/Arg | Damaging | 0.00
42 | rs78188205 | A/D | 318 | Ala/Asp | Tolerated | 0.63
43 | rs77959976 | M/I | 123 | Met/Ile | Tolerated | 1.00
44 | rs77834747 | I/S | 119 | Ile/Ser | Tolerated | 0.34
45 | rs77829017 | G/E | 46 | Gly/Glu | Damaging | 0.01
46 | rs77738682 | N/I | 392 | Asn/Ile | Damaging | 0.00
47 | rs77451368 | G/E | 202 | Gly/Glu | Damaging | 0.02
48 | rs77369218 | D/V | 409 | Asp/Val | Tolerated | 0.06
49 | rs77321207 | Y/C | 395 | Tyr/Cys | Damaging | 0.00
50 | rs77284004 | D/A | 380 | Asp/Ala | Damaging | 0.00
51 | rs77035024 | F/L | 411 | Phe/Leu | Tolerated | 0.30
52 | rs77019233 | N/K | 117 | Asn/Lys | Tolerated | 0.21
53 | rs76910485 | P/L | 391 | Pro/Leu | Damaging | 0.00
54 | rs76763715 | N/S | 370 | Asn/Ser | Damaging | 0.05
55 | rs76763715 | N/T | 370 | Asn/Thr | Damaging | 0.04
56 | rs76539814 | T/I | 323 | Thr/Ile | Tolerated | 0.48
57 | rs76228122 | Y/C | 363 | Tyr/Cys | Damaging | 0.00
58 | rs76026102 | Y/C | 205 | Tyr/Cys | Damaging | 0.00
59 | rs76014919 | W/C | 378 | Trp/Cys | Damaging | 0.00
60 | rs75954905 | F/L | 37 | Phe/Leu | Tolerated | 0.30
61 | rs75671029 | D/N | 443 | Asp/Asn | Tolerated | 0.93
62 | rs75636769 | A/E | 190 | Ala/Glu | Tolerated | 1.00
63 | rs75564605 | I/T | 402 | Ile/Thr | Damaging | 0.04
64 | rs75548401 | T/M | 369 | Thr/Met | Tolerated | 0.08
65 | rs75528494 | S/R | 366 | Ser/Arg | Damaging | 0.03
66 | rs75385858 | N/T | 396 | Asn/Thr | Damaging | 0.00
67 | rs75243000 | F/S | 397 | Phe/Ser | Damaging | 0.02
68 | rs75090908 | D/E | 399 | Asp/Glu | Tolerated | 0.17
69 | rs74979486 | R/Q | 359 | Arg/Gln | Tolerated | 0.05
70 | rs74953658 | D/E | 24 | Asp/Glu | Damaging | 0.01
71 | rs74752878 | Y/C | 418 | Tyr/Cys | Damaging | 0.00
72 | rs74731340 | S/N | 271 | Ser/Asn | Tolerated | 0.26
73 | rs74598136 | P/L | 401 | Pro/Leu | Damaging | 0.00
74 | rs74500255 | F/Y | 216 | Phe/Tyr | Tolerated | 0.34
75 | rs74462743 | G/E | 195 | Gly/Glu | Damaging | 0.00
76 | rs61748906 | W/R | 184 | Trp/Arg | Damaging | 0.00
77 | rs11558184 | R/Q | 353 | Arg/Gln | Tolerated | 0.59
78 | rs2230288 | E/K | 326 | Glu/Lys | Tolerated | 0.86
79 | rs1141820 | H/R | 60 | His/Arg | Tolerated | 0.54
80 | rs1141818 | H/Y | 60 | His/Tyr | Tolerated | 0.09
81 | rs1141815 | M/T | 53 | Met/Thr | Tolerated | 0.59
82 | rs1141814 | R/W | 48 | Arg/Trp | Damaging | 0.00
83 | rs1141812 | R/S | 44 | Arg/Ser | Tolerated | 0.14
84 | rs1141811 | T/I | 43 | Thr/Ile | Damaging | 0.01
85 | rs1141811 | T/R | 43 | Thr/Arg | Damaging | 0.02
86 | rs1141808 | E/K | 41 | Glu/Lys | Tolerated | 0.52
87 | rs1141804 | S/G | 16 | Ser/Gly | Tolerated | 1.00
88 | rs1141802 | L/S | 15 | Leu/Ser | Tolerated | 0.63
89 | rs1064651 | D/H | 409 | Asp/His | Tolerated | 0.05
90 | rs1064648 | R/H | 329 | Arg/His | Tolerated | 0.17
91 | rs1064644 | S/P | 196 | Ser/Pro | Tolerated | 0.17
92 | rs421016 | L/P | 444 | Leu/Pro | Damaging | 0.00
93 | rs381737 | F/I | 213 | Phe/Ile | Tolerated | 0.18
94 | rs381427 | V/E | 191 | Val/Glu | Tolerated | 0.16
95 | rs381427 | V/G | 191 | Val/Gly | Tolerated | 0.16
96 | rs368060 | A/P | 456 | Ala/Pro | Tolerated | 0.09
97 | rs364897 | N/S | 188 | Asn/Ser | Tolerated | 0.17

The consensus SNPs are shown in bold.


Validation of harmful mutations

The MutPred score is the probability that an AAS is deleterious/disease-associated. A missense mutation with a MutPred score >0.5 can be considered “harmful,” while a MutPred score >0.75 should be considered a high confidence “harmful” prediction (Li et al., 2009). Among the 47 deleterious nsSNPs, 8 were found to be harmful mutations with a score >0.5 and <0.75, 38 were found to be high confidence (highly harmful) mutations, and 1 nsSNP was found to be neutral with a score of 0.193 (Table 2).
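
The two cutoffs translate into a three-way label, sketched below in Python. This covers only the thresholding of the general score, not MutPred itself. The function name is illustrative, and assigning a score of exactly 0.5 or 0.75 to the lower class is an assumption, since the text only uses strict inequalities.

```python
from collections import Counter

def mutpred_call(g):
    """Label a MutPred general score using the >0.5 and >0.75 cutoffs."""
    if g > 0.75:
        return "High confidence"
    if g > 0.5:
        return "Harmful mutation"
    return "Neutral"

# A few (variant, score) pairs copied from Table 2.
table2_sample = {"D127V": 0.754, "W312C": 0.735, "G377S": 0.193,
                 "T43I": 0.504, "G265D": 0.963}
counts = Counter(mutpred_call(g) for g in table2_sample.values())
print(counts)
```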
Table 2

Prediction of functional effects of nsSNPs using MutPred.

S. No | rsID | Alleles | Position | AA change | MutPred prediction | Score
1 | rs121908314 | L/V | 371 | Leu/Val | High confidence | 0.824
2 | rs121908313 | F/L | 251 | Phe/Leu | High confidence | 0.778
3 | rs121908311 | G/S | 377 | Gly/Ser | Neutral | 0.193
4 | rs121908310 | V/F | 398 | Val/Phe | High confidence | 0.765
5 | rs121908306 | C/G | 342 | Cys/Gly | High confidence | 0.792
6 | rs121908304 | W/C | 312 | Trp/Cys | Harmful mutation | 0.735
7 | rs121908303 | F/V | 216 | Phe/Val | High confidence | 0.879
8 | rs121908300 | Y/H | 212 | Tyr/His | High confidence | 0.82
9 | rs121908295 | P/R | 415 | Pro/Arg | High confidence | 0.914
10 | rs80356771 | R/C | 463 | Arg/Cys | Harmful mutation | 0.664
11 | rs80356769 | V/L | 394 | Val/Leu | High confidence | 0.794
12 | rs80205046 | P/L | 182 | Pro/Leu | High confidence | 0.892
13 | rs80116658 | G/D | 265 | Gly/Asp | High confidence | 0.963
14 | rs79796061 | D/V | 127 | Asp/Val | High confidence | 0.754
15 | rs79696831 | R/H | 285 | Arg/His | High confidence | 0.884
16 | rs79653797 | R/Q | 120 | Arg/Gln | High confidence | 0.902
17 | rs79637617 | P/L | 122 | Pro/Leu | High confidence | 0.835
18 | rs79215220 | P/R | 266 | Pro/Arg | High confidence | 0.836
19 | rs79185870 | F/L | 417 | Phe/Leu | High confidence | 0.905
20 | rs78911246 | G/V | 189 | Gly/Val | Harmful mutation | 0.713
21 | rs78715199 | D/E | 380 | Asp/Glu | High confidence | 0.837
22 | rs78396650 | A/V | 309 | Ala/Val | High confidence | 0.776
23 | rs78198234 | H/R | 311 | His/Arg | High confidence | 0.873
24 | rs77829017 | G/E | 46 | Gly/Glu | High confidence | 0.856
25 | rs77738682 | N/I | 392 | Asn/Ile | High confidence | 0.814
26 | rs77451368 | G/E | 202 | Gly/Glu | Harmful mutation | 0.676
27 | rs77321207 | Y/C | 304 | Tyr/Cys | High confidence | 0.909
28 | rs77284004 | D/A | 380 | Asp/Ala | High confidence | 0.872
29 | rs76910485 | P/L | 391 | Pro/Leu | High confidence | 0.889
30 | rs76763715 | N/S | 370 | Asn/Ser | High confidence | 0.876
31 | rs76763715 | N/T | 370 | Asn/Thr | High confidence | 0.89
32 | rs76228122 | Y/C | 363 | Tyr/Cys | High confidence | 0.93
33 | rs76026102 | Y/C | 205 | Tyr/Cys | High confidence | 0.857
34 | rs76014919 | W/C | 378 | Trp/Cys | High confidence | 0.842
35 | rs75564605 | I/T | 402 | Ile/Thr | High confidence | 0.838
36 | rs75528494 | S/R | 366 | Ser/Arg | Harmful mutation | 0.681
37 | rs75385858 | N/T | 396 | Asn/Thr | High confidence | 0.848
38 | rs75243000 | F/S | 397 | Phe/Ser | Harmful mutation | 0.724
39 | rs74953658 | D/E | 24 | Asp/Glu | High confidence | 0.818
40 | rs74752878 | Y/C | 418 | Tyr/Cys | High confidence | 0.872
41 | rs74598136 | P/L | 401 | Pro/Leu | High confidence | 0.888
42 | rs74462743 | G/E | 195 | Gly/Glu | High confidence | 0.859
43 | rs61748906 | W/R | 184 | Trp/Arg | High confidence | 0.902
44 | rs1141814 | R/W | 48 | Arg/Trp | High confidence | 0.804
45 | rs1141811 | T/I | 43 | Thr/Ile | Harmful mutation | 0.504
46 | rs1141811 | T/R | 43 | Thr/Arg | Harmful mutation | 0.579
47 | rs421016 | L/P | 444 | Leu/Pro | High confidence | 0.899

The consensus SNPs are shown in bold.


Disease-associated nsSNPs

Of the 47 deleterious nsSNPs, nsSNP Analyzer predicted 43 to be disease causing and 4 to be neutral (Table 3).
Table 3

The results from nsSNP Analyzer, PMUT, PROVEAN, and SNPs&GO.

S. No | rsID | Allele | Position | AA change | nsSNP Analyzer | PMUT | PROVEAN score | PROVEAN prediction | SNPs&GO
1 | rs121908314 | L/V | 371 | Leu/Val | Neutral | Neutral | −2.331 | Neutral | Disease
2 | rs121908313 | F/L | 251 | Phe/Leu | Disease | Pathological | −4.567 | Deleterious | Disease
3 | rs121908311 | G/S | 377 | Gly/Ser | Disease | Neutral | −5.128 | Deleterious | Disease
4 | rs121908310 | V/F | 398 | Val/Phe | Disease | Neutral | −4.185 | Deleterious | Disease
5 | rs121908306 | C/G | 342 | Cys/Gly | Disease | Pathological | −11.467 | Deleterious | Disease
6 | rs121908304 | W/C | 312 | Trp/Cys | Disease | Pathological | −12.258 | Deleterious | Disease
7 | rs121908303 | F/V | 216 | Phe/Val | Disease | Neutral | −7 | Deleterious | Disease
8 | rs121908300 | Y/H | 212 | Tyr/His | Disease | Neutral | −4.267 | Deleterious | Disease
9 | rs121908295 | P/R | 415 | Pro/Arg | Disease | Pathological | −8.793 | Deleterious | Disease
10 | rs80356771 | R/C | 463 | Arg/Cys | Disease | Pathological | −5.279 | Deleterious | Disease
11 | rs80356769 | V/L | 394 | Val/Leu | Neutral | Neutral | −2.031 | Neutral | Disease
12 | rs80205046 | P/L | 182 | Pro/Leu | Disease | Neutral | −9.917 | Deleterious | Disease
13 | rs80116658 | G/D | 265 | Gly/Asp | Disease | Neutral | −6.442 | Deleterious | Disease
14 | rs79796061 | D/V | 127 | Asp/Val | Disease | Pathological | −8.625 | Deleterious | Disease
15 | rs79696831 | R/H | 285 | Arg/His | Disease | Neutral | −4.792 | Deleterious | Disease
16 | rs79653797 | R/Q | 120 | Arg/Gln | Disease | Neutral | −3.641 | Deleterious | Disease
17 | rs79637617 | P/L | 122 | Pro/Leu | Disease | Neutral | −9.265 | Deleterious | Disease
18 | rs79215220 | P/R | 266 | Pro/Arg | Disease | Neutral | −8.275 | Deleterious | Disease
19 | rs79185870 | F/L | 417 | Phe/Leu | Disease | Neutral | −5.095 | Deleterious | Disease
20 | rs78911246 | G/V | 189 | Gly/Val | Disease | Neutral | −6.4 | Deleterious | Disease
21 | rs78715199 | D/E | 380 | Asp/Glu | Neutral | Neutral | −3.797 | Deleterious | Disease
22 | rs78396650 | A/V | 309 | Ala/Val | Disease | Pathological | −3.533 | Deleterious | Disease
23 | rs78198234 | H/R | 311 | His/Arg | Disease | Neutral | −7.667 | Deleterious | Disease
24 | rs77829017 | G/E | 46 | Gly/Glu | Disease | Pathological | −5.925 | Deleterious | Disease
25 | rs77738682 | N/I | 392 | Asn/Ile | Disease | Neutral | −7.593 | Deleterious | Disease
26 | rs77451368 | G/E | 202 | Gly/Glu | Disease | Pathological | −5.178 | Deleterious | Disease
27 | rs77321207 | Y/C | 304 | Tyr/Cys | Disease | Neutral | −8.358 | Deleterious | Disease
28 | rs77284004 | D/A | 380 | Asp/Ala | Disease | Neutral | −7.593 | Deleterious | Disease
29 | rs76910485 | P/L | 391 | Pro/Leu | Disease | Pathological | −9.269 | Deleterious | Disease
30 | rs76763715 | N/S | 370 | Asn/Ser | Neutral | Neutral | −2.128 | Neutral | Disease
31 | rs76763715 | N/T | 370 | Asn/Thr | Disease | Neutral | −3.062 | Deleterious | Disease
32 | rs76228122 | Y/C | 363 | Tyr/Cys | Disease | Pathological | −8.492 | Deleterious | Disease
33 | rs76026102 | Y/C | 205 | Tyr/Cys | Disease | Pathological | −7.552 | Deleterious | Disease
34 | rs76014919 | W/C | 378 | Trp/Cys | Disease | Pathological | −12.306 | Deleterious | Disease
35 | rs75564605 | I/T | 402 | Ile/Thr | Disease | Pathological | −4.363 | Deleterious | Disease
36 | rs75528494 | S/R | 366 | Ser/Arg | Disease | Pathological | −2.806 | Deleterious | Disease
37 | rs75385858 | N/T | 396 | Asn/Thr | Disease | Neutral | −5.562 | Deleterious | Disease
38 | rs75243000 | F/S | 397 | Phe/Ser | Disease | Pathological | −4.782 | Deleterious | Disease
39 | rs74953658 | D/E | 24 | Asp/Glu | Disease | Neutral | −3.037 | Deleterious | Disease
40 | rs74752878 | Y/C | 418 | Tyr/Cys | Disease | Pathological | −8.526 | Deleterious | Disease
41 | rs74598136 | P/L | 401 | Pro/Leu | Disease | Pathological | −8.136 | Deleterious | Disease
42 | rs74462743 | G/E | 195 | Gly/Glu | Disease | Pathological | −7.767 | Deleterious | Disease
43 | rs61748906 | W/R | 184 | Trp/Arg | Disease | Pathological | −13.028 | Deleterious | Disease
44 | rs1141814 | R/W | 48 | Arg/Trp | Disease | Pathological | −6.879 | Deleterious | Disease
45 | rs1141811 | T/I | 43 | Thr/Ile | Disease | Neutral | −3.515 | Deleterious | Disease
46 | rs1141811 | T/R | 43 | Thr/Arg | Disease | Pathological | −2.557 | Deleterious | Disease
47 | rs421016 | L/P | 444 | Leu/Pro | Disease | Neutral | −4.995 | Deleterious | Disease

The consensus SNPs are shown in bold.


Validation by PANTHER

The protein sequence was given as input and analyzed for deleterious effects on protein function. The subPSEC scores range from 0 (neutral) to about −10 (most likely to be deleterious) (Thomas et al., 2003). Of the 47 deleterious nsSNPs, 8 had subPSEC scores below −6 (highly deleterious) and the rest were less deleterious. A mutant with a greater Pdeleterious tends to have a more severe functional impairment. Overall, 32 of the 47 deleterious nsSNPs had a subPSEC magnitude greater than 3, and the rest were below the damage threshold (Table 4).
Table 4

Mutant scores from PANTHER.

S. No | rsID | Alleles | Position | AA change | subPSEC | Pdeleterious
1 | rs121908314 | L/V | 371 | Leu/Val | −3.34802 | 0.58614
2 | rs121908313 | F/L | 251 | Phe/Leu | −2.59088 | 0.39912
3 | rs121908311 | G/S | 377 | Gly/Ser | −5.35062 | 0.91298
4 | rs121908310 | V/F | 398 | Val/Phe | −3.36629 | 0.59056
5 | rs121908306 | C/G | 342 | Cys/Gly | −3.57193 | 0.63921
6 | rs121908304 | W/C | 312 | Trp/Cys | −2.59838 | 0.40092
7 | rs121908303 | F/V | 216 | Phe/Val | −4.88341 | 0.868
8 | rs121908300 | Y/H | 212 | Tyr/His | −5.32716 | 0.9111
9 | rs121908295 | P/R | 415 | Pro/Arg | −4.90228 | 0.87015
10 | rs80356771 | R/C | 463 | Arg/Cys | −4.45218 | 0.81033
11 | rs80356769 | V/L | 394 | Val/Leu | −2.8436 | 0.46098
12 | rs80205046 | P/L | 182 | Pro/Leu | −6.3153 | 0.96495
13 | rs80116658 | G/D | 265 | Gly/Asp | −6.00914 | 0.95299
14 | rs79796061 | D/V | 127 | Asp/Val | −6.29967 | 0.96442
15 | rs79696831 | R/H | 285 | Arg/His | −4.32962 | 0.79078
16 | rs79653797 | R/Q | 120 | Arg/Gln | −4.52062 | 0.82063
17 | rs79637617 | P/L | 122 | Pro/Leu | −4.49053 | 0.81616
18 | rs79215220 | P/R | 266 | Pro/Arg | −6.27743 | 0.96365
19 | rs79185870 | F/L | 417 | Phe/Leu | −4.16977 | 0.7631
20 | rs78911246 | G/V | 189 | Gly/Val | −3.38537 | 0.59517
21 | rs78715199 | D/E | 380 | Asp/Glu | −2.02623 | 0.27413
22 | rs78396650 | A/V | 309 | Ala/Val | −4.2769 | 0.78192
23 | rs78198234 | H/R | 311 | His/Arg | −4.57198 | 0.82807
24 | rs77829017 | G/E | 46 | Gly/Glu | −5.04065 | 0.885
25 | rs77738682 | N/I | 392 | Asn/Ile | −4.02188 | 0.73534
26 | rs77451368 | G/E | 202 | Gly/Glu | −1.32995 | 0.15842
27 | rs77321207 | Y/C | 304 | Tyr/Cys | −6.26737 | 0.96329
28 | rs77284004 | D/A | 380 | Asp/Ala | −2.36947 | 0.34739
29 | rs76910485 | P/L | 391 | Pro/Leu | −6.12534 | 0.95793
30 | rs76763715 | N/S | 370 | Asn/Ser | −2.69603 | 0.42459
31 | rs76763715 | N/T | 370 | Asn/Thr | −1.97735 | 0.26451
32 | rs76228122 | Y/C | 363 | Tyr/Cys | −4.75749 | 0.85289
33 | rs76026102 | Y/C | 205 | Tyr/Cys | −5.89294 | 0.9475
34 | rs76014919 | W/C | 378 | Trp/Cys | −5.31772 | 0.91033
35 | rs75564605 | I/T | 402 | Ile/Thr | −3.78009 | 0.6857
36 | rs75528494 | S/R | 366 | Ser/Arg | −2.07688 | 0.28432
37 | rs75385858 | N/T | 396 | Asn/Thr | −3.61569 | 0.64924
38 | rs75243000 | F/S | 397 | Phe/Ser | −2.88329 | 0.47086
39 | rs74953658 | D/E | 24 | Asp/Glu | −4.17446 | 0.76395
40 | rs74752878 | Y/C | 418 | Tyr/Cys | −6.31864 | 0.96506
41 | rs74598136 | P/L | 401 | Pro/Leu | −2.14888 | 0.2992
42 | rs74462743 | G/E | 195 | Gly/Glu | −4.74669 | 0.85153
43 | rs61748906 | W/R | 184 | Trp/Arg | −3.5793 | 0.64091
44 | rs1141814 | R/W | 48 | Arg/Trp | −7.03366 | 0.9826
45 | rs1141811 | T/I | 43 | Thr/Ile | −4.20869 | 0.77007
46 | rs1141811 | T/R | 43 | Thr/Arg | −4.05221 | 0.7412
47 | rs421016 | L/P | 444 | Leu/Pro | −3.43747 | 0.60766

The consensus SNPs are shown in bold.


Functional impact of mutations on proteins

The functional impact of the 47 deleterious nsSNPs in the GBA protein was analyzed using the PMUT server. Of the 47 nsSNPs, 22 were classified as pathological and the remaining 25 as neutral (Table 3).

Protein variation effect analysis

PROVEAN predicts the effect of the variant on the biological function of the protein based on sequence homology. PROVEAN scores are classified as “deleterious” if below a certain threshold (here −2.5) and “neutral” if above it (Choi et al., 2012). Out of 47 nsSNPs, 44 were predicted to be “deleterious” and 3 were found to be “neutral” (Table 3).

Prediction of disease related mutations by SNPs&GO

SNPs&GO is trained and tested with cross-validation procedures in which similar proteins are placed together in the same dataset to calculate the LGO score derived from the GO database. All 47 deleterious nsSNPs were predicted to be disease-related mutations (Table 3).

Discussion

In recent years, SNPs have emerged as a new generation of molecular markers. To date, the harmful SNPs of the GBA gene had not been predicted in silico. This study was designed to understand the genetic variations associated with the GBA gene. We predicted the harmful nsSNPs using seven state-of-the-art computational tools: SIFT, MutPred, nsSNP Analyzer, PANTHER, PMUT, PROVEAN, and SNPs&GO. Among 97 nsSNPs, SIFT identified 47 as deleterious, with a tolerance index score of ≤0.05. Among these 47 deleterious nsSNPs, MutPred found 46 to be harmful, nsSNP Analyzer found 43 to be disease causing, PANTHER found 32 to be highly deleterious, PMUT classified 22 as pathological mutations, PROVEAN predicted 44 to be deleterious, and SNPs&GO predicted all 47 to be disease-related mutations. We also found that SNPs&GO was the most successful of the state-of-the-art SNP prediction programs used in this comparative study. In this work, we found 22 nsSNPs that were common to the predictions of all seven tools (SIFT, MutPred, nsSNP Analyzer, PANTHER, PMUT, PROVEAN, and SNPs&GO) (Figure 1). This set of 22 nsSNPs (F251L, C342G, W312C, P415R, R463C, D127V, A309V, G46E, G202E, P391L, Y363C, Y205C, W378C, I402T, S366R, F397S, Y418C, P401L, G195E, W184R, R48W, and T43R) possibly represents the main targeted mutations for GD (Tables 1–4). Previous work has shown that, in 60 patients with types 1 and 3, the most common Gaucher mutations identified were L444P, N370S, and R463C. L444P was the most common mutation in GD types 1, 2, and 3 (Latham et al., 1990; Sidransky et al., 1994). In our analysis, 6 of the 7 methods (SIFT, MutPred, PROVEAN, PANTHER, nsSNP Analyzer, and SNPs&GO) predicted the L444P mutation as damaging, 3 methods predicted the N370S mutation as damaging, and all 7 methods predicted the R463C mutation as damaging. D409H, A456P, E326K, and V460V mutations were also identified in patients with GD (Tsuji et al., 1987; Park et al., 2002). In our analysis, SIFT predicted the D409H, A456P, and E326K mutations to be tolerated. Further studies of these mutations will shed light on the genetic understanding of this major lysosomal storage disease.
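
The per-tool agreement described above can be sketched as a simple vote count in Python. The per-variant values are copied from Tables 1–4. The cutoffs for SIFT, MutPred, and PROVEAN are the ones quoted in the text, while the PANTHER cutoff of subPSEC ≤ −3 is an assumption, and all names in the snippet are illustrative. With these settings the tallies for L444P, N370S, and R463C agree with the counts reported in this section.

```python
# Per-tool outputs for three variants, copied from Tables 1-4.
RESULTS = {
    "R463C": {"sift": 0.02, "mutpred": 0.664, "nssnp": "Disease",
              "pmut": "Pathological", "provean": -5.279,
              "snpsgo": "Disease", "panther": -4.45218},
    "L444P": {"sift": 0.00, "mutpred": 0.899, "nssnp": "Disease",
              "pmut": "Neutral", "provean": -4.995,
              "snpsgo": "Disease", "panther": -3.43747},
    "N370S": {"sift": 0.05, "mutpred": 0.876, "nssnp": "Neutral",
              "pmut": "Neutral", "provean": -2.128,
              "snpsgo": "Disease", "panther": -2.69603},
}

# One predicate per tool; True means the tool flags the variant as damaging.
FLAGGED = {
    "sift": lambda v: v <= 0.05,
    "mutpred": lambda v: v > 0.5,
    "nssnp": lambda v: v == "Disease",
    "pmut": lambda v: v == "Pathological",
    "provean": lambda v: v <= -2.5,
    "snpsgo": lambda v: v == "Disease",
    "panther": lambda v: v <= -3.0,  # assumed cutoff, not stated in the text
}

def votes(variant):
    """Count how many of the seven tools flag the variant."""
    return sum(FLAGGED[tool](value) for tool, value in RESULTS[variant].items())

# A variant is in the consensus only if every tool flags it.
consensus = [v for v in RESULTS if votes(v) == len(FLAGGED)]
for v in RESULTS:
    print(v, votes(v))
```

Requiring all tools to agree is a strict rule; a majority vote would be a looser alternative with more candidates and lower confidence in each.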
Figure 1

Sets of various mutations identified using various software tools. The respective locations of 44 amino acids responsible for all 47 mutations are shown in the sequence (center, colored in bold) and 22 common mutations are highlighted as consensus.


Author contributions

Madhumathi Manickam, Priti Talwar, and Palaniyandi Ravanan wrote the main manuscript and analyzed original datasets. Pratibha Singh prepared tables and figure. All authors reviewed the manuscript.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1.  PMUT: a web-based tool for the annotation of pathological mutations on proteins.

Authors:  Carles Ferrer-Costa; Josep Lluis Gelpí; Leire Zamakola; Ivan Parraga; Xavier de la Cruz; Modesto Orozco
Journal:  Bioinformatics       Date:  2005-05-06       Impact factor: 6.937

2.  Phenotype, diagnosis, and treatment of Gaucher's disease.

Authors:  Gregory A Grabowski
Journal:  Lancet       Date:  2008-10-04       Impact factor: 79.321

3.  High frequency of the Gaucher disease mutation at nucleotide 1226 among Ashkenazi Jews.

Authors:  A Zimran; T Gelbart; B Westwood; G A Grabowski; E Beutler
Journal:  Am J Hum Genet       Date:  1991-10       Impact factor: 11.025

4.  The human glucocerebrosidase gene and pseudogene: structure and evolution.

Authors:  M Horowitz; S Wilder; Z Horowitz; O Reiner; T Gelbart; E Beutler
Journal:  Genomics       Date:  1989-01       Impact factor: 5.736

5.  Complex alleles of the acid beta-glucosidase gene in Gaucher disease.

Authors:  T Latham; G A Grabowski; B D Theophilus; F I Smith
Journal:  Am J Hum Genet       Date:  1990-07       Impact factor: 11.025

6.  A mutation in the human glucocerebrosidase gene in neuronopathic Gaucher's disease.

Authors:  S Tsuji; P V Choudary; B M Martin; B K Stubblefield; J A Mayor; J A Barranger; E I Ginns
Journal:  N Engl J Med       Date:  1987-03-05       Impact factor: 91.245

Review 7.  Gaucher disease: mutation and polymorphism spectrum in the glucocerebrosidase gene (GBA).

Authors:  Kathleen S Hruska; Mary E LaMarca; C Ronald Scott; Ellen Sidransky
Journal:  Hum Mutat       Date:  2008-05       Impact factor: 4.878

8.  DNA mutational analysis of type 1 and type 3 Gaucher patients: how well do mutations predict phenotype?

Authors:  E Sidransky; A Bottler; B Stubblefield; E I Ginns
Journal:  Hum Mutat       Date:  1994       Impact factor: 4.878

9.  Predicting the functional effect of amino acid substitutions and indels.

Authors:  Yongwook Choi; Gregory E Sims; Sean Murphy; Jason R Miller; Agnes P Chan
Journal:  PLoS One       Date:  2012-10-08       Impact factor: 3.240

10.  PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees.

Authors:  Huaiyu Mi; Anushya Muruganujan; Paul D Thomas
Journal:  Nucleic Acids Res       Date:  2012-11-27       Impact factor: 16.971
