Literature DB >> 29930383

Comprehensive in-silico prediction of damage associated SNPs in Human Prolidase gene.

Richa Bhatnager1, Amita S Dang2.   

Abstract

Prolidase is cytosolic manganese dependent exopeptidase responsible for the catabolism of imido di and tripeptides. Prolidase levels have been associated with a number of diseases such as bipolar disorder, erectile dysfunction and varied cancers. Single nucleotide polymorphism present in coding region of proteins (nsSNPs) has the potential to alter the primary structure as well as function of the protein. Hence, it becomes necessary to differentiate the potential harmful nsSNPs from the neutral ones. 19 nsSNPs were predicted as damaging by in-silico analysis of 298 nsSNPs retrieved from dbSNP database. Consurf analysis showed 18 out of 19 substitutions were present in the conserved regions. 4 substitutions (D276N, D287N, E412K, and G448R) that observed to have damaging effect are present in catalytic pocket. Four SNPs listed in splice site region were found to affect splicing of mRNA by altering acceptor site. On 3'UTR scan of 77 SNPs listed in SNP database, 9 SNPs were lead to alter miRNA target sites. These results provide a filtered data to explore the effect of uncharacterized nsSNP and SNP related to UTRs and splice site of prolidase to find their association with the disease susceptibility and to design the target dependent drugs for therapeutics.

Entities:  

Mesh:

Substances:

Year:  2018        PMID: 29930383      PMCID: PMC6013436          DOI: 10.1038/s41598-018-27789-0

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Tissues are not only made up of cells, a valuable part of their volume is extracellular space, which is largely filled by a complex network of macromolecules constituting the extracellular matrix. This matrix is a well defined network of a variety of proteins and polysaccharides, which are in close association with the cell surface that secreted them. Collagen is the main component of extracellular matrix. Collagen is not only integral component of ECM but also has been known as a ligand for integrin receptors, playing an important role in signaling that regulate lipid metabolism, transport of ion, activation of various kinases and gene expression[1]. Therefore, any modification in the structure, quantity, and distribution of collagens in tissues affect a number of physiological processes like cell signaling, metabolism and function. Collagen catabolism involves the activity of various enzyme acting at different step. Its final step of degradation is the breakdown of imido dipeptides and tripeptides. Prolidase (E.C. 3.4.13.9) is a cytosolic exopeptidase that specifically cleaves imido dipeptides and imido tripeptides with C-terminal proline or hydroxyproline and releases free proline[2]. In this way prolidase recycles proline for collagen metabolism and serves as a rate limiting step in collagen metabolism. Any change in prolidase activity leads to disturbed collagen metabolism and results in diseased state[3,4]. Physiological levels of prolidase found to be associated with a number of diseases but still its exact role is obscure. It has been found that prolidase level is decreased to a significant extent in prolidase deficiency. Gene expression and post transcriptional modifications can have the potential to change the physiological level of prolidase. Prolidase gene (PEPD) is located on chromosome 19, contain 15 exon which encodes a polypeptide of 493 amino acids with molecular weight 54 kDa[5-7]. It is a dimer having two identical subunits. In humans, two isoforms of prolidase are present i.e. PDI and PDII. Nonsynonymous polymorphisms are those point mutations that insert amino acid change in the protein structure. Primary amino acid sequence is one of the factors which are responsible for mature protein structure as well as function of the protein. As these alterations can affect protein structure then it become important to study the effect of these polymorphisms on structure and function in detail and to figure out highly damaging mutation from the neutral one. Most of the SNPs of prolidase are still uncharacterized in terms of their disease causing potential. From last few years, in-silico approaches have been widely employed to identify the impact of deleterious nsSNP in candidate genes by utilizing information like conservation of residues, structural attributes and physiochemical properties of peptides[8]. The in silico approaches offer advantages over the lab based characterization because of their reliability, convenience, speed, and of lower cost to find such variants that have the potential to regulate the function of prolidase protein[8,9]. So, present study has been carried out to extend and explore the effect of nsSNPs on the stability and function of the prolidase. Here we have used a set of computational techniques to prioritize the deleterious nsSNPs reported in the prolidase gene.

Results

Prolidase is a well known dipeptidase which cleaves imino di and tripeptides containing proline. It possess both carboxypeptidase and aminopeptidase activity to cleave proline dipeptides. During final step of collagen degradation imido dipeptides are formed, prolidase degrades them and releases free proline for collagen resynthesis. Beside the dipeptidase function, prolidase also plays an important role as a detoxificant against chemical agents and pesticides. Human prolidase had a sequence similarity of 22% with OPAA (organophosphorus acid anhydrolase) which is also involved in the hydrolysis of pro-X combinations in Mn2+ dependent manner[10]. It also has been found that recombinant human prolidase also had both the activity i.e. hydrolysis of prolyl-glycyl peptides and digestion of organophosphate containing compounds[1]. Non synonymous mutation may leads to alteration in protein structure and function. In present analysis, sequence based, evolutionary and machine learning softwares were used to characterize the deleterious SNP from all the listed nsSNP in human.

Retrieval of SNP ids

All the nsSNP were retrieved from dbSNP database (build150) using filters. A total of 26240 SNPs are reported in prolidase gene out of which 18308 SNPs are reported in Human prolidase gene. On further selection, 107 SNPs were found to be UTRs variants, 17730 as intron variants and 298 (292 missense, 6 nonsense) were nsSNPs (Fig. 1). By this data nsSNPs contribute to only 1.62% of all the SNPs reported in human prolidase gene. Protein ID used in analysis is NP_000276.1.
Figure 1

SNP distribution of Prolidase gene.

SNP distribution of Prolidase gene.

Deleterious SNP prediction by SIFT

SIFT provides prediction for a list of nsSNP based on sequence homology and physical property of amino acids. It predicts whether the amino acid substitution at a given position is tolerated or not. This prediction is based on tolerance index (TI) where tolerance index is inversely proportional to the functional impact of substitution. rsids of 298 nsSNP were submitted for SIFT input and it predicted 39 substitutions as tolerated and 46 as deleterious as shown in (Table 1). Remaining 212 rsids were not found by SIFT server.
Table 1

Prediction of the effect of nsSNP by various tools.

S. No.rsidsAASSIFTFATHMMMutation AccessorProveanPolyphenPhd-SNPPANTHERnsSNPI-Mutant
1rs17570L435FTTNNBDisNNNo effect
2rs1063319S247LDTMDP.DDisDDisDecrease
3rs1140312D324VDTNNBNDNDecrease
4rs61734503R33WDTMDP.DDisNNDecrease
5rs61734505R148CTTLDBDisNNDecrease
6rs61734506S103NTTMNBDisNNDecrease
7rs61748998E170VTTNDBNNDecrease
8rs121917721D276NDDHDP.DDisDDisDecrease
9rs121917722R184QTDMDP.DNNDisDecrease
10rs121917723G278DDDHDP.DDisDDisDecrease
11rs121917724G448RDDHDP.DDisDDisDecrease
12rs139214756S224TDTMDP.DNNNDecrease
13rs141623136T188MDTMDP.DDisDisDisDecrease
14rs142070498D419GDTMDBNNNDecrease
15rs144944440I418VDTLNBNNN
16rs149042427E391TTTLDBNNNDecrease
17rs151278946D189GTTMDDisNNDecrease
18rs185183225R35WDTMDP.DDisDisDisDecrease
19rs186203899T78STTLNBNNNDecrease
20rs187269138R33QTTLNBNNNDecrease
21Rs188930796T137MTTLNBDisNIncrease
22rs189549581V456MDTMNP.DNNNIncrease
23rs199612179C245CTTMNDisNN
24rs199711203A21VTTLDBNNNIncrease
25rs199794147E208VDTLDBDisNNDecrease
26rs199892951G51EDTMP.DNNNIncrease
27rs200072143V472MDTMNP.DNNNDecrease
28rs200183031D324LDTNBNNNDecrease
29rs200351927I201QTTMDP.DDisNDisDecrease
30rs200435937R35QTTLNBDisNNDecrease
31rs200450538D189MDTMDP.DNDISDecrease
32rs200567073G447RDDMDP.DDisNDisIncrease
33rs200871513G235STTHDP.DNDDisDecrease
34rs200931112V305ITTNNBDisNNDecrease
35rs201089253G296EDDMDP.DDisDDisDecrease
36rs201222933R398TTTMDP.DDisDDisIncreased
37rs201447445N250HDTLDPs.DDisNNIncreased
38rs201572375M210TDTHDP.DNNDisDecrease
39rs201584435D287NDDHDP.DDisDDisDecrease
40rs201752816H72DTTMDP.DNNNDecrease
41rs201865747D87NTTMDBDisNNDecrease
42rs201992066G12RDTMDP.DNNDisDecrease
43rs267606943S202FDTHDP.DNNDisDecrease
44rs267606944E412KDDHDP.DDisDDisDecrease
45rs367841505D378NDDHDP.DDisDDisDecrease
46rs367902648S240NDDHDPs.DDisDDisDecrease
47rs368547324G246STTLDDisNDisDecrease
48rs368559424N151STTMDP.DDisNNDecrease
49rs368647287R196CTTHDP.DNNDisDecrease
50rs368651528G381CTTHDP.DDisDDisIncrease
51rs368784737V171LTTMNBDisNNDecrease
52rs368792538N436NTTLNNDisNNDecrease
53rs368995247L66CDTLNBDisNDisDecrease
54rs369878645I45VTTNNBNNNDecrease
55rs370219399F275IDTNDPs.DDisNNDecrease
56rs370370158K218TDTNDBNNNDecrease
57rs370970279H255SDTHDP.DDisDDisDecrease
58rs371556469A261RTTLDP.DDisNNDecrease
59rs371934154L403HDTHDP.DDisDDisDecrease
60rs371953949L192WDTMDP.DDisDisDisDecrease
61rs372210606C290ITTMDP.DDisNNDecrease
62rs372527759R27QDTHDP.DDisNNDecrease
63rs372530277G414STTDP.DNDDisDecrease
64rs372629704V181LTTMNBDisNNDecrease
65rs373297406G373HTDHDP.DDisDDisDecrease
66rs374162516R196HDTHDP.DNNDisDecrease
67rs374573875F117LTTMDBNNNDecrease
68rs374603111R335MDTMDPs.DNDDisIncrease
69rs374795227E227LDTNDBNNIncrease
70rs375061486Y231CDTMDPs.DNNDisIncrease
71rs375348295S93LDTMDP.DDisNNDecrease
72rs375915358S142FDTMDP.DNNDisIncrease
73rs375919385G323SDTMDP.DNDDisDecrease
74rs376211407I374KTTMDPs.DDisDDisDecrease
75rs376338457G260EDTMDP.DNNDisDecrease
76rs376372688G373CDTHDP.DNDDisDecrease
77rs376397947Y83CTTMDPs.DNNNIncrease
78rs376817734R331CTTMNBDisNNDecrease
79rs377085952P19LDDMDP.DDisDisDisIncrease
80rs377199331W326YDTLP.DNNNIncrease
81rs377429945D125NTTLNBDisNDecrease
82rs377536201I329VDTNBDisNNIncrease
83rs377685056T410VDDHDP.DDisDDisIncrease
84rs377714630F169LTTMDNDecrease
85rs377738544S224IDTMDBDisDisDisDecrease

Abbreviations: T(tolerated), D(deleterious, damaging), N(neutral), Dis(disease causing), M(medium), L(low), H(high), B(benign), P.D(probably damaging), Ps.D(possibly damaging).

Prediction of the effect of nsSNP by various tools. Abbreviations: T(tolerated), D(deleterious, damaging), N(neutral), Dis(disease causing), M(medium), L(low), H(high), B(benign), P.D(probably damaging), Ps.D(possibly damaging).

Prediction of Functional effect of non synonymous SNP by Provean

Provean predicts the functional effect of amino acid substitutions. Threshold of prediction is −2.5, above this score prediction is supposed to be neutral and below or equal −2.5 prediction is deleterious. FASTA format with substitutions predicted by SIFT server were used as input. Out of 85 substitutions submitted, 21 amino acid substitution were predicted to be neutral (score is above-2.5) and remaining 64 were having score below or equal −2.5 and might be associated with disease (Table 1).

Prediction of functional impact of mutation by mutation assessor

Mutation assessor calculates the impact of mutation on the function of protein. Its output results in FI score (functional impact combined score), VC score (variant conservation score), and VS score (variant specificity score). Functional impact categorized in two parts: predicted functional (having high and medium FI score) and predicted non functional (having low and neutral FI score). In this study, 19 mutations were found to be highly damaging, 38 having medium impact, 17 with low impact and 7 were classified as neutral (Table 1).

Prediction of functional impact of nsSNP by PANTHER

PANTHER predicts the impact of mutation on the protein function. It uses HMM and various alignment method to map the mutation and then produces result. It gives result in the form of probability of damage associated with that SNP and noted as Pdeletrious. Minimum cutoff value is −3 for Pdeletrious 0.5. Out of 85 nsSNPs, 21 were found to be damaging by PANTHER prediction (Table 1).

Functional significance of substitution by Polyphen2

Polyphen i.e. polymorphism phenotype predicts the possible effect of amino acid substitution on function and structure of protein based on a number of criteria like phylogenetic, structural information and sequence of protein. It predicts sequence based feature on the basis of PSIC (position-specific independent count) matrix, TMHMM (transmembrane helix prediction by hidden markov model) algorithm, Coils2 program and SignalP program to predict transmembrane, coiled coil and signal peptide regions of the protein sequences and structure based feature on the basis of DSSP(dictionary of secondary structure protein) database. A (PSIC) score difference was assigned using the categories ‘probably damaging’, ‘possibly damaging’, and ‘potentially damaging’, ‘borderline’ and ‘benign’. Out of 85 nsSNP used in this study, 46 substitutions were probably damaging, 8 were possibly damaging and 31 were benign (neutral) in nature (Table 1).

Disease associated SNP prediction by nsSNP analyzer and PhD SNP

Both nsSNP analyzer and PhD SNP predict the phenotypic effect of non synonymous substitution. They also predict whether the substitution is disease associated or not. By nsSNP prediction 40 substitutions are associated with disease whereas PhD SNP predicts 48 disease causing substitutions (Table 1).

Prediction the effect of nsSNP by FATHMM

FATHMM depends upon hidden markov model about the pathogenicity of a substitution. It uses two different coordinates to make any prediction i.e. non coding variants and coding variants. Coding variants further differentiates into three part to be more specific in prediction i.e. inherited diseases (used to differentiates between disease causing mutation and neutral polymorphisms), cancer (used to differentiates between cancer promoting mutations and other germ line polymorphisms), disease specific (used to predict a list of potentially relevant SNPs for the disease of interest). FATHMM uses HMM and align the homologous sequences and conserved protein to give pathogenicity index about the mutation. In our analysis, 19 mutations were found to be damaging out of 85 mutations listed in the study (Table 1).

Prediction of stability change by I-Mutant

The I-Mutant 2.0 server was developed and tested with the data extracted from ProTherm, to predict the change of protein stability due to mutation. Its prediction comes out in two forms i.e. change in DDG and ΔG. Positive G value leads to increased stability whereas negative G values correspond to decreased stability. Results of I- Mutant was summarized in Table 1.

Consensus generation

To find the most deleterious SNP, concordance was done. Substitution which was predicted as deleterious by sequence and SVM based method were selected manually. A total of 19 substitutions were found to deleterious by all the algorithms used in the study as shown in Table 2.
Table 2

Consensus of all the softwares.

S No.SNPAASSIFTProveanPolyphennsSNPPhDI-MutantConsurfNetsurfP
1.rs377085952P19LDELDELProbably damagingDiseaseDiseaseIncreaseConservedExposed
2.rs185183225R35WDELDELProbably damagingDiseaseDiseaseDecreaseVariableExposed
3.rs141623136T188MDELDELProbably damagingDiseaseDiseaseDecreaseConservedExposed
4.rs371953949L192WDELDELProbably damagingDiseaseDiseaseDecreaseIntermediateBuried
5.rs377738544S224IDELDELbenignDiseaseDiseaseDecreaseConservedExposed
6.rs367902648S240NDELDELPossibly damagingDiseaseDiseaseDecreaseConservedBuried
7.rs1063319S247LDELDELProbably damagingDiseaseDiseaseDecreaseConservedBuried
8.rs370970279H255SDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
9.rs121917721D276NDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
10.rs121917723G278DDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
11.rs201584435D287NDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
12.rs201089253G296EDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedExposed
13.rs373297406G373HDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
14.rs367841505D378NDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
15.rs371934154L403HDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
16.rs377685056T410VDELDELProbably damagingDiseaseDiseaseIncreaseMost conservedBuried
17.rs267606944E412KDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
18.rs200567073G447RDELDELProbably damagingDiseaseDiseaseIncreaseMost conservedBuried
19.rs121917724G448RDELDELProbably damagingDiseaseDiseaseDecreaseMost conservedBuried
Consensus of all the softwares.

Prediction of association of substitution with disease by Mutpred

It predicts whether the nsSNP will be disease-causing or neutral[11]. It predicts the molecular cause of disease/deleterious. Its score is the probability that predict whether the substitution affects the function of protein or not. Threshold is 0.5: higher than 0.5 could be considered as ‘harmful’, whereas >0.75 could be considered a high confidence ‘harmful’ prediction. Prediction for the SNPs of prolidase is summarized in Table 3.
Table 3

Effect of nsSNP on the structure and function of protein predicted by Mutpred.

S.No rsids Substitution Effect
1.rs377085952P19LGain of helix (P = 0.0022)Loss of loop (P = 0.0031)Gain of MoRF binding (P = 0.0759)Loss of methylation at K17 (P = 0.0844)Gain of ubiquitination at K17 (P = 0.107)
2.rs185183225R35WGain of catalytic residue at P38 (P = 0.0394)Loss of disorder (P = 0.0427)Loss of methylation at R35 (P = 0.1122)Loss of MoRF binding (P = 0.1173)Gain of helix (P = 0.1736)
3.rs141623136T188MLoss of methylation at K187 (P = 0.0313)Loss of ubiquitination at K187 (P = 0.1191)Loss of helix (P = 0.1299)Gain of sheet (P = 0.1945)Gain of MoRF binding (P = 0.2083)
4.rs371953949L192WGain of MoRF binding (P = 0.0284)Gain of methylation at K187 (P = 0.0627)Loss of ubiquitination at K187 (P = 0.1037)Loss of catalytic residue at L192 (P = 0.1737)Loss of stability (P = 0.2356)
5.rs377738544S224ILoss of disorder (P = 0.0628)Loss of catalytic residue at S224 (P = 0.0702)Loss of phosphorylation at Y220 (P = 0.0771)Loss of helix (P = 0.2022)Loss of MoRF binding (P = 0.3016)
6.rs367902648S240NLoss of catalytic residue at S240 (P = 0.0353)Loss of disorder (P = 0.0834)Loss of phosphorylation at S240 (P = 0.116)Gain of sheet (P = 0.1451)Loss of stability (P = 0.3235)
7.rs1063319S247LLoss of glycosylation at S247 (P = 0.0118)Loss of disorder (P = 0.0567)Gain of sheet (P = 0.0827)Loss of loop (P = 0.2237)Gain of stability (P = 0.2614)
8.rs370970279H255SGain of catalytic residue at H255 (P = 0.0558)Gain of disorder (P = 0.0697)Loss of sheet (P = 0.302)Gain of glycosylation at S251 (P = 0.315)Loss of stability (P = 0.4182)
9.rs121917721D276NLoss of sheet (P = 0.0817)Loss of phosphorylation at Y281 (P = 0.1679)Loss of stability (P = 0.3001)Gain of catalytic residue at D271 (P = 0.439)Loss of disorder (P = 0.6276)
10.rs121917723G278DLoss of catalytic residue at D276 (P = 0.0909)Gain of sheet (P = 0.1208)Gain of phosphorylation at Y281 (P = 0.2344)Loss of stability (P = 0.4985)Gain of disorder (P = 0.6248)
11.rs201584435D287NGain of sheet (P = 0.1208)Loss of loop (P = 0.2237)Loss of catalytic residue at D287 (P = 0.229)Loss of stability (P = 0.2971)Gain of MoRF binding (P = 0.3741)
12.rs201089253G296EGain of disorder (P = 0.0902)Gain of sheet (P = 0.1539)Gain of solvent accessibility (P = 0.1683)Loss of catalytic residue at K297 (P = 0.1817)Loss of helix (P = 0.2022)
13.rs373297406G373HLoss of sheet (P = 0.1158)Loss of stability (P = 0.2508)Gain of loop (P = 0.2754)Gain of catalytic residue at G373 (P = 0.3313)Gain of disorder (P = 0.4695)
14.rs367841505D378NGain of sheet (P = 0.0827)Loss of disorder (P = 0.1773)Gain of loop (P = 0.2754)Loss of phosphorylation at Y382 (P = 0.3328)Loss of catalytic residue at G380 (P = 0.5121)
15.rs371934154L403HGain of disorder (P = 0.0202)Loss of stability (P = 0.0827)Loss of sheet (P = 0.1158)Gain of catalytic residue at R401 (P = 0.1741)Gain of loop (P = 0.2045)
16.rs377685056T410VLoss of sheet (P = 0.1907)Loss of catalytic residue at T410 (P = 0.3448)Gain of loop (P = 0.3485)Gain of MoRF binding (P = 0.4771)Loss of glycosylation at T410 (P = 0.5011)
17.rs267606944E412KGain of methylation at E412 (P = 0.0028)Gain of ubiquitination at E412 (P = 0.0408)Loss of sheet (P = 0.0817)Gain of MoRF binding (P = 0.1652)Gain of loop (P = 0.2754)
18.rs200567073G447RGain of MoRF binding (P = 0.0245)Gain of sheet (P = 0.039)Gain of methylation at G447 (P = 0.0399)Gain of loop (P = 0.0435)Gain of solvent accessibility (P = 0.0584)
19.rs121917724G448RGain of MoRF binding (P = 0.0193)Gain of sheet (P = 0.0827)Loss of catalytic residue at V449 (P = 0.0969)Gain of methylation at R444 (P = 0.1378)Loss of helix (P = 0.2022)
Effect of nsSNP on the structure and function of protein predicted by Mutpred.

Prediction of conserved and solvent accessibility by Consurf and NetSurf P

Consurf gives the output in the form of score where score 9 represent the most conserved and 1 represent the highly variable amino acid as given in Table 2. NetSurf P prediction about solvent accessibility (exposed, buried, and partially buried) for the amino acid substitution is also given in Table 2.

Prediction of the effect of SNP located in UTR region by UTRscan Server and PolymiTRS

UTRscan server predicted the effect of UTRs on transcriptional motif. FASTA format of prolidase protein or UTRscan prediction and it predicted one signal in uoRF (Upstream Open Reading Frame) with a match 4 in 5′UTR region. PolymiRTS was employed to screen the effect of 3′UTRs on miRNA target site. It predicted 9 mutations have the potential to alter miRNA seed region. Out of these 9 mutations, 5 were INDELS whose ancestral allele cannot be determined yet but alter the miRNA target site and remaining 4 (rs140038783, rs3556, rs149914845, rs77690463) were SNPs which creates new target miRNA site as shown in Table 4.
Table 4

Predicted results of functional 3′UTR SNPs/Indels.

S.NoSNP IDAllelemiR IDmiRSiteFunction Classcontext+score change
1rs140842ACTTThsa-miR-548av-5pctgaTACTTTActttctgtcaaaaatO−0.028
hsa-miR-548kctgaTACTTTActttctgtcaaaaatO−0.028
hsa-miR-548lctgATACTTTActttctgtcaaaaatO−0.083
hsa-miR-8054ctgaTACTTTActttctgtcaaaaatO−0.028
2rs35012994ACTTThsa-miR-548lttctgATACTTTctgtcaaaaO−0.041
3rs71795604TACTThsa-miR-548lttctgATACTTTctgtcaaaaO−0.041
4rs10659604TACTThsa-miR-548a-5pgcatttctgaTACTTTActttctgtcO−0.097
hsa-miR-548abgcatttctgaTACTTTActttctgtcO−0.097
hsa-miR-548akgcatttctgaTACTTTActttctgtcO−0.076
hsa-miR-548am-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548ap-5pgcatttctgaTACTTTActttctgtcO−0.097
hsa-miR-548aq-5pgcatttctgaTACTTTActttctgtcO−0.088
hsa-miR-548ar-5pgcatttctgaTACTTTActttctgtcO−0.107
hsa-miR-548as-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548au-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548av-5pgcatttctgaTACTTTActttctgtcO−0.028
hsa-miR-548ay-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548b-5pgcatttctgaTACTTTActttctgtcO−0.076
hsa-miR-548c-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548d-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548h-5pgcatttctgaTACTTTActttctgtcO−0.076
hsa-miR-548igcatttctgaTACTTTActttctgtcO−0.097
hsa-miR-548j-5pgcatttctgaTACTTTActttctgtcO−0.097
hsa-miR-548kgcatttctgaTACTTTActttctgtcO−0.028
hsa-miR-548lgcatttctgATACTTTActttctgtcO−0.083
hsa-miR-548o-5pgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548wgcatttctgaTACTTTActttctgtcO−0.085
hsa-miR-548ygcatttctgaTACTTTActttctgtcO−0.088
hsa-miR-559gcatttctgaTACTTTActttctgtcO−0.1
hsa-miR-8054gcatttctgaTACTTTActttctgtcO−0.028
5rs201816618TCTGAhsa-miR-548lagcatttctgATACTTTctgtO−0.041
hsa-miR-548a-5pagcatTTACTTTctgtO−0.084
hsa-miR-548abagcatTTACTTTctgtO−0.084
hsa-miR-548akagcatTTACTTTctgtO−0.094
hsa-miR-548am-5pagcatTTACTTTctgtO−0.094
hsa-miR-548ap-5pagcatTTACTTTctgtO−0.112
hsa-miR-548aq-5pagcatTTACTTTctgtO−0.094
hsa-miR-548ar-5pagcatTTACTTTctgtO−0.094
hsa-miR-548as-5pagcatTTACTTTctgtO−0.084
hsa-miR-548au-5pagcatTTACTTTctgtO−0.094
hsa-miR-548ay-5pagcatTTACTTTctgtO−0.094
hsa-miR-548b-5pagcatTTACTTTctgtO−0.094
hsa-miR-548c-5pagcatTTACTTTctgtO−0.094
hsa-miR-548d-5pagcatTTACTTTctgtO−0.094
hsa-miR-548h-5pagcatTTACTTTctgtO−0.094
hsa-miR-548iagcatTTACTTTctgtO−0.084
hsa-miR-548j-5pagcatTTACTTTctgtO−0.112
hsa-miR-548o-5pagcatTTACTTTctgtO−0.094
hsa-miR-548wagcatTTACTTTctgtO−0.094
hsa-miR-548yagcatTTACTTTctgtO−0.094
hsa-miR-559agcatTTACTTTctgtO−0.103
6rs140038783Ahsa-miR-4310gaaaatAATGCTGD−0.237
hsa-miR-7157-5pgaaaatAATGCTGD−0.237
Ghsa-miR-1250-3pGAAAATGAtgctgC−0.201
hsa-miR-153-5pgAAAATGAtgctgC0.023
7rs3556Thsa-miR-3163ctgTTTTATAcctD0.091
Chsa-miR-494-3pcTGTTTCAtacctC−0.065
8rs149914845Thsa-miR-105-5pcgGCATTTGAtcaD−0.167
hsa-miR-7853-5pcgGCATTTGAtcaD−0.188
Chsa-miR-1245b-3pcggCATCTGAtcaC−0.074
hsa-miR-383-5pcggcaTCTGATCAC−0.29
hsa-miR-4772-5pcggcatCTGATCAC−0.091
9rs77690463C
Thsa-miR-4719tcttTTTGTGAtgC−0.002

miRSite: sequence context of the miRNA site: bases complementary to the seed region are in capital letters and SNPs are highlighted in bold font. Function class: D: the derived allele disrupts a conserved miRNA site (ancestral allele with support >2); C: the derived allele creates a new miRNA site; O: the ancestral allele cannot be determined. Context score: negative increase = increase in SNP functionality.

Predicted results of functional 3′UTR SNPs/Indels. miRSite: sequence context of the miRNA site: bases complementary to the seed region are in capital letters and SNPs are highlighted in bold font. Function class: D: the derived allele disrupts a conserved miRNA site (ancestral allele with support >2); C: the derived allele creates a new miRNA site; O: the ancestral allele cannot be determined. Context score: negative increase = increase in SNP functionality.

Prediction the effect of SNP located in splice site by HSF tool

HSF tool analyse the effect of any mutation on splicing signals and recognize the splicing motifs in any human gene sequence. cDNA sequence containing point mutation or insertion or deletion was submitted to HSF server and it predicted 5 SNPs from 3′ and 5′ splicing region would alter the splicing signal. Out of these 5 mutations, 4 (rs542228812, rs753775083, rs761217488 and rs907881705) were found to affect splicing of mRNA by altering acceptor site whereas rs1016478683 affect splicing by affecting donor site (Table 5).
Table 5

Effect of 5′ and 3′ splice sites.

S.NorsidsPredicted signalInterpretationExon location
1.rs542228812Broken WT Acceptor SiteAlteration of the WT acceptor site, affecting splicing7
2.rs753775083Broken WT Acceptor SiteAlteration of the WT acceptor site, most probably affecting splicing7
3.rs761217488Broken WT Acceptor SiteAlteration of the WT acceptor site, most probably affecting splicing11
4.rs907881705Broken WT Acceptor SiteAlteration of the WT acceptor site, most probably affecting splicing6
5.rs1016478683Broken WT Donor SiteAlteration of the WT donor site, most probably affecting splicing.14
6.rs1055732229Not found in HSF database
Effect of 5′ and 3′ splice sites.

Secondary structure prediction by PSIPRED

Secondary structure of prolidase was predicted by PSIPRED which showed the distribution of alpha helix, beta sheet and coils. By analysis it was found that in native structure coils contribute major portion in protein structure (48.9%) followed by alpha helix (26.5%) and β- strand (24.4%) (see Supplementary File S1). On insertion of all the 4 (D276N, D287N, E412K, G448R) damaging substitutions, major distortion was loss of strand at residues 415 and 416 ((see Supplementary File S2).

Three dimensional structure prediction by Swiss-Modeler

4 (D276N, D287N, E412K, G448R) models were generated by Swiss modeler for prolidase protein. Models with the Z-score between the ranges of 0–1 are considered as good models. Both the native and mutated models were further visualized and analyzed by UCSF Chimera (Figs 2 and 3). 3D structure of prolidase protein was of 493 amino acid residues. QMEAN, GMQE, RMSD values, energy minimization values and gradiant norms of mutated models are given in Table 6.
Figure 2

3D structure of native prolidase generated by Modeller and visualized by Pdb viewer.

Figure 3

3D structure of D276N, D287N, E412K, G448R substituted prolidase generated by Modeller and visualized by Pdb viewer.

Table 6

Quality parameters of D276N, D287N, E412K, G448R substituted models.

GMQEQMEANRMSD (metre)Potential Energy after minimization (Joules)Gradient normRAMPAGE
Number of favorable region residuesNumber of allowed region residuesoutlier
D276N0.99−0.100.096019−4254.72691096.507714467 (96.9%)15 (3.1%)0
D287N0.990.320.082341−42762.47095488.725110465 (97.1%)13 (2.7%)1 (0.2%)
E412K0.990.100.100311−44425.26653688.393204467 (96.9%)15 (3.1%)0
G448R0.990.100.101967−44357.7336989.701027468 (97.3)13 (2.7%)0
3D structure of native prolidase generated by Modeller and visualized by Pdb viewer. 3D structure of D276N, D287N, E412K, G448R substituted prolidase generated by Modeller and visualized by Pdb viewer. Quality parameters of D276N, D287N, E412K, G448R substituted models.

Model validation by RAMPAGE

Quality of all the 4 (D276N, D287N, E412K, G448R) models was checked by RAMPAGE which is a indicative of Ramachandran plot. All the substituted models are of good quality as having more than 90% region in favoured region (Table 6). Quality assessment structure of RAMPAGE prediction are given as Supplementary File S3.

Discussion

Prolidase, also known as Peptidase D or Iminopeptidase has been found in almost all the organism ranging from prokaryotes to eukaryotes[12-14]. The human enzyme is homodimeric, and found in two different isoforms i.e. PD I (higher activity against Gly-Pro dipeptides, depends on Mn+2 ion for catalysis) and PD II (higher activity against Met-Pro dipeptides and a little activity against Gly-Pro, requires Zn+2 for catalysis). In humans, PDI isoform is abundant and responsible for prolidase deficiency and collagen related disorders[14]. This dimer has a crystal structure that shows two approximately symmetrical monomers, both have an N-terminal domain, made up of a six-stranded mixed β-sheet flanked by five α-helices, a helical linker, and C-terminal domain, consisting of a mixed six-stranded β-sheet flanked by four α-helices[15]. Human prolidase protein has two domain i.e. domain ranging from 18–191 is aminopeptidase domain and 192–479 is M24 like hydrolases domain respectively. Its main activity i.e. proline dipeptidases activity is confined to a cluster around metal binding site with a conserved stretch ranging from 366–378[15]. Binuclear active metal site cluster which possess substrate binding site activates the nucleophiles and stabilize the transition state to facilitate transitions. Both the active site and metal cluster lies on the inner surface of the β-sheet of M24 domain which is anchored by the side chains of two aspartate residues (Asp276 and Asp287), two glutamate residues (Glu412 and Glu452), and a histidine residue (His370). Carboxylate group of aspartate and glutamine residues serve as bridges between the two Mn atoms as shown by PDB. Function of protein directly depends on its tertiary structure thereby modification in the amino acid may have potential to alter protein structure and can produce severe physiological effects. Alteration in physiological level of prolidase affects the final step of collagen metabolism and can cause collagen related disorders. A well known pathological condition, Prolidase deficiency is characterized by skin ulcers, micrognathia, and hypertelorism. Increased physiological levels of prolidase have been found in cardiac diseases, bipolar disorder, depression, erectile disorder, and in a number of cancer whereas in asthma, COPD, osteoarthritis, chronic pancreatitis, and in pancreatic cancer its levels were found to be decreased[11,16-24]. In-silico analysis provides us a key to predict the effect of single nucleotide polymorphism on the structure and function of a protein[25]. We used algorithms based on sequence and structure along with machine learning methods to deduce the effect of nsSNP on prolidase structure and function. 298 SNPs retrieved from dbSNP were submitted for SIFT prediction to deduce the amino acid substitution caused by these SNPs. SIFT predicted 85 substitutions that caused amino acid change based on the degree of conservation of amino acid residues in sequence alignments derived from closely related sequences, collected through PSI-BLAST. SIFT predicted 46 out of these 85 substitutions were deleterious in nature while other were neutral. These 85 substitutions were analyzed further to conclude their effect on protein structure and function. Provean predicted 64 substitutions to be damaging. Structural impact of non synonymous mutations was predicted by Polyphen2 program which predicted 46 substitutions were probably damaging, 8 possibly damaging and remaining substitutions not having any impact on protein structure. Mutation accessor, predicted 57substitutions to be damaging. nsSNP and PhD server were also employed to check the effect of these substitutions and they predicted 40 and 48 substitutions damaging respectively. Manual concurrence of all the SNPs studied by different softwares was done. Total 19 substitutions were found common in all the softwares used in the study. Effect of these nonsynonymous mutations on stability was checked by I-Mutant server which gives the prediction in the form of DDG. I- mutant predicted 16 out of 19 substitutions decrease the stability of protein whereas 3 substitutions (G447R, P19L, T410V) were found to make protein more stable Consurf predicted that out of these 19 substituted positions, 18 are highly conserved in prolidase structure (Table 2). P19, R35, T188, G296 and G447 positions are exposed in the prolidase structure while remaining 14 are buried inside as predicted by NetSurfP. These substitutions can be segregated on the basis of domain where they are found. 3 substitutions are present in aminopeptidase domain while 16 are located in M24 like domain. Furthermore out of 16 substitutions in M24 domain, 3 substitutions (D276N, D278N, E412K) are present in metal binding site. P19L entails a substitution of proline by leucine. This substitution leads to increased aggregation tendency but decrease the chaperone binding affinity. It also leads to alteration in structure by increasing the tendency to form a helix. It also generate site for ubiquitinylation making the region prone to degradation and decrease the stability thereby affecting the physiological level of prolidase. R35W mark the substitution of arginine (basic amino acid) by tryptophan (a non polar aromatic amino acid). This residue involves in the formation of helix and interacts with P38. By loss of arginine, methylation and MoRF binding activity was found to be lost as predicted by Mutpred. It also leads to gain of catalytic activity at P38 residue but decrease the stability of protein. T188M involves the substitution of theronine (polar) to methionine (non polar) although increase the stability of protein by loss of ubiquitinylation site to make protein more stable but results in loss of methylation and helix formation property. Proline dipeptidase activity of prolidase is dependent on the phosphorylation of serine/Threonine residues. Methylated serine/Threonine residues might serve as the recognition site for serine/theronine kinase resulting in pro-dipeptidase activity. Loss of methylation at ‘T’ donot confers the recognition site for kinase and decreases prolidase activity. This substitution also leads to formation of β- strand thereby altering the protein structure. Both P19L and R35W if present would to lead to disruption of aminopeptidase domain and T188 leads to decreased activity. In M24 domain, 2 SNPs leads to substitutions of leucine (L192W, L403H) by tryptophan and histidine respectively where former belong to non polar group and histidine belong to basic charged amino acid. In L192W both amino acids are non polar in nature but this substitution leads to disruption of helix because of bulky nature of tryptophan which don’t fit inside the helix. Both of these substitutions also leads to loss of chaperone binding affinity, decrease in stability of helix resulting loss of catalytic site. 3 substitutions are related to replacement of serine (S224I, S240N, S247L). They involves the substitution of serine (-OH containing amino acid) to isoleucine (non polar amino acid), arginine (Basic amino acid) and leucine respectively. All three regions forms strand in protein structure. S224I substitution increases the protein stability but results in loss of catalytic residue S at this region. Besides this, this substitution also influences the phosphorylation of tyrosine residue at 220th position leading to the loss of activity of this domain. S240N and S247L both decrease the stability of protein, loss of catalytic property thereby making the protein non functional. S247L substitution also leads to loss of glycosylation at 247th position resulting in altered catalytic site of protein.H255S substitution leads to decrease in the protein stability. It involves the substitution of Histidine (basic amino acid) to serine (OH containing amino acid) this substitution disrupt the secondary structure of protein. As deduce by the study of Roberta Besio et al.[26], it was found that Asp 276, Asp 287, His370 and Glu 412, 452 forms the catalytic site responsible for di-peptidase activity of the prolidase. Asp 287 and Glu 452 forms the binding site for Mn1 and Mn2 ion in subunit A and B as well. Glu412 binds with Mn1 and Asp 276 binds with Mn2 in both the subunits whereas His370 binds only Mn1 in subunit B. Theronine residues were found to be more conserved near this catalytic site. T289 residue helps in binding with Mn2 whereas T410 found in the site bind with Mn1[26]. Any mutation in this region would lead to loss of di-peptidase activity and contribute to prolidase deficiency. Our results also suggest that substitutions in these residues may have damaging effects. D276N decrease the protein stability, loss in strand formation and phosphorylation at Y281 residue. G278D results in loss of catalytic activity of the residue D276 but gain of phosphorylation at Y281 as predicted by Mutpred. This alteration makes the catalytic site nonfunctional. G296E and G373H substitution severely reduces the stability of protein. This substitution increases the solvent accessibility making the buried region to expose and destabilizing the structure with the loss of catalytic residue at K297. Mn(II) ions in the catalytic site are surrounded by negatively charged amino acids aspartic acid and Glutamic acid (D276, D287, E412, E452) and a phosphate group. E412K mutation decreases the negative charge by two units in the coordination sphere making it non functional. Furthermore, E412K substitution increases the aggregation tendency of protein but decrease chaperone binding property responsible for proper folding of the protein. This substitution makes the protein prone to ubiquitinylation and results in loss of strand from the protein structure. G447R and G448R substitutions both results in loss of prolidase activity. Residue G448 is inaccessible to solvent because it is buried inside the protein region. The residue lies at about 14.5 A° from the active site and is not directly involved in Mn(II) binding. G448 is a part of anti-parallel β strand combined with a short strand composed made up of residues G414, I415, Y416, F417. The G448R substitution leads the insertion of a bulky arginine side chain which is not appropriate with pairing of the two anti-parallel β strands and with the correct assembly of the b-sheet. Furthermore, the G448R mutation falls only four amino acids before residue E452, that coordinates one of the Mn(II) cofactor ions; thereby disrupting the catalytic site for di- peptidase activity. Residues ranging from 366–378 are highly conserved and results in proline di- peptidase activity. All the above listed substitutions lead to decrease in prolidase activity either by disrupting its structure or by loss of proper catalysis and phosphorylation at the sites needed for its activity. Secondary structure of native prolidase and mutation incorporated (D276N, D287N, E412K, G448R) prolidase reveals no such considerable variation. But these substitutions affect the tertiary structure of protein as being a part of catalytic site. Therefore it can predict that these 4 substitutions have potential to affect the function of prolidase protein.

Conclusion

Prolidase is an important regulator of collagen metabolism. A number of studies are present on prolidase deficiency, a rare autosomal recessive disorder. But there is lack of studies related to prolidase on molecular level. Almost all of the SNPs are still uncharacterized in their disease causing potential except those for related to prolidase deficiency. This is the first study which predicts the functional and structural impact of nsSNP on prolidase structure and function. This study differentiates disease causing mutations from neutral ones as listed in SNP database. Furthermore, the predicted disease associated nsSNP can be studied to find their association in various disease development and development in potent drug discovery. In addition to this, results of present study should be updated in relevant database so that other can use these results to make further studies[27-30].

Materials and Methods

SNP retrieval

SNP of prolidase gene and their protein sequence (FASTA format) were retrieved from dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/) and NCBI respectively for computational analysis. Selection of SNPs related to Homo sapiens was done by using filters non synonymous, missense, nonsense, stop gained SNP and human[31]. Other databases such as Exome Aggregation Consortium (ExAC), Genome Variation Server (GVS) and F-SNP were also searched to cross check the nsSNP data for prolidase gene.

Prediction of the effect of nsSNPs

nsSNPs carried out amino acid substitution was first screened by SIFT(Sorting Intolerant from Tolerant) server. Its prediction is based on the conservation and alignment of highly similar orthologoue and paralogoue protein sequences and predict the functional importance of an amino acid substitution. Positions with probability score less than 0.05 are considered to be deleterious, those greater than or equal to 0.05 are considered to be tolerated[32]. In our study, we submitted rsids retrieved from dbSNP as a query to make prediction. nsSNPs prediction by SIFT server was further used to find their effect on the structure and fnction of prolidase gene. Protein variation effect analyzer(PROVEAN) predicts whether the substitution of amino acid is deleterious or tolerated. The threshold for a mutation to be deleterious is −2.5; if below threshold, prediction will be deleterious and will be neutral if it is above threshold. Provean program can be used to predict a functional effect of single or multiple amino acid substitutions, insertions or deletion[10]. Mutation Assessor predicts the effect amino‐acid substitutions on the function of proteins by utilizing a combinatorial entropy optimization’ technique to find key residues responsible for function and then assigns a conservation score to them. This server provides semantic linking to variant analysis, annotations, variant multiple sequence alignment html page, and variant 3D structure page. Its output contains two annotation i.e. FI score (functional impact score) and functional impact (high, medium, neutral). PANTHER is a mutation analysis software that depends upon the HMM to make any prediction. It has three variants: gene list analysis, panther scoring, and evolutionary analysis of coding SNPs. In gene list analysis, it analyzes the list of gene, and expression data files with PANTHER. By Evolutionary analysis of coding SNPs it predicted the chances of a particular nonsynonymous coding SNP will cause a functional impact on the protein or not. Polyphen2 predict the functional impact of single amino acid substitution on protein function using physical and comparative models generate by the sequence information. Its prediction is based on a number of features such as sequence, structure and phylogenetic comparison to analyze the mutation[33]. PhDSNP is support vector machine based software which support the local sequence environment and output of multiple sequence alignment to predict the nature of a particular mutation. It requires input in the form of protein sequence, residue position, new residue[34]. Output is based on reliability score which predict whether the substitution is disease causing or neutral. nsSNP analyser predicts the phenotypic effect of nonsynonymous substitution. It uses multiple sequence alignment and protein 3D structure to predict the result. nsSNP Analyzer uses “Random Forest” network i.e. a machine learning method to classify the nsSNP from native one. Its prediction is purely dependent on swissprot database and was trained using a curated SNP dataset. nsSNP Analyzer summarizes the structural environment of the mutated residue and similarity between the substituted and native residue from the normalized probability of the substitution in the multiple sequence alignment[35]. FATHMM uses hidden Markov models (HMMs) to predict the functional effects of protein missense mutations and assign a pathogenicity score representing the overall tolerance of the protein/domain to mutations. A consensus of all the predictions was generated to prioritize the deleterious substitution predicted by various softwares used. It was done by manual method. Results of all the software were analyzed and substitution were selected which are found to be deleterious in all the predictions. All the prioritized nsSNP were further studied by MutPred server which is a web tool that predicted nsSNP association with disease along with molecular effect of that particular substitution. It takes the input as SIFT output and calculate 14 different structural and functional properties. It was trained utilizing the deleterious mutations reported in Human Gene Mutation Database and neutral polymorphisms from Swiss-Prot. It uses SIFT, PSI-BLAST, and Pfam profiles[36], also some structural disorder prediction algorithms, including TMHMM, MARCOIL[37], and DisProt[38]. It uses SVM v2.50 for analysis The output of MutPred consists of a general score (g), i.e., P (deleterious) the probability that the amino acid substitution is deleterious or disease-associated, and top five characteristic scores (p), where p is the P-value that certain functional and structural characteristics of the protein are impacted. Certain combinations of high values of ‘g’ (p deleterious) and low values of ‘p’ (property scores) are referred as hypotheses. • Scores for an aas with g > 0.5 and p < 0.05, are referred as actionable hypotheses. • Scores for an aas with g > 0.75 and p < 0.05, are referred as confident hypotheses. • Scores for an aas with g > 0.75 and p < 0.01, are referred as very confident hypotheses. User input involves FASTA sequence and amino acid substitutions.

Prediction of conserved residues by ConSurf

It calculates the evolutionary conservation of amino acid within a protein sequence by using empirical Bayesian inference. It gives conservation score along with color scheme. Score 9 was given to most conserved amino acid whereas 1 is given to variable amino acid[39], Consurf is available at http://consurf.tau.ac.il/.

Prediction on surface and solvent accessibility by NetSurf P

It predicts the solvent accessible surface area or solvent accessibility of amino acids to locate the active site in a fully folded protein. This prediction method relies on the Z-score, which can predict the surfaces but not secondary structures of proteins. Its ouput includes 3 subclasses meant for buried, partial buried and exposed region in protei structure[40], www.cbs.dtu.dk/services/NetSurfP/. A support vector machine based tool iMutant 2.0 predicts the change in the stability of the protein by a particular mutation. iMutant 2.0 can be utilized both as a classifier that predicts the signs of the protein stability changes upon a variation and as a regression estimator that predicts the relative change in Gibbs-free energy (ΔG) at a given temperature. It utilizes a comprehensive database based on protein mutation ProTherm[41], http://folding.biofold.org/i-mutant/i-mutant2.0.html.

Prediction of the effect of SNP located in UTR region by UTRscan Server

Untranslated regions have considerable role in the post transcriptional regulation of gene expression, stability and efficiency of translation. UTRscan server predicts the functional SNPs by BLAST search to find UTR motifs present in UTRsite[42]. Its input format requires submission of protein’s FASTA format and output was in the form of signal name and its position in the transcript, http://itbtools.ba.itb.cnr.it/utrscan.

Functionally significant 3′UTR prediction by PolymiRTS

Polymorphism in microRNA Target Site (PolymiRTS) is a repository of naturally occurring DNA mutations in the miRNA target site[43]. It predicts whether a point mutation or INDELS in 3′UTR affect the miRNA target site or not. Output was in the form of 4 categories i.e. ‘D’ (the derived allele disrupts a conserved miRNA site), ‘N’ (the derived allele disrupts a nonconserved miRNA site), ‘C’ (the derived allele creates a new miRNA site) and ‘O’ (other cases when the ancestral allele cannot be determined unambiguously) where class ‘C’ may cause abnormal gene repression and class ‘D’ may cause loss of normal repression control. These two classes of PolymiRTS are most likely to have functional impacts, http://compbio.uthsc.edu/miRSNP/. Human splicing finder(HSF) identify and predicts the effect of mutations on the splicing motifs including the acceptor and donor splice sites, the branch point and auxiliary sequences known to either enhance or repress splicing: Exonic Splicing Enhancers (ESE) and Exonic Splicing Silencers (ESS)[44], http://www.umd.be/HSF3/HSF.shtml. PSIPRED (PSI BLAST based secondary structure prediction) predicted secondary structure of protein based on related sequences and position specific scoring matrix. It predicted whether the residues were form strand, helix and coils. Input format was the FASTA sequence of prolidase protein, http://bioinf.cs.ucl.ac.uk/psipred/.

Three dimensional structure prediction by Swiss Model

Prediction of 3D structure was done by Swiss Modeller which allow to model the amino acid on the basis of structure homology. It allows modeling using manual template selection or by automated selection mode. It identifies the template, align the sequence, generate model then assess the model quality in terms of QMEAN value. FASTA sequence (mutation incorporated) was modeled against PDB structure of prolidase rprotein. Swiss Pdb viewer, tool was used to visualize and energy minimization of generated model, https://swissmodel.expasy.org/.

Quality assessment by RAMPAGE

RAMPAGE is a web server predicted dihedral angles and number of residues in allowed, favorable region based on the Φ and Ψ angles. Pdb files of models obtained after energy minimization was used as input of RAMPAGE online tool. More than 90% residues in allowed region is considered as good model. Supplementary information
  44 in total

1.  dbSNP: the NCBI database of genetic variation.

Authors:  S T Sherry; M H Ward; M Kholodov; J Baker; L Phan; E M Smigielski; K Sirotkin
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

2.  Serum prolidase activity and oxidative status in patients with stage I endometrial cancer.

Authors:  Dagistan Tolga Arioz; Hakan Camuzcuoglu; Harun Toy; Sefa Kurt; Hakim Celik; Nurten Aksoy
Journal:  Int J Gynecol Cancer       Date:  2009-10       Impact factor: 3.437

3.  Prolidase deficiency: biochemical classification of alleles.

Authors:  A P Boright; C R Scriver; G A Lancaster; F Choy
Journal:  Am J Hum Genet       Date:  1989-05       Impact factor: 11.025

4.  Serum prolidase activity and oxidative status in patients with bronchial asthma.

Authors:  Alpay Cakmak; Dost Zeyrek; Ali Atas; Hakim Celik; Nurten Aksoy; Ozcan Erel
Journal:  J Clin Lab Anal       Date:  2009       Impact factor: 2.352

5.  Primary structure and gene localization of human prolidase.

Authors:  F Endo; A Tanoue; H Nakai; A Hata; Y Indo; K Titani; I Matsuda
Journal:  J Biol Chem       Date:  1989-03-15       Impact factor: 5.157

6.  Structural organization of the gene for human prolidase (peptidase D) and demonstration of a partial gene deletion in a patient with prolidase deficiency.

Authors:  A Tanoue; F Endo; I Matsuda
Journal:  J Biol Chem       Date:  1990-07-05       Impact factor: 5.157

7.  DisProt: the Database of Disordered Proteins.

Authors:  Megan Sickmeier; Justin A Hamilton; Tanguy LeGall; Vladimir Vacic; Marc S Cortese; Agnes Tantos; Beata Szabo; Peter Tompa; Jake Chen; Vladimir N Uversky; Zoran Obradovic; A Keith Dunker
Journal:  Nucleic Acids Res       Date:  2006-12-01       Impact factor: 16.971

8.  Kinetic and structural evidences on human prolidase pathological mutants suggest strategies for enzyme functional rescue.

Authors:  Roberta Besio; Roberta Gioia; Federica Cossu; Enrico Monzani; Stefania Nicolis; Lucia Cucca; Antonella Profumo; Luigi Casella; Ruggero Tenni; Martino Bolognesi; Antonio Rossi; Antonella Forlino
Journal:  PLoS One       Date:  2013-03-13       Impact factor: 3.240

9.  The Pfam protein families database: towards a more sustainable future.

Authors:  Robert D Finn; Penelope Coggill; Ruth Y Eberhardt; Sean R Eddy; Jaina Mistry; Alex L Mitchell; Simon C Potter; Marco Punta; Matloob Qureshi; Amaia Sangrador-Vegas; Gustavo A Salazar; John Tate; Alex Bateman
Journal:  Nucleic Acids Res       Date:  2015-12-15       Impact factor: 16.971

10.  RNALocate: a resource for RNA subcellular localizations.

Authors:  Ting Zhang; Puwen Tan; Liqiang Wang; Nana Jin; Yana Li; Lin Zhang; Huan Yang; Zhenyu Hu; Lining Zhang; Chunyu Hu; Chunhua Li; Kun Qian; Changjian Zhang; Yan Huang; Kongning Li; Hao Lin; Dong Wang
Journal:  Nucleic Acids Res       Date:  2016-08-19       Impact factor: 16.971

View more
  10 in total

1.  Exploring the Structural and Functional Effects of Nonsynonymous SNPs in the Human Serotonin Transporter Gene Through In Silico Approaches.

Authors:  Md Arzo Mia; Md Nasir Uddin; Yasmin Akter; Lolo Wal Marzan
Journal:  Bioinform Biol Insights       Date:  2022-06-09

2.  Extracting Complementary Insights from Molecular Phenotypes for Prioritization of Disease-Associated Mutations.

Authors:  Shayne D Wierbowski; Robert Fragoza; Siqi Liang; Haiyuan Yu
Journal:  Curr Opin Syst Biol       Date:  2018-09-17

3.  Molecular Epidemiology of Myroides odoratimimus in Nosocomial Catheter-Related Infection at a General Hospital in China.

Authors:  Shuang Yang; Qian Liu; Zhen Shen; Hua Wang; Lei He
Journal:  Infect Drug Resist       Date:  2020-06-25       Impact factor: 4.003

Review 4.  Clinical Genetics of Prolidase Deficiency: An Updated Review.

Authors:  Marta Spodenkiewicz; Michel Spodenkiewicz; Maureen Cleary; Marie Massier; Giorgos Fitsialos; Vincent Cottin; Guillaume Jouret; Céline Poirsier; Martine Doco-Fenzy; Anne-Sophie Lèbre
Journal:  Biology (Basel)       Date:  2020-05-21

5.  Genotyping and In Silico Analysis of Delmarva (DMV/1639) Infectious Bronchitis Virus (IBV) Spike 1 (S1) Glycoprotein.

Authors:  Ahmed Ali; Davor Ojkic; Esraa A Elshafiee; Salama Shany; Mounir Mohamed El-Safty; Adel A Shalaby; Mohamed Faizal Abdul-Careem
Journal:  Genes (Basel)       Date:  2022-09-09       Impact factor: 4.141

6.  ADGRL3 genomic variation implicated in neurogenesis and ADHD links functional effects to the incretin polypeptide GIP.

Authors:  Oscar M Vidal; Jorge I Vélez; Mauricio Arcos-Burgos
Journal:  Sci Rep       Date:  2022-09-23       Impact factor: 4.996

7.  Predicting the most deleterious missense nsSNPs of the protein isoforms of the human HLA-G gene and in silico evaluation of their structural and functional consequences.

Authors:  Elaheh Emadi; Fatemeh Akhoundi; Seyed Mehdi Kalantar; Modjtaba Emadi-Baygi
Journal:  BMC Genet       Date:  2020-08-31       Impact factor: 2.797

8.  In silico approach to the analysis of SNPs in the human APAF1 gene.

Authors:  Tuğba Kaman; Ömer Faruk Karasakal; Ebru Özkan Oktay; Korkut Ulucan; Muhsin Konuk
Journal:  Turk J Biol       Date:  2019-12-13

9.  Evaluating the Effect of 3'-UTR Variants in DICER1 and DROSHA on Their Tissue-Specific Expression by miRNA Target Prediction.

Authors:  Dmitrii S Bug; Artem V Tishkov; Ivan S Moiseev; Natalia V Petukhova
Journal:  Curr Issues Mol Biol       Date:  2021-07-06       Impact factor: 2.976

10.  SNPeffect: identifying functional roles of SNPs using metabolic networks.

Authors:  Debolina Sarkar; Costas D Maranas
Journal:  Plant J       Date:  2020-04-18       Impact factor: 7.091

  10 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.