Tikam Chand Dakal, Deepak Kala, Gourav Dhiman, Vinod Yadav, Andrey Krokhotin, Nikolay V Dokholyan.
Abstract
Here we report an in-silico approach for the identification, characterization and validation of deleterious non-synonymous SNPs (nsSNPs) in the interleukin-8 gene in three steps. In the first step, sequence homology-based analysis of a set of 50 coding SNPs associated with 41 rsIDs using SIFT (Sorting Intolerant from Tolerant) and PROVEAN (Protein Variation Effect Analyzer) identified 23 nsSNPs as putatively damaging/deleterious by at least one of the two tools. Subsequently, structural homology-based PolyPhen-2 (Polymorphism Phenotyping) analysis predicted 9 of the 23 nsSNPs (K4T, E31A, E31K, S41Y, I55N, P59L, P59S, L70P and V88D) to be damaging. According to the conditional hypothesis of the study, only nsSNPs predicted to be damaging/deleterious by both the sequence and the structural homology-based approaches were considered 'high-confidence' nsSNPs. In step 2, based on conservation of amino acid residues, stability analysis, structural superimposition, RMSD and docking analysis, the possible structure-function relationship was ascertained for the high-confidence nsSNPs. Finally, in a separate analysis (step 3), IL-8 deregulation also appeared to be an important prognostic marker for patients with gastric and lung cancer. This study provides, for the first time, in-depth insights into the effects of amino acid substitutions on IL-8 protein structure, function and disease association.
Year: 2017 PMID: 28747718 PMCID: PMC5529537 DOI: 10.1038/s41598-017-06575-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The pie chart displays the numbers of coding, 5′ UTR, 3′ UTR, intronic and other active human SNPs in the human IL8 gene (based on the dbSNP database).
A record of 50 SNPs, 3 synonymous and 47 non-synonymous, associated with 41 rsIDs in the IL8 coding region.
| rsID | Codons | SNP Type | Substitution |
|---|---|---|---|
| rs572157399 | ATG-gTG | Nonsynonymous | *1V |
| rs202071309 | AAG-AcG | Nonsynonymous | K4T |
| rs563959935 | CTG-CaG | Nonsynonymous | L5Q |
| rs200602609 | GCC-tCC | Nonsynonymous | A6S |
| rs202202182 | GCT-GaT | Nonsynonymous | A8D |
| rs564043731 | CTC-tTC | Nonsynonymous | L9F |
| rs200254616 | CTG-CaG | Nonsynonymous | L14Q |
| rs763622469 | GAA-GAc | Nonsynonymous | E21D |
| rs751273843 | GGT-aGT | Nonsynonymous | G22S |
| rs767339386 | CCA-aCA | Nonsynonymous | P26T |
| rs199855020 | AGT-cGT | Nonsynonymous | S28R |
| rs755727808 | AAA-tAA | Nonsynonymous | K30* |
| rs138567132 | GAA-GcA | Nonsynonymous | E31A |
| rs188378669 | GAA-tAA | Nonsynonymous | E31* |
| rs188378669 | GAA-aAA | Nonsynonymous | E31K |
| rs149273289 | CAG-aAG | Nonsynonymous | Q35K |
| rs149273289 | CAG-gAG | Nonsynonymous | Q35E |
| rs745916337 | TAC-cAC | Nonsynonymous | Y40H |
| rs144469788 | TCC-aCC | Nonsynonymous | S41T |
| rs749738011 | TCC-TaC | Nonsynonymous | S41Y |
| rs200107073 | CCT-CaT | Nonsynonymous | P43H |
| rs139503118 | CAC-CgC | Nonsynonymous | H45R |
| rs774766411 | AAA-gAA | Nonsynonymous | K50E |
| rs202114642 | ATT-AaT | Nonsynonymous | I55N |
| rs202114642 | ATT-AcT | Nonsynonymous | I55T |
| rs765951700 | CCA-CtA | Nonsynonymous | P59L |
| rs373821605 | CCA-tCA | Nonsynonymous | P59S |
| rs147544998 | TGC-TGa | Nonsynonymous | C61* |
| rs147544998 | TGC-TGt | Synonymous | C61C |
| rs140214046 | GCC-aCC | Nonsynonymous | A62T |
| rs140214046 | GCC-tCC | Nonsynonymous | A62S |
| rs758228010 | ACA-cCA | Nonsynonymous | T64P |
| rs142957504 | ACA-AgA | Nonsynonymous | T64R |
| rs759032011 | AAG-AAa | Synonymous | K69K |
| rs759032011 | AAG-AAc | Nonsynonymous | K69N |
| rs759032011 | AAG-AAt | Nonsynonymous | K69N |
| rs762899923 | CTT-CcT | Nonsynonymous | L70P |
| rs751369405 | GAG-GAc | Nonsynonymous | E75D |
| rs536774132 | TGT-TGa | Nonsynonymous | C77* |
| rs373408845 | AAC-AAa | Nonsynonymous | N83K |
| rs780209935 | TGG-TGa | Nonsynonymous | W84* |
| rs753921688 | AGG-AaG | Nonsynonymous | R87K |
| rs756294837 | AGG-AGa | Synonymous | R87R |
| rs756294837 | AGG-AGc | Nonsynonymous | R87S |
| rs779068762 | GTT-GcT | Nonsynonymous | V88A |
| rs779068762 | GTT-GaT | Nonsynonymous | V88D |
| rs200662278 | TTG-TaG | Nonsynonymous | L93* |
| rs185040023 | AAG-AgG | Nonsynonymous | K94R |
| rs200005090 | GCT-GtT | Nonsynonymous | A96V |
| rs201643630 | TAA-TtA | Nonsynonymous | *100L |
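
The Substitution and SNP Type columns follow from translating the reference and variant codons with the standard genetic code. The sketch below illustrates that mapping; the function name `describe_substitution` is hypothetical and not part of any tool used in the study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def describe_substitution(ref_codon, alt_codon, position):
    """Return (substitution label, SNP type) for a single-codon change.

    Codons may use the table's convention of a lowercase variant base,
    e.g. "AcG"; '*' denotes a stop codon.
    """
    ref = CODON_TABLE[ref_codon.upper()]
    alt = CODON_TABLE[alt_codon.upper()]
    kind = "Synonymous" if ref == alt else "Nonsynonymous"
    return f"{ref}{position}{alt}", kind


print(describe_substitution("AAG", "AcG", 4))   # rs202071309
print(describe_substitution("TGC", "TGt", 61))  # rs147544998
```

Note that a change to a stop codon (for example GAA to tAA at position 31) is reported here as nonsynonymous, matching how the table labels such entries.
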
Figure 2. An overview of the experimental design for in-silico identification/characterization (step 1) and validation (step 2) of nsSNPs in the IL8 gene, and for associating deregulation of the IL8 gene with survival of cancer patients (step 3).
Sequence homology-based prediction of damaging coding nsSNPs in IL8 gene using SIFT and PROVEAN.
| SNP rsID | Codons | Substitution | SNP Type | SIFT prediction | SIFT score | PROVEAN prediction | PROVEAN score |
|---|---|---|---|---|---|---|---|
| rs202071309 | AAG-AcG | K4T | Nonsynonymous | Damaging | 0 | Neutral | −2.4 |
| rs563959935 | CTG-CaG | L5Q | Nonsynonymous | Damaging | 0.01 | Deleterious | −3.14 |
| rs202202182 | GCT-GaT | A8D | Nonsynonymous | Damaging | 0 | Deleterious | −3.66 |
| rs200254616 | CTG-CaG | L14Q | Nonsynonymous | Damaging | 0 | Deleterious | −4.63 |
| rs763622469 | GAA-GAc | E21D | Nonsynonymous | Damaging | 0 | Neutral | −1.47 |
| rs199855020 | AGT-cGT | S28R | Nonsynonymous | Damaging | 0.04 | Neutral | −1.13 |
| rs138567132 | GAA-GcA | E31A | Nonsynonymous | Tolerated | 0.07 | Deleterious | −4.96 |
| rs188378669 | GAA-aAA | E31K | Nonsynonymous | Damaging | 0.01 | Deleterious | −3.52 |
| rs149273289 | CAG-gAG | Q35E | Nonsynonymous | Damaging | 0.04 | Neutral | −2.17 |
| rs749738011 | TCC-TaC | S41Y | Nonsynonymous | Damaging | 0 | Deleterious | −4.27 |
| rs139503118 | CAC-CgC | H45R | Nonsynonymous | Tolerated | 0.12 | Deleterious | −5.83 |
| rs202114642 | ATT-AaT | I55N | Nonsynonymous | Damaging | 0.02 | Deleterious | −4.15 |
| rs765951700 | CCA-CtA | P59L | Nonsynonymous | Damaging | 0 | Deleterious | −9.35 |
| rs373821605 | CCA-tCA | P59S | Nonsynonymous | Tolerated | 0.3 | Deleterious | −7.21 |
| rs758228010 | ACA-cCA | T64P | Nonsynonymous | Damaging | 0.01 | Deleterious | −3.28 |
| rs142957504 | ACA-AgA | T64R | Nonsynonymous | Damaging | 0.01 | Deleterious | −3.53 |
| rs759032011 | AAG-AAc | K69N | Nonsynonymous | Damaging | 0.01 | Neutral | 0.1 |
| rs759032011 | AAG-AAt | K69N | Nonsynonymous | Damaging | 0.01 | Neutral | 0.1 |
| rs762899923 | CTT-CcT | L70P | Nonsynonymous | Damaging | 0 | Deleterious | −6.4 |
| rs751369405 | GAG-GAc | E75D | Nonsynonymous | Damaging | 0.05 | Neutral | −2.02 |
| rs756294837 | AGG-AGc | R87S | Nonsynonymous | Damaging | 0.02 | Neutral | −2.21 |
| rs779068762 | GTT-GcT | V88A | Nonsynonymous | Damaging | 0 | Deleterious | −3.4 |
| rs779068762 | GTT-GaT | V88D | Nonsynonymous | Damaging | 0 | Deleterious | −5.6 |
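
The calls in this table are consistent with the cutoffs these tools commonly apply by default: a SIFT score of 0.05 or lower is "Damaging", and a PROVEAN score of −2.5 or lower is "Deleterious". The cutoffs are not stated in this record, so the sketch below treats them as assumptions; the helper names are hypothetical. It reproduces the step 1 rule that a variant is kept if at least one of the two tools flags it.

```python
SIFT_CUTOFF = 0.05      # assumed default: score <= cutoff is "Damaging"
PROVEAN_CUTOFF = -2.5   # assumed default: score <= cutoff is "Deleterious"


def sift_call(score):
    return "Damaging" if score <= SIFT_CUTOFF else "Tolerated"


def provean_call(score):
    return "Deleterious" if score <= PROVEAN_CUTOFF else "Neutral"


def flagged_by_either(sift_score, provean_score):
    """Step 1 rule: keep the variant if at least one tool flags it."""
    return (sift_call(sift_score) == "Damaging"
            or provean_call(provean_score) == "Deleterious")


# A few rows copied from the table above: (substitution, SIFT, PROVEAN).
rows = [
    ("K4T", 0.0, -2.4),
    ("E31A", 0.07, -4.96),
    ("P59S", 0.3, -7.21),
    ("E75D", 0.05, -2.02),
    ("K69N", 0.01, 0.1),
]

for name, sift, provean in rows:
    print(name, sift_call(sift), provean_call(provean),
          flagged_by_either(sift, provean))
```
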
Structural homology-based prediction of damaging coding nsSNPs using PolyPhen-2.
| Substitution | Effect | Score | Sensitivity | Specificity |
|---|---|---|---|---|
| K4T | Probably damaging | 0.964 | 0.78 | 0.95 |
| L5Q | Benign | 0.164 | 0.92 | 0.87 |
| A8D | Possibly damaging | 0.906 | 0.82 | 0.94 |
| L14Q | Possibly damaging | 0.917 | 0.81 | 0.94 |
| E21D | Possibly damaging | 0.952 | 0.79 | 0.95 |
| G22S | Possibly damaging | 0.518 | 0.88 | 0.9 |
| S28R | Possibly damaging | 0.662 | 0.86 | 0.91 |
| E31A | Probably damaging | 1 | 0 | 1 |
| E31K | Probably damaging | 1 | 0 | 1 |
| Q35K | Benign | 0.012 | 0.96 | 0.78 |
| S41Y | Probably damaging | 0.999 | 0.14 | 0.99 |
| H45R | Possibly damaging | 0.549 | 0.88 | 0.91 |
| I55N | Probably damaging | 0.998 | 0.27 | 0.99 |
| P59L | Probably damaging | 1 | 0 | 1 |
| P59S | Probably damaging | 1 | 0 | 1 |
| T64P | Possibly damaging | 0.939 | 0.8 | 0.94 |
| T64R | Possibly damaging | 0.884 | 0.82 | 0.94 |
| K69N | Possibly damaging | 0.549 | 0.88 | 0.91 |
| L70P | Probably damaging | 0.997 | 0.41 | 0.98 |
| E75D | Benign | 0.012 | 0.96 | 0.78 |
| R87S | Benign | 0.072 | 0.94 | 0.84 |
| V88A | Possibly damaging | 0.856 | 0.83 | 0.93 |
| V88D | Probably damaging | 0.992 | 0.7 | 0.97 |
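
The nine substitutions named in the abstract are exactly the rows labelled "Probably damaging" here. A minimal sketch of that filter, using the Effect labels copied from the table (the variable names are illustrative only):

```python
# PolyPhen-2 effect labels copied from the table above.
polyphen_effect = {
    "K4T": "Probably damaging", "L5Q": "Benign", "A8D": "Possibly damaging",
    "L14Q": "Possibly damaging", "E21D": "Possibly damaging",
    "G22S": "Possibly damaging", "S28R": "Possibly damaging",
    "E31A": "Probably damaging", "E31K": "Probably damaging",
    "Q35K": "Benign", "S41Y": "Probably damaging",
    "H45R": "Possibly damaging", "I55N": "Probably damaging",
    "P59L": "Probably damaging", "P59S": "Probably damaging",
    "T64P": "Possibly damaging", "T64R": "Possibly damaging",
    "K69N": "Possibly damaging", "L70P": "Probably damaging",
    "E75D": "Benign", "R87S": "Benign", "V88A": "Possibly damaging",
    "V88D": "Probably damaging",
}

# Keep only the strongest PolyPhen-2 category, in table order.
high_confidence = [snp for snp, effect in polyphen_effect.items()
                   if effect == "Probably damaging"]

print(len(high_confidence), high_confidence)
```
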
Physical principle-based prediction of ∆∆G using the Eris server (http://eris.dokhlab.org).
| Models | Backbone modeling | Pre-relaxation | ∆∆G score (kcal/mol) | Mutation category |
|---|---|---|---|---|
| IL-8 E31A | Flexible | Yes | −0.83 | stabilizing |
| IL-8 E31K | Flexible | Yes | 0.97 | destabilizing |
| IL-8 S41Y | Flexible | Yes | −1.81 | stabilizing |
| IL-8 I55N | Flexible | Yes | 3.64 | destabilizing |
| IL-8 P59L | Flexible | Yes | −3.07 | stabilizing |
| IL-8 P59S | Flexible | Yes | −1.67 | stabilizing |
| IL-8 L70P | Flexible | Yes | >10 | destabilizing |
| IL-8 V88D | Flexible | Yes | 4.96 | destabilizing |
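
The Mutation category column follows the sign of ∆∆G: negative values are reported as stabilizing and positive values as destabilizing. The sketch below applies that reading to score strings as they appear in the table, including the Unicode minus sign and the ">10" entry (treated here as a lower bound of 10). The function names are hypothetical, and the "neutral" label for an exact zero is an assumption that does not occur in the table.

```python
def parse_ddg(text):
    """Convert a table cell such as '−0.83' or '>10' to a float.

    A leading '>' is dropped, so '>10' is read as the lower bound 10.0.
    """
    cleaned = text.strip().replace("\u2212", "-").lstrip(">")
    return float(cleaned)


def mutation_category(ddg):
    if ddg > 0:
        return "destabilizing"
    if ddg < 0:
        return "stabilizing"
    return "neutral"  # assumption: not a category used in the table


for model, cell in [("E31A", "\u22120.83"), ("I55N", "3.64"), ("L70P", ">10")]:
    print(model, mutation_category(parse_ddg(cell)))
```
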
Prediction of disease related amino acid substitution and phenotypes by nsSNPAnalyzer.
| SNPs | Phenotype | Environment | AreaBuried | FracPolar | Secondstr |
|---|---|---|---|---|---|
| K4T | Unknown | — | — | — | — |
| E31A | Neutral | EC | 0.094 | 0.896 | C |
| E31K | Neutral | EC | 0.094 | 0.896 | C |
| S41Y | Neutral | EC | 0.079 | 0.906 | C |
| I55N | Neutral | B3S | 0.509 | 0.719 | S |
| P59L | Neutral | EC | 0.016 | 0.854 | C |
| P59S | Neutral | EC | 0.016 | 0.854 | C |
| L70P | Disease | B2S | 0.61 | 0.323 | S |
| V88D | Neutral | B1H | 0.508 | 0.219 | H |
Prediction of disease related amino acid substitution and phenotypes by MutPred.
| SNPs | Actionable/Confident hypothesis | g-value | p-value |
|---|---|---|---|
| E31A | Loss of solvent accessibility | 0.572 | 0.0404 |
| E31K | Gain of MoRF binding | 0.568 | 0.0031 |
| S41Y | Gain of solvent accessibility | 0.510 | 0.0739 |
| I55N | Gain of disorder | 0.710 | 0.033 |
| P59L | Gain of catalytic residue at P59 | 0.609 | 0.051 |
| | Loss of glycosylation at S57 | | 0.0797 |
| P59S | Loss of glycosylation at S57 | 0.521 | 0.0829 |
| L70P | Loss of stability | 0.771 | 0.0189 |
| | Loss of catalytic residue at L70 | | 0.0214 |
| V88D | Gain of disorder | 0.786 | 0.0306 |
| | Loss of MoRF binding | | 0.0325 |
| | Gain of ubiquitination at K91 | | 0.0401 |
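
The column heading "Actionable/Confident hypothesis" refers to MutPred's tiers for combining the general score g with a property p-value. The thresholds below are the ones commonly quoted for MutPred (g > 0.5 with p < 0.05 as actionable; g > 0.75 with p < 0.05 as confident; g > 0.75 with p < 0.01 as very confident). They are not stated in this record, so they should be read as assumptions, and the function name is hypothetical.

```python
def mutpred_tier(g, p):
    """Label a (g, p) pair using assumed MutPred hypothesis tiers."""
    if g > 0.75 and p < 0.01:
        return "very confident"
    if g > 0.75 and p < 0.05:
        return "confident"
    if g > 0.5 and p < 0.05:
        return "actionable"
    return "below threshold"


# Top hypothesis per substitution, copied from the table above.
top_hits = [
    ("E31A", "Loss of solvent accessibility", 0.572, 0.0404),
    ("E31K", "Gain of MoRF binding", 0.568, 0.0031),
    ("S41Y", "Gain of solvent accessibility", 0.510, 0.0739),
    ("L70P", "Loss of stability", 0.771, 0.0189),
    ("V88D", "Gain of disorder", 0.786, 0.0306),
]

for snp, hypothesis, g, p in top_hits:
    print(snp, hypothesis, mutpred_tier(g, p))
```
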
Figure 3. Amino acid alignment of the human IL-8 protein (UniProt ID: P10145) with its homologues in phylogenetically close mammalian and fowl species. Solid horizontal bars indicate conserved sequence motifs, and residues marked with an asterisk (*) are evolutionarily conserved amino acids. Amino acid identities are colored according to the Clustal color scheme, and the conservation index at each alignment position was provided by Jalview[61].
Figure 4. Structural model of human IL-8. Panel A shows two conserved sequence motifs, ELR31–33 and SGP57–59, that face each other and form a structural scaffold putatively involved in IL-8 binding to its receptor; panel B shows the hydrophobic pocket formed by F44, F48, I49 and L70 on the surface of the IL-8 protein, which has a role in receptor binding.
Figure 5. ConSurf analysis of the human interleukin-8 protein (UniProt ID: P10145).
Figure 6. Structural superimposition of the modelled mutant proteins (in pink) on the wild-type IL-8 protein (in peacock blue) using PyMOL.
Figure 7. Representative docking poses of wild-type IL-8 (IL-8 WT) (panel A) and the mutant proteins (IL-8 E31A and IL-8 E31K) (panels B–D) on the receptor IL-8R1 (CXCR1; PDB ID: 2LNL). IL-8 WT and the mutant proteins are colored red and the receptor CXCR1 is colored blue. The N- and C-terminal ends of IL-8 WT/mutants and CXCR1 are also marked.
Figure 8. Association of IL8 gene deregulation with survival of patients with different cancer types, based on microarray gene expression data.
Statistical outputs for the modeled 3D structures of the wild-type and mutant IL-8 proteins from I-TASSER (left side), and for the I-TASSER models after structural refinement and energy minimization with ModRefiner (right side).
| Models | C-score (I-TASSER) | Estimated TM-score (I-TASSER) | Estimated RMSD, Å (I-TASSER) | RMSD, Å (refined model) | TM-score (refined model) |
|---|---|---|---|---|---|
| IL8_WT | 0.57 | 0.79 ± 0.09 | 2.2 ± 1.7 | 0.205 | 0.9951 |
| IL8_E31A | 0.59 | 0.79 ± 0.09 | 2.2 ± 1.7 | 0.186 | 0.996 |
| IL8_E31K | 0.67 | 0.80 ± 0.09 | 2.0 ± 1.6 | 0.191 | 0.9958 |
| IL8_S41Y | 0.66 | 0.80 ± 0.09 | 2.1 ± 1.6 | 0.178 | 0.9963 |
| IL8_I55N | 0.61 | 0.80 ± 0.09 | 2.1 ± 1.7 | 0.19 | 0.9958 |
| IL8_P59L | 0.59 | 0.79 ± 0.09 | 2.2 ± 1.7 | 0.147 | 0.9975 |
| IL8_P59S | 0.68 | 0.81 ± 0.09 | 2.0 ± 1.6 | 0.211 | 0.9948 |
| IL8_L70P | 1.17 | 0.87 ± 0.07 | 1.2 ± 1.2 | 0.19 | 0.9958 |
| IL8_V88D | 0.61 | 0.80 ± 0.09 | 2.1 ± 1.7 | 0.273 | 0.9917 |
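
The RMSD values in this table measure the average distance between corresponding atoms of two structures. As a reminder of the definition, the sketch below computes RMSD = sqrt((1/N) Σ ‖a_i − b_i‖²) for two coordinate lists that are assumed to be already superimposed and listed in the same atom order. It does not perform the optimal superposition that structure-comparison tools apply first, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate sets.

    Each argument is a sequence of (x, y, z) tuples in matching atom order.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: two atoms, each displaced by 1 unit.
reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
model = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(rmsd(reference, model))
```
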
Summary of the possible structural and functional consequences for the IL-8 protein of nsSNPs in the IL8 gene.
| Models | Structural effect | Structural analysis | Functional effect | Functional analysis |
|---|---|---|---|---|
| IL8_E31A | No change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Increase in protein stability | Eris |
| | | | Increase in IL-8 binding to CXCR1 | ClusPro |
| IL8_E31K | No change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Decrease in protein stability | Eris |
| | | | Decrease in IL-8 binding to CXCR1 | ClusPro |
| IL8_S41Y | No change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Increase in protein stability | Eris |
| IL8_I55N | Change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Decrease in protein stability | Eris |
| IL8_P59L | Change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Increase in protein stability | Eris |
| IL8_P59S | Change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Increase in protein stability | Eris |
| IL8_L70P | Change in conformation | I-TASSER & structural superimposition | Loss of conserved functional residue | Clustal Omega & ConSurf |
| | Change in RMSD | I-TASSER & structural superimposition | Decrease in protein stability | Eris & MutPred |
| | Loss of conserved structural residue | Clustal Omega & ConSurf | Loss of catalytic residue at L70 | MutPred |
| | | | Association with diseased phenotype | nsSNPAnalyzer |
| IL8_V88D | Change in conformation | I-TASSER & structural superimposition | Decrease in protein stability | Eris |
| | Change in RMSD | I-TASSER & structural superimposition | Gain of disorder | MutPred |
| | Loss of conserved structural residue | Clustal Omega & ConSurf | Loss of MoRF binding | MutPred |
| | | | Gain of ubiquitination at K91 | MutPred |