David A Armstrong, Quentin Kaas, K Johan Rosengren.
Abstract
Cystine residues result from the formation of disulfide bonds between pairs of cysteine residues. This cross-linking of the backbone is essential for the structure and activity of peptides and proteins. The conformation of a cystine side chain can be described using five dihedral angles, χ1, χ2, χ3, χ2′, and χ1′, with cystines favouring certain combinations of these angles. 2D NMR spectroscopy is ideally suited for structure determination of disulfide-rich peptides, because of their small size and constrained nature. However, only limited information on the cystine side chain conformation can be determined by NMR spectroscopy, leading to ambiguity in the deduced 3D structures. Resolving accurate structures is important as disulfide-rich peptides have proven to be promising drug candidates in a number of fields, either as bioactive leads or scaffolds. Using a database of NMR chemical shifts combined with crystallographic structures, we have developed a method called DISH that uses support vector machines to predict the dihedral angles of cysteine side chains. It predicts χ2 angles with 91% accuracy, and has improved performance over existing prediction methods for χ1 angles, with 87% accuracy. For 81% of cysteine residues, DISH successfully predicted both the χ1 and χ2 angles. By revisiting published solution structures of peptides determined using NMR spectroscopy, we assessed the impact of additional cystine dihedral restraints on the quality of 3D models. DISH improved the resolution and accuracy, highlighting the potential for improving the understanding of structure-activity relationships and rational development of peptide drugs.
Year: 2018 PMID: 30310586 PMCID: PMC6115640 DOI: 10.1039/c8sc01423j
Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.825
Fig. 1 (a) The five dihedral angles of a cystine residue side chain: χ1, χ2, χ3, χ2′, and χ1′. (b–d) Distribution of χ angles of 3342 cystine residues. Angles were binned to the nearest 5°. The x-axis is the dihedral angle in degrees (°) and the y-axis is the frequency in the database. Green areas indicate the dihedral angle ranges used to define three angle classes, χ1 (and χ1′), χ2 (and χ2′) and χ3.
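As a rough illustration of the quantities in Fig. 1, the sketch below computes a dihedral angle from four atom positions, bins it to the nearest 5° and assigns it to an angle class. The atom quartets named in the comments follow the usual definitions of the cystine side-chain angles. The class centres and the ±30° window are placeholder assumptions, because the actual ranges are the green areas of the figure and are not given in the text. All function names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    Usual atom quartets (assumed here, not listed in the text):
    chi1 = N-CA-CB-SG, chi2 = CA-CB-SG-SG', chi3 = CB-SG-SG'-CB'.
    """
    u1, u2, u3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(u1, u2), _cross(u2, u3)
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def wrap180(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def bin_to_5(angle):
    """Round an angle to the nearest 5 degrees, kept in [-180, 180)."""
    return wrap180(5.0 * round(angle / 5.0))


# Placeholder class centres and window (assumption, not the paper's ranges).
CENTRES = {"near -60": -60.0, "near +60": 60.0, "near 180": 180.0}
HALF_WIDTH = 30.0


def angle_class(angle, centres=CENTRES, half_width=HALF_WIDTH):
    """Return the name of the class whose window contains the angle, else None."""
    for name, centre in centres.items():
        if abs(wrap180(angle - centre)) <= half_width:
            return name
    return None
```

The classes are labelled by their centre rather than as g+ or g– because the sign convention for those labels differs between sources. A different set of centres can be passed in for χ3.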
Fig. 2 Workflow of the DISH method. The prediction of each χ angle uses a two-level SVM. The workflow details the input values as well as the optimized γ and C SVM parameters.
The MCC for each stage and final accuracy for χ1 and χ2 angle prediction by DISH from a ‘leave-one-out’ evaluation
| | Stage I MCC | Stage II MCC | Accuracy (%) |
| SVM-χ1 | 0.89 | 0.70 | 87 |
| SVM-χ2 | 0.85 | 0.85 | 91 |
χ1 is an input of stage II and was measured in the crystal structure for this test.
Accuracy was measured by serially using stages I and II.
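For readers unfamiliar with the MCC (Matthews correlation coefficient) reported for each stage, the following minimal sketch computes it for a binary classifier from true and predicted labels. It uses the standard definition of the coefficient. The function name and the choice to return 0.0 when the denominator vanishes are illustrative assumptions, and nothing here reproduces the DISH evaluation itself.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0.0:
        # Common convention when a row or column of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

A value of 1 corresponds to perfect agreement, 0 to chance-level prediction and -1 to complete disagreement.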
Fig. 3 Correlations between the expected accuracy of predictions (confidence score) and the SVM output values for (a) χ1 × χ2 predictions, (b) χ1 predictions and (c) χ2 predictions. The accuracies were estimated using the leave-one-out method and correlations with output values were computed using the Platt scaling method. The frequency of predictions with output values above a cut-off is indicated in red. Each plot represents the mean with error bars showing standard deviation of ten (n = 10) rounds of Platt scaling on all the data. The dashed line represents the overall accuracy for 100% of the frequency.
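Platt scaling maps a raw classifier output f to a probability through a fitted sigmoid, P = 1 / (1 + exp(A·f + B)). The sketch below fits A and B by plain gradient descent on the cross-entropy, using Platt's smoothed targets, and also reports the fraction of predictions above a cut-off. It is a generic illustration under stated assumptions: the optimiser, learning rate, iteration count and function names are not taken from the DISH implementation.

```python
import math


def _inv_logistic(z):
    """Return 1 / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit A and B of P(correct | f) = 1 / (1 + exp(A*f + B)).

    labels are 1 (prediction correct) or 0 (prediction wrong).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Platt's smoothed targets instead of hard 0/1 labels.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]
    a, b, n = 0.0, 0.0, len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for f, t in zip(scores, targets):
            p = _inv_logistic(a * f + b)
            # Derivative of the cross-entropy with respect to (A*f + B) is (t - p).
            grad_a += (t - p) * f
            grad_b += (t - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_probability(score, a, b):
    """Calibrated probability for one raw output value."""
    return _inv_logistic(a * score + b)


def fraction_above(scores, cutoff):
    """Fraction of predictions whose raw output exceeds the cut-off."""
    return sum(1 for s in scores if s > cutoff) / len(scores)
```

With outputs for which larger values are more often correct, the fitted A is negative, so the calibrated probability rises with the raw output.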
The Cys residues of cyc-PVIIA, with χ1 angles calculated from the E.COSY spectrum, χ1 angles predicted by TALOS-N, and χ1 and χ2 angles predicted by DISH. Angles are classed as gauche+ (g+), gauche– (g–) or trans (t)
| Residue | | | |
| 1 | — | — | — |
| 8 | | | |
| 15 | | | |
| 16 | — | | |
| 20 | — | | |
| 26 | — | | |
DISH restraints that were not in agreement with the reported experimental data, or that were found to be violated, were not included in the new structure calculation.
Fig. 4 Comparison of the backbone conformation of the 20 lowest energy models of cyc-PVIIA computed using CNS without DISH predictions (PDB 2n8e; in blue) and with DISH predictions (in pink). Cystine side chains are shown as yellow sticks.
Structural statistics of the 20 lowest energy structures of cyc-PVIIA and the re-evaluated structure with additional χ1 and χ2 restraints calculated using simulated annealing procedures in CNS
| | Original | Additional |
| Clash score | 6.1 ± 2.7 | 11.8 ± 4.7 |
| Poor rotamers | 1.1 ± 1.0 | 0.05 ± 0.22 |
| Ramachandran outliers | 0.0 ± 0.0 | 0.45 ± 0.61 |
| Ramachandran favoured (%) | 95.5 ± 4.0 | 89.9 ± 5.1 |
| MolProb. score | 1.9 ± 0.33 | 2.1 ± 0.18 |
| Percentile (%) | 79.3 ± 15.5 | 69.8 ± 9.8 |
| Residues with bad bonds | 0.2 ± 0.45 | 0.6 ± 0.68 |
| | | |
| Mean global backbone | 0.91 ± 0.25 | 0.61 ± 0.18 |
| Mean global heavy | 1.78 ± 0.26 | 1.52 ± 0.26 |
| | | |
| Mean global backbone | 1.65 ± 0.31 | 1.29 ± 0.35 |
| Mean global heavy | 2.42 ± 0.30 | 2.24 ± 0.48 |
Definitions of the MolProbity structural statistics (ref. 55).
Clash score: the number of non-donor–acceptor atoms that overlap by more than 0.4 Å, per 1000 atoms.
MolProb. score: an overall measure of protein structure quality. It is a log-weighted combination of the clash score, the percentage of Ramachandran residues not favoured and the percentage of bad side chain rotamers, and reflects the crystallographic resolution at which those values would be expected.
Percentile: the 100th percentile is the best among structures of comparable resolution; the 0th percentile is the worst.
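To make the "log-weighted combination" concrete, the sketch below implements the MolProbity score as it is commonly published by the MolProbity developers. The coefficients and offsets come from that general definition rather than from this article, so they should be checked against the cited reference before being relied on. The function and argument names are hypothetical.

```python
import math


def molprobity_score(clashscore, rota_out_pct, rama_fav_pct):
    """MolProbity score from its three inputs (lower is better).

    clashscore   : serious overlaps per 1000 atoms
    rota_out_pct : percentage of poor side chain rotamers
    rama_fav_pct : percentage of residues in Ramachandran-favoured regions

    Coefficients are taken from the commonly published formula
    (assumption: not stated in this document).
    """
    rama_not_fav = 100.0 - rama_fav_pct
    return (0.426 * math.log(1.0 + clashscore)
            + 0.33 * math.log(1.0 + max(0.0, rota_out_pct - 1.0))
            + 0.25 * math.log(1.0 + max(0.0, rama_not_fav - 2.0))
            + 0.5)
```

Because each term is logarithmic, a score computed from ensemble-averaged inputs will generally differ from the average of per-model scores.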
The Cys residues of ProTx-II, Pn3A and G117 and χ1 angles calculated from the E.COSY spectrum, χ1 angles predicted by TALOS-N and the χ1 and χ2 angles predicted by DISH
| ProTx-II | | | | |
| 2 | — | | — | |
| 9 | | | — | |
| 15 | | | | |
| 16 | | | — | |
| 21 | | | | |
| 25 | | | — | |
| Pn3A | | | | |
| 2 | | | — | |
| 9 | — | | — | |
| 15 | | | | |
| 16 | | | | |
| 21 | | | | |
| 28 | | | — | |
| G117 | | | | |
| 8 | | | — | |
| 14 | | | — | |
| 15 | | | — | |
| 19 | | | — | |
| 20 | | | — | |
| 24 | | | — | |
| 31 | | | — | |
Fig. 5 The mean (error bars representing standard deviation) of the absolute difference between experimental N–HN RDC values and those predicted by PALES (n = 20). Two sets of structures of hen lysozyme were calculated in CNS, one with Cys χ1 and χ2 restraints and one without. P-values were computed using an unpaired Student's t-test. *P < 0.05, **P < 0.005, ***P < 0.0005.
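The comparison in Fig. 5 can be sketched as follows: absolute deviations between experimental and predicted RDCs are computed for each set of structures, and the two sets are compared with an unpaired Student's t-test using a pooled variance. The sketch stops at the t statistic and degrees of freedom, because converting them to a P-value requires a t-distribution CDF that the Python standard library does not provide (for example, `scipy.stats.ttest_ind` returns one directly). The star thresholds are those given in the caption. Function names and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def absolute_deviations(experimental, predicted):
    """Per-value absolute difference between measured and predicted RDCs."""
    return [abs(e - p) for e, p in zip(experimental, predicted)]


def student_t(sample_a, sample_b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, dof


def significance_stars(p_value):
    """Star notation using the thresholds stated in the figure caption."""
    if p_value < 0.0005:
        return "***"
    if p_value < 0.005:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Illustrative numbers only, not data from the study.
t_stat, dof = student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```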