Hussam J Al-Barakati, Evan W McConnell, Leslie M Hicks, Leslie B Poole, Robert H Newman, Dukka B Kc.
Abstract
Protein S-sulfenylation, which results from oxidation of free thiols on cysteine residues, has recently emerged as an important post-translational modification that regulates the structure and function of proteins involved in a variety of physiological and pathological processes. By altering the size and physicochemical properties of modified cysteine residues, sulfenylation can impact the cellular function of proteins in several different ways. Thus, the ability to rapidly and accurately identify putative sulfenylation sites in proteins will provide important insights into redox-dependent regulation of protein function in a variety of cellular contexts. Though bottom-up proteomic approaches, such as tandem mass spectrometry (MS/MS), provide a wealth of information about global changes in the sulfenylation state of proteins, MS/MS-based experiments are often labor-intensive, costly and technically challenging. Therefore, to complement existing proteomic approaches, researchers have developed a series of computational tools to identify putative sulfenylation sites on proteins. However, existing methods often suffer from low accuracy, specificity, and/or sensitivity. In this study, we developed SVM-SulfoSite, a novel sulfenylation prediction tool that uses support vector machines (SVM) to identify key determinants of sulfenylation among five feature classes: binary code, physicochemical properties, k-space amino acid pairs, amino acid composition and high-quality physicochemical indices. Using 10-fold cross-validation, SVM-SulfoSite achieved 95% sensitivity and 83% specificity, with an overall accuracy of 89% and a Matthews correlation coefficient (MCC) of 0.79. Likewise, using an independent test set of experimentally identified sulfenylation sites, our method achieved scores of 74%, 62%, 80% and 0.42 for accuracy, sensitivity, specificity and MCC, respectively, with an area under the receiver operating characteristic (ROC) curve of 0.81. Moreover, in side-by-side comparisons, SVM-SulfoSite performed as well as or better than existing sulfenylation prediction tools. Together, these results suggest that our method represents a robust and complementary technique for advanced exploration of protein S-sulfenylation.
Year: 2018 PMID: 30050050 PMCID: PMC6062547 DOI: 10.1038/s41598-018-29126-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. In the presence of an oxidizing agent (ROOH), the thiolate anion of a redox-sensitive Cys (Cys-S−) is reversibly oxidized to form a sulfenic acid (SOH). When in close proximity to another reactive Cys (SH), either in the same protein molecule or in another protein, SOH leads to disulfide bond (S-S) formation. In addition to Cys residues in proteins, SOH can also react with the cellular antioxidant, glutathione (γ-Glu-Cys-Gly; GSH) to form a mixed S-S bond (PSSG). Aside from altering the chemical properties of the Cys residue and the tertiary structure of the protein, S-S bonds are also believed to prevent terminal oxidation to sulfinic (SO2H) and sulfonic (SO3H) acid. Disulfide bonds can be reduced back to the thiol by cellular antioxidant enzymes, such as glutaredoxin (Grx) or thioredoxin (Trx).
Results of 10-fold cross-validation using individual and cumulative features.
| Features | ACC (%) | SN (%) | SP (%) | MCC |
|---|---|---|---|---|
| BE | 74 | 81 | 69 | 0.49 |
| AAindex | 70 | 73 | 66 | 0.39 |
| KSAAP | 76 | 85 | 67 | 0.53 |
| AAC | 65 | 76 | 55 | 0.31 |
| HQI | 70 | 73 | 66 | 0.39 |
| All Features | 89 | 95 | 83 | 0.79 |
Figure 2. Receiver operating characteristic (ROC) curves for each of the five features used to develop our method, as well as for the final method utilizing all features (SVM-SulfoSite), under 10-fold cross-validation. The area under the curve (AUC) for each feature is given in parentheses.
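As a reminder of what the AUC values summarize, the area under a ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition; the function name, labels and scores are made up for illustration and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Made-up labels and classifier scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(roc_auc(labels, scores))  # 8 of the 9 pairs are ordered correctly
```

The quadratic pairwise loop is chosen for clarity; for large data sets a rank-based computation gives the same value more efficiently.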
Independent test results using individual and cumulative features.
| Features | ACC (%) | SN (%) | SP (%) | MCC |
|---|---|---|---|---|
| BE | 68 | 66 | 69 | 0.34 |
| AAindex | 65 | 68 | 63 | 0.30 |
| KSAAP | 65 | 72 | 62 | 0.32 |
| AAC | 61 | 74 | 53 | 0.26 |
| HQI | 68 | 71 | 66 | 0.35 |
| All Features | 74 | 62 | 80 | 0.42 |
Comparison of sulfenylation site predictors using 10-fold cross-validation.
| Predictor | ACC (%) | SN (%) | SP (%) | MCC |
|---|---|---|---|---|
| iSulf-Cys | 66 | 67 | 64 | 0.31 |
| MDD-SOH | 70 | 68 | 70 | 0.27 |
| SOHSite | 74 | 74 | 74 | 0.33 |
| SOHPRED | — | 59 | — | 0.28 |
| PRESS | 77 | 80 | 74 | — |
| SulCysSite | — | 62 | 81 | 0.45 |
| S-SulPred | 88 | 78 | 91 | 0.64 |
| SVM-SulfoSite | 89 | 95 | 83 | 0.79 |
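For readers unfamiliar with the evaluation protocol, 10-fold cross-validation partitions the data into ten folds and holds each one out once for testing while training on the other nine. The sketch below illustrates the splitting step only. Stratifying by class, the helper names and the toy labels are all assumptions made for illustration; the source does not state how the authors built their folds.

```python
import random

def stratified_k_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds with similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the indices of this class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices); each fold is held out exactly once."""
    folds = stratified_k_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            i for j, fold in enumerate(folds) if j != held_out for i in fold
        )
        yield train, test

# Toy labels: 50 positives and 50 negatives (hypothetical, not the paper's data).
labels = [1] * 50 + [0] * 50
splits = list(train_test_splits(labels, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # → 10 90 10
```

Per-fold metrics are then typically averaged, or the held-out predictions are pooled, to give a single cross-validated figure.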
Comparison of sulfenylation site predictors using an independent test set.
| Predictor | ACC (%) | SN (%) | SP (%) | MCC |
|---|---|---|---|---|
| iSulf-Cys | 64 | 69 | 66 | 0.33 |
| MDD-SOH | 71 | 71 | 71 | 0.30 |
| SOHSite | 69 | 72 | 69 | 0.28 |
| SOHPRED | — | 73 | 71 | 0.32 |
| PRESS | — | 68 | 69 | 0.27 |
| SulCysSite | — | 76 | 71 | 0.34 |
| S-SulPred | 72 | 75 | 71 | 0.43 |
| SVM-SulfoSite | 74 | 62 | 80 | 0.42 |
Figure 3. Schematic showing the workflow used to develop our method. KSAAP: k-space amino acid pairs; HQI: high-quality indices; BE: binary encoding; AAC: amino acid composition; AAindex: physicochemical amino acid properties; SVM: support vector machines; NCBI: National Center for Biotechnology Information.
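Three of the sequence-derived feature classes named in the workflow (BE, AAC and KSAAP) can be computed directly from a peptide window around a candidate cysteine. The following is a minimal sketch of common formulations of these encodings, not the authors' implementation. The function names, the 9-residue example window and the choice of k = 1 are hypothetical, and the AAindex and HQI features are omitted because they depend on external property tables.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def binary_encoding(window):
    """BE: one-hot encode each residue (20 values per position)."""
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector

def amino_acid_composition(window):
    """AAC: fraction of the window made up by each of the 20 amino acids."""
    return [window.count(aa) / len(window) for aa in AMINO_ACIDS]

def k_spaced_pairs(window, k):
    """KSAAP: normalized counts of residue pairs with k residues between them."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    n_pairs = len(window) - k - 1
    if n_pairs <= 0:
        return [0.0] * len(pairs)
    for i in range(n_pairs):
        pair = window[i] + window[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return [counts[p] / n_pairs for p in pairs]

# Hypothetical 9-residue window centered on a cysteine (not from the paper).
window = "AKLGCDEVS"
be = binary_encoding(window)
aac = amino_acid_composition(window)
ksaap = k_spaced_pairs(window, k=1)
print(len(be), len(aac), len(ksaap))  # → 180 20 400
```

Concatenating such vectors yields one fixed-length feature vector per candidate site, which is the kind of input an SVM classifier expects.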