Van-Minh Bui, Shun-Long Weng, Cheng-Tsung Lu, Tzu-Hao Chang, Julia Tzu-Ya Weng, Tzong-Yi Lee.
Abstract
BACKGROUND: Protein S-sulfenylation is a type of post-translational modification (PTM) involving the covalent binding of a hydroxyl group to the thiol of a cysteine residue. Recent evidence has shown the importance of S-sulfenylation in various biological processes, including transcriptional regulation, apoptosis and cytokine signaling. Determining the specific sites of S-sulfenylation is fundamental to understanding the structures and functions of S-sulfenylated proteins. However, the current lack of reliable tools often restricts researchers to expensive and time-consuming laboratory techniques for identifying S-sulfenylation sites. We were therefore motivated to develop a bioinformatics method for investigating S-sulfenylation sites based on amino acid compositions and physicochemical properties.
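As an illustration of what a composition-based feature for a candidate cysteine might look like, the sketch below extracts a sequence window around a site and computes its 20-dimensional amino acid composition. The function names, the flank size, the `-` padding character and the example sequence are all assumptions made for this illustration; the window length and encoding details actually used by the authors are not stated in this record.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_window(sequence, position, flank):
    """Return the residues from position - flank to position + flank (0-based).

    Positions that fall outside the sequence are padded with '-'.
    The flank size is a free parameter here, not a value taken from the paper.
    """
    window = []
    for i in range(position - flank, position + flank + 1):
        window.append(sequence[i] if 0 <= i < len(sequence) else "-")
    return "".join(window)


def aac_features(window):
    """Return the fraction of each standard amino acid in the window.

    Padding characters and non-standard symbols are ignored.
    """
    residues = [r for r in window.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


# Hypothetical example: a short made-up sequence with a cysteine at index 2.
example_window = extract_window("MKCA", position=2, flank=3)
example_features = aac_features(example_window)
```

A vector like `example_features` could then be passed, alone or concatenated with other encodings, to a classifier such as an SVM.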
Year: 2016 PMID: 26819243 PMCID: PMC4895302 DOI: 10.1186/s12864-015-2299-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Analytical flowchart of SOHSite, including data collection and preprocessing, feature extraction and encoding, model construction and evaluation, and independent testing
Data statistics of S-sulfenylated and non-S-sulfenylated sites
| Data resource | S-sulfenylated sites | Non-S-sulfenylated sites |
|---|---|---|
| UniProtKB 201412 | 17 | 92 |
| RedoxDB v1 (PMID: 22833525) | 102 | 401 |
| Yang J et al., 2014 | 1,443 | 10,521 |
| Other literature | 33 | 143 |
| Combined non-redundant dataset | | |
| Training | 1,145 | 8,368 |
| Independent testing | 289 | 2,108 |
Fig. 2 Amino acid composition of protein S-sulfenylation sites. (a) Comparison of amino acid composition between S-sulfenylation sites (blue) and non-S-sulfenylation sites (red). (b) Position-specific amino acid composition of S-sulfenylation sites. (c) Position-specific amino acid composition of non-S-sulfenylation sites. (d) TwoSampleLogo between S-sulfenylation sites (positive data) and non-S-sulfenylation sites (negative data)
Fig. 3 Comparison of the solvent-accessible surface area between S-sulfenylation and non-S-sulfenylation sites
Five-fold cross validation results for SVM models trained with various features individually
| Training features | Sn | Sp | Acc | MCC |
|---|---|---|---|---|
| 20D Binary code | 0.66 | 0.68 | 0.68 | 0.23 |
| BLOSUM62 | 0.68 | 0.70 | 0.69 | 0.26 |
| Amino Acid Composition (AAC) | 0.64 | 0.65 | 0.65 | 0.19 |
| Amino Acid Pair Composition (AAPC) | 0.64 | 0.67 | 0.67 | 0.21 |
| Accessible Surface Area (ASA) | 0.60 | 0.61 | 0.61 | 0.14 |
| Secondary structure (SS) | 0.56 | 0.56 | 0.56 | 0.08 |
| Position Weight Matrix (PWM) | 0.64 | 0.66 | 0.66 | 0.20 |
| Position-specific scoring matrix (PSSM) | 0.71 | 0.72 | 0.72 | 0.30 |
A total of 1,145 positive sites and 8,368 negative sites were used in the cross-validation process. Sn, sensitivity; Sp, specificity; Acc, accuracy; MCC, Matthews correlation coefficient
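For readers who want to relate the four reported measures to one another, the sketch below computes Sn, Sp, Acc and MCC from confusion-matrix counts using their standard definitions. The function name and the example counts are assumptions: the counts are not reported in this record and were back-calculated here from the rounded Sn and Sp of the PSSM row and the training set sizes, purely for illustration.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC from raw counts."""
    sn = tp / (tp + fn)                      # fraction of positives recovered
    sp = tn / (tn + fp)                      # fraction of negatives recovered
    acc = (tp + tn) / (tp + fn + tn + fp)    # fraction of all sites classified correctly
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; return 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Illustrative counts only (assumed, not from the paper):
# 1,145 positives at Sn of about 0.71 and 8,368 negatives at Sp of about 0.72.
pssm_like = binary_metrics(tp=813, fn=332, tn=6025, fp=2343)
```

With roughly seven negatives for every positive, false positives outnumber true positives even at a specificity near 0.72, which is why MCC stays low while Sn, Sp and Acc all sit around 0.7.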
Fig. 4 The predictive performance of the PSSM model combined with forward selection of the top 20 physicochemical properties
Fig. 5 Comparison of the independent testing results between the PSSM model and the hybrid model combining PSSM with the top 12 physicochemical properties
Fig. 6 Discrimination of S-sulfenylation sites from S-nitrosylation and S-glutathionylation sites. (a) Number of duplicate proteins among S-sulfenylation, S-nitrosylation and S-glutathionylation; (b) Number of duplicate sites among S-sulfenylation, S-nitrosylation and S-glutathionylation; (c) Significant differences in position-specific compositions among the three PTMs as identified by TwoSampleLogo