Meenal Chaudhari, Niraj Thapa, Hamid Ismail, Sandhya Chopade, Doina Caragea, Maja Köhn, Robert H Newman, Dukka B Kc.
Abstract
Phosphorylation, which is mediated by protein kinases and opposed by protein phosphatases, is an important post-translational modification that regulates many cellular processes, including cellular metabolism, cell migration, and cell division. Due to its essential role in cellular physiology, a great deal of attention has been devoted to identifying sites of phosphorylation on cellular proteins and understanding how modification of these sites affects their cellular functions. This has led to the development of several computational methods designed to predict sites of phosphorylation based on a protein's primary amino acid sequence. In contrast, much less attention has been paid to dephosphorylation and its role in regulating the phosphorylation status of proteins inside cells. Indeed, to date, dephosphorylation site prediction tools have been restricted to a few tyrosine phosphatases. To fill this knowledge gap, we have employed a transfer learning strategy to develop a deep learning-based model to predict sites that are likely to be dephosphorylated. Based on independent test results, our model, which we termed DTL-DephosSite, achieved efficiency scores for phosphoserine/phosphothreonine residues of 84%, 84% and 0.68 with respect to sensitivity (SN), specificity (SP) and Matthews correlation coefficient (MCC), respectively. Similarly, DTL-DephosSite exhibited efficiency scores of 75%, 88% and 0.64 for phosphotyrosine residues with respect to SN, SP, and MCC, respectively.
Keywords: computational prediction; deep learning; dephosphorylation; post-translational modification; transfer learning
Year: 2021 PMID: 34249915 PMCID: PMC8264445 DOI: 10.3389/fcell.2021.662983
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. Phosphorylation and dephosphorylation, mediated by kinases and phosphatases, respectively, constitute a key reversible post-translational modification.
Summary of the training and test datasets used for model development based on sites extracted from the DEPOD-19, Downreg (literature resources) and composite ComDephos datasets.
| Dataset | Residue | Train | Test | Total positive | Total negative |
| DEPOD-19 | ST | 304 | 78 | 191 | 191 |
| DEPOD-19 | Y | 161 | 41 | 101 | 101 |
| Downreg | ST | 1,478 | 370 | 924 | 924 |
| Downreg | Y | 40 | 10 | 25 | 25 |
| ComDephos | ST | 1,806 | 446 | 1,112 | 1,112 |
| ComDephos | Y | 201 | 50 | 125 | 125 |
FIGURE 2. Schematic illustrating the Bi-LSTM deep learning architecture and the parameters used. The input sequence is first fed into an embedding layer with a dimension of 21, then passed through two Bi-LSTM layers with 128 neurons, followed by a time-distributed layer of 128 neurons, a flatten layer, and finally a dense layer with 2 neurons and softmax activation.
Parameters used in the LSTM model for dephosphorylation site prediction.
| Parameters | Settings |
| Embedding output dimension | 21 |
| Learning rate | 0.01 |
| Batch size | 512 |
| Epochs | 30 |
| LSTM_layer1_neurons | 128 |
| Dropout | 0.4 |
| Dense_layer_neurons | 128, 64, 2 |
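To give a rough sense of the model size implied by the Figure 2 caption, the following minimal sketch counts trainable parameters layer by layer using the standard LSTM and dense-layer formulas. It follows the layer order in the caption (embedding, two Bi-LSTM layers, time-distributed dense, flatten, 2-unit softmax output) rather than the three dense sizes listed in the table. The window length of 33 and the vocabulary size of 21 are assumptions made for illustration, and all function names are hypothetical.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Trainable parameters of one LSTM direction: four gates, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units


def count_parameters(window: int = 33, vocab: int = 21, embed_dim: int = 21,
                     lstm_units: int = 128, td_units: int = 128,
                     classes: int = 2) -> dict:
    """Per-layer parameter counts for the stack in the Figure 2 caption.

    `window` and `vocab` are assumptions; the remaining defaults are taken
    from the caption and the parameter table.
    """
    counts = {
        "embedding": vocab * embed_dim,
        # A bidirectional layer holds one forward and one backward LSTM.
        "bilstm_1": 2 * lstm_params(embed_dim, lstm_units),
        # The second Bi-LSTM receives the concatenated 2 * lstm_units outputs.
        "bilstm_2": 2 * lstm_params(2 * lstm_units, lstm_units),
        # A time-distributed dense layer shares its weights across positions.
        "time_distributed": dense_params(2 * lstm_units, td_units),
        # Flatten adds no parameters; it yields window * td_units features.
        "output": dense_params(window * td_units, classes),
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    for layer, n in count_parameters().items():
        print(f"{layer:>16}: {n:,}")
```

Dropout is omitted because it has no trainable parameters. Under these assumptions, most of the parameters sit in the two Bi-LSTM layers, and only the output layer's size depends on the window length.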
FIGURE 3. Schematic illustrating the transfer learning strategy. The green dotted box depicts training on the source task, phosphorylation (S, T), to obtain the Phos-ST model. Once this Phos-Model was obtained, the Bi-LSTM model was instantiated with its weights before being trained on the dephosphorylation data. The blue dotted box depicts transfer learning on the target task, dephosphorylation of S/T residues, to obtain the DTL-DephosSite-ST model. During transfer learning, all layers were allowed to re-train and none were frozen (we tried freezing various combinations of layers, but this version produced the best results). The orange dotted box depicts transfer learning from DTL-DephosSite-ST to obtain the DTL-DephosSite-Y model.
Performance of the deep learning model on the DEPOD-19 and ComDephos datasets.
| Dataset | MCC | Specificity | Sensitivity | ROC_AUC |
| DEPOD-19 | 0.36 | 0.49 | 0.85 | 0.79 |
| ComDephos | 0.46 | 0.71 | 0.76 | 0.81 |
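The sensitivity, specificity and MCC values reported throughout can all be derived from the four confusion-matrix counts. The sketch below shows the standard definitions. The counts in the usage example are hypothetical (100 positives and 100 negatives, chosen only to reproduce the rates quoted in the abstract) and are not the paper's actual confusion matrices. The function name is likewise an assumption.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true sites recovered
    specificity = tn / (tn + fp)          # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sensitivity, "SP": specificity, "ACC": accuracy, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical balanced test sets of 100 positives and 100 negatives.
    st = binary_metrics(tp=84, fp=16, tn=84, fn=16)
    y = binary_metrics(tp=75, fp=12, tn=88, fn=25)
    print({k: round(v, 2) for k, v in st.items()})
    print({k: round(v, 2) for k, v in y.items()})
```

On a balanced set with SN = SP = 0.84, the MCC works out to 0.68, and SN = 0.75 with SP = 0.88 gives roughly 0.64. Both agree with the figures quoted in the abstract.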
Five-fold cross-validation of various window sizes for prediction of S/T residues following transfer learning using Phos-Model (source) and ComDephos dataset (target).
| Window size | MCC ± SD | Specificity ± SD | Sensitivity ± SD | Accuracy ± SD | ROC_AUC |
| 23 | 0.58 ± 0.05 | 0.78 ± 0.04 | 0.80 ± 0.01 | 0.79 ± 0.02 | 0.86 |
| 25 | 0.60 ± 0.04 | 0.78 ± 0.02 | 0.82 ± 0.03 | 0.80 ± 0.02 | 0.86 |
| 27 | 0.60 ± 0.05 | 0.79 ± 0.04 | 0.81 ± 0.02 | 0.80 ± 0.02 | 0.87 |
| 29 | 0.82 ± 0.03 | 0.80 ± 0.02 | 0.86 | ||
| 31 | 0.77 ± 0.03 | 0.80 ± 0.02 | |||
| 33 | 0.60 ± 0.05 | 0.78 ± 0.04 | 0.82 ± 0.03 | 0.80 ± 0.02 | 0.87 |
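The window sizes compared above are most naturally read as the number of residues in a sequence fragment centred on the candidate S/T/Y site. The following minimal sketch shows how such windows might be extracted and integer-encoded before the embedding layer. The padding symbol, the 21-token vocabulary (20 standard amino acids plus padding) and the mapping of non-standard residues to the padding index are all assumptions, not details confirmed by the source.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # assumed padding symbol for positions beyond the sequence ends
TOKEN_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS + PAD)}  # 21 tokens


def extract_windows(sequence: str, residues: str = "ST", window: int = 33):
    """Return (1-based position, window) pairs centred on each target residue."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the site sits at the centre")
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    sites = []
    for pos, aa in enumerate(sequence):
        if aa in residues:
            # padded[pos + half] is sequence[pos], so this slice is centred on it.
            sites.append((pos + 1, padded[pos:pos + window]))
    return sites


def encode(window_seq: str) -> list:
    """Map residues to integer indices; unknown symbols share the pad index."""
    return [TOKEN_INDEX.get(aa, TOKEN_INDEX[PAD]) for aa in window_seq]


if __name__ == "__main__":
    for position, fragment in extract_windows("MKSAT", residues="ST", window=5):
        print(position, fragment, encode(fragment))
```

An odd window keeps the candidate residue exactly at the centre, which is consistent with every window size in the table being odd.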
Comparison between DTL-DephosSite-ST and transfer-learned models developed using other deep learning architectures based on an independent test set.
| Architecture | MCC | Specificity | Sensitivity | ROC_AUC |
| CNN | 0.60 | 0.74 | 0.89 | |
| LSTM | 0.64 | 0.79 | 0.85 | 0.86 |
| DeepPhos (DC-CNN) (Luo et al., 2019) | 0.64 | 0.82 | 0.83 | 0.89 |
| DTL-DephosSite-ST (Bi-LSTM) | 0.68 | 0.84 | 0.84 | |
Five-fold cross-validation of various window sizes for prediction of Y residues following transfer learning using DTL-DephosSite-ST (source) and ComDephos dataset (target).
| Window size | MCC ± SD | Specificity ± SD | Sensitivity ± SD | Accuracy ± SD | ROC_AUC |
| 23 | 0.53 ± 0.09 | 0.76 ± 0.11 | 0.76 ± 0.07 | 0.76 ± 0.04 | 0.81 |
| 25 | 0.49 ± 0.13 | 0.76 ± 0.12 | 0.72 ± 0.09 | 0.74 ± 0.06 | 0.79 |
| 27 | 0.78 ± 0.10 | 0.79 ± 0.03 | 0.82 | ||
| 29 | 0.50 ± 0.07 | 0.74 ± 0.06 | 0.76 ± 0.06 | 0.75 ± 0.03 | 0.82 |
| 31 | 0.76 ± 0.06 | ||||
| 33 | 0.58 ± 0.08 | 0.78 ± 0.07 | 0.79 ± 0.05 | 0.82 |
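The "± SD" columns in the cross-validation tables summarise a metric over five folds. The sketch below shows one simple way to generate five-fold splits and report a mean and standard deviation. It uses plain shuffled folds without class stratification and the sample standard deviation. Both choices are assumptions, since the source does not say how folds were drawn or which SD estimator was used.

```python
import random
import statistics


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


def summarize(scores):
    """Mean and sample standard deviation of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    for fold, (train, val) in enumerate(k_fold_indices(201, k=5), start=1):
        print(f"fold {fold}: {len(train)} train, {len(val)} validation")
    # Hypothetical per-fold MCC values, for illustration only.
    mean, sd = summarize([0.50, 0.55, 0.60, 0.52, 0.58])
    print(f"MCC = {mean:.2f} ± {sd:.2f}")
```

With training sets as small as the phosphotyrosine one, each validation fold holds only a few dozen sites, which helps explain why the SDs for Y residues are larger than those for S/T residues.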
Independent test results of DeepPhos (Luo et al., 2019), DTL-DephosSite-ST and DTL-DephosSite-Y on the ComDephos independent test set, using the optimized parameters.
| Predictor | MCC | Specificity | Sensitivity | Accuracy | ROC_AUC |
| DeepPhos | 0.44 | 0.48 | 0.70 | 0.86 | |
| DTL-DephosSite-ST | 0.68 | 0.84 | 0.84 | | |
| DTL-DephosSite-Y | 0.64 | 0.88 | 0.75 | 0.82 | 0.89 |