Alok Sharma1,2,3, Artem Lysenko4, Yosvany López5, Abdollah Dehzangi6, Ronesh Sharma7,8, Hamendra Reddy7, Abdul Sattar9, Tatsuhiko Tsunoda10,11,12. 1. Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD-4111, Australia. alok.sharma@griffith.edu.au. 2. Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. alok.sharma@griffith.edu.au. 3. School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands. alok.sharma@griffith.edu.au. 4. Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. 5. Genesis Institute of Genetic Research, Genesis Healthcare Co, Tokyo, Japan. 6. Department of Computer Science, Morgan State University, Baltimore, MD, USA. 7. School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands. 8. School of Electrical and Electronics Engineering, Fiji National University, Suva, Fiji. 9. Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD-4111, Australia. 10. Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. tsunoda.mesm@mri.tmd.ac.jp. 11. Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan. tsunoda.mesm@mri.tmd.ac.jp. 12. CREST, JST, Tokyo, 113-8510, Japan. tsunoda.mesm@mri.tmd.ac.jp.
Abstract
BACKGROUND: Post-translational modifications are viewed as an important mechanism for controlling protein function and are believed to be involved in multiple important diseases. However, profiling them with laboratory-based techniques remains challenging. Therefore, the development of accurate computational methods to predict post-translational modifications is particularly important for making progress in this area of research. RESULTS: This work explores the use of four half-sphere exposure-based features for computational prediction of sumoylation sites. Unlike most previously proposed approaches, which focused on patterns of amino acid co-occurrence, we demonstrate that protein structure-based features can be sufficiently informative to achieve good predictive performance. The evaluation of our method demonstrated high sensitivity (0.9), accuracy (0.89) and Matthews correlation coefficient (0.78-0.79). We compared these results to the recently released pSumo-CD method and demonstrated better performance of our method on the same evaluation dataset. CONCLUSIONS: The proposed predictor, HseSUMO, uses half-sphere exposures of amino acids to predict sumoylation sites. It has shown promising results on a benchmark dataset when compared with the state-of-the-art method. The extracted data of this study can be accessed at https://github.com/YosvanyLopez/HseSUMO.
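For readers unfamiliar with the evaluation measures quoted in the RESULTS, the following minimal sketch shows how sensitivity, accuracy and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. The function name `confusion_metrics` and the counts used are hypothetical illustrations, not the authors' code or data; the counts assume a balanced set of 100 positive and 100 negative sites, chosen only so that sensitivity and accuracy match the reported 0.9 and 0.89.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, accuracy, MCC) for binary confusion-matrix counts."""
    # Sensitivity: fraction of true sites that are predicted as sites.
    sensitivity = tp / (tp + fn)
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, accuracy, mcc


# Hypothetical counts (assumption: balanced classes), not taken from the study.
sens, acc, mcc = confusion_metrics(tp=90, fn=10, tn=88, fp=12)
print(f"sensitivity={sens:.2f} accuracy={acc:.2f} MCC={mcc:.2f}")
```

Under the balanced-class assumption, these illustrative counts give an MCC of about 0.78, in line with the range quoted in the abstract; the actual class balance of the benchmark dataset may differ.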