Jyun-Rong Wang¹, Wen-Lin Huang², Ming-Ju Tsai¹, Kai-Ti Hsu¹, Hui-Ling Huang¹,³, Shinn-Ying Ho¹,³
¹ Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan.
² Department and Institute of Industrial Engineering and Management, Minghsin University of Science and Technology, Hsinchu 300, Taiwan.
³ Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan.
Abstract
Motivation: Numerous ubiquitination sites remain undiscovered because of the limitations of mass spectrometry-based methods. Existing prediction methods train their models on randomly selected non-validated sites, treating them as non-ubiquitination sites even though some of them may be ubiquitination sites that have not yet been detected.

Results: We propose an evolutionary screening algorithm (ESA) that selects effective negatives from among non-validated sites, and an ESA-based prediction method, ESA-UbiSite, that identifies human ubiquitination sites. The ESA selects the non-validated sites least likely to be ubiquitination sites and uses them as training negatives. Both the ESA and ESA-UbiSite combine a set of well-selected physicochemical properties with a support vector machine for accurate prediction. Experimental results show that ESA-UbiSite trained with effective negatives achieved a test accuracy of 0.92 and a Matthews correlation coefficient of 0.48, outperforming existing prediction methods. The ESA increased ESA-UbiSite's test accuracy from 0.75 to 0.92 and can improve prediction methods for other post-translational modification sites.

Availability and Implementation: An ESA-UbiSite-based web server has been established at http://iclab.life.nctu.edu.tw/iclab_webtools/ESAUbiSite/.

Contact: syho@mail.nctu.edu.tw.

Supplementary information: Supplementary data are available at Bioinformatics online.
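The abstract reports both test accuracy and the Matthews correlation coefficient (MCC). As a reading aid, the following minimal sketch shows how the two metrics are computed from a binary confusion matrix and why a high accuracy can coexist with a moderate MCC when negatives outnumber positives. The function names and the confusion-matrix counts are hypothetical illustrations; the counts are invented and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all sites that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical, imbalanced test set (NOT taken from the paper):
    # 100 positive sites and 900 negative sites.
    tp, fn = 40, 60    # positives found / positives missed
    tn, fp = 880, 20   # negatives rejected / negatives wrongly flagged

    print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```

With these invented counts the abundant negatives dominate the accuracy, while the MCC stays well below 1 because most positives are missed. This is one reason both metrics are usually reported together on imbalanced site-prediction data.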