Hua Tang1, Ya-Wei Zhao2, Ping Zou1, Chun-Mei Zhang1, Rong Chen1, Po Huang1, Hao Lin2.
Abstract
Hormone-binding proteins (HBPs) are soluble carrier proteins that selectively and non-covalently interact with hormones. HBPs play an important role in growth, but their function is still unclear. Correct recognition of HBPs is the first step toward studying their function and understanding their biological processes. However, it is difficult to recognize HBPs among the growing number of known proteins through traditional biochemical experiments because of the high cost and long duration of such experiments. To overcome these disadvantages, we designed a computational method for identifying HBPs accurately in this study. First, we collected HBP data from UniProt to establish a high-quality benchmark dataset. Based on the dataset, the dipeptide composition was extracted from HBP residue sequences. To find the optimal features that provide key clues for HBP identification, analysis of variance (ANOVA) was performed for feature ranking. The optimal features were selected through the incremental feature selection strategy. Subsequently, the features were input into a support vector machine (SVM) for prediction model construction. Jackknife cross-validation results showed that 88.6% of HBPs and 81.3% of non-HBPs were correctly recognized, suggesting that our proposed model was powerful. This study provides a new strategy to identify HBPs. Moreover, based on the proposed model, we established a webserver called HBPred, which can be freely accessed at http://lin-group.cn/server/HBPred.
Keywords: Benchmark dataset; Dipeptide composition; Feature selection; Hormone-binding protein; Webserver
Year: 2018 PMID: 29989085 PMCID: PMC6036759 DOI: 10.7150/ijbs.24174
Source DB: PubMed Journal: Int J Biol Sci ISSN: 1449-2288 Impact factor: 6.580
Figure 1. Schematic diagram of human growth hormone (red) binding to two HBPs (yellow) 4
Figure 3. Heat map or chromaticity diagram for the F-scores of the 400 dipeptides. Red elements indicate the dipeptides enriched in HBPs, whereas blue elements indicate the dipeptides enriched in non-HBPs.
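The feature-extraction and ranking steps described in the abstract (dipeptide composition followed by ANOVA-based ranking of the 400 dipeptides) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`dipeptide_composition`, `anova_f_score`, `rank_features`) are hypothetical, the frequencies are normalized by the number of valid dipeptides in each sequence (an assumption), and the F-score is the standard one-way ANOVA ratio of between-group to within-group mean squares.

```python
from itertools import product

# The 20 standard amino acids give 20 x 20 = 400 possible dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: pairs containing non-standard residues are skipped and
    the counts are divided by the number of valid pairs.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def anova_f_score(groups):
    """One-way ANOVA F statistic for a single feature.

    `groups` is a list of value lists, one per class. The total sample
    count must exceed the number of groups.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        # No within-class variation: perfectly separated or constant.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def rank_features(pos_seqs, neg_seqs):
    """Rank the 400 dipeptides by F-score, highest first."""
    pos = [dipeptide_composition(s) for s in pos_seqs]
    neg = [dipeptide_composition(s) for s in neg_seqs]
    scores = []
    for j, dipeptide in enumerate(DIPEPTIDES):
        f = anova_f_score([[v[j] for v in pos], [v[j] for v in neg]])
        scores.append((dipeptide, f))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

In an incremental feature selection scheme, features from such a ranked list would be added one at a time to a classifier (an SVM in the abstract), keeping the subset with the best cross-validated performance. That step is omitted here because it depends on the classifier and validation setup.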