| Literature DB >> 29583068 |
Huan Hu1, Li Zhang1, Haixin Ai1,2,3, Hui Zhang1, Yetian Fan4, Qi Zhao4, Hongsheng Liu1,2,3.
Abstract
LncRNA plays an important role in many biological and disease progression by binding to related proteins. However, the experimental methods for studying lncRNA-protein interactions are time-consuming and expensive. Although there are a few models designed to predict the interactions of ncRNA-protein, they all have some common drawbacks that limit their predictive performance. In this study, we present a model called HLPI-Ensemble designed specifically for human lncRNA-protein interactions. HLPI-Ensemble adopts the ensemble strategy based on three mainstream machine learning algorithms of Support Vector Machines (SVM), Random Forests (RF) and Extreme Gradient Boosting (XGB) to generate HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble, respectively. The results of 10-fold cross-validation show that HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble achieved AUCs of 0.95, 0.96 and 0.96, respectively, in the test dataset. Furthermore, we compared the performance of the HLPI-Ensemble models with the previous models through external validation dataset. The results show that the false positives (FPs) of HLPI-Ensemble models are much lower than that of the previous models, and other evaluation indicators of HLPI-Ensemble models are also higher than those of the previous models. It is further showed that HLPI-Ensemble models are superior in predicting human lncRNA-protein interaction compared with previous models. The HLPI-Ensemble is publicly available at: http://ccsipb.lnu.edu.cn/hlpiensemble/ .Entities:
Keywords: bioinformatics; ensemble strategy; lncRNA; lncRNA-protein interaction; protein
Mesh:
Substances:
Year: 2018 PMID: 29583068 PMCID: PMC6152435 DOI: 10.1080/15476286.2018.1457935
Source DB: PubMed Journal: RNA Biol ISSN: 1547-6286 Impact factor: 4.652