HPOLabeler: improving prediction of human protein-phenotype associations by learning to rank.

Lizhi Liu^1,2,3, Xiaodi Huang^4, Hiroshi Mamitsuka^5,6, Shanfeng Zhu^1,2,3,7.

Abstract

MOTIVATION: Annotating human proteins with abnormal phenotypes has become an important topic. The Human Phenotype Ontology (HPO) is a standardized vocabulary of phenotypic abnormalities encountered in human diseases. As of November 2019, fewer than 4000 proteins had been annotated with HPO terms. A computational approach that accurately predicts protein-HPO associations would therefore be valuable, yet no method outperformed a simple Naive approach in the second Critical Assessment of Functional Annotation (CAFA2, 2013-2014).
RESULTS: We present HPOLabeler, which uses a wide variety of evidence, such as protein-protein interaction (PPI) networks, Gene Ontology, InterPro, trigram frequency and HPO term frequency, within a learning to rank (LTR) framework. LTR has proven powerful for solving large-scale, multi-label ranking problems in bioinformatics. Given an input protein, LTR outputs a ranked list of HPO terms based on the scores assigned to candidate HPO terms by component learning models (logistic regression, nearest neighbor and a Naive method), each trained on one of the evidence sources. We evaluate HPOLabeler extensively, mainly through two experiments, cross-validation and temporal validation, in both of which HPOLabeler significantly outperformed all component models and competing methods, including the current state-of-the-art method. We further found that (i) PPI is the most informative of the diverse data sources for prediction and (ii) the low prediction performance in temporal validation might be caused by incomplete annotation of new proteins.
AVAILABILITY AND IMPLEMENTATION: http://issubmission.sjtu.edu.cn/hpolabeler/.
CONTACT: zhusf@fudan.edu.cn.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
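
The ranking step described under RESULTS can be illustrated with a minimal sketch. It assumes each candidate HPO term of a protein has already been scored by three component models (logistic regression, nearest neighbor, Naive) and learns weights that place annotated terms above unannotated ones. The pairwise logistic linear ranker used here is a simple stand-in chosen for brevity; the abstract does not specify which LTR algorithm HPOLabeler uses. All names (`fit_pairwise_ranker`, `rank_terms`, `term_A`, and so on) and all numbers are hypothetical toy values, not data from the paper.

```python
import math

def dot(w, x):
    """Inner product of a weight vector and a feature vector."""
    return sum(wk * xk for wk, xk in zip(w, x))

def fit_pairwise_ranker(groups, n_features, epochs=200, lr=0.1):
    """Fit a linear scoring function with a pairwise logistic loss.

    groups: one list per training protein, each holding
            (component_scores, label) tuples for its candidate terms;
            label is 1 if the term is annotated to the protein, else 0.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for candidates in groups:
            for x_pos, y_pos in candidates:
                for x_neg, y_neg in candidates:
                    if y_pos <= y_neg:
                        continue  # keep only (annotated, unannotated) pairs
                    diff = [a - b for a, b in zip(x_pos, x_neg)]
                    # Gradient step on log(1 + exp(-w . diff))
                    g = 1.0 / (1.0 + math.exp(dot(w, diff)))
                    w = [wk + lr * g * dk for wk, dk in zip(w, diff)]
    return w

def rank_terms(w, term_scores):
    """Return candidate terms sorted by descending combined score."""
    return sorted(term_scores, key=lambda t: dot(w, term_scores[t]), reverse=True)

# Toy training data (assumed): feature order is
# [logistic regression score, nearest-neighbor score, Naive score].
training_groups = [
    [([0.9, 0.8, 0.3], 1), ([0.2, 0.1, 0.6], 0), ([0.4, 0.7, 0.1], 1)],
    [([0.1, 0.2, 0.3], 0), ([0.8, 0.6, 0.6], 1), ([0.3, 0.2, 0.1], 0)],
]
weights = fit_pairwise_ranker(training_groups, n_features=3)

# Component scores for the candidate terms of a new, unannotated protein.
new_protein = {
    "term_A": [0.7, 0.9, 0.3],
    "term_B": [0.1, 0.2, 0.6],
    "term_C": [0.5, 0.3, 0.1],
}
ranking = rank_terms(weights, new_protein)
print(ranking)  # term_A ranks first on this toy data
```

Grouping the training pairs by protein mirrors the setting in the abstract, where terms are ranked per input protein rather than across proteins.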

Year: 2020 PMID: 32379868 DOI: 10.1093/bioinformatics/btaa284

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

4 in total

1. An automated and combinative method for the predictive ranking of candidate effector proteins of fungal plant pathogens.

Authors: Darcy A B Jones; Lina Rozano; Johannes W Debler; Ricardo L Mancera; Paula M Moolhuijzen; James K Hane
Journal: Sci Rep Date: 2021-10-05 Impact factor: 4.379

2. Cardiovascular Phenotypes Profiling for L-Transposition of the Great Arteries and Prognosis Analysis.

Authors: Qiyu He; Huayan Shen; Xinyang Shao; Wen Chen; Yafeng Wu; Rui Liu; Shoujun Li; Zhou Zhou
Journal: Front Cardiovasc Med Date: 2022-01-21

3. Integration of Human Protein Sequence and Protein-Protein Interaction Data by Graph Autoencoder to Identify Novel Protein-Abnormal Phenotype Associations.

Authors: Yuan Liu; Ruirui He; Yingjie Qu; Yuan Zhu; Dianke Li; Xinping Ling; Simin Xia; Zhenqiu Li; Dong Li
Journal: Cells Date: 2022-08-10 Impact factor: 7.666

4. iPiDA-LTR: Identifying piwi-interacting RNA-disease associations based on Learning to Rank.

Authors: Wenxiang Zhang; Jialu Hou; Bin Liu
Journal: PLoS Comput Biol Date: 2022-08-15 Impact factor: 4.779