Multitask learning for protein subcellular location prediction.

Qian Xu, Sinno Jialin Pan, Hannah Hong Xue, Qiang Yang.

Abstract

Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Thus, accurate prediction of the subcellular localization of proteins can help the prediction of protein functions and genome annotations, as well as the identification of drug targets. Machine learning methods such as Support Vector Machines (SVMs) have been used in the past for the problem of protein subcellular localization, but have been shown to suffer from a lack of annotated training data in each species under study. To overcome this data sparsity problem, we observe that because some of the organisms may be related to each other, there may be commonalities across different organisms that can be discovered and used to help boost the data in each localization task. In this paper, we formulate the protein subcellular localization problem as one of multitask learning across different organisms. We adapt and compare two specializations of multitask learning algorithms on 20 different organisms. Our experimental results show that multitask learning performs much better than the traditional single-task methods. Among the different multitask learning methods, we found that the multitask kernels and supertype kernels, which share parameters across tasks, perform slightly better than multitask learning by sharing latent features. The most significant improvement in terms of localization accuracy is about 25 percent. We find that if the organisms are very different or only remotely related from a biological point of view, then jointly training the multiple models cannot lead to significant improvement. However, if they are closely related biologically, multitask learning can do much better than individual learning.
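To make the idea of sharing parameters through a multitask kernel concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common textbook form of multitask kernel, in which a base kernel between two samples is scaled by (1/mu + 1) when both samples come from the same task (here, the same organism) and by 1/mu otherwise. The RBF base kernel, the parameter names mu and gamma, the organism labels, and the toy feature vectors are all illustrative assumptions; the abstract does not specify any of them.

```python
import math

def rbf(x, z, gamma=1.0):
    """Base RBF kernel between two feature vectors (an assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def multitask_kernel(x, task_x, z, task_z, mu=1.0, gamma=1.0):
    """Assumed multitask kernel: (1/mu + [same task]) * base kernel.

    A small mu makes the cross-task term dominate, so all organisms behave
    almost like one pooled task; a large mu weakens sharing between tasks.
    """
    same_task = 1.0 if task_x == task_z else 0.0
    return (1.0 / mu + same_task) * rbf(x, z, gamma)

def gram_matrix(samples, mu=1.0, gamma=1.0):
    """Gram matrix over (features, organism) pairs.

    The result could be passed to any kernel classifier that accepts a
    precomputed kernel, such as an SVM.
    """
    return [
        [multitask_kernel(xi, ti, xj, tj, mu, gamma) for (xj, tj) in samples]
        for (xi, ti) in samples
    ]

# Hypothetical toy data: (feature vector, organism label).
samples = [
    ([0.0, 1.0], "organism_a"),
    ([0.0, 1.0], "organism_b"),
    ([1.0, 0.0], "organism_a"),
]
K = gram_matrix(samples, mu=1.0, gamma=1.0)
```

In this sketch, identical feature vectors from the same organism get a larger kernel value than identical vectors from different organisms, which is how the kernel lets related tasks borrow training data from each other while still keeping a task-specific component.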

Year:  2011        PMID: 20421687     DOI: 10.1109/TCBB.2010.22

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  8 in total

1.  Automated protein subcellular localization based on local invariant features.

Authors:  Chao Li; Xue-hong Wang; Li Zheng; Ji-feng Huang
Journal:  Protein J       Date:  2013-03       Impact factor: 2.371

2.  An ensemble classifier for eukaryotic protein subcellular location prediction using gene ontology categories and amino acid hydrophobicity.

Authors:  Liqi Li; Yuan Zhang; Lingyun Zou; Changqing Li; Bo Yu; Xiaoqi Zheng; Yue Zhou
Journal:  PLoS One       Date:  2012-01-30       Impact factor: 3.240

3.  Using multitask classification methods to investigate the kinase-specific phosphorylation sites.

Authors:  Shan Gao; Shuo Xu; Yaping Fang; Jianwen Fang
Journal:  Proteome Sci       Date:  2012-06-21       Impact factor: 2.480

4.  Pan-cancer classification by regularized multi-task learning.

Authors:  Sk Md Mosaddek Hossain; Lutfunnesa Khatun; Sumanta Ray; Anirban Mukhopadhyay
Journal:  Sci Rep       Date:  2021-12-20       Impact factor: 4.379

5.  Mining Proteins with Non-Experimental Annotations Based on an Active Sample Selection Strategy for Predicting Protein Subcellular Localization.

Authors:  Junzhe Cao; Wenqi Liu; Jianjun He; Hong Gu
Journal:  PLoS One       Date:  2013-06-26       Impact factor: 3.240

6.  An ensemble method approach to investigate kinase-specific phosphorylation sites.

Authors:  Sutapa Datta; Subhasis Mukhopadhyay
Journal:  Int J Nanomedicine       Date:  2014-05-10

7.  Knowledge-transfer learning for prediction of matrix metalloprotease substrate-cleavage sites.

Authors:  Yanan Wang; Jiangning Song; Tatiana T Marquez-Lago; André Leier; Chen Li; Trevor Lithgow; Geoffrey I Webb; Hong-Bin Shen
Journal:  Sci Rep       Date:  2017-07-18       Impact factor: 4.379

8.  Comparative Evaluation of Machine Learning Strategies for Analyzing Big Data in Psychiatry.

Authors:  Han Cao; Andreas Meyer-Lindenberg; Emanuel Schwarz
Journal:  Int J Mol Sci       Date:  2018-10-29       Impact factor: 5.923
