mLASSO-Hum: A LASSO-based interpretable human-protein subcellular localization predictor.

Shibiao Wan, Man-Wai Mak, Sun-Yuan Kung.

Abstract

Knowing the subcellular compartments of human proteins is essential to shed light on the mechanisms of a broad range of human diseases. In computational methods for protein subcellular localization, knowledge-based methods (especially gene ontology (GO) based methods) are known to perform better than sequence-based methods. However, existing GO-based predictors often lack interpretability and suffer from overfitting due to the high dimensionality of feature vectors. To address these problems, this paper proposes an interpretable multi-label predictor, namely mLASSO-Hum, which can yield sparse and interpretable solutions for large-scale prediction of human protein subcellular localization. By using the one-vs-rest LASSO-based classifiers, 87 out of more than 8000 GO terms are found to play more significant roles in determining the subcellular localization. Based on these 87 essential GO terms, we can decide not only where a protein resides within a cell, but also why it is located there. To further exploit information from the remaining GO terms, a method based on the GO hierarchical information derived from the depth distance of GO terms is proposed. Experimental results show that mLASSO-Hum performs significantly better than state-of-the-art predictors. We also found that in addition to the GO terms from the cellular component category, GO terms from the other two categories also play important roles in the final classification decisions. For readers' convenience, the mLASSO-Hum server is available online at http://bioinfo.eie.polyu.edu.hk/mLASSOHumServer/.
Copyright © 2015 Elsevier Ltd. All rights reserved.
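
The one-vs-rest LASSO idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy feature matrix, the placeholder names `term_0` to `term_4` (standing in for GO terms), the two locations, the penalty `alpha = 0.1`, and the "score above zero, else top-scoring location" decision rule are all assumptions made for illustration. The sketch fits one L1-penalised linear model per location by coordinate descent and reports which terms keep a non-zero weight, which is what makes the result sparse and readable.

```python
# Toy illustration only: data, term names, alpha and the decision rule are
# assumptions, not taken from the mLASSO-Hum paper.

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the LASSO proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(X, y, alpha, n_iter=500):
    """Coordinate descent for (1/2n)*||y - b - Xw||^2 + alpha*||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = sum(y) / n
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # a term that never occurs keeps weight zero
            rho = 0.0
            for i in range(n):
                # prediction that leaves out feature j
                partial = b + sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
            w[j] = soft_threshold(rho / n, alpha) / col_sq[j]
        # unpenalised intercept: mean of the remaining residual
        b = sum(y[i] - sum(X[i][k] * w[k] for k in range(d)) for i in range(n)) / n
    return w, b


# Hypothetical placeholder names for GO-term features and locations.
TERMS = ["term_0", "term_1", "term_2", "term_3", "term_4"]
LOCATIONS = ["nucleus", "mitochondrion"]

# Rows are proteins; 1 means the protein is annotated with that term.
X = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
]
# Multi-label ground truth: a protein may reside in more than one location.
Y = [
    {"nucleus"},
    {"nucleus"},
    {"mitochondrion"},
    {"mitochondrion"},
    {"nucleus", "mitochondrion"},
    {"nucleus", "mitochondrion"},
]


def train(X, Y, alpha=0.1):
    """Fit one LASSO model per location (one-vs-rest, targets +1 / -1)."""
    models = {}
    for loc in LOCATIONS:
        targets = [1.0 if loc in labels else -1.0 for labels in Y]
        models[loc] = lasso_fit(X, targets, alpha)
    return models


def selected_terms(models, tol=1e-8):
    """Terms with a non-zero weight, per location."""
    return {
        loc: [t for t, wj in zip(TERMS, w) if abs(wj) > tol]
        for loc, (w, _b) in models.items()
    }


def predict(models, x):
    """Every location whose score is positive, else the top-scoring one."""
    scores = {
        loc: b + sum(wj * xj for wj, xj in zip(w, x))
        for loc, (w, b) in models.items()
    }
    chosen = {loc for loc, s in scores.items() if s > 0}
    return chosen or {max(scores, key=scores.get)}


models = train(X, Y)
selected = selected_terms(models)
print(selected)
print(predict(models, [1, 1, 0, 0, 0]))
```

In this toy setting each location ends up depending on a single term, so a prediction can be traced back to the annotation that triggered it. A real predictor would work with thousands of GO-derived features and a tuned penalty.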

Keywords:  Depth-dependent information; Interpretable prediction; Multi-label classification; Protein subcellular localization; Sparse solutions

Year: 2015   PMID: 26164062   DOI: 10.1016/j.jtbi.2015.06.042

Source DB: PubMed   Journal: J Theor Biol   ISSN: 0022-5193   Impact factor: 2.691


  7 in total

1.  The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins.

Authors:  Lei Wang; Yaou Zhao; Yuehui Chen; Dong Wang
Journal:  Bioengineered       Date:  2017-11-22       Impact factor: 3.269

2.  Protein Subcellular Localization Prediction.

Authors:  Elettra Barberis; Emilio Marengo; Marcello Manfredi
Journal:  Methods Mol Biol       Date:  2021

3.  Machine and Deep Learning for Prediction of Subcellular Localization.

Authors:  Gaofeng Pan; Chao Sun; Zijun Liao; Jijun Tang
Journal:  Methods Mol Biol       Date:  2021

4.  Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins.

Authors:  Shibiao Wan; Man-Wai Mak; Sun-Yuan Kung
Journal:  BMC Bioinformatics       Date:  2016-02-24       Impact factor: 3.169

5.  Prediction of subcellular location of apoptosis proteins by incorporating PsePSSM and DCCA coefficient based on LFDA dimensionality reduction.

Authors:  Bin Yu; Shan Li; Wenying Qiu; Minghui Wang; Junwei Du; Yusen Zhang; Xing Chen
Journal:  BMC Genomics       Date:  2018-06-19       Impact factor: 3.969

6.  A Novel Protein Subcellular Localization Method With CNN-XGBoost Model for Alzheimer's Disease.

Authors:  Long Pang; Junjie Wang; Lingling Zhao; Chunyu Wang; Hui Zhan
Journal:  Front Genet       Date:  2019-01-18       Impact factor: 4.599

7.  Benchmark data for identifying multi-functional types of membrane proteins.

Authors:  Shibiao Wan; Man-Wai Mak; Sun-Yuan Kung
Journal:  Data Brief       Date:  2016-05-21