Self-evoluting framework of deep convolutional neural network for multilocus protein subcellular localization.

Hanhan Cong, Hong Liu, Yuehui Chen, Yi Cao.

Abstract

In this paper, a deep convolutional neural network (DCNN) is applied to multilocus protein subcellular localization because it is more suitable for multi-class classification. Two main problems arise in this application. First, appropriate features that capture the correlation between multiple sites are hard to find. Second, the classifier structure is difficult to determine because it is strongly affected by the distribution of the classified data. To solve these problems, a self-evoluting framework using DCNNs for multilocus protein subcellular localization is proposed. It has three characteristics that previous algorithms lack. First, it combines the ant colony algorithm with the DCNN to form a self-evoluting algorithm for multilocus protein subcellular localization. Second, it randomly groups subcellular sites using a limited random k-labelsets (RAkEL) multi-label classification method, which solves the complex problem with a divide-and-conquer approach and provides a flexible expansion model. Third, it selects feature extraction methods at random during localization, which avoids the defects of any individual feature extraction method. The algorithm is tested on the human database, where the overall correct rate is 67.17%, higher than that of the stacked autoencoder (SAE), support vector machine (SVM), random forest (RF) classifier, or a single deep convolutional neural network.

Graphical abstract

The algorithm described in this paper has four main parts: protein sequence data preprocessing, integrated DCNN model construction, finding the optimal DCNN combination by ant colony optimization, and protein subcellular localization for sequences. The parts run sequentially, and the data obtained in each part is the basis for the next.
In data preprocessing, the limited RAkEL multi-label classification method is used to randomly group subcellular sites. At the same time, the features of the protein sequences are fused using multiple feature extraction methods. Each combination of feature and site information corresponds to one DCNN model. In the ant colony optimization part, the main purpose is to find the best combination of DCNN models through the global optimization ability of the ant colony algorithm. Sequence localization then obtains the multilocus subcellular localization from the optimal model combination.
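
The grouping step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say what "limited" restricts, so the sketch assumes it means that the number of groups is capped and that every site appears in at least one group. The site names, the feature method names, and the function `random_k_labelsets` are hypothetical placeholders.

```python
import math
import random
from itertools import product


def random_k_labelsets(labels, k, n_sets, seed=0):
    """Draw n_sets distinct random label groups of size k.

    Assumption (not from the abstract): every label must appear in at
    least one group, and n_sets caps the number of groups.
    """
    labels = sorted(labels)
    n = len(labels)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of labels")
    min_sets = math.ceil(n / k)  # fewest groups that can cover all labels
    if not min_sets <= n_sets <= math.comb(n, k):
        raise ValueError("n_sets is outside the feasible range")

    rng = random.Random(seed)
    shuffled = labels[:]
    rng.shuffle(shuffled)

    groups = set()
    # Coverage pass: cut the shuffled labels into consecutive chunks of size k.
    for start in range(0, n, k):
        chunk = shuffled[start:start + k]
        # Pad a short final chunk with labels drawn from the remaining ones.
        rest = [lab for lab in shuffled if lab not in chunk]
        chunk += rng.sample(rest, k - len(chunk))
        groups.add(frozenset(chunk))

    # Random pass: add distinct k-subsets until n_sets groups exist.
    while len(groups) < n_sets:
        groups.add(frozenset(rng.sample(labels, k)))

    return sorted(tuple(sorted(g)) for g in groups)


# Hypothetical inputs, for illustration only.
SITES = ["cytoplasm", "nucleus", "mitochondrion", "membrane", "secreted"]
FEATURES = ["feature_A", "feature_B"]

groups = random_k_labelsets(SITES, k=3, n_sets=4, seed=1)

# Each (feature method, site group) pair would correspond to one candidate
# model; a search procedure could then pick a subset of these candidates.
candidates = list(product(FEATURES, groups))
print(len(candidates))
```

In this sketch, the candidate list is the search space from which an optimizer such as the ant colony algorithm would choose a model combination. The training and selection steps themselves are not shown.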

Keywords:  Ant colony algorithm; Deep convolutional neural network; Multilocus protein subcellular localization; Random k-labelsets

Year:  2020        PMID: 33078303     DOI: 10.1007/s11517-020-02275-w

Source DB:  PubMed          Journal:  Med Biol Eng Comput        ISSN: 0140-0118            Impact factor:   2.602


