Predict subcellular locations of singleplex and multiplex proteins by semi-supervised learning and dimension-reducing general mode of Chou's PseAAC.

Eakasit Pacharawongsakda, Thanaruk Theeramunkong.   

Abstract

Predicting protein subcellular location is one of the major challenges in bioinformatics, since such knowledge helps us understand protein functions and enables us to select target proteins during the drug discovery process. While many computational techniques have been proposed to improve predictive performance for protein subcellular location, they have several shortcomings. In this work, we propose a method that addresses three main issues in such techniques: (i) handling of multiplex proteins, which may exist in or move between multiple cellular compartments; (ii) handling of high dimensionality in the input and output spaces; and (iii) the requirement of sufficient labeled data for model training. To address these issues, this work presents a new computational method for predicting the locations of proteins that have either a single location or multiple locations. The proposed technique, named iFLAST-CORE, combines dimensionality reduction in the feature and label spaces with the co-training paradigm for semi-supervised multi-label classification. For this purpose, Singular Value Decomposition (SVD) is applied to transform the high-dimensional feature space and label space into lower-dimensional spaces. Then, because labeled data are limited, co-training regression makes use of unlabeled data by predicting their target values in the lower-dimensional spaces. In the last step, the SVD components are used to project labels in the lower-dimensional space back to the original space, and an adaptive threshold is used to map each numeric value to a binary value for label determination. A set of experiments on viral proteins and Gram-negative bacterial proteins shows that the proposed method improves classification performance in terms of various evaluation metrics, such as Aiming (or Precision), Coverage (or Recall) and macro F-measure, compared to the traditional method that uses only labeled data.
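
The pipeline described in the abstract (SVD reduction of the feature and label spaces, regression in the reduced spaces, back-projection, thresholding) can be illustrated with a minimal NumPy sketch. This is not the authors' iFLAST-CORE implementation: the function names `fit_svd_projection` and `train_and_predict` are hypothetical, the co-training step on unlabeled data is replaced here by a single ordinary least-squares fit on labeled data only, and the adaptive threshold is replaced by a fixed cutoff of 0.5. All of these are simplifying assumptions made for illustration.

```python
import numpy as np


def fit_svd_projection(M, k):
    """Return the top-k right singular vectors of M as columns, shape (d, k)."""
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    return vt[:k].T


def train_and_predict(X_lab, Y_lab, X_new, k_feat, k_label, threshold=0.5):
    """Sketch: reduce features and labels with SVD, regress, project back.

    Assumptions (not from the paper): plain least squares stands in for
    co-training regression, and a fixed threshold stands in for the
    adaptive threshold.
    """
    # Projection matrices for the feature space and the label space.
    Vx = fit_svd_projection(X_lab, k_feat)    # (n_features, k_feat)
    Vy = fit_svd_projection(Y_lab, k_label)   # (n_labels, k_label)

    # Lower-dimensional representations of the labeled data.
    Zx = X_lab @ Vx
    Zy = Y_lab @ Vy

    # Regression from reduced features to reduced labels.
    W, *_ = np.linalg.lstsq(Zx, Zy, rcond=None)

    # Predict in the reduced label space, then project back to label space.
    scores = (X_new @ Vx) @ W @ Vy.T

    # Map numeric scores to binary labels (a protein may get several labels).
    return (scores >= threshold).astype(int), scores
```

Calling `train_and_predict(X_lab, Y_lab, X_new, k_feat, k_label)` with a binary label matrix `Y_lab` (one column per location) returns a binary prediction matrix and the raw scores. Choosing `k_feat` and `k_label` smaller than the original dimensions gives the reduction the abstract refers to; how the paper selects these values is not stated in the abstract.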

Year:  2013        PMID: 23864226     DOI: 10.1109/TNB.2013.2272014

Source DB:  PubMed          Journal:  IEEE Trans Nanobioscience        ISSN: 1536-1241            Impact factor:   2.935


  6 in total

1.  Some illuminating remarks on molecular genetics and genomics as well as drug development. (Review)

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

2.  Gram-positive and Gram-negative subcellular localization using rotation forest and physicochemical-based features.

Authors:  Abdollah Dehzangi; Sohrab Sohrabi; Rhys Heffernan; Alok Sharma; James Lyons; Kuldip Paliwal; Abdul Sattar
Journal:  BMC Bioinformatics       Date:  2015-02-23       Impact factor: 3.169

3.  iRNA-3typeA: Identifying Three Types of Modification at RNA's Adenosine Sites.

Authors:  Wei Chen; Pengmian Feng; Hui Yang; Hui Ding; Hao Lin; Kuo-Chen Chou
Journal:  Mol Ther Nucleic Acids       Date:  2018-03-30       Impact factor: 8.886

4.  iRSpot-Pse6NC: Identifying recombination spots in Saccharomyces cerevisiae by incorporating hexamer composition into general PseKNC.

Authors:  Hui Yang; Wang-Ren Qiu; Guoqing Liu; Feng-Biao Guo; Wei Chen; Kuo-Chen Chou; Hao Lin
Journal:  Int J Biol Sci       Date:  2018-05-22       Impact factor: 6.580

5.  Prediction of subcellular location of apoptosis proteins by incorporating PsePSSM and DCCA coefficient based on LFDA dimensionality reduction.

Authors:  Bin Yu; Shan Li; Wenying Qiu; Minghui Wang; Junwei Du; Yusen Zhang; Xing Chen
Journal:  BMC Genomics       Date:  2018-06-19       Impact factor: 3.969

6.  PseAAC-General: fast building various modes of general form of Chou's pseudo-amino acid composition for large-scale protein datasets.

Authors:  Pufeng Du; Shuwang Gu; Yasen Jiao
Journal:  Int J Mol Sci       Date:  2014-02-26       Impact factor: 5.923
