Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition.

Wisam Ibrahim, Mohammad Saniee Abadeh.

Abstract

Protein fold recognition is an important problem in bioinformatics, with the goal of predicting the three-dimensional structure of a protein. One of the most challenging tasks in protein fold recognition is extracting effective features from amino-acid sequences to obtain better classifiers. In this paper, we propose six descriptors for extracting features from protein sequences. These descriptors are applied in the first stage of a three-stage framework, PCA-DELM-LDA, to extract feature vectors from the amino-acid sequences. Principal Component Analysis (PCA) is used to reduce the number of extracted features. The extracted feature vectors are used together with the original features to improve the performance of the Deep Extreme Learning Machine (DELM) in the second stage. Four new features are extracted in the second stage and used in the third stage by Linear Discriminant Analysis (LDA) to classify the instances into 27 folds. The proposed framework is applied to the independent and combined feature sets of the SCOP datasets. The experimental results show that the feature vectors extracted in the first stage can improve the performance of DELM in extracting useful new features in the second stage.
Copyright © 2017 Elsevier Ltd. All rights reserved.
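
Illustrative note: the abstract does not name the six proposed descriptors, so the sketch below is not one of them. It only shows, under that assumption, what a sequence descriptor does in general: it maps a variable-length amino-acid sequence to a fixed-length feature vector. The example uses plain amino-acid composition, a common baseline descriptor; the function name `amino_acid_composition` and the sample sequence are hypothetical.

```python
from collections import Counter
from typing import List

# The 20 standard residues in one-letter code (alphabetical order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard residue in `sequence`.

    The result always has 20 entries, ordered as in AMINO_ACIDS, so
    sequences of different lengths map to vectors of the same size.
    Non-standard letters (for example 'X') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino-acid letters")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical sequence fragment, used only to show the call.
features = amino_acid_composition("MKVLAAGIVG")
print(len(features))
```

In a pipeline like the one described, vectors of this kind would be the input to the dimensionality-reduction step; the later stages (PCA, DELM, LDA) are not sketched here because the abstract gives no details of their configuration.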

Keywords:  Extreme learning machine; Feature extraction; Protein descriptor; Protein fold recognition

Year:  2017        PMID: 28351701     DOI: 10.1016/j.jtbi.2017.03.023

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  3 in total

1.  Discriminating cirRNAs from other lncRNAs using a hierarchical extreme learning machine (H-ELM) algorithm with feature selection.

Authors:  Lei Chen; Yu-Hang Zhang; Guohua Huang; Xiaoyong Pan; ShaoPeng Wang; Tao Huang; Yu-Dong Cai
Journal:  Mol Genet Genomics       Date:  2017-09-14       Impact factor: 3.291

2.  Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in Distant Homology Detection.

Authors:  Gayatri Kumar; Narayanaswamy Srinivasan; Sankaran Sandhya
Journal:  Methods Mol Biol       Date:  2022

3.  PrESOgenesis: A two-layer multi-label predictor for identifying fertility-related proteins using support vector machine and pseudo amino acid composition approach.

Authors:  Mohammad Reza Bakhtiarizadeh; Maryam Rahimi; Abdollah Mohammadi-Sangcheshmeh; Vahid Shariati J; Seyed Alireza Salami
Journal:  Sci Rep       Date:  2018-06-13       Impact factor: 4.379
