
An empirical study on the matrix-based protein representations and their combination with sequence-based approaches.

Loris Nanni, Alessandra Lumini, Sheryl Brahnam.

Abstract

Many domains have a stake in the development of reliable systems for automatic protein classification. Recent studies of automatic protein classification have focused on new methods for extracting features from a protein that enhance classification for specific problems. These methods have proven very useful in one or two domains, but they have failed to generalize well across several domains (i.e., classification problems). In this paper, we evaluate several feature extraction approaches for representing proteins with the aim of sequence-based protein classification. The evaluated representations start from one of the following: the position-specific scoring matrix (PSSM) of the protein; the amino-acid sequence; or a matrix representation of the protein, of dimension (length of the protein) × 20, obtained by using substitution matrices to represent each amino acid as a vector. A valuable result is that a texture descriptor can be extracted from the PSSM protein representation that improves on the performance of standard PSSM-based descriptors. Experimentally, we develop our systems by comparing several protein descriptors on nine different datasets. Each descriptor is used to train a support vector machine (SVM) or an ensemble of SVMs. Although different stand-alone descriptors work well on some datasets (but not on others), we have found that fusing classifiers trained on different descriptors performs well across all the tested datasets. The Matlab code and datasets used in this paper are available at http://www.bias.csr.unibo.it\nanni\PSSM.rar.
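The (length of the protein) × 20 matrix representation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Matlab code. The function names, the residue ordering, and the toy match/mismatch matrix are all assumptions made for illustration; a real substitution matrix such as BLOSUM62 would take the place of the placeholder.

```python
# Illustrative sketch only: names and the toy matrix are hypothetical,
# not taken from the paper or its released Matlab code.

# One common ordering of the 20 standard amino acids (an assumption here).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def toy_substitution_matrix(match=1.0, mismatch=0.0):
    """Build a placeholder 20x20 matrix as a dict: residue -> row of 20 scores.

    A real substitution matrix (e.g. BLOSUM62) would be loaded instead.
    """
    return {
        a: [match if a == b else mismatch for b in AMINO_ACIDS]
        for a in AMINO_ACIDS
    }


def matrix_representation(sequence, sub_matrix):
    """Map each residue to its substitution-matrix row.

    Returns a list of len(sequence) rows, each holding 20 scores.
    """
    rows = []
    for residue in sequence.upper():
        if residue not in sub_matrix:
            raise ValueError(f"unsupported residue: {residue!r}")
        rows.append(list(sub_matrix[residue]))
    return rows


rep = matrix_representation("ACDA", toy_substitution_matrix())
print(len(rep), len(rep[0]))
```

Each row depends only on the residue at that position, so identical residues yield identical rows. Descriptors computed over the whole matrix are what would be passed to a classifier.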


Year:  2012        PMID: 23108592     DOI: 10.1007/s00726-012-1416-6

Source DB:  PubMed          Journal:  Amino Acids        ISSN: 0939-4451            Impact factor:   3.520


  3 in total

1.  An empirical study of different approaches for protein classification.

Authors:  Loris Nanni; Alessandra Lumini; Sheryl Brahnam
Journal:  ScientificWorldJournal       Date:  2014-06-15

2.  An ensemble approach for large-scale identification of protein-protein interactions using the alignments of multiple sequences.

Authors:  Lei Wang; Zhu-Hong You; Xing Chen; Jian-Qiang Li; Xin Yan; Wei Zhang; Yu-An Huang
Journal:  Oncotarget       Date:  2017-01-17

3.  AutoPPI: An Ensemble of Deep Autoencoders for Protein-Protein Interaction Prediction.

Authors:  Gabriela Czibula; Alexandra-Ioana Albu; Maria Iuliana Bocicor; Camelia Chira
Journal:  Entropy (Basel)       Date:  2021-05-21       Impact factor: 2.524

