Structural SCOP superfamily level classification using unsupervised machine learning.

Ulavappa B Angadi, M Venkatesulu.

Abstract

A major research direction in bioinformatics is assigning superfamily classifications to a given set of proteins. Such a classification reflects structural, evolutionary, and functional relatedness. These relationships are embodied in hierarchical classifications such as the Structural Classification of Proteins (SCOP), which is mostly manually curated. Such a classification is essential for the structural and functional analysis of proteins, yet a large number of proteins remain unclassified. In this study, we propose an unsupervised machine learning approach to classify and assign a given set of proteins to SCOP superfamilies. We constructed a database and a similarity matrix from P-values obtained in an all-against-all BLAST run, then trained a network with the ART2 unsupervised learning algorithm, using the rows of the similarity matrix as input vectors. The trained network classifies the proteins with F-measure values ranging from 0.82 to 0.97. The performance of ART2 was compared with that of spectral clustering, random forest, SVM, and HHpred. ART2 performs better than all of these except HHpred, which outperforms ART2 and has a smaller sum of errors than the other methods evaluated.
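The two quantitative ingredients named in the abstract, a similarity matrix built from all-against-all BLAST P-values and an F-measure score, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `-log10(P)` scaling, the `floor` cap, the rule of keeping the stronger of the two BLAST directions, the set-based F-measure, and the names `similarity_matrix`, `f_measure`, `ids`, and `hits`. The ART2 network itself is not implemented.

```python
import math


def similarity_matrix(ids, hits, floor=1e-180):
    """Build a symmetric similarity matrix from pairwise BLAST P-values.

    ids   : list of protein identifiers (hypothetical names)
    hits  : dict mapping (query, subject) -> P-value; absent pairs mean no hit
    floor : smallest P-value kept so that -log10 stays finite (assumed cap)
    """
    index = {pid: i for i, pid in enumerate(ids)}
    n = len(ids)
    max_score = -math.log10(floor)
    sim = [[0.0] * n for _ in range(n)]
    for (query, subject), p_value in hits.items():
        # Assumed transform: small P-values map to similarities near 1.
        score = -math.log10(max(p_value, floor)) / max_score
        i, j = index[query], index[subject]
        # BLAST results are not symmetric; keep the stronger direction.
        sim[i][j] = sim[j][i] = max(sim[i][j], score)
    for i in range(n):
        sim[i][i] = 1.0  # a protein is maximally similar to itself
    return sim


def f_measure(cluster, superfamily):
    """F-measure between one predicted cluster and one reference class.

    Both arguments are sets of protein identifiers. The score is the
    harmonic mean of precision and recall of the cluster for that class.
    """
    overlap = len(cluster & superfamily)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(superfamily)
    return 2 * precision * recall / (precision + recall)


# Toy data: four hypothetical proteins with three reported BLAST hits.
ids = ["p1", "p2", "p3", "p4"]
hits = {("p1", "p2"): 1e-60, ("p2", "p1"): 1e-45, ("p3", "p4"): 1e-90}
sim = similarity_matrix(ids, hits)

# Each row of the matrix is the input vector for one protein.
input_vectors = sim
```

In this sketch each row of `sim` plays the role of the input vector for one protein, and `f_measure` would be applied to every cluster and reference superfamily pair once a clustering is available.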

Year: 2011    PMID: 21844638    DOI: 10.1109/TCBB.2011.114

Source DB: PubMed    Journal: IEEE/ACM Trans Comput Biol Bioinform    ISSN: 1545-5963    Impact factor: 3.710


  1 in total

1.  Efficient feature selection and classification of protein sequence data in bioinformatics.

Authors:  Muhammad Javed Iqbal; Ibrahima Faye; Brahim Belhaouari Samir; Abas Md Said
Journal:  ScientificWorldJournal       Date:  2014-06-19
