Inferring biological basis about psychrophilicity by interpreting the rules generated from the correctly classified input instances by a classifier.

Abhigyan Nath, Karthikeyan Subbiah.

Abstract

Organisms that thrive in extremely cold environments are called psychrophiles, and they offer a wealth of knowledge about the sequence adjustments in proteins that occurred during adaptation to low temperatures. In this paper, we propose a new cascading model to investigate the basis of psychrophilicity. In this model, the best-performing classifier was used to discriminate psychrophilic from mesophilic protein sequences, and the PART rule-generating algorithm was then applied to the input instances correctly classified by that classifier to generate human-interpretable rules. The derived rules were further validated on a structural dataset and finally analyzed to uncover the underlying biological basis of psychrophilicity. As input features, we used global patterns of amino acid composition, one of the key features that allow psychrophilic proteins to remain functional at extremely cold temperatures. The rotation forest classifier outperformed all the other classifiers, with a maximum accuracy of 70.5% and a maximum AUC of 0.78. The effect of sequence length on classification accuracy was also investigated. Analysis of the derived rules revealed that the amino acids A, D, G, F, and S are over-represented, and T is under-represented, in psychrophilic proteins. These findings augment the existing domain knowledge of psychrophilic sequence features.
Copyright © 2014 Elsevier Ltd. All rights reserved.
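
The abstract names global amino acid composition as the input feature set. As a rough illustration of what such a feature vector might look like, the following minimal sketch computes the fraction of each of the 20 standard residues in a sequence. The function name, the toy sequence, and the choice to ignore non-standard characters are all assumptions made here for illustration; the record does not describe how the authors computed their features.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in `sequence`.

    Hypothetical helper: characters outside the 20 standard residues
    (e.g. X, gaps) are ignored, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy sequence invented for this example, not taken from any dataset.
example = "MAGSDAFGSA"
composition = amino_acid_composition(example)
```

Each sequence then maps to a 20-dimensional vector that a classifier or rule learner could consume; residues reported as over- or under-represented would show up as systematically higher or lower entries in that vector.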

Keywords:  Amino acid composition patterns; Biologically interpretable rules; Cold adaptation; PART rule induction method; Rotation forest

Year:  2014        PMID: 25462328     DOI: 10.1016/j.compbiolchem.2014.10.002

Source DB:  PubMed          Journal:  Comput Biol Chem        ISSN: 1476-9271            Impact factor:   2.877


  1 in total

1.  Probing an optimal class distribution for enhancing prediction and feature characterization of plant virus-encoded RNA-silencing suppressors.

Authors:  Abhigyan Nath; Karthikeyan Subbiah
Journal:  3 Biotech       Date:  2016-03-21       Impact factor: 2.406

