Literature DB >> 25907142

Predicting protein-binding RNA nucleotides with consideration of binding partners.

Narankhuu Tuvshinjargal1, Wook Lee1, Byungkyu Park1, Kyungsook Han2.   

Abstract

In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, a NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purpose, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10 fold-cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, a NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, a NPV of 85.2% and MCC of 0.45. In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure.
Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

Entities:  

Keywords:  Binding partners; Binding sites; RNA–protein interactions; Sequence-based prediction

Mesh:

Substances:

Year:  2015        PMID: 25907142     DOI: 10.1016/j.cmpb.2015.03.010

Source DB:  PubMed          Journal:  Comput Methods Programs Biomed        ISSN: 0169-2607            Impact factor:   5.428


  1 in total

1.  Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

Authors:  Daesik Choi; Byungkyu Park; Hanju Chae; Wook Lee; Kyungsook Han
Journal:  BMC Syst Biol       Date:  2017-03-14
  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.