Predicting protein-binding RNA nucleotides using the feature-based removal of data redundancy and the interaction propensity of nucleotide triplets.

Abstract

Several learning approaches have been used to predict RNA-binding amino acids in a protein sequence, but there have been few attempts to predict protein-binding nucleotides in an RNA sequence. One reason is that the differences between nucleotides in their interaction propensity are much smaller than those between amino acids. Another is that RNA exhibits less diverse sequence patterns than protein. Predicting protein-binding RNA nucleotides is therefore much harder than predicting RNA-binding amino acids. We developed a new method that removes data redundancy in a training set of sequences based on their features. The new method constructs a larger and more informative training set than the standard redundancy removal method based on sequence similarity, and the constructed dataset is guaranteed to be redundancy-free. We computed the interaction propensity (IP) of nucleotide triplets by applying a new definition of IP to an extensive dataset of protein-RNA complexes, and developed a support vector machine (SVM) model to predict protein-binding sites in RNA sequences. In a 5-fold cross-validation with 812 RNA sequences, the SVM model predicted protein-binding nucleotides with an accuracy of 86.4%, an F-measure of 84.8%, and a Matthews correlation coefficient of 0.66. With an independent dataset of 56 RNA sequences that were not used in training, the resulting accuracy was 68.1% with an F-measure of 71.7% and a Matthews correlation coefficient of 0.35. To the best of our knowledge, this is the first attempt to predict protein-binding RNA nucleotides in a given RNA sequence from the sequence data alone. The SVM model and datasets are freely available to academics at http://bclab.inha.ac.kr/primer.
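
The abstract reports three evaluation measures: accuracy, F-measure, and the Matthews correlation coefficient (MCC). As a minimal sketch, assuming the standard textbook definitions over binary confusion-matrix counts (the abstract does not spell out the formulas the authors used), they could be computed as follows. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from binary confusion-matrix counts.

    tp/fn: binding nucleotides predicted as binding / non-binding
    tn/fp: non-binding nucleotides predicted as non-binding / binding
    Standard definitions are assumed; undefined ratios fall back to 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # MCC ranges from -1 (total disagreement) to +1 (perfect prediction).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return accuracy, f_measure, mcc


# Hypothetical counts for illustration only (not results from the paper).
acc, f1, mcc = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is commonly reported alongside accuracy and F-measure when binding and non-binding nucleotides are unevenly represented.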

Keywords: Data redundancy removal; Interaction propensity; Protein-binding nucleotide; Protein–RNA interaction

Year: 2013 PMID: 24209914 DOI: 10.1016/j.compbiomed.2013.08.011

Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589

Cited

6 in total

1. A motif-based method for predicting interfacial residues in both the RNA and protein components of protein-RNA complexes.

Authors: Usha Muppirala; Benjamin A Lewis; Carla M Mann; Drena Dobbs
Journal: Pac Symp Biocomput Date: 2016

2. PNImodeler: web server for inferring protein-binding nucleotides from sequence data.

3. RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites.

4. Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

5. A boosting approach for prediction of protein-RNA binding residues.

6. Comprehensive Survey and Comparative Assessment of RNA-Binding Residue Predictions with Analysis by RNA Type. (Review)