Prediction of interactiveness of proteins and nucleic acids based on feature selections.

YouLang Yuan, XiaoHe Shi, XinLei Li, WenCong Lu, YuDong Cai, Lei Gu, Liang Liu, MinJie Li, XiangYin Kong, Meng Xing.

Abstract

Identifying which proteins can interact with nucleic acids is important for protein annotation, since interactions between nucleic acids and proteins are involved in numerous cellular processes such as replication, transcription, splicing, and DNA repair. This research tries to identify proteins that can interact with DNA, RNA, and rRNA, respectively. mRMR (minimum redundancy and maximum relevance), with its elegant mathematical formulation, has been applied widely in processing biological data and feature analysis since its introduction in 2005. mRMR plus incremental feature selection (IFS) is known to be very efficient in feature selection and analysis, and able to improve both the effectiveness and the efficiency of a prediction model. IFS is applied to decide how many features should be selected from the feature list provided by mRMR. Finally, the features selected by mRMR and IFS are further refined by a conventional feature selection method, the forward feature wrapper (FFW), which reorders the features. Each protein is coded by 132 features, including amino acid compositions and physicochemical properties. After feature selection, the k-Nearest Neighbor algorithm, the adopted prediction model, is trained and tested. As a result, the optimized prediction accuracies for DNA, RNA, and rRNA are 82.0, 83.4, and 92.3%, respectively. Furthermore, the most important features that contribute to the prediction are identified and analyzed biologically. The predictor developed for this research is available for public access at http://chemdata.shu.edu.cn/protein_na_mrmr/.
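The mRMR ranking and the IFS step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-level discretization, the difference form of the mRMR score, the leave-one-out evaluation, the choice k=1, and every function name are hypothetical choices made for the example. The FFW refinement and the 132-feature protein encoding are not reproduced.

```python
import numpy as np

def discretize(X):
    """Map each feature to 3 levels (0, 1, 2) using mean +/- one std (assumed scheme)."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    return (X > mu - sd).astype(int) + (X > mu + sd).astype(int)

def mutual_info(a, b):
    """Mutual information (in nats) between two non-negative integer arrays."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)          # contingency counts
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

def mrmr_rank(Xd, y):
    """Greedy mRMR: repeatedly pick the feature maximizing relevance minus mean redundancy."""
    n_feat = Xd.shape[1]
    relevance = [mutual_info(Xd[:, j], y) for j in range(n_feat)]
    ranked, remaining = [], list(range(n_feat))

    def score(j):
        if not ranked:
            return relevance[j]
        redundancy = np.mean([mutual_info(Xd[:, j], Xd[:, s]) for s in ranked])
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

def loo_knn_accuracy(X, y, k=1):
    """Leave-one-out accuracy of a Euclidean k-nearest-neighbour classifier."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)       # a sample may not vote for itself
    neighbours = np.argsort(dist, axis=1)[:, :k]
    pred = np.array([np.bincount(votes).argmax() for votes in y[neighbours]])
    return float((pred == y).mean())

def incremental_feature_selection(X, y, ranked, k=1):
    """Score the top-m ranked features for m = 1..N and keep the best prefix."""
    accs = [loo_knn_accuracy(X[:, ranked[:m]], y, k)
            for m in range(1, len(ranked) + 1)]
    best_m = int(np.argmax(accs)) + 1
    return ranked[:best_m], accs
```

A typical call, assuming a feature matrix `X` of shape (proteins, features) and integer class labels `y`, would be `ranked = mrmr_rank(discretize(X), y)` followed by `selected, accs = incremental_feature_selection(X, y, ranked)`. Ranking on discretized values while classifying on the original values is a design choice of this sketch.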

Year: 2009    PMID: 19816781    DOI: 10.1007/s11030-009-9198-9

Source DB: PubMed    Journal: Mol Divers    ISSN: 1381-1991    Impact factor: 2.943


