Identification of protein-protein binding sites by incorporating the physicochemical properties and stationary wavelet transforms into pseudo amino acid composition.

Jianhua Jia, Zi Liu, Xuan Xiao, Bingxiang Liu, Kuo-Chen Chou.

Abstract

With the explosive growth of protein sequences entering protein data banks in the post-genomic era, there is strong demand for automated methods that rapidly and effectively identify protein-protein binding sites (PPBSs) from sequence information alone. To address this problem, we propose a predictor called iPPBS-PseAAC, in which each amino acid residue site of the proteins concerned is treated as a 15-tuple peptide segment generated by sliding a window along the protein chain with its center aligned with the target residue. The working peptide segment is then formulated by a general form of pseudo amino acid composition via the following procedures: (1) it is converted into a numerical series via the physicochemical properties of amino acids; (2) the numerical series is subsequently converted into a 20-D feature vector by means of the stationary wavelet transform technique. The prediction engine is a two-layer ensemble classifier formed from many individual "Random Forest" classifiers, with the first layer voting out the best training data set from many bootstrap systems and the second layer voting out the most relevant one from seven physicochemical properties. Cross-validation tests indicate that the new predictor is very promising, suggesting that many important key features, deeply hidden in complicated protein sequences, can be extracted via the wavelet transform approach; this is consistent with the fact that many important biological functions of proteins can be elucidated through their low-frequency internal motions. The web server of iPPBS-PseAAC is accessible at http://www.jci-bioinfo.cn/iPPBS-PseAAC, through which users can easily obtain their desired results without needing to follow the complicated mathematical equations involved.
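The first steps described in the abstract (a 15-residue window centered on each target residue, conversion to a numerical series, and an undecimated wavelet step) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions, since the abstract does not specify them: the padding residue `X` for chain termini, the use of the Kyte-Doolittle hydropathy index as a stand-in for one of the seven physicochemical properties, and a hand-written one-level Haar transform with circular boundaries as the simplest example of a stationary (undecimated) wavelet transform. The function names are hypothetical, and the reduction to a 20-D feature vector is not reproduced here.

```python
import math

# Kyte-Doolittle hydropathy index, used as a stand-in for one physicochemical
# property (assumption: the abstract does not name the seven properties used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

WINDOW = 15  # segment length stated in the abstract
PAD = "X"    # hypothetical dummy residue for chain termini (assumption)


def sliding_windows(sequence, window=WINDOW, pad=PAD):
    """Return one segment per residue, with that residue at the segment center."""
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]


def encode(segment, scale=HYDROPATHY, default=0.0):
    """Map a peptide segment to a numerical series; unknown residues get `default`."""
    return [scale.get(residue, default) for residue in segment]


def haar_swt_level1(series):
    """One level of an undecimated Haar transform with circular boundaries.

    Both outputs keep the input length (no downsampling), which is the
    defining property of a stationary wavelet transform.
    """
    n = len(series)
    root2 = math.sqrt(2.0)
    approx = [(series[i] + series[(i + 1) % n]) / root2 for i in range(n)]
    detail = [(series[i] - series[(i + 1) % n]) / root2 for i in range(n)]
    return approx, detail


if __name__ == "__main__":
    chain = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example sequence
    segments = sliding_windows(chain)
    series = encode(segments[10])
    approx, detail = haar_swt_level1(series)
    print(segments[10], len(approx), len(detail))
```

In practice a wavelet library and a multi-level decomposition would likely replace the hand-written step; it is written out here only to keep the sketch dependency-free.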

Keywords:  asymmetric bootstrap; physicochemical property; protein–protein binding sites; pseudo amino acid composition; random forest; stationary wavelet transform

Year:  2015        PMID: 26375780     DOI: 10.1080/07391102.2015.1095116

Source DB:  PubMed          Journal:  J Biomol Struct Dyn        ISSN: 0739-1102


  32 in total

1.  Prediction of Protein-Protein Interaction Sites with Machine-Learning-Based Data-Cleaning and Post-Filtering Procedures.

Authors:  Guang-Hui Liu; Hong-Bin Shen; Dong-Jun Yu
Journal:  J Membr Biol       Date:  2015-11-12       Impact factor: 1.843

2.  PreDTIs: prediction of drug-target interactions based on multiple feature information using gradient boosting framework with data balancing and feature selection techniques.

Authors:  S M Hasan Mahmud; Wenyu Chen; Yongsheng Liu; Md Abdul Awal; Kawsar Ahmed; Md Habibur Rahman; Mohammad Ali Moni
Journal:  Brief Bioinform       Date:  2021-03-12       Impact factor: 11.622

3.  Comparative analysis of housekeeping and tissue-selective genes in human based on network topologies and biological properties.

Authors:  Lei Yang; Shiyuan Wang; Meng Zhou; Xiaowen Chen; Yongchun Zuo; Dianjun Sun; Yingli Lv
Journal:  Mol Genet Genomics       Date:  2016-02-20       Impact factor: 3.291

4.  FEPS: A Tool for Feature Extraction from Protein Sequence.

Authors:  Hamid Ismail; Clarence White; Hussam Al-Barakati; Robert H Newman; Dukka B Kc
Journal:  Methods Mol Biol       Date:  2022

5.  Prediction of Protein-Protein Interaction Sites Using Convolutional Neural Network and Improved Data Sets.

Authors:  Zengyan Xie; Xiaoya Deng; Kunxian Shu
Journal:  Int J Mol Sci       Date:  2020-01-11       Impact factor: 5.923

6.  Affinity Hydrogels for Protein Delivery. (Review)

Authors:  Lidya Abune; Yong Wang
Journal:  Trends Pharmacol Sci       Date:  2021-02-22       Impact factor: 14.819

7.  iHyd-PseCp: Identify hydroxyproline and hydroxylysine in proteins by incorporating sequence-coupled effects into general PseAAC.

Authors:  Wang-Ren Qiu; Bi-Qian Sun; Xuan Xiao; Zhao-Chun Xu; Kuo-Chen Chou
Journal:  Oncotarget       Date:  2016-07-12

8.  iCar-PseCp: identify carbonylation sites in proteins by Monte Carlo sampling and incorporating sequence coupled effects into general PseAAC.

Authors:  Jianhua Jia; Zi Liu; Xuan Xiao; Bingxiang Liu; Kuo-Chen Chou
Journal:  Oncotarget       Date:  2016-06-07

9.  iPhos-PseEn: identifying phosphorylation sites in proteins by fusing different pseudo components into an ensemble classifier.

Authors:  Wang-Ren Qiu; Xuan Xiao; Zhao-Chun Xu; Kuo-Chen Chou
Journal:  Oncotarget       Date:  2016-08-09

10.  iROS-gPseKNC: Predicting replication origin sites in DNA by incorporating dinucleotide position-specific propensity into general pseudo nucleotide composition.

Authors:  Xuan Xiao; Han-Xiao Ye; Zi Liu; Jian-Hua Jia; Kuo-Chen Chou
Journal:  Oncotarget       Date:  2016-06-07