Enabling full-length evolutionary profiles based deep convolutional neural network for predicting DNA-binding proteins from sequence.

Sucheta Chauhan, Shandar Ahmad.

Abstract

Sequence-based DNA-binding protein (DBP) prediction is a widely studied biological problem. Sliding windows over the rows of position-specific substitution matrices (PSSMs) predict DNA-binding residues well in known DBPs, but the same models cannot be applied to protein sequences of unequal length. PSSM summaries, namely column averages and their amino-acid-wise versions, have been used effectively for the task, but it remains unclear whether these features carry all of the PSSM's predictive power, which has traditionally been harnessed for binding-site prediction. Here we evaluate whether PSSMs scaled up to a fixed size by zero-vector padding (pPSSM) can perform better than summary-based features on similar models. Using a multilayer perceptron (MLP) and a deep convolutional neural network (CNN), we found that (a) summary features work well for single-genome (human-only) data but are outperformed by pPSSM on diverse PDB-derived data sets, suggesting greater summary-level redundancy in the former; (b) even when summary features perform comparably to pPSSM, a consensus of the two outperforms both; (c) CNN models comprehensively outperform their corresponding MLP models; and (d) the actual predicted scores from different models depend on the choice of input feature set, whereas overall performance levels are model-dependent, with CNN leading in accuracy.
© 2019 Wiley Periodicals, Inc.
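
The two input representations compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a PSSM stored as a list of rows (one row per residue, 20 columns), that zero rows are appended at the end of the sequence, and that the consensus is a plain average of two model scores. The abstract specifies none of these details, and all function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # one row per residue, 20 columns per row

N_COLS = 20  # one PSSM column per standard amino acid
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed residue alphabet and ordering


def pad_pssm(pssm: Matrix, target_len: int) -> Matrix:
    """Zero-pad an L x 20 PSSM to target_len x 20 (the pPSSM idea).

    Assumption: padding rows are appended after the last residue.
    """
    if len(pssm) > target_len:
        # How longer sequences are handled is not stated in the abstract.
        raise ValueError("sequence is longer than target_len")
    padding = [[0.0] * N_COLS for _ in range(target_len - len(pssm))]
    return [list(row) for row in pssm] + padding


def column_averages(pssm: Matrix) -> List[float]:
    """Length-independent summary: the mean of each of the 20 columns."""
    if not pssm:
        raise ValueError("empty PSSM")
    n = len(pssm)
    return [sum(row[j] for row in pssm) / n for j in range(N_COLS)]


def aa_wise_averages(sequence: str, pssm: Matrix) -> Matrix:
    """Amino-acid-wise summary: average the PSSM rows separately for each
    residue type, giving a 20 x 20 matrix (zero rows for absent residues)."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM lengths differ")
    sums = {aa: [0.0] * N_COLS for aa in AMINO_ACIDS}
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for aa, row in zip(sequence, pssm):
        if aa in sums:  # non-standard residues are skipped in this sketch
            counts[aa] += 1
            sums[aa] = [s + v for s, v in zip(sums[aa], row)]
    return [
        [s / counts[aa] if counts[aa] else 0.0 for s in sums[aa]]
        for aa in AMINO_ACIDS
    ]


def consensus(score_a: float, score_b: float) -> float:
    """Assumed consensus rule: plain average of two model scores."""
    return (score_a + score_b) / 2.0
```

The padded matrix keeps per-position information at the cost of a fixed, large input size, whereas the summaries reduce any sequence to 20 or 400 values regardless of its length.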

Keywords:  DNA-binding proteins; PSSM; convolutional neural networks; evolutionary profiles; functional annotations; sequence-based predictions

Year: 2019; PMID: 31228283; DOI: 10.1002/prot.25763

Source DB: PubMed; Journal: Proteins; ISSN: 0887-3585

