UMAP-DBP: An Improved DNA-Binding Proteins Prediction Method Based on Uniform Manifold Approximation and Projection.

Jinyue Wang, Shengli Zhang, Huijuan Qiao, Jiesheng Wang.

Abstract

DNA-binding proteins play a vital role in cellular processes, so a high-throughput method for identifying them efficiently is urgently needed. Several machine learning and deep learning methods now show excellent computational speed and accuracy and are worth applying to this problem. In this work, a novel predictor of DNA-binding proteins, called UMAP-DBP, is proposed. First, features are extracted from the primary protein sequence using physicochemical distance transformation, profile-based auto-cross covariance, and general series correlation pseudo amino acid composition. Second, uniform manifold approximation and projection (UMAP) and a feature importance score method are applied successively for feature selection. Finally, the Adaboost operation engine, together with the jackknife test, is adopted for predicting DNA-binding proteins. In the jackknife test on the BP1075 and BP594 datasets, we obtained overall accuracies of 82.97% and 82.14% and Cohen's kappa (CK) values of 0.66 and 0.64, respectively. The results show that UMAP combined with Adaboost is a feasible method for predicting DNA-binding proteins. This is the first study in which UMAP has been successfully applied to identify DNA-binding proteins. All datasets and code are available at https://github.com/Wang-Jinyue/UMAP-DBP.
© 2021. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
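
The evaluation protocol named in the abstract (jackknife test, overall accuracy, Cohen's kappa) can be illustrated with a minimal Python sketch. This is not the authors' code: the nearest-centroid classifier is a hypothetical stand-in for Adaboost, the six two-dimensional samples are toy data rather than BP1075 or BP594, and all function names are chosen here for illustration. The jackknife test is implemented as leave-one-out cross-validation.

```python
from collections import Counter
from math import dist


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement: fraction of samples where prediction equals label.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the marginal label and prediction frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one class ever occurs
    return (p_o - p_e) / (1.0 - p_e)


def nearest_centroid_predict(train_X, train_y, x):
    """Hypothetical stand-in classifier (NOT Adaboost): pick the class
    whose mean feature vector is closest to x."""
    groups = {}
    for features, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    return min(centroids, key=lambda label: dist(centroids[label], x))


def jackknife_predictions(X, y, predict=nearest_centroid_predict):
    """Leave-one-out: each sample is predicted by a model trained on
    all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(train_X, train_y, X[i]))
    return preds


# Toy data (assumed for illustration): 1 = DNA-binding, 0 = non-binding.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]]
y = [0, 0, 0, 1, 1, 1]

preds = jackknife_predictions(X, y)
accuracy = sum(t == p for t, p in zip(y, preds)) / len(y)
kappa = cohen_kappa(y, preds)
print(f"accuracy={accuracy:.4f}, kappa={kappa:.4f}")
```

As a rough consistency check, if both the class labels and the predictions were evenly split (an assumption, not stated in the abstract), chance agreement would be 0.5, and accuracies of 0.8297 and 0.8214 would correspond to kappa values of about 0.66 and 0.64, in line with the reported CK.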

Keywords:  Adaboost; DNA-binding proteins; Feature importance score; Uniform manifold approximation and projection

Year:  2021        PMID: 34176069     DOI: 10.1007/s10930-021-10011-y

Source DB:  PubMed          Journal:  Protein J        ISSN: 1572-3887            Impact factor:   2.371


