
RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins.

Yumeng Liu, Xiaolong Wang, Bin Liu.

Abstract

Intrinsically disordered proteins and regions (IDPs/IDRs) are involved in many crucial biological functions, and predicting them accurately benefits the prediction of protein structure and function. Most existing methods ignore fully ordered proteins (proteins without IDRs) during both training and testing. As a result, the corresponding predictors tend to predict fully ordered proteins as disordered. Because these methods were evaluated only on datasets of disordered proteins containing few or no fully ordered proteins, this problem has escaped researchers' attention. However, most newly sequenced proteins are fully ordered, so these predictors fail to distinguish ordered from disordered proteins accurately in real-world applications. To address this, we propose a new method, RFPR-IDP, which is trained on both fully ordered and disordered proteins and combines a convolutional neural network (CNN) with a bidirectional long short-term memory (BiLSTM) network. Experimental results show that although existing predictors perform well on disordered proteins, they tend to predict fully ordered proteins as disordered. In contrast, RFPR-IDP correctly predicts fully ordered proteins and outperforms 10 other state-of-the-art methods when evaluated on a test dataset containing both fully ordered and disordered proteins. The web server and datasets of RFPR-IDP are freely available at http://bliulab.net/RFPR-IDP/server.
© The Author(s) 2020. Published by Oxford University Press.
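
The abstract's central point, that a test set without fully ordered proteins can hide a predictor's false positives, can be illustrated with a small sketch. The code below is not the authors' evaluation protocol. The function name `false_positive_rate`, the per-residue 0/1 labelling (1 = disordered, 0 = ordered), and the toy sequences are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each item pairs true per-residue labels with predicted labels.
# Assumed convention (not from the paper): 1 = disordered, 0 = ordered.
LabelledProtein = Tuple[List[int], List[int]]


def false_positive_rate(proteins: List[LabelledProtein]) -> float:
    """Pooled per-residue FPR = FP / (FP + TN) over all ordered residues."""
    fp = 0  # ordered residues predicted as disordered
    tn = 0  # ordered residues predicted as ordered
    for truth, pred in proteins:
        for t, p in zip(truth, pred):
            if t == 0:
                if p == 1:
                    fp += 1
                else:
                    tn += 1
    total = fp + tn
    return fp / total if total else 0.0


# Toy data (hypothetical): the predictor is perfect on the disordered
# protein but over-calls disorder on the fully ordered one.
disordered_only = [([0, 0, 1, 1, 1, 0], [0, 0, 1, 1, 1, 0])]
fully_ordered = [([0, 0, 0, 0, 0, 0], [0, 1, 1, 0, 1, 0])]

fpr_disordered_set = false_positive_rate(disordered_only)
fpr_mixed_set = false_positive_rate(disordered_only + fully_ordered)

print(f"FPR, disordered proteins only: {fpr_disordered_set:.3f}")
print(f"FPR, with fully ordered proteins: {fpr_mixed_set:.3f}")
```

In this toy case the false positive rate looks ideal on the disordered-only set and rises once the fully ordered protein is included, which is the kind of gap the abstract describes.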


Keywords:  bidirectional long short-term memory; convolution neural network; fully ordered proteins; intrinsically disordered proteins and regions

Year:  2021        PMID: 32112084      PMCID: PMC7986600          DOI: 10.1093/bib/bbaa018

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  4 in total

Review 1.  Protein Function Analysis through Machine Learning.

Authors:  Chris Avery; John Patterson; Tyler Grear; Theodore Frater; Donald J Jacobs
Journal:  Biomolecules       Date:  2022-09-06

2.  Computational Prediction of Intrinsically Disordered Proteins Based on Protein Sequences and Convolutional Neural Networks.

Authors:  Hao He; Yong Yang
Journal:  Comput Intell Neurosci       Date:  2021-12-28

Review 3.  Deep learning in prediction of intrinsic disorder in proteins.

Authors:  Bi Zhao; Lukasz Kurgan
Journal:  Comput Struct Biotechnol J       Date:  2022-03-08       Impact factor: 7.271

4.  Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features.

Authors:  Jiaxiang Zhao; Zengke Wang
Journal:  Life (Basel)       Date:  2022-02-26
