ProtDet-CCH: Protein Remote Homology Detection by Combining Long Short-Term Memory and Ranking Methods.

Bin Liu, Shumin Li.   

Abstract

Protein remote homology detection is one of the most challenging tasks in sequence analysis and has been studied extensively. Methods based on discriminative models and on ranking approaches have achieved state-of-the-art performance, and the two kinds of methods are complementary. In this study, three LSTM models (ULSTM, BLSTM, and CNN-BLSTM) were applied to construct predictors for protein remote homology detection. They can automatically extract both local and global sequence order information. Combined with PSSMs, CNN-BLSTM achieved the best performance among the three LSTM-based models; we named this method CNN-BLSTM-PSSM. Finally, a new method called ProtDet-CCH was proposed by combining CNN-BLSTM-PSSM with the ranking method HHblits. Tested on a widely used SCOP benchmark dataset, ProtDet-CCH achieved an ROC score of 0.998 and an ROC50 score of 0.982, significantly outperforming other existing state-of-the-art methods. Experimental results on two updated SCOPe independent datasets showed that ProtDet-CCH achieves stable performance. Furthermore, the method can provide useful insights for studying the features and motifs of protein families and superfamilies. It is anticipated that ProtDet-CCH will become a very useful tool for protein remote homology detection.
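
The ROC and ROC50 scores quoted in the abstract are ranking metrics. As a rough illustration only, the sketch below computes the area under the ROC curve up to the first n false positives, normalized to [0, 1], following the commonly used definition of ROC-n. The function name `roc_n` and the toy scores and labels are assumptions made for this example; they are not taken from the paper, and the authors' evaluation code may differ (for instance in how ties are handled).

```python
def roc_n(scores, labels, n=None):
    """Normalized area under the ROC curve up to the first n false positives.

    scores: higher means "more likely homologous" (hypothetical predictor output).
    labels: 1 for a true homolog, 0 otherwise.
    n=None uses every negative, which gives the ordinary ROC (AUC) score.
    Ties in score are not treated specially in this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(1 for _, y in ranked if y == 1)
    total_neg = len(ranked) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative")

    # If fewer than n negatives exist, fall back to the number available.
    limit = total_neg if n is None else min(n, total_neg)

    true_pos_seen = 0
    false_pos_seen = 0
    area = 0
    for _, y in ranked:
        if y == 1:
            true_pos_seen += 1
        else:
            # Each false positive contributes the positives ranked above it.
            false_pos_seen += 1
            area += true_pos_seen
            if false_pos_seen == limit:
                break
    return area / (limit * total_pos)


# Toy data (hypothetical): five ranked hits, three of them true homologs.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 1, 0]

roc_all = roc_n(toy_scores, toy_labels)         # ordinary ROC score
roc_50 = roc_n(toy_scores, toy_labels, n=50)    # ROC50; only 2 negatives here
roc_1 = roc_n(toy_scores, toy_labels, n=1)      # truncated at the first false positive
```

Because ROC50 only counts positives ranked ahead of the first 50 false positives, it rewards predictors that place true homologs at the very top of the list, which is why it is usually lower than the full ROC score.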

Year:  2018        PMID: 29993950     DOI: 10.1109/TCBB.2018.2789880

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  3 in total

1.  RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins.

Authors:  Yumeng Liu; Xiaolong Wang; Bin Liu
Journal:  Brief Bioinform       Date:  2021-03-22       Impact factor: 11.622

2.  Gene2vec: gene subsequence embedding for prediction of mammalian N 6-methyladenosine sites from mRNA.

Authors:  Quan Zou; Pengwei Xing; Leyi Wei; Bin Liu
Journal:  RNA       Date:  2018-11-13       Impact factor: 4.942

3.  Computational analysis and prediction of PE_PGRS proteins using machine learning.

Authors:  Fuyi Li; Xudong Guo; Dongxu Xiang; Miranda E Pitt; Arnold Bainomugisa; Lachlan J M Coin
Journal:  Comput Struct Biotechnol J       Date:  2022-01-22       Impact factor: 7.271
