
Discovering Variable-Length Patterns in Protein Sequences for Protein-Protein Interaction Prediction.

Keith C C Chan.   

Abstract

Several computational approaches to predicting protein-protein interactions (PPIs) have recently been proposed. Among them, sequence-based approaches are often preferred because they require no prior knowledge about the proteins. However, in deciding whether two proteins may interact, existing sequence-based approaches consider only fixed-length segments. We believe that interactions between proteins can be predicted more accurately if variable-length segments are also considered. To this end, we have developed the VLASPD algorithm. Given a database of protein sequences, VLASPD proceeds in several steps. The protein database is first searched to identify frequent sequence segments (FSSs) of different lengths. Different combinations of the presence and absence of these FSSs are then used to form associative sequential patterns (ASPs). Based on a statistical measure, the ASPs that occur significantly frequently among proteins in the training set are identified as significant associative sequential patterns (SASPs). If an SASP is found in a protein pair, it is considered to provide some evidence supporting or refuting an interaction between the two proteins. The amount of evidence provided is then quantified with an information-theoretic measure. How likely two proteins are to interact is decided by the total amount of evidence provided by the SASPs found in the pair. To test the effectiveness of VLASPD, we used several sets of real data. The experimental results show that VLASPD can be a promising approach for PPI prediction. VLASPD is available for use and testing at http://www.comp.polyu.edu.hk/~cslhu/resources/vlaspd/.
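The steps described in the abstract can be illustrated with a toy sketch. This is not the VLASPD implementation: the abstract names neither its statistical measure nor its information-theoretic measure, so the sketch assumes a simple support threshold for frequent segments, treats a pattern as "one segment present in each protein of the pair", skips the significance filter entirely, and uses an add-one-smoothed log-likelihood ratio as a stand-in weight of evidence. All function names, parameters, and sequences below are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def frequent_segments(sequences, min_len=2, max_len=3, min_support=2):
    """Substrings of length min_len..max_len found in >= min_support sequences."""
    support = Counter()
    for seq in sequences:
        seen = set()
        for k in range(min_len, max_len + 1):
            for i in range(len(seq) - k + 1):
                seen.add(seq[i:i + k])
        support.update(seen)  # each segment counts once per sequence
    return {seg for seg, n in support.items() if n >= min_support}


def pair_patterns(seq_a, seq_b, segments):
    """Assumed pattern form: one segment present in each protein (order-independent)."""
    in_a = [s for s in segments if s in seq_a]
    in_b = [s for s in segments if s in seq_b]
    return {tuple(sorted(p)) for p in product(in_a, in_b)}


def weights_of_evidence(pairs, labels, segments):
    """Assumed measure: log2 of P(pattern | interacting) / P(pattern | not), add-one smoothed."""
    pos, neg = Counter(), Counter()
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for (a, b), label in zip(pairs, labels):
        (pos if label else neg).update(pair_patterns(a, b, segments))
    return {
        p: math.log2(((pos[p] + 1) / (n_pos + 2)) / ((neg[p] + 1) / (n_neg + 2)))
        for p in set(pos) | set(neg)
    }


def interaction_score(seq_a, seq_b, segments, weights):
    """Total evidence from all known patterns found in the pair."""
    return sum(weights.get(p, 0.0) for p in pair_patterns(seq_a, seq_b, segments))


# Made-up toy sequences and labels (1 = interacting, 0 = non-interacting).
proteins = ["MKLV", "AKLG", "TTPQ", "STPQ"]
train_pairs = [("MKLV", "TTPQ"), ("AKLG", "STPQ"), ("MKLV", "AKLG"), ("TTPQ", "STPQ")]
train_labels = [1, 1, 0, 0]

segments = frequent_segments(proteins)
weights = weights_of_evidence(train_pairs, train_labels, segments)
score_pos = interaction_score("MKLV", "STPQ", segments, weights)
score_neg = interaction_score("MKLV", "AKLG", segments, weights)
print(sorted(segments), score_pos, score_neg)
```

A positive total score is read here as net evidence for an interaction and a negative one as net evidence against it. A fuller version would add the significance filter before any weights are computed.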


Year: 2015    PMID: 26011889    DOI: 10.1109/TNB.2015.2429672

Source DB: PubMed    Journal: IEEE Trans Nanobioscience    ISSN: 1536-1241    Impact factor: 2.935


  2 in total

1.  Multi-view heterogeneous molecular network representation learning for protein-protein interaction prediction.

Authors:  Xiao-Rui Su; Lun Hu; Zhu-Hong You; Peng-Wei Hu; Bo-Wei Zhao
Journal:  BMC Bioinformatics       Date:  2022-06-16       Impact factor: 3.307

2.  Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning.

Authors:  Zhenfeng Li; Lun Hu; Zehai Tang; Cheng Zhao
Journal:  Front Genet       Date:  2021-03-26       Impact factor: 4.599

