
A New Sequential Forward Feature Selection (SFFS) Algorithm for Mining Best Topological and Biological Features to Predict Protein Complexes from Protein-Protein Interaction Networks (PPINs).

Haseeb Younis, Muhammad Waqas Anwar, Muhammad Usman Ghani Khan, Aisha Sikandar, Usama Ijaz Bajwa.

Abstract

Protein-protein interactions play an important role in understanding biological processes in the body. A network of dynamic protein complexes within a cell that regulates most biological processes is known as a protein-protein interaction network (PPIN). Predicting complexes from PPINs is a challenging task. Most previous computational approaches mine cliques, stars, linear and hybrid structures as complexes from PPINs by considering topological features, and few of them focus on the important biological information contained within protein amino acid sequences. In this study, we compute a wide variety of topological features and integrate them with biological features computed from protein amino acid sequences, such as bag-of-words, physicochemical and spectral domain features. We propose a new Sequential Forward Feature Selection (SFFS) algorithm, i.e., random forest-based Boruta feature selection, for selecting the best features from the large computed feature set. Decision tree, linear discriminant analysis and gradient boosting classifiers are used as learners. We conducted experiments on two reference protein complex datasets of yeast, i.e., CYC2008 and MIPS. Human and mouse complex information is taken from the CORUM 3.0 dataset. Protein interaction information is extracted from the Database of Interacting Proteins (DIP). Our proposed SFFS, i.e., random forest-based Boruta feature selection in combination with decision tree, linear discriminant analysis and gradient boosting classifiers, outperforms other state-of-the-art algorithms, achieving precision, recall and F-measure of 94.58%, 94.92% and 94.45% for MIPS; 96.31%, 93.55% and 96.02% for CYC2008; 98.84%, 98.00% and 98.87% for CORUM human; and 96.60%, 96.70% and 96.32% for CORUM mouse complexes, respectively.
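
As a rough illustration of what combining topological and sequence-derived features for a candidate complex might look like, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact feature definitions, so the function names (`kmer_bag_of_words`, `topological_features`), the choice of k = 2, the three topological features, and the toy proteins and sequences are all assumptions made for this example. The Boruta selection and classifier steps are not shown.

```python
from collections import Counter


def kmer_bag_of_words(sequence, k=2):
    """Count overlapping k-mers in an amino acid sequence (a bag-of-words view)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def topological_features(edges, members):
    """Size, edge density and mean internal degree of the subgraph induced by `members`."""
    members = set(members)
    # Keep only undirected, non-self edges with both endpoints inside the candidate complex.
    internal = {frozenset((u, v)) for u, v in edges
                if u in members and v in members and u != v}
    n = len(members)
    possible = n * (n - 1) / 2  # number of edges in a complete graph on n nodes
    density = len(internal) / possible if possible else 0.0
    mean_degree = 2 * len(internal) / n if n else 0.0
    return {"size": n, "density": density, "mean_degree": mean_degree}


# Toy, made-up data: protein names, sequences and interactions are hypothetical.
edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
sequences = {"P1": "MKVLA", "P2": "MKKLA", "P3": "AVLMK"}
candidate = ["P1", "P2", "P3"]

# One feature vector per candidate complex: topology plus pooled k-mer counts.
features = topological_features(edges, candidate)
bag = sum((kmer_bag_of_words(sequences[p]) for p in candidate), Counter())
features.update({f"kmer_{kmer}": count for kmer, count in bag.items()})
print(features)
```

In a pipeline like the one the abstract describes, vectors of this kind would be passed to a feature selection step and then to the classifiers. Here the pooled k-mer counts simply sum over the member proteins, which is one of several plausible ways to aggregate per-protein features.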


Keywords:  Complex topology; Machine learning; Protein complex detection; Protein–protein interaction network

Year:  2021        PMID: 33959851     DOI: 10.1007/s12539-021-00433-8

Source DB:  PubMed          Journal:  Interdiscip Sci        ISSN: 1867-1462            Impact factor:   2.233


  2 in total

1.  Identification of Chemical-Disease Associations Through Integration of Molecular Fingerprint, Gene Ontology and Pathway Information.

Authors:  Zhanchao Li; Mengru Wang; Dongdong Peng; Jie Liu; Yun Xie; Zong Dai; Xiaoyong Zou
Journal:  Interdiscip Sci       Date:  2022-04-07       Impact factor: 3.492

Review 2.  Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects.

Authors:  Gauri Panditrao; Rupa Bhowmick; Chandrakala Meena; Ram Rup Sarkar
Journal:  J Biosci       Date:  2022       Impact factor: 2.795
