LIBRA-WA: a web application for ligand binding site detection and protein function recognition.

Daniele Toti, Le Viet Hung, Valentina Tortosa, Valentina Brandi, Fabio Polticelli

Abstract

Summary: Recently, LIBRA, a tool for active/ligand binding site prediction, was described. LIBRA's effectiveness was comparable to similar state-of-the-art tools; however, its scoring scheme, output presentation, dependence on local resources and overall convenience were amenable to improvements. To solve these issues, LIBRA-WA, a web application based on an improved LIBRA engine, has been developed, featuring a novel scoring scheme consistently improving LIBRA's performance, and a refined algorithm that can identify binding sites hosted at the interface between different subunits. LIBRA-WA also sports additional functionalities like ligand clustering and a completely redesigned interface for an easier analysis of the output. Extensive tests on 373 apoprotein structures indicate that LIBRA-WA is able to identify the biologically relevant ligand/ligand binding site in 357 cases (∼96%), with the correct prediction ranking first in 349 cases (∼98% of the latter, ∼94% of the total). The earlier stand-alone tool has also been updated and dubbed LIBRA+, by integrating LIBRA-WA's improved engine for cross-compatibility purposes.

Availability and implementation: LIBRA-WA and LIBRA+ are available at: http://www.computationalbiology.it/software.html.

Contact: polticel@uniroma3.it.

Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Year:  2018        PMID: 29126218      PMCID: PMC6192203          DOI: 10.1093/bioinformatics/btx715

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

In recent years, structure-based protein function recognition has gained renewed interest due to the availability of a large number of experimental protein structures of unknown function, determined within the context of structural genomics initiatives (Grabowski et al.; Petrey et al.). In this framework, we recently developed and described LIBRA, a graph theory-based software tool that, given a protein’s structural model, predicts the presence and identity of active sites and/or small molecule ligand binding sites (Viet Hung et al.). Extensive tests carried out on the LigASite (Dessailly et al.) set of approximately 400 apoproteins indicated that LIBRA was able to identify the correct binding/active site in ∼90% of the cases analyzed, outperforming other structure-based function recognition software such as SiteSeer (Laskowski et al., 2005a,b), EF-Seek (Murakami et al.) and ASSIST (previously developed in our lab; Caprari et al.), while displaying a performance comparable to ProFunc, which employs a combined sequence/structure approach (Laskowski et al.). However, the correct site ranked first in only 80% of the cases, a suboptimal performance that needed to be improved for LIBRA to be able to handle the most challenging cases. For this purpose, an improved version of LIBRA featuring a novel scoring system has been developed both as a web application, LIBRA-WA, and as a standalone tool, LIBRA+. The new system also features an improved algorithm to deal with binding sites located at the interface of different protein subunits, and clusters the identified ligands according to their chemical similarity. Tests carried out on the same set of apoproteins previously used for LIBRA demonstrate a significant improvement in performance: LIBRA-WA is able to identify the correct binding site in ∼96% of the cases, with the correct site ranking first in ∼94% of the cases. Comparative tests demonstrate that LIBRA-WA has a performance comparable to the state-of-the-art COACH meta-server.

2 Materials and methods

2.1 LIBRA-WA’s improved engine and features

LIBRA-WA features an improved active/ligand binding site detection engine and a number of additional features, including a redesigned GUI freely accessible online. The core improvement of the engine lies in a novel scoring system, which takes advantage of a clustering process carried out on more than 17 000 unique small molecule ligands stored in the application’s database, based on their SMILES representation. For each alignment record, LIBRA-WA now provides a score obtained by combining the contributions given by the aligned binding site’s clique size (number of matching residues between the input protein and the target binding site), the RMSD value, and the relative size of the cluster containing the ligand. A detailed description of the calculation of this combined score is provided in the Supplementary Material. In addition, the detection algorithm has been further refined to allow the identification of binding sites hosted at the interface between different subunits. Recognition jobs can be launched against two pre-compiled databases: a ligand binding sites database, including more than 173 000 entries, and a database of active sites derived from the Catalytic Site Atlas (Furnham et al., 2014) (∼1000 entries) that can be used for the prediction of the catalytic activity of an input protein. For a detailed description of the procedure used to build the two databases, see Viet Hung et al. As a web application sharing the architectural framework described in Atzeni et al. (2011a, b) and Toti et al., LIBRA-WA is freely accessible to any web user, who can schedule multiple recognition jobs. Optionally, LIBRA-WA also enables users to create a personal workspace and access their results at a later time, and notifies them once their jobs have finished. Results can also be displayed graphically in three dimensions via the Jmol HTML5 plug-in (Hanson, 2010).
Furthermore, the LIBRA desktop application has been updated by incorporating the new detection engine and the information about the ligand clusters: this new version, which has been dubbed LIBRA+, can read the results exported from LIBRA-WA and is backward-compatible with the output files produced with the original version of LIBRA. A more thorough description of LIBRA-WA’s additional functionalities is reported in the Supplementary Material.
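
As an illustration of how a composite score of this kind might rank alignment records, the following Python sketch combines the three contributions named above. It is not the LIBRA-WA formula, which is given only in the Supplementary Material. The weights, the normalisation of each term, the `max_clique` cap, the assumption that a larger ligand cluster raises the score, and all names (`AlignmentRecord`, `combined_score`, `rank_hits`) are hypothetical, as are the example values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AlignmentRecord:
    """One hit: a site of the input protein aligned to a database binding site."""
    ligand: str
    clique_size: int         # matching residues between input and target site
    rmsd: float              # angstroms; lower means a tighter superposition
    cluster_fraction: float  # ligand cluster size / all ligands, in [0, 1]


def combined_score(rec: AlignmentRecord,
                   max_clique: int = 10,
                   w_clique: float = 0.5,
                   w_rmsd: float = 0.3,
                   w_cluster: float = 0.2) -> float:
    """Hypothetical weighted sum; every term is scaled to the range [0, 1]."""
    clique_term = min(rec.clique_size, max_clique) / max_clique
    rmsd_term = 1.0 / (1.0 + rec.rmsd)  # equals 1 at RMSD 0, decays as RMSD grows
    return (w_clique * clique_term
            + w_rmsd * rmsd_term
            + w_cluster * rec.cluster_fraction)


def rank_hits(records: List[AlignmentRecord]) -> List[AlignmentRecord]:
    """Return the records sorted from best to worst combined score."""
    return sorted(records, key=combined_score, reverse=True)


# Made-up example values, chosen only to exercise the ranking.
hits = [
    AlignmentRecord("GOL", clique_size=4, rmsd=0.4, cluster_fraction=0.10),
    AlignmentRecord("ATP", clique_size=8, rmsd=1.5, cluster_fraction=0.05),
    AlignmentRecord("ADP", clique_size=8, rmsd=0.5, cluster_fraction=0.05),
]
print([h.ligand for h in rank_hits(hits)])  # ['ADP', 'ATP', 'GOL']
```

In this sketch the two eight-residue matches outrank the four-residue one despite its lower RMSD, and the RMSD term then separates them. How LIBRA-WA actually balances the three contributions may differ.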

3 Results

The effectiveness of LIBRA-WA has been tested on the LigASite set. A detailed analysis of the results is reported in Supplementary Table S1. As shown in the table, LIBRA-WA finds the biologically relevant ligand/binding site in ∼96% of the cases. More importantly for the predictive power of the application, the correct ligand/binding site ranks first in ∼94% of all cases. This matters because in ‘real life’ applications, where no functional information is available on the protein of interest, it is essential that the correct prediction is found among the first few hits. Even after removing from the database the holo-proteins present in the LigASite set, the application still performs fairly well: LIBRA-WA identifies a biologically relevant ligand in 88% of the cases, with the correct ligand ranking first in 80% of the cases (Supplementary Table S4). Particularly striking is the ability of LIBRA-WA to pick out similar ligand binding motifs even in structures that do not display significant sequence/structure similarity. One such example, illustrated in Supplementary Figure S3, is that of the E. coli adenylate kinase (apoprotein PDB code 4AKE) which, upon ADP binding, undergoes a large conformational change (holoprotein PDB code 2ECK). As a consequence, the program does not identify the ADP binding site contained in the database entry 2ECK as a correct match. Nonetheless, it correctly identifies the ADP binding site in the input protein by virtue of the structural similarity with the ADP binding site of the human kinesin-8 motor domain (PDB code 3LRE). As detailed in the Supplementary Material, the E. coli adenylate kinase and the human kinesin-8 motor domain do not display a similar fold and share a non-significant 11% sequence identity.
A combined execution of LIBRA-WA using both the ligand binding sites and the catalytic sites databases allows a user to obtain information on the location of the binding site, the identity of the ligand(s) and, if the input protein is an enzyme, its catalytic activity, and thus to assign a function to the input protein with high confidence. For example, for the E. coli adenylate kinase, a run using the ligand binding sites database returns as first hit an ADP binding site similar to that of the kinesin-8 motor domain, whereas a run using the catalytic sites database returns as first hit a catalytic site similar to that of Bacillus stearothermophilus adenylate kinase (PDB code 1ZIO). Combining the two pieces of information leads to a highly reliable function prediction for the input protein.
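
The headline percentages for the full benchmark can be re-derived from the counts given in the Summary (373 apoprotein structures, 357 with the relevant site identified, 349 with it ranked first). The short Python check below is only a worked arithmetic example. The variable and function names are illustrative and not part of LIBRA-WA.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


TOTAL = 373         # apoprotein structures tested
IDENTIFIED = 357    # relevant ligand/binding site found at any rank
RANKED_FIRST = 349  # relevant ligand/binding site found as the top hit

found_of_total = pct(IDENTIFIED, TOTAL)         # share of all cases solved
first_of_found = pct(RANKED_FIRST, IDENTIFIED)  # top-ranked among solved cases
first_of_total = pct(RANKED_FIRST, TOTAL)       # top-ranked among all cases

print(f"{found_of_total:.1f}%")  # 95.7%
print(f"{first_of_found:.1f}%")  # 97.8%
print(f"{first_of_total:.1f}%")  # 93.6%
```

Rounded to whole numbers, these values match the ∼96%, ∼98% and ∼94% quoted for the test set.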

4 Discussion

In this paper, the development of LIBRA-WA, a web application based on an improved LIBRA engine, has been described. Thanks to an enhanced, composite scoring system, both precision and recall are significantly improved in LIBRA-WA with respect to LIBRA, as can be clearly seen from the results of the extensive tests detailed in Supplementary Table S1. Furthermore, LIBRA-WA outperforms SiteSeer while displaying a performance comparable to that of COACH (Yang et al.), ranked as the best method in the weekly CAMEO ligand Binding Site Prediction Experiments (Haas et al.), even though the latter uses a combination of structure-based and sequence-based algorithms, while LIBRA-WA is purely structure-based (Supplementary Tables S2 and S3).

Conflict of Interest: none declared.

References

1.  LIBRA: LIgand Binding site Recognition Application.

Authors:  Le Viet Hung; Silvia Caprari; Massimiliano Bizai; Daniele Toti; Fabio Polticelli
Journal:  Bioinformatics       Date:  2015-08-26       Impact factor: 6.937

2.  Protein function prediction using local 3D templates.

Authors:  Roman A Laskowski; James D Watson; Janet M Thornton
Journal:  J Mol Biol       Date:  2005-08-19       Impact factor: 5.469

3.  ASSIST: a fast versatile local structural comparison tool.

Authors:  Silvia Caprari; Daniele Toti; Le Viet Hung; Maurizio Di Stefano; Fabio Polticelli
Journal:  Bioinformatics       Date:  2013-11-15       Impact factor: 6.937

4.  The impact of structural genomics: the first quindecennial.

Authors:  Marek Grabowski; Ewa Niedzialkowska; Matthew D Zimmerman; Wladek Minor
Journal:  J Struct Funct Genomics       Date:  2016-03-02

5.  Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment.

Authors:  Jianyi Yang; Ambrish Roy; Yang Zhang
Journal:  Bioinformatics       Date:  2013-08-23       Impact factor: 6.937

6.  ProFunc: a server for predicting protein function from 3D structure.

Authors:  Roman A Laskowski; James D Watson; Janet M Thornton
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

7.  The Catalytic Site Atlas 2.0: cataloging catalytic sites and residues identified in enzymes.

Authors:  Nicholas Furnham; Gemma L Holliday; Tjaart A P de Beer; Julius O B Jacobsen; William R Pearson; Janet M Thornton
Journal:  Nucleic Acids Res       Date:  2013-12-06       Impact factor: 16.971

8.  The Protein Model Portal--a comprehensive resource for protein structure and model information.

Authors:  Juergen Haas; Steven Roth; Konstantin Arnold; Florian Kiefer; Tobias Schmidt; Lorenza Bordoli; Torsten Schwede
Journal:  Database (Oxford)       Date:  2013-04-26       Impact factor: 3.451

9.  LigASite--a database of biologically relevant binding sites in proteins with known apo-structures.

Authors:  Benoit H Dessailly; Marc F Lensink; Christine A Orengo; Shoshana J Wodak
Journal:  Nucleic Acids Res       Date:  2007-10-11       Impact factor: 16.971

10.  Exhaustive comparison and classification of ligand-binding surfaces in proteins.

Authors:  Yoichi Murakami; Kengo Kinoshita; Akira R Kinjo; Haruki Nakamura
Journal:  Protein Sci       Date:  2013-09-04       Impact factor: 6.725
