
PyPLIF: Python-based Protein-Ligand Interaction Fingerprinting.

Muhammad Radifar¹, Nunung Yuniarti, Enade Perdana Istyastono.

Abstract

Structure-based virtual screening (SBVS) methods often rely on docking scores. The docking score is an oversimplification of actual ligand-target binding, and its capability to model and predict that binding is limited. Interaction fingerprinting (IFP) has recently emerged as an alternative way to examine protein-ligand interactions: the docking score indicates the approximate affinity, whereas IFP shows the interaction specificity. IFP converts three-dimensional (3D) protein-ligand interactions into one-dimensional (1D) bitstrings. The bitstrings are subsequently used to compare the protein-ligand interactions predicted by the docking tool against those of the reference ligand. These comparisons produce scores that can be used to enhance the quality of SBVS campaigns. However, some IFP tools are either proprietary or use a proprietary library, which limits access to the tools and the development of customized IFP algorithms. Therefore, we have developed PyPLIF, a Python-based open source tool to analyze IFP. In this article, we describe PyPLIF and its application to enhance the quality of SBVS in order to identify antagonists for estrogen receptor α (ERα). AVAILABILITY: PyPLIF is freely available at http://code.google.com/p/pyplif.


Keywords: Python; Virtual screening; docking software; interaction fingerprinting; open source

Year: 2013 PMID: 23559752 PMCID: PMC3607193 DOI: 10.6026/97320630009325

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Background

Interaction fingerprinting (IFP) is a relatively new method in virtual screening (VS) that has been shown to increase VS quality. The method matches the protein-ligand interactions in the output of molecular docking against those of a reference (usually from an experimental study). In fact, the current record-holding prospective fragment-based VS study was aided by IFP [1]. Unfortunately, IFP software is usually proprietary or uses a proprietary library. Therefore, we have developed a Python-based IFP tool that depends on Open Babel [2], an open source chemical library, to provide a completely free IFP tool that anyone can use and freely modify or extend according to their needs.

Methodology

PyPLIF performs IFP by converting the ligand-protein molecular interactions into a bit array according to the residues of choice and the interaction types [3]. For every residue there are seven bits, which represent seven types of interaction: (i) apolar (van der Waals), (ii) aromatic face to face, (iii) aromatic edge to face, (iv) hydrogen bond (protein as hydrogen bond donor), (v) hydrogen bond (protein as hydrogen bond acceptor), (vi) electrostatic interaction (protein positively charged), and (vii) electrostatic interaction (protein negatively charged) (Figure 1A). Subsequently, the bit array from each docking pose is compared against that of the reference, and their similarity is measured using the Tanimoto coefficient (Tc) (Figure 1B). Tc ranges from 0.000 to 1.000, where 0.000 means no similarity and 1.000 means that the interaction fingerprint of the docking pose (within the selected residues) is identical to that of the reference.

Figure 1

PyPLIF results: (A) The 7 bits that represent 7 different interactions for each residue; 1 (one) means the interaction is present (on), while 0 (zero) means the interaction is absent (off); (B) Tanimoto coefficient (Tc), which is used to measure interaction similarity; (C) An example of PyPLIF output; and (D) Best ligand pose screened with PyPLIF and the additional ASP351 filter; the ligand (ZINC03815477 conformation #9) shows not only high overlap but also a hydrogen bond with ASP351. The 3D figure was generated using PyMOL 1.2r1 (http://www.pymol.org).
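The comparison step described in the Methodology can be illustrated with a minimal Python sketch. It is not taken from the PyPLIF source: the function name `tanimoto`, the example bitstrings, and the convention of returning 0.0 when both fingerprints are empty are assumptions made here for illustration. The sketch uses the standard Tanimoto definition, the number of bits on in both fingerprints divided by the number of bits on in at least one of them.

```python
# Illustrative sketch only; names and example values are assumptions,
# not PyPLIF's own code.
INTERACTION_TYPES = (
    "apolar",
    "aromatic face to face",
    "aromatic edge to face",
    "H-bond, protein donor",
    "H-bond, protein acceptor",
    "electrostatic, protein positive",
    "electrostatic, protein negative",
)


def tanimoto(reference: str, pose: str) -> float:
    """Tanimoto coefficient between two equal-length strings of '0' and '1'."""
    if len(reference) != len(pose):
        raise ValueError("bitstrings must have the same length")
    on_ref = reference.count("1")
    on_pose = pose.count("1")
    shared = sum(1 for r, p in zip(reference, pose) if r == "1" and p == "1")
    union = on_ref + on_pose - shared
    # Assumed convention: two all-zero fingerprints give 0.0.
    return shared / union if union else 0.0


# Two residues, seven bits each (made-up values).
reference = "1000100" + "1100000"
pose = "1000000" + "1100010"
print(f"{tanimoto(reference, pose):.3f}")  # 3 shared bits / 5 bits in the union
```

In this made-up example the pose shares three of the reference's four interactions and adds one of its own, so Tc is 3/5.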

Input

Aside from the docking output from PLANTS [4], PyPLIF requires three files: a configuration file (config.txt), a protein binding site file, and a ligand reference file. The configuration file (available in supplementary material) consists of five lines, each with a keyword-value pair, where the keywords are protein_reference, ligand_reference, protein_ligand_folder, residue_of_choice, and output_file.
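As a rough illustration of the five-keyword configuration described above, the following Python sketch parses such a file. The exact syntax of config.txt is not given here, so the sketch assumes that each line holds a keyword followed by whitespace and its value, and that residues are whitespace-separated labels. The file names and the residues other than ASP351 are hypothetical, and `parse_config` is not a PyPLIF function.

```python
# Illustrative sketch only; the separator and the example values are assumptions.
EXPECTED_KEYS = (
    "protein_reference",
    "ligand_reference",
    "protein_ligand_folder",
    "residue_of_choice",
    "output_file",
)


def parse_config(text: str) -> dict:
    """Parse keyword-value lines; the first whitespace run separates the two."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        config[parts[0]] = parts[1] if len(parts) > 1 else ""
    missing = [key for key in EXPECTED_KEYS if key not in config]
    if missing:
        raise ValueError("missing keywords: " + ", ".join(missing))
    # Assumption: residues are listed as whitespace-separated labels.
    config["residue_of_choice"] = config["residue_of_choice"].split()
    return config


# Hypothetical file contents.
example = """\
protein_reference protein_ref.mol2
ligand_reference ligand_ref.mol2
protein_ligand_folder docking_results/
residue_of_choice ASP351 GLU353 ARG394
output_file output.csv
"""
config = parse_config(example)
print(config["residue_of_choice"])
```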

Output

After a run has completed, PyPLIF generates an output file in .csv format (Figure 1C), which is best opened using a text editor. The first line of this file lists the residues of choice, the second line shows the reference ligand and its bitstring, and the remaining lines are the ligand poses from the PLANTS output. Each of these pose lines consists of four columns: the name of the ligand, the docking score, the Tc, and the bitstring. A simple shell script can then be applied to the PyPLIF output to increase the quality of SBVS.
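The four-column layout described above could be read with a sketch like the one below. The sketch assumes comma-delimited fields (suggested by the .csv extension but not confirmed here) and simply skips the first two lines; the `Pose` record, the `read_poses` helper, and all sample values are hypothetical.

```python
# Illustrative sketch only; delimiter, names, and sample values are assumptions.
import csv
from typing import List, NamedTuple


class Pose(NamedTuple):
    name: str
    score: float
    tc: float
    bits: str


def read_poses(lines: List[str]) -> List[Pose]:
    """Return pose records, skipping the residue line and the reference line."""
    poses = []
    for row in csv.reader(lines[2:]):
        if len(row) != 4:
            continue  # ignore blank or malformed lines
        name, score, tc, bits = (field.strip() for field in row)
        poses.append(Pose(name, float(score), float(tc), bits))
    return poses


# Made-up file contents: residue line, reference line, then pose lines.
sample = [
    "ASP351,GLU353",
    "reference_ligand,10001001100000",
    "ligand_A_conf_1,-95.2,0.600,10000001100010",
    "ligand_B_conf_3,-88.7,0.250,00000001000000",
]
poses = read_poses(sample)
print(poses[0].name, poses[0].tc)
```

Keeping the bitstring as text rather than converting it to an integer preserves leading zeros, which matter because each residue occupies a fixed seven-bit block.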

Results & Discussion

PyPLIF version 0.1.1 was tested on Ubuntu with three different versions of the Open Babel library: (i) 2.2.3, (ii) 2.3.0, and (iii) 2.3.1. These versions were selected because they are the default versions available in recent Ubuntu releases [5]. As input we used the docking results from the retrospective validation of SBVS protocols to identify estrogen receptor α (ERα) antagonists, which were kindly provided by Anita et al. [6]. Despite the code and data differences among the three Open Babel versions, the bit arrays and the Tc values in the output were identical. This indicates that PyPLIF is stable and robust, at least for this dataset [6].

To assess whether PyPLIF can enhance SBVS quality, the enrichment factor at 1% false positives (EF1%) was examined after sorting the ligands by their Tc values. Ligands with the same Tc value were sorted by docking score. This method gave an EF1% value of 17.94, whereas the previous study reported an EF1% value of 21.2 [6]. In this attempt, PyPLIF did not enhance the SBVS quality. To demonstrate another way of using PyPLIF, we tried a second approach that employs knowledge of the molecular determinants of ligand binding to ERα, similar to the one used by de Graaf et al. [1]. Since residue ASP351 is particularly important for ligand binding to ERα [7, 8], we added a hydrogen bond filter on ASP351 using a simple shell script (available in supplementary material), which, surprisingly, increased the EF1% value to 53.84. Thus, post-dock analysis using PyPLIF can significantly increase the quality of a VS campaign.
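The two post-processing strategies described above, ranking by Tc with the docking score as tie-breaker and filtering on a hydrogen bond with ASP351, can be sketched in Python as follows. The authors used a shell script from the supplementary material, which is not reproduced here; this is a separate illustration. It assumes that the seven bits of each residue follow the order (i) to (vii) listed in the Methodology, that either hydrogen bond bit satisfies the filter, and that a lower (more negative) docking score is better. All names and pose values are made up.

```python
# Illustrative sketch only; bit order, score direction, and data are assumptions.
from typing import List, Tuple

BITS_PER_RESIDUE = 7
HBOND_BITS = (3, 4)  # zero-based positions of bits (iv) and (v)

Pose = Tuple[str, float, float, str]  # name, docking score, Tc, bitstring


def has_hbond(bits: str, residues: List[str], residue: str) -> bool:
    """True if either hydrogen bond bit is on in the block of the given residue."""
    start = residues.index(residue) * BITS_PER_RESIDUE
    block = bits[start:start + BITS_PER_RESIDUE]
    return any(block[i] == "1" for i in HBOND_BITS)


def rank(poses: List[Pose]) -> List[Pose]:
    """Highest Tc first; ties broken by docking score (lower assumed better)."""
    return sorted(poses, key=lambda p: (-p[2], p[1]))


residues = ["ASP351", "GLU353"]
poses = [
    ("lig_A", -90.0, 0.600, "1000100" + "1100000"),
    ("lig_B", -95.0, 0.600, "1000000" + "1100000"),
    ("lig_C", -80.0, 0.800, "1001000" + "1100000"),
]
filtered = [p for p in poses if has_hbond(p[3], residues, "ASP351")]
ranked = rank(filtered)
print([p[0] for p in ranked])
```

Here lig_B is discarded because neither hydrogen bond bit is on in its ASP351 block, even though its docking score is the best of the three.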

Caveat & Future Development

Since this tool is still very new, its features are quite limited. First, the tool works only with output from PLANTS; support for AutoDock Vina [9] is currently under development. Second, the tool still relies on a command-line interface, which requires additional skill to run PyPLIF and analyze its output. We would like to integrate a graphical user interface (GUI) to help medicinal chemists run PyPLIF and analyze the results easily.

8 in total

1. Crystal structure-based virtual screening for fragment-like ligands of the human histamine H(1) receptor.

Authors: Chris de Graaf; Albert J Kooistra; Henry F Vischer; Vsevolod Katritch; Martien Kuijer; Mitsunori Shiroishi; So Iwata; Tatsuro Shimamura; Raymond C Stevens; Iwan J P de Esch; Rob Leurs
Journal: J Med Chem Date: 2011-11-07 Impact factor: 7.446

2. Structure-function relationships of estrogenic triphenylethylenes related to endoxifen and 4-hydroxytamoxifen.

Authors: Philipp Y Maximov; Cynthia B Myers; Ramona F Curpan; Joan S Lewis-Wambi; V Craig Jordan
Journal: J Med Chem Date: 2010-04-22 Impact factor: 7.446

3. Optimizing fragment and scaffold docking by use of molecular interaction fingerprints.

Authors: Gilles Marcou; Didier Rognan
Journal: J Chem Inf Model Date: 2007 Jan-Feb Impact factor: 4.956

4. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading.

Authors: Oleg Trott; Arthur J Olson
Journal: J Comput Chem Date: 2010-01-30 Impact factor: 3.376

5. Empirical scoring functions for advanced protein-ligand docking with PLANTS.

Authors: Oliver Korb; Thomas Stützle; Thomas E Exner
Journal: J Chem Inf Model Date: 2009-01 Impact factor: 4.956

6. Tamoxifen and raloxifene differ in their functional interactions with aspartate 351 of estrogen receptor alpha.

Authors: Guila Dayan; Mathieu Lupien; Anick Auger; Silvia I Anghel; Walter Rocha; Sébastien Croisetière; John A Katzenellenbogen; Sylvie Mader
Journal: Mol Pharmacol Date: 2006-05-05 Impact factor: 4.436

7. Structure-based design of eugenol analogs as potential estrogen receptor antagonists.

Authors: Yulia Anita; Muhammad Radifar; Leonardus Bs Kardono; Muhammad Hanafi; Enade P Istyastono
Journal: Bioinformation Date: 2012-10-01

8. Open Babel: An open chemical toolbox.

Authors: Noel M O'Boyle; Michael Banck; Craig A James; Chris Morley; Tim Vandermeersch; Geoffrey R Hutchison
Journal: J Cheminform Date: 2011-10-07 Impact factor: 5.514


13 in total

1. Delineation of Polypharmacology across the Human Structural Kinome Using a Functional Site Interaction Fingerprint Approach.

Authors: Zheng Zhao; Li Xie; Lei Xie; Philip E Bourne
Journal: J Med Chem Date: 2016-03-17 Impact factor: 7.446

2. Improved pose and affinity predictions using different protocols tailored on the basis of data availability.

Authors: Philip Prathipati; Chioko Nagao; Shandar Ahmad; Kenji Mizuguchi
Journal: J Comput Aided Mol Des Date: 2016-10-06 Impact factor: 3.686

3. Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review.

Authors: Rocco Meli; Garrett M Morris; Philip C Biggin
Journal: Front Bioinform Date: 2022-06-17

4. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA.

Authors: Melissa F Adasme; Katja L Linnemann; Sarah Naomi Bolz; Florian Kaiser; Sebastian Salentin; V Joachim Haupt; Michael Schroeder
Journal: Nucleic Acids Res Date: 2021-07-02 Impact factor: 16.971

5. Insights into the binding mode of MEK type-III inhibitors. A step towards discovering and designing allosteric kinase inhibitors across the human kinome.

Authors: Zheng Zhao; Lei Xie; Philip E Bourne
Journal: PLoS One Date: 2017-06-19 Impact factor: 3.240

6. Prediction of sensitivity to gefitinib/erlotinib for EGFR mutations in NSCLC based on structural interaction fingerprints and multilinear principal component analysis.

Authors: Bin Zou; Victor H F Lee; Hong Yan
Journal: BMC Bioinformatics Date: 2018-03-07 Impact factor: 3.169

7. Structural Insights into the Binding Modes of Viral RNA-Dependent RNA Polymerases Using a Function-Site Interaction Fingerprint Method for RNA Virus Drug Discovery.

Authors: Zheng Zhao; Philip E Bourne
Journal: J Proteome Res Date: 2020-09-29 Impact factor: 4.466