Francisco Martínez-Jiménez, George Papadatos, Lun Yang, Iain M Wallace, Vinod Kumar, Ursula Pieper, Andrej Sali, James R Brown, John P Overington, Marc A Marti-Renom.
Abstract
Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), infects an estimated two billion people worldwide and is the leading cause of mortality due to infectious disease. New anti-TB therapeutics are needed because of the emergence of multidrug-resistant strains and co-infection with other pathogens, especially HIV. Recently, the pharmaceutical company GlaxoSmithKline published the results of a high-throughput screen (HTS) of their two-million-compound library for anti-mycobacterial phenotypes. The screen revealed 776 compounds with significant activity against the M. tuberculosis H37Rv strain, including a subset of 177 prioritized compounds with high potency and low in vitro cytotoxicity. The next major challenge is the identification of the target proteins. Here, we use a computational approach that integrates historical bioassay data, chemical properties and structural comparisons of selected compounds to propose their potential targets in M. tuberculosis. We predicted 139 target-compound links, providing a necessary basis for further studies to characterize the mode of action of these compounds. The results of our analysis, including the predicted structural models, are openly available to the wider scientific community to encourage further development of novel TB therapeutics.
Year: 2013 PMID: 24098102 PMCID: PMC3789770 DOI: 10.1371/journal.pcbi.1003253
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Figure 1. GSK dataset of 776 compounds.
Panels A to D describe the drug-like properties of the compounds, including the subset of 177 compounds active against MTB (green color). Red colored subsets correspond to compounds with weighted QED score smaller than 0.35 [12]. The distribution's mean values are shown in the top-right corner of each plot. A) Molecular weight distribution. B) PSA distribution. C) ALogP distribution. D) Weighted QED distribution. Panels E and F show the structural clusters of the compounds. Links between compounds indicate 0.9 or higher RFS similarity. E) Entire network of 776 compounds resulting in 551 structural families (486 singletons). F) Highlight of family number 1 with 38 compounds (inner images for the three most connected compounds in the family).
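Panels E and F can be read as a graph clustering. As a minimal sketch, assuming that a structural family is a connected component of the graph whose edges join compounds with similarity of 0.9 or higher (the caption does not state the clustering rule), families and singletons could be counted as follows. The function name and the toy scores are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cluster_families(n_compounds, similarities, threshold=0.9):
    """Return family sizes (largest first), treating each connected
    component of the similarity graph as one family (an assumption)."""
    parent = list(range(n_compounds))

    def find(i):
        # Follow parent links to the component root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in similarities.items():
        if score >= threshold:  # link only sufficiently similar pairs
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    sizes = Counter(find(i) for i in range(n_compounds))
    return sorted(sizes.values(), reverse=True)


# Toy pairwise scores for five compounds (hypothetical values).
toy_scores = {(0, 1): 0.95, (1, 2): 0.91, (0, 2): 0.60, (3, 4): 0.42}
family_sizes = cluster_families(5, toy_scores)
singletons = sum(1 for size in family_sizes if size == 1)
print(family_sizes, singletons)
```

In this toy case compounds 0, 1 and 2 form one family through chained links, even though the direct 0-2 score is below the threshold, and compounds 3 and 4 remain singletons.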
Figure 2. Subset of GSK compounds with predicted targets.
A) Venn diagram of the compounds with predictions shared among the three approaches (green, search of the chemogenomics space; purple, search of the structural space; red, historical data). B) Venn diagram of the compound families with predictions shared among the three approaches. C) Most under- and over-represented chemical families in our predictions. The upper plot shows the probability of finding a given family in the original dataset (grey bars) compared to the probability of finding it in the dataset with predicted targets (blue bars). The lower plot shows the log odds per selected family (i.e., those with absolute log odds larger than 0.5).
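The enrichment in panel C can be illustrated with a short calculation. This is a sketch under stated assumptions: the caption does not give the formula, so the log odds is taken here as the natural logarithm of the ratio between a family's probability in the predicted set and its probability in the original set. The function name and the toy family labels are hypothetical.

```python
import math
from collections import Counter


def family_log_odds(all_families, predicted_families):
    """Log ratio of each family's frequency in the predicted set to its
    frequency in the full dataset (assumed definition, natural log)."""
    background = Counter(all_families)
    predicted = Counter(predicted_families)
    scores = {}
    for family, count in predicted.items():
        if background[family] == 0:
            continue  # family absent from the background: ratio undefined
        p_background = background[family] / len(all_families)
        p_predicted = count / len(predicted_families)
        scores[family] = math.log(p_predicted / p_background)
    return scores


# Toy family labels, one per compound (hypothetical values).
all_families = [1, 1, 1, 1, 2, 2, 3, 3, 3, 3]
with_predictions = [1, 2, 2, 3]

scores = family_log_odds(all_families, with_predictions)
# Keep only families whose absolute log odds exceed 0.5, as in the caption.
selected = sorted(f for f, s in scores.items() if abs(s) > 0.5)
print(scores, selected)
```

Families with no compound in the predicted set are skipped in this sketch because their log ratio is undefined; a pseudocount would be one way to include them.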
Figure 3. Predicted KEGG pathways targeted by the GSK compounds.
A) Venn diagram of the pathways shared among the three approaches. B) Most under- and over-represented pathways in our predictions. Panels A and B use the same representation as in Figure 2.
List of seven common hit pathways identified by the three independent approaches.
| Pathway | Approach | Targets | Compound families |
|---|---|---|---|
| | STR | | |
| | | Rv3048c | |
| | | Rv3314c | |
| | CHEM | Rv2139 | |
| | | Rv2764c | |
| | | Rv3247c | |
| | HIST | Rv2139 | |
| | STR | Rv0489 | |
| | CHEM | Rv1905c | |
| | | Rv3170 | |
| | HIST | Rv3170 | |
| | STR | | |
| | CHEM | Rv0458 | |
| | | Rv1905c | |
| | | Rv3170 | |
| | HIST | Rv1263 | |
| | | Rv3170 | |
| | STR | Rv0187 | |
| | | Rv0520 | |
| | | Rv1498c | |
| | | Rv1603 | |
| | | Rv1605 | |
| | CHEM | Rv0458 | |
| | | Rv3170 | |
| | HIST | Rv3170 | |
| | STR | Rv0187 | |
| | | Rv0520 | |
| | | Rv1498c | |
| | | Rv1703c | |
| | CHEM | Rv3170 | |
| | HIST | Rv3170 | |
| | STR | Rv1908c | |
| | | Rv3469c | |
| | CHEM | Rv3170 | |
| | HIST | Rv1263 | |
| | | Rv3170 | |
| | STR | Rv0859 | |
| | | Rv1908c | |
| | CHEM | Rv0458 | |
| | | Rv3170 | |
| | HIST | Rv1263 | |
| | | Rv3170 | |
Four additional common pathways are not shown because they correspond to general pathway descriptions (i.e., mtu01100 “Metabolic pathways”, mtu01110 “Biosynthesis of secondary metabolites”, mtu01120 “Microbial metabolism in diverse environments”, and mtu00000 “No Pathway”). Target genes in italics are essential either in vivo or in vitro according to the TraSH Essentiality database [21].
Significant links between GSK compound families and KEGG pathways.
| GSK Family | Compound | Target | Pathways |
|---|---|---|---|
| 1 | GSK975784A | | |
| | | | No Pathway |
| | GSK975810A | | |
| | | | No Pathway |
| | GSK975839A | | |
| | | | No Pathway |
| | | Rv2299c | No Pathway |
| | GSK975840A | | |
| | | | No Pathway |
| | GSK975842A | | |
| | | | No Pathway |
| | | Rv2045c | No Pathway |
| | | Rv2139 | Pyrimidine metabolism (mtu00240) |
| | | Rv2299c | No Pathway |
| | | Rv2483c | No Pathway |
| 3 | GSK547481A | Rv0194 | |
| | GSK547490A | Rv0194 | |
| | GSK547491A | Rv0194 | |
| | GSK547499A | Rv0194 | |
| | GSK547500A | Rv0194 | |
| | GSK547511A | Rv0194 | |
| | GSK547512A | Rv0194 | |
| | GSK547527A | | |
| | | Rv3598c | |
| | | Rv0194 | |
| | GSK547528A | | |
| | | Rv3598c | |
| | | Rv0194 | |
| | GSK547543A | Rv0194 | |
| 7 | GSK1829727A | Rv0053 | |
| | | Rv0379 | No Pathway |
| | | Rv0650 | Glycolysis/Gluconeogenesis (mtu00010) |
| | GSK1829729A | | No Pathway |
| | | Rv0053 | |
| | | Rv0379 | No Pathway |
| | | Rv0650 | Glycolysis/Gluconeogenesis (mtu00010) |
| | GSK1829816A | Rv0053 | |
| | | Rv0379 | No Pathway |
| | | Rv0650 | Glycolysis/Gluconeogenesis (mtu00010) |
| | GSK479031A | Rv0053 | |
| | | Rv0379 | No Pathway (mtu00000) |
| | | Rv0650 | Glycolysis/Gluconeogenesis (mtu00010) |
| | GSK957094A | Rv3170 | Gly, Ser and Thr metabolism (mtu00260); Arginine and proline metabolism (mtu00330); Histidine metabolism (mtu00340); Tyrosine metabolism (mtu00350); Phenylalanine metabolism (mtu00360); Tryptophan metabolism (mtu00380) |
| | | Rv0053 | |
| | | Rv0379 | No Pathway |
| | | Rv0650 | Glycolysis/Gluconeogenesis (mtu00010) |
| 9 | GSK1188379A | Rv0194 | |
| | GSK1188380A | Rv0194 | |
| 16 | GSK1825940A | Rv0194 | |
| | GSK1825944A | Rv0194 | |
| 35 | BRL-10143SA | | Aminoacyl-tRNA biosynthesis (mtu00970) |
| | | Rv2763c | Folate biosynthesis (mtu00790) |
| | | Rv2764c | Pyrimidine metabolism (mtu00240) |
| | BRL-51093AM | Rv2763c | |
| | | Rv2764c | Folate biosynthesis (mtu00790); Pyrimidine metabolism (mtu00240) |
| 173 | GSK1402290A | | |
| | | | No Pathway |
| | | | No Pathway |
| 334 | GSK270671A | | |
| | | Rv3273 | |
| | | Rv1707 | No Pathway |
Target genes in italics are essential either in vivo or in vitro according to the TraSH Essentiality database [21]. Pathways highlighted in bold are responsible for the significant link to the GSK family.
Figure 4. PknB kinase docking to GSK1598164A.
A) Multiple sequence alignment of Mycobacterium PknB kinase with selected human kinases. Human kinases were selected for having available PDB structures and the top PSI-BLAST scores against M. bovis transmembrane serine/threonine-protein kinase B (pknB). The first sequence in the alignment (gene name; PDB identifier) is M. tuberculosis transmembrane serine/threonine-protein kinase B (PknB; 3F69), which is 99% identical to M. bovis PknB and was used in the compound docking models. The other sequences are CAMK2D (2EWL), MARK3 (2QNJ), MARK2 (3IEC), AKT2 (1GZK) and SGK1 (2R5T). Residues known to interact with ADP in pknB are highlighted in red. The amino acids aligned with Glu93, which may be essential for the binding of GSK1132084A, are highlighted in green. B) Binding models of GSK1598164A and ADP within the pknB binding site (left and right panels, respectively).
Figure 5. Targeting the aminoacyl-tRNA biosynthesis pathway.
A) CHEM results show that GSK1402290A shares several substructural features with compounds reported as potent lysyl-tRNA synthetase inhibitors in the ChEMBL database (e.g., CHEMBL474582 and CHEMBL508242). B) STR results predicted serS as a target of GSK1402290A, with a binding site that includes residues F205, H209, G225, T226, E228, R257, F276, K278, and E280, which are conserved in the PFAM family PF00587 (tRNA synthetase class II core domain). The zoomed image shows the pose for GSK1402290A predicted by AutoDock and the binding site residues (i.e., those within 6 Å of the compound) coloured from low sequence conservation (blue) to high sequence conservation (red).
Figure 6. Predictive accuracy of the CHEM and STR methods.
A) Predictive power of the MCNBC model using individual targets (left) or target classification information (right). B) Accuracy of the RFS in differentiating similar from non-similar pairs of ligands. The ROC curve has an area under the curve of 0.97 and indicates an optimal RFS score threshold of 0.58, at which the false positive rate is only 1.6%.
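The threshold selection described for panel B can be sketched in a few lines. The caption does not say how the optimal threshold was chosen, so this example assumes Youden's J statistic (true positive rate minus false positive rate) as the criterion; the function names, scores and labels are all hypothetical toy values rather than data from the paper.

```python
def roc_points(scores, labels):
    """Return (cutoff, FPR, TPR) for every distinct score used as a cutoff,
    calling a ligand pair 'similar' when its score is >= the cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((cutoff, fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve, anchored at (0, 0) and (1, 1)."""
    curve = [(0.0, 0.0)] + sorted((fpr, tpr) for _, fpr, tpr in points) + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


def best_cutoff(points):
    """Cutoff maximising Youden's J = TPR - FPR (an assumed criterion)."""
    return max(points, key=lambda p: p[2] - p[1])


# Toy similarity scores and labels (1 = similar pair, 0 = non-similar pair).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
cutoff, fpr, tpr = best_cutoff(points)
print(auc, cutoff, fpr, tpr)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a fixed maximum false positive rate, could select a different cutoff from the same curve.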