Xiuming Li, Xin Yan, Yuedong Yang, Qiong Gu, Huihao Zhou, Yunfei Du, Yutong Lu, Jielou Liao, Jun Xu.
Abstract
That similar structures have similar activities is a guiding principle for identifying new functional molecules. However, it is not rare for a minor structural change to cause a significant change in activity. Methods for measuring molecular similarity fall into two categories: overall three-dimensional shape-based methods and local substructure-based methods. The former relate overall similarity to activity and are represented by conventional similarity algorithms. The latter relate local substructure to activity and are represented by conventional substructure-matching algorithms. In practice, the similarity of two molecules with similar activity depends on contributions from both overall similarity and local substructure match. We report a new tool, local-weighted structural alignment (LSA), for pharmaceutical virtual screening; it computes the similarity of two molecular structures by considering the contributions of both overall similarity and local substructure match. LSA consists of three steps: (1) mapping a common substructure between two molecular topological structures; (2) superimposing the two three-dimensional molecular structures with a focus on that substructure; (3) computing the similarity score from the superposition. LSA has been validated on 102 test compound libraries from the DUD-E collection, with an average AUC (the area under the receiver operating characteristic curve) of 0.82 and an average EF1% (the enrichment factor at the top 1%) of 27.0, consistently outperforming conventional approaches. LSA is implemented in C++ and runs on Linux and Windows systems. This journal is © The Royal Society of Chemistry.
Year: 2019 PMID: 35518105 PMCID: PMC9060470 DOI: 10.1039/c8ra08915a
Source DB: PubMed Journal: RSC Adv ISSN: 2046-2069 Impact factor: 3.361
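The third step in the abstract (scoring two already superimposed structures while giving extra emphasis to the core substructure) can be illustrated with a toy calculation. The sketch below is not the LSA scoring function, which this record does not specify. It assumes a Gaussian-overlap Tanimoto score, a common form of shape similarity, and simply gives atoms in the mapped core a larger weight. The function names, the `alpha` width, and the `core_weight` value are all hypothetical.

```python
import math


def weighted_overlap(coords_a, weights_a, coords_b, weights_b, alpha=0.8):
    """Sum of pairwise Gaussian overlaps, each scaled by both atoms' weights.

    alpha is an assumed Gaussian width parameter, chosen only for illustration.
    """
    total = 0.0
    for p, wp in zip(coords_a, weights_a):
        for q, wq in zip(coords_b, weights_b):
            dist_sq = sum((x - y) ** 2 for x, y in zip(p, q))
            total += wp * wq * math.exp(-alpha * dist_sq)
    return total


def local_weighted_score(coords_a, core_a, coords_b, core_b, core_weight=3.0):
    """Tanimoto-style score for two structures that are already superimposed.

    core_a and core_b are sets of atom indices belonging to the mapped core
    substructure; those atoms get weight core_weight (an assumed value),
    all other atoms get weight 1.0.
    """
    wa = [core_weight if i in core_a else 1.0 for i in range(len(coords_a))]
    wb = [core_weight if i in core_b else 1.0 for i in range(len(coords_b))]
    v_ab = weighted_overlap(coords_a, wa, coords_b, wb)
    v_aa = weighted_overlap(coords_a, wa, coords_a, wa)
    v_bb = weighted_overlap(coords_b, wb, coords_b, wb)
    return v_ab / (v_aa + v_bb - v_ab)


# Toy three-atom "molecules"; atom 0 plays the role of the core substructure.
query = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
core = {0}
core_shifted = [(0.0, 2.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
tail_shifted = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 2.0, 0.0)]

print(local_weighted_score(query, core, core_shifted, core))
print(local_weighted_score(query, core, tail_shifted, core))
```

In this toy setup, displacing the core atom lowers the score more than displacing a peripheral atom by the same distance, which is the qualitative behaviour a local-weighted score is meant to capture.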
Fig. 1 A core substructure of an HDAC inhibitor. The core substructure (circled in red) belongs to an HDAC inhibitor (CHEMBL275089). It is the chelator “warhead” that binds the Zn²⁺ ion in the HDAC binding site. The rest of the molecular structure provides selective molecular recognition.
Fig. 2 The specification of substructures. The core substructure is specified from a template molecule and serves as the core query substructure. One or more alternative core query substructures are also specified.
Virtual screening performance of WEGA, Rigid-LS-align, Flexi-LS-align, SPOT-ligand2 and LSA on DUD-E, compared by AUC and enrichment factor (EF) at the top 1%, 5% and 10%
| Method | AUC | EF1% | EF5% | EF10% |
|---|---|---|---|---|
| WEGA | 0.74 | 20.7 | 7.5 | 4.4 |
| Rigid-LS-align | — | 20.1 | 6.9 | 4.3 |
| Flexi-LS-align | 0.75 | 22.0 | 7.2 | 4.5 |
| SPOT-ligand2 | — | 24.1 | 8.6 | 5.2 |
| LSA | 0.82 | 27.0 | | |
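For readers unfamiliar with the two metrics in this table, the following minimal sketch shows one common way to compute them from a ranked screening list. It assumes higher scores mean more similar to the query and that each compound is labelled as active or decoy. The function names and the toy data are illustrative and are not taken from the paper.

```python
def enrichment_factor(scores, is_active, fraction):
    """Enrichment factor at the top `fraction` of the ranked library.

    EF = (actives in top / size of top) / (actives overall / library size).
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_total = len(ranked)
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    hits_total = sum(1 for _, active in ranked if active)
    return (hits_top * n_total) / (n_top * hits_total)


def roc_auc(scores, is_active):
    """ROC AUC as the probability that an active outscores a decoy.

    Ties count as half a win. This pairwise form is quadratic in library
    size, which is fine for a small illustration.
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    wins = 0.0
    for score_active in actives:
        for score_decoy in decoys:
            if score_active > score_decoy:
                wins += 1.0
            elif score_active == score_decoy:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Toy library of 10 compounds with 2 actives (made-up scores).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
perfect = [True, True, False, False, False, False, False, False, False, False]
mixed = [True, False, False, False, False, False, False, False, False, True]

print(enrichment_factor(toy_scores, perfect, 0.2), roc_auc(toy_scores, perfect))
print(enrichment_factor(toy_scores, mixed, 0.1), roc_auc(toy_scores, mixed))
```

Under this definition the EF at the top 1% cannot exceed 100, and a random ranking gives an EF of about 1 and an AUC of about 0.5.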
Fig. 3 ROC curves for virtual screening of the 12 targeted libraries with the greatest performance improvement, using LSA and WEGA. Red curves are for LSA and black curves are for WEGA.
EF values of WEGA, Rigid-LS-align and LSA on four protein categories of DUD-E
| Categories (#proteins) | Method | EF1% | EF5% | EF10% |
|---|---|---|---|---|
| Kinases (26) | WEGA | 17.7 | 6.4 | 3.8 |
| | Rigid-LS-align | 19 | 6.5 | 4.2 |
| | LSA | | | |
| Proteases (15) | WEGA | 14.4 | 6.2 | 4.0 |
| | Rigid-LS-align | 15.4 | 6.3 | 4.3 |
| | LSA | | | |
| Nuclear receptors (11) | WEGA | | | 5.4 |
| | Rigid-LS-align | 22.2 | 7.2 | 4.6 |
| | LSA | 22.3 | 8.9 | |
| GPCRs (5) | WEGA | 9.6 | 3.8 | 2.7 |
| | Rigid-LS-align | 16.6 | 5.5 | 3.6 |
| | LSA | | | |
Fig. 4 The superimposed structures. The superimposed core substructures are shown in the magnified view. The molecule in green is CHEMBL343068; the other molecule is CHEMBL275089, as in Fig. 1.