Mathias Dunkel, Stefan Günther, Jessica Ahmed, Burghardt Wittig, Robert Preissner.
Abstract
The drug classification scheme of the World Health Organization (WHO), the Anatomical Therapeutic Chemical (ATC)-code, connects chemical classification and therapeutic approach. It is generally accepted that compounds with similar physicochemical properties exhibit similar biological activity. If this hypothesis holds true for drugs, then the ATC-code, the putative medical indication area and potentially the medical target should be predictable on the basis of structural similarity. We have validated that the prediction of the drug class is reliable for WHO-classified drugs. The reliability of the predicted medical effects of a compound increases with the number of (physico-)chemical properties it shares with a drug of known function. The web server translates a user-defined molecule into a structural fingerprint that is compared with about 6300 drugs, which are enriched by 7300 links to molecular targets of the drugs, derived through text mining followed by manual curation. Links to the affected pathways are provided. The similarity to the medical compounds is expressed by the Tanimoto coefficient, which quantifies the structural similarity of two compounds. A similarity score higher than 0.85 results in a correct ATC prediction in 81% of all cases. Because the biological effect is well predictable when the structural similarity is sufficient, the web server allows prognoses about the medical indication area of novel compounds and helps to find new leads for known targets. AVAILABILITY: The system is freely accessible at http://bioinformatics.charite.de/superpred. SuperPred can be obtained via a Creative Commons Attribution Noncommercial-Share Alike 3.0 License.
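For reference, the Tanimoto coefficient of two bit-string fingerprints is commonly defined as shown here. The symbols $A$, $B$, $a$, $b$ and $c$ are introduced for illustration and do not appear in the abstract, which does not state the exact formula used by the server.

$$
T(A, B) = \frac{c}{a + b - c}
$$

Here $a$ and $b$ are the numbers of bits set in fingerprints $A$ and $B$, and $c$ is the number of bits set in both. $T$ ranges from 0 (no shared bits) to 1 (identical fingerprints).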
Year: 2008 PMID: 18499712 PMCID: PMC2447784 DOI: 10.1093/nar/gkn307
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Distribution of the fractions of correctly predicted indications
| Range of Tanimoto coefficient | Number of hits/misses | Fraction of hits (%) |
|---|---|---|
| 0.4–0.5 | 5/18 | 21.7 |
| 0.5–0.6 | 18/27 | 40.0 |
| 0.6–0.7 | 40/60 | 40.0 |
| 0.7–0.8 | 93/84 | 52.5 |
| 0.8–0.9 | 171/58 | 74.7 |
| 0.9–1.0 | 367/79 | 82.3 |
| 0.0–1.0 | 700/335 | 67.6 |
For the reduced data set of 1035 drugs, 700 correct and 335 incorrect predictions were obtained. In detail, a similarity score of 90–100% yields the correct ATC-class in about 82% of cases (367 correct and 79 incorrect predictions). A hit/miss ratio of about 3/1 is achieved for similarity scores of 70% and higher.
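The percentages in the table follow directly from the hit and miss counts. The short sketch below recomputes them as hits / (hits + misses) and pools the three highest ranges to check the roughly 3/1 ratio. The names `BINS` and `hit_fraction` are illustrative and not part of SuperPred.

```python
# Hit/miss counts per Tanimoto range, copied from the table above.
BINS = {
    "0.4-0.5": (5, 18),
    "0.5-0.6": (18, 27),
    "0.6-0.7": (40, 60),
    "0.7-0.8": (93, 84),
    "0.8-0.9": (171, 58),
    "0.9-1.0": (367, 79),
    "0.0-1.0": (700, 335),  # whole reduced data set
}


def hit_fraction(hits: int, misses: int) -> float:
    """Percentage of correct predictions, rounded to one decimal place."""
    return round(100.0 * hits / (hits + misses), 1)


# Fraction of hits for every range in the table.
fractions = {rng: hit_fraction(h, m) for rng, (h, m) in BINS.items()}

# Pooled hit/miss ratio for similarity scores of 0.7 and higher.
high = [BINS[key] for key in ("0.7-0.8", "0.8-0.9", "0.9-1.0")]
high_hits = sum(h for h, _ in high)
high_misses = sum(m for _, m in high)
high_ratio = high_hits / high_misses

for rng, pct in fractions.items():
    print(f"{rng}: {pct}%")
print(f"hit/miss ratio for >= 0.7: {high_ratio:.2f}")
```

Pooling the ranges from 0.7 upward gives 631 hits and 221 misses, which is the basis for the approximate 3/1 ratio quoted above.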
Figure 1. Cumulative recall for ATC-recognition relative to rank of retrieval.
Compounds identified with SuperPred and similar to Enalapril and NSC 600221, respectively
| Name of the compound | Tanimoto coefficient (%) | Medical function | Target protein | Reference |
|---|---|---|---|---|
| Enalapril | 100.00 | ACE-inhibitor | Angiotensin-converting enzyme | ( |
| Sch 31846 | 94.57 | ACE-inhibitor (predicted) | Angiotensin-converting enzyme (predicted) | ( |
| Delapril hydrochloride | 83.84 | ACE-inhibitor (predicted) | Angiotensin-converting enzyme (predicted) | ( |
| Hoe 065 | 81.90 | ACE-inhibitor/increasing central cholinergic activity (predicted) | Angiotensin-converting enzyme (predicted) | ( |
| NSC 600221 | 100.00 | Antineoplastic agent | Tubulin (predicted) | |
| Paclitaxel | 91.62 | Antineoplastic agent | Tubulin beta-1chain | ( |
a: (2S,3aS,7aS)-1-((S)-N-((S)-1-Carboxy-3-phenylpropyl)alanyl) hexahydro-2 indolinecarboxylic acid, 1-ethyl ester, monohydrochloride.
b: Cyclopenta(c)pyrrole-1-carboxylic acid, 2-(2-((1-(ethoxycarbonyl)-3-phenylpropyl)amino)-1-oxopropyl)octahydro-, octyl ester, (1S-(1-alpha,2-(R*(R*)),3a-beta,6a-alpha))-, (Z)-2-butenedioate (1:1).
c: Beta-Phenylalanine, N-benzoyl-2-[[(2-carboxyethyl)carbonyl]oxy]-, 6,12b-diacetoxy-12-(benzoyloxy)-2a,3,3a,4,5,6,9,10,11,12,12a,12b-dodecahydro-4,11-dihydroxy-4a,8,13,13-tetramethyl-5-oxo-7,11-methano-1H-cyclodeca[3,4]benz[1,2-b]oxet-9-yl ester.
Figure 2. Assembly of the SuperPred server and possible requests for ATC-code prediction. Data: the SuperPred server now contains 2500 compounds of the SuperDrug database. Additionally, 3800 experimental drugs were classified and stored on the server. The drugs are annotated with 7300 links to targets. Methods: the structural properties of the compounds are stored in so-called structural fingerprints, in which each bit encodes an element of the compound structure. The similarity of two compounds is calculated using the Tanimoto coefficient. Moreover, physicochemical properties are stored for each compound. SuperPred can be used to find new targets for ligands and, vice versa, to find new ligands for medically relevant biological targets. The server can be used in two ways; the figure shows two example queries.
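The fingerprint comparison described in the caption can be illustrated with a minimal sketch. This is not the SuperPred implementation. Fingerprints are modelled as sets of set-bit indices, and the library entries, their fingerprints and the helper names (`tanimoto`, `rank_by_similarity`, `predict_atc`) are hypothetical. Using 0.85 as a hard cutoff is likewise an assumption, taken from the score above which the abstract reports 81% correct ATC predictions.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits that are set
Library = Dict[str, Tuple[Fingerprint, str]]  # name -> (fingerprint, ATC class)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query: Fingerprint, library: Library) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, fp)) for name, (fp, _) in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def predict_atc(query: Fingerprint, library: Library,
                threshold: float = 0.85) -> Optional[str]:
    """Assign the ATC class of the best hit if its score exceeds the threshold.

    The threshold is an assumed cutoff, not a documented server parameter.
    """
    ranked = rank_by_similarity(query, library)
    if not ranked:
        return None
    best_name, best_score = ranked[0]
    return library[best_name][1] if best_score > threshold else None


# Hypothetical toy library: fingerprints are invented for illustration only.
LIBRARY: Library = {
    "drug_a": (frozenset({1, 2, 3, 5, 8, 13, 21}), "C09AA"),
    "drug_b": (frozenset({2, 4, 6, 8, 10}), "L01CD"),
}

query = frozenset({1, 2, 3, 5, 8, 13})
print(rank_by_similarity(query, LIBRARY))
print(predict_atc(query, LIBRARY))
```

In this toy case the query shares 6 of 7 bits with `drug_a` (score 6/7, about 0.857), so that entry's class is returned; a query with no sufficiently similar neighbour yields `None` rather than a low-confidence guess.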