Adam S Brown, Chirag J Patel.
Abstract
Drug repositioning, the process of discovering, validating, and marketing previously approved drugs for new indications, is of growing interest to academia and industry due to reduced time and costs associated with repositioned drugs. Computational methods for repositioning are appealing because they putatively nominate the most promising candidate drugs for a given indication. Comparing the wide array of computational repositioning methods, however, is a challenge due to inconsistencies in method validation in the field. Furthermore, a common simplifying assumption, that all novel predictions are false, is intellectually unsatisfying and hinders reproducibility. We address this assumption by providing a gold standard database, repoDB, that consists of both true positives (approved drugs) and true negatives (failed drugs). We have made the full database and all code used to prepare it publicly available, and have developed a web application that allows users to browse subsets of the data (http://apps.chiragjpgroup.org/repoDB/).
Year: 2017 PMID: 28291243 PMCID: PMC5349249 DOI: 10.1038/sdata.2017.29
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Characteristics of databases* used in the construction of repoDB
| DrugCentral | Indication (UMLS) | 41,388 |
| AACT | Indication (UMLS) | 113,571 |
*Static versions available through figshare (DrugCentral [Full PostgreSQL Database], AACT [Pipe Delimited], Data Citation 1).
Figure 1: repoDB data sources and database characteristics.
(a) repoDB data were downloaded from two sources: (1) the AACT indexed version of ClinicalTrials.gov for failed indication information, and (2) DrugCentral for approved indication information. AACT drug-indication pairs were filtered to include only failed pairs, and exclude currently approved pairs. (b) The repoDB database contains 6,677 approved drug-indication pairs and 4,123 failed drug-indication pairs. Indications are broken into UMLS semantic types, which describe broad categories of disease. For the two categories with the most records, ‘Disease and Syndrome’ and ‘Neoplastic Process’, we provide lists of the individual terms and their respective numbers in repoDB (see Supplementary Tables 2 and 3).
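The filtering step in panel (a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the record layout (tuples of drug, indication, status), the function name `build_pairs`, and the sample identifiers are hypothetical, and the set of failed statuses is taken from the status labels listed in the summary table (Terminated, Withdrawn, Suspended).

```python
# Trial statuses treated as "failed" (labels taken from the summary table).
FAILED_STATUSES = {"Terminated", "Withdrawn", "Suspended"}


def build_pairs(approved_pairs, trial_records):
    """Split drug-indication pairs into approved and failed sets.

    approved_pairs: iterable of (drug, indication) tuples (DrugCentral-like).
    trial_records: iterable of (drug, indication, status) tuples (AACT-like).
    Both layouts are hypothetical, not the repoDB schema.
    """
    approved = set(approved_pairs)
    failed = {}
    for drug, indication, status in trial_records:
        pair = (drug, indication)
        # Keep only failed trials, and exclude pairs that are currently approved.
        if status in FAILED_STATUSES and pair not in approved:
            failed.setdefault(pair, set()).add(status)
    return approved, failed


approved, failed = build_pairs(
    [("drugA", "C001"), ("drugB", "C002")],
    [
        ("drugA", "C001", "Terminated"),  # currently approved pair: excluded
        ("drugA", "C003", "Withdrawn"),   # failed and not approved: kept
        ("drugC", "C002", "Completed"),   # not a failed status: excluded
    ],
)
print(failed)
```

Keeping the statuses per pair, rather than a bare set of pairs, preserves the reason each pair counts as failed.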
Summary of data available for download through repoDB*
| Approved | 1,519 | 1,229 |
| Terminated | 386 | 785 |
| Withdrawn | 199 | 279 |
| Suspended | 77 | 143 |
*Also available for download through figshare (repoDB [Final Database], Data Citation 1).
Figure 2: repoDB at-a-glance.
Indications were grouped by high-level UMLS ‘Semantic Types,’ which provide insight into broad categories of disease. (a) Overlap of approved drugs between semantic types is shown as the number of shared drugs. Black indicates the highest overlap and white indicates the lowest (diagonal entries were removed). (b) Drugs that are approved for one semantic type and failed in another are shown as above.
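The overlap counts in panel (a) amount to a set intersection per pair of semantic types. The sketch below shows one way to compute them. The function name `shared_drug_counts`, the input layout of (drug, semantic type) tuples, and the sample data are assumptions made for illustration, not the code used to produce the figure.

```python
from collections import defaultdict
from itertools import combinations


def shared_drug_counts(pairs):
    """Count drugs shared between each pair of distinct semantic types.

    pairs: iterable of (drug, semantic_type) tuples (hypothetical layout).
    Returns {(type_a, type_b): n_shared} with type_a < type_b, so the
    diagonal (a type compared with itself) is never produced.
    """
    drugs_by_type = defaultdict(set)
    for drug, sem_type in pairs:
        drugs_by_type[sem_type].add(drug)
    return {
        (a, b): len(drugs_by_type[a] & drugs_by_type[b])
        for a, b in combinations(sorted(drugs_by_type), 2)
    }


counts = shared_drug_counts([
    ("drugA", "Disease and Syndrome"),
    ("drugA", "Neoplastic Process"),
    ("drugB", "Disease and Syndrome"),
    ("drugB", "Neoplastic Process"),
    ("drugC", "Neoplastic Process"),
    ("drugC", "Finding"),
])
print(counts)
```

For panel (b), the same idea applies with two separate mappings, one built from approved pairs and one from failed pairs, intersecting the approved set of one type with the failed set of another.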