Jessica Ahmed, Thomas Meinel, Mathias Dunkel, Manuela S Murgueitio, Robert Adams, Corinna Blasse, Andreas Eckert, Saskia Preissner, Robert Preissner.
Abstract
During the development of methods for cancer diagnosis and treatment, a vast amount of information is generated. Novel cancer target proteins have been identified and many compounds that activate or inhibit cancer-relevant target genes have been developed. This knowledge is based on an immense number of experimentally validated compound-target interactions in the literature, and excerpts from literature text mining are spread over numerous data sources. Our own analysis shows that the overlap between important existing repositories such as Comparative Toxicogenomics Database (CTD), Therapeutic Target Database (TTD), Pharmacogenomics Knowledge Base (PharmGKB) and DrugBank as well as between our own literature mining for cancer-annotated entries is surprisingly small. In order to provide an easy overview of interaction data, it is essential to integrate this information into a single, comprehensive data repository. Here, we present CancerResource, a database that integrates cancer-relevant relationships of compounds and targets from (i) our own literature mining and (ii) external resources complemented with (iii) essential experimental and supporting information on genes and cellular effects. In order to facilitate an overview of existing and supporting information, a series of novel information connections have been established. CancerResource addresses the spectrum of research on compound-target interactions in natural sciences as well as in individualized medicine; CancerResource is available at: http://bioinformatics.charite.de/cancerresource/.
Year: 2010 PMID: 20952398 PMCID: PMC3013779 DOI: 10.1093/nar/gkq910
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Numbers of known interactions in external databases and from the CancerResource literature text mining
| Resource | References | Compound–target relationships: redundant | Compound–target relationships: unique | Degree of uniqueness (%) |
|---|---|---|---|---|
| External databases | | | | |
| CTD | ( | 3875 | 3748 | 96 |
| PharmGKB | ( | 1307 | 1158 | 88 |
| TTD | ( | 282 | 163 | 58 |
| DrugBank | ( | 4949 | 4763 | 96 |
| CancerResource | | | | |
| Literature mining | (this article) | 1122 | 992 | 88 |
| Full data integration | (this article) | 11 585 | 10 824 | 93 |
Data from CTD, PharmGKB and TTD are filtered according to cancer-related disease annotations; data for DrugBank are unfiltered. Relationships unique to each approach include the CancerResource literature mining result. The result of the full integration is shown in addition. The degree of uniqueness shows that the data sets are largely disjoint.
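The uniqueness figures in the table can be approximated with a simple set comparison. The following Python sketch assumes that each relationship is keyed by a (compound, target) pair and that the degree of uniqueness is the ratio of unique to redundant counts; both are assumptions, not the authors' documented procedure, and the ratio reproduces the published percentages only to within one point. The function name `uniqueness`, the source names and the toy pairs are hypothetical.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (compound, target); hypothetical key format


def uniqueness(sources: Dict[str, Set[Pair]]) -> Dict[str, Tuple[int, int, float]]:
    """Return (total, unique, percent unique) for every source."""
    result = {}
    for name, pairs in sources.items():
        # All pairs reported by any other source
        others = set().union(*(p for n, p in sources.items() if n != name))
        unique = pairs - others
        pct = 100.0 * len(unique) / len(pairs) if pairs else 0.0
        result[name] = (len(pairs), len(unique), pct)
    return result


# Toy input with invented identifiers, for illustration only
toy = {
    "source_A": {("cmpd1", "GENE1"), ("cmpd2", "GENE2"), ("cmpd3", "GENE2")},
    "source_B": {("cmpd1", "GENE1"), ("cmpd4", "GENE3")},
}
toy_result = uniqueness(toy)

# Counts copied from the table: (redundant, unique, reported degree in %)
published = {
    "CTD": (3875, 3748, 96),
    "PharmGKB": (1307, 1158, 88),
    "TTD": (282, 163, 58),
    "DrugBank": (4949, 4763, 96),
    "Literature mining": (1122, 992, 88),
    "Full data integration": (11585, 10824, 93),
}
# Assumed formula: unique / redundant, expressed in percent
recomputed = {k: 100.0 * u / r for k, (r, u, _) in published.items()}
```

Comparing `recomputed` with the reported degrees is a quick sanity check on the reconstructed table rows; small differences are expected because the rounding rule used in the original is not stated.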
Figure 2. Knowledge retrieval in CancerResource: (a) Access to detailed compound/drug and target gene information in CancerResource. A similar layout for both information layers, compounds (left) and target genes (right), comprises three information sections: (i) cancer relevance of a target gene or a compound with KEGG cancer pathways, involved somatic cancer types, information on expression across cancer cell lines; (ii) information on interactions with a toggle option between compound and target gene, source of information and link to the original literature source; (iii) specific compound or gene information. (b) Access to complementary information on growth activity across NCI-60 human cancer cell lines and structures of acting compounds. A search by compound structures (iv) reveals similar structures and associated growth activity profiles. The search by activity profiles (v) enables the user to compare structure formulas, activity profiles (pairwise mean graphs) and similarity measures for both growth activity and structures. Complementary queries can be performed by structures after downloading or by implemented links for a profile.
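The structure search in (iv) depends on some measure of structural similarity. The caption does not name the measure, so the sketch below uses the Tanimoto coefficient on binary fingerprints purely as a common stand-in. The fingerprints are represented as sets of "on" bit positions; the function names, compound names and bit sets are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) coefficient of two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def rank_similar(query: Set[int], library: Dict[str, Set[int]],
                 top: int = 3) -> List[Tuple[str, float]]:
    """Return the `top` library entries most similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top]


# Invented fingerprints, for illustration only
library = {
    "cmpd_X": {1, 2, 3, 4},
    "cmpd_Y": {2, 3, 4},
    "cmpd_Z": {7, 8, 9},
}
hits = rank_similar({2, 3, 4}, library, top=2)
```

In this toy library `cmpd_Y` shares every bit with the query and ranks first, followed by `cmpd_X`, which has one additional bit.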
Figure 3. Finding the most similar cell line from the NCI-60 set and, subsequently, compounds or drugs having the highest influence on that cell line (this workflow includes two user interactions).
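The two-step workflow in Figure 3 can be sketched as follows. This is a minimal illustration, assuming that the query is a numeric profile (for example, gene expression values), that similarity is measured by Pearson correlation, and that compound influence is stored as a potency score per cell line where higher means stronger growth inhibition. None of these choices is confirmed by the caption, and all names and values below are invented.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def most_similar_cell_line(query: List[float],
                           profiles: Dict[str, List[float]]) -> str:
    """Step 1: pick the cell line whose profile correlates best with the query."""
    return max(profiles, key=lambda name: pearson(query, profiles[name]))


def top_compounds(cell_line: str, potency: Dict[str, Dict[str, float]],
                  n: int = 2) -> List[str]:
    """Step 2: rank compounds by their potency score on the chosen cell line."""
    ranked = sorted(potency, key=lambda c: potency[c][cell_line], reverse=True)
    return ranked[:n]


# Invented profiles and scores, for illustration only
profiles = {
    "line_1": [1.0, 2.0, 3.0, 4.0],
    "line_2": [4.0, 3.0, 2.0, 1.0],
    "line_3": [1.0, 3.0, 2.0, 4.0],
}
potency = {
    "cmpd_A": {"line_1": 5.2, "line_2": 7.1, "line_3": 6.0},
    "cmpd_B": {"line_1": 7.8, "line_2": 5.0, "line_3": 6.1},
    "cmpd_C": {"line_1": 6.4, "line_2": 6.9, "line_3": 5.5},
}
best_line = most_similar_cell_line([1.1, 2.1, 2.9, 4.2], profiles)
best_compounds = top_compounds(best_line, potency, n=2)
```

The two function calls at the end correspond to the two user interactions mentioned in the caption: one to select the cell line, one to retrieve the compounds acting on it.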