Michael Römer, Linus Backert, Johannes Eichner, Andreas Zell.
Abstract
We present a new tool for hepatocarcinogenicity evaluation of drug candidates in rodents. ToxDBScan is a web tool offering quick and easy similarity screening of new drug candidates against two large-scale public databases, which contain expression profiles for substances with known carcinogenic profiles: TG-GATEs and DrugMatrix. ToxDBScan uses a set similarity score that computes the putative similarity based on similar expression of genes to identify chemicals with similar genotoxic and hepatocarcinogenic potential. We propose using a discretized representation of expression profiles, which uses only information on up- or down-regulation of genes as relevant features. Therefore, only the deregulated genes are required as input. ToxDBScan provides an extensive report on similar compounds, which includes additional information on compounds, differential genes and pathway enrichments. We evaluated ToxDBScan with expression data from 15 chemicals with known hepatocarcinogenic potential and observed a sensitivity of 88%. Based on the identified chemicals, we achieved perfect classification of the independent test set. ToxDBScan is publicly available from the ZBIT Bioinformatics Toolbox.
Year: 2014 PMID: 25338045 PMCID: PMC4227259 DOI: 10.3390/ijms151019037
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Gene fingerprint sizes for different intensity ratio thresholds. (a) Low threshold, 1.5-fold deregulation; (b) high threshold, two-fold deregulation.
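The discretized fingerprints described in the abstract can be illustrated with a minimal Python sketch. It assumes the input is a mapping from gene symbol to log2 fold change, and it uses a direction-aware Jaccard index as a stand-in for the similarity score, which this record does not define. The function names, the gene symbols and the fold-change values are all hypothetical.

```python
import math

def gene_fingerprint(log2_fc, fold_threshold=1.5):
    """Split genes into up- and down-regulated sets at a fold-change cutoff."""
    cutoff = math.log2(fold_threshold)
    up = {gene for gene, value in log2_fc.items() if value >= cutoff}
    down = {gene for gene, value in log2_fc.items() if value <= -cutoff}
    return up, down

def jaccard_similarity(fp_a, fp_b):
    """Stand-in score (assumption): Jaccard index that respects direction."""
    up_a, down_a = fp_a
    up_b, down_b = fp_b
    shared = len(up_a & up_b) + len(down_a & down_b)
    total = len(up_a | up_b) + len(down_a | down_b)
    return shared / total if total else 0.0

# Made-up log2 fold changes for illustration only.
profile = {"Cyp1a1": 2.3, "Acot1": 0.7, "Gstm3": -0.4, "Car3": -1.2}
low = gene_fingerprint(profile, 1.5)   # low threshold, 1.5-fold
high = gene_fingerprint(profile, 2.0)  # high threshold, two-fold
print(low, high, jaccard_similarity(low, high))
```

Raising the threshold from 1.5-fold to two-fold can only shrink a fingerprint, which is the effect on fingerprint size that Figure 1 summarizes.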
Chemicals used for evaluation. Male Wistar rats were treated with the chemicals each day for up to 14 days. For each chemical, the Chemical Abstracts Service (CAS) registry number, dosing time and dose are listed, as well as the short name that is used in the tables and figures. The last column lists the databases that contain the test compound (DM = DrugMatrix, TGG = TG-GATEs). Adapted from Römer et al. [12].
| Compound | Short Name | CAS Number | Dosing Time (day) | Dose (mg/kg/day) | Contained in |
|---|---|---|---|---|---|
| Genotoxic carcinogens (GCs) | | | | | |
| Direct Black 38 | CIDB | 1937-37-7 | 7 | 146 | - |
| Nitrosodimethylamine | DMN | 62-75-9 | 7 | 4 | DM |
| Non-genotoxic carcinogens (NGCs) | | | | | |
| Piperonyl butoxide | PBO | 51-03-6 | 3 | 1200 | - |
| Methyl carbamate | MCA | 598-55-0 | 14 | 400 | - |
| Dehydroepiandrosterone | DHEA | 53-43-0 | 14 | 600 | - |
| Methapyrilene | MP | 135-23-9 | 14 | 60 | TGG, DM |
| Thioacetamide | TAA | 62-55-5 | 7 | 19.2 | TGG, DM |
| Diethylstilbestrol | DES | 56-53-1 | 3 | 10 | DM |
| Wy-14643 | WY | 50892-23-4 | 3 | 60 | TGG, DM |
| Acetamide | AAA | 60-35-5 | 14 | 3000 | TGG |
| Ethionine | ET | 67-21-0 | 14 | 200 | TGG |
| Cyproterone acetate | CPR | 427-51-0 | 14 | 100 | DM |
| Phenobarbital | PB | 50-06-6 | 14 | 80 | TGG, DM |
| Non-hepatocarcinogens (NCs) | | | | | |
| Cefuroxime | CFX | 55268-75-2 | 14 | 250 | - |
| Nifedipine | NIF | 21829-25-4 | 14 | 3 | TGG |
Figure 2. Gene expression heat maps of similar compounds. For selected test chemicals, we extracted the most similar chemicals included in either TG-GATEs or DrugMatrix. Each column corresponds to a chemical that was identified as similar. The chemicals are sorted from left to right by descending similarity score. The heat maps show the log2 fold change of 20 selected genes from the gene fingerprints of the test chemical. Genes above the black line are upregulated at least 1.5-fold in the test chemical, and genes below it are downregulated at least 1.5-fold. Genes were selected based on average expression in the identified chemicals. The color bar above the chemical name indicates the hepatocarcinogenicity annotation, and the legend is shown in (a). (a) Wy-14643 (NGC); (b) dehydroepiandrosterone (NGC); (c) piperonyl butoxide (NGC); (d) C.I. Direct Black 38 (GC).
Percentage of correctly identified conditions. The most similar conditions were extracted for each chemical in the evaluation set. The percentage of conditions with the same carcinogenicity class in the five, 10 and 20 most similar conditions was calculated.
| Chemical | 1.5-fold: Best 5 | 1.5-fold: Best 10 | 1.5-fold: Best 20 | 2-fold: Best 5 | 2-fold: Best 10 | 2-fold: Best 20 |
|---|---|---|---|---|---|---|
| Genotoxic carcinogens | | | | | | |
| CIDB | 100 | 100 | 95 | 100 | 90 | 90 |
| DMN | 80 | 70 | 50 | 100 | 80 | 75 |
| Non-genotoxic carcinogens | | | | | | |
| PBO | 100 | 80 | 65 | 80 | 70 | 75 |
| MCA | 60 | 60 | 45 | 80 | 60 | 50 |
| DHEA | 100 | 90 | 85 | 100 | 90 | 80 |
| MP | 80 | 70 | 70 | 80 | 50 | 40 |
| TAA | 100 | 80 | 65 | 100 | 70 | 60 |
| DES | 100 | 100 | 100 | 100 | 100 | 100 |
| WY | 100 | 90 | 90 | 100 | 100 | 95 |
| AAA | 60 | 50 | 35 | 80 | 60 | 50 |
| ET | 100 | 100 | 100 | 100 | 100 | 85 |
| CPR | 80 | 80 | 65 | 60 | 60 | 60 |
| PB | 80 | 60 | 45 | 20 | 20 | 25 |
| Non-hepatocarcinogens | | | | | | |
| CFX | 100 | 90 | 85 | 60 | 70 | 85 |
| NIF | 60 | 80 | 70 | 60 | 70 | 70 |
| Mean | 86 | 80 | 71 | 82 | 73 | 70 |
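The percentages in this table can be read as a top-k agreement measure. The sketch below is one plausible way to compute it in Python, assuming the database conditions are already sorted by descending similarity and each carries a class label (GC, NGC or NC). The function name and the example ranking are hypothetical and are not taken from the study data.

```python
def percent_same_class(ranked_classes, true_class, k):
    """Percentage of the k most similar conditions that share the test chemical's class."""
    top = ranked_classes[:k]
    if not top:
        return 0.0
    return 100.0 * sum(label == true_class for label in top) / len(top)

# Hypothetical ranking of database conditions, most similar first.
ranked = ["NGC", "NGC", "NC", "NGC", "NGC", "GC", "NGC", "NC", "NGC", "NGC"]
for k in (5, 10):
    print(k, percent_same_class(ranked, "NGC", k))
```

For this made-up ranking, four of the five most similar conditions and seven of the ten most similar conditions are NGCs, so the values are 80 and 70.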
Percentage of correctly identified conditions. The most similar conditions were extracted for each chemical in the evaluation set. The percentage of conditions with the same carcinogenicity class and a relative similarity above 0.8 and 0.7 was calculated.
| Chemical | 1.5-fold: > 0.8 | 1.5-fold: > 0.7 | 2-fold: > 0.8 | 2-fold: > 0.7 |
|---|---|---|---|---|
| Genotoxic carcinogens | | | | |
| CIDB | 100 | 100 | 100 | 100 |
| DMN | 70 | 48 | 78 | 71 |
| Non-genotoxic carcinogens | | | | |
| PBO | 77 | 66 | 73 | 76 |
| MCA | 100 | 100 | 100 | 100 |
| DHEA | 88 | 89 | 100 | 80 |
| MP | 100 | 100 | 100 | 100 |
| TAA | 100 | 100 | 100 | 67 |
| DES | 100 | 100 | 100 | 100 |
| WY | 100 | 91 | 100 | 94 |
| AAA | 60 | 42 | 67 | 55 |
| ET | 100 | 100 | 100 | 100 |
| CPR | 71 | 57 | 50 | 60 |
| PB | 100 | 60 | 0 | 20 |
| Non-hepatocarcinogens | | | | |
| CFX | 100 | 88 | 33 | 50 |
| NIF | 50 | 67 | 100 | 100 |
| Mean | 88 | 80 | 80 | 78 |
Classification results. Similar conditions in TG-GATEs and DrugMatrix were identified by computing the similarity score S and selecting conditions with a relative similarity S̃ > 0.8. Ratios of genotoxic carcinogens (RGC) and non-genotoxic carcinogens (RNGC) were computed based on the annotation of the similar conditions. A permutation test (n = 100,000) was performed to assess the significance of over-representation of GCs (pGC) and NGCs (pNGC). If a p-value was significant at α = 0.05, the corresponding class was predicted. If no significant enrichment was found for either of the two classes, the test chemical was predicted as non-hepatocarcinogen (NC). Significant p-values are printed in bold font.
| Chemical | RGC | pGC | RNGC | pNGC | Prediction |
|---|---|---|---|---|---|
| Genotoxic carcinogens | | | | | |
| CIDB | 1.00 | | 0.00 | 1.000 | GC |
| DMN | 0.70 | | 0.20 | 0.615 | GC |
| Non-genotoxic carcinogens | | | | | |
| MP | 0.00 | 1.000 | 1.00 | | NGC |
| TAA | 0.00 | 1.000 | 1.00 | | NGC |
| DES | 0.00 | 1.000 | 1.00 | | NGC |
| WY | 0.00 | 1.000 | 1.00 | | NGC |
| PBO | 0.00 | 1.000 | 0.77 | | NGC |
| MCA | 0.00 | 1.000 | 1.00 | | NGC |
| AAA | 0.00 | 1.000 | 0.60 | | NGC |
| DHEA | 0.00 | 1.000 | 0.88 | | NGC |
| ET | 0.00 | 1.000 | 1.00 | | NGC |
| CPR | 0.00 | 1.000 | 0.71 | | NGC |
| PB | 0.00 | 1.000 | 1.00 | | NGC |
| Non-hepatocarcinogens | | | | | |
| CFX | 0.00 | 1.000 | 0.00 | 1.000 | NC |
| NIF | 0.25 | 0.132 | 0.25 | 0.399 | NC |
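The decision rule in the caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the null distribution is built by drawing the same number of conditions at random from the whole database, the selection of similar conditions is taken as given, and the tie-break when both classes are enriched is an arbitrary choice. The function names and the label counts are hypothetical, and the default number of permutations is reduced from the 100,000 used in the study to keep the example fast.

```python
import random

def enrichment_p_value(hit_labels, all_labels, target, n_perm=10_000, seed=0):
    """One-sided permutation p-value for over-representation of `target` among the hits."""
    rng = random.Random(seed)
    k = len(hit_labels)
    observed = sum(label == target for label in hit_labels)
    extreme = 0
    for _ in range(n_perm):
        # Assumption: null model draws k conditions without replacement.
        sample = rng.sample(all_labels, k)
        if sum(label == target for label in sample) >= observed:
            extreme += 1
    return extreme / n_perm

def predict_class(hit_labels, all_labels, alpha=0.05, n_perm=10_000):
    """Predict GC or NGC if that class is significantly enriched, otherwise NC."""
    p_gc = enrichment_p_value(hit_labels, all_labels, "GC", n_perm)
    p_ngc = enrichment_p_value(hit_labels, all_labels, "NGC", n_perm)
    if min(p_gc, p_ngc) >= alpha:
        return "NC", p_gc, p_ngc
    # Assumption: if both classes were enriched, take the smaller p-value.
    return ("GC" if p_gc <= p_ngc else "NGC"), p_gc, p_ngc

# Hypothetical database annotation and hypothetical set of similar conditions.
database = ["GC"] * 20 + ["NGC"] * 60 + ["NC"] * 120
hits = ["NGC"] * 8 + ["NC"]
prediction, p_gc, p_ngc = predict_class(hits, database)
print(prediction, p_gc, p_ngc)
```

A class that does not occur among the similar conditions always receives a p-value of 1 under this scheme, which is consistent with the rows above where a ratio of 0.00 is paired with a p-value of 1.000.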
Figure 3. HTML report of the compound similarity scan for PBO. This figure shows the results of the similarity search against TG-GATEs and DrugMatrix. Additional information for each compound can be shown by clicking on the “plus” in the first column of the table. Additional information on the deregulated genes is available from the “Gene analysis” tab at the head of the report. The results of the pathway enrichment analysis against the KEGG database are available from the “Pathway analysis” tab. The “Heat maps” tab shows heat maps of the gene expression in the most similar compounds.