Luis Orlando Pérez, Rolando González-José, Pilar Peral García.
Abstract
Non-genotoxic carcinogens are substances that induce tumorigenesis by non-mutagenic mechanisms, and long-term rodent bioassays are required to identify them. Recent studies have shown that transcription profiling can be applied to develop early identifiers for long-term phenotypes. In this study, we used rat liver expression profiles from the NTP (National Toxicology Program, Research Triangle Park, USA) DrugMatrix Database to construct a gene classifier that can distinguish between non-genotoxic carcinogens and other chemicals. The model was based on short-term exposure assays (3 days), and the training was limited to oxidative stressors, peroxisome proliferators and hormone modulators. Validation of the predictor was performed on independent toxicogenomic data (TG-GATEs, Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System, Osaka, Japan). To build our model, we used Random Forests together with a recursive elimination algorithm (VarSelRF). Gene set enrichment analysis was employed for functional interpretation. A total of 770 microarrays comprising 96 different compounds were analyzed, and a predictor of 54 genes was built. Prediction accuracy was 0.85 in the training set and 0.87 in the test set; in the validation set it increased with dose: 0.6 at low doses, 0.7 at medium doses and 0.81 at high doses. Pathway analysis revealed a prominence of genes involved in cellular respiration, energy production and lipoprotein metabolism. The main goal of toxicogenomics is to accurately predict the toxicity of unknown drugs. In this analysis, we presented a classifier that can predict non-genotoxic carcinogenicity by using short-term exposure assays. In this approach, dose level is critical when evaluating chemicals at early time points.
Keywords: Non-genotoxic carcinogen; Random forest; Toxicogenomics
Year: 2016 PMID: 27818731 PMCID: PMC5080858 DOI: 10.5487/TR.2016.32.4.289
Source DB: PubMed Journal: Toxicol Res ISSN: 1976-8257
Groups of chemicals for classification analysis
| Drugs | Analysis set |
|---|---|
| DrugMatrix | |
| Training set | |
| Test set | |
| Unknown/ | |
| TG-GATEs | |
| Validation set | |
Class discrimination analysis by Random Forest in DrugMatrix data
| Chemicals | Dose level (mg/mL) | Random Forest classification (predicted class) |
|---|---|---|
| Training set (N = 18) | | |
| Carbon Tetrachloride (CCl4) | 1175 | 0 (misclassified) |
| Methapyrilene (MT) | 100 | NGTXC |
| Cyproterone Acetate (CPA) | 2500 | NGTXC |
| Phenobarbital (PBT) | 54 | NGTXC |
| Fenofibrate (FF) | 215 | NGTXC |
| Clofibrate (CFB) | 130 | NGTXC |
| Bezafibrate (BF) | 617 | NGTXC |
| Diethylstilbestrol (DES) | 280 | NGTXC |
| Gemfibrozil (GFB) | 700 | NGTXC |
| 2-Acetylaminofluorene (2-AAF) | 30 | 0 |
| 3-Methylcholanthrene (MCA) | 300 | 0 |
| Doxorubicin (DOXO) | 3 | 0 |
| Hydrazine (N2H4) | 45 | 0 |
| 1-Naphthyl Isothiocyanate (ANIT) | 60 | 0 |
| Methyl salicylate (MS) | 444 | 0 |
| Albendazole (ALB) | 62 | 0 |
| Amiodarone (AMI) | 147 | 0 |
| Ibuprofen (IBF) | 263 | 0 |
| Test set (n = 33) | | |
| Clofibric Acid (CA) | 448 | NGTX |
| Carbamazepine (CBZ) | 490 | NGTX |
| Ethinylestradiol (EE) | 1480 | NGTX |
| 17-Methyltestosterone (MET) | 2000 | NGTX |
| Aflatoxin B1 (AFB1) | 0.3 | 0 |
| Carmustine (BCNU) | 16 | 0 |
| Chlorambucil (CBC) | 4.5 | 0 |
| Cytarabine (ara-C) | 487 | 0 |
| Lomustine (LS) | 8.75 | 0 |
| Mitomycin C (MMC) | 1.7 | 0 |
| N-nitrosodiethylamine (NDEA) | 34 | 0 |
| Raloxifene (RLX) | 650 | 0 |
| Safrole (SF) | 488 | 0 |
| Streptozotocin (STZ) | 138 | 0 |
| Tamoxifen (TMX) | 64 | 0 |
| | | |
| 6-Mercaptopurine (MP) | 25 | 0 |
| Allyl alcohol (AA) | 32 | 0 |
| Altretamine (HMM) | 40 | 0 |
| Clomipramine (CMP) | 115 | 0 |
| Clotrimazole (CLOT) | 89 | NGTXC (misclassified) |
| Dexamethasone (DXM) | 150 | NGTXC (misclassified) |
| Diclofenac (DFNa) | 10 | 0 |
| Erythromycin (ERM) | 1500 | 0 |
| Fluphenazine (FP) | 2.5 | 0 |
| Meloxicam (MLX) | 33 | 0 |
| Methimazole (MTZ) | 100 | 0 |
| Naproxen (NAP) | 10 | 0 |
| Nimesulide (NIM) | 162 | 0 |
| Pioglitazone (PGZ) | 1500 | 0 |
| Promethazine (PMZ) | 113 | 0 |
| Stavudine (D4T) | 1400 | NGTXC (misclassified) |
| Troglitazone (TGZ) | 1200 | 0 |
| Valproic acid (VPA) | 1340 | 0 |
| | | |
| Aminoglutethimide (AG) | 350 | NGTXC |
| Balsalazide (BZ) | 1100 | 0 |
| Carbimazole (CBZ) | 400 | 0 |
| Catechol (CC) | 195 | 0 |
| Chloroxylenol (CXL) | 1915 | 0 |
| Cinnarizine (CIN) | 750 | 0 |
| Clomiphene (CLM) | 250 | 0 |
| Closantel (CLO) | 22 | 0 |
| Crotamiton (CROT) | 750 | NGTXC |
| Danazol (DZ) | 2000 | 0 |
| Finasteride (FIN) | 800 | NGTXC |
| Indomethacin (IM) | 12 | 0 |
| Isoeugenol (IEUG) | 1560 | 0 |
| Ketoconazole (KET) | 227 | 0 |
| Ketorolac (KET) | 48 | 0 |
| Leflunomide (LEF) | 60 | 0 |
| MLN518 | 212 | 0 |
| Modafinil (MO) | 325 | NGTXC |
| Nystatin (NYS) | 134 | 0 |
| Progesterone (PR) | 164 | 0 |
| Propylthiouracil (PTU) | 625 | Und |
| Rosiglitazone (RGZ) | 1800 | NGTXC |
| Salicylamide (SA) | 1300 | 0 |
| Simvastatin (SIM) | 1200 | NGTXC |
| Sulindac (SULIN) | 132 | 0 |
| Zileuton (ZL) | 450 | 0 |
Abbreviations: NGTXC, Non-genotoxic Carcinogen; GTX, Genotoxic compound; NH, Non hepatocarcinogen; 0, Negative for NGTXC.
Results based on the out-of-bag (OOB) classification.
Und: Undetermined.
Misclassified.
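The table footnote refers to out-of-bag (OOB) classification. As a minimal sketch of the idea, assuming the usual Random Forest setup in which each tree is grown on a bootstrap sample: the samples that were not drawn for a given tree are "out of bag" for that tree, and only trees for which a sample is out of bag vote on its OOB prediction. The function name `bootstrap_with_oob` and the fixed seed are illustrative choices, not taken from the source, and the value 18 simply mirrors the number of training compounds in the table.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw one bootstrap sample; return (in_bag, out_of_bag) index lists."""
    # Sample n_samples indices with replacement (the "bag" for one tree).
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Indices never drawn are out of bag for this tree.
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # hypothetical seed, only to make the sketch repeatable
in_bag, out_of_bag = bootstrap_with_oob(18, rng)  # 18 = illustrative sample count
```

Because every sample is left out of a fraction of the trees, OOB predictions give an error estimate on the training data without a separate hold-out split.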
Random Forest classification of TG-GATEs data according to dose level
| Samples | Low dose: conc. (mg/mL) | Low dose: predicted class | Medium dose: conc. (mg/mL) | Medium dose: predicted class | High dose: conc. (mg/mL) | High dose: predicted class |
|---|---|---|---|---|---|---|
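| Acetamide | 300 | 0 (misclassified) | 1000 | 0 (misclassified) | 200 | 0 (misclassified) |
| Carbamazepine | 30 | 0 (misclassified) | 100 | 0 (misclassified) | 300 | NGTX |
| Carbon tetrachloride | 30 | 0 (misclassified) | 100 | 0 (misclassified) | 300 | 0 (misclassified) |
| Clofibrate | 30 | 0 (misclassified) | 100 | 0 (misclassified) | 300 | NGTX |
| Ethinylestradiol | 1 | 0 (misclassified) | 3 | NGTX | 10 | NGTX |
| Fenofibrate | 10 | 0 (misclassified) | 100 | NGTX | 100 | NGTX |
| Gemfibrozil | 30 | NGTX | 100 | NGTX | 300 | NGTX |
| Methapyrilene | 10 | 0 (misclassified) | 30 | 0 (misclassified) | 100 | 0 (misclassified) |
| Methyltestosterone | 30 | 0 (misclassified) | 100 | 0 (misclassified) | 300 | 0 (misclassified) |
| Monocrotaline | 3 | 0 (misclassified) | 10 | 0 (misclassified) | 30 | 0 (misclassified) |
| Phenobarbital | 10 | 0 (misclassified) | 30 | 0 (misclassified) | 100 | NGTX |
| Thioacetamide | 4.5 | 0 (misclassified) | 15 | NGTX | 45 | NGTX |
| WY-14643 | 10 | NGTX | 30 | NGTX | 100 | NGTX |
| | | | | | | |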
| Acetamidofluorene | 30 | 0 | 100 | 0 | 300 | 0 |
| Carboplatin | 1 | 0 | 3 | 0 | 10 | 0 |
| Colchicine | 0.5 | 0 | 1.5 | 0 | 5 | 0 |
| Doxorubicin | 0.1 | 0 | 0.3 | 0 | 1 | 0 |
| Lomustine | 0.6 | 0 | 2 | 0 | 6 | 0 |
| Naphthyl isothiocyanate | 1.5 | 0 | 5 | 0 | 15 | 0 |
| Tamoxifen | 6 | 0 | 20 | 0 | 60 | 0 |
| | | | | | | |
| Acetaminophen | 300 | 0 | 600 | 0 | 1000 | 0 |
| Allyl alcohol | 3 | 0 | 10 | 0 | 30 | 0 |
| Amiodarone | 20 | 0 | 60 | 0 | 200 | 0 |
| Aspirin | 45 | 0 | 150 | 0 | 450 | 0 |
| Caffeine | 10 | 0 | 30 | 0 | 100 | 0 |
| Diclofenac | 1 | 0 | 3 | 0 | 10 | 0 |
| Erythromycin ethylsuccinate | 100 | 0 | 300 | 0 | 1000 | 0 |
| Ethanol | 400 | 0 | 1200 | 0 | 4000 | 0 |
| Gentamicin | 10 | 0 | 30 | 0 | 100 | 0 |
| Nimesulide | 10 | 0 | 30 | 0 | 100 | 0 |
| Promethazine | 20 | 0 | 60 | 0 | 200 | 0 |
| Tannic acid | 100 | 0 | 300 | 0 | 1000 | 0 |
| Tetracycline | 100 | 0 | 300 | 0 | 1000 | 0 |
Misclassified.
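The dose dependence in the table above can be checked with a short per-compound tally. This is only a sketch under stated assumptions: the first 13 rows are taken to be the true non-genotoxic carcinogens (their "0" calls are the ones flagged as misclassified), and the remaining 20 rows are taken to be true negatives, all of which are predicted "0" at every dose. The names `ngtxc_predictions`, `N_NEGATIVES` and `accuracy_by_dose` are illustrative.

```python
# Predicted class per compound at (low, medium, high) dose, transcribed from the table.
# Assumption: these 13 compounds are the true non-genotoxic carcinogens.
ngtxc_predictions = {
    "Acetamide": ("0", "0", "0"),
    "Carbamazepine": ("0", "0", "NGTX"),
    "Carbon tetrachloride": ("0", "0", "0"),
    "Clofibrate": ("0", "0", "NGTX"),
    "Ethinylestradiol": ("0", "NGTX", "NGTX"),
    "Fenofibrate": ("0", "NGTX", "NGTX"),
    "Gemfibrozil": ("NGTX", "NGTX", "NGTX"),
    "Methapyrilene": ("0", "0", "0"),
    "Methyltestosterone": ("0", "0", "0"),
    "Monocrotaline": ("0", "0", "0"),
    "Phenobarbital": ("0", "0", "NGTX"),
    "Thioacetamide": ("0", "NGTX", "NGTX"),
    "WY-14643": ("NGTX", "NGTX", "NGTX"),
}

# Assumption: the other 20 compounds are true negatives, all predicted "0".
N_NEGATIVES = 20


def accuracy_by_dose(preds, n_negatives_correct):
    """Return per-compound accuracy at the low, medium and high dose."""
    total = len(preds) + n_negatives_correct
    result = []
    for dose_idx in range(3):
        hits = sum(1 for p in preds.values() if p[dose_idx] == "NGTX")
        result.append((hits + n_negatives_correct) / total)
    return result


low, medium, high = accuracy_by_dose(ngtxc_predictions, N_NEGATIVES)
print(round(low, 2), round(medium, 2), round(high, 2))  # 0.67 0.76 0.85
```

These per-compound figures show the same upward trend as the accuracies reported in the abstract (0.6, 0.7 and 0.81) but do not match them exactly; the reported values were presumably computed on a different basis, for example per array rather than per compound, which the table alone cannot confirm.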
Fig. 1. Principal Component Analysis (PCA) of differentially expressed genes for DrugMatrix chemicals. Each compound was averaged among replicates. Shapes indicate their class: circles correspond to non-genotoxic carcinogens, squares to genotoxins and triangles to non-hepatocarcinogens.
High-scoring genes selected according to mean decrease in accuracy
Fig. 2. Principal Component Analysis (PCA) performed with the 54 selected genes on the DrugMatrix set. Each compound was averaged among replicates. Shapes indicate their class: circles correspond to non-genotoxic carcinogens, squares to genotoxins and triangles to non-hepatocarcinogens. Note that non-genotoxic carcinogens (circles) clustered at the right of PC1, with the exception of CCl4.
Fig. 3. Mean differential expression of the predictor genes in log2 scale. Black bars represent overexpressed genes; gray bars represent under-expressed genes.
Top GO terms from the enrichment analysis according to Fisher’s exact test
| GO ID | Term | Annotated | Significant | Expected | Fisher (classic) p-value |
|---|---|---|---|---|---|
| Biological process | | | | | |
| GO:0045333 | Cellular respiration | 28 | 5 | 0.42 | 4.8e-05 |
| GO:0015980 | Energy derivation by oxidation of organic compounds | 50 | 6 | 0.75 | 8.3e-05 |
| GO:0014823 | Response to activity | 23 | 4 | 0.35 | 0.00033 |
| GO:0007507 | Heart development | 124 | 8 | 1.87 | 0.00043 |
| GO:0006091 | Generation of precursor metabolites and energy | 68 | 6 | 1.03 | 0.00046 |
| GO:0042157 | Lipoprotein metabolic process | 29 | 4 | 0.44 | 0.00082 |
| GO:0051701 | Interaction with host | 30 | 4 | 0.45 | 0.00094 |
| GO:0006869 | Lipid transport | 79 | 6 | 1.19 | 0.00104 |
| Molecular function | | | | | |
| GO:0004872 | Receptor activity | 172 | 8 | 2.3 | 0.0016 |
| GO:0008528 | G-protein coupled peptide receptor activity | 20 | 3 | 0.27 | 0.0022 |
| GO:0001653 | Peptide receptor activity | 21 | 3 | 0.28 | 0.0025 |
| GO:0060089 | Molecular transducer activity | 215 | 8 | 2.88 | 0.0066 |
| GO:0001948 | Glycoprotein binding | 31 | 3 | 0.42 | 0.0077 |
| GO:0038024 | Cargo receptor activity | 11 | 2 | 0.15 | 0.0089 |
| GO:0005319 | Lipid transporter activity | 35 | 3 | 0.47 | 0.0109 |
| GO:0004930 | G-protein coupled receptor activity | 49 | 3 | 0.66 | 0.0269 |
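The "Expected" and Fisher columns above can be related to each other with a small sketch. Under the standard one-sided Fisher's exact test for enrichment, the number of selected genes annotated to a term follows a hypergeometric distribution; the expected count is the term size scaled by the fraction of the gene universe that was selected, and the p-value is the upper tail at the observed overlap. The source does not state the universe size or the number of selected genes with annotations, so the values 3600 and 54 used below are hypothetical, chosen only for illustration, and the function names are not from the source.

```python
from math import comb


def expected_hits(annotated, n_selected, universe):
    """Expected overlap if the selected genes were drawn at random."""
    return annotated * n_selected / universe


def fisher_enrichment_p(overlap, annotated, n_selected, universe):
    """One-sided p-value P(X >= overlap) under the hypergeometric distribution."""
    upper = min(annotated, n_selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, n_selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(universe, n_selected)


# Small worked case: 10 genes, 4 annotated to the term, 3 selected, 2 in the overlap.
# P(X >= 2) = (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120
p_small = fisher_enrichment_p(2, 4, 3, 10)

# Hypothetical universe (3600) and selection size (54), NOT taken from the source.
exp_example = expected_hits(28, 54, 3600)
```

A term is reported as enriched when its observed "Significant" count is well above the expected count, which is what the small p-values in the table reflect.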