Shan-Han Huang, Chun-Wei Tung.
Abstract
The assessment of non-genotoxic hepatocarcinogens (NGHCs) currently relies on two-year rodent bioassays. Toxicogenomics biomarkers offer a potential alternative for prioritizing NGHCs that could be useful for risk assessment. However, previous studies, which used inconsistently classified chemicals as the training set and a single microarray dataset, did not arrive at consensus biomarkers. In this study, four consensus biomarkers (A2m, Ca3, Cxcl1, and Cyp8b1) were identified from four large-scale microarray datasets, using the one-day single maximum tolerated dose and a large set of chemicals without inconsistent classifications. Machine learning techniques were subsequently applied to develop prediction models for NGHCs. The final bagging decision tree models achieved an average AUC of 0.803 on an independent test. A set of 16 chemicals with controversial classifications was reclassified according to the consensus biomarkers. The developed prediction models and identified consensus biomarkers are expected to serve as potential alternative methods for prioritizing NGHCs for further experimental validation.
Year: 2017 PMID: 28117354 PMCID: PMC5259716 DOI: 10.1038/srep41176
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of microarray datasets.
| Dataset | Platform | #Probe | #NGHC | #NHC | Description | Reference |
|---|---|---|---|---|---|---|
| DMA | Affymetrix whole genome 230 2.0 rat GeneChip® array | 31099 | 25 | 47 | National Toxicology Program of the National Institute of Environmental Health Sciences, U.S.A.; 3 doses: low, medium, and high; 4 time points: 0.25, 1, 3, and 5 days | Fielden |
| DMC | GE Codelink™ rat array | 10399 | 39 | 138 | | |
| GSE8858 | CodeLink UniSet Rat I Bioarray | 10399 | 35 | 121 | Maximum tolerated dose; 3 time points: 1, 3, and 5 days | Liu |
| TG-GATEs | Affymetrix Rat Genome 230 2.0 array | 31099 | 12 | 93 | The TGx Project in Japan; 3 doses: low, medium, and high; 8 time points: 3, 6, 9, and 24 hours and 3, 7, 14, and 28 days | Uehara |
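The NGHC and NHC counts in the table imply that the two classes are unevenly represented in every dataset. The following minimal sketch only tabulates the NGHC share per dataset from those counts; the dictionary layout and the helper name `nghc_fraction` are illustrative assumptions, not part of the original study.

```python
# Counts copied from the dataset summary table: (number of NGHCs, number of NHCs).
datasets = {
    "DMA": (25, 47),
    "DMC": (39, 138),
    "GSE8858": (35, 121),
    "TG-GATEs": (12, 93),
}


def nghc_fraction(n_nghc, n_nhc):
    """Return the share of NGHCs among all chemicals in a dataset."""
    return n_nghc / (n_nghc + n_nhc)


for name, (n_nghc, n_nhc) in datasets.items():
    total = n_nghc + n_nhc
    print(f"{name}: {total} chemicals, {nghc_fraction(n_nghc, n_nhc):.1%} NGHC")
```

In every dataset NGHCs are the minority class, which is worth keeping in mind when reading the reported performance figures.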
Figure 1. Flowchart of consensus biomarker identification and independent test.
A chemical list was first collected from the literature. Consensus biomarkers were then identified using only the chemicals without inconsistent classifications. Prediction models based on machine learning algorithms were subsequently constructed from the consensus biomarkers and validated on an independent test dataset (JNJ). Finally, the inconsistently classified chemicals were reanalyzed using the developed models.
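The modelling step in this workflow (bagged decision trees evaluated by leave-one-out cross-validation and AUC) can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the data are synthetic, the four features merely stand in for four biomarker expression values, and the trees are depth-one stumps rather than the full decision trees used in the study. It is not the authors' implementation.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stump(X, y):
    """Depth-one tree: pick the (feature, threshold, direction) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                errors = sum(int(sign * (row[j] - t) > 0) != yi for row, yi in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    return best[1:]


def predict_stump(stump, row):
    j, t, sign = stump
    return int(sign * (row[j] - t) > 0)


def fit_bagging(X, y, n_estimators, rng):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_score(models, row):
    """Fraction of stumps voting for the positive class."""
    return sum(predict_stump(m, row) for m in models) / len(models)


def loocv_scores(X, y, n_estimators=15, seed=0):
    """Leave-one-out: score each sample with an ensemble trained on all the others."""
    rng = random.Random(seed)
    scores = []
    for i in range(len(X)):
        models = fit_bagging(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_estimators, rng)
        scores.append(bagging_score(models, X[i]))
    return scores


def make_toy_data(n_per_class=12, seed=1):
    """Synthetic stand-in data: 4 Gaussian features, shifted by one unit for the positive class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(float(label), 1.0) for _ in range(4)])
            y.append(label)
    return X, y


X, y = make_toy_data()
print(f"LOOCV AUC on toy data: {auc(y, loocv_scores(X, y)):.3f}")
```

The printed AUC describes only the synthetic data and says nothing about the published results. With real expression data, a tested library implementation of bagging and decision trees would be the more sensible choice.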
Figure 2. ROC curves representing the LOOCV performance of consensus biomarkers on four microarray datasets.
Figure 3. LOOCV performances for consensus biomarkers and published biomarkers.
Figure 4. Independent test performances for consensus biomarkers and published biomarkers.
The dosage and classification of inconsistently classified chemicals.
| Chemical | Dosage (mg/kg), in DMA, DMC, GSE8858, TG-GATEs order with unavailable datasets omitted | NGHC classification | NHC classification |
|---|---|---|---|
| Acetaminophen | 972, 972, 1000 | Fielden | Uehara |
| Beta-estradiol | 150, 150, 150 | Fielden | Liu |
| Carbamazepine | 490, 490, 490, 300 | Fielden | Uehara |
| Diazepam | 710, 710, 710, 250 | Yamada | Fielden |
| Diethylstilbestrol | 2.8, 280, 280 | Fielden | Liu |
| Ethanol | 6000, 6000, 4000 | Fielden | Uehara |
| Ethionamide | 250 | Yamada | Uehara |
| Griseofulvin | 2500, 2500, 1000 | Yamada | Uehara |
| Haloperidol | 30 | Yamada | Uehara |
| Isoniazid | 79, 79, 2000 | Liu | Uehara |
| Rifampin | 200 | Uehara | Yamada |
| Simvastatin | 1200, 1200, 1200, 400 | Fielden | Uehara |
| Sulfasalazine | 1000 | Yamada | Uehara |
| Tamoxifen | 64, 64, 60 | Yamada | Uehara |
| Tannic acid | 1000 | Yamada | Uehara |
| Triamterene | 150 | Yamada | Uehara |
Figure 5. Heatmap representing the expression levels of consensus biomarkers for reclassifying chemicals based on four microarray datasets.
Grey color indicates no available data.