Ping Gong, Xiaofei Nan, Natalie D Barker, Robert E Boyd, Yixin Chen, Dawn E Wilkins, David R Johnson, Burton C Suedel, Edward J Perkins.
Abstract
BACKGROUND: Chemical bioavailability is an important dose metric in environmental risk assessment. Although many approaches have been used to evaluate bioavailability, no single approach is free from limitations. Previously, we developed a new genomics-based approach that integrated microarray technology and regression modeling for predicting the bioavailability (tissue residue) of explosive compounds in exposed earthworms. In the present study, we further compared 18 different regression models and performed variable selection simultaneously with parameter estimation.
Year: 2016 PMID: 26956490 PMCID: PMC4784335 DOI: 10.1186/s12864-016-2541-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 The overall experimental approach. See the Methods section for explanation
Performance of 18 regression modeling methods on four datasets, assessed by the coefficient of determination (R², mean ± standard deviation, n = 10) estimated from ten runs of 10-fold cross-validation. Values of the best performing method for each dataset are shown in bold
| Regression method | RDX_D4 | RDX_D14 | TNT_D4 | TNT_D14 |
|---|---|---|---|---|
| Predictor size (gene #) | 26 | 3 | 53 | 6 |
| Linear | ||||
| Multivariate | 0.62 ± 0.19 | 0.65 ± 0.12 | 0.42 ± 0.14 | |
| Robust | 0.63 ± 0.14 | 0.65 ± 0.13 | NA | 0.67 ± 0.15 |
| Ridge | 0.65 ± 0.15 | 0.65 ± 0.13 | 0.73 ± 0.15 | 0.71 ± 0.16 |
| LASSO | 0.65 ± 0.18 | 0.65 ± 0.14 | 0.73 ± 0.15 | 0.69 ± 0.15 |
| Elastic net | | 0.66 ± 0.13 | 0.75 ± 0.19 | 0.69 ± 0.17 |
| SVR | 0.60 ± 0.15 | 0.68 ± 0.14 | 0.74 ± 0.16 | 0.66 ± 0.16 |
| Nonlinear | ||||
| Stepwise | 0.42 ± 0.21 | 0.69 ± 0.14 | 0.33 ± 0.21 | 0.60 ± 0.16 |
| Ridge Polynomial | 0.62 ± 0.18 | | 0.71 ± 0.14 | 0.66 ± 0.16 |
| Ridge Exponential | 0.65 ± 0.13 | 0.67 ± 0.13 | 0.68 ± 0.14 | 0.67 ± 0.17 |
| Ridge Gaussian | 0.64 ± 0.14 | 0.70 ± 0.15 | 0.43 ± 0.13 | 0.64 ± 0.16 |
| SVR Polynomial | 0.61 ± 0.15 | 0.68 ± 0.14 | 0.70 ± 0.12 | 0.63 ± 0.16 |
| SVR Gaussian | 0.63 ± 0.13 | 0.68 ± 0.14 | 0.74 ± 0.12 | 0.67 ± 0.13 |
| SVR Sigmoid | 0.17 ± 0.00 | NA | 0.08 ± 0.00 | NA |
| Nadaraya-Watson | 0.54 ± 0.09 | 0.68 ± 0.16 | 0.73 ± 0.17 | 0.67 ± 0.13 |
| Inverse | 0.44 ± 0.14 | NA | 0.31 ± 0.10 | NA |
| Loglog | NA | NA | NA | NA |
| Regression Tree | 0.53 ± 0.10 | 0.59 ± 0.13 | 0.73 ± 0.12 | 0.54 ± 0.14 |
| Random Forest | 0.60 ± 0.12 | 0.59 ± 0.16 | | 0.70 ± 0.17 |
RDX_D4: 4-day RDX exposure; RDX_D14: 14-day RDX exposure; TNT_D4: 4-day TNT exposure; TNT_D14: 14-day TNT exposure; NA: not available. See Additional file 5 for the lists and annotation of predictor genes
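
The table reports R² as mean ± standard deviation over ten runs of 10-fold cross-validation. A minimal sketch of that evaluation scheme follows, using only the Python standard library. Everything in it is an assumption made for illustration: the data are synthetic, the model is a one-predictor least-squares line rather than any of the 18 methods, and the helper names (`fit_ols`, `r_squared`, `cv_r2`, `repeated_cv_r2`) are hypothetical. It also assumes that each run pools its held-out predictions before scoring, which the caption does not specify.

```python
import random
import statistics


def fit_ols(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cv_r2(xs, ys, k, rng):
    """One run of k-fold CV: predict every sample while it is held out, then score."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)                      # new random fold assignment per run
    preds = [0.0] * len(xs)
    for f in range(k):
        test = idx[f::k]                  # every k-th shuffled index forms one fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = a + b * xs[i]
    return r_squared(ys, preds)


def repeated_cv_r2(xs, ys, k=10, runs=10, seed=0):
    """Mean and sample standard deviation of R2 over several CV runs."""
    rng = random.Random(seed)
    scores = [cv_r2(xs, ys, k, rng) for _ in range(runs)]
    return statistics.fmean(scores), statistics.stdev(scores)


# Synthetic stand-in for (expression value, tissue residue) pairs; not study data.
data_rng = random.Random(1)
x = [data_rng.uniform(0, 10) for _ in range(60)]
y = [2.0 + 0.8 * xi + data_rng.gauss(0, 1.0) for xi in x]

mean_r2, sd_r2 = repeated_cv_r2(x, y)
print(f"R2 = {mean_r2:.2f} +/- {sd_r2:.2f} (n = 10 runs)")
```

The spread across runs comes only from the random fold assignment, so the standard deviation here reflects partition sensitivity rather than sampling variability of the data.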
Fig. 2 The average predicted versus the measured tissue residues of TNT or RDX in all 4-day or 14-day exposed samples using their respective best performing models
Fig. 3 Tissue residue of radio-labeled HMX measured in earthworms exposed for 4, 14, and 28 days (see Supplementary file 3 for raw data). Data are represented as mean (column) + standard deviation (error bar) with n = 10. BC = blank control; SC = solvent control
Performance of 18 regression modeling methods on the three HMX exposure datasets, assessed by the coefficient of determination (R², mean ± standard deviation, n = 10) estimated from ten runs of 10-fold cross-validation. Values of the best performing method are shown in bold
| Regression method | D4 | D14 | D28 |
|---|---|---|---|
| Predictor size (gene #) | 6 | 6 | 10 |
| Linear | |||
| Multivariate | 0.53 ± 0.15 | 0.52 ± 0.15 | 0.58 ± 0.15 |
| Robust | 0.66 ± 0.12 | 0.72 ± 0.09 | 0.79 ± 0.02 |
| Ridge | 0.67 ± 0.10 | 0.70 ± 0.11 | 0.81 ± 0.02 |
| LASSO | 0.69 ± 0.10 | 0.72 ± 0.10 | 0.81 ± 0.04 |
| Elastic net | | 0.71 ± 0.11 | |
| SVR | 0.70 ± 0.10 | 0.65 ± 0.09 | 0.81 ± 0.05 |
| Nonlinear | |||
| Stepwise | 0.67 ± 0.07 | 0.66 ± 0.11 | 0.79 ± 0.05 |
| Ridge Polynomial | 0.63 ± 0.11 | | 0.76 ± 0.05 |
| Ridge Exponential | 0.68 ± 0.08 | 0.68 ± 0.09 | 0.79 ± 0.04 |
| Ridge Gaussian | 0.51 ± 0.16 | 0.56 ± 0.14 | 0.66 ± 0.06 |
| SVR Polynomial | 0.69 ± 0.11 | 0.64 ± 0.11 | 0.79 ± 0.06 |
| SVR Gaussian | 0.65 ± 0.09 | 0.60 ± 0.10 | 0.73 ± 0.10 |
| SVR Sigmoid | 0.48 ± 0.15 | 0.49 ± 0.15 | 0.68 ± 0.12 |
| Nadaraya-Watson | 0.68 ± 0.09 | 0.67 ± 0.09 | 0.80 ± 0.04 |
| Inverse | NA | NA | NA |
| Loglog | NA | NA | NA |
| Regression Tree | 0.56 ± 0.15 | 0.61 ± 0.14 | 0.65 ± 0.13 |
| Random Forest | 0.55 ± 0.16 | 0.60 ± 0.13 | 0.69 ± 0.10 |
D4: 4-day HMX exposure; D14: 14-day HMX exposure; D28: 28-day HMX exposure; NA: not available. See Additional file 5 for the lists and annotation of predictor genes
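
The abstract mentions performing variable selection simultaneously with parameter estimation, and LASSO, one of the methods compared in the tables, is a standard example of a method that does both in one fit: its L1 penalty sets some coefficients exactly to zero while estimating the rest. The sketch below illustrates that behavior with a plain cyclic coordinate descent solver. It is not the authors' implementation; the data are synthetic, the penalty value `lam=0.5` is arbitrary, and the function names `soft_threshold` and `lasso_cd` are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the vector y are centered (no intercept fitted).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)                       # residual y - Xw, with w = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes w[j]'s own term.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Synthetic data: 8 candidate predictors, only columns 0 and 3 affect the response.
rng = random.Random(0)
n, p = 80, 8
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_w = [3.0, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0, 0.0]
y = [sum(a * b for a, b in zip(row, true_w)) + rng.gauss(0, 0.5) for row in X]

# Center columns and response so that no intercept is needed.
col_means = [sum(row[j] for row in X) / n for j in range(p)]
X = [[row[j] - col_means[j] for j in range(p)] for row in X]
y_mean = sum(y) / n
y = [v - y_mean for v in y]

w = lasso_cd(X, y, lam=0.5)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected predictors:", selected)
print("coefficients:", [round(wj, 2) for wj in w])
```

The nonzero entries of `w` are the selected predictors, and their values are the fitted (shrunken) coefficients, so selection and estimation come out of the same optimization. In practice the penalty strength would be tuned, for example by cross-validation, rather than fixed by hand.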
Fig. 4 Prediction results of 4-, 14- and 28-day HMX-exposed earthworm tissue residues using the best performing models (shown are the results of a single run of 10-fold cross-validation)