Christoph Helma, Micha Rautenberg, Denis Gebele.
Abstract
The lazar framework for read-across predictions was expanded for the prediction of nanoparticle toxicities, and a new methodology for calculating nanoparticle descriptors from core and coating structures was implemented. Nano-lazar provides a flexible and reproducible framework for downloading data and ontologies from the open eNanoMapper infrastructure and for developing and validating nanoparticle read-across models, together with open-source code and a free graphical interface for nanoparticle read-across predictions. In this study we compare different nanoparticle descriptor sets and local regression algorithms. Sixty independent crossvalidation experiments were performed for the Net Cell Association endpoint of the Protein Corona dataset. The best RMSE and r2 results originated from models with protein corona descriptors and the weighted random forest algorithm, but their 95% prediction intervals are significantly less accurate than those of models with simpler descriptor sets (measured and calculated nanoparticle properties). The most accurate prediction intervals were obtained with measured nanoparticle properties (no statistically significant difference (p < 0.05) in RMSE and r2 values compared to protein corona descriptors). Calculated descriptors are interesting for cheap and fast high-throughput screening purposes: RMSE and prediction intervals of random forest models with calculated descriptors are comparable to protein corona models, but r2 values are significantly lower.
Keywords: QSAR; k-nearest-neighbors; machine learning; nanoparticle; predictive toxicology; read-across; toxicity
Year: 2017 PMID: 28670277 PMCID: PMC5472659 DOI: 10.3389/fphar.2017.00377
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
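The read-across idea behind the local models described in the abstract can be illustrated with a minimal sketch: a query nanoparticle is predicted from the activities of its most similar training neighbours. The sketch below is not the nano-lazar implementation. It assumes that descriptors are sets of fingerprint features, that similarity is the Tanimoto coefficient, and that the prediction is a similarity-weighted average; the function names, the `min_sim` threshold, and the toy feature labels are all hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across_predict(query: set, training: list, min_sim: float = 0.1):
    """Similarity-weighted average of neighbour activities.

    training: list of (fingerprint_set, activity) pairs.
    min_sim:  hypothetical similarity threshold for accepting a neighbour.
    Returns None when no training item is similar enough to the query.
    """
    neighbours = [(tanimoto(query, fp), y) for fp, y in training]
    neighbours = [(s, y) for s, y in neighbours if s >= min_sim]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        return None
    return sum(s * y for s, y in neighbours) / total


# Toy data: feature labels and activities are invented for illustration only.
training = [
    ({"Au", "SH", "PEG"}, 1.0),
    ({"Au", "SH", "COOH"}, 3.0),
    ({"Ag", "CIT"}, 5.0),
]
prediction = read_across_predict({"Au", "SH", "NH2"}, training)
```

In this toy case the first two training particles each share two of four features with the query (similarity 0.5) and the third shares none, so only the first two contribute to the weighted average. A local regression model such as the random forest mentioned in the abstract would replace the averaging step while keeping the neighbour selection.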
Results from five independent crossvalidations for various descriptor/algorithm combinations.
| Descriptors | Algorithm | RMSE | r2 | Measurements within 95% prediction interval (%) |
| MP2D | WA | | | NA |
| MP2D | PLS | | | 96 94 94 93 94 |
| MP2D | RF | 1.73 1.77 1.67 1.67 1.73 | | 96 93 94 94 96 |
| P-CHEM | WA | | | NA |
| P-CHEM | PLS | | | |
| P-CHEM | RF | 1.76 1.73 1.81 1.86 1.83 | 0.56 0.58 0.54 0.51 0.53 | 97 95 94 93 94 |
| Proteomics | WA | 1.88 1.72 1.73 1.91 1.76 | 0.52 0.6 0.59 0.52 0.58 | NA |
| Proteomics | PLS | 1.74 1.85 1.78 1.61 1.68 | 0.59 0.56 0.56 0.64 0.62 | |
| Proteomics | RF | | | |
| P-CHEM Proteomics | WA | 1.72 1.77 1.85 1.44 1.67 | 0.6 0.58 0.55 0.7 0.62 | NA |
| P-CHEM Proteomics | PLS | 1.55 1.91 1.79 1.94 1.64 | 0.67 0.54 0.58 0.51 0.64 | |
| P-CHEM Proteomics | RF | | | |
Best results (mean of 5 crossvalidations) are indicated by bold letters, statistically significant (p < 0.05) different results by italics. Results in normal fonts do not differ significantly from best results.
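The three quantities reported per crossvalidation can be computed as in the following sketch. It assumes that r2 is the squared Pearson correlation between measured and predicted values and that prediction interval accuracy is the percentage of measurements falling inside their interval; neither definition is stated in this record, and the function names are hypothetical. The five RMSE values are copied from the P-CHEM / RF row (assumed here to be the RMSE column) to show the "mean of 5 crossvalidations" mentioned in the note above.

```python
import math
from statistics import mean


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    return math.sqrt(mean((m - p) ** 2 for m, p in zip(measured, predicted)))


def r_squared(measured, predicted):
    """Squared Pearson correlation (an assumed definition of r2)."""
    mm, mp = mean(measured), mean(predicted)
    cov = sum((m - mm) * (p - mp) for m, p in zip(measured, predicted))
    var_m = sum((m - mm) ** 2 for m in measured)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_m * var_p)


def interval_coverage(measured, intervals):
    """Percentage of measurements inside their (low, high) prediction interval."""
    inside = sum(lo <= m <= hi for m, (lo, hi) in zip(measured, intervals))
    return 100.0 * inside / len(measured)


# Per-crossvalidation values from the P-CHEM / RF row of the table.
pchem_rf_rmse = [1.76, 1.73, 1.81, 1.86, 1.83]
mean_rmse = mean(pchem_rf_rmse)  # approximately 1.798
```

Comparing such means across descriptor/algorithm combinations, with a significance test on the five repeated values, is the kind of analysis the table note refers to; the test itself is not sketched here because this record does not name it.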
Figure 1. Correlation of predicted vs. measured values for five independent crossvalidations with MP2D fingerprint descriptors and local random forest models.
Figure 3. Correlation of predicted vs. measured values for five independent crossvalidations with Proteomics descriptors and local random forest models.
P-CHEM properties of the Protein corona dataset measured with and without human serum.
| Property | Medium | Unit |
| Localized Surface Plasmon Resonance (LSPR) index | – | |
| LSPR peak position (nm) | – | nm |
| | – | nm |
| Polydispersity index | Human serum | nm |
| | – | nm |
| | | nmol |
| Total surface area (SAtot) | Human serum | cm² |
| Protein density | Human serum | μg/cm² |
| | | μg |
| | – | mV |
| | | mV |
| Z-Average Hydrodynamic Diameter | – | nm |
| | | nm |
| Volume Mean Hydrodynamic Diameter | – | nm |
| | | nm |
| Number Mean Hydrodynamic Diameter | – | nm |
| Number Mean Hydrodynamic Diameter | Human serum | nm |
| Intensity Mean Hydrodynamic Diameter | – | nm |
| | | nm |
Features correlating with the Net cell association endpoint (relevant features) are indicated by bold letters.
Random forest predictions with measurements outside of the 95% prediction interval (Median log2 transformed values).
| MP2D fingerprints | G15.DDT@SDS | 5 | 2.2 | 6.2 |
| MP2D fingerprints | G15.NT@DCA | 5 | 0.7 | 3.0 |
| MP2D fingerprints | G60.MBA | 5 | 0.5 | 2.7 |
| MP2D fingerprints | G15.DDT@ODA | 1 | 1.1 | 5.0 |
| MP2D fingerprints | S40.MHDA | 1 | 0.0 | 3.4 |
| MP2D fingerprints | S40.CIT | 1 | 0.0 | 2.3 |
| MP2D fingerprints | G30.DDT@HDA | 1 | 0.0 | 4.2 |
| P-CHEM | G30.cPEG5K-SH | 5 | 2.3 | 4.5 |
| P-CHEM | G15.nPEG5K-SH | 5 | 1.0 | 5.4 |
| P-CHEM | G60.mPEG5K-SH | 5 | 0.7 | 4.3 |
| P-CHEM | S40.AUT | 4 | 0.7 | 3.0 |
| P-CHEM | G15.DDT@CTAB | 3 | 0.9 | 6.1 |
| P-CHEM | G15.HDA | 2 | 0.3 | 5.6 |
| P-CHEM | S40.PLL-SH | 2 | 0.1 | 2.2 |
| P-CHEM | G15.PEI-SH | 1 | 0.5 | 4.6 |
| P-CHEM | G15.DDT@SA | 1 | 0.4 | 1.2 |
| P-CHEM | G60.DTNB | 1 | 0.2 | 1.7 |
| P-CHEM | G15.MES | 1 | 0.2 | 2.3 |
| P-CHEM | S40.MAA | 1 | 0.1 | 2.6 |
| P-CHEM | G60.MBA | 1 | 0.0 | 1.6 |
| Proteomics | G15.nPEG5K-SH | 5 | 1.3 | 3.9 |
| Proteomics | G15.mPEG1K-SH | 5 | 0.8 | 3.5 |
| Proteomics | G30.cPEG5K-SH | 5 | 0.6 | 3.9 |
| Proteomics | G15.ODA | 4 | 1.8 | 4.5 |
| Proteomics | G60.NT@PVA | 4 | 0.3 | 2.8 |
| Proteomics | G60.MUTA | 4 | 0.3 | 1.5 |
| Proteomics | G30.AUT | 4 | 0.2 | 0.6 |
| Proteomics | G30.CALNN | 3 | 0.3 | 2.1 |
| Proteomics | G15.PEI-SH | 3 | 0.3 | 0.3 |
| Proteomics | S40.AUT | 2 | 1.6 | 3.3 |
| Proteomics | G60.mPEG5K-SH | 2 | 0.9 | 2.9 |
| Proteomics | S40.LA | 2 | 0.1 | 1.3 |
| Proteomics | G60.HDA | 1 | 2.4 | 3.7 |
| Proteomics | G15.MES | 1 | 1.8 | 3.2 |
| Proteomics | G15.PEG3K(NH2)-SH | 1 | 1.8 | 3.9 |
| Proteomics | G60.ODA | 1 | 1.0 | 4.2 |
| Proteomics | G15.AUT | 1 | 0.1 | 0.4 |
| Proteomics | G15.SA | 1 | 0.1 | 0.8 |
| Proteomics | G60.CIT | 1 | 0.1 | 0.7 |
| P-CHEM and Proteomics | G15.ODA | 5 | 2.0 | 5.0 |
| P-CHEM and Proteomics | G15.mPEG1K-SH | 5 | 0.8 | 3.1 |
| P-CHEM and Proteomics | G30.CALNN | 5 | 0.7 | 2.2 |
| P-CHEM and Proteomics | G15.nPEG5K-SH | 5 | 0.6 | 3.4 |
| P-CHEM and Proteomics | G60.MUTA | 5 | 0.5 | 1.5 |
| P-CHEM and Proteomics | G60.DTNB | 4 | 1.1 | 1.6 |
| P-CHEM and Proteomics | S40.AUT | 3 | 1.6 | 3.3 |
| P-CHEM and Proteomics | G60.mPEG5K-SH | 2 | 0.4 | 3.5 |
| P-CHEM and Proteomics | G30.AUT | 2 | 0.3 | 0.8 |
| P-CHEM and Proteomics | G15.AUT | 2 | 0.1 | 0.4 |
| P-CHEM and Proteomics | G15.MUA | 2 | 0.1 | 1.1 |
| P-CHEM and Proteomics | G30.cPEG5K-SH | 1 | 2.4 | 3.5 |
| P-CHEM and Proteomics | G15.PEG3K(NH2)-SH | 1 | 1.2 | 2.8 |
| P-CHEM and Proteomics | G15.PEI-SH | 1 | 0.3 | 0.3 |
| P-CHEM and Proteomics | G15.HDA | 1 | 0.2 | 3.9 |
| P-CHEM and Proteomics | G15.DDT@ODA | 1 | 0.1 | 2.0 |
| P-CHEM and Proteomics | G15.SA | 1 | 0.1 | 0.7 |
| P-CHEM and Proteomics | G15.PVA | 1 | 0.0 | 1.7 |
Figure 2. Correlation of predicted vs. measured values for five independent crossvalidations with P-CHEM descriptors and local random forest models.