Nicoleta Spînu, Mark T D Cronin, Junpeng Lao, Anna Bal-Price, Ivana Campia, Steven J Enoch, Judith C Madden, Liadys Mora Lagares, Marjana Novič, David Pamies, Stefan Scholz, Daniel L Villeneuve, Andrew P Worth.
Abstract
In a century where toxicology and chemical risk assessment are embracing alternative methods to animal testing, there is an opportunity to understand the causal factors of neurodevelopmental disorders such as learning and memory disabilities in children, as a foundation to predict adverse effects. New testing paradigms, along with advances in probabilistic modelling, can help with the formulation of mechanistically driven hypotheses on how exposure to environmental chemicals could potentially lead to developmental neurotoxicity (DNT). This investigation aimed to develop a Bayesian hierarchical model of a simplified AOP network for DNT. The model predicted the probability that a compound induces each of three selected common key events (CKEs) of the simplified AOP network and the adverse outcome (AO) of DNT, taking into account correlations and causal relations informed by the key event relationships (KERs). A dataset of 88 compounds representing pharmaceuticals, industrial chemicals and pesticides was compiled, including physicochemical properties as well as in silico and in vitro information. The Bayesian model was able to predict DNT potential with an accuracy of 76%, classifying the compounds into low, medium or high probability classes. The modelling workflow achieved three further goals: it dealt with missing values; accommodated unbalanced and correlated data; and followed the structure of a directed acyclic graph (DAG) to simulate the simplified AOP network. Overall, the model demonstrated the utility of Bayesian hierarchical modelling for the development of quantitative AOP (qAOP) models and for informing the use of new approach methodologies (NAMs) in chemical risk assessment.
Keywords: Adverse Outcome Pathway; Bayesian hierarchical model; Common Key Event; Developmental Neurotoxicity; New Approach Methodology
Abbreviations: ADMET, Absorption, distribution, metabolism, excretion, and toxicity; AO, Adverse outcome; AOP, Adverse outcome pathway; BBB, Blood-brain-barrier; BDNF, Brain-derived neurotrophic factor; CAS RN, Chemical Abstracts Service Registry Number; CI, Credible interval; CKE, Common key event; CNS, Central nervous system; CRA, Chemical risk assessment; DAG, Directed acyclic graph; DNT, Developmental neurotoxicity; DTXSID, The US EPA Comptox Chemical Dashboard substance identifier; EC, Effective concentration; HDI, Highest density interval; IATA, Integrated Approaches to Testing and Assessment; KE, Key event; KER, Key event relationship; LDH, Lactate dehydrogenase; MCMC, Markov chain Monte Carlo; MIE, Molecular initiating event; NAM, New approach methodology; OECD, Organisation for Economic Cooperation and Development; P-gp, P-glycoprotein; PBK, Physiologically-based kinetic; QSAR, Quantitative structure-activity relationship; SMILES, Simplified molecular input line entry system; qAOP, Quantitative adverse outcome pathway
Year: 2022 PMID: 35211661 PMCID: PMC8857173 DOI: 10.1016/j.comtox.2021.100206
Source DB: PubMed Journal: Comput Toxicol ISSN: 2468-1113
Fig. 1. Types of information collected for model development, exemplified for bisphenol A. This figure illustrates how different streams of data can be integrated for causal predictions to complement the information on key events. The full data set is provided in Table S3 in the supplementary material. The ECx values were extracted from the corresponding in vitro studies as published.
Types of data and their sources collected for the development of the Bayesian hierarchical model. The table describes all variables (predictors and outcomes) included as features in the model, together with their type of data and, where applicable, performance. See also Tables S1-S2 in the supplementary material.
| Variable | Description | Type of data | Performance | Source |
|---|---|---|---|---|
| Chemical Name | The names used to define the compounds tested in both | Not Applicable | Not Applicable | |
| CAS RN | Chemical Abstracts Service Registry Number associated with the tested compounds, used to identify and track them during the modelling. | Not Applicable | Not Applicable | |
| DNT Classification | Each compound was classified as either positive (known or potential inducer of DNT) or negative (safe or without evidence of inducing DNT) based on | Binary, i.e., positive (associated with DNT) or negative | Not Applicable | |
| LogD | The logarithm of the distribution coefficient, calculated from the compounds' SMILES strings. | Continuous, unitless values | Not Applicable | ChemSpider database |
| BBB | Each compound was classified for its capability to permeate the blood-brain barrier (BBB) based on curated SMILES. Predicting BBB permeability means indicating whether compounds pass across the BBB. Compounds that cross the BBB have the potential to be CNS-active, whereas compounds that do not cross are expected to be CNS-inactive. | Binary, i.e., positive (BBB permeable) or negative | The | Literature review |
| Cbrain/Cblood | | Continuous, unitless values | The QSAR model of Ma et al. | PreADMET v.2.0 |
| P-glycoprotein Status | Each compound was classified based on curated SMILES as a substrate or non-substrate, an inhibitor or non-inhibitor, and active or inactive for the P-glycoprotein (P-gp) transporter using an | Binary, i.e., yes or no for a compound acting as a substrate or an inhibitor, or activity against P-gp | The non-error rate and the average precision were 0.70 for the external validation set. | |
Fig. 2. A simplified graphical representation of the Bayesian hierarchical model utilised to assess individual compounds for their DNT induction potential. The model follows a specific biological path in the AOP network for DNT. The dotted lines represent the imputation step of the missing values, which was conducted either from the prior distribution for X or from the posterior distribution for Y. BDNF: reduction of brain-derived neurotrophic factor; SYN: decrease of synaptogenesis; NNF: decrease of neural network formation; DNT: developmental neurotoxicity; miss: missing values; i: number of compounds; X: predictors, independent variables; Y: outcomes, dependent variables; parameters of the model.
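The path described in this caption (BDNF, then SYN, then NNF, then DNT) can be illustrated with a small forward pass. This is only a sketch under stated assumptions, not the published model: it assumes each node is a logistic regression on the compound's predictors plus the probability of the node immediately upstream, and every coefficient and predictor value below is a hypothetical placeholder. The names `predict_path` and `PATH` are invented for this illustration. Priors, MCMC sampling and the imputation of missing values are left out.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Node order along the simplified biological path named in the caption.
PATH = ["BDNF", "SYN", "NNF", "DNT"]

def predict_path(x, coefs):
    """Propagate one compound's predictors x along the chain of nodes.

    coefs[node] = (intercept, weights over x, weight on upstream probability).
    All coefficients are hypothetical; a Bayesian fit would supply posterior
    draws of them instead of single fixed values.
    """
    probs = {}
    upstream = 0.0  # the first node has no upstream key event
    for node in PATH:
        intercept, weights, w_up = coefs[node]
        z = intercept + sum(w * xi for w, xi in zip(weights, x)) + w_up * upstream
        probs[node] = sigmoid(z)
        upstream = probs[node]
    return probs

if __name__ == "__main__":
    # Hypothetical predictors (e.g. a scaled LogD and a BBB flag) and coefficients.
    x = [0.4, 1.0]
    coefs = {
        "BDNF": (-1.0, [0.5, 0.8], 0.0),
        "SYN": (-1.5, [0.2, 0.3], 2.0),
        "NNF": (-1.5, [0.1, 0.2], 2.5),
        "DNT": (-2.0, [0.1, 0.4], 3.0),
    }
    for node, p in predict_path(x, coefs).items():
        print(f"{node}: {p:.3f}")
```

Repeating the forward pass once per posterior draw of the coefficients would give a distribution of probabilities per node rather than a single number, which is the kind of output the figures summarise.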
Fig. 3. Visualisations of the predictions for three compounds (fluoxetine, sodium fluoride and glyphosate), showing the relationships between the compounds, the CKEs and the AO following the structure of the simplified biological path for DNT. The shaded blue distribution represents the predicted severity of the CKEs and the AO, as well as the uncertainty in the prediction, shown as the 95% highest density interval (HDI). For a complete overview of the results for all compounds, the reader is referred to Fig. S9 in the supplementary material. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
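For readers unfamiliar with the HDI, the sketch below shows one common way to estimate it from posterior samples: sort the samples and take the narrowest window that contains the requested share of them. The function name `hdi` and the sample values are illustrative assumptions. The source does not say which software or algorithm produced the intervals in the figure.

```python
import math

def hdi(samples, mass=0.95):
    """Return (lower, upper) of the narrowest interval holding `mass` of samples.

    Empirical estimate suited to unimodal posteriors; a multimodal posterior
    may need a union of intervals, which this sketch does not handle.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("samples must not be empty")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

if __name__ == "__main__":
    # Four made-up draws; the narrowest window holding half of them is (0, 1).
    print(hdi([10, 0, 1, 5], mass=0.5))  # → (0, 1)
```

Unlike an equal-tailed credible interval, the HDI does not have to be symmetric around the mean, which matters for skewed posterior probabilities close to 0 or 1.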
Fig. 4. The results of the posterior predictive probabilities. A. The distribution of the posterior probabilities of the three CKEs and the AO for positive compounds, negative compounds, and compounds with a level of missingness for the CKE reduction of BDNF. It shows how the mean predicted probabilities of the compounds were clustered and distributed relative to the two thresholds that define a low, medium and high probability for the induction of the corresponding CKE and AO. A detailed graphical representation is shown in the supplementary material, Figs. S10–S12. B. Predicted probabilities of compounds for the induction of developmental neurotoxicity. The predicted probabilities are colour-coded based on two thresholds, estimated from the results set, that group the compounds into low, medium and high probability. Compounds are listed in order of increasing probability.
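The grouping and ordering described for panel B can be sketched as follows. The two cut-offs (0.3 and 0.7), the compound labels and the probabilities are hypothetical placeholders: the caption states only that two thresholds were estimated from the results set and does not give their values. The function names are likewise invented for this illustration.

```python
def probability_class(p, low_cut, high_cut):
    """Map a mean predicted probability to 'low', 'medium' or 'high'."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if not low_cut < high_cut:
        raise ValueError("low_cut must be smaller than high_cut")
    if p < low_cut:
        return "low"
    if p < high_cut:
        return "medium"
    return "high"

def rank_compounds(mean_probs, low_cut, high_cut):
    """Return (name, probability, class) tuples in order of increasing probability."""
    ordered = sorted(mean_probs.items(), key=lambda item: item[1])
    return [(name, p, probability_class(p, low_cut, high_cut)) for name, p in ordered]

if __name__ == "__main__":
    # Placeholder compounds and values, not results from the study.
    example = {"compound_A": 0.82, "compound_B": 0.15, "compound_C": 0.48}
    for name, p, label in rank_compounds(example, low_cut=0.3, high_cut=0.7):
        print(f"{name}: {p:.2f} ({label})")
```

Here a value exactly on a cut-off falls into the higher class. That boundary convention is a choice made for this sketch.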
Qualitative assessment of the sources of uncertainty characteristic of the model proposed herein.
| Source of uncertainty | Level | Description |
|---|---|---|
| Causal structure | High | The causal links were inferred from well-established AOPs, even though there may be other (as yet unknown) causal links. The causal structure does not fulfil all Bradford Hill criteria and should therefore be considered with caution. |
| | High | The model is data-driven; however, the compiled data set is not ideally suited for modelling purposes (i.e., it was not specifically designed to evaluate computationally such a hypothesis). The variability given by the |
| | High | Limited applicability domain (organics) of the |
| Probabilistic modelling | Medium | A parametric model was developed. Such a model combined with a subjective assessment type of probabilistic model might lead to a better-informed prediction and increase the trust in its use. |
| Choice of priors | Low | Weakly-informative priors were chosen with little influence on the posterior probabilities. This is also shown by the sensitivity analysis for exploring three hyperpriors. |
| Mathematical approach | Medium | Linear and logistic regressions were defined to describe the causal structure. It did not account for temporal dynamics, ADMET, kinetics and types of exposure (acute vs chronic). |
| Model robustness | Low | Statistical parameters showed the model converged well. |
| Imputation method | Medium | Prior-based imputation is very informative, especially in a hierarchical type of Bayesian model that helps to inform each of the CKEs. Posterior-based imputation led to an almost uniform distribution of posterior predictive probabilities for the reduction of BDNF, a CKE with this type of information missing. Such imputation might be better suited to multi-class problems (e.g., proportional odds) than to a binary problem. |
| Model performance | High | Several metrics are available with few specifically developed for Bayesian models. The reporting is more important than the selection of such metrics, in addition to making the model accessible. Model performance can have an impact on its future applications. |
| Uncertainty metrics for outputs | Low | Mean and credible intervals are informative, which is an advantage of the probabilistic approach. |
| Applicability domain | High | The chemical diversity, in comparison with |
Low – very little impact on the predicted probabilities; Medium – a relatively moderate level of influence on the predicted probabilities; High – a strong influence on the model outcome.