Heather L Ciallella, Daniel P Russo, Lauren M Aleksunes, Fabian A Grimm, Hao Zhu.
Abstract
As defined by the World Health Organization, an endocrine disruptor is an exogenous substance or mixture that alters function(s) of the endocrine system and consequently causes adverse health effects in an intact organism, its progeny, or (sub)populations. Traditional experimental testing regimens to identify toxicants that induce endocrine disruption can be expensive and time-consuming. Computational modeling has emerged as a promising and cost-effective alternative method for screening and prioritizing potentially endocrine-active compounds. The efficient identification of suitable chemical descriptors and machine-learning algorithms, including deep learning, is a considerable challenge for computational toxicology studies. Here, we sought to apply classic machine-learning algorithms and deep-learning approaches to a panel of over 7500 compounds tested against 18 Toxicity Forecaster (ToxCast) assays related to nuclear estrogen receptor (ERα and ERβ) activity. Three binary fingerprints (Extended Connectivity FingerPrints [ECFP], Functional Connectivity FingerPrints [FCFP], and Molecular ACCess System [MACCS]) were used as chemical descriptors in this study. Each descriptor was combined with four machine-learning and two deep-learning (normal and multitask neural networks) approaches to construct models for all 18 ER assays. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) obtained from a fivefold cross-validation procedure. Individual models had AUC values ranging from 0.56 to 0.86. External validation was conducted using two additional sets of compounds (n = 592 and n = 966) with experimentally established interactions with nuclear ER. For this external validation, an agonist, antagonist, or binding score was determined for each compound by averaging its predicted probabilities across the relevant assay models, yielding AUC values ranging from 0.63 to 0.91.
The results suggest that multitask neural networks offer advantages when modeling mechanistically related endpoints. Consensus predictions based on the average values of individual models remain the best modeling strategy for computational toxicity evaluations.
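The external validation score described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the compound names, probabilities, and activity labels are invented, and `endpoint_score` and `roc_auc` are hypothetical helper names. AUC is computed here as the fraction of active/inactive pairs in which the active compound receives the higher score, with ties counted as one half.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the share of (active, inactive) pairs ranked correctly; ties count 0.5."""
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


def endpoint_score(assay_probabilities):
    """Average one compound's predicted probabilities over the relevant assay models."""
    return mean(assay_probabilities)


# Hypothetical predicted probabilities from three assay models per compound.
predictions = {
    "cmpd_1": [0.9, 0.8, 0.7],
    "cmpd_2": [0.2, 0.4, 0.3],
    "cmpd_3": [0.6, 0.5, 0.7],
    "cmpd_4": [0.1, 0.3, 0.2],
}
# Hypothetical experimental outcomes (1 = active, 0 = inactive).
known_activity = {"cmpd_1": 1, "cmpd_2": 0, "cmpd_3": 1, "cmpd_4": 0}

names = sorted(predictions)
scores = [endpoint_score(predictions[name]) for name in names]
labels = [known_activity[name] for name in names]
auc = roc_auc(labels, scores)
print(f"external validation AUC on the toy data: {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from inactives, which is the scale on which the reported values should be read.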
Year: 2020 PMID: 32778734 PMCID: PMC7873171 DOI: 10.1038/s41374-020-00477-2
Source DB: PubMed Journal: Lab Invest ISSN: 0023-6837 Impact factor: 5.662
Estrogen Receptor Toxicity Forecaster (ToxCast) Agonism, Antagonism, and Binding Assays
| Assay ID | Assay Endpoint Name | Assay Source | Organism | Gene Name | Timepoint (min) | Biological Process Target | Assay Design Type | Cell Line |
|---|---|---|---|---|---|---|---|---|
| A1 | NVS_NR_bER | NovaScreen | Bovine | ERα | 1080 | Receptor binding | Radioligand binding | NA |
| A2 | NVS_NR_hER | NovaScreen | Human | ERα | 1080 | Receptor binding | Radioligand binding | NA |
| A3 | NVS_NR_mERa | NovaScreen | Mouse | ERα | 1080 | Receptor binding | Radioligand binding | NA |
| A4 | OT_ER_ERaERa_0480 | Odyssey Thera | Human | ERα | 480 | Protein stabilization | Protein fragment complementation assay | HEK293T |
| A5 | OT_ER_ERaERa_1440 | Odyssey Thera | Human | ERα | 1440 | Protein stabilization | Protein fragment complementation assay | HEK293T |
| A6 | OT_ER_ERaERb_0480 | Odyssey Thera | Human | ERα, ERβ | 480 | Protein stabilization | Protein fragment complementation assay | HEK293T |
| A7 | OT_ER_ERaERb_1440 | Odyssey Thera | Human | ERα, ERβ | 1440 | Protein stabilization | Protein fragment complementation assay | HEK293T |
| A8 | OT_ER_ERbERb_0480 | Odyssey Thera | Human | ERβ | 480 | Protein stabilization | Protein fragment complementation assay | HEK293T |
| A9 | OT_ER_ERbERb_1440 | Odyssey Thera | Human | ERβ | 1440 | Protein stabilization | Protein fragment complementation assay | HEK293T |
| A10 | OT_ERa_EREGFP_0120 | Odyssey Thera | Human | ERα | 120 | Regulation of gene expression | Fluorescent protein induction | HeLa |
| A11 | OT_ERa_EREGFP_0480 | Odyssey Thera | Human | ERα | 480 | Regulation of gene expression | Fluorescent protein induction | HeLa |
| A12 | ATG_ERa_TRANS_up | Attagene, Inc. | Human | ERα | 1440 | Regulation of transcription factor activity | mRNA induction | HepG2 |
| A13 | ATG_ERE_CIS_up | Attagene, Inc. | Human | ERα | 1440 | Regulation of transcription factor activity | mRNA induction | HepG2 |
| A14 | TOX21_ERa_BLA_Agonist_ratio | Tox21 | Human | ERα | 1440 | Regulation of transcription factor activity | Beta lactamase induction | HEK293T |
| A15 | TOX21_ERa_LUC_BG1_Agonist | Tox21 | Human | ERα | 1320 | Regulation of transcription factor activity | Luciferase induction | BG1 |
| A16 | ACEA_T47D_80hr_Positive | ACEA Biosciences, Inc. | Human | ERα | 1920 | Cell proliferation | Real-time cell-growth kinetics | T47D |
| A17 | TOX21_ERa_BLA_Antagonist_ratio | Tox21 | Human | ERα | 1440 | Regulation of transcription factor activity | Beta lactamase induction | HEK293T |
| A18 | TOX21_ERa_LUC_BG1_Antagonist | Tox21 | Human | ERα | 1320 | Regulation of transcription factor activity | Luciferase induction | BG1 |
Figure 1. Distributions of (A) compounds in the ToxCast and Tox21 dataset (n=7,576) by the number of conclusive active or inactive results per compound and (B) individual assay datasets (n=18) by the number of active and inactive compounds.
Figure 2. Consensus QSAR modeling workflow used in this study.
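The cross-validated AUC values reported in this study come from a fivefold procedure. The sketch below shows only the fold bookkeeping, assuming a plain shuffled split; whether the authors stratified folds by activity is not stated in this excerpt, and `k_fold_indices` and `train_test_splits` are hypothetical helper names.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_items, k=5, seed=0):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    folds = k_fold_indices(n_items, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy run on 12 compounds: every compound is held out exactly once.
splits = list(train_test_splits(12, k=5))
for train, test in splits:
    print(f"train={len(train):2d} test={len(test)}")
```

In a full workflow, a model would be trained on each `train` list and scored on the matching `test` list, and the held-out predictions would then feed the AUC calculation.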
Performance of Individual Models for 18 ToxCast and Tox21 ER Assays Using a Five-Fold Cross-Validation (AUC by Assay)
| Algorithms | Descriptors | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 | A10 | A11 | A12 | A13 | A14 | A15 | A16 | A17 | A18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BNB | MACCS | 0.732 | 0.702 | 0.664 | 0.803 | 0.764 | 0.788 | 0.705 | 0.770 | 0.723 | 0.688 | 0.672 | 0.716 | 0.670 | 0.698 | 0.618 | 0.597 | 0.685 | 0.716 |
| BNB | FCFP6 | 0.723 | 0.725 | 0.727 | 0.819 | 0.764 | 0.829 | 0.749 | 0.820 | 0.749 | 0.740 | 0.720 | 0.742 | 0.687 | 0.724 | 0.645 | 0.725 | 0.746 | |
| BNB | ECFP6 | 0.722 | 0.704 | 0.723 | 0.828 | 0.763 | 0.824 | 0.705 | 0.800 | 0.725 | 0.688 | 0.692 | 0.735 | 0.682 | 0.730 | 0.643 | 0.632 | 0.722 | 0.736 |
| | MACCS | 0.625 | 0.649 | 0.639 | 0.681 | 0.676 | 0.729 | 0.634 | 0.693 | 0.651 | 0.707 | 0.682 | 0.686 | 0.659 | 0.712 | 0.616 | 0.601 | 0.654 | 0.636 |
| | FCFP6 | 0.597 | 0.597 | 0.596 | 0.639 | 0.643 | 0.650 | 0.614 | 0.641 | 0.627 | 0.603 | 0.616 | 0.622 | 0.622 | 0.650 | 0.592 | 0.588 | 0.615 | 0.605 |
| | ECFP6 | 0.593 | 0.600 | 0.610 | 0.626 | 0.642 | 0.609 | 0.576 | 0.599 | 0.597 | 0.590 | 0.573 | 0.618 | 0.587 | 0.644 | 0.562 | 0.578 | 0.601 | 0.599 |
| RF | MACCS | 0.740 | 0.687 | 0.689 | 0.843 | 0.848 | 0.733 | 0.827 | 0.736 | 0.743 | 0.714 | 0.750 | 0.704 | 0.762 | 0.658 | 0.620 | 0.799 | 0.818 | |
| RF | FCFP6 | 0.730 | 0.723 | 0.707 | 0.796 | 0.735 | 0.837 | 0.708 | 0.812 | 0.743 | 0.751 | 0.696 | 0.748 | 0.683 | 0.733 | 0.642 | 0.635 | 0.748 | 0.747 |
| RF | ECFP6 | 0.742 | 0.685 | 0.726 | 0.805 | 0.783 | 0.843 | 0.716 | 0.809 | 0.715 | 0.677 | 0.729 | 0.740 | 0.689 | 0.740 | 0.646 | 0.617 | 0.745 | 0.726 |
| SVM | MACCS | 0.737 | 0.717 | 0.679 | 0.845 | 0.795 | 0.864 | 0.712 | 0.819 | 0.715 | 0.759 | 0.737 | 0.712 | 0.782 | 0.652 | 0.622 | 0.819 | 0.827 | |
| SVM | FCFP6 | 0.713 | 0.677 | 0.701 | 0.822 | 0.736 | 0.827 | 0.735 | 0.818 | 0.733 | 0.768 | 0.709 | 0.742 | 0.698 | 0.744 | 0.639 | 0.626 | 0.794 | 0.789 |
| SVM | ECFP6 | 0.706 | 0.697 | 0.713 | 0.827 | 0.748 | 0.810 | 0.667 | 0.792 | 0.683 | 0.684 | 0.664 | 0.756 | 0.697 | 0.785 | 0.641 | 0.613 | 0.802 | 0.798 |
| Normal DNN | MACCS | 0.695 | 0.690 | 0.679 | 0.827 | 0.771 | 0.855 | 0.659 | 0.751 | 0.723 | 0.737 | 0.699 | 0.724 | 0.674 | 0.777 | 0.637 | 0.596 | 0.798 | 0.790 |
| Normal DNN | FCFP6 | 0.687 | 0.656 | 0.673 | 0.780 | 0.689 | 0.738 | 0.658 | 0.770 | 0.725 | 0.662 | 0.661 | 0.675 | 0.631 | 0.648 | 0.609 | 0.562 | 0.649 | 0.641 |
| Normal DNN | ECFP6 | 0.708 | 0.682 | 0.672 | 0.811 | 0.752 | 0.661 | 0.605 | 0.701 | 0.667 | 0.588 | 0.643 | 0.696 | 0.624 | 0.590 | 0.574 | 0.592 | 0.678 | 0.674 |
| Multitask DNN | MACCS | 0.707 | 0.705 | 0.700 | 0.752 | 0.849 | 0.743 | 0.822 | 0.733 | 0.775 | 0.761 | 0.699 | 0.781 | 0.647 | 0.635 | 0.815 | 0.818 | | |
| Multitask DNN | FCFP6 | 0.709 | 0.685 | 0.677 | 0.810 | 0.732 | 0.818 | 0.790 | 0.726 | 0.720 | 0.709 | 0.647 | 0.724 | 0.625 | 0.618 | 0.748 | 0.722 | | |
| Multitask DNN | ECFP6 | 0.691 | 0.677 | 0.664 | 0.810 | 0.705 | 0.791 | 0.694 | 0.776 | 0.686 | 0.679 | 0.674 | 0.723 | 0.650 | 0.735 | 0.614 | 0.626 | 0.775 | 0.739 |
| Consensus | MACCS | 0.703 | 0.852 | 0.796 | 0.718 | 0.819 | 0.739 | 0.749 | 0.728 | 0.764 | 0.634 | | | | | | | | |
| Consensus | FCFP6 | 0.741 | 0.703 | 0.809 | 0.742 | 0.829 | 0.742 | 0.750 | 0.726 | 0.752 | 0.700 | 0.745 | 0.644 | 0.638 | 0.779 | 0.784 | | | |
| Consensus | ECFP6 | 0.725 | 0.707 | 0.728 | 0.833 | 0.770 | 0.798 | 0.700 | 0.798 | 0.713 | 0.686 | 0.710 | 0.754 | 0.697 | 0.743 | 0.639 | 0.642 | 0.781 | 0.784 |
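A quick arithmetic check on the table above shows that a consensus AUC is not simply the mean of the individual AUC values. For ECFP6 on assay A1, the six individual rows average to about 0.694, while the consensus row reports 0.725. That is consistent with averaging the models' predictions first and scoring the averaged predictions afterwards, although this excerpt does not spell out the exact procedure. The sketch below assumes per-compound averaging of predicted probabilities; `consensus_probabilities` and the toy model outputs are hypothetical.

```python
from statistics import mean


def consensus_probabilities(predictions_by_model):
    """Average predicted probabilities compound by compound across models."""
    columns = list(predictions_by_model.values())
    if len({len(column) for column in columns}) != 1:
        raise ValueError("every model must score the same compounds")
    return [mean(per_compound) for per_compound in zip(*columns)]


# A1 AUC values from the six individual-model ECFP6 rows of the table above.
individual_auc_a1 = [0.722, 0.593, 0.742, 0.706, 0.708, 0.691]
consensus_auc_a1 = 0.725  # consensus ECFP6 row, assay A1

mean_of_aucs = mean(individual_auc_a1)
gap = consensus_auc_a1 - mean_of_aucs
print(f"mean of individual AUCs: {mean_of_aucs:.3f}")
print(f"reported consensus AUC:  {consensus_auc_a1:.3f}")

# Hypothetical probabilities from two models for the same three compounds.
toy = {"model_a": [0.2, 0.8, 0.5], "model_b": [0.4, 0.6, 0.9]}
averaged = consensus_probabilities(toy)
print("consensus probabilities:", [round(p, 2) for p in averaged])
```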
External Validation of ER Agonists, Antagonists, and Binders (AUC)
| Algorithms | Descriptors | CERAPP | CERAPP | CERAPP | EADB |
|---|---|---|---|---|---|
| BNB | MACCS | 0.859 | 0.731 | 0.684 | 0.640 |
| BNB | FCFP6 | 0.799 | 0.815 | 0.715 | 0.757 |
| BNB | ECFP6 | 0.780 | 0.831 | 0.702 | 0.686 |
| | MACCS | 0.796 | 0.768 | 0.688 | 0.729 |
| | FCFP6 | 0.732 | 0.711 | 0.622 | 0.751 |
| | ECFP6 | 0.736 | 0.786 | 0.626 | 0.684 |
| RF | MACCS | 0.901 | 0.759 | 0.713 | 0.756 |
| RF | FCFP6 | 0.884 | 0.747 | 0.703 | 0.726 |
| RF | ECFP6 | 0.706 | 0.707 | 0.747 | |
| SVM | MACCS | 0.887 | 0.820 | 0.739 | 0.770 |
| SVM | FCFP6 | 0.829 | 0.830 | 0.667 | 0.765 |
| SVM | ECFP6 | 0.829 | 0.849 | 0.670 | 0.790 |
| Normal DNN | MACCS | 0.879 | 0.860 | 0.767 | |
| Normal DNN | FCFP6 | 0.794 | 0.780 | 0.691 | |
| Normal DNN | ECFP6 | 0.801 | 0.733 | 0.681 | 0.724 |
| Multitask DNN | MACCS | 0.866 | 0.749 | 0.698 | 0.720 |
| Multitask DNN | FCFP6 | 0.822 | 0.672 | 0.787 | |
| Multitask DNN | ECFP6 | 0.821 | 0.751 | 0.736 | 0.757 |
| Consensus | MACCS | 0.889 | 0.828 | 0.726 | 0.766 |
| Consensus | FCFP6 | 0.826 | 0.817 | 0.704 | 0.784 |
| Consensus | ECFP6 | 0.823 | 0.831 | 0.726 | 0.738 |
Figure 3. Predictivity of individual and consensus QSAR models using MACCS descriptors for (A) cross-validation and (B) external validation with a chemical similarity threshold of 0.8, using FCFP descriptors for (C) cross-validation and (D) external validation with a chemical similarity threshold of 0.4, and using ECFP descriptors for (E) cross-validation and (F) external validation with a chemical similarity threshold of 0.3. All AUC values are reported as the mean value ± standard deviation.
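The caption does not say how chemical similarity was measured or applied. One common reading, shown here purely as an assumption, is a nearest-neighbor applicability-domain filter: an external compound is retained only if its Tanimoto similarity to at least one training compound meets the threshold. Fingerprints are represented as sets of "on" bit indices, and `tanimoto` and `within_domain` are hypothetical helper names.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def within_domain(query_bits, training_fingerprints, threshold):
    """True if the query's nearest training neighbor meets the similarity threshold."""
    nearest = max(
        (tanimoto(query_bits, fp) for fp in training_fingerprints),
        default=0.0,
    )
    return nearest >= threshold


# Hypothetical fingerprints, written as sets of on-bit indices.
training = [{1, 2, 3, 4}, {10, 11, 12}]
close_query = {1, 2, 3, 4, 5}   # shares 4 of 5 bits with the first training compound
distant_query = {1, 20, 21, 22}  # shares only 1 bit with any training compound

print(within_domain(close_query, training, threshold=0.8))
print(within_domain(distant_query, training, threshold=0.8))
```

Under this reading, the lower thresholds quoted for FCFP (0.4) and ECFP (0.3) would reflect the fact that similarity values depend on the fingerprint type, so a single cutoff does not carry the same meaning across descriptors.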