Kristin A Connors1, Amy Beasley2, Mace G Barron3, Scott E Belanger1, Mark Bonnell4, Jessica L Brill1, Dick de Zwart5, Aude Kienzler6, Jesse Krailler1, Ryan Otter7, Joshua L Phillips7, Michelle R Embry8.
Abstract
Flexible, rapid, and predictive approaches that do not require the use of large numbers of vertebrate test animals are needed because the chemical universe remains largely untested for potential hazards. Development of robust new approach methodologies and nontesting approaches requires the use of existing information via curated, integrated data sets. The ecological threshold of toxicological concern (ecoTTC) represents one such new approach methodology that can predict a conservative de minimis toxicity value for chemicals with little or no information available. For the creation of an ecoTTC tool, a large, diverse environmental data set was developed from multiple sources, with harmonization, characterization, and information quality assessment steps to ensure that the information could be effectively organized and mined. The resulting EnviroTox database contains 91 217 aquatic toxicity records representing 1563 species and 4016 unique Chemical Abstracts Service numbers and is a robust, curated database containing high-quality aquatic toxicity studies that are traceable to the original information source. Chemical-specific information is also linked to each record and includes physico-chemical information, chemical descriptors, and mode of action classifications. Toxicity data are associated with the physico-chemical data, mode of action classifications, and curated taxonomic information for the organisms tested. The EnviroTox platform also includes 3 analysis tools: a predicted-no-effect concentration calculator, an ecoTTC distribution tool, and a chemical toxicity distribution tool. Although the EnviroTox database and tools were originally developed to support ecoTTC analysis and development, they have broader applicability to the field of ecological risk assessment. Environ Toxicol Chem 2019;9999:1-12.
Keywords: Aquatic toxicity; Database; Environmental toxicology; Toxicological threshold of concern
Year: 2019 PMID: 30714190 PMCID: PMC6850623 DOI: 10.1002/etc.4382
Source DB: PubMed Journal: Environ Toxicol Chem ISSN: 0730-7268 Impact factor: 3.742
Candidate sources of ecotoxicological data for inclusion in the EnviroTox database as of October 2016
| Data source | Description | No. of records/information (SIFT step 3) |
|---|---|---|
| ECHA (REACH) | Obtained by query of the REACH data from eChemPortal database of publicly available substance data, submitted to ECHA under the REACH regulations (Organisation for Economic Co‐operation and Development | 2398 records 215 substances 131 species |
| USEPA ECOTOX | Obtained by query of the USEPA's ECOTOX Knowledgebase, including USEPA‐generated test data and data from the public literature (US Environmental Protection Agency | 68 716 records 1864 substances 955 species |
| Peer‐reviewed literature | Original data set foundational to species sensitivity distribution work by De Zwart ( | 29 903 records 3447 substances 1557 species |
| AiiDA | Aquatic Impact Indicator Database; contains data sourced from ECHA, ECOTOX, and others. Queried to supplement for data not found in REACH ( | 2709 records 533 substances 146 species |
| METI | Summary of aquatic toxicity test results from OECD guideline tests conducted by the Japanese Ministry of the Environment. Some data publicly available via the OECD Toolbox. Also known as the NITE‐CHRIP database ( | 2787 records 464 substances 3 species |
| FET | Data set of acute aquatic toxicity test results from the OECD validation study to evaluate the reproducibility of the zebrafish embryo test (Belanger et al. | 2516 records 229 substances 7 species |
| USGS Columbia | Columbia Summary data set of acute aquatic toxicity tests conducted by the USGS Columbia Environmental Research Center, including Mayer and Ellersieck ( | 4053 records 294 substances 66 species |
| Pharmaceuticals | Summary of acute and chronic aquatic toxicity data for active pharmaceutical ingredients (provided by Sanofi and detailed in Vestel et al. | 334 records 163 substances 3 taxa |
| ECOSAR training set | Set of aquatic toxicity data used to train the computational QSAR tool ECOSAR (ECOlogical Structure Activity Relationship) developed by the USEPA for hazard estimation; sourced from the help files for the ECOSAR program (US Environmental Protection Agency | 2311 records 1007 substances 11 taxa |
| USEPA Pesticide Data | Pesticide Ecotoxicity Database (formerly the Ecological Effects Database); aquatic toxicity data provided by the USEPA's Office of Pesticide Programs Environmental Fate and Effects Division ( | 516 records 338 substances 15 species |
| OECD QSAR Toolbox | Queried to supplement aquatic toxicity data from ECOTOX and ECHA; contents include data from ECETOC OASIS and Aquatic Japan Ministry of the Environment (Organisation for Economic Co‐operation and Development | 6178 records 65 substances 60 species |
ECHA = European Chemicals Agency; FET = fish embryo test; HESI = Health and Environmental Sciences Institute; METI = Japanese Ministry of the Environment; OECD = Organisation for Economic Co‐operation and Development; QSAR = quantitative structure–activity relationship; REACH = Registration, Evaluation, Authorisation and Restriction of Chemicals; SIFT = Stepwise Information‐Filtering Tool; USEPA = US Environmental Protection Agency; USGS = US Geological Survey.
SIFT criteria used to ascertain inclusion of ecotoxicological data in the EnviroTox database
| Step | Criteria | Specifics | Approximate no. records |
|---|---|---|---|
| 0: Purpose | Aquatic toxicity data and metadata | Initial pull of available information from databases listed in Table | 220 000 |
| 1: Relevance | Trophic designations | Fish, amphibian, invertebrate, algae | 158 000 |
| 2: Validity | CAS | CAS present | 132 000 |
| | Required fields | Effect value/units, duration, test statistic, effect measured, source present | |
| | Qualifiers | Exclude effect values with qualifiers (e.g., <, >) | |
| | Effect | Specific measurement (e.g., EC50) | |
| 3: Acceptability | Duration | ≥24 h | 122 500 |
| | Test statistic | ≥5% and ≤70% effect measure (e.g., IC10, LC50), NOEC, LOEC, MATC | |
| | Effect | Abundance, biomass, cells, chlorophyll, emergence, filtration rate, gross primary productivity, growth, hatchability, intoxication, mortality, nitrogen fixation, population growth, population reduction, population change, primary production, regeneration, reproduction, shell deposition, teratogenesis | |
| | | Focus is on endpoints of regulatory significance and known use in decision‐making | |
| 4: Additional criteria | CAS, chemical name, SMILES | Harmonized. Database trimmed to only contain validated chemicals | 91 000 |
| | Metals | Inorganic compounds were collapsed to a “dummy metal ion” CAS | |
| | ID of duplicates | Removed records that were full duplicates (e.g., citation, species, test duration, test statistics, measured effect, effect level) | |
| | Removal of outliers | When a chemical had multiple experimental results, identified and removed outliers within a species and/or trophic level | |
Outliers are defined as 3 orders of magnitude away from the species geometric mean effect value for species tested ≥3 times for any given chemical–species pair; 4 orders of magnitude different from the trophic level geometric mean for a trophic group tested ≥3 times for any given chemical; 3 orders of magnitude from the trophic level geometric mean for a rare species if a trophic group had ≥30 individual entries for a given chemical.
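
The three outlier rules in this footnote can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the record layout, the function names, the reading of "away from" as an absolute log10 distance of at least the stated number of orders of magnitude, the inclusion of the candidate value in its own geometric mean, and the reading of "rare species" as one tested fewer than 3 times for the chemical are all assumptions.

```python
import math
from collections import defaultdict


def log10_geomean(values):
    """Return log10 of the geometric mean of a list of positive values."""
    return sum(math.log10(v) for v in values) / len(values)


def flag_outliers(records):
    """Return one boolean per record, True where the record is an outlier.

    Each record is a (cas, species, trophic_level, effect_value) tuple.
    This layout is hypothetical and is not the EnviroTox schema.
    """
    by_species = defaultdict(list)
    by_trophic = defaultdict(list)
    for cas, species, trophic, value in records:
        by_species[(cas, species)].append(value)
        by_trophic[(cas, trophic)].append(value)

    flags = []
    for cas, species, trophic, value in records:
        species_values = by_species[(cas, species)]
        trophic_values = by_trophic[(cas, trophic)]
        log_value = math.log10(value)
        outlier = False
        if len(species_values) >= 3:
            # Rule 1: species tested >= 3 times, 3 orders from the species mean.
            outlier = abs(log_value - log10_geomean(species_values)) >= 3
        elif len(trophic_values) >= 30:
            # Rule 3: rare species (assumed < 3 tests) in a well-populated
            # trophic group, 3 orders from the trophic-level mean.
            outlier = abs(log_value - log10_geomean(trophic_values)) >= 3
        if not outlier and len(trophic_values) >= 3:
            # Rule 2: trophic group tested >= 3 times, 4 orders from its mean.
            outlier = abs(log_value - log10_geomean(trophic_values)) >= 4
        flags.append(outlier)
    return flags


# Hypothetical records: four concordant results and one extreme value.
example = [("50000", "Danio rerio", "Fish", v) for v in (1.0, 1.0, 1.0, 1.0, 1e6)]
example_flags = flag_outliers(example)
```

Working on the log10 scale turns "orders of magnitude from the geometric mean" into a plain absolute difference, which keeps the three thresholds easy to compare.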
CAS = Chemical Abstracts Service; EC50 = 50% effective concentration; IC10 = 10% inhibitory concentration; LC50 = 50% lethal concentration; LOEC = lowest‐observed‐effect concentration; MATC = maximum acceptable toxicant concentration; NOEC = no‐observed‐effect concentration.
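
As a rough illustration of how the validity and acceptability criteria (steps 2 and 3 in the table above) could be applied to a single record, the following Python sketch encodes them as a predicate. The field names, the dictionary layout, and the assumption that a qualified effect value arrives as a string such as `"<10"` are hypothetical; the source does not describe how the filter was implemented.

```python
import re

# Effects accepted at step 3, as listed in the SIFT table.
ACCEPTED_EFFECTS = {
    "abundance", "biomass", "cells", "chlorophyll", "emergence",
    "filtration rate", "gross primary productivity", "growth", "hatchability",
    "intoxication", "mortality", "nitrogen fixation", "population growth",
    "population reduction", "population change", "primary production",
    "regeneration", "reproduction", "shell deposition", "teratogenesis",
}
NAMED_STATISTICS = {"NOEC", "LOEC", "MATC"}
# Hypothetical field names standing in for the required fields of step 2.
REQUIRED_FIELDS = ("cas", "effect_value", "units", "duration_h",
                   "test_statistic", "effect", "source")


def statistic_is_acceptable(statistic):
    """Accept NOEC, LOEC, MATC, or an ECx/ICx/LCx with 5 <= x <= 70."""
    statistic = statistic.strip().upper()
    if statistic in NAMED_STATISTICS:
        return True
    match = re.fullmatch(r"(?:EC|IC|LC)(\d+(?:\.\d+)?)", statistic)
    return bool(match) and 5 <= float(match.group(1)) <= 70


def passes_sift(record):
    """Return True if a record dict satisfies SIFT steps 2 and 3."""
    # Step 2: CAS and the other required fields must be present.
    if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return False
    # Step 2: reject qualified values such as "<10"; they do not parse as numbers.
    try:
        float(str(record["effect_value"]))
    except ValueError:
        return False
    # Step 3: duration, test statistic, and effect criteria.
    if record["duration_h"] < 24:
        return False
    if not statistic_is_acceptable(record["test_statistic"]):
        return False
    return record["effect"].strip().lower() in ACCEPTED_EFFECTS


# Hypothetical record used only to exercise the predicate.
good = {"cas": "50000", "effect_value": "1.5", "units": "mg/L", "duration_h": 96,
        "test_statistic": "LC50", "effect": "Mortality", "source": "ECOTOX"}
```

For example, `passes_sift(good)` is true, whereas the same record with `effect_value="<1.5"`, `duration_h=12`, or `test_statistic="EC90"` is rejected.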
Description of information included in the physico‐chemical file for the EnviroTox database
| Information | Description |
|---|---|
| CAS | Chemical Abstracts Service (CAS) number, no dashes or spaces |
| Chemical name | Commonly employed chemical name. |
| SMILES | Unified SMILES (Simplified Molecular Input Line Entry Specification) code associated with the chemical and CAS. |
| Desalted canonical SMILES | Open Babel (Open Babel |
| Molecular weight | Molecular weight in g/mol; generated from desalted SMILES using EpiSuite DermWin (USEPA 2018b) |
| Log Kow | Octanol‐water partition coefficient; unitless; EpiSuite KOWWIN (USEPA 2018b) used to populate Log Kow |
| Water Solubility | Solubility of the chemical in pure water (25 °C, 1 atmosphere) in mg/L; EpiSuite WSKOW (USEPA 2018b) used to populate water solubility from desalted canonical SMILES. Experimental values were used when available; modeled values were used otherwise. Effect values greater than 5 times the water solubility were flagged but not removed. |
| ECOSAR Classification | Assignment of chemical class based on desalted, canonical SMILES input to OECD QSAR Toolbox ( |
| ECOSAR Classification–collapsed | For chemicals where multiple classifications were generated by ECOSAR, the first reported was used. These categories were further collapsed into 46 more general categories. The complete list of ECOSAR classification collapsed assignments is available as Supplementary Information |
| USEPA New Chemical Categories | Original categories cited in the document “TSCA New Chemicals Program (NCP)/ Chemical Categories” (USEPA |
| Verhaar | Verhaar classes obtained via OECD QSAR Toolbox ( |
| TEST | Toxicity Estimation Software Tool (TEST) based on the MOAtox broad assignments as described by Barron et al. ( |
| OASIS | OASIS acute aquatic toxicity MOA obtained via OECD QSAR Toolbox ( |
| ASTER | ASTER (ASsessment Tool for Evaluating Risk) is a rule‐based expert system and is operated on a proprietary basis by US EPA based on the MOA categories in Russom et al. ( |
| Determined from SMILES | |
| Halogenated | Contains F, Cl, Br, I |
| Heavy Metal | Contains a heavy metal (metallic element with a density greater than 5 g/cm³) |
OECD = Organisation for Economic Co‐operation and Development; QSAR = quantitative structure–activity relationship; USEPA = US Environmental Protection Agency.
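
Two of the conventions in the table above lend themselves to a short Python sketch: storing CAS numbers without dashes or spaces, and flagging (not removing) effect values above 5 times the water solubility. The function names are hypothetical. The check-digit test is the standard CAS Registry Number checksum and is included only as a plausible sanity check; the source does not say whether it was used.

```python
def normalize_cas(cas):
    """Strip dashes and spaces, e.g. '7732-18-5' becomes '7732185'."""
    return cas.replace("-", "").replace(" ", "")


def cas_checksum_ok(cas):
    """Check the CAS check digit (an assumption; not described in the source).

    The last digit must equal the weighted sum of the preceding digits
    modulo 10, with weights 1, 2, 3, ... counted from the right.
    """
    digits = normalize_cas(cas)
    if not digits.isdigit() or len(digits) < 5:
        return False
    body, check = digits[:-1], int(digits[-1])
    total = sum(weight * int(d) for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def exceeds_solubility(effect_mg_per_l, solubility_mg_l, factor=5.0):
    """Return True when an effect value is more than `factor` times solubility.

    Both values are assumed to be in mg/L. The record is only flagged;
    the table states that such values were kept in the database.
    """
    return effect_mg_per_l > factor * solubility_mg_l
```

With these definitions, `normalize_cas("7732-18-5")` gives `"7732185"`, and an effect value of 60 mg/L for a chemical with a water solubility of 10 mg/L would be flagged, while 50 mg/L would not.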
Information included in the standalone (noninteractive) species file for taxa found in the EnviroTox database
| Information | Description |
|---|---|
| Latin name | Linnaean genus and species name |
| Trophic level | Algae, invertebrate, fish, amphibian, macrophyte, fungi |
| Taxonomic kingdom | Consensus‐based designation |
| Taxonomic phylum or division | Phylum (animal) or division (plant) |
| Taxonomic subphylum | Not always available |
| Taxonomic superclass | Not always available |
| Taxonomic class | Taxonomic class |
| Taxonomic order | Taxonomic order |
| Taxonomic family | Taxonomic family |
Figure 1. Acute and chronic designations for fish tests in the EnviroTox database. ASTM = ASTM International; ECx = x% effective concentration; LC50 = 50% lethal concentration; NOEC = no‐observed‐effect concentration; OECD = Organisation for Economic Co‐operation and Development; USEPA = US Environmental Protection Agency.
Summary of EnviroTox database acute and chronic data
| Trophic level | No. of species (total) | No. of species (acute tests) | No. of species (chronic tests) | No. of entries (acute tests) | No. of entries (chronic tests) |
|---|---|---|---|---|---|
| Algae | 196 | 191 | 89 | 6376 | 3998 |
| Amphibian | 37 | 37 | 5 | 375 | 10 |
| Invertebrate | 872 | 854 | 111 | 21 565 | 4066 |
| Fish | 458 | 455 | 79 | 50 689 | 4138 |
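
The rows of this summary table can be cross-checked against the totals quoted in the abstract (91 217 records and 1563 species). A small Python check, with the figures copied from the table and variable names chosen here for illustration:

```python
# (total species, acute entries, chronic entries) per trophic level,
# copied from the summary table above.
summary = {
    "Algae": (196, 6376, 3998),
    "Amphibian": (37, 375, 10),
    "Invertebrate": (872, 21565, 4066),
    "Fish": (458, 50689, 4138),
}

total_species = sum(species for species, _, _ in summary.values())
total_acute = sum(acute for _, acute, _ in summary.values())
total_chronic = sum(chronic for _, _, chronic in summary.values())
total_records = total_acute + total_chronic

print(total_species)   # 1563
print(total_acute)     # 79005
print(total_chronic)   # 12212
print(total_records)   # 91217
```

Acute tests therefore account for roughly 87% of all entries (79 005 of 91 217).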
Figure 2. Summary of toxicity values in the EnviroTox database. Effect concentrations are grouped by their trophic level (algae, amphibian, fish, invertebrates) and colored based on the experimental duration (acute = red; chronic = blue). CAS = Chemical Abstracts Service.
Figure 3. Number of data points for each species in the EnviroTox database. The database contains a total of 196 algae, 37 amphibian, 458 fish, and 872 invertebrate species.
Figure 4. Chronic (blue) and acute (red) toxicity values for the 10 most common ECOSAR classes, separated by trophic level. The box represents the 25th, 50th, and 75th percentiles. Outliers were identified as toxicity values more than 1.5 times the interquartile range beyond the box and are plotted as dots. Additional details (number of substances and species) on this figure are included in the Supplemental Data.
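
The box-plot outlier convention in the Figure 4 caption can be sketched in Python with the standard library. This assumes the conventional Tukey rule (values beyond 1.5 times the interquartile range from the first or third quartile) and the "inclusive" quantile method; the caption does not state which quantile method or data scale (raw or log-transformed) was used, and the function name is hypothetical.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartile box."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical toxicity values (e.g., in mg/L); only 100 lies beyond the fences.
print(tukey_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))  # [100]
```

Here the quartiles are 3, 5, and 7, so the fences sit at -3 and 13 and only the value 100 would be drawn as a dot.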
Figure 5. Chemical and data coverage within the Verhaar mode of action scheme with all acute data in the EnviroTox database. MOA = mode of action.
Figure 6. Distribution of acute fish effects, by Verhaar mode of action.