A new semi-automated workflow for chemical data retrieval and quality checking for modeling applications.

Domenico Gadaleta, Anna Lombardo, Cosimo Toma, Emilio Benfenati.

Abstract

The quality of data used for QSAR model derivation is extremely important as it strongly affects the final robustness and predictive power of the model. Ambiguous or wrong structures need to be carefully checked, because they lead to errors in the calculation of descriptors and hence to meaningless results. The increasing amounts of data, however, have often made it hard to check very large databases manually. In light of this, we designed and implemented a semi-automated workflow integrating structural data retrieval from several web-based databases, automated comparison of these data, chemical structure cleaning, and selection and standardization of data into a consistent, ready-to-use format that can be employed for modeling. The workflow integrates best practices for data curation that have been suggested in the recent literature. It has been implemented with the freely available KNIME software and is freely available to the cheminformatics community for improvement and application to a broad range of chemical datasets.
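The "automated comparison" step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' KNIME workflow: the function name `consensus_structure`, the source names, and the status labels are all hypothetical, and plain string equality stands in for proper structure canonicalization, which a real pipeline would need. The sketch only shows the idea of accepting a structure when sources agree and deferring to manual checking when they do not.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def consensus_structure(records: Dict[str, Optional[str]]) -> Tuple[Optional[str], str]:
    """Compare structures retrieved for one chemical from several sources.

    `records` maps a (hypothetical) source name to the structure string it
    returned, or None when the lookup failed. ASSUMPTION: string equality
    after stripping whitespace stands in for real canonicalization.
    """
    # Keep only non-empty structures that were actually retrieved.
    found = [s.strip() for s in records.values() if s and s.strip()]
    if not found:
        return None, "missing"

    counts = Counter(found)
    top, top_count = counts.most_common(1)[0]

    if len(counts) == 1:
        return top, "consistent"      # every source agrees
    if top_count > len(found) / 2:
        return top, "majority"        # strict majority of sources agree
    return None, "manual_check"       # no clear winner: defer to a human


# Hypothetical retrieval results for a single chemical.
retrieved = {"source_a": "CCO", "source_b": "CCO", "source_c": "CC(=O)O"}
print(consensus_structure(retrieved))  # prints ('CCO', 'majority')
```

Returning a status alongside the structure keeps the process semi-automated: records flagged `manual_check` or `missing` can be routed to a curator while the rest proceed to cleaning and standardization.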

Keywords:  Data cleaning; Data curation; QSAR; Semi-automated; Workflow

Year:  2018        PMID: 30536051      PMCID: PMC6503381          DOI: 10.1186/s13321-018-0315-6

Source DB:  PubMed          Journal:  J Cheminform        ISSN: 1758-2946            Impact factor:   5.514


  7 in total

1.  A ligand-based computational drug repurposing pipeline using KNIME and Programmatic Data Access: case studies for rare diseases and COVID-19.

Authors:  Alzbeta Tuerkova; Barbara Zdrazil
Journal:  J Cheminform       Date:  2020-11-25       Impact factor: 5.514

2.  Monte Carlo Models for Sub-Chronic Repeated-Dose Toxicity: Systemic and Organ-Specific Toxicity.

Authors:  Gianluca Selvestrel; Giovanna J Lavado; Alla P Toropova; Andrey A Toropov; Domenico Gadaleta; Marco Marzo; Diego Baderna; Emilio Benfenati
Journal:  Int J Mol Sci       Date:  2022-06-14       Impact factor: 6.208

3.  Setting the stage for next-generation risk assessment with non-animal approaches: the EU-ToxRisk project experience.

Authors:  M J Moné; G Pallocca; S E Escher; T Exner; M Herzler; S Hougaard Bennekou; H Kamp; E D Kroese; Marcel Leist; T Steger-Hartmann; B van de Water
Journal:  Arch Toxicol       Date:  2020-09-04       Impact factor: 5.153

4.  Processing binding data using an open-source workflow.

Authors:  Errol L G Samuel; Secondra L Holmes; Damian W Young
Journal:  J Cheminform       Date:  2021-12-11       Impact factor: 5.514

5.  Ligand-based prediction of hERG-mediated cardiotoxicity based on the integration of different machine learning techniques.

Authors:  Pietro Delre; Giovanna J Lavado; Giuseppe Lamanna; Michele Saviano; Alessandra Roncaglioni; Emilio Benfenati; Giuseppe Felice Mangiatordi; Domenico Gadaleta
Journal:  Front Pharmacol       Date:  2022-09-05       Impact factor: 5.988

6.  Towards an Understanding of the Mode of Action of Human Aromatase Activity for Azoles through Quantum Chemical Descriptors-Based Regression and Structure Activity Relationship Modeling Analysis.

Authors:  Chayawan Chayawan; Cosimo Toma; Emilio Benfenati; Ana Y Caballero Alfonso
Journal:  Molecules       Date:  2020-02-08       Impact factor: 4.411

7.  Prediction of the Neurotoxic Potential of Chemicals Based on Modelling of Molecular Initiating Events Upstream of the Adverse Outcome Pathways of (Developmental) Neurotoxicity.

Authors:  Domenico Gadaleta; Nicoleta Spînu; Alessandra Roncaglioni; Mark T D Cronin; Emilio Benfenati
Journal:  Int J Mol Sci       Date:  2022-03-11       Impact factor: 5.923
