Timur R Gimadiev1, Arkadii Lin2, Valentina A Afonina3, Dinar Batyrshin3, Ramil I Nugmanov3, Tagir Akhmetshin2,3, Pavel Sidorov1, Natalia Duybankova4, Jonas Verhoeven4, Joerg Wegner4, Hugo Ceulemans4, Andrey Gedich5, Timur I Madzhidov3, Alexandre Varnek1,2.
Abstract
The quality of experimental data for chemical reactions is a critical consideration for any reaction-driven study. However, the curation of reaction data has not been extensively discussed in the literature so far. Here, we suggest a four-step protocol that covers the curation of individual structures (reactants and products), chemical transformations, reaction conditions and endpoints. Its implementation in Python 3 using the CGRTools toolkit has been used to clean three popular reaction databases: Reaxys, USPTO and Pistachio. The curated USPTO database is available in the GitHub repository (Laboratoire-de-Chemoinformatique/Reaction_Data_Cleaning).
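To make the four curation stages named in the abstract concrete, the following is a minimal illustrative sketch in plain Python. It is not the authors' implementation and does not use CGRTools. The record layout (`ReactionRecord`), the function names and every filtering rule shown here are hypothetical assumptions chosen only to show how four successive checks could be chained; the actual rules are described in the paper and its repository.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ReactionRecord:
    """Hypothetical record layout; structures are plain strings (e.g. SMILES)."""
    reactants: Tuple[str, ...]
    products: Tuple[str, ...]
    temperature_c: Optional[float] = None
    yield_percent: Optional[float] = None


def structures_ok(rec: ReactionRecord) -> bool:
    # Stage 1 (assumed rule): both sides present and no blank structure strings.
    species = rec.reactants + rec.products
    return bool(rec.reactants) and bool(rec.products) and all(s.strip() for s in species)


def transformation_ok(rec: ReactionRecord) -> bool:
    # Stage 2 (assumed rule): something must actually change.
    return set(rec.reactants) != set(rec.products)


def conditions_ok(rec: ReactionRecord) -> bool:
    # Stage 3 (assumed rule): a reported temperature must be physically possible.
    return rec.temperature_c is None or rec.temperature_c >= -273.15


def endpoint_ok(rec: ReactionRecord) -> bool:
    # Stage 4 (assumed rule): a reported yield must lie within 0..100 %.
    return rec.yield_percent is None or 0.0 <= rec.yield_percent <= 100.0


def curate(records: Iterable[ReactionRecord]) -> List[ReactionRecord]:
    """Apply the four checks in order and drop exact duplicates, keeping input order."""
    checks = (structures_ok, transformation_ok, conditions_ok, endpoint_ok)
    seen = set()
    kept = []
    for rec in records:
        if rec in seen or not all(check(rec) for check in checks):
            continue
        seen.add(rec)
        kept.append(rec)
    return kept
```

A real pipeline would parse and standardize structures with a cheminformatics toolkit instead of comparing raw strings; the string comparison here only keeps the sketch self-contained.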
Keywords: Pistachio; Reaxys; USPTO; big data; chemical reactions; data cleaning
Year: 2021 PMID: 34427989 DOI: 10.1002/minf.202100119
Source DB: PubMed Journal: Mol Inform ISSN: 1868-1743 Impact factor: 3.353