From data point timelines to a well curated data set, data mining of experimental data and chemical structure data from scientific articles, problems and possible solutions.

Villu Ruusmann, Uko Maran.

Abstract

The scientific literature is an important source of experimental and chemical structure data. Very often this data has been harvested into data collections of varying size, leaving data quality and curation issues on the shoulders of users. The current research presents a systematic and reproducible workflow for collecting series of data points from the scientific literature and assembling a database that is suitable for high-quality modelling and decision support. The quality assurance aspect of the workflow is concerned with the curation of both chemical structures and associated toxicity values at (1) the level of a single data point and (2) the level of a collection of data points. The assembly of the database employs a novel "timeline" approach. The workflow is implemented as a software solution, and its applicability is demonstrated on the example of the Tetrahymena pyriformis acute aquatic toxicity endpoint. A literature collection of 86 primary publications for T. pyriformis was found to contain 2,072 chemical compounds and 2,498 unique toxicity values, which divide into 2,440 numerical and 58 textual values. Every chemical compound was assigned a preferred toxicity value. Examples of the most common chemical and toxicological data curation scenarios are discussed.

Year:  2013        PMID: 23884706     DOI: 10.1007/s10822-013-9664-4

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   3.686


  4 in total

1.  How should the completeness and quality of curated nanomaterial data be evaluated?

Authors:  Richard L Marchese Robinson; Iseult Lynch; Willie Peijnenburg; John Rumble; Fred Klaessig; Clarissa Marquardt; Hubert Rauscher; Tomasz Puzyn; Ronit Purian; Christoffer Åberg; Sandra Karcher; Hanne Vriens; Peter Hoet; Mark D Hoover; Christine Ogilvie Hendren; Stacey L Harper
Journal:  Nanoscale       Date:  2016-05-04       Impact factor: 7.790

2.  QSAR DataBank repository: open and linked qualitative and quantitative structure-activity relationship models.

Authors:  V Ruusmann; S Sild; U Maran
Journal:  J Cheminform       Date:  2015-06-25       Impact factor: 5.514

3.  Automated Extraction of Information From Texts of Scientific Publications: Insights Into HIV Treatment Strategies.

Authors:  Nadezhda Biziukova; Olga Tarasova; Sergey Ivanov; Vladimir Poroikov
Journal:  Front Genet       Date:  2020-12-22       Impact factor: 4.599

4.  QSAR DataBank - an approach for the digital organization and archiving of QSAR model information.

Authors:  Villu Ruusmann; Sulev Sild; Uko Maran
Journal:  J Cheminform       Date:  2014-05-14       Impact factor: 5.514
