How Consistent are Publicly Reported Cytotoxicity Data? Large-Scale Statistical Analysis of the Concordance of Public Independent Cytotoxicity Measurements.

Isidro Cortés-Ciriano, Andreas Bender.

Abstract

While increasing attention is being paid to the impact of data quality in cell-line sensitivity and toxicology modeling, no systematic study to date has evaluated the comparability of independent cytotoxicity measurements on a large scale. Here, we estimate the experimental uncertainty of public cytotoxicity data from ChEMBL version 19. We applied stringent filtering criteria to assemble a curated data set comprising pIC50 data for compound-cell line systems measured in independent laboratories. The estimated experimental uncertainty corresponded to a mean unsigned error (MUE) of 0.61-0.76, a median unsigned error (MedUE) of 0.51-0.58, and a standard deviation of 0.76-1.00 pIC50 units. The experimental uncertainty (σE) estimated from all pairs of cytotoxicity measurements with a ΔpIC50 value lower than 2.5 was 0.59-0.77 pIC50 units, and thus 21-60% and 21-26% higher than that of pKi and pIC50 data for ligand-protein data (σE = 0.47-0.48 pKi units and σE = 0.57-0.61 pIC50 units, respectively). The σE value estimated from the pairs of pIC50 values measured with metabolic assays was 0.98, whereas it was 0.69 for the 1388 pIC50 pairs measured using exactly the same experimental setup. The maximum achievable squared Pearson correlation coefficient (R² Pearson max.) of in silico models trained on cytotoxicity data from different laboratories was estimated to be 0.51-0.85, which is considerably lower than the value of 1 corresponding to perfect predictions and indicates the maximum performance one can expect from computational cytotoxicity predictions. The lowest concordance between pairs of measurements was found for the drugs paclitaxel, methotrexate, zidovudine, and docetaxel, and for the cell lines HepG2, NCI-H460, L1210, and CCRF-CEM, suggesting that these systems are particularly sensitive to the experimental setup. The highest concordance was estimated for the compound-cell line system HL-60-etoposide (σE = 0.70), whereas the lowest was estimated for L1210-methotrexate (σE = 1.68). We found that annotation errors are responsible for the high discordance observed for some pairs of measurements, underscoring the importance of data curation when automatically extracting cytotoxicity data from public databases. Likewise, these results highlight the importance of estimating compound cytotoxicity with assays that provide complementary biological information (i.e., metabolic assays, clonogenic assays, and assays based on cell membrane integrity), especially when the mechanism of action of the test compounds is unknown. From this analysis, guidelines can be created on the reliability of cytotoxicity data from public databases, which could ultimately prove valuable for modeling purposes and for guiding the reporting of data in the literature.
© 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
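
The summary statistics named in the abstract (MUE, MedUE, the standard deviation of paired differences, σE, and a ceiling on achievable R²) can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: that the two measurements in a pair carry independent errors of equal variance, so that σE is roughly the standard deviation of the paired differences divided by √2, and that a perfect model scored against noisy labels reaches at most R² = σ_true² / (σ_true² + σE²). The function names, the `max_delta` parameter, and the example pairs are hypothetical and are not taken from the paper.

```python
import math
import statistics


def pair_stats(pairs, max_delta=None):
    """Summarise disagreement between paired pIC50 measurements.

    pairs: iterable of (pIC50_lab_a, pIC50_lab_b) for the same
    compound-cell line system. max_delta optionally discards pairs
    whose absolute difference is not below the threshold.
    """
    diffs = [a - b for a, b in pairs]
    if max_delta is not None:
        diffs = [d for d in diffs if abs(d) < max_delta]
    abs_diffs = [abs(d) for d in diffs]
    n = len(diffs)
    # Root-mean-square difference: the ordering within a pair is
    # arbitrary, so the mean difference is taken to be zero.
    sd_pairs = math.sqrt(sum(d * d for d in diffs) / n)
    return {
        "n": n,
        "mue": statistics.fmean(abs_diffs),      # mean unsigned error
        "medue": statistics.median(abs_diffs),   # median unsigned error
        "sd_pairs": sd_pairs,
        # Assumption: equal, independent errors in both labs.
        "sigma_e": sd_pairs / math.sqrt(2),
    }


def max_r2(sigma_true, sigma_e):
    """Upper bound on squared Pearson correlation for a perfect model
    evaluated against labels carrying Gaussian noise of size sigma_e
    (assumed analytic form, not necessarily the paper's procedure)."""
    return sigma_true ** 2 / (sigma_true ** 2 + sigma_e ** 2)


# Made-up pairs for illustration only.
example_pairs = [(6.0, 6.5), (5.2, 5.0), (7.1, 6.1), (4.8, 7.6)]
stats = pair_stats(example_pairs, max_delta=2.5)
ceiling = max_r2(sigma_true=1.0, sigma_e=stats["sigma_e"])
```

In this sketch the ceiling depends on the spread of the true activities as well as on σE, so a narrow activity range lowers the achievable correlation even when the measurement noise is unchanged.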

Keywords:  assay reproducibility; computational medicinal chemistry; cytotoxicity data; data modeling and prediction; experimental uncertainty

Year:  2015        PMID: 26541361     DOI: 10.1002/cmdc.201500424

Source DB:  PubMed          Journal:  ChemMedChem        ISSN: 1860-7179            Impact factor:   3.466


  5 in total

1.  Modelling compound cytotoxicity using conformal prediction and PubChem HTS data.

Authors:  Fredrik Svensson; Ulf Norinder; Andreas Bender
Journal:  Toxicol Res (Camb)       Date:  2016-10-31       Impact factor: 3.524

2.  Data Mining Approach for Extraction of Useful Information About Biologically Active Compounds from Publications.

Authors:  Olga A Tarasova; Nadezhda Yu Biziukova; Dmitry A Filimonov; Vladimir V Poroikov; Marc C Nicklaus
Journal:  J Chem Inf Model       Date:  2019-09-10       Impact factor: 4.956

3.  QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction.

Authors:  Isidro Cortés-Ciriano; Ctibor Škuta; Andreas Bender; Daniel Svozil
Journal:  J Cheminform       Date:  2020-06-05       Impact factor: 5.514

4.  A decision-theoretic approach to the evaluation of machine learning algorithms in computational drug discovery.

Authors:  Oliver P Watson; Isidro Cortes-Ciriano; Aimee R Taylor; James A Watson
Journal:  Bioinformatics       Date:  2019-11-01       Impact factor: 6.937

5.  Uncertainty quantification: Can we trust artificial intelligence in drug discovery? (Review)

Authors:  Jie Yu; Dingyan Wang; Mingyue Zheng
Journal:  iScience       Date:  2022-07-21