How Consistent are Publicly Reported Cytotoxicity Data? Large-Scale Statistical Analysis of the Concordance of Public Independent Cytotoxicity Measurements.

Isidro Cortés-Ciriano, Andreas Bender.

Abstract

While increased attention is being paid to the impact of data quality in cell-line sensitivity and toxicology modeling, to date, no systematic study has evaluated the comparability of independent cytotoxicity measurements on a large scale. Here, we estimate the experimental uncertainty of public cytotoxicity data from ChEMBL version 19. We applied stringent filtering criteria to assemble a curated data set comprising pIC50 data for compound-cell line systems measured in independent laboratories. The estimated experimental uncertainty corresponded to a mean unsigned error (MUE) of 0.61-0.76, a median unsigned error (MedUE) of 0.51-0.58, and a standard deviation of 0.76-1.00 pIC50 units. The experimental uncertainty (σE) estimated from all pairs of cytotoxicity measurements with a ΔpIC50 value lower than 2.5 was found to be 0.59-0.77 pIC50 units, and thus 21-60% and 21-26% higher than that of pKi and pIC50 data for ligand-protein data (σE = 0.47-0.48 pKi units and σE = 0.57-0.61 pIC50 units, respectively). The estimated σE value from the pairs of pIC50 values measured with metabolic assays was 0.98, whereas the σE value was found to be 0.69 when using the 1388 pIC50 pairs measured using exactly the same experimental setup. The maximum achievable Pearson correlation coefficient (R²Pearson,max) of in silico models trained on cytotoxicity data from different laboratories was estimated to be 0.51-0.85, which is considerably lower than the value of 1 corresponding to perfect predictions and indicates the upper limit of performance that can be expected from computational cytotoxicity predictions. The lowest concordance between pairs of measurements was found for the drugs paclitaxel, methotrexate, zidovudine, and docetaxel, and for the cell lines HepG2, NCI-H460, L1210, and CCRF-CEM, hinting at particular sensitivity of those systems to experimental setups.
The highest concordance was estimated for the compound-cell line system HL-60-etoposide (σE = 0.70), and the lowest for L1210-methotrexate (σE = 1.68). We found that annotation errors are responsible for the high discordance observed for some pairs of measurements, which underscores the importance of data curation when automatically extracting cytotoxicity data from public databases. Likewise, these results highlight the importance of estimating compound cytotoxicity with assays providing complementary biological information (i.e., metabolic assays, clonogenic assays, and assays based on cell membrane integrity), especially when the mechanism of action of test compounds is unknown. From this analysis, guidelines can be created on the reliability of cytotoxicity data from public databases, which could ultimately prove valuable for modeling purposes and for guiding the reporting of data in the literature.
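
The pair-based statistics named in the abstract (MUE, MedUE, standard deviation, and σE) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that σE is obtained as the sample standard deviation of the paired differences divided by √2, a convention commonly used for duplicate measurements, and that the ΔpIC50 cutoff is applied to the absolute difference. The function name `pair_uncertainty`, its parameters, and the example values are hypothetical.

```python
import math
import statistics


def pair_uncertainty(pairs, max_delta=None):
    """Summarise agreement between paired pIC50 measurements.

    pairs     -- iterable of (pIC50_a, pIC50_b), two independent measurements
                 of the same compound-cell line system
    max_delta -- optional cutoff; pairs with |delta| >= max_delta are dropped
                 (assumption: the cutoff applies to the absolute difference)
    """
    deltas = [a - b for a, b in pairs]
    if max_delta is not None:
        deltas = [d for d in deltas if abs(d) < max_delta]
    if len(deltas) < 2:
        raise ValueError("need at least two pairs to estimate a spread")

    abs_deltas = [abs(d) for d in deltas]
    sd = statistics.stdev(deltas)  # sample standard deviation of the differences
    return {
        "n_pairs": len(deltas),
        "MUE": statistics.fmean(abs_deltas),     # mean unsigned error
        "MedUE": statistics.median(abs_deltas),  # median unsigned error
        "SD": sd,
        # Assumption: both measurements carry the same independent error,
        # so the per-measurement uncertainty is SD / sqrt(2).
        "sigma_E": sd / math.sqrt(2),
    }


if __name__ == "__main__":
    # Invented illustrative values, not data from the study.
    example_pairs = [(6.0, 6.5), (5.2, 5.0), (7.1, 6.1), (4.8, 8.0)]
    for name, value in pair_uncertainty(example_pairs, max_delta=2.5).items():
        print(name, value)
```

With `max_delta=2.5` the last example pair (difference of 3.2) is excluded, mirroring the abstract's restriction to pairs with a ΔpIC50 value lower than 2.5.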

Keywords: assay reproducibility; computational medicinal chemistry; cytotoxicity data; data modeling and prediction; experimental uncertainty

Year: 2015; PMID: 26541361; DOI: 10.1002/cmdc.201500424

Source DB: PubMed; Journal: ChemMedChem; ISSN: 1860-7179; Impact factor: 3.466

Cited: 5 in total

1. Modelling compound cytotoxicity using conformal prediction and PubChem HTS data.

2. Data Mining Approach for Extraction of Useful Information About Biologically Active Compounds from Publications.

3. QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction.

4. A decision-theoretic approach to the evaluation of machine learning algorithms in computational drug discovery.

5. Uncertainty quantification: Can we trust artificial intelligence in drug discovery? (Review)