
Effect of training data size and noise level on support vector machines virtual screening of genotoxic compounds from large compound libraries.

Pankaj Kumar, Xiaohua Ma, Xianghui Liu, Jia Jia, Han Bucong, Ying Xue, Ze Rong Li, Sheng Yong Yang, Yu Quan Wei, Yu Zong Chen.

Abstract

Various in vitro and in-silico methods have been used for drug genotoxicity tests, which show limited genotoxicity (GT+) and non-genotoxicity (GT-) identification rates. New methods and combinatorial approaches have been explored for enhanced collective identification capability. The rates of in-silico methods may be further improved by significantly diversified training data enriched by the large number of recently reported GT+ and GT- compounds, but a major concern is the increased noise level arising from the high false-positive rates of in vitro data. In this work, we evaluated the effect of training data size and noise level on the performance of the support vector machine (SVM) method, which is known to tolerate high noise levels in training data. Two SVMs of different diversity/noise levels were developed and tested. H-SVM, trained on higher-diversity, higher-noise data (GT+ in any in vivo or in vitro test), outperforms L-SVM, trained on lower-noise, lower-diversity data (GT+ in in vivo or Ames test only). H-SVM, trained on 4,763 GT+ compounds reported before 2008 and 8,232 GT- compounds excluding clinical trial drugs, correctly identified 81.6% of the 38 GT+ compounds reported since 2008, and predicted 83.1% of the 2,008 clinical trial drugs as GT-, 23.96% of 168K MDDR compounds as GT+, and 27.23% of 17.86M PubChem compounds as GT+. These are comparable to the 43.1-51.9% GT+ and 75-93% GT- rates of existing in-silico methods, the 58.8% GT+ and 79% GT- rates of the Ames method, and the estimated percentages of 23% in vivo and 31-33% in vitro GT+ compounds in the "universe of chemicals". There is a substantial level of agreement between the GT+ and GT- MDDR compounds predicted by H-SVM and L-SVM and the predictions from TOPKAT. SVM showed good potential in identifying GT+ compounds from large compound libraries based on higher-diversity, higher-noise training data.
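To give a sense of scale, the reported H-SVM rates can be converted into approximate compound counts. The sketch below is illustrative only: the helper name `approx_count` and the dictionary labels are assumptions introduced here, and the MDDR and PubChem library sizes are rounded in the abstract (168K, 17.86M), so the derived counts for those libraries are rough estimates rather than reported figures.

```python
# Rates (percent) and set sizes as stated in the abstract.
# NOTE: the MDDR and PubChem sizes are rounded in the source, so the
# counts derived from them are approximations, not reported values.
REPORTED = {
    "new GT+ compounds identified as GT+": (81.6, 38),
    "clinical trial drugs predicted GT-": (83.1, 2_008),
    "MDDR compounds predicted GT+": (23.96, 168_000),
    "PubChem compounds predicted GT+": (27.23, 17_860_000),
}


def approx_count(rate_percent, total):
    """Return the nearest whole number of compounds implied by a percentage."""
    return round(rate_percent / 100 * total)


# Hypothetical summary table: label -> approximate number of compounds.
counts = {
    label: approx_count(rate, total)
    for label, (rate, total) in REPORTED.items()
}

for label, n in counts.items():
    print(f"{label}: ~{n:,}")
```

For the smallest set the conversion is exact: 81.6% of 38 corresponds to 31 compounds (31/38 is about 81.6%). For the large libraries the same rates imply tens of thousands of MDDR compounds and several million PubChem compounds flagged as GT+.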

Year:  2011        PMID: 21556903     DOI: 10.1007/s10822-011-9431-3

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   3.686


