QSAR Modeling Using Large-Scale Databases: Case Study for HIV-1 Reverse Transcriptase Inhibitors.

Olga A Tarasova, Aleksandra F Urusova, Dmitry A Filimonov, Marc C Nicklaus, Alexey V Zakharov, Vladimir V Poroikov.

Abstract

Large-scale databases are important sources of training sets for various QSAR modeling approaches. Generally, these databases contain information extracted from different sources. This variety of sources can produce inconsistency in the data, defined here as activity results for the same compound against the same target that sometimes diverge widely. Because such inconsistency can reduce the accuracy of predictive models built from these data, we address the question of how best to use data from publicly and commercially accessible databases to create accurate and predictive QSAR models. We investigate the suitability of commercially and publicly available databases for QSAR modeling of antiviral activity (HIV-1 reverse transcriptase (RT) inhibition). We present several methods for creating modeling (i.e., training and test) sets from two databases, one commercial and one freely available: Thomson Reuters Integrity and ChEMBL. We found that the typical predictivities of QSAR models obtained using these different modeling set compilation methods differ significantly from each other. The best results were obtained using training sets compiled for compounds tested using only one method and material (i.e., a specific type of biological assay). Compound sets aggregated by target only typically yielded poorly predictive models. We discuss the possibility of "mix-and-matching" assay data across aggregating databases such as ChEMBL and Integrity, and the current severe limitations of these databases for this purpose. One such limitation is the general lack of complete and semantic/computer-parsable descriptions of assay methodology in these databases, which would be needed to determine the mix-and-matchability of result sets at the assay level.
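
The two ideas in the abstract, inconsistency of activity values for one compound against one target, and compiling modeling sets per assay rather than per target, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the record layout, the compound and assay identifiers, the activity values, and the 1.0 log-unit spread threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical records: (compound_id, target, assay_id, activity in -log10 units).
# All identifiers and values are invented for illustration.
RECORDS = [
    ("cmpd_A", "HIV-1 RT", "assay_1", 7.2),
    ("cmpd_A", "HIV-1 RT", "assay_2", 5.1),
    ("cmpd_B", "HIV-1 RT", "assay_1", 6.0),
    ("cmpd_B", "HIV-1 RT", "assay_2", 6.3),
    ("cmpd_C", "HIV-1 RT", "assay_1", 8.1),
]


def inconsistent_pairs(records, max_spread=1.0):
    """Map (compound, target) to its activity spread when that spread
    exceeds max_spread (an assumed threshold, in log units)."""
    values = defaultdict(list)
    for compound, target, _assay, activity in records:
        values[(compound, target)].append(activity)
    return {
        key: max(vals) - min(vals)
        for key, vals in values.items()
        if len(vals) > 1 and max(vals) - min(vals) > max_spread
    }


def sets_by_assay(records):
    """Build one candidate modeling set per assay, so that every value
    in a set comes from the same method and material."""
    sets = defaultdict(list)
    for compound, _target, assay, activity in records:
        sets[assay].append((compound, activity))
    return dict(sets)


if __name__ == "__main__":
    print(inconsistent_pairs(RECORDS))
    print({assay: len(rows) for assay, rows in sets_by_assay(RECORDS).items()})
```

In this toy data only `cmpd_A` is flagged, because its two values differ by more than the assumed threshold; grouping by assay keeps such divergent measurements in separate sets instead of merging them under the shared target.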

Year:  2015        PMID: 26046311      PMCID: PMC7738000          DOI: 10.1021/acs.jcim.5b00019

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  11 in total

Review 1.  QSAR without borders.

Authors:  Eugene N Muratov; Jürgen Bajorath; Robert P Sheridan; Igor V Tetko; Dmitry Filimonov; Vladimir Poroikov; Tudor I Oprea; Igor I Baskin; Alexandre Varnek; Adrian Roitberg; Olexandr Isayev; Stefano Curtarolo; Denis Fourches; Yoram Cohen; Alan Aspuru-Guzik; David A Winkler; Dimitris Agrafiotis; Artem Cherkasov; Alexander Tropsha
Journal:  Chem Soc Rev       Date:  2020-05-01       Impact factor: 54.564

2.  Data Mining Approach for Extraction of Useful Information About Biologically Active Compounds from Publications.

Authors:  Olga A Tarasova; Nadezhda Yu Biziukova; Dmitry A Filimonov; Vladimir V Poroikov; Marc C Nicklaus
Journal:  J Chem Inf Model       Date:  2019-09-10       Impact factor: 4.956

3.  Ranking-Oriented Quantitative Structure-Activity Relationship Modeling Combined with Assay-Wise Data Integration.

Authors:  Katsuhisa Matsumoto; Tomoyuki Miyao; Kimito Funatsu
Journal:  ACS Omega       Date:  2021-04-28

Review 4.  HIV Resistance Prediction to Reverse Transcriptase Inhibitors: Focus on Open Data.

Authors:  Olga Tarasova; Vladimir Poroikov
Journal:  Molecules       Date:  2018-04-19       Impact factor: 4.411

5.  (Q)SAR Models of HIV-1 Protein Inhibition by Drug-Like Compounds.

Authors:  Leonid A Stolbov; Dmitry S Druzhilovskiy; Dmitry A Filimonov; Marc C Nicklaus; Vladimir V Poroikov
Journal:  Molecules       Date:  2019-12-25       Impact factor: 4.411

6.  Prediction of pharmacological activities from chemical structures with graph convolutional neural networks.

Authors:  Miyuki Sakai; Kazuki Nagayasu; Norihiro Shibui; Chihiro Andoh; Kaito Takayama; Hisashi Shirakawa; Shuji Kaneko
Journal:  Sci Rep       Date:  2021-01-12       Impact factor: 4.379

7.  Automated Extraction of Information From Texts of Scientific Publications: Insights Into HIV Treatment Strategies.

Authors:  Nadezhda Biziukova; Olga Tarasova; Sergey Ivanov; Vladimir Poroikov
Journal:  Front Genet       Date:  2020-12-22       Impact factor: 4.599

Review 8.  Computational drug design strategies applied to the modelling of human immunodeficiency virus-1 reverse transcriptase inhibitors.

Authors:  Lucianna Helene Santos; Rafaela Salgado Ferreira; Ernesto Raúl Caffarena
Journal:  Mem Inst Oswaldo Cruz       Date:  2015-11       Impact factor: 2.743

9.  A Chemographic Audit of anti-Coronavirus Structure-activity Information from Public Databases (ChEMBL).

Authors:  Dragos Horvath; Alexey Orlov; Dmitry I Osolodkin; Aydar A Ishmukhametov; Gilles Marcou; Alexandre Varnek
Journal:  Mol Inform       Date:  2020-05-14       Impact factor: 4.050

10.  A Computational Approach for the Prediction of Treatment History and the Effectiveness or Failure of Antiretroviral Therapy.

Authors:  Olga Tarasova; Nadezhda Biziukova; Dmitry Kireev; Alexey Lagunin; Sergey Ivanov; Dmitry Filimonov; Vladimir Poroikov
Journal:  Int J Mol Sci       Date:  2020-01-23       Impact factor: 5.923
