
Brainstorming: weighted voting prediction of inhibitors for protein targets.

Dariusz Plewczynski

Abstract

The "Brainstorming" approach presented in this paper is a weighted voting method that can improve the quality of predictions generated by several machine learning (ML) methods. First, an ensemble of heterogeneous ML algorithms is trained on available experimental data, then all solutions are gathered and a consensus is built between them. The final prediction is performed using a voting procedure, whereby the vote of each method is weighted according to a quality coefficient calculated using multivariable linear regression (MLR). The MLR optimization procedure is very fast, therefore no additional computational cost is introduced by using this jury approach. Here, brainstorming is applied to selecting actives from large collections of compounds relating to five diverse biological targets of medicinal interest, namely HIV-reverse transcriptase, cyclooxygenase-2, dihydrofolate reductase, estrogen receptor, and thrombin. The MDL Drug Data Report (MDDR) database was used for selecting known inhibitors for these protein targets, and experimental data was then used to train a set of machine learning methods. The benchmark dataset (available at http://bio.icm.edu.pl/~darman/chemoinfo/benchmark.tar.gz) can be used for further testing of various clustering and machine learning methods when predicting the biological activity of compounds. Depending on the protein target, the overall recall value is raised by at least 20% in comparison to any single machine learning method (including ensemble methods like random forest) and unweighted simple majority voting procedures.
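The weighted voting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how the MLR is set up, so the sketch assumes that each base method emits a binary active/inactive vote, that the weights are fitted by ordinary least squares with an intercept, and that the weighted score is thresholded at 0.5. The function names (`fit_vote_weights`, `weighted_vote`, `majority_vote`, `recall`) and the toy votes are hypothetical.

```python
import numpy as np


def fit_vote_weights(base_preds, labels):
    """Fit one weight per base method (plus an intercept) by least squares.

    base_preds: (n_samples, n_methods) array of 0/1 votes.
    labels:     (n_samples,) array of known 0/1 activities.
    Assumption: ordinary least squares stands in for the MLR step.
    """
    X = np.column_stack([np.ones(len(labels)), base_preds])
    coef, *_ = np.linalg.lstsq(X, labels, rcond=None)
    return coef


def weighted_vote(base_preds, coef, threshold=0.5):
    """Combine votes using the fitted weights, then threshold the score."""
    X = np.column_stack([np.ones(base_preds.shape[0]), base_preds])
    return (X @ coef >= threshold).astype(int)


def majority_vote(base_preds):
    """Unweighted baseline: active if more than half of the methods say so."""
    return (base_preds.mean(axis=1) > 0.5).astype(int)


def recall(y_true, y_pred):
    """Fraction of true actives that were predicted active."""
    positives = y_true == 1
    return float((y_pred[positives] == 1).sum() / positives.sum())


# Toy data (made up for illustration): method 0 is reliable,
# methods 1 and 2 are noisy and often agree with each other.
labels = np.array([1, 1, 1, 0, 0, 0, 1, 0])
base_preds = np.column_stack([
    [1, 1, 1, 0, 0, 0, 1, 0],  # method 0
    [0, 1, 0, 0, 1, 0, 1, 1],  # method 1
    [0, 1, 0, 1, 1, 0, 0, 1],  # method 2
])

weights = fit_vote_weights(base_preds, labels)
print("weighted recall:", recall(labels, weighted_vote(base_preds, weights)))
print("majority recall:", recall(labels, majority_vote(base_preds)))
```

In this toy setting the regression learns to rely on the reliable method, while the unweighted majority is outvoted by the two noisy ones. In practice the weights would be fitted on training data and evaluated on held-out compounds; the toy example scores on the same data only to keep the sketch short.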

Year:  2010        PMID: 20857153      PMCID: PMC3168748          DOI: 10.1007/s00894-010-0854-x

Source DB:  PubMed          Journal:  J Mol Model        ISSN: 0948-5023            Impact factor:   1.810


