| Literature DB >> 30273487 |
Fergus Imrie1, Anthony R Bradley2,3,4, Mihaela van der Schaar5,6, Charlotte M Deane1.
Abstract
Machine learning has shown enormous potential for computer-aided drug discovery. Here we show how modern convolutional neural networks (CNNs) can be applied to structure-based virtual screening. We have coupled our densely connected CNN (DenseNet) with a transfer learning approach which we use to produce an ensemble of protein family-specific models. We conduct an in-depth empirical study and provide the first guidelines on the minimum requirements for adopting a protein family-specific model. Our method also highlights the need for additional data, even in data-rich protein families. Our approach outperforms recent benchmarks on the DUD-E data set and an independent test set constructed from the ChEMBL database. Using a clustered cross-validation on DUD-E, we achieve an average AUC ROC of 0.92 and a 0.5% ROC enrichment factor of 79. This represents an improvement in early enrichment of over 75% compared to a recent machine learning benchmark. Our results demonstrate that the continued improvements in machine learning architecture for computer vision apply to structure-based virtual screening.Mesh:
Substances:
Year: 2018 PMID: 30273487 DOI: 10.1021/acs.jcim.8b00350
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956