
GPU-accelerated Chemical Similarity Assessment for Large Scale Databases.

Marco Maggioni, Marco Domenico Santambrogio, Jie Liang.

Abstract

The assessment of chemical similarity between molecules is a basic operation in chemoinformatics, a computational field concerned with the manipulation of chemical structural information. Comparing molecules is the basis for a wide range of applications, such as searching chemical databases, training prediction models for virtual screening, and aggregating clusters of similar compounds. However, currently available databases with millions of compounds are a challenge for conventional chemoinformatics algorithms, creating a need for faster similarity methods. In this paper, we extensively analyze the advantages of using many-core architectures to calculate commonly used chemical similarity coefficients such as Tanimoto, Dice, and Cosine. Our aim is to provide a broad proof of concept of the usefulness of GPU architectures for chemoinformatics, a class of computing problems still largely unexplored on this hardware. We present a general GPU algorithm for all-to-all chemical comparisons that supports both binary fingerprints and floating-point descriptors as molecule representations. We then apply optimization techniques to minimize global memory accesses and further improve efficiency. We test the proposed algorithm on two experimental setups: a laptop with a low-end GPU and a desktop with a higher-performance GPU. In the former case, we obtain a 4-to-6-fold speed-up over a single-core implementation for fingerprints and a 4-to-7-fold speed-up for descriptors. In the latter case, we obtain a 195-to-206-fold speed-up for fingerprints and a 100-to-328-fold speed-up for descriptors.
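For orientation, the three coefficients named in the abstract can be sketched on binary fingerprints as follows. This is a minimal single-core CPU illustration in Python, not the paper's GPU algorithm. The function names (`popcount`, `similarities`, `all_to_all`), the use of Python integers as bit vectors, and the convention of returning 0.0 for empty fingerprints are all assumptions made for this example. With a and b the numbers of set bits in each fingerprint and c the number of bits set in both, the standard definitions are Tanimoto = c / (a + b - c), Dice = 2c / (a + b), and Cosine = c / sqrt(a * b).

```python
from math import sqrt


def popcount(x: int) -> int:
    """Count the set bits of a fingerprint stored as a Python int."""
    return bin(x).count("1")


def similarities(fp_a: int, fp_b: int) -> dict:
    """Return Tanimoto, Dice and Cosine coefficients for two binary fingerprints.

    Assumption for this sketch: a coefficient whose denominator is zero
    (empty fingerprint) is reported as 0.0.
    """
    a = popcount(fp_a)          # bits set in the first fingerprint
    b = popcount(fp_b)          # bits set in the second fingerprint
    c = popcount(fp_a & fp_b)   # bits set in both
    union = a + b - c
    return {
        "tanimoto": c / union if union else 0.0,
        "dice": 2 * c / (a + b) if (a + b) else 0.0,
        "cosine": c / sqrt(a * b) if (a and b) else 0.0,
    }


def all_to_all(fingerprints, coefficient="tanimoto"):
    """Naive all-to-all comparison: one similarity value per ordered pair."""
    return [
        [similarities(p, q)[coefficient] for q in fingerprints]
        for p in fingerprints
    ]


if __name__ == "__main__":
    fps = [0b1011, 0b0110, 0b1111]  # toy 4-bit fingerprints
    for row in all_to_all(fps):
        print(row)
```

The nested loop in `all_to_all` does work quadratic in the number of molecules, and each pair is independent of every other pair, which is the property that makes the problem a natural candidate for many-core hardware.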


Keywords:  GPU; Tanimoto coefficient; chemical fingerprints; chemical similarity; chemoinformatics

Year:  2011        PMID: 27774113      PMCID: PMC5072535          DOI: 10.1016/j.procs.2011.04.219

Source DB:  PubMed          Journal:  Procedia Comput Sci


