
Using inverted indices for accelerating LINGO calculations.

Thomas G Kristensen, Jesper Nielsen, Christian N S Pedersen.

Abstract

The ever-growing size of chemical databases calls for the development of novel methods for representing and comparing molecules. One such method, LINGO, is based on fragmenting the SMILES string representation of molecules. Molecules can then be compared by calculating the Tanimoto coefficient, which is called LINGOsim when applied to LINGO multisets. This paper introduces a verbose representation for storing LINGO multisets, which makes it possible to transform them into sparse fingerprints so that fingerprint data structures and algorithms can be used to accelerate queries. The previous best method for rapidly calculating the LINGOsim similarity matrix required specialized hardware to yield a significant speedup over existing methods. By storing LINGO multisets in the verbose representation and using inverted indices, it is possible to calculate LINGOsim similarity matrices roughly 2.6 times faster than existing methods without relying on specialized hardware.
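The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a LINGO is a length-4 substring of the SMILES string, that the "verbose representation" turns each repeated LINGO into distinct (LINGO, occurrence number) features, and it skips any SMILES preprocessing. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def lingos(smiles, q=4):
    """Multiset of all length-q substrings of a SMILES string (q=4 is an assumption)."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))

def lingosim(a, b):
    """Tanimoto coefficient on multisets: sum of min counts over sum of max counts."""
    inter = sum((a & b).values())   # Counter & keeps the minimum count per LINGO
    union = sum((a | b).values())   # Counter | keeps the maximum count per LINGO
    return inter / union if union else 0.0

def verbose(multiset):
    """Assumed verbose form: one (lingo, k) feature per occurrence, giving a plain set."""
    return {(lingo, k) for lingo, count in multiset.items() for k in range(count)}

def build_index(fingerprints):
    """Inverted index: feature -> list of ids of molecules containing that feature."""
    index = defaultdict(list)
    for mol_id, fp in enumerate(fingerprints):
        for feature in fp:
            index[feature].append(mol_id)
    return index

def query(index, sizes, fp):
    """Similarity of fp to every indexed molecule that shares at least one feature."""
    shared = Counter()
    for feature in fp:
        for mol_id in index.get(feature, ()):
            shared[mol_id] += 1
    # Set Tanimoto: |A and B| / (|A| + |B| - |A and B|)
    return {m: s / (len(fp) + sizes[m] - s) for m, s in shared.items()}

# Usage with three toy SMILES strings
mols = ["CCCCCC", "CCCCC", "CCCCO"]
multisets = [lingos(s) for s in mols]
fps = [verbose(m) for m in multisets]
index = build_index(fps)
sizes = [len(fp) for fp in fps]
result = query(index, sizes, fps[0])
```

Under these assumptions the set Tanimoto on the verbose features equals the multiset Tanimoto, because the k-th occurrence of a LINGO is shared exactly when both molecules contain at least k+1 copies. The inverted index only visits molecules that share a feature with the query, which is where a speedup on sparse fingerprints would come from.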

Year: 2011    PMID: 21332133    DOI: 10.1021/ci100437e

Source DB: PubMed    Journal: J Chem Inf Model    ISSN: 1549-9596    Impact factor: 4.956


  3 in total

1.  Methods for Similarity-based Virtual Screening. (Review)

Authors:  Thomas G Kristensen; Jesper Nielsen; Christian N S Pedersen
Journal:  Comput Struct Biotechnol J       Date:  2013-03-03       Impact factor: 7.271

2.  Anatomy of high-performance 2D similarity calculations.

Authors:  Imran S Haque; Vijay S Pande; W Patrick Walters
Journal:  J Chem Inf Model       Date:  2011-09-07       Impact factor: 4.956

3.  Speeding up chemical searches using the inverted index: the convergence of chemoinformatics and text search methods.

Authors:  Ramzi Nasr; Rares Vernica; Chen Li; Pierre Baldi
Journal:  J Chem Inf Model       Date:  2012-04-10       Impact factor: 4.956

