Error bounds on the SCISSORS approximation method.

Imran S Haque, Vijay S Pande.

Abstract

The SCISSORS method for approximating chemical similarities has shown excellent empirical performance on a number of real-world chemical data sets but lacks theoretically proven bounds on its worst-case error performance. This paper first proves reductions showing SCISSORS to be equivalent to two previous kernel methods: kernel principal components analysis and the rank-k Nyström approximation of a Gram matrix. These reductions allow the use of generalization bounds on these techniques to show that the expected error in SCISSORS approximations of molecular similarity kernels is bounded in expected pairwise inner product error, in matrix 2-norm and Frobenius norm for full kernel matrix approximations, and in root-mean-square deviation for approximated matrices. Finally, we show that the actual performance of SCISSORS is significantly better than these worst-case bounds, indicating that chemical space is well-structured for chemical sampling algorithms.

Year:  2011        PMID: 21851122      PMCID: PMC3183166          DOI: 10.1021/ci200251a

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  4 in total

1.  SCISSORS: a linear-algebraical technique to rapidly approximate chemical similarities.

Authors:  Imran S Haque; Vijay S Pande
Journal:  J Chem Inf Model       Date:  2010-06-28       Impact factor: 4.956

2.  LINGO, an efficient holographic text based method to calculate biophysical properties and intermolecular similarities.

Authors:  David Vidal; Michael Thormann; Miquel Pons
Journal:  J Chem Inf Model       Date:  2005 Mar-Apr       Impact factor: 4.956

3.  Molecular basis sets - a general similarity-based approach for representing chemical spaces.

Authors:  Akshay S Raghavendra; Gerald M Maggiora
Journal:  J Chem Inf Model       Date:  2007-06-07       Impact factor: 4.956

4.  SIML: a fast SIMD algorithm for calculating LINGO chemical similarities on GPUs and CPUs.

Authors:  Imran S Haque; Vijay S Pande; W Patrick Walters
Journal:  J Chem Inf Model       Date:  2010-04-26       Impact factor: 4.956

  1 in total

1.  SCISSORS: practical considerations.

Authors:  Steven M Kearnes; Imran S Haque; Vijay S Pande
Journal:  J Chem Inf Model       Date:  2013-12-16       Impact factor: 4.956

