Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees.

Vladimir Grupcev, Yongke Yuan, Yi-Cheng Tu, Jin Huang, Shaoping Chen, Sagar Pandit, Michael Weng

Abstract

Particle simulation has become an important research tool in many scientific and engineering fields. Data generated by such simulations pose great challenges for database storage and query processing. One of the queries against particle simulation data, the spatial distance histogram (SDH) query, is the building block of many high-level analytics, and requires quadratic time to compute using a straightforward algorithm. Previous work has developed efficient algorithms that compute exact SDHs. Although they beat the naive solution, such algorithms are still impractical for processing SDH queries against large-scale simulation data. In this paper, we take a different path to tackle this problem by focusing on approximate algorithms with provable error bounds. We first present a solution derived from the aforementioned exact SDH algorithm, and this solution has a running time that is independent of the system size N. We also develop a mathematical model to analyze the mechanism that leads to errors in the basic approximate algorithm. Our model provides insights on how the algorithm can be improved to achieve higher accuracy and efficiency. Such insights give rise to a new approximate algorithm with an improved time/accuracy tradeoff. Experimental results confirm our analysis.
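
For orientation, the straightforward quadratic-time computation mentioned in the abstract can be sketched as below. This is only an illustrative brute-force baseline, not the exact or approximate algorithms of the paper. The function name `naive_sdh`, the equal-width buckets, and the clamping of overflow distances into the last bucket are assumptions made for this sketch.

```python
import math
from itertools import combinations


def naive_sdh(points, bucket_width, num_buckets):
    """Brute-force spatial distance histogram (hypothetical helper).

    Visits all N*(N-1)/2 point pairs, so the cost grows quadratically in N.
    Equal-width buckets are assumed; distances beyond the last bucket are
    clamped into it (an assumption of this sketch).
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        idx = min(int(d / bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Four 2D points give 4*3/2 = 6 pairwise distances.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = naive_sdh(points, bucket_width=2.0, num_buckets=3)
print(hist)
```

Every pair is visited exactly once, so the bucket counts always sum to N(N-1)/2. That pair count is what makes the brute-force approach expensive for large N.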

Keywords:  Molecular simulation; quadtree; scientific databases; spatial distance histogram

Year:  2012        PMID: 24693210      PMCID: PMC3969837          DOI: 10.1109/TKDE.2012.149

Source DB:  PubMed          Journal:  IEEE Trans Knowl Data Eng        ISSN: 1041-4347            Impact factor:   6.977


  6 in total

1.  The role of declarative querying in bioinformatics.

Authors:  Jignesh M Patel
Journal:  OMICS       Date:  2003

2.  Finding the nearest neighbors in biological databases using less distance computations.

Authors:  Jianjun Zhou; Jörg Sander; Zhipeng Cai; Lusheng Wang; Guohui Lin
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2010 Oct-Dec       Impact factor: 3.710

3.  GROMACS 4:  Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.

Authors:  Berk Hess; Carsten Kutzner; David van der Spoel; Erik Lindahl
Journal:  J Chem Theory Comput       Date:  2008-03       Impact factor: 6.006

4.  Simulations of the formation, evolution and clustering of galaxies and quasars.

Authors:  Volker Springel; Simon D M White; Adrian Jenkins; Carlos S Frenk; Naoki Yoshida; Liang Gao; Julio Navarro; Robert Thacker; Darren Croton; John Helly; John A Peacock; Shaun Cole; Peter Thomas; Hugh Couchman; August Evrard; Jörg Colberg; Frazer Pearce
Journal:  Nature       Date:  2005-06-02       Impact factor: 49.962

5.  Performance analysis of a dual-tree algorithm for computing spatial distance histograms.

Authors:  Shaoping Chen; Yi-Cheng Tu; Yuni Xia
Journal:  VLDB J       Date:  2011-08-01       Impact factor: 2.868

6.  Distance Histogram Computation Based on Spatiotemporal Uniformity in Scientific Data.

Authors:  Anand Kumar; Vladimir Grupcev; Yongke Yuan; Yi-Cheng Tu; Gang Shen
Journal:  Adv Database Technol       Date:  2012
  2 in total

1.  Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly.

Authors:  Anand Kumar; Vladimir Grupcev; Yongke Yuan; Jin Huang; Yi-Cheng Tu; Gang Shen
Journal:  IEEE Trans Knowl Data Eng       Date:  2014-10       Impact factor: 6.977

2.  DCMS: A data analytics and management system for molecular simulation.

Authors:  Anand Kumar; Vladimir Grupcev; Meryem Berrada; Joseph C Fogarty; Yi-Cheng Tu; Xingquan Zhu; Sagar A Pandit; Yuni Xia
Journal:  J Big Data       Date:  2014-11-26