Vladimir Grupcev, Yongke Yuan, Yi-Cheng Tu, Jin Huang, Shaoping Chen, Sagar Pandit, Michael Weng.
Abstract
Particle simulation has become an important research tool in many scientific and engineering fields. Data generated by such simulations pose great challenges for database storage and query processing. One query against particle simulation data, the spatial distance histogram (SDH) query, is the building block of many high-level analytics, and requires quadratic time to compute using a straightforward algorithm. Previous work has developed efficient algorithms that compute exact SDHs. Although they beat the naive solution, such algorithms are still not practical for processing SDH queries against large-scale simulation data. In this paper, we take a different path to tackle this problem by focusing on approximate algorithms with provable error bounds. We first present a solution derived from the aforementioned exact SDH algorithm; its running time is independent of the system size N. We also develop a mathematical model to analyze the mechanism that leads to errors in this basic approximate algorithm. Our model provides insights into how the algorithm can be improved to achieve higher accuracy and efficiency. These insights give rise to a new approximate algorithm with an improved time/accuracy tradeoff. Experimental results confirm our analysis.
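For illustration only, the straightforward quadratic-time SDH computation mentioned in the abstract might look like the following minimal Python sketch. It is not the paper's exact or approximate algorithm. The function name `naive_sdh`, the equal-width buckets, and the choice to clamp overly large distances into the last bucket are all assumptions made for this example.

```python
import math
from itertools import combinations


def naive_sdh(points, bucket_width, num_buckets):
    """Brute-force spatial distance histogram (illustrative sketch).

    Visits every unordered pair of points once, so the running time
    grows quadratically with the number of points N.
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance between the two points
        # Assumption: distances beyond the histogram range are clamped
        # into the last bucket rather than dropped.
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Hypothetical 2D data set with four particles (six pairs in total).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = naive_sdh(points, bucket_width=2.0, num_buckets=3)
print(hist)
```

With N particles the loop touches N(N-1)/2 pairs, which is the cost that the exact and approximate algorithms described in the abstract aim to avoid.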
Keywords: Molecular simulation; quadtree; scientific databases; spatial distance histogram
Year: 2012 PMID: 24693210 PMCID: PMC3969837 DOI: 10.1109/TKDE.2012.149
Source DB: PubMed Journal: IEEE Trans Knowl Data Eng ISSN: 1041-4347 Impact factor: 6.977