
Performance analysis of a dual-tree algorithm for computing spatial distance histograms.

Shaoping Chen, Yi-Cheng Tu, Yuni Xia.

Abstract

Many scientific and engineering fields produce large volumes of spatiotemporal data. The storage, retrieval, and analysis of such data pose great challenges for database system design. Analysis of scientific spatiotemporal data often involves computing functions of all point-to-point interactions. One such analytic, the Spatial Distance Histogram (SDH), is of vital importance to scientific discovery. Recently, algorithms for efficient SDH processing in large-scale scientific databases have been proposed. These algorithms adopt a recursive tree-traversing strategy that processes point-to-point distances in the visited tree nodes in batches, and thus require less time than the brute-force approach, in which all pairwise distances must be computed. Despite the promising experimental results, the complexity of such algorithms has not been thoroughly studied. In this paper, we present an analysis of such algorithms based on a geometric modeling approach. The main technique is to transform the analysis of point counts into a problem of quantifying the area of regions where pairwise distances can be processed in batches by the algorithm. From the analysis, we conclude that the number of pairwise distances left to be processed decreases exponentially as more levels of the tree are visited. This leads to a proof of a time complexity lower than the quadratic time needed by a brute-force algorithm and lays the foundation for a constant-time approximate algorithm. Our model is also general in that it works for a wide range of point spatial distributions, histogram types, and space-partitioning options in building the tree.
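
The batching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes 2D points inside a square of side `side`, a quadtree as the space-partitioning structure, and equal-width buckets, and every name in it (`Node`, `cell_distance_bounds`, `dual_tree_sdh`, `leaf_size`) is hypothetical. Two cells are resolved in one step when the smallest and largest possible distance between them fall into the same bucket; otherwise the traversal descends one level, and only at the leaves are distances computed one by one.

```python
import math
from itertools import combinations


def brute_force_sdh(points, width, n_buckets):
    """Reference SDH: bin every pairwise distance, O(N^2) time."""
    hist = [0] * n_buckets
    for p, q in combinations(points, 2):
        hist[min(int(math.dist(p, q) / width), n_buckets - 1)] += 1
    return hist


class Node:
    """Square quadtree cell with lower-left corner (x, y) and side `size`."""

    def __init__(self, pts, x, y, size, leaf_size=8):
        self.pts, self.x, self.y, self.size = pts, x, y, size
        self.children = []
        # Split until a cell holds few points (size guard stops on duplicates).
        if len(pts) > leaf_size and size > 1e-9:
            half = size / 2
            groups = {}
            for p in pts:
                key = (p[0] >= x + half, p[1] >= y + half)
                groups.setdefault(key, []).append(p)
            # Empty quadrants are skipped; True/False act as 1/0 offsets.
            self.children = [
                Node(g, x + half * i, y + half * j, half, leaf_size)
                for (i, j), g in groups.items()
            ]


def cell_distance_bounds(a, b):
    """Smallest and largest possible distance between a point in a and one in b."""
    gap_x = max(0.0, a.x - (b.x + b.size), b.x - (a.x + a.size))
    gap_y = max(0.0, a.y - (b.y + b.size), b.y - (a.y + a.size))
    span_x = max(a.x + a.size, b.x + b.size) - min(a.x, b.x)
    span_y = max(a.y + a.size, b.y + b.size) - min(a.y, b.y)
    return math.hypot(gap_x, gap_y), math.hypot(span_x, span_y)


def dual_tree_sdh(points, width, n_buckets, side=1.0):
    """SDH via recursive traversal of cell pairs (assumes points in [0, side]^2)."""
    hist = [0] * n_buckets

    def bucket(d):
        # Distances beyond the last bucket are clamped into it.
        return min(int(d / width), n_buckets - 1)

    def cross(a, b):
        """Count all pairs with one point in cell a and the other in cell b."""
        lo, hi = cell_distance_bounds(a, b)
        if bucket(lo) == bucket(hi):
            # Resolved: every pair lands in the same bucket, so add in one batch.
            hist[bucket(lo)] += len(a.pts) * len(b.pts)
        elif not a.children and not b.children:
            # Two unresolved leaves: fall back to point-by-point distances.
            for p in a.pts:
                for q in b.pts:
                    hist[bucket(math.dist(p, q))] += 1
        else:
            # Descend one level in whichever cell can still be split.
            for ca in a.children or [a]:
                for cb in b.children or [b]:
                    cross(ca, cb)

    def within(a):
        """Count all pairs whose two points both lie in cell a."""
        if not a.children:
            for p, q in combinations(a.pts, 2):
                hist[bucket(math.dist(p, q))] += 1
        else:
            for c in a.children:
                within(c)
            for ca, cb in combinations(a.children, 2):
                cross(ca, cb)

    within(Node(points, 0.0, 0.0, side))
    return hist
```

Under these assumptions, calling `dual_tree_sdh(points, 0.25, 6)` on points drawn from the unit square should return the same histogram as `brute_force_sdh(points, 0.25, 6)`, while skipping the distance computation for every cell pair that resolves in a batch. Narrower buckets make fewer cell pairs resolvable at a given level, which is why the traversal has to visit more levels of the tree before the remaining unresolved distances become few.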


Year:  2011        PMID: 21804753      PMCID: PMC3145372          DOI: 10.1007/s00778-010-0205-7

Source DB:  PubMed          Journal:  VLDB J        ISSN: 1066-8888            Impact factor:   2.868


  6 in total

1.  Pathways database system: an integrated system for biological pathways.

Authors:  L Krishnamurthy; J Nadeau; G Ozsoyoglu; M Ozsoyoglu; G Schaeffer; M Tasan; W Xu
Journal:  Bioinformatics       Date:  2003-05-22       Impact factor: 6.937

2.  DSMM: a Database of Simulated Molecular Motions.

Authors:  Giacomo Finocchiaro; Ting Wang; Rene Hoffmann; Aitor Gonzalez; Rebecca C Wade
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

3.  The role of declarative querying in bioinformatics.

Authors:  Jignesh M Patel
Journal:  OMICS       Date:  2003

4.  Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.

Authors:  Weijia Xu; Stuart Ozer; Robin R Gutell
Journal:  Int Conf Sci Stat Database Manag       Date:  2009

5.  GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.

Authors:  Berk Hess; Carsten Kutzner; David van der Spoel; Erik Lindahl
Journal:  J Chem Theory Comput       Date:  2008-03       Impact factor: 6.006

6.  Simulations of the formation, evolution and clustering of galaxies and quasars.

Authors:  Volker Springel; Simon D M White; Adrian Jenkins; Carlos S Frenk; Naoki Yoshida; Liang Gao; Julio Navarro; Robert Thacker; Darren Croton; John Helly; John A Peacock; Shaun Cole; Peter Thomas; Hugh Couchman; August Evrard; Jörg Colberg; Frazer Pearce
Journal:  Nature       Date:  2005-06-02       Impact factor: 49.962

  6 in total

1.  Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly.

Authors:  Anand Kumar; Vladimir Grupcev; Yongke Yuan; Jin Huang; Yi-Cheng Tu; Gang Shen
Journal:  IEEE Trans Knowl Data Eng       Date:  2014-10       Impact factor: 6.977

2.  Efficient SDH Computation In Molecular Simulations Data.

Authors:  Yi-Cheng Tu; Shaoping Chen; Sagar Pandit; Anand Kumar; Vladimir Grupcev
Journal:  ACM BCB       Date:  2012-10

3.  Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing.

Authors:  Hao Li; Di Yu; Anand Kumar; Yi-Cheng Tu
Journal:  Proc IEEE Int Conf Big Data       Date:  2014-10

4.  Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees.

Authors:  Vladimir Grupcev; Yongke Yuan; Yi-Cheng Tu; Jin Huang; Shaoping Chen; Sagar Pandit; Michael Weng
Journal:  IEEE Trans Knowl Data Eng       Date:  2012-09-01       Impact factor: 6.977

5.  Distance Histogram Computation Based on Spatiotemporal Uniformity in Scientific Data.

Authors:  Anand Kumar; Vladimir Grupcev; Yongke Yuan; Yi-Cheng Tu; Gang Shen
Journal:  Adv Database Technol       Date:  2012

6.  DCMS: A data analytics and management system for molecular simulation.

Authors:  Anand Kumar; Vladimir Grupcev; Meryem Berrada; Joseph C Fogarty; Yi-Cheng Tu; Xingquan Zhu; Sagar A Pandit; Yuni Xia
Journal:  J Big Data       Date:  2014-11-26
