
Distributed Computation of the knn Graph for Large High-Dimensional Point Sets.

Erion Plaku, Lydia E Kavraki.

Abstract

High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors.
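The definition above (connect each point to its k closest points, under an arbitrary distance metric) can be illustrated with a minimal brute-force sketch. This is not the distributed algorithm of the paper, whose details the abstract does not give. The function names `knn_graph` and `knn_graph_blocked`, the `n_blocks` parameter, and the choice of Euclidean distance as the default metric are assumptions made for illustration. The blocked variant only shows, on a single machine, why the work can be split at all: the k best candidates found in separate blocks of the data can be merged into the global k best.

```python
import heapq
import math


def knn_graph(points, k, dist=math.dist):
    """Brute-force knn graph: map each point index to its k closest other indices."""
    graph = {}
    for i, p in enumerate(points):
        # (distance, index) pairs for every other point; ties break on index.
        candidates = ((dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in heapq.nsmallest(k, candidates)]
    return graph


def knn_graph_blocked(points, k, n_blocks, dist=math.dist):
    """Same graph, built block by block and merged.

    Each block is a hypothetical stand-in for the share of the data
    that one processor would hold; here the blocks are visited in turn.
    """
    n = len(points)
    size = max(1, math.ceil(n / n_blocks))
    blocks = [range(s, min(s + size, n)) for s in range(0, n, size)]
    best = {i: [] for i in range(n)}
    for block in blocks:
        for i in range(n):
            local = [(dist(points[i], points[j]), j) for j in block if j != i]
            # Merge this block's candidates with the k best seen so far.
            best[i] = heapq.nsmallest(k, local + best[i])
    return {i: [j for _, j in pairs] for i, pairs in best.items()}
```

Any callable taking two points and returning a number can be passed as `dist`, for example `lambda p, q: sum(abs(a - b) for a, b in zip(p, q))` for Manhattan distance. The brute-force version performs a quadratic number of distance evaluations, which is the cost that motivates distributing the computation for large data sets.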

Year: 2007   PMID: 19847318   PMCID: PMC2764297   DOI: 10.1016/j.jpdc.2006.10.004

Source DB: PubMed   Journal: J Parallel Distrib Comput   ISSN: 0743-7315   Impact factor: 3.734


