
On landmark selection and sampling in high-dimensional data analysis.

Mohamed-Ali Belabbas, Patrick J Wolfe.

Abstract

In recent years, the spectral analysis of appropriately defined kernel matrices has emerged as a principled way to extract the low-dimensional structure often prevalent in high-dimensional data. Here, we provide an introduction to spectral methods for linear and nonlinear dimension reduction, emphasizing ways to overcome the computational limitations currently faced by practitioners with massive datasets. In particular, a data subsampling or landmark selection process is often employed to construct a kernel based on partial information, followed by an approximate spectral analysis termed the Nyström extension. We provide a quantitative framework to analyse this procedure, and use it to demonstrate algorithmic performance bounds on a range of practical approaches designed to optimize the landmark selection process. We compare the practical implications of these bounds by way of real-world examples drawn from the field of computer vision, whereby low-dimensional manifold structure is shown to emerge from high-dimensional video data streams.
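The Nyström extension referred to in the abstract approximates a large kernel matrix from a subset of its columns: with landmark indices I, kernel block C = K[:, I] and landmark-landmark block W = K[I, I], the approximation is K ≈ C W⁺ Cᵀ. The sketch below illustrates this under assumptions not taken from the paper: a Gaussian kernel on synthetic data and uniform random landmark sampling (the paper's subject is precisely that better-than-uniform selection schemes exist); the function name `nystrom_approximation` is illustrative.

```python
import numpy as np

def nystrom_approximation(K, landmarks):
    """Nystrom extension: approximate K from its landmark columns, K ~ C W^+ C^T."""
    C = K[:, landmarks]            # n x m: columns at the landmark indices
    W = C[landmarks, :]            # m x m: landmark-landmark block K[I, I]
    return C @ np.linalg.pinv(W) @ C.T

# Synthetic example: Gaussian (RBF) kernel on random data, uniform landmark sampling.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / (2 * X.shape[1]))          # bandwidth chosen ad hoc

idx = rng.choice(200, size=40, replace=False)     # 40 uniform landmarks
K_hat = nystrom_approximation(K, idx)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

When the landmark set is the full index set, C = W = K and the approximation is exact (since K K⁺ K = K for a symmetric positive semi-definite kernel); with m < n landmarks, the quality of K_hat depends on how well the selected columns span the dominant eigenspace of K, which is what the bounds in the paper quantify.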


Year:  2009        PMID: 19805446      PMCID: PMC2865880          DOI: 10.1098/rsta.2009.0161

Source DB:  PubMed          Journal:  Philos Trans A Math Phys Eng Sci        ISSN: 1364-503X            Impact factor:   4.226


References:  7 in total

1.  A global geometric framework for nonlinear dimensionality reduction.

Authors:  J B Tenenbaum; V de Silva; J C Langford
Journal:  Science       Date:  2000-12-22       Impact factor: 47.728

2.  Spectral grouping using the Nyström method.

Authors:  Charless Fowlkes; Serge Belongie; Fan Chung; Jitendra Malik
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2004-02       Impact factor: 6.226

3.  Hessian eigenmaps: locally linear embedding techniques for high-dimensional data.

Authors:  David L Donoho; Carrie Grimes
Journal:  Proc Natl Acad Sci U S A       Date:  2003-04-30       Impact factor: 11.205

4.  Density-weighted Nyström method for computing large kernel eigensystems.

Authors:  Kai Zhang; James T Kwok
Journal:  Neural Comput       Date:  2009-01       Impact factor: 2.026

5.  Spectral methods in machine learning and new strategies for very large datasets.

Authors:  Mohamed-Ali Belabbas; Patrick J Wolfe
Journal:  Proc Natl Acad Sci U S A       Date:  2009-01-07       Impact factor: 11.205

6.  Riemannian manifold learning.

Authors:  Tong Lin; Hongbin Zha
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2008-05       Impact factor: 6.226

7.  Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps.

Authors:  R R Coifman; S Lafon; A B Lee; M Maggioni; B Nadler; F Warner; S W Zucker
Journal:  Proc Natl Acad Sci U S A       Date:  2005-05-17       Impact factor: 12.779

Cited by:  4 in total

1.  Making sense of big data.

Authors:  Patrick J Wolfe
Journal:  Proc Natl Acad Sci U S A       Date:  2013-10-21       Impact factor: 11.205

2.  Statistical challenges of high-dimensional data.

Authors:  Iain M Johnstone; D Michael Titterington
Journal:  Philos Trans A Math Phys Eng Sci       Date:  2009-11-13       Impact factor: 4.226

3.  Sampling from Determinantal Point Processes for Scalable Manifold Learning.

Authors:  Christian Wachinger; Polina Golland
Journal:  Inf Process Med Imaging       Date:  2015

4.  Global denoising for 3D MRI.

Authors:  Xi Wu; Zhipeng Yang; Jing Peng; Jiliu Zhou
Journal:  Biomed Eng Online       Date:  2016-05-12       Impact factor: 2.819

