Learning eigenfunctions links spectral embedding and kernel PCA.

Yoshua Bengio, Olivier Delalleau, Nicolas Le Roux, Jean-François Paiement, Pascal Vincent, Marie Ouimet.

Abstract

In this letter, we show a direct relation between spectral embedding methods and kernel principal components analysis and how both are special cases of a more general learning problem: learning the principal eigenfunctions of an operator defined from a kernel and the unknown data-generating density. Whereas spectral embedding methods provided only coordinates for the training points, the analysis justifies a simple extension to out-of-sample examples (the Nyström formula) for multidimensional scaling (MDS), spectral clustering, Laplacian eigenmaps, locally linear embedding (LLE), and Isomap. The analysis provides, for all such spectral embedding methods, the definition of a loss function, whose empirical average is minimized by the traditional algorithms. The asymptotic expected value of that loss defines a generalization performance and clarifies what these algorithms are trying to learn. Experiments with LLE, Isomap, spectral clustering, and MDS show that this out-of-sample embedding formula generalizes well, with a level of error comparable to the effect of small perturbations of the training set on the embedding.
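As an illustration of the out-of-sample idea described in the abstract, the following is a minimal sketch of a Nyström-style extension for a kernel-PCA-like embedding. It is not the authors' code. It assumes a plain, uncentered Gaussian kernel for simplicity, whereas the methods named in the abstract (MDS, spectral clustering, Laplacian eigenmaps, LLE, Isomap) each use their own data-dependent kernel and normalization. The function names `gaussian_kernel`, `fit_embedding`, and `nystrom_embed`, and the bandwidth parameter `sigma`, are hypothetical names chosen for this example.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_embedding(X, n_components=2, sigma=1.0):
    """Return the leading eigenvalues and unit-norm eigenvectors of the Gram matrix."""
    K = gaussian_kernel(X, X, sigma)
    eigvals, eigvecs = np.linalg.eigh(K)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]

def nystrom_embed(X_new, X_train, eigvals, eigvecs, sigma=1.0):
    """Nystrom-style coordinates: e_k(x) = sum_i v_ki K(x, x_i) / sqrt(lambda_k)."""
    K_new = gaussian_kernel(X_new, X_train, sigma)
    return K_new @ eigvecs / np.sqrt(eigvals)

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))

eigvals, eigvecs = fit_embedding(X_train)

# Training-point coordinates from the eigendecomposition alone.
train_embedding = eigvecs * np.sqrt(eigvals)

# Applying the out-of-sample formula to training points should reproduce
# their original coordinates, which is a basic consistency check.
out_of_sample = nystrom_embed(X_train[:5], X_train, eigvals, eigvecs)
```

Under these assumptions, evaluating the formula at a training point returns that point's original coordinate, because the Gram matrix applied to an eigenvector rescales it by its eigenvalue. A genuinely new point is embedded by the same weighted sum of kernel evaluations against the training set.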

Year:  2004        PMID: 15333211     DOI: 10.1162/0899766041732396

Source DB:  PubMed          Journal:  Neural Comput        ISSN: 0899-7667            Impact factor:   2.026


  12 in total

1.  Using ancestry matching to combine family-based and unrelated samples for genome-wide association studies.

Authors:  Andrew Crossett; Brian P Kent; Lambertus Klei; Steven Ringquist; Massimo Trucco; Kathryn Roeder; Bernie Devlin
Journal:  Stat Med       Date:  2010-12-10       Impact factor: 2.373

2.  Kernel machine approach to testing the significance of multiple genetic markers for risk prediction.

Authors:  Tianxi Cai; Giulia Tonini; Xihong Lin
Journal:  Biometrics       Date:  2011-01-31       Impact factor: 2.571

3.  Enhancement of breast CADx with unlabeled data.

Authors:  Andrew R Jamieson; Maryellen L Giger; Karen Drukker; Lorenzo L Pesce
Journal:  Med Phys       Date:  2010-08       Impact factor: 4.071

4.  Reduced models for binocular rivalry.

Authors:  Carlo R Laing; Thomas Frewen; Ioannis G Kevrekidis
Journal:  J Comput Neurosci       Date:  2010-02-25       Impact factor: 1.621

5.  Systems analysis of high-throughput data. (Review)

Authors:  Rosemary Braun
Journal:  Adv Exp Med Biol       Date:  2014       Impact factor: 2.622

6.  Low Dimensionality in Gene Expression Data Enables the Accurate Extraction of Transcriptional Programs from Shallow Sequencing.

Authors:  Graham Heimberg; Rajat Bhatnagar; Hana El-Samad; Matt Thomson
Journal:  Cell Syst       Date:  2016-04-27       Impact factor: 10.304

7.  Risk Classification with an Adaptive Naive Bayes Kernel Machine Model.

Authors:  Jessica Minnier; Ming Yuan; Jun S Liu; Tianxi Cai
Journal:  J Am Stat Assoc       Date:  2015-04-22       Impact factor: 5.033

8.  A Method to Exploit the Structure of Genetic Ancestry Space to Enhance Case-Control Studies.

Authors:  Corneliu A Bodea; Benjamin M Neale; Stephan Ripke; Mark J Daly; Bernie Devlin; Kathryn Roeder
Journal:  Am J Hum Genet       Date:  2016-04-14       Impact factor: 11.025

9.  Partition decoupling for multi-gene analysis of gene expression profiling data.

Authors:  Rosemary Braun; Gregory Leibon; Scott Pauls; Daniel Rockmore
Journal:  BMC Bioinformatics       Date:  2011-12-30       Impact factor: 3.169

10.  Local and global perspectives on diffusion maps in the analysis of molecular systems.

Authors:  Z Trstanova; B Leimkuhler; T Lelièvre
Journal:  Proc Math Phys Eng Sci       Date:  2020-01-15       Impact factor: 2.704
