
Doubly Stochastic Normalization of the Gaussian Kernel Is Robust to Heteroskedastic Noise.

Boris Landa, Ronald R Coifman, Yuval Kluger.

Abstract

A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to form an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g., the row-stochastic normalization or its symmetric variant). We demonstrate that the doubly-stochastic normalization of the Gaussian kernel with zero main diagonal (i.e., no self loops) is robust to heteroskedastic noise. That is, the doubly-stochastic normalization is advantageous in that it automatically accounts for observations with different noise variances. Specifically, we prove that in a suitable high-dimensional setting where heteroskedastic noise does not concentrate too much in any particular direction in space, the resulting (doubly-stochastic) noisy affinity matrix converges to its clean counterpart with rate m^{-1/2}, where m is the ambient dimension. We demonstrate this result numerically, and show that in contrast, the popular row-stochastic and symmetric normalizations behave unfavorably under heteroskedastic noise. Furthermore, we provide examples of simulated and experimental single-cell RNA sequence data with intrinsic heteroskedasticity, where the advantage of the doubly-stochastic normalization for exploratory analysis is evident.
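As a rough illustration of the construction the abstract describes, the sketch below builds a Gaussian kernel with zero main diagonal and rescales it symmetrically until its rows and columns sum to one. It is only a minimal sketch under stated assumptions: the kernel form exp(-||x_i - x_j||^2 / eps), the bandwidth name `eps`, the function name `doubly_stochastic_affinity`, and the damped Sinkhorn-style fixed-point iteration are choices made here for illustration and are not taken from the paper.

```python
import numpy as np


def doubly_stochastic_affinity(X, eps, n_iter=1000, tol=1e-10):
    """Hypothetical helper: scale a zero-diagonal Gaussian kernel to be doubly stochastic.

    X   : (n, m) array, one data point per row (m is the ambient dimension).
    eps : kernel bandwidth (assumed parameterization exp(-dist^2 / eps)).
    """
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    sq = np.sum(X ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(dist2, 0.0, out=dist2)

    # Gaussian kernel with zero main diagonal (no self loops).
    K = np.exp(-dist2 / eps)
    np.fill_diagonal(K, 0.0)

    # Look for d > 0 such that W = diag(d) K diag(d) has unit row sums.
    # The damped update d <- sqrt(d / (K d)) has the fixed point d * (K d) = 1.
    d = np.ones(K.shape[0])
    for _ in range(n_iter):
        d = np.sqrt(d / (K @ d))
        if np.max(np.abs(d * (K @ d) - 1.0)) < tol:
            break

    # Symmetric scaling; W is symmetric, so unit row sums imply unit column sums.
    return d[:, None] * K * d[None, :]
```

Because the scaling is applied symmetrically, the result stays symmetric with a zero diagonal. The bandwidth `eps` has to be chosen large enough that no row of the kernel underflows to zero, otherwise the iteration divides by zero.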


Year:  2021        PMID: 34124607      PMCID: PMC8194191          DOI: 10.1137/20M1342124

Source DB:  PubMed          Journal:  SIAM J Math Data Sci        ISSN: 2577-0187


