
Minimax Estimation of Functionals of Discrete Distributions.

Jiantao Jiao, Kartik Venkat, Yanjun Han, Tsachy Weissman.

Abstract

We propose a general methodology for the construction and analysis of essentially minimax estimators for a wide class of functionals of finite-dimensional parameters, and elaborate on the case of discrete distributions, where the support size S is unknown and may be comparable with or even much larger than the number of observations n. We treat the respective regimes where the functional is nonsmooth and smooth separately. In the nonsmooth regime, we apply an unbiased estimator for the best polynomial approximation of the functional, whereas in the smooth regime, we apply a bias-corrected version of the maximum likelihood estimator (MLE). We illustrate the merit of this approach by thoroughly analyzing the performance of the resulting schemes for estimating two important information measures: 1) the entropy H(P) = Σ_{i=1}^S −p_i ln p_i and 2) F_α(P) = Σ_{i=1}^S p_i^α, α > 0. We obtain the minimax L2 rates for estimating these functionals. In particular, we demonstrate that our estimator achieves the optimal sample complexity n ≍ S/ln S for entropy estimation. We also demonstrate that the sample complexity for estimating F_α(P), 0 < α < 1, is n ≍ S^{1/α}/ln S, which can be achieved by our estimator but not the MLE. For 1 < α < 3/2, we show the minimax L2 rate for estimating F_α(P) is (n ln n)^{−2(α−1)} for infinite support size, while the maximum L2 rate for the MLE is n^{−2(α−1)}. For all the above cases, the behavior of the minimax rate-optimal estimators with n samples is essentially that of the MLE (plug-in rule) with n ln n samples, which we term "effective sample size enlargement." We highlight the practical advantages of our schemes for the estimation of entropy and mutual information. We compare our performance with various existing approaches, and demonstrate that our approach reduces running time and boosts accuracy. Moreover, we show that the minimax rate-optimal mutual information estimator yielded by our framework leads to significant performance boosts over the Chow–Liu algorithm in learning graphical models. The wide use of information measure estimation suggests that the insights and estimators obtained in this paper could be broadly applicable.
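The smooth-regime ingredient described in the abstract (a bias-corrected plug-in MLE for entropy) can be illustrated with the classical first-order (Miller–Madow) bias correction. This is a minimal sketch of that one idea only, not the paper's full two-regime estimator, which additionally requires the best-polynomial-approximation step in the nonsmooth regime; the function names are illustrative.

```python
import math
from collections import Counter

def entropy_mle(samples):
    """Plug-in (MLE) entropy estimate in nats: H(p_hat) = -sum_i p_i ln p_i,
    where p_hat is the empirical distribution of the samples."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def entropy_bias_corrected(samples):
    """MLE plus the classical first-order bias correction (S_hat - 1) / (2n),
    where S_hat is the number of distinct symbols observed. This corrects the
    leading-order negative bias of the plug-in rule in the smooth regime."""
    n = len(samples)
    s_hat = len(set(samples))
    return entropy_mle(samples) + (s_hat - 1) / (2 * n)

# Usage: 100 samples spread evenly over a 4-symbol alphabet.
samples = list(range(4)) * 25
print(entropy_mle(samples))             # ln 4 ≈ 1.3863 nats
print(entropy_bias_corrected(samples))  # ln 4 + 3/200 ≈ 1.4013 nats
```

The correction term depends only on the observed support size and the sample count, which is why it helps in the smooth regime (all probabilities bounded away from zero) but cannot achieve the optimal n ≍ S/ln S sample complexity on its own.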


Keywords:  Chow–Liu algorithm; mean squared error; Rényi entropy; approximation theory; entropy estimation; high-dimensional statistics; maximum likelihood estimator; minimax lower bound; minimax optimality; nonsmooth functional estimation; polynomial approximation

Year:  2015        PMID: 29375152      PMCID: PMC5786426          DOI: 10.1109/tit.2015.2412945

Source DB:  PubMed          Journal:  IEEE Trans Inf Theory        ISSN: 0018-9448            Impact factor:   2.501


References:  13 in total

1.  Entropy estimation of symbol sequences.

Authors:  Thomas Schürmann; Peter Grassberger
Journal:  Chaos       Date:  1996-09       Impact factor: 3.642

2.  Binless strategies for estimation of information from neural data.

Authors:  Jonathan D Victor
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2002-11-11

3.  [Review] Mutual-information-based registration of medical images: a survey.

Authors:  Josien P W Pluim; J B Antoine Maintz; Max A Viergever
Journal:  IEEE Trans Med Imaging       Date:  2003-08       Impact factor: 10.048

4.  Estimating mutual information.

Authors:  Alexander Kraskov; Harald Stögbauer; Peter Grassberger
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2004-06-23

5.  Entropy and information in neural spike trains: progress on the sampling problem.

Authors:  Ilya Nemenman; William Bialek; Rob de Ruyter van Steveninck
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2004-05-24

6.  Information-theoretic inference of large transcriptional regulatory networks.

Authors:  Patrick E Meyer; Kevin Kontos; Frederic Lafitte; Gianluca Bontempi
Journal:  EURASIP J Bioinform Syst Biol       Date:  2007

7.  On the impact of entropy estimation on transcriptional regulatory network inference based on mutual information.

Authors:  Catharina Olsen; Patrick E Meyer; Gianluca Bontempi
Journal:  EURASIP J Bioinform Syst Biol       Date:  2009-01-12

8.  Limits of predictability in human mobility.

Authors:  Chaoming Song; Zehui Qu; Nicholas Blumm; Albert-László Barabási
Journal:  Science       Date:  2010-02-19       Impact factor: 47.728

9.  Coverage-adjusted entropy estimation.

Authors:  Vincent Q Vu; Bin Yu; Robert E Kass
Journal:  Stat Med       Date:  2007-09-20       Impact factor: 2.373

10.  Estimating mutual information using B-spline functions--an improved similarity measure for analysing gene expression data.

Authors:  Carsten O Daub; Ralf Steuer; Joachim Selbig; Sebastian Kloska
Journal:  BMC Bioinformatics       Date:  2004-08-31       Impact factor: 3.169

Cited by:  6 in total

1.  smallWig: parallel compression of RNA-seq WIG files.

Authors:  Zhiying Wang; Tsachy Weissman; Olgica Milenkovic
Journal:  Bioinformatics       Date:  2015-09-30       Impact factor: 6.937

2.  Shannon Entropy Estimation in ∞-Alphabets from Convergence Results: Studying Plug-In Estimators.

Authors:  Jorge F Silva
Journal:  Entropy (Basel)       Date:  2018-05-23       Impact factor: 2.524

3.  Minimax Rate-optimal Estimation of KL Divergence between Discrete Distributions.

Authors:  Yanjun Han; Jiantao Jiao; Tsachy Weissman
Journal:  Int Symp Inf Theory Appl       Date:  2016

4.  On the Information Bottleneck Problems: Models, Connections, Applications and Information Theoretic Views.

Authors:  Abdellatif Zaidi; Iñaki Estella-Aguerri; Shlomo Shamai Shitz
Journal:  Entropy (Basel)       Date:  2020-01-27       Impact factor: 2.524

5.  Ensemble Estimation of Information Divergence.

Authors:  Kevin R Moon; Kumar Sricharan; Kristjan Greenewald; Alfred O Hero
Journal:  Entropy (Basel)       Date:  2018-07-27       Impact factor: 2.524

6.  Empirical Estimation of Information Measures: A Literature Guide.

Authors:  Sergio Verdú
Journal:  Entropy (Basel)       Date:  2019-07-24       Impact factor: 2.524

