
Coverage-adjusted entropy estimation.

Vincent Q Vu, Bin Yu, Robert E Kass.

Abstract

Data on 'neural coding' have frequently been analyzed using information-theoretic measures. These formulations involve the fundamental and generally difficult statistical problem of estimating entropy. We review briefly several methods that have been advanced to estimate entropy and highlight a method, the coverage-adjusted entropy estimator (CAE), due to Chao and Shen that appeared recently in the environmental statistics literature. This method begins with the elementary Horvitz-Thompson estimator, developed for sampling from a finite population, and adjusts for the potential new species that have not yet been observed in the sample; these become the new patterns or 'words' in a spike train that have not yet been observed. The adjustment is due to I. J. Good, and is called the Good-Turing coverage estimate. We provide a new empirical regularization derivation of the coverage-adjusted probability estimator, which shrinks the maximum likelihood estimate. We prove that the CAE is consistent and first-order optimal, with rate O_P(1/log n), in the class of distributions with finite entropy variance and that, within the class of distributions with finite qth moment of the log-likelihood, the Good-Turing coverage estimate and the total probability of unobserved words converge at rate O_P(1/(log n)^q). We then provide a simulation study of the estimator with standard distributions and examples from neuronal data, where observations are dependent. The results show that, with a minor modification, the CAE performs much better than the MLE and is better than the best upper bound estimator, due to Paninski, when the number of possible words m is unknown or infinite. © 2007 John Wiley & Sons, Ltd.
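The construction sketched in the abstract can be written down in a few lines: take the plug-in (MLE) probabilities, shrink them by the Good-Turing sample coverage, and feed the result through the Horvitz-Thompson sum. The Python sketch below is an illustration under an i.i.d. sampling assumption, not the authors' code; the function name is invented for this example, the singleton guard is a common numerical convention rather than part of the paper, and the result is in nats (divide by log 2 for bits).

```python
import numpy as np

def coverage_adjusted_entropy(counts):
    """Chao-Shen coverage-adjusted entropy estimate (in nats)
    from a vector of observed word counts."""
    counts = np.asarray(counts, dtype=float)
    counts = counts[counts > 0]
    n = counts.sum()
    # Plug-in (MLE) probabilities of the observed words
    p_mle = counts / n
    # Good-Turing sample coverage: one minus the fraction of singletons
    f1 = np.sum(counts == 1.0)
    if f1 == n:  # every word seen once; a common guard, not from the paper
        f1 = n - 1.0
    coverage = 1.0 - f1 / n
    # Coverage-adjusted probabilities: shrink the MLE toward zero
    p_adj = coverage * p_mle
    # Horvitz-Thompson weighting: divide each term by the probability
    # that the corresponding word appears at least once in n draws
    inclusion = 1.0 - (1.0 - p_adj) ** n
    return -np.sum(p_adj * np.log(p_adj) / inclusion)

# Example: compare with the plug-in MLE on a small count vector
counts = np.array([5, 3, 1, 1])
p = counts / counts.sum()
print(coverage_adjusted_entropy(counts))  # coverage-adjusted estimate
print(-np.sum(p * np.log(p)))             # plug-in MLE estimate
```

Because the coverage adjustment accounts for words not yet observed, the CAE typically sits above the plug-in estimate, which is biased downward in undersampled regimes.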

Mesh:

Year:  2007        PMID: 17567838     DOI: 10.1002/sim.2942

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


Related articles: 8 in total

1.  Anthropic Correction of Information Estimates and Its Application to Neural Coding.

Authors:  Michael C Gastpar; Patrick R Gill; Alexander G Huth; Frédéric E Theunissen
Journal:  IEEE Trans Inf Theory       Date:  2010-02-25       Impact factor: 2.501

2.  Model for comparative analysis of antigen receptor repertoires.

Authors:  Grzegorz A Rempala; Michał Seweryn; Leszek Ignatowicz
Journal:  J Theor Biol       Date:  2010-10-16       Impact factor: 2.691

3.  On Generalized Schürmann Entropy Estimators.

Authors:  Peter Grassberger
Journal:  Entropy (Basel)       Date:  2022-05-11       Impact factor: 2.738

4.  Minimax Estimation of Functionals of Discrete Distributions.

Authors:  Jiantao Jiao; Kartik Venkat; Yanjun Han; Tsachy Weissman
Journal:  IEEE Trans Inf Theory       Date:  2015-03-13       Impact factor: 2.501

5.  Methods for diversity and overlap analysis in T-cell receptor populations.

Authors:  Grzegorz A Rempala; Michal Seweryn
Journal:  J Math Biol       Date:  2012-09-25       Impact factor: 2.259

6.  Information in the nonstationary case.

Authors:  Vincent Q Vu; Bin Yu; Robert E Kass
Journal:  Neural Comput       Date:  2009-03       Impact factor: 2.026

7.  Selecting an Effective Entropy Estimator for Short Sequences of Bits and Bytes with Maximum Entropy.

Authors:  Lianet Contreras Rodríguez; Evaristo José Madarro-Capó; Carlos Miguel Legón-Pérez; Omar Rojas; Guillermo Sosa-Gómez
Journal:  Entropy (Basel)       Date:  2021-04-30       Impact factor: 2.524

8.  Estimating diversity in networked ecological communities.

Authors:  Amy D Willis; Bryan D Martin
Journal:  Biostatistics       Date:  2022-01-13       Impact factor: 5.899

