
The Choice of the Number of Bins for the M Statistic.

Laura Forsberg White, Marco Bonetti, Marcello Pagano.

Abstract

Methods to monitor spatial patterns of disease in populations are of interest in public health practice. The M statistic uses interpoint distances between cases to detect abnormalities in the spatial patterns of diseases. This statistic compares the observed distribution of interpoint distances with that which is expected when no unusual spatial patterns exist. We show the relationship of M to Pearson's chi-square statistic, $\chi^2_n$. Both statistics require the discretization of continuous data into bins and are then formed by creating a quadratic form, scaled by an appropriate variance-covariance matrix. We seek to choose the number and type of these bins for the M statistic so as to maximize the power to detect spatial anomalies. By showing the relationship between M and $\chi^2_n$, we argue for extending the theory developed for selecting the number and type of bins for $\chi^2_n$ to M. We further show, through examples with simulated data and spatial data from a health care provider, that spatial data provide a unique insight into the problem. In the spatial setting, these examples indicate that the optimal number of bins depends on the size of the cluster. For large clusters, a smaller number of bins appears to be preferable; for small clusters, however, having many bins increases the power. Further, results indicate that the optimal number of bins does not appear to vary with m, the number of spatial locations. We discuss the implications of this result for further work.
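
As a rough illustration of the binning idea described in the abstract, the following Python sketch bins interpoint distances and compares observed with expected bin proportions. It is a simplified Pearson-type statistic under stated assumptions, not the authors' M statistic: it scales each squared deviation by the expected proportion only, whereas M scales the quadratic form by a full variance-covariance matrix that accounts for the dependence among distances. The function names, the equiprobable bin edges, the uniform null sample on the unit square, and the simulated cluster are all hypothetical choices made for this example.

```python
import math
import random
from bisect import bisect_right
from itertools import combinations


def interpoint_distances(points):
    """Return all pairwise Euclidean distances between 2-D points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def equiprobable_edges(null_distances, n_bins):
    """Interior bin edges at the k/n_bins quantiles of the null distances."""
    ordered = sorted(null_distances)
    return [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]


def bin_proportions(distances, edges):
    """Proportion of distances falling into each bin defined by the edges."""
    counts = [0] * (len(edges) + 1)
    for d in distances:
        counts[bisect_right(edges, d)] += 1
    return [c / len(distances) for c in counts]


def pearson_type_statistic(observed, expected):
    """Squared deviations scaled by expected proportions (diagonal scaling only)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


rng = random.Random(0)

# Assumed null pattern: locations uniform on the unit square.
null_points = [(rng.random(), rng.random()) for _ in range(200)]
null_dist = interpoint_distances(null_points)

n_bins = 10  # the tuning parameter whose choice the abstract discusses
edges = equiprobable_edges(null_dist, n_bins)
expected = bin_proportions(null_dist, edges)

# Assumed alternative: 80 uniform locations plus a tight cluster of 20.
cases = [(rng.random(), rng.random()) for _ in range(80)]
cases += [(0.5 + rng.gauss(0, 0.02), 0.5 + rng.gauss(0, 0.02)) for _ in range(20)]
observed = bin_proportions(interpoint_distances(cases), edges)

stat = pearson_type_statistic(observed, expected)
print(f"{n_bins} bins: statistic = {stat:.4f}")
```

Rerunning the sketch with different values of `n_bins` and different cluster sizes gives a feel for how the binning interacts with the size of the cluster, although a proper power comparison would need a reference distribution for the statistic under the null pattern.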


Year:  2009        PMID: 20161224      PMCID: PMC2703493          DOI: 10.1016/j.csda.2009.03.005

Source DB:  PubMed          Journal:  Comput Stat Data Anal        ISSN: 0167-9473            Impact factor:   1.681


  2 in total

1.  The interpoint distance distribution as a descriptor of point patterns, with an application to spatial disease clustering.

Authors:  Marco Bonetti; Marcello Pagano
Journal:  Stat Med       Date:  2005-03-15       Impact factor: 2.373

2.  Bivariate method for spatio-temporal syndromic surveillance.

Authors:  Al Ozonoff; L Forsberg; M Bonetti; M Pagano
Journal:  MMWR Suppl       Date:  2004-09-24
  2 in total

1.  On the simultaneous association analysis of large genomic regions: a massive multi-locus association test.

Authors:  Dandi Qiao; Michael H Cho; Heide Fier; Per S Bakke; Amund Gulsvik; Edwin K Silverman; Christoph Lange
Journal:  Bioinformatics       Date:  2013-11-20       Impact factor: 6.937

2.  Improving the power of chronic disease surveillance by incorporating residential history.

Authors:  Justin Manjourides; Marcello Pagano
Journal:  Stat Med       Date:  2011-05-11       Impact factor: 2.373

