Kunihiko Takahashi, Hideyasu Shimadzu.
Abstract
The spatial scan statistic is commonly used to detect spatial and/or temporal disease clusters in epidemiological studies. Although multiple clusters in the study space can thus be identified, current theoretical developments are mainly based on detecting a 'single' cluster. The standard scan statistic procedure enables the detection of multiple clusters by recursively identifying additional 'secondary' clusters. However, their p-values are calculated one at a time, as if each cluster were a primary one. Therefore, a new procedure that can accurately evaluate multiple clusters as a whole is needed. The present study focuses on purely temporal cases and proposes a new test procedure that evaluates the p-value for multiple clusters, combining generalized linear models with an information criterion approach. This framework encompasses the conventional, currently widely used detection procedure as a special case. An application study adopting the new framework is presented, analysing the Japanese daily incidence of out-of-hospital cardiac arrest cases. The analysis reveals that the number of incidents increases around New Year's Day in Japan. Further, simulation studies confirm that the proposed method possesses a consistency property: it tends to select the correct number of clusters when the truth is known.
Year: 2018 PMID: 30462741 PMCID: PMC6249023 DOI: 10.1371/journal.pone.0207821
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
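To make the scan idea from the abstract concrete, the following is a minimal sketch of a purely temporal scan for the single most likely cluster. It assumes a Kulldorff-type Poisson log-likelihood ratio, which this record does not spell out. The function names, the toy counts and the `max_len` window limit are illustrative assumptions, not taken from the paper. Significance testing (usually by Monte Carlo replication) is omitted.

```python
import math

def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for a window with c observed and e expected cases.

    Assumes the expected counts have been rescaled to sum to `total`.
    Only windows with an excess of cases (c > e) score above zero.
    """
    if c <= e:
        return 0.0
    rest = total - c
    inside = c * math.log(c / e)
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return inside + outside

def temporal_scan(counts, expected, max_len):
    """Return (best LLR, (start, end)) over all windows of at most max_len days.

    start and end are inclusive day indices; the window is None if no excess exists.
    """
    total = sum(counts)
    scale = total / sum(expected)            # make expected counts sum to the total
    scaled = [x * scale for x in expected]
    best_llr, best_window = 0.0, None
    n = len(counts)
    for start in range(n):
        c = e = 0.0
        for end in range(start, min(start + max_len, n)):
            c += counts[end]
            e += scaled[end]
            llr = poisson_llr(c, e, total)
            if llr > best_llr:
                best_llr, best_window = llr, (start, end)
    return best_llr, best_window

# Hypothetical toy series: ten days with a two-day excess on days 3 and 4.
toy_counts = [98, 103, 95, 150, 160, 101, 97, 104, 99, 102]
toy_expected = [100.0] * 10
best_llr, best_window = temporal_scan(toy_counts, toy_expected, max_len=3)
print(best_window, round(best_llr, 2))
```

A secondary-cluster procedure would repeat this search after setting aside the detected window, testing each new window as if it were the primary one, which is the behaviour the abstract criticises.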
Assumed cluster periods with expected counts under the null model in simulation studies.
| cluster | period | # days | expected counts |
|---|---|---|---|
| A | 2006/01/01–03 | 3 | (115.10, 131.43, 122.34) |
| B | 2005/01/01–03 | 3 | (103.08, 108.53, 124.18) |
| C | 2005/04/01–03 | 3 | (76.44, 77.10, 84.35) |
| D | 2005/02/01 | 1 | (87.27) |
| E | 2007/01/01–05 | 5 | (127.77, 117.05, 114.54, 99.36, 100.34) |
| F | 2007/04/01–05 | 5 | (84.83, 81.85, 78.47, 79.52, 82.61) |
Assumed relative risks in simulation studies.
| scenario | # clusters | # days | RR (A) | RR (B) | RR (C) | RR (D) | RR (E) | RR (F) |
|---|---|---|---|---|---|---|---|---|
| S0 | 0 | 0 | ||||||
| S1 | 1 | 3 | 1.5 | |||||
| S2.1 | 3 | 9 | 1.2 | 1.2 | 1.2 | |||
| S2.2 | 3 | 9 | 1.3 | 1.3 | 1.3 | |||
| S2.3 | 3 | 9 | 1.5 | 1.5 | 1.5 | |||
| S2.4 | 3 | 9 | 2.0 | 2.0 | 2.0 | |||
| S3.1 | 6 | 20 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 |
| S3.2 | 6 | 20 | 1.3 | 1.5 | 2.0 | 2.0 | 1.3 | 2.0 |
* Blank cells denote a relative risk (RR) of 1.
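The two simulation tables can be combined into per-day means under each scenario. The sketch below assumes that a relative risk multiplies the null expected count of every day in its cluster period, which is the usual reading of RR but is not stated explicitly in this record. The name `scenario_means` is hypothetical.

```python
# Null expected counts for the assumed cluster periods (from the first simulation table).
NULL_EXPECTED = {
    "A": (115.10, 131.43, 122.34),
    "B": (103.08, 108.53, 124.18),
    "C": (76.44, 77.10, 84.35),
    "D": (87.27,),
    "E": (127.77, 117.05, 114.54, 99.36, 100.34),
    "F": (84.83, 81.85, 78.47, 79.52, 82.61),
}

def scenario_means(rr_by_period):
    """Per-day means for each period; periods not listed keep RR = 1 (blank cells)."""
    return {
        period: tuple(rr_by_period.get(period, 1.0) * e for e in expected)
        for period, expected in NULL_EXPECTED.items()
    }

# Scenario S1: a single three-day cluster in period A with RR = 1.5.
s1 = scenario_means({"A": 1.5})

# Scenario S3.2: six clusters with mixed relative risks.
s32 = scenario_means({"A": 1.3, "B": 1.5, "C": 2.0, "D": 2.0, "E": 1.3, "F": 2.0})

for period, means in s1.items():
    print(period, [round(m, 2) for m in means])
```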
Fig 1. The black solid line represents the daily incidence of out-of-hospital cardiac arrest (OHCA), male non-cardiac cases (MNC), in Japan from 1 January 2005 to 10 March 2011 (m = 2,260 days); there were 185,819 cases in total, and the maximum and minimum daily counts were 244 and 43, respectively. The grey line overlays the null expected counts (Takahashi and Shimadzu [28]).
Detected significant temporal clusters in daily incidence of out-of-hospital cardiac arrest (OHCA), male non-cardiac cases (MNC), by the secondary-cluster procedure.
| cluster rank | clustered period | cases | expected | RR | p-value |
|---|---|---|---|---|---|
| 1 | 2010/01/01–2010/01/02 | 423 | 232.43 | 1.82 | 0.001 |
| 2 | 2005/01/01–2005/01/02 | 381 | 211.61 | 1.80 | 0.001 |
| 3 | 2011/01/01 | 228 | 111.40 | 2.05 | 0.001 |
| 4 | 2008/12/31–2009/01/03 | 630 | 444.54 | 1.42 | 0.001 |
| 5 | 2006/01/01 | 207 | 115.10 | 1.80 | 0.001 |
| 6 | 2008/01/01–2008/01/04 | 614 | 446.72 | 1.37 | 0.001 |
| 7 | 2007/01/01–2007/01/02 | 344 | 244.83 | 1.41 | 0.001 |
| 8 | 2011/01/02–2011/01/06 | 711 | 589.98 | 1.21 | 0.011 |
RR: relative risk
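As a quick consistency check, the RR column can be recomputed from the case and expected-count columns. The sketch assumes that RR here is simply observed cases divided by expected cases; the rounded table values agree with that reading, but the record does not define it. The helper name `relative_risk` is illustrative.

```python
# (clustered period, observed cases, expected cases, reported RR) from the table above.
CLUSTERS = [
    ("2010/01/01-2010/01/02", 423, 232.43, 1.82),
    ("2005/01/01-2005/01/02", 381, 211.61, 1.80),
    ("2011/01/01", 228, 111.40, 2.05),
    ("2008/12/31-2009/01/03", 630, 444.54, 1.42),
    ("2006/01/01", 207, 115.10, 1.80),
    ("2008/01/01-2008/01/04", 614, 446.72, 1.37),
    ("2007/01/01-2007/01/02", 344, 244.83, 1.41),
    ("2011/01/02-2011/01/06", 711, 589.98, 1.21),
]

def relative_risk(cases, expected):
    """Observed-to-expected ratio (assumed definition of RR)."""
    return cases / expected

# Recompute each RR to two decimals and pair it with the reported value.
recomputed = [
    (period, round(relative_risk(cases, expected), 2), reported)
    for period, cases, expected, reported in CLUSTERS
]

for period, ours, reported in recomputed:
    print(f"{period}: recomputed {ours:.2f}, reported {reported:.2f}")
```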
Fig 2. Histogram of the null distribution of RDC(K).
Detected multiple-cluster model by the proposed procedure.
| clustered period | coef. | OR | 95%CI | p-value |
|---|---|---|---|---|
| intercept | −0.006 | | | 0.0077 |
| 2010/01/01–2010/01/02 | 0.605 | 1.831 | (1.662, 2.012) | < 0.0001 |
| 2005/01/01–2005/01/02 | 0.594 | 1.812 | (1.636, 2.000) | < 0.0001 |
| 2011/01/01 | 0.722 | 2.059 | (1.803, 2.339) | < 0.0001 |
| 2008/12/31–2009/01/03 | 0.355 | 1.426 | (1.317, 1.541) | < 0.0001 |
| 2006/01/01 | 0.593 | 1.810 | (1.574, 2.068) | < 0.0001 |
| 2008/01/01–2008/01/04 | 0.324 | 1.383 | (1.276, 1.500) | < 0.0001 |
| 2007/01/01–2007/01/02 | 0.346 | 1.414 | (1.270, 1.569) | < 0.0001 |
| 2011/01/02–2011/01/06 | 0.193 | 1.213 | (1.126, 1.304) | < 0.0001 |
coef.: estimated coefficient; OR: odds ratio; 95%CI: 95% confidence interval of the OR; p-value: the p-value of the estimated coefficient.
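The OR column appears to be the exponential of the coefficient column, as expected for a model with a log link. The sketch below checks this under that assumption; small gaps are expected because the printed coefficients are rounded to three decimals. The name `odds_ratio` is illustrative.

```python
import math

# (clustered period, estimated coefficient, reported OR) from the table above.
COEFS = [
    ("2010/01/01-2010/01/02", 0.605, 1.831),
    ("2005/01/01-2005/01/02", 0.594, 1.812),
    ("2011/01/01", 0.722, 2.059),
    ("2008/12/31-2009/01/03", 0.355, 1.426),
    ("2006/01/01", 0.593, 1.810),
    ("2008/01/01-2008/01/04", 0.324, 1.383),
    ("2007/01/01-2007/01/02", 0.346, 1.414),
    ("2011/01/02-2011/01/06", 0.193, 1.213),
]

def odds_ratio(coef):
    """Exponentiate a log-scale coefficient (assumed log link)."""
    return math.exp(coef)

# Largest absolute difference between exp(coef) and the reported OR.
max_gap = max(abs(odds_ratio(coef) - reported) for _, coef, reported in COEFS)

for period, coef, reported in COEFS:
    print(f"{period}: exp({coef}) = {odds_ratio(coef):.3f}, reported {reported:.3f}")
print(f"largest gap: {max_gap:.4f}")
```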
Fig 3. Values of the proposed criterion C for each K, together with other criteria: −2 log-likelihood (−2 log L), AIC and BIC.
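For orientation, the three comparison criteria named in the Fig 3 caption can be computed for a Poisson model with one indicator per cluster period. This is a minimal sketch under stated assumptions: clusters are disjoint, the parameter count is one intercept plus one coefficient per cluster, and the toy data are hypothetical. The proposed criterion C is not defined in this record, so it is not implemented here.

```python
import math

def poisson_loglik(counts, means):
    """Poisson log-likelihood, including the log(y!) term via lgamma."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))

def fit_cluster_model(counts, expected, clusters):
    """Fit a log-linear Poisson model with one indicator per disjoint cluster.

    clusters: list of (start, end) index pairs, end exclusive.
    With disjoint indicators the MLE is closed-form: each group's fitted
    multiplier is its observed total divided by its expected total.
    """
    group = [None] * len(counts)             # None marks a baseline day
    for k, (start, end) in enumerate(clusters):
        for t in range(start, end):
            group[t] = k
    obs, exp = {}, {}
    for y, e, g in zip(counts, expected, group):
        obs[g] = obs.get(g, 0) + y
        exp[g] = exp.get(g, 0.0) + e
    means = [obs[g] / exp[g] * e for e, g in zip(expected, group)]
    n_params = 1 + len(clusters)             # assumed parameter count
    m2ll = -2.0 * poisson_loglik(counts, means)
    return {
        "-2logL": m2ll,
        "AIC": m2ll + 2 * n_params,
        "BIC": m2ll + n_params * math.log(len(counts)),
    }

# Hypothetical toy series with a two-day excess on days 3 and 4.
toy_counts = [98, 103, 95, 150, 160, 101, 97, 104, 99, 102]
toy_expected = [100.0] * 10

null_fit = fit_cluster_model(toy_counts, toy_expected, [])
one_fit = fit_cluster_model(toy_counts, toy_expected, [(3, 5)])
print(null_fit)
print(one_fit)
```

Adding a cluster can never increase −2 log L, which is why a penalised criterion is needed to choose the number of clusters K.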
The power of the secondary-cluster and proposed procedures in the simulation study, along with the sensitivity (Sen) and positive predictive value (PPV) of the days detected as significant.
| procedure | N.S. | detected K ≥ 1 (nonzero entries, increasing K) | total | Sen* | Sen = 1 | PPV* | PPV = 1 |
|---|---|---|---|---|---|---|---|
| S0: null, RR = 1.0 | | | | | | | |
| s-c proc. | 0.951 | 0.049 | 0.049 | – | – | – | – |
| proposed proc. | 0.951 | 0.049 | 0.049 | – | – | – | – |
| S1: one cluster (three days), RR = 1.5 | | | | | | | |
| s-c proc. | | 0.959, 0.040, 0.001 | 1.000 | 0.996 | 0.988 | 0.971 | 0.919 |
| proposed proc. | | 0.994, 0.006 | 1.000 | 0.996 | 0.988 | 0.986 | 0.954 |
| S2.1: three clusters (nine days), RR = 1.2 | | | | | | | |
| s-c proc. | 0.430 | 0.436, 0.123, 0.011 | 0.570 | 0.361 | 0.003 | 0.870 | 0.376 |
| proposed proc. | 0.426 | 0.531, 0.043 | 0.574 | 0.313 | 0.000 | 0.878 | 0.405 |
| S2.2: three clusters (nine days), RR = 1.3 | | | | | | | |
| s-c proc. | 0.009 | 0.118, 0.455, 0.406, 0.012 | 0.991 | 0.728 | 0.266 | 0.935 | 0.658 |
| proposed proc. | 0.009 | 0.315, 0.469, 0.206, 0.001 | 0.991 | 0.605 | 0.140 | 0.946 | 0.734 |
| S2.3: three clusters (nine days), RR = 1.5 | | | | | | | |
| s-c proc. | | 0.004, 0.962, 0.034 | 1.000 | 0.992 | 0.940 | 0.978 | 0.847 |
| proposed proc. | | 0.011, 0.984, 0.005 | 1.000 | 0.990 | 0.934 | 0.984 | 0.872 |
| S2.4: three clusters (nine days), RR = 2.0 | | | | | | | |
| s-c proc. | | 0.960, 0.039, 0.001 | 1.000 | 1.000 | 1.000 | 0.991 | 0.957 |
| proposed proc. | | 0.990, 0.010 | 1.000 | 1.000 | 1.000 | 0.997 | 0.987 |
| S3.1: six clusters (20 days), RR = 2.0 | | | | | | | |
| s-c proc. | | 0.976, 0.024 | 1.000 | 0.999 | 0.999 | 0.997 | 0.972 |
| proposed proc. | | 0.002, 0.989, 0.009 | 1.000 | 0.999 | 0.997 | 0.998 | 0.987 |
| S3.2: six clusters (20 days), RR = 1.3, 1.5, 2.0 | | | | | | | |
| s-c proc. | | 0.003, 0.170, 0.795, 0.032 | 1.000 | 0.952 | 0.597 | 0.976 | 0.669 |
| proposed proc. | | 0.012, 0.276, 0.704, 0.008 | 1.000 | 0.934 | 0.526 | 0.979 | 0.699 |
* Averages among the clusters detected when K > 0; Sen = 1: #{Sen = 1}/1000; PPV = 1: #{PPV = 1}/1000; s-c proc.: secondary-cluster procedure.
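The day-level sensitivity and PPV reported above can be illustrated with a small sketch. It assumes the standard definitions (the share of true cluster days that were detected, and the share of detected days that are true cluster days); the paper's exact definitions are not given in this record. The function name and the detected days in the example are hypothetical.

```python
def day_level_metrics(true_days, detected_days):
    """Return (sensitivity, PPV) for sets of day labels.

    Sensitivity: detected true days / all true cluster days.
    PPV: detected true days / all detected days.
    Either value is None when its denominator is empty.
    """
    true_days, detected_days = set(true_days), set(detected_days)
    hits = true_days & detected_days
    sen = len(hits) / len(true_days) if true_days else None
    ppv = len(hits) / len(detected_days) if detected_days else None
    return sen, ppv

# True cluster: period A (2006/01/01 to 2006/01/03) from the simulation setup.
true_a = ["2006-01-01", "2006-01-02", "2006-01-03"]

# Hypothetical detection that overshoots the true period by one day.
detected = ["2006-01-01", "2006-01-02", "2006-01-03", "2006-01-04"]

sen, ppv = day_level_metrics(true_a, detected)
print(sen, ppv)
```

In this example every true day is found, so the sensitivity is 1, while the extra detected day lowers the PPV to 0.75.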