Jiyu Kim1, Inkyung Jung1.
Abstract
Spatial scan statistics with circular or elliptic scanning windows are commonly used for cluster detection in various applications, such as the identification of geographical disease clusters from epidemiological data. It has been pointed out that the method may have difficulty in correctly identifying non-compact, arbitrarily shaped clusters. In this paper, we evaluated the Gini coefficient for detecting irregularly shaped clusters through a simulation study. The Gini coefficient, the use of which in spatial scan statistics was recently proposed, is a criterion measure for optimizing the maximum reported cluster size. Our simulation study results showed that using the Gini coefficient works better than the original spatial scan statistic for identifying irregularly shaped clusters, by reporting an optimized and refined collection of clusters rather than a single larger cluster. We have provided a real data example that seems to support the simulation results. We think that using the Gini coefficient in spatial scan statistics can be helpful for the detection of irregularly shaped clusters.
Year: 2017 PMID: 28129368 PMCID: PMC5271318 DOI: 10.1371/journal.pone.0170736
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
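The Gini coefficient mentioned in the abstract can be illustrated with a small Python sketch. It assumes, as a simplification that may differ from the authors' exact procedure, that each reported cluster is summarized by its observed and expected case counts, that clusters are ordered by decreasing observed/expected ratio, and that the remainder of the study area forms the last segment of the Lorenz curve. The function name `gini_for_clusters`, the candidate labels, and all counts are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Cluster = Tuple[float, float]  # (observed cases, expected cases)


def gini_for_clusters(clusters: Sequence[Cluster],
                      total_observed: float,
                      total_expected: float) -> float:
    """Gini coefficient of a Lorenz curve built from non-overlapping clusters."""
    # Highest observed/expected ratio first, so the curve lies above the diagonal.
    ordered = sorted(clusters, key=lambda c: c[0] / c[1], reverse=True)
    obs_in = sum(o for o, _ in ordered)
    exp_in = sum(e for _, e in ordered)
    # Everything outside the reported clusters is the final segment.
    segments = ordered + [(total_observed - obs_in, total_expected - exp_in)]

    x_prev = y_prev = area = 0.0
    for obs, exp in segments:
        x = x_prev + exp / total_expected   # cumulative share of expected cases
        y = y_prev + obs / total_observed   # cumulative share of observed cases
        area += (x - x_prev) * (y + y_prev) / 2.0  # trapezoid under the curve
        x_prev, y_prev = x, y
    # Twice the area between the Lorenz curve and the diagonal.
    return 2.0 * area - 1.0


# Hypothetical cluster collections for two maximum reported cluster sizes.
candidates: Dict[str, List[Cluster]] = {
    "max 10%": [(30.0, 10.0)],
    "max 50%": [(50.0, 40.0)],
}
best = max(candidates,
           key=lambda k: gini_for_clusters(candidates[k], 100.0, 100.0))
print(best)
```

Under this reading, the maximum reported cluster size whose collection of significant clusters gives the largest coefficient is the one that would be reported.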
Number of clusters and districts in the clusters of simulated cluster models A–G.
| Cluster model | Number of clusters | Number of districts |
|---|---|---|
| A | 1 | 11 |
| B | 1 | 12 |
| C | 1 | 14 |
| D | 2 | 2 / 5 |
| E | 2 | 7 / 11 |
| F | 3 | 2 / 2 / 4 |
| G | 1 | 8 |
Fig 1. Simulated cluster models A–G.
Sensitivity and PPV for 7 cluster scenarios with RR = 1.3.
| Model | Measure | CS | ES | GCS | GES | OF | RC | RF |
|---|---|---|---|---|---|---|---|---|
| A | Sensitivity | 0.543 | 0.661 | 0.546 | 0.690 | 0.676 | 0.450 | 0.698 |
| | PPV | 0.793 | 0.904 | 0.794 | 0.906 | 0.830 | 0.968 | 0.876 |
| | Power | 0.982 | 0.998 | 0.982 | 0.998 | 0.998 | 0.999 | 0.999 |
| B | Sensitivity | 0.640 | 0.638 | 0.719 | 0.810 | 0.824 | 0.688 | 0.837 |
| | PPV | 0.707 | 0.807 | 0.827 | 0.877 | 0.937 | 0.973 | 0.952 |
| C | Sensitivity | 0.510 | 0.709 | 0.570 | 0.766 | 0.782 | 0.495 | 0.756 |
| | PPV | 0.798 | 0.946 | 0.830 | 0.918 | 0.913 | 0.958 | 0.947 |
| D | Sensitivity | 0.672 | 0.603 | 0.695 | 0.755 | 0.763 | 0.567 | 0.786 |
| | PPV | 0.610 | 0.783 | 0.639 | 0.788 | 0.825 | 0.938 | 0.896 |
| E | Sensitivity | 0.702 | 0.732 | 0.731 | 0.793 | 0.757 | 0.708 | 0.784 |
| | PPV | 0.832 | 0.869 | 0.914 | 0.934 | 0.867 | 0.983 | 0.963 |
| F | Sensitivity | 0.698 | 0.756 | 0.714 | 0.830 | 0.949 | 0.663 | 0.879 |
| | PPV | 0.727 | 0.831 | 0.766 | 0.851 | 0.742 | 0.941 | 0.878 |
| | Power | 0.999 | 0.999 | 0.999 | 0.999 | 1.000 | 0.998 | 1.000 |
| G | Sensitivity | 0.871 | 0.960 | 0.873 | 0.959 | 0.945 | 0.832 | 0.942 |
| | PPV | 0.958 | 0.988 | 0.945 | 0.979 | 0.958 | 0.989 | 0.961 |
The usual power is 1 for each method under scenarios B, C, D, and E. CS: circular spatial scan statistic; ES: elliptic spatial scan statistic; GCS: circular spatial scan statistic using the Gini coefficient; GES: elliptic spatial scan statistic using the Gini coefficient; OF: flexible spatial scan statistic; RC: circular spatial scan statistic with a restricted likelihood ratio; RF: flexible spatial scan statistic with a restricted likelihood ratio.
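For reference, the sensitivity and PPV of a detected cluster can be computed as in the following minimal sketch. It assumes both measures are based on district counts rather than population, and that the tabulated values are averages over simulation replicates. The district identifiers and the two replicates are made up for illustration.

```python
from statistics import mean
from typing import Set, Tuple


def sensitivity_ppv(detected: Set[int], true_cluster: Set[int]) -> Tuple[float, float]:
    """Sensitivity and positive predictive value of one detected cluster."""
    hits = len(detected & true_cluster)       # true-cluster districts that were found
    sensitivity = hits / len(true_cluster)    # share of the true cluster detected
    ppv = hits / len(detected) if detected else 0.0  # share of detections that are true
    return sensitivity, ppv


# Hypothetical true cluster of 11 districts, labelled 1..11.
true_cluster = set(range(1, 12))

# Two hypothetical simulation replicates (sets of detected district ids).
replicates = [
    set(range(1, 11)) | {20, 21},  # 10 true districts plus 2 false positives
    set(range(3, 12)),             # 9 true districts, no false positives
]

results = [sensitivity_ppv(d, true_cluster) for d in replicates]
mean_sensitivity = mean(s for s, _ in results)
mean_ppv = mean(p for _, p in results)
print(round(mean_sensitivity, 3), round(mean_ppv, 3))
```

A method that reports one large cluster tends to score high on sensitivity and low on PPV, while a method that reports a tighter cluster shows the opposite pattern, so the two measures are best read together.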
Sensitivity and PPV for 7 cluster scenarios with RR = 2.
| Model | Measure | CS | ES | GCS | GES | OF | RC | RF |
|---|---|---|---|---|---|---|---|---|
| A | Sensitivity | 0.931 | 0.805 | 0.941 | 0.999 | 1.000 | 1.000 | 1.000 |
| | PPV | 0.772 | 0.947 | 0.924 | 0.847 | 0.928 | 0.997 | 0.983 |
| B | Sensitivity | 0.740 | 0.641 | 0.999 | 0.985 | 1.000 | 0.999 | 1.000 |
| | PPV | 0.691 | 0.878 | 0.997 | 0.858 | 1.000 | 1.000 | 1.000 |
| C | Sensitivity | 0.700 | 0.847 | 0.989 | 0.926 | 0.995 | 0.989 | 0.995 |
| | PPV | 0.777 | 0.915 | 0.931 | 0.888 | 0.937 | 0.998 | 0.997 |
| D | Sensitivity | 0.838 | 0.573 | 0.999 | 1.000 | 0.996 | 0.999 | 0.998 |
| | PPV | 0.606 | 0.979 | 0.981 | 0.872 | 0.940 | 0.997 | 0.996 |
| E | Sensitivity | 0.935 | 0.787 | 0.999 | 0.927 | 0.960 | 0.999 | 0.998 |
| | PPV | 0.880 | 0.845 | 0.996 | 0.954 | 0.902 | 1.000 | 1.000 |
| F | Sensitivity | 0.934 | 0.855 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| | PPV | 0.782 | 0.871 | 0.965 | 0.938 | 0.802 | 0.996 | 0.999 |
| G | Sensitivity | 0.918 | 1.000 | 0.998 | 1.000 | 1.000 | 1.000 | 1.000 |
| | PPV | 0.955 | 1.000 | 0.865 | 0.930 | 1.000 | 0.999 | 1.000 |
The usual power is 1 for each method under all scenarios. CS: circular spatial scan statistic; ES: elliptic spatial scan statistic; GCS: circular spatial scan statistic using the Gini coefficient; GES: elliptic spatial scan statistic using the Gini coefficient; OF: flexible spatial scan statistic; RC: circular spatial scan statistic with a restricted likelihood ratio; RF: flexible spatial scan statistic with a restricted likelihood ratio.
Estimated bivariate power distributions P(l,s) × 1,000 of the 7 methods for cluster model A (RR = 1.5).
| Length l (CS) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total | Length l (ES) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 6 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 5 | 6 | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 7 |
| 7 | 0 | 0 | 20 | 0 | 0 | 0 | 0 | 20 | 7 | 0 | 1 | 63 | 0 | 0 | 0 | 0 | 64 |
| 8 | 0 | 0 | 0 | 24 | 0 | 0 | 0 | 24 | 8 | 0 | 0 | 11 | 409 | 0 | 0 | 0 | 420 |
| 9 | 0 | 0 | 3 | 1 | 6 | 0 | 0 | 10 | 9 | 0 | 0 | 0 | 111 | 41 | 0 | 0 | 152 |
| 10 | 0 | 1 | 2 | 14 | 0 | 0 | 0 | 17 | 10 | 0 | 0 | 2 | 15 | 46 | 71 | 0 | 134 |
| 11 | 0 | 0 | 1 | 14 | 33 | 1 | 0 | 49 | 11 | 0 | 0 | 0 | 5 | 14 | 17 | 52 | 88 |
| 12 | 0 | 0 | 0 | 1 | 170 | 8 | 0 | 179 | 12 | 0 | 0 | 0 | 0 | 3 | 7 | 79 | 89 |
| 13 | 0 | 0 | 0 | 0 | 2 | 522 | 1 | 525 | 13 | 0 | 0 | 0 | 0 | 2 | 4 | 3 | 9 |
| 14 | 0 | 0 | 0 | 0 | 1 | 7 | 114 | 122 | 14 | 0 | 0 | 0 | 0 | 1 | 2 | 5 | 8 |
| 15 | 0 | 0 | 0 | 0 | 2 | 1 | 1 | 4 | 15 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 |
| 16 | 0 | 0 | 0 | 1 | 1 | 8 | 2 | 12 | 16 | 0 | 0 | 0 | 0 | 0 | 9 | 10 | 19 |
| 17 | 0 | 0 | 0 | 0 | 0 | 5 | 24 | 29 | 17 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 6 |
| 18 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Total | 2 | 6 | 26 | 55 | 215 | 552 | 144 | 1000 | Total | 1 | 8 | 76 | 540 | 109 | 113 | 153 | 1000 |

| Length l (GCS) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total | Length l (GES) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 4 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 0 | 0 | 19 | 0 | 0 | 0 | 0 | 19 | 7 | 0 | 1 | 4 | 0 | 0 | 0 | 0 | 5 |
| 8 | 0 | 0 | 0 | 23 | 0 | 0 | 0 | 23 | 8 | 0 | 0 | 2 | 41 | 0 | 0 | 0 | 43 |
| 9 | 0 | 0 | 3 | 3 | 31 | 0 | 0 | 37 | 9 | 0 | 0 | 0 | 11 | 168 | 0 | 0 | 179 |
| 10 | 0 | 0 | 2 | 12 | 4 | 44 | 0 | 62 | 10 | 0 | 0 | 0 | 4 | 56 | 86 | 0 | 146 |
| 11 | 0 | 0 | 1 | 7 | 61 | 4 | 7 | 80 | 11 | 0 | 0 | 0 | 0 | 19 | 88 | 71 | 178 |
| 12 | 0 | 0 | 0 | 0 | 117 | 131 | 1 | 249 | 12 | 0 | 0 | 0 | 0 | 2 | 35 | 161 | 198 |
| 13 | 0 | 0 | 0 | 0 | 1 | 352 | 33 | 386 | 13 | 0 | 0 | 0 | 0 | 3 | 12 | 147 | 162 |
| 14 | 0 | 0 | 0 | 0 | 1 | 7 | 84 | 92 | 14 | 0 | 0 | 0 | 0 | 0 | 4 | 57 | 61 |
| 15 | 0 | 0 | 0 | 0 | 2 | 1 | 1 | 4 | 15 | 0 | 0 | 0 | 0 | 1 | 0 | 8 | 9 |
| 16 | 0 | 0 | 0 | 0 | 1 | 7 | 2 | 10 | 16 | 0 | 0 | 0 | 0 | 0 | 2 | 11 | 13 |
| 17 | 0 | 0 | 0 | 0 | 0 | 5 | 25 | 30 | 17 | 0 | 0 | 0 | 0 | 0 | 2 | 3 | 5 |
| 18 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Total | 2 | 4 | 25 | 45 | 218 | 551 | 155 | 1000 | Total | 0 | 1 | 6 | 56 | 249 | 229 | 459 | 1000 |

| Length l (OF) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total | Length l (RC) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 7 |
| 7 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 4 | 7 | 0 | 1 | 47 | 0 | 0 | 0 | 0 | 48 |
| 8 | 0 | 0 | 2 | 3 | 0 | 0 | 0 | 5 | 8 | 0 | 0 | 1 | 158 | 0 | 0 | 0 | 159 |
| 9 | 0 | 0 | 0 | 26 | 39 | 0 | 0 | 65 | 9 | 0 | 0 | 0 | 12 | 277 | 0 | 0 | 289 |
| 10 | 0 | 0 | 0 | 5 | 19 | 118 | 0 | 142 | 10 | 0 | 0 | 0 | 1 | 28 | 269 | 0 | 298 |
| 11 | 0 | 0 | 0 | 2 | 5 | 148 | 17 | 172 | 11 | 0 | 0 | 0 | 0 | 2 | 53 | 110 | 165 |
| 12 | 0 | 0 | 0 | 0 | 1 | 36 | 364 | 401 | 12 | 0 | 0 | 0 | 0 | 1 | 13 | 13 | 27 |
| 13 | 0 | 0 | 0 | 0 | 0 | 12 | 154 | 166 | 13 | 0 | 0 | 0 | 0 | 0 | 2 | 4 | 6 |
| 14 | 0 | 0 | 0 | 0 | 1 | 2 | 32 | 35 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 15 | 0 | 0 | 0 | 0 | 0 | 2 | 7 | 9 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 17 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 0 | 0 | 6 | 36 | 65 | 318 | 575 | 1000 | Total | 1 | 8 | 48 | 171 | 308 | 337 | 127 | 1000 |

| Length l (RF) | Included s = 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total |
|---|---|---|---|---|---|---|---|---|
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 4 |
| 8 | 0 | 0 | 2 | 6 | 0 | 0 | 0 | 8 |
| 9 | 0 | 0 | 0 | 7 | 60 | 0 | 0 | 67 |
| 10 | 0 | 0 | 0 | 2 | 24 | 343 | 0 | 369 |
| 11 | 0 | 0 | 0 | 1 | 2 | 161 | 115 | 279 |
| 12 | 0 | 0 | 0 | 0 | 0 | 30 | 162 | 192 |
| 13 | 0 | 0 | 0 | 0 | 0 | 8 | 59 | 67 |
| 14 | 0 | 0 | 0 | 0 | 1 | 1 | 10 | 12 |
| 15 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2 |
| Total | 0 | 0 | 6 | 16 | 87 | 544 | 347 | 1000 |
CS: circular spatial scan statistic; ES: elliptic spatial scan statistic; GCS: circular spatial scan statistic using the Gini coefficient; GES: elliptic spatial scan statistic using the Gini coefficient; OF: flexible spatial scan statistic; RC: circular spatial scan statistic with a restricted likelihood ratio; RF: flexible spatial scan statistic with a restricted likelihood ratio. A total of 1,000 trials were carried out.
*The usual power is 1000/1000.
#The number of districts in the true cluster for model A is 11.
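A bivariate power distribution of this kind can be tabulated from simulation output as sketched below. The sketch assumes that l is the number of districts in the detected cluster and s is the number of true-cluster districts it includes, and that only significant detections are counted. The function names and the three toy replicates are hypothetical. The last lines use the "Included" column totals for GES from the table above to show that they reproduce the first GES sensitivity (0.916) reported for RR = 1.5, given a true cluster of 11 districts and 1,000 trials.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def bivariate_power(replicates: Iterable[Tuple[Set[int], bool]],
                    true_cluster: Set[int]) -> Counter:
    """Count significant detections by (l, s).

    l: number of districts in the detected cluster
    s: number of true-cluster districts included in it
    """
    counts: Counter = Counter()
    for detected, significant in replicates:
        if significant:
            counts[(len(detected), len(detected & true_cluster))] += 1
    return counts


def mean_sensitivity(included_totals: Dict[int, int],
                     true_size: int, n_trials: int) -> float:
    """Average share of the true cluster detected, from the marginal totals of s."""
    return sum(s * n for s, n in included_totals.items()) / (true_size * n_trials)


# Toy example with a hypothetical 3-district true cluster.
toy = [({1, 2, 3}, True), ({1, 2, 9}, True), ({1}, False)]
toy_counts = bivariate_power(toy, {1, 2, 3})
print(dict(toy_counts))

# "Included" column totals for GES, copied from the table above.
ges_totals = {5: 0, 6: 1, 7: 6, 8: 56, 9: 249, 10: 229, 11: 459}
print(round(mean_sensitivity(ges_totals, true_size=11, n_trials=1000), 3))  # 0.916
```

Summing the counts over all (l, s) pairs and dividing by the number of trials gives the usual power, which is why each "Total" cell of 1000 corresponds to a power of 1.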
Sensitivity and PPV for 7 cluster scenarios with RR = 1.5.
| Model | Measure | CS | ES | GCS | GES | OF | RC | RF |
|---|---|---|---|---|---|---|---|---|
| A | Sensitivity | 0.883 | 0.791 | 0.886 | 0.916 | 0.947 | 0.845 | 0.928 |
| | PPV | 0.776 | 0.950 | 0.807 | 0.912 | 0.905 | 0.986 | 0.948 |
| B | Sensitivity | 0.713 | 0.667 | 0.911 | 0.924 | 0.977 | 0.939 | 0.982 |
| | PPV | 0.709 | 0.829 | 0.978 | 0.886 | 0.992 | 0.993 | 0.992 |
| C | Sensitivity | 0.643 | 0.800 | 0.839 | 0.878 | 0.935 | 0.782 | 0.907 |
| | PPV | 0.793 | 0.936 | 0.902 | 0.919 | 0.943 | 0.984 | 0.982 |
| D | Sensitivity | 0.791 | 0.584 | 0.871 | 0.978 | 0.922 | 0.885 | 0.902 |
| | PPV | 0.628 | 0.899 | 0.853 | 0.898 | 0.918 | 0.983 | 0.974 |
| E | Sensitivity | 0.893 | 0.770 | 0.944 | 0.886 | 0.905 | 0.943 | 0.932 |
| | PPV | 0.883 | 0.870 | 0.984 | 0.972 | 0.902 | 0.996 | 0.993 |
| F | Sensitivity | 0.881 | 0.855 | 0.945 | 0.957 | 0.998 | 0.963 | 0.995 |
| | PPV | 0.765 | 0.871 | 0.905 | 0.927 | 0.801 | 0.976 | 0.968 |
| G | Sensitivity | 0.888 | 0.992 | 0.939 | 0.986 | 0.989 | 0.965 | 0.989 |
| | PPV | 0.967 | 0.999 | 0.922 | 0.962 | 0.997 | 0.996 | 0.997 |
The usual power is 1 for each method under all scenarios. CS: circular spatial scan statistic; ES: elliptic spatial scan statistic; GCS: circular spatial scan statistic using the Gini coefficient; GES: elliptic spatial scan statistic using the Gini coefficient; OF: flexible spatial scan statistic; RC: circular spatial scan statistic with a restricted likelihood ratio; RF: flexible spatial scan statistic with a restricted likelihood ratio.
Fig 2. Spatial clusters with high mortality rates of male liver cancer in Seoul and Gyeonggi province in Korea for 2010–2013, detected by the 7 methods.
Most likely and secondary clusters of high rates of male liver cancer mortality in Seoul and Gyeonggi province in Korea for 2010–2013, detected by the 7 methods.
| Cluster | Measure | CS | ES | GCS | GES | OF | RC | RF |
|---|---|---|---|---|---|---|---|---|
| 1 | RR | 4.56 | 4.53 | 4.56 | 4.53 | 4.37 | 4.32 | 4.32 |
| | P-value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| | # Districts | 4 | 6 | 4 | 6 | 4 | 4 | 4 |
| 2 | RR | 2.00 | 2.18 | 4.02 | 2.18 | 1.89 | 3.92 | 3.92 |
| | P-value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| | # Districts | 6 | 3 | 2 | 3 | 9 | 2 | 2 |
| 3 | RR | 4.72 | 2.00 | 4.72 | 2.00 | 2.98 | 4.66 | 2.14 |
| | P-value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| | # Districts | 1 | 3 | 1 | 3 | 2 | 1 | 5 |
| 4 | RR | 1.64 | 1.64 | 1.95 | 1.77 | 1.95 | ||
| 0.0042 | <0.001 | <0.001 | 0.003 | <0.001 | ||||
| # Districts | 4 | 4 | 3 | 2 | 3 | |||
| 5 | RR | 1.8 | 1.84 | |||||
| <0.001 | 0.01 | |||||||
| # Districts | 2 | 2 | ||||||
| 6 | RR | 1.61 | 1.59 | |||||
| 0.012 | 0.015 | |||||||
| # Districts | 3 | 3 | ||||||
| 7 | RR | 2.23 | 2.21 | |||||
| 0.019 | 0.02 | |||||||
| # Districts | 1 | 1 | ||||||
| 8 | RR | 2.21 | 2.21 | |||||
| 0.029 | 0.025 | |||||||
| # Districts | 1 | 1 | ||||||
| 9 | RR | 2.19 | ||||||
| 0.029 | ||||||||
| # Districts | 1 |
RR: relative risk; CS: circular spatial scan statistic; ES: elliptic spatial scan statistic; GCS: circular spatial scan statistic using the Gini coefficient; GES: elliptic spatial scan statistic using the Gini coefficient; OF: flexible spatial scan statistic; RC: circular spatial scan statistic with a restricted likelihood ratio; RF: flexible spatial scan statistic with a restricted likelihood ratio. Cluster 1 is the most likely cluster; the others are secondary clusters, listed in order of statistical significance.
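The relative risks in the table compare the risk inside a cluster with the risk outside it. The sketch below assumes the definition used by SaTScan, namely observed over expected cases inside the cluster divided by observed over expected cases outside it. Whether every method in the table uses exactly this definition is not stated here, and the counts are hypothetical.

```python
def relative_risk(cases_in: float, expected_in: float, total_cases: float) -> float:
    """Risk inside the cluster relative to the risk outside it.

    cases_in:    observed cases inside the cluster
    expected_in: expected cases inside the cluster under the null hypothesis
    total_cases: observed cases in the whole study area (equal to total expected)
    """
    inside = cases_in / expected_in
    outside = (total_cases - cases_in) / (total_cases - expected_in)
    return inside / outside


# Hypothetical cluster: 40 observed vs 10 expected cases, out of 100 in total.
print(round(relative_risk(40, 10, 100), 2))
```

An RR above 1 indicates a higher rate inside the cluster than in the rest of the study area.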