Umaporn Siangphoe, David C Wheeler.
Abstract
Generalized additive models (GAMs) with bivariate smoothing functions have been applied to estimate spatial variation in risk for many types of cancers. Only a handful of studies have evaluated the performance of smoothing functions applied in GAMs with regard to different geographical areas of elevated risk and different risk levels. This study evaluates the ability of different smoothing functions to detect overall spatial variation of risk and elevated risk in diverse geographical areas at various risk levels using a simulation study. We created five scenarios with different true risk area shapes (circle, triangle, linear) in a square study region. We applied four different smoothing functions in the GAMs, including two types of thin plate regression splines (TPRS) and two versions of locally weighted scatterplot smoothing (loess). We tested the null hypothesis of constant risk and detected areas of elevated risk using analysis of deviance with permutation methods and assessed the performance of the smoothing methods based on the spatial detection rate, sensitivity, accuracy, precision, power, and false-positive rate. The results showed that all methods had a higher sensitivity and a consistently moderate-to-high accuracy rate when the true disease risk was higher. The models generally performed better in detecting elevated risk areas than detecting overall spatial variation. One of the loess methods had the highest precision in detecting overall spatial variation across scenarios and outperformed the other methods in detecting a linear elevated risk area. The TPRS methods outperformed loess in detecting elevated risk in two circular areas.
Keywords: cancer; cluster; generalized additive model (GAM); simulation; smoothing functions; spatial analysis
Year: 2015 PMID: 25983545 PMCID: PMC4415687 DOI: 10.4137/CIN.S17300
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Scenarios and parameters for the simulation study.
| SCENARIO | FIGURE | RISK AREA SHAPE | RISK AREA SIZE | PROBABILITY OF DISEASE | ODDS RATIO (OR) |
|---|---|---|---|---|---|
| 1 | (image not available) | Circular | 15% | 0.05, 0.2 | OR1 |
| 2 | (image not available) | Linear | 15% | 0.05, 0.2 | OR1 |
| 3 | (image not available) | Triangle | 15% | 0.05, 0.2 | OR1 |
| 4 | (image not available) | Circular | 5%, 10% | 0.05, 0.2 | OR1 = OR2 |
| 5 | (image not available) | Circular | 10%, 10% | 0.05, 0.2 | OR1 vs OR2 |
Notes: Blue squares represent study regions. Red shapes represent areas of true elevated or decreased risk. OR1: odds ratio in the first risk area; OR2: odds ratio in the second risk area (in the third quadrant). Scenarios 1, 2, and 3 use OR1 values of 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, and 3.5. Scenario 4 uses OR1 = OR2 with values of 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, and 3.5.
Scenario 5 uses the following OR1 vs OR2 pairs: 3.5 vs 1.75, 3.0 vs 1.5, 2.5 vs 1.25, 2.0 vs 1.0, 1.5 vs 1.0, 1.0 vs 1.0, and 0.5 vs 0.75.
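To make the scenario parameters concrete, the following is a minimal Python sketch of how a data set for a Scenario 1 style design could be simulated. It is not the authors' code. The unit-square region, the circle centered at (0.5, 0.5), the sample size, and the function names `prob_inside` and `simulate_circular_scenario` are all assumptions made for illustration. The only inputs taken from the table are the risk area size (15%), the baseline probability of disease, and the odds ratio. The odds ratio is converted to a probability inside the risk area via the odds: odds inside = OR × p0 / (1 − p0).

```python
import math
import random


def prob_inside(p0, odds_ratio):
    """Probability of disease inside the risk area, given the baseline
    probability p0 outside it and the odds ratio between the two."""
    odds = odds_ratio * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


def simulate_circular_scenario(n, p0, odds_ratio, area_fraction=0.15, seed=0):
    """Simulate n subjects in a unit square with one circular risk area.

    Assumptions (not from the source): unit-square region, circle centered
    at (0.5, 0.5), uniformly distributed locations.
    Returns a list of (x, y, inside_risk_area, is_case) tuples.
    """
    rng = random.Random(seed)
    # Radius chosen so the circle covers `area_fraction` of the unit square.
    radius = math.sqrt(area_fraction / math.pi)
    cx, cy = 0.5, 0.5  # hypothetical center
    p1 = prob_inside(p0, odds_ratio)
    points = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
        is_case = rng.random() < (p1 if inside else p0)
        points.append((x, y, inside, is_case))
    return points


if __name__ == "__main__":
    data = simulate_circular_scenario(n=5000, p0=0.2, odds_ratio=2.0)
    n_inside = sum(1 for p in data if p[2])
    print("share of subjects inside the risk area:", n_inside / len(data))
```

With a baseline probability of 0.2 and an odds ratio of 2.0, the probability inside the risk area works out to 1/3, since the odds go from 0.25 to 0.5.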
Type I error rate for GAM methods in the simulation study.
| PROBABILITY OF DISEASE | APPROXIMATE CHI-SQUARE: UPT | APPROXIMATE CHI-SQUARE: CPT | APPROXIMATE CHI-SQUARE: TPRS | APPROXIMATE CHI-SQUARE: TPRS-S | MONTE CARLO: UPT | MONTE CARLO: CPT | MONTE CARLO: TPRS | MONTE CARLO: TPRS-S |
|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.178 | 0.115 | 0.168 | 0.620 | 0.055 | 0.059 | 0.047 | 0.050 |
| 0.20 | 0.173 | 0.111 | 0.153 | 0.588 | 0.061 | 0.047 | 0.057 | 0.058 |
Note: Four smoothing functions in GAMs: loess with UPT, loess with CPT, TPRS, and TPRS-S.
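The table contrasts approximate chi-square p-values with Monte Carlo permutation p-values. As a rough illustration of the permutation idea only, the Python sketch below shuffles case labels across fixed locations and compares the observed statistic with its permutation distribution. It is a simplified stand-in, not the study's procedure: instead of the deviance of a GAM with a bivariate smooth, it uses the deviance reduction of a per-grid-cell risk model over a constant-risk model. The function names, the grid-cell encoding, and the number of permutations are assumptions.

```python
import math
import random


def binomial_loglik(cases, total):
    """Maximized Bernoulli log-likelihood for `cases` out of `total` subjects."""
    if cases == 0 or cases == total:
        return 0.0  # fitted probability is 0 or 1, so the log-likelihood is 0
    p = cases / total
    return cases * math.log(p) + (total - cases) * math.log(1.0 - p)


def deviance_reduction(cells, labels, n_cells):
    """Deviance of the constant-risk model minus deviance of a per-cell model.

    `cells[i]` is the grid-cell index of subject i, `labels[i]` is 1 for a case.
    This statistic is a stand-in for the GAM deviance used in the study.
    """
    cell_cases = [0] * n_cells
    cell_total = [0] * n_cells
    for cell, label in zip(cells, labels):
        cell_total[cell] += 1
        cell_cases[cell] += label
    null_ll = binomial_loglik(sum(labels), len(labels))
    full_ll = sum(binomial_loglik(k, n) for k, n in zip(cell_cases, cell_total))
    return 2.0 * (full_ll - null_ll)


def permutation_p_value(cells, labels, n_cells, n_perm=999, seed=0):
    """Monte Carlo p-value: share of label permutations whose statistic is at
    least as large as the observed one (the observed data count as one draw)."""
    rng = random.Random(seed)
    observed = deviance_reduction(cells, labels, n_cells)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between location and case status
        if deviance_reduction(cells, shuffled, n_cells) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Under this scheme, an empirical type I error rate is the share of null data sets (OR = 1.0) whose permutation p-value falls below the nominal level. The table's Monte Carlo columns are close to 0.05, which suggests 0.05 was the nominal level, although this excerpt does not state it.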
Figure 1. Power and centroid-grid detection of four smoothing functions in GAMs at different true ORs in risk areas. The power for rejecting the null hypothesis of overall constant risk was calculated based on Monte Carlo permutation tests. Centroid-grid detection for true risk areas was defined as detecting the centroid of the true risk area based on a grid spatial reference. The proportion detected by power and centroid-grid detection together is shown in the third row. For the two circular elevated risk areas, detection was defined as detecting both centroids simultaneously. Four smoothing functions in GAMs: loess with UPT, loess with CPT, TPRS, and TPRS-S. The probability of disease outside the true risk area was 0.2. Average values over 1,000 data sets are presented. Results of true risk areas in the first quadrant were evaluated for Scenarios 4 and 5. The grid detection was not meaningful under the null hypothesis (OR = 1.0).
Figure 2. Sensitivity rate, false-positive rate, accuracy rate, and precision rate for detection of true risk areas with varying ORs based on a grid spatial reference. Four smoothing functions in GAMs: loess with UPT, loess with CPT, TPRS, and TPRS-S. The probability of disease outside the true risk area was 0.2. Averages over 1,000 data sets are shown in the figures. Results of true risk areas in the first quadrant were evaluated for Scenarios 4 and 5. Performance rates were not meaningful under the null hypothesis (OR = 1.0).
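The four rates named in this caption can be computed from a cell-by-cell comparison of the true risk area with the detected area on the grid. The Python sketch below assumes the standard confusion-matrix definitions. The excerpt does not give the paper's exact formulas, so both those definitions and the function name `detection_rates` are assumptions for illustration.

```python
def detection_rates(true_mask, detected_mask):
    """Compare two equal-length sequences of 0/1 flags, one per grid cell.

    true_mask[i] is 1 if cell i lies in the true risk area;
    detected_mask[i] is 1 if the method flagged cell i as elevated risk.
    """
    tp = fp = tn = fn = 0
    for truth, found in zip(true_mask, detected_mask):
        if truth and found:
            tp += 1
        elif truth and not found:
            fn += 1
        elif found:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    found = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(detection_rates(truth, found))
```

In the usage example, two of the three true risk cells are flagged and one of the seven background cells is flagged in error, giving a sensitivity of 2/3, a false-positive rate of 1/7, an accuracy of 0.8, and a precision of 2/3.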
Figure 3. Boxplots of sensitivity, false-positive rate, accuracy rate, and precision rate for detection of elevated risk areas with an OR of 3.5 and a probability of disease outside the true risk area (P) of 0.2, based on a grid spatial reference and 1,000 simulated data sets. Four smoothing functions in GAMs: loess with UPT, loess with CPT, TPRS, and TPRS-S. Results of true risk areas in the first quadrant were evaluated for Scenarios 4 and 5. Red dots represent mean values.