Matthew E Barclay1, Georgios Lyratzopoulos2, David C Greenberg1, Gary A Abel3.
Abstract
BACKGROUND: The percentage of cancer patients diagnosed at an early stage is reported publicly for geographically defined populations corresponding to healthcare commissioning organisations in England, and linked to pay-for-performance targets. Given that stage is incompletely recorded, we investigated the extent to which this indicator reflects underlying organisational differences rather than differences in stage completeness and chance variation.
Year: 2017 PMID: 29175263 PMCID: PMC5786666 DOI: 10.1016/j.canep.2017.11.005
Source DB: PubMed Journal: Cancer Epidemiol ISSN: 1877-7821 Impact factor: 2.984
Fig. A1. Percentage of tumours by stage at diagnosis, England 2013.
Fig. 1. Observed early-stage percentage calculated using: A. the ‘best estimate’ multiple imputation approach; B. the missing-is-late approach; and C. the complete-case approach, plotted against the percentage of tumours with no recorded stage information, CCGs, England 2013.
Fig. 2. Bias in scores calculated using the complete-case and missing-is-late approaches when compared with the ‘best estimate’ MI indicator, plotted against the percentage of tumours with no recorded stage information, CCGs, England 2013.
Number of CCGs, staged tumours per CCG, odds ratios over estimated underlying distribution of CCG performance, quartiles of the reliability of the complete-case early stage indicator, and the number of tumours and associated aggregated years of data for 50%, 70%, 90% and 100% of CCGs to have reliability of 0.7 or higher or of 0.9 or higher.
| Measure | Statistic | Value |
|---|---|---|
| Number of CCGs | | 209 |
| Number of staged tumours per CCG | Minimum | 125 |
| | 25th percentile | 479 |
| | Median | 691 |
| | 75th percentile | 943 |
| | Maximum | 3575 |
| Odds ratio over CCG distribution | 75th/25th percentiles | 1.16 |
| | 95th/5th percentiles | 1.43 |
| Reliability | Minimum | 0.26 |
| | 25th percentile | 0.58 |
| | Median | 0.66 |
| | 75th percentile | 0.73 |
| | Maximum | 0.91 |
| Number of tumours per CCG required for reliability 0.7 | 50% of units | 803 |
| | 70% of units | 812 |
| | 90% of units | 833 |
| | All units | 926 |
| Data years required for reliability 0.7 | 50% of units | 1.2 |
| | 70% of units | 1.5 |
| | 90% of units | 2.3 |
| | All units | 6.6 |
| Number of tumours per CCG required for reliability 0.9 | 50% of units | 3095 |
| | 70% of units | 3132 |
| | 90% of units | 3210 |
| | All units | 3570 |
| Data years required for reliability 0.9 | 50% of units | 4.5 |
| | 70% of units | 5.6 |
| | 90% of units | 8.7 |
| | All units | 25.3 |
p < 0.0001. Odds ratios are calculated directly from the estimated variance of the random intercept from the mixed-effects logistic regression ($\hat{\sigma}^2 = 0.012$) using the appropriate centiles of the standard normal distribution. The 75th/25th percentile odds ratio is calculated as $\exp(2 \times 0.674 \times \hat{\sigma})$ and the 95th/5th percentile odds ratio is calculated as $\exp(2 \times 1.645 \times \hat{\sigma})$.
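The centile odds ratios described in this note can be reproduced from the reported variance alone. The following is a minimal sketch, assuming the random intercepts are normally distributed on the log-odds scale; the function name `centile_odds_ratio` is illustrative and does not come from the paper.

```python
import math
from statistics import NormalDist


def centile_odds_ratio(variance: float, upper: float, lower: float) -> float:
    """Odds ratio between two centiles of a normal random-intercept
    distribution defined on the log-odds scale."""
    sd = math.sqrt(variance)
    standard_normal = NormalDist()
    # Distance between the two centiles, in standard deviations.
    z_gap = standard_normal.inv_cdf(upper) - standard_normal.inv_cdf(lower)
    return math.exp(z_gap * sd)


ccg_variance = 0.012  # random-intercept variance reported for CCGs

or_75_25 = centile_odds_ratio(ccg_variance, 0.75, 0.25)
or_95_5 = centile_odds_ratio(ccg_variance, 0.95, 0.05)
print(f"75th/25th: {or_75_25:.2f}, 95th/5th: {or_95_5:.2f}")
```

With the variance of 0.012 this gives the tabulated values of 1.16 and 1.43 after rounding to two decimal places.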
Number of organisations, staged tumours per organisation, odds ratios over estimated underlying distribution of organisational performance, quartiles of the reliability of the complete-case early stage indicator, and the number of tumours and associated aggregated years of data for 50%, 70%, 90% and 100% of organisations to have reliability of 0.7 or higher or of 0.9 or higher, for CCGs, local authorities and general practices.
| Measure | Statistic | CCG | LA | GP |
|---|---|---|---|---|
| Units with staged tumours | | 209 | 326 | 8075 |
| Staged tumours per unit | Minimum | 125 | 12 | 1 |
| | 25th percentile | 479 | 311 | 9 |
| | Median | 691 | 427 | 17 |
| | 75th percentile | 943 | 634 | 30 |
| | Maximum | 3575 | 2992 | 150 |
| Odds ratio over unit distribution | 75th/25th percentiles | 1.16 | 1.18 | 1.29 |
| | 95th/5th percentiles | 1.43 | 1.49 | 1.85 |
| Reliability | Minimum | 0.26 | 0.04 | 0.01 |
| | 25th percentile | 0.58 | 0.53 | 0.06 |
| | Median | 0.66 | 0.61 | 0.12 |
| | 75th percentile | 0.73 | 0.70 | 0.20 |
| | Maximum | 0.91 | 0.92 | 0.56 |
| Tumours required for reliability 0.7 | 50% of units | 803 | 641 | 280 |
| | 70% of units | 812 | 652 | 302 |
| | 90% of units | 833 | 668 | 358 |
| | All units | 926 | 784 | 1546 |
| Data years required for reliability 0.7 | 50% of units | 1.2 | 1.5 | 17.1 |
| | 70% of units | 1.5 | 2.0 | 30.2 |
| | 90% of units | 2.3 | 2.7 | 89.5 |
| | All units | 6.6 | 53.7 | 269.0 |
| Tumours required for reliability 0.9 | 50% of units | 3095 | 2470 | 1078 |
| | 70% of units | 3132 | 2514 | 1165 |
| | 90% of units | 3210 | 2575 | 1380 |
| | All units | 3570 | 3022 | 5963 |
| Data years required for reliability 0.9 | 50% of units | 4.5 | 5.8 | 65.8 |
| | 70% of units | 5.6 | 7.5 | 116.4 |
| | 90% of units | 8.7 | 10.5 | 345.0 |
| | All units | 25.3 | 206.8 | 1,035.0 |
p < 0.0001 across CCGs, LAs and GPs.
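The relationship between tumour counts and reliability in the tables above can be approximated with a standard signal-to-noise calculation. The sketch below is an assumption, not the authors' stated method: it treats reliability as between-unit variance divided by between-unit plus sampling variance on the log-odds scale, uses the CCG random-intercept variance of 0.012, and takes the national complete-case early-stage proportion (98,780 of 172,001 staged tumours) as the proportion for every unit. The function names are illustrative.

```python
def reliability(n_tumours: float, between_var: float, p_early: float) -> float:
    """Approximate reliability of an observed proportion for one unit.

    Assumption: the sampling variance of a log-odds estimate based on
    n tumours is 1 / (n * p * (1 - p)).
    """
    sampling_var = 1.0 / (n_tumours * p_early * (1.0 - p_early))
    return between_var / (between_var + sampling_var)


def tumours_for_reliability(target: float, between_var: float, p_early: float) -> float:
    """Invert the reliability formula to get the tumour count needed."""
    return (target / (1.0 - target)) / (between_var * p_early * (1.0 - p_early))


BETWEEN_VAR_CCG = 0.012      # random-intercept variance reported for CCGs
P_EARLY = 98780 / 172001     # assumed common early-stage proportion

median_ccg_reliability = reliability(691, BETWEEN_VAR_CCG, P_EARLY)
n_for_07 = tumours_for_reliability(0.7, BETWEEN_VAR_CCG, P_EARLY)
n_for_09 = tumours_for_reliability(0.9, BETWEEN_VAR_CCG, P_EARLY)
print(f"{median_ccg_reliability:.2f}, {n_for_07:.0f}, {n_for_09:.0f}")
```

Under these simplifying assumptions the results land close to, but not exactly on, the tabulated CCG figures (median reliability 0.66; about 800 and 3,100 tumours for reliability 0.7 and 0.9). The gaps are expected, because the published values reflect unit-specific proportions and an unrounded variance.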
National number of diagnoses and median reliability of complete-case composite and site-specific early stage indicators for general practices, CCGs and local authorities, with number of years of data at current completeness levels required for reliable indicators for 70% of organisations.
| Cancer site | Tumours: total | Tumours: staged | Tumours: stage 1–2 | Median reliability: GP | Median reliability: CCG | Median reliability: LA | Years of data required: GP | Years of data required: CCG | Years of data required: LA |
|---|---|---|---|---|---|---|---|---|---|
| All ten sites combined | 208,141 | 172,001 | 98,780 | 0.12 | 0.66 | 0.61 | 30.2 | 1.5 | 2.0 |
| Breast | 44,558 | 37,465 | 31,635 | 0.08 | 0.59 | 0.42 | 28.7 | 2.3 | 4.8 |
| Prostate | 39,934 | 32,859 | 19,422 | 0.05 | 0.71 | 0.62 | 75.3 | 1.3 | 1.9 |
| Lung | 35,972 | 31,234 | 7307 | 0.02 | 0.44 | 0.32 | 142.0 | 4.0 | 7.0 |
| Colorectal | 33,477 | 27,719 | 12,398 | 0.04 | 0.26 | 0.15 | 92.0 | 8.7 | 17.3 |
| Melanoma | 12,245 | 10,520 | 9591 | 0.10 | 0.26 | 0.19 | 34.0 | 9.7 | 19.2 |
| NHL | 11,222 | 8080 | 2916 | | 0.33 | 0.24 | | 7.0 | 10.8 |
| Endometrial | 7232 | 6615 | 5405 | | 0.10 | 0.06 | | 30.6 | 50.1 |
| Bladder | 8669 | 6505 | 4835 | | 0.25 | 0.14 | | 10.3 | 22.7 |
| Renal | 8368 | 5970 | 3202 | | 0.28 | 0.21 | | 8.1 | 13.4 |
| Ovarian | 6464 | 5034 | 2069 | | 0.14 | 0.14 | | 20.7 | 21.6 |
Fig. 3. Estimated number of true positives, false positives, true negatives and false negatives, with associated sensitivity, specificity, positive and negative predictive values (95% confidence intervals), for the 60% early stage target given performance similar to 2013 and tumour counts as in 2013.
Estimated number of true positives, false positives, true negatives and false negatives, with associated sensitivity, specificity, positive predictive value and negative predictive value (95% confidence intervals), for the 60% early stage target given performance similar to 2013 and tumour counts as in 2013, for reporting periods of 1, 2.5 and 9 years.
| Measure | 1 year: expected value | 1 year: 95% CI | 2.5 years: expected value | 2.5 years: 95% CI | 9 years: expected value | 9 years: 95% CI |
|---|---|---|---|---|---|---|
| True positives | 21 | (13, 50) | 23 | (15, 32) | 25 | (16, 35) |
| False positives | 19 | (11, 28) | 11 | (6, 18) | 5 | (1, 10) |
| True negatives | 161 | (149, 172) | 169 | (157, 179) | 175 | (164, 185) |
| False negatives | 8 | (3, 14) | 6 | (2, 11) | 4 | (1, 8) |
| Sensitivity | 0.73 | (0.56, 0.89) | 0.80 | (0.64, 0.93) | 0.88 | (0.73, 0.97) |
| Specificity | 0.89 | (0.85, 0.94) | 0.94 | (0.90, 0.97) | 0.97 | (0.94, 0.99) |
| Positive predictive value | 0.52 | (0.37, 0.68) | 0.67 | (0.50, 0.82) | 0.83 | (0.68, 0.95) |
| Negative predictive value | 0.95 | (0.92, 0.98) | 0.97 | (0.94, 0.99) | 0.98 | (0.96, 0.99) |
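The four accuracy measures in this table are simple ratios of the classification counts. The following sketch recomputes them from the rounded expected counts; the `Confusion` class is illustrative and not taken from the paper. Because the published measures are expectations over simulated datasets rather than ratios of the rounded counts, small differences of about 0.01 from the tabulated values are to be expected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of CCGs classified against the 60% early-stage target."""
    tp: int  # truly meet the target, observed to meet it
    fp: int  # truly miss the target, observed to meet it
    tn: int  # truly miss the target, observed to miss it
    fn: int  # truly meet the target, observed to miss it

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Expected counts from the table, by reporting period.
periods = {
    "1 year": Confusion(tp=21, fp=19, tn=161, fn=8),
    "2.5 years": Confusion(tp=23, fp=11, tn=169, fn=6),
    "9 years": Confusion(tp=25, fp=5, tn=175, fn=4),
}

for label, c in periods.items():
    print(f"{label}: n={c.total} sens={c.sensitivity:.2f} "
          f"spec={c.specificity:.2f} ppv={c.ppv:.2f} npv={c.npv:.2f}")
```

Each set of counts sums to the 209 CCGs, which is a quick consistency check on the table.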
Fig. 4. Expected percentage of CCGs with observed increases in the early stage percentage of 4 percentage points or more, given uniform national changes of between −4 and +12 percentage points. For example, for a typical CCG to have an 80% chance of being classified as achieving a 4%-point increase (blue dashed line), it would need to have an underlying increase of 6.2%-points.
Studies evaluating bias introduced by missing data in cancer registry data.
| First Author | Year published | Country | Setting | What was imputed | Summary |
|---|---|---|---|---|---|
| He Y | 2008 | US | Regional, California | Indicators of receiving chemotherapy or radiotherapy treatment (colorectal cancer), outcome variables | Correcting under-reporting using internal gold standard. |
| Krieger N | 2008 | US | Regional, California | ER-status (breast cancer), outcome variable | Records with missing ER status bias complete-case analysis. |
| Nur U | 2010 | UK | 1 English registry (NWCIS) | Stage (colorectal cancer), covariate | Complete-case analysis is likely to be biased. Indicator methods give spurious precision levels. MI allows inclusion of more information (leading to higher precision than complete-case). MAR assumption probably reasonable, but further research valuable. |
| Eisemann N | 2011 | Germany | 1 German registry (Schleswig-Holstein) | Simulation (truly MAR), based on real data on breast cancer and melanoma. Stage was imputed by various methods (multinomial logistic; PMM; random forests) with various levels of missing data. Stage was used as outcome (incidence counts) and covariate (survival analysis) | MI is superior to simpler methods for handling missing data. MI using random forests does not perform well (and is associated with model convergence problems). |
| Howlader N | 2012 | US | 13 SEER registries | ER-status (breast cancer), outcome variable but also for re-use by other researchers | Demonstration with incidence trends. Include the cancer registry in imputation when data from more than one registry are imputed. |
| Falcaro M | 2015 | UK | 4 English registries | Simulation, based on real data. Stage was imputed by various methods under various (MAR) missingness mechanisms. | Ordinal logistic model is inadequate. Multinomial logistic model works well. Use of Nelson-Aalen estimate of cumulative hazard is recommended. |
| Andridge R | 2016 | US | 13 SEER registries | ER-status (breast cancer), as outcome variable, using PMM under MAR and various MNAR assumptions | In SEER 1992–2012 breast cancer data, MAR and MNAR approaches give broadly similar results. |
| Falcaro M | 2017 | UK | 4 English registries | Simulation, based on real data. Stage was imputed by various methods under various (MAR) missingness mechanisms, and the bias in different approaches to imputation was compared. | Can use imputation with non-congenial analysis methods (in this case, Pohar-Perme net survival estimation) to avoid bias associated with “missing indicator” approaches. |
ER: Estrogen Receptor.
MAR: Missing At Random.
MNAR: Missing Not At Random.
PMM: Predictive Mean Matching.