BACKGROUND: Population-based cancer survival is an important measure of the overall effectiveness of cancer care in a population. Population-based cancer registries collect data that enable the estimation of cancer survival. To ensure accurate, consistent and comparable survival estimates, strict control of data quality is required before the survival analyses are carried out. In this paper, we present a basis for data quality control for cancer survival.

METHODS: We propose three distinct phases for the quality control. Firstly, each individual variable within a given record is examined to identify departures from the study protocol; secondly, each record is checked and excluded if it is ineligible or logically incoherent for analysis; lastly, the distributions of key characteristics in the whole dataset are examined for their plausibility.

RESULTS: Data for patients diagnosed with bladder cancer in England between 1991 and 2010 are used as an example to aid the interpretation of the differences in data quality. The effect of different aspects of data quality on survival estimates is discussed.

CONCLUSIONS: We recommend that the results of data quality procedures should be reported together with the findings from survival analysis, to facilitate their interpretation.
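As an illustration only, the three phases described in the methods could be sketched in Python roughly as follows. This is not the authors' protocol: the field names (`sex`, `vital_status`, `birth_date`, `diagnosis_date`, `last_known_date`), the allowed codes, and the specific eligibility and coherence rules are all hypothetical assumptions chosen to make the example concrete. The sketch also assumes, for simplicity, that a record with a protocol departure in phase 1 is set aside rather than corrected.

```python
from collections import Counter
from datetime import date

# Hypothetical protocol values; a real study protocol would define these.
ALLOWED_SEX = {1, 2}
ALLOWED_VITAL_STATUS = {"alive", "dead"}
STUDY_START, STUDY_END = date(1991, 1, 1), date(2010, 12, 31)


def check_variables(record):
    """Phase 1: list the variables in one record that depart from the protocol."""
    problems = []
    if record.get("sex") not in ALLOWED_SEX:
        problems.append("sex")
    if record.get("vital_status") not in ALLOWED_VITAL_STATUS:
        problems.append("vital_status")
    for field in ("birth_date", "diagnosis_date", "last_known_date"):
        if not isinstance(record.get(field), date):
            problems.append(field)
    return problems


def check_record(record):
    """Phase 2: return a reason for exclusion, or None if the record is usable."""
    if not STUDY_START <= record["diagnosis_date"] <= STUDY_END:
        return "ineligible: diagnosed outside study period"
    if record["birth_date"] > record["diagnosis_date"]:
        return "incoherent: birth after diagnosis"
    if record["last_known_date"] < record["diagnosis_date"]:
        return "incoherent: last known vital status before diagnosis"
    return None


def run_quality_control(records):
    """Run phases 1 and 2 per record, then summarise retained records for phase 3."""
    retained, exclusions = [], Counter()
    for record in records:
        problems = check_variables(record)
        if problems:
            # Assumption: records with invalid variables are set aside, not repaired.
            exclusions["protocol departure: " + ", ".join(problems)] += 1
            continue
        reason = check_record(record)
        if reason:
            exclusions[reason] += 1
            continue
        retained.append(record)
    # Phase 3: simple distributions to inspect by eye for plausibility.
    summary = {
        "by_year": Counter(r["diagnosis_date"].year for r in retained),
        "by_sex": Counter(r["sex"] for r in retained),
    }
    return retained, exclusions, summary
```

Returning the exclusion counts alongside the retained records reflects the recommendation that the results of quality control be reported together with the survival estimates.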