Jamie L Carter, Russell J Coletti, Russell P Harris.
Abstract
OBJECTIVE: To determine the optimal method for quantifying and monitoring overdiagnosis in cancer screening over time.
Year: 2015 PMID: 25569206 PMCID: PMC4332263 DOI: 10.1136/bmj.g7773
Source DB: PubMed Journal: BMJ ISSN: 0959-8138
Criteria for evaluating risk of bias by study type among studies that quantified overdiagnosis resulting from cancer screening
| Study type | Risk of bias criteria |
|---|---|
| Modeling studies | Extent to which assumptions are transparent and clearly stated |
| Pathological and imaging studies | Probability of selection bias and confounding |
| Ecological and cohort studies | |
| Follow-up of a randomized controlled trial | |
Definitions of criteria for evaluating strength of evidence among studies quantifying overdiagnosis from cancer screening
| Criterion | Definition |
|---|---|
| Risk of bias (high/moderate/low) | See table 1 |
| Directness (good/fair/poor) | Extent to which the evidence links screening directly to differences in long term cumulative incidence between populations without making assumptions |
| Analysis (good/fair/poor) | Extent to which the analysis appropriately quantifies overdiagnosis, without inclusion of age groups or time frames that lack the potential to be overdiagnosed, and without statistically adjusting for lead time |
| Time frame (good/fair/poor) | Extent to which the time frame is sufficient to account for the effects of lead time |
| External validity (good/fair/poor) | Extent to which study population is similar to US general population or Western European populations in factors that are associated with cancer incidence, screening situation (such as expertise of screening radiographers, quality of screening facilities, threshold for defining a result as abnormal), medical care, and risks for competing mortality |
| Precision (good/fair/poor/cannot determine) | Confidence interval of difference in cumulative incidence attributable to screening between populations over an appropriate time frame should be provided. Width of confidence interval should be narrow, not wider than about 20%. |
| Consistency (good/fair/poor) | Extent to which the overdiagnosis measurements from the included studies have a similar magnitude |
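
The precision criterion above can be read as a simple check on confidence interval width. The sketch below is illustrative only: it assumes that "about 20%" means 20 percentage points and that a missing interval maps to "cannot determine". The function names and the narrow/wide labels are hypothetical and are not the review's grading scheme. The intervals are those reported for Falk 2013, Patz 2013, and Miller 2014 in the evidence tables.

```python
from typing import Optional, Tuple

Interval = Optional[Tuple[float, float]]


def ci_width(ci: Interval) -> Optional[float]:
    """Width of a 95% CI in percentage points, or None when no CI was reported."""
    if ci is None:
        return None
    lower, upper = ci
    return upper - lower


def precision_flag(ci: Interval, max_width: float = 20.0) -> str:
    """Label an interval as narrow or wide against an assumed width threshold."""
    width = ci_width(ci)
    if width is None:
        return "cannot determine"
    return "narrow" if width <= max_width else "wide"


# Intervals as reported in the evidence tables (percent).
reported = {
    "Falk 2013": (11.8, 27.0),
    "Patz 2013": (5.4, 30.6),
    "Miller 2014": None,  # point estimate only, no CI reported
}

for study, ci in reported.items():
    print(f"{study}: {precision_flag(ci)}")
```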

Flow diagram of study selection process
Summary of evidence from 21 modeling studies quantifying overdiagnosis from cancer screening
| Study; model(s) | Modelled population: country, ages; screening schedule | Data sources: ( | ( | Reports outcome as % of screen detected cancers? | Magnitude of overdiagnosis (95% CI) | Sensitivity analyses varying mean sojourn time or lead time? | Overall risk of bias |
|---|---|---|---|---|---|---|---|
| Davidov 200411 | US; 50–60, 70, or 80 year olds; at 5 year intervals | ( | ( | Unclear | 8.48–53.6% | Univariate. MST 5–15 years | Moderate |
| Draisma 200912; MISCAN, FHCRC, UMichigan | US, 54–80 year olds; typical US screening patterns | ( | ( | Yes | MISCAN 42%; FHCRC 28%; UMich 23% | Not performed | High |
| Gulati 201313; FHCRC | US, 40 year olds; 32 screening schedules simulated | ( | ( | No, reports lifetime risk of overdiagnosis | 1.8–6% | Other sensitivity analyses performed. | Moderate |
| Gulati 201414 | US, 50–84 year olds; multiple | ( | ( | No | 2.9–88.1% depending on age, Gleason score, and PSA level (% likelihood that a tumor is overdiagnosed) | Not performed | High |
| Heijnsdijk 200915; MISCAN | Europe; 55–70 year olds every 1 or 2 years, or 55–75 year olds every 4 years | ( | ( | Yes (estimated from figures) | Annual, 60%; biennial, 60%; every 4 years (to age 75), 67% | Not performed | High |
| McGregor 199816 | Quebec, 50–85 year olds; annual PSA test for ages 50–70 | ( | ( | Yes | 84% | Other sensitivity analyses performed. | High |
| Pashayan 200917 | UK; single PSA | ( | ( | Yes | 50–54 years, 10% (7 to 11%); 55–59, 15% (12 to 15); 60–64, 23% (20 to 24); 65–69, 31% (26 to 32) | Not performed | High |
| Telesca 200818 | US; typical US screening patterns | ( | ( | Yes | White men 22.7%; black men 34.4% | Not performed | High |
| Tsodikov 200619 | US; typical US screening patterns | ( | ( | Yes | 30% | Not performed | High |
| Wu 201220 | Finland; 55, 59, 63, 67 year olds; 3 PSA tests every 4 years until age 71 | ( | ( | No | 3.4% (2.4 to 5.7%) risk of overdetection during study period | Not performed | High |
| De Gelder 2011 (Epi Rev)21; MISCAN | Netherlands; 0–100 year olds; biennial mammogram ages 49–74 | ( | ( | Yes | Implementation 22.1–67.4%; extension 15.4–30.5%; steady state 8.9–15.2% | Not performed | High |
| De Gelder 2011 (Prev Med)22; MISCAN | Netherlands, 0–100 year olds; biennial film or digital mammogram | ( | ( | Yes | Screen film, 7.2%; digital, 8.2% | Other sensitivity analyses performed. | High |
| Duffy 200523 | Sweden; 40–74 or 39–59 year olds; mammogram every 18, 24, or 33 months | All data: Swedish 2-County RCT (1977–84) and Gothenburg RCT (1982–87) (separate analyses) | ( | Yes | Swedish: 1st screen, 3.1% (0.1 to 10.9%); 2nd, 0.3% (0.1 to 1); 3rd, 0.3% (0.1 to 1) | Not performed | High |
| Gunsoy 201224 | UK, 40–49 year olds; annual mammogram | ( | ( | Yes | 0.70% | Univariate; varied MST and sensitivity; 0.5 to 2.9% | Moderate |
| Martinez-Alonso 201025 | Spain; 25-84 year olds; biennial mammogram ages 50–69 | ( | ( | No, reported as % excess of expected incidence | 1935 birth cohort, 0.4% (−8.8 to 12.2%); 1940, 23.3% (9.1 to 43.4); 1945, 30.6% (12.7 to 57.6); 1950, 46.6% (22.7 to 85.2) | Univariate; varied MST from 1 to 5. | Moderate |
| Olsen 200626 | Denmark; 50-69 year olds; biennial mammogram | ( | ( | Yes | 1st screen, 7.8% (0.3 to 27.5%); 2nd screen, 0.5% (0.01 to 2.2) | Other sensitivity analyses performed | High |
| Seigneurin 201227 | France, 50-69 year olds; not specified | ( | ( | Yes | DCIS, 31.9% (2.9 to 62.3%); invasive cancer, 3.3% (0.7 to 6.5) | Univariate; varied MST; DCIS, 17.3 to 51.7%; invasive cancer, 0 to 8.9% | Moderate |
| Duffy 201428 | UK; 55–74 or 50–75 year olds; annual and biennial | ( | ( | Yes | 11% | Univariate; varying MST; 0 to 18% | Moderate |
| Hazelton 201229 | US; Heavy smokers, <5 years asbestos exposure; low dose CT | ( | ( | Yes | Men 14.1% (11.6 to 19.7%); women 35.2% (28.9 to 39.3) | Not performed | High |
| Pinsky 200430 | US; men aged 50–75 years, heavy smokers; annual CXR and sputum cytology | All data: Mayo Lung Screening Trial (prevalence screen and screening arm only) | ( | Yes | 13–17% | Not performed | High |
| Luo 201231 | US; 40, 50, or 60 year olds; 5 annual or 3 biennial FOBT | ( | ( | Yes (reported for age 50) | Women 6.65% (2.56 to 20.49%); men 6.15% (1.92 to 44.69) | Not performed | High |
SEER=Surveillance, Epidemiology, and End Results database; SSA=Social Security Administration; MST=mean sojourn time; MISCAN=Microsimulation Screening Analysis model; FHCRC=Fred Hutchinson Cancer Research Center; IARC=International Agency for Research on Cancer; DCIS=ductal carcinoma in situ; NLST=National Lung Screening Trial; UKLS=UK Lung Screening pilot trial; CARET=Carotene and Retinol Efficacy Trial; CT=computed tomography; CXR=chest x ray; FOBT=fecal occult blood test.
Summary of evidence from 8 pathological and imaging studies quantifying overdiagnosis from cancer screening
| Study; study period | Country; No of cancers; screening test | Overdiagnosis definition | Results | Magnitude of overdiagnosis (%) | Overall risk of bias |
|---|---|---|---|---|---|
| Dominioni 201233; 1997–2011 | Italy; 21; CXR | VDT >300 days | 1/21 cancers had VDT >300 days | “Minimal” | High |
| Lindell 200734; 1999–2004 | US; 61; CT | VDT >400 days | 13/48 cancers had VDT >400 days | 27 | Moderate |
| Sobue 199235; 1976–89 | Japan; 42; CXR | Dying from a cause other than lung cancer in patients diagnosed with clinical stage 1 disease | 20% of screen detected patients died from cause other than lung cancer | “Minimal” | High |
| Sone 200736; 1996–98 | Japan; 45; CT | Expected age of death (calculated from VDT) greater than average Japanese life expectancy | 6/45 cases had expected death age greater than Japanese life expectancy | 13.3 | Moderate |
| Veronesi 201237; 2004–10 | Italy; 120; LDCT | VDT >400 days | 31/120 cases had VDT >400 days | 25.8 (95% CI 18.3 to 34.6) | Moderate |
| Yankelevitz 200338; not provided | US; 87; CXR or sputum cytology | VDT >400 days | 4/87 cases had VDT >400 days | 5 | High |
| Graif 200739; 1989–2005 | US; 2126; PSA | Tumor volume <0.5 cm³, Gleason score <7, organ-confined disease in RRP specimen with clear surgical margins | 4.5% met criteria for overdiagnosis compared with 27% meeting criteria for underdiagnosis | 4.5 | High |
| Pelzer 200840; 1999–2006 | Austria; 997 (806 screened, 161 not screened); PSA | Gleason score <7, pathological stage of pT2a, and negative surgical margins | 16.8% of screened group and 7.9% of unscreened met overdiagnosis criteria | 16.8 | High |
CXR=chest x-ray; VDT=volume doubling time; CT=computed tomography; LDCT=low dose computed tomography; RRP=radical retropubic prostatectomy.
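
Several of the imaging studies above define overdiagnosis by a volume doubling time (VDT) threshold and report the share of cancers exceeding it. The sketch below recomputes those shares from the counts in the Results column. It also shows the usual VDT formula under an assumed exponential growth model; the table does not state how each study calculated VDT, so that function is an assumption, and the function names are illustrative.

```python
import math


def percent(cases: int, total: int) -> float:
    """Share of cancers meeting the overdiagnosis definition, in percent."""
    return round(100 * cases / total, 1)


def volume_doubling_time(v1: float, v2: float, days_between: float) -> float:
    """VDT in days, assuming exponential growth between two volume measurements."""
    return days_between * math.log(2) / math.log(v2 / v1)


# Counts taken from the Results column: (cases above threshold, cancers assessed).
counts = {
    "Lindell 2007": (13, 48),
    "Sone 2007": (6, 45),
    "Veronesi 2012": (31, 120),
    "Yankelevitz 2003": (4, 87),
}

for study, (cases, total) in counts.items():
    print(f"{study}: {percent(cases, total)}%")

# Hypothetical nodule that doubles in volume over 400 days sits exactly at the threshold.
print(volume_doubling_time(v1=1.0, v2=2.0, days_between=400))
```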
Summary of evidence from 20 ecological and cohort studies quantifying overdiagnosis from cancer screening
| Study; study design | Study population: country; ages; time period | Reference population | Adjustment for confounders | Management of lead time | Calculation of overdiagnosis | Magnitude of overdiagnosis (95% CI) | Risk of bias; time frame; analysis | |
|---|---|---|---|---|---|---|---|---|
| Bleyer 201241; | US; ≥40 year olds; 1976–2008 | Pre-screening trend (1976–78) | HRT, baseline increasing incidence | Steady-state screening | (Excess cases)/ (observed cases) during screening | 31% | Moderate; good; good | |
| Coldman 201342; ecological | British Columbia; 40–89 year olds; 2005–09 | ( | Age, baseline increasing incidence | Including women aged up to 89 in incidence, with up to 10 years FU post-screening | (Excess cases)/ (observed cases during screening and post-screening) | ( | Moderate; fair; poor | |
| Duffy 201043; cohort and ecological | Sweden; 50–60 year olds; 1977–98 | Sweden from Swedish 2-county RCT as control | Swedish: unclear | Swedish: excluded prevalence screen | Based on complex calculation | Swedish: 12%* | Moderate; NA; poor | |
| Falk 201344; cohort | Norway; 50–69 year olds; 1995–2009 | Screening program non-attenders | Age, county, calendar year | 10 year FU post-screening | (Excess cases)/ (expected cases) during screening | 19.4% (11.8 to 27.0%) | High; good; good | |
| Hellquist 201245; ecological | Sweden; 40–49 year olds; 1986–2005 | Contemporary counties without screening | Differences in baseline incidence trends | Statistical adjustment | (Excess cases)/ (expected cases) during screening | 1% (−6 to 8%) (16% without lead time adjustment) | Moderate; NA; poor | |
| Jorgensen 2009 ( | UK; 50–64 year olds; 1993–99 | Pre-screening trend (UK 1971–84, CA 1970–78, NSW 1972–87, Sweden 1971–85, Norway 1980–94) | Baseline increasing incidence | Up to 7 years FU post-screening | (Excess cases)/ (expected cases) during screening | UK: 57% (53 to 61%) | Moderate; fair; good | |
| Jorgensen 2009 ( | Denmark; 50–69 year olds; 1991–2003 | Contemporary counties without screening | Age and differences in baseline incidence trends | Up to 10–12 years FU post-screening | (Excess cases)/ (expected cases) during screening | 33% | Moderate; fair; good | |
| Junod 201148; ecological | France; 50–79 year olds; 1995–2005 | Age-matched historical cohorts from 1980–90 | HRT, alcohol intake, obesity | Unclear | (Excess cases)/ (expected cases) during screening | Ages 50–64: 76% (67 to 85%)† | Moderate; fair; poor | |
| Kalager 201249; ecological | Norway; 50–79 year olds; 1996–2005 | Contemporary counties without screening; historical cohorts in screening region without screening | Differences in baseline incidence trends | Including women up to age 79 in incidence with up to 10 years FU post-screening | (Excess cases)/ (observed cases) during screening period, including women up to age 79 | Entire country: 25% (19 to 31%)† | Moderate; fair; poor | |
| Morrell 201050; ecological | NSW; 50–69 year olds; 1991–2001 | Pre-screening trend (1972–90) | HRT, obesity, nulliparity | Statistical adjustment | (Excess cases)/ (expected cases) during screening | 30%† | Moderate; NA; poor | |
| Njor 201351; cohort | Denmark | Contemporary counties without screening; historical cohorts in screening region without screening | Differences in baseline incidence trends | Up to 8 years FU post-screening | (Excess cases)/ (expected cases) during screening and 8 years post-screening | Copenhagen: 6% (−10 to 25%) | Moderate; fair; poor | |
| Paci 200652; cohort | Italy; 50–74 year olds; 1986–2006 (10 year period) | Pre-screening trend | Age | Statistical adjustment | (Excess cases)/ (expected cases) during screening | 4.6% (2 to 7%) after adjustment for lead time | Moderate; NA; poor | |
| Peeters 198953; ecological | Netherlands; ≥35 year olds; 1975–86 | Contemporary county without screening | None | Did not | (Excess cases)/ (expected cases) during screening | 11% | High; poor; poor | |
| Puliti 200954; cohort | Italy; 60–69 year olds; 1990–2005 | Pre-screening trend (forced to 1.2% growth) | Age | 5–10 years FU post-screening | (Excess cases)/ (expected cases) during screening and 5 years post-screening | 1% (−5 to 7%) | Moderate; fair; poor | |
| Puliti 201255; cohort | Italy; 60–69 year olds; 1991–2007 | Screening non-attenders | Age, marital status, area-level socioeconomic status | 5–14 years FU post-screening | (Excess cases)/ (expected cases) during screening and 5–14 years post-screening | 10% (−2 to 23%) | High; fair; poor | |
| Svendsen 200656; ecological | Denmark; 50–69 year olds; 1991–2001 | Pre-screening trend (1979–90) | Age | Did not | Not calculated | “None” | Moderate; poor; poor | |
| Zahl 200457; ecological | Norway (N); 50–74 year olds; 1995–2000 | N: pre-screening period (1991) | Age | Up to 4 (N) and 14 (S) years FU post-screening | (Excess cases)/ (expected cases) during screening | N: 56% (42 to 73%) increased incidence with no post-screening drop; S: 45% (41 to 49%) increased incidence with 12% drop | Moderate; poor (N), fair (S); good | |
| Zahl 201258; ecological | Norway; 50–79 year olds; 1995–2009 | Pre-screening trend (1991–95) | Age, county, population growth, baseline incidence trend | Up to 14 years FU post-screening | (Excess cases)/ (expected cases) during screening | Confirmed 50% incidence growth from Zahl 2004, with non-significant drop of 7% in women aged 70–74 | Moderate; fair; good | |
| Ciatto 200559; cohort | Italy; 60–74 year olds; 1991–2000 | Contemporary counties without screening | Age | 7–9 years FU post-screening | (Excess cases)/ (expected cases) during screening and 9 years post-screening | 66% (40 to 100%) | Moderate; fair; poor | |
| Zappa 199860; cohort and modeling | Italy; 60 or 65 year olds; not provided | Contemporary counties without screening | None | 4 years FU post-screening | (Excess cases)/ (expected cases) during screening and 4 years post-screening | Age 60: 25% (19 to 32%) | Moderate; fair; poor | |
HRT=hormone replacement therapy; FU=follow up; CA=Canada; NSW=New South Wales, Australia.
*Unclear if Duffy 2010 estimates of overdiagnosis include ductal carcinoma in situ.
†Does not include ductal carcinoma in situ.
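
The ecological and cohort studies above divide excess cases by either expected or observed cases, so their percentages are not directly comparable. The sketch below converts between the two denominators. It assumes that observed cases equal expected cases plus excess cases and that both figures cover the same population and period, which the individual studies may not satisfy. The function names are illustrative.

```python
def expected_to_observed(ratio: float) -> float:
    """Convert excess/expected into excess/observed, where observed = expected + excess."""
    return ratio / (1 + ratio)


def observed_to_expected(ratio: float) -> float:
    """Convert excess/observed into excess/expected, where expected = observed - excess."""
    return ratio / (1 - ratio)


# Bleyer 2012 reports 31% with observed cases as the denominator.
bleyer_observed = 0.31
print(f"excess/expected equivalent: {observed_to_expected(bleyer_observed):.1%}")

# Falk 2013 reports 19.4% with expected cases as the denominator.
falk_expected = 0.194
print(f"excess/observed equivalent: {expected_to_observed(falk_expected):.1%}")
```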
Summary of evidence from 3 randomized controlled trial follow-up studies quantifying overdiagnosis from cancer screening
| Study | Study population: country; age; time period | Post-study length of follow-up | Calculation of overdiagnosis | Magnitude of overdiagnosis (95% CI) | Risk of bias; time frame; analysis |
|---|---|---|---|---|---|
| Patz 201362 | US high risk; 55–74 year olds; 2002–09 | Up to 7 years | (Excess cases)/(screen detected cases) | 18.5% (5.4 to 30.6%) | Low; fair; good |
| Miller 201463 | Canada; 40–59 year olds; 1980–2005 | 22 years (average) | (Excess cases)/(screen detected cases) | 22% | Low; good; good |
| Zackrisson 200661 | Sweden; 55–69 year olds; 1976–86 | 15 years | (Excess cases)/(control cases) during trial and 15 years follow-up | 10% (1 to 18%)* | Low; good; poor† |
*Welch et al re-analysis64 found overdiagnosis of 15% as percentage of cases diagnosed during screening period; overdiagnosis of 24% as percentage of screen detected cases.
†Welch et al re-analysis rated as good.
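
The trial follow-up estimates above share a numerator (excess cases in the screened arm) but use different denominators, which is why the Welch et al re-analysis yields 15% and 24% from the same trial. The sketch below shows the arithmetic with hypothetical counts chosen only for illustration; they are not data from any of the trials, and the field names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TrialCounts:
    """Cumulative cancer counts at the end of follow-up (hypothetical values)."""
    screened_arm_total: int       # all cancers in the screened arm
    control_arm_total: int        # all cancers in the control arm (equal-sized arm assumed)
    screening_period_cases: int   # screened-arm cancers diagnosed during the screening period
    screen_detected_cases: int    # screened-arm cancers detected by screening


def overdiagnosis_estimates(c: TrialCounts) -> dict:
    """Excess cases expressed against three commonly used denominators, in percent."""
    excess = c.screened_arm_total - c.control_arm_total
    return {
        "per control case": round(100 * excess / c.control_arm_total, 1),
        "per screening period case": round(100 * excess / c.screening_period_cases, 1),
        "per screen detected case": round(100 * excess / c.screen_detected_cases, 1),
    }


# Hypothetical counts, not taken from any trial in the table.
example = TrialCounts(
    screened_arm_total=1100,
    control_arm_total=1000,
    screening_period_cases=700,
    screen_detected_cases=450,
)

for denominator, value in overdiagnosis_estimates(example).items():
    print(f"{denominator}: {value}%")
```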
Strengths and weaknesses of the main research methods used to quantify overdiagnosis from cancer screening
| Research method | Strengths | Weaknesses |
|---|---|---|
| Follow-up of randomized controlled trials | Best able to minimize biases | Substantial time and resource requirements |
| Modeling | Can project through areas of uncertainty | Validity of results depends on assumptions (poor directness) |
| Pathological and imaging studies | Can be used for monitoring over time | Validity of results depends on assumptions (poor directness) |
| Ecological and cohort studies | Directly answers question of interest | Potential for confounding factors related to diagnosis, treatment, and health status between populations |