Jennifer A Cooper, David Jenkinson, Chris Stinton, Matthew G Wallis, Sue Hudson, Sian Taylor-Phillips.
Abstract
OBJECTIVES: In breast cancer screening, two readers separately examine each woman's mammograms for signs of cancer. We examined whether preventing the two readers from seeing each other's decisions (blinding) affects behaviour and outcomes.
Keywords: Breast neoplasms; Early detection of cancer; Mammography; Markov chains; Mass screening
Year: 2021 PMID: 34117912 PMCID: PMC8660753 DOI: 10.1007/s00330-021-07965-z
Source DB: PubMed Journal: Eur Radiol ISSN: 0938-7994 Impact factor: 5.315
Fig. 1 Study flow diagram with reasons for exclusions. There were 46 centres in the CO-OPS trial, but three shared a common computer system and so are counted as one centre in this analysis. A further centre, which had no reader identifiers, was removed, giving 43 centres in the dataset. Of the 43 centres, 23 were classified as not blinded, 14 as blinded, and 6 as mixed. Reader 2 was blinded for 34.2% (382,490/1,119,191) of women screened
Characteristics of the study sample, and recall and cancer detection rates for blinded versus not blinded reader 2
| Study characteristic | Category | Reader 2 blinded (n) | % | Reader 2 not blinded (n) | % |
|---|---|---|---|---|---|
| Age of woman | Mean age | 59.2 | | 59.3 | |
| | Group 1 (52 or less) | 90,505 | 23.66 | 167,217 | 22.70 |
| | Group 2 (53-59) | 111,642 | 29.19 | 214,996 | 29.18 |
| | Group 3 (60 or more) | 180,343 | 47.15 | 354,488 | 48.12 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| First screen/subsequent screen | Subsequent screen | 300,820 | 78.65 | 581,080 | 78.88 |
| | First screen | 81,670 | 21.35 | 155,621 | 21.12 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| Reader 1 recall (pre arbitration) | Not recalled | 363,034 | 94.91 | 697,294 | 94.65 |
| | Recalled | 19,456 | 5.09 | 39,407 | 5.35 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| Reader 2 recall (pre arbitration) | Not recalled | 364,387 | 95.27 | 698,512 | 94.82 |
| | Recalled | 18,103 | 4.73 | 38,189 | 5.18 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| Recall (after arbitration) | Not recalled | 367,341 | 96.04 | 706,082 | 95.84 |
| | Recalled | 15,149 | 3.96 | 30,619 | 4.16 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| Cancers detected by reader 1 | Cancer detected | 3066 | 0.80 | 5717 | 0.78 |
| | No cancer detected | 379,424 | 99.20 | 730,984 | 99.22 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| Cancers detected by reader 2 | Cancer detected | 3226 | 0.84 | 6117 | 0.83 |
| | No cancer detected | 379,264 | 99.16 | 730,584 | 99.17 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
| Cancers detected overall (after arbitration) | Cancer detected | 3355 | 0.88 | 6301 | 0.86 |
| | No cancer detected | 379,135 | 99.12 | 730,400 | 99.14 |
| | Total | 382,490 | 100.00 | 736,701 | 100.00 |
Fig. 2 Facet plot showing the probability (with 95% credible interval) of reader 2 recalling a woman (at the mean age of 59.27) by whether reader 1 (R1) recalls or not, when reader 2 (R2) is blinded and not blinded. Results are also presented separately for women attending a first screen and women attending a subsequent screen. The probability of recall is lower for a woman attending a subsequent screen than for a woman attending a first screen
Markov chain Monte Carlo (MCMC) multilevel model determining the effect of blinding on recall rate overall (after arbitration). The full model including one-sided Bayesian p values is reported in the Supplementary Material (Table A.1) along with the caterpillar plot showing level 3 residuals and their 95% CIs (Figure A.1)
| Recall rate overall (after arbitration) multilevel model | | | |
|---|---|---|---|
| Fixed effects | Odds ratio (a) | 95% credible interval for the odds ratio (b) | Pr(>\|z\|) (c) |
| Blinding yes (versus no as the reference category) | 0.869 | 0.826, 0.913 | < 0.001 |
| Age (centred) | 1.007 | 1.005, 1.009 | < 0.001 |
| First screen (versus subsequent screen as the reference category) | 2.823 | 2.728, 2.922 | < 0.001 |
| Blinding yes × age (interaction term) | 1.005 | 1.001, 1.008 | 0.015 |
| Blinding yes × first screen (interaction term) | 1.060 | 0.999, 1.125 | 0.057 |
(a) The mean of the 100,000 chain iterations after converting from the log odds scale to the odds scale
(b) The 95% credible interval is generated by taking the 2.5th and 97.5th quantiles of the 100,000 chain iterations after converting from the log odds scale to the odds scale
(c) Two-tailed p value of the z score for the coefficient (testing whether the estimate is significantly different from zero assuming normality)
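The table footnotes describe how each summary is derived from the chain: the odds ratio is the mean of the exponentiated draws, the credible interval comes from their 2.5th and 97.5th quantiles, and the p value comes from a z score. A minimal sketch of those three steps follows. The chain here is synthetic (random draws centred near log 0.87), not the study's MCMC output, and computing the z score as the chain mean divided by the chain standard deviation is an assumption; the function name `summarise_log_odds_chain` is hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, quantiles, stdev


def summarise_log_odds_chain(chain):
    """Summarise MCMC draws of one coefficient held on the log odds scale."""
    odds = [math.exp(x) for x in chain]   # convert each draw to the odds scale
    cuts = quantiles(odds, n=40)          # 39 cut points: 2.5%, 5%, ..., 97.5%
    # Assumption: z score = posterior mean / posterior SD on the log odds scale.
    z = mean(chain) / stdev(chain)
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "odds_ratio": mean(odds),         # mean of the exponentiated draws
        "ci_low": cuts[0],                # 2.5th quantile
        "ci_high": cuts[-1],              # 97.5th quantile
        "p": p_two_tailed,
    }


# Synthetic stand-in for a chain of 100,000 iterations (illustrative values only).
rng = random.Random(0)
chain = [rng.gauss(math.log(0.87), 0.025) for _ in range(100_000)]
summary = summarise_log_odds_chain(chain)
print(summary)
```

Exponentiating each draw before averaging, rather than exponentiating the mean log odds, matches the wording of the first footnote; the two differ slightly because the exponential is non-linear.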
Fig. 3 Interaction plot displaying the odds ratios (median and 95% credible intervals) of recall rate after arbitration when reader 2 is blinded versus not blinded by screen status and age. See accompanying Table A.2 in the Supplementary Material
Markov chain Monte Carlo (MCMC) multilevel model determining the effect of blinding on cancers detected overall (after arbitration). Interactions were not included in this model because the model with interactions had a higher deviance information criterion (DIC) than the model without them, and the interaction estimates had non-significant p values. The full model including one-sided Bayesian p values is reported in the Supplementary Material (Table A.5) along with the caterpillar plot showing level 3 residuals and their 95% CIs (Figure A.3)
| Cancer detection overall (after arbitration) multilevel model | | | |
|---|---|---|---|
| Fixed effects | Odds ratio (a) | 95% credible interval for the odds ratio (b) | Pr(>\|z\|) (c) |
| Blinding yes (versus no as the reference category) | 1.029 | 0.970, 1.089 | 0.341 |
| Age (centred) | 1.052 | 1.049, 1.056 | < 0.001 |
| First screen (versus subsequent screen as the reference category) | 1.696 | 1.591, 1.807 | < 0.001 |
(a) The mean of the 100,000 chain iterations after converting from the log odds scale to the odds scale
(b) The 95% credible interval is generated by taking the 2.5th and 97.5th quantiles of the 100,000 chain iterations after converting from the log odds scale to the odds scale
(c) Two-tailed p value of the z score for the coefficient (testing whether the estimate is significantly different from zero assuming normality)
Two-by-two tables of positive predictive value (PPV), after arbitration and for reader 2 alone, by whether reader 2 is blinded or not. Chi-squared tests for independence assess whether PPV after arbitration is independent of blinding (χ2(1) = 14.9, p < 0.001) and whether PPV of reader 2 is independent of blinding (χ2(1) = 28.7, p < 0.001)
| | Reader 2 blinded | Reader 2 not blinded | Total |
|---|---|---|---|
| Cancer detected (TP) after arbitration | 3355 | 6301 | 9656 |
| Recall (after arbitration), no cancer detected (FP) | 11,794 | 24,318 | 36,112 |
| Total recalled (TP + FP) | 15,149 | 30,619 | 45,768 |
| PPV (TP/total recalled) % | 22.1% | 20.6% | |
| Cancer detected (TP) (reader 2) | 3226 | 6117 | 9343 |
| Recall (reader 2), no cancer detected (FP) | 14,877 | 32,072 | 46,949 |
| Total recalled (TP + FP) | 18,103 | 38,189 | 56,292 |
| PPV (TP/total recalled) % | 17.8% | 16.0% | |
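The PPVs and chi-squared statistics in this table can be recomputed from the TP and FP counts alone. The sketch below does so in plain Python. Applying Yates' continuity correction is an assumption: the source does not state which variant of the test was used, but the corrected statistic is the one that agrees with the reported 14.9 and 28.7 to one decimal place. The function names are illustrative.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: cancers detected over all women recalled."""
    return tp / (tp + fp)


def chi2_2x2_yates(a: int, b: int, c: int, d: int) -> float:
    """Chi-squared statistic (1 df) for the 2x2 table [[a, b], [c, d]].

    Uses Yates' continuity correction (an assumption, not stated in the source).
    """
    n = a + b + c + d
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff**2 / denom


# (TP, FP) counts taken from the table, for reader 2 blinded and not blinded.
tables = {
    "after arbitration": ((3355, 11794), (6301, 24318)),
    "reader 2": ((3226, 14877), (6117, 32072)),
}

for label, ((a, b), (c, d)) in tables.items():
    print(
        f"{label}: PPV blinded={ppv(a, b):.1%}, "
        f"PPV not blinded={ppv(c, d):.1%}, "
        f"chi2(1)={chi2_2x2_yates(a, b, c, d):.1f}"
    )
```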
Test accuracy statistic estimates for reader 2 blinded or not blinded. Cancer includes those detected at the screen and interval cancers within 3 years of the screen
| Recall (after arbitration) | Reader 2 blinded: cancer detected | Reader 2 blinded: interval cancer | Reader 2 blinded: no cancer | Reader 2 blinded: total | Reader 2 not blinded: cancer detected | Reader 2 not blinded: interval cancer | Reader 2 not blinded: no cancer | Reader 2 not blinded: total |
|---|---|---|---|---|---|---|---|---|
| Yes | 3355 | 44 | 11,750 | 15,149 | 6301 | 90 | 24,228 | 30,619 |
| No | 0 | 724 | 366,617 | 367,341 | 0 | 1346 | 704,736 | 706,082 |
| Total | 3355 | 768 | 378,367 | 382,490 | 6301 | 1436 | 728,964 | 736,701 |

| Test accuracy statistic | Reader 2 blinded | 95% confidence interval | Reader 2 not blinded | 95% confidence interval | Equality of proportions test, χ2(1) | p value |
|---|---|---|---|---|---|---|
| Sensitivity | 82.44% | (81.28%, 83.60%) | 82.60% | (81.76%, 83.45%) | 0.0390 | 0.843 |
| Specificity | 96.89% | (96.84%, 96.95%) | 96.68% | (96.64%, 96.72%) | 37.6 | < 0.001 |
| PPV | 22.44% | (21.77%, 23.10%) | 20.87% | (20.42%, 21.33%) | 14.7 | < 0.001 |
| NPV | 99.80% | (99.79%, 99.82%) | 99.81% | (99.80%, 99.82%) | 0.491 | 0.483 |
PPV: positive predictive value; NPV: negative predictive value
The number of women with cancer used in the "test accuracy" statistics was found by adding the detected and interval cancers together
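As a worked check of the note above, the sketch below recomputes the four test accuracy statistics from the recall-by-cancer counts, treating detected plus interval cancers as the women with cancer. The interval uses the normal approximation (Wald) formula, which is an assumption: the source does not name its interval method, although this formula agrees with the reported bounds for the values checked. Function and argument names are illustrative.

```python
from math import sqrt


def proportion_with_wald_ci(successes: int, total: int, z: float = 1.96):
    """Return (estimate, lower, upper) using a normal-approximation interval."""
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


def accuracy_statistics(detected_recalled, interval_recalled, none_recalled,
                        interval_not_recalled, none_not_recalled):
    """Test accuracy where 'cancer' = screen-detected plus interval cancers."""
    tp = detected_recalled + interval_recalled  # recalled, cancer present
    fp = none_recalled                          # recalled, no cancer
    fn = interval_not_recalled                  # not recalled, interval cancer
    tn = none_not_recalled                      # not recalled, no cancer
    return {
        "sensitivity": proportion_with_wald_ci(tp, tp + fn),
        "specificity": proportion_with_wald_ci(tn, tn + fp),
        "ppv": proportion_with_wald_ci(tp, tp + fp),
        "npv": proportion_with_wald_ci(tn, tn + fn),
    }


# Counts from the table above.
blinded = accuracy_statistics(3355, 44, 11750, 724, 366617)
not_blinded = accuracy_statistics(6301, 90, 24228, 1346, 704736)

for name in ("sensitivity", "specificity", "ppv", "npv"):
    for label, stats in (("blinded", blinded), ("not blinded", not_blinded)):
        est, low, high = stats[name]
        print(f"{name} ({label}): {est:.2%} ({low:.2%}, {high:.2%})")
```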