Jaimie Z Shing, Marie R Griffin, Linh D Nguyen, James C Slaughter, Edward F Mitchel, Manideepthi Pemmaraju, Alyssa B Rentuza, Pamela C Hull.
Abstract
Background: Human papillomavirus (HPV) vaccine impact on cervical precancer (cervical intraepithelial neoplasia grades 2+ [CIN2+]) is observable sooner than impact on cancer. Biopsy-confirmed CIN2+ is not included in most US cancer registries. Billing codes could provide surrogate metrics; however, the transition from the International Classification of Diseases, ninth revision (ICD-9) to the tenth revision (ICD-10) disrupts trends. We built, validated, and compared claims-based models to identify CIN2+ events in both ICD eras.
Year: 2020 PMID: 33554035 PMCID: PMC7853170 DOI: 10.1093/jncics/pkaa112
Source DB: PubMed Journal: JNCI Cancer Spectr ISSN: 2515-5091
Figure 1. Flow diagram to capture cohort of cervical diagnostic procedural encounters from 2008 to 2017 among Tennessee Medicaid (TennCare)-enrolled women aged 18-39 years residing in Davidson County, Tennessee. (a) Confirmed cervical intraepithelial neoplasia (CIN) grades 2+ (CIN2+) events are all events reported and validated by the Human Papillomavirus Vaccine Impact Monitoring Project (HPV-IMPACT); multiple events may be included for each woman. (b) Confirmed incident CIN2+ events are index events for each woman.
Characteristics of cervical diagnostic procedures among TennCare-enrolled women aged 18-39 years in Davidson County, Tennessee, by ICD era
| Characteristic | ICD-9 era (N = 7105), No. (%) | ICD-10 era (N = 1444), No. (%) | P |
|---|---|---|---|
| Confirmed CIN2+ | | | <.001 |
| Yes | 885 (12.5) | 231 (16.0) | |
| No | 6220 (87.5) | 1213 (84.0) | |
| Age group, y | | | <.001 |
| 18-24 | 3062 (43.1) | 261 (18.1) | |
| 25-29 | 2081 (29.3) | 507 (35.1) | |
| 30-39 | 1962 (27.6) | 676 (46.8) | |
| Race or ethnicity | | | <.001 |
| NH White | 2011 (28.3) | 341 (23.6) | |
| NH Black | 2184 (30.7) | 382 (26.5) | |
| NH other or unknown | 2755 (38.8) | 690 (47.8) | |
| Hispanic | 155 (2.2) | 31 (2.2) | |
| CIN2+ | | | <.001 |
| Yes | 1508 (21.2) | 381 (26.4) | |
| No | 5597 (78.8) | 1063 (73.6) | |
| Nonspecific CIN tissue diagnosis code | | | <.001 |
| Yes | 808 (11.4) | 119 (8.2) | |
| No | 6297 (88.6) | 1325 (91.8) | |
| HGSIL cytologic diagnosis code | | | .20 |
| Yes | 845 (11.9) | 189 (13.1) | |
| No | 6260 (88.1) | 1255 (86.9) | |
| CIN1 tissue diagnosis code | | | .95 |
| Yes | 1831 (25.8) | 371 (25.7) | |
| No | 5274 (74.2) | 1073 (74.3) | |
| Low-grade squamous intraepithelial lesion cytologic diagnosis code | | | .44 |
| Yes | 2492 (35.1) | 491 (34.0) | |
| No | 4613 (64.9) | 953 (66.0) | |
| ASCUS diagnosis code | | | .03 |
| Yes | 2480 (34.9) | 547 (37.9) | |
| No | 4625 (65.1) | 897 (62.1) | |
| HPV screening test code | | | <.001 |
| Yes | 173 (2.4) | 167 (11.6) | |
| No | 6932 (97.6) | 1277 (88.4) | |
| Pap smear or test code | | | .02 |
| Yes | 4987 (70.2) | 1059 (73.3) | |
| No | 2118 (29.8) | 385 (26.7) | |
| HPV DNA test code | | | .11 |
| Yes | 3434 (48.3) | 731 (50.6) | |
| No | 3671 (51.7) | 713 (49.4) | |
| Cervical treatment procedure code | | | .65 |
| Yes | 469 (6.6) | 100 (6.9) | |
| No | 6636 (93.4) | 1344 (93.1) | |
| Cervical or vaginal biopsy code | | | <.001 |
| Yes | 3140 (44.2) | 735 (50.9) | |
| No | 3965 (55.8) | 709 (49.1) | |
The ICD-9 era includes procedures from January 1, 2008, through September 30, 2015; the ICD-10 era includes procedures from October 1, 2015, through December 31, 2017. ASCUS = atypical squamous cells of undetermined significance; CIN = cervical intraepithelial neoplasia; DNA = deoxyribonucleic acid; HGSIL = high-grade squamous intraepithelial lesion; HPV = human papillomavirus; ICD = International Classification of Diseases, Clinical Modification; NH = non-Hispanic; Pap = Papanicolaou; TennCare = Tennessee Medicaid.
CIN2+ includes CIN2, CIN3, and adenocarcinoma in situ.
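The P values in the table compare how each characteristic is distributed across the two ICD eras. The test used is not stated in this excerpt. As a minimal sketch, assuming a Pearson chi-square test without continuity correction on a 2×2 table of era by Yes/No counts, the reported values for confirmed CIN2+ (<.001) and for the CIN1 tissue diagnosis code (.95) can be approximated as follows. The function name `chi_square_2x2` is illustrative and not from the source.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p-value) with 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows are ICD eras (ICD-9, ICD-10); columns are Yes / No counts from the table.
stat_cin2, p_cin2 = chi_square_2x2(885, 6220, 231, 1213)     # confirmed CIN2+
stat_cin1, p_cin1 = chi_square_2x2(1831, 5274, 371, 1073)    # CIN1 tissue diagnosis code

print(f"Confirmed CIN2+: chi2 = {stat_cin2:.2f}, P = {p_cin2:.4f}")
print(f"CIN1 tissue code: chi2 = {stat_cin1:.4f}, P = {p_cin1:.2f}")
```

Under this assumed test, the first comparison falls below .001 and the second is close to .95, in line with the table. Characteristics with more than two levels (age group, race or ethnicity) would need a test with more degrees of freedom, which this sketch does not cover.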
Beta coefficients and predictor importance scores of LASSO and random forest algorithms to classify CIN2+ event status in the training set (N = 5129) of cervical diagnostic procedures among TennCare-enrolled women aged 18-39 years in Davidson County, Tennessee
| Predictors | LASSO beta coefficients | Random forest predictor importance scores |
|---|---|---|
| Constant | −5.915605 | — |
| CIN2+ tissue diagnosis | 5.341873 | 0.695894 |
| Cervical treatment procedure | 0.9440706 | 0.089150 |
| Cervical or vaginal biopsy | 0.9414902 | 0.032999 |
| High-grade squamous intraepithelial lesion diagnosis | 0.9338596 | 0.095700 |
| Nonspecific CIN diagnosis | 0.3964537 | 0.028032 |
| Low-grade squamous intraepithelial lesion diagnosis | 0.3541705 | 0.010605 |
| ASCUS diagnosis | 0.2838765 | 0.010486 |
| CIN1 tissue diagnosis | −0.2115674 | 0.015590 |
| HPV DNA test | −0.2082338 | 0.008846 |
| Pap smear or test | −0.1695168 | 0.011962 |
| HPV screening test | −0.0893877 | 0.000737 |
LASSO and random forest algorithms were built using training sets of both ICD-9 and ICD-10 eras combined. ASCUS = atypical squamous cells of undetermined significance; CIN = cervical intraepithelial neoplasia; DNA = deoxyribonucleic acid; HPV = human papillomavirus; ICD = International Classification of Diseases, Clinical Modification; LASSO = least absolute shrinkage and selection operator; Pap = Papanicolaou; TennCare = Tennessee Medicaid.
CIN2+ includes CIN2, CIN3, and adenocarcinoma in situ.
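To make the LASSO column concrete, the coefficients can be read as a logistic regression score. The sketch below assumes each predictor is a 0/1 indicator for whether the code appears on the encounter and that the model uses the standard inverse-logit link; neither detail, nor the probability cutoff used for classification, is stated in this excerpt. The dictionary keys and the function name `predicted_probability` are illustrative.

```python
import math

# LASSO beta coefficients copied from the table above.
INTERCEPT = -5.915605
BETAS = {
    "cin2plus_tissue_dx": 5.341873,
    "cervical_treatment": 0.9440706,
    "cervical_or_vaginal_biopsy": 0.9414902,
    "hgsil_dx": 0.9338596,
    "nonspecific_cin_dx": 0.3964537,
    "lsil_dx": 0.3541705,
    "ascus_dx": 0.2838765,
    "cin1_tissue_dx": -0.2115674,
    "hpv_dna_test": -0.2082338,
    "pap_test": -0.1695168,
    "hpv_screening_test": -0.0893877,
}

def predicted_probability(codes_present):
    """Inverse-logit of the intercept plus the betas of the codes present.

    Assumes binary indicators: a code contributes its beta once if present.
    """
    unknown = set(codes_present) - set(BETAS)
    if unknown:
        raise KeyError(f"unknown predictors: {sorted(unknown)}")
    linear = INTERCEPT + sum(BETAS[name] for name in set(codes_present))
    return 1.0 / (1.0 + math.exp(-linear))

p_no_codes = predicted_probability([])
p_code_alone = predicted_probability(["cin2plus_tissue_dx"])
p_code_biopsy = predicted_probability(
    ["cin2plus_tissue_dx", "cervical_or_vaginal_biopsy"]
)
print(f"no codes: {p_no_codes:.4f}")
print(f"CIN2+ tissue diagnosis only: {p_code_alone:.4f}")
print(f"CIN2+ tissue diagnosis + biopsy: {p_code_biopsy:.4f}")
```

Under these assumptions the CIN2+ tissue diagnosis code dominates the score, and the smaller positive coefficients (treatment, biopsy, HGSIL) move an encounter further toward a positive classification.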
Figure 2. Confusion matrices of claims-based models to classify cervical intraepithelial neoplasia (CIN) grades 2+ (CIN2+) event status in the testing set of cervical diagnostic procedures among Tennessee Medicaid (TennCare)-enrolled women aged 18-39 years in Davidson County, Tennessee, by International Classification of Diseases, Clinical Modification (ICD) era. CIN2+ includes CIN2, CIN3, and adenocarcinoma in situ. The ICD-9 era includes procedures from January 1, 2008, through September 30, 2015; the ICD-10 era includes procedures from October 1, 2015, through December 31, 2017. LASSO = least absolute shrinkage and selection operator.
Performance of claims-based models to classify CIN2+ event status among cervical diagnostic procedures of TennCare-enrolled women aged 18-39 years in Davidson County, Tennessee, by ICD era
| Model | CIN2+ tissue diagnosis codes alone | CIN2+ tissue diagnosis codes alone | LASSO | LASSO | LASSO | LASSO | Random forest | Random forest | Random forest | Random forest |
|---|---|---|---|---|---|---|---|---|---|---|
| ICD era | ICD-9 (N = 7105) | ICD-10 (N = 1444) | ICD-9 (N = 7105) | ICD-9 (N = 7105) | ICD-10 (N = 1444) | ICD-10 (N = 1444) | ICD-9 (N = 7105) | ICD-9 (N = 7105) | ICD-10 (N = 1444) | ICD-10 (N = 1444) |
| Performance measure | Testing set (n = 2842) | Testing set (n = 578) | Training set (n = 4263) | Testing set (n = 2842) | Training set (n = 866) | Testing set (n = 578) | Training set (n = 4263) | Testing set (n = 2842) | Training set (n = 866) | Testing set (n = 578) |
| Sensitivity, % (95% CI) | 96.1 (93.5 to 97.8) | 97.8 (92.2 to 99.7) | 82.0 (78.5 to 85.2) | 77.3 (72.5 to 81.5) | 75.2 (67.2 to 82.1) | 81.1 (71.5 to 88.6) | 79.8 (76.1 to 83.1) | 70.2 (65.2 to 74.9) | 75.9 (68.0 to 82.7) | 75.6 (65.4 to 84.0) |
| Specificity, % (95% CI) | 88.7 (87.3 to 89.9) | 85.3 (81.8 to 88.3) | 94.2 (93.4 to 94.9) | 93.0 (91.9 to 94.0) | 93.1 (91.0 to 94.8) | 90.2 (87.2 to 92.7) | 95.0 (94.3 to 95.7) | 93.8 (92.8 to 94.8) | 93.5 (91.5 to 95.2) | 90.6 (87.6 to 93.0) |
| PPV, % (95% CI) | 54.8 (50.8 to 58.8) | 55.0 (46.9 to 62.9) | 66.8 (63.0 to 70.4) | 61.3 (56.6 to 65.8) | 68.0 (60.0 to 75.2) | 60.3 (51.0 to 69.1) | 69.5 (65.7 to 73.2) | 62.0 (57.1 to 66.8) | 69.5 (61.6 to 76.6) | 59.6 (50.1 to 68.7) |
| NPV, % (95% CI) | 99.4 (98.9 to 99.7) | 99.5 (98.3 to 99.9) | 97.4 (96.8 to 97.9) | 96.6 (95.8 to 97.3) | 95.1 (93.2 to 96.5) | 96.3 (94.1 to 97.8) | 97.1 (96.5 to 97.6) | 95.7 (94.8 to 96.4) | 95.2 (93.4 to 96.7) | 95.3 (92.9 to 97.0) |
| Accuracy, % (95% CI) | 89.6 (88.4 to 90.7) | 87.2 (84.2 to 89.8) | 92.7 (91.9 to 93.5) | 91.0 (89.9 to 92.1) | 90.2 (88.0 to 92.1) | 88.8 (85.9 to 91.2) | 93.2 (92.4 to 93.9) | 90.9 (89.8 to 91.9) | 90.6 (88.5 to 92.5) | 88.2 (85.3 to 90.7) |
| C-Index, % (95% CI) | 92.4 (91.2 to 93.6) | 91.5 (89.3 to 93.7) | 88.1 (86.5 to 89.8) | 85.1 (82.9 to 87.4) | 84.1 (80.5 to 87.8) | 85.6 (81.4 to 89.9) | 87.4 (85.7 to 89.2) | 82.0 (79.6 to 84.5) | 84.7 (81.1 to 88.4) | 83.1 (78.4 to 87.7) |
CIN2+ includes CIN2, CIN3, and adenocarcinoma in situ. CI = confidence interval; CIN = cervical intraepithelial neoplasia; ICD = International Classification of Diseases, Clinical Modification; LASSO = least absolute shrinkage and selection operator; NPV = negative predictive value; PPV = positive predictive value; TennCare = Tennessee Medicaid.
The ICD-9 era includes procedures from January 1, 2008, through September 30, 2015; the ICD-10 era includes procedures from October 1, 2015, through December 31, 2017.
Performance in the training and testing sets is statistically significantly different (95% confidence intervals do not overlap with each other).
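The performance measures in the table all derive from a 2×2 confusion matrix of model classification against confirmed CIN2+ status. The sketch below shows the standard definitions. The counts are hypothetical: they were chosen only so that the results round to the ICD-9 testing-set values for CIN2+ tissue diagnosis codes alone, and they are not taken from Figure 2. The C-index line assumes a single binary classification, in which case the area under the ROC curve reduces to the mean of sensitivity and specificity; whether the authors computed it this way is not stated here.

```python
def performance(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # confirmed events the model flags
    specificity = tn / (tn + fp)   # non-events the model leaves unflagged
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # flagged encounters that are true events
        "npv": tn / (tn + fn),     # unflagged encounters that are non-events
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        # For one binary classifier, ROC AUC = (sensitivity + specificity) / 2.
        "c_index": (sensitivity + specificity) / 2,
    }

# Hypothetical counts summing to 2842 encounters (illustration only).
measures = performance(tp=341, fp=281, fn=14, tn=2206)
for name, value in measures.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these illustrative counts, the high sensitivity and low PPV sit side by side because false positives (281) are large relative to true positives (341) even though specificity is near 89%, which reflects the low prevalence of confirmed CIN2+ among encounters.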
Performance of claims-based models to classify CIN2+ event status by age group in the testing set of cervical diagnostic procedures among TennCare-enrolled women aged 18-39 years in Davidson County, Tennessee, by ICD era
| Age group | Aged 18-24 y (N = 1349) | Aged 18-24 y (N = 1349) | Aged 25-29 y (N = 1033) | Aged 25-29 y (N = 1033) | Aged 30-39 y (N = 1038) | Aged 30-39 y (N = 1038) |
|---|---|---|---|---|---|---|
| Performance measure | ICD-9 (n = 1248) | ICD-10 (n = 101) | ICD-9 (n = 835) | ICD-10 (n = 198) | ICD-9 (n = 759) | ICD-10 (n = 279) |
| CIN2+ tissue diagnosis codes alone | ||||||
| Sensitivity, % (95% CI) | 99.2 (95.5 to 99.8) | 91.7 (61.5 to 99.8) | 93.0 (86.6 to 96.9) | 100.0 (91.4 to 100.0) | 95.8 (90.5 to 98.6) | 97.3 (85.8 to 99.9) |
| Specificity, % (95% CI) | 90.0 (88.1 to 91.7) | 89.9 (81.7 to 95.3) | 87.2 (84.6 to 89.6) | 80.9 (73.9 to 86.7) | 88.0 (85.2 to 90.4) | 86.4 (81.4 to 90.4) |
| PPV, % (95% CI) | 51.7 (45.1 to 58.3) | 55.0 (31.5 to 76.9) | 53.3 (46.3 to 60.6) | 57.7 (45.4 to 69.4) | 59.9 (52.6 to 66.9) | 52.2 (39.8 to 64.4) |
| NPV, % (95% CI) | 99.9 (99.5 to 100.0) | 98.8 (93.3 to 100.0) | 98.7 (97.5 to 99.5) | 100.0 (97.1 to 100.0) | 99.1 (98.0 to 99.7) | 99.5 (97.4 to 100.0) |
| Accuracy, % (95% CI) | 90.9 (89.1 to 92.4) | 90.1 (82.5 to 95.2) | 88.0 (85.6 to 90.2) | 84.9 (79.1 to 89.5) | 89.2 (86.8 to 91.3) | 87.8 (83.4 to 91.4) |
| C-Index, % (95% CI) | 94.6 (93.4 to 95.8) | 90.8 (82.0 to 99.5) | 98.7 (97.5 to 99.5) | 90.5 (87.4 to 93.5) | 91.9 (89.7 to 94.1) | 91.8 (88.4 to 95.3) |
| LASSO | ||||||
| Sensitivity, % (95% CI) | 81.1 (73.1 to 87.7) | 75.0 (42.8 to 94.5) | 74.6 (65.6 to 82.3) | 80.5 (65.1 to 91.2) | 75.8 (67.2 to 83.2) | 83.8 (68.0 to 93.8) |
| Specificity, % (95% CI) | 93.7 (92.1 to 95.0) | 94.4 (87.4 to 98.2) | 92.0 (89.7 to 93.8) | 87.9 (81.7 to 92.6) | 93.0 (90.7 to 94.8) | 90.1 (85.6 to 93.5) |
| PPV, % (95% CI) | 58.2 (50.4 to 65.7) | 64.3 (35.1 to 87.2) | 59.4 (50.9 to 67.6) | 63.5 (49.0 to 76.4) | 66.9 (58.3 to 74.7) | 56.4 (42.3 to 69.7) |
| NPV, % (95% CI) | 97.9 (96.8 to 98.6) | 96.6 (90.3 to 99.3) | 95.8 (94.0 to 97.2) | 94.5 (89.5 to 97.6) | 95.3 (93.4 to 96.9) | 97.3 (94.3 to 99.0) |
| Accuracy, % (95% CI) | 92.5 (90.1 to 93.9) | 92.1 (85.0 to 96.5) | 89.6 (87.3 to 91.6) | 86.4 (80.8 to 90.8) | 90.3 (87.9 to 92.3) | 89.3 (85.0 to 92.6) |
| C-Index, % (95% CI) | 87.4 (83.9 to 91.0) | 84.7 (71.7 to 97.7) | 83.3 (79.1 to 87.4) | 84.2 (77.5 to 90.9) | 84.4 (80.4 to 88.4) | 86.9 (80.6 to 93.2) |
| Random forest | ||||||
| Sensitivity, % (95% CI) | 77.0 (68.6 to 84.2) | 50.0 (21.1 to 78.9) | 64.9 (55.4 to 73.6) | 73.2 (57.1 to 85.8) | 68.3 (59.2 to 76.5) | 86.5 (71.2 to 95.5) |
| Specificity, % (95% CI) | 94.7 (93.2 to 95.9) | 94.4 (87.4 to 98.2) | 92.4 (90.2 to 94.2) | 89.8 (84.0 to 94.1) | 94.1 (91.9 to 95.8) | 89.7 (85.1 to 93.2) |
| PPV, % (95% CI) | 61.0 (52.9 to 68.8) | 54.5 (23.4 to 83.3) | 57.4 (48.4 to 66.0) | 65.2 (49.8 to 78.6) | 68.3 (59.2 to 76.5) | 56.1 (42.4 to 69.3) |
| NPV, % (95% CI) | 97.4 (96.3 to 98.3)c | 93.3 (86.1 to 97.5) | 94.3 (92.4 to 95.9)c | 92.8 (87.4 to 96.3) | 94.1 (91.9 to 95.8) | 97.7 (94.8 to 99.3) |
| Accuracy, % (95% CI) | 93.0 (91.4 to 94.3) | 89.1 (81.4 to 94.4) | 88.6 (86.3 to 90.7) | 86.4 (80.8 to 90.8) | 90.0 (87.6 to 92.0) | 89.3 (85.0 to 92.6) |
| C-Index, % (95% CI) | 85.9 (82.1 to 89.7) | 72.2 (57.2 to 87.2) | 78.6 (74.1 to 83.2) | 81.5 (74.2 to 88.8) | 81.2 (76.9 to 85.5) | 88.1 (82.2 to 94.0) |
CIN2+ includes CIN2, CIN3, and adenocarcinoma in situ. CIN = cervical intraepithelial neoplasia; ICD = International Classification of Diseases, Clinical Modification; LASSO = least absolute shrinkage and selection operator; NPV = negative predictive value; PPV = positive predictive value; TennCare = Tennessee Medicaid.
The ICD-9 era includes procedures from January 1, 2008, through September 30, 2015; the ICD-10 era includes procedures from October 1, 2015, through December 31, 2017.
Performance between age groups (aged 18-24 years vs 25-29 years, 18-24 years vs 30-39 years, or 25-29 years vs 30-39 years) is statistically significantly different (95% confidence intervals do not overlap with each other).
Performance between the ICD-9 and ICD-10 eras is statistically significantly different (95% confidence intervals do not overlap with each other).
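The footnotes define a statistically significant difference as two 95% confidence intervals that do not overlap. A minimal sketch of that rule, using intervals copied from the table above (the function name `intervals_overlap` is illustrative):

```python
def intervals_overlap(ci_a, ci_b):
    """Return True when two closed intervals (low, high) share any point."""
    (low_a, high_a), (low_b, high_b) = ci_a, ci_b
    return low_a <= high_b and low_b <= high_a

# Random forest NPV, ICD-9 era: aged 18-24 y vs aged 25-29 y.
rf_npv_18_24 = (96.3, 98.3)
rf_npv_25_29 = (92.4, 95.9)
differs_by_age = not intervals_overlap(rf_npv_18_24, rf_npv_25_29)

# LASSO sensitivity, aged 18-24 y: ICD-9 vs ICD-10 era.
lasso_sens_icd9 = (73.1, 87.7)
lasso_sens_icd10 = (42.8, 94.5)
differs_by_era = not intervals_overlap(lasso_sens_icd9, lasso_sens_icd10)

print("RF NPV differs by age group:", differs_by_age)
print("LASSO sensitivity differs by era:", differs_by_era)
```

Non-overlap is a conservative criterion: intervals that do not overlap indicate a difference at roughly the 5% level, but overlapping intervals do not by themselves rule one out.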
Figure 3. Annual number of incident cervical intraepithelial neoplasia (CIN) grades 2+ (CIN2+) events identified by claims-based models among Tennessee Medicaid (TennCare)-enrolled women aged 18-39 years residing in Davidson County, Tennessee, who had cervical diagnostic procedures from 2008 to 2017. CIN2+ includes CIN2, CIN3, and adenocarcinoma in situ; incident events were determined by applying each model to the cohort of cervical diagnostic procedures and counting index events classified by each model. HPV-IMPACT = human papillomavirus vaccine impact monitoring project; LASSO = least absolute shrinkage and selection operator.
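The counting rule in the caption can be sketched as follows, assuming that an index event is the earliest encounter a model classifies as CIN2+ for a given woman and that events are tallied by calendar year of that encounter. The record layout, identifiers, and dates below are hypothetical and serve only to illustrate the rule.

```python
from collections import Counter
from datetime import date

def annual_incident_counts(encounters):
    """Count index (first model-positive) events per calendar year.

    encounters: iterable of (woman_id, procedure_date, classified_positive).
    """
    first_positive = {}
    for woman_id, procedure_date, classified_positive in encounters:
        if not classified_positive:
            continue
        earliest = first_positive.get(woman_id)
        if earliest is None or procedure_date < earliest:
            first_positive[woman_id] = procedure_date
    return Counter(d.year for d in first_positive.values())

# Hypothetical encounters (not study data).
encounters = [
    ("w1", date(2009, 3, 1), True),
    ("w1", date(2011, 6, 15), True),    # later event for w1, not incident
    ("w2", date(2011, 1, 20), False),   # model-negative encounter
    ("w2", date(2012, 8, 5), True),
    ("w3", date(2009, 11, 30), True),
]
counts = annual_incident_counts(encounters)
for year in sorted(counts):
    print(year, counts[year])
```

Counting only the first positive encounter per woman keeps repeat procedures for the same lesion from inflating the annual totals.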