Agata Foryciarz1,2, Stephen R Pfohl2, Birju Patel2, Nigam Shah2.
Abstract
OBJECTIVES: The American College of Cardiology and the American Heart Association guidelines on primary prevention of atherosclerotic cardiovascular disease (ASCVD) recommend using 10-year ASCVD risk estimation models to initiate statin treatment. For guideline-concordant decision-making, risk estimates need to be calibrated. However, existing models are often miscalibrated for race-, ethnicity- and sex-based subgroups. This study evaluates two algorithmic fairness approaches to adjust the risk estimators (group recalibration and equalised odds) for their compatibility with the assumptions underpinning the guidelines' decision rules.
METHODS: Using an updated pooled cohorts data set, we derive unconstrained, group-recalibrated and equalised odds-constrained versions of the 10-year ASCVD risk estimators, and compare their calibration at guideline-concordant decision thresholds.
Keywords: BMJ Health Informatics; clinical; decision support systems; health equity; machine learning; medical informatics
Year: 2022 PMID: 35396247 PMCID: PMC8996004 DOI: 10.1136/bmjhci-2021-100460
Source DB: PubMed Journal: BMJ Health Care Inform ISSN: 2632-1009
Figure 1. (A) Identifying an optimal therapeutic threshold. An individual with risk r should be treated if the expected value of treatment exceeds that of non-treatment. As risk increases, the benefits of treatment grow, and assigning treatment becomes preferable to withholding it. The optimal therapeutic threshold t is the value of risk at which treatment and non-treatment have the same expected value (the indifference point); for individuals with r>t, treatment is expected to be more beneficial than non-treatment. Setting a non-optimal therapeutic threshold could lead to suboptimal treatment decisions for some individuals (treating some individuals for whom non-treatment has a higher expected value, or not treating individuals for whom treatment has a higher expected value). (B) Illustration of the sensitivity of FPR and FNR to the distribution of risk. Assume that there are two easily distinguishable types of individuals, with a 5% and a 50% chance of developing a disease, respectively, and two groups composed of both types, one of which has a higher proportion of lower-risk individuals. If the same therapeutic threshold is applied to both groups, the false positive rates (FPR) and false negative rates (FNR) will not be equal, even though treatment decisions are optimal for each patient in both populations. (C) Under miscalibration, implied thresholds differ from therapeutic thresholds. If risk scores are miscalibrated, taking action at the threshold of 7.5% corresponds to different observed outcome rates in the two groups. For Group I, a risk score of 7.5% corresponds to an observed outcome incidence of 10%, while for Group II it corresponds to 6%; individuals in Group II would therefore be treated at a lower risk than individuals in Group I.
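The scenario in panel (B) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 5% and 50% risk levels and the 7.5% threshold come from the caption, while the group compositions (50% and 80% lower-risk individuals) and the function name `error_rates` are assumptions chosen for the example.

```python
def error_rates(p_low, low_risk=0.05, high_risk=0.50, threshold=0.075):
    """Expected FPR and FNR when everyone with risk above `threshold` is treated.

    p_low is the share of the group made up of lower-risk individuals
    (an assumed value; the caption does not give group compositions).
    """
    tp = fp = fn = tn = 0.0
    for share, risk in [(p_low, low_risk), (1.0 - p_low, high_risk)]:
        if risk > threshold:          # treated
            tp += share * risk        # treated and develops disease
            fp += share * (1 - risk)  # treated but stays disease-free
        else:                         # not treated
            fn += share * risk        # untreated and develops disease
            tn += share * (1 - risk)  # untreated and stays disease-free
    return fp / (fp + tn), fn / (fn + tp)


# Two hypothetical groups that differ only in their mix of risk types.
fpr_a, fnr_a = error_rates(p_low=0.5)
fpr_b, fnr_b = error_rates(p_low=0.8)
print(f"Group A: FPR={fpr_a:.3f}, FNR={fnr_a:.3f}")
print(f"Group B: FPR={fpr_b:.3f}, FNR={fnr_b:.3f}")
```

Every individual receives the same decision at the same risk in both groups, yet the error rates differ because they depend on how many individuals of each type a group contains.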
Figure 2. Visual abstract. Data from the six data sets considered, ARIC (Atherosclerosis Risk in Communities Study), CARDIA (Coronary Artery Risk Development in Young Adults Study), CHS (Cardiovascular Health Study), FHS OS (Framingham Heart Study Offspring Cohort), MESA (Multi-Ethnic Study of Atherosclerosis) and JHS (Jackson Heart Study), are extracted using the cohort definition used in the original pooled cohort equations (PCEs), and divided into train (80%), validation (10%) and test (10%) sets. Equalised odds and unconstrained (UC) models are derived directly from the training set. The recalibrated model is derived from the UC model using a recalibration procedure, which uses the validation data set (not seen during training). Finally, predictions on the test set are generated for all models, including the PCEs and the revised PCEs (rPCE) derived in past work, and evaluated.
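The 80%/10%/10% partition described in this caption can be sketched as follows. This is a minimal illustration, assuming a simple seeded random shuffle; the source does not say whether the split was stratified by group or data set, and `split_indices` is a hypothetical helper name.

```python
import random


def split_indices(n, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle row indices and cut them into train, validation and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


# 25 619 is the total cohort size reported in the cohort characteristics table.
train, val, test = split_indices(25619)
print(len(train), len(val), len(test))
```

Keeping the validation indices separate from the training indices is what allows the recalibration step to use data the unconstrained model has not seen.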
Figure 3. Model performance across evaluation metrics, stratified by demographic group, evaluated on the test set. The left panel shows AUROC and absolute calibration error. The right panel shows false negative rates, false positive rates and threshold calibration error at two therapeutic thresholds (7.5% and 20%). EO, equalised odds; PCEs, original pooled cohort equations; rPCE, revised PCEs; rUC, recalibrated model; UC, unconstrained model.
Figure 4. Relationship between intergroup variability in threshold calibration error (TCE) and error rates. The figure shows the relationship between the intergroup SD (IGSD) of TCE (on the x-axis) and the IGSD of false negative rate (FNR, circles) and false positive rate (FPR, crosses) across the models. In the models we trained, the IGSD of TCE scales inversely with the IGSD of FNR and FPR. EO1–4, equalised odds models with increasing values of λ (EO3 corresponds to the EO model discussed in the Results section); PCE, original pooled cohort equations; rPCE, revised PCEs; rUC, recalibrated model; UC, unconstrained model.
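As a rough illustration of the IGSD summary used in this figure, the sketch below computes the standard deviation of one metric across demographic groups. The metric values are invented for the example, the helper name `igsd` is hypothetical, and the use of the population SD is an assumption, since the text here does not say whether the population or sample form was used.

```python
from statistics import pstdev


def igsd(metric_by_group):
    """Intergroup SD: spread of a single metric across demographic groups.

    Uses the population standard deviation (an assumption, see lead-in).
    """
    return pstdev(metric_by_group.values())


# Hypothetical false negative rates for one model, by group (not from the paper).
fnr_by_group = {
    "Black women": 0.30,
    "Black men": 0.20,
    "White women": 0.40,
    "White men": 0.10,
}
print(f"IGSD of FNR: {igsd(fnr_by_group):.3f}")
```

An IGSD of zero means the metric is identical in every group, so a model that equalises error rates drives the IGSD of FNR and FPR toward zero.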
Cohort characteristics for patients who met inclusion criteria
| Study | N | Age | ASCVD event incidence* | % censored | N | Age | ASCVD event incidence* | % censored |
| | Black women | | | | Black men | | | |
| ARIC | 1812 | 53.2 | 5.70% | 6.51 | 1216 | 53.8 | 9.61% | 10.03 |
| CARDIA | 232 | 43.0 | 4.42% | 8.19 | 153 | 42.7 | 1.63% | 14.38 |
| CHS | 304 | 70.7 | 22.52% | 15.46 | 181 | 70.5 | 30.89% | 27.62 |
| JHS | 1310 | 51.4 | 2.77% | 14.96 | 751 | 51.1 | 4.47% | 14.11 |
| MESA | 768 | 60.3 | 5.18% | 9.64 | 630 | 60.9 | 7.19% | 13.17 |
| All | 4426 | 54.6 | 5.69% | 10.26 | 2931 | 55.1 | 8.15% | 13.07 |
| | White women | | | | White men | | | |
| ARIC | 4815 | 53.9 | 2.54% | 3.30 | 4383 | 54.5 | 7.17% | 4.86 |
| CARDIA | 289 | 42.7 | 0.39% | 6.23 | 333 | 42.5 | 0.90% | 6.91 |
| CHS | 1848 | 70.7 | 20.18% | 15.58 | 1169 | 71.0 | 32.00% | 17.45 |
| FHS OS | 828 | 46.4 | 2.61% | 1.81 | 856 | 47.1 | 8.67% | 3.86 |
| MESA | 1913 | 60.5 | 3.81% | 7.68 | 1828 | 60.8 | 6.67% | 10.07 |
| All | 9693 | 57.4 | 5.95% | 6.47 | 8569 | 56.9 | 10.36% | 7.67 |
| All | 25 619 | 56.5 | 7.54% | 8.28 | | | | |
Data are grouped by sex and race, as well as data set. Each group of patients is described by four values: total number of individuals, mean age, censoring-adjusted incidence of ASCVD events within 10 years of the initial examination and fraction of censored individuals.
*ASCVD event incidence was calculated by replacing the counts of uncensored individuals with positive and negative outcomes with the sums of their inverse probability of censoring weights.
ARIC, Atherosclerosis Risk in Communities Study; ASCVD, atherosclerotic cardiovascular disease; AUROC, area under the receiver operating characteristic curve; CARDIA, Coronary Artery Risk Development in Young Adults Study; CHS, Cardiovascular Health Study; FHS OS, Framingham Heart Study Offspring Cohort; FNR, false negative rate; FPR, false positive rate; IPCW, inverse probability of censoring weighting; JHS, Jackson Heart Study; MESA, Multi-Ethnic Study of Atherosclerosis; PCE, Pooled Cohort Equations.
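The censoring-adjusted incidence described in the table footnote can be sketched as a weighted proportion. This is a minimal illustration under stated assumptions: the weights are taken as given (how they were estimated is not described here), censored individuals are assumed to have been excluded already, and `ipcw_incidence` is a hypothetical function name.

```python
def ipcw_incidence(outcomes, weights):
    """Censoring-adjusted event incidence among uncensored individuals.

    outcomes: 1 if an ASCVD event occurred within 10 years, else 0.
    weights:  inverse probability of censoring weight for each individual
              (assumed to be precomputed).
    """
    positive = sum(w for y, w in zip(outcomes, weights) if y == 1)
    negative = sum(w for y, w in zip(outcomes, weights) if y == 0)
    return positive / (positive + negative)


# Toy data: four uncensored individuals with made-up weights.
outcomes = [1, 0, 0, 1]
weights = [1.0, 1.0, 2.0, 2.0]
print(ipcw_incidence(outcomes, weights))
```

With all weights equal to 1, the function reduces to the plain proportion of individuals with an event.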