Alison K Spencer, Jigar Bandaria, Michelle B Leavy, Benjamin Gliklich, Zhaohui Su, Gary Curhan, Costas Boussios.
Abstract
OBJECTIVE: Disease activity measures, such as the Clinical Disease Activity Index (CDAI), are important tools for informing treatment decisions and monitoring patient outcomes in rheumatoid arthritis (RA). Yet, documentation of CDAI scores in electronic medical records and other real-world data sources is inconsistent, making it challenging to use these data for research. The purpose of this study was to validate a machine learning model to estimate CDAI scores for patients with RA using clinical notes.
Keywords: arthritis; health care; health services research; outcome assessment; rheumatoid
Year: 2021 PMID: 34819386 PMCID: PMC8614150 DOI: 10.1136/rmdopen-2021-001781
Source DB: PubMed Journal: RMD Open ISSN: 2056-5933
Demographic and clinical characteristics of training and validation cohorts
| Characteristic | Category / statistic | Training cohort | Validation cohort (n=11 839) |
| Age, years | Mean (SD) | 62.5 (13.4) | 62.5 (13.4) |
| Sex | Female | 78.5% | 78.7% |
| | Male | 21.5% | 21.3% |
| Race | White | 67.0% | 67.2% |
| | Black | 8.7% | 9.0% |
| | Other | 22.4% | 22.4% |
| | Unknown | 1.9% | 1.5% |
| Duration of follow-up, years | Mean (SD) | 5.5 (1) | 5.5 (0.9) |
| Body mass index (BMI)* | Mean (SD) | 29.8 (7.1) | 29.9 (7.2) |
| Hypertension | n (%) | 720 (3.4%) | 235 (2.0%) |
| Type 2 diabetes | n (%) | 751 (3.6%) | 372 (3.1%) |
| Cardiovascular disease | n (%) | 202 (1.0%) | 83 (0.7%) |
*The BMI recorded on the encounter date was used. If no BMI was recorded on the encounter date, the closest BMI recorded before the encounter date was used.
Figure 1. The AUC was calculated using a binarised version of the outcome in which the negative class is defined as those notes with CDAI scores less than or equal to 10.0 (the threshold at which CDAI scores are considered to transition from ‘low’ to ‘moderate’ disease activity), and the positive class is defined as those notes with scores greater than or equal to 10.1. For this model, the AUC was 0.88. AUC, area under the receiver operating characteristic curve; CDAI, Clinical Disease Activity Index; eCDAI, estimated CDAI.
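The binarisation described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `binarise` and `roc_auc` and the toy scores are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than any particular library.

```python
def binarise(cdai_scores, threshold=10.0):
    """Label a note 1 (positive) if its recorded CDAI is above the threshold.

    CDAI is recorded to one decimal place, so "> 10.0" matches the
    caption's ">= 10.1" positive class and "<= 10.0" negative class.
    """
    return [1 if score > threshold else 0 for score in cdai_scores]


def roc_auc(labels, predictions):
    """AUC as the probability that a randomly chosen positive note has a
    higher predicted score than a randomly chosen negative note
    (ties count as half a win)."""
    pos = [p for y, p in zip(labels, predictions) if y == 1]
    neg = [p for y, p in zip(labels, predictions) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative note")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
recorded = [2.0, 8.5, 10.0, 10.1, 15.0, 30.0]   # clinician-recorded CDAI
estimated = [3.1, 12.0, 9.0, 11.5, 14.2, 25.0]  # model output (eCDAI)

labels = binarise(recorded)
auc = roc_auc(labels, estimated)
print(labels, round(auc, 3))
```

Only the recorded scores are binarised here; the continuous eCDAI values serve as the ranking score, which is one common way to obtain an ROC curve from a regression-style model.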
Figure 2. Distribution of categories and confusion matrix of estimated and clinician-recorded CDAI scores in validation cohort. The figure presents the distribution of eCDAI scores and the distribution of clinician-recorded CDAI scores categorised into remission (0–2.8), mild (2.9–10.0), moderate (10.1–22.0) and high disease activity (22.1–76.0) (left). The confusion matrix (right) shows the model’s performance at estimating CDAI scores in each of these four categories. CDAI, Clinical Disease Activity Index; eCDAI, estimated CDAI.
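As a rough illustration of how such a confusion matrix could be assembled, the sketch below bins scores using the four category ranges given in the caption and counts recorded versus estimated categories. The helper names `categorise` and `confusion_matrix` and the toy scores are assumptions made for this example, not the study's implementation; scores are assumed to already lie within 0–76, and an estimated score falling between two stated ranges (for example 2.85) is assigned to the higher category.

```python
from collections import Counter

# Category order used for both rows (clinician-recorded) and columns (estimated).
CATEGORIES = ["remission", "mild", "moderate", "high"]


def categorise(score):
    """Map a CDAI score to the disease activity category from the caption."""
    if score <= 2.8:
        return "remission"
    if score <= 10.0:
        return "mild"
    if score <= 22.0:
        return "moderate"
    return "high"


def confusion_matrix(recorded, estimated):
    """Return a 4x4 list of counts: rows are recorded categories,
    columns are estimated categories."""
    counts = Counter(
        (categorise(r), categorise(e)) for r, e in zip(recorded, estimated)
    )
    return [[counts[(row, col)] for col in CATEGORIES] for row in CATEGORIES]


# Toy data, invented for illustration only (not from the study).
recorded = [1.0, 5.0, 9.5, 15.0, 22.1, 40.0]
estimated = [2.0, 11.0, 8.0, 14.0, 20.0, 35.0]

matrix = confusion_matrix(recorded, estimated)
for name, row in zip(CATEGORIES, matrix):
    print(f"{name:>9}: {row}")
```

Counts on the diagonal are notes where the estimated and recorded categories agree; off-diagonal counts show which neighbouring categories the estimates drift into.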