Marianne A Messelink1, Nadia M T Roodenrijs2, Paco M J Welsing2, Saskia Haitjema3, Bram van Es3,4, Cornelia A R Hulsbergen-Veelken3, Sebastiaan Jong4, L Malin Overmars3,4, Leon C Reteig3,4, Sander C Tan4,5, Tjebbe Tauber4, Jacob M van Laar2.
Abstract
BACKGROUND: The new concept of difficult-to-treat rheumatoid arthritis (D2T RA) refers to RA patients who remain symptomatic after several lines of treatment, resulting in a high patient and economic burden. During a hackathon, we aimed to identify and predict D2T RA patients in structured and unstructured routine care data.
Keywords: Applied data analytics in medicine; Difficult-to-treat rheumatoid arthritis; Machine learning; Routine care data
Year: 2021 PMID: 34238346 PMCID: PMC8265126 DOI: 10.1186/s13075-021-02560-5
Source DB: PubMed Journal: Arthritis Res Ther ISSN: 1478-6354 Impact factor: 5.156
Classification of D2T and non-D2T patients in structured routine care data
| Classification in structured data (rows) / validation (columns) | Clinically classified D2T RA* | Clinically classified non-D2T RA* | Newly classified patients in the UPOD | Total |
|---|---|---|---|---|
| D2T RA | 25 | 2 | 43 | 70 |
| Non-D2T RA | 27 | 98 | 1678 | 1803 |
| Total | 52 | 100 | 1721 | 1873 |
Patients were classified by applying the D2T RA definition [8] in structured routine care data from the UPOD
D2T difficult-to-treat, DAS28-ESR disease activity score based on 28-joint count and erythrocyte sedimentation rate, RA rheumatoid arthritis, UPOD Utrecht Patient Oriented Database
*Clinical classification of D2T and non-D2T RA patients as performed in the cross-sectional study [6]
Classification of D2T and non-D2T patients in unstructured routine care data
| Classification in unstructured data (rows) / validation (columns) | Clinically classified D2T RA* | Clinically classified non-D2T RA* | Newly classified patients in the UPOD | Total |
|---|---|---|---|---|
| D2T RA | 36 | 8 | 117 | 161 |
| Non-D2T RA | 16 | 92 | 1604 | 1712 |
| Total | 52 | 100 | 1721 | 1873 |
Patients were classified by applying the D2T RA definition [8] in unstructured routine care data from the UPOD
D2T difficult-to-treat, RA rheumatoid arthritis, UPOD Utrecht Patient Oriented Database
*Clinical classification of D2T and non-D2T RA patients as performed in the cross-sectional study [6]
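The validation columns of the two classification tables above can be read as 2×2 confusion matrices against the clinical classification. The following is a minimal sketch that derives sensitivity and specificity from those counts; the helper name `sens_spec` is illustrative and does not come from the source.

```python
def sens_spec(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)  # share of clinical D2T RA patients that were flagged
    specificity = tn / (tn + fp)  # share of clinical non-D2T RA patients not flagged
    return sensitivity, specificity

# Counts copied from the validation columns of the tables above.
# Structured data: 25 of 52 clinical D2T RA flagged, 2 of 100 clinical non-D2T RA flagged.
structured = sens_spec(tp=25, fn=27, fp=2, tn=98)
# Unstructured data: 36 of 52 clinical D2T RA flagged, 8 of 100 clinical non-D2T RA flagged.
unstructured = sens_spec(tp=36, fn=16, fp=8, tn=92)

for name, (sens, spec) in [("structured", structured), ("unstructured", unstructured)]:
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On these counts, the unstructured classification recovers more of the clinically classified D2T RA patients (36 vs. 25 of 52) at the cost of more false positives (8 vs. 2 of 100).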
The most important features to identify D2T RA patients based on logistic regression coefficients
| Feature | Logistic regression coefficient |
|---|---|
| Number of different medication prescriptions, based on the extracted medication in Supplemental table | 1.05 |
| Mean DAS28-ESR score over time | 0.76 |
| Median DAS28-ESR score over time | 0.70 |
| Median non-invasively measured blood pressure over time | 0.64 |
| Standard deviation of the creatinine laboratory measurements over time | 0.63 |
| Time since RA diagnosis | 0.52 |
| Median of banded neutrophils over time | 0.37 |
| Ratio of segmented neutrophils by percentage of immature granulocytes over time | 0.30 |
| Standard deviation of percentage of reticulocytes over time | 0.30 |
| Median of the delta over time of banded neutrophils over time | 0.29 |
Features are noted in order of importance. A higher value of a feature corresponds to a higher likelihood of having D2T RA
DAS28 disease activity score based on 28-joint count, ESR erythrocyte sedimentation rate, RA rheumatoid arthritis
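A logistic regression coefficient is a change in log-odds, so exponentiating it gives an odds ratio. The sketch below does this for the top three coefficients in the table above. It assumes the coefficients are per one-unit increase of each feature as scaled for the model; the source does not state the scaling here, so the resulting ratios are illustrative only.

```python
import math

# Coefficients copied from the table above (top three D2T RA features).
coefficients = {
    "number of different medication prescriptions": 1.05,
    "mean DAS28-ESR score over time": 0.76,
    "median DAS28-ESR score over time": 0.70,
}

# exp(coefficient) is the multiplicative change in the odds of D2T RA
# per one-unit increase of the feature (on the scale used by the model,
# which is an assumption here).
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```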
The most important features to identify non-D2T RA patients based on logistic regression coefficients
| Feature | Logistic regression coefficient |
|---|---|
| Maximum ESR over time | 0.84 |
| Standard deviation of ESR values over time | 0.78 |
| Mean minus median of intermediate angle scatter of platelets over time | 0.63 |
| White blood cell count divided by lymphocyte concentration over time | 0.62 |
| Median length | 0.58 |
| Minimum potassium value over time | 0.56 |
| Female sex | 0.56 |
| Median neutrophils over time | 0.46 |
| Median percentage of reticulocytes over time | 0.43 |
| Standard deviation of DAS28-ESR score over time | 0.43 |
Features are noted in order of importance. A higher value of a feature corresponds to a higher likelihood of having non-D2T RA
DAS28 disease activity score based on 28-joint count, ESR erythrocyte sedimentation rate, IAS intermediate angle scatter of platelets
Fig. 1 ROC curve of the D2T RA identification model based on a feature importance analysis. AUC-ROC for an identification model to identify D2T and non-D2T RA patients based on structured UPOD data. The model is based on the most important features derived with logistic regression techniques from the available structured data from the UPOD. D2T, difficult-to-treat; RA, rheumatoid arthritis; AUC, area under the curve; ROC, receiver operating characteristic; UPOD, Utrecht Patient Oriented Database
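The AUC-ROC reported in a figure like this can be interpreted as the probability that a randomly chosen D2T RA patient receives a higher model score than a randomly chosen non-D2T RA patient. The sketch below computes it by direct pairwise comparison. The labels and scores are hypothetical and are not the study data; `auc_roc` is an illustrative helper, not the authors' implementation.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = D2T RA, 0 = non-D2T RA) and model scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.3]
auc = auc_roc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of patients; it is used here because it makes the interpretation explicit, not because it is efficient.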
Fig. 2 Reduced dimensions of longitudinal hematological data. A Medians of the reduced dimensions of the longitudinal hematological data of all 52 clinically classified D2T and 100 clinically classified non-D2T RA patients. B Medians of the reduced dimensions of the longitudinal hematological data of all 1873 RA patients in the UPOD-database, where a higher Y-score indicates a higher estimated probability of having D2T RA according to the classifications in structured and unstructured data, and the clinical classification (if available). All available hematological parameters were reduced to two dimensions (d1 and d2). For each patient, the median of these reduced dimensions over time is visualized. d, reduced dimension; D2T, difficult-to-treat; RA, rheumatoid arthritis; UPOD, Utrecht Patient Oriented Database
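The caption describes taking, for each patient, the median of the two reduced dimensions over time. A minimal sketch of that aggregation step follows. It assumes the dimensionality reduction has already produced one (d1, d2) pair per measurement; the reduction method itself is not specified in this section and is not reproduced. The patient identifiers and values are hypothetical.

```python
from collections import defaultdict
from statistics import median

def patient_medians(rows):
    """Collapse per-measurement (patient_id, d1, d2) rows to one median point per patient."""
    d1_values = defaultdict(list)
    d2_values = defaultdict(list)
    for patient_id, d1, d2 in rows:
        d1_values[patient_id].append(d1)
        d2_values[patient_id].append(d2)
    return {
        pid: (median(d1_values[pid]), median(d2_values[pid]))
        for pid in d1_values
    }

# Hypothetical reduced coordinates: three measurements for p1, two for p2.
rows = [
    ("p1", 0.1, 1.0), ("p1", 0.3, 2.0), ("p1", 0.2, 4.0),
    ("p2", -1.0, 0.5), ("p2", -3.0, 1.5),
]
medians = patient_medians(rows)
print(medians)
```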
The number of predicted D2T and non-D2T RA patients
| Prediction (rows) / validation (columns) | Clinically classified D2T RA* | Clinically classified non-D2T RA* | Total |
|---|---|---|---|
| D2T RA | 22 | 44 | 66 |
| Non-D2T RA | 6 | 44 | 50 |
| Total | 28 | 88 | 116 |
Predictions are based on data from before the start of the first b/tsDMARD
b/tsDMARD biological or targeted synthetic disease-modifying antirheumatic drug, D2T difficult-to-treat, RA rheumatoid arthritis
*Clinical classification of D2T and non-D2T RA patients as performed in the cross-sectional study [6]
A decision threshold of 0.15 was applied
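To make the role of the decision threshold concrete, the sketch below first shows how a threshold turns predicted probabilities into class labels, then computes the usual performance measures from the counts in the prediction table above. The example probabilities are hypothetical, and treating a probability equal to the threshold as a positive prediction (`>=`) is an assumption; the source does not say how ties were handled.

```python
def classify(probabilities, threshold=0.15):
    """Predict D2T RA when the estimated probability reaches the threshold (>= is assumed)."""
    return [p >= threshold for p in probabilities]

# Hypothetical predicted probabilities for three patients.
predicted = classify([0.05, 0.15, 0.40])

# Counts copied from the prediction table above (threshold 0.15).
tp, fp = 22, 44  # predicted D2T RA: clinically D2T / clinically non-D2T
fn, tn = 6, 44   # predicted non-D2T RA: clinically D2T / clinically non-D2T

sensitivity = tp / (tp + fn)  # 22 of 28 clinical D2T RA patients
specificity = tn / (tn + fp)  # 44 of 88 clinical non-D2T RA patients
ppv = tp / (tp + fp)          # share of predicted D2T RA that is clinically D2T RA
npv = tn / (tn + fn)          # share of predicted non-D2T RA that is clinically non-D2T RA

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}, "
      f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

A threshold well below 0.5 generally trades specificity for sensitivity, which matches the pattern in these counts: most clinically classified D2T RA patients are flagged, along with half of the non-D2T RA patients.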
Fig. 3 ROC curve of the D2T RA machine learning prediction model. AUC-ROC of the D2T RA prediction model based on data from before the start of the first b/tsDMARD. AUC, area under the curve; b/tsDMARD, biological or targeted synthetic disease-modifying antirheumatic drug; csDMARD, conventional synthetic disease-modifying antirheumatic drug; D2T, difficult-to-treat; RA, rheumatoid arthritis; ROC, receiver operating characteristic; std dev, standard deviation