Xiaolan Mo, Xiujuan Chen, Hongwei Li, Jiali Li, Fangling Zeng, Yilu Chen, Fan He, Song Zhang, Huixian Li, Liyan Pan, Ping Zeng, Ying Xie, Huiyi Li, Min Huang, Yanling He, Huiying Liang, Huasong Zeng.
Abstract
Background and Aims: Accurately predicting the response to methotrexate (MTX) in juvenile idiopathic arthritis (JIA) patients before administration is key to improving treatment outcomes. However, no simple and reliable prediction model has been identified. Here, we aimed to develop and validate machine learning models that predict the MTX response in JIA from electronic medical record (EMR) data collected before and after MTX administration.
Keywords: clinical response; juvenile idiopathic arthritis; machine learning; methotrexate; prediction model
Year: 2019 PMID: 31649533 PMCID: PMC6791251 DOI: 10.3389/fphar.2019.01155
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
The full name and abbreviation name of variables.
| Full name of variables | Abbreviation name | Full name of variables | Abbreviation name |
|---|---|---|---|
| Age of methotrexate start | Age of MTX start | Hemoglobin | HGB |
| Age onset | Age onset | Indirect bilirubin | IBIL |
| Albumin | ALB | Immunoglobulin A | IgA |
| Alanine transaminase | ALT | Immunoglobulin E | IgE |
| Anti-cyclic citrullinated peptide | Anti-CCP | Immunoglobulin G | IgG |
| Activated partial thromboplastin time | APTT | Immunoglobulin M | IgM |
| Aspartate aminotransferase | AST | JIA subtype | JIA subtype |
| Complement 3 | C3 | Lymphocyte | LYM |
| Complement 4 | C4 | Neutrophil | NEUT |
| CD16+CD56+ | CD16+CD56+ | Platelet | PLT |
| CD19+ | CD19+ | Prothrombin time | PT |
| CD3+Abs | CD3+Abs | Red blood cell | RBC |
| CD3+CD4+ | CD3+CD4+ | Rheumatoid factor-IgG | RF-IgG |
| CD3+CD8+ | CD3+CD8+ | Serum creatinine | SCr |
| C-reactive protein | CRP | Swollen joint count | SJC |
| Direct bilirubin | DBIL | Total bilirubin | TBIL |
| First MTX dose at treatment start | Dose0 | Helper T cells/Suppressor T cells | Th/Ts |
| Erythrocyte sedimentation rate | ESR | Time interval | Time interval |
| Ferritin | FER | Tender joint count | TJC |
| Fibrinogen | FIB | Thrombin time | TT |
| Gender | Gender | Urea | Urea |
| Blood glucose | GLU | White blood cell | WBC |
| Hematocrit | HCT | Weight | Weight |
| Ritchie articular index | RAI | C-reactive protein near 3 months after administration | CRP/3m |
CD, cluster of differentiation; CD3+Abs, the absolute count of CD3+ T cells; CD3+CD4+, the ratio of CD4+ cells to CD3+ cells.
Variables with the suffix “/3m” are those collected within 3 months after administration of MTX. For example, CRP/3m refers to the CRP variable collected within 3 months after administration.
Figure 1. The variables used for modeling and their importance ranking (ordered by median importance rank). The left panel shows the variables used in the pre-administration model. The right panel shows the variables used in the mixed model, which combines pre-administration variables with variables collected within 3 months after starting MTX. The shorter the horizontal bar (i.e., the smaller the value), the higher the variable's median importance rank (see the top left part of the figure).
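The ordering by median rank described in this caption can be sketched as follows. This is only an illustration: it assumes the per-variable ranks come from several learners (the model names are borrowed from the abbreviations in Figure 2), and the rank values and the helper `rank_by_median` are hypothetical, not taken from the study.

```python
from statistics import median


def rank_by_median(ranks_per_model):
    """Order variables by the median of their importance ranks (1 = most important)."""
    variables = next(iter(ranks_per_model.values())).keys()
    medians = {
        var: median(model_ranks[var] for model_ranks in ranks_per_model.values())
        for var in variables
    }
    # Smaller median rank means higher importance; ties are broken by name.
    return sorted(medians.items(), key=lambda item: (item[1], item[0]))


# Hypothetical importance ranks, for illustration only (not study data).
ranks = {
    "ET": {"CRP": 1, "FIB": 2, "APTT": 3, "PT": 4},
    "GBDT": {"CRP": 2, "FIB": 1, "APTT": 4, "PT": 3},
    "RF": {"CRP": 1, "FIB": 3, "APTT": 2, "PT": 4},
    "XGBoost": {"CRP": 1, "FIB": 2, "APTT": 4, "PT": 3},
}

ordered = rank_by_median(ranks)
for name, med in ordered:
    print(f"{name}: median rank {med}")
```

Using the median rather than the mean keeps a single learner's outlying rank from dominating a variable's position.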
Figure 2. Flowchart of model development and validation. ET, extremely randomized trees; GBDT, gradient boosting decision tree; RF, random forest; XGBoost, extreme gradient boosting; SVM, support vector machine; LR, logistic regression.
Baseline patient characteristics.
| Characteristics | Data (n = 362) |
|---|---|
| Gender, n (male/female) | 211/151 |
| Age of MTX start, years, (mean ± SD) | 6.7 ± 3.4 |
| Age of disease onset, years, (mean ± SD) | 6.3 ± 3.4 |
| Time interval*, months, (mean ± SD) | 5.6 ± 2.7 |
| Polyarticular JIA, n | 101 |
| Oligoarticular JIA, n | 186 |
| Other types of JIA, n | 75 |
| Tender joint count, median (range) | 3 (0–36) |
| Swollen joint count, median (range) | 4 (0–36) |
| ESR, mm/h, (mean ± SD) | 36.22 ± 33.34 |
| CRP, mg/L, (mean ± SD) | 24.81 ± 33.32 |
| RF-IgG, U/ml, (mean ± SD) | 23.68 ± 52.55 |
| MTX dose at start, mg/m²/wk, median (range) | 5.0 (0.5–18.0) |
*Time interval, the time from disease onset to initiation of MTX treatment.
Figure 3. Forward feature selection results for the pre-administration model (panel A) and for the mixed model with variables collected before and after administration (panel B). Starting from the most prominent feature, we examined predictive performance as features were added in order of rank, and identified the point at which adding the next-highest-ranked feature yielded no considerable gain in accuracy, sensitivity, or area under the curve (AUC). The feature set at which these three measures reached their optimum was taken as the most discriminative.
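The stopping rule in this caption can be sketched as below. This is one possible reading, not the authors' code: the function `forward_select`, the `min_gain` threshold, and the toy scores are assumptions for illustration. In practice `score_fn` would train and evaluate a model on the candidate feature set.

```python
def forward_select(ranked_features, score_fn, min_gain=0.01):
    """Add features in rank order; stop when no metric improves by at least min_gain.

    score_fn takes a list of feature names and returns a dict of metrics
    (here: accuracy, sensitivity, auc). min_gain is an assumed threshold.
    """
    selected = [ranked_features[0]]
    best = score_fn(selected)
    for feature in ranked_features[1:]:
        candidate = selected + [feature]
        scores = score_fn(candidate)
        gains = [scores[metric] - best[metric] for metric in best]
        if max(gains) < min_gain:
            break  # no considerable gain from the next-ranked feature
        selected, best = candidate, scores
    return selected, best


# Toy scores keyed by the number of features (hypothetical, not study results).
TOY_SCORES = {
    1: {"accuracy": 0.70, "sensitivity": 0.68, "auc": 0.75},
    2: {"accuracy": 0.80, "sensitivity": 0.79, "auc": 0.85},
    3: {"accuracy": 0.86, "sensitivity": 0.88, "auc": 0.92},
    4: {"accuracy": 0.86, "sensitivity": 0.88, "auc": 0.92},
}


def toy_score(features):
    return TOY_SCORES[len(features)]


selected, best = forward_select(["CRP", "FIB", "APTT", "PT"], toy_score)
print(selected, best)
```

With these toy scores the fourth feature adds nothing, so selection stops at three features.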
The classification performance results of the models.
| Data set | Model | Sensitivity (%) | Specificity (%) | Accuracy (%) | PPV (%) | NPV (%) | AUC |
|---|---|---|---|---|---|---|---|
| MTX-A | XGBoost | 90.70 | 93.33 | 91.78 | 95.12 | 87.50 | 0.97 |
| MTX-A | RF | 90.70 | 80.00 | 86.30 | 86.67 | 85.71 | 0.95 |
| MTX-A | SVM | 79.07 | 83.33 | 80.82 | 87.18 | 73.53 | 0.87 |
| MTX-A | LR | 65.12 | 73.33 | 68.49 | 77.78 | 59.46 | 0.80 |
| MTX-B | XGBoost | 95.35 | 93.33 | 94.52 | 95.35 | 93.33 | 0.99 |
| MTX-B | RF | 95.35 | 93.33 | 94.52 | 95.35 | 93.33 | 0.98 |
| MTX-B | SVM | 88.37 | 80.00 | 84.93 | 86.36 | 82.76 | 0.81 |
| MTX-B | LR | 88.37 | 76.67 | 83.56 | 84.44 | 82.14 | 0.83 |
MTX-A, prediction models using pre-administration variables; MTX-B, prediction models using mixed variables collected before administration and within 3 months after MTX administration; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve; XGBoost, extreme gradient boosting; RF, random forest; SVM, support vector machine; LR, logistic regression.
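The sensitivity, specificity, accuracy, PPV, and NPV columns all follow from a 2×2 confusion matrix. The sketch below recomputes the MTX-A XGBoost row. The cell counts are an inference, not values stated in the source: they assume good response (GR) is the positive class and a test set of 43 GR and 30 NR patients, which is the split that reproduces the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return the table's percentage metrics from confusion-matrix counts."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Inferred counts for MTX-A XGBoost (assumption: GR is the positive class).
metrics = classification_metrics(tp=39, fn=4, tn=28, fp=2)
rounded = {name: round(value, 2) for name, value in metrics.items()}
print(rounded)
```

AUC cannot be recovered this way, because it depends on the predicted probabilities rather than on the thresholded counts.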
Figure 4. Confusion matrices for each model on the test set, for the MTX-A and MTX-B predictors. For example, when the true label is NR (non-response) and the predicted label is NR, the cell gives the number of NR cases correctly predicted. When the true label is NR and the predicted label is GR (good response), the cell gives the number of NR cases incorrectly predicted as GR. As the figure shows, the predictions of model A (XGBoost) and model E (XGBoost) are the closest to the true labels, indicating the best prediction performance. The numbers in the pink cells are the numbers of correctly predicted cases.
Application of the XGBoost predictors to clinical patients to predict their MTX response.
| Predictor | Patient name | CRP | FIB | APTT | CD3+Abs (cells/ul) | PT | TT | TJC | RF-IgG | DBIL | IBIL | Output |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MTX-A | AAA | 35.30 | 2.77 | 34.20 | 2,454.46 | 12.00 | 10.90 | 1 | 21.50 | 1.30 | 4.80 | Non- |
| MTX-A | BBB | 150.00 | 4.71 | 26.22 | 3,309.00 | 14.40 | 10.90 | 2 | 5.90 | 2.60 | 2.70 | Good |

| Predictor | Patient name | CRP/3m | CD3+CD4+/3m(%) | CD3+CD8+/3m | RF-IgG/3m(U/ml) | TBIL/3m | FIB(g/L) | Output |
|---|---|---|---|---|---|---|---|---|
| MTX-B | AAA | 49.60 | 38.00 | 50.00 | 11.70 | 6.17 | 2.77 | Non- |
| MTX-B | BBB | 32.30 | 34.44 | 26.77 | 3.90 | 7.60 | 4.71 | Good |
In this table, the variables were obtained from clinical measurements. In clinical practice, patient AAA responded to MTX, whereas patient BBB did not. We entered their clinical variables into the two predictors, and both models produced correct predictions. Output is the predicted MTX response.