Aria Zand1,2,3, Zack Stokes4,5, Arjun Sharma4, Welmoed K van Deen6, Daniel Hommes4,7.
Abstract
BACKGROUND: Inflammatory bowel disease (IBD), with its complexity and heterogeneity, could benefit from the increased application of artificial intelligence in clinical management. AIM: To accurately predict adverse outcomes in patients with IBD using advanced computational models in a nationally representative dataset, for potential use in clinical practice.
Keywords: Artificial intelligence; Big data; Inflammatory bowel diseases; Machine learning; Precision medicine
Year: 2022 PMID: 35476181 PMCID: PMC9515047 DOI: 10.1007/s10620-022-07506-8
Source DB: PubMed Journal: Dig Dis Sci ISSN: 0163-2116 Impact factor: 3.487
Fig. 1 Context of the different models. AI is the broad umbrella term for techniques that enable machines to mimic human behavior. When talking about predictive models, we usually refer to machine learning, a subset of AI that uses statistical methods to improve the accuracy of its output with experience. Deep learning is a subset of machine learning that makes the computation of multi-layer neural networks feasible, improving accuracy even further
Introduction and description of different models
| Model | Explanation | Method | Advantages | Disadvantages |
|---|---|---|---|---|
| Ridge Logistic | This method creates a model that is not perfectly fit, or overfit, to the data in a given training set. In doing so, it reduces variance and makes the model a better predictor of data points outside of the training set | Regression | Can reduce overfitting Shrinks effects towards 0 Fast/easy to implement | Simplistic representation may be far from reality Assumptions may be difficult to justify with many predictors |
| LASSO Logistic | This method attempts to do the same thing as Ridge Regression but uses slightly different mathematical formulas that make it better in certain situations | Regression | Can reduce overfitting Performs variable selection Fast/easy to implement | Simplistic representation may be far from reality Variable selection is not robust to multicollinearity |
| Support Vector Machine | Attempts to find the largest separation between two groups. Sometimes the space of observations has to be transformed to find a clear separation | Machine learning | Works well with many predictors Makes prediction easy by clearly segmenting population | Lack of a clear separation can lead to poor performance Requires long training times for big data |
| Random Forest | Random forest is a collection of decision trees trained on different subsets of the data. Each decision tree decides the best places to cut so that observations from the same class fall on the same side of the cut | Machine learning | Performs variable selection Good performance for linear and nonlinear relationships Fast/easy to implement | Difficult to interpret Prone to overfitting |
| Neural Network | Neural networks consist of layers of nested linear models (neurons) with a nonlinear transformation (activation) after each layer. The output is often the probability that a given observation is a success | Deep learning | Captures complex nonlinear relationships Fully utilizes big data | Difficult to implement Requires many small decisions that can greatly affect performance |
An explanation of the different models used in our analysis is displayed above, highlighting their advantages and disadvantages
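As an illustration of the shrinkage idea behind the Ridge and LASSO rows of the table, here is a minimal sketch of closed-form ridge regression in NumPy (linear rather than logistic, on synthetic data; all names and values are illustrative and not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^{-1} X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

b_ols = ridge(X, y, 0.0)     # no penalty: ordinary least squares
b_ridge = ridge(X, y, 50.0)  # heavy penalty: coefficients shrink toward 0
print(np.linalg.norm(b_ridge) < np.linalg.norm(b_ols))  # True
```

The larger the penalty, the more the coefficients are pulled toward zero, trading a little bias for reduced variance on new data; LASSO uses an absolute-value penalty instead, which can shrink some coefficients exactly to zero and thereby performs variable selection.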
Baseline demographics and variables of training and validation cohorts in the baseline year
| Variable | Training Set Baseline (2015) | Validation Set Baseline (2016) |
|---|---|---|
| Age, mean (SD) | 48.5 years (16.8) | 47.9 years (16.5) |
| Female Gender, n (%) | 38,254 (53%) | 35,966 (52%) |
| Race, n (%) | | |
| White | 47,710 (66.1%) | 44,473 (64.3%) |
| Unknown | 12,776 (17.7%) | 12,381 (17.9%) |
| Black | 5052 (7%) | 5672 (8.2%) |
| Hispanic | 4692 (6.5%) | 4219 (6.1%) |
| Asian | 1949 (2.7%) | 2490 (3.6%) |
| Hospitalizations and ER visits in baseline year, n (%) | | |
| Any ER Visit (#103) | 10,827 (15%) | 11,066 (16%) |
| Any Hospitalization (#97) | 4331 (6%) | 4150 (6%) |
| Any IBD-related Hospitalization (#100) | 3609 (5%) | 3458 (5%) |
| Any IBD-related ER Visit (#105) | 2887 (4%) | 2767 (4%) |
| Any IBD-related surgery (#64) | 2165 (3%) | 2075 (3%) |
| Medication use during baseline year, n (%) | | |
| Any IBD Medication use (#1) | 28,149 (39%) | 15,908 (23%) |
| Any Aminosalicylate use (#2&6) | 12,270 (17%) | 11,758 (17%) |
| Any Antibiotic use (#8) | 7218 (10%) | 6917 (10%) |
| Any Corticosteroid use (#11,14,17) | 18,766 (26%) | 18,675 (27%) |
| Any Immunomodulator use (#21, 24, 27) | 5774 (8%) | 5533 (8%) |
| Any Biologics use (#42) | 8661 (12%) | 8991 (13%) |
# Refers to the corresponding feature in Supplementary Table 1
Performance of the different models for the four main outcomes
| | Sensitivity | Specificity | AUC | Brier Score* |
|---|---|---|---|---|
| IBD-related Hospitalizations | ||||
| Ridge Logistic | 72% | 56% | 0.65 | 0.95 |
| LASSO Logistic | 65% | 66% | 0.71 | 0.17 |
| Support Vector Machine | 54% | 48% | 0.53 | 0.04 |
| Random Forest | 66% | 67% | 0.73 | 0.21 |
| Neural Network | 57% | 58% | 0.61 | 0.04 |
| Initiation of Biologics | ||||
| Ridge Logistic | 70% | 97% | 0.82 | 0.07 |
| LASSO Logistic | 83% | 96% | 0.94 | 0.05 |
| Support Vector Machine | 75% | 89% | 0.86 | 0.10 |
| Random Forest | 82% | 92% | 0.92 | 0.10 |
| Neural Network | 81% | 93% | 0.90 | 0.05 |
| Long-term Steroid Use | ||||
| Ridge Logistic | 99% | 4% | 0.51 | 0.83 |
| LASSO Logistic | 52% | 74% | 0.70 | 0.83 |
| Support Vector Machine | 50% | 74% | 0.72 | 0.13 |
| Random Forest | 48% | 86% | 0.81 | 0.15 |
| Neural Network | 50% | 74% | 0.72 | 0.16 |
| IBD-related surgery | ||||
| Ridge Logistic | 72% | 55% | 0.64 | 0.97 |
| LASSO Logistic | 64% | 67% | 0.71 | 0.22 |
| Support Vector Machine | 54% | 55% | 0.57 | 0.03 |
| Random Forest | 69% | 63% | 0.71 | 0.21 |
| Neural Network | 50% | 63% | 0.58 | 0.03 |
*The Brier score measures the correctness of a model's predictions by averaging the squared differences between the predicted probability that an observation belongs to a class and its actual class label. A low Brier score indicates that the model, on average, confidently places observations into the correct class
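The footnote above can be made concrete in a few lines of Python; a minimal sketch with hypothetical predictions (not data from the study):

```python
def brier_score(probs, outcomes):
    # Mean squared difference between predicted probability and observed label
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

probs = [0.9, 0.2, 0.8, 0.1]   # predicted probability of the event
outcomes = [1, 0, 1, 0]        # observed class labels (1 = event occurred)
print(brier_score(probs, outcomes))  # ≈ 0.025
```

A perfectly calibrated, perfectly confident model scores 0; always predicting 0.5 scores 0.25.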
Fig. 2 Overview of the performance of the different models for the four main outcomes
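The sensitivity and specificity columns reported above come from a thresholded confusion matrix; a minimal sketch with hypothetical predictions and labels (not data from the study):

```python
def sens_spec(preds, labels):
    # preds and labels are 0/1 lists
    # sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

preds  = [1, 1, 0, 0, 1, 0]  # model predictions at a chosen threshold
labels = [1, 0, 1, 0, 1, 0]  # observed outcomes
print(sens_spec(preds, labels))
```

Unlike AUC, which summarizes performance across all thresholds, these two metrics depend on the single probability cutoff chosen to call a prediction positive.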
Feature importance of the different models
| Rank | Ridge Logistic (AUC = 0.65; Brier score = 0.95) | OR | LASSO Logistic (AUC = 0.71; Brier score = 0.17) | OR | Random Forest (AUC = 0.73; Brier score = 0.21) | Neural Network (AUC = 0.61; Brier score = 0.04) |
|---|---|---|---|---|---|---|
| 1 | #65 Number of acute IBD surgeries | 8.72 | #20 Episodes of long-term steroids | 1.96 | #44 Number of IBD claims | #102 Number of ED visits |
| 2 | #64 Any IBD surgeries | 2.74 | #88 Number of Clostridium difficile stool tests | 1.57 | #49 Number of office visits | #36 Any certolizumab used this year |
| 3 | #88 Number of Clostridium difficile stool tests | 2.24 | #65 Number of acute IBD surgeries | 1.52 | #47 Number of UC claims | #35 Episodes of infliximab |
| 4 | #20 Episodes of long-term steroids | 1.72 | #43 Number of episodes of biologics | 1.52 | #94 Total number of claims | #5 Any oral aminosalicylates used this year |
| 5 | #54 Any IBD-related GI visits | 1.61 | #84 Any MR scans this year | 1.51 | #96 Number of hospitalizations | #30 Any adalimumab used this year |
In this table, we showcase the features that were most predictive of our four main outcomes. Additionally, the features are broken down by the different statistical models used. The Support Vector Machine was excluded because of its overall poor performance
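The OR columns above are odds ratios, which for a logistic model are obtained by exponentiating the fitted coefficients. A minimal sketch (the coefficient values and feature names here are back-calculated from the reported ORs purely for illustration, not taken from the paper's fitted model):

```python
import math

# Hypothetical logistic-regression coefficients on the log-odds scale;
# exponentiating each coefficient yields the odds ratio for that feature.
coefs = {
    "num_acute_ibd_surgeries": 2.166,       # illustrative value
    "episodes_long_term_steroids": 0.542,   # illustrative value
}
odds_ratios = {name: math.exp(b) for name, b in coefs.items()}
for name, odds in odds_ratios.items():
    print(f"{name}: OR = {odds:.2f}")
```

An odds ratio above 1 means the feature is associated with increased odds of the outcome (here, IBD-related hospitalization), which is why prior acute IBD surgeries, with the largest OR, tops the Ridge column.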