Linxi Meng1, Will Treem2, Graham A Heap2, Jingjing Chen3.
Abstract
Alpha-1 antitrypsin deficiency-associated liver disease (AATD-LD) is a rare genetic disorder that is not well recognized. Predicting the clinical outcomes of AATD-LD and identifying patients more likely to progress to advanced liver disease are crucial for better understanding AATD-LD progression and promoting timely medical intervention. We aimed to develop a tailored machine learning (ML) model to predict the disease progression of AATD-LD. The analysis used a stacking ensemble learning model that combined five different ML algorithms with 58 predictor variables, evaluated with repeated nested five-fold cross-validation on UK Biobank data. Model performance was assessed through prediction accuracy, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). The importance of predictor contributions was evaluated through a permutation feature importance method. The proposed stacking ensemble ML model showed clinically meaningful accuracy and appeared superior to any single ML algorithm in the ensemble. For example, the AUROC for AATD-LD was 68.1%, 75.9%, 91.2%, and 67.7% for all-cause mortality, liver-related death, liver transplant, and all-cause mortality or liver transplant, respectively. This work supports the use of ML to address unanswered clinical questions with clinically meaningful accuracy using real-world data.
Year: 2022; PMID: 36220873; PMCID: PMC9554039; DOI: 10.1038/s41598-022-21389-9
Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.996
Summary of demographic and disease characteristics in patients with any liver disease and AATD-LD.
| Variables | Category | Any liver disease (N = 11,583) | AATD-LD (N = 455) |
|---|---|---|---|
| Sex, n (%) | Female | 6097 (52.6%) | 226 (49.7%) |
| | Male | 5486 (47.4%) | 229 (50.3%) |
| | Missing | 0 | 0 |
| Race, n (%) | White | 10,840 (94.1%) | 426 (94.2%) |
| | Non-white | 674 (5.9%) | 26 (5.8%) |
| | Missing | 69 | 3 |
| Obesity, n (%) | Non-obese | 7180 (62.7%) | 321 (72.0%) |
| | Obese | 4278 (37.3%) | 125 (28.0%) |
| | Missing | 125 | 9 |
| Diabetes, n (%) | Non-diabetic | 9862 (85.9%) | 401 (88.9%) |
| | Diabetic | 1625 (14.1%) | 50 (11.1%) |
| | Missing | 96 | 4 |
| Smoking status, n (%) | Never smoking | 5090 (44.3%) | 182 (40.3%) |
| | Past smoker | 4493 (39.1%) | 194 (42.9%) |
| | Current smoker | 1911 (16.6%) | 76 (16.8%) |
| | Missing | 89 | 3 |
| Age (years) | Mean | 58.5 | 60.3 |
| | Min, max | 40, 70 | 41, 70 |
| | Missing | 0 | 0 |
| BMI (kg/m²) | Mean | 29.0 | 27.5 |
| | Min, max | 15.0, 69.0 | 16.9, 52.5 |
| | Missing | 125 | 9 |
| Weight (kg) | Mean | 82.0 | 78.0 |
| | Min, max | 35.8, 190.0 | 41.7, 151.4 |
| | Missing | 122 | 7 |
| Waist (cm) | Mean | 95.4 | 92.9 |
| | Min, max | 57, 171 | 62, 153 |
| | Missing | 84 | 4 |
Patients with any liver disease were identified by ICD code. Patients with AATD-LD are a subset of patients with any liver disease.
Summary of clinical outcomes in patients with any liver disease and AATD-LD.
| Clinical outcomes, n (%) | Any liver disease (N = 11,583) | AATD-LD (N = 455) |
|---|---|---|
| All-cause mortality | 3524 (30%) | 245 (54%) |
| Liver-related death | 1230 (10%) | 41 (9%) |
| Liver transplant | 124 (1%) | 5 (1%) |
| All-cause mortality or liver transplant | 3619 (31%) | 246 (54%) |
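As a reading aid for the accuracy values reported for these outcomes, the event counts above imply the accuracy that a trivial classifier would reach by always predicting the more common class. The sketch below uses only the AATD-LD counts from this table; the helper name `majority_baseline` is hypothetical and not taken from the paper.

```python
# Event counts for AATD-LD patients, copied from the table above (N = 455).
N_AATD_LD = 455
EVENTS = {
    "All-cause mortality": 245,
    "Liver-related death": 41,
    "Liver transplant": 5,
    "All-cause mortality or liver transplant": 246,
}


def majority_baseline(events: int, total: int) -> tuple[float, float]:
    """Return (prevalence, accuracy of always predicting the majority class)."""
    prevalence = events / total
    return prevalence, max(prevalence, 1.0 - prevalence)


for outcome, events in EVENTS.items():
    prevalence, baseline = majority_baseline(events, N_AATD_LD)
    print(f"{outcome}: prevalence={prevalence:.3f}, baseline accuracy={baseline:.3f}")
```

For liver transplant, 5 events in 455 patients give a baseline accuracy of about 0.989, so accuracy alone says little about the rare outcomes; AUROC and AUPRC are more informative there.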
Description of potential predictor variables.
| Category | Description |
|---|---|
| Demographics | Age; Age of diagnosis; Gender; Ethnicity; BMI; Weight; Waist circumference |
| Baseline disease characteristics | Other underlying conditions; Non-alcoholic steatohepatitis; Lung disease; Diabetes; Obesity |
| Lifestyle and others | Alcohol intake/status; Smoking status; Medical procedure; Major operation |
| Baseline laboratory parameters | Blood assays: Albumin, Alanine aminotransferase, Aspartate aminotransferase, Alkaline phosphatase, Gamma-glutamyl transferase (GGT), Total bilirubin, Direct bilirubin, International normalised ratio, Hemoglobin A1c, Total protein. Spirometry: Forced vital capacity (FVC), Forced expiratory volume in 1 s (FEV1), Peak expiratory flow (PEF) |
Variables in predictor blocks 2 and 3 were obtained from patient-reported questionnaires. There may be more than one predictor variable in each predictor category. 58 predictor variables were identified via feature selection prior to ML model training.
Figure 1. Feature selection strategy prior to model training.
Figure 2. Flow chart of data assembly, processing, and analysis.
Figure 3. The workflow of stacking ensemble learning.
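Stacking, in general, trains a meta-learner on out-of-fold predictions from several base learners. The following is a minimal sketch of that idea, not the authors' implementation. As assumptions for illustration only, every base learner is a small logistic regression on a different feature subset (a stand-in for the five algorithms used in the paper), the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def predict_logistic(w, row):
    """Sigmoid of the linear score; the last weight is the bias."""
    z = w[-1] + sum(wj * v for wj, v in zip(w, row))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            err = predict_logistic(w, row) - label
            for j, v in enumerate(row):
                grad[j] += err * v
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def stack_fit(X, y, feature_sets, k=5, seed=0):
    """Train base learners and a meta-learner on their out-of-fold predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    oof = [[0.0] * len(feature_sets) for _ in X]  # out-of-fold predictions
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        for b, cols in enumerate(feature_sets):
            w = fit_logistic([[X[i][c] for c in cols] for i in train],
                             [y[i] for i in train])
            for i in held_out:
                oof[i][b] = predict_logistic(w, [X[i][c] for c in cols])
    meta = fit_logistic(oof, y)  # the meta-learner never sees in-fold predictions
    # Refit each base learner on all rows for use at prediction time.
    bases = [fit_logistic([[row[c] for c in cols] for row in X], y)
             for cols in feature_sets]
    return bases, meta, oof


def stack_predict(bases, meta, feature_sets, row):
    """Feed base-learner probabilities for one row into the meta-learner."""
    level1 = [predict_logistic(w, [row[c] for c in cols])
              for w, cols in zip(bases, feature_sets)]
    return predict_logistic(meta, level1)


# Synthetic toy data: the label depends on the first two of three features.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [1 if row[0] + row[1] > 0 else 0 for row in X]
feature_sets = [[0], [1], [0, 1, 2]]
bases, meta, oof = stack_fit(X, y, feature_sets)
print(stack_predict(bases, meta, feature_sets, [1.5, 1.0, 0.0]))
```

Using out-of-fold rather than in-sample base predictions keeps the meta-learner from simply rewarding whichever base learner overfits the training rows most.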
Mean (± standard deviation) model performance measures for stacking ensemble learning in the Training Set and Test Set across the nested five-fold cross-validation with 100 repetitions in patients with AATD-LD (N = 455).
| Clinical outcomes | Accuracy (Training) | Accuracy (Test) | AUROC (Training) | AUROC (Test) | AUPRC (Training) | AUPRC (Test) |
|---|---|---|---|---|---|---|
| All-cause mortality | 0.828 ± 0.091 | 0.632 ± 0.038 | 0.899 ± 0.080 | 0.681 ± 0.035 | 0.911 ± 0.073 | 0.709 ± 0.032 |
| Liver-related death | 0.991 ± 0.011 | 0.914 ± 0.009 | 0.997 ± 0.007 | 0.759 ± 0.108 | 0.979 ± 0.043 | 0.411 ± 0.170 |
| Liver transplant | 1.000 ± 0.001 | 0.989 ± 0.000 | 1.000 ± 0.000 | 0.912 ± 0.133 | 1.000 ± 0.000 | 0.414 ± 0.416 |
| All-cause mortality or liver transplant | 0.837 ± 0.087 | 0.633 ± 0.029 | 0.903 ± 0.076 | 0.677 ± 0.025 | 0.917 ± 0.067 | 0.703 ± 0.040 |
AUROC = area under the receiver operating characteristic curve, AUPRC = area under the precision-recall curve.
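The table reports means and standard deviations over nested five-fold cross-validation with 100 repetitions. A minimal sketch of how such repeated nested splits can be generated is shown here. It is an assumption-laden illustration: the function names are hypothetical, and the splitting is plain (unstratified) shuffling, whereas the source does not state whether the folds were stratified.

```python
import random


def kfold(indices, k, rng):
    """Shuffle `indices` and yield (train, test) index lists for k folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def repeated_nested_cv(n, outer_k=5, inner_k=5, repeats=100, seed=0):
    """Yield (repeat, outer_train, outer_test, inner_splits) per outer fold.

    The inner splits partition only the outer training rows, so tuning on
    them never touches the outer test rows used for the reported metrics.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        for outer_train, outer_test in kfold(list(range(n)), outer_k, rng):
            inner_splits = list(kfold(outer_train, inner_k, rng))
            yield rep, outer_train, outer_test, inner_splits


# Example with the AATD-LD cohort size and only 2 repetitions for brevity.
splits = list(repeated_nested_cv(n=455, repeats=2))
rep, outer_train, outer_test, inner_splits = splits[0]
print(len(splits), len(outer_train), len(outer_test), len(inner_splits))
```

Each outer test fold yields one value per metric; the reported mean and standard deviation would then be taken over all outer folds and repetitions.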
Mean (± standard deviation) model performance measures in the Training Set and Test Set across the nested five-fold cross-validation with 100 repetitions in patients with any liver disease (N = 11,583).
| Clinical outcomes | Accuracy (Training) | Accuracy (Test) | AUROC (Training) | AUROC (Test) | AUPRC (Training) | AUPRC (Test) |
|---|---|---|---|---|---|---|
| All-cause mortality | 0.806 ± 0.025 | 0.756 ± 0.008 | 0.852 ± 0.034 | 0.770 ± 0.009 | 0.737 ± 0.049 | 0.629 ± 0.016 |
| Liver-related death | 0.991 ± 0.011 | 0.913 ± 0.004 | 0.999 ± 0.002 | 0.835 ± 0.009 | 0.998 ± 0.003 | 0.517 ± 0.023 |
| Liver transplant | 0.999 ± 0.001 | 0.989 ± 0.003 | 1.000 ± 0.000 | 0.859 ± 0.045 | 1.000 ± 0.000 | 0.142 ± 0.048 |
| All-cause mortality or liver transplant | 0.815 ± 0.039 | 0.755 ± 0.006 | 0.863 ± 0.046 | 0.777 ± 0.010 | 0.764 ± 0.067 | 0.636 ± 0.010 |
AUROC = area under the receiver operating characteristic curve, AUPRC = area under the precision-recall curve.
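For readers less familiar with the two ranking metrics, the sketch below computes them from labels and predicted scores on a small made-up example. It assumes the pairwise (Mann-Whitney) form of AUROC and uses average precision as the AUPRC estimator; the source does not say which AUPRC estimator was used, and the function names are hypothetical.

```python
def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve, assuming no tied scores."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each positive's rank
    return total / hits


# Made-up labels and scores, for illustration only.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.3, 0.7]
print(f"AUROC={auroc(y_true, scores):.3f}, AUPRC={average_precision(y_true, scores):.3f}")
```

A classifier that ranks at random has an expected AUPRC close to the event prevalence, so AUPRC should be read against that level: for liver transplant in patients with any liver disease, 124 events in 11,583 patients is a prevalence of roughly 0.011, against which the test AUPRC of 0.142 is compared.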
Figure 4. ROC curves for the trained classifiers in the Test Set from one Training-Set split for patients with AATD-LD (N = 455). Given the low incidence of liver transplant events in AATD-LD patients in the Test Set, there were insufficient data to populate the ROC curve for liver transplant, and a box plot was presented instead.
Figure 5. ROC curves for the trained classifiers in the Test Set from one Training-Set split for patients with any liver disease (N = 11,583).
Overall predictive model performance measures of stacking ensemble and each base model used in the ML training in patients with AATD-LD.
| Model | Performance measure | All-cause mortality (N = 455), Training | All-cause mortality (N = 455), Test | Liver-related death (N = 455), Training | Liver-related death (N = 455), Test | Liver transplant (N = 455), Training | Liver transplant (N = 455), Test | All-cause mortality or liver transplant (N = 455), Training | All-cause mortality or liver transplant (N = 455), Test |
|---|---|---|---|---|---|---|---|---|---|
| RF | Accuracy | 0.779 | 0.589 | 0.970 | 0.894 | 0.999 | 0.988 | 0.797 | 0.620 |
| | AUROC | 0.858 | 0.626 | 0.981 | 0.756 | 1.000 | 0.908 | 0.871 | 0.655 |
| | AUPRC | 0.872 | 0.657 | 0.928 | 0.344 | 0.999 | 0.335 | 0.885 | 0.682 |
| XGBOOST | Accuracy | 0.757 | 0.614 | 0.908 | 0.893 | 0.993 | 0.981 | 0.76 | 0.603 |
| | AUROC | 0.829 | 0.649 | 0.954 | 0.746 | 1.000 | 0.867 | 0.826 | 0.638 |
| | AUPRC | 0.848 | 0.675 | 0.777 | 0.366 | 0.999 | 0.273 | 0.842 | 0.668 |
| LGBM | Accuracy | 0.733 | 0.616 | 0.950 | 0.897 | 0.999 | 0.988 | 0.754 | 0.605 |
| | AUROC | 0.805 | 0.649 | 0.977 | 0.722 | 1.000 | 0.821 | 0.821 | 0.635 |
| | AUPRC | 0.826 | 0.677 | 0.873 | 0.331 | 1.000 | 0.297 | 0.841 | 0.664 |
| ENRR | Accuracy | 0.682 | 0.620 | 0.928 | 0.851 | 0.986 | 0.988 | 0.675 | 0.641 |
| | AUROC | 0.751 | 0.648 | 0.855 | 0.717 | 1.000 | 0.922 | 0.738 | 0.668 |
| | AUPRC | 0.773 | 0.676 | 0.608 | 0.379 | 0.994 | 0.363 | 0.762 | 0.697 |
| ANN-MLP | Accuracy | 0.646 | 0.595 | 0.931 | 0.770 | 0.999 | 0.982 | 0.723 | 0.624 |
| | AUROC | 0.736 | 0.674 | 0.827 | 0.622 | 1.000 | 0.930 | 0.798 | 0.683 |
| | AUPRC | 0.757 | 0.692 | 0.565 | 0.228 | 1.000 | 0.395 | 0.822 | 0.712 |
Mean model performance measures were reported in the Training Set and the Test Set, respectively.
AATD-LD = alpha-1 antitrypsin deficiency-associated liver disease, AUPRC = area under the precision-recall curve, AUROC = area under the receiver operating characteristic curve, RF = random forest, XGBOOST = extreme gradient boosting, LGBM = light gradient boosting machine, ENRR = elastic net regularized regression, ANN-MLP = artificial neural network multilayer perceptron.
Model performance measures from the stacking ensemble learning model are in bold.
Overall predictive model performance measures of stacking ensemble and each base model used in the ML training in patients with any liver disease.
| Model | Performance measure | All-cause mortality (N = 11,583), Training | All-cause mortality (N = 11,583), Test | Liver-related death (N = 11,583), Training | Liver-related death (N = 11,583), Test | Liver transplant (N = 11,583), Training | Liver transplant (N = 11,583), Test | All-cause mortality or liver transplant (N = 11,583), Training | All-cause mortality or liver transplant (N = 11,583), Test |
|---|---|---|---|---|---|---|---|---|---|
| RF | Accuracy | 0.739 | 0.700 | 0.804 | 0.818 | 0.957 | 0.947 | 0.739 | 0.707 |
| | AUROC | 0.801 | 0.745 | 0.866 | 0.800 | 0.993 | 0.829 | 0.800 | 0.752 |
| | AUPRC | 0.676 | 0.577 | 0.784 | 0.391 | 0.906 | 0.114 | 0.688 | 0.601 |
| XGBOOST | Accuracy | 0.764 | 0.724 | 0.972 | 0.897 | 0.999 | 0.985 | 0.765 | 0.727 |
| | AUROC | 0.820 | 0.764 | 0.995 | 0.812 | 1.000 | 0.843 | 0.827 | 0.772 |
| | AUPRC | 0.685 | 0.606 | 0.991 | 0.461 | 1.000 | 0.112 | 0.710 | 0.631 |
| LGBM | Accuracy | 0.759 | 0.726 | 0.978 | 0.900 | 0.999 | 0.987 | 0.754 | 0.728 |
| | AUROC | 0.813 | 0.766 | 0.996 | 0.812 | 1.000 | 0.831 | 0.812 | 0.772 |
| | AUPRC | 0.678 | 0.609 | 0.994 | 0.465 | 1.000 | 0.105 | 0.691 | 0.632 |
| ENRR | Accuracy | 0.757 | 0.754 | 0.807 | 0.890 | 0.954 | 0.983 | 0.755 | 0.753 |
| | AUROC | 0.767 | 0.761 | 0.848 | 0.831 | 0.916 | 0.870 | 0.769 | 0.764 |
| | AUPRC | 0.616 | 0.608 | 0.766 | 0.507 | 0.406 | 0.143 | 0.636 | 0.630 |
| ANN-MLP | Accuracy | 0.767 | 0.746 | 0.886 | 0.878 | 0.988 | 0.979 | 0.769 | 0.740 |
| | AUROC | 0.785 | 0.762 | 0.945 | 0.824 | 0.994 | 0.735 | 0.790 | 0.776 |
| | AUPRC | 0.648 | 0.604 | 0.908 | 0.45 | 0.943 | 0.055 | 0.671 | 0.632 |
Mean model performance measures were reported in the Training Set and the Test Set, respectively.
AUPRC = area under the precision-recall curve, AUROC = area under the receiver operating characteristic curve, RF = random forest, XGBOOST = extreme gradient boosting, LGBM = light gradient boosting machine, ENRR = elastic net regularized regression, ANN-MLP = artificial neural network multilayer perceptron.
Model performance measures from the stacking ensemble learning model are in bold.
Figure 6. Feature importance in the final stacking ensemble learning model for patients with AATD-LD (N = 455).
Figure 7. Feature importance in the final stacking ensemble learning model for patients with any liver disease (N = 11,583).
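Permutation feature importance, the general method named in the abstract, measures how much a performance metric drops when one predictor column is shuffled while the fitted model is left unchanged. The following is a minimal sketch under stated assumptions: the toy model, data, metric, and function names are all hypothetical and are not the authors' pipeline.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean drop in `metric` when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[col] for row in X]
            rng.shuffle(column)
            # Rebuild each row with only the chosen column replaced.
            permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, column)]
            drops.append(baseline - metric(y, [predict(row) for row in permuted]))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy setup: the "model" uses only feature 0, so feature 1 should score 0.
X = [[i / 19, (i * 7) % 5] for i in range(20)]
y = [1 if row[0] > 0.5 else 0 for row in X]


def toy_model(row):
    return 1 if row[0] > 0.5 else 0


importances = permutation_importance(toy_model, X, y, accuracy)
print(importances)
```

Because the toy model ignores the second feature, shuffling it cannot change any prediction, which is what an importance of zero means under this method.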