Background: Non-alcoholic fatty liver (NAFL) can progress to the severe subtype non-alcoholic steatohepatitis (NASH) and/or fibrosis, which are associated with increased morbidity, mortality, and healthcare costs. Current machine learning studies detect NASH; however, this study is unique in predicting the progression of NAFL patients to NASH or fibrosis. Aim: To utilize clinical information from NAFL-diagnosed patients to predict the likelihood of progression to NASH or fibrosis. Methods: Data were collected from electronic health records of patients receiving a first-time NAFL diagnosis. A gradient boosted machine learning algorithm (XGBoost) as well as logistic regression (LR) and multi-layer perceptron (MLP) models were developed. A five-fold cross-validation grid search was utilized for hyperparameter optimization of variables, including maximum tree depth, learning rate, and number of estimators. Predictions of patients likely to progress to NASH or fibrosis within 4 years of initial NAFL diagnosis were made using demographic features, vital signs, and laboratory measurements. Results: The XGBoost algorithm achieved area under the receiver operating characteristic (AUROC) values of 0.79 for prediction of progression to NASH and 0.87 for fibrosis on both hold-out and external validation test sets. The XGBoost algorithm outperformed the LR and MLP models for both NASH and fibrosis prediction on all metrics. Conclusion: It is possible to accurately identify newly diagnosed NAFL patients at high risk of progression to NASH or fibrosis. Early identification of these patients may allow for increased clinical monitoring, more aggressive preventative measures to slow the progression of NAFL and fibrosis, and efficient clinical trial enrollment.
Non‐alcoholic fatty liver (NAFL) disease consists of a spectrum of liver disorders from isolated steatosis, termed NAFL, to non‐alcoholic steatohepatitis (NASH). NASH is considered to be progressive and is a more severe form of NAFL. While a minority (10–20%) of NAFL patients develop NASH, they are at increased risk for progression to fibrosis, cirrhosis, hepatocellular carcinoma (HCC), and liver‐related mortality.
Although previously under‐recognized, NAFL patients may also progress to advanced fibrosis. NAFL and NASH have a large economic impact on health care use. Estimates of total NAFL‐associated costs in the United States are approximately 103 billion dollars per year.
The prevalence of risk factors for NAFL, such as diet westernization, sedentary lifestyle, obesity, and type 2 diabetes mellitus (T2D), has increased, resulting in increased prevalence of NAFL. It is estimated that up to 34% of the United States population and 25.2% of the worldwide population may have NAFL. Similarly, increasing global rates of T2D and obesity are expected to lead to an increase in the burden of NASH and associated complications.
Currently, there are insufficient fully validated noninvasive diagnostic tests for NAFL.
The gold standard for NASH diagnosis and staging is liver biopsy, which is invasive, costly, and incurs procedure‐related risk for patients.
Noninvasive diagnostic procedures using ultrasound and MRI have been developed. However, these methods remain costly and time‐consuming, require expert use and interpretation, cannot capture all histopathological features assessed during evaluation for NASH, and are not fully validated for diagnosis of NASH. Serum biomarkers and biomarker‐based indexes have also been explored as diagnostic aids, but are not widely in use, as they are not perceived to provide additional actionable information relative to standard clinical evaluation.
Disease activity scores, such as the NAFL Activity Score (NAS), utilize liver biopsies yet lack utility in predicting disease progression and are not considered a clinical standard at this time. An urgent need remains for tools to predict specific clinical outcomes in NAFL.
Early detection is a significant challenge in NASH and fibrosis diagnosis and management.
Improved screening methods for patients at high risk of progression will enable clinicians to identify high‐risk NAFL patients requiring early intensive intervention. These methods may also assist investigators and sponsors of clinical trials in identifying appropriate patients for NAFL, NASH, and fibrosis drug clinical trials.
This study utilized clinical information from NAFL‐diagnosed patients to predict the likelihood of progression to NASH or fibrosis.
Methods
Data sources
Data were collected from a proprietary national longitudinal electronic health record (EHR) repository that incorporates clinical, claims, and other medical administrative data obtained from over 700 inpatient and ambulatory care sites. The data were extracted from January 2016 to June 2020, and resulted from aggregation across several different EHR systems. All data were de‐identified in compliance with the Health Insurance Portability and Accountability Act (HIPAA) and thus, this study did not require institutional review board approval as per 45 Code of Federal Regulations 46.102. The dataset included lab results, vital sign measurements, medication orders, patient diagnoses, and demographic information, as well as natural language processing (NLP)‐extracted features derived from clinicians' notes (Table S1, Supporting information).
Gold standard
NAFL, NASH, and fibrosis were defined using International Classification of Disease, Tenth Revision (ICD‐10) codes (Table S2, Supporting information). To meet the gold standard definition of the study, progression to NASH or fibrosis from NAFL had to occur within 4 years, but no earlier than 14 days, from the first diagnosis of NAFL. The majority of patients in the dataset who developed NASH or fibrosis did so within this period (Fig. 1).
Figure 1
Study design timeline. At time 0, the patient was diagnosed with NAFL using ICD‐10 codes. EHR data were collected 365 days prior to NAFL diagnosis. The prediction window for risk of progression to either NASH or fibrosis was from 14 days post NAFL diagnosis and spanned up to ~4 years (1350 days).
Patient measurements and inclusion criteria
Minimal inclusion criteria were applied in order to increase the likelihood of our studied population having sufficient data. Only patients that had at least one of their vital signs measured at the index encounter in which they were diagnosed with NAFL were included in the study. Further, patients were required to have had data available in the database for at least 2 years after their first NAFL diagnosis to be included, with at least one additional healthcare encounter within 2 years from their index encounter (Fig. 2).
Figure 2
Patient encounter inclusion diagram. EHR, electronic health record; NAFL, non‐alcoholic fatty liver; NASH, non‐alcoholic steatohepatitis.
Inputs for the machine learning algorithm (MLA) included demographic features, vital signs, and laboratory results that were reported in encounters 1 year prior to the index encounter when a patient was diagnosed with NAFL (Table 1). Vital signs and laboratory results were first classified into measurements taken at the current or from any previous hospital encounter. A previous hospital encounter was defined as any visit, outpatient or inpatient, that happened >24 h prior to the current encounter. The measurements for each included feature were then summarized by their mean, standard deviation, minimum, maximum, and total number of observations. As a result, there were distinct summary statistics for prior and current measurements for every lab or vital sign feature included in the model. This method of incorporating clinical observations was chosen because their distributions across an encounter and over time may indicate underlying health and illness severity, and are therefore helpful to the model. Including all measurements taken over a patient's current or previous encounter would allow for a more direct time‐series analysis. However, that representation is generally more valuable for the inpatient setting, as acute conditions can change rapidly. Because NASH and fibrosis develop at a slower rate, the use of a more explicit feature set was chosen in order to represent the data more simply and with greater explainability.
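The feature construction described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `(feature, timestamp, value)` record layout, the `prior_`/`current_` naming scheme, the use of the sample standard deviation, and the simplification that "current" means the 24 h up to the index time are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def summarize(values):
    """Summarize one feature's measurements (assumes at least one value)."""
    return {
        "mean": mean(values),
        # Sample standard deviation is an assumption; undefined for one value.
        "std": stdev(values) if len(values) > 1 else None,
        "min": min(values),
        "max": max(values),
        "count": len(values),
    }


def build_features(observations, index_time):
    """Build prior/current summary features from (name, timestamp, value) records.

    Hypothetical rules: records older than 365 days or later than the index
    time are ignored; records more than 24 h before the index time count as
    "prior", the remainder as "current".
    """
    window_start = index_time - timedelta(days=365)
    prior_cutoff = index_time - timedelta(hours=24)
    grouped = {}
    for name, timestamp, value in observations:
        if timestamp < window_start or timestamp > index_time:
            continue
        period = "prior" if timestamp < prior_cutoff else "current"
        grouped.setdefault((period, name), []).append(value)

    features = {}
    for (period, name), values in grouped.items():
        for stat, result in summarize(values).items():
            features[f"{period}_{stat}_{name}"] = result
    return features


index_time = datetime(2019, 6, 1, 12, 0)
observations = [
    ("ast", index_time - timedelta(days=400), 35.0),  # outside the 1-year window
    ("ast", index_time - timedelta(days=30), 40.0),
    ("ast", index_time - timedelta(days=10), 60.0),
    ("ast", index_time - timedelta(hours=1), 55.0),
]
features = build_features(observations, index_time)
```

In this sketch, features with no measurements are simply absent from the result, which leaves the handling of missing values to the downstream model.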
Table 1
Best features as measured by an XGB model
Demographics: Age; Sex; Race–ethnicity
Laboratory measures: Alanine transaminase (ALT); Aspartate aminotransferase (AST); Bilirubin (direct); Ferritin; Platelets; Body mass index (BMI); Prothrombin time or international normalized ratio (INR); Gamma‐glutamyl transferase (GGT)
Vital signs: Diastolic blood pressure; Heart rate; Respiratory rate; Systolic blood pressure; Temperature
These features were then used to train the final XGBoost (XGB), logistic regression (LR), and multi‐layered perceptron (MLP) models.
Machine learning model
XGBoost (XGB) was the primary model architecture employed in this study. XGB is a gradient boosting algorithm implemented in Python.
The XGB algorithm combined results from various decision trees to give prediction scores. Within each decision tree, the patient population was split into successively smaller groups, as each tree branch divided patients who entered it into one of two groups according to the variable value and a predetermined threshold. NASH and fibrosis patient encounters were represented at the end of the decision tree, which was a set of leaf nodes. As the XGB model was trained, successive trees were developed in order to improve the accuracy of the model. Successive iterations of trees used gradient descent on the prior trees in order to minimize the error of the next tree that is formed.
XGB model performance was compared to that of logistic regression (LR) and multi‐layer perceptron (MLP) models. Because LR and MLP models are unable to incorporate missing data, the median observation was used for imputation of that feature. Data were standardized for both the LR and MLP models to a standard Gaussian distribution. The XGB model was trained using the same 139 inputs as the MLP and LR models.
The dataset was partitioned into a train: hold‐out test ratio of 80:20 with stratified sampling, because the positive class was relatively small with respect to the negative class. As our dataset comprises multiple distinct facilities from throughout the United States, a second external validation experiment was performed in which the dataset was partitioned into a training set composed of many separate sites and a validation set composed of one distinct facility not seen by the algorithm during training (see Table S3, Supporting information for demographic information). Inputs for the LR and MLP models were standardized, and missing data were imputed within the training and test datasets independently. All of the models underwent hyperparameter selection with a five‐fold cross‐validated grid search.
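The two partitioning schemes described above can be sketched as follows. This is an illustrative sketch only, not the study's implementation: the function names, the fixed random seed, and the toy labels and site identifiers are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test so each class keeps roughly the same ratio."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    train, test = [], []
    for indices in indices_by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def site_holdout_split(sites, external_site):
    """Hold out every patient from one facility as an external validation set."""
    train = [i for i, site in enumerate(sites) if site != external_site]
    external = [i for i, site in enumerate(sites) if site == external_site]
    return train, external


# Toy data: 10 positive and 90 negative patients spread over three sites.
labels = [1] * 10 + [0] * 90
sites = ["A", "B", "C", "A", "B"] * 20

train_idx, test_idx = stratified_split(labels)
site_train_idx, external_idx = site_holdout_split(sites, external_site="C")
```

Stratifying by outcome keeps the small positive class represented in both partitions, while the site-based split checks whether performance carries over to a facility the model has never seen.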
The optimization of the hyperparameters was confirmed by evaluating the area under the receiver operating characteristic (AUROC) for different combinations of hyperparameters included in the grid search. For XGB, optimization parameters included maximum tree depth, regularization term (lambda), scale positive weight, learning rate, and number of estimators. Similarly, for LR, optimization parameters included penalty term, optimization problem solver, and inverse of regularization strength (C). For MLP, optimization parameters included maximum iteration, hidden layer size, and learning rate. Performance was assessed against the hold‐out test set and external validation cohort with respect to the AUROC curve, sensitivity, and specificity. Confidence intervals for these metrics were constructed using 1000 bootstrapped resamples. A SHAP (SHapley Additive exPlanations) analysis was performed to evaluate how important different features were to the performance of each model.
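A bootstrapped confidence interval for the AUROC can be computed along the following lines. This is a minimal sketch under stated assumptions: the text does not specify the interval method, so the percentile method is used here, and the function names, seed, and toy data are hypothetical.

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U); tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using the percentile bootstrap."""
    point = auroc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_resamples:
        sample = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in sample]
        if len(set(resampled_labels)) < 2:
            continue  # skip resamples that contain only one class
        estimates.append(auroc(resampled_labels, [scores[i] for i in sample]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return point, lower, upper


toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point, lower, upper = bootstrap_auroc_ci(toy_labels, toy_scores, n_resamples=200)
```

The same resampling loop could be reused for sensitivity and specificity by swapping the metric function.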
Results
In total, 141,293 patients were included in the experiments, 4384 and 4472 of whom were eventually diagnosed with NASH or fibrosis, respectively (Table 2). P‐values were calculated using an exact binomial test for noncontinuous observations and Welch's t‐test for continuous observations (such as age) to handle the unequal variance associated with our positive and negative classes. Median time to NASH or fibrosis diagnosis was 272 and 341 days after NAFL diagnosis, respectively, with a range of 15 to 1250 days (Fig. 3).
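For reference, the Welch's t statistic and its Welch–Satterthwaite degrees of freedom can be computed as in the sketch below. It illustrates the standard formula rather than the authors' analysis code; the function name and the toy samples are assumptions, and the final conversion to a P‐value (which requires the t distribution) is omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to share a variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Toy samples with clearly unequal variances.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```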
Table 2
Characteristics of the study sample
| Characteristic | Non‐NASH patients (%) | NASH patients (%) | P‐value | Nonfibrosis patients (%) | Fibrosis patients (%) | P‐value |
| --- | --- | --- | --- | --- | --- | --- |
| | n = 136 909 | n = 4384 | | n = 136 821 | n = 4472 | |
| Age | | | | | | |
| <30 | 9690 (7.08) | 343 (7.82) | 0.0583 | 9825 (7.18) | 207 (4.55) | <0.001 |
| 30–49 | 44 044 (32.17) | 1444 (32.94) | 0.2842 | 44 253 (32.33) | 1235 (27.24) | <0.001 |
| 50–59 | 38 997 (28.48) | 1345 (30.68) | 0.0015 | 38 876 (28.44) | 1466 (32.83) | <0.001 |
| 60–69 | 29 935 (21.86) | 946 (21.58) | 0.6515 | 29 749 (21.75) | 1132 (25.90) | <0.001 |
| 70–79 | 12 271 (8.96) | 276 (6.30) | <0.001 | 12 169 (8.89) | 378 (8.38) | 0.3071 |
| 80+ | 1971 (1.44) | 30 (0.68) | <0.001 | 1948 (1.42) | 53 (1.10) | 0.1839 |
| Sex | | | | | | |
| Male | 60 978 (44.53) | 1692 (38.59) | <0.001 | 60 595 (44.31) | 2075 (45.91) | 0.0054 |
| Female | 75 885 (55.42) | 2689 (61.34) | <0.001 | 76 178 (55.65) | 2396 (54.05) | 0.0054 |
| Ethnicity | | | | | | |
| Hispanic | 11 821 (8.63) | 459 (10.47) | <0.001 | 11 842 (8.66) | 438 (9.66) | 0.0078 |
| Not Hispanic | 114 805 (83.85) | 3651 (83.30) | 0.3088 | 114 735 (83.84) | 3721 (83.59) | 0.2444 |
| Unknown | 10 283 (7.51) | 274 (6.25) | 0.0018 | 10 244 (7.49) | 313 (6.74) | 0.2444 |
| Race | | | | | | |
| Caucasian | 112 602 (82.24) | 3658 (83.45) | 0.0415 | 112 632 (82.31) | 3628 (81.32) | 0.0397 |
| African American | 9898 (7.22) | 212 (4.83) | <0.001 | 9787 (7.15) | 323 (7.17) | 0.8590 |
| Asian | 2717 (1.98) | 98 (2.23) | 0.2419 | 2716 (1.98) | 99 (2.19) | 0.2815 |
| Other/unknown | 11 692 (8.54) | 416 (9.48) | 0.0271 | 11 686 (8.55) | 422 (9.31) | 0.0353 |
| Comorbidities | | | | | | |
| Obesity | 21 612 (15.78) | 848 (19.34) | <0.001 | 21 781 (15.92) | 679 (15.12) | 0.1853 |
| Hypertension | 61 145 (44.66) | 2042 (46.57) | 0.120 | 61 122 (44.68) | 2065 (46.32) | 0.0466 |
| Dyslipidemia | 45 034 (32.89) | 1596 (36.40) | <0.001 | 45 231 (33.05) | 1399 (31.30) | 0.0130 |
| Obstructive sleep apnea | 15 287 (11.16) | 645 (14.71) | <0.001 | 15 430 (11.27) | 502 (11.39) | 0.9137 |
Figure 3
Distribution of patients diagnosed with either NASH or fibrosis over time (the positive class). (a) the number of patients diagnosed with NASH, indicating the progression from NAFL to NASH and (b) the number of patients diagnosed with fibrosis (Fib) over time, indicating the progression from NAFL to fibrosis.
For the three models trained using 139 features, constructed as outlined above through the use of summary statistics for observations collected during current and previous encounters, the XGB model demonstrated the highest performance with an AUROC of 0.792 (95% CI [0.777–0.808]) for prediction of progression to NASH and of 0.871 (95% CI [0.859–0.882]) for prediction of progression to fibrosis within 4 years on the hold‐out test set. The performance of the XGB model on the external validation cohort was similar with AUROC values of 0.795 (95% CI [0.770–0.817]) for progression to NASH and 0.871 (95% CI [0.843–0.897]) for progression to fibrosis (Fig. S1, Supporting information). The LR and MLP models demonstrated AUROC values on the hold‐out test set of 0.689 (95% CI [0.669–0.731]) and 0.737 (95% CI [0.720–0.755]), respectively, for prediction of progression to NASH and of 0.753 (95% CI [0.735–0.771]) and 0.784 (95% CI [0.768–0.799]), respectively, for prediction of progression to fibrosis (Fig. 4). Similarly, the AUROC values for the LR and MLP models on the external validation set for prediction of progression to NASH were 0.680 (95% CI [0.653–0.706]) and 0.740 (95% CI [0.716–0.765]), respectively, and for progression to fibrosis were 0.795 (95% CI [0.762–0.828]) and 0.829 (95% CI [0.801–0.859]), respectively.
Figure 4
Receiver operating characteristic curves for the prediction of progression to NASH (a) and fibrosis (b) within 4 years using models trained on 139 features on the hold‐out test set. AUROC, area under the receiver operating characteristic; MLP, multi‐layer perceptron; XGB, XGBoost.
Using 139 features for training, the XGB model demonstrated similar performance in both the hold‐out test set (results shown) and external validation cohort (Fig. S2, Supporting information) in determining the risk of progression from NAFL to NASH for males and females with AUROCs of 0.797 and 0.800, respectively, and progression from NAFL to fibrosis of 0.851 and 0.846, respectively (Fig. 5).
Figure 5
Receiver operating characteristic curves for prediction of progression to NASH (a) and fibrosis (b) on male and female populations within 4 years using XGBoost models trained on 139 features on the hold‐out test set. AUROC, area under the receiver operating characteristic.
Feature importance plots for the XGB model trained to predict the progression of NAFL to NASH show the top three most important features on the hold‐out test set were prior max and mean aspartate aminotransferase (AST) and sex. For the XGB model trained to predict the progression of NAFL to fibrosis, the three most important features were prior mean and standard deviation of prothrombin time or international normalized ratio (INR) and prior mean AST (Fig. 6). Similar results for the external validation cohort are presented in Figure S3, Supporting information.
Figure 6
Feature correlations and distribution of feature importance for each patient for the XGBoost model on the hold‐out test set. (a) Feature importance plots for model predicting risk progression from NAFL to NASH and (b) NAFL to fibrosis. Features are ranked in descending order of importance as measured by SHAP values. Red indicates a high feature value; blue indicates a low feature value. Dots to the right resulted in a higher score; dots to the left resulted in a lower score. Min, max, mean, count, and std. represent minimum, maximum, average, number of data points for a certain measurement, and standard deviation, respectively. ALT, alanine transaminase; AST, aspartate aminotransferase; BMI, body mass index; DBP, diastolic blood pressure; GGT, gamma‐glutamyl transferase; INR, international normalized ratio; SBP, systolic blood pressure
Discussion
In this study, we retrospectively validated an MLA to predict progression to NASH or fibrosis in newly NAFL‐diagnosed patients. The XGB algorithm demonstrated an AUROC of 0.79 and 0.87 for prediction of progression to NASH and fibrosis, respectively, and outperformed LR and MLP models in both the hold‐out test set and external validation cohort. The results indicate that the XGB algorithm is capable of accurately identifying newly diagnosed NAFL patients at high risk of progression to NASH or fibrosis.
An exploration of feature importance for the 139‐feature XGB model revealed that lab values collected at previous visits were the most important features, with previous mean and maximum values of AST, together with sex, forming the three most important features for prediction of NASH development (Fig. 6). For prediction of the development of fibrosis, the top three features were the previous mean and standard deviation of INR and the previous mean AST. AST and alanine aminotransferase (ALT) were important features in both sets. Although both may be elevated in NAFL, ALT elevations are generally greater and the AST:ALT ratio is typically less than 1. The importance of historical AST measurements for predicting NAFL progression to NASH in the XGB model, therefore, suggests that historical trends in AST levels may be an important indicator for likely progression at the time of NAFL diagnosis. This is consistent with the observation that AST increases as hepatic steatosis and fibrosis occur.
The role of prothrombin time–INR as an indirect marker of liver fibrosis is well studied, and the predictive importance of INR for progression of fibrosis in the feature sets is well correlated with its diagnostic accuracy. INR has also been identified as significantly associated with histologically proven NASH, and thus it is congruent that it is an important feature in both feature sets of the model.
Several additional features were also identified as ranking high in predictive value for risk progression in both sets. The importance of these features is consistent with previous studies and literature. Ferritin, which can be elevated due to systemic inflammation as well as increased iron stores, is often elevated in patients with NAFL disease. Ferritin stores iron, which is a putative element that interacts with oxygen radicals. Increased iron stores are associated with increased liver damage in patients with NASH. It is also associated with an increased risk of fibrosis. Serum ferritin has been identified as a predictor in NASH, although some studies have disputed its predictive value. Other studies have found value in the use of ferritin as part of a diagnostic panel in evaluating individual patients, but not as an independent predictor. Our analysis showed previous mean and maximum levels of ferritin were highly valuable for predicting risk of progression in both feature sets. Platelet count is another important feature identified in both risk prediction models. Peripheral platelet production is regulated via a hormone primarily produced in the liver; decreased stimulation of platelet production and increased platelet destruction due to liver‐disease associated hypersplenism can result in decreased platelet counts, or thrombocytopenia. Low platelet count has been documented in subsets of patients with NAFL and was found to be an important predictor in determining risk of progression to NASH and fibrosis.
There is a need to separately study risk of progression to NASH and fibrosis for men and women diagnosed with NAFL.
Sex differences in the prevalence, risk factors, fibrosis, and clinical outcomes of NAFLD should be included when analyzing disease progression. The National Institutes of Health recommend that experiments address sex differences in preclinical studies, as well as studies involving algorithm development and interpretation. The XGB model demonstrated an AUROC of 0.797 and 0.800 for the risk of progression from NAFL to NASH and 0.851 and 0.846 for the risk of progression to fibrosis for males and females, respectively, on the hold‐out test set and similar results on the external validation cohort. This indicates that the algorithm predicts the risk of development of NASH and fibrosis equally well in both males and females.
Despite the significant disease burden posed by NAFL, effective targeted treatments for NAFL, NASH, and fibrosis are limited.
Newly diagnosed NAFL patients are encouraged to make a number of lifestyle modifications, including weight loss, dietary modifications, decreased alcohol consumption, and increased exercise. In addition, pharmacological therapies, including pioglitazone and omega‐3 polyunsaturated fatty acids, have been investigated for off‐label use in treating NASH in certain patient populations, though their use is generally recommended only for those with advanced stages or at high risk of disease progression. While such treatments are not recommended for the general NAFL population, their use could be appropriate for those at high risk of progression to NASH, as could more aggressive weight‐loss interventions such as bariatric surgery. Although pharmacologic efforts to treat fibrosis due to NASH have recently been set back with the rejection of obeticholic acid by the Food and Drug Administration (FDA), there are currently more than 20 agents in investigational trials for treatment of various etiologies of fibrosis.
Despite the potential clinical benefits of accurately identifying NAFL patients most at risk of progression to advanced NASH, few methods exist to readily identify these individuals. The majority of research into machine learning (ML) methods for NASH has explored ways to assist in diagnosis via detecting existing NASH using various features, for example, biopsy specimens and biochemical parameters in peripheral blood.
One study also demonstrated the superior performance of the XGB algorithm for NAFL and NASH detection, where it outperformed other ML approaches, including LR, decision tree, and random forest. While these studies provide context and demonstrate comparably superior performance of the XGB model as seen in this study, it is notable that the prediction tasks, input features, and population cohorts differed from our study. To the authors' knowledge, only one study to date has explicitly used ML to assist in prognosis, specifically to identify predictors of NASH among a hypertensive patient population. However, that study analyzed features in a specific subpopulation and so algorithm performance is not directly comparable. This research adds to the literature by demonstrating that an MLA can be trained to accurately predict progression to NASH or fibrosis in a general NAFL patient population with no other filtering of the patient population by the presence of other comorbidities.
In addition to improving prognostic predictions, our methods may also help identify and enrich clinical trial populations. The algorithm developed in this study could improve study power by enrolling NAFL patients most likely to experience rapid disease progression, increasing study efficiency. Slow disease progression has been noted as a major barrier to clinical trials of NAFL, given the long follow‐up required for most patients to see endpoints of disease progression and mortality.
As such, researchers have recommended focusing enrollment on patients most likely to benefit from the trial protocol. Researchers have also noted difficulties in enrolling NASH patients in clinical trials due to low detection and diagnosis rates, a lack of formal diagnosis, and/or discrepancies between initial pathological readings of liver biopsies and pathology readings performed by study clinicians. These difficulties have prompted suggestions that other methods of screening patients may improve trial enrollment by reducing screening failures. The algorithm presented in this study is capable of defining a population at high risk of progression to NASH, potentially enabling prevention as well as treatment trials.
There are several limitations to this study. First, the timing of NAFL, NASH, and fibrosis onset was determined by the first application of ICD codes. NAFL may be clinically silent for a prolonged period prior to diagnosis; therefore, the observed onset may differ from the pathophysiological onset of disease. Second, although the gold standard for diagnosis of NASH and fibrosis is liver biopsy, in the absence of these data we instead used clinician diagnosis via ICD codes as a proxy. Third, the retrospective nature of this study means that we cannot make inferences about the impact this algorithm may have on clinical care or patient outcomes. Future prospective research will be critical in elucidating the role that prognostic MLAs may play in treatment planning and clinical trial design for NASH.
Conclusion
In this study, we have developed an MLA capable of accurately identifying newly diagnosed NAFL patients at high risk of progression to NASH or fibrosis. Identification of patients at high risk of progression to NASH or fibrosis may serve to improve patient outcomes through increased patient monitoring and aggressive preventative measures.
Appendix S1. Supporting Information.