ABSTRACT: Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease; its pathogenesis is complex and is triggered by an unbalanced diet, a sedentary lifestyle, and genetic background. The aim of this study was to construct and validate a nomogram incorporating lifestyle habits for predicting NAFLD incidence. The overall cohort was divided into a training set and a test set using computer-generated random numbers. We constructed the nomogram by multivariate logistic regression analysis in the training set. Thereafter, we validated the model in the test set using the concordance index, the area under the receiver operating characteristic (ROC) curve, the net reclassification index, and a calibration curve. We also evaluated the clinical usefulness of the nomogram by decision curve analysis. There were no statistically significant differences in characteristics between the training cohort (n = 748) and the test cohort (n = 320). Eleven features (age, sex, body mass index, drinking tea, physical exercise, energy, monounsaturated fatty acids, polyunsaturated fatty acids, hypertension, hyperlipidemia, diabetes) were incorporated into the nomogram. The concordance index, the area under the ROC curve, and the net reclassification index were 0.801, 0.801, and 0.084, respectively, indicating that the nomogram has good discrimination for predicting NAFLD incidence. The calibration curve also showed good consistency between the nomogram prediction and the actual probability. Moreover, the decision curve showed that when the threshold probability of an individual is within a range from approximately 0.5 to 0.8, this model provided more net benefit for predicting NAFLD incidence risk than the current strategies. This nomogram can be regarded as a user-friendly tool for assessing the risk of NAFLD incidence and can thus help to facilitate the management of NAFLD, including lifestyle and medical interventions.
Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease, and its prevalence has increased rapidly with great changes in lifestyle. It affects approximately 25.24% of the global population. The prevalence of NAFLD is 31% in South America, 32% in the Middle East, 14% in Africa, and 27% in Asia, varying from 24.77% to 43.91% in the Chinese population. NAFLD is the hepatic manifestation of metabolic syndrome (MetS), encompassing simple fatty liver, nonalcoholic steatohepatitis, and fibrosis. NAFLD is increasingly recognized as one of the major causes of liver-related morbidity, mortality, and liver transplantation. The pathogenesis of NAFLD is complex and is triggered by risk factors including an unbalanced diet (high fat, high cholesterol, high sucrose), a sedentary lifestyle, and genetic background. There is currently no approved pharmacotherapy for NAFLD. Therefore, lifestyle management, such as weight loss, a balanced diet, and physical exercise, remains the first-choice approach in the treatment of NAFLD.
Nuts are among the most nutrient-dense foods, rich in unsaturated fatty acids, protein, fiber, vitamins, minerals, and antioxidants. Nuts are a key component of several dietary patterns, such as the Mediterranean diet, and their intake is associated with health benefits. Meta-analysis evidence revealed a beneficial role of nut intake in reducing the incidence of cardiovascular disease and cardiovascular events. Moreover, clinical research suggested that consumption of nuts may lower the risk of obesity, diabetes, and MetS. However, the association between nut intake and NAFLD has not been fully elucidated. Recently, Chen et al reported that higher intake of nuts is negatively correlated with the incidence risk of NAFLD, particularly in men. Thus, we became interested in the association between nut intake, combined with conventional indicators, and NAFLD incidence.
Although several models have been developed to predict NAFLD risk, a simple nomogram incorporating lifestyle indicators has not been developed to assess NAFLD incidence. In the present study, we sought to develop and validate a nomogram for predicting NAFLD risk and to provide an individualized prediction tool based on cost-effective variables. The predictive value of the model was evaluated based on discrimination, calibration, and clinical utility in the training and validation sets. This simple-to-use model might serve as an early warning and prediction system.
Methods and materials
Participants
The raw data were downloaded from the Dryad Digital Repository (http://www.datadryad.org/) for secondary analysis; they were shared by Chen et al. This retrospective cohort study included 534 NAFLD cases and 534 controls. All participants were randomly split into 2 groups in a 7:3 ratio. Thus, 748 participants were allocated to the training group and 320 participants to the test group. NAFLD was newly diagnosed by liver ultrasonography, with alcoholic hepatitis, autoimmune hepatitis, viral hepatitis, and drug-induced liver disease excluded. Participants were excluded if they were aged <18 or >70 years, were taking lipid-lowering or weight-loss drugs, or did not answer the food frequency and nut intake questionnaires. The controls were randomly selected and were frequency-matched by age, sex, ethnicity, and region.
The raw data contained age, sex, body mass index (BMI), education level, marital status, income, occupation, smoking, drinking tea, physical exercise, and history of hyperlipidemia, diabetes, and hypertension. Daily nut and energy intake were calculated from a semiquantitative food frequency questionnaire. Intakes of monounsaturated fatty acids (MUFAs) and polyunsaturated fatty acids (PUFAs) were calculated by multiplying the nutrient content of the specified portion by the frequency of consumption and summing across all relevant food items. Ethical approval was not necessary because the data are available on a public website.
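The 7:3 random allocation described under Participants can be illustrated with a minimal sketch. This is not the authors' code (their analysis was run in R); the function name `split_cohort`, the seed, and the use of integer participant IDs are assumptions made only for illustration.

```python
import random


def split_cohort(ids, train_fraction=0.7, seed=2021):
    """Shuffle participant IDs and split them into training and test sets.

    The seed is a hypothetical value, fixed only so the split is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    # Round to the nearest whole participant (0.7 * 1068 = 747.6 -> 748).
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 1068 participants (534 cases + 534 controls), identified here by index.
train_ids, test_ids = split_cohort(range(1068))
print(len(train_ids), len(test_ids))  # 748 320
```

With 1068 participants, rounding 70% to the nearest whole number reproduces the reported group sizes of 748 and 320.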
Construction of the nomogram
Univariate binary logistic regression analysis was used to select features at P < .05. The identified features were then included in a multivariate logistic regression analysis using stepwise forward selection. Finally, we developed a nomogram to estimate NAFLD incidence in the training group.
Discrimination of the nomogram
The concordance index (C-index) and the area under the receiver operating characteristic curve (AUC) were applied to assess the discrimination of the nomogram in the test cohort; values >0.7 suggested good discrimination. The net reclassification index (NRI) was computed to compare the predictive capabilities of 2 models (with and without PUFAs intake). Additionally, a calibration plot was used to evaluate the agreement between the actual probability and the nomogram prediction.
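For a binary outcome, the C-index is numerically identical to the AUC, which is why the two metrics coincide in the Results. The sketch below computes it directly from its pairwise definition; the function name `c_index` and the toy data are assumptions for illustration and do not come from the study.

```python
def c_index(y_true, y_score):
    """Proportion of case/control pairs in which the case has the higher score.

    Tied scores count as half-concordant. For a binary outcome this equals
    the area under the ROC curve.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): 3 cases and 3 controls with predicted probabilities.
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
print(c_index(y, p))  # 8 of the 9 case/control pairs are concordant
```

The double loop is quadratic in sample size, which is acceptable for a cohort of this size; rank-based formulas give the same value more efficiently.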
Decision curve analysis (DCA) of the nomogram
We performed DCA to evaluate the clinical application of the model in the test cohort. DCA is an evaluation method that calculates the net benefit of predictive models across a range of threshold probabilities.
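The net benefit underlying DCA is commonly defined as the true-positive rate per participant minus the false-positive rate per participant weighted by the odds of the threshold probability. A minimal sketch under that standard definition follows; the function names and toy data are assumptions, and the study itself used the R "rmda" package.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # A false positive is penalized by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'all' strategy; the 'none' strategy is always 0."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): 3 cases and 3 controls.
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
nb_model = net_benefit(y, p, 0.5)        # 2 TP, 1 FP, weight 1 -> 1/6
nb_all = net_benefit_treat_all(y, 0.5)   # 0.5 - 0.5 * 1 -> 0.0
print(nb_model, nb_all)
```

A model is considered clinically useful at a given threshold when its net benefit exceeds that of both the "all" and "none" strategies, as in this toy example.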
Identification of cutoff value for continuous variables
We performed receiver operating characteristic (ROC) curve analysis to evaluate the performance and optimal cutoff values of the identified variables for NAFLD incidence in the whole cohort. The performance of each variable for predicting NAFLD was quantified by the AUC. The optimal cutoff value was defined as the value with the highest Youden index (sensitivity + specificity – 1).
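Cutoff selection by the Youden index can be sketched as a search over candidate cutoffs. The example below is a simplified illustration, not the authors' procedure: it assumes that higher values indicate higher risk, uses each observed value as a candidate cutoff, and runs on made-up intake values.

```python
def youden_cutoff(y_true, values):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumes values >= cutoff are classified as positive. When several
    cutoffs share the maximum J, the smallest one is returned.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(values)):
        tp = sum(1 for y, v in zip(y_true, values) if v >= cutoff and y == 1)
        tn = sum(1 for y, v in zip(y_true, values) if v < cutoff and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical daily intake values (g/d) for 4 controls and 4 cases.
y = [0, 0, 0, 1, 0, 1, 1, 1]
intake = [20.0, 21.5, 23.0, 24.0, 24.5, 25.0, 26.5, 28.0]
cutoff, j = youden_cutoff(y, intake)
print(cutoff, j)  # 24.0 0.75
```

At the selected cutoff, sensitivity is 1.00 and specificity is 0.75 in this toy sample, giving a Youden index of 0.75.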
Statistical analyses
Continuous data were expressed as means ± standard deviation and compared between the 2 groups by t tests. Categorical data were expressed as numbers (percentages) and compared by the chi-square test. Logistic regression analysis, the nomogram, C-index, ROC, NRI, calibration curve, and DCA curve were performed with the R packages “stats,” “pROC,” “rms,” “nricens,” “ResourceSelection,” and “rmda.” All statistical tests were two-sided, and P-values <.05 were considered to indicate a statistically significant difference.
Results
Participant characteristics
There were 728 men (68.2%) and 340 women (31.8%) in the overall cohort, and most participants (63.1%) were aged 40 to 60 years. Among all participants, daily nut intake was 8.01 ± 12.97 g/d, energy intake was 2215.79 ± 609.74 kcal/d, daily MUFA intake was 32.64 ± 8.97 g/d, and daily PUFA intake was 24.62 ± 4.90 g/d. The overall cohort (n = 1068) was divided into a training cohort (n = 748) and a test cohort (n = 320) in a 7:3 ratio. The participant characteristics, including demographics, lifestyle habits, history of diseases, and nut intake, are summarized in Table 1. No significant differences in characteristics were observed between the 2 cohorts.
Table 1
Baseline characteristics of training, test, and overall cohorts.
| Characteristics | Training cohort (n = 748) | Test cohort (n = 320) | Overall cohort (n = 1068) | P value |
|---|---|---|---|---|
| Sex | | | | .441 |
| Male | 504 (67.4%) | 224 (70.0%) | 728 (68.2%) | |
| Female | 244 (32.6%) | 96 (30.0%) | 340 (31.8%) | |
| Age, y | | | | .494 |
| <40 | 188 (25.1%) | 90 (28.1%) | 278 (26.0%) | |
| 40–60 | 475 (63.5%) | 199 (62.2%) | 674 (63.1%) | |
| ≥60 | 85 (11.4%) | 31 (9.7%) | 116 (10.9%) | |
| Education | | | | .394 |
| Primary school or below | 63 (8.4%) | 26 (8.1%) | 89 (8.3%) | |
| Junior middle and high school | 307 (41.0%) | 118 (36.9%) | 425 (39.8%) | |
| Junior college or above | 378 (50.5%) | 176 (55.0%) | 554 (51.9%) | |
| Marriage | | | | .411 |
| Single | 82 (11.0%) | 29 (9.1%) | 111 (10.4%) | |
| Married or other | 666 (89.0%) | 291 (90.9%) | 957 (89.6%) | |
| BMI, kg/m2 | | | | .842 |
| <24.0 | 411 (54.9%) | 173 (54.1%) | 584 (54.7%) | |
| ≥24.0 | 337 (45.1%) | 147 (45.9%) | 484 (45.3%) | |
| Income (yuan/mo) | | | | .764 |
| <2000 | 45 (6.0%) | 22 (6.9%) | 67 (6.3%) | |
| 2000–3000 | 229 (30.6%) | 102 (31.9%) | 331 (31.0%) | |
| ≥3000 | 474 (63.4%) | 196 (61.3%) | 670 (62.7%) | |
| Smoke | | | | .538 |
| No | 530 (70.9%) | 220 (68.8%) | 750 (70.2%) | |
| Yes | 218 (29.1%) | 100 (31.3%) | 318 (29.8%) | |
| Drinking tea | | | | .564 |
| No | 308 (41.2%) | 125 (39.1%) | 433 (40.5%) | |
| Yes | 440 (58.8%) | 195 (60.9%) | 635 (59.5%) | |
| Occupation | | | | .831 |
| Mental labor | 213 (28.5%) | 97 (30.3%) | 310 (29.0%) | |
| Physical labor | 174 (23.3%) | 73 (22.8%) | 247 (23.1%) | |
| Other | 361 (48.3%) | 150 (46.9%) | 511 (47.8%) | |
| Physical exercise | | | | .279 |
| Light | 235 (31.4%) | 115 (35.9%) | 350 (32.8%) | |
| Moderate | 218 (29.1%) | 93 (29.1%) | 311 (29.1%) | |
| Severe | 295 (39.4%) | 112 (35.0%) | 407 (38.1%) | |
| Hyperlipidemia | | | | .954 |
| No | 713 (95.3%) | 306 (95.6%) | 1019 (95.4%) | |
| Yes | 35 (4.7%) | 14 (4.4%) | 49 (4.6%) | |
| Diabetes | | | | .606 |
| No | 724 (96.8%) | 307 (95.9%) | 1031 (96.5%) | |
| Yes | 24 (3.2%) | 13 (4.1%) | 37 (3.5%) | |
| Hypertension | | | | .184 |
| No | 577 (77.1%) | 234 (73.1%) | 811 (75.9%) | |
| Yes | 171 (22.9%) | 86 (26.9%) | 257 (24.1%) | |
| Nut frequency | | | | .735 |
| None | 77 (10.3%) | 37 (11.6%) | 114 (10.7%) | |
| 1–3 times/d | 45 (6.0%) | 24 (7.5%) | 69 (6.5%) | |
| 1–2 times/wk | 208 (27.8%) | 95 (29.7%) | 303 (28.4%) | |
| 3–6 times/wk | 55 (7.4%) | 25 (7.8%) | 80 (7.5%) | |
| 1–3 times/mo | 206 (27.5%) | 81 (25.3%) | 287 (26.9%) | |
| <Once/mo | 157 (21.0%) | 58 (18.1%) | 215 (20.1%) | |
| Nut daily intake, g/d | 7.68 ± 11.92 | 8.78 ± 15.14 | 8.01 ± 12.97 | .204 |
| Energy, kcal/d | 2215.92 ± 606.29 | 2215.48 ± 618.67 | 2215.79 ± 609.74 | .991 |
| MUFA daily intake, g/d | 32.66 ± 9.07 | 32.61 ± 8.73 | 32.64 ± 8.97 | .939 |
| PUFA daily intake, g/d | 24.51 ± 4.88 | 24.86 ± 4.94 | 24.62 ± 4.90 | .290 |
Data are shown as numbers (%) or mean ± SD.
BMI = body mass index, MUFA = monounsaturated fatty acids, PUFA = polyunsaturated fatty acids.
Risk factor selection
Univariate logistic regression analysis identified 7 candidate features: BMI, drinking tea, physical exercise, hypertension, energy, MUFAs intake, and PUFAs intake (Fig. 1). To adjust for confounding factors, we performed multivariate logistic regression analysis to further explore the risk factors predisposing to NAFLD. The results revealed that BMI and daily PUFAs intake are independent risk factors for NAFLD. Individuals with BMI ≥24.0 had 6.391-fold higher odds of NAFLD than those with BMI <24.0. Each 1 g/d increase in PUFAs intake was associated with 1.141-fold higher odds of NAFLD, that is, a 14.1% increase in the odds (Table 2).
Figure 1
Forest plots of univariate logistic regression analysis for estimated risk of developing NAFLD. NAFLD = non-alcoholic fatty liver disease, OR = odds ratio.
Table 2
Multivariate logistic regression models in the overall cohort.
| Traits | OR [95% CI] | P value |
|---|---|---|
| BMI ≥24.0 | 6.391 [4.552–8.972] | <.001 |
| Tea | 1.026 [0.727–1.447] | .885 |
| Physical exercise | | |
| Moderate | 0.770 [0.502–1.182] | .232 |
| Severe | 0.842 [0.562–1.261] | .403 |
| Hypertension | 1.281 [0.862–1.903] | .220 |
| Energy | 1.000 [0.999–1.000] | .066 |
| MUFA daily intake | 1.007 [0.983–1.031] | .581 |
| PUFA daily intake | 1.141 [1.092–1.192] | <.001 |
BMI = body mass index, MUFA = monounsaturated fatty acids, PUFA = polyunsaturated fatty acids.
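An odds ratio reported per unit of a continuous variable, such as those in Table 2, multiplies the odds for every additional unit, so the effect compounds over larger increments. The sketch below shows this conversion; the helper name `percent_change_in_odds` and the 5 g/d increment are illustrative assumptions, not part of the original analysis.

```python
import math


def percent_change_in_odds(odds_ratio, delta=1.0):
    """Percent change in the odds for a `delta`-unit increase in a predictor,
    given the per-unit odds ratio from a logistic regression model."""
    return (odds_ratio ** delta - 1.0) * 100.0


# Per-unit OR for daily PUFA intake taken from Table 2.
or_pufa = 1.141

# The logistic regression coefficient is the natural log of the OR.
beta_pufa = math.log(or_pufa)

per_1g = percent_change_in_odds(or_pufa)        # about 14.1% higher odds
per_5g = percent_change_in_odds(or_pufa, 5.0)   # compounds multiplicatively
print(round(per_1g, 1), round(per_5g, 1), round(beta_pufa, 3))
```

Note that an odds ratio describes a change in odds, not in absolute risk; translating it into a probability requires the full model, including the intercept, which Table 2 does not report.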
The nomogram for predicting NAFLD incidence was developed in the training group by incorporating the significant features (BMI, drinking tea, physical exercise, hypertension, energy, MUFAs intake, and PUFAs intake) and well-known factors (age, sex, hyperlipidemia, diabetes) (Fig. 2).
Figure 2
Nomogram for predicting NAFLD incidence in the training group. NAFLD = non-alcoholic fatty liver disease.
Evaluation of the nomogram
We next evaluated the predictive model using various metrics in the test cohort. The C-index was 0.801 (95% CI, 0.738–0.864), which demonstrated moderate accuracy of the nomogram. The AUC was 0.801, and the optimal probability cutoff was 0.189 (Fig. 3), indicating good discrimination of the nomogram. The calibration curve showed excellent performance of the predictive model (Fig. 4). To further explore whether the independent risk factor (PUFAs intake) adds predictive value to this model, we compared model B (age, sex, BMI, drinking tea, physical exercise, energy, MUFAs intake, PUFAs intake, and history of hyperlipidemia, diabetes, and hypertension) with model A (age, sex, BMI, drinking tea, physical exercise, energy, MUFAs intake, and history of hyperlipidemia, diabetes, and hypertension). The probability of 0.189 from the ROC analysis was used as the threshold for the categorical NRI. The NRI was 0.084 (95% CI, –0.005 to 0.089); because the confidence interval includes 0, adding PUFAs intake did not significantly improve risk reclassification. Collectively, these results revealed that the nomogram performs well in predicting NAFLD incidence.
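The categorical NRI with a single risk threshold can be sketched as follows. This follows the standard two-category definition (net upward reclassification among events plus net downward reclassification among non-events); the function name, the convention that a predicted risk at or above the threshold counts as high risk, and the toy predictions are assumptions, and the study itself used the R "nricens" package.

```python
def categorical_nri(y_true, prob_old, prob_new, threshold):
    """Two-category net reclassification index for one risk threshold."""
    n_event = sum(y_true)
    n_nonevent = len(y_true) - n_event
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    up_event = down_event = up_nonevent = down_nonevent = 0
    for y, old, new in zip(y_true, prob_old, prob_new):
        old_high, new_high = old >= threshold, new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            up_event += moved_up
            down_event += moved_down
        else:
            up_nonevent += moved_up
            down_nonevent += moved_down
    # Upward moves are correct for events; downward moves for non-events.
    nri_event = (up_event - down_event) / n_event
    nri_nonevent = (down_nonevent - up_nonevent) / n_nonevent
    return nri_event + nri_nonevent


# Hypothetical predictions from an old and a new model for 4 events
# and 4 non-events, using the threshold reported in the text (0.189).
y = [1, 1, 1, 1, 0, 0, 0, 0]
old = [0.10, 0.30, 0.15, 0.25, 0.30, 0.10, 0.25, 0.05]
new = [0.25, 0.30, 0.12, 0.15, 0.10, 0.10, 0.15, 0.25]
nri = categorical_nri(y, old, new, threshold=0.189)
print(nri)  # 0.25
```

In this toy example the event component is 0 (one event moves up, one moves down) and the non-event component is 0.25 (two move down, one moves up).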
Figure 3
A ROC curve of the nomogram in the training set. The AUC for NAFLD incidence was 0.801. AUC = area under receiver operating characteristic curve, NAFLD = non-alcoholic fatty liver disease, ROC = receiver operating characteristic curve.
Figure 4
A calibration curve of the nomogram in the training set. The dotted line indicates an ideal model, and the solid line indicates the predictive performance of the nomogram. The closer the distance between 2 lines, the better the performance of the nomogram.
Clinical usefulness of nomogram
The DCA of the nomogram is presented in Fig. 5. The decision curve showed that when the threshold probability of an individual is within a range from approximately 0.5 to 0.8, this model provided more net benefit for predicting NAFLD incidence risk than the “all” or “none” strategies.
Figure 5
The decision curve analysis of the nomogram for NAFLD incidence. The black line indicates the net benefit when no individuals develop NAFLD, while the purple line indicates the net benefit when all individuals suffer from NAFLD. The area among the black line, purple line, and green line indicates the clinical usefulness of the nomogram. NAFLD = non-alcoholic fatty liver disease.
Cutoff value for continuous variables
The AUCs for energy, MUFAs intake, and PUFAs intake were 54.28%, 60.02%, and 67.66%, respectively, indicating that, of the 3 variables, PUFAs intake best discriminated NAFLD incidence (Fig. 6). The optimal cutoff values of energy, MUFAs intake, and PUFAs intake for predicting the risk of NAFLD were 2142.08 kcal/d, 31.42 g/d, and 24.66 g/d, respectively (Table 3).
Figure 6
The ROC curves of identified continuous variables for predicting NAFLD. The AUC for energy, MUFA, and PUFA was 0.543, 0.600, and 0.677, respectively. AUC = area under receiver operating characteristic curve, MUFA = monounsaturated fatty acids, NAFLD = non-alcoholic fatty liver disease, PUFA = polyunsaturated fatty acids, ROC = receiver operating characteristic curve.
Table 3
Optimal cutoff values of identified risk factors for NAFLD.
MUFA = monounsaturated fatty acids, NAFLD = non-alcoholic fatty liver disease, PUFA = polyunsaturated fatty acids.
Discussion
In this retrospective cohort study, we constructed a novel tool for predicting NAFLD incidence using easily available variables, including demographic characteristics (age, sex), lifestyle habits (drinking tea, physical exercise, energy, MUFAs, and PUFAs intake), and physical examination parameters (BMI and history of hypertension, hyperlipidemia, and diabetes). We also validated this model using the C-index, AUC, NRI, and calibration curve, which indicated good performance in predicting NAFLD incidence. Moreover, the nomogram was demonstrated to have good clinical usefulness for predicting NAFLD incidence.
BMI is the measure most commonly used to classify an individual as underweight, normal weight, overweight, or obese. For Chinese adults, BMI ≥24 kg/m2 is considered overweight and BMI ≥28 kg/m2 is considered obese. Obesity is not only a well-known risk factor for the development of NAFLD but is also linked with progression of liver disease. A retrospective longitudinal cohort study revealed that BMI was positively correlated with NAFLD incidence and was considered the most useful predictive risk factor for NAFLD in both sexes. Another study indicated that an increase in BMI during young adulthood correlates with a greater risk of NAFLD in midlife. Consistent with previous research, our results also indicated that BMI is an independent risk factor for NAFLD. BMI, as incorporated in our NAFLD-predictive model, was a reliable and accurate parameter.
Our univariate analysis identified hypertension as a risk factor for NAFLD onset. Evidence has demonstrated that individuals with hypertension have a higher prevalence of NAFLD and that the risk of NAFLD is independently associated with hypertension and blood pressure category. Similarly, Donati et al reported that hypertensive patients had a significantly higher prevalence of NAFLD, which may be related to increases in insulin resistance and BMI. MetS is a cluster of metabolic abnormalities comprising obesity, hypertension, dyslipidemia, hyperglycemia, and insulin resistance. A possible explanation is that hypertension, a key component of MetS, might indirectly contribute to NAFLD onset; however, their causal association remains to be clearly determined.
It is well established that excess caloric intake leads to obesity and insulin resistance and is thus a leading risk factor for NAFLD. Gradual weight loss achieved by energy restriction, regardless of physical activity, can improve liver fat deposition, insulin resistance, and hepatic inflammation and fibrosis. Recent studies have also noted that lifestyle changes including energy restriction and increased physical activity over 6 to 12 months reduce liver fat and volume, steatosis, and incident NAFLD. These results corroborate our findings on the association between lifestyle and NAFLD. One surprising result was that drinking tea was found to be inversely associated with NAFLD incidence. A meta-analysis of clinical trials found that green tea has favorable effects on BMI, liver enzymes, and blood lipid parameters. The underlying mechanism may involve reductions in dietary lipid absorption, lipogenesis, hepatic gluconeogenesis, and lipid peroxidation.
Nuts predominantly contain unsaturated fatty acids (MUFAs and PUFAs) and have a relatively low amount of saturated fatty acids. Frequent nut consumption could have a favorable effect on lipid metabolism and endothelial function and thus reduce the risk of cardiovascular disease. A case-control study found that low intake of nuts was associated with a significantly higher risk of NAFLD in Korean men. Nuts, as a component of the Mediterranean dietary pattern, are included in dietary recommendations for the prevention and management of NAFLD in adults. The high energy density and fat content of nuts have raised concerns that regular nut intake will lead to weight gain. For example, Alper and Mattes reported that peanut consumption for 8 weeks contributed to body weight gain (1 kg). In the present study, we estimated the optimal cutoff values of energy and unsaturated fatty acid intake for predicting NAFLD incidence, which may provide a reference for defining the best thresholds of nut intake for Chinese individuals.
Several limitations of the present study need to be acknowledged. First, this study has a case-control design, so selection bias and recall bias are inherent weaknesses. A prospective, longitudinal study is needed to confirm these findings. Second, detailed food intake data, such as the types of nuts and tea consumed, were not available. Third, although the performance of our nomogram was evaluated with internal validation in the same population, external validation in other regions and countries is lacking.
Taken together, we constructed a nomogram incorporating 11 predictors (age, sex, drinking tea, physical exercise, energy, MUFAs intake, PUFAs intake, BMI, and history of hypertension, hyperlipidemia, and diabetes) to predict the risk of NAFLD. This nomogram can be regarded as a user-friendly tool for assessing the risk of NAFLD incidence and can thus help to facilitate the management of NAFLD, including lifestyle and medical interventions.
Acknowledgments
Special thanks to Chen et al. for sharing data.
Author contributions
Data curation: Kaili Peng, Shuofan Wang. Formal analysis: Linjiao Gao. Software: Shuofan Wang. Supervision: You Huaqiang. Writing – original draft: Kaili Peng. Writing – review & editing: You Huaqiang.