Sajida Perveen, Muhammad Shahbaz, Karim Keshavjee, Aziz Guergachi.
Abstract
Prevention and diagnosis of nonalcoholic fatty liver disease (NAFLD) are an ongoing area of interest in the healthcare community. Screening is complicated by the fact that noninvasive tests lack the specificity and sensitivity needed to make and stage the diagnosis. Currently, no noninvasive prediction method based on ATP III criteria is available to assess NAFLD risk. The first objective of this research is to develop a machine learning based method to identify individuals at increased risk of developing NAFLD, using the risk factors of the ATP III clinical criteria for Metabolic Syndrome (MetS), updated in 2005. The second objective is to validate the relative ability of the quantitative score defined by the Italian Association for the Study of the Liver (IASF), and of a guideline explicitly defined for the Canadian population based on triglyceride thresholds, to predict NAFLD risk. We propose a Decision Tree based method to evaluate the risk of developing NAFLD and its progression in the Canadian population, using Electronic Medical Records (EMRs) and exploring novel risk factors for NAFLD. Our results show that the proposed method could potentially help physicians make more informed choices about their management of patients with NAFLD. Employing the proposed application in routine medical checkups is expected to lessen healthcare expenditures compared with administering additional complicated tests.
Year: 2018 PMID: 29391513 PMCID: PMC5794753 DOI: 10.1038/s41598-018-20166-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Definition of the metabolic syndrome, according to recent classifications[13].
| Risk Factor/Features | National Cholesterol Education Program, ATP-III | International Diabetes Federation | Joint statement of IDF, NHLBI, AHA, WHF, IAS, IASO |
|---|---|---|---|
| Abdominal obesity, (waist circumference) | >102 cm (males), >88 cm (females) | ≥94 cm (males), ≥80 cm (females)(ethnic differences) | ≥94 cm (males), ≥80 cm (females)(ethnic differences) |
| Lipoprotein level | TRG ≥150 mg/dL or treated for dyslipidemia | TRG ≥150 mg/dL or treated for dyslipidemia | TRG ≥150 mg/dL or treated for dyslipidemia |
| HDL level | HDL-Chol <40 mg/dL (males); <50 mg/dL (females) | HDL-Chol <40 mg/dL (males); <50 mg/dL (females) | HDL-Chol <40 mg/dL (males); <50 mg/dL (females) |
| Blood pressure | ≥130/85 mmHg or treated for Htx | ≥130/85 mmHg or treated for Htx | ≥130/85 mmHg or treated for Htx |
| Fasting Glucose (FG) | ≥110 mg/dL or treated for DM | ≥100 mg/dL or treated for DM | ≥100 mg/dL or treated for DM |
| Diagnosis requires | 3 of the above | Abdominal obesity + 2 of the above | 3 of the above |
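
The ATP-III column of the table can be read as a simple rule check. The following is a minimal sketch under stated assumptions: the function names and argument names are hypothetical, units follow the table (cm, mg/dL, mmHg), and the blood pressure criterion is interpreted as systolic ≥130 or diastolic ≥85. The fasting glucose cutoff defaults to the 110 mg/dL listed in the table and can be overridden.

```python
def atp3_risk_factors(sex, waist_cm, trg_mg_dl, hdl_mg_dl, sbp, dbp, fg_mg_dl,
                      treated_dyslipidemia=False, treated_htn=False,
                      treated_dm=False, fg_threshold=110):
    """Return the ATP-III risk factors met, using the thresholds in the table.

    All names here are illustrative; sex is "M" or "F".
    """
    male = sex == "M"
    met = []
    if waist_cm > (102 if male else 88):
        met.append("abdominal_obesity")
    if trg_mg_dl >= 150 or treated_dyslipidemia:
        met.append("high_trg")
    if hdl_mg_dl < (40 if male else 50):
        met.append("low_hdl")
    # Assumption: either component above its cutoff satisfies the criterion.
    if sbp >= 130 or dbp >= 85 or treated_htn:
        met.append("high_bp")
    if fg_mg_dl >= fg_threshold or treated_dm:
        met.append("high_fg")
    return met


def has_mets_atp3(**measurements):
    """ATP-III rule from the table: any 3 of the 5 risk factors."""
    return len(atp3_risk_factors(**measurements)) >= 3
```

For instance, a male patient with a 105 cm waist, triglycerides of 160 mg/dL and a systolic pressure of 135 mmHg meets three criteria and would be flagged by `has_mets_atp3`.
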

| Score | Metabolic features |
|---|---|
| 0 | No abdominal adiposity and no other features of MetS |
| 1 | Abdominal adiposity |
| 2 | Abdominal adiposity + 1 feature of MetS (i.e. atherogenic dyslipidemia, low HDL cholesterol and/or high TRG, hypertension or fasting hyperglycemia/glucose intolerance/diabetes) |
| 3 | Abdominal adiposity + 2 features of MetS |
| 4 | Abdominal adiposity + 3 features of MetS |
Quantitative score to estimate the impact of metabolic factors on nonalcoholic fatty liver disease[6]. FG, fasting glucose; HDL, high-density lipoprotein; BMI, body mass index; DM, diabetes mellitus; TRG, triglyceride; MetS, metabolic syndrome.
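
The 0 to 4 score can be sketched as a small function. This is an illustration only: the function name and boolean arguments are hypothetical, low HDL and/or high TRG are counted together as one dyslipidemia feature (following the wording of the score 2 row), and the combination "no abdominal adiposity but other MetS features present" is not defined by the table, so the sketch returns `None` for it.

```python
def nafld_metabolic_score(abdominal_adiposity, dyslipidemia,
                          hypertension, hyperglycemia):
    """Map boolean MetS features to the 0-4 quantitative score.

    dyslipidemia: low HDL cholesterol and/or high triglycerides.
    hyperglycemia: fasting hyperglycemia, glucose intolerance or diabetes.
    """
    n_features = sum([bool(dyslipidemia), bool(hypertension), bool(hyperglycemia)])
    if abdominal_adiposity:
        return 1 + n_features  # scores 1..4
    if n_features == 0:
        return 0
    return None  # combination not covered by the score table
```

A patient with abdominal adiposity, dyslipidemia and hyperglycemia (but no hypertension) would receive a score of 3 under this reading.
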
Characteristics of the population in the Canadian primary care sentinel surveillance network database.
| Predictors | Findings |
|---|---|
| Male, sample size (%) | 287964 (43) |
| Female, sample size (%) | 379561 (57) |
| Male age, mean ± SD, years | 47.2 ± 25.1 |
| Female age, mean ± SD, years | 49.5 ± 24.8 |
| Diastolic BP, mean ± SD, mm Hg | 73.3 ± 12.4 |
| Systolic BP, mean ± SD, mm Hg | 121.9 ± 16.9 |
| Unknown disease frequency (%) | 393344 (59) |
| COPD frequency (%) | 15926 (2.4) |
| Dementia frequency (%) | 12007 (1.8) |
| Depression frequency (%) | 62682 (10) |
| Diabetes Mellitus frequency (%) | 40637 (6) |
| Epilepsy frequency (%) | 5553 (0.8) |
| Hypertension frequency (%) | 88615 (13) |
| Osteoarthritis frequency (%) | 47606 (7) |
| Parkinson’s Disease frequency (%) | 1825 (0.2) |
| FG, mean ± SD, mmol/L | 5.54 ± 1.91 |
| Triglycerides, mean ± SD, mmol/L | 1.43 ± 1.21 |
| HDL, mean ± SD, mmol/L | 1.38 ± 0.41 |
| BMI, mean ± SD, kg/m² | 26.54 ± 7.37 |
SD, standard deviation; BP, blood pressure; BMI, body mass index; FG, fasting glucose; HDL, high-density lipoprotein; COPD, chronic obstructive pulmonary disease.
*Some patients have more than 1 disease in the database.
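
The laboratory values above are reported in mmol/L, whereas the MetS thresholds are given in mg/dL, so comparing the two requires a unit conversion. The sketch below uses commonly cited approximate conversion factors; these factors and the helper names are assumptions introduced for illustration and are not taken from the source.

```python
# Approximate mg/dL per mmol/L (assumed standard factors, not from the source).
MG_DL_PER_MMOL_L = {
    "glucose": 18.016,
    "triglycerides": 88.57,
    "hdl": 38.67,
}


def mmol_l_to_mg_dl(analyte, value_mmol_l):
    """Convert a lab value from mmol/L to mg/dL for the given analyte."""
    return value_mmol_l * MG_DL_PER_MMOL_L[analyte]


def mg_dl_to_mmol_l(analyte, value_mg_dl):
    """Convert a lab value from mg/dL to mmol/L for the given analyte."""
    return value_mg_dl / MG_DL_PER_MMOL_L[analyte]
```

With these factors, the mean triglyceride level of 1.43 mmol/L corresponds to roughly 127 mg/dL, which is below the 150 mg/dL MetS threshold.
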
Characteristics of study samples without random under-sampling and with random under-sampling with uniform class distribution.
| Predictors | Without random under-sampling | With random under-sampling |
|---|---|---|
| Male, sample size | 16631 | 473 |
| Female, sample size | 24006 | 527 |
| Overall maximum age, Years | 103 | 93 |
| Overall minimum age, Years | 9 | 19 |
| Overall age mean ± SD,Years | 61.2 ± 14.2 | 59.48 ± 12.74 |
| Systolic blood pressure, mean ± SD, mm Hg | 125.5 ± 15.7 | 127.3 ± 15.403 |
| Diastolic blood pressure, mean ± SD, mm Hg | 75.4 ± 9.7 | 77.064 ± 10.243 |
| FG, mean ± SD, mmol/L | 5.4 ± 1.2 | 5.783 ± 1.935 |
| Triglycerides, mean ± SD, mmol/L | 1.4 ± 1.2 | 1.5 ± 1.31 |
| HDL, mean ± SD, mmol/L | 1.4 ± 0.4 | 1.248 ± 0.399 |
| BMI, mean ± SD, kg/m² | 28.5 ± 6.1 | 30.618 ± 6.164 |
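
Random under-sampling to a uniform class distribution can be sketched as follows. The per-class sample size, random seed, and exact procedure used in the study are not stated in this section, so `random_undersample`, `label_of`, `per_class` and `seed` are hypothetical names and defaults chosen for illustration.

```python
import random
from collections import defaultdict


def random_undersample(records, label_of, per_class=None, seed=0):
    """Draw the same number of records from every class, without replacement.

    records:   iterable of samples
    label_of:  function mapping a sample to its class label
    per_class: records to keep per class (default: size of the smallest class)
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)

    n = per_class if per_class is not None else min(len(v) for v in by_class.values())
    balanced = []
    for label, items in by_class.items():
        if len(items) < n:
            raise ValueError(f"class {label!r} has only {len(items)} records")
        balanced.extend(rng.sample(items, n))
    rng.shuffle(balanced)  # avoid leaving the records grouped by class
    return balanced
```

Under-sampling discards majority-class records rather than synthesizing minority-class ones, which is why the under-sampled column describes a much smaller sample than the full one.
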
Study sample distribution among different ordinal categories.
| Categories | NAFLD, N | NAFLD, % |
|---|---|---|
| Desirable | 30332 | 74.6 |
| Borderline-High | 5105 | 12.6 |
| High | 5011 | 12.08 |
| Very-High | 189 | 0.661 |
| Total | 40637 | 100.0 |
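
The ordinal categories are described as being based on triglyceride thresholds, but the cutoffs themselves are not given in this section. The sketch below therefore assumes cutoffs of 1.7, 2.3 and 5.6 mmol/L (approximately the ATP III triglyceride classification of 150, 200 and 500 mg/dL); the cutoffs and function names are assumptions for illustration and may differ from those used in the study.

```python
from bisect import bisect_right

CATEGORIES = ["Desirable", "Borderline-High", "High", "Very-High"]
# Assumed lower bounds (mmol/L) of the three upper categories; not from the source.
ASSUMED_CUTOFFS_MMOL_L = (1.7, 2.3, 5.6)


def trg_category(trg_mmol_l, cutoffs=ASSUMED_CUTOFFS_MMOL_L):
    """Assign an ordinal category to a triglyceride value in mmol/L."""
    return CATEGORIES[bisect_right(cutoffs, trg_mmol_l)]


def category_distribution(values, cutoffs=ASSUMED_CUTOFFS_MMOL_L):
    """Return {category: (count, percent)} for a list of triglyceride values."""
    counts = {c: 0 for c in CATEGORIES}
    for v in values:
        counts[trg_category(v, cutoffs)] += 1
    total = len(values)
    return {c: (n, 100.0 * n / total) for c, n in counts.items()}
```

Passing a different `cutoffs` tuple allows the same binning to be applied with another guideline's thresholds.
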
Figure 1. Abstract overview of the proposed methodology.
Figure 2. Decision tree drawn from the CPCSSN dataset.
Detailed performance analysis of prediction model without random under-sampling.
| Metric | Desirable | Borderline-High | High | Very-High | Weighted Avg. |
|---|---|---|---|---|---|
| TP Rate | 0.937 | 0.05 | 0.296 | 0.024 | 0.762 |
| FP Rate | 0.777 | 0.03 | 0.053 | 0.001 | 0.542 |
| Precisionµ | 0.78 | 0.493 | 0.573 | 0.133 | 0.669 |
| Recallµ | 0.937 | 0.451 | 0.396 | 0.024 | 0.735 |
| F-Measureµ | 0.851 | 0.279 | 0.349 | 0.04 | 0.676 |
| PrecisionM | 0.757 | 0.561 | 0.651 | 0.416 | 0.677 |
| RecallM | 0.832 | 0.59 | 0.503 | 0.366 | 0.713 |
| MCC | 0.328 | 0.247 | 0.195 | 0.055 | 0.299 |
| AROC | 0.748 | 0.631 | 0.738 | 0.507 | 0.731 |
Detailed performance analysis of prediction model with random under-sampling.
| Metric | Desirable | Borderline-High | High | Very-High | Weighted Avg. |
|---|---|---|---|---|---|
| TP Rate | 0.574 | 0.53 | 0.511 | 0.637 | 0.582 |
| FP Rate | 0.108 | 0.197 | 0.374 | 0.11 | 0.223 |
| Precisionµ | 0.62 | 0.587 | 0.468 | 0.647 | 0.594 |
| Recallµ | 0.516 | 0.654 | 0.672 | 0.687 | 0.637 |
| F-Measureµ | 0.592 | 0.669 | 0.504 | 0.598 | 0.614 |
| PrecisionM | 0.646 | 0.603 | 0.581 | 0.597 | 0.610 |
| RecallM | 0.547 | 0.678 | 0.72 | 0.667 | 0.661 |
| MCC | 0.204 | 0.377 | 0.167 | 0.164 | 0.276 |
| AROC | 0.748 | 0.812 | 0.693 | 0.809 | 0.746 |
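
Per-class figures such as TP rate, FP rate, precision, F-measure and MCC are typically derived from a multi-class confusion matrix by treating each class one-vs-rest. The sketch below shows those standard definitions, plus an average weighted by class size. It is an illustration under assumptions: the function names are hypothetical, the weighting by class support is assumed, and it does not reproduce the micro (µ) and macro (M) variants reported in the tables.

```python
from math import sqrt


def per_class_metrics(cm):
    """One-vs-rest metrics; cm[i][j] = samples of true class i predicted as j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(k)) - tp
        tn = total - tp - fn - fp
        tp_rate = tp / (tp + fn) if tp + fn else 0.0
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f_measure = (2 * precision * tp_rate / (precision + tp_rate)
                     if precision + tp_rate else 0.0)
        denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        mcc = (tp * tn - fp * fn) / denom if denom else 0.0
        results.append({"support": tp + fn, "tp_rate": tp_rate,
                        "fp_rate": fp_rate, "precision": precision,
                        "f_measure": f_measure, "mcc": mcc})
    return results


def weighted_average(metrics, key):
    """Average a metric over classes, weighting each class by its support."""
    total = sum(m["support"] for m in metrics)
    return sum(m[key] * m["support"] for m in metrics) / total
```

Because the weighted average is dominated by the largest class, a high weighted TP rate can coexist with very low TP rates on rare classes, which is the pattern visible in the table without under-sampling.
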