Xiue Gao, Wenxue Xie, Zumin Wang, Bo Chen, Shengbin Zhou.
Abstract
Diabetes mellitus has reached epidemic proportions globally in recent years, and its prevention and treatment have become key social challenges. Most research on diabetes risk factors has focused on correlation analysis, with little investigation into the causality of these risk factors, even though understanding causality is essential to preventing the disease. In this study, a causal discovery method for diabetes risk factors was developed based on an improved functional causal likelihood (IFCL) model. First, the problem of excessive redundant and false edges in functional causal likelihood structures was addressed by constructing an IFCL model that uses an adjustment threshold value. On this basis, an IFCL-based causal discovery algorithm was designed and evaluated in a simulation experiment. The experimental results showed that the causal structure generated from a dataset with a sample size of 2000 provided more information than that produced from a dataset with a sample size of 768. In addition, the causal structures obtained with the developed algorithm had fewer redundant and false edges. The following six causal relationships were identified: insulin→plasma glucose concentration, plasma glucose concentration→body mass index (BMI), triceps skin fold thickness→BMI, triceps skin fold thickness→age, diastolic blood pressure→BMI, and number of times pregnant→age. Furthermore, the reasonableness of these causal relationships was investigated. The developed algorithm enables the discovery of causal relationships among diabetes risk factors and can serve as a reference for future causality studies on diabetes risk factors.
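
The six relationships listed in the abstract can be written down as a small directed graph, which makes it easy to check that they form a directed acyclic graph and to read off the direct causes of each variable. The sketch below is only an illustration of that bookkeeping and is not the IFCL algorithm itself; the names `EDGES`, `parents_of`, and `causal_order` are hypothetical helpers introduced here.

```python
from graphlib import TopologicalSorter

# Edges copied from the abstract, written as (cause, effect).
EDGES = [
    ("insulin", "plasma glucose concentration"),
    ("plasma glucose concentration", "BMI"),
    ("triceps skin fold thickness", "BMI"),
    ("triceps skin fold thickness", "age"),
    ("diastolic blood pressure", "BMI"),
    ("number of times pregnant", "age"),
]


def parents_of(edges):
    """Map every variable to the set of its direct causes (possibly empty)."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(cause, set())
        parents.setdefault(effect, set()).add(cause)
    return parents


def causal_order(edges):
    """Return one ordering in which every cause precedes its effects.

    graphlib raises CycleError if the edges do not form a DAG.
    """
    return list(TopologicalSorter(parents_of(edges)).static_order())


PARENTS = parents_of(EDGES)
ORDER = causal_order(EDGES)
```

Here `PARENTS["BMI"]` collects the three variables reported as direct causes of BMI, and `ORDER` is one valid cause-before-effect ordering of the seven variables involved.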
Year: 2021 PMID: 34055037 PMCID: PMC8143882 DOI: 10.1155/2021/5552085
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1. Partial causal structure consisting of X and X.
Figure 2. Flowchart of the IFCL-based diabetes risk factor causal discovery algorithm.
Figure 3. Scatter plots, bar charts, and correlation coefficients for the M = 768 dataset.
Figure 4. Scatter plots, bar charts, and correlation coefficients for the M = 2000 dataset.
Correlation coefficients and P values of variable pairs for the M = 768 dataset.
| Variable pair | Correlation coefficient | P value |
|---|---|---|
| No. of times pregnant and age | 0.56 | 0 |
| No. of times pregnant and diastolic blood pressure | 0.21 | 0 |
| No. of times pregnant and insulin | 0.14 | 0 |
| No. of times pregnant and triceps skin fold thickness | 0.11 | 0.003 |
| No. of times pregnant and plasma glucose concentration | 0.11 | 0.001 |
| Plasma glucose concentration and insulin | 0.38 | 0 |
| Plasma glucose concentration and age | 0.27 | 0 |
| Plasma glucose concentration and BMI | 0.23 | 0 |
| Plasma glucose concentration and diastolic blood pressure | 0.21 | 0 |
| Plasma glucose concentration and triceps skin fold thickness | 0.17 | 0 |
| Plasma glucose concentration and diabetes pedigree function | 0.10 | 0.005 |
| Diastolic blood pressure and age | 0.33 | 0 |
| Diastolic blood pressure and BMI | 0.28 | 0 |
| Diastolic blood pressure and triceps skin fold thickness | 0.20 | 0 |
| Diastolic blood pressure and insulin | 0.10 | 0.005 |
| Triceps skin fold thickness and BMI | 0.54 | 0 |
| Triceps skin fold thickness and insulin | 0.16 | 0 |
| Triceps skin fold thickness and age | 0.12 | 0.001 |
| Insulin and age | 0.19 | 0 |
| Insulin and BMI | 0.17 | 0 |
| BMI and diabetes pedigree function | 0.12 | 0.001 |
Correlation coefficients and P values of variable pairs for the M = 2000 dataset.
| Variable pair | Correlation coefficient | P value |
|---|---|---|
| No. of times pregnant and age | 0.55 | 0 |
| No. of times pregnant and diastolic blood pressure | 0.21 | 0 |
| No. of times pregnant and insulin | 0.11 | 0 |
| No. of times pregnant and triceps skin fold thickness | 0.11 | 0 |
| No. of times pregnant and plasma glucose concentration | 0.11 | 0 |
| Plasma glucose concentration and insulin | 0.38 | 0 |
| Plasma glucose concentration and age | 0.26 | 0 |
| Plasma glucose concentration and BMI | 0.23 | 0 |
| Plasma glucose concentration and diastolic blood pressure | 0.19 | 0 |
| Plasma glucose concentration and triceps skin fold thickness | 0.18 | 0 |
| Diastolic blood pressure and age | 0.33 | 0 |
| Diastolic blood pressure and BMI | 0.28 | 0 |
| Diastolic blood pressure and triceps skin fold thickness | 0.21 | 0 |
| Triceps skin fold thickness and BMI | 0.53 | 0 |
| Triceps skin fold thickness and insulin | 0.17 | 0 |
| Triceps skin fold thickness and age | 0.13 | 0 |
| Insulin and age | 0.18 | 0 |
| Insulin and BMI | 0.17 | 0 |
| BMI and diabetes pedigree function | 0.12 | 0 |
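
The two tables above pair each correlation coefficient with a P value, and the same coefficient yields a much smaller P value at M = 2000 than at M = 768. The sketch below shows one way to approximate that relationship using only the Python standard library. It assumes a Pearson-type coefficient and uses Fisher's z-transform with a normal approximation; the source does not state which test the authors used, and `approx_p_value` is a hypothetical helper name.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided P value for the null hypothesis of zero correlation.

    Uses Fisher's z-transform: atanh(r) is approximately normal with
    standard error 1 / sqrt(n - 3). This is a large-sample approximation,
    not an exact t-test, and is an assumption of this sketch.
    """
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# A weak correlation at the smaller sample size (compare the 0.10 rows above).
P_768 = approx_p_value(0.10, 768)

# A similarly weak correlation at the larger sample size.
P_2000 = approx_p_value(0.11, 2000)
```

Under these assumptions, `P_768` comes out close to the 0.005 tabulated for coefficients of 0.10 in the M = 768 dataset, while `P_2000` is small enough to round to 0 at three decimal places, in line with the M = 2000 table.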
Figure 5. Causal structure for the M = 768 dataset (structure 1).
Figure 6. Causal structure for the M = 2000 dataset (structure 2).
Maximum likelihoods of causal structures 1 and 2.
| Causal structure | Maximum likelihood |
|---|---|
| 1 | -8.34 |
| 2 | -8.17 |
Figure 7. Causal structure for the M = 768 dataset (α = 0.05–0.06) (structure 3).
Figure 8. Causal structure for the M = 768 dataset (α = 0.07–0.14) (structure 4).
Figure 9. Causal structure for the M = 768 dataset (α = 0.15) (structure 5).
Figure 10. Causal structure for the M = 2000 dataset (α = 0.05–0.06) (structure 6).
Figure 11. Causal structure for the M = 2000 dataset (α = 0.07–0.15) (structure 7).
Figure 12. Causal structure for the M = 2000 dataset (α = 0.18) (structure 8).
Maximum likelihoods for causal structures 3–8.
| Dataset | Causal structure | Maximum likelihood |
|---|---|---|
| M = 768 | 3 (α = 0.05–0.06) | -8.18, -8.13 |
| M = 768 | 4 (α = 0.07–0.14) | -8.09, -8.03, -7.97, -7.91, -7.86, -7.79, -7.73, -7.67 |
| M = 768 | 5 (α = 0.15) | -7.63 |
| M = 2000 | 6 (α = 0.05–0.06) | -8.01, -7.96 |
| M = 2000 | 7 (α = 0.07–0.15) | -7.92, -7.86, -7.80, -7.74, -7.68, -7.62, -7.56, -7.50, -7.44 |
| M = 2000 | 8 (α = 0.18) | -7.26 |