Hongquan Peng, Haibin Zhu, Chi Wa Ao Ieong, Tao Tao, Tsung Yang Tsai, Zhi Liu.
Abstract
Accurate detection of chronic kidney disease (CKD) plays a pivotal role in early diagnosis and treatment. Measured glomerular filtration rate (mGFR) is considered the benchmark indicator of kidney function. However, because measuring GFR directly is resource-intensive, it is usually approximated by the estimated glomerular filtration rate, which underscores an urgent need for more precise and stable approaches. Novel machine learning methodologies have been shown to improve prediction performance significantly across all available data, but performance remains limited by the lack of models that can handle ultra-high dimensional datasets. This study proposes a two-stage neural network approach for predicting GFR and suggests additional useful biomarkers, obtained from blood metabolites, for measuring GFR. The approach combines feature shrinkage with a neural network and targets settings where the number of features is much larger than the number of training samples. The results show that the proposed method outperforms existing ones, such as a convolutional neural network and a direct deep neural network.
Year: 2021 PMID: 34185395 PMCID: PMC8675857 DOI: 10.1049/syb2.12031
Source DB: PubMed Journal: IET Syst Biol ISSN: 1751-8849 Impact factor: 1.615
Summary of mGFR and basic features
| Target variable and basic information | Training and validation set | Test set |
|---|---|---|
| mGFR – ml/min/1.73 m² of body‐surface area | 59.1 ± 32.0 | 60.8 ± 33.6 |
| Age – years | 59.0 ± 18.5 | 56.2 ± 18.4 |
| Weight – kg | 63.5 ± 14.1 | 65.4 ± 11.7 |
| Height – cm | 161.9 ± 9.4 | 162.1 ± 8.8 |
| Male sex – no. (%) | 82 (51.9) | 20 (50.0) |
Note: Values associated with plus‐minus sign are sample mean ± sample standard deviation.
Abbreviation: mGFR, measured glomerular filtration rate.
FIGURE 1 Workflow of the two‐stage neural network prediction. Ultra‐high dimensional data are fed into the first stage and reduced to lower‐dimensional matrices by a feature selection approach. In the second stage, a neural network takes the selected subset of features and produces the predictions. Finally, all predictions are assessed by three metrics.
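The two stages in this workflow can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: stage 1 uses plain marginal-correlation screening as a stand-in for the selection methods named in the tables (Lasso, SCAD, ISIS, RRCS, PLS), stage 2 is a one-hidden-layer tanh network trained by gradient descent, and the data are synthetic. The names `screen_features`, `fit_network`, `p_star` and `n_neurons` are hypothetical.

```python
import numpy as np

def screen_features(X, y, p_star):
    """Stage 1 (stand-in): keep the p_star columns of X with the largest
    absolute marginal correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    return np.sort(np.argsort(corr)[-p_star:])

def fit_network(X, y, n_neurons, lr=0.05, epochs=2000, seed=0):
    """Stage 2: one-hidden-layer tanh network fitted by full-batch gradient
    descent on squared error. Inputs and target are standardised internally."""
    rng = np.random.default_rng(seed)
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)
    y_mu, y_sd = y.mean(), y.std()
    Z, t = (X - mu) / sd, (y - y_mu) / y_sd
    n, p = Z.shape
    W1 = rng.normal(0.0, 0.1, (p, n_neurons))
    b1 = np.zeros(n_neurons)
    w2 = rng.normal(0.0, 0.1, n_neurons)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)                 # hidden activations
        err = H @ w2 + b2 - t                    # residuals (standardised scale)
        dH = np.outer(err, w2) * (1.0 - H ** 2)  # backpropagate through tanh
        w2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (Z.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)

    def predict(X_new):
        H_new = np.tanh(((X_new - mu) / sd) @ W1 + b1)
        return (H_new @ w2 + b2) * y_sd + y_mu   # back to the original scale

    return predict

# Synthetic example: 500 features, 120 samples, only two informative columns.
rng = np.random.default_rng(42)
n, p = 120, 500
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

selected = screen_features(X, y, p_star=10)          # stage 1
predict = fit_network(X[:, selected], y, n_neurons=2)  # stage 2
train_rmse = float(np.sqrt(np.mean((predict(X[:, selected]) - y) ** 2)))
print(selected, round(train_rmse, 2))
```

The point of the split is that the network only ever sees `p_star` columns, so its parameter count stays small even when the raw feature count far exceeds the sample size.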
FIGURE 2 Results of feature selection by all the methods considered. The y‐axis lists the names of all included features and the x‐axis lists the feature selection methods. Features on the y‐axis are ordered by their average importance across all methods: those at the top have relatively higher importance and those at the bottom have less influence.
This table reports the root mean squared errors obtained from fivefold cross‐validation, with the hyper‐parameter p* taking values of 10, 15, 20 or 25 and the number of neurons varying from 1 to 10
| Number of features (p*) | Number of neurons | Lasso | Opt‐Lasso | SCAD | ISIS | RRCS | PLS | Average |
|---|---|---|---|---|---|---|---|---|
| 10 | 1 | 13.45 | 13.88 | 13.61 | 14.11 | 14.02 | 16.06 | 14.19 |
| 10 | 2 | 14.22 | 14.60 | 13.89 | 15.38 | 13.21 | 16.79 | 14.68 |
| 10 | 3 | 16.01 | 16.05 | 13.93 | 16.26 | 14.38 | 18.58 | 15.87 |
| 10 | 4 | 16.79 | 16.71 | 17.12 | 17.98 | 13.82 | 19.23 | 16.94 |
| 10 | 5 | 19.16 | 18.88 | 18.86 | 18.07 | 13.89 | 22.48 | 18.56 |
| 10 | 6 | 19.40 | 19.75 | 18.49 | 18.77 | 14.75 | 20.47 | 18.60 |
| 10 | 7 | 20.90 | 22.05 | 20.07 | 17.73 | 15.35 | 23.69 | 19.97 |
| 10 | 8 | 20.95 | 25.21 | 21.53 | 18.97 | 14.59 | 25.08 | 21.06 |
| 10 | 9 | 23.40 | 24.88 | 23.19 | 20.02 | 15.10 | 25.82 | 22.07 |
| 10 | 10 | 23.81 | 24.45 | 24.51 | 21.11 | 15.05 | 27.00 | 22.65 |
| 15 | 1 | 12.61 | 14.00 | 14.84 | 11.95 | 13.43 | 14.85 | 13.61 |
| 15 | 2 | 14.67 | 16.41 | 17.08 | 16.11 | 13.49 | 16.40 | 15.69 |
| 15 | 3 | 14.99 | 17.80 | 18.79 | 33.59 | 13.92 | 17.95 | 19.51 |
| 15 | 4 | 18.36 | 19.19 | 19.83 | 14.05 | 13.99 | 21.67 | 17.85 |
| 15 | 5 | 20.29 | 19.30 | 19.73 | 15.50 | 13.57 | 21.04 | 18.24 |
| 15 | 6 | 19.12 | 22.52 | 25.30 | 16.63 | 13.86 | 22.37 | 19.97 |
| 15 | 7 | 22.13 | 21.83 | 23.51 | 20.71 | 14.89 | 24.87 | 21.32 |
| 15 | 8 | 22.58 | 27.18 | 27.93 | 19.64 | 14.20 | 27.56 | 23.18 |
| 15 | 9 | 23.70 | 24.30 | 26.81 | 19.23 | 32.70 | 27.15 | 25.65 |
| 15 | 10 | 25.86 | 23.65 | 28.75 | 23.78 | 14.88 | 27.15 | 24.01 |
| 20 | 1 | 13.18 | 12.53 | 14.79 | 11.62 | 13.46 | 14.01 | 13.27 |
| 20 | 2 | 15.00 | 14.15 | 16.92 | 11.77 | 14.25 | 14.92 | 14.50 |
| 20 | 3 | 16.97 | 15.41 | 20.40 | 14.27 | 14.70 | 17.99 | 16.62 |
| 20 | 4 | 18.98 | 17.86 | 22.78 | 14.51 | 13.74 | 19.45 | 17.89 |
| 20 | 5 | 19.78 | 23.64 | 23.30 | 16.64 | 16.54 | 20.53 | 20.07 |
| 20 | 6 | 24.12 | 25.05 | 25.56 | 18.36 | 16.21 | 22.26 | 21.93 |
| 20 | 7 | 22.29 | 24.40 | 27.99 | 22.96 | 16.60 | 24.90 | 23.19 |
| 20 | 8 | 22.89 | 22.32 | 26.23 | 20.51 | 17.13 | 25.07 | 22.36 |
| 20 | 9 | 24.55 | 26.13 | 31.14 | 23.81 | 18.65 | 22.84 | 24.52 |
| 20 | 10 | 22.95 | 22.10 | 27.20 | 24.27 | 20.42 | 24.96 | 23.65 |
| 25 | 1 | 12.34 | 11.59 | 15.38 | 12.04 | 13.62 | 14.19 | 13.19 |
| 25 | 2 | 15.29 | 14.22 | 16.81 | 13.70 | 13.79 | 17.06 | 15.15 |
| 25 | 3 | 16.28 | 15.64 | 22.77 | 14.05 | 14.19 | 19.13 | 17.01 |
| 25 | 4 | 19.23 | 19.37 | 23.38 | 19.38 | 15.72 | 20.76 | 19.64 |
| 25 | 5 | 22.13 | 22.32 | 27.89 | 17.99 | 16.89 | 24.15 | 21.89 |
| 25 | 6 | 25.09 | 21.34 | 25.46 | 19.56 | 16.25 | 23.51 | 21.87 |
| 25 | 7 | 26.03 | 24.94 | 29.91 | 20.42 | 18.29 | 23.60 | 23.86 |
| 25 | 8 | 26.20 | 24.63 | 30.52 | 20.43 | 16.87 | 27.11 | 24.29 |
| 25 | 9 | 26.20 | 23.07 | 27.97 | 21.03 | 19.59 | 25.96 | 23.97 |
| 25 | 10 | 24.45 | 23.46 | 24.52 | 22.00 | 20.14 | 25.12 | 23.28 |
Abbreviations: ISIS, iterative sure independent screening; PLS, partial least squares; RRCS, robust rank correlation‐based screening; SCAD, smoothly clipped absolute deviations.
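A grid like the one in this table (four values of p* crossed with 1 to 10 neurons, each scored by fivefold cross-validation) could be produced along the following lines. This is a hedged sketch rather than the study's code: `fit_predict` is a hypothetical callback standing for any two-stage learner, `mean_baseline` is a placeholder that ignores both hyper-parameters, the data are synthetic, and the RMSE is pooled over all held-out samples, which is one of several possible conventions and is not confirmed by the source.

```python
import itertools
import math
import random

def kfold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_rmse(fit_predict, X, y, params, k=5, seed=0):
    """Pooled RMSE over the k held-out folds for one parameter setting."""
    sq_err = 0.0
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        # The learner only sees the training fold, so any feature selection
        # it performs cannot leak information from the held-out samples.
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in held_out], **params)
        sq_err += sum((p - y[i]) ** 2 for p, i in zip(preds, held_out))
    return math.sqrt(sq_err / len(y))

def grid_search(fit_predict, X, y, grid, k=5):
    """Score every combination in the grid and return the best one."""
    names = list(grid)
    scores = {}
    for values in itertools.product(*(grid[name] for name in names)):
        scores[values] = cv_rmse(fit_predict, X, y, dict(zip(names, values)), k)
    best = min(scores, key=scores.get)
    return dict(zip(names, best)), scores

def mean_baseline(X_train, y_train, X_test, p_star, n_neurons):
    """Placeholder learner: ignores its inputs and predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_test)

# Same grid shape as the table: p* in {10, 15, 20, 25}, 1 to 10 neurons.
grid = {"p_star": [10, 15, 20, 25], "n_neurons": list(range(1, 11))}
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(30)] for _ in range(50)]
y = [row[0] + data_rng.gauss(0, 0.1) for row in X]
best, scores = grid_search(mean_baseline, X, y, grid)
print(len(scores), best)
```

Swapping `mean_baseline` for a real selection-plus-network learner with the same signature yields one RMSE per (p*, neurons) cell, which is the shape of the table above.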
This table shows prediction performance estimated by fivefold cross‐validation
| Metrics | Lasso | Opt‐Lasso | SCAD | ISIS | RRCS | PLS | CNN | Direct DNN |
|---|---|---|---|---|---|---|---|---|
| Bias | −0.23 | −0.57 | −0.43 | −0.59 | −0.98 | −0.11 | −0.56 | −1.96 |
| IQR | 10.06 | 10.81 | 12.17 | 9.66 | 8.95 | 10.79 | 13.46 | 21.51 |
| RMSE | 13.18 | 12.53 | 14.79 | 11.62 | 13.46 | 14.01 | 16.65 | 25.52 |
Abbreviations: CNN, convolutional neural network; DNN, deep neural network; IQR, interquartile range; ISIS, iterative sure independent screening; PLS, partial least squares; RRCS, robust rank correlation‐based screening; SCAD, smoothly clipped absolute deviations.
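The three metrics in this table can be computed from paired predictions and measurements roughly as follows. The definitions here are assumptions, since this page does not state them: bias is taken as the median of (predicted − measured), IQR as the interquartile range of those same differences, and RMSE as the root mean squared difference. The sign convention and the quartile method may differ from the study's, and the numbers in the example are invented for illustration only.

```python
import math
import statistics

def gfr_metrics(predicted, measured):
    """Bias, IQR and RMSE of the prediction errors (predicted - measured)."""
    errors = [p - m for p, m in zip(predicted, measured)]
    q1, _, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "bias": statistics.median(errors),   # assumed: median error
        "iqr": q3 - q1,                      # spread of the middle 50% of errors
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
    }

# Illustrative values only (ml/min/1.73 m²), not data from the study.
measured = [60.0, 45.0, 90.0, 30.0, 75.0]
predicted = [58.0, 44.0, 90.0, 31.0, 79.0]
metrics = gfr_metrics(predicted, measured)
print(metrics)
```

Read together, the three numbers separate systematic error (bias), the spread of typical errors (IQR) and overall error magnitude (RMSE), which squares the errors and is therefore the most sensitive to large individual misses.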