Miao Lu, Jianhui Zhou, Caitlin Naylor, Beth D Kirkpatrick, Rashidul Haque, William A Petri, Jennie Z Ma.
Abstract
BACKGROUND: Environmental Enteropathy (EE) is a subclinical condition caused by constant fecal-oral contamination, resulting in blunting of intestinal villi and intestinal inflammation. The primary interest of this clinical research is to evaluate the association between non-invasive EE biomarkers and malnutrition in a cohort of Bangladeshi children. The challenges are that the number of biomarkers/covariates is relatively large and that some of them are highly correlated.
Keywords: Biomarker selection; Correlated covariates; Environmental enteropathy; Malnutrition; Penalized linear regression
Year: 2017 PMID: 28293424 PMCID: PMC5345248 DOI: 10.1186/s40364-017-0089-4
Source DB: PubMed Journal: Biomark Res ISSN: 2050-7771
Biomarker list and descriptive summary in PROVIDE study (N=512)
| Variable category | Biomarker | Child age (week) | Mean (SD) |
|---|---|---|---|
| Enteric inflammation | Myeloperoxidase (MPO) | 12 | 10952.92 (11489.08) |
| | Calprotectin | 12 | 781.68 (725.30) |
| | Neopterin | 12 | 2601.90 (2041.17) |
| | Alpha-1 anti-trypsin (ALA) | 12 | 0.85 (0.71) |
| | Mannitol in urine | 12 | 0.02 (0.02) |
| | | 24 | 0.02 (0.02) |
| | Reg1B | 6 | 56.13 (91.12) |
| | | 12 | 80.87 (117.88) |
| | Days of diarrhea | 18 | 6.22 (10.75) |
| Systemic inflammation | Ferritin | 6 | 229.42 (153.31) |
| | | 18 | 45.45 (56.39) |
| | C Reactive Protein (CRP) | 6 | 1.11 (3.83) |
| | | 18 | 2.89 (7.53) |
| | Soluble CD14 | 6 | 1686.90 (630.27) |
| | | 18 | 1967.24 (697.34) |
| | Endocab lipopolysaccharide (LPS) | 6 | 29.21 (42.25) |
| | | 18 | 11.27 (39.98) |
| | Log Scale of Activin | 6 | 6.41 (1.12) |
| Nutritional measures | Vitamin D | 6 | 35.58 (18.20) |
| | | 18 | 61.38 (24.17) |
| | Zinc | 6 | 725.64 (107.58) |
| | | 18 | 771.86 (146.50) |
| | Retinol binding protein (RBP) | 6 | 24317.52 (11461.94) |
| | | 18 | 29780.83 (15167.70) |
| | Height for age z score (HAZ) | Birth | -0.90 (0.89) |
| | | 18 | -1.02 (0.93) |
| | Weight for age z score (WAZ) | 18 | -0.82 (1.06) |
| | Weight for height z score (WHZ) | 18 | -0.10 (1.01) |
| | Days of exclusive breast milk feeding | 18 | 95.99 (41.71) |
| Maternal health, SES | Monthly household expenditure | NA | 11736.56 (7555.13) |
| | Monthly household income | NA | 13021.23 (9708.96) |
| | Mother height (cm) | NA | 150.38 (5.61) |
| | Mother weight (kg) | NA | 49.36 (9.33) |
Fig. 1 Heatmap of correlation for all biomarkers
Comparison of methods at different correlation levels (ρ) and signal-to-noise ratios (SNR), based on 100 simulated datasets
| ρ | SNR | Criteria | Stepwise | Elastic Net | LASSO | MCP | SCAD | Adaptive LASSO | Oracle |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 1 | TP | 2.30 | 4.44 | 4.48 | 3.78 | 4.26 | 3.74 | 7.00 |
| | | FP | 0.30 | 4.06 | 4.20 | 2.30 | 3.39 | 2.04 | 0.00 |
| | | MRME (%) | 60.40 | 46.34 | 46.56 | 45.04 | 43.51 | 45.23 | 17.93 |
| | 3 | TP | 4.97 | 6.51 | 6.55 | 6.05 | 6.35 | 6.08 | 7.00 |
| | | FP | 0.35 | 4.75 | 4.80 | 2.04 | 3.34 | 2.08 | 0.00 |
| | | MRME (%) | 65.26 | 61.87 | 62.17 | 56.00 | 59.48 | 54.50 | 17.93 |
| | 5 | TP | 6.16 | 6.89 | 6.88 | 6.68 | 6.76 | 6.71 | 7.00 |
| | | FP | 0.41 | 5.30 | 5.32 | 1.93 | 3.21 | 2.06 | 0.00 |
| | | MRME (%) | 46.40 | 61.05 | 61.65 | 49.76 | 52.87 | 47.44 | 17.93 |
| 0.5 | 1 | TP | 2.29 | 4.38 | 4.41 | 2.66 | 4.15 | 3.67 | 7.00 |
| | | FP | 0.34 | 4.16 | 4.11 | 2.38 | 3.36 | 2.17 | 0.00 |
| | | MRME (%) | 57.76 | 46.01 | 46.56 | 44.03 | 43.00 | 44.66 | 17.89 |
| | 3 | TP | 4.95 | 6.54 | 6.54 | 6.01 | 6.31 | 6.00 | 7.00 |
| | | FP | 0.39 | 4.79 | 4.73 | 1.99 | 3.20 | 2.16 | 0.00 |
| | | MRME (%) | 65.52 | 60.72 | 62.90 | 56.98 | 60.66 | 56.64 | 17.89 |
| | 5 | TP | 6.15 | 6.88 | 6.87 | 6.70 | 6.76 | 6.72 | 7.00 |
| | | FP | 0.38 | 5.28 | 5.27 | 1.82 | 3.04 | 2.14 | 0.00 |
| | | MRME (%) | 46.56 | 60.39 | 61.09 | 49.18 | 52.93 | 48.03 | 17.89 |
| 0.8 | 1 | TP | 2.07 | 4.12 | 4.05 | 3.30 | 3.76 | 3.34 | 7.00 |
| | | FP | 0.58 | 4.47 | 4.20 | 2.25 | 3.24 | 2.21 | 0.00 |
| | | MRME (%) | 58.85 | 44.01 | 45.63 | 44.16 | 42.13 | 42.63 | 18.04 |
| | 3 | TP | 4.54 | 6.36 | 6.32 | 5.76 | 6.07 | 5.78 | 7.00 |
| | | FP | 0.52 | 5.16 | 4.88 | 2.03 | 2.90 | 2.37 | 0.00 |
| | | MRME (%) | 65.93 | 60.92 | 63.41 | 58.36 | 59.99 | 56.21 | 18.04 |
| | 5 | TP | 5.90 | 6.85 | 6.83 | 6.51 | 6.61 | 6.59 | 7.00 |
| | | FP | 0.49 | 5.59 | 5.37 | 1.73 | 2.83 | 2.27 | 0.00 |
| | | MRME (%) | 52.57 | 59.45 | 62.29 | 53.52 | 56.85 | 50.16 | 18.04 |
The oracle estimator is the least squares estimator of the true model, which contains seven nonzero covariates. TP (true positives) is the average number of nonzero covariates that are correctly selected. FP (false positives) is the average number of zero covariates that are incorrectly selected. The median relative model error (MRME) measures the overall performance of each model.
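The selection criteria in this footnote can be made concrete with a short sketch. The code below is illustrative only and is not taken from the study. The function names, the toy coefficient vectors, and the identity covariance are all hypothetical. The model error is written in the commonly used quadratic form (β̂ − β)ᵀ Σ (β̂ − β), which is an assumption here, as is the choice of reference fit for the relative error. Neither is specified in this record.

```python
import numpy as np

def selection_counts(beta_hat, beta_true, tol=1e-8):
    """Return (TP, FP) for one fitted coefficient vector.

    TP: truly nonzero covariates that were selected.
    FP: truly zero covariates that were selected.
    """
    selected = np.abs(np.asarray(beta_hat, dtype=float)) > tol
    truth = np.asarray(beta_true, dtype=float) != 0
    tp = int(np.sum(selected & truth))
    fp = int(np.sum(selected & ~truth))
    return tp, fp

def model_error(beta_hat, beta_true, cov_x):
    """Quadratic model error (assumed form): (b_hat - b)' Sigma (b_hat - b)."""
    diff = np.asarray(beta_hat, dtype=float) - np.asarray(beta_true, dtype=float)
    return float(diff @ np.asarray(cov_x, dtype=float) @ diff)

def median_relative_model_error(me_method, me_reference):
    """Median over replicates of method ME / reference ME, in percent.

    Which reference fit to use is left to the caller (an assumption of this sketch).
    """
    ratio = np.asarray(me_method, dtype=float) / np.asarray(me_reference, dtype=float)
    return float(100.0 * np.median(ratio))

# Toy example with made-up coefficients (two true signals, two noise covariates).
beta_true = np.array([1.0, 0.5, 0.0, 0.0])
beta_hat = np.array([0.9, 0.0, 0.2, 0.0])
tp, fp = selection_counts(beta_hat, beta_true)
me = model_error(beta_hat, beta_true, np.eye(4))
mrme = median_relative_model_error([0.3, 0.2, 0.1], [0.6, 0.4, 0.1])
print(tp, fp, round(me, 2), mrme)
```

In a simulation, `selection_counts` would be applied to each replicate and the counts averaged, while the per-replicate model errors would be passed to `median_relative_model_error`.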
Selection results by different variable selection methods
| Biomarkers | Variable selection methods | |||||
|---|---|---|---|---|---|---|
| Stepwise | Elastic Net | LASSO | MCP | SCAD | Adaptive LASSO | |
| HAZ at birth | - (0.54) | |||||
| WAZ at wk18 | + (1.00) | |||||
| HAZ at wk18 | + (0.98) | + (1.00) | + (1.00) | + (1.00) | + (1.00) | + (1.00) |
| WHZ at wk18 | + (0.70) | + (1.00) | + (1.00) | + (1.00) | + (0.74) | |
| Exclusive breast feeding until wk18 | - (0.80) | - (0.73) | - (0.67) | - (0.77) | ||
| RBP at wk18 | + (0.72) | + (0.72) | + (0.59) | + (0.69) | ||
| Vitamin D at wk18 | - (0.32) | |||||
| Mannitol at wk12 | - (0.73) | - (0.72) | - (0.60) | - (0.71) | - (0.96) | |
| Mannitol at wk24 | - (0.85) | |||||
| ALA at wk12 | + (0.62) | |||||
| MPO wk12 | - (0.59) | - (0.87) | - (0.87) | - (0.80) | - (0.87) | |
| Expenditure | + (0.87) | + (0.87) | + (0.60) | + (0.75) | ||
| Mother weight | + (0.89) | + (1.00) | + (1.00) | + (0.97) | + (0.99) | + (0.90) |
| Mother height | + (0.52) | + (0.52) | ||||
| Reg1B at wk12 | - (0.58) | - (0.53) | ||||
| Soluble CD14 at wk18 | - (0.66) | - (0.67) | - (0.62) | |||
Here, “+” and “-” denote positive and negative signs of the coefficient estimates. The proportion of 100 bootstrap samples in which each variable was selected is given in parentheses.
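The bootstrap selection proportions described in this footnote can be sketched as follows. This is a minimal illustration and not the authors' code. The selector used here, `marginal_selector`, is a hypothetical stand-in that thresholds marginal correlations. It is not one of the penalized regression methods compared in the table, and any function that returns a coefficient vector with zeros for unselected covariates could be passed in its place. The toy data, threshold, and seeds are also made up.

```python
import numpy as np

def marginal_selector(X, y, threshold=0.3):
    """Hypothetical stand-in selector (not a penalized regression).

    Returns each covariate's correlation with y, set to zero where the
    absolute correlation does not exceed `threshold`.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.where(np.abs(corr) > threshold, corr, 0.0)

def bootstrap_selection(X, y, select, n_boot=100, seed=0):
    """Proportion of bootstrap samples in which each covariate is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    hits = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample rows with replacement
        hits += select(X[idx], y[idx]) != 0   # count nonzero coefficients
    return hits / n_boot

# Toy data (made up): two covariates carry signal, two are pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=200)

proportions = bootstrap_selection(X, y, marginal_selector)
signs = np.sign(marginal_selector(X, y))      # sign from the full-data fit
for j in range(X.shape[1]):
    mark = {1.0: "+", -1.0: "-"}.get(float(signs[j]), "")
    print(f"covariate {j}: {mark} ({proportions[j]:.2f})")
```

Reporting the sign from the full-data fit alongside the bootstrap proportion mirrors the layout of the table, although how the signs were obtained in the study is not stated in this record.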