Joy Shi, Jill Korsiak, Daniel E Roth.
Abstract
PURPOSE: We aimed to demonstrate the use of jackknife residuals to take advantage of the longitudinal nature of available growth data in assessing potential biologically implausible values and outliers.
Keywords: Biologically implausible values; Jackknife residuals; Longitudinal growth data; Outliers
Year: 2018 PMID: 29398298 PMCID: PMC5840491 DOI: 10.1016/j.annepidem.2018.01.007
Source DB: PubMed Journal: Ann Epidemiol ISSN: 1047-2797 Impact factor: 3.797
Summary of anthropometric measurements available from the Maternal Vitamin D for Infant Growth (MDIG) trial∗
| Measure | Length | Weight | Head circumference |
|---|---|---|---|
| Number of measurements, by age | |||
| Birth (0–48 h) | 828 | 835 | 835 |
| Birth (>48 h) | 252 | 251 | 252 |
| 2 to 8 wk | 1095 | 1099 | 1100 |
| 3 mo | 1125 | 1132 | 1132 |
| 6 mo | 1131 | 1132 | 1133 |
| 9 mo | 1126 | 1126 | 1126 |
| 12 mo | 1072 | 1071 | 1071 |
| 15 mo | 880 | 880 | 880 |
| 18 mo | 610 | 609 | 610 |
| 21 mo | 443 | 442 | 443 |
| 24 mo | 306 | 306 | 306 |
| Total number of measurements | 8868 | 8883 | 8888 |
| Number of measurements per infant | |||
| Mean ± SD | 7.3 ± 2.0 | 7.3 ± 2.0 | 7.3 ± 2.0 |
| Median (range) | 7 (1, 11) | 7 (1, 11) | 7 (1, 11) |
| Number of infants with | |||
| ≥1 measurement | 1211 | 1211 | 1211 |
| ≥2 measurements, n (%) | 1196 (98.8) | 1196 (98.8) | 1196 (98.8) |
| ≥4 measurements, n (%) | 1165 (96.2) | 1166 (96.3) | 1166 (96.3) |
| ≥6 measurements, n (%) | 1005 (83.0) | 1004 (82.9) | 1006 (83.1) |
| ≥8 measurements, n (%) | 557 (46.0) | 557 (46.0) | 557 (46.0) |
∗Based on data available up to January 26, 2017.
Because the timing of measurements varied, these ages represent the scheduled visit times; the actual ages of infants at each visit range between the midpoints of adjacent categories (e.g., 6-month measurements were taken between 4.5 and 7.5 months of age).
Comparison of using jackknife residuals from linear versus nonlinear models of z-scores or raw growth data, respectively, as a function of age to identify biologically implausible decreases in length, weight, and head circumference measurements over 1000 simulations with an induced error rate of 5%
| Model | Number of pairs of adjacent values with a biologically implausible decrease | Percent of pairs in which the error was correctly identified (%), mean ± SD | Kappa statistic |
|---|---|---|---|
| Length | |||
| Model 1 | 62.5 ± 7.9 | 88.2 ± 4.0 | 0.760 ± 0.081 |
| Model 2 | 62.5 ± 7.9 | 89.6 ± 3.9 | 0.788 ± 0.080 |
| Weight | |||
| Model 1 | 26.0 ± 5.3 | 91.5 ± 5.2 | 0.795 ± 0.127 |
| Model 2 | 26.0 ± 5.3 | 91.2 ± 5.4 | 0.789 ± 0.129 |
| Head circumference | |||
| Model 1 | 123.3 ± 10.8 | 84.3 ± 3.1 | 0.685 ± 0.062 |
| Model 2 | 123.3 ± 10.8 | 73.2 ± 3.9 | 0.462 ± 0.079 |
Any decrease in raw length or head circumference measurements was considered biologically implausible, whereas only a decrease of greater than 15% in raw weight measurements was considered biologically implausible.
Agreement between the jackknife residual method and truth in the classification of induced plausible values.
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
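A minimal sketch of how jackknife residuals could be computed under the two model forms in the footnotes above. It assumes that "jackknife residual" means the externally studentized residual from a straight-line fit to one infant's measurements; the function name, the per-infant fitting, and the example values are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def jackknife_residuals(x, y):
    """Externally studentized (jackknife) residuals from a straight-line fit of y on x.

    Hypothetical helper: x is age (Model 1) or sqrt(age) (Model 2) for ONE infant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = x.size, 2  # p = intercept + slope
    if n < p + 2:
        raise ValueError("need at least 4 measurements for a jackknife residual")
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverage of each point: diagonal of the hat matrix X (X'X)^-1 X'
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    sse = resid @ resid
    # Residual variance of the fit with observation i left out
    s2_loo = (sse - resid**2 / (1.0 - h)) / (n - p - 1)
    return resid / np.sqrt(s2_loo * (1.0 - h))

# Made-up example: one infant, ages in months, one suspicious z-score at index 4
ages = np.array([0.0, 3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
laz = np.array([-1.0, -1.1, -0.9, -1.0, 1.5, -1.1, -1.0])
res = jackknife_residuals(ages, laz)             # Model 1 form: z-score vs. age
flagged = np.flatnonzero(np.abs(res) > 5)        # the ±5 cutoff used in the tables
print(flagged)
```

For the Model 2 form, the same helper would be called as `jackknife_residuals(np.sqrt(ages), raw_values)` with raw measurements in place of z-scores.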
Comparison of alternative methods to identify induced errors in length, weight, and head circumference measurements over 1000 simulations with an induced error rate of 5%
| Measure | Jackknife residuals (model 1) with >5 or < −5 cutoff | Jackknife residuals (model 2) with >5 or < −5 cutoff | Conditional growth percentile with >4 or < −4 cutoff | Recommended cutoffs from the WHO child growth standards | >4 or < −4 SD from population average |
|---|---|---|---|---|---|
| Length | |||||
| Sensitivity (%), mean ± SD | 11.9 ± 1.5 | 10.2 ± 1.4 | 0.2 ± 0.2 | 0.1 ± 0.1 | 0.4 ± 0.3 |
| Specificity (%), mean ± SD | 97.4 ± 0.1 | 97.4 ± 0.1 | 86.2 ± 0.1 | 100.0 ± 0.0 | 99.9 ± 0.0 |
| Weight | |||||
| Sensitivity (%), mean ± SD | 14.1 ± 1.6 | 9.7 ± 1.4 | 0.1 ± 0.2 | 0.9 ± 0.5 | 0.6 ± 0.3 |
| Specificity (%), mean ± SD | 97.4 ± 0.1 | 98.0 ± 0.1 | 86.3 ± 0.1 | 99.9 ± 0.0 | 99.9 ± 0.0 |
| Head circumference | |||||
| Sensitivity (%), mean ± SD | 10.7 ± 1.4 | 4.1 ± 0.9 | 0.2 ± 0.2 | 0.4 ± 0.3 | 0.5 ± 0.3 |
| Specificity (%), mean ± SD | 97.6 ± 0.1 | 98.1 ± 0.1 | 86.3 ± 0.1 | 99.8 ± 0.0 | 99.8 ± 0.0 |
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
Conditional growth percentile: based on a random effects restricted cubic spline (with 5 knots) model.
WHO recommended cutoffs: for LAZ, <−6 SD or >6 SD; for WAZ, <−6 SD or >5 SD; and for HCAZ, <−5 SD or >5 SD [7].
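The two fixed-threshold comparators in this table can be written as simple range checks. The sketch below encodes the WHO cutoffs exactly as listed in the footnote above and the ±4 SD population rule from the table header; the dictionary and function names are illustrative, not from the source.

```python
# WHO flagging limits as given in the footnote (z-score units)
WHO_CUTOFFS = {
    "LAZ": (-6.0, 6.0),    # length-for-age
    "WAZ": (-6.0, 5.0),    # weight-for-age
    "HCAZ": (-5.0, 5.0),   # head circumference-for-age
}

def who_flag(z, indicator):
    """True if z falls outside the WHO limits for the given indicator."""
    low, high = WHO_CUTOFFS[indicator]
    return z < low or z > high

def population_flag(z, limit=4.0):
    """True if z is more than `limit` SD from the population average."""
    return abs(z) > limit

print(who_flag(5.5, "WAZ"), who_flag(5.5, "LAZ"), population_flag(5.5))
```

Both rules look at a single value in isolation, which is the key contrast with methods that use an infant's other measurements.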
Fig. 1. Sensitivity and specificity of the jackknife residual method for detection of outliers in child (A) raw length, (B) length-for-age z-score, (C) raw weight, (D) weight-for-age z-score, (E) raw head circumference, and (F) head circumference-for-age z-score data using cutoffs from ±3 to ±8.
Comparison of various cutoffs to identify induced errors in length, weight, and head circumference using the jackknife residual method over 1000 simulations with an induced error rate of 5%
| Measure | Jackknife residuals, model 1: >3 or < −3 residual cutoff | Jackknife residuals, model 1: >5 or < −5 residual cutoff | Jackknife residuals, model 1: >7 or < −7 residual cutoff | Jackknife residuals, model 2: >3 or < −3 residual cutoff | Jackknife residuals, model 2: >5 or < −5 residual cutoff | Jackknife residuals, model 2: >7 or < −7 residual cutoff |
|---|---|---|---|---|---|---|
| Length | ||||||
| Sensitivity (%), mean ± SD | 27.8 ± 2.0 | 11.9 ± 1.5 | 5.6 ± 1.1 | 25.8 ± 2.1 | 10.2 ± 1.4 | 4.5 ± 1.0 |
| Specificity (%), mean ± SD | 94.4 ± 0.1 | 97.4 ± 0.1 | 98.2 ± 0.1 | 94.1 ± 0.1 | 97.4 ± 0.1 | 98.2 ± 0.1 |
| Weight | ||||||
| Sensitivity (%), mean ± SD | 29.9 ± 2.1 | 14.1 ± 1.6 | 7.4 ± 1.2 | 24.2 ± 1.9 | 9.7 ± 1.4 | 4.4 ± 1.0 |
| Specificity (%), mean ± SD | 94.1 ± 0.1 | 97.4 ± 0.1 | 98.3 ± 0.1 | 95.7 ± 0.1 | 98.0 ± 0.1 | 98.6 ± 0.0 |
| Head circumference | ||||||
| Sensitivity (%), mean ± SD | 25.6 ± 1.9 | 10.7 ± 1.4 | 5.1 ± 1.0 | 14.6 ± 1.6 | 4.1 ± 0.9 | 1.6 ± 0.6 |
| Specificity (%), mean ± SD | 94.2 ± 0.1 | 97.6 ± 0.1 | 98.4 ± 0.1 | 95.5 ± 0.1 | 98.1 ± 0.1 | 98.6 ± 0.0 |
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
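The trade-off across the ±3, ±5, and ±7 cutoffs in the table above comes from how sensitivity and specificity are computed against the known induced errors. A small sketch, using made-up residuals and error labels (none of these numbers come from the study):

```python
import numpy as np

def sens_spec(residuals, is_error, cutoff):
    """Sensitivity and specificity of flagging |residual| > cutoff."""
    residuals = np.asarray(residuals, dtype=float)
    is_error = np.asarray(is_error, dtype=bool)
    flagged = np.abs(residuals) > cutoff
    sensitivity = np.sum(flagged & is_error) / np.sum(is_error)
    specificity = np.sum(~flagged & ~is_error) / np.sum(~is_error)
    return float(sensitivity), float(specificity)

# Hypothetical residuals; True marks a value where an error was induced
residuals = [0.4, -6.2, 3.5, -1.0, 7.8, 2.9, -3.2, 0.1]
is_error = [False, True, True, False, True, False, False, False]

for cutoff in (3, 5, 7):
    print(cutoff, sens_spec(residuals, is_error, cutoff))
```

Raising the cutoff can only remove flags, so sensitivity cannot increase and specificity cannot decrease, which matches the direction of the changes across columns.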
Comparison of various methods to identify induced errors in length, weight, and head circumference measurements over 1000 simulations with an induced error rate of 5%, stratified by the number of measurements available
| Number of observations per infant | Jackknife residuals (model 1) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 1) with >5 or < −5 cutoff: specificity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: specificity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: sensitivity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: specificity (%), mean ± SD |
|---|---|---|---|---|---|---|
| Length | ||||||
| 2–3 measurements per infant | — | — | — | — | 1.6 ± 8.7 | 54.2 ± 1.4 |
| 4–5 measurements per infant | 19.0 ± 6.2 | 92.7 ± 0.4 | 17.1 ± 6.2 | 93.1 ± 0.4 | 0.1 ± 0.6 | 79.1 ± 0.3 |
| 6–7 measurements per infant | 13.3 ± 2.7 | 98.5 ± 0.1 | 10.7 ± 2.5 | 98.1 ± 0.1 | 0.1 ± 0.3 | 84.6 ± 0.2 |
| ≥8 measurements per infant | 10.2 ± 1.8 | 99.1 ± 0.1 | 9.0 ± 1.7 | 99.2 ± 0.1 | 0.2 ± 0.3 | 88.9 ± 0.1 |
| Overall | 11.9 ± 1.5 | 97.4 ± 0.1 | 10.2 ± 1.4 | 97.4 ± 0.1 | 0.2 ± 0.2 | 86.2 ± 0.1 |
| Weight | ||||||
| 2–3 measurements per infant | — | — | — | — | 0.1 ± 1.7 | 56.4 ± 1.4 |
| 4–5 measurements per infant | 21.0 ± 6.3 | 94.5 ± 0.4 | 15.7 ± 6.1 | 95.1 ± 0.4 | 0.1 ± 0.5 | 79.3 ± 0.3 |
| 6–7 measurements per infant | 15.6 ± 3.0 | 97.9 ± 0.1 | 10.3 ± 2.5 | 98.7 ± 0.1 | 0.1 ± 0.3 | 84.7 ± 0.2 |
| ≥8 measurements per infant | 12.4 ± 2.1 | 99.2 ± 0.1 | 8.6 ± 1.8 | 99.8 ± 0.0 | 0.1 ± 0.2 | 89.0 ± 0.1 |
| Overall | 14.1 ± 1.6 | 97.4 ± 0.1 | 9.7 ± 1.4 | 98.0 ± 0.1 | 0.1 ± 0.2 | 86.3 ± 0.1 |
| Head circumference | ||||||
| 2–3 measurements per infant | — | — | — | — | 0.2 ± 2.8 | 57.0 ± 1.4 |
| 4–5 measurements per infant | 20.0 ± 6.5 | 95.1 ± 0.4 | 14.2 ± 5.3 | 94.9 ± 0.4 | 0.2 ± 0.7 | 79.3 ± 0.3 |
| 6–7 measurements per infant | 12.4 ± 2.7 | 98.4 ± 0.1 | 5.2 ± 1.8 | 99.2 ± 0.1 | 0.3 ± 0.4 | 84.6 ± 0.2 |
| ≥8 measurements per infant | 8.4 ± 1.7 | 99.1 ± 0.1 | 2.0 ± 0.9 | 99.6 ± 0.1 | 0.3 ± 0.3 | 88.9 ± 0.1 |
| Overall | 10.7 ± 1.4 | 97.6 ± 0.1 | 4.1 ± 0.9 | 98.1 ± 0.1 | 0.3 ± 0.2 | 86.3 ± 0.1 |
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
Conditional growth percentile: based on a random effects restricted cubic spline (with 5 knots) model.
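The dashes for infants with 2–3 measurements are consistent with a degrees-of-freedom limit on the jackknife residual itself. This is one plausible reading rather than something stated here, and it assumes a separate fit per infant with $p = 2$ coefficients (intercept and slope). The symbols $e_i$ (ordinary residual), $h_{ii}$ (leverage), $s_{(i)}$ (residual SD with observation $i$ left out), and $n$ (measurements for the infant) are introduced only for this illustration:

$$
t_i = \frac{e_i}{s_{(i)}\sqrt{1 - h_{ii}}}, \qquad
s_{(i)}^2 = \frac{\mathrm{SSE} - e_i^2/(1 - h_{ii})}{n - p - 1}
$$

The denominator $n - p - 1$ is positive only when $n \geq 4$, so under this reading no jackknife residual can be computed from 2 or 3 measurements.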
Comparison of various methods to identify induced errors in length, weight, and head circumference measurements among infants with at least 2 measurements over 1000 simulations with an induced error rate of 5%, stratified by visit at which error was induced
| Timing of visit | Jackknife residuals (model 1) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 1) with >5 or < −5 cutoff: specificity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: specificity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: sensitivity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: specificity (%), mean ± SD |
|---|---|---|---|---|---|---|
| Length | ||||||
| First visit | 17.6 ± 4.8 | 91.7 ± 0.4 | 11.2 ± 4.1 | 90.9 ± 0.4 | — | — |
| Middle visit | 11.5 ± 1.8 | 98.9 ± 0.1 | 9.8 ± 1.7 | 99.0 ± 0.1 | 0.2 ± 0.3 | 99.8 ± 0.0 |
| Last visit | 8.1 ± 3.4 | 96.4 ± 0.2 | 11.7 ± 4.2 | 96.5 ± 0.2 | 0.2 ± 0.5 | 99.8 ± 0.0 |
| Overall | 11.9 ± 1.5 | 97.4 ± 0.1 | 10.2 ± 1.4 | 97.4 ± 0.1 | 0.2 ± 0.2 | 86.2 ± 0.1 |
| Weight | ||||||
| First visit | 21.9 ± 5.5 | 90.0 ± 0.4 | 5.9 ± 3.0 | 93.5 ± 0.3 | — | — |
| Middle visit | 13.3 ± 1.9 | 99.2 ± 0.1 | 9.5 ± 1.6 | 99.4 ± 0.0 | 0.2 ± 0.2 | 99.9 ± 0.0 |
| Last visit | 11.1 ± 4.1 | 96.7 ± 0.2 | 14.8 ± 4.6 | 96.6 ± 0.2 | 0.0 ± 0.2 | 100.0 ± 0.0 |
| Overall | 14.1 ± 1.6 | 97.4 ± 0.1 | 9.7 ± 1.4 | 98.0 ± 0.1 | 0.1 ± 0.2 | 86.3 ± 0.1 |
| Head circumference | ||||||
| First visit | 17.5 ± 5.1 | 91.6 ± 0.4 | 6.2 ± 3.2 | 94.6 ± 0.3 | — | — |
| Middle visit | 10.2 ± 1.6 | 99.0 ± 0.1 | 3.0 ± 0.9 | 99.5 ± 0.0 | 0.4 ± 0.3 | 99.8 ± 0.0 |
| Last visit | 6.3 ± 3.1 | 97.2 ± 0.2 | 8.1 ± 3.6 | 95.5 ± 0.3 | 0.1 ± 0.4 | 99.9 ± 0.0 |
| Overall | 10.7 ± 1.4 | 97.6 ± 0.1 | 4.1 ± 0.9 | 98.1 ± 0.1 | 0.3 ± 0.2 | 86.3 ± 0.1 |
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
Conditional growth percentile: based on a random effects restricted cubic spline (with 5 knots) model.
Comparison of various methods to identify induced errors in length, weight, and head circumference measurements over 1000 simulations with induced error rates of 5%, 10%, and 15%
| Error rate | Jackknife residuals (model 1) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 1) with >5 or < −5 cutoff: specificity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: specificity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: sensitivity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: specificity (%), mean ± SD | Recommended cutoffs from the WHO child growth standards: sensitivity (%), mean ± SD | Recommended cutoffs from the WHO child growth standards: specificity (%), mean ± SD | >4 or < −4 SD from population average: sensitivity (%), mean ± SD | >4 or < −4 SD from population average: specificity (%), mean ± SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Length | ||||||||||
| 5% error rate | 11.9 ± 1.5 | 97.4 ± 0.1 | 10.2 ± 1.4 | 97.4 ± 0.1 | 0.2 ± 0.2 | 86.2 ± 0.1 | 0.1 ± 0.1 | 100.0 ± 0.0 | 0.4 ± 0.3 | 99.9 ± 0.0 |
| 10% error rate | 2.3 ± 0.7 | 96.7 ± 0.1 | 2.2 ± 0.7 | 96.8 ± 0.1 | 0.1 ± 0.2 | 86.2 ± 0.1 | 0.1 ± 0.1 | 100.0 ± 0.0 | 0.2 ± 0.2 | 99.9 ± 0.0 |
| 15% error rate | 2.4 ± 0.7 | 96.7 ± 0.1 | 2.3 ± 0.7 | 96.8 ± 0.1 | 0.1 ± 0.1 | 86.2 ± 0.1 | 0.1 ± 0.1 | 99.9 ± 0.0 | 0.1 ± 0.2 | 99.9 ± 0.0 |
| Weight | ||||||||||
| 5% error rate | 14.1 ± 1.6 | 97.4 ± 0.1 | 9.7 ± 1.4 | 98.0 ± 0.1 | 0.1 ± 0.2 | 86.3 ± 0.1 | 0.2 ± 0.2 | 99.9 ± 0.0 | 0.6 ± 0.3 | 99.9 ± 0.0 |
| 10% error rate | 2.5 ± 0.8 | 96.6 ± 0.1 | 1.6 ± 0.6 | 97.4 ± 0.1 | 0.1 ± 0.1 | 86.3 ± 0.1 | 0.1 ± 0.1 | 99.9 ± 0.0 | 0.1 ± 0.2 | 99.8 ± 0.0 |
| 15% error rate | 2.6 ± 0.8 | 96.5 ± 0.1 | 1.8 ± 0.7 | 97.2 ± 0.1 | 0.1 ± 0.1 | 86.3 ± 0.1 | 0.1 ± 0.1 | 99.9 ± 0.0 | 0.1 ± 0.2 | 99.9 ± 0.0 |
| Head circumference | ||||||||||
| 5% error rate | 10.7 ± 1.4 | 97.6 ± 0.1 | 4.1 ± 0.9 | 98.1 ± 0.1 | 0.3 ± 0.2 | 86.3 ± 0.1 | 0.4 ± 0.3 | 99.8 ± 0.0 | 0.5 ± 0.3 | 99.8 ± 0.0 |
| 10% error rate | 2.1 ± 0.7 | 97.0 ± 0.1 | 1.2 ± 0.5 | 97.9 ± 0.1 | 0.1 ± 0.2 | 86.3 ± 0.1 | 0.2 ± 0.2 | 99.8 ± 0.0 | 0.2 ± 0.2 | 99.8 ± 0.0 |
| 15% error rate | 2.2 ± 0.7 | 96.9 ± 0.1 | 1.3 ± 0.5 | 97.8 ± 0.1 | 0.1 ± 0.1 | 86.3 ± 0.1 | 0.2 ± 0.2 | 99.8 ± 0.0 | 0.2 ± 0.2 | 99.8 ± 0.0 |
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
Conditional growth percentile: based on a random effects restricted cubic spline (with 5 knots) model.
Fig. 2. Example of a participant for whom an error was identified within a pair of values in which there was a biologically implausible decrease in length between two adjacent encounters (shown in hollow circles). The error was similarly identified when the jackknife residual method was applied to (A) length-for-age z-scores (LAZ) or (B) raw length measurements. Each measurement is labeled with its corresponding jackknife residual value.
Fig. 3. Example of a participant for whom an outlier (shown in hollow circles) was identified when the jackknife residual method was applied to (A) length-for-age z-scores (LAZ) or (B) raw length measurements. Each measurement is labeled with its corresponding jackknife residual value.
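The adjacent-pair rule behind Fig. 2 (any decrease in raw length or head circumference, or a decrease of more than 15% in raw weight, between consecutive encounters) can be sketched as follows. The function name, the measure labels, and the example values are hypothetical, and the input is assumed to be one infant's raw measurements already sorted by age.

```python
MAX_WEIGHT_DROP = 0.15  # weight may fall by up to 15% before the pair is flagged

def implausible_decrease_pairs(values, measure):
    """Return index pairs (i, i + 1) showing a biologically implausible decrease.

    measure: "weight" uses the 15% rule; anything else (length, head
    circumference) treats any decrease as implausible.
    """
    pairs = []
    for i in range(len(values) - 1):
        previous, current = values[i], values[i + 1]
        if measure == "weight":
            implausible = current < previous * (1 - MAX_WEIGHT_DROP)
        else:
            implausible = current < previous
        if implausible:
            pairs.append((i, i + 1))
    return pairs

# Made-up measurements for one infant (cm and kg)
print(implausible_decrease_pairs([50.0, 55.2, 54.8, 60.1], "length"))
print(implausible_decrease_pairs([3.2, 5.0, 4.5, 4.0, 3.3], "weight"))
```

The rule only says that one value in a flagged pair is suspect; deciding which one is the error is where the per-measurement jackknife residuals shown in the figure come in.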
Comparison of various methods to identify induced errors in length, weight, and head circumference measurements over 1000 simulations in which the magnitudes of induced errors have standard deviations of 1, 2, and 3
| Magnitude of induced errors (SD) | Jackknife residuals (model 1) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 1) with >5 or < −5 cutoff: specificity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: sensitivity (%), mean ± SD | Jackknife residuals (model 2) with >5 or < −5 cutoff: specificity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: sensitivity (%), mean ± SD | Conditional growth percentile with >4 SD or < −4 SD cutoff: specificity (%), mean ± SD | Recommended cutoffs from the WHO child growth standards: sensitivity (%), mean ± SD | Recommended cutoffs from the WHO child growth standards: specificity (%), mean ± SD | >4 or < −4 SD from population average: sensitivity (%), mean ± SD | >4 or < −4 SD from population average: specificity (%), mean ± SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Length | ||||||||||
| SD 1 | 11.9 ± 1.5 | 97.4 ± 0.1 | 10.2 ± 1.4 | 97.4 ± 0.1 | 0.2 ± 0.2 | 86.2 ± 0.1 | 0.1 ± 0.1 | 100.0 ± 0.0 | 0.4 ± 0.3 | 99.9 ± 0.0 |
| SD 2 | 29.3 ± 2.2 | 97.5 ± 0.1 | 26.9 ± 2.1 | 97.5 ± 0.1 | 2.8 ± 0.7 | 86.3 ± 0.1 | 1.4 ± 0.6 | 100.0 ± 0.0 | 3.5 ± 0.8 | 99.9 ± 0.0 |
| SD 3 | 41.0 ± 2.3 | 97.5 ± 0.1 | 38.9 ± 2.3 | 97.5 ± 0.1 | 9.1 ± 1.2 | 86.3 ± 0.1 | 6.6 ± 1.2 | 100.0 ± 0.0 | 10.0 ± 1.2 | 99.9 ± 0.0 |
| Weight | ||||||||||
| SD 1 | 14.1 ± 1.6 | 97.4 ± 0.1 | 9.7 ± 1.4 | 98.0 ± 0.1 | 0.1 ± 0.2 | 86.3 ± 0.1 | 0.2 ± 0.2 | 99.9 ± 0.0 | 0.6 ± 0.3 | 99.9 ± 0.0 |
| SD 2 | 30.7 ± 2.2 | 97.5 ± 0.1 | 25.2 ± 2.1 | 98.1 ± 0.1 | 3.1 ± 0.8 | 86.3 ± 0.1 | 3.8 ± 0.9 | 99.9 ± 0.0 | 5.0 ± 0.9 | 99.9 ± 0.0 |
| SD 3 | 41.9 ± 2.3 | 97.6 ± 0.1 | 36.5 ± 2.2 | 98.1 ± 0.1 | 9.7 ± 1.1 | 86.3 ± 0.1 | 11.2 ± 1.5 | 99.9 ± 0.0 | 12.5 ± 1.2 | 99.9 ± 0.0 |
| Head circumference | ||||||||||
| SD 1 | 10.7 ± 1.4 | 97.6 ± 0.1 | 4.1 ± 0.9 | 98.1 ± 0.1 | 0.3 ± 0.2 | 86.3 ± 0.1 | 0.4 ± 0.3 | 99.8 ± 0.0 | 0.5 ± 0.3 | 99.8 ± 0.0 |
| SD 2 | 26.9 ± 2.1 | 97.7 ± 0.1 | 14.3 ± 1.6 | 98.1 ± 0.1 | 4.2 ± 0.9 | 86.3 ± 0.1 | 4.1 ± 1.0 | 99.8 ± 0.0 | 4.3 ± 0.9 | 99.8 ± 0.0 |
| SD 3 | 38.7 ± 2.3 | 97.7 ± 0.1 | 25.4 ± 2.1 | 98.2 ± 0.1 | 10.7 ± 1.2 | 86.3 ± 0.1 | 12.9 ± 1.6 | 99.8 ± 0.0 | 11.3 ± 1.2 | 99.8 ± 0.0 |
Model 1: linear equation of sex- and age-standardized z-score as a function of age ($Z = \beta_0 + \beta_1 t + \varepsilon$).
Model 2: raw anthropometric measurement as a function of square root of age ($Y = \beta_0 + \beta_1 \sqrt{t} + \varepsilon$).
Conditional growth percentile: based on a random effects restricted cubic spline (with 5 knots) model.