Hedyeh Ahmadi, Douglas A Granger, Katrina R Hamilton, Clancy Blair, Jenna L Riis.
Abstract
Left censoring in salivary bioscience data occurs when salivary analyte determinations fall below the lower limit of an assay's measurement range. Conventional statistical approaches for addressing censored values (i.e., recoding as missing, substituting or extrapolating values) may introduce systematic bias. While specialized censored data statistical approaches (i.e., Maximum Likelihood Estimation, Regression on Order Statistics, Kaplan-Meier, and general Tobit regression) are available, these methods are rarely implemented in biobehavioral studies that examine salivary biomeasures, and their application to salivary data analysis may be hindered by their sensitivity to skewed data distributions, outliers, and sample size. This study compares descriptive statistics, correlation coefficients, and regression parameter estimates generated via conventional and specialized censored data approaches using salivary C-reactive protein data. We assess differences in statistical estimates across approaches and across two levels of censoring (9% and 15%) and examine the sensitivity of our results to sample size. Overall, findings were similar across conventional and censored data approaches, but the implementation of specialized censored data approaches was more efficient (i.e., required little manipulation of the raw analyte data) and appropriate. Based on our review of the findings, we outline preliminary recommendations to enable investigators to more efficiently and effectively reduce statistical bias when working with left-censored salivary biomeasure data.
Keywords: C-reactive protein; Censored data; Saliva; Statistical analysis; Tobit regression
Year: 2021 PMID: 34030086 PMCID: PMC8260151 DOI: 10.1016/j.psyneuen.2021.105274
Source DB: PubMed Journal: Psychoneuroendocrinology ISSN: 0306-4530 Impact factor: 4.905
Fig. 1. Left- and right-censored data distributions compared to complete data. For salivary analyte data, the dashed lines represent the assay’s lower (left-censored) and upper (right-censored) limits of measurement.
Fig. 2. The distribution of salivary c-reactive protein (CRP) concentrations in early adolescence using the raw (left) and log-transformed (right) data. Panels A and B use the conventional deletion approach, and panels C and D use censored data visualization approaches. Note: N = 569 for panels A and B; N = 622 for panels C and D. These data are censored at the true lower limit of assay sensitivity (LLOS = 9.9 pg/mL; see Supplemental Figure A.1 for similar plots for the inflated LLOS of 19.7 pg/mL). Panel C is a typical Q-Q plot with a normality assumption for right-skewed data (i.e., concave up) showing the non-normal distribution of the salivary CRP data. Panel D assumes log-normality and shows strong improvement in the distribution of the data under this assumption. The horizontal gray line is drawn at the log of the LLOS (ln(9.9)). The point below this line in panel D represents the censored values.
Descriptive statistics for salivary c-reactive protein (CRP) concentrations in early adolescence using conventional and censored data approaches under two levels of censoring.
| Statistic | Deletion (conventional) | Substitution with ½ the LLOS (conventional) | Substitution with 0.01 pg/mL (conventional) | K-M (censored data) | ROS (censored data) | MLE (censored data) |
|---|---|---|---|---|---|---|
| LLOS = 9.9 pg/mL | | | | | | |
| Mean | 884.18 | 809.26 | 808.84 | 809.69 | 809.24 | 865.92 |
| Median | 149.60 | 119.01 | 119.01 | 118.54 | 119.01 | 118.36 |
| Standard Deviation | 3637.72 | 3487.69 | 3487.79 | 3487.85 | 3487.69 | 6275.62 |
| LLOS = 19.7 pg/mL | | | | | | |
| Mean | 955.32 | 809.39 | 807.87 | 811.02 | 809.16 | 936.74 |
| Median | 179.88 | 119.01 | 119.01 | 118.54 | 119.01 | 112.73 |
| Standard Deviation | 3774.89 | 3487.66 | 3488.01 | 3487.80 | 3487.71 | 7727.19 |
Note: K-M = Kaplan-Meier, ROS = Regression on Order Statistics, MLE = Maximum Likelihood Estimation, LLOS = lower limit of sensitivity. Data and estimates are presented in their raw scale (pg/mL; not log-transformed). Descriptive statistics for the deletion approach under the true level of censoring (LLOS = 9.9 pg/mL) represent the observed CRP data.
Sample sizes: deletion approach with LLOS = 9.9 pg/mL: N = 569; deletion approach with LLOS = 19.7 pg/mL: N = 526; all other methods: N = 622.
ROS and MLE estimates assume a log-normal distribution of CRP data and are subject to transformation bias. These estimation approaches require that censored data points have a value; in these calculations, censored values were recoded to ½ the LLOS.
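To make the log-normal MLE idea concrete, here is a minimal sketch of a censored-data likelihood fit. It is not the authors' code: the data are simulated, and the function name `lognormal_censored_mle`, the seed, and the parameter values are assumptions chosen only for illustration. Observed values contribute a normal density on the log scale, and each censored value contributes the probability of falling below the log of the limit.

```python
import numpy as np
from scipy import optimize, stats


def lognormal_censored_mle(values, censored, llos):
    """Fit a log-normal distribution to left-censored data by maximum likelihood.

    values   : concentrations; entries at censored positions are ignored
    censored : boolean mask, True where the value fell below the limit
    llos     : lower limit of sensitivity, in the same units as values
    Returns (mu, sigma) on the natural-log scale.
    """
    values = np.asarray(values, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    log_obs = np.log(values[~censored])
    n_cens = int(censored.sum())
    log_limit = np.log(llos)

    def neg_log_lik(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)  # optimizing log(sigma) keeps sigma positive
        # Observed points: normal density of log(value). The Jacobian term
        # of the log transform is constant in (mu, sigma) and is omitted.
        ll_obs = stats.norm.logpdf(log_obs, loc=mu, scale=sigma).sum()
        # Censored points: probability mass below the limit.
        ll_cens = n_cens * stats.norm.logcdf(log_limit, loc=mu, scale=sigma)
        return -(ll_obs + ll_cens)

    start = np.array([log_obs.mean(), np.log(log_obs.std(ddof=1))])
    result = optimize.minimize(neg_log_lik, start, method="Nelder-Mead")
    mu, log_sigma = result.x
    return mu, np.exp(log_sigma)


# Simulated (not real) concentrations; all parameter values are assumptions.
rng = np.random.default_rng(0)
crp_true = rng.lognormal(mean=5.0, sigma=1.5, size=2000)
llos = 20.0
censored = crp_true < llos
crp_reported = np.where(censored, np.nan, crp_true)  # censored values unknown

mu_hat, sigma_hat = lognormal_censored_mle(crp_reported, censored, llos)

# Back-transform to the raw scale under the log-normal assumption.
mle_median = np.exp(mu_hat)
mle_mean = np.exp(mu_hat + sigma_hat**2 / 2)

print(f"share censored:  {censored.mean():.3f}")
print(f"MLE mu, sigma:   {mu_hat:.3f}, {sigma_hat:.3f}")
print(f"MLE median/mean: {mle_median:.1f}, {mle_mean:.1f}")
print(f"deletion median: {np.median(crp_true[~censored]):.1f}")
```

The back-transformed mean depends on the estimated sigma through exp(mu + sigma²/2), which is one reason raw-scale MLE summaries can be sensitive to the distributional assumption and to skew.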
Fig. 3. Censored data scatter plots for raw (left) and log-transformed (right) salivary c-reactive protein (CRP) concentrations from early adolescents (N = 622) showed minimal systematic patterns of censoring across body mass index (BMI) percentile score. Note: The average BMI percentile score for early adolescents with censored CRP determinations was 58.39 (median = 67.40; range = 0.26–98.08), and the average BMI percentile score for participants with observed CRP determinations was 74.85 (median = 85.57; range = 0.00–99.78). Observed concentrations of salivary CRP are plotted as individual points. Censored salivary CRP data are represented by dashed lines spanning from zero to the LLOS. These data are censored at the true lower limit of assay sensitivity (LLOS = 9.9 pg/mL; see Supplemental Fig. A.2 for similar plots for the artificially-inflated LLOS of 19.7 pg/mL).
Unadjusted associations between salivary c-reactive protein (CRP) and body mass index percentile score in early adolescence using conventional and censored data approaches under two levels of censoring.
| Approach | Pearson’s r (conventional) | Kendall’s τ (rank-based, censored data) | Spearman’s ρ (rank-based, censored data) |
|---|---|---|---|
| LLOS = 9.9 pg/mL | | | |
| Deletion | 0.05 | 0.26 | 0.37 |
| Substitution with ½ LLOS | 0.06 | 0.28 | 0.40 |
| Substitution with 0.01 pg/mL | 0.06 | 0.28 | 0.40 |
| LLOS = 19.7 pg/mL | | | |
| Deletion | 0.05 | 0.26 | 0.37 |
| Substitution with ½ LLOS | 0.06 | 0.28 | 0.39 |
| Substitution with 0.01 pg/mL | 0.06 | 0.28 | 0.39 |
Note: Deletion approach with LLOS = 9.9 pg/mL: N = 569; deletion approach with LLOS = 19.7 pg/mL: N = 526; all other methods: N = 622.
p < 0.001
The Pearson’s correlation measures linear associations and makes bivariate normality assumptions. These assumptions may not be appropriate for these relations.
The Spearman’s and Kendall’s rank-based correlations can be considered specialized methods for censored data. These estimation approaches require censored data points have a value, and these calculations recode censored values to either half the LLOS or 0.01 pg/mL. Once substituted with a value, all censored data points are ranked at the same level, making correlation coefficient estimates the same under both substitution approaches.
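The tie argument in this note can be checked numerically. The sketch below uses simulated data (the variable names, seed, and coefficients are assumptions, not study values) and shows that any substitution value below the smallest observed concentration leaves the ranks, and therefore the Spearman and Kendall coefficients, unchanged. Pearson's r on log-transformed values is included as a contrast because it does depend on the substituted value.

```python
import numpy as np
from scipy import stats

# Simulated (not real) data; every numeric choice here is an assumption.
rng = np.random.default_rng(1)
n = 500
bmi = rng.uniform(0, 100, size=n)
crp = np.exp(4.0 + 0.02 * bmi + rng.normal(0.0, 1.5, size=n))

llos = 9.9
censored = crp < llos


def substitute(values, mask, fill):
    """Return a copy of values with masked entries replaced by fill."""
    out = values.copy()
    out[mask] = fill
    return out


crp_half = substitute(crp, censored, llos / 2)  # substitution with 1/2 LLOS
crp_tiny = substitute(crp, censored, 0.01)      # substitution with 0.01

# Rank-based coefficients: all censored points tie at the lowest rank,
# whichever value below the LLOS is substituted.
rho_half = stats.spearmanr(bmi, crp_half)[0]
rho_tiny = stats.spearmanr(bmi, crp_tiny)[0]
tau_half = stats.kendalltau(bmi, crp_half)[0]
tau_tiny = stats.kendalltau(bmi, crp_tiny)[0]

# Pearson's r on the log scale changes with the substituted value.
r_half = stats.pearsonr(bmi, np.log(crp_half))[0]
r_tiny = stats.pearsonr(bmi, np.log(crp_tiny))[0]

print(f"censored points: {int(censored.sum())} of {n}")
print(f"Spearman: {rho_half:.4f} vs {rho_tiny:.4f}")
print(f"Kendall:  {tau_half:.4f} vs {tau_tiny:.4f}")
print(f"Pearson (log scale): {r_half:.4f} vs {r_tiny:.4f}")
```

The invariance holds only while the substituted value stays below every observed value; a substitute above the smallest observed concentration would reorder the ranks.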
Adjusted associations between salivary c-reactive protein (CRP) and body mass index (BMI) percentile score in early adolescence using conventional and censored data linear regression approaches under two levels of censoring.
| | Deletion (conventional) | Substitution with ½ the LLOS (conventional) | Substitution with 0.01 pg/mL (conventional) | Log-normal Tobit (censored data) |
|---|---|---|---|---|
| LLOS = 9.9 pg/mL | | | | |
| | 3.70 | 3.00 | 1.37 | 2.92 |
| | 0.02 | 0.02 | 0.03 | 0.02 |
| | 0.42 | 0.53 | 0.86 | 0.56 |
| | −0.07 (0.17) | 0.02 (0.18) | 0.21 (0.31) | 0.03 (0.19) |
| | 0.33 (0.24) | 0.42 | 0.63 (0.43) | 0.43 |
| N | 569 | 622 | 622 | 622 |
| LLOS = 19.7 pg/mL | | | | |
| | 4.03 | 3.13 | 0.40 (0.54) | 2.94 |
| | 0.01 | 0.02 | 0.04 | 0.02 |
| | 0.39 | 0.51 | 0.98 | 0.55 |
| | −0.08 (0.17) | 0.01 (0.18) | 0.18 (0.40) | 0.01 (0.19) |
| | 0.39 | 0.40 (0.25) | 0.53 (0.54) | 0.41 (0.26) |
| N | 526 | 622 | 622 | 622 |
Note: Conventional approaches use log-transformed CRP values. Coefficients are all non-standardized and on the log scale. SD = Standard Deviation. Male and Excellent health are the reference categories.
p < 0.05.
p < 0.01.
p < 0.001.
p < 0.1.
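For readers unfamiliar with the log-normal Tobit model named in the table, the following is a minimal sketch of a left-censored Tobit fit by maximum likelihood. It is an illustration under stated assumptions rather than the analysis reported here: the data are simulated, only one predictor is used, and the function name `fit_left_censored_tobit`, the seed, and the true coefficients are hypothetical.

```python
import numpy as np
from scipy import optimize, stats


def fit_left_censored_tobit(X, log_y, censored, log_limit):
    """Maximum likelihood fit of log(y) = X @ beta + error, left-censored.

    X         : design matrix including an intercept column
    log_y     : log outcome; entries at censored positions are ignored
    censored  : boolean mask, True where the outcome fell below the limit
    log_limit : log of the lower limit of sensitivity
    Returns (beta, sigma).
    """
    X = np.asarray(X, dtype=float)
    log_y = np.asarray(log_y, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    k = X.shape[1]
    X_obs, y_obs, X_cens = X[~censored], log_y[~censored], X[censored]

    def neg_log_lik(params):
        beta, sigma = params[:k], np.exp(params[k])
        # Observed outcomes contribute the normal density of the residual.
        ll_obs = stats.norm.logpdf(y_obs, loc=X_obs @ beta, scale=sigma).sum()
        # Censored outcomes contribute P(log y < log_limit | X).
        ll_cens = stats.norm.logcdf(
            log_limit, loc=X_cens @ beta, scale=sigma
        ).sum()
        return -(ll_obs + ll_cens)

    # Start from ordinary least squares on the observed rows only.
    beta_start, *_ = np.linalg.lstsq(X_obs, y_obs, rcond=None)
    resid = y_obs - X_obs @ beta_start
    start = np.append(beta_start, np.log(resid.std()))
    result = optimize.minimize(neg_log_lik, start, method="BFGS")
    return result.x[:k], float(np.exp(result.x[k]))


# Simulated (not real) data; the true coefficients are assumptions.
rng = np.random.default_rng(2)
n = 1000
bmi = rng.uniform(0, 100, size=n)
X = np.column_stack([np.ones(n), bmi])
log_crp_true = X @ np.array([3.5, 0.02]) + rng.normal(0.0, 1.5, size=n)

log_limit = np.log(9.9)
censored = log_crp_true < log_limit
log_crp = np.where(censored, log_limit, log_crp_true)  # placeholder, unused

beta_hat, sigma_hat = fit_left_censored_tobit(X, log_crp, censored, log_limit)
print(f"share censored: {censored.mean():.3f}")
print(f"intercept, slope: {beta_hat[0]:.3f}, {beta_hat[1]:.4f}")
print(f"sigma: {sigma_hat:.3f}")
```

Because censored observations enter through the cumulative distribution rather than through a substituted value, the fit does not depend on an arbitrary fill-in such as ½ the LLOS or 0.01 pg/mL.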