Karen McKeown, Nicholas P Jewell.
Abstract
We describe a simple method for nonparametric estimation of a distribution function based on current status data where observations of current status information are subject to misclassification. Nonparametric maximum likelihood techniques lead to use of a straightforward set of adjustments to the familiar pool-adjacent-violators estimator used when misclassification is assumed absent. The methods consider alternative misclassification models and are extended to regression models for the underlying survival time. The ideas are motivated by and applied to an example on human papilloma virus (HPV) infection status of a sample of women examined in San Francisco.
Year: 2010 PMID: 20157848 PMCID: PMC9150792 DOI: 10.1007/s10985-010-9154-0
Source DB: PubMed Journal: Lifetime Data Anal ISSN: 1380-7870 Impact factor: 1.429
Fig. 1 Hypothetical unconstrained NPMLE with the positions of hypothetical α and 1 − β shown on the vertical axis and the positions of k0, k1, m0 and m1 shown on the horizontal axis
Fig. 2 (a) Hypothetical data (α = 0.8, β = 0.8). (b) HPV data (α = 0.73, β = 0.9). Estimated cumulative distribution functions for hypothetical data (F assumed Exponential with mean 2) and the HPV data. Both the unconstrained NPMLE obtained through the pool-adjacent-violators algorithm and the proposed adjusted NPMLE are presented
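The adjustment mentioned in the abstract and illustrated in the figures can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes α is the probability that a true positive is recorded as positive, β is the probability that a true negative is recorded as negative, and α + β > 1, so that the observed positive fraction at monitoring time c is G(c) = αF(c) + (1 − β)(1 − F(c)). Under that assumption the pool-adjacent-violators fit of G is truncated to [1 − β, α] and mapped back linearly to F. The function names `pava` and `adjusted_npmle` are hypothetical.

```python
from collections import defaultdict


def pava(y, w):
    """Weighted non-decreasing isotonic fit by pool-adjacent-violators."""
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def adjusted_npmle(times, status, alpha, beta):
    """Return (sorted unique times, unadjusted PAV fit, adjusted fit).

    Assumes constant misclassification with alpha + beta > 1 (an assumption
    of this sketch, see the lead-in).
    """
    if alpha + beta <= 1:
        raise ValueError("this sketch requires alpha + beta > 1")
    counts = defaultdict(lambda: [0, 0])  # time -> [n monitored, n positive]
    for t, y in zip(times, status):
        counts[t][0] += 1
        counts[t][1] += y
    uniq = sorted(counts)
    n_at = [counts[t][0] for t in uniq]
    prop = [counts[t][1] / counts[t][0] for t in uniq]
    g_hat = pava(prop, n_at)  # unconstrained fit of the observed fraction
    floor = 1.0 - beta
    # Truncate to [1 - beta, alpha], then invert G = (1 - beta) + (alpha + beta - 1) F.
    f_hat = [(min(max(g, floor), alpha) - floor) / (alpha + beta - 1.0)
             for g in g_hat]
    return uniq, g_hat, f_hat


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    t, g, f = adjusted_npmle([1, 2, 3, 4], [1, 0, 1, 1], alpha=0.8, beta=0.8)
    for row in zip(t, g, f):
        print(row)
```

With α = β = 1 the truncation and rescaling leave the fit unchanged, so the sketch reduces to the ordinary pool-adjacent-violators estimate.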
Confidence interval estimation for the adjusted (α = 0.73, β = 0.9) NPMLE at three monitoring times obtained using the m out of n bootstrap for various values of m ranging from 9 to 423
| 15.3 years | 19 years | 22 years |
|---|---|---|
| 0.609 | 0.763 | 0.793 |
| [0.471, 0.747] | [0.614, 0.912] | [0.718, 0.868] |
| [0.407, 0.811] | [0.646, 0.880] | [0.718, 0.868] |
| [0.311, 0.907] | [0.646, 0.880] | [0.687, 0.899] |
| [0.396, 0.822] | [0.667, 0.859] | [0.665, 0.921] |
| [0.407, 0.811] | [0.667, 0.859] | [0.655, 0.931] |
| [0.449, 0.769] | [0.688, 0.838] | [0.676, 0.910] |
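For readers unfamiliar with the m out of n bootstrap named in the caption above, the following is a rough sketch of one common construction. The source does not say which interval construction or which number of resamples was used, so everything here is an assumption: the function name `m_out_of_n_ci`, the cube-root scaling (`rate=1/3`, the usual rate for current status NPMLEs), the percentile-of-roots interval, and the stand-in estimator `proportion`, which is used only so the block runs on its own.

```python
import random


def proportion(sample):
    """Stand-in estimator (hypothetical): fraction of positives in the sample."""
    return sum(sample) / len(sample)


def m_out_of_n_ci(data, estimator, m, n_boot=500, level=0.95, rate=1 / 3, seed=0):
    """Sketch of an m out of n bootstrap confidence interval.

    Resamples m of the n observations with replacement, forms the roots
    m**rate * (theta*_m - theta_n), and rescales their quantiles by n**rate.
    Returns (point estimate, lower limit, upper limit).
    """
    rng = random.Random(seed)
    n = len(data)
    theta_n = estimator(data)
    roots = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(m)]
        roots.append(m ** rate * (estimator(sample) - theta_n))
    roots.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = min(n_boot - 1, int((1 + level) / 2 * n_boot))
    q_lo, q_hi = roots[lo_idx], roots[hi_idx]
    # Invert the root: upper quantile gives the lower limit and vice versa.
    return theta_n, theta_n - q_hi / n ** rate, theta_n - q_lo / n ** rate


if __name__ == "__main__":
    toy = [1] * 30 + [0] * 70  # invented data, for illustration only
    print(m_out_of_n_ci(toy, proportion, m=25))
```

In an application to the adjusted NPMLE, `estimator` would be replaced by a function that refits the estimator on each resample and evaluates it at the monitoring time of interest.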
Simulation averages (standard deviations) of two estimators of the distribution function F (Exponential with mean 2) at 5 monitoring times, when the generated data are either always subject to misclassification (A = ∞) or never subject to misclassification (A = 0)
| | 0.4 | 0.8 | 1.4 | 1.8 | 2.8 |
|---|---|---|---|---|---|
| 0% (0%) | | | | | |
| NPMLE0 | 0.178(0.055) | 0.331(0.063) | 0.496(0.059) | 0.591(0.056) | 0.760(0.049) |
| NPMLE∞ | 0.022(0.043) | 0.218(0.104) | 0.494(0.098) | 0.652(0.094) | 0.923(0.068) |
| 100% (20%) | | | | | |
| NPMLE0 | 0.306(0.056) | 0.397(0.056) | 0.500(0.051) | 0.557(0.047) | 0.662(0.050) |
| NPMLE∞ | 0.178(0.091) | 0.329(0.094) | 0.500(0.086) | 0.593(0.078) | 0.769(0.084) |
The resulting % subject to misclassification (average % actually misclassified) are also given for each simulation. NPMLE0 and NPMLE∞ represent the unconstrained NPMLE and the NPMLE adjusted for constant response misclassification, respectively
Simulation averages (standard deviations) of two estimators of the distribution function F (Exponential with mean 2) at 5 monitoring times when the data generating distribution is subject to constant misclassification (α = 0.8, β = 0.8) only within a window of length 2A around the underlying failure time
| | 0.4 | 0.8 | 1.4 | 1.8 | 2.8 |
|---|---|---|---|---|---|
| | 0.225(0.057) | 0.327(0.057) | 0.454(0.056) | 0.543(0.057) | 0.732(0.052) |
| | 0.063(0.069) | 0.214(0.095) | 0.424(0.094) | 0.571(0.095) | 0.884(0.079) |
| Bias adjusted | | | | | |
| | 0.177(0.088) | 0.315(0.097) | 0.467(0.101) | 0.575(0.102) | 0.775(0.084) |
| | 0.172(0.103) | 0.337(0.116) | 0.485(0.118) | 0.588(0.119) | 0.769(0.104) |
| | 0.258(0.055) | 0.358(0.056) | 0.470(0.056) | 0.530(0.057) | 0.673(0.051) |
| | 0.104(0.084) | 0.263(0.095) | 0.451(0.087) | 0.549(0.085) | 0.787(0.090) |
| Bias adjusted | | | | | |
| | 0.185(0.093) | 0.320(0.099) | 0.487(0.092) | 0.551(0.090) | 0.729(0.094) |
| NPMLE∞ | 0.184(0.109) | 0.331(0.117) | 0.506(0.107) | 0.559(0.105) | 0.731(0.109) |
Window lengths of A = 1.5 and A = 2.5 are evaluated. The resulting % subject to misclassification (average % misclassified) are also given for each simulation. NPMLE0 and NPMLE∞ represent the unconstrained NPMLE and the NPMLE adjusted for constant response misclassification, respectively. The corresponding bias adjusted estimates (standard deviations) for each estimator under the different window lengths are also presented
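The data-generating scheme described in the simulation captions can be sketched as below. This is an illustration under assumptions rather than the authors' simulation code: the failure time is Exponential with mean 2 as stated, an observation is taken to be subject to misclassification when the monitoring time falls within A of the failure time (a window of length 2A), and α and β are read as the probabilities of correctly recording a true positive and a true negative. The monitoring-time distribution is not given in the source, so the Uniform(0, `c_max`) choice, like the function name `simulate`, is hypothetical.

```python
import math
import random


def simulate(n, A, alpha=0.8, beta=0.8, mean=2.0, c_max=4.0, seed=0):
    """Generate misclassified current status data.

    Returns (rows, fraction subject to misclassification, fraction actually
    misclassified), where rows is a list of (monitoring time, observed status).
    """
    rng = random.Random(seed)
    rows, n_subject, n_flipped = [], 0, 0
    for _ in range(n):
        t = rng.expovariate(1.0 / mean)  # failure time, Exponential with mean 2
        c = rng.uniform(0.0, c_max)      # monitoring time (assumed distribution)
        truth = 1 if t <= c else 0       # true current status
        observed = truth
        if abs(c - t) <= A:              # inside the window around the failure time
            n_subject += 1
            p_positive = alpha if truth == 1 else 1.0 - beta
            observed = 1 if rng.random() < p_positive else 0
        n_flipped += int(observed != truth)
        rows.append((c, observed))
    return rows, n_subject / n, n_flipped / n


if __name__ == "__main__":
    for A in (0.0, 1.5, 2.5, math.inf):
        _, subject, flipped = simulate(20000, A)
        print(A, round(subject, 3), round(flipped, 3))
```

Setting `A = 0` switches misclassification off and `A = math.inf` makes every observation subject to it; with α = β = 0.8 the latter case misclassifies about 20% of observations on average, in line with the "100% (20)%" entry reported for A = ∞.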
Simulation averages (standard deviations) of two estimators of the distribution function F (unconstrained NPMLE from the HPV data) at 5 monitoring times when the data generating distribution is subject to misclassification that varies with time
| | 15 years | 16.2 years | 19.2 years | 21.7 years | 23.2 years |
|---|---|---|---|---|---|
| | 0.414(0.086) | 0.464(0.054) | 0.511(0.031) | 0.540(0.039) | 0.584(0.069) |
| | 0.498(0.137) | 0.578(0.085) | 0.652(0.049) | 0.698(0.061) | 0.766(0.102) |
| | 0.388(0.084) | 0.433(0.057) | 0.498(0.034) | 0.532(0.042) | 0.584(0.076) |
| | 0.457(0.132) | 0.528(0.090) | 0.632(0.054) | 0.686(0.067) | 0.764(0.111) |
| Bias adjusted | | | | | |
| | 0.435(0.153) | 0.475(0.113) | 0.537(0.100) | 0.562(0.115) | 0.612(0.168) |
| | 0.475(0.187) | 0.472(0.149) | 0.524(0.141) | 0.544(0.147) | 0.606(0.191) |
| | 0.383(0.081) | 0.426(0.054) | 0.474(0.030) | 0.513(0.045) | 0.573(0.078) |
| | 0.449(0.129) | 0.517(0.086) | 0.594(0.047) | 0.655(0.071) | 0.747(0.114) |
| Bias adjusted | | | | | |
| | 0.437(0.153) | 0.473(0.113) | 0.520(0.090) | 0.553(0.114) | 0.610(0.172) |
| | 0.473(0.189) | 0.472(0.150) | 0.513(0.132) | 0.542(0.154) | 0.604(0.203) |
| | 0.384(0.083) | 0.428(0.052) | 0.472(0.032) | 0.496(0.038) | 0.544(0.075) |
| NPMLE∞ | 0.451(0.129) | 0.521(0.083) | 0.590(0.050) | 0.629(0.060) | 0.702(0.111) |
Classification rates of α = 0.8 and β = 0.9 are assumed outside the window and rates of α = 0.73 and β = 0.9 are assumed within a window of length 2A around the underlying failure time. Window lengths of A = 0, 4.5, 8, ∞ are evaluated. The resulting % subject to misclassification (average % misclassified) are also given for each simulation. NPMLE0 and NPMLE∞ represent the unconstrained NPMLE and the NPMLE adjusted for constant (α = 0.73, β = 0.9) misclassification, respectively. The corresponding bias adjusted estimates (standard deviations) for each estimator in the windows of length A = 4.5 and A = 8 are also presented
Estimates (and standard errors) of the log Relative Hazard (RH) for time to first HPV infection, which is assumed to follow a Weibull distribution
| Covariate | Log(RH): Model (a) | Log(RH): Model (b) | |
|---|---|---|---|
| Smoke now | 0.056(0.108) | 0.103(0.144) | 0.544 |
| STD | −0.479(0.299) | −0.698(0.258) | 0.686 |
| Log(age at screening) | 0.822(0.455) | 1.269(0.552) | 0.648 |
Model (a) ignores misclassification in the response variable and Model (b) incorporates constant misclassification corresponding to α = 0.73 and β = 0.9
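As a small arithmetic aid for reading the regression table, the snippet below converts each log relative hazard to a relative hazard and computes the ratio of the Model (a) estimate to the Model (b) estimate. The point estimates are copied from the table; the dictionary layout and the name `summarize` are illustrative only. The computed ratios agree to three decimals with the values in the last column of the table.

```python
import math

# Log relative hazards from the table: covariate -> (Model (a), Model (b)).
estimates = {
    "Smoke now": (0.056, 0.103),
    "STD": (-0.479, -0.698),
    "Log(age at screening)": (0.822, 1.269),
}


def summarize(estimates):
    """Relative hazards exp(log RH) and the ratio of Model (a) to Model (b)."""
    out = {}
    for name, (log_rh_a, log_rh_b) in estimates.items():
        out[name] = {
            "rh_a": math.exp(log_rh_a),
            "rh_b": math.exp(log_rh_b),
            "ratio": log_rh_a / log_rh_b,
        }
    return out


summary = summarize(estimates)
for name, row in summary.items():
    print(name, round(row["rh_a"], 3), round(row["rh_b"], 3), round(row["ratio"], 3))
```

All three ratios are below 1, meaning the estimates that ignore misclassification are smaller in magnitude than those from the model that accounts for it.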