Ali Yousefi1,2, Darin D Dougherty3, Emad N Eskandar1, Alik S Widge3,4, Uri T Eden2.
Abstract
Censored data occur commonly in trial-structured behavioral experiments and many other forms of longitudinal data. They can lead to severe bias and reduction of statistical power in subsequent analyses. Principled approaches for dealing with censored data, such as data imputation and methods based on the complete data's likelihood, work well for estimating fixed features of statistical models but have not been extended to dynamic measures, such as serial estimates of an underlying latent variable over time. Here we propose an approach to the censored-data problem for dynamic behavioral signals. We developed a state-space modeling framework with a censored observation process at the trial timescale. We then developed a filter algorithm to compute the posterior distribution of the state process using the available data. We showed that special cases of this framework can incorporate the three most common approaches to censored observations: ignoring trials with censored data, imputing the censored data values, or using the full information available in the data likelihood. Finally, we derived a computationally efficient approximate Gaussian filter that is similar in structure to a Kalman filter, but that efficiently accounts for censored data. We compared the performances of these methods in a simulation study and provide recommendations of approaches to use, based on the expected amount of censored data in an experiment. These new techniques can broadly be applied in many research domains in which censored data interfere with estimation, including survival analysis and other clinical trial applications.
Keywords: Bayesian filtering; Gaussian approximation; censored data; dynamic behavioral signal; likelihood function; missing data; state-space model
Year: 2017 PMID: 29601047 PMCID: PMC5774187 DOI: 10.1162/CPSY_a_00003
Source DB: PubMed Journal: Comput Psychiatr ISSN: 2379-6227
Exact filter procedure: for each time step, the filter computes the posterior distribution and the predictive distribution.
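A filter of this kind can be illustrated with a small grid-based (point-mass) sketch. It is not the authors' implementation: the linear Gaussian state model, the Gaussian observation model, the parameter values, and every function name below are assumptions chosen for illustration, standing in for the paper's own equations. An observed trial contributes its density to the update. A censored trial (marked `None`) contributes the probability that the observation exceeded the threshold.

```python
import math

def normal_pdf(z, mean, sd):
    """Density of Normal(mean, sd**2) at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normal_sf(z, mean, sd):
    """P(Y > z) for Y ~ Normal(mean, sd**2)."""
    return 0.5 * math.erfc((z - mean) / (sd * math.sqrt(2.0)))

def make_grid(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def predict(posterior, grid, a, state_sd):
    """One-step prediction: mix the transition density over the previous posterior."""
    pred = []
    for x_new in grid:
        pred.append(sum(normal_pdf(x_new, a * x_old, state_sd) * p
                        for x_old, p in zip(grid, posterior)))
    return normalize(pred)

def update(pred, grid, y, obs_sd, threshold):
    """Bayes update; y is None when the trial is censored (y > threshold)."""
    if y is None:
        lik = [normal_sf(threshold, x, obs_sd) for x in grid]  # censored trial
    else:
        lik = [normal_pdf(y, x, obs_sd) for x in grid]         # observed trial
    return normalize([l * p for l, p in zip(lik, pred)])

def exact_filter(observations, grid, a=1.0, state_sd=0.2, obs_sd=0.5, threshold=1.0):
    """Return the posterior mean of the state after each trial (assumed model)."""
    posterior = normalize([normal_pdf(x, 0.0, 1.0) for x in grid])  # assumed prior
    means = []
    for y in observations:
        pred = predict(posterior, grid, a, state_sd)
        posterior = update(pred, grid, y, obs_sd, threshold)
        means.append(sum(x * p for x, p in zip(grid, posterior)))
    return means

grid = make_grid(-4.0, 4.0, 161)
means = exact_filter([0.2, None, None, 0.4], grid)
```

In this toy model a censored trial still moves the estimate: knowing only that the observation exceeded the threshold shifts posterior mass toward larger state values.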
Estimation results for a single realization. (a) One realization of the inattention-state process over 100 trials, based on Equation 8. (b) Realization of reaction time and binary decision data, based on the state process in panel a and Equations 9 and 10. Both the reaction time and binary response data are censored whenever the reaction time is longer than 0.9 s. The circles show the binary decision, either 0 or 1, and the “x” marks show the missing points. (c) Estimation results for the exact filter, based on the full likelihood in Equation 11. (d) Estimation results for data deletion, based on the likelihood in Equation 25. (e) Estimation results for data imputation, based on the likelihood in Equation 26. (f) Estimation results for the approximate Gaussian filter, detailed in Equations 21–24.
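The difference between the full-likelihood, data-deletion, and data-imputation treatments of a censored trial can be sketched as three choices of per-trial likelihood. This is a hedged illustration only: the Gaussian observation model, the `imputed_value` default, and the function names are hypothetical and do not reproduce the paper's Equations 11, 25, and 26.

```python
import math

def normal_pdf(z, mean, sd):
    """Density of Normal(mean, sd**2) at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normal_sf(z, mean, sd):
    """P(Y > z) for Y ~ Normal(mean, sd**2)."""
    return 0.5 * math.erfc((z - mean) / (sd * math.sqrt(2.0)))

def trial_likelihood(x, y, method, obs_sd=0.5, threshold=0.9, imputed_value=1.0):
    """Likelihood of state x for one trial; y is None when the trial is censored.

    Assumed toy model: y ~ Normal(x, obs_sd**2), censored when y > threshold.
    """
    if y is not None:
        return normal_pdf(y, x, obs_sd)              # observed trial: same for all methods
    if method == "deletion":
        return 1.0                                   # flat: the trial is ignored
    if method == "imputation":
        return normal_pdf(imputed_value, x, obs_sd)  # pretend a fixed value was observed
    if method == "full":
        return normal_sf(threshold, x, obs_sd)       # probability of exceeding the threshold
    raise ValueError(f"unknown method: {method}")

for method in ("deletion", "imputation", "full"):
    print(method, [round(trial_likelihood(x, None, method), 3) for x in (0.0, 0.9, 2.0)])
```

Under deletion the censored trial leaves the posterior unchanged. Imputation treats the filled-in value as if it had been measured exactly. The full likelihood uses only what is actually known, namely that the observation exceeded the threshold.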
Estimator accuracy summary results. (a) Root-mean-square error (RMSE) values as a function of the expected fraction of censored trials, based only on reaction time data. (b) Fractions of trials in which the true state value is contained in the estimated 95% HPD region, as a function of the expected fraction of censored trials based only on reaction time data. (c) RMSE values as a function of the expected fraction of censored trials, based on both reaction time and binary decision data. (d) Fractions of trials in which the true state value is contained in the estimated 95% HPD region, as a function of the expected fraction of censored trials, based on both reaction time and binary decision data. The simulation was run for a set of threshold values marked by “o,” and the performance graphs are derived by interpolation of the corresponding marked points.
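The two accuracy measures in these panels, RMSE and 95% HPD coverage, can be computed along the following lines. This is a minimal sketch that assumes each trial's posterior is available as probability masses on a shared grid of state values; the function names and the nearest-grid-point rule are illustrative choices, not taken from the paper.

```python
import math

def rmse(true_states, estimates):
    """Root-mean-square error between true and estimated state values."""
    sq = [(t - e) ** 2 for t, e in zip(true_states, estimates)]
    return math.sqrt(sum(sq) / len(sq))

def hpd_indices(posterior, mass=0.95):
    """Indices of the highest-mass grid points whose total mass reaches `mass`."""
    order = sorted(range(len(posterior)), key=lambda i: posterior[i], reverse=True)
    chosen, total = set(), 0.0
    for i in order:
        chosen.add(i)
        total += posterior[i]
        if total >= mass:
            break
    return chosen

def in_hpd(true_value, grid, posterior, mass=0.95):
    """True if the grid point nearest to the true value lies in the HPD region."""
    nearest = min(range(len(grid)), key=lambda i: abs(grid[i] - true_value))
    return nearest in hpd_indices(posterior, mass)

def coverage(true_states, grid, posteriors, mass=0.95):
    """Fraction of trials whose true state falls inside that trial's HPD region."""
    hits = [in_hpd(t, grid, p, mass) for t, p in zip(true_states, posteriors)]
    return sum(hits) / len(hits)

grid = [0.0, 1.0, 2.0, 3.0]
posterior = [0.02, 0.60, 0.36, 0.02]
print(rmse([0.0, 0.0], [3.0, 4.0]))
print(coverage([1.1, 3.0], grid, [posterior, posterior]))
```

A well-calibrated filter should give coverage near the nominal 95%. Coverage well below that suggests the posterior is too narrow or biased.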