Stephanie Coffey, Brady T West, James Wagner, Michael R Elliott.
Abstract
Responsive survey designs introduce protocol changes to survey operations based on accumulating paradata. Case-level predictions, including response propensity, can be used to tailor data collection features in pursuit of cost or quality goals. Unfortunately, predictions based only on partial data from the current round of data collection can be biased, leading to ineffective tailoring. Bayesian approaches can provide protection against this bias. Prior beliefs, which are generated from data external to the current survey implementation, contribute information that may be lacking from the partial current data. Those priors are then updated with the accumulating paradata. The elicitation of the prior beliefs, then, is an important characteristic of these approaches. While historical data for the same or a similar survey may be the most natural source for generating priors, eliciting prior beliefs from experienced survey managers may be a reasonable choice for new surveys, or when historical data are not available. Here, we fielded a questionnaire to survey managers, asking about expected attempt-level response rates for different subgroups of cases, and developed prior distributions for attempt-level response propensity model coefficients based on the mean and standard error of their responses. Then, using respondent data from a real survey, we compared the predictions of response propensity when the expert knowledge is incorporated into a prior to those based on a standard method that considers accumulating paradata only, as well as a method that incorporates historical survey data.
Keywords: Bayesian Analysis; Elicitation of Priors; Expert Opinion; Response Propensity; Responsive Survey Design
Year: 2020 PMID: 34093885 PMCID: PMC8174793 DOI: 10.12758/mda.2020.05
Source DB: PubMed Journal: Methoden Daten Anal ISSN: 1864-6956
Significant predictors of screener response propensity in the final discrete time logit model for call-level data from the eight most recent quarters, after applying backward selection (n = 119,981 calls; Nagelkerke pseudo R-squared = 0.09; AUC = 0.66).
| Predictor | Coefficient | Standard Error |
|---|---|---|
| Intercept | −2.56 | 0.32 |
| Mail Delivery Point Type: Missing | 0.08 | 0.03 |
| Mail Delivery Point Type: A | 0.03 | 0.02 |
| Mail Delivery Point Type: B | −0.04 | 0.03 |
| Mail Delivery Point Type: C | −0.09 | 0.03 |
| Interviewer-Judged Eligibility: Missing | 2.46 | 0.10 |
| Interviewer-Judged Eligibility: No | 0.63 | 0.07 |
| Segment Listed: Car Alone | 0.03 | 0.02 |
| PSU Type: Non Self-Representing | 0.06 | 0.03 |
| PSU Type: Self-Representing (Not Largest 3 MSAs) | 0.03 | 0.03 |
| Previous Call: Contact | 3.97 | 0.28 |
| Previous Call: Different Window | −0.12 | 0.02 |
| Previous Call: Building Ever Locked | 0.32 | 0.05 |
| Previous Call: Building Locked | 2.16 | 0.14 |
| Previous Call: Strong Concerns Expressed | 0.26 | 0.04 |
| Previous Call: No Contact | 2.26 | 0.13 |
| Previous Call: Other Contact, No Concerns Expressed | −1.35 | 0.25 |
| Previous Call: Concerns Expressed | −1.58 | 0.26 |
| Previous Call: Soft Appointment | −1.03 | 0.30 |
| Previous Call: Call Window Sun.-Thurs. 6pm-10pm | 0.07 | 0.03 |
| Previous Call: Call Window Fri.-Sat. 6pm-10pm | 0.08 | 0.02 |
| No Access Problems in Segment | −0.05 | 0.02 |
| Evidence of Other Languages (not Spanish) | −0.09 | 0.03 |
| Census Division: G | −0.14 | 0.03 |
| Census Division: B | −0.32 | 0.03 |
| Census Division: D | −0.22 | 0.03 |
| Census Division: H | −0.24 | 0.03 |
| Census Division: C | −0.20 | 0.03 |
| Census Division: F | −0.27 | 0.04 |
| Census Division: E | −0.20 | 0.03 |
| Census Division: A | −0.19 | 0.04 |
| Contacts: None | −0.68 | 0.24 |
| Contacts: 1 | −0.54 | 0.22 |
| Contacts: 2 to 4 | −0.42 | 0.19 |
| Segment Domain: <10% Black, <10% Hispanic | −0.04 | 0.02 |
| Segment Domain: >10% Black, <10% Hispanic | −0.04 | 0.02 |
| Segment Domain: <10% Black, >10% Hispanic | 0.01 | 0.03 |
| Percentage of Segment Non-Eligible (Census Data) | −0.01 | <0.01 |
| Interviewer-Estimated Segment Eligibility Rate | −0.55 | 0.12 |
| Interviewer-Estimated Household Eligible | −0.09 | 0.02 |
| Segment Type: All Residential | 0.04 | 0.02 |
| Log(Number of Calls Made) | −0.60 | 0.03 |
| Log(Number of Calls Made) x No. Prev. Contacts | −0.04 | 0.01 |
| CML | −0.12 | 0.02 |
| CML Adult Count: Missing | −0.13 | 0.04 |
| CML Adult Count: 1 | −0.09 | 0.03 |
| CML Adult Count: 2 | 0.01 | 0.03 |
| CML Asian in HH: Missing | 0.21 | 0.04 |
| CML Asian in HH: No | 0.20 | 0.05 |
| CML HoH Gender: Missing | −0.03 | 0.02 |
| CML HoH Gender: Female | −0.01 | 0.02 |
| CML HoH Income: $35k-$70k | 0.12 | 0.02 |
| CML HoH Income: less than $35k | 0.14 | 0.02 |
| CML HH Own/Rent: Missing | −0.06 | 0.03 |
| CML HH Own/Rent: Owned | −0.02 | 0.02 |
| CML Age of 2nd Person: Missing | −0.13 | 0.03 |
| CML Age of 2nd Person: 18–44 | −0.15 | 0.03 |
| No Respondent Comments | 0.08 | 0.04 |
| Non-Contacts: None | −0.51 | 0.08 |
| Non-Contacts: 1 | −0.25 | 0.05 |
| Non-Contacts: 2–4 | −0.03 | 0.03 |
| Occupancy Rate of PSU | −0.26 | 0.10 |
| Respondent Other Concerns | 0.18 | 0.06 |
| Physical Impediment to Housing Unit: Locked | −0.35 | 0.03 |
| Day of Quarter | 0.01 | <0.01 |
| Respondent Concerns Expressed: None | −1.25 | 0.15 |
| Respondent Concerns Expressed: Once | 0.15 | 0.09 |
| Single Family Home / Townhome | −0.22 | 0.03 |
| Structure with 2–9 Units | −0.29 | 0.04 |
| Structure with 10+ Units | −0.21 | 0.04 |
| Respondent Concern: Survey Voluntary? | −0.46 | 0.15 |
| Respondent Concern: Too Old | 0.60 | 0.15 |
CML denotes that the variable came from a commercial data source.
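To help read the table, the sketch below converts a logit coefficient into an odds ratio and shows how it would shift a baseline call-level response probability. The two coefficients are copied from the table; the baseline probability of 0.10 and the function names are hypothetical, chosen only for illustration, and the calculation holds all other predictors fixed.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta: float) -> float:
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds_ratio(beta: float) -> float:
    """Multiplicative change in the odds of response for a one-unit change."""
    return math.exp(beta)


def shift_probability(baseline_p: float, beta: float) -> float:
    """Probability after adding beta to the baseline log-odds,
    with every other predictor held fixed."""
    return inverse_logit(logit(baseline_p) + beta)


# Coefficients copied from the table above (log-odds scale).
beta_locked = -0.35    # Physical Impediment to Housing Unit: Locked
beta_log_calls = -0.60  # Log(Number of Calls Made)

# Hypothetical baseline propensity; not a value reported in the source.
baseline = 0.10

if __name__ == "__main__":
    print("Odds ratio, locked unit:", round(odds_ratio(beta_locked), 3))
    print("Propensity with locked unit:",
          round(shift_probability(baseline, beta_locked), 4))
    # Doubling the number of calls raises log(calls) by log(2).
    print("Propensity after doubling calls:",
          round(shift_probability(baseline, beta_log_calls * math.log(2)), 4))
```

Because the model is on the logit scale, the same coefficient produces a different absolute change in probability depending on the baseline, which is why the shift is computed through the log-odds rather than added to the probability directly.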
Normal prior definitions (mean and standard error of the elicited coefficients) for all predictors included in the NSFG response propensity model described in Section 3.2. The table notes which categories served as reference categories in the prior generation process, and how many responses (out of a maximum of 20) we received for each category.
| Questions and Categories | Count of Responses (All Respondents, max n = 20) | Mean Beta | StdErr Beta |
|---|---|---|---|
| Female | 20 | 0.336 | 0.063 |
| Missing | 14 | −0.465 | 0.257 |
| < 50 | 20 | −0.370 | 0.108 |
| Missing | 15 | −0.831 | 0.293 |
| 1 | 20 | 0.066 | 0.198 |
| Missing | 12 | −0.732 | 0.219 |
| White | 18 | 0.532 | 0.121 |
| Black | 18 | −0.031 | 0.173 |
| Hispanic | 18 | −0.118 | 0.112 |
| Other | 13 | −0.348 | 0.233 |
| Missing | 12 | −0.326 | 0.292 |
| Household Income Effect | | | |
| +$10,000 | 17 | 0.466 | 0.235 |
| G | 14 | 0.020 | 0.129 |
| B | 14 | −0.205 | 0.138 |
| D | 14 | 0.041 | 0.141 |
| H | 14 | 0.060 | 0.161 |
| C | 14 | 0.133 | 0.170 |
| F | 15 | 0.294 | 0.150 |
| E | 15 | 0.057 | 0.145 |
| A | 16 | −0.050 | 0.192 |
| < 10% Black, < 10% Hispanic | 16 | 0.696 | 0.202 |
| > 10% Black, < 10% Hispanic | 16 | 0.535 | 0.132 |
| < 10% Black, > 10% Hispanic | 16 | 0.364 | 0.143 |
| Locked Buildings/Gated Communities | 19 | −0.687 | 0.190 |
| Seasonal Hazardous Conditions | 18 | −0.418 | 0.153 |
| Unimproved Roads | 17 | 0.267 | 0.164 |
| None | 10 | 1.091 | 0.189 |
| Yes | 15 | −0.725 | 0.163 |
| 10 years older than national average | 17 | 0.520 | 0.099 |
| 10% increase in occupancy rates | 16 | 0.187 | 0.170 |
| Minor Metropolitan Area | 18 | 0.155 | 0.155 |
| Not Metropolitan | 17 | 0.398 | 0.158 |
| On Foot With Someone | 11 | 0.787 | 0.607 |
| In a Car Alone | 11 | −0.066 | 0.135 |
| In a Car With Someone | 11 | 0.795 | 0.614 |
| Single Family Home | 5 | 1.172 | 0.567 |
| Structure with 2–9 Units | 5 | 0.788 | 0.602 |
| Structure with 10+ Units | 5 | 0.600 | 0.617 |
| Mobile Home | 5 | 0.728 | 0.462 |
| Curbline | 3 | 0.917 | 0.590 |
| Neighborhood Delivery Collection Box | 3 | 0.199 | 0.289 |
| Central | 3 | 0.069 | 0.384 |
| Missing | 3 | 0.000 | 0.000 |
| Locked Entrance | 19 | −0.096 | 0.206 |
| Doorperson or Gatekeeper | 19 | −0.627 | 0.117 |
| Access controlled via Intercom | 19 | −0.371 | 0.106 |
| None | 14 | 1.076 | 0.155 |
| Concerns Expressed on Previous Attempt | 17 | −1.347 | 0.434 |
| Concerns Expressed Not on Previous but Prior Attempt | 17 | −1.451 | 0.244 |
| Strong Concerns Ever Expressed | 15 | −2.228 | 0.593 |
| Contacted at Previous Attempt | 15 | 1.367 | 0.329 |
| Not Previous but Prior Contact | 15 | 1.009 | 0.298 |
| Ever Said "Too Old" | 14 | −0.532 | 0.336 |
| Comment re: Voluntary Nature of Survey | 17 | 0.335 | 0.489 |
| Any Other Comments | 14 | 0.118 | 0.182 |
| Never Made Comment | 13 | 0.325 | 0.205 |
| Change in RR for Each Day of Field Period | 12 | 0.213 | 0.078 |
| Weekday Evening | 19 | 1.203 | 0.193 |
| Weekend Day | 19 | 1.052 | 0.166 |
| Weekend Evening | 19 | 0.426 | 0.220 |
| Yes | 18 | 0.564 | 0.339 |
| Change in RR for Each Additional Contact | 17 | −0.058 | 0.109 |
| Change in RR for Each Add’l Call*Contact | 13 | 0.177 | 0.228 |
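The following is a minimal sketch of how a normal prior of this kind might be built and then combined with an estimate from the current data. It rests on several assumptions that the source does not confirm: that each expert's expected response rates are mapped to a coefficient as the log-odds difference from the reference category, that the prior standard deviation is the standard error of those values across experts, and that the update uses a simple normal approximation (precision weighting) rather than a full Bayesian logistic regression. The expert response rates and the current-data estimate are invented for illustration; only the prior for "Contacted at Previous Attempt" is taken from the table.

```python
import math
import statistics


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def elicited_betas(reference_rates, category_rates):
    """One implied coefficient per expert: log-odds of the category
    minus log-odds of the reference category (assumed mapping)."""
    return [logit(c) - logit(r)
            for r, c in zip(reference_rates, category_rates)]


def normal_prior(betas):
    """Prior mean and standard error across experts (needs >= 2 experts)."""
    mean = statistics.fmean(betas)
    std_err = statistics.stdev(betas) / math.sqrt(len(betas))
    return mean, std_err


def precision_weighted_update(prior_mean, prior_se, estimate, estimate_se):
    """Normal-normal update: each source is weighted by 1 / variance.
    This is a normal approximation, not a full posterior for a logit model."""
    w_prior = 1.0 / prior_se ** 2
    w_data = 1.0 / estimate_se ** 2
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * estimate)
    return post_mean, math.sqrt(post_var)


# Hypothetical expected response rates from four experts.
reference = [0.10, 0.12, 0.08, 0.15]  # reference category
category = [0.14, 0.15, 0.12, 0.18]   # category of interest
toy_prior = normal_prior(elicited_betas(reference, category))

# Prior copied from the table: "Contacted at Previous Attempt".
table_prior_mean, table_prior_se = 1.367, 0.329
# Hypothetical coefficient estimate from partial current-round data.
current_estimate, current_se = 2.0, 0.5

posterior = precision_weighted_update(
    table_prior_mean, table_prior_se, current_estimate, current_se)

if __name__ == "__main__":
    print("Toy elicited prior (mean, SE):", toy_prior)
    print("Approximate posterior (mean, SE):", posterior)
```

Under this approximation the posterior mean always lies between the prior mean and the current-data estimate, and a category with few expert responses (hence a large standard error) contributes little once paradata accumulate.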
Model Fit Statistics for In-Sample Predictions of Response, 5 Evaluation Quarters
| Statistic | Q16 | Q17 | Q18 | Q19 | Q20 |
|---|---|---|---|---|---|
| ROC-AUC | 0.711 | 0.682 | 0.661 | 0.690 | 0.654 |
| Nagelkerke pseudo R-squared | 0.143 | 0.115 | 0.089 | 0.130 | 0.086 |
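For readers unfamiliar with the two fit statistics, the sketch below computes them from observed binary outcomes and predicted propensities using their standard textbook definitions. The function names and the eight-record toy data set are hypothetical; nothing here reproduces the reported values, and the source does not say which software produced them.

```python
import math


def roc_auc(y, p):
    """Probability that a randomly chosen responder has a higher predicted
    propensity than a randomly chosen nonresponder (ties count as 1/2)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y under probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def nagelkerke_r2(y, p):
    """Cox-Snell R-squared rescaled by its maximum attainable value."""
    n = len(y)
    base_rate = sum(y) / n
    ll_null = log_likelihood(y, [base_rate] * n)  # intercept-only model
    ll_model = log_likelihood(y, p)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical outcomes (1 = response) and predicted propensities.
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.20, 0.80]

if __name__ == "__main__":
    print("ROC-AUC:", roc_auc(y_toy, p_toy))
    print("Nagelkerke pseudo R-squared:", nagelkerke_r2(y_toy, p_toy))
```

An AUC of 0.5 corresponds to predictions that rank responders no better than chance, and a Nagelkerke value of 0 corresponds to a model that fits no better than the overall response rate, which gives some context for values in the 0.65 to 0.71 and 0.09 to 0.14 ranges.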
Figure 1. Coefficients for Contact Status by Organization
Figure 2. Coefficients for Contact Status by Experience
Figure 3. Coefficients for Expressed Concerns by Organization
Figure 4. Coefficients for Expressed Concerns by Experience
Figure 5. Estimated Betas for Listing Procedure by Organization
Figure 6. Estimated Betas for Likely Non-English Speaker by Organization
Figure 7. Bias in Response Propensities by Quarter (Early)
Figure 8. RMSE of Response Propensities by Quarter (Early)
Figure 9. Bias in Response Propensities by Quarter (Mid)
Figure 10. RMSE of Response Propensities by Quarter (Mid)
Figure 11. Bias in Response Propensities by Quarter (Late)
Figure 12. RMSE of Response Propensities by Quarter (Late)