Joy E Lawn1, Hannah Blencowe1, Vladimir Sergeevich Gordeev2,3, Joseph Akuze1,4,5, Angela Baschieri1, Sanne M Thysen6,7,8, Francis Dzabeng9, M Moinuddin Haider10, Melanie Smuk11, Michael Wild12, Michael M Lokshin12, Temesgen Azemeraw Yitayew13, Solomon Mokonnen Abebe13, Davis Natukwatsa14, Collins Gyezaho14, Seeba Amenga-Etego9. 1. Maternal, Adolescent, Reproductive & Child Health (MARCH) Centre, London School of Hygiene & Tropical Medicine, London, UK. 2. Institute of Population Health Sciences, Queen Mary University of London, London, UK. v.gordeev@qmul.ac.uk. 3. Maternal, Adolescent, Reproductive & Child Health (MARCH) Centre, London School of Hygiene & Tropical Medicine, London, UK. v.gordeev@qmul.ac.uk. 4. Department of Health Policy, Planning and Management, Makerere University School of Public Health, Kampala, Uganda. 5. Centre of Excellence for Maternal Newborn and Child Health Research, Makerere University, Kampala, Uganda. 6. Bandim Health Project, Bissau, Guinea-Bissau. 7. Research Centre for Vitamins and Vaccines, Statens Serum Institut, Copenhagen, Denmark. 8. Department of Clinical Research Open Patient data Explorative Network (OPEN), University of Southern Denmark, Odense, Denmark. 9. Kintampo Health Research Centre, Kintampo, Ghana. 10. Health Systems and Population Studies Division, icddr,b, Dhaka, Bangladesh. 11. Department of Medical Statistics, London School of Hygiene & Tropical Medicine, London, UK. 12. The World Bank, Washington DC, USA. 13. Dabat Research Centre Health and Demographic Surveillance System, Dabat, Ethiopia. 14. IgangaMayuge Health and Demographic Surveillance System, Makerere University Centre for Health and Population Research, Makerere, Uganda.
Abstract
BACKGROUND: Paradata are (timestamped) records tracking the process of (electronic) data collection. We analysed paradata from a large household survey of questions capturing pregnancy outcomes to assess performance (timing and correction processes). We examined how paradata can be used to inform and improve questionnaire design and survey implementation in nationally representative household surveys, the major source for maternal and newborn health data worldwide.
METHODS: The EN-INDEPTH cross-sectional population-based survey of women of reproductive age in five Health and Demographic Surveillance System sites (in Bangladesh, Guinea-Bissau, Ethiopia, Ghana, and Uganda) randomly compared two modules to capture pregnancy outcomes: full pregnancy history (FPH) and the standard DHS-7 full birth history (FBH+). We used paradata related to answers recorded on tablets using the Survey Solutions platform. We evaluated the difference in paradata entries between the two reproductive modules and used regression analyses to assess which question characteristics (type, nature, structure) affect answer correction rates. We also proposed and tested a new classification of answer correction types.
RESULTS: We analysed 3.6 million timestamped entries from 65,768 interviews. Of all interviews, 83.7% had at least one corrected answer to a question. Of 3.3 million analysed questions, 7.5% had at least one correction. Among corrected questions, the median number of corrections was one, regardless of question characteristics. We classified answer corrections into eight types (no correction, impulsive, flat (simple), zigzag, flat zigzag, missing after correction, missing after flat (zigzag) correction, missing/incomplete). Of all corrections, 84.6% were flat (simple) mistake corrections, judged not to be problematic. Question characteristics were important predictors of the probability of answer corrections, even after adjusting for respondents' characteristics and location, with interviewer clustering accounted for as a fixed effect. Answer correction patterns and types, as well as overall response duration, were similar between FPH and FBH+. Avoiding corrections has the potential to reduce interview duration and reproductive module completion time by 0.4 min.
CONCLUSIONS: The use of questionnaire paradata has the potential to improve measurement and the resultant quality of electronic data. Identifying sections or specific questions with multiple corrections sheds light on typically hidden challenges in the survey's content, process, and administration, allowing for earlier real-time intervention (e.g., questionnaire content revision or additional staff training). Given the size and complexity of paradata, additional time, data management, and programming skills are required to realise its potential.
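As an illustration of the kind of summary reported above, the following minimal Python sketch computes the share of questions and of interviews with at least one correction, and the median number of corrections among corrected questions. It is not the authors' code and does not reproduce the Survey Solutions paradata format: the tuple layout `(interview_id, question_id, timestamp, value)` and the function name `correction_summary` are hypothetical, and a "correction" is assumed here to be any answer entry recorded for a question after its first entry. The eight-type classification is not attempted, since the abstract does not define the types.

```python
from collections import defaultdict
from statistics import median


def correction_summary(events):
    """Summarise answer corrections in a paradata-like event log.

    `events` is an iterable of (interview_id, question_id, timestamp, value)
    tuples, one per recorded answer entry (hypothetical layout).
    Assumption: every entry after the first one for the same question in the
    same interview counts as one correction.
    """
    entries = defaultdict(int)
    for interview_id, question_id, _timestamp, _value in events:
        entries[(interview_id, question_id)] += 1

    if not entries:
        return None  # nothing to summarise

    # Corrections per question = number of entries minus the initial answer.
    corrections = {key: n - 1 for key, n in entries.items()}
    corrected = [c for c in corrections.values() if c > 0]

    interviews = {key[0] for key in corrections}
    interviews_corrected = {key[0] for key, c in corrections.items() if c > 0}

    return {
        "share_questions_corrected": len(corrected) / len(corrections),
        "share_interviews_corrected": len(interviews_corrected) / len(interviews),
        "median_corrections_if_corrected": median(corrected) if corrected else None,
    }


# Toy log: interview i1 changes its answer to q1 once; nothing else is changed.
toy_events = [
    ("i1", "q1", 1, "a"),
    ("i1", "q1", 2, "b"),
    ("i1", "q2", 3, "c"),
    ("i2", "q1", 4, "a"),
]
summary = correction_summary(toy_events)
```

On real paradata the same counts would typically be broken down by module (FPH or FBH+) and by question characteristics before any regression modelling.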