Michael A Kallen (1), Karon F Cook (2), Dagmar Amtmann (3), Elizabeth Knowlton (2), Richard C Gershon (2)
1. Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA. michael.kallen@northwestern.edu
2. Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
3. Department of Physical Medicine and Rehabilitation, School of Medicine, University of Washington, Seattle, WA, USA.
Abstract
PURPOSE: To evaluate the degree to which applying alternative stopping rules would reduce response burden while maintaining score precision in the context of computer adaptive testing (CAT).

DATA: Analyses were conducted on secondary data comprising CATs administered in a clinical setting at multiple time points (baseline and up to two follow-ups) to 417 study participants who had back pain (51.3%) and/or depression (47.0%). Participant mean age was 51.3 years (SD = 17.2), with a range of 18 to 86. Participants tended to be white (84.7%), relatively well educated (77% with at least some college), female (63.9%), and married or living in a committed relationship (57.4%). The unit of analysis was the individual assessment history (i.e., CAT item response history) from the parent study. Data were first aggregated across all individuals, domains, and time points into an omnibus dataset of assessment histories and then disaggregated by measure for domain-specific analyses. Finally, assessment histories within a "clinically relevant range" (score ≥ 1 SD from the mean in the direction of poorer health) were analyzed separately to explore score level-specific findings.

METHOD: Two sets of CAT administration rules were compared. The original CAT (CATORIG) rules required that at least four and no more than 12 items be administered. If the score standard error (SE) reached a value < 3 points (T score metric) before 12 items were administered, the CAT was stopped. We simulated applying alternative stopping rules (CATALT), which removed the requirement that a minimum of four items be administered and stopped a CAT if responses to the first two items were both associated with best health, if the SE was < 3, if the SE change was < 0.1 (T score metric), or if 12 items had been administered. We then compared score fidelity and response burden, defined as the number of items administered, between CATORIG and CATALT.

RESULTS: CATORIG and CATALT scores varied little, especially within the clinically relevant range, and response burden was substantially lower under CATALT (e.g., 41.2% savings in the omnibus dataset).

CONCLUSIONS: Alternative stopping rules result in substantial reductions in response burden with minimal sacrifice in score precision.
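The two rule sets described under METHOD can be illustrated with a minimal Python sketch that replays a recorded assessment history and reports how many items each rule set would have administered. This is not the authors' simulation code. The function names, the representation of a history as a list of SEs plus a list of "best health" flags, and the reading of "SE change" as the absolute difference between consecutive SEs are all assumptions made for illustration; only the thresholds (SE < 3, SE change < 0.1, minimum of four items, maximum of 12 items) come from the abstract.

```python
from typing import Callable, Sequence

# Thresholds taken from the abstract (T score metric).
SE_THRESHOLD = 3.0
SE_CHANGE_THRESHOLD = 0.1
MAX_ITEMS = 12
MIN_ITEMS_ORIG = 4

StopRule = Callable[[Sequence[float], Sequence[bool]], bool]


def stop_orig(se_history: Sequence[float], best_health: Sequence[bool]) -> bool:
    """CATORIG: stop at 12 items, or once at least 4 items are given and SE < 3."""
    n = len(se_history)
    if n >= MAX_ITEMS:
        return True
    return n >= MIN_ITEMS_ORIG and se_history[-1] < SE_THRESHOLD


def stop_alt(se_history: Sequence[float], best_health: Sequence[bool]) -> bool:
    """CATALT: no minimum item count; any one of four conditions stops the CAT."""
    n = len(se_history)
    if n == 0:
        return False
    if n >= MAX_ITEMS:
        return True
    # Assumption: the "first two items" rule is checked once two items are in.
    if n >= 2 and best_health[0] and best_health[1]:
        return True
    if se_history[-1] < SE_THRESHOLD:
        return True
    # Assumption: "SE change" is the absolute difference between consecutive SEs.
    if n >= 2 and abs(se_history[-2] - se_history[-1]) < SE_CHANGE_THRESHOLD:
        return True
    return False


def items_administered(
    se_history: Sequence[float], best_health: Sequence[bool], rule: StopRule
) -> int:
    """Replay a recorded history; return the item count at which `rule` stops.

    If the rule never fires within the recorded history, the full length is returned.
    """
    for n in range(1, len(se_history) + 1):
        if rule(se_history[:n], best_health[:n]):
            return n
    return len(se_history)


# Hypothetical history: SE after each item, and whether each response
# was the "best health" option. The values are invented for illustration.
se = [6.0, 4.5, 3.8, 3.75, 3.2, 2.9]
flags = [False] * len(se)
print(items_administered(se, flags, stop_orig))  # CATORIG item count
print(items_administered(se, flags, stop_alt))   # CATALT item count
```

In this invented history, the SE-change condition fires under the alternative rules before the SE itself drops below 3, which is the kind of early stop the abstract counts as reduced response burden. A real comparison would also need the score estimate at each stopping point to assess score fidelity, which this sketch does not model.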