C J Ravesloot, M F Van der Schaaf, A M M Muijtjens, C Haaring, C L J J Kruitwagen, F J A Beek, J Bakker, J P J Van Schaik, Th J Ten Cate.
Abstract
Formula scoring (FS) is the use of a don't know option (DKO) with subtraction of points for wrong answers. Its effect on the construct validity and reliability of progress test scores is a subject of debate. Choosing the DKO may be affected not only by knowledge level but also by risk-taking tendency, and may thus introduce construct-irrelevant variance into the knowledge measurement. On the other hand, FS may result in more reliable test scores. To evaluate the impact of FS on the construct validity and reliability of progress test scores, a progress test for radiology residents was divided into two tests of 100 parallel items (A and B). Each test had an FS and a number-right (NR) version: A-FS, B-FS, A-NR, and B-NR. The 337 participants were randomly divided into two groups. One group took test A-FS followed by B-NR, and the second group took test B-FS followed by A-NR. Evidence for impaired construct validity was sought in a hierarchical regression analysis by investigating how much of the variance in participants' FS scores was explained by the DKO score, compared with the contribution of knowledge level (NR score), while controlling for Group, Gender, and Training length. Cronbach's alpha was used to estimate NR-score and FS-score reliability per year group. The NR score explained 27% of the variance of the FS score [F(1,332) = 219.2, p < 0.0005], the DKO score together with the interaction of DKO and Gender explained 8% [F(2,330) = 41.5, p < 0.0005], and the interaction of DKO and NR explained 1.6% [F(1,329) = 16.6, p < 0.0005], supporting our hypothesis that FS introduces construct-irrelevant variance into the knowledge measurement. However, NR scores showed considerably lower reliabilities than FS scores (mean Cronbach's alphas across year-test groups were 0.62 and 0.74, respectively). Decisions about using FS in progress tests should carefully weigh systematic measurement error against random measurement error.
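The two scoring rules compared in the abstract can be illustrated with a minimal sketch. It assumes one point is subtracted per wrong answer (consistent with the table definition of the FS score as percentage correct minus incorrect) and encodes a response as `True` (correct), `False` (incorrect), or `None` (don't know). The function names and the response encoding are illustrative, not taken from the study.

```python
from typing import Optional, Sequence


def fs_scores(responses: Sequence[Optional[bool]]) -> dict:
    """Score one participant under formula scoring (FS).

    True = correct, False = incorrect, None = don't know option chosen.
    Assumption: each wrong answer costs exactly one point.
    """
    n = len(responses)
    correct = sum(1 for r in responses if r is True)
    incorrect = sum(1 for r in responses if r is False)
    dont_know = sum(1 for r in responses if r is None)
    return {
        # FS score: percentage correct minus percentage incorrect
        "fs_pct": 100.0 * (correct - incorrect) / n,
        # DKO score: percentage of items answered "don't know"
        "dko_pct": 100.0 * dont_know / n,
    }


def nr_score(responses: Sequence[bool]) -> float:
    """Score one participant under number-right (NR): percentage correct."""
    return 100.0 * sum(1 for r in responses if r) / len(responses)


# Hypothetical 10-item example: 6 correct, 2 incorrect, 2 don't know.
example_fs = fs_scores([True] * 6 + [False] * 2 + [None] * 2)
# The same hypothetical participant guessing on every item under NR conditions.
example_nr = nr_score([True] * 7 + [False] * 3)
```

Under FS, a participant can raise the score by withholding uncertain answers, which is why the DKO score may reflect risk-taking tendency as well as knowledge.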
Keywords: Construct validity; Construct-irrelevant variance; Don’t know option; Formula scoring; Progress testing; Reliability; Risk-taking tendency
Year: 2015 PMID: 25912621 PMCID: PMC4639571 DOI: 10.1007/s10459-015-9604-2
Source DB: PubMed Journal: Adv Health Sci Educ Theory Pract ISSN: 1382-4996 Impact factor: 3.853
Group characteristics and test results
| Variables | Group 1 (Tests A-FS, B-NR) | Group 2 (Tests B-FS, A-NR) |
|---|---|---|
| Number of participants (n) | 168 | 169 |
| Gender (male:female) | 95:73 | 94:75 |
| TR, training length in years, Mean (SD) | 2.4 (1.4) | 2.3 (1.4) |
| FS score, percentage correct minus incorrect under FS conditions, Mean (SD) | A-FS 31.5 (17.4) | B-FS 34.3 (16.3) |
| NR score, percentage correct under NR conditions, Mean (SD) | B-NR 67.5 (9.6) | A-NR 65.4 (7.9) |
| DKO score, percentage don’t know under FS conditions, Mean (SD) | 26.4 (19.2) | 29.8 (19.8) |
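The FS and DKO means in the table can be split into approximate percentages of correct and incorrect answers under FS conditions. The sketch below assumes every item was answered as correct, incorrect, or don't know (no items left blank), so that correct + incorrect + DKO = 100 and FS = correct − incorrect. Because both relations are linear, they also hold for group means. The function name is illustrative.

```python
def split_fs_responses(fs_pct: float, dko_pct: float) -> tuple:
    """Recover (correct %, incorrect %) from an FS score and a DKO score.

    Assumption: correct + incorrect + don't know = 100 (no blank items).
    """
    answered = 100.0 - dko_pct           # percentage of items actually answered
    correct = (answered + fs_pct) / 2.0  # from c + w = answered, c - w = fs
    incorrect = (answered - fs_pct) / 2.0
    return correct, incorrect


# Group means taken from the table above.
group1_correct, group1_incorrect = split_fs_responses(fs_pct=31.5, dko_pct=26.4)
group2_correct, group2_incorrect = split_fs_responses(fs_pct=34.3, dko_pct=29.8)
```

Under this assumption, both groups answered a little over half of the items correctly under FS conditions, with the remainder split between wrong answers and the don't know option.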
Results of the sequential (hierarchical) multiple regression analysis of the dependent variable formula scoring (FS) score with independent variables Gender, training length (TR), Group, number right (NR) score, don’t know option (DKO) score, and the interactions of Gender and DKO score (Gender × DKO score) and of NR score and DKO score (NR score × DKO score)
| Model | Independent variable | Regression coefficient b | p | SE | Standardized regression coefficient β | R² change (∆R²) | Importance | R² |
|---|---|---|---|---|---|---|---|---|
| 1 | | | | | | 0.327 | 0.57 | 0.33 |
| | Intercept | 32.13 | 0.000 | 1.26 | | | | |
| | Group | 2.91 | 0.056 | 1.52 | 0.09 | | | |
| | Training-length (years) | 6.61 | 0.000 | 0.53 | 0.57 | | | |
| | Gender (0:male; 1:female) | −1.54 | 0.315 | 1.53 | −0.05 | | | |
| 2 | | | | | | 0.268 | 0.52 | 0.60 |
| | Intercept | 31.09 | 0.000 | 0.98 | | | | |
| | Group | 5.31 | 0.000 | 1.19 | 0.16 | | | |
| | Training-length | 2.56 | 0.000 | 0.49 | 0.22 | | | |
| | Gender | −1.90 | 0.110 | 1.19 | −0.06 | | | |
| | Number right score | 1.20 | 0.000 | 0.08 | 0.63 | | | |
| 3 | | | | | | 0.081 | 0.28 | 0.68 |
| | Intercept | 30.27 | 0.000 | 0.89 | | | | |
| | Group | 5.93 | 0.000 | 1.07 | 0.18 | | | |
| | Training-length | 0.40 | 0.434 | 0.51 | 0.03 | | | |
| | Gender | −0.94 | 0.380 | 1.07 | −0.03 | | | |
| | Number right score | 0.99 | 0.000 | 0.08 | 0.52 | | | |
| | Don’t know option score | −0.39 | 0.000 | 0.04 | −0.45 | | | |
| | Gender × DKO score | 0.14 | 0.014 | 0.06 | 0.11 | | | |
| 4 | | | | | | 0.016 | 0.13 | 0.69 |
| | Intercept | 28.99 | 0.000 | 0.92 | | | | |
| | Group | 6.37 | 0.000 | 1.05 | 0.19 | | | |
| | Training-length | 0.29 | 0.560 | 0.50 | 0.03 | | | |
| | Gender | −1.26 | 0.232 | 1.05 | −0.04 | | | |
| | Number right score | 0.96 | 0.000 | 0.08 | 0.50 | | | |
| | Don’t know option score | −0.43 | 0.000 | 0.04 | −0.50 | | | |
| | Gender × DKO score | 0.10 | 0.078 | 0.06 | 0.08 | | | |
| | NR score × DKO score | −0.11 | 0.000 | 0.03 | −0.14 | | | |
Number right score and don’t know option score are expressed as percentages of the maximum attainable score
Continuous independent variables training-length, number right score and don’t know option score are centered on their mean value
For scaling purposes interaction NR score × DKO score is defined as (NR score × DKO score)/(standard deviation of NR score)
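The sequential entry of predictor blocks, the mean-centering, and the scaled interaction term described in the notes above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis. The data-generating numbers are arbitrary, the 0/1 coding of Group is an assumption, and the interaction terms are assumed to be built from the centered scores. The reported Importance values match the square root of ∆R² (for example √0.327 ≈ 0.57), so the sketch prints that quantity too; this reading is inferred from the numbers rather than stated in the table.

```python
import numpy as np


def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R^2 of an ordinary least squares fit of y on X (intercept added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Enter predictor blocks cumulatively; return a list of (R^2, delta R^2)."""
    results, cols, prev = [], [], 0.0
    for block in blocks:
        cols.append(block)
        r2 = r_squared(np.column_stack(cols), y)
        results.append((r2, r2 - prev))
        prev = r2
    return results


# Synthetic data only; none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 337
group = rng.integers(0, 2, n).astype(float)   # assumed 0/1 coding
gender = rng.integers(0, 2, n).astype(float)  # 0: male, 1: female
training = rng.uniform(0.0, 5.0, n)
nr = 60.0 + 3.0 * training + rng.normal(0.0, 8.0, n)
dko = 30.0 - 4.0 * training + rng.normal(0.0, 15.0, n)
fs = 30.0 + (nr - nr.mean()) - 0.4 * (dko - dko.mean()) + rng.normal(0.0, 8.0, n)

# Center the continuous predictors on their mean, as in the table notes.
training_c = training - training.mean()
nr_c = nr - nr.mean()
dko_c = dko - dko.mean()
# Interaction scaled by the standard deviation of the NR score.
nr_x_dko = nr_c * dko_c / nr.std(ddof=1)
gender_x_dko = gender * dko_c

blocks = [
    np.column_stack([group, training_c, gender]),  # model 1: control variables
    nr_c,                                          # model 2: + NR score
    np.column_stack([dko_c, gender_x_dko]),        # model 3: + DKO, Gender x DKO
    nr_x_dko,                                      # model 4: + NR x DKO
]
results = hierarchical_r2(blocks, fs)
for model, (r2, delta) in enumerate(results, start=1):
    print(model, round(r2, 3), round(delta, 3), round(max(delta, 0.0) ** 0.5, 2))
```

Each ∆R² is the additional share of FS-score variance explained by the block entered at that step, beyond all blocks entered before it.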
Reliability (Cronbach’s alpha) obtained with tests A and B under formula scoring conditions (tests A-FS and B-FS) and number-right conditions (tests A-NR and B-NR) in each of the five postgraduate year groups, with the residents divided into experimental Groups 1 and 2
| Postgraduate year group | Formula scoring: Group 1ᵃ, Test A-FSᵇ | Number-right: Group 2, Test A-NR | Formula scoring: Group 2, Test B-FS | Number-right: Group 1, Test B-NR |
|---|---|---|---|---|
| 1 | 0.70* | 0.55* | 0.81** | 0.56** |
| 2 | 0.77** | 0.58** | 0.76* | 0.65* |
| 3 | 0.69 | 0.63 | 0.81* | 0.72* |
| 4 | 0.78** | 0.61** | 0.70* | 0.58* |
| 5 | 0.75** | 0.56** | 0.59** | 0.79** |
Significance of difference in reliability formula scoring versus number-right: * p < 0.05; ** p < 0.001
ᵃ Number of residents in experimental Groups 1 and 2: 168 and 169, respectively
ᵇ Number of items in tests A and B: 97 and 98, respectively
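As a quick consistency check, the mean alphas quoted in the abstract (0.74 for FS, 0.62 for NR) can be reproduced by averaging the ten year-test group values in the table above. The sketch also includes the standard Cronbach's alpha formula for a participants × items score matrix; the function name and the toy matrix are illustrative and do not come from the study data.

```python
import numpy as np


def cronbach_alpha(item_scores) -> float:
    """Cronbach's alpha for a (participants x items) matrix of item scores."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]                                  # number of items
    item_var_sum = x.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)           # variance of total scores
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


# Alphas copied from the table: tests A and B, postgraduate years 1 to 5.
fs_alphas = [0.70, 0.77, 0.69, 0.78, 0.75, 0.81, 0.76, 0.81, 0.70, 0.59]
nr_alphas = [0.55, 0.58, 0.63, 0.61, 0.56, 0.56, 0.65, 0.72, 0.58, 0.79]

mean_fs = sum(fs_alphas) / len(fs_alphas)
mean_nr = sum(nr_alphas) / len(nr_alphas)
print(round(mean_fs, 2), round(mean_nr, 2))

# Toy check: two perfectly parallel items give alpha = 1.
toy_alpha = cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]])
```

Under FS conditions an item score can take three values (correct, incorrect, don't know) rather than two, so the same formula is applied to a matrix with a wider score range.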
Results of the simultaneous multiple regression analysis of dependent variable don’t know option (DKO) score with independent variables group, training length (TR), number right (NR) score, and Gender
| Independent variable | Regression coefficient b | p | SE | Standardized regression coefficient β | R² |
|---|---|---|---|---|---|
| | | | | | 0.51 |
| Intercept | 25.84 | 0.000 | 1.25 | | |
| Group | 1.88 | 0.216 | 1.51 | 0.05 | |
| Training-length (years) | −6.62 | 0.000 | 0.63 | −0.49 | |
| Number right score | −0.68 | 0.000 | 0.10 | −0.31 | |
| Gender (0:male; 1:female) | 3.05 | 0.045 | 1.51 | 0.08 | |
Number right score and don’t know option score are expressed as percentages of the maximum attainable score
Continuous independent variables training-length, and number right score are centered on their mean value
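To show how the coefficients in this table combine, the sketch below evaluates the fitted equation for a hypothetical resident. It assumes Group is coded as a 0/1 indicator (the table does not say which group is coded 1) and takes training length and NR score as already mean-centered inputs, since the overall sample means used for centering are not reported here. The function name is illustrative.

```python
def predict_dko(group: int, training_centered: float,
                nr_centered: float, gender: int) -> float:
    """Predicted DKO score (%) from the unstandardized coefficients above.

    group: assumed 0/1 indicator; gender: 0 = male, 1 = female.
    training_centered: years of training minus the sample mean.
    nr_centered: NR score (%) minus the sample mean.
    """
    return (25.84
            + 1.88 * group
            - 6.62 * training_centered
            - 0.68 * nr_centered
            + 3.05 * gender)


# Male resident in the reference group with average training and NR score.
baseline = predict_dko(group=0, training_centered=0.0, nr_centered=0.0, gender=0)
# Same resident with one additional year of training.
one_more_year = predict_dko(group=0, training_centered=1.0, nr_centered=0.0, gender=0)
# Female resident, otherwise identical to the baseline.
female = predict_dko(group=0, training_centered=0.0, nr_centered=0.0, gender=1)
```

With centered predictors the intercept is the predicted DKO score for a male resident in the reference group with average training length and NR score, and each extra year of training lowers the predicted DKO score by 6.62 percentage points.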