Nichola Burton, Michael Burton, Dan Rigby, Clare A M Sutherland, Gillian Rhodes.
Abstract
A common goal in psychological research is the measurement of subjective impressions, such as first impressions of faces. These impressions are commonly measured using Likert ratings. Although these ratings are simple to administer, they are associated with response issues that can limit reliability. Here we examine best-worst scaling (BWS), a forced-choice method, as a potential alternative to Likert ratings for measuring participants' facial first impressions. We find that at the group level, BWS scores correlated almost perfectly with Likert scores, indicating that the two methods measure the same impressions. However, at the individual participant level, BWS outperforms Likert ratings, both in terms of ability to predict preferences in a third task, and in terms of test-retest reliability. These benefits highlight the power of BWS, particularly for use in individual differences research.
Year: 2019 PMID: 31549257 PMCID: PMC6757072 DOI: 10.1186/s41235-019-0183-2
Source DB: PubMed Journal: Cogn Res Princ Implic ISSN: 2365-7464
Fig. 1. An example of a best-worst scaling (BWS) trial. Participants view a subset of the faces to be rated, and select the “best” (in this case, most attractive) and “worst” (in this case, least attractive) from the subset. This “best”/“worst” decision is easy to understand, naturalistic and relies only on the faces presented in the current trial, with no need to remember previous responses. These faces, from the Face Research Lab London Set (DeBruine & Jones, 2017), are for illustration purposes only, and were not used in the studies reported here.
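The text here does not say how BWS trials are turned into per-face scores. As a minimal sketch, assuming the common best-minus-worst counting approach (an assumption, not necessarily the authors' scoring method), each face's score is the number of times it was chosen as "best" minus the number of times it was chosen as "worst", divided by the number of times it was shown. The function name `bws_scores`, the trial format, and the face ids are hypothetical.

```python
from collections import defaultdict


def bws_scores(trials):
    """Best-minus-worst counting scores (assumed scoring rule).

    trials: iterable of (shown, best, worst), where `shown` is the list of
    item ids displayed in one trial. Returns a dict mapping each item to
    (times chosen best - times chosen worst) / times shown, in [-1, 1].
    """
    best = defaultdict(int)
    worst = defaultdict(int)
    shown_count = defaultdict(int)
    for shown, b, w in trials:
        if b not in shown or w not in shown or b == w:
            raise ValueError("best and worst must be distinct items from the trial")
        for item in shown:
            shown_count[item] += 1
        best[b] += 1
        worst[w] += 1
    return {i: (best[i] - worst[i]) / shown_count[i] for i in shown_count}


# Hypothetical responses: three trials over four face ids.
trials = [
    (["f1", "f2", "f3"], "f1", "f3"),
    (["f1", "f2", "f4"], "f1", "f4"),
    (["f2", "f3", "f4"], "f2", "f3"),
]
scores = bws_scores(trials)
```

With these made-up trials, `f1` is always chosen as best and `f3` always as worst, so they sit at the two ends of the scale. Dividing by the number of appearances keeps scores comparable when faces are shown unequally often.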
Log-likelihoods and AICCs for the rank-ordered logistic regression models predicting participants’ rankings for each criterion set from their BWS or Likert scores. Higher log-likelihoods and lower values of Akaike’s bias-corrected information criterion (AICC) (i.e. closer to zero for both measures) indicate better model fit. Models using best-worst scaling (BWS) scores achieved better fit for each of the six criterion sets. This improvement is reflected in the difference between the AICC values of the two models (∆i), which is > 10 for all criterion sets, indicating a substantial improvement in model fit (Symonds & Moussalli, 2011).
| Criterion set | BWS log-likelihood | BWS AICC | Likert log-likelihood | Likert AICC | ∆i |
|---|---|---|---|---|---|
| 1 | −528.74 | 1059.49 | −562.19 | 1126.39 | 66.90 |
| 2 | −317.47 | 636.95 | −366.44 | 734.89 | 97.94 |
| 3 | −509.45 | 1020.91 | −563.55 | 1129.11 | 108.20 |
| 4 | −400.60 | 803.21 | −478.17 | 958.35 | 155.14 |
| 5 | −494.29 | 990.59 | −548.38 | 1098.77 | 108.18 |
| 6 | −489.88 | 981.77 | −561.11 | 1124.23 | 142.46 |
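The ∆i column can be reproduced from the two AICC columns. The short sketch below recomputes it as the Likert AICC minus the BWS AICC for each criterion set and checks the > 10 threshold cited from Symonds & Moussalli (2011). The variable and function names are illustrative only; the numbers are copied from the table above.

```python
# AICC values copied from the table: (criterion set, BWS AICC, Likert AICC).
aicc = [
    (1, 1059.49, 1126.39),
    (2, 636.95, 734.89),
    (3, 1020.91, 1129.11),
    (4, 803.21, 958.35),
    (5, 990.59, 1098.77),
    (6, 981.77, 1124.23),
]


def delta_aicc(bws, likert):
    """Difference in AICC between the Likert model and the BWS model.

    Positive values mean the BWS model has the lower (better) AICC.
    """
    return round(likert - bws, 2)


deltas = {criterion: delta_aicc(bws, likert) for criterion, bws, likert in aicc}

# True when every criterion set exceeds the threshold of 10.
substantial = all(d > 10 for d in deltas.values())
```

The recomputed differences match the reported ∆i values, so every criterion set clears the threshold by a wide margin.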
Mean Cronbach’s alpha calculated from 50 random samples of size N from each of the BWS and Likert conditions. Higher values of alpha indicate increased reliability for the face-level scores in that condition. BWS, best-worst scaling.
| N | BWS | Likert |
|---|---|---|
| 8 | 0.703 | 0.655 |
| 12 | 0.814 | 0.745 |
| 20 | 0.870 | 0.839 |
| 30 | 0.911 | 0.883 |
| 40 | 0.931 | 0.914 |
| 50 | 0.947 | 0.929 |
| 166 | 0.984 | 0.978 |
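The resampling procedure named in the caption can be sketched as follows. This is a minimal illustration, assuming that participants are treated as the "items" of Cronbach's alpha and faces as the cases, and that each of the 50 samples is drawn without replacement; neither detail is confirmed by the text here. The function names, the seed, and the toy data are hypothetical.

```python
import random
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha with participants as items and faces as cases.

    ratings: list of per-participant score lists, one score per face,
    all in the same face order.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two participants")
    # Sum of each participant's variance across faces.
    item_var_sum = sum(variance(r) for r in ratings)
    # Variance across faces of the summed score over participants.
    totals = [sum(col) for col in zip(*ratings)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def mean_alpha(ratings, n, draws=50, seed=0):
    """Mean alpha over `draws` random samples of n participants each."""
    rng = random.Random(seed)
    return mean(cronbach_alpha(rng.sample(ratings, n)) for _ in range(draws))


# Toy data: three participants who score four faces identically.
toy = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
toy_alpha = cronbach_alpha(toy)
toy_mean_alpha = mean_alpha(toy, n=2)
```

When every participant gives the same scores, as in the toy data, alpha reaches its maximum of 1. Averaging over many random samples of size N shows how quickly face-level reliability approaches that ceiling as more participants are added.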
Mean Cronbach’s alpha calculated from 50 random samples of size N from each of the BWS and Likert conditions. Higher values of alpha indicate increased reliability for the face-level scores in that condition. BWS, best-worst scaling.
| N | BWS | Likert |
|---|---|---|
| 8 | 0.822 | 0.778 |
| 12 | 0.893 | 0.827 |
| 20 | 0.924 | 0.901 |
| 30 | 0.951 | 0.931 |
| 40 | 0.962 | 0.948 |
| 50 | 0.970 | 0.958 |
| 122 | 0.988 | 0.983 |