Vignette studies of medical choice and judgement to study caregivers' medical decision behaviour: systematic review.

Lucas M Bachmann, Andrea Mühleisen, Annekatrin Bock, Gerben ter Riet, Ulrike Held, Alfons G H Kessels.

Abstract

BACKGROUND: Vignette studies of medical choice and judgement have gained popularity in the medical literature. Originally developed in mathematical psychology they can be used to evaluate physicians' behaviour in the setting of diagnostic testing or treatment decisions. We provide an overview of the use, objectives and methodology of these studies in the medical field.
METHODS: Systematic review. We searched electronic databases and the reference lists of included studies. We included studies that examined medical decisions of physicians, nurses or medical students using cue weightings derived from answers to structured vignettes. Two reviewers scrutinized abstracts and examined full-text copies of potentially eligible studies. The aim of the included studies, the type of clinical decision, the number of participants, some technical aspects, and the type of statistical analysis were extracted in duplicate, and discrepancies were resolved by consensus.
RESULTS: 30 reports published between 1983 and 2005 fulfilled the inclusion criteria. 22 studies (73%) reported on treatment decisions and 27 (90%) explored the variation of decisions among experts. Nine studies (30%) described differences in decisions between groups of caregivers and ten studies (33%) described the decision behaviour of only one group. Only six studies (20%) compared decision behaviour against an empirical reference of a correct decision. The median number of considered attributes was 6.5 (IQR 4-9), the median number of vignettes was 27 (IQR 16-40). In 17 studies, decision makers had to rate the relative importance of a given vignette; in six studies they had to assign a probability to each vignette. Only ten studies (33%) applied a statistical procedure to account for correlated data.
CONCLUSION: Various studies of medical choice and judgement have been performed to derive caregivers' weightings of the value of clinical information from their answers to structured vignettes. We found that the design and analysis methods used in current applications vary considerably and could be improved in a large number of cases.

Year:  2008        PMID: 18664302      PMCID: PMC2515847          DOI: 10.1186/1471-2288-8-50

Source DB:  PubMed          Journal:  BMC Med Res Methodol        ISSN: 1471-2288            Impact factor:   4.615


Background

Preferences and perceived similarities or differences between choice alternatives can be evaluated using structured vignettes. There are two prominent methods of constructing such models of medical judgements, each with their own literature and set of advocates. These are conjoint analysis, developed in the 1970s to study preference and choice [1], and judgement analysis, also called social judgement theory, developed in the 1950s from Brunswik's lens model [2,3]. The two have developed along very different theoretical lines and have developed somewhat different methodology, although there is considerable overlap. Today there is a large number of marketing applications, where the joint effects of multiple product attributes on product choice have been studied. The types of choices include 'ranking', 'rating', and 'discrete choice'. These methods can be carried forward to the analysis of medical decision making, as medical decisions require judgement under uncertainty. This uncertainty may concern a state, such as the presence of illness, the likelihood of future events, such as those in the natural course of an illness, or the likelihood with which such events may be averted, that is, treatment effects. For many years decision-making research has explored physicians' estimation of probabilities given clinical scenarios [4]. However, there have been concerns whether physicians' probability setting leads to consistent ratings [5]. Moreover, cognitive psychological research shows that physicians do not apply probabilities as suggested by decision-making theory but use their own heuristics to decide [6-8]. Studies of medical choice and judgement offer a way to elicit the public's, patients' and caregivers' views on healthcare that circumvents probability statements [9-11]. 
The technique is gaining widespread use in healthcare and has been applied in different areas, for example to establish patients' preferences in the doctor-patient relationship [12] or to determine optimal treatments for patients [13]. Increasingly, discrete choice analyses are being employed to study how physicians weigh clinical information in the diagnostic work-up. In particular, respondents are asked to rank, rate, or choose between simulated clinical cases that vary in the values of different symptoms, according to the likelihood that the case has a certain illness or needs a certain treatment. Comparison with the results of clinical studies allows an analysis of potential discrepancies (e.g. undervaluation of signs and symptoms, overvaluation of test results). Moreover, such comparisons with reference data from clinical studies allow physicians' behaviour to be linked to illness probabilities and therefore allow (implicit) decision thresholds to be examined. A considerable number of studies have been published recently. We provide an overview of existing reports, present an inventory of their objectives and methods, and evaluate them using systematic review methodology.

Methods

We defined a study of medical choice and judgement as an investigation in which preferences were elicited in physicians, nurse practitioners or medical students and that allowed the estimation of the relative importance of different characteristics.
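To make the phrase "estimation of the relative importance of different characteristics" concrete, the following minimal sketch regresses one hypothetical respondent's ratings on the cue values of each vignette. The data, the function names `cue_weights` and `relative_importance`, and the range-based importance measure are assumptions chosen for illustration; they are not taken from any of the included studies.

```python
import numpy as np

def cue_weights(cues, ratings):
    """Least-squares weights (intercept first) for one respondent's vignette ratings."""
    cues = np.asarray(cues, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    design = np.column_stack([np.ones(len(cues)), cues])
    coef, *_ = np.linalg.lstsq(design, ratings, rcond=None)
    return coef

def relative_importance(coef, cues):
    """Share of each cue: |weight| times the cue's range, normalised to sum to 1."""
    cues = np.asarray(cues, dtype=float)
    spans = cues.max(axis=0) - cues.min(axis=0)
    impact = np.abs(coef[1:]) * spans
    return impact / impact.sum()

# Hypothetical data: 8 vignettes built from 3 binary cues (all combinations).
cues = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)])
# Hypothetical respondent who relies on cue 1 twice as much as cue 2 and ignores cue 3.
ratings = 10 + 40 * cues[:, 0] + 20 * cues[:, 1] + 0 * cues[:, 2]

coef = cue_weights(cues, ratings)
importance = relative_importance(coef, cues)
print("weights:", coef)
print("relative importance:", importance)
```

Fitting one regression per respondent mirrors the idea of individual judgement models; pooling all respondents into one model is where the correlated-data issue discussed later arises.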

Search strategy

We performed electronic searches in Medline, PsycINFO and CINAHL (Ovid® versions). Web of Science (ISI Web of Science®) was used to locate studies that cited four key papers [14-17]. The last update search was performed on 25 March 2005. The exact search strategy may be obtained from the authors.

Inclusion criteria

Eligible articles for this review had to infer cue or attribute weighting from answers to structured vignettes and had to report on caregivers' decision making.

Data extraction strategy

We developed a data extraction form based on the assessment of three articles [17-19]. The form contained twelve items describing a study's salient features of context, design and analysis (for details see Table 1).
Table 1

Salient features of studies included in the systematic review.

Author | Year of publication | Clinical problem | Coded values in column order: type of decision (1), aim + reference (2), number of participants, type of participants (3), number of vignettes, number of attributes, source of attributes (4), type of outcome (5), type of analysis (6)
Kirwan | 1983 | Rheumatoid arthritis | 3324175222
Wigton | 1986 | Pulmonary embolism | 15 → B554, 1278161
Smith | 1987 | Tube feeding | 222224, 1126122
Holmes | 1989 | Hypertension | 13983164121
Von Preyss-Friedman | 1992 | Tube feeding | 221414, 4166121
Lee | 1994 | Surgical patients | 22344, 3, 2308421
Harries | 1996 | Diversity of diseases | 2332413013321
McKinlay | 1997 | Breast cancer | 41128432612, 41
Shea | 1997 | Bile duct stones | 426244, 427811, 2, 41
Skaner | 1998 | Heart failure | 112744010112
Timmermans | 1997 | Colonic emergency | 221024163121
Van Miltenburg-Van Zijl | 1997 | Unstable angina | 22184, 4127222
Ross | 1999 | Depression | 4140746232, 41
Backlund | 2000 | Hypercholesterolemia | 25 → A384408221
Haggerty | 2000 | Fetal risk situation | 3157323210121
Skaner | 2000 | Heart failure | 15 → C704, 4, 1408212
Bouma | 2001 | Aortic stenosis | 222754, 43210921
Engelsbel | 2001 | Ectopic pregnancy | 14274166922
Kee | 2002 | Renal disease | 2184501122, 31
Sorum | 2002 | Acute otitis media | 42754, 4461531, 22
Sorum | 2002 | Acute otitis media | 44754, 4461511, 42
Wahlström | 2002 | Asthma | 25 → A3144, 4, 4185321
Bouma | 2004 | Aortic stenosis | 25 → B344329921
Sorum | 2003 | Prostate cancer | 42654, 4, 432511, 22
Tamayo-Sarver | 2003 | Opioid analgesic | 212872433341
Mays | 2004 | Vaccine program | 212242134921
Raley | 2004 | Papilloma vaccine | 211814134921
Arnold | 2005 | Resp. tract infection | 212574, 4164121
Lee | 2005 | Postoperat. recovery | 51604, 3, 283352
Tiemeier | 2002 | Depression | 25 → A4494, 4, 422734, 61

1) 1 = diagnosis, 2 = treatment, 3 = risk, prognosis, 4 = diagnosis & treatment, 5 = other

2) 1st digit describes aim: 1 = descriptive, 2 = group comparison, 3 = consistency, 4 = change over time, 5 = comparison with reference

2nd digit describes reference: A = guidelines, B = actual patients, C = clinical study

3) 1 = student, 2 = paramedic, 3 = physician in training, 4 = expert

4) 1 = literature, 2 = patients, 3 = expert, 4 = guidelines, 5 = no information

5) 1 = probability, 2 = rating, 3 = ranking, 4 = yes/no choice, 5 = discrete choice, 6 = >2 alternatives

6) 1 = no adjustment for correlated data, 2 = adjustment for correlated data

Besides study descriptors such as first author and year of publication, we extracted information on the studies' objectives, the clinical problem, who the decision-maker was, the type of decision/preference (diagnosis, treatment, risk, prognosis, diagnosis & treatment, and other), the number of participants and the authors' aims. The objectives were classified into five categories: description of preferences in one group of caregivers (1); comparison of two or more groups, such as different professions or different levels of competence (2); assessment of the consistency of caregivers' judgements with their actual decisions or with their direct rating of the attributes (3); assessment of changes in preferences over time, e.g. after attending a course (4); and comparison of caregivers with guidelines (5a), actual patients' preferences (5b), or the findings of one or more clinical studies (5c). We also recorded the number of vignettes, the number of attributes of each vignette and the rationale behind the selection of the attributes. Finally, we documented how participants were asked to respond to the vignettes (rating (yes/no or otherwise), ranking, probability estimates, or discrete choice) and the way, if any, in which authors accounted for correlated data in the analysis.
We extracted this item because observations resulting from these experiments are typically not independent: each respondent evaluates each of the vignettes. This makes the data from one respondent more alike than one would expect under the assumption of independence, and therefore the standard errors of the attribute weights could be underestimated. We searched for any statistical method that adjusts the standard errors for this intra-group (within-respondent) correlation. All studies were assessed in duplicate. Discordant scores based on reading errors were corrected. Discordant scores based on real differences in interpretation were discussed and resolved through consensus.
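The effect of ignoring within-respondent correlation can be sketched with simulated data. The example below compares naive ordinary least squares standard errors with cluster-robust (sandwich) standard errors, clustering on the respondent. This is only one of several possible adjustments (mixed models and generalised estimating equations are others) and is not presented as the method used by any included study. The data are simulated under the assumption that respondents differ in how strongly they weigh a cue, and the function name `ols_naive_and_cluster_se` is hypothetical.

```python
import numpy as np

def ols_naive_and_cluster_se(X, y, groups):
    """OLS coefficients with naive and cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # Naive SEs: treat all n ratings as independent observations.
    sigma2 = resid @ resid / (n - k)
    naive_se = np.sqrt(np.diag(sigma2 * xtx_inv))

    # Cluster-robust SEs: sum the score vectors within each respondent first
    # (no small-sample correction is applied in this sketch).
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        rows = groups == g
        score = X[rows].T @ resid[rows]
        meat += np.outer(score, score)
    cluster_se = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return beta, naive_se, cluster_se

# Simulated (not real) data: 30 respondents x 16 vignettes, one binary cue.
rng = np.random.default_rng(0)
n_resp, n_vig = 30, 16
groups = np.repeat(np.arange(n_resp), n_vig)
cue = np.tile(np.arange(n_vig) % 2, n_resp).astype(float)
own_weight = 2.0 + rng.normal(0.0, 1.0, n_resp)  # each respondent's own cue weight
y = own_weight[groups] * cue + rng.normal(0.0, 0.5, n_resp * n_vig)
X = np.column_stack([np.ones_like(cue), cue])

beta, naive_se, cluster_se = ols_naive_and_cluster_se(X, y, groups)
print("cue weight:", beta[1], "naive SE:", naive_se[1], "cluster SE:", cluster_se[1])
```

In this simulated setting the naive standard error of the cue weight is noticeably smaller than the cluster-robust one, which is the pattern the paragraph above warns about.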

Results

The searches retrieved 2001 records. Full papers of 81 potentially relevant studies were obtained. In total, 51 articles did not meet the inclusion criteria and were excluded after reading the full reports, leaving 30 reports published between 1983 and 2005 for evaluation (see the flowchart in Figure 1). The salient features of the included studies are shown in Table 1.
Figure 1

Study flow.

Study flow.

General aspects

Although the first study was published in 1983, 24 studies (80%) were published after 1995. Twenty-seven out of thirty studies examined decision behaviour of medical experts [15,17-42]. In half of the studies more than one type of respondent was surveyed. Twenty-eight different medical problems were addressed. Twenty-two (73%) studies examined treatment decisions. Eleven studies (37%) asked for a preferred diagnostic decision, sometimes (6 studies) in combination with a treatment decision.

Objectives

Ten studies (33 percent) aimed at describing decision preferences of specific groups of participants [20,24,26,27,29,30,32,38,43,44] and nine studies (30 percent) described decision preference differences between groups [18,25,28,31,34,36,37,40,41]. Three studies explored the consistency of decisions between groups of experts [15,23,45] and two studies examined change of preferences after an intervention [22,35]. Only six studies (20 percent) compared decision behaviour against some sort of empirical reference such as a guideline [21,39,42] (n = 3), actual patient data [17,19] (n = 2) or the result of a clinical study [33].

Design

The median number of attributes was 6.5 (interquartile range [IQR] 4–9, range 2–15). In 20 studies (67%) the selection of attributes was based on sources such as the literature [17,20,27,28,31,32,34,35,37,40,43,45] (12 studies), expert opinion (7 studies) or guidelines (1 study). In five studies patient files [15,21,24,33,41] were used to construct the vignettes. The median number of vignettes was 25 (IQR 16–32), ranging from 3 to 130. Authors used several response modes for the vignettes; in eight cases they used more than one response mode. In 23 cases authors used a rating procedure [15,18,20-25,27-31,34-37,40-45], in which respondents had to rate the relative importance of a given vignette or assign a probability (n = 6) to a diagnosis or outcome [31-33,35-37]. One study used a ranking design, in which respondents had to arrange the attributes in descending order of importance [24]. In six studies respondents could reply with a yes/no choice [27,30,31,35,38,39]. One study used a conventional discrete choice mode, in which respondents, given two or more vignettes, had to select the one with the highest likelihood of postoperative recovery [26].

Analysis

Twenty studies (67%) did not correct for correlated data. Only ten studies (33%) applied a statistical procedure to account for the correlation within the data [15,22,26,32-37,41].

Discussion

This review has two main findings. First, studies of medical choice and judgement are regularly used in the medical field to explore healthcare providers' decision behaviour or preferences. Second, we found a broad spectrum of different methods, and both design and analysis were suboptimal in some cases.

Cognitive burden/complexity

One quarter of the included studies either contained vignettes with more than nine attributes or compiled sets of over forty vignettes in the same experiment. Empirical evidence showing that these figures are too high is scarce, and there is much controversy, particularly about the number of vignettes [46]. From a cognitive psychological point of view, both figures appear to be very high and could bias the results. Such bias typically occurs because respondents are unable to integrate and process large quantities of information provided simultaneously, or because respondents lose attention when sifting through too many vignettes. However, evidence suggests that more attributes, more choice options and more vignettes decrease response reliability but do not bias mean responses [46]. As a rule of thumb, the number of attributes per vignette should not exceed six to eight [47-49]. There is much opinion and controversy about the maximum acceptable number of vignettes, but little rigorous evidence [46]. A re-analysis of 21 commercial studies suggests a maximum of 20 vignettes [48], and a review of discrete choice experiments evaluating healthcare shows that the number of vignettes seldom exceeds 16 [49]. Furthermore, the majority of studies used either a ranking or a rating response mode. These two modes imply very strong assumptions about human cognitive abilities, making it more likely that measures will be biased and invalid [50]. We therefore recommend the choice-based approach.
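The tension between the number of attributes and the number of vignettes can be illustrated with simple counting. The sketch below enumerates every combination of attribute levels (a full factorial design) for six hypothetical binary attributes. The attribute names and the helper `full_factorial` are assumptions for illustration, and nothing here implies that the included studies used full factorial designs.

```python
from itertools import product

def full_factorial(levels):
    """All combinations of attribute levels; `levels` maps attribute name -> list of values."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]

# Six hypothetical binary attributes (e.g. symptom absent = 0, present = 1).
levels = {f"attr{i}": [0, 1] for i in range(1, 7)}
vignettes = full_factorial(levels)
n_vignettes = len(vignettes)
print(n_vignettes, "vignettes for", len(levels), "binary attributes")
```

Even at the lower end of the six-to-eight attribute rule of thumb, presenting every combination would exceed the vignette counts cited above, so a study in this range would have to show respondents only a subset of the possible vignettes.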

Validity, usefulness of study objectives

In contrast to applications in marketing research, where the main aim of a study is to identify opinions regarding a new product, we would be particularly interested to learn about the correctness of caregivers' weighting of the value of clinical information in decisions. While there is no normative benchmark for a "correct" product, there is usually one in medical judgement if clinical studies are available. For example, if a study of medical choice and judgement showed that physicians consistently attribute high weights to relatively uninformative laboratory tests but undervalue the informativeness of cues from clinical examination, this would point to something that needs to be improved, perhaps with an educational intervention. The method would also allow the change in preferences after an educational intervention to be assessed. Most studies did not compare the attributed weights to a normative benchmark such as the results of a clinical study. We found only one of 30 studies that actually examined this and another five that used a different normative reference (guidelines or patient files). In the absence of a normative benchmark, these studies leave it to the reader to approve or disapprove of the results. Moreover, discrepancies between different groups of participants are hard to interpret because they could be explained by different clinical circumstances or other factors rather than by group-specific differences. On the other hand, there are medical situations in which views about optimal choices are controversial. In these situations, studies that do not compare caregivers' decision behaviour (or preferences) to some norm may still be useful in that they allow the examination of present opinions.

Statistical model

The majority of studies did not account for correlated data in the analysis. Correlated data occur because each respondent assesses several vignettes. Not accounting for this yields standard errors of the attribute weights that are too small and can mimic a statistically significant association where in fact there is none. Unfortunately, guidelines on the conduct of conjoint analyses have not yet reached consensus about the optimal way to analyse correlated data.
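One textbook way to gauge the size of this problem, not taken from the reviewed studies, is the design effect for clustered observations. Assuming that every respondent rates the same number of vignettes $m$ and that $\rho$ denotes the within-respondent correlation of the ratings (both symbols are introduced here for illustration), the variance of a mean rating is inflated approximately by:

```latex
\[
\mathrm{DEFF} = 1 + (m - 1)\,\rho,
\qquad
\mathrm{SE}_{\text{adjusted}} \approx \mathrm{SE}_{\text{naive}} \sqrt{\mathrm{DEFF}}
\]
```

For instance, with an assumed $m = 25$ vignettes per respondent and an assumed $\rho = 0.1$, $\mathrm{DEFF} = 1 + 24 \times 0.1 = 3.4$, so the naive standard error would have to be multiplied by roughly $\sqrt{3.4} \approx 1.84$. The approximation applies to quantities that are constant within a respondent; for cues that vary between the vignettes shown to one respondent, the required correction can differ.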

Limitations

What are the limitations of this review? We think that the search and appraisal procedures were reliable. However, first, classifications were sometimes difficult to make because of unclear descriptions in the articles, and we did not contact authors to clarify these uncertainties. Second, there have been two prominent methods of constructing linear models of medical judgements, each with their own literature and set of advocates. These are conjoint analysis, developed in the 1970s to study preference and choice [1], and judgement analysis, also called social judgement theory, developed in the 1950s from Brunswik's lens model [2,3]. In this review we did not make a distinction between the two methods because there is substantial overlap in methodology. Arguably this is a weakness of our study. However, since we were interested in providing an overview of all studies that examined medical decisions of caregivers using cue weightings from answers to structured vignettes, whatever method they applied, we feel that our approach has its own merit.

Future research

Our review indicates that current applications of conjoint and judgement analysis in the medical field remain suboptimal in some instances. We think that researchers should consider our propositions to ensure internal validity. Moreover, we believe that studies investigating caregivers' judgements are most valuable if they allow comparisons with some norm and if they include an assessment of deviations from that norm. Our review found only a few such investigations. From a more methodological point of view, we agree with a statement in a recent editorial that research is required to learn whether individuals behave in reality as they state in a hypothetical context [51].

Conclusion

We believe that studies of medical choice and judgement offer many attractive and new insights into medical action. Provided that both methods and application evolve they offer a unique opportunity to improve quality of care.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AGHK conceived of the study and LMB obtained funding. AGHK and LMB designed the study, supervised the work and drafted the manuscript. AM and AB carried out the data extraction. AM, AB, UH and GtR participated in the design of the study and gave important conceptual input. All authors read and approved the final manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here:
  43 in total

1.  Medicine. Communicating statistical information.

Authors:  U Hoffrage; S Lindsey; R Hertwig; G Gigerenzer
Journal:  Science       Date:  2000-12-22       Impact factor: 47.728

2.  A comparison of patients' and health care professionals' preferences for symptoms during immediate postoperative recovery and the management of postoperative nausea and vomiting.

Authors:  Anna Lee; Tony Gin; Angel S C Lau; Floria F Ng
Journal:  Anesth Analg       Date:  2005-01       Impact factor: 5.108

3.  Variability in treatment advice for elderly patients with aortic stenosis: a nationwide survey in The Netherlands.

Authors:  B J Bouma; J H van der Meulen; R B van den Brink; A E Arnold; A Smidts; L H Teunter; K I Lie; J G Tijssen
Journal:  Heart       Date:  2001-02       Impact factor: 5.994

4.  Sex and attitude: a randomized vignette study of the management of depression by general practitioners.

Authors:  S Ross; K Moffat; A McConnachie; J Gordon; P Wilson
Journal:  Br J Gen Pract       Date:  1999-01       Impact factor: 5.386

5.  The use of clinical information in diagnosing chronic heart failure: a comparison between general practitioners, cardiologists, and students.

Authors:  Y Skånér; J Bring; B Ullman; L E Strender
Journal:  J Clin Epidemiol       Date:  2000-11       Impact factor: 6.437

6.  Factors influencing GPs' decisions on the treatment of hypercholesterolaemic patients.

Authors:  L Backlund; B Danielsson; J Bring; L E Strender
Journal:  Scand J Prim Health Care       Date:  2000-06       Impact factor: 2.581

7.  Experienced obstetric nurses' decision-making in fetal risk situations.

Authors:  L A Haggerty; R L Nuttall
Journal:  J Obstet Gynecol Neonatal Nurs       Date:  2000 Sep-Oct

8.  Physicians' attitudes toward tube feeding chronically ill nursing home patients.

Authors:  S M Von Preyss-Friedman; R F Uhlmann; K C Cain
Journal:  J Gen Intern Med       Date:  1992 Jan-Feb       Impact factor: 5.128

9.  Determining priority for liver transplantation: a comparison of cost per QALY and discrete choice experiment-generated public preferences.

Authors:  Julie Ratcliffe; Martin Buxton; Tracey Young; Louise Longworth
Journal:  Appl Health Econ Health Policy       Date:  2005       Impact factor: 2.561

10.  Gynecologists' attitudes regarding human papilloma virus vaccination: a survey of Fellows of the American College of Obstetricians and Gynecologists.

Authors:  Janice C Raley; Kristen A Followwill; Gregorgy D Zimet; Kevin A Ault
Journal:  Infect Dis Obstet Gynecol       Date:  2004 Sep-Dec