| Literature DB >> 28762607 |
Holly Walton, Aimee Spector, Ildiko Tombor, Susan Michie.
Abstract
PURPOSE: Understanding the effectiveness of complex, face-to-face health behaviour change interventions requires high-quality measures to assess fidelity of delivery and engagement. This systematic review aimed to (1) identify the types of measures used to monitor fidelity of delivery of, and engagement with, complex, face-to-face health behaviour change interventions and (2) describe the reporting of psychometric and implementation qualities.
Keywords: behaviour change; complex intervention; engagement; fidelity of delivery; health; implementation; measures; psychometric; quality
Mesh:
Year: 2017 PMID: 28762607 PMCID: PMC5655766 DOI: 10.1111/bjhp.12260
Source DB: PubMed Journal: Br J Health Psychol ISSN: 1359-107X
Figure 1. A flow diagram of the paper selection process (based on Moher, Liberati, Tetzlaff, and Altman's (2009) PRISMA flow diagram).
A summary of the measures used to monitor fidelity of delivery and engagement
| Fidelity ( | Engagement ( | |
|---|---|---|
| What was measured? |
Delivery of intervention components compared with intervention protocol ( Motivational interviewing adherence/fidelity/infidelity ( Dose delivered and fidelity ( Fidelity of delivery, but unclear which aspect as results were not reported ( Dose of intervention components ( Competence and success in delivering behaviour change strategies ( Treatment integrity/demonstration of skills ( Extent to which environmental changes were made ( Consistency and quality of use of innovation ( Motivational interviewing fidelity, dose, and context ( ‘Quality of counselling’ – use of skills and therapeutic alliance ( Number of times skills were modelled and telephone fidelity ( Clinician competence/demonstration of intervention method ( |
Adherence to target behaviour ( Attendance ( Understanding (receipt) and use of intervention skills (enactment) ( Understanding and engagement ( Compliance and attendance ( Adherence to target behaviour and attendance ( Completion of study visits ( Intervention enactment – use of BCTs ( Receipt, enactment, homework compliance, and attendance ( Dose received/exposure – assignments completed ( Dose received – intervention receipt and compliance ( How much learned/adopted, helpfulness, and current use ( Effectiveness of intervention – trying practices, participating, influencing practice, comprehension, future participation ( Adoption of intervention and maintenance ( Dose of intervention received ( Receipt and reaching goals ( Participation in activities, dose, and checklist completion ( Activity adherence, sessions delivered, telephone contact ( Adherence to target behaviour and diary ( Adherence to target behaviour, attendance, and diary ( Exposure to intervention – attendance/receipt of calls ( Uptake of intervention – attendance/use of modules ( Attendance, reading materials, usefulness, meeting goals ( Attendance and completion of diaries ( Completion of diaries ( Completion of home assignments, self‐monitoring, attendance ( Homework adherence and commitment ( Completion of homework, receipt of information, telephone calls ( |
| Type of measures used |
Observational measures ( Video ( Audio ( Non‐specific ( Provider (hand) ( Provider (computer) ( Participant (hand) ( Participant (computer) ( Non‐specific (computer) ( Provider and participant self‐report ( Audio and provider self‐report ( Video + provider self‐report ( Observation and exercise log (participant) ( Direct observation and rating ( Participant self‐report and patient files ( Quantitative rated interviews with providers ( |
Self‐report measures ( Participant ( Provider ( Provider and participant self‐report ( Participant self‐report and attendance records ( Provider and participant self‐report and attendance records ( Attendance records and behaviour monitoring ( Direct observation and provider and participant self‐report ( Non‐specific observation and provider self‐report ( Provider self‐report, attendance records, homework review ( Participant self‐report and verbal verification ( Provider self‐report and homework review ( Participant self‐report and objective verification ( Provider self‐report and attendance records ( Attendance/referral records ( Study completion ( |
| More details about measures | Who completed the measures?
Researcher ( Provider ( Provider and participant ( Provider and researcher ( Participant ( Participant and researcher ( Not specified ( | Who completed the measures?
Participant ( Researcher ( Participant and researcher ( Provider ( Provider and participant ( Provider and researcher ( Provider, participant, researcher ( |
| Development of measures
Not specified ( Used a previously developed measure ( Motivational interviewing treatment integrity code (Moyers et al.) ( MITI + Motivational interviewing skill code (Miller et al.) ( Behaviour Change Counselling Index (Lane et al.) ( Flanders Interaction Analysis Technique ( Developed own measure: ( | Development of measures
Not specified: ( Used previously developed measure ( DASH adherence index: ( Pittsburgh Rehabilitation Participation scale ( Participation scale and recovery practice scale ( Developed own measure and used measures that were previously developed: ( | |
| Responses on measures
Not specified ( Rating scales ( 3‐point scale (completely covered, partially covered, not covered) ( 4‐point scale ( Two 4‐point rating scales (‘unsatisfactory, doubtful, satisfactory, good’, ‘not at all, hardly, slightly, considerably, strongly’ + Not applicable) ( Two 4‐point scales (‘Excellent, good, fair, poor’ and ‘used well, used well but not often, used well and not well, not used or not used well’) ( 5‐point scale (Totally disagree – totally agree) ( 5‐point scale (‘Never, most of the time, often, always, do not remember’) ( 5‐point scale (‘Non‐use, low compliance, compliant use, high compliance, committed use’) ( 7‐point scale (low (1), high (7)) + behaviour counts ( 7‐point scale ( 8‐point scales (no adherence – optimal adherence and no competence – excellent competence) ( 10‐point scale (very bad to very good) + 3‐point scale (yes/partly/not implemented) ( Dichotomous scale: ( Yes/no ( Applied (1)/not applied (0) or completed (1)/not completed (0) ( Completed (1)/not completed (0) ( Rating scale and dichotomous scale ( 4‐point scale (rarely (1), sometimes (2), often (3), most/all of the time (4)) and yes (1)/no (0) ( | Responses on measures
Not specified: ( Rating scales ( 3‐point scale adherence (poor, fair, excellent), others not specified ( 3‐point scales: perceived helpfulness (0 not at all, 2 very much) + currently using (0 not at all, 2 very much) ( 3‐point scale (0 = effectively non‐compliant, 0.5 = uncertain or partly compliant, 1 = compliant) ( 3‐point scales (yes/no/don't know and ‘very helpful, neither helpful nor unhelpful, very unhelpful’), 4‐point scale (most, all, some, none) ( 3‐point scale (Better than target range [>1], 0–1 within target range, worse than target range [<0]): ( 3‐point Likert scale (very low to very high) ( 3‐point scale ( 4‐point scale (dissatisfied to very satisfied) ( 4‐point scale (1 missed most–4 missed none) and 10‐point scale (1 none, 10 complete) ( 5‐point Likert scale: ( 6‐point Likert scale (1 no engagement, 6 excellent engagement) and 3‐point scale (1 minimal understanding, some understanding, good understanding) ( 7‐point scale (Never, <3 months ago, 4–6 months ago, 7–9 months ago, 10–12 months ago, 1–2 years ago, >2 years ago) ( Dichotomous scales ( Yes/no: ( Rating scale + dichotomous scale ( 3‐point scale (yes/no/don't know) and dichotomous scale (yes/no): ( 3‐point scale (0 not at all, fully) – measures receipt. 5‐point scale (1 not at all, 5 extremely) – measures willingness, interest, and supportiveness – and dichotomous scale (attempted, not attempted) – measures enactment ( | |
| Sample | How many participants were sampled?
Not specified ( Subsample ( Reported number of sessions sampled ( Reported number of clinicians/sites that data were sampled from ( Reported the percentage of sessions sampled ( Reported sampling some but not all, without specifying how many ( All ( | How many participants were sampled?
Not specified ( Subsample ( Reported sampling a number of participants ( |
| How were participants sampled?
Not specified: ( Random ( N/A (sampled all) ( Purposive: ( Self‐selected ( Opportunity: ( Stratified: ( | How were participants sampled?
Not specified: ( | |
| Which conditions were participants sampled from?
Not specified (likely intervention only): ( All (explicitly reported): ( Intervention(s) ( | Which conditions were participants sampled from?
Not specified (likely intervention only): ( All (explicitly reported): ( Intervention(s) ( | |
| Analysis method |
Descriptive statistics ( Descriptive and inferential statistical techniques ( Not reported ( |
Descriptive statistics ( Descriptive statistics and inferential statistical techniques ( |
| Framework/model |
Framework not specified/mentioned ( Used a framework ( Steckler and Linnan's (2002, as cited in 2,14,42,50) framework ( NIH treatment fidelity model/NIH Behaviour Change Consortium framework (Bellg et al.) ( RE‐AIM framework ( Resnick et al. ( Baranowski & Stables ( Saunders et al. ( Hasson ( | |
| Definitions |
Provided definitions ( Fidelity (constructs that fit into fidelity): ( Engagement (constructs that fit under engagement): ( Did not provide definitions ( | |
(R) = receipt; (E) = enactment; (R&E) = receipt and enactment.
Number of studies reporting psychometric and implementation qualities, across all studies (N = 66) and by studies reporting fidelity of delivery (N = 44) and engagement (N = 46)
| | Psychometric qualities | | | Implementation qualities | | | |
|---|---|---|---|---|---|---|---|
| | Reported at least one quality | Validity | Reliability | Reported at least one quality | Practicality | Acceptability | Cost |
| All studies | 49 (74.2) | 41 (62) | 34 (52) | 17 (25.8) | 14 (21) | 6 (9) | 2 (3) |
| Fidelity of delivery | 37 (84.1) | 31 (70.5) | 29 (65.9) | 12 (27.3) | 11 (25) | 5 (11.4) | 0 (0) |
| Engagement | 21 (45.7) | 16 (34.8) | 10 (21.7) | 9 (19.6) | 6 (13.4) | 2 (4.3) | 2 (4.3) |
Number of times qualities were reported in total, and for fidelity of delivery and engagement
| Quality | Total number of times (%) | Category | Total number of times | Fidelity of delivery | Engagement |
|---|---|---|---|---|---|
| Psychometric quality | 215 (82.4) | Validity | 129 | 100 | 33 |
| | | Reliability | 85 | 75 | 14 |
| | | Reliability and validity | 1 | 1 | 0 |
| Implementation quality | 41 (15.7) | Practicality | 30 | 25 | 6 |
| | | Acceptability | 8 | 7 | 1 |
| | | Cost | 2 | 0 | 2 |
| | | Acceptability and practicality | 1 | 1 | 0 |
| Psychometric and implementation quality | 5 (1.9) | Reliability and practicality | 1 | 1 | 0 |
| | | Validity and practicality | 3 | 2 | 1 |
| | | Validity and acceptability | 1 | 1 | 1 |
| Total | 261 (100) | | | | |
The fidelity of delivery and engagement columns do not add up to 261 because 10 qualities were reported for both fidelity of delivery and engagement.
Qualities, category, and number of studies qualities were reported in
| Group of quality | Quality | Category | Number of studies reported in | Fidelity studies | Engagement studies |
|---|---|---|---|---|---|
| Psychometric qualities | | | | | |
| Use of multiple researchers | Coding | R | 11 | 20,26,27,29,33,34,45,51,58,64 | 47 |
| | Data collection | | 3 | 6,29,31 | |
| | Develop measures | | 3 | 14,26,60 | |
| | Data analysis | | 2 | 10,42 | |
| | Data entry | | 1 | 26 | |
| | Validate coding frame | | 1 | 26 | |
| Validity of measures | Validated | V | 9 | 21,22,34,48,51 | 4,17,25,51 |
| | Not validated | | 8 | 2,10,34,35,41,42,50 | 13 |
| Use of independent researchers | Used – coding | R | 12 | 20,22,26,27,29,34,38,45,51,55,63,64 | |
| | Not used – coding | | 1 | 58 | |
| | Used – develop measures | | 1 | 14 | |
| | Used – analysis | | 1 | 42 | |
| | Not used | V | 1 | 20 | |
| Measurement of conditions | All conditions (result output) | V | 8 | 7,50 | 4,13,17,18,51,53 |
| | All conditions (reported) | | 5 | 2,48,51 | 2,3,35 |
| | Intervention only | | 3 | 2,24 | 24,25 |
| Reliability of measures | Reliable | R | 6 | 21,22,48 | 4,17,51 |
| | Not reliable | | 5 | 2,14,23,34,50 | 2,23 |
| Random selection of data | Randomly selected | V | 9 | 31,40,51,55,57,58,63,64 | 52 (data entry) |
| | Not randomly selected | | 2 | 45,48 | |
| Reporting of inter‐rater agreement | Reported – high | R | 3 | 26,59 | 17 |
| | Not reported | | 2 | 29,33 | |
| | Reported – poor to fair | | 2 | 27,58 | |
| | Reported – fair to excellent | | 1 | 58 | |
| | Reported – no coder drift | | 1 | 26 | |
| Coding of sessions | A percentage | V | 7 | 33,45,51,55,57,58,63 | |
| | All | | 1 | 27 | |
| | Calculated inter‐rater agreement | R | 8 | 20,26,27,29,33,58,59 | 17 |
| Use of experts | Coding | V | 5 | 10,21,22,36,38 | |
| | Develop measures | | 1 | 27 | |
| | Not used – coding | | 1 | 27 | |
| Checked % of data input | | R | 1 | 10 | |
| Blinding | Coders | V | 3 | 7,26,48 | |
| | Not blinded | | 2 | 2 | 52 |
| | Researchers | | 1 | 15 | |
| | Participants | | 1 | 2 | |
| Measurement of content of intervention | Some aspects of intervention | V | 3 | 20,38 | 36,38 |
| | All aspects of intervention | | 2 | 33,63 | |
| Problems with scoring criteria | Scoring criteria not sensitive | V | 2 | 20,26 | |
| | No success cut‐off point | | 1 | 14 | |
| | Dichotomized responses reduce variability | | 1 | 25 | |
| | Measures may capture different aspects of fidelity | | 1 | 26 | |
| Standardization of procedure | Script | V | 2 | 34,66 | |
| | Data entry | | 1 | 52 | |
| | Coding guidelines | | 1 | 64 | |
| | Not used standardized procedure | | 1 | 33 | |
| | Not used standardized measure | | 1 | 52 | |
| Self‐report bias | | V | 4 | 10,26,26,30 | |
| | | R | 2 | 5 | 4 |
| Sampling | Across all providers | V | 2 | 27,45 | |
| | Across all sites | | 1 | 10 | |
| | Across all sites (purposively) | | 1 | 33 | |
| | Across all participants | | 1 | 27 | |
| | Balanced facilitator and gender (purposively) | | 1 | 26 | |
| Audit | Data collection | R | 1 | 6 | |
| | Data analysis | | 1 | 6 | |
| | Coding | | 1 | 20 | 20 |
| | Data entry | V | 1 | 23 | |
| | Recordings | | 1 | 40 | |
| Missing responses | Missing responses | V | 1 | 15 | |
| Trained researchers | Trained coders | V | 3 | 7,27,58 | |
| | Trained researcher (data collection) | | 1 | 52 | |
| Observation effects | | V | 4 | 22,26,27,34 | |
| Use of one researcher | Coding | R | 1 | 38 | |
| | Trained observers | | 1 | 34 | |
| Revised coding guidelines | | R | 3 | 20,26,48 | |
| | | V | 1 | 33 | |
| Team meetings | | R | 4 | 1,6,23,36 | 23 |
| Recording of sessions | All sessions | V | 2 | 40,55 | |
| | % of sessions | | 1 | 35 | |
| Triangulation | Method | V | 2 | 34,42 | |
| | Researcher | | 1 | 42 | |
| Problems with analysis plan | Did not control for provider | V | 1 | 36 | |
| | Missing responses excluded | | 1 | 10 | |
| Social desirability | | V | 3 | 22 | 13,52 |
| Objective verification | | V | 2 | 15,43 | |
| | | R | 1 | 12 | |
| Used coding guidelines | | R | 2 | 20,27 | |
| Analysis consideration – coded missing responses as no adherence | | V | 1 | 15 | |
| Independently validated coding frame | | V | 1 | 26 | |
| Measurement differences – observation and self‐report | | V | 1 | 26 | |
| Measurement period – year after intervention | | V | 1 | 25 | |
| Piloted coding guidelines | | V | 1 | 26 | |
| Practice period before recording | | V | 1 | 27 | |
| Pre‐specified dates for recordings | | V | 1 | 27 | |
| Statistician involved in sampling (stratified) | | V | 1 | 10 | |
| Training before recording may overestimate adherence | | V | 1 | 58 | |
| Piloted measure | | V | 1 | 34 | |
| Provided a reason for inter‐rater agreement | | R | 1 | 27 | |
| Supervision | | R | 1 | 58 | |
| Measures were internally consistent indicating content validity | | R+V | 1 | 27 | |
| Implementation qualities | | | | | |
| Resource challenges | Time restrictions | P | 4 | 5,20,27,62 | |
| | Technical difficulties | P | 3 | 5,5,58 | |
| | Financial restrictions | P | 2 | 5,27 | |
| | Sharing Dictaphones | P | 1 | 45 | |
| Providers’ attitudes | Dislike paperwork | A | 1 | 10 | |
| | Fear of discouraging participants | A | 1 | 27 | |
| | Nerves | A | 1 | 27 | |
| | Report participants behaving differently | A | 1 | 27 | |
| | Positive attitudes | A | 1 | 42 | |
| | Additional work | A | 1 | 62 | |
| | Not enthusiastic | A | 1 | 62 | |
| Measurement of content of intervention | Telephone calls not assessed due to difficulty | P | 1 | 38 | |
| | Measure cannot capture non‐verbal data | P | 1 | 20 | |
| Problems with documentation | No record of responses | P | 2 | 10,58 | |
| | Providers did not document everything | | 1 | 10 | |
| | No record of refusals | A+P | 1 | 27 | |
| Missing responses | Missing responses | P | 1 | 10,10 (different aspects) | |
| Problems with sampling | Low recruitment | P | 1 | 60 | |
| Problems with analysis plan | Analysis not feasible | P | 1 | 10 | |
| Incentives | Incentives used | P | 2 | 15,52 | |
| | Incentives required | P | 1 | 62 | |
| Feedback to providers | | P | 2 | 21,27 | |
| Feedback delay | | P | 1 | 38 | |
| Forgetting to return data | | P | 1 | 15 | |
| Logbook showed that not all steps were applied | | P | 1 | 42 | |
| Paper and digital version of measures given | | P | 1 | 5 | |
| Need simpler coding guidelines to achieve agreement | | P | 1 | 27 | |
| Reviewed fidelity after trial | | P | 1 | 45 | |
| Participants – dislike paperwork | | A | 1 | 15 | |
| Did not do a cost analysis | | C | 1 | | 13 |
| Cost of materials | | C | 1 | | 37 |
| Both psychometric and implementation qualities | | | | | |
| Problems with scoring criteria | Lack of clarity on items | V+P | 1 | 25 | |
| Missing responses | Missing responses | V+P | 1 | 58 | |
| Use of one researcher | Data collection | R+P | 2 | 5 | 52 |
| Problems with sampling | Selection bias | V+A | 1 | 2 | 2 |
| | Not randomly selected | V+P | 1 | 27 | |
This table is ordered by the number of studies that reported a quality within each ‘group of quality’ (e.g., ‘use of multiple researchers’), from most to least frequent. The numbers in this table will not add up to the total number of studies included, as some studies included information on multiple qualities.
R = reliability; V = validity; A = acceptability; P = practicality; C = cost.