Utility of unidimensional and functional pain assessment tools in adult postoperative patients: a systematic review.

Reham M Baamer¹, Ayesha Iqbal², Dileep N Lobo³, Roger D Knaggs⁴, Nicholas A Levy⁵, Li S Toh².

Abstract

BACKGROUND: We aimed to appraise the evidence relating to the measurement properties of unidimensional tools to quantify pain after surgery. Furthermore, we wished to identify the tools used to assess interference of pain with functional recovery.
METHODS: Four electronic sources (MEDLINE, Embase, CINAHL, PsycINFO) were searched in August 2020. Two reviewers independently screened articles and assessed risk of bias using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist.
RESULTS: Thirty-one studies with a total of 12 498 participants were included. Most of the studies failed to meet the methodological quality standards required by COSMIN. Studies of unidimensional assessment tools were underpinned by low-quality evidence for reliability (five studies) and responsiveness (seven studies). Convergent validity was the most studied property (13 studies), with moderate to high correlations ranging from 0.5 to 0.9 between unidimensional tools. Interpretability results were available only for the visual analogue scale (seven studies) and numerical rating scale (four studies). Studies on functional assessment tools were scarce; only one study included an 'Objective Pain Score,' a tool assessing pain interference with respiratory function, and it had low-quality evidence for convergent validity.
CONCLUSIONS: This systematic review challenges the validity and reliability of unidimensional tools in adult patients after surgery. We found no evidence that any one unidimensional tool has superior measurement properties in assessing postoperative pain. In addition, because promoting function is a crucial perioperative goal, psychometric validation studies of functional pain assessment tools are needed to improve pain assessment and management.
CLINICAL TRIAL REGISTRATION: PROSPERO CRD42020213495.

Keywords: COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN); functional pain assessment tool; pain scores; postoperative pain; tool utility; unidimensional pain assessment

Year: 2022 PMID: 34996588 PMCID: PMC9074792 DOI: 10.1016/j.bja.2021.11.032

Source DB: PubMed Journal: Br J Anaesth ISSN: 0007-0912 Impact factor: 11.719

Well-validated assessment tools are essential for measuring postoperative pain intensity and impact. This systematic review shows that, despite the many tools available, evidence regarding their validity or reliability is scarce. After surgery, the Visual Analogue Scale (VAS) showed the highest error rate in general and was the least preferred compared with the 0–10 Numerical Rating Scale (NRS). Importantly, statistically significant changes in VAS or NRS do not necessarily indicate clinically important changes, and NRS cut-off points used by healthcare professionals to determine acute pain severity do not always reflect patients' desire for analgesics.

Patients experience acute pain after surgery as a result of tissue damage and inflammation at the operation site.¹⁻³ Careful assessment of pain using a valid and reliable tool is the first step towards a rational choice of analgesic therapy, which is essential for ensuring patient comfort, mobility, and satisfaction and reducing healthcare costs. The most commonly used tools for the assessment of postoperative pain are unidimensional and assess only pain intensity. These include the visual analogue scale (VAS), numerical rating scale (NRS), verbal rating scale (VRS), sometimes referred to as the verbal descriptor scale (VDS), and faces pain scales (FPS). They are quick to administer and do not encroach on the time required for usual care. Despite their extensive use, reliance on these unidimensional tools as the sole approach to measuring pain is insufficient because the cut-off points commonly used by healthcare providers do not reflect the patient's desire for additional analgesics. Furthermore, patients have reported difficulty in describing the complexity of their pain experience with a single numerical value, descriptive words, or a mark on a line.
Striving to lower pain intensity scores to zero, as suggested by the ‘Pain as the 5th Vital Sign’ campaign, has not improved pain outcomes¹⁵⁻¹⁷ and has resulted in increased opioid analgesic use in the post-anaesthesia care unit (PACU). Furthermore, Vila and colleagues highlighted the potential hazard of a pain score-based treatment algorithm, which was associated with a more than twofold increase in the prevalence of sedation-related side-effects. Treating pain as the fifth vital sign has now been abandoned because it may have contributed to the current US opioid epidemic. Restoration of function by allowing the patient to breathe, cough, ambulate, and turn in bed is important for postoperative pain relief. Therefore, assessing the functional impact of pain, which includes patient-centred objective assessment by a healthcare provider who judges whether the pain prevents the patient from performing activities that help accelerate recovery, could be an appropriate alternative to achieve better pain assessment. Options to treat pain would then be used to maximise functional capacity, rather than to reduce the patient's postoperative pain score below a specified numerical value.

Despite being used widely, the validity, reliability, and utility of unidimensional pain assessment tools for postoperative patients have not been reviewed systematically. The aim of this systematic review was to appraise the available evidence concerning the measurement properties of different unidimensional and functional pain assessment tools when used to assess postoperative pain in hospitalised adults.

Methods

We performed this systematic review according to COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) (http://www.cosmin.nl/) guidelines, and reported it according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) statement guidelines.

Search strategy

We performed a systematic search of the MEDLINE, Embase, PsycINFO (all via OVID) and CINAHL (via EBSCOhost) databases from their inception to August 2020. Our search strategy consisted of four search concepts: (1) measurement properties or outcome terms, (2) pain assessment tool terms, (3) acute postoperative pain, and (4) limits (English language or English translation, human adults ≥18 yr old). We combined the first three using the Boolean operator AND, which narrows the search to records that match all three concepts. This was then combined with the result string of the fourth concept to limit the results. We performed these steps separately for each pain assessment tool. We also carried out backward citation tracking by checking the reference lists of eligible studies. The comprehensive search strategy used is provided in Supplementary material, Appendix S1.
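The way the four search concepts are combined can be sketched in a few lines of Python. This is only an illustration: the terms are hypothetical placeholders (the actual strategy is in Appendix S1), joining synonyms within a concept with OR is an assumption, and `or_group` and `build_query` are made-up helper names rather than part of any database interface.

```python
def or_group(terms):
    """Join the synonyms of one concept with OR (assumed, not stated in the text)."""
    return "(" + " OR ".join(terms) + ")"


def build_query(concepts, limits):
    """AND the concept groups together, then AND the result with each limit."""
    combined = " AND ".join(or_group(terms) for terms in concepts)
    return "(" + combined + ") AND " + " AND ".join(limits)


# Hypothetical example terms, not the terms used in the review.
measurement_terms = ["validity", "reliability", "responsiveness"]
tool_terms = ["visual analogue scale", "VAS"]
pain_terms = ["postoperative pain", "acute pain"]
limit_terms = ["English language", "adult"]

# The review repeated these steps for each tool; one tool is shown here.
query = build_query([measurement_terms, tool_terms, pain_terms], limit_terms)
print(query)
```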

Inclusion criteria

We included studies of any of the following pain measurement tools used to assess acute pain in hospitalised adult patients from all surgical specialties: unidimensional pain assessment tools (including the numerical pain rating scale, VRS, VAS, and faces scales [Wong-Baker FACES, Faces Pain Scale – Revised]) and functional pain assessment tools, defined as any tool that helps assess acute pain based on its interference with functional activity, including walking, breathing, turning in bed, and coughing. Included functional pain assessment tools could be used objectively by the clinician or self-reported by patients. We included instrument validation or instrument evaluation studies. We considered any study that used at least one of these instruments to evaluate postoperative pain and assessed at least one of the nine measurement properties identified by the COSMIN taxonomy: internal consistency, test–retest reliability, measurement error, content validity, structural validity, hypothesis testing for construct validity, cross-cultural validity, criterion validity, and responsiveness (Appendix S2). In addition, we included any study that evaluated any of the specified additional outcomes of the tools, including feasibility, interpretability, and desire for analgesia.

Exclusion criteria

We excluded abstracts, editorials, reviews, and studies that included paediatric or adolescent populations, or sedated, mechanically ventilated and critically ill patients.

Selection of articles

After our database search, we collated and uploaded all identified citations to EndNote X9 (Clarivate Analytics, Philadelphia, PA, USA) and removed duplicates. The identified studies were uploaded to Rayyan QCRI online software. Two reviewers (RMB and AI) independently applied the inclusion criteria to the titles, then to relevant abstracts. Afterwards, we thoroughly examined potentially eligible full texts for inclusion. We documented the full search results in the PRISMA flow diagram (Fig. 1). Excluded studies and the reasons for their exclusion are provided in Appendix S3.

Fig 1

PRISMA diagram. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Data extraction

One reviewer (RMB) extracted data from the included full-text articles, with the extraction verified by a second reviewer (AI). The two reviewers resolved any disagreements through discussion, or consultation with other reviewers (RDK, LST, or DNL) when necessary. The extracted data included specific details about the assessment tool used, country, language of scale administration, study design, patient characteristics, surgical procedure, the specific measurement properties assessed, outcomes related to the review question and objectives, and the main statistical analysis.

Assessment of methodology

Two independent reviewers (RMB and AI) critically appraised the methodological quality of studies looking at feasibility and interpretability using a modified version of the Newcastle–Ottawa Scale (Appendix S4). For validation studies, we assessed the quality using the COSMIN criteria for methodological quality.²⁷⁻²⁹ We assessed each measurement property in three phases. First, we assessed the risk of bias, which pertains to the methodological quality of each study: a rating of very good, adequate, doubtful, or inadequate quality was assigned to each study. Second, we rated the results for each measurement property against the criteria for ‘sufficient measurement properties’ and classified them as sufficient, insufficient, or indeterminate (Appendix S5). Third, we combined the results from each study and graded the quality of evidence for each pain assessment tool. A summary of the scoring criteria and appraisals is provided in Appendices S6 and S7.

Protocol registration

The protocol was registered (No. CRD42020213495) with the PROSPERO database and can be accessed at https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=213495.

Results

The search identified 14 216 potential studies after removal of duplicates. After reviewing the titles, we excluded 13 798 for irrelevance and another 380 after abstract screening. Of the 38 remaining studies, we excluded 19 after examination of the full texts against the inclusion criteria (Appendix S2). An additional 12 studies were identified by searching the bibliographies of eligible studies, so a total of 31 studies³⁰⁻⁵⁶ (Fig. 1) with 12 498 participants were included. The number of participants in individual studies ranged from 35 to 3045. The distribution of male and female participants varied, with some studies including only female or only male participants and others not reporting sex distribution. The studies matching our inclusion criteria were published between 1982 and 2018 and assessed postoperative pain after different types of surgical procedures (Table 1). Nine studies included only cognitively intact patients, whereas two studies included participants with mild cognitive impairment. The remaining 20 studies did not report on cognitive function.³²⁻³⁶,³⁹⁻⁴⁵
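The study flow reported above can be checked with simple arithmetic. The short Python sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Counts taken from the Results text.
records_after_deduplication = 14216
excluded_on_title = 13798
excluded_on_abstract = 380
excluded_on_full_text = 19
added_from_reference_lists = 12

full_texts_assessed = (
    records_after_deduplication - excluded_on_title - excluded_on_abstract
)
included_from_search = full_texts_assessed - excluded_on_full_text
total_included = included_from_search + added_from_reference_lists

print(full_texts_assessed, included_from_search, total_included)  # 38 19 31
```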

Table 1

First author, year (country)	PROM/s	Study design	Surgical procedure	Outcome(s)	High anchor∗	Main exclusion criteria	n (Female %)	Age (yr), mean (sd) [range]
Van Dijk, 2015¹³ (The Netherlands)	NRS	Cross-sectional design	Orthopaedic, ENT, gynaecological, cardiothoracic, others	Ability to detect desire for analgesics	Worst pain imaginable	ICU patients, not proficient in Dutch or English, ambulatory surgery	1084 (48)	53 [18–90]
Banos, 1989³⁴ (Spain)	VAS; VRS-5	Descriptive correlational design	Abdominal, orthopaedic, gynaecological	Convergent validity	10; Unbearable pain	NR	212 (50)	<30=43; 31–50=69; >50=107
Akinpelu, 2002³⁰ (Nigeria)	VAS; M-VRS; BNS	Cross-sectional design	Caesarean section	Convergent validity	Worst pain; Worst imaginable; Worst pain	Complications, illness, unconscious	35 (100)	31 (5)
Briggs, 1999³⁵ (UK)	VAS; VRS∗∗	Secondary analysis of RCT	Orthopaedic	Convergent validity; Feasibility	Number 100; Severe pain at rest and movement	NR	417 (45)	47 (20)∗64 (17)
Fadaizadeh, 2009³⁹ (Iran)	VAS; FPS	Cross-sectional design	General, gynaecological	Convergent validity	10; Agonised	History of substance abuse, unconscious	82 (72); 34 GS; 48 GYN	32 (14); GYN 27 (7); GS 38 (18)
DeLoach, 1998³⁸ (USA)	VAS; VPS	Descriptive correlational design	Various types of surgery	Convergent validity	Worst imaginable; Horrible pain	NR	NR	NR
Pesonen, 2008⁵¹ (Finland)	VAS; VRS-5; RWS; FPS-7	Descriptive correlational design	Cardiac surgery: elective CABG, valvular repair	Feasibility	Worst possible pain; Unbearable pain; Worst possible pain; Worst possible pain	Dementia, cognitive impairment	160; FPS 80 (36); RWS 80 (44)	73 (5)
Aubrun, 2003³² (France)	VAS; NRS; VRS; Behavioural scale	Prospective observational design	Orthopaedic, abdominal, gynaecological, others	Feasibility	Worst imaginable pain; Worst imaginable pain; Severe; NR	NR	600 (47)	51 (17)
Myles, 1999⁴⁹ (Australia)	VAS	Clinical study	General, orthopaedic, ENT, faciomaxillary, cardiothoracic	Interpretability	100 worst pain ever	Severe pain, inability to complete the VAS	52 (40)	42 (15)
Myles, 2005⁵⁰ (Australia)	VAS	Clinical study	General, orthopaedic, ENT, faciomaxillary, cardiothoracic	Interpretability	100 worst pain ever	Postoperative delirium; Frailty, visual impairment	22 (NR)	33 (17)
Jensen, 2003⁴⁴ (USA)	VAS; VRS-4; VRS-P	Secondary analysis of RCT	Total knee replacement, hysterectomy, laparotomy	Interpretability	Worst pain; Severe pain; Complete relief	NR	123 (66)	65 (10)
Gerbershagen, 2011⁴¹ (Germany)	NRS	Comparative study design	Cholecystectomy, thyroidectomy, gastrointestinal, inguinal hernia repair, others	Interpretability	Worst imaginable pain	Repeated surgical procedures, mechanical ventilation	444 (44)	18–20=38; 21–30=75; 31–40=88; 41–50=96; 51–60=87; 61–70=49; 71–80=2
Cepeda, 2003³⁶ (USA)	NRS; VRS	Clinical study	Head and neck, thoracic, spinal, abdominal, orthopaedic	Interpretability	Worst imaginable; Severe pain	NR	700 (62)	50 (15)
Jensen, 2002⁴⁵ (USA)	VAS; VRS; Pain relief	Secondary analysis of RCT	Total knee replacement, abdominal hysterectomy, laparotomy	Responsiveness	Worst pain; Severe pain; Complete relief	NR	246 (66)	Knee 65 (10); Laparotomy 41 (7.5)
Jenkinson, 1995⁴³ (UK)	VAS; CPI; McGill	RCT	Orthopaedic	Responsiveness	Severe pain	NR	75 (64)	Male: 41 (13); Female: 43 (12)
Aubrun, 2003³¹ (France)	VAS	Clinical study	Orthopaedic, urological, abdominal, gynaecological, vascular, thoracic	Interpretability	100	Minor pain, delirium, dementia, non-French speaking	3045 (54)	50 (18)
Sriwatanakul, 1982⁵² (USA)	VAS	Secondary analysis of RCT	NR	Interpretability	Pain as bad as it could be	NR	NR	NR
Van Giang, 2015⁵⁵ (Vietnam)	FPS; NRS	Validation study	Orthopaedic	Concurrent validity; Responsiveness	The worst possible pain	Hearing impairment; Altered mental status	144 (45)	37 (13)
Van Dijk, 2012⁵⁴ (The Netherlands)	NRS; VRS	Cross-sectional design	General, ENT, orthopaedic, neurosurgical, urological, gynaecological, plastic, vascular, cardiothoracic	Interpretability	10; Worst pain imaginable	ICU patients; Non-Dutch speaking; Cognitive or hearing impairment, inability to use self-report	2674 (51)	73 (6)
Li, 2007⁴⁷ (China)	VAS; NRS-11; VDS; FPS	Prospective clinical study	NR	Convergent validity; Scale reliability; Responsiveness; Feasibility	10 The most intense imaginable pain; 10 The most intense imaginable pain; The most intense imaginable pain; Worst pain	NR	173 (45)	45.3 (15)
Li, 2009⁴⁶ (China)	FPS; NRS; IPT	Descriptive correlational design	Gastrointestinal, orthopaedic, abdominal	Convergent validity; Scale reliability; Responsiveness; Feasibility	10; 10; The most intense imaginable pain	Did not speak Chinese; More than one surgery; ASA score of 4; Chronic pain	180 (68)	72 (6)
Zhou, 2011⁵⁶ (China)	VDS; NRS; FPS; CAS	Descriptive comparative design	NR	Criterion validity; Convergent validity; Test–retest reliability; Feasibility	Worst pain	Severe cognitive impairment	200 (46)	56 (16)
Gagliese, 2005⁶ (Canada)	VAS-H; VAS-V; NRS; VDS; MPQ	Validation study	NR	Feasibility; Convergent validity; Criterion validity	10 Worst possible pain; 10 Worst pain imaginable; Excruciating	On epidural or regional analgesia, ASA score of >3; Chronic pain, cognitive impairment, opioid or substance abuse	504 (58)	53 (15)
Tandon, 2016⁵³ (India)	OPS; NRS	Descriptive correlational design	Abdominal surgery	Convergent validity	Worst possible pain; Inadequate pain relief/pain at rest	Haemodynamic instability; Unable to use a PCA pump	93	NR
Aziato, 2015³³ (Ghana)	NRS; FPS; CCPS	Two phases: qualitative and psychometric testing	Caesarean section, leg amputation, laminectomy, laparotomy, others	Convergent validity; Inter-rater reliability; Responsiveness; Feasibility	Worst possible pain; Hurts worst	NR	150 (77)	<30=44.7; 30–39=35; 40+=21
Hamzat, 2009⁴² (Ghana)	VAS	Validation study	Various gynaecological procedures	Cross-cultural validity	Worst possible pain	History of psychological or psychiatric disorders	60 (100)	NR
Gagliese, 2003⁴⁰ (Canada)	MPQ; PPI; VAS-R; VAS-M	Descriptive correlational design	Radical prostatectomy	Convergent validity; Responsiveness	Worst possible pain; 5 Excruciating; 10 Worst possible; 10 Worst possible pain	Non-English speaker; ASA >3; Chronic pain; Chronic use of opioids	200	Younger patients: 56 (6); Older patients: 67 (3)
Myles, 2017⁴⁸ (Australia)	VAS	Observational design	General, orthopaedic, gynaecological, urological, major vascular, cardiac, faciomaxillary, others	Test–retest reliability; Interpretability	Very severe pain	Poor English comprehension; Drug or alcohol dependence; Psychiatric disorder; Uncontrolled pain	219 (68)	53 (17)
Danoff, 2018³⁷ (USA)	VAS	Prospective observational design	THA; TKA	Measurement error	Worst possible pain	Preoperative Pain Catastrophizing Scale score greater than 30 points	304; THA (21); TKA (30)	THA: 60 [20–81]; TKA: 63 [46–88]
Sloman, 2006² (Israel)	NRS	One group pretest–post-test design	Abdominal, orthopaedic, others	Interpretability	10 Excruciating	NR	150 (47)	47 [14–89]
Bodian, 2001³ (USA)	VAS; McGill	Clinical study	Intra-abdominal surgery	Interpretability; Desire for analgesics	Worst pain imaginable	NR	150 (48)	49 [37–61]

∗The low anchor was “no pain”.

Characteristics of included studies. BNS, box numerical rating scale; CAS, coloured analogue scale; CCPS, colour circle pain scale; CPI, categorical verbal pain rating scale; ENT, ear, nose and throat; FPS, face pain scale; ICU, intensive care unit; IPT, Iowa Pain Thermometer; MPQ, McGill pain questionnaire; M-VRS, modified verbal rating scale with 11 descriptions of pain intensity; NR, not reported; NRS, numerical rating scale; OPS, objective pain score; PCA, patient controlled analgesia; PPI, present pain intensity; PROM/s, patient-reported outcome measures; RWS, red wedge scale; sd, standard deviation; THA, total hip arthroplasty; TKA, total knee arthroplasty; VAS-R, visual analogue scale at rest; VAS-M, visual analogue scale at movement; VDS, verbal descriptor scale; VPS, 11-point verbal scale; VRS∗∗, 4-point verbal rating scale; VRS-5, 5-point verbal rating scale; VRS-P, verbal rating scale for pain relief.

Seven studies were performed in the USA,³⁶⁻³⁸ three in China, three in Australia,⁴⁸⁻⁵⁰ and two each in the UK, the Netherlands, Ghana, France, and Canada. One study each was performed in Finland, Spain, Nigeria, Iran, India, Vietnam, Israel, and Germany. Although all the included studies were reported in English, some of the tools were administered in other languages: Chinese, Twi, Vietnamese, Finnish, and both English and Yoruba.

Using the modified Newcastle–Ottawa Scale, the majority of studies looking at feasibility were of medium or high quality.⁴⁶⁻⁴⁸ The methodological quality of three secondary analysis studies that looked at VAS interpretability could not be assessed. The methodological quality for other measurement properties is described under each measurement property section.

The following measurement properties were assessed: measurement error (n=1), cross-cultural validity (n=1), reliability (n=5),⁴⁶⁻⁴⁸ responsiveness (n=7),⁴⁵⁻⁴⁷ hypothesis testing for construct validity (namely convergent validity; n=13),³³⁻³⁵,³⁸⁻⁴⁰,⁵⁴⁻⁵⁶ and criterion validity (n=2). No studies assessed structural validity, internal consistency, or content validity of any pain assessment tool. Interpretability was measured in 11 studies.⁴⁸⁻⁵⁰ Two studies included the desire for analgesics as an outcome. The feasibility of pain assessment tools as an outcome measure was examined in eight studies.

Outcomes for measurement properties

Unidimensional pain assessment tools

Convergent validity

Eight studies³⁸⁻⁴⁰ reported the convergent validity of the VAS, with moderate-to-high correlations with several other self-report scales that also measured pain intensity. Similarly, seven studies reported good convergent validity results for VRS, and six studies each reported good convergent validity results for NRS and FPS scores (Table 2). The correlations between scores obtained from several unidimensional tools were moderate to high, ranging from 0.5 to 0.9.
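Convergent validity of this kind is typically summarised by a correlation coefficient between paired scores from two tools. The sketch below computes a Spearman rank correlation in plain Python. It is an illustration only: the paired scores are invented, the choice of Spearman rather than another coefficient is an assumption (the included studies may have used different statistics), and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs vary (non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented paired scores from the same hypothetical patients (not study data).
vas_mm = [12, 35, 48, 60, 72, 85, 20, 55]
nrs_0_10 = [1, 3, 5, 6, 8, 9, 2, 5]
rho = spearman(vas_mm, nrs_0_10)
print(round(rho, 2))
```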

Table 2

Summary of methodological quality of studies using COSMIN risk of bias and measurement properties. COSMIN, COnsensus-based Standards for the selection of health Measurement INstruments; FPS, faces pain scale; LoE, level of evidence using the GRADE approach, reported as High, Moderate, Low, or Very low; NRS, numerical rating scale; OPS, objective pain score; VAS, visual analogue scale; VDS, verbal descriptor scale. Ratings for overall quality are reported as sufficient (+), insufficient (–), inconsistent (+/–), or indeterminate (?). Empty cells indicate no available results for measurement properties.

First author, year	Cross-cultural validity	Reliability	Measurement error	Criterion validity	Construct validity/convergent	Responsiveness
VAS Methodological quality assessment (COSMIN risk of bias)
Banos, 1989³⁴					Adequate
Akinpelu, 2002³⁰					Doubtful
Briggs, 1999³⁵					Adequate
Fadaizadeh, 2009³⁹					Adequate
DeLoach, 1998³⁸					Doubtful
Li, 2007⁴⁷		Inadequate			Adequate	Inadequate
Gagliese, 2005⁶				Inadequate	Inadequate
Gagliese, 2003⁴⁰					Inadequate	Inadequate
Myles, 2017⁴⁸		Inadequate
Jensen, 2002⁴⁵						Inadequate
Danoff, 2018³⁷			Adequate
Hamzat, 2009⁴²	Inadequate
Rating (LoE)	? (Very low)	+ (Low)	? (Moderate)	? (Very low)	+ (High)	? (Low)
NRS Methodological quality assessment (COSMIN risk of bias)
Van Dijk, 2012⁵⁴					Adequate
Li, 2007⁴⁷		Inadequate			Adequate	Inadequate
Li, 2009⁴⁶		Inadequate			Adequate	Inadequate
Zhou, 2011⁵⁶		Inadequate		Adequate	Adequate
Gagliese, 2005⁶				Inadequate	Inadequate
Aziato, 2015³³		Inadequate			Doubtful	Inadequate
Rating (LoE)		+ (Low)		+/– (Low)	+ (High)	? (Low)
VDS Methodological quality assessment (COSMIN risk of bias)
Banos, 1989³⁴					Adequate
Briggs, 1999³⁵					Adequate
Van Dijk, 2012⁵⁴					Adequate
Li, 2007⁴⁷		Inadequate			Adequate
Zhou, 2011⁵⁶		Inadequate		Adequate	Adequate
Gagliese, 2005⁶				Inadequate	Inadequate
Jensen, 2002⁴⁵						Inadequate
Rating (LoE)		+ (Low)		+/– (Low)	+/– (High)	? (Low)
FPS Methodological quality assessment (COSMIN risk of bias)
Fadaizadeh, 2009³⁹					Adequate
Van Giang, 2015⁵⁵					Adequate	Doubtful
Li, 2007⁴⁷		Inadequate			Adequate	Inadequate
Li, 2009⁴⁶		Inadequate			Adequate	Inadequate
Zhou, 2011⁵⁶		Inadequate		Adequate	Adequate
Aziato, 2015³³		Inadequate			Doubtful	Inadequate
Rating (LoE)		+ (Low)		+ (Moderate)	+ (High)	? (Low)
OPS Methodological quality assessment (COSMIN risk of bias)
Tandon, 2016⁵³					Doubtful
Rating (LoE)					+ (Very low)

Cross-cultural validity

One study established the validity of a Twi (Ghanaian) version of the VAS. The pain scores reported by patients using the new instrument correlated significantly with those reported by patients using the original (English) version of the VAS, with the highest correlation on the fifth postoperative day. Because of inadequate quality owing to an extremely serious risk of bias and imprecision, very low quality evidence was reported for cross-cultural validity of the VAS.

Reliability

The VAS showed high scale reliability and high test–retest reliability, the latter with an intraclass correlation coefficient of 0.79 (95% confidence interval [CI], 0.49–0.91). The NRS demonstrated high test–retest, inter-rater, and scale reliability. The VDS demonstrated high scale and test–retest reliability. Similarly, the FPS demonstrated high inter-rater and test–retest reliability (Table 3). All four scales showed low-quality evidence because of very serious risk of bias.

Table 3

Reliability of unidimensional pain assessment tools in surgical patients. ∗Average intraclass correlation coefficient calculated for 7 days. †No separate result for each scale. ‡Results categorised in 20–44 yr (n=43), 45–59 yr (n=39), 60 yr without cognitive impairment (n=40), ≤60 yr with mild cognitive impairment (n=31). ¶95% confidence interval. FPS, faces pain scale; n, number of patients; NRS, numerical rating scale; PROM/s, patient-reported outcome measures; SD, standard deviation; VAS, visual analogue scale; VDS, verbal descriptor scale.

First author, year	PROM/s	Pain construct	Type of reliability	n	Time interval	Intraclass correlation coefficient
Li, 2007⁴⁷	VAS; NRS; VDS; FPS	Current, worst, least, average pain on 7 postoperative days	Scale reliability	173	Every 24 h	0.66∗; 0.76∗; 0.72∗; 0.72∗
Li, 2009⁴⁶	FPS; NRS; Iowa Pain Thermometer	Current pain and daily retrospective ratings of worst and least pain	Scale reliability	180	Every 24 h	0.95 to 0.97†
Zhou, 2011⁵⁶	VDS; NRS; FPS; Numeric Box-21 Scale; Coloured Analogue Scale	Recalled pain and postoperative pain	Test–retest reliability	153	24 h	0.96, 0.88, 0.93, 0.84‡; 0.94, 0.90, 0.91, 0.80‡; 0.93, 0.91, 0.84, 0.80‡; 0.92, 0.91, 0.78, 0.76‡; 0.93, 0.90, 0.88, 0.77¶
Aziato, 2015³³	NRS; FPS; Colour Circle Pain Scale	No pain – worst possible pain; No pain – worst possible pain; No pain – unbearable	Inter-rater reliability	150	5–10 min	0.92; 0.93; 0.93
Myles, 2017⁴⁸	VAS	Pain unchanged or almost the same	Test–retest reliability	22	Not reported	0.79 (0.49–0.91)¶

Responsiveness

Seven studies⁴⁵⁻⁴⁷ reported responsiveness results for the four unidimensional pain assessment tools and provided low-quality evidence because of a very serious risk of bias (Table 4). The identified risk of bias was mainly related to the use of inappropriate measures of responsiveness, such as effect sizes, and to the statistical tests used.

Table 4

Responsiveness results of unidimensional tools. Empty cells indicate data not available or not assessed. ∗P-value is statistically significant at <0.0001. †Knee surgery. ‡Laparotomy. ¶VAS score. §CPI score. ||Time 2 vs time 1. #Time 3 vs time 1. ††Time 4 vs time 1. ‡‡Time 5 vs time 1. ¶¶Results for younger patients, with the sample split at the median age of 62 yr. CCPS, colour circle pain scale; CI, confidence interval; CPI, categorical verbal pain rating scale; FPS, face pain scale; G, group; MPQ, McGill pain questionnaire; PPI, present pain intensity; PROM/s, patient-reported outcome measures; SRM, standardized response mean; VAS, visual analogue scale; VAS-R, visual analogue scale at rest; VAS-M, visual analogue scale at movement; VDS, verbal descriptor scale. Effect size is calculated by dividing the mean change in a variable by the standard deviation of that variable.

First author, year	PROM/s	Time interval	n	Better, same, worse %	Mean difference before and after treatment (95% CI)	Effect size or SRM (95% CI)	Correlation with changes in other instruments
Jensen, 2002⁴⁵	VAS; VDS; Relief rating	Baseline then several times	123; 125		10.37,† 20.71‡; 7.17,† 15.09‡; 7.59,† 26,61¶
Jenkinson, 1995⁴³	VAS; CPI; MPQ	Baseline then 120 min	75	Moderate 2.23,¶ 1.83§; Good 1.91,¶ 3.13§; Complete 1.89,¶ 5§		G1: 0.99,¶ 1.93§; G2: 1.23,¶ 1.82§; G3: 2,¶ 3.29§; G4: 1.48,¶ 1.48§	CPI 0.67 to VAS
Van Giang, 2015⁵⁵	FPS; NRS	Every 30 min for 2 h	144		–1.17||; –1.59#; –1.66††; –1.82‡‡	–0.70||; –1.05#; –1.20††; –1.31‡‡	0.78
Li, 2007⁴⁷	VAS; NRS; VDS; FPS	NR	28		4.3 [2.4]††; 4.2 [2.3]††; 4.5 [2.1]††; 4.3 [1.9]††
Li, 2009⁴⁶	FPS; NRS; IPT	NR	180		14.095||,††
Aziato, 2015³³	NRS; FPS; CCPS	NR	150		2.3 (2.1–2.5)††; 1.5 (1.4–1.6)††; 1.4 (1.3–1.5)††
Gagliese, 2003⁴⁰	MPQ; PPI; VAS-R; VAS-M	NR	200			0.31,¶¶ 0.39; 0.25,¶¶ 0.26; 0.23,¶¶ 0.32; Not reported
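The two responsiveness indices named in Table 4 can be sketched as follows. The conventional definitions are used here as an assumption: effect size as mean change divided by the standard deviation of baseline scores, and SRM as mean change divided by the standard deviation of the change scores. The included studies may have used other variants, and the scores below are invented.

```python
from statistics import mean, stdev


def effect_size(before, after):
    """Mean change divided by the SD of the baseline scores (assumed definition)."""
    changes = [b - a for b, a in zip(before, after)]
    return mean(changes) / stdev(before)


def standardized_response_mean(before, after):
    """Mean change divided by the SD of the change scores (assumed definition)."""
    changes = [b - a for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)


# Invented NRS scores for six hypothetical patients, before and after analgesia.
before = [7, 8, 6, 9, 5, 7]
after = [4, 5, 5, 6, 3, 3]

es = effect_size(before, after)
srm = standardized_response_mean(before, after)
print(round(es, 2), round(srm, 2))
```

Because the two indices divide the same mean change by different standard deviations, they can differ noticeably for the same data, which is one reason results across studies are hard to compare.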

Measurement error

Only one study assessed the measurement error of the VAS by determining the minimal detectable change (MDC), which is the smallest change that the VAS can detect beyond its inherent measurement error. The study showed that the MDC on a 100 mm VAS was 15 mm for total hip arthroplasty and 16 mm for total knee arthroplasty. We rated the evidence on VAS measurement error as moderate quality because of the risk of bias and because we could not determine the minimal important change for the VAS in acute pain to compare with the MDC.
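For orientation, an MDC is commonly derived from the standard error of measurement (SEM) as MDC95 = 1.96 × √2 × SEM, with SEM = SD × √(1 − ICC). The sketch below applies that conventional formula. Whether the included study used exactly this calculation is an assumption, and the SD and ICC inputs are invented; only the 15 mm MDC and the 19 mm improvement threshold for total hip arthroplasty come from the text.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the score SD and a reliability coefficient (conventional formula)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.96):
    """MDC at the 95% level: z * sqrt(2) * SEM (conventional formula)."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


def exceeds_measurement_error(observed_change_mm, mdc_mm):
    """True if an observed change is at least as large as the MDC."""
    return abs(observed_change_mm) >= mdc_mm


# Invented inputs: SD of 20 mm and ICC of 0.80 on a 100 mm VAS.
mdc = minimal_detectable_change(sd=20.0, icc=0.80)
print(round(mdc, 1))

# Reported values: a 19 mm decrease compared with the 15 mm MDC for hip arthroplasty.
print(exceeds_measurement_error(-19, 15))
```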

Functional pain assessment tool

Only one study examined the ‘Objective Pain Score’, which assesses the interference of pain with respiratory function. The study evaluated the correlation between scores obtained from the Objective Pain Score and the NRS. Whilst patients rated their pain using a printed NRS, the clinician rated pain using the Objective Pain Score. A linear regression model determined the relationship between the NRS and the Objective Pain Score, and showed that, for every unit increase in the NRS, the Objective Pain Score decreased by 0.334. The study reported sufficient convergent validity with the NRS, although with low-quality evidence because of risk of bias and imprecision. A summary of findings on all assessed measurement properties is provided in Table 2.

Other outcomes

Interpretability and desire for analgesics

Visual analogue scale

Seven studies⁴⁸⁻⁵⁰ looked at the interpretability of the VAS, and one study included the desire for analgesics as an outcome. Several studies reported similar cut-off points for the VAS, indicating that VAS ratings of 0–5 mm were very likely to be rated as no pain by patients, 6–44 mm were considered mild pain, 45–69 mm were considered moderate pain, and ratings ≥70 mm were suggestive of severe pain. Two studies determined the interpretability of the VAS by identifying the minimal clinically important difference (MCID), defined as the minimal change in score indicating a meaningful change in pain status. The use of a combination of distribution- and anchor-based methods resulted in an MCID of 9.9 mm for the VAS across several types of surgical procedure. In contrast, Danoff and colleagues reported higher MCID values for pain improvement in patients undergoing total hip or knee arthroplasty: pain was improving clinically when the VAS decreased by 19 and 23 mm, respectively. Bodian and colleagues found that the proportion of patients requesting additional analgesia after abdominal surgery increased as VAS increased (4%, 43%, and 80% with VAS scores of 30 mm or less, 31–70 mm, and greater than 70 mm, respectively).
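The VAS cut-off points reported above can be written as a small lookup. This is a sketch rather than a validated clinical rule: the function name is hypothetical, and the handling of non-integer scores that fall between the reported bands (for example 5.5 mm) is an assumption.

```python
def vas_category(vas_mm):
    """Map a 0-100 mm VAS score to the severity bands reported in the text."""
    if not 0 <= vas_mm <= 100:
        raise ValueError("VAS score must be between 0 and 100 mm")
    if vas_mm <= 5:
        return "no pain"
    if vas_mm <= 44:  # assumed to include fractional scores just above 5 mm
        return "mild"
    if vas_mm <= 69:
        return "moderate"
    return "severe"


for score in (3, 30, 50, 82):
    print(score, vas_category(score))
```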

Numerical rating scale

Four studies examined the interpretability of the NRS, and one study included desire for analgesics as an outcome. Sloman and colleagues determined the meaning of changes in NRS in relation to perceived pain relief before and after treatment. Patients who rated their pain relief as ‘minimal’ had, on average, a 35% reduction in NRS. The NRS was less sensitive in detecting changes from ‘moderate’ to ‘much’ relief: there was a 67% reduction for those who rated their relief as ‘moderate’, a 70% reduction for those who rated it as ‘much’, and a 94% reduction for those who rated it as ‘complete’. Inconsistent cut-off points between moderate and severe pain were identified for NRS. For example, Gerbershagen and colleagues determined NRS ≥4 as a cut-off point for moderate pain, whereas ‘pain interfering with function’ resulted in a lower cut-off point of NRS ≥3. Using receiver operating characteristic analysis in another study, Van Dijk and colleagues found that the sensitivity of NRS to differentiate bearable pain (VRS ≤2) from unbearable pain (VRS >2) reached a higher value (94%) for the high cut-off point of NRS >5 compared with the lower cut-off points of 3 and 4 (sensitivity 72% and 83%, respectively). In another study, Van Dijk and colleagues showed that 19% of patients with NRS scores ranging from 5 to 10 had no desire for additional opioids; meanwhile, 62% reported that they did not want additional opioids because their pain was tolerable. When patients were asked at which score they would request opioids, both the median and the modal pain scores were an NRS of 8.
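How sensitivity shifts with the chosen NRS cut-off can be illustrated with a small sketch. It assumes that bearable pain (VRS ≤2) is the condition being detected and that an NRS at or below the cut-off counts as a positive result; this reading is an assumption, chosen because it is consistent with sensitivity rising as the cut-off rises. The function name and the paired scores are hypothetical and are not data from any included study.

```python
def sensitivity_specificity(nrs_scores, vrs_scores, nrs_cutoff, vrs_cutoff=2):
    """Return (sensitivity, specificity) for classifying bearable pain.

    Assumptions: the reference standard is VRS <= vrs_cutoff (bearable),
    and the index test is positive when NRS <= nrs_cutoff.
    """
    tp = fn = tn = fp = 0
    for nrs, vrs in zip(nrs_scores, vrs_scores):
        bearable = vrs <= vrs_cutoff
        predicted_bearable = nrs <= nrs_cutoff
        if bearable and predicted_bearable:
            tp += 1
        elif bearable:
            fn += 1
        elif predicted_bearable:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical paired ratings for nine patients (illustration only).
nrs = [1, 2, 3, 4, 5, 5, 6, 7, 8]
vrs = [1, 1, 2, 2, 2, 3, 3, 3, 3]

for cutoff in (3, 4, 5):
    sens, spec = sensitivity_specificity(nrs, vrs, cutoff)
    print(f"NRS cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy sample, raising the cut-off increases sensitivity at the cost of specificity, which is the usual trade-off a receiver operating characteristic analysis is designed to expose.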

Feasibility

Eight studies included feasibility of pain assessment tools as an outcome measure. Error rates were reported as an inability to understand the tool, responses that could not be scored reliably, and lack of responses. Some studies reported the most preferred scale or the easiest one to complete. There was a lack of studies that assessed the time required to complete the tool or the time taken to train patients or nurses. For multiple types of surgical procedures and in different populations, VDS or VRS was more successful when compared with other tools. Using VRS in patients aged ≥75 yr after cardiac surgery showed a higher success rate (81%) compared with VAS (60%) and the FPS (44%). These rates varied significantly on all postoperative days (P<0.02). The reported reasons for failure, defined as failure to understand or express the level of pain using the assessment tool, were postoperative confusion, delirium, exhaustion, and an inability to differentiate between facial expressions. Similarly, VRS was better suited than VAS in terms of compliance and ease of use after orthopaedic surgery: 56% of patients included in the study did not understand how to complete the VAS, and one-third could not perform the assessment using VAS because of visual or hearing impairment. Moreover, VAS showed the highest error rate (12.3%) when used in Chinese populations, whereas VRS showed the lowest error rate (0.8%), a difference that was statistically significant (P<0.05). Interestingly, 40% of the patients rated NRS as the easiest, most preferred tool for assessment; in contrast, VAS was reported as the least preferred. From the nurses' perspectives in PACUs, NRS was the most preferred tool in 60% of the included sample. Even though the VAS was the recommended tool in the institution where the study was conducted, 50% of the nurses preferred to use either NRS or VRS because the complexity of the VAS made it difficult for patients to understand.
Three studies reported FPS as the preferred tool among a Chinese population, for women, middle-aged adults, and older patients without and with mild cognitive impairment, followed by VRS and NRS. Likewise, FPS (55%) was preferred to NRS (33%) among a Ghanaian population.

Discussion

This systematic review presents a comprehensive examination of the measurement properties of unidimensional and functional assessment tools used for adult postoperative patients. The quality of evidence for the measurement properties and utility of the VAS, VDS, NRS, and FPS was suboptimal. Overall, construct validity (convergent validity) was most commonly assessed across measures. Content validity, internal consistency, and structural validity were not assessed as these measures are not designed for single-item scales. The VAS had the greatest number of studies assessing its measurement properties in the postoperative setting, followed by the NRS. Studies on functional pain assessment tools were scarce. Most of the reviewed studies failed to meet the COSMIN methodological standards required. Good-quality studies were found for interpretability and feasibility as assessed by the Newcastle–Ottawa Scale. Most of the studies reported sufficient convergent validity of several unidimensional pain assessment tools, indicating that the scales tended to measure score variations in the same direction. Similar positive findings of good convergent validity results were reported when these tools were used to assess pain associated with rheumatoid arthritis, osteoarthritis, and low back pain. However, the methodology used to measure convergent validity was limited. Because no gold standard tool exists for assessing pain, most studies assessed the correlation of scores obtained from one unidimensional tool with another, measuring only pain intensity. However, when a multidimensional tool such as the McGill Pain Questionnaire was used as a comparator, studies reported lower correlation scores. This variation may be related to assessor and patient fatigue during the detailed pain assessment. There was good reliability of pain assessment for all unidimensional tools.
However, the quality of evidence was low for all four scales because of serious risk of bias owing to unreported intervals for repeated measures or the use of inappropriate reliability measures by treating ranked NRS, VDS, or FPS scores as a continuous value. Measurement error was only available for VAS; however, the study outcome was indeterminate because we could not determine the minimal important change for VAS in acute pain to compare with the MDC. When the MDC is smaller than the minimal important change, significant change can be distinguished from measurement error. Small, albeit statistically significant, changes in VAS do not necessarily indicate clinically important changes to guide the interpretation of studies evaluating analgesic therapies. Therefore, obtaining an accurate MCID is crucial. Previous studies have shown that the MCID differs by patient population and diagnosis. We identified two studies reporting inconsistent MCID values for the postoperative population. The MCID tended to be higher in patients who underwent joint arthroplasties than other procedures. One explanation might be that patients reporting severe, acute pain need a larger reduction in pain to be clinically meaningful. Responsiveness is an important psychometric property that reflects the sensitivity of a tool to change in pain over time. Measures of responsiveness used included effect size, standardised response mean, and scores before and after intervention. According to COSMIN methodology, effect size and standardised response mean are inappropriate to assess responsiveness because they measure the size of the change scores rather than their validity. Moreover, the P-value of statistical tests only measures the statistical significance of the change in scores rather than their validity. Pain assessment tools help diagnose surgical catastrophes, allow communication between healthcare providers, and are used to assess efficacy of analgesic treatments and allow comparison between therapies.
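The comparison between the MDC and the minimal important change described above can be sketched numerically. The example below assumes the commonly used formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the review does not state which formula the underlying study applied, and the standard deviation and reliability values are hypothetical. The 9.9 mm figure is the MCID reported earlier for several types of surgical procedures.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from the score SD and a reliability ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem


def change_exceeds_error(mdc: float, mic: float) -> bool:
    """True when the MDC is smaller than the minimal important change,
    so an important change can be told apart from measurement error."""
    return mdc < mic


# Hypothetical inputs (not taken from any included study).
sem = sem_from_icc(sd=10.0, icc=0.75)   # 5.0 mm
mdc = mdc95(sem)                        # about 13.9 mm
print(round(mdc, 1), change_exceeds_error(mdc, mic=9.9))
```

With these illustrative inputs the MDC exceeds 9.9 mm, so a change of that size could not be separated from measurement error; a more reliable measurement (smaller SEM) would reverse the conclusion.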
As no agreement exists on how to identify the optimal cut-off point of a unidimensional pain assessment tool, various arbitrarily chosen values are used. In general, VAS cut-off points of 30, 70, and 100 mm indicate the upper boundaries of mild, moderate, and severe pain, respectively. However, a recent study found a higher cut-off point between mild and moderate pain of around 55 mm on the VAS, which is greater than the values reported by most earlier studies and physicians' consensus [67, 68, 69]. NRS cut-off points used by healthcare professionals do not necessarily reflect patients' desire for additional analgesics. Previous studies have also found that a high proportion of patients with pain scores >4 did not demand analgesics (28% of patients visiting an emergency department and 42% of children after surgery). Cho and colleagues showed that postoperative patients requested an analgesic when their pain was VAS ≥5.5, NRS ≥6, FPS-R ≥6, or VRS ≥2 (moderate or severe pain). This might be influenced by a general refusal of analgesic medicines, or fear of side-effects or addiction, especially with opioids. Cut-off points, although important, are not validated to guide analgesic interventions. Previously, postoperative pain assessment and management was focused on providing humanitarian pain relief, which constitutes only one objective in tackling a complex experience, and that was achieved by using unidimensional scores. However, healthcare providers should address pain by several approaches to determine if the pain is tolerable, is hindering recovery, or requires intervention. Efforts have been made to encourage use of multidimensional tools to assess postoperative pain. A recent systematic review indicated that the Brief Pain Inventory and the American Pain Society Pain Outcomes Questionnaire – Revised were the two most commonly used and studied multidimensional pain assessment tools for patients after surgery, followed by the McGill Pain Questionnaire.
These multidimensional tools showed good ratings for some psychometric properties such as internal consistency. However, this recommendation was based on low- to moderate-quality evidence. Moreover, these tools involve a detailed assessment that can range from 5 to 30 min, hindering routine use for frequent assessment in a busy surgical ward. Alternatively, functional pain assessment has been recommended. However, as no gold standard objective measures exist for pain-related functional capacity in postoperative patients, we included objective tools assessing the impact of pain on function. Only one study reported sufficient convergent validity of functional assessment based on pain interference with normal breathing and NRS score. The low methodological quality of the study limits the generalisability of the result. Other researchers have tried to incorporate a non-formally validated three-level ‘Functional Activity Score’ into clinical practice. One study in a Chinese population combining the Functional Activity Score and dynamic NRS found that this allowed nurses to guide and educate patients to better use patient-controlled analgesia to facilitate functional recovery. In addition, a pilot study in hospitalised patients validated a four-level scale (no interference, interference with some or most activities, or inability to do any activity). It established the convergent validity of this tool compared with NRS and VAS in cognitively intact patients. Patients aged ≥40 yr also preferred a functional assessment scale, possibly because functional assessment considered the impact of pain on activity. The heterogeneity of study designs, including the assessment scales used, surgical procedures, sample sizes, countries in which the studies were conducted, and the languages used, makes determining the most feasible assessment tool difficult.
However, the VAS showed the highest error rate and was the least preferred in several studies, whereas the VRS showed the lowest error rate. Difficulties comprehending the VAS and linearly quantifying pain resulted in a higher frequency of incomplete responses, especially for older patients. Therefore, older adults and children who have less abstract thinking ability might prefer a categorical scale such as the VRS for easier use. Interestingly, although the FPS is commonly used in paediatric populations, it was also the most preferred tool in the Ghanaian and Chinese adult populations. This might be because of the simplicity of facial expressions, which can quickly reflect pain. Alternatively, cultural aspects may explain why the FPS was preferred.

Strengths and limitations

The main strength of this review is that it includes the most frequently used unidimensional and functional pain assessment tools. In addition, we put no limits on publication date, enabling us to obtain information on early studies of these tools. To our knowledge, this is the first review to evaluate the validity of these tools, focusing solely on postsurgical populations and applying COSMIN methodology. Potential limitations include the fact that the search strategy may have excluded grey literature and studies published in languages other than English. However, we tried to limit the effect of language and publication biases by searching the references of included studies. In addition, the clinical diversity and limitations in the methodologies and quality of the included studies may have reduced the strength of the conclusions.

Conclusions

This systematic review challenges the validity and reliability of unidimensional tools to quantify pain in adult patients after surgery. Despite their extensive use, no evidence clearly suggests that one tool has superior measurement properties in assessing postoperative pain. Therefore, future studies should be prioritised to assess their validity, reliability, measurement error, and responsiveness using COSMIN methodology. Moreover, adequate quality head-to-head comparison studies are required to assess several unidimensional pain assessment tools alongside other tools covering multiple dimensions of the pain experience. In addition, because promoting function is a crucial perioperative goal, psychometric validation studies of functional pain assessment tools are warranted to identify patients who need additional interventions to promote recovery and improve postoperative pain assessment and management.

Authors' contributions

Study design: RMB, AI, DNL, RDK, NAL, LST. Literature search: RMB, AI. Data extraction: RMB, AI. Data analysis: RMB, AI. Data interpretation: RMB, AI, DNL, RDK, NAL, LST. Writing of the manuscript: RMB, AI, DNL, RDK, NAL, LST. Critical review: RMB, AI, DNL, RDK, NAL, LST. Approval of submitted manuscript: RMB, AI, DNL, RDK, NAL, LST. Overall supervision: DNL, RDK.

74 in total

1. The measurement of postoperative pain: a comparison of intensity scales in younger and older surgical patients.

Authors: Lucia Gagliese; Nataly Weizblit; Wendy Ellis; Vincent W S Chan
Journal: Pain Date: 2005-10 Impact factor: 6.961

2. Comparison of visual analogue scale and faces rating scale in measuring acute postoperative pain.

Authors: Lida Fadaizadeh; Habib Emami; Kamran Samii
Journal: Arch Iran Med Date: 2009-01 Impact factor: 1.354

Review 3. Improving postoperative pain management: what are the unresolved issues?

Authors: Paul F White; Henrik Kehlet
Journal: Anesthesiology Date: 2010-01 Impact factor: 7.892

4. The visual analog scale in the immediate postoperative period: intrasubject variability and correlation with a numeric scale.

Authors: L J DeLoach; M S Higgins; A B Caplan; J L Stiff
Journal: Anesth Analg Date: 1998-01 Impact factor: 5.108

5. The measurement of clinical pain intensity: a comparison of six methods.

Authors: Mark P Jensen; Paul Karoly; Sanford Braver
Journal: Pain Date: 1986-10 Impact factor: 6.961

Review 6. Studies comparing Numerical Rating Scales, Verbal Rating Scales, and Visual Analogue Scales for assessment of pain intensity in adults: a systematic literature review.

Authors: Marianne Jensen Hjermstad; Peter M Fayers; Dagny F Haugen; Augusto Caraceni; Geoffrey W Hanks; Jon H Loge; Robin Fainsinger; Nina Aass; Stein Kaasa
Journal: J Pain Symptom Manage Date: 2011-06 Impact factor: 3.612

7. How Much Pain Is Significant? Defining the Minimal Clinically Important Difference for the Visual Analog Scale for Pain After Total Joint Arthroplasty.

Authors: Jonathan R Danoff; Rahul Goel; Ryan Sutton; Mitchell G Maltenfort; Matthew S Austin
Journal: J Arthroplasty Date: 2018-02-22 Impact factor: 4.757

8. The efficacy and safety of pain management before and after implementation of hospital-wide pain management standards: is patient safety compromised by treatment based solely on numerical pain ratings?

Authors: Hector Vila; Robert A Smith; Michael J Augustyniak; Peter A Nagi; Roy G Soto; Thomas W Ross; Alan B Cantor; Jennifer M Strickland; Rafael V Miguel
Journal: Anesth Analg Date: 2005-08 Impact factor: 5.108

Review 9. Assessment of pain.

Authors: H Breivik; P C Borchgrevink; S M Allen; L A Rosseland; L Romundstad; E K Breivik Hals; G Kvarstein; A Stubhaug
Journal: Br J Anaesth Date: 2008-05-16 Impact factor: 9.166

10. An international multidisciplinary consensus statement on the prevention of opioid-related harm in adult surgical patients.

Authors: N Levy; J Quinlan; K El-Boghdadly; W J Fawcett; V Agarwal; R B Bastable; F J Cox; H D de Boer; S C Dowdy; K Hattingh; R D Knaggs; E R Mariano; P Pelosi; M J Scott; D N Lobo; P E Macintyre
Journal: Anaesthesia Date: 2020-10-07 Impact factor: 6.955
