Consistency of pediatric pain ratings between dyads: an updated meta-analysis and metaregression.

Huaqiong Zhou¹,², Matthew A Albrecht², Pam A Roberts², Paul Porter³, Phillip R Della⁴.

Abstract

Accurate assessment of pediatric pain remains a challenge, especially for children who are preverbal or unable to communicate because of their health condition or a language barrier. A 2008 meta-analysis of 12 studies found a moderate correlation between 3 dyads (child-caregiver, child-nurse, and caregiver-nurse). We updated this meta-analysis, adding papers published up to August 6, 2021, and including studies that reported intraclass correlation/weighted kappa statistics (ICC/WK) in addition to standard correlation. Forty studies (4,628 children) were included. Meta-analysis showed moderate pain rating consistency between child and caregiver (ICC/WK = 0.51 [95% confidence interval 0.39-0.63], correlation = 0.59 [0.52-0.65], combined = 0.55 [0.48-0.62]), and weaker consistency between child and health care provider (HCP) (ICC/WK = 0.38 [0.19-0.58], correlation = 0.49 [0.34-0.55], combined = 0.45 [0.34-0.55]), and between caregiver and HCP (ICC/WK = 0.27 [-0.06 to 0.61], correlation = 0.49 [0.32-0.65], combined = 0.41 [0.22-0.59]). There was significant heterogeneity across studies for all analyses. Metaregression revealed that more recent year of publication, the pain assessment tool used by caregivers (eg, Numerical Rating Scale, Wong-Baker Faces Pain Rating Scale, and Visual Analogue Scale), and surgically related pain were each associated with greater consistency in pain ratings between child and caregiver. Pain caused by surgery was also associated with improved rating consistency between the child and HCP. The findings of this updated meta-analysis suggest that pediatric pain assessment researchers should apply a comprehensive pain assessment scale, the Patient-Reported Outcomes Measurement Information System, to acknowledge psychological and psychosocial influences on pain ratings.

Keywords: Consistency; Dyads; Meta-analysis; Metaregression; Pediatric pain ratings

Year: 2022 PMID: 36168394 PMCID: PMC9509055 DOI: 10.1097/PR9.0000000000001029

Source DB: PubMed Journal: Pain Rep ISSN: 2471-2531

1. Background

Managing pain is an essential responsibility of health care providers (HCPs) in the pediatric setting. Effective pain management provides comfort for children and prevents undesired physical and psychosocial outcomes, including longer length of hospital stay, phobia of medical procedures, or increased financial burden to the family and health system.[40,65] Children can effectively self-report their pain intensity when asked if an appropriate assessment scale is provided. The selected pain assessment scale needs to be compatible with the child's age, verbal ability, and comprehension. Proxy pain intensity ratings by caregivers or HCPs may be a useful alternative or adjunct in situations where a child is unable to provide a meaningful self-report rating, such as if the child is very young, very unwell, highly distressed, nonverbal, or severely cognitively impaired. The use of proxy pain intensity ratings assumes that a caregiver's and/or HCP's assessment of a child's pain is consistent with the child's self-report of pain intensity. However, assessing the level of another person's pain has proved challenging because pain perception is subjective and influenced by multiple factors, including a child's sociodemographic background, source of pain, severity of health conditions, and the child's mental health status.[40,65]

A previous meta-analysis examining correlation of pain ratings in the pediatric setting was published in 2008 by 2 of the authors of the present systematic review.[66] The meta-analysis pooled 12 studies comparing 3 dyads: child and parent, child and nurse, and parent and nurse. The studies were published from 1990 to 2007 and involved a total of 770 children with ages ranging from 1 to 16 years. Self-report pain assessment scales used by children involved in the studies were the Faces Pain Scales (FPS; n = 5 studies), Visual Analogue Scale (VAS; n = 3 studies), one study with both FPS and VAS, and a final study with the Oucher Scale. Parents and nurses used VAS (n = 6), the Oucher Scale (n = 1), FPS (n = 1), and one study with 7-point FPS and VAS. Twenty-two effect sizes (ESs) were initially combined across 12 studies using a fixed-effect model to obtain the summary estimate of ES on pain ratings (9 for the child–parent dyad, 8 for the child–nurse dyad, and 5 for the parent–nurse dyad). Moderate correlations of pain ratings were found between child and parent (r = 0.64), followed by the dyads of child and nurse (r = 0.58) and parent and nurse (r = 0.49). However, only studies that used the Pearson correlation coefficient were included in the meta-analysis, omitting several studies that could increase insight into the comparability between pain ratings in a pediatric setting. Furthermore, multiple attempts have been made to develop new and/or validate existing pain intensity assessment tools since 2008. The selection of participants, pain assessment scales, and statistical analysis tests varied among the studies; therefore, the results were inconsistent.[11,36,67]

This paper presents an updated meta-analysis with the inclusion of metaregression examining the consistency of pediatric pain ratings between 4 dyads. The objectives were: (1) To perform meta-analysis on the consistency of pediatric pain ratings between the child and caregiver dyad, child and HCP dyad, and the caregiver and HCP dyad. (2) To understand factors that might contribute to the heterogeneity of pain rating consistency between dyads that occurs across the studies, including year of publication, child's age, source of pain, and the pain intensity assessment scale. (3) To narratively synthesize evidence on consistency of pediatric pain ratings between the nurse and other HCP dyad.

The hypotheses of this updated systematic review are based on the previous meta-analysis undertaken by the authors. The first hypothesis was that there would be moderate pain rating consistency between the dyads of child and caregiver, and child and HCP, and that weak consistency in pain ratings would be evident between caregiver and HCP. The second hypothesis was that year of publication, child's age, source of pain, and pain intensity assessment scale would significantly impact the pain rating consistency between the dyads.

2. Methods

2.1. Study design and registration

This systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses Statement.[38]

2.2. Data sources and search strategies

In addition to the 12 studies from the previous meta-analysis, an electronic database search was conducted using CINAHL, EMBASE, Medline, and PsycINFO from January 1, 2008, till August 6, 2021, to identify new publications. The key search terms included “child” and “pain assessment”; the detailed search terms and search results are provided in Appendix A (available at http://links.lww.com/PR9/A168). The 8 studies excluded from the previous meta-analysis were also screened against the selection criteria for this update, described below.

2.3. Selection criteria

The inclusion criteria were: (1) studies comparing pain intensity ratings of a child's pain between 4 dyads, including child vs caregiver, child vs HCP, caregiver vs HCP, and/or nurse vs other HCP; (2) studies with a detailed description of research methods clearly stating data collection and data analysis methods; (3) studies examining pain rating consistency between dyads using intraclass correlation (ICC), weighted kappa, Pearson correlation, Spearman correlation, or Kendall's tau correlation; and (4) studies published in peer-reviewed journals in English with full-text access. The exclusion criteria were: (1) pain assessed only as present or absent through a binary “yes” or “no” question, as the review focused on pain intensity assessment; and (2) abstract-only references.

2.4. Study selection

After the initial database searches, 2 authors independently screened titles and abstracts and appraised full papers against the inclusion and exclusion criteria. The exclusion process was relatively straightforward, and only a handful of studies warranted discussion between authors to reach consensus on whether they met the inclusion criteria. In addition, the reference lists of all identified relevant records were searched for additional studies. The screening process is displayed in the Preferred Reporting Items for Systematic Reviews and Meta-analyses flow chart (Figure 1).

Figure 1.

Flow chart for the search and study selection process (PRISMA). PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-analyses.

2.5. Data extraction

Data extracted comprised study characteristics and results. Study characteristics included year of publication, study setting, study duration, sample size, child's age, source of pain, pain assessment scale, and statistical test to examine pediatric pain rating consistency (Appendix B, available at http://links.lww.com/PR9/A168). Two authors independently extracted data for all articles, and disagreements between the 2 authors about the extracted data were resolved with the third author.

2.6. Quality assessment of included studies

Two authors independently completed the assessment of study quality using the Newcastle–Ottawa Quality Assessment Scale, adapted for cross-sectional studies.[24,55] This version of the Newcastle–Ottawa Quality Assessment Scale consists of 3 domains of risk-of-bias assessment: selection bias, comparability, and outcome. The selection bias domain has 4 items for a maximum of 5 points: sample representativeness, sample size, nonrespondents, and ascertainment of exposure. The comparability domain has one item for a maximum of 2 points, based on the study design or analysis. The outcome domain has 2 items for a maximum of 3 points: assessment of the outcome and statistical test. The maximum score for this scale is 10, and the results are shown in Table 1. Studies with a total score from 0 to 3 were considered to have a high risk of bias. Studies with total scores from 4 to 6 or 7 to 10 were considered as having moderate and low risk of bias, respectively.[24]
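The scoring bands described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `risk_of_bias` and the dictionary `DOMAIN_MAX` are hypothetical and do not come from the study; only the domain maxima and the cut-offs (0 to 3 high, 4 to 6 moderate, 7 to 10 low) are taken from the text.

```python
# Domain maxima as described in the text: selection 5, comparability 2, outcome 3.
DOMAIN_MAX = {"selection": 5, "comparability": 2, "outcome": 3}


def risk_of_bias(total_score: int) -> str:
    """Map an adapted Newcastle-Ottawa total (0-10) to a risk-of-bias band."""
    max_score = sum(DOMAIN_MAX.values())  # 10 points in total
    if not 0 <= total_score <= max_score:
        raise ValueError(f"total score must be between 0 and {max_score}")
    if total_score <= 3:
        return "high"
    if total_score <= 6:
        return "moderate"
    return "low"


if __name__ == "__main__":
    for score in (2, 3, 4, 6, 7):
        print(score, risk_of_bias(score))
```

Under these cut-offs, a study scoring 3/10 falls in the high-risk band and a study scoring 4/10 in the moderate-risk band.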

Table 1

Risk-of-bias assessment of the 40 included studies (the Newcastle Ottawa Scale adapted version for cross-sectional/observational studies).

Reference (1st author/year)	Selection: representativeness of the sample	Selection: sample size	Selection: nonrespondents	Selection: ascertainment of exposure	Comparability based on design and analysis	Outcome: assessment of outcome	Outcome: statistical test	Total/10
Chen 2020	*	*	*		*	*	*	6
Kang 2020			*			*	*	3
da Cunha Batalha 2018			*			*	*	3
Kovalchuk 2018			*			*	*	3
Lawson 2018			*			*	*	3
Lifland 2018			*			*	*	3
Birnie 2017			*			*	*	3
Brudvik 2017			*			*	*	3
Labajo 2017			*			*	*	3
James 2017			*				*	2
Matziou 2016			*		*	*	*	3
Bailey 2015	*		*		*	*	*	4
Hamill 2015		*	*			*	*	4
Zhukovsky 2015		*	*				*	3
Gibbins 2014			*			*	*	3
Vetter 2014						*	*	2
Parkinson 2013			*			*	*	3
Jensen 2012			*			*	*	3
van Cleve 2012			*		*	*	*	4
de Tovac 2010					*	*	*	2
Traddio 2009	*	*	*		*	*	*	6
Barakat 2008			*			*	*	3
Nilsson 2008			*			*	*	3
Subhashini 2008			*			*	*	3
Baxt 2004			*			*	*	3
Singer 2002			*			*	*	3
Goodenough 2000			*			*	*	3
Chambers 1999			*			*	*	3
Goodenough 1999			*			*	*	3
Chambers 1998			*			*	*	3
Miller 1996			*			*	*	3
Stein 1995			*			*	*	3
West 1994			*			*	*	3
Bennett-Branson 1993			*			*	*	3
Robertson 1993			*			*	*	3
Manne 1992			*			*	*	3
Schneider 1992			*			*	*	3
LaMontagne 1991			*			*	*	3
Hendrickson 1990			*			*	*	3
Favaloro 1990			*			*	*	3

*Scored 1 point based on the criteria assessment.

2.7. Data synthesis and analysis

2.7.1. Meta-analysis

Meta-analysis was performed to calculate aggregated ESs of pediatric pain rating consistency between the dyads of child and caregiver, child and HCP, and caregiver and HCP. Aggregated ESs were estimated using the “metafor” package[62] in R version 4.0.1[50] using a Hunter and Schmidt–style method for correlation[52] with 95% confidence intervals (CIs). Briefly, random-effects models were conducted using the raw correlation/agreement coefficients (sensitivity analyses were conducted on the Fisher's r to z transformed ESs and yielded practically identical outcomes). The sampling variances of the ESs were estimated using the sample-size weighted average of the coefficients according to

$$\bar{r} = \frac{\sum_i n_i r_i}{\sum_i n_i},$$

which was then substituted into the equation for variance

$$v_i = \frac{(1 - \bar{r}^2)^2}{n_i - 1},$$

where $r_i$ and $n_i$ are the coefficient and sample size of study $i$. Within the “metafor” package, this was performed using the measure = “COR” and vtype = “AV” options of the “escalc” function. Following this, a series of random-effects meta-analyses were conducted on the ESs and variances using the “rma” function in “metafor,” with the method set to “HS” (ie, Hunter–Schmidt). The metaregressions were performed using the same process, but with each of the specified moderators set using the “mods” parameter. More details on the metaregressions are provided below.

Given the heterogeneity of summary measures provided, along with some studies reporting more than one valid summary measure (eg, an ICC plus a Spearman correlation), we ran separate meta-analyses for studies reporting (1) ICC or weighted kappa (ICC/Weighted Kappa) and (2) Pearson, Spearman, or Kendall correlation coefficient (Correlation). Consideration was also given to an overall estimate that combined all ESs into one analysis. Despite the conceptual differences between an ICC/weighted kappa and a correlation coefficient, studies that reported both measures (or for which both could be calculated from data available in the paper) indicated little difference in value between these measures. Indeed, the primary difference between correlation and ICC/kappa metrics is that a correlation coefficient will be insensitive to any systematic bias and will thus provide an overestimate of agreement depending on the magnitude of the systematic bias (alternatively, agreement measures will produce an underestimate of correlation). The studies included in this meta-analysis showed relatively little systematic bias (except for the caregiver and HCP dyad analysis). As a result, the authors cautiously combined measures for an overall metric of consistency. To combine the measures, the correlation coefficients and ICC/kappa statistics of each study group were averaged using Fisher's r to z transform. This way, each unique study group only provided one estimate, although the study might be included in a subgroup analysis providing input for a correlation coefficient and a separate input for an ICC. Similarly, studies that reported using more than one pain assessment instrument were averaged together for any overall analysis and then separated for moderator analyses.

The variations in ESs across the included studies were quantified using the I² statistic, which measures the proportion of variability attributable to heterogeneity. A value of I² > 25% is considered low heterogeneity, > 50% moderate heterogeneity, and > 75% high heterogeneity. Regardless of heterogeneity estimates, random-effects models were chosen in this systematic review, given the heterogeneity in demographics, instruments, and pain sources across included studies.[22]
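The steps described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' R code and not a reimplementation of “metafor”: the function names are hypothetical, the coefficients and sample sizes are invented for demonstration, the between-study variance uses a Hunter–Schmidt-type estimator of the form max(0, (Q − k)/Σw), and I² uses the common (Q − df)/Q definition. Results should not be expected to match “metafor” output exactly.

```python
import math


def fisher_z_average(coefs):
    """Average several coefficients from one study group on Fisher's z scale."""
    zs = [math.atanh(r) for r in coefs]
    return math.tanh(sum(zs) / len(zs))


def av_variances(rs, ns):
    """Sampling variances based on the sample-size weighted mean coefficient."""
    r_bar = sum(n * r for r, n in zip(rs, ns)) / sum(ns)
    return [(1 - r_bar ** 2) ** 2 / (n - 1) for n in ns]


def pool_random_effects(rs, vs):
    """Random-effects pooling with an assumed Hunter-Schmidt-type tau^2."""
    k = len(rs)
    w = [1 / v for v in vs]
    fixed = sum(wi * r for wi, r in zip(w, rs)) / sum(w)
    q = sum(wi * (r - fixed) ** 2 for wi, r in zip(w, rs))
    tau2 = max(0.0, (q - k) / sum(w))  # assumption: HS-type estimator
    w_star = [1 / (v + tau2) for v in vs]
    estimate = sum(wi * r for wi, r in zip(w_star, rs)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    # I^2 as the share of total variability beyond sampling error, in percent.
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {
        "estimate": estimate,
        "ci": (estimate - 1.96 * se, estimate + 1.96 * se),
        "tau2": tau2,
        "Q": q,
        "I2": i2,
    }


if __name__ == "__main__":
    # Hypothetical data: one coefficient per study group (invented values).
    rs = [0.30, 0.55, fisher_z_average([0.68, 0.72]), 0.62]
    ns = [40, 120, 60, 200]
    result = pool_random_effects(rs, av_variances(rs, ns))
    print(result)
```

Working on the raw coefficients mirrors the description above; the Fisher transform is used here only to average multiple coefficients from the same study group into a single input.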

2.7.2. Metaregression analysis for the dyads of child vs caregiver and child vs health care provider

Metaregression analyses were performed to determine any patterns between important study characteristics and estimates of ES. Four variables, namely, the year of publication, age of the child, source of pain, and pain intensity assessment scale, were included as moderators in separate metaregressions. There was significant heterogeneity among the 40 included studies. The 4 moderators were chosen for the following reasons: (1) Year of publication: to address changes over time in the quality of publications and in the level of education on the selection and application of pain intensity assessment scales in pediatric health care services; (2) Age of child: many scales are developed with reference to their suitability for a child based on age and cognitive ability, so it is important to assess the link between verbal ability, communication, comprehension, and pain rating consistency, particularly because a child's ability to self-report pain is likely to be lower at younger ages; (3) Source of pain: to identify the contexts in which pain is more consistently identified and rated between dyads in the pediatric setting; and (4) Pain rating scales: with a greater number and wider variety of pain intensity assessment scales being implemented in clinical settings, it is useful to know whether consistency differs between scales.
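A metaregression with a single continuous moderator, such as year of publication, amounts to a weighted regression of the ESs on that moderator. The sketch below is a simplified illustration, not the study's analysis: the function name `meta_regression` is hypothetical, the data are invented, the between-study variance `tau2` is passed in as a known value rather than estimated, and the test of the slope uses a normal approximation.

```python
import math


def meta_regression(ys, vs, xs, tau2=0.0):
    """Weighted least-squares fit of effect size on one moderator.

    Weights are 1 / (v_i + tau2), with tau2 supplied by the caller.
    """
    w = [1 / (v + tau2) for v in vs]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    slope_se = math.sqrt(1 / sxx)
    z = slope / slope_se
    return {
        "intercept": y_bar - slope * x_bar,
        "slope": slope,
        "slope_se": slope_se,
        "p_value": math.erfc(abs(z) / math.sqrt(2)),  # two-sided, normal
    }


if __name__ == "__main__":
    # Hypothetical effect sizes, sampling variances, and publication years.
    years = [1992, 1999, 2008, 2015, 2020]
    effects = [0.45, 0.52, 0.50, 0.63, 0.66]
    variances = [0.012, 0.008, 0.010, 0.004, 0.006]
    print(meta_regression(effects, variances, years, tau2=0.005))
```

A positive slope in such a fit would correspond to ESs increasing with the moderator, for example higher rating consistency in more recent publications.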

2.7.3. Narrative synthesis for the dyad of nurse vs health care provider

Only a limited number of studies examined the consistency of a child's pain ratings between nurse and other HCPs (nurse vs physician [n = 2], nurse vs investigator [n = 2], and nurse vs pain expert [n = 1]). Therefore, the results were narratively synthesized rather than statistically pooled using meta-analysis.

3. Results

3.1. Search results

A total of 9,397 records were generated for this updated search (2008 till August 6, 2021). After removing 2,041 duplicates, the titles and abstracts of the remaining 7,356 records were screened, and 7,312 records were excluded because they were irrelevant to the aim of this systematic review, for example, studies that assessed a child's pain intensity by either child or caregiver but made no comparisons using correlation or ICC, or studies that involved both children and adults. Of the 44 remaining records, 7 were conference abstracts only and were excluded. The full text of 37 records was retrieved and assessed against the selection criteria, and 13 records were further excluded. Studies were excluded because they did not use one of the measures of rating consistency detailed above but instead used a paired-samples t test (n = 3),[3,20,33] Cohen's kappa (n = 3),[32,45,47] the Wilcoxon signed ranks test (n = 2),[29,58] absolute discrepancy (n = 1),[63] or percentage of accurate agreement (n = 1)[30]; because they were non-English publications (n = 2)[28,37]; or because they assessed pain using a yes/no response (n = 1).[12] The reference lists of the remaining 24 records were hand searched for further publications, but no further relevant article was identified (Fig. 1). This systematic review included all 12 of the studies included in the previous meta-analysis. Eight studies originally excluded from the previous meta-analysis were also reviewed, and 4 of these were included because this update accepted a wider array of tests, including ICC, weighted kappa, Pearson correlation, Spearman correlation, or Kendall's tau correlation. As a result, a total of 40 studies were included in this updated systematic review. 
The 40 studies are grouped and displayed under the 4 dyads: child vs caregiver (n = 30),[2,4-11,13,15,18,19,23,27,31,34,36,39,41,43,44,48,53,54,56,57,61,64,67] child vs HCP (n = 17),[8,11,13,14,16,23,31,35,41,44,46,51,53,54,57,60,64] caregiver vs HCP (n = 10),[8,11,13,23,41,44,53,54,64,67] and nurse vs other HCPs (n = 5).[17,21,26,35,59] Fifteen studies examined more than one dyad (Appendix B, available at http://links.lww.com/PR9/A168).
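The record counts reported in this section can be checked with simple arithmetic. The short Python sketch below uses only the numbers stated above; the variable names are chosen here for illustration.

```python
# Counts taken from the search results described in the text.
identified = 9397
duplicates = 2041
excluded_on_title_abstract = 7312
conference_abstracts = 7
full_text_exclusions = {
    "paired-samples t test": 3,
    "Cohen's kappa": 3,
    "Wilcoxon signed ranks test": 2,
    "absolute discrepancy": 1,
    "percentage of accurate agreement": 1,
    "non-English publication": 2,
    "yes/no pain response": 1,
}
previous_meta_analysis = 12
previously_excluded_now_included = 4

screened = identified - duplicates
after_screening = screened - excluded_on_title_abstract
full_text_assessed = after_screening - conference_abstracts
new_included = full_text_assessed - sum(full_text_exclusions.values())
total_included = (
    new_included + previous_meta_analysis + previously_excluded_now_included
)

if __name__ == "__main__":
    print(screened, after_screening, full_text_assessed, new_included)
    print(total_included)
```

Each intermediate value corresponds to a stage of the flow chart in Figure 1.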

3.2. Study quality assessment outcome

Critical appraisal of the included studies is summarized in Table 1. Thirty-five studies scored between 2 and 3/10 (high risk) while the remaining 5 studies scored from 4 to 6/10 (moderate risk).

3.3. Characteristics of included studies

Fifteen of the 40 included studies were conducted in the United States, followed by Canada (n = 6), Australia (n = 5), 2 each in Sweden and the United Kingdom, and one each in France, Greece, India, New Zealand, Norway, Portugal, Spain, South Korea, Taiwan, and Ukraine (Appendix B, available at http://links.lww.com/PR9/A168). Twenty-three included studies reported study durations ranging from 17 days[8] to 5 years.[27] Thirty-three included studies were conducted at a single site, and the remaining 7 accessed multiple sites. A total of 4,628 children were involved in the 40 included studies. The sample size ranged from 13[35] to 667.[48] The age range of the children was from neonate[17] to 18 years old.[15,39] Three main sources of pain were identified: surgical-related pain (n = 16), non–surgical-related pain (n = 14), and procedural-related pain (n = 10). Surgical-related pain refers to operations under general anaesthetics, for example, ear, nose and throat, dental, abdominal, orthopedic, and urological surgeries. Non–surgical-related pain refers to specific health conditions including advanced cancer; infection; injuries; musculoskeletal conditions; cerebral palsy; or neonates at the neonatal intensive care unit. Procedural-related pain refers to patients who received procedures without general anaesthetics, including intravenous cannula insertion, immunisation vaccine injection, or postmedical treatment or diagnostic procedures. A total of 17 pain intensity assessment scales were used in the included studies, and some studies used multiple scales in pain assessment. 
The scales include Bodily Pain and Discomfort items of the Child Health Questionnaire, Colour Analogue Scale, the Common Toxicity Criteria-Revised, Facial Analogue Scale, Face, Legs, Activity, Cry, Consolability (FLACC), FPS or FPS-Revised (FPS-R), Memorial Symptom Assessment Scale, Numeric Rating Scale (NRS), Oucher Scale, Premature Infant Pain Profile-Revised, Postoperative Pain Measure For Parents, Patient Self-Reported Pain Intensity Measurement Instruments (PPQ), Pediatric Memorial Symptom Assessment Scale, Royal College of Emergency Medicine Composite Pain Scale, VAS, and Wong-Baker Faces Pain Rating Scale (WBF).

3.4. Consistency of pediatric pain ratings between the child and caregiver dyad

3.4.1. Meta-analysis results

Thirty studies were included, with 17 ESs from 14 studies analysed for ICC/weighted kappa analysis and 27 ESs from 24 studies analysed for correlation analysis. Figure 2 presents the forest plot summarizing the ESs for consistency of pain ratings across the different measures used. There was moderate consistency in ratings between the dyad across both summary measures: correlation = 0.59 (95% CI 0.52–0.65), and ICC/weighted kappa = 0.51 (95% CI 0.39–0.63). There was significant heterogeneity across studies for both summary measures (ICC/weighted kappa I² = 87.6% and correlation I² = 78.2%).

Figure 2.

Forest plot of pain rating consistency between child and caregiver. CI, confidence interval; ICC, intraclass correlation.

Moderator analysis comparing ICC/weighted kappa with correlation was not statistically significant (Q = 1.35, P = 0.25). Combining effect measures did not substantially increase heterogeneity (I² = 81.5%). Combining ICC/weighted kappa with correlation similarly indicated moderate consistency in ratings between child and caregiver (ES = 0.55; 95% CI 0.48–0.62).

3.4.2. Moderator regression analysis results

Figure 3 presents the moderator analysis for year of publication and age of children. There was a positive association between year of publication and consistency of pain ratings between child and caregiver for correlation analysis (β = 0.007; P = 0.013), ICC/weighted kappa (β = 0.017; P = 0.011), and the combined measure (β = 0.008; P = 0.011). There was no significant relationship between the consistency of pain ratings and the age of the children in the study (β = 0.012; P = 0.299), or with the pain rating assessment scale used by the child (Fig. 4, Appendix C and D, available at http://links.lww.com/PR9/A168).

Figure 3.

Bubble plot of the relationship between age and year of publication with child and caregiver rating consistency. ICC, intraclass correlation.

Figure 4.

Forest plot for the effect of child pain assessment scale on child and caregiver rating consistency (combined ICC/WK and correlation). CAS, Colour Analogue Scale; CI, confidence interval; FPS-R, Faces Pain Scale-Revised; ICC, intraclass correlation; NRS, Numeric Rating Scale; VAS, Visual Analogue Scale; WBF, Wong-Baker Faces; WK, weighted kappa.

Figure 5 presents the forest plot of effects for the moderator analysis of pain assessment scale used by the caregiver. Individual ICC/weighted kappa or correlation analyses did not show a significant effect (Appendix E and F, available at http://links.lww.com/PR9/A168); however, the combined measure indicated a significant effect of the pain assessment scale used (Q = 12.22, P = 0.032) (Fig. 5). The highest pain rating consistency was when the caregiver used the NRS (0.65) followed by WBF (0.63) and VAS (0.61), and the lowest was when the caregiver used the FPS-R (0.43).

Figure 5.

Forest plot for the effect of caregiver pain assessment scale on child and caregiver rating consistency (combined ICC/WK and correlation). CI, confidence interval; FLACC, Face, Legs, Activity, Cry, Consolability; FPS-R, Faces Pain Scale-Revised; ICC, intraclass correlation; NRS, Numeric Rating Scale; VAS, Visual Analogue Scale; WBF, Wong-Baker Faces; WK, weighted kappa.

Figure 6 presents the forest plot effects for the moderator analysis for the source of pain. There was a significant effect with the ICC/weighted kappa analysis (Q = 11.35, P = 0.003) (Appendix G, available at http://links.lww.com/PR9/A168), but not for the correlation analysis (Q = 4.12, P = 0.127; Appendix H, available at http://links.lww.com/PR9/A168). When combining 2 measures into one analysis, the result was statistically significant (Q = 8.68; P = 0.013), with surgical pain having higher consistency than procedural pain, with nonsurgical pain in between (Fig. 6).
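The moderator Q statistics above are referred to a chi-square distribution. The Python sketch below shows how the reported P values for source of pain can be approximately recovered. It assumes 2 degrees of freedom, which is an assumption based on there being 3 pain-source categories and is not stated in the text; the function name `q_test_p_value` is hypothetical, and only the closed-form cases df = 1 and df = 2 are handled.

```python
import math


def q_test_p_value(q: float, df: int) -> float:
    """Upper-tail chi-square probability for df = 1 or df = 2."""
    if q < 0:
        raise ValueError("Q must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(q / 2))
    if df == 2:
        return math.exp(-q / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


if __name__ == "__main__":
    # Q values reported for the source-of-pain moderator (child and caregiver).
    for label, q in [("ICC/weighted kappa", 11.35),
                     ("correlation", 4.12),
                     ("combined", 8.68)]:
        print(label, round(q_test_p_value(q, df=2), 3))
```

With df = 2, the three Q values round to the P values reported in the paragraph above (0.003, 0.127, and 0.013).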

Figure 6.

Forest plot for the effect of source of pain on child and caregiver rating consistency (combined ICC/WK and correlation). CI, confidence interval; ICC, intraclass correlation; WK, weighted kappa.

3.5. Consistency of pediatric pain ratings between the child and health care provider dyad

3.5.1. Meta-analysis results

Figure 7 presents the forest plot examining consistency between child and HCP dyad (n = 17 studies). Thirteen studies involved nurses only, 3 involved both nurses and physicians, and 2 involved physicians only. There were 6 ESs from 4 studies for ICC/weighted kappa, and 17 ESs from 15 studies for correlation. The correlation ES was moderate (ES = 0.49, 95% CI = 0.34–0.55), and the ICC/weighted kappa ES was weak-moderate (ES = 0.38, 95% CI = 0.19–0.58). Heterogeneity was high for the ICC/weighted kappa studies (I² = 84%) and moderate for correlation studies (I² = 61.7%).

Figure 7.

Forest plot of pain rating consistency between child and health care provider. CI, confidence interval; ICC, intraclass correlation.

Moderator analysis comparing ICC/weighted kappa with correlation was not statistically significant (Q = 1.29, P = 0.26). Pooling measures modestly increased heterogeneity relative to the correlation-only analysis (I² = 75.6%). Combining summary measures indicated weak-moderate consistency in ratings between child and HCP (ES = 0.45; 95% CI 0.34–0.55).

3.5.2. Moderator regression analysis results

Year of publication, age of the children, child self-report scale, and HCP scale were not significantly associated with consistency of pain ratings between child and HCP (β = −0.007, P = 0.118; β = −0.023, P = 0.316; Q = 0.73, P = 0.865; Q = 5.74, P = 0.125, respectively; Appendix I to Appendix O, available at http://links.lww.com/PR9/A168). Moderator analysis investigating the source of pain indicated a statistically significant difference among nonsurgical, surgical, and procedural pain for correlation, ICC/weighted kappa, and combined studies (Q = 15.22, P < 0.001; Q = 12.08, P < 0.001; Q = 28.55, P < 0.001, respectively). Consistency was greatest for surgical pain, followed by procedural and nonsurgical pain (Appendix P to Appendix R, available at http://links.lww.com/PR9/A168).

3.6. Consistency of pediatric pain ratings between the caregiver and health care provider dyad

3.6.1. Meta-analysis results

Figure 8 presents the forest plot for the meta-analysis performed on caregiver and HCP dyad (n = 10 studies). ICC/weighted kappa provided 2 ESs from 2 studies, and correlation provided 8 ESs from 8 studies for analysis. The correlation ES was moderate (ES = 0.49, 95% CI = 0.32–0.65), whereas the ICC/weighted kappa ES was weak to moderate (ES = 0.27, 95% CI = −0.06 to 0.61). Heterogeneity was high for the correlation (I² = 78.3%) and ICC/weighted kappa (I² = 81.1%) studies.

Figure 8.

Forest plot of pain rating consistency between caregiver and health care provider. CI, confidence interval; ICC, intraclass correlation.

3.6.2. Moderator regression analysis results

There was no significant relationship between consistency of pain ratings between caregiver and HCP and either year of publication or mean/median age of children (Appendix S, available at http://links.lww.com/PR9/A168). Moderator analysis of pain assessment scales comparing ICC/weighted kappa with correlation was not statistically significant (Q = 1.3, P = 0.25), albeit with only 2 ESs contributing to the ICC/weighted kappa ES. Pooling measures did not increase heterogeneity, which remained high (I² = 83.1%). Combining summary measures indicated weak to moderate consistency (0.41; 95% CI 0.22–0.59; Appendix T and U, available at http://links.lww.com/PR9/A168).

3.7. Consistency of pediatric pain ratings between the nurse and other health care provider dyad

Five studies examined the consistency of child's pain ratings between the nurse and other HCP dyad: 3 studies compared nurses with physicians, and one each compared nurses and investigators and nurses and pain experts. The results were not statistically pooled. Two studies using ICC or weighted kappa were highly discordant (ES = 0.17[26] and 0.87[59]). One study that used a Pearson correlation showed a high ES (ES = 0.90[35]).

4. Discussion

The overall aim of this meta-analysis was to synthesize additional research published since the meta-analysis conducted in 2008. The updated meta-analysis found support for the hypothesis of moderate pain rating consistency between the child and caregiver dyad. It expanded the number of eligible studies through the inclusion of ICC/weighted kappa statistics to assess agreement and found a moderate level of agreement in the pain ratings between the child-caregiver dyad, consistent with the correlation assessment. A weak-moderate consistency of pediatric pain ratings between the child and HCP dyad was found across both agreement and correlation metrics, indicating lower consistency than the moderate consistency predicted by our initial hypothesis. Mixed support was found for the hypothesis that pain intensity rating consistency between caregiver and HCP would be weaker than for the child-caregiver dyad. Specifically, consistency depended on the measure, with agreement and correlation measures indicating weak and moderate ESs, respectively, between the caregiver and HCP dyad. The results of this updated meta-analysis are compared with the 2008 meta-analysis in Table 2.[66] Correlation ESs between the 2 reviews for the child-caregiver dyad were practically equivalent (r = 0.64 vs 0.59, within 0.05 of each other). The effect estimates for the correlation between the dyads child-HCP (0.58 vs 0.49) and for the caregiver-HCP dyad (0.49 vs 0.41) were slightly more discordant between the meta-analyses. These differences may have arisen because the previous meta-analysis had too few studies for a stable ES estimate. Alternatively, this study included all HCPs (nurses and physicians) in the meta-analysis, whereas the previous meta-analysis reported ratings only from nurses.

Table 2

Summary effect sizes compared with the earlier published meta-analysis.

Dyad	ICC/weighted kappa (1990–6 August 2021, 40 studies)	Pearson, Spearman, or Kendall correlation (1990–6 August 2021, 40 studies)	Combined (1990–6 August 2021, 40 studies)	Pearson r (1990–2007, 12 studies)
Child and caregiver	0.51	0.59	0.55	0.64
Child and health care provider	0.38	0.49	0.45	0.58 (child and nurse)
Caregiver and health care provider	0.27	0.49	0.41	0.49 (caregiver and nurse)

ICC, intraclass correlation.
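
The comparisons drawn from Table 2 are simple differences between point estimates, and they can be checked with a short sketch such as the one below. The values are copied from Table 2; the dictionary layout and the names `table2`, `gaps`, and `summary` are illustrative assumptions, not part of the original analysis. Note that in Table 2 the caregiver-HCP value of 0.41 is the combined estimate, whereas the correlation-only estimate is 0.49, and that the 1990–2007 column refers to nurses only.

```python
# Point estimates copied from Table 2 (key names are illustrative).
# "r_2008" is the Pearson r from the 1990-2007 meta-analysis; for the
# two HCP rows that earlier value refers to nurses only.
table2 = {
    "child-caregiver": {"icc_wk": 0.51, "corr": 0.59, "combined": 0.55, "r_2008": 0.64},
    "child-HCP": {"icc_wk": 0.38, "corr": 0.49, "combined": 0.45, "r_2008": 0.58},
    "caregiver-HCP": {"icc_wk": 0.27, "corr": 0.49, "combined": 0.41, "r_2008": 0.49},
}


def gaps(row):
    """Differences between point estimates for one dyad, rounded to 2 decimals."""
    return {
        # Correlation minus agreement (ICC/weighted kappa) in the updated review
        "corr_minus_icc_wk": round(row["corr"] - row["icc_wk"], 2),
        # Earlier Pearson r minus the updated correlation estimate
        "r_2008_minus_corr": round(row["r_2008"] - row["corr"], 2),
        # Earlier Pearson r minus the updated combined estimate
        "r_2008_minus_combined": round(row["r_2008"] - row["combined"], 2),
    }


summary = {dyad: gaps(row) for dyad, row in table2.items()}

for dyad, g in summary.items():
    print(dyad, g)
```

These are differences between point estimates only; they take no account of the confidence intervals or of the heterogeneity across studies.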

There was significant heterogeneity across studies for practically all analyses. Differences in health condition, study setting, population, and pain assessment scales are worth noting and were assessed via metaregression. Most of the 40 studies included in this systematic review were cross-sectional (n = 37). Three studies assessed pain rating consistency at multiple time points, all reporting acute pain with respect to a clearly defined intervention.[2,10,27] The results varied across assessment points, with a trend toward higher pain rating consistency on day 1 after surgery than on the following days. In addition, only one study reported consistency in ratings separately for mothers and fathers, in consideration of their different levels of bonding and personal experience with pain. However, the results showed little difference in pain rating consistency between the mother-child and father-child dyads.[57] All other studies simply reported whether the rating was conducted by a parent, caregiver, or guardian.

Furthermore, 8 of the 40 included studies specified that pain intensity ratings were conducted blind, ie, each rater was completely unaware of any other rater's scoring. Another 8 studies stated that the pain assessment was conducted independently or separately, but without a clear indication of whether the raters were truly blinded. The remaining 24 studies did not specify how the ratings were undertaken. Therefore, summary ESs could be inflated depending on the actual independence of ratings in studies that did not report blinding status, as participants might be consciously or unconsciously influenced by the other rater in the dyad.

Interestingly, there was little difference in ES as a function of the statistical test used (ICC/weighted kappa or correlation).
Studies that reported both a correlation and an agreement metric usually reported very similar values for the two. A formal comparison of ICC/weighted kappa with correlation found no statistically significant difference, and combining effect measures did not substantially increase heterogeneity. In addition, the ES difference between these 2 measures was approximately 0.08 (in favour of correlation) in the largest sample (child-parent dyad). As noted earlier, a major difference between these measures is how they handle systematic bias: correlation is indifferent to even large systematic biases. Consistency in ES between studies using agreement or correlation measures therefore potentially suggests little systematic bias in ratings. Future studies should consider using Bland–Altman analysis,[1] which combines an explicit assessment of both bias and agreement in a single analysis. In our sample, only one of the 40 included studies[36] incorporated a Bland–Altman analysis; its result echoed the ICC analysis.

The metaregression results support the second hypothesis of this study, that year of publication, source of pain, and pain intensity assessment scale would significantly affect pain rating consistency between the dyads. The results, however, did not support the hypothesis that the child's age contributes to pain intensity rating consistency between the dyads. For the child and caregiver dyad, year of publication and the pain assessment scales used by caregivers were significantly associated with greater consistency between pediatric pain ratings. More recent publications have shown greater consistency of pediatric pain ratings between children and caregivers. Although the age range of the children assessed differed with publication year,[7,39,48,61] the effect of the child's age itself as a moderator was not statistically significant. Interestingly, the caregiver assessment scales used differed by year of publication.
The mean year of publication for the FPS-R and VAS studies was 2004, whereas the mean year for the NRS, WBF, PPQ, and FLACC was 2012 and beyond (Fig. 5). Although this may have contributed to the association with the year of publication, with NRS being the highest (mean publication date of 2018) and FPS-R the lowest (mean publication date 2004), it does not account for the VAS, FLACC, and PPQ outcomes. Another possible reason for the increase in rating consistency over time is the growing recognition of the importance of pain assessment. Significant efforts and resources have been allocated in the past decade to educating children, caregivers, and HCPs on the selection and use of pain assessment scales, which may have contributed to the greater consistency in pain ratings evident in research published in recent years.

In addition, pain rating consistency between the child and caregiver dyad was greater when caregivers used the VAS, NRS, and WBF scales. All 3 of these scales use a single item for assessing pain, which is valid, simple to understand, and quick to administer, potentially increasing consistency between reporters in a complex clinical setting.[42]

Moderator analysis of the child and HCP dyad measures indicated that pain caused by surgical and procedural interventions was associated with increased consistency of pediatric pain ratings. Three of the 9 studies in the dyad of child and HCP involved children who had undergone general, orthopedic, or ENT surgeries.[13,14,31] The remaining 5 studies examined children who had invasive procedures or general health conditions. Health care providers working on surgical units/wards principally recognise the importance of selecting appropriate pain assessment scales and of pain management in promoting more positive postoperative patient outcomes.
Further investigation of factors affecting the consistency of pain intensity ratings among the dyads is suggested, particularly in the area of non–surgical-related pain assessment.

This updated meta-analysis and metaregression included a relatively large number of studies examining pain rating consistency for 2 of the 3 dyads: child and caregiver, and child and HCP. Based on the results, when the child is unable to self-report for medical or developmental reasons, the caregiver is the next most appropriate person to provide a proxy assessment of the child's pain intensity, particularly in a surgical context. Improved consistency in pain intensity ratings between child and HCP in the surgical/procedural context suggests that HCPs can provide useful proxy ratings of pain intensity in the absence of a caregiver.

Pain is a multidimensional experience; therefore, a pain intensity rating by a child and/or caregiver may be influenced not only by acute physical suffering but also by psychosocial and environmental factors. The pain intensity assessment scales used in the 40 included studies were reliable and valid tools, particularly for children with acute illnesses. Outcomes for children with chronic health conditions are more complex and require inclusion of psychosocial and environmental factors.[49] The authors suggest applying the Patient-Reported Outcomes Measurement Information System (PROMIS) for these situations. PROMIS evaluates 5 domains of a person's well-being when experiencing pain: physical function, fatigue, pain, emotional distress, and social health.[25] PROMIS can be used by both adults and children, with or without chronic conditions. This may improve the consistency of pain evaluations by acknowledging the psychological and psychosocial factors that substantially influence chronic pain perception in children and their caregivers.
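
The distinction drawn in this Discussion between correlation and agreement, and the recommendation to use Bland–Altman analysis, can be illustrated with a minimal sketch. The scores below are hypothetical 0 to 10 ratings invented for illustration, not data from any included study, and the function names `pearson_r` and `bland_altman` are our own. The sketch assumes the conventional 95% limits of agreement, computed as the mean difference plus or minus 1.96 standard deviations of the paired differences.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for paired ratings.

    Bias is the mean of the paired differences (x - y); the limits of
    agreement are bias +/- 1.96 * SD of those differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired pain scores (illustrative only): the caregiver
# tracks the child's ratings closely but scores 1 to 3 points lower.
child = [2, 4, 5, 6, 7, 8, 9, 10]
caregiver = [1, 2, 3, 5, 4, 6, 8, 7]

r = pearson_r(child, caregiver)
bias, lower, upper = bland_altman(child, caregiver)

print(f"Pearson r = {r:.2f}")
print(f"Bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

In this constructed example the correlation is high even though the caregiver systematically rates lower than the child, whereas the Bland–Altman bias and limits of agreement make that offset explicit. This is the sense in which correlation alone can mask systematic bias between raters.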

5. Conclusions

This meta-analysis presents an update of the literature evaluating pediatric pain rating consistency between multiple dyads involved in pediatric care in the clinical setting. Moderate consistency of pain ratings was found between the child and caregiver dyad in studies using measures of agreement or correlation. The consistency between other dyads was weaker. A more recent date of publication, specific pain assessment scales used by the caregivers (VAS, NRS, and WBF), and pain related to surgical intervention were associated with increased pain rating consistency for the child and caregiver dyad. Future studies should consider including Bland–Altman analyses when quantifying agreement of pediatric pain intensity ratings between dyads. Application of an assessment scale such as PROMIS, which assesses a wider impact of pain on children, may further improve consistency in pain intensity ratings while better reflecting the psychosocial impact of pain in chronic conditions.

Disclosures

The authors have no conflicts of interest to declare. Supplemental digital content associated with this article can be found online at http://links.lww.com/PR9/A168.

58 in total

1. Perceptions of children and their parents about the pain experienced during their hospitalization and its impact on parents' quality of life.

Authors: Vasiliki Matziou; Efrosini Vlachioti; Eustathia Megapanou; Agapi Ntoumou; Christina Dionisakopoulou; Vasia Dimitriou; Konstantinos Tsoumakas; Theodora Matziou; Pantelis Perdikaris
Journal: Jpn J Clin Oncol Date: 2016-06-15 Impact factor: 3.019

2. Can we screen young children for their ability to provide accurate self-reports of pain?

Authors: Carl L von Baeyer; Lindsay S Uman; Christine T Chambers; Adele Gouthro
Journal: Pain Date: 2011-03-10 Impact factor: 6.961

3. Qualitative Evaluation of Pediatric Pain Behavior, Quality, and Intensity Item Candidates and the PROMIS Pain Domain Framework in Children With Chronic Pain.

Authors: C Jeffrey Jacobson; Susmita Kashikar-Zuck; Jennifer Farrell; Kimberly Barnett; Ken Goldschneider; Carlton Dampier; Natoshia Cunningham; Lori Crosby; Esi Morgan DeWitt
Journal: J Pain Date: 2015-08-31 Impact factor: 5.820

4. The FLACC behavioral scale for procedural pain assessment in children aged 5-16 years.

Authors: Stefan Nilsson; Berit Finnström; Eva Kokinsky
Journal: Paediatr Anaesth Date: 2008-08 Impact factor: 2.556

5. Correlates of pain-rating concordance for adolescents with sickle cell disease and their caregivers.

Authors: Lamia P Barakat; Katherine Simon; Lisa A Schwartz; Jerilynn Radcliffe
Journal: Clin J Pain Date: 2008-06 Impact factor: 3.442

6. Patient versus parental perceptions about pain and disability in children and adolescents with a variety of chronic pain conditions.

Authors: Thomas R Vetter; Cynthia L Bridgewater; Lee I Ascherman; Avi Madan-Swain; Gerald L McGwin
Journal: Pain Res Manag Date: 2013-10-21 Impact factor: 3.037

7. Effect of Gum Chewing on Pain and Anxiety in Turkish Children During Intravenous Cannulation: A Randomized Controlled Study.

Authors: Sacide Yildizeli Topcu; Melahat Akgun Kostak; Remziye Semerci; Ozlem Guray
Journal: J Pediatr Nurs Date: 2019-12-27 Impact factor: 2.145