
Cross-Cultural Adaptation and Psychometric Properties of Quality of Life Scales for Arabic-Speaking Adults: A systematic review.

Mohammed Al Maqbali¹,², Jackie Gracey¹, Jane Rankin³, Lynn Dunwoody⁴, Eileen Hacker⁵, Ciara Hughes¹.

Abstract

This review aimed to explore the psychometric properties of quality of life (QOL) scales to identify appropriate tools for research and clinical practice in Arabic-speaking adults. A systematic search of the Cumulative Index to Nursing and Allied Health Literature® (EBSCO Information Services, Ipswich, Massachusetts, USA), MEDLINE® (National Library of Medicine, Bethesda, Maryland, USA), EMBASE (Elsevier, Amsterdam, Netherlands) and PsycINFO (American Psychological Association, Washington, District of Columbia, USA) databases was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Quality assessment criteria were then utilised to evaluate the psychometric properties of identified QOL scales. A total of 27 studies relating to seven QOL scales were found. While these studies provided sufficient information regarding the scales' validity and reliability, not all reported translation and cross-cultural adaptation processes. Researchers and clinicians should consider whether the psychometric properties, subscales and characteristics of their chosen QOL scale are suitable for use in their population of interest. © Copyright 2020, Sultan Qaboos University Medical Journal, All Rights Reserved.


Keywords: Cross-Cultural Comparison; Psychometrics; Quality of Life; Surveys and Questionnaires; Systematic Review; Translations; Validity and Reliability


Year: 2020 PMID: 32655904 PMCID: PMC7328836 DOI: 10.18295/squmj.2020.20.02.002

Source DB: PubMed Journal: Sultan Qaboos Univ Med J ISSN: 2075-051X

Quality of life (QOL) is a multidimensional construct which relies on personal characteristics as well as contextual and environmental variables.1 In medical research and clinical practice, the assessment of QOL is important because it measures the effect of a disease or medical intervention on affected patients. In addition, QOL is an essential endpoint in treatment planning for policy-makers, healthcare providers and patients themselves.2 In recent years, a focus on patients’ functioning, lifestyles and well-being has increased medical interest in tools for measuring QOL.2 Accordingly, it is necessary to identify robust scales with satisfactory psychometric properties that can be used for this purpose. However, as most QOL assessment scales are initially designed in English, these scales need to be translated and adapted for use in different languages and cultures. Cross-cultural adaptation and translation is a systematic process that prepares questionnaires and scales for use in another setting.3 It is crucial that the scale maintains its content validity after translation and cultural adaptation.

Reliability refers to the reproducibility or consistency of scores from one assessment to another, usually assessed via measures of internal consistency, inter- or intra-rater reliability or test-retest reliability.4 Internal consistency is generally reported as an alpha coefficient ranging from 0 (no correlation) to 1 (perfect correlation), with values of ≥0.70 and ≥0.90 considered acceptable and highly reliable, respectively.5 In contrast, validity is the ability of the scale to measure the attributes of the construct under consideration (i.e. the degree to which the scale measures that which it is intended to measure). Validity is divided into three types: content, construct and criterion validity, with the latter encompassing concurrent and predictive validity.6

Worldwide, there are approximately 420 million Arabic speakers living in 23 countries.7 Generally, there are two main types of Arabic: modern standard Arabic, which is primarily used in written form in official and educational settings, and the differing regional and colloquial dialects.8 In 1998, Coons et al. conducted the first psychometric study to translate and validate an Arabic version of a QOL scale.9 Since then, many different QOL scales have been translated, resulting in a need to determine those which demonstrate satisfactory cross-cultural adaptation and validity. As such, this review aimed to explore the psychometric properties and translation and cross-cultural adaptation processes of Arabic QOL scales in order to identify appropriate scales that can be used for research and clinical practice in Arabic-speaking adults.
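As an illustration of the internal consistency statistic described above, the following is a minimal Python sketch of Cronbach's alpha using the standard formula α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function names and the example responses are hypothetical and are not taken from any of the reviewed studies; the interpretation labels simply apply the thresholds quoted above (≥0.70 acceptable, ≥0.90 highly reliable).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary, so alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Label an alpha value using the thresholds quoted in the text."""
    if alpha >= 0.90:
        return "highly reliable"
    if alpha >= 0.70:
        return "acceptable"
    return "below the acceptable threshold"


# Hypothetical 4-point Likert responses from five respondents to three items.
example = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [4, 3, 4], [4, 4, 4]]
alpha = cronbach_alpha(example)
print(round(alpha, 2), interpret_alpha(alpha))
```

The sketch uses sample variances (n − 1 denominator) throughout; because the same denominator appears in the numerator and the denominator of the ratio, using population variances instead would give the same alpha.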

Methods

This systematic review was carried out according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.10 A systematic search of the Cumulative Index to Nursing and Allied Health Literature® (EBSCO Information Services, Ipswich, Massachusetts, USA), MEDLINE® (National Library of Medicine, Bethesda, Maryland, USA), EMBASE (Elsevier, Amsterdam, Netherlands) and PsycINFO (American Psychological Association, Washington, District of Columbia, USA) databases was conducted in order to identify studies investigating QOL among Arabic-speaking participants published between January 1946 and April 2019. The search terms included combinations of free-text words and Medical Subject Headings® (National Library of Medicine) with Boolean operators (i.e. or/and) as follows: “psychometrics”, “reliability”, “validity” or “instrument validation” and “Arabs” or “medicine, Arabic” and “functional status”, “well-being”, “quality of life”, “health status”, “health and life quality”, “quality of health care”, “assessment”, “patient assessment”, “clinical assessment tools”, “health impact assessment”, “outcome assessment”, “measurement tool” or “questionnaires”. In addition, the reference lists of identified articles were screened to find other potential publications that could be included in the analysis.

All articles identified during the literature search were assessed to determine their eligibility. Articles were considered eligible for inclusion if they: (1) were published in English; (2) involved adults over 18 years of age; (3) were primarily psychometric studies with information concerning validity or reliability; (4) utilised QOL measures translated into Arabic; and (5) involved an Arabic-speaking population. No restrictions were placed on study design. Studies with QOL scales developed and validated for a specific disease were excluded; however, those used for multiple types of cancer were permitted.

Overall, a total of 1,087 articles were identified during the database search; however, this was reduced to 43 following screening of the titles and abstracts, with 27 articles meeting the inclusion criteria after full-text screening [Figure 1].

Figure 1

Diagram showing the search process used to identify articles included in this systematic review.

The psychometric properties of identified QOL scales were then evaluated according to nine quality assessment criteria suggested by Terwee et al., namely content validity, internal consistency, criterion validity, construct validity, reproducibility (comprising agreement and reliability), responsiveness, floor and ceiling effects and interpretability [Table 1].11 Each scale was given either a positive (+), indeterminate (?) or negative (−) rating for each of these measures, or a rating of 0 if no information was available. Terwee et al. recommended presenting the assessment results in a table, but not using an overall score, as this would bestow equal importance on each psychometric property, which is not necessarily appropriate.11

Table 1

Criteria for assessing the psychometric properties of quality of life scales11

Property	Definition	Rating*	Quality criteria
Content validity	The extent to which the domain of interest is comprehensively sampled by the items in the questionnaire	+	A clear description is provided of the measurement aim, the target population, the concepts that are being measured and the item selection AND both target population and investigators OR experts are involved in item selection
		?	A clear description of the aforementioned aspects is missing OR only the target population is involved OR doubtful† design or methods
		−	No target population involvement
		0	No information found
Internal consistency	The extent to which items in a scale or subscale are intercorrelated (i.e. measuring the same construct)	+	Factor analyses are performed on an adequate sample size (calculated to be at least seven times the number of items AND >100) AND Cronbach’s alpha(s) is calculated per dimension AND Cronbach’s alpha(s) is between 0.70–0.95
		?	No factor analysis OR doubtful† design or methods
		−	Cronbach’s alpha is <0.70 or >0.95, despite adequate design and methods
		0	No information found
Criterion validity	The extent to which scores on a particular questionnaire relate to a gold standard	+	Convincing arguments to support gold standard AND correlation with the gold standard of ≥0.70
		?	No convincing arguments to support gold standard OR doubtful† design or methods
		−	Correlation with the gold standard of <0.70, despite adequate design and methods
		0	No information found
Construct validity	The extent to which scores on a particular questionnaire refer to other measures in a manner consistent with theoretically supported hypotheses relating to the concepts being measured	+	Specific hypotheses are formed AND at least 75% of the results are in accordance with these hypotheses
		?	Doubtful† design or methods (e.g. no hypotheses)
		−	Less than 75% of the hypotheses are confirmed, despite adequate design and methods
		0	No information found
Reproducibility
Agreement	The extent to which scores on repeated measures are close to each other (i.e. absolute measurement error)	+	The SDC is less than the MIC OR the MIC is outside the LOA OR convincing arguments that the level of agreement is acceptable
		?	Doubtful† design or methods OR the MIC is not defined AND no convincing arguments that the level of agreement is acceptable
		−	The MIC is less than or equal to the SDC OR the MIC equals or is inside the LOA, despite adequate design and methods
		0	No information found
Reliability	The extent to which subjects can be distinguished from each other, despite measurement errors (i.e. relative measurement error)	+	The ICC or Cohen’s weighted kappa is >0.70
		?	Doubtful† design or methods (e.g. time interval not mentioned)
		−	The ICC or Cohen’s weighted kappa is ≤0.70, despite adequate design and methods
		0	No information found
Responsiveness	The ability of a questionnaire to detect clinically important changes over time	+	The SDC is less than the MIC OR the MIC is outside the LOA OR the RR is >1.96 OR the AUC is ≥0.70
		?	Doubtful† design or methods
		−	The SDC is more than or equal to the MIC OR the MIC equals or is inside the LOA OR the RR is ≤1.96 OR the AUC is <0.70, despite adequate design and methods
		0	No information found
Floor and ceiling effects	The number of responders who achieve the lowest or highest possible scores	+	<15% of the respondents achieve the highest or lowest possible scores
		?	Doubtful† design or methods
		−	>15% of the respondents achieve the highest or lowest possible scores, despite adequate design and methods
		0	No information found
Interpretability	The degree to which one can assign qualitative meaning to quantitative scores	+	Mean and SD scores are presented for at least four relevant subgroups of patients AND the MIC is defined
		?	Doubtful† design or methods OR mean and SD scores are presented for less than four subgroups OR no MIC is defined
		0	No information found

SDC = smallest detectable change; MIC = minimal important change; LOA = limits of agreement; ICC = intraclass correlation; RR = responsiveness ratio; AUC = area under the curve; SD = standard deviation.

*Ratings were either positive (+), indeterminate (?), negative (−) or no information was available (0).

†Either the study lacks a clear description of its design or methods, the sample size is under 50 subjects in each subgroup analysis or there are important methodological weaknesses in its design or execution.

Table adapted with permission from Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007; 60:34–42.11
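To make the decision logic of Table 1 concrete, the following minimal Python sketch applies a literal reading of two of its rows, internal consistency and reliability. It is illustrative only and is not the procedure used by the review authors: the function names, parameters and example inputs are hypothetical, the ASCII "-" stands in for the negative rating, and treating an inadequate sample size as "doubtful design" (rating "?") is an assumption made here.

```python
def rate_internal_consistency(alphas, n_items, sample_size, factor_analysis_done):
    """Return '+', '?', '-' or '0' following the internal consistency row of Table 1.

    alphas: Cronbach's alpha per dimension, or an empty list if none was reported.
    """
    if not alphas:
        return "0"  # no information found
    # Adequate sample: at least seven times the number of items AND more than 100.
    adequate_sample = sample_size >= 7 * n_items and sample_size > 100
    if not factor_analysis_done or not adequate_sample:
        return "?"  # assumption: an inadequate sample counts as doubtful design
    if all(0.70 <= a <= 0.95 for a in alphas):
        return "+"
    return "-"  # at least one alpha is <0.70 or >0.95 despite adequate methods


def rate_reliability(coefficient, adequate_design=True):
    """Rate reliability from an ICC or weighted kappa (None if not reported)."""
    if coefficient is None:
        return "0"
    if not adequate_design:
        return "?"  # e.g. time interval between administrations not mentioned
    return "+" if coefficient > 0.70 else "-"


# Hypothetical inputs for a 36-item scale administered to 400 respondents.
print(rate_internal_consistency([0.78, 0.85, 0.91], 36, 400, True))
print(rate_internal_consistency([0.62, 0.85], 36, 400, True))
print(rate_reliability(0.82))
```

Keeping each criterion in its own function mirrors the recommendation of Terwee et al. to report ratings per property rather than combining them into an overall score.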

The cross-cultural adaptation and translation of the scales were evaluated according to the five-step guidelines of Guillemin et al., namely: (1) translation, (2) back-translation, (3) committee review, (4) pre-testing and (5) re-examination of score weighting.12 In the first step, at least two qualified translators should translate the scale from the original language to the target language. In the second step, two independent translators should translate the translated version back into the language of the original version to ensure that the translation reflects the content of the original.12 The third step ideally involves a committee review to develop the penultimate version for pre-testing, and the fourth step pilots this version among 30–40 subjects from the target population. The final step should be the re-examination of the weighting of the scores in light of cultural context.12 Each study was assessed and given a score of either 1 (poor), 2 (moderate) or 3 (good) for each of these steps, with the overall score representing the mean of all scores obtained.
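The scoring scheme described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' own procedure: the function name and the example score lists are hypothetical, and the sketch assumes that steps which were not reported or not applicable (represented here as None) are left out of the mean instead of being counted as zero.

```python
STEPS = (
    "translation",
    "back-translation",
    "committee review",
    "pre-testing",
    "re-examination of score weighting",
)


def overall_adaptation_score(step_scores):
    """Mean of the scored steps (1 = poor, 2 = moderate, 3 = good).

    step_scores holds one entry per step in STEPS; None marks a step that was
    not reported or not applicable. Returns None if no step was scored.
    """
    if len(step_scores) != len(STEPS):
        raise ValueError("expected one entry per step")
    scored = [s for s in step_scores if s is not None]
    if any(s not in (1, 2, 3) for s in scored):
        raise ValueError("scores must be 1, 2 or 3")
    if not scored:
        return None
    return sum(scored) / len(scored)


# Hypothetical studies: one scoring all five steps, one reporting only two.
print(overall_adaptation_score([3, 3, 2, 3, 3]))           # 2.8
print(overall_adaptation_score([2, 1, None, None, None]))  # 1.5
```

Under this assumption, a study that reports a single well-executed step can obtain the same mean as a study that completes all five, so the number of scored steps is worth reporting alongside the mean.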

Results

STUDY CHARACTERISTICS

A total of 27 studies were included in the analysis, all of which were published between 1998 and 2019.9,13–38 The majority were conducted in the Middle Eastern and North African region, including Jordan (n = 7), Saudi Arabia (n = 4), Egypt (n = 2), Morocco (n = 2), Kuwait (n = 2), Tunisia (n = 2), Lebanon (n = 2), the United Arab Emirates (UAE; n = 2), Sudan (n = 1) and Qatar (n = 1).9,13,14,16–35,37,38 However, two studies were conducted in the Netherlands among samples of Moroccan Arabic-speaking subjects.15,36 The majority of the studies (n = 21) had translated the QOL scales into modern standard Arabic suitable for use among all Arabic-speaking populations.9,13,18–35,37 However, six had translated the scales into Arabic dialects suitable only for specific populations, including Moroccan Arabic (n = 3), Tunisian Arabic (n = 2) and Egyptian Arabic (n = 1).14–17,36,38 All of the studies utilised quantitative research methods, comprising 20 cross-sectional and seven longitudinal surveys. None used a mixed-method approach [Table 2].9,13–38

Table 2

Characteristics of studies involving quality of life scales translated and adapted for Arabic-speaking adults (N = 27)9,13–38

Author and year of publication	Country	Study design	Type of participants	Sample size	QOL scale	Language	Cronbach’s α coefficient
Huijer et al.28 (2013)	Lebanon	Cross-sectional	Mixed cancer patients	200	EORTC QLQ-C30	Standard Arabic	Overall: <0.70; range: 0.38–0.80
Awad et al.26 (2008)	UAE	Cross-sectional	Breast cancer patients	87	EORTC QLQ-C30	Standard Arabic	Overall: <0.70; range: 0.51–0.84
Alawadhi and Ohaeri35 (2010)	Kuwait	Cross-sectional	Breast cancer patients	348	EORTC QLQ-C30	Standard Arabic	Overall: 0.91; range: 0.51–0.84
Bener et al.27 (2017)	Qatar	Cross-sectional	Breast cancer patients	678	EORTC QLQ-C30	Standard Arabic	Overall: 0.91; range: 0.55–0.89
Alawneh et al.25 (2010)	Jordan	Cross-sectional	Mixed cancer patients	175	EORTC QLQ-C15-PAL	Standard Arabic	Overall: <0.70; range: 0.72–0.90
Lazenby et al.29 (2013)	Jordan	Cross-sectional	Mixed cancer patients	205	FACT-G	Standard Arabic	Range: 0.80–0.83
Zahran et al.30 (2017)	Egypt	Cross-sectional	Bladder cancer patients	90	FACT-G	Standard Arabic	Range: 0.80–0.94
Al Barmawi et al.24 (2018)	Jordan	Cross-sectional	Head and/or neck cancer patients	118	FACT-G	Standard Arabic	Overall: 0.76; range: 0.67–0.83
Soudy et al.23 (2018)	Saudi Arabia	Cross-sectional	Lymphoma patients who had undergone stem cell transplantation	108	FACT-G	Standard Arabic	Overall: 0.89; range: 0.67–0.88
Coons et al.9 (1998)	Saudi Arabia	Longitudinal	General population	415	SF-36	Standard Arabic	Range: 0.60–0.87
Sabbah et al.18 (2003)	Lebanon	Cross-sectional	General population	524	SF-36	Standard Arabic	Range: 0.70–0.90
Hoopman et al.36 (2009)	Netherlands	Longitudinal	General population	Subgroup of 377 Moroccan subjects	SF-36	Local dialect (Tarifit)	Range: 0.63–0.93
Hoopman et al.15 (2006)	Netherlands	Longitudinal	Mixed cancer patients	Subgroup of 79 Moroccan patients	SF-36	Local dialect (Tarifit)	Range: 0.65–0.94
Khoudri et al.37 (2007)	Morocco	Cross-sectional	Patients discharged from the ICU	145	SF-36	Standard Arabic	Overall: ≥0.70; range: 0.84–0.99
Guermazi et al.38 (2012)	Tunisia	Cross-sectional	General population	130	SF-36	Local dialect (Tunisian)	Overall: 0.94; range: 0.72–0.89
El-Kalla et al.17 (2016)	Egypt	Longitudinal	Patients with burn injuries	40	SF-36	Local dialect (Egyptian)	Overall: 0.8
Sheikh et al.31 (2015)	Saudi Arabia	Cross-sectional	Khat chewers	300	SF-36	Standard Arabic	Overall: 0.94; range: 0.72–0.90
Khader et al.22 (2011)	Jordan	Cross-sectional	General population	511	SF-36	Standard Arabic	Range: 0.71–0.90
Younsi and Chakroun16 (2014)	Tunisia	Cross-sectional	General population	3,582	SF-12	Local dialect (Tunisian)	Overall: 0.73
Aburuz et al.13 (2009)	Jordan	Cross-sectional	General population	186	EQ-5D	Standard Arabic	Overall: ≥0.75
Bekairy et al.32 (2018)	Saudi Arabia	Longitudinal	Mixed patients	80	EQ-5D	Standard Arabic	Overall: 0.72
Ohaeri and Awadalla21 (2009)	Kuwait	Longitudinal	General population	3,303	WHOQOL-BREF	Standard Arabic	Overall: 0.90; range: 0.69–0.83
Ohaeri et al.20 (2007)	Sudan	Cross-sectional	General population and psychiatric patients	623	WHOQOL-BREF	Standard Arabic	Overall: 0.88 (general population), 0.93 (psychiatric patients) and 0.92 (caregivers)
Bani-Issa33 (2011)	UAE	Cross-sectional	Diabetic patients	200	WHOQOL-BREF	Standard Arabic	Overall: 0.85; range: 0.89–0.91
Dalky et al.19 (2017)	Jordan	Cross-sectional	Family/caregivers of patients	266	WHOQOL-BREF	Standard Arabic	Overall: 0.92
Hoopman et al.14 (2008)	Morocco	Cross-sectional	Mixed cancer patients	Subgroup of 37 Moroccan patients	COOP/WONCA	Local dialect (Tarifit)	Not reported
Halabi34 (2006)	Jordan	Longitudinal	General population and hypertensive, diabetic, cancer and dialysis patients	35	QLI	Standard Arabic	Overall: 0.90

QOL = quality of life; EORTC QLQ = European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; C30 = Core Version 30; UAE = United Arab Emirates; C15-PAL = Core Version 15 Palliative; FACT-G = Functional Assessment of Cancer Therapy - General; SF-36 = 36-item Medical Outcomes Study Short-Form; ICU = intensive care unit; SF-12 = 12-item Medical Outcomes Study Short-Form; EQ-5D = EuroQOL Group Health Status Index 5-Dimensions; WHOQOL-BREF = World Health Organisation Quality of Life: Brief Version; COOP/WONCA = Dartmouth Cooperative Functional Health Assessment Charts/World Organisation of General Practice/Family Physicians; QLI = Quality of Life Index.

SCALE CHARACTERISTICS

Overall, the 27 articles included a total of seven self-reporting QOL scales that were translated and tested psychometrically in Arabic, including: (1) the 12- or 36-item Medical Outcomes Study Short-Form (SF-12 or SF-36); (2) the Dartmouth Cooperative Functional Health Assessment Charts/World Organisation of General Practice/Family Physicians (COOP/WONCA); (3) the World Health Organisation Quality of Life: Brief Version (WHOQOL-BREF); (4) the EuroQOL Group Health Status Index 5-Dimensions (EQ-5D); (5) the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire Core Versions 30 or 15 Palliative (QLQ-C30 or QLQ-C15-PAL); (6) the Functional Assessment of Cancer Therapy-General (FACT-G); and (7) the Quality of Life Index (QLI).

PSYCHOMETRIC PROPERTIES

The psychometric properties of the QOL scales are detailed in Table 3.9,13–38 None of the studies tested all nine psychometric criteria suggested by Terwee et al.11 In terms of content validity, 22 studies had a positive score.9,13–15,18–21,23–30,32,34–38 For the remaining five studies, no information was available.16,17,22,31,33 Internal consistency was generally high, with 26 studies scoring positively.9,13,15–38 Only one study did not report information regarding internal consistency.14 Criterion validity was tested in only two studies, with positive ratings for both.18,19 The remaining 25 studies did not provide any information concerning this psychometric property.9,13–17,20–38 Construct validity was assessed in 22 studies, of which 21 received positive ratings.9,13–15,18–32,37,38 Only one study was rated as indeterminate for this aspect.16

Table 3

Psychometric properties of scales in studies involving quality of life scales translated and adapted for Arabic-speaking adults (N = 27)9,13–38

Author and year of publication	QOL scale	Rating*
		Content validity	Internal consistency	Criterion validity	Construct validity	Reproducibility: agreement	Reproducibility: reliability	Responsiveness	Floor and ceiling effects	Interpretability
Huijer et al.28 (2013)	EORTC QLQ-C30	+	+	0	+	0	0	0	0	0
Awad et al.26 (2008)	EORTC QLQ-C30	+	+	0	+	0	0	0	0	0
Alawadhi and Ohaeri35 (2010)	EORTC QLQ-C30	+	+	0	0	0	0	0	0	0
Bener et al.27 (2017)	EORTC QLQ-C30	+	+	0	+	0	0	0	0	0
Alawneh et al.25 (2010)	EORTC QLQ-C15-PAL	+	+	0	+	0	+	0	0	0
Lazenby et al.29 (2013)	FACT-G	+	+	0	+	0	0	0	0	0
Zahran et al.30 (2017)	FACT-G	+	+	0	+	0	0	0	0	0
Al Barmawi et al.24 (2018)	FACT-G	+	+	0	+	0	0	0	0	0
Soudy et al.23 (2018)	FACT-G	+	+	0	+	0	0	0	0	0
Coons et al.9 (1998)	SF-36	+	+	0	+	0	+	0	0	0
Sabbah et al.18 (2003)	SF-36	+	+	+	+	0	0	0	+	0
Hoopman et al.36 (2009)	SF-36	+	+	0	0	0	0	0	+	0
Hoopman et al.15 (2006)	SF-36	+	+	0	+	0	0	+	+	0
Khoudri et al.37 (2007)	SF-36	+	+	0	+	0	+	0	0	0
Guermazi et al.38 (2012)	SF-36	+	+	0	+	+	+	0	0	0
El-Kalla et al.17 (2016)	SF-36	0	+	0	0	0	+	0	0	0
Sheikh et al.31 (2015)	SF-36	0	+	0	+	0	0	0	0	0
Khader et al.22 (2011)	SF-36	0	+	0	+	0	0	0	+	0
Younsi and Chakroun16 (2014)	SF-12	0	+	0	?	0	0	0	0	0
Aburuz et al.13 (2009)	EQ-5D	+	+	0	+	0	+	0	0	?
Bekairy et al.32 (2018)	EQ-5D	+	+	0	+	0	+	0	0	0
Ohaeri and Awadalla21 (2009)	WHOQOL-BREF	+	+	0	+	0	?	0	?	0
Ohaeri et al.20 (2007)	WHOQOL-BREF	+	+	0	+	0	?	0	0	0
Bani-Issa33 (2011)	WHOQOL-BREF	0	+	0	0	0	0	0	0	0
Dalky et al.19 (2017)	WHOQOL-BREF	+	+	+	+	0	0	0	0	0
Hoopman et al.14 (2008)	COOP/WONCA	+	0	0	+	0	0	?	?	0
Halabi34 (2006)	QLI	+	+	0	0	0	0	0	0	0

QOL = quality of life; EORTC QLQ = European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; C30 = Core Version 30; C15-PAL = Core Version 15 Palliative; FACT-G = Functional Assessment of Cancer Therapy - General; SF-36 = 36-item Medical Outcomes Study Short-Form; SF-12 = 12-item Medical Outcomes Study Short-Form; EQ-5D = EuroQOL Group Health Status Index 5-Dimensions; WHOQOL-BREF = World Health Organisation Quality of Life: Brief Version; COOP/WONCA = Dartmouth Cooperative Functional Health Assessment Charts/World Organisation of General Practice/Family Physicians; QLI = Quality of Life Index.

*Ratings were scored as either positive (+), indeterminate (?), negative (−) or no information available (0).

Information concerning agreement was present in only one study, which received a positive score.38 Reliability was investigated in nine studies, of which seven scored positively.9,13,17,25,32,36,37 The remaining two studies received indeterminate scores.20,21 Two studies provided information regarding responsiveness, with one positive and one indeterminate rating.14,15 Floor and ceiling effects were tested in six studies, with four scoring positively.15,18,22,36 The other two studies received indeterminate ratings.14,21 Only one study reported information concerning interpretability, receiving an indeterminate score.13

Overall, the SF-36 demonstrated the most robust psychometric properties, followed by the WHOQOL-BREF. The SF-36 was tested using eight psychometric criteria, with positive ratings for content validity, internal consistency, criterion validity, construct validity, agreement, reliability, responsiveness and floor and ceiling effects.9,15,17,18,22,31,36–38 Similarly, the WHOQOL-BREF received positive scores for content validity, internal consistency, criterion validity and construct validity, although both reliability and floor and ceiling effects were rated as indeterminate.19–21,33

TRANSLATION AND CULTURAL ADAPTATION

The processes of translation and cultural adaptation of the QOL scales are presented in Table 4.9,13–38 In total, 14 studies reported information regarding translation and cross-cultural adaptation processes.9,13–15,17,18,23–25,28–30,34,38 However, only two studies adopted all five steps recommended by Guillemin et al.9,12,23 A total of nine studies reported four of the steps, without providing any information regarding the re-evaluation of score weightings.13,14,17,18,28–30,34,38 One study presented a three-step technique (including translation, back-translation and pre-testing), while another reported only the first two steps.15,24 Finally, in one study, a single-step technique consisting solely of forward-translation was performed.25

Table 4

Cross-cultural adaptation and translation processes of scales in studies involving quality of life scales translated and adapted for Arabic-speaking adults (N = 27)9,13–38

Author and year of publication	QOL scale	Score*
		Translation	Back-translation	Committee approach	Pre-testing	Reassessment of score weighting	Overall mean score
Huijer et al.28 (2013)	EORTC QLQ-C30	3	3	3	3	N/A	3
Awad et al.26 (2008)	EORTC QLQ-C30	N/A	N/A	N/A	N/A	N/A	N/A
Alawadhi and Ohaeri35 (2010)	EORTC QLQ-C30	N/A	N/A	N/A	N/A	N/A	N/A
Bener et al.27 (2017)	EORTC QLQ-C30	N/A	N/A	N/A	N/A	N/A	N/A
Alawneh et al.25 (2010)	EORTC QLQ-C15-PAL	2	N/R	N/R	N/R	N/R	2
Lazenby et al.29 (2013)	FACT-G	3	3	3	3	N/A	3
Zahran et al.30 (2017)	FACT-G	3	3	3	3	N/A	3
Al Barmawi et al.24 (2018)	FACT-G	3	3	N/R	3	N/R	3
Soudy et al.23 (2018)	FACT-G	3	3	3	3	3	3
Coons et al.9 (1998)	SF-36	3	3	2	3	3	2.8
Sabbah et al.18 (2003)	SF-36	3	3	3	3	N/R	3
Hoopman et al.36 (2009)	SF-36	N/A	N/A	N/A	N/A	N/A	N/A
Hoopman et al.15 (2006)	SF-36	2	1	N/R	N/R	N/R	1.5
Khoudri et al.37 (2007)	SF-36	N/A	N/A	N/A	N/A	N/A	N/A
Guermazi et al.38 (2012)	SF-36	3	3	1	3	N/R	2.5
El-Kalla et al.17 (2016)	SF-36	3	3	3	3	N/R	3
Sheikh et al.31 (2015)	SF-36	N/A	N/A	N/A	N/A	N/A	N/A
Khader et al.22 (2011)	SF-36	N/A	N/A	N/A	N/A	N/A	N/A
Younsi and Chakroun16 (2014)	SF-12	N/A	N/A	N/A	N/A	N/A	N/A
Aburuz et al.13 (2009)	EQ-5D	3	2	2	3	N/A	2.5
Bekairy et al.32 (2018)	EQ-5D	N/A	N/A	N/A	N/A	N/A	N/A
Ohaeri and Awadalla21 (2009)	WHOQOL-BREF	N/A	N/A	N/A	N/A	N/A	N/A
Ohaeri et al.20 (2007)	WHOQOL-BREF	N/A	N/A	N/A	N/A	N/A	N/A
Bani-Issa33 (2011)	WHOQOL-BREF	N/A	N/A	N/A	N/A	N/A	N/A
Dalky et al.19 (2017)	WHOQOL-BREF	N/A	N/A	N/A	N/A	N/A	N/A
Hoopman et al.14 (2008)	COOP/WONCA	2	2	3	3	N/R	2.5
Halabi34 (2006)	QLI	3	3	3	3	N/R	3

QOL = quality of life; EORTC QLQ = European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; C30 = Core Version 30; N/A = not applicable; C15-PAL = Core Version 15 Palliative; N/R = not reported; FACT-G = Functional Assessment of Cancer Therapy - General; SF-36 = 36-item Medical Outcomes Study Short-Form; SF-12 = 12-item Medical Outcomes Study Short-Form; EQ-5D = EuroQOL Group Health Status Index 5-Dimensions; WHOQOL-BREF = World Health Organisation Quality of Life: Brief Version; COOP/WONCA = Dartmouth Cooperative Functional Health Assessment Charts/World Organisation of General Practice/Family Physicians; QLI = Quality of Life Index.

*Each step of the process was scored as either good (3), moderate (2) or poor (1).

The EORTC-QLQ-C30, SF-36, FACT-G and QLI scales received overall mean scores of 3 with regard to translation and cultural adaptation processes.17,18,23,24,28–30,34 In addition, the overall mean score of the COOP/WONCA and EQ-5D scales was 2.5.13,14 However, there was no information regarding translation or cross-cultural adaptation for any of the studies seeking to validate the WHOQOL-BREF scale.19–21,33

INDIVIDUAL QUALITY OF LIFE SCALES

European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire

The EORTC has developed several scales to assess the QOL of cancer patients.39–42 The EORTC-QLQ-C30 consists of nine multi-item scales: five functional subscales (assessing physical, role, cognitive, emotional and social functioning), three symptom subscales (assessing fatigue, pain and nausea/vomiting) and a global health status and QOL scale.39 In addition, six items assessing other common symptoms of cancer are included (dyspnoea, insomnia, appetite loss, constipation, diarrhoea and financial difficulties). The first 28 items of the scale are scored on a 4-point Likert scale, with scores ranging from 1 (not at all) to 4 (very much).39 The remaining two items are assessed on a 7-point numeric scale. The original version of the EORTC-QLQ-C30 scored a Cronbach’s α coefficient of ≥0.70.39 Overall, four studies sought to translate and validate the EORTC-QLQ-C30 for use in Arabic-speaking adults.26–28,35 All of the studies translated the scale into modern standard Arabic and were cross-sectional in nature. The combined sample of these studies consisted of 1,313 cancer patients.26–28,35 Generally, the scale showed satisfactory psychometric properties consistent with its purpose for use among Arabic-speaking cancer patients. In terms of internal consistency, the coefficient alpha of the Arabic versions was >0.70, in line with that of the original version.26–28,35

In the EORTC-QLQ-C15-PAL scale, the 30-item QLQ core version is reduced to 15 items for the purposes of addressing QOL in palliative care.39,43 The EORTC-QLQ-C15-PAL includes three multi-item scales, functional subscales (assessing physical and emotional functioning), symptom subscales (assessing fatigue and pain) and a global health status and QOL scale.43 The first 14 items of the scale are scored on a 4-point Likert scale, with scores ranging from 1 (not at all) to 4 (very much). The final item is assessed on a 7-point numeric scale.43 Alawneh et al. investigated the validity and reliability of a standard Arabic version of the EORTC-QLQ-C15-PAL scale among 175 Jordanian mixed cancer patients, with a coefficient alpha of >0.70.25

Functional Assessment of Cancer Therapy – General

The FACT-G scale consists of 27 items assessed on a 5-point Likert scale and was originally validated for a population of mixed cancer patients.44 The scale assesses three QOL dimensions (physical well-being, social/family well-being and functional well-being) using seven items each and a fourth (emotional well-being) using six items. The scale also has specific items that can be added to the general scale for specific types of cancer.45,46 In addition to cancer, the FACT-G scale has been used and validated for use among patients with other chronic illnesses as well as the general population.47–49 Four studies were conducted to evaluate the psychometric properties of Arabic versions of the FACT-G in a total of 521 subjects, with each study involving a different type of cancer patient (mixed, lymphoma, bladder and head and/or neck cancer).23,24,29,30 In addition, one study assessed the FACT-G in conjunction with a spiritual subscale.29 All four studies used versions translated into modern standard Arabic. The internal consistency of the Arabic FACT-G scales yielded almost the same results as that of the original scale (coefficient alpha: 0.76–0.89).23,24,29,30,44 However, none of the studies assessed reproducibility properties such as test-retest reliability or agreement.23,24,29,30

Medical Outcomes Study Short-Form

The SF-36 scale is a 36-item multi-purpose health survey consisting of eight multi-item subscales (assessing physical functioning, emotional problems, physical problems, mental health, bodily pain, general health, social functioning and vitality) and one single-item subscale (assessing health transition).50 The total score ranges from 0 to 100, with higher scores indicating a better QOL. The reliability of the original SF-36 scale was high, with an intraclass correlation coefficient (ICC) of >0.8.50 Nine studies evaluated the psychometric properties of Arabic versions of the SF-36 scale.9,15,17,18,22,31,36–38 The scale was tested on multiple populations, including the general population (n = 4), burn victims (n = 1), cancer patients (n = 1), patients admitted to an intensive care unit (n = 1) and khat chewers (n = 1), with a total sample size of 2,521.9,15,17,18,22,31,36–38 Four studies translated the SF-36 scale into three different dialects of Arabic, including Moroccan Tarifit (n = 2), Tunisian (n = 1) and Egyptian (n = 1).15,17,36,38 The other five studies translated the scale into standard Arabic.9,18,22,31,37 With regard to internal consistency, the coefficient alpha of the Arabic SF-36 scale ranged from 0.70 to 0.94.9,15,17,18,22,31,36–38 Test-retest reliability was assessed in four studies, with the ICC exceeding 0.70.9,15,17,37,38

The SF-12 scale is a shorter 12-item version of the SF-36 scale and assesses the same eight health domains as the original.51 Younsi and Chakroun tested the SF-12 scale among 3,582 members of the general population.16 The scale was translated into the Tunisian Arabic dialect, with a coefficient alpha of 0.73.16
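
The test-retest ICC values reported above for the SF-36 can be illustrated with a short sketch. The review does not state which form of the ICC the included studies used, so the example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function name `icc_2_1` and the scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for a list of rows: one row per subject, one column per
    measurement occasion (for example, test and retest)."""
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of occasions
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two occasions")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares from a two-way ANOVA without replication.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_occasions = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_occasions
    ms_subjects = ss_subjects / (n - 1)
    ms_occasions = ss_occasions / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_occasions - ms_error) / n
    )


# Hypothetical 0-100 scores for three respondents at test and retest.
# Every retest score is two points higher, a systematic shift that the
# absolute-agreement form penalises.
shifted_scores = [[10, 12], [20, 22], [30, 32]]
print(round(icc_2_1(shifted_scores), 4))
```

A consistency-type ICC would ignore the systematic shift in this example, which is one reason why reports of test-retest reliability should specify the form used.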

EuroQOL Group Health Status Index 5-Dimensions

The EQ-5D is a 5-item scale assessing five dimensions of QOL (mobility, self-care, usual activities, pain/discomfort and anxiety/depression).52 Each item has three possible responses: no problems, some/moderate problems and extreme problems. In addition, health states are measured using a visual analogue scale ranging from 0 to 100.52 Two studies evaluated Arabic versions of the EQ-5D.13,32 Both studies translated the scale into modern standard Arabic. Aburuz et al. investigated the validity and reliability of the EQ-5D in a sample of 186 members of the general population in Jordan.13 In contrast, Bekairy et al. assessed its use among 80 mixed patients in Saudi Arabia.32 Both Arabic EQ-5D scales were deemed valid and reliable, with coefficient alphas of ≥0.72. In terms of test-retest reliability, Aburuz et al. and Bekairy et al. reported Cohen’s kappa values of 0.48–1.00 and 0.53–1.00, respectively.13,32
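
Because each EQ-5D item has three ordered response levels, test-retest agreement is commonly summarised per item with Cohen's kappa, as in the values reported above. The following is a minimal sketch of the unweighted kappa. The function name `cohens_kappa` and the response data are hypothetical, and the review does not state whether the included studies used weighted or unweighted kappa.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of categorical
    ratings (for example, test and retest responses to one EQ-5D item)."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal frequencies.
    counts_first = Counter(first)
    counts_second = Counter(second)
    categories = set(counts_first) | set(counts_second)
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical responses of ten people to one item on two occasions
# (1 = no problems, 2 = some/moderate problems, 3 = extreme problems).
test_responses = [1, 1, 2, 2, 3, 1, 2, 1, 1, 2]
retest_responses = [1, 1, 2, 3, 3, 1, 2, 1, 2, 2]
print(round(cohens_kappa(test_responses, retest_responses), 3))
```

In this example the observed agreement is 0.80, but kappa is lower because part of that agreement would be expected by chance alone.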

World Health Organisation Quality of Life: Brief Version

The WHOQOL-BREF scale is a 26-item questionnaire scored on a 5-point Likert scale which was originally validated to measure QOL among people with diseases in the general population.53 The scale represents an abbreviated version of the much longer 100-item WHOQOL assessment.54 The WHOQOL-BREF has four subscales (assessing physical health, psychological health, social relationships and environmental health) and two overall QOL and general health items. In terms of internal consistency, the coefficient alpha of the original WHOQOL-BREF scale was 0.66–0.84.53 Four studies sought to validate standard Arabic versions of the WHOQOL-BREF in different countries, including Kuwait, Sudan, the United Arab Emirates and Jordan.19–21,33 The total sample size for all four studies was 4,392, including psychiatric and diabetic patients, family members and caregivers of patients and members of the general population. The coefficient alpha of the Arabic WHOQOL-BREF scales ranged from 0.69 to 0.93, indicating acceptable internal consistency. In one study, the test-retest reliability of the scale was high (ICC = 0.95).21

Dartmouth Cooperative Functional Health Assessment Charts/World Organisation of General Practice/Family Physicians

The COOP/WONCA scale contains six items assessed using a 5-point Likert scale and covering core QOL functional domains (physical fitness, feelings, daily activities, social activities, changes in health and overall health).55 The COOP/WONCA scale was culturally adapted and translated into Arabic in only one study.14 Hoopman et al. assessed the use of the scale on 37 mixed cancer patients when translated into Tarifit, a local dialect of Arabic spoken in Morocco. The scale was found to have adequate content and construct validity, but its discriminant validity could only be partially confirmed.14

Quality of Life Index

The QLI scale consists of 70 items scored on a 6-point Likert scale assessing health and functioning and socioeconomic, psychological/spiritual and family-related aspects of QOL.56 The scale was designed to assess the QOL of both healthy and ill individuals. The original scale has been validated in many different diseases.57–59 In terms of internal consistency, the coefficient alpha of the original QLI scale was 0.73–0.99.60 Only one study culturally adapted, translated and tested the QLI in modern standard Arabic.34 The study involved 35 subjects, including both healthy individuals as well as hypertensive, diabetic, cancer and dialysis patients. The reliability of the scale was adequate, with an ICC of 0.88–0.97.34

Discussion

This review identified 27 studies assessing seven QOL scales translated and tested for validity and reliability in Arabic-speaking adults.9,13–38 None of the scales were originally developed in Arabic, with the majority initially developed for use in English. All of the QOL scales were consistent in that they assessed both physical and psychological aspects as well as other important components of QOL. Nevertheless, in order to fully understand QOL in Arabic-speaking populations, there is a need for QOL scales to be properly translated and culturally adapted for use in these populations.

All of the studies included in this review utilised quantitative research methods, with 20 cross-sectional and seven longitudinal surveys.9,13–38 In cross-sectional studies, data are collected at a single point in time without further follow-up, while in longitudinal studies data are collected over different periods of time.61 Generally, longitudinal studies give more precise information regarding temporal changes or treatment effects that can have an important impact on the QOL of patients. In contrast, researchers conducting cross-sectional analyses will have more difficulty creating a cohesive narrative regarding the impact of medical treatment, interventions or other variables on QOL.61

None of the studies in this review evaluated all of the psychometric criteria suggested by Terwee et al.11 As such, further psychometric studies are required to improve the validity and reliability of Arabic versions of QOL scales. For instance, only seven studies positively assessed test-retest reliability and only one reported test-retest agreement.9,13,17,25,32,37,38 In terms of specific scales, no reliability properties were reported for the EORTC-QLQ-C30, FACT-G, WHOQOL-BREF, COOP/WONCA or QLI scales. Accordingly, the reliability and validity of these QOL scales should be evaluated prior to their use in Arabic-speaking populations.
Selection of an appropriate QOL scale depends on a number of different factors, including the demographic and clinical characteristics of the sample, the psychometric properties of the scale and the number of items included in the scale. Most importantly, researchers need to consider the various aspects and domains of QOL that require evaluation in their population of interest. For example, if the sample consists of cancer patients, the FACT-G or EORTC-QLQ-C30 scales would be most appropriate as both can be used in multiple types of cancer.39–42,44–46 In addition, both scales have been validated among Arab cancer patients, as well as members of the general population speaking other languages.23,24,26–30,35,49,62,63 However, it should be noted that the EORTC-QLQ-C30 scale does not address either spiritual or existential components of QOL.39 On the other hand, if the sample consists of a general Arabic-speaking population, the SF-36 might be a better choice in order to provide more generic QOL-related information.50 While the studies in this review included a variety of populations, the general population was most frequently studied, perhaps because this choice provides a larger sample size, thus improving the psychometric evaluation. With regard to the COOP/WONCA scale, this scale was validated by only one study involving a small sample size (N = 37) and translated into a local dialect.14 Further examination of the psychometric properties of a modern standard Arabic version of this scale is therefore required before it can be recommended for use among other Arabic-speaking adults.

The seven QOL scales identified in this review are of varying lengths, consisting of between 5 and 70 items.
Overall, six of the scales contain 36 or fewer items and could therefore be administered in 5–10 minutes.39,44,50,52,53,55 The QLI is much longer with a total of 70 items, although the administration time is reported as being approximately 10 minutes.56 Unlike other QOL scales, the QLI weights satisfaction in a particular domain of QOL by its importance, so that items with high satisfaction and importance scores receive the highest score. Nevertheless, the length of this questionnaire could present an obstacle when conducting research in a clinical setting, as the inclusion of more items in a survey tends to discourage high response rates.64,65

Rigorous cross-cultural adaptation and translation processes support the credibility of an adapted scale by ensuring that the content of the translated scale is equivalent to that of the original. Adhering to a systematic and standardised approach, such as that suggested by Guillemin et al., produces cultural equivalence and maximises acceptability of the linguistic structure of the translated scale.12 Unfortunately, only two studies included in this review followed all five recommended steps.9,23 These processes were only partially reported, or not reported at all, by the remaining 25 studies.13–22,24–38 This may be because the scales had already been translated into Arabic in earlier studies. Nevertheless, it is recommended that researchers identify and report detailed information regarding each stage of the cross-cultural adaptation process when translating and adapting QOL scales for use in Arabic-speaking populations.

This review was subject to several limitations. The focus of the analysis was primarily on the psychometric properties of QOL scales; as such, further research is necessary to evaluate the quality of the design and methodologies of the reported studies. In addition, a single researcher undertook the screening and assessed the eligibility of the articles included in the analysis. This may have increased the risk of bias or resulted in possible errors during the data collection process. Finally, although a systematic search of multiple electronic databases was conducted using various search terms in different combinations, it is possible that some relevant studies were unintentionally overlooked and not included in the analysis.

Conclusion

This review evaluated the psychometric properties and cultural adaptation and translation processes of Arabic versions of QOL scales. In general, the studies provided insufficient information regarding the exact processes of translation and cultural adaptation. Additionally, while most studies provided sufficient information regarding the content and construct validity and internal consistency of the scales, information related to agreement, responsiveness, floor and ceiling effects and interpretability was lacking. Specifically, the test-retest reliability, criterion validity and sensitivity of Arabic QOL scales require further validation. Future research involving the translation and cultural adaptation of QOL scales should utilise recommended guidelines to ensure the content of the translated scale is equivalent to that of the original.

58 in total

1. Development of the World Health Organization WHOQOL-BREF quality of life assessment. The WHOQOL Group.

Authors:
Journal: Psychol Med Date: 1998-05 Impact factor: 7.723

2. The World Health Organization Quality of Life Assessment (WHOQOL): development and general psychometric properties.

Authors:
Journal: Soc Sci Med Date: 1998-06 Impact factor: 4.634

3. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement.

Authors: David Moher; Alessandro Liberati; Jennifer Tetzlaff; Douglas G Altman
Journal: Int J Surg Date: 2010-02-18 Impact factor: 6.071

4. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity.

Authors: J Ware; M Kosinski; S D Keller
Journal: Med Care Date: 1996-03 Impact factor: 2.983

5. Validation of the functional assessment of multiple sclerosis quality of life instrument.

Authors: D F Cella; K Dineen; B Arnason; A Reder; K A Webster; G Karabatsos; C Chang; S Lloyd; J Steward; D Stefoski
Journal: Neurology Date: 1996-07 Impact factor: 9.910

6. The development of the EORTC QLQ-C15-PAL: a shortened questionnaire for cancer patients in palliative care.

Authors: Mogens Groenvold; Morten Aa Petersen; Neil K Aaronson; Juan I Arraras; Jane M Blazeby; Andrew Bottomley; Peter M Fayers; Alexander de Graeff; Eva Hammerlid; Stein Kaasa; Mirjam A G Sprangers; Jakob B Bjorner
Journal: Eur J Cancer Date: 2005-09-12 Impact factor: 9.162

7. Validity and reliability of the European Organization for Research and Treatment in Cancer Quality of Life Questionnaire (EORTC QLQ): experience from Kuwait using a sample of women with breast cancer.

Authors: Shafika A Alawadhi; Jude U Ohaeri
Journal: Ann Saudi Med Date: 2010 Sep-Oct Impact factor: 1.526

8. Validation of the Arabic version of the EORTC quality of life questionnaire among cancer patients in Lebanon.

Authors: Huda Abu-Saad Huijer; Knar Sagherian; Hani Tamim
Journal: Qual Life Res Date: 2012-09-08 Impact factor: 4.147

9. Confirmatory factor analytical study of the WHOQOL-Bref: experience with Sudanese general population and psychiatric samples.

Authors: Jude U Ohaeri; Abdel W Awadalla; Abdul-Hamid M El-Abassi; Anila Jacob
Journal: BMC Med Res Methodol Date: 2007-08-01 Impact factor: 4.615

10. Is Health Related Quality of Life (HRQoL) a valid indicator for health systems evaluation?

Authors: Martin Romero; David Vivas-Consuelo; Nelson Alvis-Guzman
Journal: Springerplus Date: 2013-12-11