Melanie Hawkins, Gerald R Elsworth, Richard H Osborne.
Abstract
BACKGROUND: Data from subjective patient-reported outcome measures (PROMs) are now being used in the health sector to make or support decisions about individuals, groups and populations. Contemporary validity theorists define validity not as a statistical property of the test but as the extent to which empirical evidence supports the interpretation of test scores for an intended use. However, validity testing theory and methodology are rarely evident in the PROM validation literature. Application of this theory and methodology would provide structure for comprehensive validation planning to support improved PROM development and sound arguments for the validity of PROM score interpretation and use in each new context.
Keywords: Health Literacy Questionnaire (HLQ); Health literacy; IUA; Interpretation/use argument; Interpretive argument; PROM; Patient-reported outcome measure; Qualitative methods; Validation; Validity; Validity argument
Year: 2018 PMID: 29464456 PMCID: PMC5997725 DOI: 10.1007/s11136-018-1815-6
Source DB: PubMed Journal: Qual Life Res ISSN: 0962-9343 Impact factor: 4.147
Fig. 1 Flow chart of the application of validity testing theory and methodology to assess the validity of patient-reported outcome measure (PROM) score interpretation and use in a new context
Fig. 2 Community Healthcare Centre Vignette: a community healthcare centre wishes to use the HLQ as a community needs assessment for a minority language group
Evaluating validity evidence for an interpretive argument for a translated patient-reported outcome measure (PROM)
| Components of the interpretive argument and assumptions | Evidence required for a validity argument | Examples of methods to obtain validity data, including relevant studies on the Health Literacy Questionnaire (HLQ) |
|---|---|---|
| 1.1 The content of the source language constructs and items is appropriate for the target culture | Evidence that the PROM constructs are appropriate and relevant for members of the target culture and that item content will be understood as intended by members of the target culture | Pre-translation qualitative evaluation by an in-context PROM user of the cultural appropriateness of the items and constructs for the target language and culture. For example, an ethnographer who lived with Roma populations assessed each of the HLQ scales according to how Roma people might understand them [
| 1.2 Application of a systematic translation protocol that provides evidence that linguistic equivalence and cultural appropriateness are highly likely to be achieved, thus supporting the argument for the successful transfer of the intended meaning of the source language constructs while maximising understanding of items, response options and administration methods in the target language and culture | Evidence that a structured translation method, with detailed descriptions of the item intents, was appropriately implemented | Formal and documented translation method and process to manage translations to different languages, including documentation of participants in translation consensus meetings. A developer or other person deeply familiar with the PROM’s content and purpose oversees the concordance between the intents of original items and target language items. For example, two papers that describe the translation of the HLQ to other languages (Slovak [ |
| 2.1 Respondents to both the translated PROM and source language PROM engage in the same or similar cognitive (response) processes when responding to items, and these processes align with the source language construct criteria, thus indicating that similar respondents across cultures are formulating responses to the same items in the same way | Evidence that linguistic equivalence and cultural appropriateness have been achieved in the PROM translation | Analysis of the documented translation process to determine how difficulties in translation were resolved such that each translated item retained the intent of its corresponding source language item, while accommodating linguistic and cultural nuances. For example, in the German HLQ publication, the translation method is described, as well as linguistic and cultural adaptation difficulties encountered and how these were resolved [
| 2.2 PROM users (e.g. health professionals or researchers who administer the PROM and interpret the PROM scores) of both the translated and source language PROMs engage in the same or similar cognitive processes (i.e. apply source language construct-relevant criteria) when interpreting scores, and that these processes match the intended interpretation of scores | Evidence that the cognitive processes of PROM users when evaluating respondents’ scores from a translated PROM are consistent with the source language construct criteria and with the interpretation of scores as intended by developers of the source language PROM | In-depth cognitive interviews with target PROM users, and content analysis to compare narrative data from interviews with source language PROM item intents and construct definitions, and to compare narrative data with the data from cognitive interviews with source language PROM users. For example, the Hawkins et al. study [ |
| 3.1 Item interrelationships and measurement structure of scales of the translated PROM conform to the constructs of the source language PROM | Evidence that the translated PROM scales are homogeneous and distinct and thus items are uniquely related to the hypothesised target constructs | Confirmatory factor analysis (CFA) of data in the target language culture and comparisons with CFAs from data in the source language culture |
| 4.1 Convergent-discriminant validity is established for the translated PROM | Evidence that the relationships between the translated PROM and similar constructs in other tools are substantial and congruent with patterns observed in the source PROM (i.e. convergent evidence) such that score interpretation of the translated PROM is consistent with the score interpretation of the source language PROM and other PROMs measuring similar constructs | Use of CFA to examine Fornell and Larcker’s [
| 4.2 Test–criterion relationships are robust for translated PROMs | Evidence that test–criterion relationships are concordant with expectations from theory to provide general support for construct meaning and information to support decisions about score interpretation and use for specific population groups and purposes | Correlation and group differences, e.g. analysis of variance of translated PROM summed scores by sub-groups (such as gender, age and education) |
| 4.3 Validity generalisation is established for a PROM that is translated across two or more cultures | Evidence of validity generalisation information to support valid score interpretation and use of translated PROMs in other cultures similar to those already studied. Validity generalisation relates the PROM constructs within a nomological net [ | Systematic review of results of validity studies of translated PROM scales across the five categories of validity evidence in the |
| 5.1 PROM users (e.g. health professionals, researchers) interpret and use respondents’ scores from a translated PROM as intended by the developers of the source language PROM and for the intended benefit | Evidence that the intended benefit from testing with the translated PROM has been realised | In-depth interviews with users of a translated PROM to assess the outcomes that arose from testing with the translated PROM (i.e. predicted or actual actions taken from score interpretation and use) and if these align with the intended benefits, as stipulated by the developers of the source language PROM. For example, the OPtimising HEalth LIteracy and Access (Ophelia) process [ |
| 5.2 Claims for benefits of testing that are not based directly on the developers’ intended score interpretations and uses | Evidence to determine if there are potential testing benefits that go beyond the intended interpretation and use of the translated PROM scores. For example, the HLQ was not designed to measure the broad concept of patient experience. However, data from the HLQ could be used for this purpose because the constructs and items include information about this concept. Consequently, a hospital that sought to measure patients’ health literacy might also make claims about patients’ hospital experiences | A companion or follow-up study or a critical review by an external evaluator could identify and evaluate benefits that are directly based on intended score interpretation and use (as based on the source language PROM) and benefits that are based on grounds other than intended score interpretation and use. For example, a companion study could consist of co-administering the HLQ with specific patient experience questionnaires, auditing patient complaints records, and undertaking in-depth interviews with patients and hospital staff to determine the validity of HLQ score interpretation for measuring patient experiences |
| 5.3 Awareness of and mitigation of unintended consequences of testing due to construct underrepresentation and/or construct irrelevance to prevent inappropriate decisions or claims about an individual or group, i.e. to take action/intervention when not warranted, or to take no action/intervention when an action is warranted, or to falsely claim an action/intervention is a success or a failure | Evidence of sound translation method to help minimise unintended consequences related to errors in score interpretation for a given use that are due to poor equivalence between the source language and translated PROM constructs | Collection and analysis of translation process data that verify that a structured translation method is implemented such that congruence between source language and translated PROM constructs, and potential construct underrepresentation and/or construct irrelevance, is continually addressed. For example, an as yet unpublished study has been conducted to analyse field notes from translations of the HLQ into nine languages to determine aspects of the translation method that improve congruence of item intent between the source and translated items and thus constructs |
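
Row 4.2 of the table names analysis of variance of summed scores by sub-group as one way to examine group differences. The sketch below computes a one-way ANOVA F statistic in plain Python to illustrate that idea. The function name `one_way_anova_f`, the education sub-groups and all scores are hypothetical illustrations, not data or methods taken from the HLQ studies.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a sub-group label to a list of summed scale scores.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of sub-groups
    n = len(all_scores)    # total number of respondents

    # Variation of the sub-group means around the grand mean.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Variation of individual scores around their own sub-group mean.
    ss_within = sum(
        (x - mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical summed scores on one translated scale, by education level.
scores_by_education = {
    "primary": [8, 10, 12],
    "secondary": [12, 14, 16],
    "tertiary": [16, 18, 20],
}

f_stat, df_b, df_w = one_way_anova_f(scores_by_education)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with the reported degrees of freedom would suggest that mean scores differ across sub-groups. Whether such a difference supports the validity argument depends on whether it matches what theory predicts for those groups.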
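
Row 4.1 of the table refers to using CFA results with Fornell and Larcker's approach, although the source text is cut off at that point. As a hedged illustration, the sketch below assumes the commonly used form of the Fornell-Larcker criterion: a scale's average variance extracted (AVE, the mean of its squared standardised loadings) should exceed its squared correlation with every other scale. The scale names, loadings and correlations are hypothetical and would in practice come from a fitted CFA model.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardised factor loadings for one scale."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


def fornell_larcker_check(loadings_by_scale, correlations):
    """Check each pair of scales against the Fornell-Larcker criterion.

    `loadings_by_scale` maps scale name -> standardised loadings.
    `correlations` maps (scale_a, scale_b) -> inter-factor correlation.
    Returns (ave_by_scale, passed_by_pair).
    """
    ave = {
        scale: average_variance_extracted(loadings)
        for scale, loadings in loadings_by_scale.items()
    }
    passed = {}
    for (a, b), r in correlations.items():
        shared_variance = r ** 2
        # Both scales must explain more variance in their own items
        # than they share with each other.
        passed[(a, b)] = ave[a] > shared_variance and ave[b] > shared_variance
    return ave, passed


# Hypothetical standardised loadings and inter-factor correlations.
loadings_by_scale = {
    "scale_a": [0.8, 0.7, 0.9],
    "scale_b": [0.6, 0.7, 0.5],
    "scale_c": [0.8, 0.8, 0.8],
}
correlations = {
    ("scale_a", "scale_b"): 0.7,
    ("scale_a", "scale_c"): 0.5,
}

ave, passed = fornell_larcker_check(loadings_by_scale, correlations)
for pair, ok in passed.items():
    print(pair, "distinct" if ok else "not clearly distinct")
```

In this made-up example the pair `scale_a`/`scale_b` fails because the AVE of `scale_b` is lower than the variance the two scales share, which is the kind of pattern that would prompt a closer look at whether the translated scales are distinct.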