Christian J Apfelbacher, Daniel Heinl, Cecilia A C Prinsen, Stefanie Deckert, Joanne Chalmers, Robert Ofenloch, Rosemary Humphreys, Tracey Sach, Sarah Chamlin, Jochen Schmitt.
Abstract
BACKGROUND: Eczema is a common chronic or chronically relapsing skin disease that has a substantial impact on quality of life (QoL). Through a consensus-based process, the Harmonising Outcome Measures in Eczema (HOME) initiative has identified QoL as one of the four core outcome domains to be assessed in all eczema trials (Allergy 67(9):1111-7, 2012). Various instruments exist to measure QoL in adults with eczema, but they vary greatly in both content and quality (for example, reliability and validity), and it is not always clear whether the best instrument is being used. Therefore, the aim of the proposed research is a comprehensive systematic assessment of the measurement properties of the existing instruments that were developed and/or validated for the measurement of patient-reported QoL in adults with eczema.
Year: 2015 PMID: 25927828 PMCID: PMC4403900 DOI: 10.1186/s13643-015-0041-3
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
Inclusion and exclusion criteria
| | Inclusion criteria | Exclusion criteria |
|---|---|---|
| Population | Eczema (synonyms: atopic eczema, atopic dermatitis, neurodermatitis) | Populations with skin diseases other than eczema, populations of children with eczema, and populations of adolescents with eczema |
| Study design | Development study, validation study | Linguistic validation studies |
| Outcome | Quality of life, health-related quality of life | Signs, disease severity measure, disease control measure, biomarker, and physiology of the skin |
| Type of measurement instrument | Self-reported measurement instrument | All others |
| Publication type | Articles with available full text | Abstracts |
Definitions of domains, measurement properties, and aspects of measurement properties
| Domain | Measurement property | Aspect of a measurement property | Definition |
|---|---|---|---|
| Reliability | | | The degree to which the measurement is free from measurement error |
| Reliability (extended definition) | | | The extent to which scores for patients who have not changed are the same for repeated measurement under several conditions: for example, using different sets of items from the same HR-PRO (internal consistency); over time (test-retest); by different persons on the same occasion (inter-rater); or by the same persons (that is, raters or responders) on different occasions (intra-rater) |
| | Internal consistency | | The degree of interrelatedness among the items |
| | Reliability | | The proportion of the total variance in the measurements which is due to ‘true’ᵃ differences among patients |
| | Measurement error | | The systematic and random error of a patient’s score that is not attributed to true changes in the construct to be measured |
| Validity | | | The degree to which an HR-PRO instrument measures the construct(s) it purports to measure |
| | Content validity | | The degree to which the content of an HR-PRO instrument is an adequate reflection of the construct to be measured |
| | | Face validity | The degree to which (the items of) an HR-PRO instrument indeed look as though they are an adequate reflection of the construct to be measured |
| | Construct validity | | The degree to which the scores of an HR-PRO instrument are consistent with hypotheses (for instance with regard to internal relationships, relationships to scores of other instruments, or differences between relevant groups) based on the assumption that the HR-PRO instrument validly measures the construct to be measured |
| | | Structural validity | The degree to which the scores of an HR-PRO instrument are an adequate reflection of the dimensionality of the construct to be measured |
| | | Hypothesis testing | Same as construct validity |
| | | Cross-cultural validity | The degree to which the performance of the items on a translated or culturally adapted HR-PRO instrument is an adequate reflection of the performance of the items of the original version of the HR-PRO instrument |
| Responsiveness | | | The ability of an HR-PRO instrument to detect change over time in the construct to be measured |
| | Responsiveness | | Same as the responsiveness domain |
| Interpretabilityᵇ | | | The degree to which one can assign qualitative meaning (that is, clinical or commonly understood connotations) to an instrument’s quantitative scores or changes in scores |
Abbreviations: HR-PRO, health-related patient-reported outcome; CTT, classical test theory. ᵃ The word ‘true’ must be seen in the context of CTT, which states that any observation is composed of two components: a true score and the error associated with the observation. The ‘true’ score is the average score that would be obtained if the scale were given an infinite number of times. It refers only to the consistency of the score and not to its accuracy [14]. ᵇ Interpretability is not considered a measurement property but an important characteristic of a measurement instrument.
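The CTT reading of reliability in footnote a can be written out as a short derivation. This is a sketch in conventional notation: the symbols X, T, and E and the variance terms do not appear in the source and are introduced here purely for illustration.

```latex
% Observed score X is the sum of a true score T and an error term E;
% CTT assumes the error is uncorrelated with the true score.
X = T + E, \qquad \operatorname{Cov}(T, E) = 0

% The observed variance therefore splits into true and error variance.
\sigma_X^2 = \sigma_T^2 + \sigma_E^2

% Reliability: the proportion of total variance due to true differences.
\text{Reliability} = \frac{\sigma_T^2}{\sigma_X^2}
                   = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```

Read this way, a reliability coefficient of 0.70 would mean that 70% of the observed score variance reflects true differences among patients.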
Quality criteria for measurement properties adapted from [11] and [15]
| Measurement property | Rating | Quality criteria |
|---|---|---|
| Reliability | | |
| Internal consistency | + | Cronbach’s alpha(s) ≥ 0.70 |
| | ? | Dimensionality not known OR Cronbach’s alpha not determined |
| | − | Cronbach’s alpha(s) < 0.70 |
| Measurement error | + | MIC > SDC OR MIC outside the LOA |
| | ? | MIC not defined |
| | − | MIC ≤ SDC OR MIC equals or inside LOA |
| Reliability | + | ICC/weighted Kappa ≥ 0.70 OR Pearson’s r ≥ 0.80 |
| | ? | Neither ICC/weighted Kappa nor Pearson’s r determined |
| | − | ICC/weighted Kappa < 0.70 OR Pearson’s r < 0.80 |
| Validity | | |
| Content validity | + | All items are considered to be relevant for the construct to be measured, for the target population, and for the purpose of the measurement, AND the questionnaire is considered to be comprehensive |
| | ? | Not enough information available |
| | − | Not all items are considered to be relevant for the construct to be measured, for the target population, and for the purpose of the measurement, OR the questionnaire is considered not to be comprehensive |
| Construct validity | | |
| Structural validity | + | Factors should explain at least 50% of the variance |
| | ? | Explained variance not mentioned |
| | − | Factors explain <50% of the variance |
| Structural validity (IRT methods applied) | + | Residual correlations among the items after controlling for the dominant factor <0.20 OR Q3’s < 0.37, item scalability >0.30, IRT model fit: G² > 0.01, no DIF for important subject characteristics (such as age, gender, education): McFadden’s R² < 0.02 |
| | ? | Important characteristics not reported |
| | − | Residual correlations among the items after controlling for the dominant factor ≥0.20 OR Q3’s ≥ 0.37, item scalability ≤0.30, IRT model fit: G² ≤ 0.01, important DIF for important subject characteristics (such as age, gender, education): McFadden’s R² ≥ 0.02 |
| Hypothesis testing | + | Correlation with an instrument measuring the same construct ≥0.50 OR at least 75% of the results are in accordance with the hypotheses, AND correlation with related constructs is higher than with unrelated constructs |
| | ? | Solely correlations determined with unrelated constructs |
| | − | Correlation with an instrument measuring the same construct <0.50 OR <75% of the results are in accordance with the hypotheses OR correlation with related constructs is lower than with unrelated constructs |
| Cross-cultural validity | + | No differences in factor structure OR no important DIF between language versions |
| | ? | Multiple group factor analysis not applied AND DIF not assessed |
| | − | Differences in factor structure OR important DIF between language versions |
| Responsiveness | | |
| Responsiveness | + | Correlation with changes on instruments measuring the same construct ≥0.50 OR at least 75% of the results are in accordance with the hypotheses OR AUC ≥ 0.70, AND correlations with changes in related constructs are higher than with unrelated constructs |
| | ? | Solely correlations determined with unrelated constructs |
| | − | Correlations with changes on instruments measuring the same construct <0.50, OR <75% of the results are in accordance with the hypotheses, OR AUC < 0.70, OR correlations with changes in related constructs are lower than with unrelated constructs |
| Interpretability | + | MIC calculated and anchor questions clearly described |
| | ? | MIC calculated but anchor questions not clearly labelled |
| | − | MIC not reported |
MIC: minimal important change; SDC: smallest detectable change; LOA: limits of agreement; ICC: intraclass correlation coefficient; AUC: area under the curve. +: positive rating; ?: indeterminate rating; −: negative rating.
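Two of the criteria above are simple threshold rules, so they can be illustrated in a few lines of Python. The sketch below computes Cronbach's alpha with the standard formula and then applies the internal consistency and measurement error ratings. The function names, the toy data, and the ASCII `"-"` standing in for the negative rating are assumptions made for illustration; they do not come from the source. The limits-of-agreement branch of the measurement error criterion is deliberately left out.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def rate_internal_consistency(alpha, dimensionality_known=True):
    """'+' if alpha >= 0.70, '-' if below, '?' if alpha or dimensionality is unknown."""
    if alpha is None or not dimensionality_known:
        return "?"
    return "+" if alpha >= 0.70 else "-"


def rate_measurement_error(mic, sdc):
    """'+' if MIC > SDC, '-' if MIC <= SDC, '?' if no MIC is defined.

    Only the MIC-versus-SDC comparison is covered, not the LOA comparison.
    """
    if mic is None:
        return "?"
    return "+" if mic > sdc else "-"


# Hypothetical data: four respondents answering three perfectly correlated items.
toy_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = cronbach_alpha(toy_scores)
print(rate_internal_consistency(alpha))
print(rate_measurement_error(mic=4.0, sdc=2.5))
```

In a real review, alpha, MIC, and SDC would normally be taken from the included validation studies rather than recomputed, so only the two rating functions would be needed.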
Levels of evidence for the overall quality of a measurement property [16]
| Level | Rating | Criteria |
|---|---|---|
| Strong | +++ or −−− | Consistent findings in multiple studies of good methodological quality OR in one study of excellent methodological quality |
| Moderate | ++ or −− | Consistent findings in multiple studies of fair methodological quality OR in one study of good methodological quality |
| Limited | + or − | One study of fair methodological quality |
| Conflicting | +/− | Conflicting findings |
| Unknown | ? | Only studies of poor methodological quality |
+: positive rating; ?: indeterminate rating; −: negative rating.
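The synthesis rules in this table can be sketched as a small decision function. The version below is one possible reading rather than the authors' procedure, and it makes several assumptions the table does not state: studies of poor quality are ignored whenever better studies exist, every remaining study carries a `"+"` or `"-"` rating (indeterminate ratings are not handled), a study of good quality also counts towards "multiple studies of fair quality", and an empty list is treated as unknown. The name `level_of_evidence` and the ASCII `"-"` are likewise illustrative.

```python
def level_of_evidence(studies):
    """Combine per-study results into (level, overall rating).

    studies: list of (rating, quality) tuples with rating in {"+", "-"} and
    quality in {"poor", "fair", "good", "excellent"}.
    """
    # Assumption: poor-quality studies do not contribute when better ones exist.
    usable = [(rating, quality) for rating, quality in studies if quality != "poor"]
    if not usable:
        return ("Unknown", "?")

    ratings = {rating for rating, _ in usable}
    if len(ratings) > 1:
        return ("Conflicting", "+/-")

    sign = ratings.pop()
    qualities = [quality for _, quality in usable]
    n_good_or_better = sum(q in ("good", "excellent") for q in qualities)

    # Strong: one excellent study, or several studies of good quality.
    if "excellent" in qualities or n_good_or_better >= 2:
        return ("Strong", sign * 3)
    # Moderate: one good study, or several studies of at least fair quality.
    if n_good_or_better == 1 or len(qualities) >= 2:
        return ("Moderate", sign * 2)
    # Limited: a single study of fair quality.
    return ("Limited", sign)


print(level_of_evidence([("+", "good"), ("+", "good")]))
print(level_of_evidence([("+", "fair"), ("-", "good")]))
```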
Quality criteria required for recommendation of QoL measures for eczema
| Measurement property | OMERACT filter | Required rating |
|---|---|---|
| Content validity | Truth | + |
| Structural validity | Truth | + |
| Hypothesis testing | Truth | + |
| Cross-cultural validity | Truth | + |
| Internal consistency | Discrimination | + |
| Reliability | Discrimination | + |
| Measurement error | Discrimination | + |
| Responsiveness | Discrimination | + |
| Interpretability | Feasibility | + |
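Because every row in this table requires a positive rating, checking an instrument against it amounts to looking for properties that lack a "+". The sketch below encodes the table as a dictionary and reports what is missing. The names `REQUIRED`, `missing_criteria`, and `is_recommended`, and the example ratings, are hypothetical and not taken from the source.

```python
# Required properties and the filter category each belongs to, as in the table.
REQUIRED = {
    "Content validity": "Truth",
    "Structural validity": "Truth",
    "Hypothesis testing": "Truth",
    "Cross-cultural validity": "Truth",
    "Internal consistency": "Discrimination",
    "Reliability": "Discrimination",
    "Measurement error": "Discrimination",
    "Responsiveness": "Discrimination",
    "Interpretability": "Feasibility",
}


def missing_criteria(ratings):
    """Return the required properties that are not rated '+' (or not rated at all)."""
    return [prop for prop in REQUIRED if ratings.get(prop) != "+"]


def is_recommended(ratings):
    """True only if every required property has a positive rating."""
    return not missing_criteria(ratings)


# Hypothetical instrument: positive everywhere except an indeterminate
# measurement error rating.
example = {prop: "+" for prop in REQUIRED}
example["Measurement error"] = "?"
print(is_recommended(example))
print(missing_criteria(example))
```

Returning the list of unmet properties, rather than only a yes/no answer, makes it easy to see which validation work an instrument would still need.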