Caroline B Terwee, Lidwine B Mokkink, Dirk L Knol, Raymond W J G Ostelo, Lex M Bouter, Henrica C W de Vet.
Abstract
BACKGROUND: The COSMIN checklist is a standardized tool for assessing the methodological quality of studies on measurement properties. It contains 9 boxes, each dealing with one measurement property, with 5–18 items per box about design aspects and statistical methods. Our aim was to develop a scoring system for the COSMIN checklist to calculate quality scores per measurement property when using the checklist in systematic reviews of measurement properties.
Year: 2011 PMID: 21732199 PMCID: PMC3323819 DOI: 10.1007/s11136-011-9960-1
Source DB: PubMed Journal: Qual Life Res ISSN: 0962-9343 Impact factor: 4.147
Example of one COSMIN box with 4-point scale
Box B. Reliability: relative measures (including test–retest reliability, inter-rater reliability, and intra-rater reliability)

| Item | Excellent | Good | Fair | Poor |
|---|---|---|---|---|
| Design requirements | | | | |
| 1. Was the percentage of missing items given? | Percentage of missing items described | Percentage of missing items NOT described | ||
| 2. Was there a description of how missing items were handled? | Described how missing items were handled | Not described but it can be deduced how missing items were handled | Not clear how missing items were handled | |
| 3. Was the sample size included in the analysis adequate? | Adequate sample size (≥100) | Good sample size (50–99) | Moderate sample size (30–49) | Small sample size (<30) |
| 4. Were at least two measurements available? | At least two measurements | Only one measurement | ||
| 5. Were the administrations independent? | Independent measurements | Assumable that the measurements were independent | Doubtful whether the measurements were independent | Measurements NOT independent |
| 6. Was the time interval stated? | Time interval stated | Time interval NOT stated | ||
| 7. Were patients stable in the interim period on the construct to be measured? | Patients were stable (evidence provided) | Assumable that patients were stable | Unclear whether patients were stable | Patients were NOT stable |
| 8. Was the time interval appropriate? | Time interval appropriate | Doubtful whether time interval was appropriate | Time interval NOT appropriate | |
| 9. Were the test conditions similar for both measurements? e.g., type of administration, environment, and instructions | Test conditions were similar (evidence provided) | Assumable that test conditions were similar | Unclear whether test conditions were similar | Test conditions were NOT similar |
| 10. Were there any important flaws in the design or methods of the study? | No other important methodological flaws in the design or execution of the study | Other minor methodological flaws in the design or execution of the study | Other important methodological flaws in the design or execution of the study | |
| Statistical methods | | | | |
| 11. For continuous scores: Was an intraclass correlation coefficient (ICC) calculated? | ICC calculated and model or formula of the ICC is described | ICC calculated but model or formula of the ICC not described. Pearson or Spearman correlation coefficient calculated with evidence provided that no systematic change has occurred | Pearson or Spearman correlation coefficient calculated WITHOUT evidence provided that no systematic change has occurred or WITH evidence that systematic change has occurred | No ICC or Pearson or Spearman correlations calculated |
| 12. For dichotomous/nominal/ordinal scores: Was kappa calculated? | Kappa calculated | Only percentage agreement calculated | ||
| 13. For ordinal scores: Was a weighted kappa calculated? | Weighted kappa calculated | Unweighted kappa calculated | Only percentage agreement calculated | |
| 14. For ordinal scores: Was the weighting scheme described? e.g. linear, quadratic | Weighting scheme described | Weighting scheme NOT described | ||
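
The box above can be read as a small rating procedure: each item receives one of four ratings, and the item ratings are then combined into a quality score for the measurement property. The sketch below illustrates this under stated assumptions. The sample-size thresholds come from item 3 of the table. The names `Rating`, `rate_sample_size`, and `box_score` are hypothetical, and the combination rule used here (the lowest item rating determines the box score) is an assumption, because this excerpt does not state how item ratings are combined.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Four-point scale used in the box above; higher is better."""
    POOR = 1
    FAIR = 2
    GOOD = 3
    EXCELLENT = 4


def rate_sample_size(n: int) -> Rating:
    """Rate item 3 using the thresholds printed in the table."""
    if n < 0:
        raise ValueError("sample size cannot be negative")
    if n >= 100:
        return Rating.EXCELLENT
    if n >= 50:
        return Rating.GOOD
    if n >= 30:
        return Rating.FAIR
    return Rating.POOR


def box_score(item_ratings: dict[int, Rating]) -> Rating:
    """Combine item ratings into a single score for the box.

    ASSUMPTION: the lowest item rating determines the box score.
    The excerpt does not state the combination rule.
    """
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return min(item_ratings.values())


# Hypothetical test-retest study, keyed by item number in Box B:
# 62 patients, time interval stated, patient stability assumable,
# ICC calculated but its model not described.
ratings = {
    3: rate_sample_size(62),
    6: Rating.EXCELLENT,
    7: Rating.GOOD,
    11: Rating.GOOD,
}
print(box_score(ratings).name)
```

Using an `IntEnum` keeps the ratings ordered, so taking the minimum expresses the assumed rule directly. A different combination rule would only require changing `box_score`.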
Fig. 1 Percentage of studies with excellent, good, fair, or poor quality. Number of studies included: internal consistency 35; reliability 36; measurement error 21; content validity 16; structural validity 11; construct validity (hypotheses testing) 41; translation 25; responsiveness 37 (criterion validity was not assessed in any of the studies)