Wan Tang, Jun Hu, Hui Zhang, Pan Wu, Hua He.
Abstract
In mental health and psychosocial studies it is often necessary to report on the between-rater agreement of measures used in the study. This paper discusses the concept of agreement, highlighting its fundamental difference from correlation. Several examples demonstrate how to compute the kappa coefficient - a popular statistic for measuring agreement - both by hand and by using statistical software packages such as SAS and SPSS. Real study data are used to illustrate how to use and interpret this coefficient in clinical research and practice. The article concludes with a discussion of the limitations of the coefficient.
Keywords: correlation; interrater agreement; weighted kappa
Year: 2015 PMID: 25852260 PMCID: PMC4372765 DOI: 10.11919/j.issn.1002-0829.215010
Source DB: PubMed Journal: Shanghai Arch Psychiatry ISSN: 1002-0829
Diagnosis of depression among 200 primary care patients based on information provided by the proband and by other informants about the proband
| Proband | Informant: not depressed | Informant: depressed | Total |
|---|---|---|---|
| not depressed | 66 | 19 | 85 |
| depressed | 50 | 65 | 115 |
| Total | 116 | 84 | 200 |
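From the table above, the kappa coefficient can be computed by hand: observed agreement p_o = (66 + 65)/200 = 0.655, chance-expected agreement p_e = (85×116 + 115×84)/200² = 0.488, giving κ = (p_o − p_e)/(1 − p_e) ≈ 0.33. A minimal sketch of the same calculation in plain Python (variable names are illustrative):

```python
# Cohen's kappa for the 2x2 proband/informant table (rows: proband, cols: informant).
table = [[66, 19],
         [50, 65]]
n = sum(sum(r) for r in table)                       # 200 subjects

p_o = sum(table[i][i] for i in range(2)) / n         # observed agreement: 131/200
row = [sum(r) for r in table]                        # row totals: 85, 115
col = [sum(c) for c in zip(*table)]                  # column totals: 116, 84
p_e = sum(row[i] * col[i] for i in range(2)) / n**2  # chance agreement: 0.488

kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 2))  # 0.33
```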
Hypothetical example of the proportional distribution of diagnoses by two raters who make diagnoses independently of each other

| Rater 1 | Rater 2: positive | Rater 2: negative | Total |
|---|---|---|---|
| positive | 0.72 | 0.08 | 0.80 |
| negative | 0.18 | 0.02 | 0.20 |
| Total | 0.90 | 0.10 | 1.00 |
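Each cell proportion above is the product of its marginal proportions, so these two independent raters still agree on 0.72 + 0.02 = 74% of cases by chance alone. This is exactly what kappa corrects for: when observed agreement equals chance agreement, κ = 0. A quick check in plain Python (illustrative sketch):

```python
# Independent raters: every cell equals the product of its marginals.
table = [[0.72, 0.08],
         [0.18, 0.02]]

p_o = table[0][0] + table[1][1]          # 0.74 observed agreement
row = [sum(r) for r in table]            # 0.80, 0.20
col = [sum(c) for c in zip(*table)]      # 0.90, 0.10
p_e = row[0] * col[0] + row[1] * col[1]  # 0.74 expected by chance

kappa = (p_o - p_e) / (1 - p_e)
print(kappa)  # effectively 0 (floating-point noise aside)
```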
A typical 2×2 contingency table to assess the agreement of two raters

| First rater (x) | Second rater (y): 1 (positive) | Second rater (y): 0 (negative) | Total |
|---|---|---|---|
| 1 (positive) | n11 | n10 | n1+ |
| 0 (negative) | n01 | n00 | n0+ |
| Total | n+1 | n+0 | n |
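Writing n_ij for the number of subjects rated i by the first rater and j by the second (i, j = 1 for positive, 0 for negative), with n_{i+} and n_{+j} the row and column totals and n the sample size, kappa for the 2×2 table is:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
p_o = \frac{n_{11} + n_{00}}{n}, \qquad
p_e = \frac{n_{1+}\,n_{+1} + n_{0+}\,n_{+0}}{n^{2}}
```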
Model k×k contingency table to assess agreement on k categories by two different raters

| x | y: 1 | y: 2 | ... | y: k | Total |
|---|---|---|---|---|---|
| 1 | n11 | n12 | ... | n1k | n1+ |
| 2 | n21 | n22 | ... | n2k | n2+ |
| ... | ... | ... | ... | ... | ... |
| k | nk1 | nk2 | ... | nkk | nk+ |
| Total | n+1 | n+2 | ... | n+k | n |
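The same formula extends to k categories: p_o is the proportion of subjects on the diagonal (both raters choose the same category) and p_e = Σᵢ n_{i+} n_{+i} / n². A generic sketch in plain Python (the function name is illustrative):

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square k x k contingency table of counts."""
    k = len(table)
    n = sum(sum(r) for r in table)
    rows = [sum(r) for r in table]                         # n_{i+}
    cols = [sum(c) for c in zip(*table)]                   # n_{+j}
    p_o = sum(table[i][i] for i in range(k)) / n           # diagonal proportion
    p_e = sum(rows[i] * cols[i] for i in range(k)) / n**2  # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Reproduces the earlier proband/informant 2x2 table
print(round(cohen_kappa([[66, 19], [50, 65]]), 2))  # 0.33
```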
Three ranked levels of depression categorized based on information from the probands themselves or on information from other informants about the probands
| Probands | Other informants: no depression | Other informants: minor depression | Other informants: major depression | Total |
|---|---|---|---|---|
| no depression | 66 (1.0/1.0)a | 13 (0.5/0.75)a | 6 (0.0/0.0)a | 85 |
| minor depression | 36 (0.5/0.75)a | 16 (1.0/1.0)a | 10 (0.5/0.75)a | 62 |
| major depression | 14 (0.0/0.0)a | 12 (0.5/0.75)a | 27 (1.0/1.0)a | 53 |
| Total | 116 | 41 | 43 | 200 |

a Values in parentheses are the Cicchetti-Allison and Fleiss-Cohen weights used when computing weighted kappa.
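Weighted kappa gives partial credit for near-agreement on an ordinal scale. The Cicchetti-Allison (linear) weights are w_ij = 1 − |i − j|/(k − 1) and the Fleiss-Cohen (quadratic) weights are w_ij = 1 − (i − j)²/(k − 1)²; for k = 3 these are exactly the paired values shown in parentheses above. A sketch computing both versions from the table's counts (function name is illustrative):

```python
def weighted_kappa(table, weight="linear"):
    """Weighted kappa: 'linear' = Cicchetti-Allison, 'quadratic' = Fleiss-Cohen."""
    k = len(table)
    n = sum(sum(r) for r in table)
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    if weight == "linear":
        w = lambda i, j: 1 - abs(i - j) / (k - 1)
    else:
        w = lambda i, j: 1 - (i - j) ** 2 / (k - 1) ** 2
    p_o = sum(w(i, j) * table[i][j] for i in range(k) for j in range(k)) / n
    p_e = sum(w(i, j) * rows[i] * cols[j] for i in range(k) for j in range(k)) / n**2
    return (p_o - p_e) / (1 - p_e)

counts = [[66, 13, 6],
          [36, 16, 10],
          [14, 12, 27]]  # probands (rows) vs. other informants (columns)
print(round(weighted_kappa(counts, "linear"), 3))     # 0.368
print(round(weighted_kappa(counts, "quadratic"), 3))  # 0.448
```

Quadratic weights penalize adjacent-category disagreements less than linear weights do, which is why the Fleiss-Cohen value comes out larger here than the Cicchetti-Allison one.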
Hypothetical example of an incorrect agreement table that can occur when two raters on a three-level scale each use only 2 of the 3 levels

| Classification of rater 1 | Classification of rater 2: B | Classification of rater 2: C | Total |
|---|---|---|---|
| A | 16 | 2 | 18 |
| B | 5 | 14 | 19 |
| Total | 21 | 16 | 37 |
Adjustment of the agreement table (by adding zero cells) needed when two raters on a three-level scale each use only 2 of the 3 levels

| Classification of rater 1 | Classification of rater 2: A | Classification of rater 2: B | Classification of rater 2: C | Total |
|---|---|---|---|---|
| A | 0 | 16 | 2 | 18 |
| B | 0 | 5 | 14 | 19 |
| C | 0 | 0 | 0 | 0 |
| Total | 0 | 21 | 16 | 37 |
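With the zero rows and columns added, both raters share the same ordered category list (A, B, C) and kappa can be computed as usual. Since the raters agree only on the 5 B-B cases, the coefficient here actually comes out negative, i.e. worse than chance. A sketch in plain Python:

```python
# Padded table from above: rows = rater 1 (A, B, C), columns = rater 2 (A, B, C).
table = [[0, 16, 2],
         [0, 5, 14],
         [0, 0, 0]]
n = sum(sum(r) for r in table)                       # 37 cases
row = [sum(r) for r in table]                        # 18, 19, 0
col = [sum(c) for c in zip(*table)]                  # 0, 21, 16
p_o = sum(table[i][i] for i in range(3)) / n         # 5/37: only the B-B cell agrees
p_e = sum(row[i] * col[i] for i in range(3)) / n**2  # chance agreement: 399/1369
kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 2))  # -0.22
```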