Nahathai Wongpakaran, Tinakon Wongpakaran, Danny Wedding, Kilem L Gwet.
Abstract
BACKGROUND: Rater agreement is important in clinical research, and Cohen's Kappa is a widely used method for assessing inter-rater reliability; however, there are well-documented statistical problems associated with the measure. To assess its utility, we evaluated it against Gwet's AC1 and compared the results.
Year: 2013 PMID: 23627889 PMCID: PMC3643869 DOI: 10.1186/1471-2288-13-61
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Pair, rater matches and number of subjects per pair
| Rater pair | VU-MN | US-SP | TW-SR | NW-SR | SU-TW | AM-SU | TW-VU | MN-TW | Total |
|---|---|---|---|---|---|---|---|---|---|
| No. of Subjects | 19 | 16 | 10 | 8 | 3 | 3 | 2 | 6 | 67 |
Distribution of subjects - by rater and response category
| | Rater A: Category 1 | Rater A: Category 2 | Total |
|---|---|---|---|
| Rater B: Category 1 | A | B | B1 (A+B) |
| Rater B: Category 2 | C | D | B2 (C+D) |
| Total | A1 (A+C) | A2 (B+D) | N |
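The two coefficients compared in this study can both be computed from the four cells of the table above. The following is a minimal sketch using the standard definitions of Cohen's Kappa and Gwet's AC1 for two raters and two categories; the function names are illustrative, and returning NaN when Kappa is undefined is an assumption of this sketch rather than anything stated in the source.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's Kappa for a 2x2 table with cells A, B, C, D."""
    n = a + b + c + d
    po = (a + d) / n                      # observed agreement
    a1, a2 = a + c, b + d                 # column totals A1, A2
    b1, b2 = a + b, c + d                 # row totals B1, B2
    pe = (a1 * b1 + a2 * b2) / n ** 2     # chance agreement from each rater's marginals
    if pe == 1.0:
        # Both raters used a single category for every subject: 0/0.
        # Returning NaN here is an assumption of this sketch.
        return float("nan")
    return (po - pe) / (1 - pe)


def gwet_ac1(a, b, c, d):
    """Gwet's AC1 for the same 2x2 table."""
    n = a + b + c + d
    po = (a + d) / n                      # observed agreement
    q = ((a + c) + (a + b)) / (2 * n)     # mean share of Category 1 across both raters
    pe = 2 * q * (1 - q)                  # chance agreement, never above 0.5
    return (po - pe) / (1 - pe)


# Example cells: A=17, B=1, C=0, D=1 (19 subjects, one disagreement).
print(round(cohen_kappa(17, 1, 0, 1), 3), round(gwet_ac1(17, 1, 0, 1), 3))
```

For these counts the observed agreement is 18/19, yet Kappa comes out at about 0.64 while AC1 is about 0.94, because the two coefficients estimate chance agreement differently.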
Distribution of subjects by rater and response category for the VU-MN and US-SP pairs of raters
| Personality disorder | Rating (VU-MN) | N | Y | Agreement (%) | Rating (US-SP) | N | Y | Agreement (%) |
|---|---|---|---|---|---|---|---|---|
| Avoidant | No (N) | 14 | 0 | 100 | No (N) | 13 | 0 | 100 |
| | Yes (Y) | 0 | 5 | | Yes (Y) | 0 | 3 | |
| Dependent | N | 17 | 1 | 95 | N | 15 | 0 | 100 |
| | Y | 0 | 1 | | Y | 0 | 1 | |
| Obsessive-Compulsive | N | 12 | 1 | 95 | N | 10 | 0 | 88 |
| | Y | 0 | 6 | | Y | 2 | 4 | |
| Passive-aggressive | N | 15 | 0 | 100 | N | 14 | 0 | 94 |
| | Y | 0 | 4 | | Y | 1 | 1 | |
| Depressive | N | 15 | 1 | 89 | N | 13 | 1 | 94 |
| | Y | 1 | 2 | | Y | 0 | 2 | |
| Paranoid | N | 13 | 0 | 100 | N | 15 | 0 | 100 |
| | Y | 0 | 6 | | Y | 0 | 1 | |
| Schizotypal | N | 18 | 0 | 100 | N | 16 | 0 | 100 |
| | Y | 0 | 1 | | Y | 0 | 0 | |
| Schizoid | N | 13 | 2 | 84 | N | 15 | 0 | 100 |
| | Y | 1 | 3 | | Y | 0 | 1 | |
| Histrionic | N | 17 | 1 | 94 | N | 15 | 0 | 100 |
| | Y | 0 | 1 | | Y | 0 | 1 | |
| Narcissistic | N | 19 | 0 | 100 | N | 16 | 0 | 100 |
| | Y | 0 | 0 | | Y | 0 | 0 | |
| Borderline | N | 15 | 0 | 100 | N | 14 | 0 | 100 |
| | Y | 0 | 4 | | Y | 0 | 2 | |
| Total Antisocial | N | 16 | 1 | 89 | N | 15 | 0 | 100 |
| | Y | 1 | 1 | | Y | 0 | 1 | |
Distribution of subjects by rater and response category for the TW-SR and NW-SR pairs of raters
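| Personality disorder | Rating (TW-SR) | N | Y | Agreement (%) | Rating (NW-SR) | N | Y | Agreement (%) |
|---|---|---|---|---|---|---|---|---|
| Avoidant | | | | | | | | |
| | | | | | | | | |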
| Dependent | N | 9 | 1 | 90 | N | 7 | 0 | 100 |
| | Y | 0 | 0 | | Y | 0 | 1 | |
| Obsessive-Compulsive | N | 8 | 0 | 100 | N | 6 | 0 | 100 |
| | Y | 0 | 2 | | Y | 0 | 2 | |
| Passive-aggressive | N | 9 | 0 | 90 | N | 6 | 1 | 88 |
| | Y | 1 | 0 | | Y | 0 | 1 | |
| Depressive | N | 10 | 0 | 100 | N | 7 | 0 | 100 |
| | Y | 0 | 0 | | Y | 0 | 1 | |
| Paranoid | N | 9 | 0 | 90 | N | 8 | 0 | 100 |
| | Y | 1 | 0 | | Y | 0 | 0 | |
| Schizotypal | N | 9 | 0 | 100 | N | 8 | 0 | 100 |
| | Y | 0 | 1 | | Y | 0 | 0 | |
| Schizoid | N | 7 | 1 | 90 | N | 6 | 0 | 88 |
| | Y | 0 | 2 | | Y | 1 | 1 | |
| Histrionic | N | 10 | 0 | 100 | N | 8 | 0 | 100 |
| | Y | 0 | 0 | | Y | 0 | 0 | |
| Narcissistic | N | 9 | 0 | 100 | N | 8 | 0 | 100 |
| | Y | 0 | 1 | | Y | 0 | 0 | |
| Borderline | N | 8 | 1 | 90 | N | 7 | 0 | 100 |
| | Y | 0 | 1 | | Y | 0 | 1 | |
| Total Antisocial | N | 10 | 0 | 100 | N | 7 | 0 | 100 |
| | Y | 0 | 0 | | Y | 0 | 1 | |
Inter-rater reliability between raters, based on Cohen’s Kappa and Gwet’s AC1
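| Personality disorder | VU-MN Kappa | VU-MN AC1 | US-SP Kappa | US-SP AC1 | TW-SR Kappa | TW-SR AC1 | NW-SR Kappa | NW-SR AC1 |
|---|---|---|---|---|---|---|---|---|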
| Avoidant | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | .858 |
| Dependent | .640 | .934 | 1.000 | 1.000 | 0 | .890 | 1.000 | 1.000 |
| Obsessive-Compulsive | .883 | .904 | .714 | .781 | 1.000 | 1.000 | 1.000 | 1.000 |
| Passive-Aggressive | 1.000 | 1.000 | .636 | .924 | 0 | .890 | .600 | .820 |
| Depressive | .604 | .857 | .765 | .915 | 1.000 | 1.000 | 1.000 | 1.000 |
| Paranoid | 1.000 | 1.000 | 1.000 | 1.000 | 0 | .890 | 1.000 | 1.000 |
| Schizotypal | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Schizoid | .565 | .752 | 1.000 | 1.000 | .737 | .840 | .600 | .820 |
| Histrionic | .641 | .938 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Narcissistic | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Borderline | 1.000 | 1.000 | 1.000 | 1.000 | .615 | .866 | 1.000 | 1.000 |
| Total Antisocial | .441 | .870 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Comparison between Cohen’s Kappa and Gwet’s AC1 according to prevalence rate
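| Personality disorder | Prevalence (%) | Cohen's Kappa | Gwet's AC1 | Agreement (%) |
|---|---|---|---|---|
| Avoidant | 26.32 | 1.000 | 1.000 | 100 |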
| | 18.75 | 1.000 | 1.000 | 100 |
| | 10.00 | 1.000 | 1.000 | 100 |
| | 0.0 | 0.0 | .858 | 88 |
| Dependent | 12.50 | 1.000 | 1.000 | 100 |
| | 6.25 | 1.000 | 1.000 | 100 |
| | 5.26 | .640 | .934 | 95 |
| | 0.0 | 0.0 | .890 | 90 |
| Obsessive-Compulsive | 31.58 | .883 | .904 | 95 |
| | 25.00 | .714 | .781 | 88 |
| | 25.00 | 1.000 | 1.000 | 100 |
| | 20.00 | 1.000 | 1.000 | 100 |
| Passive-Aggressive | 21.05 | 1.000 | 1.000 | 100 |
| | 12.50 | .600 | .820 | 88 |
| | 6.25 | .636 | .924 | 94 |
| | 0.0 | 0.0 | .890 | 90 |
| Depressive | 12.50 | 1.000 | 1.000 | 100 |
| | 12.50 | .765 | .915 | 94 |
| | 10.53 | .604 | .857 | 89 |
| | 0.0 | 1.000 | 1.000 | 100 |
| Paranoid | 31.58 | 1.000 | 1.000 | 100 |
| | 6.25 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| | 0.0 | 0.0 | .890 | 90 |
| Schizotypal | 10.00 | 1.000 | 1.000 | 100 |
| | 5.26 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| Schizoid | 20.00 | .737 | .840 | 90 |
| | 15.79 | .565 | .752 | 84 |
| | 12.50 | .600 | .820 | 88 |
| | 6.25 | 1.000 | 1.000 | 100 |
| Histrionic | 6.25 | 1.000 | 1.000 | 100 |
| | 5.26 | .641 | .938 | 94 |
| | 0.0 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| Narcissistic | 10.00 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| | 0.0 | 1.000 | 1.000 | 100 |
| Borderline | 21.05 | 1.000 | 1.000 | 100 |
| | 12.50 | 1.000 | 1.000 | 100 |
| | 12.50 | 1.000 | 1.000 | 100 |
| | 10.00 | .615 | .866 | 90 |
| Total Antisocial | 12.50 | 1.000 | 1.000 | 100 |
| | 6.25 | 1.000 | 1.000 | 100 |
| | 5.26 | .441 | .870 | 89 |
| | 0.0 | 1.000 | 1.000 | 100 |
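The pattern in this table, where Kappa drops as prevalence approaches zero while AC1 and percent agreement stay high, can be explored with a short sketch. It assumes that prevalence is the share of subjects both raters rated "Yes" (this reproduces figures such as 26.32 = 5/19, but the definition is inferred from the numbers, not stated in the source). The function name and the NaN handling for an undefined Kappa are also assumptions.

```python
def agreement_stats(nn, ny, yn, yy):
    """Return (prevalence %, Kappa, AC1, agreement %) for a 2x2 No/Yes table.

    nn and yy are the agreement cells; ny and yn are the disagreement cells.
    """
    n = nn + ny + yn + yy
    po = (nn + yy) / n
    # Kappa's chance term uses each rater's own "Yes" proportion.
    r_yes, c_yes = (yn + yy) / n, (ny + yy) / n
    pe_kappa = r_yes * c_yes + (1 - r_yes) * (1 - c_yes)
    # Kappa is 0/0 when both raters use one category only; NaN is an assumption.
    kappa = float("nan") if pe_kappa == 1 else (po - pe_kappa) / (1 - pe_kappa)
    # AC1's chance term uses the average "Yes" proportion of the two raters.
    q = (r_yes + c_yes) / 2
    pe_ac1 = 2 * q * (1 - q)
    ac1 = (po - pe_ac1) / (1 - pe_ac1)
    return 100 * yy / n, kappa, ac1, 100 * po


# Ten subjects, exactly one disagreement, fewer and fewer joint "Yes" ratings.
for yy in (2, 1, 0):
    prev, kappa, ac1, agree = agreement_stats(9 - yy, 1, 0, yy)
    print(f"prevalence={prev:5.1f}  kappa={kappa:.3f}  AC1={ac1:.3f}  agreement={agree:.0f}")
```

With agreement fixed at 90%, Kappa moves from about 0.74 to about 0.62 and then to 0 as the joint "Yes" count goes from two to none, while AC1 stays between 0.84 and 0.89. These three cases correspond to the 20.00, 10.00 and 0.0 prevalence rows with 90% agreement in the table.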
Benchmark scales for Kappa’s value, as proposed by different investigators
| Landis and Koch | Altman | Fleiss |
|---|---|---|
| <.00; Poor | | |
| .00 to .20; Slight | <.20; Poor | <.40; Poor |
| .21 to .40; Fair | .21 to .40; Fair | .40 to .75; Intermediate to Good |
| .41 to .60; Moderate | .41 to .60; Moderate | |
| .61 to .80; Substantial | .61 to .80; Good | More than .75; Excellent |
| .81 to 1.00; Almost Perfect | .81 to 1.00; Very Good | |
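As a small illustration of how such a benchmark is applied, the sketch below maps a Kappa value to the six-band scale in the first column of the table, the scale commonly attributed to Landis and Koch. The function name is hypothetical, and assigning values that fall between two printed bands (for example .205) to the lower band is an assumption of this sketch.

```python
def landis_koch_label(kappa):
    """Map a Kappa value to the six-band scale in the first column of the table."""
    if kappa < 0:
        return "Poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "Slight"),
        (0.40, "Fair"),
        (0.60, "Moderate"),
        (0.80, "Substantial"),
        (1.00, "Almost Perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    # Reached for values above 1 and for NaN, neither of which fits a band.
    raise ValueError("Kappa must be a number no greater than 1")


for value in (0.441, 0.64, 0.883):
    print(value, landis_koch_label(value))
```

Under this scale a Kappa of .441 would be read as "Moderate" and .883 as "Almost Perfect"; the other two columns of the table would label the same values differently.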