Abstract
Agreement studies are of paramount importance in various scientific domains. When several observers classify objects on categorical scales, agreement can be quantified through multirater kappa coefficients. In most statistical packages, the standard error of these coefficients is only available under the null hypothesis that the coefficient is equal to zero, which prevents the construction of confidence intervals in the general case. The aim of this paper is threefold. First, simple analytic formulae for the standard error of multirater kappa coefficients will be given in the general case. Second, these formulae will be extended to the case of multilevel data structures. The formulae are based on simple matrix algebra and are implemented in the R package "multiagree". Third, guidelines on the choice between the different multirater kappa coefficients will be provided.
Keywords: Conger kappa; Fleiss’s kappa; hierarchical; nested; pairwise agreement; rater
Year: 2018 PMID: 30132375 PMCID: PMC6745615 DOI: 10.1177/0962280218794733
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021
Figure 1. Simulations. Coverage for Conger’s kappa coefficient against the number of clusters, obtained with the delta method (black) and the percentile-based bootstrap method (gray) in the presence of 2 (dotted), 5 (dashed) and 10 (solid) observers with uniform marginal probability distribution. The number of objects per cluster is equal to 1.
Figure 2. Simulations. Coverage for Conger’s kappa coefficient according to the delta method in the presence of 2 (dotted), 5 (dashed) and 10 (solid) observers with uniform marginal probability distribution, with 25 (top), 50 (middle) and 100 (bottom) clusters and five objects per cluster.
Fleiss example.
| Category | Proportion | Observed agreement | Expected agreement | Delta method: kappa (SE) | Delta method: 95% CI lower | Delta method: 95% CI upper | Bootstrap method: kappa (SE) | Bootstrap method: 95% CI lower | Bootstrap method: 95% CI upper |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.144 | 0.813 | 0.753 | 0.245 (0.109) | 0.031 | 0.459 | 0.232 (0.108) | 0.020 | 0.443 |
| 2 | 0.144 | 0.813 | 0.753 | 0.245 (0.115) | 0.020 | 0.470 | 0.231 (0.098) | 0.040 | 0.422 |
| 3 | 0.167 | 0.867 | 0.722 | 0.520 (0.100) | 0.324 | 0.716 | 0.511 (0.078) | 0.358 | 0.664 |
| 4 | 0.306 | 0.776 | 0.576 | 0.471 (0.084) | 0.307 | 0.635 | 0.459 (0.076) | 0.310 | 0.608 |
| 5 | 0.239 | 0.842 | 0.636 | 0.566 (0.115) | 0.341 | 0.791 | 0.550 (0.128) | 0.298 | 0.801 |
| Overall | | 0.556 | 0.220 | 0.430 (0.054) | 0.324 | 0.536 | 0.418 (0.055) | 0.309 | 0.526 |
Note: Summary of the statistics to compute Fleiss kappa for each category separately and overall.
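The rows above are consistent with the usual chance-corrected form kappa = (P_o - P_e) / (1 - P_e). For category 1, reading 0.813 as the observed agreement and 0.753 as the expected agreement gives about 0.24, and the delta-method limits match a symmetric interval of the form estimate ± 1.96 × SE. The sketch below reproduces these checks. The function names are illustrative, and both the reading of the columns and the 1.96 normal quantile are assumptions inferred from the tabulated values, not statements taken from the source.

```python
from typing import Tuple


def kappa_from_agreement(p_obs: float, p_exp: float) -> float:
    """Chance-corrected agreement: (P_o - P_e) / (1 - P_e)."""
    if p_exp >= 1.0:
        raise ValueError("expected agreement must be below 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


def wald_interval(estimate: float, se: float, z: float = 1.96) -> Tuple[float, float]:
    """Symmetric normal-approximation interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se


# Category 1 row: 0.813 assumed to be observed agreement, 0.753 expected agreement.
kappa_cat1 = kappa_from_agreement(0.813, 0.753)

# Overall row: 0.556 assumed to be observed agreement, 0.220 expected agreement.
kappa_overall = kappa_from_agreement(0.556, 0.220)

# Delta-method estimate and standard error reported for category 1.
lower, upper = wald_interval(0.245, 0.109)

print(round(kappa_cat1, 3), round(kappa_overall, 3))
print(round(lower, 3), round(upper, 3))
```

Because the inputs are already rounded to three decimals, the recomputed kappa values can differ from the tabulated ones in the last digit.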
Tromsø example.
| Body location | EXP | NOR | RUS | WAL | NLD | PLN | STU |
|---|---|---|---|---|---|---|---|
| U | 0.13 | 0.11 | 0.23 | 0.083 | 0.12 | 0.22 | 0.22 |
| L | 0.29 | 0.38 | 0.37 | 0.19 | 0.16 | 0.34 | 0.36 |
| A | 0.016 | 0.048 | 0.3 | 0.029 | 0.046 | 0.12 | 0.22 |
| p-value | <0.0001 | <0.0001 | 0.031 | <0.0001 | 0.0015 | <0.0001 | 0.0093 |
Note: Probability of detecting crackles according to the location: anterior thorax (A), upper posterior thorax (U) and lower posterior thorax (L). The probabilities are compared among locations using a multilevel probit regression.
Tromsø example.
| Group | U: P | U: kappa (SE) | L: P | L: kappa (SE) | A: P | A: kappa (SE) | All: P | All: kappa (SE) |
|---|---|---|---|---|---|---|---|---|
| EXP | 0.88 | 0.65 (0.13) | 0.78 | 0.52 (0.08) | 0.91 | 0.04 (0.06) | 0.86 | 0.56 (0.08) |
| NOR | 0.92 | 0.75 (0.12) | 0.78 | 0.55 (0.10) | 0.85 | 0.10 (0.06) | 0.85 | 0.58 (0.08) |
| RUS | 0.72 | 0.25 (0.08) | 0.64 | 0.26 (0.07) | 0.59 | 0.06 (0.07) | 0.65 | 0.20 (0.05) |
| WAL | 0.86 | 0.48 (0.17) | 0.88 | 0.71 (0.10) | 0.86 | 0.01 (0.05) | 0.87 | 0.53 (0.09) |
| NLD | 0.85 | 0.54 (0.13) | 0.86 | 0.61 (0.12) | 0.85 | 0.07 (0.06) | 0.86 | 0.49 (0.10) |
| PLN | 0.80 | 0.50 (0.14) | 0.76 | 0.49 (0.12) | 0.73 | 0.05 (0.07) | 0.76 | 0.40 (0.09) |
| STU | 0.78 | 0.43 (0.15) | 0.79 | 0.56 (0.11) | 0.63 | 0.02 (0.05) | 0.74 | 0.37 (0.08) |
Note: Proportion of agreement (P) and Conger’s kappa coefficient (standard error) for each group of observers reported overall (All) and at each thorax location (anterior thorax (A), upper posterior thorax (U) and lower posterior thorax (L)).
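To make the two quantities in this table concrete, here is a minimal sketch of Conger's kappa as it is commonly defined: observed agreement is the mean pairwise agreement between observers, and chance agreement is computed from each observer's own marginal proportions. The function name `conger_kappa` and the toy ratings are hypothetical. This is not the "multiagree" implementation, and it computes neither the standard errors nor any adjustment for the multilevel structure.

```python
from itertools import combinations
from typing import Sequence


def conger_kappa(ratings: Sequence[Sequence[int]]) -> float:
    """Conger's kappa for ratings[i][r] = category given to object i by observer r."""
    n_objects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({c for row in ratings for c in row})
    pairs = list(combinations(range(n_raters), 2))

    # Observer-specific marginal proportions: marg[r][k].
    marg = [
        {k: sum(row[r] == k for row in ratings) / n_objects for k in categories}
        for r in range(n_raters)
    ]

    # Observed agreement: share of agreeing observer pairs, averaged over objects.
    p_obs = sum(
        sum(row[r] == row[s] for r, s in pairs) / len(pairs) for row in ratings
    ) / n_objects

    # Chance agreement: average over observer pairs of sum_k p_rk * p_sk.
    p_exp = sum(
        sum(marg[r][k] * marg[s][k] for k in categories) for r, s in pairs
    ) / len(pairs)

    if p_exp >= 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Toy data (hypothetical): four objects rated by two observers on a binary scale.
two_raters = [[1, 1], [1, 1], [0, 0], [0, 1]]
# Toy data (hypothetical): three observers in perfect agreement.
three_raters = [[0, 0, 0], [1, 1, 1], [1, 1, 1], [0, 0, 0]]

print(conger_kappa(two_raters))
print(conger_kappa(three_raters))
```

With two observers this definition coincides with Cohen's kappa, which gives a quick sanity check on the toy data.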