Andrei P Kirilenko, Svetlana Stepchenkova.
Abstract
Content analysis involves classification of textual, visual, or audio data. Inter-coder agreement is estimated by having two or more coders classify the same data units and then comparing their results. Existing methods of agreement estimation, e.g., Cohen's kappa, require that coders place each unit of content into one and only one category (one-to-one coding) from a pre-established set of categories. However, in certain data domains (e.g., maps, photographs, databases of texts and images), this requirement seems overly restrictive. The restriction could be lifted, provided that there is a measure to calculate inter-coder agreement under the one-to-many protocol. Building on existing approaches to one-to-many coding in geography and biomedicine, such a measure, fuzzy kappa, an extension of Cohen's kappa, is proposed. It is argued that the measure is especially compatible with data from certain domains, where the holistic reasoning of human coders is used to describe the data and access the meaning of communication.
Year: 2016 PMID: 26933956 PMCID: PMC4775035 DOI: 10.1371/journal.pone.0149787
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Examples of membership function μ(u): categories 1 through 5.
“Crisp” function: coder selects only one category, that of #3. “Fuzzy 1” function: coder selects categories 2, 3, and 4; they are given equal weights. “Fuzzy 2” function: coder selects categories 3 (first choice), 4 (second choice) and 2 (third choice).
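The three membership functions in the caption can be written out as small lookup tables. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the equal weight of 1.0 in the "Fuzzy 1" case and the linearly decreasing weights in the "Fuzzy 2" case are assumptions, since the caption gives only the selected categories and their order.

```python
from typing import Dict, List

CATEGORIES = [1, 2, 3, 4, 5]  # the five categories from the caption


def crisp(choice: int) -> Dict[int, float]:
    """One-to-one coding: full membership in exactly one category."""
    return {c: 1.0 if c == choice else 0.0 for c in CATEGORIES}


def fuzzy_equal(choices: List[int]) -> Dict[int, float]:
    """One-to-many coding with equal weights.

    Assumption: every selected category gets weight 1.0.
    """
    selected = set(choices)
    return {c: 1.0 if c in selected else 0.0 for c in CATEGORIES}


def fuzzy_ranked(ranked_choices: List[int]) -> Dict[int, float]:
    """One-to-many coding with ranked choices.

    Assumption: weights fall linearly with rank (first choice = 1.0).
    """
    n = len(ranked_choices)
    mu = {c: 0.0 for c in CATEGORIES}
    for rank, c in enumerate(ranked_choices):
        mu[c] = (n - rank) / n
    return mu


mu_crisp = crisp(3)                    # "Crisp": category 3 only
mu_fuzzy1 = fuzzy_equal([2, 3, 4])     # "Fuzzy 1": categories 2, 3, 4
mu_fuzzy2 = fuzzy_ranked([3, 4, 2])    # "Fuzzy 2": 3 first, 4 second, 2 third

for name, mu in [("crisp", mu_crisp), ("fuzzy 1", mu_fuzzy1), ("fuzzy 2", mu_fuzzy2)]:
    print(name, mu)
```

Any monotonically decreasing weighting would serve for the ranked case; the linear scheme is chosen here only because it is easy to read.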
Inter-coder agreement indices: Examples 1 and 2.
For data, see Supporting Information files S1 and S2 (zip archives).
| Examples | Number of categories | Crisp P0 | Crisp PE | Crisp kappa | Crisp CI* | Fuzzy P0 | Fuzzy PE | Fuzzy kappa | Fuzzy CI* |
|---|---|---|---|---|---|---|---|---|---|
| 1. Russia | 30 | - | - | - | - | 0.90 | 0.53 | 0.79 | 0.76–0.81 |
| People | 5 | 0.95 | 0.43 | 0.90 | 0.86–0.95 | | | | |
| Nature Landscape | 2 | 0.99 | 0.80 | 0.95 | 0.89–1.01 | | | | |
| Place | 3 | 0.85 | 0.39 | 0.75 | 0.68–0.82 | | | | |
| Space | 4 | 0.73 | 0.28 | 0.62 | 0.55–0.69 | | | | |
| Transport & Infr | 2 | 0.96 | 0.68 | 0.86 | 0.78–0.94 | | | | |
| Activities | 5 | 0.85 | 0.50 | 0.70 | 0.62–0.78 | | | | |
| Season | 3 | 0.96 | 0.57 | 0.91 | 0.85–0.96 | | | | |
| Architecture | 2 | 0.92 | 0.53 | 0.84 | 0.77–0.90 | | | | |
| Heritage | 4 | 0.91 | 0.60 | 0.76 | 0.68–0.85 | | | | |
| 2. Peru | 20 | - | - | - | - | 0.84 | 0.14 | 0.81 | 0.75–0.87 |
| First selection only | 20 | 0.79 | 0.12 | 0.76 | 0.67–0.85 | | | | |
* 95% confidence interval.
NB: Fuzzy kappa for single categories (in italic) equals crisp kappa.
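The agreement values reported next to each P0 and PE pair are consistent with the standard form of Cohen's kappa, κ = (P0 − PE) / (1 − PE), where P0 is the observed and PE the expected (chance) agreement. The short check below recomputes kappa from the rounded P0 and PE in the table; the helper name `kappa` and the 0.02 tolerance are choices made for this sketch, and small differences are expected because the table inputs are rounded to two decimals.

```python
def kappa(p0: float, pe: float) -> float:
    """Cohen's kappa from observed (p0) and expected (pe) agreement."""
    if pe >= 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p0 - pe) / (1.0 - pe)


# (P0, PE, reported kappa) copied from the table above.
REPORTED = {
    "Russia (fuzzy)": (0.90, 0.53, 0.79),
    "People": (0.95, 0.43, 0.90),
    "Nature Landscape": (0.99, 0.80, 0.95),
    "Place": (0.85, 0.39, 0.75),
    "Space": (0.73, 0.28, 0.62),
    "Transport & Infr": (0.96, 0.68, 0.86),
    "Activities": (0.85, 0.50, 0.70),
    "Season": (0.96, 0.57, 0.91),
    "Architecture": (0.92, 0.53, 0.84),
    "Heritage": (0.91, 0.60, 0.76),
    "Peru (fuzzy)": (0.84, 0.14, 0.81),
    "Peru, first selection only": (0.79, 0.12, 0.76),
}

for name, (p0, pe, reported) in REPORTED.items():
    computed = kappa(p0, pe)
    print(f"{name:28s} computed={computed:.3f} reported={reported:.2f}")
```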
One-to-one coding sheet.
| ID | PEOPLE | NATURE LAND | PLACE | SPACE | TRANS/ INF | ACTIV | SEASON | ARCHIT | HERIT |
|---|---|---|---|---|---|---|---|---|---|
| 1001 | 2 | 0 | 1 | 2 | 0 | 1 | 2 | 0 | 0 |
| 1002 | 0 | 0 | 2 | 2 | 0 | 0 | 2 | 1 | 0 |
| 1003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1004 | 3 | 0 | 1 | 1 | 0 | 1 | 2 | 0 | 2 |
| 1005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1006 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 |
| 1007 | 2 | 0 | 1 | 1 | 0 | 1 | 2 | 1 | 0 |
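For a single variable coded one-to-one, crisp Cohen's kappa can be computed directly from two coders' columns. The sketch below uses the PEOPLE column of the sheet above as coder A; coder B's codes are hypothetical values invented for illustration, because the sheet shows only one coder. The function name `cohen_kappa` is likewise an assumption of this example.

```python
from collections import Counter
from typing import Sequence, Tuple


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> Tuple[float, float, float]:
    """Return (P0, PE, kappa) for two coders' one-to-one codes."""
    if len(a) != len(b) or not a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = len(a)
    # Observed agreement: share of units given the same code by both coders.
    p0 = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement: product of the coders' marginal proportions.
    count_a, count_b = Counter(a), Counter(b)
    pe = sum(count_a[c] * count_b[c] for c in count_a.keys() | count_b.keys()) / (n * n)
    if pe >= 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return p0, pe, (p0 - pe) / (1.0 - pe)


# PEOPLE column for IDs 1001-1007, taken from the coding sheet above.
coder_a = [2, 0, 0, 3, 0, 0, 2]
# Hypothetical second coder (not from the source).
coder_b = [2, 0, 0, 2, 0, 0, 2]

p0, pe, k = cohen_kappa(coder_a, coder_b)
print(f"P0={p0:.3f} PE={pe:.3f} kappa={k:.3f}")
```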
One-to-many coding sheet.
| ID | Cat1 | Cat2 | Cat3 | Cat4 |
|---|---|---|---|---|
| 63 | WL | FL | | |
| 70 | PP | TC | FR | |
| 71 | NL | PP | OA | |
| 72 | A | | | |
| 114 | NL | PP | OA | TF |
| 135 | DA | NL | | |