Ingrid Jardim de Azeredo Souza Oliveira, Arthur de Sá Ferreira.
Abstract
This study compared the interrater agreement for pattern differentiation and acupoint prescription between two groups of simulated human patients with different diagnostic outcomes. Patients were simulated using a dataset of zangfu patterns and separated into groups (n = 30 each) according to the diagnostic outcome determined by a computational model. A questionnaire with 90 patients was delivered to 6 TCM experts (minimum of 4 years of clinical experience), who were asked to indicate a single pattern (among 73) and 8 acupoints (among 378). Interrater agreement was higher for pattern differentiation than for acupuncture prescription. Interrater agreement on pattern differentiation was slight for both the correct-diagnosis group (Light's κ = 0.167, 95% CI = [0.108; 0.254]) and the incorrect-diagnosis group (Light's κ = 0.190, 95% CI = [0.120; 0.286]). Interrater agreement on acupuncture prescription was also slight for both the correct-diagnosis group (ι = 0.029, 95% CI = [0.015; 0.057]) and the incorrect-diagnosis group (ι = 0.040, 95% CI = [0.023; 0.058], P = 0.075). Diagnostic performance of the raters was as follows: accuracy = 60.9%, sensitivity = 21.7%, and specificity = 100%. An overall improvement in interrater agreement and diagnostic accuracy was observed when the data were analyzed using the internal systems instead of the pattern labels.
Year: 2015 PMID: 25945109 PMCID: PMC4405219 DOI: 10.1155/2015/469675
Source DB: PubMed Journal: Evid Based Complement Alternat Med ISSN: 1741-427X Impact factor: 2.629
Figure 1. Study flowchart.
Descriptive data of the studied sample.
| Variable | Value* |
|---|---|
| Sample size, n | 6 |
| Male | 4 (66.7%) |
| Female | 2 (33.3%) |
| Professional activity | |
| Clinical consultant | 6 (100%) |
| Postgraduate professor | 5 (83.3%) |
| Supervisor of clinic-school | 3 (50.0%) |
| Age, years | 43 [37; 64] |
| Formal training and practicing | |
| Duration of postgraduate course, years | 2 [2; 2.5] |
| Acupuncture and TCM theory, hours | 800 |
| Acupuncture training, hours | 400 |
| Time since postgraduate, years | 12 [4; 33] |
*Median [minimum; maximum] for continuous variables; frequency (%) for categorical variables.
Figure 2. Bootstrap analysis of interrater agreement on pattern differentiation (a) and acupuncture prescription (b) estimated from the specific labels of 73 zangfu patterns of simulated human patients grouped by correct (upper row, n = 30) or incorrect (lower row, n = 30) diagnosis.
Figure 3. Bootstrap analysis of interrater agreement on pattern differentiation (a) and acupuncture prescription (b) estimated from the affected zangfu systems of simulated human patients grouped by correct (upper row, n = 30) or incorrect (lower row, n = 30) diagnosis.
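As a rough illustration of how an agreement coefficient and its bootstrap interval can be obtained, the sketch below computes Light's κ (the mean of Cohen's κ over all rater pairs) and a percentile bootstrap interval. The resampling scheme (cases drawn with replacement, percentile limits), the function names, and the toy ratings are all assumptions made for illustration; the record does not state how the bootstrap was actually configured.

```python
import random
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' nominal labels over the same cases."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of the two raters' marginal frequencies.
    p_exp = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_exp == 1.0:  # both raters used the same single label for every case
        return 1.0
    return (p_obs - p_exp) / (1.0 - p_exp)


def light_kappa(ratings):
    """Light's kappa: mean Cohen's kappa over all pairs of raters.

    ratings[r][i] is the label that rater r assigned to case i.
    """
    pairs = list(combinations(range(len(ratings)), 2))
    return sum(cohen_kappa(ratings[i], ratings[j]) for i, j in pairs) / len(pairs)


def bootstrap_ci(ratings, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling cases with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n_cases = len(ratings[0])
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n_cases) for _ in range(n_cases)]
        stats.append(light_kappa([[r[i] for i in idx] for r in ratings]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Hypothetical toy data: 3 raters labelling 10 cases with patterns "A", "B", "C".
toy = [
    ["A", "A", "B", "B", "C", "A", "B", "C", "A", "B"],
    ["A", "B", "B", "B", "C", "A", "A", "C", "A", "C"],
    ["A", "A", "B", "C", "C", "B", "B", "C", "A", "B"],
]
kappa = light_kappa(toy)
ci_low, ci_high = bootstrap_ci(toy)
print(f"Light's kappa = {kappa:.3f}, 95% CI = [{ci_low:.3f}; {ci_high:.3f}]")
```

With 73 possible labels, as in the study, the chance-agreement term is small, so κ stays close to the raw proportion of matching labels. Grouping labels into fewer categories (for example, affected systems) changes both terms.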
Diagnostic performance of six raters for pattern differentiation grouped by diagnostic outcome.
| Outcome | Correct diagnosis (true profiles) | | | | | | Incorrect diagnosis (false profiles) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Descriptor | ACC [95% CI] | P value | SEN | SPE | +PV | −PV | ACC [95% CI] | P value | SEN | SPE | +PV | −PV |
| Patterns | | | | | | | | | | | | |
| Rater 1 | 61.7 [48.2; 73.9] | | 23.3 | 100 | 100 | 56.6 | 58.3 [44.9; 70.9] | 0.123 | 16.7 | 100 | 100 | 54.5 |
| Rater 2 | 58.3 [44.9; 70.9] | 0.123 | 16.7 | 100 | 100 | 55.6 | 56.7 [43.2; 69.4] | 0.183 | 13.3 | 100 | 100 | 53.6 |
| Rater 3 | 60.0 [46.5; 72.4] | 0.078 | 20.0 | 100 | 100 | 55.6 | 58.3 [44.9; 70.9] | 0.123 | 16.7 | 100 | 100 | 54.5 |
| Rater 4 | 56.7 [43.2; 69.4] | 0.183 | 13.3 | 100 | 100 | 53.6 | 51.7 [38.4; 64.8] | 0.449 | 3.3 | 100 | 100 | 50.8 |
| Rater 5 | 70.0 [56.8; 81.2] | | 40.0 | 100 | 100 | 62.5 | 63.3 [49.9; 75.4] | | 26.7 | 100 | 100 | 57.7 |
| Rater 6 | 61.7 [48.2; 73.9] | 0.046 | 23.3 | 100 | 100 | 56.6 | 61.7 [48.2; 73.9] | | 23.3 | 100 | 100 | 56.6 |
| Group-median | | | | | | | | | | | | |
| Internal organs | | | | | | | | | | | | |
| Rater 1 | 65.0 [51.6; 76.9] | | 33.3 | 96.7 | 90.9 | 59.2 | 65.0 [51.6; 76.9] | | 33.3 | 96.7 | 90.9 | 59.2 |
| Rater 2 | 61.7 [48.2; 73.9] | | 23.3 | 100 | 100 | 56.6 | 63.3 [49.9; 75.4] | | 26.7 | 100 | 100 | 57.7 |
| Rater 3 | 73.3 [60.3; 83.9] | | 56.7 | 90.0 | 85.0 | 67.5 | 70.0 [56.8; 81.2] | | 50.0 | 90.0 | 83.3 | 64.3 |
| Rater 4 | 55.0 [41.6; 67.9] | 0.259 | 23.3 | 86.7 | 63.6 | 53.1 | 58.3 [44.9; 70.9] | 0.123 | 30.0 | 86.7 | 69.2 | 55.3 |
| Rater 5 | 68.3 [55.0; 79.7] | | 46.7 | 90.0 | 82.4 | 62.8 | 75.0 [61.2; 85.3] | | 60.0 | 90.0 | 85.7 | 69.2 |
| Rater 6 | 68.3 [55.0; 79.7] | | 40.0 | 96.7 | 92.3 | 61.7 | 70.0 [56.8; 81.2] | | 43.3 | 96.7 | 92.9 | 63.0 |
| Group-median | | | | | | | | | | | | |
ACC: accuracy; 95% CI: 95% confidence interval; SEN: sensitivity; SPE: specificity; +PV: positive predictive value; −PV: negative predictive value; NT: not tested.
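To make the descriptors concrete, the following sketch derives them from a 2×2 confusion matrix. The counts (7 true positives, 23 false negatives, 30 true negatives, 0 false positives) are not given in the record; they are an assumption inferred from the Rater 6 pattern-level row for the correct-diagnosis group, taking 30 target and 30 non-target profiles. The exact one-sided binomial test against a 50% chance rate is likewise an assumed reading of the value 0.046 reported beside that row's accuracy interval; the record does not name the test.

```python
from math import comb


def diagnostic_metrics(tp, fn, tn, fp):
    """Return ACC, SEN, SPE, +PV and -PV as percentages from 2x2 counts."""
    total = tp + fn + tn + fp
    return {
        "ACC": 100 * (tp + tn) / total,
        "SEN": 100 * tp / (tp + fn),
        "SPE": 100 * tn / (tn + fp),
        # Predictive values are undefined when no case was called positive/negative.
        "+PV": 100 * tp / (tp + fp) if (tp + fp) else float("nan"),
        "-PV": 100 * tn / (tn + fn) if (tn + fn) else float("nan"),
    }


def binom_p_upper(k, n, p0=0.5):
    """Exact one-sided binomial probability P(X >= k) under success rate p0."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Assumed counts (not stated in the source): 30 target and 30 non-target profiles.
metrics = diagnostic_metrics(tp=7, fn=23, tn=30, fp=0)
p_value = binom_p_upper(k=7 + 30, n=60)  # 37 correct classifications out of 60

for name, value in metrics.items():
    print(f"{name}: {value:.1f}")
print(f"P (accuracy > 50%): {p_value:.3f}")
```

Under these assumed counts the sketch gives ACC 61.7, SEN 23.3, SPE 100, +PV 100, and −PV 56.6, matching that row. The specificity of 100% with low sensitivity reflects raters who rarely assigned the target label, and never did so wrongly.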