Rafdzah Zaki, Awang Bulgiba, Roshidi Ismail, Noor Azina Ismail.
Abstract
BACKGROUND: Accurate values are a must in medicine. An important parameter in determining the quality of a medical instrument is agreement with a gold standard. Various statistical methods have been used to test for agreement. Some of these methods have been shown to be inappropriate. This can result in misleading conclusions about the validity of an instrument. The Bland-Altman method is the most popular method judging by the many citations of the article proposing this method. However, the number of citations does not necessarily mean that this method has been applied in agreement research. No previous study has been conducted to look into this. This is the first systematic review to identify statistical methods used to test for agreement of medical instruments. The proportion of various statistical methods found in this review will also reflect the proportion of medical instruments that have been validated using those particular methods in current clinical practice. METHODOLOGY/
Year: 2012 PMID: 22662248 PMCID: PMC3360667 DOI: 10.1371/journal.pone.0037908
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Flowchart of studies.
Most popular statistical methods used to assess agreement in medicine.
| Statistical method used | Number of articles using the method, x (%), n = 210 |
| 1. Bland-Altman limits of agreement | 178 (85%) |
| 2. Correlation coefficient (r) | 58 (28%) |
| 3. Compare means/Significance test | 38 (18%) |
| 4. Intra-class correlation coefficient | 14 (7%) |
| 5. Compare slopes and/or intercepts | 13 (6%) |
n = Total number of studies retrieved, x = number of studies, % = percentage.
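The percentages in this table can be checked directly from the counts. The sketch below is only an illustrative recomputation; the variable names are assumptions, not anything from the article. It also shows that the counts add up to more than n, which suggests that some studies applied more than one method.

```python
# Counts (x) from the table above; n is the total number of studies retrieved.
n_studies = 210
counts = {
    "Bland-Altman limits of agreement": 178,
    "Correlation coefficient (r)": 58,
    "Compare means / significance test": 38,
    "Intra-class correlation coefficient": 14,
    "Compare slopes and/or intercepts": 13,
}

# Percentage of the 210 studies using each method, rounded to a whole number.
percentages = {method: round(100 * x / n_studies) for method, x in counts.items()}

# Total number of method uses across the five listed methods.
total_uses = sum(counts.values())

for method, pct in percentages.items():
    print(f"{method}: {counts[method]} ({pct}%)")
print(f"Total method uses: {total_uses} across {n_studies} studies")
```

Because the total number of method uses (301) exceeds the number of studies (210), the percentages are not expected to sum to 100%.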
Most popular statistical methods used to assess agreement according to area of specialty in medicine.
| Area of specialty | Statistical method used | Number of articles using the method (x) |
| [name missing in source] | 1. Bland-Altman limits of agreement | 24 |
| | 2. Correlation coefficient (r) | 6 |
| | 3. Compare slopes and/or intercepts | 4 |
| | 4. Intra-class correlation coefficient | 3 |
| | 5. Compare means/Significance test | 2 |
| [name missing in source] | 1. Bland-Altman limits of agreement | 21 |
| | 2. Correlation coefficient (r) | 8 |
| | 3. Compare means/Significance test | 5 |
| | 4. Intra-class correlation coefficient | 4 |
| | 5. Percentage of error | 1 |
| [name missing in source] | 1. Bland-Altman limits of agreement | 26 |
| | 2. Correlation coefficient (r) | 6 |
| | 3. Compare means/Significance test | 6 |
| | 4. Intra-class correlation coefficient | 3 |
| | 5. Compare slopes/intercepts | 2 |
| [name missing in source] | 1. Bland-Altman limits of agreement | 25 |
| | 2. Correlation coefficient (r) | 13 |
| | 3. Coefficient of determination (r²) | 4 |
| | 4. Compare means/Significance test | 4 |
| | 5. Compare slopes and/or intercepts | 4 |
n = Total number of studies retrieved for each specialty, x = number of studies.
Examples of inappropriate applications and interpretations of statistical analyses to assess agreement found in this review.
| Study | Study objective | Results and authors' conclusion |
| Ten 2007 | To compare four different commercial activated partial thromboplastin time (aPTT) reagents to detect shortened aPTT. | Correlation coefficients among the four methods ranged from 0.51 to 0.83 (all P values < 0.001). Acceptable agreement between the different commercial reagents was found with respect to detection of short aPTT. Good agreement was found between Instrumentation Laboratory and bioMerieux reagents (r = 0.74–0.83). |
| Reis 2007 | To validate a method for the quantification of the very low levels of urinary human chorionic gonadotropin (hCG). | Equation from regression analysis: y = 0.99x + 8.55. Correlation coefficient of 0.993 demonstrates very good immunoassay accuracy for the studied range of hCG concentrations. |
| Satia 2007 | To assess the degree of agreement between three instruments for measuring dietary fat consumption. | Pearson's correlation coefficients among the three methods ranged from 0.18 to 0.58 (all P values < 0.0001). There was good concordance among the three methods. |
| Mündermann 2008 | To compare three-dimensional position capture with skin markers and radiographic measurement for measuring mechanical axis alignment. | The mechanical axis alignment from position capture correlated well with the gold standard of measurement using radiographs (R² = 0.544, P < 0.001). The proposed method allows the measurement of the mechanical axis alignment without exposure to radiation. |
| Anderst 2009 | To compare the bead-based method of tracking bone motion in vivo with the model-based method. | Agreement between the two systems was quantified by comparing bias (mean difference). All bias measures were not significantly different from zero. The new model-based tracking achieves excellent accuracy without the necessity for invasive bead implantation. |
| Naidu 2009 | To evaluate the validity of the Hand Assessment Tool (HAT) and the Disabilities of Arm, Shoulder, and Hand Questionnaire (DASH). | Strong positive correlation between DASH and HAT (r = 0.91). The HAT may serve as a useful alternative to the DASH. |
Data sets to demonstrate the inappropriate use of correlation coefficient in testing agreement.
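| Reading | A | B | C (twice B) |
| 1 | 10.20 | 10.20 | 20.40 |
| 2 | 8.20 | 8.00 | 16.00 |
| 3 | 8.70 | 8.05 | 16.10 |
| 4 | 9.60 | 9.70 | 19.40 |
| 5 | 9.60 | 9.05 | 18.10 |
| 6 | 8.20 | 8.15 | 16.30 |
| 7 | 9.40 | 8.80 | 17.60 |
| 8 | 7.00 | 6.55 | 13.10 |
| 9 | 6.60 | 6.55 | 13.10 |
| 10 | 10.80 | 10.50 | 21.00 |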
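These data sets lend themselves to a short numerical check of why the correlation coefficient is a poor test of agreement. The sketch below computes Pearson's r alongside the Bland-Altman bias and 95% limits of agreement for A against B and for A against C. The function names are illustrative assumptions rather than anything from the article, and the limits are taken as bias ± 1.96 standard deviations of the paired differences, which is the usual convention and is assumed here.

```python
import math
from statistics import mean, stdev

# Readings from the data-set table above; C is defined as twice B.
A = [10.20, 8.20, 8.70, 9.60, 9.60, 8.20, 9.40, 7.00, 6.60, 10.80]
B = [10.20, 8.00, 8.05, 9.70, 9.05, 8.15, 8.80, 6.55, 6.55, 10.50]
C = [2 * b for b in B]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for the paired differences x - y."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


r_ab = pearson_r(A, B)
r_ac = pearson_r(A, C)
bias_ab, lower_ab, upper_ab = bland_altman(A, B)
bias_ac, lower_ac, upper_ac = bland_altman(A, C)

print(f"A vs B: r = {r_ab:.3f}, bias = {bias_ab:.3f}, "
      f"limits = ({lower_ab:.3f}, {upper_ab:.3f})")
print(f"A vs C: r = {r_ac:.3f}, bias = {bias_ac:.3f}, "
      f"limits = ({lower_ac:.3f}, {upper_ac:.3f})")
```

Because multiplying a variable by a positive constant leaves r unchanged, the correlation of A with C equals that of A with B, even though C is twice B. The mean difference, by contrast, moves from 0.275 for A against B to -8.28 for A against C, so only the difference-based analysis exposes the disagreement.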