Sibel Zehra Aydin (1), Esen Kasapoglu Gunal (2), Esra Kurum (3), Servet Akar (4), Halit Eyyup Mungan (5), Fatma Alibaz-Oner (6), Robert G Lambert (7), Pamir Atagunduz (6), Helena Marzo Ortega (8), Dennis McGonagle (8), Walter P Maksymowych (9).
1. Division of Rheumatology, the Ottawa Hospital Research Institute, University of Ottawa, Ottawa, Canada.
2. Division of Rheumatology, Faculty of Medicine, Istanbul Medeniyet University, Istanbul, Turkey.
3. Department of Statistics, University of California, Riverside, CA, USA.
4. Rheumatology Division, Katip Çelebi University Faculty of Medicine, Izmir.
5. Internal Medicine Department, Faculty of Medicine, Istanbul Medeniyet University.
6. Rheumatology Division, Faculty of Medicine, Marmara University, Istanbul, Turkey.
7. Alberta Heritage Foundation for Medical Research, University of Alberta, Edmonton, Alberta, Canada.
8. NIHR Leeds Musculoskeletal Biomedical Research Unit, Leeds Teaching Hospitals Trust and Leeds Institute of Rheumatic and Musculoskeletal Medicine, University of Leeds, Leeds, UK.
9. Department of Medicine, University of Alberta, Edmonton, Alberta, Canada.
Abstract
Objectives: Conventional radiography is key to assessing spinal involvement in ankylosing spondylitis (AS) and has become increasingly important given that spinal fusion may continue under biologic therapy. We aimed to compare the reliability of radiographic scoring of the spine using different approaches, to understand how well different readers agree on overall scores and on individual findings.
Method: Six investigators scored 68 plain radiographs of the cervical and lumbar spine from 34 patients, taken 2 years apart, for erosions, sclerosis, squaring, syndesmophytes and ankylosis using the Spondyloarthritis Radiography (SPAR) module. Intraclass correlation coefficients were calculated against two gold standards. The reproducibility of each finding across 1632 vertebral corners, and of new syndesmophytes in each corner, was assessed by kappa analysis and positive agreement rates.
Results: The intraclass correlation coefficients mostly showed good to excellent agreement with the gold standards (0.69-0.95). The kappa analysis showed worse agreement, with relatively higher values for syndesmophytes (0.163-0.559) and ankylosis (0.48-0.95). Positive agreement rates showed that erosions were never detected at the same vertebral corner by two readers (positive agreement rate: 0%). The mean (range) positive agreement rates were 10.1% (0-27.7%) for sclerosis and 19.2% (0-59.7%) for squaring, and were higher for syndesmophytes [38.8% (21.4-62.5%)] and ankylosis [77.3% (64-95.3%)].
Conclusion: Our results show poor agreement on the presence of grade 1 lesions included in the Modified Stoke Ankylosing Spondylitis Spine Score, mostly for erosions and sclerosis, which may increase the measurement error. The currently used definitions of reliability risk overestimating reproducibility.
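To illustrate why corner-level agreement can look much worse than a summary reliability figure, the following is a minimal sketch of Cohen's kappa and a positive agreement rate for two readers scoring a binary finding at each vertebral corner. The abstract does not state the exact positive agreement formula used, so the common proportion of positive agreement, 2a / (2a + b + c), is assumed here; the function names and the toy readings are hypothetical and are not study data.

```python
import math
from typing import Sequence, Tuple


def two_by_two(r1: Sequence[int], r2: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (a, b, c, d): both positive, reader 1 only, reader 2 only, both negative."""
    if len(r1) != len(r2):
        raise ValueError("Both readers must score the same corners")
    a = sum(1 for x, y in zip(r1, r2) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(r1, r2) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(r1, r2) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(r1, r2) if x == 0 and y == 0)
    return a, b, c, d


def cohen_kappa(r1: Sequence[int], r2: Sequence[int]) -> float:
    """Cohen's kappa for two readers and a binary finding."""
    a, b, c, d = two_by_two(r1, r2)
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from each reader's marginal positive/negative rates.
    expected = ((a + b) / n) * ((a + c) / n) + ((c + d) / n) * ((b + d) / n)
    if expected == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


def positive_agreement(r1: Sequence[int], r2: Sequence[int]) -> float:
    """Proportion of positive agreement, 2a / (2a + b + c) (assumed definition)."""
    a, b, c, _ = two_by_two(r1, r2)
    denom = 2 * a + b + c
    if denom == 0:
        return math.nan  # neither reader scored the finding anywhere
    return 2 * a / denom


# Hypothetical toy readings over 24 corners (1 = finding present, 0 = absent).
reader_1 = [1, 1, 1, 0, 0] + [0] * 19
reader_2 = [1, 0, 0, 1, 1] + [0] * 19

a, b, c, d = two_by_two(reader_1, reader_2)
raw_agreement = (a + d) / (a + b + c + d)
kappa = cohen_kappa(reader_1, reader_2)
pos_agree = positive_agreement(reader_1, reader_2)

print(f"raw agreement:      {raw_agreement:.3f}")
print(f"Cohen's kappa:      {kappa:.3f}")
print(f"positive agreement: {pos_agree:.3f}")
```

In this toy case the two readers agree on most corners only because most corners are negative for both. Once the shared negatives are set aside, agreement on where the finding is actually present is far lower, which is the pattern the positive agreement rates in the Results are meant to expose.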