Reproducibility assessment of different descriptions of the Kellgren and Lawrence classification for osteoarthritis of the knee.

Felipe Borges Gonçalves¹, Felipe Almeida Rocha¹, Rodrigo Pires E Albuquerque¹, Alan de Paula Mozella¹, Bernardo Crespo¹, Hugo Cobra¹.

Abstract

OBJECTIVE: To assess the inter- and intraobserver reproducibility of the original version and different descriptions of the Kellgren and Lawrence classification used in epidemiological studies for osteoarthritis of the knee.
METHODS: The study included 72 patients with osteoarthritis of the knee. Three medical members of the Brazilian Society of Knee Surgery were invited to evaluate the images. An intra- and interobserver analysis was conducted, with an interval of one month. The intraobserver agreement was analyzed using the weighted Cohen's Kappa coefficient. The interobserver agreement was analyzed using the Krippendorff alpha coefficient (α).
RESULTS: The intraobserver assessment indicated conflicting results. In the interobserver analysis, the level of agreement was superficial.
CONCLUSIONS: The classification of Kellgren and Lawrence and its variants generated a low reproducibility between observers. The intraobserver analysis showed a lack of uniformity in the use of this classification and its variants, even among experienced observers.

Keywords: Classification; Knee; Radiography

Year: 2016 PMID: 28050541 PMCID: PMC5198109 DOI: 10.1016/j.rboe.2016.10.009

Source DB: PubMed Journal: Rev Bras Ortop ISSN: 2255-4971

Introduction

Osteoarthritis is one of the most common diseases worldwide, with no distinction or ethnic preference. The knee, being a load-bearing joint, is a frequently involved site. Radiological evaluation is paramount in patients with osteoarthritis of the knee. The radiographic study makes it possible to grade the severity of joint involvement, measure the axis, assess ligament instability or bone loss, and also indicate the type of treatment, as well as the necessary implant when surgery is needed. The Kellgren and Lawrence grading is the most widely used classification for knee osteoarthritis when X-rays are assessed; however, five versions of this classification have been described in epidemiological studies. To be useful, a classification should be simple, easy to remember, and helpful in guiding treatment and defining the prognosis of these injuries. Reproducibility is a characteristic that must be present in any classification. This study aimed to assess the interobserver and intraobserver reproducibility of the original version and of the different variants of the Kellgren and Lawrence classification used in epidemiological studies for osteoarthritis of the knee.

Material and methods

The study was presented in detail to and approved by the Ethics Committee under CAAE No. 31378714.6.0000.5273. All participants signed an informed consent form prior to enrollment. They were also offered a financial incentive to participate. In this hospital's outpatient clinic, 200 patients with osteoarthritis of the knee were selected. There was no age limitation. Exclusion criteria were: prior surgical procedures in the knee to be assessed, joint replacement on the contralateral knee, and other rheumatologic diseases. After applying the exclusion criteria, 72 patients and their radiographic studies were selected to comprise the sample. Three observers, members of the Brazilian Society of Knee Surgery and part of the hospital staff, conducted the radiographic analysis. Knee radiographs in anteroposterior (AP) with bipedal load, lateral, axial patellar at 30°, and Rosenberg views were obtained from all patients, following a standard protocol. The AP view was made with the knee in extension and bipedal support. The tube-film distance was 1 m, and the beam was centered at the lower pole of the patella. The lateral view was obtained with the knee in 20° of flexion with the patient standing; the tube-film distance was 1 m. The Rosenberg view was made in posteroanterior (PA), under load and at 45° of flexion. Feet were positioned parallel and aligned forward. The patella touched the film. X-rays were centered at the level of the inferior pole of the patella, with a craniocaudal inclination of 10° and a tube-film distance of 1 m. A Shimadzu X-ray device, rated at 50 kV and 40 mA, was used. The exams were overseen by the main investigator regarding image quality and were repeated if considered of poor technical quality; patient positioning, knee, and X-ray device angulation were also observed. Angles were measured with a goniometer. Scanned images were delivered on a CD-ROM to the observers.
In order to minimize bias due to the difficulty of interpretation or possible forgetfulness, the classification and its variants are described in Table 1.

Table 1

Classification and its variants.

	Original	Variant 1	Variant 2	Variant 3	Variant 4
Grade 0	Normal	Normal	Normal	Normal	Normal
Grade I	Doubtful JSN and minute osteophytes on the border	Minute osteophytes	Minimal osteophytes of dubious significance	Only minute osteophytes	Minute marginal osteophytes
Grade II	Possible JSN and definite osteophytes	Definite osteophytes	Definite osteophytes without JSN	Definite osteophytes and minute JSN	Definite osteophytes and minute JSN
Grade III	Moderate JSN, multiple osteophytes, a certain degree of subchondral sclerosis and possible deformity of the bone contour	Osteophytes and JSN	Moderate JSN (with osteophytes)	Osteophytes in moderate quantity and/or definite JSN	Multiple osteophytes of moderate size, definite JSN, and possible deformity in the bone contour (bone friction)
Grade IV	Notable JSN, severe subchondral sclerosis, definite deformity of the bone contour, and presence of large osteophytes	Large osteophytes, definite JSN, and deformity	Substantial JSN with subchondral sclerosis	Large osteophytes, severe JSN and/or bone sclerosis	Large osteophytes, considerable JSN, severe sclerosis, definite bone contour deformity (bone friction)

JSN, joint space narrowing.

Radiographic analyses were performed blindly on two occasions, with a one-month interval, and the interpretations of the three observers were scanned for subsequent statistical analysis. Data were analyzed with statistical analysis software R version 3.1.0, and SPSS (Statistical Package for the Social Sciences) version 22.0. The intraobserver agreement, which compared both assessments from the same observer for each of the five classifications, was analyzed by the weighted Cohen's Kappa coefficient. The weighted Cohen's Kappa coefficient ranges from −1 to 1; values less than or equal to 0 represent no agreement and 1 represents total agreement. In this study, the classification adopted was the one proposed by Byrt, as described in Table 2. The coefficients were calculated using the “psy” package of R.
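The intraobserver computation described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the paper used the "psy" package of R and does not state whether linear or quadratic weights were applied, so the `weights` parameter, the function name `weighted_kappa`, and the sample grades are all assumptions made for the example.

```python
from collections import Counter


def weighted_kappa(first, second, categories, weights="quadratic"):
    """Weighted Cohen's kappa for two readings of the same items.

    `categories` lists the ordered grades; `weights` is "linear" or
    "quadratic" (an assumption: the paper does not say which was used).
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k, n = len(categories), len(first)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    power = 1 if weights == "linear" else 2

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the two most distant grades.
        return (abs(i - j) / (k - 1)) ** power

    pairs = Counter((index[a], index[b]) for a, b in zip(first, second))
    row = Counter(index[a] for a in first)
    col = Counter(index[b] for b in second)

    # Weighted disagreement actually observed between the two readings.
    observed = sum(disagreement(i, j) * c / n for (i, j), c in pairs.items())
    # Weighted disagreement expected by chance from the marginal totals.
    expected = sum(disagreement(i, j) * row[i] * col[j] / n**2
                   for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when every rating is identical")
    return 1 - observed / expected


# Hypothetical grades (0 to IV coded as 0..4) for six radiographs read twice.
first_reading = [0, 1, 2, 3, 4, 2]
second_reading = [0, 1, 2, 3, 4, 3]
print(round(weighted_kappa(first_reading, second_reading, range(5)), 3))
```

With ordered grades such as these, a disagreement of one grade is penalized less than a disagreement of three or four grades, which is the reason a weighted rather than an unweighted kappa is used.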

Table 2

Kappa coefficient values (K) and agreement classification.

K-value	Concordance rating
−1 to 0.00	None
0.01 to 0.20	Poor
0.21 to 0.40	Superficial
0.41 to 0.60	Reasonable
0.61 to 0.80	Good
0.81 to 0.92	Very good
0.93 to 1.0	Excellent
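The bands in Table 2 can be expressed as a small lookup. The sketch below is only illustrative: the function name `byrt_rating` is hypothetical, and treating each band's upper limit as inclusive (so that a value such as 0.205 falls into "Poor"... "Superficial" by threshold rather than into a gap) is an assumption, since the table lists limits to two decimals only.

```python
# Upper limit of each band from Table 2, in ascending order.
BANDS = [
    (0.20, "Poor"),
    (0.40, "Superficial"),
    (0.60, "Reasonable"),
    (0.80, "Good"),
    (0.92, "Very good"),
    (1.00, "Excellent"),
]


def byrt_rating(k):
    """Return the Table 2 concordance rating for a coefficient k."""
    if not -1.0 <= k <= 1.0:
        raise ValueError("coefficient must lie between -1 and 1")
    if k <= 0.0:
        return "None"  # values at or below zero represent no agreement
    for upper, label in BANDS:
        if k <= upper:
            return label
    return "Excellent"


for k in (0.35, 0.92, 1.0):
    print(k, byrt_rating(k))
# 0.35 Superficial
# 0.92 Very good
# 1.0 Excellent
```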

In the interobserver analysis, another measure of agreement was used, Krippendorff's alpha coefficient (α). The rating of the agreement, given the value of α, was the same as that presented in Table 2. The coefficients were calculated using the Kalpha macro in SPSS.
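For readers unfamiliar with the coefficient, α is defined as 1 minus the ratio of observed to expected disagreement, both computed from a coincidence matrix of paired ratings. The sketch below is an assumption-laden illustration, not a reimplementation of the Kalpha macro: the paper does not report which level of measurement was selected, so only the nominal and interval distance metrics are shown (the ordinal metric is omitted for brevity), and the function name and sample grades are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Squared-distance metrics between two rating values.
METRICS = {
    "nominal": lambda c, k: float(c != k),
    "interval": lambda c, k: float((c - k) ** 2),
}


def krippendorff_alpha(ratings, metric="nominal"):
    """Krippendorff's alpha; `ratings` holds one list per observer,
    aligned by unit, with None marking a missing rating."""
    delta = METRICS[metric]

    # Coincidence matrix: every ordered pair of ratings within a unit,
    # weighted by 1 / (number of ratings in that unit - 1).
    coincidences = Counter()
    for unit in zip(*ratings):
        values = [v for v in unit if v is not None]
        if len(values) < 2:
            continue  # a unit rated by fewer than two observers is unpairable
        for c, k in permutations(values, 2):
            coincidences[(c, k)] += 1 / (len(values) - 1)

    totals = Counter()
    for (c, _), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())
    if n < 2:
        raise ValueError("not enough pairable ratings")

    observed = sum(w * delta(c, k) for (c, k), w in coincidences.items()) / n
    expected = sum(totals[c] * totals[k] * delta(c, k)
                   for c in totals for k in totals) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when all ratings are identical")
    return 1 - observed / expected


# Hypothetical grades from three observers for five radiographs.
observer_1 = [0, 1, 2, 3, 4]
observer_2 = [0, 2, 2, 3, 3]
observer_3 = [1, 1, 2, 4, 4]
print(round(krippendorff_alpha([observer_1, observer_2, observer_3],
                               metric="interval"), 3))
```

Unlike Cohen's kappa, this coefficient handles any number of observers in a single value, which is presumably why it was chosen for the three-observer comparison.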

Results

Table 3 shows the values of the weighted Kappa coefficient (K) and its 95% confidence interval (CI), which assess the intraobserver agreement of each observer for each of the classifications. The values indicate that Observer 1 presented a “superficial” agreement between the first and second observations for the original classification and for all its variants, with Kappa values equal to 0.34 or 0.35. Observer 2 presented a “very good” agreement between the first and second observations for the original classification and for all its variants, with Kappa values between 0.85 and 0.92. Finally, Observer 3 showed an “excellent” agreement between the first and second observations for the original classification and for all variants, with Kappa values equal to 0.97 for variants 1 and 4 and perfect agreement (K = 1) between the two evaluations in the original classification and in variants 2 and 3.

Table 3

Weighted Kappa coefficients of the intraobserver agreement between the first and the second evaluation, for each classification.

Observer	Classification
	Original	Variant 1	Variant 2	Variant 3	Variant 4
1	0.35 (0.15; 0.55)	0.34 (0.14; 0.54)	0.34 (0.14; 0.54)	0.34 (0.14; 0.54)	0.34 (0.14; 0.54)
2	0.92 (0.84; 0.99)	0.85 (0.75; 0.95)	0.90 (0.82; 0.98)	0.90 (0.82; 0.98)	0.90 (0.82; 0.98)
3	1.0 (1.0; 1.0)	0.97 (0.92; 1.0)	1.0 (1.0; 1.0)	1.0 (1.0; 1.0)	0.97 (0.92; 1.0)

Table 4 shows the values of Krippendorff's alpha coefficient, which was used to assess interobserver agreement, in the first and second evaluation, for each of the ratings. Values show that, both in the first and second evaluation, for all ratings, the agreement between observers was “superficial”. It is noteworthy that the agreement was lower in the first evaluation.

Table 4

Krippendorff's alpha coefficient of the interobserver agreement in the first and second evaluation for each classification.

Classification	First evaluation	Second evaluation
Original	0.25	0.33
Variant 1	0.23	0.28
Variant 2	0.21	0.32
Variant 3	0.22	0.34
Variant 4	0.26	0.35
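The two statements made about Table 4 (every value falls in the “superficial” band of Table 2, and agreement was lower in the first evaluation) can be checked directly from the tabulated values. The snippet below is only a worked check; the variable names are illustrative and no values beyond those in Table 4 are used.

```python
# (first evaluation, second evaluation) alpha values copied from Table 4.
table4 = {
    "Original": (0.25, 0.33),
    "Variant 1": (0.23, 0.28),
    "Variant 2": (0.21, 0.32),
    "Variant 3": (0.22, 0.34),
    "Variant 4": (0.26, 0.35),
}

# "Superficial" band from Table 2: 0.21 to 0.40.
all_superficial = all(0.21 <= a <= 0.40
                      for pair in table4.values() for a in pair)

# Change in alpha from the first to the second evaluation.
gains = {name: round(second - first, 2)
         for name, (first, second) in table4.items()}

mean_first = sum(first for first, _ in table4.values()) / len(table4)
mean_second = sum(second for _, second in table4.values()) / len(table4)

print(all_superficial)        # True
print(gains)
print(round(mean_first, 3))   # 0.234
print(round(mean_second, 3))  # 0.324
```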

Discussion

Classifying diseases is a common practice. A good rating system is designed to be simple, reproducible, and able to group different stages of a lesion into homogeneous subgroups, allowing for comparisons, treatment algorithms, and prognosis. What usually happens is that once a classification for a particular injury is established, based on a relevant and representative sample, a case appears that does not fit the described or classified types. Weber, in his study of malleolar fractures, reserved a subgroup for “unclassifiable” injuries, i.e., those that, due to their peculiarity, could not be fitted into classes or groups. Over time, some ratings have been replaced by more complete ones. In the literature, there is still no consensus on which classification should be used for the study of osteoarthritis of the knee. Weidow et al. reported that knee radiographic classifications must be reviewed and improved through the examination technique or method employed. Sun et al., in a review study of 16 classifications for osteoarthritis of the knee, concluded that there was no unanimous choice among the various medical specialties. The Kellgren and Lawrence classification values the presence or absence of osteophytes. In contrast, the Ahlbäck classification assesses reduction of the joint space; some studies consider it to be the best method for analyzing progression of osteoarthritis [10, 11]. Studies such as that by Danielsson and Hernborg demonstrated that osteophytes did not change over 16 years of evolution. In turn, Kijowski et al. concluded that osteoarthritis of the knee should be diagnosed by marginal osteophytes. In fact, it is the progression of the disease that must be assessed by joint space narrowing, subchondral sclerosis, and subchondral cysts. Felson et al. observed that osteophytes are associated with poor alignment of the ipsilateral lower limb. Poor alignment is a powerful risk factor for the progression of osteoarthritis.
The present study used the Kellgren and Lawrence classification, as it is routinely used by orthopedic surgeons and rheumatologists. Albuquerque et al. observed that the Kellgren and Lawrence classification had the lowest level of agreement in an intra- and interobserver analysis of three different classifications: those by Dejour et al., Ahlbäck apud Keyes et al., and Kellgren and Lawrence. The present research confirms the poor results of the Kellgren and Lawrence classification. Rodrigues et al. analyzed the interobserver reproducibility of the original Kellgren and Lawrence classification and did not observe a statistically significant difference between knee specialists and general orthopedists. Furthermore, they observed a regular Kappa coefficient. The present study performed an intra- and interobserver analysis and attempted to achieve a more accurate assessment when compared with studies such as those by Rodrigues et al. The literature features some studies comparing the Kellgren and Lawrence classification and its different variants [2, 17]. However, none of these studies used the radiographic view described by Rosenberg et al., nor did they include patients with advanced stages of osteoarthritis of the knee. For this reason, the present research included these two variables, thus representing an unprecedented and extremely important study. Some studies indicate that the Rosenberg view provides better evidence of joint wear [18, 19]. Furthermore, the authors believe that, for a classification to be assessed accurately, the sample must feature the pathology studied in its various grades. Villardi et al. and Galli et al. observed a low degree of interobserver agreement in the use of the Ahlbäck classification modified apud Keyes et al. The present study, although using a different classification system, also observed a weak agreement among observers.
The observers of the present research are experienced specialists in knee surgery; in order to obtain a more accurate assessment, a response time was not stipulated [21, 22]. Vilalta et al. found that experienced observers generated individual variability and caused differences in results and confusion in the literature, a finding that the present study supports. Brandt et al. and Kijowski et al., when assessing patients with osteoarthritis, compared the AP view in loaded knee extension with arthroscopic findings. They emphasize that, in patients with osteoarthritis, joint space and osteophytes are not suitable parameters for analyzing the disease. They suggest that further research should be conducted in order to find a complementary test with better accuracy. The present authors believe that knee arthroscopy is an excellent therapeutic method, but it is an invasive procedure and, therefore, should not be used as a diagnostic method. In the future, magnetic resonance imaging under load may become a superior imaging exam in comparison with radiography. Osteoarthritis of the knee is a common and fascinating disease. The radiographic analysis and the classification used are crucial points of controversy on this subject. The present study suggests that the original Kellgren and Lawrence classification and its variants generated disagreement among observers. Thus, it is important to research and develop a radiographic classification of the knee to obtain a consensus or, perhaps, to improve agreement.

Conclusions

The Kellgren and Lawrence classification and its variants generated low reproducibility among observers. In the intraobserver analysis, discordant results were observed. This demonstrates the lack of uniformity in the use of this classification and its variants, even among experienced observers.

Conflicts of interest

The authors declare no conflicts of interest.

19 in total

1. Radiological assessment of osteo-arthrosis.

Authors: J H Kellgren; J S Lawrence
Journal: Ann Rheum Dis Date: 1957-12 Impact factor: 19.103

2. Osteophytes and progression of knee osteoarthritis.

Authors: D T Felson; D R Gale; M Elon Gale; J Niu; D J Hunter; J Goggins; M P Lavalley
Journal: Rheumatology (Oxford) Date: 2004-09-20 Impact factor: 7.580

3. Ahlbäck grading of osteoarthritis of the knee: poor reproducibility and validity based on visual inspection of the joint.

Authors: Jonas Weidow; Claes-Göran Cederlund; Jonas Ranstam; Johan Kärrholm
Journal: Acta Orthop Date: 2006-04 Impact factor: 3.717

4. Arthroscopic validation of radiographic grading scales of osteoarthritis of the tibiofemoral joint.

Authors: Richard Kijowski; Donna Blankenbaker; Paul Stanton; Jason Fine; Arthur De Smet
Journal: AJR Am J Roentgenol Date: 2006-09 Impact factor: 3.959

5. Differences in descriptions of Kellgren and Lawrence grades of knee osteoarthritis.

Authors: D Schiphof; M Boers; S M A Bierma-Zeinstra
Journal: Ann Rheum Dis Date: 2008-01-15 Impact factor: 19.103

6. Radiographic findings of osteoarthritis versus arthroscopic findings of articular cartilage degeneration in the tibiofemoral joint.

Authors: Richard Kijowski; Donna G Blankenbaker; Paul T Stanton; Jason P Fine; Arthur A De Smet
Journal: Radiology Date: 2006-04-26 Impact factor: 11.105

10. Radiographic grading of the severity of knee osteoarthritis: relation of the Kellgren and Lawrence grade to a grade based on joint space narrowing, and correlation with arthroscopic evidence of articular cartilage degeneration.

Authors: K D Brandt; R S Fife; E M Braunstein; B Katz
Journal: Arthritis Rheum Date: 1991-11