OBJECTIVE: Three-dimensional (3D) reconstruction images facilitate interpretation of the fracture and the observation of deviations, rotations and the articular surface. This study aimed to evaluate the inter-observer and intra-observer reliability of the Neer and AO proximal humerus fracture classifications on radiographs versus computed tomography with 3D reconstruction. METHODS: We evaluated the digital radiographs (anteroposterior and lateral) and computed tomography with 3D reconstruction of patients presenting with a proximal humerus fracture who were surgically treated at an Orthopedics and Traumatology Service. All radiographs and computed tomography scans were classified (Neer and AO) by eight (8) orthopedic surgeons specialized in the upper limb, and the classifications were returned to the study author in a spreadsheet following the numbering pre-established by the author. RESULTS: The Neer and AO classifications were more reproducible when determined by computed tomography with 3D reconstruction, mainly in fractures of greater complexity (Neer 4 parts and AO group C). However, in absolute values, inter- and intra-observer reproducibility and concordance remain low. CONCLUSION: Computed tomography with 3D reconstruction allows a better analysis of AO group C and Neer 4-part fractures. However, inter- and intra-observer agreement does not increase significantly in comparison with radiographs. Level of evidence III, Study of non-consecutive patients, without gold standard, applied uniformly.
Keywords:
Inter and intra-observer; Proximal Humeral Fracture; Tomography
Proximal humerus fractures account for 5% of all fractures and 80% of humerus fractures; they are the third most common fracture, behind only distal radius and proximal femur fractures.
The most frequent mechanism of trauma is a same-level fall. Approximately 80% of cases show no or only minor displacement and can be treated conservatively.
However, understanding the most complex fractures can be challenging for the orthopedic surgeon. Inadequate and poorly performed radiographs may alter or even hinder the analysis.
In 1970, Charles Neer created the four-segment classification for proximal humerus fractures, the segments being the greater tuberosity, lesser tuberosity, humeral head and humeral shaft. After 46 years, it continues to be used because of its usability, its guidance for treatment and its explanation of the pathological characteristics of the injury.
However, its reliability is increasingly contested because of low inter-observer agreement, which has been attributed to poor image quality and poor patient positioning. Charles Neer attributes this low agreement, in the case of 4-part fractures, to surgeons' inexperience.
The AO classification (Arbeitsgemeinschaft für Osteosynthesefragen) emphasizes the vascularization of the humeral head. Created in 1986 and revised in 1990, it uses an A-to-C system related to the fracture pattern. A subdivision into 3 subgroups (1, 2 and 3), based on the degree of fragmentation and complexity of the fracture, is added, yielding 27 distinct fracture patterns.
Conventional radiography has an important role in the initial evaluation. However, with improvements in technology, computed tomography and 3D reconstruction have become prominent for assessing deviations, rotations and the joint surface. The AO and Neer classifications have shown low reproducibility in both conventional radiographic and tomographic evaluation. 3D reconstruction images facilitate interpretation of the fracture. Neer emphasizes that a better understanding of the fracture pattern is essential to recommend a treatment.
Our study sought to evaluate the inter- and intra-observer reliability of the Neer and AO classifications of proximal humerus fractures on radiographs versus computed tomography with three-dimensional (3D) reconstruction.
MATERIAL AND METHODS
This project was submitted to the human research ethics committee and was approved on 11/02/2016 under code 59901816.0.0000.5225.
Based on procedure codes and surgery records, we identified all patients who underwent initial digital radiographs and computed tomography with 3D reconstruction for proximal humerus fracture.
All patients were treated surgically in the orthopedics and traumatology service of a large hospital and signed an informed consent form.
All radiographs and computed tomography scans were classified by 8 orthopedic surgeons specialized in the upper limb. The images were edited beforehand by one of the authors (who did not participate in the evaluation) to remove identification and to randomize the sequence of patients. Radiographs were first sent digitally to each orthopedist, followed about one month later by the tomographies. Each orthopedist classified each fracture according to Neer (number of parts and fractured segments) and AO (with subgroups), recorded the classifications in tables following the pre-established numbering, and returned them in a spreadsheet to the author responsible for randomizing the images.
After data collection, radiographic and tomographic classifications were compared by inter- and intra-observer analysis. The data were analyzed statistically, and the values found were discussed in light of the published literature.
Patients without initial radiographs and computed tomography for proximal humerus fractures, and patients with pathological fractures, were excluded from the study.
We used the Kappa coefficient of agreement for the statistical analysis of inter- and intra-observer agreement. The coefficient values found in this test can be classified as follows: 0-0.19 as unsatisfactory, 0.20-0.39 as low agreement, 0.40-0.59 as moderate agreement, 0.60-0.79 as satisfactory agreement and 0.80-1.00 as almost perfect.
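As a rough illustration of the statistical step, the sketch below computes a multi-rater kappa and maps it onto the agreement bands listed above. The text does not state which kappa variant was used; Fleiss' kappa is assumed here only because it handles more than two raters. The function names `fleiss_kappa` and `agreement_band` and the toy counts are hypothetical and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa (assumed variant; the study only says "Kappa").

    counts[i][j] is the number of raters who assigned case i to category j.
    Every case must be rated by the same number of raters.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must have the same number of ratings")
    n_categories = len(counts[0])
    total = n_cases * n_raters

    # Share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per case: proportion of agreeing rater pairs.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_case) / n_cases

    # Agreement expected by chance. If every rating falls in one category,
    # p_exp is 1 and kappa is undefined (division by zero).
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1 - p_exp)


def agreement_band(kappa):
    """Map a kappa value to the bands used in this study.

    Negative values are not covered by the stated bands; treating them as
    "unsatisfactory" is an assumption of this sketch.
    """
    if kappa < 0.20:
        return "unsatisfactory"
    if kappa < 0.40:
        return "low agreement"
    if kappa < 0.60:
        return "moderate agreement"
    if kappa < 0.80:
        return "satisfactory agreement"
    return "almost perfect"


# Toy example: 4 cases, 8 raters, 3 categories (2, 3 and 4 parts).
toy_counts = [
    [8, 0, 0],
    [4, 4, 0],
    [0, 2, 6],
    [1, 3, 4],
]
toy_kappa = fleiss_kappa(toy_counts)
print(round(toy_kappa, 3), agreement_band(toy_kappa))
```

With a table of this shape, each row is one patient and each column one class, so the same function applies to the 3 Neer categories and to the 9 AO groups.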
RESULTS
Inter-observer
In total, 54 patients were included in the sample. The tomographies and radiographs of the 54 cases were evaluated by eight orthopedists specialized in the upper limb.
For radiographs classified according to Neer, kappa agreement values were 0.275 (2 parts), 0.083 (3 parts) and 0.204 (4 parts), with a general kappa of 0.178 (p < 0.001) (Table 1). For tomographies, the kappa values were 0.229 (2 parts), 0.147 (3 parts) and 0.32 (4 parts), with a general kappa of 0.22 (p < 0.001) (Table 2).
Table 1
Concordance table for radiographs classified according to Neer.
                                   2 parts    3 parts    4 parts
Kappa of the category              0.275      0.083      0.204
P-value of Kappa of the category   < 0.001    0.001      < 0.001
General Kappa                      0.178
P-value                            < 0.001
95% CI Kappa                       lower: 0.144; upper: 0.213
Table 2
Concordance table for tomographies classified according to Neer.
                                   2 parts    3 parts    4 parts
Kappa of the category              0.229      0.147      0.32
P-value of Kappa of the category   < 0.001    < 0.001    < 0.001
General Kappa                      0.22
P-value                            < 0.001
95% CI Kappa                       lower: 0.184; upper: 0.256
For radiographs classified according to AO, kappa values were 0.232 for A1, 0.194 for A2, 0.266 for A3, 0.15 for B1, 0.21 for B2, 0.078 for B3, 0.045 for C1, 0.133 for C2 and 0.419 for C3, with a general kappa of 0.201 (Table 3).
Table 3
Concordance table for radiographs classified according to AO.
                                   A1        A2        A3        B1        B2        B3        C1        C2        C3
Kappa of the category              0.232     0.194     0.266     0.15      0.21      0.078     0.045     0.133     0.419
P-value of Kappa of the category   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   0.002     0.081     < 0.001   < 0.001
General Kappa                      0.201
P-value                            < 0.001
95% CI Kappa                       lower: 0.18; upper: 0.221
For tomographies classified according to AO, kappa values were 0.535 for A1, 0.273 for A2, 0.28 for A3, 0.242 for B1, 0.221 for B2, 0.236 for B3, 0.114 for C1, 0.479 for C2 and 0.311 for C3, with a general kappa of 0.277 (Table 4).
Table 4
Concordance table for tomographies classified according to AO.
                                   A1        A2        A3        B1        B2        B3        C1        C2        C3
Kappa of the category              0.535     0.273     0.28      0.242     0.221     0.236     0.114     0.479     0.311
P-value of Kappa of the category   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001
General Kappa                      0.277
P-value                            < 0.001
95% CI Kappa                       lower: 0.256; upper: 0.298
On radiographs, a mean of 4.71 physicians per case agreed on the Neer classification, while with the AO classification 4 or more physicians agreed in 36 cases.
On tomographies, a mean of 5.06 physicians per patient agreed on the Neer classification, while with the AO classification 4 or more physicians agreed in 42 cases.
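One way to read the per-case agreement figures above is as the size of the largest group of physicians who assigned the same class to a case. The sketch below computes the mean group size and the number of cases reaching a threshold under that assumed interpretation; the function names and the toy ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def largest_agreeing_group(ratings):
    """Size of the largest group of raters who gave the same class to one case."""
    return Counter(ratings).most_common(1)[0][1]


def summarize_agreement(cases, threshold=4):
    """Return (mean largest-group size, number of cases with >= threshold raters agreeing)."""
    sizes = [largest_agreeing_group(ratings) for ratings in cases]
    mean_size = sum(sizes) / len(sizes)
    n_at_least = sum(1 for size in sizes if size >= threshold)
    return mean_size, n_at_least


# Toy example: 3 cases, each classified by 8 raters with AO groups.
toy_cases = [
    ["A1", "A1", "A1", "A2", "A1", "B1", "A1", "A1"],  # 6 raters chose A1
    ["C2", "C3", "C2", "C1", "B2", "C2", "C3", "B3"],  # 3 raters chose C2
    ["B1"] * 8,                                         # all 8 chose B1
]
mean_size, n_at_least_4 = summarize_agreement(toy_cases)
print(mean_size, n_at_least_4)
```

Unlike kappa, this count is not corrected for chance, so it tends to look more favorable when a classification has few categories.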
Intra-observer
Regarding intra-observer evaluations, each observer's radiograph and tomography classifications agreed in a mean of 26.92 of the 54 cases (range 15 to 31) according to the Neer classification, and in a mean of 17.125 cases (range 12 to 22) according to the AO classification.
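The intra-observer figures above can be read as, for each observer, the number of cases whose radiograph-based and tomography-based classifications match. A minimal sketch under that assumption follows; `matching_cases`, `summarize_observers` and the toy data are hypothetical.

```python
def matching_cases(radiograph_classes, tomography_classes):
    """Count cases where one observer gave the same class on both modalities."""
    if len(radiograph_classes) != len(tomography_classes):
        raise ValueError("both lists must cover the same cases in the same order")
    return sum(1 for r, t in zip(radiograph_classes, tomography_classes) if r == t)


def summarize_observers(per_observer):
    """per_observer holds one (radiograph_classes, tomography_classes) pair per observer.

    Returns (mean, minimum, maximum) number of matching cases across observers.
    """
    counts = [matching_cases(r, t) for r, t in per_observer]
    return sum(counts) / len(counts), min(counts), max(counts)


# Toy example: 2 observers, 4 cases, Neer number of parts.
toy_observers = [
    (["2", "3", "4", "4"], ["2", "4", "4", "4"]),  # 3 matching cases
    (["3", "3", "4", "2"], ["2", "4", "4", "3"]),  # 1 matching case
]
mean_match, min_match, max_match = summarize_observers(toy_observers)
print(mean_match, min_match, max_match)
```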
DISCUSSION
Radiography is the standard method for evaluation, diagnosis and classification. However, computed tomography is expected to facilitate and improve the reproducibility of fracture analysis, providing greater intra-observer agreement and enabling a better choice of treatment and a more reliable and reproducible classification system.
Despite numerous complaints regarding its reproducibility, the Neer classification is widely accepted and commonly used to guide treatment and anticipate prognosis; it is pedagogically useful, easy to learn and easy to understand, and it separates fractures into broad categories.
The AO classification (Arbeitsgemeinschaft für Osteosynthesefragen) divides fractures according to their complexity and facilitates the choice of treatment and prognosis. The AO is one of the most complete classification systems; however, its intra- and inter-observer reproducibility is reduced.
In our study, we evaluated 54 patients with proximal humerus fracture, whose initial evaluation was performed by radiography and tomography with 3D reconstruction.
In the inter-observer evaluation of radiographs according to the Neer classification, kappa agreement was 0.083 for fractures classified into 3 parts, 0.204 for 4 parts and 0.275 for 2 parts, with a general kappa of 0.178 (p < 0.001) (Table 1). These values are lower than those of Papakonstantinou et al., who showed a global kappa of 0.40-0.58; Bernstein et al., a kappa of 0.52; Siebenrock and Gerber, a kappa of 0.40; and Sidor et al., a kappa of 0.48. Brorson and Hróbjartsson conducted a systematic review, finding 11 studies with kappa ranging from 0.17 to 0.52. However, among the reviewed studies, the greater the number of evaluations and the larger the group of raters, the lower the kappa agreement. Among the studies mentioned, Schwartz and Cuny used 11 orthopedists to evaluate the radiographs of 21 patients, obtaining a kappa of 0.17; Kristiansen studied 100 patients, obtaining a kappa of 0.07-0.48. The best result was found in the study by Bernstein et al., with 20 cases analyzed by 2 orthopedists and 2 orthopedic residents, which obtained a kappa of 0.52.
In the evaluation of the tomographies, we found a general kappa of 0.22 (p < 0.001), with values of 0.147 for fractures classified into 3 parts, 0.229 for 2 parts and 0.32 for 4 parts, as shown in Table 2. Brorson and Hróbjartsson reported a mean of 0.34-0.72.
The low agreement in our study may be explained by the fact that tomography with 3D reconstruction was evaluated without radiographic analysis. Sjödén et al., on the other hand, showed that the addition of tomography did not improve the reproducibility of the Neer classification.
However, our study showed a small improvement in reproducibility with computed tomography, with a mean of 5.06 physicians agreeing per case versus 4.71 on radiographs.
When analyzing the radiographs classified according to AO, the highest agreement was obtained for fractures classified as C3 (kappa 0.419) and A3 (kappa 0.266), with a general kappa of 0.201.
Overall, tomographies classified according to AO showed higher agreement than radiographs (general kappa 0.277), with kappa values of 0.535 for A1, 0.479 for C2 and 0.311 for C3, as shown in Tables 3 and 4. The values found were similar to the results of Matsushigue et al., in which a kappa of 0.25 was obtained for radiographs and 0.36 for tomographies. They were higher than in the analysis by Majed et al., which showed weak inter-observer reliability, with a kappa of 0.11, and lower than those of Sjödén et al. (kappa 0.31), Siebenrock and Gerber (kappa 0.42) and Papakonstantinou et al. (kappa 0.31-0.54). The high complexity of the classification system and the high number of categories and subcategories explain the low inter-observer agreement.
In our study, the Neer and AO classifications were more reproducible and presented better results when performed on tomography with 3D reconstruction, especially in fractures of greater complexity (Neer 4 parts and AO group C). However, inter- and intra-observer reproducibility and agreement (a mean of 26.92 cases, ranging from 15 to 31, according to Neer and 17.125 cases, ranging from 12 to 22, according to AO, in the 54 cases analyzed) still remain low in absolute values.
The statistical method used in our study was kappa agreement analysis. This measure takes values up to 1 (one), representing total agreement, while values near 0 (zero) represent no agreement beyond chance. Although this calculation was designed for two observers, kappa was used with more than 2 observers in our study and in the other studies we analyzed. Thus, the kappa values obtained are below the real values, since the rate of chance agreement is calculated for each observer. However, kappa is still the most appropriate statistical method for this type of analysis.
One of the limitations of our study was its retrospective nature. All radiographs were performed in the emergency room, in emergency situations, and some were of limited quality. Because of the retrospective design, we could not repeat the radiographs or request new ones to improve quality.
Eight orthopedists specialized in the upper limb participated in our study to standardize the agreement indexes and to include professionals with the same level of experience. The classification was not applied repeatedly at different times because, according to studies, this would not change the reproducibility.
CONCLUSION
Tomography with 3D reconstruction did not significantly improve global inter- and intra-observer agreement for the Neer and AO classifications compared with radiographs. We found low agreement in the evaluation of proximal humerus fractures, except in AO group C and Neer 4-part fractures. Although the classifications were applied by 8 specialists in the upper limb, this finding supports previous studies on the difficulty of achieving good reliability and reproducibility of classifications.
Authors: Addie Majed; Iain Macleod; Anthony M J Bull; Karol Zyto; Herbert Resch; Ralph Hertel; Peter Reilly; Roger J H Emery Journal: J Shoulder Elbow Surg Date: 2011-04-09 Impact factor: 3.019
Authors: M Wade Shrader; Joaquin Sanchez-Sotelo; John W Sperling; Charles M Rowland; Robert H Cofield Journal: J Shoulder Elbow Surg Date: 2005 Sep-Oct Impact factor: 3.019
Authors: Maritsa K Papakonstantinou; Melissa J Hart; Richard Farrugia; Belinda J Gabbe; Afshin Kamali Moaveni; Dirk van Bavel; Richard S Page; Martin D Richardson Journal: ANZ J Surg Date: 2016-02-17 Impact factor: 1.872