
Interobserver and intraobserver variability in magnetic resonance imaging evaluation of patients with suspected disc herniation.

Somayeh Hajiahmadi, Azin Shayganfar, Mahsa Askari, Shadi Ebrahimian.

Abstract

OBJECTIVE: Magnetic resonance imaging (MRI) is usually the modality of choice to assess sciatica and intervertebral disc herniation. Despite remarkable progress in diagnostic imaging and surgical techniques, definite diagnosis based on imaging interpretation is still a great challenge. The aim of this study was to determine interobserver and intraobserver variability in reporting lumbar MRI between two neuroradiologists based on the new 2014 version of disc nomenclature.
PATIENTS AND METHODS: The study population was composed of 134 patients with clinical presentations of disc herniation and lumbar radiculopathy who had not responded to conservative therapy. MRI was performed in all participants using a 1.5 T MRI system. Two neuroradiologists independently evaluated the images in the sagittal and axial planes, and one of them evaluated the images twice. Disc bulge, disc herniation and nerve root compression were evaluated at each level. Interobserver agreement (between the two neuroradiologists) and intraobserver agreement (between the two readings of one neuroradiologist) were calculated for bulging and herniated discs and nerve root compression using the kappa statistic.
RESULTS: Diagnoses of bulging disc, herniated disc, type of disc, disc location, and nerve root compression showed excellent, statistically significant agreement in intraobserver assessments (kappa > 0.7, p-value < 0.001), while interobserver assessments showed fair, statistically significant agreement (kappa: 0.4-0.7, p-value < 0.05).
CONCLUSION: Remarkable intraobserver agreement was found between diagnoses of disc-related pathologies of the lumbar spine while interobserver assessments revealed only fair concordance.
© 2020 The Authors.

Keywords:  Anatomy; Interobserver reliability; Intraobserver reliability; Lumbar disc herniation; Medical imaging; Musculoskeletal system; Nerve root compression; Nervous system; Neurology

Year:  2020        PMID: 33204866      PMCID: PMC7649260          DOI: 10.1016/j.heliyon.2020.e05201

Source DB:  PubMed          Journal:  Heliyon        ISSN: 2405-8440


Introduction

Magnetic resonance imaging (MRI) is commonly used in patients with intervertebral disc herniation (IDH). Low back pain caused by IDH is one of the most common health problems in the world. Since this modality can provide exquisite morphologic details of the disc abnormality [1, 2], MRI is considered the diagnostic imaging of choice for IDH [3]. The decision on treatment options in patients with IDH is based on the clinical and imaging findings. Conservative treatment for at least 6–8 weeks is used as the first-line management in patients with IDH. In cases with poor response to conservative treatment, surgery might be considered and MRI is routinely applied to assess the presence of nerve root compression [4]. It has been reported that about 10–40% of patients did not have a satisfactory improvement in symptoms after lumbar disc surgery despite the developments in diagnostic and surgical techniques [5]. The poor outcomes following lumbar disc surgery have most commonly been due to errors in diagnosis rather than the surgical technique or its complications [6]. Variability among spine MRI interpretations could negatively affect therapeutic decisions and lead to inappropriate medical management in cases of false-positive or false-negative diagnosis of nerve root compression. Unreliable interpretation may also result in problems when attempting to detect the relationship between specific imaging characteristics and patient outcomes. Therefore, it is essential to have insight into the interpretation variability of MRI findings among potential candidates for lumbar disc surgery [4]. In this regard, several studies [7, 8, 9, 10, 11, 12, 13, 14, 15] have assessed the agreement between different neuroradiologists in reporting findings such as disc degeneration, Modic changes, annular tears, disc bulges, protrusions and herniations, and spinal stenosis on MRI.
Depending on the evaluated MRI finding, moderate to excellent concordance was found between the neuroradiologists' reports in previous studies [16]. To decrease interobserver variability in the interpretation of lumbar MRI, the “Nomenclature and classification of lumbar disc pathology” has been updated and is used as a classification and reporting system to prevent inconsistency in anatomic, physiologic, and pathologic descriptions of the lumbar disc [17]. The aim of this study was to determine interobserver and intraobserver variability in reporting lumbar MRI between two neuroradiologists using the new version of disc nomenclature published in 2014.

Material and methods

Our study population was composed of patients who were referred to the Radiology Department of Al-Zahra hospital, Isfahan University of Medical Sciences, Isfahan, Iran with clinical suspicion of disc herniation and lumbar radiculopathy. Patients who had not responded to conservative management for at least 6–8 weeks entered the study. Patients younger than 18 years or older than 70 years of age, patients with a history of surgery, spinal infections or tumors, and pregnant women were excluded from the study. The study was approved by the ethical committee of Radiology department, Isfahan University of Medical Sciences, Isfahan, Iran (IR.MUI.REC.1396.3.342) and the need for informed consent was waived.
All images were acquired with a 1.5 T MRI system (Ingenia, Philips). The standard imaging protocol included T1 weighted sagittal images (slice thickness 4 mm, FOV 230 × 317 × 48 mm, image matrix 220 × 238 × 11), T2 weighted sagittal images (slice thickness 4 mm, FOV 230 × 317 × 48 mm, image matrix 232 × 226 × 11), and T2 weighted fat suppressed sagittal images (slice thickness 4 mm, FOV 230 × 317 × 48 mm, image matrix 220 × 238 × 11). Two neuroradiologists who were blinded to the patients’ clinical findings and each other's reports evaluated the lumbar discs at the L3-L4, L4-L5, and L5-S1 levels on a picture archiving and communication system (PACS) and interpreted the images in the sagittal and axial planes using Onis 2.6 software. The disc at each level was evaluated separately for the presence or absence of disc bulge and disc herniation. No distinction was made between disc protrusion and extrusion, and both were included in the term herniation. The location of the discs was determined. Descriptions were based on “Lumbar disc nomenclature: version 2.0” [18]. Nerve roots from the L3 to S1 levels were evaluated for nerve compression. In addition, one of the neuroradiologists reported all images in the usual manner a second time after an interval of one month.
The findings of both observers and the two readings of one observer were recorded. Afterward, the results were analyzed using SPSS 16. Kappa statistics were applied to calculate the interobserver agreement (between the two neuroradiologists) and the intraobserver agreement (between the two readings of one neuroradiologist) for the evaluation of bulging and herniated discs and nerve root compression. Kappa was interpreted as proposed by Cohen, i.e., a value of less than 0.4 was considered poor, a value between 0.4 and 0.75 fair to good, and a value above 0.75 excellent.
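The kappa calculation and the interpretation bands described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SPSS procedure the authors used: the function names are hypothetical, the value 0.75 itself is assigned to "fair to good" as an assumption (the text does not say which band it belongs to), and the example counts are the L3-L4 intraobserver disc bulge counts as read from Table 1.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa for a 2x2 table [[a, b], [c, d]] of paired ratings.

    a and d are the agreeing cells (No/No and Yes/Yes);
    b and c are the disagreeing cells.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the row and column marginal totals.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


def interpret_kappa(kappa):
    """Bands quoted in the Methods; placing 0.75 in the middle band is an assumption."""
    if kappa < 0.4:
        return "poor"
    if kappa <= 0.75:
        return "fair to good"
    return "excellent"


# L3-L4 disc bulge, first vs. second reading of Neuroradiologist 1 (Table 1).
kappa = cohen_kappa_2x2(99, 1, 0, 34)
print(f"kappa = {kappa:.3f} ({interpret_kappa(kappa)})")
```

With these counts the sketch gives a kappa of about 0.980, in line with the value reported in Table 1.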

Results

Of the 134 patients in the present study, 70 (52.2%) were male and 64 (47.8%) were female. The bulging disc findings of the first neuroradiologist were compared with her own previous reports. At the L3-L4 and L5-S1 disc levels, intraobserver agreement exceeded 90%, and the kappa coefficients (0.980 and 0.871, respectively) were acceptable and significant (p-value < 0.001). Comparing the bulging disc diagnoses of the second neuroradiologist with those of the first showed a lower agreement percentage and lower kappa coefficients (0.712 and 0.668 at the L3-L4 and L5-S1 disc levels, respectively); however, the interobserver agreement still remained at a fair level (p-value < 0.001). The lowest kappa coefficients and agreement percentages were found at the L4-L5 disc level, both for the intraobserver agreement of the first neuroradiologist (K = 0.605, agreement percentage = 22.5%) and for the interobserver agreement of the first and second neuroradiologists (K = 0.598, agreement percentage = 22.4%); nevertheless, the agreement was still significant (p-value < 0.001) (Table 1).
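Table 1

Inter and Intra observer agreement in evaluation of disc bulge.

| Observer | Level | Rating | Neuroradiologist 1: No | Neuroradiologist 1: Yes | Total |
|---|---|---|---|---|---|
| Neuroradiologist 1 | L3-L4 | No | 99 | 1 | 100 |
| Neuroradiologist 1 | L3-L4 | Yes | 0 | 34 | 34 |
| Neuroradiologist 1 | L4-L5 | No | 29 | 58 | 87 |
| Neuroradiologist 1 | L4-L5 | Yes | 45 | 1 | 46 |
| Neuroradiologist 1 | L5-S1 | No | 108 | 1 | 109 |
| Neuroradiologist 1 | L5-S1 | Yes | 4 | 21 | 25 |
| Neuroradiologist 2 | L3-L4 | No | 87 | 4 | 91 |
| Neuroradiologist 2 | L3-L4 | Yes | 12 | 31 | 43 |
| Neuroradiologist 2 | L4-L5 | No | 27 | 56 | 83 |
| Neuroradiologist 2 | L4-L5 | Yes | 48 | 3 | 51 |
| Neuroradiologist 2 | L5-S1 | No | 101 | 3 | 104 |
| Neuroradiologist 2 | L5-S1 | Yes | 11 | 19 | 30 |

| Agreement | L3-L4 | L4-L5 | L5-S1 |
|---|---|---|---|
| Intraobserver | 99.2%, kappa = 0.980, p < 0.001 | 22.5%, kappa = 0.605, p < 0.001 | 96.3%, kappa = 0.871, p < 0.001 |
| Interobserver | 88.1%, kappa = 0.712, p < 0.001 | 22.4%, kappa = 0.598, p < 0.001 | 89.5%, kappa = 0.668, p < 0.001 |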
In the detection of herniated discs at the three levels of L3-L4, L4-L5, and L5-S1, the observations of the first neuroradiologist compared with her previous observations gave kappa coefficient values of 0.970, 0.954, and 0.985, respectively (p-value < 0.001). Compared with the observations of the second neuroradiologist, the kappa coefficient values were 0.670, 0.835, and 0.804 at the three levels, respectively (p-value < 0.001). In fact, the intraobserver agreement and kappa coefficient values obtained from the first neuroradiologist were far stronger than the interobserver agreement and kappa coefficient values obtained from comparing the observations of the first and second neuroradiologists (Table 2).
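Table 2

Inter and Intra observer agreement in evaluation of disc herniation.

| Observer | Level | Rating | Neuroradiologist 1: No | Neuroradiologist 1: Yes | Total |
|---|---|---|---|---|---|
| Neuroradiologist 1 | L3-L4 | No | 114 | 1 | 115 |
| Neuroradiologist 1 | L3-L4 | Yes | 0 | 19 | 19 |
| Neuroradiologist 1 | L4-L5 | No | 72 | 1 | 73 |
| Neuroradiologist 1 | L4-L5 | Yes | 2 | 58 | 60 |
| Neuroradiologist 1 | L5-S1 | No | 73 | 1 | 74 |
| Neuroradiologist 1 | L5-S1 | Yes | 0 | 60 | 60 |
| Neuroradiologist 2 | L3-L4 | No | 109 | 6 | 115 |
| Neuroradiologist 2 | L3-L4 | Yes | 5 | 14 | 19 |
| Neuroradiologist 2 | L4-L5 | No | 67 | 3 | 70 |
| Neuroradiologist 2 | L4-L5 | Yes | 8 | 56 | 64 |
| Neuroradiologist 2 | L5-S1 | No | 67 | 8 | 76 |
| Neuroradiologist 2 | L5-S1 | Yes | 5 | 53 | 58 |

| Agreement | L3-L4 | L4-L5 | L5-S1 |
|---|---|---|---|
| Intraobserver | 99.2%, kappa = 0.970, p < 0.001 | 97.7%, kappa = 0.954, p < 0.001 | 99.2%, kappa = 0.985, p < 0.001 |
| Interobserver | 91.8%, kappa = 0.670, p < 0.001 | 91.8%, kappa = 0.835, p < 0.001 | 89.5%, kappa = 0.804, p < 0.001 |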
Evaluation of the diagnosis of the type of disc showed an agreement percentage of 100% and a kappa coefficient of 1 at the L3-L4 level, both for the first neuroradiologist compared with her previous observations and for the first neuroradiologist compared with the second. At the L4-L5 and L5-S1 levels, the agreement percentage was more than 90% and the kappa coefficient values were at acceptable and significant levels (p-value < 0.001) (Table 3).
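Table 3

Inter and Intra observer agreement in evaluation of type of disc.

| Observer | Level | Rating | Neuroradiologist 1: P | Neuroradiologist 1: E | Neuroradiologist 1: S | Total |
|---|---|---|---|---|---|---|
| Neuroradiologist 1 | L3-L4 | P | 18 | 0 | 0 | 18 |
| Neuroradiologist 1 | L3-L4 | E | 0 | 1 | 0 | 1 |
| Neuroradiologist 1 | L3-L4 | S | 0 | 0 | 0 | 0 |
| Neuroradiologist 1 | L4-L5 | P | 47 | 2 | 0 | 49 |
| Neuroradiologist 1 | L4-L5 | E | 0 | 8 | 0 | 8 |
| Neuroradiologist 1 | L4-L5 | S | 0 | 1 | 0 | 1 |
| Neuroradiologist 1 | L5-S1 | P | 46 | 0 | 0 | 46 |
| Neuroradiologist 1 | L5-S1 | E | 1 | 12 | 0 | 13 |
| Neuroradiologist 1 | L5-S1 | S | 0 | 0 | 1 | 1 |
| Neuroradiologist 2 | L3-L4 | P | 13 | 0 | 0 | 13 |
| Neuroradiologist 2 | L3-L4 | E | 0 | 1 | 0 | 1 |
| Neuroradiologist 2 | L3-L4 | S | 0 | 0 | 0 | 0 |
| Neuroradiologist 2 | L4-L5 | P | 44 | 1 | 0 | 45 |
| Neuroradiologist 2 | L4-L5 | E | 3 | 8 | 0 | 11 |
| Neuroradiologist 2 | L4-L5 | S | 0 | 0 | 0 | 0 |
| Neuroradiologist 2 | L5-S1 | P | 41 | 0 | 0 | 41 |
| Neuroradiologist 2 | L5-S1 | E | 4 | 8 | 0 | 12 |
| Neuroradiologist 2 | L5-S1 | S | 0 | 0 | 1 | 1 |

| Agreement | L3-L4 | L4-L5 | L5-S1 |
|---|---|---|---|
| Intraobserver | 100%, kappa = 1, p < 0.001 | 94.8%, kappa = 0.821, p < 0.001 | 96.7%, kappa = 0.953, p < 0.001 |
| Interobserver | 100%, kappa = 1, p < 0.001 | 92.9%, kappa = 0.757, p < 0.001 | 90.7%, kappa = 0.734, p < 0.001 |

P: Protrusion; E: Extrusion; S: Sequestration.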

In addition, the intraobserver agreement of the first neuroradiologist in detecting the disc location in the central, paracentral, subarticular, foraminal, and broad base regions was 100% at the L3-L4 and L5-S1 levels (p-value < 0.001) and 98.2% at the L4-L5 level (p-value < 0.001). The interobserver agreements at the levels of L3-L4, L4-L5, and L5-S1 were 64.3%, 82.7%, and 84.6%, respectively, with kappa coefficient values of 0.421, 0.670, and 0.681, respectively (p-value < 0.001) (Table 4).
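Table 4

Inter and Intra observer agreement in evaluation of disc location.

| Observer | Level | Rating | Neuroradiologist 1: C | Neuroradiologist 1: P | Neuroradiologist 1: S | Neuroradiologist 1: F | Neuroradiologist 1: B | Total |
|---|---|---|---|---|---|---|---|---|
| Neuroradiologist 1 | L3-L4 | C | 7 | 0 | 0 | 0 | 0 | 7 |
| Neuroradiologist 1 | L3-L4 | P | 0 | 2 | 0 | 0 | 0 | 2 |
| Neuroradiologist 1 | L3-L4 | S | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 1 | L3-L4 | F | 0 | 0 | 0 | 9 | 0 | 9 |
| Neuroradiologist 1 | L3-L4 | B | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 1 | L4-L5 | C | 33 | 0 | 0 | 0 | 0 | 33 |
| Neuroradiologist 1 | L4-L5 | P | 0 | 11 | 0 | 1 | 0 | 12 |
| Neuroradiologist 1 | L4-L5 | S | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 1 | L4-L5 | F | 0 | 0 | 0 | 9 | 0 | 9 |
| Neuroradiologist 1 | L4-L5 | B | 0 | 0 | 0 | 0 | 3 | 3 |
| Neuroradiologist 1 | L5-S1 | C | 37 | 0 | 0 | 0 | 0 | 37 |
| Neuroradiologist 1 | L5-S1 | P | 0 | 19 | 0 | 0 | 0 | 19 |
| Neuroradiologist 1 | L5-S1 | S | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 1 | L5-S1 | F | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 1 | L5-S1 | B | 0 | 0 | 0 | 0 | 3 | 3 |
| Neuroradiologist 2 | L3-L4 | C | 6 | 1 | 0 | 1 | 0 | 8 |
| Neuroradiologist 2 | L3-L4 | P | 0 | 1 | 0 | 1 | 0 | 2 |
| Neuroradiologist 2 | L3-L4 | S | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 2 | L3-L4 | F | 1 | 0 | 0 | 2 | 0 | 3 |
| Neuroradiologist 2 | L3-L4 | B | 0 | 0 | 0 | 1 | 0 | 1 |
| Neuroradiologist 2 | L4-L5 | C | 30 | 4 | 0 | 0 | 0 | 34 |
| Neuroradiologist 2 | L4-L5 | P | 0 | 6 | 0 | 4 | 0 | 10 |
| Neuroradiologist 2 | L4-L5 | S | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 2 | L4-L5 | F | 0 | 0 | 0 | 4 | 0 | 4 |
| Neuroradiologist 2 | L4-L5 | B | 1 | 0 | 0 | 1 | 3 | 4 |
| Neuroradiologist 2 | L5-S1 | C | 31 | 8 | 0 | 0 | 0 | 39 |
| Neuroradiologist 2 | L5-S1 | P | 0 | 10 | 0 | 0 | 0 | 10 |
| Neuroradiologist 2 | L5-S1 | S | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 2 | L5-S1 | F | 0 | 0 | 0 | 0 | 0 | 0 |
| Neuroradiologist 2 | L5-S1 | B | 0 | 0 | 0 | 0 | 3 | 3 |

| Agreement | L3-L4 | L4-L5 | L5-S1 |
|---|---|---|---|
| Intraobserver | 100%, kappa = 1, p < 0.001 | 98.2%, kappa = 0.970, p < 0.001 | 100%, kappa = 1, p < 0.001 |
| Interobserver | 64.3%, kappa = 0.421, p < 0.001 | 82.7%, kappa = 0.670, p < 0.001 | 84.6%, kappa = 0.681, p < 0.001 |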

C: Central, P: Paracentral, S: Subarticular, F: Foraminal, B: Broad base.
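For findings with more than two categories, such as disc location, percent agreement and kappa follow from the full square table of paired ratings. The sketch below is a hedged illustration rather than the authors' SPSS analysis: the function name is hypothetical, and the 5 × 5 counts are the L3-L4 interobserver location counts as read from the flattened Table 4 (category order C, P, S, F, B; rows taken as Neuroradiologist 2 and columns as Neuroradiologist 1, which is an assumption about the table layout).

```python
def agreement_and_kappa(table):
    """Return (percent agreement, Cohen's kappa) for a square table of counts.

    table[i][j] is the number of cases placed in category i by one rater
    and in category j by the other.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of matching marginal proportions.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return 100 * p_observed, kappa


# L3-L4 disc location, Neuroradiologist 2 (rows) vs. Neuroradiologist 1 (columns).
l3_l4_location = [
    [6, 1, 0, 1, 0],  # C
    [0, 1, 0, 1, 0],  # P
    [0, 0, 0, 0, 0],  # S
    [1, 0, 0, 2, 0],  # F
    [0, 0, 0, 1, 0],  # B
]

percent, kappa = agreement_and_kappa(l3_l4_location)
print(f"agreement = {percent:.1f}%, kappa = {kappa:.3f}")
```

Under these assumptions the sketch gives roughly 64.3% agreement and a kappa of about 0.421, matching the L3-L4 interobserver values reported for disc location.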

Finally, the evaluation of agreement in detecting the type of pressure indicated that the observations of the first neuroradiologist compared with her previous observations had an agreement percentage of more than 95% and a kappa coefficient value of more than 0.85 at the three levels of L3-L4, L4-L5, and L5-S1 (p-value < 0.001). Compared with the second neuroradiologist, the agreement percentage was 80% or more; however, the kappa coefficient values were between 0.52 and 0.61 (p-value < 0.001). Although this result indicated a reduced agreement percentage, the reported value of the kappa coefficient was still acceptable (Table 5).
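Table 5

Inter and Intra observer agreement in evaluation of type of pressure on the disc.

| Observer | Level | Rating | Neuroradiologist 1: No | Neuroradiologist 1: Yes | Total |
|---|---|---|---|---|---|
| Neuroradiologist 1 | L3-L4 | No | 46 | 1 | 47 |
| Neuroradiologist 1 | L3-L4 | Yes | 0 | 6 | 6 |
| Neuroradiologist 1 | L4-L5 | No | 68 | 1 | 69 |
| Neuroradiologist 1 | L4-L5 | Yes | 0 | 33 | 33 |
| Neuroradiologist 1 | L5-S1 | No | 59 | 1 | 60 |
| Neuroradiologist 1 | L5-S1 | Yes | 2 | 18 | 20 |
| Neuroradiologist 2 | L3-L4 | No | 37 | 0 | 37 |
| Neuroradiologist 2 | L3-L4 | Yes | 8 | 6 | 14 |
| Neuroradiologist 2 | L4-L5 | No | 48 | 1 | 49 |
| Neuroradiologist 2 | L4-L5 | Yes | 19 | 33 | 52 |
| Neuroradiologist 2 | L5-S1 | No | 46 | 1 | 47 |
| Neuroradiologist 2 | L5-S1 | Yes | 15 | 18 | 33 |

| Agreement | L3-L4 | L4-L5 | L5-S1 |
|---|---|---|---|
| Intraobserver | 98.1%, kappa = 0.912, p < 0.001 | 99%, kappa = 0.978, p < 0.001 | 96.3%, kappa = 0.898, p < 0.001 |
| Interobserver | 84.3%, kappa = 0.521, p < 0.001 | 80.2%, kappa = 0.608, p < 0.001 | 80%, kappa = 0.560, p < 0.001 |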
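The gap between a percent agreement above 80% and a kappa near 0.5 can be traced with a short worked calculation. The arithmetic below is an illustrative sketch using the L3-L4 interobserver counts as read from Table 5 (37 cases rated No by both observers, 6 rated Yes by both, 8 and 0 discordant; marginal totals 37 and 14 for one observer, 45 and 6 for the other). Here $p_o$ denotes the observed agreement and $p_e$ the agreement expected by chance from the marginal totals; these symbols are not used in the original text.

```latex
\begin{align*}
p_o &= \frac{37 + 6}{51} = \frac{43}{51} \approx 0.843 \\
p_e &= \frac{37 \cdot 45 + 14 \cdot 6}{51^2} = \frac{1749}{2601} \approx 0.672 \\
\kappa &= \frac{p_o - p_e}{1 - p_e} \approx \frac{0.843 - 0.672}{1 - 0.672} \approx 0.521
\end{align*}
```

Because most cases were rated No by both observers, about two thirds of the agreement would be expected by chance alone, which is why an observed agreement of 84.3% corresponds to a kappa of only about 0.521.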

Discussion

Deciding whether to operate on patients presenting with nerve root compression symptoms is a great challenge because of postoperative complications. Therefore, an accurate interpretation of the MRI findings can help physicians make the best therapeutic decision [19]. In the present study, we found considerable intra- and interobserver agreement regarding disc bulging (p-value < 0.001). The intraobserver kappa value was excellent for L3-L4 and L5-S1 bulging discs, while interobserver assessments revealed fair, though significant, agreement. Herniated disc assessment presented similar results: intraobserver evaluation presented excellent agreement, while interobserver variation was notably greater. Previous studies that assessed this agreement have presented a wide range of kappa values (0.32–0.79) for interobserver agreement [9, 20, 21], which is consistent with our study. Rehman et al. presented a nonsignificant interobserver kappa index of 0.41 for bulging discs and statistically significant results with a kappa index of 0.51 for the herniated disc [21]. Another study by Braga-Baiak et al. showed a high intraobserver percent agreement but low values of kappa. Similar to our findings, they stated that the intraobserver agreement was better than interobserver agreement. However, contrary to our study, they presented low values of kappa while we found fair to good values [22]. Interobserver variation has been noted in different reports, all of which pointed to a probable role of disc bulging in the lower extent of agreement in neuroimaging interpretation. The influence of this diagnosis is considerable, as Van Rijn et al. reported that up to 50% of interobserver discordances occurred due to disc bulging [10]. In fact, this is a common pathologic finding on MRI among asymptomatic cases.
Therefore, patients presenting with radiculopathy in whom only disc bulging is observed on imaging are considered for conservative treatment. Accordingly, this finding has a low clinical significance [21, 23]. The lower extent of agreement in our study compared to some of the previous papers may also be attributed to the interpreters. In the present study, we selected neuroradiologists, while in some other studies, neurosurgeons or a combination of neuroradiologists and neurosurgeons participated [9]. Another surprising aspect is the probable association of the symptoms with MRI findings. In other words, it seems that radiculopathy symptoms may be matched with incidental imaging findings such as bulging, while these pathologic findings may be ignored in normal cases [24]. To assess this hypothesis, conducting further studies on normal cases versus patients with radiculopathy is recommended. The further assessments of our study targeted diagnosis of herniated disc location. The agreement was remarkably acceptable in intraobserver assessments, while interobserver evaluations showed fair concordance. Moreover, evaluation of the stenotic region responsible for nerve compression symptoms revealed excellent kappa values, all above 0.80, in intraobserver assessments. However, similar to previous findings, the interobserver discrepancy was considerable. This interobserver discordance was reported by Rehman et al. as well. These researchers presented fair values of kappa when assessing the stenotic nerve root [9]. On the contrary, Van Rijn et al. reported excellent concordance for detecting the compressed root responsible for the symptoms presented by patients [10]. Previous studies have unanimously presented considerably higher rates of intraobserver agreement than interobserver agreement, which can simply occur due to clinical features but not a scientific matter [25].
In this regard, variations in observer interpretations pose a challenge in imaging-related research settings. This is due to the lack of a practical approach for minimizing the effects of interobserver variation on image interpretation. In fact, not only should the correct interpretation of imaging-based findings be considered, but the sources of interpreter bias should also be minimized [26]. Although in this study we measured the intra- and interobserver agreement, the clinical correlation of imaging findings with patients' symptoms and outcomes was not evaluated due to the retrospective nature of the study. Since no distinctive correlation has been found between the clinical symptoms and imaging findings [27], imaging evaluation combined with clinical assessment would be more valuable. Other studies are needed to evaluate the influence of MRI interpretation variability on the outcome and therapeutic management of patients with IDH, in addition to the clinical assessments.

Conclusion

The current study found remarkable intraobserver agreement for the diagnosis of disc-related pathologies of the lumbar spine, while interobserver assessments revealed only fair concordance. The findings of our study were consistent with those of previous ones. Overall, it is concluded that practical approaches are needed to minimize interobserver variation in neuroimaging interpretation.

Declarations

Author contribution statement

Somayeh Hajiahmadi, Azin Shayganfar: Conceived and designed the experiments; Performed the experiments; Contributed reagents, materials, analysis tools or data. Mahsa Asgari: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data. Shadi Ebrahimian: Conceived and designed the experiments; Performed the experiments; Wrote the paper.

Funding statement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Competing interest statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.
  23 in total

1.  Low-field magnetic resonance imaging of the lumbar spine: reliability of qualitative evaluation of disc and muscle parameters.

Authors:  J Solgaard Sorensen; P Kjaer; Secher T Jensen; P Andersen
Journal:  Acta Radiol       Date:  2006-11       Impact factor: 1.990

2.  Agreement in the interpretation of magnetic resonance images of the lumbar spine.

Authors:  F M Kovacs; A Royuela; T S Jensen; A Estremera; G Amengual; A Muriel; I Galarraga; C Martínez; E Arana; H Sarasíbar; R M Salgado; V Abraira; O López; C Campillo; M T Gil del Real; J Zamora
Journal:  Acta Radiol       Date:  2009-06       Impact factor: 1.990

3.  Lumbar spine: reliability of MR imaging findings.

Authors:  John A Carrino; Jon D Lurie; Anna N A Tosteson; Tor D Tosteson; Eugene J Carragee; Jay Kaiser; Margaret R Grove; Emily Blood; Loretta H Pearson; James N Weinstein; Richard Herzog
Journal:  Radiology       Date:  2008-10-27       Impact factor: 11.105

Review 4.  Magnetic resonance imaging. Use in patients with low back or radicular pain.

Authors:  R J Herzog; R D Guyer; A Graham-Smith; E D Simmons
Journal:  Spine (Phila Pa 1976)       Date:  1995-08-15       Impact factor: 3.468

5.  Lumbar spine: agreement in the interpretation of 1.5-T MR images by using the Nordic Modic Consensus Group classification form.

Authors:  Estanislao Arana; Ana Royuela; Francisco M Kovacs; Ana Estremera; Helena Sarasíbar; Guillermo Amengual; Isabel Galarraga; Carmen Martínez; Alfonso Muriel; Víctor Abraira; María Teresa Gil Del Real; Javier Zamora; Carlos Campillo
Journal:  Radiology       Date:  2010-02-01       Impact factor: 11.105

6.  Rapid magnetic resonance imaging for diagnosing cancer-related low back pain.

Authors:  William Hollingworth; Darryl T Gray; Brook I Martin; Sean D Sullivan; Richard A Deyo; Jeffrey G Jarvik
Journal:  J Gen Intern Med       Date:  2003-04       Impact factor: 5.128

7.  Prolonged conservative care versus early surgery in patients with sciatica caused by lumbar disc herniation: two year results of a randomised controlled trial.

Authors:  Wilco C Peul; Wilbert B van den Hout; Ronald Brand; Ralph T W M Thomeer; Bart W Koes
Journal:  BMJ       Date:  2008-05-23

8.  Intra- and inter-observer reliability of MRI examination of intervertebral disc abnormalities in patients with cervical myelopathy.

Authors:  Andresa Braga-Baiak; Anand Shah; Ricardo Pietrobon; Larissa Braga; Arnolfo Carvalho Neto; Chad Cook
Journal:  Eur J Radiol       Date:  2007-05-25       Impact factor: 3.528

9.  Magnetic resonance imaging interpretation in patients with symptomatic lumbar spine disc herniations: comparison of clinician and radiologist readings.

Authors:  Jon D Lurie; David M Doman; Kevin F Spratt; Anna N A Tosteson; James N Weinstein
Journal:  Spine (Phila Pa 1976)       Date:  2009-04-01       Impact factor: 3.468

10.  Magnetic resonance imaging interpretation in patients with sciatica who are potential candidates for lumbar disc surgery.

Authors:  Abdelilah El Barzouhi; Carmen L A M Vleggeert-Lankamp; Geert J Lycklama À Nijeholt; Bas F Van der Kallen; Wilbert B van den Hout; Annemieke J H Verwoerd; Bart W Koes; Wilco C Peul
Journal:  PLoS One       Date:  2013-07-10       Impact factor: 3.240
