
Comparison of breast density assessments according to BI-RADS 4th and 5th editions and experience level.

Aysegul Akdogan Gemici¹, Ersoy Bayram¹, Elif Hocaoglu¹, Ercan Inci¹.

Abstract

BACKGROUND: Breast density is an important variable that can change the sensitivity of mammography. It can be assessed using the 4th and 5th editions of the Breast Imaging and Reporting Data System (BI-RADS) recommendations from the American College of Radiology (ACR).
PURPOSE: To define the intra- and inter-reader agreement levels of breast density assignments performed by readers with different experience levels using two versions of BI-RADS.
MATERIAL AND METHODS: The breast density assessments of 330 women were conducted by two readers with different levels of experience (one breast radiologist and one resident). Each reader independently assessed breast density four times: twice using the 4th edition and twice using the 5th edition. Assessments were analyzed on four- and two-category scales.
RESULTS: The intra-reader agreement of the breast radiologist for the 4th and 5th editions of BI-RADS was almost perfect (k = 0.90 and k = 0.87, respectively). The resident had similar results (k = 0.88 and k = 0.87, respectively). The agreement between the breast radiologist and resident for the 4th and 5th editions of BI-RADS was substantial (k = 0.70 and k = 0.63, respectively). In the two-category scale analysis, the dense versus non-dense classification differed significantly between the two versions of BI-RADS for both readers (McNemar's test, P < 0.001).
CONCLUSION: Although intra- and inter-reader agreement levels were high when using both versions, the percentage of women having dense breasts increased when using the 5th edition, and the difference was statistically significant. No differences were found with regard to the readers' level of experience in any of the analyses. © The Foundation Acta Radiologica 2020.


Keywords: Breast; breast density; mammography; radiologists

Year: 2020 PMID: 32733694 PMCID: PMC7372628 DOI: 10.1177/2058460120937381

Source DB: PubMed Journal: Acta Radiol Open

Introduction

Breast density is an important variable that can change the sensitivity of mammography in screening populations. False-negative rates are higher in dense breasts due to the masking effect of density. Additional methods have been considered to overcome this issue (1–3). Furthermore, women with dense breasts have a four- to six-fold increased risk of breast cancer when compared to women with fatty breasts (4). Studies concerning breast density have been ongoing since the 1990s; those mammographic interpretation-based studies were conducted to analyze the usage of the Breast Imaging and Reporting Data System (BI-RADS) of the American College of Radiology (ACR) (5–8). The number of such studies increased once the impact of breast density on breast cancer screening was recognized. Additionally, the density notification laws in the United States, the development of new quantitative density assessment tools, and the introduction of the BI-RADS 5th edition in 2013 led to a rise in published research on the topic (5–10). Broadly, the main goal of these studies was to find a way to improve the measurement of breast density for breast cancer risk assessment, which would in turn help identify better screening strategies. Mammographic density is commonly measured using BI-RADS recommendations from the ACR. The 4th edition of BI-RADS categorizes breast density based on the percentage of fibroglandular tissue present (11). The 5th edition of BI-RADS (10), published in 2013, redefines the density categories; it drops the numeric quartiles of percentage density and instead describes density in terms of the possibility that fibroglandular tissue obscures a lesion. Although the reliability and reproducibility of visual assessments are limited by inter-observer and intra-observer variability, BI-RADS is the most commonly used method for the assessment of breast density in clinical practice (12).
Additionally, automated volumetric methods for density assessments have been introduced. Although they are easily reproducible, they have not been incorporated into the clinical setting (13–16). Visual assessments of the BI-RADS breast density category are subjective, and the level of agreement between readers varies in the literature from “slight” to “almost perfect” (17–24). This discrepancy persists due to many reasons, which include differences in the study populations, the reader’s level of experience, and the BI-RADS version and methods used in the study. The aim of the present study was to examine the variation in assessment of breast density using two versions of BI-RADS, which were performed by two readers with different levels of experience.

Material and Methods

Study design

A total of 330 full-field digital mammography (FFDM) examinations of women with and without symptoms of breast cancer, acquired in our university hospital between January 2018 and March 2018, were retrospectively analyzed. Women with a history of breast surgery, breast augmentation, or chemotherapy and women with lesions detected by mammography were excluded. Routine craniocaudal and mediolateral oblique views were obtained for each breast. FFDM images were acquired on a dedicated system (Selenia Dimensions; Hologic, Bedford, MA, USA) with standard-screening automatic exposure control. The monitor settings and reading conditions remained unchanged throughout the study. One technologist with 10 years of mammography experience performed all examinations so that the compression technique was similar across patients. Compression was considered complete when the breast skin blanched or the patient could not tolerate any more pressure. Mammographic density was analyzed according to the 4th and 5th editions of BI-RADS by two readers with different levels of experience. One reader was a breast radiologist with five years of experience in reading mammograms and the other was a third-year radiology resident with six months of experience in breast radiology. All mammograms were read four times by each reader: twice using the 4th edition of the ACR BI-RADS guidelines and twice using the 5th edition. Each reading was separated by a one-month interval. The presentation of cases within each reading session was randomized to reduce bias. Each reader was blinded to the results of any previous reading. The readers were also blinded to patient information.

ACR BI-RADS density

The 4th edition of the ACR BI-RADS guidelines relies on the percentage of fibroglandular tissue within the total breast using craniocaudal and mediolateral oblique views. Breasts with glandular densities of <25%, 25%–50%, 50%–75%, and >75% were assigned a BI-RADS density value of 1, 2, 3, and 4, respectively (11). In the 5th edition, the percentage system was redefined with an emphasis on the possibility of having an obscured lesion. The categories were defined as follows: category A = almost entirely fat; category B = scattered fibroglandular densities; category C = heterogeneously dense; and category D = extremely dense (10). The intra-reader and inter-reader analyses were performed for each of the two BI-RADS versions using four- and two-category scales. BI-RADS assessment categories 1 and 2 or A and B were considered non-dense, and categories 3 and 4 or C and D were considered dense. In the case of differences in the assessment between two breasts, the BI-RADS breast density category was assigned on the basis of the denser breast.
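
The two-category mapping and the denser-breast rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the authors' workflow; the names `ORDER`, `patient_category`, and `is_dense` are hypothetical.

```python
# Ordered density labels for each BI-RADS edition, from least to most dense.
ORDER = {
    "4th": ["1", "2", "3", "4"],
    "5th": ["A", "B", "C", "D"],
}

def patient_category(left, right, edition):
    """Assign the per-patient category on the basis of the denser breast."""
    order = ORDER[edition]
    return max(left, right, key=order.index)

def is_dense(category, edition):
    """Two-category scale: 3/4 or C/D are dense, 1/2 or A/B are non-dense."""
    return ORDER[edition].index(category) >= 2

print(patient_category("B", "C", "5th"))            # C
print(is_dense("2", "4th"), is_dense("C", "5th"))   # False True
```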

Ethical considerations

Our institutional review board approved this retrospective study (decision number: 2018-03-03), and the requirement for informed consent was waived.

Statistical analysis

Cohen’s kappa coefficient (k) and standard error were calculated to measure inter- and intra-reader variability. Because the density categories are ordinal, we used k with linear weights. Inter-observer agreements and comparison of density assignment distributions were performed according to the readers’ first assessments. The kappa values were interpreted as suggested by Landis and Koch (25): a kappa value of 0.00–0.20 indicates slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; and 0.81–1.00, almost perfect agreement. The analyses were also performed for the two broader categories: non-dense and dense. Statistical analysis was performed using SPSS statistical analysis software (PASW Statistics, version 21.0.0; SPSS Inc., Chicago, IL, USA) and P < 0.05 was considered to be statistically significant.
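
As a rough illustration of the statistic described above, the following Python sketch computes Cohen's kappa with linear (or quadratic) disagreement weights and maps a kappa value to the Landis and Koch labels. The study itself used SPSS; the function names and the toy ratings below are hypothetical and are not study data.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, weights="linear"):
    """Cohen's weighted kappa for two lists of 0-based category indices.

    Undefined (division by zero) if all ratings fall in a single category.
    """
    k, n = n_categories, len(ratings_a)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weights == "linear" else 2  # any other value: quadratic
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            observed_disagreement += w * observed[i][j]
            expected_disagreement += w * row_totals[i] * col_totals[j] / n
    return 1.0 - observed_disagreement / expected_disagreement

def landis_koch(kappa):
    """Verbal label for a kappa value, following Landis and Koch."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Toy ratings on a four-category scale (made up for illustration only).
first_read = [0, 0, 1, 1, 2, 2, 3, 3]
second_read = [0, 1, 1, 1, 2, 3, 3, 3]
print(round(weighted_kappa(first_read, second_read, 4), 2))  # 0.8
print(landis_koch(0.90), "/", landis_koch(0.63))  # almost perfect / substantial
```

With quadratic weights, large disagreements are penalized more heavily relative to adjacent-category disagreements, so the same ratings can yield a different kappa than with linear weights.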

Results

The mean age of the 330 participants was 51.1 years (age range = 35–76 years). The mean percentage of mammograms rated as BI-RADS density 1 was 36% and category A accounted for 23%. BI-RADS density 2 was 33% and category B was 37%. BI-RADS density 3 was 26% and category C was 24%. BI-RADS density 4 was 13% and category D was 24% (Fig. 1).

Fig. 1.

Assessments of two readers with BI-RADS 4th and 5th editions on a four-category scale.

Agreement analyses: BI-RADS category

The intra-reader agreement of the breast radiologist for the 4th and 5th editions of BI-RADS was almost perfect (k = 0.90 and k = 0.87, respectively). The resident had similar results (k = 0.88 and k = 0.87, respectively). The intra-reader agreement between the 4th and 5th editions of BI-RADS was substantial for both the breast radiologist and the resident (k = 0.62 and k = 0.69, respectively). The distribution of breast density for the four-cycle analyses of both readers and the changes that occurred by shifting the BI-RADS version from the 4th to the 5th edition are shown in Fig. 1. The breast radiologist was defined as Reader 1 and the resident as Reader 2. The agreement between the breast radiologist and resident with regard to the 4th and 5th editions of BI-RADS was substantial (k = 0.70 and k = 0.63, respectively). It was almost perfect for category 1, substantial for category 2, and moderate for category 3 for both editions. The inter-reader agreement for category 4 changed from moderate to substantial when using the 5th edition instead of the 4th edition (k = 0.49 and k = 0.61, respectively). With the exception of category 4, the inter-reader agreement for the individual categories generally decreased when using the 5th edition, although it stayed within the same agreement levels (Table 1).

Table 1.

Inter-reader agreement for the 4th and 5th versions of the ACR BI-RADS using the four-category scale.

Outcome	Kappa	Z	Prob > Z
Inter-reader agreement for the 4th version of BI-RADS
Density 1	0.8603	38.28	0.0000
Density 2	0.7275	32.37	0.0000
Density 3	0.5826	25.93	0.0000
Density 4	0.4940	21.98	0.0000
Combined	0.7010	51.24	0.0000
Inter-reader agreement for the 5th version of BI-RADS
Density A	0.8332	37.08	0.000
Density B	0.6734	29.96	0.000
Density C	0.4078	18.15	0.000
Density D	0.6092	27.11	0.000
Combined	0.6352	48.40	0.0000


Agreement analyses: non-dense versus dense

The intra-reader agreement of the breast radiologist for the 4th and 5th editions of BI-RADS was almost perfect (k = 0.93 and k = 0.92, respectively) when using the two-category scale. It was nearly the same for the resident (k = 0.92 and k = 0.90, respectively). The intra-reader agreement between the 4th and 5th editions of BI-RADS was substantial for the breast radiologist and almost perfect for the resident (k = 0.68 and k = 0.81, respectively). The agreement between the breast radiologist and resident for the 4th and 5th editions of BI-RADS when using the two-category scale was almost perfect (k = 0.89 and k = 0.81, respectively). In the two-category scale analysis, the dense versus non-dense categorization differed significantly between the two editions for both readers (McNemar’s test, P < 0.001) (Table 2).

Table 2.

Intra-reader agreement for the 4th and 5th versions of the ACR BI-RADS with respect to the two-category scale.

Reader 1*	V5 non-dense	V5 dense	Total
V4 non-dense	177	44	221
V4 dense	6	103	109
Total	183	147	330
Reader 2†	V5 non-dense	V5 dense	Total
V4 non-dense	186	27	213
V4 dense	2	115	117
Total	188	142	330

*k = 0.68; McNemar’s test, P < 0.001.

†k = 0.81; McNemar’s test, P < 0.001.
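
The statistics in the Table 2 footnotes can be approximately recomputed from the published counts. The Python sketch below uses unweighted Cohen's kappa and a McNemar chi-square test without continuity correction; the source does not state which McNemar variant SPSS applied, so that choice is an assumption, and the function names are hypothetical.

```python
import math

def kappa_2x2(a, b, c, d):
    """Unweighted Cohen's kappa for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    p_observed = (a + d) / n
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) and its P value."""
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts from Table 2: rows are 4th-edition reads, columns are 5th-edition reads.
tables = {
    "Reader 1": (177, 44, 6, 103),
    "Reader 2": (186, 27, 2, 115),
}
for reader, (a, b, c, d) in tables.items():
    chi2, p_value = mcnemar(b, c)
    print(reader, round(kappa_2x2(a, b, c, d), 3), round(chi2, 2), p_value < 0.001)
# Reader 1 0.685 28.88 True
# Reader 2 0.817 21.55 True
```

The recomputed kappa values (about 0.69 and 0.82) are close to the reported 0.68 and 0.81; the small gap may reflect rounding or truncation in the published figures. Only the discordant cells (44 versus 6, and 27 versus 2) enter McNemar's test, which is why the shift toward "dense" under the 5th edition is significant for both readers.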


Discussion

In the present study, the density distribution under the BI-RADS 4th edition guidelines was as follows: for the breast radiologist, 67% non-dense and 33% dense; for the resident, 64.5% non-dense and 35.5% dense. Before the introduction of the BI-RADS 5th edition, non-dense and dense breast tissues were reported to be nearly equally distributed within the general screening population, with 10% almost entirely fatty, 40% scattered fibroglandular, 40% heterogeneously dense, and 10% extremely dense (26–28). When using the BI-RADS 4th edition in the present study, non-dense breasts were seen more frequently than reported in the literature. Although the proportion of dense breasts increased and the distribution became more balanced with the 5th edition, the non-dense pattern remained the most common in our study group: the breast radiologist reported 55.4% of breasts as non-dense and 44.6% as dense, whereas the resident reported 57% as non-dense and 43% as dense. This discrepancy may be attributable to differences in study group characteristics, such as age, body mass index, menopausal status, and whether the examination was for screening or diagnostic purposes. Of these parameters, we could only evaluate the mean age of the study group, which was 51 years; it was not greater than the mean age of the general screening population reported in the literature (29,30). Another reason for this density distribution might be geographical variation, but there is no published literature on the breast density of Turkish women. To examine the consistency of breast density assessments, the present study analyzed breast density classifications and agreement levels when the 4th and 5th editions of BI-RADS were used by two readers with different experience levels.
The results showed near-perfect intra-reader agreement when using both editions with a four-category scale, and the results were similar with a two-category scale. There was no evidence that consistency changed with experience level. Eom et al. (31) used a similar methodology with six readers at three levels of experience (two breast imaging experts, two general radiologists, and two medical students) and showed substantial to near-perfect intra-reader agreement when using the BI-RADS 5th edition. They also found no statistically significant difference in intra-reader agreement between experience levels. Additionally, they studied the agreement between Volpara automated volumetric breast density measurements and BI-RADS assessments; they found that expert radiologists showed higher consistency between the volumetric and qualitative assessments than did the students (31). In the literature, intra-reader agreement levels have been found to be substantial to almost perfect (18,22,23,31,32). Similar to those reports, we found a high intra-reader agreement level, suggesting that visual methods might be reproducible regardless of the reader’s experience level. However, it should be mentioned that reliability might be more important than reproducibility; variation in radiologists’ assessment of breast density may present a problem for screening strategies that use breast density to personalize screening. One of the oldest studies that we are aware of, by Kerlikowske et al. (5), showed that intra-radiologist agreement was higher than inter-radiologist agreement in breast density analysis, a finding similar to ours. Both of our readers showed near-perfect consistency in their judgments regardless of the BI-RADS version used.
However, inter-reader agreement was slightly lower in the present study. To overcome this subjectivity, several automated software programs have been developed for density quantification; these provide a highly reproducible and objective method to measure density (13–16). In the present study, we did not aim to compare the reliability of BI-RADS with quantitative methods, as our focus was on the reproducibility of breast density measurements under different conditions. Inter-reader agreement varies widely in the literature, with kappa values indicating slight agreement (17), moderate agreement (18,19), substantial agreement (20–23), and almost-perfect agreement (24). In the aforementioned studies, the readers had different levels of experience and different editions of the BI-RADS lexicon were used. Additionally, the way kappa is calculated, in particular the weighting scheme, may affect the agreement level. For example, the best inter-reader agreement was reported by Østerås et al. (24), who used quadratic-weighted kappa values rather than the linear weights used in our study. Additionally, the radiologists in that study went through a training program, which increased compliance. These differences in methodology might account for the different agreement levels. In the present study, inter-reader agreement between the breast radiologist and resident was substantial. However, it should be noted that during the study period, the resident worked in the same department as the breast radiologist. Also, because the present study was conducted while the resident was training in breast imaging, this might have contributed to the high agreement levels obtained. Whether the resident can sustain this level of agreement over time should be evaluated. Additionally, the use of different BI-RADS editions may affect consistency.
The study by Ekpo et al. (23) was the first to use the 5th edition of BI-RADS for calculating intra-reader and inter-reader agreement. Their results showed agreement levels similar to those of the present study. They reported substantial to almost-perfect inter- and intra-reader agreement levels with 1000 cases evaluated by five radiologists from the same institution. Their study differed from ours in that the readers were West African College of Radiology board-certified general radiologists with 9–16 years of experience (median = 11 years). In the present study, the resident showed similar inter-reader agreement with the breast radiologist, even with a low experience level. To our knowledge, there is currently no study in the literature that evaluates the effect of the reader’s experience level on compliance. Like the present study, several studies have aimed to clarify whether abandoning the percentage-based assessment in the 5th edition makes density judgments more subjective or whether, on the other hand, the simplicity of judging the possibility of lesions being obscured by fibroglandular tissue brings more consistency (31–35). In 2016, Irshad et al. (33) reported lower inter-reader agreement using the 5th edition (k = 0.57) than when using the 4th edition (k = 0.63). In our study, we found substantial inter-reader agreement, which is higher than that reported by Irshad et al. Similarly, with the exception of category 4, the inter-reader agreement of individual categories slightly decreased when using the 5th edition, even though it stayed within the same agreement level (Table 1). The inter-reader agreement of category 4 changed from moderate to substantial when using the 5th edition (k = 0.49 and k = 0.61, respectively), a result that is new to the literature.
A possible reason for this is that detecting lesions is the radiologist’s job, so judging which densities are most likely to obscure a lesion should be an intuitive task for a radiologist. The majority of studies in the literature that analyzed the changes when using the 5th edition instead of the previous version found statistically significant differences between the dense and non-dense groups; this might affect the screening approach and raise the question of supplemental methods (33–35). A study by Alikhassi et al. (32), which used a methodology similar to that of the present study, showed that the percentage of dense breasts was higher when radiologists used the 5th edition of the ACR BI-RADS guidelines than when they used the 4th edition. However, in contrast to other reports in the literature, this difference was not statistically significant in their study. This discrepant result may be attributable to the low number of mammograms (n = 72) they analyzed. In the present study, we found statistically significant differences between the dense and non-dense distributions when using the BI-RADS 5th edition, results that are consistent with those in the literature. It should be noted that this difference was also present in the resident’s analysis. This may present an opportunity to include an additional screening test, such as an ultrasound examination, that could identify additional mammographically occult breast cancers. On the other hand, this additional workload may increase healthcare costs. Further studies are needed to evaluate the effectiveness of adding supplemental tools for personalized screening when using the BI-RADS 5th edition for breast density assignment, including the effect of the increased use of adjuvant screening tools.
The present study has some limitations. First, it was performed by two readers with different levels of experience at the same institution. This may have led to similar assessments based on shared practice patterns. Second, the small sample size might limit the generalizability of the results. A larger, multi-institutional study may yield more accurate findings.
In conclusion, intra- and inter-reader agreement levels were high when using both versions of the BI-RADS guidelines, regardless of differences in experience level, which might be seen as a sign of consistency. Because the percentage of dense breasts was higher when the readers used the 5th edition of the ACR BI-RADS, there may be an opportunity to introduce a second step for a more personalized screening approach. Future work could also analyze the effect of changing the BI-RADS version on each density category individually.

30 in total

1. Breast Imaging Reporting and Data System: inter- and intraobserver variability in feature analysis and final assessment.

Authors: W A Berg; C Campassi; P Langenberg; M J Sexton
Journal: AJR Am J Roentgenol Date: 2000-06 Impact factor: 3.959

2. Interobserver agreement in breast radiological density attribution according to BI-RADS quantitative classification.

Authors: D Bernardi; M Pellegrini; S Di Michele; P Tuttobene; C Fantò; M Valentini; M Gentilini; S Ciatto
Journal: Radiol Med Date: 2012-01-07 Impact factor: 3.469

3. A first evaluation of breast radiological density assessment by QUANTRA software as compared to visual classification.

Authors: Stefano Ciatto; Daniela Bernardi; Massimo Calabrese; Manuela Durando; Maria Adalgisa Gentilini; Giovanna Mariscotti; Francesco Monetti; Enrica Moriconi; Barbara Pesce; Antonella Roselli; Carmen Stevanin; Margherita Tapparelli; Nehmat Houssami
Journal: Breast Date: 2012-01-27 Impact factor: 4.380

4. Mammography: interobserver variability in breast density assessment.

Authors: E A Ooms; H M Zonderland; M J C Eijkemans; M Kriege; B Mahdavian Delavary; C W Burger; A C Ansink
Journal: Breast Date: 2007-12 Impact factor: 4.380

5. Effects of Changes in BI-RADS Density Assessment Guidelines (Fourth Versus Fifth Edition) on Breast Density Assessment: Intra- and Interreader Agreements and Density Distribution.

Authors: Abid Irshad; Rebecca Leddy; Susan Ackerman; Abbie Cluver; Dag Pavic; Ahad Abid; Madelene C Lewis
Journal: AJR Am J Roentgenol Date: 2016-09-22 Impact factor: 3.959

6. Comparison of variability in breast density assessment by BI-RADS category according to the level of experience.

Authors: Hye-Joung Eom; Joo Hee Cha; Ji-Won Kang; Woo Jung Choi; Han Jun Kim; EunChae Go
Journal: Acta Radiol Date: 2017-08-02 Impact factor: 1.990

7. Breast density and parenchymal patterns as markers of breast cancer risk: a meta-analysis.

Authors: Valerie A McCormack; Isabel dos Santos Silva
Journal: Cancer Epidemiol Biomarkers Prev Date: 2006-06 Impact factor: 4.254

8. Volumetric breast density assessment: reproducibility in serial examinations and comparison with visual assessment.

Authors: J M Singh; E M Fallenberg; F Diekmann; D M Renz; R Witlandt; U Bick; F Engelken
Journal: Rofo Date: 2013-07-25

9. Inter- and intraradiologist variability in the BI-RADS assessment and breast density categories for screening mammograms.

Authors: A Redondo; M Comas; F Macià; F Ferrer; C Murta-Nascimento; M T Maristany; E Molins; M Sala; X Castells
Journal: Br J Radiol Date: 2012-09-19 Impact factor: 3.039

10. Factors contributing to mammography failure in women aged 40-49 years.

Authors: Diana S M Buist; Peggy L Porter; Constance Lehman; Stephen H Taplin; Emily White
Journal: J Natl Cancer Inst Date: 2004-10-06 Impact factor: 13.506
