Dataset of breast ultrasound images.

Walid Al-Dhabyani, Mohammed Gomaa, Hussien Khaled, Aly Fahmy.

Abstract

Breast cancer is one of the most common causes of death among women worldwide. Early detection helps reduce the number of early deaths. The data presented in this article consist of medical images of breast cancer acquired by ultrasound scan. The Breast Ultrasound Dataset is categorized into three classes: normal, benign, and malignant images. When combined with machine learning, breast ultrasound images can produce great results in classification, detection, and segmentation of breast cancer.
© 2019 The Authors.

Keywords:  Breast cancer; Classification; Dataset; Deep learning; Detection; Medical images; Segmentation; Ultrasound

Year:  2019        PMID: 31867417      PMCID: PMC6906728          DOI: 10.1016/j.dib.2019.104863

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Data

The data collected at baseline consist of breast ultrasound images of women aged 25 to 75 years. The data were collected in 2018 from 600 female patients. The dataset consists of 780 images with an average image size of 500 × 500 pixels. The images are in PNG format and are categorized into three classes: normal, benign, and malignant. The number of images in each class is shown in Table 1. Data samples are illustrated in Fig. 1. Samples of the original images and of the images after preprocessing are shown in Fig. 2 and Fig. 3, respectively. Furthermore, each image has its own ground truth (mask image), as shown in Fig. 4.
Table 1. The three classes of breast cases and the number of images in each case.

Case         Number of images
Benign       437
Malignant    210
Normal       133
Total        780

Fig. 1. Samples of the breast ultrasound image dataset.

Fig. 2. Samples of the original breast ultrasound images (original images as scanned by the LOGIQ E9 ultrasound system).

Fig. 3. Samples of the breast ultrasound images after refining.

Fig. 4. Samples of breast ultrasound images and their ground truth images.

Experimental design, materials, and methods

Dataset collection

Ultrasound (US) images are generally grayscale. The images were collected and stored in DICOM format at Baheya Hospital. Collecting and annotating the images took about one year. The US dataset is categorized into three classes: normal, benign, and malignant. Initially, 1100 images were collected. After preprocessing, the number of images was reduced to 780. The original images contain unimportant information that is not used for mass classification and that may affect the results of the training process. The instruments used in the scanning process were the LOGIQ E9 ultrasound system and the LOGIQ E9 Agile ultrasound system. These instruments are commonly used for high-end imaging in radiology, cardiac, and vascular applications. They produce images with a resolution of 1280 × 1024 pixels. The transducers are 1–5 MHz on the ML6-15-D Matrix linear probe. Fig. 2 illustrates a sample of the original scanned images.

Preprocessing

Several tasks were performed to make the dataset useful. The data included duplicated images, which were removed. Furthermore, radiologists from Baheya Hospital reviewed the annotations and corrected those that were incorrect. DICOM images were converted to PNG format using a DICOM converter application [2]. After refining the dataset, the number of US images was reduced to 780. The images are categorized into three classes (cases): normal, benign, and malignant. All images were cropped to different sizes to remove unused and unimportant boundaries; we used fast photo crop [3] for this task. The image annotation is added to the image name. Specialist radiologists at Baheya Hospital reviewed and checked all images. An example of the refined images is shown in Fig. 3.
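
The article does not say how duplicated images were identified. As an illustration only, the following minimal sketch shows one common way to flag exact duplicates: grouping files by a hash of their bytes. The function name `find_duplicate_files` and the folder layout are assumptions made for this example, not part of the published workflow.

```python
import hashlib
from pathlib import Path


def find_duplicate_files(root):
    """Group files under `root` by the SHA-256 digest of their bytes.

    Returns a list of groups (each a list of paths in sorted order) whose
    files have identical content. Files with unique content are omitted.
    Hypothetical helper: the authors' actual procedure is not described.
    """
    by_digest = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest.setdefault(digest, []).append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A byte-level hash only catches exact copies. Images that were re-exported or cropped differently would need a perceptual comparison or manual review, such as the radiologist review described above.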

Ground truth

Ground truth (image boundary) annotation was performed to make the ultrasound dataset more useful. Matlab [4] was used to perform this step. A freehand segmentation was drawn for each image separately. An example of the mask images is shown in Fig. 4. Three folders were created, one for each of the three classes, and each folder contains the images of its class. The image name includes the name of the class and the number of the image. The mask image has the same name as the corresponding US image, with “_mask” appended to the end of the name.
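
The folder and naming convention described here can be used to pair each image with its mask. The sketch below assumes that the three folders are named `normal`, `benign`, and `malignant`, that files use the `.png` extension, and that a mask is named by appending `_mask` to the image's base name. The exact folder and file names in the released archive may differ, so treat these names as placeholders.

```python
from pathlib import Path

# Assumed folder names and mask suffix; adjust to match the downloaded archive.
CLASSES = ("normal", "benign", "malignant")
MASK_SUFFIX = "_mask"


def pair_images_with_masks(root, classes=CLASSES, ext=".png"):
    """Return a list of (class_name, image_path, mask_path) tuples.

    mask_path is None when no file with the expected mask name exists.
    """
    pairs = []
    for class_name in classes:
        folder = Path(root) / class_name
        for image in sorted(folder.glob(f"*{ext}")):
            if image.stem.endswith(MASK_SUFFIX):
                continue  # this file is itself a mask, not a source image
            mask = image.with_name(image.stem + MASK_SUFFIX + ext)
            pairs.append((class_name, image, mask if mask.exists() else None))
    return pairs
```

Counting the first element of each tuple, for example with `collections.Counter(c for c, _, _ in pairs)`, gives the number of images per class, which can be compared against Table 1.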

Ethical considerations

The researchers are mindful that patients have a right to be protected from public scrutiny of their private lives and illness. To this end, the researchers ensured that the patients and the hospital were adequately informed about the objective of this study. In addition, every patient's data remain anonymous, and his or her illness status is treated with the utmost confidentiality.

Specifications Table

Subject area: Medicine and Dentistry
More specific subject area: Radiology and Imaging
Type of data: Images and mask images
How data was acquired: LOGIQ E9 ultrasound and LOGIQ E9 Agile ultrasound system
Data format: PNG
Experimental factors: All images are classified as normal, benign and malignant
Experimental features: When medical images are used for training deep learning models, they provide fast and accurate results in classification, detection, and segmentation of breast cancer.
Data source location: Baheya Hospital for Early Detection & Treatment of Women's Cancer, Cairo, Egypt.
Data accessibility: https://scholar.cu.edu.eg/?q=afahmy/pages/dataset
Related research article: Walid Al-Dhabyani, Mohammed Gomaa, Hussien Khaled and Aly Fahmy, Deep Learning Approaches for Data Augmentation and Classification of Breast Masses using Ultrasound Images [1]

Value of the Data

Ultrasound scan is mostly used for examination and early detection of breast cancer. Moreover, it is safe in comparison to other radiology imaging techniques.

The Breast Ultrasound dataset can be used to train machine learning models that classify, detect, and segment early signs of masses or micro-calcification in breast cancer.

Researchers interested in classification, detection, and segmentation of breast cancer can use these breast ultrasound images, combine them with other datasets, and analyze them for further insights.

The data are comprehensive, covering the three breast cancer states (normal, benign, and malignant).

This dataset is, to the best of our knowledge, the first publicly available breast ultrasound dataset.
