
Application of deep learning to identify ductal carcinoma in situ and microinvasion of the breast using ultrasound imaging.

Meng Zhu¹, Yong Pi², Zekun Jiang³, Yanyan Wu⁴, Hong Bu⁵, Ji Bao⁵, Yujuan Chen⁶, Lijun Zhao⁷, Yulan Peng¹.

Abstract

Background: The treatment and prognosis of breast ductal carcinoma in situ (DCIS) with and without microinvasion (MIC) are different. Ultrasound imaging shows that DCIS is a heterogeneous breast tumor with diverse manifestations. In DCIS, the cancer cells are confined to the duct and do not penetrate the basement membrane; in MIC, the cancer cells penetrate the basement membrane and the maximum diameter of the largest invasive focus is less than or equal to 1 mm. This study was designed to evaluate how deep learning can be used to identify DCIS with MIC on ultrasound images.
Methods: The clinical and ultrasound data of 467 consecutive inpatients diagnosed with DCIS (213 with MIC) in West China Hospital of Sichuan University were collected from January 2013 to April 2019 and randomly apportioned to training and internal validation sets. An external validation set comprised data from Sichuan Provincial People's Hospital with 101 patients (33 with MIC) collected between January 2017 and December 2019. There were 2,492 original images; 66% of these were used to establish a model, and the remaining 34% were used to evaluate the model. Three experienced breast ultrasound clinicians analyzed the ultrasound images to establish a logistic regression model. Finally, the logistic regression model and five deep learning models (ResNet-50, ResNet-101, DenseNet-161, DenseNet-169, and Inception-v3) were compared and evaluated to assess their diagnostic efficiency when identifying MIC based on ultrasound image data.
Results: The characteristics of high nuclear grade (P<0.001), necrosis (P=0.006), estrogen receptor negative (ER-; P=0.003), progesterone receptor negative (PR-; P=0.001), human epidermal growth factor receptor 2 positive (HER2+; P=0.034), lymphatic metastasis (P=0.008), and calcification (P<0.001) all showed significant correlations with MIC. The Inception-v3 model achieved the best performance (P<0.05) in MIC identification. The area under the receiver operating characteristic curve (AUC) of the Inception-v3 model was 0.803 [95% confidence interval (CI): 0.709 to 0.878], with a classification accuracy of 0.766, a sensitivity of 0.767, and a specificity of 0.765.
Conclusions: Deep learning can be used to identify MIC of breast DCIS from ultrasound images. Models based on Inception-v3 can provide automated detection of DCIS with MIC from ultrasound images. 2022 Quantitative Imaging in Medicine and Surgery. All rights reserved.


Keywords: Breast; deep learning; ductal carcinoma in situ (DCIS); microinvasion; ultrasound image

Year: 2022 PMID: 36060588 PMCID: PMC9403599 DOI: 10.21037/qims-22-46

Source DB: PubMed Journal: Quant Imaging Med Surg ISSN: 2223-4306

Introduction

Ductal carcinoma in situ (DCIS) of the breast comprises abnormal proliferation of ductal epithelial cells without invasion of the basement membrane. Following a DCIS diagnosis, the 10-year breast cancer-specific mortality rate is 1.1% (1). However, some patients with breast DCIS also exhibit microinvasion (MIC) (2-5), where cancer cells within the duct break through the basement membrane and infiltrate the surrounding stroma with a diameter of no more than 1 mm (2-6). In the clinical and pathological anatomical stages, DCIS cases are given Tis (tumor in situ) classification (7). The invasive T1 tumors are subdivided into T1mi (invasive carcinoma of 1 mm or smaller), T1a (invasive carcinoma larger than 1 mm, up to and including 5 mm), T1b (invasive carcinoma larger than 5 mm, up to and including 10 mm), or T1c (invasive carcinoma larger than 10 mm, up to and including 20 mm) (7,8). DCIS lesions (with or without MIC) are divided into low, intermediate, and high grades. After breast-conserving surgery, high-risk DCIS patients (i.e., those with comedonecrosis or high nuclear grade) have a higher recurrence rate (9,10). High nuclear grade and comedonecrosis are more common in MIC patients (11-13). Overall, DCIS with MIC carries a nearly two-fold increase in mortality rate compared to DCIS without MIC (14). Patients with MIC are typically treated similarly to those with T1a breast cancer as recommended by current guidelines from the National Comprehensive Cancer Network (NCCN) (15). The clinical diagnosis of breast tumors is based principally on imaging examination, including mammography, magnetic resonance imaging (MRI), and ultrasound. Mammography is an important method for breast cancer screening in developed countries. However, for dense breast tissue without calcification, mammography is only useful for identifying lesions (16-19).
Auxiliary breast ultrasound screening can help detect mammographically occult breast cancers, with typical detection rates of between 0.8 and 10 cancers detected in every 1,000 women screened (20). Breast volumes in Chinese women are typically smaller and denser than the global average, making ultrasound a more suitable option for Chinese DCIS patients due to its higher sensitivity and specificity when used in dense breast tissue (21,22). Ultrasound provides higher sensitivity to MIC than mammography (23). Although MRI also provides high sensitivity, it is expensive, and there is conflicting evidence as to whether preoperative MRI is beneficial to DCIS patients (24-28). A recent systematic review by Canelo-Aybar et al. (29) showed that preoperative MRI did not reduce the rate of repeat surgery or mastectomy. With increasing access to imaging technology and screening, patients with DCIS can be identified earlier. The provision of ultrasound imaging findings of MIC in patients with DCIS provides surgeons with more information from which to make clinical decisions. Techniques such as sentinel lymph node biopsy (SLNB) may also be considered to determine axillary lymph node status (9). Determining the contour of a DCIS lesion using traditional manual data-driven methods is challenging because the heterogeneity of DCIS makes the boundary contour between the tumor and normal tissue unclear. Contour extraction is often based on a single image, while the diagnosis is typically based on multiple images of a lesion, which is more consistent with clinical practice. Deep neural networks (DNNs) are the most widely used models in medical image recognition and provide obvious advantages in the diagnosis of breast cancers. Impressive achievements have been made in automating the classification of breast lesions as benign or malignant, in predicting lymph node metastasis, and in simulating human decision-making (30-33).
The objectives of this study were to investigate the feasibility of using DNN models to identify MIC on ultrasound images of patients with DCIS and to evaluate the classification efficiency of deep learning using external verification. We present the following article in accordance with the TRIPOD reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-22-46/rc).
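The size cut-offs quoted in the Introduction for Tis, T1mi, T1a, T1b, and T1c can be written as a small lookup. This is only an illustrative sketch of those thresholds; the function name `t_category` and the ">T1" label for larger tumors are assumptions made here and are not taken from the study or from any staging software.

```python
def t_category(invasive_size_mm: float) -> str:
    """Map the largest invasive focus (in mm) to the T category quoted in the text.

    Hypothetical helper for illustration only; it is not a staging tool.
    """
    if invasive_size_mm < 0:
        raise ValueError("size must be non-negative")
    if invasive_size_mm == 0:
        return "Tis"    # pure DCIS: no invasion through the basement membrane
    if invasive_size_mm <= 1:
        return "T1mi"   # microinvasion: 1 mm or smaller
    if invasive_size_mm <= 5:
        return "T1a"    # larger than 1 mm, up to and including 5 mm
    if invasive_size_mm <= 10:
        return "T1b"    # larger than 5 mm, up to and including 10 mm
    if invasive_size_mm <= 20:
        return "T1c"    # larger than 10 mm, up to and including 20 mm
    return ">T1"        # beyond the T1 range described in the text


for size in (0, 0.8, 1, 3, 10, 20):
    print(size, t_category(size))
```

In the terms of this study, cases labeled Tis and T1mi were eligible, while T1a and above were excluded.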

Methods

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Committee on Biomedical Research of West China Hospital, Sichuan University (No. 2020-1219). The committee waived individual consent for this retrospective analysis because it only involved the use of ultrasound images and did not involve patients’ personal information.

Patient population

The medical records of consecutive patients with DCIS confirmed by surgical pathology were collected from January 2013 to April 2019 to form a training data set. The inclusion and exclusion criteria were based on the gold standard of pathology (Figure 1). The inclusion criteria were as follows: (I) the patient underwent an ultrasound examination in the 14 days before surgery; (II) the ultrasound images showed lesions that could be used for evaluation; (III) the patient underwent a simple mastectomy, and the lesions were completely removed for testing, after which the comprehensive pathological report confirmed either DCIS or DCIS MIC; (IV) if the patient had a multi-center or multi-focus DCIS, only the largest lesion was included in the analysis; and (V) the image background was uniform, with appropriate gain, and the lesion was clearly visible. The exclusion criteria were as follows: (I) any patient with invasive carcinoma of T1a or above; (II) any patient with a pathological diagnosis of DCIS or DCIS MIC where the tumor could not be located using ultrasound; and (III) the background quality of the ultrasound image was poor, such that the tumor could not be recognized. Subsequent comprehensive pathological reports indicated DCIS with and without MIC in 213 and 254 patients, respectively. External verification data were collected using the same methodology from Sichuan Provincial People’s Hospital, comprising 101 patients with DCIS (68 without MIC; 33 with MIC).

Figure 1

A flow chart of data collection. DCIS, ductal carcinoma in situ; MIC, microinvasion; DNN, deep neural network.

Pathological assessment

The criteria for MIC were the presence of invasive foci on continuous sections of tumor tissue and a maximum diameter of the largest invasion less than or equal to 1 mm, according to the American Joint Committee on Cancer (AJCC) guidelines (6). The final surgical pathological report recorded the nuclear grade (low, medium, or high), the presence/absence of necrosis, and the expression statuses of estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and the nuclear protein Ki-67. For the classification criteria of molecular markers, see Appendix 1.

Ultrasound image acquisition and visual inspection

Ultrasound images were acquired using machines equipped with a linear array high-frequency probe with a frequency range of 4 to 15 MHz, including the Philips (HD11, IU22, EPIQ5; Philips, Amsterdam, Netherlands), GE Logic E9 (GE Healthcare, Chicago, IL, USA), HI VISION Preirus (Hitachi Medical Corp., Tokyo, Japan), Esaote MyLab90 (Esaote, Maastricht, Holland), Supersonic Imagine (Aix-en-Provence, France), and Siemens ACUSON OXANA 2 (Siemens Healthineers, Erlangen, Germany). Ultrasound examination used a portable probe for radial scanning of the whole breast. The standard acquisition procedure for ultrasound images is to obtain longitudinal and transverse sections at the maximum diameter of the lesion with the lesion located in the middle of the image. Supplementary images are obtained that show any malignant signs of the lesion, such as calcification or structural distortion. For both internal and external verification of ultrasound images, the reading was conducted blind and the clinician reading the image was unaware of the clinicopathological status of the patient. Ultrasound images were assessed by three experienced breast radiologists (with 10, 12, and 30 years of experience), who reached a consensus through discussion. The clinicians read all acquired ultrasound images. Ultrasound characteristics were classified by gland background, phenotype, and calcification. The ultrasonic phenotype was summarized as either a mass or another type. A mass was defined as a lesion that could be recognized as a mass on the ultrasound images, including on the longitudinal and transverse sections. Depending on the mass composition, it was classified as either a solid texture mass or a cystic solid texture mass. Other types were characterized as having an indistinct hypoechoic area, dilated duct, architectural distortion, complicated cyst, or local thickening of the glands.

Image preprocessing

Only ultrasonographic images were input into the DNN. Data preprocessing included scaling, flipping, and rotating the original image to increase the diversity of the datasets and improve the robustness of the model (Figure 2). Images from the same patient showed similarities; therefore, all lesion images from the same patient were kept in the same subset rather than being divided across subsets. The DNN analysis was performed on the balanced data set. This comprised convolution feature extraction, nonlinear activation function mapping, pooling, and classification by the fully connected layers and the Softmax layer. The maximum characteristics of each image were recorded for each case. The results were obtained by combining four images.
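Keeping all images of one patient in the same subset amounts to splitting at the patient level rather than the image level. The sketch below illustrates that idea with the 8:2 ratio reported for the internal dataset; the function name `split_by_patient`, the fixed seed, and the integer patient IDs are assumptions for illustration, and the authors' actual randomization procedure is not described in the text.

```python
import random


def split_by_patient(patient_ids, train_fraction=0.8, seed=0):
    """Randomly split unique patient IDs so no patient appears in both subsets.

    Hypothetical helper; the seed and rounding rule are illustrative choices.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_train = int(train_fraction * len(unique_ids))  # rounds down
    return set(unique_ids[:n_train]), set(unique_ids[n_train:])


# 467 internal patients, identified here by placeholder integers.
train_ids, val_ids = split_by_patient(range(467))
print(len(train_ids), len(val_ids))
```

With 467 patients and a rounded-down 80% share, this yields subsets of 373 and 94 patients, the same sizes as the training and internal validation sets reported in the Results.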

Figure 2

System overview. Multiple images of the lesion were first randomly flipped and rotated to augment the data, and then the deep features were extracted through the feature extraction network. Extracted high-level features were then characterized by the max-pooling operator. Finally, the merged features were classified by a classification network.

The data collected from West China Hospital of Sichuan University (the internal dataset) were randomly divided into training and validation sets at a ratio of 8:2 (Figure S1) to test five different DNN models. The data collected from Sichuan Provincial People’s Hospital formed the external validation dataset. There were 2,492 original images from both hospitals (1,474 for DCIS; 1,018 for MIC). During the training and validation phases, the number of images per case was set at four. For a case with fewer than four images, existing images were randomly chosen to be duplicated to make the total up to four. Otherwise, four images were randomly selected from each case. After the image preprocessing, only four images were included in each case, making the total number of images 2,272.
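The rule for fixing every case at four images (duplicate at random when there are fewer, subsample at random when there are more) can be sketched as follows. The function name `pick_four` and the use of Python's `random` module are assumptions for illustration; the text does not say how the authors implemented the selection.

```python
import random


def pick_four(images, k=4, rng=None):
    """Return exactly k images for one case.

    Fewer than k available: keep all and duplicate randomly chosen ones.
    k or more available: draw k distinct images at random.
    Hypothetical helper written for illustration.
    """
    rng = rng or random.Random(0)
    images = list(images)
    if not images:
        raise ValueError("a case needs at least one image")
    if len(images) >= k:
        return rng.sample(images, k)
    extra = [rng.choice(images) for _ in range(k - len(images))]
    return images + extra


print(pick_four(["img_a", "img_b"]))             # padded by duplication
print(pick_four(["a", "b", "c", "d", "e", "f"]))  # subsampled
print(568 * 4)  # 568 cases at four images each
```

Four images for each of the 568 patients gives the 2,272 images stated above.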

Deep learning

The model input comprised multiple breast ultrasonography images of the same lesion, $X_n = \{x_1, \ldots, x_k, \ldots, x_K\}$. The output was the prediction $Y \in \{\text{MIC}, \text{non-MIC}\}$, where $X_n$ denotes the n-th set of images of the same lesion, containing K images, and $x_k$ indicates the k-th image in the set. Breast ultrasonography images were in RGB (red-green-blue) format with a size of 1,024×768×3 voxels. Original images were resized to 299×299×3 voxels using a bilinear interpolation algorithm from the Python Imaging Library (Secret Labs AB, Östergötland, Sweden) to save on computational resources without sacrificing model performance. Online data augmentation, including random horizontal flipping (probability 0.5) and random rotation by 0 to 180 degrees, was applied to resized images to augment the training data in real-time. The preprocessed image values were then divided by 255 to convert values into the range of 0.0 to 1.0. Multiple ImageNet pre-trained networks were explored and fine-tuned on the dataset, namely Inception-v3, ResNet-50, ResNet-101, DenseNet-161, and DenseNet-169, all of which were implemented using PyTorch version 0.4.1 (Meta AI, New York, NY, USA). Each DNN model was split into two parts, one for extracting features from input images and one for classification. The extraction process comprised aggregating high-level features using max aggregation operators (Figure 2). The classification process comprised transforming aggregated features into the classification space, Y. The DNN models were trained using Adadelta as the optimizer, with a learning rate of 0.1 and a weight decay rate of $10^{-4}$ for 200 epochs (34). The mini-batch size was fixed at 8. Dropout was applied to the last fully connected layer of the second sub-network to reduce overfitting, with a drop probability of 0.7. Finally, Inception-v3 was the highest performing network and was used in the final analysis (35).
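The two-part design described above (per-image feature extraction, max aggregation across the K images of a lesion, then classification) can be illustrated without a deep learning framework. The sketch below uses plain Python lists in place of network features; the toy feature vectors, weights, and the single linear layer are assumptions chosen for illustration and do not reproduce the authors' PyTorch models.

```python
import math


def max_aggregate(feature_vectors):
    """Element-wise maximum over the K per-image feature vectors of one lesion."""
    return [max(column) for column in zip(*feature_vectors)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_vectors, weights, biases):
    """Pool K feature vectors, apply one linear layer, return class probabilities."""
    pooled = max_aggregate(feature_vectors)
    logits = [sum(w * f for w, f in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Toy example: K = 4 images, 3 features each (made-up numbers).
features = [[0.1, 0.9, 0.2],
            [0.4, 0.3, 0.8],
            [0.7, 0.2, 0.1],
            [0.3, 0.5, 0.6]]
weights = [[1.0, -0.5, 0.3],   # hypothetical row for the MIC class
           [-1.0, 0.5, -0.3]]  # hypothetical row for the non-MIC class
biases = [0.0, 0.0]

print(max_aggregate(features))
print(classify(features, weights, biases))
```

Because the maximum is taken per feature, the pooled vector does not depend on the order of the four images, which suits a set of views of the same lesion.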

DNN performance evaluation and statistical analysis

From each sample in the evaluation set, the probability score of the Softmax function and corresponding tag value were recorded. The scores were sorted from high to low, and a threshold was set. If the probability that the sample belonged to the positive sample was greater than the threshold, it was considered a positive sample; otherwise, it was a negative sample. Each time a different threshold was selected, a set of true-positive and false-positive rates were obtained corresponding to points on the receiver operating characteristic (ROC) curve. A cut-off value for the median score of 0.5 was used. The ROC curve was plotted, and the area under the ROC curve (AUC) was calculated. The accuracy was then calculated. Statistical analysis was performed in SPSS version 22.0 (IBM Corp., Armonk, NY, USA). The characteristics of the training, internal, and external verification sets were compared using a t-test and the chi-square test/Fisher exact test. Continuous variables were described using mean ± standard deviation. Categorical variables were described using proportion (%). The clinical pathology and ultrasound characteristics were analyzed using univariate logistic regression to identify MIC. Statistical significance was considered at a threshold of P<0.05 (two-tailed).
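The threshold sweep behind the ROC curve can be sketched in a few lines. The labels and scores below are made up for illustration, the function names are hypothetical, and the area is computed with the trapezoidal rule, which is one common choice; the text does not state which software routine the authors used for the AUC.

```python
def roc_points(labels, scores):
    """Sweep thresholds from high to low and collect (FPR, TPR) points.

    A sample is called positive when its score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: 1 = MIC, 0 = DCIS without MIC (illustrative values only).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print(round(auc(points), 3))
```

In this toy set, 8 of the 9 positive-negative pairs are ranked correctly, so the area is 8/9.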

Results

Clinical characteristics

Between January 2013 and April 2019, a total of 3,019 patients underwent surgery with a pathological diagnosis of DCIS, and 2,532 patients were excluded from the study due to the presence of an invasive carcinoma of T1a grade or above. A total of 12 patients were excluded due to negative ultrasound findings, and 8 cases were excluded due to poor ultrasound image quality, leaving 467 patients in the internal dataset. Between January 2017 and December 2019, 101 patients with DCIS with or without MIC were finally reported by pathology after surgery in Sichuan Provincial People’s Hospital. All 568 patients included in the study were women, 322 (56.7%) without MIC, and 246 (43.3%) with MIC. The training, internal evaluation, and external verification sets consisted of 373, 94, and 101 patients, respectively, with ages 48.79±11.12 (range, 21 to 83) years, 46.64±10.71 (range, 21 to 81) years, and 45.41±8.85 (range, 25 to 70) years, respectively. The characteristics of the training, internal verification, and external verification datasets are shown in Table 1. Age (P=0.086), menstrual status (P=0.158), histology (P=0.977), gland background (P=0.949), phenotype (P=0.751), and calcification (P=0.407) did not differ between the three datasets. Nuclear grade (P=0.115), necrosis (P=0.930), ER (P=0.233), PR (P=0.316), HER2 (P=0.184), Ki-67 (P=0.103), and lymphatic metastasis (P=0.663) did not differ between the training and internal validation sets.

Table 1

Patient demographic and ultrasonic visual characteristics

Characteristics	Training (n=373), n (%)	Internal validation (n=94), n (%)	External validation (n=101), n (%)	P value*
Age, mean ± SD [range]	48.79±11.12 [21–83]	46.64±10.71 [21–81]	45.41±8.85 [25–70]	0.086
Menstrual status				0.158
Premenopausal	216 (57.91)	62 (65.96)	71 (70.30)
Postmenopausal	157 (42.09)	32 (34.04)	30 (29.70)
Histology				0.977
DCIS	203 (54.42)	51 (54.26)	68 (67.33)
DCIS MIC	170 (45.58)	43 (45.74)	33 (32.67)
Nuclear grade				0.115
High	220 (58.98)	47 (50.00)	NA
Low and medium	148 (39.70)	46 (48.94)	NA
Null	5 (1.32)	1 (1.06)	NA
Necrosis				0.930
Yes	123 (32.96)	32 (34.04)	NA
No	245 (65.68)	61 (64.89)	NA
Null	5 (1.36)	1 (1.07)	NA
ER				0.233
Positive	174 (46.65)	45 (47.87)	NA
Negative	115 (30.83)	33 (35.11)	NA
Null	84 (22.52)	16 (17.02)	NA
PR				0.316
Positive	155 (41.55)	42 (44.68)	NA
Negative	133 (35.66)	36 (38.30)	NA
Null	85 (22.79)	16 (17.02)	NA
HER2				0.184
Negative	171 (45.84)	49 (52.13)	NA
Positive	118 (31.64)	29 (30.85)	NA
Null	84 (22.52)	16 (17.02)	NA
Ki-67				0.103
Low expression	184 (49.33)	55 (58.51)	NA
High expression	105 (28.15)	23 (24.47)	NA
Null	84 (22.52)	16 (17.02)	NA
Lymphatic metastasis				0.663
Yes	6 (1.60)	1 (1.11)	NA
No	367 (98.40)	93 (98.89)	NA
Gland background				0.949
Homogeneous	201 (53.89)	51 (54.26)	46 (45.54)
Heterogeneous	172 (46.11)	43 (45.74)	55 (54.46)
Phenotype				0.751
Mass	148 (39.70)	39 (41.49)	51 (50.50)
Other type**	225 (60.30)	55 (58.51)	50 (49.50)
Calcification				0.407
Present	232 (62.20)	54 (57.45)	63 (62.38)
Absent	141 (37.80)	40 (42.55)	38 (37.62)

*, T-test between training and internal validation cohort. **, Other type includes indistinct hypoechoic area; dilated duct; architectural distortion; complicated cyst; and local thickening of glands. SD, standard deviation; DCIS, ductal carcinoma in situ; MIC, microinvasion; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2; NA, not available.

Univariate analysis of characteristics and ultrasound visual inspection

The clinicopathological data of patients treated in the external hospital were not available; therefore, the univariate analysis was only performed on patients recruited through the internal hospital. High nuclear grade (P<0.001), necrosis (P=0.006), an ER negative result (P=0.003), a PR negative result (P=0.001), an HER2 positive result (P=0.034), lymphatic metastasis (P=0.008), and calcification (P<0.001) were found to be predictive of MIC (Table 2). Age (P=0.989), menstrual status (P=0.200), Ki-67 (P=0.091), gland background (P=0.096), and phenotype (P=0.050) were not found to have any predictive value.

Table 2

Univariate analysis of demographic and ultrasonic visual characteristics

Characteristics	Group	DCIS (n=254), n (%)	MIC (n=213), n (%)	P value
Age (years)	<40	44 (17.32)	37 (17.37)	0.989
Age (years)	≥40	210 (82.68)	176 (82.63)	0.989
Menstrual status	Premenopausal	158 (62.20)	120 (56.34)	0.200
Menstrual status	Postmenopausal	96 (37.80)	93 (43.66)	0.200
Nuclear grade	Low and medium	145 (57.09)	49 (23.00)	<0.001
	High	103 (40.55)	164 (77.00)
	Null	6 (2.36)	0 (0.00)
Necrosis	No	184 (72.44)	122 (57.28)	0.006
	Yes	64 (25.20)	91 (42.72)
	Null	6 (2.36)	0 (0.00)
ER	Negative	63 (24.80)	85 (39.91)	0.003
	Positive	131 (51.57)	88 (41.31)
	Null	60 (23.63)	40 (18.78)
PR	Negative	70 (27.56)	99 (46.48)	0.001
	Positive	123 (48.43)	74 (34.74)
	Null	61 (24.01)	40 (18.78)
HER2	Negative	143 (56.30)	77 (36.15)	0.034
	Positive	51 (20.08)	96 (45.07)
	Null	60 (23.62)	40 (18.78)
Ki-67 level	Low	150 (59.06)	89 (41.78)	0.091
	High	44 (17.32)	84 (39.44)
	Null	60 (23.62)	40 (18.78)
Lymphatic metastasis	No	254 (100)	206 (96.71)	0.008
Lymphatic metastasis	Yes	0 (0.00)	7 (3.29)	0.008
Gland background	Homogeneous	146 (57.48)	106 (49.77)	0.096
Gland background	Heterogeneous	108 (42.52)	107 (50.23)	0.096
Phenotype	Mass	112 (44.09)	75 (35.21)	0.050
Phenotype	Other types	142 (55.91)	138 (64.79)	0.050
Calcification	Absent	132 (51.97)	49 (23.00)	<0.001
Calcification	Present	122 (48.03)	164 (77.00)	<0.001

DCIS, ductal carcinoma in situ; MIC, microinvasion; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2.
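As a rough cross-check of one row of Table 2, the calcification counts can be put through a Pearson chi-square test on the 2×2 table. This is a substituted technique used here for illustration: the study reports univariate logistic regression, so the statistic below is not expected to reproduce the authors' exact P value, only its order of magnitude. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom, using the
    identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Calcification row of Table 2:
#              DCIS   MIC
# Absent        132    49
# Present       122   164
chi2, p_value = chi_square_2x2(132, 49, 122, 164)
print(round(chi2, 2), p_value)
```

The statistic comes out at roughly 41 with a P value far below 0.001, in line with the P<0.001 reported for calcification.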

Deep learning models

The network architecture of the Inception-v3 model is shown in Table S1. The AUCs, classification accuracy, sensitivity, and specificity of the DNN models are shown in Table 3. The ROC curves of the DNN models on the internal and external datasets are shown in Figure 3. In the internal evaluation, the 5 DNN models achieved classification results with AUCs ranging from 0.740 to 0.803. The Inception-v3 model showed the best classification performance with an AUC of 0.803 [95% confidence interval (CI): 0.709 to 0.878] and a higher classification accuracy than that of the other four deep learning models. The DenseNet-169 model had the highest sensitivity, at 79.1%. The specificities of the ResNet-101, DenseNet-161, and Inception-v3 models reached 76.5%. The AUC of the logistic regression model in the internal test set was 0.740 (95% CI: 0.685 to 0.757). The classification ability of the logistic regression model was consistent with that of ResNet-50. The accuracy and specificity of the logistic regression model were lower than those of the five deep learning models; however, its sensitivity was higher than that of ResNet-101 and DenseNet-161 (72.5% vs. 67.4% vs. 67.4%). In the external test, the AUCs of the 5 DNN models ranged from 0.614 to 0.696. The DenseNet-161 model showed the best classification performance, with an AUC of 0.696 (95% CI: 0.597 to 0.784). Inception-v3 had a classification accuracy of 74.3% and a sensitivity of 69.7%. Both ResNet-50 and DenseNet-161 had a specificity of 79.4%.

Table 3

Results of deep learning and logistic regression models

Models	Accuracy	Sensitivity	Specificity	F1	AUC
Internal validation
ResNet-50	0.713	0.767	0.667	0.769	0.740 (0.639–0.825)
ResNet-101	0.723	0.674	0.765	0.704	0.757 (0.657–0.839)
DenseNet-161	0.723	0.674	0.765	0.704	0.746 (0.646–0.830)
DenseNet-169	0.745	0.791	0.706	0.795	0.768 (0.669–0.849)
Inception-v3	0.766	0.767	0.765	0.781	0.803 (0.709–0.878)
Logistic regression	0.691	0.725	0.651	0.712	0.740 (0.685–0.757)
External validation
ResNet-50	0.713	0.545	0.794	0.642	0.670 (0.570–0.761)
ResNet-101	0.693	0.545	0.765	0.640	0.653 (0.551–0.745)
DenseNet-161	0.703	0.515	0.794	0.618	0.696 (0.597–0.784)
DenseNet-169	0.703	0.576	0.765	0.665	0.614 (0.512–0.709)
Inception-v3	0.743	0.697	0.765	0.761	0.685 (0.585–0.774)

AUC, the area under the curve.
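The accuracy, sensitivity, and specificity in Table 3 follow directly from confusion-matrix counts. The sketch below shows the arithmetic for Inception-v3. The counts are back-calculated here from the reported rates and the group sizes (43 MIC and 51 DCIS cases in internal validation; 33 MIC and 68 DCIS cases in external validation); they are an assumption of this example and were not read from the published confusion matrices.

```python
def metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true-positive rate among MIC cases
        "specificity": tn / (tn + fp),   # true-negative rate among DCIS-only cases
    }


# Back-calculated counts (assumed), MIC taken as the positive class.
internal = metrics(tp=33, fn=10, tn=39, fp=12)   # 94 patients
external = metrics(tp=23, fn=10, tn=52, fp=16)   # 101 patients

for name, result in (("internal", internal), ("external", external)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Under these assumed counts the rounded values match the Inception-v3 rows of Table 3, and the 10 false negatives plus 12 false positives account for 22 internal misclassifications.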

Figure 3

ROC curves of the 5 DNN model subsets with AUCs: (A) Internal validation of 94 patients. AUCs for ResNet-50, ResNet-101, DenseNet-161, and DenseNet-169, and Inception-v3 were 0.740, 0.757, 0.746, 0.768, and 0.803, respectively. (B) External validation of 101 patients. AUCs of ResNet-50, ResNet-101, DenseNet-161, DenseNet-169, and Inception-v3 were 0.670, 0.653, 0.696, 0.614, and 0.685, respectively. ROC, receiver operating characteristic; DNN, deep neural network; AUC, the area under the curve.

The confusion matrices of the internal validation of 94 patients from West China Hospital of Sichuan University and the external validation of 101 patients from Sichuan Provincial People’s Hospital are shown in Figure 4, including the numbers of correct and incorrect classifications made by each DNN model for DCIS with and without MIC. Inception-v3 showed the best performance in the internal test set. Of the 22 cases that were incorrectly classified by Inception-v3, 12 manifested hypoechoic areas and 10 manifested mass structures on the ultrasound images (8 solid; 2 cystic solid); 3 MIC cases among those 10 had calcification. Figure 5 shows some examples of Inception-v3 identifying the probability of MIC.

Figure 4

Confusion matrix for internal verification and external verification tests to determine the presence or absence of MIC using DNN models, including: (A,B) Inception-v3; (C,D) ResNet-50; (E,F) ResNet-101; (G,H) DenseNet-161; and (I,J) DenseNet-169. DCIS, ductal carcinoma in situ; MIC, microinvasion; DNN, deep neural network.

Figure 5

Examples of patients with or without MIC correctly or incorrectly classified by Inception-v3. (A) A 46-year-old DCIS (without MIC) patient with a 24-mm diameter solid mass without calcification. The probability of this patient being correctly diagnosed is 93.95%. (B) A 42-year-old MIC patient with an irregular hypoechoic area with complex internal echo and some calcification foci. The probability of being correctly diagnosed as MIC is 100%. (C) A 50-year-old DCIS (without MIC) patient with an indistinct hypoechoic area. The boundary with the adjacent surrounding glandular tissue is unclear. The probability of this patient being correctly diagnosed is 5.95%. (D) A 44-year-old MIC patient with an irregular hypoechoic mass. The probability of being correctly diagnosed as MIC is 9.11%. MIC, microinvasion; DCIS, ductal carcinoma in situ.

Discussion

This study established that a DNN model can be used to identify MIC in DCIS based on ultrasound images. The most accurate DNN model was Inception-v3, which showed 76.6% accuracy and an AUC of 0.803. In the internal test, the AUC of Inception-v3 was higher than that of the logistic model (0.803 vs. 0.740). Internal validation showed the AUCs of the 5 DNN models to range from 0.740 to 0.803, while the AUCs of the external validation ranged from 0.614 to 0.696. The best AUC in the external validation (0.696; DenseNet-161) was surpassed by that of Inception-v3 in the internal validation (0.803). The better performance of Inception-v3 was mainly because the multiple operations in the Inception block (e.g., convolution kernel sizes of 1×1, 3×3, and 5×5) facilitated the extraction of variable features on multiple scales. Deep learning has been used to detect orthotopic cancer lesions, predict the risk of invasive cancer, and identify high-risk populations based on imaging features. Recently, deep learning has been used to differentiate atypical ductal hyperplasia from DCIS (36,37) with AUC values of 0.86 (using a total of 298 images from 149 patients) and 0.90 (a total of 280 images from 140 patients). Mutasa et al. (38) predicted DCIS and invasive cancer with an AUC of 0.71 (246 images from 123 patients), and Shi et al. (39) predicted occult invasion from pure DCIS with an AUC of 0.70 (99 patients). These findings were based on mammography alone and have not been externally verified. The present study is the first to report the use of artificial intelligence, specifically deep learning, to identify MIC on ultrasound images. DNN models were found to have great potential for diagnostic utility. The MIC type of cancer is a minimally invasive breast cancer representing approximately 0.9% of breast cancer diagnoses (40). Previous investigations identifying MIC have been based on clinicopathological factors or imaging characteristics. 
Overall, the histological grade of DCIS with MIC was higher than that of DCIS without MIC (13,22,41,42). Calcification on ultrasound images is more often seen in cases of MIC than in DCIS without MIC (22,41). The expression of molecular markers varies across different studies. Treatment modalities and prognoses of MIC patients were found to be similar to those in invasive breast cancer (42). In the present study, high nuclear grade, necrosis, ER negativity, PR negativity, HER2 positivity, lymph node metastasis, and calcification were all found to be more common in MIC patients (all P<0.05). We used a logistic model combined with clinicopathological factors and ultrasound features to identify MIC with an accuracy comparable to that of DNN and an AUC of 0.740 (95% CI: 0.685 to 0.757). For invasive carcinoma or extensive DCIS patients, SLNB has been a standard surgical technique (43). Axillary lymph node positivity was found almost exclusively in patients with MIC (13,41,44). Based on a large dataset, 7.6% of MIC had lymph node metastases (45). In the present study, 7 of 213 MIC patients had lymph node metastasis, representing a rate of 3.3%. Future studies of surgical planning may assess the use of SLNB in MIC patients identified by DNN models. As reported in this study, many DCIS patients exhibit a heterogeneous glandular background and diverse imaging manifestations, most of which are non-mass structures. If these features are extracted manually using conventional methods, it may be difficult to determine the tumor boundary and range; this decreases the accuracy of the extracted contour and the tumor size estimation. The DNN models perform reliably as they do not require artificial computing features. They can be used to objectively identify important tumor features while eliminating misinterpretation and human error, thereby reducing the clinical workload (46,47). 
The DNN algorithms are also more robust and direct because they omit unnecessary steps during learning. Subsequent studies using new algorithms and larger sample sizes may renew interest in machine learning (48). Further, ultrasound is easy to operate, and images are easily acquired; it is therefore important that research establishes the suitability of ultrasonography for DNN systems.
There are several limitations to this study. First, only ultrasound data were used, while mammography and MRI data were excluded. Second, the dataset was relatively small, and the classification performance of the experimental model did not reach the expected level (the classification accuracy was less than 80%). DNN models benefit greatly from large data sets; more data from multiple hospitals may improve the performance of the deep learning model. Third, as this was a retrospective study, the number of ultrasound images per patient was uneven. Only four images per patient were selected as input, which cannot fully represent all the features of the tumor. Future studies could use fully automated volume imaging and three-dimensional ultrasound to obtain more comprehensive characteristics. Finally, the logistic regression model was not applied in the external validation because clinicopathological data were unavailable.
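The text does not say how the outputs for the four images of a patient were combined into a single patient-level result. One common option, assumed here purely for illustration, is to average the per-image probabilities and apply a threshold. The probabilities, patient identifiers, the function name `patient_level_call`, and the 0.5 threshold below are all hypothetical.

```python
from statistics import mean
from typing import Dict, List


def patient_level_call(image_probs: List[float], threshold: float = 0.5) -> bool:
    """Return True (MIC suspected) if the mean per-image probability reaches the threshold.

    Mean pooling is an assumption made for this sketch, not the study's stated method.
    """
    if not image_probs:
        raise ValueError("at least one image probability is required")
    return mean(image_probs) >= threshold


# Hypothetical per-image MIC probabilities, four images per patient.
hypothetical_probs: Dict[str, List[float]] = {
    "patient_A": [0.82, 0.64, 0.71, 0.58],
    "patient_B": [0.12, 0.35, 0.41, 0.20],
}
calls = {pid: patient_level_call(probs) for pid, probs in hypothetical_probs.items()}
```

Averaging treats every image equally; taking the maximum instead would make the patient-level call more sensitive to a single suspicious view, at the cost of more false positives.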

Conclusions

We implemented deep learning to detect MIC on ultrasound images. The DNN models provided accurate and robust performance. These findings suggest that DNN models may accurately identify MIC in DCIS using ultrasound images and have the potential to provide an objective auxiliary diagnostic method for clinicians. In particular, networks based on the Inception-v3 model may perform best for these and similar applications.

43 in total

1. Automatic classification of ultrasound breast lesions using a deep convolutional neural network mimicking human decision-making.

Authors: Alexander Ciritsis; Cristina Rossi; Matthias Eberhard; Magda Marcon; Anton S Becker; Andreas Boss
Journal: Eur Radiol Date: 2019-03-29 Impact factor: 5.315

Review 2. American Joint Committee on Cancer's Staging System for Breast Cancer, Eighth Edition: What the Radiologist Needs to Know.

Authors: Sirishma Kalli; Alan Semine; Sara Cohen; Stephen P Naber; Shital S Makim; Manisha Bahl
Journal: Radiographics Date: 2018-09-28 Impact factor: 5.333

3. Ductal Carcinoma In Situ with Microinvasion on Core Biopsy: Evaluating Tumor Upstaging Rate, Lymph Node Metastasis Rate, and Associated Predictive Variables.

Authors: April Phantana-Angkool; Amy E Voci; Yancey E Warren; Chad A Livasy; Lakesha M Beasley; Myra M Robinson; Lejla Hadzikadic-Gusic; Terry Sarantou; Meghan R Forster; Deba Sarma; Richard L White
Journal: Ann Surg Oncol Date: 2019-07-24 Impact factor: 5.344

4. Microinvasive carcinoma (T1mic) of the breast: clinicopathologic profile of 21 cases.

Authors: M L Prasad; M P Osborne; D D Giri; S A Hoda
Journal: Am J Surg Pathol Date: 2000-03 Impact factor: 6.394

5. Microinvasive breast carcinoma carries an excellent prognosis regardless of the tumor characteristics.

Authors: Lamis Shatat; Nika Gloyeske; Rashna Madan; Maura O'Neil; Ossama Tawfik; Fang Fan
Journal: Hum Pathol Date: 2013-09-24 Impact factor: 3.466

6. Adjunct Screening With Tomosynthesis or Ultrasound in Women With Mammography-Negative Dense Breasts: Interim Report of a Prospective Comparative Trial.

Authors: Alberto S Tagliafico; Massimo Calabrese; Giovanna Mariscotti; Manuela Durando; Simona Tosto; Francesco Monetti; Sonia Airaldi; Bianca Bignotti; Jacopo Nori; Antonella Bagni; Alessio Signori; Maria Pia Sormani; Nehmat Houssami
Journal: J Clin Oncol Date: 2016-03-09 Impact factor: 44.544

Review 7. Ductal carcinoma in situ of the breast: a systematic review of incidence, treatment, and outcomes.

Authors: Beth A Virnig; Todd M Tuttle; Tatyana Shamliyan; Robert L Kane
Journal: J Natl Cancer Inst Date: 2010-01-13 Impact factor: 13.506

8. Ductal carcinoma in situ diagnosed at US-guided 14-gauge core-needle biopsy for breast mass: preoperative predictors of invasive breast cancer.

Authors: Ah Young Park; Hye Mi Gweon; Eun Ju Son; Miri Yoo; Jeong-Ah Kim; Ji Hyun Youk
Journal: Eur J Radiol Date: 2014-01-20 Impact factor: 3.528

9. A Nomogram to Predict Factors Associated with Lymph Node Metastasis in Ductal Carcinoma In Situ with Microinvasion.

Authors: Jessica C Gooch; Freya Schnabel; Jennifer Chun; Elizabeth Pirraglia; Andrea B Troxel; Amber Guth; Richard Shapiro; Deborah Axelrod; Daniel Roses
Journal: Ann Surg Oncol Date: 2019-09-16 Impact factor: 5.344

10. The Demographic Features, Clinicopathological Characteristics and Cancer-specific Outcomes for Patients with Microinvasive Breast Cancer: A SEER Database Analysis.

Authors: Wenna Wang; Wenjie Zhu; Feng Du; Yang Luo; Binghe Xu
Journal: Sci Rep Date: 2017-02-06 Impact factor: 4.379