Jialiang Yang1,2, Jie Ju3, Lei Guo4, Binbin Ji2, Shufang Shi2,5, Zixuan Yang3, Songlin Gao3, Xu Yuan2, Geng Tian2, Yuebin Liang2, Peng Yuan6.
Abstract
HER2-positive breast cancer is a highly heterogeneous tumor, and about 30% of patients still suffer recurrence and metastasis after trastuzumab-targeted therapy. Predicting individual prognosis is therefore of great significance for the further development of precision therapy. With the continuous development of computer technology, increasing attention has been paid to computer-aided diagnosis and prognosis prediction based on hematoxylin and eosin (H&E) pathological images, which are available for all breast cancer patients who have undergone surgical treatment. In this study, we first enrolled 127 HER2-positive breast cancer patients with known recurrence and metastasis status from the Cancer Hospital of the Chinese Academy of Medical Sciences. We then proposed a novel multimodal deep learning method integrating whole slide H&E images (WSIs) and clinical information to accurately assess the risk of relapse and metastasis in patients with HER2-positive breast cancer. Specifically, we obtained whole H&E-stained images from the surgical specimens of breast cancer patients, and these images were adjusted to a size of 512 × 512 pixels. A deep convolutional neural network (CNN) was applied to these images to extract image features, which were combined with the clinical data. Based on the combined features, a novel multimodal model was constructed to predict the prognosis of each patient. The model achieved an area under the curve (AUC) of 0.76 in two-fold cross-validation (CV). To further evaluate the performance of our model, we downloaded the data of all 123 HER2-positive breast cancer patients with available H&E images and known recurrence and metastasis status from The Cancer Genome Atlas (TCGA), which served as an independent testing data set. Despite the large differences in race and experimental strategies, our model achieved an AUC of 0.72 on the TCGA samples. In conclusion, H&E images, in conjunction with clinical information and advanced deep learning models, could be used to evaluate the risk of relapse and metastasis in patients with HER2-positive breast cancer.
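The fusion step described in the abstract (image features combined with clinical data) can be illustrated with a minimal sketch. The abstract does not say how the clinical variables were encoded or how the two feature sets were joined, so the encoding below (a binary age flag, one-hot tumor stage, binary receptor and lymph node status) and the plain concatenation are assumptions, and the function names `encode_clinical` and `fuse` are hypothetical.

```python
from typing import List, Sequence

STAGES = ("I", "II", "III")  # tumor stages listed in the clinical summary table


def encode_clinical(age: int, stage: str, pr_positive: bool,
                    er_positive: bool, lymph_node_positive: bool) -> List[float]:
    """Turn clinical variables into numbers (assumed encoding, not from the source)."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    stage_one_hot = [1.0 if stage == s else 0.0 for s in STAGES]
    return [
        1.0 if age >= 50 else 0.0,   # age group, split at 50 as in the table
        *stage_one_hot,
        float(pr_positive),
        float(er_positive),
        float(lymph_node_positive),
    ]


def fuse(image_features: Sequence[float], clinical_features: Sequence[float]) -> List[float]:
    """Concatenate the two feature vectors into one input for the downstream model."""
    return list(image_features) + list(clinical_features)


# Made-up numbers standing in for a CNN feature vector (real vectors are much longer).
cnn_vector = [0.12, 0.0, 0.87, 0.45]
fused = fuse(cnn_vector, encode_clinical(62, "II", True, False, True))
```

Concatenation is only the simplest possible fusion; the authors' model may weight or transform the two modalities differently.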
Keywords: Breast cancer; Convolutional neural network; H&E-stained histological images; HER2; Recurrence
Year: 2021 PMID: 35035786 PMCID: PMC8733169 DOI: 10.1016/j.csbj.2021.12.028
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Summary of the general clinical information of breast cancer patients.
| Clinicopathologic variable | Category | TCGA | CAMS data |
|---|---|---|---|
| Sample type | H&E | 123 | 127 |
| Age | <50 | 34 | 63 |
| | ≥50 | 89 | 64 |
| Tumor stage | I | 14 | 35 |
| | II | 77 | 64 |
| | III | 32 | 28 |
| PR | Positive | 73 | 74 |
| | Negative | 50 | 53 |
| ER | Positive | 90 | 74 |
| | Negative | 33 | 53 |
| Lymph nodes status | Positive (LMN+) | 65 | 60 |
| | Negative (LMN-) | 58 | 67 |
| Outcome | Non-recurrence | 118 | 95 |
| | Recurrence | 5 | 32 |
Fig. 1. Protocol of the whole process. (a) H&E-stained histological images of tumors were collected and cancer areas were annotated. (b) Color normalization was performed on areas where the filtering blank ratio was < 30%. (c) Patches were labeled according to their slide: recurrence (left) and non-recurrence (right). (d) The model was constructed after the fusion of image features and clinical features and was applied to predict on the independent test set.
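The blank-ratio filter in panel (b) can be sketched as follows. The caption gives only the 30% limit; how a "blank" pixel was defined is not stated, so the grayscale cutoff of 220 used here is an assumption, and `blank_ratio` and `keep_patch` are hypothetical helper names.

```python
from typing import List

BLANK_INTENSITY = 220   # assumed cutoff for background pixels (not from the source)
MAX_BLANK_RATIO = 0.30  # the 30% limit quoted in the caption


def blank_ratio(patch: List[List[int]], blank_intensity: int = BLANK_INTENSITY) -> float:
    """Fraction of pixels in a grayscale patch (0-255) brighter than the cutoff."""
    total = sum(len(row) for row in patch)
    if total == 0:
        raise ValueError("empty patch")
    blank = sum(1 for row in patch for px in row if px > blank_intensity)
    return blank / total


def keep_patch(patch: List[List[int]], max_blank_ratio: float = MAX_BLANK_RATIO) -> bool:
    """Keep a patch only if less than 30% of it is background."""
    return blank_ratio(patch) < max_blank_ratio


# Tiny made-up patches; real patches would be 512 x 512.
tissue = [
    [120, 130, 250, 140],
    [110, 125, 135, 145],
    [100, 115, 128, 138],
    [105, 118, 240, 150],
]
mostly_blank = [[250, 251], [252, 100]]
```

Here `tissue` has 2 bright pixels out of 16 and is kept, while `mostly_blank` has 3 out of 4 and is discarded.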
Fig. 2. Violin plots of basic image feature distributions before and after color normalization. (a) ASM. (b) Contrast. (c) Entropy. (d) Homogeneity. (e) Mean. (f) Dissimilarity. (g) Variance.
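Several of the features named in this caption (ASM, contrast, entropy, homogeneity, dissimilarity) are commonly computed from a gray-level co-occurrence matrix (GLCM). The caption does not say how the authors computed them, so the sketch below is an assumption: it uses a single horizontal offset, standard textbook definitions, and hypothetical function names. Mean and variance are left out because the caption does not say whether they refer to pixel intensities or to the GLCM.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def glcm_horizontal(image: List[List[int]]) -> Dict[Tuple[int, int], float]:
    """Normalized co-occurrence of each pixel with its right-hand neighbor."""
    counts: Counter = Counter()
    for row in image:
        for left, right in zip(row, row[1:]):
            counts[(left, right)] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image needs at least one row with two pixels")
    return {pair: n / total for pair, n in counts.items()}


def texture_features(image: List[List[int]]) -> Dict[str, float]:
    """Standard GLCM texture statistics (textbook definitions)."""
    p = glcm_horizontal(image)
    return {
        "asm": sum(v * v for v in p.values()),
        "contrast": sum(v * (i - j) ** 2 for (i, j), v in p.items()),
        "dissimilarity": sum(v * abs(i - j) for (i, j), v in p.items()),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items()),
        "entropy": -sum(v * math.log2(v) for v in p.values()),
    }


# Made-up two-level image used only to exercise the functions.
toy_image = [[0, 0, 1], [1, 1, 0]]
features = texture_features(toy_image)
```

Comparing these statistics before and after color normalization, as the figure does, shows whether normalization changes texture or only color.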
Fig. 3. Features of clinical data. (a) Mean Decrease Gini of each variable. (b) Correlation of several features with breast cancer recurrence, examined by violin plots and p-values. (c) Correlation between image features and clinical features.
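Mean Decrease Gini, shown in panel (a), averages over a random forest the drop in Gini impurity produced by splits on a given variable. The sketch below computes that drop for a single binary split, which is the quantity being averaged; it is an illustration of the definition rather than the authors' code, and the labels are made up.

```python
from collections import Counter
from typing import List, Sequence


def gini(labels: Sequence[int]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent: List[int], left: List[int], right: List[int]) -> float:
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    if n == 0 or len(left) + len(right) != n:
        raise ValueError("children must partition a non-empty parent")
    return gini(parent) - len(left) / n * gini(left) - len(right) / n * gini(right)


# Made-up outcome labels: 1 = recurrence, 0 = non-recurrence.
parent = [1, 1, 0, 0]
perfect_split = gini_decrease(parent, [1, 1], [0, 0])   # variable separates classes fully
useless_split = gini_decrease(parent, [1, 0], [1, 0])   # variable carries no information
```

A variable with a large average decrease across trees ranks high in the importance plot.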
Fig. 4. ROC and fitting curves. (a) ROC curve of 2-fold CV on the training data set. (b) ROC curve on the test data set. (c) Fitting curve of predicted breast cancer recurrence versus the true condition.
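For readers unfamiliar with how an ROC curve and its AUC are obtained from predicted scores, a minimal sketch follows. It sweeps every distinct score as a threshold and integrates with the trapezoidal rule; the labels and scores are made up, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_curve_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, starting at (0, 0)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a sample is called positive if score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up example: 1 = recurrence, 0 = non-recurrence.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_curve_points(labels, scores)
area = trapezoid_auc(points)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported 0.76 and 0.72.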
Fig. 5. Survival analysis. (a) Survival analysis based on recurrence predicted from clinical data only. (b) Survival analysis based on recurrence predicted from H&E-stained histological images combined with clinical data.
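Survival curves of this kind are often drawn with the Kaplan-Meier estimator, computed separately for each predicted group. The caption does not name the estimator used, so treating it as Kaplan-Meier is an assumption; the follow-up times and event flags below are made up, and `kaplan_meier` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time where an event occurred.

    events[i] is 1 if the event (e.g. recurrence) was observed at times[i],
    and 0 if the patient was censored at that time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)                 # still under observation
        observed = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - observed / at_risk
        curve.append((t, survival))
    return curve


# Made-up follow-up data for one predicted group (time units are arbitrary).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Running the same function on the predicted-recurrence and predicted-non-recurrence groups yields the two curves that each panel compares.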
Fig. 6. Network architecture of ResNet50. (a) ResNet50 structure diagram. (b) The basic structure of the ResNet50 residual network.
Workflow for building the model.
| Step | Operation |
|---|---|
| 1 | Sort features by importance using random forest |
| 2 | Train the model on the training data set |
| 3 | Compute the AUC from the ROC curve for one subset of the cross-validation |
| 4 | Draw the ROC curve based on the results of 2-fold CV |
| 5 | Train the model on the whole training data set |
| 6 | Test the performance of the model on the test data set |
| 7 | End |
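The cross-validation part of this workflow can be sketched in plain Python. Everything model-specific here is an assumption: the random forest ranking step is omitted, and a simple mean-difference linear scorer (`fit_mean_difference`) stands in for the authors' multimodal model so that the example stays self-contained. The data are made up.

```python
import random
from typing import List, Sequence, Tuple


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_two_fold(labels: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split sample indices into two folds, alternating within each class."""
    rng = random.Random(seed)
    folds: Tuple[List[int], List[int]] = ([], [])
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for k, i in enumerate(idx):
            folds[k % 2].append(i)
    return sorted(folds[0]), sorted(folds[1])


def fit_mean_difference(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Stand-in model: one weight per feature = class-1 mean minus class-0 mean."""
    pos = [row for row, t in zip(X, y) if t == 1]
    neg = [row for row, t in zip(X, y) if t == 0]
    return [sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
            for j in range(len(X[0]))]


def predict(weights: Sequence[float], X: Sequence[Sequence[float]]) -> List[float]:
    """Linear score for each sample."""
    return [sum(w * v for w, v in zip(weights, row)) for row in X]


def two_fold_cv_auc(X, y, seed: int = 0) -> List[float]:
    """Train on one fold, score the other, then swap; return both AUCs."""
    fold_a, fold_b = stratified_two_fold(y, seed)
    aucs = []
    for train_idx, test_idx in ((fold_a, fold_b), (fold_b, fold_a)):
        weights = fit_mean_difference([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = predict(weights, [X[i] for i in test_idx])
        aucs.append(pairwise_auc([y[i] for i in test_idx], scores))
    return aucs


# Made-up, cleanly separable fused feature vectors (1 = recurrence).
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.75], [0.85, 0.7],
     [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.15, 0.25]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
cv_aucs = two_fold_cv_auc(X, y)
```

The last two steps of the workflow reuse the same pieces: call `fit_mean_difference` on the whole training set, then `predict` and `pairwise_auc` on the independent test set. Stratifying the folds matters with few recurrence cases, since an unstratified split could leave a fold with no positives.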