Xiangning Chen, Daniel G Chen, Zhongming Zhao, Justin M Balko, Jingchun Chen.
Abstract
BACKGROUND: Transcriptome sequencing has been broadly available in clinical studies. However, it remains a challenge to utilize these data effectively for clinical applications due to the high dimension of the data and the highly correlated expression between individual genes.
Keywords: Artificial image object; Artificial intelligence; Breast cancer biomarker classification; Convolutional neural network; Image classification; Machine learning algorithm; RNA sequencing
Year: 2021 PMID: 34629099 PMCID: PMC8504079 DOI: 10.1186/s13058-021-01474-z
Source DB: PubMed Journal: Breast Cancer Res ISSN: 1465-5411 Impact factor: 6.466
Table 1 Descriptive summary of the datasets used in this study
| Dataset | Ki67− | Ki67+ | Ki67 missing | NHG Grade I | NHG Grade II | NHG Grade III | NHG missing | Survival day | Survival event | Sequencing platform |
|---|---|---|---|---|---|---|---|---|---|---|
| GSE96058 | 568 | 795 | 1613 | 449 | 1394 | 1074 | 59 | 2976 | 2976 | Hiseq2000/NextSeq500 |
| GSE81538 | 231 | 174 | 0 | 48 | 167 | 190 | 0 | 0 | 0 | Hiseq2000 |
| GSE163882 | 0 | 0 | 0 | 17 | 74 | 131 | 0 | 0 | 0 | NextSeq500 |
Fig. 1 A schematic drawing illustrating the process of transforming tabulated gene expression data into AIOs. a Tabulated expression data in normalized format. b Rescaling the expression data into the range of a digital image (1 byte, 0–255). c Arranging the expression data from an individual into an artificial image object (AIO). An AIO can be a grayscale image, as shown here (d), or a color image in which multiple layers of data are integrated into a single AIO
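The transformation in Fig. 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `rescale_to_bytes` and `to_aio` are hypothetical, min-max scaling is assumed as the rescaling method, the scaling is applied within one individual's vector (the caption does not say whether it is done per gene or per sample), and genes are laid out row by row in a zero-padded square.

```python
import math


def rescale_to_bytes(values):
    """Min-max rescale floats to integers in the 1-byte range 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant vector carries no contrast; map it to 0 (assumption).
        return [0] * len(values)
    return [round((v - lo) / (hi - lo) * 255) for v in values]


def to_aio(expression, pad_value=0):
    """Arrange one individual's expression vector into a square grayscale grid."""
    pixels = rescale_to_bytes(expression)
    side = math.isqrt(len(pixels))
    if side * side < len(pixels):
        side += 1  # smallest square that can hold every gene
    # Pad the tail so the last row is complete (padding value is an assumption).
    pixels += [pad_value] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Five hypothetical normalized expression values become a 3 x 3 grid.
aio = to_aio([0.0, 1.0, 2.0, 3.0, 4.0])
for row in aio:
    print(row)
```

A color AIO, as mentioned in the caption, would stack several such grids (one per data layer) as channels; the ordering of genes within the grid is a design choice that this sketch does not address.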
Fig. 2 CNN model architecture used for five-fold cross-validation. The model had two branches: a modified VGG structure on the left and an embedding layer on the right. The two branches were joined by concatenation before the fully connected layers
Fig. 3 Five-fold cross-validation for biomarkers Ki67 and NHG. a Ki67: the AUCs of the training and testing samples are shown. b NHG: the class-specific AUCs for Grades I, II and III are shown
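For a three-class outcome such as NHG, a class-specific AUC is commonly computed one-vs-rest: each grade in turn is treated as the positive class and the other two as negative. The sketch below assumes that interpretation (the caption does not state how the authors computed it) and uses the rank-based definition of AUC. The function names and the toy probabilities are hypothetical and are not data from the study.

```python
def binary_auc(scores, is_positive):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def class_specific_auc(probabilities, labels, classes):
    """One-vs-rest AUC per class; probabilities[i][k] is the score for classes[k]."""
    return {
        c: binary_auc([row[k] for row in probabilities], [y == c for y in labels])
        for k, c in enumerate(classes)
    }


# Toy predicted probabilities for Grades I, II and III (illustrative only).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.1, 0.7, 0.2],
]
labels = ["I", "II", "III", "I"]
auc = class_specific_auc(probs, labels, ["I", "II", "III"])
print(auc)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a sort-based rank computation would be preferable for cohorts of several thousand samples.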
Cross-validation and sample testing results for Ki67
| | Accuracy | AUC | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Ki67− | | | 0.815 ± 0.036 | 0.834 ± 0.022 | 0.824 ± 0.026 |
| Ki67+ | 0.821 ± 0.023 | 0.891 ± 0.021 | 0.831 ± 0.016 | 0.811 ± 0.036 | 0.820 ± 0.024 |
| Weighted average | 0.821 ± 0.023 | 0.891 ± 0.021 | 0.822 ± 0.023 | 0.822 ± 0.024 | 0.822 ± 0.024 |
| Ki67− | | | 0.886 ± 0.010 | 0.796 ± 0.048 | 0.837 ± 0.021 |
| Ki67+ | 0.826 ± 0.037 | 0.883 ± 0.016 | 0.763 ± 0.036 | 0.864 ± 0.021 | 0.809 ± 0.016 |
| Weighted average | 0.826 ± 0.037 | 0.883 ± 0.016 | 0.833 ± 0.021 | 0.825 ± 0.036 | 0.825 ± 0.019 |
Cross-validation and sample testing results for NHG
| | Accuracyᵃ | AUCᵇ | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Grade I | 0.838 ± 0.085 | 0.974 ± 0.005 | 0.913 ± 0.028 | 0.838 ± 0.085 | 0.871 ± 0.037 |
| Grade II | 0.809 ± 0.072 | 0.880 ± 0.012 | 0.686 ± 0.07 | 0.811 ± 0.072 | 0.738 ± 0.026 |
| Grade III | 0.825 ± 0.038 | 0.938 ± 0.004 | 0.863 ± 0.031 | 0.756 ± 0.072 | 0.803 ± 0.031 |
| Weighted average | 0.820 ± 0.012 | 0.931 ± 0.006 | 0.820 ± 0.012 | 0.802 ± 0.033 | 0.804 ± 0.030 |
| Grade I | 0.406 ± 0.081 | 0.873 ± 0.025 | 0.608 ± 0.116 | 0.408 ± 0.082 | 0.475 ± 0.059 |
| Grade II | 0.743 ± 0.069 | 0.833 ± 0.005 | 0.710 ± 0.026 | 0.745 ± 0.070 | 0.725 ± 0.026 |
| Grade III | 0.872 ± 0.029 | 0.928 ± 0.015 | 0.848 ± 0.022 | 0.873 ± 0.030 | 0.858 ± 0.015 |
| Weighted average | 0.764 ± 0.052 | 0.882 ± 0.012 | 0.762 ± 0.035 | 0.765 ± 0.052 | 0.758 ± 0.025 |
| Grade I | 0 ± 0 | 0.622 ± 0.189 | 0 ± 0 | 0 ± 0 | 0 ± 0 |
| Grade II | 0.016 ± 0.006 | 0.564 ± 0.044 | 0.268 ± 0.133 | 0.016 ± 0.006 | 0.030 ± 0.010 |
| Grade III | 0.974 ± 0.012 | 0.596 ± 0.070 | 0.589 ± 0.003 | 0.974 ± 0.012 | 0.734 ± 0.004 |
| Weighted average | 0.580 ± 0.006 | 0.587 ± 0.038 | 0.437 ± 0.045 | 0.330 ± 0.003 | 0.443 ± 0.004 |
ᵃ Categorical accuracy; ᵇ class-specific AUC
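The per-class precision, recall and F1 columns, together with a weighted average row, can be reproduced from true and predicted labels as sketched below. This is an illustration under assumptions rather than the authors' evaluation code: the weighted average is assumed to be weighted by the number of true samples in each class (support), a class that is never predicted is assigned precision 0, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Precision, recall, F1 and support for every class seen in the labels."""
    support = Counter(y_true)
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)
        # Convention (assumption): undefined ratios are reported as 0.
        precision = tp / predicted if predicted else 0.0
        recall = tp / support[c] if support[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": support[c]}
    return report


def weighted_average(report, metric):
    """Average a metric over classes, weighted by each class's support."""
    total = sum(r["support"] for r in report.values())
    return sum(r[metric] * r["support"] for r in report.values()) / total


# Toy grade calls (illustrative only, not data from the study).
y_true = ["I", "II", "II", "III", "III", "III"]
y_pred = ["II", "II", "III", "III", "III", "III"]
report = per_class_report(y_true, y_pred)
for grade, row in report.items():
    print(grade, row)
print("weighted recall:", weighted_average(report, "recall"))
```

With support weighting, the weighted-average recall equals the overall fraction of correctly classified samples, which makes it a convenient cross-check on a reported accuracy.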
Fig. 4 Comparison of performance between pathologists’ consensus calls and model-produced calls. a Survival analyses for pathologists’ consensus calls of Ki67 status. b Survival analyses for model-produced calls of Ki67 status. c Survival analyses for pathologists’ consensus calls of NHG. d Survival analyses for model-produced calls of NHG. For Ki67, the model-produced calls performed better in survival analyses than the pathologists’ consensus calls. For NHG, the model-produced calls showed only a trend, likely because of the much smaller sample size (N = 59, compared with N = 2917 for the pathologists’ calls; see Table 1)