Cho-Hee Kim1, Subrata Bhattacharjee2, Deekshitha Prakash2, Suki Kang3, Nam-Hoon Cho3, Hee-Cheol Kim1, Heung-Kook Choi2.
Abstract
The optimal diagnostic and treatment strategies for prostate cancer (PCa) are constantly changing. Because accurate diagnosis is essential, texture analysis of stained prostate tissue is valuable for automatic PCa detection. We used artificial intelligence (AI) techniques to classify dual-channel tissue features extracted from the hematoxylin and eosin (H&E) channels of tissue images. Tissue feature engineering was performed to extract first-order statistics (FOS)-based textural features from each stained channel, and classification between benign and malignant tissue was carried out on the most important features. Recursive feature elimination (RFE) and one-way analysis of variance (ANOVA) were used to identify significant features, selecting the best five of the six extracted features. The AI techniques used for binary classification (benign vs. malignant and low-grade vs. high-grade) were support vector machine (SVM), logistic regression (LR), bagging tree, boosting tree, and a dual-channel bidirectional long short-term memory (DC-BiLSTM) network, and their performance was compared. Two datasets were used for PCa classification: the first (private) dataset was used for training and testing the AI models, and the second (public) dataset was used only for testing, to evaluate model performance. The automatic AI classification system performed well, and the results were consistent with the hypothesis of this study.
Keywords: artificial intelligence; binary classification; dual-channel; prostate cancer; prostate cancer detection; texture analysis; tissue feature engineering
Year: 2021 PMID: 33810251 PMCID: PMC8036750 DOI: 10.3390/cancers13071524
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
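The abstract describes extracting first-order statistics (FOS) features from each stain channel. The sketch below shows one common histogram-based way to compute the five features that appear later in the ANOVA table (energy, entropy, variance, skewness, kurtosis). The formulas are the standard textbook definitions and are an assumption here, since the record does not give the authors' exact equations; the function name `first_order_features` is hypothetical, and the sixth extracted feature is not named in the text, so it is not reproduced.

```python
import math
from collections import Counter


def first_order_features(pixels):
    """Histogram-based first-order statistics for a flat list of gray levels.

    Assumed (standard) definitions; not confirmed as the authors' formulas.
    """
    n = len(pixels)
    # Normalized gray-level histogram: probability of each intensity level.
    probs = {level: count / n for level, count in Counter(pixels).items()}

    mean = sum(level * p for level, p in probs.items())
    variance = sum((level - mean) ** 2 * p for level, p in probs.items())
    std = math.sqrt(variance)

    if std > 0:
        skewness = sum((level - mean) ** 3 * p for level, p in probs.items()) / std ** 3
        kurtosis = sum((level - mean) ** 4 * p for level, p in probs.items()) / std ** 4
    else:
        # A constant patch has no spread; report zero rather than divide by zero.
        skewness = 0.0
        kurtosis = 0.0

    energy = sum(p * p for p in probs.values())
    entropy = -sum(p * math.log2(p) for p in probs.values())

    return {
        "energy": energy,
        "entropy": entropy,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

In a dual-channel setting, a function like this would presumably be applied once to the hematoxylin channel and once to the eosin channel of each patch, giving two feature vectors per patch.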
Figure 1. The grading and staging of prostate cancer. (a) The Gleason grading system. (b) The tumor staging system (Source: https://sunshinecoasturology.com.au/useful-info/urological-conditions/what-is-prostate-cancer/, accessed on 2 March 2021).
Past studies related to cancer detection using artificial intelligence (AI) techniques.
| Author | AI Techniques | Classification Types | Parameters | Performance |
|---|---|---|---|---|
| Mohanty et al., 2011 | Association Rule | Binary | GLCM and GLRLM features | The association rule method was used for image classification. Accuracies of 94.9% and 92.3% were achieved using all and significant features, respectively. |
| Neeta et al., 2015 | SVM and K-NN | Binary | GLDM and Gabor features | PCa was detected using a CAD algorithm; the best result of 95.83% was achieved with an SVM classifier. |
| Filipczuk et al., 2012 | K-NN | Binary | GLCM and GLRLM features | Breast cancer diagnosis was performed by classifying GLCM- and GLRLM-based texture features extracted from the segmented nuclei. The best result of 90% was obtained by combining the optimal features of GLRLM. |
| Radhakrishnan et al., 2012 | SVM | Binary | Histogram, GLCM, and GLRLM features | TRUS medical images were used for prostate cancer classification. The DBSCAN clustering method was used to extract the prostate region. The best accuracy of 91.7% was achieved by combining the three feature extraction methods. |
| Sinecen et al., 2007 | MLP1, MLP2, RBF, and LVQ | Binary | Image texture based on Gauss-Markov random field, Fourier transform, and stationary wavelets | Prostate tissue images of 80 benign and 80 malignant cell nuclei were evaluated. The best accuracy of 86.88% was achieved using MLP2. |
| Bhattacharjee et al., 2019 | MLP | Binary | Color moment and GLCM features | Wavelet-based GLCM and color moment descriptors were extracted from prostate tissue images of benign and malignant classes. The model achieved an accuracy of 95%. |
| Song et al., 2018 | DCNN | Binary | MRI scans | PCa and noncancerous tissues were distinguished using DCNN. An AUC of 0.944, a sensitivity of 87.0%, a specificity of 90.6%, a PPV of 87.0%, and an NPV of 90.6% were achieved. |
| Bhattacharjee | SVM | Binary (benign vs. malignant) and multiclass (benign vs. grade 3 vs. grade 4 vs. grade 5) | Morphological features | Morphological feature classification was performed to discriminate benign from malignant tumors, grade 3 from grade 4 and 5 tumors, and grade 4 from grade 5 tumors. The best result was obtained from binary classification. |
| Zhao et al., 2015 | ANN | Binary | GLCM, gray-level histogram, and general features | T2-weighted prostate MRI scans were used to extract 12 different types of features. Feature classification was performed using an artificial neural network. For PZ and CG, the accuracies achieved using a CAD system were 80.3% and 84.0%, respectively. |
| Roy et al., 2019 | CNN | Binary (nonmalignant vs. malignant) and multiclass | Histology images | A patch-based CNN classifier was developed for the automated classification of histopathology images. On the cancer histology test dataset, the proposed technique achieved promising accuracies for both binary and multiclass classification. |
| Chakraborty | DCRCNN | Binary | Histopathologic scans | A dual-channel residual convolution neural network was used to classify tissue images of lymph node sections. The model was trained with 220,025 images and achieved an overall accuracy of 96.47%. |
Figure 2. Microscopic biopsy images from the private dataset at 40× magnification. (a) Benign tumor. (b) Grade 3 tumor. (c) Grade 4 tumor. (d) Grade 5 tumor.
The numbers of benign and malignant images used for training and testing. Private dataset structure at 40× magnification factor.
| Groups | Training: Train | Training: Validation | Testing | Total |
|---|---|---|---|---|
| Benign | 160 | 40 | 50 | 250 |
| Malignant | 160 | 40 | 50 | 250 |
| Total | 320 | 80 | 100 | 500 |
Figure 3. Microscopic biopsy images from the external test set at 10× magnification. (a,b) Benign tumor. (c,d) Malignant tumor.
Figure 4. Block diagram of patch extraction and stain deconvolution of histopathological images. (a) The preprocessed regions of interest (ROI) after gamma correction. (b) Extracted patches of size 64 × 64 pixels (24 bits/pixel). (c) Hematoxylin and eosin channels extracted from (b). The black bounding box in (a) represents the size of the sliding window used to extract the patches.
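The two operations named in this caption, sliding-window patch extraction and stain deconvolution, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stain vectors are the values commonly quoted for Ruifrok-Johnston color deconvolution (the record does not say which method was used), the window stride is assumed equal to the patch size, and the names `extract_patches` and `separate_stains` are hypothetical. Gamma correction is omitted.

```python
import math

# ASSUMPTION: optical-density stain vectors (hematoxylin, eosin, residual)
# as commonly quoted for Ruifrok-Johnston color deconvolution.
STAINS = [
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def separate_stains(rgb):
    """Return (hematoxylin, eosin, residual) amounts for one RGB pixel (0-255)."""
    # Beer-Lambert law: convert intensity to optical density per channel.
    # Values are clamped to 1 to avoid log(0) on black pixels.
    od = [-math.log10(max(v, 1) / 255.0) for v in rgb]
    # Solve c0*S0 + c1*S1 + c2*S2 = od with Cramer's rule.
    # The system matrix has the stain vectors as columns.
    a = [[STAINS[j][i] for j in range(3)] for i in range(3)]
    d = det3(a)
    amounts = []
    for k in range(3):
        a_k = [[od[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        amounts.append(det3(a_k) / d)
    return tuple(amounts)


def extract_patches(image, size=64, stride=64):
    """Slide a size x size window over a 2-D list of pixels (rows of pixels)."""
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches
```

Applying `separate_stains` to every pixel of a patch and keeping the first two components would give hematoxylin and eosin channel images of the kind shown in panel (c).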
Figure 5. The proposed pipeline of the research work. Each step in the process flow diagram is carried out separately and independently.
Figure 6. Feature ranking using the recursive feature elimination technique. The five most significant features, those with the largest ranking values, are selected.
Second-step feature selection using one-way ANOVA. Significant features are identified by a p-value < 0.05 and a large effect size.
| Feature Name | Significance: F-Value | Significance: p-Value | Effect Size: Eta Squared |
|---|---|---|---|
| Energy | 25,550.5 | <0.05 | 0.77127 |
| Skewness | 3375.6 | <0.05 | 0.32936 |
| Kurtosis | 2351.6 | <0.05 | 0.35199 |
| Entropy | 5742.8 | <0.05 | 0.58838 |
| Variance | 5689.3 | <0.05 | 0.58672 |
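For readers unfamiliar with the quantities in this table, the sketch below computes the one-way ANOVA F statistic and the eta-squared effect size for a single feature from the standard sums of squares. It is an illustration only: the function name `one_way_anova` is hypothetical, the toy inputs are not the study's data, and the p-value is left out because it requires the F distribution (for example from a statistics library).

```python
def one_way_anova(groups):
    """Return (F statistic, eta squared) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


# Toy example with two classes (illustrative values only).
f_stat, eta_sq = one_way_anova([[1, 2, 3], [4, 5, 6]])
```

A feature whose class means are far apart relative to the within-class spread yields a large F and an eta squared close to 1, which is the pattern the table shows most strongly for energy.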
Figure 7. The structure of the proposed DC-BiLSTM network.
Comparative analysis of the performance of multiple classifiers in distinguishing between benign and malignant tissue. The performance metrics are for the internal test dataset.
| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| SVM | 96.1% | 95.3% | 96.9% | 96.1% |
| LR | 96.1% | 95.5% | 96.8% | 96.1% |
| Bagging Tree | 95.6% | 95.0% | 96.2% | 95.6% |
| Boosting Tree | 96.0% | 95.5% | 96.4% | 95.9% |
| DC-BiLSTM | 98.6% | 98.2% | 98.9% | 98.6% |
Comparative analysis of the performance of multiple classifiers in distinguishing between low-grade and high-grade disease. The performance metrics are for the internal test dataset.
| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| SVM | 85.2% | 89.5% | 82.4% | 85.8% |
| LR | 85.1% | 89.5% | 82.2% | 85.7% |
| Bagging Tree | 80.8% | 80.8% | 80.8% | 80.8% |
| Boosting Tree | 86.0% | 84.7% | 86.9% | 85.8% |
| DC-BiLSTM | 93.6% | 96.3% | 91.2% | 93.7% |
Comparative analysis of the performance of multiple classifiers in distinguishing between benign and malignant tissue. The performance metrics are for the external test dataset.
| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| SVM | 88.2% | 83.1% | 92.6% | 87.6% |
| LR | 87.9% | 82.9% | 92.1% | 87.3% |
| Bagging Tree | 91.3% | 87.5% | 94.7% | 91.0% |
| Boosting Tree | 93.5% | 92.9% | 94.1% | 93.5% |
| DC-BiLSTM | 89.2% | 88.7% | 90.0% | 89.2% |
Confusion matrices of classification algorithms—Malignant vs. Benign (internal test set).
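| Model | Class | Benign | Malignant |
|---|---|---|---|
| SVM | Benign | 763 | 37 |
| SVM | Malignant | 24 | 776 |
| LR | Benign | 764 | 36 |
| LR | Malignant | 25 | 775 |
| Bagging Tree | Benign | 760 | 40 |
| Bagging Tree | Malignant | 30 | 770 |
| Boosting Tree | Benign | 764 | 36 |
| Boosting Tree | Malignant | 28 | 772 |
| DC-BiLSTM | Benign | 786 | 14 |
| DC-BiLSTM | Malignant | 8 | 792 |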
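The reported accuracy, precision, recall, and F1-score can be recomputed from these counts. The sketch below uses the counts 786 / 14 and 8 / 792, which correspond to the DC-BiLSTM row of the internal benign vs. malignant metrics table. One assumption is labeled in the code: the table's values are reproduced when precision is taken along the first row and recall along the first column, with benign as the positive class. That orientation is inferred from the numbers, not stated in the record, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(a, b, c, d):
    """Metrics from a 2x2 table laid out as [[a, b], [c, d]].

    ASSUMPTION: the first class is the positive class, precision is taken
    along the first row and recall along the first column. This orientation
    is inferred from the reported values.
    """
    total = a + b + c + d
    accuracy = (a + d) / total
    precision = a / (a + b)
    recall = a / (a + c)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Counts for the DC-BiLSTM model on the internal benign vs. malignant test set.
accuracy, precision, recall, f1 = binary_metrics(786, 14, 8, 792)
```

The results are approximately 0.986, 0.983, 0.990, and 0.986, which agree with the reported 98.6%, 98.2%, 98.9%, and 98.6% to within the last printed digit.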
Confusion matrices of classification algorithms—Grade 3 vs. Grade 5 (internal test set).
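| Model | Class | Grade 3 | Grade 5 |
|---|---|---|---|
| SVM | Grade 3 | 716 | 84 |
| SVM | Grade 5 | 153 | 647 |
| LR | Grade 3 | 716 | 84 |
| LR | Grade 5 | 155 | 645 |
| Bagging Tree | Grade 3 | 647 | 153 |
| Bagging Tree | Grade 5 | 153 | 647 |
| Boosting Tree | Grade 3 | 678 | 122 |
| Boosting Tree | Grade 5 | 102 | 698 |
| DC-BiLSTM | Grade 3 | 771 | 29 |
| DC-BiLSTM | Grade 5 | 74 | 726 |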
Confusion matrices of classification algorithms—Malignant vs. Benign (external test set).
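| Model | Class | Benign | Malignant |
|---|---|---|---|
| SVM | Benign | 3327 | 673 |
| SVM | Malignant | 266 | 3734 |
| LR | Benign | 3318 | 682 |
| LR | Malignant | 283 | 3717 |
| Bagging Tree | Benign | 3503 | 497 |
| Bagging Tree | Malignant | 195 | 3805 |
| Boosting Tree | Benign | 3719 | 281 |
| Boosting Tree | Malignant | 233 | 3767 |
| DC-BiLSTM | Benign | 3550 | 450 |
| DC-BiLSTM | Malignant | 409 | 3591 |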
Figure 8. The receiver operating characteristic (ROC) curves and area under the curve (AUC) for PCa classification between benign and malignant tissues, and within malignant tissue samples. The DC-BiLSTM and boosting tree classifiers were evaluated on the internal and external test datasets.
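As a reminder of what the AUC in this figure measures, the sketch below computes it directly from its probabilistic definition: the chance that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. This is a generic illustration with made-up scores, not the evaluation code used in the study, and the function name `auc_from_scores` is hypothetical.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Illustrative scores only: three of the four positive-negative pairs
# are ordered correctly, so the AUC is 0.75.
example_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; sorting-based implementations are preferable for test sets with thousands of patches.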
Figure 9. Comparison of texture differences between benign and malignant tissue, and between grade 3 and grade 5 tissue samples, using the five highest-ranked radiomic features. (a) Box plot comparing the five high-ranked radiomic features between benign and malignant (internal test set). (b) Box plot comparing the five high-ranked radiomic features between grade 3 and grade 5 (internal test set). (c) Box plot comparing the five high-ranked radiomic features between benign and malignant (external test set).