Han Suk Ryu1,2, Min-Sun Jin3, Jeong Hwan Park1,4, Sanghun Lee5, Joonyoung Cho5, Sangjun Oh5, Tae-Yeong Kwak5, Junwoo Isaac Woo5, Yechan Mun5, Sun Woo Kim5, Soohyun Hwang6, Su-Jin Shin7, Hyeyoon Chang5.
Abstract
The Gleason grading system, currently the most powerful prognostic predictor of prostate cancer, is based solely on the tumor's histological architecture and has high inter-observer variability. We propose an automated Gleason scoring system based on deep neural networks for diagnosis of prostate core needle biopsy samples. To verify its efficacy, the system was trained using 1133 cases of prostate core needle biopsy samples and validated on 700 cases. The system's diagnoses were then compared with reference standards derived from three certified pathologists. The system's ability to quantify cancer in terms of tumor length was also evaluated by comparison with pathologist-based measurements. The results showed substantial diagnostic concordance between the system's grade group classification and the reference standard (quadratic-weighted Cohen's kappa coefficient, 0.907). The system's tumor length measurements were also notably closer to the reference standard (correlation coefficient, R = 0.97) than the original hospital diagnoses were (R = 0.90). We expect this system to help pathologists reduce the probability of over- or under-diagnosis by providing pathologist-level second opinions on the Gleason score when diagnosing prostate biopsies, and to support research on prostate cancer treatment and prognosis by providing reproducible diagnoses based on consistent standards.
Keywords: deep neural network; Gleason scoring system; prostate cancer; prostate core needle biopsy
Year: 2019 PMID: 31769420 PMCID: PMC6966453 DOI: 10.3390/cancers11121860
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Cohen’s kappa coefficient matrices for grade group classification (non-weighted/quadratic weighted).
| | DeepDx Prostate | Reference Standard | Original Diagnoses | Pathologist 1 * | Pathologist 2 | Pathologist 3 |
|---|---|---|---|---|---|---|
| DeepDx Prostate | - | 0.615/0.907 | 0.440/0.811 | 0.550/0.875 | 0.606/0.906 | 0.615/0.916 |
| Reference Standard | 0.615/0.907 | - | 0.524/0.870 | 0.781/0.955 | 0.809/0.952 | 0.794/0.943 |
| Original Diagnoses | 0.440/0.811 | 0.524/0.870 | - | 0.488/0.865 | 0.494/0.854 | 0.514/0.852 |
| Pathologist 1 * | 0.550/0.875 | 0.781/0.955 | 0.488/0.865 | - | 0.590/0.904 | 0.574/0.896 |
| Pathologist 2 | 0.606/0.906 | 0.809/0.952 | 0.494/0.854 | 0.590/0.904 | - | 0.682/0.920 |
| Pathologist 3 | 0.615/0.916 | 0.794/0.943 | 0.514/0.852 | 0.574/0.896 | 0.682/0.920 | - |
* Genitourinary specialist.
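Each cell in the matrix above pairs a non-weighted kappa with a quadratic-weighted kappa. As a rough illustration of how the two differ, the following minimal sketch computes both for two raters using the standard disagreement-weight formulation. The function name `cohen_kappa` and the toy ratings are hypothetical and are not taken from the study; the authors' actual implementation is not described here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b, categories, weighting=None):
    """Cohen's kappa for two raters over ordered categories.

    weighting=None gives the non-weighted kappa; weighting="quadratic"
    penalizes a disagreement by the squared distance between categories.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")

    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(rater_a)

    # Joint counts of (rater A category, rater B category) and each rater's marginals.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, growing with distance if quadratic.
        if weighting == "quadratic":
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0

    observed_dis = sum(weight(i, j) * c for (i, j), c in observed.items()) / n
    expected_dis = sum(
        weight(i, j) * marg_a[i] * marg_b[j] for i in range(k) for j in range(k)
    ) / (n * n)
    if expected_dis == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_dis / expected_dis


# Toy ratings (hypothetical): 0 = benign, 1-5 = grade groups.
categories = [0, 1, 2, 3, 4, 5]
system = [0, 0, 1, 1, 2, 3, 4, 5]
reference = [0, 0, 1, 2, 2, 3, 5, 5]
plain = cohen_kappa(system, reference, categories)
quadratic = cohen_kappa(system, reference, categories, weighting="quadratic")
print(f"{plain:.3f}/{quadratic:.3f}")
```

In the toy data both disagreements are between adjacent categories, so the quadratic-weighted value is much higher than the non-weighted one. The same pattern appears throughout the table, where near-miss grade group disagreements are penalized only lightly under quadratic weighting.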
Figure 1. Normalized confusion matrices between DeepDx Prostate and diagnoses. (A) Binary and (B) categorical results against reference standard. (C) Binary and (D) categorical results against original hospital diagnoses.
Figure 2. Representative examples of core biopsy slides, hematoxylin and eosin (H&E) staining (left), and DeepDx Prostate analysis (right) at ×10 magnification. Core needle biopsy of prostate corresponding to grade groups (A) 1, (B) 2, (C) 4, and (D) 5. Highlights in yellow, orange, and red correspond to regions of Gleason patterns 3, 4, and 5, respectively.
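The captions refer to both Gleason patterns (3, 4, 5) and grade groups (1 to 5). For readers unfamiliar with how the two relate, the sketch below encodes the widely used ISUP mapping from a primary and secondary Gleason pattern to a grade group. The function name `grade_group` is hypothetical, and this is not a description of how DeepDx Prostate derives its output; it only illustrates the standard mapping.

```python
def grade_group(primary, secondary):
    """Map primary/secondary Gleason patterns (3-5) to an ISUP grade group (1-5)."""
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("Gleason patterns are expected to be 3, 4, or 5")

    score = primary + secondary
    if score == 6:
        return 1  # 3+3
    if score == 7:
        # The order matters for score 7: 3+4 is grade group 2, 4+3 is grade group 3.
        return 2 if primary == 3 else 3
    if score == 8:
        return 4  # 4+4, 3+5, or 5+3
    return 5  # scores 9 and 10


for patterns in [(3, 3), (3, 4), (4, 3), (4, 4), (4, 5)]:
    print(patterns, "-> grade group", grade_group(*patterns))
```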
Cohen’s kappa coefficient matrices according to diagnostic difficulty (non-weighted/quadratic weighted).
| Difficulty | | DeepDx Prostate | Reference Standard | Original Diagnoses |
|---|---|---|---|---|
| Easy | DeepDx Prostate | ‒ | 0.656/0.958 | 0.634/0.853 |
| | Reference standard | 0.656/0.958 | ‒ | 0.611/0.836 |
| | Original diagnoses | 0.634/0.853 | 0.611/0.836 | ‒ |
| Medium | DeepDx Prostate | ‒ | 0.529/0.856 | 0.311/0.709 |
| | Reference standard | 0.529/0.856 | ‒ | 0.423/0.799 |
| | Original diagnoses | 0.311/0.709 | 0.423/0.799 | ‒ |
| Hard | DeepDx Prostate | ‒ | 0.224/0.525 | 0.255/0.508 |
| | Reference standard | 0.224/0.525 | ‒ | 0.224/0.683 |
| | Original diagnoses | 0.255/0.508 | 0.224/0.683 | ‒ |
Figure 3. Distributions of grade groups according to diagnostic difficulty: (A) easy, (B) medium, and (C) hard. Grade groups were obtained from the reference standard.
Figure 4. Representative examples of each level of difficulty, H&E staining (left), and DeepDx Prostate analysis (right) at ×10 magnification. (A) Easy-level image showing well-formed individual glands in grade group 1. (B) Medium-level image showing well-formed individual glands intermingled with fused glands. (C) Hard-level image showing high-grade tumor intermingled with intraductal carcinoma of the prostate, and (D) very small foci of suspicious tumorous lesion. Highlights in yellow, orange, and red correspond to regions of Gleason patterns 3, 4, and 5, respectively.
Cohen’s kappa score matrices according to diagnostic difficulty, between pathologists (non-weighted/quadratic weighted).
| Difficulty | | Pathologist 1 * | Pathologist 2 | Pathologist 3 |
|---|---|---|---|---|
| Easy | Pathologist 1 * | ‒ | 0.931/0.990 | 1.000/1.000 |
| | Pathologist 2 | 0.931/0.990 | ‒ | 0.931/0.990 |
| | Pathologist 3 | 1.000/1.000 | 0.931/0.990 | ‒ |
| Medium | Pathologist 1 * | ‒ | 0.488/0.836 | 0.463/0.820 |
| | Pathologist 2 | 0.488/0.836 | ‒ | 0.599/0.863 |
| | Pathologist 3 | 0.463/0.820 | 0.599/0.863 | ‒ |
| Hard | Pathologist 1 * | ‒ | 0.273/0.636 | 0.382/0.815 |
| | Pathologist 2 | 0.273/0.636 | ‒ | 0.219/0.733 |
| | Pathologist 3 | 0.382/0.815 | 0.219/0.733 | ‒ |
* The genitourinary specialist.
Figure 5. Correlation matrix of tumor lengths.
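The abstract reports tumor length agreement as a correlation coefficient R without naming the estimator. Assuming it is the ordinary Pearson correlation, one entry of such a correlation matrix could be computed as in the sketch below. The function name `pearson_r` and the tumor lengths are hypothetical illustration values, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sums of cross-products and squared deviations; the common 1/n factor cancels.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical tumor lengths in millimetres for the same slides from two sources.
system_mm = [1.0, 2.0, 3.0, 4.0]
reference_mm = [1.0, 3.0, 2.0, 4.0]
print(round(pearson_r(system_mm, reference_mm), 3))
```

A full correlation matrix would apply the same function to every pair of measurement sources (system, reference standard, original diagnoses, and each pathologist).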
Number of slides in the entire dataset for this study.
| Category | Discovery | Validation (Original Hospital Diagnosis) | Validation (Reference Standards) |
|---|---|---|---|
| Benign | 645 | 188 | 203 |
| ASAP * | 0 | 11 | 8 |
| Grade Group 1 | 145 | 100 | 67 |
| Grade Group 2 | 131 | 100 | 111 |
| Grade Group 3 | 27 | 101 | 92 |
| Grade Group 4 | 122 | 100 | 61 |
| Grade Group 5 | 63 | 100 | 158 |
| Total | 1133 | 700 | 700 |
* Atypical small acinar proliferation.