| Literature DB >> 32912980 |
Zhigang Song1, Chunkai Yu2, Shuangmei Zou3, Wenmiao Wang4, Yong Huang1, Xiaohui Ding1, Jinhong Liu1, Liwei Shao1, Jing Yuan1, Xiangnan Gou1, Wei Jin1, Zhanbo Wang1, Xin Chen1, Huang Chen5, Cancheng Liu6, Gang Xu7, Zhuo Sun6, Calvin Ku6, Yongqiang Zhang1, Xianghui Dong1, Shuhao Wang8,9, Wei Xu9, Ning Lv4, Huaiyin Shi10.
Abstract
OBJECTIVES: The microscopic evaluation of slides has gradually moved towards fully digital workflows in recent years, opening the possibility of computer-aided diagnosis. It is worthwhile to understand the similarities between deep learning models and pathologists before putting such models into practical scenarios. The simple diagnostic criteria of colorectal adenoma make it a perfect testbed for this study.
Entities:
Keywords: computational pathology; deep learning; digital pathology; model interpretability; colorectal adenoma
Mesh:
Year: 2020 PMID: 32912980 PMCID: PMC7485250 DOI: 10.1136/bmjopen-2019-036423
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Data distribution, where T, V, TV, H and L represent tubular, villous, tubulovillous, high grade and low grade, respectively
| Subtype | Grade | PLAGH (train) | PLAGH (validation) | PLAGH (test) | CJFH | CH |
|---|---|---|---|---|---|---|
| Adenoma (T) | H | 10 | 5 | 0 | 8 | 13 |
| Adenoma (T) | L | 151 | 5 | 56 | 43 | 58 |
| Adenoma (V) | H | 11 | 5 | 0 | 5 | 11 |
| Adenoma (V) | L | 28 | 5 | 2 | 3 | 46 |
| Adenoma (TV) | H | 10 | 0 | 0 | 5 | 11 |
| Adenoma (TV) | L | 24 | 0 | 2 | 3 | 45 |
| Non-neoplasm | – | 21 | 20 | 138 | 13 | 44 |
| Total | – | 177 | 40 | 194 | 63 | 105 |
A slide increments the counter of every component it contains, so a slide with multiple components is counted under each.
CH, Cancer Hospital, Chinese Academy of Medical Sciences; CJFH, China-Japan Friendship Hospital; PLAGH, Chinese People's Liberation Army General Hospital.
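The counting convention above (a slide increments the counter of every adenoma component it contains) can be sketched as follows; the slides and component labels below are illustrative, not data from the study:

```python
from collections import Counter

def count_components(slides):
    """Count slides per component: a slide containing several adenoma
    components (e.g. both tubular and villous areas) increments the
    counter of each component it contains."""
    counts = Counter()
    for components in slides:
        for c in set(components):  # each component counted once per slide
            counts[c] += 1
    return counts

# Hypothetical slides: each entry lists the components seen on one slide.
slides = [{"T"}, {"T", "V"}, {"V"}, {"TV"}]
print(count_components(slides))  # T and V each counted on 2 slides, TV on 1
```

Under this convention the per-component counts can sum to more than the number of slides, which is why the "Total" row is not simply the column sums of the subtype rows.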
Figure 1(A) Deep neural network structure; (B) predictions of both classification and segmentation models.
Performance of different deep learning models
| Model | Accuracy, % |
|---|---|
| ResNet-50 | 89.8 |
| DenseNet | 87.7 |
| Inception v3 | 90.3 |
| U-Net | 77.7 |
| DeepLab v3 | 88.3 |
| Improved DeepLab v2 | 90.4 |
Figure 2(A) An example of tiles in ×10, ×20 and ×40 FoVs; (B) tile-level classification accuracy on the validation set; (C) relative computing time for a WSI at different FoVs. FoV, field of view; WSI, whole-slide image.
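The relative computing times in figure 2C follow directly from tile counts: halving the magnification halves the pixel dimensions, so a ×10 read needs roughly 1/16 the tiles of a ×40 read. A back-of-the-envelope sketch (the tile size and slide dimensions below are illustrative assumptions, not the study's):

```python
import math

def tiles_to_cover(width, height, tile=512, magnification=40, base=40):
    """Number of non-overlapping tiles covering a WSI read at
    `magnification`, given its pixel size (width x height) at `base`
    magnification; pixel dimensions scale linearly with magnification."""
    scale = magnification / base
    w, h = int(width * scale), int(height * scale)
    return math.ceil(w / tile) * math.ceil(h / tile)

# Hypothetical 100,000 x 80,000 pixel slide scanned at x40:
for mag in (10, 20, 40):
    print(mag, tiles_to_cover(100_000, 80_000, magnification=mag))
```

The ×40 count is roughly 16 times the ×10 count, matching the quadratic growth of tile (and hence inference) workload with magnification.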
Figure 3Performance of the deep learning model and five pathologists. AUC, area under the curve.
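Figure 3 summarises the model by its AUC, against which each pathologist appears as a single sensitivity/specificity point. As a minimal illustration of how an AUC is computed from slide-level prediction scores (the labels and scores below are synthetic, not the study's data), AUC equals the probability that a randomly chosen positive is scored above a randomly chosen negative:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a
    random negative; ties count half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic slide-level scores (1 = adenoma, 0 = non-neoplasm).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(roc_auc(labels, scores))  # 8 of 9 positive/negative pairs ranked correctly
```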
Figure 4(A) Predicted examples in the test set; (B) some predictions for slides from other hospitals; (C) system performance against the hardware configuration.
Model performance on the three test datasets, where T, V, TV, H and L represent tubular, villous, tubulovillous, high-grade and low-grade adenomas, respectively
| Dataset | Adenoma, % | T, % | V, % | TH, % | TL, % | VH, % | VL, % | TVH, % | TVL, % |
|---|---|---|---|---|---|---|---|---|---|
| PLAGH | 89.3/79.0 | 89.3 | 100.0 | – | 89.3 | – | 100.0 | – | 100.0 |
| CJFH | 90.0/92.3 | 89.8 | 100.0 | 100.0 | 88.3 | 100.0 | 100.0 | 100.0 | |
| CH | 93.4/93.2 | 96.6 | 95.7 | 92.3 | 96.6 | 90.9 | 95.7 | 97.78 | |
The second column gives sensitivity/specificity; the remaining columns list sensitivity by subtype and grade.
CH, Cancer Hospital, Chinese Academy of Medical Sciences; CJFH, China-Japan Friendship Hospital; PLAGH, Chinese People's Liberation Army General Hospital.
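The sensitivity/specificity pairs in the table come from the usual 2×2 confusion-matrix definitions; a minimal sketch (the label and prediction vectors below are made up for illustration, not the study's results):

```python
def sensitivity_specificity(labels, preds):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
    for binary labels where 1 = adenoma and 0 = non-neoplasm."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical slide-level calls: 3 of 4 adenomas and 3 of 4
# non-neoplasms correctly identified.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
preds  = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(labels, preds))  # (0.75, 0.75)
```

The per-subtype columns apply the sensitivity formula restricted to slides of that subtype and grade, which is why no specificity is reported for them.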
Figure 5(A) Falsely predicted examples in the test set; (B) feature maps extracted by the deep CNN. CNN, convolutional neural network.