Yang Yu, Jiahao Wang, Chan Way Ng, Yukun Ma, Shupei Mo, Eliza Li Shan Fong, Jiangwa Xing, Ziwei Song, Yufei Xie, Ke Si, Aileen Wee, Roy E Welsch, Peter T C So, Hanry Yu.
Abstract
Current liver fibrosis scoring by computer-assisted image analytics is not fully automated, as it requires manual preprocessing (segmentation and feature extraction) typically based on domain knowledge in liver pathology. Deep learning-based algorithms can potentially classify these images without the need for preprocessing through learning from a large dataset of images. We investigated the performance of classification models built using a deep learning-based algorithm pre-trained using multiple sources of images to score liver fibrosis and compared them against conventional non-deep learning-based algorithms: artificial neural networks (ANN), multinomial logistic regression (MLR), support vector machines (SVM) and random forests (RF). Automated feature classification and fibrosis scoring were achieved by using a transfer learning-based deep learning network, AlexNet-Convolutional Neural Networks (CNN), with balanced area under receiver operating characteristic (AUROC) values of up to 0.85-0.95, versus ANN (AUROC of up to 0.87-1.00), MLR (AUROC of up to 0.73-1.00), SVM (AUROC of up to 0.69-0.99) and RF (AUROC of up to 0.94-0.99). Results indicate that a deep learning-based algorithm with transfer learning enables the construction of a fully automated and accurate prediction model for scoring liver fibrosis stages that is comparable to other conventional non-deep learning-based algorithms that are not fully automated.
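The abstract's comparison of the conventional baselines could be reproduced along these lines: fit MLR, SVM and RF on pre-extracted collagen features and score each with AUROC. This is a minimal sketch, not the study's code; the feature matrix and labels below are synthetic stand-ins, and the binary label stands in for one staging cut-off such as F0 vs F1-4 (for a binary cut-off, multinomial logistic regression reduces to plain logistic regression).

```python
# Hedged sketch: training the conventional baselines (MLR, SVM, RF) on
# pre-extracted features and scoring them with AUROC. Synthetic data only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples, n_features = 200, 130          # 130 features, as in the feature table
X = rng.normal(size=(n_samples, n_features))
# Synthetic binary label standing in for a staging cut-off (e.g. F0 vs F1-4)
y = (X[:, :21].mean(axis=1) + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "MLR": LogisticRegression(max_iter=1000),       # binary case of MLR
    "SVM": SVC(kernel="linear", probability=True),  # linear SVM, as in Fig. 4
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
aurocs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    scores = model.predict_proba(X_te)[:, 1]        # P(positive class)
    aurocs[name] = roc_auc_score(y_te, scores)
```

On real data, the per-cut-off AUROCs collected in `aurocs` would be the quantities compared across algorithms in Figure 4.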
Year: 2018 PMID: 30375454 PMCID: PMC6207665 DOI: 10.1038/s41598-018-34300-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schematic of study outline. Tissues were fixed, dehydrated, embedded in paraffin and cut into two consecutive sections. One was stained with Sirius Red (SR) for histological assessment by a pathologist using a conventional bright-field microscope. The other was imaged using second harmonic generation (SHG) microscopy to generate images for subsequent use in developing a computer-aided classification model. TRN: training group; VAL: validation group; TST: testing group.
Figure 2. Representative images of SR-stained samples, original SHG images and processed SHG images at various stages of liver fibrosis.
Figure 3. The architectures of (A) deep learning via convolutional neural networks (CNN) using a pretrained 7-layered AlexNet, and non-deep learning via (B) Artificial Neural Networks (ANN), (C) Multinomial Logistic Regression (MLR), (D) Support Vector Machines (SVM) and Random Forests (RF), for computer-aided liver fibrosis scoring. Processed SHG images were resized and duplicated for use as input to the deep learning-based algorithm; C1-C5: convolution layers, FC6-FC7: fully connected layers. Extracted morphological and/or textural features were used as input for the non-deep learning-based algorithms.
Extracted morphological and textural features of collagen fibers from SHG processed images.
| No. | Feature Descriptions |
|---|---|
| | **Morphological features** |
| 1 | Total number of collagen fibers |
| 2–3 | Median and variance of fiber orientation |
| 4–6 | Median, total and variance of fiber length |
| 7–9 | Median, total and variance of fiber width |
| 10 | Total perimeter of collagen fibers |
| 11–14 | No. of long & short and thick & thin fibers |
| 15–16 | Ratio of short/long fibers and thin/thick fibers |
| 17–19 | Median, total and variance of fiber area |
| 20 | Collagen mean intensity |
| 21 | Fiber proportionate area (CPA) |
| | **Textural features** |
| 22 | Entropy |
| 23–34 | Contrast, correlation, energy and homogeneity from the GLCM at three pixel distances (2, 4 and 8 pixels) |
| 35–40 | Energy, entropy, mean, standard deviation, third moment and fourth moment of the coefficients from the Fourier transform |
| 41–100 | Energy, entropy, mean, standard deviation, third moment and fourth moment of the wavelet decomposition coefficients from ten sub-images generated by Daubechies wavelet transform |
| 101–130 | Energy, entropy, mean, standard deviation, third moment and fourth moment of the magnitude of the convolution over the image with Gabor filters at five scales |
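The GLCM block of the table (features 23-34: contrast, correlation, energy and homogeneity at pixel distances 2, 4 and 8) can be illustrated in plain NumPy. This is a generic sketch of the standard GLCM definitions, not the authors' extraction pipeline; in practice `skimage.feature.graycomatrix`/`graycoprops` compute the same quantities, and the input below is a random stand-in for a gray-level-quantized SHG image.

```python
# Hedged sketch: GLCM textural features at several pixel distances.
import numpy as np

def glcm(img, distance, levels=8):
    """Horizontal gray-level co-occurrence matrix, symmetric and normalized."""
    P = np.zeros((levels, levels))
    a = img[:, :-distance].ravel()       # left pixel of each horizontal pair
    b = img[:, distance:].ravel()        # right pixel, `distance` away
    np.add.at(P, (a, b), 1)              # count co-occurrences
    P = P + P.T                          # make symmetric
    return P / P.sum()

def glcm_features(P):
    """Contrast, correlation, energy, homogeneity from a normalized GLCM."""
    i, j = np.indices(P.shape)
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * P).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * P).sum())
    return {
        "contrast": ((i - j) ** 2 * P).sum(),
        "correlation": ((i - mu_i) * (j - mu_j) * P).sum() / (sd_i * sd_j),
        "energy": (P ** 2).sum(),
        "homogeneity": (P / (1.0 + (i - j) ** 2)).sum(),
    }

rng = np.random.default_rng(0)
img = rng.integers(0, 8, size=(64, 64))  # stand-in for a quantized SHG image
feats = {d: glcm_features(glcm(img, d)) for d in (2, 4, 8)}
```

The 4 features x 3 distances match the 12 entries numbered 23-34 in the table.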
Figure 4. The area under the receiver operating characteristic curve (AUROC) values for various classification models from AlexNet-Convolutional Neural Networks (CNN), conventional Artificial Neural Networks (ANN), non-linear Multinomial Logistic Regression (MLR), linear Support Vector Machines (SVM) and feature-ranking based Random Forests (RF) algorithms. AUROC was evaluated for CNN, and for ANN, MLR, SVM and RF with (A) morphological features (Features 1–21), (B) textural features (Features 22–130) and (C) both morphological and textural features (Features 1–130). N.S.: non-significant difference; *adjusted p value less than 0.005.
Performance evaluation of the CNN-, ANN-, MLR-, SVM- and RF-based classification models.
| Comparison | Model | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| **Morphological features (1–21)** | | | | | |
| F0 vs F1–4 | CNN | 85.0% | 100.0% | 100.0% | 96.4% |
| | ANN | 95.0% | 90.5% | 70.4% | 98.7% |
| | MLR | 90.0% | 98.8% | 94.7% | 97.5% |
| | SVM | 95.0% | 100.0% | 100.0% | 98.8% |
| | RF | 95.0% | 100.0% | 100.0% | 98.8% |
| F0–1 vs F2–4 | CNN | 80.0% | 90.0% | 84.2% | 87.1% |
| | ANN | 82.5% | 91.7% | 86.8% | 88.7% |
| | MLR | 87.5% | 90.0% | 85.4% | 91.5% |
| | SVM | 97.5% | 100.0% | 100.0% | 98.4% |
| | RF | 92.5% | 91.7% | 88.1% | 94.8% |
| F0–2 vs F3–4 | CNN | 91.7% | 97.5% | 98.2% | 88.6% |
| | ANN | 100.0% | 100.0% | 100.0% | 100.0% |
| | MLR | 96.7% | 100.0% | 100.0% | 95.2% |
| | SVM | 100.0% | 95.0% | 96.8% | 100.0% |
| | RF | 100.0% | 97.5% | 98.4% | 100.0% |
| F0–3 vs F4 | CNN | 96.3% | 80.0% | 95.1% | 84.2% |
| | ANN | 98.8% | 100.0% | 100.0% | 95.2% |
| | MLR | 100.0% | 100.0% | 100.0% | 100.0% |
| | SVM | 100.0% | 90.0% | 97.6% | 100.0% |
| | RF | 100.0% | 95.0% | 98.8% | 100.0% |
| **Textural features (22–130)** | | | | | |
| F0 vs F1–4 | CNN | 85.0% | 100.0% | 100.0% | 96.4% |
| | ANN | 80.0% | 98.8% | 94.1% | 95.2% |
| | MLR | 0.0% | 98.8% | 0.0% | 80.0% |
| | SVM | 100.0% | 2.5% | 20.4% | 100.0% |
| | RF | 95.0% | 98.8% | 95.0% | 98.8% |
| F0–1 vs F2–4 | CNN | 80.0% | 90.0% | 84.2% | 87.1% |
| | ANN | 60.0% | 96.0% | 92.3% | 78.4% |
| | MLR | 77.5% | 48.3% | 90.0% | 45.0% |
| | SVM | 100.0% | 16.7% | 44.4% | 100.0% |
| | RF | 77.5% | 88.3% | 81.6% | 85.5% |
| F0–2 vs F3–4 | CNN | 91.7% | 97.5% | 98.2% | 88.6% |
| | ANN | 1.7% | 97.5% | 50.0% | 40.0% |
| | MLR | 90.0% | 45.0% | 71.1% | 75.0% |
| | SVM | 100.0% | 20.0% | 65.2% | 100.0% |
| | RF | 95.0% | 87.5% | 91.9% | 92.1% |
| F0–3 vs F4 | CNN | 96.3% | 80.0% | 95.1% | 84.2% |
| | ANN | 100.0% | 60.0% | 91.0% | 100.0% |
| | MLR | 95.0% | 50.0% | 88.4% | 71.4% |
| | SVM | 100.0% | 55.0% | 89.9% | 100.0% |
| | RF | 100.0% | 95.0% | 96.4% | 100.0% |
| **Morphological and textural features (1–130)** | | | | | |
| F0 vs F1–4 | CNN | 85.0% | 100.0% | 100.0% | 96.4% |
| | ANN | 100.0% | 100.0% | 100.0% | 100.0% |
| | MLR | 0.0% | 98.8% | 0.0% | 80.0% |
| | SVM | 100.0% | 37.5% | 28.6% | 100.0% |
| | RF | 90.0% | 98.8% | 94.7% | 97.5% |
| F0–1 vs F2–4 | CNN | 80.0% | 90.0% | 84.2% | 87.1% |
| | ANN | 97.5% | 98.3% | 97.5% | 98.3% |
| | MLR | 77.5% | 48.3% | 50.0% | 76.3% |
| | SVM | 100.0% | 16.7% | 44.4% | 100.0% |
| | RF | 82.5% | 86.7% | 80.5% | 88.1% |
| F0–2 vs F3–4 | CNN | 91.7% | 97.5% | 98.2% | 88.6% |
| | ANN | 100.0% | 100.0% | 100.0% | 100.0% |
| | MLR | 90.0% | 45.0% | 71.1% | 75.0% |
| | SVM | 100.0% | 5.0% | 61.2% | 100.0% |
| | RF | 100.0% | 95.0% | 98.8% | 100.0% |
| F0–3 vs F4 | CNN | 96.3% | 80.0% | 95.1% | 84.2% |
| | ANN | 98.7% | 100.0% | 100.0% | 95.2% |
| | MLR | 95.0% | 50.0% | 88.4% | 71.4% |
| | SVM | 100.0% | 17.6% | 82.5% | 100.0% |
| | RF | 100.0% | 95.0% | 98.8% | 100.0% |
PPV: positive predictive value. NPV: negative predictive value.
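The four metrics tabulated above follow directly from a 2x2 confusion matrix per staging cut-off. A minimal sketch, with illustrative counts only (chosen so that they happen to reproduce percentages like the CNN F0 vs F1-4 row; the study's actual per-class counts are not stated here):

```python
# Hedged sketch: sensitivity, specificity, PPV, NPV from confusion counts
# for one binary staging cut-off (e.g. F0 vs F1-4).
def binary_metrics(tp, fn, tn, fp):
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "PPV": tp / (tp + fp),           # positive predictive value
        "NPV": tn / (tn + fn),           # negative predictive value
    }

# Illustrative: 17 of 20 positives detected, 80 of 80 negatives rejected
m = binary_metrics(tp=17, fn=3, tn=80, fp=0)
# sensitivity 85.0%, specificity 100.0%, PPV 100.0%, NPV 96.4%
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on the class balance of the test set, which is why they diverge sharply for models like the textural-feature SVM above.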