Bowen Li, Dar-In Tai, Ke Yan, Yi-Cheng Chen, Cheng-Jen Chen, Shiu-Feng Huang, Tse-Hwa Hsu, Wan-Ting Yu, Jing Xiao, Lu Le, Adam P Harrison.
Abstract
BACKGROUND: Hepatic steatosis is a major cause of chronic liver disease. Two-dimensional (2D) ultrasound is the most widely used non-invasive tool for screening and monitoring, but associated diagnoses are highly subjective. AIM: To develop a scalable deep learning (DL) algorithm for quantitative scoring of liver steatosis from 2D ultrasound images.
Keywords: Computer-aided diagnosis; Deep learning; Liver steatosis; Screening; Ultrasound
Year: 2022 PMID: 35979264 PMCID: PMC9258285 DOI: 10.3748/wjg.v28.i22.2494
Source DB: PubMed Journal: World J Gastroenterol ISSN: 1007-9327 Impact factor: 5.374
Overview of development and testing datasets
| Usage | Dataset | Description | Label source | Patients | Studies | Images |
| DL learning | BD-L | Big data, to train the neural network | 2D US Dx | 2899 | 17149 | 200654 |
| DL validation | BD-V | Big data, to tune model performance | 2D US Dx | 411 | 2364 | 27421 |
| Testing | HP-U | Histopathology-proven group, to (a) measure the trend between DL predictions and histology and (b) measure reliability across 2D US liver viewpoints | Histology | 147 | 147 | 1647 |
| Testing | TM | Tri-machine data US Dx group, to (a) measure reliability across 2D US liver viewpoints and (b) measure reliability across scanners | - | 246 | 733 | 9215 |
| Testing | HP-T | Histology-proven group, to measure the trend between DL predictions and histology | Histology, CAP | 112 | 112 | 1996 |
Labels were blinded to the deep learning researchers during algorithm development.
DL: Deep learning; US: Ultrasound; BD-L: Big data learning group; BD-V: Big data validation group; HP-U: Histopathology unblinded test group; TM: Tri-machine group; HP-T: Histopathology blinded test group; CAP: Controlled attenuation parameter; 2D: Two-dimensional; Dx: Diagnosis.
Figure 1 Flowchart. 2D-US: Two-dimensional ultrasound.
Figure 2 Image view categorization and grouping. Six ultrasound image viewpoints were used in this study. A: Left lobe longitudinal; B: Left lobe transverse; C: Right lobe intercostal; D: Lower right lobe intercostal (depicting liver/kidney contrast); E: Subcostal depicting liver/kidney contrast; F: Subcostal with hepatic veins. These views were further categorized into four groups: Left liver lobe (A and B), right liver lobe (C), liver/kidney contrast (D and E), and subcostal (E and F). LLL: Left liver lobe; RLL: Right liver lobe; LKC: Liver/kidney contrast; SC: Subcostal. Liver cartoons adapted from the DataBase Center for Life Science (https://commons.wikimedia.org/wiki/File:201405_liver.png), licensed under the Creative Commons Attribution 4.0 International[51] (for copyright permission, see Supplementary material).
Figure 3 Algorithmic workflow. Images were first preprocessed to remove regions outside the ultrasound beam. A deep learning neural network, ResNet-18, was trained on individual ultrasound images in the big data learning group. The model predicted confidences for three binary cutoffs: “≥ mild”, “≥ moderate”, or “= severe” steatosis. The confidences were mapped to a continuous image-wise score in the range [0, 1]. View-group scores were produced by averaging the image-wise scores within each group. An “All View Groups” score was produced by averaging all available view-group scores. In the figure’s example, the gold standard histopathology diagnosis was a fatty cell percentage of 90%. LLL: Left liver lobe; RLL: Right liver lobe; LKC: Liver/kidney contrast; SC: Subcostal.
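The aggregation described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the caption does not state how the three confidences are mapped to the image-wise score, so the plain mean used in `image_score` is an assumption, and all function names are hypothetical.

```python
from statistics import mean


def image_score(conf_mild, conf_moderate, conf_severe):
    """Map three cutoff confidences to one image-wise score in [0, 1].

    ASSUMPTION: a plain mean of the three confidences; the source only
    says the confidences are mapped to a continuous score in [0, 1].
    """
    confs = (conf_mild, conf_moderate, conf_severe)
    if not all(0.0 <= c <= 1.0 for c in confs):
        raise ValueError("confidences must lie in [0, 1]")
    return mean(confs)


def view_group_scores(images):
    """Average image-wise scores within each view group.

    images: iterable of (view_group, image_score) pairs,
    e.g. ("LLL", 0.8). Returns {view_group: mean score}.
    """
    grouped = {}
    for group, score in images:
        grouped.setdefault(group, []).append(score)
    return {group: mean(scores) for group, scores in grouped.items()}


def all_view_groups_score(group_scores):
    """Average the available view-group scores into one study score."""
    return mean(list(group_scores.values()))


# Example study with two view groups (values are made up).
study = [
    ("LLL", image_score(0.9, 0.8, 0.7)),
    ("LLL", image_score(0.7, 0.6, 0.5)),
    ("RLL", image_score(0.6, 0.3, 0.0)),
]
per_group = view_group_scores(study)
overall = all_view_groups_score(per_group)
```

Averaging per view group first, then across groups, keeps a view group with many images from dominating the study-level score.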
Reliability studies in different views
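| View group | n (repeatability) | Max RC (95%CI) | n (cross-scanner) | Bias (95%CI) | Lower LOA (95%CI) | Upper LOA (95%CI) | Agreement |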
| LLL | 342 | 0.27 (0.24, 0.29) | 237 | 0.00 (-0.01, 0.01) | -0.37 (-0.32, -0.42) | 0.37 (0.32, 0.42) | 92% |
| RLL | 370 | 0.21 (0.19, 0.23) | 232 | 0.00 (-0.01, 0.01) | -0.37 (-0.33, -0.42) | 0.37 (0.33, 0.42) | 92% |
| LKC | 267 | 0.30 (0.27, 0.34) | 183 | 0.00 (-0.01, 0.01) | -0.35 (-0.29, -0.41) | 0.35 (0.29, 0.41) | 93% |
| SC | 297 | 0.26 (0.24, 0.29) | 182 | 0.00 (-0.01, 0.01) | -0.36 (-0.31, -0.42) | 0.36 (0.31, 0.42) | 94% |
| All view groups | - | - | 237 | 0.00 (-0.01, 0.01) | -0.25 (-0.21, -0.28) | 0.24 (0.21, 0.28) | 94% |
On the left, the max repeatability coefficient was tabulated when using three images for each view group (Histopathology unblinded test group and Trimachine group datasets). On the right, the bias, worst-case limits of agreement, and agreement (%) were tabulated across different view groups for the Trimachine dataset. The results for all scanner pairs were combined. Parentheses enclose bootstrapped 95% confidence intervals. RC: Repeatability coefficient; LOA: Limits of agreement; LLL: Left liver lobe; RLL: Right liver lobe; LKC: Liver/kidney contrast; SC: Subcostal.
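The bias and limits of agreement reported on the right of the table follow the usual Bland-Altman construction, which can be sketched as below. This is an illustrative sketch under stated assumptions: the 1.96 multiplier and the percentile bootstrap are standard choices rather than details confirmed by the source, the function names are hypothetical, and the "worst-case" LOA and agreement (%) definitions used in the table are not reproduced here.

```python
import random
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower LOA, upper LOA) for paired scores a and b.

    bias is the mean paired difference; the limits of agreement are
    bias -/+ 1.96 standard deviations of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def bootstrap_ci(a, b, stat_index, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% CI for one Bland-Altman statistic.

    stat_index: 0 = bias, 1 = lower LOA, 2 = upper LOA.
    ASSUMPTION: pairs are resampled with replacement; the source only
    says the confidence intervals were bootstrapped.
    """
    rng = random.Random(seed)
    pairs = list(zip(a, b))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        xs, ys = zip(*sample)
        stats.append(bland_altman(xs, ys)[stat_index])
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Made-up scores for the same studies on two scanners.
scanner_a = [0.10, 0.35, 0.60, 0.80, 0.45]
scanner_b = [0.12, 0.30, 0.66, 0.78, 0.40]
bias, lower, upper = bland_altman(scanner_a, scanner_b)
bias_ci = bootstrap_ci(scanner_a, scanner_b, stat_index=0)
```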
Figure 4 Repeatability study. A: A repeatability coefficient plot for right liver lobe when using three images; B-D: Cross-scanner Bland-Altman plots for Siemens-Toshiba (B), Toshiba-Philips (C), and Philips-Siemens (D), respectively. Cross-scanner plots were depicted for “All View Groups” when using ≥ three images per view group. Grey-shaded areas indicate 95% confidence intervals. RC: Repeatability coefficient; LOA: Limit of agreement.
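For readers unfamiliar with the repeatability coefficient in panel A, a minimal sketch of the textbook definition is given below: RC = 1.96 × √2 × the within-study standard deviation of repeated measurements. This is an assumption about the general technique; how the "max" RC reported in this study was derived is not stated here, and the function name is hypothetical.

```python
from math import sqrt
from statistics import variance


def repeatability_coefficient(repeats_per_study):
    """Textbook repeatability coefficient from repeated measurements.

    repeats_per_study: list of lists, one inner list of >= 2 repeated
    scores per study. The within-study variance is pooled across
    studies, weighted by degrees of freedom, and
    RC = 1.96 * sqrt(2) * within-study SD.
    """
    sum_sq = 0.0
    dof = 0
    for repeats in repeats_per_study:
        if len(repeats) < 2:
            raise ValueError("each study needs at least two repeats")
        sum_sq += (len(repeats) - 1) * variance(repeats)
        dof += len(repeats) - 1
    within_sd = sqrt(sum_sq / dof)
    return 1.96 * sqrt(2) * within_sd


# Made-up repeated scores for two studies.
rc = repeatability_coefficient([[0.2, 0.4], [0.5, 0.5]])
```

Under this definition, two repeated measurements of the same study are expected to differ by less than RC in about 95% of cases.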
Receiver operating characteristic analysis on the histopathology unblinded test group and histopathology blinded test group cohorts for diagnosing steatosis grades
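| View group | HP-U n | HP-U AUC, ≥ mild (95%CI) | HP-U AUC, ≥ moderate (95%CI) | HP-U AUC, = severe (95%CI) | HP-U Acc | HP-T n | HP-T AUC, ≥ mild (95%CI) | HP-T AUC, ≥ moderate (95%CI) | HP-T AUC, = severe (95%CI) | HP-T Acc |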
| Complete 4 view group study | ||||||||||
| LLL | 41 | 0.98 (0.93, 1.00) | 0.95 (0.89, 1.00) | 0.94 (0.87, 1.00) | 83% | 51 | 0.90 (0.82, 0.98) | 0.92 (0.81, 1.00) | 0.90 (0.82, 0.99) | 88% |
| RLL | 41 | 0.96 (0.90, 1.00) | 0.95 (0.89, 1.00) | 0.89 (0.79, 0.99) | 85% | 51 | 0.84 (0.73, 0.95) | 0.93 (0.84, 1.00) | 0.92 (0.85, 1.00) | 96% |
| LKC | 41 | 0.96 (0.90, 1.00) | 0.95 (0.89, 1.00) | 0.93 (0.84, 1.00) | 83% | 51 | 0.84 (0.73, 0.95) | 0.95 (0.90, 1.00) | 0.88 (0.79, 0.97) | 90% |
| SC | 41 | 0.96 (0.90, 1.00) | 0.92 (0.84, 1.00) | 0.89 (0.79, 0.99) | 83% | 51 | 0.88 (0.79, 0.97) | 0.93 (0.86, 1.00) | 0.88 (0.77, 0.99) | 90% |
| All view groups | 41 | 0.96 (0.90, 1.00) | 0.94 (0.88, 1.00) | 0.92 (0.83, 1.00) | 83% | 51 | 0.88 (0.79, 0.98) | 0.95 (0.88, 1.00) | 0.91 (0.83, 0.99) | 94% |
| Individual view group study | ||||||||||
| LLL | 103 | 0.95 (0.90, 0.99) | 0.93 (0.87, 0.98) | 0.91 (0.86, 0.97) | 80% | 96 | 0.84 (0.76, 0.93) | 0.92 (0.85, 0.99) | 0.93 (0.88, 0.98) | 90% |
| RLL | 138 | 0.94 (0.91, 0.98) | 0.91 (0.86, 0.98) | 0.85 (0.78, 0.92) | 83% | 109 | 0.82 (0.74, 0.90) | 0.89 (0.83, 0.96) | 0.92 (0.87, 0.97) | 92% |
| LKC | 88 | 0.96 (0.92, 1.00) | 0.92 (0.86, 0.98) | 0.84 (0.76, 0.92) | 80% | 71 | 0.81 (0.69, 0.93) | 0.93 (0.87, 0.99) | 0.89 (0.81, 0.96) | 90% |
| SC | 117 | 0.93 (0.89, 0.98) | 0.91 (0.85, 0.96) | 0.86 (0.79, 0.92) | 79% | 90 | 0.86 (0.77, 0.94) | 0.89 (0.82, 0.96) | 0.90 (0.83, 0.97) | 88% |
| All view groups | 147 | 0.95 (0.91, 0.98) | 0.92 (0.88, 0.96) | 0.87 (0.81, 0.92) | 76% | 112 | 0.85 (0.77, 0.93) | 0.91 (0.85, 0.97) | 0.93 (0.88, 0.98) | 90% |
| FibroScan comparison study | ||||||||||
| All view groups | 147 | 0.95 | 0.95 | 0.92 | 77% | 80 | 0.93 (0.87, 0.98) | 0.97 (0.93, 1.00) | 0.92 | 91% |
| FibroScan | 147 | 0.88 (0.81, 0.95) | 0.88 (0.81, 0.95) | 0.80 (0.73, 0.87) | 62% | 80 | 0.89 (0.82, 0.96) | 0.92 (0.86, 0.98) | 0.82 (0.73, 0.92) | 68% |
Filtered with a minimum of 3 images for each view group.
Selecting only studies with associated FibroScan control attenuation parameter (CAP) scores.
Area under the curve of the receiver operating characteristic significantly better than FibroScan CAP scores.
“Complete 4 view group study” selects only studies where every view group qualifies (three or more images), whereas “Individual view group study” examines the performance of each qualifying view group individually. Numbers in parentheses are 95% confidence intervals. All trends between the deep learning/CAP score and the histopathology grades were significant (P < 0.001). “Acc” is the classification accuracy when the threshold values calculated by optimizing the Youden index[44] are applied. LLL: Left liver lobe; RLL: Right liver lobe; LKC: Liver/kidney contrast; SC: Subcostal; HP-U: Histopathology unblinded test group; HP-T: Histopathology blinded test group; AUC: Area under the curve; CAP: Controlled attenuation parameter.
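The AUC and the Youden-index threshold mentioned in this footnote can be sketched for a single binary cutoff (for example "≥ moderate" versus below) as follows. This is a minimal illustration with hypothetical function names and made-up data; the source does not specify how "Acc" combines the per-cutoff thresholds, so only the single-cutoff case is shown.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= threshold.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def accuracy(scores, labels, threshold):
    """Fraction of cases classified correctly at the given threshold."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


# Made-up study-level scores and binary histology labels.
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
threshold, j_stat = youden_threshold(scores, labels)
acc = accuracy(scores, labels, threshold)
```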
Figure 5 Receiver operating characteristic analysis on histopathology blinded test group. A and B: Receiver operating characteristic curves of the deep learning model for diagnosing hepatic steatosis grades on the histopathology blinded test group (HP-T) when using all scanners and only the Siemens/Toshiba/Philips premium scanners, respectively; C and D: Restricted to HP-T studies with FibroScan diagnoses, showing the performance of the deep learning algorithm and FibroScan, respectively. All receiver operating characteristic curves were measured against a histopathological gold standard. AUCROC: Area under the curve of the receiver operating characteristic.