Timothy L Kline, Panagiotis Korfiatis, Marie E Edwards, Jaime D Blais, Frank S Czerwiec, Peter C Harris, Bernard F King, Vicente E Torres, Bradley J Erickson.
Abstract
Deep learning techniques are being rapidly applied to medical imaging tasks, from organ and lesion segmentation to tissue and tumor classification. These techniques are becoming the leading algorithmic approaches to solve inherently difficult image processing tasks. Currently, the most critical requirement for successful implementation lies in the need for relatively large datasets that can be used for training the deep learning networks. Based on our initial studies of MR imaging examinations of the kidneys of patients affected by polycystic kidney disease (PKD), we have generated a unique database of imaging data and corresponding reference standard segmentations of polycystic kidneys. In the study of PKD, segmentation of the kidneys is needed in order to measure total kidney volume (TKV). Automated methods to segment the kidneys and measure TKV are needed to increase measurement throughput and alleviate the inherent variability of human-derived measurements. We hypothesize that deep learning techniques can be leveraged to perform fast, accurate, reproducible, and fully automated segmentation of polycystic kidneys. Here, we describe a fully automated approach for segmenting PKD kidneys within MR images that simulates a multi-observer approach in order to create an accurate and robust method for the task of segmentation and computation of TKV for PKD patients. A total of 2000 cases were used for training and validation, and 400 cases were used for testing. The multi-observer ensemble method had mean ± SD percent volume difference of 0.68 ± 2.2% compared with the reference standard segmentations. The complete framework performs fully automated segmentation at a level comparable with interobserver variability and could be considered as a replacement for the task of segmentation of PKD kidneys by a human.
Keywords: Autosomal dominant polycystic kidney disease; Deep learning; Magnetic resonance imaging; Planimetry; Segmentation; Total kidney volume
Year: 2017 PMID: 28550374 PMCID: PMC5537093 DOI: 10.1007/s10278-017-9978-1
Source DB: PubMed Journal: J Digit Imaging ISSN: 0897-1889 Impact factor: 4.056
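The abstract describes a simulated multi-observer approach and the computation of TKV from the resulting segmentation, but it does not say how the individual segmentations are combined. The sketch below assumes per-pixel majority voting across several masks and computes volume as voxel count times voxel volume. The function names, the toy masks, and the voxel dimensions are hypothetical and are not taken from the study.

```python
from typing import List

Mask = List[List[int]]  # 2D binary mask: 1 = kidney, 0 = background


def majority_vote(masks: List[Mask]) -> Mask:
    """Label a pixel as kidney when more than half of the input masks agree.

    Majority voting is an assumption; the abstract does not state the rule.
    """
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


def volume_ml(mask_slices: List[Mask], voxel_volume_mm3: float) -> float:
    """Volume = number of labelled voxels x voxel volume, converted mm^3 -> mL."""
    voxels = sum(v for s in mask_slices for row in s for v in row)
    return voxels * voxel_volume_mm3 / 1000.0


# Three hypothetical "observers" (e.g. independently trained networks), one 2x3 slice.
observers = [
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 0]],
]
consensus = majority_vote(observers)
print(consensus)
# Hypothetical voxel size of 1.5 x 1.5 x 4.0 mm.
print(volume_ml([consensus], voxel_volume_mm3=1.5 * 1.5 * 4.0))
```

With an odd number of observers a strict majority always exists, which is one reason a voting ensemble is often built from an odd number of models.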
Fig. 1. Optimized network architecture consisting of a series of downsampling, upsampling, and skip connections. Each block consists of a series of convolutions (3 × 3 kernels, ReLU activation) and dropout layers (0.35). Both max pooling layers and upsampling layers are of size 2 × 2. The final convolutional layer is a 1 × 1 kernel with sigmoid activation, resulting in classification of each voxel of the input (size 256 × 256).
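The resolution-changing operations named in the Fig. 1 caption can be illustrated with a small pure-Python sketch. It shows 2 × 2 max pooling, 2 × 2 upsampling, and a sigmoid followed by thresholding. It is not the authors' network: the convolutions, dropout, and skip connections are omitted, and the nearest-neighbour upsampling, the 0.5 threshold, and the number of pooling levels are assumptions that the caption does not specify.

```python
import math
from typing import List

Image = List[List[float]]


def max_pool_2x2(x: Image) -> Image:
    """Halve each (even) dimension by keeping the maximum of every 2x2 block."""
    return [
        [max(x[r][c], x[r][c + 1], x[r + 1][c], x[r + 1][c + 1])
         for c in range(0, len(x[0]), 2)]
        for r in range(0, len(x), 2)
    ]


def upsample_2x2(x: Image) -> Image:
    """Double each dimension by repeating every value in a 2x2 block
    (nearest-neighbour repetition, an assumption)."""
    out: Image = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


def sigmoid_classify(logits: Image, threshold: float = 0.5) -> List[List[int]]:
    """Apply a sigmoid per pixel and threshold it into a binary label."""
    return [
        [1 if 1.0 / (1.0 + math.exp(-v)) > threshold else 0 for v in row]
        for row in logits
    ]


x = [[1, 2, 5, 6],
     [3, 4, 7, 8],
     [0, 0, 1, 1],
     [0, 9, 1, 1]]
pooled = max_pool_2x2(x)         # 4x4 -> 2x2
restored = upsample_2x2(pooled)  # 2x2 -> 4x4
print(pooled, restored)

# Side lengths for a 256 x 256 input, assuming four pooling levels
# (the depth is a guess; the caption does not give it).
print([256 // 2 ** level for level in range(5)])
```

Because pooling discards spatial detail that upsampling cannot recover on its own, architectures of this kind use skip connections to carry the higher-resolution features across to the upsampling path.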
Fig. 2. Training and validation curves for the optimized network. Training and validation Dice coefficients of 0.97 and 0.96 were obtained, respectively. Network weights were monitored and saved based on the best performance on the validation set.
Fig. 3. Examples of segmentations obtained for three different patients. The left column shows the MR images, the second column the reference standard segmentations, the third column the automated segmentations, and the right column the segmentations overlaid on one another. Reference standard segmentations are shown in red, and automated segmentations are shown in blue. Regions of overlap are purple. The top row shows an average example from the dataset, which had a Dice coefficient of 0.96. The second row shows the worst-performing case, which had a Dice coefficient of 0.92. The difficulty in this case is the rarer (in terms of this particular dataset) T2-weighted acquisition (a FISP image), which suffers from image artifacts (particularly banding artifacts resulting from intravoxel dephasing). The final row shows an example of a patient with significant polycystic liver disease. Notice how the automated approach does not classify the liver, or the liver cysts, as kidney.
Summary statistics for the automated approach compared with the gold standard. Shown are the results for an individual network, as well as the multi-observer approach
| Statistic | Individual | Multi-observer |
|---|---|---|
| Jaccard | 0.93 ± 0.03 [0.78/0.98] | 0.94 ± 0.03 [0.85/0.98] |
| Dice | 0.96 ± 0.02 [0.88/0.99] | 0.97 ± 0.01 [0.92/0.99] |
| Sensitivity | 0.96 ± 0.02 [0.79/0.99] | 0.96 ± 0.02 [0.89/0.99] |
| Specificity | 0.99 ± 0.01 [0.99/1.00] | 0.99 ± 0.01 [0.99/1.00] |
| Precision | 0.97 ± 0.02 [0.83/1.00] | 0.97 ± 0.02 [0.88/1.00] |
|  | 0.57 ± 0.46 [0.18/4.45] | 0.49 ± 0.36 [0.17/3.69] |
| Volume difference % | −1.42 ± 2.75 [−18.90/15.72] | −0.65 ± 2.21 [−8.06/7.04] |
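For reference, the overlap statistics named in the table can be computed from voxel-wise counts of true and false positives and negatives. The sketch below uses the standard definitions on flattened binary masks. The function name and the toy masks are hypothetical, and the sign convention for the volume difference (automated minus reference, relative to reference) is an assumption. The unlabeled row of the table is not covered.

```python
from typing import Dict, List


def segmentation_metrics(reference: List[int], predicted: List[int]) -> Dict[str, float]:
    """Overlap metrics for two flattened binary masks of equal length.

    Assumes both masks contain foreground and background voxels;
    otherwise some denominators are zero.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    return {
        "jaccard": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        # Assumed sign convention: (automated - reference) / reference.
        "volume_difference_pct": 100.0 * ((tp + fp) - (tp + fn)) / (tp + fn),
    }


# Toy example: the prediction misses one of four reference voxels.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
for name, value in segmentation_metrics(reference, predicted).items():
    print(f"{name}: {value:.3f}")
```

Dice and Jaccard are monotonically related (Dice = 2J / (1 + J)), so they rank cases identically. A negative volume difference under this convention means the automated mask is smaller than the reference.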
Fig. 4. Bland-Altman analysis of the percent difference of TKV measurements obtained by the automated approach and the reference standard segmentations for both an individual network and the simulated multi-observer approach. The mean difference (solid line) and 95% confidence intervals (dotted lines) are also shown. For the individual network, the mean ± SD for the percent volume difference was −1.42 ± 2.75 and the 95% confidence intervals were [−6.93 to 4.09]. For the multi-observer approach, the mean ± SD for the percent volume difference was −0.65 ± 2.21 and the 95% confidence intervals were [−4.97 to 3.63].
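A Bland-Altman summary of this kind reduces to the mean of the per-case differences and an interval around that mean. The sketch below assumes the percent difference is taken relative to the reference volume, that the sample standard deviation is used, and that the interval is mean ± 1.96 SD (often called the limits of agreement). The caption does not state which multiplier or SD estimator the authors used, and the TKV values here are made up for illustration.

```python
from statistics import mean, stdev
from typing import List, Tuple


def bland_altman_percent(
    automated: List[float], reference: List[float], z: float = 1.96
) -> Tuple[float, float, Tuple[float, float]]:
    """Return (mean % difference, SD, (lower, upper)) for paired volumes.

    Percent difference is (automated - reference) / reference * 100,
    an assumed convention; the interval is mean +/- z * sample SD.
    """
    diffs = [100.0 * (a - r) / r for a, r in zip(automated, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, sd, (bias - z * sd, bias + z * sd)


# Hypothetical TKV values in mL for four cases.
automated_tkv = [990.0, 1515.0, 780.0, 2040.0]
reference_tkv = [1000.0, 1500.0, 800.0, 2000.0]
bias, sd, (lower, upper) = bland_altman_percent(automated_tkv, reference_tkv)
print(f"mean = {bias:.3f}%, SD = {sd:.3f}%, interval = [{lower:.2f}, {upper:.2f}]")
```

A mean close to zero indicates little systematic over- or under-estimation, while the width of the interval reflects case-to-case scatter.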