Naoki Yamato, Hirohiko Niioka, Jun Miyake, Mamoru Hashimoto.
Abstract
A coherent anti-Stokes Raman scattering (CARS) rigid endoscope was developed to visualize peripheral nerves without labeling for nerve-sparing endoscopic surgery. The developed CARS endoscope suffered from a low imaging speed, i.e., a low imaging rate. In this study, we demonstrate that noise reduction with deep learning boosts the nerve imaging speed of CARS endoscopy. We employ fine-tuning and ensemble learning and compare deep learning models with three different architectures. In the fine-tuning strategy, deep learning models are pre-trained with CARS microscopy nerve images and retrained with CARS endoscopy nerve images to compensate for the small dataset of CARS endoscopy images. We propose the equivalent imaging rate (EIR) as a new evaluation metric for quantitatively and directly assessing the imaging-rate improvement achieved by deep learning models. The highest EIR of the deep learning models was 7.0 images/min, five times higher than the 1.4 images/min of the raw endoscopic images. We believe that this improvement in nerve imaging speed will open up the possibility of reducing postoperative dysfunction through intraoperative nerve identification.
Year: 2020 PMID: 32938980 PMCID: PMC7495488 DOI: 10.1038/s41598-020-72241-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Denoising results for CARS microscopy images: (a) input observed at an imaging rate of 33.3 images/min (a short exposure time of 1.8 s), (b) ground truth observed at an imaging rate of 0.7 images/min (a long exposure time of 90 s), and images denoised with (c) the BM3D filter, (d) DN, (e) N2N, and (f) W5. Parts (a–f) each comprise two images: the right image is a magnified view of the white rectangle in the left image. The image size is , with pixels. Nerves lie horizontally in the lower half of the images. Scale bar indicates . The polarization direction of the two laser beams is horizontal in the images.
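The comparisons that follow are scored with PSNR and SSIM. A minimal numpy-only sketch of PSNR against a ground-truth image (SSIM would typically come from a library such as scikit-image); the toy images are invented for illustration:

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a ground-truth and a denoised image."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a clean ramp image versus a noisy copy.
rng = np.random.default_rng(0)
clean = np.tile(np.arange(64, dtype=np.float64), (64, 1))
noisy = clean + rng.normal(0.0, 5.0, clean.shape)
print(round(psnr(clean, noisy), 1))
```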
Comparison of denoising performance between the BM3D filter and the deep learning models.
| Model | PSNR | SSIM |
|---|---|---|
| BM3D | | |
| DN | | |
| N2N | | |
| W5 | | |
The average and standard deviation of each metric over five test images are shown. N2N shows a statistically significant difference from the other methods in a paired t-test with the Bonferroni–Holm correction, and it achieved the highest performance on both metrics. An asterisk (*) marks a significant difference after the Bonferroni–Holm correction.
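The paired t-tests above use the Bonferroni–Holm correction to control for multiple comparisons. A minimal numpy-only sketch of the step-down procedure (the p-values are invented for illustration, not taken from the paper):

```python
import numpy as np

def holm_correction(p_values, alpha=0.05):
    """Bonferroni-Holm step-down procedure: sort p-values ascending and
    compare the k-th smallest against alpha / (m - k). Returns a boolean
    rejection decision per hypothesis, in the original order."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    reject = np.zeros(m, dtype=bool)
    for k, idx in enumerate(order):
        if p[idx] <= alpha / (m - k):
            reject[idx] = True
        else:
            break  # step-down: once one test fails to reject, stop
    return reject

# Three illustrative pairwise-comparison p-values (not from the paper).
print(holm_correction([0.001, 0.04, 0.03]).tolist())  # → [True, False, False]
```

Note that 0.03 < 0.05 yet is not significant after correction, mirroring the "P < 0.05 but no significant difference after Holm" situation reported for the ensemble comparison below.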
Figure 2. Denoised CARS endoscopy images: (a) input observed at an imaging rate of 12.5 images/min (a short exposure time of 4.8 s), denoised with (b) the BM3D filter and (c) N2N with ensemble, and (d) ground truth observed at an imaging rate of 0.4 images/min (a long exposure time of 160 s). Nerves lie at the center of the images with lipid-rich tissue on each side. The CARS images in the second column are cropped from the white rectangles in the left images. Line profiles of the cropped images are shown in (e). Scale bar indicates . The polarization direction of the two laser beams is horizontal in the images.
Results of denoising CARS endoscopy images.
| Model | PSNR | SSIM |
|---|---|---|
| BM3D | | |
| DN with microscopy | | |
| DN with endoscopy | | |
| DN with fine-tuning | | |
| DN with ensemble | | |
| N2N with microscopy | | |
| N2N with endoscopy | | |
| N2N with fine-tuning | | |
| N2N with ensemble | | |
| W5 with microscopy | | |
| W5 with endoscopy | | |
| W5 with fine-tuning | | |
| W5 with ensemble | | |
The input CARS endoscopy images were acquired at 12.5 images/min (an exposure time of 4.8 s). Four training strategies were compared: with microscopy, with endoscopy, with fine-tuning, and with ensemble. The evaluation metrics of the models with fine-tuning were statistically significantly higher than those of the models with endoscopy, except for DN and the PSNR of W5. The evaluation metrics of the models with ensemble were statistically significantly higher than those of the models with fine-tuning, except for the PSNR of W5. The comparison among the models with ensemble showed no significant difference after correction, even though some P values were < 0.05. The models with microscopy performed worse than the models with endoscopy for DN and N2N. All deep learning models with ensemble showed statistically significantly higher performance than BM3D. An asterisk (*) marks a significant difference in a paired t-test with the Bonferroni–Holm correction.
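The "with ensemble" rows average the outputs of several independently trained networks. A generic sketch of that model-averaging step, with toy callables standing in for the trained denoisers (the paper's actual models are not reproduced here):

```python
import numpy as np

def ensemble_denoise(noisy_image, models):
    """Average the outputs of several denoising models (model averaging).
    `models` is any sequence of callables mapping an image to a denoised image."""
    outputs = [model(noisy_image) for model in models]
    return np.mean(outputs, axis=0)

# Toy stand-ins for trained networks: each is biased differently,
# and averaging cancels part of the individual errors.
models = [lambda x: x + 1.0, lambda x: x - 1.0, lambda x: x]
img = np.full((2, 2), 10.0)
print(ensemble_denoise(img, models))
```

Averaging leaves the shared signal intact while partially cancelling the uncorrelated errors of the individual networks, which is the usual motivation for ensembling denoisers.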
Figure 3. Evaluation of the EIR with plots of PSNR (a) and SSIM (b). The blue solid circles are the evaluation metrics for raw images observed at different imaging rates (exposure times). The orange solid circles are the evaluation metrics for images denoised with N2N with ensemble. The PSNR (in dB) and SSIM criteria are plotted as red dashed lines. The star markers in (a) and (b) correspond to the evaluation metrics at 12.5 images/min, including Fig. 2c. (c) The EIRs for the PSNR and SSIM criteria with (orange) and without (blue) denoising, respectively.
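The EIR in (c) can be read off by interpolating the metric-versus-rate curve at the criterion value. A hedged numpy sketch of that lookup; the PSNR values and the 27 dB criterion are invented, since the actual thresholds are not given in this excerpt:

```python
import numpy as np

def equivalent_imaging_rate(rates, metric_values, criterion):
    """Estimate the EIR: the imaging rate at which the quality metric meets
    the criterion, found by linear interpolation of the metric-vs-rate curve.
    The metric is assumed to decrease as the imaging rate increases."""
    rates = np.asarray(rates, dtype=float)
    vals = np.asarray(metric_values, dtype=float)
    # np.interp needs an ascending x-axis, so sort by metric value.
    order = np.argsort(vals)
    return float(np.interp(criterion, vals[order], rates[order]))

# Illustrative numbers (not from the paper): PSNR falls as the rate rises.
rates = [1, 2, 4, 8, 16]               # images/min
psnr = [32.0, 30.0, 28.0, 26.0, 24.0]  # dB
print(equivalent_imaging_rate(rates, psnr, criterion=27.0))  # → 6.0
```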
Denoising processing time per image for the BM3D filter and the three deep learning architectures.
| Model | Average time | Parameters |
|---|---|---|
| BM3D | | – |
| DN | | 0.224 M |
| DN with ensemble | | 0.897 M |
| N2N | | 1.342 M |
| N2N with ensemble | | 5.366 M |
| W5 | | 0.019 M |
| W5 with ensemble | | 0.075 M |
For the deep learning models, the processing times with ensemble learning were also measured. The processing time of N2N was shorter than that of DN because N2N combines five pooling layers. The average and standard deviation over 1000 measurements are shown.
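The timing protocol (mean and standard deviation over repeated per-image runs) can be sketched as follows; the 3×3 mean filter is a toy stand-in for a network forward pass, not one of the paper's models:

```python
import time
import numpy as np

def average_processing_time(fn, image, n_runs=1000):
    """Mean and standard deviation of per-image processing time over n_runs,
    mirroring the table's measurement protocol."""
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn(image)
        times.append(time.perf_counter() - start)
    times = np.asarray(times)
    return times.mean(), times.std()

def box_filter(img):
    """Toy denoiser: 3x3 mean filter with edge padding."""
    padded = np.pad(img, 1, mode="edge")
    return sum(padded[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

mean_t, std_t = average_processing_time(box_filter, np.zeros((64, 64)), n_runs=100)
print(f"{mean_t * 1e3:.3f} ms ± {std_t * 1e3:.3f} ms")
```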
Figure 4. Schematic of the CARS endoscope and microscope system. The two lasers were synchronized with a custom-built electrical device. Each laser beam was transmitted to the endoscopy system via a polarization-maintaining single-mode fiber. The two laser beams were combined with a long-pass dichroic mirror and made incident on a galvanometer scanner. The backscattered CARS signal was separated and detected by a photomultiplier tube. The CARS microscope system is similar to the endoscope system, except for the optical fiber. DM dichroic mirror, F optical filters, FC fiber coupler, GS galvanometer scanner, OL objective lens, PD photodiode, PMF polarization-maintaining single-mode fiber, PMT photomultiplier tube, RM removable mirror.
Figure 5. Schematic diagrams of the three denoising models. (a) The W5 architecture is composed of a skip connection and convolution layers. The convolution layers except for the last layer are followed by a ReLU activation function. (b) The DN architecture has a structure similar to W5. The difference is that each convolution layer has a skip connection to the output. (c) The N2N architecture is based on U-Net. The convolution layers except for the last layer are followed by a leaky ReLU activation function. The number of filters in the convolution layers of the encoder part is half of the number in the decoder part, except for the last three layers.
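The skip connections in (a) and (b) add the input back toward the output, so the convolutional stack effectively learns the noise residual rather than the clean image. A toy numpy sketch of that forward pass, with a hypothetical noise predictor standing in for the trained layers:

```python
import numpy as np

def residual_denoise(noisy, noise_predictor):
    """Skip-connection forward pass: output = input - predicted noise
    (residual learning)."""
    return noisy - noise_predictor(noisy)

rng = np.random.default_rng(1)
clean = np.zeros((8, 8))
noise = rng.normal(0.0, 1.0, clean.shape)
noisy = clean + noise
# Hypothetical stand-in for the trained conv stack: recovers half the noise.
denoised = residual_denoise(noisy, lambda x: 0.5 * noise)
print(np.abs(denoised).mean() < np.abs(noisy).mean())  # → True
```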
Hyperparameters used for the grid search in this work.
| Model | Number of filters | Filter size | Number of convolution layers |
|---|---|---|---|
| DN | 64, 32, 16, or 8 | | 25, 20, 15, or 10 |
| N2N | 128, 96, 64, 48, or 32 | | 18 |
| W5 | 128, 64, 32, 16, or 8 | | 10, 8, 6, 5, or 4 |