Avoiding a replication crisis in deep-learning-based bioimage analysis
Romain F Laine, Ignacio Arganda-Carreras, Ricardo Henriques, Guillaume Jacquemet.
Year: 2021 PMID: 34608322 PMCID: PMC7611896 DOI: 10.1038/s41592-021-01284-3
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. Using classical or DL algorithms to analyse microscopy images.
This figure illustrates the critical steps required when using classical or DL-based algorithms to analyse microscopy images, using denoising as an example. When using a classical algorithm, the researchers' effort goes into designing mathematical formulae that can then be applied directly to the images. When using a DL algorithm, a model first needs to be trained using a training dataset; the trained model can then be applied to other images to generate predictions. Typically, such a model will only perform well on images similar to the ones used during training, which highlights the importance of the quantity and diversity of the data used to train the DL algorithm. The microscopy images displayed are breast cancer cells labelled with SiR-DNA to visualise their nuclei, imaged using a spinning disk confocal microscope (SDCM). The denoising shown in the "classical algorithm" section was performed using PureDenoise implemented in Fiji [20,21]; the denoising shown in the "Deep Learning algorithm" section was performed using CARE implemented in ZeroCostDL4Mic [10,22].
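To make the contrast between the two workflows concrete, here is a minimal, hedged Python sketch. The classical route applies a fixed formula (a Gaussian filter) with no training; the DL route is caricatured by a deliberately simple "model" fitted on paired noisy/ground-truth data, standing in for real network training such as CARE. All function names and toy data below are illustrative, not the authors' code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def classical_denoise(noisy, sigma=1.0):
    # Classical algorithm: a hand-designed formula applied directly,
    # no training data required.
    return gaussian_filter(noisy, sigma=sigma)

def train_toy_denoiser(noisy_train, clean_train):
    # Toy stand-in for DL training: fit a single affine map noisy -> clean
    # by least squares. A real model (e.g. CARE) learns far more parameters,
    # but the workflow is the same: fit on paired images, then predict.
    a, b = np.polyfit(noisy_train.ravel(), clean_train.ravel(), deg=1)
    def predict(noisy_new):
        # Predictions are only expected to be reliable on images that
        # resemble the training data (see the caption above).
        return a * noisy_new + b
    return predict

# Usage: train on one paired (noisy, ground-truth) image, apply to another.
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
noisy = clean + rng.normal(scale=0.2, size=clean.shape)
denoise = train_toy_denoiser(noisy, clean)

noisy_new = clean + rng.normal(scale=0.2, size=clean.shape)
restored_dl = denoise(noisy_new)
restored_classical = classical_denoise(noisy_new)
```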
Figure 2. Using quality metrics to assess the performance of DL models.
Figure illustrating that comparing DL-based predictions to ground-truth images is a powerful strategy to assess a DL model's performance. (A, B) Noisy images of breast cancer cells labelled with SiR-DNA were denoised using CARE (A, B; [10]), Noise2Void (B; [11]) and DecoNoising (B; [36]), all implemented in ZeroCostDL4Mic [22]. Noisy and ground-truth images were acquired using different exposure times. (A) Matching noisy, ground-truth and CARE prediction images. White squares highlight regions of interest that are magnified in the bottom rows. The image similarity metrics mSSIM, NRMSE and PSNR (see Box 1) shown on the images were obtained by comparing them to the ground-truth image. The SSIM map (yellow: high agreement; dark blue: low agreement; 1 indicates perfect agreement) and the RSE map (yellow: high error; dark blue: low error; 0 indicates perfect agreement) highlight the differences between the CARE prediction and the corresponding ground-truth image. Note that the agreement between these two images is not homogeneous across the field of view and that these maps are helpful for identifying spatial artefacts. (B) Magnified region of interest from (A) showcasing how image similarity metrics can be used to compare DL models trained with different algorithms but the same training dataset. Note that in this example all three algorithms improved the original image, but to different extents. Importantly, these results do not represent the overall performance of the algorithms used to train these models; they only assess the suitability of the resulting models for denoising this specific dataset. (C) Example highlighting how segmentation metrics can be used to evaluate the performance of pre-trained segmentation models [29,31,32]. The image segmentation metrics intersection over union (IoU; 1 indicates perfect agreement), F1 score (F1; 1 indicates perfect agreement) and panoptic quality (PQ; 1 indicates perfect agreement; [37]) displayed on the images were obtained by comparing them to the ground-truth image, which was manually annotated. Of note, these results do not reflect the overall quality of these pre-trained models (or of the algorithms used to train them) but only assess their suitability for segmenting this dataset.
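As a companion to the metrics named in this caption, the sketch below shows one way to compute mSSIM, NRMSE, PSNR and a per-pixel SSIM map with scikit-image, and IoU and F1 for binary masks with NumPy. This is a minimal illustration assuming scikit-image is installed; F1 is computed here at the pixel level (equivalent to the Dice coefficient), whereas instance-level F1 would require object matching, and panoptic quality is omitted because it additionally requires matching predicted and ground-truth instances [37]. The function names and toy data are ours, not from the paper.

```python
import numpy as np
from skimage.metrics import (
    structural_similarity,
    normalized_root_mse,
    peak_signal_noise_ratio,
)

def image_similarity(prediction, ground_truth):
    # mSSIM, NRMSE and PSNR between a prediction and its ground truth;
    # ssim_map is the per-pixel agreement map shown in the figure.
    data_range = float(ground_truth.max() - ground_truth.min())
    mssim, ssim_map = structural_similarity(
        ground_truth, prediction, data_range=data_range, full=True
    )
    nrmse = normalized_root_mse(ground_truth, prediction)
    psnr = peak_signal_noise_ratio(ground_truth, prediction, data_range=data_range)
    return mssim, nrmse, psnr, ssim_map

def segmentation_scores(pred_mask, gt_mask):
    # IoU and pixel-level F1 (Dice) for binary masks;
    # both equal 1 for perfect agreement.
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = intersection / union if union else 1.0
    denom = pred.sum() + gt.sum()
    f1 = 2 * intersection / denom if denom else 1.0
    return iou, f1

# Usage with toy data:
rng = np.random.default_rng(0)
gt = rng.random((64, 64))
pred = np.clip(gt + rng.normal(scale=0.05, size=gt.shape), 0, 1)
print(image_similarity(pred, gt)[:3])
print(segmentation_scores(pred > 0.5, gt > 0.5))
```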