Christoph Spahn, Estibaliz Gómez-de-Mariscal, Romain F Laine, Pedro M Pereira, Lucas von Chamier, Mia Conduit, Mariana G Pinho, Guillaume Jacquemet, Séamus Holden, Mike Heilemann, Ricardo Henriques.
Abstract
This work demonstrates how to use a range of state-of-the-art artificial neural networks to analyse bacterial microscopy images using the recently developed ZeroCostDL4Mic platform. We generated a database of image datasets used to train networks for various image analysis tasks and present strategies for data acquisition and curation, as well as model training. We showcase different deep learning (DL) approaches for segmenting bright field and fluorescence images of different bacterial species, use object detection to classify different growth stages in time-lapse imaging data, and carry out DL-assisted phenotypic profiling of antibiotic-treated cells. To also demonstrate the ability of DL to enhance low-phototoxicity live-cell microscopy, we showcase how image denoising allows researchers to attain high-fidelity data while imaging faster and for longer. Finally, artificial labelling of cell membranes and prediction of super-resolution images allow for accurate mapping of cell shape and intracellular targets. Our purpose-built database of training and testing data aids the training of novice users, enabling them to quickly explore how to analyse their data through DL. We hope this lays a fertile ground for the efficient application of DL in microbiology and fosters the creation of tools for bacterial cell biology and antibiotic research.
Year: 2022 PMID: 35810255 PMCID: PMC9271087 DOI: 10.1038/s42003-022-03634-z
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Overview of the DL tasks and datasets used in DeepBacs.
a We demonstrate the capabilities of DL in microbiology for segmentation (1), object detection (2), denoising (3), artificial labelling (4) and prediction of super-resolution images (5) of microbial microscopy data. A list of datasets can be found in Supplementary Table 1, comprising different species such as B. subtilis (1), E. coli (2–4) and S. aureus (5) and imaging modalities (widefield (1,2) and confocal (2,3) fluorescence microscopy, bright field imaging (1,2,4) or super-resolution techniques (4,5)). NN = neural network output. CAM = Chloramphenicol. Scale bars: 2 µm. b Schematic workflow of applying a DL network. Users select a ZeroCostDL4Mic notebook based on the image analysis task to be performed. Custom annotated or publicly available datasets are used to train and validate DL models. The user can train the DL model from scratch or load a pretrained model from public repositories (e.g., Zenodo or the BioImage Model Zoo[77]) and fine-tune it. After model accuracy assessment, trained models can be applied to new experimental data.
Overview of the deep learning models used in this study.
| Network | Description |
|---|---|
| U-Net | The U-Net, an encoder-decoder type of convolutional neural network (CNN), was first proposed by Ronneberger et al. to segment microscopy images. |
| CARE | Content-aware image restoration (CARE) is a supervised DL-based image processing workflow developed by Weigert et al. for image restoration. |
| StarDist | StarDist was developed by Schmidt et al. for the supervised segmentation of star-convex objects (i.e., ellipse-like shapes) in 2D and 3D. |
| SplineDist | SplineDist was developed by Uhlmann's group and represents an extension of StarDist to detect non-star-convex objects. |
| pix2pix | pix2pix, developed in the lab of Alexei Efros, is a supervised network belonging to the class of generative adversarial networks (GANs). |
| Noise2Void | Noise2Void is a self-supervised network for image denoising introduced to microscopy by the Jug lab. |
| YOLOv2 | YOLOv2 was developed by Redmon and Farhadi for supervised (real-time) detection and classification of objects in images. |
| fnet | fnet was developed by the group of Gregory Johnson for artificial labelling of bright field/transmitted light images. |
Fig. 2 Segmentation of bacterial images using open-source deep learning approaches.
a Overview of the datasets used for image segmentation. Shown are representative regions of interest for (i) S. aureus bright field and (ii) fluorescence images (Nile Red membrane stain), (iii) E. coli bright field images and (iv) fluorescence images of B. subtilis expressing FtsZ-GFP[47]. b Segmentation of S. aureus bright field and membrane-stain fluorescence images using StarDist[9]. Bright field and fluorescence images were acquired in the same measurements and thus share the same annotations. Yellow dashed lines indicate the cell outlines detected in the test images shown. c Segmentation of E. coli bright field images using the U-Net type network CARE[14] and GAN-type network pix2pix[18]. A representative region of a training image pair (bright field and GT mask) is shown. d Segmentation of fluorescence images of B. subtilis expressing FtsZ-GFP using U-Net and SplineDist[42]. GT = ground truth. e Segmentation and tracking of E. coli cells during recovery from stationary phase. Cells were segmented using StarDist and tracked with TrackMate[45,46]. f Plots show the mean (line) and standard deviation (shaded areas) for all cells in seven different regions of interest (colour-coded). Morphological features were normalised to the first value for each track. Scale bars are 2 µm (a, d), 3 µm (b, c) and 10 µm (e).
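The per-track feature normalisation used for the tracking analysis in panel f can be sketched in a few lines of NumPy. This is an illustrative sketch, not the DeepBacs analysis code: the function names are ours, and it assumes equally long tracks so per-frame statistics can be stacked.

```python
import numpy as np

def normalise_tracks(tracks):
    """Normalise each cell track (e.g., area over time) to its first value."""
    out = []
    for t in tracks:
        t = np.asarray(t, dtype=float)
        out.append(t / t[0])  # first time point becomes 1.0
    return out

def mean_and_sd(tracks):
    """Per-time-point mean and standard deviation over equally long tracks."""
    stack = np.vstack(tracks)
    return stack.mean(axis=0), stack.std(axis=0)
```

Applied per region of interest, the mean becomes the line and the standard deviation the shaded area of the plot.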
Metrics to evaluate model performance.
| Metric | Description |
|---|---|
| Intersection-over-Union (IoU) | The IoU metric reports on the overlap of output and ground truth segmentation masks. Higher overlap represents a better agreement between the model output and ground truth. |
| Precision and recall | These metrics quantify the performance of instance segmentation or object detection. Precision describes which fraction of the detected objects is correctly detected/assigned. Recall describes the sensitivity, i.e., how many of all objects in the dataset were detected. |
| (mean) average precision ((m)AP) | This metric is used to evaluate model performance in object detection and classification tasks. It describes the model's ability to detect objects of individual classes (AP) or all classes (mAP) present in the dataset. To obtain the average precision, precision and recall values for the individual object classes are calculated at different detection thresholds. mAP is calculated by averaging all single-class AP values. |
| Structural similarity (SSIM) | The SSIM value quantifies how similar two images are with respect to pixel intensities and intensity variations. As it is calculated locally using a defined window size, it provides a similarity map that allows identification of regions of high or low similarity. |
| Peak signal-to-noise ratio (PSNR) | The PSNR metric compares a lower-SNR image to its high-SNR counterpart based on the pixel-wise mean squared error. It is often used to compare the results of image compression algorithms, but can also be applied to evaluate model performance on paired test data. |
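The scalar metrics above can be sketched in NumPy for paired test data. This is an illustrative sketch with our own function names; the SSIM shown is the global (single-window) variant rather than the windowed similarity map described in the table.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection-over-Union of two binary segmentation masks."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return np.logical_and(a, b).sum() / union

def psnr(gt, pred, data_range=1.0):
    """Peak signal-to-noise ratio from the pixel-wise mean squared error."""
    mse = np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2)
    if mse == 0:
        return np.inf
    return 10.0 * np.log10(data_range ** 2 / mse)

def ssim_global(x, y, data_range=1.0):
    """Global SSIM (one window over the whole image, standard constants)."""
    x, y = x.astype(np.float64), y.astype(np.float64)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

For production use, windowed SSIM and matched precision/recall are available in packages such as scikit-image and StarDist's own matching utilities.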
Summarised network performance for the different tasks and datasets.
| Task | Organism | Dataset | Figure | Network | IoU | Precision | Recall | mAP | SSIM | PSNR |
|---|---|---|---|---|---|---|---|---|---|---|
| Segmentation | S. aureus | Bright field | 2B | StarDist | 0.64 ± 0.01 | 0.90 ± 0.03 | 0.87 ± 0.03 | – | – | – |
| | S. aureus | Fluorescence | 2B | StarDist | 0.91 ± 0.03 | 0.98 ± 0.02 | 1.00 ± 0.01 | – | – | – |
| | E. coli | Bright field | 2C | U-Net | 0.82 ± 0.03 | 0.56 ± 0.20 | 0.39 ± 0.20 | – | – | – |
| | E. coli | Bright field | 2C | ML-U-Net | 0.78 ± 0.05 | 0.71 ± 0.11 | 0.81 ± 0.09 | – | – | – |
| | E. coli | Bright field | 2C | CARE | 0.83 ± 0.03 | 0.78 ± 0.09 | – | – | – | – |
| | E. coli | Bright field | 2C | StarDist | 0.78 ± 0.03 | 0.83 ± 0.12 | – | – | – | – |
| | E. coli | Bright field | 2C | pix2pix | 0.82 ± 0.07 | 0.64 ± 0.12 | – | – | – | – |
| | B. subtilis | Fluorescence | 2D | U-Net | 0.78 ± 0.06 | 0.67 ± 0.21 | 0.63 ± 0.26 | – | – | – |
| | B. subtilis | Fluorescence | 2D | ML-U-Net | 0.79 ± 0.16 | 0.82 ± 0.20 | – | – | – | – |
| | B. subtilis | Fluorescence | 2D | CARE | 0.74 ± 0.04 | 0.44 ± 0.28 | 0.36 ± 0.23 | – | – | – |
| | B. subtilis | Fluorescence | 2D | StarDist | 0.76 ± 0.03 | – | – | – | – | – |
| | B. subtilis | Fluorescence | 2D | SplineDist | 0.72 ± 0.04 | 0.88 ± 0.06 | 0.87 ± 0.10 | – | – | – |
| | B. subtilis | Fluorescence | 2D | pix2pix | 0.69 ± 0.07 | 0.69 ± 0.20 | 0.64 ± 0.21 | – | – | – |
| | All above | Mixed model | S2 | StarDist | 0.74 ± 0.06 | 0.88 ± 0.08 | 0.84 ± 0.14 | – | – | – |
| | E. coli | Bright field (stat. phase) | 2E | StarDist | 0.83 ± 0.02 | 0.95 ± 0.04 | 0.97 ± 0.03 | – | – | – |
| Object detection | E. coli | Growth stage (large FoV) | – | YOLOv2 | – | 0.65 ± 0.10 | 0.47 ± 0.09 | 0.39 ± 0.09 | – | – |
| | E. coli | Growth stage (small FoV) | 3A | YOLOv2 | – | 0.73 ± 0.03 | 0.74 ± 0.08 | 0.67 ± 0.10 | – | – |
| | E. coli | Antibiotic profiling | 3B | YOLOv2 | – | 0.76 ± 0.13 | 0.76 ± 0.23 | 0.66 ± 0.23 | – | – |
| Denoising | E. coli | H-NS-mScarlet-I | 4A | PureDenoise | – | – | – | – | 0.834 ± 0.013 | 33.5 ± 0.9 |
| | E. coli | H-NS-mScarlet-I | 4A | Noise2Void | – | – | – | – | 0.881 ± 0.005 | 34.9 ± 0.9 |
| | E. coli | H-NS-mScarlet-I | 4A | CARE | – | – | – | – | – | – |
| | E. coli | MreB-sfGFP | 4E | PureDenoise | – | – | – | – | 0.458 ± 0.013 | 26.2 ± 0.9 |
| | E. coli | MreB-sfGFP | 4E | CARE | – | – | – | – | – | – |
| | B. subtilis | FtsZ | 4G | Noise2Void | – | – | – | – | – | – |
| Artificial labelling | E. coli | Widefield | 5A | CARE | – | – | – | – | 0.83 ± 0.05 | 24.4 ± 1.2 |
| | E. coli | Widefield | 5A | fnet | – | – | – | – | – | – |
| | E. coli | PAINT | 5A + S8 | CARE | – | – | – | – | – | 24.0 ± 1.2 |
| | E. coli | PAINT | 5A + S8 | fnet | – | – | – | – | – | – |
| Resolution enhancement | E. coli | WF/SIM | 6A | CARE | – | – | – | – | 0.84 ± 0.03 | 25.4 ± 1.0 |
| | S. aureus | WF/SIM | 6B | CARE | – | – | – | – | 0.92 ± 0.01 | 28.2 ± 0.7 |
Bold numbers mark the best-performing network when multiple networks were applied to the same dataset.
Fig. 3 DL-based object detection and classification.
a A YOLOv2 model was trained to detect and classify different growth stages of live E. coli cells (i). “Dividing” cells (green bounding boxes) show visible septation, the class “Rod” (blue bounding boxes) represents growing cells without visible septation and regions with high cell densities are classified as “Microcolonies” (red bounding boxes). (ii) Three individual frames of a live cell measurement. b Antibiotic phenotyping using object detection. A YOLOv2 model was trained on drug-treated cells (i). The model was tested on synthetic images randomly stitched from patches of different drug treatments (ii). Bounding box colours in the prediction (iii) refer to the colour-code in (i). Vesicles (V, orange boxes) and oblique cells (O, green boxes) were added as additional classes during training. Mecillinam-treated cells were misclassified as MP265-treated cells (red arrows). Scale bars are 10 µm (a, overview), 3 µm (lower panel in a and b) and 1 µm (b, upper panel).
Fig. 4 Image denoising for improved live-cell imaging in bacteriology.
a Low and high signal-to-noise ratio (SNR) image pairs (ground truth, GT) of fixed E. coli cells, labelled for H-NS-mScarlet-I. Denoising was performed with PureDenoise (parametric approach), Noise2Void (self-supervised DL) and CARE (supervised DL). Structural similarity (SSIM) maps compare the low-SNR images or predictions to the high-SNR ground truth (GT). b Representative time points (10 s intervals) of a live-cell measurement recorded at a 1 Hz frame rate, demonstrating that CARE can provide prolonged imaging at high SNR using low-intensity images as input. t1/2 represents the decay half time. c Intensity over time for the different imaging conditions providing the low/high SNR images shown in a/b. d Structural similarity between subsequent imaging frames, calculated for raw and restored time-lapse measurements (Methods). e Denoising of confocal images of MreB-sfGFPsw-expressing E. coli cells, imaged at the bottom plane (i). Outlines show cell boundaries obtained from transmitted light images (ii). (ii) Transmitted light image and SSIM maps generated by comparing raw or denoised data with the high-SNR image. (iii) Tracks of MreB filaments (colour-coded), overlaid with the average image (grey) of a live-cell time series. Violin plots show the distribution of track duration (f) and speed (g) for the high-SNR, low-SNR (raw) and denoised image series, with mean values denoted by circles and percentiles by black boxes. Note that the distribution in g was cut at a maximum speed of 150 nm/s, excluding a small number of high-speed outliers to allow better visualisation of the main distribution. h Denoising of FtsZ-GFP dynamics in live B. subtilis. Cells were vertically trapped and imaged using the VerCINI method[47]. Details are restored by Noise2Void (N2V); rainbow colour-coded images were added for better visualisation. Values in a and e represent mean values derived from 2 (a) and 5 (e) images and the respective standard deviation.
c, d Show mean values and respective standard deviations from 3 measurements. f, g Show tracking results from individual time series. Scale bars are 1 µm (a, b, e i) and 0.5 µm (e iii and h).
Fig. 5 Artificial labelling of E. coli membranes.
a fnet and CARE predictions of diffraction-limited (i) and PAINT super-resolution (SR) (ii) membrane labels obtained from bright field (BF) images. GT = ground truth. Values represent averages from five test images and the respective standard deviation. b Pseudo-dual-colour images of drug-treated E. coli cells. Nucleoids were super-resolved using PAINT imaging with JF646-Hoechst[64]. Membranes were predicted using the trained fnet model. CAM = Chloramphenicol. Scale bars are 2 µm (a) and 1 µm (b).
Fig. 6 Prediction of SIM images from widefield fluorescence images.
Widefield-to-SIM image transformation was performed with CARE for a live E. coli (FM5-95) and b S. aureus (Nile Red) cells. Shown are diffraction-limited widefield images (i) and the magnified regions (ii) indicated by yellow rectangles in (i). WF = widefield; NN = neural network output. (iii) Line profiles correspond to the red lines in the WF images and show a good agreement between prediction and ground truth (bottom panel). Scale bars are 10 µm (i), 1 µm (ii) and 0.5 µm (iii).
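Line profiles such as those in panel (iii) can be extracted with a few lines of NumPy. This nearest-neighbour sketch (function name ours) omits the sub-pixel interpolation a full analysis would use (e.g., scipy.ndimage.map_coordinates).

```python
import numpy as np

def line_profile(img, p0, p1, n=100):
    """Sample intensities along a straight line from p0 to p1 (row, col).

    Uses nearest-neighbour sampling of n evenly spaced points.
    """
    rows = np.linspace(p0[0], p1[0], n)
    cols = np.linspace(p0[1], p1[1], n)
    return img[np.round(rows).astype(int), np.round(cols).astype(int)]
```

Plotting such profiles for prediction and ground truth side by side gives the agreement check shown in the bottom panel.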
Advantages and disadvantages of specific approaches for the performed image analysis tasks.
| Task | Network | Advantages | Disadvantages | Recommended for | Training speed |
|---|---|---|---|---|---|
| (Instance) segmentation | Classical U-Net | Better feature synthesis and correspondence with the input image than classical fully connected neural networks. Reproducible inference in Fiji. | Requires annotated masks and postprocessing of the network output. | Low cell densities, high contrast, arbitrary cell shapes | Intermediate |
| | Multilabel U-Net | Semantic segmentation (background, cell boundary and cell cytosol), which improves the separation of touching objects. Reproducible inference in Fiji. | Requires annotated masks and postprocessing of the network output. Only implemented for 2D data. | Arbitrary cell shapes | Intermediate |
| | StarDist | Highly generalisable, with excellent performance at high object density; available for 2D and 3D; equipped for processing large fields of view; reproducible inference in Fiji, QuPath and Napari. | Limited to star-convex objects; does not work well for objects with a large axial ratio (e.g., long rod-shaped cells). | Cocci, ovococci and small rod-shaped bacterial cells (slow growth, stationary phase); all object densities | Fast |
| | SplineDist | Extends StarDist to regularly shaped, non-star-convex objects. | Computationally expensive with a high RAM demand; only implemented for 2D data. | Curved (non-star-convex) objects | Slow |
| | pix2pix | GAN-type architecture allows for arbitrary image-to-image translation tasks. | Longer training times; post-processing required; high demand on computational resources; risk of strong hallucinations; only implemented for 2D data. | Complex images with multimodal intensity distributions | Slow |
| Object detection | YOLOv2 | Fast training. | Limited number of objects per image; low performance for small objects; fails to detect objects in densely packed clusters; only available in 2D. | <50 uniformly distributed objects/image | Fast |
| Denoising | CARE | Fast training for 2D and 3D data; the trained model can be deployed in Fiji. | Requires paired data (supervised network). | Targets that allow recording of low/high SNR data (slow or chemically fixed) | Fast |
| | Noise2Void | Self-supervised; the same data can be used for training and inference. Fast training; training and inference available in Fiji. | Lower performance than supervised learning approaches; only available for 2D. | Absence of high SNR images (fast dynamics, labels with low photostability) | (Very) fast |
| | PureDenoise (parametric) | Multi-frame denoising; Fiji plugin; no special requirements and no training required. | Often lower performance than DL-based approaches. | Low SNR data with temporal correlation (e.g., processive movement) | N.A. |
| Artificial labelling | CARE | See above. | Lower performance than fnet. | Prediction of membrane labels or structures visible in bright field images | Intermediate |
| | fnet | Training schedule and DL workflow are designed for artificial labelling. | – | Prediction of membrane labels or structures visible in bright field images | Intermediate |
| Super-resolution prediction | CARE | See above. | Might not predict rare sub-diffraction features. | Regular structures (e.g., cell membranes) | Intermediate |
Training speed is given only as a qualitative measure, based on our experience during this work. Note that training time depends on the available computational resources and the size of the training data.
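The blind-spot masking at the heart of the self-supervised Noise2Void approach listed above can be sketched as follows. This is a simplified illustration under our own assumptions (single 2D image, uniform neighbour sampling, function name ours), not the actual Noise2Void implementation: masked pixels are replaced by a random neighbour, and the network would be trained to predict the original noisy values only at those positions.

```python
import numpy as np

def blind_spot_pair(img, n_spots=64, radius=2, seed=0):
    """Build a Noise2Void-style training pair from a single noisy 2D image.

    Returns (input, target, mask): the input has blind-spot pixels replaced
    by random neighbours within `radius`; the target is the original noisy
    image; the loss is evaluated only where mask is True.
    """
    rng = np.random.default_rng(seed)
    h, w = img.shape
    inp = img.copy()
    ys = rng.integers(0, h, n_spots)
    xs = rng.integers(0, w, n_spots)
    for y, x in zip(ys, xs):
        dy = dx = 0
        while dy == 0 and dx == 0:  # resample until the offset is non-zero
            dy = int(rng.integers(-radius, radius + 1))
            dx = int(rng.integers(-radius, radius + 1))
        # clip keeps the neighbour inside the image at the borders
        inp[y, x] = img[np.clip(y + dy, 0, h - 1), np.clip(x + dx, 0, w - 1)]
    mask = np.zeros(img.shape, dtype=bool)
    mask[ys, xs] = True
    return inp, img, mask
```

Because no clean target is needed, such pairs can be generated directly from the experimental data to be denoised, which is what makes the approach attractive for labels with low photostability.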