Susanne Dunker, David Boho, Jana Wäldchen, Patrick Mäder.
Abstract
BACKGROUND: Phytoplankton species identification and counting is a crucial step of water quality assessment. Drinking water reservoirs, bathing water and ballast water in particular need to be monitored regularly for harmful species. In times of multiple environmental threats such as eutrophication, climate warming and the introduction of invasive species, more intensive monitoring would help in developing adequate measures. However, traditional methods such as microscopic counting by experts or high-throughput flow cytometry based on scattering and fluorescence signals are either too time-consuming or too inaccurate for species identification tasks. Combining high-quality microscopy with high throughput and the latest developments in machine learning techniques can overcome this hurdle.
Keywords: CNN; Deep learning; High throughput cytometry; Image-based identification; Images; Imaging flow cytometry; Machine learning; Magnification; Morphology; Phytoplankton
Year: 2018 PMID: 30509239 PMCID: PMC6276140 DOI: 10.1186/s12898-018-0209-5
Source DB: PubMed Journal: BMC Ecol ISSN: 1472-6785 Impact factor: 2.964
Fig. 1 Variability of the fluorescence pattern across growth stages (early exponential, exponential and stationary phase) for all nine species, presented as Chl a fluorescence excited by a 488 nm laser (x-axis) and a 561 nm laser (y-axis). Yellow dots represent senescent cells during stationary phase, light green, blue or brown dots represent cells growing in early exponential phase, and green, blue or brown dots represent cells growing in exponential phase
Overview of the investigated species: strain identity, culture medium, cell size and weighted average tolerated total phosphorus range from a global dataset (according to Phillips et al. [38], Supplementary material)
| Strain | Taxonomic group | Medium | TP range (µg L⁻¹)ᵃ |
|---|---|---|---|
| SAG 276-3a | Green algae | BBM | 25–90 |
| SAG 11-32b | Green algae | BBM | 12–41 |
| SAG 211-11b | Green algae | BBM | 27–87 |
| SAG 41.79 | Cyanobacteria | Z-Medium | 16–60 |
| SAG 979-3 | Cryptophyte | BBM | 12–41 |
| SAG 276-4d | Green algae | BBM | 25–90 |
| SAG 1450-1 | Cyanobacteria | Z-Medium | 38–108 |
| SAG 257-1 | Green algae | BBM | 8–28 |
| PCC 6803 | Cyanobacteria | Z-Medium | 21–62 |
ᵃ Tolerated range of total phosphorus (TP) according to Phillips et al. [38]
ᵇ Indicator genus according to Palmer [36]
ᶜ Indicator genus according to Reynolds [41]
Fig. 2 Number of images (brightfield or Chl a fluorescence images, respectively) included per species and life cycle stage (stationary phase, early exponential phase and exponential phase)
Fig. 3 Exemplary brightfield images (two or four images per case) taken with the ImageStream®X MK II (×60 magnification) of each phytoplankton species used in this study for training of the deep learning network at three different life cycle stages (early exponential, exponential and stationary phase)
Fig. 4 Accuracy and per-class accuracy as metrics for four different classifiers trained on (1) brightfield images alone, (2) Chl a fluorescence images alone, (3) all brightfield and Chl a fluorescence images and (4) merged brightfield-Chl a fluorescence images to predict species identity
Fig. 5 Accuracy and per-class accuracy as metrics for four different classifiers trained on (1) brightfield images alone, (2) Chl a fluorescence images alone, (3) all brightfield and Chl a fluorescence images and (4) merged brightfield-Chl a fluorescence images to predict species identity and life cycle stage
Fig. 6 Exemplary images and confusion matrices of four different classifiers trained on (1) brightfield images alone, (2) Chl a fluorescence images alone, (3) all brightfield and Chl a fluorescence images (“All images”) and (4) merged brightfield-Chl a fluorescence images (“Merged images”), (a) solely identifying species and (b) identifying species at different life cycle stages. The scale of the confusion matrices indicates the percentage of correct and incorrect classifications
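The metrics named in Figs. 4 to 6 can all be derived from a confusion matrix. The following is a minimal sketch, not the authors' evaluation code. It assumes that "per-class accuracy" means the mean of the per-class recall values (often called balanced accuracy) and that the percentages in the confusion matrices are normalised per true class; neither definition is stated in the captions. All function names and the toy labels are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(true_labels: Sequence[int],
                     predicted_labels: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Count matrix with true classes as rows and predicted classes as columns."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


def overall_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_accuracy(matrix: List[List[int]]) -> float:
    """Assumed definition: mean recall over all classes that have samples."""
    recalls = [row[i] / sum(row) for i, row in enumerate(matrix) if sum(row) > 0]
    return sum(recalls) / len(recalls)


def row_percentages(matrix: List[List[int]]) -> List[List[float]]:
    """Normalise each row (true class) to percentages, as assumed for Fig. 6."""
    return [[100.0 * count / sum(row) if sum(row) else 0.0 for count in row]
            for row in matrix]


# Toy example with three hypothetical classes (not data from the study).
true_labels = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
predicted_labels = [0, 0, 0, 1, 1, 1, 2, 2, 0, 2]
matrix = confusion_matrix(true_labels, predicted_labels, n_classes=3)

print(matrix)                       # [[3, 1, 0], [0, 2, 0], [1, 0, 3]]
print(overall_accuracy(matrix))     # 0.8
print(per_class_accuracy(matrix))   # mean of 0.75, 1.0 and 0.75
print(row_percentages(matrix)[0])   # [75.0, 25.0, 0.0]
```

When classes differ strongly in the number of images, as Fig. 2 suggests they may, the two metrics can diverge: overall accuracy is dominated by the largest classes, whereas the mean per-class value weights every class equally.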
Literature review of automated phytoplankton species identification with machine learning (neural networks), based on flow cytometric data or images
| Data type | Reference | Network | Species/classes | Species size | Parameters included | Type of parameter | Instrument | Accuracy |
|---|---|---|---|---|---|---|---|---|
| Flow cytometric data (scatter and fluorescence values) | Frankel et al. [ | ANN (Kohonen network/Back-propagation neural networks) | 5 | Picoplankton, large phytoplankton | 5 | FSC | EPICS V | 92–100% |
| | Balfoort et al. [ | Multilayer feedforward network (NWorks, ANNET, 8) | 8 | 3–3500 µm | 6 | FSC, SSC, TOF | Optical Plankton Analyzer | 90–98% |
| | Boddy et al. [ | Back-propagation neural networks, hierarchical approach | 40 | 3–40 µm | 6 | FSC (horizontal, vertical), SSC, TOF | EPICS 741 | > 70% |
| | Wilkins et al. [ | MLP, RBF | 42 | Boddy et al. [ | | | | 68–74% |
| | Wilkins et al. [ | RBF ANN | 34 | 3–1000 µm | 11 | FSC, SSC, TOF | EurOPA | 92% |
| | Boddy et al. [ | RBF NN | 72 | 1–45 µm | 7 | FSC-H, SSC-H, FL1-H, FL2-H, FL3-H, FL3-A, TOF | FACSort™ | 70–77% |
| Pulse shape | Malkassian et al. [ | | 20 | n. a. | 8 | FSC, SSC | CytoSub | 78% |
| Images | Gorsky et al. [ | | 3 | 3–43 µm | 5 | Area, Circularity, Convexity, Length, Perimeter | Autonomous Image Analyzer/HIAC | – |
| | Embleton et al. [ | MLP | 4 | 10–390 µm | 74 | Area, Circularity, Diameter, Fibrelength, Grey level values (SD, Skewness, Kurtosis), Perimeter | Microscope camera (Sony DXC-930P) | 67–93% |
| | Sosik and Olson [ | SVM | 15 (natural samples) | 10–100 µm | 22 categories/210 elements | Size, shape, symmetry, Texture characteristics, Diffraction, Co-occurrence | FlowCytobot | 68–99% |
| | Blaschko et al. [ | SVM | 13 classes | n. a. | 780 features | Simple shape, moments, contour, differential, texture | FlowCam | 71% |
| | Correa et al. [ | CNN | 19 classes | n. a. | – | – | FlowCam | 89% |
| | Rodenacker et al. [ | DT, LDA | 23 | n. a. | 5 | Shape, significant points, principal components, contour, Fourier descriptor, extinction, Shape moments, colorimetry, fluorimetry | Inverse microscope | 76% |
| | Chen et al. [ | CNN, PCA + SVM | 1 | n. a. | 16 | Diameter, tight area, perimeter, circularity, major axis, orientation, loose area, median radius, OPD, refractive index, absorption, scattering | TS-QPI | < 85% |
| | Li et al. [ | CNN | 9 | n. a. | – | – | Mueller matrix microscope | 97% |
| | Pedraza et al. [ | CNN | 80 | n. a. | – | – | Microscopy | 99% |
| | This study | CNN | 9 | 1–90 µm | – | – | ImageStream®X MK II | 97% |
For each study, the network, species or classes, species size, number and type of parameters, instrument or technique and accuracy are provided. For some studies, the investigated cell size range could not be determined from the text
TOF: time of flight; MLP: multilayer perceptron/backpropagation network; RBF: radial basis function; SVM: support vector machine; TS-QPI: time stretch quantitative phase imaging; OPD: optical path length difference; n. a.: not available; Picoplankton: 0.2–2 µm; Nanoplankton: 2–20 µm; Microplankton: 20–200 µm; Macroplankton: 200–2000 µm
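The size classes in the legend can be used to translate the "Species size" column into plankton size classes. The sketch below is illustrative only: the class boundaries are taken from the legend, while the handling of boundary values (lower bound inclusive, upper bound exclusive) and the function names are assumptions.

```python
from typing import List, Optional, Tuple

# Size classes from the table legend: (name, lower bound in µm, upper bound in µm).
SIZE_CLASSES: List[Tuple[str, float, float]] = [
    ("Picoplankton", 0.2, 2.0),
    ("Nanoplankton", 2.0, 20.0),
    ("Microplankton", 20.0, 200.0),
    ("Macroplankton", 200.0, 2000.0),
]


def size_class(diameter_um: float) -> Optional[str]:
    """Return the size class of a single cell size, or None if out of range.

    Assumption: a value on a shared boundary (e.g. 20 µm) belongs to the
    larger class; the legend does not specify this.
    """
    for name, lower, upper in SIZE_CLASSES:
        if lower <= diameter_um < upper:
            return name
    return None


def classes_spanned(min_um: float, max_um: float) -> List[str]:
    """Return all size classes that overlap the closed range [min_um, max_um]."""
    return [name for name, lower, upper in SIZE_CLASSES
            if min_um < upper and max_um >= lower]


print(size_class(1.0))             # Picoplankton
print(classes_spanned(1.0, 90.0))  # ['Picoplankton', 'Nanoplankton', 'Microplankton']
print(classes_spanned(3.0, 40.0))  # ['Nanoplankton', 'Microplankton']
```

Under these assumptions, the 1–90 µm range listed for this study spans three of the four size classes.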