Priya Rani1, Shallu Kotwal2, Jatinder Manhas3, Vinod Sharma1, Sparsh Sharma4.
Abstract
Microorganisms, or microbes, make up the majority of the diversity on Earth and are extremely important to human life. They are also integral to ecosystem processes. Recognizing them is highly tedious, yet it is essential in microbiology for carrying out experiments. To overcome these challenges, machine learning techniques help microbiologists automate the entire process. This paper presents a systematic review of research that uses machine learning (ML) and deep learning techniques for image recognition of different microorganisms. The review investigates a set of research questions to analyze the studies with respect to image pre-processing, feature extraction, classification techniques, evaluation measures, methodological limitations and technical development over time. In addition, the paper addresses the challenges faced by researchers in this field. A total of 100 research publications, in chronological order of appearance, have been considered for the period 1995-2021. The review will benefit researchers through its detailed analysis of different methodologies and its comprehensive overview of the effectiveness of the ML techniques applied in microorganism image recognition. © CIMNE, Barcelona, Spain 2021.
Year: 2021 PMID: 34483651 PMCID: PMC8405717 DOI: 10.1007/s11831-021-09639-x
Source DB: PubMed Journal: Arch Comput Methods Eng ISSN: 1134-3060 Impact factor: 8.171
Research questions
| Research Questions | |
|---|---|
| RQ1 | What are the different ML-based approaches used by researchers for microorganism image recognition? What are their limitations? |
| RQ2 | What are the different techniques used for image pre-processing and feature extraction? |
| RQ3 | Which ML techniques are used most for microorganism image classification? |
| RQ4 | What are the different metrics used for evaluating performance of the proposed ML models? |
| RQ5 | How has the ML based image recognition of microorganisms developed over time? |
| RQ6 | What are the main challenges in implementing ML techniques in the concerned field? |
Fig. 1 Yearly distribution of selected articles
Fig. 2 Article selection process
Fig. 3 Number of articles selected from different databases to conduct this review
Fig. 4 Impact of ML techniques on image recognition of various microorganisms
Fig. 5 Example of microscopic images of bacteria species. a Vibrio cholerae [125], b Tuberculosis bacteria [36]
Fig. 6 Flowchart of RF-based identification of tuberculosis bacteria in ZN-stained sputum smear images [36]
Fig. 7 Workflow of bacteria image classification using BOW model and SVM [49]
Summary of research papers reviewed on ML methods for bacteria image recognition
| Author/year | Objective | Segmentation techniques | Type of features | Classification techniques | Dataset details | Performance metrics | Limitation |
|---|---|---|---|---|---|---|---|
| Veropoulos et al. [ | Identification of tuberculosis bacteria in sputum smear images | Edge detection | Shape features | ANN | C = 2 TI = 1147 Po. = 267 Ne. = 880 Tr. = 1000 Te. = 147 | Acc. = 94.1% | Less description about evaluation |
| Liu et al. [ | Classification of different morphotypes of bacterial species | Threshold method | Shape, size and gray density | K-NN | C = 11 TI = 5741 Tr. = 1471 Te. = 4270 | Acc. = 97% | Less efficient in classifying closely related morphotypes |
| Men et al. [ | Image Recognition of heterotrophic bacteria colonies | Threshold method | Shape features | SVM | C = 2 TI = 300 Tr. = 180 Te. = 120 | Acc. = 96.9% | No description about type of heterotrophic bacteria |
| Chen et al. [ | Oral cavity bacteria colony counting | Watershed algorithm and Threshold method | Colour features | SVM | – | Prec. = 0.97 ± 0.03 Rec. = 0.96 ± 0.04 F-score = 0.96 ± 0.01 | Less efficient in recognizing whether colony is clustered or irregularly shaped |
| Xiaojuan et al. [ | Image recognition of wastewater bacteria | Edge detection, Iterative threshold method | Shape features | Adaptive accelerated back propagation, ANN | – | Acc. = 85.5% | Dataset details not mentioned |
| Kumar et al. [ | Image classification of five bacterial species | – | Shape, optical and texture features | ANN | C = 5 TI = 18 | Acc. = 100% | Small dataset |
| Akova et al. [ | Detection of unknown bacteria strains and serovars-level classification of bacteria species | – | Texture and shape features | Bayesian method | C = 28 TI = 2054 Tr. = 1643 Te. = 411 | Acc. = 95% | Imbalanced Dataset |
| Osman et al. [ | Identification of tuberculosis bacteria in sputum smear images | Colour image segmentation, K-means clustering, region growing algorithm | Shape features | ANN | C = 2 TI = 960 Tr. = 400 Te. = 280 Va. = 280 | Acc. = 89.64% | Less description about evaluation |
| Zhai et al. [ | Identifying and counting tuberculosis bacteria in sputum smear image | Colour image segmentation, threshold method | Shape features | Decision tree | C = 3 Te. = 100 | Acc. = 81–90% | Counting of tuberculosis bacilli was done manually |
| Zeder et al. [ | Quality assessment of fluorescently stained microscopic bacteria images | – | Pixel features | ANN | C = 3 TI = 25,000 | Acc. = 94% | Testing was not done on images with high background inhomogeneity |
| Hiremath et al. [ | Segmentation and classification of microscopic cell images of cocci bacteria | Adaptive global threshold method | Shape features | K-NN, ANN | C = 6 TI = 350 | Acc. = 99% | Over-lapped cells were not considered |
| Rulaningtyas et al. [ | Identification of tuberculosis bacteria in sputum smear images | – | Shape features | ANN | C = 2 TI = 100 Tr. = 75 Te. = 25 | Mean square error = 0.000368 | Less description about evaluation |
| Osman et al. [ | Identification of tuberculosis bacteria in sputum smear images | Color image segmentation, K-means clustering, region growing method | Shape features | ANN | C = 3 TI = 1603 Tr. = 1081 Va. = 121 Te. = 401 | Acc. = 74.62% | Less description about evaluation |
| Chayadevi et al. [ | Extraction of bacterial clusters from microscopic images | Freeman chain contour algorithm | Shape features | K-means, SOM(ANN) | TI = 320 | – | Less description about type of bacterial species in dataset |
| Ahmed et al. [ | Classification of scatter patterns of colonies formed by vibrio species | – | Shape and Texture features | SVM | C = 10 TI = 1000 | Acc. = 90–99% Prec. = 0.9 Rec. = 0.9 | High computational cost |
| Ayas et al. [ | Identification of tuberculosis bacteria in sputum smear images | Pixel based segmentation using RF | Colour and Shape features | RF | C = 2 TI = 116 Tr. = 40 Te. = 76 | Sens. = 89.34% Sens. = 88.47% | Less training data |
| Govindan et al. [ | Identification of tuberculosis bacteria in sputum smear images | De-correlation stretching method and K-means clustering method | Shape features | SVM | C = 2 | Sens. = 72.98% | Incomplete description about results and dataset |
| Nie et al. [ | Segmentation and classification of bacterial colony images | CDBN | Deep features and Texture features | SVM, CNN | C = 2 TI = 862 | Acc. = 97.14% Acc. = 62.10% Rec. = 82.16% Prec. = 83.76% | Less efficient in classifying bacteria colonies after species interaction |
| Ghosh et al. [ | Identification of tuberculosis bacteria in sputum smear images | Threshold method | Colour, Shape and Granularity features | Fuzzy membership function | C = 2 Te. = 150 | Sens. = 93.9% Spec. = 88.2% | Less dataset details were mentioned |
| Seo et al. [ | Classification of Staphylococcus species using hyper spectral imaging | Threshold method | Spectral features | SVM and Partial Least square discriminant analysis | C = 5 | Acc. = 97.8% Kappa = 0.97 | Hyper spectral imaging is complex and costly |
| Priya et al. [ | Object and image level classification of tuberculosis bacteria in sputum smear images | Active contour method | Shape features | SVM, back propagation, ANN | C = 2 Tr. = 1537 | Acc. = 92.5% Sens. = 95% Spec. = 90% F-score = 92.68% | Imbalanced datasets |
| | | | | | C = 2 TI = 100 | Acc. = 91.30 Sens. = 91.59% Spec. = 88.46% F-score = 95.03% | |
| Ferrari et al. [ | Counting bacterial colonies in culture plate images | Threshold method | Shape, Deep features | CNN, SVM, watershed method | C = 7 TI = 28,500 Tr. = 19,950 Te. = 8550 | Prec. = 0.82 Rec. = 0.80 | Imbalanced dataset |
| Lopez et al. [ | Identification of tuberculosis bacteria in sputum smear images | – | Deep features | CNN | C = 2 TI = 29,310 | Acc. = 99% | Less dataset details were mentioned |
| Turra et al. [ | Bacteria identification using hyperspectral Imaging | – | Deep features | CNN, SVM, RF | C = 8 | Acc. = 99.7% | Hyper spectral imaging is complex and costly |
| Zielinski et al. [ | Classification of different genera and species of bacteria | – | Texture and deep features | SVM, RF | C = 33 TI = 660 | Acc. = 97.24% | Small dataset |
| Wahid et al. [ | Classification of five pathogenic bacteria species | – | Deep features | CNN | C = 5 TI = 500 Tr. = 400 Te. = 100 | Acc. = 95% | Small dataset |
| Andreini et al. [ | Segmentation of bacterial colonies using DL | Semantic segmentation using CNN | – | – | TI = 324 TI = 119,000 | Mean intersection-over-union = 86.33% | Imbalanced Dataset |
| Hay et al. [ | Identification of Larval zebrafish intestinal bacteria | – | Deep features | three-dimensional CNN | C = 2 TI = 22,302 Tr. = 21,000 Te. = 1302 | Acc. = 89.3% | Imbalanced Dataset |
| Mohamed et al. [ | Image classification of ten bacterial species | – | Texture features | SVM | C = 10 TI = 200 Tr. = 140 Te. = 70 | Acc. = 97% | Small dataset |
| Rahmayuna et al. [ | Genus level classification of pathogenic bacteria | – | Shape and Texture features | SVM | C = 4 TI = 600 Tr. = 540 Te. = 60 | Acc. = 90.33% | Feature selection was done manually |
| Panicker et al. [ | Identification of tuberculosis bacteria in sputum smear images | Threshold method | Deep features | CNN | C = 2 Tr. = 1800 Te. = 1817 | Sens. = 97.13% Prec. = 78.4% F-score = 86.76% Acc. = 98.88% | Low precision |
| Traore et al. [ | Image classification of Vibrio cholerae and Plasmodium falciparum | – | Deep features | CNN | C = 2 TI = 480 Tr. = 400 Te. = 80 | Acc. = 94% | Less details about type of Images |
| Ahmed et al. [ | Image classification of seven bacteria species | – | Deep features | SVM | C = 7 Tr. = 800 Te. = 160 | Acc. = 96% | Less efficient in classifying images with multiple bacteria species |
| Mithra et al. [ | Identification of Tuberculosis bacteria in sputum smear images | Threshold method | Texture features | CDBN | C = 3 Tr. = 275 Te. = 225 | Acc. = 97.55% Sens. = 97.86% Spec. = 98.23% | Less training data |
| Treebupachatsakul et al. [ | Classification of Staphylococcus and Lactobacillus | – | Deep features | CNN | C = 2 | Acc. = 75% | Less accuracy and less dataset description |
| Bonah et al. [ | Classification of food borne pathogens using hyper spectral Imaging | – | Spectral features | SVM, Linear Discriminant Analysis, GS,GA,PSO, ACO, CARS, SI | C = 8 | Acc. = 99.47% | Hyper spectral imaging is complex and costly |
| Treebupachatsakul et al. [ | Image recognition of three bacterial species and one yeast species | – | Deep features | CNN | C = 4 | Acc. = 98.66% | Less dataset details were mentioned |
| Mhathesh et al. [ | Image classification of | – | Deep features | three-dimensional CNN | C = 2 | Acc. = 95% | Less dataset details were mentioned |
*Acc. = Accuracy, Prec. = Precision, Rec. = Recall, Sens. = Sensitivity, Spec. = Specificity, TI = Total Images, C = Classes, Tr. = Training, Te. = Testing, Va. = Validation, Po. = Positive, Ne. = Negative
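The metrics abbreviated in the footnote above are standard confusion-matrix quantities. The following is a minimal sketch of how they could be computed for a two-class problem such as bacilli versus background. The function names `binary_metrics` and `f_score` and the example counts are illustrative assumptions, not values or code from any reviewed study.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, tn, fn):
    """Compute common evaluation metrics from binary confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Recall and sensitivity are the same quantity.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,            # also reported as sensitivity
        "specificity": specificity,
        "f_score": f_score(precision, recall),
    }


# Hypothetical, deliberately imbalanced counts (110 positives, 890 negatives).
metrics = binary_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these hypothetical counts, accuracy is 0.97 while recall is only about 0.82, which is one reason accuracy alone can be misleading for the imbalanced datasets flagged in the Limitation column. As a consistency check against the table, `f_score(0.784, 0.9713)`, using the precision and sensitivity listed for Panicker et al., gives roughly 0.868, close to the F-score reported in that row.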
Fig. 8 Example of microscopic images of different algae genera [73]
Fig. 9 Flowchart of Mask-RCNN for diatom image segmentation [87]
Summary of research papers reviewed on ML methods for algae image recognition
| Authors/Year | Objective | Segmentation techniques | Types of features | Classification techniques | Dataset details | Performance metrics | Limitation |
|---|---|---|---|---|---|---|---|
| Theil et al. [ | Image recognition of blue green algae species | Edge detection and Threshold method | Shape and texture features | Discriminant analysis | C = 9 TI = 158 | Acc. = 98% | Less dataset details were mentioned |
| Tang et al. [ | Image recognition of plankton | Mean shift method | Shape and texture features | ANN | C = 6 TI = 1869 Tr. = 935 Te. = 939 | Acc. = 95% | Imbalanced dataset |
| Alvarez et al. [ | Diatom image recognition | – | Frequency features | ANN | – | – | No details about dataset and results |
| Luo et al. [ | Plankton image recognition using active learning | – | – | SVM | C = 5 TI = 7440 | Acc. = 85.55% | No details about types of species and features |
| Blaschko et al. [ | Identification of plankton in Flowcam images | Snake based and intensity based segmentation | Shape, Differential and Texture features | Decision trees, K-NN, SVM, Naïve Bayes, Ensemble methods(Bagging, boosting) | C = 13 TI = 982 | Acc. = 72.61% | No error independence in Ensemble methods |
| Jalba et al. [ | Diatom Identification | – | Shape features | Decision trees, bagging, K-NN | C = 6 TI = 120 C = 37 TI = 781 | Acc. = 75% Acc. = 90% | No Details about quantity of dataset classes |
| Tao et al. [ | Image classification of Red tide algae species | Edge detection | Shape, Texture, and Differential Features | Naïve bayes, SVM | C = 9 TI = 2400 | Acc. = 74.3% | Less efficient in rejecting unknown algae species |
| Tao et al. [ | Image classification of Red tide algae species | Edge detection | Shape, Texture and Differential Features | Support vector data description, SVM | C = 9 TI = 2400 | Acc. = 82.3% | Only global image features were used |
| Xu et al. [ | Image classification of Red tide algae species | Edge detection and otsu self adaptive algorithm | Shape features | SVM, fuzzy C-means clustering | C = 5 TI = 1498 Tr. = 714 Te. = 784 | Acc. = 91.33% | No details about quantity of dataset classes |
| Mosleh et al. [ | Freshwater algae image recognition | Edge detection | Shape and texture features | ANN | C = 5 TI = 500 Tr. = 200 Te. = 300 | Acc. = 93% | Does not work well for images having multiple objects |
| Drews- et al. [ | Microalgae image classification | – | Feature extraction using Flowcam software | Expectation Maximization algorithm, gaussian mixture model | C = 4 TI = 1526 C = 4 TI = 923 | Acc. = 92% | Imbalanced datasets |
| Schulze et al. [ | Phytoplankton image recognition | Region growing approach, Edge detection | Shape and texture features | ANN | C = 12 Tr. = 7200 | Acc. = 94.7% | Less dataset details were mentioned |
| Coltelli et al. [ | Identification of algae species using unsupervised learning | Threshold method | Shape and colour features | SOM (ANN) | TI = 53,869 Tr. = 16,161 Te. = 37,708 | Acc. = 98.6% | Less training data |
| Promdaen et al. [ | Classification of microalgae genera images | Single and multi resolution edge detection | Shape and texture features | SVM | C = 12 TI = 720 Tr. = 540 Te. = 180 | Acc. = 97.22% | No mechanism for rejecting unknown algae |
| Dannemiller et al. [ | Image segmentation of freshwater algae | SVM | Spatial features | – | C = 2 Tr. = 200 | Detection rate = 95% | Less description about evaluation |
| Medina et al. [ | Detection of algae in underwater pipeline | Edge detection, Hough Transform, Gaussian filter | Texture features | ANN, SVM | C = 2 TI = 19,921 | Acc. = 93.60% | Imbalanced dataset |
| Qiu et al. [ | Image segmentation of | SVM, threshold method, grey Surface direction angle model | Pixel-level features | – | C = 2 | – | |
| Correa et al. [ | Classification of microalgae species using imbalanced dataset | – | Feature extraction using FlowCam | SVM, K-NN, ANN, naïve bayes | C = 19 TI = 24,302 | Kappa = 0.981 F1-score = 0.982 | Limited features were extracted using Flowcam |
| Medina et al. [ | Detection of algae in underwater pipeline | – | Shape, Texture and deep features | CNN and MLP (ANN) | C = 2 TI = 41,992 | Acc. = 99.4% | Incomplete details about quantity of dataset classes |
| Giraldo-Zuluaga et al. [ | Identification of Microalgae in microscopic images | Threshold method | Shape and Texture features | SVM, ANN | C = 4 TI = 1680 | Acc. = 98.63% | High processing time |
| Dannemiller et al. [ | Segmentation of algae microscopic images | Non-uniform background Subtraction, SVM | Texture features | – | C = 2 Tr. = 200 | – | Less description about performance evaluation |
| Lakshmi et al. [ | Classification of | – | Texture features, deep features | ANN, CNN | TI = 400 Te. = 220 | Acc. = 91.82% | Less dataset details were mentioned |
| Wu et al. [ | Algal blooms discrimination | K-means algorithm and region growing algorithm | Shape and texture features | ANN | C = 2 TI = 90 Tr. = 74 Te. = 16 | Acc. = 80% | Imbalanced dataset |
| Deglint et al. [ | Identification of algae species | Binary background-foreground Classifier | Shape and fluorescence based spectral features | ANN | C = 6 TI = 2611 Tr. = 1883 Te. = 778 | Acc. = 96.1% | Imbalanced dataset |
| Park et al. [ | Red tide algae image recognition | – | Texture features | Hierarchical learning method | C = 63 TI = 3500 | Acc. = 94.7% | Less description about species and results |
| Iamsiri et al. [ | Image classification of filamentous microalgae | Edge and region detection | Shape Features | SVM | C = 5 TI = 300 Tr. = 300 | Acc. = 91.30% | Model was tested on training set |
| Sanchez et al. [ | Identification of diatom and its life cycle stage using both supervised and unsupervised learning | Threshold method | Shape and Texture features | SVM, KNN, K-means | Dataset 1 C = 8 TI = 703 Dataset 2 C = 6 TI = 382 Dataset 3 C = 5 TI = 244 | Acc. = 99.9% | Imbalanced datasets |
| Ruiz-Santaquiteria et al. [ | Comparative study of semantic and instance segmentation for diatom image segmentation | SegNet, Mask-RCNN | Deep features | – | C = 10 TI = 126 | Sens. = 95% Spec. = 60% Prec. = 57% Sens. = 86% Spec. = 91% Prec. = 85% | Mask-RCNN achieved better results but sensitivity score is low |
*Acc. = Accuracy, Prec. = Precision, Rec. = Recall, Sens. = Sensitivity, Spec. = Specificity, TI = Total Images, C = Classes, Tr. = Training, Te. = Testing, Va. = Validation, Po. = Positive, Ne. = Negative
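Many rows in the tables list a "threshold method" as the segmentation step, and one row names an Otsu self-adaptive algorithm. The following is a minimal pure-Python sketch of Otsu's threshold selection on 8-bit grayscale pixel values. It is a generic illustration under the assumption of integer intensities in 0..255, not the implementation used in any reviewed paper. The names `otsu_threshold` and `segment` are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    `pixels` is a flat iterable of integers in the range [0, levels).
    """
    hist = [0] * levels
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1

    sum_all = sum(level * count for level, count in enumerate(hist))
    weight0 = 0      # number of pixels in the lower class
    sum0 = 0         # intensity sum of the lower class
    best_t, best_var = 0, -1.0

    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue             # lower class still empty
        weight1 = total - weight0
        if weight1 == 0:
            break                # upper class empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def segment(pixels, threshold):
    """Binary mask: 1 where the pixel is brighter than the threshold."""
    return [1 if p > threshold else 0 for p in pixels]


# Hypothetical toy image: dark background (10) and bright objects (200).
image = [10] * 50 + [200] * 50
t = otsu_threshold(image)
mask = segment(image, t)
print(t, sum(mask))
```

Whether the objects end up as the bright or the dark class depends on the staining and imaging setup, so the comparison in `segment` may need to be inverted for a given dataset.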
Fig. 10 Example of microscopic images of protozoa species. a Peranema, b Euglypha, c Coleps, d Aspidisca cicada [93]
Fig. 11 Main steps of Protozoa image pre-processing method, a pre-treated image, b region of interest, c binary image after segmentation, d final image [93]
Summary of research papers reviewed on ML methods for protozoa image recognition
| Author/year | Objective | Segmentation techniques | Types of features | Classification techniques | Dataset details | Performance metrics | Limitation |
|---|---|---|---|---|---|---|---|
| Widmer et al. [ | Identification of | – | Pixel intensity | ANN | C = 2 Tr. = 525 Te. = 200 | Acc. = 81% | Imbalanced dataset |
| Widmer et al. [ | Image classification of | – | Shape features | ANN | C = 2 Tr.(GC) = 2431 Tr.(GO) = 1586 Te. = 100 Va. = 782 | Acc. = 91.8% for GC, Acc. = 99.6% for CO | Imbalanced dataset |
| Weller et al. [ | | – | Shape, Texture, and Colour features | SOM (ANN) | C = 5 TI = 903 | – | Less description about results |
| Castonan et al. [ | Image recognition of seven distinct | Threshold method | Shape and texture features | Bayesian classifier | C = 7 TI = 3891 Tr. = 2724 Te. = 1167 | Acc. = 85.75% | Imbalanced dataset |
| Ginoris et al. [ | Image recognition of wastewater protozoa and metazoan | Threshold method | Shape features | Discriminant analysis, ANN and decision trees | C = 23 Tr. = 1548 Te. = 766 | Acc. = 83% | Less efficient for stalked protozoa images |
| Amaral et al. [ | Identification of stalked wastewater protozoa | Threshold method | Shape and signature based features | ANN | C = 8 | Acc. = 84.6% | Less efficient for small stalked protozoa and |
| Suzuki et al. [ | Segmentation and classification of human intestinal parasites | Ellipse matching, image foresting technique | Shape, colour and texture features | Optimum path forest, ANN, SVM, Bagging, Adaboost | C = 16 TI = 5763 Tr. = 2881 Te. = 2881 | Acc. = 98.22% Sens. = 90.38% Spec. = 98.19% | No details about quantity of dataset classes |
| Li et al. [ | Image classification of Environmental microorganisms | – | Shape features | SVM | C = 20 TI = 200 C = 20 TI = 200 | Acc. = 89.7% | High computational time |
| Li et al. [ | Image classification of Environmental microorganisms | Edge detection method | Shape features | SVM | C = 20 TI = 200 C = 20 TI = 200 | Acc. = 89.7% Sens. = 99% Spec. = 99% Similarity = 99% | Segmentation technique needs to be improved |
| Yang et al. [ | Image classification of Environmental microorganisms | Mouse clicking based manual method and Edge Detection | Shape features | SVM | C = 20 TI = 200 | Acc. = 92.5% | Segmentation technique needs to be improved |
| Apostol et al. [ | Classification of Radiolarian images | – | Shape and texture features | SVM | C = 5 TI = 1029 Tr. = 60 | Acc. = 95% | Less training data |
| Abdalla et al. [ | Image recognition of | Edge detection | Pixel-based features | KNN, ANN | TI = 4402 C = 7 TI = 2902 C = 11 | Acc. = 96.6% Acc. = 91.9% | Imbalanced datasets |
| Keceli et al. [ | Classification of Radiolarian images | Threshold method | Shape, texture and deep features | SVM, K-NN, Adaboost, RF | C = 4 | Acc. = 98.7% Spec. = 99.9% Sens. = 98.1% | Less dataset details were mentioned |
| Zhong et al. [ | Image classification of foraminifera species | – | Deep features and features extracted using Bag-of -features framework | RF,ANN, SVM, K-NN | C = 7 TI = 1437 | Prec. = 86% Rec. = 90% | Imbalanced dataset |
| Kosov et al. [ | Image Classification of Environmental microorganism | – | Shape, texture and deep features | Conditional random field | C = 21 TI = 400 | Acc. = 91.40% | Small dataset |
| Pho et al. [ | Detection and identification of protozoa species in cyst and oocyst images | RetinaNet | Deep features | RetinaNet | C = 8 TI = 69 | mAP = 0.77 Prec. = 0.78 Rec. = 0.90 | Small dataset |
| Solano et al. [ | Radiolarian classification using supervised and unsupervised ML techniques | – | Shape and texture features | Naïve bayes, RF, SOM, K-means | TI = 60 | Acc. = 88.57% Acc. = 88.89% | Less dataset details were mentioned |
| Vijayalakshmi et al. [ | Identification of | – | Deep features | CNN, SVM | C = 2 TI = 2550 Tr. = 1530 Te. = 1020 | Acc. = 93.1% Prec. = 89.95% Sens. = 93.44% Spec. = 92.92% F-score = 91.66% | Imbalanced dataset |
| Mitra et al. [ | Image classification of Foraminifera | – | Deep features | CNN | C = 7 TI = 1437 | Prec. = 80% Rec. = 82% F-score = 81% | Imbalanced dataset |
| Dionisio et al. [ | Genus-level and species-level classification of radiolarian species | – | Deep features | CNN | C = 9 TI = 929 C = 2 TI = 154 | Acc. = 91.85% Acc. = 100% | Imbalanced datasets |
| Liang et al. [ | Image classification of environmental microorganism | – | Deep features | CNN | C = 21 TI = 294 | Acc. = 92.9% | Less dataset details were mentioned |
| Zhang et al. [ | Image segmentation of Environmental microorganisms using a low-cost U-net | CNN, U-net, concatenate operations and Inception | – | – | – | Dice = 87.13% Jaccard = 79.74% Prec. = 90.14% Rec. = 87.12% Acc. = 96.91% VOE = 20.26% | Memory cost was reduced, but time cost was still high |
*Acc. = Accuracy, Prec. = Precision, Rec. = Recall, Sens. = Sensitivity, Spec. = Specificity, TI = Total Images, C = Classes, Tr. = Training, Te. = Testing, Va. = Validation, Po. = Positive, Ne. = Negative, VOE = Volumetric overlap error
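The segmentation rows above report overlap measures such as Dice, Jaccard and VOE. The following is a minimal sketch of how these could be computed from a predicted and a ground-truth binary mask, assuming the common definition VOE = 1 - Jaccard (consistent with the Jaccard and VOE values listed for Zhang et al., which sum to 100%). The function name `overlap_metrics` and the toy masks are illustrative assumptions.

```python
def overlap_metrics(pred, truth):
    """Dice, Jaccard and volumetric overlap error for two binary masks.

    `pred` and `truth` are flat sequences of 0/1 values of equal length.
    Two empty masks are treated as a perfect match.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection

    dice = 2 * intersection / (pred_size + truth_size) if (pred_size + truth_size) else 1.0
    jaccard = intersection / union if union else 1.0
    return {
        "dice": dice,
        "jaccard": jaccard,   # also called intersection-over-union
        "voe": 1.0 - jaccard, # assumed definition: VOE = 1 - Jaccard
    }


# Hypothetical 1-D toy masks: two of the three foreground pixels overlap.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
print(overlap_metrics(predicted, reference))
```

For a single pair of masks, Dice and Jaccard are related by Dice = 2J / (1 + J). Values averaged over many images, as in the reviewed studies, need not satisfy that identity exactly.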
Fig. 12 Example of microscopic images of fungi. a Penicillium, b Aspergillus
Fig. 13 CNN architecture for fungus detection [118]
Summary of research papers reviewed on ML methods for fungi image recognition
| Author/Year | Objective | Segmentation techniques | Type of features | Classification techniques | Dataset details | Performance metrics | Limitation |
|---|---|---|---|---|---|---|---|
| Jin et al. [ | Classification of | Parametric contour approach | Spectral features | SVM | C = 4 | Acc. = 95% | Less dataset details were mentioned |
| Yu et al. [ | Classification of Yeast cell images | Threshold method | Shape features | SVM, KNN | C = 3 TI = 240 | Acc. = 82% | Under-focused images were not correctly classified |
| Tleis et al. [ | Classification of Yeast cell images | – | Shape and Texture features | Logistic regression | C = 2 TI = 1380 | Acc. = 82.2% | Less dataset details were mentioned |
| Liu et al. [ | Detection of fungi in microscopic fecal images | Threshold method | Shape features | ANN | C = 3 TI = 2062 Tr. = 924 Te. = 916 | Acc. = 93.6% | Imbalanced dataset |
| Zhang et al. [ | Identification of Fungi in Leucorrhea images | Threshold method | Deep features | SVM, CNN | C = 2 Tr. = 50,896 Te. = 3418 | Sens. = 99.8% Spec. = 95.1% | Imbalanced dataset |
| Tahir et al. [ | Fungus spores detection | – | Shape, size and color features | SVM | C = 2 Tr. = 822 Te. = 100 | Acc. = 88% | High computational cost |
| Tahir et al. [ | Identification of five types of fungus spores | CNN | Deep features | CNN | C = 6 TI = 40,800 Tr. = 30,000 Te. = 10,800 | Acc. = 94.8% | Image quality needs to be improved |
| Arredondo-Santoyo et al. [ | To Characterise the dye decolourisation of fungal strains | – | Texture features, adhoc expert features and deep features | Extremely randomized trees, SVM, KNN, MLP, RF, logistic regression | C = 4 TI = 1024 | Acc. = 96.5% | Less details about training and test sets |
| Zhou et al. [ | Identification of | Threshold method | Deep features | CNN | C = 3 TI = 70 | Acc. = 69.25% | Less accuracy |
| Hao et al. [ | Detection of | Threshold method | Deep features | CNN | C = 2 TI = 30,000 Tr. = 27,000 Te. = 3000 | Acc. = 93.26% Rec. = 94.34% | Imbalanced dataset |
| Zielinski et al. [ | Classification of microscopic fungi images | Threshold method | Deep features | SVM | C = 5 TI = 180 | Acc. = 91.1% | Small dataset |
| Ma et al. [ | Image classification of Aspergillus fungi species | – | Deep features | CNN | C = 7 TI = 19,995 Tr. = 12,249 Te. = 4893 Va. = 2853 | Acc. = 98.2% | Imbalanced dataset |
*Acc. = Accuracy, Prec. = Precision, Rec. = Recall, Sens. = Sensitivity, Spec. = Specificity, TI = Total Images, C = Classes, Tr. = Training, Te. = Testing, Va. = Validation, Po. = Positive, Ne. = Negative
Descriptors used for different types of features extraction for microorganism classification
| Types of features | Descriptors |
|---|---|
| Shape | Simple geometric descriptors (area, width, perimeter, diameter, eccentricity, curvature, roundness etc.), Moment invariants (Hu, Zernike, pseudo-Zernike, Chebyshev etc.), Fourier descriptors, HOG descriptors and CNN feature maps |
| Texture | mean, contrast, coarseness, roughness, variance, directionality, regularity, rotation invariant, local binary pattern, Haralick features, Gray-Level Co-occurrence Matrix, SIFT descriptors, SURF descriptors and CNN feature maps |
| Differential | Shape index |
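To make the simple geometric descriptors in the table above concrete, the following is a minimal sketch that computes area, perimeter, width, height and roundness from a closed contour given as polygon vertices. It assumes roundness is defined as 4πA/P² (1.0 for a perfect circle), which is one common convention. Individual studies may define it differently. The function name `polygon_shape_descriptors` and the example contour are hypothetical.

```python
import math


def polygon_shape_descriptors(contour):
    """Simple geometric shape descriptors for a closed polygonal contour.

    `contour` is a list of (x, y) vertices in order around the boundary.
    The closing edge from the last vertex back to the first is implied.
    """
    n = len(contour)
    if n < 3:
        raise ValueError("a contour needs at least three vertices")

    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0          # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)

    area = abs(twice_area) / 2.0
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return {
        "area": area,
        "perimeter": perimeter,
        "width": max(xs) - min(xs),              # bounding-box width
        "height": max(ys) - min(ys),             # bounding-box height
        # Assumed definition: 4*pi*A / P^2, equal to 1.0 for a circle.
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
    }


# Hypothetical contour: a 2 x 2 square, e.g. traced around a segmented cell.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_shape_descriptors(square))
```

In practice the contour would come from a segmented binary mask, and the resulting descriptor vector would be passed to a classifier such as the SVM or ANN models listed in the summary tables.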
Fig. 14 Distribution of ML techniques used in reviewed research papers
Development Trend in ML based research in microorganism image recognition
| Period | Development |
|---|---|
| 1995–1999 | The researchers just started implementing ML techniques in microbiology. Some researchers used |
| 2000–2005 | During this span, ML based research was done for phytoplankton, diatom and tuberculosis classification. Techniques like |
| 2006–2010 | Researchers explored |
| 2011–2014 | Researchers introduced |
| 2015–2021 | During this interval researchers started exploring |
| Abbreviation | Name |
|---|---|
| ANN | Artificial Neural Network |
| ACO | Ant Colony Optimization |
| BOW | Bag Of Words |
| BPNN | Back Propagation Neural Network |
| CLAHE | Contrast Limited Adaptive Histogram Equalization |
| CNN | Convolutional Neural Network |
| CARS | Competitive Adaptive Reweighted Sampling |
| CDBN | Conditional Deep Belief Network |
| DL | Deep Learning |
| GA | Genetic Algorithm |
| GS | Grid Search |
| HOG | Histogram Of Oriented Gradients |
| K-NN | K-Nearest Neighbour |
| LVQ | Learning Vector Quantization |
| ML | Machine Learning |
| MLP | Multilayer Perceptron |
| PNN | Probabilistic Neural Network |
| PCA | Principal Component Analysis |
| PSO | Particle Swarm Optimization |
| ROC | Receiver Operating Characteristic Curve |
| RF | Random Forest |
| SVM | Support Vector Machines |
| SOM | Self-Organizing Map |
| SURF | Speeded Up Robust Features |
| SI | Synergy Interval |
| SIFT | Scale Invariant Feature Transform |
| SMO | Sequential Minimal Optimization |
| SMOTE | Synthetic Minority Oversampling Technique |
| ZN | Ziehl–Neelsen |