Haiyong Zheng1, Ruchen Wang1, Zhibin Yu1, Nan Wang1, Zhaorui Gu1, Bing Zheng2.
Abstract
BACKGROUND: Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of the marine food chain. As fundamental components of marine ecosystems, plankton are very sensitive to environmental changes, so studying their abundance and distribution is crucial for understanding those changes and protecting marine ecosystems. This study was carried out to develop a widely applicable, high-accuracy plankton classification system for the growing number and variety of imaging devices. The literature shows that most plankton image classification systems are limited to one specific imaging device and a relatively narrow taxonomic scope. A truly practical system for automatic plankton classification does not yet exist, and this study aims in part to fill this gap.
Keywords: Feature selection; Image classification; Multiple kernel learning; Multiple view features; Plankton classification
Year: 2017 PMID: 29297354 PMCID: PMC5751094 DOI: 10.1186/s12859-017-1954-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The framework of our proposed plankton image classification system
Fig. 2 An example of image pre-processing. (a) Original captured plankton image. (b) Binarization. (c) Denoising. (d) Extraction
Fig. 3 The Gabor filters with different parameters
Fig. 4 The LBP features
Fig. 5 The HOG features
Fig. 6 The keypoints of SIFT features
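The pre-processing stages named in the Fig. 2 caption (binarization, denoising, extraction) can be sketched on a tiny grayscale image as follows. This is only an illustration under stated assumptions, not the authors' implementation: the fixed threshold, the assumption that the organism is darker than the background, the 4-connected component labelling, the `min_area` cutoff, and all function names are hypothetical choices made for the example.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Mark pixels darker than `threshold` as foreground (1).

    Assumption: the organism is darker than the background.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(mask):
    """Return the 4-connected foreground components as lists of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                queue, comp = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(comp)
    return components


def denoise(mask, min_area=3):
    """Keep only components with at least `min_area` pixels (hypothetical cutoff)."""
    clean = [[0] * len(mask[0]) for _ in mask]
    for comp in connected_components(mask):
        if len(comp) >= min_area:
            for y, x in comp:
                clean[y][x] = 1
    return clean


def extract(gray, mask):
    """Crop the grayscale image to the bounding box of the remaining foreground."""
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return []
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return [row[min(xs):max(xs) + 1] for row in gray[min(ys):max(ys) + 1]]


# Toy image: a 2x2 dark blob plus one isolated dark pixel (noise).
image = [
    [255, 255, 255, 255, 255, 255],
    [255,  40,  50, 255, 255, 255],
    [255,  45,  60, 255, 255,  30],
    [255, 255, 255, 255, 255, 255],
]
mask = denoise(binarize(image))
crop = extract(image, mask)
print(crop)  # [[40, 50], [45, 60]]
```

In this toy run the isolated dark pixel is discarded as noise and the crop contains only the 2x2 blob. A real pipeline would more likely rely on an image library and an adaptive threshold.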
Confusion matrix
| Total population: true condition (rows) vs. predicted condition (columns) | Prediction positive | Prediction negative |
|---|---|---|
| Condition positive | True positive (TP) | False negative (FN) |
| Condition negative | False positive (FP) | True negative (TN) |
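The counts in the confusion matrix above are enough to compute recall, precision, and the F-measure. The sketch below assumes the standard definitions R = TP / (TP + FN), P = TP / (TP + FP), and F = 2PR / (P + R); the function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn):
    """Return (recall, 1 - precision, F-measure) from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, 1.0 - precision, f_measure


# Hypothetical counts: 80 true positives, 10 false positives, 20 false negatives.
r, one_minus_p, f = binary_metrics(tp=80, fp=10, fn=20)
print(f"R={r:.3f}  1-P={one_minus_p:.3f}  F={f:.3f}")  # R=0.800  1-P=0.111  F=0.842
```

True negatives do not enter any of the three quantities, which is why they are not passed to the function.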
Fig. 7 Image examples from WHOI dataset
Fig. 8 Image examples from ZooScan dataset
Fig. 9 The number of images per category in ZooScan dataset
Fig. 10 Image examples from Kaggle dataset
Fig. 11 The number of images per category in Kaggle dataset
The classification results of the baseline system
| Metric | WHOI dataset | ZooScan dataset | Kaggle dataset |
|---|---|---|---|
| Recall (R) | 88.27% | 80.6% | 75.36% |
| 1−Precision (1−P) | 11.63% | 16.3% | 21.49% |
| F-measure | 0.883 | 0.821 | 0.769 |
Fig. 12 Confusion matrices of the baseline system. (a) WHOI dataset. (b) ZooScan dataset. (c) Kaggle dataset
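A multi-class confusion matrix like those in Fig. 12 yields per-class recall and precision by treating each class in turn as the positive condition. The sketch below is an assumption-laden illustration: the 3x3 matrix is invented, the function names are hypothetical, and macro-averaging over classes is one common convention that this record does not confirm as the paper's choice.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j.

    Returns a list of (recall, precision) pairs, one per class.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # row k, off-diagonal
        fp = sum(cm[i][k] for i in range(n)) - tp    # column k, off-diagonal
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        results.append((recall, precision))
    return results


def macro_average(results):
    """Average recall and precision over classes, then derive the F-measure."""
    n = len(results)
    recall = sum(r for r, _ in results) / n
    precision = sum(p for _, p in results) / n
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return recall, 1.0 - precision, f_measure


# Invented 3-class confusion matrix (rows: true class, columns: predicted class).
cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
per_class = per_class_metrics(cm)
macro_r, macro_one_minus_p, macro_f = macro_average(per_class)
print(per_class)
print(macro_r, macro_one_minus_p, macro_f)
```

For heavily imbalanced category counts, macro-averaging weights rare classes as much as common ones, so the averaging convention noticeably affects the reported numbers.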
The classification results of multiple view features using SVM
| Dataset | C | Gaussian R | Gaussian 1−P | Gaussian F-measure | Polynomial R | Polynomial 1−P | Polynomial F-measure | Linear R | Linear 1−P | Linear F-measure |
|---|---|---|---|---|---|---|---|---|---|---|
| WHOI | 1 | 84% | 15.43% | 0.843 | 88.97% | 10.95% | 0.89 | 86.45% | 13.41% | 0.865 |
| WHOI | 10 | 88.94% | 11% | 0.89 | 89.45% | 10.47% | 0.895 | 88.12% | 11.78% | 0.882 |
| WHOI | 100 | | | | 88.42% | 11.46% | 0.885 | 86.33% | 13.59% | 0.864 |
| ZooScan | 1 | 79.65% | 16.06% | 0.817 | 82.45% | 15.99% | 0.832 | 79.91% | 18.14% | 0.809 |
| ZooScan | 10 | | | | 84.14% | 15.22% | 0.845 | 85.52% | 16.01% | 0.847 |
| ZooScan | 100 | 84.87% | 13.62% | 0.856 | 83.04% | 16.02% | 0.835 | 82.27% | 18.23% | 0.82 |
| Kaggle | 1 | 77.26% | 18.96% | 0.791 | 77.48% | 19.6% | 0.789 | 71.32% | 25.09% | 0.731 |
| Kaggle | 10 | | | | 80.7% | 18.08% | 0.813 | 78.44% | 20.63% | 0.789 |
| Kaggle | 100 | 82.09% | 18.89% | 0.816 | 79.01% | 19.73% | 0.796 | 78.1% | 22.05% | 0.78 |

The entries in boldface indicate the best classification results with the highest F-measure
Fig. 13 Confusion matrices of multiple view features using SVM. (a) WHOI dataset. (b) ZooScan dataset. (c) Kaggle dataset
The classification results of multiple view features using MKL with one kind of kernel
| Dataset | C | Gaussian R | Gaussian 1−P | Gaussian F-measure | Polynomial R | Polynomial 1−P | Polynomial F-measure | Linear R | Linear 1−P | Linear F-measure |
|---|---|---|---|---|---|---|---|---|---|---|
| WHOI | 1 | 88.48% | 11.35% | 0.886 | 89.58% | 10.21% | 0.897 | 88.55% | 11.18% | 0.887 |
| WHOI | 10 | 88.75% | 11.04% | 0.889 | | | | 89.12% | 10.65% | 0.892 |
| WHOI | 100 | 88.58% | 11.2% | 0.887 | 89.39% | 10.44% | 0.895 | 88.42% | 11.41% | 0.885 |
| ZooScan | 1 | 83.32% | 11.74% | 0.857 | 83.94% | 12.61% | 0.856 | 81.78% | 15.53% | 0.831 |
| ZooScan | 10 | 86.26% | 10.01% | 0.881 | 86.74% | 11.76% | 0.875 | 83.98% | 15.28% | 0.843 |
| ZooScan | 100 | | | | 86.79% | 11.63% | 0.876 | 84.86% | 19.13% | 0.828 |
| Kaggle | 1 | 78.46% | 17.41% | 0.805 | 80.39% | 16.76% | 0.818 | 78.09% | 19.24% | 0.794 |
| Kaggle | 10 | 82.95% | 16.42% | 0.833 | | | | 81.32% | 17.66% | 0.818 |
| Kaggle | 100 | 82.97% | 16.84% | 0.831 | 82.11% | 15.82% | 0.831 | 79.68% | 19.1% | 0.803 |

The entries in boldface indicate the best classification results with the highest F-measure
The classification results of multiple view features using MKL with three kinds of kernel
| Dataset | C | Gaussian+Polynomial+Linear R | Gaussian+Polynomial+Linear 1−P | Gaussian+Polynomial+Linear F-measure |
|---|---|---|---|---|
| WHOI | 1 | 89.64% | 10.17% | 0.897 |
| WHOI | 10 | 89.88% | 10.03% | 0.899 |
| WHOI | 100 | | | |
| ZooScan | 1 | 85.42% | 11.38% | 0.87 |
| ZooScan | 10 | | | |
| ZooScan | 100 | 88.31% | 9.81% | 0.892 |
| Kaggle | 1 | 80.3% | 16.12% | 0.82 |
| Kaggle | 10 | | | |
| Kaggle | 100 | 83.46% | 14.88% | 0.843 |

The entries in boldface indicate the best classification results with the highest F-measure
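Combining Gaussian, polynomial, and linear kernels, as in the table above, is commonly formulated in multiple kernel learning as a non-negative weighted sum of base kernels, K = Σ β_m K_m with Σ β_m = 1. The sketch below illustrates only that combination step under stated assumptions: the weights are fixed by hand, whereas an MKL solver would learn them jointly with the SVM; the hyperparameters (`gamma`, `degree`, `coef0`) and all function names are hypothetical and not taken from the paper.

```python
import math


def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree, with hypothetical default hyperparameters."""
    return (linear_kernel(x, y) + coef0) ** degree


def gaussian_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2), with a hypothetical default gamma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, y, weights):
    """Convex combination of the Gaussian, polynomial, and linear kernels."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    kernels = (gaussian_kernel, polynomial_kernel, linear_kernel)
    return sum(w * k(x, y) for w, k in zip(weights, kernels))


def gram_matrix(samples, weights):
    """Pairwise combined-kernel values, as a kernel SVM would consume them."""
    return [[combined_kernel(a, b, weights) for b in samples] for a in samples]


# Toy 2-D feature vectors and hand-picked weights (an MKL solver would learn these).
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = (0.5, 0.3, 0.2)
K = gram_matrix(samples, weights)
for row in K:
    print([round(v, 4) for v in row])
```

Because each base kernel is positive semidefinite and the weights are non-negative, the combined Gram matrix is also a valid kernel matrix and can be passed to any SVM implementation that accepts precomputed kernels.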
Fig. 14 Confusion matrices of multiple view features using MKL with one kind of kernel. (a) WHOI dataset. (b) ZooScan dataset. (c) Kaggle dataset