Nima Salimi, Kar Hoe Loh, Sarinder Kaur Dhillon, Ving Ching Chong.
Abstract
Background. Fish species may be identified from their unique otolith shape or contour. Several pattern recognition methods have been proposed to classify fish species through morphological features of the otolith contours. However, no fully-automated species identification model has achieved an accuracy higher than 80%. The purpose of the current study is to develop a fully-automated model, based on the otolith contours, that identifies fish species with high classification accuracy.
Methods. Images of the right sagittal otoliths of 14 fish species from three families, namely Sciaenidae, Ariidae, and Engraulidae, were used to develop the proposed identification model. Short-time Fourier transform (STFT) was used, for the first time in the area of otolith shape analysis, to extract important features of the otolith contours. Discriminant Analysis (DA) was used as the classification technique to train and test the model on the extracted features.
Results. Performance of the model was demonstrated using species from the three families separately, as well as all species combined. Overall classification accuracy of the model was greater than 90% in all cases. In addition, the effects of STFT variables on the performance of the identification model were explored.
Conclusions. Short-time Fourier transform could determine important features of the otolith outlines. The fully-automated model proposed in this study (STFT-DA) could predict the species of an unknown specimen with acceptable identification accuracy. The model codes can be accessed at http://mybiodiversityontologies.um.edu.my/Otolith/ and https://peerj.com/preprints/1517/. The current model is flexible enough to be extended to more species and families in future studies.
Keywords: Automated taxon identification; Discriminant analysis; Otolith shape analysis; Short-time Fourier transform
Year: 2016 PMID: 26925315 PMCID: PMC4768690 DOI: 10.7717/peerj.1664
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. A schematic diagram of the proposed image identification system.
(A) shows the stages for training the model; (B) illustrates the testing part of the system.
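The training (A) and testing (B) stages can be sketched roughly as follows. This is a minimal sketch, assuming a linear discriminant rule with a pooled covariance matrix; the record only states that Discriminant Analysis was used, so the exact variant is an assumption. The function names `fit_lda` and `predict_lda` and the synthetic two-dimensional features are hypothetical stand-ins for the STFT features.

```python
import numpy as np

def fit_lda(X, y):
    """Training stage: estimate class means, pooled covariance, and priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    # Pooled within-class covariance (assumed shared by all classes).
    cov = centered.T @ centered / (len(X) - len(classes))
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, np.linalg.pinv(cov), priors

def predict_lda(model, X):
    """Testing stage: pick the class with the largest linear discriminant score."""
    classes, means, cov_inv, priors = model
    quad = 0.5 * np.sum((means @ cov_inv) * means, axis=1)
    scores = X @ cov_inv @ means.T - quad + np.log(priors)
    return classes[np.argmax(scores, axis=1)]

# Synthetic data standing in for extracted features (not the study's data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
y_train = np.repeat([0, 1, 2], 30)
y_test = np.repeat([0, 1, 2], 10)
X_train = centers[y_train] + rng.normal(size=(90, 2))
X_test = centers[y_test] + rng.normal(size=(30, 2))

model = fit_lda(X_train, y_train)
accuracy = np.mean(predict_lda(model, X_test) == y_test)
```

Keeping the fitted parameters in a plain tuple makes the split between the two stages explicit: everything estimated in training is passed unchanged to testing.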
Figure 2. Image of an otolith (A) with its corresponding 1D signal (B).
The 1D signal was obtained by calculating the radius, i.e., the distance between each boundary pixel (red) and the center of gravity (blue), as a function of angle.
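The radius-versus-angle signal described in this caption could be computed along the following lines. This is a sketch under stated assumptions: the boundary is already available as an array of pixel coordinates, and the center of gravity is approximated by the mean of the boundary pixels (the study may compute it from the filled otolith region instead). The name `radial_signature` and the elliptical test contour are hypothetical.

```python
import numpy as np

def radial_signature(boundary_xy):
    """Return (angles, radii) of boundary pixels around their center of gravity.

    boundary_xy: array of shape (n, 2) with the (x, y) boundary pixels.
    """
    pts = np.asarray(boundary_xy, dtype=float)
    centroid = pts.mean(axis=0)                      # assumed center of gravity
    dx, dy = (pts - centroid).T
    angles = np.mod(np.arctan2(dy, dx), 2 * np.pi)   # polar angle of each pixel
    radii = np.hypot(dx, dy)                         # distance to the center
    order = np.argsort(angles)                       # radius as a function of angle
    return angles[order], radii[order]

# Hypothetical contour: an ellipse with semi-axes 3 and 2.
t = np.linspace(0, 2 * np.pi, 360, endpoint=False)
ellipse = np.column_stack([3 * np.cos(t), 2 * np.sin(t)])
angles, radii = radial_signature(ellipse)
```

For the ellipse, the radii oscillate between the two semi-axes, which is the kind of periodic 1D signal shown in panel (B).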
Figure 3. The spectrogram of the characteristic signal shown in Fig. 2.
The original signal was resampled to 1,000 points before calculating the short-time Fourier transform (STFT). The color bar indicates estimates of the power spectral density (PSD). The STFT of the spatial-domain signal was calculated with a sampling frequency of 2π.
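The resampling and STFT steps named in this caption can be illustrated with plain NumPy. This is only a sketch: the 1,000-point length and the sampling frequency of 2π come from the caption, while the linear interpolation, the segment length of 100, the 50% overlap, and the Hamming window are illustrative assumptions. The helper names `resample_uniform` and `stft` are hypothetical.

```python
import numpy as np

def resample_uniform(angles, radii, n_points=1000):
    """Interpolate the periodic signal r(theta) onto n_points evenly spaced angles."""
    grid = np.linspace(0.0, 2 * np.pi, n_points, endpoint=False)
    return np.interp(grid, angles, radii, period=2 * np.pi)

def stft(signal, window, hop, fs=2 * np.pi):
    """Short-time Fourier transform: one column of FFT coefficients per segment."""
    n = len(window)
    starts = range(0, len(signal) - n + 1, hop)
    frames = np.stack([signal[s:s + n] * window for s in starts], axis=1)
    spectrum = np.fft.rfft(frames, axis=0)      # shape: (n // 2 + 1, n_segments)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spectrum

# Hypothetical signal: a mean radius plus a five-lobed perturbation.
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
r = 10 + np.cos(5 * theta)
signal = resample_uniform(theta, r)

# Segment length and overlap are illustrative, not the study's settings.
freqs, spectrum = stft(signal, np.hamming(100), hop=50)
power = np.abs(spectrum) ** 2                   # spectrogram-style power estimate
```

Passing `period=2 * np.pi` to `np.interp` treats the contour as closed, so points near the end of the angular range are interpolated across the wrap-around.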
Fish species used in the proposed fully-automated identification system.
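| Family | Species |
|---|---|
| Sciaenidae | Dendrophysa russelli |
| ” | Johnius belangerii |
| ” | Johnius carouna |
| ” | Otolithes ruber |
| ” | Panna microdon |
| Ariidae | Nemapteryx caelatus |
| ” | Arius maculatus |
| ” | Cryptarius truncatus |
| ” | Hexanematichthys sagor |
| ” | Osteogeneiosus militaris |
| ” | Plicofollis argyropleuron |
| Engraulidae | Coilia dussumieri |
| ” | Setipinna taty |
| ” | Thryssa hamiltonii |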
Confusion matrix for the classification results of the Engraulidae family.
The predicted species (columns) are compared with the species confirmed by an expert (rows).
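| Target species | C. dussumieri | S. taty | T. hamiltonii |
|---|---|---|---|
| Coilia dussumieri | | 0 (0%) | 0 (0%) |
| Setipinna taty | 0 (0%) | | 0 (0%) |
| Thryssa hamiltonii | 0 (0%) | 1 (10%) | |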
Confusion matrix obtained from five species of the Sciaenidae family.
The columns indicate the species predicted by the identification model, while the rows indicate the target species.
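| Target species | D. russelli | J. belangerii | J. carouna | O. ruber | P. microdon |
|---|---|---|---|---|---|
| Dendrophysa russelli | | 0 (0%) | 0 (0%) | 1 (10%) | 0 (0%) |
| Johnius belangerii | 0 (0%) | | 0 (0%) | 0 (0%) | 0 (0%) |
| Johnius carouna | 0 (0%) | 0 (0%) | | 0 (0%) | 0 (0%) |
| Otolithes ruber | 1 (10%) | 0 (0%) | 0 (0%) | | 0 (0%) |
| Panna microdon | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | |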
Classification results (confusion matrix) of the Ariidae family.
Outputs of the identification model (columns) are compared with the target species (rows).
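| Target species | N. caelatus | A. maculatus | C. truncatus | H. sagor | O. militaris | P. argyropleuron |
|---|---|---|---|---|---|---|
| Nemapteryx caelatus | | 0 (0%) | 2 (20%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Arius maculatus | 0 (0%) | | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Cryptarius truncatus | 1 (10%) | 0 (0%) | | 0 (0%) | 0 (0%) | 0 (0%) |
| Hexanematichthys sagor | 0 (0%) | 0 (0%) | 0 (0%) | | 0 (0%) | 0 (0%) |
| Osteogeneiosus militaris | 1 (10%) | 0 (0%) | 0 (0%) | 0 (0%) | | 0 (0%) |
| Plicofollis argyropleuron | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | |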
Confusion matrix for the identification results obtained from 14 species of three different families.
For each target species (row), the number of specimens assigned to each predicted species (column) is shown. Species are Dendrophysa russelli (1), Johnius belangerii (2), Johnius carouna (3), Otolithes ruber (4), Panna microdon (5), Nemapteryx caelatus (6), Arius maculatus (7), Cryptarius truncatus (8), Hexanematichthys sagor (9), Osteogeneiosus militaris (10), Plicofollis argyropleuron (11), Coilia dussumieri (12), Setipinna taty (13), Thryssa hamiltonii (14).
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 8 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | |
| 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | |
| 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
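A confusion matrix of this kind, and the overall accuracy derived from it, can be computed as sketched below. The labels are hypothetical (three species, ten specimens each, one misclassification) and are not taken from the study; the helper names `confusion_matrix` and `overall_accuracy` are likewise illustrative.

```python
import numpy as np

def confusion_matrix(target, predicted, n_classes):
    """Count specimens per (target species, predicted species) pair."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (target, predicted), 1)   # rows: target, columns: predicted
    return matrix

def overall_accuracy(matrix):
    """Fraction of specimens on the diagonal, i.e. correctly identified."""
    return np.trace(matrix) / matrix.sum()

# Hypothetical labels: one specimen of species 2 is assigned to species 1.
target = np.repeat([0, 1, 2], 10)
predicted = target.copy()
predicted[-1] = 1

cm = confusion_matrix(target, predicted, 3)
acc = overall_accuracy(cm)
```

Off-diagonal entries locate the specific species pairs that the model confuses, which a single accuracy figure cannot show.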
Overall accuracy of the model using absolute, phase angle, and combined features (rows) for all four data sets (columns) used in this study.
| Extracted features | Engraulidae family | Sciaenidae family | Ariidae family | All families |
|---|---|---|---|---|
| 16 MAXABS | 87% | 84% | 87% | 84% |
| 16 MAXANG | 93% | 94% | 78% | 86% |
| 16 MAXABS + 16 MAXANG | | | | |
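One possible reading of the feature labels is sketched below. It assumes that MAXABS and MAXANG denote, for each of 16 STFT segments, the largest magnitude and the largest phase angle of the complex coefficients; this record does not define the features, so that interpretation, the name `max_abs_and_angle`, and the random placeholder spectrum are all assumptions.

```python
import numpy as np

def max_abs_and_angle(spectrum):
    """Per segment (column), return the largest magnitude and largest phase angle."""
    max_abs = np.abs(spectrum).max(axis=0)
    max_ang = np.angle(spectrum).max(axis=0)
    return max_abs, max_ang

# Placeholder complex STFT with 33 frequency bins and 16 segments (not real data).
rng = np.random.default_rng(1)
spectrum = rng.normal(size=(33, 16)) + 1j * rng.normal(size=(33, 16))

max_abs, max_ang = max_abs_and_angle(spectrum)
features = np.concatenate([max_abs, max_ang])   # combined 16 + 16 feature vector
```

Under this reading, the combined row of the table corresponds to concatenating the two 16-element vectors into one 32-element feature vector per specimen.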
Classification results of the model for 16 different window functions.
For each window function (rows), the overall accuracy of the model was calculated for all four datasets (columns).
| Window function | Engraulidae family | Sciaenidae family | Ariidae family | All families |
|---|---|---|---|---|
| Bartlett-Hann | 87% | 82% | 85% | 83% |
| Bartlett | 90% | 82% | 90% | 85% |
| Blackman | 60% | 80% | 88% | 77% |
| Blackman-Harris | 50% | 80% | 85% | 62% |
| Bohman | 53% | 82% | 87% | 63% |
| Chebyshev | 47% | 78% | 78% | 69% |
| Flat top | 40% | 68% | 80% | 64% |
| Gaussian | 92% | |||
| Hamming | 93% | 92% | 92% | |
| Hann | 70% | 88% | 88% | 84% |
| Kaiser | 90% | 82% | 93% | |
| Nuttall’s | 57% | 84% | 85% | 68% |
| Parzen | 50% | 68% | 90% | 66% |
| Rectangular | 87% | 83% | ||
| Tapered cosine | 93% | 88% | 87% | 89% |
| Triangular | 57% | 84% | 90% | 84% |
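To give a sense of how these window functions differ, the sketch below builds a few of them with NumPy and compares their coherent gain (the mean window value), a simple measure of how strongly each window tapers a segment. Only windows that NumPy provides directly are included; the segment length and the Kaiser shape parameter are arbitrary illustrative values, not settings from the study.

```python
import numpy as np

n = 100   # illustrative segment length

windows = {
    "Rectangular": np.ones(n),
    "Bartlett": np.bartlett(n),
    "Blackman": np.blackman(n),
    "Hamming": np.hamming(n),
    "Hann": np.hanning(n),
    "Kaiser": np.kaiser(n, 5.0),   # beta = 5.0 is an arbitrary illustrative value
}

# Coherent gain: 1 for no tapering, smaller for windows that taper more strongly.
coherent_gain = {name: w.mean() for name, w in windows.items()}
```

Each array can be multiplied element-wise with a signal segment before the Fourier transform, which is where the choice of window enters the STFT.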