Ashwini K, P M Durai Raj Vincent, Kathiravan Srinivasan, Chuan-Yu Chang.
Abstract
Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. For audio signals, preprocessing, feature extraction, and feature selection still require expert attention and considerable effort. Deep learning techniques extract and select the most important features automatically, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction combined with an SVM classifier provides promising results. Among the SVM kernels compared, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy for the kernel-based infant cry classification system, at 88.89%.
Keywords: convolutional neural network; infant cry classification; short-time Fourier transform; spectrogram; support vector machine
Year: 2021 PMID: 34178926 PMCID: PMC8222524 DOI: 10.3389/fpubh.2021.670352
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
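The spectrogram step described in the abstract can be illustrated with a minimal NumPy sketch of a short-time Fourier transform. The frame length, hop length, Hann window, 8 kHz sample rate, and the synthetic 450 Hz tone are all assumptions chosen for illustration; the abstract does not state the settings the authors used, and a real cry recording would replace the synthetic signal.

```python
import numpy as np

def stft_spectrogram(audio, sample_rate, frame_length=512, hop_length=256):
    """Return (frequencies, frame times, log-magnitude spectrogram in dB).

    frame_length and hop_length are illustrative defaults, not values
    taken from the paper.
    """
    window = np.hanning(frame_length)
    n_frames = 1 + (len(audio) - frame_length) // hop_length
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        audio[i * hop_length: i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; transpose so rows are frequency bins, columns are time.
    spectrum = np.fft.rfft(frames, axis=1).T
    log_spec = 20.0 * np.log10(np.abs(spectrum) + 1e-10)
    freqs = np.fft.rfftfreq(frame_length, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop_length + frame_length / 2) / sample_rate
    return freqs, times, log_spec

# Synthetic stand-in for a cry recording: one second of a 450 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 450.0 * t)

freqs, times, log_spec = stft_spectrogram(audio, sample_rate)
peak_freq = freqs[np.argmax(log_spec.mean(axis=1))]
```

The `log_spec` array is what would be rendered or resized into an image before being passed to a convolutional network.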
Figure 1. Workflow diagram for baby cry classification.
Figure 2. Audio signal transformed into a spectrogram.
Figure 3. Pictorial representation of a convolutional neural network.
Figure 4. Pictorial representation of a linear SVM.
Figure 5. Pictorial representation of a non-linear SVM.
Figure 6. Convolutional neural network for feature extraction.
Figure 7. Proposed infant cry classification system.
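The kernel comparison described in the abstract (RBF, linear, and polynomial SVMs trained on CNN-derived features) might look roughly like the following scikit-learn sketch. The feature matrix here is random synthetic data standing in for CNN features, and the 70/30 split, feature scaling, and default SVM hyperparameters are assumptions; none of these come from the source, so the scores it produces say nothing about the paper's results.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def compare_svm_kernels(features, labels, kernels=("rbf", "linear", "poly"), seed=0):
    """Train one SVM per kernel and return test accuracy for each."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, stratify=labels, random_state=seed
    )
    scores = {}
    for kernel in kernels:
        # Scaling before the SVM is a common default, assumed here.
        model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
        model.fit(x_train, y_train)
        scores[kernel] = model.score(x_test, y_test)
    return scores

# Synthetic stand-in for CNN feature vectors: 3 classes, 50 samples each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)
features = rng.normal(size=(150, 64)) + 0.5 * labels[:, None]

scores = compare_svm_kernels(features, labels)
for kernel, accuracy in scores.items():
    print(f"{kernel}: {accuracy:.4f}")
```

In practice the `features` array would hold the activations taken from the convolutional network for each spectrogram image, with one row per cry sample.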
Convolution layer description of the network.
| No. | Layer | Number of filters | Filter size | Depth |
|---|---|---|---|---|
| 1 | Conv1 | 96 | 11 × 11 | 3 |
| 2 | Conv2 | 256 | 5 × 5 | 48 |
| 3 | Conv3 | 384 | 3 × 3 | 256 |
| 4 | Conv4 | 384 | 3 × 3 | 192 |
| 5 | Conv5 | 256 | 3 × 3 | 192 |
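As a rough reading aid for the layer table above, the number of filter weights per layer can be computed as filters × height × width × depth. This sketch assumes the three numeric columns are the number of filters, the square filter size, and the depth of each filter; that column interpretation is an assumption, since the extracted table carries no header. Bias terms are ignored.

```python
# (layer name, number of filters, filter side length, filter depth)
# Column meanings are assumed; the values are copied from the table.
conv_layers = [
    ("Conv1", 96, 11, 3),
    ("Conv2", 256, 5, 48),
    ("Conv3", 384, 3, 256),
    ("Conv4", 384, 3, 192),
    ("Conv5", 256, 3, 192),
]

def weight_count(num_filters, filter_size, depth):
    """Weights in one convolution layer, excluding biases."""
    return num_filters * filter_size * filter_size * depth

weights = {name: weight_count(f, k, d) for name, f, k, d in conv_layers}
total_weights = sum(weights.values())

for name, count in weights.items():
    print(f"{name}: {count:,} weights")
print(f"Total: {total_weights:,} weights")
```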
3 × 3 confusion matrix.
| | A | B | C |
|---|---|---|---|
| A | TP_A | E_AB | E_AC |
| B | E_BA | TP_B | E_BC |
| C | E_CA | E_CB | TP_C |
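Per-class measures can be derived from a three-class confusion matrix like the one above by treating each class one-vs-rest. The sketch below uses the standard textbook definitions and assumes rows are actual classes and columns are predicted classes. Both the orientation and the formulas are assumptions, because the extracted text does not give the authors' definitions. The example matrix is hypothetical and is not data from the paper.

```python
import numpy as np

def per_class_metrics(confusion):
    """One-vs-rest metrics per class; rows = actual, columns = predicted (assumed)."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    results = []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp   # actual k, predicted as another class
        fp = cm[:, k].sum() - tp   # another class, predicted as k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        sensitivity = tp / (tp + fn)
        results.append({
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / total,
            "f1": 2 * precision * sensitivity / (precision + sensitivity),
        })
    return results

# Hypothetical 3-class confusion matrix, for illustration only.
example = [[12, 2, 1],
           [1, 14, 0],
           [2, 0, 13]]
metrics = per_class_metrics(example)
overall_accuracy = np.trace(np.asarray(example)) / np.sum(example)
```

Note that the per-class accuracy computed this way counts true negatives, so it is generally higher than the overall accuracy, which is the diagonal sum divided by the total number of samples.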
Performance evaluation of SVM-RBF.
| Measure | Class 1 | Class 2 | Class 3 | Average |
|---|---|---|---|---|
| Specificity | 0.8571 | 0.8235 | 1.0000 | 0.8935 |
| Sensitivity | 0.9032 | 0.9643 | 0.9677 | 0.9450 |
| Precision | 0.8000 | 0.9333 | 0.9333 | 0.8888 |
| Accuracy | 0.8889 | 0.9111 | 0.9778 | 0.9259 |
| F1 score | 0.8276 | 0.8750 | 0.9655 | 0.8893 |
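The last column of the SVM-RBF table above appears to be the arithmetic mean of the three values before it. That reading is an inference from the numbers, not something the extracted text states. A short check, using the values copied from the table:

```python
# Each entry: (three per-class values, reported last-column value),
# copied from the SVM-RBF table.
svm_rbf = {
    "Specificity": ([0.8571, 0.8235, 1.0000], 0.8935),
    "Sensitivity": ([0.9032, 0.9643, 0.9677], 0.9450),
    "Precision":   ([0.8000, 0.9333, 0.9333], 0.8888),
    "Accuracy":    ([0.8889, 0.9111, 0.9778], 0.9259),
    "F1 score":    ([0.8276, 0.8750, 0.9655], 0.8893),
}

def mean(values):
    return sum(values) / len(values)

# Gap between the recomputed mean and the reported value, per measure.
gaps = {name: abs(mean(values) - reported)
        for name, (values, reported) in svm_rbf.items()}
max_gap = max(gaps.values())

for name, gap in gaps.items():
    print(f"{name}: difference from reported value = {gap:.5f}")
```

The small remaining differences are consistent with the table entries having been rounded to four decimal places.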
Performance evaluation of SVM-linear.
| Measure | Class 1 | Class 2 | Class 3 | Average |
|---|---|---|---|---|
| Specificity | 0.8125 | 0.8571 | 0.8667 | 0.8454 |
| Sensitivity | 0.9310 | 0.9032 | 0.9333 | 0.9225 |
| Precision | 0.8667 | 0.8000 | 0.8667 | 0.8444 |
| Accuracy | 0.8889 | 0.8889 | 0.9111 | 0.8963 |
| F1 score | 0.8387 | 0.8276 | 0.8667 | 0.8443 |
Figure 8. Performance measures of SVM-RBF.
Figure 10. Performance measures of SVM-linear.
Figure 11. ROC analysis of SVM-polynomial.
Figure 13. ROC analysis of SVM-RBF.
Figure 14. Comparison of various kernels in SVM.
Performance evaluation of SVM-polynomial.
| Measure | Class 1 | Class 2 | Class 3 | Average |
|---|---|---|---|---|
| Specificity | 0.8125 | 0.8750 | 0.9231 | 0.8702 |
| Sensitivity | 0.9310 | 0.9655 | 0.9063 | 0.9342 |
| Precision | 0.8667 | 0.9333 | 0.8000 | 0.8666 |
| Accuracy | 0.8889 | 0.9333 | 0.9111 | 0.9111 |
| F1 score | 0.8387 | 0.9032 | 0.8571 | 0.8663 |