Noor Almaadeed, Muhammad Asim, Somaya Al-Maadeed, Ahmed Bouridane, Azeddine Beghdadi.
Abstract
This work investigates the problem of detecting hazardous events on roads by designing an audio surveillance system that automatically detects perilous situations such as car crashes and tire skidding. In recent years, several visual surveillance systems have been proposed for road monitoring to detect accidents, with the aim of improving safety procedures in emergency cases. However, visual information alone cannot detect certain events such as car crashes and tire skidding, especially under adverse and visually cluttered weather conditions such as snowfall, rain, and fog. Consequently, the incorporation of microphones and audio event detectors based on audio processing can significantly enhance the detection accuracy of such surveillance systems. This paper proposes to combine time-domain, frequency-domain, and joint time-frequency features extracted from a class of quadratic time-frequency distributions (QTFDs) to detect events on roads through audio analysis and processing. Experiments were carried out using a publicly available dataset. The experimental results confirm the effectiveness of the proposed approach for detecting hazardous events on roads, as demonstrated by a 7% improvement in accuracy rate compared with methods that use individual temporal and spectral features.
Keywords: car crashes; event detection; hazardous events; tire skidding; visual surveillance
Year: 2018 PMID: 29882825 PMCID: PMC6022152 DOI: 10.3390/s18061858
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Proposed methodology for detecting acoustic anomalies. The input signal is subdivided into small frames, and features are extracted in the time, frequency, and joint time-frequency domains. The highest-ranked features among the computed features are selected.
Temporal and Spectral features.
| Temporal Features | Spectral Features |
|---|---|
| [equations not recovered] | [equations not recovered; last entry: maximum power of the frequency bands] |
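As a rough illustration of the kind of frame-level temporal and spectral features listed here, the sketch below splits a signal into frames and computes two temporal features (zero crossing rate and short-time energy) and one spectral feature (spectral centroid). The definitions are common textbook ones and may differ from the paper's exact formulas; the sampling rate, frame length, hop size, test tone, and all function names are assumptions made for the example.

```python
import cmath
import math


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames of frame_len samples, hop samples apart."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def spectral_centroid(frame, fs):
    """Magnitude-weighted mean frequency (Hz) over the non-negative DFT bins."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        # Naive DFT bin; adequate for a short illustrative frame.
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * fs / n * m for k, m in enumerate(mags)) / total


fs = 8000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]  # 1 kHz test tone
frames = frame_signal(tone, frame_len=64, hop=32)
features = [
    (zero_crossing_rate(f), short_time_energy(f), spectral_centroid(f, fs))
    for f in frames
]
```

For the synthetic 1 kHz tone, each frame holds a whole number of periods, so the short-time energy is close to 0.5 and the spectral centroid is close to 1000 Hz. Real road audio would of course give frame-to-frame variation in all three values.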
Figure 2. Time-frequency (TF) approach for pattern classification.
Figure 3. Time, frequency, and TF representations of a background noise (BN) segment (1st row), a car crash (CC) sound (2nd row), and a tire skidding (TS) sound (3rd row). The TF representations were generated using the extended modified-B distribution (EMBD) with σ = 0.9 and β = 0.01 and a lag window length of 355.
Details on the composition of the dataset.
| Class | # Events | Duration (s) |
|---|---|---|
| CC | 200 | 326.3 |
| TS | 200 | 522.5 |
| BN | - | 2737 |
Ranking of the t-, f- and (t, f)-domain features based on mutual information criteria.
| Time and Frequency Features | Time-Frequency Features | Selected Features |
|---|---|---|
| Mean | Mean | Variance |
| Variance | Variance | Coefficient of Variation |
| Skewness | Coefficient of Variation | Skewness |
| Coefficient of Variation | Skewness | Flatness |
| Kurtosis | Kurtosis | Spectral Entropy |
| Energy Entropy | Flatness | Spectral Flux |
| Zero Crossing Rate | Renyi Entropy | Instantaneous Amplitude |
| Short-Time Energy | Spectral Flux | SVD-based Feature |
| Flatness | Instantaneous Frequency | TFD Concentration Measure |
| Spectral Entropy | Instantaneous Amplitude | Aspect Ratio |
| Spectral Roll-off | SVD-based Features | Flatness |
| Spectral Centroid | TFD Concentration Measure | Spectral Entropy |
| Spectral Flux | Area of Convex Hull | Spectral Flux |
| Maximum power of the frequency bands | Aspect Ratio | Spectral Centroid |
| | | Spectral Roll-off |
| | | Skewness |
| | | Kurtosis |
| | | Energy Entropy |
| | | Short-Time Energy |
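The ranking is described as being based on mutual information criteria. A minimal sketch of that idea follows: each feature is binned, its mutual information with the class label is estimated, and features are sorted by that score. The equal-width binning, the number of bins, the function names, and the toy data are all assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(values, labels, n_bins=4):
    """Estimate I(feature; class) in bits after equal-width binning of the feature."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(bins, labels))
    count_bin = Counter(bins)
    count_label = Counter(labels)
    mi = 0.0
    for (b, c), count in joint.items():
        mi += (count / n) * math.log2(count * n / (count_bin[b] * count_label[c]))
    return mi


def rank_features(feature_table, labels):
    """Return feature names sorted from most to least informative, plus the scores."""
    scores = {name: mutual_information(vals, labels) for name, vals in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy data (made up): 'informative' separates the classes, 'noisy' does not.
labels = ["CC", "CC", "CC", "CC", "TS", "TS", "TS", "TS"]
feature_table = {
    "informative": [0.9, 1.0, 0.8, 0.95, 0.1, 0.2, 0.15, 0.05],
    "noisy": [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9],
}
ranking, scores = rank_features(feature_table, labels)
```

On the toy data, the feature that separates the two classes scores 1 bit and the uninformative one scores 0 bits, so it is ranked first. Keeping only the top-scoring features is one plausible way to arrive at a selected-feature list like the third column.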
Classification matrices achieved using (a) the proposed approach, (b) the MFCC features proposed in [47], (c) the method defined in [13] with temporal and spectral features and (d) the method defined in [13] with MFCC Features using Bag of Word (BoW) approach.
| | | (a) Proposed Approach | | | (b) MFCC Features | | |
|---|---|---|---|---|---|---|---|
| | | Predicted Class | | | Predicted Class | | |
| | | TS | CC | BN | TS | CC | BN |
| True Class | TS | 94% | 3% | 3% | 85.5% | 5% | 9% |
| | CC | 1.5% | 96% | 2.5% | 0% | 92.5% | 7.5% |
| | BN | 8% | 2% | 90% | 5.5% | 8.5% | 86% |
| | | (c) Temporal and Spectral Features | | | (d) MFCC Features with BoW | | |
| True Class | TS | 75.0% | 0.5% | 24.5% | 71% | 0.5% | 28.5% |
| | CC | 0% | 89% | 11% | 1% | 89.5% | 9.5% |
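To make the link between a classification matrix and summary metrics concrete, the sketch below computes a recognition rate (RR) and a missed detection rate (MDR) from matrix (a). The two definitions used here are assumptions: RR is taken as the mean per-class accuracy over the event classes (TS and CC), and MDR as the mean share of events labeled as background noise. The function names are hypothetical.

```python
# Confusion matrix (a) from the table above, in percent; rows are true classes.
confusion = {
    "TS": {"TS": 94.0, "CC": 3.0, "BN": 3.0},
    "CC": {"TS": 1.5, "CC": 96.0, "BN": 2.5},
    "BN": {"TS": 8.0, "CC": 2.0, "BN": 90.0},
}
events = ["TS", "CC"]  # hazardous-event classes; BN is background noise


def recognition_rate(cm, event_classes):
    """Assumed definition: mean per-class accuracy over the event classes."""
    return sum(cm[c][c] for c in event_classes) / len(event_classes)


def missed_detection_rate(cm, event_classes, background="BN"):
    """Assumed definition: mean share of events labeled as background."""
    return sum(cm[c][background] for c in event_classes) / len(event_classes)


rr = recognition_rate(confusion, events)
mdr = missed_detection_rate(confusion, events)
print(f"RR = {rr:.2f}%, MDR = {mdr:.2f}%")
```

Under these assumed definitions, matrix (a) gives RR = 95% and MDR = 2.75%, which agrees with the values reported for the proposed approach in the performance comparison table. That agreement does not confirm that these are the paper's definitions.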
Comparison of performance results between the proposed approach and other approaches.
| Methods | RR (%) | MDR (%) | FPR (%) | AUC (%) |
|---|---|---|---|---|
| Bark Features | 78.20 | 21 | 10.96 | 86 |
| MFCC Features_BoW | 82.65 | 19 | 5.48 | 90 |
| Temporal and Spectral Features | 84.5 | 17.75 | 2.85 | 80 |
| MFCC Features | 88 | 8.25 | 7 | 96.72 |
| Proposed Approach | 95 | 2.75 | 5 | 98.32 |
Figure 4. Comparison of performance results.
Comparison of recognition rate achieved between the three typologies of features.
| Features | RR (%) |
|---|---|
| Temporal | 61 |
| Spectral | 81.5 |
| Time-Frequency | 84 |
| Joint set of Features | 93 |
| Proposed set of features | 95 |
Figure 5. Illustration of higher entropy and flux measures for event (a) CC than event (b) TS.
Figure 6. Receiver operating characteristic (ROC) curves of the proposed system configured with Mel-frequency cepstral coefficient (MFCC) features.