
The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music.

Jean-Julien Aucouturier, Boris Defreville, François Pachet.

Abstract

The "bag-of-frames" approach (BOF) to audio pattern recognition represents signals as the long-term statistical distribution of their local spectral features. This approach has proved nearly optimal for simulating the auditory perception of natural and human environments (or soundscapes), and is also the predominant paradigm for extracting high-level descriptions from music signals. However, recent studies show that, in contrast to its application to soundscape signals, BOF provides only limited performance when applied to polyphonic music signals. This paper explicitly examines the difference between urban soundscapes and polyphonic music with respect to their modeling with the BOF approach. First, the application of the same measure of acoustic similarity to both soundscape and music data sets confirms that the BOF approach can model soundscapes to near-perfect precision, and exhibits none of the limitations observed in the music data set. Second, the modification of this measure by two custom homogeneity transforms reveals critical differences in the temporal and statistical structure of the typical frame distribution of each type of signal. Such differences may explain the uneven performance of BOF algorithms on soundscapes and music signals, and suggest that their human perception relies on cognitive processes of a different nature.
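
The general idea of a bag-of-frames similarity measure can be illustrated with a minimal sketch. The abstract does not state which spectral features, statistical model, or distance the authors used, so everything specific below is an assumption made for illustration: log band energies as the local features, a single Gaussian as the long-term distribution, and a symmetrised Kullback-Leibler divergence as the dissimilarity. The function names and parameters (`frame_features`, `bag_of_frames`, `symmetric_kl`, `frame_len`, `hop`, `n_bands`) are hypothetical.

```python
import numpy as np

def frame_features(signal, frame_len=512, hop=256, n_bands=8):
    """Cut the signal into overlapping frames; return log band energies per frame.

    Assumption: log band energies stand in for the 'local spectral features'.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        bands = np.array_split(power, n_bands)
        feats[i] = np.log([b.sum() + 1e-10 for b in bands])
    return feats

def bag_of_frames(feats):
    """Summarise all frames by their mean and covariance.

    Frame order is discarded, which is what makes this a 'bag' of frames.
    Assumption: a single Gaussian models the long-term distribution.
    """
    mean = feats.mean(axis=0)
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    return mean, cov

def symmetric_kl(model_a, model_b):
    """Symmetrised KL divergence KL(a||b) + KL(b||a) between two Gaussians."""
    (mean_a, cov_a), (mean_b, cov_b) = model_a, model_b
    inv_a, inv_b = np.linalg.inv(cov_a), np.linalg.inv(cov_b)
    diff = mean_a - mean_b
    k = len(mean_a)
    traces = np.trace(inv_b @ cov_a) + np.trace(inv_a @ cov_b)
    return 0.5 * (traces + diff @ (inv_a + inv_b) @ diff) - k
```

Because the model keeps only the mean and covariance of the frames, shuffling the frames in time leaves the distance unchanged. Temporal structure is therefore invisible to this kind of measure, which is the property the abstract's homogeneity analysis probes.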


Year:  2007        PMID: 17672638     DOI: 10.1121/1.2750160

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


  6 in total

1.  Effects of Soundscape on Flow State during Diabolo Exercise.

Authors:  Tong-Yu Li; Si-Yuan Guo; Bin-Xia Xue; Qi Meng; Bo Jiang; Xin-Xin Xu; Chein-Chi Chang
Journal:  Int J Environ Res Public Health       Date:  2022-06-30       Impact factor: 4.614

2.  Zipf's law in short-time timbral codings of speech, music, and environmental sound signals.

Authors:  Martín Haro; Joan Serrà; Perfecto Herrera; Alvaro Corral
Journal:  PLoS One       Date:  2012-03-29       Impact factor: 3.240

3.  An Efficient Audio Coding Scheme for Quantitative and Qualitative Large Scale Acoustic Monitoring Using the Sensor Grid Approach.

Authors:  Félix Gontier; Mathieu Lagrange; Pierre Aumond; Arnaud Can; Catherine Lavandier
Journal:  Sensors (Basel)       Date:  2017-11-29       Impact factor: 3.576

4.  Matching Pursuit Analysis of Auditory Receptive Fields' Spectro-Temporal Properties.

Authors:  Jörg-Hendrik Bach; Birger Kollmeier; Jörn Anemüller
Journal:  Front Syst Neurosci       Date:  2017-02-09

5.  Transfer Learning for Improved Audio-Based Human Activity Recognition.

Authors:  Stavros Ntalampiras; Ilyas Potamitis
Journal:  Biosensors (Basel)       Date:  2018-06-25

6.  An Investigation of Soundscape Factors Influencing Perceptions of Square Dancing in Urban Streets: A Case Study in a County Level City in China.

Authors:  Jieling Xiao; Andrew Hilton
Journal:  Int J Environ Res Public Health       Date:  2019-03-07       Impact factor: 3.390

