Yoonchang Han1, Subin Lee1, Juhan Nam2, Kyogu Lee1.
Abstract
Feature learning for music applications has recently received considerable attention from many researchers. This paper reports on a sparse feature learning algorithm for musical instrument identification and, in particular, focuses on the effects of the frame sampling techniques used for dictionary learning and the pooling methods used for feature aggregation. To this end, two frame sampling techniques are examined: fixed and proportional random sampling. Furthermore, the effect of using onset frames is analyzed for both of the proposed sampling methods. To summarize the feature activations, a standard deviation pooling method is used and compared with the commonly used max- and average-pooling techniques. Using more than 47 000 recordings of 24 instruments from various performers, playing styles, and dynamics, a number of tuning parameters are examined, including the analysis frame size, the dictionary size, and the type of frequency scaling, as well as the different sampling and pooling methods. The results show that the combination of proportional sampling and standard deviation pooling achieves the best overall performance of 95.62%, while the optimal parameter set varies among the instrument classes.
Year: 2016 PMID: 27250124 DOI: 10.1121/1.4946988
Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 1.840
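
The three pooling methods compared in the abstract (max, average, and standard deviation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pool_activations`, the frames-by-atoms layout of the activation matrix, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def pool_activations(activations, method="std"):
    """Summarize per-frame feature activations into one vector per recording.

    `activations` is a list of frames, each a list with one activation value
    per dictionary atom (hypothetical layout, assumed for this sketch).
    """
    if not activations:
        raise ValueError("activations must contain at least one frame")

    # Transpose so that each tuple holds one atom's activations over time.
    per_atom = list(zip(*activations))

    if method == "max":
        return [max(values) for values in per_atom]
    if method == "average":
        return [fmean(values) for values in per_atom]
    if method == "std":
        # Population standard deviation; the abstract does not say which
        # normalization is used, so this choice is an assumption.
        return [pstdev(values) for values in per_atom]
    raise ValueError(f"unknown pooling method: {method}")


# Toy activation matrix: 3 frames, 3 dictionary atoms.
acts = [
    [0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0],
    [4.0, 1.0, 4.0],
]

for name in ("max", "average", "std"):
    print(name, pool_activations(acts, name))
```

In this toy matrix the second atom is constant over time, so its standard deviation pool is zero even though its max and average pools are not. The example shows only that the three summaries capture different aspects of the same activations; it says nothing about which performs best.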