Wai Lok Woo, Bin Gao, Ahmed Bouridane, Bingo Wing-Kuen Ling, Cheng Siong Chin.
Abstract
This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a family of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, Kullback–Leibler divergence and least-squares distance are special cases that correspond to β = 0, 1 and 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm that leads to a fast multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into a two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes by maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient with the proposed algorithm, which subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy.
Keywords: adaptive signal processing; blind source separation; machine learning; maximization–minimization algorithm; β-divergence; matrix deconvolution; sensors signal processing
Year: 2018 PMID: 29702629 PMCID: PMC5982401 DOI: 10.3390/s18051371
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
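The abstract's statement that the Itakura–Saito, Kullback–Leibler and least-squares costs are the β = 0, 1 and 2 members of one family can be checked numerically. The sketch below uses the β-divergence definition commonly found in the NMF literature; the function name `beta_divergence` and the sample vectors are illustrative assumptions, and this record does not confirm that the paper uses exactly this normalization.

```python
import numpy as np

def beta_divergence(x, y, beta):
    """Sum of element-wise beta-divergences d_beta(x | y) for positive arrays.

    Assumed (standard) definition; not taken from the paper itself.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 0:  # Itakura-Saito divergence
        ratio = x / y
        return float(np.sum(ratio - np.log(ratio) - 1.0))
    if beta == 1:  # generalized Kullback-Leibler divergence
        return float(np.sum(x * np.log(x / y) - x + y))
    # General case, valid for fractional beta; beta = 2 gives half the squared error
    num = x**beta + (beta - 1.0) * y**beta - beta * x * y**(beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))

# Illustrative data (hypothetical values)
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 1.0, 2.5])

d_ls = beta_divergence(x, y, 2)      # least-squares case
d_kl = beta_divergence(x, y, 1)      # Kullback-Leibler case
d_is = beta_divergence(x, y, 0)      # Itakura-Saito case
d_frac = beta_divergence(x, y, 0.5)  # a fractional beta between IS and KL
```

Under this definition the general formula approaches the Kullback–Leibler and Itakura–Saito expressions as β tends to 1 and 0, which is why a single fractional β can interpolate between the three classical costs.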
Differentiable Convex-Concave-Constant Decomposition of β-Divergence.
[Table contents not recoverable from the source; only the column header "Range" survives.]
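To illustrate how an MM argument built on a convex-concave split of the β-divergence leads to multiplicative updates, here is a minimal sketch of plain β-NMF (V ≈ WH). It is the generic update with a β-dependent exponent known from the wider NMF literature, not the paper's two-dimensional deconvolution model or its sparsity and β-estimation steps. The names `beta_nmf`, `mm_exponent` and `beta_div`, and the synthetic data, are assumptions made for this example.

```python
import numpy as np

def beta_div(V, V_hat, beta):
    """Beta-divergence summed over all entries (general case, beta not in {0, 1})."""
    num = V**beta + (beta - 1) * V_hat**beta - beta * V * V_hat**(beta - 1)
    return float(np.sum(num / (beta * (beta - 1))))

def mm_exponent(beta):
    """Exponent applied to the multiplicative ratio, per range of beta (assumed standard rule)."""
    if beta < 1:
        return 1.0 / (2.0 - beta)
    if beta <= 2:
        return 1.0
    return 1.0 / (beta - 1.0)

def beta_nmf(V, rank, beta, n_iter=200, seed=0, eps=1e-12):
    """Generic beta-NMF by multiplicative updates; W and H stay nonnegative."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    gamma = mm_exponent(beta)
    for _ in range(n_iter):
        WH = W @ H + eps
        # Update the activations (temporal codes in the audio setting)
        H *= ((W.T @ (V * WH**(beta - 2))) / (W.T @ WH**(beta - 1) + eps)) ** gamma
        WH = W @ H + eps
        # Update the basis (spectral dictionary in the audio setting)
        W *= (((V * WH**(beta - 2)) @ H.T) / (WH**(beta - 1) @ H.T + eps)) ** gamma
    return W, H

# Synthetic positive matrix with rank-3 structure (hypothetical data)
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30)) + 0.1
beta = 0.5  # a fractional value, as the abstract allows

W0, H0 = beta_nmf(V, rank=3, beta=beta, n_iter=0)  # random initialization only
W, H = beta_nmf(V, rank=3, beta=beta, n_iter=200)
cost_before = beta_div(V, W0 @ H0, beta)
cost_after = beta_div(V, W @ H, beta)
```

Because both runs share the same seed, `cost_before` and `cost_after` measure the same trajectory at its start and after 200 iterations, so the comparison shows how far the updates reduce the cost from the initial point.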
Results using different sources.
| Mixtures | SDR (dB) | |
|---|---|---|
| Piano + trumpet | 16.11 | 2.11 |
| | 9.19 | 2.13 |
| | 9.43 | 1.93 |
| | 7.73 | 1.82 |
| | 12.21 | 2.09 |
| Piano + violin | 13.07 | 1.07 |
| | 8.15 | 1.23 |
| | 6.25 | 0.92 |
| | 9.33 | 1.20 |
| | 8.19 | 0.89 |
| Trumpet + violin | 14.63 | 0.68 |
| | 8.14 | 0.62 |
| | 7.81 | 0.67 |
| | 9.81 | 0.51 |
| | 7.55 | 0.52 |
Figure 1. Time-domain representation and log-frequency spectrogram of: the piano music (top panels); the trumpet music (middle panels); and the mixed signal (bottom panels).
Figure 2. Estimation of […] and […] with: (a) […]; (b) […]; (c) […]; (d) […].
Figure 3. Convergence trajectory of the sparsity parameter: (A) […]; (B) […]; (C) […]; and (D) […].
Figure 4. SDR results of piano and trumpet mixture when using different […] values.
Figure 5. Histogram of the converged adaptive sparsity parameter.
Results of separation for different mixture.
| Methods | SDR (dB) |
|---|---|
| Case (1) | 12.77 |
| Case (2) | 13.01 |
| Case (3) | 14.60 |
| Case (4) | 14.62 |
| Case (5) | 14.70 |
| Case (6) | 15.60 |
Figure 6. SDR values of the separation results of mixture using different β values.
Figure 7. SDR values of the separation results of mixture using different β values and sparse methods.
SDR (dB) results of adaptive versus fixed […].
| Mixtures | SDR (dB) Using Adaptive […] | SDR (dB) Using […] | SDR (dB) Using […] |
|---|---|---|---|
| Piano + Trumpet | 16.85 | 14.11 | 15.93 |
| | 10.74 | 7.95 | 9.01 |
| | 9.93 | 8.12 | 9.11 |
| | 8.95 | 6.57 | 7.44 |
| | 13.64 | 10.26 | 12.03 |
| Piano + Violin | 14.17 | 12.12 | 11.67 |
| | 9.04 | 7.95 | 7.11 |
| | 8.13 | 6.09 | 5.81 |
| | 10.4 | 9.08 | 8.71 |
| | 9.59 | 7.85 | 7.19 |
| Trumpet + Violin | 15.40 | 12.49 | 12.13 |
| | 8.87 | 6.23 | 6.31 |
| | 9.14 | 6.87 | 7.17 |
| | 10.51 | 7.92 | 8.11 |
| | 9.17 | 7.77 | 7.95 |
Performance comparison of proposed method with NMF models.
| Algorithm | SDR (dB) |
|---|---|
| NMF-LS | 4.17 |
| NMF-KLD | 3.47 |
| NMF-TCS | 5.12 |
| NMF-ARD | 3.98 |
| NMF using proposed method | 7.63 |
| Proposed method using matrix factor time–frequency deconvolution | 12.02 |
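The tables above report separation quality as SDR in dB. As a rough guide to reading those numbers, the sketch below computes a simplified signal-to-distortion ratio as the energy of a reference source over the energy of the estimation error. This is an assumption for illustration: this record does not state how the paper computes SDR, and evaluation toolkits typically decompose the error into several components before forming the ratio. The function name `sdr_db` and the test signal are hypothetical.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over error energy (assumed definition)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(distortion**2))

# Hypothetical example: the estimate is the reference scaled by 0.9,
# so the error carries 1% of the reference energy.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
reference = np.sin(2 * np.pi * 440.0 * t)
estimate = 0.9 * reference
sdr = sdr_db(reference, estimate)  # close to 20 dB
```

On this scale, every 10 dB corresponds to a tenfold reduction in error energy relative to the source energy.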