Bruno Sauvalle, Arnaud de La Fortelle.
Abstract
The goal of background reconstruction is to recover the background image of a scene from a sequence of frames showing this scene cluttered by various moving objects. This task is fundamental in image analysis and is generally the first step before more advanced processing. It is difficult, however, because there is no formal definition of what should be considered background or foreground, and the results may be severely impacted by challenges such as illumination changes, intermittent object motion, and highly cluttered scenes. In this paper we propose a new iterative algorithm for background reconstruction, in which the current estimate of the background is used to guess which image pixels are background pixels, and a new background estimate is then computed using those pixels only. We show that the proposed algorithm, which uses stochastic gradient descent for improved regularization, is more accurate than the state of the art on the challenging SBMnet dataset, especially for short videos with low frame rates. It is also fast: with a Python implementation accelerated by a graphics processing unit (GPU) and parameterized for maximal accuracy, it reaches an average of 52 fps on this dataset.
Keywords: background generation; background initialization; background reconstruction; background subtraction; motion detection; scene parsing
Year: 2022 PMID: 35049850 PMCID: PMC8780815 DOI: 10.3390/jimaging8010009
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Figure 1. Schematic of loss function and gradient computation (images are normalized in the range [0,1]).
Figure 2. Overview of the stochastic gradient descent optimization process.
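The iterative idea described in the abstract (use the current background estimate to guess which pixels are background, then re-estimate from those pixels with stochastic gradient steps) can be illustrated with a minimal NumPy sketch. This is not the authors' loss function or implementation: the Gaussian soft mask, the temporal-mean initialization, the function names, and every hyperparameter (`lr`, `tau`, `batch_size`, `n_iter`) are assumptions chosen only to make the toy example work on images normalized to [0, 1].

```python
import numpy as np

def reconstruct_background(frames, n_iter=300, batch_size=8, lr=0.5, tau=0.2, seed=0):
    """Estimate a static background from frames of shape (T, H, W) in [0, 1].

    Hypothetical sketch: soft-masked stochastic gradient descent, not the paper's loss.
    """
    rng = np.random.default_rng(seed)
    frames = np.asarray(frames, dtype=np.float64)
    # Crude initial estimate; it is biased wherever moving objects pass.
    background = frames.mean(axis=0)
    for _ in range(n_iter):
        # Draw a random mini-batch of frames (the "stochastic" part).
        idx = rng.choice(len(frames), size=min(batch_size, len(frames)), replace=False)
        residual = background[None] - frames[idx]  # shape (B, H, W)
        # Soft guess that a pixel shows background in a given frame:
        # close to 1 when the frame agrees with the current estimate (assumed form).
        weight = np.exp(-(residual / tau) ** 2)
        # Gradient of 0.5 * weight * residual**2 with the weights held fixed.
        grad = (weight * residual).mean(axis=0)
        background -= lr * grad
    return background

def make_toy_sequence(n_frames=52, size=16):
    """Synthetic scene: smooth background plus a bright 4x4 square sliding sideways."""
    truth = np.tile(np.linspace(0.2, 0.5, size), (size, 1))
    frames = np.repeat(truth[None], n_frames, axis=0)
    for t in range(n_frames):
        x = t % (size - 3)
        frames[t, 6:10, x:x + 4] = 1.0
    return frames, truth

frames, truth = make_toy_sequence()
estimate = reconstruct_background(frames)
mean_error = np.abs(frames.mean(axis=0) - truth).max()
sgd_error = np.abs(estimate - truth).max()
print(f"max abs error: temporal mean {mean_error:.3f}, iterative estimate {sgd_error:.3f}")
```

On this toy sequence the temporal mean stays contaminated by the moving square, while the re-weighted updates progressively ignore frames that disagree with the current estimate. A soft weight is used here instead of a hard mask because a hard threshold can discard every frame at a pixel when the initial estimate is poor.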
Evaluation results per criterion on the SBMnet 2016 dataset. ↓ indicates that a lower score is better, ↑ indicates that a higher score is better. Source: SBMnet website http://pione.dinf.usherbrooke.ca/results/294/ (accessed on 20 November 2021).
| Method | Average AGE ↓ | Average pEPs ↓ | Average pCEPs ↓ | Average MS-SSIM ↑ | Average PSNR ↑ | Average CQM ↑ |
|---|---|---|---|---|---|---|
| BB-SGD (ours) | | | | | | |
| SPMD [ | 6.0985 | 0.0487 | 0.0154 | 0.9412 | 29.8439 | 30.6499 |
| LabGen-OF [ | 6.1897 | 0.0566 | 0.0232 | 0.9412 | 29.8957 | 30.7006 |
| FSBE [ | 6.6204 | 0.0605 | 0.0217 | 0.9373 | 29.3378 | 30.1777 |
| BEWIS [ | 6.7094 | 0.0592 | 0.0266 | 0.9282 | 28.7728 | 29.6342 |
| NExBI [ | 6.7778 | 0.0671 | 0.0227 | 0.9196 | 27.9944 | 28.8810 |
| Photomontage [ | 7.1950 | 0.0686 | 0.0257 | 0.9189 | 28.0113 | 28.8719 |
| SOBS [ | 7.5183 | 0.0711 | 0.0242 | 0.9160 | 27.6533 | 28.5601 |
| Temporal Median Filter [ | 8.2761 | 0.0984 | 0.0546 | 0.9130 | 27.5364 | 28.4434 |
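Some of the criteria in the table above are simple to compute from a ground-truth background and a reconstruction. The sketch below assumes the usual SBMnet-style definitions: AGE as the mean absolute gray-level difference, pEPs as the fraction of pixels whose absolute error exceeds a threshold (20 gray levels here, which is an assumption), and PSNR with a peak value of 255. The function names are illustrative, and `temporal_median` is the plain per-pixel median baseline, not the proposed method.

```python
import numpy as np

def age(ground_truth, background):
    """Average gray-level error: mean absolute difference on 8-bit gray images."""
    diff = ground_truth.astype(np.float64) - background.astype(np.float64)
    return float(np.mean(np.abs(diff)))

def peps(ground_truth, background, threshold=20):
    """Fraction of error pixels, i.e. pixels whose absolute error exceeds `threshold`."""
    diff = ground_truth.astype(np.float64) - background.astype(np.float64)
    return float(np.mean(np.abs(diff) > threshold))

def psnr(ground_truth, background, max_value=255.0):
    """Peak signal-to-noise ratio in dB (infinite for a perfect reconstruction)."""
    diff = ground_truth.astype(np.float64) - background.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(max_value ** 2 / mse))

def temporal_median(frames):
    """Baseline: per-pixel median over the time axis of a (T, H, W) stack."""
    return np.median(frames, axis=0)

# Toy data: flat background of gray level 100; in frame t, row t is hidden by an object.
gt = np.full((8, 8), 100, dtype=np.uint8)
frames = np.repeat(gt[None], 5, axis=0)
for t in range(5):
    frames[t, t, :] = 200
median_bg = temporal_median(frames)

# A deliberately degraded reconstruction: one row is off by 40 gray levels.
degraded = gt.copy()
degraded[0, :] = 140

print("median baseline AGE:", age(gt, median_bg))
print("degraded AGE / pEPs / PSNR:", age(gt, degraded), peps(gt, degraded), psnr(gt, degraded))
```

The inputs are cast to floating point before subtraction so that 8-bit images do not wrap around. The median baseline succeeds here only because every pixel is occluded in a minority of frames, which is exactly the assumption that fails on the cluttered and intermittent-motion categories.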
Evaluation results for the AGE criterion per category on the SBMnet 2016 dataset. Source: SBMnet website http://pione.dinf.usherbrooke.ca/results/294/ (accessed on 20 November 2021).
| Method | Basic | Interm. Motion | Clutter | Jitter | Illumin. Changes | Backgr. Motion | Very Long | Very Short |
|---|---|---|---|---|---|---|---|---|
| BB-SGD (ours) | | 4.8898 | | 9.5374 | 4.5227 | | 5.6494 | |
| SPMD [ | 3.8141 | | 4.5998 | 9.8095 | | 9.9115 | 6.0926 | 5.9017 |
| LabGen-OF [ | 3.8421 | 4.6433 | 4.1821 | 9.2410 | 8.2200 | 10.0698 | 4.2856 | 5.0338 |
| FSBE [ | 3.8960 | 5.3438 | 4.7660 | 10.3878 | 5.5089 | 10.5862 | 6.9832 | 5.4912 |
| BEWIS [ | 4.0673 | 4.7798 | 10.6714 | 9.4156 | 5.9048 | 9.6776 | | 5.1937 |
| Photomontage [ | 4.4856 | 7.1460 | 6.8195 | 10.1272 | 5.2668 | 12.0930 | 6.6446 | 4.9770 |
| SOBS [ | 4.3598 | 6.2583 | 7.0590 | 10.0232 | 10.3591 | 10.7280 | 6.0638 | 5.2953 |
| Temporal Median Filter [ | 3.8269 | 6.8003 | 12.5316 | | 12.2205 | 9.6479 | 6.9588 | 5.1336 |
Evaluation results per criterion on the SBI dataset. ↓ indicates that a lower score is better, ↑ indicates that a higher score is better.
| Method | Average AGE ↓ | Average pEPs ↓ | Average pCEPs ↓ | Average MS-SSIM ↑ | Average PSNR ↑ |
|---|---|---|---|---|---|
| BB-SGD (ours) | | 0.0083 | 0.0058 | | |
| LabGen-OF [ | 2.7191 | 0.0145 | 0.0106 | 0.9824 | 35.9758 |
| SS-SVD [ | 2.7479 | 0.0345 | 0.0907 | 0.9464 | 31.8116 |
| LabGen [ | 2.9945 | 0.0139 | 0.0092 | 0.9764 | 35.2028 |
| NExBI [ | 3.0547 | | | 0.9835 | 35.3078 |
| BEWIS [ | 3.8665 | 0.0242 | 0.0142 | 0.9675 | 32.0143 |
| Photomontage [ | 5.8238 | 0.0469 | 0.0372 | 0.9334 | 31.8573 |
| SOBS [ | 3.5023 | 0.0415 | 0.0222 | 0.9765 | 35.2723 |
| Temporal Median Filter [ | 10.3744 | 0.1340 | 0.1055 | 0.8533 | 28.0044 |
AGE scores obtained using various truncated versions of the algorithm on 18 SBMnet sequences where a ground truth background is available.
| Category | Video | Truncated model v0 | Truncated model v1 | Truncated model v2 | Truncated model v3 | Full model |
|---|---|---|---|---|---|---|
| background motion | advertisementBoard | 1.61 | 1.62 | 1.60 | 1.34 | 1.71 |
| basic | 511 | 3.42 | 3.44 | 3.43 | 3.44 | 3.43 |
| | Blurred | 1.80 | 1.69 | 1.68 | 1.68 | 1.61 |
| clutter | Foliage | 32.87 | 5.86 | 3.62 | 3.41 | 3.37 |
| | Board | 21.37 | 6.78 | 7.84 | 7.37 | 7.39 |
| | People and Foliage | 31.36 | 9.66 | 3.75 | 2.54 | 2.60 |
| | boulevardJam | 21.37 | 15.89 | 19.5 | 11.0 | 2.03 |
| illumination change | CameraParameter | 11.49 | 22.19 | 2.16 | 2.81 | 2.95 |
| intermittent motion | busStation | 5.31 | 5.40 | 5.47 | 5.67 | 5.32 |
| | Candela_m1.10 | 4.93 | 5.09 | 5.18 | 5.21 | 2.81 |
| | CaVignal | 12.57 | 12.61 | 13.58 | 14.04 | 2.05 |
| | AVSS2007 | 10.98 | 10.32 | 10.25 | 10.01 | 8.73 |
| jitter | badminton | 2.62 | 2.00 | 1.93 | 1.74 | 1.84 |
| | boulevard | 9.61 | 10.09 | 10.29 | 10.51 | 9.71 |
| very long | BusStopMorning | 3.68 | 3.66 | 3.64 | 3.62 | 3.61 |
| very short | Toscana | 8.79 | 8.80 | 8.79 | 3.30 | 3.30 |
| | DynamicBackground | 6.96 | 6.96 | 6.96 | 8.20 | 8.18 |
| | CUHK_Square | 2.77 | 2.77 | 2.77 | 2.99 | 2.98 |
| Average AGE by category | | 8.06 | 7.53 | 4.94 | 4.51 | 3.75 |
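The last row appears to be an average of per-category averages rather than a plain average over the 18 videos. The short check below, which only re-uses the full-model scores from the table, illustrates that reading: each category is averaged first, then the eight category means are averaged. The variable names are illustrative.

```python
from statistics import mean

# Full-model AGE scores copied from the table above, grouped by category.
full_model_age = {
    "background motion": [1.71],
    "basic": [3.43, 1.61],
    "clutter": [3.37, 7.39, 2.60, 2.03],
    "illumination change": [2.95],
    "intermittent motion": [5.32, 2.81, 2.05, 8.73],
    "jitter": [1.84, 9.71],
    "very long": [3.61],
    "very short": [3.30, 8.18, 2.98],
}

# Average within each category first, then across categories.
per_category = {name: mean(scores) for name, scores in full_model_age.items()}
average_age_by_category = mean(per_category.values())

# For comparison: plain average over all 18 videos, ignoring categories.
plain_average = mean(score for scores in full_model_age.values() for score in scores)

print(f"average AGE by category: {average_age_by_category:.3f}")
print(f"plain average over videos: {plain_average:.3f}")
```

Averaging by category gives about 3.745, in line with the 3.75 reported for the full model, whereas the plain per-video average is about 4.09. Weighting categories equally prevents categories with many sequences, such as clutter and intermittent motion, from dominating the score.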
Impact of reducing the number of iterations on average AGE score and computation time.
| Number of Iterations | 100 | 250 | 500 | 1000 | 3000 |
|---|---|---|---|---|---|
| Learning Rate | 0.06 | 0.03 | 0.03 | 0.03 | 0.03 |
| Computation time for 79 videos of the SBMnet dataset (seconds) | 337 | 391 | 482 | 666 | 1409 |
| Average AGE by category on 18 videos of the SBMnet dataset listed in | 4.07 | 3.83 | 3.80 | 3.76 | 3.75 |
| Average AGE on SBI dataset | 2.78 | 2.56 | 2.53 | 2.49 | 2.46 |
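The trade-off in this table can be made explicit with a few lines of arithmetic. The sketch below only re-uses the reported values; "seconds per video" is simply the total time divided by 79 and is an illustrative quantity, not a figure from the paper.

```python
# Values copied from the table above.
iterations = [100, 250, 500, 1000, 3000]
seconds_for_79_videos = [337, 391, 482, 666, 1409]
age_sbmnet_subset = [4.07, 3.83, 3.80, 3.76, 3.75]
age_sbi = [2.78, 2.56, 2.53, 2.49, 2.46]

reference_time = seconds_for_79_videos[-1]  # 3000 iterations is the reference setting
rows = []
for n, seconds, a_sbmnet, a_sbi in zip(iterations, seconds_for_79_videos,
                                       age_sbmnet_subset, age_sbi):
    rows.append({
        "iterations": n,
        "seconds_per_video": seconds / 79,
        "speedup_vs_3000": reference_time / seconds,
        "age_increase_sbmnet": a_sbmnet - age_sbmnet_subset[-1],
        "age_increase_sbi": a_sbi - age_sbi[-1],
    })

for row in rows:
    print(f"{row['iterations']:>5} iterations: "
          f"{row['seconds_per_video']:.1f} s/video, "
          f"{row['speedup_vs_3000']:.2f}x faster, "
          f"AGE +{row['age_increase_sbmnet']:.2f} (SBMnet subset), "
          f"+{row['age_increase_sbi']:.2f} (SBI)")
```

Read this way, cutting from 3000 to 100 iterations makes the run roughly four times faster at the cost of about 0.3 AGE on both datasets, while 1000 iterations already comes within 0.01 to 0.03 AGE of the reference setting.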
Figure 3. Examples of background reconstruction using the proposed model and comparison with SPMD and LabGen-OF.
Figure 4. Examples of background reconstruction. The bottom five rows show examples of low-quality reconstructions.