| Literature DB >> 30340356 |
Gang-Joon Yoon1, Hyeong Jae Hwang2, Sang Min Yoon3.
Abstract
Visual object tracking is a fundamental research area in the field of computer vision and pattern recognition because it can be utilized by various intelligent systems. However, visual object tracking faces various challenging issues because tracking is influenced by illumination change, pose change, partial occlusion and background clutter. Sparse representation-based appearance modeling and dictionary learning that optimize tracking history have been proposed as one possible solution to overcome the problems of visual object tracking. However, there are limitations in representing high dimensional descriptors using the standard sparse representation approach. Therefore, this study proposes a structured sparse principal component analysis to represent the complex appearance descriptors of the target object effectively with a linear combination of a small number of elementary atoms chosen from an over-complete dictionary. Using an online dictionary for learning and updating by selecting similar dictionaries that have high probability makes it possible to track the target object in a variety of environments. Qualitative and quantitative experimental results, including comparison to the current state of the art visual object tracking algorithms, validate that the proposed tracking algorithm performs favorably with changes in the target object and environment for benchmark video sequences.Entities:
Keywords: appearance model; online learning; structured visual dictionary; visual object tracking structured sparse PCA
Year: 2018 PMID: 30340356 PMCID: PMC6209897 DOI: 10.3390/s18103513
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1Representation of the target object using structured sparse PCA and deterministic classification between the target object and background image patches.
Notations and symbols.
| Symbol | Description |
|---|---|
|
| Frame at time |
|
| State variable |
|
| Observation variable |
|
| Location vector in the state variable |
|
| Target descriptor vector |
|
| Background descriptor vector |
|
| Patch image |
|
| Column vectors of |
|
| Column vectors of |
|
| Feature descriptor |
|
| Feature dictionary |
|
| Feature coefficient matrices |
|
| Support vector machine classifier |
|
| |
|
| Multivariate normal distribution |
|
| Set of target descriptors |
|
| Probability function |
|
| Dimension of descriptors |
|
| Number of dictionary vectors |
|
| Number of background descriptors |
|
| Number of vectors after updating |
|
| Time variable |
|
| Real number |
|
| Real number |
|
| Width ( |
|
| Height ( |
| ≈ | Approximately equal |
| ∝ | Proportional to |
|
| Transpose operator |
Figure 2Representation of the target object using structured sparse PCA and deterministic classification between the target object and background image patches.
Figure 3Procedure to find the most similar target object templates using confidence (Equation (9)). (a) Typical explanation to find the target object by weighting the scale factor from positive candidate templates to prevent drift, partial occlusion and scaling problems; (b) real image-based re-weighting procedure to find similar templates from positive image templates.
Figure 4Target models in F at the update time and the detection of the target at a later frame.
Average of overlap score of the proposed tracker and several current state of the art trackers ((BC), deformation (DEF), fast motion (FM), in-plane rotation (IPR), illumination variation (IV), low resolution (LR), motion blur (MB), occlusion (OCC), out-of-plane rotation (OPR), out-of view (OV) and scale variation (SV)). The top two methods for each dataset are highlighted in red and blue, respectively. VTD, visual tracking decomposition; MS, mean-shift; MIL, multiple instance learning; SCM, sparse collaborative appearance; Frag, fragment-based; TLD, tracking-learning-detection.
| All | BC | DEF | FM | IPR | IV | LR | MB | OCC | OPR | OV | SV | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Proposed |
|
|
|
|
| 52.03 |
|
| 55.00 |
|
| 52.33 |
| VTD [ | 49.3 | 55.1 | 46.2 | 41.7 | 50.2 | 53.7 | 47.1 | 43.5 | 52.3 | 53.7 | 51.5 | 48.9 |
| MS [ | 35.6 | 36.7 | 32.8 | 40.5 | 36.8 | 34.6 | 28.4 | 41.2 | 37.4 | 37.3 | 41.0 | 36.0 |
| MIL [ | 45.9 | 48.6 | 45.7 | 44.1 | 45.7 | 47.1 | 43.5 | 43.7 | 47.6 | 48.9 | 52.7 | 44.5 |
| SCM [ | 54.4 |
| 51.5 | 42.8 | 51.8 | 61.1 | 61.7 | 45.2 | 56.8 | 57.0 | 56.4 |
|
| Frag [ | 44.2 | 46.1 | 41.8 | 44.8 | 43.3 | 42.6 | 42.6 | 46.1 | 46.6 | 46.1 | 50.1 | 44.2 |
| IVT [ | 46.4 | 51.6 | 40.5 | 37.3 | 46.4 | 51.2 | 55.8 | 41.3 | 49.3 | 49.0 | 52.3 | 47.1 |
| TLD [ | 46.8 | 48.3 | 37.4 | 44.6 | 48.9 | 46.7 | 53.3 | 51.0 | 45.2 | 46.0 | 50.2 | 47.1 |
| Struct [ |
| 59.3 |
|
|
|
| 59.1 |
|
|
|
|
|
| ASLA [ | 53.2 | 59.2 | 50.5 | 42.0 | 52.1 |
|
| 44.6 |
| 56.3 | 55.3 | 54.0 |
Figure 5Tracking during partial occlusion and drift.
Figure 6Tracking during illumination changes.
Figure 7Tracking during background clutter changes.
Figure 8Tracking comparison for the proposed and current state of the art trackers for the Bolt, Lamming, Racecar and Singer image sequences.