Oscar Martinez Mozos, Hitoshi Mizutani, Ryo Kurazume, Tsutomu Hasegawa.
Abstract
The categorization of places in indoor environments is an important capability for service robots working and interacting with humans. In this paper we present a method to categorize different areas in indoor environments using a mobile robot equipped with a Kinect camera. Our approach transforms depth and grey scale images taken at each place into histograms of local binary patterns (LBPs) whose dimensionality is further reduced following a uniform criterion. The histograms are then combined into a single feature vector which is categorized using a supervised method. In this work we compare the performance of support vector machines and random forests as supervised classifiers. Finally, we apply our technique to distinguish five different place categories: corridors, laboratories, offices, kitchens, and study rooms. Experimental results show that we can categorize these places with high accuracy using our approach.
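
The step of combining the two histograms into a single feature vector can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the normalization of each histogram, and the tiny input images are assumptions, and the uniformity-based reduction is omitted here.

```python
from collections import Counter

def lbp_histogram(lbp_image, bins=256):
    """Normalized histogram of the LBP codes (integers in [0, bins)) of a 2-D list."""
    counts = Counter(code for row in lbp_image for code in row)
    total = sum(counts.values())
    # Normalizing by the pixel count is an assumption made for this sketch.
    return [counts.get(code, 0) / total for code in range(bins)]

def combined_feature(lbp_grey, lbp_depth):
    """Concatenate the grey and depth histograms into one feature vector."""
    return lbp_histogram(lbp_grey) + lbp_histogram(lbp_depth)

# Hypothetical 2x2 LBP-transformed images, for illustration only.
t_grey = [[236, 236], [0, 255]]
t_depth = [[17, 17], [17, 17]]
x = combined_feature(t_grey, t_depth)
print(len(x))  # 512 entries: 256 for grey followed by 256 for depth
```

A vector built this way would then be passed to a supervised classifier such as an SVM or a random forest.
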
Keywords: Kinect sensor; place categorization; service robots
Year: 2012 PMID: 22778665 PMCID: PMC3386764 DOI: 10.3390/s120506695
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. (a) Depth image obtained in a laboratory using the Kinect sensor. Different depths are shown using different grey values. Completely black pixels represent undefined values (see Section 3.2); (b) Corresponding RGB image representing the same scene; (c) The Kinect sensor used in our approach.
Figure 2. Toy example for the calculation of the LBP value of a pixel in a grey scale image. (a) The reference pixel p (marked in bold in a shaded cell) has an initial value of 100; (b) Corresponding binary values for the 8 neighboring pixels of p. The values are arranged into a binary string following a clockwise order starting at b, with a corresponding decimal value of 236; (c) The obtained decimal value is used as the new value for p in the transformed image T.
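
The calculation in the toy example can be sketched in a few lines. Several details are assumptions, since the caption does not fix them: the starting neighbour (top-left here), the comparison rule (a bit is 1 when the neighbour is greater than or equal to the centre), and the 3x3 patch itself, which is hypothetical and merely chosen so that the result is also 236.

```python
# Clockwise (row, col) offsets of the 8 neighbours, starting at the top-left one.
# The starting neighbour is an assumption; the caption only fixes the clockwise order.
CLOCKWISE = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_value(image, r, c):
    """LBP code of pixel (r, c); the first neighbour gives the most significant bit."""
    centre = image[r][c]
    code = 0
    for dr, dc in CLOCKWISE:
        # Assumed comparison rule: bit is 1 when the neighbour is >= the centre.
        bit = 1 if image[r + dr][c + dc] >= centre else 0
        code = (code << 1) | bit
    return code

def lbp_transform(image):
    """LBP-transformed image T over interior pixels (border pixels are skipped)."""
    rows, cols = len(image), len(image[0])
    return [[lbp_value(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]

# Hypothetical 3x3 patch (not the one in Figure 2) whose centre value is 100.
patch = [
    [120, 110, 105],
    [90, 100, 60],
    [80, 101, 130],
]
print(lbp_value(patch, 1, 1))  # 236, i.e. binary 11101100
```

Skipping the border pixels is one simple way to handle neighbours that fall outside the image; padding would be an equally valid choice.
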
Figure 3. Example LBP transformations. (a) Original RGB (upper) and depth (bottom) images; (b) Corresponding LBP transformed images: T (upper) and T (bottom).
Dataset containing a total of 1,228 pairs of RGB and depth images.
| Category | Place | Image pairs |
|---|---|---|
| Corridor | Corridor 1 | 68 |
| Corridor | Corridor 2 | 42 |
| Corridor | Corridor 3 | 70 |
| Corridor | Corridor 4 | 99 |
| Corridor | Total | 279 |
| Kitchen | Kitchen 1 | 73 |
| Kitchen | Kitchen 2 | 65 |
| Kitchen | Kitchen 3 | 53 |
| Kitchen | Total | 191 |
| Laboratory | Laboratory 1 | 99 |
| Laboratory | Laboratory 2 | 99 |
| Laboratory | Laboratory 3 | 81 |
| Laboratory | Laboratory 4 | 78 |
| Laboratory | Total | 357 |
| Study Room | Study Room 1 | 71 |
| Study Room | Study Room 2 | 70 |
| Study Room | Study Room 3 | 49 |
| Study Room | Study Room 4 | 62 |
| Study Room | Total | 252 |
| Office | Office 1 | 57 |
| Office | Office 2 | 45 |
| Office | Office 3 | 47 |
| Office | Total | 149 |
Figure 4. Examples of RGB and depth images for the places in each category.
Overall classification results using SVMs and different uniformity thresholds. We show the averages and standard deviations over 10 experiments.
| θ = 2 | θ = 4 | θ = 6 | θ = 8 (CENTRIST) |
|---|---|---|---|
| 87.27 ± 10.71 | 92.61 ± 4.78 | 89.71 ± 9.92 | 89.37 ± 8.85 |
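
The effect of the uniformity threshold θ on the histogram size can be sketched as below. The sketch assumes the usual uniform-LBP convention: the uniformity of an 8-bit pattern is the number of 0/1 transitions in its circular bit string, patterns with at most θ transitions keep their own bin, and all remaining patterns share one extra bin. Whether the authors use exactly this binning is an assumption, and the function names are hypothetical.

```python
def uniformity(code, bits=8):
    """Number of 0/1 transitions in the circular bit string of `code`."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))  # rotate right by one bit
    return bin(code ^ rotated).count("1")

def reduction_map(theta, bits=8):
    """Map each LBP code to a reduced bin; non-uniform codes share the last bin."""
    n_codes = 2 ** bits
    kept = [code for code in range(n_codes) if uniformity(code, bits) <= theta]
    index = {code: i for i, code in enumerate(kept)}
    if len(kept) == n_codes:
        return index, n_codes  # nothing to merge, no shared bin needed
    shared = len(kept)
    return {code: index.get(code, shared) for code in range(n_codes)}, shared + 1

def reduce_histogram(hist, theta):
    """Reduce a 256-bin LBP histogram according to the threshold theta."""
    mapping, size = reduction_map(theta)
    reduced = [0] * size
    for code, count in enumerate(hist):
        reduced[mapping[code]] += count
    return reduced

for theta in (2, 4, 6, 8):
    print(theta, reduction_map(theta)[1])  # sizes 59, 199, 255, 256
```

An 8-bit circular pattern has at most 8 transitions, so θ = 8 keeps all 256 patterns and performs no reduction, which fits the CENTRIST label on that column.
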
Figure 5. Correct classification rates by category using different uniformity thresholds.
Confusion matrices for place categorization using SVMs and different uniformity thresholds.
| Actual class \ Predicted class (%) | Corridor | Kitchen | Laboratory | Study room | Office |
|---|---|---|---|---|---|
| Corridor | | 0.20 ± 0.63 | 3.84 ± 6.25 | 0.91 ± 1.93 | 0.00 ± 0.00 |
| Kitchen | 2.64 ± 3.99 | | 4.15 ± 7.96 | 22.64 ± 25.32 | 1.13 ± 2.97 |
| Laboratory | 0.25 ± 0.78 | 1.24 ± 3.90 | | 2.26 ± 2.97 | 0.75 ± 1.94 |
| Study Room | 0.00 ± 0.00 | 3.29 ± 4.61 | 10.57 ± 11.15 | | 0.32 ± 1.02 |
| Office | 0.00 ± 0.00 | 4.39 ± 4.69 | 5.09 ± 5.57 | 0.00 ± 0.00 | |
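
A confusion matrix expressed in percentages, as above, can be computed by normalizing each row by the number of samples of the actual class. The sketch below shows one way to do this for a single experiment. The labels are hypothetical and are not data from the paper, and the averaging over repeated experiments that yields the ± values is not shown.

```python
CLASSES = ["Corridor", "Kitchen", "Laboratory", "Study room", "Office"]

def confusion_percent(actual, predicted, classes=CLASSES):
    """Row-normalized confusion matrix: entry [i][j] is the percentage of
    samples of actual class i that were predicted as class j."""
    idx = {name: i for i, name in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        counts[idx[a]][idx[p]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Classes without any sample get an all-zero row.
        matrix.append([100.0 * n / total if total else 0.0 for n in row])
    return matrix

# Hypothetical labels, for illustration only (not data from the paper).
actual = ["Kitchen", "Kitchen", "Kitchen", "Kitchen", "Office", "Office"]
predicted = ["Kitchen", "Kitchen", "Kitchen", "Study room", "Office", "Office"]
m = confusion_percent(actual, predicted)
print(m[1])  # Kitchen row: 75% Kitchen, 25% Study room
```

With this normalization every non-empty row sums to 100, so the diagonal entry of a row is the correct classification rate of that category.
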
Figure 6. Correct classification rates using different modalities.
Figure 7. Histograms using Spatial Pyramids [28]. Three levels of pyramids are applied and the corresponding local histograms are concatenated to form the final feature vector x.
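
The concatenation of local histograms over pyramid levels can be sketched as follows. The cell layout is an assumption: level l is taken to split the image into 2^l x 2^l non-overlapping cells, which is a common spatial pyramid scheme but may differ from the layout used in [28]. The function names and the small test image are hypothetical.

```python
def histogram(codes, bins=256):
    """Plain count histogram of integer codes in [0, bins)."""
    hist = [0] * bins
    for code in codes:
        hist[code] += 1
    return hist

def pyramid_feature(lbp_image, max_level=2, bins=256):
    """Concatenate local histograms over a spatial pyramid.

    Level l splits the image into 2**l x 2**l cells (assumed layout).
    """
    rows, cols = len(lbp_image), len(lbp_image[0])
    feature = []
    for level in range(max_level + 1):
        n = 2 ** level
        for i in range(n):
            for j in range(n):
                r0, r1 = i * rows // n, (i + 1) * rows // n
                c0, c1 = j * cols // n, (j + 1) * cols // n
                cell = [lbp_image[r][c]
                        for r in range(r0, r1) for c in range(c0, c1)]
                feature.extend(histogram(cell, bins))
    return feature

# Hypothetical 4x4 image with codes 0..3, using 4 bins to keep the output small.
image = [[c for c in range(4)] for _ in range(4)]
x = pyramid_feature(image, max_level=2, bins=4)
print(len(x))  # (1 + 4 + 16) cells * 4 bins = 84
```

Under this layout the feature length grows with the number of cells, from one histogram at level 0 to 21 histograms when levels 0 to 2 are combined.
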
Comparison of single and combined modalities. Results are shown as percentages together with standard deviations.
| Threshold | Level | Grey | Depth | Grey + Depth |
|---|---|---|---|---|
| θ = 2 | Level 0 | 73.72 ± 19.84 | 78.37 ± 20.03 | 87.27 ± 10.71 |
| θ = 2 | Level 1 | 80.93 ± 21.79 | 83.22 ± 16.40 | 85.53 ± 19.46 |
| θ = 2 | Level 2 | 82.21 ± 23.26 | 84.93 ± 17.18 | 82.46 ± 23.67 |
| θ = 4 | Level 0 | 78.75 ± 18.01 | 82.15 ± 20.53 | 92.61 ± 4.78 |
| θ = 4 | Level 1 | 78.56 ± 23.13 | 89.02 ± 10.77 | 88.10 ± 15.75 |
| θ = 4 | Level 2 | 78.87 ± 22.80 | 86.67 ± 16.28 | 88.95 ± 14.18 |
| θ = 6 | Level 0 | 77.38 ± 17.73 | 80.70 ± 16.40 | 89.71 ± 9.92 |
| θ = 6 | Level 1 | 80.33 ± 17.44 | 85.08 ± 12.58 | 87.18 ± 12.4 |
| θ = 6 | Level 2 | 78.33 ± 18.18 | 82.18 ± 15.55 | 80.69 ± 15.32 |
| θ = 8 (CENTRIST) | Level 0 | 76.60 ± 20.43 | 80.72 ± 20.14 | 89.37 ± 8.85 |
| θ = 8 (CENTRIST) | Level 1 | 79.47 ± 21.78 | 85.11 ± 17.52 | 85.68 ± 17.88 |
| θ = 8 (CENTRIST) | Level 2 | 82.18 ± 18.30 | 83.14 ± 20.13 | 84.59 ± 19.69 |
Comparison of SVM and random forest as categorization methods, using as input reduced feature vectors with uniformity threshold θ = 4. Results are shown as percentages together with standard deviations.
| Level | SVM | Random Forest |
|---|---|---|
| 0 | 92.61 ± 4.78 | 85.74 ± 11.82 |
| 1 | 88.10 ± 15.76 | 87.57 ± 14.23 |
| 2 | 88.95 ± 14.18 | 88.43 ± 12.79 |