Zuria Bauer, Francisco Gomez-Donoso, Edmanuel Cruz, Sergio Orts-Escolano, Miguel Cazorla.
Abstract
In this paper, we propose a new dataset for outdoor depth estimation from single and stereo RGB images. The dataset was acquired from the point of view of a pedestrian. Currently, the most novel approaches take advantage of deep learning-based techniques, which have proven to outperform traditional state-of-the-art computer vision methods. Nonetheless, these methods require large amounts of reliable ground-truth data. Although several datasets already exist that could be used for depth estimation, almost none of them is outdoor-oriented and captured from an egocentric point of view. Our dataset introduces a large number of high-definition pairs of color frames and corresponding depth maps from a human perspective. In addition, the proposed dataset features human interaction and great variability of data, as shown in this work.
Year: 2019 PMID: 31467361 PMCID: PMC6715739 DOI: 10.1038/s41597-019-0168-5
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Review of the main elements of the most popular RGB-D datasets.
| Dataset | Scale: Frames | Scale: Scenes | Resolution | Modality: RGB | Modality: Depth | Modality: Point Cloud | Outdoor | Video |
|---|---|---|---|---|---|---|---|---|
| SCENENET RGB-D | 5 M | 57 | 320 × 240 | ✓ | ✓ | ✗ | ✗ | ✓ |
| MATTERPORT 3D | 200 k | 90 | 1280 × 1024 | ✓ | ✓ | ✓ | ✗ | ✓ |
| STANFORD 2D-3D | 70 k | 270 | 1080 × 1080 | ✓ | ✓ | ✓ | ✗ | ✓ |
| SCENE NN | 2 k–10 k each scene | 100 | 640 × 480 | ✓ | ✓ | ✗ | ✗ | ✓ |
| MICROSOFT RGB-D | 500–1 k each seq. | 7 | 640 × 480 | ✓ | ✓ | ✗ | ✗ | ✓ |
| VIDRILO | 22 k | 10 | 640 × 480 | ✓ | ✗ | ✓ | ✗ | ✓ |
| SUN RGBD | 10 k | 270 | 1080 × 1080 | ✓ | ✓ | ✓ | ✗ | ✗ |
| NYU-D V2 | 1.5 k | 464 | 640 × 480 | ✓ | ✓ | ✗ | ✗ | ✓ |
| B3DO | 849 | 75 | 1080 × 1080 | ✓ | ✓ | ✗ | ✗ | ✗ |
| SYNTHIA | 200 k | | 960 × 720 | ✓ | ✓ | ✗ | ✓ | ✗ |
| KITTI | 1.6 k | 400 | 1392 × 512 | ✓ | ✓ | ✓ | ✓ | ✗ |
| ETH3D | 898 | 25 | | ✓ | ✓ | ✗ | ✓ | ✓ |
| MAKE 3D | 534 | | 1280 × 1024 | ✓ | ✓ | ✗ | ✓ | ✓ |
| Tanks and Temples | 147791 | 14 | 1600 × 1200 | ✓ | ✓ | ✓ | ✗ | ✓ |
| Middlebury Dataset | 40 k | 33 | 2964 × 1988 | ✓ | ✓ | ✗ | ✗ | ✗ |
| UASOL | | | 2208 × 1242 | ✓ | ✓ | ✗ | ✓ | ✓ |
The features that matter most to us are scale (the number of frames and the number of different layouts or scenes in the dataset), resolution (in pixels), modality (the data provided by the dataset) and whether it is an outdoor dataset.
Fig. 1 General overview of some of the information and components of the UASOL dataset.
Review of the main features of the ZED camera.
| Features | |
|---|---|
| Size (mm) | 175 × 30 × 33 |
| Weight (g) | 159 |
| Image and Depth Resolution (pixels) | HD2K: 2208 × 1242 (15 FPS), HD1080: 1920 × 1080 (30, 15 FPS), HD720: 1280 × 720 (60, 30, 15 FPS), WVGA: 672 × 376 (100, 60, 30, 15 FPS) |
| Depth | Range (m): 1–20, Format (bits): 32, Baseline (mm): 120 |
| Lens | FoV: 110, |
| Sensors | Size: 1/3, Format: 19:9, Pixel Size: 2 µm |
| Connectivity | USB 3.0 (5 V/380 mA), operating temperature: 0 °C to +45 °C |
| SDK System | Windows or Linux, Dual-core 2.3 GHz, 4 GB RAM, Nvidia GPU |
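The 120 mm baseline listed above determines how stereo disparity maps to metric depth under the standard rectified pinhole relation Z = f · B / d. The following is a minimal sketch of that relation, not the ZED SDK's implementation: the focal length of 1400 px is a hypothetical value (the table does not state one), and the function names are illustrative.

```python
BASELINE_MM = 120.0  # baseline from the ZED feature table
FOCAL_PX = 1400.0    # hypothetical focal length in pixels (assumption, not from the table)


def disparity_to_depth_m(disparity_px, focal_px=FOCAL_PX, baseline_mm=BASELINE_MM):
    """Depth in metres for a disparity in pixels, using Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * (baseline_mm / 1000.0) / disparity_px


def depth_to_disparity_px(depth_m, focal_px=FOCAL_PX, baseline_mm=BASELINE_MM):
    """Disparity in pixels for a depth in metres (inverse of the relation above)."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * (baseline_mm / 1000.0) / depth_m


if __name__ == "__main__":
    # Disparities at the two ends of the 1-20 m depth range from the table.
    for depth in (1.0, 20.0):
        print(f"{depth:5.1f} m -> {depth_to_disparity_px(depth):7.2f} px")
```

Because disparity is inversely proportional to depth, a fixed disparity error costs far more depth accuracy at 20 m than at 1 m, which is one reason a stereo camera specifies a bounded depth range.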
Fig. 2 Results of the YOLOv3 architecture on random samples of the proposed dataset.
Fig. 3 Estimated segmentation masks on random samples of the proposed dataset. Note that each color represents a different class.
Train, Validation and Test splits, with the sequences each split comprises and its total number of frames.
| Splits | Sequences | Frames |
|---|---|---|
| Train | Alumns Help Desk, Lecture Rooms I, Lecture Rooms II, Library, Biotechnology, Sciences II, Sciences III, Sciences IV, Sciences V, Social Sciences, Club I, Economics, Multipurpose II-III, Nursery, EPS1, EPS4, German Bernacer, University, Rectorship, University 12, University 13, Optics | 110835 |
| Validation | Shopping Center, Sciences I, Club II, Law, EPS2, EPS3, Philosophy I, Philosophy II-III, Garden | 48688 |
| Test | Multipurpose I, Control Tower | 5842 |
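As a quick check on the split sizes, the frame counts in the table can be summed and turned into proportions. This is only arithmetic on the values listed above; the names `SPLIT_FRAMES` and `split_shares` are illustrative.

```python
# Frame counts copied from the splits table.
SPLIT_FRAMES = {"Train": 110835, "Validation": 48688, "Test": 5842}


def split_shares(frames):
    """Return the total frame count and each split's share in percent."""
    total = sum(frames.values())
    shares = {name: 100.0 * count / total for name, count in frames.items()}
    return total, shares


if __name__ == "__main__":
    total, shares = split_shares(SPLIT_FRAMES)
    for name, pct in shares.items():
        print(f"{name}: {SPLIT_FRAMES[name]} frames ({pct:.1f}%)")
    print(f"Total: {total} frames")
```

The three splits add up to 165365 frames, of which roughly 67.0% are used for training, 29.4% for validation and 3.5% for testing.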
Fig. 4 Count of objects of interest in each sequence of the proposed dataset. The values are absolute numbers of objects.
Fig. 5 Percentage of pixels that belong to each class in each sequence of the proposed dataset.
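A per-class pixel percentage such as the one plotted in Fig. 5 can be computed from a segmentation mask by counting labels. The sketch below assumes a mask stored as a nested list of integer class IDs; the function name and the toy mask are hypothetical, and this is not necessarily how the authors produced the figure.

```python
from collections import Counter


def class_pixel_percentages(mask):
    """Map each class ID in a 2D label mask to its share of pixels in percent."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mask contains no pixels")
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Toy 2 x 3 mask with three hypothetical class IDs.
    toy_mask = [[0, 0, 1],
                [0, 2, 1]]
    for label, pct in sorted(class_pixel_percentages(toy_mask).items()):
        print(f"class {label}: {pct:.1f}%")
```

For a whole sequence, the per-frame counts would be accumulated before dividing, so that every pixel in the sequence carries equal weight.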
Fig. 6 Results of the Semi-Global Matching algorithm and the GC-Net algorithm using the synthetic UnrealROX dataset (depth images are in millimeters).
Weather conditions in the different sequences of the dataset.
| Weather Condition | Sequences |
|---|---|
| Sun | Alumns Help Desk, Lecture Room I, Lecture Room II, Library, Biotechnology, Sciences II, Sciences III, Multipurpose I, Multipurpose II-III, Nursery, EPS1, EPS2, German Bernacer, Rectorship, University I, Control Tower, Garden, Philosophy I, Philosophy II-III |
| Partially cloudy | Shopping Center, Sciences I, Club II, Law, EPS3, EPS4, Optics, Sciences IV, Social Sciences, Club I, Economics |
| Cloudy | Sciences V, University I, University Institute |
Recording times for the different sequences of the dataset.
| Recording Schedule | Sequences |
|---|---|
| Morning (08.00 h–14.00 h) | Alumns Help Desk, Lecture Room I, Library, Biotechnology, Sciences I, Sciences II, Sciences III, Sciences IV, Law, Economics, Multipurpose I, Multipurpose II-III, EPS2, EPS3, EPS4, German Bernacer, University I, Optics, Control Tower, Philosophy I, Philosophy II-III, University II |
| Afternoon (14.00 h–20.00 h) | Lecture Room II, Shopping Center, Sciences V, Club I, Club II, Nursery, EPS1, German Bernacer, Garden, Social Sciences, Rectorship |
Fig. 7 Cumulative plot of the depth values contained in the scene “Alumns Help Desk”.
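A cumulative plot like the one in Fig. 7 counts, for each depth threshold, how many depth samples fall at or below it. The sketch below shows one way to compute those counts. It assumes depths in metres and treats zero, negative, NaN and infinite values as invalid pixels; both are assumptions, since this record does not describe how the figure was generated.

```python
import math
from bisect import bisect_right


def cumulative_depth_counts(depths_m, edges_m):
    """For each edge, count the valid depth samples that are <= that edge."""
    # Drop invalid measurements (assumed encoding: non-positive, NaN or infinite).
    valid = sorted(d for d in depths_m if math.isfinite(d) and d > 0)
    return [bisect_right(valid, edge) for edge in edges_m]


if __name__ == "__main__":
    # Toy samples, not data from the dataset.
    samples = [0.0, 1.5, 2.0, 7.5, float("nan"), 19.0, float("inf")]
    edges = [1.0, 2.0, 5.0, 10.0, 20.0]
    for edge, count in zip(edges, cumulative_depth_counts(samples, edges)):
        print(f"<= {edge:4.1f} m: {count}")
```

Sorting once and using binary search keeps the cost at O(n log n) for the sort plus O(log n) per threshold, which matters when a sequence contains millions of depth pixels.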
Baseline for the dataset.
| Test Set | Architecture and Model | MRelE | RMSE |
|---|---|---|---|
| UASOL | Ref.[ (publicly available model) | 0.753 | 8.119 |
| UASOL | Ref.[ (trained on the UASOL dataset by us) | 0.326 | 4.4134 |
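The two columns of the baseline table are standard depth-estimation metrics. The sketch below uses their usual definitions: mean relative error as the average of |pred − gt| / gt, and RMSE as the square root of the mean squared difference. The exact formulation used by the authors is not given in this record, so the definitions, the skipping of pixels without ground truth (gt ≤ 0) and the function names are assumptions.

```python
import math


def _valid_pairs(pred, gt):
    """Keep only pixels that have a positive ground-truth depth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground truth")
    return pairs


def mean_relative_error(pred, gt):
    """Mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def rmse(pred, gt):
    """Square root of the mean squared error over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


if __name__ == "__main__":
    # Toy flattened depth maps; the last pixel has no ground truth.
    predicted = [2.0, 4.0, 9.0, 3.0]
    ground_truth = [1.0, 5.0, 10.0, 0.0]
    print(f"MRelE: {mean_relative_error(predicted, ground_truth):.3f}")
    print(f"RMSE:  {rmse(predicted, ground_truth):.3f}")
```

The relative error weights a 1 m mistake on a near object much more heavily than the same mistake on a distant one, whereas RMSE treats both equally and is dominated by large absolute errors, so the two numbers are usually reported together.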
| Design Type(s) | modeling and simulation objective • image creation and editing objective • database creation objective |
| Measurement Type(s) | image |
| Technology Type(s) | digital camera |
| Factor Type(s) | geographic location |
| Sample Characteristic(s) | Alicante • anthropogenic habitat |