André Broekman, Petrus Johannes Gräbe.
Abstract
A Perfectly Accurate, Synthetic dataset for Multi-View Stereopsis (PASMVS) is presented, consisting of 400 scenes and 18,000 model renderings together with ground truth depth maps, camera intrinsic and extrinsic parameters, and binary segmentation masks. Every scene is rendered from 45 camera views arranged in a circular pattern, using Blender's path-tracing rendering engine. Each scene is composed of a unique combination of two camera focal lengths; four 3D models of varying geometrical complexity; five high-definition, high-dynamic-range (HDR) environmental textures that replicate photorealistic lighting conditions; and ten materials. The material properties are primarily specular, with a selection of more diffuse materials included for reference. This combination of highly specular and diffuse material properties increases the reconstruction ambiguity and complexity for MVS reconstruction algorithms and pipelines, including recent state-of-the-art architectures based on neural networks. PASMVS serves as an addition to the wide spectrum of image datasets available for computer vision research, providing the precision required for novel research applications.
Keywords: 3D reconstruction; Blender; Ground truth depth map; Multi-view stereopsis; Synthetic data
Year: 2020 PMID: 32923541 PMCID: PMC7474405 DOI: 10.1016/j.dib.2020.106219
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. PASMVS samples illustrating the selection of models, variation in illumination, material properties and camera focal lengths. (a) bunny, bricks, 35 mm, (b) bunny, brushedmetal, 50 mm, (c) teapot, concrete, 35 mm, (d) teapot, ceramic, 50 mm, (e) armadillo, copper, 25 mm, (f) armadillo, grungemetal, 50 mm, (g) dragon, piano, 25 mm, (h) dragon, marble, 50 mm.
Fig. 2. Illustration of the high definition environmental lighting textures used for photorealistic scene illumination. (10) greenwich_park_8, (11) industrial_sunset_8k, (12) kiara_3_morning_8k, (13) kiara_4_mid-morning_8k, (25) sunny_vondelpark_8k.
Fig. 3. Illustration of the (a) image rendering, (b) ground truth depth map, (c) subject segmentation mask, (d) ground plane segmentation mask.
Fig. 4. Positioning and orientation of the 45 cameras used for all scenes.
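The circular 45-view camera arrangement described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' Blender script: the radius, camera height and look-at convention (camera −Z axis pointing at the model's centre) are assumptions for demonstration.

```python
import numpy as np

def circular_camera_positions(n_views=45, radius=2.0, height=0.5):
    """Place n_views cameras evenly on a circle around the scene origin."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False)
    return np.stack([radius * np.cos(angles),
                     radius * np.sin(angles),
                     np.full(n_views, height)], axis=1)

def look_at_rotation(position, target=np.zeros(3), up=np.array([0.0, 0.0, 1.0])):
    """World-to-camera rotation with the camera's -Z axis aimed at target."""
    forward = (target - position).astype(float)
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    # Rows are the camera's x (right), y (up) and z (backward) axes in world space.
    return np.stack([right, true_up, -forward])
```

Each position paired with its `look_at_rotation` gives the extrinsic pose of one of the 45 views; the same ring is reused for every scene.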
| Subject | Computer Vision and Pattern Recognition |
| Specific subject area | Multi-view stereopsis and 3D reconstruction from images |
| Type of data | Image |
| How data were acquired | A photorealistic virtual environment was created in Blender and rendered with its path-tracing rendering engine (Cycles). Different combinations of popular geometry models, surface materials, environmental textures and camera parameters were used to render a large variety of data samples. The binary segmentation maps were rendered alongside the colour images by assigning different material identification numbers to the geometry models and environmental textures. The ground truth depth map was obtained during the same rendering pass by exporting the camera's Z-buffer (the distance between the camera and the intersecting geometry for every pixel of the imaging sensor). The intrinsic and extrinsic camera parameters were exported as a single comma-separated value (CSV) file for every scene. |
| Data format | Raw |
| Parameters for data collection | Using a constant, circular path for the camera around the centre point of the model, all possible combinations of model geometries, environmental lighting textures, model material properties and the camera focal lengths were rendered. |
| Description of data collection | Using Blender, path-traced images, ground truth depth maps and binary segmentation maps were rendered for each scene. In total, 400 scenes were rendered from the combinations of ten primarily specular materials, five environmental textures, four models and two focal lengths. With 45 views per scene, this yields a total of 18,000 synthetic samples. Intrinsic and extrinsic camera parameters were exported for each scene for generating camera matrices. A post-processing step converts Blender's rendered distance maps into true depth maps. |
| Data source location | Institution: Department of Civil Engineering, University of Pretoria |
| Data accessibility | Repository name: Mendeley Data |
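The post-processing step noted in the table (correcting distance maps to depth maps) reflects that Blender's Z-pass stores the Euclidean distance from the camera centre to the surface, whereas most MVS pipelines expect planar depth along the camera's Z axis. A minimal sketch of that conversion, assuming a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy) not taken from the paper:

```python
import numpy as np

def distance_to_depth(dist, fx, fy, cx, cy):
    """Convert a per-pixel Euclidean distance map into planar (Z-axis) depth.

    dist : (H, W) array of camera-to-surface distances.
    fx, fy : focal lengths in pixels; cx, cy : principal point.
    """
    h, w = dist.shape
    u = (np.arange(w) - cx) / fx   # normalised horizontal ray components
    v = (np.arange(h) - cy) / fy   # normalised vertical ray components
    # Norm of each pixel's viewing ray [u, v, 1]; dividing by it projects
    # the distance onto the camera's optical axis.
    ray_norm = np.sqrt(u[None, :] ** 2 + v[:, None] ** 2 + 1.0)
    return dist / ray_norm
```

At the principal point the ray norm is 1, so distance and depth coincide; the correction grows towards the image corners, where the viewing ray deviates most from the optical axis.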