Benoit Decoux, Redouane Khemmar, Nicolas Ragot, Arthur Venon, Marcos Grassi-Pampuch, Antoine Mauri, Louis Lecrosnier, Vishnu Pradeep.
Abstract
In smart mobility, the semantic segmentation of images is an important task for a good understanding of the environment. In recent years, many studies have been conducted on this subject in the field of Autonomous Vehicles on roads. Some image datasets are available for training semantic segmentation models, leading to very good performance. However, for other types of autonomous mobile systems, such as Electric Wheelchairs (EW) on sidewalks, no specific dataset exists. The contribution presented in this article is twofold: (1) a new dataset of short sequences of exterior images of street scenes taken from viewpoints located on sidewalks, in a 3D virtual environment (CARLA); (2) a convolutional neural network (CNN) adapted for temporal processing and including additional techniques to improve its accuracy. Our dataset includes a smaller subset made of image pairs taken from the same places in the maps of the virtual environment, but from different viewpoints: one located on the road and the other located on the sidewalk. This additional set is aimed at showing the importance of the viewpoint in the result of semantic segmentation.
Keywords: convolutional neural network; dataset; deep learning; semantic segmentation; smart mobility
Year: 2022 PMID: 36005459 PMCID: PMC9410455 DOI: 10.3390/jimaging8080216
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Figure 1. Examples of image sequences from the dataset. Each sequence consists of 4 images separated by a small time gap. The right-hand column shows the ground truth corresponding to the last RGB image of each sequence.
Figure 2. Four example image pairs from the additional dataset, taken from a viewpoint located on the road (left of each pair) and from a viewpoint located on the sidewalk (right of each pair). For each pair, the true classes of the semantic segmentation are shown.
Our 13 CARLA classes with their weights.
| Class | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Weights | 8.7140 | 31.2160 | 30.7612 | 27.5623 | 27.9024 | 37.7695 | 8.1916 |

| Class | 8 | 9 | 10 | 11 | 12 | 13 | - |
|---|---|---|---|---|---|---|---|
| Weights | 4.9276 | 6.8403 | 33.7536 | 17.6963 | 46.7649 | 3.3284 | - |
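Class weights like those in the table above are typically used to rebalance the cross-entropy loss so that rare classes (large weights) are not drowned out by frequent ones. A minimal NumPy sketch of a weighted per-pixel cross-entropy is shown below; the ordering of the weights by class index is an assumption, since the record lists only the values:

```python
import numpy as np

# The 13 weights from the table above; their mapping to class indices
# is an assumption made for illustration.
WEIGHTS = np.array([8.7140, 31.2160, 30.7612, 27.5623, 27.9024, 37.7695, 8.1916,
                    4.9276, 6.8403, 33.7536, 17.6963, 46.7649, 3.3284])

def weighted_cross_entropy(probs, labels, weights=WEIGHTS):
    """Per-pixel weighted cross-entropy for semantic segmentation.

    probs:  (H, W, C) softmax probabilities predicted by the model
    labels: (H, W) integer ground-truth class map
    Each pixel's loss is scaled by the weight of its true class,
    so under-represented classes contribute more to the total loss.
    """
    h, w = labels.shape
    # Probability the model assigned to the true class of each pixel.
    p_true = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    pixel_loss = -weights[labels] * np.log(np.clip(p_true, 1e-8, 1.0))
    return float(pixel_loss.mean())
```

In a deep-learning framework the same effect is obtained by passing the weight vector to the built-in cross-entropy loss rather than re-implementing it.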
Figure 3. Two examples of model inference. From left to right: raw image, model prediction, and ground truth.
mIoU results of the cross-validation test between the two parts of the additional dataset: ADRoad (images taken from a viewpoint located on the road) and ADSidewalk (the same scenes viewed from a viewpoint located on the sidewalk).
| | Test on ADRoad | Test on ADSidewalk |
|---|---|---|
| Trained on ADRoad | 61.51% | 30.58% |
| Trained on ADSidewalk | 51.32% | 62.23% |
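The mIoU scores above are the mean over classes of the Intersection-over-Union between predicted and ground-truth class maps. A minimal NumPy sketch follows; the exact evaluation protocol of the paper (e.g. accumulating over the whole test set versus averaging per image) is an assumption:

```python
import numpy as np

def mean_iou(pred, target, num_classes=13):
    """Mean Intersection-over-Union between two integer class maps.

    pred, target: arrays of the same shape holding class indices.
    Classes absent from both maps are skipped so they do not
    artificially lower the mean.
    """
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

A gap such as the one between 61.51% (same-viewpoint test) and 30.58% (cross-viewpoint test) is exactly what this metric would surface when the training and test viewpoints differ.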