Claudio Ferrari, Leonardo Casini, Stefano Berretti, Alberto Del Bimbo.
Abstract
Estimating the 3D shape of objects from monocular images is a well-established and challenging task in computer vision. Further challenges arise when highly deformable objects, such as human faces or bodies, are considered. In this work, we address the problem of estimating the 3D shape of a human body from a single image. In particular, we provide a solution for estimating body shape when the subject is wearing clothes. This is a highly challenging scenario, as loose clothes might hide the underlying body shape to a large extent. To this aim, we make use of a parametric 3D body model, SMPL, whose parameters describe the pose and shape of the body. Our main intuition is that the shape parameters associated with an individual should not change whether the subject is wearing clothes or not. To improve shape estimation under clothing, we train a deep convolutional network to regress the shape parameters from a single image of a person. To increase robustness to clothing, we build our training dataset by associating the shape parameters of a "minimally clothed" person with other samples of the same person wearing looser clothes. Experimental validation shows that our approach estimates body shape parameters more accurately than state-of-the-art approaches, even in the case of loose clothes.
Keywords: 3D body reconstruction; 3D modeling; learning 3D body shape parameters
Year: 2021 PMID: 34940724 PMCID: PMC8705765 DOI: 10.3390/jimaging7120257
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Figure 1. Example of the first 3 principal components of shape variations for the SMPL model.
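The role of the principal components in Figure 1 can be illustrated with a toy linear shape model, in which each shape coefficient scales one displacement field added to a template mesh. This is only a sketch and not the SMPL implementation: the function name `apply_shape`, the array `shape_dirs`, and the random toy data are hypothetical stand-ins, and the pose-dependent terms and skinning of the real model are omitted.

```python
import numpy as np

def apply_shape(template, shape_dirs, betas):
    """Offset a template mesh along linear shape directions.

    template:   (V, 3) rest-pose vertices
    shape_dirs: (V, 3, K) one displacement field per shape component
    betas:      (K,) shape coefficients
    """
    template = np.asarray(template, dtype=float)
    shape_dirs = np.asarray(shape_dirs, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # Weighted sum of the K displacement fields, added to the template.
    return template + shape_dirs @ betas

# Toy data: 4 vertices, 3 shape components (made-up values, not SMPL data).
rng = np.random.default_rng(0)
template = rng.normal(size=(4, 3))
shape_dirs = rng.normal(size=(4, 3, 3))

# All-zero coefficients reproduce the template unchanged.
neutral = apply_shape(template, shape_dirs, np.zeros(3))
# Changing only the first coefficient moves vertices along the first component.
varied = apply_shape(template, shape_dirs, np.array([2.0, 0.0, 0.0]))
```

Varying one coefficient at a time while keeping the others at zero is how a visualization of individual components, such as the one in Figure 1, is typically produced.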
Figure 2. Phase 1: the proposed model takes as input an image of a person and regresses the SMPL shape parameters. Phase 2: the trained model is used to extract parameters from the minimally clothed images. The parameters are assigned to each image of the same individual and the network is fine-tuned.
Figure 3. Samples of the same subject in the People Snapshot dataset. The “minimally clothed” version (with tight clothes) is shown on the left, and is used to estimate the shape parameters. These parameters are then assigned to the other sequences with looser clothes (right).
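The label assignment described in the caption of Figure 3 can be sketched as a small data-preparation step. The code below is a hedged illustration, not the authors' pipeline: the sample layout, the outfit tag `"minimal"`, the function name `propagate_shape_labels`, and the choice to average several reference estimates per subject are all assumptions, and `estimate_shape` stands in for the trained Phase 1 network.

```python
from collections import defaultdict

def propagate_shape_labels(samples, estimate_shape, reference_outfit="minimal"):
    """Give every image of a subject the shape estimated from that
    subject's reference (minimally clothed) images.

    samples:        list of dicts with keys "subject", "outfit", "image"
    estimate_shape: callable mapping an image to a list of shape parameters
    """
    # Collect shape estimates from reference-outfit images, per subject.
    per_subject = defaultdict(list)
    for s in samples:
        if s["outfit"] == reference_outfit:
            per_subject[s["subject"]].append(estimate_shape(s["image"]))

    # Average the reference estimates component-wise (an assumption of this
    # sketch; the source does not say how multiple estimates are combined).
    targets = {}
    for subject, estimates in per_subject.items():
        n = len(estimates)
        targets[subject] = [sum(col) / n for col in zip(*estimates)]

    # Attach the subject-level target to every sample that has one;
    # subjects without a reference image are dropped.
    return [
        {**s, "betas": targets[s["subject"]]}
        for s in samples
        if s["subject"] in targets
    ]

# Toy usage: the "images" are short lists and the estimator is the identity.
samples = [
    {"subject": "female-1", "outfit": "minimal", "image": [1.0, 2.0]},
    {"subject": "female-1", "outfit": "minimal", "image": [3.0, 4.0]},
    {"subject": "female-1", "outfit": "casual", "image": [9.0, 9.0]},
    {"subject": "male-9", "outfit": "casual", "image": [5.0, 5.0]},
]
labelled = propagate_shape_labels(samples, estimate_shape=lambda img: img)
```

In this toy run the loosely clothed image of `female-1` receives the same target as her minimally clothed images, which is the supervision signal the fine-tuning phase relies on.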
Quantitative results on the SURREAL dataset in terms of MPVE (mm).
| Type | Method | Error (mm) |
|---|---|---|
| Model-free | SMPLify++ [ | |
| | Tung et al. [ | |
| | BodyNet [ | |
| | DecoMR [ | |
| Model-based | Neural Body Fitting [ | |
| | SMPLR [ | |
| Model-based | Our approach | |
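The error metric used in these tables can be sketched as follows, assuming MPVE denotes the mean per-vertex error, that is, the average Euclidean distance between corresponding vertices of the predicted and ground-truth meshes. The function name `mpve_mm` and the assumption that vertices are stored in metres are illustrative choices, not details taken from the source.

```python
import numpy as np

def mpve_mm(pred_vertices, gt_vertices, to_mm=1000.0):
    """Mean Euclidean distance between corresponding vertices.

    Both inputs have shape (V, 3). They are assumed to be in metres,
    so the result is scaled by `to_mm` to report millimetres.
    """
    pred = np.asarray(pred_vertices, dtype=float)
    gt = np.asarray(gt_vertices, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("meshes must have the same number of vertices")
    # Per-vertex distance, then the mean over all vertices.
    return float(np.linalg.norm(pred - gt, axis=-1).mean() * to_mm)

# Toy meshes with 3 vertices: per-vertex distances are 5 mm, 0 mm and 10 mm.
gt = np.zeros((3, 3))
pred = np.array([[0.003, 0.004, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.010]])
error = mpve_mm(pred, gt)
```

The metric requires a one-to-one vertex correspondence, which holds when both meshes share the same template topology.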
Quantitative results on some test samples of the People Snapshot dataset in terms of MPVE (mm). The “All garments” rows report reconstruction errors averaged over all garments, compared with HMR. Our method after fine-tuning obtains accurate reconstructions. The “Tight-Loose” rows highlight the improvement obtained on loose clothes (casual) after fine-tuning the model with parameters associated with the minimally clothed version (sport).
| Method | Subject | Error (mm) |
|---|---|---|
| All garments | | |
| Ours (Phase 1) | Female 1 | |
| Ours (Phase 2) | Female 1 | |
| HMR [ | Female 1 | |
| Ours (Phase 1) | Female 3 | |
| Ours (Phase 2) | Female 3 | |
| HMR [ | Female 3 | |
| Ours (Phase 1) | Female 6 | |
| Ours (Phase 2) | Female 6 | |
| HMR [ | Female 6 | |
| Ours (Phase 1) | Male 9 | |
| Ours (Phase 2) | Male 9 | |
| HMR [ | Male 9 | |
| Tight-Loose | | |
| Ours (Phase 1) | Female 1 (Sport) | |
| Ours (Phase 2) | Female 1 (Sport) | |
| Ours (Phase 1) | Female 1 (Casual) | |
| Ours (Phase 2) | Female 1 (Casual) | |
Figure 4. Qualitative examples of body reconstruction from an image of the People Snapshot dataset (left) and an “in the wild” image (right). The two reconstructions are obtained using the HMR method (left) and our approach after Phase 2 (right).
Figure 5. Qualitative reconstruction examples from “in the wild” images collected from the Internet.