Haonan Duan, Peng Wang, Yayu Huang, Guangyun Xu, Wei Wei, Xiaofei Shen.
Abstract
Dexterous manipulation, especially dexterous grasping, is a primitive and crucial ability that allows robots to perform human-like behaviors. Deploying this ability on robots enables them to assist or substitute for humans in accomplishing more complex tasks in daily life and industrial production. This paper gives a comprehensive review of point cloud and deep learning based methods for robotic dexterous grasping from three perspectives. The proposed generation-evaluation framework, a new categorization scheme for the mainstream methods, is the core of the classification. Two further classifications, based on learning modes and applications, are then described briefly. This review aims to provide a guideline for researchers and developers of robotic dexterous grasping.
Keywords: deep learning; dexterous grasping; point cloud; review; robotics
Year: 2021 PMID: 34177509 PMCID: PMC8221534 DOI: 10.3389/fnbot.2021.658280
Source DB: PubMed Journal: Front Neurorobot ISSN: 1662-5218 Impact factor: 2.650
Figure 1. Recent dexterous grasp pipeline.
The related surveys and corresponding topics.
| Du et al., | Vision methods facilitate grasp estimation | Artificial intelligence review |
| Ruiz-del-Solar et al., | Deep learning methods for robot vision | arXiv |
| Luo et al., | Robotic tactile perception | Mechatronics |
| Wang C. et al., | Feature sensing and robotic grasping | Sensors |
| Caldera et al., | Deep learning methods in grasp detection | Multimodal technologies and interaction |
| Kroemer et al., | Learning-based methods in robot manipulation | arXiv |
| Kleeberger et al., | Learning-based robotic grasping | Current robotics reports |
| Li and Qiao, | Robotic grasping and assembly tasks | IEEE Transactions on mechatronics |
| Mohammed et al., | Deep reinforcement learning-based grasping | IEEE Access |
| Zhao W. et al., | Sim-to-real problems of reinforcement learning | arXiv |
| Billard and Kragic, | Trends and challenges in robot manipulation | Science |
Figure 2. Robotic dexterous grasping methods based on point cloud and deep learning.
Figure 3. The general pipeline of grasp candidate generation.
Figure 4. Complete pipelines of the three categories of object-aware sampling. Object detection and segmentation is the most basic method. Inputs are commonly RGB images, which networks detect or segment to extract the object point clouds. The extracted point cloud can either be used to sample grasp candidates directly or be fed into object affordance or shape complementation methods. Object affordance methods take the extracted point cloud as input and obtain the object's affordance to reduce the sampling search space. In contrast, object shape complementation aims to recover the entire object point cloud to improve the confidence of grasp candidate generation (the hammer point cloud is from the YCB dataset).
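The extraction step described in this caption, going from a segmented image to an object point cloud, can be illustrated with a minimal sketch. It assumes a depth image aligned with the RGB image, a binary object mask produced by some detection or segmentation network, and a pinhole camera model. The function name `masked_depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration, not taken from the reviewed methods.

```python
import numpy as np

def masked_depth_to_point_cloud(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into 3D camera coordinates.

    Assumes a pinhole camera and a depth image aligned with the mask.
    Pixels with zero depth are treated as invalid and skipped.
    """
    valid = mask.astype(bool) & (depth > 0)
    v, u = np.nonzero(valid)          # pixel rows (v) and columns (u)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# Toy input: a 4x4 depth image at 2 m with one invalid pixel,
# and a mask covering the top-left 2x2 block.
depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0
mask = np.zeros((4, 4), dtype=bool)
mask[0:2, 0:2] = True

cloud = masked_depth_to_point_cloud(depth, mask, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
print(cloud.shape)
```

The resulting array could then be passed to a grasp sampler directly, or to an affordance or shape complementation stage, as the caption outlines.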
Summary of geometry-based object-aware grasp candidate generation.
| Object detection and segmentation | Less is More (Lopes et al., | RANSAC | 90 | – | Single object | 1 | Irregular | No | R |
| | (Schnaubelt et al., | Maskfusion | – | – | Cluttered | 5 | Irregular | No | R |
| | RED (Chen et al., | Mask-RCNN + PointNet | 84 (S) 82 (R) | Parallel-jaw gripper | Cluttered | 7 | Irregular | Yes | S/R |
| | (Bui et al., | YOLOv3 | – | Parallel-jaw gripper | Single object | 1 | Regular | No | S/R |
| | (Deng et al., | PoseCNN | 86.7 | Parallel-jaw gripper | Cluttered | – | Irregular | Yes | R |
| | (Lin and Cong, | PointNet | 90 | Parallel-jaw gripper | Cluttered | 5 | Irregular | No | R |
| | (Yu S. et al., | RANSAC + VGG | – | Parallel-jaw gripper | Single object | 1 | Irregular | No | R |
| | (Lin et al., | PPR-net | 78 | Parallel-jaw gripper | Cluttered | 30 | Regular | No | R |
| | (Sun and Lin, | Mask R-CNN | 71.1 | Parallel-jaw gripper | Single object | – | Regular | No | R |
| Object affordance | (Qian et al., | ResNet101 + FPN | 95 | Parallel-jaw gripper | Single object | 1 | Regular | No | R |
| | TOG-Net (Fang K. et al., | SOM | 80 | Parallel-jaw gripper | Single object | 1 | Irregular | No | R |
| | kPAM (Manuelli et al., | Integral human pose regression | – | Parallel-jaw gripper | Single object | 1 | Regular | No | R |
| Object shape complement | (Varley et al., | CNN | 93.33 | Three fingers | Cluttered | – | Irregular | No | R |
| | (Lundell et al., | CNN | 59 | Parallel-jaw gripper | Cluttered | 10 | Irregular | No | R |
| | (Yan et al., | CNN | 61 | Parallel-jaw gripper | Cluttered | – | Irregular | Yes | R |
| | (Torii and Hashimoto, | DNN | 85.6 | Parallel-jaw gripper | Cluttered | – | Regular | No | S |
| | (Liu and Cao, | CNN | 94.06 | Parallel-jaw gripper | Cluttered | – | Irregular | Yes | R |
Figure 5. Learning-based sampling (the spatula point cloud is from the YCB dataset).
Geometry-based and learning-based recommendations under different conditions.
| Single object environment | Easy to sample grasp pose | Geometry-based |
| Collision-regardless | More constraints required to detect collision | |
| Hard to generate dataset | No training process based on large-scale dataset | |
| Cluttered environment | Sample grasp pose based on advanced features | Learning-based |
| Collision-concern | No need to build hand-crafted collision detection constraints | |
| Easy to generate dataset | Suitable for training a model | |
Recommendations for the branches of object-aware sampling.
| Preliminary attempts | Highly adaptable | Object detection and segmentation |
| Regular object | Easy to detect and segment | |
| Harmless irregular object | Easy to detect and segment, no need to consider unsafe grasp pose | |
| Regular object | Able to detect more reasonable grasp | Object affordance |
| Irregular object | Able to detect where to grasp | |
| Harmful object | Able to detect a safe grasp | |
| Poor lighting condition | Restore object shape | Shape complement |
| Sparse point cloud | Restore object shape | |
| Irregular object | Filter unreasonable grasp pose with symmetric shape assumption | |
Figure 6. Learning-based candidate evaluation (the scissors point cloud is from the YCB dataset).
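The generation-evaluation idea behind the review's classification, where candidates are first generated and then scored, can be sketched as follows. This is only an illustrative toy under stated assumptions: candidates are reduced to grasp centers sampled from the point cloud, and `toy_score` is a hypothetical geometric stand-in for the learned evaluation network that the surveyed methods would use. All function names here are invented for the example.

```python
import numpy as np

def generate_candidates(cloud, num_candidates, rng):
    """Generation stage: sample cloud points as candidate grasp centers."""
    idx = rng.choice(len(cloud), size=num_candidates, replace=False)
    return cloud[idx]

def toy_score(cloud, center, radius=0.05):
    """Evaluation stand-in (assumption): fraction of points within `radius`.

    A learned evaluator would replace this heuristic.
    """
    dist = np.linalg.norm(cloud - center, axis=1)
    return float(np.mean(dist < radius))

def select_grasp(cloud, num_candidates=16, score_fn=toy_score, seed=0):
    """Generate candidates, score each one, and return the best with its score."""
    rng = np.random.default_rng(seed)
    candidates = generate_candidates(cloud, num_candidates, rng)
    scores = np.array([score_fn(cloud, c) for c in candidates])
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])

# Toy object: 200 points scattered around the origin (units assumed to be meters).
cloud = np.random.default_rng(1).normal(scale=0.03, size=(200, 3))
best_center, best_score = select_grasp(cloud)
print(best_center.shape, best_score)
```

Passing the scorer in as `score_fn` keeps the two stages separable, so a trained model could be swapped in without touching the generation step.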
Figure 7. The deep reinforcement learning framework for robotic grasp learning based on point clouds. To reduce the cost of trial and error, current reinforcement learning based robot grasping first trains the model in a simulation environment and then transfers it to the real robot (sim-to-real).
Figure 8. End effectors, systems, and applications for robotic dexterous grasping. End effectors are divided into simple and advanced end effectors. The former group contains suction cups and parallel-jaw grippers; the latter comprises multi-finger hands. Grasping systems are first designed, deployed, and matured on simple end effectors, then transferred to and improved on advanced end effectors. The developed systems are applied in different scenarios, either life-oriented or industry-oriented.