J Law, P Shaw, K Earland, M Sheldon, M Lee.
Abstract
A major challenge in robotics is the ability to learn, from novel experiences, new behavior that is useful for achieving new goals and skills. Autonomous systems must be able to learn solely through the environment, thus ruling out a priori task knowledge, tuning, extensive training, or other forms of pre-programming. Learning must also be cumulative and incremental, as complex skills are built on top of primitive skills. Additionally, it must be driven by intrinsic motivation because formative experience is gained through autonomous activity, even in the absence of extrinsic goals or tasks. This paper presents an approach to these issues through robotic implementations inspired by the learning behavior of human infants. We describe an approach to developmental learning and present results from a demonstration of longitudinal development on an iCub humanoid robot. The results cover the rapid emergence of staged behavior, the role of constraints in development, the effect of bootstrapping between stages, and the use of a schema memory of experiential fragments in learning new skills. The context is a longitudinal experiment in which the robot advanced from uncontrolled motor babbling to skilled hand/eye integrated reaching and basic manipulation of objects. This approach offers promise for further fast and effective sensory-motor learning techniques for robotic learning.
Keywords: constraints; development; intrinsic motivation; robotics; staged learning
Year: 2014 PMID: 24478693 PMCID: PMC3902213 DOI: 10.3389/fnbot.2014.00001
Source DB: PubMed Journal: Front Neurorobot ISSN: 1662-5218 Impact factor: 2.650
Figure 1. A conceptual diagram of the increase in motor control competence in infancy, highlighting behaviors identified in the infant literature. Darker shading indicates greater competency. This figure is abstracted from the detailed timeline compiled in Law et al. (2011).
Figure 2. A schematic representation of the iCub highlighting the sensorimotor spaces explored in the experiment, and the relationships between them.
Figure 3. Algorithm for novelty-driven action selection (derived from experiments in Law et al.).
Infant development and learning targets.
| Age (months) | Infant behavior | Robot learning target |
| --- | --- | --- |
| Pre-natal | Grasp reflex (Butterworth and Harris) | Grasp on tactile feedback |
| 1 | Sufficient muscle tone to support brief head movements (Fiorentino) | Constraint on head movement |
| 1 | Eyes and head move to targets (Sheridan) | Learning of saccade mappings |
| 1 | Saccades are few in number (Maurer and Maurer) | |
| 2 | More saccades (Maurer and Maurer) | Refinement of saccade mappings |
| 2 | Head only contributes to larger gaze shifts due to lack of muscle tone (Goodkin) | Release of constraint on head motion, and beginnings of eye-head mapping |
| 2 | Involuntary grasp release (Fiorentino) | Release grasp when hand attention is low |
| 3 | Head contributes to small gaze shifts 25% of the time, and always to large gaze shifts (Goodkin) | Refinement of eye-head gaze control |
| 3 | Reach and miss (Shirley) | Reaching triggered by visual stimulation |
| 3 | Hand regard and hands to mouth (Fiorentino) | Initial learning of eye-hand mappings with return to “home” position |
| 3 | Clasps and unclasps hands (Sheridan) | Learning of raking grasp |
| 4 | Good eye and head control (Fiorentino) | Gaze mapping completed |
| 4 | Beginning thumb opposition (Bayley) | Enable independent thumb movement |
| 5 | Rotation in upper trunk (Fiorentino) | Begin torso mapping |
| 5 | Palmar grasp (Fiorentino) | Learning of palmar grasp |
| 6 | Successful reach and grasp (Sheridan) | Refinement of visually-guided reaching |
| 7 | Thumb opposition complete (Bayley) | Refined thumb use |
| 8 | Pincer grasp, bilateral, unilateral, transfer (Fiorentino) | Learning of pincer grasp |
| 8 | Crude voluntary release of objects (Fiorentino) | Voluntary release |
| 9 | Leans forward without losing balance (Sheridan) | Torso mapping complete |
Reach development and learning targets.
| Age (months) | Infant behavior | Robot learning target |
| --- | --- | --- |
| Pre-natal | Arm babbling in the womb (De Vries et al.) | Proprioceptive-motor mapping of general movements |
| 1 | Hand-mouth movements (Rochat) | Learning of home position through tactile feedback |
| 1 | Directed (to the hemifield in which a target appears), but unsuccessful, hand movements (von Hofsten and Rönnqvist) | Initial mapping of general movements to vision |
| 1 | Initial reaching is goal directed, and triggered by a visual stimulus, but visual feedback is not used to correct movements mid-reach (Bremner) | Visual stimuli trigger general reach movements |
| 3 | Infants often move their hand to a pre-reaching position near the head before starting a reach (Berthier et al.) | Reaches conducted from “home” position |
| 3 | Infants engaged in early reaching maintained a constant hand-body distance by locking the elbow, and instead used torso movements to alter the distance to targets (Berthier et al.) | Constraints on elbow movements reduce learning space |
| 3 | Successful reaching appears around 3–4 months after birth (Shirley) | Primitive hand-eye mapping |
| 3 | Gaze still focused on the target and not the hand (Clifton et al.) | Reaches are visually elicited, but without continuous feedback |
| 4 | From 4 months, infants begin to use visual feedback to refine the movement of the hand (White et al.) | Begin to map joint-visual changes and use visual feedback to correct reaches |
| 4 | As infants age their reaching becomes straighter, with the hand following the shortest path (Carvalho et al.) | Refined reaching with smooth and direct movements |
Constraints used to structure behavioral stages on the robot.
| Constraint | Type | Effect | Release condition |
| --- | --- | --- | --- |
| Environment | B | Affects data available for learning at all stages | None. Influenced by robot and experimenter |
| Eye motor | A | Prevents eye motion | Start of experiment |
| Neck motor | A | Prevents head motion | Threshold on eye control |
| Neck learning | B | Neck learning requires accurate eye control | Emerges as eye control develops |
| Shoulder motors | A + B | Prevents arm movement | Threshold on gaze control, exclusive of torso learning |
| Elbow motors | A + B | Limits forearm extension/flexion | Threshold on gaze control, exclusive of torso learning |
| Reflex grasp | A | Causes hand to close on tactile stimuli | Active until reaching threshold attained |
| Controlled grasp | A | Prevents voluntary grasping of objects | Released with shoulder |
| Torso motor | A + B | Prevents motion at waist | Threshold on gaze control, exclusive of arm learning |
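
One way to read the release conditions in the table above is as predicates over the robot's current competence, checked after each learning step. The following Python sketch illustrates that reading only; it is not the authors' implementation. The names `Constraint`, `default_constraints`, and `update_constraints`, the competence keys, and the threshold values are all hypothetical, and the "exclusive of torso learning" condition is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical competence scores in [0, 1]; the source gives no threshold values.
EYE_CONTROL_THRESHOLD = 0.8
GAZE_CONTROL_THRESHOLD = 0.8

Competence = Dict[str, float]


@dataclass
class Constraint:
    name: str
    release_when: Callable[[Competence], bool]  # predicate over current competence
    released: bool = False


def default_constraints() -> List[Constraint]:
    """Three rows of the table, with illustrative (assumed) release predicates."""
    return [
        # "Start of experiment": released unconditionally on the first check.
        Constraint("eye motor", lambda c: True),
        # "Threshold on eye control".
        Constraint("neck motor", lambda c: c.get("eye", 0.0) >= EYE_CONTROL_THRESHOLD),
        # "Threshold on gaze control" (exclusivity with torso learning omitted).
        Constraint("shoulder motors", lambda c: c.get("gaze", 0.0) >= GAZE_CONTROL_THRESHOLD),
    ]


def update_constraints(constraints: List[Constraint], competence: Competence) -> List[str]:
    """Release every constraint whose condition now holds; return the newly released names."""
    newly_released = []
    for constraint in constraints:
        if not constraint.released and constraint.release_when(competence):
            constraint.released = True
            newly_released.append(constraint.name)
    return newly_released


if __name__ == "__main__":
    stage = default_constraints()
    for scores in ({}, {"eye": 0.5}, {"eye": 0.9}, {"eye": 0.9, "gaze": 0.85}):
        print(scores, "->", update_constraints(stage, scores))
```

Keeping each release condition as a separate predicate means a new stage can be described by adding a row, without touching the update loop.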
Constraints used to structure reaching stages in simulation.
| Constraint | Type | Effect | Release condition |
| --- | --- | --- | --- |
| No vision | A | Arm movements learnt through tactile and proprioceptive feedback only | Start of experiment |
| Crude gaze fields (large) | A | Arm movements coarsely correlated with vision | Threshold on maturity of internal structures |
| Fine gaze fields (small) | A | Fine correlation between hand position and vision | Threshold on development of reaching |
| Visual feedback | A | Prevents visual guidance during reaching | Threshold on successful reaching |
Figure 4. System architecture.
Figure 5. Motor dynamics in the horizontal plane during a typical gaze and reach action (see text for details).
Observable experimental behaviors.
| Behavior | Description | Duration (min) | Platform |
| --- | --- | --- | --- |
| Fetal babbling | General arm movements | 10 | Simulator |
| Saccading | Eye movements only, trying to fixate on stimuli | 20 | Robot |
| Gazing | Eyes and head move to fixate on stimuli | 40 | Robot |
| Swiping | Arms make swiping actions in the general direction of visual stimuli | 10 | Simulator |
| Visually elicited reaching | Reaches toward visual targets with some success | 10 | Simulator |
| Guided reaching | Successful and smoother reaches toward visual targets | 60 | Both |
| Torso movement | Moves at waist to reach objects | 20 | Robot |
| Object play | Grasps objects and moves them around | 40 | Robot |
Behaviors observed on the iCub.
| Behavior | Description | Time (min) |
| --- | --- | --- |
| Saccading | Eye movements only, trying to fixate on stimuli | 0 |
| Gazing | Eyes and head move to fixate on stimuli | 20 |
| Guided reaching | Successful and smoother reaches toward visual targets | 60 |
| Torso movement | Moves at waist to reach objects | 120 |
| Repeated touching | Repeatedly reaches out and touches objects | 140 |
| Pointing | Points to objects out of reach | 160 |
| Object play | Explores object affordances and actions | 170 |
| Stacks objects | Places one object on top of another | 210 |
| Learning ends | Experiment ends | 230 |
Behaviors observed in simulation.
| Behavior | Description | Time (min) |
| --- | --- | --- |
| Fetal babbling | General arm movements | 0 |
| Pre-reaching position | Moves hand to the side of the head before reaching | 10 |
| Swiping | Arms make swiping actions in the general direction of visual stimuli | 10 |
| Visually elicited reaching | Reaches toward visual targets with some success | 20 |
| Guided reaching | Successful and smoother reaches toward visual targets | 30 |
| Learning ends | Refined hand-eye coordination | 90 |
Learning times using developmental processes.
| Eyes | 3 | 3 | 1 | 20 |
| Head | 2 | 2 | 1 | 40 |
| Tactile | 1 | | | |
| Torso | 2 | 3 | 1 | 20 |
| Arms | 4*2 | 3*2 | 4 | 90 |
Figure 6. Graph showing head learning with a Type A constraint lifted at 10 min intervals.
Figure 7. Graph showing the effect of a Type B constraint on eye learning, when the head constraint is lifted at 10 min intervals.
Figure 8. Arm fields after 10 min of hand regard behavior. The left image is with bootstrapping and the right is without bootstrapping.
Figure 9. Effect of bootstrapping on reach learning.
Reach length comparison.
| Condition | Reach length |
| --- | --- |
| Bootstrapping and hand regard | 107.5 |
| Hand regard, no bootstrapping | 149.5 |
| No bootstrapping or hand regard | 179.5 |
Figure 10. A schematic map of some schema chaining possibilities. Rectangular boxes represent actions or state transitions and elliptical boxes represent different states known to the robot.
Effect of development and generalization on schema production.
| Condition | Schemas produced |
| --- | --- |
| Generalization only (Simulated Robot) | 19,244 |
| Stages only (Simulated Robot) | 347 |
| Stages, generalization (Simulated Robot) | 227 |
| Stages, generalization (Physical Robot) | 115 |
Schema discovery on the iCub.
| Time | Preconditions | Action | Postconditions | Schema |
| --- | --- | --- | --- | --- |
| 00:18 | Green object at (17.5, 72.4) | Reach to (17.5, 72.4) | Hand at (17.5, 72.4) | New touch schema |
| | | | Green object at (17.5, 72.4) | |
| | | | Touch sensation | |
| 00:50 | $z color object at ($x,$y) | Reach to ($x,$y) | Hand at ($x,$y) | Generalized touch schema |
| | | | $z color object at ($x,$y) | |
| | | | Touch sensation | |
| 01:56 | Green object at (17.5, 72.4) | Grasp | Hand at (17.5, 72.4) | New grasping schema |
| | Touch sensation | | Green object at (17.5, 72.4) | |
| | | | Holding object | |
| 02:01 | $z color object at ($x,$y) | Grasp | Hand at ($x,$y) | Generalized grasp schema |
| | Touch sensation | | $z color object at ($x,$y) | |
| | | | Holding object | |
| 02:19 | Hand at (17.5, 72.4) | Reach to (8.8, 62.6) | Hand at (8.8, 62.6) | New transport schema |
| | Green object at (17.5, 72.4) | | Green object at (8.8, 62.6) | |
| | Holding object | | Holding object | |
| 02:36 | Hand at ($x,$y) | Reach to ($u,$v) | Hand at ($u,$v) | Generalized transport schema |
| | Green object at ($x,$y) | | Green object at ($u,$v) | |
| | Holding object | | Holding object | |
| 03:42 | Hand at (17.5, 72.4) | Release | Hand at (17.5, 72.4) | New release schema |
| | Green object at (17.5, 72.4) | | Green object at (17.5, 72.4) | |
| | Holding object | | Touch sensation | |
The $ notation specifies variable bindings, in left to right order.
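
As a rough illustration of how `$` variables in a generalized schema might be bound to a concrete observation and then substituted into the action, consider the minimal Python sketch below. The tuple encoding of schema elements and the helper names `is_var`, `bind`, and `instantiate` are assumptions made for this example; the source does not describe the actual schema representation.

```python
from typing import Dict, Optional, Tuple, Union

Value = Union[str, float]
Term = Tuple[Value, ...]  # e.g. ("object", "green", 17.5, 72.4); hypothetical encoding


def is_var(value: Value) -> bool:
    """A variable is any string starting with '$', as in the table's notation."""
    return isinstance(value, str) and value.startswith("$")


def bind(pattern: Term, fact: Term,
         bindings: Optional[Dict[str, Value]] = None) -> Optional[Dict[str, Value]]:
    """Match pattern against fact, left to right; return bindings or None on mismatch."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings or {})
    for p, f in zip(pattern, fact):
        if is_var(p):
            # A variable seen before must keep the value it was first bound to.
            if p in result and result[p] != f:
                return None
            result[p] = f
        elif p != f:
            return None
    return result


def instantiate(template: Term, bindings: Dict[str, Value]) -> Term:
    """Replace bound variables in template; unbound variables are left as they are."""
    return tuple(bindings.get(t, t) if is_var(t) else t for t in template)


if __name__ == "__main__":
    # Generalized touch schema: "$z color object at ($x,$y)" -> "Reach to ($x,$y)".
    precondition = ("object", "$z", "$x", "$y")
    action = ("reach", "$x", "$y")
    observed = ("object", "green", 17.5, 72.4)

    found = bind(precondition, observed)
    if found is not None:
        print(instantiate(action, found))
```

Returning `None` on a mismatch lets a caller try the same observation against several generalized schemas and keep only those whose preconditions bind consistently.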