Amanda S Therrien, Aaron L Wong.
Abstract
Human motor learning is governed by a suite of interacting mechanisms, each of which modifies behavior in distinct ways and relies on different neural circuits. In recent years, much attention has been given to one type of motor learning, called motor adaptation. Here, the field has generally focused on the interactions of three mechanisms: sensory prediction error (SPE)-driven, explicit (strategy-based), and reinforcement learning. Studies of these mechanisms have largely treated them as modular, aiming to model how the outputs of each are combined in the production of overt behavior. However, when examined closely, the results of some studies also suggest the existence of additional interactions between the sub-components of each learning mechanism. In this perspective, we propose that these sub-component interactions represent a critical means through which different motor learning mechanisms are combined to produce movement; understanding such interactions is essential to advancing our knowledge of how humans learn new behaviors. We review current literature studying interactions between SPE-driven, explicit, and reinforcement mechanisms of motor learning. We then present evidence of sub-component interactions between SPE-driven and reinforcement learning as well as between SPE-driven and explicit learning from studies of people with cerebellar degeneration. Finally, we discuss the implications of interactions between learning mechanism sub-components for future research in human motor learning.
Keywords: adaptation; cerebellar degeneration; explicit and implicit motor learning; reinforcement learning; sensory prediction error
Year: 2022 PMID: 35058767 PMCID: PMC8764186 DOI: 10.3389/fnhum.2021.785992
Source DB: PubMed Journal: Front Hum Neurosci ISSN: 1662-5161 Impact factor: 3.473
Figure 1. Control policy updates arising from the interactions of three learning mechanisms. On trial n, a control policy is issued to perform the current movement (light green thick arrows). This plan is executed by the body (physical plant), and sensory feedback is detected (dark green arrows). The SPE-driven learning system predicts the expected sensory consequences of the movement, which are compared against sensory feedback of the actual executed movement to compute a sensory prediction error (SPE). The reinforcement learning system predicts the expected reward associated with that movement, and this is compared against the actual reward outcome to compute a reward prediction error (RPE). The explicit learning system compares the expected outcome of the strategy against the observed movement outcome to compute a task error (TE). In all cases, the computed error signals (thin blue arrows) update both the respective prediction mechanism and the control policy for the next (n + 1) movement. Most studies treat this control-policy update as the combination of the contributions of the individual learning systems (here labeled as the Integrator). We suggest that these systems also interact in other ways. For example, SPE signals are a means by which the reinforcement-learning and explicit-learning systems could solve the credit-assignment problem in determining whether the policy or the execution of that policy led to the observed result (solid orange arrows). Additional speculated interactions may exist (dashed orange arrows), although more behavioral evidence is needed to support the existence of such connections in humans.
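The modular view described in Figure 1 can be illustrated with a minimal sketch. It is not a model taken from the article: the scalar policy (for example, a reach angle), the linear update rules, the gain values, and every name (`Errors`, `compute_errors`, `integrate`, `exploration`) are assumptions chosen for illustration. Each error is computed as actual minus predicted, and the Integrator simply sums the three systems' proposed changes. Updates to the prediction mechanisms themselves are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Errors:
    spe: float  # sensory prediction error
    rpe: float  # reward prediction error
    te: float   # task error


def compute_errors(predicted_feedback, actual_feedback,
                   expected_reward, actual_reward,
                   expected_outcome, observed_outcome):
    """Compute each error signal as (actual - predicted)."""
    return Errors(
        spe=actual_feedback - predicted_feedback,
        rpe=actual_reward - expected_reward,
        te=observed_outcome - expected_outcome,
    )


def integrate(policy, errors, exploration,
              k_spe=0.2, k_rpe=0.5, k_te=0.8):
    """Hypothetical Integrator: sum the updates proposed by each system.

    Assumed rules (not from the source):
      - SPE-driven and explicit systems counteract their signed errors.
      - Reinforcement moves the policy along the exploratory
        perturbation tried on this trial, scaled by the RPE.
    """
    d_spe = -k_spe * errors.spe
    d_rpe = k_rpe * errors.rpe * exploration
    d_te = -k_te * errors.te
    return policy + d_spe + d_rpe + d_te


if __name__ == "__main__":
    # Illustrative trial: feedback lands 10 units off and reward is omitted.
    errs = compute_errors(0.0, 10.0, 0.5, 0.0, 0.0, 10.0)
    print(errs)
    print(integrate(0.0, errs, exploration=2.0))
```

In this sketch the three systems never see each other's error signals, which is the modular assumption the authors argue is incomplete.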
Figure 2. Proposed interactions between the SPE signal and other learning mechanisms to solve the credit-assignment problem. (A) On a given trial, individuals receive positive or negative reward feedback about reach outcome. If this feedback is unexpectedly negative (i.e., a negative RPE signal), for example, individuals must determine whether they selected the wrong control policy or simply executed the correct policy poorly. (B) An example state diagram corresponding to the situation in panel (A) describes how an update signal is generated based on an RPE (indicating an error has occurred). An SPE is used to determine if the RPE should be attributed to a poor policy choice or a poor execution of that policy. (C) During explicit learning, an individual adopts a strategy (e.g., aim location) to attain a goal (hit the target with the cursor). If a task error arises, individuals must determine whether they selected the wrong explicit strategy or poorly executed the correct strategy. (D) Although it remains unclear exactly how explicit learning occurs, we propose that updates to the strategy choice occur as a result of a task error (TE), which is modulated by an SPE informing about the accuracy of executing the chosen strategy.
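The credit-assignment idea in Figure 2 can be sketched as a simple gate. This is an illustrative assumption, not the authors' model: the caption says only that the SPE is used to attribute the error and that it modulates the TE, so the all-or-none threshold, the `spe_tolerance` value, and the function name `gate_by_spe` are hypothetical. A large SPE is read as poor execution, in which case the policy or strategy is left unchanged; a small SPE means execution matched the prediction, so the error is attributed to the policy or strategy choice.

```python
def gate_by_spe(error_signal, spe, spe_tolerance=2.0):
    """Attribute an outcome error (RPE or TE) using the SPE.

    Returns (attribution, update_signal). The hard threshold is an
    assumption; a graded weighting would fit the caption equally well.
    """
    if abs(spe) > spe_tolerance:
        # Movement deviated from its predicted consequences:
        # blame execution, pass no update to the policy/strategy.
        return "execution", 0.0
    # Movement went as predicted: blame the policy/strategy choice.
    return "policy", error_signal


if __name__ == "__main__":
    # Panels A-B: unexpectedly negative reward (negative RPE).
    print(gate_by_spe(error_signal=-1.0, spe=0.5))
    print(gate_by_spe(error_signal=-1.0, spe=6.0))
    # Panels C-D: the same gate applied to a task error.
    print(gate_by_spe(error_signal=8.0, spe=0.0))
```

The same function is reused for both reinforcement (RPE) and explicit (TE) learning here only to mirror the parallel structure of panels B and D.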