Devavrat Vartak, Danique Jeurissen, Matthew W Self, Pieter R Roelfsema.
Abstract
We can learn new tasks by listening to a teacher, but we can also learn by trial-and-error. Here, we investigate the factors that determine how participants learn new stimulus-response mappings by trial-and-error. Does learning in human observers comply with reinforcement learning theories, which describe how subjects learn from rewards and punishments? If yes, what is the influence of selective attention in the learning process? We developed a novel redundant-relevant learning paradigm to examine the conjoint influence of attention and reward feedback. We found that subjects only learned stimulus-response mappings for attended shapes, even when unattended shapes were equally informative. Reward magnitude also influenced learning, an effect that was stronger for attended than for non-attended shapes and that carried over to a subsequent visual search task. Our results provide insights into how attention and reward jointly determine how we learn. They support the powerful learning rules that capitalize on the conjoint influence of these two factors on neuronal plasticity.
Year: 2017 PMID: 28831043 PMCID: PMC5567207 DOI: 10.1038/s41598-017-08200-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Shape learning task. (a) Trial structure. In every trial the participants saw four icons. They were instructed to pay attention to the side of the screen that was cued with a red triangle. Their task was to determine, by trial and error, which of the two icons on the cued side was associated with a particular response button. They received auditory and visual feedback on whether their response was erroneous or correct, and on correct trials they also received feedback about the number of points that they gained (0 or 10 points). (b) Accuracy of the subjects that learned fast (within 1024 trials). Blue curves represent the accuracy for the high-reward icons and red curves the accuracy for the low-reward icons. The lighter curves show the accuracies of individual subjects (in a window of 50 trials) and the darker curves the average across subjects (N = 10 fast learners). Dashed lines represent s.e.m. (c) Accuracy of the 6 subjects that learned more slowly and received additional training.
Figure 2. Probe task testing the influence of attention on learning. (a) Trial structure. Participants saw icons that had been cued during learning (a relevant icon and a cued distractor) or icons that had not been cued (a redundant icon and a non-cued distractor). We gave no feedback on whether the response was correct or erroneous to prevent additional learning. (b) Accuracy in the probe task for icons that had or had not been cued during learning. Black (white) bars denote the accuracy for icons that had been associated with a high (low) reward. Error bars represent standard error of the mean.
Figure 3. Visual search task. (a) Trial structure. Participants indicated the presence of a target icon. (b) Mean reaction time on trials in which subjects searched for the previously displayed icons with different reward values. Continuous (dashed) lines show reaction times in target-present (target-absent) trials. All error bars are s.e.m. across subjects.
Figure 4. Accuracy in the visual search task. (a) Mean accuracy on trials in which subjects searched for the various types of icons of the learning task with different reward values. (b) Sensitivity (d-prime) for these icons. (c) Bias for the icons. Higher bias values correspond to a more conservative response criterion, i.e. a lower probability of reporting “Target present”. Error bars, s.e.m. across subjects.
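The Figure 4 caption reports sensitivity and bias without spelling out the formulas. A minimal sketch, assuming the standard equal-variance signal detection definitions, $d' = z(H) - z(F)$ and criterion $c = -\tfrac{1}{2}\,[z(H) + z(F)]$, where $H$ is the hit rate and $F$ the false-alarm rate. The function name `dprime_and_bias` and the clipping of extreme rates are illustrative assumptions, not taken from the source.

```python
from statistics import NormalDist


def dprime_and_bias(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from target-present/absent response counts.

    Assumes an equal-variance Gaussian model; the source does not confirm
    which variant of the formulas was used.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    n_present = hits + misses
    n_absent = false_alarms + correct_rejections

    # Assumed correction: keep rates away from 0 and 1 so z() stays finite.
    hit_rate = min(max(hits / n_present, 0.5 / n_present), 1 - 0.5 / n_present)
    fa_rate = min(max(false_alarms / n_absent, 0.5 / n_absent), 1 - 0.5 / n_absent)

    d_prime = z(hit_rate) - z(fa_rate)
    # Positive criterion = conservative, i.e. fewer "Target present" reports.
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion
```

Under these definitions, an observer who rarely reports "Target present" (few hits and few false alarms) gets a positive criterion, which matches the caption's reading of higher bias values as more conservative.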
Roles and properties of object icons: There were a total of 32 icons, two icons for every entry in the table.
| Category | Item | Side | Response | Reward |
|---|---|---|---|---|
| Relevant | Rel1_High | Cued | 1 | 10 points |
| Relevant | Rel1_Low | Cued | 1 | 0 points |
| Relevant | Rel2_High | Cued | 2 | 10 points |
| Relevant | Rel2_Low | Cued | 2 | 0 points |
| Cued distractor | cDisA_High | Cued | 1/2 | 10 points |
| Cued distractor | cDisA_Low | Cued | 1/2 | 0 points |
| Cued distractor | cDisB_High | Cued | 1/2 | 10 points |
| Cued distractor | cDisB_Low | Cued | 1/2 | 0 points |
| Redundant | Red1_High | NonCued | 1 | 10 points |
| Redundant | Red1_Low | NonCued | 1 | 0 points |
| Redundant | Red2_High | NonCued | 2 | 10 points |
| Redundant | Red2_Low | NonCued | 2 | 0 points |
| Non-cued distractor | nDisA_High | NonCued | 1/2 | 10 points |
| Non-cued distractor | nDisA_Low | NonCued | 1/2 | 0 points |
| Non-cued distractor | nDisB_High | NonCued | 1/2 | 10 points |
| Non-cued distractor | nDisB_Low | NonCued | 1/2 | 0 points |
All icons were associated with either 10 points (high reward) or 0 points (low reward). Relevant and redundant icons were associated with a particular response (1 or 2), whereas distractors were not and are labeled A or B instead.
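The structure of the icon table can be sketched as a small data structure: eight item families crossed with two reward levels give the 16 table entries, and two icons per entry give the 32 icons. This is only an illustration of the design; the names `ROLES`, `REWARD_POINTS`, `ICONS_PER_ENTRY` and `build_icon_table` are hypothetical and do not come from the source.

```python
from itertools import product

# (category, item prefix, side, response) for the eight item families.
# "1/2" marks distractors, which are not tied to a single response.
ROLES = [
    ("Relevant", "Rel1", "Cued", "1"),
    ("Relevant", "Rel2", "Cued", "2"),
    ("Cued distractor", "cDisA", "Cued", "1/2"),
    ("Cued distractor", "cDisB", "Cued", "1/2"),
    ("Redundant", "Red1", "NonCued", "1"),
    ("Redundant", "Red2", "NonCued", "2"),
    ("Non-cued distractor", "nDisA", "NonCued", "1/2"),
    ("Non-cued distractor", "nDisB", "NonCued", "1/2"),
]
REWARD_POINTS = {"High": 10, "Low": 0}
ICONS_PER_ENTRY = 2  # the table caption states two icons per entry


def build_icon_table():
    """Cross every item family with both reward levels (16 entries)."""
    rows = []
    for (category, prefix, side, response), (level, points) in product(
        ROLES, REWARD_POINTS.items()
    ):
        rows.append(
            {
                "category": category,
                "item": f"{prefix}_{level}",
                "side": side,
                "response": response,
                "reward_points": points,
            }
        )
    return rows


entries = build_icon_table()
total_icons = len(entries) * ICONS_PER_ENTRY
```

Here `entries` holds one dictionary per table row and `total_icons` reproduces the icon count given in the table caption.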
Performance-based achievement levels: These achievement levels were displayed on the screen during breaks.
| Performance in the last 128 trials | Title |
|---|---|
| Accuracy < 50% | Psychophysics - Apprentice |
| 50% < Accuracy < 70% | Psychophysics - Warrior |
| 70% < Accuracy < 90% | Psychophysics - Champion |
| Accuracy > 90% | Psychophysics - Wizard |
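The mapping from recent accuracy to a title can be sketched as below. The table uses strict inequalities, so accuracies of exactly 50%, 70% or 90% are not assigned to any level. This sketch assumes each lower bound is inclusive, which is a guess rather than something the source states. The function name `achievement_title` is hypothetical.

```python
def achievement_title(outcomes, window=128):
    """Return the title for the accuracy over the last `window` trials.

    `outcomes` is a sequence of booleans (True = correct response).
    Boundary handling (lower bounds inclusive) is an assumption.
    """
    recent = list(outcomes)[-window:]
    if not recent:
        raise ValueError("need at least one trial outcome")
    accuracy = 100.0 * sum(recent) / len(recent)

    if accuracy < 50:
        return "Psychophysics - Apprentice"
    if accuracy < 70:
        return "Psychophysics - Warrior"
    if accuracy < 90:
        return "Psychophysics - Champion"
    return "Psychophysics - Wizard"
```

For example, a participant with 100 correct responses in the last 128 trials (about 78%) would be shown "Psychophysics - Champion".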