Gibson Weydmann, Igor Palmieri, Reinaldo A. G. Simões, João C. Centurion Cabral, Joseane Eckhardt, Patrice Tavares, Candice Moro, Paulina Alves, Samara Buchmann, Eduardo Schmidt, Rogério Friedman, Lisiane Bizarro.
Abstract
Online experiments are an alternative for researchers interested in conducting behavioral research outside the laboratory. However, online assessment can become a challenge when long and complex experiments need to be conducted in a specific order or under the supervision of a researcher. The aim of this study was to test the computational validity and feasibility of a remote, synchronous reinforcement learning (RL) experiment conducted during the social-distancing measures imposed by the pandemic. An additional aim was to describe how a behavioral experiment originally designed for in-person administration was transformed into a supervised remote online experiment. Open-source software was used to collect data, conduct statistical analyses, and perform computational modeling. Python code was written to replicate computational models that simulate the effect of working memory (WM) load on RL performance. Our behavioral results indicated that, remotely and with a modified behavioral task, we replicated the effects of WM load on RL performance observed in previous in-person studies. Our computational analyses using Python code also captured the effects of WM load on RL as expected, suggesting that the algorithms and optimization methods were reliable in their ability to reproduce behavior. The behavioral and computational validation shown in this study, together with the detailed description of supervised remote testing, may be useful for researchers interested in conducting long and complex experiments online.
Keywords: Computational modeling; Online remote experiment; Reinforcement learning
Year: 2022 PMID: 36220950 PMCID: PMC9552715 DOI: 10.3758/s13428-022-01982-6
Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X
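The abstract notes that Python code was used to implement the computational models. For orientation, here is a minimal sketch of what the "Classic" RL model in the parameter table below amounts to: a delta-rule value update with a softmax choice policy (β = 50 and α ≈ .03, as reported in the table). The function names, the three-action layout, and the uniform initialization are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(q, beta=50.0):
    """Turn action values into choice probabilities (inverse temperature beta)."""
    z = beta * (q - q.max())            # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def simulate_classic_rl(stimuli, reward_for, n_actions=3, alpha=0.03, beta=50.0, rng=None):
    """Delta-rule RL: Q(s, a) <- Q(s, a) + alpha * (r - Q(s, a)).

    stimuli    : sequence of integer stimulus codes, one per trial
    reward_for : callable (stimulus, action) -> 0 or 1 (the task's correct-key mapping)
    """
    rng = rng if rng is not None else np.random.default_rng()
    q = np.full((max(stimuli) + 1, n_actions), 1.0 / n_actions)  # uniform initial values
    accuracy = []
    for s in stimuli:
        a = rng.choice(n_actions, p=softmax(q[s], beta))   # sample an action
        r = reward_for(s, a)                               # deterministic feedback
        q[s, a] += alpha * (r - q[s, a])                   # delta-rule update
        accuracy.append(r)
    return np.array(accuracy)
```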
Fig. 1 Reinforcement Learning Working Memory (RLWM) task structure
Fig. 2 Instructions and experimenter conduct for the remote behavioral experiment. The instructions were divided into five steps (A to E), and each step needed to be finished before moving to the next. Dashed lines indicate cases in which experimenters would need to stop data collection. ** indicates that experimenters were registering the process in the lab book. GRA = Google Remote Access; RLWM = Reinforcement Learning and Working Memory task
Fig. 3 Learning curves as a function of stimulus iteration for each condition and correspondence between computational models and raw data. Simulations for the computational models were executed 100 times for each subject (see Model optimization for more details). A Learning curves for each condition. B Correspondence between Classic RL model simulation and raw data. C Correspondence between a pure WM model simulation and raw data. D Correspondence between RLWM model simulation and raw data
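The model-versus-data panels in Fig. 3 come from running each fitted model forward 100 times per subject and averaging accuracy by stimulus iteration. Below is a minimal sketch of that averaging step, assuming a `simulate` callable that replays a fitted model over the subject's trial sequence and returns a 0/1 accuracy value per trial; `n_iters` and all names are illustrative, not the authors' code.

```python
import numpy as np

def simulated_learning_curve(simulate, stimuli, n_sims=100, n_iters=12):
    """Average simulated accuracy by stimulus iteration across repeated runs.

    simulate : callable that replays a fitted model over `stimuli` and returns
               a 0/1 accuracy value per trial (e.g., simulate_classic_rl above)
    """
    curve = np.zeros(n_iters)
    counts = np.zeros(n_iters)
    for _ in range(n_sims):
        acc = simulate(stimuli)
        seen = {}                                   # appearances of each stimulus so far
        for s, correct in zip(stimuli, acc):
            it = seen.get(s, 0)
            if it < n_iters:
                curve[it] += correct
                counts[it] += 1
            seen[s] = it + 1
    return np.divide(curve, counts, out=np.zeros_like(curve), where=counts > 0)
```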
Mean and standard deviation for model parameters
| Models | Learning rate (α) | Decay (φ) | Persistence | Random noise (ε) | Initial bias (init) | 𝜂3 | 𝜂6 | AIC |
|---|---|---|---|---|---|---|---|---|
| | M (SD) | M (SD) | M (SD) | M (SD) | M (SD) | M (SD) | M (SD) | M (SD) |
| Classic | .03 (.008) | __ | __ | __ | __ | __ | __ | 638.14 (128.00) |
| WM | __ | .20 (.058) | .85 (.069) | .02 (.043) | __ | __ | .90 (.074) | 608.81 (137.46) |
| RLWMi | .11 (.168) | .07 (.034) | .68 (.256) | .04 (.039) | .01 (.011) | .59 (.265) | .18 (.192) | 592.03 (134.64) |
M = mean; SD = standard deviation. For all models, β = 50. For the WM model, α = 1, K = 6, and 𝜂6 = 𝜂, where 𝜂 is the estimated use of working memory on the task to obtain the expected behavioral outcome; init was not considered in the WM model. For the RLWMi model, 𝜂3 and 𝜂6 indicate the use of working memory in the low and high working memory conditions, respectively. AIC = Akaike information criterion; WM = Working Memory model; RLWMi = Reinforcement Learning and Working Memory Interaction model.
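The note above implies the usual structure of such mixture models: a WM module with perfect one-shot learning (α = 1) that decays toward a uniform policy at rate φ, blended with the RL policy by a set-size-dependent weight 𝜂 (𝜂3 for set size 3, 𝜂6 for set size 6), plus undirected noise ε. Below is a minimal sketch of one common way that mixture is computed, using the RLWMi means from the table; the function name and the omission of the persistence and init parameters are simplifying assumptions, as the record does not spell out their roles.

```python
import numpy as np

def rlwm_choice_probs(q_rl, w_wm, set_size, n_actions=3,
                      beta=50.0, eta3=0.59, eta6=0.18, eps=0.04):
    """Mix WM and RL policies for the current stimulus (means from the table).

    q_rl : RL action values, updated with the delta rule (learning rate alpha)
    w_wm : WM weights, updated with alpha = 1 on feedback and decayed toward a
           uniform policy each trial: w += phi * (1 / n_actions - w)
    """
    def softmax(x):
        z = beta * (x - x.max())
        e = np.exp(z)
        return e / e.sum()

    eta = eta3 if set_size == 3 else eta6        # WM reliance depends on load
    p = eta * softmax(w_wm) + (1 - eta) * softmax(q_rl)
    return (1 - eps) * p + eps / n_actions       # undirected (random) noise
```

Because AIC = 2k − 2 ln L̂ penalizes extra parameters, the lower mean AIC for RLWMi in the table indicates a better fit even after accounting for its larger parameter count.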