Devika Narain, Jeroen B J Smeets, Pascal Mamassian, Eli Brenner, Robert J van Beers.
Abstract
We often encounter pairs of variables in the world whose mutual relationship can be described by a function. After training, human responses closely correspond to these functional relationships. Here we study how humans predict unobserved segments of a function that they have been trained on, and we compare human predictions with those made by various function-learning models in the literature. Participants' performance was best predicted by the polynomial functions that generated the observations. Further, participants were able to explicitly report the correct generating function in most cases in a post-experiment survey. This suggests that humans can abstract functions. To understand how they do so, we modeled human learning using a hierarchical Bayesian framework organized at two levels of abstraction, function learning and parameter learning, and used it to understand the time course of participants' learning as we surreptitiously changed the generating function over time. This Bayesian model selection framework allowed us to analyze the time course of function learning and parameter learning in relative isolation. We found that participants acquired new functions as they changed and that, even when parameter learning was not completely accurate, the probability that the correct function was learned remained high. Most importantly, we found that humans selected the simplest-fitting function with the highest probability and that they acquired simpler functions faster than more complex ones. Both aspects of this behavior, extent and rate of selection, present evidence that human function learning obeys the Occam's razor principle.
Keywords: Bayesian model selection; Occam's razor; function learning; sensorimotor learning; structure learning
Year: 2014 PMID: 25324770 PMCID: PMC4179744 DOI: 10.3389/fncom.2014.00121
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 2.380
Figure 1. Experiment 1. (A) Timeline of a single trial. Participants initiated a trial, were presented with a cue about the spatial location of a future target and they pressed a key to launch an animated bullet to catch it en route. (B) Generating functions in three experimental sessions. The vertical extent of the curves indicates the time during which the target was visible (150 ms). Orange colors represent the test regions; other colors represent the training regions. (C) Smoothed data-averages (Gaussian kernel radius 10 mm) from all participants overlaid upon pooled responses. (D) A single participant's data for the quadratic (purple) and cubic (green) sessions along with the set of models/heuristics that were fit individually to the training data of each participant and whose prediction distributions were used to calculate likelihoods on the basis of the test data (curves indicate mean, shaded regions represent standard deviation). (E) Differences between the negative log likelihoods (summed over all participants) of the generating function and remaining heuristics and models. Positive bars indicate worse performance when compared to the generating function whereas negative bars indicate better performance. Numbers atop each bar quantify how well the model performs with respect to the generating function model, i.e., 2 ln(K), where K is the Bayes factor (see Materials and Methods for details).
A scale to interpret the measure 2 ln(K), where K is the Bayes Factor, a measure used in Figure .
| 2 ln(K) | Strength of evidence |
| --- | --- |
| 0–2 | Not worth a mention |
| 2–6 | Positive |
| 6–10 | Strong |
| >10 | Very strong |
Unlike hypothesis testing, there is no single criterion but graded levels of evidence to support one of two hypotheses.
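As a minimal sketch of how this scale might be applied, the snippet below converts two summed negative log likelihoods into 2 ln(K) and maps the result onto the graded labels. It assumes K is taken as the ratio of the two models' likelihoods on the test data. The function names, the handling of the bin edges (a value of exactly 2, 6, or 10 falls in the lower bin), and the use of the magnitude for negative values are illustrative assumptions, not details taken from the paper.

```python
def two_ln_k(nll_reference, nll_alternative):
    """Return 2 ln(K) in favor of the reference model.

    Assumes K = L_reference / L_alternative, so that
    ln(K) = nll_alternative - nll_reference.
    """
    return 2.0 * (nll_alternative - nll_reference)


def evidence_label(value):
    """Map 2 ln(K) onto the graded scale.

    The sign says which model is favored; the label uses the magnitude.
    Bin edges are assigned to the lower bin (an assumption).
    """
    magnitude = abs(value)
    if magnitude > 10:
        return "Very strong"
    if magnitude > 6:
        return "Strong"
    if magnitude > 2:
        return "Positive"
    return "Not worth a mention"


# Hypothetical summed negative log likelihoods, for illustration only.
score = two_ln_k(nll_reference=100.0, nll_alternative=104.0)
print(score, evidence_label(score))
```

A positive score here means the alternative has the larger negative log likelihood, which matches the caption's convention that positive bars indicate worse performance than the generating function.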
Figure 2. Function learning in Experiment 2. (A) The three generating functions used in this experiment were presented in different serial orders to the four groups. (B) The average posterior probabilities (foreground bars) for the most-likely function corresponding to participants' responses within a moving window of 50 trials are shown over the course of the experiment. The generating function that was used at a certain trial is indicated by the presence of a background color. The four panels represent averages of the four groups with different presentation orders for the generating functions.
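The posterior probabilities in panel (B) come from Bayesian model selection. As a hedged illustration of the final normalization step only, the sketch below turns per-model log evidences into posterior probabilities using Bayes' rule. How the evidences themselves are computed (marginalizing over each function's parameters) is described in Materials and Methods and is not reproduced here. The function name and the default uniform prior are assumptions made for this example.

```python
import math


def posterior_over_models(log_evidences, log_priors=None):
    """Posterior P(model | data) from log evidences ln p(data | model).

    Assumes a uniform prior over models unless log_priors is given.
    """
    n = len(log_evidences)
    if log_priors is None:
        log_priors = [math.log(1.0 / n)] * n
    log_joint = [e + p for e, p in zip(log_evidences, log_priors)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log evidences for three candidate functions in one window.
print(posterior_over_models([-52.1, -48.7, -49.3]))
```

Applying such a function to each 50-trial window in turn would give a time course of the kind plotted in the figure, under the stated assumptions.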
Figure 3. Parameter learning in Experiment 2. (A–D) The four panels represent the four different groups of participants. Each panel consists of three sub-parts, each indicating the parameters for the terms of the model when the corresponding generating functions were presented (note that the parameters have different scales and dimensions). Dashed lines represent the values of the generating function. All other lines represent the maximum likelihood estimate (MLE) of the parameters in a moving window of 50 trials. Thin lines represent individual participants and thick lines represent averages. The lines pertain to the parameters of the presented generating function and therefore there may be discontinuities in the lines at the switches. All participants' data are plotted irrespective of whether or not the generating function was the best description of the participants' responses.
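A moving-window MLE of polynomial parameters could be sketched as follows. This assumes independent Gaussian response noise, in which case the maximum likelihood estimate of the coefficients coincides with an ordinary least-squares fit. The paper's actual noise model and fitting procedure are given in Materials and Methods and may differ. The function name, the argument names, and the use of NumPy's `polyfit` are choices made for this illustration.

```python
import numpy as np


def moving_window_mle(cues, responses, degree, window=50):
    """Least-squares polynomial fit in each window of consecutive trials.

    Under Gaussian noise (an assumption) this equals the MLE.
    Returns an array of shape (n_windows, degree + 1); each row holds
    coefficients ordered from the highest power to the constant term.
    """
    cues = np.asarray(cues, dtype=float)
    responses = np.asarray(responses, dtype=float)
    estimates = []
    for start in range(len(cues) - window + 1):
        x = cues[start:start + window]
        y = responses[start:start + window]
        estimates.append(np.polyfit(x, y, degree))
    return np.array(estimates)


# Synthetic, noise-free quadratic data for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 120)
y = 2.0 * x**2 - 1.0 * x + 0.5
coeffs = moving_window_mle(x, y, degree=2)
print(coeffs.shape)
```

Each row of the result corresponds to one window position, so plotting a column against the window index gives one parameter trace of the kind shown in the figure.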
Figure 4. Comparisons of function acquisition times in Experiment 2. Comparison of (A) participants' data and (B) average responses from simulated participants with noisy responses (see Materials and Methods for various noise values) for different switches of generating functions. The ordinate represents acquisition time, i.e., the number of trials taken to achieve a selection probability of 0.33. Average acquisition times of each model for (C) participants' data and (D) responses from a simulated participant. (E) Average function acquisition times for participants' data as a function of the selection threshold at which the acquisition time is determined. Error bars represent standard error across participants.
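The acquisition-time definition in this caption can be illustrated with a short sketch. It assumes the selection probability of the newly presented function is available as one value per trial (or per window position), and that "achieve" means reaching or exceeding the threshold. The function name, the `switch_trial` argument, and the choice to return `None` when the threshold is never reached are assumptions made for this example.

```python
def acquisition_time(selection_probabilities, switch_trial=0, threshold=0.33):
    """Trials elapsed after the switch until the selection probability
    first reaches the threshold; None if it never does.
    """
    for offset, p in enumerate(selection_probabilities[switch_trial:]):
        if p >= threshold:
            return offset
    return None


# Hypothetical probability trace for illustration only.
trace = [0.10, 0.20, 0.30, 0.35, 0.60]
print(acquisition_time(trace))
```

Sweeping `threshold` over a range of values and recomputing the result would give a curve analogous to panel (E), under the same assumptions.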