Miriam Hauptman1,2, Esti Blanco-Elorrieta1,3, Liina Pylkkänen1,2,4. 1. Department of Psychology, New York University, New York, NY 10003, USA. 2. NYUAD Institute, New York University Abu Dhabi, Abu Dhabi, P.O. Box 129188, UAE. 3. Department of Psychology, Harvard University, Cambridge, MA 02138, USA. 4. Department of Linguistics, New York University, New York, NY 10003, USA.
Abstract
Coherent language production requires that speakers adapt words to their grammatical contexts. A fundamental challenge in establishing a functional delineation of this process in the brain is that each linguistic process tends to correlate with numerous others. Our work investigated the neural basis of morphological inflection by measuring magnetoencephalography during the planning of inflected and uninflected utterances that varied across several linguistic dimensions. Results reveal increased activity in the left lateral frontotemporal cortex when inflection is planned, irrespective of phonological specification, syntactic context, or semantic type. Additional findings from univariate and connectivity analyses suggest that the brain distinguishes between different types of inflection. Specifically, planning noun and verb utterances requiring the addition of the suffix -s elicited increased activity in the ventral prefrontal cortex. A broadly distributed effect of syntactic context (verb vs. noun) was also identified. Results from representational similarity analysis indicate that this effect cannot be explained in terms of word meaning. Together, these results 1) offer evidence for a neural representation of abstract inflection that separates from other stimulus properties and 2) challenge theories that emphasize semantic content as a source of verb/noun processing differences.
Adapting words to their grammatical contexts is central to language use. For instance, in order to describe a person’s affinity for canines, I must combine “-s” with “like” and “-s” with “dog” to create the phrase “He likes dogs.” The process of grammatical inflection may appear simple, but it in fact involves a remarkable number of linguistic operations: one must select and then evaluate the meanings of all relevant pieces (e.g., “He,” “like,” and “dog”), identify the syntactic categories of the pieces and determine their order (syntax), retrieve the abstract grammatical elements (morphemes) that express those relations, and choose the correct sounds to realize the morphemes (such as -s to inflect a noun as plural). Teasing apart these operations is theoretically feasible; however, the correlational nature of language makes it extremely difficult to dissociate the underlying neural computations. This is because each level of linguistic representation often interacts, or is realized simultaneously, with one or multiple others.
The current work asks whether inflection has a unified neural basis. Past patient studies using electrocorticography (ECoG) have offered evidence for unified neural correlates of inflection across number and tense in limited cortical search spaces: the left inferior frontal gyrus (Sahin et al. 2009) and the left posterior temporal cortex (Lee et al. 2018). Here, we took an exploratory approach to studying this question in healthy participants using whole-head magnetoencephalography (MEG). Our design allowed us to compare number and tense inflection for nouns and verbs, respectively, while also varying whether the inflection was phonologically realized/overt (e.g., “He crawlS”) or phonologically null/covert (e.g., “I crawl,” where the present tense of the first person singular does not affect the phonology of the verbal stem). Additionally, we orthogonalized the stimuli by 3 different semantic dimensions (Abstract Cognition, Manipulable, Non-Manipulable), enabling us to account for word meaning when interpreting effects of inflection and noun/verb processing.
Our primary aim was to determine whether an effect of pure inflection (Chomsky 1955; Halle and Marantz 1994), independent of phonological and syntactic factors, could be observed in MEG data. Chomsky (1955) and Halle and Marantz (1994) treat inflection as the merge of functional categories with phrasal categories. Under such a hypothesis, inflectional processing should align with syntactic processing more generally. Many have sought to characterize the neural localization of syntax, citing the importance of various regions (Broca’s area: Hagoort 2005; Friederici 2011; posterior temporal lobe: Matchin and Hickok 2020; distributed: Blank et al. 2016). In our own work, we have observed “pure” effects of structure only in the left posterior temporal cortex (Flick and Pylkkänen 2020; Law and Pylkkänen 2021; Matar et al. 2021). Given this apparent inconsistency in the literature, one cannot make crisp predictions about where a syntactic effect would localize; however, the emergence of an effect of abstract inflection in our experiment, regardless of its location, would align with the Chomskyan view of inflection.
Critically, the study of morphology in language production faces particular difficulties from the perspective of task design. Production research using controlled manipulations aims to elicit exactly the same responses from different participants. Simple picture naming tasks most easily surmount this methodological obstacle, explaining the popularity of such tasks. In studies of morphological inflection, however, each participant must produce not only the same word but also the same inflected form of that word, ideally in a manner that is reasonably natural. Since, in conversation, we often complete each other’s sentences, completion tasks are a fairly natural yet controlled way to elicit productions in a specific grammatical context. To address the neural basis of grammatical inflection, we used a phrase completion task in which participants produced the inflected form of a visually displayed stem when provided auditorily with the beginning of a phrase. This task was similar to those used in past studies of production in patients awaiting brain surgery (Sahin et al. 2009; Lee et al. 2018) and in functional magnetic resonance imaging studies of healthy participants (Sahin et al. 2006; Shapiro et al. 2006).
The high temporal resolution of MEG allowed us to capture the time-locked progression of neural activity from the beginning of the language planning process until the moments just before production. We were therefore able to exclude from our data articulation-related motion artifacts, which present another classic challenge for brain research on language production. Univariate analyses of source-localized MEG signals in the left and right frontal and temporal lobes were complemented by whole-hemisphere representational similarity analyses (RSAs) that provided a finer grained characterization of the processes of interest. Additionally, we used analyses of Granger causality to examine the information flow between regions of interest located in the frontal, temporal, and parietal lobes during inflection.
Materials and Methods
Participants
Thirty individuals participated in the experiment. Four were excluded due to intense environmental noise caused by nearby construction, and 2 were excluded due to response accuracies below 80%, which left 24 participants in the final dataset (9 males, 15 females; mean age = 20.71 years; standard deviation [SD] = 2.94 years). All participants were right-handed, monolingual native English speakers with normal or corrected-to-normal vision and no history of neurological anomalies. Participants provided informed written consent following NYU Institutional Review Board protocols and received payment or course credit for their participation. This study was conducted according to local ethics and the Declaration of Helsinki.
Design and Materials
Task Rationale
Our task required participants to either inflect stems or simply repeat them (Fig. 1). To ensure uniform production across participants, we established the linguistic context preceding participants’ utterances by first visually displaying a stem (e.g., “dream”) to be used in a phrase completion task that immediately followed. We exploited the ambiguity of the English suffix -s as either plural inflection for nouns or as third person present singular for verbs to design a protocol in which participants produced verbs and nouns that had the same morphophonological shape. Pronouns were used to elicit verbs (He → “dreams”; I → “dream”) while numerals elicited nouns (Two → “dreams”; One → “dream”). The prompt, “Say,” elicited repetitions of the visually displayed stem. Thus, the 2-word expressions completed in verb trials constituted full sentences (“He dreams”), whereas expressions completed in noun trials were noun phrases (“Two dreams”). These choices were guided by 1) our intuitions of what would constitute natural phrase completions, 2) our desire to keep the context expressions as minimal as possible, and 3) our prior work on MEG correlates of composition. In our composition work, we have not detected reliable MEG reflections of combinatory processing involving grammatical function words lacking (or light on) conceptual content (reviewed in Pylkkänen 2019), with 2 of these studies specifically targeting numeral quantification in production (Del Prato and Pylkkänen 2014; Blanco-Elorrieta and Pylkkänen 2016a). This gave us confidence that results from the current paradigm would likely reflect inflectional processing as opposed to basic conceptual composition. Our design differed from designs used in past work (Sahin et al. 2006, 2009; Shapiro et al. 2006; Lee et al. 2018) in that our grammatical context (He, I, Two, One) immediately preceded production, whereas prior designs involved first presenting a multiword preamble (“Today we will…”) and then a target stem (“walk”), after which the participant produced the target expression. We preferred to elicit production immediately following the auditory context for more natural phrase completion. Our goal was to simulate the experience of natural language production by minimizing the sense that the task merely involved word play with suffixes.
Figure 1
Panel (a) shows the trial structure, taking as an example the stimulus item “dream,” while panel (b), left side, shows experimental manipulations at each stage of the trial and (b), right side, shows the factors of interest across all analyses in the context of this design table. “Semantic Type” is determined based on the meaning of the stimulus (Abstract Cognition, Non-Manipulable, Manipulable) and “Syntactic Category” refers to its ability to appear as either a noun or a verb, or both (Noun-Only, Verb-Only or Ambiguous). “Syntactic Context” is determined by the auditory cue, which indicates whether the word will be used as a verb or noun (Verb-Context, Noun-Context). “Trial Type” describes the inflectional status of the target utterance (Inflect-Non-Modify, Inflect-Modify, Repeat). In panel (b), the first table displays example trials using visual stimuli that do not contain the suffix -s, while the second table displays example trials using visual stimuli that contain the suffix -s. Note that for illustrative purposes we chose to display examples from within the Abstract Cognition type only; the full design includes the same tables repeated for the Non-Manipulable and Manipulable types and can be found in Supplementary Figure 1.
Stimuli
To isolate the neural processes associated with abstract inflection, independent of other potentially correlating features, we selected to-be-inflected stems that varied in their semantic and syntactic makeup. Specifically, 135 target stems were chosen across 3 semantic types (Abstract Cognition, Non-Manipulable, and Manipulable) and 3 syntactic categories (Verb-Only, Noun-Only, and Ambiguous), yielding 9 conditions of 15 words each. Each Verb-Only and Noun-Only stem appeared 6 times throughout the experiment and each Ambiguous item appeared 12 times.
Semantic types
Our experiment aimed to disrupt typical associations between a given syntactic category (e.g., verb) and its stereotypical meaning (i.e., action) in order to target representations of syntactic category that were pure and devoid of semantic confounds. For this purpose, we selected 3 semantic types that appear among both verbs and nouns and that recruit different sets of neural resources during processing (e.g., Martin et al. 1996; Kellenbach et al. 2003; Binder et al. 2005; Kemmerer et al. 2008). Abstract Cognition items implicated cognitive processes and did not have concrete physical referents (e.g., “joke,” “learn”); Non-Manipulable items referred to physical actions not involving direct manipulation by humans (e.g., “tornado,” “squirm”); and Manipulable items were associated with human manipulation of physical entities (e.g., “dagger,” “carve”). Stimuli were normed by 17 native English speakers who were asked to sort a scrambled list of the 135 stimulus items into these categories. Of the 135 stems in the initial stimulus set, 6 failed our inclusion criterion (i.e., they were placed in the “correct” category by fewer than 50% of the norming participants) and were subsequently replaced. More than 75% of the stems included in the final stimulus set were categorized into the target category by over 80% of the norming participants.
Syntactic categories
Stems classified as Verb-Only could function only as verbs (e.g., “kneel,” “embroider”), Noun-Only stems could function only as nouns (e.g., “deed,” “comet”), and Ambiguous stems could function as either nouns or verbs depending on the syntactic context (e.g., “dream,” “shovel”). The inclusion of Ambiguous stems allowed us to assess the extent to which potential differences in noun and verb production could be attributed to contextual information (e.g., the presence of a pronoun or numeral auditory cue) as opposed to intrinsic lexical properties. We assessed the extent to which each stem appeared as a noun or verb in American English by querying the Corpus of Contemporary American English (COCA) (https://www.english-corpora.org/coca/). More than half of the items in the Noun-Only and Verb-Only categories appeared in their respective syntactic contexts in 100% of the corpus entries (e.g., all entries of “comet” were nouns), and the rest appeared in their respective syntactic contexts in at least 80% of the entries. Ambiguous items ideally appeared in each syntactic context in 50% of the entries and never appeared in a single syntactic context in more than 80% of them; in some cases, the 50% target could not be met when considering other criteria (e.g., membership in a given semantic type and compatibility with the sublexical factors we also controlled for, such as length and frequency).
Stimuli across all 9 subgroups (3 syntactic categories × 3 semantic types) were matched for the following sublexical variables: length (number of characters), frequency, and number of phonemes, as measured by the English Lexicon Project (http://elexicon.wustl.edu/default.asp; Balota et al. 2007). In addition, all stems were monomorphemic; all Verb-Only and Ambiguous stems could function as intransitives as defined in the Merriam-Webster dictionary (e.g., “He dives,” “I doodle”); and all Noun-Only and Ambiguous stems were count nouns (i.e., could occur in plural form). A list of all stimuli is displayed in Supplementary Table 1 and a summary of their sublexical characteristics is displayed in Supplementary Table 2.
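The corpus-based criteria for the three syntactic categories can be illustrated with a short sketch. It is not the selection procedure itself, which also weighed semantic type and the matched sublexical variables. The function name `classify_stem`, its arguments, and the counts in the usage note are hypothetical, and assigning a stem at exactly 80% to an "-Only" category is an assumption, since the text leaves that boundary case open.

```python
def classify_stem(noun_count: int, verb_count: int, only_threshold: float = 0.80) -> str:
    """Label a stem from hypothetical corpus counts of its noun and verb uses."""
    total = noun_count + verb_count
    if total <= 0:
        raise ValueError("stem has no corpus entries")
    noun_share = noun_count / total
    verb_share = verb_count / total
    # Assumption: a share of exactly 80% counts as "-Only".
    if noun_share >= only_threshold:
        return "Noun-Only"
    if verb_share >= only_threshold:
        return "Verb-Only"
    return "Ambiguous"
```

For example, under these assumptions a stem with 55 noun entries and 45 verb entries would be labeled Ambiguous, while a stem with 500 noun entries and no verb entries would be labeled Noun-Only.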
Trial Structure and Procedure
In the phrase completion task, participants inflected stems based on the syntactic context provided by an auditory cue to complete a 2-word phrase (e.g., “He dreams”). Each trial began with a visual display of the target stem with or without an -s (e.g., “dream” or “dreams”), followed by an auditory presentation of the syntactic context (“He”), after which the participant inflected the target stem accordingly (“dreams”) (Fig. 1).
Auditory cues consisted of the pronouns “I” and “He” (Verb-Context), the numerals “One” and “Two” (Noun-Context), and the word “Say,” which prompted participants to simply repeat the visually displayed stem. “Say” trials were divided into Noun-Context and Verb-Context based on the syntactic category of the target production (Noun-Only = Noun-Context, Verb-Only = Verb-Context). Half of the Ambiguous “Say” trials were coded as Noun-Context and half as Verb-Context. All one-word cues (I, He, One, Two, Say) were recorded at 70 dB by a female native English speaker and were equalized using the Praat Vocal Toolkit (http://www.praatvocaltoolkit.com; Boersma 2001) so that each had the same duration (670 ms).
Different combinations of the visual stimuli (-s/no -s) and auditory cues resulted in 2 main trial types: Repeat and Inflect. In Repeat trials, participants heard “Say” after viewing a stem, which cued them to repeat that stem (e.g., visual stimulus “dream” + auditory cue “Say” = target production “dream”). In Inflect trials, participants heard an auditory cue that established a given syntactic context and then completed the 2-word phrase using the correctly inflected form. The main focus of this experiment was to investigate the neural basis of abstract inflection by comparing neural activity during the Inflect and Repeat conditions. In accordance with past work (Lee et al. 2018), we were also interested in whether the brain would distinguish between inflection that requires an active phonological modification of the stem and inflection that does not. Thus, we further divided the Inflect condition into 2 subcategories based on the nature of the visual stimuli. Inflect-Modify trials required that participants modify the stimulus item to match the provided context by either adding or removing an -s (e.g., “dream” + “He” = “dreams”). Conversely, in Inflect-Non-Modify trials, participants did not modify the phonological form of the stimulus to match the syntactic context (e.g., “dreams” + “He” = “dreams”).
Each of the 45 Verb-Only and 45 Noun-Only stems appeared in each trial type (Repeat, Inflect-Modify, Inflect-Non-Modify) in both -s/no -s manipulations of the target production (6 times total), while the 45 Ambiguous stems appeared twice as often (6 times in Verb-Context and 6 times in Noun-Context trials). The resulting 1080 total trials were counterbalanced across semantic type (360 trials of each: Abstract Cognition, Non-Manipulable, Manipulable), syntactic category (270 trials of Verb-Only, Noun-Only; 540 trials of Ambiguous), syntactic context (540 trials of each: Verb-Context, Noun-Context), trial type (360 trials of each: Repeat, Inflect-Modify, Inflect-Non-Modify), and -s/no -s in the target production (540 trials of each).
Participants were offered the chance to rest every 36 trials. Each individual trial was presented exactly once throughout the experiment, and each 36-trial block contained an equal number of every stimulus type (i.e., semantic type, syntactic category, syntactic context, trial type, -s/no -s in target production). To eliminate the possibility of order-based effects, we created 6 versions of the experiment by varying the trials contained in each block.
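The trial counts reported above follow directly from crossing the design factors. The sketch below enumerates the design and tallies each factor as a consistency check. Stems are represented only by an index, and all identifiers (`build_trials`, `CONTEXTS_BY_CATEGORY`, and so on) are illustrative names rather than part of the experiment code.

```python
from collections import Counter
from itertools import product

SEMANTIC_TYPES = ("Abstract Cognition", "Non-Manipulable", "Manipulable")
TRIAL_TYPES = ("Repeat", "Inflect-Modify", "Inflect-Non-Modify")
TARGET_FORMS = ("-s", "no -s")
STEMS_PER_CELL = 15  # 135 stems / (3 semantic types x 3 syntactic categories)

# Syntactic contexts in which stems of each category are produced.
CONTEXTS_BY_CATEGORY = {
    "Verb-Only": ("Verb-Context",),
    "Noun-Only": ("Noun-Context",),
    "Ambiguous": ("Verb-Context", "Noun-Context"),
}

FACTORS = ("semantic_type", "syntactic_category", "syntactic_context",
           "trial_type", "target_form")


def build_trials():
    """Enumerate every trial as a dict of factor levels."""
    trials = []
    for semantic_type, category in product(SEMANTIC_TYPES, CONTEXTS_BY_CATEGORY):
        for stem_index in range(STEMS_PER_CELL):
            for context, trial_type, form in product(
                CONTEXTS_BY_CATEGORY[category], TRIAL_TYPES, TARGET_FORMS
            ):
                trials.append({
                    "stem": (semantic_type, category, stem_index),
                    "semantic_type": semantic_type,
                    "syntactic_category": category,
                    "syntactic_context": context,
                    "trial_type": trial_type,
                    "target_form": form,
                })
    return trials


trials = build_trials()
counts = {factor: Counter(trial[factor] for trial in trials) for factor in FACTORS}
n_blocks = len(trials) // 36  # one rest opportunity every 36 trials
```

Inspecting `counts` reproduces the figures in the text (for instance, 540 Ambiguous trials and 360 trials per trial type), and the 1080 trials divide evenly into 36-trial blocks.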
Data Acquisition and Preprocessing
Before recording, each participant’s head shape was digitized using a Polhemus dual source handheld FastSCAN laser scanner (Polhemus, VT, USA). Digital fiducial points were recorded at 6 points on the individual’s head: the nasion, anterior of the left and right auditory canal, and 3 points on the forehead. Marker coils were placed at the same positions in order to localize that person’s skull relative to the MEG sensors. The measurements of these marker coils were recorded both immediately before and immediately after the experiment in order to correct for movement during the recording. MEG data were collected in the Neuroscience of Language Lab in NYU New York using a whole-head 157 channel axial gradiometer system (Kanazawa Institute of Technology, Kanazawa, Japan) as participants lay in a dimly lit, magnetically shielded room.
MEG data were recorded at 1000 Hz (200 Hz low-pass filter), and noise reduced by exploiting 8 magnetometer reference channels located away from the participants’ heads via the Continuously Adjusted Least-Squares Method in the MEG Laboratory software (Yokogawa Electric Corporation and Eagle Technology Corporation, Tokyo, Japan). The noise-reduced MEG recording, the digitized head shape, and the sensor locations were then imported into MNE-Python (Gramfort et al. 2014). Data were epoched from 100 ms before the beginning of the trial (i.e., the presentation of the visual stimulus) to 400 ms after production was allowed. To remove artifacts from our data, we applied an independent component analysis to our raw data and removed components corresponding to blinks, heartbeats, and motion artifacts. Subsequently, a strict artifact rejection routine used in previous MEG production studies (Pylkkänen et al. 2014; Blanco-Elorrieta and Pylkkänen 2015, 2016a, 2016b; Blanco-Elorrieta and Pylkkänen 2017; Blanco-Elorrieta, Emmorey, et al. 2018a) was followed to ensure that oral and manual artifacts would not contaminate our data. Specifically, we 1) rejected all individual epochs that contained amplitudes >2500 fT/cm for any sensor after noise reduction, 2) visualized all individual epochs before averaging and rejected any epoch that contained sudden increases in the magnitude of the signal caused by artifacts (e.g., muscular movements), and 3) applied a 40 Hz low-pass filter aimed at eliminating any remaining movement artifacts from our data, given that the gamma-frequency range (>40 Hz) is reportedly the one affected by muscle artifact contamination such as phasic contractions (Yuval-Greenberg and Deouell 2009; Gross et al. 2013). In addition, trials corresponding to behavioral errors were excluded from further analyses.
Neuromagnetic data were coregistered with the FreeSurfer average brain (CorTechs Labs Inc., La Jolla, CA and MGH/HMS/MIT Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA) by scaling the size of the average brain to fit the participant’s head shape, aligning the fiducial points, and conducting final manual adjustments to minimize the difference between the head shape and the FreeSurfer average skull. Next, an ico-4 source space was created, consisting of 2562 potential electrical sources per hemisphere. At each source, activity was computed for the forward solution with the Boundary Element Model method, which provides an estimate of each MEG sensor’s magnetic field in response to a current dipole at that source. Epochs were baseline corrected with the 100 ms prior to the presentation of the visual stimulus (i.e., when participants had finished their utterances and were awaiting the start of the next trial) and low-pass filtered at 40 Hz. During preprocessing, the data were downsampled by a factor of 5 to improve computational performance. The inverse solution was computed from the forward solution and the grand average activity across all trials, which determines the most likely distribution of neural activity. The resulting minimum norm estimates of neural activity (Hämäläinen and Ilmoniemi 1994) were transformed into normalized estimates of noise at each spatial location, resulting in statistical parametric maps (SPMs), which provide information about the statistical reliability of the estimated signal at each location in the map with millisecond accuracy. The SPMs were then converted to dynamic maps (dSPM). In order to quantify the spatial resolution of these maps, the point-spread function for different locations on the cortical surface was computed, which reflects the spatial blurring of the true activity patterns in the spatiotemporal maps, thus yielding estimates of brain electrical activity with the highest possible spatial and temporal accuracy (Dale et al. 2000). The inverse solution, applied to each trial, employed a fixed orientation of the dipole current, which estimates the source normal to the cortical surface and retains dipole orientation.
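Three of the epoch-level steps described above (amplitude-based rejection, baseline correction, and downsampling) are simple enough to sketch without any MEG software. The code below is a dependency-free illustration and not the MNE-Python pipeline that was actually used: an epoch is assumed to be a list of per-sensor sample lists in fT/cm, the function names are hypothetical, and the independent component analysis, visual inspection, and 40 Hz low-pass filter are omitted (the filter is assumed to have been applied before decimation).

```python
from statistics import fmean

REJECT_THRESHOLD = 2500.0  # fT/cm, rejection criterion from the text
BASELINE_SAMPLES = 100     # 100 ms pre-stimulus baseline at 1000 Hz
DECIMATION = 5             # downsampling factor from the text


def exceeds_threshold(epoch, threshold=REJECT_THRESHOLD):
    """True if any sample on any sensor is larger in magnitude than the threshold."""
    return any(abs(value) > threshold for sensor in epoch for value in sensor)


def baseline_correct(epoch, n_baseline=BASELINE_SAMPLES):
    """Subtract each sensor's mean over the first n_baseline samples."""
    corrected = []
    for sensor in epoch:
        offset = fmean(sensor[:n_baseline])
        corrected.append([value - offset for value in sensor])
    return corrected


def decimate(epoch, factor=DECIMATION):
    """Keep every factor-th sample (assumes the data are already low-pass filtered)."""
    return [sensor[::factor] for sensor in epoch]


def preprocess(epochs):
    """Drop epochs that exceed the threshold, then baseline-correct and decimate."""
    kept = [epoch for epoch in epochs if not exceeds_threshold(epoch)]
    return [decimate(baseline_correct(epoch)) for epoch in kept]
```

Applied to a list of epochs, `preprocess` returns only those epochs that stay within the amplitude limit, each shifted so that its baseline mean is zero and reduced to one fifth of its original number of samples.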
Statistical Analysis
Behavioral Analysis
Trials corresponding to erroneous responses (incorrect naming, verbal disfluencies, nonresponses) were excluded from MEG analysis, resulting in the exclusion of 6% of trials per participant (SD = 6%). Unfortunately, our microphone was hypersensitive to noise in the testing environment, and we could not certify that reaction times in fact indicated the initiation of oral responses as opposed to background noise. Each verbal response was saved as an individual file that contained a single production from voice onset onward, making the reconstruction of reaction time impossible. For this reason, reaction times were not analyzed in the current experiment.
Source Localization Analyses
We analyzed source-localized current estimates using nonparametric spatiotemporal cluster tests across the left and right frontal and temporal lobes, as defined in the PALS_B12_Lobes parcellation (https://surfer.nmr.mgh.harvard.edu/fswiki/PALS_B12), in the time window extending from 100 to 400 ms after the offset of the auditory cue (i.e., beginning 100 ms after participants were allowed to begin their responses). Given our inability to measure response times, deciding the epoch duration posed a challenge for us, as we wanted to ensure that the data included in the analysis did not overlap with the onset of articulation. In previous MEG studies, average productions have started well over 600 ms post-stimulus (e.g., Blanco-Elorrieta and Pylkkänen 2016a, 2016b; Pylkkänen et al. 2014), including in paradigms in which participants were primed and would thus respond faster (Blanco-Elorrieta, Ferreira, et al. 2018b). Although motion artifacts were removed from our data using the strict artifact rejection method described above, we took a conservative approach and limited our epochs to a length of 400 ms to ensure that our epochs did not overlap with participant vocalizations. In total, we ran 4 statistical tests: 2 tests targeting inflection (one in which conditions were defined based on whether there was phonological modification to the visual stimulus: Inflect-Modify, Inflect-Non-Modify, and Repeat; and one in which the conditions were defined based on the presence of -s in the inflected word: Inflect-Overt, Inflect-Covert, and Repeat) and 2 tests targeting syntactic context (Noun-Context vs. Verb-Context across all trials and across only those trials containing syntactically ambiguous stems). For each statistical test, a map of F values was computed over sources and milliseconds. These maps were thresholded at a value equivalent to P = 0.05 (uncorrected); then, clusters were computed from adjacent values in space and time that surpassed our cutoff threshold. If a cluster consisted of a minimum of 10 vertices and lasted for at least 25 ms, the F values within this cluster were summed, resulting in a cluster-level statistic. We then permuted the data 10 000 times within the same spatiotemporal dimensions. Each permutation involved shuffling condition labels at random and recomputing the cluster statistics of the permuted data; the maximum cluster-level statistic from each permutation was retained to form a null distribution against which the observed clusters were evaluated (Maris and Oostenveld 2007). Pairwise differences between conditions within the clusters were computed using paired sample t-tests and corrected with false discovery rate over these tests. Since the temporal clusters initially chosen for further analysis are uncorrected, the borders of the clusters should be interpreted as approximate. We therefore cannot make claims about the exact latency or duration of any effects (see Sassenhagen and Draschkow 2019). Although we conducted these analyses in both the left and right hemispheres, the right hemisphere failed to yield reliable clusters.
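The logic of the cluster-based permutation procedure can be sketched on a deliberately reduced problem. The code below is a simplification and not the analysis reported here: it handles a single time course rather than sources by time, and 2 paired conditions rather than 3, so that F reduces to the squared paired t value and shuffling labels within a participant reduces to flipping the sign of that participant's difference wave. All function names and the toy data are hypothetical. The threshold of 4.84 approximates the F value at P = 0.05 for 12 simulated participants, and the 5-sample minimum length stands in for 25 ms at 200 Hz (1000 Hz downsampled by 5).

```python
import math
import random
from statistics import fmean, stdev


def pointwise_f(diffs):
    """F = t**2 of a one-sample t-test on the condition differences, per time point."""
    n = len(diffs)
    f_values = []
    for column in zip(*diffs):  # one column = all participants at one time point
        mean, sd = fmean(column), stdev(column)
        if sd == 0:
            f_values.append(0.0 if mean == 0 else math.inf)
        else:
            f_values.append((mean / (sd / math.sqrt(n))) ** 2)
    return f_values


def cluster_sums(f_values, threshold, min_length=1):
    """Sum F over each run of adjacent supra-threshold points of sufficient length."""
    sums, run = [], []
    for f in list(f_values) + [-math.inf]:  # sentinel closes a trailing run
        if f > threshold:
            run.append(f)
        else:
            if len(run) >= min_length:
                sums.append(sum(run))
            run = []
    return sums


def cluster_permutation_test(diffs, threshold, min_length=1,
                             n_permutations=1000, seed=0):
    """Return (cluster statistic, p-value) pairs against the max-statistic null."""
    rng = random.Random(seed)
    observed = cluster_sums(pointwise_f(diffs), threshold, min_length)
    null_max = []
    for _ in range(n_permutations):
        flipped = []
        for row in diffs:
            # Swapping the 2 condition labels flips the sign of the difference.
            sign = rng.choice((1, -1))
            flipped.append([sign * value for value in row])
        sums = cluster_sums(pointwise_f(flipped), threshold, min_length)
        null_max.append(max(sums, default=0.0))
    return [(s, sum(m >= s for m in null_max) / n_permutations) for s in observed]


# Toy data: 12 participants, 40 time points, a true difference at points 15-24.
toy_rng = random.Random(1)
toy_diffs = [
    [toy_rng.gauss(1.0 if 15 <= t < 25 else 0.0, 0.5) for t in range(40)]
    for _ in range(12)
]
results = cluster_permutation_test(toy_diffs, threshold=4.84, min_length=5)
```

Because each observed cluster is compared with the distribution of the largest cluster found in each permutation, the familywise error rate is controlled at the cluster level, which is also why the temporal borders of a cluster remain only approximate.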
Granger Causality
We used Wiener–Granger causality (G-causality; Granger 1969; Geweke 1982) to identify causal connectivity between different regions of interest in the MEG time series data.Past work on the processing of inflectional morphemes in both patient and healthy populations has consistently implicated left inferior frontal (Tyler et al. 2004; Sahin et al. 2009; Newman et al. 2010) and temporoparietal regions (Marslen-Wilson and Tyler 1998, 2007; Lee et al. 2018; see Ullman et al. 2005 for a discussion). To characterize the exchange of information between these regions in our set of participants, we ran connectivity analyses between individual Brodmann areas that constitute the left angular and supramarginal gyri (BAs 39, 40) and the left inferior frontal gyrus (BAs 44, 45, 47) as well as the left anterior temporal lobe (BAs 20, 21, 38). In response to the open question regarding whether the temporoparietal area identified in Lee et al. (2018) is selectively engaged during production (Fedorenko et al. 2018), we were especially interested in the nature of the connectivity patterns between this left posterior area and anterior temporal regions involved in semantic representation described in past studies of production (e.g., Fonseca et al. 2009; Schwartz et al. 2009; Mesulam et al. 2013).The analysis was conducted using the Multivariate Granger Causality Matlab Toolbox (Barnett and Seth 2014). The input to this analysis was the time course of activity averaged over all the sources in each Brodmann area of interest from 0 to 400 ms after the offset of the auditory cue. Brodmann areas 44, 45, and 47 were collapsed into a single label due to the small number of sources in each individual label. The details of the statistical procedure are laid out in Barnett and Seth (2014). 
Briefly, we fit a vector autoregressive model to our time series data, which allowed us to assess, throughout the 0–400 ms time window, whether neural activity from one area A at previous points in time helped predict the activity from another area B at later timepoints beyond the degree to which B was predicted by its own past. When the past of A conveys information about the future of B above what is contained in B’s own past, we can say that A “G-causes” B (Bressler and Seth 2011). Since this notion of causality may differ from more intuitive definitions, we use the terms “connectivity” and “information flow” in place of causality. Pairwise significance in Granger causality values was corrected across all regions using FDR (Benjamini and Hochberg 1995) at an alpha value of P = 0.05.
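The core comparison behind G-causality can be sketched as follows. This is a simplified, pairwise illustration and not the procedure implemented in the Matlab toolbox cited above, which estimates conditional G-causality across multiple regions. It assumes two time series and a fixed model order, fits a restricted model (B predicted from its own past) and a full model (B predicted from its own past plus the past of A) by ordinary least squares, and reports the log ratio of their residual sums of squares. The function names and the simulated data are hypothetical.

```python
import math
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations, Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((y - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, y in zip(rows, targets))


def granger(source, target, lag=1):
    """ln(RSS_restricted / RSS_full): gain in predicting target from source's past."""
    restricted, full, y = [], [], []
    for t in range(lag, len(target)):
        own = [target[t - k] for k in range(1, lag + 1)]
        other = [source[t - k] for k in range(1, lag + 1)]
        restricted.append([1.0] + own)        # intercept + target's own past
        full.append([1.0] + own + other)      # ... plus the source's past
        y.append(target[t])
    return math.log(ols_rss(restricted, y) / ols_rss(full, y))


def simulate(n=500, seed=0):
    """Toy data in which the past of a drives b, but not the reverse."""
    rng = random.Random(seed)
    a, b = [0.0], [0.0]
    for _ in range(n):
        a.append(0.5 * a[-1] + rng.gauss(0, 1))
        b.append(0.5 * b[-1] + 0.8 * a[-2] + rng.gauss(0, 1))
    return a, b


a, b = simulate()
print("A -> B:", round(granger(a, b), 3))
print("B -> A:", round(granger(b, a), 3))
```

In the simulated data the A → B value should be clearly larger than the B → A value, which mirrors the asymmetry that a directed connection such as BA 38 → BA 40 expresses.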
Representational Similarity Analysis
Our design enabled us not only to search for a neural correlate of inflection that is isolable from other linguistic properties (i.e., semantic type, syntactic context, phonological form) but also to examine interactions between these properties. In response to a longstanding debate concerning the relationship between differences in noun versus verb processing and trends in meaning (e.g., actions vs. objects) (e.g., Bird et al. 2000; Bi et al. 2005; Bedny and Caramazza 2011; Rodriguez-Ferreiro et al. 2011; Peelen et al. 2012; Moseley and Pulvermüller 2014), we investigated whether distinct or shared networks tracked differences in semantic type (Abstract Cognition, Manipulable, Non-Manipulable) and syntactic context (Noun-Context, Verb-Context) during production planning, the latter of which would suggest that these dimensions tap into similar underlying representations. The characterization of distributed patterns of neural activity associated with the properties of interest in our stimuli was achieved via a spatiotemporal searchlight RSA (Kriegeskorte et al. 2008; Su et al. 2012). Differences in semantic type and syntactic context were encoded in model representational dissimilarity matrices (model RDMs). In these matrices, pairwise correlation distance (1-correlation; Kriegeskorte et al. 2008) coded for the similarity between any 2 elements across either the semantic or the syntactic dimensions (i.e., a dissimilarity value of 0 indicated 2 elements belonged to the same category and 1 indicated they were not members of the same category). MEG data (neural) RDMs coded for the neural similarity between the same pairs of conditions and were constructed as follows (for a schematic depiction of the following description, see Fig. 2):
Figure 2
Spatiotemporal searchlight RSA. The numbered steps displayed in the figure correspond with the steps described in section Representational Similarity Analysis. For the purposes of this example, we depict the RSA pipeline using mock data for Noun-Context and Verb-Context from only one spatial label (“Label 1”) across three 50-ms sliding time windows (shaded in purple; see Step 2). Note: “E.D.” stands for Euclidean Distance.
1. A spatial search space of 10 mm in diameter at each source in the left hemisphere was selected.
2. We extracted the evoked data averaged across participants for each condition and averaged these data across sources in the spatial search space. For this analysis, we excluded all data from the Repeat condition, as it is possible that participants held variable semantic and syntactic interpretations of the Ambiguous stems produced after the context-free “Say” prompt. We then divided this averaged response into 50 ms temporal windows centered on each millisecond in the 0–400 ms epoch.
3. We calculated the dissimilarity between the activity patterns associated with any pair of conditions based on this single evoked response for each condition per temporal window. As in the model RDMs, we used correlation distance (1-correlation) as the dissimilarity measure.
4. Next, we used these dissimilarity values to construct a neural similarity matrix (neural RDM) for each temporal window. This matrix thus encoded the similarity between neural activity associated with any 2 conditions.
5. Once both the model RDM and the neural RDM were constructed, we ran a Spearman correlation between our model RDM and our neural RDMs to establish a time course of rho values representing the level of similarity between the model and neural RDMs at each millisecond in the 0–400 ms epoch.
6. Only significant rho values (P < 0.05) that extended for at least 25 consecutive milliseconds within a given searchlight region (“temporal clusters”) were extracted for further analysis.
P-values for each time point were calculated and FDR corrected (Benjamini and Yekutieli 2001). Once this process concluded, we moved the searchlight to the next source and began the process all over again, until we covered every source in the frontal, parietal, temporal, and occipital lobes as defined in the PALS_12_Lobes parcellation (https://surfer.nmr.mgh.harvard.edu/fswiki/PALS_B12).
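The comparison between model and neural RDMs for a single searchlight and a single temporal window can be sketched as follows. This is an illustrative sketch rather than the study's pipeline: it assumes that each condition is represented by one short activity pattern, uses a categorical model RDM (0 for same category, 1 for different), correlation distance for the neural RDM, and a Spearman correlation over the upper triangles of the two matrices. Significance testing, the 25 ms duration criterion, and the movement of the searchlight across sources are omitted. All names and example values are hypothetical.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


def model_rdm(labels):
    """Upper triangle of a categorical model RDM: 0 = same category, 1 = different."""
    return [0.0 if a == b else 1.0 for a, b in combinations(labels, 2)]


def neural_rdm(patterns):
    """Upper triangle of a neural RDM using correlation distance (1 - Pearson r)."""
    return [1.0 - pearson(p, q) for p, q in combinations(patterns, 2)]


# Mock data: four conditions, two per syntactic context, five samples per pattern.
labels = ["noun", "noun", "verb", "verb"]
patterns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.0, 3.2, 3.9, 5.1],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [5.2, 3.9, 3.1, 2.0, 0.9],
]
rho = spearman(model_rdm(labels), neural_rdm(patterns))
print("model-neural rho:", round(rho, 3))
```

Repeating this computation for every temporal window yields the time course of rho values described in Step 5.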
Results
Evidence of Abstract Inflection in Left Frontotemporal Areas
This study set out to characterize the neural correlates of abstract inflection, or inflection independent of syntactic context and phonological form. Our analyses revealed a left frontotemporal area where increased activity was observed during conditions requiring inflection (335–400 ms; P = 0.02; Fig. 3). This effect was modulated neither by syntactic context nor by whether the produced word contained an -s or belonged to a particular semantic type.
Figure 3
Effects of morphological inflection. Panel (a) shows an effect of inflection regardless of phonological realization, syntactic context, and semantic type. (b) Top part shows an effect of inflection that requires phonological modification and bottom part compares these effects when an -s is added (P = 0.12, 185–270 ms) or removed (no clusters identified). On each panel, the FreeSurfer average brains on the left side illustrate the spatial distribution of the reliable cluster (every source that was part of the cluster at some point in time is color-coded with the sum F statistic). The waveform plots show the time-course of activity for the sources in the cluster, where participants are allowed to start planning their response at 0. In panels (a) and (b) (top part), the boxed regions indicate that the difference in activity between the tested conditions was significant at P = 0.05 (corrected). In panel (b, bottom part), the boxed regions highlight the waveforms during the exact time window of the significant effect in (b) (180–305 ms). Significance was determined using a nonparametric permutation test (Maris and Oostenveld 2007) performed from 100 to 400 ms (10 000 permutations). The bar graphs on the right side illustrate the average activity per condition for the sources and time points that constitute the cluster.
In addition to identifying an effect of abstract inflection, our analyses revealed an earlier effect that was unique to the Inflect-Modify condition; that is, the trials in which participants produced an inflected form that required a phonological modification to the visual stimulus (180–305 ms; P = 0.03; Fig. 3). Importantly, the Inflect-Modify condition included inflection cases that required participants to either eliminate the suffix -s (e.g., see “dreams,” Hear “I,” Produce “dream”) or add -s (e.g., see “dream,” Hear “He,” Produce “dreams”).
This prompted us to ask whether the effect of phonological modification in inflection was driven by the modification of the visual stimulus in general or by one particular direction of these modifications. Subsequent analysis suggested that this effect was in fact driven by cases in which participants added -s (Fig. 3, lower panel).
It is worth considering the role of task difficulty with regard to differences between the Inflect and Repeat conditions, as Repeat might be easier for participants. If this were the case, we would expect to observe either that the Repeat condition would elicit lower activity throughout the epoch or that Repeat would elicit lower activity at a time window that would remain constant across analyses (i.e., the time period during which processing has concluded for Repeat but not for the 2 inflection conditions). However, our results show that neither of these possibilities holds true, as decreased activation for Repeat is observed across both effects, which span different temporal windows.
Having identified that the effect of Inflect-Modify was driven specifically by the addition of the -s (Fig. 3, lower panel), we asked whether this finding could be considered an effect of modifying the visual stimulus by adding -s (e.g., see “dream,” produce “dreams”) or instead a more general effect of producing an inflected word containing -s (see “dream,” produce “dreams” as well as see “dreams,” produce “dreams”). To assess the latter, we performed a subsequent analysis in which we organized our inflection conditions not by phonological modification, but by “overtness” of the inflection realization (whether or not inflected utterances contained an -s, regardless of whether a modification was applied to the visual stimulus; e.g., Sahin et al. 2006, 2009).
Our analysis specifically compared neural activity from 3 conditions: “Inflect-Overt” (-s is present in the inflected word), “Inflect-Covert” (no -s is present in the inflected word), and Repeat (no inflection is performed). The distinction between the Inflect-Overt and Inflect-Covert conditions is consistent with the concept of “null/zero” versus “realized” inflection in theoretical linguistics.
Ultimately, no clusters corresponding to increases for the Inflect-Overt condition over the Inflect-Covert and Repeat conditions were observed. We therefore failed to replicate the distinction between overt and covert inflection (i.e., inflection with or without a phonological marker such as -s) that has previously aligned with processing differences on the neural level (e.g., Sahin et al. 2006, 2009). However, similar to the effect of abstract inflection displayed in Figure 3, a cluster that showed increased activity for the Inflect-Overt and Inflect-Covert conditions over Repeat did emerge (note that this cluster did not reach significance at 10 000 permutations; P = 0.18).
Since our univariate analyses did not capture increased activity during overt inflection, we considered the possibility that the flow of information between neural regions, as opposed to differences in the magnitude of localized activation, may best capture differences between overt and covert inflection. To test this possibility, we used analyses of Granger causality, which evaluate the flow of activation between regions. Following recent ECoG work (Lee et al. 2018) demonstrating the involvement of temporoparietal areas during both covert and overt inflection in production, our analyses examined the flow of neural activation between temporoparietal and inferior frontal regions. Additionally, we included in our regions of interest anterior temporal regions implicated in conceptual retrieval during word production (Fonseca et al. 2009; Schwartz et al. 2009; Mesulam et al. 2013) that also demonstrated sensitivity to abstract inflection in our univariate analyses. These analyses revealed a significant (P = 0.002) flow of activity between the temporal pole and the supramarginal gyrus (BA 38 → BA 40) during production planning that was specific to the Inflect-Overt condition (Fig. 4). We also identified significant connectivity between adjacent regions (BA 38 → BA 20 and 21; BA 39 → BA 40); however, these patterns are of less theoretical value, as communication between adjacent regions should occur even in the absence of experimental manipulations.
Figure 4
Pairwise Granger causality. Panel (a) shows the locations of the reliable connections on the FreeSurfer average brain. Panel (b) shows pairwise conditional Granger causality (Granger 1969; Geweke 1982) across all ROIs for trials orthogonalized by inflectional overtness (Inflect-Overt, Inflect-Covert, Repeat). Brodmann areas are listed by number, and the ROI corresponding to Broca’s area is abbreviated by “BR.” Connections significant at P < 0.05 are highlighted in yellow/teal.
Increased Activity for Verbs in Context over Nouns in Context in the Temporal Lobe, Independent of Semantic Properties
In addition to identifying effects of abstract morphological inflection and phonological modification in inflection, we observed differences in the neural profiles associated with production of verbs versus nouns. Consistent with prior neuroimaging literature comparing noun and verb processing (e.g., Davis et al. 2004; Bedny and Thompson-Schill 2006; Bedny et al. 2008; see Vigliocco et al. 2011, for a review), we found that Verb-Context trials elicited increased activity as compared to Noun-Context trials across a large subset of the frontal and temporal lobes at approximately 320–400 ms after the offset of the auditory cue (P = 0.03, Fig. 5). We considered 2 interpretations that may explain this profile of activity. First, the increased activity for Verb-Context trials could indicate that the stored representations of Verb-Only stems were more salient due to the semantic properties associated with them. Alternatively, increased activity for verbs may reflect the increased syntactic complexity associated with engaging with verbs within syntactic contexts (Tyler et al. 2004), especially considering that the phrases completed in our Verb-Context trials constituted full sentences whereas Noun-Context trials involved completing noun phrases.
Figure 5
Effects of syntactic context. Panel (a) displays the spatial distribution of verb > noun effects and panels (b) and (c) display corresponding waveforms. The effects of syntactic context are collapsed across semantic type. The FreeSurfer average brain in (a) illustrates the spatial distribution of the cluster within all stems and within Ambiguous stems, and the overlap between these clusters. On the waveform plots in (b) and (c), we show the time course of activity for the sources in the cluster, where participants could begin producing their responses at 0. The boxed regions indicate that the difference in activity between the tested conditions was significant at P = 0.03 (b) or marginally significant at P = 0.07 (c) (corrected). Significance was determined using a nonparametric permutation test (Maris and Oostenveld 2007) performed from 100 to 400 ms (10 000 permutations). The bar graphs on the right side illustrate the average activity per condition for the sources and time points that constitute the cluster.
To evaluate these hypotheses, we ran 2 complementary analyses, 1 on Ambiguous stems (i.e., stems such as “dream” that are used as nouns or verbs) and 1 on the Repeat trials. If inherent properties of the Verb-Only stems explained the increase in activity we observed during Verb-Context trials, no differences should emerge when Ambiguous stems are used as either verbs or nouns in our experiment.
If, however, the Verb-Context over Noun-Context effect reflected the increased complexity of syntactic processes associated with engaging verbs in context, we should observe a similar increase for Verb-Context trials even when considering stems that do not belong firmly in a syntactic category. Our results support the latter: when looking exclusively at Ambiguous items, we found a spatially and temporally overlapping cluster where Verb-Context trials elicited increased activation as compared to Noun-Context trials (though the effect is marginal; 330–400 ms; P = 0.07; Fig. 5).
This finding was confirmed by our analysis of Repeat trials. Since the “Say” auditory cue provides no information regarding syntactic context, an increase for Verb-Context trials within the Repeat condition would challenge the argument that differences in verb versus noun processing arise as a function of engaging verbs within a syntactic context. Consistent with our finding on Ambiguous stems, however, no clusters were identified when comparing the Noun-Only and Verb-Only Repeat trials. In line with the neuropsychological literature (Shapiro et al. 2000; Shapiro and Caramazza 2003a; Laiacona and Caramazza 2004; see Shapiro and Caramazza 2003b for a review), our work therefore points to the grammatical basis of differences in neural activity observed during verb versus noun processing.
Distributed Representations of Syntactic and Semantic Properties
A secondary aim of the present study involved investigating distinctions in the way grammatical categories (specifically, nouns and verbs) are represented. While past work has addressed differences in noun and verb representation (e.g., Shapiro and Caramazza 2003a, 2003b; Bedny et al. 2008; Peelen et al. 2012; Matchin et al. 2019), it is often the case that the semantic content of verb and noun stimuli is not controlled for, introducing a potential confound (Vigliocco et al. 2011). In other words, differences in noun and verb representation may often reflect a semantic contrast (object vs. action) rather than a syntactic one (noun vs. verb; see Vigliocco et al. 2011; Moseley and Pulvermüller 2014). Indeed, studies of naming deficits in aphasic patients suggest that semantic information about objects is stored in parts of the middle and inferior temporal lobe (Damasio and Tranel 1993), whereas semantic information about actions is housed in other frontal cortical structures (Silveri and Di Betta 1997; Tranel et al. 2001) that overlap with the spatial dissociation proposed for noun and verb processing (Shapiro and Caramazza 2003a, 2003b; Vigliocco et al. 2011; Kemmerer 2014; also with MEG, e.g., Liljeström et al. 2008; Tsigka et al. 2014). By including as stimuli verb and noun stems that belong to the same semantic categories (i.e., Abstract-Cognition, Manipulable, and Non-Manipulable), we orthogonalized semantic content and grammatical class so that any observed syntactic effects would not be confounded by semantic properties.
In fact, our univariate analyses revealed an increase for verb over noun processing that spanned the left frontal and temporal lobes and cut across the 3 semantic conditions, suggesting that this difference is one of grammatical category, independent of semantic content.
This result is further supported by an effect of syntactic context that emerged even for Ambiguous stems: Not only were the semantic categories of Ambiguous items kept constant, but also the exact same word form was used in both Verb-Context and Noun-Context trials.
Additionally, following recent proposals suggesting that semantic knowledge may be represented in distributed patterns of neural activity (e.g., Binder and Desai 2011), we turned to finer-grained multivariate analyses that could detect distributed patterns that are potentially invisible to analyses measuring the magnitude of activity in a specific location. The purpose of this analysis was 2-fold. First, we were interested in whether a measure more sensitive than the univariate analyses could identify brain regions that track differences in semantic content. Second, we wanted to assess whether such regions would overlap with the network that distinguishes between syntactic properties, under the assumption that an overlap would lend support to theories attributing the source of syntactic differences to the underlying semantic properties of each word category (Moseley and Pulvermüller 2014; Vigliocco et al. 2011). To address these questions, we used RSA (Kriegeskorte et al. 2008), whereby we coded for either syntactic context or semantic type in RDMs and correlated those RDMs with patterns of brain activity observed across the left hemisphere (Fig. 2). This analysis allowed us to search for qualitative similarities in the spatial representation of syntactic context and semantic type, despite the fact that differences in activation were not observed.
We ran 2 independent spatiotemporal searchlight RSAs (Su et al. 2012) for syntactic context and semantic type to characterize the representation of each of these features (Fig. 2).
Our analyses revealed that syntactic context was primarily represented in the middle and posterior temporal lobe toward the end of the 0–400 ms window, similar to the spatial location and latency of the univariate Verb-Context over Noun-Context effect displayed in Fig. 5. Semantic type was primarily represented in inferior and middle frontal regions toward the start of the 0–400 ms window. Visually, this seemed to point to minimal overlap in the areas that encode both semantic and syntactic features. A quantitative analysis confirmed that the overlap was in fact restricted (76 of 887 total significant labels overlapped at any point in time in the 0–400 ms window, Jaccard similarity index = 0.09, which indicates limited overlap between the regions). To summarize the overlap between regions sensitive to differences in syntactic context versus semantic type at a more precise temporal level, we divided the analysis window into 4 sections of 100 ms each. The results of this analysis confirmed that, in fact, the overlap between the regions that tracked semantic and syntactic information was near zero (0–100 ms = 1 overlapping label; 100–200 ms = 6 overlapping labels; 200–300 ms = 7; 300–400 ms = 17).
In sum, contra some previous proposals (e.g., Pulvermüller 1999; Bird et al. 2000; Moseley and Pulvermüller 2014), we found no quantitative or qualitative evidence that the effects of syntactic context can be explained by the semantic properties that correlate with them. Instead, it appears that the observed increases for Verb-Context over Noun-Context items are a product of the generative process associated with embedding them in a particular syntactic context and that the semantic properties of these items are represented in independent networks.
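For reference, the Jaccard similarity index reported above is the size of the intersection of two sets divided by the size of their union. The short sketch below reproduces the reported value under two assumptions: that the 887 significant labels correspond to the union of the two sets, and that the label identities shown are hypothetical placeholders, since the text does not specify which labels belong to each set.

```python
def jaccard(a, b):
    """Jaccard similarity index: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical label indices chosen only so that 76 labels are shared
# and 887 labels are significant for at least one model.
syntactic_labels = range(0, 500)
semantic_labels = range(424, 887)

index = jaccard(syntactic_labels, semantic_labels)
print(round(index, 2))
```

With 76 shared labels out of 887, the index is 76/887, or approximately 0.09, matching the value reported for the full 0–400 ms window.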
Discussion
Although the distinctions between different levels of linguistic processing have long been formally described, the neural substrates supporting these levels have remained elusive. The task of characterizing the neural underpinnings of inflection, a grammatical process that can occur without any overt realization, has proven particularly difficult. In this study, we used a seminaturalistic production task and a set of orthogonalized stimuli to provide a characterization of this process in relation to the neural representation of syntactic context and semantic type. Our results suggest that verbs and nouns are inflected by a common mechanism that is unaffected by both the syntactic properties of the input (i.e., grammatical context) and the phonological properties of the output (i.e., overt or covert realization). Additionally, these data identify differences between noun and verb processing that appear independent of semantic properties.
Our study succeeded in isolating a neural effect of abstract inflection, which emerged in frontotemporal regions approximately 335 ms after participants were provided with the grammatical context of the target utterance. The emergence of an effect of abstract inflection is consistent with Chomsky’s (1955) conception of morphology (cf. Seidenberg and Gonnerman 2000), which proposes that morphemes are selected on an abstract level before phonological information is spelled out. In our results, both conditions requiring morphological inflection, including conditions in which inflection was not realized phonologically (e.g., “I dream”), dissociated from the baseline Repeat condition. This experiment therefore offers empirical support for linguistic theory.
Anatomically, the location of this effect overlaps ventrally with the LIFG (Brodmann area 47), which has been implicated in inflectional processing across methodologies (e.g., Tyler et al. 2002, 2004; Ullman et al. 2005; Sahin et al. 2009).
However, the effect is housed primarily in the left anterior temporal lobe and inferior prefrontal cortex. This finding was unexpected based on the previous literature, leaving us with 2 interpretations to consider. First, it is possible that our design allowed us to uncover previously undetected loci of inflectional processing. Alternatively, given that the localization of MEG data is least accurate in the inferior frontal and superior temporal cortex due to high crosstalk in these regions (Liu et al. 2002; Hauk et al. 2011), it is possible that the results observed in the inferior regions reflect an artifact of the source localization process. Localization concerns do not bear on the fact that our abstract inflection effect maps onto linguistic theory, but we cannot draw strong conclusions here about the relationship between function and anatomy. To address this issue, future studies should adopt the practice of co-registering MEG source estimates with participants’ structural magnetic resonance images (MRIs) to improve localization accuracy (Larson et al. 2014; Ahlfors and Mody 2019). In our study, we were unfortunately unable to include information about participants’ structural MRIs.
Additionally, we identified an effect of phonological modification to the visual stimulus, which emerged earlier (~180 to 305 ms) and which appears to be driven by conditions that involved the addition of an -s to the visual stimulus (see “dream,” produce “dreams”), as opposed to removal of -s (see “dreams,” produce “dream”). The location of this effect was similar to the effect of abstract inflection, both of which spanned the left inferior frontal gyrus (LIFG) and portions of the anterior temporal lobe (LATL). As in Sahin et al. (2006, 2009), we report the involvement of the LIFG during both abstract inflection and a particular subcategory of inflection (in our case, the condition in which the visual stimulus was phonologically modified via the addition of -s).
Despite limitations involved in localizing MEG data, our work still offers tentative evidence from healthy participants in support of the LIFG’s malleable role in the production of inflected forms.
The fact that the effect of phonological modification in inflection emerged earlier than the abstract inflection effect was unexpected. Classic models of production such as Indefrey’s (2011) object naming model propose that language users first activate structural properties of a word before selecting its phonology when planning an utterance, suggesting that the need for inflection would be determined before the addition of -s. Although one might be tempted to hypothesize that this finding challenges these models, we recognize that the timing of this effect may simply reflect the properties of the trial structure in our task. In this experiment, participants knew whether the form to be produced would or would not contain an -s as soon as the cue was heard but before they actually began planning the production of the response. Thus, since it is possible that the timing of the earlier phonological modification effect points to the receipt of the cue information as opposed to the true timing of phonological encoding during production, we will refrain from interpreting the timing of this effect further.
Although the LATL was not included in the regions profiled in past ECoG work on morphological inflection, the proximity of both effects of inflection to the LATL deserves discussion. An established literature has implicated the LATL in basic combinatory processing (see Pylkkänen 2019 for a review). Beyond the design justification described in Materials and Methods, we provide 2 additional arguments against the hypothesis that the LATL involvement in our experiment reflects semantic composition.
First, we observed increased LATL activity related to only one kind of inflection (i.e., in which participants modified the phonological form of the visual stimulus) over another (i.e., in which participants’ utterances shared the same phonological form as the preceding visual stimulus). There is no reason to believe that utterances requiring a phonological modification to a task-related context word would compose any differently than utterances that match this context. Second, our baseline condition can be construed to constitute a combinatory context, as the phrase, “Say: ‘dreams’” is likely composed as an imperative sentence to some extent. Thus, we speculate that the activation of the LATL may instead reflect its general role of linking concepts to words during production (Schwartz et al. 2009). To compare subtypes of inflection using a more sensitive measure of neural activity, we conducted analyses of Granger causality to characterize information flow between the LIFG, the LATL, and temporoparietal regions that included the pSTG (Lee et al. 2018). Connectivity between the temporal lobe (BA 38) and a temporoparietal region (BA 40) emerged only when participants produced a word ending with -s within a syntactic context (Inflect-Overt). The existence of a communicative pathway between these regions may explain the pattern of results observed in Lee et al. (2018), in which direct stimulation to the pSTG resulted in inflectional errors. It is worth noting that these errors spanned conditions in which inflection was phonologically realized and unrealized, whereas the connection observed between BA 38 and BA 40 during our phonologically unrealized (Inflect-Covert) condition was not significant (P = 0.76). However, Lee et al. (2018) focused exclusively on the production of inflected verbs, which may complicate a direct comparison to our study of both verbs and nouns. Taken together, the work of Lee et al. (2018) and our own diverge from the theory put forth by Ullman et al. (2005), which holds that temporoparietal and frontal regions comprise distinct morphological processing networks for stored (irregular) versus computed (regular) verb forms. In Lee et al. (2018) and the present work, a brain network involving the temporoparietal area seems to meaningfully contribute to the computation of regular verb inflection.
Finally, in direct response to Fedorenko et al.’s (2018) call for further research on the functional role of the pSTG, our analysis of Granger causality offers evidence in favor of its involvement in production. However, we join Fedorenko et al. (2018) in urging future work to elucidate the potential “selectivity” of this area to morphological inflection in production, which would require a comparison to inflectional processing during comprehension that our design does not afford.
Results from spatiotemporal searchlight RSA run on models of syntactic context and semantic type. The 2 leftmost FreeSurfer average brains display searchlight regions where the correlation between the observed neural data and the given RDM was significant at P < 0.05 for at least 25 consecutive milliseconds. The timing of these correlations is depicted such that significant correlations occurring closer to the onset of the analysis window are plotted in lighter colors. The rightmost brain displays the regions where significant correlation to the syntactic context RDM, but not the semantic type RDM, was observed at any point between 0 and 400 ms.
In addition to investigating the neural basis of inflection, the present work suggests that producing verbs requires additional processing, even among words with flexible verb–noun membership. Specifically, our analysis revealed reliable increases for words produced in verb over noun contexts, even when controlling for correlational aspects of language that could confound the manipulation, including semantic type, inflectional status, and whether words were produced with or without an -s.
These effects emerged approximately 320 ms after grammatical context was provided and spanned large swaths of the left frontal and temporal lobes. Increased activity for Verb-Context over Noun-Context conditions is consistent with past neuroimaging work showing similar increases in left frontal and temporal regions (e.g., Bedny and Thompson-Schill 2006; Peelen et al. 2012; Matchin et al. 2019) and neuropsychological studies of double dissociations between noun and verb processing (e.g., Daniele et al. 1994; Bi et al. 2005; Kemmerer et al. 2012). However, in contrast with recent work suggesting the existence of a noun-selective brain network (Elli et al. 2019), we did not observe any increases for Noun-Context over Verb-Context trials in the moments leading up to production.Critically, a marginally significant signal increase for the production of verbs was observed even for words able to function as either nouns or verbs, such as “dream,” but no effects emerged when unambiguous noun and verb stems were produced in a condition that did not provide syntactic context (i.e., Repeat, where produced words did not acquire an active syntactic role). Together, our findings suggest that increased activity for verbs was not driven by the latent conceptual content of the items but rather by their syntactic realization as verbs (Shapiro and Caramazza 2003b; Tyler et al. 2004; Longe et al. 2007). In other words, we would expect to observe the same increase when a word that appears primarily or exclusively in noun contexts is used as a verb. Importantly, our work does not speak to the spatial segregation of areas devoted to either verb or noun processing (see Crepaldi et al. 2011 for a review), as we only found increased activity for verbs over nouns. It is certainly possible that the “verb region” identified in our analyses also becomes engaged during noun production, though to a lesser extent.The contrast between verbs and nouns was robust and may reflect several factors. 
For one, verbs are the center of a sentence and appear to activate more syntactic information than nouns do (Linzen et al. 2013; Sharpe et al. 2019). Differences between verbs and nouns are also evident during development: Children acquire the verb category more slowly than nouns, perhaps because verbs are more difficult to individuate (Gentner 1982). With regard to our paradigm, it is important to note that words produced in verb contexts completed a sentence (e.g., “He dreams”), while words produced in noun contexts (“Two dreams”) did not. In fact, recent MEG work on Standard Arabic showed increased activity in the left posterior temporal lobe for length- and lexically matched sentences versus noun phrases (Matar et al. 2021). This issue is unavoidable in our design, where we aimed to keep the produced utterances short, natural, and matched for length. One could argue, however, that all utterances produced in the Repeat condition also constituted (imperative) sentences such as “Say, ‘kneel’” and “Say, ‘kite’,” but we did not observe the same Verb-Context over Noun-Context effect among the Repeat trials. Even if the effects reported here could be partially explained by structural differences in the phrases produced in our Verb-Context versus Noun-Context trials, the observed signal increase for Verb-Context trials among category-ambiguous stems suggests that verb/noun differences emerge primarily as a function of grammatical context.A complicating factor in many past studies of noun and verb representation has been the inherent correlation of the noun–verb distinction with aspects of word meaning: Nouns tend to describe objects and verbs actions. We addressed this issue by orthogonalizing syntactic category and semantic type, such that all nouns and verbs were divided into 3 meaning categories. In contrast with embodiment accounts of lexical representation (e.g., Pulvermüller 1999; Pulvermüller et al. 
2005), which posit that sensorimotor experiences drive differences between verb and noun processing, our univariate analysis did not reveal any meaningful relationships between syntactic context and semantic content. These results were corroborated by a spatiotemporal searchlight RSA analysis, which revealed limited overlap between the regions representing syntactic versus semantic information and thus provided further evidence that the observed differences between verb and noun processing are due to syntactic factors that are independent of word meaning (Fig. 6).
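The logic of comparing a syntactic and a semantic model against neural data can be illustrated with a minimal sketch. It builds two binary model RDMs over a hypothetical 2 × 3 crossing of syntactic context and semantic type and correlates each with a data RDM computed from simulated patterns. Everything here is an assumption made for illustration: the placeholder labels (`sem_A`, `sem_B`, `sem_C`), the helper names `model_rdm` and `upper`, the binary dissimilarity coding, the simulated patterns (constructed to depend on syntactic context only), and the use of Pearson correlation. None of it reproduces the study's actual RDMs or pipeline.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Hypothetical fully crossed design: 2 syntactic contexts x 3 semantic types.
contexts = ["verb", "noun"]
sem_types = ["sem_A", "sem_B", "sem_C"]  # placeholder labels (assumption)
conditions = list(product(contexts, sem_types))

def model_rdm(labels):
    """Binary model RDM: 0 where two conditions share a label, 1 otherwise."""
    labels = np.asarray(labels)
    return (labels[:, None] != labels[None, :]).astype(float)

def upper(rdm):
    """Vectorize the upper triangle, excluding the diagonal."""
    return rdm[np.triu_indices_from(rdm, k=1)]

syntax_rdm = model_rdm([c for c, _ in conditions])
semantic_rdm = model_rdm([s for _, s in conditions])

# Simulated source patterns (assumption): driven by syntactic context only.
n_sources = 50
context_pattern = {c: rng.normal(size=n_sources) for c in contexts}
patterns = np.array([
    context_pattern[c] + 0.1 * rng.normal(size=n_sources)
    for c, _ in conditions
])

# Data RDM: correlation distance between condition patterns.
data_rdm = 1.0 - np.corrcoef(patterns)

# Pearson correlation is used for brevity; rank correlations are also common in RSA.
r_syntax = np.corrcoef(upper(data_rdm), upper(syntax_rdm))[0, 1]
r_semantic = np.corrcoef(upper(data_rdm), upper(semantic_rdm))[0, 1]
print(f"syntax model r = {r_syntax:.2f}, semantic model r = {r_semantic:.2f}")
```

Because the simulated patterns carry no semantic-type information, the data RDM tracks the syntactic model far more closely than the semantic one. In a searchlight analysis the same comparison would be repeated for each spatial neighborhood and time window.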
Figure 6
Results from spatiotemporal searchlight RSA run on models of syntactic context and semantic type. The 2 leftmost FreeSurfer average brains display searchlight regions where the correlation between the observed neural data and the given RDM was significant at P < 0.05 for at least 25 consecutive milliseconds. The timing of these correlations is depicted such that significant correlations occurring closer to the onset of the analysis window are plotted in lighter colors. The rightmost brain displays the regions where significant correlation to the syntactic context RDM, but not the semantic type RDM, was observed at any point between 0 and 400 ms.
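The temporal criterion in the caption (significance at P < 0.05 sustained for at least 25 consecutive milliseconds) can be sketched as a simple run-length filter. The sketch assumes one p-value per time sample at a hypothetical 1 ms resolution, so that 25 samples span 25 ms. The function name `sustained_significance` and the toy p-values are illustrative and are not taken from the study's analysis code.

```python
import numpy as np

def sustained_significance(p_values, alpha=0.05, min_samples=25):
    """Return (start, stop) index pairs (stop exclusive) of runs where
    p < alpha holds for at least `min_samples` consecutive samples."""
    sig = np.asarray(p_values) < alpha
    # Pad with False so every run has a detectable rising and falling edge.
    padded = np.concatenate(([False], sig, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(int(s), int(e)) for s, e in zip(starts, stops) if e - s >= min_samples]

# Toy time course over 0-400 ms at an assumed 1 ms resolution.
p = np.full(400, 0.5)
p[100:110] = 0.01   # 10 ms run: too short, discarded
p[320:360] = 0.01   # 40 ms run: retained
print(sustained_significance(p))  # [(320, 360)]
```

At a different sampling rate, `min_samples` would need to be rescaled so that the run still covers 25 ms.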
Although Fedorenko et al. (2020) did not find evidence for syntax selectivity in comprehension, they entertain the possibility that stages of linguistic processing might separate more easily during production as a reflection of the increased need to “[linearize] words, morphemes, and sounds” (p. 104348). It would seem that our results support that proposal, as they point to the existence of partially separable regions that differentiate between syntactic versus semantic properties during production.
Conclusion
This study isolated the neural processes and circuits associated with inflection during production. Our findings point to the existence of pure inflection, a process whose computations are housed in a left frontotemporal area. Importantly, this profile of activity is modulated by neither syntactic context nor the morphophonological makeup of the produced word. Using a combination of univariate and multivariate analyses, we additionally show that regions in both the inferior frontal and temporoparietal cortex are sensitive to finer-grained differences in inflection. Finally, our work suggests that syntactic demands are heavier on verbs than on nouns. Specifically, words produced in verb contexts elicited increased activity compared to words produced in noun contexts, even among syntactically ambiguous words such as “dream.” Evidence from RSA supports the notion that neural differences in verb and noun processing are driven by grammatical processes as opposed to the lexical properties that correlate with words belonging to each category. This work lays a foundation for future MEG research on morphological inflection and demonstrates that the use of stimuli controlled across multiple dimensions is necessary in addressing core questions about the interaction between different levels of linguistic processing.