Literature DB >> 30319491

Phrase Position, but not Lexical Status, Affects the Prosody of Noun/Verb Homophones.

Abstract

Words that can occur in more than one lexical category produce regions of ambiguity that could confound language learning and processing. However, previous findings suggest that pronunciation of noun/verb homophones may, in fact, differ as a function of category of use, potentially mitigating that ambiguity. Whether these differences are part of the lexical representation of such words or mere by-products of sentence-level prosodic processes remains an open question, the answer to which is critical to resolving questions about the structure of the lexicon. In three studies, adult native speakers of English read aloud passages containing noun/verb homophones or nonce words used in both noun and verb contexts. Acoustic measurements of the target words indicated that, while sentence position influences the acoustic properties of noun/verb homophones, including duration and pitch, there are not significant effects of lexical category when other factors are controlled. Furthermore, the lexical status of a word (real or nonce) does not produce consistent prosodic effects. These findings suggest that previously reported prosodic differences in noun/verb homophones may result from the syntactic positions in which those categories tend to occur.

Entities: Chemical Disease Gene Species

Keywords: grammatical categories; lexical ambiguity; lexical representation; prosody; speech production

Year: 2018 PMID： 30319491 PMCID： PMC6167478 DOI： 10.3389/fpsyg.2018.01785

Source DB: PubMed Journal: Front Psychol ISSN： 1664-1078

Introduction

Ambiguity pervades natural language. Some of this ambiguity stems from words that have multiple meanings. These meanings may be related, as in the case of polysemes, or unrelated, as in the case of homophones. Some homophones and polysemes not only encode different meanings, but also belong to different grammatical categories (e.g., run may be used as both a noun and a verb). Such ambicategorical words could pose a significant challenge for language learning and processing, as they introduce both lexical and structural ambiguities into speech. However, children learn these words without apparent difficulty and adult speakers produce and process them every day, suggesting that the ambiguity they create is mitigated by other factors. Although top-down factors such as syntax and plausibility of meaning no doubt play a role in resolving some of the ambiguities created by words that can be used in more than one grammatical category, prosodic and acoustic differences may provide low-level, bottom-up cues for rapidly classifying ambiguous words. A growing body of research indicates that words that are both nouns and verbs differ in their duration, pitch contours and/or vowel quality depending on the category in which they are used (Sorensen et al., 1978; Shi and Moisan, 2008; Li et al., 2010; Conwell and Morgan, 2012). Sorensen et al. (1978) reported that noun tokens of four English word types were reliably longer than verb tokens of the same words, but they also found that these effects disappeared when those words appeared in sentence-final positions. Analyses of spontaneous child-directed speech also find that nouns are reliably longer than verbs (Conwell and Morgan, 2012; Conwell, 2017). Further, English speakers systematically alter stress patterns in disyllabic nonce words depending on lexical class (Kelly and Bock, 1988; Kelly, 1992; Sereno and Jongman, 1995). Specifically, speakers tend to stress the first syllable of a nonce word when it is used as a noun and the second syllable when it is used as a verb. The grammatical category of a prime word also produces differential production of noun/verb homophones (Melinger and Koenig, 2007). These findings suggest that noun/verb homophones in English differ in pronunciation as a function of their category of use. Studies have also found prosodic differences between noun and verb tokens of the same phonemic sequences in languages other than English. Canadian French speakers who read storybooks containing disyllabic nonce words aloud to infants lengthened the second vowels in noun uses of the nonce words more than they did in verb uses of the same word forms (Shi and Moisan, 2008). A similar study of Mandarin Chinese found that changes in duration and pitch across syllables distinguished noun and verb uses of disyllabic non-words in child-directed speech (Li et al., 2010). However, these studies are hard to generalize to English, as a significant proportion of category-ambiguous words in English are monosyllabic (Conwell and Morgan, 2012). An additional question regarding possible differences in noun/verb homophone pronunciation is where such differences might originate. One possibility is that they may arise purely from phrasal prosody, as suggested by Sorensen et al. (1978). English exhibits strong phrase-final lengthening, in which items that occur at the end of constituent phrases have greater duration than those that appear in the middle of phrases (Cooper and Paccia-Cooper, 1980; Shattuck-Hufnagel and Turk, 1996). Nouns are more likely than verbs to appear in sentence-final and phrase-final positions, and thus they are more likely to be lengthened. Speakers also place prosodic boundaries after nouns more often than they place them after verbs (Watson et al., 2006), which increases the likelihood of nouns being lengthened relative to verbs. Additionally, phrase-final words are more likely to receive focal stress under most conditions than are phrase-medial words (Truckenbrodt, 2007) and nouns in some contexts necessitate focal stress, while verbs in similar contexts may not (Selkirk, 2011). Taken together, these facts suggest that differences in the duration of noun/verb homophones are simply products of prosodic processes that differentially affect nouns and verbs. An alternative possibility is that pronunciation differences are associated with the specific meanings of the noun form and the verb form of the words themselves. A growing number of studies of homophones report representation-level differences in duration and pitch based on factors such as frequency and emotional content. Gahl (2008) found differences in the duration of homophone uses as a function of the frequency of the target meaning, suggesting that frequency of meaning, not frequency of the particular phonological sequence, impacts the phonetic realization of homophones. Nygaard et al. (2002) found that participants’ pronunciation of homophones was affected by the emotional content of the intended meaning. For example, the word bridal was produced with higher fundamental frequency (i.e., more positive emotion) than the word bridle. Melinger and Koenig (2007) demonstrated that the category of a prime affects which pronunciation of an ambiguous target word speakers produce. These findings indicate that lemma-level factors such as frequency, grammatical category, and meaning affect the pronunciation of homophonous words. Phonetic information that is specific to grammatical category could be stored as part of the lexical representation. Although Melinger and Koenig (2007) showed such effects for noun/verb pairs with phonemically distinct representations (e.g., dove) and for pairs with stress patterns that change as a function of category (e.g., contrast), there is not currently evidence for such effects in noun/verb pairs that are phonemically identical and monosyllabic, in other words, those that are truly homophonous. If there are differences in pronunciation as a function of lexical category, these differences could result from stored differences rather than from prosodic processes alone. If category-based pronunciation differences are stored in lexical representations, those differences could facilitate processing and learning of noun/verb homophones by providing a bottom-up cue to the intended category that is available to listeners before more top-down cues such as plausibility and syntax. This would facilitate processing of category-ambiguous words. This article has two main goals. First, the previous work on the existence of acoustic differences in noun/verb homophones has notable limitations. Sorensen et al. (1978) obtained only one noun token and one verb token for each of four word types from each of their 10 participants, resulting in a very small sample. Because the focus of their research was speech timing, they measured only duration, not other cues, such as pitch. Conwell and Morgan (2012) and Conwell (2017) both examined child-directed speech, which differs prosodically from adult-directed speech (Ferguson, 1964), and might, therefore, not be generalizable. Therefore, one goal of the research presented here is to assess whether previously reported prosodic differences in ambicategorical words are robust across a range of tokens, speakers and sentence positions. A second goal of this research is to begin to determine whether differences in noun/verb homophones are entirely the result of sentence prosody or whether they might be part of the lexical representation itself. This research addresses this issue in two ways. First, it controls for sentence and phrase position of the target words. Controlling sentence position reduces the effects of sentence prosody on pronunciation, but it does not wholly resolve the question of whether noun/verb differences are stored in the lexicon. Therefore, this work also compares the effects of grammatical category and sentence position on the realization of both real and nonce words used in noun and verb positions. Because nonce words have no lexical representation, any differences between the real and nonce words may be attributed to representational properties of the real words. This article presents three studies intended to address these issues. In all three studies, adult native speakers of English read aloud passages containing noun/verb homophones in medial and final positions, while other participants read the same passages but with the noun/verb homophones replaced by nonce words. The target words were extracted and measured along a range of acoustic dimensions. Experiment 1 controlled for the sentence position of the target words, but not for phrase position of sentence-medial tokens. Experiment 2 compared phrase-medial tokens to sentence-final tokens of the target words, while Experiment 3 compared phrase-medial and phrase-final tokens. If pronunciation differences in noun/verb homophones are robust across speakers, we would expect noun and verb uses of the target words to differ significantly from one another as a function of category. If noun/verb homophone pronunciation is not affected by category of use, we would expect only to see pronunciation differences as a function of sentence position. Additionally, if the pronunciation distinctions between noun and verb uses of homophones are part of the lexical representations of these words, then we would expect no such differences in noun and verb uses of nonce words, as such words do not have lexical representations. Alternatively, any differences between noun and verb uses of real words could be entirely the result of post-lexical production processes, such as sentence prosody, in which case we would expect no effects of lexicality on those differences and the same patterns would be present in nonce words as well.

Experiment 1

This experiment asks whether native English speakers reliably produce acoustic cues to nounhood or verbhood of ambicategorical words and how those cues might be related to sentence position and lexical status. Half of the participants read paragraphs aloud that contained noun/verb homophones in both sentence-medial and sentence-final positions. The other half read the same paragraphs with the noun/verb homophones replaced with nonce words. Analyses examined the role of lexical category, sentence position and lexical status on the pronunciation of these words. If differences in noun/verb homophones emerge as a function of sentence prosody alone, we would predict no differences in pronunciation as a function of category of use or of lexical status. If, however, a word’s category of use is related to its phonetic representation, independent of sentence position, then we would predict differences in pronunciation on the basis of lexical category for real words, but not for nonce words.

Methods

Participants

Thirty-nine adult native speakers of American English, 21 male and 18 female, participated in this experiment for pay or for course credit. All participants were from the north-central United States and spoke the dialect of English associated with that region. Two additional participants were recorded, but their data were lost due to equipment failure. Participants were recruited from introductory psychology courses and by posting flyers around the North Dakota State University campus. This study was carried out in accordance with the recommendations of the NDSU Internal Review Board, which approved the experimental protocol. All participants gave written informed consent in accordance with the Declaration of Helsinki.

Procedure

Participants were seated approximately 12 inches from a Blue Snowball USB microphone connected to an HP laptop computer. Recordings were made using the software Audacity (Audacity, 2017). Participants read 72 paragraphs, each containing a target word. To facilitate fluency in reading, participants were encouraged to read all of the stimuli silently to themselves before recording commenced. Participants read at a conversational pace and volume. They were asked to monitor their own fluency and to repeat the entire paragraph if they stumbled or mispronounced a word. Most participants made such corrections fewer than three times during the study.

Stimuli

The real noun/verb homophones used in this study were fish, kiss, kick, saw, walk, watch, match, nap, and pass. These words were selected because both noun and verb uses exist in the child-directed speech of the Providence Corpus (Demuth et al., 2006). Each target word occurred eight times and each use was embedded in a 2–4 sentence paragraph, to reduce participants’ ability to identify the target words. Each target word occurred four times as a noun and four times as a verb. Sentence position was also manipulated such that each word appeared four times sentence-medially and four times sentence-finally. A complete list of stimuli for this experiment is provided in Appendix . Sentence position and grammatical category were fully parameterized; each target word was used twice as a noun and twice as a verb in sentence-medial position as well as twice in each category in sentence-final position. Nineteen participants (9 male and 10 female) who read sentences containing real words were included in the analysis. The nonce stimuli were identical to the stimuli that contained real words, except that the target words were replaced with vik (/vIk/), blick (/blIk/), kip (/kIp/), gaw (/gɔ/), glot (/glɔt/), dotch (/dɔtʃ/), kasp (/kæsp/), lat (/læt/), and fash (/fæʃ/). The mean phonological neighborhood density for these words is 22.7, according to the Irvine Phonotactic Online Dictionary (IPhOD; Vaden et al., 2009), compared to a mean phonological neighborhood density of 28.2 for the real words (Vaden et al., 2009). The neighborhood densities of the real and nonce words are not statistically different from one another [t(16) = 1.038, p = 0.315], indicating that any differences in the production of real and nonce words is not a product of different neighborhood densities. Participants practiced pronouncing the nonce words once before beginning the recording to ensure consistent productions across participants. Data from twenty participants (12 male and 8 female) who read nonce stimuli were analyzed.

Measurement

Trained research assistants listened to each recording and used Audacity to isolate the target word tokens. Using the speech analysis software PRAAT (Boersma and Weenink, 2017), boundaries were placed at the beginning and end of each target word and at the beginning and end of the vowels in those words. Boundary placement was based on auditory cues as well as on visual examination of the spectrogram and waveform. Token duration, vowel duration and mean pitch (in Hz) were automatically extracted. Pitch range was calculated in semitones, using the minimum and maximum pitch in the token. A total of 42 targets (1.4%) were removed from the analysis, either because reading errors altered the lexical category or the position of the target word (7), because of disfluencies around the target word (8), or because the target word was mispronounced (27). Nine real targets and 33 nonce targets were excluded. Of the excluded nonce targets, 24 had been mispronounced. A total of 2766 tokens were included in the final analysis.

Results

Each dependent measure (token duration, vowel duration, mean pitch and pitch range) was analyzed separately, using a linear mixed model implemented in R (R Core Team, 2015) using lme4 (Bates et al., 2015) and lmertest (Kuznetsova et al., 2017). Category (noun, verb), sentence position (medial, final), and lexical status (real, nonce), as well as the interaction of these variables, were included as fixed factors. Speaker gender (male, female) was also included as a fixed factor. Participant and paragraph were included as random effects. While a maximal random slopes structure is ideal for such analyses (Barr et al., 2013), every model of token duration in Experiment 1 that included random slopes failed to converge. Therefore, for appropriate comparison across experiments and measures, random slopes were not included in any model in this article. The token and vowel duration results from this study are presented in Figure and the mean pitch and pitch range results are presented in Figure . Results of the token and vowel duration measurements for Experiment 1. Results of the pitch and pitch range measurements for Experiment 1.

Effects on Token and Vowel Duration

Category did not produce a significant main effect on token duration or on vowel duration (both t < 0.4, both p > 0.7). Sentence position did produce a significant main effect on token duration [t(189.11) = 2.17, p = 0.031] and a marginal main effect on vowel duration [t(180.32) = 1.86, p = 0.064]. Sentence-final tokens were longer and contained longer vowels than sentence-medial tokens did. Token duration also showed a marginal main effect of lexical status [t(215.91) = 1.95, p = 0.053], with real words showing longer duration than nonce words. The two-way of sentence position and lexical status was also statistically significant for token duration [t(209.36) = 2.42, p = 0.016], which results from a larger difference between sentence-medial and sentence-final tokens in real words than in nonce words. No other interaction of factors was statistically significant for token or vowel duration (all t < 1.4, all p > 0.17). Gender did not have a significant effect on token or vowel duration (both t < 0.7, both p > 0.5). Complete results from these analyses are presented in Table . Results of linear mixed model analysis of durational variables in Experiment 1.

Effects on Pitch and Pitch Range

Category produced a marginal main effect on mean pitch [t(137.08) = 1.89, p = 0.061], with nouns exhibiting higher mean pitch than verbs. Mean pitch was also significantly affected by sentence position [t(135.02) = 6.04, p < 0.001] with higher mean pitch in sentence final tokens, but there was not a significant effect of lexical status on mean pitch [t(107.18) = 107.18, p = 0.26]. Mean pitch was further affected by speaker gender [t(38.82) = 5.88, p < 0.001], as male participants had lower mean pitch than female participants. No interaction of factors was statistically significant for mean pitch (all t < 1.4, all p > 0.18). Pitch range was significantly affected by sentence position [t(132.8) = 5.6, p < 0.001]. Final tokens had greater pitch range than medial tokens. Gender also significantly affected pitch range [t(39.69) = 2.89, p = 0.006] with male participants producing a smaller pitch range than female participants. The main effects of category and lexical status were not significant for pitch range, and no interaction of factors had a significant impact on pitch range (all t < 1, all p > 0.35). The complete results of the model of pitch and the model of pitch range are presented in Table . Results of linear mixed model analysis of pitch variables in Experiment 1.

Interim Discussion

These findings suggest that lexical category does not affect the prosody of noun/verb homophones when sentence position is controlled for. This replicates the findings of Sorensen et al. (1978) with a larger number of participants and more word types. The positional effects on duration and pitch are consistent with English utterance-final prosody, which includes lengthening and increased pitch and pitch range (Shattuck-Hufnagel and Turk, 1996). Finding such effects suggests that participants were reading the stimuli in a naturalistic manner with sentence-level prosody that was typical for English. The effect of gender on mean pitch was also expected, although the smaller pitch range for male participants was somewhat unexpected. The effects of lexical status on token duration, specifically that nonce tokens had shorter duration than real tokens, and that real tokens show larger effects of sentence position than nonce tokens do, may indicate that participants were less natural in their production of sentences containing nonce words than those with real words. However, because the lexical status manipulation was a between-subjects factor, it is also possible that the effects of lexical status are a product of between-subjects variability in participants’ speech patterns. To fully match prosodic environments for the real and nonce words, it was necessary to use identical sentence contexts for those words. Having participants read the same sentences twice, once with a real word and once with a nonce word, would create demand characteristics that might affect pronunciation. Therefore, it was impractical to treat lexical status as a within-subjects factor. The design of the stimuli in this experiment has two potential problems. First, although the sentence positions of the target words were parameterized, the phrase positions were not. Furthermore, differences in phrasal position could create differences in how likely a word is to be stressed or accented (Kratzer and Selkirk, 2007; Truckenbrodt, 2007). The second concern is that the specific vowels in the words used in this study are not particularly representative of the vowels in English. The target words were chosen on the basis of their frequency to ensure participants’ experience with both noun and verb forms, but not all of the vowels in the target words are frequent relative to other vowels. Specifically, /ɔ/ is the fifth least frequent vowel in American English (Kessler and Treiman, 1997). Frequent vowels may undergo reduction differently than less frequent vowels, which may have affected the vowel duration results. To address these issues, we redesigned our stimuli to contain more representative vowels and to better control for phrase position. Experiment 2 directly controls for phrase position, allowing us to examine further how phrasal prosody might affect the production of noun/verb homophones. Sorensen et al. (1978) found that noun/verb homophones in phrase-final position did not exhibit durational differences as a function of category, but the strong tendency toward elongation in phrase-final positions could mask differences between noun and verb tokens. Sorensen et al. (1978) did not examine phrase-medial tokens.

Experiment 2

This experiment is intended to clarify two issues regarding Experiment 1. First, it changes the target words to include vowels that are more frequent in English. Second, it controls for phrase position as well as sentence position. In Experiment 1, sentence-medial target words could appear anywhere in a clause or phrase, meaning that phrase position was not controlled for. In Experiment 2, all sentence-medial targets appear exactly two syllables before a phrase boundary. This manipulation will determine whether the lack of category effects in Experiment 1 are, in fact, due to a confound between category and phrase position that was present in the stimuli for Experiment 1. Thirty-eight adult native speakers of American English, 21 male and 17 female, participated in this experiment for course credit. All participants were recruited from undergraduate psychology courses at North Dakota State University and all spoke the regional dialect of English. This study was carried out in accordance with the recommendations of the NDSU Internal Review Board, which approved the experimental protocol. All participants gave written informed consent in accordance with the Declaration of Helsinki. The procedure for this experiment was identical to that in Experiment 1. New stimuli were created for this experiment. The real noun/verb homophones used in this study were test, check, pet, peel, lead (/lid/), beat, bit, clip, and kick. These words were selected because they are widely used as both nouns and verbs and because they contain the high frequency vowels /ε/, /i/, and /I/. The average neighborhood density of these words was 34.8, according to IPhOD counts (Vaden et al., 2009). As in Experiment 1, each target word occurred eight times and each use was embedded in a 2–4 sentence paragraph, to reduce participants’ ability to identify the target words. Position and grammatical category were parameterized as in Experiment 1. All sentence-medial productions were exactly two syllables before the end of a syntactic phrase. The complete stimulus list for this study is in Appendix . Twenty participants (11 male and 9 female) who read sentences containing real words were included in the analysis. The nonce stimuli were identical to the stimuli that contained real words, except that the target words were replaced with kesk (/kεsk/), pell (/pεl/), stek (/stεk/), deek (/dik/), keet (/kit/), neep (/nip/), bip (/bIp/), plick (/plIk/), and tib (/tIb/). Based on IPhOD counts (Vaden et al., 2009), the average neighborhood density for these words is 24.2. The difference in neighborhood density is marginally significant [t(16) = 2.029, p = 0.059]. Participants practiced pronouncing the nonce words once before beginning the recording. Data from 18 participants (10 male and 8 female) who read nonce stimuli were analyzed. A total of 107 targets (3.8%) were removed from the analysis, either because reading errors altered the lexical category or the position of the target word (14), because of disfluencies around the target word (22) or because the target word was mispronounced (71). Thirty-seven real targets and 75 nonce targets were excluded. Of the excluded nonce targets, 55 had been mispronounced, typically by changing the vowel (34 tokens) or using a real word instead (12 tokens). A total of 2736 tokens were included in the final analysis. The target words were isolated and analyzed using the same process as in Experiment 1. The same four dimensions (token duration, vowel duration, mean pitch, and pitch range) were measured in the same way. The analyses for this experiment were conducted using the same process as in Experiment 1. Each dependent measure (token duration, vowel duration, mean pitch and pitch range) was analyzed separately, using a linear mixed model implemented in R (R Core Team, 2015) using lme4 and lmerTest (Bates et al., 2015; Kuznetsova et al., 2017). Category (noun, verb), sentence position (medial, final), and lexical status (real, nonce), as well as the interaction of these variables, were included as fixed factors. Speaker gender (male, female) was also included as a fixed factor. Participant and paragraph were included as random effects. The duration results of this study are presented in Figure and the pitch results are in Figure . Results of token and vowel duration measurements for Experiment 2. Results of pitch and pitch range measurements for Experiment 2. As in Experiment 1, category produced no main effect on token duration [t(143.48) = 0.67, p = 0.51] or on vowel duration [t(143.8) = 0.44, p = 0.66]. Lexical status significantly affected token duration [t(165.13) = 3.2, p = 0.002]; real words were shorter than nonce words. However, lexical status did not affect vowel duration [t(162.62) = 1.24, p = 0.22]. Sentence position created significant main effects on token duration [t(143.47) = 4.27, p < 0.001] and vowel duration [t(143.79) = 3.95, p < 0.001]. Sentence-final tokens were longer and contained longer vowels than phrase-medial tokens did. Gender had no main effect on either durational measure (both t < 1.35, both p > 0.18). No interaction of factors significantly affected either token duration or vowel duration (all t < 1, all p > 0.3). The complete results of the models of token duration and vowel duration are presented in Table . Results of linear mixed model analysis of durational variables in Experiment 2. Mean pitch was significantly affected by sentence position [t(144.42) = 0.51, p < 0.001] and by gender [t(37.97) = 8.64, p < 0.001]. Sentence medial tokens had lower pitch than sentence final tokens and male participants had lower pitch than female participants. Neither category nor lexical status significantly affected mean pitch (both t < 1.5, p > 0.13). No interaction of factors significantly affected mean pitch (all t < 0.6, all p > 0.58). Pitch range was significantly affected by lexical status [t(114.7) = 5.2, p < 0.001]; real words had a smaller pitch range than nonce words. Sentence position also significantly affected pitch range [t(148.15) = 7.07, p < 0.001] with sentence final tokens showing a larger pitch range than sentence medial tokens. Neither category nor gender had a significant effect on pitch range (both t < 1.2, both p > 0.27). The analysis of pitch range further showed a significant interaction of lexical status and sentence position [t(140.89) = 2.16, p = 0.03] because nonce words showed a larger difference in pitch range between medial and final tokens than did real words. No other interaction of factors had a significant effect on pitch range (all t < 0.6, all p > 0.56). The complete results of the pitch and pitch range models are presented in Table . Results of linear mixed model analysis of pitch variables in Experiment 2. The goal of Experiment 2 was to determine two things: whether the findings from Experiment 1 would generalize to more representative vowels and whether differences in phrasal position could account for the lack of effects of lexical category. However, like Experiment 1, Experiment 2 showed no effects of category on any prosodic measure. Furthermore, the effect of lexical status on token duration was reversed in this experiment. Once again, the most consistent effects on all measures were those of sentence position, not category or lexical status. However, in controlling for phrasal position of sentence-medial words, Experiment 2 actually contrasted phrase-medial position with sentence-final position. It does not clarify whether there are effects of phrase-medial and phrase-final position in sentence-medial contexts. To resolve this issue, the stimuli were slightly revised for an additional experiment.

Experiment 3

This experiment uses the same target words as Experiment 2, but modifies the stimuli so that the positional contrasts are between phrase-medial and phrase-final tokens. This manipulation will help to elucidate the lack of lexical category effects seen in Experiments 1 and 2. Forty-one adult native speakers of American English (9 male, 31 female, and one who declined to indicate sex or gender) participated in this experiment for course credit. All participants were recruited from undergraduate psychology courses at North Dakota State University and all spoke the regional dialect of English. Two additional participants completed the study but were excluded due to strong non-regional accents that affected their vowel productions. Another four participants completed the study and were excluded from analysis because of high rates of mispronunciations of target words (>15%) or significant disfluencies throughout the study. This study was carried out in accordance with the recommendations of the NDSU Internal Review Board, which approved the experimental protocol. All participants gave written informed consent in accordance with the Declaration of Helsinki. The procedure for this experiment was identical to that of Experiments 1 and 2. The stimuli for this study were slightly modified versions of those used in Experiment 2. The sentences with phrase-medial targets were identical to the stimuli for Experiment 2. The sentences with targets in final position were modified so that the target words were sentence-medial, but phrase-final. A complete list of stimuli for this experiment can be found in Appendix . Twenty participants (5 male and 15 female) who read sentences containing real words and 21 (4 male, 16 female and 1 unindicated) who read sentences containing nonce words were included in the analysis. A total of 83 targets (2.8%) were removed from the analysis, either because reading errors altered the lexical category or the position of the target word (14), because of disfluencies around the target word (35) or because the target word was mispronounced (34). Twenty-eight real targets and 55 nonce targets were excluded. Of the excluded nonce targets, 25 had been mispronounced, typically by changing the vowel (14 tokens) or using a real word instead (10 tokens). A total of 2869 tokens were included in the final analysis. The target words were isolated and analyzed using the same process as in Experiments 1 and 2. The same four dimensions (token duration, vowel duration, mean pitch, and pitch range) were measured in the same way. The data were analyzed in the same manner as the data for Experiments 1 and 2. Each dependent measure (token duration, vowel duration, mean pitch, and pitch range) was analyzed separately, using a linear mixed model implemented in R (R Core Team, 2015) using lme4 and lmerTest (Bates et al., 2015; Kuznetsova et al., 2017). Category (noun, verb), sentence position (medial, final), and lexical status (real, nonce), as well as the interaction of these variables, were included as fixed factors. Speaker gender (male, female) was also included as a fixed factor. Participant and paragraph were included as random effects. The duration results of this study are presented in Figure and the pitch results are in Figure . Results of token and vowel duration measurements for Experiment 3. Results of pitch and pitch range measurements for Experiment 3. As in Experiments 1 and 2, grammatical category did not produce significant main effects on token duration [t(155.99) = 0.22, p = 0.82] or vowel duration [t(151.5) = 0.57, p = 0.57]. Lexical status significantly affected token duration [t(192.56) = 3.14, p = 0.002], but not vowel duration [t(183.8) = 0.52, p = 0.61]. In this case, real words were shorter than nonce words. Phrase position also showed a significant main effect for token duration [t(215.17) = 3.03, p = 0.003], as well as for vowel duration [t(186.5) = 2.87, p = 0.004]. Phrase-final tokens were longer and contained longer vowels than phrase-medial tokens did. No interaction of factors produced a significant effect on either of the durational measures (all t < 1.39, all p > 0.18). The complete results of the models for token duration and vowel duration are presented in Table . Results of linear mixed model analysis of durational variables in Experiment 3. Only speaker gender had a significant main effect on mean pitch [t(41.03) = 3.57, p = 0.001] with male participants exhibiting lower pitch than female participants. No other factor or interaction of factors had a significant effect on mean pitch (all t < 1.02, all p > 0.31). Pitch range was also significantly affected by gender [t(40.99) = 2.67, p = 0.011]; male participants showed greater pitch range over the target words than female participants did. Pitch range was marginally affected by lexical status [t(116.66) = 1.73, p = 0.086] with real words containing a smaller pitch range than nonce words. Pitch range was also significantly affected by phrase position [t(144.31) = 4.17, p < 0.001]. Phrase final words had greater pitch range than phrase medial words did. Grammatical category did not produce a significant main effect on pitch range [t(140.57) = 0.34, p = 0.97]. No interaction of factors produced a significant effect on pitch range (all t < 1.16, all p > 0.25). The complete results of the models for pitch and pitch range are presented in Table . Results of linear mixed model analysis of pitch variables in Experiment 3.

General Discussion

The studies presented here asked two questions. First, they examined whether there are reliable differences between noun and verb uses of homophones in English. They also asked how factors such as lexical status and position within an utterance would affect the phonetic realization of words with more than one grammatical category. The use of nonce words, which have no stored representation for our participants, was intended to allow discussion of the differences in noun/verb homophones that are and are not part of the lexicon, while controlling for phrase and sentence position. The results of these three studies are consistent in some ways, but deviate from one another in others. A summary of results from all three studies can be found in Table . Summary of t-values across three experiments. Across three studies, we have failed to find evidence that nouns are consistently longer than verbs when other factors that influence prosody are considered. These results replicate those of Sorensen et al. (1978), while expanding the analysis to a greater number of word types and speakers, as well as accounting for multiple factors that might influence lexical prosody. We consistently find an effect of phrase or sentence position on the durational and pitch properties of these words. Because English is a head-final language, nouns are more likely than verbs to appear in final positions in spontaneous speech. In light of this fact, previous work that identifies category effects on token duration in spontaneous speech (e.g., Conwell and Morgan, 2012; Conwell, 2017) may find those effects due to a confound between lexical category and utterance positions, as well as the tendency of speakers to place prosodic boundaries more often after nouns than after verbs (Watson et al., 2006). Although previous work has statistically controlled for positional effects, direct experimental manipulation provides a more precise control. These results indicate that, once known production factors are well-controlled for, noun/verb homophones do not differ in their durations. This does not necessarily mean that such differences are unimportant for language learning and processing, as prosodic information may be available to speakers and learners before syntactic information is (or, alternatively, prosodic information may carry critical syntactic information). These results simply mean that the differences that have been previously reported may not be properties associated with “nounhood” or “verbhood,” per se. Rather, they tend to co-occur with grammatical category because of the interaction of prosody and syntax. This results in a correlation of lengthening with category, but not a causal relationship in which category causes lengthening or shortening (see Kratzer and Selkirk, 2007, for a related analysis of pitch accent and category). The inconsistent effects of lexical status are difficult to interpret. The lexical status manipulation was intended to address whether the presence of a stored representation for a word affects the extent to which its category of use impacts is phonetic realization. However, as this research found no effects of grammatical category and inconsistent effects of lexical status (e.g., in Experiment 1, real word were longer than nonce words, but in Experiment 2, this effect was reversed), these data cannot be brought to bear on that question. Lexical status was manipulated as a between-subjects factor so we could control local phonetic context of the real and nonce words by presenting them in otherwise identical paragraphs. Unfortunately, this means that we were unable to account for participant differences in the lexical status analyses. With respect to the main question that this research was intended to address, however, lexical status did not interact with grammatical category of use, indicating that there are not differences in the production of real noun/verb homophones that are not present in the production of nonce words in those same contexts. One factor that may have influenced the production of distinctions between noun and verb tokens of category ambiguous words in these studies is the relatedness of the noun and verb meanings. For some of the word types used in this study (e.g., kick, peel), the noun and verb forms are closely related. In the case of kick, for example, the noun use derives from the verb form without overt morphology. Other word types have wholly distinct meanings. For example, saw in its noun form refers to a hand tool, while saw in its verb form is the past tense of see. Other words have noun and verb meanings that may be historically related (e.g., watch), while still other pairs have both related and unrelated meanings (e.g., pass, check). It is possible that the extent to which the two forms are semantically related may affect how differentiated they are in production. Words with completely distinct meanings could be more differentiated than those that are more closely related, as, in some theories of lexical organization (e.g., Barner and Bale, 2002), words with more closely related meanings may share a lexical entry, while words with distinct meanings likely do not. However, previous work on spontaneous speech suggests that derived homophone pairs do differ in duration as a function of lexical category (Lohmann, 2017). An ideal test of this issue would include degree of meaning relatedness as a factor in the statistical analysis, but procedures for quantifying meaning relatedness of homographic words are not readily available. Therefore, these differences in relatedness cannot be statistically accounted for, which is a limitation of these results. The only consistent effects across the three experiments in this study are patterns that have been widely reported in previous literature, namely, that syntactic position affects the prosody of words. These data contribute not only further evidence that words before syntactic boundaries are subject to lengthening and increased pitch and pitch range, but also evidence that, once such prosodic effects have been controlled for, previously reported effects of grammatical category on prosody are no longer seen. This suggests that those effects are less the result of stored prosodic differences between nouns and verbs than the result of production processes operating at the level of the sentence. Our consistent effects of position, both phrasal and sentential, on duration and pitch measures are consistent with the phrase and sentence prosody of English (Shattuck-Hufnagel and Turk, 1996) and with prior work on the relationship between prosodic and syntactic phrasing (Selkirk, 2011). The research presented in this article shows that when a range of factors, including lexical status and position in a sentence, are controlled for, the previously reported effects of grammatical on the prosody of noun/verb homophones disappear. Furthermore, lexical status does not have a consistent effect on the prosody of such words and there are no interactions of category and lexical status, nor of category and position on these factors. These findings may indicate that low-level phonetic features of noun/verb homophones are entirely the result of production processes, including sentential prosody.

Author Contributions

KB collected and analyzed data for Experiment 1, wrote the first draft of the method sections, and provided feedback on the other portions of the manuscript. EC designed the experiments and stimuli, analyzed the data for Experiments 1–3, wrote the introduction, results, and discussion sections, and edited the methods section. EC was responsible for the final edit of the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Table 1

Results of linear mixed model analysis of durational variables in Experiment 1.

Effect	Parameter	Token duration		Vowel duration

		Estimate	SE	Estimate	SE
Fixed effects
Intercept	β	0.361***	0.017	0.219***	0.011
Category	β	0.007	0.019	0.004	0.013
Sentence position	β	-0.042*	0.019	-0.024	0.013
Lexical status	β	0.04	0.021	-0.008	0.014
Gender	β	0.003	0.011	-0.004	0.007
Category × Position	β	-0.027	0.027	-0.021	0.018
Category × Status	β	-0.006	0.024	-0.002	0.016
Position × Status	β	-0.06*	0.024	-0.022	0.016
Category × Position × Status	β	-0.0003	0.035	-0.007	0.023
Random effects
Participant	σ²	0.003	0.004	0.001	0.038
Item	σ²	0.001	0.002	0.0004	0.021
Residual	σ²	0.003	0.004	0.001	0.034

Table 2

Results of linear mixed model analysis of pitch variables in Experiment 1.

Effect	Parameter	Mean pitch		Pitch range

		Estimate	SE	Estimate	SE
Fixed effects
Intercept	β	252.48***	11.43	11.5***	0.716
Category	β	-18.85	9.99	-0.282	0.844
Sentence position	β	-60.16***	9.95	-4.7***	0.839
Lexical status	β	-15.35	13.67	0.383	0.915
Gender	β	-59.34***	10.1	-1.433**	0.495
Category × Position	β	14.58	14.11	-0.202	1.19
Category × Status	β	18.4	13.8	-0.316	1.181
Position × Status	β	15.9	13.72	-1.083	1.172
Category × Position × Status	β	-10.92	19.38	0.082	1.657
Random effects
Participant	σ²	499	22.34	2.28	1.51
Item	σ²	864.1	29.4	1.22	1.11
Residual	σ²	7217.9	84.96	74.64	8.64

Table 3

Results of linear mixed model analysis of durational variables in Experiment 2.

Effect	Parameter	Token duration		Vowel duration

		Estimate	SE	Estimate	SE
Fixed effects
Intercept	β	0.352***	0.016	0.189***	0.009
Category	β	-0.011	0.017	-0.004	0.01
Sentence position	β	-0.073***	0.017	-0.038***	0.01
Lexical status	β	-0.066**	0.021	-0.014	0.012
Gender	β	-0.016	0.012	-0.008	0.007
Category × Position	β	0.002	0.024	-0.005	0.014
Category × Status	β	0.016	0.024	0.004	0.014
Position × Status	β	0.003	0.024	-0.013	0.014
Category × Position × Status	β	-0.016	0.034	-0.005	0.019
Random effects
Participant	σ²	0.002	0.05	0.001	0.028
Item	σ²	0.001	0.035	0.0004	0.02
Residual	σ²	0.002	0.044	0.001	0.028

Table 4

Results of linear mixed model analysis of pitch variables in Experiment 2.

Effect	Parameter	Mean pitch		Pitch range

		Estimate	SE	Estimate	SE
Fixed effects
Intercept	β	262.97***	10.56	15.966***	1.059
Category	β	-5.61	11.02	-0.499	0.999
Sentence position	β	-47.1***	11	-7.043***	0.997
Lexical status	β	-19.83	13.31	-6.723***	1.292
Gender	β	-70.21***	8.13	1.014	0.907
Category × Position	β	8.44	15.56	-0.101	1.41
Category × Status	β	-6.04	15.48	0.811	1.4
Position × Status	β	4.24	15.45	3.001*	1.392
Category × Position × Status	β	8.46	21.86	-1.14	1.97
Random effects
Participant	σ²	796.4	28.22	4.57	2.14
Item	σ²	547	23.39	6.64	2.58
Residual	σ²	4909.8	70.07	73.39	8.57

Table 5

Results of linear mixed model analysis of durational variables in Experiment 3.

Effect	Parameter	Token duration		Vowel duration

		Estimate	SE	Estimate	SE
Fixed effects
Intercept	β	0.34***	0.012	0.001***	0.002
Category	β	0.004	0.016	0.0002	0.001
Sentence position	β	-0.045**	0.015	0.001**	0.003
Lexical status	β	-0.055**	0.017	0.001	0.002
Gender	β	-0.003	0.008	0.0001	0.001
Category × Position	β	-0.023	0.022	0.001	0.003
Category × Status	β	0.002	0.023	0.001	0.002
Position × Status	β	-0.02	0.022	0.0002	0.001
Category × Position × Status	β	0.011	0.032	0.0012	0.003
Random effects
Participant	σ²	0.002	0.048	0.001	0.028
Item	σ²	0.001	0.023	0.0002	0.013
Residual	σ²	0.002	0.047	0.001	0.035

Table 6

Results of linear mixed model analysis of pitch variables in Experiment 3.

Effect	Parameter	Mean pitch		Pitch range

		Estimate	SE	Estimate	SE
Fixed effects
Intercept	β	242.38***	9.98	12.89***	0.83
Category	β	5.06	11.23	0.03	0.86
Sentence position	β	-5.33	11.1	-3.59***	0.86
Lexical status	β	-2.37	13.82	-1.99	1.15
Gender	β	-35.48***	9.95	2.54*	0.95
Category × Position	β	-15.12	15.79	-1.41	1.22
Category × Status	β	7.12	15.96	0.58	1.23
Position × Status	β	-16.11	15.86	-0.09	1.23
Category × Position × Status	β	-5.35	22.5	0.24	1.74
Random effects
Participant	σ²	823.3	28.69	2.96	1.72
Item	σ²	655.4	25.6	5.7	2.39
Residual	σ²	6336.1	79.6	75.88	8.71

Table 7

Summary of t-values across three experiments.

	Token duration	Vowel duration	Pitch	Pitch range
Category (noun)
Experiment 1	0.37	0.31	-1.89	-0.34
Experiment 2	-0.67	-0.44	-0.51	-0.5
Experiment 3	0.22	0.57	0.45	0.03
Position (middle)
Experiment 1	-2.17*	-1.863	-6.04***	-5.6***
Experiment 2	-4.27***	-3.95***	-4.28***	-7.07***
Experiment 3	-3.03**	-2.871**	-0.48	-4.17***
Lexical Status (real)
Experiment 1	1.95	-0.59	-1.12	0.42
Experiment 2	-3.2**	-1.24	-1.49	-5.2***
Experiment 3	-3.14**	-0.52	-0.17	-1.73
Gender (male)
Experiment 1	0.31	-0.62	-5.88***	-2.89**
Experiment 2	-1.34	-1.25	-8.64***	1.12
Experiment 3	-0.4	-1.34	-3.57***	2.67*
Category (noun)^∗Position (middle)
Experiment 1	-1	-1.16	1.03	-0.17
Experiment 2	0.072	-0.36	0.54	-0.07
Experiment 3	-1.04	-1.34	-0.96	-1.15
Category (noun)^∗Lexical Status (real)
Experiment 1	-0.25	-0.11	1.33	-0.27
Experiment 2	0.65	0.32	-0.39	0.58
Experiment 3	0.1	0.003	0.45	0.47
Position (middle)^∗Lexical Status (real)
Experiment 1	-2.42*	-1.36	1.16	-0.92
Experiment 2	0.12	-0.96	0.27	2.16*
Experiment 3	-0.91	-0.94	-1.02	-0.08
Category^∗Position^∗Lexical Status
Experiment 1	-0.01	-0.29	-0.56	0.05
Experiment 2	-0.47	-0.26	0.39	-0.58
Experiment 3	0.35	0.26	-0.24	0.14

8 in total

Phrase Position, but not Lexical Status, Affects the Prosody of Noun/Verb Homophones.

Introduction

Experiment 1

Methods

Participants

Procedure

Stimuli

Measurement

Results

Effects on Token and Vowel Duration

Effects on Pitch and Pitch Range

Interim Discussion

Experiment 2

Experiment 3

General Discussion

Author Contributions

Conflict of Interest Statement

Review 1. Using sound to solve syntactic problems: the role of phonology in grammatical category assignments.

2. The role of syntactic obligatoriness in the production of intonational boundaries.

3. Random effects structure for confirmatory hypothesis testing: Keep it maximal.

Review 4. A prosody tutorial for investigators of auditory sentence processing.

5. Stress in time.

6. Speech timing of grammatical categories.

7. Word-minimality, epenthesis and coda licensing in the early acquisition of English.

8. Prosodic disambiguation of noun/verb homophones in child-directed speech.