Ya-Ning Chang, Chia-Ying Lee.
Abstract
Words are considered semantically ambiguous if they have more than one meaning and can be used in multiple contexts. A number of recent studies have provided objective ambiguity measures by using a corpus-based approach and have demonstrated ambiguity advantages in both naming and lexical decision tasks. Although the predictive power of objective ambiguity measures has been examined in several alphabetic language systems, the effects in logographic languages remain unclear. Moreover, most ambiguity measures do not explicitly address how the various contexts associated with a given word relate to each other. To explore these issues, we computed the contextual diversity (Adelman, Brown, & Quesada, Psychological Science, 17; 814-823, 2006) and semantic ambiguity (Hoffman, Lambon Ralph, & Rogers, Behavior Research Methods, 45; 718-730, 2013) of traditional Chinese single-character words based on the Academia Sinica Balanced Corpus, where contextual diversity was used to evaluate the present semantic space. We then derived a novel ambiguity measure, namely semantic variability, by computing the distance properties of the distinct clusters grouped by the contexts that contained a given word. We demonstrated that semantic variability was superior to semantic diversity in accounting for the variance in naming response times, suggesting that considering the substructure of the various contexts associated with a given word can provide a relatively fine scale of ambiguity information for a word. All of the context and ambiguity measures for 2,418 Chinese single-character words are provided as supplementary materials.
Keywords: Chinese character naming; Contextual diversity; Latent semantic analysis; Semantic ambiguity; Semantic diversity
Year: 2018 PMID: 29124717 PMCID: PMC6267517 DOI: 10.3758/s13428-017-0993-4
Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X
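The two corpus measures named in the abstract can be illustrated with a minimal sketch. It assumes that contextual diversity is the number of contexts containing a word and that semantic diversity is the negative log of the mean pairwise cosine between the vectors of those contexts, as these measures are usually described. The function names, the toy contexts, the toy vectors, and the log bases are all assumptions made for illustration, not the authors' code or data.

```python
import math
from itertools import combinations

def contextual_diversity(word, contexts):
    """Count the contexts (lists of tokens) in which `word` occurs at least once."""
    return sum(1 for tokens in contexts if word in tokens)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantic_diversity(context_vectors):
    """Negative natural log of the mean pairwise cosine between context vectors.

    The natural log is an assumption of this sketch.
    """
    pairs = list(combinations(context_vectors, 2))
    mean_cos = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    if mean_cos <= 0:
        raise ValueError("mean cosine must be positive to take its log")
    return -math.log(mean_cos)

# Hypothetical toy corpus: three contexts, two of which contain the character.
contexts = [["花", "園"], ["花", "錢"], ["樹", "木"]]
cd = contextual_diversity("花", contexts)
log_cd = math.log10(cd)  # base 10 is an assumption of this sketch

# Hypothetical vectors for the two contexts that contain the character.
sem_d = semantic_diversity([[1.0, 0.0], [1.0, 1.0]])
```

Identical context vectors give a semantic diversity of zero, and the value grows as the contexts become less similar to one another.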
Fig. 1 Distributions of single-character words as a function of log contextual diversity (upper panel), semantic diversity (middle panel), and semantic variability (lower panel), with their normal distribution curves
Fig. 2 Within-group sum of squared errors (SSE) against number of cluster solutions for the single-character word 花
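A curve like the one in Fig. 2 can be sketched by running a clustering algorithm for several numbers of clusters and recording the within-group SSE each time. The sketch below uses plain k-means with a deterministic farthest-first initialisation; both choices, the function names, and the toy two-dimensional points are assumptions for illustration, since the record does not state which clustering procedure or context vectors were used.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def farthest_first(points, k):
    """Pick k starting centres: the first point, then repeatedly the point
    farthest from all centres chosen so far (deterministic)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_sse(points, k, n_iter=100):
    """Run k-means and return the within-group sum of squared errors."""
    centers = farthest_first(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centre.
        new_centers = [centroid(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Hypothetical context vectors forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
sse = {k: kmeans_sse(points, k) for k in range(1, 5)}
```

On this toy data the SSE drops sharply from one to two clusters and then flattens, which is the kind of elbow such a plot is read for.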
Correlations between predictors (except initial phonemes)
| | Log CF | NoS | Cons | SemR | Img | SemD | SemVar |
|---|---|---|---|---|---|---|---|
| Log CF | 1 | | | | | | |
| NoS | -.105*** | 1 | | | | | |
| Cons | -.011 | .127*** | 1 | | | | |
| SemR | .604*** | -.167*** | -.062*** | 1 | | | |
| Img | -.222*** | -.001 | -.012 | -.219*** | 1 | | |
| SemD | .257*** | -.028*** | -.097*** | .238*** | -.074*** | 1 | |
| SemVar | .594*** | -.130*** | -.006*** | .407*** | .142*** | .245*** | 1 |
| Log CD | .610*** | -.125*** | -.052*** | .403*** | .134*** | .166*** | .959*** |
***Correlation is significant at the .001 level; **Correlation is significant at the .01 level; *Correlation is significant at the .05 level. Log CF: log character frequency; NoS: number of strokes; Cons: consistency; Img: imageability; SemR: semantic ambiguity rating; SemD: semantic diversity; SemVar: semantic variability; Log CD: log-transformed contextual diversity
Results of principal component analyses with promax rotation
| | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|
| Log CF | **…** | .10 | -.13 |
| NoS | -.06 | **…** | -.02 |
| Cons | .12 | **…** | .00 |
| SemR | **…** | -.05 | -.25 |
| Img | -.03 | -.01 | **…** |
| SemD | **…** | -.09 | -.04 |
| SemVar | **…** | .03 | **…** |
Scores greater than .4 were marked in bold. Log CF: log character frequency; NoS: number of strokes; Cons: consistency; Img: imageability; SemR: semantic ambiguity rating; SemD: semantic diversity; SemVar: semantic variability
Linear mixed-effect model fitted to log RTs in naming (R-squared = 31.16%, n = 1,000)
| | Estimated | Std. Err | t | Wald (2.5% to 97.5%) | Increase in AIC | χ² |
|---|---|---|---|---|---|---|
| Stop | -.205 | .086 | -2.38 | -.373 to -.036 | – | – |
| Affricate | -.197 | .092 | -2.14 | -.377 to -.017 | – | – |
| Fricative | -.114 | .085 | -1.34 | -.281 to .053 | – | – |
| Nasal | .040 | .060 | 0.66 | -.078 to .159 | – | – |
| Liquid | – | – | – | – | – | – |
| Aspirated | -.424 | .075 | -5.65 | -.570 to -.277 | – | – |
| Voiced | -.129 | .025 | -5.21 | -.177 to -.080 | – | – |
| Bilabial | .255 | .093 | 2.75 | .073 to .437 | – | – |
| Labiodental | .343 | .099 | 3.48 | .150 to .536 | – | – |
| Alveolar | .312 | .086 | 3.62 | .143 to .481 | – | – |
| Palato-alveolar | .459 | .099 | 4.64 | .265 to .653 | – | – |
| Alveolo-palatal | .462 | .094 | 4.89 | .276 to .647 | – | – |
| Velar | .208 | .091 | 2.29 | .030 to .386 | – | – |
| Log CF | -.120 | .013 | -8.94 | -.146 to -.093 | 75 | 77.02*** |
| NoS | .040 | .009 | 4.26 | .021 to .058 | 16 | 18.04*** |
| Cons | -.051 | .009 | -5.57 | -.069 to -.033 | 28 | 30.62*** |
| SemR | -.042 | .011 | -3.69 | -.065 to -.020 | 11 | 13.51*** |
| Img | -.083 | .010 | -8.41 | -.102 to -.063 | 66 | 68.37*** |
| SemD | -.031 | .009 | -3.32 | -.049 to -.013 | 9 | 10.95*** |
| SemVar | -.073 | .012 | -6.05 | -.096 to -.049 | 34 | 36.04*** |
All predictors were scaled. Log CF: log character frequency; NoS: number of strokes; Cons: consistency; Img: imageability; SemR: semantic ambiguity rating; SemD: semantic diversity; SemVar: semantic variability. ***The chi-square value is significant at the .001 level
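The Increase in AIC values and the chi-square values in the table above appear to be linked by a standard identity. The derivation below assumes that each chi-square value is a likelihood-ratio statistic comparing the full model with a model that drops one predictor (one degree of freedom), and that the AIC increase is the AIC of the reduced model minus that of the full model. The symbols $\ell$ (maximised log-likelihood) and $k$ (number of parameters in the full model) are introduced here for illustration.

```latex
\begin{align*}
\chi^2 &= 2\,(\ell_{\text{full}} - \ell_{\text{reduced}}), \\
\Delta\mathrm{AIC} &= \mathrm{AIC}_{\text{reduced}} - \mathrm{AIC}_{\text{full}} \\
  &= \bigl[2(k-1) - 2\ell_{\text{reduced}}\bigr] - \bigl[2k - 2\ell_{\text{full}}\bigr] \\
  &= \chi^2 - 2 .
\end{align*}
```

Under these assumptions the Log CF row gives 77.02 - 2 = 75.02, which agrees with the reported increase of 75. The other rows agree to within rounding.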
Linear mixed-effect model fitted to log RTs in naming, with log CF, log CD, SemVar, or SemVarRes along with the other psycholinguistic variables
| | LMM 1: Estimated | LMM 1: t | LMM 1: χ² | LMM 2: Estimated | LMM 2: t | LMM 2: χ² |
|---|---|---|---|---|---|---|
| Stop | -.208 | -2.43 | – | -.208 | -2.43 | – |
| Affricate | -.197 | -2.16 | – | -.197 | -2.16 | – |
| Fricative | -.112 | -1.32 | – | -.112 | -1.32 | – |
| Nasal | .042 | 0.71 | – | .042 | 0.71 | – |
| Liquid | – | – | – | – | – | – |
| Aspirated | -.426 | -5.72 | – | -.426 | -5.72 | – |
| Voiced | -.127 | -5.18 | – | -.127 | -5.18 | – |
| Bilabial | .254 | 2.75 | – | .254 | 2.75 | – |
| Labiodental | .335 | 3.42 | – | .335 | 3.42 | – |
| Alveolar | .310 | 3.62 | – | .310 | 3.62 | – |
| Palato-alveolar | .451 | 4.58 | – | .451 | 4.58 | – |
| Alveolo-palatal | .455 | 4.85 | – | .455 | 4.85 | – |
| Velar | .200 | 2.21 | – | .200 | 2.21 | – |
| … | … | … | … | … | … | … |
| NoS | .040 | 4.27 | 18.05*** | .040 | 4.27 | 18.05*** |
| Cons | -.051 | -5.59 | 30.79*** | -.051 | -5.59 | 30.79*** |
| SemR | -.042 | -3.67 | 13.36*** | -.042 | -3.67 | 13.36*** |
| Img | -.084 | -8.61 | 71.55*** | -.084 | -8.61 | 71.55*** |
| SemD | -.022 | -2.30 | 5.26* | -.022 | -2.30 | 5.26* |
| … | … | … | … | … | … | … |
Log CF: log character frequency; NoS: number of strokes; Cons: consistency; Img: imageability; SemR: semantic ambiguity rating; SemD: semantic diversity; log CD: log-transformed contextual diversity; SemVar: semantic variability; SemVarRes: the residuals of semantic variability after partialing out log CD
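SemVarRes is defined above as the residuals of semantic variability after partialing out log CD. One common way to obtain such residuals is a simple least-squares regression of SemVar on log CD; the sketch below assumes that approach, which the record does not confirm. The function name and the toy values are hypothetical.

```python
def residualize(y, x):
    """Residuals of y after removing its least-squares linear fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical values for four characters, not taken from the study.
log_cd = [1.0, 2.0, 3.0, 4.0]
sem_var = [1.1, 1.9, 3.2, 3.8]
sem_var_res = residualize(sem_var, log_cd)
```

By construction the residuals are uncorrelated with log CD, so a model can include both without the collinearity implied by the .959 correlation between SemVar and log CD.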