William W Graves, Jeffrey R Binder, Mark S Seidenberg.
Abstract
Combining individual concepts to form an emergent concept is a fundamental aspect of language, yet much less is known about it than about the processing of isolated words or sentences. To facilitate research on conceptual combination, we provide meaningfulness ratings for a large set (2,160) of noun-noun pairs. Half of these pairs (1,080) are reversed versions of the other half (e.g., SKI JACKET and JACKET SKI), which allows successful and unsuccessful conceptual combination to be compared independently of the constituent lexical items. The computer code used for obtaining these ratings through a Web interface is provided. To further enhance the usefulness of this resource, ancillary measures obtained from other sources are also provided for each pair. These measures include associate production norms, contextual relatedness in terms of latent semantic analysis distance, total number of letters, phrase-level usage frequency, and word-level usage frequency summed across the words in each pair. Results of correlation and regression analyses are also provided as a quantitative description of the stimulus set. A subset of these stimuli was used to identify neural correlates of successful conceptual combination (Graves, Binder, Desai, Conant, & Seidenberg, NeuroImage 53:638-646, 2010). The stimuli can be used in other research and also provide benchmark data for evaluating the effectiveness of computational algorithms for predicting the meaningfulness of noun-noun pairs.
Year: 2013 PMID: 23055162 PMCID: PMC3663253 DOI: 10.3758/s13428-012-0256-3
Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X
Table 1 Summary statistics for the entire set of noun-noun pairs and separated according to whether they were expected to be meaningful (forward order) or not meaningful (reversed order)
| Factor | Mean | Median | SD |
|---|---|---|---|
| All pairs | |||
| Meaningfulness | 2.03 | 1.69 | 1.22 |
| Association | 0.02 | 0.00 | 0.06 |
| LSA | 0.19 | 0.15 | 0.16 |
| Number of letters | 9.87 | 10 | 2.23 |
| Phrase frequency (log) | 3.14 | 3.23 | 1.27 |
| Summed word frequency (log) | 5.76 | 5.81 | 0.88 |
| Forward | |||
| Meaningfulness | 2.83 | 3.15 | 1.07 |
| Association | 0.02 | 0.00 | 0.07 |
| Phrase frequency (log) | 3.81 | 3.90 | 0.98 |
| Reversed | |||
| Meaningfulness | 1.23 | 1.07 | 0.73 |
| Association | 0.01 | 0.00 | 0.05 |
| Phrase frequency (log) | 2.46 | 2.62 | 1.16 |
SD = standard deviation.
Fig. 1 Distribution of meaningfulness values grouped by forward (red) or reversed (blue) order, each divided into 20 discrete bins
Table 2 Pairwise correlations (Pearson r-values) among all variables of interest
| | Meaningfulness | SD | Association | LSA | Length | Phrase freq |
|---|---|---|---|---|---|---|
| SD | … | | | | | |
| Association | … | … | | | | |
| LSA | … | … | … | | | |
| Length | .06* | −.02 | −.02 | .01 | | |
| Phrase freq | … | … | … | … | −.03 | |
| Sum word freq | −.01 | .05† | … | … | .01 | … |
Values are based on the subset of items for which association measures were available (2,144 of 2,160). SD, standard deviation; LSA, latent semantic analysis; freq, frequency. Entries in bold were significant at p < .0001.
* p < .01
† p < .05
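The significance markers on the non-bold entries can be sanity-checked from the reported r values and the item count. The sketch below is illustrative only. It assumes the usual t test for a Pearson correlation and replaces the t distribution with a normal approximation, which is reasonable at n = 2,144; the source states neither. `correlation_p_value` is a hypothetical helper name, and the r values are the rounded table entries, so the resulting p-values are approximate.

```python
from math import sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-tailed p-value for a Pearson r computed over n items.

    Assumption: the t statistic is referred to a standard normal
    distribution, which is close to the t distribution when n is large.
    """
    t = r * sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


N_ITEMS = 2144  # items with association measures, per the table note

# Rounded r values copied from the table; labels describe the cell they come from.
examples = [
    ("Length vs. meaningfulness", 0.06),
    ("Sum word freq vs. SD of ratings", 0.05),
    ("Length vs. LSA", 0.01),
]

for label, r in examples:
    p = correlation_p_value(r, N_ITEMS)
    print(f"{label}: r = {r:.2f}, approximate p = {p:.4f}")
```

With this many items, even correlations of .05 to .06 reach conventional significance, so the markers say more about sample size than about effect size.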
Fig. 2 Hierarchical cluster analysis based on squared Spearman rank correlation distances showing the relationships between the factors in Table 2
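One way a clustering of this kind might be computed is sketched below. Everything in it is an assumption rather than the authors' procedure: the distance between two variables is taken to be 1 − ρ² for Spearman's ρ, clusters are joined by average linkage, and the three variables are made-up toy values with hypothetical names.

```python
from math import sqrt


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def squared_spearman_distance(x, y):
    """Assumed distance: 1 minus the squared Spearman rank correlation."""
    rho = pearson(ranks(x), ranks(y))
    return 1.0 - rho ** 2


def average_linkage(names, columns):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    col = dict(zip(names, columns))
    clusters = [(name,) for name in names]

    def d(a, b):
        # Average linkage: mean distance over all cross-cluster variable pairs.
        pairs = [(p, q) for p in a for q in b]
        return sum(squared_spearman_distance(col[p], col[q]) for p, q in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        a, b = min(
            ((a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]),
            key=lambda ab: d(*ab),
        )
        merges.append((a, b, d(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Made-up toy values for three hypothetical variables (not the published data).
toy = {
    "meaningfulness": [1.2, 3.1, 2.8, 1.0, 3.6, 2.2],
    "phrase_freq": [2.0, 3.9, 3.5, 1.8, 4.1, 2.4],
    "length": [9, 12, 8, 11, 10, 13],
}

for a, b, dist in average_linkage(list(toy), list(toy.values())):
    print(f"merge {a} + {b} at distance {dist:.3f}")
```

Squaring ρ treats strong negative and strong positive relationships as equally close, which is why a distance of this form groups variables by strength of association rather than by direction.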
Table 3 Results of regression analyses for the subset of 2,144 noun–noun phrases for which association measures were available
| Predictor | Meaningfulness: Beta | Meaningfulness: p | SD: Beta | SD: p |
|---|---|---|---|---|
| Association | .02 | .32 | −.07 | .00 |
| LSA | .01 | .64 | −.04 | .09 |
| Length | .08 | .00 | −.03 | .23 |
| Phrase freq | .65 | .00 | −.18 | .00 |
| Sum word freq | −.16 | .00 | .08 | .00 |
Values indicate the ability of five explanatory variables to predict average meaningfulness ratings (columns 2 and 3) or to predict the standard deviation of the ratings (two rightmost columns). These values represent standardized regression weights (beta weights) and corresponding p-values from tests of significance.
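For readers unfamiliar with standardized regression weights, the sketch below shows one standard way to obtain them: solve R_xx β = r_xy, where R_xx holds the correlations among the predictors and r_xy the correlation of each predictor with the outcome. This is equivalent to ordinary least squares on z-scored variables. It is a minimal illustration, not the authors' analysis code; the function names and the toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_betas(predictors, outcome):
    """Beta weights: solve R_xx * beta = r_xy built from Pearson correlations."""
    r_xx = [[pearson(p, q) for q in predictors] for p in predictors]
    r_xy = [pearson(p, outcome) for p in predictors]
    return solve(r_xx, r_xy)


# Made-up toy values for two hypothetical predictors and an outcome.
phrase_freq = [2.1, 3.4, 1.2, 4.0, 3.8, 2.6]
length = [9, 11, 8, 12, 10, 13]
meaningfulness = [1.5, 3.0, 1.1, 3.6, 3.2, 1.9]

betas = standardized_betas([phrase_freq, length], meaningfulness)
for name, beta in zip(["phrase_freq", "length"], betas):
    print(f"{name}: beta = {beta:.2f}")
```

Because the weights are computed from correlations, rescaling a predictor (for example, measuring frequency on a different log base) leaves its beta unchanged, which is what makes betas comparable across predictors with different units.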
Table 4 Results of regression analyses reported as in Table 3, but with phrase category (forward or reversed) added as a predictor variable
| Predictor | Meaningfulness: Beta | Meaningfulness: p | SD: Beta | SD: p |
|---|---|---|---|---|
| Category | .46 | .00 | −.05 | .04 |
| Association | .02 | .26 | −.07 | .00 |
| LSA | .09 | .00 | −.05 | .04 |
| Length | .07 | .00 | −.02 | .26 |
| Phrase freq | .37 | .00 | −.15 | .00 |
| Sum word freq | −.08 | .00 | .07 | .00 |
LSA, latent semantic analysis; freq, frequency