Abstract
BACKGROUND: Word frequency is the most important variable in language research. However, despite the growing interest in the Chinese language, there are only a few sources of word frequency measures available to researchers, and the quality is less than what researchers in other languages are used to.
Year: 2010 PMID: 20532192 PMCID: PMC2880003 DOI: 10.1371/journal.pone.0010729
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Word frequency lists of Chinese.
Figure 1. Layout of the SUBTLEX-CH-CHR file.
Figure 2. Layout of the SUBTLEX-CH-WF file.
Figure 3. Layout of the SUBTLEX-CH-WF_PoS file.
The intercorrelations between RT and eight available frequency measures (four word frequency measures and four character frequency measures) for 2289 single character words in a word naming task.
| N = 2289 | | Word frequency | | | | Character frequency | | | |
| | RT | SUBTL_logW | SUBTL_logW-CD | LCSMCS_logW | LCMC_logW | SUBTL_logCHR | SUBTL_logCHR-CD | LCSMCS_logCHR | CCL_logCHR |
| RT | 1 | | | | | | | | |
| SUBTL_logW | −.532 | 1 | | | | | | | |
| SUBTL_logW-CD | −.533 | .979 | 1 | | | | | | |
| LCSMCS_logW | −.479 | .791 | .786 | 1 | | | | | |
| LCMC_logW | −.548 | .854 | .840 | .877 | 1 | | | | |
| SUBTL_logCHR | −.566 | .825 | .819 | .666 | .770 | 1 | | | |
| SUBTL_logCHR-CD | −.571 | .780 | .806 | .617 | .721 | .970 | 1 | | |
| LCSMCS_logCHR | −.559 | .687 | .678 | .728 | .796 | .855 | .822 | 1 | |
| CCL_logCHR | −.547 | .670 | .657 | .660 | .778 | .866 | .831 | .962 | 1 |
*p<0.01.
The percentages of variance in RTs accounted for by each of eight available frequency measures (four word frequency measures and four character frequency measures), for 2289 single character words in a word naming task.
| N = 2289 | Word frequency | | | | Character frequency | | | |
| | SUBTL_logW | SUBTL_logW-CD | LCSMCS_logW | LCMC_logW | SUBTL_logCHR | SUBTL_logCHR-CD | LCSMCS_logCHR | CCL_logCHR |
| log | 28.3 | 28.4 | 22.9 | 30.1 | 32.0 | 32.6 | 31.2 | 29.9 |
| log + log² | 29.7 | 28.7 | 23.7 | 32.0 | 33.0 | 32.6 | 33.9 | 33.0 |
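The first row of this table can be cross-checked against the correlation table: with a single predictor, the percentage of variance explained is the squared correlation multiplied by 100. The Python sketch below performs that check. The dictionary and variable names are illustrative, and the numbers are copied from the two tables. Because the published correlations are rounded to three decimals, agreement is only expected to within about 0.1 percentage points.

```python
# Correlations between RT and each log frequency measure,
# copied from the RT column of the correlation table.
r_with_rt = {
    "SUBTL_logW": -0.532,
    "SUBTL_logW-CD": -0.533,
    "LCSMCS_logW": -0.479,
    "LCMC_logW": -0.548,
    "SUBTL_logCHR": -0.566,
    "SUBTL_logCHR-CD": -0.571,
    "LCSMCS_logCHR": -0.559,
    "CCL_logCHR": -0.547,
}

# Percentages of variance reported for log frequency alone.
reported_pct = {
    "SUBTL_logW": 28.3,
    "SUBTL_logW-CD": 28.4,
    "LCSMCS_logW": 22.9,
    "LCMC_logW": 30.1,
    "SUBTL_logCHR": 32.0,
    "SUBTL_logCHR-CD": 32.6,
    "LCSMCS_logCHR": 31.2,
    "CCL_logCHR": 29.9,
}

# With one predictor, variance explained (%) = 100 * r^2.
computed_pct = {name: round(100 * r * r, 1) for name, r in r_with_rt.items()}

for name in r_with_rt:
    print(f"{name:16s} computed {computed_pct[name]:5.1f}  reported {reported_pct[name]:5.1f}")

# Largest disagreement between recomputed and reported percentages.
max_gap = max(abs(computed_pct[n] - reported_pct[n]) for n in r_with_rt)
print(f"largest gap: {max_gap:.1f} percentage points")
```

Small gaps of this kind are what rounding the correlations before squaring would produce.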
The percentages of variance in RT accounted for by each of the different frequency measures, for two-character words in the visual lexical decision task.
| N = 187 | Word frequency | | | |
| | SUBTL_logW | SUBTL_logW-CD | LCSMCS_logW | LCMC_logW |
| log | 42.7 | 42.8 | 13.7 | 27.3 |
| log + log² | 44.8 | 43.9 | 13.7 | 27.9 |
Figure 4. The light gray points in the background represent the 28,336 two-character words included in both SUBTLEX-CH and LCMC, together with their log10 frequencies; the black diamonds represent the 400 words selected for the lexical decision validation study.
The percentages of variance in RT accounted for by each of the different frequency measures, for two-character words in our lexical decision task.
| N = 394 | Word frequency | | | |
| | SUBTL_logW | SUBTL_logW-CD | LCSMCS_logW | LCMC_logW |
| log | 24.6 | 25.2 | 9.3 | 18.3 |
| log + log² | 24.6 | 25.3 | 10.2 | 19.1 |
*N = 322.
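The second row of each variance table appears to report a regression that adds a squared log-frequency term to the linear one. The sketch below shows one way such percentages could be obtained. It is an illustration under stated assumptions, not the authors' analysis code: the data are synthetic, the `r_squared` helper is hypothetical, and ordinary least-squares polynomial fitting via NumPy is assumed.

```python
import numpy as np


def r_squared(x, y, degree):
    """Proportion of variance in y explained by a polynomial in x of the given degree."""
    coeffs = np.polyfit(x, y, degree)      # ordinary least-squares polynomial fit
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2) # total sum of squares
    return 1.0 - ss_res / ss_tot


# Synthetic data only: made-up log10 frequencies and reaction times (ms)
# with a mildly curved frequency effect plus noise.
rng = np.random.default_rng(0)
log_freq = rng.uniform(0.0, 5.0, size=400)
rt = 700.0 - 60.0 * log_freq + 6.0 * log_freq ** 2 + rng.normal(0.0, 30.0, size=400)

r2_linear = r_squared(log_freq, rt, degree=1)     # log frequency only
r2_quadratic = r_squared(log_freq, rt, degree=2)  # log frequency + its square

print(f"log only:         {100 * r2_linear:.1f}% of variance")
print(f"log + log squared: {100 * r2_quadratic:.1f}% of variance")
```

Because the linear model is nested within the quadratic one, the second percentage can never be lower than the first, which matches the pattern in the tables.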