Guangyao Zhang1,2, Panpan Yao1,3, Guojie Ma1,2, Jingwen Wang1,2, Junyi Zhou1,2, Linjieqiong Huang1,2, Pingping Xu1,2, Lijing Chen1,2, Songlin Chen1,3, Junjuan Gu1,2, Wei Wei1,2, Xi Cheng1,2, Huimin Hua1,2, Pingping Liu1,2, Ya Lou1,2, Wei Shen1,2, Yaqian Bao1,2, Jiayu Liu1,2, Nan Lin4,5, Xingshan Li6,7.
Abstract
Eye movements are one of the most fundamental behaviors during reading. A growing number of Chinese reading studies have used eye-tracking techniques in the last two decades. The accumulated data provide a rich resource that can reflect the complex cognitive mechanisms underlying Chinese reading. This article reports a database of eye-movement measures of words during Chinese sentence reading. The database contains nine eye-movement measures of 8,551 Chinese words obtained from 1,718 participants across 57 Chinese sentence reading experiments. All data were collected in the same experimental environment and from homogeneous participants, using the same protocols and parameters. This database enables researchers to test their theoretical or computational hypotheses concerning Chinese reading efficiently using a large number of words. The database can also indicate the processing difficulty of Chinese words during text reading, thus providing a way to control or manipulate the difficulty level of Chinese texts.
Year: 2022 PMID: 35840575 PMCID: PMC9287311 DOI: 10.1038/s41597-022-01464-6
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1. Schematic visualization of word segmentation and measure calculation. Note. Panel (a) shows the procedure of word segmentation for one sentence. Panel (b) shows an example of the procedure of calculating an eye-movement measure (i.e., FFD) on a word (e.g., “沙漠” meaning “desert” in English).
Definitions and Abbreviations of the Nine Eye-Movement Measures.
| Eye-Movement Measures | Abbreviations | Definition |
|---|---|---|
| First fixation duration* | FFD | Duration of the first fixation on the target word |
| Gaze duration* | GD | Sum of the fixation durations before the target word is exited to the right or left during first-pass reading |
| First-pass reading fixation proportion* | FPF | Proportion of observations in which the target word is fixated during first-pass reading |
| Fixation number+ | FN | Total number of fixations on the target word |
| Proportion regression in+ | RI | Proportion of regressions into the target word |
| Proportion regression out+ | RO | Proportion of regressions out from the target word |
| Saccade length toward the target from the left+ | LI_left | Length of saccade into the target word when the word is first fixated from the left side (unit: character) |
| Saccade length from the target to the right+ | LO_right | Length of the saccade from the target word to the right after the word is first fixated (unit: character) |
| Total fixation duration+ | TT | Sum of the fixation durations on the target word |
Note. *Main measures in the database. +Supplementary measures in the database.
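The duration and count definitions above can be made concrete with a minimal sketch. It assumes one trial is represented as a time-ordered list of `(word_index, duration_ms)` pairs; this representation, the function name `word_measures`, and the rule that a word counts as fixated in first pass only if no word to its right was fixated earlier are illustrative assumptions, not the database's actual processing pipeline.

```python
from typing import Dict, List, Tuple

# One fixation: (index of the fixated word, duration in ms), in temporal order.
# This trial representation is a hypothetical one chosen for illustration.
Fixation = Tuple[int, int]


def word_measures(fixations: List[Fixation], target: int) -> Dict[str, object]:
    """Compute FFD, GD, FN, TT and first-pass fixation status for one word in one trial."""
    on_target = [d for w, d in fixations if w == target]
    measures: Dict[str, object] = {
        "FFD": None,                  # undefined if not fixated in first pass
        "GD": None,                   # undefined if not fixated in first pass
        "first_pass_fixated": False,  # averaged over observations, this gives FPF
        "FN": len(on_target),         # total number of fixations on the word
        "TT": sum(on_target),         # total fixation duration on the word
    }

    # Position of the first fixation that lands on the target word, if any.
    first = next((i for i, (w, _) in enumerate(fixations) if w == target), None)
    if first is None:
        return measures

    # Assumption: if a word to the right was fixated earlier, the target was
    # skipped during first pass, so FFD and GD stay undefined.
    if any(w > target for w, _ in fixations[:first]):
        return measures

    # Gaze duration: sum consecutive fixations until the eyes leave the word,
    # whether to the left or to the right.
    gd = 0
    for w, d in fixations[first:]:
        if w != target:
            break
        gd += d

    measures.update(FFD=fixations[first][1], GD=gd, first_pass_fixated=True)
    return measures


trial = [(0, 220), (2, 250), (2, 180), (3, 240), (2, 200)]
print(word_measures(trial, 2))
# {'FFD': 250, 'GD': 430, 'first_pass_fixated': True, 'FN': 3, 'TT': 630}
```

In this example, the third fixation on word 2 occurs after a regression from word 3, so it contributes to TT and FN but not to GD.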
Mean Value (Standard Deviation) of the Eye-Movement Measures on Words of Different Length.
| Measure | 1-character words | 2-character words | 3-character words | 4-character words |
|---|---|---|---|---|
| Sample size | 1354 | 6128 | 547 | 522 |
| FFD (ms) | 264 (38) | 264 (33) | 259 (30) | 254 (32) |
| GD (ms) | 270 (43) | 307 (59) | 364 (90) | 414 (106) |
| FPF | 0.459 (0.132) | 0.783 (0.123) | 0.927 (0.082) | 0.963 (0.052) |
| FN | 0.747 (0.255) | 1.460 (0.470) | 1.928 (0.582) | 2.304 (0.654) |
| RI | 0.145 (0.100) | 0.223 (0.127) | 0.232 (0.132) | 0.208 (0.135) |
| RO | 0.146 (0.102) | 0.242 (0.137) | 0.262 (0.158) | 0.298 (0.184) |
| LI_left (characters) | 2.700 (2.761) | 2.801 (1.650) | 2.941 (1.195) | 3.142 (1.103) |
| LO_right (characters) | 3.338 (7.839) | 3.450 (5.318) | 3.390 (5.861) | 3.935 (4.93) |
| TT (ms) | 338 (86) | 438 (139) | 509 (178) | 573 (190) |
Results for the Effects of Word Frequency and Word Length on the Main Eye-Movement Measures.
| Dependent variable | Independent variable | Estimate | Cohen’s d | Test statistic |
|---|---|---|---|---|
| FFD | Log-transformed word frequency | −11.170 | −0.253 | −21.026*** |
| | 2-char words vs 1-char words | −7.568 | −0.230 | −7.328*** |
| | 3-char words vs 2-char words | −11.626 | −0.353 | −7.499*** |
| | 4-char words vs 3-char words | −6.824 | −0.207 | −3.154** |
| GD | Log-transformed word frequency | −22.350 | −0.249 | −23.031*** |
| | 2-char words vs 1-char words | 19.533 | 0.291 | 10.354*** |
| | 3-char words vs 2-char words | 40.469 | 0.603 | 14.291*** |
| | 4-char words vs 3-char words | 46.267 | 0.690 | 11.708*** |
| FPF | Log-transformed word frequency | −0.045 | −0.183 | −23.224*** |
| | 2-char words vs 1-char words | 0.291 | 1.599 | 78.196*** |
| | 3-char words vs 2-char words | 0.127 | 0.697 | 22.685*** |
| | 4-char words vs 3-char words | 0.024 | 0.133 | 3.093** |
Note. *p < 0.05, **p < 0.01, ***p < 0.001. Abbreviations: FFD, first fixation duration; GD, gaze duration; FPF, first-pass reading fixation proportion.
Lexical Information of the Four Quarters of Words Divided Based on the Number of Observations.
| Measure | Quarter | Log-transformed word frequency, mean (SD) | Log-transformed word frequency, range | Number of observations, mean (SD) | Number of observations, range | Sample size, 1-character words | Sample size, 2-character words | Sample size, 3-character words | Sample size, 4-character words |
|---|---|---|---|---|---|---|---|---|---|
| FFD | Quarter 1 | 0.632 (0.541) | [0.013, 3.425] | 8 (3) | [1, 13] | 450 | 1250 | 172 | 136 |
| | Quarter 2 | 0.689 (0.554) | [0.013, 3.197] | 20 (4) | [13, 26] | 263 | 1443 | 126 | 176 |
| | Quarter 3 | 0.864 (0.621) | [0.013, 4.598] | 37 (9) | [26, 55] | 242 | 1567 | 121 | 78 |
| | Quarter 4 | 1.598 (0.809) | [0.013, 4.700] | 219 (892) | [55, 35299] | 396 | 1542 | 49 | 22 |
| GD | Quarter 1 | 0.632 (0.541) | [0.013, 3.425] | 8 (3) | [1, 13] | 450 | 1250 | 172 | 136 |
| | Quarter 2 | 0.689 (0.554) | [0.013, 3.197] | 20 (4) | [13, 26] | 263 | 1443 | 126 | 176 |
| | Quarter 3 | 0.864 (0.621) | [0.013, 4.598] | 37 (9) | [26, 55] | 242 | 1567 | 121 | 78 |
| | Quarter 4 | 1.598 (0.809) | [0.013, 4.700] | 219 (892) | [55, 35299] | 396 | 1542 | 49 | 22 |
| FPF | Quarter 1 | 0.572 (0.517) | [0.013, 3.425] | 10 (3) | [2, 15] | 362 | 1268 | 230 | 148 |
| | Quarter 2 | 0.666 (0.524) | [0.013, 3.197] | 23 (5) | [15, 30] | 190 | 1544 | 87 | 187 |
| | Quarter 3 | 0.905 (0.598) | [0.013, 3.527] | 46 (11) | [30, 70] | 301 | 1534 | 113 | 60 |
| | Quarter 4 | 1.639 (0.797) | [0.013, 4.700] | 355 (2073) | [70, 83658] | 498 | 1456 | 38 | 17 |
Note. Quarters of each measure were divided based on the number of observations of words in ascending order, with each quarter containing 2008-2009 words. Abbreviations: FFD, first fixation duration; GD, gaze duration; FPF, first-pass reading fixation proportion; SD, standard deviation.
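The quartering described in the note can be sketched as a rank-based split: sort the words by their number of observations in ascending order, then cut the ranking into four near-equal parts. The function name `split_into_quarters`, the input dictionary, and the alphabetical tie-breaking are illustrative assumptions; the source does not say how words with equal observation counts were assigned at quarter boundaries.

```python
from typing import Dict, List


def split_into_quarters(n_obs: Dict[str, int]) -> List[List[str]]:
    """Split words into four near-equal groups by ascending number of observations."""
    # Rank words by observation count; ties are broken by the word itself
    # (an assumption made only to keep the result deterministic).
    ranked = sorted(n_obs, key=lambda word: (n_obs[word], word))
    n = len(ranked)
    # Integer cut points keep group sizes within one word of each other.
    bounds = [i * n // 4 for i in range(5)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(4)]


# Hypothetical observation counts for eight placeholder words.
counts = {"a": 1, "b": 5, "c": 3, "d": 9, "e": 2, "f": 7, "g": 4, "h": 8}
for label, words in zip(["Quarter 1", "Quarter 2", "Quarter 3", "Quarter 4"],
                        split_into_quarters(counts)):
    print(label, words)
```

Because the split is by rank rather than by value, words with the same count can fall on either side of a boundary, which is consistent with adjacent quarters in the table sharing an endpoint in their observation ranges (for example, 13 in both Quarter 1 and Quarter 2).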
Results for the Effects of Word Frequency and Word Length on the Main Eye-Movement Measures.
| Dependent variable | Independent variable | Estimate | Cohen’s d | Test statistic |
|---|---|---|---|---|
| FFD | Log-transformed word frequency | −10.526 | −0.249 | −17.607*** |
| | 2-char words vs 1-char words | −11.349 | −0.346 | −9.083*** |
| | 3-char words vs 2-char words | −10.388 | −0.317 | −7.115*** |
| | 4-char words vs 3-char words | −7.453 | −0.227 | −3.772*** |
| GD | Log-transformed word frequency | −25.394 | −0.276 | −22.154*** |
| | 2-char words vs 1-char words | 6.727 | 0.094 | 2.808** |
| | 3-char words vs 2-char words | 45.76 | 0.64 | 16.347*** |
| | 4-char words vs 3-char words | 42.77 | 0.598 | 11.289*** |
| FPF | Log-transformed word frequency | −0.054 | −0.235 | −25.514*** |
| | 2-char words vs 1-char words | 0.259 | 1.441 | 58.157*** |
| | 3-char words vs 2-char words | 0.12 | 0.669 | 23.113*** |
| | 4-char words vs 3-char words | 0.021 | 0.116 | 2.949** |
Note. *p < 0.05, **p < 0.01, ***p < 0.001. Abbreviations: FFD, first fixation duration; GD, gaze duration; FPF, first-pass reading fixation proportion.
Lexical Information of the Four Quarters of Words Divided Based on the Number of Observations.
| Measure | Quarter | Log-transformed word frequency, mean (SD) | Log-transformed word frequency, range | Number of observations, mean (SD) | Number of observations, range | Sample size, 1-character words | Sample size, 2-character words | Sample size, 3-character words | Sample size, 4-character words |
|---|---|---|---|---|---|---|---|---|---|
| FFD | Quarter 1 | 0.792 (0.659) | [0.009, 3.386] | 8 (3) | [1, 13] | 342 | 1376 | 212 | 171 |
| | Quarter 2 | 0.842 (0.641) | [0.017, 3.312] | 19 (4) | [13, 26] | 248 | 1470 | 150 | 233 |
| | Quarter 3 | 1.024 (0.661) | [0.023, 3.347] | 36 (8) | [26, 53] | 234 | 1653 | 126 | 88 |
| | Quarter 4 | 1.697 (0.778) | [0.016, 4.46] | 212 (872) | [53, 35299] | 406 | 1613 | 56 | 26 |
| GD | Quarter 1 | 0.792 (0.659) | [0.009, 3.386] | 8 (3) | [1, 13] | 342 | 1376 | 212 | 171 |
| | Quarter 2 | 0.842 (0.641) | [0.017, 3.312] | 19 (4) | [13, 26] | 248 | 1470 | 150 | 233 |
| | Quarter 3 | 1.024 (0.661) | [0.023, 3.347] | 36 (8) | [26, 53] | 234 | 1653 | 126 | 88 |
| | Quarter 4 | 1.697 (0.778) | [0.016, 4.46] | 212 (872) | [53, 35299] | 406 | 1613 | 56 | 26 |
| FPF | Quarter 1 | 0.727 (0.627) | [0.009, 3.386] | 10 (3) | [2, 15] | 278 | 1358 | 283 | 182 |
| | Quarter 2 | 0.799 (0.584) | [0.017, 3.199] | 22 (5) | [15, 30] | 169 | 1593 | 98 | 241 |
| | Quarter 3 | 1.064 (0.649) | [0.023, 3.326] | 44 (11) | [30, 67] | 272 | 1629 | 124 | 76 |
| | Quarter 4 | 1.765 (0.763) | [0.016, 4.46] | 343 (2028) | [67, 83658] | 511 | 1532 | 39 | 19 |
Note. Quarters of each measure were divided based on the number of observations of words in ascending order, with each quarter containing 2101 words. Abbreviations: FFD, first fixation duration; GD, gaze duration; FPF, first-pass reading fixation proportion; SD, standard deviation.
| Attribute | Value |
|---|---|
| Measurement(s) | eye movement |
| Technology Type(s) | eye tracking |
| Factor Type(s) | word frequency • word length |
| Sample Characteristic - Organism | Human |
| Sample Characteristic - Environment | laboratory environment |