Alessandra Rossetti, Luuk Van Waes.
Abstract
Text simplification involves making texts easier to understand, usually for lay readers. Simplifying texts is a complex task, especially when conducted in a second language. The readability of the produced texts and the way in which authors manage the different phases of the text simplification process are influenced by their writing expertise and by their language proficiency. Training on audience awareness can be beneficial for writers, but most research so far has devoted attention to first-language writers who simplify their own texts. Therefore, this study investigated the impact of text simplification training on second-language writers (university students) who simplify already existing texts. Specifically, after identifying a first and a second phase in the text simplification process (namely, two distinct series of writing dynamics), we analyzed the impact of our training on pausing and revision behavior across phases, as well as levels of readability achieved by the students. Additionally, we examined correlations between pausing behavior and readability by using keystroke logging data and automated text analysis. We found that phases of text simplification differ along multiple dimensions, even though our training did not seem to influence pausing and revision dynamics. Our training led to texts with fewer and shorter words, and with syntactically simpler sentences. The correlation analysis showed that longer and more frequent pauses at specific text locations were linked with increased readability in the same or adjacent text locations. We conclude the paper by discussing theoretical, methodological, and pedagogical implications, alongside limitations and areas for future research.
Keywords: automated text analysis; keystroke logging; second language; text readability; text simplification training; writing phases
Year: 2022 PMID: 36171798 PMCID: PMC9510649 DOI: 10.3389/frai.2022.983008
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
Figure 1. Example of an Inputlog process graph.
Figure 2. Time distribution across phases in pre-test session.
Figure 3. Time distribution across phases in post-test session.
Pausing behavior across writing phases.
| Measure | | | p | | | p |
|---|---|---|---|---|---|---|
| Total process time | 0:33:41 | 0:14:16 | <0.001 | 0:36:59 | 0:15:30 | <0.001 |
| | 0:09:58 | 0:08:53 | | 0:12:45 | 0:08:16 | |
| Total pause time | 0:22:44 | 0:10:39 | <0.001 | 0:24:42 | 0:10:34 | <0.001 |
| | 0:06:18 | 0:05:33 | | 0:07:56 | 0:05:27 | |
| Total active writing time | 0:10:56 | 0:05:13 | <0.001 | 0:12:17 | 0:06:10 | <0.001 |
| | 0:03:38 | 0:03:42 | | 0:04:48 | 0:03:22 | |
| Proportion of pause time | 67% | 10% | 0.10 | 67% | 7% | <0.05 |
| | 64% | 10% | | 62% | 12% | |
| Total number of pauses | 1,159 | 518 | <0.001 | 1,311 | 614 | <0.001 |
| | 314 | 313 | | 413 | 296 | |
| Mean pause time (secs) | 0.55 | 0.06 | <0.05 | 0.54 | 0.07 | 0.05 |
| | 0.59 | 0.12 | | 0.58 | 0.11 | |
| Median pause time (secs) | 0.42 | 0.06 | <0.05 | 0.42 | 0.68 | <0.05 |
| | 0.48 | 0.10 | | 0.47 | 0.09 | |
| Number P-Bursts | 115.33 | 50.85 | <0.001 | 127.83 | 54.40 | <0.001 |
| | 32.41 | 27.14 | | 41.08 | 29.89 | |
| Number within-word pauses | 354.36 | 203.59 | <0.001 | 401.03 | 225.01 | <0.001 |
| | 76.05 | 87.51 | | 102.83 | 92.82 | |
| Mean within-word pauses | 0.32 | 0.02 | 1.00 | 0.32 | 0.02 | 0.66 |
| | 0.33 | 0.04 | | 0.33 | 0.06 | |
| Number before-word pauses | 168.92 | 81.51 | <0.001 | 181.93 | 88.08 | <0.001 |
| | 37.92 | 53.22 | | 44.90 | 37.59 | |
| Mean before-word pauses | 0.57 | 0.14 | 0.10 | 0.54 | 0.12 | 0.30 |
| | 0.53 | 0.23 | | 0.51 | 0.17 | |
| Number before-sentence pauses | 9.49 | 7.53 | <0.001 | 11.23 | 7.37 | <0.001 |
| | 1.82 | 2.83 | | 3.33 | 3.36 | |
| Mean before-sentence pauses | 0.61 | 0.39 | 1.00 | 0.69 | 0.87 | 0.001 |
| | 0.46 | 0.22 | | 0.34 | 0.34 | |
| Number before-paragraph pauses | 12.46 | 8.23 | <0.001 | 14.38 | 8.36 | <0.001 |
| | 3.54 | 5.01 | | 4.33 | 4.63 | |
| Mean before-paragraph pauses | 0.61 | 0.20 | <0.01 | 0.65 | 0.34 | <0.05 |
| | 0.36 | 0.32 | | 0.47 | 0.30 | |
| Number revision pauses | 131.82 | 82.67 | <0.001 | 151.57 | 95.44 | <0.001 |
| | 38.97 | 48.76 | | 45.25 | 42.67 | |
| Mean revision pauses | 0.57 | 0.09 | 0.51 | 0.55 | 0.79 | 0.10 |
| | 0.53 | 0.17 | | 0.53 | 0.15 | |
| Number between-word pauses | 220.23 | 112.83 | <0.001 | 255.50 | 122.08 | <0.001 |
| | 48.59 | 82.85 | | 65.40 | 63.98 | |
| Mean between-word pauses | 0.57 | 0.13 | 0.87 | 0.55 | 0.13 | <0.05 |
| | 0.57 | 0.20 | | 0.49 | 0.17 | |
| Number between-sentence pauses | 7.46 | 4.63 | <0.001 | 8.82 | 5.77 | <0.001 |
| | 1.44 | 4.10 | | 2.45 | 2.98 | |
| Mean between-sentence pauses | 0.83 | 0.55 | 0.72 | 1.07 | 0.96 | <0.05 |
| | 0.65 | 0.64 | | 0.58 | 0.53 | |
The * symbol indicates statistical significance at p < 0.05.
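The three time-based measures in the pausing table are arithmetically related: pause time plus active writing time roughly equals process time, and the proportion of pause time is pause time divided by process time. The following is a minimal sketch of that arithmetic, assuming the leading value in each row is a group average. The helper names `to_seconds` and `pause_share` are hypothetical and are not part of Inputlog or of the authors' analysis.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string such as '0:33:41' to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pause_share(pause_time: str, process_time: str) -> float:
    """Return pause time as a fraction of total process time."""
    return to_seconds(pause_time) / to_seconds(process_time)


# (total pause time, total process time) pairs copied from the table.
# Assumption: these are group averages, so each ratio only approximates
# the reported "Proportion of pause time" for the same cell.
pairs = [
    ("0:22:44", "0:33:41"),
    ("0:06:18", "0:09:58"),
    ("0:24:42", "0:36:59"),
    ("0:07:56", "0:12:45"),
]

for pause, process in pairs:
    print(f"{pause} / {process} = {pause_share(pause, process):.0%}")
```

The ratios come out at about 67%, 63%, 67%, and 62%, close to the tabulated 67%, 64%, 67%, and 62%. A small gap is expected if the tabulated proportion was averaged per writer, because a ratio of averages need not equal an average of ratios.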
Revision behavior across writing phases.
| Measure | | | p | | | p |
|---|---|---|---|---|---|---|
| Count of all revision events | 250.49 | 139.66 | <0.001 | 254.33 | 129.69 | <0.001 |
| | 996.72 | 1,140.45 | | 85.47 | 85.19 | |
| Sum of characters in all revision events | 7,858.49 | 7,509.93 | <0.001 | 6,683.35 | 5,136.26 | 0.001 |
| | 4,064.23 | 8,223.70 | | 3,869.72 | 5,208.27 | |
| Normal production (events) | 41.86 | 49.56 | <0.001 | 51.14 | 49.97 | <0.05 |
| | 304.54 | 331.33 | | 17.94 | 20.05 | |
| Normal production (characters) | 2,106.70 | 1,723.78 | 0.057 | 1,817.69 | 1,517.51 | 0.549 |
| | 1,256.60 | 1,546.09 | | 1,497.94 | 1,912.41 | |
| Insertion (events) | 95.87 | 80.95 | <0.05 | 86.60 | 71.37 | <0.05 |
| | 390.84 | 686.37 | | 38.33 | 47.99 | |
| Insertion (characters) | 2,460.26 | 2,488.15 | <0.05 | 2,504.55 | 2,218.24 | <0.05 |
| | 1,846.81 | 5,436.73 | | 1,227.76 | 1,790.80 | |
| Deletion (events) | 117.36 | 66.12 | <0.001 | 119.79 | 62.92 | <0.001 |
| | 402.72 | 431.97 | | 41.02 | 40.17 | |
| Deletion (characters) | 3,462.64 | 6,191.19 | <0.001 | 2,461.63 | 2,708.57 | <0.05 |
| | 1,421.18 | 3,220.55 | | 1,673.40 | 2,899.75 | |
The * symbol indicates statistical significance at p < 0.05.