Johanna K Kaakinen, Egon Werlen, Yvonne Kammerer, Cengiz Acartürk, Xavier Aparicio, Thierry Baccino, Ugo Ballenghein, Per Bergamin, Núria Castells, Armanda Costa, Isabel Falé, Olga Mégalakaki, Susana Ruiz Fernández.
Abstract
We introduce a database (IDEST) of 250 short stories rated for valence, arousal, and comprehensibility in two languages. The texts are first-person narratives of controlled length. They were originally written in six different languages (Finnish, French, German, Portuguese, Spanish, and Turkish) and rated for arousal, valence, and comprehensibility in the original language. The stories were then translated into English, and the same ratings were collected for the English translations via an internet survey tool (N = 573). In addition to the rating data, we report readability indexes for the original and English texts. The texts have been categorized into story types based on their emotional arc. The texts score high on comprehensibility and represent a wide range of emotional valence and arousal levels. A comparative analysis of the ratings of the original texts and the English translations showed that valence ratings were very similar across languages, whereas correlations between the two language versions were modest for arousal and comprehensibility. Comprehensibility ratings correlated with only some of the readability indexes. The database is published at osf.io/9tga3 and is freely available for academic research.
Year: 2022 PMID: 36206273 PMCID: PMC9544016 DOI: 10.1371/journal.pone.0274480
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
The total number of texts obtained per language and the number and percentage of texts retained in the database.
| Language | total # texts | # texts retained | % texts retained |
|---|---|---|---|
| Finnish | 45 | 45 | 100.00 |
| French | 75 | 46 | 61.33 |
| German | 102 | 55 | 53.92 |
| Portuguese | 36 | 30 | 83.33 |
| Spanish | 50 | 31 | 62.00 |
| Turkish | 46 | 43 | 93.48 |
| Total | 354 | 250 | 70.62 |
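The percentages in the last column follow directly from the two count columns. As a quick consistency check, the minimal Python sketch below recomputes them from the counts in the table; the dictionary and function names are illustrative and are not part of the database.

```python
# Counts copied from the table above: (texts obtained, texts retained).
counts = {
    "Finnish": (45, 45),
    "French": (75, 46),
    "German": (102, 55),
    "Portuguese": (36, 30),
    "Spanish": (50, 31),
    "Turkish": (46, 43),
}


def percent_retained(total, retained):
    """Share of obtained texts that were kept, in percent, rounded to 2 decimals."""
    return round(100 * retained / total, 2)


# Per-language percentages and the overall figure for the "Total" row.
per_language = {lang: percent_retained(t, r) for lang, (t, r) in counts.items()}
total_obtained = sum(t for t, _ in counts.values())
total_retained = sum(r for _, r in counts.values())
overall = percent_retained(total_obtained, total_retained)

for lang, pct in per_language.items():
    print(f"{lang}: {pct:.2f}")
print(f"Total: {total_obtained} obtained, {total_retained} retained, {overall:.2f}")
```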
Means, standard deviations, and minimum and maximum values of valence for all texts, separately for the English translations and the original language versions.
| Language of origin | Number of texts | English M | English SD | English Min | English Max | Original M | Original SD | Original Min | Original Max |
|---|---|---|---|---|---|---|---|---|---|
| All | 250 | 4.61 | 1.67 | 1.24 | 7.88 | 4.68 | 1.98 | 1.17 | 8.61 |
| Finnish | 45 | 5.15 | 1.80 | 2.17 | 7.88 | 5.19 | 2.10 | 1.77 | 8.00 |
| French | 46 | 4.79 | 1.78 | 1.96 | 7.35 | 4.66 | 1.90 | 1.25 | 7.42 |
| German | 55 | 4.23 | 1.63 | 1.24 | 7.48 | 4.23 | 2.10 | 1.17 | 8.61 |
| Portuguese | 30 | 4.94 | 1.77 | 2.29 | 7.32 | 5.23 | 2.04 | 2.00 | 8.50 |
| Spanish | 31 | 4.62 | 1.81 | 1.90 | 7.45 | 4.66 | 2.19 | 1.41 | 8.15 |
| Turkish | 43 | 4.13 | 1.00 | 2.44 | 6.54 | 4.39 | 1.42 | 2.00 | 7.20 |
Reliability of valence, arousal, and comprehensibility ratings for English texts and the texts in their original language as averages of ICC(2,k).
| Language | Groups | Raters / group | Valence M | Valence Lb | Valence Ub | Arousal M | Arousal Lb | Arousal Ub | Compreh. M | Compreh. Lb | Compreh. Ub |
|---|---|---|---|---|---|---|---|---|---|---|---|
| English | 25 | 20–25 | .97 | .93 | .99 | .87 | .71 | .96 | .80 | .55 | .94 |
| Finnish | 3 | 25–26 | .99 | .97 | .99 | .96 | .93 | .99 | .54 | .22 | .82 |
| French | 1 | 12 | .96 | .94 | .97 | .87 | .81 | .92 | .81 | .71 | .88 |
| German | 3 | 17–20 | .98 | .97 | .99 | .92 | .85 | .96 | .83 | .68 | .92 |
| Portuguese | 6 | 6–14 | .95 | .84 | .99 | .48 | .03 | .94 | .53 | .11 | .94 |
| Spanish | 3 | 21–27 | .96 | .87 | .99 | .70 | .37 | .96 | .75 | .49 | .96 |
| Turkish | 2 | 5 | .83 | .68 | .92 | .67 | .37 | .85 | .44 | .13 | .74 |
Note. Groups = number of rater groups, M = mean, Lb = lower bound, Ub = upper bound, Compreh. = Comprehensibility.
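The record does not say which software produced the ICC(2,k) values. As a hedged illustration only, the sketch below follows the standard Shrout and Fleiss definition of ICC(2,k) (two-way random effects, average of k raters), computed from the mean squares of a complete texts-by-raters matrix. The function name `icc_2k` and the toy ratings are invented for the example, and it assumes every rater in a group rated every text.

```python
def icc_2k(ratings):
    """ICC(2,k) for a complete matrix: one row per text, one column per rater."""
    n = len(ratings)      # number of rated texts (targets)
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between texts
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                # between-texts mean square
    jms = ss_cols / (k - 1)                # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (bms - ems) / (bms + (jms - ems) / n)


# Hypothetical ratings: 4 texts rated by 3 raters (invented numbers).
toy_ratings = [
    [2, 3, 2],
    [7, 8, 6],
    [5, 5, 4],
    [8, 9, 9],
]
print(round(icc_2k(toy_ratings), 3))
```

With several rater groups per language, one value per group would be computed this way and then averaged, which is how the caption describes the M columns.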
Correlations between ratings given to English translations and original texts, and the ML-ICC values for valence, arousal, and comprehensibility.
| Language | Valence r | Valence ML-ICC T | Valence ML-ICC R | Valence ML-ICC L | Arousal r | Arousal ML-ICC T | Arousal ML-ICC R | Arousal ML-ICC L | Compreh. r | Compreh. ML-ICC T | Compreh. ML-ICC R | Compreh. ML-ICC L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | .92 | .59 | .03 | .00 | .40 | .16 | .22 | .13 | .53 | .12 | .30 | .07 |
| Finnish | .97 | .65 | .03 | .00 | .89 | .32 | .23 | .08 | -.05 | .00 | .61 | .17 |
| French | .91 | .59 | .02 | .00 | .46 | .13 | .27 | .14 | .34 | .06 | .37 | .01 |
| German | .96 | .61 | .02 | .00 | .59 | .19 | .20 | .07 | .69 | .12 | .34 | .01 |
| Portuguese | .94 | .57 | .02 | .01 | -.07 | .06 | .29 | .00 | .36 | .05 | .28 | .22 |
| Spanish | .97 | .66 | .01 | .00 | .40 | .08 | .25 | .14 | .73 | .10 | .45 | .13 |
| Turkish | .64 | .28 | .09 | .00 | .26 | .13 | .23 | .02 | .32 | .09 | .24 | .10 |
Note. *p < .05. ML-ICC = intraclass correlation in a multilevel model (R²), r = Pearson correlation, T = texts, R = raters, L = language, Compreh. = Comprehensibility.
Fig 1. Distributions of the valence, arousal and comprehensibility ratings for original texts and English translations.
(A) Valence ratings. (B) Arousal ratings. (C) Comprehensibility ratings.
Means, standard deviations, and minimum and maximum arousal ratings for all texts, separately for the English translations and the original language versions.
| Language of origin | Number of texts | English M | English SD | English Min | English Max | Original M | Original SD | Original Min | Original Max |
|---|---|---|---|---|---|---|---|---|---|
| All | 250 | 4.22 | 1.03 | 1.95 | 6.88 | 4.57 | 1.44 | 1.31 | 8.28 |
| Finnish | 45 | 4.54 | 1.50 | 1.95 | 6.88 | 3.54 | 1.44 | 1.31 | 6.31 |
| French | 46 | 4.09 | 0.94 | 2.46 | 6.83 | 5.41 | 1.12 | 2.92 | 7.92 |
| German | 55 | 4.22 | 1.00 | 2.17 | 6.19 | 5.11 | 1.35 | 1.58 | 8.28 |
| Portuguese | 30 | 4.06 | 0.70 | 2.68 | 5.50 | 4.30 | 1.03 | 2.63 | 6.83 |
| Spanish | 31 | 4.12 | 0.71 | 2.95 | 5.30 | 5.38 | 0.92 | 3.93 | 7.15 |
| Turkish | 43 | 4.21 | 0.90 | 2.87 | 6.23 | 3.64 | 1.20 | 1.80 | 5.80 |
Means, standard deviations, and minimum and maximum values of comprehensibility ratings for all texts, separately for the English translations and the original language versions.
| Language of origin | Number of texts | English M | English SD | English Min | English Max | Original M | Original SD | Original Min | Original Max |
|---|---|---|---|---|---|---|---|---|---|
| All | 250 | 7.39 | 0.67 | 5.04 | 8.54 | 8.10 | 0.78 | 5.22 | 9.00 |
| Finnish | 45 | 7.99 | 0.34 | 7.17 | 8.54 | 8.78 | 0.18 | 8.40 | 8.96 |
| French | 46 | 7.31 | 0.47 | 6.35 | 8.18 | 7.74 | 0.72 | 5.22 | 8.78 |
| German | 55 | 7.40 | 0.65 | 5.10 | 8.52 | 7.64 | 0.76 | 5.83 | 8.63 |
| Portuguese | 30 | 7.25 | 0.49 | 6.43 | 8.29 | 8.61 | 0.46 | 7.50 | 9.00 |
| Spanish | 31 | 7.57 | 0.54 | 6.22 | 8.41 | 8.41 | 0.57 | 5.96 | 8.88 |
| Turkish | 43 | 6.78 | 0.75 | 5.04 | 7.95 | 7.76 | 0.81 | 5.40 | 8.80 |
Fig 2. A scatterplot of the relationship between valence and arousal ratings for the English translations (left panel) and original texts (right panel). The different colors correspond to the six languages of the original texts.
Means, standard deviations, minimum and maximum values of valence and arousal ratings of texts as a function of story type.
| Story type | N | Valence M | Valence SD | Valence Min | Valence Max | Arousal M | Arousal SD | Arousal Min | Arousal Max |
|---|---|---|---|---|---|---|---|---|---|
| Constant | 54 | 4.88 | 1.47 | 2.29 | 7.48 | 3.45 | 0.89 | 1.95 | 5.83 |
| Tragedy | 69 | 2.95 | 0.59 | 1.24 | 4.56 | 4.73 | 1.03 | 3.00 | 6.88 |
| Rags-to-riches | 50 | 6.72 | 0.59 | 5.14 | 7.88 | 4.28 | 0.87 | 2.85 | 6.54 |
| Man-in-a-hole | 19 | 5.17 | 1.11 | 3.18 | 6.96 | 4.73 | 0.83 | 3.42 | 6.23 |
| Icarus | 18 | 3.67 | 0.88 | 2.18 | 5.13 | 4.22 | 1.04 | 2.70 | 6.54 |
| Oedipus | 2 | 3.34 | 0.36 | 3.08 | 3.59 | 5.51 | 0.28 | 5.32 | 5.71 |
| Cinderella | 5 | 5.76 | 1.29 | 4.25 | 7.05 | 3.78 | 0.89 | 2.61 | 5.04 |
| No clear emotional arc | 33 | 4.58 | 1.23 | 3.00 | 7.23 | 4.01 | 0.77 | 2.82 | 5.61 |
Means, standard deviations, and minimum and maximum values of the readability scores for the 250 English texts and their correlations (r) to the comprehensibility rating.
| Readability index | M | SD | Min | Max | r |
|---|---|---|---|---|---|
| Flesch Reading Ease | 81.07 | 8.18 | 50.38 | 98.13 | -.13 |
| Flesch Kincaid Grade Level | 5.47 | 1.99 | 1.00 | 15.42 | .17 |
| Automated Readability Index | 4.77 | 2.42 | -0.26 | 16.71 | .17 |
| New Dale Chall Readability Formula | 6.16 | 0.87 | 1.20 | 8.48 | -.01 |
| CAREC | 0.10 | 0.06 | -0.16 | 0.23 | -.10 |
| CARES | 0.46 | 0.10 | 0.23 | 0.75 | -.20 |
| CML2RI | 24.96 | 5.37 | 11.63 | 38.22 | -.21 |
Note. * p < .05.
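The first three indexes in the table are closed-form functions of simple text counts. The sketch below uses the standard English-language formulas for Flesch Reading Ease, Flesch-Kincaid Grade Level, and the Automated Readability Index. It assumes the counts are already available; how words, sentences, and syllables are counted varies between tools and is not shown, and the example counts are hypothetical. The New Dale Chall, CAREC, CARES, and CML2RI indexes are not sketched here.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate US school grade level; higher means harder text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters, words, sentences):
    """Grade-level estimate based on characters per word, not syllables."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Hypothetical counts for one short story (not taken from the database).
words, sentences, syllables, characters = 100, 10, 130, 450

print(round(flesch_reading_ease(words, sentences, syllables), 2))
print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
print(round(automated_readability_index(characters, words, sentences), 2))
```

Because Flesch Reading Ease rises as text gets easier while the two grade-level indexes rise as text gets harder, their correlations with comprehensibility ratings are expected to carry opposite signs when the indexes agree.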
Means, standard deviations, and minimum and maximum values of the Flesch Reading Ease scores for the original texts and their correlations to the comprehensibility rating.
| Language of origin | M | SD | Min | Max | r |
|---|---|---|---|---|---|
| All | 65.54 | 13.81 | 29.18 | 93.58 | .03 |
| Finnish | - | - | - | - | - |
| French | 47.05 | 10.25 | 29.18 | 73.72 | -.13 |
| German | 73.57 | 8.30 | 57.51 | 93.58 | -.41 |
| Portuguese | 70.37 | 7.75 | 55.20 | 85.20 | .05 |
| Spanish | 68.56 | 8.90 | 47.39 | 84.22 | -.18 |
| Turkish | 69.51 | 11.34 | 33.35 | 89.67 | .30 |
Note. * p < .05.