Age of Exposure 2.0: Estimating word complexity using iterative models of word embeddings.

Robert-Mihai Botarleanu, Mihai Dascalu, Micah Watanabe, Scott Andrew Crossley, Danielle S McNamara.

Abstract

Age of acquisition (AoA) is a measure of word complexity that refers to the age at which a word is typically learned. AoA measures have shown strong correlations with reading comprehension, lexical decision times, and writing quality. AoA scores based on both adult and child data have limitations that allow for measurement error and increase the cost and effort of producing them. In this paper, we introduce Age of Exposure (AoE) version 2, a proxy for human exposure to new vocabulary terms that expands AoA word lists by training regressors to predict AoA scores. Word2vec word embeddings are trained on cumulatively increasing corpora of texts, word exposure trajectories are generated by aligning the word2vec vector spaces, and features of words are derived for modeling AoA scores. Our prediction models achieve low errors (from 13% with a corresponding R² of .35 up to 7% with an R² of .74), can be uniformly applied to different AoA word lists, and generalize to the entire vocabulary of a language. Our method benefits from using existing readability indices to define the order of texts in the corpora, and the analyses performed confirm that the generated AoA scores accurately predict the difficulty of texts (R² of .84, surpassing related previous work). Further, we provide evidence of the internal reliability of our word trajectory features, demonstrate the effectiveness of the word trajectory features when contrasted with simple lexical features, and show that the exclusion of features that rely on external resources does not significantly impact performance.
© 2022. The Psychonomic Society, Inc.
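
The alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the word2vec spaces are aligned or which trajectory features are derived, so everything here is an assumption: orthogonal Procrustes is used as one common alignment technique, the trajectory value is taken to be the cosine similarity between a word's aligned intermediate vector and its final vector, and the names `procrustes_align` and `exposure_trajectory` are hypothetical. Each snapshot is assumed to be a NumPy matrix with one row per word, using the same row order across snapshots.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target` (assumed alignment method, not confirmed by the abstract).

    Finds the orthogonal matrix R minimising ||source @ R - target||_F
    and returns source @ R. Rows are words; row order must match.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def exposure_trajectory(snapshots, word_index):
    """Hypothetical trajectory feature for one word.

    `snapshots` is a list of embedding matrices trained on cumulatively
    larger corpora; the last one is treated as the reference space.
    Returns one similarity value per snapshot.
    """
    final = snapshots[-1]
    trajectory = []
    for embeddings in snapshots:
        aligned = procrustes_align(embeddings, final)
        trajectory.append(cosine(aligned[word_index], final[word_index]))
    return trajectory
```

Under these assumptions, a word whose similarity to its final vector rises early in the sequence of snapshots would be read as stabilising after little exposure. Summary statistics of such a curve could then serve as inputs to a regressor that predicts AoA scores.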

Keywords:  Age of acquisition; Age of exposure; Word embeddings; Word exposure

Year: 2022    PMID: 35167112    DOI: 10.3758/s13428-022-01797-5

Source DB: PubMed    Journal: Behav Res Methods    ISSN: 1554-351X


