A Bayesian framework for word segmentation: exploring the effects of context.

Sharon Goldwater, Thomas L Griffiths, Mark Johnson.

Abstract

Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. Science, 274, 1926-1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words, and in particular how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child-directed speech. We develop several models within a Bayesian ideal observer framework, and use them to examine the consequences of assuming either that words are independent units, or units that help to predict other units. We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus, with many two- and three-word sequences (e.g. "what's that", "do you", "in the house") misidentified as individual words. In contrast, when the learner assumes that words are predictive, the resulting segmentation is far more accurate. These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful, and raise the possibility that even young infants may be able to exploit more subtle statistical patterns than have usually been considered.
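
The undersegmentation claim can be made concrete with a toy sketch. The abstract does not specify the models, so everything below is an assumption chosen for illustration: the scoring rule (each word is either a reuse of an earlier word, in proportion to its count, or a new word spelled out symbol by symbol), the hyperparameters ALPHA and P_STOP, the 26-symbol alphabet standing in for phonemes, and the six-utterance corpus. It scores two segmentations of the same text under an independence (unigram) assumption only; it does not implement the predictive, context-sensitive alternative.

```python
import math
from collections import Counter

ALPHABET_SIZE = 26  # assumed: lowercase letters stand in for phonemes
P_STOP = 0.5        # assumed probability of ending a word after each symbol
ALPHA = 1.0         # assumed weight given to generating a brand-new word


def log_base_prob(word):
    """Log-probability of spelling out a new word symbol by symbol."""
    n = len(word)
    return (math.log(P_STOP)
            + (n - 1) * math.log(1 - P_STOP)
            - n * math.log(ALPHABET_SIZE))


def log_unigram_score(utterances):
    """Score a segmented corpus, treating each word as independent of context."""
    counts = Counter()
    seen = 0      # number of word tokens scored so far
    total = 0.0
    for utterance in utterances:
        for word in utterance:
            # Reuse an earlier word (count) or create a new one (ALPHA * base).
            numer = counts[word] + ALPHA * math.exp(log_base_prob(word))
            total += math.log(numer) - math.log(seen + ALPHA)
            counts[word] += 1
            seen += 1
    return total


# Hypothetical toy corpus: the same text under two candidate segmentations.
correct = [["whats", "that"], ["do", "you"]] * 3
undersegmented = [["whatsthat"], ["doyou"]] * 3

score_correct = log_unigram_score(correct)
score_under = log_unigram_score(undersegmented)
print(f"correct:        {score_correct:.2f}")
print(f"undersegmented: {score_under:.2f}")
```

On this toy corpus the undersegmented analysis receives the higher score: each collocation is paid for once as a new word and then reused cheaply, whereas the correct analysis must account for twice as many independent tokens. That matches the direction of the effect the abstract reports, though the sketch is far too small to stand in for the paper's experiments.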

Year:  2009        PMID: 19409539     DOI: 10.1016/j.cognition.2009.03.008

Source DB:  PubMed          Journal:  Cognition        ISSN: 0010-0277


