More data trumps smarter algorithms: comparing pointwise mutual information with latent semantic analysis.

Gabriel Recchia, Michael N Jones.

Abstract

Computational models of lexical semantics, such as latent semantic analysis, can automatically generate semantic similarity measures between words from statistical redundancies in text. These measures are useful for experimental stimulus selection and for evaluating a model's cognitive plausibility as a mechanism that people might use to organize meaning in memory. Although humans are exposed to enormous quantities of speech, practical constraints limit the amount of data that many current computational models can learn from. We follow up on previous work evaluating a simple metric of pointwise mutual information. Controlling for confounds in previous work, we demonstrate that this metric benefits from training on extremely large amounts of data and correlates more closely with human semantic similarity ratings than do publicly available implementations of several more complex models. We also present a simple tool for building simple and scalable models from large corpora quickly and efficiently.
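For readers unfamiliar with the metric, pointwise mutual information for a word pair is conventionally defined as log2(P(x, y) / (P(x) P(y))). The abstract does not state how the probabilities were estimated, so the following is only a minimal sketch under an assumption made here: each probability is the fraction of documents containing the word (or both words). The function name pmi_table and the toy corpus are hypothetical and are not taken from the authors' tool.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return {(w1, w2): PMI} for word pairs that co-occur in at least one document.

    Assumption (not from the paper): probabilities are estimated as the
    fraction of documents in which a word, or a pair of words, appears.
    """
    n_docs = len(documents)
    word_counts = Counter()  # number of documents containing each word
    pair_counts = Counter()  # number of documents containing both words

    for doc in documents:
        # Sorting the per-document vocabulary gives each pair one canonical order.
        vocab = sorted(set(doc.lower().split()))
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))

    table = {}
    for (w1, w2), joint in pair_counts.items():
        p_joint = joint / n_docs
        p1 = word_counts[w1] / n_docs
        p2 = word_counts[w2] / n_docs
        table[(w1, w2)] = math.log2(p_joint / (p1 * p2))
    return table


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "the doctor saw the nurse",
    "the nurse helped the doctor",
    "the dog chased the cat",
    "the cat ignored the dog",
]
scores = pmi_table(toy_corpus)
```

In this toy corpus, "doctor" and "nurse" always appear together and receive a positive score, while a word such as "the" that occurs in every document scores zero with any partner because its presence carries no information. Pairs that never co-occur are simply left out of the table in this sketch, since their PMI would be negative infinity.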

Year:  2009        PMID: 19587174     DOI: 10.3758/BRM.41.3.647

Source DB:  PubMed          Journal:  Behav Res Methods        ISSN: 1554-351X


  14 in total

1.  Effects of verbal event structure on online thematic role assignment.

Authors:  Evie Malaia; Ronnie B Wilbur; Christine Weber-Fox
Journal:  J Psycholinguist Res       Date:  2012-10

2.  Evaluating the random representation assumption of lexical semantics in cognitive models.

Authors:  Brendan T Johns; Michael N Jones
Journal:  Psychon Bull Rev       Date:  2010-10

3.  Categorical and associative relations increase false memory relative to purely associative relations.

Authors:  Jennifer H Coane; Dawn M McBride; Miia-Liisa Termonen; J Cooper Cutting
Journal:  Mem Cognit       Date:  2016-01

4.  Effects of word frequency, contextual diversity, and semantic distinctiveness on spoken word recognition.

Authors:  Brendan T Johns; Thomas M Gruenenfelder; David B Pisoni; Michael N Jones
Journal:  J Acoust Soc Am       Date:  2012-08       Impact factor: 1.840

5.  Neural bases of syntax-semantics interface processing.

Authors:  Evguenia Malaia; Sharlene Newman
Journal:  Cogn Neurodyn       Date:  2015-01-13       Impact factor: 5.082

6.  Using experiential optimization to build lexical representations. (Review)

Authors:  Brendan T Johns; Michael N Jones; D J K Mewhort
Journal:  Psychon Bull Rev       Date:  2019-02

7.  The role of partial knowledge in statistical word learning. (Review)

Authors:  Daniel Yurovsky; Damian C Fricker; Chen Yu; Linda B Smith
Journal:  Psychon Bull Rev       Date:  2014-02

8.  Encoding sequential information in semantic space models: comparing holographic reduced representation and random permutation.

Authors:  Gabriel Recchia; Magnus Sahlgren; Pentti Kanerva; Michael N Jones
Journal:  Comput Intell Neurosci       Date:  2015-04-07

9.  A Complex Network Approach to Distributional Semantic Models.

Authors:  Akira Utsumi
Journal:  PLoS One       Date:  2015-08-21       Impact factor: 3.240

10.  The semantic richness of abstract concepts.

Authors:  Gabriel Recchia; Michael N Jones
Journal:  Front Hum Neurosci       Date:  2012-11-27       Impact factor: 3.169
