Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD.

John A Bullinaria, Joseph P Levy.

Abstract

In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating the use of three further factors--namely, the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD)--that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.
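The pipeline summarized in the abstract (co-occurrence counts from a very small window, pointwise mutual information weighting, cosine comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names, the window of one word on each side, and the choice to keep only positive PMI values are all assumptions made for illustration, and the stop-list, stemming, and SVD steps studied in the article are not covered.

```python
import math
from collections import Counter

def cooccurrence_counts(tokens, window=1):
    """Count (target, context) pairs within `window` positions on either side."""
    counts = Counter()
    for pos, target in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                counts[(target, tokens[ctx])] += 1
    return counts

def ppmi_vectors(counts):
    """Map each target word to a sparse vector {context: positive PMI}."""
    total = sum(counts.values())
    target_totals = Counter()
    context_totals = Counter()
    for (target, context), n in counts.items():
        target_totals[target] += n
        context_totals[context] += n
    vectors = {}
    for (target, context), n in counts.items():
        # PMI = log2( p(target, context) / (p(target) * p(context)) )
        pmi = math.log2(n * total / (target_totals[target] * context_totals[context]))
        if pmi > 0:  # assumption of this sketch: discard non-positive values
            vectors.setdefault(target, {})[context] = pmi
    return vectors

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy corpus; real studies use corpora of many millions of words.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vectors = ppmi_vectors(cooccurrence_counts(tokens, window=1))
cat_dog = cosine_similarity(vectors["cat"], vectors["dog"])
cat_mat = cosine_similarity(vectors["cat"], vectors["mat"])
```

In this toy corpus "cat" and "dog" occur in identical contexts, so their vectors are more similar to each other than "cat" is to "mat". Dimensionality reduction would then be applied to the full matrix of such vectors, for example with a truncated SVD from a numerical library.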

Year:  2012        PMID: 22258891     DOI: 10.3758/s13428-011-0183-8

Source DB:  PubMed          Journal:  Behav Res Methods        ISSN: 1554-351X


  12 in total

1.  Using experiential optimization to build lexical representations. (Review)

Authors:  Brendan T Johns; Michael N Jones; D J K Mewhort
Journal:  Psychon Bull Rev       Date:  2019-02

2.  Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora.

Authors:  William L Hamilton; Kevin Clark; Jure Leskovec; Dan Jurafsky
Journal:  Proc Conf Empir Methods Nat Lang Process       Date:  2016-11

3.  Sensorimotor distance: A grounded measure of semantic similarity for 800 million concept pairs.

Authors:  Cai Wingfield; Louise Connell
Journal:  Behav Res Methods       Date:  2022-09-21

4.  A Complex Network Approach to Distributional Semantic Models.

Authors:  Akira Utsumi
Journal:  PLoS One       Date:  2015-08-21       Impact factor: 3.240

5.  Using a high-dimensional graph of semantic space to model relationships among words.

Authors:  Alice F Jackson; Donald J Bolger
Journal:  Front Psychol       Date:  2014-05-12

6.  Predicting Lexical Priming Effects from Distributional Semantic Similarities: A Replication with Extension.

Authors:  Fritz Günther; Carolin Dudschig; Barbara Kaup
Journal:  Front Psychol       Date:  2016-10-24

7.  Understanding Karma Police: The Perceived Plausibility of Noun Compounds as Predicted by Distributional Models of Semantic Representation.

Authors:  Fritz Günther; Marco Marelli
Journal:  PLoS One       Date:  2016-10-12       Impact factor: 3.240

8.  Avoiding Conflict: When Speaker Coordination Does Not Require Conceptual Agreement.

Authors:  Alexandre Kabbach; Aurélie Herbelot
Journal:  Front Artif Intell       Date:  2021-01-27

9.  Limiting factors for mapping corpus-based semantic representations to brain activity.

Authors:  John A Bullinaria; Joseph P Levy
Journal:  PLoS One       Date:  2013-03-19       Impact factor: 3.240

10.  What is semantic diversity and why does it facilitate visual word recognition?

Authors:  Benedetta Cevoli; Chris Watkins; Kathleen Rastle
Journal:  Behav Res Methods       Date:  2021-02