
Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD.

Abstract

In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating three further factors that have been used to provide improved performance elsewhere: the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD). It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (which provide new state-of-the-art performance on a standard TOEFL task) and to the identification and discussion of problems and misleading results that can arise without a full systematic study.
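The baseline pipeline summarized in the abstract (co-occurrence counts from a small window, pointwise mutual information weighting, cosine comparison, optional stop-list) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the default window size of 1, the choice to remove stop-list words before windowing, and the decision to keep only positive PMI values are all assumptions made for illustration. The SVD step is omitted because it needs a linear algebra library.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, window=1, stop_words=frozenset()):
    """Count context words within +/- `window` positions of each target.

    Assumption: stop-list words are dropped before windowing, so the
    window spans the remaining words only.
    """
    tokens = [t for t in tokens if t not in stop_words]
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[target][tokens[j]] += 1
    return counts


def ppmi_vectors(counts):
    """Turn raw counts into sparse vectors of positive PMI values.

    PMI(t, c) = log2( p(t, c) / (p(t) * p(c)) ).
    Assumption: non-positive values are discarded (positive PMI).
    """
    total = sum(sum(ctx.values()) for ctx in counts.values())
    target_totals = {t: sum(ctx.values()) for t, ctx in counts.items()}
    context_totals = Counter()
    for ctx in counts.values():
        context_totals.update(ctx)

    vectors = {}
    for t, ctx in counts.items():
        vec = {}
        for c, n in ctx.items():
            pmi = math.log2(n * total / (target_totals[t] * context_totals[c]))
            if pmi > 0:
                vec[c] = pmi
        vectors[t] = vec
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy usage on a nine-word "corpus".
toy_tokens = "cat eats fish dog eats meat car needs fuel".split()
toy_vectors = ppmi_vectors(cooccurrence_counts(toy_tokens, window=1))
cat_dog = cosine(toy_vectors["cat"], toy_vectors["dog"])
cat_fuel = cosine(toy_vectors["cat"], toy_vectors["fuel"])
```

In this toy corpus, `cat` and `dog` share the context word `eats`, so `cat_dog` is larger than `cat_fuel`. The function returns a cosine similarity; a distance can be taken as one minus that value.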

Year: 2012 PMID: 22258891 DOI: 10.3758/s13428-011-0183-8

Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X

Cited by (12 in total)

1. Using experiential optimization to build lexical representations. (Review)

2. Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora.

3. Sensorimotor distance: A grounded measure of semantic similarity for 800 million concept pairs.

4. A Complex Network Approach to Distributional Semantic Models.

5. Using a high-dimensional graph of semantic space to model relationships among words.

6. Predicting Lexical Priming Effects from Distributional Semantic Similarities: A Replication with Extension.

7. Understanding Karma Police: The Perceived Plausibility of Noun Compounds as Predicted by Distributional Models of Semantic Representation.

8. Avoiding Conflict: When Speaker Coordination Does Not Require Conceptual Agreement.

9. Limiting factors for mapping corpus-based semantic representations to brain activity.

10. What is semantic diversity and why does it facilitate visual word recognition?