
Scale-Dependent Relationships in Natural Language.

Aakash Sarkar, Marc W Howard.

Abstract

Language, like other natural sequences, exhibits statistical dependencies at a wide range of scales (Lin & Tegmark, 2016). However, many statistical learning models applied to language impose a sampling scale while extracting statistical structure. For instance, Word2Vec creates vector embeddings by sampling context in a window around each word, the size of which defines a strong scale; relationships over much larger temporal scales would be invisible to the algorithm. This paper examines the family of Word2Vec embeddings generated while systematically manipulating the size of the context window. The primary result is that different linguistic relationships are preferentially encoded at different scales. Different scales emphasize different syntactic and semantic relations between words, as assessed both by analogical reasoning tasks in the Google Analogies test set and human similarity rating datasets WordSim-353 and SimLex-999. Moreover, the neighborhoods of a given word in the embeddings change considerably depending on the scale. These results suggest that sampling at any individual scale can only identify a subset of the meaningful relationships a word might have, and point toward the importance of developing scale-free models of semantic meaning.
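The role of the context window as a sampling scale can be illustrated with a minimal sketch. This is not the authors' code or the Word2Vec implementation: the function names `context_pairs` and `cooccurrence_by_scale` and the toy sentence are hypothetical, and the window here is fixed and symmetric, whereas real Word2Vec implementations also randomly shrink the window during training. The sketch only shows which word pairs become visible at each window size.

```python
from collections import Counter


def context_pairs(tokens, window):
    """Yield (center, context) pairs within a symmetric window of `window` words."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                yield center, tokens[j]


def cooccurrence_by_scale(tokens, windows):
    """Count the sampled pairs separately for each window size (scale)."""
    return {w: Counter(context_pairs(tokens, w)) for w in windows}


# Toy sentence (illustrative only): "cat" and "mat" are four positions apart.
tokens = "the cat sat on the mat while the dog slept".split()
counts = cooccurrence_by_scale(tokens, windows=(1, 4))

print(("cat", "mat") in counts[1])  # False: pair lies outside a window of 1
print(("cat", "mat") in counts[4])  # True: pair falls inside a window of 4
```

Any relationship between words farther apart than the chosen window never enters the training pairs, which is the sense in which a single window size restricts what an embedding can encode.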


Year: 2021   PMID: 34337323   PMCID: PMC8317965   DOI: 10.1007/s42113-020-00094-8

Source DB: PubMed   Journal: Comput Brain Behav   ISSN: 2522-0861



