
Emergent linguistic structure in artificial neural networks trained by self-supervision.

Christopher D Manning, Kevin Clark, John Hewitt, Urvashi Khandelwal, Omer Levy.

Abstract

This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
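The abstract's central finding is that a linear transformation B of a model's contextual embeddings encodes parse-tree distance: the squared L2 distance ||B(h_i − h_j)||² between two words' vectors approximates the number of edges between them in the parse tree. The sketch below is an illustrative toy, not the authors' code: in the paper B is learned from treebank supervision over real model embeddings, whereas here we construct an idealized case (a hypothetical 6-word sentence, made-up tree, random mixing matrix) where the relationship holds exactly, then recover the tree from the probe's distances.

```python
# Toy sketch of the "structural probe" geometry: squared distances under a
# linear map B recover parse-tree distances. All names and data are
# hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 6-word sentence whose parse tree has these undirected edges.
edges = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5)]
n, dim = 6, 16

adj = {i: [] for i in range(n)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

# Gold tree-distance matrix via breadth-first search from every node.
D = np.zeros((n, n))
for s in range(n):
    seen, frontier, d = {s}, [s], 0
    while frontier:
        d += 1
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    D[s, v] = d
                    nxt.append(v)
        frontier = nxt

# Structure vectors: give each tree edge its own orthonormal axis and let
# word i be the sum of the edge axes on its path from word 0. Then
# ||x_i - x_j||^2 counts the edges between i and j, i.e. tree distance.
edge_index = {frozenset(e): k for k, e in enumerate(edges)}
X = np.zeros((n, len(edges)))
seen, frontier = {0}, [0]
while frontier:
    nxt = []
    for u in frontier:
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                X[v] = X[u]
                X[v, edge_index[frozenset((u, v))]] += 1.0
                nxt.append(v)
    frontier = nxt

# Toy "contextual embeddings": mix the structure into a higher-dim space.
M = rng.normal(size=(len(edges), dim))
H = X @ M

# The probe: here B simply undoes the mixing (in the paper, B is learned).
B = np.linalg.pinv(M.T)

diffs = H[:, None, :] - H[None, :, :]
pred = ((diffs @ B.T) ** 2).sum(-1)      # ||B(h_i - h_j)||^2 for all pairs

# Reconstruct the parse tree: in a tree, the pairs at distance exactly 1
# are precisely the edges.
recovered = {frozenset((i, j)) for i in range(n) for j in range(i + 1, n)
             if np.isclose(pred[i, j], 1.0)}
print(np.allclose(pred, D), recovered == {frozenset(e) for e in edges})
```

In the paper's actual setting the fit is only approximate, so the tree is reconstructed as a minimum spanning tree over the predicted distances rather than by thresholding at 1; the toy above just makes the underlying geometry visible.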

Keywords:  artificial neural network; learning; self-supervision; syntax

Year:  2020        PMID: 32493748      PMCID: PMC7720155          DOI: 10.1073/pnas.1907367117

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


References:  5 in total

Review 1.  Perception viewed as an inverse problem.

Authors:  Z Pizlo
Journal:  Vision Res       Date:  2001-11       Impact factor: 1.886

2.  Broken agreement.

Authors:  K Bock; C A Miller
Journal:  Cogn Psychol       Date:  1991-01       Impact factor: 3.468

3.  Rethinking language: how probabilities shape the words we use.

Authors:  Thomas L Griffiths
Journal:  Proc Natl Acad Sci U S A       Date:  2011-02-23       Impact factor: 11.205

Review 4.  Poverty of the stimulus revisited.

Authors:  Robert C Berwick; Paul Pietroski; Beracah Yankama; Noam Chomsky
Journal:  Cogn Sci       Date:  2011-08-08

Review 5.  Early language acquisition: cracking the speech code.

Authors:  Patricia K Kuhl
Journal:  Nat Rev Neurosci       Date:  2004-11       Impact factor: 34.870

Cited by:  17 in total

1.  The science of deep learning.

Authors:  Richard Baraniuk; David Donoho; Matan Gavish
Journal:  Proc Natl Acad Sci U S A       Date:  2020-11-23       Impact factor: 11.205

2.  A hierarchy of linguistic predictions during natural language comprehension.

Authors:  Micha Heilbron; Kristijan Armeni; Jan-Mathijs Schoffelen; Peter Hagoort; Floris P de Lange
Journal:  Proc Natl Acad Sci U S A       Date:  2022-08-03       Impact factor: 12.779

3.  A weighted constraint satisfaction approach to human goal-directed decision making.

Authors:  Yuxuan Li; James L McClelland
Journal:  PLoS Comput Biol       Date:  2022-06-16       Impact factor: 4.779

4.  Construction of English Translation Model Based on Neural Network Fuzzy Semantic Optimal Control.

Authors:  Bingjie Zhang; Yiming Liu
Journal:  Comput Intell Neurosci       Date:  2022-05-02

5.  Compositional Processing Emerges in Neural Networks Solving Math Problems.

Authors:  Jacob Russin; Roland Fernandez; Hamid Palangi; Eric Rosen; Nebojsa Jojic; Paul Smolensky; Jianfeng Gao
Journal:  Cogsci       Date:  2021-07

6.  Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models.

Authors:  James L McClelland; Felix Hill; Maja Rudolph; Jason Baldridge; Hinrich Schütze
Journal:  Proc Natl Acad Sci U S A       Date:  2020-09-28       Impact factor: 11.205

7.  Emerging Grounded Shared Vocabularies Between Human and Machine, Inspired by Human Language Evolution.

Authors:  Tom Kouwenhoven; Tessa Verhoef; Roy de Kleijn; Stephan Raaijmakers
Journal:  Front Artif Intell       Date:  2022-04-26

8.  Communicating artificial neural networks develop efficient color-naming systems.

Authors:  Rahma Chaabouni; Eugene Kharitonov; Emmanuel Dupoux; Marco Baroni
Journal:  Proc Natl Acad Sci U S A       Date:  2021-03-23       Impact factor: 12.779

9.  Developing and testing an automated qualitative assistant (AQUA) to support qualitative analysis.

Authors:  Robert P Lennon; Robbie Fraleigh; Lauren J Van Scoy; Aparna Keshaviah; Xindi C Hu; Bethany L Snyder; Erin L Miller; William A Calo; Aleksandra E Zgierska; Christopher Griffin
Journal:  Fam Med Community Health       Date:  2021-11

10.  Measuring context dependency in birdsong using artificial neural networks.

Authors:  Takashi Morita; Hiroki Koda; Kazuo Okanoya; Ryosuke O Tachibana
Journal:  PLoS Comput Biol       Date:  2021-12-28       Impact factor: 4.475

