
Unsupervised learning of natural languages.

Zach Solan, David Horn, Eytan Ruppin, Shimon Edelman.

Abstract

We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, or sheet music), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.


Year:  2005        PMID: 16087885      PMCID: PMC1187953          DOI: 10.1073/pnas.0409746102

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


