
Lexibank, a public repository of standardized wordlists with computed phonological and lexical features.

Johann-Mattis List, Robert Forkel, Simon J Greenhill, Christoph Rzymski, Johannes Englisch, Russell D Gray.

Abstract

The past decades have seen substantial growth in digital data on the world's languages. At the same time, the demand for cross-linguistic datasets has been increasing, as witnessed by numerous studies devoted to diverse questions on human prehistory, cultural evolution, and human cognition. Unfortunately, most published datasets lack standardization, which makes their comparison difficult. Here, we present a new approach to increase the comparability of cross-linguistic lexical data. We have designed workflows for the computer-assisted lifting of datasets to Cross-Linguistic Data Formats, a collection of standards that make these datasets more Findable, Accessible, Interoperable, and Reusable (FAIR). We test the Lexibank workflow on 100 lexical datasets from which we derive an aggregated database of wordlists in unified phonetic transcriptions covering more than 2000 language varieties. We illustrate the benefits of our approach by showing how phonological and lexical features can be automatically inferred, complementing and expanding existing cross-linguistic datasets.
© 2022. The Author(s).


Year:  2022        PMID: 35710909     DOI: 10.1038/s41597-022-01432-0

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


  10 in total

1.  Evolved structure of language shows lineage-specific trends in word-order universals.

Authors:  Michael Dunn; Simon J Greenhill; Stephen C Levinson; Russell D Gray
Journal:  Nature       Date:  2011-04-13       Impact factor: 49.962

2.  Language phylogenies reveal expansion pulses and pauses in Pacific settlement.

Authors:  R D Gray; A J Drummond; S J Greenhill
Journal:  Science       Date:  2009-01-23       Impact factor: 47.728

3.  Climate, vocal folds, and tonal languages: Connecting the physiological and geographic dots.

Authors:  Caleb Everett; Damián E Blasi; Seán G Roberts
Journal:  Proc Natl Acad Sci U S A       Date:  2015-01-20       Impact factor: 11.205

4.  Differential coding of perception in the world's languages.

Authors:  Asifa Majid; Seán G Roberts; Ludy Cilissen; Karen Emmorey; Brenda Nicodemus; Lucinda O'Grady; Bencie Woll; Barbara LeLan; Hilário de Sousa; Brian L Cansler; Shakila Shayan; Connie de Vos; Gunter Senft; N J Enfield; Rogayah A Razak; Sebastian Fedden; Sylvia Tufvesson; Mark Dingemanse; Ozge Ozturk; Penelope Brown; Clair Hill; Olivier Le Guen; Vincent Hirtzel; Rik van Gijn; Mark A Sicoli; Stephen C Levinson
Journal:  Proc Natl Acad Sci U S A       Date:  2018-11-06       Impact factor: 11.205

5.  Emotion semantics show both cultural variation and universal structure.

Authors:  Joseph Watts; Teague R Henry; Joshua Conrad Jackson; Johann-Mattis List; Robert Forkel; Peter J Mucha; Simon J Greenhill; Russell D Gray; Kristen A Lindquist
Journal:  Science       Date:  2019-12-20       Impact factor: 47.728

6.  Cultural influences on word meanings revealed through large-scale semantic alignment.

Authors:  Bill Thompson; Seán G Roberts; Gary Lupyan
Journal:  Nat Hum Behav       Date:  2020-08-10

7.  Wine experts' recognition of wine odors is not verbally mediated.

Authors:  Ilja Croijmans; Artin Arshamian; Laura J Speed; Asifa Majid
Journal:  J Exp Psychol Gen       Date:  2020-10-01

8.  Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms.

Authors:  Christian Bentz; Annemarie Verkerk; Douwe Kiela; Felix Hill; Paula Buttery
Journal:  PLoS One       Date:  2015-06-17       Impact factor: 3.240

9.  NorthEuraLex: a wide-coverage lexical database of Northern Eurasia.

Authors:  Johannes Dellert; Thora Daneyko; Alla Münch; Alina Ladygina; Armin Buch; Natalie Clarius; Ilja Grigorjew; Mohamed Balabel; Hizniye Isabella Boga; Zalina Baysarova; Roland Mühlenbernd; Johannes Wahle; Gerhard Jäger
Journal:  Lang Resour Eval       Date:  2019-11-30       Impact factor: 1.358

