The Hebrew CHILDES corpus: transcription and morphological analysis.

Aviad Albert, Brian MacWhinney, Bracha Nir, Shuly Wintner.

Abstract

We present a corpus of transcribed spoken Hebrew that reflects spoken interactions between children and adults. The corpus is an integral part of the CHILDES database, which distributes similar corpora for over 25 languages. We introduce a dedicated transcription scheme for the spoken Hebrew data that is sensitive to both the phonology and the standard orthography of the language. We also introduce a morphological analyzer that was specifically developed for this corpus. The analyzer adequately covers the entire corpus, producing detailed correct analyses for all tokens. Evaluation on a new corpus reveals high coverage as well. Finally, we describe a morphological disambiguation module that selects the correct analysis of each token in context. The result is a high-quality morphologically-annotated CHILDES corpus of Hebrew, along with a set of tools that can be applied to new corpora.

Keywords: CHILDES; Hebrew; Morphological analysis; Morphological disambiguation; Transcription of spoken language

Year: 2013; PMID: 25419199; PMCID: PMC4240028; DOI: 10.1007/s10579-012-9214-z

Source DB: PubMed; Journal: Lang Resour Eval; ISSN: 1574-020X; Impact factor: 1.358

References

8 in total

1. An empirical generative framework for computational modeling of language acquisition.

Authors: Heidi R Waterfall; Ben Sandbank; Luca Onnis; Shimon Edelman
Journal: J Child Lang; Date: 2010-06

2. Explaining quantitative variation in the rate of Optional Infinitive errors across languages: a comparison of MOSAIC and the Variational Learning Model.

Authors: Daniel Freudenthal; Julian Pine; Fernand Gobet
Journal: J Child Lang; Date: 2010-03-25

3. Morphosyntactic annotation of CHILDES transcripts.

4. Types of linguistic knowledge: interpreting and producing compound nouns.

5. Language development and language knowledge: evidence from the acquisition of Hebrew morphophonology.

6. Automatic parsing of parental verbal input.

7. Children's grammars grow more abstract with age--evidence from an automatic procedure for identifying the productive units of language.

Authors: Gideon Borensztajn; Willem Zuidema; Rens Bod
Journal: Top Cogn Sci; Date: 2009-01

8. Modeling children's early grammatical knowledge.

Authors: Colin Bannard; Elena Lieven; Michael Tomasello
Journal: Proc Natl Acad Sci U S A; Date: 2009-10-05; Impact factor: 11.205