
Automatic parsing of parental verbal input.

Kenji Sagae, Brian MacWhinney, Alon Lavie.

Abstract

To evaluate theoretical proposals regarding the course of child language acquisition, researchers often need to rely on the processing of large numbers of syntactically parsed utterances, both from children and from their parents. Because it is so difficult to do this by hand, there are currently no parsed corpora of child language input data. To automate this process, we developed a system that combined the MOR tagger, a rule-based parser, and statistical disambiguation techniques. The resultant system obtained nearly 80% correct parses for the sentences spoken to children. To achieve this level, we had to construct a particular processing sequence that minimizes problems caused by the coverage/ambiguity tradeoff in parser design. These procedures are particularly appropriate for use with the CHILDES database, an international corpus of transcripts. The data and programs are now freely available over the Internet.


Year:  2004        PMID: 15190707      PMCID: PMC1880885          DOI: 10.3758/bf03195557

Source DB:  PubMed          Journal:  Behav Res Methods Instrum Comput        ISSN: 0743-3808


  1 in total

1.  Automatic disambiguation of morphosyntax in spoken language corpora.

Authors:  C Parisse; M T Le Normand
Journal:  Behav Res Methods Instrum Comput       Date:  2000-08
  2 in total

1.  A multiple process solution to the logical problem of language acquisition.

Authors:  Brian MacWhinney
Journal:  J Child Lang       Date:  2004-11

2.  The Hebrew CHILDES corpus: transcription and morphological analysis.

Authors:  Aviad Albert; Brian MacWhinney; Bracha Nir; Shuly Wintner
Journal:  Lang Resour Eval       Date:  2013-12-01       Impact factor: 1.358

