Sheli Kol, Bracha Nir, Shuly Wintner.
Abstract
Several models of language acquisition have emerged in recent years that rely on computational algorithms for simulation and evaluation. Computational models are formal and precise, and can thus provide mathematically well-motivated insights into the process of language acquisition. Such models are amenable to robust computational evaluation, using technology that was developed for Information Retrieval and Computational Linguistics. In this article we advocate the use of such technology for the evaluation of formal models of language acquisition. We focus on the Traceback Method, proposed in several recent studies as a model of early language acquisition, explaining some of the phenomena associated with children's ability to generalize previously heard utterances and generate novel ones. We present a rigorous computational evaluation that reveals some flaws in the method, and suggest directions for improving it.
Year: 2013 PMID: 23343571 PMCID: PMC3866979 DOI: 10.1017/S0305000912000694
Source DB: PubMed Journal: J Child Lang ISSN: 0305-0009
Size of the corpora
| Corpus | Main corpus: utterances | Main corpus: word tokens | Test corpus: utterances | Test corpus: word tokens |
|---|---|---|---|---|
| Eve | 19,536 | 85,350 | 224 | 875 |
| Adam | 20,443 | 75,213 | 792 | 3,166 |
| Sarah | 6,425 | 23,330 | 106 | 252 |
| Nina | 38,736 | 175,748 | 458 | 1,632 |
| Thomas-A | 25,776 | 132,836 | 357 | 1,269 |
| Thomas-B | 25,110 | 131,652 | 326 | 2,192 |
Re-implementation results: for each corpus, the number and ratio of successfully derived utterances; of those, the number and ratio of utterances derived using exact matches (‘Fixed’) or using either of the two operations (‘Superimpose’ and ‘Juxtapose’); and, of the utterances derived by some operation, the number and ratio derived using exactly one operation (‘1 OP’) or exactly two operations (‘2 OP’)
| Corpus | Test | Derived # | Derived % | Fixed # | Fixed % | Superimpose # | Superimpose % | Juxtapose # | Juxtapose % | 1 OP # | 1 OP % | 2 OP # | 2 OP % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Eve | 224 | 155 | 69 | 37 | 24 | 87 | 56 | 32 | 21 | 95 | 81 | 11 | 9 |
| Adam | 792 | 675 | 85 | 183 | 27 | 312 | 46 | 185 | 27 | 362 | 74 | 70 | 14 |
| Sarah | 106 | 94 | 89 | 40 | 43 | 45 | 48 | 9 | 10 | 54 | 100 | 0 | 0 |
| Nina | 458 | 401 | 88 | 119 | 30 | 217 | 54 | 66 | 17 | 230 | 82 | 27 | 10 |
| Thomas-A | 357 | 246 | 69 | 106 | 43 | 136 | 55 | 8 | 3 | 127 | 91 | 13 | 9 |
| Thomas-B | 436 | 260 | 60 | 101 | 39 | 150 | 58 | 12 | 5 | 143 | 90 | 15 | 9 |
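
The percentages in the table above appear to follow a simple pattern: the Derived ratio is taken over the test set, the Fixed, Superimpose and Juxtapose ratios over the derived utterances, and the 1 OP and 2 OP ratios over the utterances derived by some operation (derived minus fixed). The following minimal Python sketch checks this reading against the Eve row. The denominators are an assumption inferred from the numbers, not something stated in the caption, and the helper `pct` is a hypothetical name introduced only for illustration.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Counts copied from the Eve row of the table above.
test, derived, fixed = 224, 155, 37
superimpose, juxtapose = 87, 32
one_op, two_op = 95, 11

# Assumption: utterances derived "by some operation" are the derived
# utterances that were not exact (fixed) matches.
by_operation = derived - fixed

ratios = {
    "Derived": pct(derived, test),
    "Fixed": pct(fixed, derived),
    "Superimpose": pct(superimpose, derived),
    "Juxtapose": pct(juxtapose, derived),
    "1 OP": pct(one_op, by_operation),
    "2 OP": pct(two_op, by_operation),
}
print(ratios)
```

Under these assumptions the computed values match the Eve row (69, 24, 56, 21, 81, 9). The same check can be repeated for the other rows by substituting their counts.
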
Results for training data as CDS or CS only
| Corpus | Size | Derived # | Derived % | Fixed # | Fixed % | Superimpose # | Superimpose % | Juxtapose # | Juxtapose % |
|---|---|---|---|---|---|---|---|---|---|
| Eve-CDS | 224 | 155 | 69·2 | 37 | 23·9 | 87 | 56·1 | 32 | 20·6 |
| Eve-CS | | 155 | 65·6 | 33 | 22·5 | 71 | 48·2 | 44 | 29·9 |
| Adam-CDS | 792 | 632 | 79·8 | 131 | 20·7 | 277 | 43·8 | 234 | 37·1 |
| Adam-CS | | 649 | 81·9 | 160 | 24·7 | 294 | 45·3 | 202 | 31·1 |
| Sarah-CDS | 106 | 91 | 85·9 | 28 | 30·8 | 51 | 56·1 | 12 | 13·2 |
| Sarah-CS | | 79 | 74·5 | 32 | 40·5 | 33 | 41·8 | 14 | 17·7 |
| Nina-CDS | 458 | 408 | 89·1 | 103 | 25·2 | 227 | 55·6 | 79 | 19·4 |
| Nina-CS | | 395 | 86·2 | 119 | 30·1 | 196 | 49·6 | 84 | 21·3 |
| Thomas-A-CDS | 357 | 202 | 56·6 | 44 | 21·8 | 151 | 74·8 | 19 | 9·4 |
| Thomas-A-CS | | 246 | 68·9 | 106 | 43·1 | 136 | 55·3 | 8 | 3·3 |
| Thomas-B-CDS | 436 | 233 | 53·4 | 78 | 33·5 | 143 | 61·4 | 14 | 6·0 |
| Thomas-B-CS | | 242 | 55·5 | 103 | 42·6 | 131 | 54·1 | 11 | 4·6 |
Evaluation of the TBM
| Corpus | Size | Derived # | Derived % | Fixed # | Fixed % | Superimpose # | Superimpose % | Juxtapose # | Juxtapose % | 1 OP # | 1 OP % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Eve | 224 | 155 | 69 | 37 | 24 | 87 | 56 | 32 | 21 | 95 | 74 |
| (reversed) | | 153 | 68 | 5 | 3 | 64 | 42 | 84 | 55 | 93 | 63 |
| (random) | | 146 | 65 | 29 | 20 | 68 | 47 | 49 | 34 | 78 | 67 |
| Adam | 792 | 675 | 85 | 183 | 27 | 312 | 46 | 185 | 27 | 362 | 83 |
| (reversed) | | 659 | 83 | 43 | 7 | 226 | 34 | 403 | 61 | 335 | 54 |
| (random) | | 657 | 83 | 138 | 21 | 271 | 41 | 260 | 40 | 313 | 60 |
| Sarah | 106 | 94 | 89 | 40 | 43 | 45 | 48 | 9 | 10 | 54 | 100 |
| (reversed) | | 94 | 89 | 8 | 9 | 63 | 67 | 23 | 25 | 82 | 95 |
| (random) | | 91 | 86 | 8 | 9 | 73 | 80 | 12 | 213 | 80 | 98 |
| Nina | 458 | 401 | 88 | 119 | 30 | 217 | 54 | 66 | 17 | 230 | 81 |
| (reversed) | | 389 | 85 | 22 | 6 | 143 | 37 | 227 | 58 | 241 | 63 |
| (random) | | 400 | 87 | 71 | 18 | 132 | 33 | 198 | 50 | 248 | 75 |
| Thomas-A | 357 | 246 | 69 | 106 | 43 | 136 | 55 | 8 | 3 | 127 | 90 |
| (reversed) | | 204 | 57 | 48 | 24 | 150 | 74 | 15 | 7 | 123 | 79 |
| (random) | | 212 | 59 | 81 | 38 | 131 | 62 | 3 | 1 | 114 | 5 |
| Thomas-B | 436 | 260 | 60 | 101 | 39 | 150 | 58 | 12 | 5 | 143 | 90 |
| (reversed) | | 213 | 49 | 31 | 15 | 177 | 83 | 11 | 5 | 136 | 43 |
| (random) | | 219 | 50 | 88 | 40 | 129 | 59 | 6 | 3 | 106 | 79 |