Xiangtao Du, Muhammad Afzaal, Hind Al Fadda.
Abstract
The investigation of learners' interlanguage could contribute greatly to the teaching of English as a foreign language and to the development of teaching materials. The present study investigates the collocational profiles of large-scale written production by English learners with varied L1 backgrounds and different proficiency levels. With the British National Corpus as the reference corpus, learners' collocations were extracted using corpus query language and then identified by t-score using the Python programming language. The resulting collocation list consists of 2,501 make/take + noun (direct object) collocations. Findings show that more proficient learners tend to use collocations containing semantically more complex and abstract noun elements for varied communication tasks. Moreover, advanced learners are inclined to use collocations composed of more difficult and longer noun elements.
Keywords: EFCAMDAT; collocations; corpus analysis; foreign language writing; lexical developmental patterns
Year: 2022 PMID: 35237205 PMCID: PMC8884357 DOI: 10.3389/fpsyg.2022.752134
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
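The abstract states that collocations were identified by t-score using Python. The study's own script is not part of this record, so the following is only a minimal sketch of the standard t-score association measure, t = (O - E) / sqrt(O), where the expected frequency E is estimated as f(verb) × f(noun) / N. The function name `t_score`, its parameters, and the counts in the example are all hypothetical, and the cut-off the authors applied is not given here.

```python
import math


def t_score(pair_freq, verb_freq, noun_freq, corpus_size):
    """Standard collocation t-score: (O - E) / sqrt(O).

    O is the observed frequency of the verb + noun pair and
    E = f(verb) * f(noun) / N is the frequency expected by chance.
    """
    if pair_freq <= 0:
        raise ValueError("pair_freq must be positive")
    expected = verb_freq * noun_freq / corpus_size
    return (pair_freq - expected) / math.sqrt(pair_freq)


# Hypothetical counts for illustration only (not taken from the BNC or the study).
score = t_score(pair_freq=100, verb_freq=2_000, noun_freq=500, corpus_size=1_000_000)
print(round(score, 2))
```

A pair would then be kept as a collocation when its score reaches whatever threshold the analysis adopts.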
Summary of randomly extracted learner data.
| CEFR level | A1 | A2 | B1 | B2 | C1, C2 | In total |
|---|---|---|---|---|---|---|
| Number of scripts | 3,600 | 3,600 | 3,600 | 3,600 | 3,600 | 18,000 |
| Number of tokens | 129,058 | 198,020 | 259,693 | 490,921 | 457,678 | 1,535,370 |
Summary of make/take + noun patterns identified from learner data.
| Category | A1 | A2 | B1 | B2 | C | In total |
|---|---|---|---|---|---|---|
| Combinations | 134 | 927 | 719 | 1,083 | 1,208 | 4,071 |
| | 25 | 50 | 81 | 116 | 119 | 391 |
| Collocations | 68 | 493 | 471 | 723 | 746 | 2,501 |
| Retention rate | 50.75% | 53.18% | 65.51% | 66.76% | 61.75% | 61.43% |
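The retention rates in the table appear to be the share of combinations that survive as collocations at each level. That reading is inferred from the figures rather than stated in this record; the short check below reproduces the reported percentages under that assumption. The helper name `retention_rate` is illustrative.

```python
# Counts copied from the table above.
combinations = {"A1": 134, "A2": 927, "B1": 719, "B2": 1083, "C": 1208}
collocations = {"A1": 68, "A2": 493, "B1": 471, "B2": 723, "C": 746}


def retention_rate(kept, total):
    """Percentage of combinations retained as collocations, to two decimals."""
    return round(100 * kept / total, 2)


rates = {level: retention_rate(collocations[level], combinations[level])
         for level in combinations}
overall = retention_rate(sum(collocations.values()), sum(combinations.values()))

for level, rate in rates.items():
    print(f"{level}: {rate}%")
print(f"In total: {overall}%")
```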
Summary of USAS tag set.
| Tag | Semantic field |
|---|---|
| A | GENERAL & ABSTRACT TERMS |
| B | THE BODY & THE INDIVIDUAL |
| C | ARTS & CRAFTS |
| E | EMOTIONAL ACTIONS, STATES & PROCESSES |
| F | FOOD & FARMING |
| G | GOVT. & THE PUBLIC DOMAIN |
| H | ARCHITECTURE, BUILDINGS, HOUSES & THE HOME |
| I | MONEY & COMMERCE |
| K | ENTERTAINMENT, SPORTS, & GAMES |
| L | LIFE & LIVING THINGS |
| M | MOVEMENT, LOCATION, TRAVEL, & TRANSPORT |
| N | NUMBERS & MEASUREMENT |
| O | SUBSTANCES, MATERIALS, OBJECTS, & EQUIPMENT |
| P | EDUCATION |
| Q | LINGUISTIC ACTIONS, STATES, & PROCESSES |
| S | SOCIAL ACTIONS, STATES, & PROCESSES |
| T | TIME |
| W | THE WORLD & OUR ENVIRONMENT |
| X | PSYCHOLOGICAL ACTIONS, STATES, & PROCESSES |
| Y | SCIENCE & TECHNOLOGY |
| Z | NAMES & GRAMMATICAL WORDS |
Figure 1. Percentage of nouns in each semantic field across CEFR levels.
Top two semantic fields at each level.
| CEFR level | Ratio | Semantic field | Examples |
|---|---|---|---|
| A1 | 19% | F | |
| | 30% | B | |
| A2 | 14% | K | |
| | 35% | H | |
| B1 | 20% | A | |
| | 24% | S | |
| B2 | 25% | A | |
| | 31% | S | |
| C | 20% | S | |
| | 25% | A | |
Figure 2. Biplot of correspondence analysis: CEFR levels and semantic fields of noun elements.
Crosstabulation of proficiency level and difficulty level of noun elements.
| CEFR level | EVP A1 | EVP A2 | EVP B1 | EVP B2 | EVP C1 | EVP C2 |
|---|---|---|---|---|---|---|
| A1 | 39 | 13 | 8 | 8 | 1 | 0 |
| A2 | 274 | 139 | 75 | 3 | 0 | 2 |
| B1 | 103 | 111 | 204 | 49 | 0 | 4 |
| B2 | 109 | 107 | 321 | 163 | 17 | 6 |
| C | 107 | 211 | 286 | 80 | 34 | 27 |
Adjusted residuals appear in parentheses below observed frequencies.
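The adjusted residuals mentioned in the table note did not survive in this record. As a rough illustration, they can be recomputed from the observed frequencies with the usual formula for a contingency table, (O - E) / sqrt(E × (1 - row total / N) × (1 - column total / N)). This is a sketch of that standard formula, not the authors' own analysis script, and `adjusted_residuals` is an illustrative name.

```python
import math

# Observed frequencies: rows are learner CEFR levels (A1, A2, B1, B2, C),
# columns are EVP levels of the noun element (A1, A2, B1, B2, C1, C2).
observed = [
    [39, 13, 8, 8, 1, 0],
    [274, 139, 75, 3, 0, 2],
    [103, 111, 204, 49, 0, 4],
    [109, 107, 321, 163, 17, 6],
    [107, 211, 286, 80, 34, 27],
]


def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = (expected
                        * (1 - row_totals[i] / n)
                        * (1 - col_totals[j] / n))
            out.append((obs - expected) / math.sqrt(variance))
        result.append(out)
    return result


residuals = adjusted_residuals(observed)
for label, row in zip(["A1", "A2", "B1", "B2", "C"], residuals):
    print(label, [round(value, 1) for value in row])
```

Cells with an absolute adjusted residual above roughly 2 are the ones conventionally read as departing from independence.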
Figure 3. The relationship between the CEFR level and length of noun elements.
Summary of the mixed effects model for the length of noun elements.
| Parameter | Fixed effects: estimate [CI] | SE | t | p | Random effects: topic | Random effects: nationality | Random effects: learner |
|---|---|---|---|---|---|---|---|
| Intercept | 4.99 [4.24;5.72] | 0.40 | 12.62 | 0.0001 | 0.87 | 0.09 | 0.0004 |
| CEFR A2 | −0.28 [−1.14;0.63] | 0.46 | −0.60 | 0.55 | – | – | – |
| CEFR B1 | 0.92 [0.05;1.80] | 0.45 | 2.04 | 0.04 | – | – | – |
| CEFR B2 | 0.82 [−0.03;1.69] | 0.45 | 1.82 | 0.07 | – | – | – |
| CEFR C | 0.94 [0.09;1.72] | 0.44 | 2.16 | 0.03 | – | – | – |
p < 0.05;
p < 0.001.
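The column labels of the model summary were lost in this record. A quick consistency check, assuming the second numeric column is the standard error and the third is the t statistic (t ≈ estimate / SE), supports that reading. The values are copied from the table, and the small gaps are what rounding of the published estimates and standard errors would produce.

```python
# (estimate, assumed SE, reported t) for each fixed effect in the table.
fixed_effects = {
    "Intercept": (4.99, 0.40, 12.62),
    "CEFR A2": (-0.28, 0.46, -0.60),
    "CEFR B1": (0.92, 0.45, 2.04),
    "CEFR B2": (0.82, 0.45, 1.82),
    "CEFR C": (0.94, 0.44, 2.16),
}

# Recompute t as estimate / SE and compare with the reported value.
recomputed = {name: est / se for name, (est, se, _) in fixed_effects.items()}

for name, (est, se, t_reported) in fixed_effects.items():
    print(f"{name}: recomputed t = {recomputed[name]:.2f}, reported t = {t_reported}")
```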