Abstract
Lexical features are shaped by both language and genre. Studying the lexical features of texts in different genres on the same topic helps to reveal the universalities and peculiarities of languages. This study examines the lexical features and word collocations of two self-built corpora (China's economic Legal Policy Corpus and an English News Corpus, both compiled during the COVID-19 pandemic), using methods from Quantitative Linguistics together with contextual interpretation. The results show that: (1) word length, word frequency, word clusters, and the distribution of high-frequency words in English economic news and Chinese economic legal policies are influenced to some extent by language and genre, and they fit different function curves; (2) during the COVID-19 pandemic, "development" was the focus of both China's economic legal policies and English news, and both attached importance to economic recovery and took a positive attitude toward it, each in its own way. These findings suggest that: (1) English economic news and Chinese economic legal policies show both universalities and peculiarities in the distribution of lexical features; (2) laws and news are to some degree synchronized, and both maintain a positive and objective attitude toward economic development during the pandemic. By examining internal structure and external interpretation at the macro level, this study enriches research on the lexical and cultural features of language and offers a reference for related studies.
Keywords: COVID-19; corpus studies; economic legal policy; lexical feature; news text
Year: 2022 PMID: 35844862 PMCID: PMC9283870 DOI: 10.3389/fpubh.2022.928965
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Basic information of the two corpora.
| Corpus | Source | Language | Size | Time span | Remarks |
|---|---|---|---|---|---|
| Legal Policy Corpus (LPC) | Official | Chinese | 174,904 | Jan. 2020 to Nov. 2020 | Currently effective |
| News Corpus (NC) | Official | English | 123,333 | Apr. 2020 to Nov. 2020 | International section |
Basic features of the two corpora.
| Feature | LPC | NC |
|---|---|---|
| Tokens | 90,648 | 122,271 |
| Types | 7,116 | 8,769 |
| Type-token ratio (TTR) | 7.85% | 7.17% |
| Standard type-token ratio (STTR) | 44.83% | 44.06% |
| STTR standard deviation | 52.29 | 54.83 |
| STTR basis | 1,000 | 1,000 |
| Mean word length | 1.87 | 5.20 |
| Mean word length standard deviation | 0.61 | 2.78 |
| Mean sentence length | 22.26 | 25.70 |
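The type-token ratio reported above is the number of distinct word forms (types) divided by the number of running words (tokens). As a minimal sketch, assuming the first numeric column is LPC and the second is NC, the reported percentages can be recomputed from the Tokens and Types rows. The function name `type_token_ratio` is illustrative and does not come from the study.

```python
def type_token_ratio(types: int, tokens: int) -> float:
    """Return the type-token ratio as a percentage."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return 100.0 * types / tokens


# (types, tokens) copied from the table; the corpus labels are an assumption
# based on the order in which the captions name the corpora.
corpora = {"LPC": (7116, 90648), "NC": (8769, 122271)}

ttr = {
    name: round(type_token_ratio(types, tokens), 2)
    for name, (types, tokens) in corpora.items()
}
print(ttr)  # {'LPC': 7.85, 'NC': 7.17}
```

The standardized ratio (STTR) cannot be recomputed this way from the totals alone, since it is usually obtained by averaging the ratio over consecutive fixed-size segments (here, a basis of 1,000 tokens).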
Word length distribution of legal policy corpus (LPC) and news corpus (NC).
| Word length | LPC | NC |
|---|---|---|
| 1-letter words | 21,859 | 2,484 |
| 2-letter words | 64,514 | 18,905 |
| 3-letter words | 5,533 | 21,371 |
| 4-letter words | 1,601 | 18,023 |
| 5-letter words | 130 | 12,607 |
| 6-letter words | 48 | 11,505 |
| 7-letter words | 1 | 12,184 |
| 8-letter words | 3 | 9,604 |
| 9-letter words | 0 | 6,395 |
| 10-letter words | 0 | 4,532 |
| 11-letter words | 3 | 2,831 |
| 12-letter words | 1 | 1,257 |
| 13-letter words | 0 | 1,195 |
| 14-letter words | 0 | 291 |
| 15-letter words | 0 | 104 |
| 16-letter words | 0 | 28 |
| 17-letter words | 0 | 9 |
| 18-letter words | 0 | 3 |
| 19-letter words | 0 | 0 |
| 20-letter words | 0 | 5 |
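The mean word lengths listed among the basic features can be recovered from this distribution as a frequency-weighted average. The sketch below assumes the first numeric column is LPC and the second is NC. The function and variable names are illustrative.

```python
# Word counts for lengths 1..20, copied from the table
# (column order assumed to be LPC, then NC).
lpc = [21859, 64514, 5533, 1601, 130, 48, 1, 3, 0, 0,
       3, 1, 0, 0, 0, 0, 0, 0, 0, 0]
nc = [2484, 18905, 21371, 18023, 12607, 11505, 12184, 9604, 6395, 4532,
      2831, 1257, 1195, 291, 104, 28, 9, 3, 0, 5]


def mean_word_length(counts):
    """Frequency-weighted mean length; counts[i] is the count for length i + 1."""
    total_words = sum(counts)
    total_units = sum(length * n for length, n in enumerate(counts, start=1))
    return total_units / total_words


def percentages(counts):
    """Share of each word length in percent (the quantity plotted per length)."""
    total = sum(counts)
    return [100.0 * n / total for n in counts]


print(round(mean_word_length(lpc), 2))  # 1.87
print(round(mean_word_length(nc), 2))   # 5.2
print([round(p, 1) for p in percentages(nc)[:5]])
```

Under this reading the NC counts sum to 123,333 and the length-weighted LPC total is 174,904, which are the two sizes given in the basic-information table.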
Correlation analysis of word length and frequency distribution between LPC and NC.
| | | LPC | NC |
|---|---|---|---|
| LPC | Correlation Coefficient | 1.000 | 0.661** |
| | Sig. (2-tailed) | . | 0.000 |
| | N | 20 | 20 |
| NC | Correlation Coefficient | 0.661** | 1.000 |
| | Sig. (2-tailed) | 0.000 | . |
| | N | 20 | 20 |
**. Correlation is significant at the 0.01 level (2-tailed).
Figure 1. Word length percentage distribution of LPC and NC.
Figure 2. Dispersion diagram of word length and frequency of LPC and NC.
Model summaries and parameter estimates of LPC.
| Equation | R Square | F | df1 | df2 | Sig. | Constant | b1 |
|---|---|---|---|---|---|---|---|
| Logarithmic | 0.396 | 11.783 | 1 | 18 | 0.003 | 29141.417 | −11553.425 |
| Power | 0.831 | 88.799 | 1 | 18 | 0.000 | 91851.245 | −4.206 |
| Exponential | 0.621 | 29.542 | 1 | 18 | 0.000 | 2366.724 | −0.500 |
Independent variable: Wordlength.
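The three rows correspond to common one-predictor curve forms. As a minimal sketch, assume the last two columns of the LPC table are a constant b0 and a coefficient b1, and assume the usual parameterizations: logarithmic y = b0 + b1 ln(x), power y = b0 x^b1, exponential y = b0 e^(b1 x). The source does not state these formulas, so both the column reading and the equations are assumptions, and `predict` is a hypothetical helper.

```python
import math

# (b0, b1) copied from the LPC table; their meaning is assumed, see above.
LPC_PARAMS = {
    "logarithmic": (29141.417, -11553.425),
    "power": (91851.245, -4.206),
    "exponential": (2366.724, -0.500),
}


def predict(model: str, x: float, params=LPC_PARAMS) -> float:
    """Predicted frequency for word length x under the assumed curve form."""
    b0, b1 = params[model]
    if model == "logarithmic":
        return b0 + b1 * math.log(x)
    if model == "power":
        return b0 * x ** b1
    if model == "exponential":
        return b0 * math.exp(b1 * x)
    raise ValueError(f"unknown model: {model}")


# Compare the shapes of the three curves over the first few word lengths.
for name in LPC_PARAMS:
    print(name, [round(predict(name, x)) for x in (1, 2, 3)])
```

At x = 1 the logarithmic and power curves both reduce to their constant b0, which gives a quick check that the parameters were read in the intended order.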
Model summaries and parameter estimates of NC.
| Equation | R Square | F | df1 | df2 | Sig. | Constant | b1 |
|---|---|---|---|---|---|---|---|
| Logarithmic | 0.469 | 15.880 | 1 | 18 | 0.001 | 19032.511 | −6078.008 |
| Power | 0.487 | 17.092 | 1 | 18 | 0.001 | 309632.055 | −2.838 |
| Exponential | 0.810 | 76.784 | 1 | 18 | 0.000 | 149502.906 | −0.503 |
Independent variable: Wordlength.
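For a regression with a single predictor, the F statistic follows from R² and the degrees of freedom as F = (R² / df1) / ((1 − R²) / df2). The sketch below applies this identity to both model tables, assuming the first four numeric columns are R², F, df1 and df2 in that order. Small differences from the reported F values are expected because R² is rounded to three decimals.

```python
def f_from_r2(r2: float, df1: int, df2: int) -> float:
    """F statistic implied by R^2 for the given degrees of freedom."""
    return (r2 / df1) / ((1.0 - r2) / df2)


# (R^2, reported F) copied from the two tables; df1 = 1 and df2 = 18 in every row.
rows = {
    ("LPC", "logarithmic"): (0.396, 11.783),
    ("LPC", "power"): (0.831, 88.799),
    ("LPC", "exponential"): (0.621, 29.542),
    ("NC", "logarithmic"): (0.469, 15.880),
    ("NC", "power"): (0.487, 17.092),
    ("NC", "exponential"): (0.810, 76.784),
}

implied = {key: f_from_r2(r2, 1, 18) for key, (r2, _) in rows.items()}
for key, (r2, f_reported) in rows.items():
    print(key, round(implied[key], 2), f_reported)
```

Under this reading the best-fitting form differs by corpus: the power curve has the highest R² for LPC (0.831) and the exponential curve for NC (0.810).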
Figure 3. Word clusters of 2 to 7 words with frequencies over 20 in two corpora.
Information on word clusters of 2 to 7 words with frequencies over 10 in the two corpora.
| Cluster size | LPC | NC |
|---|---|---|
| word clusters of two words | 854 | 1,298 |
| word clusters of three words | 299 | 388 |
| word clusters of four words | 175 | 103 |
| word clusters of five words | 117 | 34 |
| word clusters of six words | 92 | 10 |
| word clusters of seven words | 72 | 2 |
| Total | 1,609 | 1,835 |
Figure 4. The number of word clusters and the distribution of standardized word clusters.
Basic information on the high-frequency word distribution of LPC and NC.
| Measure | LPC | NC |
|---|---|---|
| Number of high frequency words | 61 | 113 |
| Token Ratio of high frequency words | 24.72% | 50.38% |
| Type Ratio of high frequency words | 0.86% | 1.29% |
Figure 5. Token ratio distribution of high frequency words of LPC and NC.
Figure 6. Type ratio distribution of high frequency words of LPC and NC.
Top eleven content words by frequency in the two corpora.
| LPC word | Frequency | NC word | Frequency |
|---|---|---|---|
| 发展 (development) | 801 | China | 1,614 |
| 企业 (enterprise) | 756 | Said | 1,117 |
| 服务 (service) | 538 | Development | 501 |
| 疫情 (COVID-19) | 517 | Global | 466 |
| 建设 (construction) | 507 | World | 444 |
| 工作 (job) | 487 | Economic | 440 |
| 经济 (economy) | 486 | Market | 420 |
| 推进 (improvement) | 463 | Growth | 355 |
| 改革 (reform) | 414 | International | 350 |
| 政策 (policy) | 387 | Cooperation | 284 |
| 管理 (management) | 341 | Recovery | 263 |
Collocations of “development” in the two corpora.
| LPC | NC |
|---|---|
| 经济社会发展 (economic and social development) | Stage of development |
| 规划发展 (planning and development) | Development stage |
| 发展战略 (development strategy) | Development area |
| 改革发展 (reform and development) | Investment and development |
| 发展要求 (development requirement) | Global development |
| 企业发展质量 (enterprise development quality) | New technology development |
| 发展定位 (development orientation) | Boosting regional development |
| 发展目标 (goal of development) | Development for recovery |
| 发展的原则 (principles of development) | Development of high-tech |
| 发展理念 (concepts for development) | Development in Shanxi |
| 协调发展 (coordinated development) | Development in China |
| 发展计划 (development planning) | Development and cooperation |
| 高质量发展政策 (high-quality development policies) | Development of foreign trade |
| 发展规范 (development norm) | Sustainable development |
| 发展规划纲要 (outline of development plan) | Development and prosperity |
| 加快发展 (speed up development) | Development of China's film |
| 发展氛围 (atmosphere of development) | Achieving development |