| Literature DB >> 35655503 |
Abstract
This work aims to reduce the workload of teachers in English teaching and improve students' writing, providing a way for students to practice English composition scoring independently and meeting the needs of college teachers and students for intelligent composition scoring and automatically generated comments. The work first clarifies the teaching requirements of college English classrooms and expounds the principles and advantages of machine learning technology. Second, a three-layer neural network model (NNM) is constructed using a multilayer perceptron (MLP) combined with the latent Dirichlet allocation (LDA) algorithm. Three semantic representation vector technologies, namely word vectors, paragraph vectors, and full-text vector features, are used to represent the full-text vocabulary of an English composition. A model based on the K-nearest neighbors (kNN) algorithm is then proposed to generate English composition comments, and a final score is produced by an extreme gradient boosting (XGBoost) model. Finally, a dataset is constructed from 800 college students' English essays written for a CET-4 mock test, and the model is evaluated on it. The results show that the proposed semantic representation vector technology extracts the lexical semantic features of English compositions more effectively, and that scoring and evaluating compositions with the XGBoost model and the kNN algorithm improves scoring accuracy and makes the overall scoring pipeline more efficient. The proposed model therefore outperforms the traditional model in evaluation accuracy. This work provides a new direction for applying artificial intelligence technology to English teaching in the context of modern information technology.
Year: 2022 PMID: 35655503 PMCID: PMC9152396 DOI: 10.1155/2022/6912018
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Schematic diagram of the MLP neural network.
Figure 2. Schematic diagram of the three-layer NNM.
Figure 3. Schematic diagram of the principle of the word2vec model.
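As a minimal sketch of how the three semantic representation vectors mentioned in the abstract could relate, word vectors can be pooled into paragraph vectors and a full-text vector. The toy 4-dimensional embeddings and the use of mean pooling are illustrative assumptions; the paper does not specify the pooling operation.

```python
# Sketch: pooling word vectors into paragraph and full-text vectors.
# The embeddings and mean pooling are illustrative assumptions.

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

# Toy 4-dimensional word embeddings (stand-ins for word2vec output).
word_vecs = {
    "english": [0.1, 0.3, 0.2, 0.0],
    "essay":   [0.4, 0.1, 0.0, 0.2],
    "score":   [0.2, 0.2, 0.1, 0.1],
}

paragraphs = [["english", "essay"], ["essay", "score"]]

# Paragraph vector: mean of the paragraph's word vectors.
para_vecs = [mean_vector([word_vecs[w] for w in p]) for p in paragraphs]
# Full-text vector: mean of the paragraph vectors.
full_vec = mean_vector(para_vecs)
print([round(x, 3) for x in full_vec])  # → [0.275, 0.175, 0.075, 0.125]
```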
Figure 4. Schematic diagram of the network model.
Figure 5. The calculation process of LDA parameters.
The production process of an article and the calculation steps of the LDA model.
| Order | Step | Calculation process |
|---|---|---|
| 1 | Determine the distribution of topics and words | LDA calculates the multinomial distribution of feature words and describes it with parameters |
| 2 | Determine the distribution of articles and topics | The number of feature words is drawn according to a Poisson distribution (PD) |
| 3 | Randomly determine the number | LDA calculates the probability vector of the topic distribution |
| 4 | If the number of terms generated so far is less than N, go to step 5; otherwise, go to step 6 | From the |
| 5 | A topic is generated randomly according to the article-topic distribution, and a word is then generated randomly from the topic-word distribution; return to step 4 | |
| 6 | Article generation is finished | |
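The generative process in the table above can be sketched as follows. The topic-word distributions, the document-topic vector, and the fixed article length N are illustrative assumptions; in LDA these would be drawn from Dirichlet and Poisson priors.

```python
# Sketch of the LDA article-generation process from the table above.
# All distributions and the length N are toy assumptions.
import random

random.seed(0)

# Step 1: topic-word distributions (toy values standing in for
# multinomials with Dirichlet priors).
topic_word = {
    0: (["grammar", "tense", "clause"], [0.5, 0.3, 0.2]),
    1: (["score", "essay", "teacher"], [0.4, 0.4, 0.2]),
}
# Step 3: probability vector of the topic distribution for one article.
doc_topic = [0.6, 0.4]

# Step 2: the article length would be a Poisson draw; a fixed N keeps
# the sketch deterministic.
N = 5

article = []
while len(article) < N:  # steps 4-5: repeat until N words are generated
    topic = random.choices([0, 1], weights=doc_topic)[0]
    words, probs = topic_word[topic]
    article.append(random.choices(words, weights=probs)[0])
# Step 6: article generation is finished.
print(article)
```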
The general procedure for scoring essays.
| Order | Step |
|---|---|
| 1 | The TF-IDF and TextRank methods are used to select typical labels for scoring essays |
| 2 | With a general feature vector representing both the essay to be scored and each essay in the dataset, the feature vectors are compared by cosine similarity. Then, the |
| 3 | The score is obtained |
| 4 | The TextRank method is used to estimate the weight of each word segment, and the word segment sequence |
| 5 | In the scoring process, essay |
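The cosine-similarity comparison and nearest-neighbor scoring idea in the steps above can be sketched as follows. The feature vectors, the value of k, and scoring by the mean of the neighbors' scores are assumptions for illustration, not the paper's exact rule.

```python
# Sketch: score an essay by cosine similarity to a scored corpus (kNN).
# Vectors, k, and the mean-of-neighbors rule are illustrative assumptions.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn_score(query, corpus, k=2):
    """Score an essay as the mean score of its k most similar essays."""
    ranked = sorted(corpus, key=lambda e: cosine(query, e["vec"]), reverse=True)
    return sum(e["score"] for e in ranked[:k]) / k

# Toy scored corpus: feature vectors with known human scores.
corpus = [
    {"vec": [1.0, 0.0, 0.0], "score": 60},
    {"vec": [0.9, 0.1, 0.0], "score": 70},
    {"vec": [0.0, 1.0, 0.0], "score": 90},
]

# The query is closest to the first two essays, so it averages their scores.
print(knn_score([1.0, 0.05, 0.0], corpus, k=2))  # → 65.0
```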
Figure 6. The number of essays in each category.
Figure 7. The number of essays in each score range.
Figure 8. MSE and PCC results of different methods.
The results of precision, recall, and F1-score.
| Method | F1-score | Recall | Precision |
|---|---|---|---|
| The proposed method | 0.80 | 0.78 | 0.82 |
| TR + kNN | 0.68 | 0.66 | 0.70 |
| TF-IDF + kNN | 0.65 | 0.63 | 0.68 |
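For reference, the F1-score in the table is the harmonic mean of precision and recall. The check below uses the proposed method's reported precision and recall from the table; it is a consistency check, not a reproduction of the paper's computation.

```python
# F1-score as the harmonic mean of precision and recall, checked
# against the proposed method's reported values from the table.

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

p, r = 0.82, 0.78  # reported precision and recall of the proposed method
print(round(f1(p, r), 2))  # → 0.8, matching the reported F1-score
```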