Abstract
To address tense inconsistency in Chinese-English neural machine translation (NMT), a Chinese verb tense annotation model is proposed. First, a neural network is used to build the Chinese tense annotation model. During translation, the source-side tense is passed to the target side through the alignment matrix of the conventional attention mechanism, and the probability of any candidate word whose tense is inconsistent with that of the corresponding source word is reduced. The Chinese-English temporal annotation algorithm is then integrated into the MT model to build a Chinese-English translation system with tense processing. In essence, the system uses the annotation algorithm during translation to extract temporal information from the Chinese sentence and transfer it to the corresponding English sentence, so that the English output carries the tense of the original Chinese sentence. The experimental results show that the tense annotation model based on bidirectional long short-term memory (BiLSTM) predicts Chinese verb tense most accurately and therefore yields the largest improvement to the NMT model, most notably on the NIST06 test set, where the BLEU score increases by 1.07 points. The transformer, as the mainstream translation model, contains a multihead attention mechanism that can attend to some temporal information and therefore has a certain ability to handle tense in translation. The proposed approach addresses the tense problems encountered in MT and improves the credibility of Chinese-English machine translation.
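The tense transfer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `penalty` factor, the hard arg-max alignment, and the toy probabilities are all assumptions made for illustration. The sketch projects the tense of the most strongly attended source word onto the current target position and scales down candidate words whose tense conflicts with it.

```python
from typing import Dict, List, Optional


def project_tense(attention: List[float], source_tenses: List[str]) -> str:
    """Return the tense label of the source word with the largest attention weight."""
    best = max(range(len(attention)), key=lambda i: attention[i])
    return source_tenses[best]


def rescore(
    candidates: Dict[str, float],
    candidate_tense: Dict[str, Optional[str]],
    target_tense: str,
    penalty: float = 0.5,  # hypothetical down-weighting factor
) -> Dict[str, float]:
    """Down-weight candidates whose tense conflicts with target_tense, then renormalize."""
    scaled = {}
    for word, prob in candidates.items():
        tense = candidate_tense.get(word)  # None: the word carries no tense
        conflict = (
            tense is not None and target_tense != "none" and tense != target_tense
        )
        scaled[word] = prob * penalty if conflict else prob
    total = sum(scaled.values())
    return {word: prob / total for word, prob in scaled.items()}


# Toy example: one row of the alignment matrix for the current target position.
attention_row = [0.1, 0.7, 0.2]
source_tenses = ["none", "future", "none"]  # output of a hypothetical tense annotator
projected = project_tense(attention_row, source_tenses)

candidates = {"will": 0.3, "would": 0.4, "the": 0.3}
candidate_tense = {"will": "future", "would": "past", "the": None}
rescored = rescore(candidates, candidate_tense, projected)
```

In this toy case the past-tense candidate "would" loses probability mass to the candidates that agree with the projected future tense. A real system would more plausibly use the full soft alignment rather than a single arg-max.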
Year: 2022 PMID: 35726286 PMCID: PMC9206575 DOI: 10.1155/2022/1662311
Source DB: PubMed Journal: Comput Intell Neurosci
Sixteen tenses of English verbs.
| Tense | Simple tense | Perfect tense | Continuous tense | Perfect continuous tense |
|---|---|---|---|---|
| Present tense | Simple present tense | Present perfect tense | Present continuous tense | Present perfect continuous tense |
| Past tense | Simple past tense | Past perfect tense | Past continuous tense | Past perfect continuous tense |
| Future tense | Simple future tense | Future perfect tense | Future continuous tense | Future perfect continuous tense |
| Past future tense | Simple past future tense | Past future perfect tense | Past future continuous tense | Past future perfect continuous tense |
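The sixteen tenses in the table are the cross product of four time references and four aspects. As a rough sketch, a label set for a tense annotator could be built this way; the names `TIMES`, `ASPECTS`, `tense_name`, and `TENSE_LABELS` are illustrative assumptions, not identifiers from the paper.

```python
from itertools import product

TIMES = ["present", "past", "future", "past future"]
ASPECTS = ["simple", "perfect", "continuous", "perfect continuous"]


def tense_name(time: str, aspect: str) -> str:
    """Build the tense name using the word order shown in the table."""
    if aspect == "simple":
        return f"simple {time} tense"
    return f"{time} {aspect} tense"


# Map each of the 4 x 4 tense names to an integer class index.
TENSE_LABELS = {
    name: index
    for index, name in enumerate(tense_name(t, a) for t, a in product(TIMES, ASPECTS))
}
```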
Figure 1. Automatic Chinese temporal annotation model based on LSTM.
Figure 2. Encoder-Decoder model framework.
Figure 3. NMT model.
Main parameter settings and descriptions of baseline model training.
| Parameters | Parameter value | Descriptions |
|---|---|---|
| Src vocab size | 30000 | The size of source Chinese vocabulary |
| Tgt vocab size | 30000 | The size of target English vocabulary |
| Batch size | 64 | Minibatch size: the number of training samples used in each training step |
| Embedding size | 500 | Dimension of the source and target word embeddings |
| Encoder type | BiLSTM | The type of neural network used by the Encoder, here a bidirectional LSTM |
| Decoder type | LSTM | The type of neural network used by the Decoder, here a standard LSTM |
| Enc/Dec layers | 2 | Number of network layers in the Encoder and in the Decoder |
| LSTM size | 500 | Dimension of the LSTM hidden layer |
| Optimization | Adam | Optimization algorithm |
| Learning rate | 0.001 | Learning rate for neural network training |
| Beam size | 10 | The number of candidates kept at each step of beam search |
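For readers who want to reuse these settings, the table can be captured as a small configuration object. This is a sketch only: the class name `BaselineConfig` and its field names are assumptions and do not correspond to the options of any particular NMT toolkit. The values are copied from the table.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BaselineConfig:
    """Baseline training settings from the table (field names are hypothetical)."""

    src_vocab_size: int = 30000
    tgt_vocab_size: int = 30000
    batch_size: int = 64
    embedding_size: int = 500
    encoder_type: str = "BiLSTM"
    decoder_type: str = "LSTM"
    num_layers: int = 2
    lstm_size: int = 500
    optimizer: str = "Adam"
    learning_rate: float = 0.001
    beam_size: int = 10


config = BaselineConfig()

# Size of one embedding table: vocabulary entries times embedding dimension.
src_embedding_params = config.src_vocab_size * config.embedding_size

settings = asdict(config)  # plain dict, convenient for logging or serialization
```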
Figure 4. Structure diagram: (a) dependency syntax tree; (b) phrase analysis syntax tree.
Figure 5. Flow chart of temporal recovery algorithm.
Figure 6. DeTense model data preprocessing steps.
Figure 7. Chinese temporal annotation results.
Test results of NMT model combined with source-side Chinese annotation information.
| Models | NIST03 | NIST04 | NIST05 | NIST06 | NIST08 | Mean value |
|---|---|---|---|---|---|---|
| Baseline | 35.84 | 38.84 | 34.46 | 33.85 | 26.59 | 33.92 |
| Tense-LSTM | 35.76 | 38.40 | 34.56 | 34.59 | 26.74 | 34.01 |
| Tense-BiLSTM | 35.99 | 38.67 | 34.70 | 34.92 | 26.76 | 34.21 |
| Tense-2layers | 36.06 | 38.56 | 34.43 | 34.52 | 26.85 | 34.08 |
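The mean values and the gains over the baseline can be recomputed directly from the table. The snippet below is a small worked check; the scores are copied from the table, while the helper names `mean_bleu` and `gains` are introduced here for illustration.

```python
TEST_SETS = ["NIST03", "NIST04", "NIST05", "NIST06", "NIST08"]

BLEU = {
    "Baseline": [35.84, 38.84, 34.46, 33.85, 26.59],
    "Tense-LSTM": [35.76, 38.40, 34.56, 34.59, 26.74],
    "Tense-BiLSTM": [35.99, 38.67, 34.70, 34.92, 26.76],
    "Tense-2layers": [36.06, 38.56, 34.43, 34.52, 26.85],
}


def mean_bleu(model: str) -> float:
    """Average BLEU over the five test sets, rounded to two decimals."""
    scores = BLEU[model]
    return round(sum(scores) / len(scores), 2)


def gains(model: str) -> dict:
    """Per-test-set BLEU difference between a model and the baseline."""
    return {
        test_set: round(score - base, 2)
        for test_set, score, base in zip(TEST_SETS, BLEU[model], BLEU["Baseline"])
    }


bilstm_gains = gains("Tense-BiLSTM")
```

The recomputed means match the last column, and `bilstm_gains["NIST06"]` reproduces the 1.07 gain reported in the abstract. The gains are not uniform across test sets: on NIST04 every tense-aware model scores slightly below the baseline.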
Figure 8. Comparison of model translation effect: (a) translation effect of the last model; (b) average translation effect of the last 10 models.
Correct experimental example sentences.
| Original Chinese sentence | 因此, 经常预算准备金总数将减少至3.484亿美元 |
|---|---|
| Original English sentence | Thus, total regular budget reserves will be reduced to $348.4 million |
| Baseline | The total reserve for the regular budget would be reduced to $348.4 million |
| Tenseless English sentence | The total level of the regular budget reserve is reduced to $484.4 million |
| DeTense + temporal annotation | The total level of the regular budget reserve will be reduced to $484.4 million |
Examples of experimental errors.
| Original Chinese sentence | 即便如此, 如果这将使我们能够更好地了解历史问题及其法律内涵, 再次把不满化为友谊, 那么, 用同情和宽容看待各种讨论, 并期待各方抱有同样的态度, 也是理所当然的。 |
|---|---|
| Original English sentence | Even so, if this will enable us to better understand historical issues with their legal aspects and to transform resentment into friendship again, it is natural to approach different discourses with empathy and tolerance and expect a similar attitude from all sides |
| Baseline | Even so, if this will enable us to better understand the historical issues and their legal dimensions, to turn the grievances once again into friendship, then it is legitimate to view the discussion with compassion and tolerance and to expect the same attitude from all sides |
| Tenseless English sentence | Even so, if it enables us to better understand historical issues and their legal content and once again translate grievances into friendship, then it is only natural to view the discussions with sympathy and tolerance and to expect the same attitude on all sides |
| DeTense + temporal annotation | Even so, if it enables us to better understand historical issues and their legal content and once again will translate grievances into friendship, then it is only natural to view the discussions with sympathy and tolerance and to expect the same attitude on all sides |