| Literature DB >> 33286789 |
Consuelo V. García-Mendoza, Omar J. Gambino, Miguel G. Villarreal-Cervantes, Hiram Calvo.
Sentiment polarity classification in social media is a very important task, as it enables gathering trends on particular subjects given a set of opinions. Currently, a great advance has been made by using deep learning techniques, such as word embeddings, recurrent neural networks, and encoders, such as BERT. Unfortunately, these techniques require large amounts of data, which, in some cases, is not available. In order to model this situation, challenges, such as the Spanish TASS organized by the Spanish Society for Natural Language Processing (SEPLN), have been proposed, which pose particular difficulties: First, an unwieldy balance in the training and the test set, being this latter more than eight times the size of the training set. Another difficulty is the marked unbalance in the distribution of classes, which is also different between both sets. Finally, there are four different labels, which create the need to adapt current classifications methods for multiclass handling. Traditional machine learning methods, such as Naïve Bayes, Logistic Regression, and Support Vector Machines, achieve modest performance in these conditions, but used as an ensemble it is possible to attain competitive execution. Several strategies to build classifier ensembles have been proposed; this paper proposes estimating an optimal weighting scheme using a Differential Evolution algorithm focused on dealing with particular issues that multiclass classification and unbalanced corpora pose. The ensemble with the proposed optimized weighting scheme is able to improve the classification results on the full test set of the TASS challenge (General corpus), achieving state of the art performance when compared with other works on this task, which make no use of NLP techniques.Entities:
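The method the abstract describes (per-classifier, per-class soft-voting weights tuned by Differential Evolution) can be sketched as follows. This is not the authors' code: the toy data, the three base models, and the use of scipy's `differential_evolution` are stand-ins chosen for illustration.

```python
# Sketch only: DE-tuned soft-voting weights for a 4-class ensemble.
from scipy.optimize import differential_evolution
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic 4-class data with an imbalance loosely shaped like the TASS corpus.
X, y = make_classification(n_samples=600, n_classes=4, n_informative=8,
                           weights=[0.40, 0.30, 0.21, 0.09], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

models = [GaussianNB(), LogisticRegression(max_iter=1000),
          SVC(probability=True, random_state=0)]
probas = [m.fit(X_tr, y_tr).predict_proba(X_val) for m in models]

def neg_macro_f1(w):
    """One weight per (classifier, class) pair; minimize negative macro-F1."""
    w = w.reshape(len(models), 4)
    combined = sum(wi * p for wi, p in zip(w, probas))
    return -f1_score(y_val, combined.argmax(axis=1), average="macro")

result = differential_evolution(neg_macro_f1, bounds=[(0, 1)] * 12,
                                seed=0, maxiter=20)
best_w = result.x.reshape(len(models), 4)
print("macro-F1 with optimized weights:", -result.fun)
```

The fitness function here optimizes macro-F1 on a held-out split, which is one natural choice for unbalanced multiclass data; the paper's exact objective and DE settings may differ.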
Keywords: Twitter sentiment analysis; ensemble learning; evolutionary optimization; sentiment polarity; unbalanced classes
Year: 2020 PMID: 33286789 PMCID: PMC7597113 DOI: 10.3390/e22091020
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Opinion polarity distribution of tweets in the TASS corpus.

| Polarity | Training: Frequency | Training: Tweets | Test: Frequency | Test: Tweets |
|---|---|---|---|---|
| Positive | 39.94% | 2884 | 36.57% | 22,233 |
| Negative | 30.22% | 2182 | 26.06% | 15,844 |
| None | 20.54% | 1483 | 35.22% | 21,416 |
| Neutral | 9.28% | 670 | 2.15% | 1305 |
Probabilities generated by each classifier that the q-th tweet belongs to each of the classes. (The symbolic table was lost in extraction.)
Weighting scheme of the ensemble classification. (The symbolic table was lost in extraction.)
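The weighting-scheme table above assigns one weight per (classifier, class) pair. A minimal numpy sketch of how such weights combine classifier probabilities into a weighted soft vote (all numbers here are invented, not from the paper):

```python
import numpy as np

# Per-classifier class probabilities for one tweet (rows: 3 classifiers,
# columns: the 4 polarity classes P, N, NONE, NEU). Invented numbers.
probs = np.array([
    [[0.6, 0.2, 0.1, 0.1]],
    [[0.3, 0.4, 0.2, 0.1]],
    [[0.5, 0.1, 0.3, 0.1]],
])  # shape (3 classifiers, 1 tweet, 4 classes)

# One weight per (classifier, class) pair, e.g. as found by the optimizer.
w = np.array([[1.0, 0.5, 0.5, 0.5],
              [0.2, 1.0, 0.5, 0.5],
              [0.8, 0.3, 0.5, 0.5]])

# Weighted soft vote: scale each classifier's probabilities by its class
# weights, sum over classifiers, then take the argmax class per tweet.
scores = (w[:, None, :] * probs).sum(axis=0)  # shape (1 tweet, 4 classes)
pred = scores.argmax(axis=1)
print(pred)  # [0]: class P wins with score 1.0*0.6 + 0.2*0.3 + 0.8*0.5 = 1.06
```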
Predictions of three classifiers for five tweets. (The table contents were lost in extraction.)
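The lost table compared the hard labels of three classifiers on five tweets, the setting for the hard-voting ("Hard w") scheme in the results below. A hard majority vote over such predictions, with the labels and the tie-breaking rule invented here for illustration, can be sketched as:

```python
from collections import Counter

# Invented hard labels from three classifiers for five tweets
# (P = positive, N = negative, NONE, NEU = neutral).
preds = [
    ["P", "P", "N"],
    ["N", "N", "N"],
    ["NONE", "NEU", "NONE"],
    ["P", "N", "NONE"],   # three-way tie
    ["NEU", "NEU", "P"],
]

def hard_vote(labels):
    """Majority label; on a full tie, fall back to the first classifier."""
    top, count = Counter(labels).most_common(1)[0]
    return labels[0] if count == 1 else top

print([hard_vote(p) for p in preds])  # ['P', 'N', 'NONE', 'P', 'NEU']
```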
Results on random folds of the TASS training set E (TA = maximum theoretical accuracy; "–" marks a value lost in extraction; the soft-weighting row was almost entirely lost, 0.6939 being its only recovered value).

| | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NB | 0.5927 | 0.6066 | – | 0.6149 | 0.6191 | 0.5775 | 0.6288 | 0.6288 | 0.5886 | 0.6282 | 0.6159 |
| LR | – | – | 0.6634 | – | 0.6329 | 0.6204 | – | – | – | 0.6324 | 0.6400 |
| SVM | – | 0.6329 | 0.6606 | 0.6260 | – | – | 0.6426 | 0.6398 | 0.6218 | – | 0.6358 |
| Hard w | 0.6163 | 0.6274 | 0.6897 | 0.6371 | 0.6315 | 0.6066 | 0.6481 | 0.6675 | 0.6218 | 0.6421 | 0.6388 |
| Soft w | – | – | – | – | – | – | – | – | – | – | – |
| TA | 0.7160 | 0.7299 | 0.7603 | 0.7229 | 0.7119 | 0.6966 | 0.7409 | 0.7368 | 0.7105 | 0.7392 | 0.7265 |
Figure 1. Accuracy of each classifier, different weighting schemes, and the maximum theoretical accuracy on each fold of the E set of the TASS corpus.
Results on stratified folds of the training set of the TASS corpus E (TA = maximum theoretical accuracy; "–" marks a value lost in extraction; the soft-weighting row was almost entirely lost, 0.6740 being its only recovered value).

| | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NB | 0.6422 | 0.6284 | 0.6127 | 0.6232 | 0.6380 | 0.6449 | 0.6102 | 0.6185 | 0.6061 | 0.6324 | 0.6256 |
| LR | – | – | – | – | – | – | – | – | 0.6213 | – | 0.6447 |
| SVM | 0.6629 | 0.6533 | 0.6210 | 0.6357 | 0.6185 | 0.6393 | 0.6296 | 0.6393 | – | – | 0.6385 |
| Hard w | 0.6712 | 0.6408 | 0.6334 | 0.6426 | 0.6407 | 0.6629 | 0.6393 | 0.6449 | 0.6213 | 0.6532 | 0.6450 |
| Soft w | – | – | – | – | – | – | – | – | – | – | – |
| TA | 0.7555 | 0.7417 | 0.7136 | 0.7313 | 0.7364 | 0.7420 | 0.7198 | 0.7323 | 0.7073 | 0.7392 | 0.7319 |
Figure 2. Accuracy of each classifier, different weighting schemes, and the maximum theoretical accuracy on each stratified fold of the E set of the TASS corpus.
Figure 3. Accuracy of Soft w:20–300 with both random and stratified folding strategies for each created fold of the E set of the TASS corpus.
Results of the experiment with the Z set of the TASS corpus. One soft-weight variant was calculated from random folds and the other from stratified folds; the subscripts distinguishing them, the method names in the table, and at least one further row were lost in extraction.

| Method | Accuracy |
|---|---|
| (name lost) | 0.6384 |
| (name lost) | 0.6712 |
| (name lost) | 0.6721 |
| (name lost) | 0.6657 |
| (name lost) | 0.6768 |
External resources used and Accuracy for TASS Task 1, 4 classes, Z corpus. Best run reported for each system.
| System | Accuracy | ElhPolar | SOCAL | iSOL | SSL | Own | DAL | LYSA | ML Senticon | Semeval 2013 | MPQA | HGI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LIF | 0.726 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||
| ELiRF | 0.725 | ✓ | ✓ | ✓ | ||||||||
| GTI-GRAD | 0.695 | ✓ | ✓ | ✓ | ✓ | |||||||
| GSI (aspect) | 0.691 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||
| DE:Soft (this work) | 0.677 |  |  |  |  |  |  |  |  |  |  |  |
| LYS | 0.664 | |||||||||||
| DLSI | 0.655 | ✓ | ||||||||||
| SINAI-DW2Vec | 0.619 | |||||||||||
| INGEOTEC | 0.613 |
Classifiers used for systems in Table 8. (LR = Logistic Regression, SVM = Support Vector Machines, ME = MaxEnt, SG = SkipGrams).
| System | Accuracy | max n-gram | NER | NLP | Negation | Word2Vec | Doc2Vec | GloVe | TF·IDF | LDA | LSI | Classifier |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LIF | 0.726 | 1 | ✓ | ✓ | ✓ | ✓ | (SVM SG Cbow)→SVM | |||||
| ELiRF | 0.725 | 1 | ✓ | ✓ | SVM (+ SVM) | |||||||
| GTI-GRAD | 0.695 | 2 | ✓ | LR | ||||||||
| GSI (aspect) | 0.691 | 1 | ✓ | ✓ | ✓ | SVM | ||||||
| DE:Soft (this work) | 0.677 | 1 |  |  |  |  |  |  |  |  |  | DE: (NB, LR, SVM) |
| LYS | 0.664 | 1 | ✓ | Logistic regression L2-LG | ||||||||
| DLSI | 0.655 | 2 | ✓ | SVM | ||||||||
| SINAI-DW2Vec | 0.619 | 1 | ✓ | ✓ | SVM | |||||||
| INGEOTEC | 0.613 | 5 | ✓ | ✓ | ✓ | SVM |
Results of our proposed method with the InterTASS ES corpus, as compared with top results [61].
| System | M. F1 | Acc. | System | M. F1 | Acc. |
|---|---|---|---|---|---|
| elirf-es-run-1 | 0.503 | 0.612 | atalaya-lr-50-2-roc | 0.455 | 0.595 |
| retuyt-lstm-es-1 | 0.499 | 0.549 | ingeotec-run1 | 0.445 | 0.530 |
| retuyt-combined-es | 0.491 | 0.602 | atalaya-svm-50-2 | 0.431 | 0.583 |
| atalaya-ubav3-100-3-syn | 0.476 | 0.544 | itainnova-cl-base | 0.383 | 0.433 |
| DE:Soft | 0.461 | 0.585 | itainnova-cl-proc1 | 0.320 | 0.395 |
| retuyt-cnn-es-1 | 0.458 | 0.592 |