Noman Ashraf, Lal Khan, Sabur Butt, Hsien-Tsung Chang, Grigori Sidorov, Alexander Gelbukh.
Abstract
Urdu is a widely used language in South Asia and worldwide. While similar datasets are available in English, we created the first multi-label emotion dataset for Urdu, consisting of 6,043 tweets in the Nastalíq script annotated with six basic emotions. A multi-label (ML) classification approach was adopted to detect emotions in Urdu text. The morphological and syntactic structure of Urdu makes multi-label emotion detection a challenging problem. In this paper, we build a set of baseline classifiers: machine learning algorithms (Random Forest (RF), Decision Tree (J48), Sequential Minimal Optimization (SMO), AdaBoostM1, and Bagging), deep learning algorithms (one-dimensional Convolutional Neural Networks (1D-CNN), Long Short-Term Memory (LSTM), and LSTM with CNN features), and a transformer-based baseline (BERT). We used a combination of text representations: stylometry-based features, pre-trained word embeddings, word-based n-grams, and character-based n-grams. The paper presents the annotation guidelines and dataset characteristics, and offers insights into the different methodologies used for Urdu emotion classification. We report our best results using micro-averaged F1, macro-averaged F1, accuracy, Hamming loss (HL), and exact match (EM) for all tested methods.
Keywords: Deep learning; Emotion classification in Urdu; Emotion detection; Machine learning; Multi-label emotion detection; Natural language processing
Year: 2022 PMID: 35494831 PMCID: PMC9044368 DOI: 10.7717/peerj-cs.896
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Comparison of state-of-the-art multi-label emotion detection datasets.
| Dataset | Size | Language | Data source | Composition |
|---|---|---|---|---|
| EmoBank | 10,000 | English | MASC | VAD (valence, arousal, dominance) |
| Affective Text | 1,250 | English | News websites | Ekman's emotions + valence indication (positive/negative) |
| DailyDialog | 13,118 | English | Dialogues from human conversations | Ekman's emotions + no emotion |
| Electoral Tweets | 100,000 | English | | Plutchik's emotions + sentiment (positive/negative) |
| EmoInt | 7,097 | English | | Intensities of sadness, fear, anger, and joy |
| Emotion Stimulus | 2,414 | English | FrameNet's annotated data | Ekman's emotions and shame |
| Grounded Emotions | 2,557 | English | | Emotional state (happy or sad) + five types of external factors: user predisposition, weather, social network, news exposure, and timing |
| Fb-Valence-Arousal | 2,895 | English | | Valence (sentiment) + arousal (intensity) |
| Stance Sentiment Emotion | 4,868 | English | | Plutchik's emotions |
Figure 1: Multi-label emotion detection model for the Urdu language.
Figure 2: Examples from our dataset (translated by Google).
Distribution of emotions in the dataset.
| Emotions | Train | Test |
|---|---|---|
| Anger | 833 | 191 |
| Disgust | 756 | 203 |
| Fear | 594 | 184 |
| Sadness | 2,206 | 560 |
| Surprise | 1,572 | 382 |
| Happiness | 1,040 | 278 |
Statistics based on the train and test dataset.
| Dataset | Tweets | Words | Avg. words/tweet | Characters | Avg. chars/tweet | Vocabulary size |
|---|---|---|---|---|---|---|
| All | 6,043 | 44,525 | 9.24 | 224,806 | 46.65 | 14,101 |
| Train | 4,818 | 44,525 | 9.24 | 224,806 | 46.65 | 9,840 |
| Test | 1,225 | 11,425 | 9.32 | 57,658 | 47.06 | 4,261 |
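For readers who want to reproduce the corpus statistics above, the following is a small sketch (an assumption, not the authors' script) that derives them from a list of whitespace-tokenized tweets; the toy input is illustrative only.

```python
# Assumed helper for deriving the dataset statistics table; not the authors' code.
def corpus_stats(tweets):
    words = [w for t in tweets for w in t.split()]          # whitespace tokenization
    chars = sum(len(t.replace(" ", "")) for t in tweets)     # characters excluding spaces
    return {
        "tweets": len(tweets),
        "words": len(words),
        "avg_words_per_tweet": round(len(words) / len(tweets), 2),
        "chars": chars,
        "avg_chars_per_tweet": round(chars / len(tweets), 2),
        "vocab_size": len(set(words)),
    }

print(corpus_stats(["یہ ایک مثال ہے", "ایک اور مثال"]))  # two toy Urdu tweets
```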
Figure 3: 1D-CNN model architecture.
Figure 4: LSTM model architecture.
Deep learning parameters for 1D-CNN and LSTM.
| Parameter | 1D-CNN | LSTM |
|---|---|---|
| Epochs | 100 | 150 |
| Optimizer | Adam | Adam |
| Loss | Categorical cross-entropy | Categorical cross-entropy |
| Learning Rate | 0.001 | 0.0001 |
| Regularization | 0.01 | – |
| Bias Regularization | 0.01 | – |
| Validation Split | 0.1 | 0.1 |
| Hidden Layer 1 Dimension | 16 | 16 |
| Hidden Layer 1 Activation | tanh | tanh |
| Hidden Layer 1 Dropout | 0.2 | – |
| Hidden Layer 2 Dimension | 32 | 32 |
| Hidden Layer 2 Activation | tanh | tanh |
| Hidden Layer 2 Dropout | 0.2 | – |
| Hidden Layer 3 Dimension | – | 64 |
| Hidden Layer 3 Activation | – | tanh |
| Hidden Layer 3 Dropout | – | – |
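As a concrete reading of the 1D-CNN column, the Keras sketch below wires the tabled hyperparameters together (Adam, learning rate 0.001, L2 regularization 0.01 on weights and biases, two tanh layers of 16 and 32 units with 0.2 dropout, 0.1 validation split, categorical cross-entropy loss). The input length, kernel size, pooling, and sigmoid output layer are assumptions not stated in this record.

```python
# Minimal sketch of the 1D-CNN configuration from the table above (assumed layout).
from tensorflow.keras import layers, models, optimizers, regularizers

MAX_LEN, EMB_DIM, NUM_EMOTIONS = 50, 300, 6   # assumed input shape (fastText 300-d vectors)

model = models.Sequential([
    layers.Input(shape=(MAX_LEN, EMB_DIM)),
    layers.Conv1D(16, kernel_size=3, activation="tanh",            # Hidden Layer 1: dim 16, tanh
                  kernel_regularizer=regularizers.l2(0.01),
                  bias_regularizer=regularizers.l2(0.01)),
    layers.Dropout(0.2),
    layers.Conv1D(32, kernel_size=3, activation="tanh",            # Hidden Layer 2: dim 32, tanh
                  kernel_regularizer=regularizers.l2(0.01),
                  bias_regularizer=regularizers.l2(0.01)),
    layers.Dropout(0.2),
    layers.GlobalMaxPooling1D(),
    layers.Dense(NUM_EMOTIONS, activation="sigmoid"),               # one output per emotion (assumed)
])

model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy",                      # loss as listed in the table
              metrics=["accuracy"])

# Training call corresponding to the tabled epochs and validation split:
# model.fit(X_train, y_train, epochs=100, validation_split=0.1)
```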
Best results for multi-label emotion detection using word n-gram features.
| Features | MLC | SLC | Acc. | EM | HL | Micro-F1 | Macro-F1 |
|---|---|---|---|---|---|---|---|
| Word N–gram | |||||||
| Word 1-gram | BR | RF | 51.20 | 32.30 | 19.40 | 60.20 | 56.10 |
| Word 2-gram | LC | SMO | 43.60 | 30.30 | 21.70 | 50.20 | 47.50 |
| Word 3-gram | BR | RF | 39.90 | 16.60 | 28.40 | 50.00 | 48.10 |
| Combination of Word N–gram | |||||||
| Word 1–3-gram | BR | AdaBoostM1 | 35.10 | 14.90 | 30.10 | 44.50 | 42.60 |
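The best word 1-gram row pairs the binary-relevance (BR) problem transformation with a Random Forest base learner. The sketch below reproduces that setup with scikit-learn's OneVsRestClassifier (one binary classifier per emotion) as an assumed stand-in for the authors' tooling; the toy tweets, label matrix, and unigram vectorizer are illustrative only, and the metrics mirror the table's columns (accuracy, EM, HL, micro-/macro-F1).

```python
# Assumed binary-relevance + Random Forest baseline over word unigrams.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import accuracy_score, f1_score, hamming_loss, jaccard_score

train_tweets = ["غصہ اور نفرت", "خوشی اور حیرت"]         # toy Urdu tweets (placeholders)
y_train = [[1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]]        # anger, disgust, fear, sadness, surprise, happiness
test_tweets = ["غصہ"]
y_test = [[1, 0, 0, 0, 0, 0]]

vectorizer = CountVectorizer(ngram_range=(1, 1))           # word unigrams
clf = OneVsRestClassifier(RandomForestClassifier(n_estimators=100))  # BR: one RF per emotion
clf.fit(vectorizer.fit_transform(train_tweets), y_train)
y_pred = clf.predict(vectorizer.transform(test_tweets))

print("EM (exact match):", accuracy_score(y_test, y_pred))
print("Accuracy (Jaccard):", jaccard_score(y_test, y_pred, average="samples"))
print("Hamming loss:", hamming_loss(y_test, y_pred))
print("Micro-F1:", f1_score(y_test, y_pred, average="micro", zero_division=0))
print("Macro-F1:", f1_score(y_test, y_pred, average="macro", zero_division=0))
```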
Best results for multi-label emotion detection using char n-gram features.
| Features | MLC | SLC | Acc. | EM | HL | Micro-F1 | Macro-F1 |
|---|---|---|---|---|---|---|---|
| Character N-gram | |||||||
| Char 3-gram | BR | RF | 47.20 | 28.20 | 21.10 | 56.60 | 52.70 |
| Char 4-gram | BR | Bagging | 38.60 | 21.70 | 25.60 | 47.30 | 44.60 |
| Char 5-gram | BR | Bagging | 38.30 | 16.50 | 28.80 | 47.90 | 46.30 |
| Char 6-gram | BR | Bagging | 37.80 | 16.90 | 29.30 | 46.30 | 45.50 |
| Char 7-gram | BR | RF | 36.10 | 15.50 | 31.00 | 44.70 | 43.80 |
| Char 8-gram | BR | RF | 34.80 | 11.80 | 31.50 | 45.30 | 43.50 |
| Char 9-gram | BR | RF | 34.80 | 11.80 | 31.50 | 45.10 | 43.40 |
| Combination of Character N-gram | |||||||
| Char 3–9 | LC | RF | 33.60 | 32.90 | 12.10 | 32.30 | 33.90 |
Best results for multi-label emotion detection using stylometry-based features.
| Features | MLC | SLC | Acc. | EM | HL | Micro-F1 | Macro-F1 |
|---|---|---|---|---|---|---|---|
| Character-based | BR | DT | 33.70 | 10.7 | 31.90 | 44.40 | 42.40 |
| Word-based | BR | AdaBoostM1 | 35.10 | 14.90 | 30.10 | 44.50 | 42.60 |
| Vocabulary richness | BR | AdaBoostM1 | 34.10 | 11.80 | 31.10 | 44.50 | 42.50 |
| All features | BR | AdaBoostM1 | 35.00 | 14.90 | 30.00 | 44.50 | 42.50 |
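The record names three stylometric feature groups (character-based, word-based, and vocabulary richness) without listing the individual features. The function below is an assumed illustration of what such features could look like, not the paper's definitive feature set.

```python
# Hypothetical stylometric features for a single tweet (illustrative only).
def stylometric_features(tweet: str) -> dict:
    words = tweet.split()
    chars = tweet.replace(" ", "")
    return {
        # character-based
        "char_count": len(chars),
        "digit_ratio": sum(c.isdigit() for c in chars) / max(len(chars), 1),
        # word-based
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        # vocabulary richness (type-token ratio)
        "type_token_ratio": len(set(words)) / max(len(words), 1),
    }

print(stylometric_features("یہ ایک مثال ہے"))  # toy Urdu tweet
```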
Best results for multi-label emotion detection using pre-trained word embedding features.
| Model | Features (dim) | Acc. | EM | HL | Micro-F1 | Macro-F1 |
|---|---|---|---|---|---|---|
| 1D CNN | fastText (300) | 45.00 | 42.00 | 36.00 | 35.00 | 54.00 |
| LSTM | fastText (300) | 44.00 | 42.00 | 35.00 | 32.00 | 55.00 |
Best results for multi-label emotion detection using contextual pre-trained word embedding features.
| Model | Features (dim) | Acc. | EM | HL | Micro-F1 | Macro-F1 |
|---|---|---|---|---|---|---|
| LSTM | fastText (300), 1D CNN (16) | 46.00 | 35.00 | 36.00 | 34.00 | 53.00 |
| BERT | BERT Contextual Embeddings (768) | 15.00 | 44.00 | 57.00 | 54.00 | 37.00 |
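For the BERT baseline with 768-dimensional contextual embeddings, a standard multi-label fine-tuning setup looks like the sketch below. The checkpoint name and the 0.5 decision threshold are assumptions; this record does not state which BERT variant or threshold the authors used.

```python
# Assumed multi-label BERT setup: sigmoid head over six emotions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-multilingual-cased"   # assumed multilingual checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=6,
    problem_type="multi_label_classification",  # BCE-with-logits loss, one sigmoid per emotion
)

batch = tokenizer(["<Urdu tweet here>"], return_tensors="pt",
                  padding=True, truncation=True)
with torch.no_grad():
    logits = model(**batch).logits
predicted = (torch.sigmoid(logits) > 0.5).int()  # independent yes/no decision per emotion
```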
Comparison of state-of-the-art results in multi-label emotion detection.
| Reference | Model | Features | Accuracy | Micro-F1 | Macro-F1 | HL |
|---|---|---|---|---|---|---|
| | RF | n-gram | 45.20 | 57.30 | 55.90 | 17.90 |
| | MMS2S | – | 47.50 | – | 56.00 | 18.30 |
| | C-GRU | AraVec, word2vec | 53.20 | 49.50 | 64.80 | – |
| | MESGN | – | 49.40 | – | 56.10 | 18.00 |
| Proposed | 1D CNN | fastText | 45.00 | 35.00 | 54.00 | 36.00 |
| Proposed | RF | word unigram | 51.20 | 60.20 | 56.10 | 19.40 |