| Literature DB >> 35632034 |
Syed Zohaib Hassan, Kashif Ahmad, Steven Hicks, Pål Halvorsen, Ala Al-Fuqaha, Nicola Conci, Michael Riegler.
Abstract
The increasing popularity of social networks and users' tendency towards sharing their feelings, expressions, and opinions in text, visual, and audio content have opened new opportunities and challenges in sentiment analysis. While sentiment analysis of text streams has been widely explored in the literature, sentiment analysis from images and videos is relatively new. This article focuses on visual sentiment analysis in a societally important domain, namely disaster analysis in social media. To this aim, we propose a deep visual sentiment analyzer for disaster-related images, covering different aspects of visual sentiment analysis starting from data collection, annotation, model selection, implementation, and evaluations. For data annotation and analyzing people's sentiments towards natural disasters and associated images in social media, a crowd-sourcing study has been conducted with a large number of participants worldwide. The crowd-sourcing study resulted in a large-scale benchmark dataset with four different sets of annotations, each aiming at a separate task. The presented analysis and the associated dataset, which is made public, will provide a baseline/benchmark for future research in the domain. We believe the proposed system can contribute toward more livable communities by helping different stakeholders, such as news broadcasters, humanitarian organizations, as well as the general public.
Keywords: deep learning; emotions; multimedia retrieval; natural disasters; sentiment analysis
Year: 2022 PMID: 35632034 PMCID: PMC9146152 DOI: 10.3390/s22103628
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Sample images of natural disasters for sentiment analysis, showing the diversity in the content and information to be extracted.
A summary and comparative analysis of existing works on the topic.
| Refs. | Dataset | Application | Features/Model | Main Focus |
|---|---|---|---|---|
| [ ] | CMUMOSEI [ ] | Generic | Color and texture features | Relies on features based on psychology and art theory |
| [ ] | SmileyNet [ ] | Emojis | CNNs | Mainly focuses on detection and classification of emojis |
| [ ] | Twitter dataset [ ] | Generic | CNNs | Proposes a residual attention-based deep learning network |
| [ ] | Flickr dataset [ ] | Generic | CNNs, handcrafted features | Jointly utilizes text (objective text descriptions of images) and visual features |
| [ ] | Self-collected dataset | Generic | CNNs, LSTM | Proposes an attention-based network |
| [ ] | VSO [ ] | Generic | CNNs | Explores the role of local image regions in visual sentiment analysis |
| [ ] | Twitter [ ] | Generic | CNNs | Proposes a multi-level context pyramid network |
| [ ] | Emotion ROI | Generic | CNNs and GCN | Proposes a Graph Convolutional Network (GCN)-based model |
| This Work | Self-collected | Natural disasters | CNNs | Explores a new application of visual sentiment analysis |
Figure 2. Block diagram of the proposed visual sentiment analysis processing pipeline.
List of tags used in the crowd-sourcing study in the four sets.
| Sets | Tags |
|---|---|
| Set 1 | Positive, Negative, Neutral |
| Set 2 | Relax, Stimulated, Normal |
| Set 3 | Joy, Sadness, Fear, Disgust, Anger, Surprise, and Neutral |
| Set 4 | Anger, Anxiety, Craving, Empathetic Pain, Fear, Horror, Joy, Relief, Sadness, and Surprise |
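For downstream processing, the four tag sets above can be captured in a small lookup structure. This is an illustrative sketch; the keys `set1`…`set4` and the helper `validate_tags` are my labels, not identifiers from the paper:

```python
# The four annotation tag sets from the crowd-sourcing study.
# Keys "set1".."set4" are illustrative labels, not names from the paper.
TAG_SETS = {
    "set1": ("Positive", "Negative", "Neutral"),
    "set2": ("Relax", "Stimulated", "Normal"),
    "set3": ("Joy", "Sadness", "Fear", "Disgust", "Anger", "Surprise", "Neutral"),
    "set4": ("Anger", "Anxiety", "Craving", "Empathetic Pain", "Fear",
             "Horror", "Joy", "Relief", "Sadness", "Surprise"),
}

def validate_tags(set_name, tags):
    """Return any submitted tags that do not belong to the chosen set."""
    allowed = set(TAG_SETS[set_name])
    return [t for t in tags if t not in allowed]

unknown = validate_tags("set3", ["Joy", "Fear", "Boredom"])  # ["Boredom"]
```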
Figure 3. Statistics of the crowd-sourcing study participants’ demographics. (a) Percentage of participants from different regions of the world. (b) Percentage of responses received from participants belonging to different regions of the world.
Figure 4. An illustration of the web application used for the crowd-sourcing study. A disaster-related image is shown to users, who are asked to select from the provided options/tags. If needed, additional tags/comments can also be reported.
Statistics of the dataset for task 1.
| Tags | # Samples |
|---|---|
| Positive | 803 |
| Negative | 2297 |
| Neutral | 579 |
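The task-1 counts above are clearly imbalanced (2297 Negative samples vs. 579 Neutral). One common remedy for training on such data, not necessarily the one used in the paper, is to weight the loss by inverse class frequency; a minimal sketch:

```python
# Task-1 sample counts from the table above.
counts = {"Positive": 803, "Negative": 2297, "Neutral": 579}

def inverse_frequency_weights(counts):
    """Weight each class by total / (n_classes * count), so rarer
    classes receive proportionally larger loss weights."""
    total = sum(counts.values())
    n = len(counts)
    return {c: total / (n * k) for c, k in counts.items()}

weights = inverse_frequency_weights(counts)
# Neutral (rarest) gets the largest weight, Negative the smallest.
assert weights["Neutral"] > weights["Positive"] > weights["Negative"]
```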
Statistics of the dataset for task 2.
| Tags | # Samples | Tags | # Samples |
|---|---|---|---|
| Joy | 1207 | Sadness | 3336 |
| Fear | 2797 | Disgust | 1428 |
| Anger | 1419 | Surprise | 2233 |
| Neutral | 1892 | - | - |
Statistics of the dataset for task 3.
| Tags | # Samples | Tags | # Samples |
|---|---|---|---|
| Anger | 2108 | Anxiety | 2716 |
| Craving | 1100 | Pain | 2544 |
| Fear | 2803 | Horror | 2042 |
| Joy | 1181 | Relief | 1356 |
| Sadness | 3300 | Surprise | 1975 |
Figure 5. Statistics of the fifth question of the crowd-sourcing study, showing what kind of information in the images influences users’ emotions most.
Figure 6. Statistics of the first four questions of the crowd-sourcing study. (a) Responses to the first question: tags 1 to 4 represent negative sentiments, tag 5 represents neutral, and tags 6 to 9 represent positive sentiments. (b) Responses to the second question: tags 1 to 4 represent a calm/relaxed emotion, tag 5 a normal condition, and tags 6 to 9 an excited/stimulated state. (c) Responses to the third question. (d) Responses to the fourth question.
Figure 7. Statistics of the crowd-sourcing study in terms of how different tags are jointly associated with the images. (a) Tags jointly used in pairs. (b) Tags jointly used in groups of three.
Evaluation of the proposed visual sentiment analyzer on task 1 (i.e., single-label classification of three classes, namely negative, neutral, and positive).
| Model | Accuracy | Precision | Recall | F-Score |
|---|---|---|---|---|
| VGGNet (ImageNet) | 92.12 | 88.64 | 87.63 | 87.89 |
| VGGNet (Places) | 92.88 | 89.92 | 88.43 | 89.07 |
| Inception-v3 (ImageNet) | 82.59 | 76.38 | 68.81 | 71.60 |
| ResNet-50 (ImageNet) | 90.61 | 86.32 | 85.18 | 85.63 |
| ResNet-101 (ImageNet) | 90.90 | 86.79 | 85.84 | 86.01 |
| DenseNet (ImageNet) | 85.77 | 79.39 | 78.53 | 78.20 |
| EfficientNet (ImageNet) | 91.31 | 87.00 | 86.94 | 86.70 |
| VGGNet (Places + ImageNet) | 92.83 | 89.67 | 88.65 | 88.97 |
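The precision, recall, and F-score columns in these tables are averages over the classes. A minimal, framework-free sketch of how per-class scores and their macro-average can be computed for a single-label task (the toy labels below are illustrative, not the paper's data):

```python
def per_class_prf(y_true, y_pred, classes):
    """Per-class precision, recall, F1 from parallel label lists."""
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[c] = (prec, rec, f1)
    return scores

def macro_average(scores):
    """Unweighted mean of (precision, recall, F1) over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))

# Toy example (not the paper's data):
y_true = ["negative", "negative", "neutral", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "positive", "negative"]
scores = per_class_prf(y_true, y_pred, ["negative", "neutral", "positive"])
```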
Evaluation of the proposed visual sentiment analyzer on task 2 (i.e., multi-label classification of seven classes, namely sadness, fear, disgust, joy, anger, surprise, and neutral).
| Model | Accuracy | Precision | Recall | F-Score |
|---|---|---|---|---|
| VGGNet (ImageNet) | 82.61 | 84.12 | 80.28 | 81.66 |
| VGGNet (Places) | 82.94 | 82.87 | 82.30 | 82.28 |
| Inception-v3 (ImageNet) | 80.67 | 80.98 | 82.98 | 80.72 |
| ResNet-50 (ImageNet) | 82.48 | 84.33 | 79.41 | 81.38 |
| ResNet-101 (ImageNet) | 82.70 | 82.92 | 82.04 | 82.20 |
| DenseNet (ImageNet) | 81.99 | 83.43 | 81.30 | 81.51 |
| EfficientNet (ImageNet) | 82.08 | 82.80 | 81.31 | 81.51 |
| VGGNet (Places + ImageNet) | 83.18 | 83.13 | 83.04 | 82.57 |
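Since tasks 2 and 3 are multi-label, each image can carry several tags at once. A natural evaluation setup, assumed here rather than taken from the record, thresholds per-label sigmoid outputs at 0.5 and scores each label column independently:

```python
def binarize(probabilities, threshold=0.5):
    """Turn per-label sigmoid outputs into 0/1 predictions."""
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]

def label_wise_prf(y_true, y_pred):
    """Precision/recall/F1 per label column for multi-label data."""
    n_labels = len(y_true[0])
    out = []
    for j in range(n_labels):
        tp = sum(t[j] and p[j] for t, p in zip(y_true, y_pred))
        fp = sum((not t[j]) and p[j] for t, p in zip(y_true, y_pred))
        fn = sum(t[j] and (not p[j]) for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out.append((prec, rec, f1))
    return out

# Toy example with three labels (illustrative values only):
probs = [[0.9, 0.2, 0.7], [0.4, 0.8, 0.6]]
pred = binarize(probs)  # [[1, 0, 1], [0, 1, 1]]
true = [[1, 0, 1], [1, 1, 0]]
stats = label_wise_prf(true, pred)
```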
Evaluation of the proposed visual sentiment analyzer on task 3 (i.e., multi-label classification of ten classes, namely anger, anxiety, craving, empathetic pain, fear, horror, joy, relief, sadness, and surprise).
| Model | Accuracy | Precision | Recall | F-Score |
|---|---|---|---|---|
| VGGNet (ImageNet) | 82.74 | 80.43 | 85.61 | 82.14 |
| VGGNet (Places) | 81.55 | 79.26 | 85.08 | 81.16 |
| Inception-v3 (ImageNet) | 81.53 | 78.21 | 89.30 | 82.27 |
| ResNet-50 (ImageNet) | 82.30 | 79.90 | 84.18 | 81.60 |
| ResNet-101 (ImageNet) | 82.56 | 80.25 | 84.51 | 81.80 |
| DenseNet (ImageNet) | 81.72 | 79.40 | 85.35 | 81.63 |
| EfficientNet (ImageNet) | 82.25 | 80.83 | 82.70 | 81.39 |
| VGGNet (Places + ImageNet) | 82.08 | 79.36 | 87.25 | 81.99 |
Experimental results of the proposed visual sentiment analyzer on task 1 in terms of accuracy, precision, recall, and F1-score per class. P represents the version of the model pre-trained on the Places dataset while the rest are pre-trained on the ImageNet dataset.
| Model | Metric | Negative | Neutral | Positive |
|---|---|---|---|---|
| VGGNet | Accuracy | 88.61 | 95.36 | 91.66 |
| | Precision | 88.45 | 93.20 | 84.56 |
| | Recall | 74.59 | 93.29 | 91.83 |
| | F1-score | 80.85 | 93.22 | 88.04 |
| VGGNet (P) | Accuracy | 90.07 | 94.88 | 93.21 |
| | Precision | 88.63 | 91.13 | 89.87 |
| | Recall | 79.52 | 94.21 | 89.88 |
| | F1-score | 83.79 | 92.64 | 89.85 |
| Inception-v3 | Accuracy | 76.48 | 86.51 | 82.28 |
| | Precision | 70.64 | 79.34 | 78.25 |
| | Recall | 45.76 | 82.51 | 66.86 |
| | F1-score | 55.46 | 80.85 | 71.41 |
| ResNet-50 | Accuracy | 86.95 | 92.22 | 92.07 |
| | Precision | 83.40 | 87.15 | 88.14 |
| | Recall | 74.51 | 90.68 | 88.29 |
| | F1-score | 78.65 | 88.86 | 88.17 |
| ResNet-101 | Accuracy | 87.16 | 92.31 | 92.29 |
| | Precision | 86.57 | 86.07 | 87.99 |
| | Recall | 71.38 | 92.80 | 89.15 |
| | F1-score | 78.11 | 89.25 | 88.54 |
| DenseNet | Accuracy | 80.59 | 87.84 | 87.72 |
| | Precision | 76.98 | 80.33 | 83.04 |
| | Recall | 60.16 | 87.01 | 79.54 |
| | F1-score | 66.15 | 83.10 | 81.18 |
| EfficientNet | Accuracy | 87.50 | 93.91 | 91.66 |
| | Precision | 86.41 | 93.91 | 84.87 |
| | Recall | 72.87 | 92.58 | 91.68 |
| | F1-score | 78.96 | 91.24 | 88.07 |
| VGGNet (P+I) | Accuracy | 89.94 | 94.90 | 92.99 |
| | Precision | 88.99 | 90.62 | 89.62 |
| | Recall | 78.44 | 95.17 | 89.58 |
| | F1-score | 83.15 | 92.81 | 89.58 |
Experimental results of the proposed visual sentiment analyzer on task 2 in terms of accuracy, precision, recall, and F1-score per class. P represents the version of the model pre-trained on the Places dataset while the rest are pre-trained on the ImageNet dataset.
| Model | Metric | Joy | Sadness | Fear | Disgust | Anger | Surprise | Neutral |
|---|---|---|---|---|---|---|---|---|
| VGGNet | Accuracy | 83.37 | 95.32 | 88.24 | 76.67 | 76.86 | 75.29 | 75.78 |
| | Precision | 92.17 | 92.46 | 85.09 | 76.78 | 82.13 | 76.96 | 80.31 |
| | Recall | 76.78 | 99.12 | 94.83 | 60.63 | 56.71 | 77.22 | 73.32 |
| | F1-score | 83.77 | 95.67 | 89.68 | 67.68 | 66.99 | 77.07 | 76.35 |
| VGGNet (P) | Accuracy | 84.59 | 95.67 | 88.86 | 76.07 | 77.43 | 75.99 | 77.21 |
| | Precision | 92.44 | 93.47 | 85.47 | 71.19 | 76.23 | 75.21 | 81.58 |
| | Recall | 78.65 | 98.60 | 95.46 | 67.15 | 62.18 | 82.27 | 75.64 |
| | F1-score | 84.99 | 95.97 | 90.19 | 68.78 | 68.33 | 78.58 | 78.43 |
| Inception-v3 | Accuracy | 79.81 | 90.51 | 85.40 | 76.26 | 75.51 | 76.21 | 75.51 |
| | Precision | 89.81 | 86.77 | 81.19 | 86.36 | 86.12 | 71.72 | 75.66 |
| | Recall | 72.09 | 96.53 | 94.88 | 49.30 | 49.87 | 92.06 | 80.48 |
| | F1-score | 79.94 | 91.39 | 87.50 | 62.57 | 62.64 | 80.62 | 77.84 |
| ResNet-50 | Accuracy | 85.59 | 95.03 | 87.97 | 75.16 | 77.64 | 73.75 | 75.72 |
| | Precision | 94.16 | 92.71 | 86.18 | 73.83 | 81.89 | 79.31 | 78.49 |
| | Recall | 79.15 | 98.19 | 92.52 | 61.23 | 59.70 | 69.63 | 75.59 |
| | F1-score | 85.99 | 95.37 | 89.22 | 66.43 | 68.83 | 73.92 | 76.91 |
| ResNet-101 | Accuracy | 85.10 | 95.30 | 88.38 | 76.13 | 76.10 | 75.43 | 77.24 |
| | Precision | 88.15 | 93.59 | 86.84 | 76.67 | 76.90 | 74.76 | 79.28 |
| | Recall | 84.86 | 97.67 | 92.42 | 58.95 | 61.13 | 82.00 | 77.96 |
| | F1-score | 86.42 | 95.59 | 89.54 | 66.49 | 67.94 | 78.17 | 78.60 |
| DenseNet | Accuracy | 83.81 | 93.51 | 87.32 | 76.24 | 76.48 | 75.78 | 75.43 |
| | Precision | 91.41 | 91.79 | 85.50 | 81.24 | 87.74 | 73.15 | 77.47 |
| | Recall | 78.47 | 96.16 | 92.07 | 53.72 | 50.52 | 86.83 | 76.85 |
| | F1-score | 84.41 | 93.92 | 88.66 | 64.52 | 64.02 | 79.38 | 76.94 |
| EfficientNet | Accuracy | 84.40 | 94.84 | 88.38 | 75.70 | 75.78 | 74.83 | 75.67 |
| | Precision | 91.44 | 93.04 | 86.16 | 75.49 | 78.24 | 74.24 | 80.47 |
| | Recall | 79.73 | 97.41 | 93.52 | 63.65 | 59.57 | 81.50 | 72.67 |
| | F1-score | 85.09 | 95.16 | 89.63 | 67.57 | 66.82 | 77.65 | 76.13 |
| VGGNet (P+I) | Accuracy | 83.09 | 95.62 | 89.11 | 77.05 | 77.72 | 77.18 | 77.53 |
| | Precision | 95.89 | 93.30 | 84.66 | 73.57 | 76.36 | 74.31 | 82.91 |
| | Recall | 72.66 | 98.71 | 97.33 | 65.50 | 63.15 | 87.78 | 74.46 |
| | F1-score | 82.65 | 95.93 | 90.55 | 69.24 | 68.91 | 80.45 | 78.41 |
Experimental results of the proposed visual sentiment analyzer on task 3 in terms of accuracy, precision, recall, and F1-score per class.
| Model | Metric | Anger | Anxiety | Craving | Pain | Fear | Horror | Joy | Relief | Sadness | Surprise |
|---|---|---|---|---|---|---|---|---|---|---|---|
| VGGNet | Accuracy | 73.87 | 86.16 | 80.73 | 82.29 | 87.52 | 79.12 | 84.21 | 81.23 | 95.22 | 70.70 |
| | Precision | 63.52 | 82.04 | 61.14 | 76.15 | 81.46 | 67.87 | 95.11 | 92.04 | 92.28 | 79.50 |
| | Recall | 80.28 | 95.39 | 28.40 | 93.73 | 97.88 | 83.85 | 75.09 | 71.63 | 99.59 | 68.24 |
| | F1-score | 70.83 | 88.20 | 38.65 | 83.99 | 88.91 | 75.00 | 83.88 | 80.56 | 95.80 | 72.60 |
| VGGNet (P) | Accuracy | 74.81 | 83.76 | 79.57 | 80.34 | 86.38 | 78.23 | 82.12 | 77.31 | 95.16 | 72.81 |
| | Precision | 64.38 | 77.37 | 60.14 | 73.74 | 80.73 | 69.92 | 95.22 | 93.12 | 92.37 | 75.15 |
| | Recall | 82.81 | 97.11 | 29.77 | 92.17 | 96.83 | 77.25 | 72.07 | 65.11 | 99.39 | 78.35 |
| | F1-score | 72.39 | 86.12 | 39.52 | 81.90 | 88.04 | 73.37 | 82.00 | 76.63 | 95.75 | 76.61 |
| Inception-v3 | Accuracy | 75.76 | 85.29 | 81.40 | 80.96 | 86.24 | 75.70 | 82.43 | 79.46 | 94.36 | 73.03 |
| | Precision | 63.38 | 80.14 | 91.79 | 73.47 | 80.12 | 62.47 | 94.86 | 92.68 | 91.84 | 73.18 |
| | Recall | 92.00 | 96.93 | 14.66 | 96.47 | 97.23 | 87.80 | 71.99 | 67.74 | 98.42 | 84.89 |
| | F1-score | 75.04 | 87.73 | 25.24 | 83.41 | 87.83 | 72.95 | 81.79 | 78.18 | 95.02 | 78.47 |
| ResNet-50 | Accuracy | 72.81 | 85.38 | 79.93 | 81.21 | 86.79 | 79.12 | 85.10 | 81.79 | 94.72 | 71.48 |
| | Precision | 63.91 | 82.12 | 55.65 | 76.23 | 82.89 | 69.13 | 90.09 | 89.59 | 92.78 | 75.17 |
| | Recall | 71.93 | 93.39 | 32.81 | 90.31 | 93.53 | 79.91 | 81.94 | 75.27 | 97.97 | 76.37 |
| | F1-score | 67.41 | 87.39 | 41.13 | 82.67 | 87.87 | 74.08 | 85.77 | 81.78 | 95.30 | 75.59 |
| ResNet-101 | Accuracy | 73.09 | 85.40 | 79.84 | 82.71 | 87.46 | 78.62 | 85.29 | 80.87 | 94.72 | 72.31 |
| | Precision | 63.08 | 81.02 | 55.52 | 76.32 | 83.32 | 70.46 | 93.01 | 90.90 | 92.54 | 76.84 |
| | Recall | 77.40 | 95.49 | 32.30 | 94.51 | 94.40 | 74.11 | 79.30 | 72.04 | 98.27 | 75.08 |
| | F1-score | 69.49 | 87.66 | 40.72 | 84.44 | 88.50 | 72.14 | 85.52 | 80.36 | 95.32 | 75.90 |
| DenseNet | Accuracy | 73.31 | 84.88 | 80.73 | 81.10 | 87.21 | 77.95 | 82.60 | 81.26 | 93.41 | 72.17 |
| | Precision | 63.80 | 81.41 | 67.75 | 74.75 | 83.03 | 66.11 | 90.24 | 89.92 | 92.33 | 74.64 |
| | Recall | 75.51 | 93.50 | 19.33 | 93.50 | 94.29 | 84.60 | 76.76 | 73.84 | 95.93 | 79.10 |
| | F1-score | 67.20 | 87.92 | 38.81 | 84.44 | 88.22 | 75.20 | 83.16 | 80.12 | 95.37 | 77.37 |
| EfficientNet | Accuracy | 74.46 | 86.34 | 81.40 | 83.18 | 88.12 | 77.62 | 82.72 | 78.24 | 94.91 | 72.38 |
| | Precision | 62.07 | 82.16 | 65.22 | 76.69 | 84.58 | 71.05 | 91.86 | 91.26 | 92.43 | 79.86 |
| | Recall | 79.08 | 95.43 | 31.80 | 95.71 | 94.55 | 70.48 | 75.96 | 68.16 | 98.56 | 65.82 |
| | F1-score | 69.06 | 87.03 | 30.02 | 83.07 | 88.28 | 74.09 | 82.85 | 81.05 | 94.08 | 76.72 |
| VGGNet (P+I) | Accuracy | 75.90 | 84.24 | 79.96 | 80.43 | 87.24 | 78.23 | 82.35 | 78.32 | 95.47 | 73.56 |
| | Precision | 64.85 | 77.44 | 66.01 | 72.59 | 81.06 | 67.11 | 95.87 | 94.75 | 92.61 | 77.24 |
| | Recall | 86.71 | 98.23 | 24.85 | 95.57 | 98.34 | 86.27 | 71.93 | 65.69 | 99.70 | 76.10 |
| | F1-score | 74.07 | 86.60 | 35.68 | 82.50 | 88.87 | 75.49 | 82.16 | 77.59 | 96.02 | 76.55 |