Aarushi Vohra1, Ritu Garg1. 1. Department of Computer Engineering, National Institute of Technology Kurukshetra, 136119 Kurukshetra, Haryana India.
Abstract
Nowadays, we are witnessing a paradigm shift from the conventional approach of working from office spaces to the emerging culture of working virtually from home. During the COVID-19 pandemic, many organisations were forced to allow employees to work from their homes, which led to worldwide discussions of this trend on Twitter. The analysis of this data has immense potential to change the way we work, but extracting useful information from this valuable data is a challenge. Hence, in this study, the microblogging website Twitter is used to gather more than 450,000 English language tweets from 22nd January 2022 to 12th March 2022, consisting of keywords related to working from home. A state-of-the-art pre-processing technique is used to convert all emojis into text and to remove duplicate tweets, retweets, username tags, URLs, hashtags, etc., and then the text is converted to lowercase. Thus, the number of tweets is reduced to 358,823. In this paper, we propose a fine-tuned Convolutional Neural Network (CNN) model to analyse Twitter data. The input to our deep learning model is an annotated set of tweets that are labelled into three sentiment classes, viz. positive, negative, and neutral, using VADER (Valence Aware Dictionary for sEntiment Reasoning). We also use a variation in the input vector to the embedding layer, by using FastText embeddings with our model to train supervised word representations for our text corpus of more than 450,000 tweets. The proposed model uses multiple convolution and max pooling layers, a dropout operation, and dense layers with ReLU and sigmoid activations to achieve remarkable results on our dataset. Further, the performance of our model is compared with some standard classifiers like Support Vector Machine (SVM), Naive Bayes, Decision Tree, and Random Forest. From the results, it is observed that on the given dataset, the proposed CNN with FastText word embeddings outperforms other classifiers with an accuracy of 0.925969.
As a result of this classification, 54.41% of the tweets are found to show affirmation, 24.50% show a negative disposition, and 21.09% have neutral sentiments towards working from home.
An argument that is gaining popularity nowadays revolves around how employees will perceive their work environment in the times to come: whether they will prefer to work at their office desks in a group setting or virtually from the comfort of their homes. Since the inception of the COVID-19 pandemic, the general outlook of people on working from home has changed. Work from home is the process of carrying out office-established work from the premises of home, using internet services (Islam, 2022; Tønnessen et al., 2021). The COVID-19 pandemic has forced organisations to adopt flexible professional engagements in the form of virtual working environments. There has been a remarkable growth in the use of digital technology for telework (Tønnessen et al., 2021). Figure 1 shows a graph obtained from Google Trends demonstrating a relative search interest in this regard.
Fig. 1
Relative search interest of ‘Work from Home’ from January 2021 to February 2022 (Source: https://trends.google.com/trends)
Several studies have shown that the paradigm shift to the culture of working from home has positive impacts on employees, as it improves productivity, reduces stress, and leads to job satisfaction. A flexible schedule allows for more time with family. It has considerably reduced the time required to travel to the office (Kawakubo and Arata, 2022). Hence, it has led to a reduction in transport-related energy consumption, thereby limiting air pollution and carbon emissions (Jain et al., 2022). Work from home is also in the interest of organisations, as virtual operation has the potential to boost creativity and innovation among employees (Tønnessen et al., 2021). However, some researchers suggest that long hours of working from home are highly demanding, which may reduce productivity and disturb work-life balance. Limited in-person conversations and increased screen time may trigger anxiety. Moreover, interruptions from family members during working hours and a lack of resources at home also hinder efficient work (Islam, 2022; Tønnessen et al., 2021; Kawakubo and Arata, 2022; Prodanova and Kocarev, 2021). These factors play a significant role in framing policies for employees. The success of an organisation largely depends on the ability of its employees to perform efficiently. Thus, the orientation of employees towards working from home or office is crucial for deciding the future of work.
Due to the rise of social media users and the ubiquitous influence of social media, platforms such as Twitter, Facebook, YouTube, LinkedIn, and Instagram have emerged as the principal contributors of big data. The escalation of social media data puts forward a wide range of opportunities in Natural Language Processing (NLP).
When data from social media is analysed with different representation and modelling approaches, it gives a diverse idea of people’s perception of social disciplines. This intense data helps researchers to elucidate people’s opinions and develop novel prediction techniques for a variety of domains such as decision making, the stock market, and recommendation systems (Rani et al., 2022; Zachlod et al., 2022; Salim et al., 2022; Pathak et al., 2021).
But extracting valuable information from a huge volume of raw data and comprehending worthwhile insights is a challenge. Moreover, data storage is another problem to deal with (Rani et al., 2022). Despite these concerns, sentiment analysis is a highly prevalent research area in NLP. The textual data from social media platforms can be analysed by lexicon based and machine learning based sentiment analysis methods (Li et al., 2020). Such techniques involve the use of algorithms like SVM, Naïve Bayes, Decision Tree, Random Forest, etc. The machine learning approach can also be based on powerful Deep Neural Networks. They are found to give better results than conventional models because of their ability to detect features from a large amount of data (Joseph et al., 2022). Such a technique requires high-dimensional labelled data to train the model, and the training process may be memory- and time-consuming. The quality of these factors may lead to a rise or drop in the overall performance of the model (Basiri et al., 2021). In the text-processing domain, CNNs are popularly used with word embeddings for classification and clustering. This is because of their ability to extract features from data with the convolution operation and measure the relationships in local patterns (Li et al., 2020; Basiri et al., 2021; Liao et al., 2017).
The analysis of Twitter data has become a powerful tool to capture dynamic information related to public perception, which can serve as useful information for decision makers (De Rosis et al., 2021).
Some existing studies (De Rosis et al., 2021; Rakshitha et al., 2021; Yousefinaghani et al., 2021) on Twitter sentiment analysis are based on a lexicon-based approach. Such a technique, when combined with a statistical/machine learning approach, can lead to a more effective polarity classification (Cambria et al., 2017). García-Ordás et al. (2021) and Sasidhar et al. (2020) utilise a combination of both these techniques on a low sample size. Given these premises, we aim to address the emerging trend of working from home by analysing public opinion and notions based on a rich dataset, using a combination of lexicon and machine learning based approaches.
The dataset is acquired over a period of 50 days, from 22nd January 2022 to 12th March 2022, from Twitter, which is a popular medium of expression for various public interests. It is mostly used by intellectually aware people to discuss opinions in real-time (Ding et al., 2021; Liu and Liu, 2021). Hence, data-mining techniques are employed to extract a rich text corpus of around 450,000 English language tweets from Twitter, containing keywords like ‘wfh’, ‘work from home’ and ‘working from home’. The tweets are pre-processed and are then labelled using VADER (Hutto and Gilbert, 2014) into three sentiment classes, namely positive, negative, and neutral. This textual data is then converted into word vectors using FastText word embeddings, which serve as an input to train the proposed CNN. The comparison with standard classifiers like SVM, Naïve Bayes, Decision Tree, and Random Forest validates the performance accuracy of our model.
The principal contributions of this research are as follows:
- Collection of a rich text corpus containing around 450,000 English language tweets from Twitter.
- Performing a series of state-of-the-art pre-processing tasks such as conversion into lowercase, removal of duplicates, and handling of emojis, usernames, URLs, and hashtags.
- Proposing a fine-tuned CNN model for an effective sentiment analysis on the above-mentioned dataset.
- A performance comparison of the CNN model with various machine learning classifiers, such as SVM, Naïve Bayes, Decision Tree, and Random Forest, to evaluate the best performing classifier on our dataset.
The findings of this study may help organisations and researchers in the ideation of a novel system to carry out office work.
The remainder of this study is organised as follows. Section 2 presents a literature review. Section 3 discusses the methodology. Section 4 describes our proposed CNN model. Section 5 covers model predictions, analysis, and comparison with other machine learning classifiers. The paper is concluded in Section 6, with the final discussions and future scope.
Literature review
The extensive influx of social media data has been widely used for analysis. Previous studies have shown that this dynamic source of big data holds immense significance in a wide range of domains such as business, marketing, recommendation, politics, research, medicine and healthcare, opinion analysis, intelligence, etc. (Luo and Mu, 2022; Ansari et al., 202; Nezhad and Deihimi, 2022; Alamoodi et al., 2021; Rajalakshmi et al., 2017). A comprehensive sentiment analysis process includes the creation or extraction of textual data, storage of the text corpus into files, pre-processing the data, feature engineering and selection, and finally, applying a sentiment analysis approach (Rajalakshmi et al., 2017).
Public opinions from around the world are discussed on social media platforms like Twitter. A careful analysis of these opinions can lend a constructive understanding for the development of novel strategies. Authors of Rachunok et al. (2022) emphasise the discussion of Twitter data on a large scale. They have extracted keyword and location-based tweets and analysed their metrics. A lexicon-based approach to determine the polarity of data is widely used. In paper (Rakshitha et al., 2021), TextBlob is used to analyse the polarity of customer reviews in five Indian regional languages. Researchers in Yousefinaghani et al. (2021) have used VADER to assign polarity to tweets relating to vaccine sentiments. The study (Ding et al., 2021) gives an insight into people’s reaction during the initial weeks of the COVID-19 pandemic. In paper (Liu and Liu, 2021), a sentiment analysis is conducted on COVID-19 vaccines. These studies mainly employ a lexicon-based method for sentiment analysis. Due to the effectiveness of classifying texts by using a lexicon-based approach, we use VADER to assign labels to our dataset.
Twitter can also be used for analysis in other domains. Researchers in Neogi et al. 
(2021) focus on analysing tweets on the farmers’ protest in India using Bag of Words and TF-IDF vectorizer along with some standard classifiers for prediction. Authors of Hidayat et al. (2022) propose the use of SVM and logistic regression to classify a dataset of a few thousand tweets. Paper (Ding et al., 2021) explains public interest in autonomous vehicles using data obtained from Twitter feeds. Researchers in García-Ordás et al. (2021) have proposed a novel neural network to classify variable-length audio in real time. The Word2Vec CBOW (Continuous Bag of Words) model is used in study (Sasidhar et al., 2020) to detect emotions in Hindi-English code-mixed tweets. Authors of Fitri et al. (2019) have used Naïve Bayes, Decision Tree, and Random Forest algorithms for sentiment analysis of an Anti-LGBT campaign on Twitter. These studies use a low sample size of data for modelling. Therefore, in this study we attempt to analyse more than 350,000 unique English tweets obtained from Twitter.
The survey (Liu and Liu, 2021) reports that supervised learning approaches are widely used for sentiment analysis. SVM and Naïve Bayes classifiers have been used in paper (Pavitha et al., 2022) for sentiment analysis. Deep neural networks are also used for sentiment analysis tasks (Joseph et al., 2022). Authors of Rani et al. (2022) have presented a CNN-LSTM model to classify tweets into six sentiment classes. Researchers of Fiok et al. (2021) have performed a sentiment analysis on a five-level sentiment scale, based on tweets posted to a specific Twitter account. Paper (Chen et al., 2017) proposes a BiLSTM-CRF based approach to improve sentence type classification. The study (Ridhwan and Hargreaves, 2021) has used VADER with a deep learning method (RNN) to classify tweets related to COVID-19 in Singapore. Authors of Umair and Masciari (2022) have used TextBlob and a BERT model to identify sentiments related to COVID-19 vaccines. 
With deep learning models, pre-trained word embeddings can also be used. In paper (Sharma et al., 2020), pre-trained Word2Vec embeddings have resulted in a better classification of small movie review sentences. Pre-trained GloVe word embeddings are used as initial weights in research (Basiri et al., 2021). The use of FastText word representations in study (Khasanah, 2021) has slightly improved the performance of the model. The research (Deb and Chanda, 2022) shows that contextual word embeddings have a better accuracy than context-free word embeddings. Advanced deep neural networks may achieve a higher accuracy with word embeddings. Other artificial neural networks, like bidirectional emotional recurrent units (Li et al., 2022) and improved graph convolutional networks (He et al., 2022), are also used for aspect-based sentiment analysis (Imani and Noferesti, 2022; Zhao et al., 2022). A sentiment analysis at finer classification levels to handle ambivalent emotions is proposed in paper (Wang et al., 2020). Hence, deep learning techniques have led to remarkable advancements in the field of sentiment analysis (Cambria et al., 2022). Therefore, in this research, we employ word embeddings with a deep neural network, and conduct different experiments to analyse the effect of using word embeddings as initial weights to the model on our dataset.
This work revolves around analysing public opinions related to the present-day concept of working from home. The analysis is based on more than 450,000 English tweets obtained from Twitter. The authors of paper (Cambria et al., 2017) have proposed the use of knowledge-based techniques combined with machine learning approaches for polarity detection. Hence, we employ a lexicon-based approach as well as various machine learning models to get a thorough understanding of this trend. The entire discussion unfolds in the succeeding sections.
Methodology
For this study, we have acquired data (tweets) from Twitter. We have applied data pre-processing techniques to deal with emojis, usernames, hashtags, URLs, and remove duplicates. Then, the data is labelled, up sampled, and split into training, validation, and testing datasets. This data is fed to our model (a Convolutional Neural Network) for training and prediction. In this section, we present a discussion of the above steps in detail.
A diagrammatic representation of the proposed scheme is shown in Fig. 2.
Fig. 2
Methodology
Data acquisition
Python’s Tweepy library is used to gather a rich corpus of tweets related to public opinion on working from home. Around 450,000 English language tweets are extracted from Twitter from 22nd January 2022 to 12th March 2022, by querying for a variety of related keywords such as ‘wfh’, ‘work from home’ and ‘working from home’. The tweets thus collected consist of attributes like a Tweet ID (a unique identifier for the tweet), text (the textual contents of the tweet as posted by the user), and the date and time when the tweet was posted. The data obtained is stored in a CSV format. An exploratory analysis shows the presence of tweets of different polarities in the dataset.
Data pre-processing
A state-of-the-art pre-processing technique is applied to the raw data obtained from Twitter to get rid of any inconsistency or noise before the data is fed to the model. Pre-processing reduces the dimensionality of input data (Rajalakshmi et al., 2017). This, in turn, helps to achieve better model performance in sentiment analysis. To pre-process the extracted text corpus, we first convert the emojis present in the text into their CLDR (Common Locale Data Repository Project) short names using Python’s Emoji module. Then the entire text is converted to lowercase. Afterwards, all retweets, username tags, URLs and hashtag symbols are removed. The keyword ‘wfh’ is converted to ‘work from home’. Finally, we remove all duplicate entries from the text. These pre-processing steps are carried out using Python scripts. As a result, the dataset is reduced to a total of 358,823 unique tweets. Examples of raw tweets obtained from Twitter and their pre-processed counterparts are shown in Table 1. Figure 3 shows the number of unique tweets for each day of the extraction period.
Table 1
Raw tweets and their pre-processed counterparts
| Raw tweet | Tweet after pre-processing |
| --- | --- |
| I have never been so grateful that my company allows us to WFH, though I have a morning meeting at least I don’t have to worry about morning commute https://t.co/RKlCXx5s8K | i have never been so grateful that my company allows us to work from home though i have a morning meeting at least i don t have to worry about morning commute |
| A leave is not #WFH and WFH is not a LEAVE!!!! https://t.co/cmrAOUHxHA | a leave is not work from home and work from home is not a leave |
| Twitter is a weird space, it’s very understandable that some people love WFH while some hate it | twitter is a weird space its very understandable that some people love work from home while some hate it |
| Ugh I’m just trying to work from home | ugh i m just trying to work from home |
Fig. 3
Daily number of unique tweets
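The pre-processing steps described in this subsection can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the authors' scripts: the function names are hypothetical, the emoji-to-text conversion is left out (so emoji characters are simply dropped here), and treating punctuation as whitespace and detecting retweets by an "RT @" prefix are assumptions chosen to be consistent with the examples in Table 1.

```python
import re


def preprocess(tweet: str) -> str:
    """Clean a single tweet (emoji conversion is omitted in this sketch)."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)           # remove URLs
    text = re.sub(r"@\w+", " ", text)                   # remove username tags
    text = text.replace("#", " ")                       # remove the hashtag symbol, keep the word
    text = re.sub(r"\bwfh\b", "work from home", text)   # expand the keyword
    text = re.sub(r"[^a-z0-9\s]", " ", text)            # assumption: punctuation becomes a space
    return " ".join(text.split())                       # collapse repeated whitespace


def preprocess_corpus(tweets):
    """Skip retweets, clean the remaining tweets, and drop duplicates (order kept)."""
    seen, cleaned = set(), []
    for raw in tweets:
        if raw.startswith("RT @"):                      # assumption: retweets carry this prefix
            continue
        text = preprocess(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

Applied to the second raw tweet of Table 1, `preprocess` yields the same string as the table's pre-processed column.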
Data annotation
Due to the enormous volume of our dataset, it is infeasible to manually label the data. Hence, we have applied a popular lexicon-based approach for data annotation. We have used VADER to understand the semantic orientation of the tweets. VADER is attuned to work well on data consisting of abbreviations, short unconventional texts and slang, catering to the needs of social media data. It is computationally fast and may exhibit higher accuracy than human annotators. We use this tool to categorise our data into three classes - positive, negative, and neutral, based on sentiment and polarity scores (Hutto and Gilbert, 2014). Examples of tweets labelled as positive, negative, and neutral are shown in Table 2. The distribution of tweets in these classes is shown in Fig. 4. A total of 195,233 tweets are found to be positive, 87,898 negative and 75,692 neutral. This gives us an overview of the public perception of working from home. A graph of the polarity of unique tweets for each day during the data acquisition period is shown in Fig. 5.
Table 2
Tweets and their polarities as classified by VADER
| Tweet | Polarity |
| --- | --- |
| i have never been so grateful that my company allows us to work from home though i have a morning meeting at least i don t have to worry about morning commute | Positive |
| twitter is a weird space its very understandable that some people love work from home while some hate it | Negative |
| i work from home and surprisingly it’s more exhausting lol | Positive |
| a leave is not work from home and work from home is not a leave | Neutral |
| i need to buy a proper office chair for my house because of work from home long hours of sitting is killing my back | Negative |
Fig. 4
Distribution of tweets into classes
Fig. 5
Polarity of Unique Tweets for Each Day
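The labelling step can be illustrated with a small sketch. VADER returns a compound score in [-1, 1] for each text; the text above does not state which cut-offs were used, so the ±0.05 thresholds below are an assumption (they are the values commonly suggested in the VADER documentation), and both function names are hypothetical. The second helper only turns the class counts reported above into percentages.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score to a sentiment class.

    Assumption: symmetric thresholds of +/-0.05; the paper does not state its cut-offs.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def shares_from_counts(counts: dict) -> dict:
    """Convert per-class tweet counts into percentages rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Class counts reported in the text for the 358,823 unique tweets.
reported = {"positive": 195233, "negative": 87898, "neutral": 75692}
shares = shares_from_counts(reported)
```

With the `vaderSentiment` package installed, the compound score for a tweet would typically come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`. The computed shares agree with the 54.41%, 24.50% and 21.09% quoted in the abstract.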
Data up sampling and splitting
From Fig. 4 it is evident that the data is highly unbalanced across the three classes: positive tweets dominate the data with 195,233 tweets, while the numbers of negative and neutral tweets are 87,898 and 75,692 respectively. Hence, we have resampled the minority classes (negative and neutral) and added the resampled tweets back to the original dataset, so that the model does not incline towards the majority class (positive). Then, we split the dataset into three parts - training (80%), validation (10%) and testing (10%) data. Data up sampling and splitting is done using Python scikit-learn. A precise outline of the number and class of tweets in the training, validation and testing datasets is shown in Table 3.
Table 3
Number and polarities of tweets in training, validation, and testing datasets
| Dataset | Positive | Negative | Neutral | Total |
| --- | --- | --- | --- | --- |
| Training | 156,071 | 156,137 | 156,351 | 468,559 |
| Validation | 19,527 | 19,511 | 19,532 | 58,570 |
| Testing | 19,635 | 19,585 | 19,350 | 58,570 |
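The up sampling and 80/10/10 split can be sketched as below. The paper uses scikit-learn for this step; the version here uses only the standard library so that it stays self-contained, and the function names, the fixed seed and the rounding rule for the split sizes are assumptions. The totals in Table 3 (585,699 = 3 × 195,233) appear consistent with each minority class being resampled, with replacement, up to the size of the positive class.

```python
import random


def upsample(groups: dict, seed: int = 0) -> list:
    """Resample every class (with replacement) up to the size of the largest class.

    groups maps a label to a non-empty list of tweets; returns (tweet, label) pairs.
    """
    rng = random.Random(seed)
    target = max(len(items) for items in groups.values())
    balanced = []
    for label, items in groups.items():
        extra = rng.choices(items, k=target - len(items))   # duplicates drawn at random
        balanced.extend((item, label) for item in items + extra)
    return balanced


def split_sizes(n: int) -> tuple:
    """Sizes of the 80% / 10% / 10% parts (assumption: round, remainder goes to test)."""
    n_train = round(0.8 * n)
    n_val = round(0.1 * n)
    return n_train, n_val, n - n_train - n_val


def split_80_10_10(rows: list, seed: int = 0) -> tuple:
    """Shuffle the rows and cut them into training, validation and testing parts."""
    rng = random.Random(seed)
    rows = rows[:]
    rng.shuffle(rows)
    n_train, n_val, _ = split_sizes(len(rows))
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]
```

Note that splitting after up sampling, as described in the text, means duplicated minority tweets can land in more than one split; the sketch follows the described order rather than changing it.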
Tokenisation, padding and word embeddings
To provide textual data (tweets) to our model in the form of training data, we have converted the textual data into vector form, known as word embeddings. This conversion is carried out using Keras Tokenizer in Python. The maximum sentence length of the text in training, validation and testing dataset is found to be 189, 255, and 188 respectively. We have limited the maximum sentence length to 190. Hence, word embeddings of sentences with length less than 190 are padded with zeros. Thus, the input shape of embeddings to our model is fixed to 190. Moreover, to train another instance of the model, FastText word embeddings are used as supervised word representations. These word embeddings are based on sub-word information. Hence, FastText can also handle out-of-vocabulary words. Further, the non-numerical labels (i.e., positive, negative, and neutral) are converted into categorical values (one-hot encoding) using Python scikit-learn Label Encoder.
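As a rough illustration of the tokenisation and padding described above, the sketch below maps words to integer ids and pads each sequence with zeros to length 190. It is a plain-Python stand-in for the Keras utilities, not the authors' code: the function names are hypothetical, and padding and truncating at the end of the sequence is an assumption (Keras `pad_sequences` pads and truncates at the front by default, and the text does not say which side was used).

```python
from collections import Counter

MAX_LEN = 190        # maximum sentence length stated in the text
MAX_WORDS = 30000    # vocabulary cap mentioned for the embedding layer


def build_vocab(texts, max_words: int = MAX_WORDS) -> dict:
    """Give the most frequent words ids 1..max_words-1; id 0 is reserved for padding."""
    counts = Counter(word for text in texts for word in text.split())
    most_common = counts.most_common(max_words - 1)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=1)}


def encode(text: str, vocab: dict, max_len: int = MAX_LEN) -> list:
    """Convert a tweet to a fixed-length list of word ids."""
    ids = [vocab[w] for w in text.split() if w in vocab]   # unknown words are skipped
    ids = ids[:max_len]                                     # assumption: cut off the tail
    return ids + [0] * (max_len - len(ids))                 # assumption: zeros are appended
```

Every encoded tweet then has exactly 190 entries, which matches the fixed input shape mentioned above.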
The Model - Convolutional Neural Network
To perform sentiment analysis on our dataset, we have used a CNN. This section describes the model used in this study.
Model parameters
The proposed model consists of an input layer, an embedding layer, two sets of conv1d and max pooling layers, a dropout layer, and two dense layers.
- Embedding layer: The word embeddings generated from the tweets’ text are an input to this layer. It produces an output of shape (190, 300), with the embedding dimension being 300. The maximum number of features to be considered is set to 30,000. In this research, we propose the use of pre-trained FastText word embeddings as weights to this layer.
- Convolution layer: The proposed CNN model consists of two convolution layers. The first conv1d layer has 64 filters, each of size 3, with ReLU activation and L2 regularizers. The input is down sampled by using a max pooling layer with the pool size and strides both set to 2. Another conv1d layer, like the former, is also present.
- Global max pooling: The convolution layers are followed by a global max-pooling operation. For each filter, this takes the maximum value across the entire sequence.
- Dropout layer: Then, a dropout layer is used to prevent overfitting. The dropout rate is set to 0.5, which randomly drops about half of the input units at each training step.
- Dense layer: The model also has two fully-connected dense layers (in which every neuron is connected to all neurons of the preceding layer). The first dense layer has 32 units (output space dimensions), ReLU activation, and regularizers. The other dense layer has 3 units (equal to the number of classes), SoftMax activation, and regularizers for kernel and bias.
The model is compiled with the categorical cross-entropy loss function for multi-class classification, as the labels are one-hot encoded. We have also used the Adam optimizer with the learning rate set to 0.001. The metric is set to categorical accuracy, which calculates the percentage of correctly predicted labels for one-hot encodings. The entire model is implemented in Python using TensorFlow and Keras. All the above-mentioned parameters are tuned after careful experimentation. 
The architecture of the model is shown in Fig. 6.
Fig. 6
Architecture of the proposed CNN model
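To make the layer description concrete, the following sketch traces the tensor shapes and trainable parameter counts implied by the stated hyperparameters. It is an arithmetic illustration, not the authors' implementation, and it rests on assumptions that the text does not confirm: convolutions use no padding (the Keras default, "valid"), the second convolution also has 64 filters of size 3 and is followed by the same max pooling, and the embedding matrix has exactly 30,000 rows.

```python
def conv1d_len(length: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 1-D convolution without padding ('valid')."""
    return (length - kernel) // stride + 1


def pool1d_len(length: int, pool: int, stride: int) -> int:
    """Output length of 1-D max pooling without padding."""
    return (length - pool) // stride + 1


seq_len, emb_dim, vocab_size = 190, 300, 30000   # values stated in the text
filters, kernel = 64, 3

# Shape after each layer (batch dimension omitted).
steps = [("embedding", (seq_len, emb_dim))]
length = seq_len
for i in (1, 2):
    length = conv1d_len(length, kernel)
    steps.append((f"conv1d_{i}", (length, filters)))
    length = pool1d_len(length, pool=2, stride=2)
    steps.append((f"max_pool_{i}", (length, filters)))
steps += [
    ("global_max_pool", (filters,)),   # one maximum per filter
    ("dropout", (filters,)),
    ("dense_1", (32,)),
    ("dense_2", (3,)),
]

# Trainable parameters: weights plus biases for each layer.
params = {
    "embedding": vocab_size * emb_dim,
    "conv1d_1": kernel * emb_dim * filters + filters,
    "conv1d_2": kernel * filters * filters + filters,
    "dense_1": filters * 32 + 32,
    "dense_2": 32 * 3 + 3,
}
total_params = sum(params.values())
```

Under these assumptions the sequence length shrinks from 190 to 188, 94, 92 and 46 before global max pooling, and the embedding matrix accounts for almost all of the parameters.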
Training
The model is trained on labelled data (supervised learning). The batch size is set to 128. The model is configured to train for up to 50 epochs, but an early stopping callback is applied to prevent overfitting. The callback monitors the validation loss and stops training when this metric ceases to decrease. Plots of categorical accuracy and loss for both the training and validation datasets are shown in Figs. 7(a) and 8(a) respectively. Another instance of the same model is trained using pre-trained FastText word embeddings as initial weights. Plots of categorical accuracy and loss for this FastText-based CNN model are shown in Figs. 7(b) and 8(b) respectively. The model with pre-trained FastText word embeddings trained for 23 epochs, whereas the CNN model without FastText word embeddings trained for 22 epochs; after that, training was stopped by the early stopping function.
Fig. 7
Categorical Accuracy values for CNN (a) without FastText word embeddings (b) with FastText word embeddings
Fig. 8
Loss values for CNN (a) without FastText word embeddings (b) with FastText word embeddings
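The early stopping rule used during training can be illustrated with a short sketch: training halts once the validation loss has failed to improve for a given number of consecutive epochs. The function name is hypothetical and the patience value is an assumption, since the text does not report it; the logic mirrors the usual behaviour of such callbacks rather than the authors' exact configuration.

```python
def early_stopping_epoch(val_losses, patience: int = 3, max_epochs: int = 50) -> int:
    """Return the 1-based epoch at which training stops.

    Assumption: 'patience' epochs without a new best validation loss trigger the stop.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, waited = loss, 0       # improvement: reset the counter
        else:
            waited += 1                  # no improvement this epoch
            if waited >= patience:
                return epoch
    return min(len(val_losses), max_epochs)
```

For example, with validation losses `[1.0, 0.8, 0.7, 0.71, 0.72, 0.73]` and a patience of 3, the best value is reached at epoch 3 and training stops after epoch 6.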
Evaluation and results
The model described in the previous section is evaluated on a testing dataset consisting of 58,570 tweets, of which 19,635 are positive, 19,585 are negative and 19,350 are neutral tweets. The performance of the proposed CNN model is also compared with existing machine learning classifiers like SVM (Pavitha et al., 2022), Naïve Bayes (Yang, 2018), Decision Tree (Swain and Hauska, 1977) and Random Forest (Biau and Scornet, 2016). Some standard metrics (Sasidhar et al., 2020) considered for this analysis are as follows:
Accuracy - It represents the proportion of correctly predicted samples to the total number of samples.
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1) \]
Here,
TP: the number of positive instances that the classifier correctly labels as positive.
FP: the number of negative instances that the classifier incorrectly labels as positive.
FN: the number of positive instances that the classifier incorrectly labels as negative.
TN: the number of negative instances that the classifier correctly labels as negative.
Precision - It represents the ratio of correctly predicted positive samples to the total number of predicted positive samples.
\[ \text{Precision} = \frac{TP}{TP + FP} \qquad (2) \]
Recall - It represents the ratio of correctly predicted positive samples to the number of all positive samples.
\[ \text{Recall} = \frac{TP}{TP + FN} \qquad (3) \]
F1-score - It is a value between 0 and 1 which is calculated as the harmonic mean of precision and recall.
\[ \text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4) \]
Support - It is the actual number of samples of a class in the dataset.
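The metrics above can be computed per class from a confusion matrix by treating each class in turn as the positive one and all others as negative. The sketch below is a plain-Python illustration of that idea under this one-vs-rest reading; the function names and the toy matrix in the usage note are hypothetical, and the paper itself obtains these values from scikit-learn's classification report.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def per_class_metrics(confusion, labels):
    """Per-class precision, recall, F1 and support, plus overall accuracy.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    report = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp   # predicted k, actually another class
        fn = sum(confusion[k]) - tp                         # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1_score(precision, recall),
            "support": tp + fn,
        }
    return report, correct / total
```

As a usage example, `per_class_metrics([[8, 1, 1], [2, 6, 2], [0, 1, 9]], ["positive", "neutral", "negative"])` gives a precision and recall of 0.8 for the positive class and an overall accuracy of 23/30. The harmonic-mean relation can also be checked against reported values: a precision of 0.923695 and a recall of 0.913748 give an F1-score of about 0.918695.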
Convolutional Neural Network model without FastText word embeddings
In this experiment, word embeddings generated using Keras tokenizer are used with the CNN model. The results are predicted on the testing dataset. The classification report generated using Python scikit-learn, based on evaluation metrics represented by (1), (2), (3), (4) is shown in Table 4. As a result, it is found that 17,819 positive, 18,557 negative and 17,778 neutral tweets are correctly classified into their respective classes. Figure 9 shows the confusion matrix for the classification of tweets into three classes. The accuracy of the CNN without the use of FastText word embeddings is 92.4603%.
Table 4
Classification Report of CNN model without FastText word embeddings
| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Positive | 0.923695 | 0.913748 | 0.918695 | 19501 |
| Neutral | 0.988271 | 0.908989 | 0.946973 | 19558 |
| Negative | 0.87163 | 0.951105 | 0.909635 | 19511 |
Accuracy: 0.924603
Fig. 9
Confusion Matrix for CNN (a) without FastText word embeddings (b) with FastText word embeddings
Convolutional Neural Network model with FastText word embeddings
In this experiment, pre-trained FastText word embeddings are used as initial weights to the CNN model. After predicting labels on the testing dataset, the classification report generated using Python scikit-learn, based on evaluation metrics represented by (1), (2), (3), (4) is shown in Table 5. It is found that 17,777 positive, 18,717 negative and 17,740 neutral tweets are correctly classified into their respective classes. Figure 9 shows the confusion matrix for the classification of tweets into three classes. The accuracy of the CNN with the use of FastText word embeddings is 92.5969%.
Table 5
Classification Report of CNN model with FastText word embeddings as initial weights

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Positive | 0.93166 | 0.911594 | 0.921518 | 19501 |
| Neutral | 0.989845 | 0.907046 | 0.946638 | 19558 |
| Negative | 0.867854 | 0.959305 | 0.911291 | 19511 |

Accuracy: 0.925969
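Since the F1-score is the harmonic mean of precision and recall, the F1 column of Table 5 can be recomputed from the other two columns. A minimal sketch in plain Python follows; the function name `f1_score` is hypothetical (it is not the scikit-learn function of the same name), and small differences in the last digit are expected because the table values are rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, taken from Table 5.
table5 = {
    "positive": (0.93166, 0.911594, 0.921518),
    "neutral": (0.989845, 0.907046, 0.946638),
    "negative": (0.867854, 0.959305, 0.911291),
}

recomputed = {label: f1_score(p, r) for label, (p, r, _) in table5.items()}
for label, (_, _, reported) in table5.items():
    print(f"{label}: recomputed = {recomputed[label]:.6f}, reported = {reported}")
```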
In the following series of experiments, other machine learning classifiers are also used for sentiment prediction on the same dataset.
SVM model
The accuracy of SVM on our dataset is 0.864396. Table 6 shows the classification report for the SVM model generated using Python scikit-learn, based on the evaluation metrics in (1) to (4).
Table 6
Classification report of SVM model

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Positive | 0.868624 | 0.834709 | 0.851344 | 19501 |
| Neutral | 0.882702 | 0.888881 | 0.885781 | 19558 |
| Negative | 0.842627 | 0.869647 | 0.855933 | 19511 |

Accuracy: 0.864396
Naïve Bayes model
The accuracy of Naïve Bayes on our dataset is 0.772759. Table 7 shows the classification report for the Naïve Bayes model generated using Python scikit-learn.
Table 7
Classification report of Naïve Bayes model

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Positive | 0.83988 | 0.692507 | 0.759106 | 19501 |
| Neutral | 0.890919 | 0.734254 | 0.805035 | 19558 |
| Negative | 0.659951 | 0.89131 | 0.758378 | 19511 |

Accuracy: 0.772759
Decision Tree model
The accuracy of Decision Tree on our dataset is 0.794007. Table 8 shows the classification report for the Decision Tree model generated using Python scikit-learn.
Table 8
Classification report of decision tree model

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Positive | 0.805072 | 0.579427 | 0.673862 | 19501 |
| Neutral | 0.81433 | 0.920235 | 0.86405 | 19558 |
| Negative | 0.767191 | 0.882538 | 0.820832 | 19511 |

Accuracy: 0.794007
Random Forest model
The accuracy of Random Forest on our dataset is 0.873399. Table 9 shows the classification report for the Random Forest model generated using Python scikit-learn.
Table 9
Classification report of random forest model

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Positive | 0.806344 | 0.847341 | 0.826334 | 19501 |
| Neutral | 0.874124 | 0.931088 | 0.901707 | 19558 |
| Negative | 0.952134 | 0.841926 | 0.893645 | 19511 |

Accuracy: 0.873399
Further, Table 10 shows a comparative analysis of the performance of all the classifiers used for sentiment analysis on the dataset. It is found that the proposed CNN with FastText word embeddings outperforms other classifiers with a classification accuracy of 0.925969.
Table 10
Performance comparison of CNN (with and without FastText word embeddings), SVM, Naïve Bayes, decision tree, and random forest models

| Model | Accuracy |
| --- | --- |
| CNN without FastText word embeddings | 0.924603 |
| CNN with FastText word embeddings | 0.925969 |
| SVM | 0.864396 |
| Naïve Bayes | 0.772759 |
| Decision Tree | 0.794007 |
| Random Forest | 0.873399 |
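To put the accuracies in Table 10 in perspective, the short sketch below ranks the models and converts the difference between the two CNN variants into a number of test tweets. It uses only the values in Table 10 and the test set size of 58,570; the variable names are hypothetical.

```python
# Accuracies as reported in Table 10.
accuracies = {
    "CNN without FastText word embeddings": 0.924603,
    "CNN with FastText word embeddings": 0.925969,
    "SVM": 0.864396,
    "Naïve Bayes": 0.772759,
    "Decision Tree": 0.794007,
    "Random Forest": 0.873399,
}
test_size = 58570  # number of tweets in the testing dataset

# Rank models from most to least accurate.
ranking = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
for name, acc in ranking:
    print(f"{name}: {acc:.6f}")

# Gain from FastText, expressed in accuracy and in additional correct tweets.
fasttext_gain = (
    accuracies["CNN with FastText word embeddings"]
    - accuracies["CNN without FastText word embeddings"]
)
extra_correct = round(fasttext_gain * test_size)
print(f"FastText gain: {fasttext_gain:.6f} ({extra_correct} more tweets correct)")
```

On these figures the gain from FastText corresponds to 80 additional correctly classified tweets, which agrees with the correct counts reported for the two CNN experiments (54,234 versus 54,154 in total).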
Thus, CNNs are found to have the best performance on the given dataset. Moreover, it is observed that the accuracy of the CNN increases slightly with the use of pre-trained FastText word vectors. This is attributed to FastText's ability to hold sub-word information, which allows it to generate vectors for out-of-vocabulary words. Table 11 illustrates a few examples where the predicted value differed from the actual label. A word cloud representing some of the most commonly occurring words in our text corpus is shown in Fig. 10.
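The sub-word idea can be illustrated with character n-grams: FastText wraps a word in the boundary markers `<` and `>` and represents it through its character n-grams, so an unseen or misspelled word still shares n-grams with known words. The sketch below is a plain-Python illustration rather than the FastText library itself; the function name and the n-gram lengths are assumptions, since the range used in this study is not stated.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


print(char_ngrams("where", 3, 3))
# ['<wh', 'whe', 'her', 'ere', 're>']

# A misspelled, possibly out-of-vocabulary token still overlaps with the known word.
shared = set(char_ngrams("working")) & set(char_ngrams("workin"))
print(sorted(shared))
```

Because the vector of a word is built from the vectors of its n-grams, the shared n-grams give the misspelled token a representation close to that of the correctly spelled word.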
Table 11
Examples of tweets where sentiment classification of the CNN model in experiment 2 differed from VADER

| Tweet text | VADER | CNN Model |
| --- | --- | --- |
| a leave is not work from home and work from home is not a leave | Neutral | Negative |
| i work from home and surprisingly it’s more exhausting lol | Positive | Negative |
| best part about working from home is you can scream at the top of your lungs in distress and none of your coworkers will hear it | Neutral | Negative |
| i have never been so grateful that my company allows us to work from home though i have a morning meeting at least i don t have to worry about morning commute | Positive | Negative |
| twitter is a weird space its very understandable that some people love work from home while some hate it | Negative | Positive |
Fig. 10
Word Cloud for some commonly occurring words
Conclusion
The huge volume of data from social media has immense potential for exploration. With careful analysis, such a rich dataset can be used to make predictions that improve decision making.

In this research, we have conducted a sentiment analysis on data obtained from Twitter. Opinion mining techniques are employed to gather a rich text corpus of more than 450,000 English tweets containing keywords related to work from home, over a period of 50 days. A series of state-of-the-art pre-processing techniques is carried out to handle emojis, usernames, URLs, hashtags, abbreviations, and inconsistencies in text. Polarities are assigned to the tweets using VADER. Further, we use a novel CNN to examine the tweets to infer public perception of working from home. The proposed deep learning model has multiple convolution and max pooling layers, a dropout operation, and dense layers with ReLU and sigmoid activations. The use of FastText supervised word representations with our model has shown promising performance on our dataset. Further, some standard machine learning classifiers (SVM, Naïve Bayes, Decision Tree, and Random Forest) are used to validate the performance of the proposed CNN. As a result, it is shown that our CNN model with FastText word embeddings outperforms the other classifiers, with a classification accuracy of 0.925969.

Thus, we have addressed the emerging inclination of working from home on Twitter, using lexicon-based techniques and several machine learning classifiers, and found that 54.41% of tweets show affirmation for working from home, whereas 24.50% of tweets show public dissatisfaction and 21.09% of tweets have a neutral disposition on this present-day working trend.

Results from this study can be used to frame new flexible policies to give employees the freedom to choose their work settings. Moreover, a hybrid approach will considerably save the time and resources required to travel to the office.
A limitation of this research is that only English-language tweets are considered. In future, the proposed model could be used in conjunction with other classification models to classify the tweets into much finer classes to handle mixed emotions, and to improve accuracy and time-related metrics.