Muhammad Irzam Liaqat1, Muhammad Awais Hassan1, Muhammad Shoaib1, Syed Khaldoon Khurshid1, Mohamed A Shamseldin2.
Abstract
Sentiment analysis involves processing and analyzing sentiments expressed in textual data. For high-resource languages such as English and French, sentiment analysis has been carried out effectively in the past. However, its applications are comparatively few for resource-poor languages because textual resources are lacking. This systematic literature review explores different aspects of Urdu-based sentiment analysis, a classic case of a resource-poor language. Urdu is a South Asian language understood by one hundred and sixty-nine million people across the planet. The literature has various shortcomings, including the limited availability of large corpora and language parsers and the lack of pre-trained machine learning models, which result in poor performance. This article analyzes and evaluates studies addressing machine learning-based Urdu sentiment analysis. After searching and filtering, forty articles were inspected. Research objectives were proposed that lead to research questions. Our searches were organized in digital repositories after selecting and screening relevant studies. Data was extracted from these studies. Our work on the existing literature reflects that sentiment classification performance can be improved by overcoming challenges such as word sense disambiguation and the need for massive datasets. Furthermore, Urdu-based language constructs, including language parsers and emoticons, context-level sentiment analysis techniques, pre-processing methods, and lexical resources, can also be improved. ©2022 Liaqat et al.
Keywords: Digital repositories; Opinion mining; Poor resource language; Sentiment analysis; Urdu-based language constructs; Word sense disambiguation
Year: 2022 PMID: 36091980 PMCID: PMC9454799 DOI: 10.7717/peerj-cs.1032
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Comparison with related work.
| Article | Focus of the study | Ref. | Approach | Quality assessment score | Identification of dimensions | Explored: publication channels | Explored: machine learning approaches | Explored: challenges and opportunities | Explored: effective machine learning methods | Targeted repositories |
|---|---|---|---|---|---|---|---|---|---|---|
| | Survey of sentiment analysis in Urdu language | 2020 | SLR | Yes | Yes | Yes | No | Yes | No | 3 |
| | Survey of the lexicon, machine learning, and hybrid techniques for SA | 2019 | Trad. | No | No | No | Yes | No | No | Not mentioned |
| | Review of studies for lexicon development | 2006 | Trad. | No | Yes | No | No | Yes | No | Not mentioned |
| | Survey of multilingual based SA | 2017 | Trad. | No | Yes | No | No | Yes | No | Not mentioned |
| This survey | Classification of machine and deep learning approaches | 2021 | SLR | Yes | Yes | Yes | Yes | Yes | Yes | 10 |
Figure 1. Research methodology.
Figure 2. Search strategy.
Research questions.
| RQ | Research question | Objective |
|---|---|---|
| RQ1: | What are the relevant publication channels for Urdu-based SA? | To identify |
| RQ2: | What are the primary studies that have studied and discussed the use of machine learning methods for sentiment analysis? | To identify |
| RQ3: | What are the major challenges and opportunities for the Urdu-based sentiment analysis? | To identify |
| RQ4: | What are the most effective machine learning-based methods for performing Urdu-based SA? | To identify |
Figure 3. Query string listing possible combinations using derived keywords.
Scores for journals and conferences.
| Publication source | | | | | |
|---|---|---|---|---|---|
| Journals | Q1 | Q2 | Q3 | Q4 | No JCR Ranking |
| Conferences, Workshops, Symposia | CORE A* | CORE A | CORE B | CORE C | Not in CORE Ranking |
Query string generated results and filtering phases.
| No. | Phase | Criterion | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Search | Keywords ( | 172,796 | 531 | 425 | 14,143 | 377,221 | 122 | 129 | 1,386 | 103 | 50,900 |
| 2 | Filtering | Since 2017 | 48,912 | 492 | 350 | 8,903 | 137,721 | 89 | 12 | 1,386 | 103 | 18,500 |
| 3 | Filtering | Title | 170 | 50 | 150 | 128 | 150 | 50 | 6 | 27 | 30 | 198 |
| 4 | Filtering | Abstract | 103 | 30 | 15 | 40 | 10 | 35 | 3 | 15 | 12 | 58 |
| 5 | Filtering | Introduction and conclusion | 33 | 12 | 3 | 10 | 7 | 15 | 2 | 8 | 5 | 37 |
| 6 | Inspection | Full Article | 8 | 10 | 1 | 8 | 3 | 5 | 1 | 1 | 2 | 2 |
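The funnel in the table above can be tallied per phase. The following is a minimal sketch in Python: the counts are copied from the table rows, while the dictionary keys, the `phase_totals` helper, and the per-phase summation are illustrative assumptions rather than the authors' procedure. The repository columns stay unnamed because this copy of the table does not label them.

```python
# Counts per repository column, copied row by row from the table above.
# Column order follows the table; the columns themselves are unlabeled here.
phases = {
    "Search (keywords)": [172796, 531, 425, 14143, 377221, 122, 129, 1386, 103, 50900],
    "Filtering (since 2017)": [48912, 492, 350, 8903, 137721, 89, 12, 1386, 103, 18500],
    "Filtering (title)": [170, 50, 150, 128, 150, 50, 6, 27, 30, 198],
    "Filtering (abstract)": [103, 30, 15, 40, 10, 35, 3, 15, 12, 58],
    "Filtering (introduction and conclusion)": [33, 12, 3, 10, 7, 15, 2, 8, 5, 37],
    "Inspection (full article)": [8, 10, 1, 8, 3, 5, 1, 1, 2, 2],
}


def phase_totals(table):
    """Sum the per-repository counts of each phase (hypothetical helper)."""
    return {name: sum(counts) for name, counts in table.items()}


totals = phase_totals(phases)
for name, total in totals.items():
    print(f"{name}: {total}")
```

Summing each row this way gives a quick view of how sharply each filtering step narrows the candidate set.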
Search strategy for selected repositories.
| Repository | Search string | Date filter |
|---|---|---|
| ACM Digital Library | [[All: sentiment analysis] OR [All: intent analysis]] AND [[All: machine learning] OR [All: deep learning]] AND [All: Urdu] AND [All: challenges] AND [All: opportunities] | Since 2017 |
| IEEE Xplore | (Sentiment Analysis OR Sentiment Classification) AND (Machine Learning OR Deep Learning OR Supervised Learning OR Semi Supervised Learning) AND (Urdu OR Urdu language) | Since 2017 |
| PLOS ONE | Urdu Sentiment Analysis using Machine Learning | Since 2017 |
| Google Scholar | (Sentiment Analysis OR Opinion Mining OR Intent Analysis OR Sentiment Classification) AND (Machine Learning OR Deep Learning OR Supervised Learning OR Semi Supervised Learning) AND (Urdu OR Urdu language) AND (Challenges) AND (Opportunities) | Since 2012 Since 2017 |
| Science Direct | (Sentiment Analysis OR Opinion Mining OR Intent Analysis OR Sentiment Classification) AND (Machine Learning OR Deep Learning OR Supervised Learning OR Semi Supervised Learning) AND (Urdu OR Urdu language) AND (Challenges) AND (Opportunities) | Since 2017 |
| Springer Link | (Sentiment Analysis OR Opinion Mining OR Intent Analysis OR Sentiment Classification) AND (Machine Learning OR Deep Learning OR Supervised Learning OR Semi Supervised Learning) AND (Urdu OR Urdu language) AND (Challenges) AND (Opportunities) | Since 2017 |
| Wiley Online Library | (Sentiment Analysis OR Opinion Mining OR Intent Analysis OR Sentiment Classification) AND (Machine Learning OR Deep Learning OR Supervised Learning OR Semi Supervised Learning) AND (Urdu OR Urdu language) AND (Challenges) AND (Opportunities) | Since 2017 |
| arXiv | Urdu Sentiment Analysis using Machine Learning | Since 2017 |
| IGI Global | Expert search-based keywords: Sentiment Analysis, Urdu, Machine Learning, Deep Learning, Challenges, opportunities | Since 2017 |
| CEEOL | Search String keywords: Sentiment Analysis, Urdu, Machine Learning, Deep Learning, Challenges, opportunities | Since 2017 |
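Most of the search strings above share one pattern: synonyms are joined with OR inside parentheses, and the resulting groups are joined with AND. The sketch below shows one way such a string could be assembled; the `build_query` function and the `groups` variable are hypothetical names for illustration, not tooling described by the authors.

```python
def build_query(keyword_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for group in keyword_groups:
        clauses.append("(" + " OR ".join(group) + ")")
    return " AND ".join(clauses)


# Keyword groups taken from the IEEE Xplore row of the table above.
groups = [
    ["Sentiment Analysis", "Sentiment Classification"],
    ["Machine Learning", "Deep Learning", "Supervised Learning", "Semi Supervised Learning"],
    ["Urdu", "Urdu language"],
]

query = build_query(groups)
print(query)
```

Appending single-term groups such as `["Challenges"]` and `["Opportunities"]` yields the longer form used for several of the other repositories.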
Figure 4. Year wise publication identification.
Percentage of publication type.
| Publications (in percentage) | |
|---|---|
| Journals | 62 |
| Conferences | 36 |
| Workshops | 2 |
Distribution of selected studies.
| Distribution of articles (in percentage) | |
|---|---|
| America | 12 |
| Europe | 18 |
| Asia | 60 |
| Other | 10 |
Quality assessment score.
| References | Score | Total |
|---|---|---|
| | 8 | 5 |
| | 7 | 8 |
| | 6 | 14 |
| | 5 | 13 |
Figure 5. Taxonomy for the SLR.
Classification of the selected studies.
| | Classification | | | | | Quality assessment | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | Journal | 2019 | Primary | Experimental | Urdu SA | 1 | 0 | 1 | 0 | 4 | 6 |
| | Journal | 2018 | Primary | Experimental | ML based SA | 0 | 1 | 1 | 1 | 3 | 6 |
| | Journal | 2021 | Primary | Experimental | Deep Learning Based SA | 0 | 1 | 1 | 0 | 4 | 6 |
| | Journal | 2019 | Primary | Experimental | DL based SA | 0 | 1 | 1 | 0 | 3 | 5 |
| | Journal | 2020 | Survey | No | SLR | 1 | 0 | 0 | 1 | 4 | 6 |
| | Journal | 2019 | Primary | Experimental | ML based SA | 0 | 1 | 1 | 0 | 4 | 6 |
| | Conference | 2014 | Primary | Experimental | Supervised ML based SA in Persian | 0 | 1 | 1 | 0 | 3 | 5 |
| | Journal | 2019 | Primary | Experimental | ML based SA | 0 | 1 | 1 | 1 | 4 | 7 |
| | Conference | 2020 | Primary | Solution article | ML based SA | 0 | 1 | 0 | 1 | 3 | 5 |
| | Journal | 2021 | Primary | Experimental | ML based SA | 1 | 1 | 1 | 1 | 3 | 7 |
| | Journal | 2021 | Primary | Experimental | ML based SA | 1 | 1 | 1 | 0 | 4 | 7 |
| | Journal | 2020 | Comparative study | Survey article | Deep learning models | 1 | 1 | 1 | 1 | 4 | 8 |
| | Journal | 2021 | Primary | Experimental | ML based SA | 1 | 1 | 1 | 1 | 4 | 8 |
| | Journal | 2021 | Primary | Experimental | ML based Hate Speech Analysis | 1 | 1 | 1 | 0 | 4 | 7 |
| | Journal | 2020 | Primary | Experimental | Lexicon based Sentiment Analysis | 1 | 0 | 1 | 0 | 4 | 6 |
| | Journal | 2020 | Primary | Experimental | ML based SA | 0 | 1 | 1 | 1 | 3 | 6 |
| | Journal | 2021 | Primary | Experimental | Deep Learning based SA | 1 | 1 | 1 | 1 | 4 | 8 |
| | Journal | 2021 | Primary | Exploratory | CNN-RNN based SA | 1 | 1 | 1 | 1 | 4 | 8 |
| | Journal | 2020 | Primary | Experimental | Hate Speech using ML | 0 | 1 | 1 | 1 | 4 | 7 |
| | Journal | 2020 | Primary | Experimental | ML and deep learning based review classification | 0 | 1 | 1 | 0 | 3 | 5 |
| | Journal arXiv | 2020 | Survey | Survey article | Survey of DL Methods | 0 | 1 | 0 | 1 | 3 | 5 |
| | Journal ACM | 2021 | Survey | Survey article | DL based text classification | 0 | 1 | 0 | 0 | 4 | 5 |
| | Journal Elsevier | 2021 | Review | Review article | Opinion Analysis | 0 | 1 | 0 | 1 | 3 | 5 |
| | Journal ACM | 2019 | Survey | Survey article | Opinion Mining | 0 | 1 | 0 | 1 | 4 | 6 |
| | Journal Telematics and informatics | 2018 | Comparative study | Experimental | Lexicon and ML approaches | 1 | 1 | 1 | 0 | 3 | 6 |
| | Journal Procedia CS | 2019 | Primary | Experimental | DL based Sentiment Analysis | 0 | 1 | 1 | 0 | 3 | 5 |
| | Journal | 2020 | Survey | Survey article | Sentiment Analysis methods | 0 | 1 | 0 | 1 | 3 | 5 |
| | Journal AI Review | 2019 | Survey | Survey article | Opinion mining and approaches | 0 | 1 | 0 | 1 | 4 | 6 |
| | Journal IJACSA | 2019 | Primary | Experimental | Text analysis for sentiment | 0 | 1 | 1 | 1 | 2 | 5 |
| | Conference ACM | 2009 | Survey | Survey article | Sentiment Analysis | 0 | 1 | 0 | 1 | 3 | 5 |
| | Journal Expert Sys | 2014 | Primary | Experimental | Aspect based opinion mining | 0 | 1 | 0 | 1 | 4 | 6 |
| | Conference | 2017 | Primary | Experimental | ML based SA | 0 | 1 | 1 | 1 | 2 | 5 |
| | Conference | 2017 | Primary | Experimental | Context based topic modeling | 0 | 1 | 1 | 1 | 3 | 6 |
| | Conference | 2019 | Primary | Experimental | Urdu-based SA | 1 | 1 | 1 | 1 | 3 | 7 |
| | Conference | 2019 | Primary | Experimental | Feedback analysis | 0 | 1 | 1 | 1 | 3 | 6 |
| | Journal Pakistan JS | 2021 | Primary | Experimental | Review SA | 0 | 1 | 1 | 1 | 3 | 6 |
| | Journal IEEE Access | 2019 | Primary | Experimental | Roman Urdu SA | 0 | 1 | 1 | 1 | 4 | 7 |
| | Journal IJST | 2019 | Primary | Experimental | Text classification-based SA | 0 | 1 | 1 | 1 | 2 | 5 |
| | Conference IEEE | 2020 | Primary | Experimental | Roman Urdu SA | 0 | 1 | 1 | 0 | 2 | 4 |
| | Conference IEEE | 2019 | Primary | Experimental | ML based Urdu SA | 1 | 1 | 1 | 1 | 3 | 7 |
| | Journal | 2019 | Primary | Experimental | Emotion Analysis | 0 | 1 | 1 | 0 | 3 | 5 |
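In every row of the table above, the last value equals the sum of the five values before it. The short check below illustrates this on a few rows. It assumes that the five values are individual quality criteria and that the last value is the total score; the column labels did not survive in this copy, so that reading is an assumption, and `total_score` is a hypothetical helper name.

```python
# A few rows from the table: (five criterion values, reported final value).
rows = [
    ([1, 0, 1, 0, 4], 6),
    ([0, 1, 1, 1, 3], 6),
    ([1, 1, 1, 1, 4], 8),
    ([0, 1, 1, 0, 2], 4),
]


def total_score(criteria):
    """Total quality score, assumed to be the plain sum of the criterion values."""
    return sum(criteria)


for criteria, reported in rows:
    computed = total_score(criteria)
    status = "matches" if computed == reported else "differs"
    print(f"{criteria} -> computed {computed}, reported {reported} ({status})")
```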
Validation classification of studies.
| | | | |
|---|---|---|---|
| | Precision, Accuracy, Recall, F-measure | UL | U |
| | Accuracy | UM, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UD | RU |
| | Precision, Accuracy, Recall, F-measure | UD | RU |
| | Survey | CO | U |
| | Accuracy Unigram | UM | RU |
| | F-score of features | UM, CO | RU |
| | Accuracy, F-score | UM, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UM, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UM, CO | U |
| | Precision, Accuracy, Recall, F-measure | UM | U |
| | Accuracy for features | UD, CO | U |
| | Precision, Accuracy, Recall, F-measure | UM, CO | U |
| | Micro F1 score | UM | U |
| | Accuracy | UL | U |
| | Accuracy | UM, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UD, CO | U |
| | Precision, Accuracy, Recall, F-measure | UD, CO | U |
| | Precision, Recall, F-measure | UM, CO | RU |
| | Accuracy | UM, UD | RU |
| | F-measure | UD, CO | RU |
| | Survey | UD | RU |
| | Review | UL, CO | RU |
| | Survey | UL, CO | RU |
| | Precision, Recall, F-measure | UL | U |
| | Precision, Accuracy, Recall, F-measure | UD | RU |
| | Review | UL, CO | RU |
| | Survey | UM, CO | RU |
| | Accuracy, F-measure | UL, CO | RU |
| | Accuracy | UM, CO | RU |
| | Accuracy, F-measure | UM, CO | RU |
| | F-measure | UD, CO | RU |
| | Accuracy | ML, CO | U |
| | Accuracy | UD, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UL, CO | RU |
| | Accuracy, F-measure | UM, CO | RU |
| | F-score of features | UM, CO | RU |
| | Accuracy | UL, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UM, CO | U |
| | Accuracy | UD, CO | RU |
| | Precision, Accuracy, Recall, F-measure | UL, CO | U |
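The validation measures named most often in the table above (precision, accuracy, recall, F-measure) can all be derived from a confusion matrix. The sketch below uses the standard definitions for a binary positive/negative classifier, taking F-measure as the balanced F1 score; the function name and the counts are hypothetical and are not taken from any of the reviewed studies.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F-measure (F1) from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts: true positives, false positives, false negatives, true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can look favorable on imbalanced sentiment data, which is one reason to report precision, recall, and F-measure alongside it.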