Kashif Ahmad1, Firoj Alam2, Junaid Qadir3, Basheer Qolomany4, Imran Khan5, Talhat Khan5, Muhammad Suleman5, Naina Said5, Syed Zohaib Hassan6, Asma Gul7, Mowafa Househ1, Ala Al-Fuqaha1.
Abstract
BACKGROUND: Contact tracing has been globally adopted in the fight to control the infection rate of COVID-19. To this aim, several mobile apps have been developed. However, there are ever-growing concerns over the working mechanism and performance of these applications. The literature already provides some interesting exploratory studies on the community's response to the applications by analyzing information from different sources, such as news and users' reviews of the applications. However, to the best of our knowledge, there is no existing solution that automatically analyzes users' reviews and extracts the evoked sentiments. We believe such solutions combined with a user-friendly interface can be used as a rapid surveillance tool to monitor how effective an application is and to make immediate changes without going through an intense participatory design method.
Keywords: BERT; COVID-19; NLP; RoBERTa; contact tracing applications; fastText; sentiment analysis; text classification; transformers
Year: 2022 PMID: 35389357 PMCID: PMC9097863 DOI: 10.2196/36238
Source DB: PubMed Journal: JMIR Form Res ISSN: 2561-326X
COVID-19 contact tracing mobile apps used in this study.
| S. No. | Country | Application | Technology |
| 1 | Australia | COVIDSafe | Bluetooth, Google/Apple |
| 2 | Austria | Stopp Corona | Bluetooth, Google/Apple |
| 3 | Bahrain | BeAware | Bluetooth, location |
| 4 | Bangladesh | Corona Tracer BD | Bluetooth, Google |
| 5 | Belgium | Coronalert | Bluetooth, Google/Apple |
| 6 | Bulgaria | ViruSafe | Location, Bluetooth, Google/Apple |
| 7 | Canada | COVID Alert | Bluetooth, Google/Apple |
| 8 | Cyprus | CovTracer | Location, GPS |
| 9 | Czech Republic | eRouska | Bluetooth, Google/Apple |
| 10 | Denmark | Smittestop | Bluetooth, Google/Apple |
| 11 | Estonia | HOIA | Bluetooth, DP-3T, Google/Apple |
| 12 | Fiji | CareFiji | Bluetooth, Google/Apple |
| 13 | Finland | Koronavilkku | Bluetooth, DP-3T |
| 14 | France | TousAntiCovid | Bluetooth, Google/Apple |
| 15 | Germany | Corona-Warn-App | Bluetooth, Google/Apple |
| 16 | Ghana | GH COVID-19 Tracker | Location, Google/Apple |
| 17 | Gibraltar | Beat Covid Gibraltar | Bluetooth, Google/Apple |
| 18 | Hungary | VirusRadar | Bluetooth, Google |
| 19 | Iceland | Rakning C-19 | Location, Google/Apple |
| 20 | India | Aarogya Setu | Bluetooth, location, Google/Apple |
| 21 | Indonesia | PeduliLindungi | Bluetooth, Google/Apple |
| 22 | Ireland | Covid Tracker | Bluetooth, Google/Apple |
| 23 | Israel | HaMagen | Location, Google/Apple |
| 24 | Italy | Immuni | Bluetooth, Google/Apple |
| 25 | Japan | COCOA | Google/Apple |
| 26 | Kingdom of Saudi Arabia | Tawakkalna | Bluetooth, Google |
| 27 | Kingdom of Saudi Arabia | Tabaud | |
| 28 | Kuwait | Shlonik | Location, Google/Apple |
| 29 | Malaysia | MyTrace | Bluetooth, Google/Apple |
| 30 | Mexico | CovidRadar | Bluetooth |
| 31 | New Zealand | NZ COVID Tracer | QR codes, Google/Apple |
| 32 | North Macedonia | StopKorona | Bluetooth |
| 33 | Northern Ireland | StopCOVID NI | Bluetooth, Google/Apple |
| 34 | Norway | Smittestopp | Bluetooth, location, Google |
| 35 | Pakistan | COVID-Gov-PK | Bluetooth, GPS, Google/Apple |
| 36 | Philippines | StaySafe | Bluetooth, Google/Apple |
| 37 | Poland | ProteGO Safe | Bluetooth, Google |
| 38 | Qatar | Ehteraz | Bluetooth, location, Google/Apple |
| 39 | Singapore | TraceTogether | Bluetooth, Google/Apple |
| 40 | South Africa | COVID Alert SA | Bluetooth, Google/Apple |
| 41 | Switzerland | SwissCovid | Bluetooth, DP-3T, Google/Apple |
| 42 | Thailand | MorChana | Location, Bluetooth |
| 43 | Tunisia | E7mi | Google/Apple |
| 44 | Turkey | Hayat Eve Sığar | Bluetooth, location, Google/Apple |
| 45 | United Arab Emirates | TraceCovid | Bluetooth |
| 46 | United Kingdom | NHS COVID-19 App | Bluetooth, Google/Apple |
Figure 1Block diagram of the proposed pipeline for sentiment analysis of users’ feedback on COVID-19 contact tracing mobile apps, roughly divided into 2 components, namely (1) data set development and (2) experiments.
Figure 2Screenshot of the annotation platform.
Figure 3Taxonomy of main reasons or causes for the positive and negative reviews, as well as technical issues.
Common reasons for the feedback provided in the reviews (n=34,534).
| Type of feedback | Frequency of responses, n (%) |
| Positive reviews | |
| Easy to install or use | 1137 (7.3) |
| Useful, informative, and helpful | 5673 (36.4) |
| Likes the idea or initiative | 904 (5.8) |
| Working fine | 3226 (20.7) |
| No reason or other | 4645 (29.8) |
| Technical issues | |
| Registration issues | 4111 (43.3) |
| Update issues | 978 (10.3) |
| Frequent crashes | 1443 (15.2) |
| No reason or other | 2954 (31.1) |
| Negative reviews | |
| Power consumption | 1733 (21.2) |
| Privacy concerns | 1063 (13.0) |
| Useless | 2020 (24.7) |
| Not user-friendly | 1022 (12.5) |
| No reason or other | 2339 (28.6) |
Figure 4Distribution of negative, positive, and neutral reviews as well as technical issues reported for the applications in our data set, by country.
Figure 5Preliminary temporal analysis reflecting the changes in the distribution of the sentiment classes over time. The data were compiled by analyzing the 200 most recent reviews (as of December 25, 2020) and the initial 200 reviews from some of the applications that had a sufficient number of reviews.
Number of reviews of different lengths in the overall data set.
| Number of tokens | Reviews, n |
| 0-20 | 23,602 |
| 21-40 | 6611 |
| 41-60 | 2463 |
| 61-80 | 1068 |
| 81-100 | 664 |
| 101-120 | 68 |
| >120 | 60 |
The most frequent class-wise n-grams based on valence scores.
| Ranking | Negative | Positive | Technical issues |
| 1 | Battery went down | Best app | Error requesting |
| 2 | to delete this | Excellent app, | Cannot register. |
| 3 | overheating and battery | and helpful | what’s wrong |
| 4 | Not happy | Very nice and | fix this, |
| 5 | uninstall due to | feel safer | I can’t seem |
| 6 | Drains battery and | very good apps | Unable to proceed |
| 7 | massive drain | Good information | error while |
| 8 | too much battery, | save lives and | have error |
| 9 | Massive battery drain | very useful for | phone number. Tried |
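Class-wise n-gram rankings like the table above can be sketched by counting word n-grams within each class. This is a frequency-based proxy only; the paper ranks n-grams by a valence score whose exact formula is not reproduced here.

```python
from collections import Counter

def top_ngrams(texts, n=3, k=5):
    """Return the k most frequent word n-grams across a list of texts.
    A simple frequency proxy for the paper's valence-score ranking."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [gram for gram, _ in counts.most_common(k)]
```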
Data split and distribution of class labels for Task 1.
| Class | Train | Validation | Test | Total |
| Positive | 9370 | 1041 | 5176 | 15,587 |
| Negative | 5000 | 556 | 2622 | 8178 |
| Technical issues | 5686 | 632 | 3178 | 9496 |
| Total | 20,056 | 2229 | 10,976 | 33,261 |
Data split and distribution of class labels for Task 3.
| Class | Train | Validation | Test | Total |
| Positive | 9364 | 1040 | 5183 | 15,587 |
| Negative | 10,690 | 1188 | 5798 | 17,676 |
| Neutral | 770 | 85 | 416 | 1271 |
| Total | 20,824 | 2314 | 11,398 | 34,534 |
Hyperparameter settings used during the experiments.
| Parameters | Value |
| Batch size | 8 |
| Learning rate (Adam) | 2e-5 |
| Number of epochs | 10 |
| Max sequence length | 128 |
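The table above maps directly onto a transformer fine-tuning configuration. The key names below follow common fine-tuning conventions (e.g., the Hugging Face Trainer); the paper's exact training script is not available, so this is only a sketch of how the stated values would be wired in.

```python
# Hyperparameters from the table, expressed as a fine-tuning config.
FINE_TUNE_CONFIG = {
    "per_device_train_batch_size": 8,
    "learning_rate": 2e-5,        # Adam optimizer
    "num_train_epochs": 10,
    "max_seq_length": 128,        # reviews truncated/padded to 128 tokens
}
```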
Experimental results for Task 1: ternary classification of positive, negative, and technical issues (PNT).
| Method | Positive | | | Negative | | | Technical issues | | | Overall (weighted average) | | | |
| | Pa | Rb | F1 | P | R | F1 | P | R | F1 | Accc | P | R | F1 |
| MNBd | .910 | .892 | .901 | .679 | .664 | .671 | .751 | .789 | .769 | .808 | .809 | .808 | .808 |
| RFe | .854 | .923 | .887 | .809 | .538 | .646 | .729 | .833 | .777 | .805 | .806 | .805 | .797 |
| SVMf | .946 | .867 | .905 | .660 | .707 | .683 | .745 | .803 | .773 | .810 | .820 | .810 | .814 |
| fastText | .930 | .904 | .917 | .713 | .691 | .702 | .752 | .806 | .778 | .825 | .827 | .825 | .825 |
| DistilBERTg | .943 | .934 | .939 | .753 | .714 | .733 | .778 | .824 | .800 | .849 | .850 | .849 | .849 |
| BERT | .938 | .936 | .937 | .750 | .718 | .734 | .786 | .817 | .801 | .850 | .849 | .850 | .849 |
| RoBERTa | .943 | .946 | .945 | .754 | .716 | .734 | .788 | .817 | .802 | .854 | .853 | .854 | .853 |
| XLM-RoBERTa | .941 | .946 | .943 | .744 | .705 | .724 | .783 | .811 | .797 | .849 | .848 | .849 | .848 |
aP: precision.
bR: recall.
cAcc: accuracy.
dMNB: multinomial Naïve Bayes.
eRF: random forest.
fSVM: support vector machine.
gBERT: Bidirectional Encoder Representations from Transformers.
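The "Overall (weighted average)" columns are support-weighted means of the per-class scores, with the Task 1 test-split class counts as weights. A minimal sketch, using RoBERTa's per-class F1 values and the test supports from the Task 1 data-split table as a consistency check:

```python
def weighted_average(scores, supports):
    """Support-weighted mean of per-class scores, as in the
    'Overall (weighted average)' columns of the results tables."""
    total = sum(supports)
    return sum(s * w for s, w in zip(scores, supports)) / total

# RoBERTa Task 1 per-class F1 (positive, negative, technical issues),
# weighted by the Task 1 test-split supports (5176, 2622, 3178).
f1 = weighted_average([0.945, 0.734, 0.802], [5176, 2622, 3178])
print(round(f1, 3))  # 0.853, matching the table
```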
Experimental results for Task 2: binary classification (positive or negative [PN]).
| Method | Positive | | | Negative | | | Overall (weighted average) | | | |
| | Pa | Rb | F1 | P | R | F1 | Accc | P | R | F1 |
| MNBd | .925 | .873 | .898 | .891 | .936 | .913 | .906 | .907 | .906 | .906 |
| RFe | .902 | .879 | .891 | .894 | .914 | .904 | .898 | .898 | .898 | .898 |
| SVMf | .944 | .876 | .909 | .895 | .953 | .923 | .916 | .918 | .916 | .916 |
| fastText | .947 | .890 | .917 | .905 | .955 | .929 | .924 | .925 | .924 | .924 |
| DistilBERTg | .947 | .932 | .939 | .939 | .953 | .946 | .943 | .943 | .943 | .943 |
| BERT | .947 | .936 | .941 | .943 | .953 | .948 | .945 | .945 | .945 | .945 |
| RoBERTa | .948 | .942 | .945 | .948 | .953 | .951 | .948 | .948 | .948 | .948 |
| XLM-RoBERTa | .953 | .930 | .942 | .939 | .959 | .949 | .945 | .946 | .945 | .945 |
aP: precision.
bR: recall.
cAcc: accuracy.
dMNB: multinomial Naïve Bayes.
eRF: random forest.
fSVM: support vector machine.
gBERT: Bidirectional Encoder Representations from Transformers.
Experimental results for Task 3: ternary classification (positive, negative, or neutral [PNN]).
| Method | Positive | | | Negative | | | Neutral | | | Overall (weighted average) | | | |
| | Pa | Rb | F1 | P | R | F1 | P | R | F1 | Accc | P | R | F1 |
| MNBd | .902 | .873 | .888 | .854 | .935 | .892 | .379 | .027 | .050 | .874 | .859 | .874 | .860 |
| RFe | .875 | .881 | .878 | .862 | .916 | .888 | .333 | .005 | .010 | .866 | .844 | .866 | .851 |
| SVMf | .926 | .844 | .883 | .881 | .914 | .897 | .211 | .330 | .257 | .861 | .877 | .861 | .868 |
| fastText | .947 | .890 | .917 | .905 | .955 | .929 | .463 | .177 | .256 | .891 | .883 | .891 | .883 |
| DistilBERTg | .932 | .918 | .925 | .913 | .934 | .923 | .364 | .312 | .336 | .904 | .901 | .904 | .902 |
| BERT | .933 | .927 | .930 | .913 | .940 | .926 | .387 | .261 | .312 | .909 | .903 | .909 | .905 |
| RoBERTa | .933 | .931 | .932 | .919 | .941 | .930 | .386 | .269 | .317 | .912 | .906 | .912 | .909 |
| XLM-RoBERTa | .941 | .932 | .936 | .923 | .936 | .929 | .341 | .319 | .333 | .911 | .910 | .911 | .911 |
aP: precision.
bR: recall.
cAcc: accuracy.
dMNB: multinomial Naïve Bayes.
eRF: random forest.
fSVM: support vector machine.
gBERT: Bidirectional Encoder Representations from Transformers.
Figure 6Results of the statistical significance (McNemar) test comparing the different methods for Task 1. BERT: Bidirectional Encoder Representations from Transformers; MNB: multinomial Naïve Bayes; RF: random forest; SVM: support vector machine.
Figure 8Results of the statistical significance (McNemar) test comparing the different methods for Task 3. BERT: Bidirectional Encoder Representations from Transformers; MNB: multinomial Naïve Bayes; RF: random forest; SVM: support vector machine.
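The pairwise comparisons in these figures use McNemar's test, which looks only at the items where the two classifiers disagree. A minimal sketch of the chi-squared statistic with continuity correction (the paper does not state whether it used this variant or the exact binomial form, so that choice is an assumption):

```python
def mcnemar_statistic(b: int, c: int) -> float:
    """McNemar chi-squared statistic with continuity correction.
    b = test items only model A classified correctly;
    c = test items only model B classified correctly.
    Values >= 3.84 indicate a significant difference at
    alpha = .05 (chi-squared, 1 degree of freedom)."""
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)
```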
Figure 9Screenshot of the potential tool based on the proposed solutions.
Data split and distribution of class labels for Task 2.
| Class | Train | Validation | Test | Total |
| Positive | 9342 | 1038 | 5207 | 15,587 |
| Negative | 10,715 | 1191 | 5770 | 17,676 |
| Total | 20,057 | 2229 | 10,977 | 33,263 |