Swati Agarwal, Sayantani Sarkar.
Abstract
The countrywide lockdown imposed in India from March 2020 to May 2020 was a major step in dealing with the COVID-19 pandemic. The lockdown decision adversely affected the urban migrant population, and a large section of it was compelled to leave urban areas for native places. This reverse migration garnered widespread coverage in electronic as well as print media. The present study focuses on the coverage of the issue by print media, using descriptive natural language text mining. The study uses topic modelling, clustering, and sentiment analysis to examine articles on migration issues published during the lockdown period in two leading English-language newspapers in India: The Times of India and The Hindu. The sentiment analysis results indicate that the majority of articles have neutral sentiment, while very few show highly negative or positive polarity. Descriptive topic modelling with Bag of Words and TF-IDF models shows that transport, food security, special services, and employment, in connection with migration and migrants, are the most heavily covered topics. Agglomerative hierarchical clustering is used to group the article titles by similar traits.
Year: 2022 PMID: 35176059 PMCID: PMC8853493 DOI: 10.1371/journal.pone.0263787
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Illustrating the distribution of words across articles.
Fig 2. Illustrating the coherence score for different numbers of topics for the Bag of Words and TF-IDF models employed on articles and summaries.
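A coherence-versus-topic-count comparison of this kind can be sketched in a few lines. The record does not say which coherence measure the study used, so the sketch below assumes UMass coherence as a stand-in; the function names and the toy documents are hypothetical and serve only to show the mechanics.

```python
import math

def umass_coherence(top_words, documents):
    """UMass-style coherence of one topic (an assumed stand-in measure).

    Sums log((D(earlier, later) + 1) / D(earlier)) over ordered word pairs,
    where D counts the documents containing all the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_count(*words):
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for later_idx in range(1, len(top_words)):
        for earlier_idx in range(later_idx):
            earlier, later = top_words[earlier_idx], top_words[later_idx]
            base = doc_count(earlier)
            if base == 0:
                continue  # word never occurs in the corpus; skip the pair
            score += math.log((doc_count(earlier, later) + 1) / base)
    return score

def mean_coherence(topics, documents):
    """Average coherence over all topics of one candidate model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)

# Toy tokenised documents (illustrative only, not the study's data).
toy_docs = [
    ["train", "railway", "station", "passenger"],
    ["train", "station", "ticket"],
    ["food", "ration", "shelter"],
    ["food", "water", "shelter", "family"],
]

# Two hypothetical candidate models: one that separates the themes
# and one that mixes them into a single topic.
separated = [["train", "station"], ["food", "shelter"]]
mixed = [["train", "food"]]

for name, topics in [("separated", separated), ("mixed", mixed)]:
    print(name, round(mean_coherence(topics, toy_docs), 3))
```

In practice one would fit a model for each candidate number of topics, compute the mean coherence of each, and pick the count where the score peaks or levels off.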
Fig 3. Various linkage metrics for hierarchical clustering.
The dashed lines are the linkages between clusters, and the highlighted edge shows the optimal linkage for clustering.
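To make the role of the linkage criterion concrete, here is a minimal sketch of agglomerative clustering in plain Python. The record does not list which linkage criteria were compared, so single, complete, and average linkage are assumed here; the 2-D points are hypothetical stand-ins for headline feature vectors.

```python
import math

# Each linkage criterion reduces the list of pairwise point distances
# between two clusters to a single cluster-to-cluster distance.
LINKAGES = {
    "single": min,                             # closest pair
    "complete": max,                           # farthest pair
    "average": lambda ds: sum(ds) / len(ds),   # mean over all pairs
}

def agglomerate(points, n_clusters, linkage="average"):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns clusters as sorted lists of point indices.
    """
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dists = [
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                ]
                d = combine(dists)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical 2-D vectors: two tight pairs and one outlier.
toy_points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]

for name in LINKAGES:
    print(name, agglomerate(toy_points, 3, linkage=name))
```

On well-separated data such as this toy set, the criteria agree; they tend to diverge on elongated or overlapping groups, which is why comparing them before fixing one is a common step.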
Fig 4. Illustrating the variation in sentiment scores for articles.
The positive, negative, and neutral sentiments for each article reveal the ratio of polarities within an article.
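The idea of per-article polarity ratios can be illustrated with a toy lexicon-based scorer. This is only a sketch under stated assumptions: the record does not name the sentiment tool used in the study, and the two word lists below are hypothetical, far smaller than any real sentiment lexicon.

```python
# Hypothetical miniature lexicons, for illustration only.
POSITIVE = {"help", "relief", "support", "safe"}
NEGATIVE = {"stranded", "hunger", "death", "crisis"}

def polarity_ratios(text):
    """Return the share of positive, negative, and neutral tokens in text.

    The three shares sum to 1 for any non-empty text.
    """
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return {"pos": 0.0, "neg": 0.0, "neu": 0.0}
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = len(tokens)
    return {
        "pos": pos / total,
        "neg": neg / total,
        "neu": (total - pos - neg) / total,
    }

sample = "Volunteers help stranded workers reach the station."
print(polarity_ratios(sample))
```

Because most words in news text carry no lexicon entry, the neutral share usually dominates in a scheme like this one.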
Fig 5. Illustrating the violin plot for the distribution of sentiment scores across the experimental dataset.
The width of the plot at each point shows the density estimate of articles having that polarity score. In addition to density and distribution, the violin plot also shows the inter-quartile summaries of the sentiment scores.
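The inter-quartile summary that a violin plot overlays on the density can be computed directly. The sketch below uses Python's standard `statistics` module; the score list is made up for illustration and is not taken from the study.

```python
import statistics

def quartile_summary(scores):
    """Return first quartile, median, third quartile, and IQR of the scores."""
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}

# Hypothetical compound polarity scores in [-1, 1], illustrative only.
toy_scores = [-0.6, -0.2, 0.0, 0.0, 0.1, 0.3, 0.7]

print(quartile_summary(toy_scores))
```

A narrow inter-quartile range centred near zero would be consistent with a corpus where most articles are close to neutral.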
Illustrating the words describing various topics identified using BoW model for both article text and summaries.
| Topic | Words |
| --- | --- |
| Topic 1 | case, quarantine, test, covid, centre, health, return, report, number, person |
| Topic 2 | food, help, group, family, water, work, money, woman, time, shelter |
| Topic 3 | court, minister, issue, order, centre, union, congress, party, bench, secretary |
| Topic 4 | village, school, family, year, return, work, child, labourer, shelter, mumbai |
| Topic 5 | travel, transport, border, arrange, jharkhand, place, return, odisha, administration, arrangement |
| Topic 6 | work, employment, return, lakh, scheme, department, minister, number, migration, sector |
| Topic 7 | police, truck, station, road, labourer, vehicle, driver, spot, highway, place |
| Topic 8 | work, labourer, construction, return, demand, industry, city, site, unit, contractor |
| Topic 9 | train, railway, station, passenger, shramik, bihar, board, ticket, reach, official |

| Topic | Words |
| --- | --- |
| Topic 1 | work, district, case, return, covid, labourer, state, construction, number, industry |
| Topic 2 | state, court, health, department, issue, district, order, test, centre, government |
| Topic 3 | home, district, labourer, bihar, place, shelter, city, administration, jharkhand, demand |
| Topic 4 | police, district, labourer, group, truck, home, border, village, road, woman |
| Topic 5 | state, government, minister, home, return, lakh, strand, country, congress, union |
| Topic 6 | train, railway, station, shramik, passenger, board, district, reach, bihar, ticket |
| Topic 7 | food, family, centre, help, water, ration, quarantine, member, home, month |
Illustrating the words describing various topics identified using TF-IDF model for both article text and summaries.
| Topic | Words |
| --- | --- |
| Topic 1 | transfer, account, bank, gandhi, hail, application, certificate, accident, meeting, death |
| Topic 2 | minister, party, congress, industry, employment, work, sector, leader, project, lakh |
| Topic 3 | ration, card, distribute, rice, distribution, packet, food, supply, volunteer, meal |
| Topic 4 | test, case, quarantine, hospital, health, centre, covid, report, sample, person |
| Topic 5 | train, railway, station, passenger, shramik, board, bihar, flight, ticket, travel |
| Topic 6 | police, court, truck, work, family, shelter, labourer, food, village, group |

| Topic | Words |
| --- | --- |
| Topic 1 | government, state, minister, case, court, return, lakh, work, issue, covid |
| Topic 2 | train, station, railway, district, bihar, shramik, passenger, board, border, arrange |
| Topic 3 | police, food, family, village, shelter, help, group, work, truck, labourer |
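
The difference between the two term-weighting schemes behind these tables can be shown on a toy corpus. The sketch below is a minimal illustration, not the study's pipeline: the exact TF-IDF variant used by the authors is not stated here, so a plain relative-frequency times natural-log inverse document frequency form is assumed, and the documents are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(documents):
    """Raw term counts per document."""
    return [Counter(doc) for doc in documents]

def tf_idf(documents):
    """Relative term frequency times log(N / document frequency).

    This is one common TF-IDF variant, assumed here for illustration.
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weighted = []
    for doc in documents:
        counts = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted

# Hypothetical tokenised articles; "labourer" appears in every one.
toy_corpus = [
    ["labourer", "train", "station", "train"],
    ["labourer", "food", "ration"],
    ["labourer", "police", "truck"],
]

bow = bag_of_words(toy_corpus)
weights = tf_idf(toy_corpus)
print(bow[0])
print(weights[0])
```

Under Bag of Words a term that occurs in every document keeps its full count, while under this TF-IDF variant its weight drops to zero. That is one plausible reason why broad terms such as "work" and "return" recur across the Bag of Words topics, whereas the TF-IDF topics surface narrower vocabulary such as "ration", "card", and "rice".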
Fig 6. Illustrating the top 30 salient terms in the first topic identified using the TF-IDF model employed on articles.
The bubble chart visualises the overlap among the topics plotted in a two-dimensional (component) space.
Fig 7. Illustrating the top 30 salient terms in the first topic identified using the TF-IDF model employed on the articles' summaries.
The bubble chart visualises the overlap among the topics plotted in a two-dimensional (component) space.
Fig 8. Illustrating the full dendrogram produced by agglomerative hierarchical clustering of news headlines.
Fig 9. Illustrating the partial dendrogram obtained from agglomerative hierarchical clustering, showing the clusters up to 5 levels from the root.