Abstract
As social media has become more prevalent, its influence on business, politics, and society has become significant. Because large numbers of users can easily access and interact with one another, information diffuses in an epidemic style on the web. Understanding the mechanisms of information diffusion through these new publication methods is important for political and marketing purposes. Among social media, web forums, where people in online communities disseminate and receive information, provide a good environment for examining information diffusion. In this paper, we model topic diffusion in web forums using the susceptible-infected-recovered (SIR) model from epidemiology, which previous research has frequently used to analyze both disease outbreaks and knowledge diffusion. The model was evaluated on a large longitudinal dataset from the web forum of a major retail company and from a general political discussion forum. The fitting results showed that the SIR model is a plausible description of the diffusion process of a topic. This research shows that epidemic models can be extended to topic discussion on the web, particularly in social media such as web forums.
Keywords: Contagion; Epidemic model; Information diffusion; Social media; Web forum
Year: 2016 PMID: 26839759 PMCID: PMC4723377 DOI: 10.1186/s40064-016-1675-x
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
Previous research on information diffusion
| Key papers | Model specification | Applications | Contributions |
|---|---|---|---|
| Goffman and Newill ( | SIR, SIS | Scientific theory | The first analogy development between information and disease diffusion |
| Kawachi ( | SIR-variants | Rumor | The novel model with offsetting effect |
| Fan ( | SIR | Financial information | The novel model with content characteristics ideodynamics model |
| Shive ( | SIR | WOM of stock | Novel model with corporate financial information |
| Shtatland and Shtatland ( | SIR | Financial information | Outbreak detection using the diffusion model |
| Goldenberg et al. ( | SIR | Word of mouth (WOM) | The network effects on WOM |
| Bampo et al. ( | ICM | WOM | The network effects on WOM |
| Gruhl et al. ( | ICM | Blog | The empirical test |
| Saito et al. ( | ICM | | The method to estimate infection rate |
| Leskovec et al. ( | Network SIS | Blog | The empirical test |
| Kubo et al. ( | SIR | Web forum | The analogy development between topic diffusion in the web forum and disease spread |
| Toole et al. ( | Network SIS | | The novel model with geolocation information, the empirical test |
| Myers et al. ( | ICM | | The novel model with external effect |
| Tang et al. ( | Network SI | Chinese Twitter | The empirical test |
| Liu and Zhang ( | ICM | Syntactic data | The novel model with rewiring friendship |
| Wang et al. ( | Network SI | | The novel model, emotion-based spreader-ignorant-stifler (ESIS) model |
Fig. 1 Transition diagram of the SIR model in web forums
The analogy between epidemics and topic diffusion in the web forum
| Elements of SIR model | Epidemics | Topic diffusion in web forums |
|---|---|---|
| What flows | Disease | Idea/topic (keywords) |
| Susceptible: S(t) | People who can have contact with an infective and possibly will become infected | Possible authors (including commenters) who might read posts on a topic |
| Infective: I(t) | People who have a disease and possibly will infect others | Current authors who write posts on a topic |
| Recovered: R(t) | People who recover from a disease and lose the power to infect others | Past authors whose posts lose influence toward others |
| Infection rate: | The probability of transmission in a contact between an infective and a susceptible | The probability of writing a comment or thread after reading posts on the topic |
| Recovery rate: | The probability that the infective becomes recovered | The probability that posts lose infectivity |
| Recruitment rate: | The proportional increase rate of the population | The proportional increase rate of author pools |
| Carrying capacity: K | The maximum population that the environment can support | The highest value of the total authors that a topic can recruit |
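
The analogy in the table above can be sketched as a small simulation. This is a minimal sketch, assuming a standard mass-action SIR form with logistic recruitment into the susceptible pool; the exact equations are not given in this section. The function name `simulate_sir`, the parameter names `beta`, `gamma`, and `mu`, the Euler step `dt`, and the example values are illustrative assumptions, not the authors' implementation.

```python
def simulate_sir(s0, i0, r0, beta, gamma, mu, k, steps, dt=0.1):
    """Euler integration of an SIR model with logistic recruitment.

    Assumed form (not taken from the source):
        dS/dt = mu * N * (1 - N / K) - beta * S * I
        dI/dt = beta * S * I - gamma * I
        dR/dt = gamma * I
    where N = S + I + R is the total author pool.
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        n = s + i + r
        recruits = mu * n * (1.0 - n / k)  # new possible authors joining the pool
        new_authors = beta * s * i         # readers who start writing on the topic
        faded = gamma * i                  # authors whose posts lose influence
        s += dt * (recruits - new_authors)
        i += dt * (new_authors - faded)
        r += dt * faded
        history.append((s, i, r))
    return history


# Illustrative values only; I(0) = 1 is an assumption.
trajectory = simulate_sir(s0=89, i0=1, r0=0, beta=0.0088, gamma=0.7433,
                          mu=0.1324, k=800, steps=1000)
peak_current_authors = max(i for _, i, _ in trajectory)
```

In this sketch the total pool S + I + R grows logistically toward the carrying capacity K, while the number of current authors I rises only when `beta * S` exceeds `gamma`.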
Fig. 2 SIRW system design
Fig. 3 Trend of author participation
Fig. 4 Spikey topic vs. chatter topic
The major topics and keywords in the Walmart forum
| Topic group | Topic | Keywords |
|---|---|---|
| Investor | Stock price | Growth, share, earnings, price, stock, market |
| | Sales | Sales, percent, quarter, increase, fiscal, earnings, expected, results |
| Customer | Low price | Prices, low, economy, consumer, cost, market |
| | Shopping convenience | Shopping, items, manager, shoppers, service, line, door, experience |
| Employee | Healthcare | Healthcare, employees, insurance, medical, plan |
| | Labor law | Labor, illegal, federal, laws, violations, rights |
| | Wage | Pay, wages, benefits, employees, hour, working, paid, average, hours, minimum, poverty, paying |
Fig. 5 The time-series patterns of selected Walmart topics
Parameter estimation results on the Walmart forum
| Topic | MSE | | S(0) | | | | K |
|---|---|---|---|---|---|---|---|
| Stock price | 5.28E+03 | 0.6198 | 163 | 0.0045 | 0.6798 | 0.1226 | 1384 |
| Sales | 2.72E+03 | 0.6320 | 100 | 0.0081 | 0.7270 | 0.1388 | 997 |
| Low price | 3.64E+03 | 0.7262 | 122 | 0.0059 | 0.7506 | 0.1419 | 1401 |
| Shopping convenience | 1.98E+03 | 0.6433 | 116 | 0.0078 | 0.7914 | 0.1230 | 1000 |
| Healthcare | 3.83E+03 | 0.7190 | 116 | 0.0065 | 0.7677 | 0.1361 | 1200 |
| Labor law | 1.16E+03 | 0.7510 | 89 | 0.0088 | 0.7433 | 0.1324 | 800 |
| Wage | 6.55E+03 | 0.5209 | 100 | 0.0053 | 0.6000 | 0.1524 | 950 |
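
The MSE column reports how closely the fitted model tracks the observed series. The estimation procedure itself is not described in this section, so the following is only a minimal sketch, assuming a basic SIR model without recruitment and a plain grid search over the infection rate. The helper names `simulate_infective`, `mse`, and `fit_beta`, and all numeric values, are hypothetical.

```python
def simulate_infective(s0, i0, beta, gamma, steps, dt=0.1):
    """Return I(t) at each Euler step of a basic SIR model (assumed form)."""
    s, i = float(s0), float(i0)
    series = [i]
    for _ in range(steps):
        new_authors = beta * s * i
        s -= dt * new_authors
        i += dt * (new_authors - gamma * i)
        series.append(i)
    return series


def mse(observed, predicted):
    """Mean squared error between two equally long series."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def fit_beta(observed, s0, i0, gamma, candidates, dt=0.1):
    """Pick the candidate infection rate with the lowest MSE (grid search)."""
    steps = len(observed) - 1
    return min(
        candidates,
        key=lambda b: mse(observed, simulate_infective(s0, i0, b, gamma, steps, dt)),
    )


# Synthetic "observed" data generated from a known rate, for illustration only.
observed = simulate_infective(s0=100, i0=1, beta=0.008, gamma=0.5, steps=50)
best_beta = fit_beta(observed, s0=100, i0=1, gamma=0.5,
                     candidates=[0.004, 0.006, 0.008, 0.010])
```

A real fit would estimate several parameters at once, typically with a numerical optimizer rather than a grid; the grid search here only illustrates the role of MSE as the fitting criterion.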
Fig. 6 The estimated and real values of the SIR model on the topic of labor law
The major topics and keywords in the US Politics Online forum
| Topic group | Topic | Keywords |
|---|---|---|
| International issue | Nuclear weapon | Iran, nuclear, weapons, United States, Ahmadinejad, Russia |
| | Iraq war | Iraq, war, troops, Iraqi, military, forces, security, government |
| Domestic issue | Healthcare bill | Tax, healthcare, plan, pay, cost, insurance, income, program |
| Election issue | McCain | McCain, campaign, Palin, John, Governor, Presidential, Sarah |
| | Obama | Obama, president, Barack, presidential |
Fig. 7 The time-series patterns of selected political topics
Fig. 8 The estimated and real values of the SIR model on the topics of Obama and McCain
Parameter estimation results on the political forum
| Topic | MSE | | S(0) | | | | K |
|---|---|---|---|---|---|---|---|
| Nuclear weapon | 9.90E+03 | 0.4379 | 142.9 | 0.0076 | 0.9500 | 0.2670 | 931.8 |
| Iraq war | 9.27E+03 | 0.4739 | 166.4 | 0.0062 | 0.9115 | 0.2565 | 861.4 |
| Healthcare bill | 6.81E+02 | 0.5761 | 21.7 | 0.0180 | 0.5696 | 0.2995 | 208.7 |
| Barack Obama | 8.65E+03 | 0.7929 | 67.2 | 0.0039 | 0.2212 | 0.0937 | 1022.7 |
| John McCain | 8.20E+03 | 0.7190 | 140.4 | 0.0034 | 0.8078 | 0.2232 | 709.3 |