Qingyu Yuan, Elaine O Nsoesie, Benfu Lv, Geng Peng, Rumi Chunara, John S Brownstein.
Abstract
Several approaches have been proposed for near real-time detection and prediction of the spread of influenza. These include search query data for influenza-related terms, which have been explored as a tool for augmenting traditional surveillance methods. In this paper, we present a method that uses Internet search query data from Baidu to model and monitor influenza activity in China. The objectives of the study are to present a comprehensive technique for: (i) keyword selection, (ii) keyword filtering, (iii) index composition and (iv) modeling and detection of influenza activity in China. The sequential time series for the selected composite keyword index is significantly correlated with Chinese influenza case data. In addition, one-month-ahead prediction of influenza cases for the first eight months of 2012 has a mean absolute percent error of less than 11%. To our knowledge, this is the first study on the use of search query data from Baidu in conjunction with this approach for estimation of influenza activity in China.
Year: 2013 PMID: 23750192 PMCID: PMC3667820 DOI: 10.1371/journal.pone.0064323
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Influenza case data from China’s MOH.
| Month | ICD | Month | ICD | Month | ICD | Month | ICD | Month | ICD |
| 2009–03 | 8015 | 2009–12 | 29977 | 2010–09 | 5114 | 2011–06 | 3065 | 2012–03 | 21625 |
| 2009–04 | 6794 | 2010–01 | 10415 | 2010–10 | 4121 | 2011–07 | 2654 | 2012–04 | 10707 |
| 2009–05 | 7769 | 2010–02 | 6595 | 2010–11 | 5323 | 2011–08 | 3243 | 2012–05 | 8520 |
| 2009–06 | 7999 | 2010–03 | 8488 | 2010–12 | 6529 | 2011–09 | 4360 | 2012–06 | 6195 |
| 2009–07 | 7791 | 2010–04 | 6357 | 2011–01 | 6072 | 2011–10 | 5525 | 2012–07 | 6738 |
| 2009–08 | 14548 | 2010–05 | 3865 | 2011–02 | 5930 | 2011–11 | 7055 | 2012–08 | 6793 |
| 2009–09 | 43596 | 2010–06 | 2642 | 2011–03 | 7299 | 2011–12 | 11631 | | |
| 2009–10 | 25132 | 2010–07 | 2627 | 2011–04 | 5727 | 2012–01 | 10046 | | |
| 2009–11 | 43018 | 2010–08 | 3588 | 2011–05 | 4130 | 2012–02 | 17421 | | |
ICD is the abbreviation for influenza case data.
Complete keyword list.
| Keyword (Chinese) | English translation |
| 流感 | flu |
| 新流感 | new flu |
| h1n1流感 | h1n1 flu |
| 本山快乐营猪流感 | |
| 流感吃什么药 | influenza drugs |
| 甲型流感疫苗 | type a influenza vaccine |
| 流感疫苗 | influenza vaccine |
| 预防流感知识 | knowledge of influenza prevention |
| 上流感 | |
| 季节性流感疫苗 | seasonal influenza vaccine |
| 甲型流感症状 | type a flu symptom |
| 甲型h1n1流感的症状 | the symptoms of type a h1n1 flu |
| 关颖 上流感 | |
| 甲型流感 | type a flu |
| 流感概念股 | |
| 流感疫苗价格 | the price of influenza vaccine |
| 如何预防猪流感 | how to prevent swine flu |
| 甲型h1n1流感防治 | type h1n1 influenza prevention and control |
| 如何预防流感 | how to prevent flu |
| 流感的预防措施 | prevention measures of influenza |
| 甲型h1n1流感防控 | the prevention of h1n1 flu |
| 甲流感 | h1n1 influenza |
| 季节性流感 | seasonal flu |
| 甲型h1n1流感资料 | type a h1n1 flu information |
| 甲型h1n1流感预防 | prevention of h1n1 flu |
| 流感预防措施 | the prevention measures of influenza |
| 甲型流感的预防 | prevention of type a flu |
| 流感症状 | influenza symptom |
| 甲型h1n1流感症状 | influenza symptom of h1n1 |
| 预防甲型h1n1流感 | prevention of type a h1n1 flu |
| 流感病毒 | influenza virus |
| 流感疫情 | Flu epidemic |
| 预防流感 | prevent the flu |
| 流感传播途径 | transmission way of flu |
| 甲型h1n1流感疫苗 | influenza vaccine of h1n1 |
| 流感的传播途径 | the transmission way of flu |
| 流感大流行 | influenza pandemic |
| 流感的症状 | the influenza symptom |
| 流感疫苗副作用 | Influenza vaccine side effects |
| 甲流感症状 | symptom of h1n1 flu |
| 流感预防 | prevent influenza |
| 预防甲型流感 | prevention of type a flu |
| 甲型h1n1流感疫情 | epidemic situation of type a h1n1 flu |
| 流感治疗 | flu treatment |
| 人感染猪流感症状 | Human infection with swine flu symptoms |
| a型流感 | type a influenza |
| 甲型h1n1流感病毒 | type a h1n1 flu virus |
| h1n1流感预防 | h1n1 influenza prevention |
| H1n1流感 | h1n1 flu |
Note: Web users use Chinese characters to search in Baidu. Keywords in English are listed to show the corresponding translation of each Chinese character. The keywords in bold are excluded at filtering step (i). The keywords in italics are excluded at filtering step (ii) and keywords in bold and italics are excluded at filtering step (iii).
Keywords in composite index.
| Chinese | English | Correlation |
| 流感预防 | prevent influenza | 0.93 |
| 流感的症状 | the influenza symptom | 0.92 |
| 甲型流感疫苗 | type a influenza vaccine | 0.90 |
| 流感症状 | flu symptom | 0.87 |
| 流感疫情 | Flu epidemic | 0.85 |
| 流感病毒 | influenza virus | 0.63 |
| 流感大流行 | influenza pandemic | 0.57 |
| a型流感 | type a influenza | 0.40 |
Figure 1. Influenza case data and composite search index.
Statistical results for model [2].
| Variable | Coefficient | Std. Error | t-Statistic | Prob. | R-squared | Durbin-Watson stat |
| | 0.253 | 0.015 | 17.455 | <0.001 | 0.950 | 1.887 |
| | −0.138 | 0.044 | −3.159 | 0.0036 | | |
| | 0.555 | 0.157 | 3.534 | 0.0013 | | |
| Residual ADF t-Stat | MacKinnon threshold (1%) | MacKinnon threshold (5%) | MacKinnon threshold (10%) | Prob* | Result | |
| −5.685 | −3.654 | −2.957 | −2.617 | <0.001 | stationary | |
Note: ADF is the abbreviation for augmented Dickey-Fuller Test. ICD represents influenza case data.
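The "stationary" result in the ADF row follows from a simple comparison, which the short sketch below illustrates. It assumes the standard left-tailed decision rule for the augmented Dickey-Fuller test (reject the unit-root null when the test statistic is below the critical value); the variable and function names are illustrative and are not taken from the paper.

```python
# Values copied from the residual ADF row of the table above.
adf_t_stat = -5.685
critical_values = {"1%": -3.654, "5%": -2.957, "10%": -2.617}


def rejects_unit_root(t_stat, critical_value):
    """Left-tailed rule: reject the unit-root null if the statistic
    is more negative than the critical value."""
    return t_stat < critical_value


# One decision per significance level.
decisions = {
    level: rejects_unit_root(adf_t_stat, cv)
    for level, cv in critical_values.items()
}

# The residual is treated as stationary here only if the null is
# rejected at every listed level (an assumption made for this sketch).
is_stationary = all(decisions.values())

for level, rejected in decisions.items():
    print(f"{level}: reject unit root = {rejected}")
print("stationary:", is_stationary)
```

Because −5.685 lies below all three MacKinnon thresholds, the unit-root null is rejected at the 1%, 5% and 10% levels, which is consistent with the result reported in the table.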
Figure 2. Plot of influenza cases, fitted values and prediction based on model [2].
Predicted values and error.
| Month | Actual value | Predicted value | Absolute error | Percent absolute error |
| 01–2012 | 10046 | 10230 | 184 | 1.8% |
| 02–2012 | 17421 | 14578 | 2843 | 16.3% |
| 03–2012 | 21625 | 18429 | 3196 | 14.8% |
| 04–2012 | 10707 | 11785 | 1078 | 10.1% |
| 05–2012 | 8520 | 8618 | 98 | 1.2% |
| 06–2012 | 6195 | 6621 | 426 | 6.9% |
| 07–2012 | 6738 | 5240 | 1498 | 22.2% |
| 08–2012 | 6793 | 5983 | 810 | 11.9% |
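The error columns above, and the mean absolute percent error quoted in the abstract, can be recomputed directly from the actual and predicted values. The following is a minimal sketch using only the numbers in this table; the names `rows` and `absolute_percent_error` are illustrative, and the month labels are written as year-month strings for convenience.

```python
# (month, actual cases, predicted cases), copied from the table above.
rows = [
    ("2012-01", 10046, 10230),
    ("2012-02", 17421, 14578),
    ("2012-03", 21625, 18429),
    ("2012-04", 10707, 11785),
    ("2012-05", 8520, 8618),
    ("2012-06", 6195, 6621),
    ("2012-07", 6738, 5240),
    ("2012-08", 6793, 5983),
]


def absolute_percent_error(actual, predicted):
    """Absolute error as a percentage of the actual value."""
    return abs(actual - predicted) / actual * 100.0


absolute_errors = [abs(a - p) for _, a, p in rows]
percent_errors = [absolute_percent_error(a, p) for _, a, p in rows]

# Mean absolute percent error over the eight months.
mape = sum(percent_errors) / len(percent_errors)

for (month, _, _), err, pct in zip(rows, absolute_errors, percent_errors):
    print(f"{month}: absolute error = {err}, percent error = {pct:.1f}%")
print(f"MAPE = {mape:.1f}%")
```

The recomputed per-month values match the table after rounding to one decimal place, and the mean comes to roughly 10.6%, in line with the abstract's statement that the error is below 11%.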