Feng Liang, Peng Guan, Wei Wu, Desheng Huang.
Abstract
BACKGROUND: Influenza epidemics pose significant social and economic challenges in China. Internet search query data have been identified as a valuable source for the detection of emerging influenza epidemics. However, the selection of the search queries and the adoption of prediction methods are crucial challenges when it comes to improving predictions. The purpose of this study was to explore the application of the Support Vector Machine (SVM) regression model in merging search engine query data and traditional influenza data.
Keywords: Baidu search query; Flu surveillance system; Liaoning; SVM regression model; Seasonal influenza
Year: 2018 PMID: 29967755 PMCID: PMC6022725 DOI: 10.7717/peerj.5134
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Pearson correlation coefficients between search terms from Baidu search engine and the number of influenza cases in Liaoning, for the time period January 2011–December 2015.
| Search terms | The same month | Lag one month | Lag two months | Lag three months | Non-flu season |
|---|---|---|---|---|---|
| Flu | 0.608 | 0.536 | 0.500 | 0.395 | 0.672 |
| Flu symptoms | 0.618 | 0.374 | 0.273 | 0.04 | 0.430 |
| Influenza type A | 0.489 | 0.134 | 0.005 | −0.287 | 0.048 |
| Influenza vaccine | 0.259 | 0.436 | 0.645 | 0.764 | 0.328 |
| Is it necessary to get vaccinated against the flu? | 0.103 | 0.362 | 0.626 | 0.814 | 0.255 |
| Flu virus | 0.621 | 0.435 | 0.282 | 0.175 | 0.711 |
| The symptom of flu | 0.656 | 0.438 | 0.209 | −0.054 | 0.235 |
| Influenza drugs | 0.639 | 0.431 | 0.218 | −0.116 | 0.337 |
| The symptoms of type A flu | 0.157 | 0.029 | −0.061 | −0.225 | 0.320 |
| Prevent flu | 0.623 | 0.644 | 0.663 | 0.511 | 0.374 |
| Swine flu | 0.371 | 0.290 | 0.172 | 0.075 | 0.459 |
| H1N1 flu | 0.021 | −0.157 | −0.231 | −0.409 | −0.302 |
| Beijing flu | 0.249 | 0.055 | 0.037 | −0.148 | 0.313 |
| Swine flu symptoms | 0.032 | 0.142 | −0.104 | −0.101 | −0.039 |
| How to prevent flu | 0.023 | −0.025 | 0.056 | 0.007 | −0.006 |
| Viral flu | 0.484 | 0.339 | 0.234 | −0.023 | 0.587 |
| How to prevent flu | 0.129 | 0.087 | 0.201 | 0.125 | −0.014 |
| Spanish flu | 0.459 | 0.479 | 0.405 | 0.409 | 0.491 |
| Flu prevention | 0.178 | 0.012 | −0.092 | −0.133 | 0.380 |
| Side effects of flu vaccine | 0.084 | 0.379 | 0.657 | 0.804 | 0.254 |
| The prevention measures of flu | −0.079 | −0.056 | 0.043 | 0.061 | −0.108 |
| Type A H1N1 flu | 0.089 | −0.024 | −0.281 | −0.362 | −0.379 |
| Flu therapy | 0.438 | 0.183 | 0.168 | −0.075 | 0.291 |
| The prevention of flu | −0.335 | −0.357 | −0.251 | −0.285 | −0.394 |
| Influenza epidemic | 0.09 | −0.075 | −0.207 | −0.375 | 0.108 |
| Influenza vaccine price | −0.273 | −0.185 | −0.002 | 0.126 | −0.124 |
| Type A flu | 0.383 | 0.326 | 0.025 | −0.193 | 0.508 |
| Type A flu virus | 0.586 | 0.395 | 0.276 | 0.002 | 0.369 |
| New type of flu | 0.352 | 0.430 | 0.037 | −0.076 | 0.387 |
| Type A influenza | 0.016 | −0.222 | −0.138 | −0.345 | −0.197 |
| Love flu strain | 0.054 | −0.165 | −0.166 | −0.296 | −0.425 |
| Flu concept stock | 0.266 | 0.226 | 0.011 | −0.072 | 0.349 |
| Seasonal influenza | −0.046 | −0.122 | −0.12 | −0.172 | 0.515 |
| Love flu | −0.300 | −0.310 | −0.350 | −0.368 | −0.114 |
| Type A H1N1 flu virus | −0.109 | −0.22 | −0.24 | −0.131 | −0.160 |
| New flu | −0.048 | −0.064 | −0.151 | −0.25 | −0.016 |
| How to treat swine flu | 0.223 | 0.547 | 0.476 | 0.381 | 0.200 |
| Influenza transmission route | 0.228 | 0.1 | −0.09 | −0.076 | 0.346 |
| The route of transmission of flu | 0.216 | −0.011 | −0.12 | −0.164 | 0.207 |
| Treatment program of A type H1N1 flu | −0.145 | −0.19 | −0.24 | −0.304 | 0.102 |
| Flu (space) symptom | 0.147 | 0.042 | −0.068 | −0.157 | 0.358 |
| H1N1 flu symptom | 0.346 | 0.196 | 0.141 | 0.064 | 0.260 |
| Sheep flu | 0.011 | −0.038 | −0.053 | −0.107 | 0.211 |
| Super flu | 0.314 | 0.248 | 0.133 | 0.16 | 0.172 |
| The symptom of swine flu | 0.134 | 0.071 | −0.025 | −0.124 | 0.341 |
| Taiwan flu | 0.124 | 0.066 | 0.104 | −0.188 | 0.374 |
Notes:
Indicates a P value that is statistically significant at the 0.05 level.
Indicates a P value that is statistically significant at the 0.01 level.
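
The lagged coefficients in the table above can be illustrated with a short sketch. It assumes that "lag k months" means search volume in month t is paired with case counts in month t + k; the excerpt does not state the direction of the lag, and the function names and the series below are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_pearson(search, cases, lag):
    """Correlate search volume in month t with cases in month t + lag.

    The lag direction is an assumption, not taken from the study.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(search, cases)
    # Drop the last `lag` search values and the first `lag` case values
    # so that the remaining pairs are offset by `lag` months.
    return pearson(search[:-lag], cases[lag:])


# Made-up monthly series, not data from the study.
search = [1, 2, 3, 4, 5, 6]
cases = [9, 2, 4, 6, 8, 10]
for k in range(3):
    print(k, round(lagged_pearson(search, cases, k), 3))
```

Each extra month of lag removes one pair of observations, so coefficients at longer lags rest on slightly fewer data points.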
Search terms strongly correlated with the number of influenza cases at different lag periods.
| Lag time | Search keywords |
|---|---|
| The same month | Flu, flu symptoms, influenza type A, flu virus, the symptoms of flu, influenza drugs, viral flu, flu therapy, type A flu virus |
| Lag one month | Spanish flu, new type of flu, how to treat swine flu |
| Lag two months | Prevent flu |
| Lag three months | Influenza vaccine, is it necessary to get vaccinated against the flu, H1N1 flu, side effects of flu vaccine |
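
One way to arrive at a grouping like the one above is to assign each term to the lag with the largest absolute coefficient. The excerpt does not state the rule the authors used, so the rule below is an assumption; the coefficients are copied from the correlation table for five of the terms, and `strongest_lag` is a hypothetical helper.

```python
LAGS = ["same month", "lag one month", "lag two months", "lag three months"]

# Coefficients copied from the correlation table (same month, lag 1, lag 2, lag 3).
CORR = {
    "Flu": [0.608, 0.536, 0.500, 0.395],
    "Spanish flu": [0.459, 0.479, 0.405, 0.409],
    "Prevent flu": [0.623, 0.644, 0.663, 0.511],
    "Influenza vaccine": [0.259, 0.436, 0.645, 0.764],
    "H1N1 flu": [0.021, -0.157, -0.231, -0.409],
}


def strongest_lag(coeffs):
    """Return (lag label, coefficient) for the largest absolute coefficient."""
    idx = max(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    return LAGS[idx], coeffs[idx]


for term, coeffs in CORR.items():
    label, value = strongest_lag(coeffs)
    print(f"{term}: {label} (r = {value})")
```

For these five terms the assumed rule gives the same lag as the table above, including "H1N1 flu", whose strongest coefficient is negative.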
The SVM model precision of different C values (γ = 0.05556, ε = 0.1).
| C | Training error | Test error |
|---|---|---|
| 0.0001 | 8,812.675 | 8,834.564 |
| 0.001 | 8,768.452 | 8,806.329 |
| 0.01 | 8,363.176 | 8,532.467 |
| 0.1 | 5,831.012 | 6,661.826 |
| 1 | 1,464.706 | 4,052.645 |
| 2 | 498.3551 | 3,900.983 |
| 3 | 215.0484 | 4,003.317 |
| 4 | 175.4402 | 4,116.998 |
| 5 | 147.7603 | 4,215.681 |
| 10 | 76.7374 | 4,756.99 |
| 100 | 71.55703 | 4,792.467 |
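
Reading the sweep above as a one-dimensional grid search, a candidate C can be picked by the lowest test error. The sketch below copies the test errors from the table; `best_setting` is a hypothetical helper, and the excerpt does not say which value the authors finally adopted.

```python
# Test errors copied from the table above (C -> test error),
# with gamma = 0.05556 and epsilon = 0.1 held fixed.
TEST_ERROR_BY_C = {
    0.0001: 8834.564,
    0.001: 8806.329,
    0.01: 8532.467,
    0.1: 6661.826,
    1: 4052.645,
    2: 3900.983,
    3: 4003.317,
    4: 4116.998,
    5: 4215.681,
    10: 4756.99,
    100: 4792.467,
}


def best_setting(errors):
    """Return (parameter value, error) for the smallest error in the mapping."""
    best = min(errors, key=errors.get)
    return best, errors[best]


best_c, best_err = best_setting(TEST_ERROR_BY_C)
print(best_c, best_err)
```

In the table, training error keeps falling as C grows (down to 71.55703 at C = 100) while test error rises again beyond C = 2, a pattern that usually points to overfitting at large C.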
The SVM model precision of different γ values (C = 1, ε = 0.1).
| γ | Training error | Test error |
|---|---|---|
| 0.0001 | 8,130.444 | 8,199.893 |
| 0.001 | 4,533.864 | 4,933.536 |
| 0.005 | 1,932.273 | 3,239.05 |
| 0.01 | 1,604.143 | 3,502.44 |
| 0.02 | 1,493.212 | 3,655.345 |
| 0.03 | 1,465.351 | 3,773.955 |
| 0.04 | 1,466.39 | 3,873.576 |
| 0.05 | 1,459.852 | 3,982.198 |
| 0.1 | 1,470.783 | 4,796.493 |
| 1 | 2,441.952 | 7,936.262 |
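
The role of γ can be sketched under the assumption that the model uses a radial basis function (RBF) kernel, which the excerpt does not confirm. In that kernel, γ scales the squared distance between two feature vectors, so it controls how quickly similarity decays; the function name and the input vectors below are illustrative only.

```python
from math import exp


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    if len(x) != len(z):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-gamma * squared_distance)


# Two made-up feature vectors at squared distance 2.
x, z = [0.0, 0.0], [1.0, 1.0]
for gamma in (0.0001, 0.005, 0.05, 1):
    print(gamma, rbf_kernel(x, z, gamma))
```

With a very small γ almost all points look alike and the fit is nearly flat; with a large γ only near-identical points count as similar and the model tracks the training data closely. That is consistent with the table, where test error is lowest at γ = 0.005 and worsens in both directions.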
The SVM model precision of different ε values (C = 1, γ = 0.05556).
| ε | Training error | Test error |
|---|---|---|
| 0.0001 | 1,388.189 | 3,985.145 |
| 0.001 | 1,388.479 | 3,986.231 |
| 0.01 | 1,392.287 | 3,993.167 |
| 0.05 | 1,412.391 | 4,027.2 |
| 0.08 | 1,440.697 | 4,051.417 |
| 0.09 | 1,452.264 | 4,052.576 |
| 0.1 | 1,464.717 | 4,052.645 |
| 0.2 | 1,539.904 | 4,060.86 |
| 0.3 | 1,709.969 | 4,153.16 |
| 0.4 | 1,965.82 | 4,346.04 |
| 0.5 | 2,361.972 | 4,643.752 |
| 1 | 5,522.041 | 7,408.229 |
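
In standard ε-SVM regression, ε sets the width of a tube around the prediction inside which residuals cost nothing. The excerpt does not define the loss, so the sketch below shows the textbook ε-insensitive loss as an assumption; the function name and the numbers are illustrative.

```python
def eps_insensitive_loss(y_true, y_pred, epsilon):
    """Per-observation loss max(0, |y - f(x)| - epsilon)."""
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Made-up values: the first residual falls inside the tube, the second outside.
losses = eps_insensitive_loss([10.0, 10.0], [10.05, 10.5], epsilon=0.1)
print(losses)
```

A wider tube ignores more of the residuals, which fits the table: both training and test error grow as ε increases from 0.0001 to 1.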
Prediction error metrics for three predictors: the number of influenza cases at a lag of one month, Baidu keywords, and the ensemble model, for the time period October 2014–December 2015.
| Data source | RMSE | RMSPE (%) | MAPE (%) |
|---|---|---|---|
| Influenza cases at a lag of one month | 82.874 | 40.658 | 35.150 |
| Baidu keywords | 43.472 | 30.438 | 26.806 |
| Ensemble model | 42.654 | 29.687 | 26.197 |
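
The three metrics in the table can be computed as in the sketch below. It uses the common definitions of RMSE, RMSPE and MAPE; the excerpt does not give the formulas, so treating them as the ones used in the study is an assumption, and the sample numbers are made up.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def rmspe(actual, predicted):
    """Root mean squared percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100 * sqrt(sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Made-up monthly case counts and predictions, not data from the study.
actual = [100, 200]
predicted = [110, 180]
print(rmse(actual, predicted), rmspe(actual, predicted), mape(actual, predicted))
```

RMSE is expressed in case counts, while RMSPE and MAPE are relative, so the percentage metrics give more weight to errors in months with few cases.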
Figure 1. The performance of the three available predictors.
Figure 2. The residuals of the three predictors.