Abstract
Coronavirus disease (COVID-19) has evolved into a pandemic with many unknowns. Houston, located in Harris County, Texas, is becoming the next hotspot of this pandemic. With a severe decline in international and inter-state travel, a model at the county level is needed rather than one at the state or country level. Existing approaches have a few drawbacks. First, they use the number of COVID-19 positive cases instead of test positivity. The former depends on the number of tests carried out, whereas the latter is normalized by the number of tests. Positivity gives a better picture of the spread of the pandemic because more tests are being administered over time. A positivity under 5% has been the desired threshold for reopening businesses to almost 100% capacity. Second, the data used by models like SEIRD (Susceptible, Exposed, Infectious, Recovered, and Deceased) lack information about public sentiment concerning the coronavirus. Third, models that rely on social media posts may be affected by too much noise and misinformation. News sentiment, on the other hand, can capture long-term effects of hidden variables like public policy, opinions of local doctors, and disobedience of state-wide mandates. The present study introduces a new artificial intelligence (AI) model, Sentiment Informed Time-series Analyzing AI (SITALA), trained on COVID-19 test positivity data and news sentiment from over 2750 news articles for Harris County. The news sentiment was obtained using IBM Watson Discovery News. SITALA is inspired by the Google-Wavenet architecture and makes use of TensorFlow. The mean absolute error for the training dataset of 66 consecutive days is 2.76, and that for the test dataset of 22 consecutive days is 9.6. A cone of uncertainty is provided within which future COVID-19 test positivity has been shown to fall with high accuracy. The model predictions fare better than those of a published Bayesian-based SEIRD model.
The model forecasts that, in order to curb the spread of coronavirus in Houston, a sustained negative news sentiment (e.g., death count for COVID-19 will grow at an alarming rate in Houston if mask orders are not followed) will be desirable. Public policymakers may use SITALA to set the tone of the local policies and mandates.
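
The distinction the abstract draws between case counts and positivity can be illustrated with a minimal sketch. The function name `positivity_percent` and the daily counts below are hypothetical and are not taken from the study; they only show why a count that grows with testing volume can leave positivity unchanged.

```python
def positivity_percent(positive_tests: int, total_tests: int) -> float:
    """Return test positivity as a percentage of the tests administered."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return 100.0 * positive_tests / total_tests


# Hypothetical daily counts (not from the study): testing triples, so the raw
# case count triples as well, yet the positivity stays the same.
early = positivity_percent(positive_tests=50, total_tests=1000)
later = positivity_percent(positive_tests=150, total_tests=3000)
print(early, later)  # 5.0 5.0
```

In this made-up case both days sit exactly at the 5% level mentioned above, even though the second day reports three times as many positive cases.
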
Keywords: Artificial intelligence; COVID-19 model; Deep learning; News sentiment; Pandemic forecast; Public policy
Year: 2021 PMID: 33942002 PMCID: PMC8081574 DOI: 10.1016/j.eswa.2021.115104
Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 6.954
Fig. 1. Title: Sample query output from IBM Watson Discovery News. Description: The present study used this same query to determine the news sentiment (-1 implies maximum negative sentiment and +1 implies maximum positive sentiment) over varying publication dates.
Fig. 2. Title: Architecture of SITALA. Description: SITALA is a sequential model that takes a window of a multivariate time series (viz., COVID test positivity and news sentiment from IBM Watson Discovery News) and outputs the COVID test positivity at the next timestep. The architecture is inspired by the Google-Wavenet architecture (Oord et al., 2016), viz., dilated causal convolutions. In the present study, a window size of 16 and dilations of 1, 2, 4, and 8 were used. The window size of 16 was chosen based on the incubation period of coronavirus as reported by Lauer et al. (2020).
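
The windowing described in the Fig. 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the author's data pipeline: the helper `make_windows`, the column order (positivity first, sentiment second), and the synthetic data are all hypothetical. The last lines also compute the receptive field of four stacked causal convolutions with kernel size 2 and dilations 1, 2, 4, 8.

```python
import numpy as np


def make_windows(series: np.ndarray, window: int = 16):
    """Split a (timesteps, features) array into model inputs and targets.

    X[i] holds `window` consecutive timesteps of all features;
    y[i] is feature 0 (test positivity) at the timestep right after X[i].
    """
    n_samples = len(series) - window
    if n_samples <= 0:
        raise ValueError("series must contain at least window + 1 timesteps")
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    y = series[window:, 0]
    return X, y


# Synthetic stand-in data: 30 days, 2 features (positivity, sentiment).
rng = np.random.default_rng(0)
demo = rng.uniform(0.0, 1.0, size=(30, 2))
X, y = make_windows(demo, window=16)
print(X.shape, y.shape)  # (14, 16, 2) (14,)

# Receptive field of stacked causal convolutions:
# 1 + sum over layers of (kernel_size - 1) * dilation.
receptive_field = 1 + sum((2 - 1) * d for d in (1, 2, 4, 8))
print(receptive_field)  # 16
```

With these settings the convolution stack sees exactly 16 timesteps, which coincides with the window size reported in the caption.
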
Fig. 3. Title: SITALA predictions and relevant COVID-19 data for Harris County, Texas. Description: The number of news articles returned by IBM Watson Discovery, shown on the right axis, started increasing around mid-May 2020. COVID-19 test positivity and news sentiment are shown on the left axis. Around 75% of the data (green window) was used for training SITALA, of which 10% was reserved for validation. SITALA was tested on the remaining 25% of the data (blue window), for which the mean absolute error (MAE) was 9.6. The SITALA forecast (gray window) shows how maintaining a negative sentiment in the news about the spread of COVID-19 can help control and eventually decrease test positivity.
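
The chronological train/test split and the MAE metric mentioned in the Fig. 3 caption can be sketched as below. The helper names `chronological_split` and `mean_absolute_error` are hypothetical, and the split is assumed to be a simple cut in time order with no shuffling. The three observed/predicted pairs are the first three rows of the test dataset in the table (6/27 to 6/29), so the printed MAE covers only those three days, not the full test set.

```python
import numpy as np


def chronological_split(values, train_fraction=0.75):
    """Split a time-ordered sequence into train and test without shuffling."""
    cut = int(round(len(values) * train_fraction))
    return values[:cut], values[cut:]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and predictions."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))


days = list(range(88))  # 66 training + 22 test days, as stated in the abstract
train_days, test_days = chronological_split(days)
print(len(train_days), len(test_days))  # 66 22

# First three test rows from the table: observed vs. predicted positivity.
observed = [9.05, 9.18, 1.64]
predicted = [14.38, 14.08, 17.87]
print(round(mean_absolute_error(observed, predicted), 2))  # 8.82
```
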
| Dataset | Date | Input: percent cleaned test positivity | Input: percent COVID news sentiment | Output: percent cleaned test positivity | SITALA positivity prediction | SITALA forecast with positive sentiment (70) | SITALA forecast with negative sentiment (−70) |
|---|---|---|---|---|---|---|---|
| Training dataset with 10% reserved for validation | 4/23 | 2.24 | 0.00 | 2.24 | | | |
| | 4/24 | 3.56 | 0.00 | 3.56 | | | |
| | 4/25 | 4.38 | 0.00 | 4.38 | | | |
| | 4/26 | 21.19 | 0.00 | 21.19 | | | |
| | 4/27 | 3.93 | −61.81 | 3.93 | | | |
| | 4/28 | 14.83 | 0.00 | 14.83 | | | |
| | 4/29 | 4.24 | 0.00 | 4.24 | | | |
| | 4/30 | 7.58 | 0.00 | 7.58 | | | |
| | 5/1 | 6.67 | 0.00 | 6.67 | | | |
| | 5/2 | 5.76 | 0.00 | 5.76 | | | |
| | 5/3 | 4.89 | 0.00 | 4.89 | | | |
| | 5/4 | 4.29 | −52.66 | 4.29 | | | |
| | 5/5 | 3.73 | 0.00 | 3.73 | | | |
| | 5/6 | 3.18 | 0.00 | 3.18 | | | |
| | 5/7 | 2.62 | 0.00 | 2.62 | 2.60 | | |
| | 5/8 | 2.36 | 0.00 | 2.36 | 2.49 | | |
| | 5/9 | 6.29 | 0.00 | 6.29 | 6.08 | | |
| | 5/10 | 4.20 | 0.00 | 4.20 | 4.16 | | |
| | 5/11 | 2.23 | 0.00 | 2.23 | 2.06 | | |
| | 5/12 | 3.93 | 0.00 | 3.93 | 3.67 | | |
| | 5/13 | 4.50 | 0.00 | 4.50 | 4.32 | | |
| | 5/14 | 4.29 | 0.00 | 4.29 | 4.17 | | |
| | 5/15 | 9.17 | 5.43 | 9.17 | 8.85 | | |
| | 5/16 | 4.78 | 1.36 | 4.78 | 4.59 | | |
| | 5/17 | 1.81 | −26.88 | 1.81 | 1.76 | | |
| | 5/18 | 7.66 | 13.96 | 7.66 | 7.32 | | |
| | 5/19 | 2.38 | −10.04 | 2.38 | 2.19 | | |
| | 5/20 | 5.00 | 7.82 | 5.00 | 4.88 | | |
| | 5/21 | 6.64 | 1.89 | 6.64 | 6.49 | | |
| | 5/22 | 8.48 | 11.79 | 8.48 | 8.32 | | |
| | 5/23 | 6.74 | −54.95 | 6.74 | 6.56 | | |
| | 5/24 | 4.99 | −11.66 | 4.99 | 4.92 | | |
| | 5/25 | 3.25 | −30.84 | 3.25 | 3.19 | | |
| | 5/26 | 1.46 | 6.23 | 1.46 | 1.52 | | |
| | 5/27 | 14.83 | 11.26 | 14.83 | 14.78 | | |
| | 5/28 | 4.27 | 11.29 | 4.27 | 4.35 | | |
| | 5/29 | 6.20 | 5.52 | 6.20 | 5.92 | | |
| | 5/30 | 6.06 | 39.18 | 6.06 | 6.00 | | |
| | 5/31 | 21.91 | 25.26 | 21.91 | 21.21 | | |
| | 6/1 | 0.96 | 16.38 | 0.96 | 0.68 | | |
| | 6/2 | 8.34 | 0.33 | 8.34 | 7.94 | | |
| | 6/3 | 12.92 | −3.21 | 12.92 | 12.45 | | |
| | 6/4 | 17.55 | 15.25 | 17.55 | 17.02 | | |
| | 6/5 | 12.49 | 9.44 | 12.49 | 11.88 | | |
| | 6/6 | 7.89 | −26.27 | 7.89 | 6.94 | | |
| | 6/7 | 5.57 | −15.18 | 5.57 | 5.21 | | |
| | 6/8 | 11.63 | −19.69 | 11.63 | 11.01 | | |
| | 6/9 | 16.62 | −14.70 | 16.62 | 16.26 | | |
| | 6/10 | 15.01 | 2.63 | 15.01 | 14.11 | | |
| | 6/11 | 13.40 | −12.09 | 13.40 | 13.22 | | |
| | 6/12 | 11.79 | −18.90 | 11.79 | 11.28 | | |
| | 6/13 | 10.19 | −28.92 | 10.19 | 9.99 | | |
| | 6/14 | 7.46 | 2.87 | 7.46 | 7.58 | | |
| | 6/15 | 7.16 | −4.26 | 7.16 | 7.27 | | |
| | 6/16 | 5.53 | 2.51 | 5.53 | 5.57 | | |
| | 6/17 | 10.04 | 14.69 | 10.04 | 10.10 | | |
| | 6/18 | 7.08 | −0.39 | 7.08 | 6.87 | | |
| | 6/19 | 5.75 | 15.41 | 5.75 | 5.68 | | |
| | 6/20 | 17.55 | 6.68 | 17.55 | 16.89 | | |
| | 6/21 | 27.71 | −27.42 | 27.71 | 6.39 | | |
| | 6/22 | 3.71 | 4.48 | 3.71 | 6.63 | | |
| | 6/23 | 60.35 | −17.23 | 60.35 | 12.43 | | |
| | 6/24 | 45.44 | −10.17 | 45.44 | 9.54 | | |
| | 6/25 | 30.53 | −0.26 | 30.53 | 13.65 | | |
| | 6/26 | 15.62 | −16.69 | 15.62 | 11.36 | | |
| Test dataset | 6/27 | 9.05 | −10.33 | 9.05 | 14.38 | | |
| | 6/28 | 9.18 | −48.70 | 9.18 | 14.08 | | |
| | 6/29 | 1.64 | −20.83 | 1.64 | 17.87 | | |
| | 6/30 | 19.83 | −15.70 | 19.83 | 11.95 | | |
| | 7/1 | 14.38 | −20.46 | 14.38 | 9.54 | | |
| | 7/2 | 20.69 | −30.37 | 20.69 | 17.72 | | |
| | 7/3 | 23.57 | −11.68 | 23.57 | 8.66 | | |
| | 7/4 | 15.27 | −24.72 | 15.27 | 12.23 | | |
| | 7/5 | 6.32 | −35.68 | 6.32 | 16.23 | | |
| | 7/6 | 7.38 | 1.79 | 7.38 | 11.14 | | |
| | 7/7 | 13.42 | 3.53 | 13.42 | 12.01 | | |
| | 7/8 | 15.12 | −41.36 | 15.12 | 11.32 | | |
| | 7/9 | 13.59 | −50.11 | 13.59 | 10.55 | | |
| | 7/10 | 11.44 | −27.35 | 11.44 | 12.04 | | |
| | 7/11 | 10.81 | −21.80 | 10.81 | 5.45 | | |
| | 7/12 | 25.92 | −51.09 | 25.92 | 11.06 | | |
| | 7/13 | 18.76 | −4.39 | 18.76 | 6.50 | | |
| | 7/14 | 49.02 | 0.49 | 49.02 | 6.91 | | |
| | 7/15 | 28.24 | −11.31 | 28.24 | 7.88 | | |
| | 7/16 | 27.44 | 1.59 | 27.44 | 16.13 | | |
| | 7/17 | 26.65 | −9.15 | 26.65 | 9.96 | | |
| | 7/18 | 20.28 | −31.79 | 20.28 | 20.28 | | |
| Forecast | 7/19 | 12.27 | | | | 9.23 | 9.23 |
| | 7/20 | 14.15 | | | | 17.31 | 13.78 |
| | 7/21 | 20.52 | | | | 26.54 | 17.34 |
| | 7/22 | 27.54 | | | | 14.89 | 8.86 |
| | 7/23 | 24.79 | | | | 22.31 | 11.32 |
| | 7/24 | 19.83 | | | | 32.09 | 16.35 |
| | 7/25 | 16.16 | | | | 26.15 | 8.79 |
| | 7/26 | 21.30 | | | | 29.99 | 10.60 |
| | 7/27 | 21.00 | | | | 37.56 | 14.11 |
| | 7/28 | 30.01 | | | | 33.28 | 12.61 |
| | 7/29 | 39.18 | | | | 41.10 | 9.85 |
| | 7/30 | 23.33 | | | | 44.90 | 15.87 |
| | 7/31 | 43.18 | | | | 45.41 | 14.24 |
| | 8/1 | 37.56 | | | | 49.85 | 16.57 |
| | 8/2 | 12.00 | | | | 57.47 | 18.08 |
| | 8/3 | 5.75 | | | | 56.75 | 19.36 |
| | 8/4 | 15.04 | | | | 55.62 | 16.81 |
| | 8/5 | 21.09 | | | | 56.75 | 17.65 |
| | 8/6 | | | | | 59.65 | 17.22 |
| | 8/7 | | | | | 60.82 | 17.17 |

    # SITALA: Sentiment Informed Timeseries Analyzing AI
    # Location: Harris county, TX
    # Purpose: Forecast spread of coronavirus
    # Author: Prathamesh S. Desai
    import numpy as np
    import tensorflow as tf

    # window, n_features, X_train, and Y_train are assumed to be defined by the
    # data-preparation step, which is not part of this listing.
    tf.keras.backend.clear_session()
    tf.random.set_seed(40)
    np.random.seed(40)

    # Model definition: stacked dilated causal convolutions, then dense layers
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(window, n_features)))
    for dilation_rate in (1, 2, 4, 8):
        model.add(tf.keras.layers.Conv1D(filters=64, kernel_size=2, strides=1, dilation_rate=dilation_rate, padding="causal", activation="relu"))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(128, activation="relu"))
    model.add(tf.keras.layers.Dense(1))

    # Model compilation (learning_rate replaces the deprecated lr argument)
    optimizer = tf.keras.optimizers.Adam(learning_rate=5e-4)
    model.compile(loss=tf.keras.losses.Huber(), optimizer=optimizer, metrics=["mae"])
    # Stop when the validation loss has not improved for 200 epochs
    early_stopping = tf.keras.callbacks.EarlyStopping(patience=200)
    model.summary()
    history = model.fit(X_train, Y_train, epochs=500, verbose=True, validation_split=0.1, callbacks=[early_stopping])
    print("SITALA has been trained")
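
The listing above stops after training, and the source does not show how the scenario forecasts with sentiment fixed at 70 or −70 were generated. One plausible approach is a recursive rollout, sketched below as an assumption rather than the author's method: each prediction is appended to the input window together with the assumed sentiment, and the window slides forward one day. The names `rollout` and `toy_predictor` are hypothetical, and `toy_predictor` is a stand-in so the example runs without TensorFlow; it is not SITALA.

```python
import numpy as np


def rollout(predict_next, history, sentiment, steps):
    """Forecast positivity recursively while holding sentiment constant.

    predict_next: callable mapping a (window, 2) array to the next positivity.
    history: (window, 2) array of the latest [positivity, sentiment] rows.
    """
    window = np.array(history, dtype=float)
    forecast = []
    for _ in range(steps):
        next_positivity = float(predict_next(window))
        forecast.append(next_positivity)
        # Slide the window: drop the oldest row and append the new prediction
        # paired with the assumed (scenario) sentiment.
        window = np.vstack([window[1:], [next_positivity, sentiment]])
    return forecast


# Stand-in predictor (NOT SITALA): mean positivity plus a small sentiment term.
def toy_predictor(w):
    return w[:, 0].mean() + 0.01 * w[:, 1].mean()


history = np.column_stack([np.full(16, 10.0), np.zeros(16)])
optimistic = rollout(toy_predictor, history, sentiment=70.0, steps=5)
pessimistic = rollout(toy_predictor, history, sentiment=-70.0, steps=5)
print(optimistic[-1] > pessimistic[-1])  # True
```

With a trained Keras model, `predict_next` could be a wrapper such as `lambda w: model.predict(w[np.newaxis], verbose=0)[0, 0]`, assuming the inputs are scaled the same way as the training data.
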