Omar Malik, Bowen Gong, Alaa Moussawi, Gyorgy Korniss, Boleslaw K Szymanski.
Abstract
We study how public transportation data can inform the modeling of the spread of infectious diseases based on SIR dynamics. We present a model where public transportation data is used as an indicator of broader mobility patterns within a city, including the use of private transportation, walking, etc. The mobility parameter derived from this data is used to model the infection rate. As a test case, we study the impact of the usage of the New York City subway on the spread of COVID-19 within the city during 2020. We show that utilizing subway transport data as an indicator of the general mobility trends within the city, and therefore as an indicator of the effective infection rate, improves the quality of forecasting COVID-19 spread in New York City. Our model predicts the two peaks in the spread of COVID-19 cases in NYC in 2020, unlike a standard SIR model that misses the second peak entirely.
Year: 2022 PMID: 35430595 PMCID: PMC9012993 DOI: 10.1038/s41598-022-10234-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. (a) The red line shows the daily reported cases of COVID-19 in New York City. The blue line shows the total daily number of trips taken on the subway, with entries related to the Port Authority Trans-Hudson (PATH) removed. The dotted line indicates the start of the NY PAUSE Program. (b) The fraction of total weekly cases reported on each day of the week, averaged over 44 weeks. While the weekdays remain largely consistent, there is a significant drop in reporting on weekends.
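
The weekday reporting pattern in Figure 1(b) can be illustrated with a small sketch. The case counts below are made up, the function names are hypothetical, and the averaging convention (mean of per-week shares) is an assumption, since the caption does not say how the 44-week average was taken. The trailing 7-day window is likewise an assumed convention for the running average used in the later figures.

```python
def weekday_fractions(daily_cases):
    """Average, over complete weeks, of each weekday's share of that week's cases."""
    n_weeks = len(daily_cases) // 7
    shares = [0.0] * 7
    used = 0
    for w in range(n_weeks):
        week = daily_cases[7 * w: 7 * w + 7]
        total = sum(week)
        if total == 0:
            continue  # skip weeks with no reported cases
        used += 1
        for day, count in enumerate(week):
            shares[day] += count / total
    return [s / used for s in shares] if used else shares


def running_average(values, window=7):
    """Trailing moving average; early entries use however many days are available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up counts for two weeks, Monday first, with lower weekend reporting.
cases = [100, 110, 105, 95, 90, 40, 30,
         120, 115, 125, 110, 100, 50, 40]
fractions = weekday_fractions(cases)
smoothed = running_average(cases)
```

Averaging over a full week removes the weekend dip from the smoothed series, because every window contains exactly one of each weekday once seven days of data are available.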
An example of the matching process using real data. The columns labelled ENTRIES, EXITS, and TIME are from the turnstile data. We calculate the number of arrivals by subtracting successive values of the running total of entries. These arrivals are assigned a fractional time, expressed as a fraction of a day, corresponding to the midpoint of successive time snapshots. The departures are calculated in the same way from the running total of exits.
| Entries | Exits | Time | Fractional time | Arrivals | Departures | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 0007328037 | 0002483731 | 03:00:00 | |||||||
| 0007328044 | 0002483742 | 07:00:00 | 0.208 | 7 | 11 | 0 | 0 | 7 | 11 |
| 0007328075 | 0002483781 | 11:00:00 | 0.375 | 31 | 39 | 0 | 4 | 31 | 43 |
| 0007328193 | 0002483821 | 15:00:00 | 0.542 | 118 | 40 | 0 | 11 | 118 | 51 |
| 0007328375 | 0002483878 | 19:00:00 | 0.708 | 182 | 57 | 67 | 0 | 249 | 57 |
| 0007328499 | 0002483910 | 23:00:00 | 0.875 | 124 | 32 | 192 | 0 | 316 | 32 |
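
The arrival, departure, and fractional-time values described in the caption (the three values that follow the time in each row) can be reproduced from the raw snapshots. The sketch below is a minimal illustration, not the authors' code; the function names are hypothetical, and it assumes all snapshots fall within a single day and that the counters never reset.

```python
from datetime import datetime

# Snapshot rows copied from the table above: (ENTRIES, EXITS, TIME).
snapshots = [
    ("0007328037", "0002483731", "03:00:00"),
    ("0007328044", "0002483742", "07:00:00"),
    ("0007328075", "0002483781", "11:00:00"),
    ("0007328193", "0002483821", "15:00:00"),
    ("0007328375", "0002483878", "19:00:00"),
    ("0007328499", "0002483910", "23:00:00"),
]


def seconds_since_midnight(hms):
    """Convert an HH:MM:SS string to seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def interval_counts(rows):
    """Return (fractional_time, arrivals, departures) for each snapshot interval."""
    out = []
    for (e0, x0, t0), (e1, x1, t1) in zip(rows, rows[1:]):
        midpoint = (seconds_since_midnight(t0) + seconds_since_midnight(t1)) / 2
        fractional_time = round(midpoint / 86400, 3)  # fraction of a 24-hour day
        arrivals = int(e1) - int(e0)      # difference of cumulative entries
        departures = int(x1) - int(x0)    # difference of cumulative exits
        out.append((fractional_time, arrivals, departures))
    return out


intervals = interval_counts(snapshots)
```

For the first interval, the midpoint of 03:00 and 07:00 is 05:00, which is 5/24 ≈ 0.208 of a day, with 7 arrivals and 11 departures, matching the second row of the table. The remaining matching columns are not derived here.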
Figure 2. (a) Schematic representation of the mobility-based SIR model. Each region j has an associated infection rate and a pair of mobility parameters, which represent individuals from region j visiting another region and vice versa. (b) The enhanced model that includes a public transportation node without a permanent population. Inter-region mixing still occurs as in the basic model, but the visiting populations of every region pass through the transportation node for the duration of their commute time, during which they are exposed to the higher infection rate associated with using public transportation. The model distinguishes the effective population of region j that is commuting from the effective population of all other regions that is visiting region j.
Figure 3. (a) The best-fit model output of the daily number of new cases in NYC. The black line shows the model’s output. The red line is the 7-day running average of the total daily reported cases in NYC as a fraction of the total population of the city. The dotted line indicates the start of the NY PAUSE Program. (b) Fitting results for the model without the mobility-dependent infection rate given by Eq. 29. As the plot demonstrates, NYC’s COVID-19 spread cannot be fitted without modifying the infection rate by the mobility term.
Figure 4. (a) The predicted number of daily cases in NYC normalized by the total population of the city. The red line is the 7-day running average of the total daily reported cases in NYC as a fraction of the total population of the city. The dashed black line shows the best-fit output of the model in the training period, and the solid black line shows the model’s prediction for the testing period. The vertical dotted line marks the beginning of the three-week testing period. The inset figure shows the testing period in more detail. (b) The ratio of the average infection rate to the recovery rate as a function of time. The second axis shows the average mobility parameter.
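
The role of a mobility-dependent infection rate, discussed in the Figure 3 and Figure 4 captions, can be illustrated with a deliberately simplified sketch. This is not the paper's multi-region model or its Eq. 29: it is a single-population SIR model, integrated with a plain Euler step, in which the infection rate is assumed to scale linearly with a mobility factor. The function name, the parameter values, and the mobility profile are all hypothetical.

```python
def simulate_sir(beta0, gamma, mobility, i0=1e-6, dt=1.0):
    """Euler-integrate a single-population SIR model in population fractions.

    Assumption: the infection rate on step t is beta0 * mobility[t].
    Returns the fraction of the population newly infected on each step.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    new_cases = []
    for m in mobility:
        beta = beta0 * m                  # mobility-scaled infection rate
        infections = beta * s * i * dt    # S -> I flow this step
        recoveries = gamma * i * dt       # I -> R flow this step
        s -= infections
        i += infections - recoveries
        r += recoveries
        new_cases.append(infections)
    return new_cases


# Hypothetical mobility profile: normal activity, a sharp drop, a partial recovery.
mobility = [1.0] * 30 + [0.2] * 90 + [0.6] * 120
with_mobility = simulate_sir(beta0=0.4, gamma=0.1, mobility=mobility)
constant_rate = simulate_sir(beta0=0.4, gamma=0.1, mobility=[1.0] * len(mobility))
```

With a constant infection rate, the standard SIR model produces a single epidemic wave. Letting the rate follow mobility allows new cases to decline while mobility is low and grow again once it recovers, which is the qualitative behavior a two-peak fit requires.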
The results from fitting the data with and without the last three weeks masked. The table reports the MSE of fitting the data and the MSE of the model’s prediction during the testing period. While we minimize the MSE during the fitting process, the table also reports the in-sample and out-of-sample score of the best fit. We use for all our simulations.
| Parameter values | Data fitting without testing period | Data fitting with three-week testing period |
|---|---|---|
| | 1.55 | 1.59 |
| | 0.55 | 0.53 |
| | 4 | 4 |
| | 4 | 4 |
| | 0.04 | 0.04 |
| | 21 days | 21 days |
| | – | |
| | 0.82 | 0.88 |
| | – | 0.43 |
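
The split between fitting error and prediction error described in the table caption can be sketched as follows. The helper names and the toy series are hypothetical; only the 21-day (three-week) testing window and the use of MSE come from the text, and the score reported alongside the MSE is not computed here.

```python
def mse(observed, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def split_errors(observed, predicted, test_days=21):
    """Return (fitting MSE, prediction MSE), holding out the final test_days."""
    cut = len(observed) - test_days
    if cut <= 0:
        raise ValueError("series is too short for the requested testing period")
    return mse(observed[:cut], predicted[:cut]), mse(observed[cut:], predicted[cut:])


# Toy series: the "model" matches the fitting window and is off by 2 afterwards.
observed = [float(d) for d in range(60)]
predicted = observed[:39] + [v + 2.0 for v in observed[39:]]
fit_mse, test_mse = split_errors(observed, predicted)
```

Only the fitting window would be passed to the optimizer; the held-out weeks are scored afterwards, so the prediction MSE measures forecasting quality rather than goodness of fit.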