Evan O'Brien, Irene Xagoraraki. Department of Civil and Environmental Engineering, Michigan State University, East Lansing, MI 48824, USA.
Abstract
Viral diseases exhibit spatial and temporal variation, and many factors can affect their occurrence. The identification of these factors is critical in the effort to predict and lessen viral disease burden. Because viral infection is able to spread to humans from the environment, animals, and other humans, the One-Health framework can be used to investigate the critical pathways through which viruses are transported and transmitted. A holistic approach, incorporating publicly available clinical data for human, livestock, and wildlife disease occurrence, together with environmental data reported in federal and state databases such as parameters related to land use, environmental quality, and weather, can enhance the understanding of variations in disease patterns, leading to the design and implementation of surveillance systems. An example analysis approach is presented for Michigan, United States, which is a state with large urban centers as well as a sizeable rural and agricultural population. Analysis of publicly available data from 2017 indicates that gastrointestinal (GI) and influenza-associated illnesses in Michigan may have been more strongly related to agricultural land use than to developed land use during that year. Meanwhile, hepatitis A virus appears to be most closely related to developed land use in densely populated areas. GI illnesses may be related to precipitation, and this relationship is strongest in the springtime, although GI illnesses are most common in the winter months. Integration of human clinical data, animal disease data, and environmental data can ultimately be used to prioritize the most critical locations and times for viral outbreaks in both urban and rural environments.
The burden of viral disease is a global challenge, and the surveillance and reporting of viral disease is one way to manage and mitigate outbreaks. In the United States, the Centers for Disease Control and Prevention (CDC) publishes surveillance statistics regarding the rate and occurrence of disease for a number of human viruses, and annual summaries of these surveillance statistics are published in various forms. The Summary of Notifiable Diseases (SoND) is an annual report containing information on those diseases for which “regular, frequent, and timely information regarding individual cases is considered necessary for the prevention and control of the disease or condition”, a list of which is updated regularly. The CDC also maintains the National Outbreak Reporting System (NORS), which includes information on the number of disease cases and outbreaks for a number of infectious agents, including certain viruses. Influenza statistics, meanwhile, are reported most frequently by the CDC via published FluView Weekly Influenza Surveillance Reports, documenting the number of cases of influenza and influenza-like illnesses in the United States. In assessing national viral disease burden, it is necessary to analyze data from all of these sources.
Fig. 1 presents the number of disease cases by month for influenza A as reported by FluView, West Nile virus and hepatitis A virus as reported by SoND, and norovirus, sapovirus, and rotavirus as reported by NORS from 2012 to 2016 [[1], [2], [3], [4], [5], [6], [7]]. Each of the six viruses exhibits a different time of year in which disease cases are more prevalent. Insect-transmitted viruses such as West Nile virus are more common in the warmer months from July to September. Meanwhile, the waterborne viruses (norovirus, sapovirus, rotavirus, and hepatitis A virus) all exhibit different trends.
Perhaps most notable is the distinction between norovirus, which is most common in the winter from January to March, and sapovirus, which is most common in autumn from September to November. Norovirus and sapovirus are closely related, both being members of the Caliciviridae family, yet they have strikingly different seasonal infection trends. Hepatitis A virus, on the other hand, does not show significant variation throughout the year. Rather, rates of infection are relatively constant from one month to the next.
Fig. 1
Disease cases by month as reported by SoND (West Nile virus, Hepatitis A virus) NORS (norovirus, sapovirus, rotavirus) and FluView (influenza A) for 2012–2016 [[1], [2], [3], [4], [5], [6], [7]]. Data summarized by the authors.
In addition to temporal variations, virus outbreaks also exhibit spatial variations, with certain areas being more commonly affected than others. The aforementioned CDC sources also publish information regarding the disease cases for each individual state. Fig. 2 presents heatmaps of disease cases relative to state population for the six viruses mentioned above. West Nile virus appears to be more prevalent in the plains states of the central United States, while norovirus is most common in the Midwest and New England. Moreover, there is no significant spatial differentiation for hepatitis A virus from one region to another, mirroring its temporal trends. Rotavirus and sapovirus, meanwhile, tend to be concentrated in specific states, suggesting that outbreaks are the most common drivers of occurrence of these diseases. It is important to note, however, that these statistics are only a measure of reported cases, and that the actual incidence of viral disease could be significantly higher than the reported statistics indicate. For example, the CDC estimates that the rates of hepatitis A virus are approximately twice as high as reported incidence rates indicate [8].
Fig. 2
Heatmaps of disease cases relative to population in the United States for 2012–2016 as reported by the CDC [[1], [2], [3], [4], [5], [6], [7]]. Data summarized by the authors.
Viral disease data have also been collected for the State of Michigan. Viral disease has been demonstrated to impact human, animal, and environmental health within the state of Michigan. Numerous human outbreaks due to multiple viral agents have been reported. These outbreaks include coronavirus in Lenawee County in 1966 [9], norovirus in Macomb County in 1979 [10] and in Ottawa County in 2008 [11], hepatitis A virus in Calhoun and Saginaw Counties in 1997 [12], and West Nile virus in Kent County in 2002 [13]. Michigan has also been in the midst of an outbreak of hepatitis A virus since 2016 [14]. These examples illustrate both the variety of human viral diseases that have impacted the state and the fact that different areas of the state are subject to outbreaks.
Numerous governmental agencies publish data regarding clinical cases of disease both spatially and temporally. In addition to the published federal data, individual state agencies, such as the Michigan Department of Health and Human Services (MDHHS), publish disease surveillance statistics. MDHHS maintains the Michigan Disease Surveillance System (MDSS), which publishes weekly disease reports on a number of communicable diseases [15]. Data taken from MDSS reports show an increase in viral disease over the past five years, as shown in Fig. 3. For this paper, gastrointestinal (GI) illnesses, influenza-like illnesses, hepatitis A illnesses, and norovirus illnesses are selected. While some GI illnesses and influenza-like illnesses may be caused by bacterial pathogens, a large percentage of GI illnesses and influenza-like illnesses are expected to be of viral origin, and all hepatitis A and norovirus illnesses are of viral origin. These diseases have been selected for investigation since they have different exposure pathways [16].
Influenza illnesses may be zoonotic but are not waterborne. Hepatitis A illnesses are waterborne but are not zoonotic. Norovirus is commonly foodborne and may be waterborne, but it is not typically zoonotic. GI illnesses may be both waterborne and zoonotic and may be caused by viruses, bacteria, or parasites.
Fig. 3
Reported cases in Michigan over the past five years for GI illnesses, influenza-like illnesses, Hepatitis A virus, and norovirus as reported by MDSS [15]. Note: MDSS is a continually active system and reported numbers in the MDSS weekly reports are not final.
Viral disease outbreaks have also affected animals in Michigan, including viral diarrhea in cattle [17], eastern equine encephalitis virus in deer [18], and an outbreak of a novel calicivirus in rabbits [19]. According to the USDA report on death loss in U.S. cattle and calves (2015), 31.8% of non-predator cattle deaths and 42.3% of non-predator calf deaths were due to digestive or respiratory causes. These figures are higher in the state of Michigan; the percentages are 37.8% and 66.3%, respectively, equating to approximately 9,027 cattle deaths and 27,926 calf deaths in the state during 2015 [20]. While the report does not specify the etiology of the deaths, a portion of these illnesses are due to viral causes, illustrating the potential burden of viral disease on animals. The Michigan Department of Agriculture & Rural Development (MDARD) also publishes annual statistics on reportable animal diseases. The MDARD report from 2017 includes many viral animal disease cases, including 373 cases of bovine leukemia virus, 160 cases of caprine arthritis encephalitis, 17 cases of porcine reproductive and respiratory syndrome virus, 7 cases of swine enteric coronavirus, 9 cases of canine influenza, 7 cases of eastern equine encephalitis, 10 cases of equine herpesvirus, and 15 cases of West Nile virus in equines [21].
Moreover, viruses have been detected in environmental samples in Michigan. Human enteric viruses have been detected in the effluent of multiple Michigan wastewater treatment plants, which is released into surrounding surface waters [25,26]. Adenovirus and other human viruses have also been detected at public recreational beaches in Michigan, leading to beach closures [27,28].
Numerous environmental factors may contribute to the likelihood of infectious disease in certain areas or time periods, including but not limited to land use [22], precipitation [23], and population density [24]. Land use characterizes the environmental state of an area and can be influential during runoff events. Precipitation levels indicate where these runoff events may occur. Population and population density can affect the spread of viral disease and can also be used to normalize disease levels from one county to another. Other factors can be used to further characterize land use, such as information related to agricultural activity. Variables such as livestock population can illustrate not only the level of agricultural activity in an area, but also the expected quality of nearby surface water after runoff events.
Because viral infection is able to spread to humans from the environment, animals, and other humans, the One-Health framework is ideal to investigate the critical pathways through which viruses are transported and transmitted [16]. Data collection related to human, animal, and environmental health is crucial to attain preliminary information for the identification of these critical pathways. This information can help to illuminate the parameters that affect the spatial and temporal patterns of disease. The goal of this study is to present an example preliminary data collection and analysis approach for the state of Michigan.
Methods
Data collection
Disease data was collected from weekly MDSS reports for 2017 from MDHHS, which reports the number of cases for each disease for each county for both the current week and year-to-date (note: MDSS is a continually active system and reported numbers in the MDSS weekly reports are not final) [15]. Year-to-date values were chosen as the values utilized in this data analysis as they were found to be more comprehensive compared to current week values; it is suspected this is because some cases for given weeks would not be reported until after those weeks' reports were published, thus they would only be reflected in the year-to-date values.
To adequately compare counties to one another, population data for each county was collected from U.S. Census data [29]. Population data was used to calculate the relative number of disease cases per capita for each county and each week. Weeks which contained days in more than one month were grouped into the month for which there were more days in that week (e.g., the week of 1/29–2/4 was designated as February as it contains four days in February compared to three days in January). The relative numbers of disease cases are expressed as “number of reported disease cases per 1000 people” and are considered the dependent variables for the analysis that follows.
Land use data at the county level was collected from the United States Geological Survey (USGS) Land Cover Data Viewer [30]. Absolute land cover was collected as hectares and relative land cover was also calculated using total land area for each county. County-level agricultural data was collected from the USDA Census of Agriculture [31]. Precipitation information was collected from USDA as a 30-year average of monthly precipitation for each county in Michigan; annual values were also reported [32].
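The two bookkeeping rules described above (assigning a week to the month that holds most of its days, and normalizing cases to county population) can be sketched in a few lines. This is only an illustrative Python sketch, not the authors' code; the function names and the county figures in the example are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def majority_month(week_start: date) -> int:
    """Return the month (1-12) that contains most of the 7 days starting at week_start."""
    days = [week_start + timedelta(days=i) for i in range(7)]
    counts = Counter(d.month for d in days)
    # A 7-day week spans at most two months, so a tie is impossible.
    return counts.most_common(1)[0][0]


def cases_per_1000(cases: int, population: int) -> float:
    """Reported cases normalized to county population, per 1000 people."""
    return 1000.0 * cases / population


# Week of 1/29-2/4 in 2017: three days in January, four in February.
print(majority_month(date(2017, 1, 29)))   # 2
# Hypothetical county: 45 reported cases among 30,000 residents.
print(cases_per_1000(45, 30_000))          # 1.5
```

The week from the text is assigned to February, matching the worked example in the paragraph above.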
Exploratory data analysis
After data collection, exploratory data analysis was performed to investigate relationships between independent and dependent variables. Spatial distributions of variables were visualized with the creation of county-level heatmaps. Correlations were performed between variables to obtain correlation coefficients and determine which pairs of variables exhibited relationships with one another. Scatter plots were also created between independent and dependent variables to represent relationships between variables visually.
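As an illustration of the correlation step, the sketch below computes a correlation coefficient between one independent variable and one dependent variable across counties. The text does not state which coefficient was used; Pearson's r is assumed here, and all values are hypothetical rather than taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five counties (not taken from the study).
ag_fraction = [0.10, 0.25, 0.40, 0.55, 0.70]   # agricultural land / total land
gi_per_1000 = [2.1, 3.0, 3.2, 4.8, 5.1]        # GI cases per 1000 people

r = pearson_r(ag_fraction, gi_per_1000)
print(round(r, 3))
```

Repeating this for every pair of independent and dependent variables yields the kind of correlation table used to shortlist candidate predictors.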
Statistical methods
Independent variables determined to have a potential correlation with disease levels were selected and utilized in the development of a preliminary statistical model. Spatial regression analysis was performed in R to assess the validity of the independent variables as predictors of the corresponding dependent variables. First, ordinary least squares (OLS) regression was performed to determine whether the collected independent variables were significantly related to the diseases studied. Independent variables were introduced into the OLS model based upon the prior exploratory data analysis; those with the highest correlations with disease levels were interpreted as the most likely predictors and were incorporated first, followed by the next highest correlation, and so on. The regression model was run each time a new variable was introduced. Those that did not exhibit a relationship with 85% confidence (i.e. p-value not <0.15) were omitted from further consideration. This conservative level of confidence has been employed in prior studies performing spatial regression of environmental data [33]. Predictor variable collinearity was assessed using the calculation of variance inflation factor (VIF) scores; it was ensured that no predictor variable had a VIF score >3.0 [34]. This analysis provided an initial model with which to assess the relationships between variables.
However, OLS regression does not account for spatial autocorrelation in the data, and other regression models that do account for this may be appropriate [33]. The degree of spatial autocorrelation was assessed in R and quantified with Moran's I and Lagrange multiplier diagnostics using k-nearest neighborhoods of different sizes. It was found that values of k >1 provided appropriate results; a value of k = 5 was utilized in diagnostic tests to adequately account for spatial autocorrelation.
These diagnostic tests found the existence of spatial autocorrelation in this dataset, and determined that a spatial lag model would be more appropriate. The spatial lag regression model was therefore performed in R to adjust the regression coefficients of the selected predictor variables. Akaike information criterion (AIC) values for each of the models were calculated to determine which model was of higher quality.
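To make the autocorrelation diagnostic concrete, the following sketch computes global Moran's I with row-standardized k-nearest-neighbor weights. The study's analysis was run in R; this Python version only illustrates the statistic and is not the authors' code. The coordinates, values, and the choice k = 2 are hypothetical (the study reports using k = 5 on county data).

```python
import math


def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbor weights: w[i][j] = 1/k if j is among i's k nearest."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.hypot(coords[j][0] - xi, coords[j][1] - yi),
        )
        for j in others[:k]:
            weights[i][j] = 1.0 / k
    return weights


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = values - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den


# Hypothetical centroids along a line, with two contrasting value patterns.
coords = [(float(i), 0.0) for i in range(6)]
w = knn_weights(coords, k=2)
clustered = [1, 1, 1, 5, 5, 5]      # similar values sit next to each other
alternating = [1, 5, 1, 5, 1, 5]    # neighbors tend to differ

print(round(morans_i(clustered, w), 3))     # positive: spatial clustering
print(round(morans_i(alternating, w), 3))   # negative: spatial dispersion
```

A clearly positive value on model residuals is the kind of evidence that motivates moving from OLS to a model with a spatially lagged dependent variable.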
Results and discussion
Spatial and temporal distribution of viral disease in Michigan
Included in the MDSS reports are disease statistics by county for various viruses, and certain areas of the state are more commonly affected by viral disease than others. Fig. 4 shows heatmaps for cases of four diseases (GI illnesses, influenza-like illnesses, hepatitis A virus, and norovirus) for each Michigan county. Variation in spatial distribution of diseases can be observed in Michigan, with GI illnesses concentrated in the southwest portion of the state, whereas the eastern portion of the state is most affected by hepatitis A.
Fig. 4
Heatmaps of disease cases relative to population for Michigan counties for the year 2017 as reported by MDSS. (number of cases divided by population for county multiplied by 1000) [15]. Maps prepared by the authors. Note: MDSS is a continually active system and reported numbers in the MDSS weekly reports are not final.
Because MDSS issues weekly reports on disease statistics, temporal trends can also be observed for the illnesses in question. Fig. 5 displays the number of disease cases by month for the state of Michigan in the year 2017 for GI illnesses, influenza-like illnesses, hepatitis A virus, and norovirus. GI illness, influenza-like illness, and norovirus are all more prevalent in the winter and spring months. Hepatitis A virus cases are more common in the latter half of the year, but there is relatively little annual variation as compared to the other diseases in question.
Fig. 5
Disease cases by month in Michigan for the year 2017 as reported by MDSS [15]. Note: MDSS is a continually active system and reported numbers in the MDSS weekly reports are not final.
Spatial parameters of consideration
The primary spatial factor to consider in this case is land use. For each county in Michigan, correlations are calculated between the number of reported cases of disease (normalized to population) and the types of land use for that respective county as reported by USGS [30]. Table 1 presents the calculated correlation coefficients between these two variables.
Table 1
Correlation coefficients between disease cases normalized to population for each MI county and relative land cover for different types (hectares of type in county per total hectares in county).
Bolded values indicate potentially significant relationships.
The correlations of the diseases with agricultural vegetation present a prominent contrast; the relationships of influenza-like illnesses and GI illnesses with agricultural land use (bolded in the table) are markedly stronger than those of hepatitis A virus and norovirus. This indicates the possibility that agricultural activity may affect the transport of the pathogens that cause influenza-like illnesses and GI illnesses; the notion that agricultural land use can introduce pathogens to surrounding surface waters is supported by the literature [[35], [36], [37], [38]]. This is an expected finding given that hepatitis A virus and norovirus are not thought to be zoonotic, whereas some influenza-like illnesses and GI illnesses can, albeit uncommonly, be zoonotic. Similarly, the correlation of developed land with hepatitis A virus is much stronger than with the other three diseases studied. This implies that more heavily populated areas may contribute to the incidence of hepatitis A virus. This relationship exists despite the fact that the number of disease cases for each county was normalized to that county's population, signifying that this relationship does not arise merely from a large number of reported cases in urban areas.
These relationships can be more plainly distinguished with the use of scatter plots. Fig. 6 displays scatter plots to visualize the correlations reported in Table 1 between agricultural vegetation and the four diseases investigated. A positive correlation is observable in the first two plots representing GI illness and influenza-like illness, especially when contrasted with hepatitis A virus, which shows no relationship between the two variables.
Fig. 6
Scatter plots displaying correlation between relative agricultural land cover in each county with reported disease cases (normalized to population) in each county for gastrointestinal illness, influenza-like illness, hepatitis A virus, and norovirus.
Agricultural data can assist in determining critical locations, as comparisons can also be made to agricultural trends. Fig. 7 displays heatmaps of farmland acreage, cattle population, swine population, and sheep population as reported by the USDA [31]. On visual examination, the areas most commonly affected by viral disease typically appear to lie within major watersheds, including the Grand River watershed for influenza-like illnesses. These illnesses also appear to correspond to areas with high cattle populations. With these observations in mind, particular attention could be paid to those factors when determining where to sample in these locations.
Fig. 7
Heatmap of agricultural data by county for the state of Michigan as reported by USDA (2012). Top-left: farmland acreage, top-right: cattle inventory, bottom-left: swine inventory, bottom-right: sheep inventory [31]. Maps prepared by the authors.
Fig. 8 presents heatmaps of average annual precipitation and population density for each county in Michigan as reported by the Agricultural Applied Climate Information System [32] and U.S. Census Bureau [29], respectively. Visual examination suggests that precipitation levels are highest in the western part of the state, similar to the areas most commonly affected by GI illness and influenza-like illness. Meanwhile, population density is highest near the Detroit area, which is the area most affected by hepatitis A virus.
Fig. 8
Left: average annual precipitation by county for Michigan for the years 1981–2010. Right: Population density in persons per square mile by county for Michigan. Maps prepared by the authors.
Table 2 presents county-level correlations between the four diseases investigated and the aforementioned variables. As is suggested by the heatmaps, precipitation has a high degree of correlation with GI illness, and a slight correlation with influenza-like illness, while showing no substantial relationship with the other two diseases. Meanwhile, population density has a high correlation with hepatitis A virus; this is an understandable result given the established correlation with developed land use. Livestock inventory is also seen to have stronger correlations with influenza-like illness and GI illness than with the other two diseases studied, but these correlations are not as strong as those with agricultural land use seen in Table 1. This suggests that the potential relationship between agricultural activity and GI illness/influenza-like illness may be a result of particular agricultural practices, such as the land application of biosolids and manure as fertilizer, rather than the livestock animals themselves.
Table 2
Correlation coefficients between disease cases normalized to population for each MI county and agricultural data, precipitation data, and population density for each MI county.
Bolded values indicate potentially significant relationships.
Temporal factors of consideration
The temporal variation of factors such as precipitation and surface water runoff may also help to explain viral disease occurrence. Surface water discharge, such as the flow rates of specific rivers in the state, can also provide valuable information about the status of a watershed over time. Temperature also assists in determining when runoff and first-flush events will occur. The timing of sanitary sewer overflow (SSO) and combined sewer overflow (CSO) events can give an idea of the times in which certain areas are most at risk for pathogen exposure [39]. Similarly, comparison of the timing of manure application with the timing of runoff events can help to determine the impact of land application of biosolids on environmental water quality [40].
Comparisons can be made between the temporal distribution of disease cases and these temporal factors. One such comparison can assess the relationship between reported monthly disease cases and monthly precipitation. Accurate county-wide monthly precipitation measurements are not readily available for every county in Michigan during the year 2017, but the Agricultural Applied Climate Information System reports the 30-year average monthly precipitation levels for Michigan counties in addition to annual figures [32]. These precipitation levels in each county can be correlated with reported disease cases in each county by month, taking the spatial analysis from above and introducing a more specific temporal element. A summary of these correlation coefficients is presented in Table 3.
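The month-by-month comparison described above amounts to computing one county-level correlation per month. A minimal Python sketch follows, assuming Pearson's r and a simple county-to-month mapping; the data layout, county labels, and all numbers are hypothetical and are not the study's values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def monthly_correlations(precip, cases):
    """For each month, correlate county precipitation with county disease rate.

    precip and cases map county name -> {month: value}; only counties and
    months present in both inputs are used.
    """
    counties = sorted(set(precip) & set(cases))
    months = sorted(set.intersection(*(set(precip[c]) & set(cases[c]) for c in counties)))
    return {
        m: pearson_r([precip[c][m] for c in counties], [cases[c][m] for c in counties])
        for m in months
    }


# Hypothetical values for four counties and two months (4 = April, 5 = May).
precip_mm = {
    "A": {4: 70, 5: 80},
    "B": {4: 75, 5: 90},
    "C": {4: 80, 5: 95},
    "D": {4: 85, 5: 100},
}
gi_per_1000 = {
    "A": {4: 0.5, 5: 0.4},
    "B": {4: 0.3, 5: 0.6},
    "C": {4: 0.6, 5: 0.7},
    "D": {4: 0.4, 5: 0.9},
}

result = monthly_correlations(precip_mm, gi_per_1000)
strongest = max(result, key=lambda m: abs(result[m]))
print(strongest, round(result[strongest], 3))
```

Scanning the resulting coefficients by month is one way to flag the times of year at which precipitation and disease rates move together most closely.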
Table 3
Summary of correlation coefficients for the relationship between average 30-year precipitation in the county with reported disease cases (normalized to population) in the county for each month.
Bolded values indicate potentially significant relationships.
This analysis reveals that the spatial correlations (represented by the annual figures) fluctuate at different points throughout the year. For example, as mentioned, GI illnesses have a correlation of 0.474 with annual precipitation on the spatial level, but this relationship is strongest in the month of May, when it reaches a correlation coefficient of 0.532. Moreover, the correlation coefficients between GI disease and precipitation increase in magnitude from February to May. This finding is interesting because the spring months are the times in which land application of fertilizers and manure is most common, as it is the beginning of the growing season. This relationship with precipitation, combined with the aforementioned relationship between GI disease and agricultural land use, strengthens the possibility that agricultural runoff could be a critical pathway for GI diseases in Michigan. Stronger relationships with precipitation are also observed for GI illnesses in the months of August and November, and for influenza-like illnesses in the months of September and November. These correlations indicate that these months, in addition to the aforementioned spring months, could also be critical times at which runoff is an important pathway for the studied illnesses.
Other independent variables that have not been collected could be utilized as data becomes available. In addition to spatial and temporal distribution of publicly available human disease data, livestock and wildlife disease data would be very useful. However, governmental agencies do not collect or provide such data in the same detail as human disease data.
Spatial regression modeling
Based on exploratory data analysis, relative agricultural land use and annual precipitation were determined to be independent variables of interest for GI illness and influenza-like illness. The OLS regression model showed that the initial inferences drawn from exploratory data analysis were appropriate, as all other variables (other types of land use, livestock information) did not meet the threshold of confidence (p-value < 0.15) for further consideration. A summary of the results is listed in Table 4. As shown, both variables display a relationship with GI illness with a high degree of confidence (p-value < 0.001). Of the other diseases, none were related to precipitation with a high degree of confidence. All three meet the threshold for consideration (p-value < 0.15) for agricultural land use, and influenza-like illness was found to be related to agricultural land use with a higher degree of confidence (p-value < 0.001). Additionally, the regression coefficients for agricultural land use for both hepatitis A virus and norovirus were much smaller than those for GI illness and influenza-like illness, suggesting that the relationship is not nearly as strong as with the two latter diseases.
Table 4
Summary of OLS regression results for each disease investigated.
| Disease | Independent variable | Regression coefficient | Standard error | Test statistic | P-value |
|---|---|---|---|---|---|
| Gastrointestinal illness | Agricultural Vegetation | 35.5565 | 9.665 | 3.679 | 0.000423 |
| Gastrointestinal illness | Annual Precipitation | 2.6549 | 0.7713 | 3.442 | 0.000921 |
| Influenza-like illness | Agricultural Vegetation | 53.7628 | 13.4585 | 3.995 | 0.000143 |
| Influenza-like illness | Annual Precipitation | 0.6955 | 1.074 | 0.648 | 0.519096 |
| Hepatitis A virus | Agricultural Vegetation | 0.046447 | 0.022812 | 2.036 | 0.045 |
| Hepatitis A virus | Annual Precipitation | −0.0023 | 0.00182 | −1.262 | 0.211 |
| Norovirus | Agricultural Vegetation | 0.189612 | 0.106758 | 1.776 | 0.0795 |
| Norovirus | Annual Precipitation | −0.00365 | 0.00852 | −0.429 | 0.6691 |
Bolded values indicate significant relationships.
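As a reading aid for Table 4, the following sketch transcribes the reported rows and applies the two p-value thresholds mentioned in the text (.15 for consideration, .001 for a high degree of confidence). It also recomputes the test statistic as the coefficient divided by its standard error, which is the usual form for OLS output but is an assumption about how the reported values were obtained. The function names are illustrative.

```python
# Rows transcribed from Table 4:
# (disease, independent variable, coefficient, standard error, p-value)
TABLE_4 = [
    ("Gastrointestinal illness", "Agricultural Vegetation", 35.5565, 9.665, 0.000423),
    ("Gastrointestinal illness", "Annual Precipitation", 2.6549, 0.7713, 0.000921),
    ("Influenza-like illness", "Agricultural Vegetation", 53.7628, 13.4585, 0.000143),
    ("Influenza-like illness", "Annual Precipitation", 0.6955, 1.074, 0.519096),
    ("Hepatitis A virus", "Agricultural Vegetation", 0.046447, 0.022812, 0.045),
    ("Hepatitis A virus", "Annual Precipitation", -0.0023, 0.00182, 0.211),
    ("Norovirus", "Agricultural Vegetation", 0.189612, 0.106758, 0.0795),
    ("Norovirus", "Annual Precipitation", -0.00365, 0.00852, 0.6691),
]

def statistic_from(coefficient, standard_error):
    """Ratio of a regression coefficient to its standard error."""
    return coefficient / standard_error

def rows_below(threshold):
    """Return (disease, variable) pairs whose reported p-value is below threshold."""
    return [(d, v) for d, v, _, _, p in TABLE_4 if p < threshold]

considered = rows_below(0.15)        # threshold for further consideration
high_confidence = rows_below(0.001)  # threshold for a high degree of confidence
```

With the table values, five disease-variable pairs fall below .15 and three fall below .001: both GI illness predictors and agricultural land use for influenza-like illness.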
The spatial lag model adjusted the regression coefficients for the GI model to 28.87805 for agricultural land use (P-value = .005703) and 2.22179 for precipitation (P-value = .003851). The regression coefficients for the spatial lag model are smaller than those for the OLS model, as the spatial lag model accounts for spatial autocorrelation, lessening the influence of the predictor variables. The AIC values of the two models also indicated that the spatial lag model was of higher quality than the OLS model, validating the use of spatial regression.
This analysis is one rudimentary example of the types of statistical techniques that can be employed to assess the relationships between collected disease data and other independent variables. Moreover, the variables assembled in these analyses are a non-exhaustive list of the potential environmental factors that can impact viral disease. As more data becomes available, more relationships of interest may be observed in exploratory data analysis, and new predictor variables could be incorporated into the above spatial regression analysis. Additionally, this regression analysis only accounts for spatial interactions between variables, and exploratory data analysis revealed that precipitation is more strongly related to GI illness in certain months of the year. Temporal data could therefore also be incorporated into future regression analyses to further pinpoint the critical times and locations for GI illness.
Additionally, disease data such as the data used in this analysis contains many zero values (counties and times at which no cases were reported). In this case, zero-inflated regression techniques, such as the zero-inflated Poisson model, could be of use. Furthermore, more robust reporting of clinical data would be valuable in this analysis. One potential reason for the inconclusive relationships between norovirus and the investigated independent variables could be that norovirus is not as widely reported as GI illnesses or influenza-like illnesses, making it more difficult to observe correlations between variables.
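To make the zero-inflated Poisson suggestion concrete, the sketch below evaluates the standard zero-inflated Poisson probability mass function, in which a count is a structural zero with probability pi and otherwise follows a Poisson distribution with rate lam. The parameter names and the example values in the note are illustrative assumptions; nothing here was fitted to the study's data.

```python
from math import exp, factorial

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and extra-zero probability pi."""
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    if lam <= 0 or not 0.0 <= pi <= 1.0:
        raise ValueError("need lam > 0 and 0 <= pi <= 1")
    poisson = exp(-lam) * lam ** k / factorial(k)
    # Zeros can arise either structurally (probability pi) or from the Poisson part.
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson

def zip_mean(lam, pi):
    """Expected count under the zero-inflated Poisson model."""
    return (1.0 - pi) * lam
```

For example, with lam = 2.0 and pi = 0.3, the probability of a zero count exceeds what a plain Poisson model with the same rate would assign, which is the behavior needed when many counties report no cases in a given month.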
Discussion
To summarize, there are two potential findings from the above analyses. Influenza and particularly GI illnesses may be related to agricultural land use and precipitation, and this relationship with precipitation is strongest in the springtime, although GI illnesses are most common in the winter months. Meanwhile, hepatitis A virus appears to be most closely related to developed land use, and is more common in the later months of the year, in autumn and winter. GI and influenza-related disease cases are observed to be relatively high in counties located in the Grand River watershed.
As the methodology used in this study relies on the reporting of disease cases, it is critical that the extent to which patients and physicians report disease cases is quantified. The analysis contained within this study relies on the assumption that diseases are reported at similar rates regardless of location or time of occurrence. However, this may not be the case, and reporting bias may exist. For example, certain communities may have different access to local clinics than others; highly populated areas may be underserved by not having the necessary facilities to handle the population, or sparsely populated areas may not be within close enough proximity to accessible medical care. In future analysis this issue could be addressed by obtaining data on the number of clinics and medical facilities in each county and the number of visits to these facilities, and normalizing this data to population.
In addition to the data sources utilized in this paper, there are numerous other databases that can be used to obtain relevant information to determine and predict disease variability. Environmental quality data are an important factor to consider, and results from environmental sampling and analysis may correlate with disease occurrence. For example, wastewater treatment plants and concentrated animal feeding operations (CAFOs) can be valuable sampling points, since sampling and characterizing community wastewater and livestock manure provides a snapshot of the status of community human and animal health.
The Michigan Water Environment Association maintains a list of wastewater facilities in the state of Michigan [41]. Sampling can also take place at other locations, such as storm drains, agricultural field runoff drains, and areas that have recently experienced combined sewer overflows. Other agricultural data could also be valuable in determining sampling points in rural areas of the watershed, such as the amount of fertilizer purchased per week per county and the location of CAFOs. The Sierra Club maintains a readily available map of CAFOs throughout the United States, including in Michigan [42]. CAFO locations can help to determine where livestock populations are most abundant, heightening the risk for both animal disease and zoonotic disease. Beyond agricultural data, information on surface water quality can also be of use. For example, the Michigan Department of Environmental Quality (MDEQ) reports figures for public beach closures, which occur when surface water contamination is detected during regular screening for pathogen indicators. MDEQ also summarizes sanitary sewer overflow (SSO) and combined sewer overflow (CSO) events, which occur when wastewater levels in municipal sewer systems exceed the systems' capacity, resulting in untreated wastewater discharging into nearby surface waters.
Sampling times, meanwhile, can be determined by other factors. Using Michigan as an example, the relationship between precipitation and GI illness was strongest in the spring months, coinciding with land application of agricultural fertilizers. Additionally, these relationships were strongest in the area of the Grand River watershed, a large watershed in southwest Michigan encompassing Grand Rapids, Lansing, and surrounding agricultural areas. Therefore, sampling within this watershed in spring (from March to May) would be ideal to assess the impact of agricultural runoff on the occurrence of GI illness. This runoff would be at its peak when the flow rate of the Grand River is highest. Therefore, examining the discharge of the Grand River can aid in determining the most critical times at which to sample. The USGS reports discharge data and water quality data for many surface waters and groundwaters in the United States, including the Grand River [44]. Animal disease data can also be collected and analyzed in this manner, but there remains a need for integrated human-animal disease surveillance to assess zoonotic disease occurrence [43].
One goal of integrative human, animal, and environmental data analysis and subsequent environmental monitoring is to develop a system of prioritization for potential occurrence of disease for each county in Michigan per month. Analysis of the collected spatial and temporal data can determine the factors that are most critical in specific places and at specific times. This can lead to the determination of which locations and time periods are of the greatest concern for each disease investigated. This will, in turn, lead to higher levels of preparedness to combat viral disease outbreaks, as these critical times and locations can be surveyed before the disease develops.
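One very simple way to picture the county-by-month prioritization described above is to rank county-month pairs by an observed or predicted case rate. The sketch below does only that. The ranking rule, the county labels, and the rates are hypothetical, and a working system would presumably combine several of the factors discussed in this section.

```python
def rank_priorities(rates, top_n=3):
    """Return the top_n (county, month) keys ordered by descending case rate.

    Ties are broken by county name and then month so the ordering is deterministic.
    """
    ordered = sorted(rates.items(), key=lambda item: (-item[1], item[0]))
    return [key for key, _ in ordered[:top_n]]

# Hypothetical cases per 100,000 people, keyed by (county, month).
rates = {
    ("County A", 1): 51.0,
    ("County A", 5): 42.0,
    ("County B", 5): 37.5,
    ("County C", 4): 18.2,
}

priority_list = rank_priorities(rates, top_n=2)
```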
Conclusions
The One Health framework can be readily applied to the investigation of viral disease, and determination of critical environmental factors is an important part of this process. This study shows the existence of significant relationships between clinically reported human viral infections and environmental factors such as land use and precipitation. The identification of these relationships can assist in the determination of the most critical times and locations for which humans and animals are most at risk of viral infection. Once these times and locations are determined, surveillance systems can be implemented and interventions can be introduced to mitigate potential viral outbreaks. While Michigan was used as an example in this study, these concepts are still relevant regardless of where and when this methodology is implemented. The utilization of the One Health framework in its full capacity can better help to identify, predict, and prevent viral disease outbreaks.
Declaration of Competing Interest
The authors declare that they have no competing interest or any conflict of interest in this work.