Literature DB >> 32545581

Artificial Neural Network Modeling of Novel Coronavirus (COVID-19) Incidence Rates across the Continental United States.

Abolfazl Mollalo¹, Kiara M Rivera¹, Behzad Vahedi².

Abstract

Prediction of the COVID-19 incidence rate is a matter of global importance, particularly in the United States. As of 4 June 2020, more than 1.8 million confirmed cases and over 108 thousand deaths have been reported in this country. Few studies have examined nationwide modeling of COVID-19 incidence in the United States particularly using machine-learning algorithms. Thus, we collected and prepared a database of 57 candidate explanatory variables to examine the performance of multilayer perceptron (MLP) neural network in predicting the cumulative COVID-19 incidence rates across the continental United States. Our results indicated that a single-hidden-layer MLP could explain almost 65% of the correlation with ground truth for the holdout samples. Sensitivity analysis conducted on this model showed that the age-adjusted mortality rates of ischemic heart disease, pancreatic cancer, and leukemia, together with two socioeconomic and environmental factors (median household income and total precipitation), are among the most substantial factors for predicting COVID-19 incidence rates. Moreover, results of the logistic regression model indicated that these variables could explain the presence/absence of the hotspots of disease incidence that were identified by Getis-Ord Gi* (p < 0.05) in a geographic information system environment. The findings may provide useful insights for public health decision makers regarding the influence of potential risk factors associated with the COVID-19 incidence at the county level.

Entities: Disease Gene Species

Keywords: COVID-19 (Coronavirus); GIS; United States; artificial neural networks; multilayer perceptron

Mesh：

Year: 2020 PMID： 32545581 PMCID： PMC7344609 DOI： 10.3390/ijerph17124204

Source DB: PubMed Journal: Int J Environ Res Public Health ISSN： 1660-4601 Impact factor: 3.390

1. Introduction

Novel coronavirus disease (COVID-19) has rapidly spread worldwide, becoming a global health threat [1]. The disease was first identified in Wuhan, China, and continued to spread out across the world [2]. According to the World Health Organization [3], as of 4 June 2020, there have been more than 6.4 million confirmed cases and over 380 thousand deaths worldwide. These statistics have surpassed the number of deaths and cases for Middle East respiratory syndrome (MERS) and severe acute respiratory disorder (SARS) since their outbreaks [4]. The pandemic has directly impacted the economy, society, and healthcare systems. According to the International Monetary Fund [5], global economic growth in the year 2020 is estimated to be -3.0%, compared to +2.9% in 2019. The United Nations predicts that the pandemic can continue to adversely impact societies with perpetual disease spread due to improper policy interventions [6]. Although the United States is ranked number one in the global health security index [7], it is the leading country in the number of confirmed cases and deaths globally [8]. As of 4 June 2020, there have been over 1.8 million confirmed cases and more than 108,000 deaths in this country [9]. Moreover, the case fatality ratio (CFR) continues to fluctuate in this country. As of 4 June 2020, the United States ranks in ninth place worldwide, with a CFR of 5.8% [10]. Recent studies have demonstrated that preexisting conditions, such as cardiovascular diseases [11], respiratory diseases [12], cancer [13], infectious diseases [14], and substance abuse [15], can contribute to the elevated morbidity and mortality of COVID-19. In China, Zheng et al. [11] utilized the MERS virus as a reference and suggested that SARS-CoV-12 can cause cardiac failure and acute myocarditis. Although the findings were preliminary, they indicated that patients could experience chronic cardiovascular effects secondary to contracting the disease. Lippi and Henry [12] conducted a meta-analysis demonstrating that chronic obstructive pulmonary disease (COPD) patients are five times more at risk of contracting the SARS-CoV-2 virus. You et al. [13] alluded to the guidelines suggested by French medical oncologists on cancer patient care during the pandemic. In South Africa, Cox et al. [14] highlighted changes in tuberculosis (TB) patients’ treatment during the pandemic. In the United Kingdom, Marsden et al. [15] indicated how individuals with substance abuse disorders might experience addiction augmentation during the pandemic, consequently, increasing the risk for COVID-19 contraction. They suggested that substance abuse disorder may not be overlooked when addressing preexisting conditions in COVID-19 patients. In addition to preexisting conditions, environmental [16], demographic, and socioeconomic [17] factors can potentially influence COVID-19 incidence. For instance, Wang et al. [16] indicated that COVID-19 transmission is influenced by temperature variability. Their results suggest that reduced COVID-19 transmission is associated with higher humidity and temperature. In the United States, Mollalo et al. [17] suggested that higher percentages of nurse practitioners and black females and higher income inequality at the county level could explain 68.1% of COVID-19 incidence geographic variations. Artificial neural networks (ANNs) are relatively novel techniques to model complex non-linear relationships in spatial epidemiology [18]. The techniques have been applied in a variety of fields, including but not limited to environmental science [19,20], agriculture [21], finance [22,23], artificial intelligence [24], epidemiology and public health [25,26,27]. Reddy and Imler [26] demonstrated that ANNs could provide reliable predictions for chronic diseases, such as cirrhosis patients with hepatocellular carcinoma. They found high sensitivity (80.61–86.67%) and specificity (99.88–99.95%), corresponding to demographic and physiological inputs. Badnjević et al. [28] incorporated ANNs to classify asthma; they found high levels of sensitivity (97.11%) in asthmatic individuals and specificity (98.85%) in healthy individuals. Their findings suggested that ANNs can be appropriate techniques for asthma detection. Due to a lack of research on the spatial complexities of COVID-19 at the national level, in this study, we leveraged the potential of ANNs in identifying complex spatial patterns and the power of geographic information systems (GIS) in spatial analysis [29,30] to predict county-level COVID-19 incidence rates in the continental United States. We employed one of the widely used topologies of ANNs that is described in Section 2.4.

2. Materials and Methods

2.1. Data Collection and Preparation

COVID-19 is continually monitored by governmental health agencies and institutions of higher learning, such as the US Centers for Disease and Control and Johns Hopkins University [31]. In this study, we compiled a database of 57 candidate variables that may predict county-level cumulative disease incidence as a dependent variable. From January 22 to April 25, 2020, cumulative numbers of confirmed cases of COVID-19 across the continental United States were collected at the county level from USAFacts (usafacts.org) and normalized by populations. The counties (n = 3109) were considered as samples that represent the status of the disease in the US. In this study, socioeconomic (such as household income, income inequalities, and unemployment rate), behavioral (such as smoking), environmental (such as temperature, precipitation, and air pollution), topographic (such as altitude, and terrain slope), and demographic (such as proportions of age groups, race, gender, and access to primary care) factors were prepared at the county level and were used as explanatory variables. To avoid reiteration, a complete description of the used variables has been provided in Mollalo et al. [17]. In addition to the above explanatory variables, which were also used in the study of Mollalo et al. [17], age-adjusted mortality rates of several diseases were incorporated, including infectious diseases (i.e., TB, HIV/AIDS, hepatitis, and lower respiratory infection), cardiovascular diseases (i.e., cerebrovascular disease, hypertensive heart disease, ischemic heart disease, cardiomyopathy and myocarditis, atrial fibrillation and peripheral vascular disease), chronic respiratory diseases (i.e., COPD, asthma, interstitial lung disease, and pulmonary sarcoidosis), cancer (i.e., pancreatic, gallbladder and biliary tract, mesothelioma, Hodgkin lymphoma, leukemia, tracheal, bronchus, and lung cancer), and substance use disorders (i.e., drug and alcohol use). The data were retrieved from the University of Washington Global Health Data Exchange (http://ghdx.healthdata.org/us-data) and joined to the preexisting database. All data were collected and prepared at the county level and are publicly available. A list of all variables can be found in the Supplementary Materials.

2.2. Spatial Analysis

We examined the geographic distribution of the COVID-19 incidence rate using global and local indices. The global Moran’s index [32,33] was used to identify the overall pattern (random, clustered, or dispersed) of disease incidence rate using the following formula: where and are the deviations of COVID-19 incidence rates from the mean incidence rate for county and county , respectively; is the spatial weight between county and county , which is non-zero when the counties are neighbors (i.e., share borders); and is the total number of counties. The value of ranges between −1 and +1. The values close to 0 indicate random distribution (null hypothesis), while values close to +1 and −1, respectively, indicate positive and negative spatial autocorrelations [34,35]. As the global Moran’s index is unable to identify the location of hotspots [35], Getis–Ord Gi*, statistics developed by Getis and Ord [36] were used to identify the hotspots of COVID-19 incidence rates (p < 0.05) as follows [37]: The positive and high value of indicates a more intense clustering of high values (hotspot(s)). The output of the statistic was mapped in ArcGIS 10.7 (Esri, Redlands, CA, USA) to locate the hotspots of COVID-19 incidence rates.

2.3. Feature Selection

The presence of a relatively large number (n = 57) of potentially relevant variables can create a technical problem and a theoretical discrepancy, which can in turn decrease the generalizability of the neural networks [38]. Therefore, we applied the Boruta algorithm [39] to identify feature importance, and ultimately chose “all-relevant” important features [40]. This algorithm is a wrapper around the Random Forest classification algorithm and is implemented in the “Boruta” package in R. To determine important and unimportant features, this algorithm creates random shadow variables and runs a random forest classifier on the set of original and shadow variables. Based on the results of a statistical test (using z-scores), the algorithm iteratively removes the variables that have lower z-scores compared to the shadow variables [39]. After performing the Boruta feature selection algorithm and also Pearson’s correlation analysis on the training dataset, important and less correlated (r < 0.7) variables were identified and selected as input variables in the neural networks.

2.4. Artificial Neural Networks

Artificial neural networks (ANNs) are computational structures that can learn the relationship between a set of input and output variables through an iterative learning process. These networks use simple computational operations such as addition and multiplication, yet they are capable of solving complex, non-linear problems [41,42,43]. Once a network is properly trained, it can be used to predict a variable of interest based on an independent (holdout) dataset, usually with minimal modifications [44]. The main components of ANNs are neurons that are organized in layers and are fully connected to the next layer by a set of weights (edges). Each ANN consists of one input layer, one output layer, and at least one hidden layer. The simplest form of ANN is called a perceptron, first introduced by Rosenblatt [45], which is the building block of neural networks. In a perceptron, each input is multiplied by a corresponding weight and then aggregated by a mathematical function called “activation of the neuron.” Another function then computes the output. ANNs are a set of layers that are created by stacking perceptrons. For instance, if the inputs to the ith perceptron in a network are denoted by , assuming that a summation function is used to calculate the outputs (denoted by ), we will have [44]: where n is the number of inputs; m is the number of neurons in the current layer; is the weight of the th neuron (th input to the th cell), and is a bias term. In matrix form, can be simplified to: where Given a specific loss function, the perceptron can reach better estimates of the output values by adjusting the weights and bias terms through an iterative process referred to as error-correction learning. This process calculates the “errors” using observed and estimated values and “corrects” network parameters based on those errors. Given the estimated value of the network output at iteration n, (i.e., ), and the observed output value , a loss term is defined by [46]: where Loss is a function of and , which gives a measure of the difference between observed and estimated output values and is defined based on the type of problem. This Loss term can be used locally at each neuron to update the weights of the network (in that neuron) using gradient descent learning: where, at iteration n, is the weight from neuron j to neuron i, is the step size, and is the partial derivative (gradient) of Loss with respect to . Step size is one of the (hyper) parameters of a network and can be optimized by trial and error. A similar procedure is used to update bias terms. The activation function is a non-linear function applied to each neuron to transfer its values into a known range, for instance, [−1, 1] or [0, 1]. The most common activation functions in ANNs are rectified linear unit (ReLU), sigmoid, and hyperbolic tangent (tanh) [47]. The summation term in Equation 4 acts as an activation function for the perceptron. In this study, the performance of multilayer perceptron (MLP) neural networks in modeling the disease incidence is investigated across the continental United States. MLP is a variant of the (single) perceptron model explained above and is one of the most popular classes of feedforward ANNs, with one or more hidden layers between the input and output layers [48]. MLP is used in supervised learning tasks for classification or regression. Figure 1 represents the topology of the MLP neural network. In this regression study, we employed MLP with 1 and 2 hidden layers. The “Neuralnet” package in R was used to train the MLP.

Figure 1

The topology of MLP neural network.

2.5. Model Performance

The entire dataset was randomly divided into three different categories: 1) training samples: 60% (n = 1865) of data used for developing the models; 2) cross-validation samples: 15% (n = 466) of data used to fine-tune network weights and to avoid overfitting; 3) holdout samples: 25% (n = 777) of data used to test the accuracy and generalizability of the models. The same partitioned data were used for all models for the purpose of comparison. The process of training models stopped at earlier stages to avoid overfitting. The performances of neural networks in predicting COVID-19 cumulative incidence rate (output) based on selected variables (inputs) were compared to each other, and to the linear regression model as a baseline on holdout samples. We used three different evaluation measures for accuracy assessments: root-mean-square error (RMSE), mean absolute error (MAE), and the correlation coefficient between observed COVID-19 incidence rate and model predictions (r). In this study, the model with minimum error values and a higher correlation coefficient was considered as the optimal model [47]. Below are the formulae to assess the accuracies: where is the observed value of the COVID-19 incidence rate, is the predicted value by the model, and is the number of observations on a holdout dataset. Sensitivity analysis was carried out on the optimal model to assess the contributions of variables in predicting disease incidence. Finally, vanilla logistic regression was utilized to explain the relationship of the most contributing factors obtained from sensitivity analysis and the presence/absence of hotspots identified by Getis-Ord Gi*.

3. Results

Results of spatial analysis with Global Moran’s I indicated that the distribution of COVID-19 incidence rate in the continental United States is clustered (Index: 0.36, z-score: 34.75, p < 0.0001), rejecting the null hypothesis (random distribution). Moreover, Getis-Ord Gi* could identify the location of hotspots of disease incidence rates (Figure 2). In total, 217 counties were identified as hotspots (p < 0.05), which were mainly located in the northeastern regions of the continental United States, western Georgia, central Ohio, southern Louisiana, and northeast Iowa.

Figure 2

Locations of hotspots of COVID-19 incidence identified by Getis-Ord Gi*, continental United States.

The Boruta algorithm and Pearson’s correlation analysis selected 34 variables as less correlated and important variables (Supplementary Materials), which were then fed as inputs to ANNs. Overall, among the activation functions, “tanh” had slightly better performance (lowest RMSE) and thus was used in the MLPs. We systematically increased the number of neurons in the hidden layers from 10 to 30. The lowest errors were obtained with 15 neurons in the hidden layer. The performances of all employed models, in terms of RMSE, MAE, and r between observed COVID-19 incidence rate and model predictions on the holdout sample are presented in Table 1. Correlation coefficients of the models ranged between 0.30 and 0.65. The linear regression model achieved the least correlations with observed COVID-19 incidence rates (r < 0.3). On the contrary, the MLP with one hidden layer achieved the highest correlation (r = 0.65), indicating a satisfactory agreement between model predictions and observed COVID-19 incidence rates. Moreover, the accuracy assessment of the results indicated that the prediction error of the MLP with one hidden layer is less than others (RMSE = 0.72, MAE = 0.36). The worst performance was obtained by linear regression (RMSE = 0.99, MAE = 0.58), while the MLP with one hidden layer yielded better accuracy and generalization capability than other models and was thus considered as the proposed model for further analysis. Figure 3 compares the z-scores of actual and predicted values of the dependent variable for holdout samples using the one-hidden-layer MLP.

Table 1

Comparative performance of the employed models (single run) to predict COVID-19 rates across the continental United States.

Model	Accuracy Assessment
	RMSE	r	MAE
Linear Regression	0.992517	0.295885	0.577808
MLP (1 hidden layer)	0.722409	0.645481	0.355843
MLP (2 hidden layers)	0.839806	0.466981	0.39755

Figure 3

Comparison of actual and predicted values of the dependent variable (z-scores) for holdout samples using the one-hidden-layer MLP.

We performed a sensitivity analysis to investigate the effect of each variable on the COVID-19 incidence rate using the MLP with one hidden layer. Figure 4 shows the top 10 contributing variables in order of importance. According to Figure 4, age-adjusted mortality rates of ischemic heart disease, pancreatic cancer, leukemia, Hodgkin’s disease, mesothelioma, and cardiovascular disease were among the top 10 factors with the highest relative importance for COVID-19 incidence rates, showing the potential importance of these preexisting conditions to COVID-19 incidence rate. In addition to the mortality rates, the proportion of males above 65 years old, higher median household income, precipitation, and maximum terrain slope were other important contributing variables.

Figure 4

The relative importance of the top 10 variables to the COVID-19 incidence rate, using sensitivity analysis by one hidden layer MLP, continental United States.

The logistic regression model was used to explain the association between the presence/absence of the identified hotspots (p < 0.05) of COVID-19 incidence rates and the explanatory variables obtained from sensitivity analysis. The results indicate that age-adjusted pancreatic cancer mortality rates followed by median household income, precipitation, and Hodgkin’s disease mortality rates could explain the positive association with the presence/absence of hotspots. Meanwhile, age-adjusted mortality rates for leukemia and cardiovascular disease, and maximum terrain slope, were negatively correlated with the occurrence of the hotspots. Table 2 summarizes the results of the logistic regression model statistics.

Table 2

Results of the logistic regression model in explaining the presence/absence of the hotspots (p < 0.05) of COVID-19 incidence rate, continental United States.

	Coefficient (B)	Standard Error	Wald Test	Degree of Freedom	Significance	Exp (B)
Constant	−2.763	0.086	1036.109	1	0.000	0.063
Median household income	0.403	0.079	26.139	1	0.000	1.497
Max terrain slope	−0.270	0.093	8.432	1	0.004	0.763
Precipitation	0.337	0.080	17.817	1	0.000	1.400
Pancreatitis cancer	0.636	0.095	44.672	1	0.000	1.889
Hodgkin’s Disease	0.409	0.100	16.596	1	0.000	1.505
Leukemia	−0.550	0.089	38.241	1	0.000	0.577
Cardiovascular	−0.414	0.118	12.350	1	0.000	0.661

4. Discussion

COVID-19 is an RNA virus that has the potential to mutate like the flu and measles, which may have contributed to the rapid transmission of the disease [49]. Due to the successful performance of ANNs in modeling many complex relationships, we examined the applicability of ANNs in predicting COVID-19 incidence in the continental United States. One of the main advantages of ANNs over widely applied traditional statistical techniques is their predictive capabilities even when working with noisy, complex, and incomplete datasets [18], which may also be useful for modeling other viruses with complex epidemiology, such as Zika virus. This motivated us to compile a relatively broad range (n = 57) of socioeconomic, behavioral, environmental, topographic, and demographic factors together with mortality rates of preexisting conditions. The variables were either suggested by previous studies or were based on domain knowledge (rarely investigated at the county level). Among the different combinations of network topologies and learning parameters that were examined, the MLP with one hidden layer performed better and thus was used for predictions. Sensitivity analysis of this model indicated that six age-adjusted mortality rates, including ischemic heart disease, pancreatic cancer, leukemia, Hodgkin’s disease, mesothelioma, and cardiovascular disease, had substantial impacts on county-level COVID-19 incidence across the continental United States. While there is still much to discover and research, the results suggest that the disease incidence may be influenced by the fluctuance in mortality rates’ distribution nationwide. Therefore, counties with elevated proportions of mortality rates of one or more chronic conditions may be more vulnerable to the higher incidence of COVID-19, when compared to other counties. As a result, it may potentially impact mortality rates during the pandemic. Lai et al. [50] indicated that comorbidities and cancer might be substantial contributors to COVID-19 mortality excess rates. They proposed that their findings are applicable to COVID-19 incidence and mortality in the United States. Hanff et al. [51] convey that COVID-19 mortality is significantly associated with comorbidities, including cardiovascular diseases (i.e., hypertension), suggesting that further studies may focus on detailed descriptions of comorbid physiological implications in COVID-19 patients, especially in the use pharmacological therapies. Alimadadi et al. [52] proposed that sophisticated analysis, such as machine learning and artificial intelligence, may aid in combating the pandemic. They also suggest that these methods may provide a better understanding of COVID-19 diagnosis, medication treatment, prevention, and hospital logistics. Although our findings seem consistent with recent studies, drawing conclusions at the individual level is not valid due to ecological fallacy, thus the findings can only be interpreted at the county level. According to our findings, demographic (i.e., % male above 65), socioeconomic (i.e., median household income), and environmental factors (i.e., maximum terrain slope and precipitation) are influential in predicting COVID-19 incidence, indicating that the disease is not merely affected or driven by physiological conditions. The findings support and extend the previous study of Mollalo et al. [17], who utilized multiscale geographically weighted regression to explain geographic county-level variations of COVID-19 incidence in the United States. Their results indicated that counties with higher median household income and income inequalities were positively correlated with elevated disease incidence, predominantly in the tristate area. Kavanagh et al. [53] proposed that socioeconomic and demographic factors are vital to consider when addressing the pandemic as they may be associated with income disparities that exist in the United States. This may be the case of some employees that may not have the option to work remotely from home, instead, potentially resulting in more frequent exposure to the virus, contributing to further spread of the disease. The study of Qu et al. [54] emphasize the significance of examining the effects of environmental factors pertaining to COVID-19. Their results suggest that COVID-19 may be aggravated by air pollutants (i.e., airborne particulate matter), influencing infectivity. Hence, further studies on preexisting conditions, socioeconomic, demographic, and environmental impacts on COVID-19 incidence preferably at a less coarse granularity level are essential. We acknowledge that the obtained consistency between the model and ground truth is not notably large. This is likely due to the limited knowledge about the recently emerged disease and factors that may be influential but not included in this study. Therefore, future studies should focus on improving the prediction accuracy of this initial model. Additionally, even though no significant difference is observed between the performance of MLP networks with one and two hidden layers, there may still exist complex relationships in the data that are not captured. This leads us to another limitation of this study, which is the number of training samples. With a higher amount of training data, one could apply deeper networks, i.e., networks with more than two hidden layers, and leverage the power of deep learning models. Deeper neural networks can capture potential non-linearity in the relationship between dependent and independent variables by stacking two or more hidden layers. Thus, such networks are, in general, capable of reaching higher accuracies and can reveal the nuances of the data. However, the amount of training data that was available in this study does not justify utilizing deep networks. A few possible solutions to increase the amount of data are to consider a longer temporal interval (which was not possible in this case), to incorporate data from other countries and regions, to use finer spatial units data (if available), or to use data augmentation techniques to (artificially) generate more training data and features. Moreover, although adjusted mortality rates of the diseases used in this study cannot be directly interpreted as preexisting conditions, higher mortality rates of a certain disease could allude to a higher incidence rate of it. Therefore, this study could be used to further investigate any potential correlation between disease prevalence and COVID-19 incidence. After more than three months since the first confirmed case of COVID-19 in the US, and due to the substantial economic and social impacts of the pandemic itself and the resulting lockdown policies, discussions regarding “re-opening the country” are omnipresent. The findings of this paper could be used as one of the many guidelines needed by policymakers to decide if and where (at the county level) lockdown policies should be relaxed.

5. Conclusions

In this study, we examined the applicability of multi-layer perceptron artificial neural networks in modeling cumulative incidence of COVID-19 at the county-level across the continental United States. Although the employed model indicated a reasonable but not large consistency with ground-truth on holdout samples, the prediction capability of the model requires a significant improvement possibly by incorporating new related variables or perhaps by employing different machine learning algorithms. However, with the obtained accuracy, (age-adjusted) mortality rates of ischemic heart disease, pancreatic cancer, leukemia, Hodgkin’s disease, mesothelioma, and cardiovascular disease together with two socioeconomic and environmental factors (median household income and total precipitation) could contribute with the disease incidence. Therefore, further studies of the factors and their associations with the disease may reveal useful information for monitoring COVID-19 outbreak.

20 in total

1. The perceptron: a probabilistic model for information storage and organization in the brain.

Authors: F ROSENBLATT
Journal: Psychol Rev Date: 1958-11 Impact factor: 8.934

2. Spatial and spatio-temporal analysis of human brucellosis in Iran.

Authors: Abolfazl Mollalo; Abbas Alimohammadi; Mostafa Khoshabi
Journal: Trans R Soc Trop Med Hyg Date: 2014-08-25 Impact factor: 2.184

Review 3. Deep learning.

Authors: Yann LeCun; Yoshua Bengio; Geoffrey Hinton
Journal: Nature Date: 2015-05-28 Impact factor: 49.962

4. A 24-year exploratory spatial data analysis of Lyme disease incidence rate in Connecticut, USA.

Authors: Abolfazl Mollalo; Jason K Blackburn; Lillian R Morris; Gregory E Glass
Journal: Geospat Health Date: 2017-11-08 Impact factor: 1.212

5. Meteorological, environmental remote sensing and neural network analysis of the epidemiology of malaria transmission in Thailand.

Authors: Richard Kiang; Farida Adimi; Valerii Soika; Joseph Nigro; Pratap Singhasivanon; Jeeraphat Sirichaisinthop; Somjai Leemingsawat; Chamnarn Apiwathnasorn; Sornchai Looareesuwan
Journal: Geospat Health Date: 2006-11 Impact factor: 1.212

6. Machine learning approaches in GIS-based ecological modeling of the sand fly Phlebotomus papatasi, a vector of zoonotic cutaneous leishmaniasis in Golestan province, Iran.

Authors: Abolfazl Mollalo; Ali Sadeghian; Glenn D Israel; Parisa Rashidi; Aioub Sofizadeh; Gregory E Glass
Journal: Acta Trop Date: 2018-09-07 Impact factor: 3.112

7. GIS-based spatial modeling of COVID-19 incidence rate in the continental United States.

Authors: Abolfazl Mollalo; Behzad Vahedi; Kiara M Rivera
Journal: Sci Total Environ Date: 2020-04-22 Impact factor: 7.963

8. An interactive web-based dashboard to track COVID-19 in real time.

Authors: Ensheng Dong; Hongru Du; Lauren Gardner
Journal: Lancet Infect Dis Date: 2020-02-19 Impact factor: 25.071

9. Artificial intelligence and machine learning to fight COVID-19.

Authors: Ahmad Alimadadi; Sachin Aryal; Ishan Manandhar; Patricia B Munroe; Bina Joe; Xi Cheng
Journal: Physiol Genomics Date: 2020-03-27 Impact factor: 3.107

10. Is There an Association Between COVID-19 Mortality and the Renin-Angiotensin System? A Call for Epidemiologic Investigations.

Authors: Thomas C Hanff; Michael O Harhay; Tyler S Brown; Jordana B Cohen; Amir M Mohareb
Journal: Clin Infect Dis Date: 2020-07-28 Impact factor: 9.079

33 in total

Review 1. Artificial intelligence for forecasting and diagnosing COVID-19 pandemic: A focused review.

Authors: Carmela Comito; Clara Pizzuti
Journal: Artif Intell Med Date: 2022-03-28 Impact factor: 7.011

2. Predicting COVID-19 county-level case number trend by combining demographic characteristics and social distancing policies.

Authors: Megan Mun Li; Anh Pham; Tsung-Ting Kuo
Journal: JAMIA Open Date: 2022-06-25

3. Coronavirus disease vulnerability map using a geographic information system (GIS) from 16 April to 16 May 2020.

Authors: Seyed Vahid Razavi-Termeh; Abolghasem Sadeghi-Niaraki; Soo-Mi Choi
Journal: Phys Chem Earth (2002) Date: 2021-06-16 Impact factor: 3.311

4. Design of an artificial neural network to predict mortality among COVID-19 patients.

Authors: Mostafa Shanbehzadeh; Raoof Nopour; Hadi Kazemi-Arpanahi
Journal: Inform Med Unlocked Date: 2022-05-29

5. Coronavirus disease (COVID-19) cases analysis using machine-learning applications.

Authors: Ameer Sardar Kwekha-Rashid; Heamn N Abduljabbar; Bilal Alhayani
Journal: Appl Nanosci Date: 2021-05-21 Impact factor: 3.869

6. Spatial statistical analysis of pre-existing mortalities of 20 diseases with COVID-19 mortalities in the continental United States.

Authors: Abolfazl Mollalo; Kiara M Rivera; Nasim Vahabi
Journal: Sustain Cities Soc Date: 2021-01-28 Impact factor: 7.587

7. Visualizing and assessing US county-level COVID19 vulnerability.

Authors: Gina Cahill; Carleigh Kutac; Nicholas L Rider
Journal: Am J Infect Control Date: 2020-12-19 Impact factor: 2.918

8. COVID-19 Community Incidence and Associated Neighborhood-Level Characteristics in Houston, Texas, USA.

Authors: Abiodun O Oluyomi; Sarah M Gunter; Lauren M Leining; Kristy O Murray; Chris Amos
Journal: Int J Environ Res Public Health Date: 2021-02-04 Impact factor: 3.390

9. Sociodemographic determinants of COVID-19 incidence rates in Oman: Geospatial modelling using multiscale geographically weighted regression (MGWR).

Authors: Shawky Mansour; Abdullah Al Kindi; Alkhattab Al-Said; Adham Al-Said; Peter Atkinson
Journal: Sustain Cities Soc Date: 2020-12-02 Impact factor: 7.587

10. Predicting the hotspots of age-adjusted mortality rates of lower respiratory infection across the continental United States: Integration of GIS, spatial statistics and machine learning algorithms.

Authors: Abolfazl Mollalo; Behrooz Vahedi; Shreejana Bhattarai; Laura C Hopkins; Swagata Banik; Behzad Vahedi
Journal: Int J Med Inform Date: 2020-08-22 Impact factor: 4.046