Detecting SARS-CoV-2 RNA prone clusters in a municipal wastewater network using fuzzy-Bayesian optimization model to facilitate wastewater-based epidemiology.

Srinivas Rallapalli¹, Shubham Aggarwal², Ajit Pratap Singh².

Abstract

The coronavirus disease (COVID-19) pandemic has not only become a worldwide health emergency but has also devastated the global economy. Despite appreciable research, identifying targeted populations for testing and tracking the spread of COVID-19 at a larger scale remains a daunting challenge. Infected individuals or communities need to be identified quickly to check the spread. Large-scale diagnostic testing of individuals has limitations, as it cannot provide information swiftly for large populations, which is pivotal for containing the spread at the early stage of an outbreak. Recently, scientists have been exploring the presence of SARS-CoV-2 RNA in the faeces discharged into municipal wastewater. Wastewater sampling could be a potential tool to expedite the early identification of infected communities by detecting biomarkers of the virus. However, it needs a targeted approach to choose optimized locations for wastewater sampling. The present study proposes a novel fuzzy-based Bayesian model to identify targeted populations and optimized locations with a maximum probability of detecting SARS-CoV-2 RNA in wastewater networks. Consequently, real-time monitoring of SARS-CoV-2 RNA in wastewater using autosamplers or biosensors could be deployed efficiently. Fourteen criteria, such as population density, patients with comorbidity, and quarantine and hospital facilities, are analysed using the data of 14 lakh (1.4 million) individuals infected by COVID-19 in the USA. The uniqueness of the proposed model is its ability to deal with the uncertainty associated with the data and decision makers' opinions using fuzzy logic, which is fused with a Bayesian approach. Evidence-based virus detection in wastewater not only facilitates focused testing but also identifies potential communities for vaccine distribution. Consequently, governments can reduce lockdown periods, thereby relieving human stress and boosting economic growth.

Entities: Chemical

Keywords: Bayesian approach; Biosensors; Coronavirus; Fuzzy logic; Vaccine; Wastewater sampling

Substances:
RNA, Viral
Waste Water

Year: 2021 PMID: 33714094 PMCID: PMC7938789 DOI: 10.1016/j.scitotenv.2021.146294

Source DB: PubMed Journal: Sci Total Environ ISSN: 0048-9697 Impact factor: 7.963

Introduction

Coronavirus disease 2019 (COVID-19), a new respiratory disease caused by the novel ‘Severe Acute Respiratory Syndrome Coronavirus-2’ (SARS-CoV-2) (Gorbalenya et al., 2020), emerged in 2019. Due to the fast spread of the disease, the only viable option adopted by many countries to suppress it was blanket isolation (i.e., stay-at-home and lockdown policies) (Casella, 2020; Holshue et al., 2020), which affected businesses, economies and the health of individuals, leading to security and safety concerns. Overall, this model (lockdown/shutdown) has proved to be a great threat for federal and state governments, pressuring them to quickly devise rational, informed decisions so that people could return to their occupations (Daughton, 2020a, Daughton, 2020b). The primary mode of transmission of COVID-19 is through droplets produced while coughing, sneezing and breathing, and also through contact with contaminated surfaces. Experts and researchers, however, have also reported the presence of SARS-CoV-2 RNA in the faeces of infected individuals (Chen et al., 2020; Wu et al., 2020a, Wu et al., 2020b). This could be an invaluable asset for developing wastewater-based early warning tools, which could help curb the spread of the pandemic (Xing et al., 2020). Wastewater-Based Epidemiology (WBE) is one of the promising early warning tools for monitoring COVID-19 infection (Daughton, 2020a, Daughton, 2020b; Mallapaty, 2020). WBE is a wastewater monitoring technique capable of estimating the health status of groups of the population. WBE could help in targeting the use of clinical diagnostic testing by monitoring key and confined sub-populations, which would not only reduce the demand for diagnostic testing but also ease supply-chain shortages caused by limited manufacturing capacity. WBE would guide focused and smarter testing by avoiding testing of populations likely to be ‘negative’.
In this manner, WBE magnifies the value of diagnostic tests by making them count and by reducing the ratio of tests administered to cases confirmed. By locating the targeted subpopulations infected by the disease, WBE increases the efficiency of diagnostic testing, which is presently being implemented for mass surveillance that is not only time-intensive but also expensive (Naddeo and Liu, 2020). Also, diagnostic testing was never meant for mass inspection, as it can pose an acute threat to healthcare workers. An advantage of WBE is that its practical application is not restricted to detecting the SARS-CoV-2 nucleic acid or antigen itself (or other contagious surrogates). WBE could effectively be applied by targeting the detection of endogenous biomarkers, which become highly elevated in the diseased condition. Biomarkers present in human metabolic excretion products serve as indicators of consumption or exposure of the population to infectious diseases, pathogens or chemicals (Choi et al., 2018). Several biomarkers are suitable for this purpose, such as the inflammatory response biomarkers and the immunoglobulins (Ig) produced by the immune system in response to the pathogen. Immunoglobulins can be valuable indicators for detection of COVID-19 because the antibodies IgM and IgG are produced during different time frames from the onset of disease. The IgM antibodies can be detected within 4 to 10 days, whereas IgG manifests about 2 weeks after disease onset. In addition, several other technologies could be potential players in addressing the needs of WBE applications, such as biosensors based on surface enhanced Raman scattering (SERS), field-effect transistors (FET), lab-on-a-chip (LOC), etc. This feature of WBE goes beyond polymerase chain reaction (PCR) and antigen testing, which measure the original nucleic acid of the infectious agent.
There are several benefits associated with testing of biomarkers as compared to PCR (Rice and Kasprzyk-Hordern, 2019; Polo et al., 2020; Xagoraraki and O'Brien, 2020): (i) biomarkers are excreted in both urine and faeces, and their levels could help in tracking the severity of infection; (ii) testing might have compact ranges for per-capita excretion, resulting in better calibration and estimation of the number of infected individuals in a community; (iii) biomarkers are more universally excreted among infected individuals; and (iv) PCR might sometimes overestimate the incidence or intensity of existing infections when the detected RNA fragments originate not from viable virus but from virus remnants (litter) of cleared infections. This could be a possible reason why some recovered patients test positive for COVID-19 again (Cha and Smith, 2020). Detection of biomarkers can help in identifying asymptomatic individuals (Daughton, 2020a, Daughton, 2020b), and the testing could be done more frequently. Recent research shows biosensors to be effective tools for analysing wastewater for potential biomarkers and for public health assessment using WBE (Yang et al., 2017; Mao et al., 2020; Mao et al., 2021). Biosensors are low-cost, highly sensitive, portable devices that work on the basis of biochemical reactions facilitated by a biological receptor/biorecognition element, such as antibodies, enzymes, cells, nucleic acids and even microorganisms (Ejeian et al., 2018). Effective in-situ deployment of WBE requires the use of biosensors at the designated locations to detect the presence of SARS-CoV-2 RNA in the wastewater samples. Mao et al. (2020) have proposed an integrated biosensor system with mobile health for wastewater-based epidemiology that can give early warning of COVID-19 and accelerate screening and diagnosis of potentially infected individuals, thereby addressing public health issues in a fast, affordable and reliable way.
In general, they detect various targets in the wastewater based on optical, electrochemical and thermal signals (Chen et al., 2019). Biosensors used for WBE analyse drug biomarkers, population markers and health markers, and could provide real-time data trends to healthcare agencies and government bodies to build an early warning system to prevent community-based disease outbreaks (Mao et al., 2020). Biosensors are emerging analytical tools and are becoming significantly important in healthcare, environmental monitoring, and drug discovery (Mao et al., 2021). There are biosensors based on electrochemical reactions (EC), field-effect transistors (FET), surface plasmon resonance (SPR), or surface enhanced Raman scattering (SERS) for the real-time and remote detection of coronavirus RNA at wastewater treatment plants as well as at specific residential building sites. Currently, autosampling is a more widely used technique for WBE monitoring of SARS-CoV-2 RNA (Barceló, 2020) than biosensors (Mao et al., 2020). However, indiscriminate monitoring using WBE autosamplers or biosensors at all locations would again be time-consuming and costly. Therefore, a focused modelling framework is required to identify targeted locations where there is maximum probability of occurrence of severely affected populations by studying the biomarkers present in wastewater networks. This would guide medical agencies to adopt massive testing strategies in those locations. Worldwide, various trends in the spread of COVID-19 have been observed (Ioannidis et al., 2020). For example, some Asian countries like India and Pakistan observed very slow growth in the initial number of COVID-19 cases as compared to other regions. Even within the USA, different spread patterns have been observed for different states (Ioannidis et al., 2020). This is to be expected, as multiple factors contribute simultaneously to the spread of the virus.
These factors could be population demography, degree of urbanization, literacy rate, availability of medical and quarantine facilities, atmospheric conditions, strict implementation of state laws, air pollution levels, etc. Although WBE is a powerful tool for giving specific directions for containing and mitigating the disease, its focused implementation at a large scale needs a modelling approach. It is important to understand the prominent factors responsible for the spread of the disease at a given location in order to propose the most probable sites for testing wastewater for the novel coronavirus using sensor-based technology. This demands a thorough study of the factors as well as their cause-effect relationships. The Bayesian approach, based on the concept of conditional probability, could in conjunction with WBE be considered superior to other contemporary approaches for the systematic study of complex systems and the interrelationships of their components (Zhang et al., 2011). The Bayesian approach can model complex case scenarios, as it can classify and find patterns in complex datasets, especially in case studies from healthcare domains. Researchers have used it for the prognosis of health injuries, pneumonia and breast cancer (Burnside et al., 2000; Ito et al., 2015). A Bayesian model can compensate for the limitations of contemporary approaches, namely neural networks, classification and regression trees, and decision trees, such as their inability to effectively explain the cause-effect relationships among the variables (Phan et al., 2019). In contrast, the Bayesian approach, based on graph and probability theory, is capable of presenting a comprehensive understanding of the relationships among both input and output variables in the form of a network, which is essential especially when health-related solutions need to be analysed (Xue et al., 2017).
Although the Bayesian approach is useful in finding the probability associated with the severity of spread at a particular location, which is required for effectively using WBE-based autosamplers or biosensors, the accuracy of the model can be further enhanced by coupling it with a fuzzy logic-based technique. Fuzzy logic can quantify the uncertainties associated with datasets and the imprecise judgements of decision makers (Srinivas et al., 2018). The Bayesian model uses a large number of datasets such as population demography, degree of urbanization, literacy rate, atmospheric conditions, strict implementation of state laws, air pollution levels, etc. These datasets exhibit tremendous uncertainties due to randomness. At the same time, there is uncertainty due to the imprecise opinions of the experts required for developing the various COVID-19 scenarios used in the Bayesian model (Rehana and Mujumdar, 2009; Srinivas et al., 2018). Ye et al. (2013) simultaneously used fuzzy logic and probability distribution functions for water quality management. Pan et al. (2018) addressed uncertainty, hesitancy, and parameterization problems associated with water reuse in Penticton city using a generalized intuitionistic fuzzy soft set-based decision-making framework. Ouyang and Guo (2018) integrated fuzzy set theory with fuzzy-based Delphi and the analytical hierarchical process (AHP) for selecting optimal wastewater treatment alternatives. Simultaneous and integrated analysis of the uncertainties associated with probabilistic models can enhance the credibility of probability estimations and decision making (Singh et al., 2007; Srinivas et al., 2017). The proposed fuzzy-based Bayesian model has three novel aspects, viz.
(i) it provides targeted population clusters and optimized locations for implementation of WBE to detect COVID-19; (ii) it expresses the imprecision and uncertainties associated with pandemic factor data and decision makers using fuzzy membership functions; and (iii) it is flexible, as it allows addition, deletion, and modification of pandemic factors with time and space, which makes it suitable not only for the current pandemic (COVID-19) but also for any such outbreaks in the future (Maseleno et al., 2015; Srinivas et al., 2018). In particular, the study aims to: (1) demonstrate the application of the proposed model on COVID-19 data of the USA based on eleven critical factors; (2) facilitate focused installation of WBE-based biosensors/autosamplers by identifying targeted populations and wastewater networks where there is maximum probability of detecting COVID-19; (3) perform scenario analysis of factors using 1536 different cases; and (4) incorporate imprecise viewpoints of COVID-19 experts for assigning each case to a particular severity of disease spread. In addition, identification of severely affected populations/communities would help governments plan optimal distribution of vaccines. Overall, this work not only presents a novel way to narrow down the area of diagnosis of individuals/communities infected by COVID-19, but also helps practitioners apply WBE effectively on the selected sites.

Materials and methods

Coupling Bayesian model with wastewater-based epidemiology

The COVID-19 Bayesian model is created in the form of a directed acyclic graph using the parameters affecting the spread of the disease, which are linked using directed arrows. Each parameter (node) used in the Bayesian network has its description or constituent states along with their corresponding probabilities. Nodes with arrows pointing away from them are known as parent nodes, and nodes with arrows pointing into them are called child nodes. A parent node that is not also a child node is called a root node, and a child node that is not also a parent node is called a leaf node. The root nodes have prior probabilities, whereas the child nodes always have conditional probabilities associated with them. A Bayesian network connection can take three different forms (Fig. 1). Fig. 1 also represents the various parameters and their cause-effect relationships chosen to identify targeted and optimized populations/communities for applying WBE.
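
The parent/child/root/leaf terminology can be illustrated with a minimal sketch. The edge list below is an assumption: it contains only a small subset of the links that the text describes for Fig. 1 (PDM and AWC feeding COM, and COM, PDN and TWC feeding the spread node), and the names `EDGES`, `classify_nodes` and `SPREAD` are hypothetical labels introduced here for illustration.

```python
# Illustrative subset of directed links (parent, child) taken from the text's
# description of Fig. 1. This is NOT the full network of the study.
EDGES = [
    ("PDM", "COM"),     # population demography -> comorbidities
    ("AWC", "COM"),     # age-wise categorization -> comorbidities
    ("COM", "SPREAD"),  # comorbidities -> COVID spread
    ("PDN", "SPREAD"),  # population density -> COVID spread
    ("TWC", "SPREAD"),  # temperature and weather -> COVID spread
]


def classify_nodes(edges):
    """Return (roots, leaves) for a list of (parent, child) edges."""
    parents = {src for src, _ in edges}
    children = {dst for _, dst in edges}
    roots = sorted(parents - children)   # parents that are never children
    leaves = sorted(children - parents)  # children that are never parents
    return roots, leaves


ROOTS, LEAVES = classify_nodes(EDGES)
print("root nodes:", ROOTS)
print("leaf nodes:", LEAVES)
```

In this subset, COM is both a child (of PDM and AWC) and a parent (of the spread node), so it is neither a root nor a leaf and would carry a conditional probability table.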

Fig. 1

Bayesian networks and cause-effect relationships of various parameters for estimating COVID spread.

Network A, shown in Fig. 1, is a serial connection network, where node A1 feeds into node A2 and node A2 feeds into node A3. For illustration, the node parameter ‘Population demography’ leads to ‘Comorbidities’, which in turn contributes towards ‘COVID spread’ in Fig. 1. The joint probability for such network types can be computed using Eq. (1).

Network B in Fig. 1 is a diverging connection network, where node B1 is parent to both B2 and B3, and child nodes B2 and B3 do not interact with each other. The variable ‘Demographic categorization of the region’ feeding into five parameters (‘Migration of people/Tourist visitors’, ‘Quarantine facilities’, ‘Education’, and so on) is an example of a diverging connection network with five diverging connections. The joint probability for diverging connection networks can be represented using Eq. (2).

In the converging connection network, the child node C1 is connected to both parent nodes C2 and C3. For illustration, ‘Population density’ and ‘Temperature and weather conditions’ are connected through the ‘COVID spread’ leaf node in Fig. 1. The joint probability for the converging network is represented using Eq. (3).

The formation of the COVID-19 Bayesian model involves establishing relationships among the parameters influencing the spread of coronavirus in a particular region and finding the associated conditional probabilities (Zoullouti et al., 2019) based on the data collected from various sources, as discussed in the next section. The resulting probabilities representing the COVID-19 spread severity help in identifying the targeted locations and communities for implementing WBE in a wastewater network. Consequently, biosensors or autosamplers can be installed in the targeted municipal wastewater drains to detect SARS-CoV-2 virus discharged through human faeces and urine.
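
For reference, the joint probabilities of the serial, diverging and converging connections described above are conventionally factorized as shown below. These are the standard textbook forms written with the node labels used in the text; they are given here on the assumption that Eqs. (1) to (3) of the study follow the same convention.

```latex
\begin{align*}
\text{Serial (A):}     \quad & P(A_1, A_2, A_3) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_2) \\
\text{Diverging (B):}  \quad & P(B_1, B_2, B_3) = P(B_1)\, P(B_2 \mid B_1)\, P(B_3 \mid B_1) \\
\text{Converging (C):} \quad & P(C_1, C_2, C_3) = P(C_2)\, P(C_3)\, P(C_1 \mid C_2, C_3)
\end{align*}
```

In each case the root nodes contribute a prior probability and every child node contributes a probability conditioned on its parents, which is why the converging form conditions C1 on both C2 and C3.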

COVID-19 parameters considered for proposed model

Based on existing literature, expert opinion, and other secondary sources (Ioannidis et al., 2020), the prominent factors responsible for the spread of COVID-19 are considered for constructing the Bayesian network (Roy and Ghosh, 2020; Tantrakarnapa et al., 2020). These factors are listed in Table 1. The dataset for the United States of America corresponding to these crucial parameters is retrieved from various authorized sources for the purpose of illustration and analysis using the COVID-19 Bayesian model. The major sources of the data are the Centers for Disease Control and Prevention (CDC, 2020), Calgary (2020), Health System Tracker (2020), National Weather Service (NWS, 2020), Statista (2020a, 2020b), Statistics Times (2020), UNODC (2016), World Tourism Organization (UNWTO) (2019), and World Population Review (2020). Specific data collected from each source are also listed during the course of the discussion. The collected data account for the COVID cases from the beginning of the pandemic until 20 October 2020, and they are used for developing the probability tables corresponding to each parameter in the COVID model following a top-down approach.

Table 1

Parameters used for constructing Bayesian network for COVID-19 model.

Node	Parameter	Description of state
DCR	Demographic categorization of the region	Rural, Urban
MPT	Migration of people/Tourist visitors	High, Low
QF	Quarantine facilities	Good, Poor
SSI	Strict regulations and systems implementation	Yes, No
EDU	Education	Good, Poor
HF	Healthcare facilities	Good, Poor
TWC	Temperature and weather conditions	Below 10, 10 to 20, 20 to 30, and Above 30
PDN	Population density	Low, High
PDM	Population demography	Male, Female
AWC	Age-wise categorization	<18, 19–29, 30–49, 50–84, and Above 85
COM	Comorbidities	Respiratory + Kidney, Obesity + Liver, Cardiovascular, Diabetes, Hypertension, and None

Migration of people/tourist visitors (MPT) is the driving factor, as the disease travelled from the city where the first cases were detected to its neighbouring cities and then to other countries across the world (Tantrakarnapa et al., 2020). Countries with low temperatures (temperature and weather conditions, TWC) are more prone to SARS-CoV-2 infection (Altamimi and Ahmed, 2020; Xie and Zhu, 2020). Studies show that temperature influences virus survival, as the virus declines more rapidly at 23–25 °C than at low temperatures (below 4 °C) (La Rosa et al., 2020). Riddell et al. (2020) analysed the effect of temperature (20 °C–40 °C) on the persistence of coronavirus on common surfaces. Casanova et al. (2010) showed that the inactivation of the virus becomes more rapid as temperature increases from 4 °C to 40 °C. At higher temperatures (>35 °C), virus viability is completely lost (Chan et al., 2011). The findings of Wu et al. (2020a, 2020b) reveal that for every 1 °C increase in temperature (ranging from −5.28 to 34.30 °C), there is a considerable reduction in daily new cases in 166 countries. Considering the evidence provided by the above-mentioned literature, the authors have chosen various temperature ranges (Table 1). To halt the steep rise in COVID-19 cases, governmental and self-quarantine facilities (QF) were strongly promoted, together with the strict regulations and systems implemented (SSI) by federal as well as state governments. The regulations passed included controlling crowd gatherings at restaurants, movie theatres, malls, spas, department stores, temples and so on. Implementing laws requires cooperation from people. Thus, the education (EDU) of individuals plays a crucial role. All these measures are supposed to slow down the intrusion of infected cases into a region.
However, the active cases need to be addressed effectively to check the spread. Therefore, the degree of improvement of the healthcare facilities (HF) is considered a crucial factor influencing the spread of the disease. This includes increasing the number of hospitals and the quality of service, and effective execution of surveys by health volunteers. Of the six aforementioned factors, all except ‘temperature and weather conditions’ can be grouped under a single thread, i.e. demographic categorization of the region (DCR), which indicates whether a region falls under the urban or rural category. For example, an urban region is expected to have high MPT, good QF, good SSI, good EDU, and good HF. Other important factors are population density (PDN) and population demography (PDM). In an article (Griffith, 2020) published by the Centers for Disease Control and Prevention (CDC, 2020), a disparity has been observed in the number of cases of men and women infected by COVID-19. Men account for 57% of COVID-19 deaths. This difference is not that significant; however, a significant change is observed in the COVID cases with ‘age-wise categorization (AWC)’. According to the CDC, individuals aged between 65 and 75 years are at nine times higher risk of death than individuals aged between 40 and 50 years (CDC, 2020). CDC (2021) also analysed the risk of COVID-19 infection, hospitalization, and death by age groups ranging from ‘0–4’ years to ‘85+’ years. Ioannidis et al. (2020) analysed population-level COVID-19 mortality risk for various age groups. Based on the evidence provided by the above-mentioned sources, the authors have chosen various age groups in this study (Table 1). Individuals with underlying medical conditions or comorbidities (COM) are another important category. The number of COVID cases with comorbidities varies with the gender and age group of individuals (Calgary, 2020). Hence, the AWC and PDM factors converge into the COM node in the COVID-19 Bayesian model in Fig. 1.
The ‘None’ state in the COM node accounts for cases without comorbidities (Table 1). Also, the connection of the ‘Healthcare facilities’ factor to ‘Quarantine facilities’ highlights the relationship between them, as the majority of locations that were converted into quarantine centres were either academic institutions or hospitals (Wang et al., 2020). Similarly, the nodes EDU and HF feeding into node MPT reflect the increased mobility observed when good medical and educational facilities are available in a given region. In this study, only the above-mentioned 11 factors and their relationships are considered and linguistically described based on the available data (Table 1). The prior and conditional probabilities for the root and parent-child nodes are computed using the available data for COVID-19 in the USA. For example, the prior probabilities for the node AWC and the conditional probabilities for the node COM are computed using the data represented in Figs. S1 and S2, respectively, of the Supplementary material. Fig. 2 represents the relative risk of COVID-19 death of a population more than 65 years of age versus those less than 65 years of age. Data pertaining to other factors have not been represented for brevity. Emphasis has been given to more closely analysing the data pertaining to the 13 severely affected states in the USA.

Fig. 2

Data related to age and relative risk for severely affected 13 states in USA.

Demographic categorization of the region (DCR) categorizes the given region into two categories: Rural and Urban. Each of these two region types has certain positive and negative aspects with respect to the spread of COVID-19. For example, the weaker healthcare infrastructure and resources in rural America (metro areas have 2.8 ICU beds per 10,000 population, whereas non-metro areas have only 1.7 (Orgera et al., 2020)) support the spread of COVID-19, while the low population density and lower mobility help in preventing COVID-19 spread. The annual data (1960 to 2020) of rural and urban population in the US have been procured from Statista (2020a, 2020b). Migration of people/tourist visitors (MPT) represents the total number of arrivals of international visitors, which is calculated based on the busy airport score (Roy and Ghosh, 2020). In this study, the total numbers of international passenger arrivals (IPA) into nine countries similar to the US in wealth and economic size are considered, and the data are obtained from the World Tourism Organization (UNWTO) (2019). The state values for the MPT node (i.e., High and Low) for the US have been obtained using normalization, giving the highest value (i.e., 1) to France with 89.3 million IPA; the normalized value for the USA, with 79.6 million IPA, is thus 0.89. Strict regulations and systems implementation (SSI) in the state is inferred from the number of criminal cases per 100,000 population of a given location. A higher number of criminal cases indicates poor implementation of laws. These data are retrieved from UNODC (2016). Normalization is performed using nine similar countries, among which Switzerland obtained the highest rank. The education (EDU) score is obtained based on the literacy rate of a region. The average literacy rate of the USA is 88% (World Population Review, 2020).
Healthcare facilities (HF) are estimated using hospital density in the given region (Health System Tracker, 2020). The hospital density for the US has been compared with that of other similar countries to estimate HF scores. Temperature and weather conditions (TWC) are obtained from the National Weather Service (NWS, 2020) of the USA. The TWC node is divided into four states at 10 °C intervals; this division is chosen to capture the temperatures relevant to survival of the COVID-19 virus, i.e. 4–25 °C (La Rosa et al., 2020). The population densities (PDN) for all countries around the globe, excluding outliers such as Monaco and Greenland, were extracted from Statistics Times (2020) and fitted to normal distribution curves to obtain the values of PDN for the US. Population demography (PDM) and age-wise categorization (AWC) of the US population are obtained from Statista (2020a, 2020b). Node PDM is divided into two states, namely ‘male’ and ‘female’. According to Griffith (2020), males show a 14% higher total number of COVID cases than females. The detailed age- and gender-wise data for the comorbidities (COM) have been procured from Calgary (2020). The COM node has been divided into six states, a couple of which combine two comorbidity types (e.g. Respiratory and Kidney). This is done to compensate for the low number of cases. The quantification of these categories has been performed using a fuzzy scale, as discussed in the subsequent section.
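
The normalization described for the MPT node (scaling each country's arrivals by the largest value) can be sketched as follows. Only the two figures quoted in the text (France and the USA) are used; the remaining comparison countries are not listed in the text and are therefore omitted. The names `max_normalize`, `IPA_MILLIONS` and `SCORES` are hypothetical.

```python
def max_normalize(values):
    """Scale every value by the largest one, so the maximum maps to 1."""
    peak = max(values.values())
    return {name: value / peak for name, value in values.items()}


# International passenger arrivals (millions), as quoted in the text.
# The other comparison countries are not given and are left out here.
IPA_MILLIONS = {"France": 89.3, "USA": 79.6}

SCORES = max_normalize(IPA_MILLIONS)
print(f"USA normalized MPT score: {SCORES['USA']:.2f}")
```

The same scaling would apply to the SSI and HF scores, assuming those were normalized against their respective top-ranked countries in the same way.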

Incorporating uncertainties in the model using fuzzy logic

Finally, a group of experts is consulted to estimate the degree of severity of COVID-19 spread based on analysis of these factors. A questionnaire survey has been conducted using a fuzzy Delphi AHP approach (Minatour et al., 2016) to procure the responses of a group of experts from healthcare agencies, a coronavirus disease expert team, governmental decision-making bodies, a COVID-19 response team, and vaccine manufacturers. Uncertainty associated with conflicting opinions of the experts is dealt with using fuzzy membership functions. A fuzzy entropy approach has been used to analyse the linguistic responses of the experts under a fuzzy environment. Details of this method have been explained in Srinivas et al. (2018). Table 2 shows the fuzzy membership ratings against linguistic responses of the experts. For example, after assessing the data of the quarantine facilities, an expert may give a qualitative assessment of ‘good’, and the corresponding membership rating is [3, 4, 6, 7]. Data pertaining to ‘Strict regulations and systems implementation (SSI)’ can be rated either as ‘Yes’ or ‘No’ using the trapezoidal membership functions [1, 2, 3, 4] and [3, 4, 5, 6] respectively, instead of binary ratings (either 1 or 0), which do not take into account the uncertainty associated with the experts' qualitative assessment (yes or no). As far as categories of comorbidities are concerned, membership functions are assigned based on the severity of their impact on human health. For example, ‘Respiratory + Kidney’ disease has been assigned [8, 9, 10, 10], Cardiovascular [5, 7, 8, 9], Obesity + Liver [3, 4, 6, 7], Diabetes [3, 4, 6, 7], Hypertension [3, 4, 6, 7], and None [0, 0, 0, 0]. In a similar manner, datasets of all factors are assessed by experts. The membership functions chosen for this study to capture the uncertain, diverse and conflicting viewpoints of experts are represented in Fig. 3, where the abscissa represents the membership rating and the ordinate gives the corresponding membership grade.
For illustration, if the severity is ‘high’, the corresponding fuzzy rating is [5, 7, 8, 9] (Table 2). The responses of all the experts are aggregated using fuzzy entropy method and defuzzification is done to obtain a crisp value, which represents the severity of spread of COVID-19 (Srinivas et al., 2018). Hugin expert software (HUGIN 8.9) is used to develop the Bayesian model for the estimation of the severity of COVID-19 using defuzzified or crisp data values. Fig. S3 of the Supplementary material shows the network and calculations performed in Hugin expert software for the proposed model.

Table 2

Fuzzy membership ratings of the expert's linguistic viewpoint.

Linguistic variable	Fuzzy membership ratings
Poor/rural	[0, 0, 3, 5]
Good	[3, 4, 6, 7]
Very good/urban	[5, 7, 10, 10]
Negligible	[0, 0, 1, 2]
Low	[1, 2, 3, 4]
Moderate	[3, 4, 6, 7]
High	[5, 7, 8, 9]
Extreme	[8, 9, 10, 10]
Yes	[1, 2, 3, 4]
No	[3, 4, 5, 6]
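
As a rough illustration of how the trapezoidal ratings in Table 2 can be read, the sketch below evaluates the membership grade of a rating and reduces a trapezoid [a, b, c, d] to a single crisp number. The function names are hypothetical, and centroid defuzzification is used here only as a common stand-in: the study aggregates expert responses with the fuzzy entropy method of Srinivas et al. (2018), which is not reproduced in this sketch.

```python
from typing import Sequence


def trapezoid_membership(x: float, rating: Sequence[float]) -> float:
    """Membership grade of x for a trapezoidal fuzzy rating [a, b, c, d]."""
    a, b, c, d = rating
    if b <= x <= c:          # flat top of the trapezoid
        return 1.0
    if a < x < b:            # rising edge
        return (x - a) / (b - a)
    if c < x < d:            # falling edge
        return (d - x) / (d - c)
    return 0.0               # outside the support


def centroid(rating: Sequence[float], steps: int = 10_000) -> float:
    """Crisp value by centroid defuzzification (assumption: not the
    fuzzy entropy procedure used in the study)."""
    a, _, _, d = rating
    if a == d:               # degenerate rating such as None = [0, 0, 0, 0]
        return float(a)
    width = (d - a) / steps
    xs = [a + (i + 0.5) * width for i in range(steps)]   # midpoint rule
    grades = [trapezoid_membership(x, rating) for x in xs]
    return sum(x * g for x, g in zip(xs, grades)) / sum(grades)


GOOD = [3, 4, 6, 7]          # 'Good' / 'Moderate' rating from Table 2
HIGH = [5, 7, 8, 9]          # 'High' rating from Table 2

grade_good = trapezoid_membership(3.5, GOOD)   # halfway up the rising edge
crisp_good = centroid(GOOD)
crisp_high = centroid(HIGH)
```

Because the ‘Good’ trapezoid is symmetric, its centroid falls at the midpoint of its support, whereas the asymmetric ‘High’ trapezoid yields a centroid slightly below the centre of its flat top.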

Fig. 3

Fuzzy membership function corresponding to factors (e.g. Quarantine facilities) (left) and outputs (right) of COVID model.

The fuzzy Bayesian model identifies the targeted locations that are at increased risk of COVID-19 infection. Based on the COVID-19 spread severity computed by the model and the population demography, autosamplers or biosensors can be installed in the wastewater network at targeted and optimized locations. The practice of WBE at the selected sites would direct healthcare authorities to diagnose a focused group of individuals (communities). In this study, scenario analysis has also been performed to demonstrate the robustness of the model in addressing future pandemics. Fig. S4 of the Supplementary material summarizes the basic framework proposed for the effective handling of pandemics like COVID-19.

Results and discussion

The datasets described in the previous section have been used to develop the fuzzy based COVID-19 Bayesian model. The prior and conditional probabilities for the root, parent and child nodes are constructed. To demonstrate the construction of probabilities, the population demography (PDM), age-wise categorization (AWC), and comorbidities (COM) nodes from Fig. 1 are selected. PDM and AWC are root nodes and the parents of node COM. PDM has only two states, namely ‘male’ and ‘female’. The gender-wise distribution of the US population has been steady since 2013, with women accounting for about 51% of the total population. The prior probability for the root node can be assigned as P (PDM = Female) = 166.57 million/328.23 million = 0.51; consequently, P (PDM = Male) = 1 − 0.51 = 0.49. Similarly, for the AWC node, the total population of the US (328.23 million) can be divided into five states: [<18, 19–29, 30–49, 50–84, and >85] with 81.63, 45.13, 84.48, 110.38, and 6.61 million people in each category respectively. Hence, P (AWC = ‘<18’) = 81.63 million/328.23 million = 0.25. In this manner, the probabilities can be computed for all other states of the AWC node. The conditional probabilities for the node COM are estimated based on the various combinations of the states of the AWC and PDM nodes. For illustration, the total US male and female populations aged less than 18 years are 41.7 and 39.93 million respectively. A total of four deaths are recorded in males due to ‘COVID plus respiratory’ and ‘COVID plus kidney’ problems. Eqs. (4), (5) represent the corresponding probabilities. Similarly, the conditional probabilities for the other parent and child nodes, and the prior probabilities, can be calculated for all the relationships shown in Fig. 1.
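The prior probabilities quoted above follow directly from the population counts, as the short sketch below shows. The variable names are illustrative only; the population figures (in millions) are the ones given in the text, and no conditional probabilities for COM are computed here because the underlying equations are not reproduced.

```python
# Population figures (millions) as quoted in the text.
TOTAL_US_POPULATION = 328.23
FEMALE_POPULATION = 166.57

# Prior for the root node PDM (population demography).
p_female = FEMALE_POPULATION / TOTAL_US_POPULATION
p_male = 1.0 - p_female

# Prior for the root node AWC (age-wise categorization).
awc_population = {
    "<18": 81.63,
    "19-29": 45.13,
    "30-49": 84.48,
    "50-84": 110.38,
    ">85": 6.61,
}
awc_total = sum(awc_population.values())
awc_prior = {state: count / awc_total for state, count in awc_population.items()}

for state, prob in awc_prior.items():
    print(f"P(AWC = {state}) = {prob:.2f}")
```

The five age categories add up to the stated total population, so the AWC priors form a proper distribution before rounding.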
Further, the qualitative viewpoints of experts in the fields of wastewater monitoring and COVID spread modelling are obtained to assign the severity of COVID-19 spread to five different states: Extreme, High, Moderate, Low, and Negligible, for different combinations of the 11 factors. The qualitative responses are fuzzified using Table 2 and the membership functions given in Fig. 3. The fuzzy entropy methodology (Srinivas et al., 2018) is used to aggregate the uncertain (fuzzified) responses of multiple experts, which are finally defuzzified. A total of 1536 combinations are produced, and each is assigned a case severity. Finally, the probabilities, along with the defuzzified responses assigned by the group of experts, are entered into the HUGIN (version 8.9) computer program to analyse the COVID Bayesian network. The analysis results corresponding to the dataset of the different states of the US are as follows: Extreme = 1.18%; High = 29.9%; Moderate = 49.04%; Low = 19.79%; Negligible = 0.09%. The results provide key insights into the severity of the spread of COVID-19 in different states of the US. The advantage of the fuzzy based Bayesian model is its flexibility and applicability to any state, county or sub-region, provided the datasets of the factors mentioned in Table 1 are available. For example, if a county has ‘n’ sub-counties, the model will identify the sub-counties (targeted and optimized locations) where the COVID-19 spread is ‘extreme’, ‘high’, ‘moderate’, etc., along with the probabilities, while incorporating the uncertainties. WBE can then be applied to the wastewater networks of those targeted sub-counties by installing biosensors or autosamplers to identify biomarkers in human excreta. In this manner, severely infected populations and communities can be identified, and appropriate policies and regulations can be mandated to inhibit the further spread of the disease.
It would also guide healthcare officials to direct more attention and quality treatment to the affected communities in an effective manner.

Scenario analysis

In order to demonstrate the robustness of the proposed model and facilitate the application of WBE at various locations, the outputs of the model are calculated by varying the values of the 11 factors. Five different scenarios have been formed to demonstrate the applicability of the model at different locations across the world. These scenarios are described as follows:
Scenario 1: Region with high population density, with 80% of individuals above 50 years of age and a 50% literacy rate.
Scenario 2: Diabetes and Cardiovascular comorbidities account for 90% of the COVID infected individuals in a given region.
Scenario 3: Urban-dominated town with superior healthcare and quarantine facilities, strict law enforcement, good population density and international tourist arrivals.
Scenario 4: Rural town with poor healthcare and quarantine facilities, a 20% literacy rate and moderate mobilisation.
Scenario 5: An intermediate town with good law enforcement, quarantine facilities, and education, and moderate mobilisation.
The probabilities corresponding to the 11 factors (criteria) contributing to the spread of COVID-19 are modified for these scenarios and analysed using the proposed model (Fig. 4, Fig. 5, Fig. 6, Figs. S5–S10 of the Supplementary material). To replicate and represent the various situations that could exist across the world, ranging from ideal to worst, the criteria of scenarios 1 to 5 are modified; for example, in the 2nd scenario, the percentage of comorbidity cases corresponding to diabetes and cardiovascular problems is set to 90%, assigning a total of 10% of cases to all the other states in the COM node. In other words, most probable situations in the world would find some relevance among these five scenarios.
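
The adjustment described for scenario 2 can be sketched as a redistribution of probability mass over the states of the COM node. The baseline distribution below is hypothetical and chosen only for illustration, and splitting the 90% and the remaining 10% in proportion to the baseline shares is an assumption, since the text does not state how the mass is divided within each group.

```python
from typing import Dict, Set


def concentrate(baseline: Dict[str, float], focus: Set[str],
                focus_mass: float) -> Dict[str, float]:
    """Give `focus_mass` of the probability to the focus states and the
    remainder to all other states, keeping relative shares inside each group."""
    focus_total = sum(p for s, p in baseline.items() if s in focus)
    rest_total = sum(p for s, p in baseline.items() if s not in focus)
    adjusted = {}
    for state, p in baseline.items():
        if state in focus:
            adjusted[state] = focus_mass * p / focus_total
        else:
            adjusted[state] = (1.0 - focus_mass) * p / rest_total
    return adjusted


# Hypothetical baseline distribution over the COM states (not from the study).
com_baseline = {
    "Respiratory + Kidney": 0.10,
    "Obesity + Liver": 0.10,
    "Cardiovascular": 0.20,
    "Diabetes": 0.20,
    "Hypertension": 0.15,
    "None": 0.25,
}

com_scenario2 = concentrate(com_baseline, {"Diabetes", "Cardiovascular"}, 0.90)
```

The adjusted distribution still sums to one, so it can replace the baseline distribution of the COM node before the network is re-evaluated.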

Fig. 4

Graphical representation of prior probability or Age-wise Categorization parameter for all scenarios including US case study.

Fig. 5

Graphical representation of conditional probability for comorbidities parameter for all scenarios.

Fig. 6

Summary of COVID spread results pertaining to all scenarios.

Fig. 4 represents the changes made in the factor ‘Age-wise categorization (AWC)’, and Figs. S5–S6 of the Supplementary material represent the changes made in ‘Population Density (PDN)’ and ‘Demographic Categorization (DCR)’ under the five scenarios. Increasing the proportion of individuals older than 50 years has increased the severity of COVID-19 spread by 9% in the ‘extreme’ region and by about 20% in the ‘high’ region. This is in line with the observed statistics of COVID-19 for older adults. As reported by CDC (2020), individuals older than 65 years make up only 17% of the total US population. However, they account for 31% of the total COVID-19 infections, 45% of hospitalizations, 53% of ICU admissions, and 80% of deaths. This clearly suggests that individuals in the higher age groups are more prone to catching the infection and take considerable time to recover. Thus, they contribute strongly to the spread of COVID-19. Worldwide, the literacy rate has increased with the passage of time. According to Roser and Ortiz-Ospina (2016), only 14% of the world population is illiterate, so associating lower literacy with older age groups is justified. In addition, younger individuals are quick at learning and adapting to sudden changes. This factor also contributes towards increasing the severity of spread of COVID-19 in the first scenario. It is also observed that metropolitan or urban populations increase the spread of COVID-19 infection (Hamidi et al., 2020).
Hence, regions that correspond to ‘scenario 1’ and house a greater number of older adults would have a higher probability of COVID-19 spread. The installation of autosamplers or biosensors at strategic positions, such as domestic sewage and housing estate sewage outlets at the identified targeted locations, would help in implementing WBE. The advantage of WBE over PCR testing is the use of biomarkers. The inter-person variation in the magnitude of shedding of viral load containing biomarkers would facilitate the identification of clusters of older individuals (Petterson, 2020). Since older adults possess weaker immunity and comparatively weak bodily organs to fight COVID-19, the ability of WBE to quickly detect the infection would help healthcare agencies to provide appropriate treatment on time (Daughton, 2020a, 2020b). Patients with a history of diabetes, cardiovascular disease, hypertension and chronic kidney disease are also at high risk of contracting COVID-19 (Richardson et al., 2020; Yang et al., 2020). Among all the comorbidity cases of COVID-19 in the US, patients with cardiovascular and diabetes problems account for the largest share of deaths (Calgary, 2020). Thus, the number of cases with these two comorbidities has been increased to examine the effects of the most critical scenarios using the proposed model. Such scenario analysis is valuable because it can guide the controlling authorities to devise strategies for unforeseen circumstances by studying the repercussions of the most unfortunate or critical scenarios. Keeping the other criteria intact, ‘scenario 2’ results in an increase of COVID-19 spread from 29.9% to 60.14% under the ‘high’ state and of about 3% under the ‘extreme’ state (Fig. 5).
This gives state and local governments clear guidance to strengthen facilities such as the number of hospitals, ICUs, ventilators and healthcare workers, and to spread awareness of social distancing and proper sanitization in the targeted locations identified by the model. Nonetheless, the epidemiological surveillance system must become smarter and better in terms of its capacity to serve a higher number of individuals, the safety of healthcare workers, and coordination among different sectors. These requirements concerning surveillance systems are also mentioned in the US government plan to support the international COVID-19 response (U.S. Department of State, 2020). WBE based surveillance systems coupled with the COVID-19 Bayesian approach are competent in addressing these requirements. The autosamplers or biosensors in WBE can collect water samples automatically at scheduled intervals without personal intervention (Setford et al., 2018; Yu-wei et al., 2019). Although scenario 2 considers an extreme situation, the model is flexible and capable of assessing other cases concerning comorbidities in a stepwise manner, as given below:
1. Assess the probabilities for the states corresponding to the 11 parameters responsible for COVID-19 spread, including the five states of comorbidities (Respiratory + Kidney, Obesity + Liver, Cardiovascular, Diabetes, Hypertension). Here, comorbidities data corresponding to a particular region are used (50 states of the USA in this study).
2. Based on the opinion of experts (mainly the medical team of the desired region) obtained using fuzzy logic, assign a priority to each of these five states, and use the Bayesian model to obtain the probability of COVID-19 severity contributed by these five states. This is the general scenario.
3. Perform scenario analysis by modifying the five states (e.g. Diabetes and Cardiovascular comorbidities account for 90% of the COVID infected individuals as in scenario 2, or increase respiratory + kidney diseases to 80% and Obesity + liver diseases to 70%, etc.) to evaluate the response of the model for different cases.
4. Re-run the model to calculate the COVID-19 severity contributed by the new scenarios (cases).
Figs. S7–S10 of the Supplementary material represent the scenarios corresponding to healthcare facilities (HF), education (EDU), quarantine facilities (QF) and strict regulations and systems implementation (SSI). Paul et al. (2020) mention that rural and urban towns are found to have a comparable number of COVID cases, since both town types have their pros and cons in supporting the COVID spread, as discussed earlier. ‘Scenario 3’ has been formulated for an urban town type having superior quarantine facilities, healthcare systems, and strict enforcement of laws, along with high population density and high international tourist arrivals. The analysis resulted in a reduced spread of COVID-19, where the low state shows an increase of about 50% and the high state a reduction of 30%. It is important to note that these results hold even for a high population density (850 people/sq. km.), which is almost double New Jersey's population density, together with a high tourist arrival rate. Hence, if the counties in the US could provide superior quarantine and healthcare facilities, i.e., a hospital density as good as that of Australia (55.9 hospitals/million population), and could raise their level of law enforcement to be comparable to that of Switzerland (0.5 victims of intentional homicide/100,000 population, whereas the US has a rate of 5.4), then ‘scenario 3’ would result in a lower COVID-19 spread even if the population density were double that of New Jersey with similar tourist arrivals.
Similarly, for ‘scenarios 4 and 5’, a rural town with poor quarantine facilities, healthcare and literacy rate, and an intermediate or suburban town with facilities lying between those of ‘scenarios 3 and 4’, are considered. In brief, a rural town with poor facilities surpasses almost all other scenarios in the severity of spread of COVID-19. Thus, towns with a majority rural population and poor healthcare and other related facilities should be given maximum attention. On the other hand, towns with either intermediate or good facilities are comparatively safer. Fig. 6 summarizes the results pertaining to all five scenarios. The authors have tried to formulate the scenarios keeping in view the variations of the chosen factors (criteria) across different parts of the world, especially cities like New York (USA), New Delhi (India), London (UK), Beijing (China), Sydney (Australia), etc. Due to the flexibility and structural simplicity of the proposed model, it can be applied towards understanding the effects of any future pandemic in the world. Also, the parameters can be modified depending on the spatio-temporal characteristics of the pandemic. The COVID-19 spread score given by the model would help in identifying targeted potential locations where WBE can be used to identify biomarkers found in human excreta discharged into wastewater networks for the most vulnerable communities.
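
The stepwise comorbidity analysis described above (assess the state probabilities, modify them for a scenario, and re-run the model) can be illustrated with a toy two-node network in which severity depends only on the comorbidity state. This is a minimal sketch and not the HUGIN model of the study: the conditional probability table, the reduced set of COM states, and both prior distributions are hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple

SEVERITY = ("Extreme", "High", "Moderate", "Low", "Negligible")

# Hypothetical table P(severity | COM); each row sums to 1.
CPT: Dict[str, Tuple[float, ...]] = {
    "Respiratory + Kidney": (0.30, 0.40, 0.20, 0.08, 0.02),
    "Cardiovascular":       (0.10, 0.50, 0.30, 0.08, 0.02),
    "Diabetes":             (0.05, 0.45, 0.35, 0.12, 0.03),
    "Other":                (0.01, 0.20, 0.50, 0.25, 0.04),
}


def severity_marginal(com_prior: Dict[str, float]) -> Dict[str, float]:
    """P(severity) = sum over COM states of P(severity | COM) * P(COM)."""
    return {
        sev: sum(com_prior[com] * CPT[com][i] for com in com_prior)
        for i, sev in enumerate(SEVERITY)
    }


# Steps 1-2: a general (baseline) distribution over comorbidity states (hypothetical).
baseline = {"Respiratory + Kidney": 0.15, "Cardiovascular": 0.25,
            "Diabetes": 0.25, "Other": 0.35}

# Step 3: scenario with diabetes and cardiovascular cases raised to 90% in total.
scenario = {"Respiratory + Kidney": 0.05, "Cardiovascular": 0.45,
            "Diabetes": 0.45, "Other": 0.05}

# Step 4: re-run the model for both inputs and compare.
baseline_result = severity_marginal(baseline)
scenario_result = severity_marginal(scenario)
```

With these illustrative numbers, shifting mass towards the two comorbidities raises the probability of the ‘High’ state, which mirrors the direction of the change reported for scenario 2 but not its magnitude.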

Detection of biomarkers using biosensors

The increasing risk of COVID-19 outbreaks accentuates the need for swift and accurate techniques for early detection and control (Cui and Zhou, 2020). Being simple, flexible and cost-effective, biosensors are capable of high-sensitivity detection. The use of endogenous biomarkers in WBE has several advantages over PCR methods for identifying a targeted community that is severely affected by COVID-19. Apart from being specific to a particular disease, biomarkers are more readily and extensively available in the urine and faeces of infected individuals and incur lower analytical costs. Moreover, the use of enzyme-linked immunosorbent assay (ELISA) and Mass Spectrometry (MS) techniques for the quantification of biomarkers is accepted to be more accurate, precise and validated compared to PCR (Barceló, 2020; Polo et al., 2020). Also, the current use of RT-qPCR methods in WBE applications requires verification, since there are no fixed worldwide standards for the viral load content needed to declare a COVID-19 case positive. Such standardisation is also not easy because of the great uncertainty involved in quantifying the viral load in the faeces of different individuals. Thus, there is a need to incorporate a novel approach into WBE in addition to, or as a replacement for, PCR technology. The aforementioned practices are suitable and can provide a simple, swift and sensitive WBE application.

Applications of proposed model and its efficacy in vaccine distribution

The ability of WBE to detect COVID-19 before surveillance based on mass-scale diagnostic testing makes it a powerful tool to control transmission and plan medical facilities on a timely basis. The general practice is to install WBE based biosensors or autosamplers at places that have already experienced rapid transmission, such as schools, universities, congested public housing, airports, hospitals, shopping malls, prisons, warehouse facilities, maritime ships, naval vessels, etc. (Prichard et al., 2014). However, the application of WBE at these locations does not consider most of the critical factors discussed in this study. The proposed model provides a framework to apply WBE in a more focused manner, thereby saving cost, time and lives. For example, if there are 50 counties in a state and the data of all the factors are analysed using the proposed model, the targeted locations and communities where COVID-19 could severely affect health and transmit at a greater rate can be identified. Application of WBE to the wastewater networks of these locations would help in identifying the biomarkers discharged in the wastewater. Such focused application also guides governmental agencies to establish quarantine and health care facilities and to take necessary precautions, such as proper sanitization, in the targeted communities. In a similar manner, targeted locations and communities could be found for other outbreaks, given the availability of data on the relevant factors. Many research institutions, pharmaceutical companies, and non-governmental organizations across the globe have taken part in the race to end COVID-19. Presently, there are about 100 vaccine candidates in different trial stages, which would require approval from federal health authorities before they can be used in a given country. The vaccine candidates include USA's Moderna, USA's Pfizer, Russia's Sputnik, Oxford's AstraZeneca, and India's Covaxin.
Some of them have been approved by their respective federal governments, while others are in their third/final phase of trials and are expected to be launched between the end of December 2020 and mid-2021 (Craven, 2020; WHO, 2020). The model suggested in this study localises the clusters of severely infected populations, which is followed by the effective application of WBE technology. By identifying severely affected targeted communities, the model outcomes not only help in curbing the fast-spreading disease but will also be an invaluable support to vaccine distribution centres and organizations working in a targeted manner. It is a great challenge for governments and health agencies to devise a plan for the rational and effective distribution of the vaccine. The proposed model helps to prioritise the distribution of the vaccine, which will not only curtail the number of vaccinations required but would also curb the pandemic in a shorter period of time.

Limitations of the study

The proposed methodology for the detection of infected populations requires knowledge of the parameters responsible for disease transmission and of their interrelationships. The inability to always accurately predict all the complex interrelationships that may act simultaneously in a natural setting is a major limitation of the proposed approach. In addition, an adequate amount of data is required to understand the dynamics of disease severity and transmission, and procuring such data might be difficult in some locations for various reasons, such as lack of technical resources, language barriers, privacy protection and so on (Van Panhuis et al., 2014).

Conclusions

The fuzzy based COVID-19 Bayesian model proposed in this study would play a key role in identifying targeted communities to facilitate the development of healthcare facilities, WBE surveillance using biosensors, and the channelling of vaccine distribution. The model not only mathematically determines the probability of finding communities severely affected by COVID-19, but also deals with the uncertainty associated with the factor data and the imprecise and diverse opinions of the expert decision makers. The study establishes that detecting viral load in the biomarkers excreted in human urine and faeces using WBE could be a more promising approach to investigate the occurrence of COVID-19 in communities, especially at locations with limited clinical testing. The model outcomes promote the targeted implementation of biosensors or autosamplers in municipal wastewater to detect more possible cases of COVID-19, including asymptomatic ones and immunized patients who have recovered from COVID-19. Such data could provide crucial evidence for epidemiologists to estimate whether the COVID-19 outbreak would ultimately develop into a common flu in the near future. The proposed model is flexible and can be easily adopted for diseases that may emerge in the near future, provided their influencing factors and cause-effect relationships are properly understood. The study guides federal and state governments in developing a coordinated action plan along with the international community of WBE scientists. Initiatives such as using economical and portable technological devices to implement WBE at the targeted locations identified by the proposed model would be useful. Paper based sewage sensors and smartphone applications for the detection of SARS-CoV-2 could be used at the community level to issue early warnings of a pandemic.
The future scope of this study would be to differentiate the biomarkers present in faeces and liquid sewage, along with estimating their half-life (persistence) and their changing concentration with temperature in wastewater, to perform robust water analysis.

CRediT authorship contribution statement

Srinivas Rallapalli: Conceptualization, Methodology, Software, Writing – original draft, Investigation, Writing – review & editing, Visualization, Supervision. Shubham Aggarwal: Data curation, Writing – original draft, Investigation, Methodology. Ajit Pratap Singh: Writing – review & editing, Conceptualization, Visualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

35 in total

1. A Bayesian network for mammography.

Authors: E Burnside; D Rubin; R Shachter
Journal: Proc AMIA Symp Date: 2000

Review 2. Biosensors for wastewater-based epidemiology for monitoring public health.

Authors: Kang Mao; Hua Zhang; Yuwei Pan; Zhugen Yang
Journal: Water Res Date: 2020-12-25 Impact factor: 11.236

3. Sewage epidemiology and illicit drug research: the development of ethical research guidelines.

Authors: Jeremy Prichard; Wayne Hall; Pim de Voogt; Ettore Zuccato
Journal: Sci Total Environ Date: 2013-12-07 Impact factor: 7.963

4. An imprecise fuzzy risk approach for water quality management of a river system.

Authors: S Rehana; P P Mujumdar
Journal: J Environ Manage Date: 2009-08-12 Impact factor: 6.789

5. The Effects of Temperature and Relative Humidity on the Viability of the SARS Coronavirus.

Authors: K H Chan; J S Malik Peiris; S Y Lam; L L M Poon; K Y Yuen; W H Seto
Journal: Adv Virol Date: 2011-10-01

6. Influencing factors of COVID-19 spreading: a case study of Thailand.

Authors: Kraichat Tantrakarnapa; Bhophkrit Bhopdhornangkul; Kanchana Nakhaapakorn
Journal: Z Gesundh Wiss Date: 2020-06-18

7. First Case of 2019 Novel Coronavirus in the United States.

Authors: Michelle L Holshue; Chas DeBolt; Scott Lindquist; Kathy H Lofy; John Wiesman; Hollianne Bruce; Christopher Spitters; Keith Ericson; Sara Wilkerson; Ahmet Tural; George Diaz; Amanda Cohn; LeAnne Fox; Anita Patel; Susan I Gerber; Lindsay Kim; Suxiang Tong; Xiaoyan Lu; Steve Lindstrom; Mark A Pallansch; William C Weldon; Holly M Biggs; Timothy M Uyeki; Satish K Pillai
Journal: N Engl J Med Date: 2020-01-31 Impact factor: 91.245

8. Men and COVID-19: A Biopsychosocial Approach to Understanding Sex Differences in Mortality and Recommendations for Practice and Policy Interventions.

Authors: Derek M Griffith; Garima Sharma; Christopher S Holliday; Okechuku K Enyia; Matthew Valliere; Andrea R Semlow; Elizabeth C Stewart; Roger Scott Blumenthal
Journal: Prev Chronic Dis Date: 2020-07-16 Impact factor: 2.830

Review 9. Diagnostic methods and potential portable biosensors for coronavirus disease 2019.

Authors: Feiyun Cui; H Susan Zhou
Journal: Biosens Bioelectron Date: 2020-06-02 Impact factor: 10.618

10. Factors affecting COVID-19 infected and death rates inform lockdown-related policymaking.

Authors: Satyaki Roy; Preetam Ghosh
Journal: PLoS One Date: 2020-10-23 Impact factor: 3.240
