Literature DB >> 35813986

Application of Recursive Estimation to Heat Tracing for Groundwater/Surface-Water Exchange.

W Anderson McAliley^1,2, Frederick D Day-Lewis³, David Rey⁴, Martin A Briggs⁵, Allen M Shapiro⁶, Dale Werkema⁷.

Abstract

We present and demonstrate a recursive-estimation framework to infer groundwater/surface-water exchange based on temperature time series collected at different vertical depths below the sediment/water interface. We formulate the heat-transport problem as a state-space model (SSM), in which the spatial derivatives in the convection/conduction equation are approximated using finite differences. The SSM is calibrated to estimate time-varying specific discharge using the Extended Kalman Filter (EKF) and Extended Rauch-Tung-Striebel Smoother (ERTSS). Whereas the EKF is suited to real-time ("online") applications and uses only the past and current measurements for estimation (filtering), the ERTSS is intended for near-real time or batch-processing ("offline") applications and uses a window of data for batch estimation (smoothing). The two algorithms are demonstrated with synthetic and field-experimental data and are shown to be efficient and rapid for the estimation of time-varying flux over seasonal periods; further, the recursive approaches are effective in the presence of rapidly changing flux and (or) nonperiodic thermal boundary conditions, both of which are problematic for existing approaches to heat tracing of time-varying groundwater/surface-water exchange.

Entities: Chemical

Year: 2022 PMID： 35813986 PMCID： PMC9257602 DOI： 10.1029/2021wr030443

Source DB: PubMed Journal: Water Resour Res ISSN： 0043-1397 Impact factor: 6.159

Introduction

Heat has been used as a tracer to estimate groundwater/surface‐water exchange and hydraulic properties since pioneering work by Stallman in the 1960s (Anderson, 2005; Constantz, 2008; Stallman, 1965). The measurement technology (e.g., thermistors and thermocouples) is relatively inexpensive, long‐term field installations are possible, and data analysis is straightforward. The increasing interest in hyporheic, hypolentic, and hypopaludal processes over the last two decades has led to burgeoning applications of heat tracing below the sediment water interface (e.g., Anderson, 2005; Briggs et al., 2013; Constantz, 2008; Rau et al., 2014). Tools for data analysis are well established and some are available within the public domain (Gordon et al., 2012; Koch et al., 2016; Voytek et al., 2014). Data analysis commonly involves (a) calibration of numerical or analytical models of heat transport to infer specific discharge associated with groundwater/surface‐water exchange (Koch et al., 2016; Lapham, 1989; Voytek et al., 2014) or (b) analysis of the amplitude phase of signals extracted from temperature time series, which are then related to specific discharge through analytical models (Gordon et al., 2012; Schneidewind et al., 2016). We briefly review each of these two data analysis approaches as formulated within the existing tools 1DTempPro and VFLUX. The 1DTempPro software package (Koch et al., 2016; Voytek et al., 2014) was written in C# for the Windows platform and provides a framework to calibrate a numerical model of heat transport to temperature time series collected at different depths below the sediment/water interface; this code allows for consideration of heterogeneity of hydraulic and thermal properties, complex (non‐sinusoidal) boundary conditions, and abrupt changes in exchange rate. Such abrupt changes can result from extreme weather driving rapid stage fluctuations, dam control on stream flows, or any flashy system. Model calibration is performed manually or by nonlinear regression. Estimation of time‐varying discharge in 1DTempPro is cumbersome and commonly entails the independent analysis of manually windowed subsets (days to week) of long‐term seasonal data sets. 1DTempPro provides a graphical interface for the VS2DH model. The finite difference scheme used by VS2DH is described in the following section. The VFLUX MATLAB‐based software package (Gordon et al., 2012; Irvine et al., 2015) provides a framework to extract signal amplitude and phase from temperature time series collected at different depths using dynamic harmonic regression (DHR). In DHR, a Kalman Filter (KF) algorithm (Young et al., 1999, 2004) is used to identify the time‐varying amplitude and phase of a finite number of sinusoids that, superposed, optimize the fit to observed temperatures. Assuming an analytical model for heat transport, the extracted time‐varying amplitude decay and (or) phase versus depth are related to discharge. Algorithms have been developed to use amplitude or phase or both amplitude and phase (e.g., Irvine et al., 2015). Compared to 1DTempPro, VFLUX and related signal‐extraction methods (e.g., Hatch et al., 2006; Keery et al., 2007; Schneidewind et al., 2016) are better suited to the estimation of time‐varying discharge. Signal extraction, however, can prove problematic in the presence of episodic forcing (e.g., storm events) and during times of year when sinusoidal signals are naturally weak or when the vertical temperature gradient is small (Irvine et al., 2015; Lautz, 2012; Rau et al., 2015). In groundwater‐upwelling zones, signals are also strongly attenuated with depth and may not be identified with DHR below approximately 15 cm under most conditions using typical temperature sensors (Briggs et al., 2016). In practice, DHR is subject to leakage and may impede recovery of abrupt changes in flux (Glose et al., 2021; Koch et al., 2016), but multifrequency signal extraction, for example, van Kampen et al. (2022) can mitigate these limitations. Reliance on simple analytical models to relate amplitude and (or) phase to discharge precludes consideration of heterogeneity, artificial heating sources, and the dependence of physical properties on temperature. Here, we propose a recursive‐estimation framework (Figure 1) to identify time‐varying specific discharge associated with groundwater/surface‐water exchange. This approach uses: (a) a state‐space model (SSM) of heat transport and (b) Kalman‐based algorithms commonly used in control engineering (Brown & Hwang, 2012; Gelb, 1974; Simon, 2006) and time series analysis (Durbin & Koopman, 2012; Harvey, 1990). In the former context, Kalman‐based approaches recursively predict the future states of a dynamical system (e.g., a circuit model or kinematic model of a vehicle's motion) and their covariance using both a process model and measurements. In the latter context, the same mathematical procedure is used for a time series model (e.g., ARIMA or Box‐Jenkins) rather than a physics‐based process model. Our approach therefore can be viewed as a hybrid between calibration of rigorous process models (e.g., 1DTempPro) and digital signal processing (e.g., VFLUX), and subsequent analysis with analytical models. Indeed, our formulation offers the strengths of both: (a) computational efficiency for estimation of time‐varying discharge, discretized based on the measurement interval; (b) resolution of abrupt changes in discharge; (c) consideration of complicated time‐varying boundary conditions; and (d) quantitative assessment of estimation uncertainty offered by the posterior state covariance matrix in the Kalman‐based approach.

Figure 1

Schematic diagram illustrating heat tracing for groundwater/surface‐water exchange: (a) field‐experimental setup with temperature sensors indicated by colored circles; (b) observed temperatures used for model boundary conditions and calibration data; (c) the two‐step filtering procedure involving prediction and correction; and (d) the filter results, including estimated temperature and discharge states. The filter steps through data assimilating new measurements at time t to produce state estimates for time k.

Methods

Heat transport below the sediment/water interface is commonly described using the one‐dimensional (1D) convection/conduction partial differential equation (PDE) (after Constantz, 2008): where K is thermal conductivity of bulk sediment (W/(m °C)), C is volumetric heat capacity of water (J/(m3 °C)), C is the volumetric heat capacity of the bulk sediment (J/(m3 °C)), T is temperature (°C), t is time (s), z is vertical position (m), and q is specific fluid discharge (m/day). We note that Equation 1 neglects thermal dispersion per common practice although consideration of dispersion is straightforward. Analytical solutions to the PDE are available for idealized boundary conditions and homogeneous properties (e.g., Luce et al., 2017). We note also that the 1D assumption is commonly used and justified (e.g., Lautz, 2010; Reeves & Hatch, 2016) for heat tracing although such simplifications can, in general, introduce error in other contexts (Sinsbeck & Tartakovsky, 2015). Numerical approaches (e.g., finite‐difference or finite‐element) allow for consideration of complex boundary conditions and (or) heterogeneous properties. Although Equation 1 is written for homogenous and temperature‐independent properties, this is not a requirement for numerical models. For the heat tracing problem considered here, the model domain is within the sediment layer with boundary conditions assigned based on temperature measurements. In the following sections, we first formulate the process model for solution of Equation 1 and subsequently formulate filtering and smoothing algorithms to infer specific discharge.

Process Model in Continuous Time

To facilitate recursive estimation, we reformulate Equation 1 as an SSM. First, we discretize the PDE's spatial terms using a finite‐difference approximation consistent with that of the VS2DH model (Healy & Ronan, 1996), which is used by 1DTempPro. For the finite‐difference cell denoted by the subscript i: where Δz is the spatial discretization, which is assumed to be constant, T (t) is the temperature at the center of the finite‐difference cell i at time t, and the subscripts i + 1 and i − 1 denote adjacent finite‐difference cells. Although we have assumed homogeneity of physical properties in Equations 1 and 2, this assumption is not necessary, and heterogeneity could be considered in the SSM as it is in VS2DH. Rearranging Equation 2 and accounting for boundaries adjacent to cells 1 and n: where u 1(t) and u 2(t) are the temperatures at the top and bottom boundaries, respectively. Equations (3a), (3b), (3c) constitutes a system of n‐coupled first‐order ordinary differential equations. The coefficients in Equations (3a), (3b), (3c) are time‐varying because two terms contain q(t). To simplify the system, we approximate q(t) as constant over each measurement interval; this approximation is well justified, given sampling rates are commonly fast (on the order of minutes) compared to the expected changes in flux and thermal response (on the order of hours or days). We note that the time derivatives in Equation 1 remain derivatives in Equations 2 and (3a), (3b), (3c) and hence the dot notation. The use of finite‐difference discretization in space but not time is often referred to as the “method of lines” (Schiesser, 1991). We can now express Equations (3a), (3b), (3c) as a linear time‐varying SSM transition equation: where at time t, (t) is the n‐by‐n state‐transition matrix, containing the coefficients of the finite‐difference cell temperatures in Equations (3a), (3b), (3c); (t) is the n‐by‐2 input matrix, containing the coefficients of the two boundary temperatures in Equations 3b and 3c; and (t) is a 2‐by‐1 vector of system inputs that is the boundary temperatures. Consistent with Koch et al. (2016) and many other studies, thermal boundary conditions at the top (u 1(t)) and bottom (u 2(t)) of the model are assigned based on measured temperature time series at the shallowest and deepest measurement locations, respectively. The solution to Equation 4 to advance the system states from sample time t to t is where Δt is the sample interval. In Equation 5, the integration is performed assuming the temperatures at the boundaries (i.e., (τ)) vary linearly between measurements at time t and t (i.e., a first‐order hold). Our use of a finite‐difference spatial discretization to reduce the PDE to a system of ordinary differential equations and the subsequent use of the matrix exponential for temporal solution are rare but not unique. The procedure was first proposed in groundwater hydrology by Umari and Gorelick (1986) for problems in advective‐dispersive transport and groundwater flow. Here, the procedure allows us to formulate the process model as an SSM (Equation 4), thus facilitating a straightforward application of well‐established and powerful system‐identification tools for state estimation (Brown & Hwang, 2012; Gelb, 1974; Särkkä, 2013; Simon, 2006). We note that this discretization scheme results in a first‐order spatial difference for the advective term, which results in a convergence rate on the order of Δz. Such a differencing scheme would lead to stability issues for a spatial and temporal finite‐differencing approach; however, since we use the matrix exponential in place of temporal discretization, the solution is not subject to the same stability issues.

Recursive Estimation in Discrete Time

Applications of recursive estimation in hydrology are numerous (Sun et al., 2016) but rarely involve the type of real‐time (“online”) state‐estimation applications for which the methods were originally developed and intended (Kalman, 1960); rather, Kalman‐based approaches are more commonly used for offline (i.e., not real time) parameter‐estimation problems, in which the Kalman recursions sequentially assimilate previously acquired time series measurements. We refer the interested reader to Sun et al. (2016) for a review of Kalman‐based applications in hydrology and to Shapiro et al. (2022), Shapiro and Day‐Lewis (2021), Kang et al. (2018), Kitanidis (2015), Ng et al. (2014), and Li et al. (2015) for representative hydrologic applications. Whereas the heat‐transport process is simulated in continuous time, the estimation problem is solved in discrete time; that is the recursive estimation is performed at the times where measurements are acquired. This continuous/discrete scheme is common in the literature (e.g., Brown & Hwang, 2012) and provides for the accurate and stable process modeling and efficient assimilation of time series data. The state vector for the discrete‐time estimation problem, k, at measurement time t k is augmented to include specific discharge as a state: Converting the state transition equation (Equation 4) from continuous to discrete time, including the measurement equation, using the new augmented state vector, , and applying a first‐order hold for the input, yields where Φ is the discrete‐time transition matrix calculated at time k − 1; and are the discrete‐time input matrices calculated at time k‐1, discussed subsequently; is a 2‐by‐1 vector containing the temperatures measured at the top and bottom boundaries; is the process‐noise vector with each element corresponding to one of the augmented n + 1 states; is a m‐by‐1 vector of simulated observations; and is the measurement‐noise vector with each element corresponding to one of the augmented m measurements. For our problem, the state‐space measurement equation (Equation 7b) simply extracts or interpolates from the temperatures at measurement locations. For each observation, the corresponding row of comprises weights that linearly interpolate the temperature at the observation location using the closest two finite‐difference cell centers. Here, we assume that both the process and measurement noise are independent Gaussian white noise. The matrices , , and are found at each sample time assuming a first‐order hold for the input following Virtanen et al. (2020). The SSM is augmented a second time to include the inputs from times k − 1 and k, and the matrix exponential of the augmented system's transition matrix is calculated, producing: The relevant matrix blocks are extracted from the result of this operation, from which , , and are calculated. The last row of the linear system of Equation 7a describes the transition of q(t). The true transition of q(t) is governed by hydraulic forcing, which is unknown in practice; hence, a stochastic process must be assumed for the transition of this state. Here, q(t) is modeled as a random walk (RW), a parsimonious model commonly used for the variation of unobserved states. The RW model in discrete time, from time step k − 1 to k, is where w is zero‐mean Gaussian white‐noise. Identification of the variance is discussed in Appendices A and B and demonstrated in Supporting Information S1 (Sections 3, 4). The goal of filtering is to estimate a system's current states using information acquired up to and including the current time (Figure 2), that is, real‐time (online) estimation. Similar to the linear KF, the EKF assimilates each new measurement using a two‐step procedure, prediction, and correction, illustrated schematically in Figure 3. The filter is initialized with a prior state estimate () and prior covariance (). The prediction step (time update) projects the system states and their covariance from time t to t . The projection of the states (Equation 10) is performed based on Equation 7a, and the projection of the covariance (Equation 11) is based on a linearization involving the state Jacobian, , calculated at step using a central finite‐difference scheme and perturbation around the current state estimate: where is the process noise covariance, assumed here to be time‐invariant. The state Jacobian differs from the discrete‐time transition matrix because depends on discharge, which is included in the state vector. We further assume that the process noise for all states is independent zero‐mean Gaussian white noise and constant across time with one variance common to all temperature states and a different variance for the RW of the specific discharge, consistent with w (Equation 7a).

Figure 2

Figure 3

Schematic diagram showing the workflow for the two‐step Extended Kalman Filter (EKF) recursion and associated calculations. The Extended Rauch‐Tung‐Striebel Smoother workflow (not shown) includes forward and backward EKF passes over the measurements, using the results of the first pass to initialize the second.

Schematic diagram explaining (a) prediction, where estimation is performed at times beyond the interval over which data are available, [t 0, T]; (b) filtering, where estimation is performed at the time, T, of the current measurement; and (c) smoothing, where estimation is performed for times within the interval over which data are available (after Särkkä (2013)). Schematic diagram showing the workflow for the two‐step Extended Kalman Filter (EKF) recursion and associated calculations. The Extended Rauch‐Tung‐Striebel Smoother workflow (not shown) includes forward and backward EKF passes over the measurements, using the results of the first pass to initialize the second. The correction step (measurement update) assimilates new measurements at time t , based on a measurement Jacobian, , which, here, is time‐invariant because the predicted measurements depend linearly on the temperatures in the current state . The Kalman gain, (Equation 12), is used in the correction step to weight how the new measurements affect the state estimate (Equation 13) and covariance (Equation 14): where is the measurement noise covariance at time k. Here, we assume the measurement noise is time‐invariant independent zero‐mean Gaussian white noise; thus, is a diagonal matrix with elements being measurement error variances, consistent with v in Equation 7b. The EKF algorithm thus advances through the time series data, predicting and correcting at each sample time (Figure 3). Our EKF recursions follow well‐established procedures for the filter as described in both modern (Brown & Hwang, 2012; Särkkä, 2013; Simon, 2006) and classic texts (Gelb, 1974) on optimal state estimation. Whereas the goal of filtering is to estimate current states based on measurements up to the present time, smoothing uses all available information. The goal of smoothing is to estimate a system's state using information from the past and present with respect to the times being estimated, that is, en bloc (offline) estimation (Figure 2). Just as the linear Rauch‐Tung‐Striebel Smoother (RTSS) is the smoothing algorithm based on the linear KF, the ERTSS is the smoothing algorithm based on the EKF (Särkkä, 2013). Here, we consider fixed‐interval smoothing (FIS), which produces estimates over a fixed time interval that is identical to the interval over which measurements are available. The ERTSS FIS entails by first running an EKF forward through the time series measurements to obtain forward‐pass filter estimates. The state and covariance estimates for each step of the forward pass are then used in a second, backward pass through the measurements, producing the smoothed result. For a linear problem, the RTSS can be shown to produce optimal estimates (Rauch et al., 1965), but the same is not guaranteed for the ERTSS applied to nonlinear problems (Särkkä, 2013). Of course, optimality depends also on the appropriate selection of and , which control the reliance on the prediction relative to the reliance on measurements. See Appendices A and B for automated selection methods, including the Discrepancy Principle (Constable et al., 1987; Hansen, 1997), which is demonstrated in Section 3. Furthermore, the convergence to optimal estimates and the stability of the filter depend also on the selection of and . In this work, we focus on the estimation of specific discharge and assume that the values of thermal properties (K , C , and C ) are known. Compared to hydraulic conductivity, which varies by approximately six orders of magnitude in natural sediments, thermal properties are more tightly constrained (Stonestrom & Constantz, 2003). Furthermore, in situ direct measurement of the thermal properties of bed materials is possible using inexpensive handheld field probes (e.g., the Decagon KD‐2 or the Meter Group TEMPOS) as in, for example, Briggs et al. (2014). The limitations of our approach include: (a) the 1D assumption, (b) the assumption of constant discharge between sample times, (c) assumption of known thermal properties, and (d) the linearization and Gaussian assumption underlying the EKF and ERTSS. The first limitation is common to many approaches to heat tracing. The second limitation is mitigated by using modern measurement technology, which is capable of logging measurements at minute intervals. The third can be addressed by in situ measurement as noted above or assessed using the sensitivity analysis (see Section 3.1). The third and fourth could also be addressed in future extensions of this work involving stochastic Kalman‐based frameworks, such as the Ensemble Kalman Filter (EnKF) and Particle Filter (PF) (e.g., Särkkä, 2013; Simon, 2006); these filters (and associated smoothers) are better suited than the EKF to strongly nonlinear and non‐Gaussian problems.

Python Implementation

Three Jupyter notebooks, available from McAliley et al. (2022) and also reported in Supporting Information S1, set up and run analyses of synthetic and field‐experimental data (Section 3). The spatial discretization of the convection/conduction equation, construction of the SSM matrices, matrix‐exponential solution, and filtering and smoothing algorithms are implemented in a Python library, tempest1d, also available from McAliley et al. (2022); this library contains two classes that are needed to implement the filter. The first class, ModelProperties, is a convenience container to hold spatial discretization parameters and thermal properties. The second class, EKF, extends filterpy's ExtendedKalmanFilter class to form the Jacobian of the system dynamics function and hold problem‐specific parameters. Code dependencies include filterpy (Labbe, 2018) and, in turn, filterpy's dependencies (e.g., numpy and scipy). In addition, the Jupyter notebooks use matplotlib for graphics. The use of Jupyter notebooks in this work facilitates: (a) reproducibility of research, (b) application to other data sets, (c) extension by others, and (d) integration with other software and interfaces.

Examples

We demonstrate the EKF and the ERTSS in two examples. First, we consider a synthetic data set with step changes in specific discharge; this example provides an assessment of the algorithms' ability to resolve extremely abrupt changes in the specific discharge. The application to synthetic data allows for a straightforward comparison of the performance of the EKF, ERTSS, and the commonly used discharge‐estimation algorithms of VFLUX, for a problem where the true specific discharge is known. In the second example, we consider field‐experimental data from the Upper Neversink watershed in New York, representing a realistic application where the true physical parameters and discharge are unknown.

Example 1: Synthetic Problem

The synthetic scenario spans 20 days and features a vertical discharge with four stages of constant discharge, each stage lasting 5 days. The four values of discharge for the stages are 0, 1, 0, and −1 m/day as shown in Figure 4a. To generate synthetic data, we assume that the sediment/water interface has sinusoidal diurnal and weekly periodic components with amplitudes of 5°C and 3°C, respectively, and an upward linear trend of 0.3°C per day as shown in Figure 4b; this signal is used to prescribe the Dirichlet boundary condition for the top of the synthetic model. We place the bottom boundary at 5 m below the surface, where we assume a constant temperature of 10°C, giving the boundary condition for the base of the model. The values of hydraulic and thermal properties and model specifications are given in Table 1. The numerically computed temperatures of the topmost meter over time are shown in Figure 4c; from these temperatures, the synthetic data are taken at 0.06, 0.1, 0.2, 0.4, 0.7, and 1 m depths below the sediment/water interface at a 10‐min interval. The synthetic data were corrupted with zero mean Gaussian noise of standard deviation 0.0625°C, the resolution of the commonly used iButton temperature sensor (Figure 4d).

Figure 4

Table 1

Physical Properties and Model Specifications for Example 1

Physical property	Value
K _T	2 W/(m°C)
C _w	4.182 × 10⁶ J/(m³°C)
C _s	2 × 10⁶ J/(m³°C)

Synthetic 20‐day data with (a) known specific discharge; (b) time‐varying temperature boundary conditions to generate synthetic data; (c) simulated temperatures with dashed horizontal black lines showing the measurement locations; and (d) temperature time series measurements with additive Gaussian noise. Physical Properties and Model Specifications for Example 1 The EKF and ERTSS require a process covariance matrix, , as explained in Section 2. For this example, we assume a diagonal . Each diagonal entry represents the variance of the corresponding state. The last diagonal entry corresponds to the temporal variance of the discharge over Δt; the assumed value for this covariance controls the magnitude of temporal variation in the discharge estimate and thus strongly affects the performance of the filter. We tested values ranging across several orders of magnitude, choosing a value of 7 × 10−5 (m/day)2 (see Appendices A and B, Supporting Information S1). This choice corresponds to an expected standard deviation of 0.0086 m/day from the start of a 10‐min time interval to the end. The first n diagonal entries of the process covariance matrix correspond to the temperatures in each model cell; these values could be set to zero for the EKF, but such a choice results in numerical instability in the ERTSS. Therefore, the remaining diagonal entries were set to 10−4 (°C)2, a small value, reflecting the assumption that the heat transport model is highly certain. Both the EKF and ERTSS estimates of discharge are close to the true discharge (Figure 5); however, the ERTSS improves the estimates over the EKF estimates in three respects. First, the ERTSS estimates are smoother in time as each estimate is informed by more measurements and thus less affected by measurement noise. Second, it decreases the uncertainty associated with the estimates as indicated by the tighter confidence intervals. Third, it better localizes the step changes in time because it is using information from before and after the changes to inform all estimates. Figures 5c and 5e demonstrate this third effect. For both the EKF and ERTSS, the full response to the step changes is approximately 0.2 days (<5 hr).

Figure 5

(a) Estimated discharge for Example 1, obtained using the Extended Kalman Filter (EKF) and Extended Rauch‐Tung‐Striebel Smoother (ERTSS), with 95% confidence intervals shaded; (b) measured temperatures at the middle four thermistors, plotted as circles, and temperatures predicted by the EKF with ERTSS; (c) estimated discharge during the first step change; (d) RMS error in temperature at each time step; (e) estimated discharge during the last step change; and (f) Comparison of the ERTSS and VFLUX estimates. As a basis for comparison, we analyzed the synthetic data using VFLUX (version 2) (Figure 5e). The “Hatch Amplitude” analytical model of VFLUX was applied using the known synthetic thermal parameters and the 0.1 and 0.2 m depth pair. VFLUX does not resolve the step changes as well as either the ERTSS or EKF. The VFLUX DHR algorithm requires approximately 2 days to fully respond to step changes. Furthermore, the quality of discharge estimates is strongly degraded during the period of strong upwelling from days 15–20 likely due to the highly attenuated diurnal signal at 0.2 m depth during this time, highlighting the limitations of the signal‐based analytical models in zones of groundwater discharge. We also investigated the effect of using an incorrect value for the thermal conductivity. We perturbed the true thermal conductivity by 10% in either direction. The RMSE of the discharge estimates increased from 0.052 m/day to 0.065 and 0.064 m/day because of using a thermal conductivity value that was 10% too low and 10% too high, respectively. The mean of the change in estimated discharge over the change in thermal conductivity was −0.0022 m/day per W/m°C; thus, the discharge estimates are not strongly sensitive to reasonable uncertainty in thermal conductivity in this example.

Example 2: Field‐Experimental Data From the Upper Neversink Subwatershed

To demonstrate the applicability of our approach to field data, we use data from the Upper Neversink watershed, which falls within the Delaware River Basin, one of the U.S. Geological Survey's Integrated Water Science basins. The Neversink is an area of ongoing intensive hydrologic monitoring, including new infrastructure for networked real‐time monitoring of vertical temperature profiles in streambeds and exposed groundwater seeps. We present the data available from a vertical temperature installation in a river‐bank seep at the hillslope break that typically contributes the observed (with thermal infrared) discharge to the channel below (Figure 6).

Figure 6

(a) Drone‐based ortho imagery shows the location of a shallow groundwater bank seep along the west branch of the Neversink River relative to a U.S. Geological Survey Next Generation Water Observing System gage; (b) a photograph of instrumented seep in fall with noted placement of the vertical temperature profiler next to a monitoring piezometer; and (c) an infrared image of the seep in winter showing sustained discharge of relatively warm groundwater to the land surface. Temperatures were recorded using a Campbell Scientific CS230 temperature profiler with thermistors located at 0, 0.05, 0.1, 0.3, and 0.5 m below the sediment/water interface (Figure 7a). We neglect data from the thermistor at 0 m as data from the sediment/water interface are commonly noisy and may prove problematic if stage drops, and the thermistor is exposed to air. We use the temperature time series from the thermistors at 0.05 and 0.3 m to define Dirichlet boundary conditions for the SSM. Measurements from the thermistor at 0.1 m are used for calibration. Data from the thermistor at 0.5 m are not used because the depth to bedrock is less than 1 m, resulting in increasing lateral heat transport with increasing depth. Furthermore, the VFLUX analysis used only data from the thermistors at 0.05 and 0.1 m due to extended times of highly attenuated diurnal signals at deeper sensors. The EKF requires a minimum of three thermistors, so using the thermistors at 0.05 m, 0.1 m, and 0.3 m allows the closest comparison of the estimates by both methods.

Figure 7

(a) Temperature data set from Upper Neversink Watershed; (b) vertical discharge estimated using Extended Kalman Filter (EKF) and Extended Rauch‐Tung‐Striebel Smoother (ERTSS), compared with discharge estimates obtained by applying VFLUX to temperatures at 0.05 and 0.1 m; (c) measured temperatures at 0.1 m compared to temperatures estimated using ERTSS; and (d) residuals between simulated and observed data based on the ERTSS results. Spurious VFLUX estimates in December 2020 coincide with a period of multiple days of near‐constant temperatures. The covariance matrices for the EKF were formed as follows. First, we form the measurement noise covariance matrix, which expresses the estimated measurement error. The manufacturer‐specified resolution of the thermistors is 0.0078°C. However, based on experience with the instrument, we estimate that the standard deviation due to instrument error is 0.03°C. Furthermore, the temperature data as available online from USGS are rounded to the nearest 0.1°C. Assuming that rounding errors are uniformly distributed between −0.05°C and 0.05°C, the standard deviation of the rounding error is . The total measurement noise standard deviation is the square root of the sum of the squares of these two values, 0.042°C. The square of this value is placed along the diagonal of the measurement covariance matrix, and the off‐diagonal terms are set to zero. We use a diagonal process covariance matrix, just as in the synthetic example. A value of 0.00086 m/day was chosen for the discharge standard deviation using the discrepancy principle (see Appendices A and B, Supporting Information S1, and Section 4). We again set the remaining diagonal entries of the process covariance matrix to the value of 10−4 (°C)2 to ensure stability for the ERTSS. This value corresponds to a process standard deviation of 0.01°C, which is significantly less than the measurement standard deviation of 0.042°C. Keeping this value below the measurement standard deviation ensures that the state covariance is primarily determined by the measurement standard deviation, not the temperature process standard deviation, reflecting the assumption that the process model captures the governing physics well. The values of hydraulic and thermal properties and model specifications are given in Table 2 and were estimated based on field sediment observations. Results from the ERTSS are shown in Figure 7b. The ERTSS analysis took approximately 2 min on a laptop computer with a 2.9 GHz Quad‐Core Intel Core i7 processor. Note that the use of 1DTempPro would take at least an order of magnitude longer for these problems. As with the synthetic data test, the Hatch Amplitude model of VFLUX was run using the same thermal parameters as for ERTSS. VFLUX produces spurious estimates during a few days in December of 2020, during which temperatures stayed nearly constant. Because VFLUX relies on sinusoidal diurnal temperature variations, the algorithm performs poorly when there is no sinusoidal component to temperature fluctuations. Neither the EKF nor the ERTSS have this performance limitation. We also note that the SSM approach presented here uses temperature measurements to form a lower boundary condition, while VFLUX assumes a semi‐infinite half‐space. This difference might explain some of the discrepancy between VFLUX and ERTSS discharge estimates, especially during a rapid upward exchange. As shown in Supporting Information S1, the EKF and ERTSS are also robust in the presence of data gaps, which are a common issue for automated analyses of continuous data. Figure 7c shows that predicted temperatures track measured temperatures, and Figure 7d shows that residual errors mostly fall within 0.1°C.

Table 2

Physical Properties and Model Specifications for Analysis of the Field‐Experimental Data Set From the Upper Neversink Watershed in Example 2

Physical property	Value
K _T	1.68 W/(m°C)
C _w	4.182 × 10⁶ J/(m³°C)
C _s	2.829 × 10⁶ J/(m³°C)

Physical Properties and Model Specifications for Analysis of the Field‐Experimental Data Set From the Upper Neversink Watershed in Example 2

Discussion

Based on a synthetic example, the EKF/ERTSS offers improved resolution of rapid changes in flux compared to the commonly used VFLUX tool (Figure 5f), while offering the major benefits of signal‐extraction methods, that is, computational efficiency for the estimation of time‐varying flux. The ERTSS algorithm, in particular, allows for accurate temporal localization of abrupt changes. An application to field‐experimental data indicates that the EKF/ERTSS may be less subject to problems arising from strong upwelling, where the diurnal signal exploited by DHR is weak or absent, and VFLUX produces spurious results (Figure 7b). The focus on the diurnal signal can also be problematic for DHR approaches in overcast conditions. The choice of process variance for the discharge has a strong effect on the performance of the filter. A large discharge variance leads to discharge estimates that are governed by temperature data fit. Discharge variance that is too large can lead to estimates that overfit the temperature data and vary unrealistically from downwelling to upwelling and back daily. A large discharge variance also results in unrealistically large uncertainties in the estimates. A small discharge variance, meanwhile, yields overly smooth discharge estimates with overconfident estimates of uncertainty and a poor data fit. We use the discrepancy principle described in Appendix A (see also Supporting Information S1, Sections 3, 4) to choose the optimal process variance. By this approach, the expected measurement RMSE is estimated based on the measurement noise. Then, the optimal process variance is the value for which the resulting RMSE equals the expected. Furthermore, the variance and character of the discharge over time should be considered. For instance, in the field data example, we ensure that the discharge estimates are physically realistic not alternating between positive and negative discharge (downwelling and upwelling) on a daily basis. The process and noise variance, as well as the initial prior covariance, also affect the stability of the EKF (and thus ERTSS) and its convergence. We find that positive diagonal entries in the process covariance matrix are necessary for ERTSS stability. Also, if the initial prior covariance is too small (i.e., excessive confidence in the initial state estimates is assumed), the filter measurement updates will not be large enough, and the filter will not converge. Conversely, stability issues can result if the prior covariance is too large. However, we find that when the filter converges stably, the ERTSS estimates are not significantly affected by the choice of initial prior covariance, and only early EKF estimates are affected. The ERTSS analysis of the synthetic and field data required on the order of 20 and 120 s of computer processing time, respectively, using a 2.9 GHz Quad‐Core Intel Core i7 processor in a laptop computer, underscoring the computational efficiency of the EKF/ERTSS approach compared to, for example, use of 1DTempPro. The ERTSS requires approximately twice the computer processing time of the EKF and thus is tractable for practical problems in vertical heat tracing for groundwater/surface‐water exchange.

Conclusions

We presented and demonstrated a new approach to estimate specific discharge associated with groundwater/surface‐water exchange using temperature data collected below the sediment/water interface. Our recursive‐estimation framework combines the strengths of previous approaches based on either signal extraction or model calibration: (a) computational efficiency for estimation of time‐varying discharge, (b) resolution of abrupt changes in discharge, (c) consideration of arbitrary time‐varying Dirichlet boundary conditions, and (d) quantitative assessment of estimation uncertainty. We also note that the new framework is effectively a multifrequency approach; it obviates the decision to analyze amplitude or phase or both and it requires no windowing procedure. The optimal filtering approach is well suited to online, real‐time data assimilation and estimation; furthermore, the Kalman‐based approach is also well suited to forecasting (Figure 2), which amounts to performing the prediction step without new measurements. Our SSM and the assumed model for the transition of q (Equation 9) could be modified to consider the inputs as states and to forecast inputs and discharge both. Alternatives to the RW model (e.g., structural time series, Gauss‐Markov, autoregressive, or integrated RW processes) could allow for prediction of temperature states and discharge for management problems related to thermal loading to surface water, for example, maintenance of thermal refugia for temperature‐sensitive aquatic species. Real‐time assimilation and data reduction problems in other fields are driving intensive development and investment in libraries for machine learning, which include tools for recursive estimation, for example, Tensorflow (Abadi et al., 2016) and Pyro (Bingham et al., 2019). There is enormous potential for application of these tools in data‐driven hydrology (Shapiro & Day‐Lewis, 2022) to address real‐time water management and capitalize on real‐time hydrologic observation networks. Future extensions to our approach and the associated Python library could include further investigation into methods for automated selection of the process covariance to identify the appropriate trade‐off between model complexity and data misfit. For example, the maximum‐likelihood method has been used to infer hyperparameters for process noise based on the analysis of measurement residuals (Young, 2011, p. 81). Another direction for future work is consideration of alternative filtering and smoothing methods based on, for example, the Unscented Kalman Filter (UKF), EnKF, or PF (e.g., Särkkä, 2013). Although the heat tracing problem is clearly amenable to solution using the simpler EKF as presented in this paper, the UKF could have advantages for strongly nonlinear examples, and the more computationally intensive EnKF and (or) PF could allow for simultaneous estimation of thermal and (or) hydraulic properties and stochastic assessment of parameter uncertainty. A further possible extension is the consideration of Fixed Lag Smoothing (FLS) instead of FIS. In FLS, the estimation is focused on a time prior to the most recent available measurement, that is, the estimation is delayed with respect to the measurements by a fixed lag. The FLS thus has some of the advantages of FIS, that is, improved temporal resolution of changes in states, reduced estimation uncertainty, and reduced estimation errors. Although the potential improvements with FLS come at the cost of real‐time estimation, the cost may be worth a short delay (a few measurement intervals) for many problems. Aside from improvements to the estimation framework, the process representation in our approach could easily be extended to address artificial heating. In the presence of strong upwelling, the diurnal signal that is commonly exploited in vertical heat tracing is weak or nonexistent. Active heating, for example, using electrical resistive heating elements, can generate an artificial signal to overcome this limitation of heat tracing (e.g., Briggs et al., 2016). As formulated here, our approach and associated codes can address non‐sinusoidal boundary conditions, but accurate representation of common heat sources would require an axis‐symmetric two‐dimensional or else a three‐dimensional finite‐difference grid; this amounts to a straightforward extension. Supporting Information S1 Click here for additional data file.

8 in total

Review 1. Heat as a ground water tracer.

Authors: Mary P Anderson
Journal: Ground Water Date: 2005 Nov-Dec Impact factor: 2.671

2. Understanding water column and streambed thermal refugia for endangered mussels in the Delaware River.

Authors: Martin A Briggs; Emily B Voytek; Frederick D Day-Lewis; Donald O Rosenberry; John W Lane
Journal: Environ Sci Technol Date: 2013-09-25 Impact factor: 9.028

3. 1DTempPro: analyzing temperature profiles for groundwater/surface-water exchange.

Authors: Emily B Voytek; Anja Drenkelfuss; Frederick D Day-Lewis; Richard Healy; John W Lane; Dale Werkema
Journal: Ground Water Date: 2013-04-02 Impact factor: 2.671

4. 1DTempPro V2: New Features for Inferring Groundwater/Surface-Water Exchange.

Authors: Franklin W Koch; Emily B Voytek; Frederick D Day-Lewis; Richard Healy; Martin A Briggs; John W Lane; Dale Werkema
Journal: Ground Water Date: 2015-09-15 Impact factor: 2.671

5. Incorporating Snowmelt into Daily Estimates of Recharge Using a State-Space Model of Infiltration.

Authors: Allen M Shapiro; Frederick D Day-Lewis; William M Kappel; John H Williams
Journal: Ground Water Date: 2022-05-07 Impact factor: 2.671

6. Reframing groundwater hydrology as a data-driven science.

Authors: Allen M Shapiro; Frederick D Day-Lewis
Journal: Ground Water Date: 2022-03-23 Impact factor: 2.887

7. Application of Recursive Estimation to Heat Tracing for Groundwater/Surface-Water Exchange.

Authors: W Anderson McAliley; Frederick D Day-Lewis; David Rey; Martin A Briggs; Allen M Shapiro; Dale Werkema
Journal: Water Resour Res Date: 2022-06-02 Impact factor: 6.159

Review 8. SciPy 1.0: fundamental algorithms for scientific computing in Python.

Authors: Pauli Virtanen; Ralf Gommers; Travis E Oliphant; Matt Haberland; Tyler Reddy; David Cournapeau; Evgeni Burovski; Pearu Peterson; Warren Weckesser; Jonathan Bright; Stéfan J van der Walt; Matthew Brett; Joshua Wilson; K Jarrod Millman; Nikolay Mayorov; Andrew R J Nelson; Eric Jones; Robert Kern; Eric Larson; C J Carey; İlhan Polat; Yu Feng; Eric W Moore; Jake VanderPlas; Denis Laxalde; Josef Perktold; Robert Cimrman; Ian Henriksen; E A Quintero; Charles R Harris; Anne M Archibald; Antônio H Ribeiro; Fabian Pedregosa; Paul van Mulbregt
Journal: Nat Methods Date: 2020-02-03 Impact factor: 28.547

8 in total

1 in total

1. Application of Recursive Estimation to Heat Tracing for Groundwater/Surface-Water Exchange.

Authors: W Anderson McAliley; Frederick D Day-Lewis; David Rey; Martin A Briggs; Allen M Shapiro; Dale Werkema
Journal: Water Resour Res Date: 2022-06-02 Impact factor: 6.159

1 in total