Future global Visible Shortwave Infrared Imaging Spectrometers, such as the Surface Biology and Geology (SBG) mission, will regularly cover the Earth's entire terrestrial land area. These missions need high fidelity atmospheric correction to produce consistent maps of terrestrial and aquatic ecosystem traits. However, estimation of surface reflectance and atmospheric state is computationally challenging, and the terabyte data volumes of global missions will exceed available processing capacity. This article describes how missions can overcome this bottleneck using the spatial continuity of atmospheric fields. Contemporary imaging spectrometers oversample atmospheric spatial variability, so it is not necessary to invert every pixel. Spatially sparse solutions can train local linear emulators that provide fast, exact inversions in their vicinity. We find that estimating the atmosphere at 200 m scales can outperform traditional atmospheric correction, improving speed by one to two orders of magnitude with no measurable penalty to accuracy. We validate performance with an airborne field campaign, showing reflectance accuracies with RMSE of 1.1% or better compared to ground measurements of diverse targets. These errors are statistically consistent with retrieval uncertainty budgets. Local emulators can close the efficiency gap and make rigorous model inversion algorithms feasible for global missions such as SBG.
Future global Visible Shortwave Infrared Imaging Spectrometers, such as the Surface Biology and Geology (SBG) mission, will regularly cover the Earth's entire terrestrial land area. These missions need high fidelity atmospheric correction to produce consistent maps of terrestrial and aquatic ecosystem traits. However, estimation of surface reflectance and atmospheric state is computationally challenging, and the terabyte data volumes of global missions will exceed available processing capacity. This article describes how missions can overcome this bottleneck using the spatial continuity of atmospheric fields. Contemporary imaging spectrometers oversample atmospheric spatial variability, so it is not necessary to invert every pixel. Spatially sparse solutions can train local linear emulators that provide fast, exact inversions in their vicinity. We find that estimating the atmosphere at 200 m scales can outperform traditional atmospheric correction, improving speed by one to two orders of magnitude with no measurable penalty to accuracy. We validate performance with an airborne field campaign, showing reflectance accuracies with RMSE of 1.1% or better compared to ground measurements of diverse targets. These errors are statistically consistent with retrieval uncertainty budgets. Local emulators can close the efficiency gap and make rigorous model inversion algorithms feasible for global missions such as SBG.
A new generation of remote Visible to ShortWave InfraRed (VSWIR) imaging spectrometers is beginning to make global observations from orbit. These instruments, also known as hyperspectral imagers, measure the upwelling spectrum in contiguous channels from ∼400–2,500 nm for each pixel of an image. The VSWIR spectral range is sensitive to diverse Earth surface properties ranging from terrestrial and aquatic ecosystem function, to geologic composition, to the condition of snow and ice. Current and forthcoming instruments include the National Aeronautics and Space Administration's Earth surface Mineral dust source InvesTigation (EMIT; Green et al., 2020) and Surface Biology and Geology mission (SBG; Cawse‐Nicholson et al., 2021), along with similar European instruments such as DESIS (Alonso et al., 2019), PRISMA (Cogliati et al., 2021), EnMAP (Guanter et al., 2015), and CHIME (Nieke & Rast, 2018). Thanks to continual improvements in instrument capabilities, data volumes will soon exceed anything in the history of optical/infrared imaging. For example, the SBG mission aims to provide global biweekly coverage of Earth's terrestrial and coastal aquatic areas at 30 m spatial sampling. Even accounting for non‐sunlit areas and cloud cover, this equates to more than a hundred thousand spectra per second continuously over the mission's multi‐year lifetime. With finite resources for analysis, we must significantly improve the efficiency of ground processing to match this explosion in data volume.One of the most intensive processing steps is the joint estimation of spectral surface reflectance and atmospheric state. This process, colloquially termed “atmospheric correction,” is actually a complex inversion of an atmospheric Radiative Transfer Model (RTM). Current atmospheric correction methods use two steps, both of which are computationally challenging. First, analysts execute an RTM many times to fill a lookup table of atmospheric parameters, such as transmittance, that influence the radiance at the sensor. Next, each radiance spectrum is inverted independently to find the atmospheric parameters which best explain the measurement. Recent studies have made considerable progress in accelerating the first step. They have sped lookup table generation by two to three orders of magnitude with surrogate models such as neural networks (Brodrick et al., 2021; Bue et al., 2019). The radiative transfer models themselves can be accelerated with PCA representations (Liu et al., 2006). However, there has been little progress on the inversion step, which is far more computationally‐intensive. Efficiency gains in the inversion are required to meet the future needs of missions like SBG.This paper demonstrates that, independent of the inversion algorithm used, a one to two order of magnitude speedup is possible based on the simple observation that atmospheric state is locally smooth. VSWIR imaging spectrometers oversample atmospheric spatial variability, so it is not necessary to find a separate solution for every spectrum. Performing atmospheric inversions over larger spatial footprints can significantly reduce the number of computations required without an accuracy penalty. This manuscript presents the first formal treatment and validation of this process, providing a framework for assessing the benefits of spatial aggregation as a function of noise level and spatial resolution. We explore approaches for spatially interpolating atmospheric solutions that treat surface properties like reflectance and topography at the native resolution of the sensor. We then demonstrate feasibility using a field experiment with airborne data. The approach has been adopted by NASA's EMIT mission, where it will enable full‐physics atmosphere/surface inversions for a large fraction of Earth's surface using a single commodity computing cluster. Ultimately, these speed improvements make modern inversion algorithms tractable for SBG data volumes.
Theoretical Background
We represent the measurement process mathematically using a forward model
, a function which maps a vector of free surface and atmosphere parameters onto the spectral radiance at sensor . Radiative transfer theory provides the following parameterization (Vermote et al., 1997):
where is the path radiance, photons that scatter from atmospheric particles or molecules and never interact with the surface. Boldface symbols are vector‐valued spectral quantities. The symbol ◦ signifies an element‐wise (Hadamard) product in which each entry of the first factor is multiplied by its corresponding entry in the second factor to produce a vector‐valued result. The horizontal line signifies an element‐wise division. We ascribe each pixel a hemispherical‐directional reflectance of , and the background surrounding the pixel an average reflectance spectrum . The upward transmittance from the ground to the sensor is the sum of diffuse and direct components, written and respectively. The variable s represents the spherical sky albedo. The radiance due to downwelling illumination at the surface, written , decomposes via:Division by converts the extraterrestrial solar irradiance to an angular quantity. The total transmittance from the sun to ground is the sum of diffuse downwelling transmittance and direct downwelling transmittance . Here represents the solar zenith angle at the top of atmosphere, while is the angle of incidence onto the terrain facet, relative to the surface normal direction. This can be calculated by (Civco, 1989):
where is the slope angle, is the solar azimuth and is the terrain aspect angle. Alternatively, can be calculated as the dot product between the surface normal and the direction to the sun. If the terrain is flat, . The observation process generates a measured spectrum according to:
where is a zero mean random variable representing Gaussian measurement noise with covariance . This covariance can also be used to bookkeep other random discrepancies between the forward model and reality, such as calibration or atmospheric modeling errors that prevent the model from matching the measurement (Thompson et al., 2021).Surface reflectance estimation involves using the measured radiance spectrum y to estimate the state vector (Thompson, Guanter, et al., 2019). The designer has discretion about how to parameterize . For these experiments, we will follow the common convention of estimating the surface reflectance in each of channels, along with two atmospheric parameters: the column water vapor concentration and aerosol optical depth at 550 nm. These are the temporally variable quantities with the largest effect on the observed radiance spectrum (Thompson et al., 2018). All optical coefficients in Equation 1 are calculated with sRTMnet (Brodrick et al., 2021), a neural network emulator for the MODTRAN 6.0 RTM (Berk & Hawes, 2017). During the training, we configure MODTRAN to use a fine 0.1 cm−1 band model. This high spectral resolution is a good approximation for Equation 1 that describes monochromatic light. Note that the resulting inversion problem is ill‐posed since we aim to estimate parameters using only radiance channel measurements.Constraints on the reflectance spectrum can make the problem tractable by reducing the effective number of free parameters. Sequential methods make assumptions about the surface reflectance to measure the atmosphere, and then use this measured atmosphere to re‐estimate the surface terms. For example, assuming the reflectance is linear across absorption features enables a direct measurement of water vapor concentrations (Gao et al., 1993; Schläpfer et al., 1998). Similarly, terrain‐specific band ratios can be used to estimate atmospheric aerosols (Hsu et al., 2013). After estimating these atmospheric state variables, the remaining reflectance parameters can then be found via direct algebraic inversion. Such sequential solutions are fast, but the estimated reflectance is not generally consistent with the assumptions used in the atmospheric retrieval. Recently, full physics inversion strategies have been developed which attempt to solve all parameters of the physical observation model at once by applying some other background information to reduce the effective number of degrees of freedom. For example, one can constrain the retrieval using prior distributions over state vector parameters (Thompson et al., 2018). Since the inverse problem is nearly well‐posed, very loose prior covariances in specific spectral intervals ‐ such as atmospheric absorption features—are sufficient to make it invertible (Thompson et al., 2020). This leads naturally to a fully‐consistent solution based on Maximum A Posteriori (MAP) inference (Rodgers, 2000). The best maximizes the posterior probability of the state vector given the measurement, constrained by a Gaussian prior distribution with mean and covariance . One finds it by minimizing the cost function,
with a nonlinear optimization such as gradient descent. After finding a full physics solution, one can calculate a closed‐form estimate of posterior uncertainties by linearizing the forward model at the solution state. We define the Jacobian matrix as the matrix of partial derivatives of with respect to . The posterior uncertainty is (Rodgers, 2000):Other Bayesian inference algorithms go beyond MAP estimation to characterize the full posterior distribution over atmosphere and surface states. These include MCMC approaches like those described in Thompson, Babu, et al. (2019). Regardless of the precise inversion approach used, we will refer to these solutions as full physics algorithms because they attempt an exact, statistically optimal inversion of the physics‐based radiative transfer model.Full physics inversions combine principled physical modeling with statistical rigor and uncertainty propagation, making them the favored approach for spectroscopic sounders like OCO‐2 (O’Dell et al., 2012). However, such approaches are still rare for imaging spectrometers due to their large data volumes—in SBG's case, hundreds of thousands of spectra per second, on average, over the mission. Iterative solutions for Equation 5 are generally too slow to keep pace with such data rates. At the time of this writing, implementations require one to five core‐seconds per spectrum on modern hardware, and MCMC algorithms are far slower. Such speeds imply significantly larger computing costs than traditional missions. Once hardware and software optimizations are exhausted, how can iterative atmospheric correction be made tractable for a future mission like EMIT or SBG?Here we present a solution based on the simple observation that imaging spectrometers oversample the spatially‐smooth atmosphere. Orbital imaging spectrometer ground sampling is on the order of 30 m. In contrast, water vapor concentrations are effectively constant over 100 m scales, and have only 3%–5% variability over kilometer scales (Thompson et al., 2021). Outside anomalous conditions like wildfires (Brodrick et al., 2022), the spatial scales of aerosols are even larger, on the order of tens of kilometers for optical depth (Anderson et al., 2003). Consequently, for a local neighborhood of pixels at similar elevation, the atmosphere can be considered constant. All the terms in Equation 1 except
can be held fixed. Under these conditions we can replace expression 1 with the much simpler model F′(
), which is a function of surface reflectance alone,
for three vector‐valued free parameters , and . These parameters can be fit with ordinary least squares using any three reflectance/radiance pairs, producing local coefficients and . Following statistics convention, the hat symbol ˆ indicates that these values are estimated from data. The resulting linear forward model can be inverted directly to solve for each pixel in the neighborhood:If the terrain is locally flat, then , and the forward model can be further simplified:This approach forms the basis of the so‐called empirical line solutions (Smith & Milton, 1999), which fit and using two or more paired spectra of remote radiance and field reflectance data. The corresponding inverse relationship is:This insight can be used to speed up full physics atmospheric correction. The strategy, first proposed in the appendix of Thompson et al. (2020), performs a sparse set of full physics solutions within a small geographic area. The resulting radiance and reflectance pairs are used to fit a and b via ordinary least squares. This is tantamount to training a local linear emulator (Servera et al., 2021; Vicent et al., 2018) for the forward model which can be inverted algebraically. The model is only valid over short distances and flat terrain, where the atmospheric optical properties are uniform, so an acquisition might require thousands of local linear emulators to tile the scene. However, training and inverting these linear models can enable speedups of one to two orders of magnitude vis a vis inverting each pixel independently.There are many ways one could implement this basic strategy. In the experiments that follow, we will use a reference implementation from the open source ISOFIT codebase (ISOFIT, 2021). It first pre‐segments the image into small contiguous groups with a superpixel segmentation algorithm, and then performs a full‐physics inversion on the average spectrum of each group. Figure 1 shows the sequence of operations:
Figure 1
Spatial segmentation process (a) The initial image (b) The image is segmented into small regions (c) The average radiance spectrum of each region is inverted using the full physics retrieval algorithm (d) Local models are constructed using a neighborhood of radiance/reflectance pairs (e) Local models are applied to each pixel to predict reflectance at native resolution.
Calibrate the instrument data to produce a radiance image, then georectify and project it onto a rectilinear grid using nearest neighbor interpolation.Segment the radiance image into small regions. Our reference implementation uses the SLIC algorithm (Achanta et al., 2012) operating on the first five principal components of the radiance spectra, with segments that contain approximately 40 native pixels per superpixel.Perform a full physics inversion on the average spectrum from each superpixel, estimating the surface and atmospheric state of Equation 1 by minimizing Equation 5. In our reference implementation the surface is taken to be flat, meaning that in Equation 3. This step yields a paired reflectance and radiance spectrum for each superpixel.Form local emulators for each superpixel by using a collection of the nearest radiance/reflectance pairs to fit the and coefficients of Equation 9. Our reference implementation fits these coefficients using ordinary least squares with radiance/reflectance pairs from the 400 nearest superpixels, a number which can be changed to modify the spatial neighborhood of the model.Apply the associated emulator to the pixels inside each superpixel, predicting the reflectance at native resolution using Equation 10.Spatial segmentation process (a) The initial image (b) The image is segmented into small regions (c) The average radiance spectrum of each region is inverted using the full physics retrieval algorithm (d) Local models are constructed using a neighborhood of radiance/reflectance pairs (e) Local models are applied to each pixel to predict reflectance at native resolution.This procedure works well because the terrain in the field experiments is mostly flat. In very rough terrain the full physics inversion takes local slope into account by adjusting in Equation 2. In this case the three‐parameter emulator of Equations 7 and 8 would be used.Uncertainty quantification for the linear emulator is different than the full physics retrieval. There are several kinds of error to consider in Equation 10. First, the radiance measurement may contain both systematic errors such as calibration drift and random errors such as instrument noise, which are all captured by the measurement noise covariance . Second, the and coefficients might accrue errors from the component radiances and reflectances used to estimate them. We represent these errors using covariance matrices and , respectively. We then linearize the forward model via first order Taylor expansion to approximate the resulting error in the estimated reflectance (JCGM, 2008). With the reasonable assumption that the underlying value of is fixed, the reflectance estimation error is zero mean with the covariance:We estimate and with the bootstrap method. Specifically, we fit and many times by sampling random sets of the training datapoints used to construct the emulator, sampling with replacement from the original training set. We then calculate the sample covariance of inferred coefficients. This provides all the necessary terms to evaluate Equation 11. The resulting covariance accounts for the random noise in the target radiance, the measurement noise in the coefficient training sets, and finite sample sizes used to estimate those coefficients. Note that the posterior uncertainty in surface reflectance from Equation 11 is strictly larger than the pixelwise posterior in Equation 6. This is the case because the term in Equation 11 is tantamount to the form in Equation 6, selecting the submatrix of surface reflectance state vector elements. In the full physics algorithm, this uncertainty can only shrink due to information provided by a prior. In the emulator case, the uncertainty can only grow due to the additional errors in estimating and .This raises the question: how significant are coefficient uncertainties for actual imaging spectroscopy retrievals? At the time of this writing, there is substantial anecdotal evidence that linear emulator approaches perform well. However, they rely on the continuity of local atmospheric fields, an assumption that has not been rigorously tested with ground truth data. If the atmosphere within a scene were always perfectly uniform, a single atmospheric solution would be sufficient. If the atmosphere were spatially uncorrelated, a new solution would be required at every pixel. Reality lies somewhere between; as the neighborhood of analysis increases, oversmoothing introduces additional uncertainty in and , and errors begin to accrue in the estimated surface reflectance. Thus, the local smoothness of atmospheric aerosols, water vapor, and elevation, and the tolerance for approximation error, determine computational requirements for the SBG mission atmospheric correction.Previous studies have measured these lengthscales for different atmospheric constituents. One must interpret these results with care, since concentrations have a complex fractal structure with different gradient magnitudes at different distances. Moreover, lengthscales may vary across different atmospheric conditions, meaning previous studies do not definitively predict future conditions. However, they illustrate the range of possibilities and provide a starting point for performance expectations. Table 1 shows several such studies. The most prominently absorbing atmospheric trace gas is water vapor, which can vary by 3%–5% over kilometers in flat terrain (Thompson et al., 2021). Water vapor has many broad features across the VSWIR range, so small changes in concentration are detectable. Maps of water vapor with contemporary VSWIR instruments show variability over scales down to a few hundreds of meters, corresponding to changes of 0.1 g cm−2. Kilometer‐scale gradients have been attributed to evapotranspiration over vegetated areas (Ogunjemiyo et al., 2002). Even steeper gradients are found at rare isolated anthropogenic point sources (Thorpe et al., 2017). Surface elevation changes can also induce some gradients in the total observed absorption because water vapor is concentrated in the lower troposphere. Absorption by gases like CH4 and CO2 is more constant (Li et al., 2022; Lin et al., 2004). Their background variability does not generally exceed 1%, a difference that is virtually undetectable by VSWIR instruments. Again, larger enhancements are observed at rare point sources (Thorpe et al., 2017). Outside of trace gases, aerosol particles can also affect VSWIR measurements through absorption and scattering. Changes in background Aerosol Optical Depth (AOD) are gradual enough to be detectable only over tens of kilometers (Anderson et al., 2003). Much stronger spatial gradients are observed from rare, strong emitters like wildfires (Brodrick et al., 2022).
Table 1
Lengthscales of Different Atmospheric Parameters
Gradient
Reference
Detection limit
Background constituent
Precipitable H2O
0.1 g cm−2 km−1
(Thompson et al., 2021)
0.1–1 km
C2O column average
0.01 ppm km−1
(Lin et al., 2004)
>1 km
CH4 column average
<0.1 ppb km−1
(Li et al., 2022)
>1 km
AOD (550 nm)
>0.01 km−1
(Anderson et al., 2003)
>1 km
Enhancement source
Anthropogenic H2O
10,000 ppm m km−1
(Thorpe et al., 2017)
<0.01 km
H2O evapotranspiration
0.02 cm km−1
(Ogunjemiyo et al., 2002)
0.1–1 km
Anthropogenic C2O
>10,000 ppm m km−1
(Thorpe et al., 2017)
<0.01 km
Anthropogenic CH4
>10,000 ppm m km−1
(Thorpe et al., 2017)
<0.01 km
Wildfire, AOD at 550 nm
>1 km−1
(Brodrick et al., 2022)
<0.1 km
Note. Units of absorption for point source emitters are in concentration lengths of ppm m. All values are rough orders of magnitude, and assume flat terrain. Detection limits show the minimum distance at which the gradient is detectable in a single pixel by an instrument with sensitivity similar to AVIRIS‐NG. The top table shows expected background variability, while the bottom shows the stronger gradients occasionally observed around rare sources.
Lengthscales of Different Atmospheric ParametersNote. Units of absorption for point source emitters are in concentration lengths of ppm m. All values are rough orders of magnitude, and assume flat terrain. Detection limits show the minimum distance at which the gradient is detectable in a single pixel by an instrument with sensitivity similar to AVIRIS‐NG. The top table shows expected background variability, while the bottom shows the stronger gradients occasionally observed around rare sources.Our existing knowledge of atmospheric lengthscales is thus a complicated story. But if one ignores rare localized sources that should not impact operational atmospheric correction, then a consistent picture emerges. Most constituents can be considered effectively constant over scales of multiple kilometers in flat terrain. Water vapor appears most likely to set the minimum neighborhood for VSWIR atmospheric correction, but it is still highly overdetermined for orbital instruments with 30–60 m ground sampling. We hypothesize that, for the purposes of surface reflectance retrieval, there is a neighborhood within 100 m and 1 km in which the atmosphere can be considered effectively constant. The next section describes an approach to test this hypothesis with field data. Section 4 shows experimental results. Section 5 then discusses the implications and describes how the result might be extended to scenes with mountainous terrain.
Experimental Approach
Here we describe a series of field experiments to validate the proposed retrieval approach. These experiments compare both single‐pixel and emulator solutions against each other and with field reflectance data for a pair of coastal imaging spectrometer overflights. We characterize the tradeoff of speed versus accuracy, charting the approximation error as a function of the training neighborhood size. Our experiments use data from NASA's Airborne Visible Infrared Imaging Spectrometer (AVIRIS‐NG). AVIRIS‐NG is an airborne imaging spectrometer that measures the solar‐reflected regime from 380 to 2,500 nm at approximately 5 nm sampling. Here it acquired data over two distinct coastal environments and atmospheric conditions. The first was a section of the southern California coastline, the Salt Creek and Aliso beaches north of Dana Point, California (AVIRIS‐NG flightline ang20211012t181608), which it overflew at approximately 1,000 m above ground level to achieve spatial sampling of 1.0 m. The scene contained diverse content, including buildings, vegetated terrain, bare soil, sandy beach, and ocean. The second was a section of the Achafalaya river delta, Louisiana (ang20210402t195922), overflown at 4,700 m above ground level for a spatial resolution of 4.7 m. This scene was more spatially homogeneous, with low terrain and turbid water. Conditions were clear on both overflights, without significant cloud cover. The overflights each contained over 5 million spectra.After acquisition, we analyzed the data using established best practice methods. We first calibrated the instrument digital numbers to radiance units using the approach described in Chapman et al. (2019). We then adjusted the laboratory radiometric calibration using a vicarious reference surface in the Louisiana data set, separate from the ground truth locations we evaluated. This approach, detailed in Bruegge et al. (2021), compensated for minor discrepancies in instrument sampling and response levels too subtle to estimate under synthetic laboratory illumination. These radiometric adjustments were 1% or less in most channels. We next orthorectified the data based on a geometric camera model and data from an onboard Global Positioning System/Inertial Measurement Unit (GPS/IMU). We geolocated each pixel independently by tracing its field of view as a ray from the aircraft position to a Digital Elevation Model (DEM).Ground teams measured selected locations within both flightlines. These were uniform surfaces located as far as possible from other large scattering objects like trees and buildings. To reduce errors due to geographic misregistration, or multiple scattering from objects like trees and buildings we selected flat homogeneous surfaces. We minimized unmodeled BRDF effects by measuring these surfaces within 2 hours of the overflight. For the California overflight, we selected a synthetic basketball court surface and a well‐tended lawn. We measured an area of approximately 4 m in diameter using an ASD Fieldspec Pro field spectrometer. A standardized reflectance panel was placed nearby on a tripod and set orthogonal to the gravity vector with a bubble level. The surface reflectance was calculated as the ratio of field spectrometer DNs from the surface to the panel, further dividing out the known panel reflectance to determine the hemispherical directional reflectance of the surface. Images of the California location, and the corresponding positions within the flightline, appear in Figure 2.
Figure 2
Locations for the California validation site, a complex environment imaged at 1.0 m spatial sampling. Inset images show the ground perspective of each validation surface.
Locations for the California validation site, a complex environment imaged at 1.0 m spatial sampling. Inset images show the ground perspective of each validation surface.During the Louisiana overflight, a field team measured water‐leaving reflectance spectra at several stations using a boat‐deployed spectroradiometer manufactured by Spectral Evolution (PSR‐1100f). These measurements were combined with the estimated fraction of skylight reflected at the air‐sea interface to calculate the water‐leaving reflectance corrected for specular reflection of skylight at the air‐water interface (Mobley, 1999, 2015). The team made three near‐simultaneous measurements: the water surface, 40° from nadir and 135° from the sun azimuthal plane; the sky, 40° from zenith and 135° from the sun azimuthal plane; and a highly reflective reference panel. These specific angles avoided sunglint, while also minimizing variability of skylight reflectance at the air‐water interface to simplify its prediction from wind speed (Mobley, 1999, 2015). The procedure was repeated multiple times and the median of each set was used. The method of Jiang et al. (2020) was used for sunglint removal. Six stations were measured, but only station five had temporal alignment with the overflight, so we selected this location for the study. Images of the Louisiana scene from airborne and surface perspectives appear in Figure 3.
Figure 3
A segment of the Louisiana scene, a simpler environment imaged at 4.7 m spatial sampling.
A segment of the Louisiana scene, a simpler environment imaged at 4.7 m spatial sampling.We used two alternative approaches to estimate surface reflectance. Our first retrieval was a full physics solution with no potential for spatial interpolation error. We extracted a group of pixels at each measurement site and averaged them to form a single radiance spectrum. We then inverted this spectrum using the MAP estimation approach. We performed the inversion using the ISOFIT codebase (ISOFIT, 2021), with an uninformed prior over atmospheric state vector elements, and a conservatively loose prior over surfaces that constrained only water absorption windows as in Thompson et al. (2020). We calculated the instrument noise covariance S
as the sum of several different error sources: (a) the AVIRIS‐NG instrument model including constant dark and read noise terms, as well as the signal‐dependent photon shot noise in each channel; (b) a 1% radiometric calibration error in every channel, as in Thompson et al. (2018); and (c) a small systematic error term corresponding to discrepancies in the radiative transfer model, as in Thompson et al. (2018). We calculated error predictions using Equation 6, accounting for the root‐n reduction in random noise due to averaging multiple radiance spectra. We then compared the result to the in situ measurements using an additional uncertainty of 1% absolute calibration error for the in situ instrument, and 1% uncertainty in the spectralon BRDF. All uncertainties were added in quadrature, and the resulting posterior error predictions compared against the observed differences between measurements.We performed similar retrievals using the linear emulator. We segmented the entire radiance cube using the SLIC algorithm to create contiguous, homogeneous 40‐pixel fragments. Then, we inverted the average radiance spectrum of each fragment using the full physics solution. Finally, we calculated the reflectance at native resolution by fitting local linear emulators and then solving for each pixel's reflectance using Equation 10. Together, these measures reduced the number of integrations required to process the flightline by a factor of 40 vis a vis the pixelwise approach. Four hundred neighbor fragments trained each model, corresponding to square neighborhoods with widths of approximately 125 and 600 m for the California and Louisiana datasets, respectively. After calculating the reflectances for each pixel, we extracted and averaged the reflectances in the spatial footprint of each ground measurement site. This second reflectance might contain interpolation errors if the local emulators were inaccurate. To quantify this additional uncertainty, we estimated the covariances S
and S
using bootstrap estimation and computed calculated posterior errors for each spectrum using Equation 11, accounting for the root‐n noise reductions from averaging multiple reflectances. We compared the result to field data with the additional in situ instrument error terms described above. As before, our S
matrix included calibration, model discrepancy, and instrument noise terms.A final experiment aimed to quantify the tradeoff between accuracy and spatial scale. We varied the radius over which the linear model was trained, and calculated the differences in the final reflectance maps. We used neighborhood window diameters of 25, 50, 100, 200, and 400 base pixels. This translated to ground distances of 25–400 m in the California flightline, and 117–1,880 m in the Louisiana flightline. We calculated the root mean square differences across all locations and channels, and plotted the differences as a function of interpolation radius.
Experimental Results
Figure 4 shows the measured radiances at all three sites. The overall profile and magnitudes are similar to the solar irradiance, with water vapor features visible at 940 and 1,140 nm. The right column shows the a and b coefficients for all wavelengths at each of the three sites. These coefficients reveal the physical processes captured by the emulators. The black curves show b, which is the gain or change in radiance with respect to some unit change in reflectance. This appears, appropriately, as the reflected solar irradiance profile, attenuated by atmospheric gas absorption and particle scattering. The red spectrum is the a coefficient which is independent of surface reflectance. It represents path radiance due to particle scattering, that is, photons that do not interact with the surface. We observe some spectral structure in a coefficients near water absorption features at 940 and 1,140 nm. Additive path radiance from particle or molecular scattering should vary smoothly with wavelength, so this structure is probably caused by measurement error like detector nonlinearity, instrument stray light, or erroneous assumptions about aerosol optical properties. In any case this effect is quite small relative to the overall radiance. The lawn and basketball court sites have nearly identical coefficients, which is expected since these spectra were observed under similar elevation and atmospheric conditions. The station five spectrum is distinct, with a significantly higher additive term a indicating a greater proportion of the signal at sensor from atmospheric scattering. This is expected due to the hazier conditions and higher aircraft elevations of this overflight.
Figure 4
Measurement of the three sites. The rows show the lawn, basketball court, and station five (aquatic) site, respectively. The left column shows radiance at sensor. The right shows additive and multiplicative (a) and (b) coefficients by wavelength.
Measurement of the three sites. The rows show the lawn, basketball court, and station five (aquatic) site, respectively. The left column shows radiance at sensor. The right shows additive and multiplicative (a) and (b) coefficients by wavelength.Figure 5 compares the retrieval results. At all three sites, the full physics and emulator methods produce effectively identical reflectance estimates. Moreover, the remote and in situ measurements are similar, validating the accuracy of the remote inversion approach. The right column shows the discrepancy between the remote and in situ measurements of surface reflectance. These discrepancies lie within the 95% confidence interval of our predicted uncertainties. Possible sources of unmodeled error at the terrestrial sites include terrain and BRDF effects as well as calibration and atmospheric scattering uncertainty. Likely error sources at the aquatic site include sky glint, other surface effects, and atmospheric scattering.
Figure 5
Retrieved reflectances from the three sites. The rows show the lawn, basketball court, and station five (aquatic) site, respectively. The left column shows the estimated surface reflectance with in situ measurements in black, remote single‐pixel retrievals in blue, and remote spatially‐interpolated retrievals in red. At all three sites, blue and red overlay each other and are visually indistinguishable. The right column shows surface reflectance discrepancy comparing the remote and in situ measurements. Blue and red dotted lines show the 95% confidence interval where errors are expected to lie for the single‐pixel and spatially‐interpolated retrievals, respectively.
Retrieved reflectances from the three sites. The rows show the lawn, basketball court, and station five (aquatic) site, respectively. The left column shows the estimated surface reflectance with in situ measurements in black, remote single‐pixel retrievals in blue, and remote spatially‐interpolated retrievals in red. At all three sites, blue and red overlay each other and are visually indistinguishable. The right column shows surface reflectance discrepancy comparing the remote and in situ measurements. Blue and red dotted lines show the 95% confidence interval where errors are expected to lie for the single‐pixel and spatially‐interpolated retrievals, respectively.We use a chi‐square test to assess the statistical consistency of realized errors with the predicted uncertainties. Specifically, we posit a null hypothesis that the differences are drawn from the Gaussian distribution associated with our error budget. We treat errors in each channel as independent, conservatively double‐counting the correlated errors in adjacent channels. We compare the sum of squared errors to the appropriate chi‐square distribution with n−1 degrees of freedom. Table 2 shows the resulting p‐values to reject the null hypothesis for each site and retrieval approach. The values near unity signify that the observed discrepancies do not exceed the sum of squared errors expected by a random draw from the predicted error distribution. In other words, the uncertainty budgets fully explain the differences in our measurements. This might not be the case for environments that violated the forward modeling assumptions, such as topographically complex terrain with cast shadows or multiple scattering between terrain facets. Table 2 also shows the Root Mean Squared Error (RMSE) values between remote and in situ measurements, demonstrating that the different retrieval approaches provide similar accuracy at all field sites. The remote retrievals and in situ data match within approximately 1.1% in reflectance units.
Table 2
Discrepancies Between Remote Retrievals and in Situ Data
Score
Basketball court
Lawn
Station five
Number of pixels
3
12
16
Single pixel p‐value
>0.99
>0.99
>0.99
Emulator p‐value
>0.99
>0.99
>0.99
Single pixel versus in situ RMSE
0.012
0.010
0.002
Emulator versus in situ RMSE
0.011
0.010
0.002
Emulator versus Single pixel RMSE
0.0018
0.00086
0.0015
Note. The number of pixels used to generate the spectrum at each location is provided. RMSE, Root Mean Squared Error.
Discrepancies Between Remote Retrievals and in Situ DataNote. The number of pixels used to generate the spectrum at each location is provided. RMSE, Root Mean Squared Error.Bootstrap error analysis provides more insight into the emulator‐induced uncertainty. Figure 6 shows the bootstrap distribution of coefficients from the lawn spectrum. The left panel shows the estimated values of at a reference wavelength of 877 nm for 10,000 trials. The right panel shows a similar comparison for the coefficient. The distribution is nearly Gaussian and highly compact with less than 0.1% variance, signifying that the number of neighbors used in our tests was sufficient to estimate the gain for this wavelength with high accuracy.
Figure 6
Bootstrap distributions for a and b coefficients from the lawn site, for the 877 nm reference channel.
Bootstrap distributions for a and b coefficients from the lawn site, for the 877 nm reference channel.Finally, we assess how the reflectance solution changes across different neighborhood sizes. Figure 7 plots the root mean squared differences in the reflectance cubes for different spatial neighborhoods, compared to the reflectances obtained using a 25 pixel window size. As the neighborhood sizes grow, the differences in retrieved reflectance also increase. At one km scales, the changes induced by atmospheric smoothing are approximately 0.15% for the Louisiana scene, and 0.25% for the California scene. We find that the differences lie almost perfectly along a logarithmic curve, ν
1 + ν
2 log d for distance d in meters. The corresponding decay curves are −0.0011 + 0.00052 log d for the California data set, and −0.0011 + 0.00034 log d for the Louisiana data set. Using the California data set as a stressing case of a complex surface and atmosphere, one can calculate the neighborhood distance to use which bounds the expected error to a desired level. The relationship is:
Figure 7
Root mean squared difference in reflectance as a function of the window size for the linear emulator, compared with the smallest size (25 pixels). Remote and in situ discrepancies are shown by ’x’ symbols for the Basketball court (BC), Lawn, and Station five locations.
Root mean squared difference in reflectance as a function of the window size for the linear emulator, compared with the smallest size (25 pixels). Remote and in situ discrepancies are shown by ’x’ symbols for the Basketball court (BC), Lawn, and Station five locations.This estimate is probably conservative for flat terrain because the differences between neighborhood scales are not necessarily errors. Larger neighborhood sizes have the benefits of averaging over surface‐induced anomalies and measurement noise. On the other hand, variability could be greater along steep topographic gradients that induce rapid changes in the atmospheric column. We will discuss some compensatory strategies in the following section.
Discussion and Conclusions
The field experiments demonstrate that local linear emulators, which exploit the spatial smoothness of atmospheric fields, are an accurate, computationally efficient solution to atmospheric correction at scale. Their implementation is straightforward, and they can be used with any inversion algorithm. They are capable of seamlessly incorporating topography whenever the full physics inversion uses this information. The result is a one to two order of magnitude improvement in computational efficiency for atmosphere/surface inversions. In these airborne demonstrations, atmospheric correction accuracies align well with in situ measurements, with discrepancies that are consistent with first‐principles error budgets. For neighborhood sizes on the 200 m scale over flat terrain, local linear emulator solutions are effectively indistinguishable from the full physics solution. As neighborhood sizes grow beyond 1,000 m, differences become measurable, but are still an order of magnitude below other model uncertainties. These results are consistent with prior studies (Thompson et al., 2021) that show measurable differences in atmospheric water vapor on these scales.Our experiments aimed to quantify error induced by the linear emulators, presupposing that emulation would hurt accuracy relative to the pixelwise approach if it failed to capture small‐scale variability in atmospheric parameters. Indeed, some of our experiments were consistent with a very small amount of atmospheric oversmoothing—we could measure reflectance differences of approximately 0.2% from emulators trained with different neighborhood sizes. However, when ground reference data was introduced, we were unable to measure any degradation in performance for even fairly large (>500 m) neighborhoods. In all such comparisons, the emulator and full‐physics retrievals were identical to within a small fraction of the reflectance. Minor differences were just as likely to improve as reduce agreement with the in situ data, and all differences were at a level much smaller than the systematic differences between remote and in situ data. The surprisingly good performance led us to ask whether, even as it risks atmospheric oversmoothing, the emulator might actually be improving accuracy in other aspects. There are several ways that this could occur. First, the spectral averaging of multiple radiance pixels within each segment reduces instrument noise, enabling more accurate atmospheric estimation in the full physics retrievals that comprise the training data. For example, very high SNR inside dark features like water vapor absorptions might assist with aspects of atmospheric correction like the estimation of aerosol scattering. Second, the combination of multiple segments into each linear emulator should reduce the impact of outlier solutions or surfaces on the local result. This beneficial smoothing may actually improve the performance of atmospheric correction relative to pixelwise inversions in some scenarios.Incorporating emulator solutions into an atmospheric correction workflow is simple and straightforward, and it does not require any core changes to the inversion algorithm. Since the local emulators predict a directional reflectance quantity, any post‐corrections such as BRDF inference can be applied equivalently to either pixelwise or emulated products. These properties make it a good candidate for operational use. However, as the approach is applied to large areas, topography should also be considered in the design. Topography can create variability by changing the thickness of the atmospheric column. Changes in oxygen absorption could reveal differences in elevation as small as 100 m, and water vapor could be even more sensitive. Applying a local linear emulator across variable topography would violate its assumption of a locally‐constant atmosphere. Consequently, over variable terrain, differences in elevation should also be considered when defining the local “neighborhood” of a pixel. The EMIT mission will use a weighting in which vertical distance is weighted by a factor of 10 relative to horizontal distance for defining a spatial neighborhood. In other words, a 1 km spatial neighborhood would permit an elevation change of no greater than 100 m. The forthcoming deployment on the EMIT mission will provide much larger datasets spanning greater geographic scales to build experience with this strategy. Analysts seeking to implement this approach should also consider how systematic instrument errors could affect their results. Pushbroom imaging spectrometers often experience slight spatial nonuniformities, manifesting as changes in wavelength calibration or spectral response at different cross‐track positions (Richter et al., 2010). AVIRIS‐NG is unusually uniform among these instruments, with an average wavelength shift of less than 2% of the FWHM across the field of view (Chapman et al., 2019). Larger nonuniformities would mean that the atmospheric absorption lines observed by each channel would change at different locations in the image. The interpretation of transmittance, path radiance, and spherical albedo vectors would also change, violating assumptions of the emulator approach and impacting the model's performance. Naturally, such nonuniformities are also deleterious to pixelwise retrievals.In aggregate, the significant speed advantages of the linear emulator approach will likely overcome these minor disadvantages for the short term, making it a favorable design choice for EMIT and other airborne campaigns. In the longer term, geostatistical solutions may be an even better option for modeling smooth atmospheres (Thompson et al., 2018). Techniques like Gaussian process priors could enable probabilistic modeling of surface and atmosphere together, allowing the surface to vary independently while enforcing local atmospheric smoothness. Such models would provide a unified probabilistic account of the scene, grounding model parameters in the known physical lengthscales of atmospheric phenomena. Further techniques that leverage geospatial surface statistics may also prove useful in constraining the optimization algorithm, leading to faster processing times. The introduction of spatial regularization of the surface reflectance through natural image priors (Kim & Kwon, 2010), patch priors (Bouman et al., 2016; Zoran & Weiss, 2011), or sparsity priors (Candes et al., 2006; Donoho, 2006), relying on the surface's underlaying sparse geologic and vegetative attributes, may prove useful for introducing spatial constraints on the retrieval algorithm. Repeat overflights of the same area under global campaigns such as SBG will provide a wealth of data which can also be used to introduce spatial constraints. Data from multiple overflights could be used to construct location‐dependent statistical constraints on the expected reflectance retrievals. Similarly, reconstructing a reflectance base map from multiple measurements of the same area would make the atmospheric and surface retrievals more well‐posed by requiring an underlying surface consistency. These methods could be extended to include spatio‐temporal statistical constraints as the surface and atmosphere are sampled repeatedly to account for expected seasonal variability. In this sense, the emulator approach presented here is simply a harbinger of future spatio‐spectral retrievals that will become possible as computing speed and code optimization progresses.
Authors: David R Thompson; Niklas Bohn; Philip G Brodrick; Nimrod Carmon; Michael L Eastwood; Regina Eckert; Cédric G Fichot; Joshua P Harringmeyer; Hai M Nguyen; Marc Simard; Andrew K Thorpe Journal: J Geophys Res Biogeosci Date: 2022-06-27 Impact factor: 4.432
Authors: Kevin Alonso; Martin Bachmann; Kara Burch; Emiliano Carmona; Daniele Cerra; Raquel de Los Reyes; Daniele Dietrich; Uta Heiden; Andreas Hölderlin; Jack Ickes; Uwe Knodt; David Krutz; Heath Lester; Rupert Müller; Mary Pagnutti; Peter Reinartz; Rudolf Richter; Robert Ryan; Ilse Sebastian; Mirco Tegler Journal: Sensors (Basel) Date: 2019-10-15 Impact factor: 3.576
Authors: Philip G Brodrick; David R Thompson; Michael J Garay; David M Giles; Brent N Holben; Olga V Kalashnikova Journal: J Geophys Res Atmos Date: 2022-04-07 Impact factor: 5.217
Authors: David R Thompson; Niklas Bohn; Philip G Brodrick; Nimrod Carmon; Michael L Eastwood; Regina Eckert; Cédric G Fichot; Joshua P Harringmeyer; Hai M Nguyen; Marc Simard; Andrew K Thorpe Journal: J Geophys Res Biogeosci Date: 2022-06-27 Impact factor: 4.432