Wavelet transform is an important tool in the computational signal processing of terahertz time-domain spectroscopy (THz-TDS) measurements. Despite its prevalence, the effects of using different forms of wavelet transforms in THz-TDS studies have not been investigated. In this paper, we explore the implications of using the maximal overlap discrete wavelet transform (MODWT) versus the well-known discrete wavelet transform (DWT). We demonstrate that the spectroscopic features extracted using DWT can vary over different overlapping frequency ranges. On the contrary, MODWT is translation-invariant and results in identical features, regardless of the spectral range used for its implementation.We also demonstrate that the details coefficients obtained by the multiresolution analysis (MRA) using MODWT are associated with zero-phase filters. In contrast, DWT details coefficients suffer from misalignments originated from the down- and upsampling operations in DWT pyramid algorithm. Such misalignments have adverse effects when it is critical to retain the exact location of the absorption lines. We study the differences of DWT and MODWT both analytically and experimentally, using reflection THz-TDS measurements of α-lactose monohydrate. This manuscript can guide the researchers to select the right wavelet analysis tool for their specific application of the THz spectroscopy.
Wavelet transform is an important tool in the computational signal processing of terahertz time-domain spectroscopy (THz-TDS) measurements. Despite its prevalence, the effects of using different forms of wavelet transforms in THz-TDS studies have not been investigated. In this paper, we explore the implications of using the maximal overlap discrete wavelet transform (MODWT) versus the well-known discrete wavelet transform (DWT). We demonstrate that the spectroscopic features extracted using DWT can vary over different overlapping frequency ranges. On the contrary, MODWT is translation-invariant and results in identical features, regardless of the spectral range used for its implementation.We also demonstrate that the details coefficients obtained by the multiresolution analysis (MRA) using MODWT are associated with zero-phase filters. In contrast, DWT details coefficients suffer from misalignments originated from the down- and upsampling operations in DWT pyramid algorithm. Such misalignments have adverse effects when it is critical to retain the exact location of the absorption lines. We study the differences of DWT and MODWT both analytically and experimentally, using reflection THz-TDS measurements of α-lactose monohydrate. This manuscript can guide the researchers to select the right wavelet analysis tool for their specific application of the THz spectroscopy.
Terahertz time-domain spectroscopy (THz-TDS) has emerged as a promising technique for non-destructive testing applications such as monitoring hidden defects [1,2,3], non-contact thickness measurement [4,5], investigation of artworks and historical artifacts [6,7], and quantification of polymorphism, crystallinity, and bulk porosity in pharmaceuticals [8,9]. Many substances, including molecular crystals and polar gases, possess distinctive spectroscopic fingerprints in the THz range, which allow for non-invasive material identification, sample characterization, and quality control [10,11,12]. However, electromagnetic scattering can significantly alter and obscure these vibrational absorption features [13,14,15,16]. This phenomena in turn can hinder spectroscopic studies and chemical sensing in the presence of rough surface or volume scattering. Additionally, multiple Mie scattering can introduce resonance-shaped spectral artifacts into the measured THz spectra [17]. To address these challenges, signal processing and experimental techniques have been introduced to retrieve spectroscopic information in the presence of such scattering effects [14,17,18,19,20,21].The widespread applications of THz-TDS demand utilizing advanced signal processing techniques to enhance the precision and accuracy in identification of spectroscopic features [22,23]. Because THz pulses are inherently sparse [24], and they contain localized features in both time and frequency domains, wavelet transform is particularly advantageous for this purpose. Wavelet transform has been used extensively for denoising time- and Fourier-domain THz signals [25,26,27,28,29,30] in addition to THz images obtained using CCD cameras [31,32]. Moreover, wavelet shrinkage accompanied by a Fourier-domain Wiener deconvolution has been proven to be effective in finding the thickness of optically-thin layers in biological samples [33] and art paintings [34]. THz compressive sensing [35], THz tomographic imaging [36], and THz feature extraction for differentiation between various pathological tissues [37,38] are also among the main applications of the wavelet transform in analyzing THz-TDS measurements.Recently, in order to enable in-line process control and real-time THz sensing applications, we have designed and fabricated a fiber-coupled THz spectral imager, named Portable HAndheld Spectral Reflection (PHASR) Scanner [39]. This device incorporates the TERA-ASOPS (Asynchronous OPtical Sampling) dual-fiber-laser spectrometer (Menlo Systems, Inc., Newton, NJ, USA) into a collocated, telecentric imaging configuration, which utilizes an f- lens [40] and a two-axis motorized scanning system [41]. We have conducted multiple experiments on the utility of the PHASR Scanner for material identification [42] and biomedical imaging [43] in the presence of electromagnetic scattering. We have shown that because wavelet transform provides spectral localization at multiple resolution scales, when applied to THz absorption or reflection coefficients, it can separate the characteristic resonant signatures from the scattering-induced spectral artifacts [15,16,17,44]. Furthermore, the wavelet coefficient levels associated with resonant signatures can be identified using total variation in reflection spectroscopy [16], or bimodality coefficient [17] and principal component analysis [45] in the transmission geometry.In all aforementioned studies, two main forms of the wavelet transform have been utilized. First, the critically-sampled discrete wavelet transform (DWT), and second, an undecimated form of the DWT, named the maximal overlap DWT (MODWT) [46]. In this paper, we study the implications of using MODWT and DWT for feature extraction in THz spectroscopic data. We demonstrate two main advantages of MODWT over DWT, including MODWT invariance with respect to the frequency range and association of MODWT multiresolution analysis (MRA) with zero-phase filters. These advantages come at a slightly higher computational cost, which is affordable for many applications due to the high efficiency in modern computer processing units. In the following, we first explain the differences between DWT and MODWT analytically. Then, we describe the experimental methodology in our study to obtain the THz-TDS measurements. We will show the effects of using MODWT versus DWT in extracting the resonances of -lactose over the spectral region of THz. These results elucidate the advantages of MODWT in sensing applications, where it is critical to retain the exact location of the spectral features, such as a resonant frequency.
2. Materials and Methods
2.1. Analytical Description
2.1.1. Translation Invariance
The DWT decomposes a signal into a set of wavelet and scaling coefficient vectors. Each vector describes the localized changes in the signal at a specific variation scale. The scale is an interval of the signal over which weighted averages of that signal (i.e., the scaling coefficients) and the differences of those weighted averages (i.e., the wavelet coefficients) are calculated. The DWT coefficients are obtained using the pyramid algorithm shown in Figure 1a [46].
Figure 1
(a) The block diagram of the DWT pyramid algorithm. The input signal is filtered recursively by a set of high-pass wavelet and low-pass scaling filters. The output of each filter is downsampled by a factor of two. (b) The block diagram of the MODWT pyramid algorithm. In MODWT, the sampling rate is the same at all levels of decomposition. However, the th-level filters are upsampled by a factor of two with respect to the jth-level filters.
Here, is the input signal, which can be a THz pulse either in the time or Fourier domain. The is the vector of the jth-level wavelet coefficients of . It is associated with differences in the weighted averages of over the scale of , where indicates the sampling period in . Equivalently, corresponds to the spectral content of in the range of , displaying band- or high-pass characteristics [46]. The is composed of the jth-level scaling coefficients of . It is associated with the weighted averages of over the scale of , or, equivalently, it corresponds to the spectral content of in the range of , displaying low-pass characteristics. The jth-level of the DWT pyramid algorithm is given by [46],
and
yielding the wavelet and scaling coefficients, respectively. In Equations (1) and (2), h and g are the mother wavelet and scaling filters of size K. It should be noted that because of downsampling by two, the number of DWT wavelet and scaling coefficients at level j is -th of the length of , N. This progressively coarser sampling in DWT is advantageous for applications that rely on compact signal representations, such as signal and image compression [47,48]. However, it leads DWT coefficients to be critically sensitive to the range of analysis. According to Equations (1) and (2), and are formed by discrete, cyclic convolutions of h and g with , respectively, and the results are downsampled by two. The convolution of two discrete functions (e.g., h and ) describes the summation of the product of two functions after one is reversed () and shifted (), over all possible values of shifts (n). In Equations (1) and (2), the product of h and or g and depends on which data points in are excluded in downsampling.Conversely, in the MODWT pyramid algorithm, which is shown by the block diagram in Figure 1b, the downsampling operation is excluded. In other words, the wavelet and scaling coefficients at all decomposition levels have the same size as the original signal. We use the symbol “∼” to differentiate MODWT wavelet and scaling filters and coefficients (, , , and ) from DWT. The MODWT wavelet and scaling filters, and , are related to h and g by and , respectively [46]. In the block diagram of Figure 1b, the MODWT wavelet and scaling filters at level j are upsampled by with respect to and to form a multiresolution decomposition. The upsampling by is achieved by inserting zeros between successive elements of . For example, if is a MODWT wavelet filter with K elements, the second-level wavelet filter is formed by , composed of elements. Accordingly, the jth-level of the MODWT pyramid algorithm is given by [46],
and
Importantly, the convolution sums in Equations (3) and (4) include all possible products of the upsampled and or upsampled and . Therefore, the MODWT wavelet and scaling coefficients are translation-invariant and do not depend on the range of the analysis.
2.1.2. Zero-Phase Filtering
The phase functions of the DWT and MODWT filters change the position of the extracted features in the wavelet domain with respect to their locations in the original signal. If the phase function of a filter is linear, shifting its output coefficients by a certain number, which is equal to the slope of its phase as a function of frequency, will compensate the introduced misalignments [15]. However, the phase functions of the orthogonal wavelet and scaling filters, which are used in DWT and MODWT, are not linear [49]. Moreover, the recursive filtering of a signal by the filters in the pyramid algorithm makes the non-linearity of the phase functions more complicated. To address this problem in applications where retaining the location of the features is important, the cross-correlation of wavelet and scaling coefficients with their corresponding wavelet and scaling filters can be used to form a zero-phase multiresolution analysis. In the DWT pyramid algorithm of Figure 1a, the jth-level wavelet coefficients are formed by convolutions between and and a final convolution with . Equivalently, we can build jth-level DWT wavelet and scaling filters given by and , respectively, where the symbol indicates a cyclic convolution. Thereafter, and , which were previously calculated using Equations (1) and (2), can be directly obtained from given by [46],
and
where is the length of and . Consequently, the aforementioned cross-correlation between and is given by [46],
where is the jth-level wavelet coefficients upsampled by , which is formed by inserting zeros before every element of . The symbol “⍟” in Equations (6) and (7) indicates a cyclic cross-correlation. The d is composed of the jth-level DWT details coefficients and, similar to w, it is associated with differences in the weighted averages of x over the scale of τ2. Assuming that X(f), D(f), and H(f) are used to represent the Fourier transform of x(n), d(n), and h(n), respectively, according to the convolution theorem and while ignoring the down- and upsampling operations, D(f) is obtained by
where |H(f)|2 has a zero phase function. Therefore, it only modifies the magnitude spectrum of X(f), while its phase spectrum remains intact. Thus, there will be no misalignments between d and x caused by the filters’ phase functions. However, it should be noted that the downsampling of DWT wavelet coefficients in Equation (1) and their subsequent upsampling in Equation (7) can still introduce mismatches between the location of the features in d and x.Similar to DWT, MODWT details coefficients are obtained by a cross-correlation between and given by [46],
or, equivalently in the Fourier domain,
Therefore, is also formed by zero-phase filtering of with . Moreover, because the MODWT wavelet coefficients are neither down- nor upsampled, there will be no misalignments between and x.
2.2. Measurement Setup
The THz–TDS measurements are obtained using a TERA-ASOPS (ASynchronous OPtical Sampling) high-speed THz time-domain spectrometer (Menlo Systems, Inc., Newton, NJ, USA). The ASOPS system is composed of two 1560 nm mode-locked femtosecond lasers, which are slightly detuned in their repetition rates by a fixed value of Hz. The difference in the repetition rate results in pulses from the two lasers that are emitted at to be separated at detector by given by,
where MHz in our experiments. This enables the probe pulse to linearly sample the THz pulse generated by the pump pulse. Figure 2a shows a schematic of the measurement setup. The femtosecond pump and probe pulses are guided by optical fibers to the emitter and detector photoconductive antennae (PCA), generating THz radiation with an effective bandwidth of THz. Two TPX50 lenses (Menlo Systems, Inc., Newton, NJ, USA) collimate and focus the generated beams on the sample surface at an angle of incidence of and a focal spot size of approximately 1.2 mm at the peak frequency of the THz emissions. Similar lenses collimate and refocus the specular reflections on the detector antenna.
Figure 2
(a) The schematic of the measurement setup. The specular reflections are measured at . (b) An example THz pulse reflected from an -lactose pellet. (c) The spectral amplitude (SA) of -lactose over the frequency range of THz. A red arrow marks the zero-crossing at 0.53 THz, which is a characteristic resonance mode of -lactose. (d) The vertically-offset fourth-level DWT wavelet coefficients of SA, applied in the frequency domain. Each line in (d) is composed of the wavelet coefficients calculated from a different spectral range, i.e., THz in the purple trace, THz in the red trace, THz in the blue trace, and THz in the green trace. The lower bound of the frequency ranges are emphasized using vertical lines at the beginning of each trace. Each inset box shaded with the same color shows the overlap between the wavelet filter and the resonant signature at 0.53 THz in the corresponding reflection spectra. In each case, it can be noted that whether the resonance at 0.53 THz is resolved in depends on the overlap between the spectral feature and the DWT wavelet filter.
2.3. Sample Preparation
We prepared sample disks made from -lactose monohydrate powders (Spectrum Chemical Mfg. Corp., Gardena, CA, USA) with resonant frequencies at 0.53, 1.2, and 1.4 THz [50,51]. To create the sample disks, we mixed the -lactose powders with ultra-fine high-density polyethylene (HDPE) (1:1 ratio), and pressed the mixture under 3000 PSI load for about 30 min, yielding pellets with approximately 4 mm thickness and 50 mm diameter. Therefore, each sample was thick enough to avoid any overlaps between the front- and back-reflections.
2.4. Signal Conditioning
The main pulse in the measured time-domain electric field is windowed to exclude a second Fabry Perot reflection pulse. To avoid creation of an artificial step in the signal, we apply a tapering Hann window to the data points before the cut-off location. To account for the spatial heterogeneity in the samples, the average amplitude spectrum of ten disjoint measurements over the sample surface is used. Additionally, the sample measurements are deconvolved in the Fourier domain by the reflections from a reference mirror placed at the location of the sample surface. The amplitude spectrum of a sample with a smooth surface exhibits a linear trend with a close-to-zero slope. The effect of this trend in each amplitude spectrum is removed using linear curve fitting. Following this step, tapering Hann windows are applied at the boundary points in the frequency-domain to smooth the transition into the baseline. The spectral window range is chosen such that it does not affect the signal within the useful bandwidth. For a DWT with J levels of decomposition, the size of the signal N should be an integer-multiple of . Therefore, we zero-pad the signals to for a DWT with levels of decomposition. Subsequently, the DWT and MODWT wavelet and details coefficients are calculated using Equations (1), (3), (7), and (9). We use the MATLAB software (Mathworks, Natick, MA, USA) for analyzing the data and creating the figures.
3. Results
Figure 2b shows an example specular reflection measured from an -lactose pellet. Figure 2c shows the spectral amplitude (SA) of -lactose over the frequency range of THz. As described in the “signal conditioning” section, to obtain the SA, the sample measurement is deconvolved in the Fourier domain using a reference measurement, and the result is detrended using linear curve-fitting. The zero-crossing of the resonant signature in the SA signal, which is delineated using a red arrow in Figure 2c, is positioned at 0.53 THz. Figure 2d shows the vertically-offset fourth-level DWT wavelet coefficients of the SA signal. The wavelet coefficients are calculated using the least asymmetric mother wavelet with four vanishing moments, i.e., the LA(8) or sym4 wavelet filter [49]. Each trace in Figure 2d is composed of the wavelet coefficients calculated over different overlapping spectral ranges of SA. These ranges include THz (purple line), THz (red line), THz (blue line), and THz (green line). The beginning frequency of each trace is delineated using a vertical line. It can be seen that the features extracted by DWT wavelet coefficients depend on the spectral range used for calculation of the wavelet coefficient. In particular, with the beginning frequency at and THz in the purple and red traces, the wavelet coefficients do not extract the resonance of -lactose. On the contrary, the same resonant feature can be captured by the red and green traces with the beginning frequency at and THz, respectively. In Figure 2d, each box with the same color as the corresponding trace shows the overlap between the wavelet filter (dashed line) and the resonant signature (solid line) at 0.53 THz. It can be observed that, because of the downsampling operation in Equation (1), the wavelet filter and the resonant signature overlap in neither of the purple and blue boxes. However, there is a perfect overlap between the wavelet filter and the resonant signature in both of the red and green boxes, leading to the capturing of the resonance.Figure 3 compares the SA of -lactose as the control sample in Figure 3a with the vertically-offset fourth-level MODWT wavelet coefficients in Figure 3b. Similar to Figure 2d, each trace in Figure 3b is composed of the wavelet coefficients calculated over a different frequency range, including THz in the purple trace, THz in the red trace, THz in the blue trace, and THz in the green trace. It can be observed that, unlike DWT, all MODWT wavelet coefficients, regardless of their frequency range, can extract the resonant signature. In Figure 3b, each box is shaded with the same color as the corresponding wavelet coefficient trace. The inset boxes display the overlap between the MODWT wavelet filter and the resonant signature at 0.53 THz. It can be seen that the wavelet filter and the resonant signature overlap in all the boxes, leading to the extraction of the resonant feature by their corresponding traces.
Figure 3
(a) The SA of -lactose over the frequency range of THz as the control sample. (b) The vertically-offset fourth-level MODWT wavelet coefficients of the SA signal, applied in the frequency domain. Each line in (b) is calculated over a different frequency range, including THz in the purple line, THz in the red line, THz in the blue line, and THz in the green line. The lower bound of the frequency ranges are emphasized using vertical lines at the beginning of each trace. Each inset box, shaded with the same color, shows the overlap between the wavelet filter (dashed line) and the resonant signature (solid line) at 0.53 THz in the corresponding reflection spectra. (c) The vertically-offset fourth-level DWT and MODWT wavelet and details coefficients of the SA signal. Panel (d) shows the SA of -lactose over the entire measurement bandwidth of THz. In (e) are the vertically-offset third-level DWT and MODWT details vectors of the SA signal in (d). Panel (f) zooms into the details coefficients around 1.4 THz, revealing that the DWT details vector (blue line) is slightly misaligned with respect to the resonant frequency, while MODWT details vector (red line) maintains the exact spectral location. The vertical dashed lines in (a–f) delineate the resonances of -lactose.
It should be noted that to align the wavelet coefficients in Figure 2d and Figure 3b with the SA signal, wavelet coefficients are circularly shifted, following the phase function correction approach described in [15]. The number of these circular shifts is given by the power of T in Figure 2d and Figure 3b, i.e., and , where a negative sign indicates that the circular shifts are to the left. On the contrary, as we explain in the “zero-phase filtering” section, details vectors are inherently associated with zero-phase filters, while they have the same band- or high-pass characteristics as the wavelet coefficients. Therefore, they do not require the phase corrections for the purpose of aligning the features. Figure 3c compares the fourth-level DWT and MODWT wavelet and details coefficients. To better demonstrate the impact of the phase function on alignment of the features, wavelet coefficients in Figure 3c are not circularly shifted. It can be observed that and do not align at the resonant frequency, which is delineated using a vertical dashed line. In contrast, the resonant features extracted by details coefficients are aligned at 0.53 THz without requiring any phase correction. Furthermore, to validate the superiority of MODWT over DWT details, Figure 3d–f compare the third-level DWT and MODWT details coefficients of the -lactose’s SA. In Figure 3d, the SA signal is shown over the entire measurement bandwidth of THz. The resonances of -lactose at 0.53, 1.2, and 1.4 THz are delineated using vertical dashed lines. Although the details coefficients in Figure 3e do not require any circular shifts, zooming in at the details coefficients around 1.4 THz, which are shown in Figure 3e, reveals that DWT details are slightly misaligned with respect to the resonant frequency. This misalignment arises from the down- and upsampling operations used in calculating the DWT details coefficients. Therefore, in spectroscopic applications where retaining the exact location of features is critical, MODWT details are preferable over DWT. In our previous papers, we have also demonstrated that the resonant signatures of different characteristics, such as spectral height and width, can be identified using wavelet coefficients at different decomposition levels [16,17]. We have presented two computational techniques, i.e., the total variation of second-order derivatives of the reflection coefficient [16] or the bimodality coefficient spectrum of the absorption coefficient [17], to identify the decomposition levels associated with each resonant feature.Figure 4a,b compares the translation-invariance property between the DWT and MODWT details coefficients. Similar to Figure 2d and Figure 3b, each trace in Figure 4a,b is calculated over a different frequency range. It can be seen that DWT details calculated over THz (purple trace) and THz (blue trace) do not contain the -lactose’s resonance. In contrast, all MODWT details coefficients, regardless of their frequency range, can extract the resonant signature. Finally, Figure 4c–f show the amplitude of the DWT and MODWT details vectors calculated over two different, overlapping frequency ranges down to four levels of decomposition. The horizontal white dashed lines separate the levels of decomposition. The details coefficients at each panel are min–max-normalized between zero and one. Figure 4c,d compare the DWT details coefficients calculated over THz and THz, respectively. It can be seen that the DWT coefficients over the two regions are not alike. Moreover, the resonant lines at 0.53 THz are not clearly distinguishable from the other lines surrounding them. Similarly, Figure 4e,f compare the MODWT details coefficients calculated over the same frequency ranges as Figure 4c,d. The MODWT details coefficients calculated over the two different regions are identical, indicating the translation-invariance of MODWT at all levels of decomposition. Moreover, the resonant lines at both frequency regions are clearly distinct from the other coefficients.
Figure 4
The vertically-offset fourth-level DWT details coefficients, (a), are compared to the the same vertically-offset fourth-level MODWT details coefficients in (b). Each trace in (a,b) is calculated over a different frequency range, including THz in the purple traces, THz in the red traces, THz in the blue traces, and THz in the green traces. In (c,d) the amplitude of all DWT details coefficients are calculated over THz, (c), and THz, (b) for a decomposition level of . In (e,f), the amplitude of all MODWT details coefficients are calculated over THz and THz, respectively, for a decomposition level of . The horizontal white dashed lines separate the levels of decomposition. The details coefficients at each panel are min–max-normalized between zero and one.
4. Conclusions
In this paper, we studied the main advantages of MODWT over DWT for the extraction of characteristic resonances using broadband terahertz spectroscopy technique. The first advantage is the invariance property of MODWT with respect to the spectral range of analysis. We showed that by changing the frequency range in the DWT analysis of the reflection spectral amplitudes of -lactose, the wavelet coefficients calculated over different ranges are not alike. Therefore, the characteristic resonances do not appear in the DWT wavelet and details coefficients of all frequency ranges. In particular, we showed that the -lactose’s resonance at 0.53 THz is not captured by the DWT wavelet and details coefficients over the overlapping spectral ranges of THz and THz. However, in MODWT, there is no dependency on the the frequency range, and all wavelet and details coefficients can extract the resonant signature. The second advantage is the zero-phase filtering inherent in calculating the MODWT details coefficients, which results in a perfect alignment between the spectroscopic features in detail coefficients and the original signal. We demonstrated that although DWT details coefficients are also associated with zero-phase filters, because of the down- and upsampling operations used in their calculation, slight mismatches still appear between the details coefficients and the original signal. Our experimental results show that DWT details coefficients associated with -lactose’s resonant signature at 1.40 THz do not align with the original signal at the resonant frequency, whereas there is a perfect alignment between MODWT details vector and the original spectrum. Due to these advantages, MODWT is a more reliable analytical tool for spectroscopic applications that rely on the extraction and the exact spectral location of the characteristic absorption lines.
Authors: Clare J Strachan; Philip F Taday; David A Newnham; Keith C Gordon; J Axel Zeitler; Michael Pepper; Thomas Rades Journal: J Pharm Sci Date: 2005-04 Impact factor: 3.534
Authors: Mahmoud E Khani; Zachery B Harris; Omar B Osman; Juin W Zhou; Andrew Chen; Adam J Singer; M Hassan Arbab Journal: Sci Rep Date: 2022-03-24 Impact factor: 4.996