
Deep Unfolding of Iteratively Reweighted ADMM for Wireless RF Sensing.

Udaya S K P Miriya Thanthrige1, Peter Jung2,3, Aydin Sezgin1.   

Abstract

We address the detection of material defects located inside a layered material structure using compressive sensing-based multiple-input and multiple-output (MIMO) wireless radar. Here, strong clutter due to the reflection from the layered structure's surface often makes the detection of the defects challenging. Thus, sophisticated signal separation methods are required for improved defect detection. In many scenarios, the number of defects that we are interested in is limited, and the signaling response of the layered structure can be modeled as a low-rank structure. Therefore, we propose joint rank and sparsity minimization for defect detection. In particular, we propose a non-convex approach based on the iteratively reweighted nuclear and ℓ1-norm (a double-reweighted approach) to obtain a higher accuracy compared to the conventional nuclear norm and ℓ1-norm minimization. To this end, an iterative algorithm is designed to estimate the low-rank and sparse contributions. Further, we propose deep learning-based parameter tuning of the algorithm (i.e., algorithm unfolding) to improve the accuracy and the speed of convergence of the algorithm. Our numerical results show that the proposed approach outperforms the conventional approaches in terms of mean squared errors of the recovered low-rank and sparse components and the speed of convergence.


Keywords:  algorithm unfolding; clutter suppression; compressive sensing; defects detection; reweighted norm


Year:  2022        PMID: 35459049      PMCID: PMC9028850          DOI: 10.3390/s22083065

Source DB:  PubMed          Journal:  Sensors (Basel)        ISSN: 1424-8220            Impact factor:   3.847


1. Introduction

Electromagnetic (EM) wave-based remote sensing has many potential applications, such as behind-the-wall object identification [1], multi-layer target detection [2], material characterization [3], defect detection [4,5,6,7], and many more. In EM and radio frequency (RF) wave-based detection of objects/defects that are behind or inside a layered structure, the EM waves that reflect from the object/defect are analyzed. Here, one major challenge is the presence of strong unwanted reflections, i.e., clutter [1,8]. In this context, the main source of clutter is the reflection from the surface of the layered material structure. State-of-the-art clutter suppression methods such as background subtraction (BS), time-gating, and subspace projection (SP) [9] are not able to suppress the clutter in the context of object/defect detection. This is because BS requires reference data of the scene, and such reference data is not available most of the time. Moreover, SP requires prior knowledge to determine a suitable threshold for clutter removal. In time-gating, on the other hand, the time window in which the clutter resides needs to be determined for successful clutter removal; however, this time window cannot be determined exactly. Clutter suppression becomes even more challenging if objects and clutter are closely located. This occurs regularly in the detection of defects located inside a layered structure: due to the small delay spread, the signaling responses of the defects and the clutter superimpose on each other. To overcome these challenges, advanced signal processing methods are required for clutter suppression [1,8,10].

In many scenarios, the responses of the material defects are weak and, thus, difficult to detect. Even in the absence of clutter, the very low signal amplitude may make it difficult to detect material defects in the presence of noise. In this context, weak signal detection in the presence of noise has drawn attention in the defect detection research field, and we briefly discuss it in the following. Stochastic resonance has been widely used in weak signal detection [11,12,13]. In [11], to improve weak signal detection by stochastic resonance, the relationship between the current and the previous value of the state variable of the system is utilized. It is worth noticing that weak signal detection also plays an important role in other applications such as health monitoring. Similar to defect detection, health monitoring aims to detect weak signals in the presence of strong noise. In [14], a comparative study of well-known adaptive mode decomposition approaches used for this task is presented. Here, the advantages, limitations, and performance comparison of adaptive mode decomposition approaches, namely empirical mode decomposition, Hilbert vibration decomposition, and variational mode decomposition, are given. Beyond signal detection, the extraction of features of the detected signal is important in many applications, as these features are used for classification and clustering. In this context, it is important to select the most important features, since the accuracy and speed of the classification depend on the features that are used. The impact of feature selection for electromyographic signal decomposition is studied in [15].
Moreover, in this study, various feature extraction methods are compared, and a guide to selecting the most important features that improve the signal decomposition is provided [15]. As discussed above, weak signal detection in the presence of disturbances like noise or clutter is challenging; therefore, advanced signal processing methods are required. Next, we discuss clutter suppression in more detail. In many scenarios, the number of defects is limited. Therefore, the signaling response of the defects is sparse in nature. By exploiting this, compressive sensing (CS) [16] based approaches have shown promising results in object/defect detection with clutter [1,8]. In addition, CS-based approaches do not require a full measurement data set, which results in fast data acquisition and less sensitivity to sensor failure, wireless interference, and jamming. In CS-based approaches, it is considered that the clutter resides in a low-rank subspace and that the response of the objects is sparse [1,8]. To this end, we present a general data acquisition model in which the received data vector is modeled as a combination of a low-rank matrix L and a sparse matrix S, observed through compression operators/measurement matrices and corrupted by measurement noise. Here, the compression ratio is defined as K/(MN). Further, vec(·) denotes the vectorization operator, which converts a matrix to a vector by stacking the columns of the matrix. Given the received data vector, our aim is to estimate the signals of interest using a small number of linear measurements by minimizing the rank and the sparsity, subject to a noise bound, where regularization parameters balance the two terms and a small positive constant bounds the noise. Here, the ℓ0-norm, i.e., sparsity, counts the number of non-zero components. Note that the problem given in (2) is also known as robust principal component analysis (RPCA) [17]. The RPCA problem has different types as follows: (a) standard/classical RPCA, in which both measurement matrices in (1) are identity matrices [17]; (b) the case in which the measurement operator is a selection operator that selects a random subset of size K of the entries [18]; (c) the case in which both measurement operators are general matrices mapping one vector space to another [18]. The problem given in (2) is NP-hard and, thus, difficult to solve. To this end, convex relaxations of sparsity and rank in terms of the ℓ1-norm of a matrix (absolute sum of elements) and the nuclear norm of a matrix (sum of singular values) are utilized, respectively [19,20,21]. However, while enjoying a rigorous analysis, the convex relaxations of sparsity and rank have disadvantages in many applications. In addition, in many applications, the important properties of the signal are preserved by the large coefficients/singular values of the signal [22]. However, ℓ1-norm/nuclear norm minimization algorithms shrink all coefficients/singular values with the same threshold. Thus, to avoid this weakness, the larger coefficients/singular values should be shrunk less. To address the aforementioned drawbacks, non-convex approaches such as reweighted nuclear norm and reweighted ℓ1-norm minimization have been considered [22,23,24,25]. These non-convex approaches have shown better performance than the convex relaxations by providing tighter characterizations of rank and sparsity, yet their behavior and convergence have not been fully studied [26]. Generally, RPCA problems are numerically solved by means of iterative algorithms based on the alternating direction method of multipliers (ADMM) [17,27,28] or the accelerated proximal gradient (APG) [29,30].
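
To make the compressive data acquisition model in (1) concrete, the following minimal Python/NumPy sketch generates one synthetic measurement of a low-rank plus sparse matrix pair under assumed generic compression operators; the names (A_op, B_op, K) and sizes are illustrative placeholders, not the paper's exact symbols.

import numpy as np

# Minimal sketch of the data acquisition model in (1): a K-dimensional
# measurement of a low-rank matrix L plus a sparse matrix S with noise.
rng = np.random.default_rng(0)
M, N, r, n_nonzero, K = 30, 30, 3, 20, 450   # K/(M*N) = 0.5 compression ratio

L = rng.standard_normal((M, r)) @ rng.standard_normal((r, N))   # rank-r component
S = np.zeros((M, N))
idx = rng.choice(M * N, n_nonzero, replace=False)
S.flat[idx] = rng.standard_normal(n_nonzero)                    # sparse component

A_op = rng.standard_normal((K, M * N))   # compression operator acting on vec(L)
B_op = rng.standard_normal((K, M * N))   # compression operator acting on vec(S)
noise = 0.01 * rng.standard_normal(K)

# vec() stacks the columns of a matrix, i.e., Fortran-order flattening.
y = A_op @ L.flatten(order="F") + B_op @ S.flatten(order="F") + noise
print(y.shape)  # (K,)
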
In iterative algorithms, the accuracy of the recovered signal components and the convergence rate depend on the proper selection of parameters (e.g., regularization/thresholding/denoising parameters). Generally, parameters are chosen by hand-crafting, which is a time-consuming task. In this context, machine learning-based parameter tuning using training data has shown promising results in many applications such as sparse vector recovery [31,32,33] and image processing [34]. For instance, as shown in [31], the learned (unfolded) iterative soft-thresholding algorithm (LISTA) converges twenty times faster than the conventional iterative soft-thresholding algorithm (ISTA). This approach is known as algorithm unrolling/unfolding, and an overview can be found in [35]. In this work, we formulate the detection of material defects as an RPCA problem. This RPCA problem is solved based on reweighted nuclear norm and reweighted ℓ1-norm minimization. However, most of the time, RPCA problems are solved using the convex relaxation or with a single reweighting, i.e., either the reweighted ℓ1-norm or the reweighted nuclear norm [22,30,36,37]. Next, our objective is to jointly estimate the low-rank matrix L and the sparse matrix S from a few compressive measurements. It is worth noticing that most of the work in the literature focuses on the standard RPCA problem, where the measurement matrices are identity matrices [22,36]. To the best of our knowledge, the full doubly reweighted (joint reweighted nuclear norm and reweighted ℓ1-norm) approach has not yet been studied comprehensively in the literature for the compressive case. We therefore propose an iterative algorithm for (locally) minimizing the objective, i.e., the reweighted nuclear norm and the reweighted ℓ1-norm, based on the alternating direction method of multipliers (ADMM) [38,39]. Further, we propose deep learning-based parameter tuning to improve the accuracies of the recovered low-rank and sparse components and the convergence rate of the ADMM-based iterative algorithm. In addition to EM-based defect detection, there are many applications where the generated data can be modeled as a combination of low-rank plus sparse contributions. For instance, in video surveillance, the static background results in a low-rank contribution, and moving objects result in a sparse contribution [40]. Further, in human face recognition from a corrupted face image, the human face can be approximated as a low-rank structure, while self-shadowing and specularities are modeled as sparse contributions [40,41]. Therefore, RPCA can be applied to the aforementioned applications and others, as long as the data/measurements are combinations of low-rank and sparse contributions. It is worth noticing that our proposed full doubly reweighted (joint reweighted nuclear norm and reweighted ℓ1-norm) approach with deep learning-based parameter tuning for RPCA is not limited to EM-based defect detection and can be applied to other applications that are solved using RPCA. In the context of algorithm unfolding for RPCA, the studies on convolutional robust principal component analysis (CORONA) [30,37] are the closest to our work. There are fundamental methodological differences between our work and [30,37]: (a) Both [30,37] considered the standard convex relaxation (mixed ℓ1,2-norm and nuclear norm) to solve the RPCA problem, while we propose the reweighted ℓ1-norm and reweighted nuclear norm.
(b) In this work, the RPCA problem is solved by an iterative algorithm based on ADMM, while the iterative algorithm in [30,37] is based on fast ISTA (FISTA). The motivation for proposing ADMM over ISTA/FISTA for RPCA is as follows. As shown in [17,27] for RPCA, the ADMM-based approach is able to achieve the desired solution with a good recovery error within a few iterations for a wide range of applications compared to APG-based approaches like ISTA/FISTA. Further, the performance of APG-based approaches depends heavily on good continuation schemes [17]. This condition may not be satisfied for a wide range of applications. (c) Different from [30,37], our focus is on defect detection based on the stepped-frequency continuous wave (SFCW) radar, while [30,37] focus on an ultrasound imaging application. Moreover, the experimental measurement data of [30,37] assume that the measurement operator in (1) is an identity matrix, while we consider both the scenario where it is an identity matrix and the scenario where it is a compression operator. Further, for the SFCW radar application, we consider the corresponding measurement operators. Further, we have studied the performance of our approach with a generic real-valued Gaussian model for different compression ratios. CORONA focuses on ultrasound imaging applications where the sparse matrix S has a row-sparse structure. Thus, there is a strong relationship between measurements, and there is a common sparsity structure. Therefore, mixed ℓ1,2-norm minimization is more suitable than ℓ1-norm minimization to estimate the sparse matrix S. Further, CORONA is based on a convolutional deep neural network to learn spatially invariant features of the data, which is more suitable for ultrasound imaging applications than a dense deep neural network (DNN). However, we assume that there is no strong relationship between a data element and its neighboring elements, nor is there a specific sparsity structure. Thus, we consider a dense DNN in this work. It is straightforward to modify our ADMM approach to use a convolutional DNN and mixed ℓ1,2-norm minimization. In CORONA [30], customized complex-valued convolution layers and singular value decomposition operations are utilized. In our work, we have implemented a dense DNN that supports complex-valued data and the singular value decomposition (SVD) operation. The contributions of this work are summarized as follows: We propose a generic approach based on the non-convex fully double-reweighted approach, i.e., both the reweighted ℓ1-norm and the reweighted nuclear norm simultaneously, to solve the RPCA problem. To this end, we propose an iterative algorithm based on ADMM to estimate the low-rank and sparse components jointly. In contrast to standard/classical RPCA, we consider the compressive sensing data acquisition model, which reflects the practical problem at hand more closely. Next, to improve the accuracy and convergence speed of the ADMM-based iterative algorithm, we propose a deep neural network (DNN) to tune the parameters of the iterative algorithm (i.e., algorithm unfolding/unrolling) from training data. We extensively evaluate our proposed approach for a generic Gaussian data acquisition model. In addition, defect detection by SFCW radar from compressive measurements is considered. To compare our approach, we consider the standard convex approach (i.e., nuclear norm and ℓ1-norm minimization) and the untrained ADMM-based iterative algorithm for different compression ratios.
In both the generic Gaussian data acquisition model and SFCW-based defect detection, our numerical results show that the proposed approach outperforms the conventional approaches in terms of the mean squared errors of the recovered low-rank and sparse components and the speed of convergence. In the context of algorithm unrolling for RPCA, we compare our approach with the approach given in [30] (CORONA). It turns out that our proposed approach shows performance similar to CORONA for the experimental ultrasound imaging data used in [30], and our approach outperforms CORONA for generic Gaussian data. It is worth noticing that the experimental ultrasound data have a row-sparse nature; that is the reason CORONA uses mixed ℓ1,2-norm minimization to estimate the sparse matrix S. Our approach is generic, yet it is able to achieve results similar to CORONA through learning. This shows the applicability of our approach to different types of use cases and data (defect detection, ultrasound imaging, generic Gaussian data). We numerically analyze the robustness of our proposed approach for the generic Gaussian data acquisition model. Here, we consider deviations in the measurement matrices and testing signal-to-noise ratio (SNR) uncertainty. It was observed that the proposed approach is robust to small deviations in the measurement matrices. Further, it was observed that training with an SNR of about 5 dB is favorable when the SNR of the testing data is unknown.

1.1. Contribution

The remainder of the paper is organized as follows. We introduce the SFCW radar-based defect detection and the low-rank plus sparse recovery with reweighting in Section 2. In Section 3, we discuss the DNN-based low-rank plus sparse recovery algorithm unfolding. In Section 4, we present an evaluation of the proposed DNN-based low-rank plus sparse recovery algorithm unfolding approaches and provide interesting insights. Section 5 concludes the paper.

1.2. Notation

In this paper, the following notation is used. Vectors are denoted by boldface lower-case letters, while matrices are denoted by boldface upper-case letters. The ℓ0-norm (the number of nonzero components), the ℓ1-norm (absolute sum of elements) of a matrix/vector, and the nuclear norm of a matrix (sum of singular values) are denoted by ‖·‖0, ‖·‖1, and ‖·‖*, respectively. Further, the Frobenius norm of a matrix and the ℓ2-norm of a vector are denoted by ‖·‖F and ‖·‖2, respectively. The Hermitian and the transpose of a matrix are represented by (·)^H and (·)^T, respectively. In addition, the Moore–Penrose pseudo-inverse is denoted by (·)^†. Matrices with all elements equal to zero and one are denoted by 0 and 1, respectively, with their size indicated as a subscript. Moreover, vectors of size M with all elements equal to zero and one are denoted analogously. In addition, the identity matrix is denoted by I. The main variable list and abbreviations used in this manuscript are listed at the end of the manuscript.

2. System Model

First, we briefly present the system model of the mono-static SFCW radar-based defect detection. Next, we discuss the ADMM-based iterative algorithm for the low-rank plus sparse recovery.

2.1. SFCW Radar Based Defect Detection

We consider an SFCW radar with M transceivers which are placed in parallel to the single-layered material structure while maintaining an equal distance between transceivers, as shown in Figure 1. In SFCW radar, each transceiver transmits a stepped-frequency signal containing N frequencies which are equally spaced over a bandwidth of B Hz. To this end, the received signal corresponding to all M transceivers and N frequencies is given by
Figure 1

Acquiring measurements of a single-layered material structure using an SFCW radar with M transceivers. The received signal consists of two main components, the reflection of the layered material structure and the reflection of the defects, where the former is the main clutter source. Here, defects are shown as red circles.

Note that the received signal consists of two main components, the reflection of the layered material structure and the reflection of the defects, plus an additive Gaussian noise matrix. Next, we discuss in detail the modeling of the received signal of the defects by using the propagation time delay. To this end, the scene shown in Figure 1 is virtually partitioned into a rectangular grid of size Q. Suppose that the round-trip travel time of the signal from the m-th antenna location to the p-th defect and back is given. Then, the received signal of the defects at the m-th transceiver corresponding to the n-th frequency is given by [1]. Here, the complex reflectivity coefficient of the p-th defect enters, and P is the total number of defects. The defect contribution can thus be written in terms of a vector that contains the reflectivity values of all grid locations; since there are P defects, this vector contains only P non-zero entries. The entries of the corresponding dictionary matrix are determined by the propagation time delays between the m-th antenna and the q-th grid location. We assume that the propagation time delays of the defects are exactly matched with the propagation time delays of the grid locations. If this condition is not satisfied, it is known as grid mismatch. Grid mismatch degrades the performance of sparse signal estimation [42]. Several approaches have been proposed to rectify this problem, e.g., a Bayesian learning-based approach [43], iterative dictionary updates [3], and many more. Similar to the received signal of the defects, the received signal of the layered material structure at the m-th transceiver corresponding to the n-th frequency is given by [1]. Here, the complex reflectivity of the layered material structure enters, together with the propagation loss and the propagation delay of each return of the layered material structure and the number of internal reflections within the layered material.
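
The following Python sketch illustrates how a defect dictionary of the kind described above could be assembled from propagation delays. The geometry, the variable names (A_dict, tau), and the free-space delay model (which ignores refraction inside the layered medium) are simplifying assumptions for illustration only.

import numpy as np

# Illustrative construction of the defect dictionary for an SFCW radar:
# entry (m*N + n, q) ~ exp(-j*2*pi*f_n*tau_{m,q}), where tau_{m,q} is the
# round-trip delay from antenna m to grid point q (free-space propagation assumed).
c = 3e8
M, N, Q = 30, 30, 100                      # antennas, frequencies, grid points
f = 300e9 + np.linspace(0, 5e9, N)         # stepped frequencies over a 5 GHz bandwidth
ant_pos = np.stack([np.linspace(0, 0.15, M), np.zeros(M)], axis=1)
grid = np.stack([np.tile(np.linspace(0, 0.15, 10), 10),
                 np.repeat(np.linspace(0.3, 0.45, 10), 10)], axis=1)

tau = 2 * np.linalg.norm(ant_pos[:, None, :] - grid[None, :, :], axis=2) / c  # (M, Q)
A_dict = np.exp(-1j * 2 * np.pi * f[None, :, None] * tau[:, None, :])         # (M, N, Q)
A_dict = A_dict.reshape(M * N, Q)          # one row per antenna/frequency pair

x = np.zeros(Q, dtype=complex)             # sparse defect reflectivities on the grid
x[[12, 57, 83]] = 1.0 + 0.3j
y_defects = A_dict @ x                     # vectorized defect contribution
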

2.2. Compressed Sensing (CS) Approach

In the compressed sensing (CS) setup, only a subset of antennas/frequencies is available or selected. Now, the reduced data vector of size K is obtained by applying a selection matrix to the full measurement vector. The selection matrix has a single non-zero element of value one in each row, indicating the selected frequency of a particular antenna if that antenna is selected. Here, our main objective is to recover the structure and defect contributions from the reduced data vector using the low-rank plus sparse recovery approach, as detailed below.
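
A minimal sketch of such a selection matrix, assuming a random choice of K antenna/frequency pairs (the names Q_sel and K are illustrative):

import numpy as np

# Each row of the selection matrix contains a single one, picking one
# (antenna, frequency) sample out of the M*N available samples.
rng = np.random.default_rng(1)
M, N = 30, 30
K = (M * N) // 2                                  # 50% compression ratio K/(M*N)

selected = rng.choice(M * N, size=K, replace=False)
Q_sel = np.zeros((K, M * N))
Q_sel[np.arange(K), selected] = 1.0               # one non-zero entry per row

# Reduced data vector from a full (vectorized) measurement y_full:
y_full = rng.standard_normal(M * N)
y_reduced = Q_sel @ y_full                        # size K
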

2.3. Low-Rank Plus Sparse Recovery Algorithm

From now on, we consider the general data acquisition model given in (1) in Section 1. Note that the SFCW radar model given in (7) is mapped to this generic measurement model by identifying the corresponding signal components and measurement operators. Our objective is to recover the low-rank matrix L and the sparse matrix S from the compressive measurements. Thus, L and S are estimated by minimizing the rank and the sparsity (ℓ0-norm). Note that rank and ℓ0-norm minimization problems are usually NP-hard. Thus, one may instead use convex relaxations based on the nuclear norm and the ℓ1-norm of a matrix as follows: The resulting convex problems, i.e., ℓ1-norm and nuclear norm minimization, are well studied in the literature, and there are several non-convex approaches to improve over the standard convex relaxation. One well-known approach is iterative reweighting of the ℓ1-norm [23,32,44] and the nuclear norm [22,45,46,47]. The alternating direction method of multipliers (ADMM) is used to solve the problem given in (8). First, we formulate the problem given in (8) based on the ADMM approach, and then we introduce the non-convex double-reweighted approach, i.e., both the reweighted ℓ1-norm and the reweighted nuclear norm simultaneously. Let the estimates of L and S at the t-th iteration be denoted accordingly. Now, based on the ADMM, L and S are estimated by the sub-problems (9) and (10). Here, auxiliary variables and a penalty factor are introduced, and the nuclear norm of L is given by the sum of its singular values. Now, we introduce the weighted ℓ1-norm and the weighted nuclear norm into the sub-problems given in (9) and (10) as follows: The operator ⊙ denotes element-wise multiplication. Here, non-negative weight vectors are used in the t-th iteration. To this end, the weights are calculated based on the previous estimates of L and S through decay functions, applied component-wise. Several decay functions have been proposed in the literature, and an overview for the nuclear norm is given in [47]. In this work, motivated by [32], we consider element-wise (adaptive) soft-thresholding as the proximal operator of the weighted ℓ1-norm. In addition, inspired by [48], element-wise (adaptive) singular value soft-thresholding (i.e., element-wise soft-thresholding on the singular values of a matrix) is used as the proximal operator of the weighted nuclear norm. The resulting updates are given in (15) and (16), where the element-wise singular value soft-thresholding and element-wise soft-thresholding operators [32,48] are applied, respectively. Note that a linear operator is used to back-project the measurement vector into the target matrix subspace. There are two options for this operator: (a) the Hermitian transpose, as done in [32], or (b) the Moore–Penrose pseudo-inverse, as done in [1]. Next, we discuss the element-wise (adaptive) soft-thresholding and the element-wise (adaptive) singular value soft-thresholding.
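
For illustration, the two proximal operators used in these updates can be sketched in Python as follows; this is a real-valued sketch only, whereas the paper also handles complex-valued data.

import numpy as np

def soft_threshold(X, thresh):
    # Element-wise soft-thresholding: proximal operator of the (weighted) l1-norm.
    return np.sign(X) * np.maximum(np.abs(X) - thresh, 0.0)

def singular_value_soft_threshold(X, thresh):
    # Soft-thresholding applied to the singular values: proximal operator of the
    # (weighted) nuclear norm. `thresh` may be a scalar or per-singular-value vector.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - thresh, 0.0)
    return (U * s_shrunk) @ Vh

# Example: shrinking a noisy low-rank matrix reduces its effective rank.
rng = np.random.default_rng(2)
L_noisy = (rng.standard_normal((30, 3)) @ rng.standard_normal((3, 30))
           + 0.05 * rng.standard_normal((30, 30)))
print(np.linalg.matrix_rank(singular_value_soft_threshold(L_noisy, 1.0)))
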

2.4. Element-Wise Soft-Thresholding and Singular Value Soft-Thresholding

In (16), the threshold vector contains the element-wise thresholds for the t-th iteration. These thresholds are derived based on the previous estimate of the sparse matrix S: a positive constant (the soft-thresholding parameter) is combined with the entries of the t-th estimate of S through the decay function. The same concept is also applied to the singular value soft-thresholding used in (15), as discussed next. In this work, we consider the same decay function for both sparsity and rank. In (15), the threshold vector contains the different thresholds calculated from the singular values of the previous estimate of L, as given below: here, each threshold is obtained from the corresponding singular value of the previous estimate and a positive constant (the singular-value soft-thresholding parameter). For completeness, definitions of the element-wise soft-thresholding and singular value soft-thresholding are given in Appendix A.1. Our objective is to tune the parameters in (17) and (18) by using a deep neural network, as discussed next.
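
The adaptive thresholds can be sketched as follows. The two decay functions shown, a log-heuristic (threshold inversely proportional to the magnitude of the previous estimate) and an exponential heuristic, are assumed common forms and not necessarily the exact expressions in (17) and (18).

import numpy as np

def thresholds_log(prev_estimate, lam, eps=1e-3):
    # Log-heuristic reweighting: large entries receive a small threshold.
    return lam / (np.abs(prev_estimate) + eps)

def thresholds_exp(prev_estimate, lam, sigma=1.0):
    # Exponential heuristic: thresholds decay exponentially with the magnitude.
    return lam * np.exp(-np.abs(prev_estimate) / sigma)

# Element-wise thresholds for the sparse component and per-singular-value
# thresholds for the low-rank component, both derived from previous estimates.
prev_S = np.array([[0.0, 2.0], [-0.1, 0.0]])
prev_L = np.array([[1.0, 0.5], [0.5, 1.0]])
tau_S = thresholds_log(prev_S, lam=0.1)
tau_L = thresholds_log(np.linalg.svd(prev_L, compute_uv=False), lam=0.1)
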

3. Unfolding ADMM-Based Low-Rank Plus Sparse Recovery Algorithm

In this section, we discuss the ADMM algorithm unfolding using a dense DNN. The iterative algorithm given in Algorithm 1 utilizes the ADMM steps given in (15), (16), and (11), and previous estimates are used in the next iteration. Thus, this kind of iterative algorithm can be considered as a recurrent neural network. The t-th iteration of the iterative Algorithm 1 is modeled as the t-th layer of the deep neural network, as shown in Figure 2. Each matrix multiplication given in the ADMM steps (15), (16), and (11) is implemented as a linear layer without bias. Here, our main objective is to learn the per-layer weights of the network and the thresholding parameters given in (17) and (18) from training data. To this end, the t-th layer of the neural network is represented by the following equations:
Figure 2

Block diagram of the t-th layer of the DNN which mimics the low-rank plus sparse recovery Algorithm 1. Weights of the linear layers () and other parameters (, , , ) are learned from training data.

Here, the weights of the t-th layer are as shown in Figure 2, and their initial values are chosen to mimic the ADMM Algorithm 1. Further, the thresholding vectors of the t-th layer are given in (15) and (16). Note that they depend on the previous estimates and on two thresholding parameters per layer. Here, we consider the weights of the linear layers to be tied over all the layers, i.e., shared weights. However, we do not consider the thresholding parameters to be tied over all layers, i.e., each layer has its own thresholding parameters. To this end, the set of learnable parameters comprises the shared linear-layer weights and the per-layer thresholding parameters.
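
A simplified, real-valued PyTorch sketch of one such unfolded layer is given below. The linear maps W1 to W4 and the exact update form are illustrative stand-ins for the steps (15), (16), and (11), which additionally involve auxiliary variables; only the idea of tied linear weights plus learnable per-layer thresholds mu and nu corresponds directly to the description above.

import torch
import torch.nn as nn

def svt(X, thresh):
    # Singular value soft-thresholding (proximal operator of the nuclear norm).
    U, s, Vh = torch.linalg.svd(X, full_matrices=False)
    return U @ torch.diag(torch.clamp(s - thresh, min=0.0)) @ Vh

class UnfoldedLayer(nn.Module):
    # One unfolded layer: learned linear maps followed by singular value
    # soft-thresholding (low-rank branch) and element-wise soft-thresholding
    # (sparse branch). mu and nu are the learnable per-layer thresholds.
    def __init__(self, K, M, N):
        super().__init__()
        self.W1 = nn.Linear(K, M * N, bias=False)
        self.W2 = nn.Linear(M * N, M * N, bias=False)
        self.W3 = nn.Linear(K, M * N, bias=False)
        self.W4 = nn.Linear(M * N, M * N, bias=False)
        self.mu = nn.Parameter(torch.tensor(0.1))    # singular-value threshold
        self.nu = nn.Parameter(torch.tensor(0.1))    # element-wise threshold
        self.M, self.N = M, N

    def forward(self, y, L, S):
        L_in = (self.W1(y) + self.W2((L - S).reshape(-1))).reshape(self.M, self.N)
        S_in = (self.W3(y) + self.W4((S - L).reshape(-1))).reshape(self.M, self.N)
        L_new = svt(L_in, self.mu)
        S_new = torch.sign(S_in) * torch.clamp(S_in.abs() - self.nu, min=0.0)
        return L_new, S_new

# Usage sketch: stack T such layers and feed the measurement vector through them.
layer = UnfoldedLayer(K=450, M=30, N=30)
L0, S0 = torch.zeros(30, 30), torch.zeros(30, 30)
L1, S1 = layer(torch.randn(450), L0, S0)
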

3.1. Training Phase

In the training phase, the DNN is trained in a supervised manner. Here, the DNN learns the parameters in the learnable parameter set. Suppose that the DNN has T layers; then the outputs of the DNN in the training phase for the i-th sample are the estimated low-rank and sparse matrices. Note that, in the training phase, the DNN minimizes the normalized mean squared error, computed with respect to the i-th ground-truth low-rank and sparse matrices and averaged over the training samples. In this work, in the context of DNN-based parameter tuning, we consider three versions of the ADMM-based iterative algorithm to solve the RPCA problem: (a) parameter tuning with non-adaptive thresholding, named ADMM-based trained RPCA with thresholding (TRPCA-T). For the parameter tuning with adaptive thresholding, we consider two versions based on the two decay functions described in Section 2.4. These two approaches are named as follows: (b) ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)); (c) ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)). Among the above versions, we propose the parameter tuning with adaptive thresholding approaches, TRPCA-AT(log) and TRPCA-AT(exp), to solve the RPCA problem with the compressive sensing data acquisition model. For comparison with our proposed approaches, we consider two baselines. In the first, we consider the untrained ADMM approach to solve the convex low-rank plus sparse recovery, as given in Algorithm 1 with non-adaptive thresholding. This method is named ADMM-based untrained RPCA with thresholding (URPCA-T). As the second, we consider the low-rank plus sparse recovery problem given in (8). This method is named low-rank plus sparse recovery with convex relaxation (LRPSRC).
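
A hedged sketch of the supervised training objective and loop follows; the network class and data-loader names are placeholders for whatever unfolded model and dataset are used.

import torch

def nmse_loss(L_hat, S_hat, L_true, S_true):
    # Normalized mean squared error over both recovered components.
    return (torch.norm(L_hat - L_true) ** 2 / torch.norm(L_true) ** 2 +
            torch.norm(S_hat - S_true) ** 2 / torch.norm(S_true) ** 2)

# Sketch of the training loop (model and train_loader are assumed to exist):
# model = UnfoldedNetwork(...)                       # T stacked unfolded layers
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# for epoch in range(num_epochs):
#     for y, L_true, S_true in train_loader:
#         L_hat, S_hat = model(y)
#         loss = nmse_loss(L_hat, S_hat, L_true, S_true)
#         optimizer.zero_grad(); loss.backward(); optimizer.step()
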

3.2. Computation Complexity

In this subsection, the computational complexity of the proposed DNN is briefly discussed; a detailed breakdown is given in Appendix A.2. The training complexity of the DNN is the sum of the feed-forward and back-propagation complexities. For a given number of training samples and number of epochs, the training computational complexity of the DNN with T layers is derived in Appendix A.2. The testing computational complexity is the feed-forward propagation complexity of the data through the DNN and scales with the number of testing samples. Here, O(·) denotes the Big O notation used for the asymptotic computational complexity analysis [49].

4. Results and Discussion

In this section, numerical results are presented. First, the performance of the deep learning-based trained ADMM adaptive thresholding is evaluated with a generic real-valued Gaussian model, and next, the complex-valued SFCW radar model given in Section 2.1 is used.

4.1. Generic Gaussian Model

In this subsection, our proposed approach is evaluated using generic Gaussian data. The organization of this subsection is as follows. First, the performance of the proposed approach is compared with state-of-the-art approaches for 50% and 25% compression ratios. Second, the Cramér–Rao bound (CRB) of unbiased estimation of low-rank and sparse matrices is used to evaluate the proposed approach. Third, to investigate the robustness of the proposed approach, two scenarios are considered: (a) testing SNR uncertainty and (b) deviations in the measurement matrices between training and testing. Fourth, the performance comparison between ADMM- and FISTA-based approaches for RPCA is evaluated. Here, the approach given in [30] (CORONA) is used as the unfolded FISTA-based approach for RPCA. In the generic Gaussian model, the elements of the measurement matrices are generated once from an i.i.d. Gaussian with zero mean and unit variance. In this work, training and testing data are synthetically generated based on the system model given in (1). Therefore, ground-truth low-rank and sparse matrices are available in the training phase. In case only the received data vector in (1) is available, Algorithm 1 or LRPSRC given in (8) can in general be used to generate low-rank and sparse matrices for the training phase. Let the received signal, noise vector, and low-rank and sparse matrices for the i-th data sample as given in (1) be denoted accordingly. We generate a low-rank matrix with rank r as the product of two factor matrices, where the elements of the factor matrices and the non-zero entries of the sparse matrix are generated independently from an i.i.d. Gaussian with zero mean and unit variance. A fixed number of non-zero locations of each sparse matrix is selected uniformly at random. We normalize the low-rank and sparse matrices to have unit Frobenius norm. For better readability, we introduce a parameter set P containing the number of training samples, the number of testing samples, the number of epochs, the rank of the low-rank matrix, the number of non-zero elements of the sparse matrix, the SNR of the training data, and the SNR of the testing data. The signal-to-noise ratio (SNR) of the i-th data sample is defined with respect to the noise-free measurement. First, we generate a Gaussian noise vector and then re-normalize it to reach a given target SNR; we set the same SNR for all samples. In the training stage, we set different learning rates for the weights of the linear layers and for the other parameters in the learnable set. The main reason for setting different learning rates is to reduce over-fitting to the training data. Generally, many training samples are required to train a deep neural network. However, due to the specific architecture of the iterative algorithm, we are able to train the DNN with a small data set. In the training phase, the adaptive moment estimation (Adam) optimizer [50] is used to train the DNN. Here, we initialize the weights to mimic the ADMM Algorithm 1. In the inference phase, to evaluate the performance of the DNN, the normalized average root mean squared error (NRMSE) is used for the low-rank and the sparse matrix. The outputs of a DNN with T layers for the i-th testing sample are the estimated low-rank and sparse matrices. The CRB given in (A4) is based on the combined recovery error of both low-rank and sparse matrices; accordingly, the combined average mean squared error and the combined average normalized root mean squared error for both low-rank and sparse matrices are also used.
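
The data generation and the NRMSE metric described above can be sketched as follows; the matrix sizes, rank, and sparsity level are example values rather than the exact settings in the parameter set P.

import numpy as np

rng = np.random.default_rng(3)
M, N, r, n_nonzero = 30, 30, 3, 20

def make_sample():
    # Rank-r low-rank matrix and sparse matrix, normalized to unit Frobenius norm.
    L = rng.standard_normal((M, r)) @ rng.standard_normal((r, N))
    S = np.zeros((M, N))
    S.flat[rng.choice(M * N, n_nonzero, replace=False)] = rng.standard_normal(n_nonzero)
    return L / np.linalg.norm(L, "fro"), S / np.linalg.norm(S, "fro")

def add_noise_for_snr(y, snr_db):
    # Scale a Gaussian noise vector so that the sample reaches the target SNR.
    n = rng.standard_normal(y.shape)
    n *= np.linalg.norm(y) / (np.linalg.norm(n) * 10 ** (snr_db / 20))
    return y + n

def nrmse(X_true, X_hat):
    # Per-sample normalized recovery error, averaged over the test set in the paper.
    return np.linalg.norm(X_true - X_hat, "fro") / np.linalg.norm(X_true, "fro")
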
Both Algorithm 1 and LRPSRC given in (8) are implemented using Matlab [51], and LRPSRC is solved using the CVX package [52]. Notice that, in the LRPSRC, the regularization parameters are set to 1 and to the value suggested in [17], respectively. Note that, for Algorithm 1, there is no specific rule to select the regularization and penalty parameters; thus, they are manually tuned based on the data. When the measurement operator is an identity matrix, there is a specific rule for this selection [17]. Note that, as a rule of thumb, the thresholding parameters in (17) and (18) are initialized following [17]. The ADMM penalty factor has an important impact on the convergence of Algorithm 1. Usually, as the penalty factor increases, the algorithm converges faster. However, it cannot be arbitrarily large, as the algorithm may overshoot; it should be neither too large nor too small. However, finding an optimal value for the penalty factor is an open problem, and it depends on the application/data. A rule of thumb for setting it is given in [27]. In this work, we set it to a manually tuned value, as we observed that the value suggested in [27] is not optimal for our data. Note that, unless otherwise stated, the aforementioned parameter settings are used for all simulations with Gaussian data throughout this paper. The PyTorch package was used to implement the DNN [53]. First, we analyze the performance of the proposed approach for different compression ratios with respect to the number of layers of the DNN. For this simulation, the parameter set P is fixed. Here, the DNN only learns the thresholding-related parameters instead of all the parameters in the learnable set. This is due to the fact that the performance improvement obtained by learning all the parameters is very small compared to learning only the thresholding-related parameters. The average normalized RMSEs for different numbers of layers of the DNN are shown in Figure 3 and Figure 4 for compression ratios of 50% and 25%, respectively. Figure 3 and Figure 4 show that the proposed DNN-based thresholding (TRPCA-AT(log) and TRPCA-AT(exp)) outperforms URPCA-T and LRPSRC. Further, it is observed that, as the number of layers increases, the average NRMSE decreases. For the 50% compression ratio, the average NRMSE does not vary much after ten layers. However, for 25%, this is not the case. This is due to the fact that, as the compression increases, recovery of the low-rank and sparse matrices becomes more challenging.
Figure 3

Average recovery error of low-rank (a) and sparsity (b) contributions for a compression ratio of 50% for ADMM-based trained RPCA with thresholding (TRPCA-T), the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)), low-rank plus sparse recovery with convex relaxation (LRPSRC), and ADMM-based untrained RPCA with thresholding (URPCA-T).

Figure 4

Average recovery error of low-rank (a) and sparsity (b) contributions for a compression ratio of 25% for ADMM-based trained RPCA with thresholding (TRPCA-T), the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)), low-rank plus sparse recovery with convex relaxation (LRPSRC), and ADMM-based untrained RPCA with thresholding (URPCA-T).

Further, TRPCA-AT outperforms TRPCA-T. This performance improvement is mainly due to the iterative reweighting of the ℓ1-norm and nuclear norm minimization. In addition, the improvement of iterative reweighting over the unweighted approach becomes more visible as the compression increases (i.e., as the problem gets more challenging). As an example, for the 50% compression ratio, the average NRMSE improvements between TRPCA-T with twenty layers and TRPCA-AT(exp) with twenty layers for the low-rank and sparse components are modest, whereas for the 25% compression ratio these improvements are larger. Further, we observe slight performance gains as the decay function is changed from the log-determinant heuristic to the exponential heuristic. Next, we analyze the convergence speed of the proposed TRPCA-AT(exp) and TRPCA-AT(log) compared with URPCA-T. For the 50% compression ratio, TRPCA-AT with ten layers outperforms URPCA-T with 150 iterations. Therefore, in the testing phase (inference phase), our proposed approaches (TRPCA-AT(exp) and TRPCA-AT(log)) are fifteen times faster than the conventional untrained approach URPCA-T. Moreover, for the 25% compression ratio, TRPCA-AT with twenty layers outperforms URPCA-T with 150 iterations. Thus, our approach is 7.5 times faster than the untrained approach URPCA-T. It is worth noticing that one layer of the DNN of our proposed approach is equivalent to one iteration of the conventional untrained approach URPCA-T. Therefore, the proposed approaches (TRPCA-AT(exp) and TRPCA-AT(log)) achieve lower NRMSE than the untrained approach URPCA-T with a much lower number of iterations. In Table 1, the NRMSEs of the recovered low-rank and sparse matrices with the corresponding numbers of iterations are listed for comparison.
Table 1

Comparison of convergence speeds for ADMM-based trained RPCA with thresholding (TRPCA-T), the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)), and ADMM-based untrained RPCA with thresholding (URPCA-T). The proposed approaches TRPCA-AT(log) and TRPCA-AT(exp) are 15 and 7.5 times faster than URPCA-T for compression ratios of 50% and 25%, respectively.

Method          Compression Ratio (K/MN)   Number of Iterations   NRMSE = (1/Ns) Σ_i ‖Xi − X̂i‖_F / ‖Xi‖_F
                                                                   Low-Rank Matrix L    Sparse Matrix S
TRPCA-AT(log)   50%                        10                     8.36×10⁻²            2.78×10⁻²
TRPCA-AT(exp)   50%                        10                     8.20×10⁻²            2.66×10⁻²
TRPCA-T         50%                        10                     8.99×10⁻²            3.69×10⁻²
URPCA-T         50%                        150                    9.14×10⁻²            4.72×10⁻²
TRPCA-AT(log)   25%                        20                     1.81×10⁻¹            9.85×10⁻²
TRPCA-AT(exp)   25%                        20                     1.57×10⁻¹            9.16×10⁻²
TRPCA-T         25%                        20                     2.35×10⁻¹            1.38×10⁻¹
URPCA-T         25%                        150                    2.33×10⁻¹            1.61×10⁻¹
To further demonstrate the advantage of the non-convex iterative reweighting of the ℓ1-norm and nuclear norm minimization, histograms of the non-zero singular values of the low-rank matrix and the non-zero elements of the sparse matrix are shown in Figure 5 for the DNN with 20 layers. These histograms correspond to the simulation given in Figure 3, i.e., 1200 testing samples and a compression ratio of 50%. Based on Figure 5, for the sparse matrix S, the proposed non-convex iterative reweighted approaches (TRPCA-AT(exp) and TRPCA-AT(log)) closely follow the histogram of the true sparse matrix. Moreover, for a given value range, the number of occurrences in the sparse matrix recovered by the unweighted approach TRPCA-T is less than the true number of occurrences in that value range, as shown in Figure 5a. However, this is not the case for the non-convex iterative reweighted approaches (TRPCA-AT(exp) and TRPCA-AT(log)). These results validate that the important features preserved by the large coefficients are well recovered by the iterative reweighted approaches. This is the reason for the performance improvement of the iterative reweighted approaches compared to the unweighted approach TRPCA-T. In addition, the sparse matrices recovered by the unweighted approach TRPCA-T contain many small values compared to the iterative reweighted approaches. This indicates that the iterative reweighted approaches achieve a sparser solution than the unweighted approach.
Figure 5

Histograms of the non-zero singular values of the low-rank matrix (top) and the non-zero elements of the sparse matrix (bottom). (a) TRPCA-T, (b) TRPCA-AT(log) (proposed), and (c) TRPCA-AT(exp) (proposed). Note that, in the figure, true histograms are shown in red and the recovered histograms are shown in black. It is noticeable that the proposed non-convex iterative reweighted approaches (TRPCA-AT(exp) and TRPCA-AT(log)) closely follow the histograms of the true non-zero elements of the sparse matrix and the non-zero singular values of the low-rank matrix, compared to the unweighted approach TRPCA-T. In addition, the sparse matrix recovered by the unweighted approach TRPCA-T contains many small values compared to the iterative reweighted approaches, i.e., the iterative reweighted approaches achieve a sparser solution than the unweighted approach.

As seen in Figure 5, the histograms of the non-zero singular values of the low-rank matrix obtained by the proposed non-convex iterative reweighted approaches are less spread out than the histogram of the unweighted approach TRPCA-T. This also validates the aforementioned argument that important features preserved by the large coefficients are well recovered by the iterative reweighted approaches TRPCA-AT(exp) and TRPCA-AT(log). Note that, in the histograms, the number of occurrences of the zero value is not shown. This is due to the fact that the number of occurrences of the zero value is much larger than the occurrences of other values. Figure 5 shows the histograms corresponding to the 50% compression ratio; for the 25% compression ratio, similar results were observed.

4.1.1. Cramér–Rao Bound (CRB) Analysis

To further evaluate the performance of the proposed approach, the Cramér–Rao bound (CRB) of unbiased estimation of low-rank and sparse matrices given in [18] (Equation (A4)) is used. For completeness, the CRB and recovery guarantees of the RPCA are given in Appendix A.3. Note that, in [18], the measurement matrices are assumed to be selection operators. Therefore, to have a fair comparison, we first consider that both measurement matrices are identity matrices, i.e., the standard RPCA problem. Now, the data acquisition model (Equation (1)) simplifies such that the received signal matrix and the noise matrix have the same size as the low-rank and sparse matrices. For this simulation, the parameter set P is fixed, and we set the number of layers of the DNN to 10. Figure 6 shows the CRB and the average MSE of the combined low-rank and sparse matrices for 1200 testing samples for different SNR levels up to 20 dB in steps of 5 dB. Here, we consider the same SNR in both training and testing. As an example, if the testing SNR is 20 dB, then the training SNR is 20 dB. As per Figure 6, it can be seen that the non-convex approach TRPCA-AT has the best performance compared to the other approaches in the higher SNR regime. Note that, here, the performance gap between TRPCA-AT and TRPCA-T is small. This is due to the fact that, as observed in Figure 3 and Figure 4, when the compression decreases, the gain achieved by the non-convex approaches decreases.
Figure 6

Average combined recovery error of low-rank and sparse matrices as given in (25) and Cramér-Rao bounds as given in (A4) for compression ratio  for ADMM-based trained RPCA with thresholding (TRPCA-T), proposed ADMM-based trained RPCA with adaptive thresholding based on logarithm heuristic (TRPCA-AT(log)), proposed ADMM-based trained RPCA with adaptive thresholding based on exponential heuristic (TRPCA-AT(exp)), low-rank plus sparse recovery with convex relaxation (LRPSRC), and ADMM-based untrained RPCA with thresholding (URPCA-T).

Next, we compare the results shown in Figure 3 and Figure 4 with the CRB given in [18] (Equation (A4)). In [18], the measurement matrix is assumed to be a selection operator which selects a random subset of size K of the entries. Since this is the closest matching CRB to our model given in (1), we consider this formulation as a benchmark. Further, we consider the selection to be fixed over all testing samples. Figure 7 shows the CRB of the combined low-rank and sparse matrices for the compression ratios of 50% and 25%. It can be seen that the non-convex approach TRPCA-AT has the performance closest to the CRB. As the compression increases, the estimation of low-rank and sparse matrices from compressive measurements becomes more challenging. This can be seen from the increase of the CRB as the compression ratio changes from 50% to 25%.
Figure 7

Average combined recovery error of low-rank and sparse matrices as given in (25) and Cramér-Rao bounds as given in (A4) for compression ratio (a) and (b) for ADMM-based trained RPCA with thresholding (TRPCA-T), proposed ADMM-based trained RPCA with adaptive thresholding based on logarithm heuristic (TRPCA-AT(log)), proposed ADMM-based trained RPCA with adaptive thresholding based on exponential heuristic (TRPCA-AT(exp)), low-rank plus sparse recovery with convex relaxation (LRPSRC), and ADMM-based untrained RPCA with thresholding (URPCA-T).

4.1.2. Robustness of the Proposed Approach

We consider two scenarios to analyze the robustness of the proposed trained ADMM adaptive thresholding approaches TRPCA-AT and TRPCA-T. First, motivated by [54], we analyze the performance with respect to testing SNR uncertainty, i.e., the SNRs of the training and testing phases are different. Second, we analyze the effect of deviations in the measurement matrices in (1) between training and testing. To evaluate the effect of testing SNR uncertainty, the training SNR is varied up to 20 dB with a step size of 5 dB. In addition, the testing SNR is varied up to 20 dB with a step size of 5 dB. For this simulation, the parameter set P is fixed. The cumulative average error (as defined in (26)) over all testing SNRs for each training SNR is shown in Figure 8. Here, we set the number of layers of the DNN to 10 and trained the DNN for 20 epochs using the Adam optimizer. Based on the results shown in Figure 8, we observe that, for all three approaches (TRPCA-AT(log), TRPCA-AT(exp), TRPCA-T), the cumulative average first decreases as the training SNR increases up to a point and then increases again as the training SNR increases further. Hence, these results show the importance of knowing the testing SNR, and, as a simple rule, the training SNR should be the same as the testing SNR to achieve the best performance. On the other hand, training with an SNR of about 5 dB is favorable in the presence of uncertainty about the testing SNR.
Figure 8

Average combined recovery error of low-rank and sparse matrices for a fixed compression ratio for training at a single SNR and testing with different SNRs for (a) ADMM-based trained RPCA with thresholding (TRPCA-T), (b) the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), and (c) the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)). In the presence of uncertainty about the testing SNR, training with an SNR of about 5 dB is favorable.

Next, we evaluate the performance of the proposed approaches when the measurement matrices in (1) differ between training and testing. For simplicity, we assume that both measurement matrices are identical. In the training phase, the nominal measurement matrix is used, while in the testing phase a perturbed measurement matrix (the nominal matrix plus an error term) is used. To quantify the effect of the error, a measurement-matrix error level expressed in dB is used as a metric, and we evaluate the performance of the proposed approaches while changing this level from 0 dB to 20 dB in steps of 5 dB. For this simulation, the parameter set P is fixed; note that the testing SNR varies as the error level changes, and this is indicated accordingly in P. Figure 9 shows the average NRMSEs of the combined low-rank and sparse matrices. In Figure 9, solid lines represent the NRMSEs for the model with measurement matrix error; we also include the prefix “-E” in the legend of the figure to indicate it. The dotted lines show the NRMSEs without error in the measurement matrix. Based on the results shown in Figure 9, the proposed approaches are robust to smaller deviations (e.g., 15 dB). However, for higher deviations, the proposed approaches are not robust enough, and additional measures are required to rectify the matrix deviation. As a countermeasure, we assume that the model error distribution is also available during training. Here, both the training and testing data for the i-th sample are generated with perturbed measurement matrices, where the error matrices for each training and testing sample are generated independently from an i.i.d. Gaussian with zero mean and unit variance. For comparison, we include results for training without the error distribution, i.e., the training data are generated with the nominal measurement matrix while the testing data remain perturbed; this result is shown as solid lines in Figure 10. As shown in Figure 10, when the model error distribution is included in training (dotted lines in Figure 10), the NRMSEs improve over training without the error distribution when the error level is in the range of 0 dB to 15 dB. Moreover, as the error level increases, i.e., the deviation decreases, training without the error distribution provides results similar to training with the error distribution. In conclusion, when there is a high deviation in the measurement matrix, a robust training approach, i.e., training with the error distribution, provides an advantage.
Figure 9

Average combined recovery error of low-rank and sparse matrices for two compression ratios, (a) and (b), with different model error levels for ADMM-based trained RPCA with thresholding (TRPCA-T), the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), and the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)). Here, model error means that the training and testing samples are generated with different measurement matrices: training uses the nominal matrix, while testing uses a perturbed matrix. The results with model error are represented by solid lines, whereas dotted lines indicate the results without model error.

Figure 10

Average combined recovery error of low-rank and sparse matrices for a fixed compression ratio with different model error levels for ADMM-based trained RPCA with thresholding (TRPCA-T), the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), and the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)). Here, model error means that the training and testing samples are generated with different measurement matrices: training uses the nominal matrix, while testing uses a perturbed matrix. The results with model error are represented by solid lines, whereas dotted lines indicate the results when the model error distribution is included in training.
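
The measurement-matrix perturbation experiment can be sketched as follows, assuming (as in the sketch, not necessarily in the paper's exact definition) that the deviation level is the ratio of the matrix energy to the error energy expressed in dB.

import numpy as np

rng = np.random.default_rng(4)

def perturb(H, level_db):
    # Add a Gaussian error matrix E scaled so that ||H||_F^2 / ||E||_F^2
    # equals the requested level in dB (an assumed definition of the metric).
    E = rng.standard_normal(H.shape)
    E *= np.linalg.norm(H, "fro") / (np.linalg.norm(E, "fro") * 10 ** (level_db / 20))
    return H + E

H_train = rng.standard_normal((450, 900))      # matrix assumed during training
H_test = perturb(H_train, level_db=10.0)       # deviated matrix seen at test time
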

4.1.3. ADMM or FISTA to Solve RPCA Problem

In this work, we consider an iterative algorithm based on ADMM [39] to solve the RPCA problem. Alternatively, other methods such as ISTA and FISTA can be used [30,37]. In the following, we first compare the performance of the untrained ADMM-based Algorithm 1 (URPCA-T) and the untrained FISTA-based algorithm as given in [30,37]. Further, we consider three different combinations of the rank of the low-rank matrix and the sparsity of the sparse matrix, with 250 test samples for each combination. It turns out that, for all three combinations, the ADMM-based approach achieves lower NRMSEs with fewer iterations compared to FISTA, as shown in Figure 11. In this simulation, we consider standard RPCA, where the measurement matrices in (1) are equal to the identity matrix. We chose this scenario because it is the simplest, compression-free scenario. Note that, for FISTA, the soft-thresholding and singular value thresholding parameters are set based on the singular values of the observed data matrix.
Figure 11

Average NRMSE of low-rank and sparsity contributions for ADMM- and FISTA-based approaches for three rank/sparsity combinations, shown in panels (a), (b), and (c). The ADMM-based approach achieves a lower NRMSE with fewer iterations compared to the FISTA-based approach.

4.1.4. Performance Evaluation for Experimental Ultrasound Imaging Data

To further assess the performance of ADMM- and FISTA-based approaches in the context of algorithm unfolding, we consider the FISTA-based unfolded approach in [30] (CORONA). For a fair comparison, we consider two types of data: (a) the experimental ultrasound imaging data used in [30] (available at https://www.wisdom.weizmann.ac.il/yonina, accessed on 20 December 2021) and (b) complex-valued generic Gaussian data. Note that, for the generic data, we set the matrix dimensions to match those of the ultrasound data in [30]. Moreover, the real and imaginary parts of the entries of both low-rank and sparse matrices are generated independently from an i.i.d. Gaussian with zero mean and unit variance. Further, a fixed number of non-zero locations of each sparse matrix is selected uniformly at random. The rank of each low-rank matrix is set to 2, and the number of non-zero elements of each sparse matrix is fixed. We normalize the low-rank and sparse matrices to have unit Frobenius norm. Further, the SNR during training and testing is 20 dB. To have a fair comparison, we use the same number of layers as in CORONA. Both CORONA and our approaches are trained using the Adam optimizer for 20 epochs. For CORONA, the same settings as in [30] were used. We utilized the CORONA implementation from the authors' website (https://www.wisdom.weizmann.ac.il/yonina, accessed on 20 December 2021). Note that the experimental ultrasound data in [30] follow the standard RPCA problem: the measurement matrices in (1) are equal to the identity matrix. Therefore, for comparison, our ADMM-based approach is implemented without linear layers, i.e., all weight matrices are identity matrices and the aforementioned linear layers are omitted from the DNN. Thus, our proposed approach only learns the thresholding parameters with a fixed learning rate. Note that, in this setting, our proposed approach is only required to learn four parameters per layer, i.e., 40 parameters for a DNN with ten layers. In CORONA, however, six convolutional weight matrices and two thresholding parameters have to be learned for a single layer. Based on the convolutional filter sizes given in [30], CORONA with ten layers is required to learn far more parameters compared to the 40 parameters in our approach. First, we compare our proposed approach with CORONA for the experimental ultrasound data, and the corresponding results are shown in Table 2. The experimental ultrasound data in [30] consist of 2400 training samples and 800 testing samples. For the performance comparison, similar to [30], the average MSE is utilized as the metric for the ultrasound data. Note that, for the experimental ultrasound data, CORONA shows slightly better performance in recovering the sparse matrix S compared to the proposed approaches. For the low-rank matrix recovery, our proposed approaches and CORONA show similar performance levels. For the sparse matrix recovery, our proposed approaches show slightly worse performance compared to CORONA. This is to be expected since, in CORONA, mixed ℓ1,2-norm minimization is used for S, which reflects the row-sparse nature of the experimental ultrasound data. Our approach is formulated for plain unstructured sparsity in the matrix and is, therefore, not optimized for the sparsity patterns in the experimental ultrasound data. Note that it is also straightforward to modify our ADMM approaches to use soft-thresholding related to the mixed ℓ1,2-norm.
Table 2

Comparison with CORONA [30] for the experimental ultrasound imaging data from [30]. CORONA shows slightly better performance compared to the proposed approaches TRPCA-AT(log) and TRPCA-AT(exp) because CORONA is optimized for the structure of the ultrasound data, whereas our approaches are not.

Average recovery error = (1/(M·N·N_s)) · Σ_{i=1}^{N_s} ‖X_i − X̂_i‖_F

Method | Low-Rank Matrix L | Sparse Matrix S
CORONA [30] | 3.23×10⁻⁴ | 3.431×10⁻⁴
TRPCA-AT(log) | 3.26×10⁻⁴ | 6.641×10⁻⁴
TRPCA-AT(exp) | 3.37×10⁻⁴ | 7.101×10⁻⁴
TRPCA-T | 9.95×10⁻⁴ | 7.35×10⁻⁴
Next, we compare CORONA and our approach on the generic Gaussian data. For the Gaussian data set, we consider 2400 training samples and 1600 testing samples. Here, our proposed approach outperforms CORONA, as shown in Table 3. This is due to the fact that the data acquisition model given in (27) follows an unstructured sparsity model and does not include a convolution operation. Since the generic Gaussian data does not follow the same sparsity model as the ultrasound data in [30], the performance of CORONA degrades compared to our approach. As discussed above, the ultrasound data follows the standard RPCA model in which there is no compression, i.e., the measurement matrices in (1) are equal to the identity matrix. In order to evaluate our approach on compressed data, we manually apply compression to the ultrasound data, as discussed next.
Table 3

Comparison with CORONA [30] for generic Gaussian data. Our proposed approach TRPCA-AT(log) outperforms CORONA. This is due to the fact that CORONA is optimized for the structured sparsity of the ultrasound data, which is not present in the generic Gaussian data.

Average recovery error NRMSE = (1/N_s) · Σ_{i=1}^{N_s} ‖X_i − X̂_i‖_F / ‖X_i‖_F

Method | Low-Rank Matrix L | Sparse Matrix S
CORONA [30] | 4.45×10⁻¹ | 4.08×10⁻¹
TRPCA-AT(log) | 6.56×10⁻² | 3.29×10⁻²
The received signal matrix of the ultrasound data is the sum of a low-rank matrix, a sparse matrix, and a noise matrix of identical size. In the ultrasound data, a single measurement consists of twenty frames, and stacking the vectorized frames determines the dimensions of the received signal matrix. In order to evaluate our approach on compressed data, we manually apply compression to the ultrasound data by using a Gaussian measurement matrix that compresses each vectorized frame to a lower-dimensional vector; in more detail, the measurement matrix is a linear operator that maps the original frame space to a lower-dimensional space. After the compression, the received signal for a single measurement is obtained by applying this measurement matrix to each frame. Here, we consider 1800 training samples and 400 testing samples. We train our proposed approach using the Adam optimizer for 20 epochs with a fixed learning rate. The average normalized RMSEs for different numbers of layers of the DNN are shown in Figure 12. As shown in Figure 12, our proposed approach TRPCA-AT(log) outperforms the untrained approach URPCA-T in terms of NRMSE as well as the number of iterations. The proposed TRPCA-AT(log) achieves a much lower NRMSE using only 15 layers, compared to the 200 iterations of the untrained approach URPCA-T; hence, our approach is roughly 13 times faster than the untrained approach.
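A minimal sketch of this frame-wise compression and of the normalized recovery error reported in the figures might look as follows; the compressed dimension, the variable names, and the scaling of the Gaussian operator are assumptions for illustration only.

```python
import numpy as np

def compress_measurement(frames, m_out, rng=None):
    """Apply a common Gaussian measurement matrix to each vectorized frame.

    frames: array of shape (n_frames, frame_dim) holding the vectorized frames
    m_out:  compressed dimension per frame (m_out < frame_dim)
    """
    if rng is None:
        rng = np.random.default_rng()
    frame_dim = frames.shape[1]
    # one random Gaussian operator shared by all frames of a measurement
    A = rng.standard_normal((m_out, frame_dim)) / np.sqrt(m_out)
    return frames @ A.T, A  # compressed frames of shape (n_frames, m_out)

def nrmse(X_true, X_hat):
    # normalized recovery error: ||X - X_hat||_F / ||X||_F
    return np.linalg.norm(X_true - X_hat, 'fro') / np.linalg.norm(X_true, 'fro')
```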
Figure 12

Average NRMSE of the low-rank (a) and sparse (b) contributions for the experimental ultrasound data under compression, for the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)) and the ADMM-based untrained RPCA with thresholding (URPCA-T).

4.2. SFCW Radar Model

In this subsection, the performance of the ADMM-based trained RPCA with adaptive thresholding is evaluated for the SFCW radar model given in Section 2.1. In the simulations, we set the carrier frequency to 300 GHz and the bandwidth B to 5 GHz. Here, we consider two types of simulations: (a) small scale and (b) large scale. For the small scale, we consider 30 antennas and 30 frequency bands, and both the height and the length of the layered structure are fixed. In this scenario, we consider six defects, and the scene is partitioned into a grid with equal cell size, selected according to the Rayleigh resolution of the radar. For the large scale, we consider 100 antennas and 100 frequency bands. In addition, both the height and the length of the layered structure are increased, which results in a larger grid, and nine defects are considered in the radar scene. The inter-antenna spacing is chosen as half the carrier wavelength. We consider a single-layered structure with a fixed distance to its front surface. For the i-th data sample, the reflection of the layered material structure, the noise matrix, and the sparse vector are as given in (7), and the signal-to-noise ratio of the i-th data sample is defined in dB from the ratio of the corresponding signal and noise powers. Here, we set the same SNR for all samples. Note that the SFCW data consists of complex numbers; thus, in this work, we implemented the DNN with complex-number support using PyTorch [53]. Here, we initialize the learnable parameters to mimic the ADMM Algorithm 1. Interestingly, in contrast to the generic Gaussian model, learning only the thresholding parameters does not achieve satisfactory average NRMSEs of the low-rank and sparse components. Therefore, we enable learning of all parameters of the unfolded network. Further, we notice that stochastic gradient descent (SGD) [55] performs better than Adam when all parameters are learned jointly. Therefore, we consider a three-stage training process for better learning; a detailed breakdown of this three-stage training process is given in Appendix A.4. For defect detection by SFCW radar, we considered a data set of 600 samples: 500 data samples are used for training and validation, and 100 data samples are used for testing. We used MATLAB [51] to generate the SFCW data based on (7). First, we present the results related to the small-scale simulations; next, the results related to the large-scale simulations are presented.
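As an illustration of the complex-valued implementation, the following PyTorch sketch shows one way to realize a learnable soft-thresholding step for complex entries (shrinking the magnitude while preserving the phase). The class name and the initialization value are placeholders, not the exact layer used in this work.

```python
import torch
import torch.nn as nn

class ComplexSoftThreshold(nn.Module):
    """Soft-thresholding of complex entries: shrink the magnitude, keep the phase.

    Sketch of one learnable thresholding parameter per unfolded layer;
    the initial value below is a placeholder.
    """
    def __init__(self, init_threshold=0.1):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(float(init_threshold)))

    def forward(self, x):
        # x is a complex-valued tensor; scale each entry toward zero by the threshold
        mag = torch.abs(x)
        scale = torch.clamp(mag - self.threshold, min=0.0) / (mag + 1e-12)
        return x * scale
```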

4.2.1. SFCW Small-Scale Simulations

Here, we present the results for the small-scale configuration, i.e., 30 antenna elements and 30 frequency bands. The average normalized RMSEs for different numbers of layers of the DNN are shown in Figure 13. The figure shows that the proposed TRPCA-AT outperforms both URPCA-T and the LRPSRC given in (8). Further, in terms of the average RMSE, TRPCA-AT and TRPCA-T with five layers outperform URPCA-T with 200 iterations. Therefore, comparing the number of layers of TRPCA-AT to the number of iterations of URPCA-T, TRPCA-AT achieves a forty-fold improvement for the SFCW radar data, i.e., our proposed approach (TRPCA-AT) is forty times faster than the conventional untrained approach (URPCA-T). Moreover, based on the results shown in Figure 13, TRPCA-AT performs better than TRPCA-T. In addition, note that under compression, the estimation of the low-rank and sparse components from the measurements in (7) is more challenging; therefore, the average RMSE of the LRPSRC is comparatively high. However, the DNN-based TRPCA-AT is able to achieve a much lower average RMSE for both the sparse and low-rank components. Since the two measurement matrices are unequal in the SFCW radar model, we did not consider the CRB benchmark given in (A4).
Figure 13

Average NRMSE of the low-rank (a) and sparse (b) contributions of the SFCW radar model (small-scale configuration) for ADMM-based trained RPCA with thresholding (TRPCA-T), the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)), the proposed ADMM-based trained RPCA with adaptive thresholding based on the exponential heuristic (TRPCA-AT(exp)), low-rank plus sparse recovery with convex relaxation (LRPSRC), and the ADMM-based untrained RPCA with thresholding (URPCA-T).

Next, to further illustrate defect detection, images of the recovered defects are formed for a single data sample, as shown in Figure 14. As a benchmark, we consider the state-of-the-art subspace projection (SP) [9] method operating on the full (uncompressed) data set; furthermore, for SP, it is assumed that the number of defects is known. Figure 14a shows the actual defect locations. The recovered defect locations for the ADMM-based trained RPCA approaches TRPCA-AT(log), TRPCA-AT(exp), and TRPCA-T, the LRPSRC, the ADMM-based untrained RPCA with thresholding URPCA-T, and SP are shown in Figure 14b–g, respectively. It can be seen that the proposed TRPCA-AT approaches are able to identify all six defects. Further, the proposed TRPCA-AT approaches are even able to estimate the amplitudes of the recovered defects (the entries of the sparse vector) close to those of the actual defects. Therefore, the proposed TRPCA-AT approaches outperform the state-of-the-art SP even though they only use a compressed fraction of the data.
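For completeness, a small sketch of how such defect images can be formed from the recovered sparse vector is given below; the grid shape and plotting choices are illustrative assumptions, not the exact rendering used for Figure 14.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_defect_map(s_hat, grid_shape):
    """Map the recovered sparse vector back onto the scene grid and show its magnitude.

    s_hat:      recovered complex sparse vector (one entry per grid cell)
    grid_shape: (rows, cols) of the scene partition; a placeholder here
    """
    img = np.abs(s_hat).reshape(grid_shape)   # magnitude of the reflection coefficients
    plt.imshow(img, origin='lower')
    plt.colorbar(label='|reflection coefficient|')
    plt.title('Recovered defect map')
    plt.show()
    return img
```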
Figure 14

Object recovery for a single case under compression (small-scale configuration). (a) Ground truth, (b) TRPCA-AT(log) (proposed), (c) TRPCA-AT(exp) (proposed), (d) TRPCA-T, (e) LRPSRC, (f) URPCA-T, and (g) SP with the full data. The proposed TRPCA-AT approaches identify all six objects successfully while only utilizing compressed data, in contrast to the unweighted approach TRPCA-T.

4.2.2. SFCW Large-Scale Simulations

Here, we present the results for the large-scale configuration, i.e., 100 antenna elements and 100 frequency bands. In the small-scale simulations, our proposed approaches TRPCA-AT(log) and TRPCA-AT(exp) achieve similar results; therefore, we chose one of them, TRPCA-AT(log), for the large-scale simulations to compare with the untrained approach URPCA-T. The average normalized RMSEs for different numbers of layers of the DNN are shown in Figure 15. The figure shows that the proposed TRPCA-AT(log) outperforms the untrained approach URPCA-T. Further, in terms of the average RMSE, the proposed TRPCA-AT(log) with five layers outperforms the untrained approach URPCA-T with 200 iterations; therefore, our proposed approach (TRPCA-AT) is forty times faster than the conventional untrained approach (URPCA-T). In addition, note that under compression, the estimation of the low-rank and sparse matrices is more challenging. Nevertheless, the DNN-based TRPCA-AT(log) achieves a lower average NRMSE through parameter tuning than the conventional untrained approach (URPCA-T), with far fewer iterations.
Figure 15

Average NRMSE of the low-rank (a) and sparse (b) contributions of the SFCW radar model (large-scale configuration) for the proposed ADMM-based trained RPCA with adaptive thresholding based on the logarithm heuristic (TRPCA-AT(log)) and the ADMM-based untrained RPCA with thresholding (URPCA-T).

The recovered sparse matrix contains the complex reflection coefficients of the defects. Therefore, to further illustrate defect detectability, we examine how the power of the recovered sparse matrix is distributed. Here, we consider two metrics: (a) the total power at the true locations of the defects, i.e., the power of the elements of the sparse matrix that belong to the true defect locations, and (b) the total power of the false detections, i.e., the power of the elements of the sparse matrix that do not belong to the true defect locations. These results are shown in Table 4, and it is observed that the proposed approach TRPCA-AT(log) achieves a much higher total power at the true defect locations than the untrained approach (URPCA-T). Further, our approach achieves a lower false-detection power as well.
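The two metrics can be computed directly from the recovered sparse matrices; a brief sketch, assuming a boolean ground-truth mask of the defect locations, is given below. Summing the two quantities over all N_s test samples yields totals of the kind reported in Table 4.

```python
import numpy as np

def detection_powers(S_hat, true_mask):
    """Split the energy of a recovered sparse matrix into true-location and false-detection power.

    S_hat:     recovered sparse matrix (complex-valued)
    true_mask: boolean matrix, True at the ground-truth defect locations
    """
    power = np.abs(S_hat) ** 2
    true_power = power[true_mask].sum()    # energy on the true defect locations
    false_power = power[~true_mask].sum()  # energy everywhere else (false detections)
    return true_power, false_power
```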
Table 4

Total power at the true defect locations and of the false detections for 100 simulations (large-scale configuration). Here, the total true power of the defects over all simulations is 100.

Total power = Σ_{i=1}^{N_s} ‖S_i‖_F²

Method | True Locations of the Defects | False Detection
URPCA-T | 27.8565 | 2.0247
TRPCA-AT(log) | 44.727 | 1.3537
Next, to illustrate defect detection, images of the recovered defects are formed for two scenarios, as shown in Figure 16. In Figure 16, panels (Aa) and (Ba) show the actual locations of the defects. The recovered defect locations obtained by the proposed ADMM-based trained RPCA TRPCA-AT(log), the ADMM-based untrained RPCA with thresholding URPCA-T, and the classical subspace projection (SP) are shown in Figure 16b–d, respectively. It can be seen that the proposed TRPCA-AT(log) approach is able to identify all defects while only utilizing the compressed data. Further, the proposed TRPCA-AT(log) approach produces fewer false detections than the untrained RPCA with thresholding (URPCA-T) approach in these two scenarios. It is worth noting that the conventional SP approach utilizes the full data set and requires prior knowledge of the number of defects.
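For reference, subspace projection typically suppresses clutter by removing the dominant singular subspace of the data matrix. The following sketch shows a generic SVD-based variant; it is only an approximation of the SP benchmark from [9], whose exact rank-selection rule may differ.

```python
import numpy as np

def subspace_projection(Y, n_clutter):
    """Remove the dominant clutter subspace from the data matrix via a truncated SVD.

    Y:         full (uncompressed) data matrix, possibly complex-valued
    n_clutter: number of dominant singular directions assumed to hold the clutter
    """
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)
    # reconstruct the strongest singular components and subtract them from the data
    clutter = (U[:, :n_clutter] * s[:n_clutter]) @ Vh[:n_clutter, :]
    return Y - clutter
```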
Figure 16

Object recovery for a single case under compression (large-scale configuration). (a) Ground truth, (b) TRPCA-AT(log) (proposed), (c) URPCA-T, and (d) SP with the full data. The proposed TRPCA-AT(log) approach identifies all nine objects successfully while only utilizing compressed data. True locations of the objects are shown inside ellipses, and false detections are shown in squares.

5. Conclusions

This paper presented deep learning-based parameter tuning for low-rank plus sparse recovery (RPCA). To this end, an iterative algorithm was developed based on ADMM to estimate the low-rank and sparse contributions with iteratively reweighted nuclear and ℓ1-norm minimization. Next, to improve the accuracy of the recovered low-rank and sparse components and the speed of convergence of the algorithm, we proposed a DNN to tune the parameters of the iterative algorithm, i.e., algorithm unrolling/unfolding. Our proposed approach was evaluated for two types of data: as a standard benchmark, a generic Gaussian data acquisition model was used, and as a practical application, defect detection by SFCW radar from compressive measurements was considered. For both cases, our proposed approach performed substantially better than the untrained iterative algorithms in terms of low-rank and sparse recovery accuracy and convergence speed. In particular, depending on the compression ratio, our proposed approach was roughly 13 to 40 times faster than the untrained algorithm. In addition, we compared our proposed approach with the state-of-the-art RPCA unfolding approach CORONA. Our approach achieved a performance level similar to CORONA for experimental ultrasound imaging data, and it outperformed CORONA for generic Gaussian data. Moreover, we analyzed the robustness of our approach against uncertainty in the testing signal-to-noise ratio (SNR) and deviations in the measurement matrices. It was observed that knowledge of the testing SNR is an important factor, and for unknown testing SNR, it is better to train the DNN with a moderate SNR such as 5 dB. Furthermore, the robust training approach (training with the distribution of the deviation) decreased the impact of deviations in the measurement matrices on the performance. In this work, we considered a model-based unfolding approach in which the unfolded DNN strictly follows the structure of the optimization steps/rules. As possible future work, it would be interesting to study a model-free unfolding approach that is able to learn new optimization steps/rules from data. Moreover, validation of our approach on experimental/real measurements for defect detection is subject to future work.
Referenced articles (11 in total):

1.  Nonconvex Nonsmooth Low Rank Minimization via Iteratively Reweighted Nuclear Norm.

Authors:  Canyi Lu; Jinhui Tang; Shuicheng Yan; Zhouchen Lin
Journal:  IEEE Trans Image Process       Date:  2015-12-22       Impact factor: 10.856

2.  Nondestructive evaluation of aircraft composites using transmissive terahertz time domain spectroscopy.

Authors:  Christopher D Stoik; Matthew J Bohn; James L Blackshire
Journal:  Opt Express       Date:  2008-10-13       Impact factor: 3.894

3.  A Unified Alternating Direction Method of Multipliers by Majorization Minimization.

Authors:  Canyi Lu; Jiashi Feng; Shuicheng Yan; Zhouchen Lin
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2017-03-29       Impact factor: 6.226

4.  Robust face recognition with structurally incoherent low-rank matrix decomposition.

Authors:  Chia-Po Wei; Chih-Fan Chen; Yu-Chiang Frank Wang
Journal:  IEEE Trans Image Process       Date:  2014-06-06       Impact factor: 10.856

5.  Multipolarization Through-Wall Radar Imaging Using Low-Rank and Jointly-Sparse Representations.

Authors:  Van Ha Tang; Abdesselam Bouzerdoum; Son Lam Phung
Journal:  IEEE Trans Image Process       Date:  2018-04       Impact factor: 10.856

6.  Cross Comparison of Motor Unit Potential Features Used in EMG Signal Decomposition.

Authors:  Mohsen Ghofrani Jahromi; Hossein Parsaei; Ali Zamani; Daniel W Stashuk
Journal:  IEEE Trans Neural Syst Rehabil Eng       Date:  2018-05       Impact factor: 3.802

7.  Deep Unfolded Robust PCA With Application to Clutter Suppression in Ultrasound.

Authors:  Oren Solomon; Regev Cohen; Yi Zhang; Yi Yang; Qiong He; Jianwen Luo; Ruud J G van Sloun; Yonina C Eldar
Journal:  IEEE Trans Med Imaging       Date:  2019-09-13       Impact factor: 10.048

8.  A Comparative Analysis of Signal Decomposition Techniques for Structural Health Monitoring on an Experimental Benchmark. (Review)

Authors:  Marco Civera; Cecilia Surace
Journal:  Sensors (Basel)       Date:  2021-03-05       Impact factor: 3.576

