
Lower bounds for the low-rank matrix approximation.

Jicheng Li, Zisheng Liu, Guo Li.

Abstract

Low-rank matrix recovery is an active topic drawing the attention of many researchers. It addresses the problem of approximating the observed data matrix by an unknown low-rank matrix. Suppose that $A$ is a low-rank matrix approximation of $D$, where $D$ and $A$ are $m \times n$ matrices. Based on a useful decomposition of $D^{\dagger}-A^{\dagger}$, for the unitarily invariant norm $\|\cdot\|$, when $\operatorname{rank}(A)<\operatorname{rank}(D)$ and when $\operatorname{rank}(A)=\operatorname{rank}(D)$, two sharp lower bounds of $\|D-A\|$ are derived, respectively. The presented simulations and applications demonstrate our results when the approximation matrix $A$ is low-rank and the perturbation matrix $E$ is sparse.


Keywords:  approximation; error estimation; low-rank matrix; matrix norms; pseudo-inverse

Year:  2017        PMID: 29200797      PMCID: PMC5696467          DOI: 10.1186/s13660-017-1564-z

Source DB:  PubMed          Journal:  J Inequal Appl        ISSN: 1025-5834            Impact factor:   2.491


Introduction

In mathematics, low-rank approximation is a minimization problem in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to the constraint that the approximating matrix has reduced rank. The problem is used for mathematical modeling and data compression. The rank constraint is related to a constraint on the complexity of a model that fits the data. Low-rank approximation of a linear operator is ubiquitous in applied mathematics, scientific computing, numerical analysis, and a number of other areas. For example, a low-rank matrix could correspond to a low-degree statistical model for a random process (e.g., factor analysis), a low-order realization of a linear system [1], a low-dimensional embedding of data in Euclidean space [2], image processing and computer vision [3-5], bioinformatics, background modeling and face recognition [6], latent semantic indexing [7, 8], machine learning [9-12], control [13], etc.

These data may have thousands or even billions of dimensions, and a large number of samples may share the same or a similar structure. As is well known, the important information lies in some low-dimensional subspace or low-dimensional manifold, but it is interfered with by some perturbative component (sometimes a sparse component). Let $D \in \mathbb{R}^{m \times n}$ be an observed data matrix which is combined as $D = A + E$, where $A$ is the low-rank component and $E$ is the perturbation component of $D$. The singular value decomposition (SVD [14]) is a method for dealing with such high-dimensional data. If the matrix $E$ is small, classical principal component analysis (PCA [15-17]) can seek the best rank-$r$ estimate of $A$ by solving the following constrained optimization via the SVD of $D$ and then projecting the columns of $D$ onto the subspace spanned by the $r$ principal left singular vectors of $D$:

$$\min_{A} \|D-A\|_{F} \quad \text{s.t.} \quad \operatorname{rank}(A)\le r,$$

where $r$ is the target dimension of the subspace, $\epsilon$ is an upper bound on the perturbative component, and $\|\cdot\|_{F}$ is the Frobenius norm.

Despite its many advantages, traditional PCA suffers from the fact that the estimate obtained by classical PCA can be arbitrarily far from the true $A$ when $E$ is sufficiently sparse (relative to the rank of $A$). The reason for this poor performance is precisely that traditional PCA makes sense for Gaussian noise and not for sparse noise. Recently, robust PCA (RPCA [6]) has emerged as a family of methods that aims to make PCA robust to large errors and outliers; that is, RPCA is an upgrade of PCA.

There are several reasons for studying the lower bound of the low-rank matrix approximation problem. Firstly, as far as we know, there is no literature considering the lower bound of the low-rank matrix approximation problem; in this paper, we first put forward such a lower bound. Secondly, for the low-rank approximation, when a perturbation $E$ exists, there is an approximation error which cannot be avoided; that is, the approximation error cannot equal 0, but only tend to 0. Thirdly, from our main results, we can clearly see the influence of the spectral norm ($\|D\|_{2}$) on the low-rank matrix approximation. For example, in our main result for Case II, when the largest singular value of the matrix $D$ is larger, the lower bound on the approximation error is smaller. In addition, the lower bound can be used to verify whether the solution obtained by an algorithm is optimal; for details, please refer to the experiments in Section 4 of our paper. Therefore, it is necessary and significant to study the lower bound of the low-rank matrix approximation problem.
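For concreteness, here is a minimal numpy sketch of the truncated-SVD step described above (the Eckart-Young construction); the data sizes, rank, and sparsity level are illustrative choices, not values from the paper:

```python
import numpy as np

def best_rank_r(D, r):
    """Best rank-r approximation of D in the Frobenius norm
    (Eckart-Young), obtained by truncating the SVD of D."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Illustrative data: a rank-5 matrix A plus a sparse perturbation E.
rng = np.random.default_rng(0)
m, n, r = 60, 40, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
E = rng.standard_normal((m, n)) * (rng.random((m, n)) < 0.05)
D = A + E

A_hat = best_rank_r(D, r)
print(np.linalg.norm(D - A_hat, 'fro'))  # approximation error ||D - A_hat||_F
```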

Remark 1.1

PCA and RPCA are methods for the low-rank approximation problem when a perturbation term exists. Our aim is to prove that, no matter what method is used, a lower bound on the error always exists and cannot be avoided while the perturbation term E is present. Considering the existence of this error, this paper focuses on the specific size of this lower bound.

Notations

For a matrix $A\in\mathbb{C}^{m\times n}$, let $\|A\|_{2}$ and $\|A\|_{*}$ denote the spectral norm and the nuclear norm (i.e., the sum of its singular values), respectively. Let $\|\cdot\|$ be a unitarily invariant norm. The pseudo-inverse and the conjugate transpose of $A$ are denoted by $A^{\dagger}$ and $A^{*}$, respectively. We consider the singular value decomposition (SVD) of a matrix $A$ of rank $r$, $A=U\Sigma V^{*}$, where $U$ and $V$ are $m\times r$ and $n\times r$ matrices with orthonormal columns, respectively, and $\Sigma=\operatorname{diag}(\sigma_{1},\ldots,\sigma_{r})$ contains the positive singular values. We always assume that the SVD of a matrix is given in the reduced form above. Furthermore, $\langle A,B\rangle=\operatorname{tr}(A^{*}B)$ denotes the standard inner product; then the Frobenius norm is $\|A\|_{F}=\langle A,A\rangle^{1/2}$.
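As a quick sanity check of these conventions, the following numpy snippet (an illustrative sketch, not part of the paper) verifies the nuclear, spectral, and Frobenius norm identities stated above:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

# Pseudo-inverse and reduced SVD, as in the Notations section.
A_pinv = np.linalg.pinv(A)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Nuclear norm = sum of singular values; spectral norm = largest one.
print(np.isclose(np.sum(s), np.linalg.norm(A, 'nuc')))
print(np.isclose(s[0], np.linalg.norm(A, 2)))

# Frobenius norm from the standard inner product <A, B> = trace(A* B).
print(np.isclose(np.sqrt(np.trace(A.T @ A)), np.linalg.norm(A, 'fro')))
```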

Organization

In this paper, we study a perturbation theory for low-rank matrix approximation. When $\operatorname{rank}(A)<\operatorname{rank}(D)$ or $\operatorname{rank}(A)=\operatorname{rank}(D)$, two sharp lower bounds of $\|D-A\|$ are derived for a unitarily invariant norm, respectively. This work is organized as follows. In Section 2, we provide a review of relevant linear algebra and some preliminary results. In Section 3, under different norms, two sharp lower bounds of $\|D-A\|$ are given for the low-rank approximation problem, and the proofs of Theorem 3.5 are presented. In Section 4, examples and applications are given to verify the provided lower bounds. Finally, we conclude the paper with a short discussion.

Preliminaries

In order to prove our main results, we collect the following facts for use in the discussion that follows.

Unitarily invariant norm

An important property of a Euclidean space is that shapes and distances do not change under rotation. In particular, for any vector $x$ and any unitary matrix $U$, we have $\|Ux\|_{2}=\|x\|_{2}$. An analogous property is shared by the spectral and Frobenius norms: namely, for any unitary matrices $U$ and $V$, $\|UAV\|_{2}=\|A\|_{2}$ and $\|UAV\|_{F}=\|A\|_{F}$. These examples suggest the following definition.

Definition 2.1

([18]) A norm $\|\cdot\|$ on $\mathbb{C}^{m\times n}$ is unitarily invariant if it satisfies $\|UAV\|=\|A\|$ for any unitary matrices $U$ and $V$. It is normalized if $\|A\|=\|A\|_{2}$ whenever $A$ is of rank one.

Remark 2.2

Let $A=U\Sigma V^{*}$ be the singular value decomposition of the matrix $A$ of order $n$. Let $\|\cdot\|$ be a unitarily invariant norm. Since $U$ and $V$ are unitary, $\|A\|=\|U\Sigma V^{*}\|=\|\Sigma\|$. Thus $\|A\|$ is a function of the singular values of $A$. The 2-norm plays a special role in the theory of unitarily invariant norms, as the following theorem shows.
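The invariance is easy to confirm numerically; this hypothetical snippet draws random orthogonal factors via QR and checks that the spectral, Frobenius, and nuclear norms depend only on the singular values:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal U
V, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal V

for ord_ in (2, 'fro', 'nuc'):  # spectral, Frobenius, nuclear norms
    # Each of these norms is a function of the singular values alone,
    # so it is unchanged by left/right unitary multiplication.
    print(ord_, np.isclose(np.linalg.norm(U @ A @ V, ord_),
                           np.linalg.norm(A, ord_)))
```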

Theorem 2.3

([18]) Let $\|\cdot\|$ be a family of unitarily invariant norms. Then $\|ABC\|\le\|A\|_{2}\,\|B\|\,\|C\|_{2}$, and for a normalized norm, $\|A\|_{2}\le\|A\|$. Moreover, if $\operatorname{rank}(A)=1$, then $\|A\|=\|A\|_{2}$. We have observed that the spectral and Frobenius norms are unitarily invariant. However, not all norms are unitarily invariant, as the following example shows.

Example 2.4

Let $\|A\|_{\Delta}=\max_{i,j}|a_{ij}|$ and take $A=I_{2}$; obviously, $\|A\|_{\Delta}=1$, but for the unitary matrix $U=\frac{1}{\sqrt{2}}\bigl(\begin{smallmatrix}1&-1\\1&1\end{smallmatrix}\bigr)$ we have $\|UA\|_{\Delta}=\frac{1}{\sqrt{2}}\neq\|A\|_{\Delta}$.

Remark 2.5

It is easy to verify that the nuclear norm is a unitarily invariant norm.

Projection

Let $\mathcal{X}$ and $\mathcal{Y}$ be $m$- and $n$-dimensional inner product spaces over the complex field, respectively, and let $A$ be a linear transformation from $\mathcal{X}$ into $\mathcal{Y}$.

Definition 2.6

([18]) The column space (range) of $A$ is denoted by $\mathcal{R}(A)$, and the null space of $A$ by $\mathcal{N}(A)$. Further, we let ⊥ denote the orthogonal complement and get $\mathcal{R}(A)^{\perp}=\mathcal{N}(A^{*})$ and $\mathcal{N}(A)^{\perp}=\mathcal{R}(A^{*})$. The following properties [18] of the pseudo-inverse are easily established.

Theorem 2.7

([18]) For any matrix $A$, the following hold: $AA^{\dagger}A=A$, $A^{\dagger}AA^{\dagger}=A^{\dagger}$, $(AA^{\dagger})^{*}=AA^{\dagger}$, and $(A^{\dagger}A)^{*}=A^{\dagger}A$. Here $I$ is the identity matrix. If $A\in\mathbb{C}^{m\times n}$ has rank $n$, then $A^{\dagger}A=I_{n}$ and $A^{\dagger}=(A^{*}A)^{-1}A^{*}$. If $A$ has rank $m$, then $AA^{\dagger}=I_{m}$ and $A^{\dagger}=A^{*}(AA^{*})^{-1}$.

Theorem 2.8

([18]) For any matrix $A$, $P_{A}=AA^{\dagger}$ is the orthogonal projector onto $\mathcal{R}(A)$, $P_{A^{*}}=A^{\dagger}A$ is the orthogonal projector onto $\mathcal{R}(A^{*})$, and $I-AA^{\dagger}$ is the orthogonal projector onto $\mathcal{R}(A)^{\perp}=\mathcal{N}(A^{*})$.
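These projector identities are easy to check numerically; the sketch below (illustrative, using numpy's pinv) confirms that $AA^{\dagger}$ is Hermitian, idempotent, and fixes $\mathcal{R}(A)$:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 3))          # full column rank (rank 3)
P_col = A @ np.linalg.pinv(A)            # A A† : projector onto R(A)
P_row = np.linalg.pinv(A) @ A            # A† A : projector onto R(A*)

# Orthogonal projectors are Hermitian and idempotent.
print(np.allclose(P_col, P_col.T), np.allclose(P_col @ P_col, P_col))
# P_col fixes the columns of A; I - P_col annihilates them.
print(np.allclose(P_col @ A, A))
print(np.allclose((np.eye(6) - P_col) @ A, 0))
```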

The decomposition of $D^{\dagger}-A^{\dagger}$

In this section, we focus on the decomposition of $D^{\dagger}-A^{\dagger}$ and a general bound from the perturbation theory for pseudo-inverses. Firstly, from the properties of the orthogonal projections, we can deduce the following lemma.

Lemma 2.9

For any matrix $A$, let $P_{A}=AA^{\dagger}$ and $P_{A^{*}}=A^{\dagger}A$; then we have $A^{\dagger}=P_{A^{*}}A^{\dagger}=A^{\dagger}P_{A}$.

Proof

Since $A^{\dagger}=A^{\dagger}AA^{\dagger}$, we have $P_{A^{*}}A^{\dagger}=A^{\dagger}AA^{\dagger}=A^{\dagger}$ and $A^{\dagger}P_{A}=A^{\dagger}AA^{\dagger}=A^{\dagger}$. The proof is completed. □ Using Lemma 2.9, decompositions of $D^{\dagger}-A^{\dagger}$ were developed by Wedin [19].

Theorem 2.10

([19]) Let $D=A+E$; then the difference $D^{\dagger}-A^{\dagger}$ is given by expressions such as
$$D^{\dagger}-A^{\dagger}=-D^{\dagger}EA^{\dagger}+D^{\dagger}(D^{\dagger})^{*}E^{*}(I-AA^{\dagger})+(I-D^{\dagger}D)E^{*}(A^{\dagger})^{*}A^{\dagger}.$$
By Lemma 2.9, using $P_{A}=AA^{\dagger}$, $P_{A^{*}}=A^{\dagger}A$, $P_{D}=DD^{\dagger}$, $P_{D^{*}}=D^{\dagger}D$, these expressions can be verified. In previous work [19], Wedin developed a general bound from the perturbation theory for pseudo-inverses. Theorem 2.11 is based on a useful decomposition of $D^{\dagger}-A^{\dagger}$, where $D$ and $A$ are $m\times n$ matrices; sharp estimates of $\|D^{\dagger}-A^{\dagger}\|$ are derived for a unitarily invariant norm. In [20], Chen et al. presented some new perturbation bounds for the orthogonal projections $P_{D}$ and $P_{A}$.
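The displayed identity can be verified directly in floating point; the following sketch (real matrices, so $*$ becomes the transpose) checks it for a random low-rank $A$ and perturbation $E$:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 6, 5
A = rng.standard_normal((m, 3)) @ rng.standard_normal((3, n))  # rank 3
E = 0.1 * rng.standard_normal((m, n))
D = A + E

Ap, Dp = np.linalg.pinv(A), np.linalg.pinv(D)
I_m, I_n = np.eye(m), np.eye(n)

# One of Wedin's expressions for D† - A† (real case, * = transpose):
rhs = (-Dp @ E @ Ap
       + Dp @ Dp.T @ E.T @ (I_m - A @ Ap)
       + (I_n - Dp @ D) @ E.T @ Ap.T @ Ap)
print(np.allclose(Dp - Ap, rhs))  # the identity holds exactly
```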

Theorem 2.11

([19]) Suppose $D=A+E$; then the error of the pseudo-inverse has the following bound:
$$\|D^{\dagger}-A^{\dagger}\|\le\gamma\,\max\bigl(\|A^{\dagger}\|_{2}^{2},\|D^{\dagger}\|_{2}^{2}\bigr)\,\|E\|,$$
where γ is given in Table 1.
Table 1

Value options for γ

∥⋅∥ | Arbitrary | Spectral | Frobenius
γ   | 3         | (1+√5)/2 | √2

Remark 2.12

For the spectral norm, by formula (11) we obtain $\|D^{\dagger}-A^{\dagger}\|_{2}\le\frac{1+\sqrt{5}}{2}\max(\|A^{\dagger}\|_{2}^{2},\|D^{\dagger}\|_{2}^{2})\|E\|_{2}$. When $\|\cdot\|$ is the Frobenius norm, by formula (12), we have $\|D^{\dagger}-A^{\dagger}\|_{F}\le\sqrt{2}\max(\|A^{\dagger}\|_{2}^{2},\|D^{\dagger}\|_{2}^{2})\|E\|_{F}$. Similarly, for an arbitrary unitarily invariant norm, according to formula (13), we can deduce $\|D^{\dagger}-A^{\dagger}\|\le 3\max(\|A^{\dagger}\|_{2}^{2},\|D^{\dagger}\|_{2}^{2})\|E\|$.

Remark 2.13

From Theorem 2.11, since $E=D-A$, in fact, if $\|D^{\dagger}-A^{\dagger}\|\neq 0$, then (11) gives the lower bound of the low-rank matrix approximation:
$$\|D-A\|\;\ge\;\frac{\|D^{\dagger}-A^{\dagger}\|}{\gamma\,\max\bigl(\|A^{\dagger}\|_{2}^{2},\|D^{\dagger}\|_{2}^{2}\bigr)}.$$
In the following section, based on Theorem 2.11, we provide two lower error bounds of $\|D-A\|$ for a unitarily invariant norm.
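Both Theorem 2.11 and the induced lower bound are easy to test numerically; the following sketch uses the Frobenius-norm constant $\gamma=\sqrt{2}$ from Table 1 (the sizes and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))  # low rank
E = 0.01 * rng.standard_normal((8, 6))
D = A + E

Ap, Dp = np.linalg.pinv(A), np.linalg.pinv(D)
gamma = np.sqrt(2)  # Frobenius-norm constant from Table 1
denom = gamma * max(np.linalg.norm(Ap, 2)**2, np.linalg.norm(Dp, 2)**2)

lower = np.linalg.norm(Dp - Ap, 'fro') / denom   # lower bound on ||D - A||_F
print(lower <= np.linalg.norm(D - A, 'fro'))     # True, by Theorem 2.11
```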

Our main results

In this section, we consider the lower bound theory for the low-rank matrix approximation based on a useful decomposition of $D^{\dagger}-A^{\dagger}$. When $\operatorname{rank}(A)<\operatorname{rank}(D)$ or $\operatorname{rank}(A)=\operatorname{rank}(D)$, sharp lower bounds of $\|D-A\|$ are derived in terms of a unitarily invariant norm. In order to prove our result, some lemmas are listed below.

Lemma 3.1

([18]) Let $B=A+E$; the projections $P_{B}$ and $P_{A}$ satisfy $\|P_{B}-P_{A}\|_{2}\le 1$. If $\operatorname{rank}(A)=\operatorname{rank}(B)$, then $\|P_{B}-P_{A}\|_{2}<1$.

Lemma 3.2

([21]) Let $A,D\in\mathbb{C}^{m\times n}$ with $\operatorname{rank}(A)=\operatorname{rank}(D)=r$; then there exists a unitary matrix $Q$ that simultaneously reduces $P_{A}$ and $P_{D}$ to a canonical (CS-decomposition) form in which $C=\operatorname{diag}(\cos\theta_{i})$ and $S=\operatorname{diag}(\sin\theta_{i})$ are built from the principal angles $\theta_{i}$ between $\mathcal{R}(A)$ and $\mathcal{R}(D)$. Moreover, $C$ and $S$ satisfy $C^{2}+S^{2}=I$ and $CS=SC$. According to Lemma 3.2, we can easily get the following result.

Lemma 3.3

Let $A,D\in\mathbb{C}^{m\times n}$ with $\operatorname{rank}(A)=\operatorname{rank}(D)=r$; then we have $\|P_{D}(I-P_{A})\|=\|P_{A}(I-P_{D})\|$.

Proof

Since $\operatorname{rank}(A)=\operatorname{rank}(D)$, Lemma 3.2 reduces $P_{D}(I-P_{A})$ and $P_{A}(I-P_{D})$ to the same canonical form, and their nonzero singular values coincide (they are the sines of the principal angles between $\mathcal{R}(A)$ and $\mathcal{R}(D)$). Therefore, they have the same singular values, which yields $\|P_{D}(I-P_{A})\|=\|P_{A}(I-P_{D})\|$. □ This is a useful lemma that we will use in the proof of the main result; a numerical check is sketched below. In order to prove our main theorem, two lower bounds of $\|D^{\dagger}-A^{\dagger}\|$ are required, given by the following lemma.
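The equal-rank symmetry is easy to confirm numerically; the following illustrative snippet compares the singular values of the two projector products for random equal-rank ranges:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, r = 7, 5, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank 2
D = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank 2

P_A = A @ np.linalg.pinv(A)   # projector onto R(A)
P_D = D @ np.linalg.pinv(D)   # projector onto R(D)
I = np.eye(m)

# With equal-rank ranges, P_D(I - P_A) and P_A(I - P_D) share their
# nonzero singular values (the sines of the principal angles), hence
# have equal unitarily invariant norms.
s1 = np.linalg.svd(P_D @ (I - P_A), compute_uv=False)
s2 = np.linalg.svd(P_A @ (I - P_D), compute_uv=False)
print(np.allclose(np.sort(s1), np.sort(s2)))
```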

Lemma 3.4

For the unitarily invariant norm $\|\cdot\|$, if $D\neq A$, then the lower bound of $\|D^{\dagger}-A^{\dagger}\|$ satisfies: Case I: For $\operatorname{rank}(A)<\operatorname{rank}(D)$, we have bound (22). Case II: For $\operatorname{rank}(A)=\operatorname{rank}(D)$, we have bound (23).

Proof

Case I: Since $\operatorname{rank}(A)<\operatorname{rank}(D)$, we have $P_{D}(I-P_{A})\neq 0$. Using Theorem 2.3 and Lemma 3.1, we bound $\|D^{\dagger}-A^{\dagger}\|$ from below by $\|D^{\dagger}-A^{\dagger}\|_{2}$ and control the projector difference, respectively. By Lemma 2.9, we have $A^{\dagger}=A^{\dagger}P_{A}$, $D^{\dagger}=D^{\dagger}P_{D}$, and $D^{\dagger}=P_{D^{*}}D^{\dagger}$; this also yields (22). Case II: Since $\operatorname{rank}(A)=\operatorname{rank}(D)$, the two ranges have equal dimension. Similarly, by Lemma 3.3, using $\|P_{D}(I-P_{A})\|=\|P_{A}(I-P_{D})\|$, we obtain (23). We complete the proof of Lemma 3.4. □ Our main results can be described as the following theorem.

Theorem 3.5

Suppose that $D,A\in\mathbb{C}^{m\times n}$ with $D=A+E$; for the unitarily invariant norm $\|\cdot\|$, the error of the low-rank approximation has the following bounds. Case I: For $\operatorname{rank}(A)<\operatorname{rank}(D)$, we have bound (24). Case II: For $\operatorname{rank}(A)=\operatorname{rank}(D)$, we have bound (25). Here the value options for γ are the same as in Table 1.

Proof

Case I: For $\operatorname{rank}(A)<\operatorname{rank}(D)$, by Theorem 2.11 and Lemma 3.4 (22), we can deduce
$$\gamma\,\max\bigl(\|A^{\dagger}\|_{2}^{2},\|D^{\dagger}\|_{2}^{2}\bigr)\,\|D-A\|\;\ge\;\|D^{\dagger}-A^{\dagger}\|,$$
which, combined with the lower bound (22) on $\|D^{\dagger}-A^{\dagger}\|$, yields (24). Case II: Similarly, for $\operatorname{rank}(A)=\operatorname{rank}(D)$, by Theorem 2.11 and Lemma 3.4 (23), we can deduce the corresponding chain of inequalities; this yields (25), where the value options for γ are the same as in Table 1. In summary, we prove the lower bounds of Theorem 3.5. □

Remark 3.6

From the main theorem, we can see that if $D=A$, then $\|D-A\|=0$. However, in the problem of low-rank matrix approximation, $D$ is not necessarily equal to $A$, so the approximation error is present. Furthermore, when $A$ is close to $D$, simulations demonstrate that the error has a very small magnitude (see Section 4). In this section, we discussed the error bounds under different conditions for the unitarily invariant norm. Based on a useful decomposition of $D^{\dagger}-A^{\dagger}$, for $\operatorname{rank}(A)<\operatorname{rank}(D)$ and $\operatorname{rank}(A)=\operatorname{rank}(D)$, we have bounds (26) and (27), respectively. The two error bounds are useful in low-rank matrix approximation. The following experiments illustrate our results when the approximation matrix $A$ is low-rank and the perturbation matrix $E$ is sparse.

Experiments

The singular value thresholding algorithm

Our results are obtained by a singular value thresholding (SVT [22]) algorithm. This algorithm is easy to implement and surprisingly effective, both in terms of computational cost and storage requirements, when the minimum nuclear norm solution is also the lowest-rank solution. The specific algorithm is described as follows. For the low-rank matrix approximation problem contaminated with a perturbation term $E$, we observe the data matrix $D=A+E$. To approximate $D$, we can solve the convex optimization problem
$$\min_{A}\ \|A\|_{*}\quad\text{s.t.}\quad\|D-A\|_{F}\le\epsilon,\qquad(28)$$
where $\|A\|_{*}$ denotes the nuclear norm of a matrix (i.e., the sum of its singular values). For solving (28), we introduce the soft-thresholding (singular value shrinkage) operator [22], which is defined as
$$\mathcal{D}_{\tau}(X)=U\operatorname{diag}\bigl(\max(\sigma_{i}-\tau,0)\bigr)V^{*},$$
where $X=U\operatorname{diag}(\sigma_{i})V^{*}$ is the SVD of $X$ and $\tau>0$. In general, this operator can effectively shrink some singular values toward zero. The following theorem concerns the shrinkage operator [22-24] and will be used at each iteration of the proposed algorithm.
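In code, the shrinkage operator is a few lines; this is an illustrative numpy sketch, not the authors' implementation:

```python
import numpy as np

def shrink(X, tau):
    """Singular value shrinkage (soft-thresholding) operator D_tau:
    shrink each singular value of X toward zero by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```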

Theorem 4.1

([22]) For each $\tau\ge 0$ and $Y\in\mathbb{C}^{m\times n}$, the singular value shrinkage operator obeys
$$\mathcal{D}_{\tau}(Y)=\arg\min_{X}\Bigl\{\tau\|X\|_{*}+\tfrac{1}{2}\|X-Y\|_{F}^{2}\Bigr\}.$$
By introducing a Lagrange multiplier $Y$ to remove the inequality constraint, one obtains the augmented Lagrangian function of (28). The iterative scheme of the classical augmented Lagrangian multiplier method is (29). Based on the optimality conditions, (29) is equivalent to an inclusion stated via $\partial$, the subgradient operator of a convex function. Then, by Theorem 4.1 above, we have the iterative solution
$$A^{k}=\mathcal{D}_{\tau}\bigl(Y^{k-1}\bigr),\qquad Y^{k}=Y^{k-1}+\delta_{k}\bigl(D-A^{k}\bigr),$$
where $\delta_{k}$ is a step size. The SVT approach works as described in Algorithm 1 (SVT).
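A compact version of this iteration, reusing the shrink operator sketched above (the threshold, step size, and iteration count are illustrative assumptions, not the paper's values):

```python
import numpy as np

def svt(D, tau=None, delta=1.2, iters=200):
    """A minimal SVT loop for low-rank approximation of D: a sketch of
    the Uzawa-type iteration described above, with heuristic defaults."""
    if tau is None:
        tau = 5 * np.sqrt(max(D.shape))   # heuristic threshold
    Y = np.zeros_like(D)
    for _ in range(iters):
        A = shrink(Y, tau)       # A^k = D_tau(Y^{k-1})  (Theorem 4.1)
        Y = Y + delta * (D - A)  # multiplier (dual) update
    return A
```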

Simulations

In this section, we use the SVT algorithm for the low-rank matrix approximation problem. Let $D=A+E$ be the available data. For simplicity, we restrict our examples to square matrices ($m=n$). We draw $A$ as a product of independent random matrices and generate the perturbation matrix $E$ to be sparse, with entries following an i.i.d. Gaussian distribution. Specifically, the rank of the matrix $A$ and the number of sparse entries of the perturbation matrix $E$ are fixed in each trial. Table 2 reports the results obtained by lower bounds (24), (25) and (12), respectively. Bounds (24) and (25) are our new results; bound (12) is the previous result. Comparing the bounds with each other by numerical experiments, we find that lower bounds (24) and (25) are smaller than lower bound (12).
Table 2

Lower bound comparison results

m = n  | Bound (24) ∥⋅∥₂ | Bound (24) ∥⋅∥_F | Bound (25) ∥⋅∥₂ | Bound (25) ∥⋅∥_F | Bound (12) ∥⋅∥₂ | Bound (12) ∥⋅∥_F
100    | 8.13e-7         | 1.89e-7          | 1.54e-7         | 3.31e-7          | 1.01e-4         | 1.27e-4
500    | 5.11e-8         | 3.71e-8          | 4.22e-8         | 4.62e-8          | 4.23e-4         | 5.22e-4
1,000  | 3.76e-8         | 2.14e-8          | 1.01e-8         | 1.19e-8          | 5.57e-4         | 7.48e-4
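A hypothetical re-run of this setup with illustrative parameters (rank 5, 5% sparsity, n = 100; these are stand-ins, not the paper's values), reusing the svt and shrink sketches above and the generic lower bound from Remark 2.13:

```python
import numpy as np

rng = np.random.default_rng(7)
n, r = 100, 5
A_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
E = rng.standard_normal((n, n)) * (rng.random((n, n)) < 0.05)
D = A_true + E

A_svt = svt(D)                                   # from the sketch above
err = np.linalg.norm(D - A_svt, 'fro')           # achieved ||D - A||_F
Ap, Dp = np.linalg.pinv(A_svt), np.linalg.pinv(D)
lb = np.linalg.norm(Dp - Ap, 'fro') / (np.sqrt(2) *
     max(np.linalg.norm(Ap, 2)**2, np.linalg.norm(Dp, 2)**2))
print(err, lb)   # any valid lower bound must satisfy lb <= err
```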

Applications

In this section, we use the SVT algorithm for low-rank image approximation. From Figures 1 and 2, compared with the original image (a), the low-rank image (b) loses some details. We can hardly get any detailed information from the incomplete image (c). However, the output image (d), which is obtained by the SVT algorithm, recovers the details of the low-rank image (b). If we denote image (b) by a low-rank matrix $A$, then image (c) is the observed data matrix $D$, which is perturbed by a sparse matrix $E$; that is, $D=A+E$.
Figure 1

Cameraman. (a) Original image with full rank. (b) Original image truncated to rank 50. (c) Image (b) with 50% of pixels randomly masked. (d) Recovered image from (c).

Figure 2

Barbara. (a) Original image with full rank. (b) Original image truncated to rank 100. (c) Image (b) with 50% of pixels randomly masked. (d) Recovered image from (c).

Using the SVT algorithm for the low-rank image approximation problem, the lower bound comparison results are shown in Table 3. The computed values of $\|E\|_{F}$ are 8.71e-2 and 7.23e-2 for the images Cameraman and Barbara, respectively. For the F-norm version of our lower bound (25), the values are 2.59e-5 and 1.09e-5, respectively. That is to say, our error bounds verify that the SVT algorithm can still be improved.
Table 3

Lower bound comparison results of low-rank image approximation

           | Cameraman | Barbara
∥E∥_F      | 8.71e-2   | 7.23e-2
Bound (25) | 2.59e-5   | 1.09e-5
Iterations | 200       | 200

Conclusion

The low-rank matrix approximation problem arises in a number of applications in model selection, system identification, complexity theory, and optics. Based on a useful decomposition of $D^{\dagger}-A^{\dagger}$, this paper reviewed the previous work and provided two sharp lower bounds for the low-rank matrix recovery problem with a unitarily invariant norm. From our main Theorem 3.5, we can see that if $D=A$, then $\|D-A\|=0$. However, in the problem of low-rank matrix approximation, $D$ is not necessarily equal to $A$, so the approximation error is present. Furthermore, from the main results, we can clearly see the influence of the spectral norm ($\|D\|_{2}$) on the low-rank matrix approximation. For example, in Case II, when the largest singular value of the matrix $D$ is larger, the lower bound on the approximation error is smaller. Finally, we used the SVT algorithm for the low-rank matrix approximation problem. Table 2 shows that our lower bounds (24) and (25) are smaller than lower bound (12), and the simulation results demonstrate that the lower bounds have a very small magnitude. In the applications section, we used the SVT algorithm for the low-rank image approximation problem; the lower bound comparison results are shown in Table 3. From the comparison results, we find that our lower bounds can verify whether the SVT algorithm can be improved.

1. Pei Chen, David Suter. Recovering the missing components in a large noisy low-rank matrix: application to SFM. IEEE Trans Pattern Anal Mach Intell, 2004-08.

