Reconstruction of jointly sparse vectors via manifold optimization
Introduction
The sparse sampling and recovery problem has been a subject of intensive research for the last two decades, especially after major advances in the field of compressed sensing, through the seminal works [5], [11]. In the classical problem, the goal is to find the vector $x \in \mathbb{R}^N$ from the system $y = Ax$, where $A \in \mathbb{R}^{M \times N}$, under the assumption that the vector x is sparse, i.e., has only a few non-zero entries. Here, N represents the dimensionality of the unknown vector x and M is the number of linear measurements performed on x. A typical difficulty in solving the problem stems from the fact that one often has only a limited number of measurements, i.e., $M < N$, so the system is underdetermined.
In the joint sparse recovery problem, also known as multiple measurement vectors (MMV), simultaneous sparse approximation, or collaborative sparse coding, we instead solve a matrix equation $Y = AX$, where the unknown matrix $X \in \mathbb{R}^{N \times K}$ has only a few non-zero rows. X can be viewed as a collection of K unknown vectors in $\mathbb{R}^N$ which share the same sparsity support. In this case, it has been demonstrated [30], [14], [26] that recovery of multiple vectors simultaneously can take advantage of their joint sparsity, which helps reduce the number of measurements compared to reconstructing each vector individually. The joint sparse recovery problem has applications in hyper-spectral imaging [7], [13], [35], [37], source localization [2], [25] and many other fields. In these problems, the joint sparse recovery is often conducted under additional positivity constraints, which are outside the scope of this paper. Our proposed method therefore is not directly applicable, but is an important building block towards these applications. Another example is the CLSUnSAL algorithm [19] for hyperspectral unmixing, which is based on a joint sparse regularization.
In this paper, we first utilize the rank of the output data matrix Y to reduce the problem to the case where the unknown matrix has full column rank. More specifically, we prove the following theorem (Section 3). Theorem 1 Denote by $\mathcal{M}_{r,s}$ the set of all matrices in $\mathbb{R}^{N \times K}$ of rank at least r and having at most s non-zero rows (s-row sparse). Let $A \in \mathbb{R}^{M \times N}$ be such that the induced map $X \mapsto AX$ is injective on $\mathcal{M}_{r,s}$, let $X$ be s-row sparse, $Y = AX$, and $r = \operatorname{rank}(Y)$. Then Y can be written as a product $Y = \bar{Y}Q$, where $\bar{Y} \in \mathbb{R}^{M \times r}$ with $\operatorname{rank}(\bar{Y}) = r$ and $Q \in \mathbb{R}^{r \times K}$ satisfying $QQ^T = I_r$, where $I_r$ is the identity matrix in $\mathbb{R}^{r \times r}$. The problem $$\min_{Z \in \mathbb{R}^{N \times r}} \|Z\|_{2,0} \quad \text{subject to} \quad AZ = \bar{Y} \qquad (1)$$ has a unique solution $\bar{Z}$. This solution is s-row sparse and has full column rank. The original matrix X can be computed by $X = \bar{Z}Q$.
Here, $\|Z\|_{2,0}$ denotes the number of non-zero rows of the matrix Z.
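The factorization $Y = \bar{Y}Q$ in Theorem 1 can be obtained from a thin SVD of Y. A minimal sketch in NumPy (function and variable names are ours, for illustration only) constructs the factors and checks the consistency relation $X = \bar{Z}Q$ with $\bar{Z} = XQ^T$:

```python
import numpy as np

def factorize_output(Y, tol=1e-10):
    """Factor Y = Ybar @ Q with Ybar (M x r) of full column rank
    and Q (r x K) having orthonormal rows (Q @ Q.T = I_r)."""
    U, svals, Vt = np.linalg.svd(Y, full_matrices=False)
    r = int(np.sum(svals > tol * svals[0]))   # numerical rank of Y
    Ybar = U[:, :r] * svals[:r]               # M x r, rank r
    Q = Vt[:r, :]                             # r x K, orthonormal rows
    return Ybar, Q

# small jointly sparse example: X has 2 non-zero rows
rng = np.random.default_rng(0)
N, M, K = 12, 6, 4
A = rng.standard_normal((M, N))
X = np.zeros((N, K))
X[[2, 7], :] = rng.standard_normal((2, K))
Y = A @ X
Ybar, Q = factorize_output(Y)
Zbar = X @ Q.T   # s-row sparse, full column rank, satisfies A @ Zbar = Ybar
```

Since the rows of X lie in the row space of Y here, $XQ^TQ = X$, so $\bar{Z}Q$ recovers X exactly, in line with the theorem.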
As a relaxation of the minimization problem (1) in the theorem above, we propose a new method for the recovery of jointly sparse vectors in the form of a non-convex optimization problem on the non-compact Stiefel manifold $\mathbb{R}_*^{N \times r}$ of full-rank $N \times r$ matrices: $$\min_{Z \in \mathbb{R}_*^{N \times r}} \|Z\|_{2,1} \quad \text{subject to} \quad AZ = \bar{Y}. \qquad (2)$$ For the definition of the $\ell_{2,1}$ norm, see (5). Our method generalizes the $\ell_1$ minimization method studied in [12], [17], [21], [29], [36]. We demonstrate in our numerical examples that it outperforms the $\ell_{2,1}$ minimization commonly used to solve the joint sparse recovery problem, in the sense that far fewer measurements are required for accurate sparse reconstruction.
For numerical examples, we consider the unconstrained version of (2): $$\min_{Z \in \mathbb{R}_*^{N \times r}} \|Z\|_{2,1} + \lambda \|AZ - \bar{Y}\|_F^2, \qquad (3)$$ where $\|\cdot\|_F$ is the Frobenius norm. Note that (3) is the Lagrangian form of the equivalent problem $\min \|Z\|_{2,1}$ subject to $\|AZ - \bar{Y}\|_F \le \epsilon$, for some other parameter ϵ that measures the noise level in the measurements. Just like ϵ, the choice of the tuning parameter λ is problem dependent, with small values of ϵ corresponding to large values of λ, and is often determined empirically (see [9], [23] for a more detailed discussion). Choosing λ large ensures that (3) gives the same solution as (2). Currently, there are several packages available for manifold optimization (see [18] for a detailed list). We give preference to the Pymanopt package for its automatic differentiation capabilities [31]. It is the Python version of the original Manopt package [4], [34] written for MATLAB. We also smooth out the $\ell_{2,1}$ norm with the Huber function so that we can apply the conjugate gradient algorithm in Pymanopt. For users of the Manopt package, the calculation of derivatives is needed, so we will explicitly compute the Euclidean gradient of the objective function in (3) as well as the gradient of its Huber-regularized version.
The paper is organized as follows. In the rest of Section 1, we review the single sparse and joint sparse recovery problems. Section 2 provides background results on the rank-awareness of the $\ell_{2,1}$ minimization problem for matrices. Section 3 discusses a simple strategy to reduce rank-deficient problems to the full-rank setting. We relax the $\ell_{2,0}$ minimization to a manifold optimization problem in Section 4. Section 5 discusses the Huber regularization. The calculation of the derivatives of our objective functions is presented in Section 6. Finally, numerical examples are given in Section 7.
Many signals of interest have (almost) sparse representations in some basis. For instance, images typically have sparse representations in wavelet bases, digital audio signals have sparse representations in Gabor systems, etc. Let $x \in \mathbb{R}^N$ be the coefficients of the signal in such a basis and $y \in \mathbb{R}^M$ be a vector of linear measurements of the signal. Via the Riesz representation theorem, the relationship between x and y can be written as $y = Ax$, where the matrix $A \in \mathbb{R}^{M \times N}$ can be explicitly computed.
In the classical sparse sampling and recovery problem, the goal is to find the vector x from the system $y = Ax$, where $M < N$, under the assumption that the vector x is sparse. Even though the number s of non-zero entries in x can be much smaller than the dimension N, the indices of these entries are unknown, and directly solving the system by first finding the non-zero entry indices of x is an NP-hard problem [27]. Other methods have been suggested for solving it; one of the most popular is the $\ell_1$ minimization problem $$\min_{x \in \mathbb{R}^N} \|x\|_1 \quad \text{subject to} \quad Ax = y,$$ or its unconstrained version $$\min_{x \in \mathbb{R}^N} \|x\|_1 + \lambda \|Ax - y\|_2^2.$$
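The constrained $\ell_1$ problem is a linear program after the standard splitting $x = u - v$ with $u, v \ge 0$. A minimal sketch using SciPy's `linprog` (our own illustration, not a solver used in the paper):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 subject to A x = y via the LP
    min 1^T (u + v)  s.t.  A(u - v) = y, u >= 0, v >= 0, with x = u - v."""
    M, N = A.shape
    c = np.ones(2 * N)                 # objective: sum of u and v entries
    A_eq = np.hstack([A, -A])          # equality constraint A(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    uv = res.x
    return uv[:N] - uv[N:]

rng = np.random.default_rng(1)
M, N = 10, 24
A = rng.standard_normal((M, N))
x_true = np.zeros(N)
x_true[[3, 17]] = [1.5, -2.0]          # a 2-sparse ground truth
y = A @ x_true
x_hat = basis_pursuit(A, y)
```

Since $x_{\text{true}}$ is feasible, the minimizer always satisfies $\|\hat{x}\|_1 \le \|x_{\text{true}}\|_1$; under NSP/RIP-type conditions on A the two coincide.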
It has been proved that under some conditions on the matrix A, specifically the null space property (NSP) and the restricted isometry property (RIP), these methods are able to recover the vector x exactly and stably.
In the joint sparse recovery problem we want to solve a matrix equation of the form $$Y = AX \qquad (4)$$ given that $X \in \mathbb{R}^{N \times K}$ is an s-row sparse matrix, i.e., the rows $X_{i,:}$ are zero for all indices i outside some subset of $\{1, \dots, N\}$ of size s.
Our goal is to find a method to exactly recover X from (4), where the matrix $A \in \mathbb{R}^{M \times N}$ is chosen with small M. One expects that, when increasing the number K of unknown jointly sparse vectors, a high rank of X will result in a reduction of M, which corresponds to the number of linear measurements made on the vectors. Define the $\ell_{2,1}$ norm of X: $$\|X\|_{2,1} = \sum_{i=1}^{N} \|X_{i,:}\|_2. \qquad (5)$$ Similar to the case of single vector recovery, the following convex minimization problem has been widely used in place of directly solving for the sparsest solution of (4): $$\min_{Z \in \mathbb{R}^{N \times K}} \|Z\|_{2,1} \quad \text{subject to} \quad AZ = Y. \qquad (6)$$ In the worst case (uniform recovery) scenario, however, this method does not allow measurement reduction when the number of vectors is increased. As the next theorem shows, the null space property is a necessary and sufficient condition for both single sparse recovery via $\ell_1$ minimization and joint sparse recovery via $\ell_{2,1}$ minimization. Specifically, if A fails to recover an s-sparse vector with $\ell_1$ minimization, then it will fail to recover some s-row sparse matrix via (6), no matter how high the rank of the unknown matrix is. As such, $\ell_{2,1}$ minimization is known to be a rank blind algorithm.
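For concreteness, the mixed norm in (5) is simply the sum of the Euclidean norms of the rows; a NumPy one-liner (names are ours):

```python
import numpy as np

def l21_norm(X):
    """||X||_{2,1}: sum of the Euclidean norms of the rows of X."""
    return np.sum(np.linalg.norm(X, axis=1))

X = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 5.0]])
# rows have norms 5, 0 and 5, so ||X||_{2,1} = 10
```

Like the $\ell_1$ norm for vectors, this functional is convex and promotes entire rows being zero rather than individual entries.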
Theorem 2 [22], [10] Given a matrix $A \in \mathbb{R}^{M \times N}$, every s-row sparse matrix can be exactly recovered from (6) if and only if A satisfies the NSP: $$\|v_I\|_1 < \|v_{I^c}\|_1 \quad \text{for all } v \in \ker A \setminus \{0\} \text{ and all } I \subset \{1, \dots, N\} \text{ with } |I| \le s,$$ where $v_I$ is the vector that we get after nullifying all the entries of v except the ones with indices in I. Furthermore, if there exist a $v \in \ker A \setminus \{0\}$ and an index set I with $|I| \le s$ such that $\|v_I\|_1 \ge \|v_{I^c}\|_1$, then, for any $r \le \min(s, K)$, there is an s-row sparse matrix X with $\operatorname{rank}(X) = r$ which cannot be recovered from (6).
There have been several efforts to exploit the rank of the unknown matrix and design rank aware algorithms to improve the joint sparse recovery results. Examples of such algorithms include those based on MUSIC [15], [24] and orthogonal matching pursuit [10]. On the other hand, we are not aware of available rank aware algorithms based on (convex) functional optimization, CoSaMP or thresholding approaches.
One may think of resolving the rank blindness issue of $\ell_{2,1}$ minimization by replacing the $\ell_{2,1}$ norm with a different functional, i.e., by considering the problem $$\min_{Z} \phi(Z) \quad \text{subject to} \quad AZ = Y, \qquad (7)$$ where ϕ is a convex functional on $\mathbb{R}^{N \times K}$. However, as we point out below, no such convex functional will be rank aware, and we may need to change the geometry or look into non-convex optimization for this desirable property.
First, if (7) has a unique solution for every $X \in \mathcal{M}_{r,s}$ (the class of s-row sparse matrices of rank at least r) with $Y = AX$, then, for any $r' \le r$ and $X \in \mathcal{M}_{r',s}$, it has a unique solution too. Hence (7) is rank aware only if it allows unique recovery in the class $\mathcal{M}_{r,s}$ and fails to recover a matrix in $\mathcal{M}_{r',s}$ for some $r' < r$. Observe that every $X \in \mathcal{M}_{r,s}$ is the unique solution of (7) if and only if $$\phi(X) < \phi(X + V) \quad \text{for all } V \ne 0 \text{ with } AV = 0. \qquad (8)$$ And (7) strongly fails to recover an $X \in \mathcal{M}_{r',s}$ if and only if there exists a V with $AV = 0$ such that $\phi(X + V) < \phi(X)$. However, by the continuity of ϕ and the fact that $\mathcal{M}_{r,s}$ is dense in $\mathcal{M}_{r',s}$ when $r' \le r$, (8) implies that $\phi(X) \le \phi(X + V)$, a contradiction. As a result, (7) cannot recover every matrix in $\mathcal{M}_{r,s}$ while strongly failing to recover in the class $\mathcal{M}_{r',s}$ for $r' < r$.
In this paper, we utilize the rank of Y to extract information about X and reduce the problem to a smaller dimensional problem in the full-rank setting. As a first step towards a rank aware optimization approach for uniform recovery, we develop a new non-convex optimization method on a manifold, which achieves a significant reduction in the number of measurements in joint sparse recovery, compared to $\ell_{2,1}$ minimization.
Preliminary results
This section revisits the $\ell_{2,0}$ minimization for matrices, based on results in [6], [10]. We present a uniqueness condition for sparse recovery on $\mathcal{M}_{r,s}$, the set of s-row sparse matrices in $\mathbb{R}^{N \times K}$ of rank at least r; that is, a condition for the map $X \mapsto AX$ to be injective on $\mathcal{M}_{r,s}$. Before we characterize this condition, let us make a few observations.
Definition 1 The spark of the matrix A, denoted by $\operatorname{spark}(A)$, is the smallest number j such that there exist j linearly dependent columns in A.
Note that any $\operatorname{spark}(A) - 1$ columns of A are linearly independent.
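Computing the spark is NP-hard in general, but for tiny matrices it can be found by brute force over column subsets. A sketch of Definition 1 (our illustration):

```python
import numpy as np
from itertools import combinations

def spark(A, tol=1e-10):
    """Smallest number j of linearly dependent columns of A;
    returns A.shape[1] + 1 if every column subset is independent."""
    M, N = A.shape
    for j in range(1, N + 1):
        for cols in combinations(range(N), j):
            # a set of j columns is dependent iff its rank is < j
            if np.linalg.matrix_rank(A[:, cols], tol=tol) < j:
                return j
    return N + 1

# columns c0, c1 are independent but c2 = c0 + c1, so spark = 3
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
```

Consistent with the remark above, every set of $\operatorname{spark}(A) - 1 = 2$ columns of this A is linearly independent.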
Proposition 1 If $X \mapsto AX$ is injective on $\mathcal{M}_{r,s}$ then $\operatorname{spark}(A) > 2s - r + 1$ for $1 \le r \le s$.
Reduction to full-rank setting
We present a simple way to transform the matrix equation to be solved into one where the unknown is a full-rank matrix. At the expense of an additional decomposition of the output data, the benefit of this practice is twofold. First, we reduce the dimension of the matrix equation and now find solutions in a potentially much lower dimensional space $\mathbb{R}^{N \times r}$, compared to the original one in $\mathbb{R}^{N \times K}$. More importantly, this step allows us to apply and exploit the strength of several joint sparse recovery
Non-convex manifold optimization for joint sparse recovery
In this section, we present a novel relaxation of $\ell_{2,0}$ minimization in the form of an optimization problem on a manifold.
Consider problem (10). As we know from the previous section that its solution must have full rank, we can restrict this problem to the space of all full-rank matrices in $\mathbb{R}^{N \times r}$. This adds a rank constraint to the joint sparsity recovery, and our idea is to recast the optimization problem with the full-rank condition in the Euclidean space into one on a manifold that encodes this constraint
Huber regularization
We consider the following unconstrained problem, which produces the same solution as (11) for large λ: $$\min_{Z \in \mathbb{R}_*^{N \times r}} \|Z\|_{2,1} + \lambda \|AZ - \bar{Y}\|_F^2. \qquad (12)$$ Satisfactory choices of λ are problem dependent and often determined empirically, as has been done herein. Noting that the $\ell_{2,1}$ norm is non-smooth, we further relax it with Huber regularization to be able to apply manifold optimization methods, which have mostly been developed for smooth optimization. In particular, we replace the $\ell_{2,1}$ norm with
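One standard parametrization of the Huber function, used here for illustration (the paper's exact scaling may differ), is $h_\delta(t) = t^2/(2\delta)$ for $|t| \le \delta$ and $|t| - \delta/2$ otherwise; applying it to each row norm gives a $C^1$ surrogate of $\|Z\|_{2,1}$:

```python
import numpy as np

def huber(t, delta):
    """C^1 smoothing of |t|: quadratic near 0, linear in the tails."""
    t = np.abs(t)
    return np.where(t <= delta, t**2 / (2 * delta), t - delta / 2)

def l21_huber(Z, delta):
    """Huber-regularized ||Z||_{2,1}: Huber applied to each row norm."""
    return np.sum(huber(np.linalg.norm(Z, axis=1), delta))

Z = np.array([[3.0, 4.0],
              [0.0, 0.0]])   # row norms 5 and 0
val = l21_huber(Z, 0.01)     # = 5 - 0.01/2 = 4.995
```

As δ → 0 the surrogate converges to the $\ell_{2,1}$ norm (the Γ-convergence argument mentioned in the acknowledgements makes this rigorous), while remaining differentiable at rows that vanish.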
Explicit formulae for the Euclidean gradients of cost functions
The Euclidean gradients are the fundamental building block for the optimization of cost functions on manifolds. Such derivatives are often taxing to calculate, posing a major hurdle for implementing manifold optimization methods for many problems, especially where large matrices or vectors are involved. Interestingly, for our objective functions (12) and (13), it is possible to derive the exact, explicit formulae for their gradients. Expectedly, these formulae save the users a considerable
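Assuming an objective of the form $f(Z) = \sum_i h_\delta(\|Z_{i,:}\|_2) + \lambda\|AZ - \bar{Y}\|_F^2$, with $h_\delta$ the quadratic-near-zero Huber function, the gradient has a closed form: row i of the sparsity term is $Z_{i,:}/\max(\|Z_{i,:}\|_2, \delta)$, and the fidelity term contributes $2\lambda A^T(AZ - \bar{Y})$. A sketch with a finite-difference check (our parametrization, which may differ from the paper's exact formulae):

```python
import numpy as np

def objective(Z, A, Ybar, lam, delta):
    rn = np.linalg.norm(Z, axis=1)
    hub = np.where(rn <= delta, rn**2 / (2 * delta), rn - delta / 2)
    return np.sum(hub) + lam * np.linalg.norm(A @ Z - Ybar, 'fro')**2

def euclidean_gradient(Z, A, Ybar, lam, delta):
    rn = np.linalg.norm(Z, axis=1, keepdims=True)
    grad_l21 = Z / np.maximum(rn, delta)          # row-wise Huber gradient
    return grad_l21 + 2 * lam * A.T @ (A @ Z - Ybar)

# central finite-difference check of the gradient at a random point
rng = np.random.default_rng(2)
N, M, r = 8, 5, 2
A = rng.standard_normal((M, N))
Z = rng.standard_normal((N, r))
Ybar = rng.standard_normal((M, r))
lam, delta, h = 10.0, 0.1, 1e-6
G = euclidean_gradient(Z, A, Ybar, lam, delta)
G_fd = np.zeros_like(Z)
for i in range(N):
    for j in range(r):
        E = np.zeros_like(Z); E[i, j] = h
        G_fd[i, j] = (objective(Z + E, A, Ybar, lam, delta)
                      - objective(Z - E, A, Ybar, lam, delta)) / (2 * h)
```

Such a finite-difference check is a cheap safeguard when implementing the explicit gradients in Manopt, where derivatives must be supplied by hand.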
Numerical experiments with Gaussian data and measurements
We use the Pymanopt package [31], which is the Python version of the original Manopt package [4], [34] for MATLAB, in these numerical tests. The conjugate gradient method with backtracking Armijo line-search provided in Pymanopt is used to solve (13). For automatic differentiation, we employ the Autograd package. In each iteration, an SVD is computed. For sparse recovery, since the number of columns r of the unknown matrix is small, this step is not numerically expensive. The initial value is chosen randomly.
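A Gaussian test instance of the kind described here can be generated as follows (a sketch with our own illustrative sizes); the reduction of Section 3 then shrinks the unknown from $N \times K$ to $N \times r$:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, K, s = 100, 40, 10, 8                     # illustrative problem sizes
A = rng.standard_normal((M, N)) / np.sqrt(M)    # normalized Gaussian measurements
support = rng.choice(N, size=s, replace=False)  # random row support
X = np.zeros((N, K))
X[support, :] = rng.standard_normal((s, K))     # s-row sparse, generic rank min(s, K)
Y = A @ X

# rank-based reduction: solve for an N x r unknown instead of N x K
U, svals, Vt = np.linalg.svd(Y, full_matrices=False)
r = int(np.sum(svals > 1e-10 * svals[0]))       # numerical rank of Y
Ybar, Q = U[:, :r] * svals[:r], Vt[:r, :]
```

For these sizes the reduced unknown has $100 \times 8$ entries instead of $100 \times 10$; the savings grow when K is much larger than the sparsity level.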
In this test we compare the
Conclusion and future work
We study the joint sparse recovery problem. Our result here allows us to reduce the problem to a potentially much smaller dimensional problem where the unknown matrix has full column rank. We also offer a new functional minimization method on the manifold of full-rank matrices for the recovery of the unknown matrix. As pointed out in Section 1.2, we cannot expect to have a rank sensitive convex optimization method on the Euclidean space, so changing the geometry and looking into manifold
Declaration of Competing Interest
None.
Acknowledgements
We want to thank Richard Barnard (Western Washington University) for suggesting the use of Huber regularization and Carola-Bibiane Schönlieb (University of Cambridge) for pointing out the Γ-convergence argument for Huber regularization to us. We also thank the anonymous referee for valuable suggestions.
This work is supported by the National Science Foundation, Division of Mathematical Sciences, Computational Mathematics program under contract number DMS1620280; the U.S. Department of Energy,
References (37)
- The null space property for sparse recovery from multiple measurement vectors, Appl. Comput. Harmon. Anal. (2011)
- Joint sparse model-based discriminative K-SVD for hyperspectral image classification, Signal Process. (2017)
- Structured sparse method for hyperspectral unmixing, ISPRS J. Photogramm. Remote Sens. (2014)
- Optimization Algorithms on Matrix Manifolds (2009)
- Sparse optimization with least-squares constraints, SIAM J. Optim. (2011)
- Probing the Pareto frontier for basis pursuit solutions, SIAM J. Sci. Comput. (2008)
- Manopt, a Matlab toolbox for optimization on manifolds, J. Mach. Learn. Res. (2014)
- Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory (2006)
- Theoretical results on sparse representations of multiple-measurement vectors, IEEE Trans. Signal Process. (2006)
- Hyperspectral image classification using dictionary-based sparse representation, IEEE Trans. Geosci. Remote Sens. (2011)
- An Introduction to Γ-Convergence, vol. 8
- On the prediction performance of the lasso, Bernoulli
- Rank awareness in joint sparse recovery, IEEE Trans. Inf. Theory
- Compressed sensing, IEEE Trans. Inf. Theory
- A method for finding structured sparse solutions to non-negative least squares problems with applications, SIAM J. Imaging Sci.
- Spectral–spatial hyperspectral image classification via multiscale adaptive sparse representation, IEEE Trans. Geosci. Remote Sens.
- Universal Minimum-Rate Sampling and Spectrum-Blind Reconstruction for Multiband Signals
- Spectrum-blind minimum-rate sampling and reconstruction of multiband signals, Proc. ICASSP