Analysis of the ratio of ℓ1 and ℓ2 norms in compressed sensing
Introduction
The goal of the compressed sensing (CS) problem is to seek the sparsest solution of an underdetermined linear system:

min_x ‖x‖₀ subject to Ax = b, (1)

where A ∈ ℝ^(m×n) and b ∈ ℝ^m with m < n. The ℓ0 quasinorm ‖x‖₀ measures the number of nonzero components in x. In CS applications, one typically considers x as the frame/basis coordinates of an unknown signal, and it is typically assumed that this coordinate representation is sparse, i.e., that ‖x‖₀ is "small". A is the measurement matrix that encodes linear measurements of the signal x, and b contains the corresponding measured values. In the language of signal processing, (1) is equivalent to applying the ℓ0 sparsity decoder to reconstruct a signal from the undersampled measurement pair (A, b). A naive counting argument suggests that if m measurements b of an unknown signal x are available, then we can perhaps compute the original signal coordinates x, assuming x is approximately m-sparse. The optimization (1) is the quantitative manifestation of this argument.
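A minimal Python sketch (the function name `l0_decode` and all parameters are ours, for illustration only) makes the combinatorial nature of (1) explicit: the decoder below searches supports of increasing size and solves a restricted least-squares problem on each candidate, which is exact for tiny instances but scales exponentially in n.

```python
import itertools
import numpy as np

def l0_decode(A, b, tol=1e-10):
    """Brute-force sparsity decoder for (1): search supports of increasing
    size and return the sparsest x with A x = b (via restricted least squares)."""
    m, n = A.shape
    for s in range(n + 1):
        for T in itertools.combinations(range(n), s):
            x = np.zeros(n)
            if s > 0:
                xT, *_ = np.linalg.lstsq(A[:, T], b, rcond=None)
                x[list(T)] = xT
            if np.linalg.norm(A @ x - b) <= tol:
                return x
    return None

# 2 Gaussian measurements of a 1-sparse signal in R^4 (toy example)
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4))
x_true = np.array([0.0, 3.0, 0.0, 0.0])
x_hat = l0_decode(A, A @ x_true)
```

For generic A, any two columns are linearly independent, so the 1-sparse solution above is the unique minimizer of (1); the nested loops over supports are exactly the combinatorial search that makes this approach NP-hard at scale.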
It was established in [1] that under mild conditions, (1) has a unique minimizer. In the rest of the paper we assume that the minimizer is unique and denote it by x̂. One of the central problems in compressed sensing is to design an effective algorithm to find x̂: directly solving (1) via combinatorial search is NP-hard [2]. A more practical approach, proposed in the seminal work [3], is to relax the sparsity measure ‖·‖₀ to the convex ℓ1 norm ‖·‖₁:

min_x ‖x‖₁ subject to Ax = b. (2)
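Problem (2) can be cast as a linear program via the standard splitting x = u − v with u, v ≥ 0, minimizing 1ᵀ(u + v). The sketch below (the helper name `basis_pursuit` is ours; it assumes SciPy's `linprog` is available) illustrates this reformulation; with enough Gaussian measurements, the ℓ1 minimizer typically coincides with the sparsest solution.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, b):
    """Solve min ||x||_1 s.t. A x = b via the LP reformulation
    x = u - v, u, v >= 0, minimizing 1^T (u + v)."""
    m, n = A.shape
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])          # A(u - v) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
    u, v = res.x[:n], res.x[n:]
    return u - v

rng = np.random.default_rng(1)
A = rng.standard_normal((12, 30))      # 12 Gaussian measurements in R^30
x_true = np.zeros(30)
x_true[[3, 17]] = [2.0, -1.5]          # a 2-sparse signal
x_hat = basis_pursuit(A, A @ x_true)
```

Since x_true is feasible for (2), the LP solution is always feasible and has ℓ1 norm no larger than ‖x_true‖₁; this is the convexity that makes (2) tractable.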
The convexity of problem (2) ensures that efficient algorithms can be leveraged to compute solutions. Many pioneering works in compressed sensing have focused on understanding the equivalence between (1) and (2), see [3], [4], [5]. A major theoretical cornerstone of such equivalence results is the Null Space Property (NSP), which was first introduced in [6] and plays a crucial role in establishing necessary and sufficient conditions for the equivalence between (1) and (2). A sufficient condition for such an equivalence is called an exact recovery condition. A closely related but stronger condition is the Restricted Isometry Property (RIP), see [4]. The RIP is more flexible than the NSP for practical usage, yet conditions given by both the NSP and the RIP are hard to verify when the measurements (i.e., the matrix A) are deterministically sampled. An alternative approach based on analyzing the mutual coherence of A produces a practically computable but suboptimal condition, see [1]. We will use a slightly more general definition of the NSP that was introduced in [7]:
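As a concrete illustration of the coherence-based route, the snippet below (names ours) computes the mutual coherence μ(A), the largest absolute inner product between distinct normalized columns, together with the classical coherence bound, which guarantees recovery of s-sparse signals whenever s < (1 + 1/μ)/2; this quantity is cheap to compute but typically quite pessimistic.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct normalized columns of A."""
    G = A / np.linalg.norm(A, axis=0)   # normalize columns
    C = np.abs(G.T @ G)
    np.fill_diagonal(C, 0.0)            # ignore self-correlations
    return C.max()

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 60))
mu = mutual_coherence(A)
# classical coherence bound: s-sparse recovery is guaranteed when
# s < (1 + 1/mu) / 2
s_max = int(np.floor((1 + 1 / mu) / 2))
```

For random Gaussian matrices of this size, μ is a moderate constant, so the guaranteed sparsity level s_max is small, which is what "suboptimal" means in practice.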
Definition 1.1 (Null space property). Given 1 ≤ s ≤ n and γ > 0, a matrix A ∈ ℝ^(m×n) satisfies the (s, γ)-NSP in the ℓq quasi-norm (0 < q ≤ 1) if for all h ∈ ker(A) \ {0} and T ∈ [n]_s, we have

‖h_T‖_q ≤ γ ‖h_{T^c}‖_q.

Here, [n]_s is the collection of all subsets of [n] = {1, …, n} with cardinality at most s, h_T is the restriction of h to the index set T, and T^c = [n] \ T.
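Certifying the NSP for a given deterministic matrix is itself hard. The following Monte-Carlo sketch (ours, a heuristic only, not a certificate) samples null-space directions and, for q = 1, evaluates the worst ratio over index sets T, which is attained at the s largest entries of |h|; the result is a lower bound on any γ for which the (s, γ)-NSP could hold.

```python
import numpy as np

def nsp_ratio(A, s, trials=2000, seed=0):
    """Monte-Carlo lower bound on the smallest gamma such that A could satisfy
    the (s, gamma)-NSP in the l1 norm. For each sampled h in ker(A), the worst
    index set T of size s consists of the s largest entries of |h|."""
    rng = np.random.default_rng(seed)
    _, _, Vt = np.linalg.svd(A)                  # rows of Vt beyond rank(A)
    N = Vt[np.linalg.matrix_rank(A):].T          # span ker(A); N is n x dim(ker)
    worst = 0.0
    for _ in range(trials):
        h = N @ rng.standard_normal(N.shape[1])  # random null-space vector
        a = np.sort(np.abs(h))[::-1]
        worst = max(worst, a[:s].sum() / a[s:].sum())
    return worst

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 40))
gamma_lb = nsp_ratio(A, s=2)
```

A value below 1 over many samples is consistent with (though does not prove) the NSP with γ < 1 for that sparsity level.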
Nearly all exact recovery conditions based on the RIP are probabilistic in nature. Such an analysis typically splits into two major thrusts: (i) the first establishes that (1) and (2) are equivalent for a class of sparse signals if A satisfies an RIP condition, and (ii) the second proves that the RIP condition is achievable with high probability for a suitable random matrix A. Such probabilistic arguments appear to be necessary in the RIP analysis in order to rule out pathological measurement configurations.
Under proper randomness assumptions, an alternative approach that circumvents the RIP also yields fruitful results in the study of (2), see [8], [9]. This approach relies more on a geometric characterization of the null space of the measurement matrix, and can therefore potentially be adapted to analyzing non-convex objectives with similar geometric interpretations. We take this approach for our analysis in this paper.
Although (2) has attracted a lot of interest in the past decades, it has been realized that ℓ1 minimization is often less effective at promoting sparsity than certain alternative objective functions, in particular computationally efficient non-convex ones. This motivates the study of non-convex relaxation methods (using non-convex objectives to approximate ℓ0), which are believed to be more sparsity-aware. Many non-convex approaches, such as ℓq minimization (0 < q < 1) [10], [11], [7], reweighted ℓ1 [12], CoSaMP [13], IHT [14], [15], [16], and ℓ1−ℓ2 and ℓ1/ℓ2 [17], [18], [15], [19], have been empirically shown to outperform ℓ1 in certain contexts. However, relatively few such approaches have been successfully analyzed for theoretical recovery. In fact, obtaining exact recovery conditions and robustness analysis for general non-convex optimization problems is difficult, unless the objective possesses certain exploitable structure, see [20], [21].
We aim to investigate exact recovery conditions as well as robustness for the ℓ1/ℓ2 objective in this paper. We are interested in providing conditions under which (1) is equivalent to the following problem:

min_{x≠0} ‖x‖₁/‖x‖₂ subject to Ax = b. (4)

To the best of our knowledge, ℓ1/ℓ2 does not belong to any particular class of non-convex functions with a systematic theory behind it, mainly because ‖x‖₁/‖x‖₂ is neither convex nor concave, and is not even globally continuous. However, a few observations make this non-convex objective worth investigating. First, it was shown numerically in [22] that ℓ1/ℓ2 outperforms ℓ1 by a notable margin in jointly sparse recovery problems, in the sense that many fewer measurements are required to achieve the same recovery rate; in particular, ℓ1/ℓ2 admits a high-dimensional generalization called the orthogonal factor, and the corresponding minimization problem can be solved effectively using modern methods of manifold optimization. Understanding ℓ1/ℓ2 in one dimension offers a baseline for its higher-dimensional counterparts. Secondly, in the matrix completion problem [23], one seeks a matrix of minimal rank subject to componentwise constraints. Note that the rank of a matrix is the ℓ0 measure of its singular value vector σ. Natural relaxations of the rank to more regular objectives replace this ℓ0 measure by a ratio of norms of σ; for instance, the numerical/stable rank is defined by ‖σ‖₂²/‖σ‖_∞² = ‖A‖_F²/‖A‖₂². This suggests that the ratio between different norms can be a useful function for measuring the sparsity (complexity) of an object, and it leads us to study the ℓ1/ℓ2 objective in compressed sensing.
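The following small sketch (ours) illustrates why ‖x‖₁/‖x‖₂ is a scale-invariant sparsity proxy, ranging from 1 for 1-sparse vectors to √n for flat vectors, and how the analogous ratio of singular values yields the stable rank mentioned above.

```python
import numpy as np

def l1_over_l2(x):
    """The l1/l2 sparsity measure: scale-invariant, equal to 1 for
    1-sparse vectors and sqrt(len(x)) for flat vectors."""
    x = np.asarray(x, dtype=float)
    return np.abs(x).sum() / np.linalg.norm(x)

sparse = np.array([5.0, 0, 0, 0, 0, 0, 0, 0, 0])
flat = np.ones(9)
assert np.isclose(l1_over_l2(sparse), 1.0)                   # 1-sparse -> 1
assert np.isclose(l1_over_l2(flat), 3.0)                     # flat -> sqrt(9)
assert np.isclose(l1_over_l2(10 * flat), l1_over_l2(flat))   # scale invariance

def stable_rank(M):
    """Numerical/stable rank ||M||_F^2 / ||M||_2^2, a norm-ratio relaxation
    of rank applied to the singular value vector."""
    s = np.linalg.svd(M, compute_uv=False)
    return (s ** 2).sum() / s[0] ** 2
```

The discontinuity of the objective is also visible here: as a single entry of a flat vector shrinks to zero, the ratio jumps, which is part of why ℓ1/ℓ2 falls outside standard non-convex frameworks.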
A few attempts have been made recently to reveal both the theory and the applications behind the ℓ1/ℓ2 problem [24], [25], [15], [19], [26]. However, the existing analysis is either applicable only to non-negative signals, or yields a local optimality condition that is often too strict in practice. The investigation of efficient algorithms for solving the ℓ1/ℓ2 minimization problem is also an active area of research [19], [27], [28], [29], [30], [31].
Our contributions in this paper are two-fold. First, we propose a new local optimality criterion which provides evidence that a large "dynamic range" may lead to better performance of an ℓ1/ℓ2 procedure, as was observed in [27]. We also make a first attempt at analyzing the exact recovery condition (global optimality) of ℓ1/ℓ2; a sufficient condition for uniform recoverability as well as some analysis of robustness to noise are also given. We additionally provide numerical demonstrations, in which a novel initialization step for the optimization is proposed and explored to improve the performance of existing algorithms. We remark that since this problem is non-convex, none of the results in this paper are tight; they serve only as initial insight into certain aspects of the method that have been observed in practice.
The rest of the paper is organized as follows. In Section 2 we briefly introduce the results in [8] and [9] obtained by a high-dimensional geometry approach, which are relevant to our analysis. In Section 3, we give a new local optimality condition ensuring that an s-sparse vector is a local minimizer of the ℓ1/ℓ2 problem. In Section 4 we investigate the uniform recoverability of ℓ1/ℓ2 minimization and propose a new sufficient condition for this recoverability; we also show that this condition is easily satisfied for a large class of random matrices. In Section 5, we give some elementary analysis of how the solution to the ℓ1/ℓ2 minimization problem is affected by the presence of noise. In Section 6, we provide numerical experiments to support our findings and propose a novel initialization technique that further improves an existing algorithm from [27]. In Section 7, we summarize our findings and point out some possible directions for future investigation.
A geometric perspective on ℓ1 minimization
Geometric interpretations of compressed sensing first appeared in an abstract formulation of the problem in Donoho's original work [3]. In this section, we present a selection of geometric views on ℓ1 minimization based on the discussions in [9] and [8], which do not hinge on RIP analysis. We will see that they provide valuable insight for our analysis of (1) in the case of non-convex relaxation.
To interpret (2) geometrically, we assume that the entries of A are i.i.d. standard normal, i.e., A_ij ∼ N(0, 1).
A local optimality criterion
In this section, we give a sufficient condition for an s-sparse signal x̂ to be a local minimizer of (4). Compared to the global optimality condition obtained later in Section 4, the local optimality condition in this section aids in understanding the behavior of the optimization near x̂. This is important in practice since many non-convex algorithms have only local convergence guarantees. Our local optimality result is signal-dependent but it offers asymptotically weaker and
A sufficient condition for exact recovery
In this section, we propose a sufficient condition guaranteeing uniform exact recovery of sparse vectors via ℓ1/ℓ2 minimization. As we will see, the condition to be obtained holds with overwhelming probability for a large class of sub-Gaussian random matrices. Since our approach for deriving this recoverability condition applies to other situations as well, we consider the following problem, which is slightly more general than (4): for a fixed objective function f, consider the optimization
Robustness analysis
In this section, we discuss the robustness of ℓ1/ℓ2 minimization when noise is present. As in other compressed sensing results, we assume that b is contaminated by some noise e: b = Ax̂ + e, where x̂ is a sparse vector. If the size (say, the ℓ2 norm) of e is bounded by an a priori known quantity ε, then the denoising problem can be stated as

min_{x≠0} ‖x‖₁/‖x‖₂ subject to ‖Ax − b‖₂ ≤ ε. (35)
Let x_ε be a minimizer of (35). Then ‖Ax_ε − b‖₂ ≤ ε, and necessarily, by the triangle inequality,

‖A(x_ε − x̂)‖₂ ≤ ‖Ax_ε − b‖₂ + ‖b − Ax̂‖₂ ≤ ε + ‖e‖₂ ≤ 2ε.

This inequality will play an
Numerical experiments
In this section, we present several numerical simulations to complement our previous theoretical investigation. All the results in this section are repeatable on a standard laptop with R [39]. For simplicity, we restrict attention to the noiseless case; the noisy case can be carried out similarly by tuning the parameter of the penalty term arising from the constraint. Since ℓ1/ℓ2 minimization problems are non-convex, our algorithms solve the problem only approximately. In particular, we utilize
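The snippet above is truncated before the algorithmic details. As a hedged baseline sketch (all names and parameters below are ours, not the algorithm of [27]), a naive local search for (4) can parametrize the affine feasible set by a null-space basis and descend along a subgradient of the ratio:

```python
import numpy as np

def l1l2_descent(A, b, x0, steps=3000, lr=1e-3):
    """Naive local search for min ||x||_1/||x||_2 s.t. Ax = b: write feasible
    points as x = x0 + N z with N an orthonormal basis of ker(A) (A assumed
    full row rank), and run projected subgradient descent on the ratio."""
    _, _, Vt = np.linalg.svd(A)
    N = Vt[A.shape[0]:].T                       # n x (n - m) null-space basis
    obj = lambda v: np.abs(v).sum() / np.linalg.norm(v)
    x, best = x0.copy(), x0.copy()
    for _ in range(steps):
        n1, n2 = np.abs(x).sum(), np.linalg.norm(x)
        g = np.sign(x) / n2 - (n1 / n2 ** 3) * x   # subgradient of l1/l2
        x = x - lr * (N @ (N.T @ g))               # step restricted to ker(A)
        if obj(x) < obj(best):                     # keep the best iterate
            best = x.copy()
    return best

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 20))
x_true = np.zeros(20)
x_true[[2, 9]] = [4.0, -1.0]
b = A @ x_true
x0 = np.linalg.pinv(A) @ b                     # dense least-squares start
x_out = l1l2_descent(A, b, x0)
```

Tracking the best iterate guarantees the returned point is feasible and no worse than the initial least-squares guess; dedicated solvers such as the ADMM-type methods cited above are far more effective, and the role of initialization is precisely what the experiments in this section explore.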
Conclusion and future work
We have theoretically and numerically investigated the ℓ1/ℓ2 minimization problem in the context of recovering sparse signals from a small number of measurements. We have provided a novel local optimality criterion in Theorem 3.1, which gives some theoretical justification to the empirical observation that ℓ1/ℓ2 performs better when the nonzero entries of the sparse solution have a large dynamic range. We also provided a uniform recoverability condition in Theorem 4.1 for the
Declaration of Competing Interest
The authors declare that they have no competing interests.
Acknowledgements
We would like to thank the anonymous referees for their very helpful comments, which significantly improved the presentation of the paper. The first author thanks Tom Alberts, You-Cheng Chou and Dong Wang for constructive discussions. The first and second authors acknowledge partial support from NSF DMS-1848508. The third author acknowledges support from the Scientific Discovery through Advanced Computing (SciDAC) program
References (48)
- et al., Sparsest solutions of underdetermined linear systems via ℓq-minimization for 0 < q ≤ 1, Appl. Comput. Harmon. Anal. (2009)
- et al., Highly sparse representations from dictionaries are unique and independent of the sparseness measure, Appl. Comput. Harmon. Anal. (2007)
- et al., Iterative hard thresholding for compressed sensing, Appl. Comput. Harmon. Anal. (2009)
- et al., A class of null space conditions for sparse recovery via nonconvex, non-separable minimizations, Results Appl. Math. (2019)
- et al., Reconstruction of jointly sparse vectors via manifold optimization, Appl. Numer. Math. (2019)
- et al., The Gelfand widths of ℓp-balls for 0 < p ≤ 1, J. Complex. (2010)
- et al., On the number of iterations for convergence of CoSaMP and subspace pursuit algorithms, Appl. Comput. Harmon. Anal. (2017)
- et al., Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization, Proc. Natl. Acad. Sci. USA (2003)
- Sparse approximate solutions to linear systems, SIAM J. Comput. (1995)
- Compressed sensing, IEEE Trans. Inf. Theory (2006)
- Near-optimal signal recovery from random projections: universal encoding strategies?, IEEE Trans. Inf. Theory
- Uncertainty principles and ideal atomic decomposition, IEEE Trans. Inf. Theory
- Compressed sensing and best k-term approximation, J. Am. Math. Soc.
- Theory of compressive sensing via ℓ1-minimization: a non-RIP analysis and extensions, J. Oper. Res. Soc. China
- Estimation in high dimensions: a geometric perspective
- Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Process. Lett.
- Enhancing sparsity by reweighted ℓ1 minimization
- CoSaMP, Commun. ACM
- Ratio and difference of ℓ1 and ℓ2 norms and sparse representation with coherent dictionaries, Commun. Inf. Syst.
- Minimization of ℓ1−2 for compressed sensing, SIAM J. Sci. Comput.
- Non-negative sparse coding
- Comparing measures of sparsity
- A scale-invariant approach for sparse signal recovery, SIAM J. Sci. Comput.
- A unified approach to model selection and sparse recovery using regularized least squares, Ann. Stat.