Analysis of the ratio of $\ell_1$ and $\ell_2$ norms in compressed sensing

https://doi.org/10.1016/j.acha.2021.06.006

Abstract

We study the ratio of the $\ell_1$ and $\ell_2$ norms ($\ell_1/\ell_2$) as a sparsity-promoting objective in compressed sensing. We first propose a novel criterion guaranteeing that an $s$-sparse signal is a local minimizer of the $\ell_1/\ell_2$ objective; our criterion is interpretable and useful in practice. We also give the first uniform recovery condition, using a geometric characterization of the null space of the measurement matrix, and show that this condition is satisfied by a class of random matrices. We further analyze the robustness of the procedure when noise pollutes the data. Numerical experiments are provided that compare $\ell_1/\ell_2$ with some other popular non-convex methods in compressed sensing. Finally, we propose a novel initialization approach to accelerate the numerical optimization procedure. We call this initialization approach support selection, and we demonstrate that it empirically improves the performance of existing $\ell_1/\ell_2$ algorithms.

Introduction

The goal of the compressed sensing (CS) problem is to seek the sparsest solution of an underdetermined linear system:
$$\min_x \|x\|_0 \quad \text{subject to} \quad Ax = b, \qquad (1)$$
where $x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, and $A \in \mathbb{R}^{m \times n}$ with $m \ll n$. The quasi-norm $\|x\|_0$ measures the number of nonzero components of $x$. In CS applications, one typically considers $x$ as the frame/basis coordinates of an unknown signal, and it is typically assumed that this coordinate representation is sparse, i.e., that $\|x\|_0$ is "small". $A$ is the measurement matrix that encodes linear measurements of the signal $x$, and $b$ contains the corresponding measured values. In the language of signal processing, (1) is equivalent to applying the sparsity ($\ell_0$) decoder to reconstruct a signal from the undersampled measurement pair $(A, b)$. A naive, empirical counting argument suggests that if $m \ll n$ measurements $b$ of an unknown signal $x$ are available, then we can perhaps compute the original signal coordinates $x$, assuming $x$ is approximately $m$-sparse. The optimization problem (1) is the quantitative manifestation of this argument.
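For concreteness, the following short Python sketch (our own illustration, not from the paper, whose experiments use R) constructs a synthetic instance of (1); the dimensions $m$, $n$ and the sparsity level $s$ are arbitrary illustrative choices.

```python
import numpy as np

# Minimal synthetic instance of problem (1). The dimensions m, n and
# sparsity level s are illustrative choices, not values from the paper.
rng = np.random.default_rng(0)
m, n, s = 50, 200, 5

x0 = np.zeros(n)                               # s-sparse ground truth
support = rng.choice(n, size=s, replace=False)
x0[support] = rng.standard_normal(s)

A = rng.standard_normal((m, n))                # measurement matrix, m << n
b = A @ x0                                     # undersampled measurements

print(np.count_nonzero(x0))                    # ||x0||_0 = s
```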

It was established in [1] that under mild conditions, (1) has a unique minimizer. In the rest of the paper we assume that the minimizer is unique and denote it by $x_0$. One of the central problems in compressed sensing is to design an effective algorithm to find $x_0$: directly solving (1) via combinatorial search is NP-hard [2]. A more practical approach, proposed in the seminal work [3], is to relax the sparsity measure $\|\cdot\|_0$ to the convex $\ell_1$ norm $\|\cdot\|_1$:
$$\min_x \|x\|_1 \quad \text{subject to} \quad Ax = b. \qquad (2)$$
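Problem (2), known as basis pursuit, is a linear program in disguise and can be handed to off-the-shelf LP solvers. Below is a minimal Python sketch using the textbook epigraph reformulation (again our own illustration, not the paper's code):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, b):
    """Solve (2): min ||x||_1 subject to Ax = b, as a linear program.

    Epigraph trick: with stacked variables z = [x; t], minimize sum(t)
    subject to Ax = b and -t <= x <= t, so t_i = |x_i| at the optimum.
    """
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])   # objective: sum(t)
    A_eq = np.hstack([A, np.zeros((m, n))])         # Ax = b
    I = np.eye(n)
    A_ub = np.vstack([np.hstack([I, -I]),           #  x - t <= 0
                      np.hstack([-I, -I])])         # -x - t <= 0
    b_ub = np.zeros(2 * n)
    bounds = [(None, None)] * n + [(0, None)] * n   # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                  bounds=bounds, method="highs")
    return res.x[:n]
```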

The convexity of problem (2) ensures that efficient algorithms can be leveraged to compute solutions. Many pioneering works in compressed sensing have focused on understanding the equivalence between (1) and (2); see [3], [4], [5]. A major theoretical cornerstone of such equivalence results is the Null Space Property (NSP), which was first introduced in [6] and plays a crucial role in establishing sufficient and necessary conditions for the equivalence between (1) and (2). A sufficient condition for such an equivalence is called an exact recovery condition. A closely related but stronger condition is the Restricted Isometry Property (RIP); see [4]. The RIP is more flexible than the NSP for practical usage, yet conditions given by both the NSP and the RIP are hard to verify when the measurements (i.e., the matrix $A$) are deterministically sampled. An alternative approach based on analyzing the mutual coherence of $A$ produces a practically computable but suboptimal condition; see [1]. We will use a slightly more general definition of the NSP that was introduced in [7]:

Definition 1.1 Null space property

Given $s \in \mathbb{N}$ and $c > 0$, a matrix $A \in \mathbb{R}^{m \times n}$ satisfies the $(s, c)$-NSP in the quasi-norm $\ell_q$ ($0 < q \le 1$) if for all $h \in \ker(A)$ and $T \in [n]_s$, we have
$$\|h_T\|_q^q < c\,\|h_{T^c}\|_q^q.$$
Here, $[n]_s$ is the collection of all subsets of $\{1, \ldots, n\}$ with cardinality at most $s$:
$$[n]_s := \{T \subseteq [n] : |T| \le s\}, \qquad [n] := \{1, \ldots, n\},$$
$h_T$ is the restriction of $h$ to the index set $T$, and $T^c := [n] \setminus T$.
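Verifying the NSP for a given deterministic matrix is hard in general. The Python sketch below (our own illustration; the function name and sampling scheme are not from the paper) can only search for a violating $h$ by sampling $\ker(A)$: for a fixed $h$, the binding index set $T$ consists of the $s$ largest entries of $|h|$, so only that $T$ needs to be tested.

```python
import numpy as np
from scipy.linalg import null_space

def nsp_violation_found(A, s, c, q=1.0, trials=2000, seed=0):
    """Search for h in ker(A) violating the (s, c)-NSP of Definition 1.1.

    Sampling can refute the NSP (by exhibiting a witness h) but can
    never certify that it holds.
    """
    rng = np.random.default_rng(seed)
    N = null_space(A)                         # orthonormal basis of ker(A)
    for _ in range(trials):
        h = N @ rng.standard_normal(N.shape[1])
        p = np.sort(np.abs(h) ** q)[::-1]     # |h_i|^q, in decreasing order
        if p[:s].sum() >= c * p[s:].sum():    # ||h_T||_q^q >= c ||h_{T^c}||_q^q
            return True                       # NSP fails for this h
    return False                              # inconclusive: no witness found
```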

Nearly all exact recovery conditions based on the RIP are probabilistic in nature. Such analysis typically splits into two major thrusts: (i) establishing that (1) and (2) are equivalent for a class of sparse signals whenever $A$ satisfies an RIP condition, and (ii) proving that the RIP condition is achievable with high probability for a suitable random matrix $A$. Such randomness arguments appear to be necessary in practice for RIP analysis in order to rule out pathological measurement configurations.

Under proper randomness assumptions, an alternative approach that circumvents the RIP also yields fruitful results in the study of (2); see [8], [9]. This approach relies more on a geometric characterization of the null space of the measurements, and therefore can potentially be adapted to the analysis of non-convex objectives with similar geometric interpretations. We take this approach for the analysis in this paper.

Although (2) has attracted much interest in the past decades, the community has realized that $\ell_1$ minimization is not as effective at computing sparsity-promoting solutions as some other objective functions, in particular computationally efficient non-convex ones. This motivates the study of non-convex relaxation methods (which use non-convex objectives to approximate $\|\cdot\|_0$), believed to be more sparsity-aware. Many non-convex objectives and algorithms, such as $\ell_q$ ($0 < q < 1$) [10], [11], [7], reweighted $\ell_1$ [12], CoSaMP [13], IHT [14], $\ell_1 - \ell_2$ [15], [16], and $\ell_1/\ell_2$ [17], [18], [15], [19], have been empirically shown to outperform $\ell_1$ in certain contexts. However, relatively few such approaches have been successfully analyzed for theoretical recovery guarantees. In fact, obtaining exact recovery conditions and robustness analysis for general non-convex optimization problems is difficult unless the objective possesses certain exploitable structure; see [20], [21].

In this paper we aim to investigate exact recovery conditions as well as robustness for the $\ell_1/\ell_2$ objective. We are interested in providing conditions under which (1) is equivalent to the following problem:
$$\min_x \frac{\|x\|_1}{\|x\|_2} \quad \text{subject to} \quad Ax = b. \qquad (4)$$
To the best of our knowledge, $\ell_1/\ell_2$ does not belong to any particular class of non-convex functions with a systematic theory behind it, mainly because $\ell_1/\ell_2$ is neither convex nor concave, and is not even globally continuous. However, a few observations make this non-convex objective worth investigating. First, it was shown numerically in [22] that $\ell_1/\ell_2$ outperforms $\ell_1$ by a notable margin in jointly sparse recovery problems (in the sense that many fewer measurements are required to achieve the same recovery rate); in particular, $\ell_1/\ell_2$ admits a high-dimensional generalization called the orthogonal factor, and the corresponding minimization problem can be solved effectively using modern methods of manifold optimization. Understanding $\ell_1/\ell_2$ in one dimension would offer a baseline for its higher-dimensional counterparts. Secondly, in the matrix completion problem [23], one desires a matrix of minimal rank subject to component constraints. Note that the rank of a matrix is the $\ell_0$ measure of its singular value vector. Natural relaxations of rank to more regular objectives include the so-called numerical intrinsic rank, defined as the ratio $\ell_1/\ell_\infty$ of the singular value vector, and the numerical/stable rank, defined as the ratio $\ell_2/\ell_\infty$ of the singular value vector. This suggests that the ratio between different norms might be a useful function to measure the sparsity (complexity) of an object, and therefore leads us to study the $\ell_1/\ell_2$ objective in compressed sensing.
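A quick Python illustration (ours, not the paper's) of why $\ell_1/\ell_2$ is a plausible sparsity measure: the ratio is invariant to rescaling, is at most $\sqrt{s}$ for $s$-sparse vectors, and grows like $\sqrt{n}$ for generic dense vectors.

```python
import numpy as np

def l1_over_l2(x):
    """Scale-invariant sparsity measure: 1 for 1-sparse vectors,
    sqrt(n) for perfectly flat dense vectors."""
    x = np.asarray(x, dtype=float)
    return np.abs(x).sum() / np.linalg.norm(x)

rng = np.random.default_rng(1)
n = 100
sparse = np.zeros(n); sparse[:5] = rng.standard_normal(5)
dense = rng.standard_normal(n)

print(l1_over_l2(sparse))         # at most sqrt(5) ~ 2.24
print(l1_over_l2(dense))          # ~ sqrt(2/pi) * sqrt(n) ~ 8 for Gaussians
print(l1_over_l2(10.0 * sparse))  # identical to the first value
```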

A few attempts have been made recently to develop both the theory and the applications behind the $\ell_1/\ell_2$ problem [24], [25], [15], [19], [26]. However, the existing analysis is either applicable only to non-negative signals, or yields a local optimality condition that is often too strict in practice. The investigation of efficient algorithms for solving the $\ell_1/\ell_2$ minimization problem is also an active area of research [19], [27], [28], [29], [30], [31].

Our contributions in this paper are two-fold. First, we propose a new local optimality criterion which provides evidence that a large "dynamic range" may lead to better performance of an $\ell_1/\ell_2$ procedure, as was observed in [27]; we also make a first attempt at analyzing the exact recovery condition (global optimality) of $\ell_1/\ell_2$, giving a sufficient condition for uniform recoverability as well as some analysis of robustness to noise. Second, we provide numerical demonstrations, in which a novel initialization step for the optimization is proposed and explored to improve the performance of existing algorithms. We remark that since this problem is non-convex, none of the results in this paper are tight; they serve only as initial insight into certain aspects of the method that have been observed in practice.

The rest of the paper is organized as follows. In Section 2 we briefly introduce the results of [8] and [9], obtained by a high-dimensional geometry approach, which are relevant to our analysis. In Section 3 we give a new local optimality condition ensuring that an $s$-sparse vector is a local minimizer of the $\ell_1/\ell_2$ problem. In Section 4 we investigate the uniform recoverability of $\ell_1/\ell_2$ and propose a new sufficient condition for this recoverability; we also show that this condition is easily satisfied by a large class of random matrices. In Section 5 we give some elementary analysis of how the solution to the $\ell_1/\ell_2$ minimization problem is affected by the presence of noise. In Section 6 we provide numerical experiments to support our findings and propose a novel initialization technique that further improves an existing $\ell_1/\ell_2$ algorithm from [27]. In Section 7 we summarize our findings and point out some possible directions for future investigation.


A geometric perspective on $\ell_1$ minimization

The geometric interpretation of compressed sensing first appeared in an abstract formulation of the problem in Donoho's original work [3]. In this section, we present a selection of geometric views on $\ell_1$ minimization based on the discussions in [9] and [8], which do not hinge on RIP analysis. We will see that they provide valuable insight for our analysis of (1) in the case of non-convex relaxation.

To interpret (2) geometrically, we assume that the entries of $A$ are i.i.d. standard normal, i.e., $(A)_{i,j} \sim \mathcal{N}(0, 1)$ …

A local optimality criterion

In this section, we give a sufficient condition for an $s$-sparse signal $x_0$ to be a local minimizer of (4) with $b := Ax_0$. Compared to the global optimality condition obtained later in Section 4, the local optimality condition in this section aids in understanding the behavior of $\ell_1/\ell_2$ optimization near $x_0$. This is important in practice since many non-convex algorithms only have local convergence guarantees. Our local optimality result is signal-dependent but offers asymptotically weaker and …

A sufficient condition for exact recovery

In this section, we propose a sufficient condition that guarantees uniform exact recovery of sparse vectors using $\ell_1/\ell_2$. As we will see, the condition holds with overwhelming probability for a large class of sub-Gaussian random matrices. Since our approach for deriving this recoverability condition applies to other situations as well, we consider the following problem, which is slightly more general than (4): for fixed $0 < q \le 1$, consider the optimization $\min_x \|x\|_q / \|x\|_2$ subject to …

Robustness analysis

In this section, we discuss the robustness of $\ell_1/\ell_2$ minimization when noise is present. As in other compressed sensing results, we assume that $b$ is contaminated by some noise $e$: $b = Ax_0 + e \in \mathbb{R}^m$, where $x_0$ is a sparse vector. If the size (say, the $\ell_2$ norm) of $e$ is bounded by an a priori known quantity $\varepsilon$, then the $\ell_1/\ell_2$ denoising problem can be stated as
$$\min_x \frac{\|x\|_1}{\|x\|_2} \quad \text{subject to} \quad \|Ax - b\|_2 \le \varepsilon. \qquad (35)$$

Let $x$ be a minimizer of (35). Then $\|Ax - Ax_0\|_2 \le 2\varepsilon$, and necessarily,
$$\frac{\|x\|_1}{\|x\|_2} \le \frac{\|x_0\|_1}{\|x_0\|_2} \le \sqrt{s}.$$
This inequality will play an …
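The last bound is the Cauchy–Schwarz inequality applied on the support $S$ of $x_0$ (with $|S| \le s$):
$$\|x_0\|_1 = \sum_{i \in S} |(x_0)_i| \cdot 1 \le \Big(\sum_{i \in S} (x_0)_i^2\Big)^{1/2} |S|^{1/2} \le \sqrt{s}\,\|x_0\|_2.$$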

Numerical experiments

In this section, we present several numerical simulations to complement our theoretical investigation. All results in this section are reproducible on a standard laptop with R installed [39]. For simplicity, we restrict to the noiseless case; the noisy case can be treated similarly by tuning the parameter of the penalty term arising from the constraint. Since $\ell_1/\ell_2$ minimization problems are non-convex, our algorithms solve the problem approximately. In particular, we utilize …
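For orientation only, here is a generic projected-subgradient sketch for (4) in Python; it is neither the algorithm of [27] nor the paper's support-selection procedure, and the step size and iteration count are arbitrary illustrative choices.

```python
import numpy as np

def l1_over_l2_descent(A, b, steps=500, eta=0.05):
    """Generic feasible-descent sketch for min ||x||_1/||x||_2 s.t. Ax = b."""
    pinvA = np.linalg.pinv(A)
    x = pinvA @ b                              # feasible start: min-norm solution
    P = np.eye(A.shape[1]) - pinvA @ A         # orthogonal projector onto ker(A)
    for _ in range(steps):
        n1, n2 = np.abs(x).sum(), np.linalg.norm(x)
        g = np.sign(x) / n2 - (n1 / n2**3) * x # subgradient of ||x||_1/||x||_2
        x = x - eta * (P @ g)                  # move within {x : Ax = b}
    return x
```

Because the objective is non-convex, such an iteration can stall at poor local minimizers, which is precisely why initialization, such as the support-selection step proposed in this section, matters.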

Conclusion and future work

We have theoretically and numerically investigated the $\ell_1/\ell_2$ minimization problem in the context of recovering sparse signals from a small number of measurements. We have provided a novel local optimality criterion in Theorem 3.1, which gives some theoretical justification to the empirical observation that $\ell_1/\ell_2$ performs better when the nonzero entries of the sparse solution have a large dynamic range. We also provided a uniform recoverability condition in Theorem 4.1 for the $\ell_1/\ell_2$ …

Declaration of Competing Interest

There is none.

Acknowledgements

We would like to thank the anonymous referees for their very helpful comments, which significantly improved the presentation of the paper. The first author thanks Tom Alberts, You-Cheng Chou and Dong Wang for constructive discussions. The first and second authors acknowledge partial support from NSF DMS-1848508. The third author acknowledges support from the Scientific Discovery through Advanced Computing (SciDAC) program …

References (48)

  • E.J. Candes et al., Near-optimal signal recovery from random projections: universal encoding strategies?, IEEE Trans. Inf. Theory (2006)
  • D. Donoho et al., Uncertainty principles and ideal atomic decomposition, IEEE Trans. Inf. Theory (2001)
  • A. Cohen et al., Compressed sensing and best k-term approximation, J. Am. Math. Soc. (2008)
  • Y. Zhang, Theory of compressive sensing via $\ell_1$-minimization: a non-RIP analysis and extensions, J. Oper. Res. Soc. China (2013)
  • R. Vershynin, Estimation in high dimensions: a geometric perspective
  • R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Process. Lett. (2007)
  • E.J. Candes et al., Enhancing sparsity by reweighted $\ell_1$ minimization (Jul. 2008)
  • D. Needell et al., CoSaMP, Commun. ACM (2010)
  • P. Yin et al., Ratio and difference of $\ell_1$ and $\ell_2$ norms and sparse representation with coherent dictionaries, Commun. Inf. Syst. (2014)
  • P. Yin et al., Minimization of $\ell_{1-2}$ for compressed sensing, SIAM J. Sci. Comput. (2015)
  • P. Hoyer, Non-negative sparse coding
  • N. Hurley et al., Comparing measures of sparsity
  • Y. Rahimi et al., A scale-invariant approach for sparse signal recovery, SIAM J. Sci. Comput. (2019)
  • J. Lv et al., A unified approach to model selection and sparse recovery using regularized least squares, Ann. Stat. (2009)