Abstract
We introduce and analyze a framework and corresponding method for compressed sensing in infinite dimensions. This extends the existing theory from finite-dimensional vector spaces to the case of separable Hilbert spaces. We explain why such a new theory is necessary by demonstrating that existing finite-dimensional techniques are ill-suited for solving a number of key problems. This work stems from recent developments in generalized sampling theorems for classical (Nyquist rate) sampling that allow for reconstructions in arbitrary bases. A conclusion of this paper is that one can extend these ideas to allow for significant subsampling of sparse or compressible signals. Central to this work is the introduction of two novel concepts in sampling theory, the stable sampling rate and the balancing property, which specify how to appropriately discretize an infinite-dimensional problem.
Notes
Since writing this paper, it has been shown that the term \(N/m\) can be removed and that noise can be incorporated into the data and recovery guarantees. See [7] for details.
References
B. Adcock. Infinite-dimensional \(\ell ^1\) minimization and function approximation from pointwise data. arXiv:1503.02352, 2015.
B. Adcock and A. C. Hansen. A generalized sampling theorem for stable reconstructions in arbitrary bases. J. Fourier Anal. Appl., 18(4):685–716, 2012.
B. Adcock and A. C. Hansen. Stable reconstructions in Hilbert spaces and the resolution of the Gibbs phenomenon. Appl. Comput. Harmon. Anal., 32(3):357–388, 2012.
B. Adcock, A. C. Hansen, E. Herrholz, and G. Teschke. Generalized sampling: extension to frames and inverse and ill-posed problems. Inverse Problems, 29(1):015008, 2013.
B. Adcock, A. C. Hansen, and C. Poon. Beyond consistent reconstructions: optimality and sharp bounds for generalized sampling, and application to the uniform resampling problem. SIAM J. Math. Anal., 45(5):3114–3131, 2013.
B. Adcock, A. C. Hansen, and C. Poon. On optimal wavelet reconstructions from Fourier samples: linearity and universality of the stable sampling rate. Appl. Comput. Harmon. Anal., 36(3):387–415, 2014.
B. Adcock, A. C. Hansen, C. Poon, and B. Roman. Breaking the coherence barrier: a new theory for compressed sensing. arXiv:1302.0561, 2014.
B. Adcock, A. C. Hansen, B. Roman, and G. Teschke. Generalized sampling: stable reconstructions, inverse problems and compressed sensing over the continuum. Advances in Imaging and Electron Physics, 182:187–279, 2014.
B. Adcock, A. C. Hansen, and A. Shadrin. A stability barrier for reconstructions from Fourier samples. SIAM J. Numer. Anal., 52(1):125–139, 2014.
A. Aldroubi. Oblique projections in atomic spaces. Proc. Amer. Math. Soc., 124(7):2051–2060, 1996.
A. Averbuch, R. R. Coifman, D. L. Donoho, M. Israeli, and Y. Shkolnisky. A framework for discrete integral transformations. I. The pseudopolar Fourier transform. SIAM J. Sci. Comput., 30(2):764–784, 2008.
A. Averbuch, R. R. Coifman, D. L. Donoho, M. Israeli, Y. Shkolnisky, and I. Sedelnikov. A framework for discrete integral transformations. II. The 2D discrete Radon transform. SIAM J. Sci. Comput., 30(2):785–803, 2008.
A. Bastounis and A. C. Hansen. On the absence of the RIP in real-world applications of compressed sensing and the RIP in levels. arXiv:1411.4449, 2014.
T. Blu, P. L. Dragotti, M. Vetterli, P. Marziliano, and L. Coulot. Sparse sampling of signal innovations. IEEE Signal Process. Mag., 25(2):31–40, 2008.
A. Bourrier, M. E. Davies, T. Peleg, P. Pérez, and R. Gribonval. Fundamental performance limits for ideal decoders in high-dimensional linear inverse problems. IEEE Trans. Inform. Theory, 60(12):7928–7946, 2014.
E. J. Candès. An introduction to compressive sensing. IEEE Signal Process. Mag., 25(2):21–30, 2008.
E. J. Candès and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. Comm. Pure Appl. Math., 67(6):906–956, 2014.
E. J. Candès and Y. Plan. A probabilistic and RIPless theory of compressed sensing. IEEE Trans. Inform. Theory, 57(11):7235–7254, 2011.
E. J. Candès and J. Romberg. Sparsity and incoherence in compressive sampling. Inverse Problems, 23(3):969–985, 2007.
E. J. Candès, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory, 52(2):489–509, 2006.
E. J. Candès and T. Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inform. Theory, 52(12):5406–5425, 2006.
Y. Chi, L. L. Scharf, A. Pezeshki, and A. Calderbank. Sensitivity to basis mismatch in compressed sensing. IEEE Trans. Signal Process., 59(5):2182–2195, 2011.
A. Cohen, W. Dahmen, and R. DeVore. Compressed sensing and best \(k\)-term approximation. J. Amer. Math. Soc., 22(1):211–231, 2009.
M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok. Introduction to compressed sensing. In Compressed Sensing: Theory and Applications. Cambridge University Press, 2011.
D. L. Donoho. Compressed sensing. IEEE Trans. Inform. Theory, 52(4):1289–1306, 2006.
D. L. Donoho and J. Tanner. Counting faces of randomly projected polytopes when the projection radically lowers dimension. J. Amer. Math. Soc., 22(1):1–53, 2009.
P. L. Dragotti, M. Vetterli, and T. Blu. Sampling moments and reconstructing signals of finite rate of innovation: Shannon meets Strang–Fix. IEEE Trans. Signal Process., 55(5):1741–1757, 2007.
I. Ekeland and T. Turnbull. Infinite-dimensional optimization and convexity. Chicago Lectures in Mathematics. University of Chicago Press, Chicago, IL, 1983.
Y. Eldar. Sampling with arbitrary sampling and reconstruction spaces and oblique dual frame vectors. J. Fourier Anal. Appl., 9(1):77–96, 2003.
Y. C. Eldar. Compressed sensing of analog signals in shift-invariant spaces. IEEE Trans. Signal Process., 57(8):2986–2997, 2009.
Y. C. Eldar and T. Michaeli. Beyond bandlimited sampling. IEEE Signal Process. Mag., 26(3):48–68, 2009.
A. Fannjiang and W. Liao. Coherence pattern-guided compressive sensing with unresolved grids. SIAM J. Imaging Sci., 5:179–202, 2012.
M. Fornasier and H. Rauhut. Compressive sensing. In Handbook of Mathematical Methods in Imaging, pages 187–228. Springer, 2011.
S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser, 2013.
K. Gröchenig, Z. Rzeszotnik, and T. Strohmer. Quantitative estimates for the finite section method and Banach algebras of matrices. Integral Equations Operator Theory, 67(2):183–202, 2011.
D. Gross. Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform. Theory, 57(3):1548–1566, 2011.
D. Gross, F. Krahmer, and R. Kueng. A partial derandomization of phaselift using spherical designs. arXiv:1310.2267, 2014.
M. Guerquin-Kern, M. Häberlin, K. P. Pruessmann, and M. Unser. A fast wavelet-based reconstruction method for Magnetic Resonance Imaging. IEEE Trans. Med. Imaging, 30(9):1649–1660, 2011.
M. Guerquin-Kern, L. Lejeune, K. P. Pruessmann, and M. Unser. Realistic analytical phantoms for parallel Magnetic Resonance Imaging. IEEE Trans. Med. Imaging, 31(3):626–636, 2012.
A. C. Hansen. On the approximation of spectra of linear operators on Hilbert spaces. J. Funct. Anal., 254(8):2092–2126, 2008.
A. C. Hansen. On the solvability complexity index, the \(n\)-pseudospectrum and approximations of spectra of operators. J. Amer. Math. Soc., 24(1):81–124, 2011.
E. Herrholz and G. Teschke. Compressive sensing principles and iterative sparse recovery for inverse and ill-posed problems. Inverse Problems, 26(12):125012, 2010.
T. Hrycak and K. Gröchenig. Pseudospectral Fourier reconstruction with the modified inverse polynomial reconstruction method. J. Comput. Phys., 229(3):933–946, 2010.
T. W. Körner. Fourier Analysis. Cambridge University Press, 1988.
M. Ledoux. The Concentration of Measure Phenomenon, volume 89 of Mathematical Surveys and Monographs. American Mathematical Society, 2001.
M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly. Compressed sensing MRI. IEEE Signal Process. Mag., 25(2):72–82, 2008.
M. Mishali and Y. C. Eldar. Xampling: Analog to digital at sub-Nyquist rates. IET Circuits, Devices, & Systems, 5(1):8–20, 2011.
M. Mishali, Y. C. Eldar, and A. J. Elron. Xampling: Signal acquisition and processing in union of subspaces. IEEE Trans. Signal Process., 59(10):4719–4734, 2011.
S. K. Mitter. Convex optimization in infinite dimensional spaces. In Recent Advances in Learning and Control, volume 371 of Lecture Notes in Control and Information Science, pages 161–179. Springer, London, 2008.
C. Poon. A consistent and stable approach to generalized sampling. J. Fourier Anal. Appl., 20(5):985–1019, 2014.
H. Rauhut and R. Ward. Sparse recovery for spherical harmonic expansions. In Proceedings of the 9th International Conference on Sampling Theory and Applications, 2011.
H. Rauhut and R. Ward. Sparse Legendre expansions via \(\ell ^1\)-minimization. J. Approx. Theory, 164(5):517–533, 2012.
B. Roman, A. Hansen, and B. Adcock. On asymptotic structure in compressed sensing. arXiv:1406.4178, 2014.
M. Rudelson. Random vectors in the isotropic position. J. Funct. Anal., 164(1):60–72, 1999.
G. Strang and T. Nguyen. Wavelets and Filter Banks. Wellesley-Cambridge Press, 1997.
T. Strohmer. Measure what should be measured: progress and challenges in compressive sensing. IEEE Signal Process. Lett., 19(12):887–893, 2012.
A. M. Stuart. Inverse problems: A Bayesian perspective. Acta Numer., 19:451–559, 2010.
B. P. Sutton, D. C. Noll, and J. A. Fessler. Fast, iterative image reconstruction for MRI in the presence of field inhomogeneities. IEEE Trans. Med. Imaging, 22(2):178–188, 2003.
M. Talagrand. New concentration inequalities in product spaces. Invent. Math., 126(3):505–563, 1996.
G. Tang, B. Bhaskar, P. Shah, and B. Recht. Compressed sensing off the grid. IEEE Trans. Inform. Theory, 59(11):7465–7490, 2013.
J. A. Tropp. On the conditioning of random subdictionaries. Appl. Comput. Harmon. Anal., 25(1):1–24, 2008.
M. Unser. Sampling–50 years after Shannon. Proc. IEEE, 88(4):569–587, 2000.
M. Unser and A. Aldroubi. A general sampling theory for nonideal acquisition devices. IEEE Trans. Signal Process., 42(11):2915–2925, 1994.
M. Vetterli, P. Marziliano, and T. Blu. Sampling signals with finite rate of innovation. IEEE Trans. Signal Process., 50(6):1417–1428, 2002.
Acknowledgments
The authors would like to thank Akram Aldroubi, Emmanuel Candès, Ron DeVore, David Donoho, Karlheinz Gröchenig, Gerd Teschke, Joel Tropp, Martin Vetterli, Christopher White, Pengchong Yan and Özgür Yilmaz for useful discussions and comments. They would also like to thank Clarice Poon for helping to improve several of the arguments in the proofs, Bogdan Roman for producing the example in Sect. 7.2 and the anonymous referees for their many useful comments and suggestions.
Communicated by Emmanuel Candès.
Appendix
This appendix contains all the proofs not given so far. Before we do this, there are two results that will be crucial. The first is due to Rudelson [54].
Lemma 11.1
(Rudelson) Let \(\eta _1, \ldots , \eta _M \in {\mathbb {C}}^n\) and let \(\varepsilon _1, \ldots , \varepsilon _M\) be independent Bernoulli variables taking values \(1,-1\), each with probability \(1/2\). Then
where \(p= \max \{ 2, 2\log (n) \}\).
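The displayed bound is not reproduced here; for the reader's convenience, a standard form of Rudelson's lemma (stated as an aid only — the constant \(C\) and normalization may differ from the paper's display) reads:

```latex
\[
  \mathbb{E}\,\Bigl\| \sum_{i=1}^{M} \varepsilon_i\, \eta_i \otimes \eta_i \Bigr\|
  \;\le\;
  C \sqrt{p}\; \max_{i \le M} \|\eta_i\|\,
  \Bigl\| \sum_{i=1}^{M} \eta_i \otimes \eta_i \Bigr\|^{1/2},
  \qquad p = \max\{2,\, 2\log(n)\}.
\]
```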
Note that the original lemma in [54] does not apply in this case; we actually need the complex version proved in [61]. We will, however, still refer to it as Rudelson's Lemma. The following theorem is also indispensable:
Theorem 11.2
(Talagrand [45, 59]) There exists a number K with the following property. Consider n independent random variables \(X_i\) valued in a measurable space \(\Omega .\) Let \({\mathcal {F}}\) be a (countable) class of measurable functions on \(\Omega \) and consider the random variable \(Z = \sup _{f \in {\mathcal {F}}}\sum _{i \le n} f(X_i).\) Let
If \( {\mathbb {E}}(f(X_i)) = 0\) for all \(f \in {\mathcal {F}}\) and \(i\le n\), then, for each \(t > 0\), we have
where \(\overline{Z} = \sup _{f\in {\mathcal {F}}}|\sum _{i \le n} f(X_i)|\).
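The displayed concentration bound is not reproduced here; a standard statement of Talagrand's inequality in this setting (cf. Talagrand's original paper and Ledoux's monograph; the constant \(K\) and the exact form of the exponent vary between references) is, with \(S = \sup_{f \in \mathcal{F}} \|f\|_{\infty}\) and \(V = \sup_{f \in \mathcal{F}} \mathbb{E}\bigl(\sum_{i \le n} f^2(X_i)\bigr)\):

```latex
\[
  \mathbb{P}\bigl( |Z - \mathbb{E}(\overline{Z})| \ge t \bigr)
  \;\le\;
  3 \exp\!\Bigl( -\frac{t}{KS}
      \log\Bigl( 1 + \frac{tS}{V + S\,\mathbb{E}(\overline{Z})} \Bigr) \Bigr),
  \qquad t > 0.
\]
```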
Note that we deliberately forgo the use of any vector/matrix Bernstein inequalities in the proofs that follow, and instead use Talagrand’s result. This allows for more flexibility in the infinite-dimensional setting.
We next present the proofs of Propositions 8.5 and 8.6. For this, it is useful to have a result about the existence of unique minimizers. The finite-dimensional version of the following proposition has become a standard tool for establishing uniqueness of minimizers in CS (see, e.g., [20, Lem. 2.1] or [34, Thm. 4.26]). Fortunately, the extension to infinite dimensions is rather straightforward:
Proposition 11.3
Let \(U \in {\mathcal {B}}(l^2({\mathbb {N}}))\) be unitary and let \(\Omega , \Delta \subset {\mathbb {N}}\) be such that \(|\Omega |, |\Delta | < \infty \). Suppose that \(x_0 \in {\mathcal {H}}\) with \(\mathrm {supp}(x_0) = \Delta \) and consider the optimization problem
Suppose that there exists a vector \(\rho \in {\mathcal {H}}\) such that
-
(i)
\(\rho = U^*P_{\Omega }\eta \) for some \(\eta \in {\mathcal {H}}\)
-
(ii)
\(\langle \rho , e_j \rangle = \langle \mathrm {sgn}(x_0), e_j \rangle \), \(j \in \Delta \)
-
(iii)
\(|\langle \rho , e_j \rangle | < 1\), \(j \notin \Delta ,\)
and that, in addition, \(P_{\Omega }UP_{\Delta }: P_{\Delta }{\mathcal {H}} \rightarrow P_{\Omega }{\mathcal {H}}\) has full rank. Then \(x_0\) is the unique minimizer of (11.1). If \(U\) and \(x_0\) are real, the converse is also true.
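In finite dimensions, conditions (i)–(iii) can be verified numerically for a given \(U\), \(\Omega \), \(\Delta \) and sign pattern. The sketch below (our own illustration, not the paper's method; function and variable names are hypothetical) uses the standard least-squares certificate \(\rho = U^*P_{\Omega }UP_{\Delta }\,G^{-1}\,\mathrm {sgn}(x_0|_{\Delta })\) with \(G = P_{\Delta }U^*P_{\Omega }UP_{\Delta }\), which satisfies (i) and (ii) by construction, so only (iii) and the rank condition remain to be checked:

```python
import numpy as np

def least_squares_certificate(U, Omega, Delta, x0):
    """Build rho = U* P_Omega U P_Delta G^{-1} sgn(x0|Delta), where
    G = P_Delta U* P_Omega U P_Delta.  Conditions (i) and (ii) hold by
    construction; rho is returned so condition (iii) can be checked."""
    A = U[np.ix_(Omega, Delta)]             # P_Omega U P_Delta as a matrix
    G = A.conj().T @ A                      # Gram matrix on P_Delta H
    sgn = x0[Delta] / np.abs(x0[Delta])     # sign pattern on the support
    eta = A @ np.linalg.solve(G, sgn)       # eta in P_Omega H
    rho = U[Omega, :].conj().T @ eta        # rho = U* P_Omega eta
    return rho, np.linalg.matrix_rank(A)

n = 8
# Unitary DFT matrix as a toy U
F = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n) / np.sqrt(n)
x0 = np.zeros(n, dtype=complex); x0[0] = 2.0; x0[1] = -1.0
Delta = np.array([0, 1])
Omega = np.arange(n)                        # full sampling: trivial but exact

rho, rank = least_squares_certificate(F, Omega, Delta, x0)
off_support = np.delete(np.abs(rho), Delta)
print(rank == len(Delta),
      np.allclose(rho[Delta], x0[Delta] / np.abs(x0[Delta])),
      off_support.max() < 1)               # prints: True True True
```

With \(\Omega \) the full index set the certificate is exact and (iii) holds trivially; the interesting (subsampled) cases require checking the strict inequality numerically.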
Proof
By assumption, there is a \(\rho \in l^{\infty }({\mathbb {N}})\) such that \(\rho = U^*P_{\Omega }y\) for some \(y \in P_{\Omega }{\mathcal {H}}\) and \(\Vert \rho \Vert _{l^{\infty }} \le 1\). Also, by (ii)
Thus, by using duality (recall Proposition 8.3), in particular the fact that \(P_{\Omega }U : {\mathcal {H}} \rightarrow P_{\Omega }{\mathcal {H}}\) is onto (this follows since U is unitary) and that
it follows that \(x_0\) is a minimizer. But \(|\langle \rho ,e_j\rangle| < 1\) for \(j \notin \Delta \), so if \(\xi \) is another minimizer then \(\mathrm {supp}(\xi ) = \Delta .\) However, \(P_{\Omega }UP_{\Delta }\) has full rank, so \(\xi = x_0.\)
As for the converse in the real case, suppose that \(x_0\) is the unique minimizer. Then, for all sufficiently large n, \(x_0\) is the unique minimizer of the finite-dimensional optimization problem \( \inf \{\Vert x\Vert _{l^1}: x \in P_n{\mathcal {H}}, \, P_{\Omega }UP_nx = P_{\Omega }Ux_0\}. \) Proposition 11.3 is well known to be true in finite dimensions [34]. It follows that there is a \(y_n\) such that, for \(\rho _n = P_nU^*P_{\Omega } y_n,\) we have \(|\langle \rho _n, e_j \rangle| < 1\) when \(j \notin \Delta \) and \(j \le n\), and \(\langle \rho _n, e_j \rangle = \mathrm {sgn}(\langle x_0, e_j \rangle )\) for \(j \in \Delta .\) It is easy to see that there is a constant \(M < \infty \) such that \(\Vert y_n\Vert _{l^{\infty }} \le M\) for all large n. Now define \(\rho = U^* P_{\Omega } y_n\) for such an n. Then \(\rho = \rho _n + P^{\perp }_nU^*P_{\Omega }y_n\), and thus \(\rho \) satisfies requirements (i), (ii) and (iii) for large n. \(\square \)
Proof of Proposition 8.5
Let \(\alpha = |\Delta |\) and also \(\omega = \{\omega _j\}_{j=1}^{\alpha }\), where \(\omega _j \in {\mathbb {C}}\). Now define
where \(S_{\omega } = \mathrm {diag}\left( \{\omega _j\}_{j=1}^{\alpha }\right) \) on \(P_{\Delta }{\mathcal {H}}\) and \(I_{\Delta ^c}\) is the identity on \(P_{\Delta }^{\perp }{\mathcal {H}}\). Define \(U(\omega ) = UV_{\omega }.\) Note that to prove the proposition it suffices to show that \(V_{\omega }x\) is the unique minimizer of \(\inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U\eta = P_{\Omega }U(\omega )x\}\) for all \(\omega \), where
Indeed, if the assertion is true, Proposition 11.3 yields that every real \(\tilde{x} \in l^2({\mathbb {N}})\) with \(\mathrm {supp}(\tilde{x}) = \Delta \) is the unique minimizer of \(\inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U\eta = P_{\Omega }U\tilde{x}\}\). Thus, for any \(y \in l^2({\mathbb {N}})\) such that \(\mathrm {supp}(y) = \Delta \) choose \(\omega \in \Lambda \) and a real \(\tilde{x} \in l^2({\mathbb {N}})\) such that \(y = V_{\omega }\tilde{x}\). Then, by using the assertion above for \(\tilde{x}\) we have proved the proposition.
To prove the assertion, note that if \(\omega \in \Lambda ,\) then \(V_{\omega }\) is clearly unitary and also an isometry on \(l^1({\mathbb {N}}).\) Thus, it is easy to see that \(V_{\omega }\zeta \) is a minimizer of \( \inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U\eta = P_{\Omega }U(\omega )x\}\) if and only if \(\zeta \) is a minimizer of \( \inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U(\omega )\eta = P_{\Omega }U(\omega )x\}. \) We will therefore consider the latter minimization problem and show that x is the unique minimizer of that problem for all \(\omega \in \Lambda .\) To do that, it suffices, by Proposition 11.3 and the fact that \(U(\omega )\) is unitary, to show that there exists a vector \(\rho \in l^2({\mathbb {N}})\) such that
Now, for \(\epsilon > 0\) (we will specify the value of \(\epsilon \) later), define the function \(\varphi :\cup _{a\in \Lambda }{\mathcal {B}}(a, \epsilon ) \rightarrow \mathbb {R}_+,\) where \({\mathcal {B}}(a, \epsilon )\) denotes the \(\epsilon \)-ball around a, in the following way. Let
and define
where \(\iota _{\Delta } : P_{\Delta }l^2({\mathbb {N}}) \rightarrow l^2({\mathbb {N}})\) is the inclusion operator. Then (11.4) is satisfied if and only if \(\varphi (\omega ) < 1.\) Thus, to show (11.4) we must show that \(\varphi (\omega ) < 1\) for all \(\omega \in \Lambda .\)
Suppose for the moment that \(\epsilon \) is chosen such that \(\varphi \) is defined on its domain. We will show that \(\varphi \) is continuous. For this, it suffices to show that \(\varphi \) is continuous on \({\mathcal {B}}(a,\epsilon )\) for each \(a \in \Lambda \). Since \({\mathcal {B}}(a,\epsilon )\) is open, and a convex function on an open set is continuous, it is enough to show that \(\varphi \) is convex. To see that \(\varphi \) is convex, let \(\omega _1, \omega _2 \in {\mathcal {B}}(a, \epsilon )\) and \(t \in (0,1).\) Also, let \(\xi , \eta \in l^2({\mathbb {N}})\) be such that
Note that the existence of such vectors is guaranteed by the assumption that \(\varphi \) is defined on its domain. Now
Thus, taking the infimum on the right-hand side yields \(\varphi (t\omega _1 + (1-t)\omega _2) \le t \varphi (\omega _1) + (1-t)\varphi (\omega _2),\) as required. Returning to the question of the domain of \(\varphi ,\) note that since \((P_{\Omega }UP_{\Delta })^* P_{\Omega }UP_{\Delta }|_{P_{\Delta }l^2({\mathbb {N}})}\) is invertible, \((P_{\Omega }U(\omega )P_{\Delta })^* P_{\Omega }U(\omega )P_{\Delta }|_{P_{\Delta }l^2({\mathbb {N}})}\) is invertible whenever \(\Vert U(\omega ) - U(\tilde{\omega })\Vert \) is small for some \(\tilde{\omega } \in \Lambda \). Letting
we get
Thus, \(\varphi \) is defined on its domain for small \(\epsilon .\)
Let \(\Gamma \) denote the subset of all \(\omega \in \Lambda \) such that x is the unique minimizer of \(\inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U(\omega )\eta = P_{\Omega }U(\omega )x\}.\) We first show that \(\Gamma \) is closed. To this end, let \(\{\omega _n\} \subset \Gamma \) be a sequence such that \(\omega _n \rightarrow \omega \in \Lambda \); we must show that \(\omega \in \Gamma .\) Observe that, since \(\{U,\Omega , \Delta \}\) is weakly f stable, it follows that for \(\xi \in l^2({\mathbb {N}})\) satisfying
we have
Thus, \(\xi = x\) and hence \(\omega \in \Gamma .\)
Note also that \(\Gamma \) is open. Indeed, if \(\tilde{\omega } \in \Gamma \) then there exists \(\rho \in {\mathcal {H}}\) such that \(\rho \) satisfies (11.4) (with \(\omega \) replaced by \(\tilde{\omega }\)), i.e., \(\varphi (\tilde{\omega }) < 1.\) But, by continuity of \(\varphi \), it follows that \(\varphi \) is strictly less than one on a neighborhood of \(\tilde{\omega }.\) Since \( (P_{\Omega }UP_{\Delta })^* P_{\Omega }UP_{\Delta }|_{P_{\Delta }l^2({\mathbb {N}})} \) is invertible, it is easy to see that \((P_{\Omega }U(\omega )P_{\Delta })^* P_{\Omega }U(\omega )P_{\Delta }|_{P_{\Delta }l^2({\mathbb {N}})}\) is invertible for all \(\omega \in \Lambda \). Thus, it follows by Proposition 11.3 that (11.4) is satisfied for all \(\omega \in \Lambda \) in a neighborhood of \(\tilde{\omega }\), and hence \(\Gamma \) is open.
The fact that \(\Gamma \) is open and closed yields that either \(\Gamma = \emptyset \) or \(\Gamma = \Lambda \). Since \((1,\ldots ,1) \in \Gamma \) by assumption, the proposition follows. \(\square \)
Proof of Proposition 8.6
Let \(V_{\omega }\) and \(\Lambda \) be defined as in (11.2) and (11.3), respectively. Suppose that \(y \in l^2({\mathbb {N}})\) is real with \(\mathrm {supp}(y) = \Delta .\) Then, by assumption, \(V_{\omega }y\) is the unique minimizer of \(\inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U\eta = P_{\Omega }UV_{\omega }y\},\) when \(V_{\omega }\) is real. Thus, by Proposition 11.3 it follows that there exists a \(\rho _{\omega } \in l^2({\mathbb {N}})\) such that
Let \(\beta = \max \{\Vert P_{\Delta ^c}\rho _{\omega }\Vert _{l^{\infty }} : \omega \in \Lambda ,\ \omega \text { real}\}.\) It is clear that \(\beta < 1\). Thus, for every \(y \in {\mathcal {H}}\) with \(\mathrm {supp}(y) = \Delta \) there exists \(\rho _{\omega } \in l^2({\mathbb {N}})\) satisfying (11.5) with \(\Vert P_{\Delta ^c}\rho _{\omega }\Vert _{l^{\infty }} \le \beta \). It is now easy to show (see the proof of Lemma 2.1 in [21]) that there exists a constant \(C >0\) (depending on \(\beta \)) such that, if \(\xi \in l^2({\mathbb {N}})\), \(\mathrm {supp}(\xi ) = \Delta \), is the unique minimizer of \(\inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U\eta = P_{\Omega }U\xi \}\), \(\zeta \in l^2({\mathbb {N}})\) and x is a minimizer of \(\inf \{\Vert \eta \Vert _{l_1}: P_{\Omega }U\eta = P_{\Omega }U\zeta \}\), then \(\Vert P_{\Delta ^c}x\Vert _{l_1} \le C\Vert \xi -\zeta \Vert _{l_1}.\) Thus, since
and \((P_{\Omega }UP_{\Delta })^* P_{\Omega }UP_{\Delta } |_{P_{\Delta }{\mathcal {H}}}\) is invertible, the proposition follows. \(\square \)
Proof of Proposition 9.1
Without loss of generality, we may assume that \(\Vert \eta \Vert = 1.\) Let \(\{\delta _j\}_{j=1}^N\) be random Bernoulli variables with \({\mathbb {P}}(\delta _j = 1) = q.\) We will split the proof into two steps, where we will prove the finite-dimensional part of the proposition in Step I, and then tweak these ideas to fit the infinite-dimensional part of the proposition in Step II.
Step I: We start by noting that, since U is an isometry, we have
Our goal is to eventually use Bernstein’s inequality, and the following is therefore a setup to do so. For \(1 \le j \le N\), define the random variables
Thus, for \(s>0\) it follows from (11.6) that
where we have used the fact that U is an isometry and hence
Thus, by choosing \(s = t + \Vert P_MP_{\Delta }^{\perp }U^* P_N UP_{\Delta }\Vert _{\mathrm {mr}}\) we deduce that
To estimate the right-hand side of (11.7), we shall use Bernstein’s inequality, and in order to do that, we need a couple of observations. First note that
Thus
Also, observe that
for \(1 \le j \le N\) and \(i \in \Delta ^c \cap \{1,\ldots ,M\}.\) Now applying Bernstein’s inequality to \(\mathrm {Re}(X_1^i), \ldots , \mathrm {Re}(X_N^i)\) and \(\mathrm {Im}(X_1^i), \ldots , \mathrm {Im}(X_N^i)\), we get that
for all \(i \in \Delta ^c \cap \{1,\ldots ,M\}.\) Thus, by invoking (11.10) and (11.7) it follows that
when
The first part of the proposition now follows. The fact that the left-hand side of (9.3) is zero when \(q=1\) is clear from (11.8) and (11.9).
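For reference, the scalar Bernstein inequality invoked in Step I, stated in a standard form (the paper may use a variant with different constants): if \(X_1,\ldots,X_N\) are independent, zero-mean, real random variables with \(|X_j| \le K\) and \(\sum_{j} \mathbb{E}(X_j^2) \le \sigma^2\), then

```latex
\[
  \mathbb{P}\Bigl( \Bigl| \sum_{j=1}^{N} X_j \Bigr| \ge t \Bigr)
  \;\le\;
  2 \exp\Bigl( -\frac{t^2/2}{\sigma^2 + Kt/3} \Bigr),
  \qquad t > 0,
\]
```

applied above separately to the real and imaginary parts of the \(X_j^i\).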
Step II: To prove the second part of the proposition, we will use the same ideas; however, we are now faced with the problem that \(P_{\Delta }^{\perp }E_{\Omega }P_{\Delta }\eta \) (contrary to \(P_MP_{\Delta }^{\perp }E_{\Omega }P_{\Delta }\eta \)) has infinitely many components. This is an obstacle: the bound on \(P_MP_{\Delta }^{\perp }E_{\Omega }P_{\Delta }\eta \) was obtained by bounding the probability of deviation of each of its finitely many components, a strategy that cannot handle infinitely many components directly. To overcome this obstacle, we proceed as follows. Note that, just as argued in Step I, we have
Define (as we did above) the random variables
Note that we now have infinitely many \(X^i_j\)’s. However, suppose for a moment that for every \(t >0\) there exists a nonempty set \(\Lambda _t \subset {\mathbb {N}}\) such that
Then, if that was the case, we would immediately get (by arguing as in Step I and using (11.11) and the assumption that \(\Vert \eta \Vert = 1\)) that
Thus, we could use the analysis provided above, via (11.10), and deduce that
when
Hence, if we could show the existence of \( \Lambda _t\) and provide a bound on \(\left| \Delta ^c \setminus \Lambda _t\right| \), we could appeal to (11.11) and (11.13) and complete the proof. To do that, define
Note that \((e_j \otimes e_j)Ue_i \rightarrow 0\) as \(i \rightarrow \infty \) for all \(j \le N\). Thus, \(\Lambda _t \ne \emptyset .\) Moreover, we also immediately deduce that \(\left| \Delta ^c \setminus \Lambda _t\right| < \infty .\) Note also that (11.12) follows by the fact that \(X^i_j = \langle \eta ,q^{-1} P_{\Delta } U^*\delta _j(e_j \otimes e_j)Ue_i\rangle \) and the Cauchy–Schwarz inequality. With the existence of \(\Lambda _t\) established, we now continue with the task of estimating \(\left| \Delta ^c \setminus \Lambda _t\right| .\) Note that to estimate \(\left| \Delta ^c \setminus \Lambda _t\right| \) we need information about the location of \(\Delta \) which is not assumed. We only assume the knowledge of some \(M \in {\mathbb {N}}\) such that \(P_M \ge P_{\Delta }\). Thus (although an estimate of \(\left| \Delta ^c \setminus \Lambda _t\right| \) would be sharper than what we will eventually obtain), we define
Note that it is straightforward to show that \(\tilde{\Lambda }_{q}(|\Delta |,M,t) \subset \Lambda _t.\) Also, \(\tilde{\Lambda }_{q}(|\Delta |,M,t)\) depends only on known quantities. Observe that, for any \(\Gamma _1 \subset \{1,\ldots , M \}\) and \(\Gamma _2 \subset \{1,\ldots , N \}\), we have \(\left\| P_{\Gamma _1} U^*P_{\Gamma _2}Ue_i\right\| \rightarrow 0\) as \(i \rightarrow \infty .\) Thus, \(|\Delta ^c \setminus \tilde{\Lambda }_{q}(|\Delta |,M,t)| < \infty \) and, since \(\tilde{\Lambda }_{q}(|\Delta |,M,t) \subset \Lambda _t\), it follows that
This gives the second part of the proposition. The fact that the left-hand side of (9.4) is zero when \(q=1\) is clear from (11.8) and (11.9). \(\square \)
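The Bernoulli sampling model used in Propositions 9.1–9.3 replaces each row selector by an independent Bernoulli variable \(\delta _j\) with \({\mathbb {P}}(\delta _j = 1) = q\), rescaled by \(q^{-1}\) so that the subsampled operator is an unbiased estimate of the full one. A small seeded Monte Carlo sketch of this unbiasedness (our own illustration; all names and parameters are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, trials = 8, 0.5, 5000

# Unitary DFT matrix as a toy U, and a fixed test vector eta
F = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n) / np.sqrt(n)
eta = np.zeros(n, dtype=complex); eta[0], eta[1] = 1.0, 1.0j

# Full (deterministic) quantity: U* P_N U eta, with P_N = I here
full = F.conj().T @ F @ eta

# Monte Carlo average of q^{-1} U* D U eta, D = diag(delta_1, ..., delta_n)
acc = np.zeros(n, dtype=complex)
for _ in range(trials):
    delta = rng.random(n) < q           # independent Bernoulli(q) selectors
    acc += F.conj().T @ (delta * (F @ eta)) / q
mc = acc / trials

# Relative error decays like O(trials^{-1/2})
print(np.linalg.norm(mc - full) / np.linalg.norm(full))
```

The empirical average converges to the deterministic quantity, which is the mechanism that makes the deviation bounds above meaningful.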
Proof of Proposition 9.2
Without loss of generality, we may assume that \(\Vert \eta \Vert = 1.\) Let \(\{\delta _j\}_{j=1}^N\) be random Bernoulli variables with \({\mathbb {P}}(\delta _j = 1) = q.\) Also, for \(k \in {\mathbb {N}},\) let \(\xi _k = (UP_{\Delta })^*e_k.\) Observe that, since U is an isometry,
and
where the infinite series in (11.14) converges in operator norm. To get the desired result, we first focus on obtaining bounds on \(\Vert ( \sum _{k=1}^{N} (q^{-1}\delta _k -1)\xi _k\otimes \bar{\xi }_k)\eta \Vert \). The goal is to use Talagrand’s formula, and what follows is a setup for that. In particular, let \(\zeta \in {\mathcal {H}}\) be a unit vector and denote the mapping \({\mathcal {H}} \ni \xi \mapsto \mathrm {Re}(\langle \xi , \zeta \rangle )\) by \(\hat{\zeta }.\) Let \({\mathcal {F}}\) be a countable collection of unit vectors such that for any \(\xi \in {\mathcal {H}}\) we have \( \Vert \xi \Vert = \sup _{\zeta \in {\mathcal {F}}}\hat{\zeta }(\xi ), \) and now define
The following is clear, and it immediately gives us the setup for Talagrand's theorem:
To use Talagrand’s theorem, we must estimate the following quantities:
For V we have the following estimate:
where we have used the fact that U is an isometry in the step going from the second to the third inequality. The S term can be estimated as follows. Note that
thus
Finally, we can estimate R as follows:
again using the fact that U is an isometry. Therefore,
With the estimates on V, S and R now established we may appeal to Theorem 11.2 and deduce that there is a constant \(K >0\) such that, for \(\theta > 0\),
provided q is chosen such that the right-hand side of (11.18) is bounded by 1 (this is guaranteed by the assumptions of the proposition). But by (11.15) it follows that for any \(r > 0\), we have
Therefore, by appealing to (11.20) and (11.19) we obtain that
where \( \Xi = \Vert (P_{\Delta }U^* P_NUP_{\Delta } -P_{\Delta })\Vert . \) Choosing \(\theta = t/2\) yields the proposition. \(\square \)
Proof of Theorem 9.3
The proof is quite similar to the proof of Proposition 9.2. Let \(\{\delta _j\}_{j=1}^N\) be random Bernoulli variables with \({\mathbb {P}}(\delta _j = 1) = \theta .\) Note that we may argue as in (11.14) and observe that
where \(\xi _k = (UP_{\Delta })^*e_k\). To get the desired result, we first focus on obtaining bounds on \(\Vert \sum _{k=1}^{N} (\theta ^{-1}\delta _k -1)\xi _k\otimes \bar{\xi }_k\Vert .\) As in the proof of Proposition 9.2, the goal is to use Talagrand's theorem, and the first step is to estimate \( {\mathbb {E}}\left( \Vert Z\Vert \right) , \) where \( Z = \sum _{k=1}^{N} (\theta ^{-1}\delta _k -1)\xi _k\otimes \bar{\xi }_k. \)
Claim: We claim that
when
To prove the claim, we simply rework the techniques used in [54]. This is now standard and has also been used in [19, 61]. We start by letting \(\tilde{\delta }= \{\tilde{\delta }_k\}_{k=1}^N\) be independent copies of \(\delta = \{\delta _k\}_{k=1}^N.\) Then
by Jensen’s inequality. Let \(\varepsilon = \{\varepsilon _j\}_{j=1}^N\) be a sequence of Bernoulli variables taking values \(\pm 1\) with probability 1 / 2. Then, by (11.23), symmetry, Fubini’s Theorem and the triangle inequality, it follows that
Now, by Lemma 11.1 we get that
And hence, by using (11.24) and (11.25), it follows that
Thus, by using the easy calculus fact that if \(r >0\), \(c\le 1\) and \(r \le c\sqrt{r+1}\) then \(r \le c(1+\sqrt{5})/2,\) and the fact that U is an isometry (so that \(\Vert \sum _{k=1}^N \xi _k\otimes \bar{\xi }_k\Vert \le 1\)), it is easy to see that the claim follows.
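The elementary fact used above follows by squaring: \(r \le c\sqrt{r+1}\) gives \(r^2 - c^2 r - c^2 \le 0\), so \(r \le (c^2 + \sqrt{c^4 + 4c^2})/2 = \tfrac{c}{2}(c + \sqrt{c^2+4}) \le \tfrac{c}{2}(1+\sqrt{5})\) since \(c \le 1\). A quick numerical sanity check of this bound (our own illustration):

```python
import math

def max_r(c):
    """Largest r with r = c*sqrt(r+1), i.e. the positive root of
    r^2 - c^2 r - c^2 = 0."""
    return (c**2 + math.sqrt(c**4 + 4 * c**2)) / 2

# The bound r <= c(1+sqrt(5))/2 holds on a grid of c in (0, 1],
# with equality at c = 1 (where max_r(1) is the golden ratio).
for c in [0.1 * k for k in range(1, 11)]:
    assert max_r(c) <= c * (1 + math.sqrt(5)) / 2 + 1e-12
print(abs(max_r(1.0) - (1 + math.sqrt(5)) / 2) < 1e-12)   # prints: True
```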
To be able to use Talagrand's formula, there are now some preparations that have to be done. First write
For \(k = 1,\ldots ,N\) note that
Thus, after restricting \(\hat{\eta }_i\) to the ball of radius \(\theta ^{-1}\max _{k\le N}\Vert \xi _k\Vert ^2\) it follows that
Also, note that
where the third inequality follows from the fact that U is an isometry. It follows from Talagrand's inequality (Theorem 11.2), the earlier claim (requiring that the right-hand side of (11.22) is bounded by one, which is guaranteed by the assumption of the theorem), the first part of assumption (9.6), (11.26) and (11.27), that there is a constant \(K >0\) such that for \(t > 0\)
But by (11.21) we have
for any \(r > 0\). Therefore, by appealing to (11.29) and (11.28) we obtain
for \(t > 0\). Choosing \(t = \frac{1}{2\gamma }\) yields the first part of the theorem. The last statement of the theorem is clear. \(\square \)
Adcock, B., Hansen, A.C. Generalized Sampling and Infinite-Dimensional Compressed Sensing. Found Comput Math 16, 1263–1323 (2016). https://doi.org/10.1007/s10208-015-9276-6