On a Generalization of the Preconditioned Crank–Nicolson Metropolis Algorithm

Rudolf, Daniel; Sprungk, Björn

doi:10.1007/s10208-016-9340-x

Published: 17 November 2016

Foundations of Computational Mathematics, Volume 18, pages 309–343 (2018)

Daniel Rudolf¹ &
Björn Sprungk²


Abstract

Metropolis algorithms for approximate sampling of probability measures on infinite dimensional Hilbert spaces are considered, and a generalization of the preconditioned Crank–Nicolson (pCN) proposal is introduced. The new proposal is able to incorporate information on the measure of interest. A numerical simulation of a Bayesian inverse problem indicates that a Metropolis algorithm with such a proposal performs independently of the state-space dimension and the variance of the observational noise. Moreover, a qualitative convergence result is provided by a comparison argument for spectral gaps. In particular, it is shown that the generalization inherits geometric convergence from the Metropolis algorithm with pCN proposal.

Notes

In general elliptic PDEs can be solved in a weak sense by variational methods. Then adjoint methods known from PDE-constrained optimization and parameter identification can be employed to compute $\nabla G(\varvec{\xi })$, see [31, Chapter 6] for details.
The empirical performance of each algorithm was best for this particular tuning.
We also studied other functions such as $f(\varvec{\xi }) = \xi _1$, $f(\varvec{\xi }) = \max _x \mathrm {e}^{u(x,\varvec{\xi })}$ and $f(\varvec{\xi }) = p(0.5,\varvec{\xi })$, but the results of the comparison were essentially the same.
We also estimated the $\text {ESS}$ by batch means (100 batches of size $10^4$) to control our simulations. This led to similar results.

References

1. A. Beskos, G. Roberts, A. Stuart, and J. Voss. MCMC methods for diffusion bridges. Stoch. Dynam., 8(3):319–350, 2008.
2. V. Bogachev. Gaussian Measures. American Mathematical Society, 1998.
3. N. Bou-Rabee and M. Hairer. Nonasymptotic mixing of the MALA algorithm. IMA J. Numer. Anal., 33(1):80–110, 2013.
4. S. Cotter, G. Roberts, A. Stuart, and D. White. MCMC methods for functions: Modifying old algorithms to make them faster. Stat. Sci., 28(3):283–464, 2013.
5. T. Cui, K. Law, and Y. Marzouk. Dimension-independent likelihood-informed MCMC. Journal of Computational Physics, 304:109–137, 2016.
6. G. Da Prato and J. Zabczyk. Second Order Partial Differential Equations in Hilbert Spaces. Cambridge University Press, Cambridge, 2004.
7. R. Douglas. On majorization, factorization, and range inclusion of operators on Hilbert space. Proc. Amer. Math. Soc., 17:413–415, 1966.
8. N. Dunford and J. Schwartz. Linear Operators, Part I: General Theory. Wiley-Interscience, New York, 1958.
9. O. Ernst, B. Sprungk, and H.-J. Starkloff. Analysis of the ensemble and polynomial chaos Kalman filters in Bayesian inverse problems. SIAM/ASA J. Uncertainty Quantification, 3(1):823–851, 2015.
10. C. Geyer. Practical Markov chain Monte Carlo. Stat. Sci., 7(4):473–483, 1992.
11. M. Girolami and B. Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B, 73(2):123–214, 2010.
12. M. Hairer. An introduction to stochastic PDEs. Lecture notes, 2009.
13. M. Hairer, A. Stuart, and S. Vollmer. Spectral gaps for a Metropolis–Hastings algorithm in infinite dimensions. Ann. Appl. Probab., 24(6):2455–2490, 2014.
14. M. Hladnik and M. Omladič. Spectrum of the product of operators. Proc. Amer. Math. Soc., 102(2):300–302, 1988.
15. Z. Hu, Z. Yao, and J. Li. On an adaptive preconditioned Crank–Nicolson algorithm for infinite dimensional Bayesian inferences. arXiv:1511.05838, 2015.
16. C. Kipnis and S. Varadhan. Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys., 104(1):1–19, 1986.
17. K. Law. Proposals which speed up function-space MCMC. J. Comput. Appl. Math., 262:127–138, 2014.
18. G. Lawler and A. Sokal. Bounds on the $L^2$ spectrum for Markov chains and Markov processes: a generalization of Cheeger’s inequality. Trans. Amer. Math. Soc., 309(2):557–580, 1988.
19. A. Lee and K. Łatuszyński. Variance bounding and geometric ergodicity of Markov chain Monte Carlo kernels for approximate Bayesian computation. Biometrika, 101(3):655–671, 2014.
20. S. Livingstone. Geometric ergodicity of the Random Walk Metropolis with position-dependent proposal covariance. arXiv:1507.05780, 2015.
21. A. Mandelbaum. Linear estimators and measurable linear transformations on a Hilbert space. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 65:385–397, 1984.
22. J. Martin, L. Wilcox, C. Burstedde, and O. Ghattas. A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion. SIAM J. Sci. Comput., 34(3):A1460–A1487, 2012.
23. F. Pinski, G. Simpson, A. Stuart, and H. Weber. Algorithms for Kullback–Leibler approximation of probability measures in infinite dimensions. SIAM J. Sci. Comput., 37(6):A2733–A2757, 2015.
24. G. Roberts and J. Rosenthal. Geometric ergodicity and hybrid Markov chains. Electron. Comm. Probab., 2(2):13–25, 1997.
25. G. Roberts and J. Rosenthal. Optimal scaling for various Metropolis–Hastings algorithms. Stat. Sci., 16(4):351–367, 2001.
26. D. Rudolf. Explicit error bounds for Markov chain Monte Carlo. Dissertationes Math. (Rozprawy Mat.), 485:1–93, 2012.
27. D. Rudolf and M. Ullrich. Positivity of hit-and-run and related algorithms. Electron. Commun. Probab., 18:1–8, 2013.
28. A. Stuart. Inverse problems: a Bayesian perspective. Acta Numer., 19:451–559, 2010.
29. L. Tierney. Markov chains for exploring posterior distributions. Ann. Stat., 22(4):1701–1762, 1994.
30. L. Tierney. A note on Metropolis–Hastings kernels for general state spaces. Ann. Appl. Probab., 8(1):1–9, 1998.
31. C. Vogel. Computational Methods for Inverse Problems. SIAM, Philadelphia, 2002.


Acknowledgements

We thank Oliver Ernst, Hans-Jörg Starkloff and Albrecht Böttcher for many fruitful discussions and the referees for their valuable comments which helped to improve the paper. D. R. was supported by the DFG Priority Program 1324 and the DFG Research Training Group 1523. B. S. was supported by the DFG Priority Program 1324.

Author information

Authors and Affiliations

Institut für Mathematische Stochastik, University of Göttingen, Goldschmidtstraße 7, 37077, Göttingen, Germany
Daniel Rudolf
Technische Universität Chemnitz, Reichenhainer Str. 41, 09107, Chemnitz, Germany
Björn Sprungk

Corresponding author

Correspondence to Daniel Rudolf.

Additional information

Communicated by Andrew Stuart.

Appendix

1.1 Gaussian Measures

The following brief introduction to Gaussian measures is based on the presentations given in [6, Section 1] and [12, Section 3]. Another comprehensive reference for this topic is [2].

Let ${\mathcal {H}}$ be a Hilbert space with norm $\Vert {\cdot }\Vert $ and inner product $\langle {\cdot }, {\cdot } \rangle $, and let $\mathcal L^1_+({\mathcal {H}})$ denote the set of all linear, bounded, self-adjoint, positive and trace class operators $A:{\mathcal {H}} \rightarrow {\mathcal {H}}$.

Let $\mu $ be a measure on $({\mathcal {H}}, \mathcal B({\mathcal {H}}))$, and for simplicity let us assume that $\int _{{\mathcal {H}}} \Vert v\Vert ^2\, \mu (\mathrm {d}v) < \infty $. The mean $m \in {\mathcal {H}}$ of $\mu $ is defined as the Bochner integral $m = \int _{{\mathcal {H}}} v\, \mu (\mathrm {d}v)$ and the covariance of $\mu $ is the unique operator $C\in \mathcal L^1_+({\mathcal {H}})$ given by

$$\begin{aligned} \langle C u, u^{\prime } \rangle = \int _{{\mathcal {H}}} \langle u , v - m \rangle \langle u^{\prime } , v - m \rangle \mu (\mathrm {d}v), \qquad \forall u, u^{\prime } \in {\mathcal {H}}. \end{aligned}$$

A measure $\mu $ on ${\mathcal {H}}$ is called a Gaussian measure with mean $m\in {\mathcal {H}}$ and covariance operator $C\in \mathcal L^1_+({\mathcal {H}})$, denoted by $N(m, C)$, iff

$$\begin{aligned} \int _{{\mathcal {H}}} \mathrm {e}^{\mathrm {i}\langle u, v \rangle } \mu (\mathrm {d}v) = \mathrm {e}^{\mathrm {i}\langle m , u \rangle - \frac{1}{2}\langle C u,u\rangle }, \qquad \forall u \in {\mathcal {H}}. \end{aligned}$$

This definition is equivalent to $\langle u \rangle _*\mu = N(\langle u, m\rangle , \langle Cu, u\rangle )$ for all $u\in {\mathcal {H}}$, where $\langle u \rangle : {\mathcal {H}} \rightarrow \mathbb {R}$ with $\langle u \rangle (v) := \langle u, v\rangle $ and where $\langle u \rangle _*\mu $ denotes the pushforward measure of $\mu $ under the mapping $\langle u \rangle $. Gaussian measures are uniquely determined by their mean and covariance, i.e., for any $m\in {\mathcal {H}}$ and any $C \in \mathcal L^1_+({\mathcal {H}})$ there exists a unique Gaussian measure $\mu =N(m,C)$ on ${\mathcal {H}}$. Moreover, the set of random variables on ${\mathcal {H}}$ distributed according to a Gaussian measure is closed w.r.t. affine transformations. In detail, let $X \sim N(m,C)$ be a Gaussian random variable on ${\mathcal {H}}$ and let $b\in {\mathcal {H}}$ and $T:{\mathcal {H}} \rightarrow {\mathcal {H}}$ be a bounded, linear operator; then, due to [6, Proposition 1.2.3], we have

$$\begin{aligned} b+TX \sim N(b+Tm, TCT^*). \end{aligned} \tag{34}$$
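The affine transformation rule (34) can be checked empirically in finite dimensions. The following is a minimal sketch, assuming ${\mathcal {H}} = \mathbb {R}^3$; the concrete values of `m`, `C`, `b` and `T` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-dimensional stand-ins for m, C, b and T in (34).
m = np.array([1.0, 0.0, -1.0])
C = np.diag([1.0, 0.5, 0.25])            # symmetric positive definite covariance
b = np.array([0.5, -0.5, 2.0])
T = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, -1.0],
              [0.3, 0.0, 1.0]])

# Draw X ~ N(m, C) and apply the affine map v -> b + T v to every sample.
n_samples = 200_000
X = rng.multivariate_normal(m, C, size=n_samples)
Y = b + X @ T.T

# Compare empirical moments of Y with the moments predicted by (34).
predicted_mean = b + T @ m
predicted_cov = T @ C @ T.T
mean_error = np.abs(Y.mean(axis=0) - predicted_mean).max()
cov_error = np.abs(np.cov(Y, rowvar=False) - predicted_cov).max()
print(mean_error, cov_error)
```

Both errors should shrink at the usual Monte Carlo rate as `n_samples` grows; the check only confirms the first two moments, not Gaussianity itself.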

The Cameron–Martin space ${\mathcal {H}}_\mu $ of a Gaussian measure $\mu = N(m,C)$ on ${\mathcal {H}}$ is defined as the image space $\mathrm{Im}\,C^{1/2}$ which, equipped with $\langle u, v \rangle _{C^{-1}}:= \langle C^{-1/2}u, C^{-1/2}v\rangle $, is again a Hilbert space. The space ${\mathcal {H}}_\mu $ has some surprising properties: it is the intersection of all measurable linear subspaces $\mathcal X \subseteq {\mathcal {H}}$ with $\mu (\mathcal X)=1$; if $\ker C = \{0\}$, then ${\mathcal {H}}_\mu $ is dense in ${\mathcal {H}}$; and if ${\mathcal {H}}$ is infinite dimensional, then $\mu ({\mathcal {H}}_\mu )= 0$. Moreover, the space ${\mathcal {H}}_\mu $ plays an important role for the equivalence of Gaussian measures, as rigorously expressed in the Cameron–Martin theorem below. Before stating the result we need some more notation.

In the following let $\mu = N(0,C)$. For $u\in {\mathcal {H}}_\mu $ we set

$$\begin{aligned} W_u(v) := \langle C^{-1/2} u, v \rangle , \qquad \forall v\in {\mathcal {H}}, \end{aligned}$$

and understand $W_u$ as an element of $L_2(\mu )$. Since the mapping ${\mathcal {H}}_\mu \ni u \mapsto W_u \in L_2(\mu )$ is an isometry [6, Section 1.2.4], we can define for any $u\in {\mathcal {H}}$

$$\begin{aligned} \langle C^{-1/2} u, {\cdot } \rangle := L_2(\mu )\text {-}\lim _{n\rightarrow \infty } W_{u_n} \end{aligned}$$

where $u_n \in {\mathcal {H}}_\mu $ and $u_n \rightarrow u$ in ${\mathcal {H}}$ as $n\rightarrow \infty $. By [6, Proposition 1.2.7], we have

$$\begin{aligned} \int _{{\mathcal {H}}} \mathrm {e}^{\langle C^{-1/2} u, v \rangle }\, \mu (\mathrm {d}v) = \mathrm {e}^{\frac{1}{2} \Vert u\Vert ^2}, \qquad \forall u\in {\mathcal {H}}. \end{aligned} \tag{35}$$

Hence, if $h\in {\mathcal {H}}_\mu $, we understand $\langle C^{-1} h, {\cdot } \rangle $ as $\langle C^{-1/2} (C^{-1/2} h), {\cdot } \rangle \in L_2(\mu )$.

Theorem 21

(Cameron–Martin formula, [6, Theorem 1.3.6]) Let $\mu = N(0, C)$ and $\mu _h = N(h, C)$ be Gaussian measures on a separable Hilbert space ${\mathcal {H}}$. Then, $\mu $ and $\mu _h$ are equivalent iff $h\in {\mathcal {H}}_\mu = \mathrm{Im}C^{1/2}$ in which case

$$\begin{aligned} \frac{\mathrm {d}\mu _h}{\mathrm {d}\mu }(v) = \exp \left( -\frac{1}{2} \Vert C^{-1/2}h\Vert ^2 + \langle C^{-1} h, v \rangle \right) . \end{aligned}$$
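In finite dimensions every $h$ lies in $\mathrm{Im}\,C^{1/2}$ when $C$ is invertible, and the Cameron–Martin formula reduces to a ratio of Lebesgue densities. The sketch below compares both sides on the log scale, assuming ${\mathcal {H}} = \mathbb {R}^4$ with a randomly generated covariance; the helper names `gaussian_log_density` and `cameron_martin_log_ratio` are hypothetical.

```python
import numpy as np

def gaussian_log_density(v, mean, cov):
    """Log Lebesgue density of N(mean, cov) on R^d (finite dimensions only)."""
    d = mean.size
    diff = v - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)

def cameron_martin_log_ratio(v, h, cov):
    """log (d mu_h / d mu)(v) = -1/2 |C^{-1/2} h|^2 + <C^{-1} h, v>."""
    cinv_h = np.linalg.solve(cov, h)
    return -0.5 * (h @ cinv_h) + cinv_h @ v

rng = np.random.default_rng(1)
d = 4
A = rng.standard_normal((d, d))
C = A @ A.T + d * np.eye(d)      # hypothetical symmetric positive definite covariance
h = rng.standard_normal(d)       # shift; automatically in Im C^{1/2} here
v = rng.standard_normal(d)       # evaluation point

lhs = gaussian_log_density(v, h, C) - gaussian_log_density(v, np.zeros(d), C)
rhs = cameron_martin_log_ratio(v, h, C)
print(lhs, rhs)
```

The agreement is exact up to rounding because the quadratic terms in $v$ cancel in the density ratio; in infinite dimensions no Lebesgue density exists, which is why the theorem is phrased via $\mathrm {d}\mu _h/\mathrm {d}\mu $.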

Thus, two Gaussian measures $N(m, C)$ and $N(m+h, C)$ are equivalent only if $h\in \mathrm{Im}\,C^{1/2}$. Consider now $\mu = N(0,C)$ and $\nu = N(0, Q)$ with $C\ne Q$. Before stating a theorem about the equivalence of $\mu $ and $\nu $, we need some more notation. In the following, let $T:{\mathcal {H}} \rightarrow {\mathcal {H}}$ be a self-adjoint trace class operator and let $(t_n)_{n\in \mathbb {N}}$ denote the sequence of its eigenvalues. We set

$$\begin{aligned} \det (I+T) := \prod _{n=1}^\infty (1+t_n) \end{aligned}$$

and define

$$\begin{aligned} \langle TC^{-1/2}u, C^{-1/2}u\rangle := \lim _{N\rightarrow \infty } \langle TC^{-1/2} \, \varPi _N u, C^{-1/2}\, \varPi _N u\rangle , \qquad \mu \text {-a.e.} \end{aligned}$$

where $\varPi _N$ denotes the projection operator onto $\mathrm {span}\{e_1,\ldots ,e_N\}$ with $e_n$ denoting the $n$th eigenvector of $C$. The existence of the $\mu $-a.e. limit above is proven in [6, Proposition 1.2.10], and furthermore, if $\langle Tu, u\rangle < \Vert u\Vert ^2$ holds for any $u\in {\mathcal {H}}$, then by [6, Proposition 1.2.11] we have

$$\begin{aligned} \int _{{\mathcal {H}}} \mathrm {e}^{ \frac{1}{2} \langle T C^{-1/2} u, C^{-1/2} u \rangle }\, \mu (\mathrm {d}u) = \frac{1}{\sqrt{\det (I-T)}}. \end{aligned} \tag{36}$$
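To see where the determinant in (36) comes from, it may help to work the identity out in finite dimensions. The following is only a sketch under the assumption ${\mathcal {H}} = \mathbb {R}^d$ with invertible $C$; the symbols $z$, $\varphi _n$ and $\zeta _n$ are introduced here for illustration and are not taken from [6]. Write $z := C^{-1/2}u \sim N(0, I)$ and let $\varphi _1,\ldots ,\varphi _d$ be an orthonormal eigenbasis of $T$ with eigenvalues $t_n < 1$, so that the coefficients $\zeta _n := \langle z, \varphi _n\rangle $ are independent standard normal random variables. Then

$$\begin{aligned} \int _{\mathbb {R}^d} \mathrm {e}^{ \frac{1}{2} \langle T C^{-1/2} u, C^{-1/2} u \rangle }\, \mu (\mathrm {d}u)&= \mathbb {E}\left[ \mathrm {e}^{\frac{1}{2} \sum _{n=1}^d t_n \zeta _n^2} \right] = \prod _{n=1}^d \frac{1}{\sqrt{2\pi }} \int _{\mathbb {R}} \mathrm {e}^{-\frac{1}{2}(1-t_n) x^2}\, \mathrm {d}x \\&= \prod _{n=1}^d \frac{1}{\sqrt{1-t_n}} = \frac{1}{\sqrt{\det (I-T)}}. \end{aligned}$$

The condition $\langle Tu, u\rangle < \Vert u\Vert ^2$ is what guarantees $t_n < 1$, so that each one-dimensional integral is finite.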

Theorem 22

([6, Proposition 1.3.11]) Let $\mu = N(0, C)$ and $\nu = N(0, Q)$ be Gaussian measures on a separable Hilbert space ${\mathcal {H}}$. If $T:= I - C^{-1/2}QC^{-1/2}$ is self-adjoint, trace class and satisfies $\langle Tu, u\rangle < \Vert u\Vert ^2$ for any $u\in {\mathcal {H}}$, then $\mu $ and $\nu $ are equivalent with

$$\begin{aligned} \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(u) = \frac{1}{\sqrt{\det (I-T)}} \exp \left( - \frac{1}{2} \langle T(I-T)^{-1}\, C^{-1/2} u, C^{-1/2} u \rangle \right) , \quad u\in {\mathcal {H}}. \end{aligned}$$
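The density formula of Theorem 22 can be compared against the ratio of Lebesgue densities in finite dimensions, where the trace class condition holds trivially. This is a minimal sketch, assuming ${\mathcal {H}} = \mathbb {R}^4$ and randomly generated positive definite $C$ and $Q$; the helpers `sym_sqrt` and `log_density_zero_mean` are hypothetical names.

```python
import numpy as np

def sym_sqrt(M):
    """Square root of a symmetric positive definite matrix via eigh."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def log_density_zero_mean(u, cov):
    """Log Lebesgue density of N(0, cov) on R^d."""
    d = u.size
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + u @ np.linalg.solve(cov, u))

rng = np.random.default_rng(2)
d = 4
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))
C = A @ A.T + d * np.eye(d)      # covariance of mu (hypothetical)
Q = B @ B.T + d * np.eye(d)      # covariance of nu (hypothetical)

I = np.eye(d)
C_inv_half = np.linalg.inv(sym_sqrt(C))
T = I - C_inv_half @ Q @ C_inv_half   # I - T = C^{-1/2} Q C^{-1/2} is positive definite

# Right-hand side of Theorem 22 on the log scale.
u = rng.standard_normal(d)
w = C_inv_half @ u
quad = w @ (T @ np.linalg.solve(I - T, w))     # <T (I-T)^{-1} C^{-1/2}u, C^{-1/2}u>
_, logdet = np.linalg.slogdet(I - T)
log_rn_formula = -0.5 * logdet - 0.5 * quad

# Direct ratio of the two Gaussian densities.
log_rn_direct = log_density_zero_mean(u, Q) - log_density_zero_mean(u, C)
print(log_rn_formula, log_rn_direct)
```

The two values agree because $T(I-T)^{-1} = C^{1/2}Q^{-1}C^{1/2} - I$ and $\det (I-T) = \det Q / \det C$ in this setting.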

We note that the assumptions of Theorem 22 can be relaxed to $I - C^{-1/2}QC^{-1/2}$ being Hilbert–Schmidt, which is known as the Feldman–Hajek theorem. In this case, too, an expression for the Radon–Nikodym derivative can be obtained, see [2, Corollary 6.4.11].

Finally, we recall two simple but useful facts resulting from a change of variables.

Lemma 23

Let ${\mathcal {H}}$ be a separable Hilbert space, $0<s<\infty $ and $h\in {\mathcal {H}}$.

1. Assume $\mu = N(m,C)$, $\nu = N(m+h, s^2C)$ on ${\mathcal {H}}$ and $f:{\mathcal {H}} \rightarrow \mathbb {R}$. Then

$$\begin{aligned} \int _{{\mathcal {H}}} f(v) \mu (\mathrm {d}v) = \int _{{\mathcal {H}}} f\left( \frac{1}{s} (v - h)\right) \, \nu (\mathrm {d}v). \end{aligned}$$

2. Assume $\mu _1 = N(m_1, C_1)$ and $\mu _2 = N(m_2, C_2)$ are equivalent with $\frac{\mathrm {d}\mu _2}{\mathrm {d}\mu _1}(u) = \pi (u)$. Then the measures $\nu _1 = N(m_1+h, s^2C_1)$ and $\nu _2 = N(m_2+h, s^2C_2)$ are also equivalent with

$$\begin{aligned} \frac{\mathrm {d}\nu _2}{\mathrm {d}\nu _1}(u) = \pi \left( \frac{u-h}{s}\right) . \end{aligned}$$

1.2 Proofs

The following proofs are rather operator theoretic and rely heavily on the holomorphic functional calculus. We refer to [8, Section VII.3] for a comprehensive introduction.

1.2.1 Proof of Lemma 2

From the proof of Proposition 1 we know that $(I + H_\varGamma )^{-1}:{\mathcal {H}} \rightarrow {\mathcal {H}}$ is self-adjoint and that $\Vert (I + H_\varGamma )^{-1}\Vert \le 1$. Thus, $I-s^2(I+H_\varGamma )^{-1}$ is also a self-adjoint, bounded and positive operator on ${\mathcal {H}}$, and its square root operator appearing in (16) exists. This yields the well-definedness of $A_\varGamma : \mathrm {Im}\,C^{1/2}\rightarrow {\mathcal {H}}$. We now prove that $A_\varGamma $ is a bounded operator on $\mathrm {Im}\,C^{1/2}$. For $s=0$ we get $A_\varGamma = I$ and the assertion follows, so we assume $s\in (0,1)$. Let us now define $f:\mathbb {C}{\setminus } \{-1\} \rightarrow \mathbb {C}$ by

$$\begin{aligned} f(z) = \sqrt{1 - s^2(1+z)^{-1}}. \end{aligned}$$

The function $f$ is analytic in the complex half plane $\{z \in \mathbb {C}: \mathfrak {R}(z) > s^2-1\}$, since $\mathfrak {R}(1+z) > s^2$ implies

$$\begin{aligned} \mathfrak {R}\left( (1+z)^{-1} \right) = \frac{\mathfrak {R}(1+z)}{|1+z|^2} \le \frac{1}{\mathfrak {R}(1+z)}< \frac{1}{s^2}, \end{aligned}$$

so that $\mathfrak {R}\left( 1 - s^2(1+z)^{-1}\right) > 0$ and the principal branch of the square root is analytic there.

Denoting $\gamma :=\Vert H_\varGamma \Vert $, the spectrum of $H_\varGamma = C^{1/2}\varGamma C^{1/2}$ is contained in $[0,\gamma ]$. Since $s<1$, the function $f$ is analytic in a neighborhood, say $\mathcal {N}[0,\gamma ]$, of $[0,\gamma ]$. Hence, by functional calculus we obtain

$$\begin{aligned} \sqrt{I - s^2 \left( I + H_\varGamma \right) ^{-1} } = f(H_\varGamma ) = \frac{1}{2\pi i} \int _{\partial \mathcal N[0,\gamma ]} f(\zeta )\, (\zeta I - H_\varGamma )^{-1} \, \mathrm {d}\zeta . \end{aligned}$$
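The contour representation of $f(H_\varGamma )$ can be reproduced numerically for a matrix. The sketch below assumes a random symmetric positive semidefinite $5\times 5$ matrix `H` as a stand-in for $H_\varGamma $, rescaled so that $\gamma = 2$, and takes as contour a circle around $[0,\gamma ]$ that stays inside the half plane $\{\mathfrak {R}(z) > s^2-1\}$; the circle, the value $s = 0.5$ and the trapezoidal rule are choices made here for illustration.

```python
import numpy as np

s = 0.5

def f(z):
    """f(z) = sqrt(1 - s^2 / (1 + z)), principal branch of the square root."""
    return np.sqrt(1.0 - s**2 / (1.0 + z))

rng = np.random.default_rng(3)
d = 5
A = rng.standard_normal((d, d))
H = A @ A.T                               # symmetric PSD stand-in for H_Gamma
H *= 2.0 / np.linalg.norm(H, 2)           # rescale so that gamma = ||H|| = 2
gamma = np.linalg.norm(H, 2)

# Reference value of f(H) from the spectral decomposition of H.
w, V = np.linalg.eigh(H)
f_H_spectral = (V * f(w)) @ V.T

# Circle enclosing [0, gamma]; its leftmost point is -eps > s^2 - 1.
eps = 0.5 * (1.0 - s**2)
center, radius = 0.5 * gamma, 0.5 * gamma + eps
n_nodes = 400
theta = 2.0 * np.pi * np.arange(n_nodes) / n_nodes
zeta = center + radius * np.exp(1j * theta)
dzeta = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / n_nodes)

# Trapezoidal rule for (1 / (2 pi i)) * integral of f(zeta) (zeta I - H)^{-1} d zeta.
I = np.eye(d)
f_H_contour = np.zeros((d, d), dtype=complex)
for z, dz in zip(zeta, dzeta):
    f_H_contour += f(z) * np.linalg.inv(z * I - H) * dz
f_H_contour /= 2j * np.pi

contour_error = np.abs(f_H_contour - f_H_spectral).max()
print(contour_error)
```

The trapezoidal rule converges geometrically here because the integrand is analytic in an annulus around the circle, which is the same analyticity the proof relies on.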

Due to analyticity, we can approximate $f$ by a sequence of polynomials $p_n$ of degree $n$ which converge uniformly on $\mathcal {N}[0,\gamma ]$ to $f$ as $n\rightarrow \infty $. Then, by [8, Lemma VII.3.13], we have

$$\begin{aligned} \Vert p_n(H_\varGamma ) - f(H_\varGamma ) \Vert _{{\mathcal {H}}\rightarrow {\mathcal {H}}} \rightarrow 0, \end{aligned}$$

for $n \rightarrow \infty $. Since the polynomials $p_n$ can be represented as $ p_n(z) = \sum _{k=0}^n a_k^{(n)} z^k$, we obtain further

$$\begin{aligned} C^{1/2} \, p_n(H_\varGamma ) = C^{1/2} \sum _{k=0}^n a^{(n)}_k \, (C^{1/2}\varGamma C^{1/2})^k = p_n(C\varGamma ) \, C^{1/2}. \end{aligned}$$
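The intertwining relation $C^{1/2} p_n(H_\varGamma ) = p_n(C\varGamma )\, C^{1/2}$ is purely algebraic and can be verified for matrices. A minimal sketch, assuming random positive (semi)definite matrices as stand-ins for $C$ and $\varGamma $ and an arbitrary hypothetical coefficient vector `coeffs`:

```python
import numpy as np

def sym_sqrt(M):
    """Square root of a symmetric positive definite matrix via eigh."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def matrix_polynomial(coeffs, M):
    """Evaluate sum_k coeffs[k] * M^k by Horner's scheme."""
    result = np.zeros_like(M)
    for a in reversed(coeffs):
        result = result @ M + a * np.eye(M.shape[0])
    return result

rng = np.random.default_rng(4)
d = 4
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))
C = A @ A.T / d + np.eye(d)       # stand-in for the covariance C
Gamma = B @ B.T / d               # stand-in for the positive operator Gamma

C_half = sym_sqrt(C)
H = C_half @ Gamma @ C_half       # H_Gamma = C^{1/2} Gamma C^{1/2}

coeffs = [0.3, -1.2, 0.7, 0.05]   # hypothetical a_0, ..., a_3
lhs = C_half @ matrix_polynomial(coeffs, H)
rhs = matrix_polynomial(coeffs, C @ Gamma) @ C_half
print(np.abs(lhs - rhs).max())
```

The identity follows term by term from $C^{1/2}(C^{1/2}\varGamma C^{1/2})^k = (C\varGamma )^k C^{1/2}$.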

By [14, Proposition 1] we have

$$\begin{aligned} \mathrm{spec}(C\varGamma \mid {\mathcal {H}}) = \mathrm{spec}(C^{1/2}\varGamma C^{1/2} \mid {\mathcal {H}})\subseteq [0,\gamma ] \end{aligned}$$

where $\mathrm{spec}({\cdot }\mid {\mathcal {H}})$ denotes the spectrum on ${\mathcal {H}}$, and, thus, we can conclude $\Vert p_n(C\varGamma ) - f(C\varGamma ) \Vert _{{\mathcal {H}} \rightarrow {\mathcal {H}}} \rightarrow 0$ as $n \rightarrow \infty $ again by [8, Lemma VII.3.13]. Hence,

$$\begin{aligned} C^{1/2} f(H_\varGamma )&= \lim _{n\rightarrow \infty } C^{1/2} \, p_n(H_\varGamma ) = \lim _{n\rightarrow \infty } p_n(C\varGamma ) \, C^{1/2} = f(C\varGamma ) C^{1/2} \end{aligned}$$

and

$$\begin{aligned} A_{\varGamma }= C^{1/2} f(H_\varGamma ) C^{-1/2} = f(C\varGamma ) C^{1/2}C^{-1/2} = f(C\varGamma ) \end{aligned}$$

where $f(C\varGamma )$ is by construction a bounded operator on ${\mathcal {H}}$.
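The conclusion $A_\varGamma = C^{1/2} f(H_\varGamma ) C^{-1/2} = f(C\varGamma )$, together with the coincidence of the spectra of $C\varGamma $ and $H_\varGamma $, can be illustrated for matrices. This sketch assumes invertible $C$, so that $C\varGamma $ is similar to $H_\varGamma $ and hence diagonalizable; $f(C\varGamma )$ is then evaluated through an eigendecomposition rather than the contour integral. The value $s = 0.5$ and the random matrices are hypothetical.

```python
import numpy as np

s = 0.5

def f(z):
    """f(z) = sqrt(1 - s^2 / (1 + z))."""
    return np.sqrt(1.0 - s**2 / (1.0 + z))

def sym_fun(M, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fun(w)) @ V.T

rng = np.random.default_rng(5)
d = 4
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))
C = A @ A.T / d + np.eye(d)       # stand-in for C (invertible here)
Gamma = B @ B.T / d               # stand-in for Gamma

C_half = sym_fun(C, np.sqrt)
H = C_half @ Gamma @ C_half

# A_Gamma as defined through H_Gamma.
A_via_H = C_half @ sym_fun(H, f) @ np.linalg.inv(C_half)

# f(C Gamma) through the eigendecomposition of the non-symmetric matrix C Gamma.
lam, W = np.linalg.eig(C @ Gamma)
f_C_Gamma = ((W * f(lam)) @ np.linalg.inv(W)).real

spectrum_gap = np.abs(np.sort(lam.real) - np.linalg.eigvalsh(H)).max()
operator_gap = np.abs(A_via_H - f_C_Gamma).max()
print(spectrum_gap, operator_gap)
```

In infinite dimensions $C^{-1/2}$ is unbounded, which is why the proof has to pass through polynomial approximation instead of a direct similarity argument.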

1.2.2 Proof of Lemma 12

By [7, Theorem 1] the relation $\mathrm{Im}(\varDelta _\varGamma ) \subseteq \mathrm{Im}(C^{1/2})$ holds iff there exists a bounded operator $B:{\mathcal {H}}\rightarrow {\mathcal {H}}$ such that

$$\begin{aligned} \varDelta _\varGamma = C^{1/2} B. \end{aligned} \tag{37}$$

Thus, $\mathrm{Im}(\varDelta _\varGamma ) \subseteq \mathrm{Im}(C^{1/2})$ is equivalent to $C^{-1/2}\varDelta _\varGamma $ being bounded on ${\mathcal {H}}$. In order to construct and analyze the operator B, we define $f:\mathbb {C}{\setminus }\{-1\} \rightarrow \mathbb {C}$ by

$$\begin{aligned} f(z) := \sqrt{1 - s^2(1+z)^{-1}} - \sqrt{1-s^2}, \end{aligned}$$

which is analytic in $\{z \in \mathbb {C}: \mathfrak {R}(z) > s^2-1\}$, cf. the proof of Lemma 2, and particularly in

$$\begin{aligned} V = \{z \in \mathbb {C}: \mathrm{dist}(z, [0,\gamma ]) \le \varepsilon \}, \qquad 0< \varepsilon < 1-s^2, \end{aligned}$$

where $\gamma := \Vert H_\varGamma \Vert $. We have the following representation

$$\begin{aligned} -\varDelta _\varGamma&= A_{\varGamma } - \sqrt{1-s^2} I\\&= C^{1/2}\,\left( \sqrt{I - s^2 \left( I + H_\varGamma \right) ^{-1} } - \sqrt{1-s^2} I\right) \, C^{-1/2}\\&= C^{1/2}\, f(H_\varGamma ) \, C^{-1/2} \end{aligned}$$

with

$$\begin{aligned} f(H_\varGamma ) = \frac{1}{2\pi i} \int _{\partial V} f(\zeta )\, (\zeta I - H_\varGamma )^{-1} \, \mathrm {d}\zeta \end{aligned}$$

see [8, Chapter VII.3]. Hence, if we can prove that $B = - f(H_\varGamma ) \, C^{-1/2}$ is a bounded operator on ${\mathcal {H}}$, we have shown the assertion.

To this end, let $p_n(z) = \sum _{k=0}^n a_k^{(n)} z^k$ be polynomials of degree $n$, with $n\in \mathbb {N}$, which converge uniformly on $V$ to $f$. Such polynomials exist due to the analyticity of $f$, and since $f(0) = 0$ we can assume w.l.o.g. that $a_0^{(n)} = 0$ for all $n\in \mathbb {N}$. This leads to

$$\begin{aligned} p_n(H_\varGamma )&= C^{1/2} \varGamma ^{1/2} \left( \sum _{k=1}^n a_k^{(n)} (\varGamma ^{1/2} C \varGamma ^{1/2} )^{k-1} \right) \varGamma ^{1/2} C^{1/2}\\&= C^{1/2} \varGamma ^{1/2} \; q_{n-1}(\varGamma ^{1/2} C \varGamma ^{1/2}) \; \varGamma ^{1/2} C^{1/2} \end{aligned}$$

with $q_{n-1}(z) := \sum _{k=1}^{n} a_{k}^{(n)} z^{k-1} = p_n(z)/z$. Now, [14, Proposition 1] implies that the operators $C^{1/2}\varGamma C^{1/2}$ and $\varGamma ^{1/2}C \varGamma ^{1/2}$ share the same spectrum, since C and $\varGamma $ are positive. Thus, $\mathrm{spec}(\varGamma ^{1/2}C \varGamma ^{1/2}\mid {\mathcal {H}}) \subset [0,\gamma ]$ and we have

$$\begin{aligned} q_n(\varGamma ^{1/2}C \varGamma ^{1/2}) = \frac{1}{2\pi i} \int _{\partial V} q_n(\zeta )\, (\zeta I - \varGamma ^{1/2}C \varGamma ^{1/2})^{-1} \, \mathrm {d}\zeta , \qquad n\in \mathbb {N}. \end{aligned}$$

Moreover, the polynomials $q_n$ are a Cauchy sequence in $C(\partial V)$, since

$$\begin{aligned} \sup _{\zeta \in \partial V}|q_n(\zeta ) - q_m(\zeta )|&\quad \le \sup _{\zeta \in \partial V}\frac{|\zeta |}{\min _{\eta \in \partial V} |\eta |} |q_n(\zeta ) - q_m(\zeta )| \\&\quad = \frac{1}{\min _{\eta \in \partial V} |\eta |} \sup _{\zeta \in \partial V}|\zeta q_n(\zeta ) - \zeta q_m(\zeta )| \\&\quad = \frac{1}{\min _{\eta \in \partial V} |\eta |} \sup _{\zeta \in \partial V}|p_{n+1}(\zeta ) - p_{m+1}(\zeta )| \end{aligned}$$

where $\min _{\eta \in \partial V} |\eta | = \varepsilon > 0$ due to our choice of V. Thus, the polynomials $q_n$ converge uniformly on $\partial V$ to a function g. This implies that the operators $q_n(\varGamma ^{1/2}C \varGamma ^{1/2})$ converge in the operator norm to a bounded operator

$$\begin{aligned} g(\varGamma ^{1/2}C \varGamma ^{1/2}) := \frac{1}{2\pi i} \int _{\partial V} g(\zeta )\, (\zeta I - \varGamma ^{1/2}C \varGamma ^{1/2})^{-1} \, \mathrm {d}\zeta . \end{aligned}$$

We arrive at

$$\begin{aligned} f(H_\varGamma )&= \lim _{n\rightarrow \infty } p_n(C^{1/2}\varGamma C^{1/2}) \\&= \lim _{n\rightarrow \infty } C^{1/2} \varGamma ^{1/2} \; q_{n-1}(\varGamma ^{1/2} C \varGamma ^{1/2}) \; \varGamma ^{1/2} C^{1/2} \\&= C^{1/2} \varGamma ^{1/2} \; g(\varGamma ^{1/2} C \varGamma ^{1/2}) \; \varGamma ^{1/2} C^{1/2}, \end{aligned}$$

which yields

$$\begin{aligned} B = -f(H_\varGamma ) C^{-1/2} = -C^{1/2} \varGamma ^{1/2} \; g(\varGamma ^{1/2} C \varGamma ^{1/2}) \varGamma ^{1/2} \end{aligned}$$

being bounded on ${\mathcal {H}}$.
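The factorization at the heart of this proof can be checked for matrices. The sketch below assumes positive definite stand-ins for $C$ and $\varGamma $, so that $\varGamma ^{1/2}C\varGamma ^{1/2}$ has strictly positive eigenvalues, and it takes $g(z) = f(z)/z$, which is what the uniform limit of $q_n = p_{n+1}/z$ on $\partial V$ must be; the value $s = 0.5$ and the helper names are hypothetical.

```python
import numpy as np

s = 0.5

def f(z):
    """f(z) = sqrt(1 - s^2 / (1 + z)) - sqrt(1 - s^2), so that f(0) = 0."""
    return np.sqrt(1.0 - s**2 / (1.0 + z)) - np.sqrt(1.0 - s**2)

def g(z):
    """g(z) = f(z) / z, evaluated only at strictly positive z in this sketch."""
    return f(z) / z

def sym_fun(M, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fun(w)) @ V.T

rng = np.random.default_rng(6)
d = 4
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))
C = A @ A.T / d + np.eye(d)             # stand-in for C
Gamma = B @ B.T / d + 0.5 * np.eye(d)   # positive definite stand-in for Gamma

C_half = sym_fun(C, np.sqrt)
Gamma_half = sym_fun(Gamma, np.sqrt)
H = C_half @ Gamma @ C_half             # C^{1/2} Gamma C^{1/2}
M = Gamma_half @ C @ Gamma_half         # Gamma^{1/2} C Gamma^{1/2}

# f(H_Gamma) = C^{1/2} Gamma^{1/2} g(M) Gamma^{1/2} C^{1/2}
f_H = sym_fun(H, f)
g_M = sym_fun(M, g)
factorized = C_half @ Gamma_half @ g_M @ Gamma_half @ C_half

# B = -f(H_Gamma) C^{-1/2} = -C^{1/2} Gamma^{1/2} g(M) Gamma^{1/2}
B_direct = -f_H @ np.linalg.inv(C_half)
B_factorized = -C_half @ Gamma_half @ g_M @ Gamma_half
print(np.abs(f_H - factorized).max(), np.abs(B_direct - B_factorized).max())
```

The second expression for $B$ no longer contains $C^{-1/2}$, which is exactly why $B$ remains bounded when $C^{-1/2}$ is not.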

About this article

Cite this article

Rudolf, D., Sprungk, B. On a Generalization of the Preconditioned Crank–Nicolson Metropolis Algorithm. Found Comput Math 18, 309–343 (2018). https://doi.org/10.1007/s10208-016-9340-x

Received: 15 April 2015
Revised: 02 February 2016
Accepted: 05 October 2016
Published: 17 November 2016
Issue Date: April 2018
DOI: https://doi.org/10.1007/s10208-016-9340-x
