1 Introduction

The analysis and computation of spectral properties of operators form core parts of many branches of science and mathematics, arising in diverse fields such as differential and integral equations, orthogonal polynomials, quantum mechanics, statistical mechanics, integrable systems and optics [15, 43, 44, 69, 110, 129, 138]. Correspondingly, the problem of numerically computing the spectrum, \(\sigma (T)\), of an operator T acting on the canonical separable Hilbert space \(l^2({\mathbb {N}})\) has attracted a large amount of interest over the last 60 years or so [5,6,7,8, 21, 22, 24, 26, 27, 34, 38, 40, 46, 49, 73, 79,80,81, 94, 98, 99, 116, 120, 122,123,124]. However, the richness, beauty and difficulties that are encountered in infinite dimensions lie not just in the set \(\sigma (T)\subset {\mathbb {C}}\), but also in the generalisation of projections onto eigenspaces and the possibility of different spectral types. Specifically, given a normal operator T, there is an associated projection-valued measure (resolution of the identity), which we denote by \(E^T\), whose existence is guaranteed by the spectral theorem and whose support is \(\sigma (T)\) [86, 87, 113]. This allows the representation of the operator T as an integral over \(\sigma (T)\), analogous to the finite-dimensional case of diagonalisation:

$$\begin{aligned} Tx=\left[ \int _{\sigma (T)} \lambda dE^T(\lambda )\right] x,\quad \forall x\in {\mathcal {D}}(T), \end{aligned}$$
(1.1)

where \({\mathcal {D}}(T)\) denotes the domain of T. For example, if T is compact, then \(E^T\) corresponds to projections onto eigenspaces, familiar from the finite-dimensional setting. However, in general, the situation is much richer and more complicated, with different types of spectra (pure point, absolutely continuous and singular continuous). An excellent and readable introduction can be found in Halmos’ article [77].

The computation of \(E^T\), along with its various decompositions and their supports, is of great interest, both theoretically and for practical applications. For example, spectral measures are intimately related to correlation functions in signal processing, resonance phenomena in scattering theory, and stability analysis for fluids. Moreover, the computation of \(E^T\) allows one to compute many additional objects, for which we provide the first general algorithms in this paper, such as the functional calculus (Theorem 4.1), the Radon–Nikodym derivative of the absolutely continuous component of the measure (Theorem 4.2), and the spectral measures and spectral set decompositions (Theorem 3.2 and Theorem 5.1). For instance, in Sect. 1.2.1 we show how our results allow the computation of spectral measures and the functional calculus of almost arbitrary self-adjoint partial differential operators on \(L^2({\mathbb {R}}^d)\). An important class of examples is given by solutions of evolution equations such as the Schrödinger equation on \(L^2({\mathbb {R}}^d)\) with a potential of locally bounded total variation. We prove that this is computationally possible even when an algorithm is only allowed to point sample the potential. A numerical example of fractional diffusion for a discrete quasicrystal is also given in Sect. 6.4.

Despite its importance, there has been no general method able to compute spectral measures of normal operators. Although there is a rich literature on the theory of spectral measures, most of the efforts to develop computational tools have focused on specific examples where analytical formulae are available (or perturbations thereof) or on classes of operators that carry a lot of structure. Indeed, apart from the work of Webb and Olver [151] (which deals with compact perturbations of tridiagonal Toeplitz operators) and methods for computing spectral density functions of Sturm–Liouville problems and other highly structured operators, there has been limited success in computing the measure \(E^T\). In some sense, this is not surprising given the difficulty in rigorously computing spectra. One can consider computing spectral measures/projections as the infinite-dimensional analogue of computing projections onto eigenspaces. Thus, from a numerical/computational point of view, the current state of affairs in infinite-dimensional spectral computations is highly deficient, analogous in finite dimensions to being able to compute the location of eigenvalues but not eigenvectors! It has been unknown if the general computation of spectral measures is possible, even for simple subclasses such as discrete Schrödinger operators. In other words, the computational problem of “diagonalisation” through computing spectral measures remains an important and predominantly open problem.

In this paper, we solve this problem by providing the first set of algorithms for the computation of spectral measures for a large class of self-adjoint and unitary operators, namely, those whose matrix columns decay at a known asymptotic rate. This is a very weak assumption and covers the majority of operators, even unbounded, found in applications. In particular, it covers operators whose representation is sparse (such as many of the graph or lattice operators typically dealt with in physics), as well as partial differential operators once a suitable basis has been selected (see Theorem 1.1 and Appendix B). We also show how to compute spectral measure decompositions, the functional calculus, the density of the absolutely continuous part of the measure (Radon–Nikodym derivative) and different types of spectra (pure point, absolutely continuous and singular continuous—these sets often characterise different physical properties in quantum mechanics [2, 41, 42, 53, 61, 93, 119, 127]). A central ingredient of these new algorithms is the computation of the resolvent operator with error control through appropriate rectangular truncations (Theorem 2.1). Furthermore, we demonstrate the applicability of our algorithms. The algorithms are efficient and parallelisable, allowing large-scale computations.

1.1 The solvability complexity index and classification of problems

A surprise thrown up by the infinite-dimensional spectral problem, which turns out to be quite generic, is the Solvability Complexity Index (SCI) [81]. The SCI provides a hierarchy for classifying the difficulty of computational problems. In classical numerical analysis, when computing spectra, one hopes to construct an algorithm, \(\Gamma _n\), with one limit such that for an operator T,

$$\begin{aligned} \Gamma _n(T) \rightarrow \sigma (T), \quad \text {as } n \rightarrow \infty , \end{aligned}$$
(1.2)

preferably with some form of error control or rate of convergence. However, this is not always possible. For example, when considering the class of bounded operators, the best possible alternative is an algorithm depending on three indices \(n_1, n_2, n_3\) such that

$$\begin{aligned} \lim _{n_3 \rightarrow \infty } \lim _{n_2 \rightarrow \infty } \lim _{n_1 \rightarrow \infty } \Gamma _{n_3,n_2,n_1}(T) = \sigma (T). \end{aligned}$$

Any algorithm with fewer than three limits will fail on some bounded operator, and neither error control nor convergence rates on any of the limits are possible since these would reduce the required number of limits. However, for self-adjoint operators, it is possible to reduce the number of limits to two, but not one [12, 81]. With more structure (such as sparsity or column decay) it is possible to compute the spectrum in one limit with a certain type of error control [40]. Hence, the only way to characterise the computational spectral problem is through a hierarchy, classifying the difficulty of computing spectral properties of different subclasses of operators. The SCI classifies difficulty by considering the minimum number of limits that one must take to calculate the quantity of interest (see Appendix A for a full definition). The SCI has roots in the work of Smale [132, 134], and his program on the foundations of computational mathematics and scientific computing, though it is quite distinct. The notions of Turing computability [145] and computability in the Blum–Shub–Smale (BSS) [17] sense become special cases, and impossibility results that are proven in the SCI hierarchy hold in all models of computation. The phenomenon of needing several limits also covers general numerical analysis problems, such as Smale’s question on the existence of purely iterative algorithms for polynomial root finding [133]. As demonstrated by McMullen [101, 102] and Doyle and McMullen [51], this is a case where several limits are needed in the computation, and their results become special cases of classification in the SCI hierarchy [12]. Extensions of the hierarchy to error control [33, 36, 37] also have potential applications in the growing field of computer-assisted proofs, where one must perform a computation with absolute certainty. See, for example, the work of Fefferman and Seco on the Dirac–Schwinger conjecture [55, 56] and Hales on Kepler’s conjecture (Hilbert’s 18th problem) [75], both of which implicitly provide classifications in the SCI hierarchy. As well as spectral problems [12, 33, 36,37,38,39,40, 81], the SCI hierarchy has recently been used to solve problems related to computing resonances [13, 14], computing solutions of semigroups and evolution PDEs [10, 35], and computational barriers for stable and accurate neural networks [4].

An informal definition of the SCI hierarchy is as follows, with a detailed summary contained in Appendix A. The SCI hierarchy is based on the concept of a computational problem. This is described by a function

$$\begin{aligned} \Xi :\Omega \rightarrow {\mathcal {M}} \end{aligned}$$

that we want to compute, where \(\Omega \) is some domain, and \(({\mathcal {M}},d)\) is a metric space. For example, we could take \(\Xi (T) = \sigma (T)\) (the spectrum) for some class of operators \(\Omega \) and \({\mathcal {M}}\) the collection of non-empty closed subsets of \({\mathbb {C}}\) equipped with the Attouch–Wets metric. The SCI of a computational problem is the smallest number of limits needed in order to compute the solution. For a given set of evaluation functions (the information our algorithm is allowed to read—in our case, \(\Lambda _1\) or \(\Lambda _2\) defined in (1.18)), class of objects (in our case, subclasses of operators acting on \(l^2({\mathbb {N}})\)) and model of computation \(\alpha \) (in this paper general, G, or arithmetic, A) we define:

  1. (i)

    \(\Delta ^{\alpha }_0\) is the set of problems that can be computed in finite time, the SCI \(=0\).

  2. (ii)

    \(\Delta ^{\alpha }_1\) is the set of problems that can be computed using one limit (the SCI \(=1\)) with control of the error, i.e. \(\exists \) a sequence of algorithms \(\{\Gamma _n\}\) such that \(d(\Gamma _n(A), \Xi (A)) \le 2^{-n}, \, \forall A \in \Omega \).

  3. (iii)

    \(\Delta ^{\alpha }_2\) is the set of problems that can be computed using one limit (the SCI \(=1\)) without error control, i.e. \(\exists \) a sequence of algorithms \(\{\Gamma _n\}\) such that \(\lim _{n\rightarrow \infty }\Gamma _n(A) = \Xi (A), \, \forall A \in \Omega \).

  4. (iv)

    \(\Delta ^{\alpha }_{m+1}\), for \(m \in {\mathbb {N}}\), is the set of problems that can be computed using m limits (the SCI \(\le m\)), i.e. \(\exists \) a family of algorithms \(\{\Gamma _{n_m, \ldots , n_1}\}\) such that

    $$\begin{aligned} \lim _{n_m \rightarrow \infty }\ldots \lim _{n_1\rightarrow \infty }\Gamma _{n_m,\ldots , n_1}(A) = \Xi (A), \, \forall A \in \Omega . \end{aligned}$$

The class \(\Delta _1\) is, of course, a highly desired class; however, non-trivial spectral problems are higher up in the hierarchy. For example, the following classifications are known [12, 81]:

  1. (i)

    The general spectral problem is in \(\Delta _4{\setminus } \Delta _3\).

  2. (ii)

    The self-adjoint spectral problem is in \(\Delta _3{\setminus } \Delta _2\).

  3. (iii)

    The compact spectral problem is in \(\Delta _2{\setminus } \Delta _1\).

Here, the notation \({\setminus }\) indicates the standard “setminus”. Hence, the computational spectral problem becomes an infinite classification theory to characterise the above hierarchy. In order to do so, there will, necessarily, have to be many different types of algorithms. Characterising the hierarchy will yield a myriad of different approaches, as different structures on the various classes of operators will require specific algorithms.

This paper provides classifications of spectral problems associated with \(E^T\) (such as decompositions of the measure and spectrum) in the SCI hierarchy, some of which can be computed in one limit. We provide algorithms for these problems, and one of the main tools used is the computation of the resolvent operator \(R(z,T):=(T-zI)^{-1}\) with error control (Theorem 2.1). This is possible through appropriate rectangular truncations of the infinite-dimensional operator. This approach differs from finite-dimensional techniques, which typically consider square truncations.

Remark 1

(Recursivity and independence of the model of computation) The constructive inclusion results we provide hold for arithmetic algorithms and the impossibility results hold for general algorithms. We refer the reader to Appendix A for a detailed explanation. Put simply, this means that the algorithms constructed can be recursively implemented with inexact input and restrictions to arithmetic operations over the rationals (it is also straightforward to implement them using interval arithmetic), whereas the impossibility results hold in any model of computation (such as the Turing or BSS models).

1.2 Summary of the main results

1.2.1 Partial differential operators

For \(N \in {\mathbb {N}}\), consider the operator formally defined on \(L^2({\mathbb {R}}^d)\) by

$$\begin{aligned} Lu(x)=\sum _{k\in {\mathbb {Z}}_{\ge 0}^d,\left| k\right| \le N}a_k(x)\partial ^ku(x), \end{aligned}$$
(1.3)

where throughout we use multi-index notation with \(\left| k\right| =\max \{\left| k_1\right| ,\ldots ,\left| k_d\right| \}\) and \(\partial ^k=\partial ^{k_1}_{x_1}\partial ^{k_2}_{x_2}\ldots \partial ^{k_d}_{x_d}\). We will assume that the coefficients \(a_k(x)\) are complex-valued measurable functions on \({\mathbb {R}}^d\) and that L is self-adjoint. For dimension d and \(r>0\) consider the space

$$\begin{aligned} {\mathcal {A}}_r=\{f\in M([-r,r]^d):\left\| f\right\| _{\infty }+\mathrm {TV}_{[-r,r]^d}(f)<\infty \}, \end{aligned}$$
(1.4)

where \(M([-r,r]^d)\) denotes the set of measurable functions on the hypercube \([-r,r]^d\) and \(\mathrm {TV}_{[-r,r]^d}\) denotes the total variation norm in the sense of Hardy and Krause [103]. This space becomes a Banach algebra when equipped with the norm [18]

$$\begin{aligned} \left\| f\right\| _{{\mathcal {A}}_r}:=\left\| f|_{[-r,r]^d}\right\| _{\infty }+(3^d+1)\mathrm {TV}_{[-r,r]^d}(f). \end{aligned}$$

Let \(\Omega _{\mathrm {PDE}}\) consist of all such L such that the following assumptions hold:

  1. (1)

    The set \(C_{0}^\infty ({\mathbb {R}}^d)\) of smooth, compactly supported functions forms a core of L.

  2. (2)

    For each of the functions \(a_k(x)\), there exists a positive constant \(A_k\) and an integer \(B_k\) (both possibly unknown) such that

    $$\begin{aligned} \left| a_k(x)\right| \le A_k\left( 1+\left| x\right| ^{2B_k}\right) , \end{aligned}$$

    almost everywhere on \({\mathbb {R}}^d\), that is, we have at most polynomial growth of the coefficients.

  3. (3)

    The restrictions \(a_k|_{[-r,r]^d}\in {\mathcal {A}}_r\) for all \(r>0\).

We consider the case where our algorithms can do the following:

  1. 1.

    Evaluate any coefficient \(a_k(x)\) to a given precision at \(x\in {\mathbb {Q}}^d\), where \({\mathbb {Q}}\) denotes the field of rationals, and output an approximation in \({\mathbb {Q}}+i{\mathbb {Q}}\).

  2. 2.

    For each n, evaluate a positive number \(b_n(L)\) such that the sequence \(\{b_n(L)\}_{n\in {\mathbb {N}}}\) satisfies

    $$\begin{aligned} \sup _{n\in {\mathbb {N}}}\max _{\left| k\right| \le N}\frac{\left\| a_k\right\| _{{\mathcal {A}}_n}}{b_n(L)}<\infty . \end{aligned}$$
    (1.5)

In Appendix B, we prove (and state a more precise version of) the following theorem.

Theorem 1.1

(Spectral properties of self-adjoint partial differential operators can be computed). Given the above set-up, there exist sequences of arithmetic algorithms that compute spectral measures, the functional calculus, and Radon–Nikodym derivatives of the absolutely continuous part of the measure over the class \(\Omega _\mathrm {PDE}\). In other words, these objects can be computed in one limit with \(\mathrm {SCI}=1\).

Remark 2

As noted in Appendix B, we can extend this result to computing the decompositions into pure point, absolutely continuous and singular continuous parts of measures and spectra (with \(\mathrm {SCI}>1\)).

The above properties characterising \(\Omega _{\mathrm {PDE}}\) are deliberately very weak, and hence the class \(\Omega _{\mathrm {PDE}}\) is very large. For example, Schrödinger operators \(L=-\Delta +V\) with polynomially bounded potentials of locally bounded total variation are a subclass of \(\Omega _{\mathrm {PDE}}\). Hence, in this case, the theorem says that we can compute the spectral properties (measures, functional calculus etc.) of L by point sampling the potential V, if we have an asymptotic bound on the total variation of V over finite rectangles. In particular, we can solve the Schrödinger equation

$$\begin{aligned} \frac{du}{dt}=-iLu,\quad u_{t=0}=u_0 \end{aligned}$$
(1.6)

by computing \(\exp (-itL)u_0\) with guaranteed convergence. Applications of some of our results to semigroups (where error control can also be provided) are given in [35]. The proof of Theorem 1.1 can also be extended to other domains such as the half-line, and can be adapted to cope with other types of coefficients that are not of locally bounded total variation (for instance, Coulombic potentials for Dirac or Schrödinger operators). In order to prove Theorem 1.1, we prove theorems for operators on \(l^2({\mathbb {N}})\).

1.2.2 Operators on \(l^2({\mathbb {N}})\)

We consider self-adjoint operators given as an infinite matrix T whose columns decay at a known asymptotic rate:

$$\begin{aligned} \Vert (P_{f(n)}-I)TP_n\Vert =O(\alpha _n) \end{aligned}$$
(1.7)

for a sequence \(\alpha _n\downarrow 0\) and function \(f:{\mathbb {N}}\rightarrow {\mathbb {N}}\), where \(P_n\) denotes the orthogonal projection onto the linear span of the first n canonical basis vectors. This set-up is adapted in Appendix B to prove Theorem 1.1, where we make use of the following theorems proven for operators on \(l^2({\mathbb {N}})\).

We consider the problem of computing spectral measures. Specifically, for operators of the form (1.7) we develop the first algorithms and SCI classifications for:

  • Theorem 3.1: \(\Delta _2^A\) classification for the projection-valued spectral measure and, through taking inner products, the computation of the scalar spectral measures defined via

$$\begin{aligned} \mu _{x,y}^T(U)=\langle E^T({U}) x,y\rangle . \end{aligned}$$

This is done for open sets U and can be extended to other types of sets such as closed intervals or singletons (Theorem 5.2 shows that the problem \(\notin \Delta _1^G\) in certain cases). These scalar-valued spectral measures play an important role in, for example, quantum mechanics, where they correspond to the distribution of the physical observable modelled by a Hamiltonian T.

  • Theorem 3.2: \(\Delta _3^A{\setminus } \Delta _2^G\) classification for the decompositions of the projection-valued and scalar-valued spectral measures into absolutely continuous, singular continuous and pure point parts. These decompositions often characterise different physical properties in quantum mechanics [2, 41, 42, 53, 61, 93, 119, 127].

  • Theorem 4.1: \(\Delta _2^A\) classification for the functional calculus of operators, i.e. the computation of F(T) for suitable functions F. For brevity, we consider functions that are bounded and continuous on the spectrum of T. However, the proof makes clear that wider classes can be dealt with. In some cases, \(\Delta _1^A\) error control is possible, for instance, when considering the holomorphic functional calculus (see Sect. 6.4). A key application of this result is the computation of solutions of many evolution PDEs, such as the Schrödinger equation through the function \(F(z)=\exp (-izt)\).

  • Theorem 4.2: \(\Delta _2^A\) classification for the Radon–Nikodym derivatives (densities) of the absolutely continuous parts of the scalar spectral measures with convergence in the \(L^1\) sense on an open set. This requires a certain separation condition, without which our algorithm converges (Lebesgue) almost everywhere.

We also consider the computation of spectra as sets in the complex plane. Convergence is measured using the Hausdorff metric in the bounded case and using the Attouch–Wets metric in the unbounded case (i.e. uniform convergence on compact subsets of \({\mathbb {C}}\)). Specifically, we provide the first algorithms computing these quantities and prove in Theorem 5.1 that:

  • The absolutely continuous spectrum \(\sigma _{\mathrm {ac}}(T)\) can be computed in two limits but not one limit (\(\Delta _3^A{\setminus } \Delta _2^G\) classification).

  • The pure point spectrum \(\sigma _{\mathrm {pp}}(T)\) can be computed in two limits but not one limit (\(\Delta _3^A{\setminus } \Delta _2^G\) classification).

  • The singular continuous spectrum \(\sigma _{\mathrm {sc}}(T)\) can be computed in three limits (\(\Delta _4^A\) classification). If \(f(n)-n\ge \sqrt{2n}+1/2\), then the computation cannot be done in two limits (\(\notin \Delta _3^G\) classification). That is, if the local asymptotic bandwidth is allowed to grow sufficiently rapidly, three limits are needed, and this computational problem is exceedingly difficult. We do not know whether this growth condition on f can be dropped. However, without it, the problem still requires at least two limits (\(\notin \Delta _2^G\) classification).

The main tool used to prove the above results is

  • Theorem 2.1 and Corollary 2.2: The action of the resolvent \(x\rightarrow R(z,T)x\) can be computed with error control. This also opens up potential applications in computer-assisted proofs.

We demonstrate that the “one-limit” algorithms constructed in this paper are implementable and efficient. These provide the first set of algorithms addressing these problems, and we have provided extensive numerical experiments in Sect. 6. This includes orthogonal polynomials on the real line and unit circle (where we also discuss acceleration through extrapolation), as well as fractional diffusion for a two-dimensional quasicrystal. Since writing the initial version of this paper, our algorithms have also been implemented in the software package SpecSolve of [39], which uses the machinery of high-order rational kernels to accelerate computation of the Radon–Nikodym derivative of regular enough measures. In the current paper, we also show that in the case where the measure is regular enough, a global collocation method and local Richardson extrapolation for the computation of the Radon–Nikodym derivative can both accelerate convergence.

Finally, some brief remarks are in order.

  1. (i)

    The impossibility results hold in general, even when restricted to tridiagonal operators. Furthermore, many of the impossibility results hold for structured operators such as bounded discrete Schrödinger operators. Our results (constructive algorithms and impossibility results) also carry over to a large class of normal operators, including unitary operators or skew-adjoint operators, both of which are important in applications. For the sake of clarity, we have stuck to the self-adjoint case in the statement of theorems and proofs. Numerical examples for unitary operators are given in Sect. 6.3.

  2. (ii)

    The difficulty encountered when computing the singular continuous spectrum is partly due to the negative definition of the singular continuous part of a measure. It is the “leftover” part of the measure, the part that is not continuous with respect to Lebesgue measure and does not contain atoms. The challenge of studying \(\sigma _{\mathrm {sc}}\) analytically also reflects this difficulty—singular continuous spectra were once thought to be rather rare or exotic. However, they are quite generic; see, for example, [128] for a precise topological statement to this effect.

  3. (iii)

    One might at first expect computational results to be independent of the function f due to tridiagonalisation. However, the infinite-dimensional case is much more subtle than the finite-dimensional case. Using Householder transformations on a bounded sparse self-adjoint operator T leads to a tridiagonal operator, but, in general, this operator is T restricted to a strict subspace of \(l^2({\mathbb {N}})\). Part of the operator may be lost in the strong operator limit. Instead, one must consider a sum of possibly infinitely many tridiagonal operators (see [78, Ch. 2 & 8]). Hence some spectral problems may have different classifications for different f.

1.3 Open problems

The results of this paper form part of a program [33, 34, 36,37,38,39,40] on determining the foundations of computation (boundaries of what is possible) for infinite-dimensional spectral problems. As such, there remain many interesting open problems related to this paper, such as:

  • Computation of spectral measures for more general normal operators: Proposition 2.4 demonstrates how the techniques of this paper can be generalised to certain classes of normal operators whose spectrum does not necessarily lie along a curve. However, it is not obvious how to extend the techniques to completely general normal operators. For example, the method of integrating the resolvent along a contour cannot be easily extended to interior points of the spectrum.

  • Brown measure for non-normal operators: There is an extension of spectral measures to certain non-normal operators known as the Brown measure [25, 71, 72]. Computing spectra of non-normal operators is generally harder (in a sense made precise by the SCI hierarchy) than for normal operators (issues such as numerical stability are also present, even in the finite-dimensional case). It would be interesting to see if this phenomenon is also present for computing the Brown measure. Some works on approximating the Brown measure from matrix elements include [7, 11, 81].

1.4 A motivating example

As a motivating example, consider the case of a Jacobi operator with matrix

$$\begin{aligned} J=\begin{pmatrix} b_1 & a_1 & & \\ a_1 & b_2 & a_2 & \\ & a_2 & b_3 & \ddots \\ & & \ddots & \ddots \end{pmatrix}, \end{aligned}$$

where \(a_j,b_j\in {\mathbb {R}}\) and \(a_j>0\). An enormous amount of work exists on the study of these operators, and the correspondence between bounded Jacobi matrices and probability measures with compact support [45, 143]. The entries in the matrix provide the coefficients in the recurrence relation for the associated orthonormal polynomials. To study the canonical measure \(\mu _J\), one usually considers the principal resolvent function, which is defined on \({\mathbb {C}}\backslash \sigma (J)\) via

$$\begin{aligned} G(z):=\langle R(z,J)e_1,e_1\rangle =\int _{{\mathbb {R}}}\frac{d\mu _J(\lambda )}{\lambda -z}, \end{aligned}$$
(1.8)

and then takes z close to the real axis. The function G is also known in the differential equations and Schrödinger communities as the Weyl m-function [64, 143] and one can develop the discrete analogue of what is known as Weyl–Titchmarsh–Kodaira theory for Sturm–Liouville operators. Going back to the work of Stieltjes [136] (see also [1, 149]), there is a representation of G as a continued fraction:

$$\begin{aligned} G(z):=\frac{1}{-z+b_1-\frac{a_1^2}{-z+b_2-\ldots }}. \end{aligned}$$
(1.9)

One can also approximate G via finite truncated matrices [143].
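
As a concrete illustration of (1.9) and of the finite-matrix approach, the following is a minimal sketch (not the method developed in this paper) that evaluates the truncated continued fraction for a Jacobi operator and checks it against the (1, 1) entry of the resolvent of the corresponding square truncation; the test coefficients \(a_j=1/2\), \(b_j=0\) and the truncation depth are assumptions made purely for illustration.

```python
import numpy as np

def jacobi_m_function_cf(a, b, z):
    """Evaluate the depth-n truncation of the continued fraction (1.9) for G(z),
    given off-diagonal entries a[0..n-2] and diagonal entries b[0..n-1]."""
    g = 1.0 / (b[-1] - z)                 # innermost level of the fraction
    for j in range(len(b) - 2, -1, -1):   # work outwards
        g = 1.0 / (b[j] - z - a[j] ** 2 * g)
    return g

# Illustrative (assumed) coefficients a_j = 1/2, b_j = 0 and depth n = 500.
n = 500
a, b = 0.5 * np.ones(n - 1), np.zeros(n)
z = 0.3 + 0.05j                           # a point off the real axis

G_cf = jacobi_m_function_cf(a, b, z)

# The depth-n fraction coincides with <(J_n - zI)^{-1} e_1, e_1> for the n x n
# truncation J_n, i.e. the finite truncated matrix approach mentioned above.
J_n = np.diag(b) + np.diag(a, 1) + np.diag(a, -1)
e1 = np.zeros(n); e1[0] = 1.0
G_mat = np.linalg.solve(J_n - z * np.eye(n), e1)[0]
print(G_cf, G_mat)                        # the two approximations of G(z) agree
```

Both approximations converge as the depth increases for fixed \(\mathrm {Im}(z)>0\); the obstacles discussed next concern sending z to the real axis and moving beyond the tridiagonal structure.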

However, there are two major obstacles to overcome when using (1.9) and its variants as a means to compute measures. First of all, this representation of the principal resolvent function depends on the structure of the operator. For example, (1.9) is valid for the restricted case of Jacobi operators and hence one is led to seek different methods for different operators (such as tight-binding Hamiltonians on two-dimensional lattices which have a growing bandwidth when represented as operators on \(l^2({\mathbb {N}})\)). Second, this would seem to give the wrong classification of the difficulty of the problem in the SCI hierarchy, giving rise to a tower of algorithms with two limits. One first takes a truncation parameter n to infinity to compute G(z) for \(\mathrm {Im}(z)>0\), and then a second limit as z approaches the real axis. One of the main messages of this paper is that both of these issues can be overcome. Measures can be computed in one limit via an algorithm \(\Gamma _n\), and this holds for a large class of operators. The only restriction is a known asymptotic decay rate of the off-diagonal entries. As a by-product, we compute the m-function of such operators with error control. Specific cases where this can be written explicitly do exist, such as periodic Jacobi matrices or perturbations of Toeplitz operators [52] (see also Sect. 1.7). However, there has been no general method proposed to compute the resolvent with error control. This consideration is crucial to allow the computation of measures in one limit.

To see how we might compute the measure using the resolvent, consider the Poisson kernels for the half-plane and the unit disk, defined respectively by

$$\begin{aligned} P_{H}(x,y)=\frac{1}{\pi }\frac{y}{x^2+y^2} \quad \text { and } \quad P_{D}(x,y)&=\frac{1}{2\pi }\frac{1-(x^2+y^2)}{(x-1)^2+y^2}\\&=P_{D}(r,\theta )=\frac{1}{2\pi }\frac{1-r^2}{1-2r\cos (\theta )+r^2}, \end{aligned}$$
(1.10)

where \((r,\theta )\) denote the usual polar coordinates. Let T be a normal operator. Then, for \(z\notin \sigma (T)\), we have from the functional calculus that

$$\begin{aligned} R(z,T)=\int _{\sigma (T)}\frac{1}{\lambda -z}dE^T(\lambda ). \end{aligned}$$

For self-adjoint T, \(z=u+iv\in {\mathbb {C}}\backslash {{\mathbb {R}}}\) (\(u,v\in {\mathbb {R}}\)) and \(x\in l^2({\mathbb {N}})\), we define

$$\begin{aligned} \begin{aligned} K_H(z;T,x)&:=\frac{1}{2\pi i}[R(z,T)-R({\overline{z}},T)]x\\&=\frac{1}{2\pi i}\int _{-\infty }^\infty \left( \frac{1}{\lambda -z}-\frac{1}{\lambda -{\overline{z}}}\right) dE^T(\lambda )x\\&=\int _{-\infty }^\infty P_H(u-\lambda ,v)dE^T(\lambda )x. \end{aligned} \end{aligned}$$
(1.11)

Similarly, if T is unitary, \(z=r\exp (i\psi )\in {\mathbb {C}}\backslash {{\mathbb {T}}}\) (with \(z\ne 0\)) and \(x\in l^2({\mathbb {N}})\), we define

$$\begin{aligned} K_D(z;T,x)&:=\frac{1}{2\pi i}[R(z,T)-R(1/{\overline{z}},T)]x\\&=\frac{1}{2\pi i}\int _{{\mathbb {T}}}\left[ \frac{1}{\lambda -z}-\frac{1}{\lambda -1/{\overline{z}}}\right] dE^T(\lambda )x. \end{aligned}$$
(1.12)

We change variables \(\lambda =\exp (i\theta )\) and, with an abuse of notation, write \(dE^T(\lambda )=i\exp (i\theta )dE^T(\theta )\). A simple calculation then gives

$$\begin{aligned} K_D(z;T,x)=\int _{0}^{2\pi } P_D(r,\psi -\theta )dE^T(\theta )x. \end{aligned}$$
(1.13)

Returning to our example, we see that the computation of the resolvent with error control allows the computation of G(z) with error control through taking inner products. By considering \(G(z)-G({\overline{z}})\), this allows the computation of the convolution of the measure \(\mu _J\) with the Poisson kernel \(P_{H}\). In other words, we can compute a smoothed version of the measure \(\mu _J\) with error control. We can then take the smoothing parameter to zero to recover the measure (one can show that these smoothed approximations converge weakly in the sense of measures). Fig. 1 demonstrates this for a typical example.

Fig. 1

Smoothed approximations of the Radon–Nikodym derivative for the Jacobi operator associated to Jacobi polynomials with \(\alpha =1\), \(\beta =1/2\). Here the measure is absolutely continuous and supported on \([-1,1]\). Left: Computation of convolutions for different \(\epsilon \) using the methods of this paper. Right: The associated Poisson kernel \(\pi ^{-1}\epsilon /(\epsilon ^2+x^2)\) which approaches a Dirac delta distribution as \(\epsilon \downarrow 0\)
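
The following is a minimal sketch of this smoothing procedure for the Jacobi operator with \(a_j=1/2\), \(b_j=0\) (associated with Chebyshev polynomials of the second kind, whose measure has density \(\frac{2}{\pi }\sqrt{1-x^2}\) on \([-1,1]\)); the truncation size, the evaluation point and the values of \(\epsilon \) are assumptions for illustration, and a plain square truncation is used as a stand-in for the error-controlled rectangular resolvent of Theorem 2.1.

```python
import numpy as np

# Jacobi operator with a_j = 1/2, b_j = 0; its spectral measure mu_{e_1,e_1}
# has density (2/pi) * sqrt(1 - x^2) on [-1, 1].
n = 1500                                   # truncation size (assumed adequate here)
a = 0.5 * np.ones(n - 1)
J = np.diag(a, 1) + np.diag(a, -1)
e1 = np.zeros(n); e1[0] = 1.0

def smoothed_density(u, eps):
    """<K_H(u + i*eps; J, e_1), e_1> = Im G(u + i*eps) / pi, cf. (1.11):
    the measure convolved with the Poisson kernel P_H(., eps)."""
    z = u + 1j * eps
    G = np.linalg.solve(J - z * np.eye(n), e1)[0]   # approximates <R(z, J) e_1, e_1>
    return G.imag / np.pi

x = 0.4
exact_density = (2 / np.pi) * np.sqrt(1 - x ** 2)
for eps in [0.5, 0.1, 0.02]:               # smoothing parameter taken towards zero
    print(eps, smoothed_density(x, eps), exact_density)
```

As \(\epsilon \downarrow 0\) the printed values approach the exact density, mirroring the behaviour shown in Fig. 1; the algorithms of this paper replace the square truncation by rectangular truncations so that the error in the resolvent, and hence in the smoothed measure, is controlled.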

1.5 Functional analytic setup

We consider the canonical separable Hilbert space \({\mathcal {H}} = l^2({\mathbb {N}})\), the set of square summable sequences with canonical basis \(\{e_n\}_{n\in {\mathbb {N}}}\). Let \({\mathcal {C}}(l^2({\mathbb {N}}))\) be the set of closed densely defined linear operators T such that \(\mathrm {span}\{e_n:n\in {\mathbb {N}}\}\) forms a core of T and \(T^*\). The spectrum of \(T\in {\mathcal {C}}(l^2({\mathbb {N}}))\) will be denoted by \(\sigma (T)\) and the point spectrum (the set of eigenvalues) by \(\sigma _{\mathrm {p}}(T)\). The latter set is not always closed and in general the closure of a set S will be denoted by \({\overline{S}}\). The resolvent operator \((T-zI)^{-1}\) defined on \({\mathbb {C}}\backslash \sigma (T)\) will be denoted by \(R(z,T)\).

This paper focusses on the subclass \(\Omega _{\mathrm {N}}\subset {\mathcal {C}}(l^2({\mathbb {N}}))\) of normal operators, that is, operators for which \({\mathcal {D}}(T)={\mathcal {D}}(T^*)\) and \(\left\| Tx\right\| =\left\| T^*x\right\| \) for all \(x\in {\mathcal {D}}(T)\). The subclasses \(\subset \Omega _{\mathrm {N}}\) of self-adjoint and unitary operators will be denoted by \(\Omega _{\mathrm {SA}}\) and \(\Omega _{\mathrm {U}}\) respectively. For \(T\in \Omega _{\mathrm {SA}}\) and \(T\in \Omega _\mathrm {U}\), \(\sigma (T)\subset {\mathbb {R}}\) and \(\sigma (T)\subset {\mathbb {T}}\) respectively, where \({\mathbb {T}}\) denotes the unit circle. Given \(T\in \Omega _{\mathrm {N}}\) and a Borel set B, \(E^T_B\) will denote the projection \(E^T(B)\). Given \(x,y\in l^2({\mathbb {N}})\), we can define a bounded (complex-valued) measure \(\mu _{x,y}^T\) via the formula

$$\begin{aligned} \mu _{x,y}^T(B)=\langle E_{B}^T x,y\rangle . \end{aligned}$$
(1.14)

Via the Lebesgue decomposition theorem [76], the spectral measure \(\mu ^T_{x,y}\) can be decomposed into three parts

$$\begin{aligned} \mu _{x,y}^T=\mu _{x,y,\mathrm {ac}}^T+\mu _{x,y,\mathrm {sc}}^T+\mu _{x,y,\mathrm {pp}}^T, \end{aligned}$$
(1.15)

the absolutely continuous part of the measure (with respect to the Lebesgue measure), the singular continuous part (singular with respect to the Lebesgue measure and atomless) and the pure point part. When considering \(\Omega _{\mathrm {SA}}\), we will consider Lebesgue measure on \({\mathbb {R}}\) and let

$$\begin{aligned} \rho _{x,y}^T(\lambda )=\frac{d\mu _{x,y,\mathrm {ac}}^T}{dm}(\lambda ), \end{aligned}$$
(1.16)

the Radon–Nikodym derivative of \(\mu _{x,y,\mathrm {ac}}^T\) with respect to Lebesgue measure. Of course this can be extended to the unitary (and, more generally, normal) case. This naturally gives a decomposition of the Hilbert space \({\mathcal {H}}=l^2({\mathbb {N}})\). For \({\mathcal {I}}=\mathrm {ac},\mathrm {sc}\) and \(\mathrm {pp}\), we let \({\mathcal {H}}_{{\mathcal {I}}}\) consist of vectors x whose measure \(\mu _{x,x}^T\) is absolutely continuous, singular continuous and pure point respectively. This gives rise to the orthogonal (invariant subspace) decomposition

$$\begin{aligned} {\mathcal {H}}={\mathcal {H}}_{\mathrm {ac}}\oplus {\mathcal {H}}_{\mathrm {sc}}\oplus {\mathcal {H}}_{\mathrm {pp}}, \end{aligned}$$
(1.17)

whose associated projections will be denoted by \(P_{\mathrm {ac}}^{T}\), \(P_{\mathrm {sc}}^{T}\) and \(P_{\mathrm {pp}}^{T}\) respectively. These projections commute with T and the projections obtained through the projection-valued measure. Of particular interest is the spectrum of T restricted to each \({\mathcal {H}}_{{\mathcal {I}}}\), which will be denoted by \(\sigma _{{\mathcal {I}}}(T)\). These different sets and subspaces often, but not always, characterise different physical properties in quantum mechanics (such as the famous RAGE theorem [2, 53, 119]), where a system is modelled by some Hamiltonian \(T\in \Omega _{\mathrm {SA}}\) [41, 42, 61, 93]. For example, pure point spectrum implies the absence of ballistic motion for many Schrödinger operators [127].

1.6 Algorithmic setup

Given an operator \(T\in {\mathcal {C}}(l^2({\mathbb {N}}))\), we can view it as an infinite matrix

$$\begin{aligned} T=\left( \begin{array}{cccc} t_{11} & t_{12} & t_{13} & \ldots \\ t_{21} & t_{22} & t_{23} & \ldots \\ t_{31} & t_{32} & t_{33} & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right) \end{aligned}$$

through the inner products \(t_{ij}=\langle Te_j,e_i\rangle \). All of the algorithms constructed can also be adapted to operators on \(l^2({\mathbb {Z}})\), either through the use of a suitable re-ordering of the basis, or through considering truncations of matrices in two directions, which is useful numerically since it preserves bandwidth. To be precise about the information needed to compute spectral properties, we define two classes of evaluation functions as

$$\begin{aligned} \Lambda _1=\{\langle Te_j,e_i\rangle :i,j\in {\mathbb {N}}\},\quad \Lambda _2=\{\langle Te_j,e_i\rangle ,\langle T^*e_j,T^*e_i\rangle :i,j\in {\mathbb {N}}\}. \end{aligned}$$
(1.18)

These can be understood as different sets of information our algorithms are allowed to access (see Appendix A for a precise meaning). All the results proven in this paper can be easily extended to the case of inexact input. This means replacing the evaluation functions by

$$\begin{aligned} f_{i,j,m}^{(1)},f_{i,j,m}^{(2)}:{\mathcal {C}}(l^2({\mathbb {N}}))\rightarrow {\mathbb {Q}}+i{\mathbb {Q}} \end{aligned}$$

such that \(|f_{i,j,m}^{(1)}(T)-\langle Te_j,e_i\rangle |\le 2^{-m}\) and \(|f_{i,j,m}^{(2)}(T)-\langle T^*e_j,T^*e_i\rangle |\le 2^{-m}\). Hence, the existence results carry over to algorithms that are only allowed to perform arithmetic operations over \({\mathbb {Q}}\). This could be useful for rigorous bounds using interval arithmetic and computer-assisted proofs (for those familiar with the term, our algorithms are Turing recursive), though for brevity, we stick to \(\Lambda _1\) and \(\Lambda _2\) throughout. For discrete operators, the above information is often given to us, for example, in tight-binding models in physics or as a discretisation of a PDE, and hence it is natural to seek to compute spectral properties from matrix values. The set \(\Lambda _2\) is motivated via variational problems. For partial differential operators, such information is often given through inner products with a suitable basis, and in this case, the inexact input model is needed due to approximating the integrals (see Appendix B). For the classes considered in this paper, the evaluation sets \(\Lambda _1\) and \(\Lambda _2\) are in general different, yet the classifications in the SCI remain the same.
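
As a toy illustration of such evaluation functions (the routine names and the sample operator below are hypothetical, not part of the paper), one can round approximate matrix entries onto a dyadic grid to produce outputs in \({\mathbb {Q}}+i{\mathbb {Q}}\) within the prescribed tolerance \(2^{-m}\):

```python
from fractions import Fraction

def inexact_entry(entry_fn, i, j, m):
    """Toy version of f^{(1)}_{i,j,m}: return (Re, Im) in Q x Q within 2^{-m} of
    <T e_j, e_i>.  Here entry_fn is assumed to supply the entry to sufficient
    accuracy (e.g. by evaluating an integral); rounding both parts to the dyadic
    grid of spacing 2^{-(m+2)} keeps the total error below 2^{-m}."""
    t = complex(entry_fn(i, j))
    denom = 2 ** (m + 2)
    return Fraction(round(t.real * denom), denom), Fraction(round(t.imag * denom), denom)

# Hypothetical tridiagonal example: hopping 1 and a bounded diagonal potential.
def entry(i, j):
    if i == j:
        return 2.0 + (-1.0) ** i / (i + 1)
    return 1.0 if abs(i - j) == 1 else 0.0

print(inexact_entry(entry, 3, 3, 10))
```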

We will be concerned with operators whose matrix representation has a known asymptotic rate of column/off-diagonal decay. Namely, let \(f:{\mathbb {N}}\rightarrow {\mathbb {N}}\) with \(f(n)>n\) and let \(\alpha =\{\alpha _n\}_{n\in {\mathbb {N}}},\beta =\{\beta _n\}_{n\in {\mathbb {N}}}\) be null sequences of non-negative real numbers. We then define for \(\mathrm {X}=\mathrm {SA}\) or \(\mathrm {X}=\mathrm {U}\),

$$\begin{aligned} \begin{aligned} \Omega _{f,\alpha ,\beta }^{\mathrm {X}}&=\{T\in \Omega _{\mathrm {X}}:\Vert (P_{f(n)}-I)TP_n\Vert =O(\alpha _n),\text { as }n\rightarrow \infty \}\\&\quad \quad \quad \quad \quad \times \{x\in l^2({\mathbb {N}}):\Vert P_nx-x\Vert =O(\beta _n),\text { as }n\rightarrow \infty \}, \end{aligned} \end{aligned}$$
(1.19)

where \(P_n\) denotes the orthogonal projection onto \(\mathrm {span}\{e_1,\ldots ,e_n\}\). In using the \(O(\cdot )\) notation, the hidden constant is allowed to depend on the operator T or the vector x. We will also use

$$\begin{aligned} \Omega _{f,\alpha }^{\mathrm {X}}=\{T\in \Omega _{\mathrm {X}}:\Vert (P_{f(n)}-I)TP_n\Vert =O(\alpha _n),\text { as }n\rightarrow \infty \}. \end{aligned}$$
(1.20)

When discussing \(\Omega _{f,\alpha ,\beta }^{\mathrm {SA}}\) and \(\Omega _{f,\alpha }^{\mathrm {SA}}\) we will use the notation \(\Omega _{f,\alpha ,\beta }\) and \(\Omega _{f,\alpha }\). The collection of vectors in \(l^2({\mathbb {N}})\) satisfying \(\Vert P_nx-x\Vert =O(\beta _n)\) will be denoted by \(V_{\beta }\). Finally, when \(\alpha _n\equiv 0\), we will abuse notation slightly in requiring the stronger condition that

$$\begin{aligned} \Vert (P_{f(n)}-I)TP_n\Vert =0. \end{aligned}$$

Thus \(\Omega _{f,0}\) is the class of self-adjoint operators whose matrix sparsity structure is captured by the function f. For example, if \(f(n)=n+1\), we recover the class of self-adjoint tridiagonal matrices, the most studied class of infinite-dimensional operators. When discussing classes that include vectors x, we extend \(\Lambda _i\) to include pointwise evaluations of the coefficients of x. Other additions are sometimes needed, such as data regarding open sets as inputs for computations of measures, but this will always be made clear. When considering the general case of \(\Omega _{f,\alpha }\), the function f and sequence \(\alpha \) can also be considered as inputs to the algorithm—in other words, the same algorithm works for each class.
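
As a quick numerical check of the condition defining these classes (the banded matrix and parameters below are assumptions made for the example), for a self-adjoint matrix of bandwidth m one may take \(f(n)=n+m\), and \(\Vert (P_{f(n)}-I)TP_n\Vert \) can be read off from a finite section:

```python
import numpy as np

# Random self-adjoint banded matrix (bandwidth m), used here as a finite section
# standing in for an infinite operator.
N, m = 200, 2
rng = np.random.default_rng(0)
T = np.diag(rng.standard_normal(N))
for k in range(1, m + 1):
    band = rng.standard_normal(N - k)
    T += np.diag(band, k) + np.diag(band, -k)

def column_decay(T, n, f_n):
    """|| (P_{f(n)} - I) T P_n ||: the block of rows beyond f(n) in the first n columns."""
    return np.linalg.norm(T[f_n:, :n], 2)

n = 50
print(column_decay(T, n, n + m))   # 0.0: f(n) = n + m captures the sparsity structure
print(column_decay(T, n, n + 1))   # > 0: f(n) = n + 1 misses the outermost band
```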

1.7 Connections with previous work

We have mentioned the literature on infinite-dimensional spectral problems and the SCI hierarchy. Computationally, our point of view in this paper is closest to the work of Olver, Townsend and Webb on practical infinite-dimensional linear algebra [104,105,106,107, 151]. Their work includes efficient codes, such as the infinite-dimensional QL (IQL) algorithm [150] (see also [38] for the IQR algorithm, which has its roots in the work of Deift, Li and Tomei [46]), as well as theoretical results. A PDE version of the FEAST algorithm based on contour integration of the resolvent has recently been proposed by Horning and Townsend in [84], which computes discrete spectra. The set of algorithms we provide can be considered as a new member within the growing family of infinite-dimensional techniques.

A similar, though different, object studied in the mathematical physics literature, particularly when considering random Schrödinger operators, is the density of states [29, 89, 91], which we mention for completeness and also to avoid potential confusion. This object is defined via the “thermodynamic limit”, where instead of considering the infinite-dimensional operator T, one considers finite truncations, say \(P_nTP_n\), and the limit \(n\rightarrow \infty \) of the measure \(\sum _{x_j\in \sigma (P_nTP_n)} \delta _{x_j}/n\). This measure is subtly different from the spectral measure of T (for instance, on the discrete spectrum). The density of states is an important quantity in quantum mechanics, and there is a large literature on its computation. We refer the reader to the excellent review article [95], which discusses the most common methods. The idea of using the resolvent to approximate the density of states of finite matrices can be found in the method of [82], which approximates the imaginary part of \(\mathrm {Trace}\left[ {R}(z,P_nTP_n)\right] \) for \(\mathrm {Im}(z)>0\). Similarly, in the random matrix literature, the connection is made through the Stieltjes transformation (see, for example, [9]). There are three immediate differences between our algorithms and those that compute the density of states. First, we seek to deal with the full, infinite-dimensional, operator directly to compute the spectral measure (and not the limit of increasing system sizes). Second, the object we are computing contains more refined spectral information of the operator and does not involve an averaging procedure. The density of states does not capture the full spectral information, such as the contribution of eigenvalues in the discrete spectrum, whereas the projection-valued spectral measure does. Third, there is a subtlety regarding the limits as \(\mathrm {Im}(z)\) goes to zero and the truncation parameter goes to infinity (a similar trade-off also occurs in random matrix literature when theoretically analysing the density of states—see [54]). In our case, appropriate rectangular truncations of the infinite-dimensional operator are required to compute the resolvent with error control (see Theorem 2.1). This approach differs from finite-dimensional techniques, which typically consider square truncations.

For the \(C^*\)-algebra viewpoint of the density of states, we refer the reader to the work of Arveson [7] and the references therein. Estimating the spectrum of T via \(\sigma (P_nTP_n)\) is known as the finite section method. This has often been viewed in connection with Toeplitz theory. See, for example, the work by Böttcher [19, 20], Böttcher and Silbermann [23], Böttcher, Brunner, Iserles and Nørsett [21], Brunner, Iserles and Nørsett [27], Hagen, Roch and Silbermann [73], Lindner [96], Marletta [98], Marletta and Scheichl [99], and Seidel [121].

The study of spectral measures also has a rich history in the theory of orthogonal polynomials and quadrature rules for numerical integration going back to the work of Szegő [50, 139], briefly touched upon in Sect. 1.4. In certain cases, one can recover a distribution function for the associated measure of the Jacobi operator as a limit of functions constructed using Gaussian quadrature [31, Ch. 2]. Our examples in Sect. 6.1 can be considered as a computational realisation of Favard’s theorem.

There are several results in the literature considering the computation of spectral density functions for Sturm–Liouville problems. In the case of Sturm–Liouville problems, the spectral density function corresponds to the multiplicative version of the spectral theorem. This is subtly different from the measures we compute, which arise from the projection-valued measure version of the spectral theorem. A common approach to approximate spectral density functions associated with Sturm–Liouville operators on unbounded domains is to truncate the domain and use the Levitan–Levinson formula, as implemented in the software package SLEDGE [59, 60, 111]. This approach can be computationally expensive since the eigenvalues cluster as the domain size increases; often, hundreds of thousands of eigenvalues and eigenvectors need to be computed. More sophisticated methods avoiding domain truncation are considered for special cases in [57, 58], and an application in plasma physics can be found in [152]. These make use of the additional structure present in Sturm–Liouville problems using results analogous to (1.9) in the continuous case. Our results, particularly Theorem 1.1, hold for partial differential operators much more complicated than Sturm–Liouville operators (see Appendix B).

Finally, we wish to highlight the work of Webb and Olver [151], which is of particular relevance to the present study. There the authors studied, through connection coefficients, Jacobi operators that arise as compact perturbations of Toeplitz operators. Similar perturbations (where a stronger exponential decay of the perturbation is crucial for analyticity properties of the resolvent) were studied in the work of Bilman and Trogdon [16] in connection with the inverse scattering transform for the Toda lattice. (See also the work of Trogdon, Olver and Deconinck [144] for computations of spectral measures for inverse scattering for the KdV equation.) The results proven in [151] can be stated in terms of the SCI hierarchy:

  • If the perturbation is finite rank (and known), the computation of \(\sigma _{\mathrm {pp}}\) lies in \(\Delta _1^G\), and the computation of \(\mu _{\mathrm {ac}}\) lies in \(\Delta _0^G\) (note that \(\sigma _{\mathrm {ac}}\) is known analytically).

  • If the perturbation is compact with a known rate of decay at infinity, then the computation of the full spectrum \(\sigma \) lies in \(\Delta _1^G\).

The current paper complements the work of [151] by considering operators much more general than tridiagonal compact perturbations of Toeplitz operators (we deal with arbitrary self-adjoint operators and assume we know f such that (1.20) holds) and partial differential operators, by allowing operators to be unbounded, by building algorithms that are arithmetic and can cope with inexact input, and by considering the computation of a wider range of spectral information. At the price of this greater generality, the objects we study are generally not computable with error control (unless one has local regularity assumptions on the measure—see [33, Ch. 4]), and some lead to computational problems higher up in the SCI hierarchy, though still computationally useful as we shall demonstrate. Our methods are also entirely different and rely on estimating the resolvent operator with error control (Theorem 2.1).

1.8 Organisation of the paper

The paper is organised as follows. In Sect. 2 we consider the computation of the resolvent with error control and generalisations of Stone’s formula. The computation of measures, their various decompositions and projections are discussed in Sect. 3. We then discuss the functional calculus and density of measures in Sect. 4. The computation of the different types of spectra as sets in the complex plane is discussed in Sect. 5. We run extensive numerical tests in Sect. 6, where we also introduce a new collocation method for the computation of the Radon–Nikodym derivative. We find that increased rates of convergence can also be obtained through iterations of Richardson extrapolation. A summary of the SCI hierarchy can be found in Appendix A and a proof of Theorem 1.1 in Appendix B. Throughout, our theorems and proofs use the notation introduced in Sect. 1.5 and Sect. 1.6.

2 Preliminary Results

The algorithms built in this paper rely on the computation of the action of the resolvent operator \(R(z,T)=(T-zI)^{-1}\) for \(z\notin \sigma (T)\) with (asymptotic) error control. Given this, one can compute the action of the projections \(E^T_{S}\) for a wide range of sets S (Theorem 3.1 and its generalisations), and hence the measures \(\mu _{x,y}^T\). This section discusses the computation of the resolvent with error control and generalisations of Stone’s formula, which relate the resolvent to the projection-valued measures.

2.1 Approximating the resolvent operator

The key theorem for computing the action of the resolvent operator is the following, where we use \(\sigma _1\) to denote the injection modulus of an operator defined as

$$\begin{aligned} \sigma _1(T):=\min \{\Vert Tx\Vert :x\in {\mathcal {D}}(T),\Vert x\Vert =1\}. \end{aligned}$$

The proof of the following theorem boils down to a careful computation of a least-squares solution of a rectangular linear system.

Theorem 2.1

Let \(\alpha =\{\alpha _n\}_{n\in {\mathbb {N}}}\) and \(\beta =\{\beta _n\}_{n\in {\mathbb {N}}}\) be null sequences, and \(f:{\mathbb {N}}\rightarrow {\mathbb {N}}\) with \(f(n)>n\). Define

$$\begin{aligned} S_{f,\alpha ,\beta }:=\left\{ (T,x,z)\in \Omega ^{\mathrm {N}}_{f,\alpha ,\beta }\times {\mathbb {C}}:z\in {\mathbb {C}}\backslash \sigma (T)\right\} \end{aligned}$$

and for \((T,x)\in \Omega ^{\mathrm {N}}_{f,\alpha ,\beta }\) (defined in (1.19)), let \(C_1(T,x)\), \(C_2(T,x)\in {\mathbb {R}}_{\ge 0}\) be such that

  1. 1.

    \( \Vert (I-P_{f(n)})TP_n\Vert \le C_1\alpha _n\),

  2. 2.

    \(\Vert P_nx-x\Vert \le C_2\beta _n\),

where, for notational convenience, we drop the \((T,x)\) dependence in the notation for \(C_1\) and \(C_2\). Then there exists a sequence of arithmetic algorithms

$$\begin{aligned} \Gamma _n:S_{f,\alpha ,\beta }\rightarrow l^2({\mathbb {N}}), \end{aligned}$$

each of which uses the evaluation functions in \(\Lambda _1\) (defined in (1.18)), such that each vector \(\Gamma _n(T,x,z)\) has finite support with respect to the canonical basis for each n, and

$$\begin{aligned} \lim _{n\rightarrow \infty }\Gamma _n(T,x,z)= R(z,T)x\text { in }l^2({\mathbb {N}}). \end{aligned}$$

Moreover, the following error bound holds

$$\begin{aligned}&\Vert \Gamma _{n}(T,x,z)-R(z,T)x\Vert \nonumber \\&\quad \le \frac{C_2\beta _{f(n)}+C_1\alpha _n\Vert \Gamma _{n}(T,x,z)\Vert +\Vert P_{f(n)}(T-zI)\Gamma _{n}(T,x,z)-P_{f(n)}x\Vert }{\mathrm {dist}(z,\sigma (T))}.\nonumber \\ \end{aligned}$$
(2.1)

If bounds on \(C_1\) and \(C_2\) are known, this error bound can be computed to arbitrary accuracy using finitely many arithmetic operations and comparisons.

Proof

Let \((T,x,z)\in S_{f,\alpha ,\beta }\). We have that \(n=\mathrm {rank} (P_n)=\mathrm {rank} ((T-zI)P_n)=\mathrm {rank} (P_{f(n)}(T-zI)P_n)\) for large n since \(\sigma _1(T-zI)>0\) and \(\Vert (I-P_{f(n)})(T-zI)P_n\Vert \le C_1\alpha _n\rightarrow 0\) (recall that \(z\notin \sigma (T)\)). Hence we can define

$$\begin{aligned} {\widetilde{\Gamma }}_{n}(T,x,z):=\left\{ \begin{array}{ll} 0 & \text { if } \sigma _1(P_n(T^*-{\overline{z}}I)P_{f(n)}(T-zI)P_n)\le \frac{1}{n},\\ \left[ P_n(T^*-{\overline{z}}I)P_{f(n)}(T-zI)P_n\right] ^{-1}P_n(T^*-{\overline{z}}I)P_{f(n)}x & \text { otherwise.} \end{array}\right. \end{aligned}$$

Suppose that n is large enough so that \(\sigma _1(P_n(T^*-{\overline{z}}I)P_{f(n)}(T-zI)P_n)>1/n\). Then \({\widetilde{\Gamma }}_{n}(T,x,z)\) is a least-squares solution of the optimisation problem \(\mathrm {argmin}_{y}\Vert P_{f(n)}(T-zI)P_ny-x\Vert \). The linear space \(\mathrm {span}\{e_n:n\in {\mathbb {N}}\}\) forms a core of T and hence also of \(T-zI\). It follows by invertibility of \(T-zI\) that given any \(\epsilon >0\), there exists an \(m=m(\epsilon )\) and a \(y=y(\epsilon )\) with \(P_my=y\) such that

$$\begin{aligned} \Vert (T-zI)y-x\Vert \le \epsilon . \end{aligned}$$

It follows that for all \(n\ge m\),

$$\begin{aligned} \Vert (T-zI){\widetilde{\Gamma }}_{n}(T,x,z)-x\Vert&\le \Vert P_{f(n)}(T-zI){\widetilde{\Gamma }}_{n}(T,x,z)-x\Vert +C_1\alpha _n\Vert {\widetilde{\Gamma }}_{n}(T,x,z)\Vert \\&\le \Vert P_{f(n)}(T-zI)y-x\Vert +C_1\alpha _n\Vert {\widetilde{\Gamma }}_{n}(T,x,z)\Vert \\&\le \Vert P_{f(n)}(T-zI)y-P_{f(n)}x\Vert +C_2\beta _{f(n)}+C_1\alpha _n\Vert {\widetilde{\Gamma }}_{n}(T,x,z)\Vert \\&\le \epsilon +C_2\beta _{f(n)}+C_1\alpha _n\Vert {\widetilde{\Gamma }}_{n}(T,x,z)\Vert . \end{aligned}$$

This implies that

$$\begin{aligned} \Vert {\widetilde{\Gamma }}_{n}(T,x,z)-R(z,T)x\Vert&\le \Vert R(z,T)\Vert \Vert (T-zI){\widetilde{\Gamma }}_{n}(T,x,z)-x\Vert \\&\le \Vert R(z,T)\Vert \left( \epsilon +C_2\beta _{f(n)}+C_1\alpha _n\Vert {\widetilde{\Gamma }}_{n}(T,x,z)\Vert \right) . \end{aligned}$$

In particular, since \(\alpha \) and \(\beta \) are null, this implies that \(\Vert {\widetilde{\Gamma }}_{n}(T,x,z)\Vert \) is uniformly bounded in n. Since \(\epsilon >0\) was arbitrary, we also see that \({\widetilde{\Gamma }}_{n}(T,x,z)\) converges to \(R(z,T)x\).

Define the matrices

$$\begin{aligned} B_n=P_n(T^*-{\overline{z}}I)P_{f(n)}(T-zI)P_n,\quad C_n=P_n(T^*-{\overline{z}}I)P_{f(n)}. \end{aligned}$$

Given the evaluation functions in \(\Lambda _1\), we can compute the entries of these matrices to any given accuracy and hence also to arbitrary accuracy in the operator norm using finitely many arithmetic operations and comparisons (using the error in the Frobenius norm to bound the error in the operator norm). Denote approximations of \(B_n\) and \(C_n\) by \({\widetilde{B}}_n\) and \({\widetilde{C}}_n\) respectively and assume that

$$\begin{aligned} \Vert B_n-{\widetilde{B}}_n\Vert \le u_n,\quad \Vert C_n-{\widetilde{C}}_n\Vert \le v_n, \end{aligned}$$

for null sequences \(\{u_n\},\{v_n\}\). Note that \({\widetilde{B}}_n^{-1}\) can be computed using finitely many arithmetic operations and comparisons. So long as \(u_n\) is small enough, the resolvent identity implies that

$$\begin{aligned} \Vert B_n^{-1}-{\widetilde{B}}_n^{-1}\Vert \le \frac{\Vert {\widetilde{B}}_n^{-1}\Vert ^2u_n}{1-u_n\Vert {\widetilde{B}}_n^{-1}\Vert }=:w_n. \end{aligned}$$

By taking \(u_n\) and \(v_n\) smaller if necessary (so that the algorithm is adaptive and it is straightforward to bound the norm of a finite matrix from above), we can ensure that \(\Vert {\widetilde{B}}_n^{-1}\Vert v_n\le n^{-1}\) and \((\Vert {\widetilde{C}}_n\Vert +v_n)w_n\le n^{-1}\). From Proposition A.7 and a simple search routine, we can also compute \(\sigma _1(P_n(T^*-{\overline{z}}I)P_{f(n)}(T-zI)P_n)\) to arbitrary accuracy using finitely many arithmetic operations and comparisons. Suppose this is done to an accuracy \(1/n^2\) and denote the approximation via \(\tau _n\). We then define

$$\begin{aligned} \Gamma _{n}(T,x,z):=\left\{ \begin{array}{ll} 0 & \text { if } \tau _n\le \frac{1}{n}\\ {\widetilde{B}}_n^{-1}{\widetilde{C}}_n {\widetilde{x}}_n & \text { otherwise,} \end{array}\right. \end{aligned}$$

where \({\widetilde{x}}_n=P_{f(n)}x\). It follows that \(\Gamma _{n}(T,x,z)\) can be computed using finitely many arithmetic operations and, for large n,

$$\begin{aligned} \Vert \Gamma _{n}(T,x,z)-{\widetilde{\Gamma }}_{n}(T,x,z)\Vert \le \left( \Vert {\widetilde{B}}_n^{-1}\Vert v_n+(\Vert {\widetilde{C}}_n\Vert +v_n)w_n\right) \Vert x\Vert \rightarrow 0, \end{aligned}$$

so that \(\Gamma _n(T,x,z)\) converges to \(R(z,T)x\). By construction, \(\Gamma _n(T,x,z)\) has finite support with respect to the canonical basis.

Furthermore, the following error bound holds (and remains valid if \(\tau _n\le 1/n\)):

$$\begin{aligned}&\Vert \Gamma _{n}(T,x,z)-R(z,T)x\Vert \le \Vert R(z,T)\Vert \Vert (T-zI)\Gamma _{n}(T,x,z)-x\Vert \\&\quad \le \frac{C_2\beta _{f(n)}+C_1\alpha _n\Vert \Gamma _{n}(T,x,z)\Vert +\Vert P_{f(n)}(T-zI)\Gamma _{n}(T,x,z)-P_{f(n)}x\Vert }{\mathrm {dist}(z,\sigma (T))}, \end{aligned}$$

where we have used that T is normal, so that \(\Vert R(z,T)\Vert =\mathrm {dist}(z,\sigma (T))^{-1}\). This bound converges to 0 as \(n\rightarrow \infty \). If the constants \(C_1\) and \(C_2\) are known, the bound can be approximated to arbitrary accuracy using finitely many arithmetic operations and comparisons. \(\quad \square \)

Remark 3

If T corresponds to a choice of basis in a space of functions (for example, when using a spectral method), there is often a link between the regularity of the function represented by x and the decay of the terms \(\beta _n\). The bound (2.1) can then often be adapted to include such asymptotics, and hence indicates how large n needs to be to achieve a given accuracy.

Of course, a vast literature exists on computing \(R(z,T)\), especially for infinite matrices with structure (such as being banded), and we refer the reader to [70, 96, 112, 121] for a small sample. Note that if T is banded with bandwidth m, then we can take \(f(n)=n+m\) and the above computation can be done in \(O(nm^2)\) operations [66]. The following corollary of Theorem 2.1 will be used repeatedly in the following proofs.
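Before stating it, we record a minimal numerical sketch (not the paper's code; the function name, the toy operator and all parameters are our own choices) of the least-squares approximation \({\widetilde{\Gamma }}_{n}(T,x,z)=[P_n(T^*-{\overline{z}}I)P_{f(n)}(T-zI)P_n]^{-1}P_n(T^*-{\overline{z}}I)P_{f(n)}x\) for a banded operator, taking \(f(n)=n+m\). A larger finite section is used as a stand-in reference for \(R(z,T)x\).

```python
# Minimal sketch (assumed setup, not the paper's code) of the rectangular
# least-squares resolvent approximation from Theorem 2.1 for a banded operator,
# with f(n) = n + m (m = bandwidth).
import numpy as np

def gamma_tilde(T_section, x, z, n, m):
    """Approximate the first n entries of R(z,T)x.

    T_section : (n+m) x (n+m) upper-left section of T (captures all rows coupling
                to the first n columns, since T has bandwidth m),
    x         : first n+m entries of the vector x,
    z         : spectral parameter with Im(z) != 0.
    """
    fn = n + m
    A = T_section[:fn, :n] - z * np.eye(fn, n)   # P_{f(n)}(T - zI)P_n
    B = A.conj().T @ A                           # P_n(T* - conj(z)I)P_{f(n)}(T - zI)P_n
    rhs = A.conj().T @ x[:fn]                    # P_n(T* - conj(z)I)P_{f(n)} x
    return np.linalg.solve(B, rhs)               # normal equations of the least-squares problem

# Toy test: free discrete Schrödinger operator (tridiagonal, bandwidth 1).
N, n, m = 1000, 300, 1
H = np.diag(2.0 * np.ones(N)) + np.diag(-np.ones(N - 1), 1) + np.diag(-np.ones(N - 1), -1)
x = np.zeros(N); x[0] = 1.0                      # x = e_1
z = 2.0 + 0.5j
approx = gamma_tilde(H, x, z, n, m)
reference = np.linalg.solve(H - z * np.eye(N), x)[:n]   # large-truncation surrogate for R(z,T)x
print(np.linalg.norm(approx - reference))        # tiny here, since the resolvent vector decays rapidly
```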

Corollary 2.2

There exists a sequence of arithmetic algorithms

$$\begin{aligned} \Gamma _n: \Omega _{f,\alpha ,\beta }\times {\mathbb {C}}\backslash {\mathbb {R}}\rightarrow l^2({\mathbb {N}}) \end{aligned}$$

with the following properties:

  1.

    For all \((T,x)\in \Omega _{f,\alpha ,\beta }\) and \(z\in {\mathbb {C}}\backslash {\mathbb {R}}\), \(\Gamma _n(T,x,z)\) has finite support with respect to the canonical basis and converges to \(R(z,T)x\) in \(l^2({\mathbb {N}})\) as \(n\rightarrow \infty \).

  2.

    For any \((T,x)\in \Omega _{f,\alpha ,\beta }\), there exists a constant \(C(T,x)\) such that for all \(z\in {\mathbb {C}}\backslash {\mathbb {R}}\),

    $$\begin{aligned} \Vert \Gamma _n(T,x,z)-R(z,T)x\Vert \le \frac{C(T,x)}{\left| \mathrm {Im}(z)\right| }\left[ \alpha _n+\beta _n\right] . \end{aligned}$$

Proof

Let \(\Gamma _n(T,x,z)={\widehat{\Gamma }}_{m(n,T,x,z)}(T,x,z)\), where \({\widehat{\Gamma }}_k\) are the algorithms from the statement of Theorem 2.1 and \(m(n,T,x,z)\) is a subsequence diverging to infinity as \(n\rightarrow \infty \). Clearly statement (1) holds, so we must show how to choose the sequence \(m(n,T,x,z)\) such that (2) holds (and hence our algorithms will be adaptive). From (2.1), it is enough to show that \(m=m(n,T,x,z)\) can be chosen such that

$$\begin{aligned} \beta _{f(m)}+\alpha _m\Vert {\widehat{\Gamma }}_{m}(T,x,z)\Vert +\Vert P_{f(m)}(T-zI){\widehat{\Gamma }}_{m}(T,x,z)-P_{f(m)}x\Vert \lesssim \alpha _n+\beta _n. \end{aligned}$$

The left-hand side can be approximated to arbitrary accuracy using finitely many arithmetic operations and comparisons. Hence by repeatedly computing approximations to within \(\alpha _n+\beta _n\), we can choose the minimal m such that these approximate bounds are at most \(2(\alpha _n+\beta _n)\). \(\quad \square \)
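This adaptive selection is just a search loop. The following abstract sketch (assumed interfaces, not the paper's code; `posterior_bound` is a hypothetical callable supplied by the caller) records its structure.

```python
# Abstract sketch (assumed interfaces, not the paper's code) of the adaptive choice of
# m = m(n,T,x,z) in the proof of Corollary 2.2.
def adaptive_index(n, alpha, beta, posterior_bound):
    """alpha(k), beta(k): the null sequences attached to the class Omega_{f,alpha,beta}.
    posterior_bound(m): an approximation, to accuracy alpha(n)+beta(n), of
        beta_{f(m)} + alpha_m*||Gamma_hat_m(T,x,z)||
        + ||P_{f(m)}(T - zI)Gamma_hat_m(T,x,z) - P_{f(m)}x||,
    assumed computable by the caller with finitely many operations."""
    tol = alpha(n) + beta(n)
    m = 1
    while posterior_bound(m) > 2 * tol:   # minimal m with approximate bound <= 2*(alpha_n + beta_n)
        m += 1
    return m
```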

2.2 Stone’s formula and Poisson kernels

Here we briefly discuss Stone’s famous formula [32, 113, 137], which shows that, as \(\epsilon \downarrow 0\), the convolution of the spectral measure with a Poisson kernel recovers the pointwise action of the projection-valued measure associated with an operator \(T\in \Omega _{\mathrm {SA}}\) (see Sect. 1.4). Stone’s formula can also be generalised to unitary operators and a much larger class of normal operators (see Proposition 2.4). We include a short (and standard) proof of Proposition 2.3 for the benefit of the reader.

Proposition 2.3

(Stone’s formula). The following boundary limits hold:

  (i)

    Let \(T\in \Omega _{\mathrm {SA}}\). Then for any \(-\infty \le a<b\le \infty \) and \(x\in l^2({\mathbb {N}})\),

    $$\begin{aligned} \lim _{\epsilon \downarrow 0}\int _{a}^bK_H(u+i\epsilon ;T,x)du=E^{T}_{(a,b)}x+\frac{1}{2}E^{T}_{\{a,b\}}x. \end{aligned}$$
  (ii)

    Let \(T\in \Omega _{\mathrm {U}}\). Then for any \(0\le a<b<2\pi \) and \(x\in l^2({\mathbb {N}})\),

    $$\begin{aligned} \lim _{\epsilon \downarrow 0}\int _{a}^bi\exp (i\psi )K_D((1-\epsilon )\exp (i\psi );T,x)d\psi =E^{T}_{(a,b)_{{\mathbb {T}}}}x+\frac{1}{2}E^{T}_{\{\exp (ia),\exp (ib)\}}x, \end{aligned}$$

    where \((a,b)_{{\mathbb {T}}}\) denotes the image of \((a,b)\) under the map \(\theta \rightarrow \exp (i\theta )\).

Proof

To prove (i), we can apply Fubini’s theorem to interchange the order of integration and arrive at

$$\begin{aligned} \int _{a}^bK_H(u+i\epsilon ;T,x)du=\left[ \int _{-\infty }^\infty \int _a^bP_H(u-\lambda ,\epsilon )du\ dE^T(\lambda )\right] x \end{aligned}$$

But

$$\begin{aligned} \int _a^bP_H(u-\lambda ,\epsilon )du=\frac{1}{\pi }\left[ \tan ^{-1}\left( \frac{b-\lambda }{\epsilon }\right) -\tan ^{-1}\left( \frac{a-\lambda }{\epsilon }\right) \right] \end{aligned}$$

is bounded and converges pointwise as \(\epsilon \downarrow 0\) to \(\chi _{(a,b)}(\lambda )+\chi _{\{a,b\}}(\lambda )/2\), where \(\chi _S\) denotes the indicator function of a set S. Part (i) now follows from the dominated convergence theorem.

To prove (ii), we apply Fubini’s theorem again, now noting that

$$\begin{aligned} \int _a^bi\exp (i\psi )P_D((1-\epsilon ),\psi -\theta )d\psi =\frac{i\exp (i\theta )}{2\pi }\int _{a-\theta }^{b-\theta }\frac{(2\epsilon -\epsilon ^2)\exp (i\psi )}{\epsilon ^2+2(1-\epsilon )(1-\cos (\psi ))}d\psi .\nonumber \\ \end{aligned}$$
(2.2)

We can split the interval of integration into small intervals of width \(\rho \) (where \(0<\rho <1\)) around each point where \(\cos (\psi )=1\), together with a finite union of intervals on which \(1-\cos (\psi )\) is bounded away from 0. On these latter intervals, the limit vanishes as \(\epsilon \downarrow 0\). Hence by periodicity and considering odd and even parts, we are left with considering

$$\begin{aligned} I_1(\rho ,\epsilon )= & {} \int _0^\rho \frac{(2\epsilon -\epsilon ^2)\cos (\psi )}{\epsilon ^2+2(1-\epsilon )(1-\cos (\psi ))}d\psi ,\\ I_2(\rho ,\epsilon )= & {} \int _0^\rho \frac{(2\epsilon -\epsilon ^2)\sin (\psi )}{\epsilon ^2+2(1-\epsilon )(1-\cos (\psi ))}d\psi . \end{aligned}$$

Explicit integration yields \(I_2(\rho ,\epsilon )=O(\epsilon \log (\epsilon ))\), and hence this contribution vanishes in the limit. We also have

$$\begin{aligned} I_1(\rho ,\epsilon )=\frac{(\epsilon ^2-2\epsilon )\rho +2(2+\epsilon ^2-2\epsilon )\tan ^{-1}\left( \frac{(2-\epsilon )\tan \left( \frac{\rho }{2}\right) }{\epsilon }\right) }{2(1-\epsilon )}. \end{aligned}$$

This converges to \(\pi \) as \(\epsilon \downarrow 0\). Considering the contributions of \(I_1\) and \(I_2\) in (2.2), we see that (2.2) converges pointwise as \(\epsilon \downarrow 0\) to

$$\begin{aligned} i\exp (i\theta )\left\{ \chi _{(a,b)}(\theta )+[\chi _{\{a\}}(\theta )+\chi _{\{b\}}(\theta )]/2\right\} . \end{aligned}$$

Since the integral is also bounded, part (ii) now follows from the dominated convergence theorem and change of variables. \(\quad \square \)
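To illustrate part (i) numerically, the following self-contained toy sketch (our own setup: a random Hermitian matrix stands in for T, and the smoothing parameter and quadrature grid are ad hoc choices) compares the Poisson-smoothed integral with the spectral projection obtained from an eigendecomposition.

```python
# Toy numerical check (assumed setup) of Stone's formula, part (i).
import numpy as np

rng = np.random.default_rng(0)
N = 60
A = rng.standard_normal((N, N))
T = (A + A.T) / 2                              # Hermitian stand-in for the operator
x = rng.standard_normal(N)
lam, V = np.linalg.eigh(T)
a = (lam[20] + lam[21]) / 2                    # endpoints chosen strictly between eigenvalues
b = (lam[40] + lam[41]) / 2

def K_H(u, eps):
    """Poisson-smoothed measure K_H(u + i*eps; T, x), written via the resolvent."""
    Rp = np.linalg.solve(T - (u + 1j * eps) * np.eye(N), x)
    Rm = np.linalg.solve(T - (u - 1j * eps) * np.eye(N), x)
    return (Rp - Rm) / (2j * np.pi)

eps = 1e-2
us = np.linspace(a, b, 2001)
vals = np.array([K_H(u, eps) for u in us])
approx = (us[1] - us[0]) * (vals[1:-1].sum(axis=0) + 0.5 * (vals[0] + vals[-1]))  # trapezoid rule

exact = V[:, 21:41] @ (V[:, 21:41].T @ x)      # E^T_{(a,b)} x: eigenvalues 21..40 lie in (a,b)
print(np.linalg.norm(approx.real - exact))     # small (roughly O(eps) plus quadrature error)
```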

This type of construction can be generalised to \(T\in \Omega _{\mathrm {N}}\) whose spectrum lies on a regular enough curve. However, it is much more straightforward in the general case to use the analytic properties of the resolvent. The next proposition does this and also holds for operators whose spectrum does not necessarily lie along a curve.

Fig. 2  Left: Exterior cone condition for Proposition 2.4. Right: Deformed contour \(\gamma _{\epsilon }\) used to compute \(f_{\epsilon }(z_i)\)

Proposition 2.4

(Generalised Stone’s formula). Let \(T\in \Omega _{\mathrm {N}}\) and let \(\gamma \) be a rectifiable, positively oriented Jordan curve with the following properties. The spectrum \(\sigma (T)\) intersects \(\gamma \) at finitely many points \(z_1,\ldots ,z_m\), and in a neighbourhood of each \(z_i\), \(\gamma \) consists of a line segment meeting \(\sigma (T)\) only at \(z_i\), at which point \(\sigma (T)\) satisfies a local exterior cone condition with respect to \(\gamma \) (see Fig. 2). Let \(x\in l^2({\mathbb {N}})\). Then we can define the Cauchy principal value integral of the resolvent R(z,T)x along \(\gamma \) and have

$$\begin{aligned} \frac{-1}{2\pi i}\mathrm {PV}\int _{\gamma } R(z,T)xdz=E^T_{\sigma (T;\gamma )}x-\frac{1}{2}\left[ \sum _{j=1}^mE^T_{\{z_j\}}x\right] , \end{aligned}$$
(2.3)

where \(\sigma (T;\gamma )\) is the closure of the intersection of \(\sigma (T)\) with the interior of \(\gamma \).

Proof

We will argue for the case \(m=1\); the general case follows in exactly the same manner. Let \(\epsilon >0\) be small enough that, in a neighbourhood of the \(\epsilon \)-ball around \(z_1\), \(\gamma \) is given by a straight line. We then decompose \(\gamma \) into two disjoint parts

$$\begin{aligned} \gamma =\gamma _\epsilon ^1\cup \gamma _\epsilon ^2, \end{aligned}$$

where \(\gamma _\epsilon ^2\) denotes the line segment of \(\gamma \) at most \(\epsilon \) away from \(z_1\) (as shown in Fig. 2). We set

$$\begin{aligned} F_\epsilon (x,T)=\int _{\gamma _\epsilon ^1} R(z,T)xdz=\left[ \int _{\sigma (T)}\int _{\gamma _\epsilon ^1}\frac{1}{\lambda -z}dz dE^T(\lambda ) \right] x. \end{aligned}$$

We then consider the inner integral

$$\begin{aligned} f_\epsilon (\lambda )=\int _{\gamma _\epsilon ^1}\frac{1}{\lambda -z}dz. \end{aligned}$$

If \(\lambda \) is inside \(\gamma \) then \(\lim _{\epsilon \downarrow 0}f_\epsilon (\lambda )=-2\pi i\) via Cauchy’s residue theorem. Similarly, if \(\lambda \) is outside \(\gamma \) then \(\lim _{\epsilon \downarrow 0}f_\epsilon (\lambda )=0\). To calculate \(f_{\epsilon }(z_1)\), consider the contour integral along \(\gamma _{\epsilon }\) in Fig. 2. We see that

$$\begin{aligned} f_{\epsilon }(z_1)-i\pi =-2i\pi \end{aligned}$$

and hence \(f_{\epsilon }(z_1)=-i\pi \). We would like to apply the dominated convergence theorem. Clearly, away from \(z_1\), \(f_{\epsilon }\) is bounded as \(\epsilon \downarrow 0\). Now let \(0<\delta <\epsilon \); then

$$\begin{aligned} f_{\delta }(\lambda )-f_{\epsilon }(\lambda )= & {} \int _\delta ^\epsilon \frac{1}{\frac{\lambda -z_1}{w}-s}+\frac{1}{\frac{\lambda -z_1}{w}+s}ds =\log \left( \frac{\epsilon +\frac{\lambda -z_1}{w}}{-\epsilon +\frac{\lambda -z_1}{w}}\right) -\log \left( \frac{\delta +\frac{\lambda -z_1}{w}}{-\delta +\frac{\lambda -z_1}{w}}\right) \end{aligned}$$

for some \(w\in {\mathbb {T}}\). Taking the pointwise limit \(\delta \downarrow 0\), we see that \(f_{\epsilon }(\lambda )\) is bounded for \(\lambda \in \sigma (T)\) in a neighbourhood of \(z_1\) as \(\epsilon \downarrow 0\) if the same holds for

$$\begin{aligned} g_\epsilon (\lambda )=\log \left( \frac{\epsilon +\frac{\lambda -z_1}{w}}{-\epsilon +\frac{\lambda -z_1}{w}}\right) . \end{aligned}$$

By rotating and translating, we can assume that \(w=1\) and \(z_1=0\) without loss of generality. Let \(\lambda _1=\mathrm {Re}(\lambda )\) and \(\lambda _2=\mathrm {Im}(\lambda )\). The cone condition gives \(\alpha \left| \lambda _1\right| \le \left| \lambda _2\right| \) for some \(\alpha >0\). Assume \(\lambda _1\ne 0\); then

$$\begin{aligned} \left| \frac{\epsilon +\lambda }{-\epsilon +\lambda }\right| ^2=\frac{(\epsilon +\lambda _1)^2+\lambda _2^2}{(\epsilon -\lambda _1)^2+\lambda _2^2}=1+\frac{4x}{(x-1)^2+y^2}, \end{aligned}$$

where \(x=\epsilon /\lambda _1\) and \(y=\lambda _2/\lambda _1\). Note that \(y^2\ge \alpha ^2\) and without loss of generality we take \(y\ge \alpha \). Define

$$\begin{aligned} h(x,y)=\frac{4x}{(x-1)^2+y^2}. \end{aligned}$$

Note that \(h(x,y)\rightarrow 0\) as \(\left| x\right| ^2+\left| y\right| ^2\rightarrow \infty \). We must show that \(h(x,y)\) is bounded away from \(-1\) (from above) for \(y\ge \alpha \). It is enough to consider points where \(\partial h/\partial x=0\), which occur at \(x_{\pm }=\pm \sqrt{1+y^2}.\) We have

$$\begin{aligned} h(x_{\pm },y)=\frac{\pm 2}{\sqrt{1+y^2}\mp 1}\ge \frac{-2}{\sqrt{1+\alpha ^2}+1}>-1, \end{aligned}$$

and hence we have proved the required boundedness. We then define

$$\begin{aligned} \mathrm {PV}\int _{\gamma } R(z,T)xdz=\lim _{\epsilon \downarrow 0}F_\epsilon (x,T). \end{aligned}$$

The relation (2.3) now follows from the dominated convergence theorem. \(\quad \square \)

3 Computation of Measures

For the sake of brevity, the analysis in the rest of this paper will consider the self-adjoint case \(T\in \Omega _{\mathrm {SA}}\), which is the case most often encountered in applications. However, the algorithms we build are based on Theorem 2.1 (and Corollary 2.2) and the link with Poisson kernels/Cauchy transforms. Given the relation (1.13) and Proposition 2.4, many of the results can be straightforwardly extended to the unitary case and to more general cases where conditions similar to those of Proposition 2.4 hold. We consider examples of unitary operators in Sect. 6.3.

3.1 Full spectral measure

We start by considering the computation of \(E_U^Tx\), where \(U\subset {\mathbb {R}}\) is a non-trivial open set. In other words, U is not the whole of \({\mathbb {R}}\) or the empty set. The collection of these subsets will be denoted by \({\mathcal {U}}\). To be precise, we assume that we have access to a finite or countable collection \(a_m(U),b_m(U)\in {\mathbb {R}}\cup \{\pm \infty \}\) such that U can be written as a disjoint union

$$\begin{aligned} U=\bigcup _{m}\left( a_m(U),b_m(U)\right) . \end{aligned}$$
(3.1)

With an abuse of notation, we add this information as evaluation functions to \(\Lambda _i\) (defined in (1.18)).

Theorem 3.1

(Computation of measures on open sets). Given the set-up in Sects. 1.5, 1.6 and the previous paragraph, consider the map

$$\begin{aligned} \Xi _{\mathrm {meas}}:\Omega _{f,\alpha ,\beta }\times {\mathcal {U}}&\rightarrow l^2({\mathbb {N}})\\ (T,x,U)&\rightarrow E_U^Tx. \end{aligned}$$

Then \(\{\Xi _{\mathrm {meas}},\Omega _{f,\alpha ,\beta }\times {\mathcal {U}}, \Lambda _1\}\in \Delta _2^A\). In other words, we can construct a convergent sequence of arithmetic algorithms for the problem.

Remark 4

Essentially, this theorem tells us that if we can compute the action of the resolvent operator with asymptotic error control near the real axis, then we can compute the spectral measures of open sets in one limit. In the unitary case, this can easily be extended to relatively open sets of \({\mathbb {T}}\) if we can evaluate the resolvent near the unit circle. For any \(U\in {\mathcal {U}}\), the approximation of \(E_U^Tx\) has finite support, and hence we can take inner products to compute \(\mu _{x,y}^T(U)\).

Remark 5

One may wonder whether it is possible to upgrade the convergence of the algorithm in Theorem 3.1 from \(\Delta _2\) to \(\Delta _1\); in other words, whether it is possible to compute the measure with error control. However, the measure may be singular, and Theorem 5.2 shows that this is impossible even for singleton sets and discrete Schrödinger operators acting on \(l^2({\mathbb {N}})\).

Proof of Theorem 3.1

Let \(T\in \Omega _{\mathrm {SA}}\) and \(z_1,z_2\in {\mathbb {C}}\backslash {\mathbb {R}}\). By the resolvent identity and self-adjointness of T,

$$\begin{aligned} \Vert R(z_1,T)-R(z_2,T)\Vert \le \left| \mathrm {Im}(z_1)\right| ^{-1}\left| \mathrm {Im}(z_2)\right| ^{-1}\left| z_1-z_2\right| . \end{aligned}$$

Hence, for \(z=u+i\epsilon \) with \(\epsilon >0\), the vector-valued function \(K_H(u+i\epsilon ;T,x)\) (considered with argument u) is Lipschitz continuous with Lipschitz constant bounded by \(\epsilon ^{-2}\Vert x\Vert /\pi \). Now consider the class \(\Omega _{f,\alpha ,\beta }\times {\mathcal {U}}\) and let \((T,x,U)\in \Omega _{f,\alpha ,\beta }\times {\mathcal {U}}\). From Corollary 2.2, we can construct a sequence of arithmetic algorithms, \({\widehat{\Gamma }}_n\), such that

$$\begin{aligned} \Vert {\widehat{\Gamma }}_n(T,x,u+i\epsilon )-K_H(u+i\epsilon ;T,x)\Vert \le \frac{C(T,x)}{\epsilon }\left( \alpha _n+\beta _n\right) \end{aligned}$$

for all \((T,x)\in \Omega _{f,\alpha ,\beta }\). It follows from standard quadrature rules and taking subsequences if necessary (using that \(\{\alpha _n\}\) and \(\{\beta _n\}\) are null), that for \(-\infty<a<b<\infty \), the integral

$$\begin{aligned} \int _{a}^b K_H\left( u+\frac{i}{n};T,x\right) du \end{aligned}$$
(3.2)

can be approximated to an accuracy \({\widehat{C}}(T,x)/n\) using finitely many arithmetic operations and comparisons and the relevant set of evaluation functions \(\Lambda _1\) (the constant C now becomes \({\widehat{C}}\) due to not knowing the exact value of \(\Vert x\Vert \)).

Recall that we assumed the disjoint union

$$\begin{aligned} U=\bigcup _{m}(a_m,b_m) \end{aligned}$$

where \(a_m,b_m\in {\mathbb {R}}\cup \{\pm \infty \}\) and the union is at most countable. Without loss of generality, we assume that the union is over \(m\in {\mathbb {N}}\). We then let \(a_{m,n},b_{m,n}\in {\mathbb {Q}}\) be such that \(a_{m,n}\downarrow a_m\) and \(b_{m,n}\uparrow b_m\) as \(n\rightarrow \infty \) with \(a_{m,n}<b_{m,n}\) and hence \((a_{m,n},b_{m,n})\subset (a_m,b_m)\). Let

$$\begin{aligned} U_n=\bigcup _{m=1}^n (a_{m,n},b_{m,n}), \end{aligned}$$

then the proof of Stone’s formula in Proposition 2.3 (essentially an application of the dominated convergence theorem) can be easily adapted to show that

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{U_n} K_H\left( u+\frac{i}{n};T,x\right) du=E^{T}_{U}x. \end{aligned}$$

Note that we do not have to worry about contributions from endpoints of the intervals \((a_m,b_m)\) since we approximate strictly from within the open set U. To finish the proof, we simply let \(\Gamma _{n}(T,x,U)\) be an approximation of the integral

$$\begin{aligned} \int _{U_n} K_H\left( u+\frac{i}{n};T,x\right) du \end{aligned}$$

with accuracy \({\widehat{C}}(T,x)/n\). By the above remarks, such an approximation can be computed using finitely many arithmetic operations and comparisons from the relevant set of evaluation functions \(\Lambda _1\). \(\quad \square \)

This theorem can clearly be extended to cover the more general case of Proposition 2.4 if \(\gamma \) is regular enough to allow approximation of

$$\begin{aligned} \mathrm {PV}\int _{\gamma } R(z,T)xdz, \end{aligned}$$

given the ability to compute \(R(z,T)x\) with asymptotic error control. Note that when it comes to numerically computing the integrals in Propositions 2.3 and 2.4, it is advantageous to deform the contour so that most of it lies far from the spectrum, where the resolvent has a smaller Lipschitz constant. The proof can also be adapted to compute \(E^T_{I}x\), where \(I=[a,b]\) is a closed interval, by considering open intervals shrinking to [a, b] (a, b finite). A special case of this is the computation of the spectral measure of singleton sets. However, for these it is much easier to directly use the formulae

$$\begin{aligned} E^{T}_{\{u\}}x= & {} \lim _{\epsilon \downarrow 0}\epsilon \pi K_H(u+i\epsilon ;T,x),\\ E^{T}_{\{\exp (i\theta )\}}x= & {} \lim _{\epsilon \downarrow 0}\epsilon \pi i\exp (i\theta ) K_D((1-\epsilon )\exp (i\theta );T,x), \end{aligned}$$

for \(T\in \Omega _{\mathrm {SA}}\) and \(T\in \Omega _{\mathrm {U}}\) respectively.
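As a quick sanity check of the first of these formulae, the following sketch (our own toy setup, not from the paper) applies it to a small Hermitian matrix at an exact eigenvalue u; the error decreases as \(\epsilon \downarrow 0\) until rounding error takes over.

```python
# Toy check (assumed setup) of E^T_{{u}} x = lim_{eps->0} eps*pi*K_H(u + i*eps; T, x).
import numpy as np

rng = np.random.default_rng(1)
N = 40
A = rng.standard_normal((N, N)); T = (A + A.T) / 2
x = rng.standard_normal(N)
lam, V = np.linalg.eigh(T)
u = lam[0]                                     # an exact (simple) eigenvalue of T
exact = V[:, :1] @ (V[:, :1].T @ x)            # E^T_{{u}} x, the eigenprojection applied to x
for eps in [1e-2, 1e-4, 1e-6]:
    Rp = np.linalg.solve(T - (u + 1j * eps) * np.eye(N), x)
    Rm = np.linalg.solve(T - (u - 1j * eps) * np.eye(N), x)
    KH = (Rp - Rm) / (2j * np.pi)              # K_H(u + i*eps; T, x)
    print(eps, np.linalg.norm(eps * np.pi * KH.real - exact))
```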

3.2 Measure decompositions and projections

Recall from Sect. 1.5 that \(P_{{\mathcal {I}}}^T\) denotes the orthogonal projection onto the space \({\mathcal {H}}^T_{{\mathcal {I}}}\), where \({\mathcal {I}}\) denotes a generic type (\(\mathrm {ac},\mathrm {sc},\mathrm {pp},\mathrm {c}\) or \(\mathrm {s}\)). We have included the continuous and singular parts, denoted by \(\mathrm {c}\) and \(\mathrm {s}\), which correspond to \({\mathcal {H}}_{\mathrm {ac}}\oplus {\mathcal {H}}_{\mathrm {sc}}\) and \({\mathcal {H}}_{\mathrm {sc}}\oplus {\mathcal {H}}_{\mathrm {pp}}\) respectively; these are often encountered in mathematical physics. As in Sect. 3.1, we assume the decomposition in (3.1) and add the \(\{a_m,b_m\}\) as evaluation functions to \(\Lambda _i\) (defined in (1.18)). In this section, we prove the following theorem.

Theorem 3.2

Given the set-up in Sects. 1.5, 1.6 and 3.1, consider the map

$$\begin{aligned} \Xi _{{\mathcal {I}}}:\Omega _{f,\alpha ,\beta }\times V_{\beta }\times {\mathcal {U}}&\rightarrow {\mathbb {C}}\\ (T,x,y,U)&\rightarrow \langle P_{{\mathcal {I}}}^TE_U^Tx,y\rangle =\mu ^T_{x,y,{\mathcal {I}}}(U), \end{aligned}$$

for \({\mathcal {I}}=\mathrm {ac},\mathrm {sc},\mathrm {pp},\mathrm {c}\) or \(\mathrm {s}\). Then for \(i=1,2\)

$$\begin{aligned} \Delta ^G_2 \not \ni \{\Xi _{{\mathcal {I}}},\Omega _{f,\alpha ,\beta }\times V_{\beta }\times {\mathcal {U}},\Lambda _i\} \in \Delta ^A_3. \end{aligned}$$

To prove this theorem, it is enough, by the polarisation identity, to consider \(x=y\) (note that all the projections commute). We will split the proof into two parts: the \(\Delta ^A_3\) inclusion, for which it is enough to consider \(\Lambda _1\), and the \(\Delta ^G_2\) exclusion, for which it is enough to consider \(\Lambda _2\).

3.2.1 Proof of inclusion in Theorem 3.2

Proof of inclusion in Theorem 3.2

Since \(P^T_{\mathrm {pp}}=I-P^T_{\mathrm {c}}\), \(P^T_{\mathrm {ac}}=I-P^T_{\mathrm {s}}\) and \(P^T_{\mathrm {sc}}=P^T_{\mathrm {s}}-P^T_{\mathrm {pp}}\), it is enough, by Theorem 3.1 and Remark 4, to consider only \({\mathcal {I}}=\mathrm {c}\) and \({\mathcal {I}}=\mathrm {s}\).

Step 1: We first deal with \({\mathcal {I}}=\mathrm {c}\), where we shall use a similar argument to the proof of Theorem 4.1 (which is more general than what we need here). We recall the RAGE theorem [2, 53, 119] as follows. Let \(Q_n\) denote the orthogonal projection onto the subspace of vectors in \(l^2({\mathbb {N}})\) supported outside the set \(\{1,\ldots ,n\}\subset {\mathbb {N}}\). Then for any \(x\in l^2({\mathbb {N}})\),

$$\begin{aligned} \begin{aligned} \langle P_{\mathrm {c}}^TE_U^Tx,x\rangle =\Vert P_{\mathrm {c}}^TE_U^Tx\Vert ^2&=\lim _{n\rightarrow \infty }\lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t\left\| Q_ne^{-iTs}E_U^Tx\right\| ^2 ds\\&=\lim _{n\rightarrow \infty }\lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t\left\| Q_ne^{-iTs}\chi _{U}(T)x\right\| ^2 ds. \end{aligned} \end{aligned}$$
(3.3)

The proof of Theorem 4.1 is easily adapted to show that there exist arithmetic algorithms \({\widetilde{\Gamma }}_{n,m}\) using \(\Lambda _1\) such that

$$\begin{aligned} \Vert Q_ne^{-iTs}\chi _{U}(T)x-{\widetilde{\Gamma }}_{n,m}(T,x,U,s)\Vert \le \frac{C(T,x,U)}{m} \end{aligned}$$

for all \((T,x,U,s)\in \Omega _{f,\alpha ,\beta }\times {\mathcal {U}}\times {\mathbb {R}}\). Note that this bound can be made independent of s (as written above) by approximating the function \(\lambda \rightarrow \exp (-i\lambda s)\chi _{U}(\lambda )\) sufficiently well (for a given s, it has known total variation and a uniform bound). We now define

$$\begin{aligned} \Gamma _{n,m}(T,x,U)=\frac{1}{m^2}\sum _{j=1}^{m^2}\Vert {\widetilde{\Gamma }}_{n,m}(T,x,U,j/m)\Vert ^2. \end{aligned}$$

Using the fact that for \(a,b\in l^2({\mathbb {N}})\),

$$\begin{aligned} \left| \langle a,a \rangle -\langle b,b \rangle \right| \le \Vert a-b\Vert \left( 2\Vert a\Vert +\Vert a-b\Vert \right) , \end{aligned}$$
(3.4)

it follows that

$$\begin{aligned} \left| \Vert Q_ne^{-iTs}\chi _{U}(T)x\Vert ^2-\Vert {\widetilde{\Gamma }}_{n,m}(T,x,U,s)\Vert ^2\right| \le \frac{C(T,x,U)}{m}\left( 2\Vert x\Vert +\frac{C(T,x,U)}{m}\right) . \end{aligned}$$

Hence

$$\begin{aligned}&\left| \Gamma _{n,m}(T,x,U)-\frac{1}{m}\int _0^m\left\| Q_ne^{-iTs}\chi _{U}(T)x\right\| ^2 ds\right| \\&\quad \le \frac{1}{m^2}\sum _{j=1}^{m^2}\frac{C(T,x,U)}{m}\left( 2\Vert x\Vert +\frac{C(T,x,U)}{m}\right) \\&\qquad +\frac{1}{m^2}\sum _{j=1}^{m^2}\left| g_n(j/m)-m\int _{\frac{j-1}{m}}^{\frac{j}{m}}g_n(s)ds\right| , \end{aligned}$$

where \(g_n(s)=\Vert Q_ne^{-iTs}\chi _{U}(T)x\Vert ^2.\) Clearly the first term converges to 0 as \(m\rightarrow \infty \), so we only need to consider the second. Using (3.4), it follows that for any \(\epsilon >0\)

$$\begin{aligned} \left| g_n(s)-g_n(s+\epsilon )\right|\le & {} 4\Vert Q_ne^{-iTs}(e^{-iT\epsilon }-I)\chi _{U}(T)x\Vert \Vert x\Vert \\\le & {} 4\Vert x\Vert \Vert (e^{-iT\epsilon }-I)\chi _{U}(T)x\Vert . \end{aligned}$$

But \(e^{-iT\epsilon }-I\) converges strongly to 0 as \(\epsilon \downarrow 0\) and hence the quantity

$$\begin{aligned} \left| g_n(j/m)-m\int _{\frac{j-1}{m}}^{\frac{j}{m}}g_n(s)ds\right| \rightarrow 0 \end{aligned}$$

uniformly in j as \(m\rightarrow \infty \). It follows that

$$\begin{aligned} \lim _{m\rightarrow \infty }\Gamma _{n,m}(T,x,U)=\lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t\left\| Q_ne^{-iTs}E_U^Tx\right\| ^2 ds \end{aligned}$$

and hence

$$\begin{aligned} \lim _{n\rightarrow \infty }\lim _{m\rightarrow \infty }\Gamma _{n,m}(T,x,U)=\langle P_{\mathrm {c}}^TE_U^Tx,x\rangle . \end{aligned}$$
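As an aside, the time-averaged quantity above is easy to explore numerically. In the following hedged sketch (assumed setup: a large truncation of the free discrete Schrödinger operator, with 2 on the diagonal and \(-1\) on the off-diagonals, stands in for the half-line operator; the limits in n and t are only sampled; all eigenvalues of the truncation lie in (0, 4), so the factor \(\chi _{(0,4)}(T)\) is omitted), the average should be close to \(\langle P_{\mathrm {c}}^TE_{(0,4)}^Te_1,e_1\rangle =1\), since the spectral measure of \(e_1\) for this operator is purely absolutely continuous on (0, 4).

```python
# Hedged illustration (assumed setup; truncation size, n and the time horizon are ad hoc).
import numpy as np

M = 1200                                             # truncation stands in for l^2(N)
H = np.diag(2.0*np.ones(M)) + np.diag(-np.ones(M-1), 1) + np.diag(-np.ones(M-1), -1)
lam, V = np.linalg.eigh(H)                           # eigenvalues lie in (0,4), so chi_{(0,4)}(H)e_1 = e_1
c = V[0, :]                                          # coefficients of e_1 in the eigenbasis
n, t_max, dt = 50, 400.0, 1.0
vals = []
for s in np.arange(0.0, t_max, dt):
    psi = V @ (np.exp(-1j * lam * s) * c)            # e^{-iHs} e_1
    vals.append(np.linalg.norm(psi[n:])**2)          # ||Q_n e^{-iHs} e_1||^2
print(np.mean(vals))                                 # close to 1 (approaches 1 as n and t grow)
```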

Step 2: Next we deal with the case \({\mathcal {I}}=\mathrm {s}\). Note that for \(z\in {\mathbb {C}}\backslash {\mathbb {R}}\), \(\langle R(z,T)x,x \rangle \) is simply the Stieltjes transform (also called the Borel transform) of the positive measure \(\mu _{x,x}^T\)

$$\begin{aligned} \langle R(z,T)x,x \rangle =\int _{{\mathbb {R}}}\frac{1}{\lambda -z}d\mu _{x,x}^T(\lambda ). \end{aligned}$$

The Hilbert transform of \(\mu ^T_{x,x}\) is given by the limit

$$\begin{aligned} H_{\mu _{x,x}^T}(t)=\frac{1}{\pi }\lim _{\epsilon \downarrow 0}\mathrm {Re}\left( \langle R(t+i\epsilon ,T)x,x \rangle \right) , \end{aligned}$$

with the limit existing (Lebesgue) almost everywhere. This object was studied in [108, 109], where we shall use the result (since the measure is positive) that for any bounded continuous function f,Footnote 6

$$\begin{aligned} \lim _{\theta \rightarrow \infty }\frac{\pi \theta }{2}\int _{{\mathbb {R}}}f(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge \theta \}}(t)dt=\int _{{\mathbb {R}}}f(t)d\mu _{x,x,\mathrm {s}}^T(t). \end{aligned}$$
(3.5)

Now let \((T,x,U)\in \Omega _{f,\alpha ,\beta }\times {\mathcal {U}}\) with

$$\begin{aligned} U=\bigcup _{m}(a_m,b_m), \end{aligned}$$

where \(a_m,b_m\in {\mathbb {R}}\cup \{\pm \infty \}\) and the disjoint union is at most countable as in (3.1). Without loss of generality, we assume that the union is over \(m\in {\mathbb {N}}\). Due to the possibility of point spectra at the endpoints \(a_m,b_m\), we cannot simply replace f by \(\chi _{U}\) in the above limit (3.5). However, this can be overcome in the following manner.

Let \(\partial U\) denote the boundary of U defined by \({\overline{U}}\backslash U\) and let \(\nu \) denote the measure \(\mu _{x,x}^T|_{\partial U}\). Let \(\{f_{l}\}_{l\in {\mathbb {N}}}\) denote a pointwise increasing sequence of continuous functions, converging everywhere up to \(\chi _{U}\), such that the support of each \(f_l\) is contained in

$$\begin{aligned}{}[-l,l]\bigcap \left( \bigcup _{m=1}^{l}\left( a_m+1/\sqrt{l},b_m-1/\sqrt{l}\right) \right) . \end{aligned}$$

Such a sequence exists (and can easily be explicitly constructed) precisely because U is open. We first claim that

$$\begin{aligned} \lim _{l\rightarrow \infty }\frac{\pi l}{2}\int _{{\mathbb {R}}}f_l(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge l\}}(t)dt=\mu _{x,x,\mathrm {s}}^T(U). \end{aligned}$$
(3.6)

To see this note that for any \(k\in {\mathbb {N}}\), the following inequalities hold

$$\begin{aligned} \begin{aligned} \liminf _{l\rightarrow \infty }\frac{\pi l}{2}\int _{{\mathbb {R}}}f_l(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge l\}}(t)dt&\ge \liminf _{l\rightarrow \infty }\frac{\pi l}{2}\int _{{\mathbb {R}}}f_k(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge l\}}(t)dt\\&=\int _{{\mathbb {R}}}f_k(t)d\mu _{x,x,\mathrm {s}}^T(t), \end{aligned} \end{aligned}$$

where the last equality is due to (3.5). Taking \(k\rightarrow \infty \) yields

$$\begin{aligned} \liminf _{l\rightarrow \infty }\frac{\pi l}{2}\int _{{\mathbb {R}}}f_l(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge l\}}(t)dt\ge \mu _{x,x,\mathrm {s}}^T(U), \end{aligned}$$
(3.7)

so we are left with proving a similar bound for the limit supremum. Note that any point in the support of \(f_l\) is of distance at least \(1/\sqrt{l}\) from \(\partial U\). It follows that there exists a constant C independent of t such that for any \(t\in \mathrm {supp}(f_l)\),

$$\begin{aligned} \left| H_{\nu }(t)\right| \le C\sqrt{l} \end{aligned}$$

Now let \(\epsilon \in (0,1)\). Then, for large l, \(l-C\sqrt{l}\ge (1-\epsilon )l\) and hence

$$\begin{aligned} \mathrm {supp}(f_l)\cap \{w:|H_{\mu _{x,x}^T}(w)|\ge l\}\subset \mathrm {supp}(f_l)\cap \{w:|H_{\mu _{x,x}^T-\nu }(w)|\ge (1-\epsilon )l\}. \end{aligned}$$
(3.8)

Now let f be any bounded continuous function such that \(f\ge \chi _U\). Then using (3.8),

$$\begin{aligned} \begin{aligned}&\limsup _{l\rightarrow \infty }\frac{\pi l}{2}\int _{{\mathbb {R}}}f_l(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge l\}}(t)dt\\&\quad \le \limsup _{l\rightarrow \infty }\frac{1}{1-\epsilon }\frac{\pi (1-\epsilon )l}{2}\int _{{\mathbb {R}}}f_l(t)\chi _{\{w:|H_{\mu _{x,x}^T-\nu }(w)|\ge (1-\epsilon )l\}}(t)dt\\&\quad \le \limsup _{l\rightarrow \infty }\frac{1}{1-\epsilon }\frac{\pi (1-\epsilon )l}{2}\int _{{\mathbb {R}}}f(t)\chi _{\{w:|H_{\mu _{x,x}^T-\nu }(w)|\ge (1-\epsilon )l\}}(t)dt\\&\quad =\frac{1}{1-\epsilon }\int _{{\mathbb {R}}}f(t)d([\mu _{x,x}^T-\nu ]_{\mathrm {s}})(t). \end{aligned} \end{aligned}$$

Now we let \(f\downarrow \chi _{{\overline{U}}}\), with pointwise convergence everywhere. This is possible since the complement of \({\overline{U}}\) is open. By the dominated convergence theorem, and since \(\epsilon \) was arbitrary, this yields

$$\begin{aligned} \limsup _{l\rightarrow \infty }\frac{\pi l}{2}\int _{{\mathbb {R}}}f_l(t)\chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge l\}}(t)dt\le [\mu _{x,x}^T-\nu ]_{\mathrm {s}}({\overline{U}})=\mu _{x,x,{\mathrm {s}}}^T(U), \end{aligned}$$

where the last equality follows from the definition of \(\nu \). The claim (3.6) now follows.

Let \(\chi _n\) be a sequence of non-negative continuous piecewise affine functions on \({\mathbb {R}}\), bounded by 1 and such that \(\chi _n(t)=0\) if \(t\le n-1\) and \(\chi _n(t)=1\) if \(t \ge n+1\). Consider the integrals

$$\begin{aligned} I(n,m)=\frac{\pi n}{2}\int _{{\mathbb {R}}}f_n(t)\chi _n(\left| F_m(t)\right| )dt, \end{aligned}$$

where \(F_m(t)\) is an approximation of

$$\begin{aligned} \frac{1}{\pi }\mathrm {Re}\left( \left\langle R\left( t+\frac{i}{m},T\right) x,x \right\rangle \right) \end{aligned}$$

to pointwise accuracy \(O(m^{-1})\) over \(t\in [-n,n]\). Note that a suitable piecewise affine function \(f_n\) can be constructed using \(\Lambda _1\), as can suitable \(\chi _n\), and a suitable approximation function \(F_m\) can be evaluated pointwise using \(\Lambda _1\) (again by Corollary 2.2). To see this, recall the definition of \(\Lambda _1\) in (1.18) and that we added \(\{a_m,b_m\}\) from (3.1) to \(\Lambda _1\). To define \(f_n\), we can use suitable piecewise affine functions on each interval \([-n,n]\cap \left( a_m+1/\sqrt{n},b_m-1/\sqrt{n}\right) \). It follows that there exist arithmetic algorithms \(\Gamma _{n,m}(T,x,U)\) using \(\Lambda _1\) such that

$$\begin{aligned} \left| I(n,m)-\Gamma _{n,m}(T,x,U)\right| \le \frac{C(T,x,U)}{m}. \end{aligned}$$

The dominated convergence theorem implies that

$$\begin{aligned} \lim _{m\rightarrow \infty }\Gamma _{n,m}(T,x,U)=\lim _{m\rightarrow \infty }I(n,m)=\frac{\pi n}{2}\int _{{\mathbb {R}}}f_n(t)\chi _n(|H_{\mu _{x,x}^T}(t)|)dt. \end{aligned}$$

Note that continuity of \(\chi _n\) is needed to gain convergence almost everywhere and prevent possible oscillations about the level set \(\{H_{\mu _{x,x}^T}(t)=n\}\). We also have

$$\begin{aligned} \chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge n+1\}}(t)\le \chi _n(|H_{\mu _{x,x}^T}(t)|)\le \chi _{\{w:|H_{\mu _{x,x}^T}(w)|\ge n-1\}}(t) \end{aligned}$$

The same arguments used to prove (3.6) therefore show that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\pi n}{2}\int _{{\mathbb {R}}}f_n(t)\chi _n(|H_{\mu _{x,x}^T}(t)|)dt=\mu _{x,x,\mathrm {s}}^T(U). \end{aligned}$$

Hence,

$$\begin{aligned} \lim _{n\rightarrow \infty }\lim _{m\rightarrow \infty }\Gamma _{n,m}(T,x,U)=\mu _{x,x,\mathrm {s}}^T(U), \end{aligned}$$

completing the proof of inclusion in Theorem 3.2. \(\quad \square \)
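The level-set formula (3.5) used in Step 2 can also be illustrated directly on a scalar toy measure (our own example, unrelated to any particular operator): for \(\mu =0.3\,\delta _{1/2}+\mathrm {Leb}|_{[0,1]}\), the quantity \(\frac{\pi \theta }{2}\left| \{t:|H_{\mu }(t)|\ge \theta \}\right| \) should approach the singular mass 0.3 as \(\theta \rightarrow \infty \). The smoothing parameter and grid below are ad hoc choices.

```python
# Toy illustration (assumed setup; scalar measure rather than mu^T_{x,x}) of (3.5).
import numpy as np

eps = 1e-8                                            # smoothing in the Stieltjes transform
ts = np.linspace(-0.5, 1.5, 400001)                   # grid must resolve widths ~ 1/theta
h = ts[1] - ts[0]

def H_smoothed(t):
    atom = 0.3 * (0.5 - t) / ((0.5 - t)**2 + eps**2)                 # 0.3*delta_{1/2} part
    ac = 0.5 * np.log(((1.0 - t)**2 + eps**2) / (t**2 + eps**2))     # Lebesgue|_[0,1] part (closed form)
    return (atom + ac) / np.pi

Hmu = H_smoothed(ts)
for theta in [20.0, 50.0, 100.0]:
    level_measure = h * np.count_nonzero(np.abs(Hmu) >= theta)
    print(theta, np.pi * theta / 2 * level_measure)   # -> 0.3, the singular mass
```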

3.2.2 Proof of exclusion in Theorem 3.2

To prove the exclusion, we need two results which will also be used in Sect. 5. First, we consider a result connected to Anderson localisation (Theorem 3.3) and, second, we consider a result concerning sparse potentials of discrete Schrödinger operators (Theorem 3.4). The free Hamiltonian \(H_0\) acts on \(l^2({\mathbb {N}})\) via the tridiagonal matrix representation

$$\begin{aligned} H_0=\begin{pmatrix} 2 & -1 & & \\ -1 & 2 & -1 & \\ & -1 & 2 & \ddots \\ & & \ddots & \ddots \end{pmatrix}. \end{aligned}$$

We define a Schrödinger operator acting on \(l^2({\mathbb {N}})\) to be an operator of the form

$$\begin{aligned} H_v=H_0+v, \end{aligned}$$

where v is a bounded (real-valued) multiplication operator with matrix \(\mathrm {diag}(v(1),v(2),\ldots )\).
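For the numerical arguments and experiments that follow, it is convenient to have finite sections of such operators available. The following is a small helper sketch (the function name and interface are our own, not from the paper), using a sparse tridiagonal representation.

```python
# Minimal helper sketch (assumed interface) building a finite section of H_v = H_0 + v.
import numpy as np
from scipy.sparse import diags

def schrodinger_section(v, M):
    """(M x M) upper-left section of H_v, where v(k) gives the potential at site k."""
    main = 2.0 + np.array([v(k) for k in range(1, M + 1)], dtype=float)
    off = -np.ones(M - 1)
    return diags([off, main, off], offsets=[-1, 0, 1], format="csr")

H0_section = schrodinger_section(lambda k: 0.0, 500)   # free Hamiltonian H_0, truncated
```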

Since Anderson’s introduction of his famous model 60 years ago [3], there has been a considerable amount of work by both physicists and mathematicians aiming to understand the suppression of electron transport due to disorder (Anderson localisation). A full discussion of Anderson localisation is beyond the scope of this paper, and we refer the reader to [29, 42, 90] for broader surveys. When considering Anderson localisation, we will assume that \(v=v_{\omega }=\{v(1),v(2),\ldots \}\) is a collection of independent identically distributed random variables. Following [67], we assume that the single-site probability distribution has a density \(\rho \in L^1({\mathbb {R}})\cap L^\infty ({\mathbb {R}})\) with \(\Vert \rho \Vert _1=1\) (with respect to the standard Lebesgue measure). For such a potential, a measure of disorder is given by the quantity \(\Vert \rho \Vert _{\infty }^{-1}\). The first result we need is the following theorem, which follows straightforwardly from the techniques of [67]; we therefore omit the proof, which would be almost verbatim that of [67].

Theorem 3.3

(Anderson Localisation for Perturbed Operator [67]). There exists a constant \(C>0\) such that if \(\Vert \rho \Vert _{\infty }\le C\) and \(\rho \) has compact support, then the operator \(H_{v_\omega }+A\) has only pure point spectrum with probability 1 for any fixed self-adjoint operator A of the form

$$\begin{aligned} A=\sum _{j=1}^M\alpha _j\left| x_{m_j}\right\rangle \left\langle x_{n_j}\right| . \end{aligned}$$
(3.9)

In other words, the operator A’s matrix with respect to the canonical basis has only finitely many non-zeros.

The second result we need is the following from [92].

Theorem 3.4

(Krutikov and Remling [92]). Consider discrete Schrödinger operators acting on \(l^2({\mathbb {N}})\). Let v be a (real-valued and bounded) potential of the following form:

$$\begin{aligned} v(n)=\sum _{j=1}^\infty g_j\delta _{n,m_j},\quad m_{j-1}/m_{j}\rightarrow 0. \end{aligned}$$

Then \([0,4]\subset \sigma _{\mathrm {ess}}(H_v)\) and the following dichotomy holds:

  (a)

    If \(\sum _{j\in {\mathbb {N}}}g_j^2<\infty \) then \(H_v\) is purely absolutely continuous on (0, 4).

  (b)

    If \(\sum _{j\in {\mathbb {N}}}g_j^2=\infty \) then \(H_v\) is purely singular continuous on (0, 4).

Proof of exclusion in Theorem 3.2

Since \(P^T_{\mathrm {pp}}=I-P^T_{\mathrm {c}}\), \(P^T_{\mathrm {ac}}=I-P^T_{\mathrm {s}}\) and \(P^T_{\mathrm {sc}}=P^T_{\mathrm {s}}-P^T_{\mathrm {pp}}\), it is enough, by Theorem 3.1 and Remark 4, to consider \({\mathcal {I}}=\mathrm {pp}\), \(\mathrm {ac}\) and \(\mathrm {sc}\). We restrict the proof to considering bounded Schrödinger operators \(H_v\) acting on \(l^2({\mathbb {N}})\), which are clearly a subclass of \(\Omega _{f,0}\) for \(f(n)=n+1\). Note that since the evaluation functions in \(\Lambda _2\) can be recovered from those in \(\Lambda _1\) in this special case, we can assume that we are dealing with \(\Lambda _1\). We also set \(x=e_1\), with the crucial properties that this vector is cyclic and hence \(\mu ^{H_v}_{e_1,e_1}\) has the same support as \(\sigma (H_v)\), and that \(x\in {V}_{0}\). Throughout, we also take \(U=(0,4)\).

Step 1: We begin with \(P^T_{\mathrm {pp}}\). Suppose for a contradiction that there does exist a sequence of general algorithms \(\Gamma _n\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Gamma _n(H_v)=\langle P^{H_v}_{\mathrm {pp}}E_{(0,4)}^{H_v}e_1,e_1\rangle . \end{aligned}$$

We take a general algorithm, denoted \({\widehat{\Gamma }}_n\), from Theorem 3.1 which has

$$\begin{aligned} \lim _{n\rightarrow \infty }{\widehat{\Gamma }}_n(H_v)=\mu _{e_1,e_1}^{H_v}((0,4)). \end{aligned}$$

Since \(e_1\) is cyclic, this limit is non-zero if \((0,4)\cap \sigma (H_v)\ne \emptyset \). We therefore define

$$\begin{aligned} {\widetilde{\Gamma }}_n(H_v)={\left\{ \begin{array}{ll} 0\quad \text {if }{\widehat{\Gamma }}_n(H_v)=0\\ \frac{\Gamma _n(H_v)}{{\widehat{\Gamma }}_n(H_v)}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

We will use Theorem 3.3 and the following well-known facts:

  1.

    If for every \(l\in {\mathbb {N}}\) there exists \(m_l\) such that \(v({m_l+1})=v({m_l+2})=\cdots =v({m_l+l})=0\), then \((0,4)\subset \sigma (H_v)\).

  2.

    If there exists \(N\in {\mathbb {N}}\) such that v(n) is 0 for \(n\ge N\), then \(\sigma _{\mathrm {pp}}(H_v)\cap (0,4)=\emptyset \) [114], but \([0,4]\subset \sigma (H_v)\) (the potential acts as a compact perturbation so the essential spectrum is [0, 4]).

  3.

    If we are in the setting of Theorem 3.3, then the spectrum of \(H_{v_\omega }+A\) is pure point almost surely. Moreover, if \(\rho =\chi _{[-c,c]}/(2c)\) for some constant c, then \([-c,4+c]\subset \sigma _{\mathrm {pp}}(H_{v_\omega }+A)\) almost surely.

The strategy will be to construct a potential v such that \((0,4)\subset \sigma (H_v)\), yet \({\widetilde{\Gamma }}_n(H_v)\) does not converge. This is a contradiction since by our assumptions, for such a v we must have

$$\begin{aligned} {\widetilde{\Gamma }}_n(H_v)\rightarrow \frac{\langle P^{H_v}_{\mathrm {pp}}E_{(0,4)}^{H_v}e_1,e_1\rangle }{\mu _{e_1,e_1}^{H_v}((0,4))}. \end{aligned}$$

To do this, choose \(\rho =\chi _{[-c,c]}/(2c)\) for some constant c such that the conditions of Theorem 3.3 hold and define the potential v inductively as follows.

Let \(v_1\) be a potential of the form \(v_{\omega }\) (with the density \(\rho \)) such that \(\sigma (H_{v_1})\) is pure point. Such a \(v_1\) exists by Theorem 3.3 and we have \(\langle P^{H_{v_1}}_{\mathrm {pp}}E_{(0,4)}^{H_{v_1}}e_1,e_1\rangle =\mu _{e_1,e_1}^{H_{v_1}}((0,4))\). Hence for large enough n it must hold that \({\widetilde{\Gamma }}_n(H_{v_1})>3/4\). Fix \(n=n_1\) such that this holds. Then \(\Gamma _{n_1}(H_{v_1})\) only depends on \(\{v_1(j):j\le N_1\}\) for some integer \(N_1\) by (i) of Definition A.2. Define the potential \(v_2\) by \(v_2(j)=v_1(j)\) for all \(j\le N_1\) and \(v_2(j)=0\) otherwise. Then by fact (2) above, \(\langle P^{H_{v_2}}_{\mathrm {pp}}E_{(0,4)}^{H_{v_2}}e_1,e_1\rangle =0\) but \(\mu _{e_1,e_1}^{H_{v_2}}((0,4))\ne 0\), and hence \({\widetilde{\Gamma }}_n(H_{v_2})<1/4\) for large n, say for \(n=n_2>n_1\). But then \(\Gamma _{n_2}(H_{v_2})\) only depends on \(\{v_2(j):j\le N_2\}\) for some integer \(N_2\).

We repeat this process inductively, switching between potentials which induce \({\widetilde{\Gamma }}_{n_k}(H_{v_{k}})<1/4\) for k even and potentials which induce \({\widetilde{\Gamma }}_{n_k}(H_{v_{k}})>3/4\) for k odd. Explicitly, if k is even then define a potential \(v_{k+1}\) by \(v_{k+1}(j)=v_k(j)\) for all \(j\le N_k\) and \(v_{k+1}(j)=v_{\omega }(j)\) (with the density \(\rho \)) otherwise, chosen such that the spectrum of \(H_{v_{k+1}}\) is pure point. Such an \(\omega \) exists by Theorem 3.3, applied with the perturbation A chosen to match the potential for \(j\le N_k\). If k is odd, then we define \(v_{k+1}\) by \(v_{k+1}(j)=v_k(j)\) for all \(j\le N_k\) and \(v_{k+1}(j)=0\) otherwise. We can then choose \(n_{k+1}\) such that the above inequalities hold and \(N_{k+1}\) such that \(\Gamma _{n_{k+1}}(H_{v_{k+1}})\) only depends on \(\{v_{k+1}(j):j\le N_{k+1}\}\). We also ensure that \(N_{k+1}\ge N_k+k\).

Finally, set \(v(j)=v_{k}(j)\) for \(j\le N_{k}\). It is clear from (iii) of Definition A.2 that \({\widetilde{\Gamma }}_{n_k}(H_v)={\widetilde{\Gamma }}_{n_k}(H_{v_{k}})\), and this implies that \({\widetilde{\Gamma }}_{n_k}(H_v)\) cannot converge. However, since \(N_{k+1}\ge N_k+k\), for any odd k we have \(v({N_k+1})=v({N_k+2})=\cdots =v({N_k+k})=0\). Fact (1) then implies that \((0,4)\subset \sigma (H_v)\), hence \(\mu _{e_1,e_1}^{H_v}((0,4))\ne 0\) and therefore \({\widetilde{\Gamma }}_n(H_v)\) converges. This provides the required contradiction.

Step 2: Next we deal with \({\mathcal {I}}=\mathrm {ac}\). To prove that one limit will not suffice, our strategy will be to reduce a certain decision problem to the computation of \(\Xi _{\mathrm {ac}}\). Let \(({\mathcal {M}}',d')\) be the discrete space \(\{0,1\}\), let \(\Omega '\) denote the collection of all infinite sequences \(\{a_{j}\}_{j\in {\mathbb {N}}}\) with entries \(a_{j}\in \{0,1\}\) and consider the problem function

$$\begin{aligned} \Xi '(\{a_{j}\}):\text { `Does }\{a_{j}\}\text { have infinitely many non-zero entries?'}, \end{aligned}$$

which maps to \(({\mathcal {M}}',d')\). In Appendix A, it is shown that \(\mathrm {SCI}(\Xi ',\Omega ')_{G} = 2\) (where the evaluation functions consist of component-wise evaluation of the array \(\{a_{j}\}\)). Suppose for a contradiction that \(\Gamma _{n}\) is a height one tower of general algorithms such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Gamma _n(H_v)=\langle P^{H_v}_{\mathrm {ac}}E_{(0,4)}^{H_v}e_1,e_1\rangle . \end{aligned}$$

We will gain a contradiction by using the supposed tower to solve \(\{\Xi ',\Omega '\}\).

Given \(\{a_{j}\}\in \Omega '\), consider the operator \(H_v\), where the potential is of the following form:

$$\begin{aligned} v(m)=\sum _{k=1}^\infty a_{k}\delta _{m,k!}. \end{aligned}$$
(3.10)

Then by Theorem 3.4, \(\langle P^{H_v}_{\mathrm {ac}}E_{(0,4)}^{H_v}e_1,e_1\rangle =\mu ^{H_v}_{e_1,e_1}((0,4))\) if \(\sum _{k}a_{k}<\infty \) (that is, if \(\Xi '(\{a_j\})=0\)) and \(\langle P^{H_v}_{\mathrm {ac}}E_{(0,4)}^{H_v}e_1,e_1\rangle =0\) otherwise. Note that in either case we have \(\mu ^{H_v}_{e_1,e_1}((0,4))\ne 0\). We follow Step 1 and take a general algorithm, denoted \({\widehat{\Gamma }}_n\), from Theorem 3.1 which has

$$\begin{aligned} \lim _{n\rightarrow \infty }{\widehat{\Gamma }}_n(H_v)=\mu _{e_1,e_1}^{H_v}((0,4)). \end{aligned}$$

Since \(e_1\) is cyclic, this limit is non-zero for \(H_v\), where v is of the form (3.10). We therefore define

$$\begin{aligned} {\widetilde{\Gamma }}_n(H_v)={\left\{ \begin{array}{ll} 0\quad \text {if }{\widehat{\Gamma }}_n(H_v)=0\\ \frac{\Gamma _n(H_v)}{{\widehat{\Gamma }}_n(H_v)}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

It follows that

$$\begin{aligned} \lim _{n\rightarrow \infty }{\widetilde{\Gamma }}_n(H_v)={\left\{ \begin{array}{ll} 1\quad \text {if }\Xi '(\{a_j\})=0\\ 0\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Given any index, we can evaluate the corresponding matrix entry of \(H_v\) using only finitely many evaluations of \(\{a_{j}\}\), and hence the evaluation functions in \(\Lambda _1\) can be computed using component-wise evaluations of the sequence \(\{a_j\}\). We now set

$$\begin{aligned} \overline{\Gamma }_{n}(\{a_{j}\})={\left\{ \begin{array}{ll} 0\quad \text {if }{\widetilde{\Gamma }}_n(H_v)>\frac{1}{2}\\ 1\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

The above comments show that each \(\overline{\Gamma }_{n}\) is a general algorithm, and it is clear that \(\overline{\Gamma }_{n}(\{a_{j}\})\) converges to \(\Xi '(\{a_j\})\) as \(n\rightarrow \infty \), yielding the required contradiction.

Step 3: Finally, we must deal with \({\mathcal {I}}=\mathrm {sc}\). The argument is the same as Step 2, replacing \(\langle P^{H_v}_{\mathrm {ac}}E_{(0,4)}^{H_v}e_1,e_1\rangle \) with \(\langle P^{H_v}_{\mathrm {sc}}E_{(0,4)}^{H_v}e_1,e_1\rangle \) and the resulting \({\widetilde{\Gamma }}_n(H_v)\) with \(1-{\widetilde{\Gamma }}_n(H_v)\). \(\quad \square \)

4 Two Important Applications

Two important applications of our techniques are the computation of the functional calculus and of the Radon–Nikodym derivative of \(\mu _{x,y,\mathrm {ac}}^T\) with respect to Lebesgue measure, denoted by \(\rho _{x,y}^T\). Both of these have applications throughout mathematics and the physical sciences, some of which are explored numerically in Sect. 6. For example, suppose that we wish to solve the Schrödinger equation

$$\begin{aligned} \frac{du}{dt}=-iHu,\quad u_{t=0}=u_0, \end{aligned}$$

where H is some self-adjoint Hamiltonian. We can express the solution at time t as

$$\begin{aligned} u(t)=\exp (-iHt)u_0=\left[ \int _{{\mathbb {R}}}\exp (-i\lambda t)dE^H(\lambda )\right] u_0. \end{aligned}$$

For example, the quantity

$$\begin{aligned} L(t)=\langle u(t),u_0\rangle =\int _{{\mathbb {R}}}\exp (-i\lambda t)d\mu _{u_0,u_0}^H(\lambda ), \end{aligned}$$

known as the autocorrelation function [141], is simply the Fourier transform of the spectral measure \(d\mu _{u_0,u_0}^H\). In particular, if \(d\mu _{u_0,u_0}^H\) is absolutely continuous, then \(\rho _{u_0,u_0}^H\) and L form a Fourier transform pair. The computation of the evolution generated by an operator is in some sense dual to the computation of the spectral measure. This interpretation of a time evolution can be adapted to describe many signals generated by PDEs [48, 85, 126] and stochastic processes [65, 88] [118, Ch. 7]. In this section, we show how to compute the functional calculus and \(\rho _{x,y}^T\).
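As a concrete (hedged) illustration of this duality, the following sketch computes the autocorrelation function for the free Hamiltonian \(H_0\) of Sect. 3.2.2 with \(u_0=e_1\) in two ways: by time evolution of a finite truncation (a surrogate for the half-line operator, accurate for the times shown) and by Fourier transforming the spectral measure, which for this operator is known to be \(d\mu _{e_1,e_1}^{H_0}(\lambda )=\frac{1}{2\pi }\sqrt{\lambda (4-\lambda )}\,d\lambda \) on [0, 4]. Truncation size, grid and times are ad hoc choices.

```python
# Hedged sketch: L(t) = <exp(-iH_0 t)e_1, e_1> via a truncation and via the known
# absolutely continuous spectral measure of e_1 for the free Hamiltonian H_0.
import numpy as np

M = 1000
H = np.diag(2.0*np.ones(M)) + np.diag(-np.ones(M-1), 1) + np.diag(-np.ones(M-1), -1)
lam, V = np.linalg.eigh(H)
c = V[0, :]                                                    # coefficients of e_1 in the eigenbasis

grid = np.linspace(0.0, 4.0, 200001)
rho = np.sqrt(np.clip(grid * (4.0 - grid), 0.0, None)) / (2 * np.pi)   # density of mu_{e_1,e_1}
dg = grid[1] - grid[0]

for t in [1.0, 5.0, 20.0]:
    L_evolution = np.sum(c**2 * np.exp(-1j * lam * t))         # <e^{-iHt}e_1, e_1> for the truncation
    integrand = np.exp(-1j * grid * t) * rho
    L_measure = dg * (integrand[1:-1].sum() + 0.5 * (integrand[0] + integrand[-1]))  # trapezoid rule
    print(t, abs(L_evolution - L_measure))                     # the two agree closely
```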

4.1 Computation of the functional calculus

Recall that given a possibly unbounded complex-valued Borel function F, defined on \(\sigma (T)\), and \(T\in \Omega _{\mathrm {N}}\), F(T) is defined by

$$\begin{aligned} F(T)=\int _{{\sigma (T)}}F(\lambda )dE^T(\lambda ). \end{aligned}$$

F(T) is a closed normal operator with dense domain given by

$$\begin{aligned} {\mathcal {D}}(F(T))=\left\{ x\in l^2({\mathbb {N}}):\int _{\sigma (T)}\left| F(\lambda )\right| ^2d\mu _{x,x}^T(\lambda )<\infty \right\} . \end{aligned}$$

For simplicity, we will only deal with the case that F is a bounded continuous function on \({\mathbb {R}}\), that is, \(F\in C_b({\mathbb {R}})\). In this case \({\mathcal {D}}(F(T))\) is the whole of \(l^2({\mathbb {N}})\) (the variations \(|\mu ^T_{x,y}|\) are finite) and we can use standard properties of the Poisson kernel. We assume that given \(F\in C_b({\mathbb {R}})\), we have access to piecewise constant functions \(F_n\) supported in \([-n,n]\) such that \(\Vert F-F_n\Vert _{L^\infty ([-n,n])}\le n^{-1}\). Clearly other suitable data also suffices and, as usual, we abuse notation slightly by adding this information to the evaluation sets \(\Lambda _i\) (recall that \(\Lambda _i\) are defined in (1.18)).

Theorem 4.1

(Computation of the functional calculus). Given the set-up in Sects. 1.5 and 1.6, consider the map

$$\begin{aligned} \Xi _{\mathrm {fun}}:\Omega _{f,\alpha ,\beta }\times C_b({\mathbb {R}})&\rightarrow l^2({\mathbb {N}})\\ (T,x,F)&\rightarrow F(T)x. \end{aligned}$$

Then \(\{\Xi _{\mathrm {fun}},\Omega _{f,\alpha ,\beta }\times C_b({\mathbb {R}}), \Lambda _1\}\in \Delta _2^A\).

Proof

Let \((T,x,F)\in \Omega _{f,\alpha ,\beta }\times C_b({\mathbb {R}})\); then, by Fubini’s theorem,

$$\begin{aligned} \int _{-n}^n K_H(u+i/n;T,x)F_n(u)du=\left[ \int _{-\infty }^\infty \int _{-n}^nP_H(u-\lambda ,1/n)F_n(u)du\ dE^T(\lambda )\right] x. \end{aligned}$$

The inner integral is bounded since F is bounded and the Poisson kernel integrates to 1 along the real line. It also converges to \(F(\lambda )\) everywhere. Hence by the dominated convergence theorem

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{-n}^n K_H(u+i/n;T,x)F_n(u)du=F(T)x. \end{aligned}$$

We now use the same arguments used to prove Theorem 3.1. Using Corollary 2.2, together with \(\Vert K_H(u+i/n;T,x)\Vert _{L^\infty ({\mathbb {R}})}\le nC_1\) and the fact that \(K_H(u+i/n;T,x)\) is Lipschitz continuous with Lipschitz constant \(n^2C_2\) for some (possibly unknown) constants \(C_1\) and \(C_2\), we can approximate this integral with an error that vanishes in the limit \(n\rightarrow \infty \). \(\quad \square \)

If \(\sigma (T)\) is bounded, then, with slightly more information available to our algorithms, a simpler proof holds using the Stone–Weierstrass theorem. Suppose that, given x, the vectors \(T^nx\) can be computed to arbitrary precision. There exists a sequence of polynomials \(p_m(z)\) converging uniformly to F(z) on \(\sigma (T)\). Assuming such a sequence can be explicitly constructed (for example, using Bernstein or Chebyshev polynomials), we can take \(p_m(T)x\) as approximations of F(T)x. If we can bound \(\Vert p_m(z)-F(z)\Vert _{\sigma (T)}\le \epsilon _m\) with \(\epsilon _m\) null, then the vector F(T)x can be computed with error control. However, computing \(T^nx\) for large n (even if \(x=e_1\)) may be computationally expensive, as was found in the example in Sect. 6.4. We will also see in Sect. 6.4 that if \(\sigma (T)\) is bounded and F is analytic in an open neighbourhood of \(\sigma (T)\), then F(T)x can be computed with error control by deforming the integration contour away from the spectrum. Such a deformation is useful since the resolvent does not blow up along the deformed contour, and we can bound its Lipschitz constant.
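The following is a hedged sketch of the Chebyshev variant of this idea (our own implementation; the helper name, the toy matrix and the degree are assumptions, and no rigorous bound \(\epsilon _m\) is tracked): F(T)x is approximated by \(p_m(T)x\), evaluated with the Chebyshev three-term recurrence so that only matrix–vector products with T are needed.

```python
# Hedged sketch (assumed implementation) of the polynomial functional calculus:
# F(T)x ~= p_m(T)x with p_m a Chebyshev interpolant of F on [a,b] containing sigma(T).
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_fun_calc(apply_T, x, F, a, b, deg):
    """Approximate F(T)x for self-adjoint T with sigma(T) contained in [a, b]."""
    coeffs = C.chebinterpolate(lambda s: F((b - a) / 2 * s + (a + b) / 2), deg)
    apply_S = lambda v: (2 * apply_T(v) - (a + b) * v) / (b - a)   # S maps [a,b] to [-1,1]
    t_prev, t_curr = x, apply_S(x)                                 # T_0(S)x, T_1(S)x
    result = coeffs[0] * t_prev + coeffs[1] * t_curr
    for ck in coeffs[2:]:
        t_prev, t_curr = t_curr, 2 * apply_S(t_curr) - t_prev      # Chebyshev recurrence
        result = result + ck * t_curr
    return result

# Toy check against the eigendecomposition of a small Hermitian matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((30, 30)); T = (A + A.T) / 2
x = rng.standard_normal(30)
lam, V = np.linalg.eigh(T)
exact = V @ (np.exp(lam) * (V.T @ x))                              # exp(T)x
approx = cheb_fun_calc(lambda v: T @ v, x, np.exp, lam.min(), lam.max(), deg=40)
print(np.linalg.norm(approx - exact))                              # near machine precision
```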

4.2 Computation of the Radon–Nikodym derivative

Recall the definition of the Radon–Nikodym derivative in (1.16) and note that \(\rho _{x,y}^T\in L^1({\mathbb {R}})\) for \(T\in \Omega _{\mathrm {SA}}\). We consider its computation in the \(L^1\) sense in the following theorem, where, as before, we assume (3.1), adding the approximations of U to our evaluation set \(\Lambda _1\) (defined in (1.18)), along with component-wise evaluations of a given vector y. However, we must consider the computation away from the singular part of the spectrum.

Theorem 4.2

(Computation of the Radon–Nikodym derivative). Given the set-up in Sects. 1.5 and 1.6, consider the map

$$\begin{aligned} \Xi _{\mathrm {RN}}:\Omega _{f,\alpha ,\beta }\times l^2({\mathbb {N}})\times {\mathcal {U}}&\rightarrow L^1({\mathbb {R}})\\ (T,x,y,U)&\rightarrow \rho _{x,y}^T|_{U}. \end{aligned}$$

We restrict this map to the quadruples \((T,x,y,U)\) such that U is strictly separated from \(\mathrm {supp}(\mu _{x,y,\mathrm {sc}}^T)\cup \mathrm {supp}(\mu _{x,y,\mathrm {pp}}^T)\) and denote this subclass by \({\widetilde{\Omega }}_{f,\alpha ,\beta }\). Then \(\{\Xi _{\mathrm {RN}},{\widetilde{\Omega }}_{f,\alpha ,\beta }, \Lambda _1\}\in \Delta _2^A\). Furthermore, each output \(\Gamma _n(T,x,y,U)\) of the algorithms constructed in the proof consists of a piecewise affine function, supported in U, with rational knots and taking (complex) rational values at these knots.

Remark 6

Essentially, this theorem tells us that if we can compute the action of the resolvent operator with asymptotic error control, then we can compute the Radon–Nikodym derivative of the absolutely continuous part of the measures on open sets which are a positive distance away from the singular support of the measure. The assumption that U is separated from \(\mathrm {supp}(\mu _{x,y,\mathrm {sc}}^T)\cup \mathrm {supp}(\mu _{x,y,\mathrm {pp}}^T)\) may seem unnatural but is needed to gain \(L^1\) convergence of the approximation. However, without it, the proof still gives almost everywhere pointwise convergence.

Proof

Let \((T,x,y,U)\in {\widetilde{\Omega }}_{f,\alpha ,\beta }\). For \(u\in U\) we decompose as follows

$$\begin{aligned} \langle K_H(u+i\epsilon ;T,x),y \rangle= & {} \frac{1}{\pi }\int _{{\mathbb {R}}}\frac{\epsilon }{(\lambda -u)^2+\epsilon ^2}\rho _{x,y}^T(\lambda )d\lambda \nonumber \\&+\frac{1}{\pi }\int _{{\mathbb {R}}\backslash U}\frac{\epsilon }{(\lambda -u)^2 +\epsilon ^2}\left\{ d\mu _{{x,y},\mathrm {sc}}^T(\lambda )+d\mu _{{x,y},\mathrm {pp}}^T(\lambda )\right\} .\nonumber \\ \end{aligned}$$
(4.1)

The first term converges to \(\rho _{x,y}^T|_{U}\) in \(L^1(U)\) as \(\epsilon \downarrow 0\) since \(\rho _{x,y}^T|_U\in L^1(U)\). Since we assumed that U is separated from \(\mathrm {supp}(\mu _{{x,y},\mathrm {sc}}^T)\cup \mathrm {supp}(\mu _{{x,y},\mathrm {pp}}^T)\), it follows that the second term of (4.1) converges to 0 in \(L^1(U)\) as \(\epsilon \downarrow 0\). Hence we are done if we can approximate \(\langle K_H(u+i/n;T,x),y \rangle \) in \(L^1(U)\) with an error converging to zero as \(n\rightarrow \infty \).

Recall that \(K_H(u+i/n;T,x)\) is Lipschitz continuous with Lipschitz constant at most \(n^2\Vert x\Vert /\pi \). By assumption, and using Corollary 2.2, we can approximate \(K_H(u+i/n;T,x)\) to asymptotic precision with vectors of finite support. Hence the inner product

$$\begin{aligned} f_n(u):=\langle K_H(u+i/n;T,x),y \rangle \end{aligned}$$

can be approximated to asymptotic precision (now with a possibly unknown constant also depending on \(\Vert y\Vert \)) and \(f_n\) is Lipschitz continuous with Lipschitz constant at most \(n^2\Vert x\Vert \Vert y\Vert /\pi \).

Recall that U can be written as the disjoint union

$$\begin{aligned} U=\bigcup _{m}(a_m,b_m), \end{aligned}$$

where \(a_m,b_m\in {\mathbb {R}}\cup \{\pm \infty \}\) and the union is at most countable. Without loss of generality, we assume that the union is over \(m\in {\mathbb {N}}\). Given an interval \((a_m,b_m)\), let \(a_m<z_{m,1,n}< z_{m,2,n}<\cdots<z_{m,r_m,n}<b_m\) be such that \(z_{m,j,n}\in {\mathbb {Q}}\), \(\left| z_{m,j,n}-z_{m,j+1,n}\right| \le (b_m-a_m)^{-1}n^{-3}m^{-2}\) and \(\left| a_m-z_{m,1,n}\right| ,\left| b_m-z_{m,r_m,n}\right| \le n^{-1}\). Let \(f_{m,n}\) be a piecewise affine interpolant with knots \(z_{m,1,n},\ldots ,z_{m,r_m,n}\), supported on \((z_{m,1,n},z_{m,r_m,n})\), with the property that \(\left| f_{m,n}(z_{m,j,n})-f_n(z_{m,j,n})\right| <C(b_m-a_m)^{-1}n^{-1}m^{-2}\). Here C is some unknown constant arising from the asymptotic approximation of \(f_n\) provided by Corollary 2.2, and we can always compute such an \(f_{m,n}\) using finitely many arithmetic operations and comparisons.

Let \(\Gamma _n(T,x,y,U)\) be the function that agrees with \(f_{m,n}\) on \((a_m,b_m)\) for \(m\le n\) and is zero elsewhere. Clearly the nodes of \(\Gamma _n(T,x,y,U)\) can be computed using finitely many arithmetic operations and comparisons and the relevant set of evaluation functions \(\Lambda _1\). A simple application of the triangle inequality implies that

$$\begin{aligned} \begin{aligned}&\int _{U}\left| \Gamma _n(T,x,y,U)(u)-\rho _{x,y}^T(u)\right| du\\&\quad \le \sum _{m>n}\int _{(a_m,b_m)}\left| \rho _{x,y}^T(u)\right| du+\sum _{m\le n}\int _{(a_m,b_m)\backslash (z_{m,1,n},z_{m,r_m,n})}\left| \rho _{x,y}^T(u)\right| du\\&\qquad +\sum _{m\le n}\int _{(z_{m,1,n},z_{m,r_m,n})}\left| \rho _{x,y}^T(u)-f_n(u)\right| du+\frac{{\widetilde{C}}(x,y,T)}{n}\sum _{m\le n}\frac{1}{m^2}, \end{aligned} \end{aligned}$$

where the last term arises due to the piecewise affine interpolant. The bound clearly converges to zero as required. \(\quad \square \)
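The following hedged sketch (assumed setup: the free Hamiltonian \(H_0\) with \(x=y=e_1\), a finite truncation as a surrogate, and ad hoc values of \(\epsilon \) and of the truncation size) illustrates the first term in (4.1): the smoothed quantity \(\langle K_H(u+i\epsilon ;T,e_1),e_1\rangle \) approximates the density \(\rho _{e_1,e_1}^{H_0}(u)=\frac{1}{2\pi }\sqrt{u(4-u)}\) at points u away from the endpoints of [0, 4].

```python
# Hedged sketch (assumed setup) of pointwise evaluation of the Radon-Nikodym derivative
# for T = H_0 and x = y = e_1 via the Poisson-smoothed resolvent.
import numpy as np

M, eps = 1500, 0.03
H = np.diag(2.0*np.ones(M)) + np.diag(-np.ones(M-1), 1) + np.diag(-np.ones(M-1), -1)
e1 = np.zeros(M); e1[0] = 1.0
for u in [0.5, 1.0, 2.0, 3.0]:
    r = np.linalg.solve(H - (u + 1j * eps) * np.eye(M), e1)   # R(u + i*eps)e_1 (truncated)
    density = r[0].imag / np.pi                               # <K_H(u + i*eps; H_0, e_1), e_1>
    exact = np.sqrt(u * (4.0 - u)) / (2 * np.pi)
    print(u, density, exact)                                  # agree to roughly two decimal places
```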

5 Computing Spectra as Sets

We now turn to computing the different types of spectra as sets in the complex plane. Specifically, define the problem functions \(\Xi _{{\mathcal {I}}}^{{\mathbb {C}}}(T)=\sigma _{{\mathcal {I}}}(T)\) for \({\mathcal {I}}=\mathrm {ac},\mathrm {sc}\) or \(\mathrm {pp}\). Note also that \(\sigma _{\mathrm {pp}}(T)=\overline{\sigma _{\mathrm {p}}(T)}\), the closure of the set of eigenvalues. Recalling the definition of a computational problem in Appendix A, we compute these quantities in a metric space \({\mathcal {M}}\) with metric \(d_{{\mathcal {M}}}\). Since we wish to include unbounded operators, we use the Attouch–Wets metric defined by

$$\begin{aligned} d_{\mathrm {AW}}(C_1,C_2)=\sum _{n=1}^{\infty } 2^{-n}\min \left\{ {1,\underset{\left| x\right| \le n}{\sup }\left| \mathrm {dist}(x,C_1)-\mathrm {dist}(x,C_2)\right| }\right\} , \end{aligned}$$
(5.1)

for \(C_1,C_2\in \mathrm {Cl}({\mathbb {C}}),\) where \(\mathrm {Cl}({\mathbb {C}})\) denotes the set of closed non-empty subsets of \({\mathbb {C}}\). When considering bounded T, whose spectrum is necessarily a compact subset of \({\mathbb {C}}\), we let \(({\mathcal {M}},d_{{\mathcal {M}}})\) be the set of all non-empty compact subsets of \({\mathbb {C}}\) provided with the Hausdorff metric \(d_{{\mathcal {M}}}=d_{\mathrm {H}}\):

$$\begin{aligned} d_{\mathrm {H}}(X,Y) = \max \left\{ \sup _{x \in X} \inf _{y \in Y} d(x,y), \sup _{y \in Y} \inf _{x \in X} d(x,y) \right\} , \end{aligned}$$
(5.2)

where \(d(x,y) = |x-y|\) is the usual Euclidean distance. Note that for compact sets (and hence for bounded operators), the topological notions of convergence according to \(d_\mathrm {H}\) and \(d_\mathrm {AW}\) coincide. To allow the possibility that the different spectral sets \(\sigma _{{\mathcal {I}}}(T)\) are empty, we add the empty set to our metric space as an isolated point (the space remains metrisable).Footnote 7
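For illustration only, the following minimal Python sketch estimates both metrics for closed sets represented by finite point samples. The truncation of the series in (5.1) to finitely many terms and the crude sampling grid are simplifying assumptions made for this sketch; they are not part of the algorithms analysed below.

```python
import numpy as np

def hausdorff(X, Y):
    # Hausdorff distance (5.2) between two finite sets of complex points
    D = np.abs(X[:, None] - Y[None, :])
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def attouch_wets(C1, C2, n_terms=20, grid=64):
    # truncation of the Attouch-Wets series (5.1) for finite samples of closed sets
    total = 0.0
    for n in range(1, n_terms + 1):
        t = np.linspace(-n, n, grid)
        xs = (t[:, None] + 1j * t[None, :]).ravel()
        xs = xs[np.abs(xs) <= n]                       # restrict to the disc |x| <= n
        d1 = np.abs(xs[:, None] - C1[None, :]).min(axis=1)
        d2 = np.abs(xs[:, None] - C2[None, :]).min(axis=1)
        total += 2.0 ** (-n) * min(1.0, np.abs(d1 - d2).max())
    return total

# example: a sampled approximation of [0,4] versus the same set with one spurious point
C1 = np.linspace(0.0, 4.0, 101) + 0j
C2 = np.concatenate([C1, np.array([5.0 + 0j])])
print(hausdorff(C1, C2), attouch_wets(C1, C2))
```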

The main theorem of this section is the following:

Theorem 5.1

Given the above setup and that in Sects. 1.5 and 1.6, for \(i=1,2\) it holds that

$$\begin{aligned}&\Delta ^G_2 \not \ni \{\Xi _{\mathrm {ac}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _i\} \in \Delta ^A_3,\quad \Delta ^G_2 \not \ni \{\Xi _{\mathrm {sc}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _i\} \in \Delta ^A_4,\\&\Delta ^G_2 \not \ni \{\Xi _{\mathrm {pp}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _i\} \in \Delta ^A_3. \end{aligned}$$

If \(f(n)-n\ge \sqrt{2n}+\frac{1}{2}\), then \(\{\Xi _{\mathrm {sc}}^{{\mathbb {C}}},\Omega _{f,0},\Lambda _i\}\not \in \Delta ^G_3\) also holds.

In order to prove Theorem 5.1, we only need to prove the lower bounds for \(\Lambda _2\) and the upper bounds for \(\Lambda _1\) (recall that \(\Lambda _i\) are defined in (1.18)). These results show that despite the results of Sects. 2–4, in general it is very hard to compute the decomposition of the spectrum in the sense of (1.17). We also answer the question posed in Sect. 2.2 and prove that the spectral measures, while computable in one limit, cannot, in general, be computed with error control (see Theorem 5.2), unless one has additional regularity assumptions such as in [39] (computation with error control is made precise in [33, Ch. 4]).

5.1 Proof for point spectra

Proof that \(\{\Xi _{\mathrm {pp}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _2\}\notin \Delta _2^G\). To prove this, it is enough to consider bounded Schrödinger operators acting on \(l^2({\mathbb {N}})\) (defined in Sect. 3.2.2), which are clearly a subclass of \(\Omega _{f,0}\) for \(f(n)=n+1\). Note that since the evaluation functions in \(\Lambda _2\) can be recovered from those in \(\Lambda _1\) in this special case, we can assume that we are dealing with \(\Lambda _1\). Suppose for a contradiction that there does exist a sequence of general algorithms, \(\Gamma _n\), with

$$\begin{aligned} \lim _{n\rightarrow \infty }\Gamma _n(H_v)=\Xi _{\mathrm {pp}}^{{\mathbb {C}}}(H_v). \end{aligned}$$

We will construct a potential v such that \(\Gamma _n(H_v)\) does not converge. To do this, choose \(\rho =\chi _{[-c,c]}/(2c)\) for some constant c such that the conditions of Theorem 3.3 hold. We will use Theorem 3.3 and the following well-known facts:

  1. If there exists \(N\in {\mathbb {N}}\) such that v(n) is 0 for \(n\ge N\), then \(\sigma _{\mathrm {pp}}(H_v)\cap (0,4)=\emptyset \) [114], but \([0,4]\subset \sigma (H_v)\) (the potential acts as a compact perturbation so the essential spectrum is \([0,4]\)).

  2. If we are in the setting of Theorem 3.3, then the spectrum of \(H_{v_\omega }+A\) is pure point almost surely. Moreover, if \(\rho =\chi _{[-c,c]}/(2c)\) for some constant c, then \([-c,4+c]\subset \sigma _{\mathrm {pp}}(H_{v_\omega }+A)\) almost surely.

We will define the potential v inductively as follows. Let \(v_1\) be a potential of the form \(v_{\omega }\) (with density \(\rho \)) such that \([-c,4+c]\subset \sigma (H_{v_1})\) and \(\sigma (H_{v_1})\) is pure point. Such a \(v_1\) exists by Theorem 3.3 and fact (2) above. Then for large enough n there exists \(z_n\in \Gamma _n(H_{v_1})\) such that \(\left| z_n-2\right| \le 1\). Fix \(n=n_1\) such that this holds. Then \(\Gamma _{n_1}(H_{v_1})\) only depends on \(\{v_1(j):j\le N_1\}\) for some integer \(N_1\) by (i) of Definition A.2. Define the potential \(v_2\) by \(v_2(j)=v_1(j)\) for all \(j\le N_1\) and \(v_2(j)=0\) otherwise. Then by fact (1) above, \(\Gamma _n(H_{v_2})\cap [1/2,7/2]=\emptyset \) for large n, say for \(n_2\). But then \(\Gamma _{n_2}(H_{v_2})\) only depends on \(\{v_2(j):j\le N_2\}\) for some integer \(N_2\).

We repeat this process inductively switching between potentials which induce \(\Gamma _{n_k}(H_{v_k})\cap [1/2,7/2]=\emptyset \) for k even and potentials which induce \(\Gamma _{n_k}(H_{v_k})\cap [1,3]\ne \emptyset \) for k odd. Explicitly, if k is even then define a potential \(v_{k+1}\) by \(v_{k+1}(j)=v_k(j)\) for all \(j\le N_k\) and \(v_{k+1}(j)=v_{\omega }(j)\) (with the density \(\rho \)) otherwise such that \([-c,4+c]\subset \sigma (H_{v_{k+1}})\) and \(\sigma (H_{v_{k+1}})\) is pure point. Such an \(\omega \) exists by Theorem 3.3 and fact (2) above applied with the perturbation A to match the potential for \(j\le N_k\). If k is odd then we define \(v_{k+1}\) by \(v_{k+1}(j)=v_k(j)\) for all \(j\le N_k\) and \(v_{k+1}(j)=0\) otherwise. We can then choose \(n_{k+1}\) such that the above intersections hold and \(N_{k+1}\) such that \(\Gamma _{n_{k+1}}(H_{v_{k+1}})\) only depends on \(\{v_{k+1}(j):j\le N_{k+1}\}\). Finally set \(v(j)=v_{k}(j)\) for \(j\le N_{k}\). It is clear from (iii) of Definition A.2 that \(\Gamma _{n_k}(H_{v})=\Gamma _{n_k}(H_{v_k})\). But then this implies that \(\Gamma _{n_k}(H_{v})\) cannot converge, the required contradiction. \(\quad \square \)

A similar argument gives the following theorem, where \({\mathbb {V}}\) is used to denote the set of bounded real-valued potentials on \({\mathbb {N}}\) and \(\Lambda _3\) denotes the pointwise evaluations of such potentials.

Theorem 5.2

(Impossibility of computing spectral measures with error control). Consider the problem function

$$\begin{aligned} \begin{aligned} {\widehat{\Xi }}:{\mathbb {V}}\times {\mathbb {N}}&\rightarrow {\mathbb {R}}_{\ge 0}\\ (v,j)&\rightarrow \langle E_{\{1\}}^{H_v}e_j, e_j\rangle \end{aligned}. \end{aligned}$$

Then \(\{{\widehat{\Xi }},{\mathbb {V}}\times {\mathbb {N}},\Lambda _3\}\in \Delta _2^A\) but \(\{{\widehat{\Xi }},{\mathbb {V}}\times {\mathbb {N}},\Lambda _3\}\notin \Delta _1^G\). In other words, \({\widehat{\Xi }}\) can be computed in one limit, but it cannot be computed with error control.

Proof

The result \(\{{\widehat{\Xi }},{\mathbb {V}}\times {\mathbb {N}},\Lambda _3\}\in \Delta _2^A\) follows directly from the remarks after Theorem 3.1 and Theorem 2.1. Suppose for a contradiction that \(\{{\widehat{\Xi }},{\mathbb {V}}\times {\mathbb {N}},\Lambda _3\}\in \Delta _1^G\) and that \(\Gamma _n\) is a sequence of general algorithms solving the problem with error control. It follows that for each \(j\in {\mathbb {N}}\), there exists a sequence of general algorithms \(\Gamma _n^j\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Gamma _n^j(v)={\left\{ \begin{array}{ll} 1,\quad &{}\text {if }{\widehat{\Xi }}(v,j)>0\\ 0,\quad &{}\text {otherwise} \end{array}\right. }. \end{aligned}$$

Informally, these are described as follows. Fix j and consider the lower bound on \({\widehat{\Xi }}(v,j)\) computed by \(\{\Gamma _m(v,j):m\le n\}\). If this is greater than 0 then set \(\Gamma _n^j(v)=1\), otherwise set \(\Gamma _n^j(v)=0\). It follows that \(\Gamma _n^j(v)\) also converges from below. It holds that \(1\in \sigma _{\mathrm {p}}(H_v)\) if and only if \({\widehat{\Xi }}(v,j)>0\) for some \(j\in {\mathbb {N}}\). Now define

$$\begin{aligned} {\widehat{\Gamma }}_{n}(v)=\sup _{j\le n}\Gamma ^j_n(v). \end{aligned}$$

It is clear that this is a general algorithm using \(\Lambda _3\). Furthermore,

$$\begin{aligned} \lim _{n\rightarrow \infty }{\widehat{\Gamma }}_n(v)={\left\{ \begin{array}{ll} 1,\quad &{}\text {if }1\in \sigma _{\mathrm {p}}(H_v)\\ 0,\quad &{}\text {otherwise} \end{array}\right. }, \end{aligned}$$

with convergence from below.

Now we may choose a v such that \(1\in \sigma _{\mathrm {p}}(H_v)\) (this can be achieved for example by taking a potential which induces pure point spectrum and shifting the operator accordingly). It follows that \({\widehat{\Gamma }}_n(v)=1\) for all sufficiently large n; fix such an n. The computation of \({\widehat{\Gamma }}_n(v)\) only depends on v(j) for \(j<N\) for some \(N\in {\mathbb {N}}\). Define \(v_0\in {\mathbb {V}}\) by \(v_0(j)=v(j)\) if \(j<N\) and \(v_0(j)=0\) otherwise. It follows that \({\widehat{\Gamma }}_n(v_0)=1\). But since \(v_0\) has compact support, \(1\notin \sigma _{\mathrm {p}}(H_{v_0})\) and hence \({\widehat{\Gamma }}_n(v_0)=0\), the required contradiction. \(\quad \square \)

We now prove that \(\Xi _{\mathrm {pp}}^{\mathbb {C}}\) can be computed using a height two arithmetical tower. The first step is the following technical lemma, whose proof will also be used later when considering \(\Xi _{\mathrm {ac}}^{\mathbb {C}}\).

Lemma 5.3

Let \(a<b\) with \(a,b\in {\mathbb {R}}\) and consider the decision problem

$$\begin{aligned} \Xi _{a,b,\mathrm {pp}}:\Omega _{f,\alpha }&\rightarrow \{0,1\}\\ T&\rightarrow {\left\{ \begin{array}{ll} 1\quad \text {if }\sigma _{\mathrm {pp}}(T)\cap [a,b]\ne \emptyset \\ 0\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Then there exists a height two arithmetical tower \(\Gamma _{n_2,n_1}\) (with evaluation functions \(\Lambda _1\)) for \(\Xi _{a,b,\mathrm {pp}}\). Furthermore, the final limit is from below in the sense that \(\Gamma _{n_2}(T):=\lim _{n_1\rightarrow \infty }\Gamma _{n_2,n_1}(T)\le \Xi _{a,b,\mathrm {pp}}(T)\).

Proof

Step 1 of the proof of Theorem 3.2 yields a height two arithmetical tower \({\widehat{\Gamma }}_{n_2,n_1}^j(T)\) for the computation of \(\mu _{e_j,e_j,\mathrm {c}}^T((a,b))\). The final limit is from above and, using the fact that \(\mu _{e_j,e_j,\mathrm {c}}^T(\{a,b\})=0\), we obtain a height two tower for \(\mu _{e_j,e_j,\mathrm {c}}^T([a,b])\). We can then use the height one tower for \(\mu _{e_j,e_j}^T([a,b])\) constructed in Sect. 2.2, denoted by \({\widetilde{\Gamma }}_{n_1}^j(T)\), and define

$$\begin{aligned} a_{j,n_2,n_1}(T)={\widetilde{\Gamma }}_{n_1}^j(T)-{\widehat{\Gamma }}_{n_2,n_1}^j(T). \end{aligned}$$

This provides a height two arithmetical tower for \(\mu _{e_j,e_j,\mathrm {pp}}^T([a,b])\) with the final limit from below. Without loss of generality (by taking successive maxima), we can assume that these towers are non-decreasing in \(n_2\). Now set

$$\begin{aligned} \Upsilon _{n_2,n_1}(T)=\max _{1\le j\le n_2} a_{j,n_2,n_1}(T). \end{aligned}$$

Then it is clear that the limit \(\lim _{n_1\rightarrow \infty }\Upsilon _{n_2,n_1}(T)=\Upsilon _{n_2}(T)\) exists. Furthermore, the monotonicity of \(a_{j,n_2,n_1}(T)\) in \(n_2\) implies that

$$\begin{aligned} \lim _{n_2\rightarrow \infty }\Upsilon _{n_2}(T)=\sup _{n\in {\mathbb {N}}}\mu _{e_n,e_n,\mathrm {pp}}^T([a,b]), \end{aligned}$$

with monotonic convergence from below. This limiting value is zero if \(\Xi _{a,b,\mathrm {pp}}(T)=0\), otherwise it is a positive finite number.

To convert this to a height two tower for the decision problem \(\Xi _{a,b,\mathrm {pp}}\), which maps to the discrete space \(\{0,1\}\), we use the following trick. Consider the intervals \( J_1^{n_2}=[0,1/n_2], \) and \( J_2^{n_2}=[2/n_2,\infty ). \) Let \(k(n_2,n_1)\le n_1\) be maximal such that \(\Upsilon _{n_2,k}(T)\in J_1^{n_2}\cup J_2^{n_2}\). If no such k exists or \(\Upsilon _{n_2,k}(T)\in J_1^{n_2}\) then set \({\Gamma }_{n_2,n_1}(T)=0\). Otherwise set \({\Gamma }_{n_2,n_1}(T)=1\). These can be computed using finitely many arithmetic operations and comparisons using \(\Lambda _1\). The point of the intervals \(J_1^{n_2}\) and \(J_2^{n_2}\) is that we can show \(\lim _{n_1\rightarrow \infty }{\Gamma }_{n_2,n_1}(T)={\Gamma }_{n_2}(T)\) exists. This is because \(\lim _{n_1\rightarrow \infty }\Upsilon _{n_2,n_1}(T)=\Upsilon _{n_2}(T)\) exists and hence we cannot oscillate infinitely often between the separated intervals \(J_1^{n_2}\) and \(J_2^{n_2}\). Now suppose that \(\Xi _{a,b,\mathrm {pp}}(T)=0\); then \(\Upsilon _{n_2}(T)=0\) and hence \(\lim _{n_1\rightarrow \infty }\Gamma _{n_2,n_1}(T)=0\) for all \(n_2\). Now suppose that \(\Xi _{a,b,\mathrm {pp}}(T)=1\), then for large enough \(n_2\) we must have that \(\Upsilon _{n_2}(T)>2/n_2\) and hence \(\Gamma _{n_2}(T)=1\). Together, these prove the convergence and that \(\Gamma _{n_2}(T)\le \Xi _{a,b,\mathrm {pp}}(T)\). \(\quad \square \)
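For illustration, this conversion step can be phrased in a few lines of code. The Python sketch below is purely structural: it assumes the finite list of values \(\Upsilon _{n_2,1}(T),\ldots ,\Upsilon _{n_2,n_1}(T)\) has already been computed and is not an implementation of the towers themselves.

```python
def decision_from_tower(upsilon_row, n2):
    """Convert the values Upsilon_{n2,1}(T), ..., Upsilon_{n2,n1}(T) into the
    {0,1}-valued output Gamma_{n2,n1}(T) using the separated intervals
    J1 = [0, 1/n2] and J2 = [2/n2, infinity)."""
    in_J1 = lambda y: 0.0 <= y <= 1.0 / n2
    in_J2 = lambda y: y >= 2.0 / n2
    k = None                                   # k(n2, n1): last index landing in J1 or J2
    for idx, y in enumerate(upsilon_row):
        if in_J1(y) or in_J2(y):
            k = idx
    if k is None or in_J1(upsilon_row[k]):
        return 0
    return 1
```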

Proof that \(\{\Xi _{\mathrm {pp}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _1\}\in \Delta _3^A\). To construct a height two arithmetical tower for \(\Xi _{\mathrm {pp}}^{{\mathbb {C}}}\), we will use Lemma 5.3 repeatedly. Let \({\widehat{\Gamma }}_{n_2,n_1}(\cdot ,I)\) denote the height two tower constructed in the proof of Lemma 5.3 for the closed interval I (\(I=[a,b]\)), where without loss of generality by taking successive maxima in \(n_2\), we can assume that this tower is non-decreasing in \(n_2\) (this is where we use convergence from below in the final limit in the statement of the lemma). For a given \(n_1\) and \(n_2\), we construct \(\Gamma _{n_2,n_1}(T)\) as follows (we will use some basic terminology from graph theory).

Define the intervals \(I_{n_2,n_1,j}^{0}=[j,j+1]\) for \(j=-n_2,\ldots ,n_2-1\) so that these cover the interval \([-n_2,n_2]\). Now suppose that \(I_{n_2,n_1,j}^{k}\) are defined for \(j=1,\ldots ,r_k(n_2,n_1,T)\). Compute each \({\widehat{\Gamma }}_{n_2,n_1}(T,{I_{n_2,n_1,j}^{k}})\) and if this is 1, bisect \(I_{n_2,n_1,j}^{k}\) via its midpoint into two equal halves consisting of closed intervals. We then take all these bisected intervals and label them as \(I_{n_2,n_1,j}^{k+1}\) for \(j=1,\ldots ,r_{k+1}(n_2,n_1,k,T)\). This is repeated until we have no further bisections or the intervals \(I_{n_2,n_1,j}^{n_2}\) have been computed. By adding the interval \([-n_2,n_2]\) as a root with children \(I_{n_2,n_1,j}^{0}\), this creates a finite tree structure where a non-root interval I is a parent of two intervals precisely if those two intervals are formed from its bisection and \({\widehat{\Gamma }}_{n_2,n_1}(T,I)=1\). We then prune this tree by discarding all leaves I which have \({\widehat{\Gamma }}_{n_2,n_1}(T,I)=0\) to form the tree \({\mathcal {T}}_{n_2,n_1}(T)\). Finally, we let \(\Gamma _{n_2,n_1}(T)\) be the union of all the leaves of \({\mathcal {T}}_{n_2,n_1}(T)\). Clearly this can be computed using finitely many arithmetic operations and comparisons using \(\Lambda _1\). The construction is shown visually in Fig. 3.
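A structural sketch of this bisection procedure is given below (Python). The function decide plays the role of the decision tower \({\widehat{\Gamma }}_{n_2,n_1}(T,\cdot )\) of Lemma 5.3 and is assumed to be supplied; the sketch only records the tree logic and the resulting union of leaves.

```python
def point_spectrum_leaves(decide, n2):
    """Structural sketch of the bisection tree: decide((a, b)) stands in for the
    decision tower hatGamma_{n2,n1}(T, [a, b]) of Lemma 5.3 and returns 0 or 1.
    Returns the leaves of the pruned tree; Gamma_{n2,n1}(T) is their union."""
    leaves = []
    level = [(float(j), float(j + 1)) for j in range(-n2, n2)]   # the intervals I^0_j
    for depth in range(n2 + 1):
        next_level = []
        for (a, b) in level:
            if decide((a, b)) == 0:
                continue                          # leaf with decision 0: pruned
            if depth == n2:
                leaves.append((a, b))             # positive interval at maximal depth
            else:
                mid = (a + b) / 2.0
                children = [(a, mid), (mid, b)]
                if all(decide(c) == 0 for c in children):
                    leaves.append((a, b))         # both children pruned, so (a, b) is a leaf
                next_level += children
        level = next_level
    return leaves
```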

In the above construction, the number of intervals considered (including those not in the tree \({\mathcal {T}}_{n_2,n_1}(T)\)) for a fixed \(n_2\) is \(n_22^{n_2+1}+1\) and hence independent of \(n_1\). It follows that \({\mathcal {T}}_{n_2,n_1}(T)\) and \(\Gamma _{n_2,n_1}(T)\) are constant for large \(n_1\) (due to the convergence of the \({\widehat{\Gamma }}_{n_2,n_1}(T,I)\) in \(\{0,1\}\)). We denote these limiting values by \({\mathcal {T}}_{n_2}(T)\) and \(\Gamma _{n_2}(T)\) respectively and also denote the corresponding intervals in the construction at the \(m\)th level of this limit by \(I_{n_2,j}^{m}\). Note also that if \(\Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)=\emptyset \) then \(\Gamma _{n_2}(T)=\emptyset \).

Now suppose that \(z\in \Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)\), then there exists a sequence of nested intervals \(I_m=I_{n_2,a_{m,n_2}}^m\) containing z for \(m=0,\ldots ,n_2\), where these intervals are independent of \(n_2\). Fix m, then for large \(n_2\) we must have that \({\widehat{\Gamma }}_{n_2}(T,I_{j})=1\) for \(j=1,\ldots ,m\). It follows that \(I_m\) has a descendant interval \(I_{n_2,m}\) contained in \(\Gamma _{n_2}(T)\) and hence we must have

$$\begin{aligned} \mathrm {dist}(z,\Gamma _{n_2}(T))\le 2^{-m}. \end{aligned}$$

Since m was arbitrary it follows that \(\mathrm {dist}(z,\Gamma _{n_2}(T))\) converges to 0 as \(n_2\rightarrow \infty \).

Conversely, suppose that \(z_{m_j}\in \Gamma _{m_j}(T)\) with \(m_j\rightarrow \infty \), then we must show that all limit points of \(\{z_{m_j}\}\) lie in \(\Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)\). Suppose this were false, then by taking a subsequence if necessary, we can assume that \(z_{m_j}\rightarrow z\) and \(\mathrm {dist}(z_{m_j},\Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T))\ge \delta \) for some \(\delta >0\). We claim that it is sufficient to prove that the maximum length of the leaves of \({\mathcal {T}}_{n_2}(T)\) intersecting a fixed compact subset of \({\mathbb {R}}\) converges to zero as \(n_2\rightarrow \infty \). Suppose this has been shown, then \(z_{m_j}\in I_{m_j}\) for some leaf \(I_{m_j}\) of \({\mathcal {T}}_{m_j}(T)\). It follows that \(I_{m_j}\cap \Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)\ne \emptyset \) and \(\left| I_{m_j}\right| \rightarrow 0\). But this contradicts \(z_{m_j}\) being positively separated from \(\Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)\).

We are thus left with proving the claim regarding the lengths of leaves. Suppose this were false, then there exists a compact set \(K\subset {\mathbb {R}}\) and leaves \(I_{j}\) in \({\mathcal {T}}_{b_j}(T)\) such that the lengths of \(I_j\) do not converge to zero and \(I_j\) intersect K. By taking subsequences if necessary, we can assume that the lengths of each \(I_j\) are constant. Then by the compactness of K and taking subsequences if necessary again, we can assume that each of the \(I_j\) are equal to a common interval I. It follows that \({\widehat{\Gamma }}_{b_j}(T,I)=1\) but that \({\widehat{\Gamma }}_{b_j}(T,I_1)={\widehat{\Gamma }}_{b_j}(T,I_2)=0\) since I is a leaf, where \(I_1\) and \(I_2\) form the bisection of I. Taking \(b_j\rightarrow \infty \), this implies that \(I\cap \Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)\ne \emptyset \) but \(I_1\cap \Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)=I_2\cap \Xi _{\mathrm {pp}}^{{\mathbb {C}}}(T)=\emptyset \) which is absurd. Hence we have shown the required contradiction, and we have finished the proof. \(\quad \square \)

Fig. 3. Example of tree structure used to compute the point spectrum for \(n_2=3\). Each tested interval is shown in green (\({\widehat{\Gamma }}_{n_2,n_1}(T,I)=1\)) or red (\({\widehat{\Gamma }}_{n_2,n_1}(T,I)=0\)). The arrows show the bisections and the final output is shown in blue.

5.2 Proof for absolutely continuous spectra

To prove the lower bound (that one limit will not suffice), our strategy will be to reduce a certain decision problem to the computation of \(\Xi _{\mathrm {ac}}^{{\mathbb {C}}}\). Let \(({\mathcal {M}}',d')\) be the discrete space \(\{0,1\}\), let \(\Omega '\) denote the collection of all infinite sequences \(\{a_{j}\}_{j\in {\mathbb {N}}}\) with entries \(a_{j}\in \{0,1\}\) and consider the problem function

$$\begin{aligned} \Xi '(\{a_{j}\}):\text { `Does }\{a_{j}\}\text { have infinitely many non-zero entries?'}, \end{aligned}$$

that maps to \(({\mathcal {M}}',d')\). In Appendix A, it is shown that \(\mathrm {SCI}(\Xi ',\Omega ')_{G} = 2\) (where the evaluation functions consist of component-wise evaluations of the array \(\{a_{j}\}\)).

Proof that \(\{\Xi _{\mathrm {ac}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _2\}\notin \Delta _2^G\). We are done if we prove the result for \(f(n)=n+1\) and \(\alpha =0\). In this case \(\Lambda _1\) and \(\Lambda _2\) are equivalent so we can restrict the argument to \(\Lambda _1\). Suppose for a contradiction that \(\Gamma _{n}\) is a height one tower of general algorithms solving \(\{\Xi _{\mathrm {ac}}^{{\mathbb {C}}},\Omega _{f,0},\Lambda _2\}\). We will gain a contradiction by using the supposed tower to solve \(\{\Xi ',\Omega '\}\).

Given \(\{a_{j}\}\in \Omega '\), consider the operator \(H=H_0+v\), where the potential is of the following form:

$$\begin{aligned} v(m)=\sum _{k=1}^\infty a_{k}\delta _{m,k!}. \end{aligned}$$

Then by Theorem 3.4, \([0,4]\subset \sigma _{\mathrm {ac}}(H)\) if \(\sum _{k}a_{k}<\infty \) (that is, if \(\Xi '(\{a_j\})=0\)) and \(\sigma _{\mathrm {ac}}(H)\cap (0,4)=\emptyset \) otherwise. Any matrix entry of H can be evaluated using only finitely many evaluations of \(\{a_{j}\}\) and hence the evaluation functions \(\Lambda _1\) can be computed using component-wise evaluations of the sequence \(\{a_j\}\). We now set

$$\begin{aligned} {\widehat{\Gamma }}_{n}(\{a_{j}\})={\left\{ \begin{array}{ll} 0\quad \text {if }\mathrm {dist}(2,\Gamma _n(H))<1\\ 1\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

The above comments show that each of these is a general algorithm, and it is clear that it converges to \(\Xi '(\{a_j\})\) as \(n\rightarrow \infty \), the required contradiction. \(\quad \square \)

To construct the height two (arithmetical) tower for \(\Xi _{\mathrm {ac}}^{{\mathbb {C}}}\), we require the following lemma.

Lemma 5.4

Let \(a<b\) with \(a,b\in {\mathbb {R}}\) and consider the decision problem

$$\begin{aligned} \Xi _{a,b,\mathrm {ac}}:\Omega _{f,\alpha }&\rightarrow \{0,1\}\\ T&\rightarrow {\left\{ \begin{array}{ll} 1\quad \text {if }\sigma _{\mathrm {ac}}(T)\cap [a,b]\ne \emptyset \\ 0\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Then there exists a height two arithmetical tower \(\Gamma _{n_2,n_1}\) (with evaluation functions \(\Lambda _1\)) for \(\Xi _{a,b,\mathrm {ac}}\). Furthermore, the final limit is from below with \(\Gamma _{n_2}(T):=\lim _{n_1\rightarrow \infty }\Gamma _{n_2,n_1}(T)\le \Xi _{a,b,\mathrm {ac}}(T)\).

Proof

Fix such an a and b and let \(\chi _n\) be a sequence of non-negative, continuous piecewise affine functions on \({\mathbb {R}}\), bounded by 1 and of compact support such that \(\chi _n\) converge pointwise monotonically up to the constant function 1. Define the function

$$\begin{aligned} \upsilon _{m,n}(u,T)=\langle K_{H}(u+i/n,T,e_m),e_m\rangle \end{aligned}$$

and set

$$\begin{aligned} a_{m,n_2,n_1}(T)=\int _{a}^b \upsilon _{m,n_1}(u,T)\chi _{n_2}(\left| \upsilon _{m,n_1}(u,T)\right| )du. \end{aligned}$$

Since each \(\chi _n\) is continuous and has compact support (which implies that the integrand is bounded for fixed \(n_2\)), and since \(\upsilon _{m,n}(u,T)\) converges almost everywhere to \(\rho ^T_{e_{m},e_m}(u)\) (the Radon–Nikodym derivative of the absolutely continuous part of the measure \(\mu _{e_m,e_m}^T\)), it follows by the dominated convergence theorem that

$$\begin{aligned} \lim _{n_1\rightarrow \infty }a_{m,n_2,n_1}(T)=:a_{m,n_2}(T)=\int _{a}^b \rho ^T_{e_{m},e_m}(u)\chi _{n_2}(\rho ^T_{e_{m},e_m}(u))du. \end{aligned}$$

We now use the fact that the \(\chi _n\) are increasing and the dominated convergence theorem again (with bounding integrable function \(\rho ^T_{e_{m},e_m}\)) to deduce that

$$\begin{aligned} \lim _{n_2\rightarrow \infty }a_{m,n_2}(T)=\mu _{e_m,e_m,\mathrm {ac}}^T([a,b]), \end{aligned}$$

with monotonic convergence from below.

Using Corollary 2.2 (and the now standard argument of Lipschitz continuity of the resolvent), we can compute approximations of \(a_{m,n_2,n_1}(T)\) to accuracy \(1/n_1\) in finitely many arithmetic operations and comparisons. Call these approximations \({\widetilde{a}}_{m,n_2,n_1}(T)\) and set

$$\begin{aligned} \Upsilon _{n_2,n_1}(T)=\max _{1\le j\le n_2} {\widetilde{a}}_{j,n_2,n_1}(T). \end{aligned}$$

The proof now follows that of Lemma 5.3 exactly. \(\quad \square \)

Proof that \(\{\Xi _{\mathrm {ac}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _1\}\in \Delta _3^A\). This is exactly the same construction as in the above proof of the inclusion \(\{\Xi _{\mathrm {pp}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _1\}\in \Delta _3^A\). We simply replace the tower constructed in the proof of Lemma 5.3 by the tower constructed in the proof of Lemma 5.4. \(\quad \square \)

5.3 Proof for singular continuous spectra

We first prove the lower bound for the singular continuous spectrum via Theorem 3.4. Note that the impossibility result \(\{\Xi _{\mathrm {sc}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _2\}\notin \Delta _2^G\) follows from the same argument that was used to show \(\{\Xi _{\mathrm {ac}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _2\}\notin \Delta _2^G\). To show that two limits will not suffice for \(f(n)-n\ge \sqrt{2n}+1/2\), our strategy will be to again reduce a certain decision problem to the computation of \(\Xi _{\mathrm {sc}}^{{\mathbb {C}}}\). Let \(({\mathcal {M}}',d')\) be the discrete space \(\{0,1\}\), let \(\Omega '\) denote the collection of all infinite matrices \(\{a_{i,j}\}_{i,j\in {\mathbb {N}}}\) with entries \(a_{i,j}\in \{0,1\}\) and consider the problem function

$$\begin{aligned} \Xi '(\{a_{i,j}\}):\text { `Does }\{a_{i,j}\}\text { have a column containing infinitely many non-zero entries?'}, \end{aligned}$$

that maps to \(({\mathcal {M}}',d')\). In [12], a Baire category argument was used to prove that \(\mathrm {SCI}(\Xi ',\Omega ')_{G} = 3\) (where the evaluation functions consist of component-wise evaluations of the array \(\{a_{i,j}\}\)).

Proof that \(\{\Xi _{\mathrm {sc}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _2\}\notin \Delta _3^G\) if \(f(n)-n\ge \sqrt{2n}+1/2 \). Assume that the function f satisfies \(f(n)-n\ge \sqrt{2n}+1/2\). The proof will use a direct sum construction. Given \(\{a_{i,j}\}\in \Omega '\), consider the operators \(H_j=H_0+v_{(j)}\), where the potential is of the following form:

$$\begin{aligned} v_{(j)}(n)=\sum _{k=1}^\infty a_{k,j}\delta _{n,k!}. \end{aligned}$$

Using Theorem 3.4, \([0,4]\subset \sigma _{\mathrm {sc}}(H_j)\) if \(\sum _{k}a_{k,j}=\infty \) (that is, if the j-th column has infinitely many 1s) and \(\sigma _{\mathrm {sc}}(H_j)\cap (0,4)=\emptyset \) otherwise. Now consider an effective bijection (with effective inverse) between the canonical bases of \(l^2({\mathbb {N}})\) and \(\oplus _{j=1}^\infty l^2({\mathbb {N}})\):

$$\begin{aligned} \phi :\{e_n:n\in {\mathbb {N}}\}\rightarrow \{e_\mathbf{k }:\mathbf{k} \in {\mathbb {N}}^{{\mathbb {N}}},\Vert \mathbf{k} \Vert _{0}=1\}. \end{aligned}$$

Set \(H(\{a_{i,j}\})=\bigoplus _{j=1}^\infty H_j\). Then through \(\phi \), we view \(H=H(\{a_{i,j}\})\) as a self-adjoint operator acting on \(l^2({\mathbb {N}})\). Explicitly, we consider the matrix

$$\begin{aligned} H_{m,n}=\langle H e_{\phi (n)},e_{\phi (m)}\rangle . \end{aligned}$$

We choose the following bijection (where m lists the canonical basis in each Hilbert space):

[Diagram of the bijection \(\phi \) omitted.]

A straightforward computation shows that \(H\in \Omega _{f,0}\). We also observe that if \(\Xi '(\{a_{i,j}\})=1\) then \([0,4]\subset \sigma _{\mathrm {sc}}(H)\), otherwise \(\sigma _{\mathrm {sc}}(H)\cap (0,4)=\emptyset \).

Suppose for a contradiction that \(\Gamma _{n_2,n_1}\) is a height two tower of general algorithms that solves \(\{\Xi _{\mathrm {sc}}^{{\mathbb {C}}},\Omega _{f,0},\Lambda _1\}\). We will gain a contradiction by using the supposed height two tower to solve \(\{\Xi ',\Omega '\}\). We now set

$$\begin{aligned} {\widehat{\Gamma }}_{n_2,n_1}(\{a_{i,j}\})=1-\min \{1,\mathrm {dist}(3,\Gamma _{n_2,n_1}(H(\{a_{i,j}\})))\}, \end{aligned}$$

where we use the convention \(\mathrm {dist}(3,\emptyset )=1\). The comments above show that each of these is a general algorithm. Furthermore, the convergence of \(\Gamma _{n_2,n_1}\) implies that

$$\begin{aligned} \lim _{n_2\rightarrow \infty }\lim _{n_1\rightarrow \infty }{\widehat{\Gamma }}_{n_2,n_1}(\{a_{i,j}\})=1-\min \{1,\mathrm {dist}(3,\sigma _{\mathrm {sc}}(H(\{a_{i,j}\})))\}=\Xi '(\{a_{i,j}\}). \end{aligned}$$

We are not quite done since the convergence here takes place on the interval [0, 1] with the usual metric as opposed to \(\{0,1\}\) with the discrete metric. To get round this, we use the following, now familiar, trick.

Consider the intervals \( J_1=[0,1/2], \) and \( J_2=[3/4,1]. \) Let \(k(n_2,n_1)\le n_1\) be maximal such that \({\widehat{\Gamma }}_{n_2,k}(\{a_{i,j}\})\in J_1\cup J_2\). If no such k exists or \({\widehat{\Gamma }}_{n_2,k}(\{a_{i,j}\})\in J_1\) then set \({\Gamma }_{n_2,n_1}'(\{a_{i,j}\})=0\). Otherwise set \({\Gamma }_{n_2,n_1}'(\{a_{i,j}\})=1\). As in the proof of Lemma 5.3, we can show \(\lim _{n_1\rightarrow \infty }{\Gamma }_{n_2,n_1}'(\{a_{i,j}\})={\Gamma }_{n_2}'(\{a_{i,j}\})\) exists. If \(\Xi '(\{a_{i,j}\})=0\), then for large \(n_2\), \(\lim _{n_1\rightarrow \infty }{\widehat{\Gamma }}_{n_2,k}(\{a_{i,j}\})<1/2\) and hence \(\lim _{n_2\rightarrow \infty }{\Gamma }_{n_2}'(\{a_{i,j}\})=0\). Similarly, if \(\Xi '(\{a_{i,j}\})=1\), then for large \(n_2\) we must have \(\lim _{n_1\rightarrow \infty }{\widehat{\Gamma }}_{n_2,k}(\{a_{i,j}\})>3/4\) and hence \(\lim _{n_2\rightarrow \infty }{\Gamma }_{n_2}'(\{a_{i,j}\})=1\). Hence \({\Gamma }_{n_2,n_1}'\) is a height two tower of general algorithms solving \(\{\Xi ',\Omega '\}\), a contradiction. \(\quad \square \)

Finally, we will use the following lemma to prove that the singular continuous spectrum can be computed in three limits.

Lemma 5.5

Let \(a<b\) with \(a,b\in {\mathbb {R}}\) and consider the decision problem

$$\begin{aligned} \Xi _{a,b,\mathrm {sc}}:\Omega _{f,\alpha }&\rightarrow \{0,1\}\\ T&\rightarrow {\left\{ \begin{array}{ll} 1\quad \text {if }\sigma _{\mathrm {sc}}(T)\cap [a,b]\ne \emptyset \\ 0\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Then there exists a height three arithmetical tower \(\Gamma _{n_3,n_2,n_1}\) (with evaluation functions \(\Lambda _1\)) for \(\Xi _{a,b,\mathrm {sc}}\). Furthermore, the final limit is from below with \(\Gamma _{n_3}(T):=\lim _{n_2\rightarrow \infty }\lim _{n_1\rightarrow \infty }\Gamma _{n_3,n_2,n_1}(T)\le \Xi _{a,b,\mathrm {sc}}(T)\).

Once this is proven, we can use the same construction that was used to prove \(\{\Xi _{\mathrm {pp}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _1\}\in \Delta _3^A\) and \(\{\Xi _{\mathrm {ac}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _1\}\in \Delta _3^A\) to show that \(\{\Xi _{\mathrm {sc}}^{{\mathbb {C}}},\Omega _{f,\alpha },\Lambda _1\}\in \Delta _4^A\), but with an additional limit. Namely, we replace \((n_2,n_1)\) by \((n_3,n_2)\) in the proof and use the tower constructed in the proof of Lemma 5.4 instead of \({\widehat{\Gamma }}_{n_2,n_1}(T,I)\) for an interval I. We still gain the required convergence, since the only change is an additional limit in the finite number of decision problems that decide the appropriate tree.

Proof of Lemma 5.5

Note that we can write

$$\begin{aligned} \mu _{e_m,e_m,\mathrm {sc}}^T([a,b])=\mu _{e_m,e_m}^T([a,b])-\mu _{e_m,e_m,\mathrm {pp}}^T([a,b])-\mu _{e_m,e_m,\mathrm {ac}}^T([a,b]). \end{aligned}$$

From this and the proofs of Lemmas 5.3 and 5.4, it is clear that we can construct a height two arithmetical tower, \(a_{m,n_2,n_1}(T)\), for \(\mu _{e_m,e_m,\mathrm {sc}}^T([a,b])\), where the final limit is from above. Now set

$$\begin{aligned} \Upsilon _{n_3,n_2,n_1}(T)=\max _{1\le j\le n_3} a_{j,n_2,n_1}(T). \end{aligned}$$

We see that each successive limit converges, with the second from above and the final from below. By taking successive maxima and minima of our base algorithms, we can assume that the second and final limits are monotonic and that \(\Upsilon _{n_3,n_2,n_1}(T)\) is monotonic in both \(n_2\) and \(n_3\). Define \(\Upsilon _{n_3,n_2}(T)=\lim _{n_1\rightarrow \infty }\Upsilon _{n_3,n_2,n_1}(T)\), \(\Upsilon _{n_3}(T)=\lim _{n_2\rightarrow \infty }\Upsilon _{n_3,n_2}(T)\) and \(\Upsilon (T)=\lim _{n_3\rightarrow \infty }\Upsilon _{n_3}(T)\). Then \(\Upsilon (T)\) is zero if \(\Xi _{a,b,\mathrm {sc}}(T)=0\), otherwise it is a positive finite number.

With a slight change to the previous argument (the monotonicity in \(n_2\) and \(n_3\) is crucial for this to work), consider the intervals \( J_1^{m}=[0,1/m], \) and \( J_2^{m}=[2/m,\infty ). \) Let \(k(m,n,n_1)\le n_1\) be maximal such that \(\Upsilon _{m,n,k}(T)\in J_1^{m}\cup J_2^{m}\). If no such k exists or \(\Upsilon _{m,n,k}(T)\in J_1^{m}\) then set \({{\widehat{\Gamma }}}_{m,n,n_1}(T)=0\). Otherwise set \({{\widehat{\Gamma }}}_{m,n,n_1}(T)=1\). We then define

$$\begin{aligned} \Gamma _{n_3,n_2,n_1}=\max _{1\le m\le n_3}\min _{1\le n\le n_2}{{\widehat{\Gamma }}}_{m,n,n_1}(T). \end{aligned}$$

These can be computed using finitely many arithmetic operations and comparisons using \(\Lambda _1\), and, as before, the first limit exists with

$$\begin{aligned} \Gamma _{n_3,n_2}(T)=\lim _{n_1\rightarrow \infty }\Gamma _{n_3,n_2,n_1}(T)=\max _{1\le m\le n_3}\min _{1\le n\le n_2}{{\widehat{\Gamma }}}_{m,n}(T). \end{aligned}$$

Note also that the second and third sequential limits exist through the use of maxima and minima.

Now suppose that \(\Xi _{a,b,\mathrm {sc}}(T)=0\) and fix \(n_3\). Then for large \(n_2\), we must have that \(\Upsilon _{m,n_2}(T)<1/(2n_3)\) for all \(m\le n_3\) due to the monotonic convergence of \(\Upsilon _{m,p}\) as \(p\rightarrow \infty \). It follows in this case that

$$\begin{aligned} \lim _{n_2\rightarrow \infty }\Gamma _{n_3,n_2}(T)=0,\quad \text {for all }n_3. \end{aligned}$$

Now suppose that \(\Xi _{a,b,\mathrm {sc}}(T)=1\). It follows in this case that there exists \(M\in {\mathbb {N}}\) such that if \(m\ge M\) then \(\Upsilon _m(T)>3/m\). Due to the monotonic convergence of \(\Upsilon _{m,p}\) as \(p\rightarrow \infty \) it follows that for all p we must have \(\Upsilon _{m,p}>3/m\) and hence there exists \(N(m,p)\in {\mathbb {N}}\) such that if \(n_1\ge N(m,p)\) then we must have \(\Upsilon _{m,p,n_1}\ge 2/m\). It follows that if \(n_3\ge M\) then we must have \({\widehat{\Gamma }}_{n_3,p}(T)=1\) for all p and hence that

$$\begin{aligned} \lim _{n_3\rightarrow \infty }\Gamma _{n_3}(T)=1. \end{aligned}$$

The conclusion of the lemma now follows. \(\quad \square \)

6 Numerical Examples

We now demonstrate the applicability of the new algorithms. In particular, these are the first algorithms that compute their respective spectral properties for the whole class \(\Omega _{f,\alpha }\), and even for the restricted case of tridiagonal self-adjoint matrices. The algorithms are also implicitly parallelisable, allowing large-scale computations. We focus on discrete operators; the numerical implementation of the algorithms for ODEs, PDEs and integral operators will be the focus of a future software package.Footnote 8

6.1 Jacobi matrices and orthogonal polynomials

For our first set of examples, we consider the natural link between the spectral measures of Jacobi operators and orthogonal polynomials on \({\mathbb {R}}\). Let J be a Jacobi matrix

$$\begin{aligned} J=\begin{pmatrix} b_1 &{} a_1 &{} &{}\\ a_1 &{} b_2 &{} a_2 &{} \\ &{} a_2 &{} b_3 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{pmatrix} \end{aligned}$$

with \(a_j,b_j\in {\mathbb {R}}\) and \(a_j>0\). In this case, under suitable conditions, the probability measure \(\mu _J:=\mu _{e_1,e_1}^J\) is the probability measure associated with the orthonormal polynomials defined by

$$\begin{aligned} \begin{aligned}&xP_k(x)=a_{k+1}P_{k+1}(x)+b_{k+1}P_k(x)+a_kP_{k-1}(x),\\&P_{-1}(x)=0,\quad P_0(x)=1, \end{aligned} \end{aligned}$$

and the spectral measure that appears in the multiplicative version of the spectral theorem (see, for example, [45, 137, 143]). Classically, one usually first considers the measure and then constructs the orthogonal polynomials and the corresponding J. In some sense, the algorithms constructed in this paper, and in particular Sect. 4, solve the inverse problem (note that \(J\in \Omega _{n\rightarrow n+1,0}\)). In other words, we compute the measure \(\mu _J\) given the recurrence coefficients defining the orthogonal polynomials. This is a very difficult problem to study theoretically, and, until now, there has been no numerical method able to tackle the problem in this generality (see Sect. 1.7 for comments on methods that tackle compact perturbations of Toeplitz operators). To verify our method, we will consider problems where the measure \(\mu _J\) is known analytically.
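As an illustration of this direction of the correspondence, the following Python sketch approximates the smoothed measure \(\frac{1}{\pi }\,\mathrm {Im}\,\langle R(x+i\epsilon ,J)e_1,e_1\rangle \) using a rectangular \((n+1)\times n\) least-squares truncation in the spirit of Theorem 2.1. The fixed truncation size and the absence of the adaptive error control described in the paper are simplifying assumptions.

```python
import numpy as np

def smoothed_density(a, b, x, eps, n=1000):
    """(1/pi) Im <R(x + i*eps, J) e_1, e_1> for the Jacobi matrix with entries a_k, b_k,
    via a rectangular (n+1) x n least-squares truncation of J - z (no adaptive error
    control; the fixed n is a simplifying assumption)."""
    out = []
    idx = np.arange(n)
    for xx in np.atleast_1d(x):
        z = xx + 1j * eps
        A = np.zeros((n + 1, n), dtype=complex)
        A[idx, idx] = b[:n] - z                  # diagonal of J - z
        A[idx[:-1], idx[:-1] + 1] = a[: n - 1]   # superdiagonal
        A[idx + 1, idx] = a[:n]                  # subdiagonal (row n uses a_n)
        e1 = np.zeros(n + 1, dtype=complex); e1[0] = 1.0
        v, *_ = np.linalg.lstsq(A, e1, rcond=None)
        out.append(v[0].imag / np.pi)
    return np.array(out)

# Jacobi polynomial coefficients with alpha = 0.7, beta = 0.3, as in the text
al, be = 0.7, 0.3
k = np.arange(1, 2001, dtype=float)
a = 2 * np.sqrt(k * (k + al) * (k + be) * (k + al + be)
                / ((2*k + al + be - 1) * (2*k + al + be) ** 2 * (2*k + al + be + 1)))
b = (be**2 - al**2) / ((2*k + al + be) * (2*k - 2 + al + be))
rho = smoothed_density(a, b, np.linspace(-0.5, 0.5, 3), eps=1e-3)
```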

We begin with the well-known class of Jacobi polynomials defined for \(\alpha ,\beta >-1\) which have

$$\begin{aligned} a_k=&2\sqrt{\frac{k(k+\alpha )(k+\beta )(k+\alpha +\beta )}{(2k+\alpha +\beta -1)(2k+\alpha +\beta )^2(2k+\alpha +\beta +1)}},\\ b_k=&\frac{\beta ^2-\alpha ^2}{(2k+\alpha +\beta )(2k-2+\alpha +\beta )}, \end{aligned}$$

and measure on the interval \([-1,1]\) given by

$$\begin{aligned} d\mu _J=\frac{(1-x)^\alpha (1+x)^\beta }{N(\alpha ,\beta )}dx=f_{\alpha ,\beta }(x)dx, \end{aligned}$$
(6.1)

where \(N(\alpha ,\beta )\) is a normalising constant, ensuring the measure is a probability measure. To assess the convergence of the algorithm in Sect. 4.2 that approximates the Radon–Nikodym derivative \(f_{\alpha ,\beta }\), in this section we will plot various errors as a function of \(\epsilon \), the distance from the points at which we compute the resolvents to the real axis.

Fig. 4. Results for Jacobi polynomials with \(\alpha =0.7\) and \(\beta =0.3\). Left: Convergence in \(L^1\) and at the points \(\pm 1\) and 0. The rates \(O(\epsilon )\), \(O(\epsilon ^{0.7})\) and \(O(\epsilon ^{0.3})\) are also shown as dashed lines. Right: Convergence with Richardson extrapolation. The rates \(O(\epsilon ^2)\), \(O(\epsilon ^{1.3})\), \(O(\epsilon ^{0.7})\) and \(O(\epsilon ^{0.3})\) are also shown. As discussed in the text, the rates reflect the local smoothness properties of the density \(f_{\alpha ,\beta }\).

Fig. 5. Results for Laguerre polynomials with \(\alpha =0.5\). Left: Convergence in \(L^1\) and at the points 0 and 1. The rates \(O(\epsilon )\) and \(O(\epsilon ^{0.5})\) are also shown as dashed lines. Right: Convergence with Richardson extrapolation. The rates \(O(\epsilon ^2)\), \(O(\epsilon ^{1.5})\) and \(O(\epsilon ^{0.5})\) are also shown.

Figure 4 (left) shows a typical error plot for \(\alpha =0.7\) and \(\beta =0.3\). We plot both the \(L^1\) error (computed using a large number of discrete points) and the pointwise errors at \(-1,0\) and 1. The procedure of Theorem 2.1 is used to determine adaptively how large our (rectangular) matrix truncations should be for a given \(\epsilon \). We see that both the \(L^1\) error and the error at 0 appear to decrease as \(O(\epsilon )\),Footnote 9 whereas the errors at \(-1\) and \(+1\) decrease as \(O(\epsilon ^{\beta })\) and \(O(\epsilon ^{\alpha })\) respectively, as shown in the plot. This suggests using Richardson extrapolation to accelerate convergence [115]. This is shown in the right of Fig. 4, where the extrapolation was computed at distances \(\epsilon \) and \(\epsilon /2\) from the real axis. Now the error at 0 decreases as \(O(\epsilon ^{2})\), whereas the \(L^1\) error appears to decrease as \(O(\epsilon ^{1.3})\). We found similar results for different \(\alpha \) and \(\beta \). In general, errors at interior points decrease at the rate \(O(\epsilon )\) and then \(O(\epsilon ^{2})\) after extrapolation. The left endpoint error decreases as \(O(\epsilon ^{\min \{1,\beta \}})\) and then \(O(\epsilon ^{\min \{2,\beta \}})\) after extrapolation. The right endpoint error decreases as \(O(\epsilon ^{\min \{1,\alpha \}})\) and then \(O(\epsilon ^{\min \{2,\alpha \}})\) after extrapolation. Finally, the \(L^1\) error decreases as \(O(\epsilon ^{\min \{1,1+\alpha ,1+\beta \}})\) and then \(O(\epsilon ^{\min \{2,1+\alpha ,1+\beta \}})\) after extrapolation. These rates for pointwise and \(L^1\) errors reflect the local Hölder exponent and integrability of the density and its first derivative respectively, and can be proven using a Taylor series argument for general measures.Footnote 10 Moreover, we found that increased rates of convergence could be obtained (and again proven) locally near smoother parts of the measure by using further iterates of extrapolation. Note also that we took a uniform value of \(\epsilon \) over the whole interval. However, \(\epsilon \) could just as easily have been a function of the position x, allowing it to be smaller at points where the resolvent is estimated more accurately for a given matrix size.
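The extrapolation step itself is straightforward; the sketch below (Python) assumes an expansion of the smoothed density in integer powers of \(\epsilon \) at points where the density is smooth, and reuses a routine such as smoothed_density from the previous sketch.

```python
def richardson(F, x, eps, order=1):
    """Iterated Richardson extrapolation in eps, assuming
    F(x, eps) ~ rho(x) + c1(x)*eps + c2(x)*eps^2 + ... near smooth parts of the measure.
    F(x, eps) should return the smoothed density (for example, smoothed_density above)."""
    R = [F(x, eps / 2**j) for j in range(order + 1)]   # evaluations at eps, eps/2, ...
    for k in range(1, order + 1):
        R = [(2**k * R[j + 1] - R[j]) / (2**k - 1) for j in range(len(R) - 1)]
    return R[0]

# first-order extrapolation at x = 0, as used for the right panel of Fig. 4
# rho0 = richardson(lambda x, e: smoothed_density(a, b, x, e), 0.0, 1e-2, order=1)
```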

To demonstrate the algorithm on unbounded operators, we next consider the class of generalised Laguerre polynomials for \(\alpha >-1\) which have

$$\begin{aligned} a_k=\sqrt{k(k+\alpha )},\quad b_k=2k+\alpha -1, \end{aligned}$$

and measure on the interval \([0,\infty )\) given by

$$\begin{aligned} d\mu _J=\frac{x^{\alpha }e^{-x}}{\Gamma (\alpha +1)}dx. \end{aligned}$$
(6.2)

Results are shown in Fig. 5 for \(\alpha =0.5\), where we have plotted the (relative) \(L^1\) error over the interval [0, 1], as well as pointwise errors at 0 and 1. Similar conclusions can be drawn as before. Pointwise errors are also shown for this example and the Jacobi operator, but now using the 10th iterate of Richardson extrapolation, in Fig. 6. The errors decay at the expected rates (also shown), with \(O(\epsilon ^{10})\) convergence at smooth parts of the measure. Near singular points (namely at \(x=-0.99\) and \(x=0.01\) for the Jacobi and Laguerre cases, respectively), the prefactor in front of the \(O(\epsilon ^{10})\) term is larger, and smaller \(\epsilon \) is needed before the expected rate kicks in.

Fig. 6. Left: Pointwise errors for the Jacobi example (\(\alpha =0.7\) and \(\beta =0.3\)) and (10th) iterated Richardson extrapolation. Right: Same but for the Laguerre example (\(\alpha =0.5\)). The dotted lines show the expected rates of convergence, with \(O(\epsilon ^{10})\) convergence at smooth parts of the measure.

Fig. 7. Left: Computation of \(\epsilon \pi \langle K_H(x+i\epsilon ;T,e_1),e_1\rangle \) (abbreviated \(\epsilon \pi \langle K_H\rangle \)) for \(\epsilon =10^{-7}\) and \(\alpha =5\). Right: Same but for \(\alpha =0.5\). The blue crosses represent the weights of the atoms of the measure, corresponding to projections onto eigenspaces.

Finally, we demonstrate the computation of measures for a Jacobi operator with non-empty discrete spectrum. The Charlier polynomials are generated by

$$\begin{aligned} a_k=\sqrt{\alpha k},\quad b_k=k+\alpha -1, \end{aligned}$$

for \(\alpha >0\), and have measure

$$\begin{aligned} d\mu _J=\exp (-\alpha )\sum _{m=0}^\infty \frac{\alpha ^m}{m!}\delta _{m}, \end{aligned}$$
(6.3)

where \(\delta _m\) denotes a Dirac measure located at the point m. Figure 7 shows plots of \(\epsilon \pi \langle K_H(x+i\epsilon ;T,e_1),e_1\rangle \) for \(\epsilon =10^{-7}\), computed using an \((n+1)\times n\) matrix with \(n=1000\). The peaks clearly coincide with the atoms of the measure. The difference between the peak values and the weight \(\exp (-\alpha )\alpha ^m/m!\) was of the order \(10^{-13}\) for both examples. This demonstrates that an effective way to compute eigenvalues (particularly the challenging case of those in gaps of the essential spectrum, where spectral pollution occurs, or even those embedded in the essential spectrum) and projections onto eigenspaces may be to find local maxima or spikes of \(\epsilon \pi \langle K_H(x+i\epsilon ;T,e_1),e_1\rangle \). Such a routine would only require the solution of shifted linear systems (the resolvent), without diagonalisation, and could be executed rapidly in parallel.
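A hedged sketch of this procedure for the Charlier example is given below (Python), reusing the smoothed_density routine from the sketch in Sect. 6.1 above. The truncation size, the value of \(\epsilon \) and the grid of test points are illustrative choices rather than the adaptively chosen parameters of the paper, so the accuracy of this sketch will be lower than that reported in the text.

```python
import numpy as np
from math import factorial

# Charlier coefficients for alpha = 0.5 and a large truncation
alpha = 0.5
k = np.arange(1, 1001, dtype=float)
a_c, b_c = np.sqrt(alpha * k), k + alpha - 1

eps = 1e-7
for m in range(6):
    # eps*pi*<K_H(m + i*eps; J, e_1), e_1> should be close to the weight exp(-alpha)*alpha^m/m!
    peak = eps * np.pi * smoothed_density(a_c, b_c, float(m), eps)[0]
    print(m, peak, np.exp(-alpha) * alpha**m / factorial(m))
```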

6.2 A global collocation approach

Typically, the size of the linear system needed to approximate the resolvent accurately (Theorem 2.1) grows as \(\epsilon \downarrow 0\) and we approach the spectrum. It is therefore beneficial to increase the rate of convergence of approximating the measures as \(\epsilon \downarrow 0\). One local (in terms of \(x\in {\mathbb {R}}\)) approach, Richardson extrapolation, was used in Sect. 6.1. Here we briefly outline a different, more global approach.

Suppose that we know the spectral measure of an operator T has support included in \(I\subset {\mathbb {R}}\) and is absolutely continuous. Alternatively, we may analytically know, or be able to estimate, the singular part of the measure and subtract this from what follows. In this case, a natural way to approximate the Radon–Nikodym derivative is through a formal basis expansion

$$\begin{aligned} \rho _{x,y}^T(\lambda )=\sum _{m=1}^\infty a_m\phi _m(\lambda ), \end{aligned}$$

where the \(\phi _m\) are functions supported in I whose Cauchy transforms are easy to compute. To approximate the coefficients \(a_m\), we collocate in the complex plane as follows. Let \({\mathcal {C}}\) be a finite collection of complex points in the upper half-plane and truncate the approximation of \(\rho _{x,y}^T\) to M terms. To generate a linear system for \(\{a_m\}_{m=1}^M\), we evaluate the Cauchy transform at points \(z\in {\mathcal {C}}\). The Cauchy transform satisfies

$$\begin{aligned} \int _{{\mathbb {R}}}\frac{\rho _{x,y}^T(\lambda )}{\lambda -z}d\lambda =\langle R(z,T)x,y\rangle , \end{aligned}$$

which can be computed with error control using the results of Sect. 2.1. Call this approximation \(b_{x,y}(z)\) and define

$$\begin{aligned} {\widehat{\phi }}_m(z)=\int _{I}\frac{\phi _m(\lambda )}{\lambda -z}d\lambda . \end{aligned}$$

Then for each \(z\in {\mathcal {C}}\), an approximate linear relation can be written as

$$\begin{aligned} \sum _{m=1}^M a_m {\widehat{\phi }}_m(z)=b_{x,y}(z). \end{aligned}$$

Evaluating this relation at \(\ge M\) points in \({\mathcal {C}}\) gives rise to a linear system, which can be inverted in the least-squares sense for the approximation of the coefficients \(\{a_m\}_{m=1}^M\). If \(x=y\), and the basis functions are real, then the coefficients are real. Hence, in this case, taking real and imaginary parts of the linear system gives further linear relations without having to compute further resolvents.

If our basis functions satisfy recursion relations of the form

$$\begin{aligned} \phi _{m+1}(\lambda )=\alpha _m \lambda \phi _m(\lambda )+\beta _m \phi _m(\lambda )+\gamma _m\phi _{m-1}(\lambda ), \end{aligned}$$

then

$$\begin{aligned} {\widehat{\phi }}_{m+1}(z)&=\alpha _m \int _I\frac{\lambda \phi _m(\lambda )}{\lambda -z}d\lambda +\beta _m{\widehat{\phi }}_{m}(z)+\gamma _m{\widehat{\phi }}_{m-1}(z)\\&=\alpha _m\int _I\phi _m(\lambda )d\lambda +z\alpha _m {\widehat{\phi }}_m(z)+\beta _m{\widehat{\phi }}_{m}(z)+\gamma _m{\widehat{\phi }}_{m-1}(z). \end{aligned}$$

Hence, given the values of the integrals

$$\begin{aligned} \int _I\phi _m(\lambda )d\lambda ,\quad {\widehat{\phi }}_{1}(z), \end{aligned}$$
(6.4)

we can compute \({\widehat{\phi }}_m(z)\) for all \(m\in {\mathbb {N}}\). The integrals in (6.4) have simple forms for all the bases used in this paper. This method of computation of \({\widehat{\phi }}_m\) is fast, meaning that the most expensive part of the collocation method is the computation of the right hand side of the linear system, that is, computing the resolvent.
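To make this concrete, the sketch below (Python) carries out the recurrence for the Chebyshev basis \(\phi _m=T_{m-1}\) on \(I=[-1,1]\) (where, after the first step, \(\alpha _m=2\), \(\beta _m=0\), \(\gamma _m=-1\)) and assembles the least-squares system for the coefficients. The resolvent values \(b_{x,y}(z)=\langle R(z,T)x,y\rangle \) are assumed to have been computed separately with the methods of Sect. 2.1; the basis, collocation points and truncation are illustrative choices.

```python
import numpy as np

def chebyshev_cauchy_transforms(z, M):
    """hat{phi}_m(z) = int_{-1}^{1} T_{m-1}(lam) / (lam - z) d lam for m = 1, ..., M,
    computed via the recurrence in the text (for the Chebyshev basis alpha_m = 2,
    beta_m = 0, gamma_m = -1, with the first step handled separately)."""
    phi_hat = np.zeros(M, dtype=complex)
    phi_hat[0] = np.log((1.0 - z) / (-1.0 - z))        # int 1/(lam - z) d lam
    if M > 1:
        phi_hat[1] = 2.0 + z * phi_hat[0]              # int lam/(lam - z) d lam
    for m in range(2, M):
        j = m - 1
        moment = 0.0 if j % 2 else 2.0 / (1.0 - j**2)  # int_{-1}^{1} T_{m-1} d lam
        phi_hat[m] = 2.0 * moment + 2.0 * z * phi_hat[m - 1] - phi_hat[m - 2]
    return phi_hat

def collocation_coefficients(b_vals, zs, M):
    """Least-squares solve of sum_m a_m hat{phi}_m(z) = b(z) over collocation points zs,
    where b_vals[j] approximates <R(zs[j], T)x, y> (computed separately with error control)."""
    A = np.array([chebyshev_cauchy_transforms(z, M) for z in zs])
    A_ri = np.vstack([A.real, A.imag])                 # real coefficients when x = y
    b_ri = np.concatenate([np.real(b_vals), np.imag(b_vals)])
    coeffs, *_ = np.linalg.lstsq(A_ri, b_ri, rcond=None)
    return coeffs
```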

Fig. 8. Left: Convergence of collocation method using Chebyshev polynomials. Right: Convergence of collocation method using Laguerre functions on \([0,\infty )\).

Figure 8 (left) revisits the Jacobi polynomials example in Sect. 6.1 and shows the collocation method in the case of using Chebyshev polynomials as the basis functions \(\phi _m\) (taking the first 1500). For collocation points, we took M Chebyshev nodes offset by an additional \(\epsilon i\) to lie just above the interval \([-1,1]\) in the complex plane. Note that the rate of convergence is faster than \(O(\epsilon ^{\min \{2,1+\alpha ,1+\beta \}})\) achieved with the methods of Sect. 4 after extrapolation. There are at least two prices to pay, though. First, there is no current proof that collocation converges and, second, global regularity of the measure is needed. At the very least, we need to be able to subtract off the singular part of the measure. In practice, even if the measure is absolutely continuous, a large number of basis functions may be needed to capture the Radon–Nikodym derivative. Examples are given in Sect. 6.3, where collocation performs worse than the techniques of Sect. 4 due to either of these regularity conditions. Figure 8 (right) revisits the (generalised) Laguerre polynomials example in Sect. 6.1 and shows the collocation method in the case of using Laguerre functions (the polynomials multiplied by the square root of the weight function) as the basis functions with \(M=1000\). The collocation points were \(\{1^2/M^2,2^2/M^2,\ldots ,1\}+\epsilon i\). Again, this method converges with faster rates than that in Sect. 6.1.

6.3 CMV matrices and extensions to unitary operators

We now demonstrate that the algorithms extend to the unitary case through use of the functions \(K_D\), namely, the convolution with the Poisson kernel of the unit disk. We will consider the class of CMV matrices (named after Cantero, Moral and Velázquez [28]) linked with orthogonal polynomials on the unit circle. A full discussion of this subject is beyond the scope of this paper, and we refer the reader to the monographs of Simon [130, 131]. However, the background for this example is as follows. Given a probability measure \(\mu \) on the unit circle \({\mathbb {T}}\), whose support is not a finite set, we can define a system of orthogonal polynomials \(\{\Phi _n\}_{n=0}^\infty \) by applying the Gram–Schmidt process to \(\{1,z,z^2,\ldots \}\). Given a polynomial \(Q_n(z)\) of degree n, we define the reversed polynomial \(Q_n^*(z)\) via \(Q_n^*(z)=z^n\overline{Q_n(1/{\overline{z}})}\). Szegő’s recurrence relation [139] is given by

$$\begin{aligned} \Phi _{n+1}(z)=z\Phi _n(z)-\overline{\alpha _n}\Phi _{n}^*(z), \end{aligned}$$
(6.5)

where the \(\alpha _n\) are known as Verblunsky coefficients [148] and satisfy \(\left| \alpha _j\right| <1\). Verblunsky’s theorem [147] sets up a one-to-one correspondence between \(\mu \) and the coefficients \(\{\alpha _j\}_{j=0}^\infty \). Define also

$$\begin{aligned} \rho _j=\sqrt{1-\left| \alpha _j\right| ^2}>0. \end{aligned}$$

The CMV matrix associated with \(\{\alpha _j\}_{j=0}^\infty \) is

$$\begin{aligned} C=\begin{pmatrix} \overline{\alpha _0} &{} \overline{\alpha _1}\rho _0 &{} \rho _1\rho _0 &{} 0 &{} 0 &{} \cdots \\ \rho _0 &{} -\overline{\alpha _1}\alpha _0 &{} -\rho _1\alpha _0 &{} 0 &{} 0 &{} \cdots \\ 0 &{} \overline{\alpha _2}\rho _1 &{} -\overline{\alpha _2}\alpha _1 &{} \overline{\alpha _3}\rho _2 &{} \rho _3\rho _2 &{} \cdots \\ 0 &{} \rho _2\rho _1 &{} -\rho _2\alpha _1 &{} -\overline{\alpha _3}\alpha _2 &{} -\rho _3\alpha _2&{} \cdots \\ 0 &{} 0 &{} 0 &{} \overline{\alpha _4}\rho _3 &{} -\overline{\alpha _4}\alpha _3 &{} \cdots \\ \cdots &{}\cdots &{}\cdots &{}\cdots &{}\cdots &{}\cdots \end{pmatrix}. \end{aligned}$$
(6.6)

This matrix is unitary and banded (and lies in \(\Omega _{n\rightarrow n+2,0}^{\mathrm {U}}\)). This last property (bandedness) does not hold for the so-called GGT representation [62, 68, 142], which has infinitely many non-zero entries in each row. The GGT representation uses the basis \(\{\Phi _n\}_{n=0}^\infty \), whereas the CMV representation obtains a basis via applying Gram–Schmidt to \(\{1,z,z^{-1},z^2,z^{-2},\ldots \}\). The key result is that \(\mu _C:=\mu ^C_{e_1,e_1}\) is precisely the measure \(\mu \) on the unit circle. Hence our new algorithms can be considered as a computational tool for the correspondence

$$\begin{aligned} \{\alpha _j\}_{j=0}^\infty \rightarrow \mu , \end{aligned}$$

in much the same way as for orthogonal polynomials on the real line in Sect. 6.1.
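The CMV matrix itself is straightforward to assemble from the Verblunsky coefficients; a minimal Python sketch via the standard \(C=LM\) factorisation (with L and M direct sums of \(2\times 2\) blocks) is given below. The Rogers–Szegő coefficients used here anticipate the first example of this section, and the handling of the truncation boundary is a simplification.

```python
import numpy as np

def theta(alpha):
    # 2x2 block built from a single Verblunsky coefficient
    rho = np.sqrt(1.0 - abs(alpha) ** 2)
    return np.array([[np.conj(alpha), rho], [rho, -alpha]])

def cmv_matrix(alphas):
    """Finite section of the CMV matrix (6.6) via the standard factorisation C = L*M,
    with L = Theta_0 (+) Theta_2 (+) ... and M = 1 (+) Theta_1 (+) Theta_3 (+) ...
    (direct sums of 2x2 blocks). Boundary effects of the truncation are ignored."""
    n = len(alphas)
    L = np.zeros((n + 1, n + 1), dtype=complex)
    M = np.zeros((n + 1, n + 1), dtype=complex)
    M[0, 0] = 1.0
    for j, a in enumerate(alphas):
        blk = theta(a)
        if j % 2 == 0:
            L[j:j + 2, j:j + 2] = blk
        else:
            M[j:j + 2, j:j + 2] = blk
    if n % 2 == 0:
        L[n, n] = 1.0      # pad the unfilled diagonal corner of the truncation
    else:
        M[n, n] = 1.0
    return (L @ M)[:n, :n]

# Rogers-Szego Verblunsky coefficients with q = 0.5
q = 0.5
alphas = [(-1) ** j * q ** ((j + 1) / 2) for j in range(200)]
C = cmv_matrix(alphas)
```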

Fig. 9. Left: Convergence of algorithm for Rogers–Szegő polynomials. Right: Corresponding Radon–Nikodym derivatives (densities of the measure).

The first example we consider is the Rogers–Szegő polynomials [117], with Verblunsky coefficients given by

$$\begin{aligned} \alpha _j=(-1)^jq^{(j+1)/2}, \end{aligned}$$

where \(q\in (0,1)\). In this case,

$$\begin{aligned} d\mu _C=\frac{1}{\sqrt{2\pi \log (q^{-1})}}\sum _{m\in {\mathbb {Z}}}\exp \left( -\frac{(\theta -2\pi m)^2}{2\log (q^{-1})}\right) d\theta , \end{aligned}$$

which can be expressed in terms of the theta function. Figure 9 (left) shows the convergence of the new algorithm for various q and we see algebraic convergence as before, with rates \(O(\epsilon )\) and \(O(\epsilon ^2)\) before and after extrapolation respectively (here \(\epsilon \) is such that we evaluate the convolutions with the Poisson kernel at radius \(r=(1+\epsilon )^{-1}\)). Radon–Nikodym derivatives for larger values of q (shown in the right of Fig. 9) have larger derivatives and hence a larger pre-factor in front of these rates. We can use a collocation method similar to that in Sect. 6.2, using the standard Fourier basis \(\{e^{im\theta }\}_{m\in {\mathbb {Z}}}\). Note that the relevant Cauchy transforms can be computed explicitly using Cauchy’s residue theorem. Collocation points inside and outside the unit disk are needed (collocating inside the unit disk causes the Cauchy transforms of the basis functions with negative m to vanish). This achieved machine precision using just 41 basis functions and collocating at distance \(\epsilon =0.1\) from \({\mathbb {T}}\).

The next example we consider is the Geronimus polynomials [63], which have \(\alpha _j=a\) with \(\left| a\right| <1\). In this case, for \(\left| a+1/2\right| \le 1/2\),

$$\begin{aligned} d\mu _C=\chi _{\left| \theta \right| >\theta _a}\frac{\sqrt{\cos ^2(\theta _a/2)-\cos ^2(\theta /2)}}{2\pi \left| 1+a\right| \left| \sin ((\theta -b)/2)\right| }d\theta , \end{aligned}$$

where \(\theta \in [-\pi ,\pi ]\), \(b=2\mathrm {arg}(1+{\overline{a}})\) and \(\theta _a=2\sin ^{-1}(\left| a\right| )\). Figure 10 (right) shows some typical examples, and note that these density functions are not smooth. When \(\left| a+1/2\right| >1/2\), there is also a singular part of the measure (see [130] for the exact formula).

In the cases shown in Fig. 10, the collocation method struggles to gain an accuracy beyond \(10^{-4}\) owing to the discontinuity in the derivative of the Radon–Nikodym derivative, whereas the algorithm based on convolutions is able to obtain much more accurate results. Finally, a typical example is shown in Fig. 11 (left) for \(a=0.8\) (there is now a singular part located at \(\theta =0\)), where we have shown the output of the algorithm and the exact convolution with the Poisson kernel for \(r=1/1.01\), as well as the collocation method using 21 basis functions and collocation points \(\{(1\pm 0.1)2\pi /21,(1\pm 0.1)2\pi \cdot 2/21,\ldots ,(1\pm 0.1)2\pi \}\). Here, we see an exact agreement between the algorithm and convolution. Unsurprisingly, the collocation method is unstable in the presence of point spectrum. Consistent with Theorem 4.2, the algorithm in Sect. 4.2 converges to the density over the portion of the spectrum which is purely absolutely continuous. This is shown in the right of Fig. 11.

Fig. 10. Left: Convergence of algorithm for Geronimus polynomials. In this case the algorithm can obtain much more accurate results than collocation owing to the non-smoothness of the density functions. Right: Example density functions for the cases considered.

Fig. 11. Left: The smooth part of the density function (black) for the case of an additional point mass at \(\theta =0\). Note that the algorithm’s output agrees almost perfectly with the exact convolution of the measure. We have also shown the output of the collocation method, which is unstable in the presence of the point mass. Right: Convergence of the algorithm from Theorem 4.2 over the portion of the spectrum which is purely absolutely continuous.

6.4 Fractional diffusion on a 2D quasicrystal

In this example, we demonstrate computation of the functional calculus and consider operators acting on the graph of a Penrose tile, the canonical model of a two-dimensional quasicrystal [47, 140, 146] (aperiodic crystals which typically have anomalous spectra/transport properties). Quasicrystals were discovered in 1982 by Shechtman [125], who was awarded the 2011 Nobel Prize in Chemistry for his discovery. Since then, quasicrystals have generated considerable interest due to their often exotic physical properties [135], with a vast literature on the physics and spectral properties of such aperiodic systems.Footnote 11 Unlike the one-dimensional case, little is known about the spectral properties of two-dimensional quasicrystals. A finite portion of the infinite tile is shown in Fig. 12, and we consider the natural graph whose vertices are the vertices of the tiling and whose edges correspond to the edges of the rhombi. Such a graph possesses no periodic structure, and it is generically impossible to study its spectral properties analytically with current methods. The free Hamiltonian \(H_0\) (Laplacian) is given by

$$\begin{aligned} (H_0\psi )_{i} = \sum _{i \sim j} \left( \psi _j-\psi _i\right) , \end{aligned}$$
(6.7)

with summation over nearest-neighbour sites (vertices). The first rigorous computation of the spectrum of \(H_0\) (also with error control) was performed in [40]. We chose a natural ordering of the vertices as in Fig. 12, which leads to an operator \(H_0\) acting on \(l^2({\mathbb {N}})\). The local bandwidth of this operator grows (our ordering is asymptotically optimal), and hence the computation of powers \(H_0^m\) is infeasible for \(m\gtrsim 50\), rendering polynomial approximations of the functional calculus intractable. In the above notation, \(H_0\in \Omega _{f,0}\) with \(f(n)-n=O(\sqrt{n})\), so this example provides a demonstration of the algorithm for a non-banded matrix. Throughout, we take \(u_0=e_1\), though different initial conditions are handled in the same manner.
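For concreteness, the following is a minimal sketch of assembling a finite section of \(H_0\) according to (6.7). The vertex/edge list of a finite Penrose patch is assumed to be given (e.g. from an inflation construction), with the vertices already indexed in the spiral ordering described above; the function name and inputs are hypothetical.

```python
from scipy.sparse import lil_matrix

def assemble_H0(n_vertices, edges):
    """Finite section of the free Hamiltonian (6.7).
    `edges` is an iterable of pairs (i, j) of neighbouring vertex indices,
    assumed ordered spirally outwards from the centre of the patch."""
    H = lil_matrix((n_vertices, n_vertices))
    for i, j in edges:
        H[i, j] += 1.0      # off-diagonal coupling to the neighbour
        H[j, i] += 1.0
        H[i, i] -= 1.0      # diagonal accumulates minus the vertex degree
        H[j, j] -= 1.0
    return H.tocsr()
```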

Fig. 12 Left: Finite portion of the Penrose tile showing the fivefold rotational symmetry. We label the vertices spirally outwards from the centre in increasing distance from the origin. Right: Contours used for the fractional diffusion on the Penrose tile (\(\alpha \in {\mathbb {N}}\) top, \(\alpha \notin {\mathbb {N}}\) bottom). The red line represents the interval containing the spectrum of \(-H_0\); the branch cut for \(z^{\alpha }\) is taken to be \({\mathbb {R}}_{\le 0}\)

Fig. 13 Left: Convergence for \(\alpha =1/2\). Right: Convergence for \(\alpha =1\). We have plotted the errors as a function of the matrix size (number of matrix columns in the rectangular truncations) used

The ability to compute the functional calculus allows the solution of linear evolution equations. Given \(A\in \Omega _{\mathrm {N}}\), a function F (continuous and bounded on \(\sigma (A)\)) and \(u_0\in l^2({\mathbb {N}})\), consider the evolution equation

$$\begin{aligned} \frac{du}{dt}=F(A)u,\quad u_{t=0}=u_0. \end{aligned}$$
(6.8)

The solution of this equation is

$$\begin{aligned} u(t)=\exp (F(A)t)u_0 \end{aligned}$$

and can be computed via the algorithm outlined in Sect. 4.1.
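Concretely, by the spectral theorem for normal A, this solution admits the standard representation

$$\begin{aligned} u(t)=\exp (F(A)t)u_0=\left[ \int _{\sigma (A)}\exp (F(\lambda )t)\,dE^{A}(\lambda )\right] u_0, \end{aligned}$$

so computing \(u(t)\) amounts to evaluating the functional calculus for the bounded continuous function \(\lambda \mapsto \exp (F(\lambda )t)\).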

Fig. 14 Evolution of the initial wavepacket under fractional diffusion

We consider fractional diffusion governed by

$$\begin{aligned} \frac{du}{dt}=-(-H_0)^{\alpha }u,\quad u_{t=0}=u_0, \end{aligned}$$

for \(\alpha >0\). If \(\alpha \) is an integer, then the solution can be represented via contour deformation as

$$\begin{aligned} u(t)=\frac{1}{2\pi i}\int _{\gamma }\exp (-z^{\alpha }t)R(z,-H_0)u_0dz, \end{aligned}$$
(6.9)

where \(\gamma \) is a closed contour looping once around the spectrum. Typically, we took the rectangular contour shown in Fig. 12 and approximated the integral via Gaussian quadrature. This allows us to compute the solution with error control (we know the minimal distance between \(\gamma \) and \(\sigma (-H_0)\), so we can bound the Lipschitz constant of the resolvent), and this clearly extends to other functions F holomorphic on a neighbourhood of \(\sigma (-H_0)\). Note that other methods, such as direct diagonalisation of finite square truncations or discrete time stepping (which is difficult if \(\alpha \notin {\mathbb {N}}\)), do not give error control and are slower. In fact, for this example, direct diagonalisation was impractical. When \(\alpha \notin {\mathbb {N}}\), we can still deform the contour, but not at 0 since \(0\in \sigma (-H_0)\). Hence we deform the contour to that shown in Fig. 12. For a discussion of contour methods applied to finite matrices (in the case that the spectrum is strictly positive), we refer the reader to [74]. Unfortunately, such methods cannot be applied here since \(0\in \sigma (-H_0)\). Moreover, 0 appears not to be an isolated point of the spectrum, meaning that dealing with this point separately is also impossible.
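A rough sketch of the contour quadrature (6.9) on a fixed finite truncation of \(H_0\) is given below, assuming the quadrature nodes and weights parameterising the contour are supplied. Unlike the rectangular-truncation algorithm used in the paper, this naive square-truncation version carries no error control; the branch of \(z^{\alpha }\) used by Python matches the cut \({\mathbb {R}}_{\le 0}\) of Fig. 12.

```python
import numpy as np
from scipy.sparse import identity
from scipy.sparse.linalg import spsolve

def contour_evolution(H0, u0, t, alpha, z_nodes, z_weights):
    """Quadrature approximation of (6.9):
    u(t) ~ (1/(2*pi*i)) * sum_k w_k * exp(-z_k**alpha * t) * R(z_k, -H0) u0,
    where R(z, -H0) u0 is obtained by solving (z I + H0) x = u0."""
    I = identity(H0.shape[0], format='csr')
    u = np.zeros(H0.shape[0], dtype=complex)
    for z, w in zip(z_nodes, z_weights):
        x = spsolve((z * I + H0).tocsc(), u0.astype(complex))
        u += w * np.exp(-(z ** alpha) * t) * x
    return u / (2j * np.pi)
```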

Figure 13 shows the convergence of the algorithm for \(\alpha =1/2\) and \(\alpha =1\). For \(\alpha =1/2\), error control is not given by our algorithm, so we computed an error by comparing to a “converged” solution using larger n. The \(l^2\) error refers to the error in \(l^2({\mathbb {N}})\). The method converges algebraically for \(\alpha =1/2\) (owing to the contour touching the spectrum at 0) but exponentially for \(\alpha =1\), with similar convergence observed over a large range of times t. Figure 14 shows the magnitude (log scale) of the computed solution at various times. Note that the techniques presented here can be applied to any evolution equation of the form (6.8) on infinite-dimensional Hilbert spaces. They may also be useful for splitting methods/exponential integrators, which require fast computation of matrix/operator exponentials (see [83, 100] and the references therein), and more generally in the field of infinite-dimensional ODE/PDE systems. For methods that compute general semigroups on infinite-dimensional separable Hilbert spaces with error control, see [35].