1 Introduction

1.1 Preface

The Hermite general \(\beta >0\) ensemble of rank \(N\) is a probability distribution on the set of \(N\) tuples of reals \(z_1<z_2<\cdots <z_N\) with density proportional to

$$\begin{aligned} \prod _{1\le i<j \le N} (z_j-z_i)^\beta \; \prod _{i=1}^N \exp \left( -\frac{z_i^2}{2}\right) . \end{aligned}$$
(1.1)

When \(\beta =2\), that density describes the joint distribution of the eigenvalues of a random Hermitian \(N\times N\) matrix \(M\), whose diagonal entries are i.i.d. real standard normal random variables, while real and imaginary parts of its entries above the diagonal are i.i.d. normal random variables of variance \(1/2\). The law of such a random matrix is referred to as the Gaussian Unitary Ensemble (GUE) (see e.g. [3, 20, 32]) and it has attracted much attention in the mathematical physics literature following the seminal work of Wigner in the 50s. Similarly, the case of \(\beta =1\) describes the joint distribution of eigenvalues of a real symmetric matrix sampled from the Gaussian Orthogonal Ensemble (GOE) and the case \(\beta =4\) corresponds to the Gaussian Symplectic Ensemble (GSE) (see e.g. [3, 20, 32] for the detailed definitions).
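As a quick numerical touchstone for this correspondence (an illustration of ours, not part of the original construction, and with names of our own choosing), one can sample a GUE matrix with exactly the entry distribution described above and diagonalize it; the ordered eigenvalues of the output are then distributed according to (1.1) with \(\beta =2\).

```python
import numpy as np

def sample_gue(N, rng=np.random.default_rng()):
    # Diagonal entries: i.i.d. real standard normals; real and imaginary parts of the
    # entries above the diagonal: i.i.d. centered normals of variance 1/2, as described above.
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return (A + A.conj().T) / 2

N = 5
eigenvalues = np.linalg.eigvalsh(sample_gue(N))  # z_1 < ... < z_N, distributed as (1.1) with beta = 2
print(eigenvalues)
```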

It is convenient to view the realizations of (1.1) as point processes on the real line and there are two well-known ways of adding a second dimension to that picture. The first one is to consider an \(N\)-dimensional diffusion known as Dyson Brownian motion (see [32, Chapter 9], [3, Section 4.3] and the references therein) which is the unique strong solution of the system of stochastic differential equations

$$\begin{aligned} \mathrm{d}X_i(t)=\frac{\beta }{2}\,\sum _{j\ne i} \frac{1}{X_i(t)-X_j(t)}\,\mathrm{d}t + \mathrm{d}W_i(t), \quad i=1,2,\ldots ,N \end{aligned}$$
(1.2)

with \(W_1,W_2,\ldots ,W_N\) being independent standard Brownian motions. If one solves (1.2) with zero initial condition, then the distribution of the solution at time \(1\) is given by (1.1). When \(\beta =2\) and one starts with zero initial condition, the diffusion (1.2) has two probabilistic interpretations: it can be either viewed as a system of \(N\) independent standard Brownian motions conditioned never to collide via a suitable Doob’s \(h\)-transform; or it can be regarded as the evolution of the eigenvalues of a Hermitian random matrix whose elements evolve as (independent) Brownian motions.
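The system (1.2) is easy to explore numerically. The following Euler–Maruyama sketch is purely illustrative (the step size, initial condition and function names are our own choices); it assumes a strictly ordered starting configuration, since the zero initial condition requires the more careful treatment referenced above.

```python
import numpy as np

def dyson_bm(x0, beta, T, n_steps, rng=np.random.default_rng()):
    # Naive Euler-Maruyama discretization of the system (1.2).
    x = np.array(x0, dtype=float)
    dt = T / n_steps
    for _ in range(n_steps):
        diff = x[:, None] - x[None, :]
        np.fill_diagonal(diff, np.inf)                 # drop the j = i term from the sum
        drift = (beta / 2) * (1.0 / diff).sum(axis=1)  # (beta/2) * sum_{j != i} 1/(x_i - x_j)
        x += drift * dt + np.sqrt(dt) * rng.normal(size=x.size)
        x.sort()                                       # crude safeguard against crossings caused by discretization
    return x

print(dyson_bm(np.linspace(-2, 2, 5), beta=2.0, T=1.0, n_steps=20000))
```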

An alternative way of adding a second dimension to the ensemble in (1.1) involves the so-called corner processes. For \(\beta =2\) take an \(N\times N\) GUE matrix \(M\) and let \(x^N_1\le x^N_2\le \cdots \le x^N_N\) be its ordered eigenvalues. More generally, for every \(1\le k\le N\) let \(x^k_1\le x^k_2 \le \cdots \le x^k_k\) be the eigenvalues of the top-left \(k\times k\) submatrix (“corner”) of \(M\). It is well-known that the eigenvalues interlace in the sense that \(x_i^k\le x_i^{k-1}\le x_{i+1}^k\) for \(i=1,\dots ,k-1\) (see Fig. 1 for a schematic illustration of the eigenvalues).

Fig. 1 Interlacing particles arising from eigenvalues of corners of a \(3\times 3\) matrix. Row number \(k\) in the picture corresponds to the eigenvalues of the \(k\times k\) corner

The joint distribution of \(x^k_i\), \(1\le i\le k\le N\) is known as the GUE-corners process (some authors also use the name “GUE-minors process”) and its study was initiated in [4] and [26]. The GUE-corners process is uniquely characterized by two properties: its projection to the set of particles \(x^N_1,x^N_2,\ldots ,x^N_N\) is given by (1.1) with \(\beta =2\) (normalized to a probability density), and the conditional distribution of \(x^k_i\), \(1\le i\le k\le N-1\) given \(x^N_1,x^N_2,\ldots ,x^N_N\) is uniform on the polytope defined by the interlacing conditions above, see [4, 21]. Due to the combination of Gaussianity and uniformity embedded into its definition, the GUE-corners process appears as a universal scaling limit for a number of 2d models of statistical mechanics, see [22–24, 26, 38].
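The interlacing property is easy to check empirically. The sketch below (our own illustration; all variable names are ours) samples a GUE matrix, computes the spectra of its top-left corners and verifies the inequalities \(x_i^k\le x_i^{k-1}\le x_{i+1}^k\) for all admissible \(i\) and \(k\).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
M = (A + A.conj().T) / 2                                            # a GUE matrix, as in Sect. 1.1

spectra = [np.linalg.eigvalsh(M[:k, :k]) for k in range(1, N + 1)]  # spectra[k-1] = (x^k_1, ..., x^k_k)
for k in range(2, N + 1):
    big, small = spectra[k - 1], spectra[k - 2]
    assert all(big[i] <= small[i] <= big[i + 1] for i in range(k - 1))  # x^k_i <= x^{k-1}_i <= x^k_{i+1}
print(spectra)
```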

Similarly, one can construct corners processes for \(\beta =1\) and \(\beta =4\), see e.g. [33]. Extrapolating the resulting formulas for the joint density of eigenvalues to general values of \(\beta >0\) one arrives at the following definition.

Definition 1.1

The Hermite \(\beta \) corners process of variance \(t>0\) is the unique probability distribution on the set of reals \(x^k_i\), \(1\le i\le k\le N\) subject to the interlacing conditions \(x_i^k\le x_i^{k-1}\le x_{i+1}^k\) whose density is proportional to

$$\begin{aligned} \prod _{i<j} \left( x_j^N-x_i^N\right) \prod _{i=1}^N \exp \left( - \frac{(x_i^N)^2}{2t}\right) \prod _{k=1}^{N-1} \prod _{1\le i<j\le k} \left( x_j^k-x_i^k\right) ^{2-\beta } \prod _{a=1}^k \prod _{b=1}^{k+1} \left| x^k_a-x^{k+1}_b\right| ^{\beta /2-1}. \end{aligned}$$
(1.3)

The fact that the projection of the Hermite \(\beta \) corners process of variance \(1\) onto level \(k\) (that is, on the coordinates \(x_1^k,x^k_2,\ldots ,x_k^k\)) is given by the corresponding Hermite \(\beta \) ensemble of (1.1) can be deduced from the Dixon-Anderson integration formula (see [2, 17]), which was studied before in the context of Selberg integrals (see [44], [20, Chapter 4]). One particular case of the Selberg integral is the evaluation of the normalizing constant for the probability density of the Hermite \(\beta \) ensemble of (1.1). We provide more details in this direction in Sect. 2.2.

The ultimate goal of the present article is to combine Dyson Brownian motions and corner processes in a single picture. In other words, we aim to introduce a relatively simple diffusion on interlacing particle configurations whose projection on a fixed level is given by Dyson Brownian motion of (1.2), while its fixed time distributions are given by the Hermite \(\beta \) corners processes of Definition 1.1.

One would think that a natural way to do this (at least for \(\beta =1,2,4\)) is to consider an \(N\times N\) matrix of suitable Brownian motions and to project it onto the (interlacing) set of eigenvalues of the matrix and its top-left \(k\times k\) corners, thus generalizing the original construction of Dyson. However, the resulting stochastic process ends up being quite nasty even in the case \(\beta =2\). It is shown in [1] that already for \(N=3\) (and at least some initial conditions) the projection is not a Markov process. When one considers only two adjacent levels (that is, the projection onto \(x_1^N,x^N_2,\ldots ,x_N^N; x_1^{N-1},x^{N-1}_2,\ldots ,x_{N-1}^{N-1}\)), then it can be proven (see [1]) that the projection is Markovian, but the corresponding SDE is very complicated.

An alternative elegant solution for the case \(\beta =2\) was given by Warren [46]. Consider the process \((Y^k_i:1\le i\le k\le N)\) defined through the following inductive procedure: \(Y^1_1\) is a standard Brownian motion with zero initial condition; given \(Y^1_1\), the processes \(Y^2_1\) and \(Y^2_2\) are constructed as independent standard Brownian motions started at zero and reflected on the trajectory of \(Y^1_1\) in such a way that \(Y^2_1(t)\le Y^1_1(t)\le Y^2_2(t)\) holds for all \(t\ge 0\). More generally, having constructed the processes on the first \(k\) levels (that is, \(Y^m_i\), \(1\le i\le m\le k\)) one defines \(Y^{k+1}_i\) as an independent standard Brownian motion started at \(0\) and reflected on the trajectories of \(Y^k_{i-1}\) and \(Y^k_{i}\) in such a way that \(Y^{k}_{i-1}(t)\le Y^{k+1}_i(t)\le Y^{k}_i(t)\) remains true for all \(t\ge 0\) (see [46] and also [24] for more details). Warren shows that the projection of the dynamics on a level \(k\) (that is, on \(Y^k_1,Y^k_2,\ldots ,Y^k_k\)) is given by a \(k\)-dimensional Dyson Brownian Motion of (1.2) with \(\beta =2\), and that the fixed time distributions of the process \((Y^k_i:1\le i\le k\le N)\) are given by the Hermite \(\beta \) corners processes of Definition 1.1 with \(\beta =2\).
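A crude discrete-time caricature of this construction can be simulated as follows (this is our own sketch, not the construction of [46]: exact Skorokhod reflection is replaced by clamping each new coordinate between its neighbours on the previous level after every Gaussian increment, which only approximates the reflected dynamics for small step sizes).

```python
import numpy as np

def warren_approx(N, T, n_steps, rng=np.random.default_rng()):
    dt = T / n_steps
    Y = [np.zeros(k) for k in range(1, N + 1)]      # Y[k-1] holds level k; all levels started at zero
    for _ in range(n_steps):
        for k in range(N):                          # update level 1 first, then level 2, etc.
            Y[k] = Y[k] + np.sqrt(dt) * rng.normal(size=k + 1)
            if k > 0:
                lower = np.concatenate(([-np.inf], Y[k - 1]))   # Y^k_{i-1} from below
                upper = np.concatenate((Y[k - 1], [np.inf]))    # Y^k_i from above
                Y[k] = np.clip(Y[k], lower, upper)              # enforce Y^k_{i-1} <= Y^{k+1}_i <= Y^k_i
    return Y

print(warren_approx(N=3, T=1.0, n_steps=5000))
```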

Our aim is to construct a generalization of the Warren process for general values of \(\beta \). In other words, we want to answer the question “What is the general \(\beta \) analogue of the reflected interlacing Brownian Motions of [46]?”.

1.2 Our results

Our approach to the construction of the desired general \(\beta \) multilevel stochastic process is based on discrete space approximation. In [24] we proved that the reflected interlacing Brownian motions of [46] can be obtained as a diffusive scaling limit for a class of stochastic dynamics on discrete interlacing particle configurations. The latter dynamics are constructed from independent random walks by imposing the local block/push interactions between particles to preserve the interlacing conditions. The special cases of such processes arise naturally in the study of two-dimensional statistical mechanics systems such as random stepped surfaces and various types of tilings (cf. [6, 9, 10, 34]).

In Sect. 2 we introduce a deformation \(X^{multi}_{disc}(t)\) of these processes depending on a positive parameter \(\theta \) (which is omitted from the notation, and with \(\theta =1\) corresponding to the previously known case). The resulting discrete space dynamics is an intriguing interacting particle system with global interactions whose state space is given by interlacing particle configurations with integer coordinates. Computer simulations of this dynamics for \(\theta =1/2\) and \(\theta =2\) can be found at [25].

We further study the diffusive limit of \(X^{multi}_{disc}(s)\) under the rescaling of time \(s=\varepsilon ^{-1} t\) and \(\varepsilon ^{-1/2}\) scaling of space as \(\varepsilon \downarrow 0\). Our first result is that for any fixed \(\theta >0\), the rescaled processes are tight as \(\varepsilon \downarrow 0\), see Theorem 5.1 for the exact statement. The continuous time, continuous space processes \(Y^{mu}(t)\), defined as subsequential limits as \(\varepsilon \downarrow 0\) of the family \(X^{multi}_{disc}(s)\), are our main heroes, and we prove a variety of results about them for different values of \(\theta \).

  1. (1)

    For any \(\theta \ge 2\) we show in Theorem 5.2 that \(Y^{mu}(t)\) satisfies the system of SDEs (1.4) below.

  2. (2)

    For any \(\theta \ge 1/2\) and any \(1\le k\le N\) we show in Theorem 5.3 that if \(Y^{mu}(t)\) is started from a \(\theta \)–Gibbs initial condition (zero initial condition is a particular case), then the \(k\)–dimensional restriction of the \(N(N+1)/2\)–dimensional process \(Y^{mu}(t)\) to the level \(k\) is a \(2\theta \)–Dyson Brownian motion, that is, the vector \((Y^{mu}(t)^k_1, Y^{mu}(t)^k_2,\dots ,Y^{mu}(t)^k_k)\) solves (1.2) with \(\beta =2\theta \) and suitable independent standard Brownian motions \(W_1(t),W_2(t),\dots ,W_k(t)\).

  3. (3)

    For any \(\theta > 0\) we show that if \(Y^{mu}(t)\) is started from zero initial condition, then its distribution at time \(t\) is the Hermite \(2\theta \) corners process of variance \(t\), that is, the corresponding probability density is proportional to (1.3) with \(\beta =2\theta \). In fact, we prove a more general statement, see Theorem 5.3 and Corollary 5.4.

  4. (4)

    For \(\theta =1\) the results of [24] yield that \(Y^{mu}(t)\) is the collection of reflected interlacing Brownian motions of [46].

The above results are complemented by the following uniqueness theorem for the system of SDEs (1.4). In particular, it implies that for \(\theta \ge 2\) all the subsequential limits \(Y^{mu}(t)\) of \(X^{multi}_{disc}(s)\) as \(\varepsilon \downarrow 0\) are the same.

Theorem 1.2

(Theorem 4.1) For any \(N\in \mathbb {N}\) and \(\theta >1\) the system of SDEs

$$\begin{aligned} \mathrm{d}Y^k_i(t) = \Biggl (\sum _{m\ne i} \frac{1-\theta }{Y^k_i(t)-Y^k_m(t)}-\sum _{m=1}^{k-1} \frac{1-\theta }{Y_i^k(t)-Y_m^{k-1}(t)}\Biggr )\,\mathrm{d}t + \mathrm{d}W_i^k(t), \quad 1\le i\le k\le N, \end{aligned}$$
(1.4)

where \(W_i^k\), \(1\le i\le k\le N\) are independent standard Brownian motions, possesses a unique weak solution taking values in the cone

$$\begin{aligned} \overline{\mathcal {G}^N} =\left\{ y=(y^k_i)_{1\le i\le k\le N}\in \mathbb {R}^{N(N+1)/2}:y^{k-1}_{i-1}\le y^k_i\le y^{k-1}_i\right\} \end{aligned}$$
(1.5)

for any initial condition \(Y(0)\) in the interior \(\mathcal {G}^N\) of \(\overline{\mathcal {G}^N}\).

It would be interesting to extend all the above results to general \(\theta >0\). We believe (but we do not have a proof) that the identification of \(Y^{mu}(t)\) with a solution of (1.4) is valid for all \(\theta >1\) and that the identification of the projection of \(Y^{mu}(t)\) onto the \(N\)–th level with a \(\beta =2\theta \)–Dyson Brownian motion is valid for any \(\theta >0\). On the other hand, \(Y^{mu}(t)\) cannot be a solution to (1.4) for \(\theta \le 1\). Indeed, we know that when \(\theta =1\) the process \(Y^{mu}(t)\) is a collection of reflected interlacing Brownian motions, which hints that one should introduce additional local time terms in (1.4). In addition, the interpretation of the solution to (1.4) as a generalization of the one-dimensional Bessel process to a process in the Gelfand–Tsetlin cone suggests that the corresponding process for \(\theta <1\) is no longer a semimartingale and should be defined and studied along the lines of [42, Chapter XI, Exercise (1.26)].
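For concreteness, the drift appearing in (1.4) can be written out as a short routine (an illustration of ours, with hypothetical names); a naive Euler step built on top of it has to be used with care, since for \(1<\theta <2\) the true solution does reach parts of the boundary of the cone, as discussed above.

```python
import numpy as np

def drift_of_1_4(Y, theta):
    # Y is a list with Y[k-1] = (Y^k_1, ..., Y^k_k), a point in the open cone G^N.
    # Returns the drift of (1.4): sum_{m != i} (1-theta)/(Y^k_i - Y^k_m)
    #                             - sum_{m=1}^{k-1} (1-theta)/(Y^k_i - Y^{k-1}_m).
    drift = []
    for k, yk in enumerate(Y, start=1):
        d = np.zeros(k)
        for i in range(k):
            d[i] += sum((1 - theta) / (yk[i] - yk[m]) for m in range(k) if m != i)
            if k > 1:
                d[i] -= sum((1 - theta) / (yk[i] - Y[k - 2][m]) for m in range(k - 1))
        drift.append(d)
    return drift

Y0 = [np.array([0.0]), np.array([-1.0, 1.0]), np.array([-2.0, 0.3, 2.0])]  # a point of the cone G^3
print(drift_of_1_4(Y0, theta=2.5))
```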

1.3 Our methods

Our approach to the construction and study of the discrete approximating process \(X^{multi}_{disc}(s)\) is related to Jack symmetric polynomials. Recall that Jack polynomials \(J_{\lambda }(x_1,x_2,\ldots ,x_N;\theta )\), indexed by Young diagrams \(\lambda \) and a positive parameter \(\theta \), are eigenfunctions of the Sekiguchi differential operators ([43], [31, Chapter VI, Section 10], [20, Chapter 12])

$$\begin{aligned} D(u;\theta )=\frac{1}{\prod _{i<j} (x_i-x_j)} \det \left[ x_i^{N-j}\left( x_i\,\frac{\partial }{\partial x_i}+(N-j)\theta +u\right) \right] _{i,j=1,2,\ldots ,N}. \end{aligned}$$

One can also define \(J_{\lambda }(x_1,x_2,\ldots ,x_N;\theta )\) as limits of Macdonald polynomials \(P_\lambda (\cdot ;q,t)\) as \(q,t\rightarrow 1\) in such a way that \(t=q^{\theta }\) (see [31]). For the special values \(\theta =1/2,\,1,\,2\) these polynomials are spherical functions of the Gelfand pairs \(O(N)\subset U(N)\), \(U(N)\subset U(N)\times U(N)\), \(Sp(N)\subset U(2N)\), respectively, and are also known as Zonal polynomials (see e.g. [31, Chapter 7] and the references therein). It is known that spherical functions of compact type (corresponding to the above Gelfand pairs) degenerate to the spherical functions of Euclidean type, which in our case are related to real symmetric, complex Hermitian and quaternionic Hermitian matrices, respectively (see e.g. [35, Section 4] and the references therein). In particular, in the case \(\theta =1\) this is a manifestation of the fact that the tangent space to the unitary group \(U(N)\) at the identity can be identified with the set of Hermitian matrices. Due to all these facts it comes as no surprise that Hermite \(\beta \) ensembles can be obtained as limits of discrete probabilistic structures related to Jack polynomials with parameter \(\theta =\beta /2\).

On the discrete level our construction of the multilevel stochastic dynamics is based on a procedure introduced by Diaconis and Fill [16], which has recently been used extensively in the study of Markov chains on interlacing particle configurations (see e.g. [6, 7, 9–11]). The idea is to use commuting Markov operators and conditional independence to construct a multilevel Markov chain with given single level marginals. In our case these operators can be written in terms of Jack polynomials. In the limit the commutation relation we use turns into the following statement, which might be of independent interest.

Let \(P_N(t;\beta )\) denote the Markov transition operators of the \(N\)-dimensional Dyson Brownian Motion of (1.2) and let \(L^N_{N-1}(\beta )\) denote the Markov transition operator corresponding to conditioning the \((N-1)\)-st level (that is, \(x^{N-1}_1,x^{N-1}_2,\ldots ,x^{N-1}_{N-1}\)) on the \(N\)-th level (that is, \(x^N_1,x^N_2,\ldots ,x^N_N\)) in the Hermite \(\beta \) corners process.

Proposition 1.3

(Corollary of Theorem 5.3) For any \(\beta \ge 1\) the links \(L^N_{N-1}(\beta )\) given by the stochastic transition kernels

$$\begin{aligned} \frac{\Gamma (N\beta /2)}{\Gamma (\beta /2)^N} \prod _{1\le i<m\le N-1} \left( x^{N-1}_m-x^{N-1}_i\right) \prod _{i=1}^{N-1} \prod _{j=1}^N \left| x^N_j-x^{N-1}_i\right| ^{\beta /2-1} \prod _{1\le j<n\le N} \left( x^N_n-x^N_j\right) ^{1-\beta } \end{aligned}$$
(1.6)

intertwine the semigroups \(P_N(t;\beta )\) and \(P_{N-1}(t;\beta )\) in the sense that

$$\begin{aligned} L^N_{N-1}(\beta ) P_N(t;\beta ) = P_{N-1}(t;\beta )L^N_{N-1}(\beta ),\quad t\ge 0. \end{aligned}$$
(1.7)

The latter phenomenon can be subsumed into a general theory of intertwinings for diffusions, which for example also includes the findings in [45] and [39].

Due to the presence of singular drift terms neither the existence nor the uniqueness of the solution of (1.4) is straightforward. When dealing with systems of SDEs with singular drift terms one typically shows the existence and uniqueness of strong solutions by truncating the singularity first (thus, obtaining a well-behaved system of SDEs) and by proving afterwards that the solution cannot reach the singularity in finite time using suitable Lyapunov functions, see e.g. [3, proof of Proposition 4.3.5]. However, for \(1<\theta <2\) the solutions of (1.4) do reach some of the singularities. A similar phenomenon occurs in the case of the \(\beta \)–Dyson Brownian motion (1.2) with \(0<\beta <1\), for which the existence and uniqueness theorem was established in [14] using the theory of multivalued SDEs; however, in the multilevel setting we lack a certain monotonicity property which plays a crucial role in [14]. In addition, due to the intrinsic asymmetry built into the drift terms the solution of (1.4) seems to be beyond the scope of the processes that can be constructed using Dirichlet forms (see e.g. [40] for Dirichlet form constructions of symmetric diffusions with a singular drift at the boundary of their domain and for the limitations of that method). Instead, by localizing in time, using appropriate Lyapunov functions, and applying the Girsanov Theorem, we are able to reduce (1.4) to a number of non-interacting Bessel processes, whose existence and uniqueness is well-known. This approach has an additional advantage over Dirichlet form type constructions, since it allows us to establish convergence to the solution of (1.4) via martingale problem techniques, which is how our proof of Theorem 5.2 goes.

Note also that for \(\theta =1\) (\(\beta =2\)) the interactions in the definition of \(X^{multi}_{disc}(s)\) become local. We have studied the convergence of such dynamics to the process of Warren in [24], with the proof being based on the continuity of a suitable Skorokhod reflection map. For general values of \(\theta >0\) neither the discrete dynamics, nor the continuous dynamics can be obtained as the image of an explicitly known process under the Skorokhod reflection map of [24].

1.4 Further developments and open problems

It would be interesting to study the asymptotic behavior of both the discrete and the continuous dynamics as the number of levels \(N\) goes to infinity. There are at least two groups of questions here.

The global fluctuations of Dyson Brownian motions as \(N\rightarrow \infty \) are known to be Gaussian (see [3, Section 4.3]); moreover, the limiting covariance structure can be described by the Gaussian Free Field (see [5, 12]). In addition, the asymptotic fluctuations of the Hermite \(\beta \) corners processes of Definition 1.1 are also Gaussian and can be described via the Gaussian Free Field (cf. [12]). This raises the question of whether the 3-dimensional global fluctuations of the solution to (1.4) are also asymptotically (as \(N\rightarrow \infty \)) Gaussian and what the limiting covariance structure might look like. A partial result in this direction was obtained for \(\beta =2\) in [9].

The edge fluctuations (that is, the fluctuations of the rightmost particle as \(N\rightarrow \infty \)) in the Hermite \(\beta \) ensemble given by (1.1) can be described via the \(\beta \)-Tracy–Widom distribution (see [41]). Moreover, in the present article we link the Hermite \(\beta \) ensemble to a certain discrete interacting particle system. This suggests that one might find the \(\beta \)-Tracy–Widom distribution in the limit of the edge fluctuations of that interacting particle system or its simplified versions.

2 Discrete space dynamics via Jack polynomials

2.1 Preliminaries on Jack polynomials

In this section we collect certain facts about Jack symmetric polynomials. A reader familiar with these polynomials can proceed to Sect. 2.2. Our notations generally follow the ones in [31].

In what follows \(\Lambda ^N\) is the algebra of symmetric polynomials in \(N\) variables. In addition, we let \(\Lambda \) be the algebra of symmetric polynomials in countably many variables, that is, of symmetric functions. An element of \(\Lambda \) is a formal symmetric power series of bounded degree in the variables \(x_1,x_2,\dots \). One way to view \(\Lambda \) is as an algebra of polynomials in the Newton power sums \(p_k=\sum _i (x_i)^k\). There exists a canonical projection \(\pi _N:\,\Lambda \rightarrow \Lambda ^N\), which sets all variables except for \(x_1,x_2,\ldots ,x_N\) to zero (see [31, Chapter 1, Section 2] for more details).

A partition of size \(n\), or a Young diagram with \(n\) boxes, is a sequence of non-negative integers \(\lambda _1\ge \lambda _2\ge \cdots \ge 0\) such that \(\sum _i \lambda _i=n\). \(|\lambda |\) stands for the number of boxes in \(\lambda \) and \(\ell (\lambda )\) is the number of non-empty rows in \(\lambda \) (that is, the number of non-zero sequence elements \(\lambda _i\) in \(\lambda \)). Let \(\mathbb {Y}\) denote the set of all Young diagrams, and \(\mathbb {Y}^N\) the set of all Young diagrams \(\lambda \) with at most \(N\) rows (that is, such that \(\lambda _{N+1}=0\)). Typically, we will use the symbols \(\lambda \), \(\mu \) for Young diagrams. We adopt the convention that the empty Young diagram \(\emptyset \) with \(|\emptyset |=0\) also belongs to \(\mathbb {Y}\) and \(\mathbb {Y}^N\). For a box \(\,\square =(i,j)\) of a Young diagram \(\lambda \) (that is, a pair \((i,j)\) such that \(\lambda _i\ge j\)), \(a(i,j;\lambda )\) and \(l(i,j;\lambda )\) are its arm and leg lengths:

$$\begin{aligned} a(i,j;\lambda )=\lambda _i-j,\quad l(i,j;\lambda )=\lambda '_j-i, \end{aligned}$$

where \(\lambda '_j\) is the row length in the transposed diagram \(\lambda '\) defined by

$$\begin{aligned} \lambda '_j=|\{i:\lambda _i\ge j\}|. \end{aligned}$$

Further, \(a'(i,j)\), \(l'(i,j)\) stand for the co-arm and the co-leg lengths, which do not depend on \(\lambda \)

$$\begin{aligned} a'(i,j)=j-1,\quad l'(i,j)=i-1. \end{aligned}$$

When it is clear from the context which Young diagram is used, we omit it from the notation and write simply \(a(i,j)\) (or \(a(\square )\)) and \(l(i,j)\) (or \(l(\square )\)).
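In code these combinatorial quantities are one-liners; the following helper functions (our own, purely for illustration) compute the arm, leg, co-arm and co-leg lengths of a box of a Young diagram given as a weakly decreasing list of row lengths.

```python
def arm(i, j, lam):      # a(i, j; lambda) = lambda_i - j            (1-based box coordinates)
    return lam[i - 1] - j

def leg(i, j, lam):      # l(i, j; lambda) = lambda'_j - i, with lambda'_j = #{i : lambda_i >= j}
    return sum(1 for row in lam if row >= j) - i

def coarm(i, j):         # a'(i, j) = j - 1
    return j - 1

def coleg(i, j):         # l'(i, j) = i - 1
    return i - 1

lam = [4, 2, 1]
print(arm(1, 2, lam), leg(1, 2, lam), coarm(1, 2), coleg(1, 2))  # box (1, 2): 2, 1, 1, 0
```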

We write \(J_\lambda (\,\cdot ;\theta )\) for Jack polynomials, which are indexed by Young diagrams \(\lambda \) and positive reals \(\theta \). Many facts about these polynomials can be found in [31, Chapter VI, Section 10]. Note however that in that book Macdonald uses the parameter \(\alpha \) given by our \(\theta ^{-1}\). We use \(\theta \), following [27]. \(J_\lambda \) can be viewed either as an element of the algebra \(\Lambda \) of symmetric functions in countably many variables \(x_1,x_2,\ldots \), or (specializing all but finitely many variables to zeros) as a symmetric polynomial in \(x_1,x_2,\ldots ,x_N\) from the algebra \(\Lambda ^N\). In both interpretations the leading term of \(J_\lambda \) is given by \(x_1^{\lambda _1} x_2^{\lambda _2}\cdots x_{\ell (\lambda )}^{\lambda _{\ell (\lambda )}}\). When \(N\) is finite, the polynomials \(J_\lambda (x_1,\dots ,x_N;\theta )\) are known to be the eigenfunctions of the Sekiguchi differential operator:

$$\begin{aligned}&\frac{1}{\prod _{i<j} (x_i-x_j)} \det \left[ x_i^{N-j}\left( x_i\,\frac{\partial }{\partial x_i}+(N-j)\theta +u\right) \right] _{i,j=1,2,\ldots ,N} J_\lambda (x_1,\dots ,x_N;\theta )\nonumber \\&\quad =\Big (\prod _{i=1}^N \big (\lambda _i + (N-i)\theta +u\big )\Big ) J_\lambda (x_1,\dots ,x_N;\theta ). \end{aligned}$$
(2.1)

The eigenrelation (2.1) can be taken as a definition for the Jack polynomials. We also need dual polynomials \(\widetilde{J}_\lambda \) which differ from \(J_\lambda \) by an explicit multiplicative constant:

$$\begin{aligned} \widetilde{J}_\lambda =J_\lambda \cdot \prod _{\square \in \lambda } \frac{a(\square )+\theta \,l(\square )+\theta }{a(\square )+\theta \,l(\square )+1}. \end{aligned}$$
(2.2)

Next, we recall the definition of skew Jack polynomials \(J_{\lambda /\mu }\). Take two infinite sets of variables \(x\) and \(y\), and consider a Jack polynomial \(J_\lambda (x,y;\theta )\). The latter is, in particular, a symmetric polynomial in the \(x\) variables. The coefficients \(J_{\lambda /\mu }(y;\theta )\) in its decomposition in the linear basis of Jack polynomials in the \(x\) variables are symmetric polynomials in the \(y\) variables and are referred to as skew Jack polynomials:

$$\begin{aligned} J_\lambda (x,y;\theta ) = \sum _{\mu } J_{\mu }(x;\theta )\,J_{\lambda /\mu }(y;\theta ). \end{aligned}$$
(2.3)

Similarly, one writes

$$\begin{aligned} \widetilde{J}_\lambda (x,y;\theta ) = \sum _{\mu } \widetilde{J}_{\mu }(x;\theta )\,\widetilde{J}_{\lambda /\mu }(y;\theta ). \end{aligned}$$

It is known (see e.g. [31, Chapter VI, Section 10]) that \(J_{\lambda /\mu }(y;\theta )=\widetilde{J}_{\lambda /\mu }(y;\theta )=0\) unless \(\mu \subset \lambda \), which means that \(\lambda _i\ge \mu _i\) for \(i=1,2,\dots \). Also \(J_{\lambda /\emptyset }(y;\theta )=J_{\lambda }(y;\theta )\) and \(\widetilde{J}_{\lambda /\emptyset }(y;\theta )=\widetilde{J}_{\lambda }(y;\theta )\).

Throughout the article the parameter \(\theta \) remains fixed and, thus, we usually omit it, writing simply \(J_\lambda (x)\), \(\widetilde{J}_\lambda (x)\), \(J_{\lambda /\mu }(x)\), \(\widetilde{J}_{\lambda /\mu }(x)\).

A specialization \(\rho \) is an algebra homomorphism from \(\Lambda \) to the set of complex numbers. A specialization is called Jack-positive if its values on all (skew) Jack polynomials with a fixed parameter \(\theta >0\) are real and non-negative. The following statement gives a classification of all Jack-positive specializations.

Proposition 2.1

([27]) For any fixed \(\theta >0\), Jack-positive specializations can be parameterized by triplets \((\alpha ,\beta ,\gamma )\), where \(\alpha \), \(\beta \) are sequences of real numbers with

$$\begin{aligned} \alpha _1\ge \alpha _2\ge \cdots \ge 0,\quad \beta _1\ge \beta _2\ge \cdots \ge 0, \quad \sum _{i=1}^\infty (\alpha _i+\beta _i)<\infty \end{aligned}$$

and \(\gamma \) is a non-negative real number. The specialization corresponding to a triplet \((\alpha ,\beta ,\gamma )\) is given by its values on the Newton power sums \(p_k\), \(k\ge 1\):

$$\begin{aligned}&p_1\mapsto p_1(\alpha ,\beta ,\gamma )= \gamma +\sum _{i=1}^\infty (\alpha _i+\beta _i), \\&p_k\mapsto p_k(\alpha ,\beta ,\gamma )= \sum _{i=1}^\infty \alpha _i^k + (-\theta )^{k-1} \sum _{i=1}^\infty \beta _i^k, \quad k\ge 2. \end{aligned}$$

The specialization with all parameters taken to be zero is called the empty specialization. This specialization maps a polynomial to its constant term (that is, the degree zero summand).

We prepare the following explicit formulas for Jack-positive specializations for future use.

Proposition 2.2

([31, Chapter VI, (10.20)]) Consider the Jack-positive specialization \(\mathfrak {a}^N\) with \(\alpha _1=\alpha _2=\cdots =\alpha _N=\mathfrak {a}\) and all other parameters set to zero. We have

$$\begin{aligned} J_\lambda (\mathfrak {a}^N)={\left\{ \begin{array}{ll} \mathfrak {a}^{|\lambda |}\,\prod _{\square \in \lambda } \dfrac{N\,\theta +a'(\square )-\theta \,l'(\square )}{a(\square )+\theta \,l(\square ) +\theta }, &{} \mathrm{if\;}\ell (\lambda )\le N,\\ 0,&{}\mathrm{otherwise.} \end{array}\right. } \end{aligned}$$

Taking the limit \(N\rightarrow \infty \) of specializations \(\left( \frac{s}{N}\right) ^N\) of Proposition 2.2 we obtain the following.

Proposition 2.3

Consider the Jack-positive specialization \(\mathfrak {r}_s\) with \(\gamma =s\) and all other parameters set to zero. We have

$$\begin{aligned} J_\lambda (\mathfrak {r}_s)=s^{|\lambda |}\,\theta ^{|\lambda |}\,\prod _{\square \in \lambda } \frac{1}{a(\square )+\theta \,l(\square ) +\theta }. \end{aligned}$$
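The two evaluation formulas of Propositions 2.2 and 2.3 are completely explicit and straightforward to implement; the sketch below (our own helper code, with names of our choosing) evaluates them for a given diagram.

```python
def boxes(lam):
    return [(i, j) for i, row in enumerate(lam, start=1) for j in range(1, row + 1)]

def arm(i, j, lam):
    return lam[i - 1] - j

def leg(i, j, lam):
    return sum(1 for row in lam if row >= j) - i

def J_aN(lam, a, N, theta):
    # Proposition 2.2: J_lambda(a^N); zero when lambda has more than N non-empty rows.
    if sum(1 for row in lam if row > 0) > N:
        return 0.0
    value = a ** sum(lam)
    for i, j in boxes(lam):
        value *= (N * theta + (j - 1) - theta * (i - 1)) / (arm(i, j, lam) + theta * leg(i, j, lam) + theta)
    return value

def J_rs(lam, s, theta):
    # Proposition 2.3: J_lambda(r_s).
    value = (s * theta) ** sum(lam)
    for i, j in boxes(lam):
        value /= arm(i, j, lam) + theta * leg(i, j, lam) + theta
    return value

lam = [2, 1]
print(J_aN(lam, 1.0, 3, 0.5), J_rs(lam, 1.0, 0.5))
```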

Certain specializations of skew Jack polynomials also admit explicit formulas. We say that two Young diagrams \(\lambda \) and \(\mu \) interlace and write \(\mu \prec \lambda \) if

$$\begin{aligned} \lambda _1\ge \mu _1\ge \lambda _2\ge \mu _2\ge \cdots . \end{aligned}$$

Proposition 2.4

For any complex number \(\mathfrak {a}\ne 0\), the specialization value \(J_{\lambda /\mu }(\mathfrak {a}^1)\) vanishes unless \(\mu \prec \lambda \). In the latter case,

$$\begin{aligned} J_{\lambda /\mu }(\mathfrak {a}^1)&= \mathfrak {a}^{|\lambda |-|\mu |} \prod _{1\le i\le j\le k-1} \frac{(\mu _i-\mu _j+\theta \,(j-i)+\theta )_{\mu _j-\lambda _{j+1}}}{(\mu _i-\mu _j+\theta \,(j-i)+1)_{\mu _j-\lambda _{j+1}}}\nonumber \\&\cdot \frac{(\lambda _i-\mu _j+\theta \,(j-i)+1)_{\mu _j-\lambda _{j+1}}}{(\lambda _i-\mu _j+\theta \,(j-i)+\theta )_{\mu _j-\lambda _{j+1}}}, \end{aligned}$$
(2.4)

where \(k\) is any integer satisfying \(\ell (\lambda )\le k\) and we have used the Pochhammer symbol notation

$$\begin{aligned} (b)_n= b\,(b+1)\cdots (b+n-1). \end{aligned}$$

When \(\mu \) differs from \(\lambda \) by one box, \(\lambda =\mu \sqcup (i,j)\), the formula can be simplified to read in terms of \(\widetilde{J}_{\lambda /\mu }\):

$$\begin{aligned} \widetilde{J}_{\lambda /\mu }(\mathfrak {a}^1) = \mathfrak {a}\,\theta \prod _{l=1}^{i-1} \frac{a(l,j;\mu )+\theta \,(i-l+1)}{a(l,j;\mu )+\theta \,(i-l)} \cdot \frac{a(l,j;\mu )+1+\theta \,(i-l-1)}{a(l,j;\mu )+1+\theta \,(i-l)}. \end{aligned}$$
(2.5)

Note that the arm lengths in the latter formula are computed with respect to the (smaller) diagram \(\mu \).

Proof

The evaluation of (2.4) is known as the branching rule for Jack polynomials and is also a limit of a similar rule for Macdonald polynomials, see e.g. [31, (7.14’), Section VII, Chapter VI] or [36, (2.3)]. The formula (2.5) is obtained from (2.4) using (2.2). However, this computation is quite involved and we also provide an alternative way: formulas [31, (7.13), (7.14), Chapter VI] relate the skew Macdonald polynomials to certain functions \(\varphi _{\lambda /\mu }\). Further, formulas [31, (6.20),(6.24), Chapter VI] give explicit expressions for \(\varphi _{\lambda /\mu }\), and [31, Section 10, Chapter VI] explains that (skew) Jack polynomials are obtained from (skew) Macdonald polynomials parametrized by pairs \((q,t)\) by the limit transition \(q\rightarrow 1\), \(t=q^{\theta }\). This limit in the expression for \(\varphi _{\lambda /\mu }\) of [31, (6.24), Chapter VI] gives (2.5). \(\square \)

We also need the following two summation formulas for Jack polynomials.

Proposition 2.5

Take two specializations \(\rho _1\), \(\rho _2\) such that the series \(\sum _{k=1}^{\infty } \frac{p_k(\rho _1)\,p_k(\rho _2)}{k}\) is absolutely convergent, and define

$$\begin{aligned} H_{\theta }(\rho _1;\rho _2)=\exp \bigg (\sum _{k=1}^{\infty } \frac{\theta }{k}\,p_k(\rho _1)\,p_k(\rho _2)\bigg ). \end{aligned}$$

Then

$$\begin{aligned} \sum _{\lambda \in \mathbb {Y}} J_\lambda (\rho _1)\,\widetilde{J}_\lambda (\rho _2)=H_{\theta }(\rho _1;\rho _2), \end{aligned}$$
(2.6)

and more generally for any \(\nu ,\kappa \in \mathbb {Y}\)

$$\begin{aligned} \sum _{\lambda \in \mathbb {Y}} J_{\lambda /\nu }(\rho _1)\,\widetilde{J}_{\lambda /\kappa }(\rho _2)=H_{\theta }(\rho _1;\rho _2) \sum _{\mu \in \mathbb {Y}} J_{\kappa /\mu }(\rho _1)\,\widetilde{J}_{\nu /\mu }(\rho _2). \end{aligned}$$
(2.7)

Proof

(2.6) is the specialized version of a Cauchy-type identity for Jack polynomials, see e.g. [31, (10.4), Section 10, Chapter VI]. The latter is also a \((q,t)\rightarrow (1,1)\) limit of a similar identity for Macdonald polynomials [31, (4.13), Section 4, Chapter VI], as is explained in [31, Section 10, Chapter VI]. Similarly, (2.7) is the specialized version of the limit of a skew-Cauchy identity for Macdonald polynomials, see e.g. [31, Exercise 6, Section 7, Chapter VI]. \(\square \)

2.2 Probability measures related to Jack polynomials

We start with the definition of Jack probability measures which is based on (2.6).

Definition 2.6

Given two Jack-positive specializations \(\rho _1\) and \(\rho _2\) such that the series \( \sum _{k=1}^{\infty } \frac{p_k(\rho _1)\,p_k(\rho _2)}{k} \) is absolutely convergent, the Jack probability measure \(\mathcal {J}_{\rho _1;\rho _2}\) on \(\mathbb {Y}\) is defined through

$$\begin{aligned} \mathcal {J}_{\rho _1;\rho _2}(\lambda ) = \dfrac{J_\lambda (\rho _1)\,\widetilde{J}_\lambda (\rho _2)}{H_\theta (\rho _1;\rho _2)}\,, \end{aligned}$$
(2.8)

with the normalization constant being given by

$$\begin{aligned} H_{\theta }(\rho _1;\rho _2)=\exp \Big (\sum _{k=1}^{\infty } \frac{\theta }{k}\,p_k(\rho _1)\,p_k(\rho _2)\Big ). \end{aligned}$$

Remark 2.7

The construction of probability measures via specializations of symmetric polynomials was originally suggested by Okounkov in the context of Schur measures [37]. Recently, similar constructions for more general polynomials have led to many interesting results starting from the paper [7] by Borodin and Corwin. We refer to [8, Introduction] for the chart of probabilistic objects which are linked to various degenerations of Macdonald polynomials.

The following statement is a corollary of Propositions 2.2, 2.3 and formula (2.2).

Proposition 2.8

Take specializations \(1^N\) and \(\mathfrak {r}_s\) of Propositions 2.2 and 2.3, respectively. Then \(\mathcal {J}_{1^N;\mathfrak {r}_s}(\lambda )\) vanishes unless \(\lambda \in \mathbb {Y}^N\), and in the latter case we have

$$\begin{aligned} \mathcal {J}_{1^N;\mathfrak {r}_s}(\lambda )=\exp \big (-\theta s N\big )\,s^{|\lambda |}\,\theta ^{|\lambda |}\, \prod _{\square \in \lambda } \frac{N\theta +a'(\square )-\theta \, l'(\square )}{(a(\square )+\theta \,l(\square ) +\theta ) (a(\square )+\theta \, l(\square )+1)}. \end{aligned}$$
(2.9)

Next, we consider limits of the measures \(\mathcal {J}_{1^N;\mathfrak {r}_s}\) under a diffusive rescaling of \(s\) and \(\lambda \). Define the open Weyl chamber \({\mathcal {W}^N}=\{y\in \mathbb {R}^N:y_1< y_2 <\cdots < y_N\}\) and let \(\overline{\mathcal {W}^N}\) be its closure.

Proposition 2.9

Fix some \(N\in \mathbb {N}\). Then under the rescaling

$$\begin{aligned} s= \varepsilon ^{-1}\,\frac{t}{\theta }, \quad \lambda _i= \varepsilon ^{-1}\,t+ \varepsilon ^{-1/2}\,y_{N+1-i},\quad i=1,2,\ldots ,N, \end{aligned}$$

the measures \(\mathcal {J}_{1^N;\mathfrak {r}_s}\) converge weakly in the limit \(\varepsilon \rightarrow 0\) to the probability measure with density

$$\begin{aligned} \frac{1}{Z}\,\prod _{i<j} \left( y_j-y_i\right) ^{2\theta }\, \prod _{i=1}^N \exp \left( - \frac{y_i^2}{2t}\right) \end{aligned}$$
(2.10)

on the closed Weyl chamber \(\overline{\mathcal {W}^N}\) where

$$\begin{aligned} Z=t^{\theta \frac{N(N-1)}{2}+\frac{N}{2}}\,(2\pi )^{N/2}\,\prod _{j=1}^{N} \frac{\Gamma (j\theta )}{\Gamma (\theta )}. \end{aligned}$$
(2.11)

Note that we have chosen the notation in such a way that the row lengths \(\lambda _i\) are non-increasing, while the continuous coordinates \(y_i\) are non-decreasing in \(i\).

Proof of Proposition 2.9

We start by observing that (2.10), (2.11) define a probability density, namely that the total mass of the corresponding measure is equal to one. Indeed, the computation of the normalization constant is a particular case of the Selberg integral (see [20, 32, 44]). Since \(\mathcal {J}_{1^N;\mathfrak {r}_s}\) is also a probability measure, it suffices to prove that as \(\varepsilon \rightarrow 0\)

$$\begin{aligned} \mathcal {J}_{1^N;\mathfrak {r}_s}(\lambda )=\varepsilon ^{N/2}\,\frac{1}{Z}\,\prod _{i<j} (y_j-y_i)^{2\theta }\, \prod _{i=1}^N \exp \left( -\frac{y_i^2}{2t}\right) \big (1+o(1)\big ) \end{aligned}$$

with the error term \(o(1)\) being uniformly small on compact subsets of \({\mathcal {W}^N}\). The product over boxes in the first row of \(\lambda \) in (2.9) is (with the convention \(\lambda _{N+1}=0\))

$$\begin{aligned}&\prod _{i=1}^{\lambda _1} (N\theta +i-1)\,\prod _{i=1}^{N} \,\prod _{j=\lambda _{i+1}+1}^{\lambda _i} \frac{1}{(\lambda _1-j+\theta \,(i-1) +\theta )(\lambda _1-j+\theta \,(i-1)+1)} \\&\quad =\frac{\Gamma (N\theta +\lambda _1)}{\Gamma (N\theta )} \prod _{i=1}^{N} \frac{\Gamma (\lambda _1-\lambda _{i}+i \theta )}{\Gamma (\lambda _1-\lambda _{i+1}+i\theta )}\, \frac{\Gamma (\lambda _1-\lambda _{i}+i \theta +1-\theta )}{\Gamma (\lambda _1-\lambda _{i+1}+i\theta +1-\theta )} \\&\quad =\frac{\Gamma (\theta ) }{\Gamma (N\theta )\,\Gamma ((N-1)\theta +\lambda _1+1)}\, \prod _{i=2}^N \frac{\Gamma (\lambda _1-\lambda _i+i \theta )}{\Gamma (\lambda _1-\lambda _i+(i-1)\,\theta )}\nonumber \\&\qquad \times \frac{\Gamma (\lambda _1-\lambda _{i}+(i-1)\,\theta +1)}{\Gamma (\lambda _1-\lambda _{i}+(i-2)\,\theta +1)} \\&\quad \sim \frac{\Gamma (\theta ) }{\Gamma (N\theta ) \Gamma ((N-1)\,\theta +\lambda _1+1)} \prod _{i=1}^{N-1} (\varepsilon ^{-1/2}(y_N-y_i))^{2\theta } \end{aligned}$$

where we have written \(A(\varepsilon )\sim B(\varepsilon )\) for \(\lim _{\varepsilon \rightarrow 0} \frac{A(\varepsilon )}{B(\varepsilon )}=1\). Further,

$$\begin{aligned}&\frac{e^{-{t}{\varepsilon ^{-1}}}\,t^{\lambda _1}{\varepsilon ^{-\lambda _1}}}{\Gamma ((N-1)\theta +\lambda _1+1)} \sim e^{-{t}{\varepsilon ^{-1}}} \Big ({t}{\varepsilon ^{-1}}\Big )^{{t}{\varepsilon ^{-1}}+{y_N}{\varepsilon ^{-1/2}}+1/2} \\&\quad \times \frac{1}{\sqrt{2\pi }}\left( \dfrac{(N-1)\theta +{t}{\varepsilon ^{-1}}+{y_N}{\varepsilon ^{-1/2}}+1}{e}\right) ^{-(N-1)\theta -{t}{\varepsilon ^{-1}}-{y_N}{\varepsilon ^{-1/2}}-1} \\&\quad \sim \frac{\big ({t}{\varepsilon ^{-1}}\big )^{1/2-(N-1)\theta -1}}{\sqrt{2\pi }}\,e^{(N-1)\theta +1+{y_N}{\varepsilon ^{-1/2}}} \\&\quad \times \left( 1+\varepsilon ^{1/2}\,\frac{y_N}{t}+\varepsilon \,\frac{(N-1)\theta +1}{t}\right) ^ {-(N-1)\theta -{t}{\varepsilon ^{-1}}-{y_N}{\varepsilon ^{-1/2}}-1} \\&\quad \sim \frac{\big ({t}{\varepsilon ^{-1}}\big )^{1/2-(N-1)\theta +\theta }}{\sqrt{2\pi }} e^{(N-1)\theta +1+{y_N}{\varepsilon ^{-1/2}}} \\&\quad \times \exp \Big (\Big (\varepsilon ^{1/2}\,\frac{y_N}{t} +\varepsilon \,\frac{(N-1)\theta +1}{t} -\varepsilon \frac{y_N^2}{2\,t^2}\Big )\Big (-{t}{\varepsilon ^{-1}}-{y_N}{\varepsilon ^{-1/2}}\Big )\Big ) \\&\quad = \varepsilon ^{1/2+(N-1)\theta }\,\frac{ t^{-1/2-(N-1)\theta }}{\sqrt{2\pi }}\, \exp \left( -\frac{y_N^2}{2\,t}\right) . \end{aligned}$$

Therefore, the factors coming from the first row of \(\lambda \) in (2.9) are asymptotically given by

$$\begin{aligned} \frac{\Gamma (\theta ) }{\Gamma (N\theta )}\,\varepsilon ^{1/2}\,\frac{t^{-1/2-(N-1)\theta }}{\sqrt{2\pi }} \,\exp \left( -\frac{y_N^2}{2\,t}\right) \,\prod _{i=1}^{N-1} (y_N-y_i)^{2\theta }. \end{aligned}$$

Performing similar computations for the other rows we get

$$\begin{aligned} \mathcal {J}_{1^N;\mathfrak {r}_s}(\lambda )=\varepsilon ^{N/2}\,\prod _{j=1}^N \frac{\Gamma (\theta )\,t^{-1/2-(j-1)\theta } }{ \Gamma (j\theta ) \sqrt{2\pi }} \,\prod _{i<j} (y_j-y_i)^{2\theta }\, \prod _{i=1}^N \exp \left( -\frac{y_i^2}{2\,t}\right) \big (1+o(1)\big ) \end{aligned}$$

which finishes the proof. \(\square \)

We now proceed to the definition of probability measures on multilevel structures associated with Jack polynomials. Let \(\mathbb {GT}^{(N)}\) denote the set of sequences of Young diagrams \(\lambda ^1\prec \lambda ^2\prec \cdots \prec \lambda ^N\) such that \(\ell (\lambda ^i)\le i\) for every \(i\) and the Young diagrams interlace, that is,

$$\begin{aligned} \lambda ^{i+1}_1\ge \lambda ^i_1\ge \lambda ^{i+1}_2\ge \dots \ge \lambda ^{i}_i\ge \lambda ^{i+1}_{i+1},\quad i=1,2,\ldots ,N-1. \end{aligned}$$

The following definition is motivated by the property (2.3) of skew Jack polynomials.

Definition 2.10

A probability distribution \(P\) on arrays \((\lambda ^1\prec \dots \prec \lambda ^N)\in \mathbb {GT}^{(N)}\) is called a Jack–Gibbs distribution, if for any \(\mu \in \mathbb {Y}^N\) such that \(P(\lambda ^N=\mu )>0\) the conditional distribution of \(\lambda ^1\dots ,\lambda ^{N-1}\) given \(\lambda ^N=\mu \) is

$$\begin{aligned} P\big (\lambda ^1,\dots ,\lambda ^{N-1}\mid \lambda ^N=\mu \big ) =\frac{J_{\mu /\lambda ^{N-1}}(1^1)\,J_{\lambda ^{N-1}/\lambda ^{N-2}}(1^1) \cdots J_{\lambda ^2/\lambda ^1}(1^1)\,J_{\lambda ^1}(1^1)}{J_{\mu }(1^N)}. \end{aligned}$$
(2.12)

Remark 2.11

When \(\theta =1\), (2.12) implies that the conditional distribution of \(\lambda ^1,\dots ,\lambda ^{N-1}\) is uniform on the polytope defined by the interlacing conditions.

One important example of a Jack–Gibbs measure is given by the following definition, see [7, 8, 11, 13] for a review of related constructions in the context of Schur and, more generally, Macdonald polynomials.

Definition 2.12

Given a Jack-positive specialization \(\rho \) such that \(\sum \nolimits _{k=1}^{\infty } \frac{p_k(\rho )}{k}<\infty \) we define the ascending Jack process \(\mathcal {J}^{asc}_{\rho ;N}\) as the probability measure on \(\mathbb {GT}^{(N)}\) given by

$$\begin{aligned} \mathcal {J}^{asc}_{\rho ;N}(\lambda ^1,\lambda ^2,\ldots ,\lambda ^N) =\dfrac{\widetilde{J}_{\lambda ^N}(\rho )\,J_{\lambda ^N/\lambda ^{N-1}}(1^1)\cdots J_{\lambda ^2/\lambda ^1}(1^1)\,J_{\lambda ^1}(1^1)}{H_\theta (\rho ;1^N)}. \end{aligned}$$
(2.13)

Remark 2.13

If \(\rho \) is the empty specialization, then \(\mathcal {J}^{asc}_{\rho ;N}\) assigns mass \(1\) to the single element of \(\mathbb {GT}^{(N)}\) such that \(\lambda ^i_j=0\), \(1\le i\le j\le N\).

Lemma 2.14

The formula (2.13) defines a Jack–Gibbs probability distribution. Furthermore, for any \(1\le k \le N\) the projection of \(\mathcal {J}^{asc}_{\rho ;N}\) to \((\lambda ^1,\dots ,\lambda ^k)\) is \(\mathcal {J}^{asc}_{\rho ;k}\), and the projection of \(\mathcal {J}^{asc}_{\rho ;N}\) to \(\lambda ^k\) is \(\mathcal {J}_{\rho ;1^k}\).

Proof

The formula (2.4) yields that \(J_{\lambda /\mu }(1^1)\) vanishes unless \(\mu \prec \lambda \), thus, the support of \(\mathcal {J}^{asc}_{\rho ;N}\) is indeed a subset of \(\mathbb {GT}^{(N)}\). Now we can sum (2.13) sequentially over \(\lambda ^1\), ..., \(\lambda ^{N-1}\) using (2.3). This proves that the projection of \(\mathcal {J}^{asc}_{\rho ;N}\) to \(\lambda ^N\) is \(\mathcal {J}_{\rho ;1^N}\). Thus, since \(\mathcal {J}_{\rho ;1^N}\) is a probability measure, so is \(\mathcal {J}^{asc}_{\rho ;N}\). Further, dividing \(\mathcal {J}^{asc}_{\rho ;N}(\lambda ^1,\dots ,\lambda ^N)\) by \(\mathcal {J}_{\rho ;1^N}(\lambda ^N)\) we get the conditional distribution (2.12), which proves that (2.13) is Jack–Gibbs.

To compute the projection onto \((\lambda ^1,\dots ,\lambda ^k)\) we sum (2.13) sequentially over \(\lambda ^N\),..., \(\lambda ^{k+1}\) using (2.7) and arrive at \(\mathcal {J}^{asc}_{\rho ;k}\). In order to further compute the projection to \(\lambda ^k\) we also sum over \(\lambda ^1,\dots ,\lambda ^{k-1}\) using (2.3) and get \(\mathcal {J}_{\rho ;1^k}\). \(\square \)

Define the (open) Gelfand–Tsetlin cone via

$$\begin{aligned} {\mathcal {G}^N}=\left\{ y\in \mathbb {R}^{N(N+1)/2}:y^{j+1}_i< y^j_i< y^{j+1}_{i+1},\;1\le i\le j\le N-1\right\} \,, \end{aligned}$$

and let \(\overline{\mathcal {G}^N}\) be its closure. A natural continuous analogue of Definition 2.10 is:

Definition 2.15

An absolutely continuous (with respect to the Lebesgue measure) probability distribution \(P\) on arrays \(y\in {\mathcal {G}^N}\) is called \(\theta \)–Gibbs, if the conditional distribution of the first \(N-1\) levels \(y^k_i\), \(1\le i \le k\le N-1\) given the \(N\)–th level \(y^N_1,\dots ,y^N_N\) has density

$$\begin{aligned}&P\left( y^k_i,\ 1\le i \le k\le N-1 \mid y^N_1,\dots , y^N_N\right) \nonumber \\&\quad =\prod _{k=2}^N\bigg ( \frac{\Gamma (k\theta )}{\Gamma (\theta )^k} \,\prod _{1\le i<m\le k-1} (y^{k-1}_m-y^{k-1}_i) \, \prod _{i=1}^{k-1} \, \prod _{j=1}^k \big |y^k_j-y^{k-1}_i\big |^{\theta -1} \, \prod _{1\le j<n\le k} (y^k_n-y^k_j)^{1-2\theta }\bigg ). \end{aligned}$$
(2.14)

To see that (2.14) indeed defines a probability measure, one can use a version of the Dixon-Anderson identity (see [2, 17], [20, Chapter 4]), which reads

$$\begin{aligned}&\int \int \ldots \int \prod _{1\le i<j \le m} |u_i-u_j| \,\prod _{i=1}^m\, \prod _{j=1}^{m+1} |u_i-v_j|^{\theta -1}\,\mathrm{d}u_1\,\mathrm{d}u_2\,\ldots \,\mathrm{d}u_m \nonumber \\&\quad = \frac{\Gamma (\theta )^{m+1}}{\Gamma ((m+1)\theta )}\,\prod _{1\le i <j \le m+1} |v_i-v_j|^{2\theta -1}\,, \end{aligned}$$
(2.15)

where the integration is performed over the domain

$$\begin{aligned} v_1<u_1<v_2<u_2<v_3<\cdots <u_m<v_{m+1}. \end{aligned}$$

Applying (2.15) sequentially to integrate the density in (2.14) with respect to the variable \(y^1_1\), then the variables \(y^2_1\), \(y^2_2\) and so on, we eventually arrive at \(1\).
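For \(m=1\) the identity (2.15) reduces to a Beta-type integral and can be checked numerically in a couple of lines (a sanity check of ours, with arbitrarily chosen values of \(\theta \), \(v_1\), \(v_2\)):

```python
from math import gamma
from scipy.integrate import quad

theta, v1, v2 = 1.5, -0.3, 1.1
# Left-hand side of (2.15) with m = 1: the product over 1 <= i < j <= 1 is empty.
lhs, _ = quad(lambda u: abs(u - v1) ** (theta - 1) * abs(u - v2) ** (theta - 1), v1, v2)
rhs = gamma(theta) ** 2 / gamma(2 * theta) * abs(v1 - v2) ** (2 * theta - 1)
print(lhs, rhs)   # the two numbers agree up to quadrature error
```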

Proposition 2.16

Let \(P(q)\), \(q=1,2,\dots \) be a sequence of Jack–Gibbs measures on \(\mathbb {GT}^{(N)}\), and for each \(q\) let \(\{\lambda ^k_i(q)\}\) be a \(P(q)\)–distributed random element of \(\mathbb {GT}^{(N)}\). Suppose that there exist two sequences \(m(q)\) and \(b(q)\) such that \(\lim _{q\rightarrow \infty } b(q)=\infty \) and as \(q\rightarrow \infty \) the \(N\)–dimensional vector

$$\begin{aligned} \left( \frac{\lambda ^N_N-m(q)}{b(q)}, \frac{\lambda ^N_{N-1}-m(q)}{b(q)},\dots , \frac{\lambda ^N_1-m(q)}{b(q)} \right) \end{aligned}$$

converges weakly to a random vector whose distribution is absolutely continuous with respect to the Lebesgue measure. Then the whole \(N(N+1)/2\)–dimensional vector

$$\begin{aligned} \left( \frac{\lambda ^k_i-m(q)}{b(q)}\right) ,\quad 1\le i\le k\le N \end{aligned}$$

also converges weakly and its limiting distribution is \(\theta \)–Gibbs.

Proof

Since we deal with probability distributions converging to another probability distribution, it suffices to check that the quantity in (2.12) (written in the rescaled and reordered coordinates \(y^j_i=\frac{\lambda _{j+1-i}^j-m(q)}{b(q)}\)) converges to (2.14), uniformly on compact subsets of \({\mathcal {G}^N}\).

\(J_{\lambda /\mu }(1^1)\) can be evaluated according to the identity (2.4). Thus, with the notation \( f(\alpha )=\frac{\Gamma (\alpha +1)}{\Gamma (\alpha +\theta )}\) we have

$$\begin{aligned} J_{\lambda ^{k}/\lambda ^{k-1}}(1^1)= \prod _{1\le i\le j\le k-1} \frac{f(\lambda ^{k-1}_i-\lambda ^{k-1}_j+\theta \,(j-i))\,f(\lambda ^k_i-\lambda ^k_{j+1}+\theta \,(j-i))}{ f(\lambda ^{k-1}_i-\lambda ^k_{j+1}+\theta \,(j-i))\,f(\lambda ^k_i-\lambda ^{k-1}_j+\theta \,(j-i))}. \end{aligned}$$

The asymptotics \(f(\alpha )\sim \alpha ^{1-\theta }\) as \(\alpha \rightarrow \infty \) shows

$$\begin{aligned}&J_{\lambda ^{k}/\lambda ^{k-1}}(1^1)\sim \frac{b(q)^{(1-k)(1-\theta )}}{\Gamma (\theta )^{k-1}}\,\\&\quad \times \prod _{1\le i< j\le k-1} (y^{k-1}_j-y^{k-1}_i)^{1-\theta } \, \prod _{1\le i< j\le k} (y^{k}_j-y^{k}_i)^{1-\theta } \, \prod _{i=1}^{k-1}\prod _{j=1}^k |y^{k-1}_i-y^k_j|^{\theta -1}. \end{aligned}$$

It remains to analyze the asymptotics of \(J_{\mu }(1^N)\) in the denominator of (2.12). To this end, we use the expression for \(J_{\mu }(1^N)\) in Proposition 2.2 and recall that \(\mu \) can be identified with \(\lambda ^N\) to find that the product over the boxes in the first row of \(\mu \) asymptotically (in the limit \(q\rightarrow \infty \)) behaves as

$$\begin{aligned}&\prod _{i=1}^{\mu _1} (N\theta +i-1) \, \prod _{i=1}^{N} \, \prod _{j=\mu _{i+1}+1}^{\mu _i} \frac{1}{\mu _1-j+\theta \,(i-1) +\theta }\\&\quad = \frac{\Gamma (N\theta +\lambda _1)}{\Gamma (N\theta )}\, \prod _{i=1}^{N} \frac{\Gamma (\lambda _1-\lambda _{i}+i \theta )}{\Gamma (\lambda _1-\lambda _{i+1}+i\theta )}\\&\quad =\prod _{i=2}^N \Big ( \frac{\Gamma (\lambda _1-\lambda _i+i \theta )}{\Gamma (\lambda _1-\lambda _i+(i-1)\,\theta )} \Big ) \sim \prod _{i=1}^{N-1} \big (b(q)(y_N^N-y_i^N)\big )^{\theta }. \end{aligned}$$

Performing the same computations for the other rows we find that, as \(q\rightarrow \infty \),

$$\begin{aligned} \frac{1}{J_{\mu }(1^N)}\sim b(q)^{-\theta N(N-1)/2} \prod _{1\le i<j\le N} (y_j^N-y_i^N)^{-\theta }. \end{aligned}$$

One obtains the desired convergence to (2.14) by putting together the asymptotics of the factors in (2.12) and multiplying the result by the term \(b(q)^{N(N-1)/2}\) coming from the space rescaling. \(\square \)

As a combination of Propositions 2.9 and 2.16 we obtain the following statement.

Corollary 2.17

Fix some \(N\in \mathbb {N}\). Then, under the rescaling

$$\begin{aligned} s=\varepsilon ^{-1}\,\frac{t}{\theta },\quad \lambda _i^j=\varepsilon ^{-1}\,t+\varepsilon ^{-1/2}\,y_{j+1-i}^j,\quad 1\le i\le j\le N\,, \end{aligned}$$

the measures \(\mathcal {J}^{asc}_{\mathfrak {r}_s;N}\) converge weakly in the limit \(\varepsilon \rightarrow 0\) to the probability measure on the Gelfand–Tsetlin cone \({\mathcal {G}^N}\) with density

$$\begin{aligned}&\frac{1}{Z}\, \prod _{i<j} \left( y_j^N-y_i^N\right) \, \prod _{i=1}^N \exp \left( - \frac{(y_i^N)^2}{2\,t}\right) \, \prod _{n=1}^{N-1} \, \prod _{1\le i<j\le n} \left( y_j^n-y_i^n\right) ^{2-2\theta } \, \nonumber \\&\quad \times \prod _{a=1}^n \, \prod _{b=1}^{n+1} \big |y^n_a -y^{n+1}_b\big |^{\theta -1} \end{aligned}$$
(2.16)

where

$$\begin{aligned} Z=t^{\theta N(N-1) +N/2}\,(2\pi )^{N/2}\,\prod _{j=1}^{N} \frac{(\Gamma (j\theta ))^2}{\Gamma (\theta )^{j+1}}. \end{aligned}$$
(2.17)

Note that the probability measure of (2.16) is precisely the Hermite \(\beta =2\theta \) corners process with variance \(t\) of Definition 1.1.

Remark 2.18

When \(\theta =1\), the factors \((y_j^n-y_i^n)^{2-2\theta }\) and \(|y^n_a-y^{n+1}_b|^{\theta -1}\) in (2.16) disappear, and the conditional distribution of \(y^1, y^2,\dots ,y^{N-1}\) given \(y^N\) becomes uniform on the polytope defined by the interlacing conditions. This distribution is known to be that of eigenvalues of corners of a random Gaussian \(N\times N\) Hermitian matrix sampled from the Gaussian Unitary Ensemble (see e.g. [4]). Similarly, for \(\theta =1/2\) and \(\theta =2\) one gets the joint distribution of the eigenvalues of corners of the Gaussian Orthogonal Ensemble and the Gaussian Symplectic Ensemble, respectively (see e.g. [33], [35, Section 4]).

2.3 Dynamics related to Jack polynomials

We are now ready to construct the stochastic dynamics related to Jack polynomials. Similar constructions for Schur, \(q\)-Whittaker and Macdonald polynomials can be found in [6, 7, 9, 11].

Definition 2.19

Given two specializations \(\rho ,\rho '\) define their union \((\rho ,\rho ')\) through the formulas

$$\begin{aligned} p_k(\rho ,\rho ')=p_k(\rho )+p_k(\rho '),\quad k\ge 1 \end{aligned}$$

where \(p_k\), \(k\ge 1\) are the Newton power sums as before.

Let \(\rho \) and \(\rho '\) be two Jack-positive specializations such that \(H_\theta (\rho ;\rho ')<\infty \). Define matrices \(p^{\uparrow }_{\lambda \rightarrow \mu }\) and \(p^{\downarrow }_{\lambda \rightarrow \mu }\) with rows and columns indexed by Young diagrams as follows:

$$\begin{aligned} p^{\uparrow }_{\lambda \rightarrow \mu }(\rho ;\rho ')= \frac{1}{H_\theta (\rho ;\rho ')}\, \frac{J_{\mu }(\rho )}{J_\lambda (\rho )}\,\widetilde{J}_{\mu /\lambda }(\rho '), \quad \lambda , \mu \in \mathbb {Y}, \quad J_\lambda (\rho )\ne 0, \end{aligned}$$
(2.18)
$$\begin{aligned} p^{\downarrow }_{\lambda \rightarrow \mu }(\rho ;\rho ')= \frac{J_{\mu }(\rho )}{J_\lambda (\rho ,\rho ')}\, J_{\lambda /\mu }(\rho '), \quad \quad \, \lambda ,\mu \in \mathbb {Y}, \quad J_\lambda (\rho ,\rho ')\ne 0. \end{aligned}$$
(2.19)

The next three propositions follow from (2.3), (2.6), (2.7) (see also [6, 7, 11] for analogous results in the cases of Schur, \(q\)-Whittaker and Macdonald polynomials).

Proposition 2.20

The matrices \(p^{\uparrow }_{\lambda \rightarrow \mu }\) and \(p^{\downarrow }_{\lambda \rightarrow \mu }\) are stochastic, that is, all matrix elements are non-negative, and for every \(\lambda \in \mathbb {Y}\) we have

$$\begin{aligned}&\sum \limits _{\mu \in \mathbb {Y}} p^{\uparrow }_{\lambda \rightarrow \mu }(\rho ,\rho ')=1,\quad \text { if }\, J_\lambda (\rho )\ne 0, \\&\sum \limits _{\mu \in \mathbb {Y}} p^{\downarrow }_{\lambda \rightarrow \mu }(\rho ,\rho ')=1, \quad \text { if } \, J_\lambda (\rho ,\rho ')\ne 0. \end{aligned}$$

Proposition 2.21

For any \(\mu \in \mathbb {Y}\) and any Jack-positive specializations \(\rho _1,\rho _2,\rho _3\) we have

$$\begin{aligned}&\sum _{\lambda \in \mathbb {Y}:\,{\mathcal {J}}_{\rho _1;\rho _2}(\lambda )\ne 0} {\mathcal {J}}_{\rho _1;\rho _2}(\lambda )\, p^\uparrow _{\lambda \rightarrow \mu }(\rho _2;\rho _3)= {\mathcal {J}}_{\rho _1,\rho _3;\rho _2} (\mu ), \\&\sum _{\lambda \in \mathbb {Y}:\, {\mathcal {J}}_{\rho _1;\rho _2,\rho _3}(\lambda )\ne 0} {\mathcal {J}}_{\rho _1;\rho _2,\rho _3}(\lambda )\,p^\downarrow _{\lambda \rightarrow \mu }(\rho _2;\rho _3)={\mathcal {J}}_{\rho _1;\rho _2} (\mu ). \end{aligned}$$

Proposition 2.22

The following commutation relation on matrices \(p^\uparrow _{\lambda \rightarrow \mu }\) and \(p^\downarrow _{\lambda \rightarrow \mu }\) holds:

$$\begin{aligned} p^{\uparrow }(\rho _1,\rho _2;\rho _3)\,p^{\downarrow }(\rho _1;\rho _2) = p^{\downarrow }(\rho _1;\rho _2)\,p^{\uparrow }(\rho _1;\rho _3). \end{aligned}$$

Let \(X^N_{disc}(s)\), \(s\ge 0\) denote the continuous time Markov chain on \(\mathbb {Y}^N\) with transition probabilities given by \(p^{\uparrow }(1^N;\mathfrak {r}_s)\), \(s\ge 0\) (and arbitrary initial condition \(X^N_{disc}(0)\in \mathbb {Y}^N\)). We record the jump rates of \(X^N_{disc}\) for later use.

Proposition 2.23

The jump rates of the Markov chain \(X^N_{disc}\) on \(\mathbb {Y}^N\) are given by

$$\begin{aligned} q_{\lambda \rightarrow \mu } ={\left\{ \begin{array}{ll} \dfrac{J_{\mu }(1^N)}{J_\lambda (1^N)}\,\widetilde{J}_{\mu /\lambda }(\mathfrak {r}_1),&{} \mu =\lambda \sqcup \square ,\\ -\sum \limits _{\nu =\lambda \sqcup \square } q_{\lambda \rightarrow \nu },&{} \mu =\lambda ,\\ 0,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
(2.20)

Explicitly, for \(\mu =\lambda \sqcup (i,j)\) we have

$$\begin{aligned} q_{\lambda \rightarrow \mu }&= \frac{\prod _{\square \in \mu } \dfrac{N\theta +a'(\square )-\theta \,l'(\square )}{a(\square ;\mu )+\theta \, l(\square ;\mu ) +\theta }}{\prod _{\square \in \lambda } \dfrac{N\theta +a'(\square )-\theta \,l'(\square )}{a(\square ;\lambda )+\theta \,l(\square ;\lambda ) +\theta }} \,\theta \, \prod _{k=1}^{i-1} \frac{a(k,j;\lambda )+\theta \,(i-k+1)}{a(k,j;\lambda )+\theta \,(i-k)} \nonumber \\&\quad \times \frac{a(k,j;\lambda )+1+\theta \,(i-k-1)}{a(k,j;\lambda )+1+\theta \,(i-k)}. \end{aligned}$$
(2.21)

Remark 2.24

While the jump rates \(q_{\lambda \rightarrow \mu }\) are explicit, we are not aware of any fairly simple formulas for the transition probabilities \(p^{\uparrow }(1^N;\mathfrak {r}_s)\), \(s\ge 0\) of \(X^N_{disc}\).

Proof of Proposition 2.23

The formula (2.20) is readily obtained from the definition of the transition probabilities in (2.18). In order to get (2.21) we note that \(J_\lambda (1^N)\) and \(J_\mu (1^N)\) have been computed in Proposition 2.2. Further, observe that \(\widetilde{J}_{\mu /\lambda }\) is a symmetric polynomial of degree \(1\), thus, it is proportional to the sum of indeterminates \(p_1\). Therefore, \(\widetilde{J}_{\mu /\lambda }(1^1)=\widetilde{J}_{\mu /\lambda }(\mathfrak {r}_1)\) and we can use the formula (2.5) to evaluate it. \(\square \)

The following proposition will prove useful below.

Proposition 2.25

The process \(|X^N_{disc}|\,{:=}\,\sum _{i=1}^N (X^N_{disc})_i\) is a Poisson process with intensity \(N\theta \).

Proof

According to Proposition 2.23 the process \(|X^N_{disc}|\) increases by \(1\) with rate

$$\begin{aligned} \sum _{\mu =\lambda \sqcup \square } \dfrac{J_{\mu }(1^N)}{J_\lambda (1^N)}\, \widetilde{J}_{\mu /\lambda }(\mathfrak {r}_1). \end{aligned}$$

In order to evaluate the latter sum we use Pieri’s rule for Jack polynomials, which is a formula for the product of a Jack polynomial with the sum of indeterminates and in our case reads

$$\begin{aligned} \sum _{\mu =\lambda \sqcup \square } {J_{\mu }(1^N)}\,\widetilde{J}_{\mu /\lambda }(\mathfrak {r}_1) =\theta \,p_1(1^N)\,J_\lambda (1^N) =N\,\theta \,J_\lambda (1^N). \end{aligned}$$

The proof of Pieri’s rule (for Macdonald polynomials, with the case of Jack polynomials being given by the limit transition \(q\rightarrow 1\), \(t=q^{\theta }\)) can be found in [31, (6.24) and Section 10 in Chapter VI]. Note that in the formulas of [31] the notation \(\varphi _{\mu /\lambda }\) is used for \(\widetilde{J}_{\mu /\lambda }(\mathfrak {r}_1)=\widetilde{J}_{\mu /\lambda }(1^1)\) and the \(g_1\) there is proportional to our \(p_1\). \(\square \)
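To make the rate formula (2.21) and Proposition 2.25 concrete, here is a short sketch of ours (not part of the argument; exact rational arithmetic in Python) which evaluates \(J_\lambda (1^N)\) by the product formula visible in (2.21), computes the rates \(q_{\lambda \rightarrow \lambda \sqcup \square }\), and checks that they sum to \(N\theta \). It assumes the standard arm/leg conventions \(a(i,j;\lambda )=\lambda _i-j\), \(l(i,j;\lambda )=\lambda '_j-i\), \(a'(i,j)=j-1\), \(l'(i,j)=i-1\), which are consistent with (2.21) and Proposition 2.25.

```python
from fractions import Fraction

def conj(lam):
    """Conjugate (transposed) Young diagram; lam is a list of positive parts."""
    return [sum(1 for part in lam if part >= j) for j in range(1, lam[0] + 1)] if lam else []

def j_1N(lam, N, theta):
    """J_lambda(1^N): product over boxes of (N*theta + a' - theta*l') / (a + theta*l + theta)."""
    lamc, val = conj(lam), Fraction(1)
    for i, row in enumerate(lam, start=1):
        for j in range(1, row + 1):
            num = N * theta + (j - 1) - theta * (i - 1)            # co-arm a' = j-1, co-leg l' = i-1
            den = (row - j) + theta * (lamc[j - 1] - i) + theta    # arm a = lam_i - j, leg l = lam'_j - i
            val *= Fraction(num, den)
    return val

def jump_rate(lam, i, N, theta):
    """The rate q_{lam -> mu} of (2.21), where mu is lam with one box added in row i (1-based)."""
    lam = list(lam) + [0] * (N - len(lam))
    j = lam[i - 1] + 1                                             # the new box sits at (i, j)
    mu = lam.copy()
    mu[i - 1] += 1
    rate = j_1N([p for p in mu if p], N, theta) / j_1N([p for p in lam if p], N, theta) * theta
    for k in range(1, i):
        a = lam[k - 1] - j                                         # arm length a(k, j; lam)
        rate *= Fraction(a + theta * (i - k + 1), a + theta * (i - k))
        rate *= Fraction(a + 1 + theta * (i - k - 1), a + 1 + theta * (i - k))
    return rate

# Check of Proposition 2.25: the total jump rate out of any diagram equals N*theta.
N, theta, lam = 4, 3, [3, 3, 1]
padded = lam + [0] * (N - len(lam))
addable = [i for i in range(1, N + 1) if i == 1 or padded[i - 1] < padded[i - 2]]
print(sum(jump_rate(lam, i, N, theta) for i in addable), N * theta)  # both should print 12
```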

Proposition 2.21 implies the following statement.

Proposition 2.26

Suppose that the initial condition \(X^N_{disc}(0)\) is the empty Young diagram, that is, \(\lambda _1=\lambda _2=\cdots =\lambda _N=0\). Then, for any fixed \(s>0\), the law of \(X^N_{disc}(s)\) is given by \(\mathcal {J}_{1^N;\mathfrak {r}_s}\) which was computed explicitly in Proposition 2.8.

Our next goal is to define a stochastic dynamics on \(\mathbb {GT}^{(N)}\). The construction we use is parallel to those of [6, 7, 9–11]; it is based on an idea going back to [16], which allows one to couple the dynamics of Young diagrams of different sizes. We start from the degenerate discrete time dynamics \(\lambda ^0(n)=\emptyset \), \(n\in \mathbb {N}_0\) and construct the discrete time dynamics of \(\lambda ^1,\lambda ^2,\ldots ,\lambda ^N\) inductively. Given \(\lambda ^{k-1}(n)\), \(n\in \mathbb {N}_0\) and a Jack-positive specialization \(\rho \) we define the process \(\lambda ^k(n)\), \(n\in \mathbb {N}_0\) with a given initial condition \(\lambda ^k(0)\) satisfying \(\lambda ^{k-1}(0)\prec \lambda ^k(0)\) as follows. We let the distribution of \(\lambda ^k(n+1)\) depend only on \(\lambda ^k(n)\) and \(\lambda ^{k-1}(n+1)\) and be given by

$$\begin{aligned} \mathbb {P}(\lambda ^k(n+1)=\nu \mid \lambda ^k(n)=\lambda ,\, \lambda ^{k-1}(n+1)=\mu )=\dfrac{\widetilde{J}_{\nu /\lambda }(\rho )\,J_{\nu /\mu }(1^1)}{\sum _{\kappa \in \mathbb {Y}} \widetilde{J}_{\kappa /\lambda }(\rho )\,J_{\kappa /\mu }(1^1)}.\nonumber \\ \end{aligned}$$
(2.22)

Carrying out this procedure for \(k=1,2,\ldots ,N\) we end up with a discrete time Markov chain \(\hat{X}^{multi}_{disc}(n;\rho )\), \(n\in \mathbb {N}_0\) on \(\mathbb {GT}^{(N)}\).

Definition 2.27

Define the continuous time dynamics \(X^{multi}_{disc}(s)\), \(s\ge 0\) on \(\mathbb {GT}^{(N)}\) with an initial condition \(X^{multi}_{disc}(0)\in \mathbb {GT}^{(N)}\) as the distributional limit

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \, \hat{X}^{multi}_{disc}(\lfloor \varepsilon ^{-1}\,s \rfloor ;\mathfrak {r}_\varepsilon ) \end{aligned}$$

where all dynamics \(\hat{X}^{multi}_{disc}(\cdot ;\mathfrak {r}_\varepsilon )\) are started from the initial condition \(X^{multi}_{disc}(0)\) and the specialization \(\mathfrak {r}_\varepsilon \) is defined as in Proposition 2.3.

Remark 2.28

Alternatively, we could have started from the specialization \(\rho \) with a single \(\alpha \) parameter \(\alpha _1=\varepsilon \) and we would have arrived at the same continuous time dynamics. Analogous constructions of the continuous time dynamics in the context of Schur and Macdonald polynomials can be found in [9] and [7].

Note that when \(\lambda \prec \kappa \) the term \(\widetilde{J}_{\kappa /\lambda }(\mathfrak {r}_\varepsilon )\) is of order \(\varepsilon ^{|\kappa |-|\lambda |}\) as \(\varepsilon \rightarrow 0\). Therefore, the leading order term in the sum on the right-hand side of (2.22) comes from the choice \(\kappa =\lambda \), unless this choice violates the constraint \(\mu \prec \kappa \), in which case the leading term corresponds to taking \(\kappa =\mu \). Moreover, the first-order terms come from the choices \(\kappa =\lambda \sqcup \square \) and the resulting terms turn into the jump rates of the continuous time dynamics. Summing up, the continuous time dynamics \(X^{multi}_{disc}(s)\), \(s\ge 0\) looks as follows: given the trajectory of \(\lambda ^{k-1}\), a box \(\square \) is added to the Young diagram \(\lambda ^k\) at time \(s\) at the rate

$$\begin{aligned} q(\square ,\lambda ^k(s-),\lambda ^{k-1}(s)) =\widetilde{J}_{(\lambda ^{k}(s-)\sqcup \square )/\lambda ^{k}(s-)}(\mathfrak {r}_1)\, \dfrac{J_{(\lambda ^{k}(s-)\sqcup \square )/\lambda ^{k-1}(s)}(1^1)}{J_{\lambda ^{k}(s-)/\lambda ^{k-1}(s)}(1^1)}. \end{aligned}$$
(2.23)

In particular, the latter jump rates incorporate the following push interaction: if the coordinates of \(\lambda ^{k-1}\) evolve in a way which violates the interlacing condition \(\lambda ^{k-1}\prec \lambda ^{k}\), then the appropriate coordinate of \(\lambda ^k\) is pushed in the sense that a box is added immediately to the Young diagram \(\lambda ^k\) to restore the interlacing. The factors on the right-hand side of (2.23) are explicit and given by (2.4) and (2.5). Simulations of the continuous time dynamics for \(\theta =0.5\) and \(\theta =2\) can be found at [25].
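The push interaction (and the complementary blocking coming from the vanishing of \(J_{(\lambda ^k\sqcup \square )/\lambda ^{k-1}}(1^1)\) when the interlacing would fail) can be illustrated by a short sketch. This is our schematic illustration of the combinatorial mechanism only: the function `voluntary_rate` below is a hypothetical placeholder and does not implement the actual Jack rates (2.23).

```python
import random

def interlace(lower, upper):
    """Check lower ≺ upper, i.e. upper_i >= lower_i >= upper_{i+1} for all i."""
    return all(upper[i] >= lower[i] >= upper[i + 1] for i in range(len(lower)))

def voluntary_rate(k, i, levels, theta):
    """Hypothetical placeholder rate; the actual rates are the explicit expressions (2.23)."""
    return theta

def allowed_rows(levels, k):
    """Rows of level k (0-based) where a box may be added without breaking the shape of the
    diagram or the interlacing with the level below (the 'blocking' part of the rates)."""
    rows = []
    for i in range(len(levels[k])):
        trial = levels[k].copy()
        trial[i] += 1
        ok_shape = (i == 0) or trial[i] <= trial[i - 1]
        ok_below = (k == 0) or interlace(levels[k - 1], trial)
        if ok_shape and ok_below:
            rows.append(i)
    return rows

def push(levels, k, i):
    """Add a box in row i of level k; if the interlacing with level k+1 is now violated,
    level k+1 immediately receives a box in the same row (the 'push'), and so on upwards."""
    levels[k][i] += 1
    if k + 1 < len(levels) and levels[k][i] > levels[k + 1][i]:
        push(levels, k + 1, i)

def one_step(levels, theta):
    """One continuous-time step: exponential clocks for all allowed voluntary moves."""
    moves = [(k, i, voluntary_rate(k, i, levels, theta))
             for k in range(len(levels)) for i in allowed_rows(levels, k)]
    dt = random.expovariate(sum(r for _, _, r in moves))
    k, i, _ = random.choices(moves, weights=[r for _, _, r in moves])[0]
    push(levels, k, i)
    return dt

# A Gelfand-Tsetlin pattern with N levels, started from the empty configuration.
N = 4
levels = [[0] * (k + 1) for k in range(N)]
t = sum(one_step(levels, theta=2) for _ in range(20))
print(t, levels)
```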

The following statement is based on the results of Propositions 2.20, 2.21, 2.22 and can be proved by the argument of [9, Sections 2.2, 2.3], see also [6, 7, 11].

Proposition 2.29

Suppose that \(X^{multi}_{disc}(s)\), \(s\ge 0\) is started from a random initial condition with a Jack–Gibbs distribution. Then:

  • the restriction of \(X^{multi}_{disc}(s)\) to level \(N\) coincides with \(X^{N}_{disc}(s)\), \(s\ge 0\) started from the restriction to level \(N\) of the initial condition \(X^{multi}_{disc}(0)\);

  • the law of \(X^{multi}_{disc}(s)\) at a fixed time \(s>0\) is a Jack–Gibbs distribution. Moreover, if \(X^{multi}_{disc}(0)\) has law \(\mathcal {J}^{asc}_{\rho ;N}\), then \(X^{multi}_{disc}(s)\) has law \(\mathcal {J}^{asc}_{\rho ,\mathfrak {r}_s;N}\).

Remark 2.30

In fact, there is a way to generalize Proposition 2.29 to a statement describing the restriction of our multilevel dynamics started from Jack–Gibbs initial conditions to any monotone space-time path (meaning that we look at level \(N\) for some time, then at level \(N-1\) and so on). We refer the reader to [9, Proposition 2.5] for a precise statement in the setting of multilevel dynamics based on Schur polynomials.

3 Convergence to Dyson Brownian motion

The goal of this section is to prove that the Markov chain \(X^N_{disc}(s)\) converges in the diffusive scaling limit to a Dyson Brownian Motion.

To start with, we recall the existence and uniqueness result for Dyson Brownian motions with \(\beta >0\) (see e.g. [3, Proposition 4.3.5] for the case \(\beta \ge 1\) and [14, Theorem 3.1] for the case \(0<\beta <1\)).

Proposition 3.1

For any \(N\in \mathbb {N}\) and \(\beta >0\), the system of SDEs

$$\begin{aligned} \mathrm{d}X_i(t)=\frac{\beta }{2}\,\sum _{j\ne i} \frac{1}{X_i(t)-X_j(t)}\,\mathrm{d}t + \mathrm{d}W_i(t), \end{aligned}$$
(3.1)

\(i=1,2,\ldots ,N\), with \(W_1,W_2,\ldots ,W_N\) being independent standard Brownian motions, has a unique strong solution taking values in the Weyl chamber \( \overline{\mathcal {W}^N}\) for any initial condition \(X(0)\in \overline{\mathcal {W}^N}\). Moreover, for all initial conditions, the stopping time

$$\begin{aligned} \tau \,{:=}\,\inf \{t>0:X_i(t)=X_{i+1}(t)\;\hbox { for some }i\} \end{aligned}$$
(3.2)

is infinite with probability one if \(\beta \ge 1\) and finite with positive probability if   \(0<\beta <1\).
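For readers who wish to experiment, here is a minimal Euler–Maruyama sketch of the system (3.1) (our illustration only, not used in any proof; discretising the singular drift is delicate, and the crude sorting step merely guards against discretisation-induced crossings, which Proposition 3.1 excludes for the true solution when \(\beta \ge 1\)).

```python
import numpy as np

def dyson_bm(x0, beta, T, dt, seed=0):
    """Euler-Maruyama scheme for dX_i = (beta/2) * sum_{j != i} dt / (X_i - X_j) + dW_i."""
    rng = np.random.default_rng(seed)
    x = np.sort(np.asarray(x0, dtype=float))
    for _ in range(int(T / dt)):
        diffs = x[:, None] - x[None, :]
        np.fill_diagonal(diffs, np.inf)              # drop the j = i term
        drift = (beta / 2.0) * (1.0 / diffs).sum(axis=1)
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal(x.size)
        x.sort()                                     # keep the path in the closed Weyl chamber
    return x

# Example: beta = 2*theta with theta = 1 (the GUE case), started from equally spaced points.
print(dyson_bm(np.linspace(-1.0, 1.0, 5), beta=2.0, T=1.0, dt=1e-4))
```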

We write \(D^N=D([0,\infty ),\mathbb {R}^N)\) for the space of right-continuous paths with left limits taking values in \(\mathbb {R}^N\) and endow it with the usual Skorokhod topology (see e.g. [19]).

Theorem 3.2

Fix \(\theta \ge 1/2\) and let \(\varepsilon >0\) be a small parameter. Let the \(N\)–dimensional stochastic process \(Y^N_\varepsilon (t)=(Y^N_\varepsilon (t)_1,\dots ,Y^N_\varepsilon (t)_N)\) be defined through

$$\begin{aligned} (Y^N_\varepsilon (t))_i= \varepsilon ^{1/2}\big ((X^N_{disc})_{N+1-i}\big (\varepsilon ^{-1}\,\theta ^{-1}\,t\big )-\varepsilon ^{-1}\,t\big ), \quad i=1,\dots ,N \end{aligned}$$

where \((X^N_{disc})_i\) is \(i\)-th coordinate of the process \(X^N_{disc}\). Suppose that, as \(\varepsilon \rightarrow 0\), the initial conditions \(Y^N_\varepsilon (0)\) converge to a point \(Y(0)\) in the interior of \(\overline{\mathcal {W}^N}\). Then the process \(Y^N_{\varepsilon }(t)\) converges in the limit \(\varepsilon \downarrow 0\) in law on \(D^N\) to the \(\beta =2\theta \)–Dyson Brownian motion, that is, to the unique strong solution of (3.1) with \(\beta =2\theta \).

Remark 3.3

We believe that Theorem 3.2 should hold for any \(\theta >0\). However, the case \(0<\theta <1/2\) presents additional technical challenges, since there the stopping time \(\tau \) of (3.2) may be finite.

Let us first present the plan of the proof of Theorem 3.2. In Step 1 we study the asymptotics of the jump rates of \(X^N_{disc}\) in the scaling limit of Theorem 3.2. In Step 2 we prove the tightness of the processes \(Y^N_\varepsilon \) as \(\varepsilon \rightarrow 0\). In Step 3 we show that subsequential limits of that family solve the SDE (3.1). This fact and the uniqueness of the solution to (3.1) yield together Theorem 3.2.
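Before carrying out these steps, we record a small simulation sketch of the scaling in Theorem 3.2 (ours, purely illustrative): \(X^N_{disc}\) is run by a standard Gillespie scheme with the rates of Proposition 2.23 and then rescaled diffusively. It assumes that the helper `jump_rate` from the code sketch in Sect. 2 is in scope; the parameters are kept small because the exact rational arithmetic used there makes finer values of \(\varepsilon \) expensive.

```python
import numpy as np

def simulate_X_disc(N, theta, t_max, seed=0):
    """Gillespie simulation of X^N_disc started from the empty diagram.
    Returns a list of (jump time, diagram) pairs, where diagrams are lists of length N."""
    rng = np.random.default_rng(seed)
    lam, t, path = [0] * N, 0.0, [(0.0, [0] * N)]
    while t < t_max:
        rows = [i for i in range(1, N + 1) if i == 1 or lam[i - 1] < lam[i - 2]]
        rates = np.array([float(jump_rate([p for p in lam if p], i, N, theta)) for i in rows])
        t += rng.exponential(1.0 / rates.sum())      # the total rate is N*theta (Proposition 2.25)
        lam = lam.copy()
        lam[rows[rng.choice(len(rows), p=rates / rates.sum())] - 1] += 1
        path.append((t, lam))
    return path

def Y_eps(path, N, theta, eps, t):
    """(Y^N_eps(t))_i = eps^{1/2} * ( lambda_{N+1-i}( t / (eps*theta) ) - t / eps )."""
    s = t / (eps * theta)
    lam = [state for (u, state) in path if u <= s][-1]
    return [np.sqrt(eps) * (lam[N - i] - t / eps) for i in range(1, N + 1)]

eps, N, theta = 0.01, 3, 2
path = simulate_X_disc(N, theta, t_max=1.0 / (eps * theta), seed=1)
print(Y_eps(path, N, theta, eps, t=1.0))             # a rough approximate sample at time 1
```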

3.1 Step 1: Rates

\(Y^N_\varepsilon (t)\) is a continuous time Markov process with state space \(\overline{\mathcal {W}^N}\), a (constant) drift of \(-\varepsilon ^{-1/2}\) in each coordinate and jump rates

$$\begin{aligned} p^N_\varepsilon (y,y',t)=\theta ^{-1}\,\varepsilon ^{-1}\,q_{\frac{t}{\theta }\varepsilon ^{-1}+\hat{y}\varepsilon ^{-1/2}\,\rightarrow \,\frac{t}{\theta }\varepsilon ^{-1}+\hat{y'}\varepsilon ^{-1/2}} \end{aligned}$$

where \(\hat{y}\), \(\hat{y'}\) are the vectors (viewed as Young diagrams) obtained from \(y\), \(y'\) by reordering the components in decreasing order, and the intensities \(q_{\lambda \rightarrow \mu }\) are given in Proposition 2.23. If we write \(y'\approx _\varepsilon y\) for vectors \(y'\), \(y\) which differ in exactly one coordinate with the difference being \(\varepsilon ^{1/2}\), then \(p^N_\varepsilon (y,y',t)=0\) unless \(y'\approx _\varepsilon y\). As we will see, in fact, \(p^N_\varepsilon (y,y',t)\) does not depend on \(t\).

Now, take two sequences \(y'\approx _\varepsilon y\) with \(y'_{N+1-i}-y_{N+1-i}=\varepsilon ^{1/2}\) for some fixed \(i\in \{1,2,\ldots ,N\}\). Define Young diagrams \(\lambda \) and \(\mu \) via \(\lambda _l=\frac{t}{\theta }\,\varepsilon ^{-1}+y_{N+1-l}\,\varepsilon ^{-1/2}\), \(\mu _l=\frac{t}{\theta }\,\varepsilon ^{-1}+y'_{N+1-l}\,\varepsilon ^{-1/2}\). Then \(\mu _i=\lambda _i+1\) and \(\mu _l=\lambda _l\) for \(l\ne i\). Also, set \(j=\lambda _i+1\), so that \(\mu =\lambda \sqcup (i,j)\).

Lemma 3.4

For sequences \(y'\approx _\varepsilon y\) differing in the \((N+1-i)\)-th coordinate as above, we have in the limit \(\varepsilon \rightarrow 0\):

$$\begin{aligned} p^N_\varepsilon (y,y',t)= \varepsilon ^{-1}+ \varepsilon ^{-1/2} \left( \sum _{j\ne i} \frac{\theta }{y_{N+1-i}-y_{N+1-j}}\right) + O(1) \end{aligned}$$

where the error \(O(1)\) is uniform on compact subsets of the open Weyl chamber \(\mathcal {W}^N\).

Proof

Using Proposition 2.23 we have

$$\begin{aligned}&p^N_\varepsilon (y,y',t)=\frac{\varepsilon ^{-1}}{\theta }\,\frac{J_\mu (1^N)}{J_\lambda (1^N)}\,\tilde{J}_{\mu /\lambda }(\mathfrak {r}_1) =\frac{\varepsilon ^{-1}}{\theta }\,\frac{\prod _{\square \in \mu } \dfrac{N\theta +a'(\square )-\theta \, l'(\square )}{a(\square ;\mu )+\theta \, l(\square ;\mu ) +\theta }}{\prod _{\square \in \lambda } \dfrac{N\theta +a'(\square )-\theta \,l'(\square )}{a(\square ;\lambda )+\theta \, l(\square ;\lambda ) +\theta }} \nonumber \\&\qquad \times \,\theta \,\prod _{m=1}^{i-1} \frac{a(m,j;\lambda )+\theta \,(i-m+1)}{a(m,j;\lambda )+\theta \,(i-m)} \, \frac{a(m,j;\lambda )+1+\theta \,(i-m-1)}{a(m,j;\lambda )+1+\theta \,(i-m)}\nonumber \\&\quad =\frac{\varepsilon ^{-1}}{\theta }\,\prod _{k=1}^{j-1} \frac{j-k-1+\theta \,(\lambda '_k-i+1)}{j-k+\theta \,(\lambda '_k-i+1)} \, \prod _{m=1}^{i-1} \frac{\lambda _m-j+\theta \,(i-m)}{\lambda _m-j+\theta \,(i-m+1)} \nonumber \\&\qquad \times \,\big ((N-i+1)\,\theta +j-1\big ) \prod _{m=1}^{i-1} \frac{\lambda _m-j+\theta \,(i-m+1)}{\lambda _m-j+\theta \, (i-m)} \cdot \frac{\lambda _m-j+1+\theta \,(i-m-1)}{\lambda _m-j+1+\theta \,(i-m)} \nonumber \\&\quad =\frac{\varepsilon ^{-1}}{\theta }\,\prod _{k=1}^{j-1} \frac{j-k-1+\theta \,(\lambda '_k-i+1)}{j-k+\theta \,(\lambda '_k-i+1)} \big ((N-i+1)\,\theta +j-1\big ) \nonumber \\&\qquad \times \,\prod _{m=1}^{i-1} \frac{({y}_{N+1-m}-{y}_{N+1-i})\,\varepsilon ^{-1/2}+\theta \,(i-m-1)}{({y}_{N+1-m}-{y}_{N+1-i})\,\varepsilon ^{-1/2} +\theta \,(i-m)}. \end{aligned}$$
(3.3)

Now, for any \(y\) the corresponding Young diagram \(\lambda \) has \(\lambda _N\) columns of length \(N\), \((\lambda _{N-1}-\lambda _N)\) columns of length \((N-1)\), \((\lambda _{N-2}-\lambda _{N-1})\) columns of length \((N-2)\) etc. Therefore, the latter expression for \(p^N_\varepsilon (y,y',t)\) can be simplified to

$$\begin{aligned}&\varepsilon ^{-1}\,\prod _{r=0}^{N-i-1} \frac{j-1-\lambda _{N-r}+\theta \,(N-i-r+1)}{j-1-\lambda _{N-r}+\theta \,(N-i-r)} \prod _{m=1}^{i-1}\nonumber \\&\qquad \times \frac{({y}_{N+1-m}-{y}_{N+1-i})\,\varepsilon ^{-1/2}+\theta \,(i-m-1)}{({y}_{N+1-m}-{y}_{N+1-i})\,\varepsilon ^{-1/2}+\theta \,(i-m)} \nonumber \\&\quad =\varepsilon ^{-1}\,\prod _{k=i+1}^N \frac{({y}_{N+1-i}-{y}_{N+1-k})\,\varepsilon ^{-1/2}+\theta \,(k-i+1)}{({y}_{N+1-i}-{y}_{N+1-k})\,\varepsilon ^{-1/2}+\theta \,(k-i)}\, \prod _{m=1}^{i-1}\nonumber \\&\quad \quad \times \frac{({y}_{N+1-m}-{y}_{N+1-i})\,\varepsilon ^{-1/2}+\theta \,(i-m-1)}{({y}_{N+1-m}-{y}_{N+1-i})\,\varepsilon ^{-1/2}+\theta \,(i-m)}\nonumber \\&\quad =\varepsilon ^{-1}+ \varepsilon ^{-1/2}\,\left( \sum _{j\ne i} \frac{\theta }{y_{N+1-i}-y_{N+1-j}}\right) + O(1), \end{aligned}$$
(3.4)

with the remainder \(O(1)\) being uniform over \(y\) such that \(|y_{N+1-i}-y_{N+1-j}|>\delta \) for \(j\ne i\) and a fixed \(\delta >0\). \(\square \)

3.2 Step 2: Tightness

Let us show that the family \(Y^N_\varepsilon \), \(\varepsilon \in (0,1)\) is tight on \(D^N\). To this end, we aim to apply the necessary and sufficient condition for tightness of [19, Corollary 3.7.4] and need to show that, for any fixed \(t\ge 0\), the random variables \(Y^N_\varepsilon (t)\) are tight on \(\mathbb {R}^N\) as \(\varepsilon \downarrow 0\) and that for every \(\Delta >0\) and \(T>0\) there exists a \(\delta >0\) such that

$$\begin{aligned} \limsup _{\varepsilon \downarrow 0}\;\mathbb {P}\left( \sup _{0\le s<t\le T,t-s<\delta } \big |(Y^N_\varepsilon )_i(t)-(Y^N_\varepsilon )_i(s)\big |>\Delta \right) <\Delta ,\;\; i=1,2,\ldots ,N. \end{aligned}$$
(3.5)

We first explain how to obtain the desired controls on \((Y^N_\varepsilon (t))_+\) (the vector of positive parts of the components of \(Y^N_\varepsilon (t)\)) and

$$\begin{aligned} \sup _{0\le s<t\le T,t-s<\delta } \Big ((Y^N_\varepsilon )_i(t)-(Y^N_\varepsilon )_i(s)\Big ),\quad i=1,2,\ldots ,N. \end{aligned}$$
(3.6)

To control \((Y^N_\varepsilon (t))_+\) and the expressions in (3.6) we proceed by induction over the index of the coordinates in \(Y^N_\varepsilon \). For the first coordinate \((Y^N_\varepsilon )_1\) the explicit formula (3.4) in Step 1 shows that the jump rates of the process \((Y^N_\varepsilon )_1\) are bounded above by \(\varepsilon ^{-1}\). Hence, a comparison with a Poisson process with jump rate \(\varepsilon ^{-1}\), jump size \(\varepsilon ^{1/2}\) and drift \(-\varepsilon ^{-1/2}\) shows that \(((Y^N_\varepsilon )_1(t))_+\) and the expression in (3.6) for \(i=1\) behave in accordance with the conditions of Corollary 3.7.4 in [19] as stated above. Next, we consider \((Y^N_\varepsilon )_i\) for some \(i\in \{2,3,\ldots ,N\}\). In this case, the formula (3.4) in Step 1 shows that, whenever the spacing \((Y^N_\varepsilon )_i-(Y^N_\varepsilon )_{i-1}\) exceeds \(\Delta /3\), the jump rate of \((Y^N_\varepsilon )_i\) is bounded above by

$$\begin{aligned} \varepsilon ^{-1}+\sum _{j=1}^{i-1} \frac{\theta \,\varepsilon ^{-1/2}}{(Y^N_\varepsilon )_i(t)-(Y^N_\varepsilon )_j(t)}+C(\Delta ) \le \varepsilon ^{-1}+\frac{3(i-1)\theta }{\Delta }\,\varepsilon ^{-1/2}+C(\Delta ).\nonumber \\ \end{aligned}$$
(3.7)

Let us show that \((Y^N_\varepsilon )_i\) can be coupled with a Poisson jump process \(R_\varepsilon \) with jump size \(\varepsilon ^{1/2}\), jump rate given by the right-hand side of the last inequality and drift \(-\varepsilon ^{-1/2}\), so that, whenever \((Y^N_\varepsilon )_i-(Y^N_\varepsilon )_{i-1}\) exceeds \(\Delta /3\) and \((Y^N_\varepsilon )_i\) has a jump to the right, the process \(R_\varepsilon \) has a jump to the right as well.

To do this, recall that (by definition) the law of the jump times of \(Y^N_\varepsilon \) can be described as follows. We take \(N\) independent exponential random variables \(a_1,\dots ,a_N\) with rates \(r_j(Y^N_\varepsilon )\), \(j=1,\dots ,N\) defined by (3.3) with \(y\) and \(y'\) differing in the \(j\)-th coordinate. If we let \(k\) be the index for which \(a_k=\min (a_1,\dots ,a_N)\), then at time \(a_k\) the \(k\)-th particle (that is, \((Y^N_\varepsilon )_k\)) jumps. After this jump we repeat the procedure again to determine the next jump.

Let \(M\) denote the right-hand side of (3.7) and consider in each time interval between the jumps of \(Y^N_\varepsilon \) an additional independent exponential random variable \(b\) with rate \(M-r_i(Y^N_\varepsilon )\) if \((Y^N_\varepsilon )_i-(Y^N_\varepsilon )_{i-1}\) exceeds \(\Delta /3\) and with rate \(M\) otherwise. Now, instead of considering \(\min (a_1,\dots ,a_N)\), we consider \(\min (a_1,\dots ,a_N,b)\). If the minimum is given by \(b\), then no jump happens and the whole procedure is repeated. Now, we define the jump times of the process \(R_\varepsilon \) to be all times when the clock of the \(i\)-th particle rings provided that \((Y^N_\varepsilon )_i-(Y^N_\varepsilon )_{i-1}\) exceeds \(\Delta /3\), and also all times when the auxiliary random variable \(b\) constitutes the minimum. One readily checks that \(R_\varepsilon \) is given by a Poisson jump process of constant intensity \(M\) and drift \(-\varepsilon ^{-1/2}\).
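For concreteness, one round of this coupling can be sketched as follows (our illustration; the function `rates` is a hypothetical stand-in for the exact jump rates \(r_1(Y^N_\varepsilon ),\dots ,r_N(Y^N_\varepsilon )\) of (3.3), and indices are \(0\)-based).

```python
import random

def coupled_step(y, i, M, spacing_ok, rates):
    """One round of the coupling: M is the dominating constant on the right-hand side of (3.7)
    and spacing_ok indicates that the gap (Y)_i - (Y)_{i-1} exceeds Delta/3.  Returns the index
    of the clock that rang, whether a particle of Y^N_eps jumps, and whether R_eps jumps."""
    r = rates(y)
    clocks = [random.expovariate(rj) for rj in r]                     # the particle clocks a_1, ..., a_N
    clocks.append(random.expovariate(M - r[i] if spacing_ok else M))  # the auxiliary clock b
    winner = min(range(len(clocks)), key=clocks.__getitem__)
    y_jumps = winner < len(r)                                         # b ringing produces no jump of Y
    r_jumps = (winner == len(r)) or (winner == i and spacing_ok)      # so R_eps rings at constant rate M
    return winner, y_jumps, r_jumps
```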

We further use the convergence of \(R_\varepsilon \) to a Brownian motion with drift \(3(i-1)\theta /\Delta \), which implies that tightness and condition (3.5) hold for \(R_\varepsilon \). Now we get the desired control for \(((Y^N_\varepsilon )_i(t))_+\) and the quantities in (3.6) by invoking the induction hypothesis when the spacing \((Y^N_\varepsilon )_i-(Y^N_\varepsilon )_{i-1}\) is less than \(\Delta /3\) and by comparison with \(R_\varepsilon \) when the spacing is larger.

It remains to observe that \((Y^N_\varepsilon (t))_-\) (the vector of negative parts of the components of \(Y^N_\varepsilon \)) and

$$\begin{aligned} \sup _{0\le s<t\le T,t-s<\delta } -\big ((Y^N_\varepsilon )_i(t)-(Y^N_\varepsilon )_i(s)\big ),\quad i=1,2,\ldots ,N \end{aligned}$$

can be dealt with in a similar manner (but considering the rightmost particle first and moving from right to left). Together these controls yield the conditions of [19, Corollary 3.7.4].

We also note that, since the maximal size of the jumps tends to zero as \(\varepsilon \downarrow 0\), any limit point of the family \(Y^N_\varepsilon \), \(\varepsilon \in (0,1)\) as \(\varepsilon \downarrow 0\) must have continuous paths (see e.g. [19, Theorem 3.10.2]).

Remark 3.5

Note that in the proof of the tightness result the condition \(\theta \ge 1/2\) is not used.

3.3 Step 3: SDE for subsequential limits

Throughout this section we let \(Y^N\) be an arbitrary limit point of the family \(Y^N_\varepsilon \) as \(\varepsilon \downarrow 0\). Our goal is to identify \(Y^N\) with the solution of (3.1). We pick a sequence of \(Y^N_\varepsilon \) which converges to \(Y^N\) in law, and by virtue of the Skorokhod Embedding Theorem (see e.g. Theorem 3.5.1 in [18]) may assume that all processes involved are defined on the same probability space and that the convergence holds in the almost sure sense. In the rest of this section all limits \(\varepsilon \rightarrow 0\) are taken along this sequence.

Let \(\mathcal {F}\) denote the set of all infinitely differentiable functions on \(\overline{\mathcal {W}^N}\) whose support is a compact subset of \(\mathcal {W}^N\). Define

$$\begin{aligned} \mathcal {F}_\delta :=\big \{f\in \mathcal {F}:f(\mathbf{x})=0\;\,\mathrm{whenever}\;\,\mathrm{dist}(\mathbf{x},\partial \overline{\mathcal {W}^N})\le \delta \big \} \end{aligned}$$

where \(\partial \overline{\mathcal {W}^N}\) denotes the boundary of \(\overline{\mathcal {W}^N}\) and \(\mathrm{dist}\) stands for the \(L^\infty \) distance:

$$\begin{aligned} \mathrm{dist}(\mathbf{x},\partial \overline{\mathcal {W}^N})=\min _{i=1,\dots ,N-1} |x_{i+1}-x_i|. \end{aligned}$$

Clearly, \({\mathcal {F}} =\bigcup _{\delta >0} {\mathcal {F}}_\delta \).

For functions \(f\in {\mathcal {F}}\) we consider the processes

$$\begin{aligned} M^f(t)&:= f(Y^N(t))-f(Y^N(0))-\int _0^t \sum _{1\le i\ne j\le N} \frac{\theta }{Y^N_i(r)-Y^N_j(r)}\, f_{y_i}(Y^N(r))\,\mathrm{d}r\nonumber \\&-\,\frac{1}{2}\,\int _0^t \sum _{i=1}^N f_{y_iy_i}(Y^N(r))\,\mathrm{d}r. \end{aligned}$$
(3.8)

Here, \(f_{y_i}\) (\(f_{y_iy_i}\) resp.) stands for the first (second resp.) partial derivative of \(f\) with respect to \(y_i\).

In Step 3a we show that the processes in (3.8) are martingales and identify their quadratic covariations. In Step 3b we use the latter results to derive the SDEs for the processes \(Y^N_1,\ldots ,Y^N_N\).

Step 3a. We now fix an \(f\in \mathcal {F}_\delta \) for some \(\delta >0\) and consider the family of martingales

$$\begin{aligned} M^f_\varepsilon (t)&:= f(Y^N_\varepsilon (t))-f(Y^N_\varepsilon (0))-\int _0^t \Bigg (\sum _{i=1}^N -\varepsilon ^{-1/2}\,f_{y_i}(Y^N_\varepsilon (r)) \nonumber \\&+\,\sum _{y'\approx _\varepsilon Y^N_\varepsilon (r)} p^N_\varepsilon (Y^N_\varepsilon (r),y',r)\,\big (f(y')-f(Y^N_\varepsilon (r))\big )\Bigg )\,\mathrm{d}r,\;\; \varepsilon >0.\qquad \end{aligned}$$
(3.9)

Lemma 3.4 implies that the integrand in (3.9) behaves asymptotically as

$$\begin{aligned} \frac{1}{2}\sum _{i=1}^N f_{y_iy_i}(Y^N_\varepsilon (r))+\sum _{i=1}^N b_i(Y^N_\varepsilon (r)) f_{y_i}(Y^N_\varepsilon (r)) + O(\varepsilon ^{1/2}), \end{aligned}$$

where \(b_{i}(y)=\sum _{j\ne i} \frac{\theta }{y_{i}-y_j}\). Note also that, for any fixed function \(f\in \mathcal {F}\), the error terms can be bounded uniformly for all sequences \(y'\approx _\varepsilon y\) as above, since \(f\) and all its partial derivatives are bounded and vanish in a neighborhood of the boundary \(\partial \overline{{\mathcal {W}}^N}\) of \(\overline{{\mathcal {W}}^N}\).

By taking the limit of the corresponding martingales \(M^f_\varepsilon \) for a fixed \(f\in \mathcal {F}\) and noting that their limit \(M^f\) can be bounded uniformly on every compact time interval, we conclude that \(M^f\) must be a martingale as well.

In order to proceed further we recall the following definitions. For a real-valued function \(f\) defined on an interval \([0,T]\) (\(T\) can be \(+\infty \) here), its quadratic variation \(\langle f\rangle (t)\) is defined for \(0\le t\le T\) via

$$\begin{aligned} \langle f\rangle (t)= \lim _{\begin{array}{c} ||\mathfrak {P}||\rightarrow 0 \\ \mathfrak {P}=(t_0<t_1<\dots <t_k) \end{array}}\sum _{i=1}^k \bigl (f(t_i)-f(t_{i-1})\bigr )^2 \end{aligned}$$

where \(\mathfrak {P}\) ranges over all ordered collections of points \(0=t_0<t_1<\dots <t_k=t\) with \(k\) being arbitrary, and \(||\mathfrak {P}||=\max \nolimits _{1\le i\le k} (t_i-t_{i-1})\). Similarly, for two functions \(f\) and \(g\) their quadratic covariation \(\langle f,g\rangle (t)\) is defined as

$$\begin{aligned} \langle f,g\rangle (t)=\lim _{\begin{array}{c} ||\mathfrak {P}||\rightarrow 0 \\ \mathfrak {P}=(t_0<t_1<\dots <t_k) \end{array}}\sum _{i=1}^k \bigl (f(t_i)-f(t_{i-1})\bigr )\,\bigl (g(t_i)-g(t_{i-1})\bigr ). \end{aligned}$$

Lemma 3.6

For any two functions \(g,h\in {\mathcal {F}}\), the quadratic covariation of \(M^{g}\) and \(M^{h}\) is given by

$$\begin{aligned} \big \langle M^g,M^h\big \rangle (t) = \sum _{j=1}^N \int _0^t g_{y_j}(Y^N(r))h_{y_j}(Y^N(r))\,\mathrm{d}r. \end{aligned}$$
(3.10)

Proof

Due to the polarization identity

$$\begin{aligned} 2\,\langle M^g,M^h \rangle (t)= \langle M^g+M^h \rangle (t)-\langle M^g \rangle (t)-\langle M^h \rangle (t) \end{aligned}$$

it is enough to consider the case \(g=h\), that is, to determine the quadratic variation \(\langle M^g\rangle (t)\).

We proceed by finding the limit of the quadratic variation processes \(\langle M^g_\varepsilon \rangle \) of \(M^g_\varepsilon \) as \(\varepsilon \rightarrow 0\). For each \(\varepsilon >0\) and \(j=1,\dots ,N\) define \(\mathcal {S}^{j}_{\varepsilon }\) as the (random) set of all times when the \(j\)-th coordinate of \(Y^N_\varepsilon \) jumps. Note that the sets \(\mathcal {S}^{j}_{\varepsilon }\) are pairwise disjoint and their union \(\bigcup _{j=1}^N \mathcal {S}^{j}_{\varepsilon }\) is a Poisson point process of intensity \(\varepsilon ^{-1}N\) (see Proposition 2.25).

Recall that the quadratic variation process \(\langle M^g_\varepsilon \rangle (t)\) of \(M^g_\varepsilon \) is given by the sum of squares of the jumps of the process \(M^g_\varepsilon \) (see e.g. [15, Proposition 8.9]) and conclude

$$\begin{aligned} \bigl \langle M^g_\varepsilon \bigr \rangle (t)=\sum _{j=1}^N \sum _{r\in \mathcal {S}^{j}_{\varepsilon } \cap [0,t]} \big (\varepsilon ^{1/2}g_{y_j}(Y^N_\varepsilon (r))+O(\varepsilon )\big )^2, \end{aligned}$$
(3.11)

with a uniform error term \(O(\varepsilon )\). Suppose that \(g\in \mathcal {F}_{2\delta }\) and consider \(N\) new pairwise disjoint sets \(\widehat{\mathcal {S}}^{j}_{\varepsilon }\), \(j=1,\dots ,N\) satisfying \(\bigcup _{j=1}^N \widehat{\mathcal {S}}^{j}_{\varepsilon } = \bigcup _{j=1}^N {\mathcal {S}}^{j}_{\varepsilon }\) and defined through the following procedure. Take any \(r\in \bigcup _{j=1}^N \mathcal {S}^{j}_{\varepsilon }\) and suppose that \(r\in \mathcal {S}^{k}_{\varepsilon }\). If \(\mathrm{dist}(Y^N_\varepsilon (r),\partial \overline{\mathcal {W}^N})\ge \delta \), then put \(r\in \widehat{\mathcal {S}}^k_\varepsilon \). Otherwise, take an independent random variable \(\kappa \) sampled from the uniform distribution on the set \(\{1,2,\dots ,N\}\) and put \(r\in \widehat{\mathcal {S}}^\kappa _\varepsilon \). The definition implies that, for small \(\varepsilon \),

$$\begin{aligned} \bigl \langle M^g_\varepsilon \bigr \rangle (t)=\sum _{j=1}^N \sum _{r\in \widehat{\mathcal {S}}^{j}_{\varepsilon } \cap [0,t]} \big (\varepsilon ^{1/2}g_{y_j}(Y^N_\varepsilon (r))+O(\varepsilon )\big )^2, \end{aligned}$$
(3.12)

with a uniform error term \(O(\varepsilon )\). Now, take any two reals \(a<b\). We claim that the sets \(\widehat{\mathcal {S}}^{k}_{\varepsilon }\) satisfy the following property almost surely:

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \varepsilon \,|\widehat{\mathcal {S}}^{k}_{\varepsilon } \cap [a,b]| = b-a. \end{aligned}$$
(3.13)

Indeed, the Law of Large Numbers for Poisson Point Processes implies

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \varepsilon \,\left| \left( \bigcup _{k=1}^N \widehat{\mathcal {S}}^{k}_{\varepsilon }\right) \cap [a,b]\right| =N(b-a). \end{aligned}$$
(3.14)

On the other hand, Lemma 3.4 implies the following uniform asymptotics as \(\varepsilon \rightarrow 0\):

$$\begin{aligned} \mathbb {P}\left( t\in \widehat{\mathcal {S}}^{k}_\varepsilon \mid \Theta _{<t},t\in \bigcup _{j=1}^N \widehat{\mathcal {S}}^{j}_\varepsilon \right) =\frac{1}{N}+o(1), \end{aligned}$$

where \(\Theta _{<t}\) is the \(\sigma \)-algebra generated by the point processes \(\widehat{\mathcal {S}}^{j}_\varepsilon \), \(j=1,\dots ,N\) up to time \(t\). Therefore, the conditional distribution of \(|\widehat{\mathcal {S}}^{k}_{\varepsilon } \cap [a,b]|\) given \(|(\bigcup _{k=1}^N \widehat{\mathcal {S}}^{k}_{\varepsilon }) \cap [a,b]|\) can be sandwiched between two binomial distributions with success probabilities \(\frac{1}{N}\pm C(\varepsilon )\), where \(\lim _{\varepsilon \rightarrow 0} C(\varepsilon )=0\) (see e.g. [30, Lemma 1.1]). Now, (3.14) and the Law of Large Numbers for the Binomial Distribution imply (3.13).

It follows that the sums in (3.12) approximate the corresponding integrals and we obtain

$$\begin{aligned} \lim _{\varepsilon \downarrow 0}\,\bigl \langle M^g_\varepsilon \bigr \rangle (t) = \sum _{j=1}^N \int _0^t g_{y_j}(Y^N(r))^2\,\mathrm{d}r. \end{aligned}$$
(3.15)

Note that for each \(g\in {\mathcal {F}}\), both \(M^g_\varepsilon (t)^2\) and \(\langle M^g_\varepsilon \rangle (t)\) are uniformly integrable on compact time intervals (this can be shown for example by another comparison with a Poisson jump process). Further, one of the properties of the quadratic variation (see e.g. [19, Chapter 2, Proposition 6.1]) is that \(M^g_\varepsilon (t)^2-\langle M^g_\varepsilon \rangle (t)\) is a martingale. Sending \(\varepsilon \rightarrow 0\) (see e.g. [19, Chapter 7, Problem 7] for a justification) it follows that the process

$$\begin{aligned} M^g(t)^2-\sum _{j=1}^N \int _0^t g_{y_j}(Y^N(r))^2\,\mathrm{d}r= \lim _{\varepsilon \rightarrow 0} \left( M^g_\varepsilon (t)^2-\langle M^g_\varepsilon \rangle (t)\right) \end{aligned}$$
(3.16)

is a martingale. On the other hand, since \(M^g(t)\) is continuous in \(t\) (see the end of Step 2), its quadratic variation \(\langle M^g\rangle (t)\) is the unique increasing predictable process such that \(M^g(t)^2-\langle M^g\rangle (t)\) is a martingale (see [19, Chapter 2, Section 6]). We conclude that

$$\begin{aligned} \bigl \langle M^g\bigr \rangle (t)= \sum _{j=1}^N \int _0^t g_{y_j}(Y^N(r))^2\,\mathrm{d}r. \end{aligned}$$

\(\square \)

Step 3b. We are now ready to derive the SDEs for the processes \(Y_1^N,\dots ,Y_N^N\). Define the stopping times \(\tau _\delta \), \(\delta >0\) by

$$\begin{aligned} \tau _\delta&= \inf \{t\ge 0:Y_{i}^N(t)-Y_{i-1}^N(t)\nonumber \\&\le \delta \; \mathrm{for\;some\;}i\} \wedge \inf \{t\ge 0:|Y_i^N(t)|\ge 1/\delta \;\mathrm{for\;some\;}i\}. \end{aligned}$$
(3.17)

Our next aim is to derive the stochastic integral equations for the processes

$$\begin{aligned} \big (Y^N_{j}(t\wedge \tau _\delta )\big )_{j=1,\dots ,N}. \end{aligned}$$

For each \(j=1,\dots ,N\), let \(f_j\) be an arbitrary function from \({\mathcal {F}}\) such that \(f_j(\mathbf{y})=y_j\) for all \(\mathbf{y}\) satisfying \(\max _{i=1,\dots ,N}|y_i|\le 1/\delta \) and \(\mathrm{dist}(\mathbf{y},\partial \overline{\mathcal {W}^N})\ge \delta \). The results of Step 3a imply that the processes \(M^{{f}_j}\) are martingales. Note that the definition of the stopping times \(\tau _\delta \) implies that on the time interval \([0,\tau _\delta ]\) the processes \(M^{{f}_j}\) and \(M^{y_j}\) (the latter defined by the right-hand side of (3.8) with \(f(\mathbf{y})=y_j\)) almost surely coincide. At this point we can use Lemma 3.6 to conclude that

$$\begin{aligned} Y^N_j(t\wedge \tau _\delta )-Y^N_j(0) -\int _{0}^{t\wedge \tau _\delta } \sum _{n\ne j} \frac{\theta }{Y^N_j(r)-Y^N_n(r)} \,\mathrm{d}r,\qquad j=1,\dots ,N \end{aligned}$$

are martingales with quadratic variations given by \(t\wedge \tau _\delta \) and with the quadratic covariation between any two of them being zero. We may now apply the Martingale Representation Theorem in the form of [29, Theorem 3.4.2] to deduce the existence of independent standard Brownian motions \(W_1,\ldots ,W_N\) (possibly on an extension of the underlying probability space) such that

$$\begin{aligned} Y^N_j(t\wedge \tau _\delta )-Y^N_j(0) -\int _{0}^{t\wedge \tau _\delta } \sum _{n\ne j} \frac{\theta }{Y^N_j(r)-Y^N_n(r)} \,\mathrm{d}r =\!\!\int _{0}^{t\wedge \tau _\delta } \mathrm{d}W_j(s),\;\; j=1,\dots ,N. \end{aligned}$$
(3.18)

To finish the proof of Theorem 3.2 it remains to observe that Proposition 3.1 implies

$$\begin{aligned} \lim _{\delta \rightarrow 0} \tau _\delta =\infty \end{aligned}$$

with probability one.

Remark 3.7

An alternative way to derive the system of SDEs for the components of \(Y^N\) is to use [29, Chapter 5, Proposition 4.6]. We will employ this strategy in Sect. 5.3 due to the lack of a straightforward generalization of Lemma 3.6 to the multilevel setting.

3.4 Zero initial condition

A refinement of the proof of Theorem 3.2 involving Proposition 2.9 allows us to deal with the limiting process which is started from \(0\in \overline{{\mathcal {W}}^N}\).

Corollary 3.8

Fix \(\theta \ge 1\). In the notations of Theorem 3.2 and assuming the convergence of the initial conditions \(Y^N_\varepsilon (0)\) to \(0\in \overline{\mathcal {W}^N}\), the process \(Y^N_\varepsilon (t)\) converges in the limit \(\varepsilon \downarrow 0\) in law on \(D^N\) to the \(\beta =2\theta \)–Dyson Brownian motion started from \(0\in \overline{{\mathcal {W}}^N}\), that is, to the unique strong solution of (3.1) with \(\beta =2\theta \) and \(Y(0)=0\in \overline{{\mathcal {W}}^N}\).

Proof

Using Proposition 2.9 and arguing as in the proof of Theorem 3.2, one obtains the convergence of the rescaled versions of the process \(X^N_{disc}\) on every time interval \([t,\infty )\) with \(t>0\) to the solution of (3.1) with initial distribution given by (2.10). Since (2.10) converges to the delta function at the origin as \(t\rightarrow 0\), we can identify the limit points of the rescaled versions of \(X^N_{disc}\) with the solution of (3.1) started from \(0\in \overline{{\mathcal {W}}^N}\). \(\square \)

4 Existence and uniqueness for multilevel DBM

The aim of this section is to prove an analogue of Proposition 3.1 for the multilevel Dyson Brownian motion.

Theorem 4.1

For any \(N\in \mathbb {N}\) and \(\theta > 1\) (that is, \(\beta =2\theta > 2\)), and for any initial condition \(X(0)\) in the interior of \(\overline{\mathcal {G}^N}\), the system of SDEs

$$\begin{aligned} \mathrm{d}X^k_i(t) = \Biggl ( \sum _{m\ne i} \frac{1-\theta }{X^k_i(t)-X^k_m(t)}-\sum _{m=1}^{k-1} \frac{1-\theta }{X_i^k(t)-X_m^{k-1}(t)}\Biggr )\,\mathrm{d}t + \mathrm{d}W_i^k ,\;\;\;1\le i\le k\le N\nonumber \\ \end{aligned}$$
(4.1)

with \(W_i^k\), \(1\le i\le k\le N\) being independent standard Brownian motions, possesses a unique weak solution taking values in the Gelfand–Tsetlin cone \(\overline{\mathcal {G}^N}\).

Proof

Given a stochastic process \(X(t)\) taking values in \(\overline{\mathcal {G}^N}\), for any fixed \(\delta >0\), let \(\widehat{\tau }_{\delta }[X]\) denote

$$\begin{aligned} \widehat{\tau }_{\delta }[X]=\inf \big \{t\ge 0 \, :\, |X^k_i(t)-X^{k'}_{i'}(t)|\le \delta \hbox { for some } (k,i)\ne (k',i') \hbox { with } |k-k'|\le 1\big \}, \end{aligned}$$

that is, the first time when two particles on adjacent levels are at the distance of at most \(\delta \). Further, we define the stopping time \(\tau _\delta [X]\) as the first time when three particles on adjacent levels are at the distance of at most \(\delta \):

$$\begin{aligned} \tau _{\delta }[X]&= \inf \biggl \{t\ge 0\, : \, |X^k_i(t)-X^{k'}_{i'}(t)|\le \delta ,\, |X^k_i(t)-X^{k''}_{i''}(t)|\le \delta ,\\&\quad \text { for } (k',i')\ne (k'',i'') \text { such that } |k-k'|=|k-k''|=1 \biggr \}. \end{aligned}$$

Figure 2 shows schematically the six possible triplets of nearby particles at time \(\tau _\delta \).

Fig. 2
figure 2

Six possible triplets of nearby particles: one of these situations occurs at time \(\tau _\delta \)

The following proposition will be proved in Sect. 4.1.

Proposition 4.2

For any \(N\in \mathbb {N}\), \(\delta >0\) and \(\theta > 1\) (that is, \(\beta =2\theta >2\)), and for any initial condition \(X(0)\) in the interior of \(\overline{\mathcal {G}^N}\) the system of stochastic integral equations

$$\begin{aligned} X^k_i(t)-X^k_i(0)&= \int _0^{t\wedge \tau _\delta [X]}\left( \sum _{m\ne i} \frac{1-\theta }{X^k_i(s)-X^k_m(s)}-\sum _{m=1}^{k-1} \frac{1-\theta }{X_i^k(s)-X_m^{k-1}(s)}\right) \,\mathrm{d}s\nonumber \\&\quad +\, W_i^k(t\wedge \tau _\delta [X]),\quad 1\le i\le k\le N, \end{aligned}$$
(4.2)

with \(W_i^k\), \(1\le i\le k\le N\) being independent standard Brownian motions, possesses a unique weak solution.

In view of Proposition 4.2 we can consider a product probability space which supports independent weak solutions of (4.2) for all \(\delta >0\) and all initial conditions in the interior of \(\overline{\mathcal {G}^N}\). Choosing a sequence \(\delta _l\), \(l\in \mathbb {N}\) decreasing to zero, we can define on this space a process \(X\) such that the law of \(X(t\wedge \tau _{\delta _1}[X])\), \(t\ge 0\) coincides with the law of the solution of (4.2) with \(\delta =\delta _1\) and initial condition \(X(0)\), the law of \(X((\tau _{\delta _1}[X]+t)\wedge \tau _{\delta _2}[X])\), \(t\ge 0\) is given by the law of the solution of (4.2) with \(\delta =\delta _2\) and initial condition \(X(\tau _{\delta _1}[X])\), and so on. The uniqueness part of Proposition 4.2 now shows that, for each \(l\in \mathbb {N}\), the law of \(X(t\wedge \tau _{\delta _l}[X])\), \(t\ge 0\) is that of the weak solution of (4.2) with \(\delta =\delta _l\). Since the paths of \(X\) are continuous by construction, we have \(\lim _{l\rightarrow \infty } \tau _{\delta _l}[X]=\tau _0[X]\), and hence we have constructed a weak solution of the system

$$\begin{aligned} X^k_i(t)-X^k_i(0)&= \int _0^{t\wedge \tau _0[X]}\left( \sum _{m\ne i} \frac{1-\theta }{X^k_i(s)-X^k_m(s)}-\sum _{m=1}^{k-1} \frac{1-\theta }{X_i^k(s)-X_m^{k-1}(s)}\right) \,\mathrm{d}s\nonumber \\&\quad + \,W_i^k(t\wedge \tau _0[X]), \quad 1\le i\le k\le N \end{aligned}$$
(4.3)

with \(W_i^k\), \(1\le i\le k\le N\) being independent standard Brownian motions as before. In addition, we note that the law of the solution to (4.3) is uniquely determined. Indeed, for any \(\delta >0\), the process \(X\) stopped at time \(\tau _\delta \) would give a solution to (4.2). Uniqueness of the latter for any \(\delta >0\) now readily implies the uniqueness of the weak solution to (4.3). At this point, Theorem 4.1 is a consequence of the following statement which will be proved in Sect. 4.2.

Proposition 4.3

Suppose that \(X(0)\) lies in the interior of the cone \(\overline{\mathcal {G}^N}\) and let \(X\) be a solution to (4.3).

  (a) If \(\theta > 1\), then almost surely \(\tau _0[X]=\infty .\)

  (b) If \(\theta \ge 2\), then almost surely \( \widehat{\tau }_{0}[X]=\infty .\)

4.1 Proof of Proposition 4.2

Our proof of Proposition 4.2 is based on a Girsanov change of measure that will dramatically simplify the SDE in consideration. We refer the reader to [29, Section 3.5] and [29, Section 5.3] for general information about Girsanov’s theorem and weak solutions of SDEs.

We start with the uniqueness part. Fix \(N\in \{1,2,\ldots \}\), \(\theta > 1\), \(\delta >0\), and let \(X\) be a solution of (4.2). Let \(\mathcal I\) denote the set of \(N(N+1)/2\) pairs \((k,i)\), \(k=1,\dots ,N\), \(i=1,\dots ,k\) which represent different coordinates (particles) in the process \(X\). We will subdivide \(\mathcal I\) into disjoint singletons and pairs of neighboring particles, that is, pairs of the form \(((k,i),(k-1,i))\) or \(((k,i),(k-1,i-1))\). We call any such subdivision a pair-partition of \(\mathcal I\). An example is shown in Fig. 3.

Fig. 3
figure 3

A pair-partition with \(N=3\), two pairs and two singletons

Lemma 4.4

There exists a sequence of stopping times \(0=\sigma _0\le \sigma _1\le \sigma _2\le \dots \le \tau _\delta [X]\) and (random) pair–partitions \(A_1, A_2,\dots \) such that

  • for any \(n=1,2,\dots \), any \(\sigma _{n-1}\le t < \sigma _n\), any two pairs \((k,i)\), \((k',i')\), \(1\le i\le k\le N\), \(1\le i'\le k'\le N\), \(|k-k'|\le 1\), we have \(|X^{k}_i(t)-X^{k'}_{i'}(t)|\ge \delta /2\) unless the pair \(((k,i),(k',i'))\) is one of the pairs of the pair-partition \(A_n\), and

  • for any \(n=1,2,\dots \), either \(\sigma _n=\tau _\delta \) or \(|X^k_i(\sigma _{n+1})-X^k_i(\sigma _n)|\ge \delta /2\) for some \((k,i)\).

Proof

Define the (random) sets \({\mathcal {B}}^k_i\), \({\mathcal {D}}^k_i\) by setting

$$\begin{aligned} {\mathcal {B}}^k_i&= \left\{ 0\le t\le \tau _\delta [X]\mid |X^k_i(t)-X^{k-1}_i(t)|\le \delta \right\} ,\\ {\mathcal {D}}^k_i&= \left\{ 0\le t\le \tau _\delta [X]\mid |X^k_i(t)-X^{k-1}_{i-1}(t)|\le \delta \right\} . \end{aligned}$$

Note that these sets are closed due to the continuity of the trajectories of \(X\), which in turn is a consequence of (4.2). Define \(A(t;\delta )\) as the pair-partition in which the pair \(((k,i),(k-1,i))\) is present iff \(t\in {\mathcal {B}}^k_i\) and the pair \(((k,i),(k-1,i-1))\) is present iff \(t\in {\mathcal {D}}^k_i\). Define \(A(t;\delta /2)\) similarly, with \(\delta /2\) in place of \(\delta \) in the definitions of the sets above. The definition of \(\tau _\delta [X]\) implies that such pair-partitions \(A(t;\delta /2)\subset A(t;\delta )\) are well-defined for any \(0\le t\le \tau _\delta [X]\).

Now, we define \(\sigma _n\) and \(A_n\) inductively. First, set \(\sigma _0=0\). Further, for \(n=1,2,\dots \) let \(A_n=A(\sigma _{n-1};\delta )\) and set \(\sigma _n\) to be the minimal \(t\) satisfying \(\tau _\delta [X]\ge t\ge \sigma _{n-1}\) and such that \(A(t;\delta /2)\) has a pair which \(A(\sigma _{n-1};\delta )\) does not have. Since the sets \({\mathcal {B}}^k_i\), \({\mathcal {D}}^k_i\) are closed, either such \(t\) exists or no new pairs are added after time \(\sigma _{n-1}\) and up to time \(\tau _\delta [X]\). In the latter case we set \(\sigma _n=\tau _\delta \). \(\square \)
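Operationally, the pair-partition \(A(t;\delta )\) is easy to extract from a configuration (a sketch of ours, not taken from the argument); here `X[k][i]` stands for \(X^{k+1}_{i+1}(t)\) with \(0\)-based indices.

```python
def pair_partition(X, delta):
    """Collect the close neighbouring pairs defining A(t; delta); X is a list of lists of
    lengths 1, 2, ..., N.  Strictly before tau_delta[X] no particle belongs to two pairs, so
    together with the remaining singletons the output is a pair-partition of the index set."""
    pairs = []
    for k in range(1, len(X)):
        for i in range(len(X[k])):
            if i < len(X[k - 1]) and abs(X[k][i] - X[k - 1][i]) <= delta:
                pairs.append(((k, i), (k - 1, i)))
            if i >= 1 and abs(X[k][i] - X[k - 1][i - 1]) <= delta:
                pairs.append(((k, i), (k - 1, i - 1)))
    used = [idx for pair in pairs for idx in pair]
    assert len(used) == len(set(used)), "valid only strictly before tau_delta[X]"
    return pairs
```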

Next, we fix a \(T>0\), set \(I_n=[\sigma _{n-1},\sigma _{n})\), \(n=1,2,\dots \) and apply a Girsanov change of measure (see e.g. [29, Theorem 5.1, Chapter 3] and note that Novikov’s condition as in [29, Corollary 5.13, Chapter 3] is satisfied due to the boundedness of the integrand in the stochastic exponential) with a density of the form

$$\begin{aligned} \exp \left( \sum _{n=0}^\infty \sum _{1\le i\le k\le N} \left( \int _0^T b^k_{i,n}(t)\,\mathbf{1}_{I_n}(t)\,\mathrm{d}W^k_i(t) - \frac{1}{2}\int _0^T (b^k_{i,n}(t))^2\,\mathbf{1}_{I_n}(t)\,\mathrm{d}t\right) \right) ,\qquad \end{aligned}$$
(4.4)

so that under the new measure \(\widetilde{P}\) for every fixed \(k\), \(i\), \(n\) and \(0\le t\le T\):

$$\begin{aligned}&X^k_i(t\wedge \sigma _n)-X^k_i(t\wedge \sigma _{n-1})\nonumber \\&\quad ={\left\{ \begin{array}{ll} \int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}} \Big (\mathrm{d}\tilde{W}^k_i-\frac{(1-\theta )\,\mathrm{d}s}{X^k_i(s)-X^{k-1}_{i-1}(s)} \Big ), \;\;\; \text { if }\ ((k,i),(k-1,i-1))\in A_n,\\ \int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}}\, \Big (\mathrm{d}\tilde{W}^k_i-\frac{(1-\theta )\mathrm{d}s}{X^k_i(s)-X^{k-1}_i(s)}\Big ), \;\;\;\text { if }\ ((k,i),(k-1,i))\in A_n,\\ \int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}} \mathrm{d}\tilde{W}^k_i,\;\;\;\text {otherwise} \end{array}\right. } \end{aligned}$$
(4.5)

where \(\tilde{W}^k_i\), \(1\le i\le k\le N\) are independent standard Brownian motions under the measure \(\widetilde{P}\). We claim that the solution of the resulting system of SDEs (4.5) is pathwise unique on \([0,\lim _{n\rightarrow \infty } \sigma _n)\) (that is, for any two strong solutions of (4.5) adapted to the same Brownian filtration, the quantities \(\lim _{n\rightarrow \infty } \sigma _n\) for the two solutions will be the same with probability one and the trajectories of the two solutions on \([0,\lim _{n\rightarrow \infty } \sigma _n)\) will be identical with probability one). Indeed, on each time interval \(\sigma _{n-1}\le t \le \sigma _n\) the system (4.5) splits into \(|A_n|\) non-interacting systems of SDEs each of which consists of one equation

$$\begin{aligned} X^k_i(t\wedge \sigma _n)-X^k_i(t\wedge \sigma _{n-1})= \int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}} \mathrm{d}\tilde{W}^k_i, \end{aligned}$$
(4.6)

if \((k,i)\) is a singleton in \(A_n\), or of a system of two equations

$$\begin{aligned}&\big (X^k_i(t\wedge \sigma _n)-X^{k-1}_{i'}(t\wedge \sigma _n)\big ) -\big (X^k_i(t\wedge \sigma _{n-1})-X^{k-1}_{i'}(t\wedge \sigma _{n-1})\big )\nonumber \\&\quad =\,\int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}}\big (\mathrm{d}\tilde{W}^k_i-\mathrm{d}\tilde{W}^{k-1}_{i'}\big ) \nonumber \\&\qquad -\,\int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}}\frac{(1-\theta )\,\mathrm{d}s}{X^k_i(s)-X^{k-1}_{i'}(s)},\end{aligned}$$
(4.7)
$$\begin{aligned}&\quad X^{k-1}_{i'}(t\wedge \sigma _n)-X^{k-1}_{i'}(t\wedge \sigma _{n-1})= \int \limits _{t\wedge \sigma _{n-1}}^{t\wedge \sigma _{n}} \mathrm{d}\tilde{W}^{k-1}_{i'}, \end{aligned}$$
(4.8)

if \(((k,i),(k-1,i'))\) is a pair in \(A_n\). Therefore, one can argue by induction over \(n\) and, once pathwise uniqueness of the triplet \(((X(t\wedge \sigma _{n-1}):\,t\ge 0),\sigma _{n-1},A_{n-1})\) is established, appeal to the pathwise uniqueness for (4.6), (4.8) and (4.7) (the latter being the equation for the Bessel process of dimension \(\theta >1\), see [42, Section 1, Chapter XI]) to deduce the pathwise uniqueness of the triplet \(((X(t\wedge \sigma _n):\,t\ge 0),\sigma _n,A_n)\).

The SDEs in (4.5) also allow us to prove the following statement.

Lemma 4.5

The identity \(\lim _{n\rightarrow \infty } \sigma _n=\tau _\delta [X]\) holds with probability one.

Proof

It suffices to show that \(\lim _{n\rightarrow \infty } \sigma _n\wedge T = \tau _\delta [X]\wedge T\) for any given \(T>0\). Indeed, then

$$\begin{aligned} \lim _{n\rightarrow \infty } \sigma _n\ge \lim _{T\rightarrow \infty } \lim _{n\rightarrow \infty } \sigma _n\wedge T = \lim _{T\rightarrow \infty } \tau _\delta [X]\wedge T = \tau _\delta [X] \end{aligned}$$

and \(\lim _{n\rightarrow \infty } \sigma _n\le \tau _\delta [X]\) holds by the definitions of the stopping times involved. If for some \(n\) we have \(\tau _\delta [X]\wedge T=\sigma _n\wedge T\), then we are done. Otherwise, \(\sigma _n<T\) for all \(n\) and the definition in Lemma 4.4 shows that \(|X^k_i(\sigma _{n+1})-X^k_i(\sigma _n)|\ge \delta /2\) for some \((k,i)\). In addition, (4.5) yields that, under the measure \(\tilde{P}\), \(|X^k_i(\sigma _{n+1})-X^k_i(\sigma _n)|\) is bounded above by the sum of absolute values of the increments of at most two Brownian motions and one Bessel process in time \((\sigma _{n+1}-\sigma _n)\). Since the trajectories of such processes are uniformly continuous on the compact interval \([0,T]\) with probability one, there exist two constants \(c>0\) and \(p>0\) such that \(\tilde{P}(\sigma _{n+1}-\sigma _n>c)>p\). Consequently, \(\sigma _n/c\) stochastically dominates a binomial random variable \(Bin(n,p)\). In view of the law of large numbers for the latter, this is a contradiction to \(\sigma _n<T\) for all \(n\). \(\square \)

Now, we make a Girsanov change of measure back to the original probability measure and conclude that the joint law of \(X(t\wedge \tau _\delta \wedge T)\), \(t\ge 0\), \(\sigma _n\wedge T\) and \(\tau _\delta \wedge T\) under the original probability measure is determined by such law under the measure \(\widetilde{P}\) (the justification for this conclusion can be found for example in the proof of [29, Proposition 5.3.10]). Since the latter is uniquely defined (by the law of the solution to (4.5)), so is the former. Finally, since \(T>0\) was arbitrary, we conclude that the joint law of \(X(t\wedge \tau _\delta [X])\), \(t\ge 0\) and \(\tau _\delta [X]\) is uniquely determined.

To construct a weak solution to (4.2) we start with a probability space \((\Omega ,{{\mathcal {F}}},\mathbb {P})\) that supports a family of independent standard Brownian motions \(\tilde{W}^k_i\), \(1\le i\le k\le N\). In addition, we note (see [42, Section XI] for a proof) that to each pair of Brownian motions of the form \((\tilde{W}^k_i,\,\tilde{W}^{k-1}_{i-1})\) or \((\tilde{W}^k_i,\,\tilde{W}^{k-1}_i)\) and all initial conditions we can associate the unique strong solutions of the SDEs

$$\begin{aligned} \mathrm{d}R^{k,-}_i(t)=\frac{\theta -1}{R^{k,-}_i(t)}\,\mathrm{d}t+\mathrm{d}\tilde{W}^k_i(t)-\mathrm{d}\tilde{W}^{k-1}_{i-1}(t) ,\end{aligned}$$
(4.9)
$$\begin{aligned} \mathrm{d}R^{k,+}_i(t)=\frac{\theta -1}{R^{k,+}_i(t)}\,\mathrm{d}t+\mathrm{d}\tilde{W}^k_i(t)-\mathrm{d}\tilde{W}^{k-1}_i(t), \end{aligned}$$
(4.10)

defined on the same probability space.

We will now construct an \(N(N+1)/2\)-dimensional process \(X(t)\), \(t\ge 0\), stopping times \(\tau _\delta \), \(\sigma _n\), \(n=0,1,2,\dots \) and pair-partitions \(A_n\), \(n=0,1,2,\dots \) which satisfy the conditions of Lemma 4.4 and the system of equations (4.5).

The construction proceeds for each \(\omega \in \Omega \) independently, and is inductive. If the initial condition \(X(0)\) is such that \(\tau _\delta [X]=0\), then there is nothing to prove. Otherwise, we set \(\sigma _0=0\) and \(A_1=A(0;\delta )\) (see the proof of Lemma 4.4 for the definition of \(A(t;\delta )\)). Next, we define \(\hat{X}\) as the unique strong solution of

$$\begin{aligned} \hat{X}^k_i(t)-\hat{X}^k_i(0)={\left\{ \begin{array}{ll} \int \limits _{0}^{t} \Big (\mathrm{d}\tilde{W}^k_i-\frac{(1-\theta )\,\mathrm{d}s}{\hat{X}^k_i(s)-\hat{X}^{k-1}_{i-1}(s)} \Big ),\;\;\; \text { if }\ ((k,i),(k-1,i-1))\in A_1,\\ \int \limits _{0}^{t}\, \Big (\mathrm{d}\tilde{W}^k_i-\frac{(1-\theta )\mathrm{d}s}{\hat{X}^k_i(s)-\hat{X}^{k-1}_i(s)}\Big ), \;\;\;\text { if }\ ((k,i),(k-1,i))\in A_1,\\ \int \limits _{0}^{t} \mathrm{d}\tilde{W}^k_i, \;\;\;\text { otherwise,} \end{array}\right. } \end{aligned}$$
(4.11)

with initial condition \(\hat{X}(0)=X(0)\).

Now, we can define \(\sigma _1\) as in Lemma 4.4, but with \(\hat{X}(t)\) instead of \(X(t)\). After this, we set \(X(t)\) to be equal to \(\hat{X}(t)\) on the time interval \([0,\sigma _1]\). We further define \(A_2=A(\sigma _1;\delta )\) and repeat the above procedure to define \(X(t)\) on the time interval \([\sigma _1,\sigma _2]\). Iterating this process we can define \(X(t)\) up to time \(\tau _\delta [X]\) thanks to Lemma 4.5. We extend it to all \(t\ge 0\) by setting \(X^k_i(t)=X^k_i(\tau _\delta [X])\) for \(t>\tau _\delta [X]\).

Next, we apply the Girsanov Theorem as in the uniqueness part to conclude that, for each \(T>0\), there exists a probability measure \(\mathbb {Q}_T\) which is absolutely continuous with respect to \(\mathbb {P}\) and such that the representation

$$\begin{aligned} X^k_i(t\wedge T){-}X^k_i(0)&= \int _0^{t\wedge T\wedge \tau _\delta [X]} \left( \sum _{m\ne i} \!\frac{1-\theta }{X^k_i(s)\!-\!X^k_m(s)} {-}\sum _{m=1}^{k-1} \frac{1-\theta }{X^k_i(s)\!-\!X^{k-1}_m(s)}\right) \,\mathrm{d}s \\&+ \,W^k_i(t\wedge T\wedge \tau _\delta [X]), \quad 1\le i\le k\le N \end{aligned}$$

holds with \(W^k_i\), \(1\le i\le k\le N\) being independent standard Brownian motions under \(\mathbb {Q}_T\).

Finally, replacing \(T\) by a sequence \(T_n\uparrow \infty \) and using the Kolmogorov Extension Theorem (see e.g. [28, Theorem 6.16] and note that the consistency condition is satisfied due to the uniqueness of the solution to (4.2)), we deduce the existence of processes \(X^k_i\), \(1\le i\le k\le N\) defined on a suitable probability space which solve (4.2).

4.2 Proof of Proposition 4.3

We start with a version of Feller’s test for explosions that will be used below (see e.g. [29, Section 5.5.C] and the references therein for related results).

Lemma 4.6

Let \(Z\) be a one-dimensional continuous semimartingale satisfying \(Z(0)>0\) and

$$\begin{aligned} \forall \,0\le t_1<t_2:\quad Z(t_2)-Z(t_1)=b(t_2-t_1)+M(t_2)-M(t_1) \end{aligned}$$
(4.12)

with a constant \(b>0\) and a local martingale \(M\). If the quadratic variation of \(M\) satisfies

$$\begin{aligned} \forall \,0\le t_1<t_2:\left\langle M\right\rangle (t_2)-\left\langle M\right\rangle (t_1) \le 2b\,\int _{t_1}^{t_2} Z(t)\,\mathrm{d}t, \end{aligned}$$
(4.13)

then the process \(Z\) does not reach zero in finite time with probability one.

Proof

We fix two constants \(0<r_1<Z(0)<R_1<\infty \) and let \(\tau _{r_1,R_1}\) be the first time that \(Z\) reaches \(r_1\) or \(R_1\). Next, we apply Itô’s formula (see e.g. [29, Section 3.3.A]) to obtain

$$\begin{aligned} \ln Z(t\wedge \tau _{r_1,R_1}\wedge \zeta )-\ln Z(0)&= \int _0^{t\wedge \tau _{r_1,R_1}\wedge \zeta } \left( \frac{b\,\mathrm{d}s}{Z(s)} - \frac{\mathrm{d}\langle M\rangle (s)}{2\,Z(s)^2}\right) \nonumber \\&\quad +\int _0^{t\wedge \tau _{r_1,R_1}\wedge \zeta } \frac{\mathrm{d}M(s)}{Z(s)} \end{aligned}$$
(4.14)

for any stopping time \(\zeta \). By (4.13), the first integral in (4.14) takes non-negative values. Hence, picking a localizing sequence of stopping times \(\zeta =\zeta _m\) for the local martingale given by the second integral in (4.14), taking the expectation in (4.14) and passing to the limit \(m\rightarrow \infty \), we obtain

$$\begin{aligned} \mathbb {E}\left[ \ln Z(t\wedge \tau _{r_1,R_1})\right] \ge \ln Z(0). \end{aligned}$$

Now, Fatou’s Lemma and \(Z(t\wedge \tau _{r_1,R_1})\le R_1\) yield the chain of estimates

$$\begin{aligned} \ln Z(0)\le \mathbb {E}\left[ \limsup _{t\rightarrow \infty } \ln Z(t\wedge \tau _{r_1,R_1})\right] \le p_{r_1}\ln r_1+(1-p_{r_1})\ln R_1 \end{aligned}$$

where \(p_{r_1}=\mathbb {P}\big (\limsup _{t\rightarrow \infty } Z(t\wedge \tau _{r_1,R_1})=r_1\big )\). Consequently,

$$\begin{aligned} p_{r_1}\le \frac{\ln R_1-\ln Z(0)}{\ln R_1 - \ln r_1}. \end{aligned}$$

The lemma now follows by taking the limit \(r_1\downarrow 0\) and then \(R_1\uparrow \infty \). \(\square \)
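As a sanity check of how sharp Lemma 4.6 is (this example is ours and is not needed in the sequel), consider the squared Bessel process of dimension \(\delta >0\),

$$\begin{aligned} \mathrm{d}Z(t)=\delta \,\mathrm{d}t+2\sqrt{Z(t)}\,\mathrm{d}B(t), \end{aligned}$$

for which \(b=\delta \) and \(\langle M\rangle (t_2)-\langle M\rangle (t_1)=4\int _{t_1}^{t_2} Z(t)\,\mathrm{d}t\). Condition (4.13) then amounts to \(4\le 2\delta \), that is \(\delta \ge 2\), which matches the classical fact that a squared Bessel process avoids the origin if and only if its dimension is at least \(2\).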

We will now show that \(\tau _0[X]=\infty \) for the solution of (4.3) with initial condition \(X(0)\) such that \(\tau _0[X]>0\) (in particular, this includes the case that \(X(0)\) belongs to the interior of \(\overline{\mathcal {G}^N}\)). Recall that \(\tau _0\) was defined as the first time at which one of the events in Fig. 2 with \(\delta =0\) occurs. We will show that none of the cases \(A\)–\(F\) in Fig. 2 can occur in finite time. We will argue by induction over \(k\) to prove that none of these events can happen on the first \(k\) levels for \(k=1,2,\dots ,N\).

First, we focus on the cases \(A\) and \(B\).

Lemma 4.7

An event of the form \(X^k_i(t)=X^k_{i+1}(t)\) cannot occur in finite time without one of the events

$$\begin{aligned} X^{k-1}_i(t)-X^{k-2}_{i-1}(t)=0,\quad X^{k-2}_i(t)-X^{k-1}_i(t)=0 \end{aligned}$$
(4.15)

occurring at the same time.

Proof

If the statement of the lemma were not true, then the continuity of the paths of the particles would allow us to find stopping times \(\sigma \), \(\sigma '\) similar to the ones introduced in Sect. 4.1 and a real number \(\kappa >0\) such that, with probability one, \(\sigma <\sigma '\), the spacings in (4.15) are at least \(\kappa \) throughout the time interval \([\sigma ,\sigma ']\), and the event \(X^k_i(t)=X^k_{i+1}(t)\) occurs for the first time at time \(\sigma '\). Moreover, the interlacing condition and the induction hypothesis imply together that \([\sigma ,\sigma ']\) and \(\kappa \) can be chosen such that the spacings

$$\begin{aligned} X^k_i-X^{k-1}_{i-1},\quad X^{k-1}_{i+1}-X^k_{i+1} \end{aligned}$$

do not fall below \(\kappa \) on \([\sigma ,\sigma ']\) (otherwise at least one of the events \(X^{k-1}_{i-1}(t)=X^{k-1}_i(t)\) or \(X^{k-1}_i(t)=X^{k-1}_{i+1}(t)\) would have occurred at time \(\sigma '\) in contradiction to the induction hypothesis). The described inequalities are shown in Fig. 4.

Fig. 4
figure 4

Decoupling of the particles \(X^k_i\), \(X^k_{i+1}\), \(X^{k-1}_i\)

Now, making a Girsanov change of measure similar to the one in Sect. 4.1, we can decouple the particles \(X^k_i\), \(X^k_{i+1}\), \(X^{k-1}_i\) from the rest of the particle system, thus reducing their dynamics on the time interval \([\sigma ,\sigma ']\) to the two-level dynamics:

$$\begin{aligned} X^k_i(t\wedge \sigma ')-X^k_i(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \biggl (\mathrm{d}\tilde{W}^k_i+\frac{(1-\theta )\,\mathrm{d}s}{X^k_i(s)-X^k_{i+1}(s)}-\frac{(1-\theta )\,\mathrm{d}s}{X^k_i(s)-X^{k-1}_i(s)}\biggr ), \\ X^k_{i+1}(t\wedge \sigma ')-X^k_{i+1}(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \biggl (\mathrm{d}\tilde{W}^k_{i+1}+\frac{(1-\theta )\,\mathrm{d}s}{X^k_{i+1}(s)-X^k_i(s)} -\frac{(1-\theta )\,\mathrm{d}s}{X^k_{i+1}(s)-X^{k-1}_i(s)}\biggr ),\\ X^{k-1}_i(t\wedge \sigma ')-X^{k-1}_i(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \mathrm{d}\tilde{W}^{k-1}_i, \end{aligned}$$

with \(\tilde{W}^k_i\), \(\tilde{W}^k_{i+1}\), \(\tilde{W}^{k-1}_i\) being standard Brownian motions under the new probability measure. Next, we note that the process \(X^k_{i+1}-X^k_i\) hits zero if and only if the process

$$\begin{aligned} Z=\frac{1}{2}\Big ((X_i^{k-1}-X^k_i)^2+(X_{i+1}^k-X_i^{k-1})^2\Big ) \end{aligned}$$
(4.16)

hits zero and in this case both events occur at the same time (indeed, by the interlacing both \(X^{k-1}_i-X^k_i\) and \(X^k_{i+1}-X^{k-1}_i\) are non-negative on \([\sigma ,\sigma ']\), and their sum equals \(X^k_{i+1}-X^k_i\)). Moreover, applying Itô’s formula (see e.g. [29, Section 3.3.A]) and simplifying the result we obtain

$$\begin{aligned}&Z(t\wedge \sigma ')-Z(t\wedge \sigma ) = \int _{t\wedge \sigma }^{t\wedge \sigma '} \Big ((1+\theta )\,\mathrm{d}s +(X_i^{k-1}-X^k_i)\,\mathrm{d}(\tilde{W}^{k-1}_i-\tilde{W}^k_i)\\&\quad +\,(X_{i+1}^k-X_i^{k-1})\,\mathrm{d}(\tilde{W}^k_{i+1}-\tilde{W}^{k-1}_i)\Big ) =:\int _{t\wedge \sigma }^{t\wedge \sigma '} \big ((1+\theta )\,\mathrm{d}s+\mathrm{d}M\big ) \end{aligned}$$

where \(M\) is a local martingale whose quadratic variation process satisfies

$$\begin{aligned} \langle M\rangle (t\wedge \sigma ')-\langle M\rangle (t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \Big (2\,(X_i^{k-1}-X^k_i)^2+2\,(X_{i+1}^k-X_i^{k-1})^2\\&-2\,(X_i^{k-1}-X^k_i)(X_{i+1}^k-X_i^{k-1})\Big )\,\mathrm{d}s,\quad t\ge 0. \end{aligned}$$
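
For the reader's convenience, the last two displays can be verified as follows, writing \(A:=X^{k-1}_i-X^k_i\) and \(B:=X^k_{i+1}-X^{k-1}_i\) (this shorthand is used only for this check), so that \(Z=\frac{1}{2}(A^2+B^2)\) and \(A+B=X^k_{i+1}-X^k_i\). The two-level dynamics above yield on \([\sigma ,\sigma ']\)

$$\begin{aligned} \mathrm{d}A=\mathrm{d}\big (\tilde{W}^{k-1}_i-\tilde{W}^k_i\big )+(1-\theta )\Big (\frac{1}{A+B}-\frac{1}{A}\Big )\,\mathrm{d}s,\qquad \mathrm{d}B=\mathrm{d}\big (\tilde{W}^k_{i+1}-\tilde{W}^{k-1}_i\big )+(1-\theta )\Big (\frac{1}{A+B}-\frac{1}{B}\Big )\,\mathrm{d}s. \end{aligned}$$

Hence, the first order terms \(A\,\mathrm{d}A+B\,\mathrm{d}B\) contribute the drift \((1-\theta )\big (\frac{A+B}{A+B}-2\big )\,\mathrm{d}s=(\theta -1)\,\mathrm{d}s\), the Itô correction \(\frac{1}{2}\big (\mathrm{d}\langle A\rangle +\mathrm{d}\langle B\rangle \big )\) contributes \(2\,\mathrm{d}s\), and together they give the drift \((1+\theta )\,\mathrm{d}s\) above. Moreover, since the \(\tilde{W}\)'s have the same quadratic covariations as the original driving Brownian motions (Girsanov's change of measure does not affect these), the martingale part equals \(-A\,\mathrm{d}\tilde{W}^k_i+(A-B)\,\mathrm{d}\tilde{W}^{k-1}_i+B\,\mathrm{d}\tilde{W}^k_{i+1}\), and summing the squares of the three coefficients gives the integrand \(A^2+(A-B)^2+B^2=2A^2+2B^2-2AB\) in the expression for \(\langle M\rangle \).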

We can now define the (random) time change

$$\begin{aligned} s(t)=\inf \left\{ s\ge 0:\int _{s\wedge \sigma }^{s\wedge \sigma '} (1+\theta )\,\mathrm{d}u=t\right\} ,\quad 0\le t\le \int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u \end{aligned}$$

and rewrite the stochastic integral equation for \(Z\) as

$$\begin{aligned} Z(s(t_2))-Z(s(t_1))=(t_2-t_1)+M(s(t_2))-M(s(t_1)),\quad 0\le t_1\le t_2\le \int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u. \end{aligned}$$

A standard application of the Optional Sampling Theorem (see e.g. the proof of [29, Theorem 4.6, Chapter 3] for a similar argument) shows that the process \(M(s(t))\), \(t\ge 0\) is a local martingale in its natural filtration. In addition,

$$\begin{aligned} \langle M\rangle (s(t_2))-\langle M\rangle (s(t_1))&\le \int _{t_1}^{t_2} \frac{2\,(X_i^{k-1}(s(t))-X^k_i(s(t)))^2+2\,(X_{i+1}^k(s(t))-X_i^{k-1}(s(t)))^2}{1+\theta }\,\mathrm{d}t \\&\le 2\,\int _{t_1}^{t_2} Z(s(t))\,\mathrm{d}t, \quad \quad \quad 0\le t_1\le t_2\le \int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u. \end{aligned}$$

It follows that the process

$$\begin{aligned} \tilde{Z}(t)= {\left\{ \begin{array}{ll} Z(s(t)) &{} \text {if}\ t\in \big [0,\int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u\big ] \\ Z\big (s\big (\int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u\big )\big )+\big (t-\int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u\big ) &{} \text {if}\ t\in \big (\int _{\sigma }^{\sigma '} (1+\theta )\,\mathrm{d}u,\infty \big ) \end{array}\right. } \end{aligned}$$

falls into the framework of Lemma 4.6. Consequently, the original process \(Z(t)\), \(t\ge 0\) does not reach zero on the time interval \([\sigma ,\sigma ']\) with probability one. Using Girsanov’s Theorem again (now to go back to the original probability measure) we conclude that \(X^k_{i+1}-X^k_i\) does not hit zero on \([\sigma ,\sigma ']\) under the original probability measure, which is the desired contradiction. \(\square \)

Next, we study the events \(C\)–\(F\) in Fig. 2. All of them can be dealt with in exactly the same manner (in particular, using a Lyapunov function of the same form) and we will only show the following:

Lemma 4.8

The event

$$\begin{aligned} X^k_i(t)=X^{k-1}_i(t)=X^{k-2}_i(t) \end{aligned}$$
(4.17)

cannot occur in finite time.

Proof

To show the non-occurrence of the event in (4.17), we again argue by induction over \(k\) and by contradiction. Assuming that the event in (4.17) occurs in finite time, we may invoke the induction hypothesis and Lemma 4.7 to find a random time interval \([\sigma ,\sigma ']\), with \(\sigma \), \(\sigma '\) being stopping times, and a real number \(\kappa >0\) such that the event in (4.17) occurs for the first time at \(\sigma '\) and one of the following two situations holds. Either \(X^k_i(\sigma ')=X^k_{i+1}(\sigma ')=X^{k-1}_i(\sigma ')=X^{k-2}_i(\sigma ')\) and the spacings \(X^k_i-X^{k-1}_{i-1}\), \(X^{k-1}_i-X^{k-2}_{i-1}\), \(X^{k-2}_i-X^{k-3}_{i-1}\), \(X^{k-3}_i-X^{k-2}_i\), \(X^{k-1}_{i+1}-X^{k-2}_i\), \(X^{k-1}_{i+1}-X^k_{i+1}\) are bounded below by \(\kappa \) on \([\sigma ,\sigma ']\); or \(X^k_i(\sigma ')=X^{k-1}_i(\sigma ')=X^{k-2}_i(\sigma ')\) and the spacings \(X^k_i-X^{k-1}_{i-1}\), \(X^{k-1}_i-X^{k-2}_{i-1}\), \(X^{k-2}_i-X^{k-3}_{i-1}\), \(X^{k-3}_i-X^{k-2}_i\), \(X^{k-1}_{i+1}-X^{k-2}_i\), \(X^k_{i+1}-X^{k-1}_i\) are bounded below by \(\kappa \) on \([\sigma ,\sigma ']\). Figure 5 shows schematic illustrations of these two cases. In the first case, we can make a Girsanov change of measure such that under the new measure the evolution of the particles \(X^k_i\), \(X^k_{i+1}\), \(X^{k-1}_i\), \(X^{k-2}_i\) decouples from the rest of the particle system on the time interval \([\sigma ,\sigma ']\). Similarly, in the second case, we can apply a Girsanov change of measure such that under the new measure the dynamics of the particles \(X^k_i\), \(X^{k-1}_i\), \(X^{k-2}_i\) decouples from the dynamics of the rest of the particle configuration.

We only treat the first of the two cases in detail (the second case can be dealt with by proceeding as below with the same definitions for \(R\), \(S\), and \(Z\) redefined to \(\frac{1}{2}(R^2+S^2+2\,R\,S)\)). In the first case, the decoupled particles satisfy under the new measure:

$$\begin{aligned} X^k_i(t\wedge \sigma ')-X^k_i(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \Big (\mathrm{d}\tilde{W}^k_i+\frac{(1-\theta )\,\mathrm{d}s}{X^k_i(s)-X^k_{i+1}(s)}-\frac{(1-\theta )\,\mathrm{d}s}{X^k_i(s)-X^{k-1}_i(s)}\Big ), \\ X^k_{i+1}(t\wedge \sigma ')-X^k_{i+1}(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \Big (\mathrm{d}\tilde{W}^k_{i+1}+\frac{(1-\theta )\,\mathrm{d}s}{X^k_{i+1}(s)-X^k_i(s)} -\frac{(1-\theta )\,\mathrm{d}s}{X^k_{i+1}(s)-X^{k-1}_i(s)}\Big ), \\ X^{k-1}_i(t\wedge \sigma ')-X^{k-1}_i(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} \Big (\mathrm{d}\tilde{W}^{k-1}_i-\frac{(1-\theta )\,\mathrm{d}s}{X^{k-1}_i(s)-X^{k-2}_i(s)}\Big ),\\ X^{k-2}_i(t\wedge \sigma ')-X^{k-2}_i(t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '}\,\mathrm{d}\tilde{W}^{k-2}_i \end{aligned}$$

where \(\tilde{W}^k_i,\,\tilde{W}^k_{i+1},\,\tilde{W}^{k-1}_i,\,\tilde{W}^{k-2}_i\) are independent standard Brownian motions under the new measure. Next, we set \(R:=X^{k-1}_i-X^k_i\), \(S:=X^{k-2}_i-X^{k-1}_i\), \(U:=X^k_{i+1}-X^{k-1}_i\), \(B_1:=\tilde{W}^{k-1}_i-\tilde{W}^k_i\), \(B_2:=\tilde{W}^{k-2}_i-\tilde{W}^{k-1}_i\), \(B_3:=\tilde{W}^k_{i+1}-\tilde{W}^{k-1}_i\) and define \(Z:=\frac{1}{2}(R^2+S^2+U^2+2\,R\,U)\). Applying Itô’s formula (see e.g. [29, Section 3.3.A]) and simplifying, we obtain

$$\begin{aligned}&Z(t\wedge \sigma ')-Z(t\wedge \sigma ) =\int _{t\wedge \sigma }^{t\wedge \sigma '} \left( 1+\theta +(\theta -1)\left( \frac{U(s)}{R(s)}+\frac{R(s)}{U(s)}\right) \right) \mathrm{d}s\\&\quad +\,(R+U)\mathrm{d}(B_1+B_3)+S\mathrm{d}B_2 =:\int _{t\wedge \sigma }^{t\wedge \sigma '} \Big (D(s)\,\mathrm{d}s+\mathrm{d}M\Big ) \end{aligned}$$

where \(M\) is a local martingale whose quadratic variation process satisfies

$$\begin{aligned} \langle M\rangle (t\wedge \sigma ')-\langle M\rangle (t\wedge \sigma )&= \int _{t\wedge \sigma }^{t\wedge \sigma '} 2\,(R^2+S^2+U^2+2\,R\,U)\,\mathrm{d}s\\&= \int _{t\wedge \sigma }^{t\wedge \sigma '} 4\,Z\,\mathrm{d}s,\quad t\ge 0. \end{aligned}$$
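
As in the proof of Lemma 4.7, these two displays can be checked directly. Set \(P:=R+U=X^k_{i+1}-X^k_i\) (notation used only for this check), so that \(Z=\frac{1}{2}(P^2+S^2)\) and \(B_1+B_3=\tilde{W}^k_{i+1}-\tilde{W}^k_i\). The decoupled dynamics give on \([\sigma ,\sigma ']\)

$$\begin{aligned} \mathrm{d}P=\mathrm{d}\big (\tilde{W}^k_{i+1}-\tilde{W}^k_i\big )+(1-\theta )\Big (\frac{2}{P}-\frac{1}{R}-\frac{1}{U}\Big )\,\mathrm{d}s,\qquad \mathrm{d}S=\mathrm{d}B_2-\frac{1-\theta }{S}\,\mathrm{d}s, \end{aligned}$$

so that \(P\,\mathrm{d}P+S\,\mathrm{d}S\) contributes the drift \((1-\theta )\big (2-\frac{P}{R}-\frac{P}{U}\big )\,\mathrm{d}s-(1-\theta )\,\mathrm{d}s=\big ((\theta -1)+(\theta -1)\big (\frac{U}{R}+\frac{R}{U}\big )\big )\,\mathrm{d}s\), the Itô correction \(\frac{1}{2}\big (\mathrm{d}\langle P\rangle +\mathrm{d}\langle S\rangle \big )\) contributes \(2\,\mathrm{d}s\), and the two add up to \(D(s)\,\mathrm{d}s\); moreover, \(\mathrm{d}\langle M\rangle =2\,P^2\,\mathrm{d}s+2\,S^2\,\mathrm{d}s=4\,Z\,\mathrm{d}s\). Note also that \(D(s)\ge 1+\theta +2\,(\theta -1)=3\,\theta -1>2\) for \(\theta >1\), which is what is used in the estimate below.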

Next, we introduce the time change

$$\begin{aligned} s(t)=\inf \left\{ s\ge 0:\int _{s\wedge \sigma }^{s\wedge \sigma '} D(u)\,\mathrm{d}u=t\right\} ,\quad 0\le t\le \int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u \end{aligned}$$

and rewrite the latter stochastic integral equation for \(Z\) as

$$\begin{aligned} Z(s(t_2))-Z(s(t_1))=(t_2-t_1)+M(s(t_2))-M(s(t_1)),\quad 0\le t_1\le t_2\le \int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u. \end{aligned}$$

At this point, a routine application of the Optional Sampling Theorem (see e.g. the proof of [29, Theorem 4.6, Chapter 3] for an argument of this type) shows that \(M(s(t))\), \(t\ge 0\) is a local martingale in its natural filtration. In addition,

$$\begin{aligned} \langle M\rangle (s(t_2))-\langle M\rangle (s(t_1))=\int _{t_1}^{t_2} \frac{4\,Z(s(t))}{D(s(t))}\,\mathrm{d}t \le 2\int _{t_1}^{t_2} Z(s(t))\,\mathrm{d}t,\quad 0\le t_1\le t_2\le \int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u. \end{aligned}$$

Hence, the process

$$\begin{aligned} \tilde{Z}(t)= {\left\{ \begin{array}{ll} Z(s(t)) &{} \text {if}\;\;\;t\in \big [0,\int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u\big ] \\ Z\big (s\big (\int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u\big )\big )+\big (t-\int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u\big ) &{} \text {if}\;\;\;t\in \big (\int _\sigma ^{\sigma '} D(u)\,\mathrm{d}u,\infty \big ) \end{array}\right. } \end{aligned}$$

falls into the setting of Lemma 4.6. The result of that lemma implies that the original process \(Z(t)\), \(t\ge 0\) does not hit zero on the time interval \([\sigma ,\sigma ']\) with probability one. Changing the measure back to the original probability measure by a suitable application of Girsanov’s Theorem we conclude that the same is true under the original probability measure. This is the desired contradiction. \(\square \)

Fig. 5
figure 5

Decoupling of the particles \(X^k_i\), \(X^k_{i+1}\), \(X^{k-1}_i\), \(X^{k-2}_i\) (left panel) and the particles \(X^k_i\), \(X^{k-1}_i\), \(X^{k-2}_i\) (right panel)

Putting together Lemmas 4.7 and 4.8 we deduce that \(\tau _0[X]=\infty \) for all \(\theta >1\).

Finally, for \(\theta \ge 2\) and any \(T>0\), we have shown that the law of our process up to time \(\tau _\delta [X]\wedge T\) is absolutely continuous with respect to the law of a process consisting of a number of Brownian motions and Bessel processes of dimension \(\theta \). The definition of the latter implies that two of its components can collide only if the corresponding Bessel process hits zero. However, it is well known that, with probability one, a Bessel process of dimension \(\theta \ge 2\) does not reach zero (see e.g. [42, Chapter XI, Section 1]). It follows that \(\widehat{\tau }_\delta [X]\ge \lim _{T\rightarrow \infty } \tau _\delta [X]\wedge T=\tau _\delta [X]\). Passing to the limit \(\delta \downarrow 0\), we conclude that \(\widehat{\tau }_0[X]=\infty \).

5 Convergence to multilevel Dyson Brownian Motion

In this section we study the diffusive scaling limit of the multilevel process \(X^{multi}_{disc}\) of Definition 2.27. We start by formulating our main results.

We fix \(\theta >0\), let \(\varepsilon >0\) be a small parameter and define the \(\frac{N(N+1)}{2}\)-dimensional stochastic process \(Y^{mu}_\varepsilon =((Y^{mu}_\varepsilon )_i^k:\,1\le i\le k\le N)\) by

$$\begin{aligned} (Y^{mu}_\varepsilon )^k_i(t)= \varepsilon ^{1/2} \Big ((X^{multi}_{disc})^k_{k+1-i}\Big (\frac{t}{\theta \varepsilon }\Big )-\frac{t}{\varepsilon }\Big ),\;\;t\ge 0,\qquad 1\le i\le k\le N \end{aligned}$$

where \((X^{multi}_{disc})_i^k\), \(1\le i\le k\le N\) are the coordinate processes of \(X^{multi}_{disc}\). Here, in contrast to Definition 2.27, we allow \(X^{multi}_{disc}\) (or, equivalently, \(Y^{mu}_\varepsilon \)) to start from an arbitrary initial condition, in particular, one that depends on \(\varepsilon \). In addition, we use the notation \(D^{N(N+1)/2}=D([0,\infty ),\mathbb {R}^{N(N+1)/2})\) for the space of right-continuous paths with left limits taking values in \(\mathbb {R}^{N(N+1)/2}\), endowed with the Skorokhod topology.

Theorem 5.1

Let \(\theta >0\) and suppose that the family of initial conditions \(Y^{mu}_\varepsilon (0)\), \(\varepsilon \in (0,1)\) is tight on \(\mathbb {R}^{N(N+1)/2}\). Then the family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) is tight on \(D^{N(N+1)/2}\).

We defer the proof of Theorem 5.1 to Sect. 5.2.

Next, we let \(Y^{mu}\) be an arbitrary limit point as \(\varepsilon \downarrow 0\) of the tight family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\). For \(\theta \ge 2\), we can uniquely identify the limit point with the solution of (4.1), thus, obtaining the diffusive scaling limit. For \(\theta \in \bigl [\frac{1}{2},2\bigr )\), we give a partial result towards such an identification.

Theorem 5.2

Let \(\theta \ge 2\) (that is, \(\beta =2\theta \ge 4\)) and suppose that the initial conditions \(Y^{mu}_\varepsilon (0)\), \(\varepsilon \in (0,1)\) converge as \(\varepsilon \downarrow 0\) in distribution to a limit \(Y^{mu}(0)\) which takes values in the interior of the Gelfand–Tsetlin cone \(\overline{\mathcal {G}^N}\) with probability one. Then the family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) converges as \(\varepsilon \downarrow 0\) in distribution in \(D^{N(N+1)/2}\) to the unique solution of the system of SDEs

$$\begin{aligned} \mathrm{d}(Y^{mu})^k_i&= \left( \sum _{m\ne i} \frac{1-\theta }{(Y^{mu})^k_i-(Y^{mu})^k_m} -\sum _{m=1}^{k-1} \frac{1-\theta }{(Y^{mu})^k_i-(Y^{mu})^{k-1}_m}\right) \,\mathrm{d}t\\&+\,\mathrm{d}W_i^k , \;\;1\le i\le k\le N \end{aligned}$$

started from \(Y^{mu}(0)\) and where \(W_i^k\), \(1\le i\le k\le N\) are independent standard Brownian motions.

We give the proof of Theorem 5.2 in Sect. 5.3. We expect Theorem 5.2 to be valid for all \(\theta >1\), but we are not able to prove this generalization.
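
Although no numerics enter the arguments below, the limiting system is straightforward to explore by simulation. The following minimal Python sketch applies a naive Euler discretization to the SDE of Theorem 5.2 for \(N=3\); the step size, the value of \(\theta \) and the initial configuration are illustrative choices of ours, and the scheme comes with no accuracy guarantees (in particular, the singular drift requires the step size to be small compared to the particle spacings).

```python
import numpy as np

rng = np.random.default_rng(1)

def drift(Y, theta):
    """Drift of the SDE of Theorem 5.2: level k interacts with itself and
    with level k-1, both with coefficient (1 - theta)."""
    out = {}
    for k, level in Y.items():
        d = np.zeros_like(level)
        for i, y in enumerate(level):
            d[i] = sum((1.0 - theta) / (y - level[m])
                       for m in range(len(level)) if m != i)
            if k >= 2:  # interaction with the level below
                d[i] -= sum((1.0 - theta) / (y - z) for z in Y[k - 1])
        out[k] = d
    return out

def euler_step(Y, theta, dt):
    """One naive Euler step; breaks down if particles come too close."""
    d = drift(Y, theta)
    return {k: Y[k] + d[k] * dt + np.sqrt(dt) * rng.standard_normal(len(Y[k]))
            for k in Y}

theta, dt = 3.0, 1e-4                       # theta >= 2, i.e. beta = 2*theta >= 4
Y = {1: np.array([0.0]),                    # an interlacing initial configuration
     2: np.array([-2.0, 2.0]),
     3: np.array([-4.0, 0.5, 4.0])}
for _ in range(10_000):                     # simulate up to time 1
    Y = euler_step(Y, theta, dt)
print({k: np.round(v, 3) for k, v in Y.items()})
```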

Theorem 5.3

Suppose that the initial conditions \(Y^{mu}_\varepsilon (0)\), \(\varepsilon \in (0,1)\) converge as \(\varepsilon \downarrow 0\) in distribution to a limit \(Y^{mu}(0)\) which takes values in the interior of the Gelfand–Tsetlin cone \(\overline{\mathcal {G}^N}\) with probability one. In addition, suppose that the distribution of \(Y^{mu}(0)\) is \(\theta \)-Gibbs in the sense of Definition 2.15.

  (a)

    If \(\theta \ge \frac{1}{2}\) (that is, \(\beta =2\theta \ge 1\)), then the restriction of \(Y^{mu}\) to level \(N\), that is, the process \(((Y^{mu})^N_1,\dots ,(Y^{mu})^N_N)\), is a \((2\theta )\)-Dyson Brownian motion:

    $$\begin{aligned} \mathrm{d}(Y^{mu})^N_i(t) = \sum _{m\ne i} \frac{\theta }{(Y^{mu})^N_i(t)-(Y^{mu})^N_m(t)}\,\mathrm{d}t + \mathrm{d}W_i ,\;\;1\le i \le N \end{aligned}$$

    with \(W_i\), \(1\le i\le N\) being independent standard Brownian motions.

  (b)

    For any \(\theta >0\) and any fixed \(t>0\), the distribution of \(Y^{mu}(t)\) is \(\theta \)-Gibbs.

We expect the first part of Theorem 5.3 to be valid for all \(\theta >0\), but we are currently not able to prove this.

Proof of Theorem 5.3

The theorem follows from a combination of Propositions 2.29, 2.16 and Theorem 3.2. \(\square \)

Corollary 5.4

Take any \(\theta >0\) and suppose that \(Y^{mu}_\varepsilon (0)=0\in \mathbb {R}^{N(N+1)/2}\), \(\varepsilon \in (0,1)\). Then, for any \(t\ge 0\), the distribution of \(Y^{mu}(t)\) is given by the Hermite \(\beta =2\theta \) corners process of variance \(t\) (see Definition 1.1).

Remark 5.5

Since for any \(t>0\), the Hermite \(\beta =2\theta \) corners process of variance \(t\) is supported on the interior of the Gelfand–Tsetlin cone \(\overline{\mathcal {G}^N}\), it follows that Theorem 5.2 (for \(\theta \ge 2\)) can be applied in this case as well. Consequently, for \(\theta \ge 2\), the process \(Y^{mu}\) started from the zero initial condition is a diffusion that combines Dyson Brownian motions and corners processes into a single picture as desired.

Proof of Corollary 5.4

The corollary is a consequence of Proposition 2.29 and Corollary 2.17. \(\square \)

The rest of this section is devoted to the proofs of Theorems 5.1 and 5.2. Our proof strategy is similar to that in the proof of Theorem 3.2: In Sect. 5.1 we analyze the asymptotic behavior of the jump rates of the processes \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\), in Sect. 5.2 we use these asymptotics to prove that the family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) is tight, and in Sect. 5.3 we deduce the SDE (4.1) for subsequential limits as \(\varepsilon \downarrow 0\) of this family when \(\theta \ge 2\). We omit the details in the parts that are parallel to the arguments of Sect. 3.

5.1 Step 1: Rates

We start by noting that, for each \(\varepsilon \in (0,1)\), \(Y^{mu}_\varepsilon \) is a continuous time Markov process with state space \(\overline{\mathcal {G}^N}\), a (constant) drift of \(-\varepsilon ^{-1/2}\) in each coordinate and jump rates

$$\begin{aligned} q^{mu}_\varepsilon (y,y',t)=\frac{1}{\theta \varepsilon }\,q_{\frac{t}{\theta }\,\varepsilon ^{-1}+\hat{y}\,\varepsilon ^{-1/2}\,\rightarrow \,\frac{t}{\theta }\,\varepsilon ^{-1}+\hat{y'}\,\varepsilon ^{-1/2}} \end{aligned}$$

where \(\hat{y}\), \(\hat{y'}\) are the vectors obtained from \(y\), \(y'\) by reordering the coordinates on each level in decreasing order, and intensities \(q\) are given by (2.23). Write \(y'\approx _\varepsilon y\) for vectors \(y,y'\in \overline{\mathcal {G}^N}\) such that \(y'\) can be obtained from \(y\) by increasing one coordinate (say, \(y^k_i\)) by \(\varepsilon ^{1/2}\), and, if necessary, by increasing other coordinates as well to preserve the interlacing condition (in the sense of the push interaction as explained after (2.23)). Clearly, \(q^{mu}_\varepsilon (y,y',t)=0\) unless \(y'\approx _\varepsilon y\). As we will see, in fact, \(q^{mu}_\varepsilon (y,y',t)\) does not depend on \(t\).

Lemma 5.6

For any sequence of vectors \(y'\approx _\varepsilon y\) and any fixed \(k\in \{1,\dots ,N\}\), \(i\in \{1,2,\ldots ,k\}\) as above one has the following \(\varepsilon \downarrow 0\) asymptotics:

$$\begin{aligned} q^{mu}_\varepsilon (y,y',t)=\varepsilon ^{-1}+\varepsilon ^{-1/2}\Big (\sum _{m\ne i} \frac{1-\theta }{y_i^k-y^k_m}-\sum _{m=1}^{k-1} \frac{1-\theta }{y_i^k-y_m^{k-1}}\Big )+O(1) \end{aligned}$$
(5.1)

with a uniform \(O(1)\) remainder on compact subsets of the open Gelfand–Tsetlin cone \({{\mathcal {G}^N}}\).

Proof

We write \(\hat{y}_i^k\) for \(y_{k+1-i}^k\) and \((\hat{y}')_i^k\) for \((y')_{k+1-i}^k\). Using (2.23) and Proposition 2.4, and arguing as in the proof of Lemma 3.4, we rewrite \(q^{mu}_\varepsilon (y,y',t)\) as

$$\begin{aligned}&\varepsilon ^{-1}\prod _{l=1}^{i-1} \frac{(\hat{y}_l^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-l+1)}{(\hat{y}_l^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-l)}\cdot \frac{(\hat{y}_l^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-l-1)}{(\hat{y}_l^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-l)}\\&\quad \times \prod _{1\le m\le n\le k-1} \frac{\big (((\hat{y}')^{k-1}_m-(\hat{y}')^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+\theta \big )_{((\hat{y}')^{k-1}_n-(\hat{y}')^k_{n+1})\,\varepsilon ^{-1/2}}}{\big (((\hat{y}')^{k-1}_m-(\hat{y}')^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+1\big )_{((\hat{y}')^{k-1}_n-(\hat{y}')^k_{n+1})\,\varepsilon ^{-1/2}}}\\&\quad \times \, \frac{\big (((\hat{y}')^k_m-(\hat{y}')^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+1\big )_{((\hat{y}')^{k-1}_n-(\hat{y}')^k_{n+1})\,\varepsilon ^{-1/2}}}{\big (((\hat{y}')^k_m-(\hat{y}')^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+\theta \big )_{((\hat{y}')^{k-1}_n-(\hat{y}')^k_{n+1})\,\varepsilon ^{-1/2}}}\\&\quad \times \, \prod _{1\le m\le n\le k-1} \frac{\big ((y^{k-1}_m-y^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+1\big )_{(y^{k-1}_n-y^k_{n+1})\,\varepsilon ^{-1/2}}}{\big ((y^{k-1}_m-y^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+\theta \big )_{(y^{k-1}_n-y^k_{n+1})\,\varepsilon ^{-1/2}}}\\&\quad \times \, \frac{\big ((y^k_m-y^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+\theta \big )_{(y^{k-1}_n-y^k_{n+1})\,\varepsilon ^{-1/2}}}{\big ((y^k_m-y^{k-1}_n)\,\varepsilon ^{-1/2}+\theta \,(n-m)+1\big )_{(y^{k-1}_n-y^k_{n+1})\,\varepsilon ^{-1/2}}}. \end{aligned}$$

Using the fact that \(y\) and \(y'\) differ only in one coordinate, we can simplify the latter expression to

$$\begin{aligned}&\!\!\!\!\varepsilon ^{-1}\,\prod _{m=1}^{i-1} \frac{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-m+1)}{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-m)}\, \frac{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-m-1)}{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-m)}\\&\quad \times \prod _{m=1}^{i-1} \frac{(\hat{y}_m^{k-1}-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-1-m)}{(\hat{y}_m^{k-1}-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-m)} \prod _{n=i}^{k-1} \frac{(\hat{y}_i^k-\hat{y}_{n+1}^k)\,\varepsilon ^{-1/2}+\theta \,(n-i)+1}{(\hat{y}_i^k-\hat{y}_n^{k-1})\,\varepsilon ^{-1/2}+\theta \,(n-i)+1}\\&\quad \times \prod _{m=1}^{i-1} \frac{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-m)-1}{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-1-m)}\, \prod _{n=i}^{k-1} \frac{(\hat{y}_i^k-\hat{y}_n^{k-1})\,\varepsilon ^{-1/2}+\theta \,(n-i+1)}{(\hat{y}_i^k-\hat{y}_{n+1}^k)\,\varepsilon ^{-1/2} +\theta \,(n-i+1)}\\&=\varepsilon ^{-1}\,\prod _{m=1}^{i-1} \frac{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-m+1)}{(\hat{y}_m^k-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-m)}\\&\quad \times \frac{(\hat{y}_m^{k-1}-\hat{y}_i^k)\,\varepsilon ^{-1/2}+\theta \,(i-1-m)}{(\hat{y}_m^{k-1}-\hat{y}_i^k)\,\varepsilon ^{-1/2}-1+\theta \,(i-m)}\\&\quad \times \prod _{n=i}^{k-1} \frac{(\hat{y}_i^k-\hat{y}_{n+1}^k)\,\varepsilon ^{-1/2}+\theta \,(n-i)+1}{(\hat{y}_i^k-\hat{y}_{n+1}^k)\,\varepsilon ^{-1/2}+\theta \,(n-i+1)} \, \frac{(\hat{y}_i^k-\hat{y}_n^{k-1})\,\varepsilon ^{-1/2}+\theta \,(n-i+1)}{(\hat{y}_i^k-\hat{y}_n^{k-1})\,\varepsilon ^{-1/2}+\theta \,(n-i)+1}. \end{aligned}$$

Expanding the last expression into a Taylor series in terms of \(\varepsilon ^{1/2}\) we get

$$\begin{aligned} \varepsilon ^{-1}&+ \varepsilon ^{-1/2}\left( \sum _{m=1}^{i-1} \left( \frac{\theta -1}{\hat{y}_m^k-\hat{y}_i^k}+ \frac{1-\theta }{\hat{y}_m^{k-1}-\hat{y}_i^k}\right) \right. \\&\left. +\sum _{n=i}^{k-1} \left( \frac{1-\theta }{\hat{y}_i^k-\hat{y}_{n+1}^k}+ \frac{\theta -1}{\hat{y}_i^k-\hat{y}_n^{k-1}}\right) \right) +O(1). \end{aligned}$$
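
Indeed, each factor in the last expression is of the form \(\frac{x\,\varepsilon ^{-1/2}+a}{x\,\varepsilon ^{-1/2}+b}\) with constants \(a,b\) and with \(x\) bounded away from zero on compact subsets of \({\mathcal {G}^N}\), and

$$\begin{aligned} \frac{x\,\varepsilon ^{-1/2}+a}{x\,\varepsilon ^{-1/2}+b}=1+\frac{a-b}{x}\,\varepsilon ^{1/2}+O(\varepsilon ),\qquad \varepsilon \downarrow 0. \end{aligned}$$

Multiplying out the finitely many factors and collecting the terms of order \(\varepsilon ^{1/2}\) yields the display above; for instance, the first factor in the product over \(m\) has \(a-b=\big (\theta \,(i-m+1)-1\big )-\theta \,(i-m)=\theta -1\) and \(x=\hat{y}_m^k-\hat{y}_i^k\), which produces the term \(\frac{\theta -1}{\hat{y}_m^k-\hat{y}_i^k}\).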

The lemma now readily follows. \(\square \)

5.2 Step 2: Tightness

We show next that the family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) is tight on \(D^{N(N+1)/2}\). To this end, we aim to apply the necessary and sufficient conditions for tightness of [19, Corollary 3.7.4] which amount to showing that, for any fixed \(t\ge 0\), the random variables \(Y^{mu}_\varepsilon (t)\), \(\varepsilon \in (0,1)\) are tight on \(\mathbb {R}^{N(N+1)/2}\) and that, for every \(\Delta >0\) and \(T>0\), there exists a \(\delta >0\) such that

$$\begin{aligned} \limsup _{\varepsilon \downarrow 0}\,\mathbb {P}\left( \sup _{0\le s<t\le T,t-s<\delta } \big |(Y^{mu}_\varepsilon )^k_i(t)-(Y^{mu}_\varepsilon )^k_i(s)\big |>\Delta \right) <\Delta ,\quad 1\le i\le k\le N. \end{aligned}$$

We start by explaining how to deal with \((Y^{mu}_\varepsilon (t))_+\) (the vector of positive parts of components of \(Y^{mu}_\varepsilon (t)\)) and

$$\begin{aligned} \sup _{0\le s<t\le T,t-s<\delta } \big ((Y^{mu}_\varepsilon )^k_i(t)-(Y^{mu}_\varepsilon )^k_i(s)\big ),\quad 1\le i\le k\le N. \end{aligned}$$
(5.2)

We argue by induction over \(k\). For \(k=1\), it is sufficient to observe that \((Y^{mu}_\varepsilon )^1_1\) is a Poisson process with jump size \(\varepsilon ^{1/2}\), jump rate \(\varepsilon ^{-1}\) and drift \(-\varepsilon ^{-1/2}\) and, hence, converges to a standard Brownian motion in the limit \(\varepsilon \downarrow 0\). Therefore, the necessary and sufficient conditions of [19, Corollary 3.7.4] hold for \((Y^{mu}_\varepsilon )^1_1\).
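
Though not needed for the proof, the base case is easy to illustrate numerically. The following short Python sketch (the parameter values and function name are ours and purely illustrative) samples the time-\(T\) value of a process with jumps of size \(\varepsilon ^{1/2}\), jump rate \(\varepsilon ^{-1}\) and drift \(-\varepsilon ^{-1/2}\), and checks that its mean is close to \(0\) and its variance close to \(T\), in line with the Brownian limit.

```python
import numpy as np

rng = np.random.default_rng(0)

def endpoint(eps, T):
    """Time-T value of a Poisson process with jump size sqrt(eps),
    jump rate 1/eps and (compensating) drift -1/sqrt(eps)."""
    return np.sqrt(eps) * rng.poisson(T / eps) - T / np.sqrt(eps)

eps, T = 1e-4, 1.0
samples = np.array([endpoint(eps, T) for _ in range(5000)])
print("mean :", samples.mean())   # close to 0
print("var  :", samples.var())    # close to T = 1
```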

We now fix some \(k\ge 2\) and distinguish the cases \(0<\theta \le 1\) and \(\theta >1\). In the first case, we consider first \(i=k\). It is easy to see from the formulas for the jump rates in the proof of Lemma 5.6 that for \(0<\theta <1\), whenever the spacing \((Y^{mu}_\varepsilon )^k_k-(Y^{mu}_\varepsilon )^{k-1}_{k-1}\) exceeds \(\Delta /3\), the jump rate to the right of \((Y^{mu}_\varepsilon )^k_k\) is bounded above by \(\varepsilon ^{-1}\). For \(\theta =1\) the jump rate is equal to \(\varepsilon ^{-1}\). Hence, arguing as in Sect. 3.2, we conclude that for \(0<\theta \le 1\), \((Y^{mu}_\varepsilon )^k_k\) can be coupled with a Poisson process with jump size \(\varepsilon ^{1/2}\), jump rate \(\varepsilon ^{-1}\) and drift \(-\varepsilon ^{-1/2}\) in such a way that, whenever \((Y^{mu}_\varepsilon )^k_k-(Y^{mu}_\varepsilon )^{k-1}_{k-1}\) exceeds \(\Delta /3\) and \((Y^{mu}_\varepsilon )^k_k\) has a jump to the right, the Poisson process jumps to the right as well. Therefore, the convergence of such Poisson processes to a standard Brownian motion and the necessary and sufficient conditions of [19, Corollary 3.7.4] for them imply the corresponding conditions for \(((Y^{mu}_\varepsilon )^k_k(t))_+\) and the quantity in (5.2) with \(i=k\).

For \(i\in \{1,2,\ldots ,k-1\}\), the quantity \(((Y^{mu}_\varepsilon )^k_i(t))_+\) can be bounded above by the quantity \(((Y^{mu}_\varepsilon )^{k-1}_i(t))_+\) and the latter satisfies the required condition by the induction hypothesis. Moreover, the formula for the jump rates (5.1) reveals that, whenever \((Y^{mu}_\varepsilon )^k_i-(Y^{mu}_\varepsilon )^{k-1}_{i-1}\) and \((Y^{mu}_\varepsilon )^{k-1}_i-(Y^{mu}_\varepsilon )^k_i\) both exceed \(\Delta /4\), the jump rate to the right of \((Y^{mu}_\varepsilon )^k_i\) is bounded above by

$$\begin{aligned} \varepsilon ^{-1}+\frac{\varepsilon ^{-1/2}(1-\theta )}{(Y^{mu}_\varepsilon )^{k-1}_i(t)-(Y^{mu}_\varepsilon )^k_i(t)}+O(1) \le \varepsilon ^{-1}+4\,\varepsilon ^{-1/2}(1-\theta )/\Delta +O(1).\nonumber \\ \end{aligned}$$
(5.3)

Hence, \((Y^{mu}_\varepsilon )^k_i\) can be coupled with a Poisson process with jump size \(\varepsilon ^{1/2}\), jump rate given by the right-hand side of the latter inequality and drift \(-\varepsilon ^{-1/2}\) in such a way that, whenever \((Y^{mu}_\varepsilon )^k_i-(Y^{mu}_\varepsilon )^{k-1}_{i-1}\) and \((Y^{mu}_\varepsilon )^{k-1}_i-(Y^{mu}_\varepsilon )^k_i\) both exceed \(\Delta /4\) and \((Y^{mu}_\varepsilon )^k_i\) has a jump to the right, the Poisson process jumps to the right as well. Thus, the convergence of such Poisson processes to Brownian motion with drift \(4\,(1-\theta )/\Delta \) and the necessary and sufficient conditions of [19, Corollary 3.7.4] for them imply the corresponding control on the quantities in (5.2).

In the case \(\theta >1\), we first consider \(i=1\). From the formulas for the jump rates in the proof of Lemma 5.6 it is not hard to see that the jump rate to the right of the process \((Y^{mu}_\varepsilon )^k_1\) is bounded above by \(\varepsilon ^{-1}\). Therefore, it can be coupled with a Poisson process with jump size \(\varepsilon ^{1/2}\), jump rate \(\varepsilon ^{-1}\) and drift \(-\varepsilon ^{-1/2}\) in such a way that, whenever \((Y^{mu}_\varepsilon )^k_1\) has a jump to the right, the Poisson process jumps to the right as well. Thus, as in Sect. 3.2, the convergence of such Poisson processes to a Brownian motion and the necessary and sufficient conditions of [19, Corollary 3.7.4] for them give the desired control on \(((Y^{mu}_\varepsilon )^k_1(t))_+\) and the quantity in (5.2) with \(i=1\).

For \(i\in \{2,3,\ldots ,k\}\), the formulas for the jump rates in the proof of Lemma 5.6 reveal that, whenever \((Y^{mu}_\varepsilon )^k_i-(Y^{mu}_\varepsilon )^{k-1}_{i-1}\) exceeds \(\Delta /3\), the jump rate to the right of \((Y^{mu}_\varepsilon )^k_i\) is bounded above by

$$\begin{aligned} \varepsilon ^{-1}\!-\!\frac{\varepsilon ^{-1/2}(1\!-\!\theta )}{(Y^{mu}_\varepsilon )^k_i(t)\!-\!(Y^{mu}_\varepsilon )^{k-1}_{i-1} (t)}\!+\!O(1) \!\le \!\varepsilon ^{-1}\!-\!3\,\varepsilon ^{-1/2}(1\!-\!\theta )/\Delta \!+\!O(1).\qquad \end{aligned}$$
(5.4)

Hence, \((Y^{mu}_\varepsilon )^k_i\) can be coupled with a Poisson process with jump size \(\varepsilon ^{1/2}\), jump rate given by the right-hand side of the last inequality and drift \(-\varepsilon ^{-1/2}\), so that, whenever \((Y^{mu}_\varepsilon )^k_i-(Y^{mu}_\varepsilon )^{k-1}_{i-1}\) exceeds \(\Delta /3\) and \((Y^{mu}_\varepsilon )^k_i\) has a jump to the right, the Poisson process jumps to the right as well. Therefore, as in Sect. 3.2, the convergence of such Poisson processes to Brownian motion with drift \(-3\,(1-\theta )/\Delta \) and the necessary and sufficient conditions of [19, Corollary 3.7.4] for them yield the corresponding conditions for \(((Y^{mu}_\varepsilon )^k_i(t))_+\) and the quantities in (5.2).

Finally, we note that the quantities \((Y^{mu}_\varepsilon (t))_-\) (the vector of negative parts of components of \(Y^{mu}_\varepsilon (t)\)) and

$$\begin{aligned} \sup _{0\le s<t\le T,t-s<\delta } -\big ((Y^{mu}_\varepsilon )^k_i(t)-(Y^{mu}_\varepsilon )^k_i(s)\big ),\quad 1\le i\le k\le N \end{aligned}$$

can be analyzed in a similar manner (however, now by moving from the leftmost to the rightmost particle on every level for \(0<\theta \le 1\) and vice versa for \(\theta >1\)). By combining everything together and using [19, Corollary 3.7.4] we conclude that the family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) is tight.

We also note that, since the maximal size of the jumps tends to zero as \(\varepsilon \downarrow 0\), any limit point of the family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) as \(\varepsilon \downarrow 0\) must have continuous paths (see e.g. [19, Theorem 3.10.2]).

5.3 Step 3: SDE for subsequential limits

Writing \(Y^{mu}\) for an arbitrary limit point as \(\varepsilon \downarrow 0\) of the tight family \(Y^{mu}_\varepsilon \), \(\varepsilon \in (0,1)\) as before, our goal now is to prove that \(Y^{mu}\) solves the SDE (4.1). We pick a sequence of \(Y^{mu}_\varepsilon \) which converges to \(Y^{mu}\) in law, and by virtue of the Skorokhod Representation Theorem (see e.g. [18, Theorem 3.5.1]) may assume that all processes involved are defined on the same probability space and that the convergence holds in the almost sure sense. In the rest of this section all the limits \(\varepsilon \downarrow 0\) are taken along such a sequence.

Define \(\mathcal {F}^{mu}\) as the set of all infinitely differentiable functions on \(\overline{\mathcal {G}^N}\) with support in a compact subset of \(\mathcal {G}^N\). Further, set

$$\begin{aligned} \mathcal {F}^{mu}_\delta :=\big \{f\in \mathcal {F}^{mu}|f(x)=0\;\,\mathrm{whenever}\;\,\mathrm{dist}(x, \partial \overline{\mathcal {G}^N})\le \delta \big \}, \end{aligned}$$

where \(\partial \overline{\mathcal {G}^N}\) stands for the boundary of \(\overline{\mathcal {G}^N}\) and \(\mathrm{dist}(x, \partial \overline{\mathcal {G}^N})\) is the \(L^{\infty }\) distance to the boundary, i.e.

$$\begin{aligned} \mathrm{dist}(x, \partial \overline{\mathcal {G}^N})=\inf _{\begin{array}{c} 1\le i \le j \le N,\\ 1\le i'\le j'\le N,\\ (i,j)\ne (i',j') \end{array}} |x^j_i-x^{j'}_{i'}|. \end{aligned}$$

Clearly \(\mathcal {F}^{mu}=\bigcup _{\delta >0} \mathcal {F}^{mu}_\delta \).

In addition, for every test function \(f\in \mathcal {F}^{mu}\), define the process

$$\begin{aligned} M^f(t)&:= f(Y^{mu}(t))-f(Y^{mu}(0))- \sum _{1\le i\le k\le N} \int _0^t \frac{1}{2}\,f_{y^k_iy^k_i}(Y^{mu}(s))\,\mathrm{d}s\\&\quad -\, \sum _{1\le i\le k\le N} \int _0^t \Bigg (\sum _{m\ne i} \frac{1-\theta }{(Y^{mu})^k_i(s)-(Y^{mu})^k_m(s)}\\&\quad -\sum _{m=1}^{k-1} \frac{1-\theta }{(Y^{mu})^k_i(s)-(Y^{mu})^{k-1}_m(s)}\Bigg )f_{y^k_i}(Y^{mu}(s))\,\mathrm{d}s. \end{aligned}$$

Our first aim is to show that each \(M^f\) is a martingale. To this end, we fix an \(f\in \mathcal {F}^{mu}\) and consider the family of martingales

$$\begin{aligned} M^f_\varepsilon (t)&:= f(Y^{mu}_\varepsilon (t))-f(Y^{mu}_\varepsilon (0)) + \int _0^t \sum _{1\le i\le k\le N} \varepsilon ^{-1/2}\,f_{y^k_i}(Y^{mu}_\varepsilon (s))\,\mathrm{d}s \\&\quad -\,\int _0^t\sum _{y'\approx _\varepsilon Y^{mu}_\varepsilon (s)} q^{mu}_\varepsilon (Y^{mu}_\varepsilon (s),\hat{y}',s)\big (f(y')-f(Y^{mu}_\varepsilon (s))\big ) \,\mathrm{d}s,\quad \varepsilon >0 \end{aligned}$$

where the notations are the same as in Sect. 5.1. Now, one can argue as in Sect. 3.3, Step 3a to conclude that \(M^f\) is a martingale, as the \(\varepsilon \downarrow 0\) limit of the martingales \(M^f_\varepsilon \) which can be bounded uniformly on every compact time interval.
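
To indicate the mechanism behind this limit (the complete argument is the one of Sect. 3.3, Step 3a, and is not reproduced here), fix \(f\in \mathcal {F}^{mu}_\delta \) and note that only configurations \(y\) within \(\ell ^\infty \)-distance \(\varepsilon ^{1/2}\) of the support of \(f\) contribute to the last integral; for small \(\varepsilon \), such \(y\) lie in a compact subset of \({\mathcal {G}^N}\), no pushing occurs, and every admissible \(y'\) differs from \(y\) by \(\varepsilon ^{1/2}\) in a single coordinate \(y^k_i\). Writing \(b^k_i(y)\) for the coefficient of \(\varepsilon ^{-1/2}\) in (5.1) and Taylor expanding \(f(y')-f(y)=\varepsilon ^{1/2}f_{y^k_i}(y)+\frac{\varepsilon }{2}\,f_{y^k_iy^k_i}(y)+O(\varepsilon ^{3/2})\), one finds

$$\begin{aligned} \sum _{y'\approx _\varepsilon y} q^{mu}_\varepsilon (y,y',t)\big (f(y')-f(y)\big )-\varepsilon ^{-1/2}\sum _{1\le i\le k\le N} f_{y^k_i}(y) =\sum _{1\le i\le k\le N} \Big (\frac{1}{2}\,f_{y^k_iy^k_i}(y)+b^k_i(y)\,f_{y^k_i}(y)\Big )+O(\varepsilon ^{1/2}), \end{aligned}$$

uniformly in such \(y\), which is precisely the drift appearing in the definition of \(M^f\).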

Next, for each \(\delta >0\) and \(K>0\), we define the stopping time

$$\begin{aligned} \widehat{\tau }_{\delta ,K}\,{:=}\,\inf \big \{t\ge 0:\mathrm{dist}(Y^{mu}(t),\partial \overline{\mathcal {G}^N})\le \delta \quad \mathrm{or}\quad |(Y^{mu})^k_i(t)|\ge K\;\mathrm{for\;some\;}(k,i)\big \}. \end{aligned}$$

Now, note that for every function \(\overline{\mathcal {G}^N}\rightarrow \mathbb {R}\) of the form \(x\mapsto x^k_i\) or \(x\mapsto x^k_i\,x^{k'}_{i'}\), there is a function \(f\in \mathcal {F}^{mu}\) which coincides with that function on

$$\begin{aligned} \{x\in \overline{\mathcal {G}^N}:\mathrm{dist}(x,\partial \overline{\mathcal {G}^N})\ge \delta ,\; |x^k_i|\le K\;\mathrm{for\;all\;}(k,i)\}. \end{aligned}$$

Combining this observation, the Optional Sampling Theorem and the conclusion of the preceding paragraph, we deduce that all processes of the two forms

$$\begin{aligned}&M^{(k,i)}(t):=(Y^{mu})^k_i(t\wedge \widehat{\tau }_{\delta ,K})-(Y^{mu})^k_i(0) \\&\quad -\int _0^{t\wedge \widehat{\tau }_{\delta ,K}} \left( \sum _{m\ne i} \frac{1-\theta }{(Y^{mu})^k_i(s)-(Y^{mu})^k_m(s)} -\sum _{m=1}^{k-1} \frac{1-\theta }{(Y^{mu})^k_i(s)-(Y^{mu})^{k-1}_m(s)}\right) \,\mathrm{d}s, \\&M^{(k,i),(k',i')}(t)\,{:=}\,(Y^{mu})^k_i(t\wedge \widehat{\tau }_{\delta ,K})(Y^{mu})^{k'}_{i'}(t\wedge \widehat{\tau }_{\delta ,K})\\&\quad \,-(Y^{mu})^k_i(0)(Y^{mu})^{k'}_{i'}(0)-\mathbf{1}_{\{k=k',i=i'\}}\cdot t\wedge \widehat{\tau }_{\delta ,K} \\&\quad -\int _0^{t\wedge \widehat{\tau }_{\delta ,K}} \left( \sum _{m\ne i} \frac{(1-\theta )(Y^{mu})^{k'}_{i'}(s)}{(Y^{mu})^k_i(s)-(Y^{mu})^k_m(s)} -\sum _{m=1}^{k-1} \frac{(1-\theta )(Y^{mu})^{k'}_{i'}(s)}{(Y^{mu})^k_i(s)-(Y^{mu})^{k-1}_m(s)}\right) \,\mathrm{d}s \\&\quad \,-\int _0^{t\wedge \widehat{\tau }_{\delta ,K}} \left( \sum _{m\ne i'} \frac{(1-\theta )(Y^{mu})^k_i(s)}{(Y^{mu})^{k'}_{i'}(s)-(Y^{mu})^{k'}_m(s)} {-}\sum _{m=1}^{k'-1} \frac{(1-\theta )(Y^{mu})^k_i(s)}{(Y^{mu})^{k'}_{i'}(s)-(Y^{mu})^{k'-1}_m(s)}\right) \,\mathrm{d}s \end{aligned}$$

are martingales. Now, the argument of [29, Chapter 5, Proposition 4.6] (only replacing every occurrence of \(t\) by \(t\wedge \widehat{\tau }_{\delta ,K}\) there) yields that \(Y^{mu}\) satisfies the system of stochastic integral equations

$$\begin{aligned}&(Y^{mu})^k_i(t\wedge \widehat{\tau }_{\delta ,K})-(Y^{mu})^k_i(0) \\&\quad =\int _0^{t\wedge \widehat{\tau }_{\delta ,K}} \left( \sum _{m\ne i} \frac{1-\theta }{(Y^{mu})^k_i(s)-(Y^{mu})^k_m(s)} -\sum _{m=1}^{k-1} \frac{1-\theta }{(Y^{mu})^k_i(s)-(Y^{mu})^{k-1}_m(s)}\right) \,\mathrm{d}s\\&\qquad +\,\mathrm{W}^k_i(t\wedge \widehat{\tau }_{\delta ,K}), 1\le i\le k\le N \end{aligned}$$

where \(W^k_i\), \(1\le i\le k\le N\) are independent standard Brownian motions, possibly defined on an extension of the underlying probability space. At this stage, one can repeat the argument at the end of Sect. 4.2 to show that \(\lim _{K\uparrow \infty } \widehat{\tau }_{\delta ,K}\ge \tau _{\delta }[Y^{mu}]\) and then combine Propositions 4.2 and 4.3 to end up with the SDE of Theorem 5.2 as desired.

Remark 5.7

In Sect. 3.3 we have used Lemma 3.6 instead of [29, Chapter 5, Proposition 4.6], but we could have used the latter as well. On the other hand, it is not straightforward to generalize Lemma 3.6 to the setting of the current section, because there is no obvious multilevel analogue for Proposition 2.25.