Appendix A: Proofs
A.1. Proofs of Proposition 3.1 and Lemma 3.3
Proof of Proposition 3.1
Let us first note that Condition (A5a) is equivalent to
$$\begin{aligned} {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \prod _{i=1}^t \Phi (\Delta _i,\theta )\right| \right| ^{8}\right) \le C \rho ^t,\quad {\mathbb {E}}\left( \left| \left| \prod _{i=1}^t \Psi (\Delta _i)\right| \right| ^{8}\right) \le C \rho ^t, \end{aligned}$$
(41)
for some constant \(C>0\) and \(0<\rho <1\) (independent from \(\theta \)). Let us first introduce the processes \(({\tilde{Z}}_t)_{t\in {\mathbb {Z}}}\) and \(({\tilde{\omega }}_t)_{t\in {\mathbb {Z}}}\) by
$$\begin{aligned}&{\tilde{Z}}_t=(X_t,\dots ,X_{t-p+1},\epsilon _t,\dots ,\epsilon _{t-q+1})'\in {\mathbb {R}}^{(p+q)\times 1}, \\&{\tilde{\omega }}_t= (\epsilon _t,0,\dots ,\epsilon _t,\dots ,0)'\in {\mathbb {R}}^{(p+q)\times 1} \end{aligned}$$
where \(\epsilon _t\) in the latter is in \((p+1)\)th position in \({\tilde{\omega }}_t\). Then it is clear that we have the following equation for \({\tilde{Z}}_t\):
$$\begin{aligned} {\tilde{Z}}_t=\Psi (\Delta _{t}){\tilde{Z}}_{t-1} + {\tilde{\omega }}_t ,\quad \forall t\in {\mathbb {Z}}, \end{aligned}$$
A candidate solution of this equation is, with the usual convention \(\prod _{j=0}^{-1}=1\),
$$\begin{aligned} {\tilde{Z}}_t= \sum _{k=0}^\infty \prod _{j=0}^{k-1}\Psi (\Delta _{t-j}){\tilde{\omega }}_{t-k},\quad t\in {\mathbb {Z}}, \end{aligned}$$
(42)
a stationary process, provided that the series converges, which we prove now. Let us pick for \(||\cdot ||\) a subordinate norm on the set of matrices. By independence of the processes \((\Delta _t)_{t\in {\mathbb {Z}}}\) and \((\epsilon _t)_{t\in {\mathbb {Z}}}\), and using the fact that the latter is square integrable, we easily get, for \(k\ge 1\),
$$\begin{aligned} {\mathbb {E}}\left( \left| \left| \Psi (\Delta _{t})\dots \Psi (\Delta _{t-k+1}){\tilde{\omega }}_{t-k}\right| \right| ^2\right)&\le {\mathbb {E}}\left( \left| \left| \Psi (\Delta _{t})\dots \Psi (\Delta _{t-k+1}) \right| \right| ^2 .\left| \left| {\tilde{\omega }}_{t-k}\right| \right| ^2\right) \\&={\mathbb {E}}\left( \left| \left| \Psi (\Delta _{t})\dots \Psi (\Delta _{t-k+1}) \right| \right| ^2\right) {\mathbb {E}}\left( \left| \left| {\tilde{\omega }}_{t-k}\right| \right| ^2\right) \\&\le C {\mathbb {E}}\left( \left| \left| {\tilde{\omega }}_{0}\right| \right| ^2\right) \rho ^k, \end{aligned}$$
the last inequality stemming from (41), so that series (42) converges in \(L^2\). Note that \({\tilde{Z}}_t\) (hence \(X_t\)) is in \(L^4\): this is proved by replacing \(||\cdot ||^2\) with \(||\cdot ||^4\) in the above inequalities, using again (41) and the fact that \((\epsilon _t)_{t\in {\mathbb {Z}}}\) is in \( L^4\), see assumption (A3). Similarly, defining
$$\begin{aligned} Z_t(\theta ):=(\epsilon _t(\theta ),\dots ,\epsilon _{t-q+1}(\theta ),X_t,\dots ,X_{t-p+1})' , \quad \omega _t= (X_t,0,\dots ,X_t,\dots ,0)' \end{aligned}$$
(43)
where \(X_t\) in the latter is in \((q+1)\)th position, we also get that \(Z_t(\theta )\) satisfies
$$\begin{aligned} Z_t(\theta )=\Phi (\Delta _{t},\theta )Z_{t-1}(\theta )+\omega _t . \end{aligned}$$
A candidate solution of the above equation is
$$\begin{aligned} Z_t(\theta )= \sum _{k=0}^\infty \prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )\omega _{t-k},\quad t\in {\mathbb {Z}}. \end{aligned}$$
(44)
Similarly to the proof leading to (42), convergence of (44) is obtained thanks to (41) as well as stationarity of \((X_t)_{t\in {\mathbb {Z}}}\) and the fact that \(X_t\in L^4\).
We check that \(\omega _t=M{\tilde{Z}}_t\) and \(\epsilon _t(\theta )=w_1Z_t(\theta )\), which, plugged into (42) and (44), yields (8) with coefficients \(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\) given by (9). Finally, let us verify that \((c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))_{i\in {\mathbb {N}}}\) is the unique sequence verifying (8). Let us pick a sequence of r.v. \((d_i)_{i\in {\mathbb {N}}}\) in \({{\mathcal {H}}}\) such that \(\epsilon _t(\theta )= \sum _{i=0}^\infty c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i}= \sum _{i=0}^\infty d_i \epsilon _{t-i}\). We then get, by independence of \((\Delta _t)_{t\in {\mathbb {Z}}}\) from \((\epsilon _t)_{t\in {\mathbb {Z}}}\), and by the fact that the latter is a weak white noise:
$$\begin{aligned} 0= & {} {\mathbb {E}}\left( \left[ \sum _{i=0}^\infty (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})-d_i)\epsilon _{t-i}\right] ^2\right) \\= & {} \sigma ^2{\mathbb {E}}\left( \sum _{i=0}^\infty (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})-d_i)^2\right) \end{aligned}$$
hence \((c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))_{i\in {\mathbb {N}}}=(d_i)_{i\in {\mathbb {N}}}\) a.s. \(\square \)
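The mechanism behind (41) and (42), namely that geometric decay of the moments of the matrix products forces \(L^2\) convergence of the series, can be checked numerically. The following sketch uses a hypothetical scalar example (the coefficients 0.5 and -0.4 and the uniform modulating chain are arbitrary choices, not the model of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1x1 illustration of (41)-(42): Psi(Delta) is the number a[Delta],
# with (Delta_t) i.i.d. uniform on {0, 1}; any values with E(a^2) < 1 would do.
a = np.array([0.5, -0.4])
rho = float(np.mean(a ** 2))                   # E(Psi(Delta)^2), here rho < 1

n_mc, t_max = 100_000, 8
deltas = rng.integers(0, 2, size=(n_mc, t_max))
prods = np.cumprod(a[deltas], axis=1)          # prod_{i=1}^t Psi(Delta_i), t = 1..t_max

# Moment bound (41): in this scalar i.i.d. case E((prod_{i=1}^t Psi)^2) = rho^t exactly.
moments = (prods ** 2).mean(axis=0)
assert all(abs(m - rho ** (t + 1)) < 5e-3 for t, m in enumerate(moments))

# Series (42): the partial sums Z_K = sum_{k<=K} (prod_{j<k} Psi) eps_{t-k} form a
# Cauchy sequence in L^2, the k-th term having L^2 norm of order rho^{k/2}.
eps = rng.standard_normal((n_mc, t_max + 1))
terms = np.concatenate([eps[:, :1], prods * eps[:, 1:]], axis=1)
partial = np.cumsum(terms, axis=1)
gaps = ((partial[:, 1:] - partial[:, :-1]) ** 2).mean(axis=0)
assert gaps[-1] < gaps[0] * rho ** (t_max - 2)  # geometric decay of the increments
```

With \(1\times 1\) matrices the bound (41) holds with equality, which makes the check exact up to Monte-Carlo error.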
Proof of Lemma 3.3
The fact that the maps \(\theta \mapsto c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\), \(\theta \mapsto \nabla [c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})]^2\) and \(\theta \mapsto \nabla ^2 [c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})]^2\) are polynomial functions (of several variables) can be verified easily using the fact that, for all \(s\in {{\mathcal {S}}}\), \(\theta \mapsto \Phi (s,\theta )\) and \(\theta \mapsto \Psi (\theta )\) are affine functions. We turn to (12). Minkowski’s inequality, together with the fact that the matrix norm \(||\cdot ||\) is submultiplicative, entails
$$\begin{aligned}&\left| \left| \sup _{\theta \in \Theta } |c_i(\theta ,\Delta _{i},\dots ,\Delta _{1})|\right| \right| _{2\nu +4} \nonumber \\&\quad \le \sum _{k=0}^i \left| \left| \sup _{\theta \in \Theta } |w_1\Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta )M \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1})w_{p+1}'|\right| \right| _{2\nu +4} \nonumber \\&\quad \le C \sum _{k=0}^i \left[ {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta ) \right| \right| ^{2\nu +4} \left| \left| \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1}) \right| \right| ^{2\nu +4}\right) \right] ^{1/(2\nu +4)} \end{aligned}$$
(45)
for some constant \(C>0\). The Cauchy-Schwarz inequality combined with (A5a) yields
$$\begin{aligned}&\left[ {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta ) \right| \right| ^{2\nu +4} \left| \left| \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1})\right| \right| ^{2\nu +4}\right) \right] ^{1/(2\nu +4)}\\&\quad \le \left[ {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta ) \right| \right| ^{4\nu +8} \right) \right] ^{\frac{1}{(4\nu +8 )}} \\&\qquad \left[ {\mathbb {E}}\left( \left| \left| \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1})\right| \right| ^{4\nu +8} \right) \right] ^{\frac{1}{(4\nu +8 )}}\le \kappa \rho ^{\frac{i}{(2\nu +4 )}} \end{aligned}$$
which, plugged in (45), yields inequality (12) for \(c_i(\theta ,\Delta _{i},\dots ,\Delta _{1})\). The inequalities for \(\nabla ^j [c_i(\theta ,\Delta _{i},\dots ,\Delta _{1})]\), \(j=2,3\), are proved similarly. As to \(c_i^e(t,\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\), (11) yields the upper bound
$$\begin{aligned}&\left| \left| \sup _{\theta \in \Theta } |c_i^e(t,\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})|\right| \right| _{2\nu +4} \\&\quad \le \sum _{k=0}^i \left| \left| \sup _{\theta \in \Theta } |w_1\Phi (\Delta _{t},\theta )\dots \Phi (\Delta _{t-k+1},\theta )M \Psi (\Delta _{t-k})\dots \Psi (\Delta _{t-i+1})w_{p+1}'|\right| \right| _{2\nu +4}, \end{aligned}$$
so that upper bound (13) for \(c_i^e(t,\theta ,\Delta _{t-1},\dots ,\Delta _{t-i})\) follows again by a Cauchy-Schwarz argument. The upper bound (13) for \(\nabla c_i^e(t,\theta ,\Delta _{t-1},\dots ,\Delta _{t-i})\) is obtained similarly. \(\square \)
A.2. Proofs of Lemma 3.4 and Proposition 3.5
Proof of Lemma 3.4
We first prove Point 1. Using decomposition (8) of \(\epsilon _t(\theta )\), independence of the white noise from the modulating process, as well as stationarity of the former, we obtain
$$\begin{aligned} \left| \left| \sup _{\theta \in \Theta } |\epsilon _0(\theta )|\right| \right| _4\le \sum _{i=0}^\infty \left| \left| \sup _{\theta \in \Theta } |c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})|\right| \right| _4 \cdot ||\epsilon _{0}||_4 \end{aligned}$$
which is a converging series because of (12). As to \(e_t(\theta )\), we use this time decomposition (10) as well as (13) in order to get
$$\begin{aligned} \sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } |e_t(\theta )|\right| \right| _4\le \sum _{i=0}^\infty \sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta }|c_i^e(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})|\right| \right| _4\cdot ||\epsilon _{0}||_4<+\infty . \end{aligned}$$
In order to prove Point 2, we recall the following notation. From (4) and (5), we have
$$\begin{aligned} Z_t(\theta )=\omega _t+ \Phi (\Delta _{t},\theta )Z_{t-1}(\theta )\qquad \forall t\in {\mathbb {Z}}, \end{aligned}$$
and
$$\begin{aligned} Z^e_t(\theta )=\omega ^e_t+ \Phi (\Delta _{t},\theta )Z^e_{t-1}(\theta )\qquad t=1,\dots ,n, \end{aligned}$$
where \(Z^e_t(\theta ) := (e_t(\theta ),\dots ,e_{t-q+1}(\theta ), {\tilde{X}}_t,\dots ,{\tilde{X}}_{t-p+1})' , \quad \omega ^e_t= ({\tilde{X}}_t,0,\dots ,{\tilde{X}}_t,\dots ,0)', \) so that \(\omega ^e_t=\omega _t\) for \(t\ge r+1\) (where \(r=\max (p,q)\)) and \(\omega ^e_t=0_{p+q}\) for \(t\le 0\). We recall that the processes \(({\tilde{X}}_t)_{t\in {\mathbb {Z}}}\) and \((e_t(\theta ))_{t\in {\mathbb {Z}}}\) verify (5). Note that \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\right| \right| _2\longrightarrow 0\) is equivalent to \(\left| \left| \sup _{\theta \in \Theta } ||Z^e_t(\theta )-Z_t(\theta )||\right| \right| _2\longrightarrow 0\) as \(t\rightarrow \infty \). Now, since \({\tilde{X}}_t=X_t\) for \(t\ge 1\), we easily see that
$$\begin{aligned} Z^e_t(\theta )-Z_t(\theta )&=\Phi (\Delta _{t},\theta ) [Z^e_{t-1}(\theta )-Z_{t-1}(\theta )],\quad \forall t\ge r+1, \end{aligned}$$
(46)
$$\begin{aligned} Z^e_t(\theta )-Z_t(\theta )&=\omega ^e_t-\omega _t+ \Phi (\Delta _{t},\theta )[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )], \text{ for } t=1,\dots ,r. \end{aligned}$$
(47)
Now, using (46) and (47) we obtain
$$\begin{aligned} Z^e_t(\theta )-Z_t(\theta )= & {} \prod _{j=0}^{t-r-1}\Phi (\Delta _{t-j}, \theta )[Z^e_{r}(\theta )-Z_{r}(\theta )],\quad \forall t\ge r+1,\nonumber \\= & {} \prod _{j=0}^{t-r-1}\Phi (\Delta _{t-j},\theta ) \nonumber \\&\left( \sum _{i=0}^{r-1} \prod _{j=0}^{i-1}\Phi (\Delta _{r-j},\theta )[\omega ^e_{r-i}-\omega _{r-i}] + \prod _{j=0}^{r-1}\Phi (\Delta _{r-j},\theta )\omega _{0} \right) . \end{aligned}$$
(48)
Let us furthermore note that
$$\begin{aligned}&\left| \left| \sup _{\theta \in \Theta } |{\tilde{X}}_t-X_t|\right| \right| _4 \\&\quad = \left| \left| \sup _{\theta \in \Theta } |\sum _{i=t}^{r} g_i^a(\Delta _t,\theta ){X}_{t-i}+\sum _{j=t}^{r} g_j^b(\Delta _t,\theta )\epsilon _{t-j}(\theta )|\right| \right| _4<+\infty \text{ for } t=1,\dots ,r \end{aligned}$$
as indeed \(X_t\in L^4\) (as proved in the proof of Proposition 3.1) and \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right| \right| _4 <+\infty \) as proved in Point 1. In view of (48), using Minkowski’s and Hölder’s inequalities and (A5a), we thus have
$$\begin{aligned} \left| \left| \sup _{\theta \in \Theta } ||Z^e_t(\theta )-Z_t(\theta )||\right| \right| _2\le C\rho ^t, \end{aligned}$$
for some constant \(C>0\) and \(0<\rho <1\) (independent from \(\theta \)).
Let us turn to Point 3, which follows from
$$\begin{aligned} {{\mathbb {P}}}\left( t^\alpha \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|>\eta \right) \le \frac{t^{2+2\alpha }\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\right| \right| _2^2 }{t^{2} \eta ^2}= o\left( \frac{1}{t^{2}}\right) ,\quad \forall \eta >0 , \end{aligned}$$
the last equality holding thanks to Point 2; the conclusion then follows from the Borel-Cantelli lemma.
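This Markov-plus-Borel-Cantelli step can be sketched numerically under the conclusion of Point 2, i.e. assuming the \(L^2\) norm decays like \(C\rho ^t\); the scalar recursion below and its two coefficients are hypothetical stand-ins for \(Z^e_t(\theta )-Z_t(\theta )\) and \(\Phi (\Delta _t,\theta )\):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, eta = 0.5, 1e-3

# Hypothetical scalar stand-in for Phi(Delta_t, theta): two values with E(phi^2) < 1.
phi = np.array([0.5, -0.4])
rho = float(np.mean(phi ** 2))

# d_t = Phi(Delta_t) d_{t-1}, d_0 = 1: a toy version of the recursion (46).
n_paths, t_max = 1_000, 60
d = np.cumprod(phi[rng.integers(0, 2, size=(n_paths, t_max))], axis=1)
t = np.arange(1, t_max + 1)

# Markov bound: P(t^alpha |d_t| > eta) <= t^{2 alpha} E(d_t^2) / eta^2
#             = t^{2 alpha} rho^t / eta^2, a summable sequence in t,
# which is exactly the Borel-Cantelli hypothesis.
bounds = t ** (2 * alpha) * rho ** t / eta ** 2
assert bounds[-1] < 1e-12            # terms vanish fast ...
assert bounds.sum() < bounds[0] * 5  # ... and the series is dominated by its head

# Pathwise consequence: t^alpha |d_t| -> 0 on every simulated path.
assert np.all(t_max ** alpha * np.abs(d[:, -1]) < eta)
```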
We now turn to Point 4. The fact that \(\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j\epsilon _0(\theta )||\right| \right| _4\) and \(\sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j e_t(\theta )||\right| \right| _4\) are finite is proved similarly to Point 1, using estimates (12) and (13). We then pass on to the limit of \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (e_t- \epsilon _t)(\theta )||\right| \right| _{8/5}\) as \(t\rightarrow \infty \). Let \(i\in {{\mathcal {S}}}\). Differentiating (46) with respect to \(\theta _i\) yields
$$\begin{aligned}&\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )- Z_t(\theta )] \nonumber \\&\quad =\Phi (\Delta _{t},\theta )\frac{\partial }{\partial \theta _i}[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )] + \frac{\partial }{\partial \theta _i}\Phi (\Delta _{t},\theta ) [Z^e_{t-1}(\theta )-Z_{t-1}(\theta )], \nonumber \\&\qquad \forall t\ge p+1, \end{aligned}$$
(49)
hence we may write
$$\begin{aligned} \frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]= \sum _{k=0}^{t-p}\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta ) \frac{\partial }{\partial \theta _i}\Phi (\Delta _{t-k},\theta ) [Z^e_{t-k}(\theta )-Z_{t-k}(\theta )], \end{aligned}$$
hence, using Minkowski’s and Hölder’s inequalities, and letting \(M_\Phi :=\max _{s\in {{\mathcal {S}}},\theta \in \Theta }\left| \frac{\partial }{\partial \theta _i}\Phi (s,\theta )\right| \), we get
$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{8/5}}&\le M_\Phi \sum _{k=0}^{t-p} \left| \left| \sup _{\theta \in \Theta } |\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )| \right| \right| _{{8}} \nonumber \\&\quad \cdot t^\alpha \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _2 . \end{aligned}$$
(50)
Now, since \(\left| \left| \sup _{\theta \in \Theta } ||\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )|| \right| \right| _{{8}}\le \kappa \rho ^k\) for some \(\kappa >0\) and \(\rho <1\) thanks to (A5a), and since \(t^\alpha \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _2 \) is uniformly bounded in t and \(k\le t\), and tends to 0 as \(t\rightarrow \infty \), the dominated convergence theorem yields that \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{8/5}} \longrightarrow 0\) as \(t\rightarrow \infty \), proving the convergence \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (e_t- \epsilon _t)(\theta )||\right| \right| _{{8/5}}\longrightarrow 0\) stated in Point 4. Let us now prove that \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla ^2 (e_t- \epsilon _t)(\theta )||\right| \right| _{{4/3}}\longrightarrow 0\). Differentiating (49) again with respect to \(\theta _\ell \), \(\ell \in {{\mathcal {S}}}\), we obtain
$$\begin{aligned}&\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )] \nonumber \\&\quad =\Phi (\Delta _{t},\theta )\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )]+ \frac{\partial }{\partial \theta _\ell }\Phi (\Delta _{t},\theta ) \frac{\partial }{\partial \theta _i}[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )] \nonumber \\&\qquad +\,\frac{\partial }{\partial \theta _i}\Phi (\Delta _{t},\theta ) \frac{\partial }{\partial \theta _\ell }[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )] + \frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}\Phi (\Delta _{t},\theta ) [Z^e_{t-1}(\theta )-Z_{t-1}(\theta )], \nonumber \\&\qquad \qquad \forall t\ge p+1, \end{aligned}$$
(51)
so that, in the same spirit as (50), we obtain
$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{4/3}}&\le M_\Phi ' \sum _{k=0}^{t-p} \left| \left| \sup _{\theta \in \Theta } |\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )| \right| \right| _{{8}} \nonumber \\&\qquad \cdot t^\alpha \left[ \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _{{8/5}} \right. \nonumber \\&\qquad +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _\ell }[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{{8/5}} \nonumber \\&\qquad \left. +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{{8/5}} \right] , \end{aligned}$$
(52)
for some positive constant \(M_\Phi '\). Using Point 2 (so that \(t^\alpha \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _{{8/5}}\) tends to 0 as \(t\rightarrow \infty \), since \(8/5<2\)) and the previous estimate
$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{8/5}} \longrightarrow 0 \end{aligned}$$
for all \(i\in {{\mathcal {S}}}\), we conclude by a dominated convergence theorem that
$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{4/3}},\text { hence }t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i} (e_t- \epsilon _t)(\theta )||\right| \right| _{{4/3}}, \end{aligned}$$
tends to 0.
We finish by sketching the proof that \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla ^3 (e_t- \epsilon _t)(\theta )||\right| \right| _{1}\longrightarrow 0\). The starting point is again to differentiate (51) with respect to \(\theta _{\ell '}\), \(\ell '\in {{\mathcal {S}}}\), which yields, as in (52), the following estimate:
$$\begin{aligned}&t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^3}{\partial \theta _{\ell '} \partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{1} \\&\quad \le M_\Phi '' \sum _{k=0}^{t-p} \left| \left| \sup _{\theta \in \Theta } |\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )| \right| \right| _{8}\\&\quad \qquad \cdot t^\alpha \left[ \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _{4/3}+\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _\ell }[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3}\right. \\&\qquad \qquad +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3} \\&\qquad \qquad +\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3} \\&\qquad \qquad +\, \left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial ^2}{\partial \theta _{\ell '}\partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3} \\&\qquad \qquad \left. +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial ^2}{\partial \theta _{\ell '}\partial \theta _\ell }[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3}\right] , \end{aligned}$$
for some constant \(M_\Phi ''\), so that we conclude similarly. \(\square \)
Proof of Proposition 3.5
In this proof, C will denote a generic positive constant that may change from line to line. Let us start with Point 1. The fact that \(Q_n(\theta )\) converges a.s. to \(O_\infty (\theta )=\tfrac{1}{2}{{\mathbb {E}}}(\epsilon _0(\theta )^2)\) as \(n\rightarrow \infty \) is a consequence of the fact that \( \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|^2\longrightarrow 0\) a.s. (itself a consequence of Point 3 of Lemma 3.4) and is justified by exactly the same proof as that of Lemma 7 in Francq and Zakoïan (1998). We now prove that \(n^\alpha \left| \left| \sup _{\theta \in \Theta } |Q_n(\theta )-O_n(\theta )|\right| \right| _1 \longrightarrow 0\). Let \(\alpha \in (0,1)\). Using the upper bound \(\sup _{\theta \in \Theta }|e_t(\theta )^2-\epsilon _t(\theta )^2|\le \left[ \sup _{\theta \in \Theta } |e_t(\theta )| + \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right] \cdot \sup _{\theta \in \Theta } |e_t(\theta )- \epsilon _t(\theta )|\), as well as the Cauchy-Schwarz and Minkowski inequalities, we get
$$\begin{aligned} n^\alpha \left| \left| \sup _{\theta \in \Theta } |Q_n(\theta )-O_n(\theta )|\right| \right| _1\le & {} \frac{1}{n^{1-\alpha }}\sum _{t=1}^n \left[ \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _2 + \left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right| \right| _2\right] \\&\cdot \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2 . \end{aligned}$$
Since \( \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _2\) is bounded uniformly in t thanks to Point 1 of Lemma 3.4, and \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right| \right| _2\) is constant in t and finite, there thus exists some constant \(C>0\) such that
$$\begin{aligned} n^\alpha \left| \left| \sup _{\theta \in \Theta } |Q_n(\theta )-O_n(\theta )|\right| \right| _1 \le C \frac{1}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2 . \end{aligned}$$
(53)
Let us write the right hand side of the above inequality in the form \(\frac{1}{n^{1-\alpha }}\sum _{t=1}^n [t^{1-\alpha }-(t-1)^{1-\alpha }] \frac{1}{t^{1-\alpha }-(t-1)^{1-\alpha }} \left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2 \). Since
$$\begin{aligned} \frac{1}{t^{1-\alpha }-(t-1)^{1-\alpha }} \left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2\sim _{t\rightarrow \infty } \frac{1}{(1-\alpha )t^{-\alpha }}\left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2, \end{aligned}$$
which tends to 0 as \(t\rightarrow \infty \) (a consequence of Point 2 of Lemma 3.4), Toeplitz’s lemma implies that the right hand side of (53) tends to 0 as \(n\rightarrow \infty \), and this proves Point 1.
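The Toeplitz argument can be illustrated with hypothetical values \(u_t=C\rho ^t\) playing the role of \(\left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2\): the weighted averages on the right hand side of (53) vanish as soon as the \(u_t\) are summable.

```python
import numpy as np

# Hypothetical geometric bound u_t = C * rho^t, as in Point 2 of Lemma 3.4.
alpha, C, rho = 0.5, 3.0, 0.9
t = np.arange(1, 10_001)
u = C * rho ** t

# Right hand side of (53): n^{alpha - 1} * sum_{t=1}^n u_t, for increasing n.
n_grid = np.array([10, 100, 1_000, 10_000])
rhs = np.array([n ** (alpha - 1) * u[:n].sum() for n in n_grid])

# The partial sums stay bounded (geometric series), so the weighted averages vanish.
assert np.all(np.diff(rhs) < 0)   # decreasing along n_grid
assert rhs[-1] < 0.3              # already close to 0 for n = 10^4
```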
We now prove Point 2. We have for all \(\theta \in \Theta \)
$$\begin{aligned} ||\nabla [e_t(\theta )^2-\epsilon _t(\theta )^2]||&=|| 2 e_t(\theta ) \nabla [e_t(\theta )-\epsilon _t(\theta )]+ 2[e_t(\theta )-\epsilon _t(\theta )] \nabla \epsilon _t(\theta )|| \nonumber \\&\le 2 || e_t(\theta ) \nabla [e_t(\theta )-\epsilon _t(\theta )]|| + 2|e_t(\theta )- \epsilon _t(\theta )|\cdot || \nabla \epsilon _t(\theta ) || , \end{aligned}$$
(54)
so that
$$\begin{aligned} \sup _{\theta \in \Theta } ||\nabla (Q_n(\theta )-O_n(\theta ))||&\le \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } |e_t(\theta )|\cdot \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \nonumber \\&\quad + \,\frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } |e_t(\theta )-\epsilon _t(\theta )|\cdot \sup _{\theta \in \Theta }|| \nabla \epsilon _t(\theta )||. \end{aligned}$$
(55)
Lemma 3.4, Points 2 and 4, along with the Borel-Cantelli lemma, yield that \(\sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\) and \(\sup _{\theta \in \Theta } ||\nabla (\epsilon _t-e_t)(\theta )||\) a.s. tend to 0 as \(t\rightarrow \infty \). The second term on the right hand side of (55) is then a.s. upper bounded, thanks to the Cauchy-Schwarz inequality, by
$$\begin{aligned} \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } |e_t(\theta )-\epsilon _t(\theta )|^2\right] ^{1/2}\cdot \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|| \nabla \epsilon _t(\theta )||^2\right] ^{1/2}, \end{aligned}$$
which tends to zero thanks to Cesàro’s lemma and the ergodic theorem. And since, by Minkowski’s inequality,
$$\begin{aligned} \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|e_t(\theta )|^2\right] ^{1/2}\le \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|e_t(\theta )-\epsilon _t(\theta )|^2\right] ^{1/2} + \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|\epsilon _t(\theta )|^2\right] ^{1/2}, \end{aligned}$$
we have that \(\left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|e_t(\theta )|^2\right] ^{1/2}\) is a.s. upper bounded in \(n\ge 1\), again by a Cesàro and ergodic theorem argument. The first term on the right hand side of (55) is then again a.s. upper bounded, thanks to the Cauchy-Schwarz inequality, by
$$\begin{aligned} \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } ||\nabla (e_t-\epsilon _t)(\theta )||^2\right] ^{1/2}\cdot \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }| e_t(\theta )|^2\right] ^{1/2}, \end{aligned}$$
which tends to zero as \(n\rightarrow \infty \). Hence (55) implies that \(\sup _{\theta \in \Theta } ||\nabla (Q_n(\theta )-O_n(\theta ))||\) a.s. tends to 0 as \(n\rightarrow \infty \). The proof of the a.s. convergence of \(\sup _{\theta \in \Theta } ||\nabla ^j(Q_n(\theta )-O_n(\theta ))||\) to 0 for \(j=2,3\) is obtained similarly, using arguments related to Points 3 and 4 of Lemma 3.4.
Let us now prove Point 3. Let \(\alpha \in (0,1)\). We deduce from (54), using the Minkowski and Hölder inequalities, that
$$\begin{aligned} n^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (Q_n(\theta )-O_n(\theta ))||\right| \right| _1&\le \frac{C}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _4 \left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{4/3} \nonumber \\&\quad + \,\frac{C}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )-\epsilon _t(\theta )| \right| \right| _2 \left| \left| \sup _{\theta \in \Theta }|| \nabla \epsilon _t(\theta )|| \right| \right| _{2} . \end{aligned}$$
(56)
Using Point 1 of Lemma 3.4, we have that \(\left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _4\) is upper bounded by some constant C. The first term on the right hand side of (56) may thus be upper bounded by
$$\begin{aligned} C\frac{1}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{4/3}. \end{aligned}$$
Noting that \(\left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{4/3}\le C' \left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{8/5}\) for some constant \(C'\), the above expression is, similarly to the argument in (53), a quantity that tends to 0 as \(n\rightarrow \infty \) thanks to Point 4 in Lemma 3.4 coupled with Toeplitz’s lemma. Hence the first term on the right hand side of (56) tends to 0 as \(n\rightarrow \infty \). Again using Points 1 and 2 of the same lemma, and with the same argument, we also have that the second term on the right hand side of (56) tends to 0 as \(n\rightarrow \infty \), which proves Point 3. \(\square \)
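The norm comparison \(\left| \left| \cdot \right| \right| _{4/3}\le C'\left| \left| \cdot \right| \right| _{8/5}\) used above holds with \(C'=1\) on a probability space (Lyapunov's inequality); a quick numerical sanity check, with an arbitrary test variable:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.abs(rng.standard_normal(100_000))   # arbitrary nonnegative test variable

def lp_norm(x, p):
    """Empirical L^p norm E(|X|^p)^{1/p}, i.e. the norm under the empirical measure."""
    return float((x ** p).mean() ** (1.0 / p))

# Lyapunov / Jensen: p -> ||X||_p is nondecreasing on a probability space,
# so in particular ||X||_{4/3} <= ||X||_{8/5} <= ||X||_2.
assert lp_norm(x, 4 / 3) <= lp_norm(x, 8 / 5) <= lp_norm(x, 2)
```

The inequality holds exactly for any finite sample, since the empirical distribution is itself a probability measure.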
A.3. Proofs of Proposition 3.6 and Theorem 3.7
Proof of Proposition 3.6
Independence of the processes \((\Delta _t)_{t\in {\mathbb {Z}}}\) and \((\epsilon _t)_{t\in {\mathbb {Z}}}\), as well as their ergodicity, yields that, for fixed \(j\in {\mathbb {N}}\), the process \(\left( (\Delta _{t-1},\ldots ,\Delta _{t-j},\epsilon _{t-j}) \right) _{t\in {\mathbb {Z}}}\) is ergodic. We thus deduce from Expression (8), and using the fact that \((\epsilon _t)_{t\in {\mathbb {Z}}}\) is a weak white noise, that \(O_n(\theta )\) defined by (14) verifies
$$\begin{aligned} \begin{aligned} 2 O_n(\theta ) \longrightarrow 2 O_\infty (\theta )&:=\sigma ^2 \sum _{j=0}^\infty {\mathbb {E}}\left( [c_j(\theta ,\Delta _0,\ldots ,\Delta _{-j})]^2\right) \\&= \sigma ^2 + \sigma ^2 \sum _{j=1}^\infty {\mathbb {E}}\left( [c_j(\theta ,\Delta _0,\ldots ,\Delta _{-j})]^2\right) \quad \text{ a.s. } \end{aligned} \end{aligned}$$
(57)
as \(n\rightarrow \infty \) (remember that \(c_0(\theta ,\Delta _0)=1\)). By uniqueness of decomposition (8) in Proposition 3.1, and since \(\epsilon _t(\theta _0)=\epsilon _t\), we have that \((c_i(\theta ,\Delta _{t-1},\dots ,\Delta _{t-i}))_{i\in {\mathbb {N}}}=(1,0,\ldots )\) if and only if \(\theta =\theta _0\), and that \(O_\infty (\theta )\) given in (57) attains its minimum at \(\theta =\theta _0\), with minimum value \(O_\infty (\theta _0)=\sigma ^2/2\). Let us then deduce that the estimator \({\check{\theta }}_n\) defined in (15) converges a.s. towards \(\theta _0\). To this end, we consider a subsequence \(({\check{\theta }}_{n_k})_{k\in {\mathbb {N}}}\) converging to some \(\theta ^*\) in the compact set \(\Theta \) and we prove that \(\theta ^*=\theta _0\). Indeed, by definition of the estimator \({\check{\theta }}_{n_k}\) we have
$$\begin{aligned} O_{n_k}(\theta _0)\ge O_{n_k}({\check{\theta }}_{n_k}) \end{aligned}$$
(58)
for all \(k\in {\mathbb {N}}\). A Taylor expansion yields the inequality
$$\begin{aligned} | O_{n_k}({\check{\theta }}_{n_k})- O_{n_k}(\theta ^*)|\le || {\check{\theta }}_{n_k} - \theta ^* ||\cdot \frac{1}{n_k}\sum _{t=1}^{n_k}\sup _{\theta \in \Theta }[|\epsilon _t(\theta )|\cdot || \nabla \epsilon _t(\theta )||]. \end{aligned}$$
(59)
But, using the ergodic theorem, we have
$$\begin{aligned} \frac{1}{n_k}\sum _{t=1}^{n_k}\sup _{\theta \in \Theta }[|\epsilon _t(\theta )|\cdot || \nabla \epsilon _t(\theta )||]&\le \frac{1}{2 n_k}\sum _{t=1}^{n_k}\left[ \sup _{\theta \in \Theta }|\epsilon _t(\theta )|^2 + \sup _{\theta \in \Theta }||\nabla \epsilon _t(\theta )||^2 \right] \\&\longrightarrow \frac{1}{2} \left| \left| \sup _{\theta \in \Theta }|\epsilon _0(\theta )| \right| \right| _2^2 + \frac{1}{2} \left| \left| \sup _{\theta \in \Theta }||\nabla \epsilon _0(\theta )|| \right| \right| _2^2<+\infty , \end{aligned}$$
so that we get from (59) that \(O_{n_k}({\check{\theta }}_{n_k})- O_{n_k}(\theta ^*)\longrightarrow 0\) as \(k\rightarrow \infty \). Since \(O_{n_k}(\theta ^*)\longrightarrow O_{\infty }(\theta ^*)\), we obtain, passing to the limit in (58), that
$$\begin{aligned} O_{\infty }(\theta _0)\ge O_{\infty }(\theta ^*), \end{aligned}$$
hence \( \theta ^*=\theta _0\) thanks to uniqueness of the minimum of \(O_{\infty }(\theta )\). \(\square \)
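The consistency mechanism of this proof, namely that every limit point of minimizers of \(O_n\) must minimize \(O_\infty \), can be illustrated on a plain AR(1), a deliberately simplified, hypothetical stand-in for the modulated model:

```python
import numpy as np

rng = np.random.default_rng(3)
theta0, sigma, n = 0.6, 1.0, 50_000

# Simulate X_t = theta0 * X_{t-1} + eps_t (an ordinary AR(1), i.e. no modulation).
eps = sigma * rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = theta0 * x[t - 1] + eps[t]

# Empirical criterion O_n(theta) = (1/(2n)) sum eps_t(theta)^2, with residuals
# eps_t(theta) = X_t - theta X_{t-1}; its a.s. limit
# O_inf(theta) = (sigma^2/2) (1 + (theta0 - theta)^2 / (1 - theta0^2))
# is uniquely minimized at theta0, with minimum value sigma^2 / 2.
grid = np.linspace(-0.9, 0.9, 361)
O_n = np.array([0.5 * np.mean((x[1:] - th * x[:-1]) ** 2) for th in grid])
theta_hat = grid[np.argmin(O_n)]

assert abs(theta_hat - theta0) < 0.02   # minimizers converge to theta0
```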
Proof of Theorem 3.7
Similarly to the proof of the previous proposition, we consider a subsequence \(({\hat{\theta }}_{n_k})_{k\in {\mathbb {N}}}\) converging to some \(\theta _*\) in the compact set \(\Theta \) and we prove that \(\theta _*=\theta _0\) by proving that \(O_{\infty }(\theta _0)= O_{\infty }(\theta _*)\). By definition of \({\hat{\theta }}_{n_k}\) we have
$$\begin{aligned} Q_{n_k}(\theta _0)\ge Q_{n_k}({\hat{\theta }}_{n_k}),\quad \forall k \ge 0. \end{aligned}$$
(60)
Now, a Taylor expansion yields, for all \(\theta '\) and \(\theta ''\) in \(\Theta \), similarly to the argument in the proof of Proposition 3.6,
$$\begin{aligned} | Q_{n_k}(\theta ')- Q_{n_k}(\theta '')|\le || \theta ' - \theta '' ||\cdot \frac{1}{2 n_k}\sum _{t=1}^{n_k}\left[ \sup _{\theta \in \Theta }|e_t(\theta )|^2 + \sup _{\theta \in \Theta }||\nabla e_t(\theta )||^2 \right] . \end{aligned}$$
(61)
Using the inequality \((a+b)^2\le 2 (a^2+b^2)\) for all a and b, we deduce that \(\sup _{\theta \in \Theta }|e_t(\theta )|^2\le 2 \sup _{\theta \in \Theta }|e_t(\theta )-\epsilon _t(\theta )|^2 + 2\sup _{\theta \in \Theta }|\epsilon _t(\theta )|^2\). Since a consequence of Point 3 of Lemma 3.4 is that \(\sup _{\theta \in \Theta }|e_t(\theta )-\epsilon _t(\theta )|^2\) tends to 0 a.s. as \(t\rightarrow \infty \), the ergodic theorem yields that
$$\begin{aligned} \frac{1}{n_k}\sum _{t=1}^{n_k}\left[ \sup _{\theta \in \Theta }|e_t(\theta )|^2 + \sup _{\theta \in \Theta }||\nabla e_t(\theta )||^2 \right] \longrightarrow \left| \left| \sup _{\theta \in \Theta }|\epsilon _0(\theta )| \right| \right| _2^2 + \left| \left| \sup _{\theta \in \Theta }||\nabla \epsilon _0(\theta )|| \right| \right| _2^2<+\infty \end{aligned}$$
as \(k\rightarrow \infty \). Thanks to (61) and Point 1 of Proposition 3.5, we thus deduce that \(Q_{n_k}(\theta _0)\longrightarrow O_\infty (\theta _0)\) and \(Q_{n_k}({\hat{\theta }}_{n_k})\longrightarrow O_\infty (\theta _*)\) as \(k\rightarrow \infty \), and we conclude in the same way as in the proof of Theorem 3.6. \(\square \)
1.4 A.4. Proof of Theorem 3.8
Let us introduce the following matrices and vectors
$$\begin{aligned}&I_n(\theta ):= \mathrm{Var}\left( \sqrt{n}\nabla O_n(\theta )\right) \nonumber \\&\quad = \left( I_n(l,r)(\theta )\right) _{l,r=1\dots (p+q)K}\in {\mathbb {R}}^{(p+q)K\times (p+q)K},\quad n\in {\mathbb {N}}, \end{aligned}$$
(62)
$$\begin{aligned}&Y_k(\theta ) := \epsilon _k(\theta ) \nabla \epsilon _k(\theta )=(Y_k(l)(\theta ))_{l=1\dots (p+q)K} \in {\mathbb {R}}^{(p+q)K\times 1},\quad k\in {\mathbb {Z}}, \end{aligned}$$
(63)
Theorem 3.8 can be established using the following lemmas.
Lemma A.1
(Davydov (1968)) Let p, q and r be three positive numbers such that \(p^{-1}+q^{-1}+r^{-1}=1\). Then
$$\begin{aligned} \left| \text{ Cov }(X,Y)\right| \le K_0\Vert X\Vert _p\Vert Y\Vert _q\left[ \alpha \left\{ \sigma (X),\sigma (Y)\right\} \right] ^{1/r}, \end{aligned}$$
(64)
where \(\Vert X\Vert _p^p={{\mathbb {E}}}(|X|^p)\), \(K_0\) is a universal constant, and \(\alpha \left\{ \sigma (X),\sigma (Y)\right\} \) denotes the strong mixing coefficient between the \(\sigma \)-fields \(\sigma (X)\) and \(\sigma (Y)\) generated by the random variables X and Y, respectively.
Lemma A.2
Let the assumptions of Theorem 3.8 be satisfied. For all l, r in 1,...,\((p+q)K\) and \(\theta \in \Theta \) we have
$$\begin{aligned} I_n(l,r)(\theta )\longrightarrow I(l,r)(\theta ):=\sum _{k=-\infty }^\infty c_k (l,r)(\theta ),\quad n\rightarrow +\infty , \end{aligned}$$
where \(c_k(l,r)(\theta )=\mathrm{Cov}\left( Y_t(l)(\theta ),Y_{t-k}(r)(\theta )\right) \), \(k\in {\mathbb {Z}}\), the series above being absolutely convergent.
Proof of Lemma A.2
Let us write
$$\begin{aligned} \nabla \epsilon _t(\theta ) = \left( \frac{\partial \epsilon _t(\theta )}{\partial \theta _1},\dots , \frac{\partial \epsilon _t(\theta )}{\partial \theta _{(p+q)K}}\right) ', \end{aligned}$$
where \(\epsilon _t(\theta )\) is given by (8). The process \(\left( Y_k(\theta )\right) _k\) is strictly stationary and ergodic. Moreover, we have
$$\begin{aligned} I_n(\theta )=\mathrm{Var}\left( \sqrt{n}\frac{\partial }{\partial \theta }O_n(\theta )\right)= & {} \mathrm{Var}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_t(\theta )\right) =\frac{1}{n}\sum _{t,s=1}^{n}\text{ Cov }\left( Y_t(\theta ),Y_s(\theta )\right) \\= & {} \frac{1}{n}\sum _{k=-n+1}^{n-1}(n-|k|)\text{ Cov }\left( Y_t(\theta ), Y_{t-k}(\theta )\right) . \end{aligned}$$
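The last equality, \(\mathrm{Var}\bigl(n^{-1/2}\sum _{t}Y_t\bigr)=n^{-1}\sum _{|k|<n}(n-|k|)\,c_k\), is purely algebraic. The sketch below (toy geometric autocovariances chosen for illustration, not the actual \(c_k(l,r)(\theta )\) of the model) verifies it by comparing the weighted sum with the quadratic form \(n^{-1}\mathbf{1}'C\mathbf{1}\) over a Toeplitz covariance matrix \(C\):

```python
import numpy as np

def weighted_sum(c, n):
    # n^{-1} * sum_{|k|<n} (n - |k|) c_k, with c[|k|] the lag-|k| autocovariance
    return sum((n - abs(k)) * c[abs(k)] for k in range(-(n - 1), n)) / n

def quadratic_form(c, n):
    # Var(n^{-1/2} sum_t Y_t) = (1/n) * 1' C 1, C the Toeplitz covariance matrix
    C = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])
    ones = np.ones(n)
    return ones @ C @ ones / n

n = 50
c = 0.6 ** np.arange(n)            # toy decaying autocovariances c_k = 0.6^k
assert abs(weighted_sum(c, n) - quadratic_form(c, n)) < 1e-10
```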
From Proposition 3.1 and Lemma 3.3, we have
$$\begin{aligned}&\epsilon _t(\theta )= \sum _{i=0}^\infty c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i}\text { and }\frac{\partial \epsilon _t(\theta )}{\partial \theta _l}=\sum _{i=0}^\infty c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}) \epsilon _{t-i}, \\&\quad \text { for }l=1,\dots ,(p+q)K, \end{aligned}$$
where we recall that \(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\) is defined by (9), and
$$\begin{aligned}&c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}) \\&\quad =\frac{\partial }{\partial \theta _l} c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\\&\quad =\frac{\partial }{\partial \theta _l} \left( \sum _{k=0}^i w_1\Phi (\Delta _{t},\theta )\dots \Phi (\Delta _{t-k+1},\theta )M \Psi (\Delta _{t-k})\dots \Psi (\Delta _{t-i+1})w_{p+1}'\right) , \end{aligned}$$
with the following upper bound holding thanks to (13):
$$\begin{aligned} {\mathbb {E}}\sup _{\theta \in \Theta }(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i \text { and }{\mathbb {E}}\sup _{\theta \in \Theta }( c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i,\quad \forall i. \end{aligned}$$
Let
$$\begin{aligned}&\beta _{i,j,i',j',k}(l,r)(\theta ) \nonumber \\&\quad = {{\mathbb {E}}}\left[ c_i(\theta ,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}) c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\qquad \qquad \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})\right] {{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] \nonumber \\&\qquad -\, {{\mathbb {E}}}\left[ c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}) c_{j,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-j+1})\right] \nonumber \\&\qquad \times {{\mathbb {E}}}\left[ c_{i'}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-i'+1})c_{j',r}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-j'+1})\right] {{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\right] \nonumber \\&\qquad \times \,{{\mathbb {E}}}\left[ \epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] \nonumber \\&\quad = {{\mathbb {E}}}\left[ c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}) c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\qquad \qquad \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})\right] \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \nonumber \\&\qquad +\, \text{ Cov }\left( c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}),c_{i'}(\theta ,\Delta _{t-k}, \dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\qquad \qquad \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})\right) {{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\right] {{\mathbb {E}}}\left[ \epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] . \end{aligned}$$
(65)
We then obtain
$$\begin{aligned} c_k(l,r)(\theta )=\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \beta _{i,j,i',j',k}(l,r)(\theta ),\quad k\in {\mathbb {Z}}. \end{aligned}$$
The Cauchy-Schwarz inequality implies that
$$\begin{aligned}&\left| {{\mathbb {E}}}[c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}) c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\quad \qquad \, \times \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})]\right| \nonumber \\&\quad \le \left( {{\mathbb {E}}}[c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-j+1})]^2\right) ^{1/2} \nonumber \\&\qquad \,\times \left( {\mathbb {E}} [c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}) c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})]^2\right) ^{1/2} \nonumber \\&\quad \le \left( {{\mathbb {E}}}[c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})]^4 \times {{\mathbb {E}}}[c_{j,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-j+1})]^4\right) ^{1/4}\nonumber \\&\qquad \left( {\mathbb {E}} [c_{i'}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-i'+1})]^4 {{\mathbb {E}}}[c_{j',r}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-j'+1})]^4\right) ^{1/4}\nonumber \\&\quad \le C\rho ^{i+j+i'+j'}. \end{aligned}$$
(66)
First suppose that \(k\ge 0\). For all l, r in 1,...,\((p+q)K\) and \(\theta \in \Theta \), in view of (66) it follows that
$$\begin{aligned} \left| c_k(l,r)(\theta )\right|= & {} \left| \text{ cov } \left( Y_t(l)(\theta ),Y_{t-k}(r)(\theta )\right) \right| =\left| \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \beta _{i,j,i',j',k}(l,r)(\theta )\right| \\\le & {} g_1+g_2+g_3+g_4+g_5+h_1+h_2+h_3 , \end{aligned}$$
where
$$\begin{aligned} g_1= & {} \sum _{i>[k/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| ,\\ g_2= & {} \sum _{i=0}^\infty \sum _{j>[k/2]}\sum _{i'=0}^\infty \sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \\ g_3= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'>[k/2]}\sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| ,\\ g_4= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'>[k/2]} \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \\ g_5= & {} \sum _{i=0}^{[k/2]}\sum _{j=0}^{[k/2]}\sum _{i'=0}^{[k/2]} \sum _{j'=0}^{[k/2]}\kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| , \\ h_1= & {} \sigma ^4\sum _{i>[k/2]}\sum _{i'=0}^{\infty }\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r} (\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| , \\ h_2= & {} \sigma ^4\sum _{i=0}^{\infty }\sum _{i'>[k/2]}\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r}(\theta , \Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| , \\ h_3= & {} \sigma ^4\sum _{i=0}^{[k/2]}\sum _{i'=0}^{[k/2]}\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. 
c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r} (\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| . \end{aligned}$$
Note that, in the strong noise case, we easily check that the \(\text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \) term in (65) is nonzero only for indices i, j, \(i'\), \(j'\) such that \(i=j=k+i'=k+j'\). This fact entails that, instead of considering the five sums \(g_1\),..., \(g_5\), we only need to consider one sum of the form \( \kappa \sum _{j=k}^\infty \rho ^{2(2j-k)}\), which is \(\mathrm {O}(\rho ^k)\).
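For completeness, this single sum can be evaluated explicitly; taking \(i=j\) and \(i'=j'=j-k\) (so that \(j\ge k\)), the geometric bound (66) gives
$$\begin{aligned} \kappa \sum _{j=k}^\infty \rho ^{2(2j-k)}=\kappa \rho ^{-2k}\sum _{j=k}^\infty \rho ^{4j}=\frac{\kappa \rho ^{2k}}{1-\rho ^{4}}=\mathrm {O}(\rho ^{k}). \end{aligned}$$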
Because
$$\begin{aligned} \left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right|\le & {} \sqrt{{{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\right] ^2 {{\mathbb {E}}}\left[ \epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] ^2} \le {{\mathbb {E}}}\left| \epsilon _t\right| ^4<\infty \end{aligned}$$
by Assumption \({(\mathbf A3)}\), we have
$$\begin{aligned} g_1=\sum _{i>[k/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \le \kappa _1\rho ^{k/2}, \end{aligned}$$
for some positive constant \(\kappa _1\). Using the same arguments, we obtain that \(g_i\) (\(i=2,3,4\)) is bounded by \(\kappa _i\rho ^{k/2}\). Furthermore, (A3) and the Cauchy-Schwarz inequality yield that \(\left\| \epsilon _{i} \epsilon _{i'}\right\| _{2+\nu }<+\infty \) for any i and \(i'\) in \({\mathbb {Z}}\). Lemma A.1 thus entails that
$$\begin{aligned} g_5= & {} \sum _{i=0}^{[k/2]}\sum _{j=0}^{[k/2]}\sum _{i'=0}^{[k/2]} \sum _{j'=0}^{[k/2]}\kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \\\le & {} \sum _{i=0}^{[k/2]}\sum _{j=0}^{[k/2]}\sum _{i'=0}^{[k/2]} \sum _{j'=0}^{[k/2]}\kappa _5\rho ^{i+j+i'+j'}\left\| \epsilon _{t-i} \epsilon _{t-j}\right\| _{2+\nu }\left\| \epsilon _{t-k-i'}\epsilon _{t-k-j'} \right\| _{2+\nu }\\&\times \,\left\{ \alpha _{\epsilon }\left( \min \left[ k+j'-i,k+i'-i, k+j'-j,k+i'-j\right] \right) \right\} ^{\nu /(2+\nu )} \\\le & {} \kappa ' \alpha _{\epsilon }^{\nu /(2+\nu )}\left( \left[ k/2\right] \right) . \end{aligned}$$
Since
$$\begin{aligned}&\left| \text{ Cov }(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}), c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \\&\quad \times \, \left. c_{i',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| \le C\rho ^{i+i'}, \end{aligned}$$
we have
$$\begin{aligned} h_1= & {} \sigma ^4\sum _{i>[k/2]}\sum _{i'=0}^{\infty }\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta , \Delta _{t},\dots ,\Delta _{t-i+1}),\right. \\&\left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r} (\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| \le \kappa '_1\rho ^{k/2}, \end{aligned}$$
for some positive constant \(\kappa '_1\). Using the same arguments, we obtain that \(h_2\) is bounded by \(\kappa '_2\rho ^{k/2}\). The \(\alpha \)-mixing property (see Theorem 14.1 in Davidson 1994, p. 210) and Lemma A.1, along with (12), entail that
$$\begin{aligned} h_3= & {} \sigma ^4\sum _{i=0}^{[k/2]}\sum _{i'=0}^{[k/2]}\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}) c_{i',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| \\\le & {} \sum _{i=0}^{[k/2]}\sum _{i'=0}^{[k/2]}\kappa _6\left\| c_i (\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1})\right\| _{2+\nu }\\&\times \,\left\| c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}) c_{i',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right\| _{2+\nu }\\&\times \,\left\{ \alpha _{\Delta }\left( k+1-i\right) \right\} ^{\nu /(2+\nu )}\le \kappa '_3 \alpha _{\Delta }^{\nu /(2+\nu )}\left( \left[ k/2\right] \right) . \end{aligned}$$
It follows that
$$\begin{aligned} \sum _{k=0}^{\infty }\left| c_k(l,r)(\theta )\right| \le \kappa \sum _{k=0}^{\infty }\rho ^{|k|/2}+\kappa '\sum _{k=0}^{\infty } \alpha _{\epsilon }^{\nu /(2+\nu )} \left( \left[ k/2\right] \right) +\kappa '' \sum _{k=0}^{\infty }\alpha _{\Delta }^{\nu /(2+\nu )} \left( \left[ k/2\right] \right) <\infty , \end{aligned}$$
by Assumption \({(\mathbf A2)}\). The same bound clearly holds for
$$\begin{aligned} \sum _{k=-\infty }^{0}\left| c_k(l,r)(\theta )\right| , \end{aligned}$$
which shows that
$$\begin{aligned} \sum _{k=-\infty }^{\infty }\left| c_k(l,r)(\theta )\right| <\infty . \end{aligned}$$
Then, the dominated convergence theorem gives
$$\begin{aligned} I_n(l,r)(\theta )=\frac{1}{n}\sum _{k=-n+1}^{n-1}(n-|k|)c_k(l,r)(\theta )\longrightarrow I(l,r)(\theta ):=\sum _{k=-\infty }^\infty c_k (l,r)(\theta ),\quad n\rightarrow +\infty , \end{aligned}$$
which completes the proof. \(\square \)
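The Cesàro-type convergence \(n^{-1}\sum _{|k|<n}(n-|k|)c_k\longrightarrow \sum _k c_k\) used in the last step can be illustrated numerically. The sketch below is illustrative only: it takes toy geometrically decaying autocovariances \(c_k=\rho ^{|k|}\) (not the actual \(c_k(l,r)(\theta )\) of the model), for which the limit has the closed form \((1+\rho )/(1-\rho )\).

```python
# Numerical illustration of (1/n) * sum_{|k|<n} (n-|k|) c_k -> sum_k c_k
# for toy summable autocovariances c_k = rho^{|k|} (an assumption for illustration).
rho = 0.5
c = lambda k: rho ** abs(k)

def I_n(n):
    # Cesaro-weighted partial sum: n^{-1} * sum_{|k|<n} (n - |k|) c_k
    return sum((n - abs(k)) * c(k) for k in range(-(n - 1), n)) / n

limit = (1 + rho) / (1 - rho)  # closed form of sum_{k=-inf}^{inf} rho^{|k|}
errs = [abs(I_n(n) - limit) for n in (10, 100, 1000)]
assert errs[0] > errs[1] > errs[2] and errs[2] < 1e-2
```

The errors decrease roughly like \(1/n\), as expected from the \(|k|/n\) weights on the dominated tail.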
Lemma A.3
Under the assumptions of Theorem 3.8, we have convergence in distribution of the random vector
$$\begin{aligned} \sqrt{n}\nabla Q_n(\theta _0) {\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}\mathcal{N}(0,I),\text { as } n\rightarrow \infty \end{aligned}$$
where we recall that the matrix I is given by (18).
Proof of Lemma A.3
In view of Proposition 3.5, it is easy to see that
$$\begin{aligned} \sqrt{n}\nabla \left( Q_n-O_n\right) (\theta _0)=o_{{\mathbb {P}}}(1). \end{aligned}$$
Thus \(\nabla Q_n(\theta _0) \) and \(\nabla O_n(\theta _0) \) have the same asymptotic distribution. Therefore, it remains to show that
$$\begin{aligned} \sqrt{n}\nabla O_n(\theta _0) {\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}\mathcal{N}(0,I),\text { as } n\rightarrow \infty . \end{aligned}$$
For l in 1,...,\((p+q)K\) and \(\theta \in \Theta \), we have
$$\begin{aligned} \frac{\partial \epsilon _{t}(\theta )}{\partial \theta _{l}}= \sum _{i=1}^{\infty }c_{i,l}(\theta ,\Delta _{t},\dots , \Delta _{t-i+1})\epsilon _{t-i}, \end{aligned}$$
(67)
where the sequence \(c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\) is such that \({\mathbb {E}}\sup _{\theta \in \Theta }(c_{i,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\rightarrow 0\) at a geometric rate as \(i\rightarrow \infty \) (see Lemma 3.3). Moreover, note that
$$\begin{aligned} \sqrt{n}\frac{\partial O_n(\theta )}{\partial \theta _l}= & {} \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_t(l)(\theta ) \\= & {} \frac{1}{\sqrt{n}}\sum _{t=1}^{n}\sum _{i=0}^{\infty } c_{i}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i} \sum _{j=1}^{\infty }c_{j,l}(\theta ,\Delta _{t},\dots , \Delta _{t-j+1})\epsilon _{t-j}. \end{aligned}$$
Since \( \nabla \epsilon _{t}(\theta _0) \) belongs to the Hilbert space \({{\mathcal {H}}}_\epsilon (t-1)\), the random variables \(\epsilon _{t}(\theta _0)\) and \( \nabla \epsilon _{t}(\theta _0) \) are orthogonal and it is easy to verify that \({{\mathbb {E}}}\left[ \sqrt{n}\nabla O_n(\theta _0)\right] =0\). Now, we have for all m
$$\begin{aligned} \sqrt{n}\frac{\partial O_n(\theta _0)}{\partial \theta _l}= \frac{1}{\sqrt{n}} \sum _{t=1}^{n}Y_{t,m}(l)+\frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t,m}(l) \end{aligned}$$
where
$$\begin{aligned} Y_{t,m}(l)= & {} \sum _{j=1}^{m}c_{j,l}(\theta _0,\Delta _{t},\dots , \Delta _{t-j+1})\epsilon _t\epsilon _{t-j}\\ Z_{t,m}(l)= & {} \sum _{j=m+1}^{\infty }c_{j,l}(\theta _0,\Delta _{t}, \dots ,\Delta _{t-j+1})\epsilon _t\epsilon _{t-j}. \end{aligned}$$
Let
$$\begin{aligned} Y_{t,m}&:=Y_{t,m}(\theta _0)=\left( Y_{t,m}(1),\dots ,Y_{t,m} ((p+q)K)\right) '\text { and }\\ Z_{t,m}&:=Z_{t,m}(\theta _0)=\left( Z_{t,m}(1),\dots ,Z_{t,m}((p+q)K)\right) '. \end{aligned}$$
The processes \((Y_{t,m})_{t}\) and \((Z_{t,m})_{t}\) are stationary and centered. Moreover, under Assumption (A2) and for fixed m, the process \(Y=(Y_{t,m})_{t}\) is strong mixing (see Davidson 1994, Theorem 14.1, p. 210), with mixing coefficients \(\alpha _Y(h)\le \alpha _{\Delta ,\epsilon }\left( \max \{0,h-m\}\right) \le \alpha _{\Delta }\left( \max \{0,h-m+1\}\right) +\alpha _{\epsilon } \left( \max \{0,h-m\}\right) \), by independence of \((\Delta _t)_{t\in {\mathbb {Z}}}\) and \((\epsilon _t)_{t\in {\mathbb {Z}}}\). Applying the central limit theorem (CLT) for mixing processes (see Herrndorf 1984), we directly obtain
$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_{t,m}{\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}\mathcal{N}(0,I_m),\quad I_m=\sum _{h=-\infty }^{\infty } \mathrm{Cov}\left( Y_{t,m},Y_{t-h,m}\right) . \end{aligned}$$
In the strong noise case, the infinite sum in \(I_m\) reduces to one term corresponding to \(h=0\), and \(I_m\) simply equals \(\mathrm{Cov}\left( Y_{t,m},Y_{t,m}\right) \).
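This strong-noise reduction can be checked on a minimal seeded Monte Carlo sketch (illustrative only, with hypothetical iid standard normal \(\epsilon _t\) and a single lag): for \(Y_t=\epsilon _t\epsilon _{t-1}\), a centered 1-dependent (hence mixing) sequence, only the \(h=0\) covariance survives and the limit variance is \(I_m={\mathbb {E}}[\epsilon _t^2\epsilon _{t-1}^2]=1\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 2000

def normalized_sum():
    # Y_t = eps_t * eps_{t-1}: centered, 1-dependent, so the CLT for mixing
    # processes applies with limit variance I_m = E[eps^2]^2 = 1 (only h = 0 term).
    eps = rng.standard_normal(n + 1)
    Y = eps[1:] * eps[:-1]
    return Y.sum() / np.sqrt(n)

S = np.array([normalized_sum() for _ in range(reps)])
assert abs(S.mean()) < 0.1          # centered
assert abs(S.var() - 1.0) < 0.15    # empirical variance close to I_m = 1
```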
As in Francq and Zakoïan (1998) (see Lemma 3), we can show that \(I=\lim _{m\rightarrow \infty }I_m\) exists. Since \(\Vert Z_{t,m}\Vert _2\rightarrow 0\) at an exponential rate when \(m\rightarrow \infty \), using the arguments given in Francq and Zakoïan (1998) (see Lemma 4), we show that
$$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }{{\mathbb {P}}}\left\{ \left\| n^{-1/2} \sum _{t=1}^{n}Z_{t,m}\right\| >\varepsilon \right\} =0 \end{aligned}$$
(68)
for every \(\varepsilon >0\) (see Lemma A.4 below). From a standard result (see e.g. Brockwell and Davis 1991, Proposition 6.3.9), we deduce that
$$\begin{aligned} \sqrt{n}\,\nabla O_n(\theta _0) = \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_{t,m}+\frac{1}{\sqrt{n}} \sum _{t=1}^{n}Z_{t,m}{\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}{{\mathcal {N}}}(0,I), \end{aligned}$$
which completes the proof. \(\square \)
Lemma A.4
Under the assumptions of Theorem 3.8, (68) holds, that is
$$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }{{\mathbb {P}}}\left\{ \left\| n^{-1/2} \sum _{t=1}^{n}Z_{t,m}\right\| >\varepsilon \right\} =0. \end{aligned}$$
Proof of Lemma A.4
For \(l=1,\dots ,(p+q)K\), by stationarity we have
$$\begin{aligned} \mathrm{Var}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t,m}(l)\right)= & {} \frac{1}{n}\sum _{t,s=1}^{n}\text{ Cov }(Z_{t,m}(l),Z_{s,m}(l))\\= & {} \frac{1}{n}\sum _{|h|<n}(n-|h|)\text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\\\le & {} \sum _{h=-\infty }^{\infty }\left| \text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\right| . \end{aligned}$$
Consider first the case \(h\ge 0\). Because \({\mathbb {E}} \sup _{\theta \in \Theta }(c_{j,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-j+1}))^2\le \kappa \rho ^{j}\) (see (12)), using also \({{\mathbb {E}}}|\epsilon _t|^4<\infty \), for \([h/2]\le m\) it follows from the Hölder inequality that
$$\begin{aligned} \sup _h\left| \text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\right| = \sup _h\left| {{\mathbb {E}}}(Z_{t,m}(l)Z_{t-h,m}(l))\right| \le \kappa \rho ^m. \end{aligned}$$
(69)
Let \(h>0\) such that \([h/2]>m\). Write
$$\begin{aligned} Z_{t,m}(l)=Z_{t,m}^{h^-}(l)+Z_{t,m}^{h^+}(l), \end{aligned}$$
where
$$\begin{aligned}&Z_{t,m}^{h^-}(l)=\sum _{j=m+1}^{[h/2]}c_{j,l}(\theta _0,\Delta _{t}, \dots ,\Delta _{t-j+1})\epsilon _t\epsilon _{t-j}, \\&Z_{t,m}^{h^+}(l)=\sum _{j=[h/2]+1}^{\infty }c_{j,l}(\theta _0,\Delta _{t}, \dots ,\Delta _{t-j+1})\epsilon _t\epsilon _{t-j}. \end{aligned}$$
Note that \(Z_{t,m}^{h^-}(l)\) belongs to the \(\sigma \)-field generated by \(\{\Delta _{t},\dots , \Delta _{t-[h/2]+1}, \epsilon _t,\epsilon _{t-1},\dots ,\epsilon _{t-[h/2]}\}\) and that \(Z_{t-h,m}(l)\) belongs to the \(\sigma \)-field generated by \(\{\Delta _{t-h},\Delta _{t-h-1},\dots ,\epsilon _{t-h},\epsilon _{t-h-1},\dots \}\). Note also that, by (A3), \({{\mathbb {E}}}|Z_{t,m}^{h^-}(l)|^{2+\nu }<\infty \) and \({{\mathbb {E}}}|Z_{t-h,m}(l)|^{2+\nu }<\infty \). The \(\alpha -\)mixing property and Lemma A.1 then entail that
$$\begin{aligned}&\left| \text{ Cov }(Z_{t,m}^{h^-}(l),Z_{t-h,m}(l))\right| \nonumber \\&\quad \le \kappa _1\sum _{j=m+1}^{[h/2]}\sum _{j'=m+1}^{\infty }\left\| c_{j',l} (\theta _0,\Delta _{t-h},\dots ,\Delta _{t-h-j'+1})\epsilon _t\epsilon _{t-j'} \right\| _{2+\nu }\nonumber \\&\qquad \times \,\left\| c_{j,l}(\theta _0,\Delta _{t},\dots ,\Delta _{t-j+1}) \epsilon _t\epsilon _{t-j}\right\| _{2+\nu }\left[ \alpha _{\Delta , \epsilon }([h/2])\right] ^{\nu /(2+\nu )}\nonumber \\&\quad \le \kappa _2\sum _{j=m+1}^{[h/2]}\sum _{j'=m+1}^{\infty }\rho ^j \rho ^{j'}\left[ \alpha _{\epsilon }^{\nu /(2+\nu )} ([h/2])+\alpha _{\Delta }^{\nu /(2+\nu )}([h/2])\right] \nonumber \\&\quad \le \kappa \rho ^{m}\left[ \alpha _{\epsilon }^{\nu /(2+\nu )} ([h/2])+\alpha _{\Delta }^{\nu /(2+\nu )}([h/2])\right] . \end{aligned}$$
(70)
By the argument used to show (69), we also have
$$\begin{aligned} \left| \text{ Cov }(Z_{t,m}^{h^+}(l),Z_{t-h,m}(l))\right| \le \kappa \rho ^h\rho ^m. \end{aligned}$$
(71)
In view of (69), (70) and (71), we obtain
$$\begin{aligned}&\sum _{h=0}^{\infty }\left| \text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\right| \\&\quad \le \kappa m\rho ^m+\sum _{h=m}^{\infty } \left\{ \kappa \rho ^h\rho ^m+ \kappa \rho ^{m}\left[ \alpha _{\epsilon }^{\nu /(2+\nu )}([h/2])+ \alpha _{\Delta }^{\nu /(2+\nu )}([h/2])\right] \right\} \rightarrow 0 \end{aligned}$$
as \(m\rightarrow \infty \) by (A2). This implies that
$$\begin{aligned} \sup _n\mathrm{Var}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t,m}(l)\right) \xrightarrow [m\rightarrow \infty ]{}0. \end{aligned}$$
(72)
We have the same bound for \(h<0\). The conclusion follows from (72). \(\square \)
Lemma A.5
Under the assumptions of Theorem 3.8, almost surely
$$\begin{aligned} \nabla ^2 Q_n(\theta _0) \longrightarrow J,\quad n\rightarrow \infty , \end{aligned}$$
where J, given by (17), exists and is invertible.
Proof of Lemma A.5
For all l, r in \(1,\dots ,(p+q)K\), in view of Proposition 3.5, we have almost surely
$$\begin{aligned} \left| \frac{\partial ^2 }{\partial \theta _l\partial \theta _r} \left( Q_n(\theta _0)-O_n(\theta _0)\right) \right| \rightarrow 0, \text { as } n\rightarrow \infty . \end{aligned}$$
Thus \({\partial ^2 Q_n(\theta _0)}/{\partial \theta _l\partial \theta _r}\) and \({\partial ^2 O_n(\theta _0)}/{\partial \theta _l\partial \theta _r}\) have almost surely the same asymptotic behavior. From (8) and (12), there exists a sequence \(\left( c_{i,l,r}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\right) _{i\in {\mathbb {N}}}\) such that
$$\begin{aligned}&\frac{\partial ^2 \epsilon _{t}(\theta )}{\partial \theta _{l}\partial \theta _{r}}=\sum _{i=1}^{\infty }c_{i,l,r}(\theta , \Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i} \text { with }{\mathbb {E}}(c_{i,l,r}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i, \nonumber \\&\quad \forall i. \end{aligned}$$
(73)
This implies that \({\partial ^2 \epsilon _{t}(\theta )}/{\partial \theta _{l}\partial \theta _{r}}\) belongs to \(L^2\). On the other hand, we have
$$\begin{aligned} \frac{\partial ^2 O_n(\theta )}{\partial \theta _l\partial \theta _r}= & {} \frac{1}{n}\sum _{t=1}^n\epsilon _t(\theta )\frac{\partial ^2 \epsilon _t(\theta )}{\partial \theta _l\partial \theta _r}+\frac{1}{n}\sum _{t=1}^n\frac{\partial \epsilon _t(\theta )}{\partial \theta _l}\frac{\partial \epsilon _t(\theta )}{\partial \theta _r}\\\longrightarrow & {} {{\mathbb {E}}}\left( \epsilon _t(\theta )\frac{\partial ^2 \epsilon _t(\theta )}{\partial \theta _l\partial \theta _r}\right) +{{\mathbb {E}}}\left( \frac{\partial \epsilon _t(\theta )}{\partial \theta _l}\frac{\partial \epsilon _t(\theta )}{\partial \theta _r}\right) ,\text { as } n\rightarrow \infty , \end{aligned}$$
by the ergodic theorem. Using the uncorrelatedness between \(\epsilon _t(\theta _0)\) and the linear past \({{\mathcal {H}}}_\epsilon (t-1)\), \(\partial \epsilon _t(\theta _0)/\partial \theta _l\in \mathcal{H}_\epsilon (t-1)\), and \(\partial ^2 \epsilon _t(\theta _0)/\partial \theta _l\partial \theta _r\in {{\mathcal {H}}}_\epsilon (t-1)\), we have
$$\begin{aligned} {{\mathbb {E}}}\left( \frac{\partial ^2 O_n(\theta _0)}{\partial \theta _l\partial \theta _r}\right) = {{\mathbb {E}}}\left( \frac{\partial \epsilon _t(\theta _0)}{\partial \theta _l}\frac{\partial \epsilon _t(\theta _0)}{\partial \theta _r}\right) =J(l,r). \end{aligned}$$
(74)
Therefore, J is the covariance matrix of \(\partial \epsilon _t(\theta _0)/\partial \theta \). If J is singular, then there exists a vector \(\varvec{c}=(c_1,\dots ,c_{(p+q)K})'\ne 0\) such that \(\varvec{c}'J\varvec{c}=0\). Thus we have
$$\begin{aligned} \sum _{k=1}^{(p+q)K}c_k\frac{\partial \epsilon _t(\theta _0)}{\partial \theta _k}=0,\,a.s. \end{aligned}$$
(75)
Differentiating the two sides of (4) yields
$$\begin{aligned} -\sum _{i=1}^p(g_i^{a})^*(\Delta _t,\theta _0) X_{t-i}&=\sum _{k=1}^{(p+q)K}c_k\frac{\partial \epsilon _t(\theta _0)}{\partial \theta _k}-\sum _{j=1}^q g_j^b(\Delta _t,\theta _0)\sum _{k=1}^{(p+q)K}c_k\frac{\partial \epsilon _{t-j}(\theta _0)}{\partial \theta _k} \\&\quad -\,\sum _{j=1}^q (g_j^{b})^*(\Delta _t,\theta _0)\epsilon _{t-j}(\theta _0) \end{aligned}$$
where
$$\begin{aligned} (g_i^{a})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_i^a(\Delta _t,\theta _0)}{\partial \theta _k}\text { and } (g_j^{b})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_j^b(\Delta _t,\theta _0)}{\partial \theta _k}. \end{aligned}$$
Because (75) is satisfied for all t, we have
$$\begin{aligned} \sum _{i=1}^p(g_i^{a})^*(\Delta _t,\theta _0) X_{t-i}=\sum _{j=1}^q (g_j^{b})^*(\Delta _t,\theta _0)\epsilon _{t-j}(\theta _0). \end{aligned}$$
The latter equation yields an ARMARC\((p-1,q-1)\) representation at best. The identifiability assumption (see Proposition 3.1) excludes the existence of such a representation.
Thus
$$\begin{aligned}&(g_i^{a})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_i^a(\Delta _t,\theta _0)}{\partial \theta _k}=0\text { and } \\&(g_j^{b})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_j^b(\Delta _t,\theta _0)}{\partial \theta _k}=0 \end{aligned}$$
and the conclusion follows. \(\square \)
Proof of Theorem 3.8
For all \(i,j,k=1,\dots ,K(p+q)\) we have
$$\begin{aligned} \frac{\partial ^3O_n(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}&= \frac{1}{n}\sum _{t=1}^n\left\{ \epsilon _t(\theta )\frac{\partial ^3\epsilon _t (\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k} \right\} +\frac{1}{n}\sum _{t=1}^n\left\{ \frac{\partial \epsilon _t(\theta )}{\partial \theta _i} \frac{\partial ^2\epsilon _t(\theta )}{\partial \theta _j\partial \theta _k} \right\} \\&\quad +\frac{1}{n}\sum _{t=1}^n\left\{ \frac{\partial ^2\epsilon _t(\theta )}{\partial \theta _i\partial \theta _j}\frac{\partial \epsilon _t(\theta )}{\partial \theta _k}\right\} +\frac{1}{n}\sum _{t=1}^n \left\{ \frac{\partial \epsilon _t(\theta )}{\partial \theta _j} \frac{\partial ^2\epsilon _t(\theta )}{\partial \theta _i\partial \theta _k}\right\} . \end{aligned}$$
Using the ergodic theorem, the Cauchy-Schwarz inequality and Lemma 3.4, we obtain
$$\begin{aligned} \sup _n\sup _{\theta \in \Theta }\left| \frac{\partial ^3O_n(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}\right| <+\infty . \end{aligned}$$
(76)
In view of Proposition 3.5, we have almost surely
$$\begin{aligned} \sup _{\theta \in \Theta }\left| \frac{\partial ^3 }{\partial \theta _i\partial \theta _j\partial \theta _k} \left( Q_n(\theta )-O_n(\theta )\right) \right| \longrightarrow 0, \text { as } n\rightarrow \infty . \end{aligned}$$
Thus \({\partial ^3 Q_n(\theta )}/{\partial \theta _i\partial \theta _j\partial \theta _k}\) and \({\partial ^3 O_n(\theta )}/{\partial \theta _i\partial \theta _j\partial \theta _k}\) have almost surely the same asymptotic behavior. In view of Theorem 3.7 and (A4), we have almost surely \({\hat{\theta }}_n\longrightarrow \theta _0\in {\mathop {\Theta }\limits ^{\circ }}\). Thus \(\nabla Q_n({\hat{\theta }}_n)=0_{{\mathbb {R}}^{(p+q)K}}\) for sufficiently large n, and a Taylor expansion gives, for all \(r\in \{1,\ldots ,(p+q)K \}\),
$$\begin{aligned} 0=\sqrt{n} \frac{\partial }{\partial \theta _r} Q_n(\theta _0) + \nabla \frac{\partial }{\partial \theta _r} Q_n(\theta _{n,r}^*) \sqrt{n}\left( {\hat{\theta }}_n-\theta _0\right) , \end{aligned}$$
(77)
where \(\theta _{n,r}^*\) lies on the segment in \({\mathbb {R}}^{(p+q)K}\) with endpoints \({\hat{\theta }}_n\) and \(\theta _0\). Using again a Taylor expansion, Theorem 3.7 and (76), we obtain for all \(l=1,\dots ,(p+q)K\),
$$\begin{aligned} \left| \frac{\partial ^2 Q_n(\theta _{n,r}^*)}{\partial \theta _l\partial \theta _r}-\frac{\partial ^2 Q_n(\theta _0)}{\partial \theta _l\partial \theta _r}\right|\le & {} \sup _n\sup _{\theta \in \Theta }\left\| \nabla \left( \frac{\partial ^2 }{\partial \theta _l\partial \theta _r}Q_n(\theta )\right) \right\| \left\| \theta _{n,r}^*-\theta _0\right\| \\\longrightarrow & {} 0 \text { a.s. as }n\rightarrow \infty . \end{aligned}$$
This, along with (77), implies that, as \(n\rightarrow \infty \)
$$\begin{aligned} \sqrt{n}\left( {\hat{\theta }}_n-\theta _0\right) =-\left[ \nabla ^2 Q_n(\theta _0) \right] ^{-1}\sqrt{n}\frac{\partial Q_n(\theta _0)}{\partial \theta }+o_{{\mathbb {P}}}(1). \end{aligned}$$
From Lemma A.3 and Lemma A.5, we obtain that \(\sqrt{n}({\hat{\theta }}_n-\theta _0)\) has a limiting normal distribution with mean 0 and covariance matrix \(J^{-1}IJ^{-1}\). \(\square \)
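The sandwich form \(J^{-1}IJ^{-1}\) is the usual one for M-estimators whose score variance \(I\) differs from the Hessian limit \(J\). A toy seeded Monte Carlo sketch (illustrative only, a hypothetical scalar least squares problem with heteroskedastic noise, unrelated to the ARMARC model) shows the empirical variance of \(\sqrt{n}({\hat{\theta }}_n-\theta _0)\) matching \(J^{-1}IJ^{-1}\):

```python
import numpy as np

rng = np.random.default_rng(2)
theta0, n, reps = 1.0, 400, 2000

def estimate():
    # y = theta0 * x + u with u = x * e: E[u | x] = 0 but Var(u | x) = x^2,
    # so the score variance I = E[x^2 u^2] = E[x^4] = 3 differs from J = E[x^2] = 1.
    x = rng.standard_normal(n)
    u = x * rng.standard_normal(n)
    y = theta0 * x + u
    return (x @ y) / (x @ x)          # least squares slope

vals = np.sqrt(n) * (np.array([estimate() for _ in range(reps)]) - theta0)
J, I = 1.0, 3.0
assert abs(vals.mean()) < 0.2
assert abs(vals.var() - I / J**2) < 0.5   # sandwich variance J^{-1} I J^{-1} = 3
```

The naive variance \(\sigma ^2 J^{-1}=1\) would clearly miss the empirical spread, which is why the sandwich matrix appears in the conclusion of the theorem.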
1.5 A.5. Proof of Theorem 3.10
The proof of Theorem 3.10 is based on a series of lemmas.
Consider the regression of \(\Upsilon _t\) on \(\Upsilon _{t-1},\dots ,\Upsilon _{t-r}\) defined by
$$\begin{aligned} \Upsilon _t=\sum _{i=1}^{r}\Phi _{r,i}\Upsilon _{t-i}+u_{r,t},\qquad \end{aligned}$$
(78)
where \(u_{r,t}\) is orthogonal to \(\left\{ \Upsilon _{t-1},\dots ,\Upsilon _{t-r}\right\} \) for the \(L^2\) inner product. If \(\Upsilon _{1},\dots ,\Upsilon _{n}\) were observed, the least squares estimators of \(\underline{{{\varvec{\Phi }}}}_{r}=\left( \Phi _{r,1}\cdots \Phi _{r,r}\right) \) and \(\Sigma _{u_r}=\text{ Var }(u_{r,t})\) would be given by
$$\begin{aligned} \underline{\breve{{\varvec{\Phi }}}}_{r}={\hat{\Sigma }}_{{\Upsilon }, \underline{{\Upsilon }}_{r}} {\hat{\Sigma }}_{\underline{{\Upsilon }}_{r}}^{-1}\qquad \text{ and }\qquad {\hat{\Sigma }}_{\breve{u}_r}=\frac{1}{n}\sum _{t=1}^n \left( {\Upsilon }_t-\underline{\breve{{\varvec{\Phi }}}}_{r} \underline{{\Upsilon }}_{r,t}\right) \left( {\Upsilon }_t-\underline{\breve{{\varvec{\Phi }}}}_{r} \underline{{\Upsilon }}_{r,t}\right) ' \end{aligned}$$
where \(\underline{{\Upsilon }}_{r,t}=({\Upsilon }_{t-1}' \cdots {\Upsilon }_{t-r}')'\),
$$\begin{aligned} {\hat{\Sigma }}_{{\Upsilon },\underline{{\Upsilon }}_{r}}= \frac{1}{n}\sum _{t=1}^n{\Upsilon }_t\underline{{\Upsilon }}_{r,t}',\qquad {\hat{\Sigma }}_{\underline{{\Upsilon }}_{r}}= \frac{1}{n}\sum _{t=1}^n\underline{{\Upsilon }}_{r,t} \underline{{\Upsilon }}_{r,t}', \end{aligned}$$
with the convention that \({\Upsilon }_t=0\) when \(t\le 0\), and assuming that \({\hat{\Sigma }}_{\underline{{\Upsilon }}_{r}}\) is nonsingular (which holds asymptotically).
In practice, we only observe \(X_1,\dots ,X_n\). The residuals \({{\hat{\epsilon }}}_t:=e_t({{\hat{\theta }}}_n)\) are then available for \(t=1,\dots ,n\), as are the vectors \({{\hat{\Upsilon }}}_t\) obtained by replacing \(\theta _0\) by \({\hat{\theta }}_n\) in (19). We therefore define the least squares estimators of \(\underline{{{\varvec{\Phi }}}}_{r}=\left( \Phi _{r,1}\cdots \Phi _{r,r}\right) \) and \(\Sigma _{u_r}=\text{ Var }(u_{r,t})\) by
$$\begin{aligned} \underline{\hat{{\varvec{\Phi }}}}_{r}={\hat{\Sigma }}_{{\hat{\Upsilon }}, \underline{{\hat{\Upsilon }}}_{r}} {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}\qquad \text{ and } \qquad {\hat{\Sigma }}_{{\hat{u}}_r}=\frac{1}{n}\sum _{t=1}^n \left( {\hat{\Upsilon }}_t-\underline{\hat{{\varvec{\Phi }}}}_{r} \underline{{\hat{\Upsilon }}}_{r,t}\right) \left( {\hat{\Upsilon }}_t-\underline{\hat{{\varvec{\Phi }}}}_{r} \underline{{\hat{\Upsilon }}}_{r,t}\right) ' \end{aligned}$$
where \(\underline{{\hat{\Upsilon }}}_{r,t}=({\hat{\Upsilon }}_{t-1}' \cdots {\hat{\Upsilon }}_{t-r}')'\),
$$\begin{aligned} {\hat{\Sigma }}_{{\hat{\Upsilon }},\underline{{\hat{\Upsilon }}}_{r}}= \frac{1}{n}\sum _{t=1}^n{\hat{\Upsilon }}_t\underline{{\hat{\Upsilon }}}_{r,t}',\qquad {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}= \frac{1}{n}\sum _{t=1}^n\underline{{\hat{\Upsilon }}}_{r,t} \underline{{\hat{\Upsilon }}}_{r,t}', \end{aligned}$$
with the convention that \({\hat{\Upsilon }}_t=0\) when \(t\le 0\), and assuming that \({\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}\) is nonsingular (which holds asymptotically).
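As a purely illustrative numerical sketch (not part of the proof), the least squares formulas above can be computed directly. All data below are synthetic: the dimensions n, d (playing the role of \(K(p+q)\)) and r are arbitrary choices, and a stable VAR(1) stands in for the unobservable process \((\Upsilon _t)\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 500, 2, 3          # sample size, dimension of Upsilon_t, number of lags

# Synthetic stand-in for (Upsilon_t): a stable VAR(1) with unit Gaussian noise
A = np.array([[0.5, 0.1],
              [0.0, 0.4]])
U = np.zeros((n, d))
for t in range(1, n):
    U[t] = A @ U[t - 1] + rng.standard_normal(d)

# Convention Upsilon_t = 0 for t <= 0: prepend r rows of zeros before lagging
Upad = np.vstack([np.zeros((r, d)), U])
# underline{Upsilon}_{r,t} = (Upsilon_{t-1}', ..., Upsilon_{t-r}')', t = 1..n
Ur = np.hstack([Upad[r - i : r - i + n] for i in range(1, r + 1)])   # n x (d*r)

# hat Sigma_{Upsilon, underline Upsilon_r} and hat Sigma_{underline Upsilon_r}
S_cross = U.T @ Ur / n                 # d x (d*r)
S_lag = Ur.T @ Ur / n                  # (d*r) x (d*r), nonsingular here

Phi = S_cross @ np.linalg.inv(S_lag)   # estimator of (Phi_{r,1} ... Phi_{r,r})
resid = U - Ur @ Phi.T                 # fitted residuals u_{r,t}
Sigma_u = resid.T @ resid / n          # hat Sigma_{u_r}
```

Since the surrogate is a VAR(1), the first block of `Phi` should be close to `A` and the remaining lag blocks close to zero.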
We now make more precise the matrix norm defined at the end of Sect. 2: in the sequel, we use the multiplicative matrix norm defined by
$$\begin{aligned} \Vert A\Vert =\sup _{\Vert x\Vert \le 1}\Vert Ax\Vert =\varrho ^{1/2}(A'{\bar{A}}), \end{aligned}$$
(79)
where A is a \({\mathbb {C}}^{d_1\times d_2}\) matrix, \(\Vert x\Vert ^2=x' {\bar{x}}\) is the Euclidean norm of the vector \(x\in {\mathbb {C}}^{d_2\times 1}\), and \(\varrho (\cdot )\) denotes the spectral radius. This norm satisfies
$$\begin{aligned} \Vert A\Vert ^2\le \sum _{i,j}a_{i,j}^2, \text { when }A \text { is a }{\mathbb {R}}^{d_1\times d_2}\text { matrix} \end{aligned}$$
(80)
with obvious notation. This choice of norm is crucial for the following lemma to hold (with, e.g., the Euclidean (Frobenius) norm, the result is not valid). Let
$$\begin{aligned} {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}= & {} {\mathbb {E}}{\Upsilon }_{t} {\underline{\Upsilon }}_{r,t}', \quad {\Sigma }_{{\Upsilon }}={\mathbb {E}}{\Upsilon }_{t}{\Upsilon }_{t}', \quad {\Sigma }_{{\underline{\Upsilon }}_{r}}= {\mathbb {E}}{\underline{\Upsilon }}_{r,t}{\underline{\Upsilon }}_{r,t}',\quad {\hat{\Sigma }}_{{\hat{\Upsilon }}}= \frac{1}{n}\sum _{t=1}^n{\hat{\Upsilon }}_t{\hat{\Upsilon }}_{t}'. \end{aligned}$$
In the sequel, C and \(\rho \) denote generic constants such that \(C>0\) and \(\rho \in (0,1)\), whose exact values are unimportant and may change from one occurrence to another.
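For concreteness, the following sketch (illustrative only, with an arbitrary random matrix) checks numerically that the norm (79) coincides, for a real matrix, with the largest singular value, and that it satisfies the bound (80):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))        # arbitrary real d1 x d2 matrix

# (79): ||A|| = rho^{1/2}(A' A), square root of the largest eigenvalue of A'A
spec_norm = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))

# It coincides with the operator 2-norm (largest singular value) of A
assert np.isclose(spec_norm, np.linalg.norm(A, 2))

# (80): ||A||^2 <= sum_{i,j} a_{i,j}^2, i.e. the squared Frobenius norm
assert spec_norm ** 2 <= np.sum(A ** 2) + 1e-12
```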
Lemma A.6
Under the assumptions of Theorem 3.10,
$$\begin{aligned} \sup _{r\ge 1}\max \left\{ \left\| {\Sigma }_{{\Upsilon }, {\underline{\Upsilon }}_{r}}\right\| ,\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| , \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \right\} < \infty . \end{aligned}$$
Proof
The proof is an extension of Section 5.2 of Grenander and Szegö (1958). We readily have
$$\begin{aligned} \Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}x\Vert \le \Vert {\Sigma }_{{\underline{\Upsilon }}_{r+1}}(x', 0_{(p+q)K}')'\Vert \quad \text{ and } \quad \Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}x\Vert \le \Vert {\Sigma }_{{\underline{\Upsilon }}_{r+1}}(0_{(p+q)K}',x')'\Vert \end{aligned}$$
for any \(x\in {\mathbb {R}}^{K(p+q)r}\) and \(0_{(p+q)K}=(0,\dots ,0)'\in {\mathbb {R}}^{(p+q)K}\). Therefore
$$\begin{aligned} 0<\left\| \text{ Var }\left( {\Upsilon }_{t}\right) \right\| = \left\| {\Sigma }_{{\underline{\Upsilon }}_{1}}\right\| \le \left\| {\Sigma }_{{\underline{\Upsilon }}_{2}}\right\| \le \cdots \end{aligned}$$
and
$$\begin{aligned} \left\| {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\right\| \le \left\| {\Sigma }_{{\underline{\Upsilon }}_{r+1}}\right\| , \end{aligned}$$
so that, to prove the result, it suffices to show that \( \sup _{r\ge 1}\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \) and \(\sup _{r\ge 1}\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \) are finite. Let us write the matrix \({\Sigma }_{{\underline{\Upsilon }}_{r}}\) in blockwise form
$$\begin{aligned} {\Sigma }_{{\underline{\Upsilon }}_{r}}=\left[ C(i-j)\right] _{i,j=1,\ldots ,r},\quad C(k)={{\mathbb {E}}}(\Upsilon _{0}\Upsilon _{k}')\in {\mathbb {R}}^{K(p+q)\times K(p+q)},\ k\in {\mathbb {Z}}. \end{aligned}$$
Let now \(f:{\mathbb {R}}\longrightarrow {\mathbb {C}}^{K(p+q)\times K(p+q)}\) be the spectral density of \((\Upsilon _t)_{t\in {\mathbb {Z}}}\) defined by
$$\begin{aligned} f(\omega )=\frac{1}{2\pi } \sum _{k=-\infty }^\infty C(k) e^{i\omega k},\quad \omega \in {\mathbb {R}}. \end{aligned}$$
A direct consequence of (19) and Lemma A.2 is that the series defining \(f(\omega )\) converges absolutely, and that \(\sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert <+\infty \), for any norm \(\Vert \cdot \Vert \) on \({\mathbb {C}}^{K(p+q)\times K(p+q)}\) (in particular, one which is independent of \(r\ge 1\)). Another consequence is the inversion formula
$$\begin{aligned} C(k)=\int _{-\pi }^\pi f(x) e^{-ikx}dx,\quad \forall k\in {\mathbb {Z}}. \end{aligned}$$
(81)
Last, it is easy to check that \(f(\omega )\) is a Hermitian matrix for all \(\omega \in {\mathbb {R}}\), i.e. \(\overline{f(\omega )}=f(\omega ) '\), where \({\bar{z}}\) denotes the conjugate of any vector or matrix z with entries in \({\mathbb {C}}\). Let then \(\delta ^{(r)}=\left( {\delta ^{(r)}_1} ',\ldots ,{\delta ^{(r)}_r} '\right) \in {\mathbb {R}}^{rK(p+q)\times 1}\) be an eigenvector of \({\Sigma }_{{{\underline{\Upsilon }}_{r}}}\), with \(\delta ^{(r)}_j \in {\mathbb {R}}^{K(p+q)\times 1}\), \(j=1,\ldots ,r\), such that \(\Vert {\delta ^{(r)}}\Vert =1\) and
$$\begin{aligned} {\delta ^{(r)}} ' {\Sigma }_{{{\underline{\Upsilon }}_{r}}} \delta ^{(r)}= \Vert {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\Vert =\varrho \left( {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\right) , \end{aligned}$$
(82)
where \(\Vert {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\Vert \) is the norm of matrix \({\Sigma }_{{{\underline{\Upsilon }}_{r}}}\) defined in (79). We then check that
$$\begin{aligned} {\delta ^{(r)}} ' {\Sigma }_{{{\underline{\Upsilon }}_{r}}} \delta ^{(r)}= & {} \sum _{i,j=1}^r {\delta ^{(r)}_i} ' C(i-j) {\delta ^{(r)}_j} \nonumber \\= & {} \int _{-\pi }^\pi \left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) ' f(x) \overline{\left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) }dx , \end{aligned}$$
(83)
the last equality being a direct consequence of (81). Since f(x) is Hermitian and non-negative definite (being the value of a spectral density), \((X,Y)\in {\mathbb {C}}^{K(p+q)\times 1}\times {\mathbb {C}}^{K(p+q)\times 1}\mapsto X' f(x) {\bar{Y}}\) defines a non-negative semi-definite Hermitian form, hence we have for all \(x\in {\mathbb {R}}\) and \(X\in {\mathbb {C}}^{K(p+q)\times 1}\):
$$\begin{aligned} 0\le X' f(x) {\bar{X}}\le \Vert f(x) \Vert \cdot X'{\bar{X}} \le \sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert \cdot X'{\bar{X}} . \end{aligned}$$
Let us point out that \(\sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert \) is a quantity which is independent of \(r\ge 1\). We deduce from (83) and the previous inequality that
$$\begin{aligned} {\delta ^{(r)}} ' {\Sigma }_{{{\underline{\Upsilon }}_{r}}} \delta ^{(r)}\le \sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert \int _{-\pi }^\pi \left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) ' \overline{\left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) }dx . \end{aligned}$$
(84)
A short computation yields that
$$\begin{aligned} \frac{1}{2\pi }\int _{-\pi }^\pi \left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) ' \overline{\left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) }dx = \sum _{m=1}^r {\delta ^{(r)}_m} ' \delta ^{(r)}_m=\Vert {\delta ^{(r)}}\Vert ^2=1, \end{aligned}$$
which, coupled with (82) and (84), yields \( \Vert {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\Vert \le 2\pi \sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert <+\infty \), an upper bound independent of \(r\ge 1\). By similar arguments, the smallest eigenvalue of \( {\Sigma }_{{\underline{\Upsilon }}_{r}}\) is bounded below by a positive constant independent of r. Since \(\Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\Vert \) equals the inverse of the smallest eigenvalue of \( {\Sigma }_{{\underline{\Upsilon }}_{r}}\), the proof is complete. \(\square \)
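The eigenvalue bounds used in this proof can be illustrated on the simplest scalar case: for an AR(1) process with parameter \(\phi \), the eigenvalues of the \(r\times r\) autocovariance (Toeplitz) matrix stay between \(2\pi \inf f\) and \(2\pi \sup f\) for every r. The sketch below (illustrative only, with an arbitrary \(\phi \)) verifies this numerically:

```python
import numpy as np

# AR(1): gamma(k) = phi^{|k|} / (1 - phi^2), with spectral density
# f(w) = (2*pi)^{-1} (1 - 2*phi*cos(w) + phi^2)^{-1}, so that
# 2*pi*min f = 1/(1+phi)^2 and 2*pi*max f = 1/(1-phi)^2.
phi = 0.6
for r in (5, 20, 80):
    idx = np.arange(r)
    # Toeplitz autocovariance matrix [gamma(i-j)]_{i,j=1..r}
    Gamma = phi ** np.abs(np.subtract.outer(idx, idx)) / (1 - phi ** 2)
    eig = np.linalg.eigvalsh(Gamma)
    # All eigenvalues lie in [2*pi*min f, 2*pi*max f], uniformly in r
    assert eig.min() >= 1 / (1 + phi) ** 2 - 1e-8
    assert eig.max() <= 1 / (1 - phi) ** 2 + 1e-8
```

As r grows, the extreme eigenvalues approach (but never cross) these two bounds, which is exactly the uniformity in r exploited above.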
The following lemma is necessary in the sequel.
Lemma A.7
Let us suppose that (A1) holds, that the stationarity condition (A5a) holds for \(\nu =6\), and that
- (A6):
$$\begin{aligned} \limsup _{t\rightarrow \infty }\frac{1}{t}\ln {\mathbb {E}}\left( \sup _{\theta \in \Theta } \left| \left| \prod _{i=1}^t \Phi (\Delta _i,\theta )\right| \right| ^{32}\right)<0,\quad \limsup _{t\rightarrow \infty }\frac{1}{t}\ln {\mathbb {E}}\left( \left| \left| \prod _{i=1}^t \Psi (\Delta _i)\right| \right| ^{32}\right) <0 \end{aligned}$$
hold. We assume moreover that \(\epsilon _t\in L^{4\nu +8}\). Then the sequences \((\epsilon _t(\theta ))_{t\in {\mathbb {Z}}}\) and \((e_t(\theta ))_{t\in {\mathbb {Z}}}\) satisfy
- 1.
\(\left| \left| \sup _{\theta \in \Theta } |\epsilon _0(\theta )|\right| \right| _{16}<+\infty \) and \(\sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } |e_t(\theta )|\right| \right| _{16}<+\infty \),
- 2.
\(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\right| \right| _4\) tends to 0 exponentially fast as \(t\rightarrow \infty \),
- 3.
For all \(\alpha >0\), \(t^\alpha \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\longrightarrow 0\) a.s. as \(t\rightarrow \infty \),
- 4.
For all \(j=1,2,3\), \(\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j\epsilon _0(\theta )||\right| \right| _{16}<+\infty \), \(\sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j e_t(\theta )||\right| \right| _{16}<+\infty \), and \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (e_t- \epsilon _t)(\theta )||\right| \right| _{16/5}\longrightarrow 0\) as \(t\rightarrow \infty \), for all \(\alpha >0\).
Proof of Lemma A.7
The proof is similar to the proofs of Lemmas 3.3 and 3.4. \(\square \)
Denote by \(\Upsilon _t(i)\) the i-th element of \(\Upsilon _t.\)
Lemma A.8
Let \((\epsilon _t)\) be a sequence of centered and uncorrelated variables, with \({{\mathbb {E}}}\left| \epsilon _t\right| ^{8+4\nu }<\infty \) and \(\sum _{h=0}^\infty \left[ \alpha _\epsilon (h)\right] ^{\nu /(2+\nu )}<\infty \) for some \(\nu >0\). Then there exists a finite constant \(C_1\) such that for \(m_1, m_2=1,\dots ,(p+q)K\) and all \(s\in {\mathbb {Z}}\),
$$\begin{aligned} \sum _{h=-\infty }^{\infty }\left| \text{ Cov }\left\{ \Upsilon _{1}(m_1) \Upsilon _{1+s}(m_2), \Upsilon _{1+h}(m_1)\Upsilon _{1+s+h}(m_2)\right\} \right| <C_1. \end{aligned}$$
Proof
Recall that
$$\begin{aligned} \frac{\partial \epsilon _t(\theta _0)}{\partial \theta _l}= & {} \sum _{i=0}^\infty c_{i,l}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1}) \epsilon _{t-i},\text { for }l=1,\dots ,(p+q)K, \end{aligned}$$
(85)
where \(c_i(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})\) is defined by (9) and \(c_{i,l}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})=\partial c_i(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})/{\partial \theta _l}\), and with the following upper bound holding thanks to (13):
$$\begin{aligned} {\mathbb {E}}\sup _{\theta \in \Theta }(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i \text { and }{\mathbb {E}}\sup _{\theta \in \Theta }( c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i,\quad \forall i. \end{aligned}$$
Let
$$\begin{aligned}&\gamma _{i,j,i',j',s,h}(m_1,m_2)(\theta _0) \nonumber \\&\quad = {{\mathbb {E}}}\left[ c_{i,m_1}(\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}) \right. \nonumber \\&\quad \qquad \left. \times \,c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right] \nonumber \\&\qquad \times \,\text{ Cov }\left( \epsilon _{t}\epsilon _{t-i} \epsilon _{t+s}\epsilon _{t+s-j},\epsilon _{t+h}\epsilon _{t+h-i'} \epsilon _{t+s+h}\epsilon _{t+s+h-j'}\right) \nonumber \\&\qquad +\, \text{ Cov }\left( c_{i,m_1} (\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \nonumber \\&\qquad \quad \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right) \nonumber \\&\qquad \times \,{{\mathbb {E}}}\left[ \epsilon _{t}\epsilon _{t-i} \epsilon _{t+s}\epsilon _{t+s-j}\right] {{\mathbb {E}}}\left[ \epsilon _{t+h} \epsilon _{t+h-i'}\epsilon _{t+s+h}\epsilon _{t+s+h-j'}\right] . \end{aligned}$$
(86)
The Cauchy-Schwarz inequality implies that
$$\begin{aligned}&\left| {{\mathbb {E}}}[c_{i,m_1}(\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}) \right. \nonumber \\&\qquad \left. \times \,c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})]\right| \nonumber \\&\quad \le C\rho ^{i+j+i'+j'}. \end{aligned}$$
(87)
In view of (85) and (86), we have
$$\begin{aligned}&\sum _{h=-\infty }^{\infty }\text{ Cov }\left\{ \Upsilon _{1}(m_1)\Upsilon _{1+s}(m_2), \Upsilon _{1+h}(m_1)\Upsilon _{1+s+h}(m_2)\right\} \\&\quad = \sum _{h=-\infty }^{\infty }\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \gamma _{i,j,i',j',s,h}(m_1,m_2)(\theta _0). \end{aligned}$$
Without loss of generality, we can take the supremum over the integers \(s>0\), and consider the sum for positive h. Let \(m_0=m_1\wedge m_2\) and \(Y_{t,h_1} = \epsilon _{t}\epsilon _{t-h_1}-{\mathbb {E}}(\epsilon _{t}\epsilon _{t-h_1})\). We first suppose that \(h\ge 0\). It follows that
$$\begin{aligned}&\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( c_{i,m_1} (\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,m_2} (\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \\&\qquad \qquad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})\right) \right| \\&\quad \le v_1+v_2+v_3+v_4+v_5, \end{aligned}$$
where
$$\begin{aligned} v_1=v_1(h)= & {} \sum _{i>[h/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2} \right) \right| ,\\ v_2=v_2(h)= & {} \sum _{i=0}^\infty \sum _{j>[h/2]}\sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \\ v_3=v_3(h)= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'>[h/2]} \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| ,\\ v_4=v_4(h)= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'>[h/2]}\left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| , \\ v_5=v_5(h)= & {} \sum _{i=0}^{[h/2]}\sum _{j=0}^{[h/2]} \sum _{i'=0}^{[h/2]}\sum _{j'=0}^{[h/2]}\left| \text{ Cov } \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2} \right) \right| , \end{aligned}$$
where
$$\begin{aligned} {\mathbf {c}}^{t}_{i_1,m}=c_{i_1,m}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i_1+1}). \end{aligned}$$
One immediate remark is that \({\mathbf {c}}^{t}_{i_1,m}\) is measurable with respect to \(\Delta _r\), \(r\in \{ t,\ldots ,t-i_1+1\}\). Since
$$\begin{aligned}&\left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \le C\rho ^{i+i'+j+j'}, \end{aligned}$$
we have
$$\begin{aligned} v_1= & {} \sum _{i>[h/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \le \kappa _1\rho ^{h/2}, \end{aligned}$$
for some positive constant \(\kappa _1\). Using the same arguments we obtain that \(v_i\), \(i=2,3,4\), are bounded by \(\kappa _i\rho ^{h/2}\). The \(\alpha \)-mixing property (see Theorem 14.1 in Davidson 1994, p. 210) and Lemmas A.1 and A.7 entail that
$$\begin{aligned} v_5= & {} \sum _{i=0}^{[h/2]}\sum _{j=0}^{[h/2]}\sum _{i'=0}^{[h/2]} \sum _{j'=0}^{[h/2]}\left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \\\le & {} \sum _{k=1}^{4}\sum _{(i,j,i',j')\in {\mathcal {C}}_k}\kappa _6 \left\| {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\right\| _{2+\nu } \left\| {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right\| _{2+\nu } \\&\left\{ \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right\} ^{\nu /(2+\nu )}, \end{aligned}$$
where \(\alpha (U,V)\) denotes the strong mixing coefficient between the \(\sigma -\)field generated by the random variable U and that generated by V and where
$$\begin{aligned} {\mathcal {C}}_1= {\mathcal {C}}_1(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\ge j-s,\;j'\le i'+s\right\} ,\\ {\mathcal {C}}_2= {\mathcal {C}}_2(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\ge j-s,\;j'\ge i'+s\right\} ,\\ {\mathcal {C}}_3= {\mathcal {C}}_3(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\le j-s,\;j'\le i'+s\right\} ,\\ {\mathcal {C}}_4= {\mathcal {C}}_4(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\le j-s,\;j'\ge i'+s\right\} . \end{aligned}$$
We check easily that \({\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\) and \({\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\) are respectively measurable with respect to \(\Delta _r\), \(r\in \{t-i+1,\ldots ,t+s\}\) and \(\Delta _r\), \(r\in \{t-i'+h+1,\ldots ,t+h+s\}\) when \((i,j,i',j')\in {\mathcal {C}}_1\). We have \(t-i+1\le t+s-j+1\), \(t+h-i'+1\le t+h+s-j'+1\) and we thus deduce that
$$\begin{aligned} \left| \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right|\le & {} \alpha _{\Delta }\left( h-i'-s+1\right) ,\quad \forall h\ge i'+s-1, \\ \left| \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right|\le & {} \alpha _{\Delta }\left( -i-h-s+1\right) ,\quad \forall h\le -i-s+1, \\ \left| \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right|\le & {} \alpha _{\Delta }\left( 0\right) \le 1/4,\quad \forall h= -i-s+1,\dots ,i'+s-1. \end{aligned}$$
Note also that, by the Hölder inequality,
$$\begin{aligned} \left\| {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\right\| _{2+\nu } \le \left\| {\mathbf {c}}^{t}_{i,m_1}\right\| _{4+2\nu }\left\| {\mathbf {c}}^{t+s}_{j,m_2}\right\| _{4+2\nu }\le C\rho ^{i+j}. \end{aligned}$$
Therefore
$$\begin{aligned}&\sum _{h=0}^\infty \sum _{(i,j,i',j')\in {\mathcal {C}}_1}\left\| {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\right\| _{2+\nu } \left\| {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2} \right\| _{2+\nu } \left\{ \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right\} ^{\nu /(2+\nu )}, \\&\quad \le C^2\sum _{i,j,i',j'=0}^\infty \rho ^{i+j+i'+j'}\left( i'+2s-1+i+\sum _{r=0}^{\infty } \alpha _{\Delta }^{\nu /(2+\nu )}\left( r \right) \right) <\infty . \end{aligned}$$
Continuing in this way, we obtain that \(\sum _{h=0}^{\infty }v_5(h)<\infty \). It follows that
$$\begin{aligned}&\sum _{h=0}^{\infty }\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov } \left( c_{i,m_1}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1}) c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \nonumber \\&\qquad \qquad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right) \right| \nonumber \\&\quad \le \sum _{h=0}^{\infty } \sum _{i=1}^5 v_i(h)<\infty . \end{aligned}$$
(88)
The same bound clearly holds for
$$\begin{aligned}&\sum _{h=-\infty }^{0}\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov } \left( c_{i,m_1}(\theta _0,\Delta _{t-1},\dots ,\Delta _{t-i}) c_{j,m_2}(\theta _0,\Delta _{t+s-1},\dots ,\Delta _{t+s-j}), \right. \right. \\&\quad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right) \right| <\infty , \end{aligned}$$
which shows that
$$\begin{aligned}&\sum _{h=-\infty }^{\infty }\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov } \left( c_{i,m_1}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1}) c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \\&\quad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h}, \dots ,\Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})\right) \right| <\infty . \end{aligned}$$
A slight extension of Corollary A.3 in Francq and Zakoïan (2010) shows that
$$\begin{aligned} \sum _{h=-\infty }^{\infty }\left| \text{ Cov }\left( Y_{1,i}Y_{1+s,j}, Y_{1+h,i'}Y_{1+s+h,j'}\right) \right| <\infty . \end{aligned}$$
(89)
Since, by the Cauchy–Schwarz inequality,
$$\begin{aligned} \left| {{\mathbb {E}}}\left[ \epsilon _{t}\epsilon _{t-i} \epsilon _{t+s}\epsilon _{t+s-j}\right] \right| \le {{\mathbb {E}}}\left| \epsilon _t\right| ^4<\infty \end{aligned}$$
thanks to the assumption that \({{\mathbb {E}}}\left| \epsilon _t\right| ^{8+4\nu }<\infty \), it follows in view of (87) that
$$\begin{aligned}&\sum _{h=-\infty }^{\infty }\left| \text{ Cov }\left\{ \Upsilon _{1}(m_1)\Upsilon _{1+s}(m_2), \Upsilon _{1+h}(m_1)\Upsilon _{1+s+h}(m_2)\right\} \right| \\&\quad \le \kappa \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \rho ^{i+j+i'+j'}\sum _{h=-\infty }^{\infty } \left| \text{ Cov }\left( Y_{1,i}Y_{1+s,j}, Y_{1+h,i'}Y_{1+s+h,j'}\right) \right| \\&\qquad +\, \kappa '\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \sum _{h=-\infty }^{\infty } \\&\qquad \left| \text{ Cov }\left( c_{i,m_1}(\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \\&\qquad \quad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h}, \dots ,\Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})\right) \right| \end{aligned}$$
The conclusion follows from (88) and (89). \(\square \)
Let \({\hat{\Sigma }}_{\Upsilon }\) be the matrix obtained by replacing \({\hat{\Upsilon }}_t\) by \(\Upsilon _t\) in \({\hat{\Sigma }}_{{\hat{\Upsilon }}}\).
Lemma A.9
Under the assumptions of Theorem 3.10, \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert \), \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon }}- {\Sigma }_{{\Upsilon }}\Vert ,\) and \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon },{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert \) tend to zero in probability as \(n\rightarrow \infty \) when \(r=\mathrm {o}(n^{1/3})\).
Proof
For \(1\le m_1,m_2\le K(p+q)\) and \(1\le r_1,r_2\le r\), the element of the \(\left\{ (r_1-1)(p+q)K+m_1\right\} \)-th row and \(\left\{ (r_2-1)(p+q)K+m_2\right\} \)-th column of \({\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}} \) is of the form \(n^{-1}\sum _{t=1}^nZ_t\) where \(Z_t:=Z_{t,r_1,r_2}(m_1,m_2) = \Upsilon _{t-r_1}(m_1)\Upsilon _{t-r_2}(m_2).\) By stationarity of \(\left( Z_t\right) \), we have
$$\begin{aligned} \text{ Var }\left( \frac{1}{n}\sum _{t=1}^nZ_t\right) = \frac{1}{n^{2}}\sum _{h=-n+1}^{n-1}\left( n-|h|\right) \text{ Cov }\left( Z_t,Z_{t-h}\right) \le \frac{C_1}{n}, \end{aligned}$$
(90)
where, by Lemma A.8, \(C_1\) is a constant independent of \(r_1,r_2,m_1,m_2\), r and n. Now, using the Chebyshev inequality, we have
$$\begin{aligned} \forall \beta>0, \quad {\mathbb {P}}\left\{ \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert > \beta \right\} \le \frac{1}{\beta ^2}{\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert ^2\right\} . \end{aligned}$$
In view of (80) and (90) we have
$$\begin{aligned} {\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{\Upsilon }- {\Sigma }_{{\Upsilon }}\Vert ^2 \right\}\le & {} {\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{{\Upsilon },{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert ^2 \right\} \\\le & {} {\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert ^2\right\} \le r \sum _{m_1,m_2=1}^{K(p+q)r}\text{ Var }\left( \frac{1}{n}\sum _{t=1}^nZ_t\right) \\\le & {} \frac{C_1K^2(p+q)^2r^3}{n}=\mathrm {o}(1) \end{aligned}$$
as \(n\rightarrow \infty \) when \(r=\mathrm {o}(n^{1/3})\). Hence, when \(r=\mathrm {o}(n^{1/3})\)
$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert= & {} \mathrm {o}_{{\mathbb {P}}}(1),\\ \sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon }}- {\Sigma }_{{\Upsilon }}\Vert= & {} \mathrm {o}_{{\mathbb {P}}}(1)\text { and }\sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon },{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert =\mathrm {o}_{{\mathbb {P}}}(1). \end{aligned}$$
The proof is complete. \(\square \)
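The rate in Lemma A.9 can be illustrated by a small Monte Carlo experiment (purely illustrative: an iid Gaussian surrogate replaces \((\Upsilon _t)\), for which \({\Sigma }_{{\underline{\Upsilon }}_{r}}\) is the identity, and all sizes are arbitrary). With r fixed well below \(n^{1/3}\), the scaled error \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert \) shrinks as n grows:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 2   # plays the role of K(p+q)

def scaled_error(n, r):
    """sqrt(r) * ||hat Sigma_r - Sigma_r|| for an iid N(0, I_d) surrogate,
    for which the true Sigma_{underline Upsilon_r} is I_{d*r}."""
    U = rng.standard_normal((n + r, d))
    # stacked lagged vectors (Upsilon_{t-1}', ..., Upsilon_{t-r}')', t = 1..n
    Ur = np.hstack([U[r - i : r - i + n] for i in range(1, r + 1)])
    S_hat = Ur.T @ Ur / n
    return np.sqrt(r) * np.linalg.norm(S_hat - np.eye(d * r), 2)

# the scaled spectral-norm error decreases as n grows with r fixed
errs = [scaled_error(n, r=3) for n in (200, 2000, 20000)]
```

The observed decay is consistent with the \(\mathrm {O}(r^3/n)\) bound on \({\mathbb {E}}\{ r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert ^2\}\) obtained in the proof.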
We now show that the previous lemma applies when \(\Upsilon _t\) is replaced by \({\hat{\Upsilon }}_t\).
Lemma A.10
Under the assumptions of Theorem 3.10, \(\sqrt{r}\Vert {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert \), \(\sqrt{r}\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}}}- {\Sigma }_{\Upsilon }\Vert ,\) and \(\sqrt{r}\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert \) tend to zero in probability as \(n\rightarrow \infty \) when \(r=\mathrm {o}(n^{1/3})\).
Proof
We first show that the replacement of the unknown initial values \(\{X_u,\;u\le 0\}\) by zero is asymptotically unimportant. Let \({\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\) be the matrix obtained by replacing \(e_t({{\hat{\theta }}}_n)\) by \(\epsilon _t({{\hat{\theta }}}_n)\) in \({\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}\). We start by evaluating \({\mathbb {E}}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\). We first note that
$$\begin{aligned} {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}} =\left[ \frac{1}{n}\sum _{t=1}^na_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right] \end{aligned}$$
for \(i,i'=1,\dots ,r\) and \(m_1,m_2=1,\dots ,K(p+q)\) and where
$$\begin{aligned} a_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)= & {} e_{t-i}({{\hat{\theta }}}_n) e_{t-i'}({{\hat{\theta }}}_n)\frac{\partial e_{t-i}({{\hat{\theta }}}_n)}{\partial \theta _{m_1}}\frac{\partial e_{t-i'}({{\hat{\theta }}}_n)}{\partial \theta _{m_2}} \\&-\epsilon _{t-i}({{\hat{\theta }}}_n) \epsilon _{t-i'}({{\hat{\theta }}}_n)\frac{\partial \epsilon _{t-i}({{\hat{\theta }}}_n)}{\partial \theta _{m_1}}\frac{\partial \epsilon _{t-i'}({{\hat{\theta }}}_n)}{\partial \theta _{m_2}}. \end{aligned}$$
Using (80), we have
$$\begin{aligned} \Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\le & {} \sum _{i,i'=1}^r\sum _{m_1,m_2=1}^{K(p+q)} \left[ \frac{1}{n}\sum _{t=1}^na_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right] ^2. \end{aligned}$$
We thus deduce the following \(L^2\) estimate:
$$\begin{aligned} {\mathbb {E}}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\le & {} \sum _{i,i'=1}^r \sum _{m_1,m_2=1}^{K(p+q)}\left\| \frac{1}{n}\sum _{t=1}^na_{t-i,t-i',m_1,m_2} ({{\hat{\theta }}}_n)\right\| _2^2\\\le & {} \sum _{i,i'=1}^r\sum _{m_1,m_2=1}^{K(p+q)}\frac{1}{n}\sum _{t=1}^n\left\| a_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right\| _2^2, \end{aligned}$$
by Minkowski’s inequality. Thanks to Hölder’s inequality:
$$\begin{aligned}&\left\| a_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right\| _2 \\&\quad \le \sum _{j=1}^4 {{{\mathcal {A}}}}^j_{t-i,t-i',m_1,m_2},\text { with} \\&{{\mathcal {A}}}^1_{t-i,t-i',m_1,m_2} \\&\quad = \left\| \sup _{\theta \in \Theta } \left| e_{t-i}(\theta )-\epsilon _{t-i}(\theta )\right| \right\| _4 \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left| e_{t}(\theta )\right| \right\| _{12}\left( \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial e_{t}(\theta )}{\partial \theta }\right\| \right\| _{12}\right) ^2\\&{{\mathcal {A}}}^2_{t-i,t-i',m_1,m_2} \\&\quad =\left\| \sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| \right\| _{12}\left\| \sup _{\theta \in \Theta }\left| e_{t-i'}(\theta )-\epsilon _{t-i'} (\theta )\right| \right\| _4 \left( \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial e_{t}(\theta )}{\partial \theta }\right\| \right\| _{12}\right) ^2 \\&{{\mathcal {A}}}^3_{t-i,t-i',m_1,m_2} \\&\quad =\left( \left\| \sup _{\theta \in \Theta }\left| \epsilon _{t}(\theta )\right| \right\| _{16}\right) ^2 \left\| \sup _{\theta \in \Theta }\left\| \frac{\partial }{\partial \theta }\left( e_{t-i}(\theta )-\epsilon _{t-i}(\theta )\right) \right\| \right\| _{16/5} \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial e_{t}(\theta )}{\partial \theta }\right\| \right\| _{16} \\&{{\mathcal {A}}}^4_{t-i,t-i',m_1,m_2} \\&\quad =\left( \left\| \sup _{\theta \in \Theta }\left| \epsilon _{t}(\theta )\right| \right\| _{16}\right) ^2\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial \epsilon _{t}(\theta )}{\partial \theta }\right\| \right\| _{16} \left\| \sup _{\theta \in \Theta }\left\| \frac{\partial }{\partial \theta }\left( e_{t-i'}(\theta )-\epsilon _{t-i'}(\theta )\right) \right\| \right\| _{16/5}. \end{aligned}$$
We deal with \({{\mathcal {A}}}^1_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {A}}}^3_{t-i,t-i',m_1,m_2}\), as \({{\mathcal {A}}}^2_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {A}}}^4_{t-i,t-i',m_1,m_2}\) are dealt with similarly. In view of Lemma A.7, we have
$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n{{\mathcal {A}}}^1_{t-i,t-i',m_1,m_2}\le & {} \kappa _1 \frac{1}{n}\sum _{t=1}^n\left\| \sup _{\theta \in \Theta }\left| e_{t-i} (\theta )-\epsilon _{t-i}(\theta )\right| \right\| _4 \\\le & {} \frac{\kappa _1}{n}\left( \sum _{t=1}^{n-r}\left\| \sup _{\theta \in \Theta }\left| e_{t}(\theta )-\epsilon _{t}(\theta ) \right| \right\| _4+r\left\| \sup _{\theta \in \Theta } \left| \epsilon _{0}(\theta )\right| \right\| _4\right) \\= & {} \mathrm {O}\left( \frac{1}{n}+\frac{r}{n}\right) = \mathrm {O}\left( \frac{r}{n}\right) , \end{aligned}$$
where the bound is uniform in \(i\), \(i'\), \(m_1\) and \(m_2\). Similarly, we have
$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n{{\mathcal {A}}}^3_{t-i,t-i',m_1,m_2}\le & {} \kappa _3\frac{1}{n}\sum _{t=1}^n\left\| \sup _{\theta \in \Theta } \left\| \frac{\partial }{\partial \theta }\left( e_{t-i}(\theta )- \epsilon _{t-i}(\theta )\right) \right\| \right\| _{16/5} \\\le & {} \kappa _3\frac{1}{n}\left( \sum _{t=1}^{n-r}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial }{\partial \theta }\left( e_{t}(\theta )-\epsilon _{t}(\theta )\right) \right\| \right\| _{16/5}\right. \\&\left. +\,r\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial \epsilon _{0}(\theta )}{\partial \theta }\right\| \right\| _{16/5}\right) = \mathrm {O}\left( \frac{1}{n}+\frac{r}{n}\right) = \mathrm {O}\left( \frac{r}{n}\right) , \end{aligned}$$
because \(\sum _{t=1}^{\infty }\left\| \sup _{\theta \in \Theta } \left\| {\partial \left( e_{t}(\theta )-\epsilon _{t}(\theta )\right) }/{\partial \theta }\right\| \right\| _{16/5}<\infty \) and \(\left\| \sup _{\theta \in \Theta }\left\| {\partial \epsilon _{0}(\theta )}/{\partial \theta }\right\| \right\| _{16/5}<\infty \) (see Lemma A.7, Point 4). Gathering \(\mathcal{A}^1_{t-i,t-i',m_1,m_2}\), \({{\mathcal {A}}}^2_{t-i,t-i',m_1,m_2}\), \(\mathcal{A}^3_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {A}}}^4_{t-i,t-i',m_1,m_2}\), we arrive at
$$\begin{aligned} {\mathbb {E}}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\le & {} \sum _{i,i'=1}^r \sum _{m_1,m_2=1}^{K(p+q)}\left( \frac{1}{n}\sum _{t=1}^n\sum _{j=1}^4 {{\mathcal {A}}}^j_{t-i,t-i',m_1,m_2} \right) ^2 \\= & {} \mathrm {O}\left( r^2\left\{ \frac{r}{n}\right\} ^2\right) = \mathrm {O}\left( \frac{r^4}{n^2}\right) . \end{aligned}$$
We thus deduce that
$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert =\mathrm {o}_{{\mathbb {P}}}(1),\text { when }r=r(n)=\mathrm {o}\left( n^{2/5}\right) . \end{aligned}$$
(91)
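The passage from the second-moment bound above to (91) is a routine Markov-inequality step, spelled out here for completeness:

```latex
% For any \varepsilon > 0, Markov's inequality applied to the squared norm gives
\mathbb{P}\left( \sqrt{r}\, \Vert \hat{\Sigma}_{\hat{\underline{\Upsilon}}_r}
  - \hat{\Sigma}_{\underline{\Upsilon}_{r,n}} \Vert > \varepsilon \right)
\le \frac{r}{\varepsilon^2}\,
  \mathbb{E} \Vert \hat{\Sigma}_{\hat{\underline{\Upsilon}}_r}
  - \hat{\Sigma}_{\underline{\Upsilon}_{r,n}} \Vert^2
= \mathrm{O}\!\left( \frac{r^5}{n^2} \right),
% which tends to 0 exactly when r^5/n^2 -> 0, i.e. when r = o(n^{2/5}).
```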
We now prove that
$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}- {\hat{\Sigma }}_{{{{\underline{\Upsilon }}}_r}}\Vert =\mathrm {o}_{{\mathbb {P}}}(1),\text { when }r=r(n)=\mathrm {o}\left( n^{1/3}\right) . \end{aligned}$$
Taylor expansions around \(\theta _0\) yield
$$\begin{aligned} \left| \epsilon _t({{\hat{\theta }}}_n)-{\epsilon }_t(\theta _0)\right| \le r_t\left\| {{\hat{\theta }}}_n-\theta _0\right\| ,\quad \left| \frac{\partial \epsilon _t({{\hat{\theta }}}_n)}{\partial \theta _m}- \frac{\partial \epsilon _t(\theta _0)}{\partial \theta _m}\right| \le s_t(m)\left\| {{\hat{\theta }}}_n-\theta _0\right\| \end{aligned}$$
(92)
with \(r_t=\sup _{\theta \in \Theta }\left\| {\partial {\epsilon }_t({\theta })}/{\partial \theta }\right\| \), \(s_{t}(m)= \sup _{\theta \in \Theta }\left\| {\partial ^2{\epsilon }_t({\theta })}/{\partial \theta \partial \theta _m}\right\| \) for \(m\in \{m_1,m_2\}\). Define \(Z_t\) as in the proof of Lemma A.9, and let \(Z_{t,n}\) be obtained by replacing \(\Upsilon _t(m)\) by \(\Upsilon _{t,n}(m)=\epsilon _t({{\hat{\theta }}}_n)\partial \epsilon _t({{\hat{\theta }}}_n)/\partial \theta _m\) in \(Z_t\). Using (92), for \(i,i'=1,\dots ,r\) and \(m_1,m_2=1,\dots ,K(p+q)\), we have
$$\begin{aligned}&\left| \epsilon _{t-i}({{\hat{\theta }}}_n) \epsilon _{t-i'}({{\hat{\theta }}}_n)\frac{\partial \epsilon _{t-i}({{\hat{\theta }}}_n)}{\partial \theta _{m_1}}\frac{\partial \epsilon _{t-i'}({{\hat{\theta }}}_n)}{\partial \theta _{m_2}} - \epsilon _{t-i}(\theta _0) \epsilon _{t-i'}(\theta _0)\frac{\partial \epsilon _{t-i}(\theta _0)}{\partial \theta _{m_1}}\frac{\partial \epsilon _{t-i'}(\theta _0)}{\partial \theta _{m_2}}\right| \nonumber \\&\quad \le \sum _{j=1}^4 {{\mathcal {B}}}^j_{t-i,t-i',m_1,m_2}, \end{aligned}$$
(93)
with
$$\begin{aligned} {{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}= & {} r_{t-i}\left\| {{\hat{\theta }}}_n- \theta _0\right\| \sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i} (\theta )}{\partial \theta _{m_1}}\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'} (\theta )}{\partial \theta _{m_2}}\right| \\ {{\mathcal {B}}}^2_{t-i,t-i',m_1,m_2}= & {} r_{t-i'}\left\| {{\hat{\theta }}}_n -\theta _0\right\| \sup _{\theta \in \Theta }\left| \epsilon _{t-i}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i} (\theta )}{\partial \theta _{m_1}}\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'} (\theta )}{\partial \theta _{m_2}}\right| \\ {{\mathcal {B}}}^3_{t-i,t-i',m_1,m_2}= & {} s_{t-i}(m_1)\left\| {{\hat{\theta }}}_n -\theta _0\right\| \sup _{\theta \in \Theta }\left| \epsilon _{t-i}(\theta )\right| \sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'}(\theta )}{\partial \theta _{m_2}}\right| \\ {{\mathcal {B}}}^4_{t-i,t-i',m_1,m_2}= & {} s_{t-i'}(m_2) \left\| {{\hat{\theta }}}_n-\theta _0\right\| \sup _{\theta \in \Theta } \left| \epsilon _{t-i}(\theta )\right| \sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i}(\theta )}{\partial \theta _{m_1}}\right| . \end{aligned}$$
We detail the bound for \({{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}\); the terms \({{\mathcal {B}}}^2_{t-i,t-i',m_1,m_2}\), \({{\mathcal {B}}}^3_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {B}}}^4_{t-i,t-i',m_1,m_2}\) are dealt with similarly. We note first that, for all \(i=1,\dots ,r\),
$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \epsilon _{t-i} (\theta )\right| ^4= & {} \frac{1}{n}\sum _{t=1-i}^{n-i}\sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4 =\frac{1}{n}\sum _{t=1-i}^{0} \sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4 +\frac{1}{n}\sum _{t=1}^{n-i}\sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4 \nonumber \\\le & {} \frac{r}{n}\frac{1}{r}\sum _{t=1-r}^{0}\sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4+\frac{1}{n}\sum _{t=1}^{n} \sup _{\theta \in \Theta }\left| \epsilon _{t}(\theta )\right| ^4 \nonumber \\= & {} \left( \frac{r}{n}+1\right) \left( \left\| \sup _{\theta \in \Theta } \left| \epsilon _{0}(\theta )\right| \right\| ^4_4+\mathrm {o}_{a.s.}(1) \right) , \end{aligned}$$
(94)
by the ergodic theorem. Similarly to (94), we have
$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i}(\theta )}{\partial \theta _{m}}\right| ^4 \le \left( \frac{r}{n}+1\right) \left( \left\| \sup _{\theta \in \Theta } \left| \frac{\partial \epsilon _{0}(\theta )}{\partial \theta _{m}}\right| \right\| ^4_4+\mathrm {o}_{a.s.}(1)\right) . \end{aligned}$$
(95)
By the Cauchy-Schwarz inequality and using (94) and (95), we have
$$\begin{aligned} \sum _{i,i'=1}^r\sum _{m_1,m_2=1}^{K(p+q)}\frac{1}{n}\sum _{t=1}^n {{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}\le & {} r^2\left\| {{\hat{\theta }}}_n-\theta _0\right\| \left( \frac{r}{n}+1\right) ^3\left( \kappa _1+\mathrm {o}_{a.s.}(1)\right) \\= & {} r^2\left\| {{\hat{\theta }}}_n-\theta _0\right\| \mathrm {O}(1)\left( \kappa _1+ \mathrm {o}_{a.s.}(1)\right) , \end{aligned}$$
when \(r=\mathrm {o}\left( n^{1/3}\right) \) and for some constant \(\kappa _1>0\). Similar inequalities hold for \(\mathcal{B}^j_{t-i,t-i',m_1,m_2}\), for \(j=2, 3, 4\). We thus deduce from (80) and (93) that
$$\begin{aligned} r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}- {\hat{\Sigma }}_{{{{\underline{\Upsilon }}}_r}}\Vert ^2\le & {} r^3\left\| {{\hat{\theta }}}_n-\theta _0\right\| ^2\mathrm {O}_{{\mathbb {P}}}(1). \end{aligned}$$
(96)
Since \(\sqrt{n}\left( {{\hat{\theta }}}_n-\theta _0\right) \) converges in distribution, a tightness argument yields \(\left\| {{\hat{\theta }}}_n- \theta _0\right\| =\mathrm {O}_{{\mathbb {P}}}\left( n^{-1/2}\right) \) and hence from (96), we obtain for \(r=\mathrm {o}(n^{1/3})\)
$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_r}\Vert =\mathrm {o}_{{\mathbb {P}}}(1). \end{aligned}$$
(97)
Combining Lemma A.9 with (91) and (97) shows that \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\Sigma }_{{\underline{\Upsilon }}_r}\Vert =\mathrm {o}_{{\mathbb {P}}}(1)\). The other results are obtained similarly. \(\square \)
Write \(\underline{{\varvec{\Phi }}}_{r}^*=\left( \Phi _{1}\cdots \Phi _{r}\right) \) where the \(\Phi _{i}\)’s are defined by (21).
Lemma A.11
Under the assumptions of Theorem 3.10,
$$\begin{aligned} \sqrt{r}\left\| \underline{{\varvec{\Phi }}}_{r}^*- \underline{{\varvec{\Phi }}}_{r}\right\| \rightarrow 0, \end{aligned}$$
as \(r\rightarrow \infty \).
Proof
Recall that by (21) and (78)
$$\begin{aligned} \Upsilon _t= & {} \underline{{\varvec{\Phi }}}_{r}{\underline{\Upsilon }}_{r,t}+u_{r,t} =\underline{{\varvec{\Phi }}}_{r}^*{\underline{\Upsilon }}_{r,t}+\sum _{i=r+1}^\infty \Phi _i{\Upsilon }_{t-i}+u_t :=\underline{{\varvec{\Phi }}}_{r}^*{\underline{\Upsilon }}_{r,t}+u_{r,t}^*. \end{aligned}$$
Hence, using the orthogonality conditions in (21) and (78)
$$\begin{aligned} \underline{{\varvec{\Phi }}}_{r}^*-\underline{{\varvec{\Phi }}}_{r}= & {} -{\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1} \end{aligned}$$
(98)
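For completeness, (98) follows in one line from the two decompositions of \(\Upsilon _t\) and the orthogonality \({\mathbb {E}}u_{r,t}{\underline{\Upsilon }}_{r,t}'=0\):

```latex
% Subtracting the two representations of \Upsilon_t,
( \underline{\boldsymbol{\Phi}}_r^* - \underline{\boldsymbol{\Phi}}_r )\,
  \underline{\Upsilon}_{r,t} = u_{r,t} - u_{r,t}^* ,
% then right-multiplying by \underline{\Upsilon}_{r,t}' and taking expectations,
( \underline{\boldsymbol{\Phi}}_r^* - \underline{\boldsymbol{\Phi}}_r )\,
  \Sigma_{\underline{\Upsilon}_r}
  = - \mathbb{E}\, u_{r,t}^*\, \underline{\Upsilon}_{r,t}' ,
% so that right-multiplication by \Sigma_{\underline{\Upsilon}_r}^{-1} yields (98).
```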
where \({\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}}= {\mathbb {E}}u_{r,t}^*{\underline{\Upsilon }}_{r,t}'\). Using the arguments and notation of the proof of Lemma A.8, there exists a constant \(C_2\), independent of \(s\), \(m_1\) and \(m_2\), such that
$$\begin{aligned} {\mathbb {E}}\left| {\Upsilon }_{1}(m_1){\Upsilon }_{1+s}(m_2)\right| \le C_1\sum _{h_1,h_2=0}^{\infty }\rho ^{h_1+h_2}\Vert \epsilon _1\Vert _{4}^4\le C_2. \end{aligned}$$
By the Cauchy-Schwarz inequality and (80), we then have
$$\begin{aligned} \left\| \text{ Cov }\left( {\Upsilon }_{t-r-h},{\underline{\Upsilon }}_{r,t}\right) \right\| \le C_2r^{1/2}K(p+q). \end{aligned}$$
Thus,
$$\begin{aligned} \Vert {\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}}\Vert= & {} \Vert \sum _{i=r+1}^\infty \Phi _i{\mathbb {E}}{\Upsilon }_{t-i}{\underline{\Upsilon }}_{r,t}'\Vert \le \sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert \left\| \text{ Cov }\left( {\Upsilon }_{t-r-h},{\underline{\Upsilon }}_{r,t}\right) \right\| \nonumber \\= & {} \mathrm O(1)r^{1/2}\sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert . \end{aligned}$$
(99)
Note that the assumption \(\Vert \Phi _i\Vert =\mathrm o\left( i^{-2}\right) \) entails \( r\sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert =\mathrm o(1)\) as \(r\rightarrow \infty \). The lemma therefore follows from (98), (99) and Lemma A.6. \(\square \)
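The implication invoked here is elementary: under \(\Vert \Phi _i\Vert =\mathrm o(i^{-2})\), for every \(\varepsilon >0\) there is an \(R\) with \(\Vert \Phi _i\Vert \le \varepsilon i^{-2}\) for all \(i\ge R\), and then for every \(r\ge R\),

```latex
r \sum_{h=1}^{\infty} \Vert \Phi_{r+h} \Vert
  \le \varepsilon\, r \sum_{h=1}^{\infty} (r+h)^{-2}
  \le \varepsilon\, r \int_{r}^{\infty} x^{-2}\, dx
  = \varepsilon .
% Combined with (99), this gives
% \sqrt{r}\,\Vert \Sigma_{u_r^*, \underline{\Upsilon}_r} \Vert
%   = O(1)\, r \sum_{h \ge 1} \Vert \Phi_{r+h} \Vert = o(1)
% as r tends to infinity.
```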
The following lemma is similar to Lemma 3 in Berk (1974).
Lemma A.12
Under the assumptions of Theorem 3.10,
$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\Vert= & {} \mathrm o_{\mathbb {P}}(1) \end{aligned}$$
as \(n\rightarrow \infty \) when \(r=\mathrm o(n^{1/3})\) and \(r\rightarrow \infty \).
Proof
We have
$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\|= & {} \left\| \left\{ {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}+ {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\} \left\{ {\Sigma }_{{\underline{\Upsilon }}_{r}}- {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}\right\} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \\\le & {} \left( \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| + \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \right) \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| . \end{aligned}$$
Iterating this inequality, we obtain
$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\|\le & {} \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \sum _{i=1}^{\infty }\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| ^i \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^i. \end{aligned}$$
Thus, for every \(\varepsilon >0\),
$$\begin{aligned}&\mathbb P\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\|>\varepsilon \right) \\&\quad \le \mathbb P\left( \sqrt{r}\frac{\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1} \right\| ^2 \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| }{1- \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| }>\varepsilon \text{ and } \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| <1\right) \\&\qquad +\,\mathbb P\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \ge 1\right) \\&\quad \le {\mathbb {P}}\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| >\frac{\varepsilon }{ \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^2+\varepsilon r^{-1/2}\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| } \right) \\&\qquad +\,{\mathbb {P}}\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \ge \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^{-1}\right) = \mathrm o(1) \end{aligned}$$
by Lemmas A.9 and A.6. This establishes Lemma A.12. \(\square \)
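As a numerical sanity check (illustrative only, not part of the proof), iterating the inequality above yields the bound \(\Vert {\hat{A}}^{-1}-A^{-1}\Vert \le \Vert A^{-1}\Vert ^{2}\Vert {\hat{A}}-A\Vert /(1-\Vert {\hat{A}}-A\Vert \,\Vert A^{-1}\Vert )\) whenever \(\Vert {\hat{A}}-A\Vert \,\Vert A^{-1}\Vert <1\), valid for any submultiplicative matrix norm. The sketch below verifies it on an arbitrary \(2\times 2\) example with the Frobenius norm; the matrices `A` and `E` are hypothetical choices.

```python
import math

def frob(M):
    # Frobenius norm (submultiplicative, so the bound applies)
    return math.sqrt(sum(x * x for row in M for x in row))

def inv2(M):
    # explicit inverse of a 2x2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sub(M, N):
    return [[M[i][j] - N[i][j] for j in range(2)] for i in range(2)]

A = [[2.0, 0.3], [0.1, 1.5]]          # "true" matrix (hypothetical)
E = [[0.05, -0.02], [0.01, 0.03]]     # small perturbation
Ahat = [[A[i][j] + E[i][j] for j in range(2)] for i in range(2)]

lhs = frob(sub(inv2(Ahat), inv2(A)))  # ||Ahat^{-1} - A^{-1}||
na, ne = frob(inv2(A)), frob(E)
assert ne * na < 1                    # condition: geometric series converges
rhs = na ** 2 * ne / (1 - ne * na)    # Berk-type perturbation bound
assert lhs <= rhs
```

The same inequality, with \(A={\Sigma }_{{\underline{\Upsilon }}_{r}}\) and \({\hat{A}}={\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}\), is what drives the probability estimates above.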
Lemma A.13
Under the assumptions of Theorem 3.10,
$$\begin{aligned} \sqrt{r}\left\| \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{\varvec{\Phi }}}_{r}\right\| =\mathrm o_{\mathbb {P}}(1) \end{aligned}$$
as \(r\rightarrow \infty \) and \(r=\mathrm o(n^{1/3})\).
Proof
By the triangle inequality and Lemmas A.6 and A.12, we have
$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}\right\| \le \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| + \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| =\mathrm O_\mathbb P(1). \end{aligned}$$
(100)
Note that the orthogonality conditions in (78) entail that \(\underline{{\varvec{\Phi }}}_{r}={\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\). By Lemmas A.6, A.9, A.12, and (100), we then have
$$\begin{aligned} \sqrt{r}\left\| \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{\varvec{\Phi }}}_{r}\right\|= & {} \sqrt{r}\left\| {\hat{\Sigma }}_{{\hat{\Upsilon }}, \underline{{\hat{\Upsilon }}}_{r}} {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1} -{\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \\= & {} \sqrt{r}\left\| \left( {\hat{\Sigma }}_{{\hat{\Upsilon }}, \underline{{\hat{\Upsilon }}}_{r}} -{\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\right) {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1} +{\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}} \left( {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1} -{\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right) \right\| =\mathrm o_{\mathbb {P}}(1). \end{aligned}$$
\(\square \)
Proof of Theorem 3.10
In view of (20), it suffices to show that \(\underline{\hat{{\varvec{\Phi }}}}_r(1)\rightarrow \underline{{{\varvec{\Phi }}}}(1)\) and \({\hat{\Sigma }}_{u_r}\rightarrow {\Sigma }_{u}\) in probability. Let \(\mathbf{1}_r=(1,\dots ,1)'\) denote the \(r\times 1\) vector of ones and \(\mathbf{E}_r={\mathbb {I}}_{(p+q)K}\otimes \mathbf{1}_r\) the \(r(p+q)K\times (p+q)K\) matrix, where \(\otimes \) denotes the matrix Kronecker product and \({\mathbb {I}}_d\) the \(d\times d\) identity matrix. Using (80), and Lemmas A.11, A.13, we obtain
$$\begin{aligned} \left\| \underline{\hat{{\varvec{\Phi }}}}_r(1)- \underline{{{\varvec{\Phi }}}}(1)\right\|\le & {} \left\| \sum _{i=1}^r\left( \hat{ \Phi }_{r,i}-\Phi _{r,i}\right) \right\| +\left\| \sum _{i=1}^r\left( {\Phi }_{r,i}-\Phi _{i}\right) \right\| +\left\| \sum _{i=r+1}^{\infty }\Phi _{i}\right\| \\= & {} \left\| \left( \underline{\hat{{\varvec{\Phi }}}}_{r}-\underline{{{\varvec{\Phi }}}}_{r} \right) \mathbf{E}_r\right\| +\left\| \left( \underline{{{\varvec{\Phi }}}}_{r}^*- \underline{{{\varvec{\Phi }}}}_{r}\right) \mathbf{E}_r\right\| +\left\| \sum _{i=r+1}^{\infty }\Phi _{i}\right\| \\\le & {} \sqrt{(p+q)K}\sqrt{r}\left\{ \left\| \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}\right\| +\left\| \underline{{{\varvec{\Phi }}}}_{r}^*-\underline{{{\varvec{\Phi }}}}_{r} \right\| \right\} +\left\| \sum _{i=r+1}^{\infty }\Phi _{i}\right\| \\= & {} \mathrm o_{\mathbb {P}}(1). \end{aligned}$$
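The factor \(\sqrt{(p+q)K}\sqrt{r}\) above is the norm of \(\mathbf{E}_r\); assuming \(\Vert \cdot \Vert \) denotes the Frobenius norm (which is submultiplicative), it follows from a direct computation with the mixed-product property of \(\otimes \):

```latex
\mathbf{E}_r' \mathbf{E}_r
  = ( \mathbb{I}_{(p+q)K} \otimes \mathbf{1}_r )'
    ( \mathbb{I}_{(p+q)K} \otimes \mathbf{1}_r )
  = \mathbb{I}_{(p+q)K} \otimes ( \mathbf{1}_r' \mathbf{1}_r )
  = r\, \mathbb{I}_{(p+q)K},
% hence
\Vert \mathbf{E}_r \Vert
  = \sqrt{ \operatorname{Tr}( \mathbf{E}_r' \mathbf{E}_r ) }
  = \sqrt{ r (p+q) K },
\qquad
\Vert M \mathbf{E}_r \Vert \le \Vert M \Vert\, \Vert \mathbf{E}_r \Vert
  = \sqrt{(p+q)K}\, \sqrt{r}\, \Vert M \Vert .
```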
Now note that
$$\begin{aligned} {\hat{\Sigma }}_{u_r}={\hat{\Sigma }}_{{{\hat{\Upsilon }}}} -\underline{\hat{{\varvec{\Phi }}}}_{r} {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}' \end{aligned}$$
and, by (21)
$$\begin{aligned} {\Sigma }_{u}= & {} \mathbb Eu_tu_t'=\mathbb Eu_t\Upsilon _t'={\mathbb {E}} \left\{ \left( \Upsilon _t-\sum _{i=1}^{\infty }\Phi _i\Upsilon _{t-i}\right) \Upsilon _t'\right\} \\= & {} {\Sigma }_{{{\Upsilon }}}-\sum _{i=1}^{\infty }\Phi _i\mathbb E{\Upsilon }_{t-i}{\Upsilon }_{t}' ={\Sigma }_{\Upsilon }- \underline{{{\varvec{\Phi }}}}_{r}^*{\Sigma }_{{{\Upsilon }}, {{\underline{\Upsilon }}_{r}}}' -\sum _{i=r+1}^{\infty }\Phi _i{\mathbb {E}} {\Upsilon }_{t-i}{\Upsilon }_{t}'. \end{aligned}$$
Thus,
$$\begin{aligned} \left\| {\hat{\Sigma }}_{u_r}-{\Sigma }_{u}\right\|= & {} \left\| {\hat{\Sigma }}_{{\hat{\Upsilon }}} -{\Sigma }_{\Upsilon }- \left( \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\right) {\hat{\Sigma }}_{{{\hat{\Upsilon }}}, \underline{{\hat{\Upsilon }}}_r}'\right. \nonumber \\&\left. - \underline{{{\varvec{\Phi }}}}_{r}^* \left( {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }},{{\underline{\Upsilon }}_{r}}}'\right) +\sum _{i=r+1}^{\infty }\Phi _i{\mathbb {E}}{\Upsilon }_{t-i} {\Upsilon }_{t}'\right\| \nonumber \\\le & {} \left\| {\hat{\Sigma }}_{{\hat{\Upsilon }}} -{\Sigma }_{\Upsilon }\right\| + \left\| \left( \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\right) \left( {\hat{\Sigma }}_{{{\hat{\Upsilon }}}, \underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }}, \underline{{\Upsilon }}_r}'\right) \right\| \nonumber \\&+\left\| \left( \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\right) {\Sigma }_{{{\Upsilon }}, \underline{{\Upsilon }}_r}'\right\| +\left\| \underline{{{\varvec{\Phi }}}}_{r}^* \left( {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }},{{\underline{\Upsilon }}_{r}}}'\right) \right\| \nonumber \\&+\left\| \sum _{i=r+1}^{\infty }\Phi _i\mathbb E{\Upsilon }_{t-i}{\Upsilon }_{t}'\right\| . \end{aligned}$$
(101)
On the right-hand side of this inequality, the first norm is \(\mathrm o_{\mathbb {P}}(1)\) by Lemma A.9. By Lemmas A.11 and A.13, we have \(\Vert \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\Vert =\mathrm o_\mathbb P(r^{-1/2})=\mathrm o_{\mathbb {P}}(1)\), and by Lemma A.9, \(\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}}, \underline{{\hat{\Upsilon }}}_r}'-{\Sigma }_{{{\Upsilon }}, \underline{{\Upsilon }}_r}'\Vert =\mathrm o_{\mathbb {P}}(r^{-1/2})=\mathrm o_{\mathbb {P}}(1)\). Therefore the second norm on the right-hand side of (101) tends to zero in probability. The third norm tends to zero in probability because \(\Vert \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\Vert =\mathrm o_{\mathbb {P}}(1)\) and, by Lemma A.6, \(\Vert {\Sigma }_{{{\Upsilon }},\underline{{\Upsilon }}_r}'\Vert =\mathrm O(1)\). The fourth norm tends to zero in probability because, in view of Lemma A.9, \(\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }},{{\underline{\Upsilon }}_{r}}}'\Vert =\mathrm o_{\mathbb {P}}(1)\), and, in view of (80), \(\Vert \underline{{{\varvec{\Phi }}}}_{r}^*\Vert ^2\le \sum _{i=1}^\infty \text{ Tr }(\Phi _i\Phi _i')<\infty \). Clearly, the last norm tends to zero, which completes the proof. \(\square \)
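To make the AR-sieve mechanism behind Theorem 3.10 concrete, here is a minimal scalar sketch (an illustration under assumed quantities, not the paper's estimator): an AR(1) process is simulated, an autoregression is fitted by least squares, and the long-run variance \(\sigma ^2/(1-\phi )^2\) is estimated from the fitted coefficient and the residual variance, mimicking the respective roles of \(\underline{\hat{{\varvec{\Phi }}}}_r(1)\) and \({\hat{\Sigma }}_{u_r}\). The model, the coefficient \(\phi =0.5\), the seed and the sample size are all hypothetical choices.

```python
import random

random.seed(12345)
n, phi = 20000, 0.5                     # hypothetical AR(1) coefficient
x = [0.0]
for _ in range(n):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))
x = x[1:]

# Least-squares fit of the autoregression (order r = 1 here, for simplicity)
num = sum(x[t] * x[t - 1] for t in range(1, n))
den = sum(x[t - 1] ** 2 for t in range(1, n))
phi_hat = num / den
resid = [x[t] - phi_hat * x[t - 1] for t in range(1, n)]
sigma2_hat = sum(e * e for e in resid) / (n - 1)

# Long-run variance estimate: residual variance scaled by (1 - phi_hat)^{-2},
# the scalar analogue of combining Phi_r(1) and Sigma_{u_r}
lrv_hat = sigma2_hat / (1.0 - phi_hat) ** 2
lrv_true = 1.0 / (1.0 - phi) ** 2       # equals 4 for phi = 0.5, sigma^2 = 1
assert abs(phi_hat - phi) < 0.05
assert abs(lrv_hat - lrv_true) < 1.0
```

In the theorem the order \(r=r(n)\) grows with \(n\) (with \(r=\mathrm o(n^{1/3})\)) so that the fitted autoregression absorbs the full dependence structure of \((\Upsilon _t)\); the fixed order used here is only for readability.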