Estimation of weak ARMA models with regime changes

Statistical Inference for Stochastic Processes

Abstract

In this paper we derive the asymptotic properties of the least squares estimator (LSE) of autoregressive moving-average (ARMA) models with regime changes, under the assumption that the errors are uncorrelated but not necessarily independent. Relaxing the independence assumption considerably extends the range of application of the class of ARMA models with regime changes. Conditions are given for the consistency and asymptotic normality of the LSE. Particular attention is given to the estimation of the asymptotic covariance matrix, which may be very different from that obtained in the standard framework. The theoretical results are illustrated by means of Monte Carlo experiments.

References

  • Amendola A, Francq C (2009) Concepts of and tools for nonlinear time-series modelling, chapter 10. Wiley, Hoboken, pp 377–427

  • Anderson PL, Meerschaert MM (1997) Periodic moving averages of random variables with regularly varying tails. Ann Statist 25(2):771–785

  • Andrews DWK (1991) Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59(3):817–858

  • Andrews B, Davis RA, Breidt FJ (2006) Maximum likelihood estimation for all-pass time series models. J Multivariate Anal 97(7):1638–1659

  • Azrak R, Mélard G (1998) The exact quasi-likelihood of time-dependent ARMA models. J Statist Plann Inference 68(1):31–45

  • Azrak R, Mélard G (2006) Asymptotic properties of quasi-maximum likelihood estimators for ARMA models with time-dependent coefficients. Stat Inference Stoch Process 9(3):279–330

  • Basawa IV, Lund R (2001) Large sample properties of parameter estimates for periodic ARMA models. J Time Ser Anal 22(6):651–663

  • Berk KN (1974) Consistent autoregressive spectral estimates. Ann Statist 2:489–502 (collection of articles dedicated to Jerzy Neyman on his 80th birthday)

  • Bibi A, Francq C (2003) Consistent and asymptotically normal estimators for cyclically time-dependent linear models. Ann Inst Statist Math 55(1):41–68

  • Billio M, Monfort A, Robert CP (1999) Bayesian estimation of switching ARMA models. J Econom 93(2):229–255

  • Bollerslev T (1986) Generalized autoregressive conditional heteroskedasticity. J Econom 31(3):307–327

  • Boubacar Mainassara Y (2011) Multivariate portmanteau test for structural VARMA models with uncorrelated but non-independent error terms. J Statist Plann Inference 141(8):2961–2975

  • Boubacar Maïnassara Y (2012) Selection of weak VARMA models by modified Akaike’s information criteria. J Time Series Anal 33(1):121–130

  • Boubacar Mainassara Y, Carbon M, Francq C (2012) Computing and estimating information matrices of weak ARMA models. Comput Statist Data Anal 56(2):345–361

  • Boubacar Maïnassara Y, Kokonendji CC (2016) Modified Schwarz and Hannan-Quinn information criteria for weak VARMA models. Stat Inference Stoch Process 19(2):199–217

  • Boubacar Maïnassara Y, Saussereau B (2018) Diagnostic checking in multivariate ARMA models with dependent errors using normalized residual autocorrelations. J Am. Statist Assoc 113(524):1813–1827

  • Brandt A (1986) The stochastic equation \(Y_{n+1}=A_nY_n+B_n\) with stationary coefficients. Adv Appl Probab 18(1):211–220

  • Brockwell PJ, Davis RA (1991) Time series: theory and methods. Springer series in statistics, 2nd edn. Springer, New York

  • Dahlhaus R (1997) Fitting time series models to nonstationary processes. Ann Statist 25(1):1–37

  • Davidson J (1994) Stochastic limit theory. Advanced texts in econometrics. An introduction for econometricians. The Clarendon Press, Oxford University Press, New York

  • Davydov JA (1968) Convergence of distributions generated by stationary stochastic processes. Theor Probab Appl 13(2):691–696

  • den Haan WJ, Levin AT (1997) A practitioner’s guide to robust covariance matrix estimation. In: Robust inference, volume 15 of Handbook of statist. North-Holland, Amsterdam, pp 299–342

  • Engle RF (1982) Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50(4):987–1007

  • Francq C, Gautier A (2003) Estimation of time-varying ARMA models and applications to series subject to Markovian changes in regime. http://christian.francq140.free.fr/Christian-Francq/statistics-econometrics-papers/longversion.ps. Accessed 11 July 2019

  • Francq C, Gautier A (2004a) Estimation of time-varying ARMA models with Markovian changes in regime. Statist Probab Lett 70(4):243–251

  • Francq C, Gautier A (2004b) Large sample properties of parameter least squares estimates for time-varying ARMA models. J Time Ser Anal 25(5):765–783

  • Francq C, Roussignol M (1997) On white noises driven by hidden Markov chains. J Time Ser Anal 18(6):553–578

  • Francq C, Roussignol M (1998) Ergodicity of autoregressive processes with Markov-switching and consistency of the maximum-likelihood estimator. Statistics 32(2):151–173

  • Francq C, Zakoïan J-M (1998) Estimating linear representations of nonlinear processes. J Statist Plann Inference 68(1):145–165

  • Francq C, Zakoïan J-M (2001) Stationarity of multivariate Markov-switching ARMA models. J Econom 102(2):339–364

  • Francq C, Zakoïan J-M (2002) Autocovariance structure of powers of switching-regime ARMA processes. ESAIM Probab Statist 6:259–270 (New directions in time series analysis, Luminy, 2001)

  • Francq C, Zakoïan J-M (2005) Recent results for linear time series models with non independent innovations. In: Statistical modeling and analysis for complex data problems, volume 1 of GERAD 25th Anniv. Ser. Springer, New York, pp 241–265

  • Francq C, Zakoïan J-M (2007) HAC estimation and strong linearity testing in weak ARMA models. J Multivariate Anal 98(1):114–144

  • Francq C, Zakoïan J-M (2010) GARCH models. Structure, statistical inference and financial applications. Wiley, Chichester

  • Gautier A (2004) Modèles de séries temporelles à coefficients dépendant du temps. Doctoral thesis. University of Lille 3

  • Grenander U, Szegö G (1958) Toeplitz forms and their applications. California monographs in mathematical sciences. University of California Press, Berkeley, Los Angeles

  • Hamilton JD (1988) Rational-expectations econometric analysis of changes in regime: an investigation of the term structure of interest rates. J Econom Dyn Control 12(2–3):385–423 (Economic time series with random walk and other nonstationary components)

  • Hamilton JD (1989) A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2):357–384

  • Hamilton JD (1990) Analysis of time series subject to changes in regime. J Econom 45(1–2):39–70

  • Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton

  • Hamilton JD, Susmel R (1994) Autoregressive conditional heteroskedasticity and changes in regime. J Econom 64(1):307–333

  • Herrndorf N (1984) A functional central limit theorem for weakly dependent sequences of random variables. Ann Probab 12(1):141–153

  • Jones GL (2004) On the Markov chain central limit theorem. Probab Surv 1:299–320

  • Kim C-J, Kim J (2015) Bayesian inference in regime-switching ARMA models with absorbing states: the dynamics of the ex-ante real interest rate under regime shifts. J Bus Econom Statist 33(4):566–578

  • Newey WK, West KD (1987) A simple, positive semidefinite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55(3):703–708

  • Nicholls DF, Quinn BG (1982) Random coefficient autoregressive models: an introduction, volume 11 of Lecture Notes in Statistics. Springer, New York, Berlin

  • Norris JR (1998) Markov chains, volume 2 of Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge (reprint of 1997 original)

  • Romano JP, Thombs LA (1996) Inference for autocorrelations under weak assumptions. J Am Statist Assoc 91(434):590–600

  • Stelzer R (2009) On Markov-switching ARMA processes–stationarity, existence of moments, and geometric ergodicity. Econom Theory 25(1):43–62


Acknowledgements

We sincerely thank the anonymous reviewers and the Editor-in-Chief for helpful remarks. The authors wish to acknowledge the support from the “Séries temporelles et valeurs extrêmes : théorie et applications en modélisation et estimation des risques” Projet Région grant No OPE-2017-0068.

Author information

Corresponding author

Correspondence to Yacouba Boubacar Maïnassara.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proofs

1.1 A.1. Proofs of Proposition 3.1 and Lemma 3.3

Proof of Proposition 3.1

Let us first note that Condition (A5a) is equivalent to

$$\begin{aligned} {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \prod _{i=1}^t \Phi (\Delta _i,\theta )\right| \right| ^{8}\right) \le C \rho ^t,\quad {\mathbb {E}}\left( \left| \left| \prod _{i=1}^t \Psi (\Delta _i)\right| \right| ^{8}\right) \le C \rho ^t, \end{aligned}$$
(41)

for some constant \(C>0\) and \(0<\rho <1\) (independent of \(\theta \)). Let us introduce the processes \(({\tilde{Z}}_t)_{t\in {\mathbb {Z}}}\) and \(({\tilde{\omega }}_t)_{t\in {\mathbb {Z}}}\) by

$$\begin{aligned}&{\tilde{Z}}_t=(X_t,\dots ,X_{t-p+1},\epsilon _t,\dots ,\epsilon _{t-q+1})'\in {\mathbb {R}}^{(p+q)\times 1}, \\&{\tilde{\omega }}_t= (\epsilon _t,0,\dots ,\epsilon _t,\dots ,0)'\in {\mathbb {R}}^{(p+q)\times 1} \end{aligned}$$

where \(\epsilon _t\) in the latter is in \((p+1)\)th position in \({\tilde{\omega }}_t\). Then it is clear that we have the following equation for \({\tilde{Z}}_t\):

$$\begin{aligned} {\tilde{Z}}_t=\Psi (\Delta _{t}){\tilde{Z}}_{t-1} + {\tilde{\omega }}_t ,\quad \forall t\in {\mathbb {Z}}, \end{aligned}$$

A candidate solution of the above equation is, with the usual convention \(\prod _{j=0}^{-1}=1\),

$$\begin{aligned} {\tilde{Z}}_t= \sum _{k=0}^\infty \prod _{j=0}^{k-1}\Psi (\Delta _{t-j}){\tilde{\omega }}_{t-k},\quad t\in {\mathbb {Z}}, \end{aligned}$$
(42)

a stationary process, provided that the series converges, as we now prove. Let \(||\cdot ||\) denote a subordinate norm on the set of matrices. By independence of the processes \((\Delta _t)_{t\in {\mathbb {Z}}}\) and \((\epsilon _t)_{t\in {\mathbb {Z}}}\), and using the fact that the latter is square integrable, we easily get, for \(k\ge 1\),

$$\begin{aligned} {\mathbb {E}}\left( \left| \left| \Psi (\Delta _{t})\dots \Psi (\Delta _{t-k+1}){\tilde{\omega }}_{t-k}\right| \right| ^2\right)&\le {\mathbb {E}}\left( \left| \left| \Psi (\Delta _{t})\dots \Psi (\Delta _{t-k+1}) \right| \right| ^2 .\left| \left| {\tilde{\omega }}_{t-k}\right| \right| ^2\right) \\&={\mathbb {E}}\left( \left| \left| \Psi (\Delta _{t})\dots \Psi (\Delta _{t-k+1}) \right| \right| ^2\right) {\mathbb {E}}\left( \left| \left| {\tilde{\omega }}_{t-k}\right| \right| ^2\right) \\&\le C {\mathbb {E}}\left( \left| \left| {\tilde{\omega }}_{0}\right| \right| ^2\right) \rho ^k, \end{aligned}$$

the last inequality stemming from (41), so that the series (42) converges in \(L^2\). Note that \({\tilde{Z}}_t\) (hence \(X_t\)) is in \(L^4\): replace \(||\cdot ||^2\) by \(||\cdot ||^4\) in the above inequalities and use again (41) together with the fact that \((\epsilon _t)_{t\in {\mathbb {Z}}}\) is in \(L^4\), see Assumption (A3). Similarly, defining

$$\begin{aligned} Z_t(\theta ):=(\epsilon _t(\theta ),\dots ,\epsilon _{t-q+1}(\theta ),X_t,\dots ,X_{t-p+1})' , \quad \omega _t= (X_t,0,\dots ,X_t,\dots ,0)' \end{aligned}$$
(43)

where \(X_t\) in the latter is in \((q+1)\)th position, we also get that \(Z_t(\theta )\) satisfies

$$\begin{aligned} Z_t(\theta )=\Phi (\Delta _{t},\theta )Z_{t-1}(\theta )+\omega _t . \end{aligned}$$

A candidate solution of the above equation is

$$\begin{aligned} Z_t(\theta )= \sum _{k=0}^\infty \prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )\omega _{t-k},\quad t\in {\mathbb {Z}}. \end{aligned}$$
(44)

Similarly to the proof leading to (42), convergence of (44) is obtained thanks to (41) as well as stationarity of \((X_t)_{t\in {\mathbb {Z}}}\) and the fact that \(X_t\in L^4\).

We check that \(\omega _t=M{\tilde{Z}}_t\) and \(\epsilon _t(\theta )=w_1Z_t(\theta )\), which, plugged into (42) and (44), yields (8) with coefficients \(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\) given by (9). Finally, let us verify that \((c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))_{i\in {\mathbb {N}}}\) is the unique sequence verifying (8). Let us then pick a sequence of r.v. \((d_i)_{i\in {\mathbb {N}}}\) in \({{\mathcal {H}}}\) such that \(\epsilon _t(\theta )= \sum _{i=0}^\infty c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i}= \sum _{i=0}^\infty d_i \epsilon _{t-i}\). We then get, by independence of these coefficients from \( (\epsilon _t)_{t\in {\mathbb {Z}}}\), as well as by the fact that the latter is a weak white noise:

$$\begin{aligned} 0= & {} {\mathbb {E}}\left( \left[ \sum _{i=0}^\infty (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})-d_i)\epsilon _{t-i}\right] ^2\right) \\= & {} \sigma ^2{\mathbb {E}}\left( \sum _{i=0}^\infty (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})-d_i)^2\right) \end{aligned}$$

hence \((c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))_{i\in {\mathbb {N}}}=(d_i)_{i\in {\mathbb {N}}}\) a.s. \(\square \)
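
As an aside, the companion-form recursion and its truncated series solution (42) are straightforward to check numerically. The following minimal Python sketch does so for a hypothetical two-regime ARMA(1,1); the i.i.d. regime draws (standing in for the modulating process), the coefficients a(s), b(s) and all names are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-regime ARMA(1,1): X_t = a(s) X_{t-1} + eps_t + b(s) eps_{t-1}.
# With p = q = 1, tilde-Z_t = (X_t, eps_t)' and Psi(s) is the 2x2 companion matrix.
a = {1: 0.5, 2: -0.3}
b = {1: 0.3, 2: -0.2}

def Psi(s):
    return np.array([[a[s], b[s]],
                     [0.0, 0.0]])

n = 1000
Delta = rng.integers(1, 3, size=n)      # illustrative i.i.d. regime path
eps = rng.standard_normal(n)

X = np.zeros(n)
Z = np.zeros(2)
for t in range(n):
    omega = np.array([eps[t], eps[t]])  # tilde-omega_t: eps_t in first and (p+1)th slots
    Z = Psi(Delta[t]) @ Z + omega       # tilde-Z_t = Psi(Delta_t) tilde-Z_{t-1} + tilde-omega_t
    X[t] = Z[0]

# Truncation of the series solution (42) at the last date; the geometric decay
# in (41) makes the truncation error negligible for moderate K.
t, K = n - 1, 60
Zser, P = np.zeros(2), np.eye(2)
for k in range(K):
    Zser += P @ np.array([eps[t - k], eps[t - k]])
    P = P @ Psi(Delta[t - k])
assert np.allclose(Zser[0], X[t], atol=1e-10)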

Proof of Lemma 3.3

The fact that the maps \(\theta \mapsto c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\), \(\theta \mapsto \nabla [c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})]^2\) and \(\theta \mapsto \nabla ^2 [c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})]^2\) are polynomial functions (of several variables) can be verified easily using the fact that, for all \(s\in {{\mathcal {S}}}\), \(\theta \mapsto \Phi (s,\theta )\) and \(\theta \mapsto \Psi (\theta )\) are affine functions. We turn to (12). Minkowski's inequality together with the submultiplicativity of the matrix norm \(||\cdot ||\) entails

$$\begin{aligned}&\left| \left| \sup _{\theta \in \Theta } |c_i(\theta ,\Delta _{i},\dots ,\Delta _{1})|\right| \right| _{2\nu +4} \nonumber \\&\quad \le \sum _{k=0}^i \left| \left| \sup _{\theta \in \Theta } |w_1\Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta )M \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1})w_{p+1}'|\right| \right| _{2\nu +4} \nonumber \\&\quad \le C \sum _{k=0}^i \left[ {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta ) \right| \right| ^{2\nu +4} \left| \left| \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1}) \right| \right| ^{2\nu +4}\right) \right] ^{1/(2\nu +4)} \end{aligned}$$
(45)

for some constant \(C>0\). The Cauchy–Schwarz inequality together with (A5a) yields

$$\begin{aligned}&\left[ {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta ) \right| \right| ^{2\nu +4} \left| \left| \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1})\right| \right| ^{2\nu +4}\right) \right] ^{1/(2\nu +4)}\\&\quad \le \left[ {\mathbb {E}}\left( \sup _{\theta \in \Theta }\left| \left| \Phi (\Delta _{i},\theta )\dots \Phi (\Delta _{i-k+1},\theta ) \right| \right| ^{4\nu +8} \right) \right] ^{\frac{1}{(4\nu +8 )}} \\&\qquad \left[ {\mathbb {E}}\left( \left| \left| \Psi (\Delta _{i-k})\dots \Psi (\Delta _{1})\right| \right| ^{4\nu +8} \right) \right] ^{\frac{1}{(4\nu +8 )}}\le \kappa \rho ^{\frac{i}{(2\nu +4 )}} \end{aligned}$$

which, plugged into (45), yields inequality (12) for \(c_i(\theta ,\Delta _{i},\dots ,\Delta _{1})\). The inequalities for \(\nabla ^j [c_i(\theta ,\Delta _{i},\dots ,\Delta _{1})]\), \(j=2,3\), are proved similarly. As to \(c_i^e(t,\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\), (11) yields the upper bound

$$\begin{aligned}&\left| \left| \sup _{\theta \in \Theta } |c_i^e(t,\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})|\right| \right| _{2\nu +4} \\&\quad \le \sum _{k=0}^i \left| \left| \sup _{\theta \in \Theta } |w_1\Phi (\Delta _{t},\theta )\dots \Phi (\Delta _{t-k+1},\theta )M \Psi (\Delta _{t-k})\dots \Psi (\Delta _{t-i+1})w_{p+1}'|\right| \right| _{2\nu +4}, \end{aligned}$$

so that upper bound (13) for \(c_i^e(t,\theta ,\Delta _{t-1},\dots ,\Delta _{t-i})\) follows again by a Cauchy-Schwarz argument. The upper bound (13) for \(\nabla c_i^e(t,\theta ,\Delta _{t-1},\dots ,\Delta _{t-i})\) is obtained similarly. \(\square \)
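
Condition (A5a), in the equivalent form (41), is also easy to probe by simulation. The sketch below, again in Python, estimates \({\mathbb {E}}||\Phi (\Delta _t)\dots \Phi (\Delta _1)||^{8}\) by Monte Carlo under illustrative assumptions (two contractive regime matrices and i.i.d. regime draws, none taken from the paper) and prints the increments of its logarithm, which should be roughly constant and negative under geometric decay.

import numpy as np

rng = np.random.default_rng(1)
# Illustrative contractive 2x2 regime matrices (assumptions, not from the paper)
Phi = {1: np.array([[0.5, 0.2], [0.3, 0.0]]),
       2: np.array([[-0.3, 0.1], [0.2, 0.0]])}

reps, tmax = 2000, 25
est = np.zeros(tmax)
for _ in range(reps):
    P = np.eye(2)
    for t in range(tmax):
        P = Phi[rng.integers(1, 3)] @ P
        est[t] += np.linalg.norm(P, 2) ** 8   # ||Phi(Delta_t)...Phi(Delta_1)||^8
est /= reps
print(np.diff(np.log(est)))  # roughly constant and negative: E||prod||^8 <= C rho^t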

1.2 A.2. Proofs of Lemma 3.4 and Proposition 3.5

Proof of Lemma 3.4

We first prove Point 1. Using decomposition (8) of \(\epsilon _t(\theta )\), independence of the white noise from the modulating process, as well as stationarity of the former, we obtain

$$\begin{aligned} \left| \left| \sup _{\theta \in \Theta } |\epsilon _0(\theta )|\right| \right| _4\le \sum _{i=0}^\infty \left| \left| \sup _{\theta \in \Theta } |c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})|\right| \right| _4 \cdot ||\epsilon _{0}||_4 \end{aligned}$$

which is a converging series because of (12). As to \(e_t(\theta )\), we use this time decomposition (10) as well as (13) in order to get

$$\begin{aligned} \sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } |e_t(\theta )|\right| \right| _4\le \sum _{i=0}^\infty \sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta }|c_i^e(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})|\right| \right| _4\cdot ||\epsilon _{0}||_4<+\infty . \end{aligned}$$

In order to prove Point 2, we recall the following notation. From (4) and (5), we have

$$\begin{aligned} Z_t(\theta )=\omega _t+ \Phi (\Delta _{t},\theta )Z_{t-1}(\theta )\qquad \forall t\in {\mathbb {Z}}, \end{aligned}$$

and

$$\begin{aligned} Z^e_t(\theta )=\omega ^e_t+ \Phi (\Delta _{t},\theta )Z^e_{t-1}(\theta )\qquad t=1,\dots ,n, \end{aligned}$$

where \(Z^e_t(\theta ) := (e_t(\theta ),\dots ,e_{t-q+1}(\theta ), {\tilde{X}}_t,\dots ,{\tilde{X}}_{t-p+1})' , \quad \omega ^e_t= ({\tilde{X}}_t,0,\dots ,{\tilde{X}}_t,\dots ,0)', \) so that \(\omega ^e_t=\omega _t\) for \(t\ge r+1\) (where \(r=\max (p,q)\)) and \(\omega ^e_t=0_{p+q}\) for \(t\le 0\). We recall that the processes \(({\tilde{X}}_t)_{t\in {\mathbb {Z}}}\) and \((e_t(\theta ))_{t\in {\mathbb {Z}}}\) verify (5). Note that \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\right| \right| _2\longrightarrow 0\) is equivalent to \(\left| \left| \sup _{\theta \in \Theta } ||Z^e_t(\theta )-Z_t(\theta )||\right| \right| _2\longrightarrow 0\) as \(t\rightarrow \infty \). Now, since \({\tilde{X}}_t=X_t\) for \(t\ge 1\), we easily see that

$$\begin{aligned} Z^e_t(\theta )-Z_t(\theta )&=\Phi (\Delta _{t},\theta ) [Z^e_{t-1}(\theta )-Z_{t-1}(\theta )],\quad \forall t\ge r+1, \end{aligned}$$
(46)
$$\begin{aligned} Z^e_t(\theta )-Z_t(\theta )&=\omega ^e_t-\omega _t+ \Phi (\Delta _{t},\theta )[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )], \text{ for } t=1,\dots ,r. \end{aligned}$$
(47)

Now, using (46) and (47) we obtain

$$\begin{aligned} Z^e_t(\theta )-Z_t(\theta )= & {} \prod _{j=0}^{t-r-1}\Phi (\Delta _{t-j}, \theta )[Z^e_{r}(\theta )-Z_{r}(\theta )],\quad \forall t\ge r+1,\nonumber \\= & {} \prod _{j=0}^{t-r-1}\Phi (\Delta _{t-j},\theta ) \nonumber \\&\left( \sum _{i=0}^{r-1} \prod _{j=0}^{i-1}\Phi (\Delta _{r-j},\theta )[\omega ^e_{r-i}-\omega _{r-i}] - \prod _{j=0}^{r-1}\Phi (\Delta _{r-j},\theta )\omega _{0} \right) . \end{aligned}$$
(48)

Let us furthermore note that

$$\begin{aligned}&\left| \left| \sup _{\theta \in \Theta } |{\tilde{X}}_t-X_t|\right| \right| _4 \\&\quad = \left| \left| \sup _{\theta \in \Theta } |\sum _{i=t}^{r} g_i^a(\Delta _t,\theta ){X}_{t-i}+\sum _{j=t}^{r} g_j^b(\Delta _t,\theta )\epsilon _{t-j}(\theta )|\right| \right| _4<+\infty \text{ for } t=1,\dots ,r \end{aligned}$$

since \(X_t\in L^4\) (as shown in the proof of Proposition 3.1) and \(|| \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\,||_4 <+\infty \) by Point 1. In view of (48), using Minkowski's and Hölder's inequalities and (A5a), we thus have

$$\begin{aligned} \left| \left| \sup _{\theta \in \Theta } ||Z^e_t(\theta )-Z_t(\theta )||\right| \right| _2\le C\rho ^t, \end{aligned}$$

for some constant \(C>0\) and \(0<\rho <1\) (independent of \(\theta \)).

Let us turn to Point 3. This is due to

$$\begin{aligned} {{\mathbb {P}}}\left( t^\alpha \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|>\eta \right) \le \frac{t^{2+2\alpha }\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\right| \right| _2^2 }{t^{2} \eta ^2}= o\left( \frac{1}{t^{2}}\right) ,\quad \forall \eta >0 , \end{aligned}$$

the last equality holding thanks to Point 2; since these probabilities are therefore summable in t, the Borel–Cantelli lemma gives the announced a.s. convergence.

We now turn to Point 4. The fact that \(\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j\epsilon _0(\theta )||\right| \right| _4\) and \(\sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j e_t(\theta )||\right| \right| _4\) are finite is proved similarly to Point 1, using estimates (12) and (13). We then study the limit of \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (e_t- \epsilon _t)(\theta )||\right| \right| _{8/5}\) as \(t\rightarrow \infty \). Let \(i\in \{1,\dots ,(p+q)K\}\). Differentiating (46) with respect to \(\theta _i\) yields

$$\begin{aligned}&\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )- Z_t(\theta )] \nonumber \\&\quad =\Phi (\Delta _{t},\theta )\frac{\partial }{\partial \theta _i}[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )] + \frac{\partial }{\partial \theta _i}\Phi (\Delta _{t},\theta ) [Z^e_{t-1}(\theta )-Z_{t-1}(\theta )], \nonumber \\&\qquad \forall t\ge p+1, \end{aligned}$$
(49)

hence we may write

$$\begin{aligned} \frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]= \sum _{k=0}^{t-p}\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta ) \frac{\partial }{\partial \theta _i}\Phi (\Delta _{t-k},\theta ) [Z^e_{t-k}(\theta )-Z_{t-k}(\theta )], \end{aligned}$$

hence, using Minkowski's and Hölder's inequalities, and letting \(M_\Phi :=\max _{s\in {{\mathcal {S}}},\theta \in \Theta }\left| \left| \frac{\partial }{\partial \theta _i}\Phi (s,\theta )\right| \right| \), we get

$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{8/5}}&\le M_\Phi \sum _{k=0}^{t-p} \left| \left| \sup _{\theta \in \Theta } |\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )| \right| \right| _{{8}} \nonumber \\&\quad \cdot t^\alpha \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _2 . \end{aligned}$$
(50)

Now, since \(\left| \left| \sup _{\theta \in \Theta } ||\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )|| \right| \right| _{{8}}\le \kappa \rho ^k\) for some \(\kappa >0\) and \(\rho <1\) thanks to (A5a), and since \(t^\alpha \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _2 \) is uniformly bounded in t and \(k\le t\), and tends to 0 as \(t\rightarrow \infty \), the dominated convergence theorem yields that \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{8/5}} \longrightarrow 0\) as \(t\rightarrow \infty \), proving \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (e_t- \epsilon _t)(\theta )||\right| \right| _{{8/5}}\longrightarrow 0\) as \(t\rightarrow \infty \) in Point 4. Let us now prove that \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla ^2 (e_t- \epsilon _t)(\theta )||\right| \right| _{{4/3}}\longrightarrow 0\). Differentiating (49) again with respect to \(\theta _\ell \), \(\ell \in \{1,\dots ,(p+q)K\}\), we obtain

$$\begin{aligned}&\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )] \nonumber \\&\quad =\Phi (\Delta _{t},\theta )\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )]+ \frac{\partial }{\partial \theta _\ell }\Phi (\Delta _{t},\theta ) \frac{\partial }{\partial \theta _i}[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )] \nonumber \\&\qquad +\,\frac{\partial }{\partial \theta _i}\Phi (\Delta _{t},\theta ) \frac{\partial }{\partial \theta _\ell }[Z^e_{t-1}(\theta )-Z_{t-1}(\theta )] + \frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}\Phi (\Delta _{t},\theta ) [Z^e_{t-1}(\theta )-Z_{t-1}(\theta )], \nonumber \\&\qquad \qquad \forall t\ge p+1, \end{aligned}$$
(51)

so that, in the same spirit as (50), we obtain

$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{4/3}}&\le M_\Phi ' \sum _{k=0}^{t-p} \left| \left| \sup _{\theta \in \Theta } |\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )| \right| \right| _{{8}} \nonumber \\&\qquad \cdot t^\alpha \left[ \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _{{8/5}} \right. \nonumber \\&\qquad +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _\ell }[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{{8/5}} \nonumber \\&\qquad \left. +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{{8/5}} \right] , \end{aligned}$$
(52)

for some positive constant \(M_\Phi '\). Using Point 2 (so that \(t^\alpha \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _{{8/5}}\) tends to 0 as \(t\rightarrow \infty \), since \(8/5<2\)) and the previous estimate

$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial }{\partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{8/5}} \longrightarrow 0 \end{aligned}$$

for all \(i\in \{1,\dots ,(p+q)K\}\), we conclude by the dominated convergence theorem that

$$\begin{aligned} t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{{4/3}},\text { hence }t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^2}{\partial \theta _\ell \partial \theta _i} (e_t- \epsilon _t)(\theta )||\right| \right| _{{4/3}}, \end{aligned}$$

tends to 0.

We finish by sketching the proof that \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla ^3 (e_t- \epsilon _t)(\theta )||\right| \right| _{1}\longrightarrow 0\). The starting point is again differentiating (51) with respect to \(\theta _{\ell '}\), \(\ell '\in \{1,\dots ,(p+q)K\}\), which yields, as in (52), the following estimate:

$$\begin{aligned}&t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\frac{\partial ^3}{\partial \theta _{\ell '} \partial \theta _\ell \partial \theta _i}[Z^e_t(\theta )-Z_t(\theta )]||\right| \right| _{1} \\&\quad \le M_\Phi '' \sum _{k=0}^{t-p} \left| \left| \sup _{\theta \in \Theta } |\prod _{j=0}^{k-1}\Phi (\Delta _{t-j},\theta )| \right| \right| _{8}\\&\quad \qquad \cdot t^\alpha \left[ \left| \left| \ \sup _{\theta \in \Theta }|| Z^e_{t-k}(\theta )-Z_{t-k}(\theta )|| \right| \right| _{4/3}+\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _\ell }[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3}\right. \\&\qquad \qquad +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial }{\partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3} \\&\qquad \qquad +\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial ^2}{\partial \theta _\ell \partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3} \\&\qquad \qquad +\, \left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial ^2}{\partial \theta _{\ell '}\partial \theta _i}[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3} \\&\qquad \qquad \left. +\,\left| \left| \ \sup _{\theta \in \Theta }|| \frac{\partial ^2}{\partial \theta _{\ell '}\partial \theta _\ell }[Z^e_{t-k}(\theta )-Z_{t-k}(\theta )]|| \right| \right| _{4/3}\right] , \end{aligned}$$

for some constant \(M_\Phi ''\), so that we conclude similarly. \(\square \)

Proof of Proposition 3.5

In this proof, C will denote a generic positive constant that may change from line to line. Let us start with Point 1. The fact that \(Q_n(\theta )\) converges a.s. to \(O_\infty (\theta )=\frac{1}{2}{{\mathbb {E}}}(\epsilon _0^2(\theta ))\) as \(n\rightarrow \infty \) is a consequence of the fact that \( \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|^2\longrightarrow 0\) a.s. (itself a consequence of Point 3 of Lemma 3.4) and is justified by exactly the same proof as that of Lemma 7 in Francq and Zakoïan (1998). We now prove that \(n^\alpha \left| \left| \sup _{\theta \in \Theta } |Q_n(\theta )-O_n(\theta )|\right| \right| _1 \longrightarrow 0\). Let \(\alpha \in (0,1)\). Using the upper bound \(\sup _{\theta \in \Theta }|e_t(\theta )^2-\epsilon _t(\theta )^2|\le \left[ \sup _{\theta \in \Theta } |e_t(\theta )| + \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right] \cdot \sup _{\theta \in \Theta } |e_t(\theta )- \epsilon _t(\theta )|\), as well as the Cauchy–Schwarz and Minkowski inequalities, we get

$$\begin{aligned} n^\alpha \left| \left| \sup _{\theta \in \Theta } |Q_n(\theta )-O_n(\theta )|\right| \right| _1\le & {} \frac{1}{n^{1-\alpha }}\sum _{t=1}^n \left[ \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _2 + \left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right| \right| _2\right] \\&\cdot \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2 . \end{aligned}$$

Since \( \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _2\) is bounded uniformly in t thanks to Point 1 of Lemma 3.4, and \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )|\right| \right| _2\) is constant in t and finite, there thus exists some constant \(C>0\) such that

$$\begin{aligned} n^\alpha \left| \left| \sup _{\theta \in \Theta } |Q_n(\theta )-O_n(\theta )|\right| \right| _1 \le C \frac{1}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2 . \end{aligned}$$
(53)

Let us write the right hand side of the above inequality in the form \(\frac{1}{n^{1-\alpha }}\sum _{t=1}^n [t^{1-\alpha }-(t-1)^{1-\alpha }] \frac{1}{t^{1-\alpha }-(t-1)^{1-\alpha }} \left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2 \). Since

$$\begin{aligned} \frac{1}{t^{1-\alpha }-(t-1)^{1-\alpha }} \left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2\sim _{t\rightarrow \infty } \frac{1}{(1-\alpha )t^{-\alpha }}\left| \left| \sup _{\theta \in \Theta }|e_t(\theta )- \epsilon _t(\theta )| \right| \right| _2, \end{aligned}$$

which tends to 0 as \(t\rightarrow \infty \) (a consequence of Point 2 of Lemma 3.4), Toeplitz’s lemma implies that the right hand side of (53) tends to 0 as \(n\rightarrow \infty \), and this proves Point 1.
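
For the reader's convenience, the Toeplitz step rests on the telescoping identity

$$\begin{aligned} \sum _{t=1}^n \left[ t^{1-\alpha }-(t-1)^{1-\alpha }\right] = n^{1-\alpha }, \end{aligned}$$

so that the positive weights \(b_t=t^{1-\alpha }-(t-1)^{1-\alpha }\) sum exactly to the normalizer \(n^{1-\alpha }\); Toeplitz's lemma then sends the corresponding weighted averages of a sequence tending to 0 to 0.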

We now prove Point 2. We have for all \(\theta \in \Theta \)

$$\begin{aligned} ||\nabla [e_t(\theta )^2-\epsilon _t(\theta )^2]||&=|| 2 e_t(\theta ) \nabla [e_t(\theta )-\epsilon _t(\theta )]+ 2[e_t(\theta )-\epsilon _t(\theta )] \nabla \epsilon _t(\theta )|| \nonumber \\&\le 2 || e_t(\theta ) \nabla [e_t(\theta )-\epsilon _t(\theta )]|| + 2|e_t(\theta )- \epsilon _t(\theta )|\cdot || \nabla \epsilon _t(\theta ) ||, \end{aligned}$$
(54)

so that

$$\begin{aligned} \sup _{\theta \in \Theta } ||\nabla (Q_n(\theta )-O_n(\theta ))||&\le \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } |e_t(\theta )|\cdot \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \nonumber \\&\quad + \,\frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } |e_t(\theta )-\epsilon _t(\theta )|\cdot \sup _{\theta \in \Theta }|| \nabla \epsilon _t(\theta )||. \end{aligned}$$
(55)

Lemma 3.4, Points 2 and 4, along with the Borel–Cantelli lemma, yield that \(\sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\) and \(\sup _{\theta \in \Theta } ||\nabla (\epsilon _t-e_t)(\theta )||\) a.s. tend to 0 as \(t\rightarrow \infty \). The second term on the right hand side of (55) is then a.s. bounded above, thanks to the Cauchy–Schwarz inequality, by

$$\begin{aligned} \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } |e_t(\theta )-\epsilon _t(\theta )|^2\right] ^{1/2}\cdot \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|| \nabla \epsilon _t(\theta )||^2\right] ^{1/2}, \end{aligned}$$

which tends to zero thanks to Cesàro's lemma and the ergodic theorem. Since, by Minkowski's inequality,

$$\begin{aligned} \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|e_t(\theta )|^2\right] ^{1/2}\le \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|e_t(\theta )-\epsilon _t(\theta )|^2\right] ^{1/2} + \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|\epsilon _t(\theta )|^2\right] ^{1/2}, \end{aligned}$$

we have that \(\left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }|e_t(\theta )|^2\right] ^{1/2}\) is a.s. bounded in \(n\ge 1\), again by a Cesàro and ergodic theorem argument. The first term on the right hand side of (55) is then again a.s. bounded above, thanks to the Cauchy–Schwarz inequality, by

$$\begin{aligned} \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta } ||\nabla (e_t-\epsilon _t)(\theta )||^2\right] ^{1/2}\cdot \left[ \frac{1}{n}\sum _{t=1}^n \sup _{\theta \in \Theta }| e_t(\theta )|^2\right] ^{1/2}, \end{aligned}$$

which tends to zero as \(n\rightarrow \infty \). Hence (55) implies that \(\sup _{\theta \in \Theta } ||\nabla (Q_n(\theta )-O_n(\theta ))||\) a.s. tends to 0 as \(n\rightarrow \infty \). The proof of the a.s. convergence of \(\sup _{\theta \in \Theta } ||\nabla ^j(Q_n(\theta )-O_n(\theta ))||\) to 0 for \(j=2,3\) is obtained similarly, using arguments related to Points 3 and 4 of Lemma 3.4.

Let us now prove Point 3. Let \(\alpha \in (0,1)\). We deduce from (54), using the Minkowski and Hölder inequalities, that

$$\begin{aligned} n^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (Q_n(\theta )-O_n(\theta ))||\right| \right| _1&\le \frac{C}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _4 \left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{4/3} \nonumber \\&\quad + \,\frac{C}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta } |e_t(\theta )-\epsilon _t(\theta )| \right| \right| _2 \left| \left| \sup _{\theta \in \Theta }|| \nabla \epsilon _t(\theta )|| \right| \right| _{2} . \end{aligned}$$
(56)

Using Point 1 of Lemma 3.4, we have that \(\left| \left| \sup _{\theta \in \Theta } |e_t(\theta )| \right| \right| _4\) is bounded above by some constant C. The first term on the right hand side of (56) may thus be bounded above by

$$\begin{aligned} C\frac{1}{n^{1-\alpha }}\sum _{t=1}^n \left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{4/3}. \end{aligned}$$

Noting that \(\left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{4/3}\le C' \left| \left| \sup _{\theta \in \Theta }|| \nabla [e_t(\theta )-\epsilon _t(\theta )]|| \right| \right| _{8/5}\) for some constant \(C'\), the above expression is, similarly to the argument in (53), a quantity that tends to 0 as \(n\rightarrow \infty \) thanks to Point 4 in Lemma 3.4 coupled with Toeplitz's lemma. Hence the first term on the right hand side of (56) tends to 0 as \(n\rightarrow \infty \). Again using Points 1 and 2 of the same lemma, and with the same argument, we also have that the second term on the right hand side of (56) tends to 0 as \(n\rightarrow \infty \), which proves Point 3. \(\square \)

1.3 A.3. Proofs of Proposition 3.6 and Theorem 3.7

Proof of Proposition 3.6

Independence of the processes \((\Delta _t)_{t\in {\mathbb {Z}}}\) and \((\epsilon _t)_{t\in {\mathbb {Z}}}\), as well as their ergodicity, yields that, for fixed \(j\in {\mathbb {N}}\), the process \(\left( (\Delta _{t-1},\ldots ,\Delta _{t-j},\epsilon _{t-j}) \right) _{t\in {\mathbb {Z}}}\) is ergodic. We thus deduce from Expression (8), and using the fact that \((\epsilon _t)_{t\in {\mathbb {Z}}}\) is a weak white noise, that \(O_n(\theta )\) defined by (14) verifies

$$\begin{aligned} \begin{aligned} 2 O_n(\theta ) \longrightarrow 2 O_\infty (\theta )&:=\sigma ^2 \sum _{j=0}^\infty {\mathbb {E}}\left( [c_j(\theta ,\Delta _0,\ldots ,\Delta _{-j})]^2\right) \\&= \sigma ^2 + \sigma ^2 \sum _{j=1}^\infty {\mathbb {E}}\left( [c_j(\theta ,\Delta _0,\ldots ,\Delta _{-j})]^2\right) \quad \text{ a.s. } \end{aligned} \end{aligned}$$
(57)

as \(n\rightarrow \infty \) (remember that \(c_0(\theta ,\Delta _0)=1\)). By uniqueness of decomposition (8) in Proposition 3.1, and since \(\epsilon _t(\theta _0)=\epsilon _t\), we have that \((c_i(\theta ,\Delta _{t-1},\dots ,\Delta _{t-i}))_{i\in {\mathbb {N}}}=(1,0,\ldots )\) if and only if \(\theta =\theta _0\), so that \(O_\infty (\theta )\) given in (57) is minimal at \(\theta =\theta _0\), with minimum \(O_\infty (\theta _0)=\sigma ^2/2\). Let us now deduce that the estimator \({\check{\theta }}_n\) defined in (15) converges a.s. towards \(\theta _0\). For this, we consider a subsequence \(({\check{\theta }}_{n_k})_{k\in {\mathbb {N}}}\) that converges to some \(\theta ^*\) in the compact set \(\Theta \) and prove that \(\theta ^*=\theta _0\). Indeed, by definition of the estimator \({\check{\theta }}_{n_k}\) we have

$$\begin{aligned} O_{n_k}(\theta _0)\ge O_{n_k}({\check{\theta }}_{n_k}) \end{aligned}$$
(58)

for all \(k\in {\mathbb {N}}\). A Taylor expansion yields the inequality

$$\begin{aligned} | O_{n_k}({\check{\theta }}_{n_k})- O_{n_k}(\theta ^*)|\le || {\check{\theta }}_{n_k} - \theta ^* ||\cdot \frac{1}{n_k}\sum _{t=1}^{n_k}\sup _{\theta \in \Theta }[|\epsilon _t(\theta )|\cdot || \nabla \epsilon _t(\theta )||]. \end{aligned}$$
(59)

But, using the ergodic theorem, we have

$$\begin{aligned} \frac{1}{n_k}\sum _{t=1}^{n_k}\sup _{\theta \in \Theta }[|\epsilon _t(\theta )|\cdot || \nabla \epsilon _t(\theta )||]&\le \frac{1}{2 n_k}\sum _{t=1}^{n_k}\left[ \sup _{\theta \in \Theta }|\epsilon _t(\theta )|^2 + \sup _{\theta \in \Theta }||\nabla \epsilon _t(\theta )||^2 \right] \\&\longrightarrow \frac{1}{2} \left| \left| \sup _{\theta \in \Theta }|\epsilon _0(\theta )| \right| \right| _2^2 + \frac{1}{2} \left| \left| \sup _{\theta \in \Theta }||\nabla \epsilon _0(\theta )|| \right| \right| _2^2<+\infty , \end{aligned}$$

so that we get from (59) that \(O_{n_k}({\check{\theta }}_{n_k})- O_{n_k}(\theta ^*)\longrightarrow 0\) as \(k\rightarrow \infty \). Since \(O_{n_k}(\theta ^*)\longrightarrow O_{\infty }(\theta ^*)\), we obtain, passing to the limit in (58), that

$$\begin{aligned} O_{\infty }(\theta _0)\ge O_{\infty }(\theta ^*), \end{aligned}$$

hence \( \theta ^*=\theta _0\) thanks to the uniqueness of the minimum of \(O_{\infty }(\theta )\). \(\square \)

Proof of Theorem 3.7

Similarly to the proof of Proposition 3.6, we consider a subsequence \(({\hat{\theta }}_{n_k})_{k\in {\mathbb {N}}}\) that converges to some \(\theta _*\) in the compact set \(\Theta \) and prove that \(\theta _*=\theta _0\) by showing that \(O_{\infty }(\theta _0)= O_{\infty }(\theta _*)\). By definition of \({\hat{\theta }}_{n_k}\) we have

$$\begin{aligned} Q_{n_k}(\theta _0)\ge Q_{n_k}({\hat{\theta }}_{n_k}),\quad \forall k \ge 0. \end{aligned}$$
(60)

Now, a Taylor expansion yields, for all \(\theta '\) and \(\theta ''\) in \(\Theta \), similarly to the argument in the proof of Proposition 3.6,

$$\begin{aligned} | Q_{n_k}(\theta ')- Q_{n_k}(\theta '')|\le || \theta ' - \theta '' ||\cdot \frac{1}{2 n_k}\sum _{t=1}^{n_k}\left[ \sup _{\theta \in \Theta }|e_t(\theta )|^2 + \sup _{\theta \in \Theta }||\nabla e_t(\theta )||^2 \right] . \end{aligned}$$
(61)

Using the inequality \((a+b)^2\le 2 (a^2+b^2)\), valid for all a and b, we deduce that \(\sup _{\theta \in \Theta }|e_t(\theta )|^2\le 2 \sup _{\theta \in \Theta }|e_t(\theta )-\epsilon _t(\theta )|^2 + 2\sup _{\theta \in \Theta }|\epsilon _t(\theta )|^2\). Since a consequence of Point 3 of Lemma 3.4 is that \(\sup _{\theta \in \Theta }|e_t(\theta )-\epsilon _t(\theta )|^2\) tends to 0 a.s. as \(t\rightarrow \infty \), the ergodic theorem yields that

$$\begin{aligned} \frac{1}{n_k}\sum _{t=1}^{n_k}\left[ \sup _{\theta \in \Theta }|e_t(\theta )|^2 + \sup _{\theta \in \Theta }||\nabla e_t(\theta )||^2 \right] \longrightarrow \left| \left| \sup _{\theta \in \Theta }|\epsilon _0(\theta )| \right| \right| _2^2 + \left| \left| \sup _{\theta \in \Theta }||\nabla \epsilon _0(\theta )|| \right| \right| _2^2<+\infty \end{aligned}$$

as \(k\rightarrow \infty \). Thanks to (61) and Point 1 of Proposition 3.5, we thus deduce that \(Q_{n_k}(\theta _0)\longrightarrow O_\infty (\theta _0)\) and \(Q_{n_k}({\hat{\theta }}_{n_k})\longrightarrow O_\infty (\theta _*)\) as \(k\rightarrow \infty \), and we conclude in the same way as in the proof of Proposition 3.6. \(\square \)
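
To fix ideas, the following minimal Python sketch illustrates the least squares estimator studied above for a two-regime MS-ARMA(1,1) whose regime path is observed: the residuals \(e_t(\theta )\) are built recursively from zero initial values and the criterion is minimized numerically. The simulated design, the parameter layout \(\theta =(a(1),a(2),b(1),b(2))\) and every name below are illustrative assumptions, not the paper's specification; a strong Gaussian noise is used for simplicity, whereas the theory covers weak (uncorrelated but possibly dependent) noises.

import numpy as np
from scipy.optimize import minimize

def residuals(theta, X, Delta):
    # e_t(theta) = X_t - a(Delta_t) X_{t-1} - b(Delta_t) e_{t-1}, zero initial
    # values, mimicking the truncated residuals (up to the paper's conventions)
    a1, a2, b1, b2 = theta
    a = np.where(Delta == 1, a1, a2)
    b = np.where(Delta == 1, b1, b2)
    e = np.zeros_like(X)
    for t in range(1, len(X)):
        e[t] = X[t] - a[t] * X[t - 1] - b[t] * e[t - 1]
    return e

def Q_n(theta, X, Delta):
    # least squares criterion, up to normalization
    return 0.5 * np.mean(residuals(theta, X, Delta) ** 2)

rng = np.random.default_rng(2)
n = 2000
Delta = rng.integers(1, 3, size=n)          # observed regime path (illustrative)
eps = rng.standard_normal(n)
theta0 = np.array([0.5, -0.3, 0.3, -0.2])
a = np.where(Delta == 1, theta0[0], theta0[1])
b = np.where(Delta == 1, theta0[2], theta0[3])
X = np.zeros(n)
for t in range(1, n):
    X[t] = a[t] * X[t - 1] + eps[t] + b[t] * eps[t - 1]

# starting value inside the stability region keeps the recursion well behaved
fit = minimize(Q_n, x0=np.array([0.2, -0.1, 0.1, -0.1]), args=(X, Delta), method="BFGS")
print(fit.x)                                # close to theta0 for large n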

1.4 A.4. Proof of Theorem 3.8

Let us introduce the following matrices and vectors

$$\begin{aligned}&I_n(\theta ):= \mathrm{Var}\left( \sqrt{n}\nabla O_n(\theta )\right) \nonumber \\&\quad = \left( I_n(l,r)(\theta )\right) _{l,r=1\dots (p+q)K}\in {\mathbb {R}}^{(p+q)K\times (p+q)K},\quad n\in {\mathbb {N}}, \end{aligned}$$
(62)
$$\begin{aligned}&Y_k(\theta ) := \epsilon _k(\theta ) \nabla \epsilon _k(\theta )=(Y_k(l)(\theta ))_{l=1\dots (p+q)K} \in {\mathbb {R}}^{(p+q)K\times 1},\quad k\in {\mathbb {Z}}. \end{aligned}$$
(63)

Theorem 3.8 can be established using the following lemmas.

Lemma A.1

(Davydov (1968)) Let p, q and r be three positive numbers such that \(p^{-1}+q^{-1}+r^{-1}=1\). Then

$$\begin{aligned} \left| \text{ Cov }(X,Y)\right| \le K_0\Vert X\Vert _p\Vert Y\Vert _q\left[ \alpha \left\{ \sigma (X),\sigma (Y)\right\} \right] ^{1/r}, \end{aligned}$$
(64)

where \(\Vert X\Vert _p^p={{\mathbb {E}}}|X|^p\), \(K_0\) is a universal constant, and \(\alpha \left\{ \sigma (X),\sigma (Y)\right\} \) denotes the strong mixing coefficient between the \(\sigma \)-fields \(\sigma (X)\) and \(\sigma (Y)\) generated by the random variables X and Y, respectively.

Lemma A.2

Let the assumptions of Theorem 3.8 be satisfied. For all l, r in \(\{1,\dots ,(p+q)K\}\) and \(\theta \in \Theta \), we have

$$\begin{aligned} I_n(l,r)(\theta )\longrightarrow I(l,r)(\theta ):=\sum _{k=-\infty }^\infty c_k (l,r)(\theta ),\quad n\rightarrow +\infty , \end{aligned}$$

where \(c_k(l,r)(\theta )=\mathrm{Cov}\left( Y_t(l)(\theta ),Y_{t-k}(r)(\theta )\right) \), \(k\in {\mathbb {Z}}\), and the above series converges absolutely.

Proof of Lemma A.2

Let us write

$$\begin{aligned} \nabla \epsilon _t(\theta ) = \left( \frac{\partial \epsilon _t(\theta )}{\partial \theta _1},\dots , \frac{\partial \epsilon _t(\theta )}{\partial \theta _{(p+q)K}}\right) ', \end{aligned}$$

where \(\epsilon _t(\theta )\) is given by (8). The process \(\left( Y_k(\theta )\right) _k\) is strictly stationary and ergodic. Moreover, we have

$$\begin{aligned} I_n(\theta )=\mathrm{Var}\left( \sqrt{n}\frac{\partial }{\partial \theta }O_n(\theta )\right)= & {} \mathrm{Var}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_t(\theta )\right) =\frac{1}{n}\sum _{t,s=1}^{n}\text{ Cov }\left( Y_t(\theta ),Y_s(\theta )\right) \\= & {} \frac{1}{n}\sum _{k=-n+1}^{n-1}(n-|k|)\text{ Cov }\left( Y_t(\theta ), Y_{t-k}(\theta )\right) . \end{aligned}$$

From Proposition 3.1 and Lemma 3.3, we have

$$\begin{aligned}&\epsilon _t(\theta )= \sum _{i=0}^\infty c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i}\text { and }\frac{\partial \epsilon _t(\theta )}{\partial \theta _l}=\sum _{i=0}^\infty c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}) \epsilon _{t-i}, \\&\quad \text { for }l=1,\dots ,(p+q)K, \end{aligned}$$

where we recall that \(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\) is defined by (9), and

$$\begin{aligned}&c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}) \\&\quad =\frac{\partial }{\partial \theta _l} c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\\&\quad =\frac{\partial }{\partial \theta _l} \left( \sum _{k=0}^i w_1\Phi (\Delta _{t},\theta )\dots \Phi (\Delta _{t-k+1},\theta )M \Psi (\Delta _{t-k})\dots \Psi (\Delta _{t-i+1})w_{p+1}'\right) , \end{aligned}$$

with the following upper bounds holding thanks to (12):

$$\begin{aligned} {\mathbb {E}}\sup _{\theta \in \Theta }(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i \text { and }{\mathbb {E}}\sup _{\theta \in \Theta }( c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i,\quad \forall i. \end{aligned}$$

Let

$$\begin{aligned}&\beta _{i,j,i',j',k}(l,r)(\theta ) \nonumber \\&\quad = {{\mathbb {E}}}\left[ c_i(\theta ,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}) c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\qquad \qquad \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})\right] {{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] \nonumber \\&\qquad -\, {{\mathbb {E}}}\left[ c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}) c_{j,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-j+1})\right] \nonumber \\&\qquad \times {{\mathbb {E}}}\left[ c_{i'}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-i'+1})c_{j',r}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-j'+1})\right] {{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\right] \nonumber \\&\qquad \times \,{{\mathbb {E}}}\left[ \epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] \nonumber \\&\quad = {{\mathbb {E}}}\left[ c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}) c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\qquad \qquad \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})\right] \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \nonumber \\&\qquad +\, \text{ Cov }\left( c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}),c_{i'}(\theta ,\Delta _{t-k}, \dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\qquad \qquad \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})\right) {{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\right] {{\mathbb {E}}}\left[ \epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] . \end{aligned}$$
(65)

We then obtain

$$\begin{aligned} c_k(l,r)(\theta )=\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \beta _{i,j,i',j',k}(l,r)(\theta ),\quad k\in {\mathbb {Z}}. \end{aligned}$$

The Cauchy-Schwarz inequality implies that

$$\begin{aligned}&\left| {{\mathbb {E}}}[c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-j+1}) c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \nonumber \\&\quad \qquad \, \times \left. c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})]\right| \nonumber \\&\quad \le \left( {{\mathbb {E}}}[c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-j+1})]^2\right) ^{1/2} \nonumber \\&\qquad \,\times \left( {\mathbb {E}} [c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}) c_{j',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-j'+1})]^2\right) ^{1/2} \nonumber \\&\quad \le \left( {{\mathbb {E}}}[c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})]^4 \times {{\mathbb {E}}}[c_{j,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-j+1})]^4\right) ^{1/4}\nonumber \\&\qquad \left( {\mathbb {E}} [c_{i'}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-i'+1})]^4 {{\mathbb {E}}}[c_{j',r}(\theta ,\Delta _{t-k},\dots , \Delta _{t-k-j'+1})]^4\right) ^{1/4}\nonumber \\&\quad \le C\rho ^{i+j+i'+j'}. \end{aligned}$$
(66)

Suppose first that \(k\ge 0\). For all l, r in \(\{1,\dots ,(p+q)K\}\) and \(\theta \in \Theta \), in view of (66) it follows that

$$\begin{aligned} \left| c_k(l,r)(\theta )\right|= & {} \left| \text{ Cov } \left( Y_t(l)(\theta ),Y_{t-k}(r)(\theta )\right) \right| =\left| \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \beta _{i,j,i',j',k}(l,r)(\theta )\right| \\\le & {} g_1+g_2+g_3+g_4+g_5+h_1+h_2+h_3 , \end{aligned}$$

where

$$\begin{aligned} g_1= & {} \sum _{i>[k/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| ,\\ g_2= & {} \sum _{i=0}^\infty \sum _{j>[k/2]}\sum _{i'=0}^\infty \sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \\ g_3= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'>[k/2]}\sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| ,\\ g_4= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'>[k/2]} \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \\ g_5= & {} \sum _{i=0}^{[k/2]}\sum _{j=0}^{[k/2]}\sum _{i'=0}^{[k/2]} \sum _{j'=0}^{[k/2]}\kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| , \\ h_1= & {} \sigma ^4\sum _{i>[k/2]}\sum _{i'=0}^{\infty }\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r} (\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| , \\ h_2= & {} \sigma ^4\sum _{i=0}^{\infty }\sum _{i'>[k/2]}\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r}(\theta , \Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| , \\ h_3= & {} \sigma ^4\sum _{i=0}^{[k/2]}\sum _{i'=0}^{[k/2]}\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r} (\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| . \end{aligned}$$

Note that, in the strong noise case, we easily check that the \(\text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \) term in (65) is nonzero only for indices i, j, \(i'\), \(j'\) such that \(i=j=k+i'=k+j'\). This fact entails that, instead of considering the five sums \(g_1\),..., \(g_5\), we only need to consider a single sum of the form \( \kappa \sum _{j=k}^\infty \rho ^{2(2j-k)}\), which is \(\mathrm {O}(\rho ^k)\).
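
Indeed, substituting \(m=j-k\) in this sum gives

$$\begin{aligned} \kappa \sum _{j=k}^\infty \rho ^{2(2j-k)}=\kappa \,\rho ^{2k}\sum _{m=0}^\infty \rho ^{4m}=\frac{\kappa \,\rho ^{2k}}{1-\rho ^4}=\mathrm {O}(\rho ^{k}). \end{aligned}$$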

Because

$$\begin{aligned} \left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right|\le & {} \sqrt{{{\mathbb {E}}}\left[ \epsilon _{t-i} \epsilon _{t-j}\right] ^2 {{\mathbb {E}}}\left[ \epsilon _{t-k-i'}\epsilon _{t-k-j'}\right] ^2} \le {{\mathbb {E}}}\left| \epsilon _t\right| ^4<\infty \end{aligned}$$

by Assumption \({(\mathbf A3)}\), we have

$$\begin{aligned} g_1=\sum _{i>[k/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \le \kappa _1\rho ^{k/2}, \end{aligned}$$

for some positive constant \(\kappa _1\). Using the same arguments we obtain that \(g_i\ (i=2,3,4)\) is bounded by \(\kappa _i\rho ^{k/2}\). Furthermore, (A3) and the Cauchy–Schwarz inequality yield that \(\left\| \epsilon _{i} \epsilon _{i'}\right\| _{2+\nu }<+\infty \) for any i and \(i'\) in \({\mathbb {Z}}\). Lemma A.1 thus entails that

$$\begin{aligned} g_5= & {} \sum _{i=0}^{[k/2]}\sum _{j=0}^{[k/2]}\sum _{i'=0}^{[k/2]} \sum _{j'=0}^{[k/2]}\kappa \rho ^{i+j+i'+j'}\left| \text{ Cov }\left( \epsilon _{t-i} \epsilon _{t-j},\epsilon _{t-k-i'}\epsilon _{t-k-j'}\right) \right| \\\le & {} \sum _{i=0}^{[k/2]}\sum _{j=0}^{[k/2]}\sum _{i'=0}^{[k/2]} \sum _{j'=0}^{[k/2]}\kappa _5\rho ^{i+j+i'+j'}\left\| \epsilon _{t-i} \epsilon _{t-j}\right\| _{2+\nu }\left\| \epsilon _{t-k-i'}\epsilon _{t-k-j'} \right\| _{2+\nu }\\&\times \,\left\{ \alpha _{\epsilon }\left( \min \left[ k+j'-i,k+i'-i, k+j'-j,k+i'-j\right] \right) \right\} ^{\nu /(2+\nu )} \\\le & {} \kappa ' \alpha _{\epsilon }^{\nu /(2+\nu )}\left( \left[ k/2\right] \right) . \end{aligned}$$

Since

$$\begin{aligned}&\left| \text{ Cov }(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}), c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})\right. \\&\quad \times \, \left. c_{i',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| \le C\rho ^{i+i'}, \end{aligned}$$

we have

$$\begin{aligned} h_1= & {} \sigma ^4\sum _{i>[k/2]}\sum _{i'=0}^{\infty }\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta , \Delta _{t},\dots ,\Delta _{t-i+1}),\right. \\&\left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1})c_{i',r} (\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| \le \kappa '_1\rho ^{k/2}, \end{aligned}$$

for some positive constant \(\kappa '_1\). Using the same arguments we obtain that \(h_2\) is bounded by \(\kappa '_2\rho ^{k/2}\). The \(\alpha \)-mixing property (see Theorem 14.1 in Davidson 1994, p. 210) and Lemma A.1, along with (12), entail that

$$\begin{aligned} h_3= & {} \sigma ^4\sum _{i=0}^{[k/2]}\sum _{i'=0}^{[k/2]}\left| \text{ Cov } (c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1}),\right. \\&\qquad \left. c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}) c_{i',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right| \\\le & {} \sum _{i=0}^{[k/2]}\sum _{i'=0}^{[k/2]}\kappa _6\left\| c_i (\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})c_{i,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-i+1})\right\| _{2+\nu }\\&\times \,\left\| c_{i'}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}) c_{i',r}(\theta ,\Delta _{t-k},\dots ,\Delta _{t-k-i'+1}))\right\| _{2+\nu }\\&\times \,\left\{ \alpha _{\Delta }\left( k+1-i\right) \right\} ^{\nu /(2+\nu )}\le \kappa '_3 \alpha _{\Delta }^{\nu /(2+\nu )}\left( \left[ k/2\right] \right) . \end{aligned}$$

It follows that

$$\begin{aligned} \sum _{k=0}^{\infty }\left| c_k(l,r)(\theta )\right| \le \kappa \sum _{k=0}^{\infty }\rho ^{|k|/2}+\kappa '\sum _{k=0}^{\infty } \alpha _{\epsilon }^{\nu /(2+\nu )} \left( \left[ k/2\right] \right) +\kappa '' \sum _{k=0}^{\infty }\alpha _{\Delta }^{\nu /(2+\nu )} \left( \left[ k/2\right] \right) <\infty , \end{aligned}$$

by Assumption \({(\mathbf A2)}\). The same bound clearly holds for

$$\begin{aligned} \sum _{k=-\infty }^{0}\left| c_k(l,r)(\theta )\right| , \end{aligned}$$

which shows that

$$\begin{aligned} \sum _{k=-\infty }^{\infty }\left| c_k(l,r)(\theta )\right| <\infty . \end{aligned}$$

Then, the dominated convergence theorem gives

$$\begin{aligned} I_n(l,r)(\theta )=\frac{1}{n}\sum _{k=-n+1}^{n-1}(n-|k|)c_k(l,r)(\theta )\longrightarrow I(l,r)(\theta ):=\sum _{k=-\infty }^\infty c_k (l,r)(\theta ),\quad n\rightarrow +\infty , \end{aligned}$$

and completes the proof. \(\square \)
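For intuition, the last convergence is the usual Cesàro-type limit for absolutely summable sequences. A minimal numerical sketch (the geometric sequence \(c_k=\rho ^{|k|}\) below is a stand-in assumption, not the actual \(c_k(l,r)(\theta )\) of the paper):

```python
import numpy as np

rho = 0.7  # stand-in absolutely summable sequence c_k = rho**|k| (illustration only)

def I_n(n):
    # (1/n) * sum_{|k| < n} (n - |k|) * c_k, as in the display above
    k = np.abs(np.arange(-n + 1, n))
    return np.sum((n - k) * rho ** k) / n

limit = (1 + rho) / (1 - rho)  # closed form of sum_{k in Z} rho**|k|
for n in (10, 100, 1000):
    print(n, I_n(n), limit)
```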

Lemma A.3

Under the assumptions of Theorem 3.8, we have the convergence in distribution

$$\begin{aligned} \sqrt{n}\nabla Q_n(\theta _0) {\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}\mathcal{N}(0,I),\text { as } n\rightarrow \infty \end{aligned}$$

where we recall that the matrix \(I\) is given by (18).

Proof of Lemma A.3

In view of Proposition 3.5, it is easy to see that

$$\begin{aligned} \sqrt{n}\nabla \left( Q_n-O_n\right) (\theta _0)=o_{{\mathbb {P}}}(1). \end{aligned}$$

Thus \(\nabla Q_n(\theta _0) \) and \(\nabla O_n(\theta _0) \) have the same asymptotic distribution. Therefore, it remains to show that

$$\begin{aligned} \sqrt{n}\nabla O_n(\theta _0) {\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}\mathcal{N}(0,I),\text { as } n\rightarrow \infty . \end{aligned}$$

For \(l=1,\dots ,(p+q)K\) and \(\theta \in \Theta \), we have

$$\begin{aligned} \frac{\partial \epsilon _{t}(\theta )}{\partial \theta _{l}}= \sum _{i=1}^{\infty }c_{i,l}(\theta ,\Delta _{t},\dots , \Delta _{t-i+1})\epsilon _{t-i}, \end{aligned}$$
(67)

where the sequence \(c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\) is such that \({\mathbb {E}}\sup _{\theta \in \Theta }(c_{i,l} (\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\rightarrow 0\) at a geometric rate as \(i\rightarrow \infty \) (see Lemma 3.3). Moreover, note that

$$\begin{aligned} \sqrt{n}\frac{\partial O_n(\theta )}{\partial \theta _l}= & {} \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_t(l)(\theta ) \\= & {} \frac{1}{\sqrt{n}}\sum _{t=1}^{n}\sum _{i=0}^{\infty } c_{i}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i} \sum _{j=1}^{\infty }c_{j,l}(\theta ,\Delta _{t},\dots , \Delta _{t-j+1})\epsilon _{t-j}. \end{aligned}$$

Since \( \nabla \epsilon _{t}(\theta _0) \) belongs to the Hilbert space \({{\mathcal {H}}}_\epsilon (t-1)\), the random variables \(\epsilon _{t}(\theta _0)\) and \( \nabla \epsilon _{t}(\theta _0) \) are orthogonal and it is easy to verify that \({{\mathbb {E}}}\left[ \sqrt{n}\nabla O_n(\theta _0)\right] =0\). Now, we have for all m

$$\begin{aligned} \sqrt{n}\frac{\partial O_n(\theta _0)}{\partial \theta _l}= \frac{1}{\sqrt{n}} \sum _{t=1}^{n}Y_{t,m}(l)+\frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t,m}(l) \end{aligned}$$

where

$$\begin{aligned} Y_{t,m}(l)= & {} \sum _{j=1}^{m}c_{j,l}(\theta _0,\Delta _{t},\dots , \Delta _{t-j+1})\epsilon _t\epsilon _{t-j}\\ Z_{t,m}(l)= & {} \sum _{j=m+1}^{\infty }c_{j,l}(\theta _0,\Delta _{t}, \dots ,\Delta _{t-j+1})\epsilon _t\epsilon _{t-j}. \end{aligned}$$

Let

$$\begin{aligned} Y_{t,m}&:=Y_{t,m}(\theta _0)=\left( Y_{t,m}(1),\dots ,Y_{t,m} ((p+q)K)\right) '\text { and }\\ Z_{t,m}&:=Z_{t,m}(\theta _0)=\left( Z_{t,m}(1),\dots ,Z_{t,m}((p+q)K)\right) '. \end{aligned}$$

The processes \((Y_{t,m})_{t}\) and \((Z_{t,m})_{t}\) are stationary and centered. Moreover, under Assumption (A2) and for fixed m, the process \(Y=(Y_{t,m})_{t}\) is strongly mixing (see Davidson 1994, Theorem 14.1 p. 210), with mixing coefficients \(\alpha _Y(h)\le \alpha _{\Delta ,\epsilon }\left( \max \{0,h-m\}\right) \le \alpha _{\Delta }\left( \max \{0,h-m+1\}\right) +\alpha _{\epsilon } \left( \max \{0,h-m\}\right) \), by independence of \((\Delta _t)_{t\in {\mathbb {Z}}}\) and \((\epsilon _t)_{t\in {\mathbb {Z}}}\). Applying the central limit theorem (CLT) for mixing processes (see Herrndorf 1984), we directly obtain

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_{t,m}{\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}\mathcal{N}(0,I_m),\quad I_m=\sum _{h=-\infty }^{\infty } \mathrm{Cov}\left( Y_{t,m},Y_{t-h,m}\right) . \end{aligned}$$

In the strong noise case, the infinite sum in \(I_m\) reduces to one term corresponding to \(h=0\), and \(I_m\) simply equals \(\mathrm{Cov}\left( Y_{t,m},Y_{t,m}\right) \).
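In practice, \(I_m\) is a long-run variance and can be approximated by a truncated sum of sample autocovariances. A minimal sketch of this computation (the scalar AR(1) surrogate below is an assumption purely for illustration; the actual \(Y_{t,m}\) is the \((p+q)K\)-dimensional vector defined above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# Surrogate for one coordinate of Y_{t,m}: a centered dependent series
# (an AR(1) is assumed here purely for illustration).
Y = np.zeros(n)
for t in range(1, n):
    Y[t] = 0.5 * Y[t - 1] + rng.standard_normal()

def long_run_variance(x, H):
    # Truncated version of I_m = sum_h Cov(Y_t, Y_{t-h}).
    x = x - x.mean()
    m = len(x)
    s = np.dot(x, x) / m
    for h in range(1, H + 1):
        s += 2.0 * np.dot(x[h:], x[:-h]) / m
    return s

print("h = 0 term only        :", Y.var())                   # ~ 1/(1 - 0.25) = 1.33
print("truncated long-run sum :", long_run_variance(Y, 50))  # ~ 1/(1 - 0.5)**2 = 4
```

The gap between the two printed values shows why, outside the strong noise case, the single term \(h=0\) does not suffice.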

As in Francq and Zakoïan (1998) (see Lemma 3), we can show that \(I=\lim _{m\rightarrow \infty }I_m\) exists. Since \(\Vert Z_{t,m}\Vert _2\rightarrow 0\) at an exponential rate when \(m\rightarrow \infty \), using the arguments given in Francq and Zakoïan (1998) (see Lemma 4), we show that

$$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }{{\mathbb {P}}}\left\{ \left\| n^{-1/2} \sum _{t=1}^{n}Z_{t,m}\right\| >\varepsilon \right\} =0 \end{aligned}$$
(68)

for every \(\varepsilon >0\) (see Lemma A.4 below). From a standard result (see e.g. Brockwell and Davis 1991, Proposition 6.3.9), we deduce that

$$\begin{aligned} \sqrt{n}\,\nabla O_n(\theta _0) = \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Y_{t,m}+\frac{1}{\sqrt{n}} \sum _{t=1}^{n}Z_{t,m}{\mathop {\rightarrow }\limits ^{{{\mathcal {D}}}}}{{\mathcal {N}}}(0,I), \end{aligned}$$

which completes the proof. \(\square \)

Lemma A.4

Under the assumptions of Theorem 3.8, (68) holds, that is

$$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }{{\mathbb {P}}}\left\{ \left\| n^{-1/2} \sum _{t=1}^{n}Z_{t,m}\right\| >\varepsilon \right\} =0. \end{aligned}$$

Proof of Lemma A.4

For \(l=1,\dots ,(p+q)K\), by stationarity we have

$$\begin{aligned} \mathrm{Var}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t,m}(l)\right)= & {} \frac{1}{n}\sum _{t,s=1}^{n}\text{ Cov }(Z_{t,m}(l),Z_{s,m}(l))\\= & {} \frac{1}{n}\sum _{|h|<n}(n-|h|)\text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\\\le & {} \sum _{h=-\infty }^{\infty }\left| \text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\right| . \end{aligned}$$

Consider first the case \(h\ge 0\). Because \({\mathbb {E}} \sup _{\theta \in \Theta }(c_{j,l}(\theta ,\Delta _{t}, \dots ,\Delta _{t-j+1}))^2\le \kappa \rho ^{j}\) (see (12)), and using \({{\mathbb {E}}}|\epsilon _t|^4<\infty \), for \([h/2]\le m\) it follows from the Hölder inequality that

$$\begin{aligned} \sup _h\left| \text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\right| = \sup _h\left| {{\mathbb {E}}}(Z_{t,m}(l)Z_{t-h,m}(l))\right| \le \kappa \rho ^m. \end{aligned}$$
(69)

Let \(h>0\) such that \([h/2]>m\). Write

$$\begin{aligned} Z_{t,m}(l)=Z_{t,m}^{h^-}(l)+Z_{t,m}^{h^+}(l), \end{aligned}$$

where

$$\begin{aligned}&Z_{t,m}^{h^-}(l)=\sum _{j=m+1}^{[h/2]}c_{j,l}(\theta _0,\Delta _{t}, \dots ,\Delta _{t-j+1})\epsilon _t\epsilon _{t-j}, \\&Z_{t,m}^{h^+}(l)=\sum _{j=[h/2]+1}^{\infty }c_{j,l}(\theta _0,\Delta _{t}, \dots ,\Delta _{t-j+1})\epsilon _t\epsilon _{t-j}. \end{aligned}$$

Note that \(Z_{t,m}^{h^-}(l)\) belongs to the \(\sigma \)-field generated by \(\{\Delta _{t},\dots , \Delta _{t-[h/2]+1}, \epsilon _t,\epsilon _{t-1},\dots ,\epsilon _{t-[h/2]}\}\) and that \(Z_{t-h,m}(l)\) belongs to the \(\sigma \)-field generated by \(\{\Delta _{t-h},\Delta _{t-h-1},\dots ,\epsilon _{t-h},\epsilon _{t-h-1},\dots \}\). Note also that, by (A3), \({{\mathbb {E}}}|Z_{t,m}^{h^-}(l)|^{2+\nu }<\infty \) and \({{\mathbb {E}}}|Z_{t-h,m}(l)|^{2+\nu }<\infty \). The \(\alpha -\)mixing property and Lemma A.1 then entail that

$$\begin{aligned}&\left| \text{ Cov }(Z_{t,m}^{h^-}(l),Z_{t-h,m}(l))\right| \nonumber \\&\quad \le \kappa _1\sum _{j=m+1}^{[h/2]}\sum _{j'=m+1}^{\infty }\left\| c_{j',l} (\theta _0,\Delta _{t-h},\dots ,\Delta _{t-h-j'+1})\epsilon _t\epsilon _{t-j'} \right\| _{2+\nu }\nonumber \\&\qquad \times \,\left\| c_{j,l}(\theta _0,\Delta _{t},\dots ,\Delta _{t-j+1}) \epsilon _t\epsilon _{t-j}\right\| _{2+\nu }\left[ \alpha _{\Delta , \epsilon }([h/2])\right] ^{\nu /(2+\nu )}\nonumber \\&\quad \le \kappa _2\sum _{j=m+1}^{[h/2]}\sum _{j'=m+1}^{\infty }\rho ^j \rho ^{j'}\left[ \alpha _{\epsilon }^{\nu /(2+\nu )} ([h/2])+\alpha _{\Delta }^{\nu /(2+\nu )}([h/2])\right] \nonumber \\&\quad \le \kappa \rho ^{m}\left[ \alpha _{\epsilon }^{\nu /(2+\nu )} ([h/2])+\alpha _{\Delta }^{\nu /(2+\nu )}([h/2])\right] . \end{aligned}$$
(70)

By the argument used to show (69), we also have

$$\begin{aligned} \left| \text{ Cov }(Z_{t,m}^{h^+}(l),Z_{t-h,m}(l))\right| \le \kappa \rho ^h\rho ^m. \end{aligned}$$
(71)

In view of (69), (70) and (71), we obtain

$$\begin{aligned}&\sum _{h=0}^{\infty }\left| \text{ Cov }(Z_{t,m}(l),Z_{t-h,m}(l))\right| \\&\quad \le \kappa m\rho ^m+\sum _{h=m}^{\infty } \left\{ \kappa \rho ^h\rho ^m+ \kappa \rho ^{m}\left[ \alpha _{\epsilon }^{\nu /(2+\nu )}([h/2])+ \alpha _{\Delta }^{\nu /(2+\nu )}([h/2])\right] \right\} \rightarrow 0 \end{aligned}$$

as \(m\rightarrow \infty \) by (A2). This implies that

$$\begin{aligned} \sup _n\mathrm{Var}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t,m}(l)\right) \xrightarrow [m\rightarrow \infty ]{}0. \end{aligned}$$
(72)

We have the same bound for \(h<0\). The conclusion follows from (72). \(\square \)

Lemma A.5

Under the assumptions of Theorem 3.8, almost surely

$$\begin{aligned} \nabla ^2 Q_n(\theta _0) \longrightarrow J,\quad n\rightarrow \infty , \end{aligned}$$

where \(J\), given by (17), exists and is invertible.

Proof of Lemma A.5

For all l, r in \(1,\dots ,(p+q)K\), in view of Proposition 3.5, we have almost surely

$$\begin{aligned} \left| \frac{\partial ^2 }{\partial \theta _l\partial \theta _r} \left( Q_n(\theta _0)-O_n(\theta _0)\right) \right| \rightarrow 0, \text { as } n\rightarrow \infty . \end{aligned}$$

Thus \({\partial ^2 Q_n(\theta _0)}/{\partial \theta _l\partial \theta _r}\) and \({\partial ^2 O_n(\theta _0)}/{\partial \theta _l\partial \theta _r}\) have almost surely the same limit. From (8) and (12), there exists a sequence \(\left( c_{i,l,r}(\theta ,\Delta _{t-1},\dots ,\Delta _{t-i})\right) _{i\in {\mathbb {N}}}\) such that

$$\begin{aligned}&\frac{\partial ^2 \epsilon _{t}(\theta )}{\partial \theta _{l}\partial \theta _{r}}=\sum _{i=1}^{\infty }c_{i,l,r}(\theta , \Delta _{t},\dots ,\Delta _{t-i+1})\epsilon _{t-i} \text { with }{\mathbb {E}}(c_{i,l,r}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i, \nonumber \\&\quad \forall i. \end{aligned}$$
(73)

This implies that \({\partial ^2 \epsilon _{t}(\theta )}/{\partial \theta _{l}\partial \theta _{r}}\) belongs to \(L^2\). On the other hand, we have

$$\begin{aligned} \frac{\partial ^2 O_n(\theta )}{\partial \theta _l\partial \theta _r}= & {} \frac{1}{n}\sum _{t=1}^n\epsilon _t(\theta )\frac{\partial ^2 \epsilon _t(\theta )}{\partial \theta _l\partial \theta _r}+\frac{1}{n}\sum _{t=1}^n\frac{\partial \epsilon _t(\theta )}{\partial \theta _l}\frac{\partial \epsilon _t(\theta )}{\partial \theta _r}\\\longrightarrow & {} {{\mathbb {E}}}\left( \epsilon _t(\theta )\frac{\partial ^2 \epsilon _t(\theta )}{\partial \theta _l\partial \theta _r}\right) +{{\mathbb {E}}}\left( \frac{\partial \epsilon _t(\theta )}{\partial \theta _l}\frac{\partial \epsilon _t(\theta )}{\partial \theta _r}\right) ,\text { as } n\rightarrow \infty , \end{aligned}$$

by the ergodic theorem. Using the uncorrelatedness between \(\epsilon _t(\theta _0)\) and the linear past \({{\mathcal {H}}}_\epsilon (t-1)\), \(\partial \epsilon _t(\theta _0)/\partial \theta _l\in \mathcal{H}_\epsilon (t-1)\), and \(\partial ^2 \epsilon _t(\theta _0)/\partial \theta _l\partial \theta _r\in {{\mathcal {H}}}_\epsilon (t-1)\), we have

$$\begin{aligned} {{\mathbb {E}}}\left( \frac{\partial ^2 O_n(\theta _0)}{\partial \theta _l\partial \theta _r}\right) = {{\mathbb {E}}}\left( \frac{\partial \epsilon _t(\theta _0)}{\partial \theta _l}\frac{\partial \epsilon _t(\theta _0)}{\partial \theta _r}\right) =J(l,r). \end{aligned}$$
(74)

Therefore, J is the covariance matrix of \(\partial \epsilon _t(\theta _0)/\partial \theta \). If J is singular, then there exists a vector \(\varvec{c}=(c_1,\dots ,c_{(p+q)K})'\ne 0\) such that \(\varvec{c}'J\varvec{c}=0\). Thus we have

$$\begin{aligned} \sum _{k=1}^{(p+q)K}c_k\frac{\partial \epsilon _t(\theta _0)}{\partial \theta _k}=0,\,a.s. \end{aligned}$$
(75)

Differentiating both sides of (4) yields

$$\begin{aligned} -\sum _{i=1}^p(g_i^{a})^*(\Delta _t,\theta _0) X_{t-i}&=\sum _{k=1}^{(p+q)K}c_k\frac{\partial \epsilon _t(\theta _0)}{\partial \theta _k}-\sum _{j=1}^q g_j^b(\Delta _t,\theta _0)\sum _{k=1}^{(p+q)K}c_k\frac{\partial \epsilon _{t-j}(\theta _0)}{\partial \theta _k} \\&\quad -\,\sum _{j=1}^q (g_j^{b})^*(\Delta _t,\theta _0)\epsilon _{t-j}(\theta _0) \end{aligned}$$

where

$$\begin{aligned} (g_i^{a})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_i^a(\Delta _t,\theta _0)}{\partial \theta _k}\text { and } (g_j^{b})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_j^b(\Delta _t,\theta _0)}{\partial \theta _k}. \end{aligned}$$

Because (75) is satisfied for all t, we have

$$\begin{aligned} \sum _{i=1}^p(g_i^{a})^*(\Delta _t,\theta _0) X_{t-i}=\sum _{j=1}^q (g_j^{b})^*(\Delta _t,\theta _0)\epsilon _{t-j}(\theta _0). \end{aligned}$$

The latter equation yields an ARMARC\((p-1,q-1)\) representation at best. The identifiability assumption (see Proposition 3.1) excludes the existence of such a representation.

Thus

$$\begin{aligned}&(g_i^{a})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_i^a(\Delta _t,\theta _0)}{\partial \theta _k}=0\text { and } \\&(g_j^{b})^*(\Delta _t,\theta _0)=\sum _{k=1}^{(p+q)K}c_k\frac{\partial g_j^b(\Delta _t,\theta _0)}{\partial \theta _k}=0 \end{aligned}$$

and the conclusion follows. \(\square \)

Proof of Theorem 3.8

For all \(i,j,k=1,\dots ,K(p+q)\) we have

$$\begin{aligned} \frac{\partial ^3O_n(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}&= \frac{1}{n}\sum _{t=1}^n\left\{ \epsilon _t(\theta )\frac{\partial ^3\epsilon _t (\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k} \right\} +\frac{1}{n}\sum _{t=1}^n\left\{ \frac{\partial \epsilon _t(\theta )}{\partial \theta _i} \frac{\partial ^2\epsilon _t(\theta )}{\partial \theta _j\partial \theta _k} \right\} \\&\quad +\frac{1}{n}\sum _{t=1}^n\left\{ \frac{\partial ^2\epsilon _t(\theta )}{\partial \theta _i\partial \theta _j}\frac{\partial \epsilon _t(\theta )}{\partial \theta _k}\right\} +\frac{1}{n}\sum _{t=1}^n \left\{ \frac{\partial \epsilon _t(\theta )}{\partial \theta _j} \frac{\partial ^2\epsilon _t(\theta )}{\partial \theta _i\partial \theta _k}\right\} . \end{aligned}$$

Using the ergodic theorem, the Cauchy-Schwarz inequality and Lemma 3.4, we obtain

$$\begin{aligned} \sup _n\sup _{\theta \in \Theta }\left| \frac{\partial ^3O_n(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}\right| <+\infty . \end{aligned}$$
(76)

In view of Proposition 3.5, we have almost surely

$$\begin{aligned} \sup _{\theta \in \Theta }\left| \frac{\partial ^3 }{\partial \theta _i\partial \theta _j\partial \theta _k} \left( Q_n(\theta )-O_n(\theta )\right) \right| \longrightarrow 0, \text { as } n\rightarrow \infty . \end{aligned}$$

Thus \({\partial ^3 Q_n(\theta )}/{\partial \theta _i\partial \theta _j\partial \theta _k}\) and \({\partial ^3 O_n(\theta )}/{\partial \theta _i\partial \theta _j\partial \theta _k}\) have almost surely the same asymptotic behaviour. In view of Theorem 3.6 and (A4), we have almost surely \({\hat{\theta }}_n\longrightarrow \theta _0\in {\mathop {\Theta }\limits ^{\circ }}\). Thus \(\nabla Q_n({\hat{\theta }}_n)=0_{{\mathbb {R}}^{(p+q)K}}\) for sufficiently large n, and a Taylor expansion gives for all \(r\in \{1,\ldots ,(p+q)K \}\),

$$\begin{aligned} 0=\sqrt{n} \frac{\partial }{\partial \theta _r} Q_n(\theta _0) + \nabla \frac{\partial }{\partial \theta _r} Q_n(\theta _{n,r}^*) \sqrt{n}\left( {\hat{\theta }}_n-\theta _0\right) , \end{aligned}$$
(77)

where \(\theta _{n,r}^*\) lies on the segment in \({\mathbb {R}}^{(p+q)K}\) with endpoints \({\hat{\theta }}_n\) and \(\theta _0\). Using again a Taylor expansion, Theorem 3.7 and (76), we obtain for all \(l=1,\dots ,(p+q)K\),

$$\begin{aligned} \left| \frac{\partial ^2 Q_n(\theta _{n,r}^*)}{\partial \theta _l\partial \theta _r}-\frac{\partial ^2 Q_n(\theta _0)}{\partial \theta _l\partial \theta _r}\right|\le & {} \sup _n\sup _{\theta \in \Theta }\left\| \nabla \left( \frac{\partial ^2 }{\partial \theta _l\partial \theta _r}Q_n(\theta )\right) \right\| \left\| \theta _{n,r}^*-\theta _0\right\| \\\longrightarrow & {} 0 \text { a.s. as }n\rightarrow \infty . \end{aligned}$$

This, along with (77), implies that, as \(n\rightarrow \infty \)

$$\begin{aligned} \sqrt{n}\left( {\hat{\theta }}_n-\theta _0\right) =-\left[ \nabla ^2 Q_n(\theta _0) \right] ^{-1}\sqrt{n}\frac{\partial Q_n(\theta _0)}{\partial \theta }+o_{{\mathbb {P}}}(1). \end{aligned}$$

From Lemmas A.3 and A.5, we obtain that \(\sqrt{n}({\hat{\theta }}_n-\theta _0)\) has a limiting normal distribution with mean 0 and covariance matrix \(J^{-1}IJ^{-1}\). \(\square \)
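In applications, the limiting covariance \(J^{-1}IJ^{-1}\) is estimated by plugging in consistent estimates \({\hat{J}}\) and \({\hat{I}}\) (for \({\hat{I}}\), see the estimator studied in Theorem 3.10). A minimal sketch of the sandwich assembly, with hypothetical placeholder matrices:

```python
import numpy as np

def sandwich_cov(J_hat, I_hat, n):
    # Estimated covariance of theta_hat, from the limiting law of
    # sqrt(n)(theta_hat - theta_0): (1/n) * J^{-1} I J^{-1}.
    J_inv = np.linalg.inv(J_hat)
    return J_inv @ I_hat @ J_inv / n

# Hypothetical 2x2 plug-in estimates, for illustration only.
J_hat = np.array([[2.0, 0.3], [0.3, 1.5]])
I_hat = np.array([[3.1, 0.4], [0.4, 2.2]])
print(sandwich_cov(J_hat, I_hat, n=1000))
```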

1.5 A.5. Proof of Theorem 3.10

The proof of Theorem 3.10 is based on a series of lemmas.

Consider the regression of \(\Upsilon _t\) on \(\Upsilon _{t-1},\dots ,\Upsilon _{t-r}\) defined by

$$\begin{aligned} \Upsilon _t=\sum _{i=1}^{r}\Phi _{r,i}\Upsilon _{t-i}+u_{r,t},\qquad \end{aligned}$$
(78)

where \(u_{r,t}\) is orthogonal to \(\left\{ \Upsilon _{t-1},\dots ,\Upsilon _{t-r}\right\} \) for the \(L^2\) inner product. If \(\Upsilon _{1},\dots ,\Upsilon _{n}\) were observed, the least squares estimators of \(\underline{{{\varvec{\Phi }}}}_{r}=\left( \Phi _{r,1}\cdots \Phi _{r,r}\right) \) and \(\Sigma _{u_r}=\text{ Var }(u_{r,t})\) would be given by

$$\begin{aligned} \underline{\breve{{\varvec{\Phi }}}}_{r}={\hat{\Sigma }}_{{\Upsilon }, \underline{{\Upsilon }}_{r}} {\hat{\Sigma }}_{\underline{{\Upsilon }}_{r}}^{-1}\qquad \text{ and }\qquad {\hat{\Sigma }}_{\breve{u}_r}=\frac{1}{n}\sum _{t=1}^n \left( {\Upsilon }_t-\underline{\breve{{\varvec{\Phi }}}}_{r} \underline{{\Upsilon }}_{r,t}\right) \left( {\Upsilon }_t-\underline{\breve{{\varvec{\Phi }}}}_{r} \underline{{\Upsilon }}_{r,t}\right) ' \end{aligned}$$

where \(\underline{{\Upsilon }}_{r,t}=({\Upsilon }_{t-1}' \cdots {\Upsilon }_{t-r}')'\),

$$\begin{aligned} {\hat{\Sigma }}_{{\Upsilon },\underline{{\Upsilon }}_{r}}= \frac{1}{n}\sum _{t=1}^n{\Upsilon }_t\underline{{\Upsilon }}_{r,t}',\qquad {\hat{\Sigma }}_{\underline{{\Upsilon }}_{r}}= \frac{1}{n}\sum _{t=1}^n\underline{{\Upsilon }}_{r,t} \underline{{\Upsilon }}_{r,t}', \end{aligned}$$

with the convention \({\Upsilon }_t=0\) for \(t\le 0\), and assuming that \({\hat{\Sigma }}_{\underline{{\Upsilon }}_{r}}\) is nonsingular (which holds asymptotically).

In practice, we only observe \(X_1,\dots ,X_n\). The residuals \({{\hat{\epsilon }}}_t:=e_t({{\hat{\theta }}}_n)\) are then available for \(t=1,\dots ,n\), as are the vectors \({{\hat{\Upsilon }}}_t\) obtained by replacing \(\theta _0\) by \({\hat{\theta }}_n\) in (19). We therefore define the least squares estimators of \(\underline{{{\varvec{\Phi }}}}_{r}=\left( \Phi _{r,1}\cdots \Phi _{r,r}\right) \) and \(\Sigma _{u_r}=\text{ Var }(u_{r,t})\) by

$$\begin{aligned} \underline{\hat{{\varvec{\Phi }}}}_{r}={\hat{\Sigma }}_{{\hat{\Upsilon }}, \underline{{\hat{\Upsilon }}}_{r}} {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}\qquad \text{ and } \qquad {\hat{\Sigma }}_{{\hat{u}}_r}=\frac{1}{n}\sum _{t=1}^n \left( {\hat{\Upsilon }}_t-\underline{\hat{{\varvec{\Phi }}}}_{r} \underline{{\hat{\Upsilon }}}_{r,t}\right) \left( {\hat{\Upsilon }}_t-\underline{\hat{{\varvec{\Phi }}}}_{r} \underline{{\hat{\Upsilon }}}_{r,t}\right) ' \end{aligned}$$

where \(\underline{{\hat{\Upsilon }}}_{r,t}=({\hat{\Upsilon }}_{t-1}' \cdots {\hat{\Upsilon }}_{t-r}')'\),

$$\begin{aligned} {\hat{\Sigma }}_{{\hat{\Upsilon }},\underline{{\hat{\Upsilon }}}_{r}}= \frac{1}{n}\sum _{t=1}^n{\hat{\Upsilon }}_t\underline{{\hat{\Upsilon }}}_{r,t}',\qquad {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}= \frac{1}{n}\sum _{t=1}^n\underline{{\hat{\Upsilon }}}_{r,t} \underline{{\hat{\Upsilon }}}_{r,t}', \end{aligned}$$

with the convention \({\hat{\Upsilon }}_t=0\) for \(t\le 0\), and assuming that \({\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}\) is nonsingular (which holds asymptotically).
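Computationally, \(\underline{\hat{{\varvec{\Phi }}}}_{r}\) and \({\hat{\Sigma }}_{{\hat{u}}_r}\) amount to an ordinary least squares regression of \({\hat{\Upsilon }}_t\) on its \(r\) lagged values. A minimal sketch, assuming the \({\hat{\Upsilon }}_t\) are already available as the rows of an \(n\times (p+q)K\) array (the function name and toy input are illustrative, not part of the paper):

```python
import numpy as np

def fit_long_var(U, r):
    """Least squares fit of U_t on (U_{t-1}, ..., U_{t-r}).

    U is the (n, d) array of the hat-Upsilon_t, d = (p+q)K, assumed already
    computed from the residuals; rows with t <= 0 are set to zero, matching
    the convention in the text.
    """
    n, d = U.shape
    Upad = np.vstack([np.zeros((r, d)), U])
    # Regressor underline-Upsilon_{r,t} = (U_{t-1}', ..., U_{t-r}')', t = 1..n
    X = np.hstack([Upad[r - i:r - i + n] for i in range(1, r + 1)])
    S_xy = U.T @ X / n                 # hat-Sigma_{Upsilon, underline-Upsilon_r}
    S_xx = X.T @ X / n                 # hat-Sigma_{underline-Upsilon_r}
    Phi = S_xy @ np.linalg.inv(S_xx)   # hat-Phi_r, shape (d, d*r)
    resid = U - X @ Phi.T
    return Phi, resid.T @ resid / n    # hat-Phi_r and hat-Sigma_{u_r}

# Toy call on white noise, for shape checking only.
rng = np.random.default_rng(1)
Phi_hat, Sigma_u_hat = fit_long_var(rng.standard_normal((500, 2)), r=3)
print(Phi_hat.shape, Sigma_u_hat.shape)   # (2, 6) (2, 2)
```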

We now make the matrix norm defined at the end of Sect. 2 more specific: in the sequel we use the multiplicative matrix norm defined by

$$\begin{aligned} \Vert A\Vert =\sup _{\Vert x\Vert \le 1}\Vert Ax\Vert =\varrho ^{1/2}(A'{\bar{A}}), \end{aligned}$$
(79)

where A is a \({\mathbb {C}}^{d_1\times d_2}\) matrix, \(\Vert x\Vert ^2=x' {\bar{x}}\) is the Euclidean norm of the vector \(x\in {\mathbb {C}}^{d_2\times 1}\), and \(\varrho (\cdot )\) denotes the spectral radius. This norm satisfies

$$\begin{aligned} \Vert A\Vert ^2\le \sum _{i,j}a_{i,j}^2, \text { when }A \text { is a }{\mathbb {R}}^{d_1\times d_2}\text { matrix} \end{aligned}$$
(80)

with obvious notation. This choice of the norm is crucial for the following lemma to hold (with e.g. the Euclidean norm, this result is not valid).
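For a real matrix, the norm (79) equals the largest singular value, and (80) says that it is dominated by the Frobenius norm. A quick numerical check (the matrix below is an arbitrary illustration; NumPy's matrix 2-norm is exactly this operator norm):

```python
import numpy as np

A = np.array([[1.0, 2.0], [0.0, 3.0]])
spectral = np.linalg.norm(A, 2)       # largest singular value, i.e. rho^{1/2}(A'A)
frobenius = np.sqrt((A ** 2).sum())   # right-hand side of the bound (80)
print(spectral, frobenius, spectral <= frobenius)
```

Let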

$$\begin{aligned} {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}= & {} {\mathbb {E}}{\Upsilon }_{t} {\underline{\Upsilon }}_{r,t}', \quad {\Sigma }_{{\Upsilon }}={\mathbb {E}}{\Upsilon }_{t}{\Upsilon }_{t}', \quad {\Sigma }_{{\underline{\Upsilon }}_{r}}= {\mathbb {E}}{\underline{\Upsilon }}_{r,t}{\underline{\Upsilon }}_{r,t}',\quad {\hat{\Sigma }}_{{\hat{\Upsilon }}}= \frac{1}{n}\sum _{t=1}^n{\hat{\Upsilon }}_t{\hat{\Upsilon }}_{t}'. \end{aligned}$$

In the sequel, \(C\) and \(\rho \) denote generic constants such that \(C>0\) and \(\rho \in (0,1)\), whose exact values are unimportant.

Lemma A.6

Under the assumptions of Theorem 3.10,

$$\begin{aligned} \sup _{r\ge 1}\max \left\{ \left\| {\Sigma }_{{\Upsilon }, {\underline{\Upsilon }}_{r}}\right\| ,\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| , \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \right\} < \infty . \end{aligned}$$

Proof

The proof is an extension of Section 5.2 of Grenander and Szegö (1958). We readily have

$$\begin{aligned} \Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}x\Vert \le \Vert {\Sigma }_{{\underline{\Upsilon }}_{r+1}}(x', 0_{(p+q)K}')'\Vert \quad \text{ and } \quad \Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}x\Vert \le \Vert {\Sigma }_{{\underline{\Upsilon }}_{r+1}}(0_{(p+q)K}',x')'\Vert \end{aligned}$$

for any \(x\in {\mathbb {R}}^{K(p+q)r}\) and \(0_{(p+q)K}=(0,\dots ,0)'\in {\mathbb {R}}^{(p+q)K}\). Therefore

$$\begin{aligned} 0<\left\| \text{ Var }\left( {\Upsilon }_{t}\right) \right\| = \left\| {\Sigma }_{{\underline{\Upsilon }}_{1}}\right\| \le \left\| {\Sigma }_{{\underline{\Upsilon }}_{2}}\right\| \le \cdots \end{aligned}$$

and

$$\begin{aligned} \left\| {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\right\| \le \left\| {\Sigma }_{{\underline{\Upsilon }}_{r+1}}\right\| , \end{aligned}$$

so that, to prove the result, it suffices to show that \( \sup _{r\ge 1}\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \) and \(\sup _{r\ge 1}\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \) are finite. Let us write the matrix \({\Sigma }_{{\underline{\Upsilon }}_{r}}\) in blockwise form

$$\begin{aligned} {\Sigma }_{{\underline{\Upsilon }}_{r}}=\left[ C(i-j)\right] _{i,j=1,\ldots ,r},\quad C(k)={{\mathbb {E}}}(\Upsilon _{0}\Upsilon _{k}')\in {\mathbb {R}}^{K(p+q)\times K(p+q)},\ k\in {\mathbb {Z}}. \end{aligned}$$

Let now \(f:{\mathbb {R}}\longrightarrow {\mathbb {C}}^{K(p+q)\times K(p+q)}\) be the spectral density of \((\Upsilon _t)_{t\in {\mathbb {Z}}}\) defined by

$$\begin{aligned} f(\omega )=\frac{1}{2\pi } \sum _{k=-\infty }^\infty C(k) e^{i\omega k},\quad \omega \in {\mathbb {R}}. \end{aligned}$$

A direct consequence of (19) and Lemma A.2 is that the sequence \((C(k))_{k\in {\mathbb {Z}}}\) is absolutely summable, so that \(f\) is well defined with \(\sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert <+\infty \), for any norm \(\Vert \cdot \Vert \) on \({\mathbb {C}}^{K(p+q)\times K(p+q)}\) (in particular, one which is independent of \(r\ge 1\)). Another consequence is that we have the inversion formula

$$\begin{aligned} C(k)=\int _{-\pi }^\pi f(x) e^{-ikx}dx,\quad \forall k\in {\mathbb {Z}}. \end{aligned}$$
(81)

Last, it is easy to check that \(f(\omega )\) is a Hermitian matrix for all \(\omega \in {\mathbb {R}}\), i.e. \(\overline{f(\omega )}=f(\omega ) '\), where \({\bar{z}}\) denotes the conjugate of any vector or matrix \(z\) with entries in \({\mathbb {C}}\). Let then \(\delta ^{(r)}=\left( {\delta ^{(r)}_1} ',\ldots ,{\delta ^{(r)}_r} '\right) '\in {\mathbb {R}}^{rK(p+q)\times 1}\) be an eigenvector of \({\Sigma }_{{{\underline{\Upsilon }}_{r}}}\), with \(\delta ^{(r)}_j \in {\mathbb {R}}^{K(p+q)\times 1}\), \(j=1,\ldots ,r\), such that \(\Vert {\delta ^{(r)}}\Vert =1\) and

$$\begin{aligned} {\delta ^{(r)}} ' {\Sigma }_{{{\underline{\Upsilon }}_{r}}} \delta ^{(r)}= \Vert {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\Vert =\varrho \left( {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\right) , \end{aligned}$$
(82)

where \(\Vert {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\Vert \) is the norm of matrix \({\Sigma }_{{{\underline{\Upsilon }}_{r}}}\) defined in (79). We then check that

$$\begin{aligned} {\delta ^{(r)}} ' {\Sigma }_{{{\underline{\Upsilon }}_{r}}} \delta ^{(r)}= & {} \sum _{i,j=1}^r {\delta ^{(r)}_i} ' C(i-j) {\delta ^{(r)}_j} \nonumber \\= & {} \int _{-\pi }^\pi \left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) ' f(x) \overline{\left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) }dx , \end{aligned}$$
(83)

the last equality being a direct consequence of (81). Since \(f(x)\) is Hermitian, \((X,Y)\in {\mathbb {C}}^{K(p+q)\times 1}\times {\mathbb {C}}^{K(p+q)\times 1}\mapsto X' f(x) {\bar{Y}}\) defines a positive semi-definite sesquilinear form; hence we have for all \(x\in {\mathbb {R}}\) and \(X\in {\mathbb {C}}^{K(p+q)\times 1}\):

$$\begin{aligned} 0\le X' f(x) {\bar{X}}\le \Vert f(x) \Vert \cdot X'{\bar{X}} \le \sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert \cdot X'{\bar{X}} . \end{aligned}$$

Let us point out that \(\sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert \) is a quantity independent of \(r\ge 1\). We deduce from (83) and the previous inequality that

$$\begin{aligned} {\delta ^{(r)}} ' {\Sigma }_{{{\underline{\Upsilon }}_{r}}} \delta ^{(r)}\le \sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert \int _{-\pi }^\pi \left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) ' \overline{\left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) }dx . \end{aligned}$$
(84)

A short computation yields that

$$\begin{aligned} \frac{1}{2\pi }\int _{-\pi }^\pi \left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) ' \overline{\left( \sum _{m=1}^r \delta ^{(r)}_m e^{i(m-1)x}\right) }dx = \sum _{m=1}^r {\delta ^{(r)}_m} ' \delta ^{(r)}_m=\Vert {\delta ^{(r)}}\Vert ^2=1, \end{aligned}$$

which, coupled with (82) and (84), yields \( \Vert {\Sigma }_{{{\underline{\Upsilon }}_{r}}}\Vert \le 2\pi \sup _{\omega \in {\mathbb {R}}}\Vert f(\omega )\Vert <+\infty \), an upper bound independent of \(r\ge 1\). By similar arguments, the smallest eigenvalue of \( {\Sigma }_{{\underline{\Upsilon }}_{r}}\) is bounded below by a positive constant independent of \(r\). Since \(\Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\Vert \) equals the inverse of the smallest eigenvalue of \( {\Sigma }_{{\underline{\Upsilon }}_{r}}\), the proof is complete. \(\square \)
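To see the Grenander–Szegö bound at work numerically, one can check it in the scalar case (the AR(1) autocovariance below is an assumption for illustration; in the paper \(\Upsilon _t\) is \(K(p+q)\)-dimensional and \({\Sigma }_{{\underline{\Upsilon }}_{r}}\) is block Toeplitz):

```python
import numpy as np

phi = 0.6
C = lambda k: phi ** abs(k) / (1 - phi ** 2)   # AR(1) autocovariances, unit innovations
f_sup = 1 / (2 * np.pi * (1 - phi) ** 2)       # max of the AR(1) spectral density (at 0)
f_inf = 1 / (2 * np.pi * (1 + phi) ** 2)       # min of the spectral density (at pi)
for r in (5, 20, 80):
    # Toeplitz covariance matrix of r consecutive observations
    Sigma_r = np.array([[C(i - j) for j in range(r)] for i in range(r)])
    eig = np.linalg.eigvalsh(Sigma_r)
    # Eigenvalues stay in [2*pi*inf f, 2*pi*sup f], uniformly in r
    print(r, eig.max() <= 2 * np.pi * f_sup, eig.min() >= 2 * np.pi * f_inf)
```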

The following lemma is necessary in the sequel.

Lemma A.7

Let us suppose that (A1), the stationarity condition (A5a) for \(\nu =6\), and

(A6):
$$\begin{aligned} \limsup _{t\rightarrow \infty }\frac{1}{t}\ln {\mathbb {E}}\left( \sup _{\theta \in \Theta } \left| \left| \prod _{i=1}^t \Phi (\Delta _i,\theta )\right| \right| ^{32}\right)<0,\quad \limsup _{t\rightarrow \infty }\frac{1}{t}\ln {\mathbb {E}}\left( \left| \left| \prod _{i=1}^t \Psi (\Delta _i)\right| \right| ^{32}\right) <0 \end{aligned}$$

hold, and assume that \(\epsilon _t\in L^{4\nu +8}\). Then the sequences \((\epsilon _t(\theta ))_{t\in {\mathbb {Z}}}\) and \((e_t(\theta ))_{t\in {\mathbb {Z}}}\) satisfy:

  1.

    \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _0(\theta )|\right| \right| _{16}<+\infty \) and \(\sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } |e_t(\theta )|\right| \right| _{16}<+\infty \),

  2.

    \(\left| \left| \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\right| \right| _4\) tends to 0 exponentially fast as \(t\rightarrow \infty \),

  3.

    For all \(\alpha >0\), \(t^\alpha \sup _{\theta \in \Theta } |\epsilon _t(\theta )-e_t(\theta )|\longrightarrow 0\) a.s. as \(t\rightarrow \infty \),

  4.

    For all \(j=1, 2,3\), \(\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j\epsilon _0(\theta )||\right| \right| _{16}<+\infty \), \(\sup _{t\ge 0}\left| \left| \sup _{\theta \in \Theta } ||\nabla ^j e_t(\theta )||\right| \right| _{16}<+\infty \) and we have \(t^\alpha \left| \left| \sup _{\theta \in \Theta } ||\nabla (e_t- \epsilon _t)(\theta )||\right| \right| _{16/5}\longrightarrow 0\) , as \(t\rightarrow \infty \) for all \(\alpha >0\).

Proof of Lemma A.7

The proof is similar to the proofs of Lemmas 3.3 and 3.4. \(\square \)

Denote by \(\Upsilon _t(i)\) the i-th element of \(\Upsilon _t.\)

Lemma A.8

Let \((\epsilon _t)\) be a sequence of centered and uncorrelated variables, with \({{\mathbb {E}}}\left| \epsilon _t\right| ^{8+4\nu }<\infty \) and \(\sum _{h=0}^\infty \left[ \alpha _\epsilon (h)\right] ^{\nu /(2+\nu )}<\infty \) for some \(\nu >0\). Then there exists a finite constant \(C_1\) such that for \(m_1, m_2=1,\dots ,(p+q)K\) and all \(s\in {\mathbb {Z}}\),

$$\begin{aligned} \sum _{h=-\infty }^{\infty }\left| \text{ Cov }\left\{ \Upsilon _{1}(m_1) \Upsilon _{1+s}(m_2), \Upsilon _{1+h}(m_1)\Upsilon _{1+s+h}(m_2)\right\} \right| <C_1. \end{aligned}$$

Proof

Recall that

$$\begin{aligned} \frac{\partial \epsilon _t(\theta _0)}{\partial \theta _l}= & {} \sum _{i=0}^\infty c_{i,l}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1}) \epsilon _{t-i},\text { for }l=1,\dots ,(p+q)K, \end{aligned}$$
(85)

where \(c_i(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})\) is defined by (9) and \(c_{i,l}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})=\partial c_i(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})/{\partial \theta _l}\), and with the following upper bound holding thanks to (13):

$$\begin{aligned} {\mathbb {E}}\sup _{\theta \in \Theta }(c_i(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i \text { and }{\mathbb {E}}\sup _{\theta \in \Theta }( c_{i,l}(\theta ,\Delta _{t},\dots ,\Delta _{t-i+1}))^2\le C\rho ^i,\quad \forall i. \end{aligned}$$

Let

$$\begin{aligned}&\gamma _{i,j,i',j',s,h}(m_1,m_2)(\theta _0) \nonumber \\&\quad = {{\mathbb {E}}}\left[ c_{i,m_1}(\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}) \right. \nonumber \\&\quad \qquad \left. \times \,c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right] \nonumber \\&\qquad \times \,\text{ Cov }\left( \epsilon _{t}\epsilon _{t-i} \epsilon _{t+s}\epsilon _{t+s-j},\epsilon _{t+h}\epsilon _{t+h-i'} \epsilon _{t+s+h}\epsilon _{t+s+h-j'}\right) \nonumber \\&\qquad +\, \text{ Cov }\left( c_{i,m_1} (\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \nonumber \\&\qquad \quad \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right) \nonumber \\&\qquad \times \,{{\mathbb {E}}}\left[ \epsilon _{t}\epsilon _{t-i} \epsilon _{t+s}\epsilon _{t+s-j}\right] {{\mathbb {E}}}\left[ \epsilon _{t+h} \epsilon _{t+h-i'}\epsilon _{t+s+h}\epsilon _{t+s+h-j'}\right] . \end{aligned}$$
(86)

The Cauchy-Schwarz inequality implies that

$$\begin{aligned}&\left| {{\mathbb {E}}}[c_{i,m_1}(\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}) \right. \nonumber \\&\qquad \left. \times \,c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})]\right| \nonumber \\&\quad \le C\rho ^{i+j+i'+j'}. \end{aligned}$$
(87)

In view of (85) and (86), we have

$$\begin{aligned}&\sum _{h=-\infty }^{\infty }\text{ Cov }\left\{ \Upsilon _{1}(m_1)\Upsilon _{1+s}(m_2), \Upsilon _{1+h}(m_1)\Upsilon _{1+s+h}(m_2)\right\} \\&\quad = \sum _{h=-\infty }^{\infty }\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \gamma _{i,j,i',j',s,h}(m_1,m_2)(\theta _0). \end{aligned}$$

Without loss of generality, we can take the supremum over the integers \(s>0\), and consider the sum for positive h. Let \(m_0=m_1\wedge m_2\) and \(Y_{t,h_1} = \epsilon _{t}\epsilon _{t-h_1}-{\mathbb {E}}(\epsilon _{t}\epsilon _{t-h_1})\). We first suppose that \(h\ge 0\). It follows that

$$\begin{aligned}&\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( c_{i,m_1} (\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1})c_{j,m_2} (\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \\&\qquad \qquad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})\right) \right| \\&\quad \le v_1+v_2+v_3+v_4+v_5, \end{aligned}$$

where

$$\begin{aligned} v_1=v_1(h)= & {} \sum _{i>[h/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2} \right) \right| ,\\ v_2=v_2(h)= & {} \sum _{i=0}^\infty \sum _{j>[h/2]}\sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \\ v_3=v_3(h)= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'>[h/2]} \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| ,\\ v_4=v_4(h)= & {} \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'>[h/2]}\left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| , \\ v_5=v_5(h)= & {} \sum _{i=0}^{[h/2]}\sum _{j=0}^{[h/2]} \sum _{i'=0}^{[h/2]}\sum _{j'=0}^{[h/2]}\left| \text{ Cov } \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2} \right) \right| , \end{aligned}$$

and where

$$\begin{aligned} {\mathbf {c}}^{t}_{i_1,m}=c_{i_1,m}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i_1+1}). \end{aligned}$$

One immediate remark is that \({\mathbf {c}}^{t}_{i_1,m}\) is measurable with respect to \(\Delta _r\), \(r\in \{ t,\ldots ,t-i_1+1\}\). Since

$$\begin{aligned}&\left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \le C\rho ^{i+i'+j+j'}, \end{aligned}$$

we have

$$\begin{aligned} v_1= & {} \sum _{i>[h/2]}\sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1} {\mathbf {c}}^{t+s}_{j,m_2},{\mathbf {c}}^{t+h}_{i',m_1} {\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \le \kappa _1\rho ^{h/2}, \end{aligned}$$

for some positive constant \(\kappa _1\). Using the same arguments we obtain that \(v_i\), \(i=2,3,4\) are bounded by \(\kappa _i\rho ^{h/2}\). The \(\alpha -\)mixing property (see Theorem 14.1 in Davidson 1994, p. 210) and Lemmas A.1 and A.7, entail that

$$\begin{aligned} v_5= & {} \sum _{i=0}^{[h/2]}\sum _{j=0}^{[h/2]}\sum _{i'=0}^{[h/2]} \sum _{j'=0}^{[h/2]}\left| \text{ Cov }\left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right| \\\le & {} \sum _{k=1}^{4}\sum _{(i,j,i',j')\in {\mathcal {C}}_k}\kappa _6 \left\| {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\right\| _{2+\nu } \left\| {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right\| _{2+\nu } \\&\left\{ \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right\} ^{\nu /(2+\nu )}, \end{aligned}$$

where \(\alpha (U,V)\) denotes the strong mixing coefficient between the \(\sigma -\)field generated by the random variable U and that generated by V and where

$$\begin{aligned} {\mathcal {C}}_1= {\mathcal {C}}_1(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\ge j-s,\;j'\le i'+s\right\} ,\\ {\mathcal {C}}_2= {\mathcal {C}}_2(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\ge j-s,\;j'\ge i'+s\right\} ,\\ {\mathcal {C}}_3= {\mathcal {C}}_3(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\le j-s,\;j'\le i'+s\right\} ,\\ {\mathcal {C}}_4= {\mathcal {C}}_4(h)= & {} \left\{ (i,j,i',j')\in \{0,1,\dots , [h/2]\}^4: i\le j-s,\;j'\ge i'+s\right\} . \end{aligned}$$

We check easily that \({\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\) and \({\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\) are respectively measurable with respect to \(\Delta _r\), \(r\in \{t-i+1,\ldots ,t+s\}\) and \(\Delta _r\), \(r\in \{t-i'+h+1,\ldots ,t+h+s\}\) when \((i,j,i',j')\in {\mathcal {C}}_1\). We have \(t-i+1\le t+s-j+1\), \(t+h-i'+1\le t+h+s-j'+1\) and we thus deduce that

$$\begin{aligned} \left| \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right|\le & {} \alpha _{\Delta }\left( h-i'-s+1\right) ,\quad \forall h\ge i'+s-1, \\ \left| \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right|\le & {} \alpha _{\Delta }\left( -i-h-s+1\right) ,\quad \forall h\le -i-s+1, \\ \left| \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right|\le & {} \alpha _{\Delta }\left( 0\right) \le 1/4,\quad \forall h= -i-s+1,\dots ,i'+s-1. \end{aligned}$$

Note also that, by the Hölder inequality,

$$\begin{aligned} \left\| {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\right\| _{2+\nu } \le \left\| {\mathbf {c}}^{t}_{i,m_1}\right\| _{4+2\nu }\left\| {\mathbf {c}}^{t+s}_{j,m_2}\right\| _{4+2\nu }\le C\rho ^{i+j}. \end{aligned}$$

Therefore

$$\begin{aligned}&\sum _{h=0}^\infty \sum _{(i,j,i',j')\in {\mathcal {C}}_1}\left\| {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}\right\| _{2+\nu } \left\| {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2} \right\| _{2+\nu } \left\{ \alpha \left( {\mathbf {c}}^{t}_{i,m_1}{\mathbf {c}}^{t+s}_{j,m_2}, {\mathbf {c}}^{t+h}_{i',m_1}{\mathbf {c}}^{t+h+s}_{j',m_2}\right) \right\} ^{\nu /(2+\nu )}, \\&\quad \le C^2\sum _{i,j,i',j'=0}^\infty \rho ^{i+j+i'+j'}\left( i'+2s-1+i+\sum _{r=0}^{\infty } \alpha _{\Delta }^{\nu /(2+\nu )}\left( r \right) \right) <\infty . \end{aligned}$$

Continuing in this way, we obtain that \(\sum _{h=0}^{\infty }v_5(h)<\infty \). It follows that

$$\begin{aligned}&\sum _{h=0}^{\infty }\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov } \left( c_{i,m_1}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1}) c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \nonumber \\&\qquad \qquad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right) \right| \nonumber \\&\quad \le \sum _{h=0}^{\infty } \sum _{i=1}^5 v_i(h)<\infty . \end{aligned}$$
(88)

The same bound clearly holds for

$$\begin{aligned}&\sum _{h=-\infty }^{0}\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov } \left( c_{i,m_1}(\theta _0,\Delta _{t-1},\dots ,\Delta _{t-i}) c_{j,m_2}(\theta _0,\Delta _{t+s-1},\dots ,\Delta _{t+s-j}), \right. \right. \\&\quad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h},\dots , \Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h},\dots , \Delta _{t+s+h-j'+1})\right) \right| <\infty , \end{aligned}$$

which shows that

$$\begin{aligned}&\sum _{h=-\infty }^{\infty }\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \left| \text{ Cov } \left( c_{i,m_1}(\theta _0,\Delta _{t},\dots ,\Delta _{t-i+1}) c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \\&\quad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h}, \dots ,\Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})\right) \right| <\infty . \end{aligned}$$

A slight extension of Corollary A.3 in Francq and Zakoïan (2010) shows that

$$\begin{aligned} \sum _{h=-\infty }^{\infty }\left| \text{ Cov }\left( Y_{1,i}Y_{1+s,j}, Y_{1+h,i'}Y_{1+s+h,j'}\right) \right| <\infty . \end{aligned}$$
(89)

Because, by the Cauchy–Schwarz inequality,

$$\begin{aligned} \left| {{\mathbb {E}}}\left[ \epsilon _{t}\epsilon _{t-i} \epsilon _{t+s}\epsilon _{t+s-j}\right] \right| \le {{\mathbb {E}}}\left| \epsilon _t\right| ^4<\infty \end{aligned}$$

where the last bound holds by the assumption \({{\mathbb {E}}}\left| \epsilon _t\right| ^{8+4\nu }<\infty \), it follows, in view of (87), that

$$\begin{aligned}&\sum _{h=-\infty }^{\infty }\left| \text{ Cov }\left\{ \Upsilon _{1}(m_1)\Upsilon _{1+s}(m_2), \Upsilon _{1+h}(m_1)\Upsilon _{1+s+h}(m_2)\right\} \right| \\&\quad \le \kappa \sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \rho ^{i+j+i'+j'}\sum _{h=-\infty }^{\infty } \left| \text{ Cov }\left( Y_{1,i}Y_{1+s,j}, Y_{1+h,i'}Y_{1+s+h,j'}\right) \right| \\&\qquad +\, \kappa '\sum _{i=0}^\infty \sum _{j=0}^\infty \sum _{i'=0}^\infty \sum _{j'=0}^\infty \sum _{h=-\infty }^{\infty } \\&\qquad \left| \text{ Cov }\left( c_{i,m_1}(\theta _0,\Delta _{t},\dots , \Delta _{t-i+1})c_{j,m_2}(\theta _0,\Delta _{t+s},\dots ,\Delta _{t+s-j+1}), \right. \right. \\&\qquad \quad \left. \left. c_{i',m_1}(\theta _0,\Delta _{t+h}, \dots ,\Delta _{t+h-i'+1})c_{j',m_2}(\theta _0,\Delta _{t+s+h}, \dots ,\Delta _{t+s+h-j'+1})\right) \right| \end{aligned}$$

The conclusion follows from (88) and (89). \(\square \)

Let \({\hat{\Sigma }}_{\Upsilon }\) be the matrix obtained by replacing \({\hat{\Upsilon }}_t\) by \(\Upsilon _t\) in \({\hat{\Sigma }}_{{\hat{\Upsilon }}}\).

Lemma A.9

Under the assumptions of Theorem 3.10, \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert \), \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon }}- {\Sigma }_{{\Upsilon }}\Vert ,\) and \(\sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon },{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert \) tend to zero in probability as \(n\rightarrow \infty \) when \(r=\mathrm {o}(n^{1/3})\).

Proof

For \(1\le m_1,m_2\le K(p+q)\) and \(1\le r_1,r_2\le r\), the element of the \(\left\{ (r_1-1)(p+q)K+m_1\right\} \)-th row and \(\left\{ (r_2-1)(p+q)K+m_2\right\} \)-th column of \({\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}} \) is of the form \(n^{-1}\sum _{t=1}^nZ_t\) where \(Z_t:=Z_{t,r_1,r_2}(m_1,m_2) = \Upsilon _{t-r_1}(m_1)\Upsilon _{t-r_2}(m_2).\) By stationarity of \(\left( Z_t\right) \), we have

$$\begin{aligned} \text{ Var }\left( \frac{1}{n}\sum _{t=1}^nZ_t\right) = \frac{1}{n^{2}}\sum _{h=-n+1}^{n-1}\left( n-|h|\right) \text{ Cov }\left( Z_t,Z_{t-h}\right) \le \frac{C_1}{n}, \end{aligned}$$
(90)

where, by Lemma A.8, \(C_1\) is a constant independent of \(r_1,r_2,m_1,m_2\), \(r\) and \(n\). Now using the Chebyshev inequality, we have

$$\begin{aligned} \forall \beta>0, \quad {\mathbb {P}}\left\{ \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert > \beta \right\} \le \frac{1}{\beta ^2}{\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert ^2\right\} . \end{aligned}$$

In view of (80) and (90) we have

$$\begin{aligned} {\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{\Upsilon }- {\Sigma }_{{\Upsilon }}\Vert ^2 \right\}\le & {} {\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{{\Upsilon },{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert ^2 \right\} \\\le & {} {\mathbb {E}}\left\{ r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert ^2\right\} \le r \sum _{m_1,m_2=1}^{K(p+q)r}\text{ Var }\left( \frac{1}{n}\sum _{t=1}^nZ_t\right) \\\le & {} \frac{C_1K^2(p+q)^2r^3}{n}=\mathrm {o}(1) \end{aligned}$$

as \(n\rightarrow \infty \) when \(r=\mathrm {o}(n^{1/3})\). Hence, when \(r=\mathrm {o}(n^{1/3})\)

$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert= & {} \mathrm {o}_{{\mathbb {P}}}(1),\\ \sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon }}- {\Sigma }_{{\Upsilon }}\Vert= & {} \mathrm {o}_{{\mathbb {P}}}(1)\text { and }\sqrt{r}\Vert {\hat{\Sigma }}_{{\Upsilon },{\underline{\Upsilon }}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert =\mathrm {o}_{{\mathbb {P}}}(1). \end{aligned}$$

The proof is complete. \(\square \)
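To make the rate condition concrete: the bound \(C_1K^2(p+q)^2r^3/n\) vanishes exactly when \(r^3/n\rightarrow 0\). For instance,

$$\begin{aligned} r=r(n)=n^{1/4}\Longrightarrow \frac{r^3}{n}=n^{3/4-1}=n^{-1/4}\longrightarrow 0,\quad n\rightarrow \infty , \end{aligned}$$

so any \(r\) growing more slowly than \(n^{1/3}\) is admissible.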

We now show that the previous lemma applies when \(\Upsilon _t\) is replaced by \({\hat{\Upsilon }}_t\).

Lemma A.10

Under the assumptions of Theorem 3.10, \(\sqrt{r}\Vert {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert \), \(\sqrt{r}\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}}}- {\Sigma }_{\Upsilon }\Vert ,\) and \(\sqrt{r}\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\Vert \) tend to zero in probability as \(n\rightarrow \infty \) when \(r=\mathrm {o}(n^{1/3})\).

Proof

We first show that the replacement of the unknown initial values \(\{X_u,\;u\le 0\}\) by zero is asymptotically unimportant. Let \({\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\) be the matrix obtained by replacing \(e_t({{\hat{\theta }}}_n)\) by \(\epsilon _t({{\hat{\theta }}}_n)\) in \({\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}\). We start by evaluating \({\mathbb {E}}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\). We first note that

$$\begin{aligned} {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}} =\left[ \frac{1}{n}\sum _{t=1}^na_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right] \end{aligned}$$

for \(i,i'=1,\dots ,r\) and \(m_1,m_2=1,\dots ,K(p+q)\) and where

$$\begin{aligned} a_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)= & {} e_{t-i}({{\hat{\theta }}}_n) e_{t-i'}({{\hat{\theta }}}_n)\frac{\partial e_{t-i}({{\hat{\theta }}}_n)}{\partial \theta _{m_1}}\frac{\partial e_{t-i'}({{\hat{\theta }}}_n)}{\partial \theta _{m_2}} \\&-\epsilon _{t-i}({{\hat{\theta }}}_n) \epsilon _{t-i'}({{\hat{\theta }}}_n)\frac{\partial \epsilon _{t-i}({{\hat{\theta }}}_n)}{\partial \theta _{m_1}}\frac{\partial \epsilon _{t-i'}({{\hat{\theta }}}_n)}{\partial \theta _{m_2}}. \end{aligned}$$

Using (80), we have

$$\begin{aligned} \Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\le & {} \sum _{i,i'=1}^r\sum _{m_1,m_2=1}^{K(p+q)} \left[ \frac{1}{n}\sum _{t=1}^na_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right] ^2. \end{aligned}$$

We thus deduce the following \(L^2\) estimate:

$$\begin{aligned} {\mathbb {E}}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\le & {} \sum _{i,i'=1}^r \sum _{m_1,m_2=1}^{K(p+q)}\left\| \frac{1}{n}\sum _{t=1}^na_{t-i,t-i',m_1,m_2} ({{\hat{\theta }}}_n)\right\| _2^2\\\le & {} \sum _{i,i'=1}^r\sum _{m_1,m_2=1}^{K(p+q)}\frac{1}{n}\sum _{t=1}^n\left\| a_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right\| _2^2, \end{aligned}$$

by Minkowski’s inequality. Thanks to Hölder’s inequality:

$$\begin{aligned}&\left\| a_{t-i,t-i',m_1,m_2}({{\hat{\theta }}}_n)\right\| _2 \\&\quad \le \sum _{j=1}^4 {{{\mathcal {A}}}}^j_{t-i,t-i',m_1,m_2},\text { with} \\&{{\mathcal {A}}}^1_{t-i,t-i',m_1,m_2} \\&\quad = \left\| \sup _{\theta \in \Theta } \left| e_{t-i}(\theta )-\epsilon _{t-i}(\theta )\right| \right\| _4 \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left| e_{t}(\theta )\right| \right\| _{12}\left( \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial e_{t}(\theta )}{\partial \theta }\right\| \right\| _{12}\right) ^2\\&{{\mathcal {A}}}^2_{t-i,t-i',m_1,m_2} \\&\quad =\left\| \sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| \right\| _{12}\left\| \sup _{\theta \in \Theta }\left| e_{t-i'}(\theta )-\epsilon _{t-i'} (\theta )\right| \right\| _4 \left( \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial e_{t}(\theta )}{\partial \theta }\right\| \right\| _{12}\right) ^2 \\&{{\mathcal {A}}}^3_{t-i,t-i',m_1,m_2} \\&\quad =\left( \left\| \sup _{\theta \in \Theta }\left| \epsilon _{t}(\theta )\right| \right\| _{16}\right) ^2 \left\| \sup _{\theta \in \Theta }\left\| \frac{\partial }{\partial \theta }\left( e_{t-i}(\theta )-\epsilon _{t-i}(\theta )\right) \right\| \right\| _{16/5} \sup _{t\ge 0}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial e_{t}(\theta )}{\partial \theta }\right\| \right\| _{16} \\&{{\mathcal {A}}}^4_{t-i,t-i',m_1,m_2} \\&\quad =\left( \left\| \sup _{\theta \in \Theta }\left| \epsilon _{t}(\theta )\right| \right\| _{16}\right) ^2\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial \epsilon _{t}(\theta )}{\partial \theta }\right\| \right\| _{16} \left\| \sup _{\theta \in \Theta }\left\| \frac{\partial }{\partial \theta }\left( e_{t-i'}(\theta )-\epsilon _{t-i'}(\theta )\right) \right\| \right\| _{16/5}. \end{aligned}$$

We deal with \({{\mathcal {A}}}^1_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {A}}}^3_{t-i,t-i',m_1,m_2}\), as \({{\mathcal {A}}}^2_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {A}}}^4_{t-i,t-i',m_1,m_2}\) are dealt with similarly. In view of Lemma A.7, we have

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n{{\mathcal {A}}}^1_{t-i,t-i',m_1,m_2}\le & {} \kappa _1 \frac{1}{n}\sum _{t=1}^n\left\| \sup _{\theta \in \Theta }\left| e_{t-i} (\theta )-\epsilon _{t-i}(\theta )\right| \right\| _4 \\\le & {} \frac{\kappa _1}{n}\left( \sum _{t=1}^{n-r}\left\| \sup _{\theta \in \Theta }\left| e_{t}(\theta )-\epsilon _{t}(\theta ) \right| \right\| _4+r\left\| \sup _{\theta \in \Theta } \left| \epsilon _{0}(\theta )\right| \right\| _4\right) \\= & {} \mathrm {O}\left( \frac{1}{n}+\frac{r}{n}\right) = \mathrm {O}\left( \frac{r}{n}\right) , \end{aligned}$$

uniformly in \(i\), \(i'\), \(m_1\) and \(m_2\). Similarly, we have

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n{{\mathcal {A}}}^3_{t-i,t-i',m_1,m_2}\le & {} \kappa _3\frac{1}{n}\sum _{t=1}^n\left\| \sup _{\theta \in \Theta } \left\| \frac{\partial }{\partial \theta }\left( e_{t-i}(\theta )- \epsilon _{t-i}(\theta )\right) \right\| \right\| _{16/5} \\\le & {} \kappa _3\frac{1}{n}\left( \sum _{t=1}^{n-r}\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial }{\partial \theta }\left( e_{t}(\theta )-\epsilon _{t}(\theta )\right) \right\| \right\| _{16/5}\right. \\&\left. +\,r\left\| \sup _{\theta \in \Theta }\left\| \frac{\partial \epsilon _{0}(\theta )}{\partial \theta }\right\| \right\| _{16/5}\right) = \mathrm {O}\left( \frac{1}{n}+\frac{r}{n}\right) = \mathrm {O}\left( \frac{r}{n}\right) , \end{aligned}$$

because \(\sum _{t=1}^{\infty }\left\| \sup _{\theta \in \Theta } \left\| {\partial \left( e_{t}(\theta )-\epsilon _{t}(\theta )\right) }/{\partial \theta }\right\| \right\| _{16/5}<\infty \) and \(\left\| \sup _{\theta \in \Theta }\left\| {\partial \epsilon _{0}(\theta )}/{\partial \theta }\right\| \right\| _{16/5}<\infty \) (see Lemma A.7, Point 4). Gathering \(\mathcal{A}^1_{t-i,t-i',m_1,m_2}\), \({{\mathcal {A}}}^2_{t-i,t-i',m_1,m_2}\), \(\mathcal{A}^3_{t-i,t-i',m_1,m_2}\) and \({{\mathcal {A}}}^4_{t-i,t-i',m_1,m_2}\), we arrive at

$$\begin{aligned} {\mathbb {E}}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert ^2\le & {} \sum _{i,i'=1}^r \sum _{m_1,m_2=1}^{K(p+q)}\left( \frac{1}{n}\sum _{t=1}^n\sum _{j=1}^4 {{\mathcal {A}}}^j_{t-i,t-i',m_1,m_2} \right) ^2 \\= & {} \mathrm {O}\left( r^2\left\{ \frac{r}{n}\right\} ^2\right) = \mathrm {O}\left( \frac{r^4}{n^2}\right) . \end{aligned}$$

We thus deduce that

$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\hat{{\underline{\Upsilon }}}_r}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}\Vert =\mathrm {o}_{{\mathbb {P}}}(1),\text { when }r=r(n)=\mathrm {o}\left( n^{2/5}\right) . \end{aligned}$$
(91)

We now prove that

$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}- {\hat{\Sigma }}_{{{{\underline{\Upsilon }}}_r}}\Vert =\mathrm {o}_{{\mathbb {P}}}(1),\text { when }r=r(n)=\mathrm {o}\left( n^{1/3}\right) . \end{aligned}$$

Taylor expansions around \(\theta _0\) yield

$$\begin{aligned} \left| \epsilon _t({{\hat{\theta }}}_n)-{\epsilon }_t(\theta _0)\right| \le r_t\left\| {{\hat{\theta }}}_n-\theta _0\right\| ,\quad \left| \frac{\partial \epsilon _t({{\hat{\theta }}}_n)}{\partial \theta _m}- \frac{\partial \epsilon _t(\theta _0)}{\partial \theta _m}\right| \le s_t(m)\left\| {{\hat{\theta }}}_n-\theta _0\right\| \end{aligned}$$
(92)

with \(r_t=\sup _{\theta \in \Theta }\left\| {\partial {\epsilon }_t({\theta })}/{\partial \theta }\right\| \), \(s_{t}(m)= \sup _{\theta \in \Theta }\left\| {\partial ^2{\epsilon }_t({\theta })}/{\partial \theta \partial \theta _m}\right\| \) where \(m=m_1=m_2\). Define \(Z_t\) as in the proof of Lemma A.9, and let \(Z_{t,n}\) be obtained by replacing \(\Upsilon _t(m)\) by \(\Upsilon _{t,n}(m)=\epsilon _t({{\hat{\theta }}}_n)\partial \epsilon _t({{\hat{\theta }}}_n)/\partial \theta _m\) in \(Z_t\). Using (92), for \(i,i'=1,\dots ,r\) and \(m_1,m_2=1,\dots ,K(p+q)\), we have

$$\begin{aligned}&\left| \epsilon _{t-i}({{\hat{\theta }}}_n) \epsilon _{t-i'}({{\hat{\theta }}}_n)\frac{\partial \epsilon _{t-i}({{\hat{\theta }}}_n)}{\partial \theta _{m_1}}\frac{\partial \epsilon _{t-i'}({{\hat{\theta }}}_n)}{\partial \theta _{m_2}} - \epsilon _{t-i}(\theta _0) \epsilon _{t-i'}(\theta _0)\frac{\partial \epsilon _{t-i}(\theta _0)}{\partial \theta _{m_1}}\frac{\partial \epsilon _{t-i'}(\theta _0)}{\partial \theta _{m_2}}\right| \nonumber \\&\quad \le \sum _{j=1}^4 {{\mathcal {B}}}^j_{t-i,t-i',m_1,m_2}, \end{aligned}$$
(93)

with

$$\begin{aligned} {{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}= & {} r_{t-i}\left\| {{\hat{\theta }}}_n- \theta _0\right\| \sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i} (\theta )}{\partial \theta _{m_1}}\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'} (\theta )}{\partial \theta _{m_2}}\right| \\ {{\mathcal {B}}}^2_{t-i,t-i',m_1,m_2}= & {} r_{t-i'}\left\| {{\hat{\theta }}}_n -\theta _0\right\| \sup _{\theta \in \Theta }\left| \epsilon _{t-i}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i} (\theta )}{\partial \theta _{m_1}}\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'} (\theta )}{\partial \theta _{m_2}}\right| \\ {{\mathcal {B}}}^3_{t-i,t-i',m_1,m_2}= & {} s_{t-i}(m_1)\left\| {{\hat{\theta }}}_n -\theta _0\right\| \sup _{\theta \in \Theta }\left| \epsilon _{t-i}(\theta )\right| \sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'}(\theta )}{\partial \theta _{m_2}}\right| \\ {{\mathcal {B}}}^4_{t-i,t-i',m_1,m_2}= & {} s_{t-i'}(m_2) \left\| {{\hat{\theta }}}_n-\theta _0\right\| \sup _{\theta \in \Theta } \left| \epsilon _{t-i}(\theta )\right| \sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| \sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i}(\theta )}{\partial \theta _{m_1}}\right| . \end{aligned}$$

We deal with \({{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}\) in detail, as \({{\mathcal {B}}}^j_{t-i,t-i',m_1,m_2}\), \(j=2,3,4\), are dealt with similarly. We note first that, for all \(i=1,\dots ,r\),

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \epsilon _{t-i} (\theta )\right| ^4= & {} \frac{1}{n}\sum _{t=1-i}^{n-i}\sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4 =\frac{1}{n}\sum _{t=1-i}^{0} \sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4 +\frac{1}{n}\sum _{t=1}^{n-i}\sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4 \nonumber \\\le & {} \frac{r}{n}\frac{1}{r}\sum _{t=1-r}^{0}\sup _{\theta \in \Theta } \left| \epsilon _{t}(\theta )\right| ^4+\frac{1}{n}\sum _{t=1}^{n} \sup _{\theta \in \Theta }\left| \epsilon _{t}(\theta )\right| ^4 \nonumber \\= & {} \left( \frac{r}{n}+1\right) \left( \left\| \sup _{\theta \in \Theta } \left| \epsilon _{0}(\theta )\right| \right\| ^4_4+\mathrm {o}_{a.s.}(1) \right) , \end{aligned}$$
(94)

by the ergodic theorem. Arguing as for (94), we have

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i}(\theta )}{\partial \theta _{m}}\right| ^4 \le \left( \frac{r}{n}+1\right) \left( \left\| \sup _{\theta \in \Theta } \left| \frac{\partial \epsilon _{0}(\theta )}{\partial \theta _{m}}\right| \right\| ^4_4+\mathrm {o}_{a.s.}(1)\right) . \end{aligned}$$
(95)
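
The Cauchy-Schwarz step that follows amounts to the generalized Hölder inequality with four exponents equal to 4 (Cauchy-Schwarz applied twice); for \({{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}\), for instance,

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n{{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}&\le \left\| {{\hat{\theta }}}_n-\theta _0\right\| \left( \frac{1}{n}\sum _{t=1}^nr_{t-i}^4\right) ^{1/4}\left( \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \epsilon _{t-i'}(\theta )\right| ^4\right) ^{1/4}\\&\quad \times \left( \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i}(\theta )}{\partial \theta _{m_1}}\right| ^4\right) ^{1/4}\left( \frac{1}{n}\sum _{t=1}^n\sup _{\theta \in \Theta }\left| \frac{\partial \epsilon _{t-i'}(\theta )}{\partial \theta _{m_2}}\right| ^4\right) ^{1/4}, \end{aligned}$$

where the last three averages are \(\mathrm O_{a.s.}(1)\) by (94) and (95), and the average of \(r_{t-i}^4\) is handled analogously under the moment assumptions of Theorem 3.10.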

By the Cauchy-Schwarz inequality and using (94) and (95), we have

$$\begin{aligned} \sum _{i,i'=1}^r\sum _{m_1,m_2=1}^{K(p+q)}\frac{1}{n}\sum _{t=1}^n {{\mathcal {B}}}^1_{t-i,t-i',m_1,m_2}\le & {} r^2\left\| {{\hat{\theta }}}_n-\theta _0\right\| \left( \frac{r}{n}+1\right) ^3\left( \kappa _1+\mathrm {o}_{a.s.}(1)\right) \\= & {} r^2\left\| {{\hat{\theta }}}_n-\theta _0\right\| \mathrm {O}(1)\left( \kappa _1+ \mathrm {o}_{a.s.}(1)\right) , \end{aligned}$$

when \(r=\mathrm {o}\left( n^{1/3}\right) \), where \(\kappa _1>0\) is a constant. Similar inequalities hold for \({{\mathcal {B}}}^j_{t-i,t-i',m_1,m_2}\), \(j=2, 3, 4\). We thus deduce from (80) and (93) that

$$\begin{aligned} r\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}- {\hat{\Sigma }}_{{{{\underline{\Upsilon }}}_r}}\Vert ^2\le & {} r^3\left\| {{\hat{\theta }}}_n-\theta _0\right\| ^2\mathrm {O}_{{\mathbb {P}}}(1). \end{aligned}$$
(96)

Since \(\sqrt{n}\left( {{\hat{\theta }}}_n-\theta _0\right) \) converges in distribution, a tightness argument yields \(\left\| {{\hat{\theta }}}_n- \theta _0\right\| =\mathrm {O}_{{\mathbb {P}}}\left( n^{-1/2}\right) \); hence the right-hand side of (96) is \(\mathrm {O}_{{\mathbb {P}}}\left( r^3/n\right) =\mathrm {o}_{{\mathbb {P}}}(1)\) when \(r=\mathrm {o}(n^{1/3})\), and we obtain

$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{{\underline{\Upsilon }}_{r,n}}- {\hat{\Sigma }}_{{\underline{\Upsilon }}_r}\Vert =\mathrm {o}_{{\mathbb {P}}}(1). \end{aligned}$$
(97)

Lemma A.9, together with (91) and (97), then shows that \(\sqrt{r}\Vert {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_r}- {\Sigma }_{{\underline{\Upsilon }}_r}\Vert =\mathrm {o}_{{\mathbb {P}}}(1)\). The other results are obtained similarly. \(\square \)

Write \(\underline{{\varvec{\Phi }}}_{r}^*=\left( \Phi _{1}\cdots \Phi _{r}\right) \) where the \(\Phi _{i}\)’s are defined by (21).

Lemma A.11

Under the assumptions of Theorem 3.10,

$$\begin{aligned} \sqrt{r}\left\| \underline{{\varvec{\Phi }}}_{r}^*- \underline{{\varvec{\Phi }}}_{r}\right\| \rightarrow 0, \end{aligned}$$

as \(r\rightarrow \infty \).

Proof

Recall that by (21) and (78)

$$\begin{aligned} \Upsilon _t= & {} \underline{{\varvec{\Phi }}}_{r}{\underline{\Upsilon }}_{r,t}+u_{r,t} =\underline{{\varvec{\Phi }}}_{r}^*{\underline{\Upsilon }}_{r,t}+\sum _{i=r+1}^\infty \Phi _i{\Upsilon }_{t-i}+u_t :=\underline{{\varvec{\Phi }}}_{r}^*{\underline{\Upsilon }}_{r,t}+u_{r,t}^*. \end{aligned}$$

Hence, taking expectations of \(\Upsilon _t{\underline{\Upsilon }}_{r,t}'\) in both decompositions and using the orthogonality conditions in (21) and (78), which give \({\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}=\underline{{\varvec{\Phi }}}_{r}{\Sigma }_{{\underline{\Upsilon }}_{r}}=\underline{{\varvec{\Phi }}}_{r}^*{\Sigma }_{{\underline{\Upsilon }}_{r}}+{\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}}\), we obtain

$$\begin{aligned} \underline{{\varvec{\Phi }}}_{r}^*-\underline{{\varvec{\Phi }}}_{r}= & {} -{\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1} \end{aligned}$$
(98)

where \({\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}}= {\mathbb {E}}u_{r,t}^*{\underline{\Upsilon }}_{r,t}'\). Using the arguments and notation of the proof of Lemma A.8, there exist constants \(C_1\) and \(C_2\), independent of \(s\), \(m_1\) and \(m_2\), such that

$$\begin{aligned} {\mathbb {E}}\left| {\Upsilon }_{1}(m_1){\Upsilon }_{1+s}(m_2)\right| \le C_1\sum _{h_1,h_2=0}^{\infty }\rho ^{h_1+h_2}\Vert \epsilon _1\Vert _{4}^4\le C_2. \end{aligned}$$

By the Cauchy-Schwarz inequality and (80), we then have

$$\begin{aligned} \left\| \text{ Cov }\left( {\Upsilon }_{t-r-h},{\underline{\Upsilon }}_{r,t}\right) \right\| \le C_2r^{1/2}K(p+q). \end{aligned}$$

Thus,

$$\begin{aligned} \Vert {\Sigma }_{u_r^*,{\underline{\Upsilon }}_{r}}\Vert= & {} \Vert \sum _{i=r+1}^\infty \Phi _i{\mathbb {E}}{\Upsilon }_{t-i}{\underline{\Upsilon }}_{r,t}'\Vert \le \sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert \left\| \text{ Cov }\left( {\Upsilon }_{t-r-h},{\underline{\Upsilon }}_{r,t}\right) \right\| \nonumber \\= & {} \mathrm O(1)r^{1/2}\sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert . \end{aligned}$$
(99)

Note that the assumption \(\Vert \Phi _i\Vert =\mathrm o\left( i^{-2}\right) \) entails \( r\sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert =\mathrm o(1)\) as \(r\rightarrow \infty \): indeed, \(r\sum _{h=1}^\infty \Vert \Phi _{r+h}\Vert \le \left( \sup _{i>r}i^2\Vert \Phi _i\Vert \right) r\sum _{h=1}^\infty (r+h)^{-2}\le \sup _{i>r}i^2\Vert \Phi _i\Vert \rightarrow 0\). The lemma therefore follows from (98), (99) and Lemma A.6. \(\square \)

The following lemma is similar to Lemma 3 in Berk (1974).

Lemma A.12

Under the assumptions of Theorem 3.10,

$$\begin{aligned} \sqrt{r}\Vert {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\Vert= & {} \mathrm o_{\mathbb {P}}(1) \end{aligned}$$

as \(n\rightarrow \infty \) when \(r=\mathrm o(n^{1/3})\) and \(r\rightarrow \infty \).

Proof

We have

$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\|= & {} \left\| \left\{ {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}+ {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\} \left\{ {\Sigma }_{{\underline{\Upsilon }}_{r}}- {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}\right\} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \\\le & {} \left( \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| + \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \right) \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| . \end{aligned}$$

Iterating this inequality, we obtain

$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\|\le & {} \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \sum _{i=1}^{\infty }\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| ^i \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^i. \end{aligned}$$
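
On the event \(\left\{ \Vert {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}-{\Sigma }_{{\underline{\Upsilon }}_{r}}\Vert \Vert {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\Vert <1\right\} \), summing the geometric series (\(\sum _{i\ge 1}x^i=x/(1-x)\) for \(0\le x<1\)) makes this bound explicit:

$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \le \frac{\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^2\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}-{\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| }{1-\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}-{\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| }. \end{aligned}$$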

Thus, for every \(\varepsilon >0\),

$$\begin{aligned}&\mathbb P\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\|>\varepsilon \right) \\&\quad \le \mathbb P\left( \sqrt{r}\frac{\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1} \right\| ^2 \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| }{1- \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| }>\varepsilon \text{ and } \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| <1\right) \\&\qquad +\,\mathbb P\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \ge 1\right) \\&\quad \le {\mathbb {P}}\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| >\frac{\varepsilon }{ \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^2+\varepsilon r^{-1/2}\left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| } \right) \\&\qquad +\,{\mathbb {P}}\left( \sqrt{r}\left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}- {\Sigma }_{{\underline{\Upsilon }}_{r}}\right\| \ge \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| ^{-1}\right) = \mathrm o(1) \end{aligned}$$

by Lemmas A.9 and A.6. This establishes Lemma A.12. \(\square \)

Lemma A.13

Under the assumptions of Theorem 3.10,

$$\begin{aligned} \sqrt{r}\left\| \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{\varvec{\Phi }}}_{r}\right\| =\mathrm o_{\mathbb {P}}(1) \end{aligned}$$

as \(r\rightarrow \infty \) and \(r=\mathrm o(n^{1/3})\).

Proof

By the triangle inequality and Lemmas A.6 and A.12, we have

$$\begin{aligned} \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}\right\| \le \left\| {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}- {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| + \left\| {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| =\mathrm O_\mathbb P(1). \end{aligned}$$
(100)

Note that the orthogonality conditions in (78) entail that \(\underline{{\varvec{\Phi }}}_{r}={\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\). By Lemmas A.6, A.9 and A.12, and (100), we then have

$$\begin{aligned} \sqrt{r}\left\| \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{\varvec{\Phi }}}_{r}\right\|= & {} \sqrt{r}\left\| {\hat{\Sigma }}_{{\hat{\Upsilon }}, \underline{{\hat{\Upsilon }}}_{r}} {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1} -{\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}} {\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right\| \\= & {} \sqrt{r}\left\| \left( {\hat{\Sigma }}_{{\hat{\Upsilon }}, \underline{{\hat{\Upsilon }}}_{r}} -{\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}}\right) {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1} +{\Sigma }_{{\Upsilon },{\underline{\Upsilon }}_{r}} \left( {\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1} -{\Sigma }_{{\underline{\Upsilon }}_{r}}^{-1}\right) \right\| =\mathrm o_{\mathbb {P}}(1). \end{aligned}$$

\(\square \)

Proof of Theorem 3.10

In view of (20), it suffices to show that \(\underline{\hat{{\varvec{\Phi }}}}_r(1)\rightarrow \underline{{{\varvec{\Phi }}}}(1)\) and \({\hat{\Sigma }}_{u_r}\rightarrow {\Sigma }_{u}\) in probability. Define the \(r\times 1\) vector \(\mathbf{1}_r=(1,\dots ,1)'\) and the \(r(p+q)K\times (p+q)K\) matrix \(\mathbf{E}_r={\mathbb {I}}_{(p+q)K}\otimes \mathbf{1}_r\), where \(\otimes \) denotes the matrix Kronecker product and \({\mathbb {I}}_d\) the \(d\times d\) identity matrix. Using (80) and Lemmas A.11 and A.13, we obtain

$$\begin{aligned} \left\| \underline{\hat{{\varvec{\Phi }}}}_r(1)- \underline{{{\varvec{\Phi }}}}(1)\right\|\le & {} \left\| \sum _{i=1}^r\left( \hat{ \Phi }_{r,i}-\Phi _{r,i}\right) \right\| +\left\| \sum _{i=1}^r\left( {\Phi }_{r,i}-\Phi _{i}\right) \right\| +\left\| \sum _{i=r+1}^{\infty }\Phi _{i}\right\| \\= & {} \left\| \left( \underline{\hat{{\varvec{\Phi }}}}_{r}-\underline{{{\varvec{\Phi }}}}_{r} \right) \mathbf{E}_r\right\| +\left\| \left( \underline{{{\varvec{\Phi }}}}_{r}^*- \underline{{{\varvec{\Phi }}}}_{r}\right) \mathbf{E}_r\right\| +\left\| \sum _{i=r+1}^{\infty }\Phi _{i}\right\| \\\le & {} \sqrt{(p+q)K}\sqrt{r}\left\{ \left\| \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}\right\| +\left\| \underline{{{\varvec{\Phi }}}}_{r}^*-\underline{{{\varvec{\Phi }}}}_{r} \right\| \right\} +\left\| \sum _{i=r+1}^{\infty }\Phi _{i}\right\| \\= & {} \mathrm o_{\mathbb {P}}(1). \end{aligned}$$
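
The last inequality in the display above uses, for any matrix \(M\) with \(r(p+q)K\) columns,

$$\begin{aligned} \left\| M\mathbf{E}_r\right\| \le \left\| M\right\| \left\| \mathbf{E}_r\right\| =\sqrt{(p+q)K}\sqrt{r}\left\| M\right\| , \end{aligned}$$

which follows from \(\Vert \mathbf{E}_r\Vert ^2=\text{ Tr }\left( \mathbf{E}_r'\mathbf{E}_r\right) =\text{ Tr }\left( r{\mathbb {I}}_{(p+q)K}\right) =r(p+q)K\).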

Now note that

$$\begin{aligned} {\hat{\Sigma }}_{u_r}={\hat{\Sigma }}_{{{\hat{\Upsilon }}}} -\underline{\hat{{\varvec{\Phi }}}}_{r} {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}' \end{aligned}$$

and, by (21)

$$\begin{aligned} {\Sigma }_{u}= & {} \mathbb Eu_tu_t'=\mathbb Eu_t\Upsilon _t'={\mathbb {E}} \left\{ \left( \Upsilon _t-\sum _{i=1}^{\infty }\Phi _i\Upsilon _{t-i}\right) \Upsilon _t'\right\} \\= & {} {\Sigma }_{{{\Upsilon }}}-\sum _{i=1}^{\infty }\Phi _i\mathbb E{\Upsilon }_{t-i}{\Upsilon }_{t}' ={\Sigma }_{\Upsilon }- \underline{{{\varvec{\Phi }}}}_{r}^*{\Sigma }_{{{\Upsilon }}, {{\underline{\Upsilon }}_{r}}}' -\sum _{i=r+1}^{\infty }\Phi _i{\mathbb {E}} {\Upsilon }_{t-i}{\Upsilon }_{t}'. \end{aligned}$$

Thus,

$$\begin{aligned} \left\| {\hat{\Sigma }}_{u_r}-{\Sigma }_{u}\right\|= & {} \left\| {\hat{\Sigma }}_{{\hat{\Upsilon }}} -{\Sigma }_{\Upsilon }- \left( \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\right) {\hat{\Sigma }}_{{{\hat{\Upsilon }}}, \underline{{\hat{\Upsilon }}}_r}'\right. \nonumber \\&\left. - \underline{{{\varvec{\Phi }}}}_{r}^* \left( {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }},{{\underline{\Upsilon }}_{r}}}'\right) +\sum _{i=r+1}^{\infty }\Phi _i{\mathbb {E}}{\Upsilon }_{t-i} {\Upsilon }_{t}'\right\| \nonumber \\\le & {} \left\| {\hat{\Sigma }}_{{\hat{\Upsilon }}} -{\Sigma }_{\Upsilon }\right\| + \left\| \left( \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\right) \left( {\hat{\Sigma }}_{{{\hat{\Upsilon }}}, \underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }}, \underline{{\Upsilon }}_r}'\right) \right\| \nonumber \\&+\left\| \left( \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\right) {\Sigma }_{{{\Upsilon }}, \underline{{\Upsilon }}_r}'\right\| +\left\| \underline{{{\varvec{\Phi }}}}_{r}^* \left( {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }},{{\underline{\Upsilon }}_{r}}}'\right) \right\| \nonumber \\&+\left\| \sum _{i=r+1}^{\infty }\Phi _i\mathbb E{\Upsilon }_{t-i}{\Upsilon }_{t}'\right\| . \end{aligned}$$
(101)

On the right-hand side of this inequality, the first norm is \(\mathrm o_{\mathbb {P}}(1)\) by Lemma A.9. By Lemmas A.11 and A.13, we have \(\Vert \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\Vert =\mathrm o_\mathbb P(r^{-1/2})=\mathrm o_{\mathbb {P}}(1)\), and by Lemma A.9, \(\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}}, \underline{{\hat{\Upsilon }}}_r}'-{\Sigma }_{{{\Upsilon }}, \underline{{\Upsilon }}_r}'\Vert =\mathrm o_{{\mathbb {P}}}(r^{-1/2})=\mathrm o_{{\mathbb {P}}}(1)\). Therefore the second norm on the right-hand side of (101) tends to zero in probability. The third norm tends to zero in probability because \(\Vert \underline{\hat{{\varvec{\Phi }}}}_{r}- \underline{{{\varvec{\Phi }}}}_{r}^*\Vert =\mathrm o_{{\mathbb {P}}}(1)\) and, by Lemma A.6, \(\Vert {\Sigma }_{{{\Upsilon }},\underline{{\Upsilon }}_r}'\Vert =\mathrm O(1)\). The fourth norm tends to zero in probability because, in view of Lemma A.9, \(\Vert {\hat{\Sigma }}_{{{\hat{\Upsilon }}},\underline{{\hat{\Upsilon }}}_r}'- {\Sigma }_{{{\Upsilon }},{{\underline{\Upsilon }}_{r}}}'\Vert =\mathrm o_{{\mathbb {P}}}(1)\) and, in view of (80), \(\Vert \underline{{{\varvec{\Phi }}}}_{r}^*\Vert ^2\le \sum _{i=1}^\infty \text{ Tr }(\Phi _i\Phi _i')<\infty \). Finally, the last norm tends to zero as \(r\rightarrow \infty \), since the entries of \({\mathbb {E}}{\Upsilon }_{t-i}{\Upsilon }_{t}'\) are bounded uniformly in \(i\) (see the proof of Lemma A.11) and \(\sum _{i=r+1}^{\infty }\Vert \Phi _i\Vert \rightarrow 0\). This completes the proof. \(\square \)
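
To fix ideas, a minimal numerical sketch of the construction just analyzed: \(\underline{\hat{{\varvec{\Phi }}}}_{r}\) can be computed by an ordinary least squares regression of the empirical process on its \(r\) lagged values, which agrees, up to boundary terms, with \(\underline{\hat{{\varvec{\Phi }}}}_{r}={\hat{\Sigma }}_{{\hat{\Upsilon }},\underline{{\hat{\Upsilon }}}_{r}}{\hat{\Sigma }}_{\underline{{\hat{\Upsilon }}}_{r}}^{-1}\). The Python/NumPy code below (the function name and interface are ours, for illustration only; the final normalization in (20) is not reproduced) returns \(\underline{\hat{{\varvec{\Phi }}}}_r(1)\) and \({\hat{\Sigma }}_{u_r}\) from an \(n\times K(p+q)\) array whose rows are \({{\hat{\Upsilon }}}_1',\dots ,{{\hat{\Upsilon }}}_n'\).

```python
import numpy as np

def var_spectral_estimator(upsilon, r):
    """Least squares fit of a VAR(r) to the rows of `upsilon` (n x d).

    Returns Phi1 = Phi_{r,1} + ... + Phi_{r,r} (the lag polynomial
    evaluated at 1) and the residual covariance Sigma_{u_r}.
    """
    n, d = upsilon.shape
    # Response: Upsilon_t for t = r, ..., n-1 (0-based indexing).
    Y = upsilon[r:]
    # Regressors: the r stacked lags (Upsilon_{t-1}', ..., Upsilon_{t-r}').
    X = np.hstack([upsilon[r - i:n - i] for i in range(1, r + 1)])
    # Solve Y ~ X B; the i-th d x d block of B' is the lag matrix Phi_{r,i}.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Phi = B.T  # d x (r * d)
    Phi1 = sum(Phi[:, i * d:(i + 1) * d] for i in range(r))
    # Residual covariance estimate Sigma_{u_r}.
    U = Y - X @ B
    Sigma_u = U.T @ U / (n - r)
    return Phi1, Sigma_u
```

With \(r=r(n)\rightarrow \infty \) and \(r=\mathrm o(n^{1/3})\), Theorem 3.10 guarantees that the two returned quantities converge in probability to \(\underline{{{\varvec{\Phi }}}}(1)\) and \({\Sigma }_{u}\).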


Cite this article

Boubacar Maïnassara, Y., Rabehasaina, L. Estimation of weak ARMA models with regime changes. Stat Inference Stoch Process 23, 1–52 (2020). https://doi.org/10.1007/s11203-019-09202-3
