DETECTING CHANGES IN GARCH(1,1) PROCESSES WITHOUT ASSUMING STATIONARITY

Lajos Horváth; Shixuan Wang

doi:10.1017/S026646662510011X

DETECTING CHANGES IN GARCH(1,1) PROCESSES WITHOUT ASSUMING STATIONARITY

Published online by Cambridge University Press: 24 November 2025

Lajos Horváth and

Shixuan Wang

Show author details

Lajos Horváth: Affiliation:
University of Utah
Shixuan Wang*: Affiliation:
University of Reading
*: Address correspondence to Shixuan Wang, Department of Economics, University of Reading, Reading, UK; e-mail: shixuan.wang@reading.ac.uk

Article contents

Abstract
Introduction
ASSUMPTIONS AND MAIN RESULTS
MONTE CARLO SIMULATIONS
EMPIRICAL APPLICATIONS
Conclusions
PROOFS OF TECHNICAL RESULTS
Footnotes
References

Rights & Permissions

Abstract

This article develops a new test to detect changes in generalized autoregressive conditionally heteroscedastic (GARCH(1,1)) processes without imposing a stationary assumption. Specifically, the procedure tests the null hypothesis of a GARCH process with constant parameters, either in (strictly) stationary or explosive regimes, against the alternative hypothesis of parameter changes. We derive the limiting distribution of the test statistics and establish their asymptotic consistency. Monte Carlo simulations show that the proposed test has good size control and high power. We demonstrate a prototype application on a small group of stocks and report a further extensive application to more than ten thousand U.S. stocks.

Information

Type: ARTICLES
Information: Econometric Theory , First View , pp. 1 - 38

DOI: https://doi.org/10.1017/S026646662510011X [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1 Introduction

Modeling volatility in financial time series plays an important role in risk management. A key milestone in volatility modeling is the autoregressive conditionally heteroscedastic (ARCH) process introduced by Engle (Reference Engle1982). Following this seminal work, many extensions of the ARCH process have been developed. The most widely used is the generalized ARCH (GARCH) process proposed by Bollerslev (Reference Bollerslev1986). In the last few decades, the GARCH process has received substantial theoretical attention. More importantly, it has become the “workhorse” for volatility modeling in the financial industry (Lee and Hansen, Reference Lee and Hansen1994). We refer to Francq and Zakoian (Reference Francq and Zakoian2019) for a comprehensive review of the GARCH model.

In practical applications, financial returns are commonly assumed to be stationary. This assumption has led to a rich body of literature on stationary GARCH, overshadowing the comparatively limited research on nonstationary (explosive) GARCH. We conduct an empirical investigation using the nonstationarity test of Francq and Zakoïan (Reference Francq and Zakoïan2012) for the universe of U.S. stocks between 2011 and 2022. At the 1% significance level, the test reveals that the null hypothesis of nonstationarity (explosivity) cannot be rejected for 631 stocks, which constitute 5.94% of the total stocks tested. Further details can be found in Section 4. This empirical evidence shows that nonstationary (explosive) GARCH processes are more prevalent than is typically assumed.

The pioneering work in the limited literature on the explosive and nonstationary behavior of the GARCH(1,1) process is by Jensen and Rahbek (Reference Jensen and Rahbek2004). By imposing the constraint that the intercept parameter in the GARCH(1,1) model is fixed, they establish consistency and asymptotic normality for a constrained version of the quasi-maximum likelihood estimator (QMLE) without assuming stationarity. In groundbreaking work of Francq and Zakoïan (Reference Francq and Zakoïan2012), they study the asymptotic properties of unconstrained QMLE for GARCH(1,1) models without a strict stationarity assumption. They discover that, except for the intercept, the QMLE remains consistent and asymptotically normal in both stationary and nonstationary (explosive) cases. They further develop econometric tools for testing strict stationarity and nonstationarity in GARCH models. In a follow-up study, Francq and Zakoïan (Reference Francq and Zakoïan2013) establish the asymptotic normality of the unconstrained QMLE of the parameters, including the power parameter but excluding the intercept, for asymmetric power-transformed GARCH(1,1) models when strict stationarity does not hold.

The stability of parameter values in an ARCH or GARCH process is crucial for estimating the model parameters. If a researcher fails to account for potential changes in its parameter values, the inference of model parameters will not be valid, and the conditional volatility implied by the estimated model could be misleading, resulting in under- or over-estimation of the underlying risk. Kokoszka and Leipus (Reference Kokoszka and Leipus1999, Reference Kokoszka and Leipus2000) make early contributions to change-point analysis for ARCH processes. They introduce test procedures for ARCH models with at least a finite second moment. Expanding on this, Andreou and Ghysels (Reference Andreou and Ghysels2002) develop tests for multiple breaks that can be applied to ARCH processes.

An early contribution to the detection of parameter changes in GARCH models is attributed to Chu (Reference Chu1995), who employs the maximized Lagrange multiplier (LM) test. Using the approximate likelihood scores, Berkes, Horváth, and Kokoszka (Reference Berkes, Horváth and Kokoszka2004) introduce a test for parameter constancy in GARCH(p,q) models that does not require observations to have finite variance. From a different perspective, Galeano and Tsay (Reference Galeano and Tsay2010) propose an algorithm to detect multiple change points in individual parameters of a GARCH model. Chen and Hong (Reference Chen and Hong2016) propose a test that detects smooth changes in GARCH parameters by testing whether the global GARCH estimates deviate significantly from the local parameter estimates. Different from the retrospective detection framework, Berkes et al. (Reference Berkes, Gombay, Horváth and Kokoszka2004) develop a sequential monitoring scheme for the on-line detection of changes in GARCH(p,q) models in real time. Their procedure is based on quasi-likelihood scores and does not use model residuals.

Almost all the existing literature on detecting changes in GARCH models assumes that the parameters of GARCH models are in a (strictly) stationary regime under the null hypothesis. In contrast, we develop a test to detect changes in GARCH(1,1) processes without assuming stationarity. Namely, we test the null hypothesis of a GARCH process with constant parameters that are in either (strictly) stationary or explosive regimes against the alternative of the existence of parameter changes. Our test complements the existing econometric toolset for detecting changes in GARCH(1,1) processes that could be either (strictly) stationary or explosive under the null. From a practical perspective, our test procedure is the same for both stationary and explosive GARCH(1,1) processes that simplifies its application. We derive the limiting distribution of the test statistics and establish the asymptotic consistency of the test. Monte Carlo simulations show that the proposed test has good size control and high power. In line with Francq and Zakoïan (Reference Francq and Zakoïan2012, Reference Francq and Zakoïan2013), a limitation of our study is that our test is applicable to GARCH(1,1) parameters except for the intercept.

Our study is closely related to that of Richter, Wang, and Wu (Reference Richter, Wang and Wu2023), but the aim and setup differ. They construct a test to identify whether there is a period during which the parameters of a GARCH model have changed. In their test framework, the null hypothesis assumes a GARCH model with constant parameters, whereas the alternative hypothesis has a period during which the parameters of the GARCH model have changed. Notably, their null hypothesis is in the regime where the GARCH process is strictly stationary. In contrast, our test allows the parameters of GARCH(1,1) processes to be in either stationary or explosive regimes under the null hypothesis. We formally prove the limiting distribution of our test statistics and establish the asymptotic consistency of the test for both stationary and explosive GARCH(1,1) processes. As a result, our study makes theoretical contributions that complement the existing research in this area.

Our study is also linked to the broader literature on change-point detection for nonparametric and (semi)parametric measures of volatility and market risks. Andreou and Ghysels (Reference Andreou and Ghysels2006) study data-driven processes related to volatility and develop cumulative sum (CUSUM) change-point tests to monitor the stability of dynamic variance processes. Xu (Reference Xu2013) proposes modified CUSUM and LM tests for detecting structural changes in volatility, using a robust long-run variance estimator to achieve higher power through faster divergence under the alternative. Pape, Wied, and Galeano (Reference Pape, Wied and Galeano2016) introduce a model-independent multivariate monitoring procedure for detecting changes in the vector of componentwise unconditional variances. Lazar, Wang, and Xue (Reference Lazar, Wang and Xue2023) propose a test that applies the CUSUM procedure to the Wilcoxon statistic to detect change points in the value-at-risk and expected shortfall estimated by (semi)parametric models. Hoga and Demetrescu (Reference Hoga and Demetrescu2023) develop monitoring procedures for forecasts of value-at-risk and expected shortfall that are generated by nonparametric and (semi)parametric models.

The remainder of the article is organized as follows: In Section 2, we present the assumptions and formulate the main results of the asymptotic properties. Section 3 is dedicated to a simulation study examining the finite sample performance of the proposed test. In Section 4, we present an empirical demonstration of our testing procedure by applying it first to a small group of U.S. stocks and then to a dataset of more than ten thousand U.S. stocks. Section 5 concludes, and Section 6 is devoted to the proofs of all theoretical results.

2 ASSUMPTIONS AND MAIN RESULTS

The time-dependent GARCH (1,1) sequence is defined by the recursion

$$ \begin{align*}y_i=\sigma_i\epsilon_i\quad\;\; \text{and}\quad \;\;\sigma_i^2=\omega_i+\alpha_iy_{i-1}^2+\beta_i\sigma_{i-1}^2,\;\;\;1\leq i<\infty, \end{align*} $$

where $\sigma _0^2$ , $y_0^2$ are initial values, and $\alpha _i$ , $\beta _i$ , and $\omega _i$ are positive parameters. We observe $y_1,y_2,\dots , y_N$ . Under the null hypothesis, we have

$$ \begin{align*} H_0:\; (\alpha_1,\beta_1,\omega_1)=(\alpha_2,\beta_2,\omega_2)=\dots=(\alpha_N,\beta_N,\omega_N). \end{align*} $$

If the null hypothesis holds, the model has the form

(2.1)

$$ \begin{align} y_i=\sigma_i\epsilon_i\quad\;\; \text{and}\quad \;\;\sigma_i^2=\omega_0+\alpha_0y_{i-1}^2+\beta_0\sigma_{i-1}^2,\;\;\;1\leq i<\infty. \end{align} $$

We use the notation $\boldsymbol {\theta }_0=(\alpha _0, \beta _0, \omega _0)^\top $ . It is common to assume that the following assumptions.

Assumption 2.1. $\{\epsilon _i, -\infty <i<\infty \}$ are independent and identically distributed random variables, $\epsilon _0^2$ is nondegenerate $E\epsilon _0=0$ , $E\epsilon _0^2=1$ , $0<\mathrm { var}(\epsilon _0^2)$ , and $E|\epsilon _0|^\kappa <\infty $ with some $\kappa>4$

and

Assumption 2.2. $\alpha _0>0$ , $\beta _0>0$ , and $\omega _0>0$ .

Our method employs a quasi-likelihood function based on the normal density; the corresponding quasi-likelihood estimator is given by $\hat {\boldsymbol {\theta }}_N= (\hat {\alpha }_N, \hat {\beta }_N, \hat {\omega }_N)^\top $ . If a different density is used in the definition of the quasi-likelihood, as in Berkes and Horváth (Reference Berkes and Horváth2004), our results remain true.

Because the conditional variances $\sigma _{i-1}^2$ are not observed, we use the recursion

(2.2)

$$ \begin{align} \bar{\sigma}_i^2(\boldsymbol{\theta})=\omega+\alpha y_{i-1}^2+\beta\bar{\sigma}_{i-1}^2(\boldsymbol{\theta}),\quad i\geq 1, \end{align} $$

where $\boldsymbol {\theta }=( \alpha , \beta , \omega )^\top $ . The corresponding $\log $ quasi-likelihood function (ignoring a negative scaling factor) based on $\{y_i,1\leq i\leq k\}$ is

$$ \begin{align*}\bar{{\mathcal L}}_k(\boldsymbol{\theta})=\sum_{i=1}^k\bar{\ell}_i(\boldsymbol{\theta}),\quad \;\; \text{and}\quad\;\;\bar{\ell}_i(\boldsymbol{\theta})=\log \bar{\sigma}_i^2(\boldsymbol{\theta})+\frac{y_i^2}{\bar{\sigma}_i^2(\boldsymbol{\theta})}. \end{align*} $$

To find $\hat {\boldsymbol {\theta }}_N$ , we maximize $-\bar {{\mathcal L}}_k(\boldsymbol {\theta })$ on the set

$$ \begin{align*}\boldsymbol{\Theta}=\left\{\boldsymbol{\theta}:\;1/\bar{\alpha}\leq \alpha<\bar{\alpha}, 1/\bar{\beta}\leq \beta\leq \bar{\beta}, 1/\bar{\omega}\leq \omega\leq \bar{\omega} \right\}, \end{align*} $$

where $\bar {\alpha },\bar {\beta }$ , and $\bar {\omega }$ are arbitrarily large. Thus, we can assume the following assumption.

Assumption 2.3. $\boldsymbol {\theta }_0$ is in the interior of $\boldsymbol {\Theta }$ .

Since $y_1, y_2,\dots , y_N$ are observed, the QMLE satisfies

$$ \begin{align*}\inf \left\{\bar{{\mathcal L}}_N(\boldsymbol{\theta}):\;\boldsymbol{\theta}\in\boldsymbol{\Theta}\right\}=\bar{{\mathcal L}}_N(\hat{\boldsymbol{\theta}}_N), \end{align*} $$

where

(2.3)

$$ \begin{align} \bar{{\mathcal L}}_N(\boldsymbol{\theta})=\sum_{i=1}^N\bar{\ell}_i(\boldsymbol{\theta}),\quad \;\; \text{and}\quad\;\;\bar{\ell}_i(\boldsymbol{\theta})=\log \bar{\sigma}_i^2(\boldsymbol{\theta})+\frac{y_i^2}{\bar{\sigma}_i^2(\boldsymbol{\theta})}. \end{align} $$

Nelson (Reference Nelson1990) shows that $y_i$ converges to a stationary sequence if and only if $E\log (\alpha _0\epsilon _0^2+\beta _0)<0$ (the stationary case). Lumsdaine (Reference Lumsdaine1996), Berkes, Horváth, and Kokoszka (Reference Berkes, Horváth and Kokoszka2003), Francq and Zakoian (Reference Francq and Zakoian2004), and Straumann and Mikosch (Reference Straumann and Mikosch2006) establish the asymptotic consistency and normality of $\hat {\boldsymbol {\theta }}_N$ under slightly weaker restrictions than Assumption 2.1. Nelson (Reference Nelson1990) shows that if $E\log (\alpha _0\epsilon _0^2+\beta _0)\geq 0,$ then $|y_i|\to \infty $ in probability, as $i\to \infty $ . Therefore, in this case, no stationary solution exists. Jensen and Rahbek (Reference Jensen and Rahbek2004) prove that $\omega _0$ cannot be estimated consistently in the explosive case ( $E\log (\alpha _0\epsilon _0^2+\beta _0)>0$ ). Using the same arguments as Jensen and Rahbek (Reference Jensen and Rahbek2004), our Theorem 6.1 shows that $\omega _0$ cannot be estimated in the “boundary” case ( $E\log (\alpha _0\epsilon _0^2+\beta _0)=0$ ) as well. Thus, without assuming stationarity we can only test the null hypothesis

$$ \begin{align*} H_0^*:\; (\alpha_1,\beta_1)=(\alpha_2,\beta_2)=\dots=(\alpha_N,\beta_N). \end{align*} $$

The expected values of the derivatives of the $\log $ likelihood function with respect to $\alpha $ and $\beta $ are zero under the null hypothesis, so we use the sequence of random functions

$$ \begin{align*}\bar{{\mathcal r}}_k(\boldsymbol{\theta})=\sum_{i=1}^k\left( \frac{\partial \bar{\ell}_i(\boldsymbol{\theta})}{\partial \alpha}, \frac{\partial \bar{\ell}_i(\boldsymbol{\theta})}{\partial \beta} \right)^{\top} \end{align*} $$

in the definition of our test statistics. We normalize the test statistics by

$$ \begin{align*}\hat{\mathbf{D}}_N=\frac{1}{N}\sum_{i=1}^N \left( \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} \right)^\top\left( \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} \right). \end{align*} $$

If $H_0$ and Assumptions 2.1–2.3 hold, then

$$ \begin{align*}\|\hat{\mathbf{D}}_N-\mathbf{D}\|=o_P((\log \log N)^{-1/2}), \end{align*} $$

where $\mathbf {D}$ is a positive definite matrix. The form of $\mathbf {D}$ depends on whether $E\log (\alpha _0\epsilon _0^2+\beta _0)$ is negative or nonnegative. Explicit formulas for $\mathbf {D}$ are given by Berkes et al. (Reference Berkes, Horváth and Kokoszka2003) and Francq and Zakoian (Reference Francq and Zakoian2004) for the stationary case ( $E\log (\alpha _0\epsilon _0^2+\beta _0)<0$ ) and by Jensen and Rahbek (Reference Jensen and Rahbek2004) for the explosive case. We note that Jensen and Rahbek (Reference Jensen and Rahbek2004) only consider the case when $E\log (\alpha _0\epsilon _0^2+\beta _0)>0$ . As a result of Theorem 5.1, their formula also holds when $E\log (\alpha _0\epsilon _0^2+\beta _0)=0$ . We use the same estimator, and no prior knowledge of stationarity is required.

Our proposed tests are based on the weighted functionals of the stochastic process

$$ \begin{align*}Z_N(t)=\frac{1}{N^{1/2}}\left\{\bar{\mathcal{r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\hat{\mathbf{D}}_N^{-1}\left(\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\right)^\top\right\}^{1/2},\quad\;\;0<t<1, \end{align*} $$

where $\lfloor \cdot \rfloor $ denotes the integer part of a number.

The definition of $Z_N(t)$ implies that $Z_N(t)=0$ if $0<t<1/(N+1)$ and $N/(N+1)<t<1$ ; that is, the behavior of $Z_N(t)$ is the same if t is close to 0 and 1. The estimation of the parameters of a GARCH(1,1) sequence requires large sample sizes. We only use the estimator $\hat {\boldsymbol {\theta }}_N$ that is based on the full sample. The partial derivatives in the definition of $\bar {{\mathcal r}}_k(\boldsymbol {\theta })$ can be computed quickly using the recursion in (2.2). Another popular method to find changes in the parameters of a time series is based on splitting the data at time k. The estimators are computed from both the first k and the last $N-k$ observations. The corresponding statistic maximizes the difference between the estimators as a function of k. In the case of GARCH(1,1), this would require $2(N-3)$ estimates, and many of these estimators would be based on small subsamples. However, it is well known that estimations of GARCH parameters require large sample sizes (e.g., Huang, Wang, and Yao, Reference Huang, Wang and Yao2008). Using the linearization of the quasi-likelihood estimators (cf. our Lemma 6.3 and Jensen and Rahbek, Reference Jensen and Rahbek2004), one can show that this method and our proposed method are asymptotically the same.

We use $w(t)$ , a weight function that can be zero at 0 or 1, to get higher power if the change occurs at the end or at the beginning of the observation period. We assume that the weight function $w(t)$ satisfies the following assumption.

Assumption 2.4. (i) $\inf _{\delta \leq t \leq 1-\delta }{w(t)}>0$ for all $0<\delta <1/2$ , and (ii) $w(t)$ is non-decreasing in a neighborhood of 0 and non-increasing in a neighborhood of 1.

We employ the integral test of Chibisov (Reference Chibisov1964) and O’Reilly (Reference O’Reilly1974), which is based on the functional

$$ \begin{align*}I(w, c)=\int_0^{1}\frac{1}{s(1-s)}\exp\left(-c\frac{w^2(s)}{s(1-s)}\right)ds. \end{align*} $$

We refer to Csörgő and Horváth (Reference Csörgő and Horváth1993) for the properties of $I(w,c)$ and its connection to the upper and lower classes of the Wiener process and the Brownian bridge.

Theorem 2.1. We assume that $H_0$ and Assumptions 2.1–2.4 are satisfied. (i) If $I(w,c)<\infty $ with some $c>0$ , then

$$ \begin{align*}\sup_{0<t<1}\frac{Z_N(t)}{w(t)}\stackrel{{\mathcal D}}{\to}\sup_{0<t<1}\frac{1}{w(t)}\left(B_1^2(t)+B_2^2(t)\right)^{1/2}, \end{align*} $$

where $\{B_1(t), 0\leq t \leq 1\}$ and $\{B_2(t), 0\leq t \leq 1\}$ are independent Brownian bridges. (ii) In addition,

$$ \begin{align*} \lim_{N\to \infty} P&\left\{a(\log N)\max_{1\leq k <N}\left\{ \frac{N}{k(N-k)} \bar{{\mathcal r}}_{k}(\hat{\boldsymbol{\theta}}_N)\hat{\mathbf{D}}_N^{-1}\left(\bar{{\mathcal r}}_{k}(\hat{\boldsymbol{\theta}}_N)\right)^\top\right\}^{1/2} \leq x+b(\log N) \right\}\\ & = \exp(-2 e^{-x}) \end{align*} $$

for all x, where $a(x)=(2\log x)^{1/2}$ and $b(x)=2\log x+\log \log x$ .

We note that the limit in Theorem 2.1(i) is finite with probability one if and only if $I(w,c)<\infty $ with some $c>0$ (cf. Csörgő and Horváth, 1993). The normalization in Theorem 2.1(ii) is proportional to the standard deviation of the coordinates of $\bar {{\mathcal r}}_{k}({\boldsymbol {\theta }})$ , and it is usually referred to as a maximally selected standardized statistic.

Now, we discuss briefly the consistency of tests based on Theorem 2.1. The parameter $\bar {\boldsymbol {\theta }}_0=(\alpha _0,\beta _0)^\top $ changes to $\boldsymbol {\theta }_*=({\alpha }_*,\beta _*)^\top $ at $k^*$ , and we assume the following assumptions.

Assumption 2.5. $k^*=\lfloor \lambda N\rfloor $ with some $0<\lambda <1$ ,

and

Assumption 2.6. $\alpha _*>0, \beta _*>0$ , $(\omega _0, \alpha _*,\beta _*)^\top $ is in the interior of $\boldsymbol {\Theta }$ , and $\bar {\boldsymbol {\theta }}_0\neq \boldsymbol {\theta }_*$ .

The sequence $y_0, y_i=\sigma _i\epsilon _i, \sigma _i^2=\omega _0+\alpha _0y_{i-1}^2+\beta _0\sigma _{i-1}^2, 1\leq i\leq k^*$ changes to the new regime $ y_i=\sigma _i\epsilon _i, \sigma _i^2=\omega _0+\alpha _*y_{i-1}^2+\beta _*\sigma _{i-1}^2, k^*+1\leq i\leq N$ starting from the “initial value” $y_{k^*}$ .

We use $\|\cdot \|$ for the Euclidean norm of vectors and matrices throughout this article.

Theorem 2.2. If Assumptions 2.1–2.3, 2.5, and 2.6 hold, then

$$ \begin{align*}\liminf_{N\to\infty}P\left\{\max\left(\left\|\frac{1}{N} \bar{{\mathcal r}}_{k^*}(\hat{\boldsymbol{\theta}}_N) \right\|, \left\|\frac{1}{N}\left(\bar{{\mathcal r}}_N(\hat{\boldsymbol{\theta}}_N)- \bar{{\mathcal r}}_{k^*}(\hat{\boldsymbol{\theta}}_N)\right) \right\|\right)> \delta \right\}=1 \end{align*} $$

with some $\delta>0$ .

The estimator $\hat {\mathbf {D}}_N$ might not converge in probability, but it converges along subsequences. However, it still holds that

$$ \begin{align*} &\liminf_{N\to\infty}\left\{\max\Bigg(\frac{1}{N} \left(\bar{{\mathcal r}}_{k^*}^\top(\hat{\boldsymbol{\theta}}_N)\hat{\mathbf{D}}_N^{-1}\bar{{\mathcal r}}_{k^*}(\hat{\boldsymbol{\theta}}_N)\right)^{1/2} ,\right.\\ &\quad\left.\hspace{1cm} \frac{1}{N}\left(\left(\bar{{\mathcal r}}_N(\hat{\boldsymbol{\theta}}_N)- \bar{{\mathcal r}}_{k^*}(\hat{\boldsymbol{\theta}}_N)\right)^\top\hat{\mathbf{D}}_N^{-1}\left(\bar{{\mathcal r}}_N(\hat{\boldsymbol{\theta}}_N)- \bar{{\mathcal r}}_{k^*}(\hat{\boldsymbol{\theta}}_N)\right) \right)^{1/2}\Bigg)> \delta \right\}=1 \end{align*} $$

with some $\delta>0$ . Hence, Theorem 2.2 implies that under the alternative $\sup _{0<t<1}Z_N(t)/w(t) \to \infty $ in probability, the rate of convergence is $N^{1/2}$ , and the convergence to $\infty $ is not affected by the “inconsistency” of $\hat {\mathbf {D}}_N$ under the alternative.

Throughout the article, we assume that $\omega $ remains constant and focus on parameter changes in $(\alpha , \beta )$ . Here, we elaborate on the possible effect of a change in $\omega $ . Under the “boundary” and explosive GARCH cases, our change-point test would not be affected by a change in $\omega $ . Such a change in $\omega $ could only cause complications if a stationary GARCH process ( $E\log (\alpha _0\epsilon _0^2+\beta _0)<0$ ) changes into a different stationary process ( $E\log (\alpha _*\epsilon _0^2+\beta _*)<0$ ). In this scenario, it is possible to reject $H_0^*$ even if a change in $\omega $ causes $\sup _{0<t<1}Z_N(t)/w(t)$ to exceed the critical value. If this possibility is suspected, we recommend dividing the data into two subsamples at $\hat {k}_N$ , as given by (4.1), and then testing the stationarity on both subsamples. If the two subsamples are indeed stationary, then the parameters can be estimated from each subsample. By comparing the estimates for $\omega $ , one can decide whether the change is due to $\omega $ or to $(\alpha ,\beta )$ .

We also note that Assumption 2.3 excludes parameter values on the boundary, such as $\alpha _0= 0$ or $\beta _0= 0$ , as is standard in the literature on GARCH models. This is because the QMLE $\hat {\boldsymbol {\theta }}_N$ is not asymptotically normal when $\alpha _0=0$ or $\beta _0=0$ . For these special cases, the asymptotic results are provided by Francq and Zakoian (Reference Francq and Zakoian2007) and Iglesias and Linton (Reference Iglesias and Linton2007). Therefore, one typically needs to have prior knowledge of whether a parameter of the GARCH(1,1) model is zero.

In this article, we only consider GARCH(1,1) processes. Our results can be easily extended to more general GARCH $(p,q)$ processes in the stationary case, where the condition for stationarity is that the Lyapunov exponent is negative. Under this condition, (6.5) holds (cf. Berkes et al., Reference Berkes, Horváth and Kokoszka2003; Francq and Zakoian, Reference Francq and Zakoian2004). When the Lyapunov exponent is strictly positive, Chan and Ng (Reference Chan and Ng2009) show that $\sigma _i^2\to \infty $ a.s., which allows them to establish the asymptotic normality of the QMLE. However, to obtain the analogues of Theorems 2.1 and 2.2, one needs the rate of convergence of $\sigma _i^2$ to $\infty $ . It is necessary for the weighted approximation in (5.16) to hold for GARCH $(p,q)$ processes. Our method can be extended to the more general case if one can establish even a polynomial rate for $\sigma ^2\to \infty $ , that is, a weaker result than (6.7) where the Lyapunov exponent is 0 and (6.10) where the Laypunov exponent is positive.

3 MONTE CARLO SIMULATIONS

In this section, we perform a simulation study to evaluate the finite sample performance of the proposed tests under the null and alternative hypotheses. The theory developed in Section 2 leads to two classes of tests, one based on Theorem 2.1(i) and the other based on Theorem 2.1(ii). Our simulation shows that the former has better size control and higher power. For the sake of brevity, we present simulation results for the test based on Theorem 2.1(i) and showcase its performance across all three cases of GARCH(1,1): the stationary case ( $E\log (\alpha _0\epsilon _0^2+\beta _0)<0$ ), the “boundary” case ( $E\log (\alpha _0\epsilon _0^2+\beta _0)=0$ ), and the explosive case ( $E\log (\alpha _0\epsilon _0^2+\beta _0)>0$ ).

3.1 Empirical Size under $H_0$

Under the null hypothesis, our data generating process (DGP) is a GARCH(1,1) model:

$$ \begin{align*} y_i=\sigma_i\epsilon_i\quad\;\; \text{and}\quad \;\;\sigma_i^2=\omega_0+\alpha_0 y_{i-1}^2+\beta_0 \sigma_{i-1}^2,\;\;\;1\leq i<N. \end{align*} $$

For the distribution of the innovation term, we have two settings: (i) $\epsilon _i$ follows a standard normal distribution $\mathcal {N} (0,1)$ , and (ii) $\epsilon _i$ follows the skewed t distribution of Hansen (Reference Hansen1994) with degrees of freedom $\nu $ and skewness $\lambda $ . We use $\nu =10$ and $\lambda =-0.15$ , which are taken from Patton (Reference Patton2013) and are close to the typical values implied by empirical financial returns.

We consider all three cases of GARCH(1,1). In the stationary case, the parameter values are set to $(\omega _0, \alpha _0, \beta _0)^\top = (0.014, 0.084, 0.905)^\top $ . These values are taken from Francq and Zakoian (Reference Francq and Zakoian2019) (Table 7.4, S&P 500) and reflect real data of stationary financial returns. In the explosive case, the parameter values are set to $(\omega _0, \alpha _0, \beta _0)^\top = (0.014, 0.084, 1.000)^\top $ , which violates the strict stationarity condition of GARCH(1,1). In the “boundary” case, the parameter values depend on the distribution of the innovation term: they are set to $(\omega _0, \alpha _0, \beta _0)^\top = (0.014, 0.084, 0.9219)^\top $ for the standard normal distribution, and $(\omega _0, \alpha _0, \beta _0)^\top = (0.014, 0.084, 0.9238)^\top $ for the skewed t distribution.

To implement the test based on Theorem 2.1(ii), we require a weight function that satisfies Assumption 2.4. We investigate the impact of different shapes of the weight function by specifying it as $w(t)=(t(t-1))^\kappa $ , where $\kappa = 0, 0.15, 0.25$ , and $0.35$ . From Theorem 2.1(i), we calculate the test statistic

(3.1)

$$ \begin{align} T_N = \sup_{0 < t < 1} \dfrac{Z_N (t)}{w(t)}, \end{align} $$

using the steps detailed in the Appendix.

To obtain the critical values, we simulate two independent Brownian bridges $B_1(t)$ and $B_2(t)$ on a grid of 100,000 equally spaced points in $ \left [ 0, 1\right ] $ and calculate the quantity $ \sup _{0 < t < 1} \dfrac {1}{w(t)} \left ( B_1^2(t) + B_2^2(t)\right ) ^{1/2} $ . This is repeated 100,000 times, and we obtain the empirical percentile values of this quantity at the 90%, 95%, and 99%, which are corresponding to the critical values at the 10%, 5%, and 1% levels. The resulting critical values are shown in Table 1.

Table 1 Critical values.

We now perform the Monte Carlo simulation 10,000 times and record the computed test statistics. By comparing these simulated statistics with the critical values in Table 1, we obtain the empirical size under the null hypothesis. Table 2 reports the empirical sizes at the 5% significance level with a range of values of $\kappa $ for the three GARCH(1,1) cases. The differences between the three cases are marginal, with slightly higher sizes observed in the explosive case. Additionally, the empirical sizes under the normal and skewed t innovations are similar. Notably, the empirical size depends on the choice of $\kappa $ . Recall that $\kappa $ controls the shape of the weight function, which can be chosen arbitrarily between 0 and 0.5. Our simulation reveals that a smaller $\kappa $ is more conservative in rejection, while a larger $\kappa $ leads to a higher rejection percentage. Generally, our proposed test has reasonably good size control with $\kappa =0.15$ or $0.25$ , whereas it is under-sized for $\kappa =0$ and over-sized for $\kappa =0.35$ . Overall, we recommend $\kappa =0.15$ because its empirical size is typically close to or slightly below the theoretical levels, and there is no compromise in the empirical power with this choice of $\kappa $ , as shown in Section 3.2.

Table 2 Empirical size.

3.2 Empirical Power under $H_A$

We now turn to the analysis of the empirical power. The DGP under $H_A$ is similar to that under $H_0$ but with a change in one model parameter at time $k^*$ . We set the time of the change at $k^* = 0.5N$ , corresponding to the middle of the sample period.Footnote ¹

There are various possible scenarios under the alternative hypothesis; we consider the following representative scenarios. The parameter values of GARCH(1,1) before $k^*$ are set to be the same as those in the DGP of the three GARCH cases under the null. Since $\omega $ cannot be consistently estimated in the explosive regime of GARCH(1,1), we analyze a change in either $\alpha $ or $\beta $ in the simulation under $H_A$ . We consider both positive and negative changes in $\alpha $ or $\beta $ . Specifically, after the change at $k^*$ , the parameter values are as follows:

• $H_{A,1}$ : a negative change in $\beta $ , $\beta_{\ast} = \beta _0 - 0.05$ ,
• $H_{A,2}$ : a positive change in $\beta $ , $\beta_{\ast} = \beta _0 + 0.05$ ,
• $H_{A,3}$ : a negative change in $\alpha $ , $\alpha _{\ast} = \alpha _0 - 0.05$ ,
• $H_{A,4}$ : a positive change in $\alpha $ , $\alpha _{\ast} = \alpha _0 + 0.05$ .

Table 3 reports the empirical power for changes in the stationary, “boundary”, and explosive GARCH(1,1) cases. Several noticeable observations can be made. First, the empirical power increases with the sample size N under all scenarios, indicating the consistency of the proposed test. Second, our test has higher power when detecting changes in the explosive case than in the stationary and “boundary” cases. This can be attributed to the fact by that $\sigma _i^2$ and $y_i^2$ converge to infinity at a much faster rate (see Theorem 6.1), and the terms that are not in the limit are exponentially small. Thus, changes in the explosive case can be detected more easily. Third, the power is slightly lower under the skewed t innovation than under the normal innovation. This could be because the QMLE becomes the MLE for the normal innovation, and the derivatives are calculated exactly. Lastly, the choice of $\kappa $ has a nontrivial but not substantial impact on the power. Choosing $\kappa =0.15$ can provide satisfactory and balanced performance across a variety of scenarios.

Table 3 Empirical power.

4 EMPIRICAL APPLICATIONS

In this section, we first present a prototype demonstration by applying our testing procedure to the same five individual stocks analyzed by Francq and Zakoïan (Reference Francq and Zakoïan2012). After that, we employ our test on a much larger scale by applying it to 10,620 stocks in the CRSP database. Specifically, we use all stocks with at least two years of price records between 2011 and 2022 in the NYSE, AMEX, and NASDAQ stock markets. We use the test statistic $T_N$ with $\kappa = 0.15$ because it has the best finite sample performance, reflected by its better size control and higher power in our simulation study.

4.1 Prototype Application

In the prototype application, we focus on the daily returns of the five individual stocks analyzed by Francq and Zakoïan (Reference Francq and Zakoïan2012), with the same sample period they selected. The five stocks are Icagen (Nasdaq:ICGN), Monarch Community Bancorp (Nasdaq:MCBF), KV Pharmaceutical (NYSE:KV-A), Community Bankers Trust (AMEX:BTC), and China MediaExpress (Nasdaq:CCME). The sample period is generally between 2007 and 2011, with KV-A from 2006 and CCME from 2009. These five stocks provide a useful demonstration because the nonstationary (explosive) assumption cannot be rejected for four of them at any reasonable significance level; see Table 8 in Francq and Zakoïan (Reference Francq and Zakoïan2012). Our change-point test can provide further insights into the stability of the two parameters $(\alpha , \beta )$ of GARCH(1,1) models for those five stocks.

Table 4 presents the results of the parameter estimation, the nonstationarity test of Francq and Zakoïan (Reference Francq and Zakoïan2012), and our change-point test for the five individual stock returns. We reproduce the parameter estimation and the nonstationary test results reported in Francq and Zakoïan (Reference Francq and Zakoïan2012). Regarding our change-point test, we reveal a change in the two parameters of the GARCH(1,1) model for the stock MCBF, which indicates the rejection of $H_0^*$ at the 5% significance level.

Table 4 Test results of change point.

We further use a change-point estimator

(4.1)

$$ \begin{align} \hat{k}_N= \lfloor N \hat{t}^* \rfloor, \quad \text{ where } \; \hat{t}^* = \underset{0 < t < 1}{\arg\max}\dfrac{Z_N (t)}{w(t)}, \end{align} $$

to find the time of the change. It shows that the change date is on 9 February 2009 for MCBF. By splitting the samples, we reestimate the two parameters before and after the change. The estimates are $\hat {\alpha } = 0.118, \hat {\beta }=0.886$ before the change and $\hat {\alpha } = 0.013, \hat {\beta }=0.981$ after the change. This result suggests that, in this particular case, the ARCH effect is weaker after the change, while the GARCH effect becomes stronger.

4.2 Extensive Application

We repeat our prototype analysis on the universe of U.S. stocks in the CRSP database. We select all stocks with at least 2 years of records between 4 January 2011 and 30 December 2022 in the NYSE, AMEX, and NASDAQ stock markets, resulting in a total of 10,620 stocks. We choose this period to focus on the most recent decade. We note that not all stocks have observations covering the full period, because some stocks had an initial public offering (IPO) after 4 January 2011, or were delisted before 30 December 2022. The nonstationarity test of Francq and Zakoïan (Reference Francq and Zakoïan2012) suggests that there are 631 stocks for which the null hypothesis of nonstationarity (explosivity) cannot be rejected at the 1% significance level.

We apply our change-point test to the daily returnsFootnote ² of 10,620 stocks, regardless of whether they are in a stationary or explosive regime. We find evidence of change points in $(\alpha , \beta )$ for 3,416 stocks at the 5% level. Using the change-point estimator defined in (4.1), we further find the time of the change for stocks that are rejected by our test. Figure 1 shows a histogram of the times of detected changes. As can be seen, the largest number of changes occurred in first half of 2020 when the COVID-19 pandemic begun to affect the United States economy. Thus, our test is able to detect change points that align with a well-known market event, providing validation using a large dataset.

Figure 1 Histogram of times of changes in $(\alpha , \beta )$ for 3,416 U.S. stocks.

5 Conclusions

This article provides a novel test procedure to detect changes in GARCH(1,1) processes without imposing a stationary assumption. Our methodology complements the existing literature by allowing for change points to be detected in GARCH(1,1) models not only in the stationary case but also in the explosive and “boundary” cases. A key advantage of the proposed test is that it does not require practitioners to have prior knowledge on whether their empirical observations are stationary or explosive. The test procedure is applicable in all cases, offering a useful addition to the toolbox of financial econometricians. With other relevant theories developed in the future, it would be interesting to extend the test procedure to more general GARCH $(p,q)$ processes.

6 PROOFS OF TECHNICAL RESULTS

6.1 The Properties of $\sigma _i^2$ as $i\to \infty $

The properties of $\sigma _i^2$ depend on the sign of $E\log (\alpha _i\epsilon _0^2+\beta _0)$ . We consider the following cases:

(6.1)

$$ \begin{align} E\log(\alpha_i\epsilon_0^2+\beta_0)<0, \end{align} $$

(6.2)

$$ \begin{align} E\log(\alpha_i\epsilon_0^2+\beta_0)=0, \end{align} $$

and

(6.3)

$$ \begin{align} E\log(\alpha_i\epsilon_0^2+\beta_0)>0. \end{align} $$

The case (6.1) is known as the stationary case, (6.2) as the “boundary” case, and (6.3) as the explosive case.

In the case (6.1), there is a unique stationary non anticipative sequence satisfying

(6.4)

$$ \begin{align} x_i=h_i\epsilon_i,\quad\quad h_i^2=\omega_0 +\alpha_0x_{i-1}^2+\beta_0h_{i-1}^2,\quad\;\;-\infty<i<\infty. \end{align} $$

The next theorem shows that $\sigma _i^2$ is close to $h_i^2$ , as $i\to \infty $ , if (6.1) holds. We also show that in the “boundary” and explosive cases $\sigma _i^2\to \infty $ in probability but with different rates. Thus, $\sigma _i^2$ is explosive under (6.2) as well as (6.3).

Theorem 6.1. We assume that Assumptions 2.1 and 2.2 are satisfied. (i) If (6.1) holds, then there exist $\bar {\kappa }>0$ and $0<\bar {\rho }<1$ such that

(6.5)

$$ \begin{align} E|\sigma_i^2-h_i^2|^{\bar{\kappa}}=O\left(\bar{\rho}^i\right). \end{align} $$

(ii) If (6.2) holds and

(6.6)

$$ \begin{align} E|\log(\alpha_0\epsilon_0^2+\beta_0)|^{\bar{\nu}}<\infty\;\;\;\;\;\text{with some}\;\;\;\;\bar{\nu}>2, \end{align} $$

then for all $1/\bar {\nu }<\xi <1/2$

(6.7)

$$ \begin{align} P\left\{\sigma_i^2<\exp \left(c_1i^{\xi}\right)\right\}\leq c_2\left(i^{\xi-1/2}+i^{1-\xi\bar{\nu}} \right) \end{align} $$

with some $c_1$ and $c_2$ . (iii) If (6.3) holds, then

$$ \begin{align*}\exp(-S(i-1))\sigma_i^2\to R\;\;\;\text{a.s.}, \end{align*} $$

where

$$ \begin{align*}R=\omega_0\sum_{k=0}^{\infty}\exp(-S(k))+\sigma_0^2(\alpha_0\epsilon_{0}^2+\beta_0)\geq \omega_0, \quad\quad P\{R=0\}=0, \end{align*} $$

and

(6.8)

$$ \begin{align} S(j)=\sum_{k=1}^j\log (\alpha_0\epsilon_k^2+\beta_0),\quad\quad S(0)=0. \end{align} $$

For any ${\mathcal m}<\bar {{\mathcal m}}=E(\log (\alpha _0\epsilon _0^2+\beta _0),$ we have

(6.9)

$$ \begin{align} \exp(-{\mathcal m} i)\sigma_i^2\stackrel{P}\to\infty. \end{align} $$

If (6.6) also holds, then for all $1/\bar {\nu }<\xi <1/2$

(6.10)

$$ \begin{align} P\left\{ \sigma_i^2<\exp(\bar{ {\mathcal m}} i-c_1 i^\xi)\right\}\leq c_2\left( i^{1-\xi\bar{\nu}}+\exp(-c_3i \right)) \end{align} $$

with some $c_1, c_2$ , and $c_3$ .

Proof. We begin by proving Theorem 6.1(i). The recursion in (2.1) yields

$$ \begin{align*}\sigma_i^2=\omega_0+(\alpha_0\epsilon_i^2+\beta_0)\sigma_{i-1}^2, \end{align*} $$

which can be solved explicitly

(6.11)

$$ \begin{align} \sigma_i^2&=\omega_0\sum_{\ell=1}^i\prod_{j=1}^{\ell-1}(\alpha_0\epsilon_{i-j}^2+\beta_0) +\sigma_0^2\prod_{\ell=0}^{i-1}(\alpha_0\epsilon_{\ell}^2+\beta_0). \end{align} $$

The explicit solution of the equation (6.4) is given by

$$ \begin{align*}h_i^2=\omega_0\sum_{\ell=1}^\infty\prod_{k=1}^{\ell-1}(\alpha_0\epsilon_{i-k}^2+\beta_0) \end{align*} $$

$(\prod _\emptyset =1$ ). If (6.1) holds, then there exists a $\bar {\kappa }>0$ such that

(6.12)

$$ \begin{align} \bar{\rho}=E (\alpha_0\epsilon_0^2+\beta_0)^{\bar{\kappa}}<1 \end{align} $$

(cf. Berkes et al., Reference Berkes, Horváth and Kokoszka2003; Francq and Zakoian, Reference Francq and Zakoian2004). We can assume that $0<\bar {\kappa }<1$ , so by Jensen’s inequality

$$ \begin{align*} E\left(\sum_{\ell=i+1}^\infty\prod_{j=1}^{\ell-1}(\alpha_0\epsilon_{i-j}^2+\beta_0)\right)^{\bar{\kappa}} \leq \sum_{\ell=i+1}^\infty E\left(\prod_{j=1}^{\ell-1}(\alpha_0\epsilon_{i-j}^2+\beta_0)\right)^{\bar{\kappa}}\leq \sum_{\ell=i+1}^\infty \bar{\rho}^{\ell-1}{,} \end{align*} $$

and

$$ \begin{align*} E\left(\prod_{\ell=0}^{i-1}(\alpha_0\epsilon_{\ell}^2+\beta_0)\right)^{\bar{\kappa}}\leq \frac{\bar{\rho}^{i-1}}{1-\bar{\rho}}. \end{align*} $$

Hence, (6.5) in Theorem 6.1(i) is proven.

We now proceed with the proof of Theorem 6.1(ii). It follows from (6.11) that

(6.13)

$$ \begin{align} \sigma_i^2&\geq \omega_0\sum_{\ell=1}^i\prod_{j=1}^{\ell-1}(\alpha_0\epsilon_{i-j}^2+\beta_0)\\ &\geq \omega_0\max_{1\leq \ell\leq i}\prod_{j=1}^{\ell-1}(\alpha_0\epsilon_{i-j}^2+\beta_0) \notag\\ &=\omega_0\exp\left(\max_{1\leq \ell\leq i}\sum_{j=1}^{\ell-1}\log(\alpha_0\epsilon_{i-j}^2+\beta_0)\right).\notag \end{align} $$

Using Assumption 2.1, we have

$$ \begin{align*} P\left\{\max_{1\leq \ell\leq i}\sum_{j=1}^{\ell-1}\log(\alpha_0\epsilon_{i-j}^2+\beta_0)<c_1i^{\xi} \right\} =P\left\{\max_{1\leq \ell\leq i-1}S(\ell)< c_1i^{\xi}\right\}, \end{align*} $$

where $S(\ell )$ is defined in (6.8). Let ${\mathcal m}_1^2=E(\log (\alpha _0\epsilon _0^2+\beta _0)^2{)}$ . Using the Komlós–Major–Tusnády approximation (cf. Theorem 2.6.7 in Csörgo and Révész, Reference Csörgo and Révész1981), there is a Wiener process $\{W(x), x\geq 0\}$ such that

$$ \begin{align*} P\left\{ \max_{0\leq x\leq i}|S(\lfloor x\rfloor)-{\mathcal m}_1W(x)|>x^\xi\right\}\leq c_3i^{1-\xi\bar{\nu}}. \end{align*} $$

By the scale transformation of $W(x)$ and Theorem 1.5.1 in Csörgő and Révész (1981), we have

$$ \begin{align*} P\left\{ \sup_{0\leq x \leq i}W(x)\leq c_4i^\xi \right\}=P\left\{|{\mathcal N}|\leq c_4 i^{\xi-1/2}\right\}\leq c_5 i^{\xi-1/2}, \end{align*} $$

where ${\mathcal N}$ is a standard normal random variable, completing the proof of (6.6) and (6.7) in Theorem 6.1(ii).

Lastly, we turn to the proof of Theorem 6.1(iii). The formula in (6.11) can be written as

(6.14)

$$ \begin{align} \sigma_i^2 =\omega_0\sum_{\ell=1}^i\exp(S(i-1)-S(i-\ell))+\sigma_0^2(\alpha_0\epsilon_{0}^2+\beta_0)\exp(S(i-1)), \end{align} $$

where $S(j)$ is defined in (6.8). Thus, we obtain

$$ \begin{align*}\sigma_i^2=\exp(S(i-1))\left(\omega_0\bar{R}_i+\sigma_0^2(\alpha_0\epsilon_0^2+\beta_0)\right), \end{align*} $$

where

$$ \begin{align*}\bar{R}_i=\sum_{k=0}^{i-1}\exp(-S(k)). \end{align*} $$

According to the strong law of large numbers

$$ \begin{align*}\lim_{k\to\infty}\frac{1}{k}S(k)= E\log(\alpha_0\epsilon_0^2+\beta_0)>0\quad\quad \text{a.s.} \end{align*} $$

and therefore

$$ \begin{align*}\lim_{k\to\infty}\bar{R}_k=\bar{R}= \sum_{\ell=0}^{\infty}\exp(-S(\ell)), \quad\;\;\;\text{a.s.}\quad \end{align*} $$

and $P\{\bar {R}<\infty \}=1$ . We again use (6.13). By the Komlós–Major–Tusnády approximation (cf. Theorem 2.6.7 in Csörgő and Révész, 1981), we can define a Wiener process $\{W(x), x\geq 0\}$ such that

$$ \begin{align*} P\left\{ \max_{0\leq x \leq i} |S(\lfloor x\rfloor)-({\mathcal m}_1W(x)+\bar{{\mathcal m}}x) |>i^\xi \right\}\leq c_4 i^{1-\xi\bar{\nu}}. \end{align*} $$

By the scale transformation of the Wiener process, we have

$$ \begin{align*} \sup_{0\leq x\leq i}({\mathcal m}_1W(x)+\bar{{\mathcal m}}x)\stackrel{{\mathcal D}}{=}\frac{{\mathcal m}_1^2}{\bar{{\mathcal m}}}\sup_{0\leq u\leq i\bar{{\mathcal m}}^2/{\mathcal m}_1^2}(W(u)+u). \end{align*} $$

Applying again the Komlós–Major–Tusnády approximation, we can define a Poisson process $\{{\mathcal P}(x), x\geq 0\}$ such that

(6.15)

$$ \begin{align} P\left\{ \sup_{1\leq x\leq i}|(W(x)+x)-{\mathcal P}(x)|>i^\xi \right\}\leq c_5 i^{1-\xi\bar{\nu}}. \end{align} $$

It can be seen that

$$ \begin{align*}\sup_{0\leq x \leq i}{\mathcal P}(x)={\mathcal P}(i). \end{align*} $$

Thus, we can use (6.15) to show that

$$ \begin{align*} P\left\{\max_{1\leq \ell\leq i}S(\ell)<c_6 x^\xi\right\}&\leq P\left\{W(i)+i\leq 2c_6x^\xi\right\}+c_7 i^{1-\xi\bar{\nu}}\\ &\leq P\left\{ {\mathcal N}<-i +2c_6x^\xi\right\}\\ &\leq 2 c_7 i^{1-\xi\bar{\nu}}+c_8\exp(-i^2/3), \end{align*} $$

where ${\mathcal N}$ is a standard normal random variable, completing the proof of (6.8) and (6.9) in Theorem 6.1(iii).

Remark 6.1. It follows from Assumption 2.1 and (6.5) that

$$ \begin{align*}E|y_i^2-x_i^2|^{\bar{\kappa}}=O\left(\bar{\rho}^i\right). \end{align*} $$

If $F(u)$ denotes the distribution function of $\epsilon _0^2$ and

$$ \begin{align*}|F(u)-F(v)|\leq c_1|u-v|^{c_2}\;\;\;\;\;\text{in a {neighborhood} of 0,} \end{align*} $$

then

$$ \begin{align*}P\left\{\epsilon_0^2\leq i^{-c_3/c_2}\right\}\leq c_1i^{-c_3}, \end{align*} $$

and therefore $y_i^2$ increases to $\infty $ at an exponential rate.

6.2 Proofs of Theorems 2.1 and 2.1

Proof of Theorem 2.1.

We show in Sections 6.3 and 6.4 that we can define Gaussian processes $\{\mathbf {W}_N(x), x\geq 0\}$ such that

(6.16)

$$ \begin{align} \max_{x\in [1, N-1]}\left(\frac{N}{x(N-x)}\right)^\nu\left\| \bar{{\mathcal r}}_{ x }(\hat{\boldsymbol{\theta}}_N)-\boldsymbol \Gamma_N(x) \right\|=O_P(1) \end{align} $$

with some $\nu <1/2$ ,

$$ \begin{align*}\boldsymbol \Gamma_N(x)=\mathbf{W}_N(x)-\frac{x}{N}\mathbf{W}_N(N),\;\;\;0\leq x \leq N, \end{align*} $$

$E\mathbf {W}_N(x)=\mathbf {0}$ , and $E\mathbf {W}_N(x)\mathbf {W}^\top _N(y)=\min (x,y)\mathbf {D}$ . Checking the covariance functions, one can verify that

(6.17)

$$ \begin{align} \left\{N^{-1/2}\mathbf{D}^{-1/2}\boldsymbol \Gamma_N(Nt), 0\leq t \leq 1\right\}\stackrel{{\mathcal D}}{=}\left\{\mathbf{B}(t), 0\leq t \leq 1 \right\}, \end{align} $$

(6.18)

$$ \begin{align} \mathbf{B}(t)=\left(B_1(t), B_2(t)\right)^\top, \end{align} $$

where $\{B_1(t), 0\leq t \leq 1\}$ and $\{B_2(t), 0\leq t \leq 1\}$ are independent Brownian bridges. The approximation in (6.16) can be rewritten as

(6.19)

$$ \begin{align} N^{1/2-\nu}\sup_{1/(N+1)\leq t \leq 1-1/(N+1)}&\frac{1}{(t(1-t))^{\nu}}\left\|N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt) \right\|\\ &=O_P(1){,} \notag \end{align} $$

which implies that

(6.20)

$$ \begin{align} \sup_{1/(N+1)\leq t \leq 1-1/(N+1)}&\frac{1}{(t(1-t))^{1/2}}\left\|N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt) \right\| =O_P(1). \end{align} $$

We can show that (6.19) and (6.20) yield

(6.21)

$$ \begin{align} \sup_{0<t<1}\frac{1}{w(t)}\frac{1}{N^{1/2}}\left\{\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N){\mathbf{D}}^{-1}\left(\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\right)^\top\right\}^{1/2}\stackrel{{\mathcal D}}{\to}\sup_{0<t<1}\frac{1}{w(t)}\left\|\mathbf{B}(t)\right\|. \end{align} $$

Using Assumption 2.4(i) and the approximation in (6.19), we get for all $0<\delta <1/2$ that

(6.22)

$$ \begin{align} \sup_{\delta\leq t\leq 1-\delta}\frac{1}{w(t)}\frac{1}{N^{1/2}}\left\{\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N){\mathbf{D}}^{-1}\left(\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\right)^\top\right\}^{1/2}\stackrel{{\mathcal D}}{\to}\sup_{\delta\leq t\leq 1-\delta}\frac{1}{w(t)}\|\mathbf{B}(t)\|. \end{align} $$

If $I(w,c)<\infty $ with some $c>0$ , then

(6.23)

$$ \begin{align} \lim_{t\to 0}\frac{t^{1/2}}{w(t)}=0 \end{align} $$

(cf. Section 4.1 in Csörgő and Horváth, 1993). We have, by (6.20), that

$$ \begin{align*} \sup_{1/(N+1)<t\leq \delta}&\frac{1}{w(t)}\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt)\right\|\\ &=\sup_{1/(N+1)<t\leq \delta}\frac{t^{1/2}}{w(t)}\frac{1}{t^{1/2}}\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt)\right\|\\ &=\sup_{1/(N+1)<t\leq \delta}\frac{t^{1/2}}{w(t)}O_P(1). \end{align*} $$

Hence, we conclude that

$$ \begin{align*} \lim_{\delta\to 0}\limsup_{N\to \infty}P\left\{\sup_{1/(N+1)\leq t\leq \delta}\frac{1}{w(t)}\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt) \right\|>u \right\}=0 \end{align*} $$

for all $u>0$ and by symmetry

$$ \begin{align*} \lim_{\delta\to 0}\limsup_{N\to \infty}P\!\left\{\!\sup_{1-\delta\leq t\leq 1-1/(N+1)}\!\frac{1}{w(t)}\!\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt) \right\|\!>\!u\! \right\}\!=\!0. \end{align*} $$

Thus, we get

$$ \begin{align*} \sup_{1/(N+1)\leq t\leq 1-1/(N+1)}\frac{1}{w(t)}\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)-N^{-1/2}\boldsymbol \Gamma_N(Nt) \right\|=o_P(1). \end{align*} $$

Using (6.23) and the definition of $\bar {{\mathcal r}}_{\lfloor (N+1)t\rfloor }$ , we obtain

$$ \begin{align*}\sup_{0<t\leq 1/(N+1)}\frac{1}{w(t)}\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor} (\hat{\boldsymbol{\theta}}_N) \right\|=o_P(1) \end{align*} $$

and

$$ \begin{align*}\sup_{N/(N+1)\leq t<1}\frac{1}{w(t)}\left\| N^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor} (\hat{\boldsymbol{\theta}}_N) \right\|=o_P(1). \end{align*} $$

Because of (6.17) and (6.18)

$$ \begin{align*} \sup_{0<t\leq \delta}\frac{1}{w(t)}\left\|N^{-1/2}\mathbf{D}^{-1/2}\boldsymbol \Gamma_N(Nt) \right\|\;\stackrel{{\mathcal D}}{=}\; \sup_{0<t\leq \delta}\frac{1}{w(t)}\left\|\mathbf{B}(t)\right\| \end{align*} $$

and

$$ \begin{align*}\lim_{\delta\to 0}\sup_{0<t\leq \delta}\frac{1}{w(t)}\left\|\mathbf{B}(t)\right\|=c_1\quad\quad a.s. \end{align*} $$

(cf. Section 4.1 in Csörgő and Horváth, 1993). By symmetry,

$$ \begin{align*} \sup_{1-\delta \leq t<1}\frac{1}{w(t)}\left\|N^{-1/2}\mathbf{D}^{-1/2}\boldsymbol \Gamma_N(Nt) \right\|\;\stackrel{{\mathcal D}}{=}\; \sup_{1-\delta \leq t<1}\frac{1}{w(t)}\left\|\mathbf{B}(t)\right\|, \end{align*} $$

and

$$ \begin{align*}\lim_{\delta\to 0}\sup_{1-\delta\leq t<1}\frac{1}{w(t)}\left\|\mathbf{B}(t)\right\|=c_2\quad\quad a.s. \end{align*} $$

completing the proof of the first part of Theorem 2.1.

According to the law of the iterated logarithm,

(6.24)

$$ \begin{align} \frac{1}{(2\log \log N)^{1/2}}\sup_{1/(N+1)\leq t \leq 1-1/(N+1)}\frac{\|\mathbf{B}(t)\|}{(t(1-t))^{1/2}}\stackrel{P}{\to}1, \end{align} $$

Csörgő and Horváth (1993) proved that

(6.25)

$$ \begin{align} A_{N,1}=\sup_{1/(N+1)\leq t \leq (\log N)^4/N}\frac{\|\mathbf{B}(t)\|}{(t(1-t))^{1/2}}=O_P((\log \log \log N)^{1/2}), \end{align} $$

(6.26)

$$ \begin{align} A_{N,2}=\sup_{ 1-(\log N)^4/N\leq t \leq 1-1/(N+1)}\frac{\|\mathbf{B}(t)\|}{(t(1-t))^{1/2}}=O_P((\log \log \log N)^{1/2}). \end{align} $$

Now putting together (6.20) and (6.24)–(6.26), we obtain

(6.27)

$$ \begin{align} \frac{1}{(2\log \log N)^{1/2}}\sup_{1/(N+1)\leq t \leq 1-1/(N+1)}\frac{1}{(t(1-t))^{1/2}} \left\|N^{-1/2}\mathbf{D}^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\right\|\stackrel{P}{\to} 1, \end{align} $$

(6.28)

$$ \begin{align} A_{N,3}=\sup_{1/(N+1)\leq t \leq (\log N)^4/N}&\frac{1}{(t(1-t))^{1/2}}\left\|N^{-1/2}\mathbf{D}^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\right\|\\ &=O_P((\log \log \log N)^{1/2}) \notag{,} \end{align} $$

and

(6.29)

$$ \begin{align} A_{N,4}=\sup_{ 1-(\log N)^4/N\leq t \leq 1-1/(N+1)}&\frac{1}{(t(1-t))^{1/2}}\left\|N^{-1/2}\mathbf{D}^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}(\hat{\boldsymbol{\theta}}_N)\right\|\\ &=O_P((\log \log \log N)^{1/2}).\notag \end{align} $$

It follows from (6.24) to (6.29) that

$$ \begin{align*} a(\log N)\max\left\{A_{N,1}, A_{N,2} \right\}-b(\log N)\;\stackrel{P}{\to}-\infty \end{align*} $$

and

$$ \begin{align*} a(\log N)\max\left\{A_{N,3}, A_{N,4} \right\}-b(\log N)\;\stackrel{P}{\to}-\infty \end{align*} $$

resulting in

$$ \begin{align*} \lim_{N\to \infty}&P\Bigg\{\sup_{1/(N+1)\leq t \leq 1-1/(N+1)}\frac{1}{(t(1-t))^{1/2}} \left\|N^{-1/2}\mathbf{D}^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}\right\|\\ &\hspace{2cm}=\sup_{(\log N)^4/(N+1)\leq t \leq 1-(\log N)^4/(N+1)}\frac{1}{(t(1-t))^{1/2}}\\ &\hspace{3cm}\left\|N^{-1/2}\mathbf{D}^{-1/2}\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}\right\| \Bigg\} =1. \end{align*} $$

Using (6.19), we conclude that

$$ \begin{align*} \sup_{(\log N)^4/(N+1)\leq t \leq 1-(\log N)^4/(N+1)}&\frac{1}{(t(1-t))^{1/2}} \left\|N^{-1/2}\mathbf{D}^{-1/2}\left(\bar{{\mathcal r}}_{\lfloor (N+1)t\rfloor}-\boldsymbol \Gamma_N(Nt)\right)\right\|\\ &=o_P((\log \log N)^{1/2}). \end{align*} $$

Csörgő and Horváth (1993) proved that

$$ \begin{align*} \lim_{N\to \infty}P\Bigg\{a(\log N)\sup_{(\log N)^4/(N+1)\leq t \leq 1-(\log N)^4/(N+1)}&\frac{\|\mathbf{B}(t)\|}{(t(1-t))^{1/2}}\leq x +b(\log N) \Bigg\}\\ &=\exp\left(-2e^{-x}\right), \end{align*} $$

and therefore the limit result in Theorem 2.1 follows directly from (6.17).

Proof of Theorem 2.2.

Since $\boldsymbol {\Theta }$ is compact and $\bar {\ell }_i(\boldsymbol {\theta })$ is continuous, the set $\tilde {\boldsymbol {\Theta }}$ of the possible limit points of $\hat {\boldsymbol {\theta }}_N$ is countable. It follows from the proofs in Sections 6.3 and 6.4 that

$$ \begin{align*}\frac{1}{N}\bar{{\mathcal r}}_{k^*}(\boldsymbol{\theta})\;\stackrel{P}{\to}\;{\mathcal a}_1(\boldsymbol{\theta}) \end{align*} $$

and

$$ \begin{align*}\frac{1}{N}\left(\bar{{\mathcal r}}_N(\boldsymbol{\theta})-\bar{{\mathcal r}}_{k^*}(\boldsymbol{\theta})\right)\;\stackrel{P}{\to}\;{\mathcal a}_2(\boldsymbol{\theta}), \end{align*} $$

uniformly in $\boldsymbol {\theta }\in \boldsymbol {\Theta }$ . We note that ${\mathcal a}_1(\boldsymbol {\theta })=\mathbf {0}$ if and only if the first two coordinates of $\boldsymbol {\theta }$ are $(\alpha _0,\beta _0)$ and similarly ${\mathcal a}_2(\boldsymbol {\theta })=\mathbf {0}$ if and only if the first two coordinates of $\boldsymbol {\theta }$ are $(\alpha _*,\beta _*)$ . Hence, for any $\tilde {\boldsymbol {\theta }}\in \tilde {\boldsymbol {\Theta }}$ ,

$$ \begin{align*}\max\left(\left\|{\mathcal a}_1(\tilde{\boldsymbol{\theta}})\right\|,\left\|{\mathcal a}_2(\tilde{\boldsymbol{\theta}})\right\|\right)>0. \end{align*} $$

We only need to show that

(6.30)

$$ \begin{align} \inf\left\{\max\left(\left\|{\mathcal a}_1(\tilde{\boldsymbol{\theta}})\right\|,\left\|{\mathcal a}_2(\tilde{\boldsymbol{\theta}})\right\|\right):\;\tilde{\boldsymbol{\theta}}\in\tilde{\boldsymbol{\Theta}}\right\}>0. \end{align} $$

If the $\inf $ in (6.30) were 0, then we could construct a subsequence $\hat {\boldsymbol {\theta }}_{N_k}$ that would converge in probability to a point where the first two sets of coordinates are ${(}\alpha _0,\beta _0{)}$ and ${(}\alpha _*,\beta _*{)}$ , which contradicts Assumption 2.6.

6.3 Proof of (6.16) When $E\kern1pt\mathbf{log}\kern1pt (\alpha _0\epsilon _0^2+\beta _0)<0$

We use several results obtained by Berkes et al. (Reference Berkes, Horváth and Kokoszka2003) and Francq and Zakoian (Reference Francq and Zakoian2004). A survey on volatility processes with detailed proofs is given in Francq and Zakoian (Reference Francq and Zakoian2010). First, we define

$$ \begin{align*}\bar{h}^2_i(\boldsymbol{\theta})=\omega+\alpha x_{i-1}^2+\beta\bar{h}_{i-1}^2(\boldsymbol{\theta}), \end{align*} $$

the stationary version of $\bar {\sigma }^2_i(\boldsymbol {\theta })$ . Berkes et al. (Reference Berkes, Horváth and Kokoszka2003) proved that

$$ \begin{align*}\bar{h}^2_i(\boldsymbol{\theta})={\frak c}_0(\boldsymbol{\theta})+\sum_{\ell=1}^\infty{\frak c}_\ell(\boldsymbol{\theta}) x_{i-\ell}^2, \end{align*} $$

where the functions $\{{\frak c}_\ell (\boldsymbol {\theta }), 0\leq \ell <\infty \}$ are analytical, and

$$ \begin{align*}\sup\left\{|{\frak c}_\ell(\boldsymbol{\theta})|:\;\boldsymbol{\theta}\in\boldsymbol{\Theta}\right\}=O(\rho_1^\ell)\quad\;\;\text{with some}\quad\;\;0<\rho_1<1. \end{align*} $$

Hence, similarly to (6.5)

(6.31)

$$ \begin{align} \sup\left\{|\bar{h}_i^2(\boldsymbol{\theta})-\bar{\sigma}_i^2(\boldsymbol{\theta})|:\;\boldsymbol{\theta}\in\boldsymbol{\Theta}\right\}=O(\rho_2^i)\quad\text{a.s.}\;\;\;\;\text{with some}\quad\;\;0<\rho_2<1. \end{align} $$

It is also known that

(6.32)

$$ \begin{align} \|\hat{\boldsymbol{\theta}}_N-\boldsymbol{\theta}_0\|=O_P(N^{-1/2}) \end{align} $$

(cf. Berkes et al., Reference Berkes, Horváth and Kokoszka2003; Francq and Zakoian, Reference Francq and Zakoian2004). Furthermore, there exits a $\boldsymbol {\Theta }_0$ , a neighborhood of $\boldsymbol {\theta }_0$ , such that

(6.33)

$$ \begin{align} \sup\left\{\|\nabla\bar{\sigma}_i^2(\boldsymbol{\theta})-\nabla\bar{h}_i^2(\boldsymbol{\theta})\|:\;\boldsymbol{\theta}\in\boldsymbol{\Theta}_0\right\}=O(\rho_3^i)\quad\text{a.s.}\;\;\;\;\text{with some}\quad\;\;0<\rho_3<1. \end{align} $$

It follows from the representation that $h_i^2$ is a Bernoulli shift; that is, there is a function $z: R^\infty \to R$ such that $h_i^2=z(\epsilon _{i-1}^2, \epsilon _{i-2}^2, \dots )$ . We show in the following lemma that $h^2_i$ is decomposable at an exponential rate. Let

$$ \begin{align*}h_{i,j}^2=\omega_0\sum_{\ell=1}^j\prod_{k=1}^{\ell-1}(\alpha_0\epsilon_{i-k}^2+\beta_0) +\omega_0\sum_{\ell=j+1}^\infty\prod_{k=1}^{\ell-1}(\alpha_0(\epsilon_{i,i-k,j}^*)^2+\beta_0), \quad\;\;j\geq 1, \end{align*} $$

and

$$ \begin{align*}h_{i,j}^2=\omega_0\sum_{\ell=1}^\infty\prod_{k=1}^{\ell-1}(\alpha_0(\epsilon_{i,i-k,j}^*)^2+\beta_0), \quad\;\;j\leq 0, \end{align*} $$

where $\{\epsilon _{i,k,j}^*, -\infty <i,k,j<\infty \}$ are independent copies of $\epsilon _0$ . We also define

$$ \begin{align*} x_{i,j}=\left\{ \begin{array}{@{}ll} \epsilon_ih_{i-1,j},\quad j\geq 1, \\ \epsilon_{i,i,j}^*h_{i-1,j},\quad j\leq 0. \end{array} \right. \end{align*} $$

In the next lemma, we establish the decomposable Bernoulli property of

$$ \begin{align*}\mathbf{g}_i=\left(\frac{1}{h_i^2}\frac{\partial \bar{h}_i(\boldsymbol{\theta}_0)}{\partial\alpha}, \frac{1}{h_i^2}\frac{\partial \bar{h}_i(\boldsymbol{\theta}_0)}{\partial\beta}, \frac{1}{h_i^2}\frac{\partial \bar{h}_i(\boldsymbol{\theta}_0)}{\partial\omega} \right)^\top, \end{align*} $$

and we approximate $\mathbf {g}_i$ by

$$ \begin{align*}\mathbf{g}_{i,j}=\left(\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}_{i,j}(\boldsymbol{\theta}_0)}{\partial\alpha}, \frac{1}{h_{i,j}^2}\frac{\partial \bar{h}_{i,j}(\boldsymbol{\theta}_0)}{\partial\beta}, \frac{1}{h_{i,j}^2}\frac{\partial \bar{h}_{i,j}(\boldsymbol{\theta}_0)}{\partial\omega} \right)^\top. \end{align*} $$

Lemma 6.1. If $H_0$ , Assumptions 2.1–2.3, and (6.1) are satisfied, then

(6.34)

$$ \begin{align} \left(E|h^2_i-h_{i,j}^2|^{\bar{\kappa}}\right)^{1/\bar{\kappa}}\leq c_1\bar{\rho}^j,\quad\;\;1\leq j<\infty \end{align} $$

and

(6.35)

$$ \begin{align} \left(E|x^2_i-x_{i,j}^2|^{\bar{\kappa}}\right)^{1/\bar{\kappa}}\leq c_2\bar{\rho}^j,\quad\;\;1\leq j<\infty, \end{align} $$

where $c_1, c_2>0$ , and $0<\bar {\rho }<1$ , $\bar {\kappa }>0$ are defined in (6.12). For all $\kappa _1>1$

(6.36)

$$ \begin{align} \left(E\|\mathbf{g}_i-\mathbf{g}_{i,j}\|^{\kappa_1}\right)^{1/\kappa_1}\leq c_3\rho_1^j,\quad\;\;1\leq j<\infty, \end{align} $$

with some $c_3=c_3(\kappa _1)$ , and $0<\rho _1=\rho _1(\kappa _1)<1$ .

Proof. We can assume without loss of generality that $0<\bar {\kappa }<1$ , and therefore by Minkowski’s inequality,

(6.37)

$$ \begin{align} E\left(\sum_{\ell=j+1}^\infty\prod_{m=1}^{\ell-1}(\alpha_0\epsilon_{i-m}^2+\beta_0)\right)^{\bar{\kappa}}\leq c_4\bar{\rho}^j. \end{align} $$

Using the representation of $h_i^2$ and the definition of $h_{i,j}^2$ , we obtain

$$ \begin{align*} |h_i^2-h^2_{i,j}|\leq \sum_{\ell=j+1}^\infty\prod_{m=1}^{\ell-1}(\alpha_0\epsilon_{i-m}^2+\beta_0) +\omega_0\sum_{\ell=j+1}^\infty\prod_{k=1}^{\ell-1}(\alpha_0(\epsilon_{i,i-k,j}^*)^2+\beta_0), \quad\;\;j\geq 1, \end{align*} $$

and thus (6.34) follows from (6.37). Observing that

$$ \begin{align*}E|x^2_i-x^2_{i,j}|^{\bar{\kappa}}= E|\epsilon_0|^{\bar{\kappa}}E|h_{i-1}^2-h^2_{i-1,j}|^{\bar{\kappa}}, \end{align*} $$

(6.35) is also proven.

It is also shown in Berkes et al. (Reference Berkes, Horváth and Kokoszka2003) (cf. also Francq and Zakoian, Reference Francq and Zakoian2010) that for all $\kappa _2>0$ ,

(6.38)

$$ \begin{align} E\left(\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}\right)^{\kappa_2}\leq c_5(\kappa_2). \end{align} $$

Along the lines of the proof of (6.34), one can verify that there exists a $\kappa _3>0$ such that

(6.39)

$$ \begin{align} E\left|\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}-\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial\alpha}\right|^{\kappa_3}\leq c_5\rho_3^j,\quad\text{with some}\;\;0<\rho_3<1. \end{align} $$

Let

$$ \begin{align*} A_{i,j}=\Bigg\{|h^2_i-h^2_{i,j}|>\bar{\rho}^{j/(2{\bar{\kappa}})}\;\;\;\text{or}\;\;\; \left|\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}-\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial\alpha}\right|>\rho_3^{j/(2{\kappa_3})} \Bigg\}. \end{align*} $$

Then, by Markov’s inequality,

(6.40)

$$ \begin{align} P\{A_{i,j}\}\leq c_6\max(\bar{\rho}^{j/2}, \rho_3^{j/2}). \end{align} $$

Now

$$ \begin{align*} &\left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}-\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right|\\ &\hspace{.5cm}\leq \left( \left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}\right| +\left|\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right| \right)I\{ A_{i,j} \}+\frac{1}{\omega_0}\left|\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}-\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial\alpha}\right|I\{\bar{A}_{i,j}\}\\ &\hspace{2cm}+\frac{1}{\omega_0}\left|\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right| \left|\bar{h}_i^2(\boldsymbol{\theta}_0)-\bar{h}_{i,j}^2(\boldsymbol{\theta}_0)\right|I\{\bar{A}_{i,j}\}\\ &\hspace{.5cm}\leq \left( \left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}\right| +\left|\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right| \right)I\{ A_{i,j} \}+\frac{1}{\omega_0}\rho_3^{j/(2/{\kappa_3})}\\&\hspace{2cm} +\frac{1}{\omega_0}\left|\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right|\bar{\rho}^{j/(2/{\bar{\kappa}})}{,} \end{align*} $$

and therefore by Minkowski’s inequality, for any $\kappa _4\geq 2 $ , we have

$$ \begin{align*} &\left(\left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}-\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right|^{\kappa_4}\right)^{1/{\kappa_4}}\leq 2\left( E\left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}\right|^{\kappa_4}I\{ A_{i,j}\}\right)^{1/{\kappa_4}}\\ &\hspace{1cm}+ \frac{1}{\omega_0}\rho_3^{j/(2/{\kappa_3})}+\frac{1}{\omega_0}\bar{\rho}^{j/(2/{\bar{\kappa}})} \left(E\left|\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right|^{\kappa_4}\right)^{\kappa_4}. \end{align*} $$

Since by the Cauchy–Schwartz inequality

$$ \begin{align*}E\left\{\left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}\right|^{\kappa_4}I\{ A_{i,j}\}\right\} \leq \left\{E\left(\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}\right)^{2{\kappa_4}}\right\}^{1/2}\left\{ P\{ A_{i,j}\}\right\}^{1/2}, \end{align*} $$

we conclude that for all $\kappa _5\geq 2$ there is $0<\rho _5<1$ such that

(6.41)

$$ \begin{align} \left(E\left|\frac{1}{h_i^2}\frac{\partial \bar{h}^2_i(\boldsymbol{\theta}_0)}{\partial\alpha}-\frac{1}{h_{i,j}^2}\frac{\partial \bar{h}^2_{i,j}(\boldsymbol{\theta}_0)}{\partial \alpha} \right|^{\kappa_5}\right)^{1/{\kappa_5}}\leq c_7 \rho^j_5. \end{align} $$

Hence, the decomposability of the first coordinate of $\mathbf {g}_i$ with an exponential rate is proven. The same argument can be applied to the other two coordinates.

Next, we introduce the stationary version of the likelihood function:

$$ \begin{align*}\bar{f}_i(\boldsymbol{\theta})=\log \bar{h}_{i}^2(\boldsymbol{\theta})+\frac{x_i^2}{\bar{h}_{i}^2(\boldsymbol{\theta})},\quad\quad-\infty<i<\infty, \end{align*} $$

$$ \begin{align*}\bar{{\mathcal f}}_k(\boldsymbol{\theta})=\sum_{i=1}^k\bar{f}_i(\boldsymbol{\theta}), \end{align*} $$

and

$$ \begin{align*}\bar{f}_{i,j}(\boldsymbol{\theta})=\log \bar{h}_{i,j}^2(\boldsymbol{\theta})+\frac{x_i^2}{\bar{h}_{i,j}^2(\boldsymbol{\theta})}. \end{align*} $$

Lemma 6.2. If $H_0$ , Assumptions 2.1–2.3, and (6.1) are satisfied, then

(6.42)

$$ \begin{align} \left(\left\|\nabla(\bar{f}_i(\boldsymbol{\theta}_0) - \bar{f}_{i,j}(\boldsymbol{\theta}_0)\right\|^{\kappa/2}\right)^{2/\kappa}\leq c\rho^j \end{align} $$

and

(6.43)

$$ \begin{align} \left(\left\|\nabla^2(\bar{f}_i(\boldsymbol{\theta}_0) - \bar{f}_{i,j}(\boldsymbol{\theta}_0) \right\|^{\kappa/2}\right)^{2/\kappa}\leq c\rho^j \end{align} $$

with some $0<\rho <1$ , and $\kappa>4$ is defined in Assumption 2.1.

Proof. The result in (6.42) is an immediate consequence of Lemma 6.1 since

$$ \begin{align*}\nabla\bar{f}_i(\boldsymbol{\theta}_0)=(1-\epsilon_i^2)\mathbf{g}_{i}. \end{align*} $$

Similar arguments give (6.43).

Let

$$ \begin{align*}\mathbf{H}=E\nabla^2\bar{f}_0(\boldsymbol{\theta}_0). \end{align*} $$

Berkes et al. (Reference Berkes, Horváth and Kokoszka2003) and Francq and Zakoian (Reference Francq and Zakoian2004) prove that $\mathbf {H}$ is not singular.

Lemma 6.3. If $H_0$ , Assumptions 2.1–2.3, and (6.1) are satisfied, then

$$ \begin{align*}N(\hat{\boldsymbol{\theta}}_N-\boldsymbol{\theta}_0)=-\mathbf{H}^{-1}\nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)+O_P(1). \end{align*} $$

Proof. Berkes et al. (Reference Berkes, Horváth and Kokoszka2003) and Francq and Zakoian (Reference Francq and Zakoian2004) show that there exists a $\boldsymbol {\Theta }_0$ , a neighborhood of $\boldsymbol {\theta }_0$ , such that

(6.44)

$$ \begin{align} \sup\left\{ \frac{1}{N}\left\| \nabla^3 \bar{{\mathcal f}}_N(\boldsymbol{\theta}) \right\|:\;\boldsymbol{\theta}\in\boldsymbol{\Theta}_0 \right\}=O_P(1) \quad\text{a.s.} \end{align} $$

By (6.31), we have

$$ \begin{align*}\|\nabla{{\mathcal L}}_N(\hat{\boldsymbol{\theta}}_N)-\nabla\bar{{\mathcal f}}_N(\hat{\boldsymbol{\theta}}_N)\|=O_P(1) \end{align*} $$

and since

$$ \begin{align*}\nabla\bar{{\mathcal L}}_N(\hat{\boldsymbol{\theta}}_N)=\mathbf{0}, \end{align*} $$

by a Taylor expansion, we get from (6.32) and (6.44) that

$$ \begin{align*} -\nabla\bar{{\mathcal L}}_N(\boldsymbol{\theta}_0)=\nabla\bar{{\mathcal f}}_N(\hat{\boldsymbol{\theta}}_N)-\nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)+O_P(1) =\nabla^2\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)(\hat{\boldsymbol{\theta}}_N-\boldsymbol{\theta}_0)+O_P(1). \end{align*} $$

Because of (6.42) and (6.43), we can use Lemma S2.1 of Aue et al. (Reference Aue, Hörmann, Horváth and Hušková2014) to conclude that

$$ \begin{align*}\|\nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)\|=O_P(N^{1/2}) \end{align*} $$

and

$$ \begin{align*}\|\nabla^2\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)-N\mathbf{H}\|=O_P(N^{1/2}). \end{align*} $$

Now the proof of Lemma 6.3 is complete.

Lemma 6.4. If $H_0$ , Assumptions 2.1–2.3, and (6.1) are satisfied, then

$$ \begin{align*}\max_{1\leq k \leq N-1}\left(\frac{N}{k(N-k)}\right)^\zeta\left\| \nabla\bar{{\mathcal L}}_k(\hat{\boldsymbol{\theta}}_N)-\left(\nabla\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0) -\frac{k}{N} \nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0) \right) \right\|=O_P(1) \end{align*} $$

for any $\zeta>0$ .

Proof. It is enough to prove that

$$ \begin{align*}\max_{1\leq k \leq N-1}\left(\frac{N}{k(N-k)}\right)^\zeta \left\|\nabla\bar{{\mathcal f}}_k(\hat{\boldsymbol{\theta}}_N)-\left(\nabla\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0) -\frac{k}{N} \nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0) \right) \right\|=O_P(1). \end{align*} $$

We follow the proof of Lemma 6.3. For any $\zeta $ , we choose $1/2<\zeta _1< 1/2+\zeta $ . Using again (6.43) and Lemma S2.1 of Aue et al. (Reference Aue, Hörmann, Horváth and Hušková2014), we have

(6.45)

$$ \begin{align} \max_{1\leq k \leq N}\frac{1}{k^{\zeta_1}}\left\|\nabla^2\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0)-k\mathbf{H} \right\|=O_P(1){.} \end{align} $$

Thus, we obtain

$$ \begin{align*} \max_{1\leq k\leq N}\frac{1}{k^\zeta}\left\|(\nabla^2\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0)-k\mathbf{H})(\hat{\boldsymbol{\theta}}_N-\boldsymbol{\theta}_0)\right\|= O_P(1)\max_{1\leq k\leq N}k^{\zeta_1-\zeta}N^{-1/2}=O_P(1), \end{align*} $$

and similarly by Lemma 6.3

$$ \begin{align*} \max_{1\leq k\leq N}\frac{1}{k^\zeta}\left\|k\mathbf{H}\left[ (\hat{\boldsymbol{\theta}}_N-\boldsymbol{\theta}_0)-\left(-\mathbf{H} \frac{1}{N}\nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)\right) \right]\right\|= O_P(1). \end{align*} $$

Our estimates imply that

$$ \begin{align*}\max_{1\leq k \leq N}\frac{1}{k^\zeta} \left\|\nabla\bar{{\mathcal f}}_k(\hat{\boldsymbol{\theta}}_N)-\left(\nabla\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0) -\frac{k}{N} \nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0) \right) \right\|=O_P(1). \end{align*} $$

To prove the same estimate when we use $N-k$ instead of k in the weight function, we observe, as in (6.45), that

(6.46)

$$ \begin{align} \max_{1\leq k \leq N}\frac{1}{(N-k)^{\zeta_1}}\left\|\nabla^2\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0)-\nabla^2\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0)-(N-k)\mathbf{H} \right\|=OP(1) \end{align} $$

for all $\zeta _1>1/2$ . Using

$$ \begin{align*} &\max_{1\leq k \leq N_1} \left\|\nabla(\bar{{\mathcal f}}_N(\hat{\boldsymbol{\theta}}_N)-\bar{{\mathcal f}}_k(\hat{\boldsymbol{\theta}}_N))\right.\\&\qquad-\left.\left[\nabla(\bar{{\mathcal f}}_N({\boldsymbol{\theta}}_0)-\bar{{\mathcal f}}_k({\boldsymbol{\theta}}_0)) +\nabla(\bar{{\mathcal f}}_N({\boldsymbol{\theta}}_0)-\bar{{\mathcal f}}_k({\boldsymbol{\theta}}_0))(\hat{\boldsymbol{\theta}}_N-\boldsymbol{\theta}_0) \right] \right\|\\ &=O_P(1) \end{align*} $$

and (6.46), one can verify that

$$ \begin{align*}\max_{1\leq k \leq N-1}\frac{1}{(N-k)^\zeta} \left\|\nabla\bar{{\mathcal f}}_k(\hat{\boldsymbol{\theta}}_N)-\left(\nabla\bar{{\mathcal f}}_k(\boldsymbol{\theta}_0) -\frac{k}{N} \nabla\bar{{\mathcal f}}_N(\boldsymbol{\theta}_0) \right) \right\|=O_P(1). \end{align*} $$

The last lemma of this section is an immediate consequence of Lemma 6.2 and the approximation in Aue et al. (Reference Aue, Hörmann, Horváth and Hušková2014).

Lemma 6.5. If $H_0$ , Assumptions 2.1–2.3, and (6.1) are satisfied, then we can define independent Gaussian processes $\{\mathbf {W}_{N,1}(x), 0\leq x\leq N/2\}$ and $\{\mathbf {W}_{N,2}(x), 0\leq x\leq N/2\}$ such that

$$ \begin{align*}\max_{1\leq x\leq N/2}\frac{1}{x^\nu}\left\|\sum_{i=1}^{ x }\left(\frac{\partial \bar{f}_i(\boldsymbol{\theta}_0)}{\partial \alpha},\frac{\partial \bar{f}_i(\boldsymbol{\theta}_0)}{\partial \beta}\right)^\top-\mathbf{W}_{N,1}(x) \right\|=O_P(1){,} \end{align*} $$

and

$$ \begin{align*}\max_{N/2\leq x\leq N-1}\frac{1}{(N-x)^\nu}\left\|\sum^{N}_{i= x+1}\left(\frac{\partial \bar{f}_i(\boldsymbol{\theta}_0)}{\partial \alpha},\frac{\partial \bar{f}_i(\boldsymbol{\theta}_0)}{\partial \beta}\right)^\top-\mathbf{W}_{N,2}(N-x) \right\|=O_P(1){,} \end{align*} $$

with some $\nu <1/2$ , $E\mathbf {W}_{N,1}(x)=E\mathbf {W}_{N,2}(x)=\mathbf {0}$ , and $E\mathbf {W}_{N,1}(x)\mathbf {W}_{N,1}^\top (y)=E\mathbf {W}_{N,2}(x)\mathbf {W}_{N,2}^\top (y)=\min (x,y)\mathbf {D}$ .

6.4 Proof of (6.16) When $E\kern1pt\mathbf{log}\kern1pt (\alpha _0\epsilon _0^2+\beta _0)\geq 0$

We note that Jensen and Rahbek (Reference Jensen and Rahbek2004) prove their results under assumption (6.3). However, Theorem 6.1 implies that their arguments are also valid when (6.2) holds. Hence, in the section, we assume that

(6.47)

$$ \begin{align} E\log(\alpha_0\epsilon_0^2+\beta_0)\geq 0. \end{align} $$

In Section 6.3, we showed that it is enough to consider the processes based on the stationary variables $x_i$ and $h_i^2$ . There is no stationary solution in the present case, so we need to work directly with $\bar {\sigma }_i^2(\boldsymbol {\theta })$ and $\bar {\ell }_i(\boldsymbol {\theta })$ . However, following the method of Jensen and Rahbek (Reference Jensen and Rahbek2004), we approximate $\nabla \bar {\ell }_i$ with a stationary sequence. Elementary arguments lead to

$$ \begin{align*} &\frac{\partial\bar{\ell}_i(\boldsymbol{\theta})}{\partial\alpha}=\left[ 1-\frac{y_i^2}{\bar{\sigma}^2_i(\boldsymbol{\theta})} \right]\bar{p}_{i,1}(\boldsymbol{\theta}), \quad \;\;\frac{\partial\bar{\ell}_i(\boldsymbol{\theta})}{\partial\beta}=\left[ 1-\frac{y_i^2}{\bar{\sigma}^2_i(\boldsymbol{\theta})} \right]\bar{p}_{i,2}(\boldsymbol{\theta}){,} \end{align*} $$

where

$$ \begin{align*}\bar{p}_{i,1}(\boldsymbol{\theta})=\frac{1}{\bar{\sigma}_i^2(\boldsymbol{\theta})} \frac{\partial \bar{\sigma}^2_i(\boldsymbol{\theta})}{\partial\alpha}, \quad \bar{p}_{i,2}(\boldsymbol{\theta})=\frac{1}{\bar{\sigma}_i^2(\boldsymbol{\theta})} \frac{\partial \bar{\sigma}^2_i(\boldsymbol{\theta})}{\partial\beta}. \end{align*} $$

Since we can estimate only $\alpha _0$ and $\beta _0$ , we use a special notation for the estimable parameters, $\bar {\boldsymbol {\theta }}_0=(\alpha _0, \beta _0)^\top $ . Let

$$ \begin{align*}z_{i,1}=\sum_{j=1}^\infty\epsilon_{i-j}^2\frac{1}{\beta_0}\prod_{k=1}^j\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\;\;\;\;\;\text{and}\;\;\;\;\; \end{align*} $$

Lemma 6.6. If $H_0$ , Assumptions 2.1–2.3, and (6.47) are satisfied, then

(6.48)

$$ \begin{align} \sup_{1/\bar{\omega}\leq \omega \leq \bar{\omega}}\left| \bar{p}_{i,1}(\bar{\boldsymbol{\theta}}_0,\omega)-z_{i,1} \right|=O(\rho^i)\quad\;\;\text{a.s.} \end{align} $$

and

(6.49)

$$ \begin{align} \sup_{1/\bar{\omega}\leq \omega \leq \bar{\omega}}\left|\bar{p}_{i,2}(\bar{\boldsymbol{\theta}}_0,\omega)- z_{i,2} \right|=O(\rho^i)\quad\;\;\text{a.s.} \end{align} $$

with some $0<\rho <1$ .

Proof. First, we note that

$$ \begin{align*} \bar{\sigma}_i^2(\boldsymbol{\theta})&=\omega\sum_{k=0}^{i-1}\beta^k+\alpha\sum_{\ell=1}^i\beta^{\ell-1}y^2_{i-\ell}+\beta^i\sigma_0^2, \end{align*} $$

which gives the representations

$$ \begin{align*} \bar{\sigma}_i^2(\boldsymbol{\theta}_0)&=\omega_0\sum_{k=0}^{i-1}\beta_0^k+\alpha_0\sum_{\ell=1}^i\beta_0^{\ell-1}y^2_{i-\ell}+\beta_0^i\sigma_0^2 \end{align*} $$

and

$$ \begin{align*} \bar{\sigma}_i^2(\bar{\boldsymbol{\theta}}_0, \omega)&=\omega\sum_{k=0}^{i-1}\beta_0^k+\alpha_0\sum_{\ell=1}^i\beta_0^{\ell-1}y^2_{i-\ell}+\beta_0^i\sigma_0^2. \end{align*} $$

We get, by Theorem 6.1(ii) and (iii), that there exist $c_1>0, c_2>0$ such that

(6.50)

$$ \begin{align} \exp(-c_1i^{c_2})\max_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}|\bar{\sigma}_i^2(\bar{\boldsymbol{\theta}}_0, \omega)-\bar{\sigma}_i^2(\boldsymbol{\theta}_0)|=o(1)\;\;\text{a.s.} \end{align} $$

and

(6.51)

$$ \begin{align} \max_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\left|\frac{\bar{\sigma}_i^2(\bar{\boldsymbol{\theta}}_0, \omega)}{\bar{\sigma}_i^2(\boldsymbol{\theta}_0)}-1\right| =O(\rho_1^i) \end{align} $$

with some $0<\rho _1<1$ . Using the recursion, we have

$$ \begin{align*} \bar{p}_{i,1}(\boldsymbol{\theta})=\frac{1}{\bar{\sigma}_i^2(\boldsymbol{\theta})}\sum_{j=1}^{i}\beta^{j-1} y^2_{i-j}=\sum_{j=1}^i\beta^{j-1}\frac{y_{i-j}^2}{\bar{\sigma}_{i-j}^2(\boldsymbol{\theta})}\prod_{k=1}^j\frac{\bar{\sigma}_{i-k}^2(\boldsymbol{\theta})}{\bar{\sigma}_{i-k+1}^2(\boldsymbol{\theta})} \end{align*} $$

and therefore,

$$ \begin{align*}\bar{p}_{i,1}(\bar{\boldsymbol{\theta}}_0,\omega)=\sum_{j=1}^i\frac{1}{\beta_0}\frac{y_{i-j}^2}{\bar{\sigma}_{i-j}^2(\bar{\boldsymbol{\theta}}_0,\omega)} \prod_{k=1}^j\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\bar{\sigma}_{i-k+1}^2(\bar{\boldsymbol{\theta}}_0,\omega)}. \end{align*} $$

Since

$$ \begin{align*} \frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\bar{\sigma}_{i-k+1}^2(\bar{\boldsymbol{\theta}}_0,\omega)} =\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\omega+\alpha_0y_{i-k}^2 +\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}< \frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}< 1, \end{align*} $$

we obtain

$$ \begin{align*} &\sum_{j=1}^{i/2}\beta^{j-1}_0\left|\frac{y_{i-j}^2}{\bar{\sigma}_{i-j}^2(\bar{\boldsymbol{\theta}}_0,\omega)}-\epsilon_{i-j}^2\right| \prod_{k=1}^j\frac{\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\bar{\sigma}_{i-k+1}^2(\bar{\boldsymbol{\theta}}_0,\omega)}\\&\quad \leq \frac{1}{\beta_0}\sum_{j=1}^{i/2}\epsilon_{i-j}^2\left|\frac{\sigma_{i-j}^2}{\bar{\sigma}_{i-j}^2(\bar{\boldsymbol{\theta}}_0,\omega)}-1\right| =O(\rho_2^i)\;\;\;\text{a.s.} \end{align*} $$

on account of (6.51) and

(6.52)

$$ \begin{align} \max_{1\leq i <\infty}i^{-1/\kappa}\epsilon_i^2=O_P(1)\;\;\;\text{a.s.} \end{align} $$

By applying Minkowski’s inequality, for any $\kappa _1>2$ , we have

(6.53)

$$ \begin{align} \left(E\left(\sum_{j=i/2}^i \prod_{k=1}^j\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\bar{\sigma}_{i-k+1}^2(\bar{\boldsymbol{\theta}}_0,\omega)}\right)^{\kappa_1} \right)^{1/\kappa_{1}} &\leq \left(E\sum_{j=i/2}^i \left(\prod_{k=1}^j\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\right)^{\kappa_1} \right)^{1/\kappa_{1}}\\ &\leq \sum_{j=i/2}^i\left(E\left(\prod_{k=1}^j\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\right)^{\kappa_1}\right)^{1/\kappa_{1}}\notag\\ &=\sum_{j=i/2}^i\rho_3^j.\notag \end{align} $$

Hence, the Borel–Cantelli lemma implies that

$$ \begin{align*}\sum_{j=i/2}^i \prod_{k=1}^j\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\bar{\sigma}_{i-k+1}^2(\bar{\boldsymbol{\theta}}_0,\omega)}=O(\rho_4^i)\;\;\;\text{a.s.} \end{align*} $$

It follows from the recursion and (6.51) that

$$ \begin{align*}\frac{y_{i-j}^2}{\bar{\sigma}_{i-j}^2(\bar{\boldsymbol{\theta}}_0,\omega)}=\epsilon_{i-j}^2\frac{\sigma_{i-j}^2}{\bar{\sigma}_{i-j}^2(\bar{\boldsymbol{\theta}}_0,\omega)} \leq \epsilon_{i-j}\sup_{1\leq \ell<\infty}\sup_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\frac{\sigma_{\ell}^2}{\bar{\sigma}_{\ell}^2(\bar{\boldsymbol{\theta}}_0,\omega)} \end{align*} $$

and

$$ \begin{align*}\sup_{1\leq \ell<\infty}\sup_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\frac{\sigma_{\ell}^2}{\bar{\sigma}_{\ell}^2(\bar{\boldsymbol{\theta}}_0,\omega)} =O(1)\;\;\;\text{a.s.} \end{align*} $$

Thus, we conclude that

(6.54)

$$ \begin{align} \sup_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\left|p_{i,1}(\bar{\theta}_0,\omega)-\sum_{j=1}^i\frac{1}{\beta_0}\epsilon_{i-j}^2 \prod_{k=1}^j\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\bar{\sigma}_{i-k+1}^2(\bar{\boldsymbol{\theta}}_0,\omega)}\right| =O(\rho_5^i)\;\;\;\text{a.s.} \end{align} $$

Elementary argument leads to

(6.55)

$$ \begin{align} &\hspace{0cm}\left|\prod_{k=1}^{j}\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\omega+(\alpha_0\epsilon_{i-k}^2+\beta_0)\bar{\sigma}^2_{i-k}(\bar{\boldsymbol{\theta}}_0,\omega)} - \prod_{k=1}^{j}\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0} \right|\\ &\quad\leq \sum_{k=1}^{j}\left|\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\omega+(\alpha_0\epsilon_{i-k}^2+\beta_0)\bar{\sigma}^2_{i-k}(\bar{\boldsymbol{\theta}}_0,\omega)} -\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0} \right|\notag\\ &\quad\leq \beta_0\omega\sum_{k=1}^j\frac{1}{\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}. \notag \end{align} $$

Using (6.50) and (6.52), we get that

(6.56)

$$ \begin{align} &\hspace{0cm}\sup_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\sum_{j=1}^{i/2}\epsilon_{i-j}^2\left|\prod_{k=1}^{j} \frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\omega+(\alpha_0\epsilon_{i-k}^2+\beta_0)\bar{\sigma}^2_{i-k}(\bar{\boldsymbol{\theta}}_0,\omega)} - \prod_{k=1}^{j}\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0} \right|\\ &\quad\stackrel{\text{ a.s.}}{=}O(1) \sum_{j=1}^{i/2}\epsilon_{i-j}^2\sum_{k=1}^{j}\exp(-S(i-k))\notag\\ &\quad\stackrel{\text{ a.s.}}{=}O(\rho_6^i){,}\notag \end{align} $$

with some $0<\rho _6<1$ . Similarly to (6.53)

$$ \begin{align*} \sum_{j=i/2}^{i}\epsilon_{i-j}^2\prod_{k=1}^{j}\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\stackrel{\text{ a.s.}}{=}O(\rho_7^i){,}\notag \end{align*} $$

and

$$ \begin{align*} \sup_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\sum_{j=i/2}^{i}\epsilon_{i-j}^2\prod_{k=1}^{j} \frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)}{\omega+(\alpha_0\epsilon_{i-k}^2+\beta_0)\bar{\sigma}^2_{i-k}(\bar{\boldsymbol{\theta}}_0,\omega)} \stackrel{\text{ a.s.}}{=}O(\rho_7^i),\notag \end{align*} $$

with $0<\rho _7<1$ , completing the proof of (6.48). Since

$$ \begin{align*} \bar{p}_{i,2}(\bar{\boldsymbol{\theta}}_0,\omega)=\sum_{j=1}^i\frac{1}{\beta_0}\prod_{k=1}^j\frac{\beta_0\bar{\sigma}_{i-k}^2(\bar{\boldsymbol{\theta}}_0,\omega)} {\omega_0+(\alpha_0\epsilon_{i-k}^2+\beta_0)\bar{\sigma}^2_{i-k}(\bar{\boldsymbol{\theta}}_0,\omega)}, \end{align*} $$

the proof of (6.49) is nearly the same but simpler than that of (6.49).

Let

$$ \begin{align*} z_{i,j,1}&=\sum_{\ell=1}^j\epsilon_{i-\ell}^2\frac{1}{\beta_0}\prod_{k=1}^\ell\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0} +\sum_{\ell=j+1}^\infty\epsilon_{i,i-\ell,j}^2\frac{1}{\beta_0}\\ &\quad\times \left(\prod_{k=1}^j\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\right)\left(\prod_{r=j+1}^\ell\frac{\beta_0}{\alpha_0\epsilon_{i,i-r,j}^2+\beta_0}\right){,} \end{align*} $$

and

$$ \begin{align*}z_{i,j,2}=\sum_{\ell=1}^j\frac{1}{\beta_0}\prod_{k=1}^\ell\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0} \!+\!\sum_{\ell=j+1}^\infty\frac{1}{\beta_0} \left(\prod_{k=1}^j\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\!\right)\!\left(\prod_{r=j+1}^\ell\frac{\beta_0}{\alpha_0\epsilon_{i,i-r,j}^2+\beta_0}\right)\!{,} \end{align*} $$

where $\{\epsilon _{i,j,k}, -\infty <i,j,k<\infty \}$ are independent copies of $\epsilon _0$ .

Lemma 6.7. If $H_0$ , Assumptions 2.1–2.3, and (6.47) are satisfied, then

(6.57)

$$ \begin{align} \left(E\left[(1-\epsilon_i^2)(z_{i,1}-z_{i,j,1})\right]^\kappa\right)^{1/\kappa}\leq c\rho^j \end{align} $$

and

(6.58)

$$ \begin{align} \left(E\left[(1-\epsilon_i^2)(z_{i,2}-z_{i,j,2})\right]^\kappa\right)^{1/\kappa}\leq c\rho^j, \end{align} $$

where $0<\rho <1$ , and $\kappa $ is defined in Assumption 2.1.

Proof. By Assumption 2.1 and Minkowski’s inequality, we have

$$ \begin{align*} &\left( E\left[(1-\epsilon_i^2)\sum_{\ell=j+1}^\infty\epsilon_{i,i-\ell,j}^2\frac{1}{\beta_0} \left(\prod_{k=1}^j\frac{\beta_0}{\alpha_0\epsilon_{i-k}^2+\beta_0}\right)\left(\prod_{r=j+1}^\ell\frac{\beta_0}{\alpha_0\epsilon_{i,i-r,j}^2+\beta_0}\right) \right]^\kappa\right)^{1/\kappa}\\ &\hspace{1cm}\leq \left(E(1-\epsilon_0^2)^\kappa\right)\frac{1}{\beta_0}\sum_{\ell=j+1}^\infty\left(E\left[\frac{\beta_0}{\alpha_0\epsilon_{}^2+\beta_0}\right]^\kappa \right)^{1/\kappa}\leq c_1\rho^j_1 \end{align*} $$

with some $0<\rho _1<1$ , proving (6.57). The same argument gives (6.58).

Let

$$ \begin{align*} \bar{\mathbf{J}}_i(\boldsymbol{\theta})= \begin{pmatrix} \frac{\partial^2 \bar{\ell}_i(\boldsymbol{\theta})}{\partial \alpha^2}, & \frac{\partial^2 \bar{\ell}_i(\boldsymbol{\theta})}{\partial \alpha\partial\beta} \\ \frac{\partial^2 \bar{\ell}_i(\boldsymbol{\theta})}{\partial \alpha\partial\beta}, & \frac{\partial^2 \bar{\ell}_i(\boldsymbol{\theta})}{\partial \beta^2} \end{pmatrix} \end{align*} $$

be the matrix of the second-order partial derivatives of $\bar {\ell }_i(\boldsymbol {\theta })$ . Similarly to Lemma 6.6, it can be approximated with a matrix $\mathbf {Q}_i$ of the form $\mathbf {Q}_i=\mathbf {q}(\epsilon _i,\epsilon _{i-1}, \dots )$ . The sequence $\mathbf {Q}_i$ is a decomposable Bernoulli shift with $\mathbf {Q}_{i,j}=\mathbf {q}(\epsilon _i,\epsilon _{i-1},\dots , \epsilon _{i-1}, \epsilon _{i, j-i-1, j}, \epsilon _{i, j-i-2, j},\dots )$ .

Lemma 6.8. If $H_0$ , Assumptions 2.1–2.3, and (6.47) are satisfied, then

(6.59)

$$ \begin{align} \sup_{1/\bar{\omega}\leq \omega\leq \bar{\omega}}\left\| \bar{\mathbf{J}}_i(\bar{\boldsymbol{\theta}}_0,\omega) -\mathbf{Q}_i \right\|=O(\rho^i)\;\;\;\text{a.s.} \end{align} $$

and

(6.60)

$$ \begin{align} \left(E\left\|\mathbf{Q}_i-\mathbf{Q}_{i,j}\right\|^\kappa\right)^{1/\kappa}\leq c \rho^j, \end{align} $$

where $0<\rho <1$ , and $\kappa $ is defined in Assumption 2.1.

Proof. The matrix $\bar {\mathbf {J}}_i$ can be computed explicitly, so following the proofs of Lemmas 6.6 and 6.7, we obtain (6.59) and (6.60), respectively.

The rest of the proof is a repetition of the calculations in Section 6.3, since we have showed again that the partial derivatives can be approximated with decomposable Bernoulli shifts with exponential rates. Thus, we get the following analogue of Lemma 6.4.

Lemma 6.9. If $H_0$ , Assumptions 2.1–2.3, and (6.47) are satisfied, then

$$ \begin{align*} \max_{1\leq k \leq N-1}&\left( \frac{N}{k(N-k)} \right)^\zeta\left\|\vphantom{\sum_{i=1}^N} \bar{{\mathcal r}}_k(\hat{\boldsymbol{\theta}}_N)\right.\\&\quad\left.-\left(\sum_{i=1}^k(1-\epsilon_i^2)(z_{i,1},z_{i,2})^\top -\frac{k}{N}\sum_{i=1}^N(1-\epsilon_i^2)(z_{i,1},z_{i,2})^\top \right) \right\|\\ &=O_P(1) \end{align*} $$

for any $\zeta>0$ .

Lemma 6.10. If $H_0$ , Assumptions 2.1–2.3, and (6.47) are satisfied, then we can define independent Gaussian processes $\{\mathbf {W}_{N,1}(x), 0\leq x\leq N/2\}$ and $\{\mathbf { W}_{N,2}(x), 0\leq x\leq N/2\}$ such that

$$ \begin{align*}\max_{1\leq x\leq N/2}\frac{1}{x^\nu}\left\|\sum_{i=1}^{\lfloor x\rfloor}(1-\epsilon_i^2)(z_{i,1},z_{i,2})^\top-\mathbf{W}_{N,1}(x) \right\|=O_P(1){,} \end{align*} $$

and

$$ \begin{align*}\max_{N/2\leq x\leq N-1}\frac{1}{(N-x)^\nu}\left\|\sum^{N}_{i=\lfloor x\rfloor+1}(1-\epsilon_i^2)(z_{i,1},z_{i,2})^\top-\mathbf{W}_{N,2}(N-x) \right\|=O_P(1){} \end{align*} $$

Proof. The approximations are immediate consequences of Lemma 6.7 and Theorem S2.1 of Aue et al. (Reference Aue, Hörmann, Horváth and Hušková2014).

Appendix

To facilitate the application of the test developed in this work, we provide the following step-by-step procedure based on Theorem 2.1(i) to calculate the test statistic $T_N$ .

1. Use the QMLE to estimate the parameters $\boldsymbol {\theta }$ and the corresponding estimator $\hat {\boldsymbol {\theta }}_N = (\hat {\omega }, \hat {\alpha }, \hat {\beta })^\top $ that satisfies
$$ \begin{align*}\inf \left\{\bar{{\mathcal L}}_N(\boldsymbol{\theta}):\;\boldsymbol{\theta}\in\boldsymbol{\Theta}\right\}=\bar{{\mathcal L}}_N(\hat{\boldsymbol{\theta}}_N), \end{align*} $$

where
$$ \begin{align*}\bar{{\mathcal L}}_k(\boldsymbol{\theta})=\sum_{i=1}^k\bar{\ell}_i(\boldsymbol{\theta}),\quad \;\; \text{and}\quad\;\;\bar{\ell}_i(\boldsymbol{\theta})=\log \bar{\sigma}_i^2(\boldsymbol{\theta})+\frac{y_i^2}{\bar{\sigma}_i^2(\boldsymbol{\theta})}. \end{align*} $$
2. Calculate the derivatives of the log likelihood function
$$ \begin{align*}\frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha} = \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)} \frac{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha} = \frac{1}{\bar{\sigma}_i^2(\hat{\boldsymbol{\theta}}_N)}\left( 1 - \frac{y_i^2}{\bar{\sigma}_i^2(\hat{\boldsymbol{\theta}}_N)}\right) \frac{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \end{align*} $$

$$ \begin{align*}\frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} = \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)} \frac{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} = \frac{1}{\bar{\sigma}_i^2(\hat{\boldsymbol{\theta}}_N)}\left( 1 - \frac{y_i^2}{\bar{\sigma}_i^2(\hat{\boldsymbol{\theta}}_N)}\right) \frac{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \beta}. \end{align*} $$

To implement the above calculation, the following recursive equations are required:
$$ \begin{align*}\frac{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha} = y_{i-1}^2 + \beta \frac{\partial \bar{\sigma}_{i-1}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \end{align*} $$

$$ \begin{align*}\frac{\partial \bar{\sigma}_{i}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} = \bar{\sigma}_{i-1}^2(\hat{\boldsymbol{\theta}}_N) + \beta \frac{\partial \bar{\sigma}_{i-1}^2(\hat{\boldsymbol{\theta}}_N)}{\partial \beta}, \end{align*} $$

and their initial values are set to be $\frac {\partial \bar {\sigma }_{0}^2(\hat {\boldsymbol {\theta }}_N)}{\partial \alpha } = \frac {\partial \bar {\sigma }_{0}^2(\hat {\boldsymbol {\theta }}_N)}{\partial \beta } = 0$ , following Fiorentini, Calzolari, and Panattoni (Reference Fiorentini, Calzolari and Panattoni1996).
3. Calculate the following two quantities:
$$ \begin{align*}\bar{\mathcal{r}}_k (\hat{\boldsymbol{\theta}}_N) = \sum_{i=1}^{k} \left( \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} \right), \end{align*} $$

$$ \begin{align*}\hat{\boldsymbol{D}}_N = \dfrac{1}{N} \sum_{i=i}^{N} \left( \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} \right)^\top \left( \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \alpha}, \frac{\partial \bar{\ell}_i(\hat{\boldsymbol{\theta}}_N)}{\partial \beta} \right). \end{align*} $$
4. Then, compute the stochastic process
$$ \begin{align*}Z_N(t) = \frac{1}{N^{1/2}} \left\lbrace \bar{\mathcal{r}}_{\lfloor (N+1)t\rfloor} (\hat{\boldsymbol{\theta}}_N) \hat{\boldsymbol{D}}_N^{-1} \left( \bar{\mathcal{r}}_{\lfloor (N+1)t\rfloor} (\hat{\boldsymbol{\theta}}_N)\right)^\top \right\rbrace, \qquad 0 < t < 1. \end{align*} $$
5. Obtain the test statistic
$$ \begin{align*}T_N = \sup_{0 < t < 1} \dfrac{Z_N (t)}{w(t)}, \end{align*} $$

where $w(t)$ is a weight function.

Footnotes

We would like to thank the editor Peter C. B. Phillips, the co-editor Robert Taylor, and three anonymous referees for their constructive suggestions, which have lead to a much improved paper. We also thank the helpful comments received from the conference participants of the 1st conference on “Econometrics with Data Science” at Reading 2024. All remaining errors are ours. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. We report that there are no competing interests to declare.

1 We have also performed the simulation with the time of the change at $k^* = 0.25N$ and $0.75N$ , corresponding to early and late changes in the sample period. In both cases, we can observe a modest reduction in the empirical power. Thus, the test is generally more sensitive to a change in the middle of the sample than an early or late change, which is a typical finding in the change point literature.

2 We use daily returns without dividend from the CRSP database to avoid the complication caused by stock splits when calculating log returns from raw prices. We also performed the analysis using daily returns with dividend, and the results showed only slight differences.

References

REFERENCES

Andreou, E., & Ghysels, E. (2002). Detecting multiple breaks in financial market volatility dynamics. Journal of Applied Econometrics, 17(5), 579–600.10.1002/jae.684CrossRef Google Scholar

Andreou, E., & Ghysels, E. (2006). Monitoring disruptions in financial markets. Journal of Econometrics, 135(1–2), 77–124.10.1016/j.jeconom.2005.07.023CrossRef Google Scholar

Aue, A., Hörmann, S., Horváth, L., & Hušková, M. (2014). Dependent functional linear models with applications to monitoring structural change. Statistica Sinica, 24, 1043–1073.Google Scholar

Berkes, I., Gombay, E., Horváth, L., & Kokoszka, P. (2004). Sequential change-point detection in GARCH(p, q) models. Econometric Theory, 20(6), 1140–1167.10.1017/S0266466604206041CrossRef Google Scholar

Berkes, I., & Horváth, L. (2004). The efficiency of the estimators of the parameters in GARCH processes. Annals of Statistics, 32(2), 633–655.10.1214/009053604000000120CrossRef Google Scholar

Berkes, I., Horváth, L., & Kokoszka, P. (2003). GARCH processes: Structure and estimation. Bernoulli, 9(2), 201–227.10.3150/bj/1068128975CrossRef Google Scholar

Berkes, I., Horváth, L., & Kokoszka, P. (2004). Testing for parameter constancy in GARCH(p, q) models. Statistics & Probability Letters, 70(4), 263–273.10.1016/j.spl.2004.10.010CrossRef Google Scholar

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.10.1016/0304-4076(86)90063-1CrossRef Google Scholar

Chan, N. H., & Ng, C. T. (2009). Statistical inference for non-stationary GARCH (p, q) models. Electronic Journal of Statistics, 3, 956–992.10.1214/09-EJS452CrossRef Google Scholar

Chen, B., & Hong, Y. (2016). Detecting for smooth structural changes in GARCH models. Econometric Theory, 32(3), 740–791.10.1017/S0266466614000942CrossRef Google Scholar

Chibisov, D. M. (1964). Some theorems on the limiting behaviour of an empirical distribution function. Selected Translations in Mathematical Statistics and Probability, 9, 147–156.Google Scholar

Chu, C.-S. J. (1995). Detecting parameter shift in GARCH models. Econometric Reviews, 14(2), 241–266.Google Scholar

Csörgő, M., & Horváth, L. (1993). Weighted approximations in probability and statistics. Wiley.Google Scholar

Csörgo, M., & Révész, P. (1981). Strong approximations in probability and statistics. Academic Press and Akadémiai Kiadó.Google Scholar

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50, 987–1007.10.2307/1912773CrossRef Google Scholar

Fiorentini, G., Calzolari, G., & Panattoni, L. (1996). Analytic derivatives and the computation of GARCH estimates. Journal of Applied Econometrics, 11(4), 399–417.10.1002/(SICI)1099-1255(199607)11:4<399::AID-JAE401>3.0.CO;2-R3.0.CO;2-R>CrossRef Google Scholar

Francq, C., & Zakoian, J.-M. (2004). Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli, 10(4), 605–637.10.3150/bj/1093265632CrossRef Google Scholar

Francq, C., & Zakoian, J.-M. (2007). Quasi-maximum likelihood estimation in GARCH processes when some coefficients are equal to zero. Stochastic Processes and their Applications, 117(9), 1265–1284.10.1016/j.spa.2007.01.001CrossRef Google Scholar

Francq, C., & Zakoian, J.-M. (2010). GARCH models: Structure statistical inference and financial applications. Wiley.10.1002/9780470670057CrossRef Google Scholar

Francq, C., & Zakoïan, J.-M. (2012). Strict stationarity testing and estimation of explosive and stationary generalized autoregressive conditional heteroscedasticity models. Econometrica, 80(2), 821–861.Google Scholar

Francq, C., & Zakoïan, J.-M. (2013). Inference in nonstationary asymmetric GARCH models. Annals of Statistics, 41(4), 1970–1998.10.1214/13-AOS1132CrossRef Google Scholar

Francq, C., & Zakoian, J.-M. (2019). GARCH models: Structure, statistical inference and financial applications. John Wiley & Sons.10.1002/9781119313472CrossRef Google Scholar

Galeano, P., & Tsay, R. S. (2010). Shifts in individual parameters of a GARCH model. Journal of Financial Econometrics, 8(1), 122–153.10.1093/jjfinec/nbp007CrossRef Google Scholar

Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic Review, 35, 705–730.10.2307/2527081CrossRef Google Scholar

Hoga, Y., & Demetrescu, M. (2023). Monitoring value-at-risk and expected shortfall forecasts. Management Science, 69(5), 2954–2971.10.1287/mnsc.2022.4460CrossRef Google Scholar

Huang, D., Wang, H., & Yao, Q. (2008). Estimating GARCH models: When to use what? The Econometrics Journal, 11(1), 27–38.10.1111/j.1368-423X.2008.00229.xCrossRef Google Scholar

Iglesias, E. M., & Linton, O. B. (2007). Higher order asymptotic theory when a parameter is on a boundary with an application to GARCH models. Econometric Theory, 23(6), 1136–1161.10.1017/S0266466607070454CrossRef Google Scholar

Jensen, S. T., & Rahbek, A. (2004). Asymptotic inference for nonstationary GARCH. Econometric Theory, 20(6), 1203–1226.10.1017/S0266466604206065CrossRef Google Scholar

Kokoszka, P., & Leipus, R. (1999). Testing for parameter changes in ARCH models. Lithuanian Mathematical Journal, 39, 182–195.10.1007/BF02469283CrossRef Google Scholar

Kokoszka, P., & Leipus, R. (2000). Change-point estimation in ARCH models. Bernoulli, 6, 513–539.10.2307/3318673CrossRef Google Scholar

Lazar, E., Wang, S., & Xue, X. (2023). Loss function-based change point detection in risk measures. European Journal of Operational Research, 310(1), 415–431.10.1016/j.ejor.2023.03.033CrossRef Google Scholar

Lee, S.-W., & Hansen, B. E. (1994). Asymptotic theory for the GARCH (1, 1) quasi-maximum likelihood estimator. Econometric Theory, 10(1), 29–52.10.1017/S0266466600008215CrossRef Google Scholar

Lumsdaine, R. L. (1996). Consistency and asymptotic normality of the quasi-maximum likelihood estimator in IGARCH (1, 1) and covariance stationary GARCH (1, 1) models. Econometrica, 64, 575–596.10.2307/2171862CrossRef Google Scholar

Nelson, D. B. (1990). Stationarity and persistence in the GARCH (1, 1) model. Econometric Theory, 6(3), 318–334.10.1017/S0266466600005296CrossRef Google Scholar

O’Reilly, N. E. (1974). On the weak convergence of empirical processes in sup-norm metrics. Annals of Probability, 2, 642–651.Google Scholar

Pape, K., Wied, D., & Galeano, P. (2016). Monitoring multivariate variance changes. Journal of Empirical Finance, 39, 54–68.10.1016/j.jempfin.2016.08.007CrossRef Google Scholar

Patton, A. (2013). Copula methods for forecasting multivariate time series. Handbook of Economic Forecasting, 2, 899–960.10.1016/B978-0-444-62731-5.00016-6CrossRef Google Scholar

Richter, S., Wang, W., & Wu, W. B. (2023). Testing for parameter change epochs in GARCH time series. The Econometrics Journal, 26, 467–491.10.1093/ectj/utad006CrossRef Google Scholar

Straumann, D., & Mikosch, T. (2006). Quasi-maximum-likelihood estimation in conditionally heteroscedastic time series: A stochastic recurrence equations approach. Annals of Statistics, 34, 2449–2495.10.1214/009053606000000803CrossRef Google Scholar

Xu, K.-L. (2013). Powerful tests for structural changes in volatility. Journal of Econometrics, 173(1), 126–142.10.1016/j.jeconom.2012.11.001CrossRef Google Scholar

Table 1 Critical values.

Table 2 Empirical size.

Table 3 Empirical power.

Table 4 Test results of change point.

Figure 1 Histogram of times of changes in $(\alpha , \beta )$ for 3,416 U.S. stocks.

Article contents

DETECTING CHANGES IN GARCH(1,1) PROCESSES WITHOUT ASSUMING STATIONARITY

Abstract

Information

1 Introduction

2 ASSUMPTIONS AND MAIN RESULTS

3 MONTE CARLO SIMULATIONS

3.1 Empirical Size under $H_0$

3.2 Empirical Power under $H_A$

4 EMPIRICAL APPLICATIONS

4.1 Prototype Application

4.2 Extensive Application

5 Conclusions

6 PROOFS OF TECHNICAL RESULTS

6.1 The Properties of $\sigma _i^2$ as $i\to \infty $

6.2 Proofs of Theorems 2.1 and 2.1

Proof of Theorem 2.1.

Proof of Theorem 2.2.

6.3 Proof of (6.16) When $E\kern1pt\mathbf{log}\kern1pt (\alpha _0\epsilon _0^2+\beta _0)<0$

6.4 Proof of (6.16) When $E\kern1pt\mathbf{log}\kern1pt (\alpha _0\epsilon _0^2+\beta _0)\geq 0$

Appendix

Footnotes

References

REFERENCES

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests