Large deviations, moment estimates and almost sure invariance principles for skew products with mixing base maps and expanding-on-average fibers

Abstract. In this paper we show how to apply classical probabilistic tools for partial sums $\sum _{j=0}^{n-1}\varphi \circ \tau ^j$ generated by a skew product $\tau $, built over a sufficiently well-mixing base map and a random expanding dynamical system. Under certain regularity assumptions on the observable $\varphi $, we obtain a central limit theorem (CLT) with rates, a functional CLT, an almost sure invariance principle (ASIP), a moderate-deviations principle, several exponential concentration inequalities and Rosenthal-type moment estimates for skew products with $\alpha $-, $\phi $- or $\psi $-mixing base maps and expanding-on-average random fiber maps. All of the results are new even in the uniformly expanding case. The main novelty here (in contrast to [2]) is that the random maps are not independent, they do not preserve the same measure and the observable $\varphi $ depends also on the base space. For stretched exponentially ${\alpha }$-mixing base maps our proofs are based on multiple correlation estimates, which make the classical method of cumulants applicable. For $\phi $- or $\psi $-mixing base maps, we obtain an ASIP and maximal and concentration inequalities by establishing an $L^\infty $ convergence of the iterates ${\mathcal K}^{\,n}$ of a certain transfer operator ${\mathcal K}$ with respect to a certain sub-${\sigma }$-algebra, which yields an appropriate (reverse) martingale-coboundary decomposition.

1. Introduction and a preview of the main results
1.1. Quenched limit theorems for random dynamical systems. Let $(X,\mathcal B,m)$ be a probability space and let $(\Omega,\mathcal F,P,\sigma)$ be an invertible ergodic probability-preserving system. Let $T_\omega:X\to X$, $\omega\in\Omega$, be a family of non-singular maps (that is, $m\circ T_\omega^{-1}\ll m$) so that the corresponding skew product $\tau$ given by $\tau(\omega,x)=(\sigma\omega,T_\omega x)$ is measurable. A random dynamical system is formed by the sequence of compositions taken along the orbit of a 'random' point $\omega$. The system $(\Omega,\mathcal F,P,\sigma)$ is often referred to as the driving system, and the map $\sigma$ is often referred to as the base map.
Let $\varphi:\Omega\times X\to\mathbb R$ be a measurable function (an 'observable') and let $\mu$ be a $\tau$-invariant probability measure on $\Omega\times X$. Then $\mu$ can be decomposed as $\mu=\int_\Omega\mu_\omega\,dP(\omega)$, where $\mu_\omega$ is a family of probability measures on $X$ so that $(T_\omega)_*\mu_\omega=\mu_{\sigma\omega}$ for $P$-almost every (a.e.) $\omega$. Set $S_n\varphi=\sum_{j=0}^{n-1}\varphi\circ\tau^j$. Then
$$S_n\varphi(\omega,x)=\sum_{j=0}^{n-1}\varphi_{\sigma^j\omega}(T_\omega^j x)=:S_n^\omega\varphi(x),\qquad T_\omega^j=T_{\sigma^{j-1}\omega}\circ\cdots\circ T_{\sigma\omega}\circ T_\omega,$$
where $\varphi_\omega(\cdot)=\varphi(\omega,\cdot)$. For $P$-a.e. $\omega$ we can consider the sequence of functions $S_n^\omega\varphi(\cdot)$ on the probability space $(X,\mathcal B,\mu_\omega)$ as random variables. Limit theorems for such sequences are called quenched limit theorems. Among the first papers dealing with quenched limit theorems for random dynamical systems are [36, 37], where in [36] a quenched large-deviations principle was obtained, and in [37] a central limit theorem (CLT) and a law of the iterated logarithm were established. Since then quenched limit theorems for random dynamical systems have been extensively studied. For instance, in [16, 20-22] an almost sure invariance principle (ASIP, an almost sure approximation by a sum of independent Gaussians) was established for random expanding or hyperbolic maps $T_\omega$, in [19, 31] Berry-Esseen theorems (optimal rates in the CLT) were obtained for similar classes of maps and in [17, 18, 23, 31] local CLTs were achieved. In addition, in [27] several limit theorems were extended to random non-uniformly hyperbolic or expanding maps. We would also like to refer to [3] for related results concerning mixing rates for random non-uniformly hyperbolic maps and to [32] for related results concerning sequential dynamical systems, where an ASIP was obtained. We note that in many of the examples these results are obtained for the unique measure $\mu$ such that $\mu_\omega$ is absolutely continuous with respect to $m$. However, some results hold true even for maps $T_\omega:E_\omega\to E_{\sigma\omega}\subset X$ which are defined on random subsets of $X$ (see [40]), where in this case the most notable choices of $\mu_\omega$ are the so-called random Gibbs measures (see [31, 44]).
1.2. Limit theorems for skew products. Let us consider the sums $S_n\varphi=\sum_{j=0}^{n-1}\varphi\circ\tau^j$ as random variables on the probability space $(\Omega\times X,\mathcal F\times\mathcal B,\mu)$. In this paper we will focus on limit theorems for such sequences of random variables. In order to demonstrate the difference between such limit theorems and the quenched ones, let us focus on the CLT. The quenched CLT means that for $P$-a.e. $\omega$, for all real $t$, we have
$$\lim_{n\to\infty}\mu_\omega(\{x: S_n^\omega\varphi(x)-\mu_\omega(S_n^\omega\varphi)\le t\sqrt n\})=\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{t}e^{-s^2/(2\sigma^2)}\,ds,$$
where $\sigma\ge0$ is the number that satisfies $\sigma^2=\lim_{n\to\infty}(1/n)\mathrm{Var}_{\mu_\omega}(S_n^\omega\varphi)$ for $P$-a.e. $\omega$ (assuming that this limit exists and does not depend on $\omega$; refer to [37, Theorem 2.3] for sufficient conditions). On the other hand, the CLT for the skew product means that for all real $t$ we have
$$\lim_{n\to\infty}\mu(\{(\omega,x): S_n\varphi(\omega,x)-\mu(S_n\varphi)\le t\sqrt n\})=\frac{1}{\sqrt{2\pi}\,\Sigma}\int_{-\infty}^{t}e^{-s^2/(2\Sigma^2)}\,ds,$$
where $\Sigma^2=\lim_{n\to\infty}(1/n)\mathrm{Var}_\mu(S_n\varphi)$. Note that, in contrast to the quenched case, the summands $X_j=\varphi\circ\tau^j$ form a stationary sequence and, in applications, the existence of the limit $\Sigma^2$ follows from a sufficiently fast decay of $\mathrm{Cov}(X_0,X_n)$ as $n\to\infty$. We also remark that both CLTs above are formulated when $\sigma$ and $\Sigma$ are positive, and when one of them vanishes the convergence is towards the constant function 0. When $\mu_\omega(\varphi_\omega)$ does not depend on $\omega$, we have that $\mu_\omega(\varphi_\omega)=\mu(\varphi)$ and $\sigma^2=\Sigma^2$. In this case the quenched CLT implies the CLT for $S_n\varphi$ by integrating $\mu_\omega(\{x: S_n^\omega\varphi(x)-\mu_\omega(S_n^\omega\varphi)\le t\sqrt n\})$ with respect to $P$ (and similarly other distributional limit theorems for the skew product follow from the quenched ones). However, it is less likely to be true when $\mu_\omega(\varphi_\omega)$ depends on $\omega$. We remark that even when $\mu_\omega(\varphi_\omega)$ does not depend on $\omega$, other finer results like the ASIP do not follow by integration. Indeed, the ASIP concerns an almost sure approximation of the partial sums in question by a sum of independent Gaussian random variables, but the quenched ASIP provides a construction of such a Gaussian process which depends on the fiber $\omega$.
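The integration step just described can be written out explicitly. The following is only a sketch, under the assumptions that $\mu_\omega(\varphi_\omega)=\mu(\varphi)$ for $P$-a.e. $\omega$ and that the quenched CLT holds with positive variance; the symbol $F_\sigma$ for the centered normal distribution function with variance $\sigma^2$ is notation introduced here for illustration only.

```latex
% Assumptions: \mu_\omega(\varphi_\omega) = \mu(\varphi) for P-a.e. \omega, and \sigma > 0.
% F_\sigma is hypothetical notation: the N(0,\sigma^2) distribution function.
\begin{align*}
\mu_\omega(S_n^\omega\varphi)
  &= \sum_{j=0}^{n-1}\mu_{\sigma^j\omega}(\varphi_{\sigma^j\omega})
   = n\,\mu(\varphi) = \mu(S_n\varphi), \\
\mu\{S_n\varphi-\mu(S_n\varphi)\le t\sqrt n\}
  &= \int_\Omega \mu_\omega\{x:\ S_n^\omega\varphi(x)-\mu_\omega(S_n^\omega\varphi)\le t\sqrt n\}\,dP(\omega) \\
  &\xrightarrow[n\to\infty]{} \int_\Omega F_\sigma(t)\,dP(\omega) = F_\sigma(t).
\end{align*}
```

The first line uses the equivariance $(T_\omega)_*\mu_\omega=\mu_{\sigma\omega}$, and the limit follows from the bounded convergence theorem since the integrands take values in $[0,1]$. When $\mu_\omega(\varphi_\omega)$ depends on $\omega$ the first line fails, so the fiberwise centering no longer agrees with the deterministic one.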
1.2.1. Annealed limit theorems: i.i.d. maps. A particular well-studied case is when the maps $T_{\sigma^j\omega}$ are independent. That is, $\Omega=Y^{\mathbb Z}$ is a product space, the coordinates $\omega_j$ of $\omega=(\omega_j)$ are independent (with $\sigma$ being the left shift) and $T_\omega=T_{\omega_0}$ depends only on the zeroth coordinate. In this case the statistical behavior of the skew product $\tau$ can be investigated using the so-called annealed transfer operator, given by (see [8, 9, 35])
$$\mathcal A g(x)=\int_\Omega\mathcal L_\omega g(x)\,dP(\omega),$$
where $\mathcal L_\omega$ is the transfer operator corresponding to $T_\omega$ and the underlying reference measure $m$. In [2] it was shown that for several classes of random expanding maps, the operator $\mathcal A$ is quasicompact. Using that, a variety of limit theorems were obtained (such as a CLT, a Berry-Esseen theorem, a local CLT, a local large-deviations principle and an ASIP) for random variables of the form
$$S_n\varphi(\omega,x)=\sum_{j=0}^{n-1}\varphi(T_\omega^j x),$$
where $(\omega,x)$ are distributed according to a $\tau$-invariant measure $\mu$ of the form $P\times(h\,dm)$ for some continuous function $h$, which satisfies $\mathcal A h=h$. The latter assumption means that the maps $T_\omega$ preserve the same measure $\nu=h\,dm$. The point is that once quasicompactness is achieved the classical Nagaev-Guivarc'h method (see [33]) can be applied. This method was applied successfully to obtain limit theorems for deterministic dynamical systems (that is, when $T_\omega=T$ does not depend on $\omega$), and in [2] (see also [7]) this method was applied to obtain annealed limit theorems. We note that since both the function $\varphi$ and the measure $h\,dm$ do not depend on $\omega$, and all the maps $T_\omega$ preserve the measure $h\,dm$, the fiberwise centering constant $\mu_\omega(S_n^\omega\varphi)$ and the usual centering constant $\mu(S_n\varphi)$ are both equal to $n\int\varphi(x)h(x)\,dm(x)$. Hence, as discussed in the previous section, in this setup some annealed results such as the CLT already follow from the quenched ones. Independence here is crucial, since it yields that the iterates of the annealed transfer operator can be written as
$$\mathcal A^n=\int_\Omega\mathcal L_\omega^n\,dP(\omega),\qquad\text{where}\qquad \mathcal L_\omega^n=\mathcal L_{\sigma^{n-1}\omega}\circ\cdots\circ\mathcal L_{\sigma\omega}\circ\mathcal L_\omega,$$
which is the transfer operator of $T_\omega^n$. Hence, the statistical behavior of the iterates $\tau^n$ of the skew product can be described by the iterates of $\mathcal A$. Note that in this independent and identically distributed (i.i.d.) setup this approach works only when $\varphi(\omega,x)=\varphi(x)$ does not depend on $\omega$, since it requires substituting $\varphi$ (and appropriate functions of $\varphi$) into the annealed operator.
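To see where independence enters, the case $n=2$ of the identity for the iterates of the annealed operator can be checked directly. This is a sketch only: $P_0$ denotes the common law of a single coordinate on $Y$ (a name introduced here), $\mathcal L_y$ stands for $\mathcal L_\omega$ when $\omega_0=y$, and we assume the transfer operators may be moved inside the integral.

```latex
% Assumptions: \omega_0, \omega_1 are i.i.d. with law P_0 on Y (P_0 is hypothetical notation),
% \mathcal L_\omega = \mathcal L_{\omega_0}, and \mathcal L_{y'} commutes with the integral in y.
\begin{align*}
\mathcal A^2 g
  &= \int_Y \mathcal L_{y'}\Big(\int_Y \mathcal L_{y}\, g\,dP_0(y)\Big)\,dP_0(y') \\
  &= \int_Y\!\int_Y \mathcal L_{y'}\mathcal L_{y}\, g\,dP_0(y)\,dP_0(y') \\
  &= \int_\Omega \mathcal L_{\omega_1}\mathcal L_{\omega_0}\, g\,dP(\omega)
   = \int_\Omega \mathcal L_{\sigma\omega}\mathcal L_{\omega}\, g\,dP(\omega).
\end{align*}
```

The third equality holds because, under independence, the pair $(\omega_0,\omega_1)$ has law $P_0\times P_0$. For dependent coordinates the joint law is not a product, and this step breaks down.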

The motivation behind the present paper: non-i.i.d. maps and random functions.
The starting point of this paper is the observation that when the coordinates $(\omega_j)$ are not independent (that is, the maps $T_{\sigma^j\omega}$ are not i.i.d.) there is no apparent relation between the iterates $\tau^n$ of $\tau$ and the iterates of the annealed operator $\mathcal A$ defined above. Thus, a natural question arising from [2, 7] is which limit theorems hold true for mixing base maps with non-independent coordinates, and functions $\varphi$ which depend on $\omega$. Moreover, the assumptions in [2] require all the maps $T_\omega$ to preserve the same absolutely continuous measure $\nu=h\,dm$, and it is also desirable to prove limit theorems without such assumptions. (We refer to [46] for a CLT and large deviations for random i.i.d. intermittent maps in the case where the $T_\omega$ do not preserve the same measure.) We note that without the above assumptions even the CLT was not obtained before for the skew products considered in this paper, which will be our first result.
The question described above was also one of the main motivations in [26], where a CLT, a local CLT and a renewal theorem were obtained for several classes of skew products with mixing base maps such as Markov shifts and non-uniform Young towers, together with uniformly expanding random maps. These results were obtained by a certain type of integration argument; however, the method of [26] does not involve the iterates of an annealed transfer operator, and instead we studied directly integrals of the form $\int\mathcal L_\omega^n g_\omega\,dP(\omega)$, and their complex perturbations (relying on the fiberwise 'spectral' properties and a certain type of periodic point approach which was introduced in [31]). While [26] was the first paper to discuss limit theorems for skew products with non-independent fiber maps and random observables, all the results there were obtained for fiberwise centered observables $\varphi$ (that is, $\mu_\omega(\varphi_\omega)=0$). Moreover, the maps $T_\omega$ in [26] were uniformly expanding, the base map had a periodic point and the random transfer operators satisfied certain regularity assumptions as functions of $\omega$ around the periodic orbit. From this point of view, a second motivation for the present paper is to prove limit theorems for skew products with non-independent fiber maps $T_{\sigma^j\omega}$ without the fiberwise centering assumption and without additional topological assumptions such as the behavior around a periodic orbit. We note that, apart from the CLT, we did not consider in [26] any of the limit theorems obtained in the present paper, and so almost all the results in the present paper are new even under the fiberwise centering assumption.

1.3. Our new results and the method of the proofs. As explained in the previous section, the goal of this paper is to obtain limit theorems with deterministic centering conditions for skew products $\tau$ built over mixing base maps and non-uniformly expanding maps $T_\omega$. More precisely, we still consider a product space $\Omega=Y^{\mathbb Z}$, but with 'weakly dependent' coordinates $\omega_j$ instead of independent ones. We consider a family of non-uniformly expanding maps $T_\omega=T_{\omega_0}$ and observables of the form $\varphi(\omega,x)=\varphi_{\omega_0}(x)$ and prove limit theorems for sequences of the form $Z_n=S_n\varphi-n\int\varphi\,d\mu$, considered as random variables on the probability space $(\Omega\times X,\mathcal F\times\mathcal B,\mu)$, where $\mu=\int\mu_\omega\,dP$ is the unique $\tau$-invariant measure with $\mu_\omega$ being absolutely continuous with respect to $m$ (or when $\mu_\omega$ is a random Gibbs measure). In this setup we have $(T_\omega)_*\mu_\omega=\mu_{\sigma\omega}$, and in general the maps $T_\omega$ do not preserve the same measure. These results are obtained for a certain type of observables $\varphi$ so that $\varphi_\omega(\cdot)$ has bounded variation, uniformly in $\omega$. When the maps $T_\omega$ are expanding on average we will also have a certain scaling assumption (that is, $\operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_\omega\|_{BV})<\infty$ for some tempered random variable $K$), which was shown in [22] to be necessary for quenched limit theorems, and which is similarly necessary for obtaining limit theorems for the skew product. In what follows we will always assume that $\int\varphi\,d\mu=0$, which is not really a restriction since we can always replace $\varphi$ with $\varphi-\int\varphi\,d\mu$. We obtain our results using two different methods, as described below.
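1.3.1. Limit theorems for skew products: (functional) CLT, moment estimates, moderate-deviations and exponential concentration inequalities for α-mixing driving systems via the method of cumulants. Recall that the α-mixing (dependence) coefficient between two sub-σ-algebras $\mathcal G,\mathcal H$ of $\mathcal F$ is given by
$$\alpha(\mathcal G,\mathcal H)=\sup\{|P(A\cap B)-P(A)P(B)|:\ A\in\mathcal G,\ B\in\mathcal H\}.$$
Let $\mathcal F_{-\infty,k}$ be the σ-algebra generated by the coordinates $\omega_j$ at places $j\le k$ and $\mathcal F_{m,\infty}$ be the σ-algebra generated by the coordinates $\omega_j$ at places $j\ge m$. Then the α-dependence coefficients of the sequence of coordinates $(\omega_n)$ are defined by
$$\alpha_n=\sup_{k\in\mathbb Z}\alpha(\mathcal F_{-\infty,k},\mathcal F_{k+n,\infty})=\alpha(\mathcal F_{-\infty,0},\mathcal F_{n,\infty}),$$
where the last equality is due to stationarity of the process $(\omega_n)$.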
We assume first that $\alpha_n=O(e^{-cn^\eta})$ for some $c,\eta>0$ (that is, it is stretched exponential). The first step towards limit theorems is standard for stationary processes: we show that under the weaker condition $\sum_n n\alpha_n<\infty$, the limit $s^2=\lim_{n\to\infty}\frac1n\mathrm{Var}_\mu(S_n)$ exists and that it vanishes if and only if $\varphi$ admits an appropriate coboundary representation. When $s^2>0$ we show that $n^{-1/2}S_n$ converges in distribution towards a centered normal random variable with variance $s^2$, together with a convergence rate. An annealed CLT (that is, for independent maps) was obtained in [7] for random toral automorphisms and in [2] for more general maps. When the base map is only mixing (and $\varphi$ depends on $\omega$) it was obtained in [26] for fiberwise centered potentials (that is, $\mu_\omega(\varphi_\omega)=0$). One of the results in this paper is the CLT for stretched exponentially α-mixing base maps but without the fiberwise centering assumption (in fact, we will obtain a functional CLT; see Theorem 2.19 and the last paragraph of §1.3.1).
We also obtain a certain type of large-deviations results, often referred to as a moderate-deviations principle (see [14]). These results yield, for instance, that for every closed interval $[a,b]$ we have
$$\lim_{n\to\infty}\frac{1}{a_n^2}\ln\mu\Big\{\frac{S_n}{s\sqrt n\,a_n}\in[a,b]\Big\}=-\frac12\inf_{x\in[a,b]}x^2,$$
where $a_n$ is a sequence such that $a_n\to\infty$ and $a_n=o(n^{1/(2+4\gamma)})$, with $\gamma=1/\eta$. We also obtain several types of 'stretched' exponential concentration inequalities ((2.20), (2.21)) and Gaussian moment estimates of Rosenthal type (2.22). These results are obtained using the method of cumulants. More precisely, we first obtain a certain type of multiple correlation estimates (see Proposition 3.4), and then by applying a general theorem we conclude that the $k$th cumulant of the sum $S_n$ is at most of order $n(k!)^{1+\gamma}(c_0)^{k-2}$ for $k\ge3$, where $c_0$ is some constant (see Theorem 3.1). Then we can apply the method of cumulants [15, 49].
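To make the role of the cumulant bound more concrete, the computation below shows how a bound of order $n(k!)^{1+\gamma}c_0^{k-2}$ on the cumulants of $S_n$ turns into a bound of Statulevičius type for the normalized sum. It is a sketch under an extra assumption not stated above, namely a linear lower bound $\mathrm{Var}(S_n)\ge cn$ with some constant $c>0$; the names $c$, $v_n$ and $\Delta_n$ are introduced here for illustration.

```latex
% \Gamma_k(W): k-th cumulant of a random variable W with finite moments of all orders.
\[
\Gamma_k(W)=\frac{1}{i^k}\,\frac{d^k}{dt^k}\ln\mathbb E\big[e^{itW}\big]\Big|_{t=0},
\qquad \Gamma_k(aW)=a^k\,\Gamma_k(W)\quad(a\in\mathbb R).
\]
% Assumptions: |\Gamma_k(S_n)| \le n (k!)^{1+\gamma} c_0^{k-2} for k \ge 3,
% and v_n^2 := Var(S_n) \ge c n for a hypothetical constant c > 0.
\[
|\Gamma_k(S_n/v_n)| = v_n^{-k}\,|\Gamma_k(S_n)|
 \le \frac{n}{v_n^2}\,(k!)^{1+\gamma}\Big(\frac{c_0}{v_n}\Big)^{k-2}
 \le \frac{(k!)^{1+\gamma}}{\Delta_n^{\,k-2}},
\qquad \Delta_n := \frac{v_n}{c_0\max(1,c^{-1})}.
\]
```

The last inequality uses $n/v_n^2\le c^{-1}\le\max(1,c^{-1})^{k-2}$ for $k\ge3$. Since $\Delta_n$ grows at least like $\sqrt n$, the quantity $\Delta_n^{1/(1+2\gamma)}$ grows at least like $n^{1/(2+4\gamma)}$, which is the scale appearing in the condition $a_n=o(n^{1/(2+4\gamma)})$.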
In the annealed setup, using the quasicompactness of the annealed transfer operator, large-deviations principles and exponential concentration inequalities were obtained in [2], and the above results show that there is a similar behavior when the maps are not independent and the function ϕ depends on ω (see also the results in the next section where better exponential concentration inequalities are described).
The above multiple correlation estimates together with the method of cumulants and the Rosenthal-type moment estimates also yield a functional CLT. Let us consider the random function $S_n(t)=n^{-1/2}S_{[nt]}$ on $[0,1]$. Then we show that it converges in distribution in the Skorokhod space $D[0,1]$ to $sW$, where $W$ is a standard Brownian motion and $s^2$ is the asymptotic variance.
1.3.2. Limit theorems for skew products with φ- or ψ-mixing driving systems via martingale methods: almost sure invariance principle, concentration inequalities and maximal moment estimates. One of the strongest methods to prove CLTs and related results in probability theory and dynamical systems is the so-called martingale-coboundary representation (Gordin's method). For a sufficiently chaotic dynamical system $(Y,\mathcal G,\mu,T)$ and an observable $\varphi:Y\to\mathbb R$ it means that $\varphi$ can be represented as $\varphi=u+\chi-\chi\circ T$ for some sufficiently regular function $\chi$, and $(u\circ T^n)$ forms a reverse martingale difference. Such results are well known for deterministic expanding (or hyperbolic) dynamical systems, and we refer to [16, 22, 42] for quenched and sequential versions of such martingale methods.
Recall that the φ-mixing and ψ-mixing (dependence) coefficients between two sub-σ-algebras $\mathcal G,\mathcal H$ of $\mathcal F$ are given by The reverse φ-mixing coefficients of the sequence of coordinates $(\omega_n)$ are defined by while the ψ-mixing coefficients of $(\xi_n)$ are defined by where $\mathcal F_{n,m}$ is as defined before (1.2). It is clear from the definitions of the mixing coefficients that When the sequence $(\omega_n)$ is (sufficiently fast) φ- or ψ-mixing we obtain a certain type of $L^\infty$ martingale-coboundary representation (that is, $\chi\in L^\infty$) for the underlying class of observables $\varphi$ with respect to the skew product $\tau$. This was already established in [2] in the annealed setup (that is, when $(\omega_n)$ is an i.i.d. sequence), and here, using different arguments, we obtain such a representation for skew products with mixing base maps.
Once an $L^\infty$ martingale-coboundary decomposition is achieved, as usual, we can apply the Azuma-Hoeffding inequality together with Chernoff's bounding method and obtain exponential concentration inequalities of the form where $c_1,c_2,c_3$ are positive constants. These concentration inequalities are better than the ones we obtain using the method of cumulants, although they involve the stronger notions of φ- or ψ-mixing instead of α-mixing. (However, they only require summable φ- or ψ-mixing coefficients and not stretched exponential ones.) Another immediate consequence is moment estimates of the form which hold for every $1\le p<\infty$. Such results are known in the annealed case [2], and we extend them to the skew products considered in this paper.
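The generic mechanism behind such concentration inequalities can be sketched as follows. This is the textbook Azuma-Hoeffding/Chernoff argument, not the exact inequality proved in the paper; the constant $c$ and the names $M_n$, $u$, $\chi$ are illustrative, with $S_n=M_n+\chi-\chi\circ\tau^n$, $\chi\in L^\infty$ and $M_n=\sum_{j=0}^{n-1}u\circ\tau^j$ a sum of (reverse) martingale differences bounded by $c$.

```latex
% Assumptions (illustrative notation): S_n = M_n + \chi - \chi\circ\tau^n with \chi \in L^\infty,
% M_n = \sum_{j=0}^{n-1} u\circ\tau^j, (u\circ\tau^j) reverse martingale differences, \|u\|_\infty \le c.
\begin{align*}
\mathbb E\big[e^{\lambda M_n}\big] &\le e^{n\lambda^2c^2/2}
  && \text{(Hoeffding's lemma, applied conditionally $n$ times)},\\
\mu\{M_n\ge t\} &\le e^{-\lambda t}\,\mathbb E\big[e^{\lambda M_n}\big]\le e^{-\lambda t+n\lambda^2c^2/2}
  && \text{(Chernoff bound, any $\lambda>0$)},\\
\mu\{M_n\ge t\} &\le e^{-t^2/(2nc^2)}
  && \text{(choice $\lambda=t/(nc^2)$)},\\
\mu\{S_n\ge t\} &\le \mu\{M_n\ge t-2\|\chi\|_\infty\}\le e^{-(t-2\|\chi\|_\infty)^2/(2nc^2)}
  && \text{for } t\ge 2\|\chi\|_\infty .
\end{align*}
```

The last line is where the $L^\infty$ bound on $\chi$ matters: it guarantees $|S_n-M_n|\le2\|\chi\|_\infty$ uniformly in $n$, so the coboundary only shifts the threshold $t$ by a constant.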
The idea behind the martingale-coboundary representation is as follows. Consider the sub-σ-algebra $\mathcal F_0$ of $\Omega\times X$ generated by the projection $\pi_0(\omega,x)=((\omega_j)_{j\ge0},x)$, where $\omega=(\omega_j)_{j\in\mathbb Z}$. Then $\tau$ preserves $\mathcal F_0$ since $T_\omega=T_{\omega_0}$ depends only on $\omega_0$, and $\mathcal F_0$ can be viewed as a subsystem (or a factor) given by $(\Omega\times X,\mathcal F_0,\mu,\tau)$. Our main argument is as follows. Let $\mathcal K$ be the transfer operator corresponding to the invariant σ-algebra $\mathcal F_0$, namely the one defined by the duality relation Then we show that, under quite mild φ- or ψ-mixing rates for the sequence of coordinates $(\omega_n)$, the iterates $\mathcal K^n\varphi$ of the transfer operator $\mathcal K$ corresponding to this system converge fast enough in $L^\infty(\mu)$ towards $\mu(\varphi)\mathbf 1$, where $\mathbf 1$ is the function taking the constant value 1, and $\varphi$ is our given observable. This convergence can be established for every function $\varphi$ so that $\|\varphi\|_{K,2}=\operatorname{esssup}_{\omega\in\Omega}(K(\omega)^2\|\varphi(\omega,\cdot)\|_{BV})<\infty$ for an appropriate tempered random variable $K(\omega)$, or for any observable with $\operatorname{esssup}_{\omega\in\Omega}\|\varphi(\omega,\cdot)\|_{BV}<\infty$ when the maps $T_\omega$ are uniformly expanding. We stress that in any case this is not a spectral result (even under exponential mixing), since the convergence of $\mathcal K^n$ is not in an operator norm, and, in general, it does not have an exponential rate. Indeed, we only prove that where $\delta\in(0,1)$ and $\phi_R(\cdot)$ and $\psi(\cdot)$ are the reverse φ-mixing coefficients and ψ-mixing coefficients defined in (1.3) and (1.4), respectively.
Another consequence of the martingale-coboundary representation is the ASIP, which in our context concerns almost sure approximation of the Birkhoff sum by Gaussians. The ASIP for random (and sequential) dynamical systems has been studied by several authors in recent years (see, for instance, [16, 20-22, 32, 50, 51]), and in this paper we will focus on the ASIP for Birkhoff sums generated by the skew product.
In [13] the authors proved that, under certain assumptions, a reverse martingale $M_n$ can be approximated almost surely by a sum of independent Gaussians. One consequence of the methods in [13] is for sums of the form $W_n=\sum_{j=0}^{n-1}\varphi\circ\tau^j$. For such sums, the conditions of [13, Theorem 3.2] show that there is a coupling with a sequence of i.i.d. centered normal random variables $Z_j$ with variance In our notation, the first and second conditions of [13, Theorem 3.2] about $\mathcal K$ can be verified using (1.5). In order to show that the third (and last) condition about $\mathcal K$ in [13, Theorem 3.2] is in force we will also need to provide more general estimates on expressions of the form We note that in [2] the annealed ASIP was obtained using Gouëzel's approach [24] and not the martingale-coboundary approach. Gouëzel's approach was also used in [5] to obtain an ASIP for non-independent maps with mixing base maps, but as indicated in [5] the results are mostly applicable for Gordin-Denker maps.

Finally, we also prove a vector-valued ASIP for skew products with uniformly expanding random maps and exponentially fast α-mixing base maps via the method of Gouëzel [24]. As we have mentioned above, this method was applied in [2] in the annealed setting, while in [5] it was applied for Gordin-Denker systems. In a final section we also discuss a few extensions, such as different types of mixing base maps (Young towers or Gibbs-Markov maps), application of the method of cumulants for non-conventional sums of the form $S_n=\sum_{m=1}^{n}\prod_{j=1}^{\ell}\varphi_j\circ\tau^{q_j(m)}$ for polynomials $q_j(m)$, as well as extensions of the results for different classes of random expanding maps (the ones in [44]).

Preliminaries and main results
2.1. The random maps. We begin by recalling the setup from [12]. Let $(X,\mathcal G)$ be a measurable space endowed with a probability measure $m$ and a notion of a variation $v$: Then $BV$ is a Banach space with respect to the norm (2.1)
Remark 2.2. We observe that in [12], assumption (V5) is replaced by the weaker $v(1)<+\infty$. However, for the examples we have in mind, our stronger version is satisfied.
The rest of our setup is almost identical to [22], with a single additional requirement that will be indicated in what follows. Let $(\Omega,\mathcal F,P)$ be a probability space and $\sigma:\Omega\to\Omega$ an invertible ergodic measure-preserving transformation. Let $T_\omega:X\to X$, $\omega\in\Omega$, be a collection of non-singular transformations (that is, $m\circ T_\omega^{-1}\ll m$ for each $\omega$) acting on $X$. Each transformation $T_\omega$ induces the corresponding transfer operator $\mathcal L_\omega$ acting on $L^1(X,m)$ and defined by the duality relation
$$\int_X(\mathcal L_\omega f)\,g\,dm=\int_X f\,(g\circ T_\omega)\,dm,\qquad f\in L^1(X,m),\ g\in L^\infty(X,m).\qquad(2.2)$$
Thus, we obtain a cocycle of transfer operators $(\Omega,\mathcal F,P,\sigma,L^1(X,m),\mathcal L)$ that we denote by $\mathcal L=(\mathcal L_\omega)_{\omega\in\Omega}$. For $\omega\in\Omega$ and $n\in\mathbb N$, set We recall the notion of a tempered random variable.
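Definition 2.3. We say that a measurable map $K:\Omega\to(0,+\infty)$ is tempered if $\lim_{n\to\pm\infty}\frac{1}{n}\log K(\sigma^n\omega)=0$ for $P$-a.e. $\omega\in\Omega$.
In this paper we will consider the following assumptions on the random transfer operators.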
Definition 2.4. A cocycle $\mathcal L=(\mathcal L_\omega)_{\omega\in\Omega}$ of transfer operators is said to be good if the following conditions hold.
• $\Omega$ is a Borel subset of a separable, complete metric space and $\sigma$ is a homeomorphism. Moreover, $\mathcal L$ is $P$-continuous, that is, $\Omega$ can be written as a countable union of measurable sets such that $\omega\mapsto\mathcal L_\omega$ is continuous on each of those sets.
• There exist $N\in\mathbb N$ and random variables $\alpha,K:\Omega\to(0,+\infty)$ such that $\int_\Omega\log\alpha\,dP<0$, $\log K\in L^1(\Omega,P)$ and, for $P$-a.e. $\omega\in\Omega$ and $h\in BV$,
• For each $a>0$ and $P$-a.e. $\omega\in\Omega$, there exist random numbers $n_c(\omega)<+\infty$ and where
Finally, we say that the cocycle $\mathcal L$ is uniformly random if the random variables $C$, $\alpha^N$, $K^N$ and $n_c$ are constants and $\alpha_n(\omega)$ does not depend on $n$ and $\omega$.

Remark 2.5
• Definition 2.4 almost coincides with [22, Definition 3], the only difference being the addition of (2.3) (which was considered in [22, §3]).
• The log-integrability assumption specified at the end of Definition 2.4 may easily be checked on explicit examples (see, for example, the discussion in [6, Remark 2.12]).
• Furthermore, this assumption implies a certain version of the 'random covering' similar to (2.4); see [22, Remark 4].
Let us now give examples of systems satisfying our requirements.Our first example is essentially taken from [12].
Example 2.6.(Lasota-Yorke cocycles) Consider X = [0, 1], endowed with Lebesgue measure m and the classical notion of variation v.We say that T : X → X is a piecewise monotonic non-singular (p.m.n.s.) map if the following conditions hold.
• T is piecewise monotonic, that is, there exists a subdivision 0 We consider a family (T ω ) ω∈ of random p.m.n.s. as above, and such that T : × we make the following assumptions.
The following example can be fruitfully compared to a similar one by Kifer [38].
Example 2.7. We consider $X=S^1$, endowed with the Lebesgue measure $m$ and the notion of variation given by $v(\phi):=\int_X|\phi'|\,dm=\|\phi'\|_{L^1}$. We consider a measurable map In addition, we make the following assumptions.
• There exists a tempered random variable N(ω) so that (2.3) holds true.

2.2. The one-dimensionality of the top Oseledets space: a summary of known results. In this section we recall two results from [22] that will be in constant use in the course of the proofs of all of our results.
2.3. Main results: limit theorems for mixing base maps
2.4. The observable. Let us take a measurable $\varphi:\Omega\times X\to\mathbb R$ so that $\int\varphi\,d\mu=0$. Let $K(\omega)$ be the tempered random variable defined by where D(ω), D(ω) and N(ω) are specified in the definition of a good cocycle and in Theorem 2.8 and Corollary 2.9. In order to describe our assumptions on the observable $\varphi$, we will need the following classical result (see [4, Proposition 4.3.3]).
PROPOSITION 2.10. Let $K:\Omega\to(0,+\infty)$ be a tempered random variable. For each $\varepsilon>0$, there exists a tempered random variable for $P$-a.e. $\omega\in\Omega$ and $n\in\mathbb Z$.
Remark 2.11.From now on we will replace both λ and λ by their minimum, which for notational convenience will be denoted by λ.
In what follows we will consider an observable $\varphi:\Omega\times X\to\mathbb R$ satisfying the scaling condition which was first introduced in [23]. In the uniformly random case K(ω) (and hence K(ω)) can be replaced by a positive constant, and so the scaling condition reads The main goal in this paper is to obtain limit theorems for the sequence of functions under certain mixing assumptions on the driving system $(\Omega,\mathcal F,P,\sigma)$ and the above assumptions on the observable $\varphi$.
Remark 2.12. For expanding-on-average maps the scaling condition (2.16) is necessary for limit theorems (see [22, Appendix]). In any case, our results are also new in the uniformly random case, and readers who would prefer can just consider this case together with the assumption that $\operatorname{esssup}_{\omega\in\Omega}\|\varphi_\omega\|_{BV}<\infty$.
Let us also note that, in general, the random variable $K(\omega)$ comes from the Oseledets theorem and it is not computable. In order to provide explicit conditions for quenched limit theorems, in [28] several examples of non-uniformly expanding maps (a condition stronger than expansion on average) were given with the property that Here the $BV$ norm is with respect to the choice of variation $v(g)=v_\alpha(g)$, where $v_\alpha$ is the Hölder constant corresponding to some exponent $\alpha$, and $B(\omega)$ and $\rho(\omega)\in(0,1)$ are random variables with explicit formulas, and they depend only on the zeroth coordinate $\omega_0$. Moreover, for several of these examples we already have $B(\omega)\le B$ for some constant $B$. In this case (similarly to [22, §5.2]) we have the following assertions.
Let ε be smaller than 1 Then, for $P$-a.e. $\omega\in\Omega$ and $n\in\mathbb N$, where N(ω) . Observe that for $k\ge1$, Thus, if the stationary process $(\mathbb I_A\circ\sigma^n)$ satisfies an appropriate concentration inequality (for example, under appropriate mixing assumptions on $(\xi_n)$), we can conclude that $N(\omega)$ is integrable. Hence, $\log K$ is integrable and consequently also tempered.
The above means that in this situation we can express the condition on ϕ by means of the more explicit random variable K(ω) defined above.Still, in the setup of [28], under appropriate integrability conditions on B(ω) the main results in this paper can be obtained under conditions such as ϕ ∈ L p (μ) for p large enough (depending on the desired result).Since this approach requires several non-trivial modifications to the arguments in this paper such results will be considered elsewhere.

Limit theorems.
Let us first introduce our assumptions on the base map. Let $(\xi_n)$ be a two-sided stationary sequence taking values on some measurable space $Y$. We assume here that $(\Omega,\mathcal F,P,\sigma)$ is the corresponding shift system. Namely, $\Omega=Y^{\mathbb Z}$, $\sigma\omega=(\omega_{j+1})_j$ is the left shift and if $\pi_0:\Omega\to Y$ denotes the zeroth coordinate projection, then $(\xi_n)$ has the same distribution as $(\pi_0\circ\sigma^n)$. We also assume that $T_\omega=T_{\omega_0}$ and $\varphi(\omega,\cdot)=\varphi(\omega_0,\cdot)$ depend only on the zeroth coordinate $\omega_0$ of $\omega$.

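2.5.1. Limit theorems for stretched exponentially fast α-mixing driving processes. Let $(\Omega_0,\mathcal F,P)$ be the probability space on which $(\xi_n)$ is defined. We recall that the α-mixing (dependence) coefficient between two sub-σ-algebras $\mathcal G,\mathcal H$ of $\mathcal F$ is given by
$$\alpha(\mathcal G,\mathcal H)=\sup\{|P(A\cap B)-P(A)P(B)|:\ A\in\mathcal G,\ B\in\mathcal H\}.$$
The α-dependence coefficients of $(\xi_n)$ are defined by
$$\alpha_n=\sup_{k\in\mathbb Z}\alpha(\mathcal F_{-\infty,k},\mathcal F_{k+n,\infty})=\alpha(\mathcal F_{-\infty,0},\mathcal F_{n,\infty}),$$
where $\mathcal F_{-\infty,k}$ is the σ-algebra generated by $\xi_j$, $j\le k$, and $\mathcal F_{k+n,\infty}$ is generated by $\xi_j$, $j\ge k+n$. The last equality holds true due to stationarity. Let us consider the following class of mixing assumptions on the base map.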
Assumption 2.13. (Stretched exponential α-mixing rates) There exist positive constants $c_1$, $c_2$ and $\eta$ such that $\alpha_n\le c_1e^{-c_2n^\eta}$ for every $n$.
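For orientation, Assumption 2.13 is indeed stronger than the summability condition $\sum_n n\alpha_n<\infty$. The short computation below is a sketch of this implication; the auxiliary integer $m$ is introduced here and is any integer with $m\eta>2$.

```latex
% Sketch: Assumption 2.13 implies \sum_n n\,\alpha_n < \infty.
% m is an auxiliary integer (not in the original text) chosen so that m\eta > 2.
\begin{align*}
e^{x}\ge\frac{x^m}{m!}\quad(x\ge0)
  \quad&\Longrightarrow\quad
  e^{-c_2n^\eta}\le\frac{m!}{c_2^{\,m}}\,n^{-m\eta},\\
\sum_{n\ge1}n\,\alpha_n
  &\le c_1\sum_{n\ge1}n\,e^{-c_2n^\eta}
   \le\frac{c_1\,m!}{c_2^{\,m}}\sum_{n\ge1}n^{1-m\eta}<\infty .
\end{align*}
```

The final series converges because $m\eta-1>1$. The same argument shows that stretched exponential rates imply $\sum_n n^p\alpha_n<\infty$ for every $p$.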
Our first result concerns the variance of S n and the CLT (with rates).
THEOREM 2.14. Suppose that the cocycle $\mathcal L$ is good. Let $\varphi$ be an observable such that $\|\varphi\|_K:=\operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_\omega\|_{BV})<\infty$, where $\varphi_\omega=\varphi(\omega,\cdot)$. Suppose that $\sum_n n\alpha_n<\infty$. Then the limit $s^2=\lim_{n\to\infty}\frac1n\mathrm{Var}_\mu(S_n)$ exists and vanishes if and only if $\varphi=r\circ\tau-r$ for some $r\in L^2(\mu)$. If in addition Assumption 2.13 is satisfied then $n^{-1/2}S_n$ converges in distribution to $sZ$, where $Z$ is a standard normal random variable. Moreover, there is a constant $C>0$ such that, for all $n\in\mathbb N$, where $\gamma=1/\eta$ and $\Phi$ is the standard normal distribution function. The constant $C$ depends only on $c_1$, $c_2$, $\eta$, $\|\varphi\|_K$ and the constant $C_v$ (from the definition of the variation $v(\cdot)$), and an explicit formula for $C$ can be recovered from the proof.
The proof of Theorem 2.14 appears in §3.2.1. As discussed in §§1.2 and 1.3, when the quenched CLT holds true with a deterministic centering, then the CLT for the skew product follows by integration. This was the approach for the CLT in [2], but in the setup of this paper the function $\varphi$ and the measure $\mu_\omega$ depend on $\omega$, and so the quenched CLT only holds with fiberwise centering. Thus, the novelty of Theorem 2.14 is that the CLT is obtained for the skew product beyond the annealed case considered in [2]. Moreover, Theorem 2.19 also strengthens the CLT in [26], since our maps $T_\omega$ are not uniformly expanding, and the observable $\varphi$ is not fiberwise centered.

All the constants depend only on c 1 , c 2 , η, ϕ K and C v from the definition of the variation v(•), and an explicit formula for them can be recovered from the proof.
We will also prove the following theorem.
(i) Set $v_n=\sqrt{\mathrm{Var}(S_n)}$, and when $v_n>0$ also set $Z_n=(S_n-E[S_n])/v_n$. Let $\Phi$ be the standard normal distribution function. Then there exist constants $a_3,a_4,a_5>0$ such that for every $n\ge a_3$ we have $v_n>0$, and for every 0 (2.21) The constants $a_4$, $a_5$ depend only on $c_1$, $c_2$, $\eta$, $\|\varphi\|_K$ and $C_v$, and an explicit formula for them can be recovered from the proof.
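(ii) Let $a_n$, $n\ge1$, be a sequence of real numbers so that $\lim_{n\to\infty}a_n=\infty$ and $\lim_{n\to\infty}a_nn^{-1/(2+4\gamma)}=0$. Then the sequence $W_n=(sn^{1/2}a_n)^{-1}S_n$, $n\ge1$, satisfies the moderate-deviations principle with speed $s_n=a_n^2$ and the rate function $I(x)=x^2/2$. Namely, for every Borel measurable set $\Gamma\subset\mathbb R$,
$$-\inf_{x\in\Gamma^o}I(x)\le\liminf_{n\to\infty}\frac{1}{a_n^2}\ln\mu\{W_n\in\Gamma\}\le\limsup_{n\to\infty}\frac{1}{a_n^2}\ln\mu\{W_n\in\Gamma\}\le-\inf_{x\in\overline\Gamma}I(x),$$
where $\Gamma^o$ is the interior of $\Gamma$ and $\overline\Gamma$ is its closure.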
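As a simple illustration of how the moderate-deviations principle is used, take a half-line $\Gamma=[a,\infty)$ with $a>0$. The computation below is a sketch assuming the principle in its standard form (lower bound over the interior, upper bound over the closure) and $s>0$.

```latex
% Illustration with \Gamma = [a,\infty), a > 0, and I(x) = x^2/2:
% \inf_{x>a} I(x) = \inf_{x\ge a} I(x) = a^2/2, so the lower and upper bounds coincide.
\[
\lim_{n\to\infty}\frac{1}{a_n^2}\,\ln\mu\big\{S_n\ge a\,s\,\sqrt n\,a_n\big\}=-\frac{a^2}{2}.
\]
```

In words, deviations of $S_n$ of size $\sqrt n\,a_n$, which lie between the CLT scale $\sqrt n$ and the large-deviations scale $n$, have Gaussian-type tails.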
We also obtain the following Rosenthal-type moment estimates.
THEOREM 2.17. Suppose that $\mathcal L$ is a good cocycle. If $\|\varphi\|_K=\operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_\omega\|_{BV})<\infty$, then under Assumption 2.13 there exists a constant $c_0$ such that, with $\gamma=1/\eta$, for every integer $p\ge1$, we have where $Z$ is a standard normal random variable. In particular, $\|S_n-E_\mu[S_n]\|_{L^p}=O(\sqrt n)$ for every $p$. As in the previous theorems, the constant $c_0$ depends (explicitly) only on $c_1$, $c_2$, $\eta$, $\|\varphi\|_K$ and $C_v$.
We remark that Theorem 2.17 provides another proof of the CLT by the method of moments. Indeed, if $s^2 > 0$ then it follows that, for every integer $p \ge 1$, the $p$th moment of , where $s^2$ is the asymptotic variance. In fact, for even $p$ we get the convergence rate $O(n^{-1/2})$, while for odd $p$ we get the rate $O(n^{-1})$.
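The limiting values in a method-of-moments argument are the moments of a standard normal variable $Z$, namely $\mathbb{E}[Z^p] = (p-1)!!$ for even $p$ and $0$ for odd $p$. The following minimal Python sketch tabulates them; the function name `gaussian_moment` is hypothetical and is used only for illustration.

```python
def gaussian_moment(p: int) -> int:
    """Return E[Z^p] for a standard normal Z: (p-1)!! if p is even, 0 if p is odd."""
    if p < 0:
        raise ValueError("p must be a non-negative integer")
    if p % 2 == 1:
        return 0  # odd moments vanish by symmetry
    result = 1
    for k in range(1, p, 2):  # product 1 * 3 * 5 * ... * (p - 1)
        result *= k
    return result


if __name__ == "__main__":
    for p in range(1, 9):
        print(p, gaussian_moment(p))
```

Comparing empirical moments of a normalized sum with these values is one way to sanity-check a CLT numerically.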
Theorems 2.15-2.17 are well established for sufficiently fast mixing (in the probabilistic sense) sequences of random variables, where one of the most notable methods of proof is the so-called method of cumulants (see [49]). For random dynamical systems, a moderate-deviations principle was obtained in [19], using a random complex Perron-Frobenius theorem. In the setup of [2], annealed (local) large-deviations principles and exponential concentration inequalities were obtained for i.i.d. maps, and we expect that for independent maps the methods in [2] will yield results like Theorems 2.15-2.17 as well. The novelty in Theorems 2.15-2.17 is that we show how to apply the method of cumulants in the context of skew products with non-independent fiber maps, which results in concentration inequalities, moderate-deviations principles and Gaussian moment estimates beyond the annealed setup [2].

Finally, let us consider the random function
We also obtain a functional CLT.

THEOREM 2.19. Let L be a good cocycle. Suppose that $\operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_\omega\|_{BV}) < \infty$ and that Assumption 2.13 holds true. Then the random function $S_n$ converges in distribution towards the distribution of $\{sW_t\}$, where $W$ is a standard Brownian motion (restricted to $[0, 1]$) and $s^2$ is the asymptotic variance.
Remark 2.20. The proof of Theorem 2.19 appears in §3.3. In [2] an ASIP was obtained, which yields the functional CLT. In §2.5.2 below, using different mixing coefficients for the base map, we will obtain an ASIP for the more general skew products considered in this paper. However, Theorem 2.19 shows that the functional CLT already holds true for stretched exponential α-mixing base maps.

An almost sure invariance principle and exponential concentration inequalities for φ- and ψ-mixing driving processes (via martingale methods).
Let $(\Omega_0, \mathcal{F}, \mathbb{P})$ be the probability space on which $(\xi_n)$ is defined. We recall that the $\phi$-mixing and $\psi$-mixing (dependence) coefficients between two sub-$\sigma$-algebras $\mathcal{G}, \mathcal{H}$ of $\mathcal{F}$ are given by The reverse $\phi$-mixing coefficients of $(\xi_n)$ are defined by while the $\psi$-mixing coefficients of $(\xi_n)$ are defined by

Let $\mathcal{F}_0$ be the $\sigma$-algebra generated by the map $\pi(\omega, x) = ((\omega_j)_{j \ge 0}, x)$, namely the one generated by $\mathcal{B}$ and the coordinates with non-negative indexes in the $\omega$ direction. If either $\operatorname{essinf} \inf_x h_\omega(x) > 0$ and $\sum_n \phi_{n,R} < \infty$ or $\sum_n \psi_n < \infty$ then there is an ) is a reverse martingale difference with respect to the reverse filtration $\{\tau^{-n}\mathcal{F}_0\}$. As a consequence, we have the following assertions.

(i) There are constants $a_1, a_2, a_3 > 0$ such that the following exponential concentration inequality holds true: for every $t > 0$, we have

The constants $a_1, a_2, a_3$ depend only on $\sum_n \phi_{n,R}$ and $c$ (or on $\sum_n \psi_n$), the constant $C_v$ and $\|\varphi\|_{K,2} = \operatorname{esssup}_{\omega\in\Omega}(K(\omega)^2\|\varphi_\omega\|_{BV})$, and an explicit formula for them can be recovered from the proof.

(ii) For every $p \ge 2$, we have where $C_p > 0$ is a constant (which can be recovered from the proof and depends only on $p$ and the above constants).
We refer readers to [43] for some related moment bounds for random intermittent maps. The proof of Theorem 2.21 appears in §4. Let us note that once the martingale-coboundary representation $\varphi = u + \chi - \chi \circ \tau$ is established, Theorem 2.21(i) follows from the Azuma-Hoeffding inequality together with Chernoff's bounding method, and Theorem 2.21(ii) follows from the so-called Rio inequality [48] (see [45, Proposition 7]).
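The usefulness of a representation $\varphi = u + \chi - \chi\circ\tau$ rests on a telescoping identity: the Birkhoff sum of $\varphi$ equals the Birkhoff sum of $u$ plus the boundary term $\chi - \chi\circ\tau^n$. The sketch below checks this on a toy self-map of a finite set; the data `tau`, `u`, `chi` and all function names are hypothetical and have nothing to do with the actual skew product.

```python
def birkhoff_sum(f, tau, x, n):
    """Return sum_{j=0}^{n-1} f(tau^j(x)) for a self-map tau of {0, ..., N-1}."""
    total = 0
    for _ in range(n):
        total += f[x]
        x = tau[x]
    return total


def iterate(tau, x, n):
    """Return tau^n(x)."""
    for _ in range(n):
        x = tau[x]
    return x


# Hypothetical toy data: tau is an arbitrary self-map, u and chi arbitrary functions.
tau = [1, 2, 3, 4, 0, 2]
u = [3, -1, 4, -1, 5, -9]
chi = [2, 7, 1, 8, 2, 8]
# Define phi through the coboundary relation phi = u + chi - chi o tau.
phi = [u[x] + chi[x] - chi[tau[x]] for x in range(len(tau))]


def telescoping_gap(x, n):
    """Difference between S_n(phi)(x) and S_n(u)(x) + chi(x) - chi(tau^n x); should be 0."""
    lhs = birkhoff_sum(phi, tau, x, n)
    rhs = birkhoff_sum(u, tau, x, n) + chi[x] - chi[iterate(tau, x, n)]
    return lhs - rhs
```

Since $\chi$ is bounded, the boundary term stays bounded in $n$, so inequalities for the sums of $u\circ\tau^j$ transfer to the sums of $\varphi\circ\tau^j$ up to an additive constant.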
To obtain the martingale-coboundary representation we show that if $\mathcal{K}$ is the transfer operator (namely, the one satisfying the duality relation) corresponding to the system $(\Omega \times X, \mathcal{F}_0, \mu, \tau)$ then there is a constant $C > 0$ such that where $\gamma_n$ is either $\psi_n$ or $\phi_{n,R}$, depending on the case, and $\delta \in (0, 1)$. Once this is established we can take $\chi = \sum_{n=1}^{\infty} \mathcal{K}^n \varphi$. The proof of (2.27) is given in Proposition 4.3(i).
Our next result is an ASIP.
When $\operatorname{essinf} \inf_x h_\omega(x) > 0$ we set $\gamma_n = \phi_{n,R}$, while otherwise we set $\gamma_n = \psi_n$. In both cases, assume that Then the limit exists and the following version of the ASIP holds true: there is a coupling of $(\varphi \circ \tau^n)$ with a sequence of i.i.d. Gaussian random variables $Z_j$ with zero mean and variance $s^2$ such that $\sup_{1 \le k \le n}$

Remark 2.23. The ASIP implies the functional CLT, see [47]. Thus, Theorem 2.22 yields better results than Theorem 2.19 for $\phi_R$- or $\psi$-mixing driving sequences (which are not necessarily stretched exponentially mixing).
The proof of Theorem 2.22 appears in §4 and relies on an application of [13, Theorem 3.2].In addition to (2.27), in order to apply [13, Theorem 3.2] we will show that for all 1 ≤ i, j ≤ n we have where φ = ϕ − μ(ϕ), C is a constant and δ and γ n are as in (2.27).The proof of (2.28) is given in Proposition 4.3 (ii).
Remark 2.24.As discussed in §1.3.2, the martingale-coboundary decomposition in Theorem 2.21 (and its consequences) is comparable with the annealed case [2], and the main novelty is that we obtain it for more general skew products and functions ϕ which depend on ω.Moreover, we do not assume that all T ω preserve the same absolutely continuous probability measure.The ASIPs we obtain are comparable to ASIPs in [2] (see the discussion in §1.3.2).

A vector-valued almost sure invariance principle in the uniformly random case for exponentially fast α-mixing base maps. Let us take a vector-valued measurable function
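$\cdot$) depend on $\omega$ only through $\omega_0$ and $\operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_{\omega,i}\|_{BV}) < \infty$ for all $1 \le i \le d$. Let us also assume that $\mu(\varphi_i) = 0$ for every $i$. Set $S_n = \sum_{j=0}^{n-1} \varphi \circ \tau^j$.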

THEOREM 2.25. Suppose that $\alpha_n = O(\alpha^n)$ for some $\alpha \in (0, 1)$. Then there is a positive semidefinite matrix $\Sigma^2$ such that Moreover, $\Sigma^2$ is positive definite if and only if $\varphi \cdot v \ne r - r \circ \tau$ for all unit vectors $v$ and all $r \in L^2$. Assume now that there are constants $C > 0$ and $\delta \in (0, 1)$ so that namely, that $K(\omega)$ is a bounded random variable. Then there is a coupling of $(\varphi \circ \tau^n)$ with a sequence of independent Gaussian centered random vectors $(Z_n)$ such that $\mathrm{Cov}(Z_n) = \Sigma^2$ and for every $\varepsilon > 0$, ) almost surely.

Limit theorems via the method of cumulants for α-mixing driving processes
We recall next that the $k$th cumulant of a random variable $W$ with finite moments of all orders is given by From now on we will assume that $\mathbb{E}[S_n] = 0$ for all $n$, that is, we will replace $\varphi$ by $\varphi - \mu(\varphi)$. The main result in this section is the following theorem.

THEOREM 3.1. Let L be a good cocycle, and suppose that Assumption 2.13 holds true and that $\|\varphi\|_K = \operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_\omega\|_{BV}) < \infty$. Then, with $\gamma = 1/\eta$, there exists a constant $c_0$ which depends only on $\|\varphi\|_K$ and the constants from Assumption 2.13 such that, for any $k \ge 3$, We will prove Theorem 3.1 by applying the following Proposition 3.3, which appears in [25] as Corollary 3.2.
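As an aside on the cumulants recalled above: they can be computed from the raw moments $m_k = \mathbb{E}[W^k]$ through the standard recursion $\kappa_n = m_n - \sum_{j=1}^{n-1}\binom{n-1}{j-1}\kappa_j m_{n-j}$. The sketch below is a generic illustration of that identity, not a formula taken from the paper; the function name `cumulants_from_moments` is hypothetical.

```python
from math import comb


def cumulants_from_moments(moments):
    """Given moments[k] = E[W^k] for k = 0..K (with moments[0] == 1),
    return the list [kappa_1, ..., kappa_K] of cumulants of W."""
    K = len(moments) - 1
    kappa = [0] * (K + 1)  # kappa[0] is unused padding
    for n in range(1, K + 1):
        # Standard moment-cumulant recursion.
        correction = sum(
            comb(n - 1, j - 1) * kappa[j] * moments[n - j] for j in range(1, n)
        )
        kappa[n] = moments[n] - correction
    return kappa[1:]


if __name__ == "__main__":
    # Moments of a standard normal variable up to order 6.
    print(cumulants_from_moments([1, 0, 1, 0, 3, 0, 15]))
```

For a standard normal variable all cumulants of order at least 3 vanish, which is why bounds on the growth of the higher cumulants of a normalized sum quantify its distance from the normal law.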
Let us start with a few preparations. Let $V$ be a finite set and $\rho$ : We assume here that there exist $c_0 \ge 1$ and $u_0 \ge 0$ such that for all $v \in V$ and $s \ge 1$.

Spectral method 139
Next, let X v , v ∈ V be a collection of centered random variables with finite moments of all orders, and for each v ∈ V and t ∈ (0, ∞] let v,t ∈ (0, ∞] be such that X v t ≤ v,t .Assumption 3.2.For some 0 < δ ≤ ∞ and all k ≥ 1, b > 0 and a finite collection A j , j ∈ J , of (non-empty) subsets of V such that min i =j ρ(A i , A j ) ≥ b and r := j ∈J |A j | ≤ k, we have where γ δ (b, r) is some non-negative number which depends only on δ, b and r, and | | stands for the cardinality of a finite set .
Set W = v∈V X v .In the course of the proof of Theorems 2.14-2.16and 2.19 we will need the following general result.for some a, η > 0, d ≥ 1 and all k, m ≥ 1.Then there exists a constant c which depends only on c 0 , a, u 0 and η such that, for every k ≥ 2, where for all q > 0, M q = max{ v,q : v ∈ V } and M k q = (M q ) k .When the X v are bounded and (3.2) holds true with δ = ∞ we can always take v,t = v,∞ , t > 0, and then, for any k ≥ 2, When δ < ∞ and there exist θ ≥ 0 and M > 0 such that for any v ∈ V and k ≥ 1, we have that, for any k ≥ 2, where C is some absolute constant.
Theorem 3.1 will follow from the following result, which is proved in §3.1.
PROPOSITION 3.4. For a good cocycle L and an observable $\varphi$ satisfying (2.16) we have the following assertion. Fix some $n$ and set $V = \{0, 1, \ldots, n - 1\}$ and Then condition (3.2) holds true with the above choices and with where $A_0$ is a constant which depends only on $\lambda - 3\varepsilon$ and on the constant $C$ so that $\sup |g| \le C\|g\|_{BV}$ for every function $g : X \to \mathbb{C}$ (and the dependence can be easily recovered from the proof). If, in addition, Assumption 2.13 holds then the conditions of Proposition 3.3 hold true with $u_0 = 1$, $c_0 = 2$ and $\gamma = 1/\eta$.

3.1. Multiple correlation estimates: proof of Proposition 3.4. Our goal is to show that (3.2) holds true with the desired upper bounds. We first need the following result.

LEMMA 3.5. For every pair of measurable functions $g, h$ on $Y^{\mathbb{N}}$ with $g, h \in L^\infty$ (with respect to the law of $(\xi_n)$) and all $k \in \mathbb{Z}$ and $n \in \mathbb{N}$, we have

Proof. By [11, Ch. 4], we have

Next, it is clearly enough to prove Proposition 3.4 when $\|\varphi\|_{L^\infty}$ and $\operatorname{esssup}_{\omega\in\Omega}(K(\omega)\|\varphi_\omega\|_{BV})$ do not exceed 1, for otherwise we can just divide $\varphi$ by the maximum between the two. Recall also our assumption that $K(\omega)e^{-\varepsilon|m|} \le K(\sigma^m\omega) \le K(\omega)e^{\varepsilon|m|}$ for some $\varepsilon < \lambda/3$ (recall Remark 2.11).
The first step in the proof of Proposition 3.4 is the following result.
LEMMA 3.6.(Fiberwise multiple correlation estimates) Let B 1 , B 2 , . . ., B m be non-empty intervals in the non-negative integers so that B i is to the left of B i+1 and B 1 contains 0. Let us denote by d i the gap between B i and B i+1 (namely, the distance).Let us fix some ω and let f i be a family of functions such that K(σ i ω) f i BV ≤ 1 and where A = C 2 sup d∈N 2de −(λ−ε)d and λ comes from (2.10) and (2.14) (recall Remark 2.11).
Proof.The proof will proceed by induction on m.Let us first prove the lemma in the case m = 2.We first note that for all functions g 0 , g 1 , . . ., g q , we have where g i ∞ = sup g i L ∞ , and hence where we have used (2.3), that N(ω) ≤ K(ω) and that Let us write B 1 = {0, 1, . . ., d}.Taking g k = f k for 0 ≤ k ≤ d = q and noting that K(σ s ω) g s ∞ ≤ C for some constant C which depends (C is a constant which satisfies g ∞ = sup |g| ≤ C g BV for every complex function on X) only the space X, we conclude that , where d+n) .Therefore, using also that μ ω is an equivariant family and that (since n This proves the lemma for m = 2.
where G m is some function.Now we observe that which is proved exactly as in the previous case (even though there are gaps between the blocks B j , we can set g i = 1 when i does not belong to one of the B j , and then v(g i ) = 0).Thus, as in the case m = 2, we have The induction is completed by the above inequality, taking into account that Integrating over ω yields the following corollary of Lemma 3.6.
COROLLARY 3.7.Let τ be the skew product.Let B j , 1 ≤ j ≤ m, be blocks as in Lemma 3.6.Set G j = i∈B j ϕ • τ i .Let us denote by b j the left end point of B j .Then e −λd j . (3.9) The next step of the proof is to estimate the second term inside the absolute value on the left-hand side of (3.9).To obtain appropriate estimates, we first need the following lemma.LEMMA 3.8.Let us fix some k ∈ N and set Then, for every n ∈ N and for P-a.e. ω, we have where C is such that g L ∞ ≤ C g BV for every function g on X with bounded variation (recall that such a constant C exists by our assumption on the variation v(•)).
Proof.Using (2.10), that K(σ −n ω) ≤ e εn K(ω) and that we get the following result directly from Corollary 3.7 and Lemma 3.8.COROLLARY 3.9.Let b j be the left end point of the block B j .Let us also set r j = d j /3 and r 0 = r 1 .Then there exists a constant A 1 > 0 which does not depend on ω or on the blocks so that in the notation of Corollary 3.7 and Lemma 3.8 we have where Namely, in distribution it can be written as for some measurable function f j .Since m(ϕ ω,j L ) and |ϕ ω,j | ≤ 1, we can ensure that |f j | ≤ 1.Using [25, (2.20)] and Corollary 3.9 we conclude that the following result holds.COROLLARY 3.10.Let G j , 1 ≤ j ≤ m, be as in Corollary 3.7 (defined by some blocks B j with gaps d j ).There are constants A > 1 and δ 0 ∈ (0, 1) which do not depend on the blocks so that All that is left is to notice that Corollary 3.10 is a reformulation of Proposition 3.4, using the notation of this section.

Limit theorems via the method of cumulants
and the results concerning the asymptotic variance $s^2$ follow from the general theory of (weakly) stationary processes (see [34] Let us give a reminder of the short proof. We have Using this lemma together with [29, Lemma 3.3] with $a = 2$ and that , $Y_{d,n}$), by the multidimensional version of Lévy's theorem, in order to show that $Y_n$ converges in distribution as $n \to \infty$ towards a given random variable $Z$, it is enough to show that for every $a \in \mathbb{R}^d$ we have Therefore, it is enough to show that any linear combination of $Y_{j,n}$, $j = 1, 2, \ldots, d$, converges in distribution towards the corresponding linear combination of the coordinates of $Z$. Returning to our problem, to obtain the appropriate convergence of the distribution of $(S_n(t_j))_{j=1}^d$ it is enough to show that any linear combination of $S_n(t_j)$ converges towards a centered normal random variable with an appropriate variance. More precisely, let $a_1, \ldots, a_d \in \mathbb{R}$. Then we need to show that $\sum_{j=1}^d a_j S_n(t_j)$ converges in distribution towards a centered normal random variable with variance where $t_0 = 0$ and $s^2 = \lim_{n\to\infty}$ where we set $t_0 = 0$ and $S_0 = 0$. Thus, using stationarity, we have

Now the first summand on the right-hand side above converges to On the other hand, arguing as in the proof of Theorem 3.1 (replacing each appearance of $\varphi \circ \tau^k$ by $Y_k$), we get the same kind of estimates on the cumulants of that is, there exists a constant $c_0$ which might depend on $t_j$ and $a_j$ such that for every $k$ we have Thus, by applying [49, Corollary 2.1] we get that s /$w_n$ converges towards the standard normal distribution, where $w_n$ is the standard deviation of the numerator. Note that, as we have shown, , which is positive unless either $s = 0$ or $a_1 = \cdots = a_d = 0$, which are both trivial cases. Thus, in any case we obtain the desired convergence of the linear combination $\sum_{j=1}^d a_j S_n(t_j)$ and the proof of Theorem 2.19 is complete.

4. Limit theorems via martingale approximation for φ- and ψ-mixing driving processes

4.1. Some expectation estimates using mixing coefficients. In the course of the proof of Theorem 2.22 we will need the following two relatively simple lemmas.

LEMMA 4.1. Let $\mathcal{G}, \mathcal{H}$ be two sub-σ-algebras of a given σ-algebra on some measure space. Let $g$ be a real-valued bounded $\mathcal{G}$-measurable function and $h$ be an $\mathcal{H}$-measurable real-valued integrable function. Then

Proof. By [11, Ch. 4] we have $\mathcal{H}$), which clearly implies the lemma.

LEMMA 4.2. Let $\mathcal{G}, \mathcal{H}$ be two sub-σ-algebras of a given σ-algebra on some measure space. Let $g$ be a real-valued bounded $\mathcal{G}$-measurable function and $h$ be an $\mathcal{H}$-measurable real-valued integrable function. Suppose also that $\psi = \psi(\mathcal{G}, \mathcal{H}) < 1$. Then

Proof. By [11, Ch. 4] we have Taking $h, g \ge 0$, we get that Thus, Therefore, for non-negative functions we have Now the general result follows by writing $h = h^+ - h^-$ and $g = g^+ - g^-$, where $h^\pm$ and $g^\pm$ are non-negative functions such that $h^+ + h^- = |h|$ and $g^+ + g^- = |g|$, and using that both $(g, h) \mapsto \mathbb{E}[g]\mathbb{E}[h]$ and $(g, h) \mapsto \mathbb{E}[hg]$ are bilinear in $(g, h)$.

Convergence of the iterates of the transfer operator with respect to a sub-σ -algebra.
Let $\mathcal{F}_0$ be the σ-algebra generated by the map $\pi(\omega, x) = ((\omega_j)_{j \ge 0}, x)$, namely, the one generated by $\mathcal{B}$ and the coordinates with non-negative indexes in the $\omega$ direction. Then $(\tau^{-k}\mathcal{F}_0)_{k \ge 0}$ is a decreasing sequence of σ-algebras and $\tau^{-k}\mathcal{F}_0$ is generated by $\tau^k$ and the coordinates $\omega_j$ for $j \ge k$. In particular, $\tau$ preserves $\mathcal{F}_0$.
Next, let us define a transfer operator with respect to $\mathcal{F}_0$. For each function $g \in L^1(\mu)$ there is a unique $\mathcal{F}_0$-measurable function $G$ such that

Let us define $\mathcal{K}g = G$, where we formally set $G$ to be 0 outside the image of $\tau$ (if $\tau$ is not onto). Then and therefore $\mathcal{K}$ can also be defined using the usual duality relation with respect to the above σ-algebra. That is, it is the transfer operator of $\tau$ with respect to $(\Omega \times X, \mathcal{F}_0, \mu)$. ) where $A$ is an absolute constant and $C_0$ is any constant satisfying $\|g\|_{L^\infty} \le C_0\|g\|_{BV}$ and $\|fg\|_{BV} \le C_0\|g\|_{BV}\|f\|_{BV}$ for all functions $g, f : X \to \mathbb{C}$. (ii) We have

Proof of Theorems 2. 21 ) is a reverse martingale difference with respect to the reverse filtration $\{\tau^{-n}\mathcal{F}_0\}$. Moreover, the differences $u \circ \tau^n$ are uniformly bounded (as $\chi$ and $\varphi$ are in $L^\infty$). Thus, by the Azuma-Hoeffding inequality, for every $\beta > 0$, we have Now the proof proceeds by using the Chernoff bounding method. By the Markov inequality for all $t > 0$ we have Taking $\beta = \beta_t = t/2\|u\|_{L^\infty}$ and replacing $u$ with $-u$, we get that The proof of Theorem 2.21(i) is completed now by noticing that Next, the proof of Theorem 2.21(ii) is completed by applying [45, Proposition 7] with the reverse martingale $(u \circ \tau^n)$ and using (4.3).
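The Chernoff step in the argument above can be sketched numerically. Assume, as a generic textbook form of the Azuma-Hoeffding bound and not as the paper's exact constants, that a sum $M_n$ of $n$ martingale differences bounded by $c$ satisfies $\mathbb{E}[e^{\beta M_n}] \le e^{\beta^2 n c^2/2}$. Markov's inequality then gives $P(M_n \ge a) \le \exp(-\beta a + \beta^2 n c^2/2)$, and the exponent is minimized at $\beta = a/(nc^2)$. All function and parameter names below are hypothetical.

```python
import math


def chernoff_exponent(beta, a, n, c):
    """Exponent of the bound P(M_n >= a) <= exp(-beta*a + beta^2 * n * c^2 / 2)."""
    return -beta * a + 0.5 * beta * beta * n * c * c


def optimal_beta(a, n, c):
    """Minimizer of the exponent in beta (vertex of the parabola)."""
    return a / (n * c * c)


def azuma_tail_bound(a, n, c):
    """One-sided bound exp(-a^2 / (2 n c^2)) obtained at the optimal beta."""
    return math.exp(chernoff_exponent(optimal_beta(a, n, c), a, n, c))


if __name__ == "__main__":
    # Hypothetical numbers: n = 100 differences bounded by c = 1, deviation a = 10.
    print(azuma_tail_bound(10.0, 100, 1.0))
```

Applying the same bound to $-M_n$ and adding the two estimates gives the two-sided version with an extra factor of 2.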
In order to prove Theorem 2.22, we apply [13, Theorem 3.2] with the bounded function $\varphi$ and the probability-preserving system $(\Omega \times X, \mathcal{F}_0, \mu, \tau)$, whose transfer operator is $\mathcal{K}$. Now, since we have assumed that $\mu(\varphi) = 0$, in order for the conditions of [13, Theorem 3.2] to be in force we need the estimates to hold. These three conditions are verified by Proposition 4.3 and the mixing rates specified in the formulation of Theorem 2.22, and the proof of Theorem 2.22 is complete.
Proof of Proposition 4.3.(i) Since L ∞ (μ) is the dual of L 1 (μ), and ϕ and K n ϕ are F 0 -measurable, it is enough to show that, for every g ∈ where γ n is one of the desired upper bounds.To achieve that let us first note that K n is the dual of the restriction of the Koopman operator f → f • τ n acting on F 0 -measurable functions.Thus, Now, using (2.14) and that ϕ K = esssup ω∈ (K(ω) ϕ ω BV ) < ∞, we get that Hence, using also the σ -invariance of P, where |I | ≤ Ce −λn g L 1 (μ) .Next, let us write μ σ n ω (g σ n ω ) = m(g σ n ω h σ n ω ).
By (2.10) we have for every function g, and recall that K(σ n ω) ≤ K(ω)e εn .Combining this with the previous estimates, we get that where |I | ≤ Ce −λn g L 1 (μ) and |J | ≤ C e −(λ−3ε)n/2 g L 1 (μ) and we have used that K(ω) 2 ϕ ω BV is bounded.Next, using (2.10) and that K(ω) is tempered, we have h ω = lim n→∞ L n σ −n ω 1, and therefore h ω depends only on the coordinates ω j for j ≤ 0. Thus, μ ω (ϕ ω ) = F (ω j ; j ≤ 0) for some measurable function F so that |F | ≤ ϕ L 1 (μ) .Observe also that the random variable depends only on ω j , j ≥ [n/2], since g ω (x) is a function of x and ω j , j ≥ 0 (that is, it factors through π 0 ).In the case where h ω ≥ c −1 > 0 for some constant c > 0 we have Thus, using also Lemma 4.1, we see that there is a constant C > 0 so that where we have taken into account that μ ω (ϕ ω ) dP(ω) = μ(ϕ) = 0. This, together with (4.6) and the previous estimates on I and J, proves (4.2).
To prove (4.1), we first use (4.5) in order to obtain that Taking into account that we conclude that G n (ω)μ ω (ϕ ω ) is integrable.We would now like to apply Lemma 4.2, but the problem is that G n is not bounded.To overcome that, for each M > 0 set Then, since G n (ω)μ ω (ϕ ω ) is integrable, by the dominated convergence theorem we have Now, taking n so that ψ [n/2] ≤ 1/2 and using that μ(ϕ) = 0, we get from Lemma 4.2 that Using also (4.7) and that esssup ω∈ ( ϕ ω BV K(ω) 2 ) < ∞, we conclude that and (4.1) follows (using also (4.6)).

A vector-valued almost sure invariance principle for skew products with uniformly expanding fiber maps and exponentially fast α-mixing base maps
Let us first explain why the matrix $\Sigma^2$ exists. For a fixed vector $v$ the limit exists, by considering the real-valued observable $\varphi \cdot v$.
Then the matrix $\Sigma^2$ from Theorem 2.25 is given by $(\Sigma^2)_{i,j} = \frac{1}{2}(s^2_{e_i+e_j} - s^2_{e_i} - s^2_{e_j})$. This matrix satisfies $\Sigma^2 v \cdot v = s_v^2$ and so it is not positive definite if and only if $\varphi \cdot v$ is a coboundary for some unit vector $v$. Note that this part does not require $T_\omega$ to be uniformly expanding.
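The polarization formula for the covariance matrix can be checked on a toy example. The sketch below recovers a symmetric matrix from its quadratic form $v \mapsto \Sigma^2 v \cdot v$ using the entries $\frac{1}{2}(s^2_{e_i+e_j} - s^2_{e_i} - s^2_{e_j})$; the matrix `SIGMA_EXAMPLE` and the function names are hypothetical illustrations.

```python
def quad_form(sigma, v):
    """Return (sigma v) . v for a square matrix sigma given as a list of rows."""
    d = len(v)
    return sum(sigma[i][j] * v[i] * v[j] for i in range(d) for j in range(d))


def recover_matrix(s2, d):
    """Rebuild a symmetric d x d matrix from its quadratic form s2(v) via
    entry (i, j) = (s2(e_i + e_j) - s2(e_i) - s2(e_j)) / 2."""
    def e(i):
        return [1 if k == i else 0 for k in range(d)]

    def add(a, b):
        return [x + y for x, y in zip(a, b)]

    return [
        [(s2(add(e(i), e(j))) - s2(e(i)) - s2(e(j))) / 2 for j in range(d)]
        for i in range(d)
    ]


# Hypothetical symmetric example standing in for the asymptotic covariance matrix.
SIGMA_EXAMPLE = [[2, 1], [1, 3]]

if __name__ == "__main__":
    print(recover_matrix(lambda v: quad_form(SIGMA_EXAMPLE, v), 2))
```

The reconstruction relies on symmetry of the matrix: the cross terms of the quadratic form only determine the sum of the $(i,j)$ and $(j,i)$ entries.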

Extensions, generalizations, additional results and a short discussion
In this section we will describe a few additional results which can also be obtained using the methods of the current paper. In order not to overload the paper, the section is presented in the form of a discussion rather than explicit formulations of theorems.
6.1. More general mixing base maps for continuous in ω transfer operators. Let $(\xi_n)_{n\in\mathbb{Z}}$ be a stationary process taking values in a metric space $(Y, d)$ satisfying the following approximation and mixing conditions. There are sub-σ-algebras $\mathcal{G}_{n,m}$ on the underlying probability space such that $\mathcal{G}_{n,m} \subset \mathcal{G}_{n_1,m_1}$ if $[n, m] \subset [n_1, m_1]$ and for each $r$ and $n$ there is a $\mathcal{G}_{n-r,n+r}$-measurable random variable $\xi_{n,r}$ so that the following assertions hold.
W ) = Var(W ), and k (aW ) = a k k (W ) for any a ∈ R and k ≥ 1.
PROPOSITION 3.3.[25, Corollary 3.2] Suppose that inequality (3.1) and Assumption 3.2 are in force.Assume also that γδ (m, k) := max{γ δ (m, r)/r : 1 ≤ r ≤ k} ≤ de −am η us complete the induction step.Let d be the right end point of B m−1 .Then d + d m is the left end point of B m and we can write
−∞,k is the σ -algebra generated by ξ j , j ≤ k, and F k+n,∞ is generated by ξ j , j ≥ k + n.It is clear from the definitions of the mixing coefficients thatα n ≤ φ n,R ≤ ψ n .THEOREM 2.21.(Exponential concentration and maximal inequalities) Let L be a good cocycle.Suppose the observable satisfies esssup ω∈ and Lemma 3.11 below).Now suppose that s 2 = lim n→∞ (1/n)Var μ (S n ) > 0, where S n = S n ϕ.To prove the CLT and the convergence rate (2.19), by applying [49, Corollary 2.1], taking into account Theorem 3.1, we get the CLT and the rate (2.19) for S n / √ Var(S n ).To get the same rate for S n / √ n we need the following general fact from the theory of stationary real-valued sequences, which for the sake of convenience is stated as a lemma.LEMMA 3.11.Let Y n be a centered weakly stationary sequence of square integrable random variables.Set b n [15,rem 1.1].We note that the conditions of[49, Lemma 2.3],[15, Lemma 2.3] and[15, Theorem 1.1]are certain estimates on the growth rates (in k) of the cumulants k (S n ), and the role of Theorem 3.1 is that it shows that the conditions of all of these results are in force in the setup of this paper.A functional central limit theorem via the method of cumulants: proof of Theorem 2.19.Let us first show that the sequence S n is tight.By Theorem 2.17 we have that = • L 4 , and therefore, using also stationarity and the Hölder inequality, we get that for allt 1 < t 2 ≤ r 1 < r 2 , E[(S n (r 2 ) − S n (r 1 )) 2 (S n (t 2 ) − S n (t 1 )) 2 ] ≤ S n (r 2 ) − S n (r 1 ) 2 4 S n (t 2 ) − S n (t 1 ) 2 Ch.15], S n (•) is a tight sequence in the Skorokhod space D[0, 1].Now let us show that the finite-dimensional distributions converge.Let us fix somet 1 < t 2 < • • • < t d .Set X k = ϕ • τ k .Next, let us recall the following general fact.Given a vector-valued sequence of random variables Y n = (Y 1,n , . . . 2(t j − t j −1 ), The proof of Theorems 2.21 and 2.22 is based on the following result.
PROPOSITION 4.3.Under the assumptions of Theorems 2.21 and 2.22, and when μ(ϕ) = 0, we have the following assertions.(i) We have and 2.22 based on Proposition 4.3.First, Theorem 2.21(i) follows since if we set χ = ∞ n=1 K n ϕ and u