## 1. Introduction

Turbulence interacts with, and leads to the generation of, mean flows in a wide variety of natural systems. Understanding the nature of these interactions, particularly for systems far from equilibrium, remains a key problem for the description of large-scale flows on planets and stars. The fundamental problem in studying such systems via direct numerical simulation (DNS) is the expense of the calculations, owing to the vast range of spatial and temporal scales that need to be described. One way to make progress is to study the statistics of the flow rather than detailed flow variables themselves (Marston, Qi & Tobias Reference Marston, Qi and Tobias2019). However, one should bear in mind that these systems are often inhomogeneous and strongly anisotropic, and such a statistical description should respect this, despite the simplifications that are afforded with assumptions of homogeneity and isotropy.

Rotation is a hallmark of geophysical and astrophysical fluid dynamics. Through the breaking of reflection symmetry, rotating systems generate mean flows such as zonal jets; examples of which include the bands on Jupiter, the jet stream on Earth and the differential rotation in stars (Galperin & Read Reference Galperin and Read2019). In these systems, turbulence that is driven at moderate scales interacts with rotation to produce non-trivial correlations that lead to Reynolds stresses that drive differential rotation and jets. In $\beta$-plane turbulence without convection, where the turbulence is constrained by two-dimensionality arising from strong stratification as opposed to rotation, the interaction of the jets and the turbulence serves to reinforce the jet structure even in the absence of rotation (see Farrell & Ioannou (Reference Farrell and Ioannou2019), figure 25.1). Here, the turbulence is often described as having an anti-frictional character as it moves the system away from solid body rotation; this provides a stern test for theories that seek to describe the statistical properties of the turbulence. We also note that it is common astrophysically for the source of the turbulence to be thermal convection, and that the convection itself is sensitive to the presence or absence of large-scale flows; this leads to non-trivial feedbacks between the mean flows and the turbulence.

Here, we explore a simple system that involves the interaction of convection with mean flows – the Busse annulus. The Busse annulus models rotating convection in an annulus with slanted ends leading to a topographic $\beta$ effect; this effect leads to vortex stretching and the generation of systematic large-scale flows (see e.g. Busse Reference Busse1976; Brummell & Hart Reference Brummell and Hart1993; Rotvig & Jones Reference Rotvig and Jones2006). Whilst it is known that Rayleigh–Bénard convection can produce large-scale flows in itself (e.g. Thompson Reference Thompson1970; Goluskin *et al.* Reference Goluskin, Johnston, Flierl and Spiegel2014), here the $\beta$ effect arising from the slanting endwalls provides an easily tunable parameter. We use this system to test the effectiveness of the statistical description termed CE2. The CE2 is a quasilinear approximation of direct statistical simulation (DSS) and is a cumulant expansion truncated at second order. It is the lowest-order method in a hierarchy that can be extended to reach more accurate but more complex methods. Whilst higher-order DSS models do succeed in reproducing results that CE2 does not (see Marston & Tobias Reference Marston and Tobias2022, for a recent review), it is not clear *a priori* which model is required for a given system (even for simple systems, e.g. Li *et al.* Reference Li, Marston, Saxena and Tobias2022) and therefore it is prudent to begin with CE2. The CE2 has been shown to be effective for simple systems where tightly coupled correlations control the dynamics and driving is either stochastic (e.g. Farrell & Ioannou Reference Farrell and Ioannou2007; Tobias & Marston Reference Tobias and Marston2013) or arises through the instability of a shear flow (e.g. Marston, Conover & Schneider Reference Marston, Conover and Schneider2008). However, the case of thermal convection, where buoyancy provides the driving in the vorticity equation and there is nonlinearity in both the vorticity and temperature equations, has been much less studied (though we note Fitzgerald & Farrell (Reference Fitzgerald and Farrell2018*a*,Reference Fitzgerald and Farrell*b*), studied stably stratified Boussinesq dynamics with S3T).

Thus, the Busse annulus system presents an important challenge for CE2, not least because it is known to host multiple solutions at modest Rayleigh numbers (Brummell & Hart Reference Brummell and Hart1993). Moreover, it has been shown (Tobias, Oishi & Marston Reference Tobias, Oishi and Marston2018) that the simplest quasilinear dynamical theory can perform poorly in describing this system and only works well when the definition of mean fields is generalised to large-scale modes via the so-called generalised quasilinear (GQL) approximation. How does a quasilinear statistical theory fare when faced with multiple potential solution basins? This question has been investigated in the context of zonal jet formation (Farrell & Ioannou Reference Farrell and Ioannou2003; Constantinou, Farrell & Ioannou Reference Constantinou, Farrell and Ioannou2014) and shear flows in stability stratified turbulence (Fitzgerald & Farrell Reference Fitzgerald and Farrell2018*a*). Here, we investigate the sensitivity of CE2 to initial conditions for a thermally driven case with multiple solution branches.

## 2. Model equations, formulation, parity and numerical methods

The Busse annulus is a two-dimensional, locally Cartesian model of rotating, incompressible Boussinesq fluid with viscosity $\nu$ and thermal diffusivity $\kappa$. The $x$-direction corresponds to the zonal direction, the $y$-direction corresponds to the radial direction with uniform gravitational acceleration pointing inwards, and the rotation is perpendicular to both with angular velocity ${\boldsymbol \varOmega = \varOmega {\boldsymbol e}_z}$. For additional information on the geometry, see Tobias *et al.* (Reference Tobias, Oishi and Marston2018, particularly figure 1). The system is non-dimensionalised with the width of the annulus in the $y$-direction ($d$) such that $0 \le y \le 1$, the viscous time scale $d^2/\nu$ and the temperature difference between the inner and outer walls $\Delta T$. In these units, the length of the box in the *x*-direction, $L_x = 2{\rm \pi}$.

Following Brummell & Hart (Reference Brummell and Hart1993), Rotvig & Jones (Reference Rotvig and Jones2006), the temperature $T = T_{BS} +{\hat \theta (x,y)}$ is decomposed into a basic state profile $T_{BS}$ satisfying $\nabla ^2 T_{BS} = 0$ and a perturbation ${\hat \theta }$. After some manipulation, the equation for the vertical ($z$) component of the vorticity ($\zeta$) is given by

where $\psi$ is the streamfunction, $(u,v) = (-(\partial /\partial y){\psi }, (\partial /\partial x){\psi })$ and $\zeta$ is related to $\psi$ by

Here $J(A, B) = (\partial /\partial x){A}(\partial /\partial Y){B} - (\partial /\partial y){A}(\partial /\partial x){B}$ is the Jacobian. The equation for the temperature perturbation $\theta$ is given by

The system is governed by four dimensionless parameters, the degree of vortex stretching caused by the sloping endwalls $\beta$, the friction at the endwalls $C$, the ratio of viscosity to thermal diffusivity $Pr$, and $Ra$, the ratio of thermal driving to dissipation. Here $\beta$ measures the degree of vortex stretching engendered by the sloping endwalls and $C$ measures the degree of friction, whilst the more familiar Rayleigh number and thermal Prandtl number have their usual physical interpretations. The limit $C = 0$ formally corresponds to vanishing Ekman number.

A zonal CE2 formulation (sometimes termed zCE2) is implemented using a zonal average,

to split all dynamical variables into mean and fluctuating components,

We then derive evolution equations for the first cumulants $c_{\zeta }(y) = \langle \zeta \rangle$ and $c_{\theta }(y) = \langle \theta \rangle$ and the second cumulants

for $c_{\theta \theta }$, $c_{\zeta \zeta }$, $c_{\theta \zeta }$ and $c_{\zeta \theta }$ where $\xi = x_1 - x_2$ . We solve for $c_{\psi }$, $c_{\theta }$, $c_{\psi \psi }$, $c_{\theta \psi }$, $c_{\psi \theta }$ and $c_{\theta \theta }$, though we write equations in terms of $\zeta$ cumulants. The required $\zeta$ cumulants, for example $c_{\zeta \psi } = \nabla ^2_1 c_{\psi \psi }$, are computed via differential operators $\nabla ^2_1 = \partial _\xi ^2 + \partial _{y_1}^2$ and $\nabla ^2_2 = \partial _\xi ^2 + \partial _{y_2}^2$ (as in Tobias & Marston Reference Tobias and Marston2013). The dynamical equations for the first cumulants are given by

and

whilst those for the second cumulants are given by

and

The other second cumulant terms (e.g. $c_{\zeta \theta }$) do not need their own evolution equations, as they can be calculated from symmetry considerations, $c_{\zeta \theta }(\xi, y_1, y_2) = c_{\theta \zeta }(-\xi, y_2, y_1)$.

We solve both the DNS and the cumulant system subject to impenetrable, stress-free, zero temperature perturbation boundary conditions in the $y$-dimension; both systems are periodic in $x$. This implies that $\theta$, $\zeta$ and $\psi$ all have odd parity with respect to the boundary conditions in $y$ (and are periodic in $x$). As the action of the zonal average preserves the parity, we discretise the first cumulants using a sine series in $y$. Likewise, the second cumulants may be discretised using sine series in $y_1$ and $y_2$ and a Fourier basis in $x$.

We use Dedalus (Burns *et al.* Reference Burns, Vasil, Oishi, Lecoanet and Brown2020) to solve both the direct equations (2.1)–(2.3) and the CE2 model equations, (2.7)–(2.11). The DNS use the SBDF2 time-stepper with linear terms implicit, whilst the CE2 equations use RK222 scheme with all terms evaluated explicitly. We have found fully explicit time stepping to be more stable for CE2. The second cumulants must be positive definite in order to ensure the corresponding probability density function is realisable. At higher $Ra$, we have found it necessary to remove negative eigenvalues from the second cumulant every 10 time steps. These negative eigenvalues are caused by the rank instability that we describe in § 3.4.

## 3. Results

Given that the Busse annulus system is known to exhibit strong hysteresis, it is key to examine the role of initial conditions when solving the cumulant system. Minimally, we begin each CE2 simulation using one of two extreme initial conditions: maximal ignorance, in which we draw a sample of a Gaussian random process for $\theta$ and then initialise $c_{\theta \theta }$ by computing its second cumulant. All other cumulants (both first and second) are zero. The other extreme is maximal knowledge, in which we initialise all fields using cumulants calculated from the statistics of a DNS solution. This allows the evaluation of CE2's ability to both find solutions and continue solutions already known. In addition, we begin some solutions with biased initial conditions that are designed to ensure solutions are not trapped in symmetry subspaces. In order to facilitate comparison with previous work, we adapt the same run naming scheme as Tobias *et al.* (Reference Tobias, Oishi and Marston2018). Table 1 contains a list of the parameters for each run we consider here.

### 3.1. Run A: two-jet/three-jet solutions

The first case we consider (case A) exhibits relatively simple dynamics in DNS. Started from random initial conditions, the driven convection interacts with the $\beta$ effect to drive large-scale zonal flows (jets). In this particular realisation, these take the form of one prograde and one retrograde jet as shown in the solutions for the mean zonal velocity $\langle u \rangle (y, t)$ and mean temperature $\langle \theta \rangle (y,t)$ in the Hovmöller plots of figure 1. Owing to the shift-reflect symmetry of the Busse annulus model (e.g. Brummell & Hart Reference Brummell and Hart1993), the system will produce a jet in the opposite sense ($c_u$ positive for $y > 0.5$) with equal probability. Started from maximal ignorance initial conditions, however, CE2 evolves to a three-jet solution (not shown), which is symmetric about the midplane ($y=0.5$). These results are similar to the strictly quasilinear run in Tobias *et al.* (Reference Tobias, Oishi and Marston2018), which also produces a three-jet solution, an unsurprising result given that both results are from fundamentally quasilinear methods. Interestingly, GQL solutions do, however, yield the correct number of jets (Tobias *et al.* Reference Tobias, Oishi and Marston2018). At these parameters it has been demonstrated in DNS that there is hysteresis between two-jet and three-jet solutions, but that three-jet solutions are unstable to perturbations that break shift-reflect symmetry (Brummell & Hart Reference Brummell and Hart1993). To test this, we ran a DNS with initial conditions that respect this symmetry and, owing to our highly accurate spectral methods, the solution remained in the symmetry class and the three-jet solution remained steady for the entirety of our DNS integration.

Better comparison with DNS for these parameter values is achieved when the CE2 solution is biased by initialising the first cumulant of the $x$-velocity, $c_u$ to

The first term is odd about the centre of the domain, whilst the second is even. We find that CE2 is capable of finding and sustaining a two-jet solution, with amplitude comparable with the DNS results, if the symmetry of the initial condition is such that $B \gg A$ (here $B = 10^{-3}$ and $A = 0$). This strongly suggests that the transition from three to two jets in DNS is the product of a subcritical, nonlinear transition mediated by eddy-eddy $\to$ eddy interactions excluded both from CE2 and our previous quasilinear results. Regardless, the two-jet solution remains a fixed point of the CE2 system, suggesting that the selection of a given multi-jet solution is distinct from its maintenance. This hypothesis is supported by our maximal knowledge calculation where the CE2 solution is initialised with the cumulants calculated from the saturated state of the DNS run (see figure 1). The CE2 is certainly capable of maintaining the form of these fully nonlinear solutions (even maintaining the slight asymmetry in the temperature distribution) even though the eddy-eddy nonlinearity (EENL) term is missing from the system. Interestingly, we note that the kinetic energy decreases when the solution is continued by CE2. Because the missing EENL terms pass energy in a cascade to small scales, one might expect that CE2 should have less diffusion and thus higher kinetic energy than DNS. However, we note that the fully nonlinear solutions also have slightly higher Nusselt numbers than CE2, $\mathrm {Nu}_{DNS} = 1.95$ vs. $\mathrm {Nu}_{CE2} = 1.83$. This shows that DNS is more efficient at extracting energy from the walls and thus has higher kinetic energy despite also having a cascade that leads to increased dissipation.

The degree of success of the quasilinear CE2 models can be investigated further by comparing the second cumulants; it is possible for first cumulants to agree well whilst second cumulants diverge. Figure 2 shows the evolution of the second cumulant $c_{\theta \theta }(\xi, y_1, y_2 = 0.5)$ under CE2 for the solution started from the statistics of the saturated state of a DNS calculation. The figure shows that this covariance of the DNS is fairly localised in space; we hypothesise that EENL interactions are important in maintaining the locality of these correlations. This hypothesis is validated by the fact that the correlations delocalise in space as the quasilinear CE2 calculation progresses. Eventually the second cumulant is very delocalised and has a wave-like form. The long-range teleconnections are typical of the quasilinear interaction between waves (in this case thermally driven Rossby waves) and a mean flow.

### 3.2. Multiple-jet solutions

Parameters may also be selected that yield multiple-jet solutions; case R, for which the DNS is shown in figure 3(*a*), is such a case. Here, after an initial transient that takes the form of a six-jet solution, a jet merging leads to the solution settling down into a five-jet solution. The DNS exhibits a solution for the second cumulant that is very localised in space, which is suggested by its broad power spectrum in figure 4. For this case, the maximally ignorant quasilinear CE2 model yields an incorrect seven-jet solution. Remarkably, if the CE2 calculation is biased initially to yield a three-jet solution, the solution evolves via a seven-jet solution to form a five-jet solution of the correct amplitude, obviously driven by the corresponding evolution of the second cumulants; the basin of attraction for CE2 is clearly very sensitive to the initial condition here. The power spectra of the CE2 calculations shown in figure 4 demonstrate that power is concentrated in a finite number of $k_x$ values indicating a delocalisation of the second cumulant; the second cumulant is again a superposition of wave-like solutions. The precise wavenumbers that receive energy for CE2 are sensitive to the structure of the first cumulant and therefore the initial conditions. Moreover, the ‘zigzag’ structure of the power spectra in CE2 shows that most of the power is trapped within a symmetry subspace and is not efficiently scattered into all modes. Nonetheless, we stress that the interactions are sufficient to drive the correct mean flows. As for case A, the CE2 solution also tracks the DNS solution well if a maximal knowledge initial condition is used (not shown).

### 3.3. Bursting dynamics

The Busse annulus model is capable of producing extremely complicated nonlinear spatio-temporal dynamics. Perhaps the most nonlinear behaviour exhibited by the DNS model is shown in figure 5(*a*), which show time series of the total, zonal and non-zonal kinetic energies of the flow and Hovmöller plots of the zonally averaged zonal flows for DNS. Here the solution takes the form of a relaxation oscillation; a cycle proceeds as follows. Convection is driven by strong temperature gradients and interacts with the rotation to generate zonal flows (in this case a two-jet solution). As more energy is pumped into the zonal flow, the shear acts so as to switch off the convection (acting as a barrier to transport). This in turn removes the energy source for the zonal flow, which decays on a longer time scale. Once the zonal flow is weak enough, convection sets in again and the process is repeated. This type of behaviour has been described in terms of predator–prey dynamics with the convective turbulence taking the role of the prey and the zonal flows acting as a predator (e.g. Rotvig & Jones Reference Rotvig and Jones2006; Tobias *et al.* Reference Tobias, Oishi and Marston2018).

Figure 5(*b*) shows one of the maximal knowledge solutions for CE2. Here the solution is started from the DNS solution when it has reached a peak in the zonal flow energy. The CE2 is able to continue relaxation oscillations from this state. If CE2 is started from the trough in the zonal flow energy, similar results are obtained (not shown). The sensitivity to initial conditions for CE2 is, however, demonstrated in figure 5(*c*), which shows the evolution from small amplitude maximal ignorance initial conditions. This solution clearly does not replicate the correct number of jets – though it does show some indication of bursting behaviour, as the strength of each zonal jet waxes and wanes in response to the driving.

### 3.4. Rank of second cumulants

One important difference between quasilinear and CE2 simulations is that the former explicitly provides only a single realisation of the dynamics, whilst the latter does not; it is a description of the low-order statistics of the system and hence can be thought of as an average over an ensemble of simulations of the system. Recently, Nivarti *et al.* (Reference Nivarti, Kerswell, Marston and Tobias2022) have shown that an important difference between the quasilinear and CE2 can be quantified via the ranks of the second cumulants – that is, the number of non-zero eigenvalues they possess. They have demonstrated that the statistical description CE2 is open to instabilities that can lead to higher-rank solutions for the second cumulant; the quasilinear by contrast always produces second cumulants at each zonal wavenumber of rank one for the barotropic models considered by Nivarti *et al.* (Reference Nivarti, Kerswell, Marston and Tobias2022). Moreover the different ranks can lead to different behaviour for CE2 compared with that of a single run of a quasilinear system. Finally, rank instability of CE2 can also lead to non-realisable negative eigenvalues. As discussed above, these are projected out to maintain stability of the CE2 simulations.

Although we do not investigate these instabilities in detail here, it is interesting to note the rank of the modes of the CE2 solutions. Figure 6 shows the rank of the second cumulant at different zonal wavenumbers as a function of time, for run A and various initial conditions. For a quasilinear system the rank of the solution should be one. For a maximal knowledge initial condition for run A, when CE2 gives the correct answer, the solution accesses a higher-rank solution than the quasilinear, and the quasilinear simulation finds the incorrect answer (Tobias *et al.* Reference Tobias, Oishi and Marston2018). The CE2 solutions clearly undergo a rank instability to access the correct answer. However, for a maximal ignorance initial condition both quasilinear and CE2 approximations give the incorrect answer. The CE2 solution eventually returns to a low-rank solution and reproduces the quasilinear (rather than the correct) dynamics. Finally, for the biased initial condition case (with the first cumulant correct but the second cumulant incorrect) the second cumulant remains high rank and one gets the correct behaviour for the first cumulant. We stress again here that in all cases the second cumulant is necessarily delocalised for these quasilinear theories and hence incorrect.

## 4. Discussion

In this paper we have examined the effectiveness of the statistical quasilinear theory zCE2 in reproducing the statistics of turbulent solutions of a model of the interaction of rotating convection and zonal flows. This Busse annulus model exhibits complicated spatio-temporal dynamics including the formation of large-scale zonal jets, multiple zonal jets and even predator–prey relaxation oscillations between states with strong zonal flows and strong convection.

We show that zCE2 is capable of reproducing even very complicated mean dynamics (such as the driving and decay of the mean flows and modification of the mean temperature profile) though, for this system, it may have multiple stable attractors for any given parameter set. If the system is initiated with maximal ignorance then it is possible to fall into the basin of attraction of an attractor that is not preferred by DNS. It is possible to bias the symmetry of the initial condition in order to achieve solutions that mirror the symmetry of the DNS solutions. Furthermore, one may use a maximal knowledge initial condition where the zCE2 is initiated using the statistics from the saturated state of a DNS run. In this case, zCE2 appears to be extremely successful – even reproducing the extremely nonlinear relaxation oscillations of the highly driven system. It is noteworthy that zCE2 seems to perform better than a simple quasilinear dynamical theory that considers only one realisation from an ensemble. Because zCE2 is a statistical theory, it can be thought of as describing an ensemble over initial conditions, whilst one quasilinear realisation is sensitive to the precise initial conditions. The zCE2 relies on the solution of equations for quantities averaged in the zonal direction, i.e. the mean flows and temperature and the two-point correlation functions. It is a quasilinear theory in that the evolution equation for the two-point correlation function is linear in the two-point correlation function, though there are terms that involve the product of the means and two-point correlation functions. The results presented here indicate that even the predator–prey dynamics is controlled by the interaction of the mean flows and temperature with the fluctuations, with the EENL being of secondary importance. Hence, we have shown that even for complicated dynamics a quasilinear theory may suffice for a description of the statistics – though this is certainly not always the case.

It is interesting to speculate further on the role of the missing physics encoded in the EENL. This term in the equations can be modelled by deriving evolution equations for the higher-order cumulants, and truncating the hierarchy at third order. This extension to DSS is known as CE3 – and sometimes CE2.5 when an approximate form of the CE3 equations are utilised (Marston *et al.* Reference Marston, Qi and Tobias2019). However, this is a computationally expensive procedure and solution strategies suffer severely from the ‘curse of dimensionality’. An interesting question then is what is the best strategy for modelling these higher-order interactions. Of course, if the linear operator provided by the mean flows and other fields is sufficiently non-normal (if, for example, the shear is sufficiently strong), then the precise form of this modelling term is not critical (since its role will be to act as a driving term for the non-normal modes of the linear operator). In this case, one may as well model these terms via delta correlated white noise. However, it is possible to optimise the form of the noise provided to the linear operator so as to best match the true dynamics. Similar optimisation procedures have been utilised for the form of the (coloured) noise in transition problems (Zare, Jovanović & Georgiou Reference Zare, Jovanović and Georgiou2017). One strategy that appeals is to use data-driven methods to ‘learn’ the optimal form of the term that should be added to CE2 to best reproduce the dynamics. It is to be hoped (though by no means certain) that this optimal form should be relatively insensitive to the form of the large scales since this dependence should be encoded in the interaction between the first and second cumulants of CE2.

## Acknowledgements

Computations were performed on the Bates High Performance Computing Center's Leavitt system and the NASA High-End Computing Program through the NASA Advanced Supercomputing Division at Ames Research Centre under allocation GID s1647.

## Funding

This work was supported by funding from the European Research Council (ERC) under the EU's Horizon 2020 research and innovation programme (grant agreement D5S-DLV-786780). J.S.O. was supported by NASA LWS grant number NNX16AC92G and Research Corporation Scialog Collaborative Award (TDA) ID#24231. J.B.M. and S.M.T. are supported in part by a grant from the Simons Foundation (grant no. 662962, GF).

## Declaration of interests

The authors report no conflict of interest.