Turbulence in forced stratified shear flows

Abstract Continuously forced, stratified shear flows occur in many geophysical systems, including flow over sills, through fjords and at the mouths of rivers and estuaries. These continuously forced shear flows can be unstable and drive turbulence, which can enhance the rate of mixing. In this study, we analyse three-dimensional direct numerical simulations of an idealized stratified shear flow that is continuously forced by weakly relaxing both the buoyancy and streamwise velocity towards prescribed mean profiles. We explore a range of large and small Richardson numbers, for constant Reynolds and Prandtl numbers (${Re}=4000$ and ${Pr}=1$). After a turbulent steady state develops, three regimes are observed: (i) a weakly stratified, overturning regime, (ii) a strongly stratified, scouring regime and (iii) an intermediately stratified, intermittent regime. The overturning regime exhibits partially formed overturning billows that break down into turbulence and broaden the velocity and buoyancy interfaces. Conversely, the scouring regime exhibits internal gravity waves propagating along the strongly stratified buoyancy interface, while turbulence on either side of the buoyancy interface reinforces the stratification. The intermediate regime quasi-periodically alternates between behaviours associated with the overturning and scouring regimes. For each case, we quantify an appropriate measure of the efficiency of mixing and examine its dependence on relevant parameters including appropriate definitions of the buoyancy Reynolds number, gradient Richardson number and horizontal Froude numbers. Using a framework involving sorted buoyancy coordinates as introduced by Nakamura (J. Atmos. Sci., vol. 53, 1996, pp. 1524–1537) and Winters & D'Asaro (J. Fluid Mech., vol. 317, 1996, pp. 179–193), we examine the underlying physical mechanisms leading to broadening and thinning of the buoyancy interface.


Introduction
Forced stratified shear flows are stratified shear flows that are continuously forced for some period of time by the exchange between two reservoirs (or sources). These reservoirs supply a replenishing source of momentum and buoyancy and enable a constant production of turbulence. Forced shear flows occur at the mouths of rivers and estuaries, in cross-shelf exchange flows and in channels between basins. They play a role in many important processes and systems including outflow from the Mediterranean Sea (Armi & Farmer 1988), setting properties of bottom water and underflows (Yoshida et al. 1998;Dallimore, Imberger & Ishikawa 2001;van Haren et al. 2014), the persistence or destruction of hypoxic layers (Cui et al. 2019) and the vertical and horizontal distribution of chemicals, biota and sediments in coastal and riverine regions (Wolanski & Pickard 1983;Pineda 1994;Boehm, Sanders & Winant 2002). However, these flows and the turbulence associated with them are generally unresolved in Earth system models and thus a thorough understanding of them and their effects is needed to model and parameterize them accurately.
Turbulence in stratified shear flows can exhibit a wide range of characteristics. When stratification is relatively weak, shear-driven overturns can develop at a relatively 'sharp' density interface embedded in a broader region of velocity variation. Such vortical overturns can break up or broaden interfaces and mix the two differing fluids through penetrative entrainment (Barenblatt et al. 1993;Balmforth, Llewellyn-Smith & Young 1998;Woods et al. 2010). An example of this is the commonly studied stratified shear flow mixing event of a large overturning billow that develops from a Kelvin-Helmholtz instability (KHI) (Thorpe 1973;Koop & Browand 1979;Klaassen & Peltier 1985). These events occur when the kinetic energy of the flow is able to overcome the potential energy of the (essentially two layer) stratification, thereby allowing eddies to overturn the interface and mix the two fluids.
At higher values of stratification, the flow does not have enough kinetic energy and large overturns are suppressed. Instead, Holmboe wave instabilities (HWI) and turbulent scouring are observed (Holmboe 1962;Smyth & Winters 2002;Salehipour, Caulfield & Peltier 2016a;Salehipour, Peltier & Caulfield 2018) that act to sharpen interfaces further (Fernando & Long 1988;Woods et al. 2010;Zhou et al. 2017b). This produces an anti-diffusion-like behaviour at the interface that preserves the distinct density layers over relatively long times. However, it is unclear as to whether this process leads to more or less irreversible mixing of buoyancy in comparison to the above turbulent diffusive-like overturning events (Koop & Browand 1979;Smyth & Winters 2002;Carpenter, Smyth & Lawrence 2006;Salehipour et al. 2016a).
There is a large body of literature on stratified shear flows (e.g. Peltier & Caulfield 2003;Mashayek & Peltier 2012a,b;Smyth & Moum 2012;Salehipour & Peltier 2015;Salehipour et al. 2016a). Many of these studies have focused on the development and breakdown of unforced linear instabilities, including KHI and HWI. A typical initial-value problem consists of a primary linear instability growing to a saturated finite amplitude followed by a (relatively rapid) break down into turbulence and then a (typically slower) decay back to a laminar state. However, the ocean and atmosphere can be turbulent and events like these can exist within a larger-scale forcing flow or within a flow that has retained memory of previous mixing events (Hogg & Ivey 2003). It is not clear whether linear stability or initial-value problems are relevant when considering persistent shear flows or useful in predicting shear-driven mixing between two exchanging bodies of fluid. Thus, questions still remain about what happens in a continuously forced shear flow, in particular whether these two behaviours of overturning and scouring are generic, robust and present, and what determines the appearance of either class of dynamics.
Several experiments have tried to address these (and related) questions. For example, the stratified inclined duct experiments of Meyer & Linden (2014), Lefauve et al. (2018) and Lefauve, Partridge & Linden (2019) are designed to maintain over relatively long periods a shearing counterflow of dense fluid moving below light fluid within an inclined duct connecting two reservoirs of fluid with differing densities. Depending on the tilt of the duct and the Reynolds number, they found four distinct flow states: (i) a laminar state, (ii) a state primarily susceptible to HWI, (iii) a spatio-temporally intermittent state and (iv) a broadening turbulent state. The transition between the flow states appears to be governed by switching from hydraulically controlled, low-dissipation states to higher-dissipation states. The constricted duct experiments of Hogg & Ivey (2003) saw a billowing KHI steady state and a HWI steady state with a clear transition between, predicted by an appropriately defined bulk Richardson number. Additionally, the circular lock-exchange experiments of Tanino, Moisy & Hulin (2012) observed pulsing between turbulent and laminar states that was better predicted by a Reynolds number-based criterion than a Richardson number-based (i.e. shear compared to stratification) criterion.
Here we perform a series of direct numerical simulations (DNS) of a continuously forced stratified shear flow. Each simulation is initialized with a uni-directional stratified shear flow that is unstable to Kelvin-Helmholtz or Holmboe instabilities and random perturbations are added. The flow is then forced by relaxing the buoyancy and streamwise velocity towards a background state that is set to the horizontal mean of the initial conditions. Given the chosen relaxation time scale (discussed in § 2), the flow then reaches a new quasi-equilibrium background state. Our principal aims are twofold. First, we wish to investigate whether this flow (for appropriate choices of parameters) can exhibit 'overturning' Kelvin-Helmholtz-like mixing and 'scouring' Holmboe-like mixing. Second, we wish to characterize the ensuing mixing, in particular whether it is ever possible for a relatively sharp density interface to survive while the flow is turbulent. To address these two key aims, the rest of the paper is organized as follows. We describe the set-up for the simulations performed in § 2, and we discuss qualitatively the phenomenology of the simulations in § 3. We then discuss the simulations in the context of a linear stability analysis framework in § 4 and present quantitative analysis of the simulations in § 5. Lastly, we provide our conclusions in § 6.

Equations
We perform three-dimensional DNS of a box of fluid centred at the density interface of a forced shear flow. We force the flow, the details of which are discussed below, to mimic the effects of the larger-scale shear flow outside of the box. This is intended to resemble what happens at the interface of an actual geophysical exchange flow. A schematic of the flow geometry is shown in figure 1. Such a flow is commonly referred to as a stratified shear 'layer', as there is a finite depth layer in which the shear is significantly different from zero. Since we are particularly interested in the fate of the relatively thin region in which the density varies significantly from the two far field values, we will refer to this region as a density 'interface' and the region where velocity varies significantly as a velocity 'interface', and reserve the use of the word 'layer' for the two deeper regions with approximately constant (initial) properties above and below these 'interfaces', which in general will have different and time-dependent depths.  We solve the non-dimensional incompressible Boussinesq Navier-Stokes equations, given as where u is the Eulerian velocity, p is the pressure, b is the buoyancy, Re is the Reynolds number, Ri 0 is the (initial) bulk Richardson number, Pr is the Prandtl number, F u and F b are the streamwise velocity and buoyancy forcing terms andx andẑ are the unit vectors in the streamwise and vertical directions respectively. The forcing terms are defined as where τ is the relaxing time scale, u is the streamwise velocity component and u * 0 (z) and b * 0 (z) are the z-dependent initial conditions to which the flow relaxes back, which have the (dimensional) form where U * 0 and B * 0 are the initial (dimensional) magnitudes of the streamwise velocity and buoyancy, and d * 0 and δ * 0 are the initial (half) depths of the velocity and buoyancy 910 A42-4 interfaces, respectively. This is a forced-dissipative system where forcing in the system is achieved entirely by the relaxation terms in (2.4) and (2.5).
In this context, τ can be thought of as the non-dimensional (scaled with the advection time scale d * 0 /U * 0 ) flushing or refreshing time scale associated with the larger-scale shear flow outside our computational domain. The time scale τ = 100 has been chosen such that, at steady state, the forcing is strong enough to maintain shear unstable background profiles of the streamwise velocity and buoyancy (determined so by performing stability analysis on the steady-state horizontally averaged streamwise velocity and buoyancy profiles), but weak enough that it is less than half the turbulence production term in the turbulent energy equation. Figure 7 in § 3.3 shows the relative magnitude of these terms and additional, under-resolved simulations at τ values of 50 and 200 are shown and discussed in the appendix. The Reynolds number, initial bulk Richardson number, and Prandtl numbers, as well as the (initial) interface length scale ratio R 0 are defined as where ν * is the kinematic viscosity and κ * is the molecular diffusivity of the buoyancy. We are also interested in the properties of a particular gradient Richardson number Ri g (z, t), defined in terms of the horizontally averaged velocity and buoyancy profiles and so in general a function of both z and t where · xy denotes horizontal averaging, S(z, t) is the vertical shear of the horizontally averaged streamwise velocity and N 2 (z, t) is the buoyancy frequency associated with the horizontally averaged buoyancy, i.e. (2.10) Initially, the gradient Richardson number at the midpoint of the density interface is The numerical code is the pseudo-spectral code DIABLO (Taylor 2008), used previously in similar simulations of stratified shear flow (Deusebio, Caulfield & Taylor 2015;Taylor & Zhou 2017;Zhou et al. 2017b). Horizontal derivatives are calculated pseudo-spectrally, while vertical derivatives use second-order finite differences. Time stepping is done with a mixed implicit/explicit scheme of third-order Runge-Kutta and Crank-Nicolson. The velocity and buoyancy are periodic in both the horizontal directions. Vertical velocity is zero at the top and bottom boundaries while all other components of the velocity and the buoyancy have zero gradients at the top and bottom boundaries. The domain size is L X = 30, L Z = 30 and L Y = 15 relative to the initial velocity interface half-depth, d * 0 , and 768 × 768 × 384 grid points are used for all simulations. In all cases the grid spacing is no larger than twice the Kolmogorov length scale (L κ = (ν * 3 /ε) 1/4 , where ε is the kinetic energy dissipation rate), a typical criterion for DNS (Yeung & Pope 1989;Pope 2000). The initial flow field is seeded with random noise with a k −2 spectra (although the steady-state results are not sensitive to this specific form) and amplitude of 0.001U * 0 in order to aid the transition to turbulence.
Through a sequence of exploratory simulations we can identify three distinct regimes which arise in this system: (i) an overturning and interface broadening regime 'B'; (ii) a scouring and interface thinning regime 'T'; and (iii) an intermediate, spatio-temporally intermittent regime 'I', and we thus consider in detail three simulations, representative of each of these regimes. All three simulations have Re = 4000, Pr = 1 and R 0 = 7, but with different initial and background forced bulk Richardson numbers. For simulation 'B' in the interface broadening regime, Ri 0 = 0.0125 and hence Ri g,0 = 0.0875, for simulation 'T' in the interface thinning regime Ri 0 = 0.35 and hence Ri g,0 = 2.45, while for simulation 'I' in the intermediate, spatio-temporally intermittent regime Ri 0 = 0.1 and hence Ri g,0 = 0.7. The Re value is chosen so it is sufficiently high for the full 'zoo' of secondary instabilities and subsequent turbulent break down to arise, at least for flows susceptible to KHI Salehipour et al. 2016a).
We have conducted a linear stability analysis on the initial profiles of each simulation, details of which will be discussed in § 4. This analysis reveals that the most unstable mode in simulation B is KHI, identified by the phase speed of the most unstable mode being zero, while both simulations T and I are initially most unstable to HWI, with the most unstable modes being a complex conjugate pair with non-zero phase speeds. The specific value of R 0 is chosen so that all three of these regimes can be accessed with the same R 0 value. Linear stability analysis and several test simulations reveals that at lower values of R 0 the flow is no longer unstable to HWI (or only weakly so) at the chosen Re and τ values. This will be discussed further below and is illustrated in figure 8, where the darkness of the red shading represents the growth rate associated with HWI at different R 0 and Ri b values. Simulations B and T are run until an approximate turbulent steady state is achieved, while the simulation I is run until several pulsation cycles are achieved, as a steady state does not develop. All results shown are from times after these steady states are achieved unless otherwise stated. In general, we will not be discussing the transient spin-up phase of each simulation in too much detail, as our primary focus is on the statistically steady state. The novel aspect of this study is the addition of the forcing term, which allows a statistically steady state to develop. Additionally, the forcing term is relatively unimportant during the transient phase, and a large body of literature has already explored the evolution of stratified shear layers from a prescribed initial condition (Caulfield & Peltier 2000;Smyth & Winters 2002;Peltier & Caulfield 2003;Carpenter et al. 2006;Brucker & Sarkar 2007;Mashayek & Peltier 2012a,b;Smyth & Moum 2012;Salehipour & Peltier 2015;Salehipour et al. 2016a;Kaminski & Smyth 2019).
It should be noted that, although the exact form of the forcing and the magnitude of τ do change the quantitative results of this study, the qualitative results within each regime appear to be robust for a large range of τ values. Changing the magnitude of τ generically leads to the occurrence of three distinct regimes, an overturning and interface broadening regime, a scouring and interface thinning regime and an intermediate spatio-temporally intermittent regime. However, the parameter values at which each regime occurs, the transitions between the regimes, and the magnitudes of the analysed quantities presented later shift with changes of τ , primarily due to an increase or decrease in the kinetic and potential energy provided by the forcing. Thus, our focus in this study is in comparing the characteristics of the turbulence seen in each regime, with all parameters except the initial bulk Richardson numbers held the same. We first consider qualitatively the flows observed in each of the three simulations, and then present a quantitative analysis and interpretation of the simulation data.  Figure 2 shows the horizontal and time-averages of the (a) streamwise velocity, (b) buoyancy and (c) gradient Richardson number Ri g as defined in (2.9) (but constructed using the time-averaged profiles). Dotted lines show simulations B and T initially and solid lines at their final turbulent steady state. Time averages are performed over the last 100 (non-dimensional) time units of each respective simulation. It is immediately clear from panels (a) and (b) that the initial sharp interfaces of the velocity and buoyancy in simulation B (compare grey dotted lines to red solid lines) are not maintained once a turbulent steady state is achieved and the buoyancy and velocity interfaces are much broader at the end of the simulation.We define time-dependent (and non-dimensional) velocity and density interface half-depths as

Simulations B and T
where, by construction, d(0) = 1 and δ(0) = 1/R 0 . Defining the time-dependent interface half-depth ratio as R(t) = d/δ, we plot this ratio versus time in figure 3. Here, we see that the initial transient broadening period for all three cases lasts approximately 100-200 (non-dimensional) time units. In simulation B, during this transient period the flow develops turbulent billows, similar to those seen in the intermediate or strongly turbulent initial condition simulations of Kaminski & Smyth (2019). The resulting growth of the buoyancy interface causes R(t) to decrease from its initial value R(0) = R 0 = 7 to its steady-state value of R 1. Additionally, the gradient Richardson number (figure 2c), which initially had a maximum at the midplane of the computational domain, is relatively uniform across the centre of the domain and remains well below the Miles-Howard criterion of 1/4 (vertical dashed line in figure 2c) (Howard 1961;Miles 1961). In contrast, a relatively sharp interface for both the buoyancy and velocity profiles is still maintained for simulation T during steady state. While both d and δ increase from their initial values quite rapidly as turbulent and wispy interfacial waves develop in the transient period, even in steady state the density interface remains thinner than the velocity interface, and so R remains significantly greater than one (as seen in figure 3 and discussed further below). Additionally, although slightly decreased from its initial value, a maximum in the gradient Richardson number in figure 2(c) is still maintained at the midplane of the computational domain, with minima (less than 1/4) on either side of the midplane, while the gradient Richardson number then approaches large values in the far field, as the shear is very small away from the midplane. Note, the high frequency oscillations seen in figure 3 for simulation T are interfacial waves within a continuously stratified system. To visualize the flow dynamics, we show in figure 4 vertical slices of various flow quantities at the end of simulations B and T. In figure 4(a,d) we show buoyancy, in figure 4(b,e) we show the log of kinetic energy dissipation rate ε and in figure 4(c, f ) we show the log of buoyancy variance dissipation rate χ , defined as where s ij is the rate of strain tensor associated with the full velocity field u and the buoyancy frequency is as defined in (2.9). Considering the three panels for simulation B (i.e. figure 4a-c), it is apparent that regions of high buoyancy variance dissipation generally coincide with regions of high turbulence dissipation. This co-location leads to a significant amount of irreversible mixing and broadens the density interface. Since the initial gradient Richardson number at the density interface is not particularly high, turbulent eddies in the flow overcome the effects of stratification. Additionally, throughout the steady-state portion of this simulation we do not see the classical coherent billow of KHI roll-up, but rather a complex turbulent flow that is reminiscent of the simulations in Brucker & Sarkar (2007) and Kaminski  & Smyth (2019), which are seeded with pre-existing turbulence. In the initial transient period, there is roll-up like behaviour, but is again significantly altered by the presence of turbulence.
In contrast, in the equivalent panels for simulation T (i.e. figure 4d-f ), the kinetic energy and buoyancy variance dissipation are overall less than those in simulation B, indicating that overall mixing in simulation T is much smaller than that in simulation B. The high initial gradient Richardson number at the density interface prevents turbulence from overturning the interface, instead relegating overturns to either side of the interface where stratification is relatively low and they can scour the interface. So while mixing is overall all much smaller in simulation T, the important feature here is the difference in mixing going from the midplane to the outer flanks of the interface. This leads to a sustained sharpening of the interface and a maintenance of a higher gradient Richardson number at z = 0, which in turn further inhibits the breaking down of the interface by turbulence.

Simulation I
In contrast to both simulations B and T, a statistically steady turbulent flow is not achieved in simulation I (see figure 3). Instead, spatio-temporal intermittency develops that has aspects that resemble each of the other simulations. Specifically, this simulation exhibits overturning and scouring behaviour at different stages in the flow evolution.
In figure 5(a,b), it is apparent that the horizontally averaged streamwise velocity and buoyancy cycle between phases where the interfaces sharpen and broaden. It should be noted that while the cyclic behaviour is generically present for all values of τ tested (see appendix A for more detail), the value of τ does influence the period of the cycling between the two states. Specifically, as τ is increased, the period linearly increases as well. In panel (c), 4N 2 − S 2 is shown, where S and N are the mean shear and buoyancy frequency as defined in (2.9). Therefore, positive and negative values of this quantity correspond to Ri g > 1/4 and Ri g < 1/4 respectively. Significantly, after t 100 at the midplane of the computational domain, Ri g > 1/4 is maintained (prior to t 100 an initial larger roll-up occurs that reduces Ri g to less than 1/4 briefly, before developing the spatio-temporal intermittency seen in the rest of the simulation). However, depending on whether the system is in the observed overturning-or scouring-like state, the width of this strong buoyancy interface and the values of Ri g either side of this interface vary. Specifically, considering the time period around the first thick dashed line, there is a relatively thin high Ri g region flanked by very low values of Ri g . In contrast, looking at the time period around the second dotted line, we see that 4N 2 − S 2 becomes small shortly before this time, followed by an increase in Ri g over a much broader vertical extent.   Figure 6(d-f ) shows slices taken at the time marked with a grey dotted line in figure 5, at (non-dimensional) t ≈ 1000 when the density interface is broadening. Coherent overturns of the density interface are visible and strong momentum dissipation is co-located with strong buoyancy gradients. This is qualitatively similar to figure 4(a-c) for simulation B. However, here the buoyancy interface, while broader than in the scouring-like state in figure 6(d-f ), is still noticeably thinner than in simulation B. Figure 6(a-c) shows slices taken at the time marked with the grey dashed line in figure 5, at t ≈ 500 when the density interface is thinning. Here, the buoyancy interface is thinner than in figure 6(d-f ), and significant kinetic energy dissipation occurs on either side of the buoyancy interface, similar to figure 4(d-f ) for simulation T (although the interface is not quite as thin as in simulation T).
Although this is an idealized system with Pr = 1 and a relatively modest Reynolds number, it is interesting to note that the intense braid-like structures in the dissipation field of panel (e) strongly resemble the features in acoustic backscatter images of a salt-stratified estuarian outflow in figures 2 and 3 in Geyer et al. (2010). Although they do not explicitly measure dissipation, they estimate kinetic energy dissipation from the vertical velocity variance measurements they make. In both Geyer et al. (2010) and the overturning phase simulation I here, the most intense dissipation values occur in regions of large buoyancy gradients. A similar co-location of intense kinetic energy and buoyancy variance dissipation can also be seen in the estuarian observations of Holleman, Geyer & Ralston (2016), where again, they have not directly measured either dissipation, but rather estimated it from variance measurements.

Turbulent kinetic energy
Modifying the Osborn (1980) assumption of stationarity in time and homogeneity in space to include forcing, the turbulent kinetic energy (TKE) equation reduces to a balance between four terms: shear production P, turbulent buoyancy flux B, viscous dissipation D and forcing F. These are defined as where u , w and b are the fluctuations about u xy , w xy and b xy , respectively. Figure 7 shows the horizontal and time averages of these four terms for the B and T cases. Case I is not shown as stationarity is not achieved at any point in the simulation. Time averages are performed over the last 600 (non-dimensional) time units of the respective simulations. The average TKEs over this period for the B and T cases are 0.64 ± 0.1 and 0.015 ± 0.0007, respectively, and the average changes in TKE in time over this period are 0.0 ± 0.004 and 0.0 ± 0.0003, respectively, showing that a statistical steady state is maintained. Additionally, the magnitude of the forcing terms are less than half of the respective TKE production terms in each case.

Linear stability analysis
In order to examine the initial and temporal evolution of the stability of each simulation, we have numerically calculated the linear stability of a viscous, diffusive, stratified shear flow system. We substitute the perturbation solutions 3) and linearize around the base states U and B. Considering normal modes of the form where φ is the perturbation of any flow property,φ(z) is the z-dependent eigenfunction, σ is the growth rate and k the streamwise wavenumber, we get the following system of forced, viscous Taylor-Goldstein equations where and D 2 = d 2 /dz 2 . The notable addition to this system of equations is the τ forcing terms. Boundary conditions at the top and bottom forŵ andb are free-slip and insulating, respectively. The base states U and B take the same form as the initial velocity and buoyancy profiles given in (2.6) and (2.7) covering a Ri 0 − R 0 phase space through variation in the strength and depth of the buoyancy interface. We solve the system of equations using the procedure outlined in the appendix of Smyth, Moum & Nash (2011). The most unstable mode is extracted for a range of R 0 and Ri 0 values. Figure 8 shows in colour the magnitude of the real part of the growth rate of the most unstable mode according to the linear stability analysis as a function of R 0 and log 10 (Ri 0 ). Blue shading is used when the phase speed of the most unstable mode is zero (interpreted as being of KHI type) and the red shading is used when there is a complex conjugate pair with non-zero phase speeds of most unstable modes (interpreted as being of HWI type). Stable or neutral modes are coloured white. The initial condition (Ri 0 , R 0 ) for simulation B is marked with a triangle, for simulation I is marked with a star, and simulation T is marked with a square. The attached lines show the temporal evolution of each simulation in Ri b −R phase space. At each instant the updated values of d and δ, determined using (3.1) and (3.2), are used to determine the value of R = d/δ for the simulation, used as the y-coordinate on the figure.
Analogously, we can also define a time-dependent value of the bulk Richardson number, taking into account the fact that the depth d of the velocity interface in general increases (and so the intensity of the shear drops). We generalize the definition of the initial Ri 0 in (2.8a-d) as it has no time-dependent terms (d * 0 is defined as the initial velocity interface 910 A42-13 and so we distinguish it from the d(t) used in (3.1)), so that (4.5) which we use to determine the x-coordinate on the figure. Increases in d > 1 lead inevitably to increases in Ri b from its initial value Ri 0 . The grey lines denote the initial transitory, non-steady evolution of each simulation and the black lines show the steady or fully evolved state of each simulation. All three simulations exhibit an initial transient period that involves the broadening of the velocity interface. In simulation B, this broadening affects the velocity and buoyancy interfaces, so Ri b increases significantly (due to the increase in d) and the velocity and buoyancy interfaces becoming approximately equal in depth, and so R 1. In simulation T, while there is an initial broadening of both interfaces with δ increasing more than d, R still remains substantially larger than in simulation B (R 3.5) once steady-state is achieved. Simulation I resembles simulation B in its low steady-state average R value, however, unlike simulation B, simulation I oscillates in phase space between two different states (examples of which can be seen in figure 6). Performing the same stability analysis as before, but using the instantaneous horizontally averaged velocity and buoyancy profiles at each time step output as the base state reveals that it is oscillating between a completely stable state and a state that is most unstable to a mode 2 KHI. However, caution should be taken in inferring the stability of the flow from these averaged profiles as the background state is continuously altered by the growing perturbations (Hogg & Ivey 2003). Although Pr = 1 in these simulations, to leading order the forcing counteracts any broadening effects of the much slower molecular diffusion. Thus, turbulence is the primary mechanism for interface broadening and setting of the steady-state R value here.

Quantitative analysis
5.1. Mixing efficiency: physical coordinate space Figure 9 shows as a function of time the horizontal averages of kinetic energy dissipation rate ε xy , buoyancy variance dissipation rate χ xy , and the associated mixing efficiency η(z, t), defined as for all three simulations. One advantage of this definition is that the mixing efficiency is a function of depth, unlike the mixing efficiency defined with the irreversible buoyancy flux as calculated from the available potential energy (APE) framework from Winters & D'Asaro (1996) which yields a single volumetric mixing efficiency. For ease of comparison between different simulations, z has been normalized for each simulation by a time-averaged buoyancy interface half-depth δ xyt given by (3.2), where the time-averaged value is shown in each figure. For simulation B, we see that η(z, t) is quite variable in space and time. Close to the midplane of the computational domain, η(z, t) is relatively low, where overturning and turbulence is relatively active and thus ε xy is large, but buoyancy is, relatively, more homogenized, so χ xy is somewhat suppressed. The mixing efficiency then increases to higher values toward the outer flanks of the turbulent region, where both ε xy and χ xy become quite small. In contrast, simulation T has a relatively constant mixing efficiency concentrated around the midplane of the computational domain, with peak values of the mixing efficiency and overall values of the kinetic energy dissipation and buoyancy variance dissipation much less than those in simulation B. However, a reduction or enhancement in mixing or mixing efficiency when comparing the two regimes, B and T, is not the only important point to be made here. As will be discussed in subsequent sections, how the mixing and mixing efficiencies vary with respect to the buoyancy interface in each regime is a key feature of that respective regime. Again, the high frequency oscillations seen in panels (c), ( f ) and (i) for simulation T are a result of high frequency internal waves propagating along the density interface. These are effectively interfacial waves in the continuously stratified system and they can be seen more clearly in figure 4(d-f ). Similarly to before, simulation I appears to exhibit behaviour similar to aspects of both simulation B and T, cycling between a state of high peak values towards the flanks and a state of lower, yet more uniform values that are localized around the midplane. An important hypothesis of self-organization and the tendency for a system to be attracted towards critical Ri g and η values of 1/4 and 1/6 respectively at steady state is raised in Salehipour et al. (2018) using data from unforced, initial-value simulations. To examine whether a forced system, such as the ones examined here, exhibits evidence of this behaviour we plot in figure 10 the probability density function (PDF) of the horizontally averaged gradient Richardson number and mixing efficiency as defined in (2.9) and (5.1), respectively. Data points included in the binning are from the interface region − δ t ≤ z ≤ δ t and the last 600 (non-dimensional) time units of each simulation. As to be expected, simulation B has relatively low gradient Richardson numbers and mixing efficiencies, simulation T has relatively high gradient Richardson numbers and mixing efficiencies, and simulation I is situated between the two. Although there is a peak at Ri g = 1/4 in the steady state of simulation T, the mixing efficiencies are well above  1/6, and again, although there is a peak at η = 1/6 in the spatio-temporally intermittent simulation I, the gradient Richardson's number values are well below 1/4. Additionally, the majority of the steady-state values in simulation B for both Ri g and η are well below 1/4 and 1/6, respectively. We might not expect to find evidence of a self-organized basin of attraction here, as the forcing appears to act somewhat against these effects, not least due to being somewhat too strong for the underlying assumptions of the self-organized criticality paradigm.

5.2.
Mixing efficiency: buoyancy coordinate space The above view of the mixing efficiency, however, does not reveal the full picture. Horizontally averaging over the domain and thus the interfacial internal waves arising in both simulation T and I produces a broad density interface when calculated in this fashion. In order to eliminate this misleading property, we now average the dissipation rate, buoyancy variance destruction rate, and mixing efficiency data based on the reference z * buoyancy coordinate space of Winters & D'Asaro (1996) and Nakamura (1996). Specifically, the sorted buoyancy profile b * (z * , t) is first calculated using the PDF method in Tseng & Ferziger (2001). Then * and χ * are calculated by averaging the values of and χ that fall within a given z * bin. For all three simulations, figure 11 plots (with thick lines) time and buoyancy coordinate-averaged kinetic energy dissipation rate ε * t , buoyancy variance dissipation rate χ * ,t and the associated time-averaged mixing efficiency η * ,t , where η * is defined as Time averages are performed using snapshots spaced 100 (non-dimensional) time units apart across the respective time windows shown in figure 9. Analogously to figure 9, for each simulation, the z * coordinate is scaled with an appropriate measure of the depth of the buoyancy interface. Specifically, the time-dependent half-depth δ * (t) of the buoyancy interface in buoyancy coordinate space is defined as We then scale the vertical coordinate in the plots using the time average δ * t , averaged over the same period as ε * , χ * and η * , and the specific values of this quantity for each simulation are given on the figure. Furthermore, the data from simulation I have been split to construct a time average, referred to as I B , over all relatively thick interfaces, identified as interfaces for which δ * > 2.4 (the time-averaged values of δ * for all of simulation I), and a time average, referred to as I T , over all relatively thin interfaces, identified as interfaces for which δ * < 2.4. Finally, the shaded regions indicate ±1 standard deviation of the instantaneous data about the time average. Averaged in this way, it is now apparent that the dissipation in simulation T is localized on the edges of the buoyancy interface, exterior to the location of the strongest buoyancy gradient, the time average of which is maximum at the midpoint in z * coordinates. The kinetic energy dissipation rate is low in simulation T in the region −0.5 δ * t ≤ z * ≤ 0.5 δ * t compared to the other simulations. The buoyancy variance destruction rate, χ * , is also smaller near z * = 0 in simulation T, but the reduction compared to the other simulations is smaller and hence the mixing efficiency is relatively large in this region in simulation T. So while the midplane mixing efficiency is relatively large for simulation T, the flow at the midplane is quasi-laminar and mixing and χ * are at least an order of magnitude smaller than the other simulations at the midplane. However, when moving away from z * = 0 towards the flanks of the buoyancy interface, the flow becomes more turbulent. Both ε * and χ * begin to increase for larger values of z * in simulation T, indicating more mixing is occurring, but do so in such a way that the mixing efficiency decreases. This difference between mixing at the midplane and flanks is precisely what contributes to the thinning of the interface in simulation T, as will be discussed further in § 5.3. Conversely, in simulation B, the maximum dissipation rate is co-located with the maximum buoyancy variance destruction. Additionally, both are spread across a much larger portion of the domain (of width δ * t = 10.3), leading a more broad and relatively moderate value of mixing efficiency. Subdividing simulation I into I T and I B phases, it is apparent the associated mixing efficiencies are quite similar, exhibiting a combination of behaviours characteristic of both simulation T and B. Specifically, dissipation is not peaked on the flanks of the buoyancy interface as in simulation T, but rather both dissipation and buoyancy variance destruction are peaked at the midpoint as in simulation B. On the other hand, the widths of these regions of enhanced dissipation and buoyancy variance dissipation (of width δ * t = 2.0 and 2.8 respectively), are quite narrow, similarly to the observed behaviour in simulation T.
A further depth-averaged mixing efficiency is calculated using an average across the buoyancy interface (i.e. for − δ * t < z * < δ * t ), which we denote as η * . For simulations B and T, the associated averaged values are η * = 0.11 and η * = 0.43 respectively, suggesting that, although it spans a much smaller vertical extent, simulation T achieves a much higher interfacial-averaged mixing efficiency. However, a key point to keep in mind when considering the destruction or maintenance of sharp buoyancy interfaces, is not so much whether mixing and mixing efficiency are high or low overall when comparing simulation B and T, but how mixing and mixing efficiency vary along the depth of the interface within each simulation (a point that is illustrated further in § 5.3). Evaluating mixing or mixing efficiency as a single depth-averaged value in simulations such as these can mask this important information. For simulation I, the value associated with the I T phase is η * = 0.23 and the value associated with the I B phase is η * = 0.28, falling between the values associated with the other two simulations.

Effective diffusivity
In examining the mechanisms behind the destruction or maintenance of buoyancy interfaces in layered stratified plane Couette flow simulations, Zhou et al. (2017b) derived an evolution equation for the buoyancy gradient in the same sorted buoyancy coordinate as discussed above. They found that the curvature of the effective diffusivity κ e , defined as where A s is the area of the isopycnal surface and A is the area of the isopycnal surface projected onto a flat plane, served as a simple quantity to diagnose whether 'scouring' or 'overturning' processes took place at the interface. In this context, 'scouring' is characterized by positive curvature, while 'overturning' is characterized by negative curvature. The distinction can perhaps be best understood through consideration of isopycnal surfaces. Large overturning processes distort isopycnal surfaces near the midpoint, leading to a relatively large A s /A ratio and hence relatively large κ e . This distortion then decreases away from the midpoint, creating negative curvature in κ e . Conversely, during scouring processes, relatively large isopycnal distortion due to turbulence, and associated relatively large A s /A, is displaced to either side of the interface, while at the midpoint (i.e. where z * = 0) the isopycnals are almost flat. This leads to a positive curvature in κ e and a mechanism by which diffusive spreading of the interface is counteracted. In addition to the curvature of κ e , Prandtl number effects were also found to be important, as κ e is bounded below by the molecular value of κ * , which thus limits the development of positive curvature at finite Péclet numbers. While the physical set-up of the Zhou et al. (2017b) simulations is quite different from that of this study, we can still employ this simple metric describing the curvature of κ e to understand the mechanisms behind the interface broadening and thinning behaviour observed in simulations B, T and I. In figure 12(c) we have plotted the time-averaged vertical profiles of κ e /κ * as a function of z * / δ * t for all three simulations. Time-averaging is the same as in figure 11. In simulation B, κ e is enhanced well above molecular diffusion across the entire depth of the domain with κ e exhibiting a negative curvature. Isopycnal distortion is greatest at the midpoint (i.e. at z * = 0) due to overturns, which naturally leads to interface broadening.
In simulation T, κ e κ * for |z * | < δ * t . Since κ e includes molecular diffusion, this indicates that there is very little enhancement in mixing by turbulence and the flow is quasi-laminar. For |z * | = δ * t , κ e > 2κ * . Although this is smaller than in the other simulations, the enhancement of mixing by turbulence outside of the density interface is non-trivial. As a result of the enhanced mixing on the flanks of the density interface, the κ e (z * ) profile has positive curvature, and based on the analysis in Zhou et al. (2017b), this acts to thin the interface.
In Zhou et al. (2017b), similar thinning and broadening interfaces were seen. However, the interface thinning observed in that study required a relatively large Prandtl number (Pr = 70) to limit the diffusion of the interface. In the study described herein, the forcing plays a somewhat analogous role in limiting secular spreading of the interface. In all cases the velocity and buoyancy-interface depths are greater than their initial depths. The forcing term always works to counteract the effects of interface diffusion. In simulation B, this effect is swamped by the broadening effects of the overturns. On the other hand, in simulation T, the forcing helps the scouring eddies to maintain a thinner interface.
In order to understand what sets the magnitude of κ e , it is helpful to consider the following relation from Salehipour & Peltier (2015) and Taylor & Zhou (2017) for κ e , quantifying the effective diffusivity enhancement above the molecular κ * κ e κ * = PrΓ * Re b, * + 1; where Re b, * , N 2 * and Γ * are a buoyancy Reynolds number, buoyancy frequency and turbulent flux coefficient constructed from the appropriately sorted buoyancy field b * (z * , t), and generically functions of both z * and time. Figure 12(a,b) shows time-averaged profiles of the buoyancy Reynolds number ( Re b, * t ) and flux coefficient ( Γ * t ) as a function of z * / δ * t for the three simulations. Again, time averaging is the same as in figure 11 and for κ e in panel (c). Comparing these profiles with their corresponding κ e profiles in panel (c), it is apparent that both Re b, * t and Γ * t contribute to the magnitude of κ e . For simulation B, Re b, * t is relatively large over the entire depth of the domain, but Γ * t conversely is relatively small. The overturns are not particularly efficient at buoyancy variance destruction, at least partly because there is less buoyancy variance to destroy, but are nevertheless quite vigorous and thus lead to a relatively large value of κ e . Conversely, in simulation T, Re b, * t is relatively small at the interface and increases away from it, while Γ * t is quite large at the interface and decreases away from it. Therefore, the scouring interface can be interpreted as being quite efficient at the destruction of buoyancy variance, but since overturns of the interface are suppressed, κ e falls to molecular κ * values, and mixing at the interface is weak. Away from the interface, there is less buoyancy variance to destroy, but turbulence conversely become more vigorous and κ e is larger than at the interface. Again, the I T and I B phases of simulation I lie somewhere in between simulations B and T at intermediate values and exhibit a mix of characteristics of the dynamics of the other two simulations.
Crucially, as is immediately apparent from panel (c), forced flows exhibiting overturning, broadening behaviour lead to much more diapycnal transport than flows exhibiting scouring, thinning behaviour. Although the mixing in simulation B may be thought of as being less efficient than in simulation T, since typical values of the flux coefficient Γ * are significantly smaller for simulation B than for simulation T, this effect is completely swamped by the substantially higher average value of the buoyancy Reynolds number, as shown in panel (a) (and thus perhaps warrants a comparison to be made between scouring and overturning mechanisms at the same buoyancy Reynolds number). Therefore, from (5.5a-d), the overturning-dominated mixing in simulation B has a much larger typical value of effective diffusivity than the scouring-dominated mixing in simulation T, remembering that the horizontal axes in the figure are logarithmic. It is also apparent that the intermediate simulation I has properties which lie between the two other simulations, with significantly larger effective diffusivity when the flow is in the I B phase.

Scaling of mixing efficiency
There is mounting evidence that mixing associated with stratified turbulence is a function of one or more non-dimensional numbers (see e.g. Linden 1979;Shih et al. 2005;Brucker & Sarkar 2007;Karimpour & Venayagamoorthy 2014;Holleman et al. 2016;Maffioli, Brethouwer & Lindborg 2016;Salehipour et al. 2016b;Venayagamoorthy & Koseff 2016;Zhou et al. 2017b;Garanaik & Venayagamoorthy 2019). Here, we diagnose the dependence of the time-dependent mixing efficiency η * (z * , t) in reference buoyancy coordinates (as defined in (5.2)) on several (also time-dependent) non-dimensional numbers to understand these relationships in a continuously forced system. As noted above, the actual (effective) diffusivity is proportional to the product of the turbulent flux coefficient Γ * = η * /(1 + η * ) and the buoyancy Reynolds number Re b, * , and so any dependence of η * on Re b, * is naturally of interest to explain (and parameterize) the eventual effective diffusivity.
We use the results of our simulations to determine the values of various key quantities as functions of time and the reference buoyancy coordinate z * , and then construct various non-dimensional parameters on which (an appropriately defined) mixing efficiency has been hypothesized to depend. In particular, figures 13 and 14 show time-dependent mixing efficiency and flux coefficient with respect to several commonly considered non-dimensional numbers. Specifically, we are interested in the dependence on the buoyancy Reynolds number Re b, * as defined in (5.5a-d), gradient Richardson number and horizontal Froude number in the reference z * buoyancy space coordinate, defined as where S 2 * = ( ∂u/∂z * ) 2 and U h, * = u 2 + v 2 * . Here, the gradient for the shear is with respect to the physical space vertical coordinate and u and v are the fluctuating horizontal velocities calculated as departures from the horizontal means u xy and v xy in physical space.
In each case, all values of η * over the range −δ * ≤ z * ≤ δ * are plotted against the relevant parameter and no time averaging is performed. Points outside of this z * range are not considered in this analysis because stratification in this region is very weak and our focus is on the properties of turbulence and mixing within a stratified shear layer. By restricting our analysis to the stratified interface, we are able to compare the three simulations more directly since the stratified interface occupies a much smaller fraction of the computational domain in simulations T and I than in simulation B. Data points are plotted from successive snapshots spaced 100 (non-dimensional) time units apart within the respective time window shown in figure 9. The colouring of circles indicates distance from z * = 0, where light is close to z * = 0 and dark is close to z * = ±δ * . A series of scaling lines are provided for reference on each of the panels. In figure 13(a), the dashed lines show the scaling η * ∝ Re −1/2 b, * identified in the DNS simulations of Shih et al. (2005) and field measurements of Davis &Monismith (2011) andWalter et al. (2014). The data appear to be consistent with a η * ∝ Re −1/2 b, * scaling for simulation B, which it must be remembered is relatively weakly stratified. We do not see evidence for an asymptote to a constant mixing efficiency for small Re b, * as suggested in Shih et al. (2005). However, the Reynolds number and resolution of the simulations in this study are not similar to those in Shih et al. (2005). Shih et al. (2005) proposed that the mixing efficiency transitions from a constant value to η ∝ Re −1/2 b at Re b 100. However, several recent studies have found that the value of Re b that marks the start of the η ∝ Re −1/2 b scaling is Reynolds number dependent (Lozovatsky & Fernando 2013;Maffioli et al. 2016;Taylor et al. 2019). Extrapolating the Re −1/2 b, * scaling line in figure 13(a) suggests that the start of the Re −1/2 b, * scaling occurs for Re b, * > 100 in our simulations, although the gap between cases B and I B make it difficult to be precise about this transition point. Additionally, the mixing efficiency does not appear to decay to zero at small values of Re b, * , as was seen in the analysis of Salehipour & Peltier (2015) which only included the post roll-up mixing events, but rather varies non-monotonically as Re b, * decreases. Specifically, η * decreases in the I T phase after an initial peak during the I B phase for simulation I, but it is observed to increase again in simulation T. Figure 2 in Mashayek, Caulfield & Peltier (2017) shows that with the addition of the roll-up, denoted as 'DNS: young', a similar second peak in mixing efficiency at low Re b, * occurs.
The dependence of the mixing efficiency on the gradient Richardson number, Ri g, * is shown in figure 13(b). For comparison, the dashed line shows the scaling η * ∝ Ri g, * .
As proposed by Salehipour & Peltier (2015), this scaling arises from the assumption that an appropriate (irreversible) definition of the turbulent Prandtl number is unity, implying that the turbulent diffusivities of momentum and (irreversible) buoyancy variations are equal. As discussed in Salehipour & Peltier (2015), this scaling follows by noting that the turbulent Prandtl number Pr T, * can be written as where R f , * = R f , * (z * , t) is a 'flux Richardson number', and ν T and κ T are the turbulent viscosity and diffusivity, respectively. They are defined as the ratio of the turbulent momentum (buoyancy) flux to the momentum (buoyancy) gradient. κ T here differs from the κ e used above as κ T vanishes in the limit of a laminar flow, while κ e remains non-zero due to molecular diffusion. For a quasi-steady state with negligible turbulent transport (see for example Mashayek, Caulfield & Peltier 2013;Salehipour & Peltier 2015 for further discussion) it is commonplace to assume that, particularly when only irreversible processes are considered, R f , * η * or equivalently R f , * Γ * /(1 + Γ * ). Therefore the assumption that both momentum and buoyancy are 'mixed' largely equivalently by turbulent motions and Pr T, * 1, implies that η * ∝ Ri g, * . There is increasing evidence for this scaling, particularly when a flow may be characterized as being relatively weakly stratified (see for example Zhou, Taylor & Caulfield 2017a;Portwood, de Bruyn Kops & Caulfield 2019).
On the other hand, the dotted line in figure 13 shows the essentially empirical scaling  (5.2) and shown as symbol shading), as a function of buoyancy Reynolds number, log 10 (Re b, * ) and gradient Richardson number, Ri g, * within the region −δ * ≤ z * ≤ δ * . As in figure 13, all data points are plotted from successive snapshots spaced 100 (non-dimensional) time units apart within the respective time window shown in figure 9 with no time averaging performed. Data from simulation B with Ri 0 = 0.0125 are plotted with triangles; simulation T with Ri 0 = 0.35 are plotted with squares; and from simulation I with Ri 0 = 0.1 are plotted with diamonds and circles for the I T phases and I B phases respectively. Note that the colour bar has been saturated at a value of 0.4 and for some value less than log 10 (Re b, * ) = 1 the mixing efficiency η * is always greater than 0.4. The dashed line plots a Ri g, * ∝ 1/Re b, * scaling.
Simulations B and I appear to exhibit at least approximate agreement with the unity turbulent Prandtl number scaling. Around Ri g, * ∼ 0.2-0.25 the data (largely the interface preserving data from simulation T) begins to deviate from this scaling. From here it is unclear as to whether η * asymptotes to a constant value, similar to the empirical scaling relation in (5.8), or if it decreases again in some fashion, as is predicted by the right flank of the flux-gradient curve within the paradigm presented by (Phillips 1972) for the development (and maintenance) of interfaces, and is observed in the experimental data presented by Linden (1979).
Although the flows considered here are forced specifically by a vertical shear, it is also of interest to compare the mixing efficiency with scalings proposed in terms of the horizontal Froude number, Fr h, * , which naturally is expressed in terms of properties of the turbulence and background stratification alone. Maffioli et al. (2016) proposed that for sufficiently high Reynolds number, Re, the flux coefficient, Γ , is a function of the Froude number alone. Specifically, Maffioli et al. (2016) Figure 13(c) shows the flux coefficient, Γ * , as a function of the horizontal Froude number, Fr h, * , along with lines indicating Γ * ∝ Fr −1 h, * (dotted) and Γ * ∝ Fr −2 h, * (dashed). The solid grey line shows the regime transition from the scaling given in (5.9) for Fr h, * > 0.3 to an η ∼ 0.23 scaling for Fr h, * < 0.3 seen by Maffioli et al. (2016) in their steady-state forced shear-free DNS simulations. Simulation B is consistent with a Fr −1 h, * scaling, while the scatter in the data points makes it difficult to determine whether the results follow a Fr −2 h, * scaling, although the data is not inconsistent with this scaling for Fr h, * > 0.3 as proposed by Maffioli et al. (2016).
There is no significant evidence of an asymptote to a constant as proposed by Maffioli et al. (2016) and Garanaik & Venayagamoorthy (2019) for small Froude numbers. However, our simulations differ from those reported in Maffioli et al. (2016) in several important ways and are perhaps not expected to be directly comparable. First, the flows here are never 'strongly' stratified at all depths. Second, they are specifically designed to inject energy through shear instabilities, while those in Maffioli et al. (2016) are isotropically forced. Third, the vertical Froude number here is somewhat constrained by the initial and forcing value of Ri 0 and may not reach unity as it has in Maffioli et al. (2016), where it no longer influences the dynamics. However, we argue that, in the flows considered here, the vertical momentum length scale (used in the calculation of the vertical Froude number) is not the only important length scale. Here, the width of the momentum and buoyancy interfaces, and in particular their ratio, R, is an important parameter, as demonstrated in figure 8.
In figure 13 the spatial variability within the buoyancy interface in the B and I cases are quite small, meaning they are more robust to the different definitions of vertical averaging one might choose in order to arrive at an average measure of the mixing efficiency. In contrast, the T case has a very strong dependence on depth within the buoyancy interface and thus would be quite sensitive to the specific choice in definition of vertical averaging. It should be noted for all of these scalings that the method used for averaging and the definition of each non-dimensional number could be different between the current study and those that are cited and thus could explain some of the differences. Figure 14 shows the specific value of the mixing efficiency (shown as symbol shading), as defined in (5.2), as a function of Re b, * and Ri g, * . Data from simulation B are plotted as triangles, from simulation T are plotted as squares, while phase I B and phase I T from simulation I are plotted with circles and diamonds respectively. Here again, no time or vertical averaging is performed. All data points are plotted from within the region −δ * ≤ z * ≤ δ * and from successive snapshots spaced 100 (non-dimensional) time units apart within the respective time window shown in figure 9. From this plot, it is clear that for all the simulations Re b, * and Ri g, * are correlated, and overall, a decrease in gradient Richardson number corresponds to an increase in the buoyancy Reynolds number, although there is some suggestion of an increase in Ri g, * with intermediate values of Re b, * . An additional dashed line plots a Ri g, * ∝ 1/Re b, * scaling, which naturally emerges from their respective dependence on N 2 * . Such a scaling implies that turbulence is shear generated. Its properties are determined by the large scale shear and not strongly affected by the stratification. Thus in turn, the stratification is also not strongly affected by the turbulence. Both the B and T cases fall on this line because in their steady states, where turbulence is strong (everywhere for the B case and on either side of the interface for the T case), it is shear generated and stratification is relatively weak. In contrast, the I case does not fall on the line because it is never in a steady state, instead the shear and stratification compete with each other. Thus turbulence is not solely shear generated and the stratification is not unaffected.
Non-monotonic variation of η * with Re b, * is once again apparent, with peak values of η * shown for both simulation T and the I B phase of simulation I. Also, the relatively high values of η * at very low Re b, * and high Ri g, * suggest that, even though entrainment of fluid due to large overturning processes may strongly decrease in a weakly turbulent flow with a robust interface, the flux of buoyancy, and associated irreversible mixing across the interface, is not fully suppressed. However, it is important to appreciate that such high efficiency does not imply large (total) transport, due to the associated small value of Re b, * , thus implying from (5.5a-d) relatively little enhancement in (irreversible) effective diffusivity.

Conclusions
We have demonstrated the existence of three distinct types of behaviour in an idealized continuously forced, stratified shear flow: (i) a weakly stratified, broad density interface in simulation B; (ii) a thin, strongly stratified density interface in simulation T; and (iii) spatio-temporal intermittency with certain characteristics of each of the other two behaviours, shown in simulation I. Each simulation had the same Re = 4000, Pr = 1 and initial R 0 = 7, the ratio between the initial velocity and buoyancy-interface half-depths, but different initial bulk Richardson numbers Ri 0 .
Simulation B is characterized by turbulent eddies and overturns being largely co-located with strong buoyancy gradients. This flow structure leads to a relatively rapid break up and broadening of the buoyancy interface so that the ratio of the velocity and buoyancy-interface half-depths R ≈ 1 throughout the quasi-steady-state flow evolution. In contrast, simulation T is characterized by turbulent eddies being shifted slightly above and below the (robust) buoyancy interface. This produces a scouring effect, ensuring that the buoyancy interface remains relatively thin compared to the velocity interface, and so R > 1 throughout the quasi-steady-state evolution of this simulation.
We have found that there is a useful classification in terms of a parameter space based on the specific properties of the initial and forced background velocity and buoyancy profiles, in particular the values of Ri b and R = d/δ, as defined in (4.5), (3.1) and (3.2). Certain regions of this parameter space can be associated with these two dynamically different behaviours, which may be characterized as 'overturning' and 'scouring' dynamics. Loosely, overturning dynamical behaviour is associated with relatively small values of Ri b and R, while scouring dynamical behaviour is associated with larger values of Ri b and R.
However, we have also found that there is not a sharp transition between these two behaviours. For certain choices of initial parameters, as demonstrated by simulation I, intermediate dynamical behaviour occurs, characterized by spatio-temporal intermittency and alternating overturning and scouring behaviour with points of at least qualitative similarity to simulations B and T. As shown in figure 8, simulations B and T evolved to quasi-steady states, with characteristic values of associated bulk Richardson number Ri b and interfacial depth ratio R. Interestingly, linear stability analysis of the notional horizontally averaged background profiles of velocity and buoyancy reveals that the quasi-steady state of simulation B is most unstable to Kelvin-Helmholtz-type instabilities, while the quasi-steady state of simulation T is most unstable to Holmboe wave instabilities.
These observations suggest that linear stability could be useful to determine if a quasi-steady forced turbulent system is in either an overturning or scouring state.
Comparison of the three forced simulations considered here to previously reported unforced (and hence inherently transient) simulations is instructive. For example, in the unforced KHI and HWI simulations of Salehipour et al. (2016a) and Salehipour et al. (2018), qualitatively similar overturning behaviour at low R values and scouring behaviour at higher R values is observed, consistent with previous simulations at substantially smaller Reynolds numbers considered by Carpenter et al. (2006). Indeed, all such simulations are consistent with the theoretical predictions of Smyth & Peltier (1991) and Hogg & Ivey (2003).
Furthermore, the HWI simulations of Salehipour et al. (2016a) clearly exhibited both a scouring-like behaviour and a 'long-lived twin-lobed' structure in the diapycnal diffusivity. Recalling the simple metric of Zhou et al. (2017b) utilized here, the positive curvature of the diffusivity profile reported in Salehipour et al. (2016a) suggests strong points of commonality between these (transient) simulations and the continually forced simulation T described here. This is perhaps unsurprising, since as discussed in Salehipour et al. (2018), transient flows prone to HWI apparently approach for an extended period a quasi-equilibrium state. Perhaps more interestingly, in the inherently transient simulations prone to primary KHI, Salehipour et al. (2016a) observed an analogously negative curvature in diapycnal diffusivity to simulation B reported here. This point of similarity suggests that such a property may well be generic for flows prone to relatively large-scale overturning mixing.
As discussed in Salehipour et al. (2016a), the efficiency of mixing by a KHI-dominated flow can reach substantially larger values compared to the mixing associated with a HWI-dominated flow with the same (initial) bulk Richardson number Ri 0 = 0.16, and yet different initial interface ratio R 0 , (R 0 = 1 for the KHI simulation and R 0 = √ 8 for the HWI simulation) principally due to the 'flare' associated with the large primary KHI billow overturning. They also found that Re b, * reached significantly larger values in the KHI-dominated flow, thus leading transiently to larger effective diffusivity κ e (see their figure 11). This behaviour is superficially not consistent with our forced results here, where the time-averaged mixing efficiency was much lower in the overturning simulation B than in the scouring simulation T. However, it is important to remember that the initial bulk Richardson number is much smaller for simulation B than for simulation T (0.0125 as opposed to 0.35), and also that their simulations were at higher Pr = 8. Furthermore, in our analysis here, we only consider the steady-state behaviour of the system, and in particular we ignore the collapse of the first overturning event in simulation B. Also, as is shown in figure 12, the effective diffusivity of simulation B is still markedly larger than for simulation T, principally due to Re b, * being substantially larger in the broadening simulation, which does have points of consistency with the KHI-dominated simulation reported in Salehipour et al. (2016a).
Additionally, we see higher mixing efficiency values at very low buoyancy Reynolds number here in comparison to the strongly stratified HW simulations of Salehipour et al. (2016a), suggesting that the re-supply of stratification through the forcing helps to sustain a larger mixing efficiency. While this may be less relevant in the open ocean, it could be an important factor in natural exchange flows where there is an external source of buoyancy, although it is worth noting that if Re b, * is small, the enhancement in diffusivity might not be significant.
Our demonstration of the scouring dynamics at relatively high Richardson numbers suggests that this behaviour could be a mechanism for interface formation and preservation in places where there are large buoyancy contrasts, such as the sharp thermoclines of estuaries. Additionally, the maintenance of a sharp interface at a relatively low Prandtl number in this study suggests that in regions of high levels of turbulence, such as those associated with many exchange flows, a high Prandtl number may not be necessary for either interface formation or maintenance if there is some form of external forcing and a re-supply of buoyancy. Indeed, a fluid with a larger Prandtl number should result in more of the Ri b − R phase space being favourable for interface formation and maintenance.

Appendix. Dependence on τ
In order to show the dependence of the results shown here on the chosen value of τ , additional simulations were run for τ = 50, 100 and 200 for each of the three Ri 0 values. These simulations used a lower resolution than the simulations in the text and should be classified as under-resolved DNS. However, comparing the fully resolved and under-resolved simulations with τ = 100 shows that the qualitative behaviour is similar. Note that all results shown here for τ = 100 are from the under-resolved simulations in order to make a fair comparison between the different τ value results. Specifically, the same physical dimensions, boundary conditions, and form of forcing were used, but the resolution was reduced to 256 × 256 × 128 grid points and the value of τ was set to 50, 100 or 200. Figure 15 shows the trajectories of the simulations in R, Ri b phase space as in figure 8 in the main text, but now for τ = 50, 100 and 200, differentiated by dashed, solid or dotted lines, respectively. Trajectories are calculated in the same way using (3.1) and (3.2) to construct a time-dependent R and bulk Richardson number and the initial conditions are marked with a triangle for Ri 0 = 0.0125, a star for Ri 0 = 0.1 and a square for Ri 0 = 0.35. Linear stability analysis was performed for the three different values of τ . While the magnitude of the growth rates of the fastest growing modes do vary with τ , we found that the boundaries between the KHI, HWI and stable regions to be only very weakly dependent on τ within this range. Thus for simplicity and clarity, we choose to plot only the boundaries between the primary instability regimes for τ = 100 here (which correspond closely to the boundaries for τ = 50 and 200). The blue contour line denotes the boundary of the region where the most unstable mode is a KHI, the red contour line denotes the boundary of the region where the most unstable mode is a HWI, and all regions outside of both the red and blue contour lines are stable.
In general, we found that the simulations do not strongly depend on the value of τ unless the flow changes regimes (e.g. from broadening to thinning). For Ri 0 = 0.0125, all of the results are largely insensitive to the range of τ values tested and behave largely the same. Similarly, for Ri 0 = 0.35, reducing the forcing time scale to 50 results in very similar behaviour to the τ = 100 simulations. However, an increase in the forcing time scale to 200 causes a regime change, from one of interface thinning to one of interface broadening. For the Ri 0 = 0.1, a regime change does not occur, but the period and amplitude of the spatio-temporally intermittent pulsations was affected by changes in τ . There appears to be a roughly linear relationship between the τ and the period of the pulsations, though this conclusion should be taken with caution as it is made with only three data points. Concerning differences in the transient behaviour dependence on τ , we found that for all three values of Ri 0 the transient period was roughly the same for the different values of τ so long as a regime change was not triggered (as it was with Ri 0 = 0.35, τ = 200). In order to examine the dependence of mixing efficiency on the choice of τ in figure 16 we plot the horizontal and time-averaged mixing efficiency η xyt for τ = 50, 100 or 200 for each of the three Ri 0 values as a function of time-averaged buoyancy-interface normalized depth z/ δ xyt (the values of which appear in corresponding colours in each panel). Here the time average has been performed over the last 100 (non-dimensional) time units of each simulation.Similar to what was seen in figure 15 for the trajectories, the general behaviour and shape of the mixing efficiency across the interface for the Ri 0 = 0.0125 simulations (figure 16a) is largely insensitive to the τ values tested here. However, the width of the interface does increase slightly with τ . The same can be said for the behaviour and shape of the mixing efficiency across the interface for the Ri 0 = 0.35 (figure 16c), τ = 50 and 100 simulations, but when a regime change is triggered in the τ = 200 simulation the interface begins to broaden and behave more like the broadening 910 A42-29 regime seen in the Ri 0 = 0.0125 simulations. For the Ri 0 = 0.1 simulations (figure 16b), again, the τ = 50 and 100 simulations are roughly identical in their mixing efficiencies, but as τ is relaxed to 200, the amplitude of the spatio-temporally intermittent pulsations increases and the broadening phase of those pulsations increases in vertical extent. We speculate that further relaxation of τ in this case would cause a regime change, either to one of interface broadening or to a laminar state.