Shear-induced breaking of internal gravity waves

Motivated by observations of turbulence in the strongly stratified ocean thermocline, we use direct numerical simulations to investigate the interaction of a sinusoidal shear flow and a large-amplitude internal gravity wave. Despite strong nonlinearities in the flow and a lack of scale separation, we find that linear ray tracing theory is qualitatively useful in describing the early development of the flow as the wave is refracted by the shear. Consistent with the linear theory, the energy of the wave accumulates in regions of negative mean shear where we observe evidence of convective and shear instabilities. Streamwise-aligned convective rolls emerge the fastest, but their contribution to irreversible mixing is dwarfed by shear-driven billow structures that develop later. Although the wave strongly distorts the buoyancy field on which these billows develop, the mixing efficiency of the subsequent turbulence is similar to that arising from Kelvin-Helmholtz instability in a stratified shear layer. We run simulations at Reynolds numbers of 5000 and 8000, and vary the initial amplitude of the internal gravity wave. For high values of initial wave amplitude, the results are qualitatively independent of $Re$. Smaller initial wave amplitudes delay the onset of the instabilities, and allow for significant laminar diffusion of the internal wave, leading to reduced turbulent activity. We discuss the complex interaction between the mean flow, internal gravity wave and turbulence, and its implications for internal wave-driven mixing in the ocean.

becomes highly nonlinear and the form of the energy spectrum changes to the power law scaling ( ) ∼ 2 −3 , where 2 ≔ (− / 0 ) / is the squared buoyancy frequency and is the vertical wavenumber (as observed e.g. by Gargett et al. 1981). Although the energy spectra are consistent across measurements (see also Gregg et al. 1993), they sample flow fields that are highly intermittent, as highlighted for example by Baker & Gibson (1987). Away from boundaries, such intermittency suggests that the turbulence may be sustained by a collection of localised, transient 'wave breaking' events that transfer energy downscale from the internal wave field.
Further evidence of turbulence arising from wave breaking processes can be found in the thermocline observations of Alford & Pinkel (2000), henceforth denoted AP. Intermittent metre-scale overturns, where the vertical profile of density becomes statically unstable, are used to indicate the presence of turbulence. In the observations, these overturns favourably sample regions with high 'vertical strain'. Strain in this context refers to local changes in ( 2 ( )) −1 due to vertical convergence or divergence of the flow, and regions with low local stratification relative to the mean are associated with high strain. Significant fluctuations in local stratification (and therefore strain) are suggestive of large amplitude internal waves. There are however a range of possible mechanisms by which the waves can overturn and break, and it is unclear how different types of wave breaking may affect the mean rates of diapycnal mixing. Larger scale vertical shear in the observations of AP is often colocated with the internal wave field, and this shear is likely to play an important role in the breaking process.
For example, figure 11 of AP highlights three 'overturning events' with seemingly different characteristics in terms of the roles of internal waves and shear. One of the overturns is associated with persistently low values of the gradient Richardson number = 2 /| / | 2 (measured at 6.4 m resolution), suggesting that shear instabilities are primarily triggering the turbulence. Overturning events are also highlighted where large amplitude internal waves strongly distort the density field. These 'high strain' overturns are observed where is reduced, but still large enough for instability of the large scale shear to be unlikely. The vertical extent of the overturns in AP is typically comparable to the scale of the strain features associated with internal gravity waves. This suggests that the overturns may be attributed to the breakdown of large amplitude internal waves.
The stability of finite amplitude internal gravity waves was first studied by Mied (1976) and Drazin (1977) using linear stability analysis in a 2-D plane. Klostermeyer (1991) later extended this work to consider three-dimensional perturbations. Finite amplitude internal gravity waves were found to be generally unstable to linear perturbations, although the nature of the instability depended on the wave amplitude and propagation angle. Lombard & Riley (1996) and Sonmor & Klaassen (1997) expanded upon this work with more comprehensive linear stability studies. They found that as the propagation angle of the wave increases, the fastest growing perturbations become three-dimensional and resonant processes become less significant. This is important in the context of the above thermocline observations, where AP estimated a propagation angle of ≈ 85°for the waves associated with high strain. Although the condition of wave steepness > 1 is commonly used to determine whether a wave breaks through convective instability (e.g. Thorpe 2018), the linear stability analysis suggested that there is no qualitative change in the breakdown of an internal wave across this threshold.
To our knowledge, relatively few studies have investigated the fully nonlinear breakdown of internal gravity waves through direct numerical simulation (DNS). Bouruet-Aubertot et al. (2001) performed two-and three-dimensional DNS (with a grid size of 256 3 ) of a plane wave propagating at = 45°. Consistent with the earlier linear stability analysis, the primary instability of the wave occurred due to resonance. Fritts et al. (2009b,a) later used high resolution DNS (with a grid size of 2400 × 1600 × 800) to consider the breakdown of a large amplitude internal wave at = 72°. They found that the breakdown was inherently three-dimensional, and that = 1 did not act as a significant threshold for the nature of the breakdown, consistent with the linear analysis discussed above.
Wave breaking processes can however be significantly impacted by the presence of a background shear flow. This was first highlighted by Bretherton (1966) and Booker & Bretherton (1967), who revealed the possible emergence of critical levels, where the horizontal phase speed of the waves matches the velocity of the shear flow. Vertical propagation of the waves is halted at these critical levels, causing the waves to break as their local amplitudes increase. This phenomenon was subsequently confirmed by the experiments of Koop & McGee (1986). Winters & D'Asaro (1994) performed three-dimensional hyper-diffusive simulations of internal wave packets approaching a critical level in a shear flow. These simulations were run on a very small grid of size 32 2 × 200. As waves approached the critical level, convective rolls formed in the spanwise plane, and these rolls were in turn strongly affected by the enhanced shear of the refracted wave. These results were consistent with the linear stability analysis of Winters & Riley (1992), who modelled a critically-refracted wave as a statically unstable parallel shear flow. Higher resolution studies (with grids up to 3456 × 864 × 1728) of sheared internal waves were performed by Fritts et al. (2013) andFritts &Wang (2013), although their approach was rather different. They considered the effect of 'finescale' shear on a single, large-scale internal gravity wave of steepness = 0.5. The superposition of small-scale shear and the internal wave produced an initial condition locally susceptible to shear instabilities. Fritts et al. (2013) also considered the case where the shear is not aligned with the internal wave, but found that wave-shear interactions in such cases are weak and do not lead to a breakdown of the wave.
We shall consider a similar problem to that of Fritts et al. (2013) in this study, using DNS to investigate the flow arising from a superposition of a plane internal gravity wave and a sinusoidal shear flow in a triply periodic domain. Motivated by the observations of AP, we prescribe the shear flow to vary on a larger vertical scale than the wavelength of the internal wave. We are primarily interested in understanding the key mechanisms involved in the interaction of the wave and the shear, as well as the properties of the turbulence generated from the breakdown of the wave, in particular the associated irreversible mixing and wave-mean flow interaction. In this idealised study, we do not specify the source of the internal gravity wave, but simply choose appropriate parameters to remain consistent with the observations. We acknowledge that for many oceanographic applications, it is useful to quantify mixing associated with specific generation mechanisms, such as oceanic lee waves (Legg 2021).
The remainder of the manuscript is organised as follows. §2 describes the setup of the numerical simulations, and also presents the results of some elementary linear ray tracing calculations to provide a link between our nonlinear flow and linear predictions of critical levels from wave-mean flow analysis. §3 presents the results of our DNS, focusing on the nature of the wave breaking, the mixing achieved by turbulence, and the effect of the breaking wave on the mean flow. Our findings are summarised in §4, and their implications are then discussed in the context of internal wave driven mixing in the ocean.

Numerical simulations
2.1. Nonlinear 3D simulations: domain and initial conditions We use D (Taylor 2008) to perform direct numerical simulations (DNS) of the Navier-Stokes equations subject to the Boussinesq approximation and an imposed, constant mean stratification. The numerical solver implements parallelised pseudospectral methods for spatial derivatives, and time evolution is achieved using a third-order Runge-Kutta scheme. Dealiasing by a 2/3 rule is applied to the calculation of the nonlinear terms, and periodic boundary conditions are used in all directions. The governing equations read ∇ · = 0, (2.1) Here is the dimensionless buoyancy perturbation to a uniform background stratification. The total dimensionless buoyancy is therefore given by which is related to the full, dimensional density profile by where 0 is a typical scale for the mean density and ∆ is a typical scale for the density fluctuations. The dimensionless parameters in (2.1)-(2.3) are where 0 is the buoyancy frequency of the uniform background stratification. 0 and 0 are typical velocity and length scales associated with a background shear flow. In all of our simulations, the bulk Richardson number 0 is set equal to one so that the inertial time scale 0 / 0 is equal to the buoyancy time scale 0 −1 . The Prandtl number is also set to one in every simulation to enable, subject to the constraint of the computational resources available to us, adequate resolution of small-scale dynamics at (what we believe to be) sufficiently high Reynolds number .
We note that the Prandtl number appropriate for seawater at 20°C is = 7, and for flows stratified by salinity, (or more precisely the Schmidt number) takes values of (1000). Previous studies have highlighted significant -dependence of the mixing properties (Smyth et al. 2001), interface evolution (Xu et al. 2019, and secondary instabilities (Salehipour et al. 2015) in simulations of stratified flows. Although we cannot capture these effects at = 1, the flow we consider requires high values of , making DNS at high currently infeasible.
As discussed in the introduction, we are inspired and motivated by the observations of Alford & Pinkel (2000, AP) of wave breaking in the thermocline, and consider the flow developing from the superposition of a plane internal gravity wave and a sinusoidal shear flow. AP estimated the vertical wavenumber of large amplitude internal waves associated with overturning events to be approximately ≈ 2 /(12 m). By inspecting vertical profiles of the effective strain rate / and accounting for Doppler shifts by horizontal currents, they also estimated a typical horizontal wavenumber of the waves as ≈ 2 /(180 m). These estimates coincide with measurements of vertical shear that vary on a length scale of (30 m). It is not possible to resolve centimetre-scale dissipation adequately using DNS while also resolving the dynamics associated with lengths (100 m). We therefore perform a 'miniaturised' simulation of the shear and internal wave interaction by reducing the Reynolds number to a computationally tractable value.
In a periodic domain of dimensionless height 2 , we set ( ) = sin as the base shear flow. The minimum gradient Richardson number of this flow = min( ) is equal to the bulk Richardson number 0 = 1, with taking this value at the edge of the domain ( = 0, 2 ) as well as at the mid-height = . This ensures that the background shear profile is linearly stable, as shown by Balmforth & Young (2002). We superimpose this shear flow and a plane internal gravity wave with dimensionless wave vector = ( , , ) = (1/4, 0, 3). Compared to the observational estimates of AP the wave has a similar propagation angle, and the ratio between the vertical wavenumber of the shear ( = 1) and the vertical wavenumber of the wave ( = 3) also provides a good match to the observations. Preliminary simulations showed that waves oriented perpendicular to the shear flow (with = 0, ≠ 0) produce insignificant interactions even at large amplitude, consistent with the findings of Fritts et al. (2013). We therefore focus only on the case where the planes of the wave and shear are aligned.
We perform simulations at Reynolds numbers of 5000 and 8000. The dimensionless domain size is chosen to fit one horizontal wavelength of the internal wave and one wavelength of the shear. Preliminary runs showed that the scale of spanwise motion that develops is small, so we choose a narrow domain of size 8 × /2 × 2 . Setting the kinematic viscosity to = 1 × 10 −6 m 2 s −1 , typical of water, and choosing a typical buoyancy frequency of 0 = 5 × 10 −3 s −1 , we can deduce typical velocity and length scales from our choices of and 0 . For the highest value of = 8000 this gives 0 = 1.26 m and 0 = 6.3 mm s −1 , and hence an effective domain size of approximately 32 m × 2 m × 8 m.
In the dimensionless Boussinesq system (2.1)-(2.3), internal gravity waves in the -plane are given by the real parts of the polarisation relations where is an arbitrary constant phase and > 0 is the wave steepness, representing a dimensionless amplitude that satisfies = 1 when buoyancy contours first become vertical somewhere in the domain. For 0 = 1, the dimensionless wave frequency satisfies the dispersion relation To construct the initial condition for our simulations, we take the positive root of (2.8), set = 1/4 and = 3, and (without loss of generality) choose = 0. Superposed with the shear flow, this gives the initial condition The values of wave steepness used in the simulations are outlined with all other relevant parameters in table 1. The initial conditions for the buoyancy and velocity fields are displayed in figure 1 for two values of used in the simulations. Small-amplitude, three-dimensional noise is added to the velocity field to allow the development of spanwise motion from the two-dimensional initial condition of (2.9) and (2.10). All simulations begin on a uniformly-spaced grid at the 'initial resolution' specified in table 1. This resolution corresponds to a grid spacing of ∆ = /256 ≈ 1.2 × 10 −2 . As the simulations develop, this spacing is compared to the minimum Kolmogorov length scale calculated from the horizontally averaged turbulent dissipation rate (2.12) Here an overbar denotes a horizontal average, and a prime denotes the deviation from that horizontal average. Once becomes smaller than the initial ∆ , the flow is upscaled to a higher resolution grid with a grid spacing of ∆ = /512 ≈ 6.1 × 10 −3 . The upscaling is achieved through performing an inverse fast Fourier transform onto the higher resolution grid to preserve the exact spectral form of the flow fields. At late times in the simulations, once again rises above the initial grid resolution as the turbulence decays. Once this happens, the extra Fourier modes associated with the higher resolution are truncated and we return to simulating the flow on the initial grid. After both upscaling or downscaling, time series of the turbulence statistics remain consistent and exhibit no sudden jumps or deviations.

Qualitative insight from linear ray theory: critical levels
In the absence of any mean flow, the internal gravity wave (2.7) propagates (in terms of its energy) at the group velocity where we have taken the positive root of the dispersion relation (2.8) to match the initial condition (2.9)-(2.11). For , > 0 the wave therefore propagates down and to the right. In a constant mean flow , the frequency of the internal gravity wave appears to change as the wave is Doppler shifted. The frequency seen by a stationary observer, which we shall refer to as the extrinsic frequency, is given by where is the frequency arising from the dispersion relation (2.8), which we shall refer to as the intrinsic frequency. This intrinsic frequency may equivalently be defined as the frequency observed when travelling with the mean flow. The terminology regarding Doppler shifts can often be unclear from the literature, with the monographs of Sutherland (2010) and Bühler (2014) disagreeing on the extrinsic/intrinsic distinction. In defining the extrinsic frequency as that seen by a stationary observer, we follow the notation and terminology of Bühler (2014).
We consider the propagation of an internal gravity wave through the one-dimensional mean shear flow ( ) as originally considered by Booker & Bretherton (1967). Assuming that this shear flow varies 'slowly' in , the extrinsic frequency defined in (2.14) becomes (2.15) The wave will then propagate along a 'ray' in the direction of the extrinsic group velocity where is the intrinsic group velocity detailed in (2.13). Since the mean flow is independent of time, the extrinsic frequency will be conserved along the ray, that is / = 0. The wave vector = ( , ) must therefore vary along the ray such that The horizontal wavenumber = 0 is conserved along the ray, whereas the vertical wavenumber will change according to the mean shear.
In the simple case of a constant mean shear / = 0 , the vertical wavenumber satisfies / = − 0 . For positive , the vertical wavenumber therefore decreases in the presence of positive shear, and increases in the presence of negative shear. As increases, with kept constant, the intrinsic frequency decreases and the group velocity vector becomes closer to horizontal (as can be inferred from (2.13) for large ). Conservation of ≡ 0 combined with the form of (2.15) can predict the existence of a critical level where the intrinsic frequency drops to zero and becomes infinite. Setting = 0 in (2.15) implicitly defines the height of a critical level as As waves propagate towards a critical level, they typically grow in amplitude until they 'break' through instabilities. These ray tracing calculations are commonly used to investigate the propagation of a localised wavepacket through a large-scale (i.e. slowly varying compared to characteristic length scales of the wavepacket) mean flow, where they can be formally derived using a classical WKBJ asymptotic approximation argument. Our setup of a relatively large amplitude plane wave superposed on a shear flow throughout the entirety of our computational domain is quite different, and in particular, the required scale separation underlying the validity of the derivation of the ray-tracing equations does not occur. Nevertheless, as we demonstrate below, solutions to these equations still provide valuable qualitative insight into the behaviour of the full nonlinear (and relatively rapidly spatially varying) flow. We attempt to model this system by considering the paths of wavepackets (traced using these linear ray equations) with the same properties as the plane wave, from different initial positions. All wavepackets have the initial wave vector (1/4, 3), and hence also have the same initial intrinsic frequency. However the extrinsic frequencies, that are conserved along each ray, depend on the initial height 0 . Figure 2 displays the results of numerically solving (2.16) for the mean flow = sin . The vertical propagation of 17 wavepackets, equally spaced out at time 0, is shown in figure  2a. The majority of the rays end up in the centre of the domain where the background shear is negative, and their vertical propagation decreases. This is consistent with our earlier discussion of wave propagation through a uniform shear. Since each initial wavepacket height has a different extrinsic frequency 0 , (2.18) can predict critical levels at multiple heights. For the flow considered, (2.18) gives the predicted critical levels through .
(2.19) Figure 2b plots the critical levels (if they exist) associated with each of the rays in figure 2a.
The above equation predicts critical levels for approximately 75% of possible initial heights 0 . The upper and lower bounds on critical levels are also shown in figure 2b. Before moving on to analyse the results of the direct numerical simulations, we must emphasise that we do not expect the above linear analysis to describe the development of the flow quantitatively. We instead believe that the analysis illustrates qualitatively some key phenomena that occur in the flow and provides some physical insight into its behaviour. In particular, we expect energy to build up in the region of negative shear due to wave refraction and the appearance of critical levels. A subsequent breakdown to turbulence is then likely through small-scale instabilities and nonlinearities, although this may be affected by diffusion if the instabilities develop on a sufficiently slow time scale.

Results
3.1. Flow phenomenology and wave breakdown We now describe the results of the (inherently nonlinear) 3-D direct numerical simulations outlined in §2.1. We begin by outlining key features of the flow arising from the initial condition with wave steepness = 1, and later compare these results to those with less energetic initial conditions. Figure 3 presents vertical plane snapshots of the total buoyancy field = + at various times of simulation R8s1 up to = 32. Figure 4 shows the vorticity field associated with the same vertical planes, with the streamwise vorticity = − plotted in the -planes and the spanwise vorticity = − in the -planes. By time = 8, shown in panel ( ), the tilted structure of the internal gravity wave has been distorted by the shear flow. As predicted by the ray tracing calculations in §2.2, vertical length scales associated with the wave decrease where the mean shear is negative, at midheights in the domain. The effect of this wave refraction on the buoyancy field can be seen in figure 3d. In the centre of the domain, regions with statically unstable buoyancy profiles emerge, flanked by 'sheets' of strong stratification where buoyancy contours are pushed close together. This is consistent with the predictions of figure 12 for the local wave steepness to increase near = , and points to a local buildup of available potential energy. In contrast, the buoyancy contours closer to the top and bottom of the domain flatten and relax towards the mean uniform stratification.
Panel ( ) is the first to highlight three-dimensional motion in the flow at time = 16. Coherent normal mode-like disturbances emerge in the streamwise vorticity of figure 4e with a spanwise wavenumber of ≈ 20. These vorticity structures are generated in the regions where the buoyancy field is statically unstable, which suggests that they are generated through a convective instability. Indeed, the mushroom-like plumes in figure 3e further suggest that the structures can be classified as convective rolls aligned on the streamwise axis. Preliminary simulations at lower resolution showed that the wavenumber associated with the rolls is independent of the width of the domain in the direction. We are therefore confident that the narrow domain still captures sufficient three-dimensionality in the flow, particularly since the rolls subsequently break down into smaller scale turbulence as they are advected by the flow.
At the same time as the appearance of the convective rolls, spanwise vorticity intensifies locally in the -plane. The dark green regions in figure 4f highlight strong negative vertical shears that emerge in the centre of the domain. In a canonical stratified shear layer, the stability of such a region would be determined by the gradient Richardson number, but in this case such a number is difficult to quantify. Firstly the shear layer depth , for which an estimate is shown on figure 4f, varies in both space and time. Secondly, the maximum shear is offset compared to the peak in stratification. In fact the shear layer spans regions where the buoyancy field transitions between static instability and strong stratification. The strong local shears nevertheless present a potential route for further instabilities to develop. By time = 24, shown in panels ( ) and (ℎ), the small-scale convective disturbances have interacted with the strong shears in the centre of the domain, generating a turbulent flow characterised by relatively intense small-scale vortices. Comparing the vorticity field in figure 4h with the buoyancy field in figure 3h, we find that the turbulence emerges in a region of highly variable local stratification. This can have a significant impact on local irreversible mixing of the buoyancy field, as we investigate further in Howland et al. (2021). The final snapshots presented in figures 3 and 4 highlight a striking organisation of the turbulence into large structures. Undulations in the isopycnals in figure 3j are closely reflected by intense patches in the vorticity field of figure 4j. These patches are somewhat reminiscent of the 'billows' that arise from the development of Kelvin-Helmholtz instability (KHI). The emergence and evolution of these flow structures can be seen in supplementary movies 1 and 2. Although it is plausible that these billows are essentially finite-amplitude manifestations of a linear shear instability, we must add a number of caveats to this interpretation. As mentioned above, the shear layer that develops is not steady and its depth and velocity jump both vary in space and time. Further work is needed to understand better the nature of instabilities in temporally-varying stratified flows. Kaminski et al. (2017) and Kaminski & Smyth (2019) have also shown that finite amplitude perturbations and pre-existing turbulence can significantly impact the development of shear-driven billows in a stratified shear layer. The disturbances introduced by the convective rolls therefore make it difficult to estimate the size of the billows from the initial wave setup. An alternative hypothesis is that small-scale vortices, formed through shearing of the convective disturbances, undergo a form of inverse cascade in the presence of the mean shear. By the time = 32 of the final snapshots, the turbulent dissipation rate ′ has already peaked and the subsequent flow is that of a turbulent decay. Figure 5 highlights the change in various horizontally-averaged quantities between the initial condition and the flow state at the late time = 80. As the turbulence decays, the buoyancy contours flatten in the middle of the domain and leave alternating regions of relatively weak and strong stratification. This variation is clear in the mean vertical profiles of 2 shown in figure 5c, with three strong peaks in the middle of the domain associated with a 30-50% increase in the local buoyancy gradient. The mean shear shows similar vertical variation in figure 5. At mid-heights in the domain, local extrema in 2 appear offset from local extrema in 2 , akin to the form of an internal wave. Despite this offset, regions with significant shear exhibit a gradient Richardson number of ≈ 1, similar to the initial profile. The most intense mean shears lead to a minimum Richardson number of ≈ 1/2, significantly above the value of 1/4 that ensures linear stability. The simulations are continued up until ≈ 150, although the remaining dynamics after = 80 in case R8s1 could primarily be characterized as relaminarization, with the smaller-scale variations seen in figure 5 being smeared out by diffusion.

Energetics
With a basic understanding of how the flow develops in simulation R8s1, we now investigate how the Reynolds number and initial wave steepness modify the dynamics. We begin by further investigating the emergence of three-dimensional motion associated with the convective rolls in figures 3e and 4e. Time series for each component of the kinetic energy and the potential energy P = 0 2 /2 are plotted in figure 6, where · denotes a volume average. The time series are plotted on a logarithmic scale, and in every simulation we see a period where the energy of the spanwise velocity K increases with an approximately linear slope, indicating exponential growth. This growth in K is less steep for the two cases with initial wave steepness, and occurs significantly later for simulation R8s0, where = 0.5.
To demonstrate further evidence that the mechanism driving this growth is convective, we calculate a Rayleigh number, defined in our dimensionless framework as (3.2) where ∆ and ∆ are calculated as follows. For every horizontal position ( , ), we consider the vertical profile of buoyancy ( ). In this profile we identify the largest continuous region with / < 0 and denote its size by ∆ . We then take ∆ as the buoyancy difference across this region to compute the Rayleigh number through (3.2). Taking the maximum Rayleigh number across all horizontal positions then provides us with some information on whether convection is likely to be occurring somewhere in the domain. Classical linear stability results predict the onset of convection above a Rayleigh number of (1000), with the critical value varying depending on the boundary conditions considered (see, for example, Drazin & Reid 2004). In figure 6 we additionally plot the time at which the maximum value of in the domain first exceeds 2000 for each simulation. Every case shows that the growth in K only occurs after statically unstable regions form and the Rayleigh number gets sufficiently large. This, together with the quasi-exponential energy growth, provides strong evidence that threedimensional motion is brought about through a convective linear instability. To be clear, this result only informs us of the first source of small-scale disturbances in the flow, and it cannot be used to determine how energy is supplied to turbulence for mixing at later times.
For simulation R8s0, with the smallest initial wave steepness = 0.5, the peak in the energy of the spanwise velocity K is significantly lower than in any of the other cases. The fact that the energy growth occurs later and more slowly than in other cases may allow diffusive effects to impact the saturation of the convective instability. To investigate this, we present a simple extension to add diffusion to the linear ray tracing theory in appendix A. From this analysis, it is plausible that diffusive effects are impacting the development of the wave for the cases with < 1, but quantitative predictions cannot be drawn from the linear theory.
The ray theory analysis introduced in §2.2 of course relies on a number of bold assumptions that are not even well satisfied by the initial conditions. It is therefore remarkable how well ray theory can provide useful intuition for certain aspects of the flow, such as in figure 7, where we plot the horizontally-averaged buoyancy flux J = 0 for each simulation. Recall that positive values of J describe a transfer of potential energy to kinetic energy. The net buoyancy flux associated with the plane wave initial condition is zero, but as the wave is distorted by the mean flow, large and reversible exchanges between the kinetic and potential energies occur. Figure 7 highlights that these exchanges are qualitatively similar at early times for all of the simulations. In the top half of the domain, alternating patches of high and low buoyancy flux appear to propagate downwards over time. Particularly for the cases with lower initial wave steepness, shown in panels ( ) and ( ), this propagation is qualitatively reminiscent of the wave refraction seen in the ray tracing results. At late times in panel ( ), significant wave activity, inferred from the buoyancy flux, is only present at heights similar to the critical level locations specified in figure 2b. In some cases turbulence is intensified when the internal wave rays converge at the critical level. This is reflected in the horizontally-averaged TKE dissipation rate ′ , shown in figure 8, where As before, an overbar denotes a horizontal average and a prime denotes the perturbation from the horizontal average. In the simulations with = 0.75 and = 1, a patch of large ′ emerges in the middle of the domain between = 15 − 30. This is consistent with internal wave energy converging near the middle of the domain, transitioning to turbulence through smallscale shear and/or convective instabilities and generating localized turbulence and energy dissipation. Despite the chaotic, small-scale turbulence present in the = 1 simulations, panels ( ) and ( ) are remarkably similar. Raising from 5000 to 8000 results in minimal changes to the flow structure, which reassures us that the simulations are at sufficiently high to resolve oceanographically relevant turbulent mixing. By contrast ′ does not exhibit a strong burst when = 0.5 in panel ( ). Here downward propagating structures, most likely associated with the refracted internal wave, are most prominent. The relationship between TKE production and dissipation will be explored in more detail using the perturbation potential and kinetic energy budgets in the next section.

Turbulence and mixing
For the simulations with higher initial wave steepness, the turbulent wave breaking event, identified by heightened dissipation of kinetic energy in figure 8, leads to high-frequency, small-scale features in the buoyancy flux of figure 7. However the large-scale pattern in the buoyancy flux J remains present during the burst of turbulent activity for times 20 < < 40, with patches of alternating sign overlaying the small-scale details associated with turbulence. This is significant in the context of irreversible mixing, where J is often used to infer a diapycnal mixing rate when appropriately averaged.
Consider (as in e.g. Howland et al. 2020) decomposing the kinetic and potential energies into contributions from the horizontally-averaged fields , and their perturbations ′ , ′ . For example the volume-averaged potential energy can be decomposed as P = P + P ′ , where (3.4) and · denotes the volume average. Performing a similar decomposition for the kinetic energy leads to the following evolution equations for the energy components: The mean-perturbation exchange terms are defined as and the dissipation rates of the perturbation energies are given by (3.8) Dissipation rates associated with the mean quantities are defined as = | / | 2 / and = 0 ( / ) 2 / . Note that the above decomposition cannot distinguish between energy in internal waves and energy in turbulence since both contribute to the perturbation energy quantities. Here we also assume that the available potential energy of the system can be approximated by P = 0 2 /2, and therefore that is an appropriate measure of irreversible diapycnal mixing. The validity of this approximation is revisited in Howland et al. (2021), where we find only small discrepancies between and the 'true' rate of diapycnal mixing M for the flows considered here.
In a statistically steady state where energy is supplied from the mean flow through the shear production , we expect the buoyancy variance destruction rate ′ to balance −J . This in turn implies that J < 0 and the buoyancy flux represents a mean transfer of kinetic energy to potential energy. In our simulations however, turbulence is most intense in regions where the larger scale wave-mean flow interaction leads to a positive buoyancy flux (for example, see ≈ in panel ( ) for 20 < < 40). In fact the total mean buoyancy flux (integrated over the domain and in time) is positive in all of the simulations, indicating a net transfer of potential energy to kinetic energy. The magnitude of this transfer varies significantly between the simulations, taking values between 24% and 40% of the initial perturbation potential energy. The classic shear-driven steady state assumption, as used by Osborn (1980), clearly does not apply in this case. Indeed this assumption does not even apply to the canonical evolution of a stratified shear layer (Mashayek & Peltier 2013). Despite the emergence of flow structures related to vertical shear (as seen in figure 4), the turbulence in the flow primarily draws energy from the wave rather than the mean shear flow. We therefore use the volume-averaged dissipation rates defined in (3.8) to investigate mixing properties and the evolution of turbulence in the simulations. Time series of the decomposed dissipation rates are plotted for each simulation in figure 9. Comparing the time series with the vorticity snapshots in figure 4, we unsurprisingly see that the dissipation rates peak when intense small-scale turbulence spans the domain at mid-heights. In figures 9a and 9b, the fact that ′ peaks at the same time as ′ suggests that although the convective rolls seen in figures 3e and 4e are the first small-scale structures to emerge, their contribution to mixing is small. Indeed the overall shape of the time series curves for = 1 in figure 9 are reminiscent of those for the development of Kelvin-Helmholtz instability (KHI) in a stratified shear layer (see e.g. Salehipour et al. 2015). Particular features that stand out include a sharp, early rise in ′ , a short-lived 'fully turbulent' stage where the dissipation rates are approximately constant, and a fast decay from this regime. This is in contrast to some other canonical flows, such as the development of Holmboe instability, which lead to more long-lived turbulent activity (Salehipour et al. 2016).
In the simulations with = 1, the dominant contribution to mixing (quantified by the time integral of ) comes from the 'fully turbulent' period 25 35. The instantaneous mixing efficiency during this period is = /( + ) ≈ 0.24, which matches the KHI simulations of Salehipour et al. (2015) for = 1. Together with the temporal evolution of ′ and the development of 'billow' structures in figure 4j, this makes a strong argument that mixing in these flows is primarily the result of turbulence driven by local shear instabilities, despite the presence of localised convection. Indeed the similarity in mixing efficiency is remarkable given the highly irregular buoyancy field on which the billows develop in our simulations. In both of the simulations with lower initial wave steepness, the peaks in dissipation rates are far smaller than for the more energetic initial conditions. For simulation R8s0, where = 0.5, the maximum value of ′ never even exceeds the dissipation rate associated with laminar diffusion of the mean flow . To visualise how these flows differ from the higher dissipation cases, vorticity snapshots are plotted in figure 10 for the times at which ′ is at its maximum. Panels ( ) and ( ) show that no large billow structures develop in case R8s0, and the maximum TKE dissipation rate is instead achieved when the convective rolls saturate in the spanwise plane. Although the buoyancy field is sufficiently distorted by the shear to drive local convection, the local amplification of shear in the -plane is reduced compared to figure 4. Treating the dynamics as that of a refracted wave, we can think of the wave only achieving high values of steepness once its vertical wavenumber has also increased significantly. The smaller scales associated with high values of are more susceptible to viscous effects, and it is possible that locally intense shear is smeared out by diffusion before instabilities can grow significantly. As seen in figures 10c and 10d, turbulent structures emerge from regions of high shear at slightly higher initial wave steepness ( = 0.75). The local shear layers are not as thin as for = 0.5, consistent with the idea that instabilities are more likely to develop when viscous effects are reduced. At larger values of , it may be possible for turbulent billows to grow from = 0.5 and lead to significant local dissipation and mixing. The turbulence would remain far more localised due to the thinner shear layers, but it is possible that the combination of convective and shear mechanisms seen in the cases where = 1 would remain relevant.

Mean flow interactions
For the simulations with the largest wave steepness, we have deduced that the majority of turbulent dissipation and mixing can be associated with turbulence arising from shear instabilties. As mentioned above, turbulent shear flows are often associated with a transfer From another perspective, internal waves breaking at a critical level typically provide momentum to the mean flow as shown in the classical experiments of Koop & McGee (1986). This momentum transfer is a vital part of the mechanism discussed by Plumb(1977) to describe the atmospheric Quasi-Biennial Oscillation. In our simulations, we appear to observe shear instabilities developing near critical levels, and therefore expect the development of the mean flow to rely on a combination of these effects.
To investigate how the wave breaking affects the mean flow, we plot the shear production from simulation R8s1 as a function of and in figure 11. The time series of volumeaveraged shear production, shown in figure 11a, is dominated by large, reversible changes at early stages of the simulation. Indeed the mean value of , averaged over both space and time for 80 time units, is only (10 −5 ), indicating a small net transfer of energy between the mean flow and its perturbation (relative to the energy changes due to turbulent dissipation). This contrasts with the evolution of Kelvin-Helmholtz instability in a stratified shear layer (Salehipour & Peltier 2015). Although large, reversible changes are also seen at early times in that setup, the lack of initial perturbation energy requires a significant net transfer of energy from the mean flow over the course of a turbulent event.
The small net energy transfer does not however mean that the mean flow is unaffected by its interaction with the breaking wave. Figure 11c plots the time-averaged shear production as a function of height, showing that < 0 in the centre of the domain, whereas > 0 near the edges. This suggests that although the turbulence produced at mid-heights in the domain is reminiscent of that triggered by shear instabilities, any local extraction of energy from the mean flow in this region is dominated by the earlier wave-mean flow interaction. This is emphasised in the space-time plot of figure 11b, where a strong patch of negative shear production persists at mid-heights even as the turbulence develops at ≈ 25. As hinted at earlier in figure 4, we can therefore interpret the billows as arising from instabilities of the wave's shear rather than the mean flow. The evolution of the mean flow appears primarily governed by its interaction with the coherent internal wave, and is only slightly modified by the subsequent turbulence. This interpretation of a wave-mean flow interaction is also consistent with the shift in mean streamwise velocity shown earlier in figure 5a. Since the wavenumbers of the internal wave and are both positive, we expect the wave to propagate to the right and downwards (in the positive and negative directions) even as it is refracted by the shear flow. If the wave then deposits its momentum as it approaches the predicted critical levels, we would expect a positive shift in the streamwise velocity in that region, since the wave is propagating to the right. This is precisely what we see in figure 5a, where increases over the region 3 /4 3 /2.

Discussion and conclusions
We have investigated the flow arising from the superposition of a large amplitude plane internal gravity wave and a mean shear flow. This initial condition is inspired and motivated by observations of high internal wave strain in the presence of variable shear in regions of the thermocline by Alford & Pinkel (2000, AP). In our simulations, some aspects of the dynamics at early times can be reasonably described by ray tracing analysis, despite a lack of the necessary, assumed scale separation between the base flow and the wave field. The propagation of wave energy quantities towards the centre of the domain shows qualitative agreement between the simulations and the linear theory, as seen in figure 7. This analysis suggests that critical levels, whose locations are highlighted in figure 2, exist in this region where the mean shear is negative. Ray tracing predicts an increase in the vertical wavenumber as waves approach the critical levels.
The DNS is consistent with this picture, (even though the underlying assumptions of the ray theory are clearly not satisfied) as seen in the snapshots of figures 3 and 4. Vertical length scales are reduced in the centre of the domain, and regions of statically unstable buoyancy emerge as the wave field is distorted by the shear. Streamwise-aligned convective rolls, best highlighted by figures 3e and 4e, emerge from the regions of static instability in all of the simulations, regardless of their initial wave steepness. Quasi-exponential growth in the energy of the spanwise velocity is observed in figure 6 once the maximum local Rayleigh number in the domain becomes large. We deduce that the roll structures in the spanwise plane are simply driven by a linear convective instability.
The accumulation of wave energy in the centre of the domain also leads to an intensification of local shear in the -plane. Flows arising from the more energetic initial condition (where = 1) subsequently become turbulent and exhibit large-scale organisation in the form of elliptical billow structures. These billows, visualized in figure 4j, are reminiscent of those arising due to Kelvin-Helmholtz instability (KHI) in a stratified shear layer. Furthermore the time series of dissipation rates in figure 9 show that wave breaking is characterised by a 'burst' or 'flare' of turbulence, rather than a sustained event. This bursting nature is again reminiscent of turbulence initiated through KHI.
When turbulence persists throughout the domain at mid-heights, the mixing efficiency is also largely similar to that found in previous studies of KHI at = 1. The buoyancy field surrounding the local shear layers in our simulations is complex, with regions of strong, stable stratification, and static instability present either side of the shear layer. It is therefore somewhat surprising that the mixing results are consistent with a typical stratified shear layer, particularly given the results of Mashayek et al. (2013) highlighting strong Richardson number dependence within that simple setup. A future study of shear-induced mixing for a wider range of background buoyancy profiles would be useful in pinpointing the key parameters governing variations in mixing efficiency. Nevertheless in the simulations with larger initial wave steepness, mixing appears predominantly shear-driven despite the prior emergence of convective rolls in the breakdown of the wave. To be clear, by 'shear-driven' we mean that energy is supplied to turbulence primarily through shear instabilities, and in this case the unstable shear is that in the velocity field of the refracted internal wave. As seen in figure 10, the less energetic initial conditions do not lead to as much turbulent activity. The waves are still refracted towards the centre of the domain and reach sufficient steepness to drive local convection, but we do not observe as intense shear amplification in the -plane in these cases. We suggest that viscous effects are damping the wave before strong shears can be generated. Although high wave steepness values occur at later times, high local wavenumbers are still produced earlier by the wave refraction, and as time progresses these gradients will be smeared out by diffusion. A simple model for this wave damping is provided in appendix A, although its inherently linear formulation prevents us from drawing quantitative comparisons with the simulations presented here.
Even at the high resolution of our simulations, we cannot consider Reynolds numbers that match our motivating oceanographic observations, suggesting that viscous effects are overemphasised in our flows. It is therefore possible that the mechanisms driving turbulence and mixing in our more energetic simulations may be relevant for flows arising from smaller initial wave steepness. In these cases unstable shear layers would be produced at higher wavenumbers, potentially limiting the size of the billows and the extent of the turbulence. Nevertheless this wave breaking may be representative of a process leading to intense mixing from internal waves in the ocean.
Shear-driven turbulence is commonly associated with an extraction of energy from the mean shear flow, characterised by positive values of shear production > 0. However even in our most energetic simulations, we find on average that < 0 in the region of most intense turbulence, as shown in figure 11. This instead suggests that the primary effect on the mean flow comes from a wave-mean flow interaction, where the wave transfers its momentum into the mean flow as it breaks. The change in the mean streamwise velocity shown in figure 5a supports this interpretation. Indeed, since the strong local shears are associated with the wave rather than the mean flow, it may be expected that simple energetics arguments regarding the interaction of turbulence and a mean flow do not apply here.
Although this counterexample to the traditional picture of shear-driven turbulence is specific to our setup, it highlights a generic difficulty in analysing turbulent stratified flows. The effects of internal gravity waves and turbulence are often considered in isolation, although their interplay is vital at the scales associated with wave breaking that are of interest to us. Waves break to produce turbulence, turbulence itself can emit internal waves, and the evolution of a turbulent patch in a stratified fluid is affected at leading order by the presence of internal waves (as reviewed e.g. by Davidson 2013). Continuous energy transfer between waves and turbulence can lead to great difficulties in interpreting their respective roles in the dynamics.
In our simulations, the internal wave appears to drive both the generation of turbulence and the modification of the mean flow. However our setup of an initial value problem superimposing a wave and shear is not typical of how such an interaction would arise in the ocean. Internal waves in the ocean continuously propagate away from generation sites such as topographic features where waves are generated through tidal flows (Sarkar & Scotti 2017). Future studies could extend the relevant setup of Lamb & Dunphy (2018), who consider the interaction of a tidal flow over a ridge with a mean shear, but only in 2-D. It is unclear what behaviour could be expected over a longer time scale as more waves propagate towards the breaking event through the shear. If a critical level were responsible for the breakdown, one might expect a continuous supply of energy to maintain the turbulence as waves propagate towards it.
Our simulation of an isolated 'burst' of turbulence arising from a large amplitude internal wave is however more consistent with the time scales of overturning events observed by AP. Taking the dimensionless duration of the wave breaking event in simulation R8s1 as 0 = 50 and the background buoyancy frequency as 0 = 5 × 10 −3 s −1 , we deduce an event duration of = 1 × 10 4 s ≈ 0.116 d, consistent with the time scales shown in figure 11 of AP. Of course those observations rely on individual vertical profiles, and it is possible for longer lasting turbulent patches to simply be advected away from the profiler.
Although not present in the observations of AP, the existence of coherent 'staircases' in density is common in many regions of the ocean. The propagation and instability of internal waves in such regions, where the background stratification varies strongly, is far different to the case of uniform stratification (Sutherland 2016). Nevertheless, the fundamental mechanism of shear refracting small-scale internal waves seems relevant at sharp density interfaces, at least in situations with large internal solitary waves as considered by Xu & Stastna (2018). Understanding how generic the mixing properties of this shear-wave interaction are for arbitrary 2 ( ) is vital for the general application of our results.
In the context of the ocean thermocline, we have also neglected the effect of the Earth's rotation in our simulations. For the field site of AP, buoyancy effects are important on much faster time scales than rotation, as evidenced by the typical ratio / = 1.6 × 10 −3 . The slowly varying shear may however be intrinsically modified by rotation, and it is most likely associated with a slowly-propagating near-inertial wave. Although the observations of AP tell us the strength of the vertical shear, they do not report on the orientation of the mean flow or how it changes. This orientation may have significant consequences on the nature of the wave breakdown. For example Fritts et al. (2013) find that a spiralling finescale shear flow weakens the spanwise convective instability relative to the case of a shear flow aligned with the internal wave. Broutman et al. (1997) also add the time-dependent nature of propagating near-inertial shear to their ray tracing analysis and find that this can reduce the proportion of short internal waves that end up dissipated in critical layers. Determining whether these types of interaction could impact our results on mixing and mean flow acceleration would be useful in understanding how specific the results are to our setup.
In regions away from the thermocline, / typically takes larger values and rotation can be expected to play more of an important role, although similar wave breaking mechanisms may still be relevant. For example the deep ocean measurements of Waterman et al. (2012) highlight a local peak in turbulent dissipation and internal wave energy approximately 1 km above the ocean floor, where stratification remains relatively weak. From corresponding measurements of the mean shear flow, they attribute this peak to waves breaking at critical levels. Waterman et al. (2012) also find a mismatch in this region between dissipation rates measured from microstructure and those inferred from the internal wave energy. One explanation for this is that, like in our simulations, wave energy is split between the mean flow and turbulence as the waves break. Investigating how incoming wave energy is distributed between mean flow acceleration, turbulent dissipation, and mixing in a fully turbulent critical layer would be useful for improving parameterizations for such scenarios. Such parameterizations could depend strongly on the properties of the incoming waves, and therefore require a fundamental understanding of the various sources of internal waves in the We now have a closed system to solve numerically for initial values of , , , , and .
To investigate how the waves behave as they are refracted towards the middle of the domain, we now also consider the evolution of the wave action along the rays. As described previously, the vertical wavenumber increases as a wave approaches a critical level. This means that molecular diffusion, thus far neglected in the analysis, may become important, particularly for the Reynolds numbers of our direct numerical simulations. We therefore propose a simple modification to the ray tracing equations that incorporates diffusive effects below.
Consistent with the assumption that is larger than the vertical wavenumber of the shear, we only consider diffusion associated with the internal wave, and assume that the mean shear flow ( ) is constant in time. Defining the wave energy density as in (A 1), diffusive effects will appear in the energy equation as a dissipation rate : For = 1, if we substitute the internal gravity form of (2.7) (where in the velocity prefactors should be replaced with the intrinsic frequency ) then the dissipation term simplifies to = 2( 2 + 2 ) / . By doing this, we assume that the the polarisation of the velocity and buoyancy field in (2.7) is maintained even as the vertical wavenumber varies due to refraction. Dividing D by then gives the corresponding dissipation rate to add to the wave action equation, which becomes This equation can be solved in conjunction with the ray tracing equations of (2.16) to provide an estimate of the energy buildup in the centre of the domain. Although now straightforward to calculate, wave action can be difficult to interpret intuitively. In particular, it is not clear what a specific value of can tell us about how susceptible a wave is to different instabilities. Stability analyses of finite amplitude internal waves have shown that the local wave steepness is a key parameter in determining the nature of wave breakdown (e.g. Lombard & Riley 1996). We therefore convert wave action to wave steepness by assuming the wave locally maintains the polarisation given in (2.7), even as the local wave vector is modified by the Doppler shifting. In this form, the energy density of the wave is simply given by = 2 /2 . Wave steepness and wave action can then be exchanged through the equations and then inverted to give wave steepness by (A 8). Figure 12 presents the results of solving (A 7) in terms of the wave steepness obtained through (A 8) for a range of initial wavepacket heights 0 . The initial wave steepness is set at = 0.5, and we compare the results for the inviscid limit in figure 12a with the results for = 5000 in figure 12b. In the inviscid case, increases consistently over time for those rays that approach a critical level. The high values of seen in figure 12a predict the development of highly unstable convective regions in the centre of the domain. However once diffusion is taken into account, wave steepness is shown to peak on a timescale of (50) and then decrease as the critical levels are approached. This timescale is comparable with the time at which spanwise perturbations peak in the simulations with < 1, shown in figure 6. It is plausible that the wave breakdown in these cases may be affected by diffusive effects. This diffusion may also lead to the lower growth rates seen in figure 6 for < 1, since reduced values of local steepness produce smaller negative buoyancy gradients to drive convective instabilities.