Non-local dispersion and the reassessment of Richardson's t3-scaling law

The Richardson scaling law states that the mean square separation of a fluid particle pair grows according to t3 within the inertial range and at intermediate times. The theories predicting this scaling regime assume that the pair separation is within the inertial range and that the dispersion is local, meaning that only eddies at the scale of the separation contribute. These assumptions ignore the structural organization of the turbulent flow into large-scale shear layers, where the intense small-scale motions are bounded by the large-scale energetic motions. Therefore, the large scales contribute to the velocity difference across the small-scale structures. It is shown that, indeed, the pair dispersion inside these layers is highly non-local and approaches Taylor dispersion in a way that is fundamentally different from the Richardson scaling law. Also, the layer's contribution to the overall mean square separation remains significant as the Reynolds number increases. This calls into question the validity of the theoretical assumptions. Moreover, a literature survey reveals that, so far, t3 scaling is not observed for initial separations within the inertial range. We propose that the intermediate pair dispersion regime is a transition region that connects the initial Batchelor- with the final Taylor-dispersion regime. Such a simple interpretation is shown to be consistent with observations, and is able to explain why t3 scaling is found only for one specific initial separation outside the inertial range. Moreover, the model incorporates the observed non-local contribution to the dispersion, because it requires only small-time-scale properties and large-scale properties.


Introduction
The relative dispersion of two tracer particles in a turbulent flow has received considerable interest, because it is a tractable problem closely connected to turbulent mixing. More specifically, the spatial correlations, including the variance, of a passive scalar are related to the statistics of the relative pair separation (Batchelor 1952, Monin & Yaglom 1975. If x 1 and x 2 are the Lagragian trajectories of the two tracer particles, then their relative dispersion is described by the relative separation vector, r(t) = x 1 (t) -x 2 (t). The separation distance, given by = ( ) , is widely believed to scale as '~) in the inertial range and for intermediate times, which is referred to as Richardson scaling. Here, . . indicates averaging over a large ensemble of pairs. Initially, Richardson (1926) obtained the t 3 scaling by proposing a diffusion equation for relative dispersion in isotropic turbulence, where, based on his experimental observations, the diffusion coefficient, K, was scale dependent according to K~r 4/3 .
As Batchelor (1952) noted, the diffusion equation is not exact, but is based on some intuitive or empirical evidence. Moreover, this equation conveniently reduces the full complexity of the turbulent scalar transport to a diffusivity parameter. In Richardson (1926) the diffusivity was a function of the pair separation distance, r, while Batchelor (1952) argued it should be a function of time, t. Finally, in the Kraichnan (1966) model, the diffusivity depended on both r and t. However, all three approaches resulted in a t 3 scaling of the mean square pair separation, ' , in the inertial range. Nowadays, the diffusion equation is considered unsuitable for modelling the dispersion by a turbulent flow, since turbulence is time correlated and contains non-local effects (e.g. Salazar & Collins 2009, Falkovich, Gawedzki & Vergassola 2001, Thalabard, Krstulovic & Bec 2014. But the t 3 scaling law, originally obtained from the diffusion equation, has since been derived by other means as discussed in §2, and is still considered valid. In their review of pair dispersion, Salazar & Collins (2009) concluded that the Richardson '~) scaling law remains unchallenged despite that "...there has not been an experiment that has unequivocally confirmed Richardson scaling over a broad-enough range of time and with sufficient accuracy." More than ten years later, this is still the case, as shown in §3. The lack of a clearly observable t 3 scaling is commonly attributed to too low Reynolds numbers, too short observation times, mean shear or experimental error, but hardly ever to fundamental limitations in the theory. On the contrary, the ability to predict t 3 scaling is often a measure by which a new theory or dispersion model is judged (some examples are given in Sawford (2001)).
The present level of confidence in Richardson scaling is thus largely based on a firm belief in the classical turbulence theory rather than being based on strong empirical evidence. Notwithstanding shortcomings in the simulations and experiments, the lack of convincing evidence calls for a critical reassessment of the underlying assumptions and hypotheses, especially in light of recent advances in the understanding of the structure of the turbulent velocity field.
(a) (b) Figure 1. (a) Tracer particle pair dispersion near a significant shear layer, which is characteristic of intermittency at high Reynolds number. Pair A is initially located inside the layer and transported by the small-scale structures within this large-scale layer structure. The small scales are indicated by red and green blobs marking intense dissipation and swirl respectively (see also Elsinga et al. 2017). As soon as the pair leaves the layer, it disperses quickly due to the large velocity difference across the layer. Pair B is located within a large energetic flow region bounding the layer, which is associated with low level velocity fluctuations and slow relative dispersion. (b) Example of a tangential velocity, w, profile across a significant shear layer, where x denotes the normal to the layer (data from Ishihara, Kaneda & Hunt (2013)). Of particular relevance is that high Reynolds number turbulence is highly intermittent, meaning that the most intense small-scale motions are clustered in confined regions of space. Intermittency has been related, at least in part, to the development of shear layer structures, which combine small-scale and large-scale motions ( figure 1(a)). Intense small-scale flow structures, i.e. vortices and dissipation sheets, are found within these shear layers, and they scale with the Kolmogorov length scale, h. The shear layer itself is approximately 4l T thick on average, where l T is the Taylor length scale, and is bounded on either side by large nearly uniform flow regions that scale with the integral length scale, L. The velocity magnitude in the uniform flow regions is of the order of the root-mean-square of a velocity component, U. These shear layers are important as they contain significant dissipation (Ishihara, Kaneda & Hunt 2013, Elsinga, Ishihara & Hunt 2020 and affect the average velocity distribution associated with strain (Elsinga et al. 2017). It is this strain that is responsible for the relative dispersion/separation of particle pairs (see Goto & Vassilicos (2004) for an illustration in 2D). Because of their importance and large size, Ishihara, Kaneda & Hunt (2013) have referred to these structures as significant shear layers. Moreover, these layers are persistent and can affect the dispersion over a long time. The lifetime of the significant shear layers is of the same order as that of the large scales, i.e. L/U, since the layers are created by large scale regions rubbing against each other. However, the internal structures, such as the small-scale vortices, may have shorter lifetimes.
The observation of significant shear layers challenges the classical understanding of turbulence, which assumes that the small-scales are largely independent of the largest scales. However, these thin shear layers are bounded by large-scale energetic motions, which determine the velocity difference across the layer and the intense small-scale motions within the layer ( figure 1(b)). Indeed, Ishihara, Kaneda & Hunt (2013) find that the velocity difference associated with the intense small-scale structures is of the order of U. Furthermore, significant energy transfer was found between the largest and smallest scales, and vice versa, near the significant shear layer (Aoyama et al. 2005), which suggests that the scales are strongly mutually dependent. Since these shear layers contribute significantly to the turbulent strain and dissipation, as mentioned above, they are likely to affect pair dispersion in non-classical ways. This is confirmed by new results for pair dispersion inside a significant shear layer, which are presented in §3.3.
The aim of the present paper is to open a discussion on the validity of the '~) scaling law and its underlying assumptions in particular ( §2). Furthermore, numerical and experimental results reported in the literature since the 2009 review by Salazar & Collins are examined for possible evidence of t 3 scaling in the ranges predicted by the theory ( §3). An alternative model for pair dispersion at intermediate times is introduced in §4. This model captures features of the mean square separation currently unexplained by theory. The main findings and observations are summarized in §5.
As a final introductory remark, we briefly comment on the other scaling regimes in pair dispersion. The initial ballistic (or Batchelor) regime is kinematic in nature and assumes that the particles maintain their initial velocity over short time-scales. This approximation is valid for any continuous velocity field in the limit of small t, and results in a '~' scaling law (Batchelor 1950). For long times, the pair separation distance increases beyond the integral length scale, where the particles move independently leading to a diffusive dispersion regime, which is known as the Taylor regime (Taylor 1922). Both the initial short-time Batchelor and the long-time diffusive regime are well established and are independent of the detailed structure of turbulence, which is not the case for the intermediate-time Richardson scaling regime considered here. We return to these other regimes in §4.

Derivations of t 3 scaling
This section reviews a number of different theories, which predict a '~) scaling regime ( § §2.1-2.3). Particular emphasis will be on the ranges for which t 3 scaling is predicted. It is intended as an overview of the broad variety of approaches without claiming to be complete. However, in the end, all these theoretical studies use quite similar assumptions. Their validity is discussed in § §2.4 and 2.5. Batchelor (1950) obtained t 3 scaling for initial separations, + = ( = 0), in the inertial range from dimensional arguments, which are similar to those originally presented by Obukhov (1941). Within the inertial range the relative motion (i.e. velocity) is said not to be affected by viscosity. Furthermore, the dependence of the dispersion on the initial separation, r 0 , and the energy containing motions is assumed negligible at intermediate times (defined in §3.1) when

Dimensional analysis
In the spirit of K41 (Kolmogorov 1941), the relative motion then depends only on the mean dissipation rate, e, and time, t. From dimensional consistency, it follows that (Batchelor 1950): where C is a dimensionless constant. Integration subsequently yields: Here g (= C/3) is the Richardson constant. In a later paper, Batchelor (1952) introduced a time shift t 1 allowing for some influence of the initial pair separation r 0 on the effective origin (t 1 , r 1 2 ), which resulted in: Other than that, the set of assumptions resulting in (2.3) remained the same as for (2.2). Further justification for an effective origin is given by Ishihara & Kaneda (2002), which is based on the observation of a linear region in the Lagrangian correlation of the velocity difference.
2.2 Mechanistic/stochastic approach A physical mechanism leading to t 3 scaling was proposed by Bourgoin (2015). In this model, the mean square particle separation, ( ) ' , is described by a stepwise ballistic growth. During each step k, the pair's relative velocity squared is constant and given by the second-order Eulerian structure function, S 2 , evaluated at the scale of the particle separation distance, r k . This leads to ballistic dispersion, which is maintained for a time period t k . Accordingly, the pair separation will have grown to r k+1 at the start of the next step k+1. The ballistic process repeats at the new scale, r k+1 , with the associated S 2 (r k+1 ) and time scale t k+1 . Both the Eulerian structure function and the time scale are related to the particle separation distance by imposing K41 scaling. Specifically, S 2 ~ (er) 2/3 , which applies in the inertial range. This implies that the theory is valid for r k (including r 0 ) within the inertial range. Furthermore, t k = aS 2 (r k )/2e, where a is referred to as a persistence parameter. The condition a = 1 corresponds to the eddy turnover time at scale r k . This stepwise process leads to: where the virtual time origin t 0 depends on the initial separation. The model parameter a can be tuned to yield a Richardson constant g = 0.55 consistent with reported values in the literature.
It is important to note that the ballistic dispersion model is local in its original implementation, that is, only the eddies of scale r k contribute to the relative velocity and these eddies have inertial range scaling properties. Furthermore, the variations in the square separation at t > 0 are not considered, i.e. the model only uses the mean.
The ballistic approach is general and can be adapted to turbulent shear flow (Polanco et al. 2018). Moreover, the actual S 2 (as opposed the inertial range model) may be inserted, which yields deviations from equation (2.4) (Liot et al. 2019, see also §4). However, we focus on the case of homogeneous isotropic turbulence with inertial range and local dispersion assumptions, because it yields the classical Richardson scaling, which is of interest here.
Another ballistic model was proposed by Thalabard, Krstulovic & Bec (2014), who considered the pair dispersion as a continuous-time random walk process. The inertial range and local dispersion assumptions are the same as in Bourgoin (2015), however, the stepwise ballistic scenario is formulated in terms of increments in the pair separation distance, r, (as opposed to its square) and a different choice is made for the relative velocity between the particles at a given scale. These choices also lead to a t 3 scaling for ( ) ' in the inertial range, but the separation growth is governed at leading order by third-order velocity increments as compared to the second-order increments in the Bourgoin (2015) model. Thus, the resulting physics is different due to different choices for the relative velocity between the particles. Related continuous-time random walk descriptions of pair dispersion include the so-called Lévy walks (e.g. Shlesinger, West & Klafter 1987), in which a probability distribution is assumed for the steps in the pair separation distance and the associated time steps.
In Lagrangian stochastic models (Thomson 1987, Wilson & Sawford 1996, Sawford 2001, the dispersion of particles is treated as a Markov process, where the changes in a pair's relative separation and relative velocity depend only the present separation and relative velocity. Therefore, the process is memoryless and ignores any history effects. Then, the relative velocity increment at each time step is described by a stochastic differential equation containing a drift term and a diffusion term. The latter adds a Gaussian white noise, which amplitude scales with (e dt) 1/2 in the inertial range according to K41. However, a dissipation range correction has been considered by Borgas & Yeung (2004). The drift term can be related to the Eulerian relative velocity probability-density-function, p E (Thomson 1987). Assuming K41 inertial range properties for p E , that is, the shape of p E is self-similar and its variance scales according to ~(er) 2/3 , results in Richardson scaling, see Sawford (2001) for a review and Devenish & Thomson (2019) for a recent example. Note, however, the large scatter in the Richardson constants predicted by the different Lagrangian stochastic models (Sawford 2001). Clearly other modelling aspects also play a critical role. The above assumes that r (while distributed) remains within the inertial range, which is a local dispersion assumption in the sense that the small dissipative scales and the large scales do not influence the dispersion. In an actual turbulent flow, p E is not self-similar (Sawford & Yeung 2010), which leads to deviations from the classical Richardson scaling as discussed further in §4. Malik (2018) related the slope of the kinetic energy spectrum, E(k), to the pair dispersion power law. Note that there was no dissipative range in the assumed energy spectrum. Consequently, the initial separation was always in the inertial range. For local dispersion, i.e. dispersion by the turbulent scales equal to the pair separation, and a k -5/3 energy spectrum, the Richardson t 3 scaling law was obtained. However, including effects of non-local scales, that is, inertial range scales larger than the separation distance, was shown to enhance the dispersion and increase the power scaling (t g with g > 3). So, the assumption of local dispersion seems important in obtaining t 3 scaling.

Why Kolmogorov theory works for average energy related statistics only
The existing approaches predicting t 3 scaling ( § §2.1-2.3) rely on the existence of an inertial range, where all flow properties depend only on e and the local scale (length scale r or time scale t). This is referred to as Kolmogorov theory, or K41, and represents the classic inertial range assumption. K41 predicts, among other things, that, in the inertial range, '~' /) and E ~ k -5/3 . These results were used in §2.2 and §2.3 respectively. While '~' /) and E ~ k -5/3 appear to be accurate subject to minor corrections (e.g. Donzis & Sreenivasan 2010), this does not imply that K41 can be extended to other turbulence properties, including dispersion, without question. We explain our reservations below.
In order to better understand the successes and limitations of K41, it is useful to consider the intermediate range as a matching or overlap region, which connects the small r regime to the large r regime. The present matching argument developed for structure functions is similar to that of Tennekes & Lumley (1972, p. 264) and George (2013) for the inertial range in the energy spectrum. Consider the n th order velocity structure function, 7 ( ) = ( , 0) 7 , where is the relative velocity between the particles at t = 0. Assume that for small separation r, these structure functions scale with the Kolmogorov length and velocity scales, h and u h respectively, which is written as: where = = / . At large r, integral scaling applies, which is written as: 7 ( ) = 7 7 ( ) (2.6) where = / . Furthermore, assume that there exists an overlap or matching region where both scalings are valid. In that case, the derivatives 7 / from equations (2.5) and (2.6) can be equated, which results in: = 4@7/) 7 = = = @7/) 4@7/) 7 (2.7) In the above equation, both sides have been multiplied by 4@7/) and the relation < ) / ≡ = ) / has been used, where the normalized dissipation rate, D, is a constant in (near) equilibrium turbulence at sufficiently large Reynolds number (Re l > 100, Sreenivasan 1998, Kaneda et al. 2003. The latter follows from the energy balance, where the average dissipationrate must be equal to the average turbulent kinetic energy production-rate when turbulence is at equilibrium. Note that the energy balance = ) / does not require an inertial range (see also Pope 2000). It only assumes that production occurs at large scales, and hence the production-rate scales with ) / . In the limit of large Reynolds number, and hence = → ∞ and → 0, the left and right-hand-side of equation (2.7) must be equal to the same constant, which we define as E F 7 . Subsequent integration yields:  For the special case of the second-order structure function (n = 2), Kolmogorov scaling applies to a very good approximation at small r (figure 2(a)). Consequently, e and r appear as the only relevant parameters in the overlap region of ' consistent with Kolmogorov theory (equation (2.8)). However, the collapse of ' with u h and h is to be expected owing to the ebased definitions of the Kolmogorov scales. Since ~/ ' = lim K↓+ ' / ' and = < / ' , it follows that '~< / ' at small r, which validates equation (2.5) for n = 2. Generally, Kolmogorov scaling does not collapse velocity structure functions at small r. Reynolds number dependencies are clearly visible for n ≥ 6 (figures 2(c-d)), and may already be noticed at n = 4 ( figure 2(b)). This means that another, independent scale needs to be combined with the Kolmogorov scales in order to collapse the data. The only remaining independent turbulent scale (in the current understanding of turbulence) is the integral scale. Therefore, the lack of a collapse in figure 2 suggests that large scale influences are present at small r. Indeed, the large scales influence the magnitude and the size of the most intense dissipation and velocity gradients (Elsinga, Ishihara & Hunt 2020). Also, Yeung, Brasseur, & Wang (1995) found evidence of a strong and direct coupling between the large and the small scales of motion in both physical and Fourier space, which resulted in a quick response of the small-scales to changes at the largest scale. These couplings may be associated with large energy-containing motions bounding the small-scale motions within the significant shear layer structures (Ishihara, Kaneda & Hunt 2013, Elsinga et al. 2017. Moreover, a dependence on the larger scales may be inferred from the anisotropy of the small scales (Shen & Warhaft 2000, La Porta et al. 2001, Fiscaletti et al. 2016. The above observations are clearly inconsistent with the classical assumption that the small scales are independent of the large scales apart from the mean energy transferred (Kolmogorov 1941). We should mention that these largescale influences at small r do not appear in ' (figure 2(a)), because of the way u h and h have been defined (see above). The lack of collapse at small r in figures 2(c-d) causes equation (2.8) to fail in predicting the power law exponent at intermediate r when n > 5 (Benzi et al. 1993, their figure 2). Large-scale influences, i.e. a contribution from U at small length scales, can explain the observed discrepancy in these power law exponents (She & Leveque 1994).
In summary, Kolmogorov theory appears to be suitable for statistical averages closely associated with energy (e.g. energy spectrum, ' , D and its Lagrangian equivalent D L ), because K41 is based on the energy balance (and equilibrium). However, Kolmogorov theory is less suitable for flow properties other than energy, such as the higher order moments of velocity difference (n > 5 in particular), because, in those cases, the large-scale influences at small r are not accurately accounted for. Note that the intermediate region may still be considered as an overlap region, but velocity and length scales different from u h and h are required to collapse 7 at small r and n ≳ 5. Different scales may also improve the collapse for lower order moments, n ~ 4, but the gain will be less obvious, since the observed deviations from K41 are quite small in those cases.
What are the implications for pair dispersion? For the initial Batchelor regime at small t, the mean square separation depends on ' + (Batchelor 1950, §4), and hence collapses in Kolmogorov scaling for a given r 0 (e.g. , Salazar & Collins 2009). So, the small t regime depends on u h , h and r 0 . The additional r 0 dependence may carry over into the overlap region and affect the scaling exponent at intermediate time (and indeed it does, § §3 and 4). This is similar to what has been observed for the velocity structure functions, where the Kolmogorov scales fail to collapse the higher order structure functions at small r causing the breakdown of K41 in the intermediate range for n ≳ 5. The additional r 0 dependence may also introduce a Reynolds number dependence in the scaling exponent. Such an r 0 dependence at intermediate times can be viewed as a non-local effect or a history effect, which is not included in the K41-based theories predicting Richardson scaling ( § §2.1-2.3). Moreover, r(t) is distributed at the end of the Batchelor regime, and hence at the start of the intermediate regime, which further complicates the analysis as discussed in §2.5. It is therefore not obvious that K41 should apply to pair dispersion at intermediate times.

Non-local dispersion and turbulent structure
The approaches in § §2.2-2.3 include the idea that, at any given time, only the turbulent eddies at the scale of the mean pair separation distance, ( ) ' 4/' , contribute to the rate of change of the separation distance. This is known as local dispersion, as opposed to non-local dispersion, where the rate of change of the pair separation distance is affected by turbulent eddies at scales other than ( ) ' 4/' . Dispersion, however, contains non-local contributions. For example, some particles, initially at r 0 within the inertial range, are brought closer together as time progresses with r eventually entering into the small-scale range, while other pairs move far apart causing r to be in the large-scale range. Consequently, the pair separation probability-density-function (PDF) is very broad (e.g. Scatamacchia, Biferale & Toschi 2012, which means that the pair dispersion for r 0 at later time is affected by energetic scales and viscous scales simultaneously. Furthermore, the PDFs may depend on r 0 , which could help to explain a r 0 dependence at intermediate time. These issues are minor when the Reynolds number is extremely large and it would take any pair very long to disperse to scales well outside the inertial range. However, as argued in §2.4, notable large-scale influences persist down to small r. This makes predicting the tails of the PDF, and hence the mean square separation, highly complex. The complex relation between dispersion and turbulent scales is illustrated in figure 1. A particle pair leaving a small-scale eddy inside the significant shear layer almost immediately enters the neighboring energetic scales (pair A in figure 1(a)). In that case, the dispersion is always affected by the viscosity or the energy-containing eddies, and there is no important contribution from inertial range eddies, even when r is in the inertial range. Furthermore, the energetic large-scale motions bounding the significant shear layer determine the velocity difference across the layer and thereby they control the magnitude of the small-scale eddies and the small-scale dispersion inside the layer. These are significant non-local effects, which are assessed in §3.3.
Turbulent structure also contributes to the selective sampling of the flow. Pairs that are randomly distributed initially (isotropy) align and cluster due to flow structure. For example, pairs cluster onto sheets around a shear layer flow structure or a node-saddle topology (e.g. Goudar & Elsinga 2018). All pairs then disperse along the directions of the sheet, which roughly correspond to the directions of extensive strain. The approximate alignment between the most extensive strain and the direction of large elongation between particle pairs has also been noted by Devenish & Thomson (2013). Consequently, the particles probe the turbulent flow along those specific directions, which may have a different velocity distribution across the scales as compared to the unconditional/isotropic energy distribution. For instance, the kinetic energy spectrum along the most extensive straining direction of an average shear layer structure is different (k -1 ) from the usual k -5/3 , which is the average over all directions (Elsinga & Marusic 2016). These effects are not well understood at present, but may have important implications. A related discussion for single particle statistics can be found in Lalescu & Wilczek (2018).

Observations
Now that the theory has been reviewed and the issues concerning the assumptions have been discussed, we turn our attention to the evidence for Richardson scaling. Results from DNS ( §3.2), kinematic simulations ( §3.4) and experiments ( §3.5) are considered. Furthermore, the contribution from the significant shear layers to the overall pair dispersion statistics is examined in §3.3. However, first the requirements are given for positively identifying a Richardson scaling regime ( §3.1).

Requirements for testing Richardson scaling
When testing for Richardson scaling, the initial separation r 0 is required to be in the inertial range consistent with the assumptions made when deriving the scaling law ( §2). Typically, this requirement is translated into h << r 0 << L (e.g. Sawford (2001) and Salazar & Collins (2009)), which is correct but imprecise. Often it is misinterpreted as r 0 ~ 10h being sufficient, simply because it is an order of magnitude larger than the lower bound, h (see § §3.2 and 3.5). Note that r 0 ~ 10h is still within the linear core of the vortices and the dissipation sheets (Jiménez et al. 1993, Elsinga et al. 2017, which are small-scale structures. The actual lower bound, however, is provided by the assumption that the relative velocity at scale r 0 is not affected by viscosity (see §2), i.e. not affected by the small dissipative scales. Though it is difficult to define the lower bound for the inertial range exactly, we may specify that r 0 should be larger than the characteristic size of the dissipation structures, which has been determined at ~60h (Elsinga et al. 2017). Moreover, the dissipation spectrum drops quickly for dimensionless wavenumbers kh < 10 -1 (e.g. Pope 2000) corresponding to a physical length scales larger than approximately 60h. A more conservative criterion would be r 0 ≳ 120h, which is based on the coherence length of small-scale vorticity (Elsinga et al. 2017). This stricter criterion is consistent with the second-order Eulerian structure function revealing a r 2/3 inertial range scaling for separation distances larger than 100h (e.g. Donzis & Sreenivasan 2010). At present, there is limited data available for r 0 > 60h, let alone r 0 > 120h, making it difficult to effectively assess the scaling law for the inertial range. The exception may be the experiments discussed in §3.5. Furthermore, t 3 scaling should appear for a wide range of initial separations, r 0 , within the inertial range.
Concerning the temporal range, the t 3 scaling is predicted for intermediate times, after which the initial condition is forgotten ( §2.1) and + ' ≪ ' ≪ ' . This intermediate range is expected to lie somewhere between the Batchelor time scale, N = + '/) @4/) , and the Lagrangian integral time scale T L (Batchelor 1950). Furthermore, the t 3 scaling should appear for a decade of t in order to be convincing, as explained below. Alternatively, it has been suggested that Richardson scaling can also appear for very small initial separations, + ≲ , and intermediate times, t >> t h , when h << r << L (Batchelor 1952, Monin & Yaglom 1975. In that case, the effect of viscosity is lost and it is hypothesized that the particle pairs ultimately forget their r 0 before entering the Taylor regime (Batchelor 1952). In this scenario, the bulk of the pairs need to reach the inertial range (r ≳ 120h) first. During this time, the initial condition will not be forgotten due to the spatial coherence that exists up to 120h (Elsinga et al. 2017). Because the velocity of the tracers and the fluid is the same, it is reasonable to assume that the pair travel time and the turn-over time of the fluid structure (and hence its decorrelation time) are similar for a given length. In that case, spatial coherence of a flow structure implies sufficient temporal coherence. Once the pairs enter the inertial range, the r 0 -dependence may be gradually lost in a similar process as for the pairs with r 0 inside the inertial range. Because of the additional time needed to reach the inertial range when + ≲ , we expect that Richardson scaling appears first for h << r 0 << L, which, moreover, is the common condition appearing in reviews on the subject (Sawford 2001, Salazar & Collins 2009). In any case, if a true Richardson scaling regime exists for + ≲ , then there is no reason why it should not appear for h << r 0 << L from the theoretical point of view (both conditions neglect the effects of viscosity and r 0 after some time when r is within the inertial range). So far, ballistic and Lagrangian stochastic modelling frameworks ( §2.2) have not been able to confirm exact t 3 scaling when r 0 is within the dissipative range. This is discussed in more detail in the last paragraph of §4. Also, the spectral approach ( §2.3) has not yet included a dissipative range. Therefore, we focus on the condition for h << r 0 << L, which has received wider theoretical support until now ( §2) and may reveal Richardson scaling first. However, results for + ≲ are commented on when relevant.
Finally, there is the issue of the functional form of the Richardson scaling regime (equations (2.1)-(2.4) or the one that is often used in practice: − (0) ' = ) ). For large times, i.e. t >> t 0 and ( ) ' >> r 0 2 , these relations are equivalent. However, the observation time in the presently available simulations and experiments is limited due to the limited Reynolds numbers achieved. Therefore, we briefly comment on the issue. The most general form is given by equation (2.3) and uses an effective, or virtual, origin (t 1 ,r 1 2 ). The other forms can be obtained by introducing specific choices for t 1 and r 1 2 . The use of a virtual origin has received important criticism, because it allows fitting any power law to the data (e.g. Ouellette 2006, Salazar & Collins 2009). This is illustrated in figure 3. Here, the actual mean square separation is taken to evolve according to ( ) ' / ' = / < ' (blue line), where t h is the Kolmogorov time scale. However, by introducing a virtual origin when plotting the exact same data, we can observe approximate t 3 scaling over a short temporal range, even if the actual dependence is t 2 . The plot also shows that for sufficiently long times, the true t 2 scaling is recovered for all cases ((t-t 1 )>100t h in the example of figure 3). In this way, a spurious t 3 scaling range can be obtained for at most half a decade (approximately ¼ decade due to a time shift and another ¼ decade due to an r 1 2 shift). Hence, the issue of the virtual origin appears irrelevant if a t 3 scaling range can be observed for at least a full decade of time. In that case, we may admit the use of the most general form with the virtual origin as a fit parameter (equation (2.3)). The requirements for identifying Richardson scaling in a linear plot of ( ) ' 4/) versus time are similar, as discussed in Appendix A. (blue line, t 1 =0, r 1 =0). The other lines show the exact same data, but plotted applying different virtual origins (t 1 ,r 1 2 ). This leads to a spurious t 3 scaling range over half a decade of time (marked by the thick lines). Dashed lines indicate t 3 power laws for reference. Note that for all cases the correct t 2 scaling is recovered at large times.

Results from numerical simulations
At the time of the review of Salazar & Collins (2009), the DNS of particle pair dispersion in homogenous isotropic turbulence (HIT) was available for Re l up to 650 ). These simulations have not confirmed Richardson scaling beyond any doubt, as also mentioned in their review. Either approximate t 3 scaling was obtained for a single initial separation outside the inertial range (r 0 < 60h), or observed only for very short times, much less than a decade. However, the Reynolds number in numerical simulations has increased thereby expanding the inertial range that can be probed. (<|r(t)| 2 > -r  considered HIT at Re l = 460 and 730, and presented mean square separation, ( ) ' , data for r 0 £ 24h (their figure 2). For r 0 ≈ 4h and t > 10t h , a t 3 regime is observed for almost a decade before the separation reaches the integral scale, while the exponent is larger and smaller for r 0 < 4h and r 0 > 4h respectively. Similar results were obtained by Bragg, Ireland & Collins (2016) at Re l = 582. Assuming these results can be extrapolated to the inertial range (i.e. r 0 > 60h), the exponent is expected to further decrease, away from the predicted value of 3.
In a related paper ), a short time t 2 correction term to the Richardson regime was proposed, which depended on r 0 . However, the data (especially for large r 0 ) did not extend up to times where the correction is negligible in order to have a true t 3 scaling. Furthermore, the correction term was found to be zero only for r 0 ≈ 4h consistent with their 2013 results discussed above. Hence, they refer to r 0 ≈ 4h as the 'optimal choice' for observing t 3 behavior. As remarked before ( §3.1), at this small initial separation, the pair is released within the same small-scale flow structure and not within the inertial range. Moreover, true Richardson dispersion applies to a range of initial separations, as opposed to one specific value of r 0 .
Additionally, , 2013) define a new time scale, Q = ' ( + )/ (2 ), for the end of the initial ballistic t 2 regime, after which a Richardson regime could appear. Here, ' = ' is the second-order Eulerian structure function, and is the longitudinal velocity difference over a distance r. The time scale t t was shown to collapse the end of the Batchelor regime for their data at Re l = 460 and 730 ).
An even higher Reynolds number, Re l = 1000, was achieved by . Moreover, they presented mean square separation data for the inertial range, that is r 0 = 64h and 256h, which are reproduced in figure 4 (blue solid lines) along with the results for the other initial separations (red dashed lines). Note that the t 2 compensated mean square separation is shown here, such that horizontal lines indicate Batchelor t 2 scaling. A Richardson t 3 power law is indicated by a dash-dotted line for reference. Again, a t 3 regime is observed only for r 0 ≈ 4h, while the larger (up to 256h) and the smaller initial separation results approach this line from above and below respectively. However, the results for these other r 0 do not reveal an exact t 3 scaling. Focussing on the inertial range (blue curves), it is seen that after the initial Batchelor t 2 scaling regime, the compensated mean square separation first deceases (implying t b scaling with b < 2) and reaches a minimum. The time scale associated with this decrease appears to be the Batchelor time scale (marked + in the plot). Moreover, the mean square separation at this point is of the order of 4l T (marked by dotted line with circles in the plot), meaning that a significant number of pairs will have separated by more than the thickness of the significant shear layer. These pairs may already feel the effect of the energetic largescale motions (figure 1). Following the minimum, at around t = t t (* in figure 4), the compensated mean square separation increases, but does not reach a t 3 regime. Furthermore, the mean square separation quickly reaches integral length scales. For r 0 within the inertial range, there is half a decade, or less, between t t and the time corresponding to the condition − (0) ' = ' (indicated by dotted line with squares). When the mean square separation increases beyond L 2 , the curves for all initial separations seem to converge to a common point (figure 4). From that common point the onset of a Taylor regime is anticipated, where the influence of the initial separation is lost. The data show that the mean square separation remains dependent on r 0 up to the simulated time, while the extrapolated lines suggest that this dependence persists until the Taylor regime is approached. The r 0 -dependence is a clear violation of the assumptions leading to a Richardson scaling regime for r 0 within the inertial range, as well as for + ≲ ( §3.1). In conclusion, at Re l = 1000 there is still no sign of a true Richardson scaling regime for the inertial range, and large-scale influences appear already at the end of the Batchelor regime when the pair separation is of the order of the significant shear layer thickness. At Re l = 1000, inertial range scaling is observed in the energy spectrum and the secondorder velocity structure function to a good approximation over 1-1.5 decades (e.g. Donzis & Sreenivasan 2010, Ishihara et al. 2016. This is consistent with defining an approximate inertial range as 60h < r < L (≈ 2100h at Re l = 1000), see ( §3.1). However, inertial range scaling in these statistics does not imply inertial range scaling in pair dispersion, i.e. Richardson scaling, as argued in §2.4.
As mentioned, the DNS data reveal a Reynolds number dependence of the mean square separation at intermediate times, which is most notable for r 0 ≳ 16h. Figure 5 presents results from  for r 0 = 16h and 64h as an example. Furthermore, the result for r 0 = 4h is included as a reference. The initial deviation from the Batchelor regime seems to collapse in Kolmogorov scaling. However, in the intermediate range beyond t/t h ≈ 60, the slope, and hence the scaling exponent, is found to change with the Reynolds number for r 0 = 16h and 64h ( figure 5(b) and (c)). Note that the intermediate range is relatively short at Clear deviations from the inertial range dispersion model behavior (non-Richardson dispersion) have been reported for initial separations in the dissipative range, r 0 < h (Scatamacchia, Biferale & Toschi 2012). However, some observations may still be of interest for the inertial range, because they illustrate how large-scales affect dispersion at smaller scales, including the smallest scale. The reported deviations were due to extreme events of rapidly separating pairs and pairs remaining close together for long time (order integral time scale). The separating velocity associated with the former is of the order of the rms velocity, U, which indicates a contribution from the large-scale energetic motions. So, both extremes are (partially) linked to the large scales, either through the time scale or the separation velocity. These large-scale influences at small r 0 can be understood from the shear layer structures that characterize the turbulent strain (Elsinga et al. 2017) and hence the turbulent dispersion. The velocity difference across the most intense small-scale straining structures within the shear layer is of order U (Ishihara, Kaneda & Hunt 2013, Elsinga et al. 2017 meaning that a pair with r 0 < h centered on such a structure separates with U. The pair then maintains that velocity difference for a significant amount of time (order integral time scale) as each particle enters into a different integral-scale flow region on either side of the shear layer ( figure 1(a) pair A). This scenario is consistent with the extreme pair separation observed by Scatamacchia, Biferale & Toschi (2012). Extremely slow dispersion on the other hand can be understood as pairs initially located within the same large-scale and nearly-uniform flow region bounding the shear layers ( figure 1(a) pair B). Within these flow regions the velocity differences are relatively small, and because these regions are large, the particle pairs remain within them for long times (order integral time) without separating much. In these cases of extremely fast or extremely slow separation, the particles remember their initial condition (velocity difference) for very long times violating t 3 scaling law assumptions ( § §2 and 3.1). Long term memory effects have also been reported by  for r 0 £ 24h. In these cases, the fact that the (initial) pair separation is small does not imply that there must be a small-scale eddy structure between these particles. In fact, if their relative velocity is small, it is more likely that they are in the same large-scale uniform flow region. Note that in the kinetic energy spectrum, low velocity magnitude is associated with small scales, but for relative dispersion the velocity difference, i.e. the velocity gradient, matters, whose magnitude is low at large scales. The above structural explanation for extreme dispersion highlights the important contributions of the large scales to the dispersion at small r, which is inconsistent with assumptions underlying the t 3 scaling law ( §2).

Non-classical pair dispersion in a significant shear layer
Here, we present new results showing extreme pair separation. They are obtained for particles released inside the significant shear layer detected by Ishihara, Kaneda & Hunt (2013). The present results are based on the DNS of homogenous isotropic turbulence at Re l = 1131, where the full fluid velocity field has been advanced in time using the DNS code described by Ishihara et al. (2007Ishihara et al. ( , 2016. At the initial time, which corresponds to the time instant discussed in Ishihara, Kaneda & Hunt (2013), 719 fluid tracer particles are distributed randomly within 11 connected subdomains coinciding with the core of the detected significant shear layer. The subdomains are 96.4h ´96.4h ´96.4h in size and they contain the highest boxaverage enstrophy levels within a slice through the significant shear layer. The particles are tracked using the method described by Ishihara et al. (2018), who traced fluid particles and inertial particles in a series of DNS of turbulent flow using cubic spline interpolation for the fluid velocity at the particle position and using a fourth-order Runge-Kutta method for time integration. Here, we are concerned only with the fluid particle traces.
The significant shear layer was observed to survive in visualizations of intense vorticity until at least t/t h = 60, where t = 0 corresponds to the time when the particles were released. During this time, the average strength of the vortices inside the layer decreased slightly. After t/t h » 60, layer deformations were observed. However, a layer-type structure appeared to be maintained between the same large-scale motions for up to at least t/t h » 100. At this point, it is hard to provide an exact number for the layer's lifetime. For the present purpose, it suffices that the layer survives until at least t/t h = 60, because the particles leave the layer well before. This is confirmed by the results presented below. As pointed out before, the lifetime of the individual small-scale vortices inside the layer may be shorter.
From the particles released inside the significant shear layer, 1909, 2696 and 1920 particle pairs are obtained at r 0 = 64h, 128h and 256h respectively. These initial separations are in the inertial range and smaller than the significant shear layer thickness, i.e. 4l T = 264h at the present Reynolds number. While the pairs at r 0 = 64h and 128h are approximately randomly oriented, the pairs at r 0 = 256h are predominantly aligned with the significant shear layer because of confinement effects, which are caused by the separation being very close to the layer thickness. Therefore, the results at r 0 = 256h are affected by orientation.
The mean square separation for the particle pairs released inside the significant shear layer is shown in figure 6. As expected, the mean square separation initially follows Batchelor scaling, which corresponds to a horizontal line in this t 2 compensated plot. The associated separation velocity is around − (0) ' / ' = 32 < , which corresponds to 1.8U. The large, order U, separation velocity is consistent with the velocity difference across the significant shear layer (see Ishihara, Kaneda & Hunt (2013) and §1). Moreover, the separation velocity in the Batchelor regime is relatively insensitive to r 0 within the considered range, which is clearly different from the unconditional result for pairs released anywhere in the flow at the same r 0 and similar Re l (figure 4, blue lines). This also suggests that U is the only relevant velocity scale for the dispersion inside the significant shear layer. In the range 4 < t/t h < 20, the mean square separation follows a ~t 1.2 scaling for r 0 = 64h and 128h (figure 6), which is reminiscent of a t 1 scaling associated with Taylor dispersion. However, it concerns a Taylor-like dispersion at relatively small scales, i.e. − (0) ' ≲ 4 U ' , in contrast to the genuine Taylor regime, which appears at scales much larger than L. In Taylor dispersion the particles move independently, which is also the case inside the significant shear layer. This can be seen from the Lagrangian correlation of the velocity difference X ( + , , ∆ ) ≡ ( + , ) • ( + , + ∆ ) , where is the relative velocity between the particles with the initial separation r 0 . For r 0 = 64h and t/t h < 20, the normalized R L drops to small values (<0.4) within a time delay Dt of 5t h (figure 7(b)), which means that the relative velocity decorrelates quickly and that the particle motions are nearly independent after 5t h . This time scale is small, but not negligible, on the temporal range 4 < t/t h < 20. Moreover, the particle separation is still close to the small-scale coherence length and depends on r 0 . Additionally, Taylor dispersion requires R L (r 0 ,t,0) to be constant, which is approximately the case ( figure 7). Therefore, the dispersion is Taylor-like, but not exactly Taylor. The largely independent motion of the particles is consistent with the structural description of the significant shear layer, where intense small-scale structures are clustered inside a thin layer bounded by large-scale motions ( §1). In the layer, there are no intermediate scales contributing importantly to the dispersion. The small-scale coherence length is between 60h and 120h depending on the definition (Elsinga et al. 2017, see also §3.1). Therefore, the Taylor-like dispersion by uncorrelated small-scale structures appears for separations larger than 60h and smaller than the layer thickness, 4l T . Beyond r ~ 4l T , particles leave the layer and the bounding large-scale motions also contribute to their dispersion, which is therefore no longer Taylorlike. Consequently, the Taylor-like regime is present for r 0 = 64h and 128h, but is not as clearly seen for r 0 = 256h = 3.9l T .
Furthermore, the present observation of a Taylor-like regime appears consistent with the model analysis of Devenish & Thomson (2019). They showed that diffusion dominates pair dispersion after the initial Batchelor regime in case that (i) the constant of proportionality in the second-order Lagrangian velocity structure function (and hence the relative velocity) is very large and (ii) the relative velocity is short correlated in time as assumed in a Markov process description. As shown here, these conditions are satisfied locally within the significant shear layer.
The Taylor X ( + , , ∆ ) (Ishihara & Kaneda 2002). In the Taylor-like regime R L drops to zero quickly, which justifies introducing the approximation ∆ + @] X ( + , , ∆ ) ≈ X ( + , , 0) X , where t L is a Lagrangian correlation time scale for the small-scale motions inside the layer. Furthermore, X ( + , , 0) is approximately constant in the Taylor-like regime. For r 0 = 64h and 4 < t/t h < 20, our results suggest that X ≈ 4 < and X ≈ 1.8 ' , which yields: for the Taylor-like dispersion regime inside the significant shear layer. When compared to the data, relation (3.1) is seen to capture the order of magnitude in the Taylor-like dispersion range 4 < t/t h < 20 (figure 6).
For r 0 = 64h and 128h, other regime changes are observed at t/t h » 30 and 100 when − (0) ' is larger than 4 U ' and is approaching ' respectively (figure 6). The former is associated with pairs leaving the significant shear layer and entering the large-scale motions bounding the layer. The latter is close to the large-scale turnover time, ~130t h at the present Reynolds number, which is the order of magnitude for the lifetime of the significant shear layers ( §1). Therefore, the particles have left the shear layer (t/t h » 30) well before the expected lifetime of the shear layer (>60t h ). Just before the regime changes in figure 6, the peak of the Lagrangian correlation R L is observed to broaden in the direction of Dt, which is evident from the thick contour line breaking away from the vertical axis at t/t h » 20 and 75 in figure 7(a). These rather sudden changes in the peak width signify that distinctly different scales affect the dispersion, which is very different from a gradual local dispersion process where the particles continuously probe the scale given by their instantaneous separation r. Furthermore, a longer-time velocity correlation appears for t/t h > 30, when R L increases with Dt. Though eventually R L is expected to go to zero at very large Dt. This larger-scale influence in the correlation is consistent with particles entering into the large scale, quasi uniform flow regions bounding the significant shear layer at t/t h » 30. Beyond t/t h » 100, the mean square separation increases rapidly, but this result cannot be evidence for a Richardson scaling regime, because the temporal range is too short and the average separation is partially outside the inertial range (see criteria in §3.1).
Based on these new results, it is concluded that pair dispersion inside the significant shear layer is highly non-classical, because it is dominated by non-local dispersion. This non-locality is evident from (i) the velocity scale governing the dispersion, which is U even at relatively small spatial scales, and (ii) the Taylor-like dispersion driven by small-scale motions, which appears for intermediate r 0 . The latter is very different from the Batchelor and Richardson regimes commonly associated with small and intermediate time dispersion. Furthermore, we speculate that the Taylor-like dispersion inside the significant shear layers contributes to (or possibly causes) the dip in the unconditional t 2 compensated mean square separation ( §3.2, figure 4), because they occur over the same temporal range.
The question remains, how important is the significant shear layer dispersion to the overall (unconditional) mean square separation? Its relative contribution is given by the product of the relative magnitude of ( ) − (0) ' and the volume fraction occupied by significant shear layers, assuming that the particle pairs are randomly distributed over space. The volume fraction occupied by significant shear layers is estimated to be ~10% at Re l ≈ 1000 (Elsinga, Ishihara & Hunt 2020). Initially, the mean square separation inside the layer is almost an order of magnitude larger than the unconditional result for r 0 = 64h and Re l ≈ 1000 (see figure 6). At later times, the difference is still at least a factor 2-3. Multiplying these factors by the 10% volume fraction reveals that the significant shear layers' relative contribution to the overall mean square separation is significant and order one. Moreover, the significant shear layers will also contribute to the dispersion of the particles initially outside the layer and later entering into the layer, which is not accounted for here. Therefore, the present estimate represents a lower bound for their relative contribution.
The significant shear layer contribution remains significant as the Reynolds number increases. The volume fraction of these layers decreases according to ( U L 2 /L 3 ) ~ Re l -1 (Elsinga, Ishihara & Hunt 2020), while the layer's relative magnitude of ( ) − (0) ' scales according to (U/u h ) 2 ~ Re l 1 , since the dispersion inside the layer and the overall dispersion are governed by U and u h respectively. Therefore, the product of these two is independent of Re l meaning that the layer's relative contribution to the unconditional mean square separation is order one at all Reynolds numbers.

Results from kinematic simulations
Richardson scaling has also been examined using kinematic simulations, where the particles are tracked through a random velocity fields with the spectral properties of a turbulent flow. However, as argued by Thomson & Devenish (2005), any appearance of a t 3 scaling range is accidental and the Richardson constants thus obtained are generally much smaller than those from experiments and numerical simulations of actual turbulence (e.g. Ott & Mann 2000. Their main explanation for the disparity is that, in kinematic simulations, the small-scales are not advected by the large scales as in actual turbulence, resulting in the particle motion being affected by the small-scales for too short times. However, Fung et al. (1992) in their kinematic simulations compensate for large-scale advection effects. Such compensation is non-trivial and introduces assumptions. These assumptions are essentially the same as those used in the theories discussed in §2, that is, the large-scales (larger than r) do not contribute and hence the dispersion is governed by eddies at scales close to the particle separation, r. Moreover, the phases of the different wavelengths are random in kinematic simulations, which is similar to the theories in the sense that the scales are considered to be independent. Therefore, it is not surprising that the kinematic simulations with large-scale advection compensation yield the same t 3 scaling law as the theories (e.g. Fung et al. 1992). It proves that the theory is internally consistent, but it cannot validate the theory, because similar simplifying assumptions are used. It is, in fact, these assumptions that are being challenged by the lack of an observable Richardson scaling regime in DNS data ( §3.2) and by the results in §3.3 showing that the large scales are significantly contributing to the dispersion at r << L.
As mentioned, the large-and small-scale modes are independent in kinematic simulations by construction. This leads to a profoundly different turbulent flow structure, which affects the particle dispersion. It was already observed by Fung et al. (1992) that the length of the intense vorticity tubes was underestimated by their artificial velocity fields. The same effect is visible in the visualizations of the intense vortices and the large-scale uniform flow regions in Elsinga & Marusic (2010), their figures 14 and 15. Not only are the vortices much shorter, also the shape of the large-scale structure and the spatial distribution of the small scales are different. In actual turbulence, the large-scales appear elongated and the intense small-scale vortices form clusters along their edges. However, in the artificial field, the large scales are blob-like and the small-scales are distributed uniformly over the entire domain. The dispersion of particle pairs in these different velocity fields is illustrated in figure 8. For the purpose of particle tracking both fields were treated as stationary. After the initial Batchelor regime (t/t h > 2), the scaling exponents (i.e. the slopes in figure 8) are observed to differ, especially for the larger initial separations r 0 /h = 65 and 130. The coherence of the velocity field in the actual DNS is found to slow down the pair dispersion as compared to the random field. Note that this illustration concerns a low Reynolds number. When significant shear layers appear (Re l > 250), the spatial clustering of intense small-scale structures, i.e. intermittency, is much more pronounced, increasing the differences with respect to a random velocity field.
To summarize, kinematic simulations with large-scale advection compensation cannot be used to validate the Richardson scaling law. Moreover, the artificial velocity fields do not accurately reflect the turbulent flow structure and the pair dispersion. Because of these issues, we will not consider the kinematic simulations in further detail here. Figure 8. Kinematic simulations of pair dispersion in a random and DNS velocity field with the same turbulent kinetic energy spectrum (Re l = 170). The evolution of the t 2 -compensated mean square separation is shown for three initial separations, r 0 /h = 5, 65 and 130 (increasing upwards). The DNS field was provided by Dr. A.A. Wray (CTR 2002, private communication). The random field was generated using the method of Rogallo (1981).

Results from experiments
Unfortunately, laboratory experiments have advanced very little over the last decade as far as pair dispersion is concerned. The measurements of Ouellette et al. (2006) and  still represent the highest Reynolds numbers achieved in an experiment, i.e. Re l = 815. These flow conditions have since been surpassed by DNS reaching Re l = 1000 ( §3.2). However, the mentioned experimental studies presented data for a wide range of initial separations within the inertial range up to approximately t =50t h , after which the scatter in the data increases. Only a Batchelor t 2 scaling regime was observed, which has later been attributed to limited observation time (<t t , Bourgoin 2015). Also, it has been suggested that the finite measurement volume may play a role in these experiments (Salazar & Collins 2009). However, the experimental results appear generally consistent with DNS, which also showed that for t < t t the pair separation follows t b scaling with b £ 2 ( §3.2).
For low Reynolds number, Re l ≈ 100, Ott & Mann (2000) showed a short t 3 scaling range when plotting their data using a virtual origin. This range extended over less than half a decade in time, which is not yet convincing for reasons explained in §3.1. Moreover, the t 3 scaling is observed over a temporal range that partially overlaps with the Batchelor scaling range, i.e. it extends below t = t B , suggesting that t 2 and t 3 scaling are valid simultaneously. Furthermore, their results are for an initial separation outside inertial range, i.e. r 0 ≈ 10h. The short t 3 scaling range was also observed in the results from a large-scale DNS at Re l = 283 covering a wider range of initial separations outside the inertial range, r 0 £ 32h, but a virtual origin had to be used (Ishihara & Kaneda 2002).
Based on their experiments at Re l ≈ 180, Shnapp & Liberzon (2018) found that pairs with low initial separation velocity magnitude, i.e. small dr/dt, separated according to a t 3 law, whereas pairs with high initial separation velocity separated following a t 2 scaling. However, these observations were made outside the inertial range, that is, r 0 £ 15h and for time scales smaller than the eddy turnover time at r 0 . Nevertheless, it reinforces the notion that the mean square separation represents an average over widely different separation behaviors. The observation that a subset of pair dispersions follows a t 3 law is interesting, and remains to be explained. We suggest that it might be related to the orientation of the separation velocity vector relative to the initial separation vector r 0 . Hereto, consider two tracer particles in a frame of reference attached to the bottom particle (figure 9(a)). In this frame, the velocity of the other particle, V, corresponds to the separation velocity vector. When V is assumed constant and normal to r 0 (V 1 in figure 9(a)), then dr/dt is initially zero, or small when allowing for a slight deviation from the normal condition. The resulting evolution of the mean square separation − (0) ' is given in figure 9(b), where = ( ) . The plot shows that after the initial Batchelor t 2 regime, the separation is accelerated as r tilts in the direction of V 1 . During this stage, the dispersion closely follows a t 3 scaling law. At later times (beyond the scale of the plot) t 2 scaling is recovered when the tilting is complete and r approximately aligns with V 1 . On the other hand, if V and r 0 align (V 2 in figure 9(a)), then the initial dr/dt is significant and a t 2 scaling is found throughout ( figure 9(b)). The present results are qualitatively consistent with those of Shnapp & Liberzon (2018), which suggests that the initial orientation of the relative velocity is important. We further comment that identical results are obtained for both orientations of V when the vector form − (0) ' is used to quantify the mean square separation, instead of the scalar form − (0) ' (figure 9(b) and Shnapp & Liberzon 2018). Therefore, also the metric used to quantify the mean square separation is important.
Many of the more recent experiments have considered inertial particles, and the effects of walls or thermal convection, which is different from the present problem of fluid tracers in homogeneous isotropic turbulence. But also in these cases, Re l is typically in the range of 250-500 (e.g. Dou et al. (2018) for a Richardson scaling regime. However, there is some evidence to suggest that also in Rayleigh-Bénard convection a t 3 scaling range appears for r 0 ≈ 4h with other r 0 approaching from above or below (Liot et al. 2019). Pair dispersion experiments at much higher Reynolds numbers would be of considerable interest. New facilities can generate grid turbulence at Re l ~5000 (Küchler, Bewley & Bodenschatz 2019), albeit grid turbulence is decaying, but they have not yet been used to measure pair dispersion.
High Reynolds numbers are also achieved in environmental flows, but the experimental conditions are typically poorly controlled and the flow is often inhomogeneous and affected by buoyancy (Salazar & Collins 2009), which introduces considerable uncertainty. Therefore, these experiments are not considered here.

Summary
At present, there is no clear and indisputable evidence for a Richardson t 3 scaling regime in the inertial range. Only for the case of r 0 ≈ 4h did a t 3 scaling range appear, which could be coincidental. A genuine Richardson regime would have resulted in t 3 scaling for a range of initial separations within the inertial range (r 0 > 60h), not just for one specific case of r 0 ≈ 4h outside this range. Other initial separations were found to approach the r 0 ≈ 4h result from above or below, but they never reached a true t 3 scaling regime.
Despite a considerable increase in the Reynolds numbers achieved by DNS, the separation of scales remains limited. At Re l = 1000, the mean separation is of the order of the significant shear layer thickness, i.e. − (0) ' ≈ 4 U ' , at the end of the initial Batchelor regime. It means that some pairs are dispersed by large-scale motions well before a Richardson regime might develop (e.g. figure 1(a)

pair A).
New results for pairs released inside a significant shear layer confirmed this conjecture. They revealed that the separation velocity is of the order of U, and that the dispersion is Taylorlike for r 0 > 60h and 4 < t/t h < 20. This suggests that the large scales contribute to the dispersion through the velocity scale, while the independent small-scale motions inside the layer govern the particle motions. There does not seem to be an important role for intermediate scales, which strongly questions the validity of the assumptions leading to the prediction of a Richardson scaling regime. Furthermore, scaling analysis showed that the non-local contribution from the significant shear layers to the overall mean square separation is significant at all Reynolds numbers.
Finally, the metric used to quantify the mean square separation can have important consequences for the observed dispersion behavior. For example, the introduction of a virtual origin may introduce a spurious t 3 scaling range over approximately half a decade of time. Also, there can be profound differences between the vector and the scalar definitions of the mean square separation. These issues sometimes complicate a straightforward comparison of results. However, they do not explain the fact that a Richardson scaling regime has not been observed in the studies discussed.

The intermediate range as a transition region
Here, we propose an alternative view of pair dispersion in the intermediate range, which is consistent with the available data. Starting from the well-established initial Batchelor regime and the ultimate Taylor regime, the intermediate range is regarded simply as a transition between these two regimes. This is similar to considering the intermediate range in the velocity structure functions as a matching region ( §2.4). Because we do not use inertial range assumptions, the results of our model apply to wide range of r 0 , not limited to the inertial range. Also, the present model is consistent with non-local contributions to the dispersion.
The initial pair separation is ballistic meaning that the particles maintain their initial velocity for sufficiently short time-scales. This may be relaxed by assuming that the mean square separation velocity, ( , ) ' , is constant, which is valid for longer times (e.g. Goudar & Elsinga 2018) and leads to the so-called Batchelor scaling regime (Batchelor 1950): where is the relative velocity between the particles. To compute the three-dimensional second-order structure function, ' , we use the functional form given in Donzis & Sreenivasan (2010), which is given by: where c B = 0.076. The Batchelor regime extends up to t = 3t B in our model. Note that the Batchelor regime is Reynolds number independent in Kolmogorov scaling. Therefore, it seems reasonable to assume that the end of this regime scales accordingly, and hence is proportional to t B independent of the Reynolds number. The pre-factor is empirical and estimated from figure 4, where we have included the part revealing a slight dip into the Batchelor regime. Moreover, 3t B is a close approximation of the time scale t t proposed by  for the end of the Batchelor regime, when r 0 > 60h and using (4.2). At large time, the pair separation is of order L or larger, which implies that the motions of the particles are uncorrelated and independent of the initial separation r 0 . This is known as the Taylor regime, in which the mean square separation grows as (Taylor 1922, Pope 2000: Define the onset of the Taylor regime at t = cT L , where c ≈ 8 is a constant independent of r 0 and the Reynolds number. Here, the simplifying assumption is that the Taylor regime starts at the same value of the mean square separation, and hence time, for all r 0 . This seems reasonable given the convergence observed in figure 4 (and also in figure 11(b)). Furthermore, the onset of the Taylor regime is assumed to collapse when using integral scales, which is consistent with the general scaling in this regime. The value of c was estimated from the intersection point when extrapolating the data for different r 0 (see figure 4). Furthermore, using X = 2 ' /( X ) (Tennekes 1979, Pope 2000, we obtain for the onset of the Taylor regime: The expression used for T L can be understood from the energy balance, where the average production-rate scales with large-scale quantities (i.e. ' / X ) apart from a constant and is equal to the average dissipation-rate e (see also §2.4). Lacking a better estimate, the value of D L has been determined from the inertial range of the second-order Lagrangian structure function as D L » C 0 , where C 0 = 6.0 is the Lagrangian Kolmogorov constant (Lien & D'Asaro 2002, Sawford & Yeung 2011. However, this inertial range approach is considered reasonable for second-order moments of velocity as explained in §2.4. Rewriting equation (   The present model (figure 10) illustrates how the mean square separation slowly approaches t 3 scaling in the limit of n → ∞, but never reaches it in any real flow. Note that t 3 scaling corresponds to a horizontal line in this t 3 compensated plot. Furthermore, it is worth pointing out that the high Reynolds number case shown in figure 10 is representative of atmospheric conditions with Re l ~ 10 4 , where the onset of the Taylor regime is anticipated at cT L /t h ~ 10 4 . So, even in these highly turbulent flows, Richardson scaling is not returned by the model. There is, however, one exception. For r 0 /h ≈ 4, the end of the Batchelor regime and the start of the Taylor regime are on the same vertical coordinate, which results in t 3 scaling. This is consistent with DNS revealing t 3 scaling only for r 0 /h ≈ 4 , Bragg, Ireland & Collins 2016, see also §3.2). Moreover, as noted by , the compensated relative dispersion will approach the t 3 scaling from below for r 0 /h ≲ 4 and from above for r 0 /h ≳ 4, which is reproduced by the model. The rate at which t 3 scaling is approached is presented in figure 11(a) for different r 0 . The plot shows the power law exponent b versus the time corresponding to the onset of the Taylor regime, cT L /t h , which varies linearly with Re l . The results in figure 11(a) confirm our discussion of figure 10 by showing that b = 3 for r 0 /h ≈ 4, while, at finite Re l , larger and smaller r 0 yield b < 3 and b > 3 respectively. Only in the limit of n → ∞ is b = 3 independent of r 0 . Furthermore, figure 11(a) compares the model predictions (lines) with power law fits to actual DNS data (circles). These fits were obtained in the middle of the intermediate regime (3t B < t < cT L ) where the slope of the mean square separation was approximately constant in the logarithmic plot. These fits are illustrated in figure 11(b) for the case of Re l = 650. Apart from the lowest Reynolds number, the model agrees with the actual DNS data to within the accuracy of the fit. The deviations at Re l = 140 (cT L /t h = 110) may be understood from the fact that the straining motions are underdeveloped for Re l < 250 (Elsinga et al. 2017), which affects the dispersion. At Re l = 240 (cT L /t h = 180), r 0 /h = 256 corresponds to r 0 » L, which means that r 0 is not in the intermediate-r range. In that case, it is possible that there is no intermediate regime and the Taylor regime directly follows the Batchelor regime. This explains the deviation of this data point from the model prediction. All remaining data points, and especially those within the intermediate-r range (r 0 /h = 64), seem consistent with the model. While the model results slowly approach t 3 scaling with increasing Reynolds number ( figure  11(a)), there is an important fundamental difference between the present model and the theories leading to the Richardson scaling law ( §2). The theories assume local dispersion and invoke inertial range assumptions, where the dispersion is independent of the large (energetic) scales as well as the dissipative scales. By contrast, the present model uses the properties of the large scales, i.e. U and L, (equation 4.3) and the small time-scales, i.e. u h , h and r 0 , (equation 4.1) and assumes a transition region. This introduces a r 0 and Reynolds number dependence in the scaling exponent for the intermediate range. These dependencies are observed in the DNS data ( §3.2), but are not retained in K41-based theory as discussed in §2.4. Basically, the present model yields an overlap region for each r 0 separately. Further note that the classical results for ' and C 0 used here, can also be obtained without inertial range assumptions (see §2.4). The present model explains why t 3 scaling is found for just one specific initial condition outside the inertial range (i.e. r 0 /h ≈ 4 independent of Re l ), and also captures the departures for r 0 /h ≲ 4 and r 0 /h ≳ 4, which the inertial theories ( §2) do not. Furthermore, the 'Richardson constant' is g' ≈ 0.56 for r 0 /h ≈ 4 (equation (4.5)), which is consistent with the reported values in the literature (g = 0.55 -0.57, . We should point out that the value of r 0 for exact t 3 scaling and the value of g' are not predicted from first principles. They partially depend on observations related to the starting point and the end point of the intermediate range. Therefore, consistency between the resulting g' and the reported g is expected, even if it is not enforced. Furthermore, g', and to a lesser extent r 0 for which t 3 scaling is obtained, is sensitive to the uncertainties in C 0 (Lien & D'Asaro 2002) and c, see equation (4.5). Nevertheless, the present estimate for the onset of the Taylor regime, i.e. c ≈ 8, seems to return a reasonable value for g'.
The model introduces two empirical coefficients, i.e. c and the pre-factor associated with t B , which means that at most two of the model outcomes can be fixed by these coefficients. For sake of argument, let's say the fixed outcomes are the value of g' and the value of r 0 for exact t 3 scaling. The former depends on the coefficient c and the latter depends on both coefficients. Once the coefficients are set, the exponents for the other r 0 and their Reynolds number dependencies (figure 11) follow directly from the model and cannot be adjusted. Therefore, the most significant results from the model are (i) the Reynolds number dependence of the intermediate range for different r 0 and (ii) the finding that there is only a single r 0 for which exact t 3 scaling is obtained.
Only in the limit of / + → ∞ and / → ∞ (at n → ∞) can a r 0 dependence be neglected in the present model, because r 0 and h become indistinct. In that case, a Kolmogorov scaling regime at small t, independent of r 0 , is matched to an integral scaling regime at large t, which yields the Richardson scaling consistent with K41, but without having to assume local dispersion within an inertial range where all flow properties depend only on e and r.
Presently, a power law has been assumed for the intermediate range, and details, such as the dip in the t 2 compensated mean square separation (figure 4) and the smooth connections to the Batchelor and Taylor regimes, are ignored. A possible alternative to the power law assumption could be a (slightly) curved line connecting the Batchelor and Taylor regimes in figure 10.
Further work characterizing the shape of the connection is clearly warranted. However, including this kind of detail does not fundamentally alter the physical picture presented here, where the intermediate range is a transition region. It will yield a similar slow r 0 -dependent approach of t 3 scaling when it is being stretched due to an increasing Reynolds number, i.e. increasing separation between Batchelor and Taylor regimes.
As commented before, dispersion at intermediate times is highly complex, because r is widely distributed. Therefore, the pair dispersion is affected by many scales, or perhaps even the full range of scales, at any one time. The present model for the mean square separation incorporates some of these effects by considering the intermediate range as a transition region, where the balance gradually shifts from a state governed by r 0 and Kolmogorov scaling (i.e. Batchelor regime) to a state fully described by the large, energetic scales (i.e. Taylor regime). This transition scenario is consistent with the existence of significant shear layer structures, where only small and large scales contribute to the dispersion (figure 1(a) and §3.3). At intermediate times, we expect that a blend of these two contributions will determine the statistics.
Recently, Liot et al. (2019) have shown that the ballistic approach of Bourgoin (2015) ( §2.2) can yield a r 0 -dependent intermediate range consistent with observations and the present model. A t 3 scaling was obtained for r 0 /h ≈ 4, while other initial separations approached t 3 scaling from above or below. This was achieved by replacing the inertial range model S 2 ~ (er) 2/3 with an actual S 2 , which included large-and small-scale ranges. It seems to confirm that all scales play a role in the intermediate range. Their results are for a single Reynolds number, Re l ≈ 75, and r 0 /h £ 10. It would be of interest to extend them to larger r 0 and to higher Reynolds numbers. The Lagrangian Stochastic Models of Borgas & Yeung (2004) and Devenish & Thomson (2019) also yields a t 3 scaling for r 0 /h ≈ 4, with larger r 0 approaching from above, when using the actual probability distribution of the separation velocity instead of an inertial range model. This distribution was obtained directly from DNS at Re l = 390 (Devenish & Thomson 2019) or modelled after DNS results at Re l = 90 -230 (Borgas & Yeung 2004) and included dissipation range effects and longer tails. The latter are associated with extreme separation velocities (intermittency) and non-local effects. However, model parameters had to be adjusted for each Reynolds number in order to obtain a good match between the model and the DNS in terms of the mean square separation and the value of g at r 0 /h ≈ 4 (Borgas & Yeung 2004). This appears consistent with Sawford & Yeung (2010) showing that g strongly increases with Re l when using constant model parameters. The Reynolds number dependence of the model parameters remains to be explained. Therefore, the ballistic and Lagrangian stochastic modelling frameworks can give a realistic r 0 dependence in the intermediate range if non-inertial range behavior is included. It supports our assertion that the dispersion in the intermediate range should not be understood as inertial and local. The intermediate range is perhaps best seen as a transition region where non-local dispersion is important. A significant example of such non-local dispersion is given by the large-scale shear layers ( §3.3).

Conclusions
The Richardson scaling law for the mean square separation of a tracer pair has been critically assessed. The common theories predicting this t 3 scaling regime assume that only the eddies at the scale of the (mean) pair separation contribute to the dispersion. This is referred to as local dispersion. Moreover, it is assumed that the pair separation is within an inertial range, where the pair separation is not affected by the small dissipative scales and the large energetic scales. It is believed that the increasing scale separation with increasing Reynolds number allows for such an inertial range to eventually develop. However, the assumption ignores any phase dependence between the scales, that is, any structural organization of the turbulent flow. Particularly relevant in this context are the significant shear layers, which develop at high Reynolds numbers. As shown in §3.3, particle pairs within these layers are initially transported by the small scales, but when their separation increases over time, they leave the small-scale structures and almost immediately enter into the large energetic flow regions bounding the layer (as illustrated for example by pair A in figure 1(a)). An inertial range contribution is, therefore, absent around the significant shear layer. This results in a Taylor-like dispersion inside these layers, which is very different from the classical dispersion regimes. Importantly, these layers provide a significant, and non-local, contribution to the overall mean square separation at all Reynolds numbers, because the particle separation velocity is of the order of U inside the layer. Additionally, results reported in the literature show that the separation, r, is broadly distributed at intermediate times when the Richardson scaling law is expected to apply. It implies that some pairs have entered the small or the large scales (outside the inertial range). Consequently, all scales contribute to the dispersion at intermediate times, not just the intermediate / inertial scales. Furthermore, there is increasing evidence to suggest that even the dissipative scales contain important large-scale influences ( §2.4), which would mean that the fluid motions, and hence tracer motions, are affected by the large energetic scales at all separations. These observations raise doubt on whether the assumption of local dispersion within an inertial range is justified.
Cases where a t 3 scaling has been observed do not support the theory for two main reasons. First, the t 3 scaling was observed over too short temporal ranges (less than a decade), which leaves open the possibility that the scaling is artificially introduced by the use of a virtual origin. Secondly, t 3 scaling was found exclusively for an initial separation r 0 /h ≈ 4, which is inconsistent with the theory. Specifically, the theory applies to r (including r 0 ) in the inertial range and it predicts t 3 scaling for a range of r 0 , as opposed to a single ('lucky') value for r 0 outside the inertial range. Extensions of the theory to + ≲ have been suggested in the literature, but this does not explain the available data either. Therefore, it is concluded that the classical theory relies on debatable assumptions and, so far, lacks evidence.
As an alternative to the current theories, we proposed in §4 that the intermediate regime is simply the transition region from the initial Batchelor regime to the ultimate Taylor regime. This is consistent with the observed non-local contribution to the dispersion, because the inputs contain only small-time-scale and large-scale turbulence properties. The simple transition model explains why there is only a single r 0 ≈ 4h for which there is a true t 3 scaling and why smaller/larger r 0 approaches t 3 scaling from below/above, as observed in actual data. The inertial range and local dispersion theories discussed in §2 do not explain this behavior. The 'Richardson constant' returned by the transition model is 0.56 for this t 3 regime at r 0 ≈ 4h, which is consistent with the observed value in DNS. Furthermore, the model prediction of the power law exponent for the intermediate regime is consistent with the presently available dispersion data from DNS at finite Reynolds number. Only in the limit of infinite Reynolds number does the model yield a t 3 scaling regime for r 0 << L. In that case, the t 3 regime is an asymptotic state, which does allow for significant non-local contributions to the dispersion.
The intermediate range is relatively short within the presently available data (Re l ≲ 1000), which introduces some uncertainty when extrapolating the results to higher Reynolds numbers. Accurate data at Re l ~ 10 5 seems desirable to probe the intermediate range in Lagrangian statistics (e.g. Lien &D'Asaro 2002 andSawford &Yeung 2011). However, applications of turbulent dispersion are often found at lower Re l , which makes finite Reynolds number theories of important interest, in addition to the infinite Reynolds number limit.

Acknowledgement
We thank the referees for their comments, which helped to improve the manuscript. This work used computational resources of the K computer / the supercomputer Fugaku provided by RIKEN, and the Oakforest-PACS in the Information Technology Center, the university of Tokyo, through the HPCI System Research Project (Projects ID: hp200124 and hp210164). This work also used computational resources provided by the Information Technology Center, Nagoya University, and Research Institute for Information Technology, Kyushu University, through the HPCI Research Project (Projects ID: hp190084 and hp200042) and the JHPCN Joint Research Project. TI was supported in part by JSPS KAKENHI Grant Number 20H01948 and MEXT as "Program for Promoting Researches on the Supercomputer Fugaku" (Toward a unified view of the universe: from large scale structures to planets). We thank Dhawal Buaria and PK Yeung for providing their data at Re l = 390, 650 and 1000, which was generated under NSF Grant No. CBET-1235906.
Declaration of Interests. The authors report no conflict of interest.

Appendix A. Richardson scaling in linear plots
Here, we briefly discuss the linear plot of ' 4/) versus time, where a Richardson scaling regime appears as a straight line. Such plots have been used in the past, for example by Ott & Mann (2000) and Ishihara & Kaneda (2002). It is shown that the condition for positively identifying a Richardson scaling regime is similar to that for the logarithmic plot.  provide ' at Re l = 730. Figure 12 shows their result for r 0 = 24h, which is the largest initial separation available, and hence closest to the inertial range. In the conventional logarithmic plot (figure 12(a)), the data approaches t 3 scaling from above (see also §3.2). In the range t/t h > 30, a fit yields '~'.o , which, furthermore, is consistent with our model at the same Reynolds number ( figure 11(a)). This implies ' 4/)~+.pq .
In figure 12(b) the same data is shown in a linear plot of ' 4/) versus time (black line). Furthermore, the blue and red curves show power laws with exponents 1 and 0.9 respectively. The latter is a better fit to the data in the range 30 < t/t h < 130. This temporal range covers almost an order of magnitude, as is suggested for a reliable power law estimate in §3.1. Note that t/t h < 20 = t B /t h is associated with the Batchelor scaling range for r 0 = 24h, so we do not expect a good fit there. The range 70 < t/t h < 130 is too short to make a distinction between the two power laws. Any smooth curve is piecewise linear over a short range, but that does not mean that the true exponent is equal to 1. Importantly, the exponent 0.9 is consistent with the result from the logarithmic plot, i.e. ' 4/)~+.pq . Therefore, power laws should be fitted over sufficiently range ranges in a logarithmic plot (as discussed in §3.1) or a linear plot. When this condition is satisfied, the results from the different plots are consistent.  (<r 2 >/ 2 ) 1/3 data obtained from smaller bins. The consistency in the results shows that meaningful results can be obtained when using binning. However, one should be aware of some minor quantitative differences between datasets, which results in some uncertainty. For example, at Re l ≈ 650, the intermediate-range exponent for a nominal r 0 /h = 4 was estimated to be 2.9 for the data presented by  using binning, while it was 3.1 for the data presented by  without binning (estimated from their figure 4). The lower exponent in the former may be explained by considering that the effective r 0 /h was approximately 6 (see above) and that the exponent decreases with increasing r 0 at a given Reynolds number. Therefore, approximate signs were used to indicate the condition for t 3 scaling, i.e. r 0 ≈ 4h. The uncertainty is of the order of the Kolmogorov length scale, h, which is inferred from the finest available resolution in r 0 and the mentioned differences between the nominal and the effective r 0 .