On non-uniqueness of the mesoscale eddy diﬀusivity

Oceanic mesoscale currents (‘eddies’) can have signiﬁcant effects on the distributions of passive tracers. The associated inhomogeneous and anisotropic eddy ﬂuxes are traditionally parametrised using a transport tensor ( K -tensor), which contains both diffusive and advective components. In this study, we analyse the eddy transport tensor in a quasigeostrophic double-gyre ﬂow. First, the ﬂow and passive tracer ﬁelds are decomposed into large- and small-scale (eddy) components by spatial ﬁltering, and the resulting eddy forcing includes an eddy tracer ﬂux representing advection by eddies and non-advective terms. Second, we use the ﬂux-gradient relation between the eddy ﬂuxes and the large-scale tracer gradient to estimate the associated K -tensors in their entire structural, spatial and temporal complexity, without making any additional assumptions or simpliﬁcations. The divergent components of the eddy tracer ﬂuxes are extracted via the Helmholtz decomposition, which yields a divergent tensor. The remaining rotational ﬂux does not affect the tracer evolution, but dominates the total tracer ﬂux, affecting both its magnitude and spatial structure. However, in terms of estimating the eddy forcing, the transport tensor prevails over its divergent counterpart because of the signiﬁcant numerical errors induced by the Helmholtz decomposition. Our analyses demonstrate that, in general, the K -tensor for the eddy forcing is not unique, that is, it is tracer-dependent. Our study raises serious questions on how to interpret and use various estimates of K -tensors obtained from either observations or eddy-resolving model solutions.


Introduction
The focus of this study is on material transport by mesoscale eddies, which are defined here as geostrophic currents with length scales shorter than approximately 200-500 km. These currents have a significant influence on the distributions of various oceanic tracers. Eddy tracer transport has been conventionally quantified by turbulent eddy diffusionthis study describes the lateral eddy-induced tracer fluxes using an inhomogeneous and time-dependent, two-dimensional K-tensor. We call it the 'transport tensor' because it parametrises both diffusive and advective effects. From a practical point of view, the eddy transport tensor can be used to represent eddy-induced fluxes in ocean models that either completely miss the mesoscales or only partially resolve them. The approach is based on the assumption of the flux-gradient relation: the eddy tracer flux is proportional to (minus) the large-scale tracer gradient. It is implicitly assumed that the transport tensor is uniquely determined by the turbulent flow and stratification, and is independent of the tracer field itself. Relying on this assumption, one can then obtain meaningful transport tensor estimates using a variety of Eulerian and Lagrangian methods. If, however, the transport tensor is not unique for each physical flow field and is instead a function of the tracer field, the utility of the model becomes questionable. This non-uniqueness is the focus of the present study.
Previous studies have accumulated significant evidence of the inherent complexity of the eddy transport tensor. Observation-based estimates exhibit strong dependence on depth and geographical location (Lumpkin, Treguier & Speer 2002; HSSB20 used a quasigeostrophic (QG) double-gyre model with a spatial filter -as opposed to a Reynolds average -to separate the flow and tracer field scales. The study considered the eigenvalues of the symmetric (i.e. diffusive) part of the transport tensor and found polar, i.e. opposite-signed, eigenvalues to be a dominant feature at all depths. The amplitudes of the eigenvalues were typically two orders of magnitude smaller after removal of the rotational part of the eddy tracer flux, owing to the dominance of such fluxes (Marshall & Shutts 1981). In general, the transport tensor considered by HSSB20 was tracer-dependent. The aim of the present study is to discuss this dependence in depth.
In addition to the eddy tracer flux, non-advective eddy terms, which are filtered out in a Reynolds decomposition, remain in the large-scale tracer equation owing to our spatial filtering, and thus need to be parametrised. One way of doing this is to absorb the terms into the flux-gradient relation so that the resulting K-tensor corresponds to the total eddy term. We diagnose this new tensor and complete the discussion of the tracer dependence initiated in HSSB20.
The paper is organised as follows. In § 2, we describe the tracer experiments, which are set up in a double-gyre QG model. Section 3 discusses the numerical accuracy of the methodology, the effects of the rotational flux and the tracer-dependence of K-tensors under multiple circumstances. Finally, we conclude and discuss the direction of future studies in § 4.

Problem formulation
In this section, we outline the ocean model and the K-tensors on which this study focuses. We describe the three-layer QG model in which passive tracers are advected in § 2.1. In § 2.2, we introduce the scale-aware flow decomposition, the K-tensor inversion method and K-tensor (K f ) for the eddy tracer flux. In § 2.3, we describe the Helmholtz decomposition used to remove the rotational flux and introduce the K-tensor (K div ) for the divergent part of the eddy tracer flux. In § 2.4, we apply the flux-gradient relation to non-advective eddy terms and introduce the K-tensor (K g ) for these terms. Finally, we present the details of the experiments in § 2.5.
2.1. Ocean model A three-layer QG model for double-gyre, mid-latitude circulation, driven by wind forcing, is configured in a square basin with rigid lateral boundaries and flat bottom topography. The governing equations for the potential vorticity (PV) anomalies q l are  The wind forcing is chosen to be asymmetric with respect to the middle latitude of the basin to avoid artificial symmetry of the gyres: 3) The notation and the values of other parameters are listed in table 1, and readers can refer to HSSB20 for the model details.
The velocity u l = (u l , v l ) T is given by On the lateral boundaries, we use a partial-slip boundary condition given by where α = 120 km. Equations (2.1)-(2.4a,b) are solved using the CABARET scheme (Karabasov, Berloff & Goloviznin 2009) on a uniform Cartesian grid of size G = 1025 × 1025 with grid spacing Δ x = Δ y = 3.75 km. The potential vorticities are saved at each grid point every day for 183 days.

Tracer inversion and the K f tensor
In this section, we introduce the scale-aware spatial filtering to decompose the tracer equation and describe the inversion method by which we can obtain a K-tensor for quantifying the eddy tracer flux. Consider the advection-diffusion equation on a fine grid: where C is the tracer advected by the non-divergent flow u, F represents tracer sources or sinks and ν is the small-scale diffusivity. The tracer evolution equation has the same 920 A32-5 form in all three layers, so we have omitted the layer subscript for brevity. A common way to obtain a governing equation for the large-scale tracer is to decompose the velocity and tracer fields into large-scale/small-scale components by a localised filtering: Substituting the decomposed quantities into (2.6) and filtering the equation on the large scales in space yields: where the eddy terms ∂C /∂t, ν∇ 2 C and F are filtered out. Here the filtering is assumed to commute with the divergence operator. To model (2.9) on a coarse grid, the small-scale variations under the second divergence operator need to be parametrised by some large-scale quantities. We denote the divergent term as the eddy forcing of C, and label (2.10) as the first expression of the eddy forcing. Here, E (1) can not be directly parametrised by the flux-gradient relation because it is not in the form of the divergence of an eddy flux. Additionally, the physical meaning of this expression is unclear. The issue can be solved by choosing a suitable filtering such as the Reynolds averaging. However, the Reynolds averaging is not suitable for our study because we want to consider the full spatio-temporal evolution of the eddy tracer flux, and the interactions between the large-scale and eddy flows. Therefore, instead of the mean-fluctuation decomposition in time, we determine the large/small scales of a variable by applying a spatial decomposition. This method of eddy filtering is also motivated by the practicality of eddy parametrisations in ocean models. Consider a variable A(t, x) on a Cartesian grid G. A running-average spatial filter G s is applied to its instantaneous fields. The large-scale componentĀ ij at a grid point G ij is defined as the mean of A evaluated over the region covered by the filter G s centred at this point. That is,Ā where d is the discrete side length of the filter, which is set to be odd to avoid any interpolation. Dropping the grid point indices, the small-scale part of . For grid points near the boundaries, where the distance between the grid point and the nearest boundary is d b < d, we set d = d b . In this case, the square shape of the filter is maintained, and A is zero on the boundaries. Note, in general for a spatial filter,Ā / =Ā and A / = 0, which is not the case with Reynolds averaging. Using the spatial filter, decomposing the terms in the tracer equation (2.6) yields where E (2) is the equivalent eddy forcing Unlike in the Reynolds decomposition approach, we do not filter/average the tracer equation. The physical meaning of each eddy term is clear in E (2) . The advection operators forū and u are preserved in the divergence term, so we define the corresponding eddy tracer flux as f :=ūC + u C + u C . (2.14) The eddy tendency ∂C /∂t, the explicit diffusion ν∇ 2 C and the forcing F are each present in E (2) , but are filtered out when deriving E (1) . These non-advective terms contribute to the eddy forcing, and thus need to be parametrised. We will discuss this in more detail in § 2.4. We now present the inversion method for obtaining a transport tensor for f . Because u and C are zero on the boundaries, there is no flux through and along the boundaries, (2.15) The transport tensor K f can then be locally estimated from the flux-gradient relation (2.17) Because the system (2.16), with four unknowns and two equations, is under-determined for a single tracer, we use two tracers C p and C q . This gives T is the eddy flux for tracer C p,q . Then, the tensor is obtained by inverting (2.18): (2.19) To summarise, here we have derived the eddy forcing E (2) for a spatial filtering decomposition, which we show is distinct from the eddy forcing obtained with the Reynolds averaging. The eddy forcing E (2) is contributed to by an eddy tracer flux divergence and non-advective terms. We have presented a method for computing the transport tensor for eddy tracer fluxes for a pair of tracers.

Helmholtz flux decomposition and the K div tensor
In the previous section, we presented the method for computing the K-tensor (K f ) for a pair of eddy tracer fluxes. In this section, we describe the Helmholtz decomposition which is necessary to compute the K-tensor (K div ) for the divergent part of two eddy tracer fluxes. We are only interested in these purely divergent fluxes as the rotational part does not contribute to the tracer dynamics, as shown in HSSB20.
For an eddy tracer flux f , its Helmholtz decomposition is (2.20) Here, ∇Φ f is the divergent flux, ∇ × Ψ is the rotational flux and H is the non-divergent and irrotational gauge term (Maddison, Marshall & Shipton 2015). The decomposition can be achieved by solving the Poisson equations of Φ and Ψ (Lau & Wallace 1979;Roberts & Marshall 2000;Maddison et al. 2015;Mak et al. 2016), i.e.
The decomposition is not, however, unique in a bounded or singly periodic domain owing to a dependence on boundary conditions (Fox-Kemper, Ferrari & Pedlosky 2003;Maddison et al. 2015). For example, although the total flux through all solid boundaries must be zero, the rotational and divergent components separately need not be zero on the boundaries. Maddison et al. (2015) provided a decomposition by using the example of a three-dimensional eddy PV flux: the horizontal divergent component was minimised by introducing an eddy force function Φ e with a zero tangential boundary condition. We adopt this boundary condition so that the rotational component can be removed as much as possible. We let the boundary condition for Φ f be because there is no flux on the boundaries, as in (2.15). Then, the Poisson (2.21) can be solved uniquely, and ∇Φ is minimised (Maddison et al. 2015). We define the divergent component of the eddy flux as Then, the total eddy flux is conveniently split into the divergent f div and the non-divergent f rot parts: The non-divergent flux is obtained by subtracting f div from f , and it contains the rotational flux ∇ × Ψ and the gauge term H. We denote it as f rot and do not extract the gauge term because the gauge term does not affect the interpretation of the rotational component. We will call it the rotational flux for simplicity. The divergent tensor K div is associated with the divergent flux via the flux-gradient relation, i.e. f div = −K div · ∇C. (2.26) We consider it as part of the transport tensor K f , denoting the part corresponding to f rot as K rot . The divergent tensor, and other K-tensors, can be separated into a symmetric component, and an antisymmetric component, We refer to S as the diffusion tensor because its eigenvalues λ 1 , λ 2 are the diffusivities in the directions of their eigenvectors (Plumb & Mahlman 1987). Here, λ 1 is assigned to the eigenvalue with the most positive value such that λ 1 ≥ λ 2 . The method of computing the eigenvalues is specifically described in HSSB20. We refer to A as the advection tensor, where the component A 12 quantifies the advection of the large-scale tracer field in the direction perpendicular to the tracer gradient. Our study only focuses on the tracer-dependence of the eigenvalues and does not discuss the interpretation of S and A.

Non-advective eddy terms and the K g tensor
In the Reynolds decomposition approach, non-advective eddy terms do not contribute to the mean eddy transport. On the non-advective terms in the spatial filter approach (2.13), the explicit eddy diffusion is negligible, but the eddy tendency and the external forcing terms make leading-order contributions to the eddy forcing. This motivates incorporating their effects into a K-tensor (K ) that parametrises all eddy effects. To do this, we use the fact that the non-advective eddy terms can be quantified using a purely divergent eddy flux g, where As with the divergent flux f div , we obtain g by inverting its corresponding Poisson equation with zero normal and tangent flux boundary conditions. (This requires choosing a suitable tracer forcing with F ∂Ω = 0.) We apply the flux-gradient relation to g and obtain a K-tensor K g for the non-advective eddy terms. Then the eddy forcing can be expressed in terms of the full tensor K , The tensors K f and K g both vary on the small scales, because they are determined by eddy terms. Their sum K , however, will not contain such small-scale variability because in (2.30), we relate it to strictly large-scale terms. The E (2) only varies on the large scale as it is equal to the left-hand side of (2.12), which includes only large-scale quantities. As a result, any tracer dependence in the full tensor K cannot be caused by small-scale variability.

Numerical experiment set-up
In the previous subsections, we outlined the method for computing four K-tensors, K f , K div , K g and K = K f + K g . In this section, we outline the experiments in which we test the non-uniqueness of these tensors and their ability to reproduce their corresponding fluxes.
We prepare two sets of passive tracers with different forms of initialisation for the experiments. For each set, six tracers are chosen to provide a sufficient variety to the initial profiles, and are simulated for half a year. The first set is initialised with linear profiles: where the gradient (a, b) ∈ (Z, Z) and the amplitude A c ∈ Z + . The constant γ is set to be positive, so that C p B ≥ 0. The second set is initialised with sinusoidal wave profiles: A c cos(k n x + l n y + ϕ n ), p = 7, . . . , 12, (2.33) where the number of waves n p ∈ Z + and the phase angle ϕ n ∈ [0, 2π]. The wavenumber for each sinusoidal wave is set to be k = |(k n , l n )| = 6, where k n , l n are randomly determined. Then the wavelength of the nonlinear C p B is approximately the basin size. For every pair of tracers in a set (15 pairs per set of 6), we carry out a test that computes the transport tensor K f , its divergent counterpart K div , the corresponding K g and consequently the full tensor K . To avoid the singularity in (2.19) when diagnosing the transport tensors, it is required that the large-scale gradients of each tracer pair remain misaligned. We maintain this misalignment by setting F to be a restoring force, with a relaxation time scale of T = 10 days. The velocity and the tracer fields are decomposed into large scales and small scales by a spatial filter with d = 31, the distance given by which is approximately 112.5 km.
For the set of tracers initialised linearly, we refer to this experiment as RL, and for the set of those initialised with sinusoidal waves, we refer to it as RN. The two experiments RL and RN, as described above, are repeated but with the tracer relaxation switched off, and we denote these experiments as UL and UN. For reference, the letters R, U, L, N stand for relaxed, unrelaxed, linear and nonlinear, respectively. In summary, we have presented the methods to obtain four K-tensors in four different experiments.

Results
In this section, we present the results from the four experiments described in § 2.5 and focus on the conditions for obtaining unique tensors. We begin by diagnosing the accuracy of K-tensors in § 3.1. Then, in § 3.2, we present the K-tensors for linearly initialised tracers and discuss the effects of the rotational flux. Finally, in § 3.3, we examine the tracer dependence of K-tensors for different tracer sets.

Numerical experiment errors
In this section, we analyse the accuracy of K-tensors in terms of reproducing the eddy fluxes and eddy forcing. First, we measure the accuracy of the transport tensor K f by reconstructing the tracer flux f . The relative error Err( f ) of the inversion is given by The ensemble-averaged Err( f ) for the experiments RL, RN, UL and UN are presented in figure 1. Bachman et al. (2015) suggested to over-determine the tracer flux-gradient relation to balance the removal of a large number of degrees of freedom owing to Reynolds decomposition. However, our linear system (2.18) is successfully solved with two tracers, with relative errors for all experiments almost always less than 10 −4 . Therefore, we deduce that using more tracers is not necessary, because the scale-aware decomposition can preserve sufficient spatial and temporal information of the flow. Panel (a) compares the spatial averages of Err( f ). Overall, the errors do not increase in time after 25 days when the total amount of fluxes have reached equilibrium, and the error amplitudes are likely determined by the tracer initial conditions. Comparing the results from experiments RL and RN, we find that K f for linearly initialised tracers gives an order of magnitude better reconstruction, compared with the case of nonlinear tracers. The effect of relaxation on the accuracy of the method is investigated using the nonlinear tracers set -the error for the unrelaxed tracers is larger than that for the relaxed tracers. The large peak in panel (a) arises from an instantaneous near-alignment of the large-scale tracer gradients. Our results confirm that such alignment can be restrained if tracers are weakly relaxed to their initial profiles.
The spatial distributions of Err( f ) are shown in panels (b), (c), (d) and (e) of figure 1. When the tracers are relaxed in experiments RL and RN, the relative errors are more pronounced in the jet region because the energetic flow accelerates alignment of the tracer gradients. Some additional curves of large errors are found for tracers with nonlinear initial profiles. As their locations correlate with the distribution of the gradient fields ∇C, we argue a steep change in the tracer gradient arising from the random combination of cosine waves can cause a large numerical error. As shown in panels (d) and (e), for the case of unrelaxed tracer profiles, the error increases globally (except in the south-east corner), thus, in agreement with results shown in panel (a).
We now measure the accuracy of the divergent tensor K div , noting that the Helmholtz decomposition can introduce large numerical errors (Bachman et al. 2015). To estimate these errors, we reconstruct the flux divergence by using K f and K div , and then compare the relative errors: Figure 2 shows the fields of ∇ · f reconstructed by K f and K div for a single test in experiment RN. It also shows histograms of the range of the relative errors. From the panels in the first two columns, we see that both tensors are able to reproduce the flux divergence in all layers. The error distributions are roughly Gaussian, as illustrated in panels (c), ( f ) and (i). Even though the divergence fields obtained via the two tensors are visually indistinguishable, the errors associated with K div are, on average, three orders of magnitude larger than those for K f (10 −1 versus 10 2 ). These increased numerical errors are injected by the divergence operators during the calculation of ∇ · f in (2.21) and ∇Φ in (2.25), similar to what was observed by Bachman et al. (2015). Specifically, we hypothesise that the second-order finite-difference method used in the Helmholtz decomposition can shift 'correct' values to neighbouring grid points, which leads to large point-wise errors. This can explain why the two reconstructed flux divergences are visually indistinguishable, but have average errors three orders of magnitude apart. Because any tensor needs to be coarse-grained before being applied to a coarse-resolution model, it is unclear whether or not use of K f over K div would lead to a more accurate parametrisation, despite the large errors associated with K div . Finally, we analyse the accuracy of K and K g by measuring the reconstruction error of the eddy forcing E. (Hereafter, we drop the superscript of E (2) and denote the eddy forcing as just E.) figure 3 shows the temporally averaged E and the relative errors Err(E) for K f and K . Even though ∇ · f dominates E in the top layer by accounting for 73 % of the total, in the intermediate and bottom layers, the non-advective eddy terms become comparably large by accounting for 45 % of the total. This implies that the non-advective eddy terms are not negligible and the associated K g should be included in the reconstruction of the eddy forcing. We illustrate this by comparing Err(E) given by K f and K . The latter, being of order 10 1 , lies between Err(∇ · f ) for K f and K div , so it is purely numerical. This is because the divergence operator is used once during the computation of g, unlike zero times or twice for the other two cases. However, the former is on average one order of magnitude larger owing to the absence of the non-advective eddy terms. Hence, we deduce that it is not sufficient to reproduce the eddy forcing by K f but is sufficient by the full tensor K . Overall, the results show that K-tensors computed by the inversion method can successfully reproduce the eddy tracer flux, its divergence and then the eddy forcing.
Top layer

Intermediate layer
Bottom layer Figure 2. Temporal averages of the flux divergence associated with K f (a,d,g) and K div (b,e,h) in the three layers. (c, f ,i) The histograms of the relative errors Err(∇ · f ) together with the spatial averages marked by the vertical lines. The solid vertical lines represent the domain-mean errors, while the dashed lines only include the 95 % of grid points with the smallest errors. Transparent red and blue bars represent K f and K div , respectively. These results are from experiment RN. It is shown that both tensors reconstruct the flux divergence well, but the latter introduces much larger numerical errors. The same observations are made for linear cases.
Although the transport and the divergent tensors lead to visually indistinguishable flux convergences, the Helmholtz decomposition introduces non-negligible errors in the latter that need to be addressed. As the non-advective eddy terms make considerable contributions to the eddy forcing, the corresponding K-tensor needs to be included in the reconstruction. -5.0 -2.5 2.5 5.0 10 -1 10 0 10 1 10 2 10 3 0 Figure 3. (a,d,g) Time-mean fields of the eddy forcing E in the three layers. The relative errors of their reconstructions by K f (b,e,h) and K (c, f ,i). They are in the range of 10-300 % and 0.1-10 % at 90 % grid points, respectively.

K-tensors
In this section, we diagnose the representative properties of K-tensors and discuss the influence of the rotational eddy flux on tensor components. Because K g is given by a purely divergent flux g, it is sufficient to discuss the influence of the rotational flux by using K instead of K f . Figure 4 presents the time-mean entries of K , K div and K g in the top layer for a tracer pair in experiment RL. The magnitudes of their components for the three layers are listed in tables 2-4. The inclusion/exclusion of the rotational flux leads to notable distinctions between K and K div . A comparison of them between tables 2 and 3 shows that the former is approximately two orders of magnitude larger than the latter in all layers, which means that the eddy transport given by the dynamically active flux f div is only 1 % of the total.
(l) Figure 4. Temporally averaged K (a-d), K div (e-h) and K g ( j-l), all evaluated over half a year in the top layer for a test in experiment RL. The values of K (as well as the indistinguishable K f ) are found to be generally two orders of magnitudes larger than those for K div and K g . The magnitudes of the tensor components in the three layers are listed in tables 2-4.
Additionally, from K div , we see that diffusion outside the energetic jet region is relatively strong, that is, in comparison to the disparity inside and outside the jet region with K f . It is worth comparing K div and K g , as both are computed for divergent fluxes. They have the same orders of magnitudes, but K div is approximately three times larger than K g . Additionally, the spatial patterns of their components are very different. We further illustrate the differences by presenting the velocity potentials of f and g in the three layers in figure 5. As expected, φ g shows more pronounced small-scale patterns in the subtropical regions. However, it also generates mesoscale components along the jet, similarly to Φ f , thus, contradicting the assumption that the non-advective terms do not contribute as much to the mean flow and thus can be filtered out. The sign of Φ g is distributed differently from that of Φ f . Their distributions are roughly perpendicular in the top layer and are the opposite in the lower two layers. Overall, the results presented highlight the necessity of interpreting the velocity potential by depth, but this is beyond the scope of this study. Despite the distinctions in the magnitudes and spatial patterns, negative values appear in the fields of K 11,12,21,22 for all tensors. They are more pronounced in K div , but also exist in K g , which naturally has no relation with the rotational flux. Therefore, the removal of the rotational flux does not eliminate the negative values in the tensor components. Furthermore, the second eigenvalues of the symmetric parts of K f and K div are most often negative (HSSB20).
To summarise, we have presented K , K div and K g , and the velocity potentials Φ f and Φ g . Owing to the dominance of the rotational flux f rot , there are major differences in the magnitudes and spatial structures of K when compared with the other two tensors. The inclusion of f rot is not responsible for the negative values or negative diffusion eigenvalues (HSSB20), as these persist for all K-tensors.

Non-uniqueness of diffusivity
In this section, we investigate the tracer-dependence of K-tensors. We start by obtaining tracer-independent tensors for the linearly initialised tracers by exploiting the linearity of the flux-gradient relation. We then measure their relative distances to use as a reference for purely numerical discretisation errors. Next, we analyse the tracer-dependence by comparing this reference case with the relative distances of the K-tensors corresponding to the nonlinearly initialised tracers.
The linearity of the tracer equation and the flux-gradient relation have important consequences for the non-uniqueness of the K-tensor. That is, a tensor obtained for any two tracers, C p and C q , is the same as would be obtained for any other tracer that is a linear combination of C p and C q . Consider two solutions C p , C q of the tracer (2.6) with initial conditions C p (t 0 ), C q (t 0 ). Then, any tracer C initialised as C(t 0 ) = mC p (t 0 ) + nC q (t 0 ) is a solution in the form where m and n are arbitrary constants. Because the spatial filter is a linear operator, the large-and small-scale components can also be written as the above linear combination.
L. Sun, M. Haigh, I. Shevchenko, P. Berloff and I. Kamenkovich Then, the eddy tracer flux and the large-scale tracer gradient are given by Suppose K f is the transport tensor obtained with C p and C q , then the linearity of the flux-gradient relation gives and implies that K f is also a transport tensor for C. Because the same linear combination as in (3.4) can be applied to the non-advective eddy terms, the associated flux divergence can be written as ∇ · g = m∇ · g p + n∇ · g q . (3.7) Therefore, the same conclusion can be inferred for K g . Overall, any tracer pairs that are linear combinations of C p and C q share the same K f and K g , and, thus, the same full tensor K . It is not necessary to linearly initialise tracers to obtain the same tensor. In fact, the tracers with linear initial profiles do not always have the same tensor. Recall the initialisation of the linear profiles C B in (2.32). Even though every linear profile has a constant gradient (a, b), it can only be written as a linear combination of other pairs of linear profiles with an additional constant owing to the scalar term γ . This γ is included in the large-scale tracer component and causes the tensors for linearly initialised tracers to be distinct. Even though it does not affect the large-scale gradient in (3.5), the eddy fluxes cannot, in general, be written as the linear combinations of those given by the linearly initialised tracers, as in (3.4), because of an additional flux component u γ . Thus, K f can not be unique in our linear tracer experiments, as part of the tensor component K γ depends on the scalar constant, u γ = −K γ · ∇C. (3.8) Note that K γ is not, in general, equal to K rot , even though u γ only contributes to f rot . Nonetheless, the transport tensor for initially linear tracers cannot be tracer-independent or unique, unless the rotational flux is removed. However, K g is always unique, because it only depends on C .
To verify this, we measure the relative distance between respective tensor components. That is, for a pair of components (denoted with superscripts p or q), the relative distance between them is where K p,q ∈ {K 11 , K 12 , K 21 , K 22 }, and the ensemble-average K is the global reference.
For each of K f , K div and K g , we have 15 ensembles from RL, so in total we can measure the relative distances for 105 pairs. The standard deviation is not used as a measure owing to the large differences in magnitudes among K-tensors. Figure 6 presents the relative distances for K f and K div . We see that d r for K f typically varies between 0.7 and 3, while for K div it is mainly between 0.8 and 1.25. Furthermore, more than 70 % of tensor pairs have larger distances for the former, despite the additional numerical errors induced in the latter. According to the linearity of the flux-gradient relation, K div for the linearly initialised tracers should be unique, because the tracer-dependent components K γ are excluded by removing the rotational fluxes. (d ) Figure 6. The relative distance d r of components K 11 , K 12 , K 21 , K 22 (a-d) in the top layer in experiment RL. One dot represents one comparison between K f and K div given by the same tracer pairs. A total of 105 comparisons are carried out amongst the 15 tracer pairs. The x-values are the temporal averages of d r for K f , while the y-values are for K div .
The non-zero d r for K div is attributed to numerical errors in the flux decomposition, as well as to some unavoidable alignments of gradients during the simulation. Moreover, d r can be uncommonly large for some pairs, because the ensemble-average K is used as a global reference in (3.9), instead of considering only K p and K q . Overall, it is evident that the greater relative distances in K f arise from the tracer-dependent K rot .  Figure 7. The relative distance d r of components K 11 , K 12 , K 21 , K 22 (a-d) in the top layer in experiment RL. One dot represents one comparison between K div and K g given by the same tracer pairs. A total of 105 comparisons are carried out amongst the 15 tracer pairs. The x-values are the temporal averages of d r for K div , while the y-values are for K g . The red solid lines are the linear regression lines, and the corresponding coefficients are 1.01, 0.9, 0.79, 0.99.
Because K g is not associated with the tracer-dependent γ , it is unique for linearly initialised tracers. This is shown in figure 7, where the relative distances for K g are in the range similar to those for K div . As both tensors are unique, we determine the relation between them by fitting linear regression models to the comparisons for each component (x for K div and y for K g ). We find that the regression lines are closely aligned with x = y, especially for the diagonal elements K 11 and K 22 . Therefore, d r for the tensors K g and K div are very similar, with slight differences caused by numerical errors. This rough similarity shows that the non-uniqueness of the tensors is caused by the linearity of the flux-gradient relation and linear tracers, and may not be a property inherent to all transport tensors.
To investigate if the uniqueness holds for all the passive tracers, we repeat the analyses for K div and K g using the ensembles from RN. In figure 8, we scatter plot d r for the RN ensemble against d r for the RL ensemble. Most d r values for K div from RN are greater than 1.5, and all of them are larger than those from RL. The same is observed for the sum of K div and K g . Because we have shown (analytically) that all linearly initialised tracers generate the same K div and K g , we deduce that they are different and, thus, tracer-dependent, when the tracers are initialised with sinusoidal -or more generally, nonlinear -profiles.
Until now, our conclusions are based on tracers that are relaxed towards their initial profiles. To test these conclusions for freely evolving tracers, in figure 9, we present time  Figure 8. The relative distance d r of components K 11 , K 12 , K 21 , K 22 (a-d) in the top layer for comparisons between experiments RL and RN. The tensors are for the divergent fluxes: blue dots represent K div and orange dots represent K div + K g . The x-and y-values are the relative distances for the nonlinear and linear tracer profiles, respectively. The vertical dashed lines are located at the maximum d r for RL on x-axes. If a dot is to the right of its corresponding line, then d r of this tensor in RN is larger than all ensembles of d r in RL. The clouds of dots clearly show the non-uniqueness of both K div and K div + K g for nonlinearly initialised tracers.
series of the ensemble-averaged d r from all four experiments. For initially linear tracers, the relative differences d r increase, when the tracers are not relaxed, and this agrees with our findings regarding the numerical errors, as shown in figure 1. For nonlinear tracers, though, d r does not notably depend on the use of relaxation. Therefore, even though the relaxation effect can affect the numerical accuracy, it is irrelevant for the non-uniqueness of K div . Similar observations are made for the other K-tensors, so we opt to not show them for brevity. Because the diffusion and advection tensors have different physical interpretations, they are often separately parametrised. Therefore, it is necessary to investigate whether they are also individually tracer-dependent. To examine this, we compare the eigenvalues λ 1 , λ 2 of the diffusion tensor S and the off-diagonal component A 12 of the advection tensor A between the experiments RL and RN. We measure the differences by the relative ensemble standard deviation δ r ( · ), which divides the ensemble standard deviation by the ensemble mean · . For example, δ r ( λ 1 ) is given by where λ p 1 , p = 1, . . . , 15, are the eigenvalues for 15 cases in one experiment. Figure 10 shows the instantaneous fields of the relative ensemble standard deviations of the eigenvalues λ 1 , λ 2 (λ 1 ≥ λ 2 ) and of the off-diagonal element A 12 . The first and second columns of panels are for RL and RN, respectively. Because the denominator of δ r is the -0.2 0 0.2 -4 0 4 Figure 10. Snapshots of the relative ensemble standard deviation δ r of the eigenvalues λ 1 (a,b), λ 2 (c,d) of S and of the off-diagonal element A 12 (e, f ) of A, all at day 183. Panels (a,c,e) are for RL, where tracers are linearly initialised, while (b,d, f ) are for RN (nonlinearly initialised tracers). The non-uniqueness of λ 1 , λ 2 and A 12 is observed, as the standard deviation for the latter is much larger than for the former. For both experiments, δ r ( λ 1 ) is mainly positive, as shown in (a) and (b), but δ r ( λ 2 ) is mainly negative, as shown in (c) and (d). This indicates that the eigenvalues tend to have opposite signs.
that the ensembles of three variables for initially linear tracers do not differ much from the ensemble-means, as δ r is predominantly less than 0.2 in the domain. This observation shows that for both S and A, the ensembles are close to each other in distances, and agrees with our argument that K div is unique, up to numerical errors, if tracers are initially linear. For the case when tracers are initialised with nonlinear profiles, the relative ensemble standard deviations are much larger, by at least an order of magnitude. Therefore, the diffusion and the advection tensors are also tracer-dependent, that is, they are non-unique for nonlinearly initialised tracers. Overall, K div and K g are unique for linearly initialised tracers, and any tracer dependence can serve as a measure of the numerical errors in our approach. The comparison with more general, nonlinearly initialised tracers demonstrates the inherent dependence of the K-tensor on the tracer pairs, and, thus, its non-uniqueness. This conclusion is not affected by a relaxation forcing. The same argument also applies to the eigenvalues of the diffusion tensor and to the off-diagonal element of the advection tensor. It is worth noting that the non-uniqueness of K-tensors is not limited to the idealised QG model but is also found in a more comprehensive general circulation model (GCM) of the entire Atlantic Ocean (Kamenkovich et al. 2021).

Conclusions and discussion
This work aimed to investigate the uniqueness of the transport K-tensor in an eddy-resolving model, with the long-term motivation of parametrising eddy tracer fluxes in coarse-resolution ocean models. Our objective here was to calculate K-tensors from high-resolution flow and tracer distributions, to investigate the resulting errors in the eddy flux reconstructions and to examine the K-tensors' dependencies on the passive tracers. The main conclusion is that the K-tensors depend on the tracer and are, therefore, non-unique. This non-uniqueness is strongly influenced by the rotational (non-divergent) component of the eddy tracer flux.
A double-gyre, mid-latitude QG model was used in the study. Tracers were initialised with either linear or nonlinear (sinusoidal wave) profiles, and were advected by the flow for half a year. The flow and tracer fields were decomposed into large-and small-scale components using a spatial filtering method. The resulting eddy forcing E consisted of the divergence of the eddy tracer flux f and non-advective eddy terms. We assumed a flux-gradient relation to link f to the large-scale tracer gradient ∇C via a transport tensor K f . By quantifying the non-advective eddy terms using a divergent flux g, we obtained the corresponding K-tensor K g . Then, the full tensor K corresponding to the eddy forcing was given by the sum of K f and K g . The divergent tensor K div -that is, the portion of K f corresponding to the divergent eddy flux f div -was obtained by applying the Helmholtz decomposition to f . The non-advective eddy forcing terms were comparable in size to ∇ · f = ∇ · f div , and similarly the K-tensor K g was as large as K div . Therefore, we deduce that the non-advective terms should be parametrised along with the advective eddy terms.
The accuracy of the inversion method for computing the K-tensors was measured by reconstructing f , ∇ · f and E. Generally, K is two orders of magnitude larger than K div , owing to the dominant rotational fluxes (Marshall & Shutts 1981). Despite this difference, both tensors were shown to be capable of reconstructing the eddy tracer flux convergence, with their parametrised flux convergences being visually indistinguishable from each other. The largest errors were typically found in the eastward jet region where the eddies were most energetic, but for initially nonlinear tracers, they also occurred in regions where the tracer profiles exhibited sharp spatial gradients (a consequence of the randomly selected nonlinear profiles).
Large errors in the reconstructed flux divergence are introduced by the Helmholtz decomposition, more precisely by the divergence operator, and hence errors associated with K div are much larger than those for K . This is despite the fact that the corresponding flux convergences are visually indistinguishable from each other. As the transport tensor component associated with the rotational flux is eliminated automatically under the divergence operator in the tracer evolution equation, it seems more suitable to use K f rather than K div for a parametrisation, given the errors associated with the latter. However, the additional model errors induced by simplifications in parametrisations may be as large as the errors induced by the flux decomposition. Therefore, the question of whether the difference in errors remains significant after smoothing the fields for coarse-grid models needs to be answered.
For each K-tensor, the tracer-dependence was quantified by measuring the relative distances among its ensembles given by different tracer pairs. We analysed how the tracer-dependence of the K-tensors was affected by various factors such as the initial tracer profiles, the rotational flux and the relaxation forcing. We found that K div and K g were unique, up to numerical errors, if the tracers were initialised with linear profiles, owing to the linearity of the flux-gradient relation. However, K f and, thus, the full tensor K were found to be tracer-dependent because the dominant rotational component (i.e. that corresponding to the rotational flux) was tracer-dependent. For nonlinearly initialised tracer profiles, the uniqueness of K div and K g was lost, and the non-uniqueness of K was further amplified by the presence of the tracer-dependent rotational flux. Therefore, in general, K-tensors are non-unique for passive tracers. The conclusion can be extended to diffusion tensor S and advection tensor A, separately. Finally, switching off the relaxation forcing did not affect the non-uniqueness for nonlinear tracers.
The effect of the non-uniqueness of the eddy transport tensor on parametrisations needs to be investigated. Generally, the estimation of the eddy flux for a tracer is different when using different tensors. A question then arises: does it make a fundamental difference whether one uses the tensor given by a tracer pair including this tracer or any tensor from any tracer pair? If not, can these tensors be implemented for the stochastic estimations of the eddy flux patterns? Stochastic closures have been employed in the past (Berloff & McWilliams 2003;Grooms 2016), and our study suggests that treating the eddy transport tensor as a random process may be consistent with its inherent non-uniqueness. In situations when the large-scale flow varies much slower than the eddies, a deterministic solution can be obtained as the tensor given by the flux-gradient relation between u C and the steady large-scale gradientC, where· denotes the Reynolds averaging. Whether the difference between such tensors and our non-unique transport tensor (which is supposed to be given by the variations of the large-scale terms in the eddy flux) can be treated as stochastic variations still needs to be investigated.