Gauge-invariant variational formulations of electromagnetic gyrokinetic theory

Ronald Remmerswaal; Roman Hatzky; Eric Sonnendrücker

doi:10.1017/S0022377825100688


Published online by Cambridge University Press: 03 September 2025


Ronald Remmerswaal*: Max Planck Institute for Plasma Physics, D-85748 Garching, Germany
Roman Hatzky: Max Planck Institute for Plasma Physics, D-85748 Garching, Germany
Eric Sonnendrücker: Max Planck Institute for Plasma Physics, D-85748 Garching, Germany; Department of Mathematics, Technical University of Munich, D-85748 Garching, Germany
*Corresponding author: Ronald Remmerswaal, ronald.remmerswaal@ipp.mpg.de


Abstract

The use of gyrokinetics, wherein phase-space coordinate transformations result in a phase-space dimensionality reduction as well as the removal of fast time scales, has enabled the simulation of microturbulence in fusion devices. The state-of-the-art gyrokinetic models used in practice are parallel-only models wherein the perpendicular part of the vector potential is neglected. Such models are inherently not gauge-invariant. We generalise the work of Burby & Brizard (2019 Phys. Lett. A vol. 383, no. 18, pp. 2172–2175) by deriving a sufficient condition on the gyrocentre coordinate transformation that ensures gauge invariance. This leads to a parametrised family of gyrokinetic models for which we motivate a specific choice of parameters that results in the smallest gyrocentre coordinate transformation for which the resulting gyrokinetic model is consistent, gyro-phase independent, gauge-invariant and has an invariant magnetic moment. Due to gauge invariance, this model can be expressed directly in terms of the electromagnetic fields rather than the potentials, and the gyrokinetic model thereby results in the macroscopic Maxwell’s equations. For the linearised model, it is demonstrated that the shear and compressional Alfvén waves are present with the correct frequencies. The fast compressional Alfvén wave can be removed by making use of a Darwin approximation. This approximation retains the gauge invariance of the proposed model.

Keywords

plasma dynamics; plasma nonlinear phenomena; fusion plasma

Information

Type: Research Article
Information: Journal of Plasma Physics, Volume 91, Issue 4, August 2025, E127

Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

Numerical modelling plays a central role not only in understanding the physics of fusion plasmas, but also in the design and optimisation of magnetic fusion devices. The collisionless Vlasov–Maxwell model is found to be appropriate when the collision frequency of the charged particles is much lower than the frequencies that are of interest, e.g. when studying microturbulence (Garbet et al. 2010). Nonetheless, such a model is still very challenging to use, not only because of its six-dimensional phase space, but also because of the large range of length scales (four orders of magnitude between the plasma size and the Debye length) and time scales (seven orders of magnitude between the ion–ion collision frequency and the electron plasma frequency).

Two of the aforementioned challenges are, at least partially, addressed by gyrokinetic theory (Frieman & Chen 1982; Littlejohn 1983; Sugama 2000; Brizard & Hahm 2007; Burby & Brizard 2019), wherein a sequence of phase-space coordinate transformations is used to decouple the fast gyration of a charged particle from its otherwise slower motion along the magnetic field. This removes the (high) cyclotron frequency, while at the same time reducing the phase-space dimensionality by one. When the quasi-neutral limit is moreover considered, the light wave and the Langmuir wave (or plasma oscillation) are also removed from the model. The use of gyrokinetic theory thereby permits the numerical modelling of turbulent transport in tokamaks, as discussed in the well-written review paper of Garbet et al. (2010), and it has more recently been applied to study electromagnetic turbulence in stellarator plasmas (Mishchenko et al. 2023).

The gyrokinetic model results from a sequence of (mostly near-identity) phase-space coordinate transformations which are applied to the collisionless Vlasov–Maxwell model and therefore, in theory, there is no approximation error. In practice, however, one must always truncate the near-identity phase-space coordinate transformation at some order of the small expansion parameter, resulting in an unavoidable modelling error. The more recently developed gyrokinetic models are based on a variational principle and thereby, despite this modelling error, still preserve essential structures of the original Vlasov–Maxwell model. For instance, the total (free) charge, momentum and energy should be conserved (Sugama et al. 2018; Hirvijoki et al. 2020; Brizard 2021a; Peifeng, Hong & Jianyuan 2021), while the choice of the gauge condition on the vector potential should leave the model invariant (Burby & Brizard 2019), resulting in so-called gauge invariance.

However, to our knowledge, all global gyrokinetic simulations either neglect the part of the vector potential that is perpendicular to the background magnetic field and thereby result in a ‘parallel-only’ gyrokinetic model (Kleiber et al. 2016), or use a simplified model for the parallel component of the perturbed magnetic field (Chen & Zonca 2016). Both approximations are irreconcilable with gauge invariance and lead to intermediate wavelength models wherein perpendicular system-scale effects are incorrectly modelled. There are, however, numerous gyrokinetic theories and models that include the perpendicular part of the vector potential (Qin et al. 1999; Qin, Tang & Lee 2000; Qin 2005; Brizard & Hahm 2007). Furthermore, Burby & Brizard (2019) introduced a novel gauge-invariant gyrokinetic model for which exact conservation laws are derived by Brizard (2021a, b).

We extend the approach followed by Burby & Brizard (2019) by proposing a parametrised family of gyrocentre coordinate transformations, resulting in a family of gauge-invariant gyrokinetic models where a specific choice of parameters yields the model of Burby & Brizard (2019). A different choice of parameters is motivated in detail in this paper, resulting in the smallest gyrocentre coordinate transformation for which the resulting gyrokinetic model is consistent, gyro-phase independent, gauge-invariant and has an invariant magnetic moment.

The proposed gyrokinetic model is derived in detail using the language of vector calculus rather than the customarily used language of differential geometry and Lie transform methods. Such a derivation is equivalent to the more traditionally used techniques, as found for example in Hahm (1988), Brizard (1990) and Qin et al. (1999), but we have opted for the vector calculus approach as it reduces the prerequisite knowledge required of our readers.

Our paper starts with a brief overview of its main results in § 2. In § 3, we discuss the preliminary phase-space coordinate transformation leading to the guiding-centre single-particle phase-space Lagrangian, wherein only the stationary background magnetic field is considered. The perturbed time-dependent electromagnetic fields are included as a perturbation to the guiding-centre Lagrangian in § 4, followed by a detailed description of the proposed gyrocentre coordinate transformation, resulting eventually in the gyrocentre single-particle phase-space Lagrangian. We then combine the gyrocentre single-particle Lagrangian for each species with Maxwell’s equations in § 5, eventually resulting in the gyrocentre equations of motion, Gauss’s law as well as the Ampère–Maxwell law. A low-frequency quasi-neutral Darwin approximation of the proposed model is considered in § 6. A comparison with reduced models is made in § 7, where we compare the two proposed models to several models from the literature. We conclude with a discussion in § 8.

2. A brief overview of the main results

To compensate for the length and complexity of this paper, we provide a brief overview of the main results in this section. This is of particular use to those readers who are not necessarily interested in a detailed derivation of the model, but who are instead interested in (the implementation of) the resulting models and their key properties.

In essence, two gauge-invariant models are derived, analysed and proposed in this paper: the first is referred to as the ‘gyrokinetic Maxwell model’ (see § 5), and the second is referred to as the ‘quasi-neutral gyrokinetic Darwin model’ (see § 6). The former yields a model in which fast waves (such as the light wave, Langmuir wave and compressional Alfvén wave) are retained, whereas such fast waves are eliminated in the latter model. Each model is derived from an action principle which can be found in (5.25) and (6.7), respectively, where the action is based on the gyrocentre single-particle Lagrangian (which is derived in detail in § 4) and is a function of the gyrocentre coordinate ${\boldsymbol{Z}}(t) = ({\boldsymbol{R}}(t), U_\shortparallel (t), {M}, \varTheta (t))$ (where $\boldsymbol{R}$ denotes the gyrocentre position, $U_\shortparallel$ denotes the velocity component parallel to the background magnetic field ${\boldsymbol{B}}_0$, $M$ denotes the invariant magnetic moment and $\varTheta$ denotes the gyro-phase), the perturbed scalar potential $\phi _1$ and the perturbed vector potential ${\boldsymbol{A}}_1$.

2.1. Gyrocentre equations of motion

Imposing the principle of least action (see § 5.3) with respect to the gyrocentre coordinate yields the gyrocentre equations of motion, which are discussed in detail in § 5.3.1 and are presented here for convenience:

(2.1a)

\begin{align} \,\dot { { {\boldsymbol{R}}}} &= {U}_\shortparallel {\boldsymbol{b}}_s^{\star } - \frac {1}{q_s B_{s,\shortparallel }^{\star }} {{\boldsymbol{\hat {b}}}_0} \times \big[ q_s {\boldsymbol{E}}^{\star }_1 - {M} \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) \big], \end{align}

(2.1b)

\begin{align} \dot {U}_\shortparallel &= \frac {1}{m_s} {\boldsymbol{b}}_s^{\star } \boldsymbol{\cdot } \big[ q_s {\boldsymbol{E}}^{\star }_1 - {M} \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) \big] , \end{align}

where we note that the magnetic moment is invariant ($\dot {{M}} = 0$) and the gyro-phase $\varTheta (t)$ is an ignorable coordinate as none of the other equations depend on it. Here, ${{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{B}}_0 / B_0$ is the normalised background magnetic field, $B_0 = \lvert {\boldsymbol{B}}_0 \rvert$ is the Euclidean norm of the background magnetic field, ${\boldsymbol{b}}_s^{\star }$ is the effective magnetic field direction defined in (4.76), $B_{s,\shortparallel }^{\star }$ is the effective parallel magnetic field defined in (4.77), $q_s$ and $m_s$ denote the charge and mass of the species $s$, ${\boldsymbol{E}}^{\star }_1$ is the effective electric field defined in (4.71a) (and is approximately equal to the gyro-averaged electric field, ${\boldsymbol{E}}^{\star }_1 \approx \langle {\mathring {\boldsymbol{E}}}_1 \rangle$), and $\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle$ denotes the disc average of the parallel component of the perturbed magnetic field as defined in (4.49). We note that the equations of motion are identical for the two models proposed in this paper.

2.2. Gyrokinetic Maxwell model

Imposing the principle of least action with respect to the perturbed vector potential ${\boldsymbol{A}}_1$, where the action is given by (5.25), yields the following Ampère–Maxwell law together with Faraday’s law (we refer to § 5.3.2 for the corresponding weak formulation and to §§ 5.4 and 5.5 for a discussion of the strong formulation):

(2.2a)

\begin{align} \biggl (\epsilon _0 \unicode{x1D644}_3 + \underbrace {\frac {\sum _s m_s {n}_{0,s}}{B_0^2} {\varPi }_\perp \biggr ) \frac {\partial {\boldsymbol{E}}}{\partial t}}_{\text{polarisation current}} = {} & \boldsymbol{\nabla }\!\times\! {\biggl ( \frac {1}{\mu _0} {\boldsymbol{B}} - \underbrace {\frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} {\boldsymbol{B}}_{\perp } + \frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{E}}}_{\text{magnetisation}} \biggr )} \nonumber \\ & + \underbrace {\frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} {{\boldsymbol{\hat {b}}}_0} \times (\boldsymbol{\nabla }\times {\boldsymbol{E}})}_{\text{polarisation current}} - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} , \end{align}

(2.2b)

\begin{align} \frac {\partial {\boldsymbol{B}}}{\partial t} = {} & {-} \boldsymbol{\nabla }\times {\boldsymbol{E}}, \end{align}

where we note that both Gauss’s law (cf. (5.44a)) and the magnetic Gauss’s law (cf. (5.44b)) are satisfied provided that they are satisfied initially. Here, $\epsilon _0$ denotes the vacuum permittivity, $\mu _0$ denotes the magnetic vacuum permeability, $n_{0,s}$ and ${u}_{0,\shortparallel ,s}$ denote the background particle density and parallel velocity of species $s$ (as defined in (5.52)), ${\varPi }_\perp = \unicode{x1D644}_3 - {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0}$ is the perpendicular projection matrix and $\unicode{x1D644}_3$ denotes the $3 \times 3$ identity matrix, ${\boldsymbol{E}} = {\boldsymbol{E}}_1$ and ${\boldsymbol{B}} = {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1$ denote the electric and magnetic field, ${\boldsymbol{B}}_\perp = {\varPi }_\perp {\boldsymbol{B}}$ denotes the perpendicular part of the magnetic field, $p_{0,\perp }, p_{0,\shortparallel }$ denote the perpendicular and parallel background pressure as defined in (5.56), and $\overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}$ denotes the gyrocentre free-current density defined weakly in (5.39b).

The evolution equation for the electric field $\boldsymbol{E}$ can be obtained from (2.2a) by multiplying by the inverse of the $3 \times 3$ matrix shown on the left-hand side. In this formulation, it is therefore advantageous to let the vacuum permittivity $\epsilon _0$ be finite, such that this matrix is invertible. The resulting field equations are entirely local and thereby yield a local gyrokinetic model which can be integrated explicitly in time. The fast compressional Alfvén wave is present in this model, as we demonstrate in § 7.3.
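The role of a finite $\epsilon _0$ in the matrix on the left-hand side of (2.2a) can be illustrated with a small numerical sketch. It assumes the matrix has the form $\epsilon _0 \unicode{x1D644}_3 + \chi {\varPi }_\perp$ with a scalar $\chi$ standing for $\sum _s m_s n_{0,s} / B_0^2$, for which the inverse is $\epsilon _0^{-1} {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0} + (\epsilon _0 + \chi )^{-1} {\varPi }_\perp$. The function names and the non-dimensional values below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def perp_projector(b0_hat):
    """Return Pi_perp = I_3 - b0 b0^T for a unit vector b0_hat."""
    b0_hat = np.asarray(b0_hat, dtype=float)
    return np.eye(3) - np.outer(b0_hat, b0_hat)

def invert_lhs_matrix(eps0, chi, b0_hat):
    """Closed-form inverse of eps0*I_3 + chi*Pi_perp (requires eps0 > 0)."""
    b0_hat = np.asarray(b0_hat, dtype=float)
    parallel = np.outer(b0_hat, b0_hat)
    return parallel / eps0 + perp_projector(b0_hat) / (eps0 + chi)

# Illustrative, non-dimensional values (assumptions, not from the paper).
eps0 = 1.0
chi = 50.0                                  # stands for sum_s m_s n_{0,s} / B_0^2
b0_hat = np.array([1.0, 2.0, 2.0]) / 3.0    # unit vector along B_0

lhs = eps0 * np.eye(3) + chi * perp_projector(b0_hat)
inv_closed = invert_lhs_matrix(eps0, chi, b0_hat)

# With eps0 = 0 only chi*Pi_perp remains, which has rank 2 and is singular.
rank_without_eps0 = np.linalg.matrix_rank(chi * perp_projector(b0_hat))
```

In this sketch the parallel direction is the one that becomes degenerate when $\epsilon _0$ is set to zero, which is consistent with the remark that a finite $\epsilon _0$ keeps the matrix invertible.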

Conditions on the background magnetic field ${\boldsymbol{B}}_0$ and the initial distribution function ${f}^0_{s}$ are derived in § 5.6, ultimately leading to the MHD equilibrium condition (5.61) for the correct balance in the perpendicular part of the Ampère–Maxwell law and (5.62) for the remaining parallel component. Moreover, a local energy conservation law for this model in terms of the kinetic and potential energy densities is derived in § 5.8.

2.3. Quasi-neutral gyrokinetic Darwin approximation

The gyrokinetic Maxwell model enjoys a favourable local structure of the equations, but it contains the fast compressional Alfvén wave, the fast light wave as well as the Langmuir wave, which are often undesirable. To this end, we propose a quasi-neutral gyrokinetic Darwin model in § 6, wherein these fast waves are removed, ultimately resulting in the following field equations in the perpendicular Coulomb gauge (2.3c):

(2.3a)

\begin{align} -\boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( \sum _s \frac { m_s {n}_{0,s}}{B_0^2} \big[ \boldsymbol{\nabla} _\perp \phi _1 - {u}_{0,\shortparallel ,s} {{\boldsymbol{\hat {b}}}_0} \times (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1) \big] \bigg) = \overline {\mathcal{R}}{}^{\mathrm{f}}, \end{align}

(2.3b)

\begin{align} \boldsymbol{\nabla }& \times \biggl [ \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{A}}_1 - \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1)_\perp - \frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} {{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla} _\perp \phi _1 \biggr ] \nonumber \\[4pt]& \quad + \, \frac {\sum _s m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \lambda = \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} - \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{B}}_0, \end{align}

(2.3c)

\begin{align} \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \frac {\sum _s m_s {n}_{0,s}}{B_0^2} {\boldsymbol{A}}_{1,\perp } \right ) = 0, \end{align}

where $\boldsymbol{\nabla} _\perp = {\varPi }_\perp \boldsymbol{\nabla}$ denotes the perpendicular part of the gradient operator, the perpendicular part of the perturbed magnetic field is denoted by $(\boldsymbol{\nabla }\times {\boldsymbol{A}}_1)_\perp = {\varPi }_\perp \boldsymbol{\nabla }\times {\boldsymbol{A}}_1$, $\overline {\mathcal{R}}{}^{\mathrm{f}}$ denotes the gyrocentre free-charge density defined weakly in (5.39a) and $\lambda$ is the Lagrange multiplier associated with the constraint (2.3c). Due to the quasi-neutral Darwin approximation, the field equations are no longer local, but we note that the quasi-neutrality equation (2.3a) is entirely decoupled from the Ampère–Maxwell law (2.3b) (together with its constraint (2.3c)) if the background distribution function is symmetric (${u}_{0,\shortparallel ,s} = 0$) and can therefore be solved independently.

In § 7.3, we demonstrate that in this model, the fast compressional Alfvén wave is removed. Moreover, the local energy conservation law from § 5.8 also holds for this model, except that the energy flux vector is altered slightly as discussed in § 6.5.

3. Preliminary transformations

In this section, we establish most of our notation and apply preliminary coordinate transformations which result in the guiding-centre single-particle phase-space Lagrangian wherein only a background magnetic field is considered.

3.1. Motivation

We start by considering the model for the motion of a charged particle, of charge $q$ and mass $m$, in the presence of a stationary background magnetic field ${\boldsymbol{B}}_0 = \boldsymbol{\nabla }\times {\boldsymbol{A}}_0$, with magnitude $B_0 = \lvert {\boldsymbol{B}}_0 \rvert$, where ${\boldsymbol{A}}_0$ denotes the background vector potential and $\lvert \cdot \rvert$ denotes the Euclidean norm. The background magnetic field yields a coordinate system whose orthonormal basis vectors are denoted by $({{\boldsymbol{\hat {b}}}_0}, {\boldsymbol{\hat {e}}}_1, {\boldsymbol{\hat {e}}}_2)$, where ${{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{B}}_0 / B_0$ and ${\boldsymbol{\hat {e}}}_1, {\boldsymbol{\hat {e}}}_2$ are unit vectors orthogonal to ${\boldsymbol{\hat {b}}}_0$ for which ${{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{\hat {e}}}_1 \times {\boldsymbol{\hat {e}}}_2$. For any vector $\boldsymbol{S}$ (such as the velocity $\boldsymbol{U}$), we denote the component parallel to the background magnetic field as

(3.1)

\begin{equation} S_\shortparallel \mathrel {\mathop :}= {\boldsymbol{S}} \boldsymbol{\cdot } {{\boldsymbol{\hat {b}}}_0}. \end{equation}

The resulting parallel and perpendicular parts of a vector are denoted by

(3.2)

\begin{equation} {\boldsymbol{S}}_\shortparallel \mathrel {\mathop :}= S_\shortparallel {{\boldsymbol{\hat {b}}}_0}, \quad {\boldsymbol{S}}_\perp \mathrel {\mathop :}= {\boldsymbol{S}} - {\boldsymbol{S}}_\shortparallel . \end{equation}

Equivalent notation is used for the perpendicular gradient operator

(3.3)

\begin{equation} \boldsymbol{\nabla} _\perp Q \mathrel {\mathop :}= \boldsymbol{\nabla }Q - (\boldsymbol{\nabla }Q \boldsymbol{\cdot } {{\boldsymbol{\hat {b}}}_0}) {{\boldsymbol{\hat {b}}}_0}. \end{equation}

It is well known that the following single-particle phase-space Lagrangian $L_0$ is a model for the motion of a charged particle in physical coordinates

(3.4)

\begin{equation} L_0 \mathrel {\mathop :}= \varGamma _0 - H_0, \quad \varGamma _0 \mathrel {\mathop :}= \left ( q {\boldsymbol{A}}_0 + m {\boldsymbol{U}} \right ) \boldsymbol{\cdot } \,\dot {{ {\!\boldsymbol{R}}}}, \quad H_0 \mathrel {\mathop :}= K_0, \end{equation}

where $\varGamma _0$ and $H_0$ denote the symplectic and Hamiltonian part of the Lagrangian $L_0$ , respectively. The kinetic energy per particle is given by

(3.5)

\begin{equation} K_0 = \frac {m}{2} \lvert {\boldsymbol{U}} \rvert ^2. \end{equation}

The model is expressed in terms of the phase-space coordinates $\tilde {{\boldsymbol{Z}}} = ({\boldsymbol{R}}, {\boldsymbol{U}}) \in \mathbb{R}^6$ , where $\boldsymbol{R}$ and $\boldsymbol{U}$ denote the particle position and velocity, respectively.

Imposing the principle of least action (this is explained in more detail in § 3.5) on the single-particle phase-space Lagrangian (3.4) results in the well-known Euler–Lagrange equations, which in turn yield the equations of motion (EOMs) for a charged particle in the presence of a stationary background magnetic field

(3.6)

\begin{equation} \,\dot {{{\!\boldsymbol{R}}}} = {\boldsymbol{U}} , \quad \dot {{\boldsymbol{U}}} = \frac {q}{m} {\boldsymbol{U}} \times {\boldsymbol{B}}_0 . \end{equation}

If ${\boldsymbol{B}}_0$ is constant, then the solution to the EOMs is

(3.7)

\begin{equation} {\boldsymbol{U}}(t) = U_\shortparallel (0) {{\boldsymbol{\hat {b}}}_0} + \big[ \cos (\omega _{\mathrm{c}} t) \big({\boldsymbol{\hat {e}}}_1 {\boldsymbol{\hat {e}}}_1^\intercal + {\boldsymbol{\hat {e}}}_2 {\boldsymbol{\hat {e}}}_2^\intercal \big) + \sin (\omega _{\mathrm{c}} t) \big({\boldsymbol{\hat {e}}}_1 {\boldsymbol{\hat {e}}}_2^\intercal - {\boldsymbol{\hat {e}}}_2 {\boldsymbol{\hat {e}}}_1^\intercal \big) \big] {\boldsymbol{U}}_\perp (0), \end{equation}

where we have defined the cyclotron frequency as

(3.8)

\begin{equation} \omega _{\mathrm{c}} \mathrel {\mathop :}= \frac {q B_0}{m}. \end{equation}

In many applications, the frequency of interest is much smaller than the cyclotron frequency and, therefore, the aim is to decouple this fast gyrating motion by applying coordinate transformations to the single-particle phase-space Lagrangian.
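As a quick consistency check, the closed-form solution (3.7) can be differentiated numerically and compared with the right-hand side of (3.6). The sketch below does this for a constant field along the third Cartesian axis; the numerical values and the function name `velocity` are illustrative assumptions.

```python
import numpy as np

q, m = 1.0, 1.0                      # illustrative charge and mass
B0_vec = np.array([0.0, 0.0, 2.0])   # constant background magnetic field
B0 = np.linalg.norm(B0_vec)
b0 = B0_vec / B0
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])       # chosen such that b0 = e1 x e2
omega_c = q * B0 / m                 # cyclotron frequency (3.8)

def velocity(t, U_init):
    """Evaluate the closed-form solution (3.7) at time t."""
    U_par = np.dot(U_init, b0)
    U_perp = U_init - U_par * b0
    identity_perp = np.outer(e1, e1) + np.outer(e2, e2)
    rotation = np.outer(e1, e2) - np.outer(e2, e1)
    gyration = np.cos(omega_c * t) * identity_perp + np.sin(omega_c * t) * rotation
    return U_par * b0 + gyration @ U_perp

U_init = np.array([0.3, -0.4, 1.2])
t, dt = 0.7, 1e-6

# Central finite difference of (3.7) versus the right-hand side of (3.6).
dUdt_fd = (velocity(t + dt, U_init) - velocity(t - dt, U_init)) / (2 * dt)
dUdt_rhs = (q / m) * np.cross(velocity(t, U_init), B0_vec)
```

The two arrays `dUdt_fd` and `dUdt_rhs` should agree up to the finite-difference error, and the speed $\lvert {\boldsymbol{U}} \rvert$ is unchanged by the rotation, as expected for a purely magnetic force.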

3.2. Field aligned velocity coordinates

The first coordinate transformation that we consider results in coordinates which are field aligned in velocity space, ${\boldsymbol{Z}} = ({\boldsymbol{R}}, U_\shortparallel , {M}, \varTheta )$, where the parallel velocity component $U_\shortparallel$, magnetic moment $M$ and gyro-phase $\varTheta$ are defined as

(3.9)

\begin{equation} U_\shortparallel \mathrel {\mathop :}= {\boldsymbol{U}} \boldsymbol{\cdot } {{\boldsymbol{\hat {b}}}_0}, \quad {M} \mathrel {\mathop :}= \frac {m U_\tau ^2}{2 B_0}, \quad \varTheta \mathrel {\mathop :}= \arctan \left ( \frac {{\boldsymbol{U}} \boldsymbol{\cdot } {\boldsymbol{\hat {e}}}_1}{{\boldsymbol{U}} \boldsymbol{\cdot } {\boldsymbol{\hat {e}}}_2} \right )\!, \end{equation}

where $U_\tau$ denotes the tangential velocity component, see (3.11) and (3.14). Using the gyro-phase, we define the new coordinate system $({{\boldsymbol{\hat {b}}}_0}, {\boldsymbol{\hat {\tau }}}, {\boldsymbol{\hat {\rho }}})$, where ${\boldsymbol{\hat {\tau }}}, {\boldsymbol{\hat {\rho }}}$ are given by

(3.10)

\begin{equation} {\boldsymbol{\hat {\tau }}} \mathrel {\mathop :}= - {\boldsymbol{\hat {e}}}_1 \sin \varTheta - {\boldsymbol{\hat {e}}}_2 \cos \varTheta , \quad {\boldsymbol{\hat {\rho }}} \mathrel {\mathop :}= {\boldsymbol{\hat {e}}}_1 \cos \varTheta - {\boldsymbol{\hat {e}}}_2 \sin \varTheta \end{equation}

and are such that ${{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{\hat {\tau }}} \times {\boldsymbol{\hat {\rho }}}$. See also figure 1. We denote the tangential and radial components of a vector field $\boldsymbol{S}$ by

(3.11)

\begin{equation} S_\tau \mathrel {\mathop :}= {\boldsymbol{S}} \boldsymbol{\cdot } {\boldsymbol{\hat {\tau }}}, \quad S_\rho \mathrel {\mathop :}= {\boldsymbol{S}} \boldsymbol{\cdot } {\boldsymbol{\hat {\rho }}}. \end{equation}

Figure 1. Illustration of the guiding-centre coordinate system. We denote the physical particle position in black and the guiding-centre position in green. The particle moves along the background magnetic field in the (blue) ${\boldsymbol{\hat {b}}}_0$ direction, while gyrating in the (red) plane perpendicular to the background magnetic field, in the direction of the (red) arrow $\boldsymbol{\hat {\tau }}$ . The extremal values of the $\varsigma$ parameter (introduced in § 4.4) are indicated in grey.

Note that

(3.12)

\begin{equation} \tan \varTheta = \frac {{\boldsymbol{U}} \boldsymbol{\cdot } {\boldsymbol{\hat {e}}}_1}{{\boldsymbol{U}} \boldsymbol{\cdot } {\boldsymbol{\hat {e}}}_2} = \frac {U_\tau \tan \varTheta - U_\rho }{U_\tau + U_\rho \tan \varTheta } \quad \implies \quad U_\rho = 0, \end{equation}

and therefore, the velocity can be expressed in terms of the field aligned velocity coordinates as

(3.13)

\begin{equation} {\boldsymbol{U}} = U_\shortparallel {{\boldsymbol{\hat {b}}}_0} + U_\tau {\boldsymbol{\hat {\tau }}}, \end{equation}

where the signed tangential velocity can be obtained from the magnetic moment as follows:

(3.14)

\begin{equation} U_\tau = \mathrm{sgn}(q) \sqrt {\frac {2 {M} B_0}{m}}. \end{equation}

Thus, the kinetic energy can be written as $K_0 = m (U_\shortparallel ^2 + U_\tau ^2)/2 = m U_\shortparallel ^2/2 + {M} B_0$ .
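The change of velocity coordinates in (3.9), (3.10), (3.13) and (3.14) can be sketched as a pair of functions that map a velocity to $(U_\shortparallel , {M}, \varTheta )$ and back. The function names are hypothetical, and the use of `arctan2` with a sign factor $\mathrm{sgn}(q)$ is an assumption made here to pick the branch of the arctangent in (3.9) that is consistent with the signed tangential velocity (3.14).

```python
import numpy as np

def to_field_aligned(U, b0, e1, e2, B0, q, m):
    """Map a velocity U to (U_par, M, Theta) following (3.9)."""
    U_par = np.dot(U, b0)
    U_perp_sq = max(np.dot(U, U) - U_par**2, 0.0)
    M = m * U_perp_sq / (2.0 * B0)
    # Branch choice (assumption): resolve the arctan in (3.9) with arctan2
    # so that U = U_par*b0 + U_tau*tau_hat holds with U_tau from (3.14).
    s = np.sign(q)
    Theta = np.arctan2(-s * np.dot(U, e1), -s * np.dot(U, e2))
    return U_par, M, Theta

def from_field_aligned(U_par, M, Theta, b0, e1, e2, B0, q, m):
    """Rebuild U from (U_par, M, Theta) using (3.10), (3.13) and (3.14)."""
    tau_hat = -e1 * np.sin(Theta) - e2 * np.cos(Theta)
    U_tau = np.sign(q) * np.sqrt(2.0 * M * B0 / m)
    return U_par * b0 + U_tau * tau_hat

# Illustrative values (assumptions): negatively charged particle, b0 = e1 x e2.
b0 = np.array([0.0, 0.0, 1.0])
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
B0, q, m = 2.0, -1.0, 1.0
U = np.array([0.3, -0.4, 1.2])

U_par, M, Theta = to_field_aligned(U, b0, e1, e2, B0, q, m)
U_back = from_field_aligned(U_par, M, Theta, b0, e1, e2, B0, q, m)
K0_cartesian = 0.5 * m * np.dot(U, U)
K0_field_aligned = 0.5 * m * U_par**2 + M * B0
```

The round trip returns the original velocity, and the two expressions for the kinetic energy $K_0$ coincide, in line with the identity stated above.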

The single-particle phase-space Lagrangian in field aligned velocity coordinates is expressed as

(3.15)

\begin{equation} L_0 = {\boldsymbol{\gamma }}_0 \boldsymbol{\cdot } \dot {{\boldsymbol{Z}}} - H_0, \quad {\boldsymbol{\gamma }}_{0,{\boldsymbol{R}}} = q {\boldsymbol{A}}_0 + m U_\shortparallel {{\boldsymbol{\hat {b}}}_0} + m U_\tau {\boldsymbol{\hat {\tau }}}, \quad H_0 = \frac {m U_\shortparallel ^2}{2} + {M} B_0, \end{equation}

where the remaining components of ${\boldsymbol{\gamma }}_{0}$ are zero. Note that the symplectic part of the Lagrangian is now written as ${\boldsymbol{\gamma }}_0 \boldsymbol{\cdot } \dot {{\boldsymbol{Z}}}$ , where we interchangeably refer to ${\boldsymbol{\gamma }}_0$ as the symplectic part of the Lagrangian.

3.3. Small parameters

Before we discuss near-identity phase-space coordinate transformations, we discuss the corresponding small parameters in which the coordinate transformation is expanded.

We let $L_B$ be the length scale on which the background magnetic field varies and let $\varrho$ denote the Larmor radius

(3.16)

\begin{equation} L_B \mathrel {\mathop :}= \frac {[B_0]}{[\boldsymbol{\nabla }B_0]} , \quad \varrho \mathrel {\mathop :}= \frac {m u_{\mathrm{th}}}{q [B_0]} , \quad u_{\mathrm{th}} \mathrel {\mathop :}= \sqrt {\frac {2 k_{\mathrm{B}} T}{m}} , \end{equation}

where $[Q]$ is the constant dimensional part of $Q$ and $u_{\mathrm{th}}, k_{\mathrm{B}}, T$ denote the thermal velocity, the Boltzmann constant and the temperature, respectively. The ratio of the two length scales is denoted by $\varepsilon _B$ :

(3.17)

\begin{equation} \varepsilon _B \mathrel {\mathop :}= \frac {\varrho }{L_B}, \end{equation}

which is much smaller than one when the background magnetic field has a weak inhomogeneity (Brizard & Hahm 2007). This is the parameter that is used in the guiding-centre coordinate transformation.

We let $\varepsilon _\delta$ denote the size of the perturbed magnetic field, which is introduced in § 4, relative to the background magnetic field, and we use the first subscript of any function or vector field to indicate the magnitude in terms of $\varepsilon _\delta$ :

(3.18)

\begin{equation} \varepsilon _\delta \mathrel {\mathop :}= \frac {[{\boldsymbol{B}}_1]}{[{\boldsymbol{B}}_0]}, \quad Q_l = O\big(\varepsilon _\delta ^l\big). \end{equation}

It is assumed that the perturbed electric field ${\boldsymbol{E}}_1$ scales identically such that any function linear in ${\boldsymbol{E}}_1$ is $O(\varepsilon _\delta )$ . This is the parameter that is used in the gyrocentre coordinate transformation, which is discussed in § 4. The smallness of this parameter is motivated in § 5.6.

Frequencies are non-dimensionalised using the cyclotron frequency resulting in the non-dimensional frequency $\varepsilon _\omega$ :

(3.19)

\begin{equation} \varepsilon _\omega \mathrel {\mathop :}= \frac {\omega }{[\omega _{\mathrm{c}}]}, \end{equation}

which is a small parameter in the magnetic fusion devices that we consider (Zoni & Possanner 2021). The assumed smallness of this parameter plays a crucial role in the approximation of the perturbed gyrocentre Lagrangian, which is discussed in § 4.5.2.

Finally, we non-dimensionalise the perpendicular length scale $2\pi / k_{\perp }$ (that is, the typical length scale in the plane perpendicular to ${\boldsymbol{\hat {b}}}_0$ ) by the Larmor radius which results in the non-dimensional wavenumber $\varepsilon _\perp$ :

(3.20)

\begin{equation} \varepsilon _\perp \mathrel {\mathop :}= k_{\perp } \varrho . \end{equation}

We emphasise that this last parameter is not necessarily small; in particular, when turbulence is considered, we find that $\varepsilon _\perp \sim 1$ . This parameter is used to approximate the second-order (in $\varepsilon _\delta$ ) gyrocentre Hamiltonian in § 4.5.3.
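To give a feeling for the magnitudes involved, the following sketch evaluates (3.16), (3.17) and (3.20) for one set of parameters. All plasma parameters below (a deuterium-like ion at 10 keV in a 5 T field, a 1 m gradient length and the chosen perpendicular wavenumber) are hypothetical values picked for illustration and do not come from the paper.

```python
import math

K_B = 1.380649e-23            # Boltzmann constant [J/K] (exact SI value)
E_CHARGE = 1.602176634e-19    # elementary charge [C] (exact SI value)

def larmor_radius(mass, charge, temperature, B0_ref):
    """Thermal Larmor radius from (3.16)."""
    u_th = math.sqrt(2.0 * K_B * temperature / mass)
    return mass * u_th / (charge * B0_ref)

# Hypothetical deuterium-like parameters chosen purely for illustration.
mass = 3.3436e-27                       # approximate deuteron mass [kg]
temperature = 1.0e4 * E_CHARGE / K_B    # 10 keV expressed in kelvin
B0_ref = 5.0                            # reference field strength [T]
L_B = 1.0                               # background gradient length [m]
k_perp = 2.0e2                          # perpendicular wavenumber [1/m]

rho = larmor_radius(mass, E_CHARGE, temperature, B0_ref)
eps_B = rho / L_B          # (3.17)
eps_perp = k_perp * rho    # (3.20)
```

For these assumed values the Larmor radius is a few millimetres, so that $\varepsilon _B \ll 1$ while $\varepsilon _\perp$ is of order one, which matches the ordering described in this subsection.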

3.4. Guiding-centre coordinates

The second coordinate transformation that we consider results in the guiding-centre coordinates $\bar {{\boldsymbol{Z}}} = (\bar {{\boldsymbol{R}}}, \bar {U}_\shortparallel , \bar {{M}}, \bar {\varTheta })$ . This transformation is aimed specifically at removing the gyro-phase dependence of ${\boldsymbol{\gamma }}_0$ (which depends on the gyro-phase via the coordinate vector ${\boldsymbol{\hat {\tau }}}({\boldsymbol{R}}, \varTheta )$ ). It results in the desired decoupling of the EOM for $\varTheta$ from the other EOMs and thereby also decouples the fast gyrating motion of the particle.

The leading-order (in $\varepsilon _B$ ) contribution to the near-identity coordinate transformation of the particle position is given by (see figure 1)

(3.21)

\begin{align} \bar {{\boldsymbol{R}}} = {\boldsymbol{R}} - {\boldsymbol{\rho }}, \end{align}

where the particle radial vector and gyroradius are defined as

(3.22)

\begin{equation} {\boldsymbol{\rho }} \mathrel {\mathop :}= \rho {\boldsymbol{\hat {\rho }}}, \quad \rho \mathrel {\mathop :}= \frac {m \bar {U}_\tau }{q B_0}, \end{equation}

and we note that a derivation of this well-known result can be found in Brizard (1990, (2.58)). The transformation of the remaining phase-space coordinates is not of interest to us here and, therefore, we do not list it. The resulting guiding-centre single-particle phase-space Lagrangian is given by Brizard (1990, (2.57)),

(3.23)

\begin{equation} \bar {L}_0 \mathrel {\mathop :}= \bar {{\boldsymbol{\gamma }}}_0 \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} - \bar {H}_0, \quad \bar {{\boldsymbol{\gamma }}}_{0,{\boldsymbol{R}}} \mathrel {\mathop :}= q {\boldsymbol{A}}_0^{\star }, \quad \bar {\gamma }_{0,\varTheta } \mathrel {\mathop :}= \frac {m \bar {{M}}}{q}, \quad \bar {H}_0 \mathrel {\mathop :}= \bar {K}_0, \end{equation}

where the effective guiding-centre vector potential is defined as

(3.24)

\begin{equation} {\boldsymbol{A}}_0^{\star } \mathrel {\mathop :}= {\boldsymbol{A}}_0 + \frac {m \bar {U}_\shortparallel }{q} {{\boldsymbol{\hat {b}}}_0} - \frac {m \bar {{M}}}{q^2} {\boldsymbol{w}}_0 , \quad {\boldsymbol{w}}_0 \mathrel {\mathop :}= (\boldsymbol{\nabla }{\boldsymbol{\hat {\tau }}}) {\boldsymbol{\hat {\rho }}} + \frac {1}{2} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\shortparallel {{\boldsymbol{\hat {b}}}_0}, \end{equation}

and the guiding-centre kinetic energy per particle is given by

(3.25)

\begin{equation} \bar {K}_0 = \frac {m \bar {U}_\shortparallel ^2}{2} + \bar {{M}} B_0. \end{equation}

Here, the gradient of a vector field $\boldsymbol{S}$ is defined component-wise as

(3.26)

\begin{equation} (\boldsymbol{\nabla }{\boldsymbol{S}})_{ij} \mathrel {\mathop :}= \frac {\partial S_j}{\partial R_i} \end{equation}

such that, for example, the components of ${\boldsymbol{w}}_0$ are given by the matrix-vector product

(3.27)

\begin{equation} ({\boldsymbol{w}}_0)_i = \sum _{j=1}^3 \frac {\partial \hat \tau _j}{\partial R_i} \hat {\rho }_j + \frac {1}{2} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\shortparallel ({{\boldsymbol{\hat {b}}}_0})_i. \end{equation}

The relevance of the first contribution to the ${\boldsymbol{w}}_0$ term becomes apparent when considering the transformation

(3.28)

\begin{equation} \varTheta \mapsto \varTheta + \varPsi (\bar {{\boldsymbol{R}}}), \quad {\boldsymbol{\hat {e}}}_i \mapsto \unicode{x1D64F}(-\varPsi ) {\boldsymbol{\hat {e}}}_i, \end{equation}

where the rotation matrix $ \unicode{x1D64F}$ is such that

(3.29)

\begin{equation} \unicode{x1D64F}(\psi ) {\boldsymbol{\hat {\tau }}}({\boldsymbol{R}}, \varTheta ) = {\boldsymbol{\hat {\tau }}}({\boldsymbol{R}}, \varTheta + \psi ), \quad \unicode{x1D64F}(\psi ) {{\boldsymbol{\hat {b}}}_0} = {{\boldsymbol{\hat {b}}}_0}. \end{equation}

Invariance of the Lagrangian under this transformation reflects that we should be free to choose the coordinate vectors ${\boldsymbol{\hat {e}}}_i$ . This is referred to as gyro-gauge invariance. Indeed, we find that two terms of the symplectic part of the Lagrangian now depend on the gyro-phase $\bar {\varTheta }$ or the coordinate vectors ${\boldsymbol{\hat {e}}}_i$ , and their sum is given by

(3.30)

\begin{equation} - [(\boldsymbol{\nabla }{\boldsymbol{\hat {\tau }}}) {\boldsymbol{\hat {\rho }}}] \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{R}}}} + \dot {\bar {\varTheta }} = \big[{-} (\boldsymbol{\nabla }{\boldsymbol{\hat {\tau }}})^\intercal \dot {\bar {{\boldsymbol{R}}}} + \dot {\bar {\varTheta }} {\boldsymbol{\hat {\rho }}} \big] \boldsymbol{\cdot } {\boldsymbol{\hat {\rho }}} = -\frac {{\mathrm{d}} {\boldsymbol{\hat {\tau }}}}{{\mathrm{d}} t} \boldsymbol{\cdot } {\boldsymbol{\hat {\rho }}}, \end{equation}

which is invariant under the transformation given by (3.28).

Note that we can furthermore show that

(3.31)

\begin{equation} \frac {\partial }{\partial \varTheta } [ (\boldsymbol{\nabla }{\boldsymbol{\hat {\tau }}}) {\boldsymbol{\hat {\rho }}} ] = (\boldsymbol{\nabla }{\boldsymbol{\hat {\rho }}}) {\boldsymbol{\hat {\rho }}} - (\boldsymbol{\nabla }{\boldsymbol{\hat {\tau }}}) {\boldsymbol{\hat {\tau }}} = \frac {1}{2} \boldsymbol{\nabla }({\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } {\boldsymbol{\hat {\rho }}}) - \frac {1}{2} \boldsymbol{\nabla }({\boldsymbol{\hat {\tau }}} \boldsymbol{\cdot } {\boldsymbol{\hat {\tau }}}) = {\boldsymbol{0}}_3, \end{equation}

from which it follows that ${\boldsymbol{w}}_0$ and, therefore, also $\bar {{\boldsymbol{\gamma }}}_0$ are gyro-phase independent. This implies that we can select the value $\varTheta = \pi /2$ resulting in

(3.32)

\begin{equation} {\boldsymbol{w}}_0 = [ (\boldsymbol{\nabla }{\boldsymbol{\hat {\tau }}}) {\boldsymbol{\hat {\rho }}} ] |_{\varTheta = \pi /2} + \frac {1}{2} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\shortparallel {{\boldsymbol{\hat {b}}}_0} = (\boldsymbol{\nabla }{\boldsymbol{\hat {e}}}_1) {\boldsymbol{\hat {e}}}_2 + \frac {1}{2} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\shortparallel {{\boldsymbol{\hat {b}}}_0}. \end{equation}

3.5. Principle of least action

Provided with the guiding-centre single-particle phase-space Lagrangian, we impose the principle of least action to obtain the corresponding EOMs. That is, we impose

(3.33)

\begin{equation} \left . \frac {{\mathrm{d}} }{{\mathrm{d}} \varepsilon } \right |_{\varepsilon = 0} \int _{t^0}^{t^1} \bar {L}_0(\bar {{\boldsymbol{Z}}} + \varepsilon {\boldsymbol{\delta }}, \dot {\bar {{\boldsymbol{Z}}}} + \varepsilon \dot {{\boldsymbol{\delta }}})\, \mathrm{d} t = 0, \end{equation}

where $\boldsymbol{\delta }$ is arbitrary with ${\boldsymbol{\delta }}(t^0) = {\boldsymbol{\delta }}(t^1) = {\boldsymbol{0}}_6$ . This results in the well-known Euler–Lagrange equations that are given by

(3.34)

\begin{equation} \frac {{\mathrm{d}} }{{\mathrm{d}} t} \frac {\partial \bar {L}_0}{\partial \dot {\bar {{\boldsymbol{Z}}}}} = \frac {\partial \bar {L}_0}{\partial \bar {{\boldsymbol{Z}}}} \quad \iff \quad \dot {\bar {{\boldsymbol{Z}}}} = \,\,\bar {\!\! \unicode{x1D645}}_0 \left ( \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial t} + \frac {\partial \bar {H}_0}{\partial \bar {{\boldsymbol{Z}}}} \right )\!, \end{equation}

where we have defined the Lagrange and Poisson matrices as

(3.35)

\begin{equation} \bar { \unicode{x1D652}}_0 \mathrel {\mathop :}= \left ( \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {{\boldsymbol{Z}}}} \right )^\intercal - \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {{\boldsymbol{Z}}}}, \quad \bar{\!\!\unicode{x1D645}}_0 \mathrel {\mathop :}= (\bar { \unicode{x1D652}}_0)^{-1}, \end{equation}

respectively. Here, the components of the Jacobian matrix are given by (cf. (3.26))

(3.36)

\begin{equation} \left ( \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {{\boldsymbol{Z}}}} \right )_{ij} = \frac {\partial \bar {{\boldsymbol{\gamma }}}_{0,i}}{\partial \bar {Z}_j} \end{equation}

and $^\intercal$ denotes the transpose of a matrix.
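As a minimal sketch of how (3.34)–(3.36) operate, the following example applies them to a toy canonical system rather than to the guiding-centre Lagrangian. The choices $\boldsymbol{Z} = (x, p)$, $\boldsymbol{\gamma} = (p, 0)$ and $H = (x^2 + p^2)/2$, as well as all function names, are assumptions made purely for illustration. The Lagrange matrix is built by central differences and its inverse reproduces Hamilton's equations $\dot{x} = p$, $\dot{p} = -x$.

```python
def jacobian(f, z, h=1e-6):
    """Central-difference Jacobian with entries jac[i][j] = d f_i / d z_j, cf. (3.36)."""
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z_plus, z_minus = list(z), list(z)
        z_plus[j] += h
        z_minus[j] -= h
        f_plus, f_minus = f(z_plus), f(z_minus)
        for i in range(n):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac


def lagrange_matrix(gamma, z):
    """Lagrange matrix W = (d gamma / d Z)^T - d gamma / d Z, cf. (3.35)."""
    jac = jacobian(gamma, z)
    n = len(z)
    return [[jac[j][i] - jac[i][j] for j in range(n)] for i in range(n)]


# Toy canonical system (an assumption for illustration): Z = (x, p).
def gamma_canonical(z):
    x, p = z
    return [p, 0.0]


def grad_h(z):
    """Gradient of the toy Hamiltonian H = (x^2 + p^2) / 2."""
    x, p = z
    return [x, p]


z = [0.3, -1.2]
W = lagrange_matrix(gamma_canonical, z)
a = W[0][1]
# Inverse of the 2x2 skew-symmetric matrix [[0, a], [-a, 0]] is the Poisson matrix.
J = [[0.0, -1.0 / a], [1.0 / a, 0.0]]
dh = grad_h(z)
# EOMs (3.34) for a time-independent symplectic part: Z_dot = J dH/dZ.
z_dot = [J[0][0] * dh[0] + J[0][1] * dh[1],
         J[1][0] * dh[0] + J[1][1] * dh[1]]
print(W, z_dot)
```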

Provided with the Poisson matrix $\,\,\bar {\!\! \unicode{x1D645}}_0$ , we can define the guiding-centre Poisson bracket as

(3.37)

\begin{equation} \{\mathcal{F}, \mathcal{G}\}_{0} \mathrel {\mathop :}= \frac {\partial \mathcal{F}}{\partial \bar {{\boldsymbol{Z}}}} \boldsymbol{\cdot } \left ( \,\,\bar {\!\! \unicode{x1D645}}_{0} \frac {\partial \mathcal{G}}{\partial \bar {{\boldsymbol{Z}}}} \right )\! ,\end{equation}

which allows the EOMs, as given by (3.34), to be expressed as

(3.38)

\begin{equation} \dot {\bar {{\boldsymbol{Z}}}} = \{\bar {{\boldsymbol{Z}}}, \bar {H}_0\}_{0}, \end{equation}

where we have made use of the time-independence of $\bar {{\boldsymbol{\gamma }}}_0$ and we evaluate the bracket component-wise: $(\{\bar {{\boldsymbol{Z}}}, \bar {H}_0\}_{0})_i = \{\bar {Z}_i, \bar {H}_0\}_{0}$ .

When using our expression for the symplectic part of the Lagrangian $\bar {{\boldsymbol{\gamma }}}_0$ , as given by (3.23), we find that the Lagrange matrix is given by

(3.39)

\begin{equation} \bar { \unicode{x1D652}}_0 = \begin{pmatrix} q \unicode{x1D63D}_0^{\star } &\quad -m {{\boldsymbol{\hat {b}}}_0} &\quad \dfrac {m}{q} {\boldsymbol{w}}_0 &\quad {\boldsymbol{0}}_3\\[12pt] m {{\boldsymbol{\hat {b}}}_0}^\intercal &\quad 0 &\quad 0 &\quad 0\\[5pt] - \dfrac {m}{q} {\boldsymbol{w}}_0^\intercal &\quad 0 &\quad 0 &\quad \dfrac {m}{q} \\[12pt] {\boldsymbol{0}}_3^\intercal &\quad 0 &\quad - \dfrac {m}{q}&\quad 0\\ \end{pmatrix}\!, \end{equation}

where we have defined the matrix

(3.40)

\begin{equation} \unicode{x1D63D}_0^{\star } \mathrel {\mathop :}= \boldsymbol{\nabla }{\boldsymbol{A}}_0^{\star } - (\boldsymbol{\nabla }{\boldsymbol{A}}_0^{\star })^\intercal \end{equation}

for which

(3.41)

\begin{equation} \unicode{x1D63D}_0^{\star } {\boldsymbol{S}} = {\boldsymbol{S}} \times {\boldsymbol{B}}_0^{\star }, \quad {\boldsymbol{B}}_0^{\star } \mathrel {\mathop :}= \boldsymbol{\nabla }\times {\boldsymbol{A}}_0^{\star }. \end{equation}
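The identity (3.41) can be checked numerically. The sketch below uses the gradient convention of (3.26), builds the skew-symmetric matrix of (3.40) from finite differences and compares its action on a vector with the cross product with the curl. The vector potential `a_star`, the evaluation point and the test vector are arbitrary, hypothetical choices.

```python
import math


def grad_vector(field, r, h=1e-5):
    """Gradient with entries g[i][j] = d S_j / d R_i, the convention of (3.26)."""
    g = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        r_plus, r_minus = list(r), list(r)
        r_plus[i] += h
        r_minus[i] -= h
        f_plus, f_minus = field(r_plus), field(r_minus)
        for j in range(3):
            g[i][j] = (f_plus[j] - f_minus[j]) / (2.0 * h)
    return g


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def a_star(r):
    """Hypothetical smooth vector potential used only for this check."""
    x, y, z = r
    return [math.sin(y) * z, x * x + z, math.cos(x) * y]


r = [0.4, -0.7, 1.1]
s = [0.2, 1.5, -0.9]
g = grad_vector(a_star, r)

# Skew-symmetric matrix of (3.40): grad A - (grad A)^T.
b_matrix = [[g[i][j] - g[j][i] for j in range(3)] for i in range(3)]
# Curl of A expressed through the same finite-difference gradient.
curl = [g[1][2] - g[2][1], g[2][0] - g[0][2], g[0][1] - g[1][0]]

lhs = [sum(b_matrix[i][j] * s[j] for j in range(3)) for i in range(3)]
rhs = cross(s, curl)
max_diff = max(abs(lhs[i] - rhs[i]) for i in range(3))
print(max_diff)
```

Because both sides are assembled from the same finite-difference gradient, they agree up to round-off; the check therefore tests the algebraic identity rather than the accuracy of the differentiation.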

Inversion of the Lagrange matrix, resulting in the Poisson matrix, is somewhat tedious and is, therefore, described in detail in Appendix A (this coincides with the result of Parra & Calvo (2011, Appendix E), except that the derivation is absent therein). The result is given by

(3.42)

\begin{equation} \,\,\bar {\!\! \unicode{x1D645}}_0 = \begin{pmatrix} -\dfrac { \unicode{x1D63D}_0}{q B_0 B_{0,\shortparallel }^{\star }} & \quad \dfrac {{\boldsymbol{b}}_{0}^{\star }}{m} & \quad {\boldsymbol{0}}_3 & \quad - \dfrac {{\boldsymbol{w}}_0 \times {{\boldsymbol{\hat {b}}}_0}}{q B_{0,\shortparallel }^{\star }}\\[14pt] -\dfrac {({\boldsymbol{b}}_{0}^{\star })^\intercal }{m} & \quad 0 & \quad 0 & \quad -\dfrac {{\boldsymbol{b}}_{0}^{\star } \boldsymbol{\cdot } {\boldsymbol{w}}_0}{m}\\[14pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad 0 & \quad -\dfrac {q}{m}\\[5pt] \dfrac {({\boldsymbol{w}}_0 \times {{\boldsymbol{\hat {b}}}_0})^\intercal }{q B_{0,\shortparallel }^{\star }} & \quad \dfrac {{\boldsymbol{b}}_{0}^{\star } \boldsymbol{\cdot } {\boldsymbol{w}}_0}{m} & \quad \dfrac {q}{m}& \quad 0\\ \end{pmatrix}\!, \end{equation}

where we have defined

(3.43)

\begin{equation} {\boldsymbol{b}}_0^{\star } \mathrel {\mathop :}= \frac {{\boldsymbol{B}}_{0}^{\star }}{B_{0,\shortparallel }^{\star }} = {{\boldsymbol{\hat {b}}}_0} + \frac {1}{q B_{0,\shortparallel }^{\star }} \left [ m \bar {U}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{\kappa }} - \frac {m \bar {{M}}}{q} (\boldsymbol{\nabla }\times {\boldsymbol{w}}_0)_\perp \right ]\!, \end{equation}

the parallel component of ${\boldsymbol{B}}_{0}^{\star }$ is given by

(3.44)

\begin{equation} B_{0,\shortparallel }^{\star } = B_0 + \frac {m \bar {U}_\shortparallel }{q} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\shortparallel - \frac {m \bar {{M}}}{q^2} (\boldsymbol{\nabla }\times {\boldsymbol{w}}_0)_\shortparallel \end{equation}

and the curvature vector $\boldsymbol{\kappa }$ is defined as

(3.45)

\begin{equation} {\boldsymbol{\kappa }} \mathrel {\mathop :}= (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})\times {{\boldsymbol{\hat {b}}}_0}. \end{equation}

The matrix $ \unicode{x1D63D}_0$ is defined analogously to (3.40) and, therefore, is given by

(3.46)

\begin{equation} \unicode{x1D63D}_0 \mathrel {\mathop :}= \boldsymbol{\nabla }{\boldsymbol{A}}_0 - (\boldsymbol{\nabla }{\boldsymbol{A}}_0)^\intercal \quad \implies \quad \unicode{x1D63D}_0 {\boldsymbol{S}} = {\boldsymbol{S}} \times {\boldsymbol{B}}_0. \end{equation}
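The claim that (3.42) is the inverse of (3.39) can be verified numerically without repeating the derivation. The sketch below assembles both $6 \times 6$ matrices, with the coordinate ordering $(\bar{\boldsymbol{R}}, \bar{U}_\shortparallel, \bar{M}, \bar{\varTheta})$, and checks that their product is the identity. The numerical values of $m$, $q$, $B_0$, ${\boldsymbol{\hat{b}}}_0$, ${\boldsymbol{B}}_0^{\star}$ and ${\boldsymbol{w}}_0$ are arbitrary, hypothetical inputs; the only requirements assumed are that ${\boldsymbol{\hat{b}}}_0$ is a unit vector and that $B_{0,\shortparallel}^{\star} \neq 0$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def cross_matrix(v):
    """Matrix C such that C s = s x v, cf. (3.41) and (3.46)."""
    return [[0.0, v[2], -v[1]],
            [-v[2], 0.0, v[0]],
            [v[1], -v[0], 0.0]]


def lagrange_and_poisson(m, q, b_hat, b0, b_star_vec, w0):
    """Assemble the matrices of (3.39) and (3.42) for given pointwise values."""
    b_star_par = dot(b_hat, b_star_vec)            # parallel component of B_0^star
    bs = [c / b_star_par for c in b_star_vec]      # b_0^star of (3.43)
    big_b_star = cross_matrix(b_star_vec)
    big_b0 = cross_matrix([b0 * c for c in b_hat])
    wxb = cross(w0, b_hat)
    bw = dot(bs, w0)
    W = [[0.0] * 6 for _ in range(6)]
    J = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            W[i][j] = q * big_b_star[i][j]
            J[i][j] = -big_b0[i][j] / (q * b0 * b_star_par)
        W[i][3], W[3][i] = -m * b_hat[i], m * b_hat[i]
        W[i][4], W[4][i] = m / q * w0[i], -m / q * w0[i]
        J[i][3], J[3][i] = bs[i] / m, -bs[i] / m
        J[i][5], J[5][i] = -wxb[i] / (q * b_star_par), wxb[i] / (q * b_star_par)
    W[4][5], W[5][4] = m / q, -m / q
    J[3][5], J[5][3] = -bw / m, bw / m
    J[4][5], J[5][4] = -q / m, q / m
    return W, J


# Arbitrary (hypothetical) pointwise values; b_hat has unit length.
W, J = lagrange_and_poisson(m=2.0, q=1.5, b_hat=[2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0],
                            b0=1.7, b_star_vec=[1.0, 0.4, 1.5], w0=[0.3, -0.8, 0.5])
product = [[sum(W[i][k] * J[k][j] for k in range(6)) for j in range(6)]
           for i in range(6)]
max_dev = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(6) for j in range(6))
print(max_dev)
```

The deviation from the identity is at round-off level. The check relies only on ${\boldsymbol{\hat{b}}}_0 \boldsymbol{\cdot} {\boldsymbol{b}}_0^{\star} = 1$ and on the cross-product representations (3.41) and (3.46).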

This results in the following guiding-centre Poisson bracket:

(3.47)

\begin{align} \{\mathcal{F}, \mathcal{G}\}_{0} = {} & - \frac {{{\boldsymbol{\hat {b}}}_0} }{q B_{0,\shortparallel }^{\star }} \boldsymbol{\cdot } (\boldsymbol{\nabla }\mathcal{F} \times \boldsymbol{\nabla }\mathcal{G}) + \frac {{\boldsymbol{b}}_{0}^{\star }}{m} \boldsymbol{\cdot } \left ( \boldsymbol{\nabla }\mathcal{F} \frac {\partial \mathcal{G}}{\partial \bar {U}_\shortparallel } - \frac {\partial \mathcal{F}}{\partial \bar {U}_\shortparallel } \boldsymbol{\nabla }\mathcal{G} \right )\nonumber \\[4pt] & + \frac {q}{m} \left ( \frac {\partial \mathcal{F}}{\partial \bar {\varTheta }} \frac {\partial \mathcal{G}}{\partial \bar {{M}}} - \frac {\partial \mathcal{F}}{\partial \bar {{M}}} \frac {\partial \mathcal{G}}{\partial \bar {\varTheta }} \right ) + \frac {{\boldsymbol{w}}_0 \times {{\boldsymbol{\hat {b}}}_0}}{q B_{0,\shortparallel }^{\star }} \boldsymbol{\cdot } \left ( \frac {\partial \mathcal{F}}{\partial \bar {\varTheta }} \boldsymbol{\nabla }\mathcal{G} - \boldsymbol{\nabla }\mathcal{F} \frac {\partial \mathcal{G}}{\partial \bar {\varTheta }} \right )\nonumber \\[4pt] & + \frac {{\boldsymbol{b}}_{0}^{\star } \boldsymbol{\cdot } {\boldsymbol{w}}_0}{m} \left ( \frac {\partial \mathcal{F}}{\partial \bar {\varTheta }} \frac {\partial \mathcal{G}}{\partial \bar {U}_\shortparallel } - \frac {\partial \mathcal{F}}{\partial \bar {U}_\shortparallel } \frac {\partial \mathcal{G}}{\partial \bar {\varTheta }} \right ) \end{align}

by substituting (3.42) into (3.37). Substitution of (3.23) and (3.47) in (3.38) yields the following guiding-centre EOMs:

(3.48a)

\begin{align} \dot {\bar {{\boldsymbol{R}}}} &= \bar {U}_\shortparallel {\boldsymbol{b}}_{0}^{\star } + \frac {\bar {{M}}}{q B_{0,\shortparallel }^{\star }} {{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla }B_0 , \end{align}

(3.48b)

\begin{align} \dot {\bar {U}}_\shortparallel &= -\frac {\bar {{M}}}{m} {\boldsymbol{b}}_{0}^{\star } \boldsymbol{\cdot } \boldsymbol{\nabla }B_0, \end{align}

(3.48c)

\begin{align} \dot {\bar {{M}}} &= 0, \end{align}

(3.48d)

\begin{align} \dot {\bar {\varTheta }} &= \omega _{\mathrm{c}} + {\boldsymbol{w}}_0 \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{R}}}}. \end{align}

Note that whereas the guiding-centre EOMs still contain the fast gyrating motion for which the frequency is given by the cyclotron frequency $\omega _{\mathrm{c}}$ , this motion has been decoupled from the EOMs for the guiding-centre position and parallel velocity. This means that if one is not interested in the gyro-phase $\bar {\varTheta }$ , then the corresponding EOM can be omitted entirely, thereby resulting in a phase-space dimensionality reduction.
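A simple consistency check of (3.48) is that the guiding-centre kinetic energy (3.25) is conserved: for a static $B_0$, its rate of change is $m \bar{U}_\shortparallel \dot{\bar{U}}_\shortparallel + \bar{M} \boldsymbol{\nabla} B_0 \boldsymbol{\cdot} \dot{\bar{\boldsymbol{R}}}$. The sketch below evaluates the right-hand sides of (3.48) at a single phase-space point and confirms that this rate vanishes. All input values are arbitrary, hypothetical numbers, and $\omega_{\mathrm{c}} = q B_0 / m$ is assumed.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def guiding_centre_rhs(m, q, u_par, mu, b0, b_hat, b_star_vec, grad_b0, w0):
    """Right-hand sides of (3.48a)-(3.48d) at a single phase-space point."""
    b_star_par = dot(b_hat, b_star_vec)
    bs = [c / b_star_par for c in b_star_vec]      # b_0^star of (3.43)
    bxg = cross(b_hat, grad_b0)
    r_dot = [u_par * bs[i] + mu / (q * b_star_par) * bxg[i] for i in range(3)]
    u_dot = -mu / m * dot(bs, grad_b0)
    mu_dot = 0.0
    theta_dot = q * b0 / m + dot(w0, r_dot)        # assumes omega_c = q B_0 / m
    return r_dot, u_dot, mu_dot, theta_dot


# Arbitrary (hypothetical) pointwise values; b_hat has unit length.
m, q, u_par, mu, b0 = 2.0, 1.5, 0.7, 0.4, 1.7
b_hat = [2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0]
b_star_vec = [1.0, 0.4, 1.5]
grad_b0 = [0.2, -0.5, 0.3]
w0 = [0.3, -0.8, 0.5]

r_dot, u_dot, mu_dot, theta_dot = guiding_centre_rhs(
    m, q, u_par, mu, b0, b_hat, b_star_vec, grad_b0, w0)

# Rate of change of the kinetic energy (3.25) for a static background field.
energy_rate = m * u_par * u_dot + mu * dot(grad_b0, r_dot)
print(energy_rate)
```

The two contributions proportional to ${\boldsymbol{b}}_0^{\star} \boldsymbol{\cdot} \boldsymbol{\nabla} B_0$ cancel and the drift term is perpendicular to $\boldsymbol{\nabla} B_0$, so the result is zero up to round-off. Note also that `theta_dot` is never needed to advance the other three quantities, which is the decoupling described above.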

3.6. Discussion on guiding-centre coordinates

We compare the guiding-centre EOMs given by (3.48) to the EOMs in physical coordinates, as given by (3.6). When integrating (3.7) in time, we find that the physical particle position is given by

(3.49)

\begin{align} {\boldsymbol{R}}(t) = {} & {\boldsymbol{R}}(0) + U_\shortparallel (0) {{\boldsymbol{\hat {b}}}_0} t \nonumber \\ & + \frac {1}{\omega _{\mathrm{c}}} \big[ \sin (\omega _{\mathrm{c}} t) \big({\boldsymbol{\hat {e}}}_1 {\boldsymbol{\hat {e}}}_1^\intercal + {\boldsymbol{\hat {e}}}_2 {\boldsymbol{\hat {e}}}_2^\intercal \big) - (\cos (\omega _{\mathrm{c}} t) - 1) \big({\boldsymbol{\hat {e}}}_1 {\boldsymbol{\hat {e}}}_2^\intercal - {\boldsymbol{\hat {e}}}_2 {\boldsymbol{\hat {e}}}_1^\intercal \big) \big] {\boldsymbol{U}}_\perp (0), \end{align}

where we recall that this result holds only if ${\boldsymbol{B}}_0$ is constant. Under the same assumption, we find that the guiding-centre EOMs result in

(3.50)

\begin{equation} \dot {\bar {{\boldsymbol{R}}}} = \bar {U}_\shortparallel {{\boldsymbol{\hat {b}}}_0} , \quad \dot {\bar {U}}_\shortparallel = 0 , \quad \dot {\bar {{M}}} = 0 , \quad \dot {\bar {\varTheta }} = \omega _{\mathrm{c}} \end{equation}

which, upon integration in time, yields

(3.51)

\begin{equation} \bar {{\boldsymbol{R}}}(t) = \bar {{\boldsymbol{R}}}(0) + \bar {U}_\shortparallel (0) {{\boldsymbol{\hat {b}}}_0} t , \quad \bar {{M}}(t) = \bar {{M}}(0) , \quad \bar {\varTheta }(t) = \bar {\varTheta }(0) + \omega _{\mathrm{c}} t. \end{equation}

According to Brizard (1990, (2.58)), the velocity coordinates $(U_\shortparallel , {M}, \varTheta )$ transform trivially under the guiding-centre coordinate transformation whenever ${\boldsymbol{B}}_0$ is constant and, therefore, (3.51) can be written in physical coordinates as

(3.52)

\begin{equation} {\boldsymbol{R}}(t) = {\boldsymbol{R}}(0) + U_\shortparallel (0) {{\boldsymbol{\hat {b}}}_0} t + \frac {1}{\omega _{\mathrm{c}}}[ {\boldsymbol{\hat {\rho }}}(\varTheta ^0 + \omega _{\mathrm{c}} t)- {\boldsymbol{\hat {\rho }}}(\varTheta ^0) ] \bar {U}_\tau (0) \end{equation}

upon substitution of (3.8), (3.21) and (3.22) and letting $\varTheta (0) = \varTheta ^0$ . Here, we use the notational convention, as we do throughout this paper, that a superscripted ‘ $0$ ’ indicates the initial value. By making use of ${\boldsymbol{U}}_\perp (0) = U_\tau (0) {\boldsymbol{\hat {\tau }}}(\varTheta ^0)$ , which follows from (3.13), it can be shown that the solutions given by (3.49) and (3.52) are identical, thereby confirming that we have consistently decoupled the fast gyrating motion using the guiding-centre coordinate transformation.
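The decoupling can also be illustrated numerically for a constant ${\boldsymbol{B}}_0$. The sketch below evaluates the closed-form orbit (3.49) with ${\boldsymbol{\hat{e}}}_1$, ${\boldsymbol{\hat{e}}}_2$ and ${\boldsymbol{\hat{b}}}_0$ aligned with the Cartesian axes, subtracts the radial vector and checks that the remainder moves uniformly along ${\boldsymbol{\hat{b}}}_0$, as in (3.51). Since (3.8) is not reproduced here, the radial vector is assumed to be ${\boldsymbol{\rho}} = {\boldsymbol{\hat{b}}}_0 \times {\boldsymbol{U}} / \omega_{\mathrm{c}}$; this expression, the function names and all numerical values are assumptions made for illustration.

```python
import math


def particle_position(t, r0, u_par0, u_perp0, omega_c):
    """Closed-form orbit (3.49) with e_1 = x, e_2 = y and b_0 = z."""
    s, c = math.sin(omega_c * t), math.cos(omega_c * t)
    u1, u2 = u_perp0
    # (e1 e1^T + e2 e2^T) U_perp = (u1, u2, 0); (e1 e2^T - e2 e1^T) U_perp = (u2, -u1, 0).
    return [r0[0] + (s * u1 - (c - 1.0) * u2) / omega_c,
            r0[1] + (s * u2 + (c - 1.0) * u1) / omega_c,
            r0[2] + u_par0 * t]


def particle_velocity(t, u_par0, u_perp0, omega_c):
    """Time derivative of the orbit (3.49)."""
    s, c = math.sin(omega_c * t), math.cos(omega_c * t)
    u1, u2 = u_perp0
    return [c * u1 + s * u2, c * u2 - s * u1, u_par0]


def guiding_centre(t, r0, u_par0, u_perp0, omega_c):
    """R - rho, with the assumed radial vector rho = b_0 x U / omega_c."""
    r = particle_position(t, r0, u_par0, u_perp0, omega_c)
    v = particle_velocity(t, u_par0, u_perp0, omega_c)
    rho = [-v[1] / omega_c, v[0] / omega_c, 0.0]   # b_0 x U for b_0 = e_z
    return [r[i] - rho[i] for i in range(3)]


# Hypothetical initial data.
r0, u_par0, u_perp0, omega_c = [0.1, -0.2, 0.3], 0.5, (1.0, -0.4), 3.0
gc0 = guiding_centre(0.0, r0, u_par0, u_perp0, omega_c)
for t in (0.3, 1.7, 10.0):
    gc = guiding_centre(t, r0, u_par0, u_perp0, omega_c)
    drift = [gc[0] - gc0[0], gc[1] - gc0[1], gc[2] - gc0[2] - u_par0 * t]
    print(t, max(abs(d) for d in drift))
```

The printed deviations are at round-off level: the oscillatory part of (3.49) is carried entirely by the radial vector, whereas the guiding centre only streams along the field line.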

4. Gyrocentre single-particle phase-space Lagrangian

Thus far, we have discussed a model for the motion of a charged particle in the presence of a stationary background magnetic field ${\boldsymbol{B}}_0$ , where the introduction of the guiding-centre coordinates has resulted in decoupling the fast gyration and has furthermore resulted in a phase-space dimensionality reduction. However, the moving charged particle itself deposits a charge and current, and thereby generates an electromagnetic field, which in turn affects the motion of the particle. In this section, we introduce a ‘perturbation’ to the guiding-centre single-particle phase-space Lagrangian in the form of time-varying electromagnetic potentials, which in § 5, allows us to derive a self-consistent formulation of the proposed gyrokinetic model.

4.1. Perturbed guiding-centre Lagrangian

In physical coordinates, the perturbed guiding-centre Lagrangian is given by

(4.1)

\begin{equation} \bar {L}_1^\dagger \mathrel {\mathop :}= q {\boldsymbol{A}}_1({\boldsymbol{R}}, t) \boldsymbol{\cdot } \,\dot {{ {\!\boldsymbol{R}}}} - q \phi _1({\boldsymbol{R}}, t), \end{equation}

where $ {\boldsymbol{A}}_1$ and $\phi _1$ are the perturbed vector and scalar potentials, which result in the perturbed electric and magnetic fields. The perturbed fields are assumed to be small compared with ${\boldsymbol{B}}_0$ , i.e. $\varepsilon _\delta \ll 1$ . Note that the superscripted ${}^\dagger$ distinguishes this Lagrangian from the final perturbed guiding-centre Lagrangian, from which the total derivative of some function has been subtracted.

Using (3.21), we find that

(4.2)

\begin{equation} {\boldsymbol{R}} = \bar {{\boldsymbol{R}}} + {\boldsymbol{\rho }}, \end{equation}

which expresses the particle position $\boldsymbol{R}$ in terms of the guiding-centre position $\bar {{\boldsymbol{R}}}$ and the radial vector $\boldsymbol{\rho }$ . We introduce the following compact notation to indicate the evaluation of a scalar function, the gradient of a scalar function or a vector field at the particle position:

(4.3)

\begin{equation} \mathring {Q} \mathrel {\mathop :}= Q(\bar {{\boldsymbol{R}}} + {\boldsymbol{\rho }}), \quad \mathring {\boldsymbol{\nabla }} Q \mathrel {\mathop :}= (\boldsymbol{\nabla }Q)(\bar {{\boldsymbol{R}}} + {\boldsymbol{\rho }}), \quad \mathring {S}_\tau \mathrel {\mathop :}= {\boldsymbol{S}}(\bar {{\boldsymbol{R}}} + {\boldsymbol{\rho }}) \boldsymbol{\cdot } {\boldsymbol{\hat {\tau }}}(\bar {{\boldsymbol{R}}}). \end{equation}

When considering figure 1, one might expect that the coordinate vector $\boldsymbol{\hat {\tau }}$ should be evaluated at the particle position $\bar {{\boldsymbol{R}}} + {\boldsymbol{\rho }}$ . However, from the derivation of the model, it turns out that the evaluation is always done at the guiding-centre position $\bar {{\boldsymbol{R}}}$ which is equivalent to the evaluation at the particle position up to an $O(\varepsilon _B)$ contribution.

When making use of (4.3), it follows that (4.1) can be written in guiding-centre coordinates as

(4.4)

\begin{equation} \bar {L}_1^\dagger = q\, \mathring {\!{\boldsymbol{A}}}_1 \boldsymbol{\cdot } (\dot {\bar {{\boldsymbol{R}}}} + \dot {{\boldsymbol{\rho }}}) - q \mathring {\phi }_1. \end{equation}

Note that both $\boldsymbol{\rho }$ and $\mathring {\!{\boldsymbol{A}}}_1$ (via $\boldsymbol{\rho }$ ) depend on the gyro-phase $\bar {\varTheta }$ , and, therefore, we are in need of a third coordinate transformation that is aimed at removing the gyro-phase dependence of the perturbed guiding-centre Lagrangian and results in the gyrocentre phase-space coordinates $\bar {\bar {{\boldsymbol{Z}}}} = (\bar {\bar {\boldsymbol{R}}}, \bar {\bar {U}}_\shortparallel , \bar {\bar {{M}}}, \bar {\bar {\varTheta }})$ .

4.2. Gyrocentre coordinate transformation

Before we can perform the gyrocentre coordinate transformation, however, we must briefly discuss Lie transformations (Dragt & Finn 1976; Littlejohn 1982; Cary & Littlejohn 1983), which are used for this purpose. We consider second-order Lie transformations, which are phase-space coordinate transformations of the form

(4.5)

\begin{equation} \bar {{\boldsymbol{Z}}} = \bar {\bar {{\boldsymbol{Z}}}} - \bar {\bar {{\boldsymbol{G}}}}_1 + \frac {1}{2} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 - \bar {\bar {{\boldsymbol{G}}}}_2, \end{equation}

where $\bar {\bar {{\boldsymbol{G}}}}_1$ and $\bar {\bar {{\boldsymbol{G}}}}_2$ are the first- and second-order generating vectors.

The resulting gyrocentre Lagrangian is defined such that

(4.6)

\begin{equation} \bar {\bar {L}}(\bar {\bar {{\boldsymbol{Z}}}}, \dot {\bar {\bar {{\boldsymbol{Z}}}}}) = \bar {L}(\bar {{\boldsymbol{Z}}}, \dot {\bar {{\boldsymbol{Z}}}}) + \frac {{\mathrm{d}} \bar {\bar {S}}}{{\mathrm{d}} t} + O\big(\varepsilon _\delta ^3\big), \end{equation}

where we have added the total derivative of a generating function $\bar {\bar {S}} = \bar {\bar {S}}_1 + \bar {\bar {S}}_2$ to the gyrocentre Lagrangian

(4.7)

\begin{equation} \bar {\bar {L}} \mapsto \bar {\bar {L}} + \frac {{\mathrm{d}} \bar {\bar {S}}}{{\mathrm{d}} t}, \end{equation}

resulting in the following additions to the Hamiltonian and symplectic part:

(4.8)

\begin{equation} \bar {\bar {H}} \mapsto \bar {\bar {H}} - \frac {\partial \bar {\bar {S}}}{\partial t} , \quad \bar {\bar {{\boldsymbol{\gamma }}}} \mapsto \bar {\bar {{\boldsymbol{\gamma }}}} + \frac {\partial \bar {\bar {S}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}}. \end{equation}

This results in the following gyrocentre Hamiltonian $\bar {\bar {H}} = \bar {\bar {H}}_0 + \bar {\bar {H}}_1 + \bar {\bar {H}}_2$ :

(4.9a)

\begin{align} \bar {\bar {H}}_0 &= \bar {H}_0,\\[-8pt]\nonumber \end{align}

(4.9b)

\begin{align} \bar {\bar {H}}_1 &= \bar {H}_1 - \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 - \frac {\partial \bar {\bar {S}}_1}{\partial t},\\[-8pt]\nonumber \end{align}

(4.9c)

\begin{align} \bar {\bar {H}}_2 &= \bar {H}_2 - \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_2 - \left [ \frac {\partial }{\partial t} \left ( \bar {{\boldsymbol{\gamma }}}_1 - \frac {1}{2} \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) + \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}}\left ( \bar {H}_1 - \frac {1}{2} \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}}\boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ]\nonumber\\[4pt]&\quad \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 - \frac {\partial \bar {\bar {S}}_2}{\partial t}, \end{align}

whereas the symplectic part $\bar {\bar {{\boldsymbol{\gamma }}}} = \bar {\bar {{\boldsymbol{\gamma }}}}_0 + \bar {\bar {{\boldsymbol{\gamma }}}}_1 + \bar {\bar {{\boldsymbol{\gamma }}}}_2$ is given by

(4.10a)

\begin{align} \bar {\bar {{\boldsymbol{\gamma }}}}_0 &= \bar {{\boldsymbol{\gamma }}}_0 ,\\[-10pt]\nonumber \end{align}

(4.10b)

\begin{align} \bar {\bar {{\boldsymbol{\gamma }}}}_1 &= \bar {{\boldsymbol{\gamma }}}_1 + \bar { \unicode{x1D652}}_0 \bar {\bar {{\boldsymbol{G}}}}_1 + \frac {\partial \bar {\bar {S}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}},\\[-10pt]\nonumber \end{align}

(4.10c)

\begin{align} \bar {\bar {{\boldsymbol{\gamma }}}}_2 &= \bar {{\boldsymbol{\gamma }}}_2 + \bar { \unicode{x1D652}}_0 \bar {\bar {{\boldsymbol{G}}}}_2 + \frac {1}{2}\big(\bar { \unicode{x1D652}}_1 + \bar {\bar { \unicode{x1D652}}}_1\big) \bar {\bar {{\boldsymbol{G}}}}_1 + \frac {\partial \bar {\bar {S}}_2}{\partial \bar {\bar {{\boldsymbol{Z}}}}}. \end{align}

We use the Lagrange matrix $\bar { \unicode{x1D652}}_0$ , as given by (3.35) and (3.39), and have analogously defined the perturbed Lagrange matrices as

(4.11a)

\begin{align} \bar { \unicode{x1D652}}_1 &\mathrel {\mathop :}= \left ( \frac {\partial \bar {{\boldsymbol{\gamma }}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal - \frac {\partial \bar {{\boldsymbol{\gamma }}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}},\\[-5pt]\nonumber \end{align}

(4.11b)

\begin{align} \bar {\bar { \unicode{x1D652}}}_1 &\mathrel {\mathop :}= \left ( \frac {\partial \bar {\bar {{\boldsymbol{\gamma }}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal - \frac {\partial \bar {\bar {{\boldsymbol{\gamma }}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} = \bar { \unicode{x1D652}}_1 + \left [ \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}} \big( \bar { \unicode{x1D652}}_0 \bar {\bar {{\boldsymbol{G}}}}_1 \big) \right ]^\intercal - \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}} \big( \bar { \unicode{x1D652}}_0 \bar {\bar {{\boldsymbol{G}}}}_1 \big) . \end{align}

These transformation rules are classical results that can be obtained using Lie transform methods (Cary & Littlejohn 1983), but can also be derived using Taylor series expansions, as shown in Appendix B. Note that the contribution due to the first-order generating function vanishes in (4.11b) because the skew-symmetric part of the Hessian matrix of $\bar {\bar {S}}_1$ vanishes.
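The last remark can be illustrated with a small numerical sketch: adding the gradient of a scalar function to the symplectic part leaves the Lagrange matrix unchanged. The three-dimensional symplectic part `gamma` and the scalar function $S = x^2 y + x \cos u$ below are hypothetical toy choices, not quantities from the model, and the Lagrange matrix is approximated by central differences.

```python
import math


def lagrange_matrix(gamma, z, h=1e-4):
    """Finite-difference Lagrange matrix (d gamma / d Z)^T - d gamma / d Z."""
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z_plus, z_minus = list(z), list(z)
        z_plus[j] += h
        z_minus[j] -= h
        f_plus, f_minus = gamma(z_plus), gamma(z_minus)
        for i in range(n):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return [[jac[j][i] - jac[i][j] for j in range(n)] for i in range(n)]


def gamma(z):
    """Hypothetical symplectic part used only for this check."""
    x, y, u = z
    return [y * u, math.sin(x), x * y]


def grad_s(z):
    """Analytic gradient of the hypothetical function S = x^2 y + x cos(u)."""
    x, y, u = z
    return [2.0 * x * y + math.cos(u), x * x, -x * math.sin(u)]


def gamma_plus_grad_s(z):
    return [g + s for g, s in zip(gamma(z), grad_s(z))]


z = [0.4, -0.7, 1.1]
W_plain = lagrange_matrix(gamma, z)
W_gauged = lagrange_matrix(gamma_plus_grad_s, z)
max_diff = max(abs(W_plain[i][j] - W_gauged[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The difference between the two matrices is the skew-symmetric part of the finite-difference Hessian of $S$, which is zero up to the truncation error of the difference formula.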

4.3. General form of the gyrocentre coordinate transformation

The generating vectors $\bar {\bar {{\boldsymbol{G}}}}_1$ and $\bar {\bar {{\boldsymbol{G}}}}_2$ , which are used to define the gyrocentre coordinate transformation, are chosen to satisfy some desired form of the symplectic part $\bar {\bar {{\boldsymbol{\gamma }}}}_1, \bar {\bar {{\boldsymbol{\gamma }}}}_2$ of the Lagrangian by inverting (4.10b) and (4.10c), respectively. This yields a transformation of the Hamiltonian part of the Lagrangian, as given by (4.9), where the generating vectors re-introduce gyro-phase dependence in the gyrocentre Hamiltonian. The role of the generating functions $\bar {\bar {S}}_1$ and $\bar {\bar {S}}_2$ is to absorb the gyro-phase-dependent part of the resulting gyrocentre Hamiltonian.

4.3.1. First-order transformation

Without specifying the desired form of $\bar {\bar {{\boldsymbol{\gamma }}}}_1$ , we find that this approach results in the following first-order generating vector field:

(4.12)

\begin{equation} \bar {\bar {{\boldsymbol{G}}}}_1 = \,\,\bar {\!\! \unicode{x1D645}}_0 \left ( \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 - \frac {\partial \bar {\bar {S}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )\!, \end{equation}

which, upon substitution in (4.9b), results in the following first-order Hamiltonian:

(4.13)

\begin{equation} \bar {\bar {H}}_1 = \bar {H}_1 + \left ( \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \right ) \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} - \frac {\partial \bar {\bar {S}}_1}{\partial t} - \big\{\bar {\bar {S}}_1, \bar {\bar {H}}_0\big\}_{0} = q \psi _1 - \frac {\partial \bar {\bar {S}}_1}{\partial t} - \big\{\bar {\bar {S}}_1, \bar {\bar {H}}_0\big\}_{0}, \end{equation}

where we have used (3.34) and (3.37), where $\dot {\bar {{\boldsymbol{Z}}}}$ denotes the unperturbed guiding-centre EOMs (3.48) evaluated at the gyrocentre coordinate $\bar {\bar {{\boldsymbol{Z}}}}$ , and where we have defined the effective potential as

(4.14)

\begin{equation} q \psi _1 \mathrel {\mathop :}= \bar {H}_1 + \left ( \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \right ) \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}}. \end{equation}
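The step from (4.9b) and (4.12) to (4.13) rests only on the skew-symmetry of the Poisson matrix. The sketch below checks the underlying identity $-(\partial \bar{H}_0 / \partial \bar{\bar{\boldsymbol{Z}}}) \boldsymbol{\cdot} \bar{\bar{\boldsymbol{G}}}_1 = (\bar{\bar{\boldsymbol{\gamma}}}_1 - \bar{\boldsymbol{\gamma}}_1) \boldsymbol{\cdot} \dot{\bar{\boldsymbol{Z}}} - \{\bar{\bar{S}}_1, \bar{\bar{H}}_0\}_0$ with random data. The random skew-symmetric matrix is a stand-in for $\,\,\bar{\!\!\unicode{x1D645}}_0$ and the random vectors are stand-ins for the gradients; none of them are model quantities.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(mat, v):
    return [dot(row, v) for row in mat]


def first_order_identity(seed=0, n=6):
    """Return both sides of the identity behind (4.13) for random stand-in data."""
    rng = random.Random(seed)
    # Random skew-symmetric stand-in for the Poisson matrix.
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = rng.uniform(-1.0, 1.0)
            J[j][i] = -J[i][j]
    grad_h0 = [rng.uniform(-1.0, 1.0) for _ in range(n)]      # dH_0/dZ
    grad_s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]      # dS_1/dZ
    delta_gamma = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # gyrocentre minus guiding-centre gamma_1

    g1 = matvec(J, [dg - ds for dg, ds in zip(delta_gamma, grad_s1)])  # (4.12)
    z_dot = matvec(J, grad_h0)                # (3.34) with a static gamma_0
    bracket = dot(grad_s1, matvec(J, grad_h0))  # {S_1, H_0}_0 as in (3.37)

    lhs = -dot(grad_h0, g1)                   # generating-vector term in (4.9b)
    rhs = dot(delta_gamma, z_dot) - bracket   # corresponding terms in (4.13)
    return lhs, rhs


lhs, rhs = first_order_identity()
print(lhs, rhs)
```

Both numbers agree up to round-off for any seed, which reflects that the rewriting requires no property of the guiding-centre Poisson matrix beyond its skew-symmetry.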

We let the generating function $\bar {\bar {S}}_1$ absorb the gyro-phase dependent part of $\psi _1$ such that (4.13) results in

(4.15)

\begin{equation} \frac {\partial \bar {\bar {S}}_1}{\partial t} + \big\{\bar {\bar {S}}_1, \bar {\bar {H}}_0\big\}_{0} = q \widetilde {\psi }_1 \quad \implies \quad \bar {\bar {H}}_1 = q \langle \psi _1 \rangle , \end{equation}

where we define the gyro-average and the resulting gyro-phase-dependent part of some function $Q$ as

(4.16)

\begin{equation} \langle Q \rangle \mathrel {\mathop :}= \frac {1}{2\pi } \int _0^{2\pi } Q\, \mathrm{d} \bar {\bar {\varTheta }}, \quad \widetilde {Q} \mathrel {\mathop :}= Q - \langle Q \rangle , \end{equation}

which is defined component-wise for vector fields. It follows that the first-order part of the gyrocentre single-particle phase-space Lagrangian is given by

(4.17)

\begin{equation} \bar {\bar {L}}_1 = \bar {\bar {{\boldsymbol{\gamma }}}}_1 \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} - \bar {\bar {H}}_1 = \bar {\bar {{\boldsymbol{\gamma }}}}_1 \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} + \langle \bar {{\boldsymbol{\gamma }}}_1 - \bar {\bar {{\boldsymbol{\gamma }}}}_1 \rangle \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} - \langle \bar {H}_1 \rangle . \end{equation}
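The gyro-average and the gyro-phase-dependent part defined in (4.16) are straightforward to approximate numerically. The following minimal sketch uses the trapezoidal rule on equispaced gyro-phases, which is a natural choice for periodic integrands; the function names, the number of quadrature points and the two test functions are assumptions made for illustration. The second test function mimics a plane-wave perturbation sampled on the gyro-ring with $\varepsilon_\perp = k_\perp \rho = 1$, for which the gyro-average is known to be the Bessel function $J_0(k_\perp \rho)$.

```python
import math


def gyro_average(func, n=64):
    """Approximate <Q> of (4.16) by the trapezoidal rule on n equispaced gyro-phases."""
    return sum(func(2.0 * math.pi * k / n) for k in range(n)) / n


def fluctuating_part(func, n=64):
    """Return the gyro-phase-dependent part Q - <Q> of (4.16) as a function."""
    average = gyro_average(func, n)
    return lambda theta: func(theta) - average


def q_test(theta):
    """Hypothetical test function with gyro-average equal to 2."""
    return 2.0 + math.cos(theta) + 0.5 * math.sin(2.0 * theta)


def plane_wave_on_ring(theta, k_perp_rho=1.0):
    """cos(k_perp rho cos(theta)): a plane wave sampled on the gyro-ring."""
    return math.cos(k_perp_rho * math.cos(theta))


q_avg = gyro_average(q_test)
q_tilde = fluctuating_part(q_test)
q_tilde_avg = gyro_average(q_tilde)
wave_avg = gyro_average(plane_wave_on_ring)
print(q_avg, q_tilde_avg, wave_avg)
```

By construction, the gyro-average of the fluctuating part vanishes, which is the property used when the generating function absorbs $\widetilde{\psi}_1$ in (4.15).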

It is insightful to consider the two limiting cases of (4.17): if $\bar {\bar {{\boldsymbol{\gamma }}}}_1 = {\boldsymbol{0}}_6$ , then the gyrocentre coordinate transformation transforms the entire symplectic part of the first-order guiding-centre Lagrangian to the Hamiltonian part of the Lagrangian (this is referred to as the Hamiltonian formulation)

(4.18a)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 = {\boldsymbol{0}}_6 \quad \implies \quad \bar {\bar {H}}_1 = \langle \bar {H}_1 \rangle - \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} \end{equation}

and, conversely, if $\bar {\bar {{\boldsymbol{\gamma }}}}_1 = \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle$ , then the symplectic and Hamiltonian parts of the first-order guiding-centre Lagrangian simply end up being gyro-averaged

(4.18b)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 = \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle \quad \implies \quad \bar {\bar {H}}_1 = \langle \bar {H}_1 \rangle . \end{equation}

4.3.2. Second-order transformation

We follow the same approach for deriving the second-order Hamiltonian. That is, we solve (4.10c) for $\bar {\bar {{\boldsymbol{G}}}}_2$ resulting in

(4.19)

\begin{equation} \bar {\bar {{\boldsymbol{G}}}}_2 = \,\,\bar {\!\! \unicode{x1D645}}_0 \left [ \bar {\bar {{\boldsymbol{\gamma }}}}_2 - \frac {1}{2}\big(\bar { \unicode{x1D652}}_1 + \bar {\bar { \unicode{x1D652}}}_1\big) \bar {\bar {{\boldsymbol{G}}}}_1 - \frac {\partial \bar {\bar {S}}_2}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right ]\!, \end{equation}

without specifying $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ and by making use of $\bar {{\boldsymbol{\gamma }}}_2 = {\boldsymbol{0}}_6$ . This allows us to express the second-order Hamiltonian (4.9c) in the following way:

(4.20)

\begin{equation} \bar {\bar {H}}_2 = \bar {\bar {{\boldsymbol{\gamma }}}}_2 \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} + {\boldsymbol{T}}_1 \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 - \frac {\partial \bar {\bar {S}}_2}{\partial t} - \big\{\bar {\bar {S}}_2, \bar {H}_0\big\}_{0}, \end{equation}

where we have made use of $\bar {H}_2 = 0$ , and we have defined

(4.21)

\begin{equation} {\boldsymbol{T}}_1 \mathrel {\mathop :}= \frac {1}{2} \big(\bar { \unicode{x1D652}}_1 + \bar {\bar { \unicode{x1D652}}}_1\big) \dot {\bar {{\boldsymbol{Z}}}} - \frac {\partial }{\partial t} \left ( \bar {{\boldsymbol{\gamma }}}_1 - \frac {1}{2} \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) - \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}}\left ( \bar {H}_1 - \frac {1}{2} \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \end{equation}

by making use of (3.34).

As with the first-order generating function, the second-order generating function $\bar {\bar {S}}_2$ is defined such that it absorbs the gyro-phase-dependent part of $\bar {\bar {H}}_2$ resulting in

(4.22)

\begin{equation} \frac {\partial \bar {\bar {S}}_2}{\partial t} + \big\{\bar {\bar {S}}_2, \bar {H}_0\big\}_{0} = \widetilde {\bar {\bar {{\boldsymbol{\gamma }}}}_2} \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} + \widetilde {{\boldsymbol{T}}_1 \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1} \end{equation}

and, therefore,

(4.23)

\begin{equation} \bar {\bar {H}}_2 = \langle \bar {\bar {{\boldsymbol{\gamma }}}}_2 \rangle \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} + \langle {\boldsymbol{T}}_1 \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \rangle . \end{equation}

To summarise, we have thus far considered a general gyrocentre coordinate transformation, where we are still free to choose the symplectic parts $\bar {\bar {{\boldsymbol{\gamma }}}}_1$ and $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ . The resulting first- and second-order Hamiltonians are given by

(4.24)

\begin{equation} \bar {\bar {H}}_1 = \langle \bar {H}_1 \rangle + \langle \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \rangle \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} \end{equation}

as well as (4.23), respectively, as follows from (4.14) and (4.15). For consistency, we require that $\bar {\bar {{\boldsymbol{\gamma }}}}_1$ and $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ are $O(\varepsilon _\delta )$ and $O\big(\varepsilon _\delta ^2\big)$, respectively. As the purpose of the gyrocentre coordinate transformation is to decouple the gyro-phase from the perturbed Lagrangian, we must also require $\widetilde {\bar {\bar {{\boldsymbol{\gamma }}}}_1} = \widetilde {\bar {\bar {{\boldsymbol{\gamma }}}}_2} = {\boldsymbol{0}}_6$. Moreover, we require the magnetic moment to remain an invariant in gyrocentre coordinates. The corresponding requirement on the coordinate transformation can be found by considering the Euler–Lagrange equation for $\bar {\bar {\varTheta }}$

(4.25)

\begin{equation} \frac {{\mathrm{d}} }{{\mathrm{d}} t} \frac {\partial \bar {\bar {L}}}{\partial \dot {\bar {\bar {\varTheta }}}} = \frac {\partial \bar {\bar {L}}}{\partial \bar {\bar {\varTheta }}} \quad \implies \quad \frac {{\mathrm{d}} \bar {\bar {\gamma }}_{\varTheta }}{{\mathrm{d}} t} = 0 \quad \implies \quad \dot {\bar {\bar {{M}}}} = - \frac {q}{m} \frac {{\mathrm{d}} }{{\mathrm{d}} t} \big(\bar {\bar {\gamma }}_{1,\varTheta } + \bar {\bar {\gamma }}_{2,\varTheta }\big) , \end{equation}

which shows that $\bar {\bar {\gamma }}_{1,\varTheta } + \bar {\bar {\gamma }}_{2,\varTheta } = 0$ is sufficient for obtaining invariance of $\bar {\bar {{M}}}$ . In what follows, we discuss a fourth requirement on $\bar {\bar {{\boldsymbol{\gamma }}}}_1$ and $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ , which ensures that the resulting gyrocentre single-particle phase-space Lagrangian is gauge-invariant.

4.4. Gauge invariance

Gauge invariance refers to invariance under the gauge transformation

(4.26)

\begin{equation} \phi _1 \mapsto \phi _1 - \frac {\partial \eta }{\partial t}, \quad {\boldsymbol{A}}_1 \mapsto {\boldsymbol{A}}_1 + \boldsymbol{\nabla }\eta \end{equation}

for some scalar function $\eta$ . The electromagnetic fields, as given by

(4.27a)

\begin{align} {\boldsymbol{E}}_1 &\mathrel {\mathop :}= - \boldsymbol{\nabla }\phi _1 -\frac {\partial {\boldsymbol{A}}_1}{\partial t}, \end{align}

(4.27b)

\begin{align} {\boldsymbol{B}}_1 &\mathrel {\mathop :}= \boldsymbol{\nabla }\times {\boldsymbol{A}}_1 \end{align}

are invariant under the gauge transformation (4.26), from which it follows that any (part of a) model which is expressed in terms of the electromagnetic fields is automatically gauge-invariant. If a model is gauge-invariant, it means that it does not matter which gauge condition is used to fix the function $\eta$ , which is what we would expect from a physical point of view.
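
The invariance can be checked directly by substituting (4.26) into (4.27). The only ingredients are that partial derivatives commute and that the curl of a gradient vanishes, where $\eta$ is assumed to be sufficiently smooth:

\begin{align} {\boldsymbol{E}}_1 &\mapsto - \boldsymbol{\nabla }\left ( \phi _1 - \frac {\partial \eta }{\partial t} \right ) - \frac {\partial }{\partial t} \left ( {\boldsymbol{A}}_1 + \boldsymbol{\nabla }\eta \right ) = {\boldsymbol{E}}_1 + \boldsymbol{\nabla }\frac {\partial \eta }{\partial t} - \frac {\partial }{\partial t} \boldsymbol{\nabla }\eta = {\boldsymbol{E}}_1 , \nonumber \\ {\boldsymbol{B}}_1 &\mapsto \boldsymbol{\nabla }\times \left ( {\boldsymbol{A}}_1 + \boldsymbol{\nabla }\eta \right ) = {\boldsymbol{B}}_1 + \boldsymbol{\nabla }\times \boldsymbol{\nabla }\eta = {\boldsymbol{B}}_1 . \end{align}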

Following the discussion by Burby & Brizard (2019), we introduce the following parametrised perturbed Lagrangian:

(4.28)

\begin{equation} {\bar {L}}^{\dagger ,\varsigma }_1 \mathrel {\mathop :}= q \mathring {{\boldsymbol{A}}}^\varsigma _1 \boldsymbol{\cdot } ( \dot {\bar {{\boldsymbol{R}}}} + \varsigma \dot {{\boldsymbol{\rho }}} ) - q \mathring {\phi }^\varsigma _1, \end{equation}

where we have defined

(4.29)

\begin{equation} \mathring {Q}^\varsigma \mathrel {\mathop :}= Q(\bar {{\boldsymbol{R}}} + \varsigma {\boldsymbol{\rho }}). \end{equation}

The $\varsigma$ parameter, therefore, interpolates from the guiding-centre position ($\varsigma = 0$) to the particle position ($\varsigma = 1$), see also figure 1. It follows that the perturbed guiding-centre Lagrangian, as given by (4.1), coincides with the parametrised Lagrangian evaluated at $\varsigma = 1$ and can therefore be written as

(4.30)

\begin{equation} \bar {L}_1^\dagger = \bar {L}_1^{\dagger ,\varsigma = 1} = \bar {L}_1^{\dagger ,\varsigma = 0} + \big(\bar {L}_1^{\dagger ,\varsigma = 1} - \bar {L}_1^{\dagger ,\varsigma = 0}\big) = \underbrace {\bar {L}_1^{\dagger ,\varsigma = 0}}_{\bar {L}_1^{\dagger ,\mathrm{ZLR}}} + \underbrace {\int _0^1 \frac {{\mathrm{d}} {\bar {L}}^{\dagger ,\varsigma }_1}{{\mathrm{d}} \varsigma }\, \mathrm{d} \varsigma }_{\bar {L}_1^{\dagger ,\mathrm{FLR}}}, \end{equation}

that is, as the sum of a zero Larmor radius (ZLR) contribution and a finite Larmor radius (FLR) contribution.

Computation of the $\varsigma$ derivative of the parametrised Lagrangian yields

(4.31)

\begin{equation} \frac {{\mathrm{d}} {\bar {L}}^{\dagger ,\varsigma }_1}{{\mathrm{d}} \varsigma } = q [ (\mathring {\boldsymbol{\nabla} }^\varsigma {\boldsymbol{A}}_1)^\intercal {\boldsymbol{\rho }} ] \boldsymbol{\cdot } (\dot {\bar {{\boldsymbol{R}}}} + \varsigma \dot {{\boldsymbol{\rho }}}) + q \mathring {{\boldsymbol{A}}}^\varsigma _1 \boldsymbol{\cdot } \dot {{\boldsymbol{\rho }}} - q \mathring {\boldsymbol{\nabla} }^\varsigma \phi _1 \boldsymbol{\cdot } {\boldsymbol{\rho }}. \end{equation}

Furthermore, we note that

(4.32)

\begin{align} \frac {{\mathrm{d}} }{{\mathrm{d}} t} \big({\boldsymbol{\rho }} \boldsymbol{\cdot } \mathring {{\boldsymbol{A}}}^\varsigma _1\big) &= \dot {{\boldsymbol{\rho }}} \boldsymbol{\cdot } \mathring {{\boldsymbol{A}}}^\varsigma _1 + {\boldsymbol{\rho }} \boldsymbol{\cdot } \frac {{\mathrm{d}} }{{\mathrm{d}} t} {\boldsymbol{A}}_1(\bar {{\boldsymbol{R}}} + \varsigma {\boldsymbol{\rho }}, t)\nonumber\\&= \dot {{\boldsymbol{\rho }}} \boldsymbol{\cdot } \mathring {{\boldsymbol{A}}}^\varsigma _1 + {\boldsymbol{\rho }} \boldsymbol{\cdot } \bigg[ \frac {\partial \mathring {{\boldsymbol{A}}}^\varsigma _1}{\partial t} + (\mathring {\boldsymbol{\nabla} }^\varsigma {\boldsymbol{A}}_1)^\intercal (\dot {\bar {{\boldsymbol{R}}}} + \varsigma \dot {{\boldsymbol{\rho }}}) \bigg], \end{align}

from which it follows that

(4.33)

\begin{equation} \frac {{\mathrm{d}} {\bar {L}}^{\dagger ,\varsigma }_1}{{\mathrm{d}} \varsigma } - q\frac {{\mathrm{d}} }{{\mathrm{d}} t} \big({\boldsymbol{\rho }} \boldsymbol{\cdot } \mathring {{\boldsymbol{A}}}^\varsigma _1\big) = q {\boldsymbol{\rho }} \boldsymbol{\cdot } \big[ (\dot {\bar {{\boldsymbol{R}}}} + \varsigma \dot {{\boldsymbol{\rho }}}) \times \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 + \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 \big], \end{equation}

and therefore the FLR part of the perturbed guiding-centre Lagrangian can, up to a total derivative, be expressed in terms of the gauge-invariant electromagnetic fields.
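
The step from (4.31) and (4.32) to (4.33) can be spelled out in index notation. The sketch below introduces the shorthand ${\boldsymbol{v}}^\varsigma \mathrel {\mathop :}= \dot {\bar {{\boldsymbol{R}}}} + \varsigma \dot {{\boldsymbol{\rho }}}$ (used here only for brevity), sums over repeated indices, evaluates all field derivatives at $\bar {{\boldsymbol{R}}} + \varsigma {\boldsymbol{\rho }}$ and uses the convention $[(\boldsymbol{\nabla }{\boldsymbol{A}}_1)^\intercal {\boldsymbol{v}}]_i = v_j \partial _j A_{1,i}$ implied by (4.32). The terms $q \mathring {{\boldsymbol{A}}}^\varsigma _1 \boldsymbol{\cdot } \dot {{\boldsymbol{\rho }}}$ cancel, which leaves

\begin{align} \frac {{\mathrm{d}} {\bar {L}}^{\dagger ,\varsigma }_1}{{\mathrm{d}} \varsigma } - q\frac {{\mathrm{d}} }{{\mathrm{d}} t} \big({\boldsymbol{\rho }} \boldsymbol{\cdot } \mathring {{\boldsymbol{A}}}^\varsigma _1\big) &= q \rho _j v^\varsigma _i \big( \partial _j A_{1,i} - \partial _i A_{1,j} \big) - q {\boldsymbol{\rho }} \boldsymbol{\cdot } \left ( \boldsymbol{\nabla }\phi _1 + \frac {\partial {\boldsymbol{A}}_1}{\partial t} \right ) \nonumber \\ &= q {\boldsymbol{\rho }} \boldsymbol{\cdot } \big( {\boldsymbol{v}}^\varsigma \times \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \big) + q {\boldsymbol{\rho }} \boldsymbol{\cdot } \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 , \end{align}

where the second equality uses the identity $[{\boldsymbol{v}} \times (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1)]_j = v_i ( \partial _j A_{1,i} - \partial _i A_{1,j} )$ together with the definitions (4.27).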

In what follows, we omit the contribution by the total derivative, as this does not alter the resulting EOMs after imposing the principle of least action. We denote the resulting perturbed guiding-centre Lagrangian by $\bar {L}_1$ for which

(4.34)

\begin{equation} \bar {L}_1 \mathrel {\mathop :}= \bar {L}_1^\dagger - q\, \frac {{\mathrm{d}} }{{\mathrm{d}} t} \int _0^1 {\boldsymbol{\rho }} \boldsymbol{\cdot } \mathring {{\boldsymbol{A}}}^\varsigma _1 \,\mathrm{d}\varsigma . \end{equation}

Therefore,

(4.35a)

\begin{equation} \bar {L}_1 = \big(\bar {{\boldsymbol{\gamma }}}_{1}^{\mathrm{ZLR}} + \bar {{\boldsymbol{\gamma }}}_{1}^{\mathrm{FLR}}\big) \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}} - \big(\bar {H}_1^{\mathrm{ZLR}} + \bar {H}_1^{\mathrm{FLR}}\big), \end{equation}

where the symplectic part $\bar {{\boldsymbol{\gamma }}}_{1} \mathrel {\mathop :}= \bar {{\boldsymbol{\gamma }}}_{1}^{\mathrm{ZLR}} + \bar {{\boldsymbol{\gamma }}}_{1}^{\mathrm{FLR}}$ is given by

(4.35b)

\begin{equation} \bar {{\boldsymbol{\gamma }}}_{1,{\boldsymbol{R}}}^{\mathrm{ZLR}} \mathrel {\mathop :}= q{\boldsymbol{A}}_1 , \quad \bar {{\boldsymbol{\gamma }}}_{1,{\boldsymbol{R}}}^{\mathrm{FLR}} \mathrel {\mathop :}= q \int _0^1 \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \mathrm{d} \varsigma \times {\boldsymbol{\rho }} , \quad \bar {\gamma }_{1,\varTheta }^{\mathrm{FLR}} \mathrel {\mathop :}= - q \rho ^2 \int _0^1 \varsigma \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \mathrm{d} \varsigma \boldsymbol{\cdot } {{\boldsymbol{\hat {b}}}_0}, \end{equation}

and the Hamiltonian part $\bar {H}_1 \mathrel {\mathop :}= \bar {H}_1^{\mathrm{ZLR}} + \bar {H}_1^{\mathrm{FLR}}$ is given by

(4.35c)

\begin{equation} \bar {H}_1^{\mathrm{ZLR}} \mathrel {\mathop :}= q \phi _1 , \quad \bar {H}_1^{\mathrm{FLR}} \mathrel {\mathop :}= - q \int _0^1 \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 \mathrm{d} \varsigma \boldsymbol{\cdot } {\boldsymbol{\rho }}. \end{equation}

We distinguish the ZLR contributions from the FLR contributions. Note that each of the FLR contributions is gauge-invariant, as they are expressed in terms of the electromagnetic fields.

When considering the Lie coordinate transformation given by (4.9) and (4.10), we find that the following yields a sufficient condition for gauge invariance of the resulting gyrokinetic model. This is a new result which provides a general approach for the development of gauge-invariant gyrokinetic models. A proof can be found in Appendix C.

Theorem 1 (Sufficient condition for gauge invariance). The gyrocentre single-particle phase-space Lagrangian (to second-order) is gauge-invariant up to a total derivative

(4.36)

\begin{equation} \bar {\bar {L}} \overset {\text{(4.26)}}{\mapsto } \bar {\bar {L}} + q \left ( \boldsymbol{\nabla }\eta \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} + \frac {\partial \eta }{\partial t} \right ) = \bar {\bar {L}} + q \,\frac {{\mathrm{d}} \eta }{{\mathrm{d}} t} \end{equation}

provided that $\bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1$ and $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ are gauge-invariant.
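
To illustrate the form of (4.36), consider only the ZLR terms $q {\boldsymbol{A}}_1 \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} - q \phi _1$, with both potentials evaluated at the gyrocentre position. This is an illustrative special case under that assumption and not a substitute for the proof in Appendix C. Under (4.26),

\begin{equation} q {\boldsymbol{A}}_1 \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} - q \phi _1 \mapsto q ( {\boldsymbol{A}}_1 + \boldsymbol{\nabla }\eta ) \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} - q \left ( \phi _1 - \frac {\partial \eta }{\partial t} \right ) = q {\boldsymbol{A}}_1 \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} - q \phi _1 + q \,\frac {{\mathrm{d}} \eta }{{\mathrm{d}} t} , \end{equation}

where ${\mathrm{d}} \eta / {\mathrm{d}} t$ denotes the derivative of $\eta$ along the gyrocentre trajectory. Between two fixed times $t_0$ and $t_1$ (labels introduced here for illustration), the additional term changes the action only by the boundary contribution $q [ \eta ]_{t_0}^{t_1}$, which is why it does not affect the EOMs.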

Remark 1 (Cross-terms of $O(\varepsilon _\delta \varepsilon _B)$). In the expression for $\bar {{\boldsymbol{\gamma }}}_{1,{\boldsymbol{R}}}^{\mathrm{FLR}}$ in (4.35b), we have neglected the $O(\varepsilon _\delta \varepsilon _B)$ contribution. Neglecting this term is consistent with the leading-order (in $\varepsilon _B$) approximation of the particle position in terms of the guiding-centre coordinates as given in (3.21).

When a conventional gyrokinetic ordering is used (Parra & Calvo 2011), where, in particular, it is assumed that $\varepsilon _\delta = \varepsilon _B$, we find that the neglected cross-terms are of the same order as terms that eventually end up in the second-order gyrocentre Hamiltonian $\bar {\bar {H}}_2 = O\big(\varepsilon _\delta ^2\big)$ (see also § 4.5.3). Hence, when a conventional gyrokinetic ordering is used, one should retain these terms, as is done by Parra & Calvo (2011). In the present work, such terms are not retained in favour of clarity and simplicity of the resulting model, and we note that such cross-terms are also not included in the state-of-the-art parallel-only gyrokinetic models used in practice (Brizard & Hahm 2007; Kleiber et al. 2016). However, the cross-terms can easily be included without breaking the structure of the proposed model, as we now demonstrate.

A more accurate approximation of the guiding-centre coordinate transformation is considered, which is given by

(4.37)

\begin{equation} {\boldsymbol{R}} = \bar {{\boldsymbol{R}}} + {\boldsymbol{\rho }} + {\boldsymbol{\mathfrak{r}}}, \end{equation}

where $\boldsymbol{\mathfrak{r}}$ is an $O(\varepsilon _B)$ correction (i.e. ${\boldsymbol{\mathfrak{r}}} = -{\boldsymbol{\rho }}_1$ in Brizard (1990, (2.58))). When taking this additional correction into account, the FLR part of the perturbed guiding-centre Lagrangian becomes (the ZLR part is unchanged)

(4.38)

\begin{equation} q \int _0^1 ({\boldsymbol{\rho }} + {\boldsymbol{\mathfrak{r}}}) \boldsymbol{\cdot } \big[ (\dot {\bar {{\boldsymbol{R}}}} + \varsigma \dot {{\boldsymbol{\rho }}} + \varsigma \dot {{\boldsymbol{\mathfrak{r}}}}) \times \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 + \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 \big]\, \mathrm{d}\varsigma , \end{equation}

which, when neglecting $O(\varepsilon _\delta \varepsilon _B^2)$ terms, results in (cf. (4.33))

(4.39)

\begin{align} \bar {L}_1^{\mathrm{FLR}} &= {} q\int _0^1 \big( \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \boldsymbol{\cdot } \big[ {\boldsymbol{\rho }} \times \big( \dot {\bar {{\boldsymbol{R}}}} + \varsigma \rho {\boldsymbol{\hat {\tau }}} \dot {\bar {\varTheta }} \big) \big] + {\boldsymbol{\rho }} \boldsymbol{\cdot } \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 \big)\, \mathrm{d}\varsigma \nonumber \\&\quad + q\int _0^1 \Big[ \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \boldsymbol{\cdot } \Big( {\boldsymbol{\mathfrak{r}}} \times \big[ \dot {\bar {{\boldsymbol{R}}}} + \varsigma \rho {\boldsymbol{\hat {\tau }}} \dot {\bar {\varTheta }} \big] + {\boldsymbol{\rho }} \times \Big[ \varsigma (\boldsymbol{\nabla }{\boldsymbol{\rho }})^\intercal \dot {\bar {{\boldsymbol{R}}}} + \varsigma \frac {\partial {\boldsymbol{\mathfrak{r}}}}{\partial \varTheta } \dot {\bar {\varTheta }} \Big] \Big) + {\boldsymbol{\mathfrak{r}}} \boldsymbol{\cdot } \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 \Big]\, \mathrm{d}\varsigma , \end{align}

where the second row contains all the cross-terms which we do not include in the present work. We note that the definition of $\mathring {Q}^\varsigma$ (cf. (4.29)) is altered according to (4.37) when this more accurate approximation to the perturbed guiding-centre Lagrangian is considered. Moreover, care should be taken that $O(\varepsilon _B^2)$ terms are also included in the guiding-centre Lagrangian $\bar {L}_0$ when such an approach is followed.

4.5. A family of gauge-invariant gyrocentre coordinate transformations

Thus far, we have considered a general gyrocentre coordinate transformation, which is defined by the symplectic parts of the gyrocentre Lagrangian, $\bar {\bar {{\boldsymbol{\gamma }}}}_1$ and $\bar {\bar {{\boldsymbol{\gamma }}}}_2$. In what follows, we let $\bar {\bar {{\boldsymbol{\gamma }}}}_2 = {\boldsymbol{0}}_6$. Consistency (with respect to the Lie transformation), gyro-phase independence, invariance of the gyrocentre magnetic moment (cf. (4.25)) and gauge invariance of the resulting gyrocentre Lagrangian then require the following four conditions to be satisfied:

(4.40)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 = O(\varepsilon _\delta ) , \quad \widetilde {\bar {\bar {{\boldsymbol{\gamma }}}}_1} = {\boldsymbol{0}}_6 , \quad \bar {\bar {\gamma }}_{1,\varTheta } = 0 , \quad \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \overset {\text{(4.26)}}{\mapsto } \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 , \end{equation}

respectively.

4.5.1. Overview

Traditionally (Brizard 1990; Brizard & Hahm 2007), the following choice was made:

(4.41)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_{1,{\boldsymbol{R}}} = q \langle \,\mathring {\!{\boldsymbol{A}}}_1 \rangle \end{equation}

with all other components equal to zero. Note that this corresponds to $\bar {\bar {{\boldsymbol{\gamma }}}}_{1,{\boldsymbol{R}}} = \langle \bar {{\boldsymbol{\gamma }}}_{1,{\boldsymbol{R}}}^\dagger \rangle$ , where $\bar {{\boldsymbol{\gamma }}}_{1}^\dagger$ denotes the symplectic part of the perturbed guiding-centre Lagrangian before the total derivative has been omitted, see also (4.34). This choice satisfies the first, second and third of our requirements, but it does not lead to a gauge-invariant model:

(4.42)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_{1} - \bar {{\boldsymbol{\gamma }}}_{1}^\dagger \overset {\text{ (4.26)}}{\mapsto } \bar {\bar {{\boldsymbol{\gamma }}}}_{1} - \bar {{\boldsymbol{\gamma }}}_{1}^\dagger - \begin{pmatrix} q \widetilde {\mathring {\boldsymbol{\nabla }} \eta }\\[6pt] 0\\[6pt] \dfrac {q \rho }{2\bar {{M}}} {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } \mathring {\boldsymbol{\nabla }} \eta \\[12pt] q \rho {\boldsymbol{\hat {\tau }}} \boldsymbol{\cdot } \mathring {\boldsymbol{\nabla }} \eta \end{pmatrix}. \end{equation}

This approach leads to a gyrokinetic model in which the compressional Alfvén wave can be included by considering a high-frequency approximation for the first-order generating function $\bar {\bar {S}}_1$, as proposed by Qin et al. (1999).

More recently, the following gyrocentre coordinate transformation was proposed by Burby & Brizard (2019):

(4.43)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 = \bar {{\boldsymbol{\gamma }}}_1^{\mathrm{ZLR}} \quad \implies \quad \bar {\bar {{\boldsymbol{\gamma }}}}_{1,{\boldsymbol{R}}} = q {\boldsymbol{A}}_1, \end{equation}

which satisfies all of our requirements since

(4.44)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 = - \bar {{\boldsymbol{\gamma }}}_1^{\mathrm{FLR}}, \end{equation}

which is gauge-invariant as the FLR parts of the perturbed guiding-centre Lagrangian are gauge-invariant. Rather than keeping only the ZLR part of the symplectic part of the perturbed guiding-centre Lagrangian, we can also include the FLR effects resulting in

(4.45)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 = \bar {{\boldsymbol{\gamma }}}_1^{\mathrm{ZLR}} + \langle \bar {{\boldsymbol{\gamma }}}_1^{\mathrm{FLR}} \rangle = \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle . \end{equation}

Gauge invariance follows from the gauge invariance of $\bar {{\boldsymbol{\gamma }}}_1^{\mathrm{FLR}}$ , and we have gyro-averaged the FLR contribution to ensure that our second requirement is satisfied. We note that (4.45) results in a transformation for which the first-order generating vector is, in some sense, smallest. This is of interest because the gyrokinetic model results from a truncated coordinate transformation, where the truncation error is smaller if the coordinate transformation is smaller (see also the discussion in Appendix E). In particular, using (4.12), we find that

(4.46)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 = \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle \quad \implies \quad \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle = \,\,\bar {\!\! \unicode{x1D645}}_0 \left \langle \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 - \frac {\partial \bar {\bar {S}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right \rangle = {\boldsymbol{0}}_6, \end{equation}

and, therefore, the coordinate transformation given by (4.45) contains, to first-order, only a fluctuating gyro-phase-dependent part.
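
The vanishing gyro-average in (4.46) can be traced term by term. The following is a short sketch under two assumptions: that $\,\,\bar {\!\! \unicode{x1D645}}_0$ is gyro-phase independent, as the form of (4.46) suggests, and that the generating function is chosen to be purely fluctuating, $\langle \bar {\bar {S}}_1 \rangle = 0$. Gyro-averaging then commutes with the derivatives with respect to the remaining coordinates, while the gyro-average of $\partial \bar {\bar {S}}_1 / \partial \bar {\bar {\varTheta }}$ vanishes by periodicity:

\begin{equation} \langle \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \rangle = \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle - \langle \bar {{\boldsymbol{\gamma }}}_1 \rangle = {\boldsymbol{0}}_6 , \quad \left \langle \frac {\partial \bar {\bar {S}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right \rangle = {\boldsymbol{0}}_6 . \end{equation}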

Henceforth, we consider the following parametrised form of the symplectic part of the gyrocentre Lagrangian:

(4.47)

\begin{equation} \bar {\bar {{\boldsymbol{\gamma }}}}_1 \mathrel {\mathop :}= \begin{pmatrix} q {\boldsymbol{A}}_1 + \xi _R q \langle\kern0.3pt \!| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\rho }} |\!\kern0.3pt\rangle \\[8pt] 0\\[8pt] 0\\[8pt] - \xi _\varTheta \dfrac {q \rho ^2}{2} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \end{pmatrix}\!, \end{equation}

where $\xi _R, \xi _\varTheta$ are real-valued parameters that define the coordinate transformation. The choice $(\xi _R, \xi _\varTheta ) = (0, 0)$ yields the model proposed by Burby & Brizard (2019), $(\xi _R, \xi _\varTheta ) = (1, 1)$ results in (4.45) and we note that this general form is gauge-invariant regardless of the values of $\xi _R, \xi _\varTheta$. Here, we have defined the radially averaged gyro-average as

(4.48)

\begin{equation} \langle\kern0.3pt \!| \mathring {Q}^\varsigma |\!\kern0.3pt\rangle \mathrel {\mathop :}= \int _0^1 \langle \mathring {Q}^\varsigma \rangle \,\mathrm{d} \varsigma = \frac {1}{2\pi } \int _0^1 \int _0^{2\pi } Q(\bar{\bar{\boldsymbol{r}}} + \varsigma {\boldsymbol{\rho }})\, \mathrm{d} \bar {\bar {\varTheta }}\, \mathrm{d} \varsigma \end{equation}

and the disc average as (cf. (4.29))

(4.49)

\begin{equation} \langle \!\langle \mathring {Q}^\varsigma \rangle \!\rangle \mathrel {\mathop :}= 2 \langle\kern0.3pt \!| \varsigma \mathring {Q}^\varsigma |\!\kern0.3pt\rangle = \frac {1}{\pi } \int _0^1 \int _0^{2\pi } \varsigma Q(\bar{\bar{\boldsymbol{r}}} + \varsigma {\boldsymbol{\rho }})\, \mathrm{d} \bar {\bar {\varTheta }}\, \mathrm{d} \varsigma , \end{equation}

which is defined component-wise for vector fields. The latter operator is referred to as the disc average (Porazik & Lin 2011) because it exactly yields the average value of the ‘gyro-disc’ shown in figure 1.
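
That (4.49) is an area average can be seen by changing to the radial variable $s \mathrel {\mathop :}= \varsigma \rho$ (a symbol introduced here only for this check), for which the area element of the gyro-disc of radius $\rho$ is $s \,\mathrm{d} s \,\mathrm{d} \bar {\bar {\varTheta }}$ and the total area is $\pi \rho ^2$:

\begin{equation} \langle \!\langle \mathring {Q}^\varsigma \rangle \!\rangle = \frac {1}{\pi } \int _0^1 \int _0^{2\pi } \varsigma Q(\bar{\bar{\boldsymbol{r}}} + \varsigma {\boldsymbol{\rho }})\, \mathrm{d} \bar {\bar {\varTheta }}\, \mathrm{d} \varsigma = \frac {1}{\pi \rho ^2} \int _0^\rho \int _0^{2\pi } Q(\bar{\bar{\boldsymbol{r}}} + s {\boldsymbol{\hat {\rho }}})\, s \,\mathrm{d} \bar {\bar {\varTheta }}\, \mathrm{d} s . \end{equation}

As a consistency check, a constant $Q \equiv 1$ gives $2 \int _0^1 \varsigma \,\mathrm{d} \varsigma = 1$, so the factor 2 in (4.49) is exactly the normalisation of the weight $\varsigma$.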

The parametrised coordinate transformation results in the following first-order gyrocentre Lagrangian:

(4.50)

\begin{align} \bar {\bar {L}}_1 = {} & \bar {{\boldsymbol{\gamma }}}_1^{\mathrm{ZLR}} \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} + \big\langle \bar {{\boldsymbol{\gamma }}}_{1,{\boldsymbol{R}}}^{\mathrm{FLR}} \big\rangle \boldsymbol{\cdot } \big[ \xi _R \,\dot {\bar {\bar {\!\boldsymbol{R}}}} + (1 - \xi _R) \dot {\bar {{\boldsymbol{R}}}} \big] + \big\langle \bar {\gamma }_{1,\varTheta }^{\mathrm{FLR}} \big\rangle \big[ \xi _\varTheta \dot {\bar {\bar {\varTheta }}} + (1 - \xi _\varTheta ) \dot {\bar {\varTheta }} \big] \nonumber \\ & - q \phi _1 + q \rho \int _0^1 \big\langle \mathring {E}^\varsigma _{1,\rho } \big\rangle\, \mathrm{d}\varsigma , \end{align}

where we have substituted (4.35c) and (4.47) into (4.24). This shows that the parameters $\xi _R, \xi _\varTheta$ put the symplectic FLR part of the perturbed guiding-centre Lagrangian either in the symplectic ($(\xi _R, \xi _\varTheta ) = (1, 1)$) or in the Hamiltonian ($(\xi _R, \xi _\varTheta ) = (0, 0)$) part of the first-order gyrocentre Lagrangian.

We can already ensure that the gyrocentre magnetic moment is an invariant by imposing our third condition, where we note that

(4.51)

\begin{equation} \dot {\bar {\bar {{M}}}} = - \frac {q}{m}\, \frac {{\mathrm{d}} }{{\mathrm{d}} t} (\bar {\bar {\gamma }}_{1,\varTheta } + \bar {\bar {\gamma }}_{2,\varTheta }) = \xi _\varTheta\, \frac {{\mathrm{d}} }{{\mathrm{d}} t} \left ( \bar {\bar {{M}}} \frac {\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle }{B_0} \right ) \end{equation}

by substituting (4.47) into (4.25). Hence, requiring $\bar {\bar {{M}}}$ to remain invariant in gyrocentre coordinates implies that $\xi _\varTheta = 0$ , which we use from here on out.
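
The second equality in (4.51) follows from the $\varTheta$-component of (4.47) once $\rho ^2$ is expressed through the magnetic moment. The sketch below assumes the relation $q^2 \rho ^2 / (2m) = \bar {\bar {{M}}} / B_0$, which is not restated in this section but is consistent with comparing the third components of (4.60) and (4.61), and uses $\bar {\bar {\gamma }}_{2,\varTheta } = 0$:

\begin{equation} - \frac {q}{m} \bar {\bar {\gamma }}_{1,\varTheta } = - \frac {q}{m} \left ( - \xi _\varTheta \frac {q \rho ^2}{2} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \right ) = \xi _\varTheta \frac {q^2 \rho ^2}{2m} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle = \xi _\varTheta \bar {\bar {{M}}} \frac {\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle }{B_0} . \end{equation}

For $\xi _\varTheta \neq 0$, the magnetic moment would therefore, in general, vary in time together with the disc-averaged parallel magnetic field perturbation.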

4.5.2. First-order transformation

The first-order gyrocentre Hamiltonian is found by substituting (3.48) and (4.35b) into (4.50)

(4.52)

\begin{equation} \bar {\bar {H}}_1 \mathrel {\mathop :}= q \phi _1 - q \rho \big\langle\kern-1.7pt \big| \mathring {E}^\varsigma _{1,\rho } \big|\kern-1.7pt\big\rangle - (1 - \xi _R) q \rho \bar {\bar {U}}_\shortparallel \big\langle\kern-1.7pt \big| \mathring {B}^\varsigma _{1,\tau } \big|\kern-1.7pt\big\rangle + \bar {\bar {{M}}} \big\langle \kern-0.5pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.5pt\big\rangle , \end{equation}

where we have neglected the $O(\varepsilon _B)$ contributions due to $\dot {\bar {{\boldsymbol{R}}}}$ if $\xi _R \neq 1$ .

To compute the second-order Hamiltonian from (4.23), we need an explicit expression for the first-order generating vector $\bar {\bar {{\boldsymbol{G}}}}_1$. Recall that $\bar {\bar {{\boldsymbol{G}}}}_1$ is given by (4.12), which itself requires an expression for the first-order generating function $\bar {\bar {S}}_1$. From (4.15), it follows that

(4.53)

\begin{equation} \underbrace {\frac {1}{\omega _{\mathrm{c}}}\frac {\partial \bar {\bar {S}}_{1}}{\partial t}}_{O(\varepsilon _\omega )} + \underbrace {\frac {\bar {\bar {U}}_\shortparallel }{\omega _{\mathrm{c}}} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }\bar {\bar {S}}_{1}}_{O(\varepsilon _\omega )} + \underbrace {\frac {\partial \bar {\bar {S}}_{1}}{\partial \bar {\bar {\varTheta }}}}_{O(1)} = \frac {q}{\omega _{\mathrm{c}}} \widetilde {\psi }_{1}, \end{equation}

where we have substituted the zeroth-order Hamiltonian as given by (3.23) and have neglected the $O(\varepsilon _B)$ contributions from the guiding-centre Poisson bracket (3.47). Furthermore, we have indicated the magnitude of each of the terms, which results from non-dimensionalisation using

(4.54)

\begin{equation} [t] = \frac {1}{\omega }, \quad [U_\shortparallel ] = \frac {\omega }{k_{\shortparallel }}, \end{equation}

where we recall that the non-dimensional frequency $\varepsilon _\omega$ is defined in (3.19). Using (4.14) and (4.35), we find that

(4.55)

\begin{equation} q \widetilde {\psi }_{1} = \widetilde {\bar {H}_{1}^{\mathrm{FLR}}} - \dot {\bar {{\boldsymbol{Z}}}} \boldsymbol{\cdot } \widetilde {\bar {{\boldsymbol{\gamma }}}_{1}^{\mathrm{FLR}}} = - \rho \int _0^1 \widetilde {\mathring {F}^\varsigma _{1,\rho }}\, \mathrm{d} \varsigma + \omega _{\mathrm{c}} q \rho ^2 \int _0^1 \varsigma \widetilde {\mathring {B}^\varsigma _{1,\shortparallel }}\, \mathrm{d} \varsigma , \end{equation}

where we have introduced the Lorentz force

(4.56)

\begin{equation} {\boldsymbol{F}}_1 \mathrel {\mathop :}= q \big( {\boldsymbol{E}}_1 + \bar {\bar {U}}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1 \big) , \quad \mathring {{\boldsymbol{F}}}^\varsigma _1 \mathrel {\mathop :}= q \big( \,\mathring {\!{\boldsymbol{E}}}^\varsigma _1 + \bar {\bar {U}}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \big). \end{equation}

When considering approximations of (4.53), it is important to keep gauge invariance of the resulting model in mind. In particular, when considering the proof of Theorem 1 as given in Appendix C, we find that gauge invariance of the first-order generating function $\bar {\bar {S}}_1$ is needed, and this is proven by observing that $\bar {\bar {S}}_1$ is the solution of a linear PDE (4.15) with a gauge-invariant right-hand side given by (4.55). When obtaining an approximation to $\bar {\bar {S}}_1$, it is therefore essential that we preserve its gauge invariance, which can be achieved by keeping the gauge-invariant parts of the right-hand side $\widetilde {\psi }_1$ together. The consequence of preserving gauge invariance is that the high-frequency contribution from $\partial {\boldsymbol{A}}_1 / \partial t$, which itself comes from the ${\boldsymbol{E}}_1$ term in the Lorentz force (4.56), is kept on the right-hand side of (4.53). Keeping this term results in a high-frequency compressional Alfvén wave, as discussed in § 6.

We make several long wavelength approximations to (4.55), starting with

(4.57)

\begin{equation} \mathring {F}^\varsigma _{1,\rho } = F_{1,\rho } + O(\varepsilon _\perp ) \quad \implies \quad \widetilde {\mathring {F}^\varsigma _{1,\rho }} = \mathring {F}^\varsigma _{1,\rho } - \big\langle \mathring {F}^\varsigma _{1,\rho } \big\rangle = F_{1,\rho } + O(\varepsilon _\perp ) \end{equation}

as follows from a Taylor series expansion of ${\boldsymbol{F}}_1$ centred around the gyrocentre position $\bar{\bar{\boldsymbol{R}}}$ (see the discussion in Appendix D), where we recall that the non-dimensional perpendicular wave number $\varepsilon _\perp$ is as defined in (3.20). Similarly, we find that

(4.58)

\begin{equation} \widetilde {\mathring {B}^\varsigma _{1,\shortparallel }} = \mathring {B}^\varsigma _{1,\shortparallel } - \langle \mathring {B}^\varsigma _{1,\shortparallel } \rangle = O(\varepsilon _\perp ) \end{equation}

and, therefore, neglecting $O(\varepsilon _\perp )$ contributions to the right-hand side of (4.55) and neglecting the $O(\varepsilon _\omega )$ part of the left-hand side of (4.53), we find that the first-order generating function can be approximated by

(4.59)

\begin{equation} \frac {\partial \bar {\bar {S}}_{1}}{\partial \bar {\bar {\varTheta }}} = - \frac {\rho }{\omega _{\mathrm{c}}} F_{1,\rho } \quad \iff \quad \bar {\bar {S}}_{1} = \frac {\rho }{\omega _{\mathrm{c}}} F_{1,\tau }. \end{equation}
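
The long-wavelength steps leading to (4.59) can be made explicit. The sketch below assumes that the fields are smooth enough for a first-order Taylor expansion about $\bar{\bar{\boldsymbol{R}}}$, with ${\boldsymbol{F}}_1$ and its gradient evaluated there, and uses $\partial {\boldsymbol{\hat {\rho }}} / \partial \bar {\bar {\varTheta }} = {\boldsymbol{\hat {\tau }}}$ and $\partial {\boldsymbol{\hat {\tau }}} / \partial \bar {\bar {\varTheta }} = - {\boldsymbol{\hat {\rho }}}$. The first relation is the one appearing in (4.39); the second is assumed here, as is standard for a rotating pair of unit vectors:

\begin{align} \mathring {F}^\varsigma _{1,\rho } &= {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } {\boldsymbol{F}}_1 + \varsigma \rho \, {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } \big[ ( {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } \boldsymbol{\nabla }) {\boldsymbol{F}}_1 \big] + \dots , \nonumber \\ \langle {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } {\boldsymbol{F}}_1 \rangle &= \langle {\boldsymbol{\hat {\rho }}} \rangle \boldsymbol{\cdot } {\boldsymbol{F}}_1 = 0 , \nonumber \\ \frac {\partial }{\partial \bar {\bar {\varTheta }}} \left ( \frac {\rho }{\omega _{\mathrm{c}}} F_{1,\tau } \right ) &= \frac {\rho }{\omega _{\mathrm{c}}} {\boldsymbol{F}}_1 \boldsymbol{\cdot } \frac {\partial {\boldsymbol{\hat {\tau }}}}{\partial \bar {\bar {\varTheta }}} = - \frac {\rho }{\omega _{\mathrm{c}}} F_{1,\rho } . \end{align}

The first line shows that the correction to $F_{1,\rho }$ carries one additional factor of $\rho \boldsymbol{\nabla }$, the second that the leading term is purely fluctuating, which together give (4.57). The third line confirms that $\bar {\bar {S}}_1 = (\rho / \omega _{\mathrm{c}}) F_{1,\tau }$ satisfies the left equation of (4.59); since $\langle F_{1,\tau } \rangle = 0$, no gyro-phase-independent integration constant is added.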

This allows us to approximate the first-order generating vector explicitly as

(4.60)

\begin{equation} \bar {\bar {{\boldsymbol{G}}}}_{1} = \,\,\bar {\!\! \unicode{x1D645}}_0 \left ( \bar {\bar {{\boldsymbol{\gamma }}}}_{1} - \bar {{\boldsymbol{\gamma }}}_{1} - \dfrac {\partial \bar {\bar {S}}_{1}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right ) = \begin{pmatrix} \dfrac {1}{B_{0,\shortparallel }^{\star }} {{\boldsymbol{\hat {b}}}_0} \times \int _0^1 \big( \xi _R\big\langle \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\rho }} \big\rangle - \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\rho }} \big) \,\mathrm{d}\varsigma + \dfrac {\rho B_{1,\rho }}{B_0} {{\boldsymbol{\hat {b}}}_0}\\[10pt] - \dfrac {q}{m} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \int _0^1 \big( \xi _R\big\langle \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\rho }} \big\rangle - \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\rho }} \big)\, \mathrm{d}\varsigma \\[10pt] - \dfrac {q^2 \rho ^2}{m} \int _0^1 \varsigma \mathring {B}^\varsigma _{1,\shortparallel } \mathrm{d}\varsigma - \dfrac {\rho }{B_0} F_{1,\rho }\\[10pt] -\dfrac {\rho }{2 \bar {\bar {{M}}} B_0} F_{1,\tau } \end{pmatrix}, \end{equation}

where we have substituted (4.35b), (4.47) and (4.59) into (4.12), and have furthermore neglected all spatial derivatives of the perturbed electromagnetic fields for the evaluation of the Poisson bracket $\{\bar {\bar {{\boldsymbol{Z}}}}, \bar {\bar {S}}_{1}\}_{0}$.

From (4.60), it follows that the gyro-average of all except one of the components of the first-order generating vector vanishes if we choose $\xi _R = 1$ ,

(4.61)

\begin{equation} \langle \bar {\bar {{\boldsymbol{G}}}}_{1} \rangle ^{\xi _R = 1} = \begin{pmatrix} {\boldsymbol{0}}_3\\[3pt] 0\\[3pt] - \dfrac {\bar {\bar {{M}}}}{B_0} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \\[8pt] 0 \end{pmatrix}\!. \end{equation}

This choice therefore yields, in the sense made precise below, the smallest transformation to first order in $\varepsilon _\delta$ , which is relevant as the magnitude of the coordinate transformation determines the magnitude of the truncation error of the resulting gyrokinetic model. More specifically, in Appendix E, we show that this choice minimises the Euclidean norm of the gyro-average of the first-order generating vector, resulting in the minimisation of the truncation error of the gyrocentre coordinate transformation.

From (4.61), we find that only the magnetic moment is transformed non-trivially. This is a consequence of choosing $\xi _\varTheta = 0$ , which, in turn, was required to ensure that the magnetic moment remains invariant in gyrocentre coordinates, as is shown in (4.51). It follows that the gyrocentre coordinate transformation is smallest for $\xi _R = 1$ , which, as discussed previously, is of interest as it affects the accuracy of the resulting model. For this reason, we choose the parameter value $\xi _R = 1$ , resulting in the following symplectic part of the gyrocentre Lagrangian:

(4.62)

\begin{equation} (\xi _R, \xi _\varTheta ) = (1, 0) \quad \implies \quad \bar {\bar {{\boldsymbol{\gamma }}}}_{1,{\boldsymbol{R}}} = \langle \bar {{\boldsymbol{\gamma }}}_{1,{\boldsymbol{R}}} \rangle = q {\boldsymbol{A}}_1 + q \big\langle\kern-1.7pt \big| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1\times {\boldsymbol{\rho }} \big|\kern-1.5pt\big\rangle , \end{equation}

with all other components equal to zero.

Remark 2 (Interpreting the gyrocentre magnetic moment). From (4.61), it follows that the gyrocentre magnetic moment can be interpreted as follows:

(4.63)

\begin{align}\langle \bar {{M}}(\bar {\bar {{\boldsymbol{Z}}}}) \rangle := \bar {\bar {{M}}} - \langle \bar {\bar {G}}_{1,{M}} \rangle + O\big(\varepsilon _\delta ^2\big) \approx \bar {\bar {{M}}} \left ( 1 + \frac {\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle }{B_0} \right ) \implies \bar {\bar {{M}}} \approx \frac {m_s \bar {U}_\tau ^2}{2 \big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big)}, \end{align}

where we find that the parallel component of the full magnetic field appears in the denominator rather than just $B_0$ (cf. (3.9)):

(4.64)

\begin{equation} B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle = {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \big({\boldsymbol{B}}_0 + \big\langle \kern-0.7pt\big\langle \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \big\rangle \kern-0.7pt\big\rangle \big). \end{equation}

Thus, it is crucial to include the contribution of the perturbed magnetic field to make $\bar {\bar {{M}}}$ an invariant of motion.

4.5.3. Second-order transformation

When assuming $\bar {\bar {{\boldsymbol{\gamma }}}}_2 = {\boldsymbol{0}}_6$ , as we do throughout this discussion, we find that (4.23) results in

(4.65)

\begin{equation} \bar {\bar {H}}_{2} = \langle {\boldsymbol{T}}_{1} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_{1} \rangle , \end{equation}

where ${\boldsymbol{T}}_{1}$ is as defined in (4.21). The approximation of ${\boldsymbol{T}}_{1}$ , followed by substitution of the approximated first-order generating vector $\bar {\bar {{\boldsymbol{G}}}}_{1}$ and subsequent gyro-averaging results in the following second-order gyrocentre Hamiltonian:

(4.66)

\begin{equation} \bar {\bar {H}}_{2} \mathrel {\mathop :}= \frac {\bar {\bar {{M}}}}{2 B_0} \lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2 - \frac {m}{2 q^2 B_0^2} \lvert {\boldsymbol{F}}_{1,\perp } \rvert ^2 , \end{equation}

which agrees with the result from Burby & Brizard (2019) (and hence with the case $(\xi _R, \xi _\varTheta ) = (0, 0)$ ) upon substitution of the expression for the Lorentz force (4.56). The derivation of (4.66) is rather tedious and can be found in Appendix F.

It should be noted that many terms have been neglected in the derivation of the second-order Hamiltonian $\bar {\bar {H}}_2$ . In particular, only the terms of leading order in $\varepsilon _B$ and $\varepsilon _\perp$ have been kept, resulting in a ZLR approximation of $\bar {\bar {H}}_2$ wherein $O(\varepsilon _B)$ terms have been neglected. Even though there is no fundamental limitation that keeps us from including such higher-order terms, we have thus far opted not to do so, thereby keeping the resulting equations somewhat tractable, more easily interpretable as well as more suitable for discretisation. We view the proposed model as a pragmatic first step towards a ‘fully gyrokinetic’ gauge-invariant model wherein such terms are kept also at second-order in $\varepsilon _\delta$ , i.e. in the second-order gyrocentre Hamiltonian.

4.5.4. Gyrocentre single-particle phase-space Lagrangian

When combining the symplectic and Hamiltonian parts of the zeroth-order Lagrangian defined by (3.23), the first-order gyrocentre Lagrangian defined by (4.47) and (4.52) (with $(\xi _R, \xi _\varTheta ) = (1, 0)$ ), as well as the second-order gyrocentre Lagrangian defined by $\bar {\bar {{\boldsymbol{\gamma }}}}_2 = {\boldsymbol{0}}_6$ and (4.66), we find that the total gyrocentre single-particle phase-space Lagrangian is given by

(4.67)

\begin{align} \bar {\bar {L}} = {} & q ({\boldsymbol{A}}^{\star }_0 + {\boldsymbol{A}}^{\star }_1) \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} + \frac {m \bar {\bar {{M}}}}{q} \dot {\bar {\bar {\varTheta }}} - \frac {m \bar {\bar {U}}_\shortparallel ^2}{2} - q \phi _1 + q \rho \big\langle\kern-1.7pt \big| \mathring {E}^\varsigma _{1,\rho } \big|\kern-1.7pt\big\rangle - \bar {\bar {{M}}} \big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big)\nonumber \\ & - \frac {\bar {\bar {{M}}}}{2 B_0} \lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2 + \frac {m}{2 q^2 B_0^2} \lvert {\boldsymbol{F}}_{1,\perp } \rvert ^2 , \end{align}

where we have defined the FLR corrected vector potential as

(4.68)

\begin{equation} {\boldsymbol{A}}^{\star }_1 \mathrel {\mathop :}= {\boldsymbol{A}}_1 + \langle\kern0.3pt \!| (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{A}}_1) \times {\boldsymbol{\rho }} |\!\kern0.3pt\rangle . \end{equation}

4.6. Principle of least action

For the derivation of the EOMs, we follow the same approach as followed in § 3.5 and, therefore, we must compute the perturbed gyrocentre Lagrange matrix $\bar {\bar { \unicode{x1D652}}}_1$ , which follows from the skew-symmetric part of the Jacobian matrix of $\bar {\bar {{\boldsymbol{\gamma }}}}_{1}$ , as defined in (4.47). We note that the unperturbed gyrocentre Lagrange matrix coincides with the unperturbed guiding-centre Lagrange matrix $\bar { \unicode{x1D652}}_0$ , given by (3.39), except that it is evaluated at gyrocentre coordinates. From (4.47), it follows that

(4.69)

\begin{equation} \frac {\partial \bar {\bar {{\boldsymbol{\gamma }}}}_{1}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} = \begin{pmatrix} q (\boldsymbol{\nabla }{\boldsymbol{A}}^{\star }_1)^\intercal & \quad {\boldsymbol{0}}_3 & \quad -{\boldsymbol{w}}_1 & \quad {\boldsymbol{0}}_3\\[8pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad 0 & \quad 0\\[8pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad 0 & \quad 0\\[8pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad 0 & \quad 0\\ \end{pmatrix}, \end{equation}

where we let ${\boldsymbol{w}}_1$ be defined as

(4.70)

\begin{equation} {\boldsymbol{w}}_1 \mathrel {\mathop :}= - \frac {q}{2 \bar {\bar {{M}}}} \left ( \big\langle\kern-1.7pt \big| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1\times {\boldsymbol{\rho }} \big|\kern-1.5pt\big\rangle + \frac {1}{2} \langle \!\langle [ (\mathring {\boldsymbol{\nabla} }^\varsigma {\boldsymbol{B}}_1)^\intercal {\boldsymbol{\rho }} ] \times {\boldsymbol{\rho }} \rangle \!\rangle \right ). \end{equation}

In addition, we define the FLR corrected electromagnetic fields by

(4.71a)

\begin{align} {\boldsymbol{E}}^{\star }_1 &\mathrel {\mathop :}= {\boldsymbol{E}}_1 + \boldsymbol{\nabla }\langle\kern0.3pt \!| \rho \mathring {E}^\varsigma _{1,\rho } |\!\kern0.3pt\rangle + \langle\kern0.3pt \!| (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{E}}_1) \times {\boldsymbol{\rho }} |\!\kern0.3pt\rangle ,\\[-10pt]\nonumber\end{align}

(4.71b)

\begin{align} {\boldsymbol{B}}^{\star }_1 &\mathrel {\mathop :}= \boldsymbol{\nabla }\times {\boldsymbol{A}}^{\star }_1 = {\boldsymbol{B}}_1 + \boldsymbol{\nabla }\times \big\langle\kern-1.7pt \big| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1\times {\boldsymbol{\rho }} \big|\kern-1.5pt\big\rangle , \end{align}

where we recall that the radially averaged gyro-average $\langle\kern0.3pt \!| \cdot |\!\kern0.3pt\rangle$ is defined in (4.48). These fields, which end up being used in the EOMs in (5.23), are referred to as ‘FLR corrected’ electromagnetic fields because they can be approximated as (cf. (4.27))

(4.72a)

\begin{align} {\boldsymbol{E}}^{\star }_1 &= \langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle + O(\varepsilon _B),\\[-5pt]\nonumber \end{align}

(4.72b)

\begin{align} {\boldsymbol{B}}^{\star }_1 &= \langle \,\mathring {\!{\boldsymbol{B}}}_1 \rangle + O(\varepsilon _B) \end{align}

by making use of (D.10) and (D.12). We note that (4.72) only holds due to our choice of the parameter value $\xi _R = 1$ .

The resulting Lagrange matrix is given by (cf. (3.39))

(4.73)

\begin{equation} \bar {\bar { \unicode{x1D652}}} = \begin{pmatrix} q \unicode{x1D63D}^{\star } & \quad -m {{\boldsymbol{\hat {b}}}_0} & \quad \dfrac {m}{q} {\boldsymbol{w}} & \quad {\boldsymbol{0}}_3\\[8pt] m {{\boldsymbol{\hat {b}}}_0}^\intercal & \quad 0 & \quad 0 & \quad 0\\[8pt] - \dfrac {m}{q} {\boldsymbol{w}}^\intercal & \quad 0 & \quad 0 & \quad \dfrac {m}{q} \\[8pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad - \dfrac {m}{q}& \quad 0\\ \end{pmatrix}, \end{equation}

where we have defined the effective gyrocentre vector potential as (see also (3.24))

(4.74)

\begin{equation} {\boldsymbol{A}}^{\star } \mathrel {\mathop :}= {\boldsymbol{A}}_0^{\star } + {\boldsymbol{A}}^{\star }_1 = {\boldsymbol{A}}_0 + \frac {m \bar {\bar {U}}_\shortparallel }{q} {{\boldsymbol{\hat {b}}}_0} - \frac {m \bar {\bar {{M}}}}{q^2} {\boldsymbol{w}}_0 + {\boldsymbol{A}}_1 + \big\langle\kern-1.7pt \big| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1\times {\boldsymbol{\rho }} \big|\kern-1.5pt\big\rangle , \quad {\boldsymbol{B}}^{\star } \mathrel {\mathop :}= \boldsymbol{\nabla }\times {\boldsymbol{A}}^{\star } \end{equation}

as well as ${\boldsymbol{w}} = {\boldsymbol{w}}_0 + {\boldsymbol{w}}_1$ . The matrix $ \unicode{x1D63D}^{\star }$ is defined analogously to (3.41).

For the computation of the gyrocentre Poisson bracket, we must invert the gyrocentre Lagrange matrix. Using the result of Appendix A, we find that the gyrocentre Poisson matrix is given by

(4.75)

\begin{equation} \bar{\bar{\!\!\! \unicode{x1D645}}}\, = \begin{pmatrix} -\dfrac { \unicode{x1D63D}_0}{q B_0 B_\shortparallel ^{\star }} & \quad \dfrac {{\boldsymbol{b}}^{\star }}{m} & \quad {\boldsymbol{0}}_3 & \quad - \dfrac {{\boldsymbol{w}} \times {{\boldsymbol{\hat {b}}}_0}}{q B_\shortparallel ^{\star }}\\[14pt] -\dfrac {({\boldsymbol{b}}^{\star })^\intercal }{m} & \quad 0 & \quad 0 & \quad -\dfrac {{\boldsymbol{b}}^{\star } \boldsymbol{\cdot } {\boldsymbol{w}}}{m}\\[10pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad 0 & \quad -\dfrac {q}{m}\\[10pt] \dfrac {({\boldsymbol{w}} \times {{\boldsymbol{\hat {b}}}_0})^\intercal }{q B_\shortparallel ^{\star }} & \quad \dfrac {{\boldsymbol{b}}^{\star } \boldsymbol{\cdot } {\boldsymbol{w}}}{m} & \quad \dfrac {q}{m}& \quad 0\\ \end{pmatrix}, \end{equation}

where we have defined (cf. (3.43))

(4.76)

\begin{equation} {\boldsymbol{b}}^{\star } \mathrel {\mathop :}= \frac {{\boldsymbol{B}}^{\star }}{B_\shortparallel ^{\star }} = {{\boldsymbol{\hat {b}}}_0} + \frac {1}{B_{\shortparallel }^{\star }} \left [ \frac {m \bar {\bar {U}}_\shortparallel }{q} {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{\kappa }} - \frac {m \bar {\bar {{M}}}}{q^2} (\boldsymbol{\nabla }\times {\boldsymbol{w}}_0)_\perp + {\boldsymbol{B}}^{\star }_{1,\perp } \right ], \end{equation}

and note that $B_\shortparallel ^{\star }$ can be written explicitly as (cf. (3.44))

(4.77)

\begin{equation} B_\shortparallel ^{\star } = B^{\star }_{0,\shortparallel } + B^{\star }_{1,\shortparallel } = B_0 + \frac {m \bar {\bar {U}}_\shortparallel }{q} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\shortparallel - \frac {m \bar {\bar {{M}}}}{q^2} (\boldsymbol{\nabla }\times {\boldsymbol{w}}_0)_\shortparallel + B^{\star }_{1,\shortparallel } . \end{equation}
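As a sanity check on (4.73) and (4.75), the following minimal NumPy sketch builds both matrices for randomly chosen vectors and confirms numerically that their product is the identity. It assumes the coordinate ordering $({\boldsymbol{R}}, U_\shortparallel , M, \varTheta )$ and that the matrices $\unicode{x1D63D}$ act as $\unicode{x1D63D} {\boldsymbol{v}} = {\boldsymbol{v}} \times {\boldsymbol{B}}$ ; the latter is our reading of the convention of (3.41), which is not restated here, and it is the sign choice under which the product closes. The function names and all numerical values are purely illustrative.

```python
import numpy as np

def cross_matrix(b):
    """Matrix C with C @ v == np.cross(v, b) (assumed convention for the 'B' matrices)."""
    return np.array([[0.0, b[2], -b[1]],
                     [-b[2], 0.0, b[0]],
                     [b[1], -b[0], 0.0]])

def lagrange_matrix(q, m, b0, B_star, w):
    """Gyrocentre Lagrange matrix (4.73), ordering (R, U_par, M, Theta)."""
    W = np.zeros((6, 6))
    W[:3, :3] = q * cross_matrix(B_star)
    W[:3, 3] = -m * b0
    W[:3, 4] = (m / q) * w
    W[3, :3] = m * b0
    W[4, :3] = -(m / q) * w
    W[4, 5] = m / q
    W[5, 4] = -m / q
    return W

def poisson_matrix(q, m, b0, B0, B_star, w):
    """Gyrocentre Poisson matrix (4.75); B0 is the background field strength."""
    B_par = b0 @ B_star            # parallel component of B*, cf. (4.77)
    b_star = B_star / B_par        # (4.76)
    w_x_b0 = np.cross(w, b0)
    J = np.zeros((6, 6))
    J[:3, :3] = -cross_matrix(B0 * b0) / (q * B0 * B_par)
    J[:3, 3] = b_star / m
    J[:3, 5] = -w_x_b0 / (q * B_par)
    J[3, :3] = -b_star / m
    J[3, 5] = -(b_star @ w) / m
    J[4, 5] = -q / m
    J[5, :3] = w_x_b0 / (q * B_par)
    J[5, 3] = (b_star @ w) / m
    J[5, 4] = q / m
    return J

# Illustrative (hypothetical) values; b0 is a unit vector.
rng = np.random.default_rng(0)
q, m, B0 = 1.5, 0.7, 2.0
b0 = np.array([0.0, 0.0, 1.0])
B_star = B0 * b0 + 0.3 * rng.standard_normal(3)
w = rng.standard_normal(3)

W = lagrange_matrix(q, m, b0, B_star, w)
J = poisson_matrix(q, m, b0, B0, B_star, w)
inverse_error = np.max(np.abs(J @ W - np.eye(6)))
```

In this sketch `inverse_error` is at the level of floating-point round-off, which only relies on $B_\shortparallel ^{\star } = {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } {\boldsymbol{B}}^{\star } \neq 0$ and on ${\boldsymbol{B}}_0$ being parallel to ${{\boldsymbol{\hat {b}}}_0}$ .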

Analogous to (3.37), we define the gyrocentre Poisson bracket as

(4.78)

\begin{equation} \{\mathcal{F}, \mathcal{G}\} \mathrel {\mathop :}= \frac {\partial \mathcal{F}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \left ( \,\,\bar{\bar{\!\!\! \unicode{x1D645}}}\, \frac {\partial \mathcal{G}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right ) \end{equation}

such that the EOMs, similar to (3.38), are given by

(4.79)

\begin{equation} \dot {\bar {\bar {{\boldsymbol{Z}}}}} = \,\,\,\bar{\bar{\!\!\! \unicode{x1D645}}}\, \frac {\partial \bar {\bar {{\boldsymbol{\gamma }}}}}{\partial t} + \{\bar {\bar {{\boldsymbol{Z}}}}, \bar {\bar {H}}\}, \end{equation}

where the zeroth-order term $\bar {\bar {H}}_0$ of the Hamiltonian is defined in (3.23), the first-order term $\bar {\bar {H}}_1$ is defined in (4.52) and the second-order term is given by (4.66). Substitution of (4.75) in (4.79) results in

(4.80a)

\begin{align} \,\dot {\bar {\bar {\!\boldsymbol{R}}}} &= \frac {1}{m} \frac {\partial \bar {\bar {H}}}{\partial \bar {\bar {U}}_\shortparallel } {\boldsymbol{b}}^{\star } + \frac {1}{q B_\shortparallel ^{\star }} {{\boldsymbol{\hat {b}}}_0} \times \left ( \boldsymbol{\nabla }\bar {\bar {H}} + q \frac {\partial {\boldsymbol{A}}^{\star }_1}{\partial t} \right ), \end{align}

(4.80b)

\begin{align} \dot {\bar {\bar {U}}}_\shortparallel &= -\frac {1}{m} {\boldsymbol{b}}^{\star } \boldsymbol{\cdot } \left ( \boldsymbol{\nabla }\bar {\bar {H}} + q\frac {\partial {\boldsymbol{A}}^{\star }_1}{\partial t} \right ), \end{align}

(4.80c)

\begin{align} \dot {\bar {\bar {{M}}}} &= 0, \end{align}

(4.80d)

\begin{align} \dot {\bar {\bar {\varTheta }}} &= \frac {q}{m} \frac {\partial \bar {\bar {H}}}{\partial \bar {\bar {{M}}}} + {\boldsymbol{w}} \boldsymbol{\cdot }\, \dot {\bar {\bar {\!\boldsymbol{R}}}} \, . \end{align}

Here, we note that, even though the EOM for the gyro-phase $\bar {\bar {\varTheta }}$ is non-trivial, it does not have to be solved, as the right-hand sides of the other EOMs do not depend on the gyro-phase.
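The structure of (4.80) can be made concrete with a short numerical sketch. The function below evaluates the right-hand sides of (4.80a)–(4.80d) for given derivatives of the Hamiltonian and then checks, for time-independent fields, that $\boldsymbol{\nabla }\bar {\bar {H}} \boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} + (\partial \bar {\bar {H}} / \partial \bar {\bar {U}}_\shortparallel ) \dot {\bar {\bar {U}}}_\shortparallel$ vanishes, as expected from the skew-symmetry of the Poisson matrix. The function name, the argument list and the random input values are hypothetical and serve only as an illustration.

```python
import numpy as np

def gyrocentre_rhs(grad_H, dH_dU, dH_dM, dA1_dt, b0, B_star, w, q, m):
    """Right-hand sides of (4.80a)-(4.80d) for given derivatives of H."""
    B_par = b0 @ B_star                  # parallel component of B*
    b_star = B_star / B_par              # (4.76)
    force = grad_H + q * dA1_dt          # bracket shared by (4.80a) and (4.80b)
    R_dot = dH_dU / m * b_star + np.cross(b0, force) / (q * B_par)
    U_dot = -(b_star @ force) / m
    M_dot = 0.0                          # invariant magnetic moment
    Theta_dot = q / m * dH_dM + w @ R_dot
    return R_dot, U_dot, M_dot, Theta_dot

# Illustrative (hypothetical) input data.
rng = np.random.default_rng(1)
q, m = 1.0, 2.0
b0 = np.array([0.0, 0.0, 1.0])
B_star = 2.0 * b0 + 0.2 * rng.standard_normal(3)
grad_H = rng.standard_normal(3)
w = rng.standard_normal(3)
dH_dU, dH_dM = 0.8, 1.3

R_dot, U_dot, M_dot, Theta_dot = gyrocentre_rhs(
    grad_H, dH_dU, dH_dM, np.zeros(3), b0, B_star, w, q, m)

# Rate of change of H along the trajectory for time-independent fields.
dH_dt = grad_H @ R_dot + dH_dU * U_dot + dH_dM * M_dot
```

Note that `Theta_dot` is computed but never fed back into the other three right-hand sides, which mirrors the statement that the gyro-phase equation need not be solved.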

5. Gyrokinetic Maxwell model

Thus far, we have derived a gyrokinetic model for single-particle motion for a given electromagnetic field, which includes a time-dependent perturbation. This forms the basis for a gyrokinetic approximation of the coupled and self-consistent Vlasov–Maxwell system of equations, wherein the time-dependent perturbation of the electromagnetic field results from the motion of the charged particles themselves.

In this section, we first introduce the particle distribution function for each of the species, which we then use to formulate the self-consistent action principle following the work of e.g. Sugama (2000). Provided with this action principle, we then derive the resulting EOMs for the particles as well as the corresponding field equations for the electromagnetic field. The field equations are considered in a strong formulation wherein we recognise the macroscopic Maxwell equations. We discuss equilibrium solutions as well as the well-posedness of the field equations. The section is concluded with a discussion on energy conservation.

As we exclusively discuss the gyrokinetic model, which is expressed in gyrocentre coordinates, we drop the $\bar {\bar {\cdot }}$ notation and simply write $\boldsymbol{Z}$ rather than $\bar {\bar {{\boldsymbol{Z}}}}$ .

5.1. Particle distribution function

Several particle species are considered, which we denote by the subscript ‘ $s$ ’, where usually $s \in \{\mathrm{i}, \mathrm{e}\}$ for the ion species ‘ $\mathrm{i}$ ’ and the electron species ‘ $\mathrm{e}$ ’. Each species has its own particle mass $m_s$ and charge $q_s$ . A particle distribution function is considered, for each particle species $s$ , which is denoted by ${f}_s({\boldsymbol{r}}, {u}_\shortparallel , {\mu }, {t})$ and coincides with the number of particles per unit phase-space volume. The particle distribution function is split into its initial background part and time-dependent part,

(5.1)

\begin{equation} {f}_s({\boldsymbol{r}}, {u}_\shortparallel , {\mu }, {t}) = {f}^0_{s}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }) + \delta {f}_{s}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }, {t}) \end{equation}

with $\delta {f}_{s}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }, t^0) = 0$ . Note that the particle distribution function is gyrotropic, i.e. it does not depend on the gyro-phase, which is a consequence of assuming that the initial particle distribution function ${f}^0_{s}$ is gyrotropic. This, in turn, can be justified by noting that the non-dimensionalisation of the Vlasov equation for ${f}^0_{s}$ implies that ${\partial {f}^0_{s}}/{\partial {\theta }} = O(\varepsilon _\omega )$ by making use of (4.54).

We use the lowercase letter ${\boldsymbol{z}} = ({\boldsymbol{r}}, {u}_\shortparallel , {\mu }, {\theta })$ to refer to the Eulerian equivalent of the Lagrangian phase-space coordinate ${\boldsymbol{Z}}$ . The dependence of a particle’s Lagrangian characteristic on the initial phase-space coordinate ${\boldsymbol{z}}^0 = ({\boldsymbol{R}}(t^0), {U}_\shortparallel (t^0), {M}(t^0), {\varTheta }(t^0))$ is denoted in the following way (in the absence of collisions):

(5.2)

\begin{equation} {\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0) = {\boldsymbol{Z}}(t) \quad \text{with} \quad {\boldsymbol{Z}}(t^0) = {\boldsymbol{z}}^0. \end{equation}

The particle distribution function then satisfies (by definition)

(5.3)

\begin{equation} {f}_s({\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0), t) = {f}^0_{s}({\boldsymbol{z}}^0), \end{equation}

where we denote by ${f}^0_{s}$ the particle distribution function at $t^0$ . Hence, the following Vlasov equation is satisfied:

(5.4)

\begin{equation} \frac {{\mathrm{d}} {f}_s}{{\mathrm{d}} t} = 0 \quad \iff \quad \frac {\partial {f}_s}{\partial t} + \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \boldsymbol{\nabla }{f}_s + \dot {{U}}_\shortparallel \frac {\partial {f}_s}{\partial {u}_\shortparallel } = 0, \end{equation}

where we consider the EOMs $\dot {\boldsymbol{Z}}$ given by (4.80) to be evaluated at the Eulerian phase-space coordinate $\boldsymbol{z}$ .

Recall that the physical coordinates, as defined in § 3.1, were denoted by $\tilde {{\boldsymbol{Z}}}$ . The field-theoretic Lagrangian, which is discussed in § 5.2, is formulated using integrals over physical space, which have to be transformed into integrals over the gyrocentre coordinates. For instance, we consider the integral of a function $\tilde {\mathcal{F}}({\boldsymbol{x}}, {\boldsymbol{v}}, t) = \mathcal{F}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }, {\theta }, {t})$ (note that we now write ${\boldsymbol{x}}, {\boldsymbol{v}}$ for the physical position and velocity, to distinguish them from the gyrocentre position and velocity, which are now denoted by ${\boldsymbol{r}}, {\boldsymbol{u}}$ since we have omitted the $ {\bar{\bar{\cdot}}}$ notation),

(5.5)

\begin{equation} \int _{\mathbb{R}^3} \int _\varOmega \tilde {\mathcal{F}} \,\mathrm{d}^3 x \,\mathrm{d}^3 v = \int \mathcal{F} \mathfrak{J}_s \,\mathrm{d}^6 {z}, \end{equation}

where the integration limits and differentials are defined as

(5.6)

\begin{equation} \int \mathrm{d}^6 {z} \mathrel {\mathop :}= \int_{\mathbb{R}^3} \int _\varOmega \mathrm{d}^3 {r}\, \mathrm{d}^3 {u}, \quad \int \mathrm{d}^3 {u} \mathrel {\mathop :}= \int _0^{2 \pi } \int _0^\infty \int _{-\infty }^\infty \mathrm{d} {u}_\shortparallel\, \mathrm{d} {\mu } \,\mathrm{d} {\theta }, \end{equation}

and $\mathfrak{J}_s$ denotes the Jacobian of the coordinate transformation from physical to gyrocentre coordinates,

(5.7)

\begin{equation} \mathfrak{J}_s \mathrel {\mathop :}= \det \frac {\partial \tilde {{\boldsymbol{z}}}}{\partial {\boldsymbol{z}}} = \frac {B_{s,\shortparallel }^{\star }}{m_s}, \end{equation}

which is derived in Appendix G (a proof can also be found in Parra & Calvo (2011, Appendix F)) and can be written explicitly by making use of (4.77).
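To illustrate the velocity-space measure in (5.6), the sketch below evaluates moments of a distribution function with Gauss–Hermite and Gauss–Laguerre quadrature. It rests on assumptions that go beyond the text: the distribution is taken to be a Maxwellian, ${\mu } B_0$ is taken to be the perpendicular kinetic energy (consistent with the term $\bar {\bar {{M}}} B_0$ in the Hamiltonian), and the Jacobian is approximated by its unperturbed value $B_0 / m_s$ , i.e. the $O(\varepsilon _\delta )$ and $O(\varepsilon _B)$ parts of (5.7) are dropped. The function name and parameter values are hypothetical.

```python
import numpy as np

def maxwellian_moment(F, n, T, m, B0, deg=16):
    """Integrate F(u_par, mu) * f_M * (B0 / m) over (u_par, mu, theta).

    f_M = n (m / (2 pi T))^(3/2) exp(-(m u_par^2 / 2 + mu B0) / T) is an
    assumed Maxwellian; B0 / m is the unperturbed Jacobian.
    """
    x, wx = np.polynomial.hermite.hermgauss(deg)    # weight exp(-x^2)
    y, wy = np.polynomial.laguerre.laggauss(deg)    # weight exp(-y)
    u_par = x * np.sqrt(2.0 * T / m)                # u_par = x sqrt(2 T / m)
    mu = y * T / B0                                 # mu = y T / B0
    U, MU = np.meshgrid(u_par, mu, indexing="ij")
    weights = np.outer(wx, wy)
    norm = n * (m / (2.0 * np.pi * T)) ** 1.5
    jacobian = B0 / m
    scale = np.sqrt(2.0 * T / m) * (T / B0)         # du_par dmu = scale dx dy
    gyro_angle = 2.0 * np.pi                        # integrand is gyrotropic
    return gyro_angle * norm * jacobian * scale * np.sum(weights * F(U, MU))

# Illustrative (hypothetical) parameters.
n, T, m, B0 = 3.0, 0.5, 2.0, 1.7
density = maxwellian_moment(lambda u, mu: np.ones_like(u), n, T, m, B0)
energy = maxwellian_moment(lambda u, mu: 0.5 * m * u**2 + mu * B0, n, T, m, B0)
```

Under these assumptions the zeroth moment recovers the density `n` and the kinetic-energy moment recovers `1.5 * n * T`, which is a quick way to confirm that the Jacobian and the integration limits are used consistently.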

We find that the gyrocentre EOMs, as given by (4.80), imply that the phase-space volume is conserved. A proof is given in Appendix H and can also be found in Parra & Calvo (2011, Appendices G and H).

Theorem 2 (Gyrocentre Liouville theorem). The phase-space volume is conserved:

(5.8)

\begin{equation} \frac {\partial \mathfrak{J}_s}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } (\mathfrak{J}_s \,\dot { { {\!\boldsymbol{R}}}}) + \frac {\partial }{\partial {u}_\shortparallel } (\mathfrak{J}_s \dot {{U}}_\shortparallel ) = 0. \end{equation}

Furthermore, integrals of the form (5.5) can be expressed in terms of the initial phase-space coordinates in the following way:

(5.9)

\begin{equation} \int {f}_s \mathcal{F} \mathfrak{J}_s \,\mathrm{d}^6 {z} = \int {f}^0_{s}({\boldsymbol{z}}^0) \mathcal{F}({\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0)) \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 {z}^0 , \end{equation}

where an absence of arguments implies evaluation at (${\boldsymbol{z}}, t$).

By combining (5.4) and (5.8), we find the conservative form of the Vlasov equation

(5.10)

\begin{equation} \frac {\partial }{\partial t} ({f}_s \mathfrak{J}_s) + \boldsymbol{\nabla }\boldsymbol{\cdot } ({f}_s \mathfrak{J}_s \,\dot { { {\!\boldsymbol{R}}}}) + \frac {\partial }{\partial {u}_\shortparallel } ({f}_s \mathfrak{J}_s \dot {{U}}_\shortparallel ) = 0. \end{equation}

Note that integration of the conservative form of the Vlasov equation over velocity space, multiplication by $q_s$ and subsequent summation over the species $s$ results in the free-charge continuity equation

(5.11)

\begin{equation} \frac {\partial \mathcal{R}^{\mathrm{f}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}}^{\mathrm{f}} = 0, \end{equation}

where the gyrocentre free charge and current density are defined as

(5.12a)

\begin{align} \mathcal{R}^{\mathrm{f}} &\mathrel {\mathop :}= \sum _s q_s \int {f}_s \mathfrak{J}_s \,\mathrm{d}^3 {u}, \end{align}

(5.12b)

\begin{align} {\boldsymbol{\mathcal{J}}}^{\mathrm{f}} &\mathrel {\mathop :}= \sum _s q_s \int {f}_s \,\dot { { {\!\boldsymbol{R}}}} \, \mathfrak{J}_s\, \mathrm{d}^3 {u}, \end{align}

respectively. The continuity equation (5.11) thus expresses local conservation of the free charge.

5.2. Low’s action

We use a variational formulation to obtain a structure-preserving self-consistent Vlasov–Maxwell system of equations. Such a variational formulation is particularly suitable for our foreseen structure-preserving discretisation using the finite element exterior calculus (FEEC) (Kraus et al. 2017).

The starting point is Low’s action (Low 1958) in gyrocentre coordinates

(5.13a)

\begin{equation} \mathfrak{A}({\boldsymbol{Z}}, \phi _1, {\boldsymbol{A}}_1) \mathrel {\mathop :}= \int \mathfrak{L}({\boldsymbol{Z}}, \phi _1, {\boldsymbol{A}}_1) \,\mathrm{d} t, \end{equation}

where the field-theoretic Lagrangian is given by

(5.13b)

\begin{align} \mathfrak{L}({\boldsymbol{Z}}, \phi _1, {\boldsymbol{A}}_1) = {} & \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) { {L}}_s( { {{\boldsymbol{Z}}}}(t; {\boldsymbol{z}}^0, t^0), \dot {{\boldsymbol{Z}}}(t; {\boldsymbol{z}}^0, t^0)) \mathfrak{J}_s({\boldsymbol{z}}^0, t^0) \,\mathrm{d}^6 { {z}}^0\nonumber \\ & + \frac {\epsilon _0}{2} \int \lvert {\boldsymbol{E}}_1 \rvert ^2\, \mathrm{d}^3 x - \frac {1}{2 \mu _0} \int \lvert {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1 \rvert ^2 \,\mathrm{d}^3 x . \end{align}

Here, integration over the time coordinate $t$ is done over the interval $[t^0, t^1]$ , where $t^1$ denotes the final time, ${L}_s$ denotes the gyrocentre Lagrangian corresponding to the species $s$ and $\mu _0$ denotes the magnetic permeability in vacuum. Note that we keep a finite value of the vacuum permittivity $\epsilon _0$ as this favourably yields field equations which can be integrated explicitly in time. This is discussed in more detail in § 5.5. In § 6, a low-frequency approximation of this model is proposed, wherein the limit of quasi-neutrality $\epsilon _0 \rightarrow 0$ is considered, thereby eliminating fast waves which would otherwise be present.

We may transform the first integral in (5.13b) by making use of Theorem 2. Rather than transforming the integral resulting from each of the contributions of the gyrocentre Lagrangian, we split the gyrocentre Lagrangian into two parts, ${L}_s = {L}_s^{\mathrm{part}} + {L}_s^{\mathrm{field}}$ , referred to as the particle and field part, respectively. Subsequently, we transform and linearise the contribution from the field part:

(5.14)

\begin{align} & \int {f}^0_{s}({\boldsymbol{z}}^0) {L}_s({\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0), \dot {\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0)) \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 {z}^0 \nonumber \\& \approx \int {f}^0_{s}({\boldsymbol{z}}^0) {L}_s^{\mathrm{part}}({\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0), \dot {\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0)) \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 { {z}}^0 + \int {f}^0_{s} {L}_s^{\mathrm{field}} \mathfrak{J}_{0,s}\, \mathrm{d}^6 {z}. \end{align}

Furthermore, we have defined the (unperturbed) guiding-centre Jacobian as (cf. (5.7))

(5.15)

\begin{equation} \mathfrak{J}_{0,s} \mathrel {\mathop :}= \frac {B_0}{m_s}, \end{equation}

where we have neglected the $O(\varepsilon _\delta )$ and $O(\varepsilon _B)$ terms from the Jacobian. Note that this is a modelling choice which does not break the structure of the resulting equations. The field-theoretic Lagrangian, as given by (5.13b), is now approximated by

(5.16)

\begin{align} \mathfrak{L}({\boldsymbol{Z}}, \phi _1, {\boldsymbol{A}}_1) \mathrel {\mathop :}= {} & \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) {L}_s^{\mathrm{part}}({\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0), \dot {\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0)) \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 {z}^0\nonumber\\ & + \sum _s \int {f}^0_{s} {L}_s^{\mathrm{field}} \mathfrak{J}_{0,s} \,\mathrm{d}^6 {z} + \frac {\epsilon _0}{2} \int \lvert {\boldsymbol{E}}_1 \rvert ^2 \,\mathrm{d}^3 x\nonumber\\ & - \frac {1}{2 \mu _0} \int \lvert {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1 \rvert ^2\, \mathrm{d}^3 x . \end{align}

The field part of the Lagrangian does not affect the EOMs of the particles directly, but only affects the potentials via the field equations, which are derived in § 5.3.2. The reason for splitting the field-theoretic Lagrangian in this way is to simplify the resulting discretised model. For instance, we want to obtain linear field equations and, thus, have linearised the corresponding field part of the Lagrangian. Note that neglecting the time-dependent part $\delta {f}_{s}$ of the particle distribution function is justified only if $\delta {f}_{s}$ is small compared with ${f}^0_{s}$ , which is the case e.g. when studying microturbulence in the core of fusion devices (Garbet et al. 2010).

We recall that the gyrocentre single-particle phase-space Lagrangian is given by (4.67); the following splitting is considered:

(5.17a)

\begin{equation} { {L}}_s^{\mathrm{part}}({\boldsymbol{Z}}, \dot {\boldsymbol{Z}}) \mathrel {\mathop :}= q_s {\boldsymbol{A}}_s^{\star } \boldsymbol{\cdot } \,\dot { { {\!\boldsymbol{R}}}} + \frac {m_s {M}}{q_s} \dot {\varTheta } - {H}_s^{\mathrm{part}}({\boldsymbol{Z}}), \quad {L}_s^{\mathrm{field}}({\boldsymbol{z}}) \mathrel {\mathop :}= - {H}_s^{\mathrm{field}}({\boldsymbol{z}}) \end{equation}

for

(5.17b)

\begin{equation} { {H}}_s^{\mathrm{part}}({\boldsymbol{Z}}) \mathrel {\mathop :}= \frac {m_s}{2} U_\shortparallel ^2 + {M} \big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) + q_s \phi _1 - q_s \rho \big\langle\kern-1.7pt \big| \mathring {E}^\varsigma _{1,\rho } \big|\kern-1.7pt\big\rangle \end{equation}

and

(5.17c)

\begin{equation} {H}_s^{\mathrm{field}}({\boldsymbol{z}}) \mathrel {\mathop :}= \frac {{\mu }}{2 B_0} \lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2 - \frac {m_s}{2 q_s^2 B_0^2} \lvert {\boldsymbol{F}}_{1,\perp } \rvert ^2 . \end{equation}

5.3. Principle of least action

In what follows, we compute the EOMs for the gyrocentre characteristic $\boldsymbol{Z}$ as well as the field equations for the potentials $\phi _1, {\boldsymbol{A}}_1$ . We follow the principle of least action, which states that the trajectories and potentials satisfying the EOMs and field equations are stationary points of the action. The following notation is introduced for computing the variation of the action with respect to the gyrocentre coordinate:

(5.18)

\begin{equation} \frac {\delta \mathfrak{A}}{\delta {\boldsymbol{Z}}} [{{\boldsymbol{\delta }}}] \mathrel {\mathop :}= \left . \frac {{\mathrm{d}} }{{\mathrm{d}} \varepsilon } \right |_{\varepsilon = 0} \mathfrak{A}({\boldsymbol{Z}} + \varepsilon {\boldsymbol{\delta }}, \phi _1, {\boldsymbol{A}}_1), \end{equation}

which we define analogously for the other arguments of the action.

5.3.1. Equations of motion

The EOMs are defined by setting the variation of the action with respect to the gyrocentre coordinate to zero for all suitable trajectories $\boldsymbol{\delta }$ with ${\boldsymbol{\delta }}(t^0) = {\boldsymbol{\delta }}(t^1) = {\boldsymbol{0}}_6$ . It follows that the trajectories satisfy

(5.19)

\begin{equation} \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) \Biggl [\left ( \frac {\partial {L}_s^{\mathrm{part}}}{\partial {\boldsymbol{Z}}} - \frac {{\mathrm{d}} }{{\mathrm{d}} t} \frac {\partial {L}_s^{\mathrm{part}}}{\partial \dot {\boldsymbol{Z}}} \right ) \boldsymbol{\cdot } {\boldsymbol{\delta }} + \underbrace {\frac {{\mathrm{d}} }{{\mathrm{d}} t} \left ( \frac {\partial {L}_s^{\mathrm{part}}}{\partial \dot {\boldsymbol{Z}}} \boldsymbol{\cdot } {\boldsymbol{\delta }} \right )}_{ = 0}\Biggr ] \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 {z}^0 \,\mathrm{d} t = 0 \end{equation}

by making use of partial integration in time. As this must hold for all trajectories $\boldsymbol{\delta }$ , it follows that the EOMs satisfy the Euler–Lagrange equations (see e.g. (3.34)) and, therefore, the EOMs are of the form given by (4.80), except that only the particle part of the Hamiltonian, as defined in (5.17b), is used on the right-hand side.
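The definition (5.18) of the variation and the stationarity condition can be illustrated on a toy problem that is unrelated to the gyrokinetic Lagrangian itself. The sketch below uses a one-degree-of-freedom phase-space Lagrangian $L = p \dot {q} - (p^2 + q^2)/2$ (our choice, purely for illustration), approximates the directional derivative in (5.18) by a central difference in $\varepsilon$ and evaluates it once on a solution of Hamilton's equations and once on an arbitrary curve. The grid size, the perturbations and all function names are hypothetical.

```python
import numpy as np

def action(q, p, t):
    """Discrete toy action: integral of p * dq/dt - (p^2 + q^2) / 2 over t."""
    q_dot = np.gradient(q, t, edge_order=2)
    lagrangian = p * q_dot - 0.5 * (p**2 + q**2)
    dt = np.diff(t)
    return np.sum(0.5 * (lagrangian[1:] + lagrangian[:-1]) * dt)  # trapezoidal rule

def first_variation(q, p, dq, dp, t, eps=1e-3):
    """Central-difference approximation of d/d(eps) A(Z + eps * delta) at eps = 0."""
    plus = action(q + eps * dq, p + eps * dp, t)
    minus = action(q - eps * dq, p - eps * dp, t)
    return (plus - minus) / (2.0 * eps)

t = np.linspace(0.0, np.pi, 2001)
dq = np.sin(t)           # perturbation vanishing at both end points
dp = np.sin(2.0 * t)     # perturbation vanishing at both end points

# Solution of Hamilton's equations for H = (p^2 + q^2) / 2.
var_solution = first_variation(np.cos(t), -np.sin(t), dq, dp, t)

# An arbitrary curve that does not solve the equations of motion.
var_other = first_variation(t.copy(), np.ones_like(t), dq, dp, t)
```

On the exact trajectory the variation vanishes up to discretisation error, whereas for the second curve it evaluates to approximately $-\pi$ , which follows from integrating $\cos t - t \sin t$ over $[0, \pi ]$ .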

We compute the required partial derivatives of ${H}_s^{\mathrm{part}}$ . The term appearing in the EOMs given by (4.80) can be written as

(5.20)

\begin{equation} \boldsymbol{\nabla }{H}_s^{\mathrm{part}} + q_s \frac {\partial {\boldsymbol{A}}^{\star }_1}{\partial t} = {M} \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) - q_s {\boldsymbol{E}}^{\star }_1 \end{equation}

by substituting (4.71a ), (4.27), (4.68) and (5.17b ). Furthermore, we made use of Faraday’s law,

(5.21)

\begin{equation} \frac {\partial {\boldsymbol{B}}_1}{\partial t} = -\boldsymbol{\nabla }\times {\boldsymbol{E}}_1, \end{equation}

which follows from the definition of the electromagnetic fields (4.27).

The second partial derivative that is required for the EOMs is the one with respect to the parallel velocity and is given by

(5.22)

\begin{equation} \frac {\partial {H}_s^{\mathrm{part}}}{\partial {U}_\shortparallel } = m_s {U}_\shortparallel . \end{equation}

Substitution of these results in (4.80) yields (we only show the relevant and non-trivial EOMs)

(5.23a)

\begin{align} \,\dot { { {\!\boldsymbol{R}}}} &= {U}_\shortparallel {\boldsymbol{b}}_s^{\star } - \frac {1}{q_s B_{s,\shortparallel }^{\star }} {{\boldsymbol{\hat {b}}}_0} \times \big[ q_s {\boldsymbol{E}}^{\star }_1 - {M} \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) \big], \end{align}

(5.23b)

where we recall that ${\boldsymbol{b}}_s^{\star }$ is defined in (4.76) and $B_{s,\shortparallel }^{\star }$ is given by (4.77). When substituting (4.76), we find that the EOM for the gyrocentre position $\boldsymbol{R}$ can be written as

(5.24)

\begin{equation} \begin{aligned} \,\dot { { {\!\boldsymbol{R}}}} = {} & {U}_\shortparallel \biggl ({{\boldsymbol{\hat {b}}}_0} + \overbrace {\frac {{\boldsymbol{B}}^{\star }_{1,\perp }}{B_{s,\shortparallel }^{\star }}}^{\text{magnetic flutter}}\biggr ) + \frac {1}{q_s B_{s,\shortparallel }^{\star }} \biggl [ \overbrace {q_s {\boldsymbol{E}}^{\star }_1}^{\text{ExB drift}} - \overbrace {{M} \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big)}^{\text{grad-B drift}} + \overbrace {m_s {U}_\shortparallel ^2 {\boldsymbol{\kappa }}}^{\text{curvature drift}}\\[6pt] & - \underbrace {\frac {m_s {M} {U}_\shortparallel }{q_s} (\boldsymbol{\nabla }\times {\boldsymbol{w}}_0) \times {{\boldsymbol{\hat {b}}}_0}}_{\text{gyro-gauge invariance}} \biggr ] \times {{\boldsymbol{\hat {b}}}_0}, \end{aligned} \end{equation}

where we have indicated the physical meaning of each of the terms. The EOM for the gyrocentre parallel velocity ${U}_\shortparallel$ , as given by (5.23b ), contains two contributions: an acceleration due to the perturbed parallel component of the FLR corrected electric field ${\boldsymbol{E}}^{\star }_1$ as well as the contribution due to the magnetic mirror force.

We recall that the FLR corrected electromagnetic fields ${\boldsymbol{E}}^{\star }_1$ and ${\boldsymbol{B}}^{\star }_1$ are approximations of their respective gyro-averaged counterparts $\langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle$ and $\langle \,\mathring {\!{\boldsymbol{B}}}_1 \rangle$ according to (4.72). It should be noted that letting ${\boldsymbol{B}}^{\star }_1 = \langle \,\mathring {\!{\boldsymbol{B}}}_1 \rangle$ and/or ${\boldsymbol{E}}^{\star }_1 = \langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle$ implies that the model no longer results from an action principle, thereby resulting in a loss of energy conservation. Even if $\varepsilon _B = 0$ , one must be aware that the identities given by (4.72) result from application of the gradient Theorem (D.10), which is not likely to hold numerically.

5.3.2. Field equations

We give Low’s action explicitly for our parameter choice $(\xi _R, \xi _\varTheta ) = (1, 0)$ such that we can find the field equations by computing the appropriate variations

(5.25)

\begin{align} &\mathfrak{A}({\boldsymbol{Z}}, \phi _1, {\boldsymbol{A}}_1) = \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) \biggl [ q_s \big( {\boldsymbol{A}}_{0,s}^{\star } + {\boldsymbol{A}}_1 + \big\langle\kern-1.7pt \big| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1\times {\boldsymbol{\rho }} \big|\kern-1.5pt\big\rangle \big) \boldsymbol{\cdot } \,\dot { { {\!\boldsymbol{R}}}} + \frac {m_s {M}}{q_s} \dot {\varTheta } \nonumber \\ &- \frac {m_s}{2} {U}_\shortparallel ^2 - {M} \big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) - q_s \phi _1 + q_s \rho \big\langle\kern-1.7pt \big| \mathring {E}^\varsigma _{1,\rho } \big|\kern-1.7pt\big\rangle \biggr ] \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 {z}^0 \mathrm{d} t\nonumber \\ &+ \sum _s \int {f}^0_{s} \biggl [ \frac {m_s}{2 B_0^2} \lvert {\boldsymbol{E}}_{1,\perp } \rvert ^2 - \frac {m_s {u}_\shortparallel }{B_0^2} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } ({\boldsymbol{E}}_{1} \times {\boldsymbol{B}}_{1}) - \left ( {\mu } B_0 - m_s {u}_\shortparallel ^2 \right ) \frac {\lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2}{2 B_0^2} \biggr ] \mathfrak{J}_{0,s}\, \mathrm{d}^6 {z} \,\mathrm{d} t\nonumber \\ &+ \frac {\epsilon _0}{2} \int \lvert {\boldsymbol{E}}_1 \rvert ^2 \,\mathrm{d}^3 x \,\mathrm{d} t - \frac {1}{2 \mu _0} \int \lvert {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1 \rvert ^2 \,\mathrm{d}^3 x \,\mathrm{d} t , \end{align}

where we have substituted (4.68), (5.17) and (5.16) into (5.13a ). Recall that the electromagnetic fields are defined in (4.27).

Each field equation is derived by setting the variation of the action with respect to the corresponding potential to zero. We start with Gauss’s law, which results from setting the variation of Low’s action (5.25) with respect to the scalar potential $\phi _1$ to zero. That is, Gauss’s law is derived from

(5.26)

\begin{equation} \frac {\delta \mathfrak{A}}{\delta \phi _1} [{\varLambda }] = 0, \end{equation}

where $\varLambda$ is a scalar test function. We note that the substitution $\phi _1 \mapsto \phi _1 + \varepsilon \varLambda$ results in

(5.27)

\begin{equation} {\boldsymbol{E}}_1 \mapsto {\boldsymbol{E}}_1 - \varepsilon \boldsymbol{\nabla }\varLambda . \end{equation}

This results in the following Gauss law:

(5.28)

\begin{equation} - \int \left ( \epsilon _0 {\boldsymbol{E}}_1 + {\boldsymbol{\mathcal{P}}}_1 \right ) \boldsymbol{\cdot } \boldsymbol{\nabla }\varLambda \,\mathrm{d}^3 {r} = \sum _s q_s \int {f}_s \langle \mathring {\varLambda } \rangle \mathfrak{J}_s\, \mathrm{d}^6 {z} , \end{equation}

where we have used Liouville’s theorem to transform the integral over ${f}_s$ , assumed the test function to be independent of time and have defined the electric polarisation as

(5.29)

\begin{equation} {\boldsymbol{\mathcal{P}}}_1 \mathrel {\mathop :}= \sum _s \int {f}^0_{s} {\boldsymbol{P}}_{1,s} \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} , \quad {\boldsymbol{P}}_{1,s} \mathrel {\mathop :}= \frac {m_s}{q_s B_0^2} {\boldsymbol{F}}_{1,\perp }, \end{equation}

where we recall that ${\boldsymbol{F}}_{1}$ denotes the Lorentz force as defined in (4.56). In simplifying the right-hand side of (5.28), we have made use of the gradient Theorem (D.8),

(5.30)

\begin{equation} \varLambda + \rho \langle\kern0.3pt \!| {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } \mathring {\boldsymbol{\nabla} }^\varsigma \varLambda |\!\kern0.3pt\rangle = \langle \mathring {\varLambda } \rangle . \end{equation}

We note that, because the gradient theorem in this form does not hold numerically, it is important to discretise the gyro-average using the left-hand side of (5.30), whenever gauge invariance is to be preserved numerically.
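The role of (5.30) in a discretisation can be illustrated with a small numerical experiment. The sketch below is not taken from the derivation above. It assumes a two-dimensional perpendicular plane, interprets the gyro-average as a uniform average over the gyro-angle, and interprets the radially averaged operator as an additional average over $\varsigma \in [0, 1]$ along the gyroradius. Both interpretations, as well as all function names and sample values, are assumptions made for illustration. The sketch checks that the two sides of (5.30) agree for a cubic test function when the radial quadrature is exact.

```python
import math

# 3-point Gauss-Legendre rule mapped to [0, 1]; exact for polynomials up to degree 5.
GL_NODES = (0.5 - 0.5 * math.sqrt(0.6), 0.5, 0.5 + 0.5 * math.sqrt(0.6))
GL_WEIGHTS = (5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0)


def gyro_average(func, centre, rho, n_theta=16):
    """Right-hand side of (5.30): average of func over the gyro-ring (assumed form)."""
    total = 0.0
    for k in range(n_theta):
        theta = 2.0 * math.pi * k / n_theta
        total += func(centre[0] + rho * math.cos(theta),
                      centre[1] + rho * math.sin(theta))
    return total / n_theta


def gradient_form(func, grad, centre, rho, n_theta=16):
    """Left-hand side of (5.30): Lambda + rho <| rho_hat . grad Lambda |> (assumed form)."""
    total = 0.0
    for k in range(n_theta):
        theta = 2.0 * math.pi * k / n_theta
        c, s = math.cos(theta), math.sin(theta)
        # Average the radial derivative over varsigma in [0, 1] along this gyro-angle.
        for node, weight in zip(GL_NODES, GL_WEIGHTS):
            gx, gy = grad(centre[0] + node * rho * c, centre[1] + node * rho * s)
            total += weight * (c * gx + s * gy)
    return func(centre[0], centre[1]) + rho * total / n_theta


def cubic_lambda(x, y):
    """Hypothetical cubic test function Lambda."""
    return x * x * y + 3.0 * x - y ** 3


def cubic_lambda_grad(x, y):
    """Analytic gradient of cubic_lambda."""
    return (2.0 * x * y + 3.0, x * x - 3.0 * y * y)


centre, rho = (0.3, -0.2), 0.7  # hypothetical gyrocentre position and gyroradius
lhs = gradient_form(cubic_lambda, cubic_lambda_grad, centre, rho)
rhs = gyro_average(cubic_lambda, centre, rho)
print(abs(lhs - rhs))
```

For a test function that is not a low-degree polynomial, the two sides of this sketch agree only up to the radial quadrature error, which is one way to see why the choice of side matters once the operators are discretised.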

The Ampère–Maxwell law is derived by imposing

(5.31)

\begin{equation} \frac {\delta \mathfrak{A}}{\delta {\boldsymbol{A}}_1} [{{{\boldsymbol{\varLambda }}}}] = 0, \end{equation}

where ${\boldsymbol{\varLambda }}$ is a vector-valued test function and we note that the substitution ${\boldsymbol{A}}_1 \mapsto {\boldsymbol{A}}_1 + \varepsilon {{\boldsymbol{\varLambda }}}$ results in

(5.32)

\begin{equation} {\boldsymbol{E}}_1 \mapsto {\boldsymbol{E}}_1 - \varepsilon \frac {\partial {{\boldsymbol{\varLambda }}}}{\partial t} , \quad {\boldsymbol{B}}_1 \mapsto {\boldsymbol{B}}_1 + \varepsilon \boldsymbol{\nabla }\times {{\boldsymbol{\varLambda }}}. \end{equation}

This results in

(5.33)

\begin{align} & \frac {1}{\mu _0} \int ({\boldsymbol{B}}_0 + {\boldsymbol{B}}_1) \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {{\boldsymbol{\varLambda }}}) \,\mathrm{d}^3 x = \int \left [ \frac {\partial }{\partial t} \left ( \epsilon _0 {\boldsymbol{E}}_1 + {\boldsymbol{\mathcal{P}}}_1 \right ) + \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{M}}}_1 \right ] \boldsymbol{\cdot } {{\boldsymbol{\varLambda }}} \,\mathrm{d}^3 {r} \nonumber \\& \quad + \sum _s \int {f}_s \big[ q_s \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \big( {{\boldsymbol{\varLambda }}} + \boldsymbol{\nabla }\langle\kern0.3pt \!| \rho \mathring {\varLambda }^\varsigma _\rho |\!\kern0.3pt\rangle + \langle\kern0.3pt \!| (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}}) \times {\boldsymbol{\rho }} |\!\kern0.3pt\rangle \big) - {\mu } \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}})_\shortparallel \rangle \!\rangle \big] \mathfrak{J}_s \,\mathrm{d}^6 {z}, \end{align}

where we made use of partial integration in time, substituted (5.10) and (5.15), and have defined the magnetisation as

(5.34)

\begin{equation} {\boldsymbol{\mathcal{M}}}_1 \mathrel {\mathop :}= \sum _s \int {f}^0_{s} {\boldsymbol{M}}_{1,s} \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} , \quad {\boldsymbol{M}}_{1,s} \mathrel {\mathop :}= - {u}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{P}}_{1,s} - \frac {{\mu }}{B_0} {\boldsymbol{B}}_{1,\perp } . \end{equation}

We define the rest-frame magnetic and electric dipole moments per particle as

(5.35a)

\begin{align} {\boldsymbol{\mathfrak{m}}}_s &\mathrel {\mathop :}= - {\mu } \left( {{\boldsymbol{\hat {b}}}_0} + \frac {{\boldsymbol{B}}_{1,\perp }}{B_0} \right), \\[-5pt]\nonumber\end{align}

(5.35b)

\begin{align} {\boldsymbol{\mathfrak{p}}}_{1,s} &\mathrel {\mathop :}= \frac {m_s}{q_s B_0^2} {\boldsymbol{F}}_{1,\perp } = \frac {m_s}{B_0^2} \big( {\boldsymbol{E}}_{1,\perp } + {u}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1 \big), \end{align}

where we have included the intrinsic guiding-centre magnetic moment $-{\mu } {{\boldsymbol{\hat {b}}}_0}$ (Bittencourt 2004, (4.35)) coming from the ZLR part of $- {\mu } \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}})_\shortparallel \rangle \!\rangle$ (the last term on the right-hand side of (5.33)). The minus sign in (5.35a) reflects the fact that the plasma is diamagnetic as the magnetic dipole moment points in the opposite direction to the magnetic field. It follows that the magnetic and electric dipole moments per particle are given by

(5.36a)

\begin{align} {\boldsymbol{\mathfrak{M}}}_s &= {\boldsymbol{\mathfrak{m}}}_s - {\boldsymbol{v}}_0 \times {\boldsymbol{\mathfrak{p}}}_{1,s}, \end{align}

(5.36b)

\begin{align} {\boldsymbol{P}}_{1,s} &= {\boldsymbol{\mathfrak{p}}}_{1,s} + \frac {1}{c^2} \underbrace {{\boldsymbol{v}} \times {\boldsymbol{\mathfrak{m}}}_s}_{= 0} \end{align}

by making use of (5.29) and (5.34). Here, $c = 1 / \sqrt {\epsilon _0 \mu _0}$ denotes the speed of light, and we have defined the velocity $\boldsymbol{v}$ as

(5.37)

\begin{equation} {\boldsymbol{v}} \mathrel {\mathop :}= \left ( {{\boldsymbol{\hat {b}}}_0} + \frac {{\boldsymbol{B}}_{1,\perp }}{B_0} \right ) {u}_\shortparallel , \end{equation}

which includes the contribution from the magnetic flutter as found in (5.24). The expressions for the magnetic and electric dipole moments per particle can now directly be compared with those found in Brizard & Hahm (2007, (34) and (35)) as well as with the expressions of a moving electric and magnetic dipole described by Fisher (1971) and Hnizdo (2012).

Remark 3. In obtaining (5.33), we have made use of partial integration in time, thereby omitting the following term from the variation of the action:

(5.38)

\begin{align} \frac {\delta \mathfrak{A}}{\delta {\boldsymbol{A}}_1} [{{{\boldsymbol{\varLambda }}}}] = {} & \int (\text{the Amp}\unicode{x00E8}\text{re-Maxwell law})\, \mathrm{d} t -\sum _s q_s \int \frac {\partial }{\partial t} \big( {f}_s \mathfrak{J}_s \rho \langle\kern0.3pt \!| \mathring {\varLambda }^\varsigma _\rho |\!\kern0.3pt\rangle \big)\, \mathrm{d}^6 {z} \,\mathrm{d} t \nonumber \\ & - \sum _s \int {f}^0_{s} \frac {m_s}{q_s B_0^2} \frac {\partial }{\partial t} ({\boldsymbol{F}}_{1,\perp } \boldsymbol{\cdot } {{\boldsymbol{\varLambda }}}) \mathfrak{J}_{0,s}\, \mathrm{d}^6 {z}\, \mathrm{d} t . \end{align}

We explicitly state this term as it plays a crucial role in the derivation of the conserved energy in § 5.8.

5.4. Strong formulation of the field equations

The previously discussed field equations were given in a weak formulation, which is how they naturally arise from the variational formulation. The weak formulation is exactly what we need for a future FEEC (Kraus et al. 2017) discretisation of the field equations; however, when it comes to physical interpretation, it is not the most convenient way to present the equations. We therefore consider the strong formulation of the field equations, in which we moreover highlight the macroscopic Maxwell structure of the equations.

In essence, the strong formulation of Gauss’s law is the equation which, once multiplied by the scalar test function $\varLambda$ and integrated over the spatial domain, results in Gauss’s law (5.28) after partial integration. Here, we note that the right-hand side of Gauss’s law (5.28) contains the gyro-average of the test function. Hence, to find the strong formulation, we introduce the gyro-average adjoint $\overline {\mathcal{R}}{}^{\mathrm{f}}$ of the free charge density $\mathcal{R}^{\mathrm{f}}$ , which is defined such that

(5.39a)

\begin{equation} \int \overline {\mathcal{R}}{}^{\mathrm{f}} \varLambda\, \mathrm{d}^3 r \mathrel {\mathop :}= \sum _s q_s \int {f}_s \langle \mathring {\varLambda } \rangle \, \mathfrak{J}_s \,\mathrm{d}^6 {z} \end{equation}

for all suitable test functions $\varLambda$ . The gyro-average adjoint of the free current density is defined analogously as

(5.39b)

\begin{align} \int \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {{\boldsymbol{\varLambda }}} \,\mathrm{d}^3 { {r}} \mathrel {\mathop :}= {} & \sum _s \int {f}_s \big[q_s \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \big( {{\boldsymbol{\varLambda }}} + \boldsymbol{\nabla }\langle\kern0.3pt \!| \rho \mathring {\varLambda }^\varsigma _\rho |\!\kern0.3pt\rangle + \langle\kern0.3pt \!| (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}}) \times {\boldsymbol{\rho }} |\!\kern0.3pt\rangle \big)\nonumber \\ & - {\mu } \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}})_\shortparallel \rangle \!\rangle \big ] \mathfrak{J}_s\, \mathrm{d}^6 {z} \end{align}

for which

(5.40)

\begin{equation} \int \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {{\boldsymbol{\varLambda }}}\, \mathrm{d}^3 {r} = \sum _s q_s \int {f}_s \langle {\boldsymbol{U}}^{\star } \boldsymbol{\cdot } \mathring {{{\boldsymbol{\varLambda }}}} \rangle \, \mathfrak{J}_s \,\mathrm{d}^6 {z} + O(\varepsilon _B) \end{equation}

by making use of (D.10) and (D.7). Here, we have defined the effective gyrocentre velocity as (cf. (3.13))

(5.41)

\begin{equation} {\boldsymbol{U}}^{\star } \mathrel {\mathop :}= \,\dot { { {\!\boldsymbol{R}}}} + {u}_\tau {\boldsymbol{\hat {\tau }}} \end{equation}

for which $\langle {\boldsymbol{U}}^{\star } \rangle = \,\dot { { {\!\boldsymbol{R}}}}$ . The tangential velocity component ${u}_\tau$ used in (5.41) is in gyrocentre coordinates, i.e. it is defined according to (3.14) evaluated at gyrocentre coordinates. Similarly, the gyroradius $\rho$ is defined according to (3.22) evaluated at gyrocentre coordinates.

We recall that the contribution to the free current density given by $-{\mu } \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}})_\shortparallel \rangle \!\rangle$ on the right-hand side of (5.39b ) results in the intrinsic guiding-centre magnetic moment and was included in (5.35a ) to define the rest-frame magnetic moment. The complicated term on the right-hand side of (5.39b ) that multiplies $\,\dot { { {\!\boldsymbol{R}}}}$ is essential in § 5.7 for showing that the field equations are compatible. The key property of this term is found by letting ${{\boldsymbol{\varLambda }}} = \boldsymbol{\nabla }\varLambda$ , as one does when computing the divergence of the adjoint of the free current density. This results in

(5.42)

\begin{equation} \int \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } \boldsymbol{\nabla }\varLambda\, \mathrm{d}^3 {r} = \sum _s q_s \int {f}_s \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \boldsymbol{\nabla }\langle \mathring {\varLambda } \rangle \mathfrak{J}_s \,\mathrm{d}^6 {z}, \end{equation}

where we have made use of the gradient Theorem (D.8). This shows that the gradient of the gyro-average of the test function appears, rather than the gyro-average of the gradient. This equality is essential in showing that the gyro-average adjoint of the free-charge continuity equation also holds,

(5.43)

\begin{equation} \int \left ( \frac {\partial \overline {\mathcal{R}}{}^{\mathrm{f}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \right ) \varLambda \,\mathrm{d}^3 {r} = 0 \qquad \implies \qquad \frac {\partial \overline {\mathcal{R}}{}^{\mathrm{f}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} = 0 \end{equation}

as follows from multiplying the conservative form of the Vlasov equation (5.10) by $q_s$ and the gyro-averaged scalar test function $\langle \mathring {\varLambda } \rangle$ , integrating over phase space, applying partial integration and substituting the gyro-average adjoints defined in (5.39).

We can write the strong formulation of the field equations as

(5.44a)

\begin{align} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{D}}} &= \overline {\mathcal{R}}{}^{\mathrm{f}}, \end{align}

(5.44b)

\begin{align} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{B}} &= 0, \end{align}

(5.44c)

\begin{align} \boldsymbol{\nabla }\times {\boldsymbol{E}} &= - \frac {\partial {\boldsymbol{B}}}{\partial t}, \end{align}

(5.44d)

\begin{align} \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}} &= \frac {\partial {\boldsymbol{\mathcal{D}}}}{\partial t} + \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}, \end{align}

where the constitutive relations defining the displacement and magnetising field are given by

(5.44e)

\begin{align} {\boldsymbol{\mathcal{D}}} &\mathrel {\mathop :}= \epsilon _0 {\boldsymbol{E}}_1 + {\boldsymbol{\mathcal{P}}}_1, \end{align}

(5.44f)

\begin{align} {\boldsymbol{\mathcal{H}}} &\mathrel {\mathop :}= \frac {1}{\mu _0} {\boldsymbol{B}} - {\boldsymbol{\mathcal{M}}}_1. \end{align}

The displacement current density is given by $\partial {{\boldsymbol{\mathcal{D}}}} / \partial t$ . We recall that the polarisation ${\boldsymbol{\mathcal{P}}}_1$ and magnetisation ${\boldsymbol{\mathcal{M}}}_1$ are defined in (5.29) and (5.34), respectively, and we note that ${\boldsymbol{B}} = {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1$ as well as ${\boldsymbol{E}} = {\boldsymbol{E}}_1$ . In addition to Gauss’s law (5.44a ) and the Ampère–Maxwell law (5.44d ), we have included Faraday’s law (5.44c ) as well as the magnetic Gauss law (5.44b ). The latter two equations are satisfied automatically when a potential formulation is used, but due to the gauge invariance of the proposed model, we are able to express the proposed model entirely in terms of the electromagnetic fields, which thereby requires (5.44b ) and (5.44c ).

Writing the field equations in this way shows that the proposed gauge-invariant gyrokinetic model can in fact be interpreted as a material property in the macroscopic Maxwell equations. As with the vacuum Maxwell equations, we find that substituting the partial time derivative of Gauss’s law (5.44a) in the divergence of the Ampère–Maxwell law (5.44d) yields the free-charge continuity equation (5.43). Hence, the field equations possess a constraint which is automatically satisfied as a consequence of the particle EOMs. The fact that precisely this constraint arises in the field equations is a consequence of gauge invariance of the gyrocentre single-particle phase-space Lagrangian (cf. (4.36)) as discussed in § 5.7 and, in particular, Remark 4.
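The compatibility constraint can be written out in one line. As a sketch, assuming that the fields are sufficiently smooth for the divergence of a curl to vanish and for the time derivative and divergence to commute, taking the divergence of (5.44d) and substituting (5.44a) gives

\begin{equation*} 0 = \boldsymbol{\nabla }\boldsymbol{\cdot } (\boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}}) = \frac {\partial }{\partial t} \left ( \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{D}}} \right ) + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} = \frac {\partial \overline {\mathcal{R}}{}^{\mathrm{f}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}, \end{equation*}

which coincides with the strong form of the continuity equation in (5.43).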

5.5. Structure of the initial value problem

The Ampère–Maxwell law (5.44d ) can be written as (upon substitution of Faraday’s law (5.44c))

(5.45)

\begin{equation} \left ( \epsilon _0 \unicode{x1D644}_3 + \mathcal{C}(1) {\varPi }_\perp \right ) \frac {\partial {\boldsymbol{E}}}{\partial t} = \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}} - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} + \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times (\boldsymbol{\nabla }\times {\boldsymbol{E}}), \end{equation}

where the perpendicular projection matrix is defined as

(5.46)

\begin{equation} {\varPi }_\perp \mathrel {\mathop :}= \unicode{x1D644}_3 - {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0}, \end{equation}

and we have defined the spatially varying functions $\mathcal{C}(\zeta )$ as

(5.47)

\begin{equation} \mathcal{C}(\zeta ) \mathrel {\mathop :}= \frac {1}{B_0^2} \sum _s m_s \int {f}^0_{s} \zeta \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} . \end{equation}

We note that the positivity of $\mathcal{C}(1)$ implies that the matrix on the left-hand side can be trivially inverted, provided that the vacuum permittivity $\epsilon _0$ is positive. This results in an evolution equation for the electric field $\boldsymbol{E}$ , which, combined with the evolution equation for the magnetic field $\boldsymbol{B}$ (i.e. Faraday’s law (5.44c)) as well as the particle EOMs (5.23), yields an initial value problem (IVP) for the unknowns $({\boldsymbol{E}}, {\boldsymbol{B}}, {f}_s)$ (where the solution of the characteristics ${\boldsymbol{Z}}(t; {\boldsymbol{z}}^0, t^0)$ defines the distribution function ${f}_s$ ). We note that solving this IVP requires an initial particle distribution function ${f}^0_{s}$ that is compatible with the background magnetic field ${\boldsymbol{B}}_0$ (as discussed in § 5.6), an initial electric field ${\boldsymbol{E}}^0$ that satisfies Gauss’s law (5.44a) and an initial magnetic field ${\boldsymbol{B}}^0$ that satisfies the magnetic Gauss law (5.44b).
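The inversion can be written in closed form, because ${{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0}$ and ${\varPi }_\perp$ are complementary orthogonal projectors. The following sketch builds the matrix on the left-hand side of (5.45) at a single point and verifies the candidate inverse $\epsilon _0^{-1} {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0} + (\epsilon _0 + \mathcal{C}(1))^{-1} {\varPi }_\perp$ . The function names, the field direction and the non-dimensional sample values of $\epsilon _0$ and $\mathcal{C}(1)$ are hypothetical and chosen only for illustration.

```python
import math


def perp_projector(b):
    """Pi_perp = I_3 - b (x) b for a unit vector b, cf. (5.46)."""
    return [[(1.0 if i == j else 0.0) - b[i] * b[j] for j in range(3)]
            for i in range(3)]


def lhs_matrix(eps0, c1, b):
    """Matrix eps0 * I_3 + C(1) * Pi_perp from the left-hand side of (5.45)."""
    proj = perp_projector(b)
    return [[eps0 * (1.0 if i == j else 0.0) + c1 * proj[i][j] for j in range(3)]
            for i in range(3)]


def lhs_inverse(eps0, c1, b):
    """Closed-form inverse; requires eps0 != 0 and eps0 + c1 != 0."""
    proj = perp_projector(b)
    return [[b[i] * b[j] / eps0 + proj[i][j] / (eps0 + c1) for j in range(3)]
            for i in range(3)]


def matmul(a, m):
    """Plain 3x3 matrix product."""
    return [[sum(a[i][k] * m[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


norm = math.sqrt(1.0 + 4.0 + 4.0)
b_hat = [1.0 / norm, 2.0 / norm, 2.0 / norm]  # hypothetical unit field direction
eps0, c1 = 1.0, 250.0                         # hypothetical non-dimensional values
product = matmul(lhs_matrix(eps0, c1, b_hat), lhs_inverse(eps0, c1, b_hat))
```

In this form the parallel block of the inverse scales as $1/\epsilon _0$ , which is consistent with the degeneracy of the quasi-neutral limit $\epsilon _0 = 0$ considered below.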

Moreover, it is worth noting that having a positive vacuum permittivity introduces the light wave as well as the Langmuir wave into the proposed model, which may seem problematic because of their high velocity. However, the light wave does not travel at the vacuum speed of light, but rather at the speed of light in the gyrokinetic plasma, which is much lower than the vacuum speed of light (Burby et al. 2015). The presence of such fast waves (including the compressional Alfvén wave, as we demonstrate in § 7.3) nevertheless implies that explicit time integration is subject to a stringent time-step constraint, so that implicit time-integration methods might be of interest. A quasi-neutral Darwin approximation to the gyrokinetic model can be considered when such fast waves are not of interest, as discussed in § 6.

In the limit of quasi-neutrality (i.e. $\epsilon _0 = 0$ ), the light wave as well as the Langmuir wave are removed from the model, while the compressional Alfvén wave remains. In this limit, we find that the displacement field is perpendicular to the background magnetic field ${\boldsymbol{\mathcal{D}}} \perp {{\boldsymbol{\hat {b}}}_0}$ (by substituting (4.56) and (5.29))

(5.48)

\begin{equation} \epsilon _0 = 0 \quad \implies \quad {\boldsymbol{\mathcal{D}}} = \mathcal{C}(1) {\boldsymbol{E}}_{\perp } + \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}. \end{equation}

It follows that (5.45) yields an evolution equation for ${\boldsymbol{E}}_{\perp }$ only and not for $E_{\shortparallel }$ . This means that upon discretising (5.44) in space, we find a system of differential-algebraic equations (DAEs) rather than a system of ordinary differential equations (ODEs).
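The origin of this degeneracy can be traced in two steps. As a sketch, the polarisation is first expanded using (5.29), (5.35b) and (5.47), after which (5.45) is projected onto ${{\boldsymbol{\hat {b}}}_0}$ . The second step assumes that ${{\boldsymbol{\hat {b}}}_0}$ does not depend on time, as its role as the direction of the background field suggests, and uses ${{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } {\varPi }_\perp = {\boldsymbol{0}}$ together with ${{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } [{{\boldsymbol{\hat {b}}}_0} \times (\boldsymbol{\nabla }\times {\boldsymbol{E}})] = 0$ :

\begin{align*} {\boldsymbol{\mathcal{P}}}_1 &= \sum _s \int {f}^0_{s} \frac {m_s}{B_0^2} \big( {\boldsymbol{E}}_{1,\perp } + {u}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1 \big) \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} = \mathcal{C}(1) {\boldsymbol{E}}_{1,\perp } + \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1, \\ {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \left ( \epsilon _0 \unicode{x1D644}_3 + \mathcal{C}(1) {\varPi }_\perp \right ) \frac {\partial {\boldsymbol{E}}}{\partial t} &= \epsilon _0 \frac {\partial E_{\shortparallel }}{\partial t} = {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \big( \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}} - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \big). \end{align*}

For $\epsilon _0 = 0$ , the left-hand side of the second line vanishes, so that the parallel projection carries no time derivative of $E_{\shortparallel }$ and acts as an algebraic constraint.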

Here, we follow the works of Chen et al. (2021) and McMillan (2023), and compute the time derivative of the parallel component of the Ampère–Maxwell law (5.44d), followed by substituting Faraday’s law (5.44c). This results in the following constraint equation for $E_{1,\shortparallel }$ (i.e. not an evolution equation):

(5.49)

\begin{align} \frac {1}{\mu _0} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \big( \boldsymbol{\nabla }\times \big[ \boldsymbol{\nabla }\times \big( E_{1,\shortparallel } {{\boldsymbol{\hat {b}}}_0} \big) \big] \big) &= -\frac {\partial \overline {\mathcal{J}_\shortparallel }{}^{\mathrm{f}}}{\partial t}\nonumber\\&\quad - {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \left [ \boldsymbol{\nabla }\times \left ( \frac {1}{\mu _0}\boldsymbol{\nabla }\times {\boldsymbol{E}}_{1,\perp } + \frac {\partial {\boldsymbol{\mathcal{M}}}_1}{\partial t} \right ) \right ]. \end{align}

In general, the magnetisation ${\boldsymbol{\mathcal{M}}}_1$ also depends on ${\boldsymbol{B}}_{1,\perp }$ , and therefore its time derivative depends on ${\boldsymbol{E}}_1$ . However, this dependency vanishes when an isotropic background pressure is considered (see also (7.12)), as is often the case. It is worth noting that, for a constant background magnetic field, the operator on the left-hand side of (5.49) reduces, up to the constant factor $-1/\mu _0$ , to the perpendicular Laplacian $\boldsymbol{\nabla }\boldsymbol{\cdot } \boldsymbol{\nabla} _\perp E_{1,\shortparallel }$ . Hence, if so desired, an equation for $E_{1,\shortparallel }$ can be obtained, but we leave the details of a corresponding numerical solution strategy for a future paper.
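The reduction for a constant background field can be checked directly. The following sketch assumes that ${{\boldsymbol{\hat {b}}}_0}$ is spatially uniform, uses the identity $\boldsymbol{\nabla }\times (\boldsymbol{\nabla }\times {\boldsymbol{a}}) = \boldsymbol{\nabla }(\boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{a}}) - \nabla ^2 {\boldsymbol{a}}$ , and takes $\boldsymbol{\nabla} _\perp \mathrel {\mathop :}= \boldsymbol{\nabla }- {{\boldsymbol{\hat {b}}}_0} ({{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla })$ , a definition assumed here for the illustration:

\begin{align*} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \big( \boldsymbol{\nabla }\times \big[ \boldsymbol{\nabla }\times \big( E_{1,\shortparallel } {{\boldsymbol{\hat {b}}}_0} \big) \big] \big) &= {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \big[ \boldsymbol{\nabla }\big( {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }E_{1,\shortparallel } \big) - {{\boldsymbol{\hat {b}}}_0} \nabla ^2 E_{1,\shortparallel } \big] \\ &= ({{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla })^2 E_{1,\shortparallel } - \nabla ^2 E_{1,\shortparallel } = - \boldsymbol{\nabla }\boldsymbol{\cdot } \boldsymbol{\nabla} _\perp E_{1,\shortparallel }. \end{align*}

Under these assumptions, the left-hand side of (5.49) equals $-\mu _0^{-1} \boldsymbol{\nabla }\boldsymbol{\cdot } \boldsymbol{\nabla} _\perp E_{1,\shortparallel }$ , that is, the perpendicular Laplacian up to a constant factor.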

5.6. Equilibrium solutions of the field equations

The gyrocentre coordinate transformation discussed in § 4 is based on the assumption that $\varepsilon _\delta \ll 1$ , which we have not yet justified. For this assumption to hold, we require that the initial particle distribution function ${f}^0_{s}$ and background magnetic field ${\boldsymbol{B}}_0$ are close to equilibrium. That is, at $t = t^0$ , we require that the field equations approximately hold true to leading order in $\varepsilon _\delta$ , i.e. when setting the perturbed fields to zero. This results in equilibrium solutions of the field equations.

We assume that the background distribution function is nearly symmetric in ${u}_\shortparallel$ , that is, it is of the form

(5.50)

\begin{equation} {f}^0_{s}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }) = {f}^0_{s}{}^{,\mathrm{S}}({\boldsymbol{r}}, {u}_\shortparallel - \delta {u}_{s}, {\mu }) \quad \text{with} \quad {f}^0_{s}{}^{,\mathrm{S}}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }) = {f}^0_{s}{}^{,\mathrm{S}}({\boldsymbol{r}}, -{u}_\shortparallel , {\mu }), \end{equation}

where $\varepsilon _{U,s} \mathrel {\mathop :}= \delta {u}_{s} / u_{\mathrm{th},s}$ is assumed to be small. This results in

(5.51)

\begin{equation} {n}_{0,s} {u}_{0,\shortparallel ,s} = \int {f}^0_{s}{}^{,\mathrm{S}} \delta {u}_{s} \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} = {n}_{0,s} \delta {u}_{s}, \end{equation}

where we have defined the background particle density and parallel velocity as

(5.52a)

\begin{align} {n}_{0,s} &\mathrel {\mathop :}= \int {f}^0_{s} \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u}, \end{align}

(5.52b)

\begin{align} {u}_{0,\shortparallel ,s} &\mathrel {\mathop :}= \frac {1}{{n}_{0,s}} \int {f}^0_{s} {u}_\shortparallel \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u}. \end{align}

Throughout the discussion on the equilibrium solutions, we neglect $O(\varepsilon _{B,s}^2)$ and $O(\varepsilon _{B,s} \varepsilon _{U,s})$ terms, and we consider the ZLR limit $\varepsilon _\perp \rightarrow 0$ .

For Gauss’s law (5.44a ), we find that the leading order part is given by

(5.53)

\begin{equation} 0 = \sum _s q_s {n}_{0,s} , \end{equation}

where we have made use of (5.12a ). The background distributions must result in an (approximately) quasi-neutral plasma to justify $\varepsilon _\delta \ll 1$ .

The leading order part of the ZLR limit of the Ampère–Maxwell law (5.44d ) is given by

(5.54)

\begin{equation} \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{B}}_0 = \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{ZLR}}_0, \end{equation}

where the ZLR limit of the background gyrocentre free-current density results in

(5.55)

\begin{equation} \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{ZLR}}_0 = \sum _s q_s {n}_{0,s} \delta {u}_{s} {{\boldsymbol{\hat {b}}}_0} + \frac {p_{0,\shortparallel }^{\mathrm{S}}}{B_0} \boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0} + \frac {p_{0,\perp }^{\mathrm{S}}}{B_0^2} {{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla }B_0 - \boldsymbol{\nabla }\times \left ( \frac {p_{0,\perp }^{\mathrm{S}}}{B_0} {{\boldsymbol{\hat {b}}}_0} \right ) \end{equation}

by making use of (3.48a ), (5.39b ) and (5.51). The background pressures are defined as

(5.56a)

\begin{align} p_{0,\shortparallel } &\mathrel {\mathop :}= \sum _s m_s \int {f}^0_{s} {u}_\shortparallel ^2 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u}, \end{align}

(5.56b)

\begin{align} p_{0,\perp } &\mathrel {\mathop :}= \sum _s \frac {m_s}{2} \int {f}^0_{s} {u}_\tau ^2 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} = B_0 \sum _s \int {f}^0_{s} {\mu } \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} \end{align}

with equivalent definitions for the pressures resulting from the symmetric distribution function: $p_{0,\shortparallel }^{\mathrm{S}}, p_{0,\perp }^{\mathrm{S}}$ . Moreover, we have made use of (5.51) as well as

(5.57a)

\begin{align} \int {f}^0_{s} {u}_\shortparallel ^2 \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} &= \int {f}^0_{s}{}^{,\mathrm{S}} {u}_\shortparallel ^2 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} + O(\varepsilon _{U,s}^2), \end{align}

(5.57b)

\begin{align} \int {f}^0_{s} {\mu } \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} &= \int {f}^0_{s}{}^{,\mathrm{S}} {\mu } \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} + O(\varepsilon _{B,s} \varepsilon _{U,s}) , \end{align}

to conclude that $p_{0,\shortparallel } = p_{0,\shortparallel }^{\mathrm{S}}$ and $p_{0,\perp } = p_{0,\perp }^{\mathrm{S}}$ up to $O(\varepsilon ^2)$ .
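The size of the remainder in (5.57a) can be made explicit. As a sketch, substitute ${w} = {u}_\shortparallel - \delta {u}_{s}$ in (5.50) and, for this illustration only, treat the Jacobian $\mathfrak{J}_{0,s}$ as independent of the parallel velocity (an assumption made here). With ${w}$ replacing ${u}_\shortparallel$ as the integration variable, the symmetry of ${f}^0_{s}{}^{,\mathrm{S}}$ removes the term linear in ${w}$ :

\begin{equation*} \int {f}^0_{s} {u}_\shortparallel ^2 \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} = \int {f}^0_{s}{}^{,\mathrm{S}} \big( {w}^2 + 2 {w} \delta {u}_{s} + \delta {u}_{s}^2 \big) \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} = \int {f}^0_{s}{}^{,\mathrm{S}} {w}^2 \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} + {n}_{0,s} \delta {u}_{s}^2 . \end{equation*}

If the first term on the right-hand side is of the order ${n}_{0,s} u_{\mathrm{th},s}^2$ , as one would expect for a thermal distribution, then the last term is smaller by a factor of order $\varepsilon _{U,s}^2$ .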

We find that the perpendicular part of the ZLR limit of the background gyrocentre free-current density can alternatively be written as

(5.58)

\begin{equation} \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{ZLR}}_{0,\perp } = \frac {1}{B_0}{{\boldsymbol{\hat {b}}}_0} \times (\boldsymbol{\nabla }\boldsymbol{\cdot } \unicode{x1D64B}_{0}) , \quad \unicode{x1D64B}_{0} \mathrel {\mathop :}= p_{0,\shortparallel }^{\mathrm{S}} {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0} + p_{0,\perp }^{\mathrm{S}} ( \unicode{x1D644}_3 - {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0}), \end{equation}

where we have made use of

(5.59)

\begin{equation} Q (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0})_\perp = {{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla }\boldsymbol{\cdot } (Q {{\boldsymbol{\hat {b}}}_0} \otimes {{\boldsymbol{\hat {b}}}_0}). \end{equation}

When combined with the Ampère–Maxwell law, this results in the equilibrium condition (as can also be found in e.g. Grad (1966))

(5.60)

\begin{equation} (\boldsymbol{\nabla }\times {\boldsymbol{B}}_0) \times {\boldsymbol{B}}_0 = \mu _0 (\boldsymbol{\nabla }\boldsymbol{\cdot } \unicode{x1D64B}_{0})_\perp , \end{equation}

which, for an isotropic background distribution with $p_0 = p_{0,\shortparallel }^{\mathrm{S}} = p_{0,\perp }^{\mathrm{S}}$, results in the MHD equilibrium condition (Grad 1966, 1967)

(5.61)

\begin{equation} (\boldsymbol{\nabla }\times {\boldsymbol{B}}_0) \times {\boldsymbol{B}}_0 = \mu _0 \boldsymbol{\nabla }p_0, \end{equation}

wherein we have imposed ${{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }p_0 = 0$, as usually required by the tools for computing MHD equilibria. The condition on the parallel derivative of the background pressure implies that the background particle density and temperature must be functions of the flux surface label.
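The step from (5.60) to (5.61) can be spelled out as follows. This is a sketch under the stated isotropy assumption and uses only the definition of $\unicode{x1D64B}_{0}$ from (5.58).

```latex
\begin{align}
% isotropy: the two dyadic contributions to the pressure tensor combine into p0 times the identity
\unicode{x1D64B}_{0} &= p_0 \,\unicode{x1D644}_3
  \quad \implies \quad
  \boldsymbol{\nabla}\boldsymbol{\cdot} \unicode{x1D64B}_{0} = \boldsymbol{\nabla} p_0, \\
% the perpendicular projection leaves grad p0 unchanged once b0 . grad p0 = 0 is imposed
(\boldsymbol{\nabla}\boldsymbol{\cdot} \unicode{x1D64B}_{0})_\perp
  &= \boldsymbol{\nabla} p_0 - {\boldsymbol{\hat{b}}}_0 ({\boldsymbol{\hat{b}}}_0 \boldsymbol{\cdot} \boldsymbol{\nabla} p_0)
   = \boldsymbol{\nabla} p_0
  \quad \text{if} \quad {\boldsymbol{\hat{b}}}_0 \boldsymbol{\cdot} \boldsymbol{\nabla} p_0 = 0 .
\end{align}
```

Note that the left-hand side of (5.61) is perpendicular to ${\boldsymbol{B}}_0$ by construction, so ${{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }p_0 = 0$ is also necessary for (5.61) to admit a solution.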

MHD equilibria can be computed using software tools such as VMEC (Hirshman & Whitson 1983) and GVEC (Hindenlang et al. 2019), which for a given geometry and pressure find a magnetic field ${\boldsymbol{B}}_0$ such that (5.61) holds. However, (5.60) and (5.61) only ensure that the perpendicular part of the Ampère–Maxwell law is satisfied. We therefore also consider the parallel component of the Ampère–Maxwell law

(5.62)

\begin{equation} \frac {1}{\mu _0} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {\boldsymbol{B}}_0) = \sum _s q_s {n}_{0,s} \delta {u}_{s} + \frac {p_{0,\shortparallel }^{\mathrm{S}} - p_{0,\perp }^{\mathrm{S}}}{B_0^2} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {\boldsymbol{B}}_0) \end{equation}

as follows from substitution of the parallel component of (5.55). Correctly satisfying the parallel component of the Ampère–Maxwell law is crucial for the modelling of, for example, kink modes (Dudkovskaia et al. 2023). To this end, we note that the shift $\delta {u}_{s}({\boldsymbol{r}})$ (which may be a function of the gyrocentre position) is the only unknown in (5.62) and, therefore, this equation can be used to impose a constraint on the shift to satisfy the parallel component of the Ampère–Maxwell law. Moreover, it shows that the background distribution function must be strongly anisotropic if it is unshifted, unless ${{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0}) = 0$. The smallness of the non-dimensional shift $\varepsilon _{U,s}$ per species can be deduced from the non-dimensionalisation of (5.62)

(5.63)

\begin{equation} O(1) = \sum _s \frac {\beta _{0,s}}{1 - {\mu _0 \big(p_{0,\shortparallel }^{\mathrm{S}} - p_{0,\perp }^{\mathrm{S}}\big)}/{B_0^2}} \frac {\varepsilon _{U,s}}{\varepsilon _{B,s}} , \end{equation}

where the plasma- $\beta$ is defined per species as

(5.64)

\begin{equation} \beta _{0,s} \mathrel {\mathop :}= \frac {2 \mu _0 {n}_{0,s} k_{\mathrm{B}} T_s}{B_0^2}. \end{equation}
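As a rough illustration of the magnitude of (5.64), the following worked example evaluates $\beta _{0,s}$ for a single species. The values ${n}_{0,s} = 10^{20}\,\mathrm{m}^{-3}$, $k_{\mathrm{B}} T_s = 10\,\mathrm{keV}$ and $B_0 = 5\,\mathrm{T}$ are illustrative assumptions and are not taken from the text.

```latex
\begin{align}
% thermal pressure of the species: convert 10 keV to joules
{n}_{0,s} k_{\mathrm{B}} T_s
  &= 10^{20}\,\mathrm{m}^{-3} \times 10^{4}\,\mathrm{eV} \times 1.602 \times 10^{-19}\,\mathrm{J\,eV^{-1}}
   \approx 1.60 \times 10^{5}\,\mathrm{Pa}, \\
% definition (5.64) with mu_0 = 4 pi x 10^{-7} H/m
\beta _{0,s}
  &= \frac{2 \,(4\pi \times 10^{-7}\,\mathrm{H\,m^{-1}})\,(1.60 \times 10^{5}\,\mathrm{Pa})}{(5\,\mathrm{T})^2}
   \approx \frac{0.403}{25}
   \approx 1.6 \times 10^{-2} .
\end{align}
```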

Using the shift $\delta {u}_s$ to satisfy (5.62) is comparable to the approach presented by McMillan (2023). Therein, a global Maxwellian particle distribution function is considered, for which it is shown that only part of the parallel component of the Ampère–Maxwell law is correctly satisfied for a (toroidally symmetric) Grad–Shafranov equilibrium (Grad & Rubin 1958). This issue is then resolved by introducing a slight modification (in particular, McMillan 2023, (3.5) and (3.6)) to the global Maxwellian particle distribution function. We similarly modify the originally symmetric particle distribution function by a shift $\delta {u}_s$ to correctly satisfy the parallel component of the Ampère–Maxwell law. Our approach, however, does not require toroidal symmetry of the background magnetic field and can therefore also be applied to three-dimensional (3-D) MHD equilibria in stellarator devices.

To summarise, we can choose any symmetric background distribution function ${f}^0_{s}{}^{,\mathrm{S}}$ from which we compute the parallel and perpendicular pressure according to (5.56). We then solve the equilibrium equation (5.60) (or (5.61) if the background distribution function is isotropic), resulting in the background magnetic field ${\boldsymbol{B}}_0$ . Finally, we use (5.62) to find the shifts $\delta {u}_{s}({\boldsymbol{r}})$ , thereby adjusting the background distribution function ${f}^0_{s}$ according to (5.50). Constructing the background distribution function and magnetic field in this way ensures that we are near equilibrium, which thereby ensures that the underlying assumption of the gyrocentre coordinate transformation $\varepsilon _\delta \ll 1$ is justified.

5.7. Well-posedness of the field equations

Well-posedness of the field equations is non-trivial. With well-posedness, we refer to the existence and uniqueness of solutions to the field equations. This is a rather mathematical topic and therefore falls outside the scope of this paper when it is considered fully rigorously. However, as we will see, a necessary condition for well-posedness is related to the bound-charge continuity equation and thereby allows for a physical interpretation.

To illustrate that well-posedness is a non-trivial property, we consider the following general form of Gauss’s law and the Ampère–Maxwell law:

(5.65a)

\begin{align} \epsilon _0 \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{E}}_1 &= \overline {\mathcal{R}}{}^{\mathrm{f}} + \mathcal{R}^{\mathrm{b}}, \end{align}

(5.65b)

\begin{align} \frac {1}{\mu _0}\boldsymbol{\nabla }\times {\boldsymbol{B}} &= \epsilon _0 \frac {\partial {\boldsymbol{E}}_1}{\partial t} + \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} + {\boldsymbol{\mathcal{J}}}^{\mathrm{b}}, \end{align}

where we have defined the bound charge density as

(5.66)

\begin{equation} \mathcal{R}^{\mathrm{b}} \mathrel {\mathop :}= - \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{P}}}_1 \end{equation}

and consider some unspecified bound current density denoted by ${\boldsymbol{\mathcal{J}}}^{\mathrm{b}}$. Computing the divergence of the Ampère–Maxwell law (5.65b) results in

(5.67)

\begin{equation} 0 = \epsilon _0 \boldsymbol{\nabla }\boldsymbol{\cdot } \frac {\partial {\boldsymbol{E}}_1}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} + \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}}^{\mathrm{b}} \quad \iff \quad \underbrace {\frac {\partial \mathcal{R}^{\mathrm{b}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}}^{\mathrm{b}} = 0}_{\substack {\text{bound-charge} \\ \text{continuity equation}}}, \end{equation}

where we have substituted the free-charge continuity equation (5.43) as well as Gauss’s law (5.65a). That is, we find that the bound-charge continuity equation (5.67) must hold for the two field equations to be compatible. Note that compatibility here means that the divergence of the Ampère–Maxwell law should coincide with the time derivative of Gauss’s law. For the strong formulation, we find

(5.68)

\begin{equation} {\boldsymbol{\mathcal{J}}}^{\mathrm{b}} = \frac {\partial {\boldsymbol{\mathcal{P}}}_1}{\partial t} + \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{M}}}_1 \end{equation}

for which it can easily be shown that (5.67) is satisfied.
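For completeness, the verification can be sketched as follows, assuming that the fields are sufficiently smooth for time and space derivatives to commute. It uses only the definitions (5.66) and (5.68).

```latex
\begin{align}
% substitute R^b = -div P_1 and J^b = dP_1/dt + curl M_1
\frac{\partial \mathcal{R}^{\mathrm{b}}}{\partial t} + \boldsymbol{\nabla}\boldsymbol{\cdot} {\boldsymbol{\mathcal{J}}}^{\mathrm{b}}
  &= - \boldsymbol{\nabla}\boldsymbol{\cdot} \frac{\partial {\boldsymbol{\mathcal{P}}}_1}{\partial t}
     + \boldsymbol{\nabla}\boldsymbol{\cdot} \left( \frac{\partial {\boldsymbol{\mathcal{P}}}_1}{\partial t}
     + \boldsymbol{\nabla}\times {\boldsymbol{\mathcal{M}}}_1 \right) \\
% the polarisation terms cancel and the divergence of a curl vanishes
  &= \boldsymbol{\nabla}\boldsymbol{\cdot} (\boldsymbol{\nabla}\times {\boldsymbol{\mathcal{M}}}_1) = 0 .
\end{align}
```

The polarisation current cancels the time derivative of the bound charge, whereas the magnetisation current is divergence free and therefore does not contribute.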

Remark 4. Compatibility of the field equations can also directly be deduced from the action principle by computing the following gauge-invariant variation:

(5.69)

\begin{equation} \left . \frac {{\mathrm{d}} }{{\mathrm{d}} \varepsilon } \right |_{\varepsilon = 0} \mathfrak{A}( { {{\boldsymbol{Z}}}}, \phi _1 - \varepsilon \partial \eta / \partial t, {\boldsymbol{A}}_1 + \varepsilon \boldsymbol{\nabla }\eta ) = 0 \end{equation}

for some scalar test function $\eta$. When substituting (5.25), we find that this results in

(5.70)

\begin{equation} 0 = \sum _s q_s \int {f}^0_{s}({\boldsymbol{z}}^0) \left ( \boldsymbol{\nabla }\eta \boldsymbol{\cdot } \,\dot { { {\!\boldsymbol{R}}}} + \frac {\partial \eta }{\partial t} \right ) \mathfrak{J}_s({\boldsymbol{z}}^0, t^0) \,\mathrm{d}^6 { {z}}^0\, \mathrm{d} t , \end{equation}

where we note that this is a direct consequence of the gauge invariance of the gyrocentre single-particle phase-space Lagrangian, see (4.36). By making use of Theorem 2 as well as partial integration, this results in

(5.71)

\begin{equation} 0 = - \int \underbrace {\left ( \frac {\partial \mathcal{R}^{\mathrm{f}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}}^{\mathrm{f}} \right )}_{\overset { (5.11)}{=} 0} \eta\, \mathrm{d}^3 { {r}} \,\mathrm{d} t , \end{equation}

where we have moreover substituted the definitions of the free charge and current density as given by (5.12). The resulting constraint is automatically satisfied as a consequence of the free-charge continuity equation (5.11). It follows that this gauge-invariant variation does not give an additional constraint as it is consistent with the free-charge continuity equation (5.11) derived from the conservative form of the Vlasov equation (5.10). We note that this specific variation coincides with computing the difference between the divergence of the Ampère–Maxwell law (i.e. letting ${{\boldsymbol{\varLambda }}} = \boldsymbol{\nabla }\eta$ in (5.33)) and the time derivative of Gauss’s law (i.e. letting $\varLambda = \partial \eta / \partial t$ in (5.28)).

5.8. Energy conservation

Typically, the derivation of conservation laws for quantities such as energy and momentum is achieved by making use of Noether’s method (Noether 1918), wherein symmetries of the Lagrangian result in conserved quantities. However, the derivation of exact local conservation laws for electromagnetic gyrokinetic systems has been elusive (Peifeng et al. 2021), because the particle contribution to the Lagrangian contains integrals over time only. Peifeng et al. (2021) proposed a procedure to overcome this difficulty, resulting in exact local conservation laws for energy and momentum of arbitrary-order gyrokinetic models. Alternatively, one can switch to Eulerian variables (i.e. transforming from the Lagrangian EOMs ${ {{\boldsymbol{Z}}}}$ to the particle distribution function ${f}_s$), as is done by Hirvijoki et al. (2020) for models based on an Euler–Poincaré variational formulation. The conservation laws derived by Hirvijoki et al. (2020) are rederived by Brizard (2021a) for models that are based on an Eulerian variational principle.

5.8.1. Derivation

In our work, we follow a more direct and simple approach, as we do not aim to derive conservation laws for arbitrary-order gyrokinetic models. We first derive the evolution of the kinetic energy per particle $ { {K}}_s$ , which then leads to the evolution equation of the kinetic energy density $\mathcal{K}$ upon integration over the particles. Next, we derive the evolution equation of the potential energy density $\mathcal{U}$ . When combined, the two evolution equations result in the local energy conservation law, which, upon integration over the spatial coordinate, leads to the global energy conservation law.

The gyrocentre kinetic energy per particle is defined as

(5.72)

\begin{equation} { {K}}_s \mathrel {\mathop :}= \frac {m_s { {U}}_\shortparallel ^2 }{2} + { {{M}}} \big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big), \end{equation}

which is derived from applying the gyrocentre coordinate transformation (4.61) to the guiding-centre kinetic energy $ \bar{K}_0$ (as defined in (3.25)). The kinetic energy of a particle evolves as

(5.73)

\begin{equation} \frac {{\mathrm{d}} { {K}}_s}{{\mathrm{d}} t} = q_s \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } {\boldsymbol{E}}^{\star }_1 - { {{M}}} \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{E}}_{1})_\shortparallel \rangle \!\rangle \end{equation}

by making use of (5.23) as well as Faraday’s law (5.21). Hence, if the perturbed electric field vanishes, we find that the kinetic energy per particle is conserved, as can be expected because static magnetic fields do no work.

The kinetic energy per particle can be integrated over all particles resulting in the kinetic energy density

(5.74a)

\begin{equation} { {\mathcal{K}}} \mathrel {\mathop :}= \sum _s \int {f}_s { {K}}_s \mathfrak{J}_s \,\mathrm{d}^3 {u}, \end{equation}

whereas the displacement and magnetising field result in the potential energy density

(5.74b)

\begin{equation} { {\mathcal{U}}} \mathrel {\mathop :}= \frac {1}{2} ( {\boldsymbol{\mathcal{D}}} \boldsymbol{\cdot } {\boldsymbol{E}} + {\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot } {\boldsymbol{B}} ) . \end{equation}

This results in the following local energy conservation law for which a proof can be found in Appendix I.

Theorem 3 (Local energy conservation). The kinetic energy density (5.74a) satisfies

(5.75)

\begin{equation} \frac {\partial { {\mathcal{K}}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \sum _s \int {f}_s \,\dot { { {\!\boldsymbol{R}}}} { {K}}_s \mathfrak{J}_s \,\mathrm{d}^3 {u} \right ) = \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}, \end{equation}

whereas the potential energy density (5.74b) satisfies Poynting’s theorem

(5.76)

\begin{equation} \frac {\partial { {\mathcal{U}}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } ({\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}}) = - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}. \end{equation}

The magnetising field $\boldsymbol{\mathcal{H}}$ and free current density $\overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}$ are defined in (5.44f) and (5.39b), respectively. It follows that the following local energy conservation law holds:

(5.77)

\begin{equation} \frac {\partial }{\partial t} ( { {\mathcal{K}}} + { {\mathcal{U}}}) + \boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( {\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}} + \sum _s \int {f}_s \,\dot { { {\!\boldsymbol{R}}}} { {K}}_s \mathfrak{J}_s \,\mathrm{d}^3 {u} \bigg ) = 0. \end{equation}

On the right-hand side of the evolution equations for the energy densities, i.e. (5.75) and (5.76), we recognise the $\overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}$ source term, which is often used for diagnostic purposes (Bottino & Sonnendrücker 2015; Novikau et al. 2021; Kleiber et al. 2024).

When integrating the sum of the kinetic and potential energy densities over the spatial domain, we find the total energy

(5.78)

\begin{equation} { {\mathfrak{E}}} \mathrel {\mathop :}= \int ( { {\mathcal{K}}} + { {\mathcal{U}}})\, \mathrm{d}^3 { {r}}, \end{equation}

which is conserved as a consequence of (5.77).
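The step from the local law (5.77) to conservation of (5.78) can be sketched as follows. The sketch assumes a time-independent spatial domain $\varOmega$ with outward unit normal $\boldsymbol{n}$ (both hypothetical notation, not used elsewhere in the text) and boundary conditions such that the total energy flux through $\partial \varOmega$ vanishes, for example a periodic domain.

```latex
\begin{align}
% differentiate under the integral sign and substitute (5.77)
\frac{\mathrm{d} \mathfrak{E}}{\mathrm{d} t}
  &= \int_{\varOmega} \frac{\partial}{\partial t} (\mathcal{K} + \mathcal{U}) \,\mathrm{d}^3 r
   = - \int_{\varOmega} \boldsymbol{\nabla}\boldsymbol{\cdot} \bigg( {\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}}
     + \sum_s \int {f}_s \,\dot{\boldsymbol{R}}\, K_s \mathfrak{J}_s \,\mathrm{d}^3 u \bigg) \mathrm{d}^3 r \\
% divergence theorem; the boundary flux vanishes by assumption
  &= - \oint_{\partial \varOmega} \bigg( {\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}}
     + \sum_s \int {f}_s \,\dot{\boldsymbol{R}}\, K_s \mathfrak{J}_s \,\mathrm{d}^3 u \bigg) \boldsymbol{\cdot} \boldsymbol{n} \,\mathrm{d} S
   = 0 .
\end{align}
```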

Remark 5 (Total energy conservation in weak form). The derivation of the total energy presented previously is based on the strong formulation of the equations, but we note that the conserved energy can also be derived directly from the field-theoretic Lagrangian. This is of interest as this implies that a numerical model based on this weak formulation can also be exactly energy conserving, provided that it is properly discretised.

We start by considering the total time derivative of the field-theoretic Lagrangian (5.16)

(5.79)

\begin{equation} \frac {{\mathrm{d}} \mathfrak{L}}{{\mathrm{d}} t} = \frac {\delta \mathfrak{L}}{\delta { {{\boldsymbol{Z}}}}} [{\dot{\boldsymbol{Z}}}] + \frac {\delta \mathfrak{L}}{\delta \phi _1} \left[{\frac {\partial \phi _1}{\partial t}}\right] + \frac {\delta \mathfrak{L}}{\delta {\boldsymbol{A}}_1} \left[{\frac {\partial {\boldsymbol{A}}_1}{\partial t}}\right] . \end{equation}

We then assume that the characteristics and fields satisfy the EOMs and field equations, respectively, and make use of the fact that these equations are found by setting the respective variation of the action to zero, up to partial integration in time. For instance, for the gyrocentre characteristics, we find that

(5.80)

\begin{equation} \frac {\delta \mathfrak{L}}{\delta{\boldsymbol{Z}}} [{\dot{\boldsymbol{Z}}}] = \frac {{\mathrm{d}} }{{\mathrm{d}} t} \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) {\frac {\partial { {L}}_s^{\mathrm{part}}}{\partial \dot { { {{\boldsymbol{Z}}}}}} \boldsymbol{\cdot } \dot { { {{\boldsymbol{Z}}}}}} \mathfrak{J}_s({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 { {z}}^0 \end{equation}

by making use of (5.19). Similarly, we find that (5.38) becomes

(5.81)

\begin{equation} \frac {\delta \mathfrak{L}}{\delta {\boldsymbol{A}}_1} \left[{\frac {\partial {\boldsymbol{A}}_1}{\partial t}}\right] = - \frac {{\mathrm{d}} }{{\mathrm{d}} t} \sum _s \int \left ( q_s {f}_s \mathfrak{J}_s \rho \frac {\partial }{\partial t}\langle\kern0.3pt \!| \mathring {A}^\varsigma _{1,\rho } |\!\kern0.3pt\rangle + {f}^0_{s} {\boldsymbol{P}}_{1,s} \boldsymbol{\cdot } \frac {\partial {\boldsymbol{A}}_1}{\partial t} \mathfrak{J}_{0,s} \right )\, \mathrm{d}^6 { {z}} \end{equation}

upon substitution of the Ampère–Maxwell law. The variation for $\phi _1$ did not require any partial integration in time and, therefore, we find that (5.79) results in

(5.82)

\begin{equation} \frac {{\mathrm{d}} { {\mathfrak{E}}}}{{\mathrm{d}} t} = 0, \end{equation}

where the conserved energy is given by

(5.83)

\begin{equation} { {\mathfrak{E}}} = \sum _s \int \left ( {f}_s {\frac {\partial { {L}}_s^{\mathrm{part}}}{\partial \dot { { {{\boldsymbol{Z}}}}}} \boldsymbol{\cdot } \dot { { {{\boldsymbol{Z}}}}}} \mathfrak{J}_s - q_s {f}_s \rho \frac {\partial }{\partial t}\langle\kern0.3pt \!| \mathring {A}^\varsigma _{1,\rho } |\!\kern0.3pt\rangle \mathfrak{J}_s - {f}^0_{s} \frac {m_s}{q_s B_0^2} {\boldsymbol{F}}_{1,\perp } \boldsymbol{\cdot } \frac {\partial {\boldsymbol{A}}_1}{\partial t} \mathfrak{J}_{0,s} \right ) \mathrm{d}^6 { {z}} - \mathfrak{L}. \end{equation}

It can be shown that (5.83) reduces to (5.78) by substituting the definitions of the particle (5.17b) and field (5.17c) Hamiltonian in Low’s action (5.16) and by making use of Gauss’s law as well as the gradient theorem (D.8).

5.8.2. Comparison with results from literature

Despite using a more direct approach in deriving the local energy conservation law, the resulting conserved energy density should agree with other results from the literature. To facilitate this comparison, we consider the work of Brizard (2021a), wherein the ZLR limit of the model proposed by Burby & Brizard (2019) is considered, which conveniently coincides with the ZLR limit of the proposed model.

The conserved energy density of the proposed model in the ZLR limit is given by

(5.84)

\begin{equation} \mathcal{K}^{\mathrm{ZLR}} + \mathcal{U}^{\mathrm{ZLR}} = \sum _s \int {f}_s { {K}}_s^{\mathrm{ZLR}} \mathfrak{J}_s \,\mathrm{d}^3 {u} + \frac {1}{2} ( {\boldsymbol{\mathcal{D}}} \boldsymbol{\cdot } {\boldsymbol{E}} + {\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot } {\boldsymbol{B}} ), \end{equation}

where the ZLR limit of the kinetic energy per particle is given by

(5.85)

\begin{equation} { {K}}_s^{\mathrm{ZLR}} = \frac {m_s { {U}}_\shortparallel ^2 }{2} + { {{M}}} (B_0 + {B}_{1,\shortparallel }) \end{equation}

by making use of (5.74a) as well as the observation that the displacement and magnetising field do not contain any FLR contributions.

The kinetic energy per particle of Brizard (2021a, (3.4)) is defined differently from ours (5.72) and is given by

(5.86)

\begin{equation} K_s^{\mathrm{B}} \mathrel {\mathop :}= { {K}}_s^{\mathrm{ZLR}} + { {H}}_{2,s}, \end{equation}

where we have neglected the contribution from the guiding-centre electric dipole moment as a result of neglecting the mixed $O(\varepsilon _\delta \varepsilon _B)$ terms in the proposed model. Note that the second-order gyrocentre Hamiltonian can be expressed in terms of the magnetisation and polarisation per particle as (cf. (5.34), (5.29) and (4.66))

(5.87)

\begin{equation} { {H}}_{2,s} = -\frac {1}{2} ({\boldsymbol{P}}_{1,s} \boldsymbol{\cdot } {\boldsymbol{E}} + {\boldsymbol{M}}_{1,s} \boldsymbol{\cdot } {\boldsymbol{B}}) \end{equation}

such that the kinetic energy density can be written as

(5.88)

\begin{align} \mathcal{K}^{\mathrm{B}} &= \mathcal{K}^{\mathrm{ZLR}} + \sum _s \int {f}_s { {H}}_{2,s} \mathfrak{J}_s\, \mathrm{d}^3 {u} = \mathcal{K}^{\mathrm{ZLR}}\nonumber\\&\quad + \frac {1}{2} \left [{-} \left ( {\boldsymbol{\mathcal{D}}} - \epsilon _0 {\boldsymbol{E}} \right ) \boldsymbol{\cdot } {\boldsymbol{E}}+ \left ( {\boldsymbol{\mathcal{H}}} - \frac {1}{\mu _0} {\boldsymbol{B}} \right ) \boldsymbol{\cdot } {\boldsymbol{B}} \right ] \end{align}

by substituting (5.44e) and (5.44f). Furthermore, we have ignored the linearisation of the particle part of the Hamiltonian introduced in (5.16) as no such linearisation is applied by Brizard (2021a). Finally, we note that the potential energy density of Brizard (2021a, (5.11)) is given by

(5.89)

\begin{equation} \mathcal{U}^{\mathrm{B}} = {\boldsymbol{\mathcal{D}}} \boldsymbol{\cdot } {\boldsymbol{E}} - \frac {\epsilon _0}{2} \lvert {\boldsymbol{E}} \rvert ^2 + \frac {1}{2\mu _0} \lvert {\boldsymbol{B}} \rvert ^2 \end{equation}

such that the conserved energy densities are indeed found to be the same: $\mathcal{K}^{\mathrm{ZLR}} + \mathcal{U}^{\mathrm{ZLR}} = \mathcal{K}^{\mathrm{B}} + \mathcal{U}^{\mathrm{B}}$. We note that the conserved energy density also agrees with that found by Hirvijoki et al. (2020, (82)).
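The equality of the two conserved energy densities can be made explicit by adding (5.88) and (5.89) term by term. The following sketch uses only those two expressions together with (5.84).

```latex
\begin{align}
% sum of (5.88) and (5.89)
\mathcal{K}^{\mathrm{B}} + \mathcal{U}^{\mathrm{B}}
  &= \mathcal{K}^{\mathrm{ZLR}}
     - \frac{1}{2} ({\boldsymbol{\mathcal{D}}} - \epsilon_0 {\boldsymbol{E}}) \boldsymbol{\cdot} {\boldsymbol{E}}
     + \frac{1}{2} \Big( {\boldsymbol{\mathcal{H}}} - \frac{1}{\mu_0} {\boldsymbol{B}} \Big) \boldsymbol{\cdot} {\boldsymbol{B}}
     + {\boldsymbol{\mathcal{D}}} \boldsymbol{\cdot} {\boldsymbol{E}}
     - \frac{\epsilon_0}{2} \lvert {\boldsymbol{E}} \rvert^2
     + \frac{1}{2 \mu_0} \lvert {\boldsymbol{B}} \rvert^2 \\
% collect terms: the vacuum contributions in |E|^2 and |B|^2 cancel pairwise
  &= \mathcal{K}^{\mathrm{ZLR}}
     + \frac{1}{2} {\boldsymbol{\mathcal{D}}} \boldsymbol{\cdot} {\boldsymbol{E}}
     + \frac{1}{2} {\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot} {\boldsymbol{B}}
     + \Big( \frac{\epsilon_0}{2} - \frac{\epsilon_0}{2} \Big) \lvert {\boldsymbol{E}} \rvert^2
     + \Big( \frac{1}{2 \mu_0} - \frac{1}{2 \mu_0} \Big) \lvert {\boldsymbol{B}} \rvert^2 \\
% compare with (5.84)
  &= \mathcal{K}^{\mathrm{ZLR}} + \mathcal{U}^{\mathrm{ZLR}} .
\end{align}
```

The second-order Hamiltonian is thus counted as kinetic energy by Brizard and as potential energy in the proposed model, while the sum is unaffected.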

6. Quasi-neutral gyrokinetic Darwin model

The proposed gyrokinetic Maxwell model keeps more physics than the popular reduced parallel-only model (discussed in § 7.1) and has more structure than the symplectic Brizard–Hahm model (Brizard & Hahm 2007) (discussed in § 7.2). In particular, we find that the Ampère–Maxwell law (5.33) contains a displacement current density, which is not present in either of the two other models. Part of this displacement current density, however, gives rise to fast waves such as the light wave, the Langmuir wave and the compressional Alfvén wave (as demonstrated in § 7.3), and in most situations, such waves are undesired due to their high frequency.

In this section, a quasi-neutral Darwin approximation is proposed, which removes the fast waves from the model. This approximation consists of two steps: first, the limit of quasi-neutrality is considered thereby removing the light wave as well as the Langmuir wave from the model and, second, the compressional Alfvén wave is removed by considering a Darwin approximation wherein the transversal part of the displacement current density is removed from the Ampère–Maxwell law. The resulting model is gauge-invariant, is obtained via an action principle, has compatible field equations and still possesses a local energy conservation law.

6.1. Darwin approximation to Maxwell’s equations

One way to deal with high-frequency components in the solution is to damp them numerically using implicit time integration methods. Another option is to remove such waves from the underlying model, i.e. from the Lagrangian. For instance, to remove the light wave from Maxwell’s equations, rather than considering the limit of quasi-neutrality, one can consider the Darwin approximation, wherein only the longitudinal (i.e. irrotational or curl free) part is kept in the $\epsilon _0 \partial {\boldsymbol{E}}_1 / \partial t$ term in the Ampère–Maxwell law. Thus, the transversal (i.e. solenoidal or divergence free) part of the displacement current density is neglected

(6.1)

\begin{equation} \epsilon _0 \frac {\partial {\boldsymbol{E}}_1}{\partial t} \approx \epsilon _0 \frac {\partial (\varPi _{\mathrm{L}} {\boldsymbol{E}}_1)}{\partial t}, \end{equation}

where $\varPi _{\mathrm{L}}$ is the longitudinal projection operator, which is defined as (with appropriate boundary conditions on the inverse Laplace operator)

(6.2)

\begin{equation} \varPi _{\mathrm{L}} {\boldsymbol{S}} \mathrel {\mathop :}= \boldsymbol{\nabla }[\boldsymbol{\nabla} ^{-2} (\boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{S}})]. \end{equation}

This results in a model which yields a second- and third-order accurate electric and magnetic field, respectively, in the small parameter $\varepsilon _c = v / c$ (Degond & Raviart 1992) and restricts the dynamics of the Vlasov–Maxwell system to an invariant slow manifold of the Vlasov–Maxwell phase space (Miloshevich & Burby 2021).

Note that the Darwin approximation is a gauge-invariant approximation, as the projection operator acts on the electric field directly. If the Coulomb gauge is used, then the vector potential is transversal and we find that the Darwin approximation simply neglects the vector potential contribution to the electric field,

(6.3)

\begin{equation} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{A}}_{1} = 0 \quad \implies \quad \varPi _{\mathrm{L}} {\boldsymbol{E}}_1 = -\boldsymbol{\nabla }\phi _1. \end{equation}
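The implication (6.3) follows directly from the definition (6.2). The sketch below assumes the usual potential representation ${\boldsymbol{E}}_1 = -\boldsymbol{\nabla }\phi _1 - \partial {\boldsymbol{A}}_1 / \partial t$ and boundary conditions for which $\boldsymbol{\nabla} ^{-2} \boldsymbol{\nabla} ^{2}$ acts as the identity on the functions involved.

```latex
\begin{align}
% apply (6.2) to the potential representation of E_1
\varPi_{\mathrm{L}} {\boldsymbol{E}}_1
  &= \boldsymbol{\nabla} \Big[ \boldsymbol{\nabla}^{-2} \Big( - \boldsymbol{\nabla}^{2} \phi_1
     - \frac{\partial}{\partial t} \boldsymbol{\nabla}\boldsymbol{\cdot} {\boldsymbol{A}}_1 \Big) \Big]
   = - \boldsymbol{\nabla} \phi_1
     - \boldsymbol{\nabla} \Big[ \boldsymbol{\nabla}^{-2} \frac{\partial}{\partial t} ( \boldsymbol{\nabla}\boldsymbol{\cdot} {\boldsymbol{A}}_1 ) \Big], \\
% the second term vanishes in the Coulomb gauge div A_1 = 0
\boldsymbol{\nabla}\boldsymbol{\cdot} {\boldsymbol{A}}_1 = 0
  \quad &\implies \quad \varPi_{\mathrm{L}} {\boldsymbol{E}}_1 = - \boldsymbol{\nabla} \phi_1, \\
% idempotence confirms that (6.2) is a projection
\varPi_{\mathrm{L}} (\varPi_{\mathrm{L}} {\boldsymbol{S}})
  &= \boldsymbol{\nabla} \big[ \boldsymbol{\nabla}^{-2} \boldsymbol{\nabla}^{2} \boldsymbol{\nabla}^{-2} (\boldsymbol{\nabla}\boldsymbol{\cdot} {\boldsymbol{S}}) \big]
   = \varPi_{\mathrm{L}} {\boldsymbol{S}} .
\end{align}
```

In any other gauge, the second term of the first line is generally non-zero and the longitudinal part of ${\boldsymbol{E}}_1$ receives a contribution from the vector potential.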

6.2. Darwin approximation of the gyrocentre Hamiltonian

We follow an approach similar to the Darwin approximation to remove the fast compressional Alfvén wave. As the proposed model is defined in terms of an action principle, we propose a modification to the action (5.25), which corresponds to the removal of the compressional Alfvén wave. Recall that the second-order gyrocentre Hamiltonian is given by (cf. (4.66))

(6.4)

\begin{equation} { {H}}_{2,s} = - \frac {m_s}{2 B_0^2} \lvert {\boldsymbol{E}}_{1,\perp } \rvert ^2 + \frac {m_s { {U}}_\shortparallel }{B_0^2} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } ({\boldsymbol{E}}_{1,\perp } \times {\boldsymbol{B}}_1) + \frac { { {{M}}} B_0 - m_s { {U}}_\shortparallel ^2 }{2 B_0^2} {\lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2}, \end{equation}

where we have substituted the definition of the Lorentz force (4.56); this expression agrees exactly with the result found by Burby & Brizard (2019, (14)). We note that the compressional Alfvén wave comes from the transversal contribution to the $\lvert \partial {\boldsymbol{A}}_{1,\perp } / \partial t \rvert ^2$ term (which itself comes from the $\lvert {\boldsymbol{E}}_{1,\perp } \rvert ^2$ term), and it should therefore be sufficient to remove exactly this term from ${ {H}}_{2,s}$.

When keeping only the contribution from the longitudinal part of the electric field in the second-order Hamiltonian, we find the following contribution to the action from the second-order Hamiltonian:

(6.5)

\begin{align} \sum _s \int {f}^0_{s} { {H}}_{2,s}^{\mathrm{Dar}} \mathfrak{J}_{0,s} \mathrm{d}^3 {u} \mathrel {\mathop :} &= {} - \frac {1}{2 \mathcal{C}(1)} \left \lvert \varPi _{\mathrm{L},\perp } \left ( \mathcal{C}(1) {\boldsymbol{E}}_{1} \right ) \right \rvert ^2\nonumber \\&\quad + \frac {\mathcal{C}({u}_\shortparallel )}{\mathcal{C}(1)} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \left [ \varPi _{\mathrm{L},\perp }\left ( \mathcal{C}(1){\boldsymbol{E}}_{1} \right ) \times {\boldsymbol{B}}_1 \right ] + \frac { p_{0,\perp } - p_{0,\shortparallel } }{2 B_0^2} {\lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2}, \end{align}

where the gyrokinetic longitudinal projection operator is defined as (cf. (6.2))

(6.6)

\begin{equation} \varPi _{\mathrm{L},\perp } {\boldsymbol{S}} \mathrel {\mathop :}= \mathcal{C}(1) \boldsymbol{\nabla} _\perp \left ( [\boldsymbol{\nabla }\boldsymbol{\cdot } (\mathcal{C}(1) \boldsymbol{\nabla} _\perp )]^{-1} \left [ \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{S}}_\perp \right ] \right ), \end{equation}

and we recall that $\mathcal{C}(\zeta )$ is defined in (5.47).

6.3. Principle of least action

As the particle part of the Hamiltonian is still given by (5.17b) and the symplectic part of the Lagrangian is unchanged when compared with the proposed gyrokinetic Maxwell model, we find that the EOMs are still given by (5.23).

The explicit expression of Low’s action, while using (6.5), is given by (cf. (5.25))

(6.7)

\begin{align} \mathfrak{A}^{\mathrm{Dar}}( { {{\boldsymbol{Z}}}}, \phi _1, {\boldsymbol{A}}_1) & = \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) \biggl [ q_s \big( {\boldsymbol{A}}_{0,s}^{\star } + {\boldsymbol{A}}_1 + \big\langle\kern-1.7pt \big| \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1\times {\boldsymbol{\rho }} \big|\kern-1.5pt\big\rangle \big) \boldsymbol{\cdot } \,\dot { { {\!\boldsymbol{R}}}}+ \frac {m_s { {{M}}}}{q_s} \dot { { {\varTheta }}}\nonumber \\ &\quad - { {{M}}} \big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) - \frac {m_s}{2} { {U}}_\shortparallel ^2 - q_s \phi _1 + q_s \rho \big\langle\kern-1.7pt \big| \mathring {E}^\varsigma _{1,\rho } \big|\kern-1.7pt\big\rangle \biggr ] \mathfrak{J}_s({\boldsymbol{z}}^0, t^0) \mathrm{d}^6 { {z}}^0 \,\mathrm{d} t\nonumber \\ &\quad + \int \Biggl ( \frac {1}{2\mathcal{C}(1)} \left \lvert \varPi _{\mathrm{L},\perp } \left ( \mathcal{C}(1) {\boldsymbol{E}}_{1} \right ) \right \rvert ^2 - \frac {\mathcal{C}({u}_\shortparallel )}{{\mathcal{C}(1)}} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \left [ \varPi _{\mathrm{L},\perp }\left ( {\mathcal{C}(1)}{\boldsymbol{E}}_{1} \right ) \times {\boldsymbol{B}}_1 \right ]\nonumber \\ &\quad - \frac {p_{0,\perp } - p_{0,\shortparallel }}{2B_0^2} {\lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2} \Biggr )\mathrm{d}^3 { {r}} \mathrm{d} t - \frac {1}{2 \mu _0} \int \lvert {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1 \rvert ^2\, \mathrm{d}^3 x \,\mathrm{d} t. \end{align}

Compared with the action of the proposed gyrokinetic Maxwell model, as given in (5.25), we additionally consider the limit of quasi-neutrality (i.e. $\epsilon _0 = 0$ ) to eliminate the fast light wave as well as the Langmuir wave from the proposed model.

Setting the variations of the action (6.7) with respect to the scalar and vector potential to zero results in the following quasi-neutrality equation (cf. (5.28)) and the Ampère–Maxwell law (cf. (5.33)):

(6.8a)

\begin{align} - \int {\boldsymbol{\mathcal{P}}}_{1}^{\mathrm{Dar}} \boldsymbol{\cdot } \boldsymbol{\nabla} _\perp \varLambda \,\mathrm{d}^3 { {r}} = {} & \sum _s q_s \int {f}_s \langle \mathring {\varLambda } \rangle \mathfrak{J}_s \,\mathrm{d}^6 { {z}} ,\\[-10pt]\nonumber \end{align}

(6.8b)

\begin{align} \frac {1}{\mu _0} \int ({\boldsymbol{B}}_0 + {\boldsymbol{B}}_1) \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {{\boldsymbol{\varLambda }}})\, \mathrm{d}^3 x = {} & \int \left ( \frac {\partial {\boldsymbol{\mathcal{P}}}_{1}^{\mathrm{Dar}}}{\partial t} + \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{M}}}_1^{\mathrm{Dar}} + \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \right ) \boldsymbol{\cdot } {{\boldsymbol{\varLambda }}} \,\mathrm{d}^3 { {r}} , \end{align}

respectively, by making use of the self-adjointness of the projection operator $\varPi _{\mathrm{L},\perp }$ . The Darwin polarisation (cf. (5.29)) and magnetisation (cf. (5.34)) are given by

(6.9a)

\begin{align} {\boldsymbol{\mathcal{P}}}_{1}^{\mathrm{Dar}} &\mathrel {\mathop :}= \varPi _{\mathrm{L},\perp } {\boldsymbol{\mathcal{P}}}_{1} = \varPi _{\mathrm{L},\perp } \big( \mathcal{C}(1) {\boldsymbol{E}}_{1} + \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1 \big) ,\\[-6pt]\nonumber \end{align}

(6.9b)

\begin{align} {\boldsymbol{\mathcal{M}}}_{1}^{\mathrm{Dar}} &\mathrel {\mathop :}= \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} {\boldsymbol{B}}_{1,\perp } - \frac {\mathcal{C}({u}_\shortparallel )}{{\mathcal{C}(1)}} {{\boldsymbol{\hat {b}}}_0} \times \varPi _{\mathrm{L},\perp } \big( {\mathcal{C}(1)} {\boldsymbol{E}}_{1} \big) , \end{align}

where we note that the Darwin polarisation is entirely longitudinal, as desired.

6.4. Strong formulation of the field equations

6.4.1. Gauge-invariant formulation

We find that the resulting model can be written in strong formulation as

(6.10a)

\begin{align} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}} &= \overline {\mathcal{R}}{}^{\mathrm{f}}, \end{align}

(6.10b)

\begin{align} \boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}}^{\mathrm{Dar}} &= \frac {\partial {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}}}{\partial t} + \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} , \end{align}

where the Darwin displacement and magnetising field are defined as

(6.11a)

\begin{align} {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}} &\mathrel {\mathop :}= {\boldsymbol{\mathcal{P}}}_{1}^{\mathrm{Dar}} , \end{align}

(6.11b)

\begin{align} {\boldsymbol{\mathcal{H}}}^{\mathrm{Dar}} &\mathrel {\mathop :}= \frac {1}{\mu _0} {\boldsymbol{B}} - {\boldsymbol{\mathcal{M}}}_{1}^{\mathrm{Dar}}. \end{align}

It follows that the field equations are compatible in the sense that the divergence of the Ampère–Maxwell law yields the time derivative of the quasi-neutrality equation. That is, the bound-charge continuity equation is satisfied for the gyrokinetic Darwin model, analogously to the discussion in § 5.7.
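
To make this compatibility explicit, consider the following short sketch. It assumes that the gyro-average adjoint of the free-charge continuity equation, $\partial \overline {\mathcal{R}}{}^{\mathrm{f}} / \partial t + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} = 0$, holds for the Darwin model (we take this to be the content of (5.43), which is not reproduced in this section). Taking the divergence of (6.10b) and using that the divergence of a curl vanishes gives

\begin{align} 0 = \boldsymbol{\nabla }\boldsymbol{\cdot } (\boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}}^{\mathrm{Dar}}) &= \frac {\partial }{\partial t} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}} + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \nonumber \\ &= \frac {\partial }{\partial t} \big( \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}} - \overline {\mathcal{R}}{}^{\mathrm{f}} \big), \nonumber \end{align}

such that (6.10a) remains satisfied for all time if it is satisfied initially.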

6.4.2. Perpendicular Coulomb gauge

Despite the favourable structure of the resulting field equations, we note that the explicit presence of the gyrokinetic longitudinal projection operator is not desirable because it is not a local operator and, in particular, involves the inversion of a perpendicular Laplace operator. Specific choices of the gauge condition can be made to avoid this complication and, in particular, we consider the perpendicular Coulomb gauge given by

(6.12)

\begin{equation} \int \mathcal{C}(1) \boldsymbol{\nabla} _\perp \varLambda \boldsymbol{\cdot } {\boldsymbol{A}}_1 \,\mathrm{d}^3 x = 0. \end{equation}

This gauge condition implies that the longitudinal part of the scaled vector potential vanishes

(6.13)

\begin{equation} 0 = \int \boldsymbol{\nabla} _\perp \varLambda \boldsymbol{\cdot } \varPi _{\mathrm{L},\perp } \left ( {\mathcal{C}(1)} {\boldsymbol{A}}_1 \right ) \,\mathrm{d}^3 x \quad \implies \quad \varPi _{\mathrm{L},\perp } \left ( {\mathcal{C}(1)}{\boldsymbol{A}}_1 \right ) = {\boldsymbol{0}}_3 \end{equation}

as follows from the adjoint of the projection operator as well as

(6.14)

\begin{equation} \varPi _{\mathrm{L},\perp } \left ( {\mathcal{C}(1)} \boldsymbol{\nabla} _\perp \varLambda \right ) = {\mathcal{C}(1)} \boldsymbol{\nabla} _\perp \varLambda . \end{equation}

Within the perpendicular Coulomb gauge (6.12), we find that the quasi-neutrality equation (6.10a) reduces to

(6.15)

\begin{equation} -\boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( \sum _s \frac { m_s {n}_{0,s}}{B_0^2} \big[ \boldsymbol{\nabla} _\perp \phi _1 - {u}_{0,\shortparallel ,s} {{\boldsymbol{\hat {b}}}_0} \times (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1) \big] \bigg) = \overline {\mathcal{R}}{}^{\mathrm{f}}, \end{equation}

where we have substituted the values of $\mathcal{C}(1)$ and $\mathcal{C}({u}_\shortparallel )$ from (5.47), and have made use of the adjoint of the longitudinal projection operator and (6.14). Note that this equation is decoupled from the Ampère–Maxwell law if the background distribution function is symmetric (i.e. ${u}_{0,\shortparallel ,s} = 0$) and thereby reduces to the well-known (and well-posed) perpendicular Laplace equation for $\phi _1$. We note that using a perpendicular Lorenz-type gauge results in a similar simplification, but we do not consider this here.
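
For concreteness, setting ${u}_{0,\shortparallel ,s} = 0$ in (6.15) removes the term involving ${\boldsymbol{A}}_1$, and the quasi-neutrality equation then reads

\begin{equation} -\boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( \sum _s \frac { m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 \bigg) = \overline {\mathcal{R}}{}^{\mathrm{f}}, \nonumber \end{equation}

which is the perpendicular Laplace-type equation for $\phi _1$ referred to above. Comparing coefficients in (6.15) suggests the identifications $\mathcal{C}(1) = \sum _s m_s {n}_{0,s} / B_0^2$ and $\mathcal{C}({u}_\shortparallel ) = \sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s} / B_0^2$; we state these as an inference from (6.15), since (5.47) is not reproduced in this section.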

For the vector potential ${\boldsymbol{A}}_1$, we have to solve the following system of equations (by substituting the definition of the magnetising field (6.11b) and magnetisation (6.9b) into (6.10b)):

(6.16a)

\begin{align} \boldsymbol{\nabla } & \times \biggl [ \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{A}}_1 - \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1)_\perp - \frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} {{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla} _\perp \phi _1 \biggr ] \nonumber \\ &\quad + \frac {\sum _s m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \lambda = \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} - \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{B}}_0, \end{align}

(6.16b)

\begin{align} &\boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \frac {\sum _s m_s {n}_{0,s}}{B_0^2} {\boldsymbol{A}}_{1,\perp } \right ) = 0, \end{align}

where we have intentionally dropped the contribution from the longitudinal displacement current density and have instead introduced a Lagrange multiplier $\lambda$ for which

(6.17)

\begin{align} \frac {\partial {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}}}{\partial t} = -\mathcal{C}(1) \boldsymbol{\nabla} _\perp \lambda \!\!\quad \implies\!\!\quad \lambda \mathrel {\mathop :}= \frac {\partial \phi _1}{\partial t} - [\boldsymbol{\nabla }\boldsymbol{\cdot } (\mathcal{C}(1) \boldsymbol{\nabla} _\perp )]^{-1} \left [ \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times \frac {\partial {\boldsymbol{B}}_1}{\partial t} \right ) \right ],\end{align}

which thereby replaces the displacement current density and simultaneously enforces the gauge condition (6.16b). We note that keeping the contribution from the displacement current density would yield $\lambda = 0$ as a consequence of the compatibility of the field equations, which in turn is a consequence of the gauge invariance of the proposed model. Hence, the Lagrange multiplier $\lambda$ is non-zero only because we have chosen to drop the contribution from the displacement current density for numerical reasons.

If the background particle distribution functions are symmetric, i.e. ${u}_{0,\shortparallel ,s} = 0$, the field equations for the potentials decouple fully: (6.15) yields the scalar potential and (6.16) yields the vector potential. The key property of (6.16a) is that it is invariant under ${\boldsymbol{A}}_1 \mapsto {\boldsymbol{A}}_1 + \boldsymbol{\nabla }\eta$, and a gauge condition is therefore needed to fix this freedom. Exactly such a gauge condition is provided by the constraint (6.16b). The well-posedness of (6.16) for an isotropic pressure and a symmetric background distribution is discussed in Appendix K.
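
As an illustration of this special case, substituting $p_{0,\perp } = p_{0,\shortparallel }$ and ${u}_{0,\shortparallel ,s} = 0$ into (6.16) leaves the following system for the pair $({\boldsymbol{A}}_1, \lambda )$; nothing beyond (6.16) is used here:

\begin{align} \boldsymbol{\nabla }\times \bigg( \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{A}}_1 \bigg) + \frac {\sum _s m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \lambda &= \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} - \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{B}}_0, \nonumber \\ \boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( \frac {\sum _s m_s {n}_{0,s}}{B_0^2} {\boldsymbol{A}}_{1,\perp } \bigg) &= 0. \nonumber \end{align}

The first equation contains ${\boldsymbol{A}}_1$ only through its curl, which makes the invariance under ${\boldsymbol{A}}_1 \mapsto {\boldsymbol{A}}_1 + \boldsymbol{\nabla }\eta$ evident, whereas the second equation is not invariant and thereby fixes the gauge.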

6.5. Energy conservation

For the gyrokinetic Darwin model, we find that the evolution equation for the kinetic energy density (5.75) is unchanged. The equivalent to Poynting’s theorem can also be shown, except that the potential energy density

(6.18)

\begin{equation} \mathcal{U}^{\mathrm{Dar}} \mathrel {\mathop :}= \frac {1}{2} ( {\boldsymbol{\mathcal{D}}}^{\mathrm{Dar}} \boldsymbol{\cdot } {\boldsymbol{E}} + {\boldsymbol{\mathcal{H}}}^{\mathrm{Dar}} \boldsymbol{\cdot } {\boldsymbol{B}} ) \end{equation}

evolves according to (cf. (5.76))

(6.19)

\begin{equation} \frac {\partial \mathcal{U}^{\mathrm{Dar}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( {\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}}^{\mathrm{Dar}} + {\boldsymbol{\mathcal{S}}} \right ) = - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}, \end{equation}

where the additional potential energy flux is given by

(6.20)

\begin{align} {\boldsymbol{\mathcal{S}}} &= \frac {1}{2} \Biggl [ \phi ^E \varPi _{\mathrm{T},\perp } \frac {\partial {\boldsymbol{\mathcal{P}}}_1}{\partial t} - \frac {\partial \phi ^E}{\partial t} \varPi _{\mathrm{T},\perp } {\boldsymbol{\mathcal{P}}}_1 + \phi ^B \varPi _{\mathrm{T},\perp } \left ( {\mathcal{C}(1)} \frac {\partial {\boldsymbol{E}}_\perp }{\partial t} \right ) -\frac {\partial \phi ^B}{\partial t} \varPi _{\mathrm{T},\perp } \left ( {\mathcal{C}(1)} {\boldsymbol{E}}_\perp \right ) \Biggr ] . \end{align}

The gyrokinetic transversal projection operator $\varPi _{\mathrm{T},\perp }$ is defined as

(6.21)

\begin{equation} \varPi _{\mathrm{T},\perp } {\boldsymbol{S}} \mathrel {\mathop :}= {\boldsymbol{S}} - \varPi _{\mathrm{L},\perp } {\boldsymbol{S}} , \end{equation}

and $\phi ^E, \phi ^B$ are the scalar potential parts of the (longitudinal) displacement,

(6.22)

\begin{align} \phi ^E &\mathrel {\mathop :}= -[\boldsymbol{\nabla }\boldsymbol{\cdot } (\mathcal{C}(1) \boldsymbol{\nabla} _\perp )]^{-1} \left [ \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \mathcal{C}(1) {\boldsymbol{E}}_\perp \right ) \right ],\\[-4pt]\nonumber \end{align}

(6.23)

\begin{align} \phi ^B &\mathrel {\mathop :}= -[\boldsymbol{\nabla }\boldsymbol{\cdot } (\mathcal{C}(1) \boldsymbol{\nabla} _\perp )]^{-1} \big[ \boldsymbol{\nabla }\boldsymbol{\cdot } \big( \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}} \big) \big], \end{align}

which, in the perpendicular Coulomb gauge, yields $\phi ^E = \phi _1$, and the potentials are related to the Lagrange multiplier $\lambda$ from (6.16) via $\lambda = \partial (\phi ^E + \phi ^B)/\partial t$.
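
This relation can be checked directly against (6.17). The sketch below assumes that ${\boldsymbol{B}}_0$, ${{\boldsymbol{\hat {b}}}_0}$ and the coefficients $\mathcal{C}(\cdot )$ are independent of time, such that $\partial {\boldsymbol{B}} / \partial t = \partial {\boldsymbol{B}}_1 / \partial t$ and the time derivative commutes with the inverse operator in (6.22) and (6.23). Using $\phi ^E = \phi _1$, we then find

\begin{equation} \frac {\partial }{\partial t} (\phi ^E + \phi ^B) = \frac {\partial \phi _1}{\partial t} - [\boldsymbol{\nabla }\boldsymbol{\cdot } (\mathcal{C}(1) \boldsymbol{\nabla} _\perp )]^{-1} \left [ \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \mathcal{C}({u}_\shortparallel ) {{\boldsymbol{\hat {b}}}_0} \times \frac {\partial {\boldsymbol{B}}_1}{\partial t} \right ) \right ] = \lambda , \nonumber \end{equation}

in agreement with the definition of $\lambda$ in (6.17).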

This results in the following conserved energy:

(6.24)

\begin{equation} \mathfrak{E}^{\mathrm{Dar}} \mathrel {\mathop :}= \int (\mathcal{K} + \mathcal{U}^{\mathrm{Dar}})\, \mathrm{d}^3 { {r}}, \end{equation}

where the kinetic energy density is still defined by (5.74a).

7. Comparison with some models from literature

The two proposed gyrokinetic models which have been derived thus far are more comprehensive and possess more structure than the models usually found in the literature (Qin et al. 1999; Brizard & Hahm 2007; Kleiber et al. 2016). In this section, we compare the proposed gyrokinetic models to several models from the literature. This comparison is not intended to be exhaustive. We compare the two proposed models to the parallel-only model (Kleiber et al. 2016), as it is the ‘workhorse’ of gyrokinetic simulations; to the symplectic gyrokinetic model from Brizard & Hahm (2007), as this is a frequently cited paper which presents a novel gyrokinetic model that includes ${\boldsymbol{A}}_{1,\perp }$ but is not gauge-invariant; and finally to the gauge-invariant gyrokinetic model from Burby & Brizard (2019), as the two proposed models are a generalisation thereof and are largely inspired by it.

For each model under consideration, we discuss the gyrocentre single-particle Lagrangian, the resulting EOMs and field equations as well as their corresponding strong form. Furthermore, the well-posedness of the models is discussed, and dispersion relations are derived and used to compare the models in terms of the presence of shear and/or compressional Alfvén waves.

7.1. Parallel-only model

The parallel-only model, as discussed for example by Kleiber et al. (2016), is based on the assumption that the perpendicular part of the vector potential can be neglected: ${\boldsymbol{A}}_{1,\perp } = {\boldsymbol{0}}_3$. This assumption is combined with the following approximation in the derivation of the Hamiltonian

(7.1)

\begin{equation} {\boldsymbol{B}}_{1} = \boldsymbol{\nabla }\times (A_{1,\shortparallel } {{\boldsymbol{\hat {b}}}_0}) = \underbrace {\boldsymbol{\nabla} _\perp A_{1,\shortparallel } \times {{\boldsymbol{\hat {b}}}_0}}_{O(\varepsilon _\perp )} + \underbrace {A_{1,\shortparallel } \boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0}}_{O(\varepsilon _B)} \approx \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \times {{\boldsymbol{\hat {b}}}_0}, \end{equation}

where it is assumed that

(7.2)

\begin{equation} \varepsilon _B \ll \varepsilon _\perp \quad \iff \quad \frac {1}{k_\perp } \ll L_B. \end{equation}

That is, it is assumed that no system-scale effects are present in the perpendicular direction. Given that FLR effects are already neglected in the second-order gyrocentre Hamiltonian $ { {H}}_2$, it follows that the reduced parallel-only model is valid for intermediate wavelengths only: $\varrho \ll 1 / k_\perp \ll L_B$.

7.1.1. Gyrocentre single-particle Lagrangian

When neglecting the perpendicular part of the vector potential and by making use of the approximation given by (7.1), we find that the symplectic part is given by

(7.3)

\begin{equation} { {{\boldsymbol{\gamma }}}}_{1,{\boldsymbol{R}},s}^{\shortparallel } \mathrel {\mathop :}= \langle ({{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } {\bar{\boldsymbol{\gamma }}}_{1,s,{\boldsymbol{R}}}^\dagger ) {{\boldsymbol{\hat {b}}}_0} \rangle = q_s \langle \mathring {A}_{1,\shortparallel } \rangle {{\boldsymbol{\hat {b}}}_0}, \end{equation}

whereas the first- and second-order Hamiltonians are reduced to (cf. (4.52))

(7.4a)

\begin{equation} { {H}}_{1,s}^\shortparallel \mathrel {\mathop :}= q_s \langle \mathring {\phi }_1 \rangle \end{equation}

and (cf. (4.66))

(7.4b)

\begin{equation} { {H}}_{2,s}^\shortparallel \mathrel {\mathop :}= - \frac {m_s}{2 B_0^2} \lvert \boldsymbol{\nabla} _\perp (\phi _1 - { {U}}_\shortparallel A_{1,\shortparallel }) \rvert ^2 + \frac { { {{M}}}}{2 B_0} \lvert \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \rvert ^2, \end{equation}

respectively, by making use of the gradient theorem (D.8).

7.1.2. Principle of least action

We find that the reduced EOMs are given by (cf. (5.23))

(7.5a)

\begin{align} \,\dot { { {\!\boldsymbol{R}}}} &= { {U}}_\shortparallel {\boldsymbol{b}}_s^{{\star },\shortparallel } - \frac {1}{q_s B_{s,\shortparallel }^{{\star },\shortparallel }} {{\boldsymbol{\hat {b}}}_0} \times \left ( q_s {\boldsymbol{E}}_1^{{\star },\shortparallel } - {M} \boldsymbol{\nabla }B_0 \right ), \end{align}

(7.5b)

\begin{align} \dot { { {U}}}_\shortparallel &= \frac {1}{m_s} {\boldsymbol{b}}_s^{{\star },\shortparallel } \boldsymbol{\cdot } \left ( q_s {\boldsymbol{E}}_1^{{\star },\shortparallel } - {M} \boldsymbol{\nabla }B_0 \right ) , \end{align}

where the electromagnetic fields are defined as (cf. (4.71))

(7.6a)

\begin{align} {\boldsymbol{E}}_1^{{\star },\shortparallel } &\mathrel {\mathop :}= -\boldsymbol{\nabla }\langle \mathring {\phi }_1 \rangle - \frac {\partial \langle \mathring {A}_{1,\shortparallel } \rangle }{\partial t} {{\boldsymbol{\hat {b}}}_0}, \end{align}

(7.6b)

\begin{align} {\boldsymbol{B}}_1^{{\star },\shortparallel } &\mathrel {\mathop :}= \boldsymbol{\nabla }\times (\langle \mathring {A}_{1,\shortparallel } \rangle {{\boldsymbol{\hat {b}}}_0}) , \end{align}

and we have defined ${\boldsymbol{b}}_s^{{\star },\shortparallel }$ and $B_{s,\shortparallel }^{{\star },\shortparallel }$ analogously to (4.76) and (4.77), i.e. by replacing ${\boldsymbol{B}}_{1}^{\star }$ by ${\boldsymbol{B}}_{1}^{{\star },\shortparallel }$ .

For the derivation of the field equations, we again give Low’s action explicitly resulting in (cf. (5.25) and (6.7))

(7.7)

\begin{align} \mathfrak{A}^\shortparallel ( { {{\boldsymbol{Z}}}}, \phi _1, A_{1,\shortparallel }) &= \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) \biggl [ q_s \left ( {\boldsymbol{A}}_{0,s}^{\star } + \langle \mathring {A}_{1,\shortparallel } \rangle {{\boldsymbol{\hat {b}}}_0} \right ) \boldsymbol{\cdot } \,\dot { { {\!\boldsymbol{R}}}} + \frac {m_s { {{M}}}}{q_s} \dot { { {\varTheta }}} - \frac {m_s}{2} { {U}}_\shortparallel ^2 - { {{M}}} B_0\nonumber \\[5pt] &\quad - q_s \langle \mathring {\phi }_1 \rangle \biggr ] \mathfrak{J}_s^\shortparallel ({\boldsymbol{z}}^0, t^0) \,\mathrm{d}^6 { {z}}^0 \,\mathrm{d} t\nonumber \\[5pt]&\quad + \sum _s \int {f}^0_{s} \biggl [ \frac {m_s}{2 B_0^2} \lvert \boldsymbol{\nabla} _\perp (\phi _1 - {u}_\shortparallel A_{1,\shortparallel }) \rvert ^2 - \frac {{\mu }}{2 B_0} \lvert \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \rvert ^2 \biggr ] \mathfrak{J}_{0,s} \,\mathrm{d}^6 { {z}}\, \mathrm{d} t\nonumber \\[5pt]&\quad - \frac {1}{2 \mu _0} \int \lvert {\boldsymbol{B}}_0 + \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \times {{\boldsymbol{\hat {b}}}_0} \rvert ^2 \,\mathrm{d}^3 x \,\mathrm{d} t, \end{align}

where the Jacobian is now given by $\mathfrak{J}_s^\shortparallel \mathrel {\mathop :}= {B_{s,\shortparallel }^{{\star },\shortparallel }} / {m_s}$ (cf. (5.7)). Compared with the action of the proposed gyrokinetic Maxwell model, as given in (5.25), we consider the limit of quasi-neutrality (i.e. $\epsilon _0 = 0$).

The quasi-neutrality equation (cf. (5.28) and (6.8a)) and Ampère’s law (cf. (5.33) and (6.8b)) are given by

(7.8a)

\begin{align} -\int {\boldsymbol{\mathcal{P}}}_1^\shortparallel \boldsymbol{\cdot } \boldsymbol{\nabla} _\perp \varLambda\, \mathrm{d}^3 { {r}} &= \sum _s q_s \int {f}_s \langle \mathring {\varLambda } \rangle \mathfrak{J}_s^\shortparallel\, \mathrm{d}^6 { {z}} ,\\[-4pt]\nonumber \end{align}

(7.8b)

\begin{align} \frac {1}{\mu _0} \int \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \boldsymbol{\cdot } \boldsymbol{\nabla} _\perp \varLambda \,\mathrm{d}^3 x &= \int {\boldsymbol{\mathcal{M}}}_1^\shortparallel \boldsymbol{\cdot } (\boldsymbol{\nabla} _\perp \varLambda \times {{\boldsymbol{\hat {b}}}_0}) \,\mathrm{d}^3 { {r}} + \sum _s q_s \int {f}_s \dot { { {R}}}_\shortparallel \langle \mathring {\varLambda } \rangle \mathfrak{J}_s^\shortparallel \,\mathrm{d}^6 { {z}}, \end{align}

where the reduced polarisation (cf. (5.29) and (6.9a)) and magnetisation (cf. (5.34) and (6.9b)) are given by

(7.9a)

\begin{align} {\boldsymbol{\mathcal{P}}}_1^\shortparallel &\mathrel {\mathop :}= \sum _s \int {f}^0_{s} {\boldsymbol{P}}_{1,s}^\shortparallel \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} , \quad {\boldsymbol{P}}_{1,s}^\shortparallel \mathrel {\mathop :}= \frac {m_s}{q_s B_0^2} {\boldsymbol{F}}_{1,\perp }^\shortparallel , \quad {\boldsymbol{F}}_{1,\perp }^\shortparallel \mathrel {\mathop :}= -q_s \boldsymbol{\nabla} _\perp (\phi _1 - {u}_\shortparallel A_{1,\shortparallel }),\\[-5pt]\nonumber \end{align}

(7.9b)

\begin{align} {\boldsymbol{\mathcal{M}}}_1^\shortparallel &\mathrel {\mathop :}= \sum _s \int {f}^0_{s} {\boldsymbol{M}}_{1,s}^\shortparallel \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} , \quad {\boldsymbol{M}}_{1,s}^\shortparallel \mathrel {\mathop :}= - {{\boldsymbol{\hat {b}}}_0} \times \left ( {u}_\shortparallel {\boldsymbol{P}}_{1,s}^\shortparallel - \frac {{\mu }}{B_0} \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \right ) . \end{align}

When considering the EOMs (7.5) as well as field equations (7.8) of the reduced model, we find that this model coincides with reduced parallel models from the literature. We note that this is only due to the choice $(\xi _R, \xi _\varTheta ) = (1, 0)$, which yields the appropriate gyro-averages on both the scalar and vector potential.
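
To see how the background moments enter the field equations, the velocity-space integrals in (7.9b) can be evaluated explicitly. The following sketch assumes the moment definitions ${n}_{0,s} {u}_{0,\shortparallel ,s} = \int {f}^0_{s} {u}_\shortparallel \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u}$, $p_{0,\shortparallel } = \sum _s \int {f}^0_{s} m_s {u}_\shortparallel ^2 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u}$ and $p_{0,\perp } = \sum _s \int {f}^0_{s} {\mu } B_0 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u}$; these are our reading of (5.52) and of the pressures appearing in (6.9b), which are not reproduced in this section. Substituting ${\boldsymbol{P}}_{1,s}^\shortparallel$ from (7.9a) into (7.9b) then gives

\begin{align} {\boldsymbol{\mathcal{M}}}_1^\shortparallel &= {{\boldsymbol{\hat {b}}}_0} \times \bigg( \frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 - \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \bigg), \nonumber \\ {\boldsymbol{\mathcal{M}}}_1^\shortparallel \times {{\boldsymbol{\hat {b}}}_0} &= \frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 - \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} \boldsymbol{\nabla} _\perp A_{1,\shortparallel }, \nonumber \end{align}

where the second line uses that the vector in parentheses is perpendicular to ${{\boldsymbol{\hat {b}}}_0}$.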

7.1.3. Strong formulation

To be able to interpret the equations more easily, we present the strong formulation of the field equations as follows (cf. (6.16)):

(7.10a)

\begin{align} - \boldsymbol{\nabla }\boldsymbol{\cdot } \left [ \sum _s\frac { m_s {n}_{0,s}}{B_0^2} \left ( \boldsymbol{\nabla} _\perp \phi _1 - {u}_{0,\shortparallel ,s} \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \right ) \right ] &= \overline {\mathcal{R}}{}^{\mathrm{f}}, \end{align}

(7.10b)

\begin{align} -\frac {1}{\mu _0} \boldsymbol{\nabla }\boldsymbol{\cdot } \boldsymbol{\nabla} _\perp A_{1,\shortparallel } - \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \frac {\sum _s m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 - \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \right ) &= \overline {\mathcal{J}}{}^{\mathrm{f},\shortparallel }_\shortparallel , \end{align}

where we recall that ${n}_{0,s}$ and ${u}_{0,\shortparallel ,s}$ denote the background particle density and parallel velocity, as defined in (5.52). The gyro-average adjoint of the parallel component of the free current density $\overline {{\mathcal{J}}}{}^{\mathrm{f},\shortparallel }_\shortparallel$ is different from the one discussed in § 5.4, and it is defined in a weak sense as (cf. (5.39b))

(7.11)

\begin{equation} \int \overline {{\mathcal{J}}}{}^{\mathrm{f},\shortparallel }_\shortparallel \varLambda \,\mathrm{d}^3 { {r}} \mathrel {\mathop :}= \sum _s q_s \int {f}_s \dot { { {{R}}}}_\shortparallel \langle \mathring {\varLambda } \rangle \mathfrak{J}_s^\shortparallel\, \mathrm{d}^6 { {z}}. \end{equation}

The following identities:

(7.12)

\begin{equation} \int {f}^0_{s} {u}_\shortparallel \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} = 0 , \quad \int {f}^0_{s} \left ( m_s {u}_\shortparallel ^2 - {\mu } B_0 \right ) \mathfrak{J}_{0,s}\, \mathrm{d}^3 {u} = 0, \end{equation}

which hold for a centred Maxwellian background distribution

(7.13)

\begin{equation} {f}^{0,\mathrm{CM}}_{s}({\boldsymbol{r}}, {u}_\shortparallel , {\mu }) \mathrel {\mathop :}= \frac {{n}_{0,s}({\boldsymbol{r}})}{\sqrt {\pi }^{3} u_{\mathrm{th},s}({\boldsymbol{r}})^3} \exp {\left (-\frac {m_s {u}_\shortparallel ^2 + 2 {\mu } B_0({\boldsymbol{r}})}{m_s u_{\mathrm{th},s}({\boldsymbol{r}})^2} \right )}, \end{equation}

result in decoupling the field equations (7.10). Note that (7.12), in physical terms, corresponds to the absence of a parallel background current density (${u}_{0,\shortparallel ,s} = 0$) as well as the isotropy of the pressure/temperature ($p_{0,\perp } = p_{0,\shortparallel }$).
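
As a check of (7.12), the moments of the centred Maxwellian (7.13) can be computed in closed form. The sketch below assumes the velocity-space measure $\mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} = (B_0 / m_s) \,\mathrm{d} {u}_\shortparallel \,\mathrm{d} {\mu } \,\mathrm{d} \theta$ with $\theta \in [0, 2\pi )$, i.e. the unperturbed counterpart of the Jacobian $\mathfrak{J}_s^\shortparallel$; this measure is not spelled out in this section. With this assumption, we find

\begin{align} \int {f}^{0,\mathrm{CM}}_{s} \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} &= \frac {{n}_{0,s}}{\sqrt {\pi }^{3} u_{\mathrm{th},s}^3} \, \sqrt {\pi } u_{\mathrm{th},s} \, \frac {m_s u_{\mathrm{th},s}^2}{2 B_0} \, 2 \pi \, \frac {B_0}{m_s} = {n}_{0,s}, \nonumber \\ \int {f}^{0,\mathrm{CM}}_{s} {u}_\shortparallel \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} &= 0, \nonumber \\ \int {f}^{0,\mathrm{CM}}_{s} m_s {u}_\shortparallel ^2 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u} &= \frac {1}{2} {n}_{0,s} m_s u_{\mathrm{th},s}^2 = \int {f}^{0,\mathrm{CM}}_{s} {\mu } B_0 \mathfrak{J}_{0,s} \,\mathrm{d}^3 {u}, \nonumber \end{align}

where the second line follows because the integrand is odd in ${u}_\shortparallel$. Both identities in (7.12) therefore hold, and (7.10) reduces to the two decoupled equations

\begin{align} - \boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( \sum _s\frac { m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 \bigg) &= \overline {\mathcal{R}}{}^{\mathrm{f}}, \nonumber \\ -\frac {1}{\mu _0} \boldsymbol{\nabla }\boldsymbol{\cdot } \boldsymbol{\nabla} _\perp A_{1,\shortparallel } &= \overline {\mathcal{J}}{}^{\mathrm{f},\shortparallel }_\shortparallel . \nonumber \end{align}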

7.2. Symplectic gyrokinetic model from Brizard–Hahm

In addition to the gauge-invariant model described by Burby & Brizard (2019), there are also gauge-variant gyrokinetic models which include ${\boldsymbol{A}}_{1,\perp }$. In particular, we consider the symplectic model from Brizard & Hahm (2007, (171) and (173) with $(\alpha , \beta ) = (1, 1)$), which we hereafter refer to as the Brizard–Hahm (BH) model.

7.2.1. Gyrocentre single-particle Lagrangian

The symplectic part of the Lagrangian is as given in (4.41), whereas the first- and second-order Hamiltonians are derived from Brizard & Hahm (2007, (171) and (173) with $(\alpha , \beta ) = (1, 1)$) resulting in (cf. (4.52) and (7.4a))

(7.14a)

\begin{equation} { {H}}_{1,s}^{\mathrm{BH}} \mathrel {\mathop :}= q_s \langle \mathring {\phi }_1 \rangle + {M} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \end{equation}

and (cf. (4.66) and (7.4b))

(7.14b)

\begin{equation} { {H}}_{2,s}^{\mathrm{BH}} \mathrel {\mathop :}= - \frac {m_s}{2 B_0^2} \lvert \boldsymbol{\nabla} _\perp (\phi _1 - { {U}}_\shortparallel A_{1,\shortparallel }) \rvert ^2 + \frac {{M}}{2 B_0} \lvert \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \rvert ^2, \end{equation}

respectively. The derivation, which neglects terms of $O(\varepsilon _\perp ^3)$ in $ { {H}}_{2,s}^{\mathrm{BH}}$, can be found in Appendix J. We note that this result can alternatively be derived from (4.66), when making use of the approximation ${\boldsymbol{B}}_1 \approx \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \times {{\boldsymbol{\hat {b}}}_0} + (\boldsymbol{\nabla }\times {\boldsymbol{A}}_{1,\perp })_\shortparallel {{\boldsymbol{\hat {b}}}_0}$ and neglecting ${\partial {\boldsymbol{A}}_{1,\perp }} / {\partial t}$.

Note that the second-order Hamiltonian coincides exactly with that of the parallel-only model: $ { {H}}_{2,s}^{\mathrm{BH}} = { {H}}_{2,s}^\shortparallel$ . This is remarkable, as it implies an absence of the polarisation current density as well as an absence of a contribution to the magnetisation from the perpendicular part of the vector potential.

Both the proposed gyrokinetic Maxwell model and the Brizard–Hahm model (Brizard & Hahm 2007) reduce to the parallel-only model when neglecting the perpendicular part of the vector potential. Note that the same cannot be said about the gauge-invariant model from Burby & Brizard (2019), which corresponds to $(\xi _R, \xi _\varTheta ) = (0, 0)$. Therein, we find that, e.g. ${\boldsymbol{A}}^{\star }_1 = {\boldsymbol{A}}_1$ such that the vector potential without gyro-averaging appears in the EOMs.

7.2.2. Principle of least action

The EOMs are found by substituting the expression for the particle Hamiltonian $ { {H}}^{\mathrm{part},\mathrm{BH}} = { {H}}_0 + { {H}}_1^{\mathrm{BH}}$ into the general form of the EOMs given by (4.80), where we now have ${\boldsymbol{A}}^{{\star },\mathrm{BH}}_1 \mathrel {\mathop :}= \langle \,\mathring {\!{\boldsymbol{A}}}_1 \rangle$. This results in EOMs which have the same structure as the EOMs of the quasi-neutral gyrokinetic Maxwell model (5.23), except that the electromagnetic fields are defined as (cf. (4.71) and (7.6))

(7.15a)

\begin{align} {\boldsymbol{E}}_1^{{\star },\mathrm{BH}} &\mathrel {\mathop :}= -\boldsymbol{\nabla }\langle \mathring {\phi }_1 \rangle - \frac {\partial \langle \mathring {{\boldsymbol{A}}}_{1} \rangle }{\partial t},\\[-10pt]\nonumber \end{align}

(7.15b)

\begin{align} {\boldsymbol{B}}_1^{{\star },\mathrm{BH}} &\mathrel {\mathop :}= \boldsymbol{\nabla }\times \langle \,\mathring {\!{\boldsymbol{A}}}_1 \rangle , \end{align}

and we have defined ${\boldsymbol{b}}_s^{{\star },\mathrm{BH}}$ and $B_{s,\shortparallel }^{{\star },\mathrm{BH}}$ analogously to (4.76) and (4.77). The explicit expression of Low’s action is given by (cf. (5.25), (6.7) and (7.7))

(7.16)

\begin{align} &\mathfrak{A}^{\mathrm{BH}}( { {{\boldsymbol{Z}}}}, \phi _1, {\boldsymbol{A}}_1, \lambda ) = \sum _s \int {f}^0_{s}({\boldsymbol{z}}^0) \biggl [ q_s \big( {\boldsymbol{A}}_{0,s}^{\star } + \langle \mathring {{\boldsymbol{A}}}_{1} \rangle \big) \boldsymbol{\cdot } \,\dot { { {\!\boldsymbol{R}}}} + \frac {m_s { {{M}}}}{q_s} \dot { { {\varTheta }}} - \frac {m_s}{2} { {U}}_\shortparallel ^2 - { {{M}}} B_0 \nonumber \\[5pt] &\quad - q_s \langle \mathring {\phi }_1 \rangle - {M} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \biggr ] \mathfrak{J}_s^{\mathrm{BH}}({\boldsymbol{z}}^0, t^0)\, \mathrm{d}^6 { {z}}^0 \,\mathrm{d} t + \sum _s \int {f}^0_{s} \biggl [ \frac {m_s}{2 B_0^2} \lvert \boldsymbol{\nabla} _\perp (\phi _1 - {u}_\shortparallel A_{1,\shortparallel }) \rvert ^2 \nonumber \\[5pt] &\quad - \frac {{\mu }}{2 B_0} \lvert \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \rvert ^2 \biggr ] \mathfrak{J}_{0,s} \,\mathrm{d}^6 { {z}}\, \mathrm{d} t - \frac {1}{2 \mu _0} \int \lvert {\boldsymbol{B}}_0 + {\boldsymbol{B}}_1 \rvert ^2 \,\mathrm{d}^3 x \,\mathrm{d} t \nonumber\\[5pt]&\quad + \int C^{\mathrm{BH}}( { {{\boldsymbol{Z}}}}, \phi _1, {\boldsymbol{A}}_1, \lambda )\, \mathrm{d}^3 x\, \mathrm{d} t, \end{align}

where a Lagrange multiplier $\lambda$ is introduced with an associated constraint $C^{\mathrm{BH}}$ to ensure that a well-posed system of equations is found, as is discussed in more detail in § 7.2.4. Therein, we show that the constraint necessarily depends on all unknowns of our problem, including the characteristics $ { {{\boldsymbol{Z}}}}$. Hence, when using a variational formulation of the Brizard–Hahm model (Brizard & Hahm 2007), we find that well-posedness of the model implies that the Lagrange multiplier $\lambda$ affects the EOMs as well as each of the field equations. We do not show this dependence here explicitly and, for now, we ignore the contribution due to the constraint. Moreover, the Jacobian is given by $\mathfrak{J}_s^{\mathrm{BH}} \mathrel {\mathop :}= {B_{s,\shortparallel }^{{\star },\mathrm{BH}}} / {m_s}$ (cf. (5.7)).

The action (7.16) results in the same quasi-neutrality equation as found for the reduced parallel-only model, as given by (7.8a). We find that Ampère’s law is given by

(7.17)

\begin{align} \frac {1}{\mu _0} \int ({\boldsymbol{B}}_0 + {\boldsymbol{B}}_1) \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {{\boldsymbol{\varLambda }}}) \,\mathrm{d}^3 x & = \int \boldsymbol{\nabla }\boldsymbol{\cdot } ({\boldsymbol{\mathcal{M}}}_1^\shortparallel \times {{\boldsymbol{\hat {b}}}_0}) \varLambda _\shortparallel \,\mathrm{d}^3 { {r}} \nonumber\\ &\quad + \sum _s \int {f}_s \big[ q_s \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \langle \mathring {{{\boldsymbol{\varLambda }}}} \rangle - {\mu } \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}})_\shortparallel \rangle \!\rangle \big] \mathfrak{J}_s^{\mathrm{BH}} \,\mathrm{d}^6 { {z}} . \end{align}

7.2.3. Strong formulation

As usual, we consider the strong formulation of the field equations, which results in the following quasi-neutrality equation and Ampère’s law (cf. (5.44), (6.16) and (7.10))

(7.18a)

\begin{align} - \boldsymbol{\nabla }\boldsymbol{\cdot } \left [ \sum _s\frac { m_s {n}_{0,s}}{B_0^2} \left ( \boldsymbol{\nabla} _\perp \phi _1 - {u}_{0,\shortparallel ,s} \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \right ) \right ] = \overline {\mathcal{R}}{}^{\mathrm{f}}, \end{align}

(7.18b)

\begin{align} \frac {1}{\mu _0} \boldsymbol{\nabla } & \times (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1) - {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \sum _s\frac { m_s {n}_{0,s} {u}_{0,\shortparallel ,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 - \frac {p_{0,\shortparallel } - p_{0,\perp }}{B_0^2} \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \right ) = \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{BH}} \nonumber \\ & - \frac {1}{\mu _0} \boldsymbol{\nabla }\times {\boldsymbol{B}}_0 . \end{align}

The meaning of the gyro-average adjoint of the free current density $\overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{BH}}$ is different from the one discussed in § 5.4, and it is defined in a weak sense as (cf. (5.39b))

(7.19)

\begin{equation} \int \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{BH}} \boldsymbol{\cdot } {{\boldsymbol{\varLambda }}} \,\mathrm{d}^3 { {r}} \mathrel {\mathop :}= \sum _s \int {f}_s \big[ q_s \,\dot { { {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \langle \mathring {{{\boldsymbol{\varLambda }}}} \rangle - {\mu } \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {{\boldsymbol{\varLambda }}})_\shortparallel \rangle \!\rangle \big] \mathfrak{J}_s^{\mathrm{BH}} \,\mathrm{d}^6 { {z}}. \end{equation}

This adjoint of the free current density coincides with the one from the proposed gyrokinetic model, as given by (5.39b), whenever $\varepsilon _B = 0$ thanks to (D.10).

In contrast to the proposed gyrokinetic Maxwell model discussed in strong formulation in § 5.4, we now fail to recognise the structure of the macroscopic Maxwell’s equations in (7.18). This is primarily due to the absence of the polarisation current density, but also due to the fact that the magnetisation current density ${\boldsymbol{\mathcal{J}}}^{\mathrm{m},\mathrm{BH}}$ cannot be written as the curl of a magnetisation. Even when neglecting $O(\varepsilon _B)$ contributions, this is not possible, in which case we find

(7.20)

\begin{equation} {\boldsymbol{\mathcal{J}}}{^{\mathrm{m,\mathrm{BH}}}} = (\boldsymbol{\nabla }\times {\boldsymbol{\mathcal{M}}}_1^\shortparallel )_\shortparallel + O(\varepsilon _B). \end{equation}

7.2.4. Well-posedness of the field equations

For the symplectic Brizard–Hahm model (Brizard & Hahm 2007), we note that the free current density is defined differently from the free current density of the proposed model (5.39b). This implies in particular that the gyro-average of the free-charge continuity equation is no longer satisfied,

(7.21)

\begin{equation} \int \left ( \frac {\partial \overline {\mathcal{R}}{}^{\mathrm{f}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{BH}} \right ) \varLambda \,\mathrm{d}^3 { {r}} = \int {\boldsymbol{\mathcal{J}}}^{\mathrm{f}} \boldsymbol{\cdot } ( \boldsymbol{\nabla }\langle \mathring {\varLambda } \rangle - \langle \mathring {\boldsymbol{\nabla }} \varLambda \rangle )\, \mathrm{d}^3 { {r}} . \end{equation}

As the right-hand side does not vanish in general, we find that a result analogous to (5.43) is not obtained.

Upon inspecting (7.17), we find that the bound current density is given by

(7.22)

\begin{equation} {\boldsymbol{\mathcal{J}}}^{\mathrm{b},\mathrm{BH}} = {\boldsymbol{\mathcal{J}}}^{\mathrm{m},\mathrm{BH}} = {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\nabla }\boldsymbol{\cdot } ({\boldsymbol{\mathcal{M}}}_1^\shortparallel \times {{\boldsymbol{\hat {b}}}_0}), \end{equation}

which is not divergence free. Computation of the divergence of Ampère’s law (7.18b) results in the unsatisfied constraint

(7.23)

\begin{equation} 0 = \frac {\partial }{\partial t} (\overline {\mathcal{R}}{}^{\mathrm{f}} + \mathcal{R}^{\mathrm{b},\mathrm{BH}}) + \boldsymbol{\nabla }\boldsymbol{\cdot } (\overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f},\mathrm{BH}} + {\boldsymbol{\mathcal{J}}}^{\mathrm{b},\mathrm{BH}} ) \end{equation}

upon addition of the quasi-neutrality equation (7.18a), where we let $\mathcal{R}^{\mathrm{b},\mathrm{BH}} \mathrel {\mathop :}= - \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{P}}}_1^\shortparallel$. In this case, the gyro-average adjoint of the free-charge continuity equation does not vanish and we are, therefore, left with a total continuity equation which must be enforced by means of a Lagrange multiplier. Due to the non-zero right-hand side of (7.21), we find that the constraint (7.23) also depends on the characteristics $ { {{\boldsymbol{Z}}}}$, which is undesirable as it implies that the Lagrange multiplier $\lambda$ also affects the EOMs.

Hence, for the symplectic Brizard–Hahm model (Brizard & Hahm 2007), we find that $\lambda \neq 0$ for three independent reasons: the polarisation current density is missing, the magnetisation current density is not the curl of a magnetisation and the gyro-average adjoint of the free-charge continuity equation does not hold. Note that a polarisation current density can be included by considering higher-order approximations of the first-order generating function, see e.g. Qin et al. (1999).

7.3. Linearised models in a slab

To study the dispersive properties of the models under consideration, we consider a slab geometry, wherein the background magnetic field ${\boldsymbol{B}}_0$ is assumed to be constant. In this case, the FLR corrected electromagnetic fields exactly coincide with what one would expect (cf. (4.72))

(7.24)

\begin{equation} {\boldsymbol{B}}_1^{\star } = \langle \,\mathring {\!{\boldsymbol{B}}}_1 \rangle , \quad {\boldsymbol{E}}_1^{\star } = \langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle , \end{equation}

as $\varepsilon _B = 0$ . Furthermore, the term that multiplies $\,\dot { { {\!\boldsymbol{R}}}}$ in the Ampère–Maxwell law exactly coincides with $\langle \mathring {{{\boldsymbol{\varLambda }}}} \rangle$ (cf. (D.10)). We reiterate that each of these three identities holds only because of the specific choice of our gyrocentre coordinate transformation $(\xi _R, \xi _\varTheta ) = (1, 0)$ .

7.3.1. Proposed gyrokinetic Maxwell model

We make use of the so-called susceptibility tensor to study the dispersive properties of the proposed model. The susceptibility tensor represents the linearised model with a Fourier ansatz. More precisely, we find that computing the time derivative of the Ampère–Maxwell law (5.44d) results in

(7.25)

\begin{equation} -\frac {1}{\mu _0} \boldsymbol{\nabla }\times (\boldsymbol{\nabla }\times {\boldsymbol{E}}_1) = \frac {\partial ^2 }{\partial t^2} \left ( \epsilon _0 {\boldsymbol{E}}_1 + {\boldsymbol{\mathcal{P}}}_1 \right ) + \boldsymbol{\nabla }\times \frac {\partial {\boldsymbol{\mathcal{M}}}_1}{\partial t} + \frac {\partial \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}}{\partial t} , \end{equation}

where we have substituted (5.44e), (5.44f) and (5.44c). By substituting the expressions for the polarisation (5.29), magnetisation (5.34) and the gyrocentre current density, and by repeatedly using Faraday’s law (5.21), we can obtain an equation which is expressed entirely in terms of the perturbed electric field ${\boldsymbol{E}}_1$. We then linearise this equation and substitute the Fourier ansatz

(7.26)

\begin{equation} {\boldsymbol{E}}_1 = \widehat {{\boldsymbol{E}}}_1 \mathrm{e}^{\mathrm{i} ({\boldsymbol{k}} \boldsymbol{\cdot } {\boldsymbol{r}} - \omega t)}, \end{equation}

which results in an equation of the form (Hasegawa 1975, (2.37))

(7.27)

\begin{equation} \omega ^2 ( \unicode{x1D644}_3 + \bar{\bar{ \unicode{x1D653}}}) \widehat {{\boldsymbol{E}}}_1 + c^2 {\boldsymbol{k}} \times ({\boldsymbol{k}} \times \widehat {{\boldsymbol{E}}}_1) = {\boldsymbol{0}}_3, \end{equation}

where $ \bar{\bar { \unicode{x1D653}}}$ is referred to as the gyrokinetic susceptibility tensor, $ \unicode{x1D644}_3$ denotes the $3 \times 3$ identity matrix and $c = \sqrt {1 / (\mu _0 \epsilon _0)}$ denotes the speed of light in vacuum. Note that $ \unicode{x1D644}_3 + \bar{\bar { \unicode{x1D653}}}$ is often referred to as the dielectric tensor. The susceptibility tensor contains the (linearised) contributions to the Ampère–Maxwell law from the polarisation, magnetisation as well as the gyrocentre current density, and reduces the linearised gyrokinetic model to a material property: the permittivity of the ‘material’ is given by $\epsilon _0 ( \unicode{x1D644}_3 + \bar{ \bar{ \unicode{x1D653}}})$ .

We follow the discussion from Zonta et al. (2021), wherein the gyrokinetic susceptibility tensor $ { { \unicode{x1D653}}}^{\mathrm{ZLR}}$ is derived for the drift kinetic model from Burby & Brizard (2019) (i.e. the proposed model with $(\xi _R, \xi _\varTheta ) = (0, 0)$ in the ZLR limit $\varepsilon _\perp \rightarrow 0$) and subsequently compared with the ZLR limit (Hasegawa 1975, (2.159)) of the Vlasov–Maxwell susceptibility tensor (Hasegawa 1975, (2.42) and (2.43)).

The derivation of the susceptibility tensor can be found in Appendix L, and the expression for the susceptibility tensor in its most general form can be found in (L.31). We make two simplifications. First, we consider the ZLR limit of the susceptibility tensor, resulting in the drift kinetic susceptibility tensor given by (L.34), which coincides exactly with the low-frequency and ZLR limit of the Vlasov–Maxwell susceptibility tensor given by Hasegawa (1975, (2.159)). Second, we consider the use of a centred Maxwellian background particle distribution as defined in (7.13). Note that a constant background magnetic field ${\boldsymbol{B}}_0$ combined with a centred Maxwellian background distribution with constant density trivially satisfies the equilibrium conditions discussed in § 5.6. When using the identities given by (7.12), we find that the resulting susceptibility tensor is given by

(7.28)

\begin{align} \frac {\omega ^2}{c^2} \bar{\bar { \unicode{x1D653}}}= {} & \sum _s \frac {\beta _{0,s}}{u_{\mathrm{th},s}^2} \begin{pmatrix} \omega ^2 & \quad \mathrm{i} \omega \omega _{\mathrm{c},s} & \quad 0\\[5pt] -\mathrm{i} \omega \omega _{\mathrm{c},s} & \quad \omega ^2 - k_{\perp }^2 u_{\mathrm{th},s}^2 & \quad 0\\[5pt] 0 & \quad 0 & \quad 0\\ \end{pmatrix}\nonumber\\ & - \sum _s \frac {\beta _{0,s}}{{n}_{0,s} u_{\mathrm{th},s}^2} \int \frac { k_{\shortparallel }^2 {f}^{0,\mathrm{CM}}_{s} } {(k_{\shortparallel } {u}_\shortparallel - \omega )^2} \begin{pmatrix} 0 & \quad 0 & \quad 0\\[5pt] 0 & \quad {u}_\tau ^4 k_{\perp }^2 \dfrac {1}{4} & \quad \mathrm{i} \dfrac {k_{\perp }}{k_{\shortparallel }} {u}_\tau ^2 \dfrac {\omega \omega _{\mathrm{c},s}}{2}\\[10pt] 0 & \quad -\mathrm{i} {u}_\tau ^2 \dfrac {k_{\perp }}{k_{\shortparallel }} \dfrac {\omega \omega _{\mathrm{c},s}}{2} & \quad \dfrac {\omega ^2 \omega _{\mathrm{c},s}^2}{k_{\shortparallel }^2} \end{pmatrix} \mathfrak{J}_{0,s} \mathrm{d}^3 {u}, \end{align}

where we recall that the plasma- $\beta$ is defined in (5.64).

The remaining integrals can be expressed in terms of the plasma dispersion function $\mathcal{Z}$ ,

(7.29)

\begin{equation} \mathcal{Z}(\zeta ) \mathrel {\mathop :}= \frac {1}{\sqrt {\pi }} \int _{-\infty }^{\infty } \frac {\mathrm{e}^{-u^2}}{u - \zeta }\, \mathrm{d} u. \end{equation}
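For readers who wish to evaluate $\mathcal{Z}$ numerically, the following is a minimal sketch and not the implementation used in the referenced notebook. It assumes the usual convention that (7.29) defines $\mathcal{Z}$ for $\operatorname{Im} \zeta > 0$ and that other arguments are reached by analytic continuation, and it uses the standard identities $\mathcal{Z}(\zeta ) = \mathrm{i} \sqrt {\pi } \mathrm{e}^{-\zeta ^2} - 2 D(\zeta )$, with $D$ the Dawson function, and $\mathcal{Z}'(\zeta ) = -2 (1 + \zeta \mathcal{Z}(\zeta ))$. The function names are hypothetical.

```python
import cmath
import math


def plasma_dispersion(zeta: complex, max_terms: int = 400) -> complex:
    """Plasma dispersion function Z(zeta), continued analytically to all zeta.

    Uses Z(zeta) = i sqrt(pi) exp(-zeta^2) - 2 D(zeta), where the Dawson
    function is summed as D(zeta) = sum_n (-2)^n zeta^(2n+1) / (2n+1)!!.
    The series suffers from cancellation for large |zeta|, so this sketch
    is only meant for moderate arguments (roughly |zeta| < 5).
    """
    zeta = complex(zeta)
    term = zeta  # n = 0 term of the Dawson series
    dawson = term
    for n in range(1, max_terms):
        # Ratio of consecutive terms is -2 zeta^2 / (2n + 1).
        term *= -2.0 * zeta * zeta / (2 * n + 1)
        dawson += term
        if abs(term) <= 1e-17 * abs(dawson):
            break
    return 1j * math.sqrt(math.pi) * cmath.exp(-zeta * zeta) - 2.0 * dawson


def plasma_dispersion_prime(zeta: complex) -> complex:
    """Derivative Z'(zeta) = -2 (1 + zeta Z(zeta))."""
    return -2.0 * (1.0 + zeta * plasma_dispersion(zeta))


print(plasma_dispersion(0.0), plasma_dispersion_prime(0.0))
```

For large $|\zeta |$, an asymptotic expansion or a library implementation of the Faddeeva function would be a more robust choice than the power series used here.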

Substitution of (7.28) in (7.27) results in

(7.30)

\begin{equation} \bar{\bar { \unicode{x1D63F}}}\widehat {{\boldsymbol{E}}}_1 = {\boldsymbol{0}}_3 , \quad \bar{\bar { \unicode{x1D63F}}} = \frac {\omega ^2}{c^2} \unicode{x1D644}_3 + \frac {\omega ^2}{c^2} \bar{\bar { \unicode{x1D653}}} + \unicode{x1D646}^2, \end{equation}

where the matrix $ \unicode{x1D646}$ is such that

(7.31)

\begin{equation} \unicode{x1D646} {\boldsymbol{S}} = {\boldsymbol{S}} \times {\boldsymbol{k}}, \end{equation}

and the matrix $ \bar{\bar { \unicode{x1D63F}}}$ can explicitly be written as

(7.32)

\begin{align} \bar{\bar { \unicode{x1D63F}}} = {} & \begin{pmatrix} \dfrac {\omega ^2}{c^2} - k_{\shortparallel }^2 & \quad 0 & \quad k_{\shortparallel } k_{\perp }\\[4pt] 0 & \quad \dfrac {\omega ^2}{c^2} - k^2 & \quad 0\\[4pt] k_{\shortparallel } k_{\perp } & \quad 0 & \quad \dfrac {\omega ^2}{c^2} - k_{\perp }^2 \end{pmatrix} + \sum _s \dfrac {\beta _{0,s}}{u_{\mathrm{th},s}^2} \begin{pmatrix} \omega ^2 & \quad \mathrm{i} \omega \omega _{\mathrm{c},s} & \quad 0\\[4pt] -\mathrm{i} \omega \omega _{\mathrm{c},s} & \quad \fbox{$\omega ^2$} - k_{\perp }^2 u_{\mathrm{th},s}^2 & \quad 0\\[4pt] 0 & \quad 0 & \quad 0\\[4pt] \end{pmatrix}\nonumber\\ & - \sum _s \frac {\beta _{0,s}}{u_{\mathrm{th},s}^2} \mathcal{Z}'({\omega } / (k_{\shortparallel } u_{\mathrm{th},s})) \begin{pmatrix} 0 & \quad 0 & \quad 0\\[7pt] 0 & \quad \dfrac {k_{\perp }^2 u_{\mathrm{th},s}^2}{2} & \quad \mathrm{i} \dfrac {k_{\perp } \omega \omega _{\mathrm{c},s}}{2k_{\shortparallel }}\\[12pt] 0 & \quad -\mathrm{i} \dfrac {k_{\perp } \omega \omega _{\mathrm{c},s}}{2k_{\shortparallel }} & \quad \dfrac {\omega ^2 \omega _{\mathrm{c},s}^2}{k_{\shortparallel }^2 u_{\mathrm{th},s}^2} \end{pmatrix}. \end{align}

We highlight the boxed $\omega ^2$ term for the discussion that follows in § 7.3.2.
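The wave-vector part of the first matrix in (7.32) can be checked directly against the definition (7.31). The sketch below assumes, as the structure of (7.32) suggests, that the wave vector is written as ${\boldsymbol{k}} = (k_\perp , 0, k_\shortparallel )$ with the background magnetic field along the third coordinate; the helper names and the numerical values are illustrative only.

```python
def cross_matrix(k):
    """Return the 3x3 matrix K with K S = S x k for every vector S."""
    kx, ky, kz = k
    return [[0.0, kz, -ky],
            [-kz, 0.0, kx],
            [ky, -kx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]


# Hypothetical wave vector in the x-z plane: k = (k_perp, 0, k_par).
k_perp, k_par = 0.3, 0.4
K = cross_matrix((k_perp, 0.0, k_par))
K2 = matmul(K, K)

# Wave-vector part of the first matrix in (7.32), without omega^2 / c^2.
k2 = k_perp**2 + k_par**2
expected = [[-k_par**2, 0.0, k_par * k_perp],
            [0.0, -k2, 0.0],
            [k_par * k_perp, 0.0, -k_perp**2]]

max_diff = max(abs(K2[i][j] - expected[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The agreement reflects the identity $\unicode{x1D646}^2 {\boldsymbol{S}} = ({\boldsymbol{S}} \times {\boldsymbol{k}}) \times {\boldsymbol{k}} = {\boldsymbol{k}} \times ({\boldsymbol{k}} \times {\boldsymbol{S}})$, which is the curl–curl term of (7.27).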

Non-trivial solutions to (7.30) exist if and only if the dispersion matrix given by (7.32) is singular, and to this end, we implicitly define the dispersion relation $\omega ({\boldsymbol{k}})$ via

(7.33)

\begin{equation} \det{\bar{\bar{\unicode{x1D63F}}}}(\omega ({\boldsymbol{k}}), {\boldsymbol{k}}) = 0. \end{equation}
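To illustrate how a relation such as (7.33) can be solved numerically, the following sketch applies a secant iteration in the complex plane to the determinant of a toy dispersion matrix. It is not the procedure of the referenced notebook: the susceptibility is replaced by a hypothetical constant diagonal tensor `chi` so that one root is known in closed form, and all names and values are assumptions made for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix stored as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def toy_dispersion_matrix(omega, k_perp, k_par, chi, c=1.0):
    """(omega/c)^2 (I + diag(chi)) + K^2 for k = (k_perp, 0, k_par)."""
    w2 = (omega / c) ** 2
    k2 = k_perp**2 + k_par**2
    return [[w2 * (1 + chi[0]) - k_par**2, 0.0, k_par * k_perp],
            [0.0, w2 * (1 + chi[1]) - k2, 0.0],
            [k_par * k_perp, 0.0, w2 * (1 + chi[2]) - k_perp**2]]


def secant_root(f, z0, z1, tol=1e-12, max_iter=100):
    """Secant iteration for a complex root of f, started from z0 and z1."""
    f0, f1 = f(z0), f(z1)
    for _ in range(max_iter):
        if f1 == f0:  # no further progress possible
            break
        z0, z1, f0 = z1, z1 - f1 * (z1 - z0) / (f1 - f0), f1
        f1 = f(z1)
        if abs(z1 - z0) <= tol * max(1.0, abs(z1)):
            break
    return z1


# Toy parameters (hypothetical): the middle row decouples, so one root is
# omega = c k / sqrt(1 + chi[1]) = 0.5 / 2 = 0.25.
k_perp, k_par, chi = 0.3, 0.4, (3.0, 3.0, 0.0)
f = lambda w: det3(toy_dispersion_matrix(w, k_perp, k_par, chi))
omega = secant_root(f, 0.24 + 0.0j, 0.26 + 0.0j)
print(omega)
```

For the actual matrix (7.32), the entries depend on $\omega$ through $\mathcal{Z}'$ and the roots are complex, so good initial guesses (for instance the Alfvén frequencies) and a more robust root finder would likely be needed.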

7.3.2. Other models

It is a priori not clear what effect the Darwin approximation has on the dispersive properties of the proposed model and, to this end, we also consider the gyrokinetic Darwin susceptibility tensor, for which a derivation can be found in Appendix L.6. When using a centred Maxwellian, as we did in § 7.3.1, we find that the resulting dispersion matrix is given by (7.32), where the boxed $\omega ^2$ term is removed while the limit of quasi-neutrality yields $c \rightarrow \infty$.

As discussed in § 7.2, the EOMs of the symplectic Brizard–Hahm (BH) model (Brizard & Hahm 2007) coincide with those from the proposed gyrokinetic Maxwell model if $\varepsilon _B = 0$, which is what we assume here. Similarly, we find that the magnetisation term vanishes, as it does in the proposed gyrokinetic model, for the centred Maxwellian background particle distribution function that we consider here. Hence, the only difference between the proposed model and the Brizard–Hahm model, under the current simplifying assumptions, is the missing polarisation current density.

However, when reconsidering (7.23) under the current simplifying assumptions and while including the same gauge condition used in the quasi-neutral gyrokinetic Darwin model, we find that the compatibility constraint reduces to

(7.34)

\begin{equation} \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \frac {\sum _s m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \lambda \right ) = \frac {\partial }{\partial t} \boldsymbol{\nabla }\boldsymbol{\cdot } \left ( \frac {\sum _s m_s {n}_{0,s}}{B_0^2} \boldsymbol{\nabla} _\perp \phi _1 \right ) \end{equation}

upon substitution of (7.18a), and it therefore follows that in this case, $\lambda = \partial \phi _1 / \partial t$. This implies that the Lagrange multiplier for the Brizard–Hahm model restores the polarisation current density and thereby coincides with the quasi-neutral gyrokinetic Darwin model. Note that this only holds under the simplifying assumptions that we use here: the model is linearised, we assume $\varepsilon _B = 0$, use a centred Maxwellian background distribution, and we consider the ZLR limit.

For the parallel-only model, we make use of the dispersion relation provided by Kleiber et al. (2016, (40)).

7.3.3. Comparison of the dispersion relations

The models are compared by numerically evaluating the dispersion relations (the code can be found in Remmerswaal (2023, examples/gauge_invariant.ipynb)). We consider electron and ion (deuteron) species, with $T_{\mathrm{i}} = T_{\mathrm{e}}$, $q_{\mathrm{i}} = - q_{\mathrm{e}}$ and an electron to ion mass ratio given by $m_{\mathrm{e}} / m_{\mathrm{i}} = 1 / 3671$. Furthermore, we non-dimensionalise the frequency and wave vector as

(7.35)

\begin{equation} \check {\omega } \mathrel {\mathop :}= \frac {\omega }{u_{\mathrm{th},\mathrm{i}} k_{\shortparallel }} , \quad \check {{\boldsymbol{k}}} \mathrel {\mathop :}= \varrho _{\mathrm{i}} {\boldsymbol{k}} . \end{equation}

In this case, we expect to find a shear and compressional Alfvén wave, which have the following frequencies (in the quasi-neutral limit $c \rightarrow \infty$ ):

(7.36)

\begin{equation} \check {\omega }_{\mathrm{As}} \mathrel {\mathop :}= \frac {1}{\sqrt {{\beta _{0}}}} , \quad \check {\omega }_{\mathrm{Ac}} \mathrel {\mathop :}= \frac {k}{k_{\shortparallel }} \check {\omega }_{\mathrm{As}} , \end{equation}

respectively. We note that the shear Alfvén frequency is constant with respect to $\check {k}_\perp$ , whereas the compressional Alfvén frequency increases linearly with increasing wavenumber. Hence, the compressional Alfvén wave is a fast wave. This is especially true when small perpendicular length scales are considered (turbulence), as in such a case, the compressional Alfvén frequency becomes comparable to the cyclotron frequency

(7.37)

\begin{equation} \varepsilon _\perp \sim 1, \quad \omega = \omega _{\mathrm{Ac}} \quad \implies \quad \varepsilon _\omega = \frac {\omega _{\mathrm{Ac}}}{\omega _{\mathrm{c},\mathrm{i}}} = \frac {\check {k}}{\sqrt {\beta _{0}}} \sim 1. \end{equation}
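The scalings in (7.36) and (7.37) can be made concrete with a short calculation. The sketch below assumes that $k$ denotes the magnitude of the wave vector, $k = (k_\shortparallel ^2 + k_\perp ^2)^{1/2}$, and that both components are normalised by $\varrho _{\mathrm{i}}$ as in (7.35). The parallel wavenumber and $\beta _0$ are those of figure 2, whereas the perpendicular wavenumbers and the function names are illustrative choices.

```python
import math


def alfven_frequencies(k_par: float, k_perp: float, beta0: float):
    """Normalised shear and compressional Alfven frequencies from (7.36)."""
    k = math.hypot(k_par, k_perp)
    omega_shear = 1.0 / math.sqrt(beta0)
    omega_compressional = (k / k_par) * omega_shear
    return omega_shear, omega_compressional


def frequency_ordering(k_par: float, k_perp: float, beta0: float) -> float:
    """epsilon_omega = k / sqrt(beta0) for the compressional wave, cf. (7.37)."""
    return math.hypot(k_par, k_perp) / math.sqrt(beta0)


# k_par and beta0 as in figure 2; the k_perp values are illustrative.
for k_perp in (0.01, 0.1, 1.0):
    shear, compressional = alfven_frequencies(2e-3, k_perp, 0.1)
    eps = frequency_ordering(2e-3, k_perp, 0.1)
    print(f"k_perp={k_perp:5.2f}  shear={shear:.3f}  "
          f"compressional={compressional:.1f}  eps_omega={eps:.3f}")
```

The shear frequency does not change with the perpendicular wavenumber, while $\varepsilon _\omega$ grows linearly with it and exceeds unity once $\check {k}_\perp$ approaches one for this value of $\beta _0$.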

The presence of the compressional Alfvén wave in the proposed gyrokinetic model is therefore incompatible with the low-frequency approximation of the first-order generating function (4.59), which relies on $\varepsilon _\omega \ll 1$ . This incompatibility seems to suggest that we have made a mistake in the derivation of the proposed gyrokinetic model. The origin of this issue lies in the low-frequency approximation of the first-order generating function as discussed in § 4.5.2, where we have intentionally kept the gauge-invariant parts of the right-hand side of (4.53). While this ensures gauge invariance, it neglects the fact that the part of the electric field that comes from the vector potential has a time derivative, and this term would therefore be neglected if we had neglected all $O(\varepsilon _\omega )$ terms.

In figure 2, we show the dispersion relations for a fixed value of $\check {k}_\shortparallel = 2 \times 10^{-3}$ and $\beta _{0} = 10 \,\%$. The gyrokinetic Maxwell model with $(\xi _R, \xi _\varTheta ) = (1, 0)$ results in two waves, which have the correct real part of the frequency, whereas the imaginary part has the correct sign, which results in damping. The dispersion relations of the other models result only in the shear Alfvén wave, as expected.

To further explore the parameter space, we use the geometrical parameters of the tokamak fusion devices ASDEX Upgrade (AUG) and ITER as well as the stellarator Wendelstein 7-X (W7-X) to determine the value of the wave vector $\check {{\boldsymbol{k}}}$ . For both the parallel and perpendicular direction, we consider the lowest non-trivial wavenumber

(7.38)

\begin{equation} \check {k}_\shortparallel = \frac {\varrho _{\mathrm{i}}}{R_0}, \quad \check {k}_\perp = \frac {2\pi \varrho _{\mathrm{i}}}{a\sqrt {4 + \pi ^2}} , \end{equation}

where $R_0$ and $a$ denote the major and minor radii of the device, respectively. The values of $R_0, a, \varrho _{\mathrm{i}}$ are shown in table 1 and are taken from Zoni & Possanner (2021) for the tokamak fusion devices and from Grieger et al. (1992) and Klinger et al. (2019) for the stellarator W7-X. We moreover show the resulting wavenumbers according to (7.38).
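As a small worked example of (7.38), the following sketch evaluates the two normalised wavenumbers from a set of length scales. The numerical inputs are placeholders chosen for illustration and are not the values of table 1; the function name is likewise hypothetical.

```python
import math


def lowest_wavenumbers(rho_i: float, major_radius: float, minor_radius: float):
    """Lowest non-trivial normalised wavenumbers according to (7.38)."""
    k_par = rho_i / major_radius
    k_perp = (2.0 * math.pi * rho_i
              / (minor_radius * math.sqrt(4.0 + math.pi**2)))
    return k_par, k_perp


# Placeholder lengths in metres, NOT the values of table 1.
k_par, k_perp = lowest_wavenumbers(rho_i=3.0e-3, major_radius=2.0,
                                   minor_radius=0.5)
print(k_par, k_perp)
```

Both wavenumbers scale linearly with $\varrho _{\mathrm{i}}$, so their ratio depends only on the aspect ratio $R_0 / a$ of the device.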

Table 1. The length scales used for determining the wave vector $\check {{\boldsymbol{k}}}$, as obtained from Zoni & Possanner (2021). The non-dimensional wavenumbers $\check {k}_\shortparallel$ and $\check {k}_\perp$ are computed according to (7.38).

Figure 2. Dispersion relations for a fixed value of $\check {k}_\shortparallel = 2 \times 10^{-3}$ and $\beta _{0} = 10 \,\%$ . The black dotted line corresponds to $\check {\omega } = \check {\omega }_{\mathrm{As}}$ , whereas the black dashed line corresponds to $\check {\omega } = \check {\omega }_{\mathrm{Ac}}$ .

Figure 3. Dispersion relations for fixed values of $\check {k}_\perp$ and $\check {k}_\shortparallel$ , as determined from table 1. Only the shear Alfvén wave is shown.

For each of the three models and each of the three machines, we compute the dispersion curve where we vary $\beta _{0}$ and keep the wave vectors fixed. The results are shown in figure 3. We find that the real parts of the dispersion curves all overlap and agree with (7.36), which suggests that the shear Alfvén frequency depends only on $\beta _{0}$. When considering the imaginary part of the dispersion curve, we find differences not only between the machines, but also between the models, in particular, for larger values of $\beta _{0}$. The gyrokinetic Maxwell and quasi-neutral gyrokinetic Darwin models agree well, which shows that the removal of the compressional Alfvén wave has not altered the shear Alfvén wave. Moreover, we find that the parallel-only model yields a different imaginary part of the shear Alfvén frequency.

7.4. Summary of comparison

We have compared five different gyrokinetic models, all of which are derived from an action principle. An overview of the most essential properties is given in table 2, wherein we have also included the symplectic part of the gyrocentre single-particle phase-space Lagrangian $ { {{\boldsymbol{\gamma }}}}_{1,{\boldsymbol{R}}}$ – with all other components equal to zero – which defines the gyrocentre coordinate transformation. We summarise the comparison of the models.

Table 2. Properties of the different gyrokinetic models under consideration. The two models proposed in this paper are in boldface: the gyrokinetic Maxwell model with $(\xi _R, \xi _\varTheta ) = (1, 0)$ and its corresponding quasi-neutral gyrokinetic Darwin approximation. $^\dagger$ This can be a ‘yes (Y)’ if the approach from Qin et al. (1999) is followed. $^\ast$ If the polarisation current density is kept, then the compressional Alfvén wave is present and the Lagrange multiplier vanishes, but if the polarisation current density is neglected, then the compressional Alfvén wave is absent and the Lagrange multiplier is needed to restore the bound-charge continuity equation.

The parallel-only model (Kleiber et al. 2016), as described in § 7.1, neglects the perpendicular part of the vector potential and thereby results in two scalar, well-posed, perpendicular Poisson-type field equations which decouple for a centred Maxwellian background particle distribution function. The model is not gauge-invariant. Due to the absence of the perpendicular part of the vector potential, we find that the EOMs of the parallel-only model do not contain a grad-$B_{1,\shortparallel }$ drift. This model is still the ‘workhorse’, and it is widely used for the global simulation of turbulence in fusion devices (Garbet et al. 2010; Mishchenko et al. 2023). It is favourable if more complex models reduce to this well-known model in the limiting case ${\boldsymbol{A}}_{1,\perp } = {\boldsymbol{0}}_3$, not only because it is reassuring that the traditional parallel-only model is recovered in this limit, but also because it implies that any differences in simulation results are only due to the presence of the perpendicular part of the vector potential when results are compared between the parallel-only model and a more complex gyrokinetic model. In particular, differences in simulation results cannot be due to a different choice of coordinates.

The symplectic Brizard–Hahm model (Brizard & Hahm 2007) (see § 7.2) contains the full vector potential and results in a curl–curl-type Ampère law which is coupled to the same quasi-neutrality equation as found in the parallel-only model. This model is also not gauge-invariant and reduces to the parallel-only model when neglecting the perpendicular part of the vector potential. The grad-$B_{1,\shortparallel }$ drift is present in the EOMs; this is the case for each of the gyrokinetic models which have the full vector potential. Interestingly, the second-order Hamiltonian of the Brizard–Hahm model coincides with that of the parallel-only model. This implies that this model has no polarisation current density, and the magnetisation effects do not depend on the perpendicular part of the vector potential. Moreover, the magnetisation current density is not the curl of a magnetisation, and we find that the continuity equation is not satisfied for the gyro-average adjoint of the free charge and current density. For those reasons, the total continuity equation, consisting of the free and bound charge density, is not satisfied, which is a necessary condition for well-posedness of the field equations. It follows that well-posedness of the Brizard–Hahm model (Brizard & Hahm 2007) requires the introduction of a Lagrange multiplier, which affects both the EOMs and the field equations in an unphysical way.

The gyrokinetic model from Burby & Brizard (2019) has played an essential role in this paper, as it has guided us to the development of a family of gauge-invariant gyrokinetic models. Their model coincides with the parameter choice $(\xi _R, \xi _\varTheta ) = (0, 0)$. To suppress the compressional Alfvén wave, they propose to neglect the polarisation current density altogether, as it is a higher-order contribution in $\varepsilon _\omega$. This choice, however, breaks the bound-charge continuity equation and therefore requires the introduction of a Lagrange multiplier to restore it. Moreover, letting $\xi _R = 0$ results in a model where the symplectic part is essentially drift kinetic and does not reduce to the parallel-only model if the perpendicular part of the vector potential is neglected.

Instead, we propose to use $(\xi _R, \xi _\varTheta ) = (1, 0)$ , which results in a gauge-invariant gyrokinetic model for which all terms in both the EOMs and the field equations can be interpreted from a physical point of view. Moreover, it yields the smallest coordinate transformation (see the discussion at the end of § 4.5.2) while resulting in a gauge-invariant model wherein the gyrocentre magnetic moment is an invariant. This model does reduce to the parallel-only model if the perpendicular part of the vector potential is neglected. The gyrocentre single-particle Lagrangian and resulting Vlasov–Maxwell action principle of this model are derived and described in detail in §§ 4 and 5, respectively. We find that the continuity equation is satisfied for the gyro-average adjoint of the free charge and current density, and, since the model is gauge-invariant and derived from an action principle, it follows that the bound-charge continuity equation is satisfied as well. These are necessary conditions for well-posedness of the field equations, which are satisfied without the need of a Lagrange multiplier.

The presence of the polarisation current density in the proposed gyrokinetic Maxwell model results in a compressional Alfvén wave, which is often undesired due to its relatively high frequency. To this end, the quasi-neutral Darwin approximation can be applied to the proposed gyrokinetic Maxwell model (see § 6). Therein, we consider the limit of quasi-neutrality and remove the part of the polarisation current density that is responsible for the compressional Alfvén wave (as we have demonstrated by making use of the dispersion relation), while retaining the compatibility of the field equations as well as gauge invariance of the model.

Finally, we have derived a dispersion relation for each of the models, which has confirmed the expected properties of the models: each model contains the shear Alfvén wave and only the proposed gyrokinetic Maxwell model includes the compressional Alfvén wave. In both cases, the real part of the frequency agrees well with the theory, whereas the imaginary part is negative and thereby results in damping of the wave.

8. Conclusions

Motivated by the need for a more complete gyrokinetic model, wherein the perpendicular part of the vector potential is kept, we have discussed the gyrocentre coordinate transformation in detail. The purpose of the gyrocentre coordinate transformation is to transform the perturbed guiding-centre single-particle phase-space Lagrangian in such a way that it becomes independent of the gyro-phase. This results in a reduction of the phase-space dimension from six to five (the gyrocentre position, the parallel velocity and the invariant magnetic moment) and when moreover considering the limit of quasi-neutrality, removes the fastest time scales from the large range of length and time scales present in the Vlasov–Maxwell model.

When gyrokinetic modelling is considered at the level of the Lagrangian, which thereby utilises a variational principle, it is ensured that the gyrokinetic model is structure preserving. In particular, we find that energy is conserved regardless of the modelling error introduced by the truncated gyrocentre coordinate transformation. However, an aspect that is often overlooked in gyrokinetics is the property of gauge invariance (Brizard & Hahm 2007), which ensures that the model is invariant under the gauge transformation, as is the case for the Vlasov–Maxwell model. In particular, the traditionally used parallel-only models are not gauge-invariant. To this end, we have generalised the approach proposed by Burby & Brizard (2019), wherein a gauge-invariant gyrokinetic model is introduced. We have derived sufficient conditions on the gyrocentre coordinate transformation which ensure that the resulting gyrocentre single-particle phase-space Lagrangian is gauge-invariant. Despite this additional restriction, the gyrocentre coordinate transformation is by no means uniquely defined, and this approach therefore results in a family of gauge-invariant gyrokinetic models. The family of models is parametrised by two parameters $\xi _R, \xi _\varTheta$, where the model from Burby & Brizard (2019) coincides with $(\xi _R, \xi _\varTheta ) = (0, 0)$. Our derivation is presented by making use of vector calculus, as opposed to the more customarily used formalism of differential geometry.

In an effort to obtain a gyrokinetic model for which each of the equations of motion as well as the field equations can be interpreted from a physical point of view, we have chosen $(\xi _R, \xi _\varTheta ) = (1, 0)$. This choice leads to the smallest coordinate transformation which results in a gauge-invariant model wherein the gyrocentre magnetic moment is an invariant. We find that the proposed model reduces to the parallel-only model when the perpendicular component of the vector potential is neglected, which does not hold for the gauge-invariant model from Burby & Brizard (2019). The resulting model has been derived to second order in the perturbation parameter $\varepsilon _\delta$, and the second-order part of the Lagrangian contains polarisation and magnetisation effects which have a clear physical meaning. Due to gauge invariance, the model can be expressed directly in terms of the electromagnetic fields rather than the potentials. The gyrokinetic model thereby results in the macroscopic Maxwell’s equations. Moreover, we have derived equilibrium conditions on the background distribution function and magnetic field which justify the smallness of the perturbation parameter $\varepsilon _\delta$.

We find that the proposed gyrokinetic Maxwell model possesses a magnetisation current density which is the curl of a magnetisation. In addition, it has a polarisation current density, and we find that the free-charge continuity equation holds for the gyro-average adjoints of the free charge and current density. Each of those three properties is essential for showing that the bound-charge continuity equation is naturally satisfied, which is necessary for the well-posedness of the field equations. Hence, unlike the Brizard–Hahm model (Brizard & Hahm 2007), we find that the field equations of the proposed gyrokinetic Maxwell model are compatible without the need of a Lagrange multiplier. A brief summary of the comparison between each of the models under consideration is found in table 2.

In addition, we have derived the gyrokinetic susceptibility tensor, which covers the material properties of the linearised gyrokinetic model in the macroscopic Maxwell’s equations for each of the models under consideration. In the zero Larmor radius limit, we find that the gyrokinetic susceptibility tensor of the proposed gyrokinetic Maxwell model agrees with that of the Vlasov–Maxwell system. Moreover, the resulting dispersion relation shows that, in addition to the usual shear Alfvén wave, a fast compressional Alfvén wave is present in the proposed gyrokinetic Maxwell model. Due to the potentially high frequency of this wave, we have proposed a quasi-neutral Darwin approximation to the proposed gyrokinetic Maxwell model which successfully removes the fast compressional Alfvén wave. We find that the quasi-neutral gyrokinetic Darwin model is still well-posed and gauge-invariant.

In future work, we plan to implement the two proposed models in the gyrokinetic particle-in-cell code EUTERPE (Jost et al. 2001; Kleiber et al. 2024) and compare them with the well-established and traditionally used parallel-only model.

Acknowledgements

We thank A. Bottino, M. Campos Pinto, R. Kleiber, A. Könies, P. Lauber, O. Maj, A. Mishchenko, S. Possanner and B. Scott for helpful discussions. We also thank our anonymous referees for their valuable comments, which have led to significant improvement of the manuscript and, in particular, to the insight that the quasi-neutral gyrokinetic Darwin model is gauge-invariant.

Editor Paolo Ricci thanks the referees for their advice in evaluating this article.

Funding

This work has been carried out within the framework of the EUROfusion Consortium, funded by the European Union via the Euratom Research and Training Programme (Grant Agreement No 101052200 – EUROfusion). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them.

Declaration of interests

The authors report no conflict of interest.

Appendix A. Inversion of the Lagrange matrix

We consider the inversion of the Lagrange matrix

(A.1)

\begin{equation} \unicode{x1D652} = \begin{pmatrix} \unicode{x1D652}_{11} & \quad \unicode{x1D652}_{12}\\[5pt] - \unicode{x1D652}_{12}^{\,\intercal} & \quad \unicode{x1D652}_{22} \end{pmatrix}, \end{equation}

where

(A.2)

\begin{equation} \unicode{x1D652}_{11} = \begin{pmatrix} q \unicode{x1D63D} & \quad -m {{\boldsymbol{\hat {b}}}_0}\\[5pt] m {{\boldsymbol{\hat {b}}}_0}^\intercal & \quad 0 \end{pmatrix},\quad \unicode{x1D652}_{12} = \begin{pmatrix} \dfrac {m}{q} {\boldsymbol{w}} & \quad {\boldsymbol{0}}_3\\[9pt] 0 & \quad 0\\[10pt] \end{pmatrix}, \quad \unicode{x1D652}_{22} = \begin{pmatrix} 0 & \quad \dfrac {m}{q} \\[14pt] -\dfrac {m}{q} & \quad 0\\ \end{pmatrix}. \end{equation}

Here, $ \unicode{x1D63D}$ is such that

(A.3)

\begin{equation} \unicode{x1D63D} {\boldsymbol{S}} = {\boldsymbol{S}} \times {\boldsymbol{B}} \end{equation}

for some magnetic field $\boldsymbol{B}$ .

The inverse of $ \unicode{x1D652}_{22}$ is readily given by

(A.4)

\begin{equation} \unicode{x1D652}_{22}^{\,-1} = \begin{pmatrix} 0 & \quad- \dfrac {q}{m}\\[9pt] \dfrac {q}{m} & \quad 0\\ \end{pmatrix}. \end{equation}

Using the Schur complement

(A.5)

\begin{equation} \unicode{x1D64E} = \unicode{x1D652}_{11} + \unicode{x1D652}_{12} \unicode{x1D652}_{22}^{\,-1} \unicode{x1D652}_{12}^{\,\intercal} , \end{equation}

we find that the inverse of the $2\times 2$ block matrix $ \unicode{x1D652}$ is given by

(A.6)

\begin{equation} \unicode{x1D652}^{\,-1} = \begin{pmatrix} \unicode{x1D64E}^{-1} & \quad - \unicode{x1D64E}^{-1} \unicode{x1D652}_{12} \unicode{x1D652}_{22}^{\,-1}\\[6pt] \unicode{x1D652}_{22}^{\,-1} \unicode{x1D652}_{12}^{\,\intercal} \unicode{x1D64E}^{-1} & \quad \unicode{x1D652}_{22}^{\,-1} - \unicode{x1D652}_{22}^{\,-1} \unicode{x1D652}_{12}^{\,\intercal} \unicode{x1D64E}^{-1} \unicode{x1D652}_{12} \unicode{x1D652}_{22}^{\,-1} \end{pmatrix}. \end{equation}

The expressions for the block matrices result in

(A.7)

\begin{equation} \unicode{x1D64E} = \unicode{x1D652}_{11} \end{equation}

for which the inverse is given by

(A.8)

\begin{equation} \unicode{x1D64E}^{-1} = \begin{pmatrix} -\dfrac { \unicode{x1D63D}_0}{q B_0 B_{\shortparallel }} & \quad \dfrac {{\boldsymbol{B}}}{m B_{\shortparallel }}\\[15pt] -\dfrac {{\boldsymbol{B}}^\intercal }{m B_{\shortparallel }} & \quad 0 \end{pmatrix} \end{equation}

as can be verified by computing the product

(A.9)

\begin{equation} \unicode{x1D64E} \unicode{x1D64E}^{-1} = \begin{pmatrix} q \unicode{x1D63D} & \quad -m {{\boldsymbol{\hat {b}}}_0}\\[5pt] m {{\boldsymbol{\hat {b}}}_0}^\intercal & \quad 0 \end{pmatrix} \begin{pmatrix} -\dfrac { \unicode{x1D63D}_0}{q B_0 B_{\shortparallel }} & \quad \dfrac {{\boldsymbol{B}}}{m B_{\shortparallel }}\\[15pt] -\dfrac {{\boldsymbol{B}}^\intercal }{m B_{\shortparallel }} & \quad 0 \end{pmatrix} = \begin{pmatrix} \dfrac {- \unicode{x1D63D} \unicode{x1D63D}_0 + {\boldsymbol{B}}_0 {\boldsymbol{B}}^\intercal }{B_0 B_{\shortparallel }} & \quad \dfrac {q \unicode{x1D63D}{\boldsymbol{B}}}{m B_{\shortparallel }} \\[15pt] -\dfrac {m {{\boldsymbol{\hat {b}}}_0}^\intercal \unicode{x1D63D}_0}{q B_0 B_{\shortparallel }} & \quad 1 \end{pmatrix} \end{equation}

and observing that using (3.41),

(A.10)

\begin{equation} \unicode{x1D63D} {\boldsymbol{B}} = {\boldsymbol{B}} \times {\boldsymbol{B}} = {\boldsymbol{0}}, \quad \unicode{x1D63D}_0^\intercal {{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{B}}_{0} \times {{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{0}} \end{equation}

as well as

(A.11)

\begin{align} \left ( - \unicode{x1D63D} \unicode{x1D63D}_0 + {\boldsymbol{B}}_0 {\boldsymbol{B}}^\intercal \right ) {\boldsymbol{S}} &= - \unicode{x1D63D} ({\boldsymbol{S}} \times {\boldsymbol{B}}_0) + {\boldsymbol{B}}_0 ({\boldsymbol{B}} \boldsymbol{\cdot } {\boldsymbol{S}})\nonumber \\ &= -({\boldsymbol{S}} \times {\boldsymbol{B}}_0) \times {\boldsymbol{B}} + {\boldsymbol{B}}_0 ({\boldsymbol{B}} \boldsymbol{\cdot } {\boldsymbol{S}})\nonumber \\ &= B_{\shortparallel } B_0 {\boldsymbol{S}} \end{align}

for any vector $\boldsymbol{S}$ , which implies that $- \unicode{x1D63D} \unicode{x1D63D}_0 + {\boldsymbol{B}}_0 {\boldsymbol{B}}^\intercal = B_{\shortparallel } B_0 \unicode{x1D644}_3$ .

Using these intermediate results, we are ready to evaluate the blocks of the inverse of $ \unicode{x1D652}$

(A.12)

\begin{equation} ( \unicode{x1D652}^{\,-1})_{12} = - \unicode{x1D64E}^{-1} \unicode{x1D652}_{12} \unicode{x1D652}_{22}^{\,-1} = \begin{pmatrix} {\boldsymbol{0}}_3 & \quad - \dfrac {{\boldsymbol{w}} \times {{\boldsymbol{\hat {b}}}_0}}{q B_{\shortparallel }} \\[15pt] 0 & \quad -\dfrac {{\boldsymbol{B}} \boldsymbol{\cdot } {\boldsymbol{w}}}{m B_{\shortparallel }}\\ \end{pmatrix} \end{equation}

and

(A.13)

\begin{equation} ( \unicode{x1D652}^{\,-1})_{22} = \unicode{x1D652}_{22}^{\,-1} - \unicode{x1D652}_{22}^{\,-1} \unicode{x1D652}_{12}^{\,\intercal} \unicode{x1D64E}^{-1} \unicode{x1D652}_{12} \unicode{x1D652}_{22}^{\,-1} = \begin{pmatrix} 0 & \quad - \dfrac {q}{m} \\[10pt] \dfrac {q}{m} & \quad 0 \\ \end{pmatrix} \end{equation}

such that the inverse of the Lagrange matrix is given by

(A.14)

\begin{equation} \unicode{x1D652}^{\,-1} = \begin{pmatrix} -\dfrac { \unicode{x1D63D}_0}{q B_0 B_{\shortparallel }} & \quad \dfrac {{\boldsymbol{B}}}{m B_{\shortparallel }} & \quad {\boldsymbol{0}}_3 & \quad - \dfrac {{\boldsymbol{w}} \times {{\boldsymbol{\hat {b}}}_0}}{q B_{\shortparallel }}\\[15pt] -\dfrac {{\boldsymbol{B}}^\intercal }{m B_{\shortparallel }} & \quad 0 & \quad 0 & \quad -\dfrac {{\boldsymbol{B}} \boldsymbol{\cdot } {\boldsymbol{w}}}{m B_{\shortparallel }}\\[15pt] {\boldsymbol{0}}_3^\intercal & \quad 0 & \quad 0 & \quad -\dfrac {q}{m}\\[10pt] \dfrac {({\boldsymbol{w}} \times {{\boldsymbol{\hat {b}}}_0})^\intercal }{q B_{\shortparallel }} & \quad \dfrac {{\boldsymbol{B}} \boldsymbol{\cdot } {\boldsymbol{w}}}{m B_{\shortparallel }} & \quad \dfrac {q}{m}& \quad 0\\ \end{pmatrix}. \end{equation}
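
The closed-form inverse (A.14) can be sanity-checked numerically. The following is a minimal sketch in plain Python, not part of the original derivation: it assembles the matrix from (A.1) and (A.2), assembles the candidate inverse from (A.14), and measures how far their product is from the identity. The function names and all numerical values ($q$, $m$, ${\boldsymbol{\hat{b}}}_0$, $B_0$, ${\boldsymbol{B}}$, ${\boldsymbol{w}}$) are arbitrary illustrative choices; the only requirements assumed are that ${\boldsymbol{\hat{b}}}_0$ is a unit vector and that $B_\shortparallel = {\boldsymbol{\hat{b}}}_0 \boldsymbol{\cdot} {\boldsymbol{B}} \neq 0$.

```python
# Numerical sanity check of (A.14): W times W_inv should be the 6x6 identity.
# All numerical values are arbitrary test data, not taken from the paper.

def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_matrix(v):
    """Matrix M with M s = s x v for every 3-vector s, cf. (A.3)."""
    x, y, z = v
    return [[0.0, z, -y], [-z, 0.0, x], [y, -x, 0.0]]

def lagrange_matrix(q, m, b0, B, w):
    """Assemble the 6x6 matrix from the blocks in (A.1)-(A.2)."""
    W = [[0.0] * 6 for _ in range(6)]
    Bm = cross_matrix(B)
    for i in range(3):
        for j in range(3):
            W[i][j] = q * Bm[i][j]      # upper-left block of W_11
        W[i][3] = -m * b0[i]            # -m b0
        W[3][i] = m * b0[i]             # m b0^T
        W[i][4] = (m / q) * w[i]        # first column of W_12
        W[4][i] = -(m / q) * w[i]       # first row of -W_12^T
    W[4][5] = m / q                     # W_22
    W[5][4] = -m / q
    return W

def lagrange_matrix_inverse(q, m, b0, B0, B, w):
    """Closed-form inverse (A.14); B0 is the magnitude of the background field."""
    B_par = dot(b0, B)
    B0m = cross_matrix([B0 * c for c in b0])
    wxb = cross(w, b0)
    Bw = dot(B, w)
    V = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            V[i][j] = -B0m[i][j] / (q * B0 * B_par)
        V[i][3] = B[i] / (m * B_par)
        V[3][i] = -B[i] / (m * B_par)
        V[i][5] = -wxb[i] / (q * B_par)
        V[5][i] = wxb[i] / (q * B_par)
    V[3][5] = -Bw / (m * B_par)
    V[5][3] = Bw / (m * B_par)
    V[4][5] = -q / m
    V[5][4] = q / m
    return V

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_identity_error(q, m, b0, B0, B, w):
    """Largest entry of |W W_inv - I|."""
    P = matmul(lagrange_matrix(q, m, b0, B, w),
               lagrange_matrix_inverse(q, m, b0, B0, B, w))
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(6) for j in range(6))

b0 = [0.0, 0.6, 0.8]          # unit vector along the background field
B = [0.3, 1.1, 1.9]           # arbitrary field with b0 . B != 0
err = max_identity_error(q=1.5, m=2.0, b0=b0, B0=2.5, B=B, w=[0.7, -0.4, 0.2])
print(err)
```

For generic inputs the reported error should be at the level of floating-point round-off; the check degrades as $B_\shortparallel$ approaches zero, where (A.14) is singular.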

Appendix B. Gyrocentre coordinate transformation

The gyrocentre Lagrangian $\bar {\bar {L}}$ is defined according to (4.6). The transformation rules for the Hamiltonian and symplectic part follow directly by substituting (4.5) into (4.6) and by subsequently making use of a Taylor series expansion centred around the gyrocentre coordinate $\bar {\bar {{\boldsymbol{Z}}}}$ . The contribution due to the generating function $\bar {\bar {S}}$ is omitted here.

For the guiding-centre Hamiltonian, this results in

(B.1)

\begin{align} \bar {H}(\bar {{\boldsymbol{Z}}}) = \bar {H} + \frac {\partial \bar {H}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \left ( - \bar {\bar {{\boldsymbol{G}}}}_1 + \frac {1}{2} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 - \bar {\bar {{\boldsymbol{G}}}}_2 \right ) + \frac {1}{2} \frac {\partial ^2 \bar {H}}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \big( \bar {\bar {{\boldsymbol{G}}}}_1 \otimes \bar {\bar {{\boldsymbol{G}}}}_1 \big) + O(\varepsilon _\delta ^3) \end{align}

upon substitution of (4.5) and using a Taylor-series expansion of $\bar {H}$ centred around the gyrocentre coordinate $\bar {\bar {{\boldsymbol{Z}}}}$ . We use the notational convention that an absence of arguments implies evaluation at the gyrocentre coordinate, and we let $\otimes$ denote the outer product of two vectors

(B.2a)

\begin{equation} ({\boldsymbol{a}} \otimes {\boldsymbol{b}})_{ij} = a_i b_j \end{equation}

such that

(B.2b)

\begin{align} \unicode{x1D648} \boldsymbol{:} ({\boldsymbol{a}} \otimes {\boldsymbol{b}}) = \sum _{i,j = 1}^{3} \unicode{x1D648}_{ij} a_i b_j. \end{align}

This expression can be simplified to

(B.3)

\begin{equation} \bar {H}(\bar {{\boldsymbol{Z}}}) = \bar {H} - \frac {\partial \bar {H}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \left ( \bar {\bar {{\boldsymbol{G}}}}_1 + \bar {\bar {{\boldsymbol{G}}}}_2 \right ) + \frac {1}{2} \left [ \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}}\left ( \frac {\partial \bar {H}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ] \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 + O(\varepsilon _\delta ^3) . \end{equation}
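
The step from (B.1) to (B.3) rests only on the product rule. As a sketch, assuming that $(\partial \bar{\bar{{\boldsymbol{G}}}}_1 / \partial \bar{\bar{{\boldsymbol{Z}}}}) \bar{\bar{{\boldsymbol{G}}}}_1$ denotes the Jacobian-vector product with components $(\partial \bar{\bar{G}}_{1,j} / \partial \bar{\bar{Z}}_k) \bar{\bar{G}}_{1,k}$ and that repeated indices are summed, the last term of (B.3) expands as

\begin{equation*} \frac {1}{2} \frac {\partial }{\partial \bar {\bar {Z}}_k} \left ( \frac {\partial \bar {H}}{\partial \bar {\bar {Z}}_j} \bar {\bar {G}}_{1,j} \right ) \bar {\bar {G}}_{1,k} = \frac {1}{2} \frac {\partial ^2 \bar {H}}{\partial \bar {\bar {Z}}_j \partial \bar {\bar {Z}}_k} \bar {\bar {G}}_{1,j} \bar {\bar {G}}_{1,k} + \frac {1}{2} \frac {\partial \bar {H}}{\partial \bar {\bar {Z}}_j} \frac {\partial \bar {\bar {G}}_{1,j}}{\partial \bar {\bar {Z}}_k} \bar {\bar {G}}_{1,k} , \end{equation*}

which reproduces the two terms of (B.1) that are quadratic in $\bar{\bar{{\boldsymbol{G}}}}_1$ .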

The same approach is followed for the symplectic part of the guiding-centre Lagrangian. Let $\bar {\varGamma }(\bar {{\boldsymbol{Z}}}) = \bar {{\boldsymbol{\gamma }}}(\bar {{\boldsymbol{Z}}}) \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{Z}}}}$ . Substitution of (4.5) results in

(B.4)

\begin{align} \bar {\varGamma }(\bar {{\boldsymbol{Z}}}) = {} & \left [ \bar {{\boldsymbol{\gamma }}} - \frac {\partial \bar {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \left ( \bar {\bar {{\boldsymbol{G}}}} - \frac {1}{2} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) + \frac {1}{2} \frac {\partial ^2 \bar {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \left ( \bar {\bar {{\boldsymbol{G}}}}_1 \otimes \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ] \nonumber \\[5pt] & \boldsymbol{\cdot } \left [ \dot {\bar {\bar {{\boldsymbol{Z}}}}} - \frac {{\mathrm{d}} }{{\mathrm{d}} t} \left ( \bar {\bar {{\boldsymbol{G}}}} - \frac {1}{2} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ] + O(\varepsilon _\delta ^3) , \end{align}

where we write $\bar {\bar {{\boldsymbol{G}}}} \mathrel {\mathop :}= \bar {\bar {{\boldsymbol{G}}}}_1 + \bar {\bar {{\boldsymbol{G}}}}_2$ for brevity, and we use the notational convention that an absence of arguments implies evaluation at the gyrocentre coordinate $\bar {\bar {{\boldsymbol{Z}}}}$ . We are permitted to add a total derivative by virtue of considering an action principle. We add the total derivative of

(B.5)

\begin{equation} \left ( \bar {{\boldsymbol{\gamma }}} - \frac {1}{2} \frac {\partial \bar {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}} \right ) \boldsymbol{\cdot } \left ( \bar {\bar {{\boldsymbol{G}}}} - \frac {1}{2} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \end{equation}

resulting in the following symplectic part of the gyrocentre Lagrangian:

(B.6)

\begin{align} \bar {\bar {{\boldsymbol{\gamma }}}} = {} & \bar {{\boldsymbol{\gamma }}} + \bar { \unicode{x1D652}} \left ( \bar {\bar {{\boldsymbol{G}}}} - \frac {1}{2} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) + \frac {1}{2} \frac {\partial ^2 \bar {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \big( \bar {\bar {{\boldsymbol{G}}}}_1 \otimes \bar {\bar {{\boldsymbol{G}}}}_1 \big) \nonumber \\ & + \frac {1}{2} \left ( \left [ \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right ]^\intercal \frac {\partial \bar {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} - \left [ \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}} \left ( \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ]^\intercal \right ) \bar {\bar {{\boldsymbol{G}}}}_1 , \end{align}

where the following piece of the symplectic part of the transformed guiding-centre Lagrangian moves to the Hamiltonian part of the gyrocentre Lagrangian (while switching sign):

(B.7)

\begin{equation} \frac {\partial }{\partial t} \left ( \bar {{\boldsymbol{\gamma }}}_1 - \frac {1}{2} \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1, \end{equation}

where we have made use of $\partial \bar {{\boldsymbol{\gamma }}}_0 / \partial t = {\boldsymbol{0}}_6$ .

To further simplify (B.6), we write it component-wise as

(B.8)

\begin{align} \bar {\bar {{\gamma }}}_i = {} & \bar {{\gamma }}_i + \bar {\unicode{x1D652}_{ij}} \left ( \bar {\bar {{G}}}_j - \frac {1}{2} \frac {\partial \bar {\bar {{G}}}_{1,j}}{\partial \bar {\bar {{Z}}}_k} \bar {\bar {{G}}}_{1,k} \right ) + \frac {1}{2} \left ( \frac {\partial ^2 \bar {{\gamma }}_i}{\partial \bar {\bar {{Z}}}_j \partial \bar {\bar {{Z}}}_k} - \frac {\partial ^2\bar {{\gamma }}_{0,j}}{\partial \bar {\bar {{Z}}}_i \partial \bar {\bar {{Z}}}_k} \right ) \bar {\bar {{G}}}_{1,j} \bar {\bar {{G}}}_{1,k} \nonumber \\ & + \frac {1}{2} \bar {\unicode{x1D652}_{kj}} \frac {\partial \bar {\bar {{G}}}_{1,j}}{\partial \bar {\bar {{Z}}}_i} \bar {\bar {{G}}}_{1,k} , \end{align}

where we have simplified the expression, and we note that a repeated index implies summation. Note that

(B.9)

\begin{align} \frac {\partial }{\partial \bar {\bar {Z}}_i} \left ( \bar {\unicode{x1D652}_{kj}} \bar {\bar {{G}}}_{1,j} \right ) - \frac {\partial }{\partial \bar {\bar {Z}}_k} \left ( \bar {\unicode{x1D652}_{ij}} \bar {\bar {{G}}}_{1,j} \right ) = {} & \left ( \frac {\partial ^2 \bar {\gamma }_{i}}{\partial \bar {\bar {Z}}_j \partial \bar {\bar {Z}}_k} - \frac {\partial ^2 \bar {\gamma }_{k}}{\partial \bar {\bar {Z}}_j \partial \bar {\bar {Z}}_i} \right ) \bar {\bar {{G}}}_{1,j} \nonumber \\ & + \bar {\unicode{x1D652}_{kj}} \frac {\partial \bar {\bar {{G}}}_{1,j}}{\partial \bar {\bar {Z}}_i} - \bar {\unicode{x1D652}_{ij}} \frac {\partial \bar {\bar {{G}}}_{1,j}}{\partial \bar {\bar {Z}}_k}, \end{align}

which allows (B.8) to be written as

(B.10)

\begin{align} \bar {\bar {{\gamma }}}_i = {} & \bar {{\gamma }}_i + \bar {\unicode{x1D652}_{ij}}\bar {\bar {{G}}}_j + \frac {1}{2} \left [ \frac {\partial }{\partial \bar {\bar {Z}}_i} \left ( \bar {\unicode{x1D652}_{kj}} \bar {\bar {{G}}}_{1,j} \right ) - \frac {\partial }{\partial \bar {\bar {Z}}_k} \left ( \bar {\unicode{x1D652}_{ij}} \bar {\bar {{G}}}_{1,j} \right ) \right ] \bar {\bar {{G}}}_{1,k} . \end{align}

Finally, the gyrocentre Hamiltonian follows from subtracting (B.7) from (B.3) resulting in

(B.11)

\begin{equation} \bar {\bar {H}} = \bar {H} - \frac {\partial \bar {H}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \left ( \bar {\bar {{\boldsymbol{G}}}}_1 + \bar {\bar {{\boldsymbol{G}}}}_2 \right ) + \left [{-} \frac {\partial }{\partial t} \left ( \bar {{\boldsymbol{\gamma }}}_1 - \frac {1}{2} \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right ) + \frac {1}{2} \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}}\left ( \frac {\partial \bar {H}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ] \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 . \end{equation}

Appendix C. Proof of sufficient condition for gauge invariance

Theorem 4 (Sufficient condition for gauge invariance). The gyrocentre single-particle phase-space Lagrangian (to second order) is gauge-invariant up to a total derivative

(C.1)

provided that $\bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1$ and $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ are gauge-invariant.

Proof. We assume that $\bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1$ is gauge-invariant. Note that the perturbed guiding-centre Lagrangian (4.35) is gauge-invariant (up to a total derivative),

(C.2)

\begin{equation} \bar {L}_1 \overset {\text{ (4.26)}}{\mapsto } \bar {L}_1 + q \left ( \boldsymbol{\nabla }\eta \boldsymbol{\cdot } \dot {\bar {{\boldsymbol{R}}}} + \frac {\partial \eta }{\partial t} \right ) = \bar {L}_1 + q \,\frac {{\mathrm{d}} \eta }{{\mathrm{d}} t}. \end{equation}

From (4.10b) and (4.9b) it follows that the first-order gyrocentre Lagrangian can be written as

(C.3)

\begin{equation} \bar {\bar {L}}_1 = \bar {L}_1 + \left ( \bar { \unicode{x1D652}}_0 \bar {\bar {{\boldsymbol{G}}}}_1 + \frac {\partial \bar {\bar {S}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right ) \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} + \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 + \frac {\partial \bar {\bar {S}}_1}{\partial t}, \end{equation}

from which it follows that $\bar {\bar {L}}_1$ is gauge-invariant up to a total derivative if the first-order generating vector $\bar {\bar {{\boldsymbol{G}}}}_1$ and first-order generating function $\bar {\bar {S}}_1$ are gauge-invariant.

Using (4.14), (4.15) and (4.35c), we find that

(C.4)

\begin{equation} \frac {\partial \bar {\bar {S}}_1}{\partial t} + \big\{\bar {\bar {S}}_1, \bar {\bar {H}}_0\big\}_{0} = q \widetilde {\psi }_1 = \widetilde {\bar {H}_1} + \dot {\bar {{\boldsymbol{Z}}}} \boldsymbol{\cdot } \widetilde {\left ( \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \right )} = \widetilde {\bar {H}_1^{\mathrm{FLR}}} + \dot {\bar {{\boldsymbol{Z}}}} \boldsymbol{\cdot } \widetilde {\left ( \bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1 \right )}, \end{equation}

where we note that $\bar {H}_1^{\mathrm{FLR}}$ is gauge-invariant as it is expressed in terms of the electric field and, therefore, $\bar {\bar {S}}_1$ is gauge-invariant as $\bar {\bar {{\boldsymbol{\gamma }}}}_1 - \bar {{\boldsymbol{\gamma }}}_1$ is assumed to be gauge-invariant. It follows from (4.12) that the generating vector $\bar {\bar {{\boldsymbol{G}}}}_1$ , and thereby also the first-order gyrocentre Lagrangian, is gauge-invariant.

We additionally assume that $\bar {\bar {{\boldsymbol{\gamma }}}}_2$ is gauge-invariant. The second-order gyrocentre Lagrangian is given by

(C.5)

\begin{equation} \bar {\bar {L}}_2 = \bar {\bar {{\boldsymbol{\gamma }}}}_2 \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} - \bar {\bar {H}}_2, \end{equation}

where the second-order gyrocentre Hamiltonian is given by (4.23). For the second-order gyrocentre Lagrangian to be gauge-invariant, we therefore need to show that the corresponding Hamiltonian is gauge-invariant, which upon inspection of (4.23) requires the vector ${\boldsymbol{T}}_1$ to be gauge-invariant.

The vector ${\boldsymbol{T}}_1$ , as given by (4.21), can be written as

(C.6)

\begin{equation} {\boldsymbol{T}}_1 = \underbrace {\left [ \bar { \unicode{x1D652}}_1 \dot {\bar {{\boldsymbol{Z}}}} - \frac {\partial \bar {H}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}} - \frac {\partial \bar {{\boldsymbol{\gamma }}}_1}{\partial t} \right ]}_{{\boldsymbol{T}}_1^\dagger } + \frac {1}{2}\left [ \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \frac {\partial \bar {\bar {{\boldsymbol{G}}}}_1}{\partial t} + (\bar {\bar { \unicode{x1D652}}}_1 - \bar { \unicode{x1D652}}_1) \dot {\bar {{\boldsymbol{Z}}}} + \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}}\left ( \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \right ) \right ], \end{equation}

where we note that $\bar {\bar { \unicode{x1D652}}}_1 - \bar { \unicode{x1D652}}_1$ is gauge-invariant because $\bar {\bar {{\boldsymbol{G}}}}_1$ is gauge-invariant (cf. (4.11)). It therefore remains to be shown that ${\boldsymbol{T}}_1^\dagger$ , which is expressed in terms of the first-order symplectic and Hamiltonian parts of the guiding-centre Lagrangian, is gauge-invariant. We note that the FLR part of the first-order guiding-centre Lagrangian is expressed in terms of the electromagnetic fields (cf. (4.35)) and is therefore gauge-invariant. Hence, we need only to show that the ZLR part ${\boldsymbol{T}}_1^{\dagger ,\mathrm{ZLR}}$ is gauge-invariant, which is given by

(C.7)

\begin{align} {\boldsymbol{T}}_1^{\dagger ,\mathrm{ZLR}} = \bar { \unicode{x1D652}}_1^{\,\mathrm{ZLR}} \dot {\bar {{\boldsymbol{Z}}}} - \frac {\partial \bar {H}_1^{\mathrm{ZLR}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} - \frac {\partial \bar {{\boldsymbol{\gamma }}}_1^{\mathrm{ZLR}}}{\partial t} = \begin{pmatrix} q ({\boldsymbol{E}}_1 + \dot {\bar {{\boldsymbol{R}}}} \times {\boldsymbol{B}}_1)\\[3pt] 0\\[3pt] 0\\[3pt] 0\\ \end{pmatrix} \end{align}

and is therefore gauge-invariant.

Appendix D. Gyro-averaging identities

D.1. Taylor-series expansions

Let $Q = Q({\boldsymbol{r}}, t)$ be a smooth function. We denote by $ \unicode{x1D643} Q$ the Hessian matrix of second-order partial derivatives of $Q$ ,

(D.1)

\begin{equation} ( \unicode{x1D643} Q)_{ij} = \frac {\partial ^2 Q}{\partial r_i \partial r_j}, \end{equation}

which allows us to write the Taylor series expansion of $Q$ centred around $\boldsymbol{r}$ as

(D.2)

\begin{align} \mathring {Q} = Q({\boldsymbol{r}} + \rho {\boldsymbol{\hat {\rho }}}) = Q + \rho {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot }\boldsymbol{\nabla }Q + \frac {\rho ^2}{2} ( \unicode{x1D643} Q) \boldsymbol{:} ({\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}}) + O(\rho ^3), \end{align}

where we recall (B.2) and note that an absence of arguments implies evaluation at $({\boldsymbol{r}}, t)$ .

Non-dimensionalisation of (D.2) results in

(D.3)

\begin{align} \check {Q}({\boldsymbol{r}} + \rho {\boldsymbol{\hat {\rho }}}) = \check {Q} + \varepsilon _\perp {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot }\boldsymbol{\nabla }\check {Q} + \frac {\varepsilon _\perp ^2}{2} (\check { \unicode{x1D643}} \check {Q}) \boldsymbol{:} ({\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}}) + O(\varepsilon _\perp ^3), \end{align}

where $Q = [Q] \check {Q}$ and we recall that the non-dimensional perpendicular wavenumber is defined in (3.20).

We note that the $O(\varepsilon _\perp ^3)$ term in (D.3) contains a ${\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}}$ term, which yields zero upon computing the gyro-average. It follows that the gyro-average of $\mathring {Q}$ has the following Taylor series expansion:

(D.4)

\begin{align} \langle \mathring {Q} \rangle = Q + \frac {\rho ^2}{4} ( \unicode{x1D643} Q) \boldsymbol{:} ({\boldsymbol{\hat {e}}}_1 \otimes {\boldsymbol{\hat {e}}}_1 + {\boldsymbol{\hat {e}}}_2 \otimes {\boldsymbol{\hat {e}}}_2) + O(\varepsilon _\perp ^4), \end{align}

where we have multiplied (D.3) by $[Q]$ and have made use of

(D.5a)

\begin{align} \langle {\boldsymbol{\hat {\rho }}} \rangle &= {\boldsymbol{0}}_3, \end{align}

(D.5b)

\begin{align} \langle {\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}} \rangle &= \frac {1}{2}\left ( {\boldsymbol{\hat {e}}}_1 \otimes {\boldsymbol{\hat {e}}}_1 + {\boldsymbol{\hat {e}}}_2 \otimes {\boldsymbol{\hat {e}}}_2 \right ). \end{align}
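
The identities (D.5) and the truncated expansion (D.4) are easy to confirm numerically. The sketch below is illustrative only and rests on assumptions not fixed by this appendix: a uniform background field along $z$ with ${\boldsymbol{\hat{e}}}_1 = {\boldsymbol{\hat{x}}}$ and ${\boldsymbol{\hat{e}}}_2 = {\boldsymbol{\hat{y}}}$, the convention ${\boldsymbol{\hat{\rho}}} = \cos\theta \, {\boldsymbol{\hat{e}}}_1 - \sin\theta \, {\boldsymbol{\hat{e}}}_2$ (the identities do not depend on this sign choice), and a quadratic test function `Q`, for which the remainder in (D.4) vanishes identically.

```python
import math

def gyro_average(f, n=64):
    """Average f(theta) over one gyration using n equispaced angles."""
    return sum(f(2.0 * math.pi * k / n) for k in range(n)) / n

def rho_hat(theta):
    # Assumed convention: rho_hat = cos(theta) e1 - sin(theta) e2, e1 = x, e2 = y.
    return (math.cos(theta), -math.sin(theta))

# (D.5a): the gyro-average of rho_hat vanishes.
mean_rho = [gyro_average(lambda th, i=i: rho_hat(th)[i]) for i in range(2)]

# (D.5b): the gyro-average of rho_hat (x) rho_hat is half the perpendicular identity.
mean_rr = [[gyro_average(lambda th, i=i, j=j: rho_hat(th)[i] * rho_hat(th)[j])
            for j in range(2)] for i in range(2)]

def Q(x, y):
    """Hypothetical quadratic test function (its Hessian is constant)."""
    return 1.0 + 2.0 * x - y + 3.0 * x * x + 0.5 * x * y - 2.0 * y * y

# (H Q) : (e1 (x) e1 + e2 (x) e2) = Q_xx + Q_yy = 2*3 + 2*(-2)
hessian_trace_perp = 2.0 * 3.0 + 2.0 * (-2.0)

x0, y0, rho = 0.4, -0.3, 0.25
lhs = gyro_average(lambda th: Q(x0 + rho * rho_hat(th)[0],
                                y0 + rho * rho_hat(th)[1]))
rhs = Q(x0, y0) + 0.25 * rho ** 2 * hessian_trace_perp   # right-hand side of (D.4)
print(mean_rho, mean_rr, lhs - rhs)
```

The equispaced rule is used because it integrates low-degree trigonometric polynomials exactly, so any mismatch beyond round-off would indicate an error in the identities rather than in the quadrature.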

D.2. Stokes’s theorem

Using the definition of the gyro-average, as found in (4.16), we find that

(D.6)

\begin{equation} \langle \mathring {S}_\tau \rangle = \frac {1}{2\pi } \int _0^{2\pi } {\boldsymbol{S}}(\bar {\bar {\boldsymbol{R}}} + {\boldsymbol{\rho }}) \boldsymbol{\cdot } {\boldsymbol{\hat {\tau }}}(\bar {\bar {\boldsymbol{R}}}, \bar {\bar {{\theta }}}) \,\mathrm{d} \bar {\bar {{\theta }}} = -\frac {1}{2\pi \rho } \int _{\partial D_\rho } {\boldsymbol{S}} \boldsymbol{\cdot } {\boldsymbol{\hat {t}}} \,\mathrm{d}{l} , \end{equation}

where ${\boldsymbol{\hat {t}}} \perp {{\boldsymbol{\hat {b}}}_0}$ is the counter-clockwise tangent to the boundary of the disk $D_\rho$ centred at $\bar{\bar{\boldsymbol{R}}}$ (i.e. the shaded disk shown in figure 1), which results in the minus sign. Using Stokes’s theorem and (4.49), we find that (see also Porazik & Lin 2011)

(D.7)

\begin{equation} \langle \mathring {S}_\tau \rangle = -\frac {1}{2\pi \rho } \int _{D_\rho } (\boldsymbol{\nabla }\times {\boldsymbol{S}})_\shortparallel \mathrm{d}^2 x = - \frac {\rho }{2} \frac {1}{\pi \rho ^2} \int _{D_\rho } (\boldsymbol{\nabla }\times {\boldsymbol{S}})_\shortparallel \mathrm{d}^2 x = -\frac {\rho }{2} \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{S}})_\shortparallel \rangle \!\rangle , \end{equation}

where $\mathrm{d}^2 x$ is an infinitesimal area element on the disk.
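
As an illustration of (D.7), the sketch below compares the gyro-average of $S_\tau$ with $-\rho/2$ times the disk average of $(\boldsymbol{\nabla} \times {\boldsymbol{S}})_\shortparallel$ for a simple test field. Several ingredients are assumptions made for this example only: a uniform background field along $z$, the gyrocentre placed at the origin, the conventions ${\boldsymbol{\hat{\rho}}} = (\cos\theta, -\sin\theta)$ and ${\boldsymbol{\hat{\tau}}} = \partial {\boldsymbol{\hat{\rho}}} / \partial \theta = (-\sin\theta, -\cos\theta)$ (a clockwise tangent, consistent with the minus sign above), and the hypothetical field ${\boldsymbol{S}} = (-y^3, x^3, 0)$.

```python
import math

def S(x, y):
    """Hypothetical test field (S_x, S_y) in the plane perpendicular to b0."""
    return (-y ** 3, x ** 3)

def curl_par(x, y):
    """Parallel curl of S: dS_y/dx - dS_x/dy = 3 (x^2 + y^2)."""
    return 3.0 * (x * x + y * y)

def gyro_average_S_tau(rho, n=64):
    """Left-hand side of (D.7): average of S . tau_hat over the gyro-ring."""
    total = 0.0
    for k in range(n):
        th = 2.0 * math.pi * k / n
        rho_hat = (math.cos(th), -math.sin(th))      # assumed convention
        tau_hat = (-math.sin(th), -math.cos(th))     # d(rho_hat)/d(theta), clockwise
        sx, sy = S(rho * rho_hat[0], rho * rho_hat[1])
        total += sx * tau_hat[0] + sy * tau_hat[1]
    return total / n

def disk_average(f, rho, n_theta=64, n_r=2000):
    """Area average of f over the disk of radius rho centred at the origin.

    Midpoint rule in the normalised radius s, equispaced rule in the angle;
    the weight 2 s is the area element divided by (pi rho^2).
    """
    total = 0.0
    for i in range(n_r):
        s = (i + 0.5) / n_r
        ring = sum(f(s * rho * math.cos(2.0 * math.pi * k / n_theta),
                     s * rho * math.sin(2.0 * math.pi * k / n_theta))
                   for k in range(n_theta)) / n_theta
        total += 2.0 * s * ring / n_r
    return total

rho = 0.5
lhs = gyro_average_S_tau(rho)
rhs = -0.5 * rho * disk_average(curl_par, rho)
print(lhs, rhs)
```

For this particular field both sides can also be evaluated by hand and equal $-3\rho^3/4$, which gives an independent check of the sign convention.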

D.3. Gradient theorem

Terms containing a radial average (i.e. an integral over $\varsigma$ ) can often be interpreted in some alternative way. For instance, the gradient theorem (or fundamental theorem of calculus for line integrals) results in

(D.8)

\begin{equation} \rho \langle\kern0.3pt \!| {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } \mathring {\boldsymbol{\nabla} }^\varsigma Q |\!\kern0.3pt\rangle = \frac {1}{2\pi } \int _0^{2\pi }\int _0^1 {\boldsymbol{\rho }} \boldsymbol{\cdot } (\boldsymbol{\nabla }Q)(\bar {\bar {\boldsymbol{R}}} + \varsigma {\boldsymbol{\rho }})\, \mathrm{d} \varsigma\, \mathrm{d}{\bar{\bar{\theta}}} = \langle \mathring {Q} \rangle - Q. \end{equation}
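
The identity (D.8) can be checked numerically by evaluating the double integral directly. The sketch below is a hedged illustration: the test function `Q`, the gyrocentre position, the gyroradius and the convention ${\boldsymbol{\rho}} = \rho (\cos\theta, -\sin\theta)$ are all assumptions made for this example, and the radial integral over $\varsigma$ is approximated with composite Simpson quadrature rather than evaluated exactly.

```python
import math

def Q(x, y):
    """Hypothetical smooth, non-harmonic test function."""
    return math.exp(x) * math.cos(2.0 * y)

def grad_Q(x, y):
    return (math.exp(x) * math.cos(2.0 * y),
            -2.0 * math.exp(x) * math.sin(2.0 * y))

def gyro_average(f, n=64):
    """Average f(theta) over one gyration using n equispaced angles."""
    return sum(f(2.0 * math.pi * k / n) for k in range(n)) / n

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def radial_gyro_average(x0, y0, rho):
    """Middle expression of (D.8): (1/2pi) int int rho . grad Q(R + s rho) ds dtheta."""
    def over_theta(th):
        rx, ry = rho * math.cos(th), -rho * math.sin(th)   # assumed convention
        def over_s(s):
            gx, gy = grad_Q(x0 + s * rx, y0 + s * ry)
            return rx * gx + ry * gy
        return simpson(over_s, 0.0, 1.0)
    return gyro_average(over_theta)

x0, y0, rho = 0.2, -0.1, 0.3
lhs = radial_gyro_average(x0, y0, rho)
rhs = gyro_average(lambda th: Q(x0 + rho * math.cos(th),
                                y0 - rho * math.sin(th))) - Q(x0, y0)
print(lhs, rhs)
```

Because the same angular quadrature is used on both sides, the residual reflects only the Simpson error of the radial integral, which for each fixed angle approximates $Q(\bar{\bar{\boldsymbol{R}}} + {\boldsymbol{\rho}}) - Q(\bar{\bar{\boldsymbol{R}}})$.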

The same identity can be derived for vector fields

(D.9)

\begin{equation} \rho \langle\kern0.3pt \!| (\mathring {\boldsymbol{\nabla} }^\varsigma {\boldsymbol{S}})^\intercal {\boldsymbol{\hat {\rho }}} |\!\kern0.3pt\rangle = \langle \mathring {{\boldsymbol{S}}} \rangle - {\boldsymbol{S}}. \end{equation}

By making use of $\boldsymbol{\nabla }({\boldsymbol{a}} \boldsymbol{\cdot } {\boldsymbol{b}}) = {\boldsymbol{a}} \times (\boldsymbol{\nabla }\times {\boldsymbol{b}}) + {\boldsymbol{b}} \times (\boldsymbol{\nabla }\times {\boldsymbol{a}}) + (\boldsymbol{\nabla }{\boldsymbol{b}})^\intercal {\boldsymbol{a}} + (\boldsymbol{\nabla }{\boldsymbol{a}})^\intercal {\boldsymbol{b}}$ , while neglecting $O(\varepsilon _B)$ contributions, we find that

(D.10)

\begin{equation} \langle\kern0.3pt \!| \mathring {\boldsymbol{\nabla} }^\varsigma (\rho S_\rho ) |\!\kern0.3pt\rangle + \rho \langle\kern0.3pt \!| (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{S}}) \times {\boldsymbol{\hat {\rho }}} |\!\kern0.3pt\rangle + O(\varepsilon _B) = \langle \mathring {{\boldsymbol{S}}} \rangle - {\boldsymbol{S}}. \end{equation}

Similarly, we find that application of (D.9) to the curl of a vector field results in

(D.11)

\begin{equation} \rho \langle\kern0.3pt \!| [\mathring {\boldsymbol{\nabla} }^\varsigma (\boldsymbol{\nabla }\times {\boldsymbol{S}})]^\intercal {\boldsymbol{\hat {\rho }}} |\!\kern0.3pt\rangle = \langle \mathring {\boldsymbol{\nabla }} \times {\boldsymbol{S}} \rangle - \boldsymbol{\nabla }\times {\boldsymbol{S}}. \end{equation}

By making use of $\boldsymbol{\nabla }\times ({\boldsymbol{a}} \times {\boldsymbol{b}}) = {\boldsymbol{a}} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{b}} - {\boldsymbol{b}} \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{a}} + (\boldsymbol{\nabla }{\boldsymbol{a}})^\intercal {\boldsymbol{b}} - (\boldsymbol{\nabla }{\boldsymbol{b}})^\intercal {\boldsymbol{a}}$ , while neglecting $O(\varepsilon _B)$ contributions, we then find that

(D.12)

\begin{equation} \rho \langle\kern0.3pt \!| \mathring {\boldsymbol{\nabla} }^\varsigma \times [(\boldsymbol{\nabla }\times {\boldsymbol{S}}) \times {\boldsymbol{\hat {\rho }}}] |\!\kern0.3pt\rangle + O(\varepsilon _B) = \langle \mathring {\boldsymbol{\nabla }} \times {\boldsymbol{S}} \rangle - \boldsymbol{\nabla }\times {\boldsymbol{S}}. \end{equation}

Appendix E. Smallness of the coordinate transformation

The choice of the parameter $\xi _R$ is based on the smallness of the gyrocentre coordinate transformation, which is relevant because we truncate the expansion (in the small parameter $\varepsilon _\delta$ ) at second order in (4.9) and (4.10). To make this more explicit, we consider a specific contribution of $\bar {H}_1$ to $\bar {\bar {H}}_3$ , which is therefore neglected in the proposed second-order accurate model. We consider the following gyro-averaged contribution from (B.1)

(E.1)

\begin{align} \bar {\bar {H}}_3^{\star } = \frac {1}{2} \left \langle \frac {\partial ^2 \bar {H}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \big( \bar {\bar {{\boldsymbol{G}}}}_1 \otimes \bar {\bar {{\boldsymbol{G}}}}_1 \big) \right \rangle = O(\varepsilon _\delta ^3), \end{align}

where we note that the corresponding fluctuating part would be absorbed by the third-order generating function $\bar {\bar {S}}_3$ if this term were to be included in the proposed model. By decomposing each term in terms of its mean and fluctuating part, we find that

(E.2)

\begin{align} \nonumber \bar {\bar {H}}_3^{\star } = {} & \frac {1}{2} \left \langle \frac {\partial ^2 \bar {H}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \big( \widetilde {\bar {\bar {{\boldsymbol{G}}}}}_1 \otimes \widetilde {\bar {\bar {{\boldsymbol{G}}}}}_1 \big) \right \rangle + \frac {1}{2} \frac {\partial ^2 \langle \bar {H}_1 \rangle }{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \big( \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle \otimes \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle \big) \\[5pt] &+ \left \langle \frac {\partial ^2 \widetilde {\bar {H}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \boldsymbol{:} \big( \widetilde {\bar {\bar {{\boldsymbol{G}}}}}_1 \otimes \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle \big) \right \rangle , \end{align}

where we note that the first contribution does not depend on $\xi _R$ . The magnitude of $\bar {\bar {H}}_3^{\star }$ can be bounded from above as follows:

(E.3)

\begin{align} \lvert \bar {\bar {H}}_3^{\star } \rvert \leqslant \frac {1}{2} \left \lvert \frac {\partial ^2 \langle \bar {H}_1 \rangle }{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \right \rvert _{\mathrm{F}} \lvert \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle \rvert ^2 + \left \lvert \left \langle \frac {\partial ^2 \widetilde {\bar {H}}_1}{\partial \bar {\bar {{\boldsymbol{Z}}}}^2} \widetilde {\bar {\bar {{\boldsymbol{G}}}}}_1 \right \rangle \right \rvert \lvert \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle \rvert + \ldots , \end{align}

where we have omitted the contribution that does not depend on $\xi _R$ (now denoted by $\ldots$ ). Here, we have made use of the triangle and Cauchy–Schwarz inequalities, have substituted (4.35c), and let $\lvert \cdot \rvert _{\mathrm{F}}$ denote the Frobenius norm, for which $\lvert {\boldsymbol{S}} \otimes {\boldsymbol{S}} \rvert _{\mathrm{F}} = \lvert {\boldsymbol{S}} \rvert ^2$ .

As the fluctuating part of the first-order generating vector $\widetilde {\bar {\bar {{\boldsymbol{G}}}}}_1$ is independent of the choice of $\xi _R$ , we can minimise the upper bound of the magnitude of the neglected term $\bar {\bar {H}}_3^{\star }$ by minimising the Euclidean norm (squared) of the gyro-average of the first-order generating vector

(E.4)

\begin{equation} \lvert \langle \bar {\bar {{\boldsymbol{G}}}}_1 \rangle \rvert ^2 = \frac {\bar {\bar {{M}}}^2}{B_0^2} \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle ^2 + (1 - \xi _R)^2 \left ( \frac {\langle\kern0.3pt \!| \mathring {B}^\varsigma _{1,\shortparallel } |\!\kern0.3pt\rangle ^2}{(B_{0,\shortparallel }^{\star })^2} + \frac {q^2}{m^2} \big\langle\kern-1.7pt \big| \mathring {B}^\varsigma _{1,\tau } \big|\kern-1.7pt\big\rangle ^2 \right ), \end{equation}

which is achieved by letting $\xi _R = 1$ . This choice yields a second-order accurate (in terms of $\varepsilon _\delta$ ) gyrocentre Lagrangian for which the upper bound of the truncation error is minimised with respect to the contributions from the first-order generating vector, under the constraint that the resulting gyrocentre magnetic moment is an invariant (i.e. letting $\xi _\varTheta = 0$ ).
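
As a brief check of this choice, note that (E.4) has the form $a + (1 - \xi _R)^2 b$, where the shorthand symbols $a \geqslant 0$ and $b \geqslant 0$ are introduced here only for illustration and stand for the $\xi _R$-independent first term and the bracketed factor of the second term, respectively. Under that reading, the minimiser follows from

\begin{equation} \frac {\partial }{\partial \xi _R} \left [ a + (1 - \xi _R)^2 b \right ] = -2 (1 - \xi _R) b = 0 \quad \implies \quad \xi _R = 1, \qquad \frac {\partial ^2}{\partial \xi _R^2} \left [ a + (1 - \xi _R)^2 b \right ] = 2 b \geqslant 0. \end{equation}

The second derivative is non-negative, so the stationary point is a minimum, and the remaining value of (E.4) is the $\xi _R$-independent term $a$.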

Appendix F. Approximation of the second-order Hamiltonian

We consider (4.65), with the first-order generating vector approximated by (4.60) and with ${\boldsymbol{T}}_{1}$ as defined by (4.21), which we repeat here for convenience,

(F.1)

\begin{equation} {\boldsymbol{T}}_1 = \underbrace {\frac {1}{2} \big(\bar { \unicode{x1D652}}_1 + \bar {\bar { \unicode{x1D652}}}_1\big) \dot {\bar {{\boldsymbol{Z}}}}}_{\unicode{x24B6}} - \underbrace {\frac {\partial }{\partial t} \left ( \bar {{\boldsymbol{\gamma }}}_1 - \frac {1}{2} \frac {\partial \bar {{\boldsymbol{\gamma }}}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \bar {\bar {{\boldsymbol{G}}}}_1 \right )}_{\unicode{x24B7}} - \underbrace {\frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}}\left ( \bar {H}_1 - \frac {1}{2} \frac {\partial \bar {H}_0}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \bar {\bar {{\boldsymbol{G}}}}_1 \right )}_{\unicode{x24B8}}. \end{equation}

In this section of the appendix, we use the following shorthand notation:

(F.2)

\begin{equation} Q^{\dagger } \mathrel {\mathop :}= \frac {1}{2} (Q + \langle Q \rangle ). \end{equation}
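
To make the effect of this shorthand explicit, the following sketch assumes the usual splitting $Q = \langle Q \rangle + \widetilde {Q}$ into a gyro-average and a fluctuating part, together with $\langle \langle Q \rangle \rangle = \langle Q \rangle$; both are assumptions about notation defined elsewhere in the paper.

\begin{equation} Q^{\dagger } = \langle Q \rangle + \frac {1}{2} \widetilde {Q}, \qquad \langle Q^{\dagger } \rangle = \langle Q \rangle . \end{equation}

Under these assumptions, the shorthand leaves the gyro-average unchanged and halves the fluctuating part, so that $Q^{\dagger } = Q$ whenever $Q$ is gyro-phase independent.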

For the approximation of $\bar {\bar {H}}_2$ , we omit all derivatives of the background magnetic field as well as of the perturbed electromagnetic fields.

Using (4.35b) and (4.47), we find that

(F.3)

\begin{equation} \frac {1}{2} (\bar {{\boldsymbol{\gamma }}}_{1} + \bar {\bar {{\boldsymbol{\gamma }}}}_{1}) = \begin{pmatrix} q{\boldsymbol{A}}_1 + q \rho \int _0^1 (\,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\rho }}})^{\dagger } \mathrm{d} \varsigma \\[5pt] 0\\[5pt] 0\\[5pt] - \dfrac {q \rho ^2}{2} \int _0^1 \varsigma \big(\mathring {B}^\varsigma _{1,\shortparallel }\big) \mathrm{d} \varsigma \\ \end{pmatrix}, \end{equation}

from which it follows that

(F.4)

\begin{equation} \dfrac {1}{2} (\bar { \unicode{x1D652}}_{1} + \bar {\bar { \unicode{x1D652}}}_{1}) = \int _0^1 \!\begin{pmatrix} q \unicode{x1D63D}_1 & \quad\! {\boldsymbol{0}}_3 & \quad\! -\dfrac {q \rho }{2 \bar {\bar {{M}}}} (\,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\rho }}})^{\dagger } & \quad\! -\dfrac {q \rho }{2} \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\tau }}}\\[10pt] {\boldsymbol{0}}_3^\intercal & \quad\! 0 & \quad\! 0 & \quad\! 0\\[10pt] \dfrac {q \rho }{2 \bar {\bar {{M}}}} \big[ \big(\,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\rho }}}\big)^{\dagger } \big]^\intercal & \quad\! 0 & \quad\! 0 & \quad\! -\dfrac {q \rho ^2}{2\bar {\bar {{M}}}} \varsigma \big(\mathring {B}^\varsigma _{1,\shortparallel }\big)\\[10pt] \dfrac {q \rho }{2} \big( \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\tau }}} \big)^\intercal & \quad\! 0 & \quad\! \dfrac {q \rho ^2}{2\bar {\bar {{M}}}} \varsigma \big(\mathring {B}^\varsigma _{1,\shortparallel }\big) & \quad\! 0\\ \end{pmatrix} \mathrm{d}\varsigma , \end{equation}

where the matrix $ \unicode{x1D63D}_1$ is defined analogously to (3.41). As before, all derivatives of the background magnetic field as well as of the perturbed electromagnetic fields have been omitted. Subsequent evaluation of the matrix–vector product with $\dot {\bar {{\boldsymbol{Z}}}}$, using the guiding-centre EOMs (3.48) and neglecting $O(\varepsilon _B)$ contributions, results in

(F.5)

\begin{equation} \unicode{x24B6} = \frac {1}{2} (\bar { \unicode{x1D652}}_{1} + \bar {\bar { \unicode{x1D652}}}_{1}) \dot {\bar {{\boldsymbol{Z}}}} = \int _0^1 \begin{pmatrix} q \bar {U}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1 - \dfrac {q \rho \omega _{\mathrm{c}}}{2} \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\tau }}}\\[10pt] 0\\[10pt] \dfrac {q \rho \bar {U}_\shortparallel }{2 \bar {\bar {{M}}}} (\,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\rho }}})^{\dagger } \boldsymbol{\cdot } {{\boldsymbol{\hat {b}}}_0} -\dfrac {\omega _{\mathrm{c}} q \rho ^2}{2\bar {\bar {{M}}}} \varsigma \big(\mathring {B}^\varsigma _{1,\shortparallel }\big)\\[10pt] \dfrac {q \rho \bar {U}_\shortparallel }{2} (\,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\tau }}}) \boldsymbol{\cdot } {{\boldsymbol{\hat {b}}}_0}\\ \end{pmatrix}\, \mathrm{d} \varsigma . \end{equation}

The second term in ${\boldsymbol{T}}_1$ is given by

(F.6)

\begin{equation} \unicode{x24B7} = \begin{pmatrix} q\dfrac {\partial {\boldsymbol{A}}_1}{\partial t}\\[9pt] 0\\[5pt] 0\\[5pt] 0\\[5pt] \end{pmatrix} \end{equation}

by making use of (4.35b), (3.23) and (4.60) as well as Faraday’s law (5.21). The third term in ${\boldsymbol{T}}_1$ involves the average of the first-order guiding-centre and gyrocentre Hamiltonians

(F.7)

\begin{equation} \frac {1}{2} (\bar {H}_1 + \bar {\bar {H}}_1) = q \phi _1 - \int _0^1 \left [ \frac {q \rho }{2} \big(\mathring {E}^\varsigma _{1,\rho } + \big\langle \mathring {E}^\varsigma _{1,\rho } \big\rangle \big) - \bar {\bar {{M}}} \varsigma \langle \mathring {B}^\varsigma _{1,\shortparallel } \rangle \right ]\, \mathrm{d}\varsigma , \end{equation}

which follows from substituting (4.35c) and (4.52). This results in

(F.8)

\begin{equation} \unicode{x24B8} = \int _0^1 \begin{pmatrix} q \boldsymbol{\nabla }\phi _1\\[5pt] 0\\[5pt] - \dfrac {q \rho }{4 \bar {\bar {{M}}}} \big(\mathring {E}^\varsigma _{1,\rho } + \big\langle \mathring {E}^\varsigma _{1,\rho } \big\rangle \big) + \varsigma \langle \mathring {B}^\varsigma _{1,\shortparallel } \rangle \\[5pt] - \dfrac {q \rho }{2} \mathring {E}^\varsigma _{1,\tau } \\ \end{pmatrix} \,\mathrm{d} \varsigma , \end{equation}

where we have again neglected all derivatives of the background and perturbed electromagnetic fields.

When combining (F.5), (F.6) and (F.8), we find that ${\boldsymbol{T}}_1$ is given by

(F.9)

\begin{equation} {\boldsymbol{T}}_1 = \int _0^1 \begin{pmatrix} {\boldsymbol{F}}_1 - \dfrac {q \rho \omega _{\mathrm{c}}}{2} \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\tau }}}\\[10pt] 0\\[10pt]\dfrac {\rho }{4 \bar {\bar {{M}}}} \big(\mathring {F}^\varsigma _{1,\rho } + \big\langle \mathring {F}^\varsigma _{1,\rho } \big\rangle \big) - \varsigma \big(\mathring {B}^\varsigma _{1,\shortparallel } + \big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \big)\\[10pt] \dfrac {\rho }{2} \mathring {F}^\varsigma _{1,\tau } \\ \end{pmatrix}\, \mathrm{d} \varsigma , \end{equation}

where we have made use of the definition of the Lorentz force ${\boldsymbol{F}}_1$ as given by (4.56). We are ready to evaluate the second-order gyrocentre Hamiltonian, as expressed in (4.65), by substituting (4.60) and (F.9)

(F.10)

\begin{align} &\bar {\bar {H}}_{2} = {} \left \langle \left ( {\boldsymbol{F}}_1 - \frac {q \rho \omega _{\mathrm{c}}}{2} \int _0^1 \,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\hat {\tau }}}\, \mathrm{d} \varsigma \right ) \boldsymbol{\cdot } \left ( -\frac {{{\boldsymbol{\hat {b}}}_0}}{B_{0,\shortparallel }^{\star }} \times \int _0^1 \widetilde {\,\mathring {\!{\boldsymbol{B}}}^\varsigma _1 \times {\boldsymbol{\rho }}}\, \mathrm{d}\varsigma + \frac {\rho B_{1,\rho }}{B_0} {{\boldsymbol{\hat {b}}}_0} \right ) \right \rangle \nonumber \\ & - \left \langle \int _0^1 \left [ \frac {\rho }{4 \bar {\bar {{M}}}} \big(\mathring {F}^\varsigma _{1,\rho } + \big\langle \mathring {F}^\varsigma _{1,\rho } \big\rangle \big) - \varsigma \big(\mathring {B}^\varsigma _{1,\shortparallel } + \big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \big) \right ]\,\mathrm{d} \varsigma \left [ \frac {q^2 \rho ^2}{m} \int _0^1 \varsigma \mathring {B}^\varsigma _{1,\shortparallel }\, \mathrm{d}\varsigma - \frac {\rho }{B_0} F_{1,\rho } \right ] \right \rangle \nonumber \\ & + \left \langle \left [ \frac {\rho }{2} \int _0^1 \mathring {F}^\varsigma _{1,\tau }\, \mathrm{d} \varsigma \right ] \left [{-}\frac {\rho }{2 B_0 \bar {\bar {{M}}}} F_{1,\tau } \right ] \right \rangle . \end{align}

We now consider the ZLR limit of all terms, that is, we make use of the approximation $\mathring {Q}^\varsigma \approx Q$, resulting in

(F.11)

\begin{equation} \begin{aligned} \bar {\bar {H}}_{2} = {} & \frac {\rho }{B_0}\left \langle \left [ {\boldsymbol{F}}_1 - \frac {q \rho \omega _{\mathrm{c}}}{2} {{\boldsymbol{B}}}_1 \times {\boldsymbol{\hat {\tau }}} \right ] \boldsymbol{\cdot } \left [ {{\boldsymbol{\hat {b}}}_0} \times \left ( \langle {{\boldsymbol{B}}}_1 \times {\boldsymbol{\hat {\rho }}} \rangle - {{\boldsymbol{B}}}_1 \times {\boldsymbol{\hat {\rho }}} \right ) + B_{1,\rho } {{\boldsymbol{\hat {b}}}_0} \right ] \right \rangle \\ & + \frac {1}{2} \left \langle \left [ \frac {\rho }{2 \bar {\bar {{M}}}} ({F}_{1,\rho } + \langle {F}_{1,\rho } \rangle ) - {B}_{1,\shortparallel } \right ] \left [{-}\frac {q^2 \rho ^2}{2 m} {B}_{1,\shortparallel } - \frac {\rho }{B_0} F_{1,\rho } \right ] \right \rangle - \frac {\rho ^2}{4 B_0 \bar {\bar {{M}}}} \left \langle F_{1,\tau }^2 \right \rangle , \end{aligned} \end{equation}

where we have moreover approximated $B_{0,\shortparallel }^{\star } \approx B_0$ . When subsequently making use of $\langle {\boldsymbol{\hat {\tau }}} \rangle = \langle {\boldsymbol{\hat {\rho }}} \rangle = {\boldsymbol{0}}_3$ as well as

(F.12)

\begin{align} \langle S_\rho ^2 \rangle = \langle S_\tau ^2 \rangle = ({\boldsymbol{S}} \otimes {\boldsymbol{S}}) \boldsymbol{:} \langle {\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}} \rangle = \frac {1}{2} ({\boldsymbol{S}} \otimes {\boldsymbol{S}}) \boldsymbol{:} ({\boldsymbol{\hat {e}}}_1 \otimes {\boldsymbol{\hat {e}}}_1 + {\boldsymbol{\hat {e}}}_2 \otimes {\boldsymbol{\hat {e}}}_2) = \frac {1}{2} \lvert {\boldsymbol{S}}_\perp \rvert ^2, \end{align}

we find that

(F.13)

\begin{equation} \bar {\bar {H}}_{2} = \frac {\bar {\bar {{M}}}}{2 B_0} \lvert {\boldsymbol{B}}_{1,\perp } \rvert ^2 - \frac {m}{2 q^2 B_0^2} \lvert {\boldsymbol{F}}_{1,\perp } \rvert ^2 . \end{equation}
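
The gyro-averages used in this step, namely $\langle {\boldsymbol{\hat {\rho }}} \rangle = \langle {\boldsymbol{\hat {\tau }}} \rangle = {\boldsymbol{0}}_3$ and (F.12), can be checked numerically. The sketch below is only an illustration: it assumes that the gyro-average is the uniform average over the gyro-phase $\theta \in [0, 2\pi ]$, identifies ${\boldsymbol{\hat {e}}}_1, {\boldsymbol{\hat {e}}}_2, {{\boldsymbol{\hat {b}}}_0}$ with the Cartesian axes, and uses the parametrisation ${\boldsymbol{\hat {\rho }}} = \cos \theta \, {\boldsymbol{\hat {e}}}_1 - \sin \theta \, {\boldsymbol{\hat {e}}}_2$, ${\boldsymbol{\hat {\tau }}} = \partial {\boldsymbol{\hat {\rho }}} / \partial \theta$. The function names and the test vector are hypothetical.

```python
import math


def gyro_average(func, n=64):
    """Approximate (1/2pi) * integral of func(theta) over [0, 2pi] on a uniform grid."""
    return sum(func(2.0 * math.pi * k / n) for k in range(n)) / n


def rho_hat(theta):
    # Assumed convention: rho_hat = cos(theta) e1 - sin(theta) e2 (e1 = x, e2 = y, b0 = z).
    return (math.cos(theta), -math.sin(theta), 0.0)


def tau_hat(theta):
    # Assumed convention: tau_hat = d(rho_hat)/d(theta) = -sin(theta) e1 - cos(theta) e2.
    return (-math.sin(theta), -math.cos(theta), 0.0)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Arbitrary test vector with a non-zero parallel component.
S = (0.7, -1.3, 2.1)

avg_S_rho_sq = gyro_average(lambda th: dot(S, rho_hat(th)) ** 2)
avg_S_tau_sq = gyro_average(lambda th: dot(S, tau_hat(th)) ** 2)
half_S_perp_sq = 0.5 * (S[0] ** 2 + S[1] ** 2)

# Component-wise gyro-averages of the unit vectors themselves.
avg_rho_hat = [gyro_average(lambda th, i=i: rho_hat(th)[i]) for i in range(3)]
avg_tau_hat = [gyro_average(lambda th, i=i: tau_hat(th)[i]) for i in range(3)]

print(avg_S_rho_sq, avg_S_tau_sq, half_S_perp_sq)
```

A uniform grid integrates low-degree trigonometric polynomials to rounding accuracy, which is why a modest number of points suffices here. The identity (F.12) itself does not depend on the sign convention chosen for ${\boldsymbol{\hat {\rho }}}$ and ${\boldsymbol{\hat {\tau }}}$.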

Appendix G. Computation of the Jacobian

The aim is to compute the Jacobian of the transformation from the physical coordinates $\tilde {{\boldsymbol{Z}}} = ({\boldsymbol{R}}, {\boldsymbol{U}})$ to the gyrocentre coordinates $\bar {\bar {{\boldsymbol{Z}}}}$ . To this end, we write the Lagrangian in physical coordinates $\tilde {{\boldsymbol{Z}}}$ as

(G.1)

\begin{equation} \tilde {L} = \tilde {{\boldsymbol{\gamma }}} \boldsymbol{\cdot } \dot {\tilde {{\boldsymbol{Z}}}} - \tilde {H} = \left [ q ({\boldsymbol{A}}_0 + {\boldsymbol{A}}_1) + m {\boldsymbol{U}} \right ] \boldsymbol{\cdot } \dot {{\boldsymbol{R}}} - \left ( q \phi _1 + \frac {m}{2} \lvert {\boldsymbol{U}} \rvert ^2 \right ). \end{equation}

Here, the coordinate transformation $\tilde {{\boldsymbol{Z}}} = \tilde {{\boldsymbol{Z}}}(t, \bar {\bar {{\boldsymbol{Z}}}})$ is defined implicitly by imposing that the physical Lagrangian is transformed to the gyrocentre Lagrangian given by (4.67)

(G.2)

\begin{equation} \tilde {L}(\tilde {{\boldsymbol{Z}}}) = \bar {\bar {L}}(\bar {\bar {{\boldsymbol{Z}}}}), \end{equation}

where we note that $\tilde {{\boldsymbol{Z}}}$, as a dependent coordinate, also depends on time $t$ via the perturbed potentials. By the chain rule, we have

(G.3)

\begin{equation} \dot {\tilde {{\boldsymbol{Z}}}} = \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial t} + \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \dot {\bar {\bar {{\boldsymbol{Z}}}}}, \end{equation}

from which it follows that (G.1) and (G.2) imply

(G.4)

\begin{equation} \tilde {{\boldsymbol{\gamma }}} \boldsymbol{\cdot } \left ( \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \dot {\bar {\bar {{\boldsymbol{Z}}}}} \right ) = \bar {\bar {{\boldsymbol{\gamma }}}} \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} \quad \implies \quad \bar {\bar {{\boldsymbol{\gamma }}}} = \left ( \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal \tilde {{\boldsymbol{\gamma }}}. \end{equation}

This results in the following expression for the gyrocentre Lagrange matrix:

(G.5)

\begin{equation} \bar {\bar { \unicode{x1D652}}} = \left ( \frac {\partial \bar {\bar {{\boldsymbol{\gamma }}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal - \frac {\partial \bar {\bar {{\boldsymbol{\gamma }}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} = \left ( \frac {\partial \tilde {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} - \left ( \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal \frac {\partial \tilde {{\boldsymbol{\gamma }}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} = \left ( \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} \right )^\intercal \tilde { \unicode{x1D652}} \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}}, \end{equation}

from which it follows that

(G.6)

\begin{equation} \det \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} = \sqrt {\frac {\det \bar {\bar { \unicode{x1D652}}}}{\det \tilde { \unicode{x1D652}}}}. \end{equation}

From the definition of $\tilde {{\boldsymbol{\gamma }}}$ in (G.1), it follows that

(G.7)

\begin{equation} \det \tilde { \unicode{x1D652}} = \det \begin{pmatrix} q \unicode{x1D63D} & \quad -m \unicode{x1D644}_3\\[5pt] m \unicode{x1D644}_3 & \quad \boldsymbol{\mathsf{0}}_3 \\ \end{pmatrix} = m^6 \end{equation}

and

(G.8)

\begin{equation} \det \bar {\bar { \unicode{x1D652}}} = \det \bar {\bar { \unicode{x1D64E}}} \det \bar {\bar { \unicode{x1D652}}}_{22} = \det \bar {\bar { \unicode{x1D652}}}_{11} \frac {m^2}{q^2} \end{equation}

by the Schur complement determinant formula, where we use the block structure of the Lagrange matrix as discussed in Appendix A. Finally, we note that direct computation shows that

(G.9)

\begin{equation} \det \bar {\bar { \unicode{x1D652}}}_{11} = \det \begin{pmatrix} q \unicode{x1D63D}^{\star } & \quad -m {{\boldsymbol{\hat {b}}}_0}\\[5pt] m {{\boldsymbol{\hat {b}}}_0}^\intercal & \quad 0 \end{pmatrix} = (q m B_\shortparallel ^{\star })^2 \end{equation}

such that

(G.10)

\begin{equation} \det \frac {\partial \tilde {{\boldsymbol{Z}}}}{\partial \bar {\bar {{\boldsymbol{Z}}}}} = \frac {B_\shortparallel ^{\star }}{m}. \end{equation}
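
The determinants in (G.7) and (G.9), and the resulting Jacobian (G.10), can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the matrices $\unicode{x1D63D}$ and $\unicode{x1D63D}^{\star }$ are taken to be the antisymmetric cross-product matrices of ${\boldsymbol{B}}$ and ${\boldsymbol{B}}^{\star }$ (the sign convention does not affect the determinants), the relation (G.8) is taken from the text rather than recomputed, and all numerical values and function names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Laplace expansion along the first row (adequate up to 6x6)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cross_matrix(v):
    """Antisymmetric matrix M with M w = v x w (assumed convention)."""
    x, y, z = v
    return [[0, -z, y], [z, 0, -x], [-y, x, 0]]


q, m = 3, 5                                  # hypothetical charge and mass
B = (2, -7, 4)                               # hypothetical magnetic field
B_star = (2, -7, 4)                          # hypothetical effective field
b0 = (Fraction(3, 5), 0, Fraction(4, 5))     # unit vector with rational entries
B_star_par = sum(b * c for b, c in zip(b0, B_star))

# Physical Lagrange matrix of (G.7): [[q B, -m I], [m I, 0]].
qB = [[q * e for e in row] for row in cross_matrix(B)]
upper = [qB[i] + [-m if i == j else 0 for j in range(3)] for i in range(3)]
lower = [[m if i == j else 0 for j in range(3)] + [0, 0, 0] for i in range(3)]
det_W_tilde = det(upper + lower)

# Upper-left block of the gyrocentre Lagrange matrix, (G.9).
qB_star = [[q * e for e in row] for row in cross_matrix(B_star)]
W11 = [qB_star[i] + [-m * b0[i]] for i in range(3)] + [[m * b for b in b0] + [0]]
det_W11 = det(W11)

# (G.8) is used as given in the text: det W = det W11 * m^2 / q^2.
det_W_bar = det_W11 * Fraction(m ** 2, q ** 2)

# Square of the Jacobian determinant according to (G.6).
jacobian_squared = det_W_bar / det_W_tilde
print(det_W_tilde, det_W11, jacobian_squared)
```

Comparing squares avoids taking a square root of a rational number; (G.10) then corresponds to the positive root when $B_\shortparallel ^{\star } > 0$.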

Appendix H. Proof of Liouville’s theorem

Theorem 5 (Gyrocentre Liouville theorem). The phase-space volume is conserved:

(H.1)

\begin{equation} \frac {\partial \mathfrak{J}_s}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } (\mathfrak{J}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}}) + \frac {\partial }{\partial \bar{\bar{u}}_\shortparallel } (\mathfrak{J}_s \dot {\bar {\bar {U}}}_\shortparallel ) = 0. \end{equation}

Furthermore, integrals of the form (5.5) can be expressed in terms of the initial phase-space coordinates in the following way:

(H.2)

\begin{equation} \int \bar {\bar {{{f}}}}_s \mathcal{F} \mathfrak{J}_s \,\mathrm{d}^6 \bar {\bar {z}} = \int \bar {\bar {{{f}}}}^0_{s}(\bar {\bar {{\boldsymbol{z}}}}^0) \mathcal{F}(\bar {\bar {{\boldsymbol{Z}}}}(t; \bar {\bar {\boldsymbol{z}}}^0, t^0)) \mathfrak{J}_s(\bar {\bar {{\boldsymbol{z}}}}^0, t^0) \,\mathrm{d}^6 \bar {\bar {z}}^0 , \end{equation}

where an absence of arguments implies evaluation at $({\boldsymbol{z}}, t)$.

Proof. The gyrocentre EOMs (4.80) imply that

(H.3a)

\begin{align} \frac {\partial B_{s,\shortparallel }^{\star }}{\partial t} &= \frac {\partial B^{\star }_{1,\shortparallel }}{\partial t}, \end{align}

(H.3b)

\begin{align} \boldsymbol{\nabla }\boldsymbol{\cdot } \big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big) &= \frac {{\boldsymbol{B}}_s^{\star }}{m_s} \boldsymbol{\cdot } \boldsymbol{\nabla }\frac {\partial \bar {\bar {H}}_s}{\partial \bar {\bar {U}}_\shortparallel } + \frac {1}{q_s}\left ( \boldsymbol{\nabla }\bar {\bar {H}}_s + q_s \frac {\partial {\boldsymbol{A}}^{\star }_1}{\partial t} \right ) \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0}) - \frac {\partial B^{\star }_{1,\shortparallel }}{\partial t}, \end{align}

(H.3c)

\begin{align} \frac {\partial }{\partial \bar {\bar {U}}_\shortparallel } \big(B_{s,\shortparallel }^{\star } \dot {\bar {\bar {U}}}_\shortparallel \big) &= -\frac {1}{q_s} (\boldsymbol{\nabla }\times {{\boldsymbol{\hat {b}}}_0}) \boldsymbol{\cdot } \left ( \boldsymbol{\nabla }\bar {\bar {H}}_s + q_s \frac {\partial {\boldsymbol{A}}^{\star }_1}{\partial t} \right ) - \frac {{\boldsymbol{B}}_s^{\star }}{m_s} \boldsymbol{\cdot } \frac {\partial }{\partial \bar {\bar {U}}_\shortparallel } \boldsymbol{\nabla }\bar {\bar {H}}_s, \end{align}

from which (H.1) follows by simply adding the contributions.

We consider the coordinate transformation $\bar{\bar{\boldsymbol{z}}} = \bar {\bar {{\boldsymbol{Z}}}}(t; \bar{\bar{\boldsymbol{z}}}^0, t^0) \mapsto \bar{\bar{\boldsymbol{z}}}^0$ applied to the left-hand side of (H.2) resulting in

(H.4)

\begin{align} &\int \bar {\bar {f}}_s(\bar {\bar {{z}}}, t) \mathcal{F}(\bar {\bar {{z}}}, t) \mathfrak{J}_s(\bar {\bar {{z}}}, t) \,\mathrm{d}^6 \bar {\bar {z}} \nonumber \\ &\quad = \int \bar {\bar {f}}^0_{s}(\bar {\bar {{z}}}^0) \mathcal{F}(\bar {\bar {{\boldsymbol{Z}}}}(t; \bar {\bar {{z}}}^0, t^0)) \mathfrak{J}_s(\bar {\bar {{\boldsymbol{Z}}}}(t; \bar {\bar {{z}}}^0, t^0), t) \mathfrak{H}(\bar {\bar {{\boldsymbol{Z}}}}(t; \bar {\bar {{z}}}^0, t^0), t) \,\mathrm{d}^6 \bar {\bar {z}}^0 \end{align}

by defining the Jacobian

(H.5)

\begin{equation} \mathfrak{H} = \det \frac {\partial \bar {\bar {{\boldsymbol{Z}}}}}{\partial \bar {\bar {{\boldsymbol{z}}}}^0} \end{equation}

and by making use of (5.3). Therefore, we must show that

(H.6)

\begin{equation} \frac {{\mathrm{d}} }{{\mathrm{d}} t} (\mathfrak{J}_s \mathfrak{H}) = 0, \end{equation}

where we note that the product $\mathfrak{J}_s \mathfrak{H}$ is the Jacobian of the coordinate transformation from initial gyrocentre coordinates to physical coordinates.

We note that (H.1) implies

(H.7)

\begin{equation} \dot {\mathfrak{J}}_s = - \mathfrak{J}_s \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}}. \end{equation}
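
The step from (H.1) to (H.7) can be made explicit with the product rule. The sketch below assumes that $\dot {\mathfrak{J}}_s$ denotes the time derivative along a gyrocentre characteristic, that $\mathfrak{J}_s$ is gyro-phase independent, and that the magnetic moment is constant along characteristics, so that only the position and parallel-velocity components contribute:

\begin{equation} \dot {\mathfrak{J}}_s = \frac {\partial \mathfrak{J}_s}{\partial t} + \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \boldsymbol{\nabla }\mathfrak{J}_s + \dot {\bar {\bar {U}}}_\shortparallel \frac {\partial \mathfrak{J}_s}{\partial \bar{\bar{u}}_\shortparallel } = - \mathfrak{J}_s \left ( \boldsymbol{\nabla }\boldsymbol{\cdot } \,\dot {\bar {\bar {\!\boldsymbol{R}}}} + \frac {\partial \dot {\bar {\bar {U}}}_\shortparallel }{\partial \bar{\bar{u}}_\shortparallel } \right ), \end{equation}

where the second equality is (H.1) with the divergence terms expanded. The right-hand side coincides with (H.7) provided the magnetic-moment and gyro-phase components of the characteristics do not contribute to the phase-space divergence, which is consistent with an invariant magnetic moment and a gyro-phase-independent model but is not stated explicitly in this appendix.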

To proceed, we make use of the following identity (for a proof, we refer to Bouchut et al. (2000, Proposition 1.1)):

(H.8)

\begin{equation} \dot {\mathfrak{H}} = \mathfrak{H} \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}}. \end{equation}

When combining this result with (H.7), we find

(H.9)

\begin{equation} \frac {{\mathrm{d}} }{{\mathrm{d}} t} (\mathfrak{J}_s \mathfrak{H}) = \dot {\mathfrak{J}}_s \mathfrak{H} + \mathfrak{J}_s \dot {\mathfrak{H}} = \dot {\mathfrak{J}}_s \mathfrak{H} + \mathfrak{J}_s \mathfrak{H} \frac {\partial }{\partial \bar {\bar {{\boldsymbol{Z}}}}} \boldsymbol{\cdot } \dot {\bar {\bar {{\boldsymbol{Z}}}}} = \dot {\mathfrak{J}}_s \mathfrak{H} - \mathfrak{H} \dot {\mathfrak{J}}_s = 0 . \end{equation}

It follows that

(H.10)

\begin{equation} \mathfrak{J}_s \mathfrak{H} = \mathfrak{J}_s(\bar {\bar {\boldsymbol{z}}}^0, t^0) \mathfrak{H}(\bar {\bar {\boldsymbol{z}}}^0, t^0) = \mathfrak{J}_s(\bar {\bar {\boldsymbol{z}}}^0, t^0), \end{equation}

which, upon substitution in (H.4), results in (H.2).
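
The mechanism behind (H.7) to (H.9) can be illustrated on a toy problem. The sketch below is not the gyrokinetic system: it assumes a hypothetical two-dimensional linear flow $\dot {{\boldsymbol{z}}} = \unicode{x1D63C} {\boldsymbol{z}}$ with a constant matrix $\unicode{x1D63C}$, for which the divergence of the flow is the trace of $\unicode{x1D63C}$. The flow Jacobian is integrated numerically, its determinant plays the role of $\mathfrak{H}$, and a weight obeying the analogue of (H.7) plays the role of $\mathfrak{J}_s$. All names and values are illustrative.

```python
import math


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add_scaled(m, k, h):
    """Return m + h * k elementwise."""
    n = len(m)
    return [[m[i][j] + h * k[i][j] for j in range(n)] for i in range(n)]


def flow_jacobian(A, t_end, steps):
    """Integrate dM/dt = A M with M(0) = I using classical RK4; M is dZ/dz0."""
    n = len(A)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = t_end / steps
    for _ in range(steps):
        k1 = mat_mul(A, M)
        k2 = mat_mul(A, add_scaled(M, k1, h / 2))
        k3 = mat_mul(A, add_scaled(M, k2, h / 2))
        k4 = mat_mul(A, add_scaled(M, k3, h))
        M = [[M[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
              for j in range(n)] for i in range(n)]
    return M


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[0.3, 1.0], [-2.0, -0.1]]       # hypothetical constant velocity gradient
divergence = A[0][0] + A[1][1]       # phase-space divergence of the toy flow
t_end = 1.0

H = det2(flow_jacobian(A, t_end, 400))       # toy analogue of the Jacobian in (H.5)
J0 = 2.5                                     # arbitrary initial weight
J = J0 * math.exp(-divergence * t_end)       # solves dJ/dt = -J * divergence, cf. (H.7)

print(H, math.exp(divergence * t_end), J * H, J0)
```

In this toy setting the determinant grows as the exponential of the integrated divergence, as in (H.8), while the weight decays at the same rate, so their product stays at its initial value, as in (H.10).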

Appendix I. Proof of local energy conservation

Theorem 6 (Local energy conservation). The kinetic energy density (5.74a) satisfies

(I.1)

\begin{equation} \frac {\partial \bar {\bar {\mathcal{K}}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( \sum _s \int \bar {\bar {f}}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \bar {\bar {K}}_s \mathfrak{J}_s\, \mathrm{d}^3 \bar{\bar{u}} \bigg) = \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}, \end{equation}

whereas the potential energy density (5.74b) satisfies Poynting’s theorem,

(I.2)

\begin{equation} \frac {\partial \bar {\bar {\mathcal{U}}}}{\partial t} + \boldsymbol{\nabla }\boldsymbol{\cdot } ({\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}}) = - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}. \end{equation}

Consequently, adding (I.1) and (I.2) yields the local conservation law for the total energy density:

(I.3)

\begin{equation} \frac {\partial }{\partial t} (\bar {\bar {\mathcal{K}}} + \bar {\bar {\mathcal{U}}}) + \boldsymbol{\nabla }\boldsymbol{\cdot } \bigg( {\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}} + \sum _s \int {\bar {\bar {f}}}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \bar {\bar {K}}_s \mathfrak{J}_s \,\mathrm{d}^3 \bar{\bar{u}} \bigg) = 0. \end{equation}

Proof. A direct computation of the partial time derivative of the kinetic energy density (5.74a) per species results in

(I.4)

\begin{align} \frac {\partial \bar {\bar {\mathcal{K}}}_s}{\partial t} &= \int \frac {\partial }{\partial t} ({\bar {\bar {f}}}_s \mathfrak{J}_s) \bar {\bar {K}}_s \,\mathrm{d}^3 {\bar {\bar {u}}} - \int {\bar {\bar {f}}}_s {\bar {\bar {\mu }}} \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{E}}_{1})_\shortparallel \rangle \!\rangle \mathfrak{J}_s\, \mathrm{d}^3 {\bar {\bar {u}}} \nonumber \\& = - \boldsymbol{\nabla }\boldsymbol{\cdot } \int {\bar {\bar {f}}}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \bar {\bar {K}}_s \mathfrak{J}_s\, \mathrm{d}^3 {\bar {\bar {u}}} + \int {\bar {\bar {f}}}_s {\bar {\bar {\mu }}} \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \boldsymbol{\cdot } \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) \mathfrak{J}_s \,\mathrm{d}^3 {\bar {\bar {u}}} \nonumber \\ &\quad + m_s \int {\bar {\bar {f}}}_s \dot {\bar {\bar {U}}}_\shortparallel {\bar {\bar {u}}}_\shortparallel \mathfrak{J}_s\, \mathrm{d}^3 {\bar {\bar {u}}} - \int {\bar {\bar {f}}}_s {\bar {\bar {\mu }}} \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{E}}_{1})_\shortparallel \rangle \!\rangle \mathfrak{J}_s \,\mathrm{d}^3 {\bar {\bar {u}}} \nonumber \\ & = - \boldsymbol{\nabla }\boldsymbol{\cdot } \int {\bar {\bar {f}}}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \bar {\bar {K}}_s \mathfrak{J}_s \,\mathrm{d}^3 {\bar {\bar {u}}} + \frac {1}{m_s} \int {\bar {\bar {f}}}_s {\bar {\bar {\mu }}} \big( {\bar {\bar {u}}}_\shortparallel {\boldsymbol{B}}_s^{\star } - {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{E}}^{\star }_1 \big) \boldsymbol{\cdot } \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) \,\mathrm{d}^3 {\bar {\bar {u}}} \nonumber \\ &\quad + \frac {1}{m_s} \int {\bar {\bar {f}}}_s {\boldsymbol{B}}_s^{\star } 
\boldsymbol{\cdot } \big[ q_s {\boldsymbol{E}}^{\star }_1 - {\bar {\bar {\mu }}} \boldsymbol{\nabla }\big(B_0 + \big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \big) \big] {\bar {\bar {u}}}_\shortparallel \,\mathrm{d}^3 {\bar {\bar {u}}}\nonumber\\ & \quad - \int {\bar {\bar {f}}}_s {\bar {\bar {\mu }}} \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{E}}_{1})_\shortparallel \rangle \!\rangle \mathfrak{J}_s \,\mathrm{d}^3 {\bar {\bar {u}}} \nonumber \\ & = - \boldsymbol{\nabla }\boldsymbol{\cdot } \int {\bar {\bar {f}}}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \bar {\bar {K}}_s \mathfrak{J}_s \mathrm{d}^3 {\bar {\bar {u}}} + q_s \int {\bar {\bar {f}}}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}} \boldsymbol{\cdot } {\boldsymbol{E}}^{\star }_1 \mathfrak{J}_s \mathrm{d}^3 {\bar {\bar {u}}} - \int {\bar {\bar {f}}}_s {\bar {\bar {\mu }}} \langle \!\langle (\mathring {\boldsymbol{\nabla} }^\varsigma \times {\boldsymbol{E}}_{1})_\shortparallel \rangle \!\rangle \mathfrak{J}_s \,\mathrm{d}^3 {\bar {\bar {u}}} \end{align}

upon substitution of Faraday’s law (5.21), the conservative form of the Vlasov equation (5.10) and the gyrocentre EOMs (5.23). It follows that the total kinetic energy density evolves according to (I.1) upon substitution of the definition of the free current density (5.39b) and summation over the species $s$.

For computing the partial time derivative of the potential energy density (5.74b), we note that

(I.5)

\begin{align} \frac {\partial }{\partial t} ({\boldsymbol{\mathcal{D}}} \boldsymbol{\cdot } {\boldsymbol{E}}) &= 2\frac {\partial {\boldsymbol{\mathcal{D}}}}{\partial t} \boldsymbol{\cdot } {\boldsymbol{E}}\nonumber\\&\quad + \sum _s \frac {m_s}{B_0^2} \int {\bar {\bar {f}}}{}^0_{\!s} {\bar {\bar {u}}}_\shortparallel \left [ ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1) \boldsymbol{\cdot } \frac {\partial {\boldsymbol{E}}}{\partial t} - \left ( {{\boldsymbol{\hat {b}}}_0} \times \frac {\partial {\boldsymbol{B}}_1}{\partial t} \right ) \boldsymbol{\cdot } {\boldsymbol{E}} \right ] \mathfrak{J}_{0,s} \mathrm{d}^3 {\bar {\bar {u}}}\end{align}

and similarly

(I.6)

\begin{equation} \frac {\partial }{\partial t} ({\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot } {\boldsymbol{B}}) = 2{\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot } \frac {\partial {\boldsymbol{B}}}{\partial t} + \frac {\partial {\boldsymbol{B}}}{\partial t} \boldsymbol{\cdot } {\boldsymbol{\mathcal{M}}}_1 - \frac {\partial {\boldsymbol{\mathcal{M}}}_1}{\partial t} \boldsymbol{\cdot } {\boldsymbol{B}}. \end{equation}

Upon substitution of the strong form of the Ampère–Maxwell law (5.44d) as well as Faraday’s law (5.44c), we find that

(I.7)

\begin{equation} \frac {\partial {\boldsymbol{\mathcal{D}}}}{\partial t} \boldsymbol{\cdot } {\boldsymbol{E}} + {\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot } \frac {\partial {\boldsymbol{B}}}{\partial t} = (\boldsymbol{\nabla }\times {\boldsymbol{\mathcal{H}}} - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}) \boldsymbol{\cdot } {\boldsymbol{E}} - {\boldsymbol{\mathcal{H}}} \boldsymbol{\cdot } (\boldsymbol{\nabla }\times {\boldsymbol{E}}) = - \boldsymbol{\nabla }\boldsymbol{\cdot } ({\boldsymbol{E}} \times {\boldsymbol{\mathcal{H}}}) - \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} \boldsymbol{\cdot } {\boldsymbol{E}}. \end{equation}

It remains to be shown that the sum of the remaining terms from (I.5) and (I.6) vanishes. This sum can be written as

(I.8)

\begin{align} \sum _s \frac {m_s}{B_0^2} \int {\bar {\bar {f}}}{}^0_{\!s} \Biggl ( & {\bar {\bar {u}}}_\shortparallel \left [ ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1) \boldsymbol{\cdot } \frac {\partial {\boldsymbol{E}}_1}{\partial t} - \left ( {{\boldsymbol{\hat {b}}}_0} \times \frac {\partial {\boldsymbol{B}}_1}{\partial t} \right ) \boldsymbol{\cdot } {\boldsymbol{E}}_1 \right ] \nonumber \\ & - \frac {\partial {\boldsymbol{B}}_1}{\partial t} \boldsymbol{\cdot } \left [ {\bar {\bar {u}}}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{E}}_{1,\perp } + \left ( \frac {{\bar {\bar {\mu }}} B_0}{m_s} - {\bar {\bar {u}}}_\shortparallel ^2 \right ) {\boldsymbol{B}}_{1,\perp } \right ] \nonumber \\ & + \frac {\partial }{\partial t} \left [ {\bar {\bar {u}}}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{E}}_{1,\perp } + \left ( \frac {{\bar {\bar {\mu }}} B_0}{m_s} - {\bar {\bar {u}}}_\shortparallel ^2 \right ) {\boldsymbol{B}}_{1,\perp } \right ] \boldsymbol{\cdot } {\boldsymbol{B}}_1 \Biggr ) \mathfrak{J}_{0,s} \,\mathrm{d}^3 {\bar {\bar {u}}} \end{align}

upon substitution of the magnetisation and polarisation as defined in (5.34) and (5.29), respectively. Indeed, the sum vanishes and, therefore, (I.2) also holds.
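
The cancellation in (I.8) can be seen term by term. As a sketch, assume that ${{\boldsymbol{\hat {b}}}_0}$ and $B_0$ are independent of time and that ${\bar {\bar {u}}}_\shortparallel$ and ${\bar {\bar {\mu }}}$ are integration variables, which is how they enter (I.8). The cyclic property of the scalar triple product, together with the fact that ${{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{S}}$ involves only the perpendicular part of ${\boldsymbol{S}}$, then gives

\begin{align} \left ( {{\boldsymbol{\hat {b}}}_0} \times \frac {\partial {\boldsymbol{E}}_{1,\perp }}{\partial t} \right ) \boldsymbol{\cdot } {\boldsymbol{B}}_1 &= - ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{B}}_1) \boldsymbol{\cdot } \frac {\partial {\boldsymbol{E}}_1}{\partial t}, \nonumber \\ \frac {\partial {\boldsymbol{B}}_1}{\partial t} \boldsymbol{\cdot } ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{E}}_{1,\perp }) &= - \left ( {{\boldsymbol{\hat {b}}}_0} \times \frac {\partial {\boldsymbol{B}}_1}{\partial t} \right ) \boldsymbol{\cdot } {\boldsymbol{E}}_1, \nonumber \\ \frac {\partial {\boldsymbol{B}}_{1,\perp }}{\partial t} \boldsymbol{\cdot } {\boldsymbol{B}}_1 &= \frac {\partial {\boldsymbol{B}}_1}{\partial t} \boldsymbol{\cdot } {\boldsymbol{B}}_{1,\perp }. \end{align}

With these identities, the two terms proportional to ${\bar {\bar {u}}}_\shortparallel$ on the first line of (I.8) cancel against the ${\bar {\bar {u}}}_\shortparallel$ terms on the third and second lines, respectively, and the terms proportional to ${\bar {\bar {\mu }}} B_0 / m_s - {\bar {\bar {u}}}_\shortparallel ^2$ on the second and third lines cancel each other.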

Appendix J. Derivation of the Brizard–Hahm Hamiltonian

The aim is to compute the ZLR approximation of the second-order Hamiltonian as found by Brizard & Hahm (2007, (173)). Their first- and second-order Hamiltonians are given by

(J.1a)

\begin{equation} \bar {\bar {H}}_1^{\mathrm{BH}} = q \langle {\psi }_1^{\mathrm{BH}} \rangle , \quad {\psi }_1^{\mathrm{BH}} \mathrel {\mathop :}= \mathring {\phi }_1 - \bar {\bar {U}}_\shortparallel \widetilde {A}_{1,\shortparallel } - \bar {\bar {U}}_\tau \mathring {A}_{1,\tau } \end{equation}

and

(J.1b)

\begin{equation} \bar {\bar {H}}_2^{\mathrm{BH}} = - \frac {q}{2} \Big\langle \Big\{{\bar {\bar {S}}}_1^{\mathrm{BH}}, \widetilde {\psi }_1^{\mathrm{BH}}\Big\}_{0} \Big\rangle + \frac {q^2}{2m} [ \langle \lvert \mathring {{\boldsymbol{A}}}_{1,\perp } \rvert ^2 \rangle + \langle (\widetilde {A}_{1,\shortparallel })^2 \rangle ] + \frac {q}{B_0} \langle \mathring {{\boldsymbol{A}}}_{1,\perp } \rangle \boldsymbol{\cdot } \big({{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla }\big\langle {\psi }_1^{\mathrm{BH}} \big\rangle \big), \end{equation}

respectively. The first-order generating function $\bar {\bar {S}}_1^{\mathrm{BH}}$ again satisfies (4.53) to zeroth order in $\varepsilon _\omega$.

J.1. ZLR approximation of the Poisson bracket

We use the following shorthand notation for the components of a vector field $\boldsymbol{S}$ in terms of the local coordinates ${\boldsymbol{\hat {e}}}_1, {\boldsymbol{\hat {e}}}_2$ :

(J.2)

\begin{equation} S_{\hat {i}} \mathrel {\mathop :}= {\boldsymbol{S}} \boldsymbol{\cdot } {\boldsymbol{\hat {e}}}_i \end{equation}

as well as the corresponding directional derivative of a scalar function

(J.3)

\begin{equation} \partial _{\hat {i}} Q \mathrel {\mathop :}= {\boldsymbol{\hat {e}}}_i \boldsymbol{\cdot } \boldsymbol{\nabla }Q. \end{equation}

Given the first-order generating function $\bar {\bar {S}}_1^{\mathrm{BH}}$, we approximate the leading-order term in $\varepsilon _\perp$ of the gyro-averaged guiding-centre Poisson bracket (3.47) while neglecting $O(\varepsilon _\omega )$ and $O(\varepsilon _B)$ terms. This results in

(J.4)

\begin{equation} \Big\langle \Big\{{\bar {\bar {S}}}_1^{\mathrm{BH}}, \widetilde {\psi }_1^{\mathrm{BH}}\Big\}_{0} \Big\rangle = \underbrace {\frac {q}{B_0} \left \langle \frac {\partial }{\partial \bar {\bar {{M}}}} \big(\widetilde {\psi }_1^{\mathrm{BH}}\big)^2 \right \rangle }_{\unicode{x24B6}} - \underbrace {\frac {{{\boldsymbol{\hat {b}}}_0}}{q B_0}\boldsymbol{\cdot } \Big\langle \boldsymbol{\nabla} _\perp {\bar {\bar {S}}}_1^{\mathrm{BH}} \times \boldsymbol{\nabla} _\perp \widetilde {\psi }_1^{\mathrm{BH}} \Big\rangle }_{\unicode{x24B7}} + O(\varepsilon _\perp ^3). \end{equation}

Application of (D.4) to $\widetilde {\psi }_1^{\mathrm{BH}}$ results in

(J.5)

\begin{align} \widetilde {\psi }_1^{\mathrm{BH}} = {} & - \frac {B_0^2}{m} {\boldsymbol{\rho }} \boldsymbol{\cdot } {\boldsymbol{Q}}_{1} - \rho \bar {\bar {U}}_\tau {\boldsymbol{\hat {\rho }}} \boldsymbol{\cdot } \boldsymbol{\nabla} _\perp {A_{1,\tau }} - \frac {\rho \bar {\bar {U}}_\tau }{2} (\boldsymbol{\nabla }\times {\boldsymbol{A}}_{1,\perp })_\shortparallel - \frac {\rho ^2 \bar {\bar {U}}_\tau }{2} ({\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}}) \boldsymbol{:} ( \unicode{x1D643} {A_{1,\tau }}) \nonumber \\ & + \frac {\rho ^2}{2} ({\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\rho }}}) \boldsymbol{:} \big[ \unicode{x1D643} \big(\phi _1 - \bar {\bar {U}}_\shortparallel A_{1,\shortparallel }\big)\big] - \frac {\rho ^2}{2} \boldsymbol{\nabla} _\perp ^2 \big(\phi _1 - \bar {\bar {U}}_\shortparallel A_{1,\shortparallel }\big) + O(\varepsilon _\perp ^3), \end{align}

which shows that the first term can be written as

(J.6)

where we made use of the fact that integration over the interval $[0, 2\pi ]$ of odd powers of cosine and/or sine functions yields zero, and we have defined the vector ${\boldsymbol{Q}}_{1}$ as

(J.7)

\begin{equation} {\boldsymbol{Q}}_{1} \mathrel {\mathop :}= -\frac {m}{B_0^2} \big[ \boldsymbol{\nabla} _\perp \big(\phi _1 - \bar {\bar {U}}_\shortparallel A_{1,\shortparallel }\big) + \omega _{\mathrm{c}} {\boldsymbol{A}}_{1,\perp } \times {{\boldsymbol{\hat {b}}}_0} \big]. \end{equation}

Moreover, we used the fact that the ${\boldsymbol{A}}_{1,\perp }$ term in ${\boldsymbol{Q}}_{1}$ is the only $O(\varepsilon _\perp ^0)$ term of $\widetilde {\psi }_1^{\mathrm{BH}}$ and, therefore, the only term that remains upon multiplication by the $\rho ^2 ( \unicode{x1D643} A_{1,\tau })$ term, when neglecting $O(\varepsilon _\perp ^3)$ terms and after integration over $\theta$ . Recall that the outer product is defined in (B.2).

The first contribution can be evaluated as follows:

(J.8)

where we made use of (D.5b). The second contribution can trivially be evaluated as

(J.9)

whereas the third contribution yields

(J.10)

by making use of

(J.11)

\begin{equation} \langle {\boldsymbol{\hat {\rho }}} \otimes {\boldsymbol{\hat {\tau }}} \rangle = \frac {1}{2} ({\boldsymbol{\hat {e}}}_2 \otimes {\boldsymbol{\hat {e}}}_1 - {\boldsymbol{\hat {e}}}_1 \otimes {\boldsymbol{\hat {e}}}_2). \end{equation}

The fourth contribution can be written as

(J.12)

where we make use of

(J.13)

\begin{equation} \langle \sin ^4\theta \rangle = \langle \cos ^4\theta \rangle = \frac {3}{8} , \quad \langle \sin ^2\theta \cos ^2\theta \rangle = \frac {1}{8} \end{equation}

and define

(J.14)

\begin{equation} \unicode{x1D642}_\perp \mathrel {\mathop :}= \begin{pmatrix} \partial _{\hat {1}} A_{1,\hat {1}} & \quad \partial _{\hat {2}} A_{1,\hat {1}}\\[4pt] \partial _{\hat {1}} A_{1,\hat {2}} & \quad \partial _{\hat {2}} A_{1,\hat {2}} \end{pmatrix}. \end{equation}
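The gyro-averages in (J.13), together with the vanishing of averages of odd powers of sine and cosine, can be checked numerically. The following is a minimal Python sketch; the helper name `gyro_average` and the choice of the periodic trapezoidal rule (which is exact for low-degree trigonometric polynomials) are illustrative assumptions and not part of the derivation.

```python
import math


def gyro_average(func, n=64):
    """Average func(theta) over one gyro-period with the periodic trapezoidal rule."""
    h = 2.0 * math.pi / n
    return sum(func(i * h) for i in range(n)) / n


# Even powers appearing in (J.13)
sin4 = gyro_average(lambda t: math.sin(t) ** 4)
cos4 = gyro_average(lambda t: math.cos(t) ** 4)
sin2cos2 = gyro_average(lambda t: math.sin(t) ** 2 * math.cos(t) ** 2)

# An odd combination, which should average to zero
odd = gyro_average(lambda t: math.sin(t) ** 3 * math.cos(t) ** 2)

print(sin4, cos4, sin2cos2, odd)
```

The first two values should agree with $3/8$ and the third with $1/8$ up to rounding, while the last should vanish up to rounding.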

Finally, the fifth contribution can be written as

(J.15)

Since the $\unicode{x24B7}$ term in (J.4) already has a factor $\varepsilon _\perp ^2$, we need only keep the zeroth-order term of $\widetilde {\psi }_1^{\mathrm{BH}}$ (cf. (J.5)) to obtain an $O(\varepsilon _\perp ^3)$ approximation

(J.16)

\begin{equation} \widetilde {\psi }_1^{\mathrm{BH}} = \bar {\bar {U}}_\tau A_{1,\tau } + O(\varepsilon _\perp ) \end{equation}

and similarly for $\bar {\bar {S}}_1^{\mathrm{BH}}$, so that the second term in (J.4) becomes

(J.17)

\begin{equation} \unicode{x24B7} = -\frac {\bar {\bar {U}}_\tau ^2}{B_0 \omega _{\mathrm{c}}} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \left ( \boldsymbol{\nabla} _\perp A_{1,\hat {1}} \times \boldsymbol{\nabla} _\perp A_{1,\hat {2}} \right ) + O(\varepsilon _\perp ^3) = -\frac {2 \bar {\bar {{M}}}}{q B_0} \det \unicode{x1D642}_\perp + O(\varepsilon _\perp ^3) . \end{equation}

By making use of

(J.18)

\begin{equation} \frac {1}{2} \boldsymbol{\nabla} _\perp ^2 \lvert {\boldsymbol{A}}_{1,\perp } \rvert ^2 = \lvert \boldsymbol{\nabla} _\perp {\boldsymbol{A}}_{1,\perp } \rvert ^2 + {\boldsymbol{A}}_{1,\perp } \boldsymbol{\cdot } \boldsymbol{\nabla} _\perp ^2 {\boldsymbol{A}}_{1,\perp }, \end{equation}

we then find that the contribution to $\bar {\bar {H}}_2^{\mathrm{BH}}$ due to the first-order generating function can be approximated by (neglecting $O(\varepsilon _\perp ^3)$ terms)

(J.19)

\begin{equation} \Big\langle \Big\{{\bar {\bar {S}}}_1^{\mathrm{BH}}, \widetilde {\psi }_1^{\mathrm{BH}}\Big\}_{0} \Big\rangle = \frac {B_0^2}{q m} \lvert {\boldsymbol{Q}}_{1} \rvert ^2 + \frac {\bar {\bar {{M}}}}{q B_0} \left [ \frac {1}{2} \boldsymbol{\nabla} _\perp ^2 \lvert {\boldsymbol{A}}_{1,\perp } \rvert ^2 - 2 {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } ({\boldsymbol{A}}_{1,\perp } \times \boldsymbol{\nabla }[ (\boldsymbol{\nabla }\times {\boldsymbol{A}}_{1,\perp })_\shortparallel ]) \right ]. \end{equation}

Note that, in principle, a high-frequency approximation of the first-order generating function $\bar {\bar {S}}_1^{\mathrm{BH}}$ can also be considered, such as in the work of Qin et al. (1999).

J.2. ZLR approximation of the Hamiltonian

The remaining gyro-averaged terms in $\bar {\bar {H}}_2^{\mathrm{BH}}$ can be approximated in the ZLR limit as follows:

(J.20)

\begin{align} \frac {q^2}{2m} \langle (\widetilde {A}_{1,\shortparallel })^2 \rangle &= \frac {\bar {\bar {{M}}}}{2 B_0} \lvert \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \rvert ^2 + O(\varepsilon _\perp ^4), \end{align}

(J.21)

\begin{align} \frac {q^2}{2m} \langle \lvert \mathring {{\boldsymbol{A}}}_{1,\perp } \rvert ^2 \rangle &= \frac {q^2}{2m} \lvert {\boldsymbol{A}}_{1,\perp } \rvert ^2 + \frac {\bar {\bar {{M}}}}{4 B_0} \boldsymbol{\nabla} _\perp ^2 \lvert {\boldsymbol{A}}_{1,\perp } \rvert ^2 + O(\varepsilon _\perp ^4), \end{align}

(J.22)

\begin{align} \langle \mathring {{\boldsymbol{A}}}_{1,\perp } \rangle &= {\boldsymbol{A}}_{1,\perp } + \frac {\rho ^2}{4} \boldsymbol{\nabla} _\perp ^2 {\boldsymbol{A}}_{1,\perp } + O(\varepsilon _\perp ^4) \end{align}

by making use of (D.4). Moreover, we note that

(J.23)

\begin{equation} \langle \mathring {\psi }_1^{\mathrm{BH}} \rangle = \phi _1 - \bar {\bar {U}}_\shortparallel A_{1,\shortparallel } + \frac {\rho ^2}{4} \boldsymbol{\nabla} _\perp ^2 (\phi _1 - \bar {\bar {U}}_\shortparallel A_{1,\shortparallel }) + \frac {\bar {\bar {{M}}}}{q} (\boldsymbol{\nabla }\times {\boldsymbol{A}}_1)_\shortparallel + O(\varepsilon _\perp ^3) \end{equation}

by making use of (D.7) and (D.4). Combining the ZLR approximations results in the following ZLR approximation of the second-order Hamiltonian:

(J.24)

\begin{equation} \bar {\bar {H}}_2^{\mathrm{BH}} = - \frac {B_0^2}{2 m} \lvert {\boldsymbol{Q}}_{1} \rvert ^2 + \frac {\bar {\bar {{M}}}}{2 B_0} \lvert \boldsymbol{\nabla} _\perp A_{1,\shortparallel } \rvert ^2 + \frac {q^2}{2m} \lvert {\boldsymbol{A}}_{1,\perp } \rvert ^2 + \frac {q}{B_0} {\boldsymbol{A}}_{1,\perp } \boldsymbol{\cdot } [{{\boldsymbol{\hat {b}}}_0} \times \boldsymbol{\nabla} _\perp (\phi _1 - \bar {\bar {U}}_\shortparallel A_{1,\shortparallel })] , \end{equation}

which can be simplified to (7.14b).
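The structure of the ZLR expansion (J.22) can be illustrated on a single plane wave. Assuming a perpendicular wavenumber $k_\perp$, the gyro-average multiplies the amplitude by $J_0(k_\perp \rho)$ while $\boldsymbol{\nabla}_\perp^2$ acts as $-k_\perp^2$, so that (J.22) amounts to $J_0(x) = 1 - x^2/4 + O(x^4)$ with $x = k_\perp \rho$. The Python sketch below checks the order of the remainder; the function names are hypothetical and the Bessel function is evaluated from its standard power series.

```python
import math


def bessel_j0(x, terms=30):
    """Power series of J_0(x): sum over m of (-1)^m (x/2)^(2m) / (m!)^2."""
    return sum((-1) ** m * (x / 2.0) ** (2 * m) / math.factorial(m) ** 2
               for m in range(terms))


def zlr_j0(x):
    """Plane-wave symbol of the ZLR truncation 1 + (rho^2/4) nabla_perp^2."""
    return 1.0 - x * x / 4.0


ratios = []
for x in (0.4, 0.2, 0.1):
    err = abs(bessel_j0(x) - zlr_j0(x))
    ratios.append(err / x ** 4)  # stays bounded if the remainder is O(x^4)
    print(x, err, ratios[-1])
```

The ratio `err / x**4` approaches the coefficient $1/64$ of the next term in the series of $J_0$, consistent with an $O(\varepsilon_\perp^4)$ remainder.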

Appendix K. Well-posedness of the saddle-point problem

Here, we briefly, and rather informally, discuss the well-posedness of the saddle-point problem given by (6.16) under the simplifying assumptions of a constant background magnetic field, an isotropic pressure $p_{0,\perp } = p_{0,\shortparallel }$ as well as a spatially constant coefficient $\mathcal{C}(1)$ . Therefore, the following system of equations is considered:

(K.1a)

\begin{align} \boldsymbol{\nabla }\times (\boldsymbol{\nabla }\times {\boldsymbol{A}}) - \boldsymbol{\nabla} _\perp \lambda &= {\boldsymbol{\mathcal{J}}}, \end{align}

(K.1b)

\begin{align} \boldsymbol{\nabla} _\perp \boldsymbol{\cdot } {\boldsymbol{A}} &= 0, \end{align}

where $\boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}} = 0$ and with appropriate boundary conditions. Assume that $\lambda = 0$ and that $\boldsymbol{A}$ satisfies

(K.2a)

\begin{align} -\boldsymbol{\nabla} _\perp ^2 A_\shortparallel &= \mathcal{J}_\shortparallel , \end{align}

(K.2b)

\begin{align} -\boldsymbol{\nabla} ^2 {\boldsymbol{A}}_\perp &= {\boldsymbol{\mathcal{J}}}_\perp - \boldsymbol{\nabla} _\perp ({{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }A_\shortparallel ), \end{align}

then we can show that $(\lambda , {\boldsymbol{A}})$ also satisfies (K.1) provided that $\varepsilon _B = 0$. As each of the PDEs in (K.2) is well-posed when provided with appropriate boundary conditions, we find that (K.1) is well-posed as well.

The equivalence can be shown as follows. Assume that $\boldsymbol{A}$ solves (K.2). Computing the divergence of (K.2b) results in

(K.3)

\begin{equation} -\boldsymbol{\nabla} ^2 (\boldsymbol{\nabla} _\perp \boldsymbol{\cdot } {\boldsymbol{A}}) = \boldsymbol{\nabla} _\perp \boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}} - {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }(\boldsymbol{\nabla} _\perp ^2 A_\shortparallel ) = \boldsymbol{\nabla} _\perp \boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}} + {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }(\mathcal{J}_\shortparallel ) = \boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}} = 0 \end{equation}

upon substitution of (K.2a). If homogeneous Dirichlet boundary conditions are ‘included’ in the Laplace operator, then this implies that (K.1b) holds. Finally, we add ${\boldsymbol{\hat {b}}}_0$ times (K.2a) to (K.2b), resulting in

(K.4)

\begin{equation} {{\boldsymbol{\hat {b}}}_0} ({{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla })^2 A_\shortparallel - \boldsymbol{\nabla} ^2 {\boldsymbol{A}} = {\boldsymbol{\mathcal{J}}} - \boldsymbol{\nabla} _\perp ({{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }A_\shortparallel ). \end{equation}

Rearranging the terms and making use of (K.1b), we find that

(K.5)

\begin{equation} \boldsymbol{\nabla }(\boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{A}}) - \boldsymbol{\nabla} ^2 {\boldsymbol{A}} = {\boldsymbol{\mathcal{J}}}, \end{equation}

which shows that $\boldsymbol{A}$ satisfies (K.1a).
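The equivalence argument can be illustrated for a single Fourier mode. The Python sketch below assumes ${\boldsymbol{\hat {b}}}_0 = {\boldsymbol{\hat {e}}}_z$, a plane-wave dependence $\mathrm{e}^{\mathrm{i} {\boldsymbol{k}} \boldsymbol{\cdot} {\boldsymbol{r}}}$ (so that $\boldsymbol{\nabla} \mapsto \mathrm{i} {\boldsymbol{k}}$) and an arbitrary divergence-free current amplitude; all function names and numerical values are hypothetical. It solves the algebraic form of (K.2) and then evaluates the residuals of (K.1a) with $\lambda = 0$ and of (K.1b).

```python
def cross(a, b):
    """Cross product of two 3-vectors (complex entries allowed, no conjugation)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def make_divergence_free(k, j):
    """Remove the component of j along k so that k . j = 0."""
    k2 = sum(c * c for c in k)
    kj = sum(kc * jc for kc, jc in zip(k, j))
    return [jc - kc * kj / k2 for kc, jc in zip(k, j)]


def solve_k2(k, j):
    """Solve the Fourier-space form of (K.2), assuming b0 = e_z."""
    kx, ky, kz = k
    k_perp2 = kx * kx + ky * ky
    k2 = k_perp2 + kz * kz
    a_par = j[2] / k_perp2               # (K.2a): k_perp^2 A_par = J_par
    a_x = (j[0] + kx * kz * a_par) / k2  # (K.2b): k^2 A_perp = J_perp + k_perp k_par A_par
    a_y = (j[1] + ky * kz * a_par) / k2
    return [a_x, a_y, a_par]


def residuals(k, j):
    """Residuals of (K.1a) with lambda = 0 and of (K.1b) for the solution of (K.2)."""
    a = solve_k2(k, j)
    curl_curl = [-c for c in cross(k, cross(k, a))]  # curl curl -> -k x (k x .)
    res_k1a = max(abs(c - jc) for c, jc in zip(curl_curl, j))
    res_k1b = abs(k[0] * a[0] + k[1] * a[1])         # k_perp . A
    return res_k1a, res_k1b


k = [0.7, -0.3, 1.1]
j = make_divergence_free(k, [1.0 + 2.0j, -0.5j, 0.3])
res_k1a, res_k1b = residuals(k, j)
print(res_k1a, res_k1b)
```

Both residuals should vanish up to rounding. If the current amplitude is not projected onto its divergence-free part, the residual of (K.1b) no longer vanishes, which reflects the role of $\boldsymbol{\nabla }\boldsymbol{\cdot } {\boldsymbol{\mathcal{J}}} = 0$ in (K.3).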

Appendix L. Susceptibility tensor in a slab

For the computation of the susceptibility tensor, we require a linearisation of the proposed model under a Fourier ansatz. In particular, we require the Fourier component of the linearised distribution function, the free current density, the polarisation current density and finally the magnetisation current density. Once each of these Fourier components is known, we combine the results into the gyrokinetic susceptibility tensor, and we compute the drift kinetic limit.

L.1. Fourier component of the linearised distribution function

We need a closed-form expression for $\delta \bar {\bar {f}}_{s}$ , which can be obtained upon substitution of the Fourier ansatz

(L.1)

\begin{equation} \delta {\bar {\bar {f}}}_{s} = \widehat {\delta {\bar {\bar {f}}}_{s}}({\bar {\bar {u}}}_\shortparallel , \bar {\bar {\mu }}) \mathrm{e}^{\mathrm{i} ({\boldsymbol{k}} \boldsymbol{\cdot } \bar {\bar {\boldsymbol{r}}} - \omega t)} \end{equation}

into the linearised Vlasov equation (cf. (5.4))

(L.2)

\begin{equation} \frac {\partial \delta {\bar {\bar {f}}}_{s}}{\partial t} + {\bar {\bar {u}}}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \boldsymbol{\nabla }\delta {\bar {\bar {f}}}_{s} - \frac {1}{m_s} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \big( \mu \boldsymbol{\nabla }\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle - q_s \langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle \big) \frac {\partial {\bar {\bar {f}}}{}^0_{\!s}}{\partial {\bar {\bar {u}}}_\shortparallel } = 0, \end{equation}

where we have made use of

(L.3)

\begin{equation} \big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big)_0 = {\boldsymbol{B}}_0 {\bar {\bar {u}}}_\shortparallel , \quad \big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big)_1 = {\bar {\bar {u}}}_\shortparallel \langle \,\mathring {\!{\boldsymbol{B}}}_{1} \rangle + \left ( \langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle - \frac {\bar {\bar {\mu }}}{q_s} \boldsymbol{\nabla }\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle \right ) \times {{\boldsymbol{\hat {b}}}_0} \end{equation}

using the gyrocentre EOM (5.24), (4.74) and (4.72b) as well as

(L.4)

\begin{equation} \big(B_{s,\shortparallel }^{\star } \dot {\bar {\bar {U}}}_\shortparallel \big)_0 = 0 , \quad \big(B_{s,\shortparallel }^{\star } \dot {\bar {\bar {U}}}_\shortparallel \big)_1 = -\frac {1}{m} {\boldsymbol{B}}_0 \boldsymbol{\cdot } \big( \mu \boldsymbol{\nabla }\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle - q_s \langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle \big), \end{equation}

which follows from substituting (4.76) and (5.20) into the gyrocentre EOM (4.80b) and having assumed $\boldsymbol{\nabla }{f}^0_{s} = {\boldsymbol{0}}_3$. Subsequent substitution of the Fourier ansatzes (L.1) and (7.26) (and similarly for the magnetic field) results in

(L.5)

\begin{equation} - \mathrm{i} \omega \widehat {\delta {\bar {\bar {f}}}_{s}} + \mathrm{i} {\bar {\bar {u}}}_\shortparallel {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } {\boldsymbol{k}} \widehat {\delta {\bar {\bar {f}}}_{s}} - \frac {1}{m_s} {{\boldsymbol{\hat {b}}}_0} \boldsymbol{\cdot } \Big( \mathrm{i} \mu {\boldsymbol{k}} \widehat {\big\langle \kern-0.7pt\big\langle \mathring {B}^\varsigma _{1,\shortparallel } \big\rangle \kern-0.8pt\big\rangle } - q_s \widehat {\langle \,\mathring {\!{\boldsymbol{E}}}_1 \rangle } \Big) \frac {\partial {\bar {\bar {f}}}{}^0_{\!s}}{\partial {\bar {\bar {u}}}_\shortparallel } = 0. \end{equation}

To proceed, we must express the Fourier component of the gyro-average of a function in terms of the Fourier component of that function itself. We write the wave vector $\boldsymbol{k}$ in terms of its parallel and perpendicular components as

(L.6)

\begin{equation} {\boldsymbol{k}} = k_{\shortparallel } {{\boldsymbol{\hat {b}}}_0} + k_{\perp }({\boldsymbol{\hat {e}}}_1 \cos \alpha + {\boldsymbol{\hat {e}}}_2 \sin \alpha ) \end{equation}

for some angle $\alpha$ . It follows that the gyro-average of a function $Q$

(L.7)

\begin{equation} Q({\boldsymbol{r}}) = \int \widehat {Q}({\boldsymbol{k}}) \mathrm{e}^{\mathrm{i}{\boldsymbol{k}} \boldsymbol{\cdot } {\boldsymbol{r}}} \,\mathrm{d}^3 k \end{equation}

can be expressed in terms of the Fourier component in the following way:

(L.8)

\begin{equation} \langle \mathring {Q} \rangle ({\boldsymbol{r}}) = \frac {1}{2\pi } \int \widehat {Q} \mathrm{e}^{\mathrm{i}{\boldsymbol{k}} \boldsymbol{\cdot } {\boldsymbol{r}}} \int _0^{2\pi } \mathrm{e}^{\mathrm{i} k_{\perp } \rho \cos \theta }\, \mathrm{d} \theta\, \mathrm{d}^3 k = \int \widehat {Q} \mathrm{e}^{\mathrm{i}{\boldsymbol{k}} \boldsymbol{\cdot } {\boldsymbol{r}}} J_0\, \mathrm{d}^3 k \end{equation}

and, therefore,

(L.9)

\begin{equation} \widehat {\langle \mathring {Q} \rangle } = J_0 \widehat {Q}, \end{equation}

where $J_n$ denotes the $n$th-order Bessel function of the first kind evaluated at $k_{\perp } \rho$

(L.10)

\begin{equation} J_n = \frac {1}{2\pi \mathrm{i}^n} \int _0^{2\pi } \mathrm{e}^{\mathrm{i} k_{\perp } \rho \cos \theta } \mathrm{e}^{\mathrm{i} \theta n} \,\mathrm{d} \theta . \end{equation}
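The integral representation (L.10), and with it the factor $J_0$ in (L.9), can be checked numerically against the standard power series of the Bessel functions. The Python sketch below is illustrative only: the function names, the sample argument and the use of the periodic trapezoidal rule are assumptions made for this check.

```python
import cmath
import math


def bessel_series(n, x, terms=40):
    """Power series of J_n(x) for integer n >= 0."""
    return sum((-1) ** m * (x / 2.0) ** (2 * m + n)
               / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))


def bessel_integral(n, x, samples=256):
    """Evaluate the right-hand side of (L.10) with the periodic trapezoidal rule."""
    h = 2.0 * math.pi / samples
    total = sum(cmath.exp(1j * x * math.cos(i * h)) * cmath.exp(1j * n * i * h)
                for i in range(samples))
    return total * h / (2.0 * math.pi * 1j ** n)


x = 1.7  # sample value of k_perp * rho
differences = [abs(bessel_integral(n, x) - bessel_series(n, x)) for n in (0, 1, 2)]
print(differences)
```

For $n = 0$ the integral is the gyro-average of a plane wave, which is the factor appearing in (L.8).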

A similar computation for the disc average, (4.49), yields

(L.11)

\begin{equation} \widehat {\langle \!\langle \mathring {Q}^\varsigma \rangle \!\rangle } = 2 \int _0^1 \varsigma J_0(\varsigma k_{\perp } \rho ) \,\mathrm{d} \varsigma \, \widehat {Q} = \frac {2 J_1}{k_{\perp } \rho } \widehat {Q}. \end{equation}

Substituting (L.9) and (L.11) into (L.5) and solving for $\widehat {\delta {\bar {\bar {f}}}_{s}}$ results in

(L.12)

\begin{equation} \widehat {\delta {\bar {\bar {f}}}_{s}} = {\boldsymbol{{{f}}}}_{s} \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1 , \quad {\boldsymbol{{{f}}}}_{s} \mathrel {\mathop :}= \left ( J_0 {{\boldsymbol{\hat {b}}}_0} - \mathrm{i} \frac {2 {\bar {\bar {\mu }}} k_{\shortparallel } J_1}{\omega q_s k_{\perp } \rho } {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{k}} \right ) \frac { \omega _{\mathrm{c},s} ({(\partial {\bar {\bar {f}}}{}^0_{\!s})}/{(\partial {\bar {\bar {u}}}_\shortparallel )})} {\mathrm{i} B_0 (\omega - k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel )} , \end{equation}

where we have made use of Faraday’s law (5.21) in Fourier space

(L.13)

\begin{equation} \widehat {{\boldsymbol{B}}}_1 = \frac {1}{\omega } {\boldsymbol{k}} \times \widehat {{\boldsymbol{E}}}_1. \end{equation}
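The disc-average factor in (L.11) rests on the identity $2 \int _0^1 \varsigma J_0(\varsigma x) \,\mathrm{d} \varsigma = 2 J_1(x) / x$ with $x = k_{\perp } \rho$. A minimal Python sketch of a numerical check follows; the function names, the midpoint quadrature and the sample arguments are illustrative assumptions.

```python
import math


def bessel_series(n, x, terms=40):
    """Power series of J_n(x) for integer n >= 0."""
    return sum((-1) ** m * (x / 2.0) ** (2 * m + n)
               / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))


def disc_factor_quadrature(x, cells=2000):
    """Midpoint rule for 2 * integral_0^1 s * J_0(s x) ds."""
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        s = (i + 0.5) * h
        total += s * bessel_series(0, s * x)
    return 2.0 * h * total


def disc_factor_closed_form(x):
    """Closed form 2 J_1(x) / x used in (L.11)."""
    return 2.0 * bessel_series(1, x) / x


errors = [abs(disc_factor_quadrature(x) - disc_factor_closed_form(x))
          for x in (0.5, 2.5)]
print(errors)
```

The differences are limited by the second-order accuracy of the midpoint rule and decrease when `cells` is increased.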

L.2. Fourier component of the linearised gyrocentre free-current density

We write the gyrocentre free-current density (5.39b) as

(L.14)

\begin{equation} \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}} = \sum _s \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}_{s} , \quad \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}_s \mathrel {\mathop :}= \int \left [ q_s \langle \overline {{\bar {\bar {f}}}_s \mathfrak{J}_s \,\dot {\bar {\bar {\!\boldsymbol{R}}}}} \rangle - {\bar {\bar {\mu }}} \langle \!\langle \overline {\boldsymbol{\nabla }\times ({\bar {\bar {f}}}_s \mathfrak{J}_s {{\boldsymbol{\hat {b}}}_0})} \rangle \!\rangle \right ] \,\mathrm{d}^3 \bar {\bar {u}} , \end{equation}

where we have made use of (D.10) with $\varepsilon _B = 0$ and define the gyro-average adjoint $\langle \overline {Q} \rangle$ as (and similarly for the disc average)

(L.15)

\begin{equation} \int \langle \overline {Q} \rangle \varLambda\, \mathrm{d}^3 \bar {\bar {r}} \mathrel {\mathop :}= \int Q \langle \mathring {\varLambda } \rangle \,\mathrm{d}^6 \bar {\bar {z}} \end{equation}

for all suitable test functions $\varLambda$ .

Linearisation of the gyrocentre free-current density (5.39b), as is required for computing the susceptibility tensor, results in

(L.16)

\begin{align} \overline {{\boldsymbol{\mathcal{J}}}}{}^{\mathrm{f}}_{1,s} = {} & \frac {q_s}{m_s} \int \biggl [ {\bar {\bar {f}}}{}^0_{\!s} \left ( \big\langle \overline {\big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big)_1} \big\rangle - \frac {{\bar {\bar {\mu }}}}{q_s} \langle \!\langle \overline {\boldsymbol{\nabla }\times {\boldsymbol{B}}_{1,s,\shortparallel }^{\star }} \rangle \!\rangle \right ) \nonumber \\ & + \langle \overline {\delta {\bar {\bar {f}}}_{s}} \rangle {\boldsymbol{B}}_0 {\bar {\bar {u}}}_\shortparallel - \frac {{\bar {\bar {\mu }}}}{q_s} \langle \!\langle \overline {\boldsymbol{\nabla }\times (\delta {\bar {\bar {f}}}_{s} {\boldsymbol{B}}_0)} \rangle \!\rangle \biggr ] \,\mathrm{d}^3 {\bar {\bar {u}}} , \end{align}

where we have substituted $\big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big)_0 = B_0 {\bar {\bar {u}}}_\shortparallel {{\boldsymbol{\hat {b}}}_0}$ as follows from (5.23a).

We need the Fourier component of the adjoint of the gyro-average for evaluating (L.16). The gyro-average adjoint on $\varOmega = \mathbb{R}^3$ with constant ${\boldsymbol{B}}_0$ is given by

(L.17)

\begin{equation} \int Q \varLambda ({{\bar {\bar {\boldsymbol{r}}}}} + \rho ({\bar {\bar {\mu }}}) {\boldsymbol{\hat {\rho }}}({\bar {\bar {\theta }}})) \,\mathrm{d}^6 \bar {\bar {z}} = \int Q({{\bar {\bar {\boldsymbol{r}}}}} - \rho ({\bar {\bar {\mu }}}) {\boldsymbol{\hat {\rho }}}({\bar {\bar {\theta }}})) \varLambda\, \mathrm{d}^6 \bar {\bar {z}} \quad \implies \quad \langle \overline {Q} \rangle = \langle Q({{\bar {\bar {\boldsymbol{r}}}}} - {\boldsymbol{\rho }}) \rangle , \end{equation}

where we have made use of the coordinate transformation

(L.18)

\begin{equation} \bar {\bar {\boldsymbol{r}}} \mapsto \bar {\bar {\boldsymbol{r}}} - \rho (\bar {\bar {\mu }}) {\boldsymbol{\hat {\rho }}}(\bar {\bar {\theta }}), \end{equation}

which has unit Jacobian. It follows that the corresponding Fourier component is given by

(L.19)

\begin{equation} \widehat {\langle \overline {Q} \rangle } = J_0(-k_{\perp } \rho ) \widehat {Q} = J_0 \widehat {Q} \end{equation}

by the symmetry of the zeroth-order Bessel function of the first kind. A similar result holds for the adjoint of the disc average

(L.20)

\begin{equation} \langle \!\langle \overline {Q} \rangle \!\rangle = \langle \!\langle Q(\bar {\bar {\boldsymbol{r}}} - \varsigma {\boldsymbol{\rho }}) \rangle \!\rangle \quad \implies \quad \widehat {\langle \!\langle \overline {Q} \rangle \!\rangle } = \frac {2 J_1}{k_{\perp } \rho } \widehat {Q} . \end{equation}

It follows that (L.16), in terms of Fourier components, can be expressed as

(L.21)

\begin{align} \widehat {\overline {{\boldsymbol{\mathcal{J}}}}}{}^{\mathrm{f}}_{1,s} = {} & \frac {q_s}{m_s} \int \biggl [ {\bar {\bar {f}}}{}^0_{\!s} \left ( J_0 \widehat {\big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big)_1} + \mathrm{i} \frac {{\bar {\bar {\mu }}}}{q_s} \frac {2 J_1}{k_{\perp } \rho } \widehat {B_{1,s,\shortparallel }^{\star }} {{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{k}} \right ) \nonumber \\ & + \widehat {\delta {\bar {\bar {f}}}_{s}} \left ( J_0 {\boldsymbol{B}}_0 {\bar {\bar {u}}}_\shortparallel + \mathrm{i} \frac {{\bar {\bar {\mu }}}}{q_s} \frac {2 J_1}{k_{\perp } \rho } {\boldsymbol{B}}_0 \times {\boldsymbol{k}} \right ) \biggr ]\, \mathrm{d}^3 {\bar {\bar {u}}} \end{align}

for which we find that (L.3) can be written in terms of Fourier components as follows:

(L.22)

\begin{equation} \widehat {\big(B_{s,\shortparallel }^{\star } \,\dot {\bar {\bar {\!\boldsymbol{R}}}}\big)_1} = \unicode{x1D64D}_s \widehat {{\boldsymbol{E}}}_1 , \quad \unicode{x1D64D}_s \mathrel {\mathop :}= J_0 \left ( - \frac {{\bar {\bar {u}}}_\shortparallel }{\omega } \unicode{x1D646} + \frac {1}{B_0} \unicode{x1D63D}_0 \right ) + \mathrm{i} \frac {2 {\bar {\bar {\mu }}} J_1}{\omega q_s k_{\perp } \rho } ({\boldsymbol{k}} \times {{\boldsymbol{\hat {b}}}_0}) ({\boldsymbol{k}} \times {{\boldsymbol{\hat {b}}}_0})^\intercal , \end{equation}

where the matrix $ \unicode{x1D646}$ is such that (7.31) holds. Moreover, the Fourier transform of $B_{1,s,\shortparallel }^{\star }$ is given by

(L.23)

\begin{equation} \widehat {B_{1,s,\shortparallel }^{\star }} = J_0 \widehat {B}_{1,\shortparallel } = \frac {J_0}{\omega } ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{k}}) \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1 \end{equation}

by making use of (L.13).

When substituting (L.19), (L.23) and (L.22) into (L.21), we find

(L.24)

\begin{equation} \widehat {\overline {{\boldsymbol{\mathcal{J}}}}}{}^{\mathrm{f}}_1 = \unicode{x1D645} \widehat {{\boldsymbol{E}}}_1, \end{equation}

where

(L.25)

\begin{align} \unicode{x1D645} \mathrel {\mathop :}= {} & \sum _s \frac {q_s}{m_s} \int \biggl [ {\bar {\bar {f}}}{}^0_{\!s} \left ( J_0 \unicode{x1D64D}_s + \mathrm{i} \frac {{\bar {\bar {\mu }}}}{q_s} \frac {2 J_1}{k_{\perp } \rho } \frac {J_0}{\omega } ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{k}}) ({{\boldsymbol{\hat {b}}}_0} \times {\boldsymbol{k}})^\intercal \right ) \nonumber \\ & + \left ( J_0 {\boldsymbol{B}}_0 {\bar {\bar {u}}}_\shortparallel + \mathrm{i} \frac {{\bar {\bar {\mu }}}}{q_s} \frac {2 J_1}{k_{\perp } \rho } {\boldsymbol{B}}_0 \times {\boldsymbol{k}} \right ) {\boldsymbol{{f}}}_{s}^\intercal \biggr ]\, \mathrm{d}^3 {\bar {\bar {u}}} . \end{align}

L.3. Fourier component of the polarisation and magnetisation

Computing the Fourier component of the polarisation (5.29), while substituting Faraday’s law (L.13), results in

(L.26)

\begin{equation} \widehat {{\boldsymbol{\mathcal{P}}}}_1 = \unicode{x1D64B} \widehat {{\boldsymbol{E}}}_{1} , \quad \unicode{x1D64B} \mathrel {\mathop :}= \sum _s \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{B_0} \left [ {\varPi }_\perp + \frac {{\bar {\bar {u}}}_\shortparallel }{\omega } ({\boldsymbol{k}} {{\boldsymbol{\hat {b}}}_0}^\intercal - \unicode{x1D644}_3 k_{\shortparallel }) \right ]\, \mathrm{d}^3 {\bar {\bar {u}}}, \end{equation}

where we have made use of the perpendicular projection as defined in (5.46). For the magnetisation (5.34), we find

(L.27)

\begin{equation} \widehat {{\boldsymbol{\mathcal{M}}}}_1 = \unicode{x1D648} \widehat {{\boldsymbol{E}}}_1 , \quad \unicode{x1D648} \mathrel {\mathop :}= - \sum _s \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{m_s} \left [ \frac {m_s {\bar {\bar {u}}}_\shortparallel }{B_0} \left ( -\frac {1}{B_0} \unicode{x1D63D}_0 {\varPi }_\perp + \frac {{\bar {\bar {u}}}_\shortparallel }{\omega } {\varPi }_\perp \unicode{x1D646} \right ) - \frac {{\bar {\bar {\mu }}}}{\omega } {\varPi }_\perp \unicode{x1D646} \right ] \mathrm{d}^3 {\bar {\bar {u}}} . \end{equation}

L.4. Gyrokinetic susceptibility tensor

We consider the Fourier component of the linearisation of (7.25)

(L.28)

\begin{equation} -\omega ^2 \left ( \epsilon _0 \widehat {{\boldsymbol{E}}}_1 + \widehat {{\boldsymbol{\mathcal{P}}}}_1 \right ) = \frac {1}{\mu _0} {\boldsymbol{k}} \times ({\boldsymbol{k}} \times \widehat {{\boldsymbol{E}}}_1) - \omega {\boldsymbol{k}} \times \widehat {{\boldsymbol{\mathcal{M}}}}_1 + \mathrm{i}\omega \widehat {\overline {{\boldsymbol{\mathcal{J}}}}}{}^{\mathrm{f}}_1 \end{equation}

and therefore, when comparing with (7.27), we find that the gyrokinetic susceptibility tensor $\bar {\bar { \unicode{x1D653}}}$ is given by

(L.29)

\begin{equation} \epsilon _0 \bar {\bar { \unicode{x1D653}}} = \unicode{x1D64B} + \frac {1}{ \omega } \unicode{x1D646} \unicode{x1D648} + \mathrm{i} \frac {1}{\omega } \unicode{x1D645}, \end{equation}

where the polarisation, magnetisation and current matrices are given by (L.26), (L.27) and (L.25), respectively.

To facilitate the comparison with the results from Hasegawa (1975) and Zonta et al. (2021), we choose the following parallel and perpendicular directions:

(L.30)

\begin{equation} {{\boldsymbol{\hat {b}}}_0} = {\boldsymbol{\hat {e}}}_z , \quad {\boldsymbol{k}} = k_{\perp } {\boldsymbol{\hat {e}}}_x + k_{\shortparallel } {\boldsymbol{\hat {e}}}_z. \end{equation}

When substituting this into (L.25), (L.27) and (L.29), we find

\begin{align*} \bar {\bar { \unicode{x1D653}}} = {} & \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega \omega _{\mathrm{c},s}^2} \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{{\bar {\bar {n}}}_{0,s}} \begin{pmatrix} \omega - 2k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel & \quad 0 & \quad k_{\perp } {\bar {\bar {u}}}_\shortparallel \\[4pt] 0 & \quad \omega - 2k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel & \quad 0\\[4pt] k_{\perp } {\bar {\bar {u}}}_\shortparallel & \quad 0 & \quad 0\\[4pt] \end{pmatrix} \mathfrak{J}_{0,s}\, \mathrm{d}^3 {\bar {\bar {u}}}\nonumber\\[8pt]& - \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega ^2 \omega _{\mathrm{c},s}^2} \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{{\bar {\bar {n}}}_{0,s}} \left ( {\bar {\bar {u}}}_\shortparallel ^2 - \frac {{\bar {\bar {u}}}_\tau ^2}{2} \right ) \mathfrak{J}_{0,s} \,\mathrm{d}^3 {\bar {\bar {u}}}\begin{pmatrix} - k_{\shortparallel }^2 & \quad 0 & \quad k_{\perp } k_{\shortparallel }\\[4pt] 0 & \quad - k_{\shortparallel }^2 & \quad 0\\[4pt] k_{\shortparallel } k_{\perp } & \quad 0 & \quad -k_{\perp }^2 \end{pmatrix} \end{align*}

(L.31)

\begin{align} & + \mathrm{i} \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega ^2\omega _{\mathrm{c},s}} \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{{\bar {\bar {n}}}_{0,s}} J_0^2 \begin{pmatrix} 0 & \quad \omega - k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel & \quad 0\\[8pt] k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel - \omega & \quad \mathrm{i} \dfrac {2k_{\perp } {\bar {\bar {u}}}_\tau J_1}{J_0} & \quad -k_{\perp } {\bar {\bar {u}}}_\shortparallel \\[12pt] 0 & \quad k_{\perp } {\bar {\bar {u}}}_\shortparallel & \quad 0 \end{pmatrix} \mathfrak{J}_{0,s} \,\mathrm{d}^3 {\bar {\bar {u}}} \nonumber\\[10pt] & - \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega ^2} \int \frac { k_{\shortparallel }^2 {\bar {\bar {f}}}{}^0_{\!s} } {{\bar {\bar {n}}}_{0,s} (k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel - \omega )^2} \begin{pmatrix} 0 & \quad 0 & \quad 0\\[4pt] 0 & \quad {\bar {\bar {u}}}_\tau ^2 J_1^2 & \quad \mathrm{i} \dfrac {\omega }{k_{\shortparallel }} {\bar {\bar {u}}}_\tau J_0 J_1\\[12pt] 0 & \quad -\mathrm{i} \dfrac {\omega }{k_{\shortparallel }} {\bar {\bar {u}}}_\tau J_0 J_1 & \quad \dfrac {\omega ^2}{k_{\shortparallel }^2} J_0^2 \end{pmatrix} \mathfrak{J}_{0,s} \,\mathrm{d}^3 {\bar {\bar {u}}}, \nonumber\\ \end{align}

where $\mathfrak{J}_{0,s} = B_0 / m_s$ and the plasma frequency is given by

(L.32)

\begin{equation} \omega _{\mathrm{p},s} = q_s \sqrt {\frac {{\bar {\bar {n}}}_{0,s}}{\epsilon _0 m_s}}. \end{equation}

In deriving (L.31), we have made use of integration by parts with respect to ${\bar {\bar {u}}}_\shortparallel$ and have assumed that ${\bar{\bar{f}}}^0_{s}$ vanishes at ${\bar {\bar {u}}}_\shortparallel = \pm \infty$.

L.5. Drift kinetic susceptibility tensor

To compare with the results from Zonta et al. (2021), we ignore FLR effects by assuming $k_{\perp } \rho \ll 1$ such that we may approximate the Bessel functions as follows:

(L.33)

\begin{equation} J_0 \approx 1 , \quad J_1 \approx \frac {k_{\perp } \rho }{2}. \end{equation}

This results in the following ZLR limit of (L.31):

\begin{align*} \bar {\bar { \unicode{x1D653}}}^{\mathrm{ZLR}} = {} & \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega \omega _{\mathrm{c},s}^2} \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{{\bar {\bar {n}}}_{0,s}} \begin{pmatrix} \omega - 2k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel & 0 & k_{\perp } {\bar {\bar {u}}}_\shortparallel \\[5pt] 0 & \omega - 2k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel & 0\\[5pt]k_{\perp } {\bar {\bar {u}}}_\shortparallel & 0 & 0\\\end{pmatrix} \mathfrak{J}_{0,s} \,\mathrm{d}^3 {\bar {\bar {u}}}\\[10pt] & - \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega ^2 \omega _{\mathrm{c},s}^2} \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{{\bar {\bar {n}}}_{0,s}} \left ( {\bar {\bar {u}}}_\shortparallel ^2 - \frac {{\bar {\bar {u}}}_\tau ^2}{2} \right ) \mathfrak{J}_{0,s}\, \mathrm{d}^3 {\bar {\bar {u}}}\begin{pmatrix} - k_{\shortparallel }^2 & \quad 0 & \quad k_{\shortparallel } k_{\perp }\\[5pt] 0 & \quad - k_{\shortparallel }^2 & \quad 0\\[5pt] k_{\shortparallel } k_{\perp } & \quad 0 & \quad -k_{\perp }^2 \end{pmatrix}\\[10pt] & + \mathrm{i} \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega ^2 \omega _{\mathrm{c},s}^2} \int \frac {{\bar {\bar {f}}}{}^0_{\!s}}{{\bar {\bar {n}}}_{0,s}} \begin{pmatrix} 0 & \quad \omega _{\mathrm{c},s} (\omega - k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel ) & \quad 0\\[10pt] -\omega _{\mathrm{c},s} (\omega - k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel ) & \quad \mathrm{i} k_{\perp }^2 {\bar {\bar {u}}}_\tau ^2 & \quad - \omega _{\mathrm{c},s} k_{\perp } {\bar {\bar {u}}}_\shortparallel \\[10pt] 0 & \quad \omega _{\mathrm{c},s} k_{\perp } {\bar {\bar {u}}}_\shortparallel & \quad 0 \end{pmatrix} \mathfrak{J}_{0,s}\mathrm{d}^3 {\bar {\bar {u}}} \end{align*}

(L.34)

\begin{align}& - \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega ^2} \int \frac { k_{\shortparallel }^2 {\bar {\bar {f}}}{}^0_{\!s} } {{\bar {\bar {n}}}_{0,s} (k_{\shortparallel } {\bar {\bar {u}}}_\shortparallel - \omega )^2} \begin{pmatrix} 0 & \quad 0 & \quad 0\\[7pt] 0 & \quad {\bar {\bar {u}}}_\tau ^4 k_{\perp }^2 \dfrac {1}{4 \omega _{\mathrm{c},s}^2} & \quad \mathrm{i} \dfrac {k_{\perp }}{k_{\shortparallel }} {\bar {\bar {u}}}_\tau ^2 \dfrac {\omega }{2 \omega _{\mathrm{c},s}}\\[16pt] 0 & \quad -\mathrm{i} {\bar {\bar {u}}}_\tau ^2 \dfrac {k_{\perp }}{k_{\shortparallel }} \dfrac {\omega }{2 \omega _{\mathrm{c},s}} & \quad \dfrac {\omega ^2}{k_{\shortparallel }^2} \end{pmatrix} \mathfrak{J}_{0,s} \,\mathrm{d}^3 {\bar {\bar {u}}}. \end{align}

It can be verified that this agrees with Hasegawa (1975, (2.159)).
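The passage from (L.31) to (L.34) can be illustrated on the Bessel-dependent entries of the last matrix. The Python sketch below assumes $\rho = {\bar {\bar {u}}}_\tau / \omega _{\mathrm{c},s}$, which is the relation implied by comparing the corresponding entries of (L.31) and (L.34); the function names and parameter values are hypothetical, and the common factors $\omega / k_{\shortparallel }$ are omitted.

```python
import math


def bessel_series(n, x, terms=40):
    """Power series of J_n(x) for integer n >= 0."""
    return sum((-1) ** m * (x / 2.0) ** (2 * m + n)
               / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))


def entries_flr(k_perp, u_tau, omega_c):
    """Bessel-dependent factors of the last matrix in (L.31)."""
    x = k_perp * u_tau / omega_c  # assumes rho = u_tau / omega_c
    j0, j1 = bessel_series(0, x), bessel_series(1, x)
    return (u_tau ** 2 * j1 ** 2, u_tau * j0 * j1, j0 ** 2)


def entries_zlr(k_perp, u_tau, omega_c):
    """Corresponding factors of the last matrix in (L.34)."""
    return (u_tau ** 4 * k_perp ** 2 / (4.0 * omega_c ** 2),
            u_tau ** 2 * k_perp / (2.0 * omega_c),
            1.0)


u_tau, omega_c = 1.3, 10.0
rel_errors = {}
for k_perp in (1.0, 0.1, 0.01):
    flr = entries_flr(k_perp, u_tau, omega_c)
    zlr = entries_zlr(k_perp, u_tau, omega_c)
    rel_errors[k_perp] = [abs(f - z) / abs(z) for f, z in zip(flr, zlr)]
    print(k_perp * u_tau / omega_c, rel_errors[k_perp])
```

The relative differences decrease quadratically with $k_{\perp } \rho$, as expected from the next terms in the series of $J_0$ and $J_1$.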

L.6. Gyrokinetic Darwin susceptibility tensor

The derivation is very similar to that for the gauge-invariant gyrokinetic model, except that we must express the additional term in the Darwin polarisation (6.9a) in terms of the electric field $\widehat {{\boldsymbol{E}}}_1$. That is, we must express the perpendicular part of the vector potential in terms of the electromagnetic fields.

From the definition of the electric field (4.27), we find that

(L.35)

\begin{equation} \widehat {{\boldsymbol{E}}}_1 = -\mathrm{i} {\boldsymbol{k}} \widehat {\phi }_1 + \mathrm{i} \omega \widehat {{\boldsymbol{A}}}_1 \quad \implies \quad {\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1 = -\mathrm{i} k_{\perp }^2 \widehat {\phi }_1, \end{equation}

after having substituted the Fourier ansatz as well as the constraint (6.16b), ${\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{A}}}_1 = 0$. It follows that the perpendicular part of the vector potential is given by

(L.36)

\begin{equation} \widehat {{\boldsymbol{A}}}_{1,\perp } = \frac {\widehat {{\boldsymbol{E}}}_{1,\perp } k_{\perp }^2 - {\boldsymbol{k}}_\perp ({\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1)}{\mathrm{i} \omega k_{\perp }^2}. \end{equation}
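The step from (L.35) to (L.36) can be written out as follows. This is a sketch that uses only (L.35) and the constraint ${\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{A}}}_1 = 0$, and it assumes $\omega \neq 0$ and $k_\perp \neq 0$. Eliminating $\widehat {\phi }_1$ from the perpendicular part of the electric field gives

\begin{align*} \widehat {\phi }_1 & = \frac {\mathrm{i}\, {\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1}{k_{\perp }^2}, \\[5pt] \widehat {{\boldsymbol{E}}}_{1,\perp } & = -\mathrm{i} {\boldsymbol{k}}_\perp \widehat {\phi }_1 + \mathrm{i} \omega \widehat {{\boldsymbol{A}}}_{1,\perp } = \frac {{\boldsymbol{k}}_\perp ({\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1)}{k_{\perp }^2} + \mathrm{i} \omega \widehat {{\boldsymbol{A}}}_{1,\perp }, \end{align*}

and solving the second line for $\widehat {{\boldsymbol{A}}}_{1,\perp }$ recovers (L.36).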

Using the specific parallel and perpendicular directions given by (L.30), we find

(L.37)

\begin{equation} \widehat {{\boldsymbol{A}}}_{1,\perp } = \unicode{x1D63C}_\perp \widehat {{\boldsymbol{E}}}_1 , \quad \unicode{x1D63C}_\perp = \begin{pmatrix} 0 & \quad 0 & \quad 0\\[5pt] 0 & \quad -\dfrac {\mathrm{i}}{ \omega } & \quad 0\\[8pt] 0 & \quad 0 & \quad 0 \end{pmatrix} \end{equation}

such that when considering (L.29) and (6.9a), we find that the gyrokinetic Darwin susceptibility tensor is given by

(L.38)

\begin{equation} \bar {\bar { \unicode{x1D653}}}^{\mathrm{Dar}} = \bar {\bar { \unicode{x1D653}}} - \mathrm{i} \omega \sum _s \frac {\omega _{\mathrm{p},s}^2}{\omega _{\mathrm{c},s}^2} \unicode{x1D63C}_\perp , \end{equation}

where we have moreover substituted the definition of the plasma frequency (L.32).
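The reduction from (L.36) to (L.37), and the resulting form of the correction in (L.38), can be made explicit with a short sketch. It assumes, as the matrix structure of (L.34) and (L.37) suggests, that (L.30) aligns ${\boldsymbol{k}}_\perp$ with the first coordinate direction and the background magnetic field with the third. The unit vectors $\boldsymbol{\hat {e}}_1$, $\boldsymbol{\hat {e}}_2$ and the component notation $(\widehat {{\boldsymbol{E}}}_1)_j$ are introduced here for illustration only.

\begin{align*} {\boldsymbol{k}}_\perp = k_{\perp } \boldsymbol{\hat {e}}_1 \quad \implies \quad \widehat {{\boldsymbol{E}}}_{1,\perp } k_{\perp }^2 - {\boldsymbol{k}}_\perp ({\boldsymbol{k}}_\perp \boldsymbol{\cdot } \widehat {{\boldsymbol{E}}}_1) & = k_{\perp }^2 \left [ (\widehat {{\boldsymbol{E}}}_1)_1 \boldsymbol{\hat {e}}_1 + (\widehat {{\boldsymbol{E}}}_1)_2 \boldsymbol{\hat {e}}_2 \right ] - k_{\perp }^2 (\widehat {{\boldsymbol{E}}}_1)_1 \boldsymbol{\hat {e}}_1 = k_{\perp }^2 (\widehat {{\boldsymbol{E}}}_1)_2 \boldsymbol{\hat {e}}_2, \\[5pt] \widehat {{\boldsymbol{A}}}_{1,\perp } & = \frac {(\widehat {{\boldsymbol{E}}}_1)_2}{\mathrm{i} \omega } \boldsymbol{\hat {e}}_2 = -\frac {\mathrm{i}}{\omega } (\widehat {{\boldsymbol{E}}}_1)_2 \boldsymbol{\hat {e}}_2, \\[5pt] - \mathrm{i} \omega \unicode{x1D63C}_\perp & = - \mathrm{i} \omega \left ( -\frac {\mathrm{i}}{\omega } \right ) \begin{pmatrix} 0 & \quad 0 & \quad 0\\[5pt] 0 & \quad 1 & \quad 0\\[5pt] 0 & \quad 0 & \quad 0 \end{pmatrix} = - \begin{pmatrix} 0 & \quad 0 & \quad 0\\[5pt] 0 & \quad 1 & \quad 0\\[5pt] 0 & \quad 0 & \quad 0 \end{pmatrix}. \end{align*}

Under this assumption, (L.38) reads $\bar {\bar { \unicode{x1D653}}}^{\mathrm{Dar}} = \bar {\bar { \unicode{x1D653}}} - \sum _s (\omega _{\mathrm{p},s}^2 / \omega _{\mathrm{c},s}^2)\, \boldsymbol{\hat {e}}_2 \boldsymbol{\hat {e}}_2^{\mathsf{T}}$, so the Darwin correction modifies only the $(2,2)$ entry of the susceptibility tensor, by an amount that does not depend on $\omega$.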

References

Bittencourt, J.A. 2004 Fundamentals of Plasma Physics. 3rd edn. Springer. doi:10.1007/978-1-4757-4030-1

Bottino, A. & Sonnendrücker, E. 2015 Monte Carlo particle-in-cell methods for the simulation of the Vlasov–Maxwell gyrokinetic equations. J. Plasma Phys. 81, 435810501–435810539. doi:10.1017/S0022377815000574

Bouchut, F., Golse, F. & Pulvirenti, M. 2000 Kinetic equations and asymptotic theory. In Series in Applied Mathematics (ed. Desvillettes, L. & Perthame, B.). Gauthiers-Villars.

Brizard, A.J. 1990 Nonlinear gyrokinetic tokamak physics. PhD thesis, Princeton University.

Brizard, A.J. 2021a Exact conservation laws for gauge-free electromagnetic gyrokinetic equations. J. Plasma Phys. 87, 905870307.

Brizard, A.J. 2021b Hamiltonian structure of a gauge-free gyrokinetic Vlasov–Maxwell model. Phys. Plasmas 28, 122107. doi:10.1063/5.0068519

Brizard, A.J. & Hahm, T.S. 2007 Foundations of nonlinear gyrokinetic theory. Rev. Mod. Phys. 79, 421–468. doi:10.1103/RevModPhys.79.421

Burby, J.W. & Brizard, A.J. 2019 Gauge-free electromagnetic gyrokinetic theory. Phys. Lett. A 383, 2172–2175.

Burby, J.W., Brizard, A.J., Morrison, P.J. & Qin, H. 2015 Hamiltonian gyrokinetic Vlasov–Maxwell system. Phys. Lett. A 379, 2073–2077.

Cary, J.R. & Littlejohn, R.G. 1983 Noncanonical Hamiltonian mechanics and its application to magnetic field line flow. Ann. Phys.-New York 151, 1–34.

Chen, L., Chen, H., Zonca, F. & Lin, Y. 2021 A gyrokinetic simulation model for low frequency electromagnetic fluctuations in magnetized plasmas. Sci. China Phys., Mech. Astron. 64, 245211. doi:10.1007/s11433-020-1640-9

Chen, L. & Zonca, F. 2016 Physics of Alfvén waves and energetic particles in burning plasmas. Rev. Mod. Phys. 88, 015008. doi:10.1103/RevModPhys.88.015008

Degond, P. & Raviart, P.A. 1992 An analysis of the Darwin model of approximation to Maxwell’s equations. Forum Math. 4, 13–44.

Dragt, A.J. & Finn, J.M. 1976 Lie series and invariant functions for analytic symplectic maps. J. Math. Phys. 17, 2215–2227.

Dudkovskaia, A.V., Wilson, H.R., Connor, J.W., Dickinson, D. & Parra, F.I. 2023 Nonlinear second order electromagnetic gyrokinetic theory for a tokamak plasma. Plasma Phys. Control Fusion 65, 045010.

Fisher, G.P. 1971 The electric dipole moment of a moving magnetic dipole. Am. J. Phys. 39, 1528–1533. doi:10.1119/1.1976708

Frieman, E.A. & Chen, L. 1982 Nonlinear gyrokinetic equations for low-frequency electromagnetic waves in general plasma equilibria. Phys. Fluids 25, 502–508.

Garbet, X., Idomura, Y., Villard, L. & Watanabe, T.H. 2010 Gyrokinetic simulations of turbulent transport. Nucl. Fusion 50, 043002. doi:10.1088/0029-5515/50/4/043002

Grad, H. 1966 Variational principle for a guiding-center plasma. Phys. Fluids 9, 225–251. doi:10.1063/1.1761665

Grad, H. 1967 Toroidal containment of a plasma. Phys. Fluids 10, 137–154.

Grad, H. & Rubin, H. 1958 Hydromagnetic equilibria and force-free fields. J. Nucl. Energy 7, 284–285.

Grieger, G., Beidler, C., Harmeyer, E., Lotz, W., Kisslinger, J., Merkel, P., Nührenberg, J., Rau, F., Strumberger, E. & Wobig, H. 1992 Modular stellarator reactors and plans for Wendelstein 7-X. Fusion Technol. 21, 1767–1778.

Hahm, T.S. 1988 Nonlinear gyrokinetic equations for tokamak microturbulence. Phys. Fluids 31, 2670–2673. doi:10.1063/1.866544

Hasegawa, A. 1975 Plasma instabilities and non-linear effects. In Physics and Chemistry in Space (ed. Roederer, J.G. & Wasson, J.T.). Springer-Verlag.

Hindenlang, F., Maj, O., Strumberger, E., Rampp, M. & Sonnendrücker, E. 2019 GVEC: A newly developed 3D ideal MHD Galerkin Variational Equilibrium Code. In Annual Meeting of the Simons Collaboration on Hidden Symmetries and Fusion Energy.

Hirshman, S.P. & Whitson, J.C. 1983 Steepest-descent moment method for three-dimensional magnetohydrodynamic equilibria. Phys. Fluids 26, 3553–3568. doi:10.1063/1.864116

Hirvijoki, E., Burby, J.W., Pfefferlé, D. & Brizard, A.J. 2020 Energy and momentum conservation in the Euler–Poincaré formulation of local Vlasov–Maxwell-type systems. J. Phys. A: Math. Theor. 53, 235204. doi:10.1088/1751-8121/ab8b38

Hnizdo, V. 2012 Magnetic dipole moment of a moving electric dipole. Am. J. Phys. 80, 645–647.

Jost, G., Tran, T.M., Cooper, W.A., Villard, L. & Appert, K. 2001 Global linear gyrokinetic simulations in quasi-symmetric configurations. Phys. Plasmas 8, 3321–3333.

Kleiber, R., et al. 2024 EUTERPE: A global gyrokinetic code for stellarator geometry. Comput. Phys. Commun. 295, 109013.

Kleiber, R., Hatzky, R., Könies, A., Mishchenko, A. & Sonnendrücker, E. 2016 An explicit large time step particle-in-cell scheme for nonlinear gyrokinetic simulations in the electromagnetic regime. Phys. Plasmas 23, 032501–032512. doi:10.1063/1.4942788

Klinger, T., et al. 2019 Overview of first Wendelstein 7-X high-performance operation. Nucl. Fusion 59, 112004.

Kraus, M., Kormann, K., Morrison, P.J. & Sonnendrücker, E. 2017 GEMPIC: geometric electromagnetic particle-in-cell methods. J. Plasma Phys. 83, 905830401–905830451.

Littlejohn, R.G. 1982 Hamiltonian perturbation theory in noncanonical coordinates. J. Math. Phys. 23, 742–747. doi:10.1063/1.525429

Littlejohn, R.G. 1983 Variational principles of guiding centre motion. J. Plasma Phys. 21, 111–125.

Low, F.E. 1958 A Lagrangian formulation of the Boltzmann–Vlasov equation for plasmas. Proc. Royal Soc. Lond. A 248, 282–287.

McMillan, B.F. 2023 Relationship between drift kinetics, gyrokinetics and magnetohydrodynamics in the long-wavelength limit. J. Plasma Phys. 89, 905890115. doi:10.1017/S0022377823000089

Miloshevich, G. & Burby, J.W. 2021 Hamiltonian reduction of Vlasov–Maxwell to a dark slow manifold. J. Plasma Phys. 87, 835870301.

Mishchenko, A., Borchardt, M., Hatzky, R., Kleiber, R., Könies, A., Nührenberg, C., Xanthopoulos, P., Roberg-Clark, G. & Plunk, G.G. 2023 Global gyrokinetic simulations of electromagnetic turbulence in stellarator plasmas. J. Plasma Phys. 89, 955890304.

Noether, E. 1918 Invariante Variationsprobleme. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen. Math.-Phys. Klasse 1918, 235–257.

Novikau, I., Biancalani, A., Bottino, A., Di Siena, A., Lauber, P., Poli, E., Lanti, E., Villard, L., Ohana, N. & Briguglio, S. 2021 Implementation of energy transfer technique in ORB5 to study collisionless wave-particle interactions in phase-space. Comput. Phys. Commun. 262, 107032. doi:10.1016/j.cpc.2019.107032

Parra, F.I. & Calvo, I. 2011 Phase-space Lagrangian derivation of electrostatic gyrokinetics in general geometry. Plasma Phys. Control Fusion 53, 045001. doi:10.1088/0741-3335/53/4/045001

Peifeng, F., Hong, Q. & Jianyuan, X. 2021 Discovering exact, gauge-invariant, local energy–momentum conservation laws for the electromagnetic gyrokinetic system by high-order field theory on heterogeneous manifolds. Plasma Sci. Technol. 23, 105103.

Porazik, P. & Lin, Z. 2011 Gyrokinetic simulation of magnetic compressional modes in general geometry. Commun. Comput. Phys. 10, 899–911.

Qin, H. 2005 A short introduction to general gyrokinetic theory. In Topics in Kinetic Theory (ed. Passot, T., Sulem, C. & Sulem, P.L.). American Mathematical Society.

Qin, H., Tang, W.M. & Lee, W.W. 2000 Gyrocenter-gauge kinetic theory. Phys. Plasmas 7, 4433–4445. doi:10.1063/1.1309031

Qin, H., Tang, W.M., Lee, W.W. & Rewoldt, G. 1999 Gyrokinetic perpendicular dynamics. Phys. Plasmas 6, 1575–1588.

Remmerswaal, R. 2023 DispersionCurves.jl: A small package for plotting dispersion curves. Available at https://gitlab.mpcdf.mpg.de/rwr/dispersioncurves.jl.

Sugama, H. 2000 Gyrokinetic field theory. Phys. Plasmas 7, 466–480.

Sugama, H., Nunami, M., Satake, S. & Watanabe, T.H. 2018 Eulerian variational formulations and momentum conservation laws for kinetic plasma systems. Phys. Plasmas 25, 102506. doi:10.1063/1.5031155

Zoni, E. & Possanner, S. 2021 On the accuracy of gyrokinetic equations in fusion applications. In Recent Advances in Kinetic Equations and Applications (ed. Salvarani, F.), pp. 367–393. Springer International Publishing.

Zonta, F., Iorio, R., Burby, J.W., Liu, C. & Hirvijoki, E. 2021 Dispersion relation for gauge-free electromagnetic drift kinetics. Phys. Plasmas 28, 092504.

Figure 1. Illustration of the guiding-centre coordinate system. We denote the physical particle position in black and the guiding-centre position in green. The particle moves along the background magnetic field in the (blue) ${\boldsymbol{\hat {b}}}_0$ direction, while gyrating in the (red) plane perpendicular to the background magnetic field, in the direction of the (red) arrow $\boldsymbol{\hat {\tau }}$. The extremal values of the $\varsigma$ parameter (introduced in § 4.4) are indicated in grey.

Table 1. The length scales used for determining the wave vector $\check {{\boldsymbol{k}}}$, as obtained from Zoni & Possanner (2021). The non-dimensional wavenumbers $\check {k}_\shortparallel$ and $\check {k}_\perp$ are computed according to (7.38).

Figure 2. Dispersion relations for a fixed value of $\check {k}_\shortparallel = 2 \times 10^{-3}$ and $\beta _{0} = 10 \,\%$. The black dotted line corresponds to $\check {\omega } = \check {\omega }_{\mathrm{As}}$, whereas the black dashed line corresponds to $\check {\omega } = \check {\omega }_{\mathrm{Ac}}$.

Figure 3. Dispersion relations for fixed values of $\check {k}_\perp$ and $\check {k}_\shortparallel$, as determined from table 1. Only the shear Alfvén wave is shown.

Table 2. Properties of the different gyrokinetic models under consideration. The two models proposed in this paper are in boldface: the gyrokinetic Maxwell model with $(\xi _R, \xi _\varTheta ) = (1, 0)$ and its corresponding quasi-neutral gyrokinetic Darwin approximation. $^\dagger$ This can be a ‘yes (Y)’ if the approach from Qin et al. (1999) is followed. $^\ast$ If the polarisation current density is kept, then the compressional Alfvén wave is present and the Lagrange multiplier vanishes, but if the polarisation current density is neglected, then the compressional Alfvén wave is absent and the Lagrange multiplier is needed to restore the bound-charge continuity equation.