Coupling multi-fluid dynamics equipped with Landau closures to the particle-in-cell method

The particle-in-cell (PIC) method is successfully used to study magnetized plasmas. However, this requires large computational costs and limits simulations to short physical run-times and often to setups in less than three spatial dimensions. Traditionally, this is circumvented either via hybrid-PIC methods (adopting massless electrons) or via magneto-hydrodynamic-PIC methods (modelling the background plasma as a single charge-neutral magneto-hydrodynamical fluid). Because both methods preclude modelling important plasma-kinetic effects, we introduce a new fluid-PIC code that couples a fully explicit and charge-conservative multi-fluid solver to the PIC code SHARP through a current-coupling scheme and solve the full set of Maxwell's equations. This avoids simplifications typically adopted for Ohm's Law and enables us to fully resolve the electron temporal and spatial scales while retaining the versatility of initializing any number of ion, electron, or neutral species with arbitrary velocity distributions. The fluid solver includes closures emulating Landau damping so that we can account for this important kinetic process in our fluid species. Our fluid-PIC code is second-order accurate in space and time. The code is successfully validated against several test problems, including the stability and accuracy of shocks and the dispersion relation and damping rates of waves in unmagnetized and magnetized plasmas. It also matches growth rates and saturation levels of the gyro-scale and intermediate-scale instabilities driven by drifting charged particles in magnetized thermal background plasmas in comparison to linear theory and PIC simulations. This new fluid-SHARP code is specially designed for studying high-energy cosmic rays interacting with thermal plasmas over macroscopic timescales.


Introduction
Astrophysical plasmas naturally partition into thermal and non-thermal particle populations. Provided particles collide frequently via (Coulomb) collisions, this eventually leads to a characteristic thermal Maxwellian phase-space distribution. This population can be reliably described with the fluid approximation, which characterizes a vast amount of particles by a few macroscopic fields in space (e.g., number density, mean velocity and temperature). By contrast, the non-thermal cosmic ray (CR) ion population at energies exceeding GeV is mostly collisionless and interacts with the background plasma † Email address for correspondence: rlemmerz@aip.de ‡ Email address for correspondence: mshalaby@live.ca arXiv:2301.04679v2 [astro-ph.HE] 23 Jan 2023 via wave-particle interactions, thus retaining its initial power-law distribution for much longer times (Blandford & Eichler 1987;Draine 2011;Zweibel 2017). Low-energy CRs ( GeV) more frequently experience Coulomb/ionisation collisions and as such have a direct influence on gas dynamics and molecular chemistry (Dalgarno 2006;Padovani et al. 2020). CRs can excite and grow plasma waves via instabilities at which they scatter in pitch angle (i.e., the angle between momentum and magnetic field vector), thereby regulating their macroscopic transport speed and exchanging energy and momentum with the thermal population. Modelling these plasma processes requires to move beyond the classical fluid approximation. During the process of diffusive shock acceleration, CRs stream ahead of the shock into the precursor region and drive non-resonant Alfvén waves unstable by means of their powerful current (Bell 2004;Riquelme & Spitkovsky 2009;Caprioli & Spitkovsky 2014a), which provides efficient means of increasing their wave-particle scattering and reducing the CR diffusion coefficient (Caprioli & Spitkovsky 2014b). Upon escaping from the acceleration site into the ambient medium, CRs continue to drive Alfvén-waves through resonant instabilities. Scattering off of these self-induced waves regulates their transport speed (Kulsrud & Pearce 1969;Marcowith et al. 2021;Shalaby et al. 2021), which is determined by the balancing instability growth and wave damping (Thomas & Pfrommer 2019;Thomas et al. 2020). In the interstellar medium, CRs provide a comparable if not dominant pressure, despite their negligible number densities in comparison to the thermal population, which makes them dynamically important (Boulares & Cox 1990;Draine 2011). Their pressure gradient can drive outflows from the interstellar medium Girichidis et al. 2018;Farber et al. 2018) so that powerful global winds emerge from galaxies (Uhlig et al. 2012;Hanasz et al. 2013;Pakmor et al. 2016;) that enrich the circumgalactic medium in galaxy haloes with CRs that can also dominate the pressure support and modify the cosmic accretion of gas onto galaxies (Buck et al. 2020;Ji et al. 2020). The degree to which CRs regulate galaxy formation critically depends on the efficiency of wave-particle interactions, which in turn depends on the amplitude of self-excited plasma waves (Thomas et al. 2022). On even larger scales, CRs energised in jets of active galactic nuclei stream into the surrounding intracluster medium of cool core clusters and heat it via the excitation of Alfvén waves and the successive damping (Guo & Oh 2008;Pfrommer 2013;Jacob & Pfrommer 2017). Because the plasma physics underlying these processes is highly non-linear, numerical calculations are needed to study these effects.
Due to its ability to resolve kinetic processes, the PIC method (Dawson 1962;Langdon & Birdsall 1970;Hockney 1988;Birdsall & Langdon 1991) has become one of the most used methods for studying plasmas from laboratory to astrophysical scales. Examples of that include revolutionizing our understanding of the rich physics found in collisionless shocks (Spitkovsky 2008;Marcowith et al. 2016), magnetic reconnection (Daughton et al. 2006(Daughton et al. , 2011Sironi & Spitkovsky 2014), instabilities driven by highly relativistic electronpositron beams Shalaby et al. 2018Shalaby et al. , 2020, as well as the transport of non-thermal particle populations like CRs (Holcomb & Spitkovsky 2019;Shalaby et al. 2021). However, the PIC method needs to advance numerous particles per cell each time step, and thus it is quick to reach its computational limit. Even one-dimensional simulations usually only capture dynamics on very short physical times and the extent to which two or three-dimensional simulations can be performed is very limited.
The time interval between the inverse of the electron plasma frequency, −1 e , (which is necessary to ensure the stability of the PIC algorithm) and that of the ion plasma frequency, −1 i , depends on the ion-to-electron mass ratio, since −1 i / −1 e = ( i / e ) 1/2 , assuming charge neutrality, i.e. that the electron and ion densities are equal. Therefore, one frequently used trick to increase the computational efficiency in PIC simulations is to adopt a reduced ion-to-electron mass ratio to bridge the gap between the smallest timescale in the simulation and the larger timescale on which interesting physical processes occur. However, this might lead to artificial suppression of physical effects Hong et al. 2012;Moreno et al. 2018), including instabilities with excitation conditions that depend on the mass ratio (Shalaby et al. 2021;Shalaby et al. 2022). This shows the need for a more efficient numerical method to complement the accurate results achieved by PIC simulations in order to enable simulations of realistic physics occurring on longer timescales. One possible method consists in using the less expensive fluid approximation, which works particularly well for collisional systems where frequent particle collisions maintain a thermodynamic temperature but is less well motivated in weakly collisional or even collisionless astrophysical plasmas where it cannot accurately capture some important microphysical plasma processes. Multiple methods have been devised that combine the computational advantages of a fluid code, while trying to maintain some of the physics accuracy provided by the PIC method. Hybrid-PIC codes (Lipatov 2002;Gargaté et al. 2007) treat electrons as a massless fluid and ions as particles. With the assumption of charge neutrality and the Darwin approximation (i.e., neglecting the transverse displacement current), these codes are able to overcome some computational barriers while omitting effects on the electron time and length scale. Since this eliminates the need to resolve electron scales, the increase in computational efficiency from pure-PIC to hybrid-PIC methods is roughly a factor of ( i / e ) 1/2 in timescale and about the same factor in spatial scales. On the other hand, an even more efficient method exists, that combines the magneto-hydrodynamic (MHD) description of the thermal background plasma with PIC methods to model the evolution of energetic particles such as CRs (Bai et al. 2015;van Marle et al. 2018), called MHD-PIC. However, this method inherits the assumptions of MHD, in particular, the use of (simplified) Ohm's law by fully neglecting the displacement current, which precludes physics associated with higher-order terms of Ohm's law as well as the electron dynamics.
In this paper we present a self-consistent algorithm that is suitable for simulating microphysical effects of CR physics by only applying the fluid approximation to thermal particles and solving the full set of Maxwell's equations. Our goal of this novel fluid-PIC method is to sacrifice as little physics accuracy as possible, while at the same time alleviating computational restraints by orders of magnitude for setups involving CRs (or similar, low density non-thermal particle populations interacting with a thermal plasma). The fluid-PIC method, in essence, couples a multi-fluid solver to the PIC method by summing their contributions to the charge and current densities used to solve Maxwell's equations, and the resulting electromagnetic fields. Thus, the subsequent dynamics is dictated by fluid and PIC species. This enables treating any arbitrary number of species in thermal equilibrium by modelling them as separate fluids that interact electromagnetically with each other and with particles of arbitrary momentum distribution (modelled using the PIC method). In contrast to MHD-PIC and hybrid-PIC methods, we do not explicitly assume Ohm's law, and instead, solve Maxwell's equations in a fully self-consistent manner in our fluid-PIC code. Therefore, displacement currents are included in our model and fast changes in the electric field and electron dynamics are captured. This, in turn, allows studying the interaction of high energy particles with the background plasma, e.g. to investigate CR streaming. Another hybrid approach resolving electron timescales fully, but using pressure coupling, has been used for simulation of pick-up ions in the heliosphere by Burrows et al. (2014).
Often implicit and semi-implicit methods are utilized for stability and resolution reasons to couple the multi-fluid equations to Maxwell's equations (Hakim et al. 2006;Shumlak et al. 2011;Wang et al. 2020). However, this creates an interdependency between all fluids and has limited utility when coupled to explicit particles. We have developed an explicit multi-fluid solver in which each fluid and particle species is agnostic about each other and the coupling is achieved via an indirect current-coupling scheme. Because the PIC part of the code is the most computationally expensive part of the fluid-PIC, hybrid-PIC, and MHD-PIC methods, the computational efficiency is mostly determined by the number of particles required as well as the smallest time and length scales that need to be resolved. Hence, this fluid-PIC approach results in large speed-ups for CR propagation simulations in comparison to traditional hybrid-PIC codes, which treat every ion as a particle and need to initialise a large number of particles according to the density ratio, as well as in comparison to PIC-only simulations. Especially studying comic ray propagation in the interstellar medium, where the typical CR density is of the order 10 −9 times the interstellar medium number density, is challenging. Since the fluid-PIC algorithm is faster by orders of magnitude in comparison to PIC in such a case, we can reach further into the realistic parameter regime without sacrificing some essential microphysics. One of the most important kinetic effects is arguably Landau damping. The fluid description can emulate this effect using Landau closures (Hammett & Perkins 1990;Umansky et al. 2015;Hunana et al. 2019), which necessitates the computation of the heat flux in Fourier space. While Fourier transforms in 1D are not easily parallelizable, this bottleneck can partially be mitigated by performing global communications of the message-passing interface (MPI) in the background while processing the high computational load (e.g. resulting from evolving orbits of PIC particles) in the foreground. Simulations with periodic boundary conditions are currently handled by convolution with a finite-impulse-response (FIR) filter in our code, but other options are available in the literature (Dimits et al. 2014;Wang et al. 2019). A number of simplifying local approximations exist as well (Wang et al. 2015;Allmann-Rahn et al. 2018;Ng et al. 2020), which scale computationally well but become inaccurate for studying some multiscale plasma physics problems. Our code implements these different approaches so that an appropriate one can be chosen, dependent on the requirements of a simulation. Our implementation is massively parallelized and can be efficiently run on thousands of cores. Furthermore, the fluid-PIC method allows for any multi-fluid setup. As such, this framework allows for some straightforward extensions. Potentially, this involves a setup with actively participating neutrals to incorporate ion-neutral damping into this method. To this end, the coupling between different fluids needs to be extended by a collision term, which is left as a future extension to the code.
The outline of this paper is as follows. In section 2, we introduce the pillars of this method and describe the PIC method, the fluid solver, how we couple both methods by means of electromagnetic fields, and describe various implementations of the Landau closure. In section 3, we show validation tests of the fluid solver (shock tube tests), linear waves in an ion-electron plasma, and the damping rate of Langmuir waves in a singleelectron fluid with Landau closures. We then investigate the non-linear effects of two interacting Alfvén waves as well as cosmic-ray-driven instabilities, where fluid-PIC and PIC results are compared. We conclude in section 4. Throughout this work, we use the SI system of units.

Numerical Method
After a review of the kinetic description of a plasma in section 2.1, we briefly introduce our PIC method in section 2.2. The fluid description for plasmas and its assumptions are given in section 2.3. The finite volume scheme we use to numerically solve the compressible Euler equations is described in section 2.4, while the electromagnetic interactions of the fluid are described in section 2.5. In section 2.6, we describe the Landau closure we adopt in order to mimic the Landau damping in kinetic thermal plasmas within the fluid description, and detail its implementation in our code. We close this section by describing the overall code structure of the fluid-PIC algorithm and finally discuss the interaction between the modules via the current-coupling scheme (Section 2.7).

Kinetic description of a plasma
The kinetic description of a collisionless relativistic plasma with particles of species s with elementary mass, s , and elementary charge, s , is given by the Vlasov equation, is the distribution function, = is the spatial component of the four-velocity with the Lorentz factor = [1 + ( / ) 2 ] −1/2 , and is the light speed. The acceleration due to the Lorentz force is given by where ( , ) and ( , ) are the electric and magnetic fields, respectively. The evolution of electric and magnetic fields is governed by Maxwell's equations: where = 1/ √ 0 0 is the vacuum speed of light, and 0 and 0 are the permittivity and the permeability of free space, respectively. The evolution of the electro-magnetic fields is influenced by the charge density, , and current density, . They are given by the charge-weighted sum over all species of the number densities s and bulk velocities s respectively, (2.6) 2.2. The particle-in-cell method We use the PIC method to solve for the evolution of plasma species that are modelled with the kinetic description. The PIC method initializes a number of computational macroparticles to approximate the distribution function in a Lagrangian fashion. Each macroparticle represents multiple physical particles and, as such, each macroparticle has a shape in position space which can be represented by a spline function. By depositing the particle motions and positions to the numerical grid (or computational cells), the electromagnetic fields can be computed. This step is followed by a back-interpolation of these fields to the particle positions so that the Lorentz forces on the particles can be computed. In our implementation, these equations are solved using one spatial dimension and three velocity dimensions (1D3V), i.e. ∇ = ( / , 0, 0) T .
The code quantities are defined as multiples of the fiducial units given for time, fields (electric and magnetic), charge, current density and length (2.7) This enables us to select a fixed time step of where cfl < 0.5 to satisfy the Courant-Friedrichs-Lewy (CFL) condition. The value of the reference density 0 is chosen such that the code timescale, 0 , obeys −2 p = 2 0 . The total plasma frequency is p = ( 2 ) 1/2 , and related to the plasma frequencies of the individual species, 2 s = 2 s s /( 0 ). We define the discretized time = Δ , position = Δ and quantities at discrete position and times as = ( , ). For details on the PIC code SHARP, the reader is referred to Shalaby et al. ( , 2021. Here, we focus on describing how SHARP is extended to include fluid treatment of some plasma species.

Fluid description of plasma
A straightforward way of coarse graining the Vlasov equation (2.1) is to reduce its dimensionality. By taking the -th moment over velocity space, i.e. ∫ d 3 , we retrieve the fluid quantities and reduce the dimensionality of the 1D3V kinetic description to 1D. The number density s and the bulk velocity s are defined through the zeroth and first moment of the distribution function, respectively, while the total energy density per unit mass and the scalar pressure per unit mass are related to the second moment (Wang et al. 2015): (2.9) (2.12) Here, the pressure tensor is under the adiabatic assumption and the degrees of freedom are encoded in the adiabatic index Γ. The following relation is found from the definitions (2.13) The first three moments of the of the Vlasov equation are called the continuity, momentum, and energy conservation equations. A set of these equations is found for each fluid species, but the subscript s is neglected here for simplicity: (2.14) (2.16) We assumed the non-relativistic limit and an isotropic pressure tensor with vanishing non-diagonal components, i.e. the inviscid limit. The notation indicates the dyadic product of the two vectors and 1 is the unit matrix. Similar to the definition of the scalar pressure in equation (2.12) we use a definition of the heat flux vector, which is normalized to the degrees of freedom as well (2.17) The electromagnetic source term is given by The general form of the fluid equations can be written as where˜=˜( , ) = ( , , ) T is the fluid state vector at position ( , ), F is the flux matrix, and is the source vector. Numerically, the complexity of solving equation (2.19) can be reduced by splitting the operator into less complex sub-operators using Strang operator splitting (Strang 1968;Hakim et al. 2006). This enables us to use the most appropriate solver for each subsystem sequentially. We split the fluid update into three parts; the flux F excluding the heat flux (see section 2.4), the electromagnetic source em = / (see section 2.5.1), and the heat flux (see section 2.6). For commuting operators exp(Δ ) and exp(Δ em ) a second order accurate Strang splitting is obtained as (2.20) If and act independently on the entries and respectively, then the order of applying them can be varied and they need to be evaluated only once. In practice the formulation of might partially depend on . In this case, Strang splitting is performed on this part of the operator as well, see equation (2.43).

Finite volume scheme
The 1D3V fluid equations are solved using a finite volume method, where the fluid equations are averaged over the cell volume, which is an interval of length Δ in 1D, This enables us to correctly conserve the overall fluid mass, fluid momentum and fluid energy, even in the presence of large gradients, by utilizing Gauss' theorem: where the flux through an interface at is ( ) = [˜( , )], leading to the update equation (2.23) Integrating equation (2.23) in time is achieved by using second, third, or fourth-order Runge-Kutte methods (Butcher 2016). In contrast to the finite difference scheme used for electromagnetic fields and particles, where electromagnetic quantities are point values, fluid quantities discretized with the finite volume method are cell averages. This is useful, because the finite difference method does not guarantee the conservation of the conservation equations (2.14) through (2.16), which are governing the fluid; while on the other hand using the finite volume method for the electromagnetic fields needs additional steps to satisfy the constraint ∇ · = 0. Hybridization of both schemes to combine the advantages of each has been used before in other contexts, i.e. Soares Frazao & Zech (2002). The maximum time step in the 1D3V Euler equations, which allows for stable simulations, is Δ < cfl Δ × (| | + s ), with the speed of sound s = (Γ / ) 1/2 . For all realistic setups these velocities are limited naturally by the speed of light, | | < and s < , and this condition is automatically fulfilled by the time step criterion in equation (2.8).
In practice, only equation (2.8) together with a suitable Courant number of cfl 0.5 is used to determine the time step of the simulation.

Reconstruction
To approximate the flux at interfaces, we need to reconstruct the fluid state at cell interfaces. The accuracy of the reconstruction has a crucial influence on the diffusivity. A lower-order reconstruction can lead to excessive damping of waves, which might suppress relevant physical effects on longer timescales.
For reconstructing the point value˜( +1/2 , ), which is needed to compute +1/2 , we employ a central weighted essentially non-oscillatory reconstruction (C-WENO) scheme of spatial order five. The reconstruction computes two point values at each interface +1/2 , an interpolation from the left-and right-hand side. We reconstruct the primitive variables , , and individually.
Our implementation of the C-WENO method is based on the 5th order scheme presented in Capdeville (2008). An introduction to the topic can be found in Cravero et al. (2018a). The C-WENO reconstruction uses a convex combination of multiple low-order reconstruction polynomials to achieve high-order interpolations of the interface values while it employs a non-linear limiter to degrade this high-order interpolation to a lower order if the reconstructed quantity contains discontinuities. The fifth-order C-WENO uses three third-order polynomials L ( ), C ( ), R ( ) for each cell to interpolate the four adjacent cells in the following way: ) interpolates values at + 1 + 2 while the optimal fifth-order polynomial interpolates all of them: where 0 + L + C + R = 1. The polynomials 0 , L , C , and R are a convex representation of the opt polynomial. We use 0 = 3/4, C = 2/16, and L = R = 1/16. In general, we would like to use the reconstruction provided by the opt polynomial as frequently as possible because of its high-order nature. But this high-order reconstruction can cause oscillations similar to the Gibbs phenomenon at discontinuities. Therefore, we need to employ a limiting strategy to avoid such behaviour. In order to accomplish this, we re-weight all of our -coefficients by taking the smoothness of the associated polynomial into account (Jiang & Shu 1996). We define where is a measure for the overall smoothness of the reconstructed variables, and IS[ ] defines a smoothness indicator of the low-order polynomials. Because the formulae for these smoothness indicators are quite cumbersome, we list them in appendix A. These coefficients define a new set of normalized weights given by (2.26) The final reconstructed polynomial is then given by the convex combination of the loworder polynomials using this set of normalized weights: which we evaluate at the cell interfaces to calculate the required left-and right-handed interface values for the Riemann solver. We detail how these polynomials are evaluated in appendix A. The smoothness indicators IS[ ] vanish if the underlying polynomials are smooth. In this case, the re-weighted coefficients reduce to their original value → and the reconstructed polynomial reduces to the optimal polynomial rec ( ) → opt ( ).

Riemann solver
The previous reconstruction step determines two, potentially different, values˜L and R for each quantity to the left and right of every interface, thereby providing the initial conditions for the Riemann problem: An (approximate) Riemann solver is employed to compute the numerical flux F (˜). While a number of different families of Riemann solvers have been developed with individual strengths and weaknesses, we have decided to implement multiple solvers which can be changed on demand. Implemented solvers in fluid-SHARP include a Roe solver with entropy fix (Roe 1981;Harten & Hyman 1983) and an HLLC solver (Toro et al. 1994). While the Roe solver yields more accurate solutions and fewer overshoots in our tests in comparison to the HLLC solver, it becomes unstable in near vacuum flows and strong expansion shock waves. Even though differences between the solvers are easily visible in some shock setups and artificially extreme conditions, they are typically negligible in most applications common for thermal plasmas. We opt to employ the HLLC solver as our standard for stability purposes and use the Roe solver in cases where stronger shocks with overshoots are expected.

Electromagnetic interaction with charged fluids
In this section, we first introduce the Lorentz force as a source term in equation (2.15). Furthermore, we describe how the fluid influences the electromagnetic fields. With these two additional parts, the description from an uncharged gas in section 2.4 is expanded here to include plasmas.

Treatment of electromagnetic source term
Instead of integrating the energy equation (2.16), which would require evaluating the source term on the right-hand side, we compute the time evolution of the primitive pressure variable, for which the electromagnetic source term conveniently vanishes: (2.30) Then only the computation of the source term for the momentum equation (2.15) is left, which uses the Boris integrator (Boris et al. 1970) to account for the Lorentz force on the fluid momentum vectors. Up until now we have only applied the C-WENO method for conservation laws, however, by adding the source term, we are left with a balance law. In C-WENO formulations for balance laws it is customary to approximate the integral of the source term (equation 2.23) numerically to higher orders as well (Cravero et al. 2018b). We use Simpson's Formula for approximating equation (2.23) where the intra-cell values˜± 1/2 are interpolated by the same C-WENO scheme as used for solving the hydrodynamical equations, and the centre-value is computed selfconsistently with the numerical integration formula, i.e.˜= (6 −˜+ 1/2 −˜− 1/2 )/4. We also need to interpolate the electromagnetic field values to a comparable spatial order. This is achieved by performing finite-difference interpolations for each component from the Yee mesh discretized fields, that is and temporal order, = ( +1/2 + −1/2 )/2, again, for each component necessary. Lower order approximations produce, in our tests, similar results, but converge to slightly lower wave frequencies when compared with the analytical solution of the dispersion relation.

Deposition of charges
Equations (2.4) govern the electric field evolution, where Faraday's or Gauss' law might be used to compute . In this section we focus on the one-dimensional setup without particle contributions, which are explained in section 2.7. The perpendicular components' update, and , is received straightforwardly by discretizing Faraday's Law where the sum is taken over all fluid species s and are components of the fluid vector . For the component in spatial direction however, in order to enforce chargeconservation, Gauss' law in discretized form needs to be enforced for all 1 as well where the second equality uses the definition of cell averages in the finite volume scheme (see equation 2.21) and shows, that this numerical formula is exact. Another formula for updating ( ) 0 to the time step is still needed. In the analytical case Gauss' law in combination with the density conservation equation (2.14) for the analytical flux (or cell values) ∝ can be shown to be equivalent to Faraday's law; in the numerical case this equivalency is shown using the discretized conservation equation and corresponding numerical flux ∝ (˜) for the current density . Taking the time derivative of equation (2.35) in conjunction with the discretized density update equation (2.23) leads to the expression (2.36) The integration in time using Runge-Kutta methods is the same as used to solve equation (2.23). Faraday's law using fluxes in one spatial dimension is then given by and enables us to identify by comparison to the charge conservation equation (equation 2.14 multiplied by s ) Note, that the numerical flux also includes numerical diffusion and is directly related to changes in . Due to this, other formulations for violate the charge conservation equation and can lead to numerical instabilities.

Magnetic field evolution
Because the fluid evolution influences the magnetic field only indirectly, the finitedifference time-domain (FDTD) update for the magnetic field is unchanged from the previous SHARP code. For completeness we reproduce the formulae here (Shalaby et al. 2021) ( ) is constant in the 1D3V model because of the requirement ∇ · = 0.

Landau closure for fluid species
The highest retained fluid moment, which is in our case the specific heat flux , is not evolved in our set of equations. Instead, we need to estimate its value dynamically using an appropriate closure. The simple ideal gas closure sets = 0, which, however, prevents the energy dissipation of plasma waves. One important mechanism of such a dissipation is the collisionless damping of electrostatic waves achieved through Landau damping. Landau damping is a microphysical kinetic wave-particle interaction, where particles resonate with the wave exchange energy as a function of time. In essence, the resonant particles accelerate or decelerate to approach the wave's phase velocity, thereby picking up energy or releasing it, respectively. For Maxwellian phase space distributions, there are more particles at velocities smaller than the phase velocity, which yields a net damping, i.e., energy loss of the wave (Boyd & Sanderson 2003).
Various attempts, e.g. by Hammett & Perkins (1990), were carried out to approximate the heat flux of an almost Maxwellian distributed plasma, such that the kinetic phenomenon of Landau damping is mimicked in the linearized fluid equations. Landau damping is a non-isotropic effect, which can be reflected in the fluid descriptions. Accounting for the gyrotropy of the system around the magnetic field, often the doubleadiabatic law with two adiabatic coefficients parallel and perpendicular to the magnetic field is presupposed (Hunana et al. 2019). For now, we restrict our algorithm to isotropic pressures with only one common adiabatic coefficient for parallel and perpendicular pressure and leave this possibility of modelling anisotropic double-adiabatic systems open for future extensions of our algorithm. In our simplified model, we denote an isotropized pressure tensor with the adiabatic coefficient Γ = 5/3, instantly isotropizing all heating occurring due to the heat flux closure, while Γ = 3 denotes a negligible pressure in the and -direction. Hence, we define only the perturbed scalar heat flux parallel to the magnetic field line = and no perpendicular heat flux. Here, we will introduce two different formulae for heat flux closures. The first and most popular collisionless electrostatic closure was proposed by Hammett & Perkins (1990). We refer to it as the 32 closure throughout this paper, and it approximates the heat flux at a fixed Γ = 3, in Fourier space, bŷ Here, hats are used to denote quantities in Fourier space along the magnetic field line, i.e.ˆ= F ( ), and the subscript 0 refers to simulation box averages, that is 0 = c =0 / c is an average over all c cells. Furthermore B is the Boltzmann-constant, and Bˆ= (ˆ− B 0ˆ) / 0 . Since the plasma average or equilibrium temperature evolves slowly as a function of time, we adjust the background temperature 0 after every time step to synchronize it with the mean pressure, B 0 ( )/ = 0 ( )/ 0 , while the density conservation ensures that 0 stays constant. Note also, that 0 = 0. The dimensionless mass-normalized temperature is 0 = B 0 /( 2 ).
A more recent approximation was proposed by Hunana et al. (2018), who restricts this closure to Γ = 3 only, for reasons mentioned already. We use an ad hoc formulation of their closure with a variable Γ, thereby allowing our simplified model to be used. They also introduce the nomenclature adopted here, which is used to denote that the kinetic plasma response function is mimicked for this closure by a Padé approximant with polynomials / = of order and . We refer to their closure as 31 and it approximates the heat flux, in Fourier space, bŷ (2.42) In comparison to the 32 closure, this closure has an additional dependence on the perturbed bulk velocityˆ. This effectively increases the speed of sound obtained from the non-electromagnetic fluid equations and allows retrieving the correct damping rate with our ad-hoc assumption of variable Γ, see appendix C. For Γ = 3, we retrieve the coefficient forˆfrom the aforementioned literature 4 4− − 3 = 3 −8 4− . In only one spatial dimension, as assumed in our code, the global integration along a magnetic field line is approximated to be along the spatial direction, i.e. F = F . An extension to multiple spatial dimensions with an anisotropic pressure tensor is not straightforward because in this case, this approach can lead to spurious instabilities (Passot et al. 2014) and the integration would need to be carried out along magnetic field lines.
A kinetic code does not need global communication to accurately reproduce Landau damping, since each particle (or particle bin) tracks its own interaction with each wave mode as a function of time and accumulates this information in the particle velocity. However, after integrating out the individual particle velocities when building the evolution equations for the phase-space distribution function, i.e. equations (2.14)-(2.16), information about the individual particle-wave interaction is no longer collected. Because some information about this interaction is also contained in the wave, such nonlocal information can be used to approximate the gradient of the physical heat flux, i.e., a closure of the fluid moments that incorporates such missing information. This non-local information is approximated in equations (2.41) and (2.42), and is manifested by the term i sign ( ) in Fourier space, which is also referred to as the Hilbert transform.
Numerically, we do not include the heat flux in the Riemann solver used to compute the fluid fluxes. Instead, we compute the spatial derivative of the heat flux ∇ · separately. We use Strang splitting for the dependent part and the temperature dependent part to expand equation (2.20) into where the derivative in Fourier space was obtained by multiplying with i and the inverse FFT is denoted by F −1 . For the 31 closure the coefficients are given by = 4/(4 − ) and = (4 − ) −1 (2 0 ) 1/2 0 B / , while for the 32 closure these are given by = 0 and = 2(2 0 / ) 1/2 0 B / . Both closures compute a term proportional toˆ(cf. Computing this term naively using the FFT is expensive. This is why, in the following, we present local, semi-local, and efficient global (Fourier transform-based) numerical approximations of the Landau closures, which we have implemented in the fluid-SHARP code.

Local approximations of the Hilbert transform
The phase shift between the wanted derivative iˆand the input ofˆin equation (2.46) is exactly 0, while the amplitude is proportional to | |. This is therefore a special case ( = 1) of the fractional Riesz derivative / | | with Fourier representation where ∈ R. Note, that all approximations mentioned here only introduce errors in the amplitude of | |, but not in its phase. This makes them easier to integrate into simulations in comparison to approximations which are not designed to prevent phase errors, because large phase errors (between π/2 and 3π/2) in any wave mode transform the damping term into an exponentially growing numerical instability. The local approximations make use of the fact, that the fractional Riesz derivative is local and cheap to evaluate for the special case = 2 with ∈ N 0 , where it reproduces the usual derivative 2 / | | 2 = (−1) +1 2 / 2 . Wang et al. While the less accurate local approximations use an arbitrary value of 0 , the FFT is expensive and depends on periodic boundary conditions. Here, we aim to have a fallback algorithm as a compromise between both approaches.
A digital finite impulse response (FIR) filter can be designed to approximate the nonlocal effects by convolving the simulation data with adjacent auxiliary data points, where the filter length determines the maximum distance. For example, an asymmetric filter with an even number of entries is applied on an input using filter coefficients , producing the output : A numerical derivative is then an asymmetrical filter with = 2 and coefficients ±0.5 = ±/Δ , such that +0.5 = ( +1 − )/Δ . Figure 1 shows the magnitude of the frequency response. The gradient driven case shows a quadratic 2 dependence, which is suppressed for larger . This is due to the relatively small uneven filter length of 7 used here; the filter length is an important parameter, since it influences the accuracy of the approximation. With a filter length corresponding to the simulation box size the results can converge to the FFT-based algorithm (i.e. the 2 dependence is not suppressed at higher ), if the filter is designed appropriately. As noted previously, the local closures do not converge to / | |. A correct convergence for approximating / | | is obtained through the high order formulation by Ding et al. (2015). However, this filter violates energy conservation for smaller filter length and is thus, not suitable for our case. Instead, we construct the filter by adopting a convolution of two sub-filters, each of which has an odd amount of asymmetric entries (termed a Type IV filter) similar to the numerical derivative mentioned already. By design, their output has a vanishing mean, thereby guaranteeing energy conservation. A symmetric splitting into the sub-filters / | | = ( 1/2 / | | 1/2 ) 2 is possible, however its frequency response is not monotonic (and has visible ripples) for small filter lengths. This leads to the nonphysical case that some waves at a particular wavenumber are damped less than their slightly larger scale waves at − . Instead, we opt to use the intuitive splitting of / | | = / H where the Hilberttransform filter H is equivalent to −i sign ( ) in Fourier space. The filter H has coefficients = 1/( ). We derive an equivalent formulation to equation (2.45), which is first order in time, by applying the derivative and Hilbert-transform filters successively, i.e. Note, that the derivative is also computed by convolution and has a separate filter length corresponding to its spatial order. We opt to use the same spatial order as in the C-WENO reconstruction for the finite volume scheme. Even for small Hilbert-transform filter lengths in comparison to the number of cells, e.g. / c = 0.04 as shown in figure 1, this formulation dramatically improves the accuracy of multiscale problems in comparison to local approximations. Here, is critical for the accuracy at small wavenumbers , while the spatial order of the derivative is critical for the accuracy at large . Most importantly, this semi-local approach does not require setting an arbitrary damping scale 0 such as the local approximations mentioned before. The only parameter of this approach is the filter length, which should be chosen to be sufficiently large.

Efficient FFT-based computation of the Hilbert transform
Provided the plasma background is uniform and periodic, the most accurate while computationally most expensive results are achieved by computing the heat flux of the fluid in Fourier space. While the FFT is easy to compute on a single computer using standard numerical libraries, our code is parallelized using MPI and an efficient onedimensional FFT is needed. The computation of the Fourier transform is expensive for two reasons: (i) globally, each Fourier component needs to be informed about data from every other computational cell (which may be stored on a different processor), and (ii) the Fourier transform is not easily parallelizable in one dimension, which precludes an efficient scalable Fourier algorithm.
This naturally limits the overall computational scalability of the fluid part of the code. Communication over multiple MPI processes is time consuming because of latency and finite bandwidth. For this reason, parallel FFT algorithms are prone to become a computational bottleneck. However, using non-blocking MPI routines to perform communication in the background can be used while the high computational load of the particles is carried out. Thus, in our case of a combined fluid and PIC algorithm, the communication required for an accurate FFT-based heat flux computation is comparatively computationally cheaper, even with relatively small numbers of PIC particles. Hence, in our case the FFT algorithm does not necessarily become a bottleneck for larger problems. In order to distribute the computational load of the FFT, we employ a four-step algorithm in the first step of the computation (Bailey 1990;Takahashi & Kanada 2000), which extends the Cooley-Tukey algorithm (Cooley & Tukey 1965) for multiple processors. We shortly describe the algorithm for complex input data as found in the literature and afterwards adapt the parallel FFT for real input data in our implementation. The fourstep algorithm interprets the complex data vector of length as a two-dimensional vector = 1 , 2 with lengths 1 and 2 respectively, and volume 1 2 = . The mapping = 1 + 2 1 and = 2 + 1 2 is inserted into the definition of the discrete Fourier transform, where Ψ = exp{−2 i} This way, a complex-to-complex parallel FFT of length is distributed to 1 local FFTs of length 2 , a multiplication by the twiddle factors Ψ 1 2 / and finally 2 FFTs of length simultaneously, they can be computed on different nodes. This has the advantage, that the second all-to-all communication for the transpose is not completely global resulting in reduced communication times. Adapting this algorithm to a real-to-complex FFT, where due to Hermitian symmetry only values of /2 need to be computed, a large amount of computational and communicational savings can be realized. A real-to-complex parallel FFT of length is distributed to 1 local real-to-complex FFTs of length 2 , a multiplication by the twiddle factors Ψ 1 2 / and, now only, 2 /2 + 1 complex-to-complex FFTs of length 1 . Up to two of the latter FFTs can be replaced by real-to-complex FFTs, along the axes 2 = 0 and, if 2 is even, 2 = 2 /2. A scrambled output is received, which, due to Hermitian symmetry, needs to be partially complex conjugated.
A key point in ensuring the efficiency of the parallel four-step algorithm consists in choosing large 1 and 2 . 1 2 √ is the optimal choice for the distributed complexto-complex FFT, the real-to-complex FFT should prefer 1 2 /2 +1 ( √ 2 + 1+1)/2. The computational scaling with processors and roughly optimally distributed 1 and 2 is akin to O ( / log ), but degrades if is a prime number, or, more generally, if 1 or 2 /2 is smaller than the number of processors. This easily avoidable because is a free parameter, and so are 1 and 2 . While this does not scale favourably in comparison to the O ( / ) scaling that dominates the rest of the fluid code, still, the FFT is trivially independent of the numbers of particles per cell pc . The PIC-module on the other hand scales as O pc / and typical applications have pc 100. In many applications the cost of the Fourier transform is, even with worse scaling, subdominant in comparison to the cost of the PIC part. In the remaining cases, local approximations, discussed above, are favourable.

Current-coupled fluid-PIC algorithm
The coupling in our code between various fluid and kinetic (PIC) species is achieved through a current-coupling scheme. Namely, both fluid and kinetic species contribute to the charge and current densities. The electromagnetic fields then evolve in response to the total contributions. The fields are staggered on a Yee-mesh and are updated with the FDTD scheme. Subsequently, both fluid and kinetic species evolve in repose to the new electromagnetic fields. That is our current-coupling scheme does not make any assumption on the velocity distribution of the species modelled using the kinetic description (Park et al. 1992).
The PIC species, using fifth-order spline interpolation, are deposited to specific points on the Yee-grid for which the charge density is defined at full-time steps while the current density is defined at half-time steps as discussed by Shalaby et al. ( , 2021. For fluid species, the fluid density and velocity are defined at the same time step. Therefore, during the evolution of the fluid, we deposit the fluid contribution to the charge and current densities, and , respectively, at the cell centres. The deposition for , is trivial at half-time steps, where the fluid vector is defined, while the contribution to is computed at full-time steps, i.e. before the electromagnetic source update according to equation (2.20). Note, that stays constant when computing the Lorentz force and heat flux updates.
Our algorithm does not apply any approximations to the electrical field components or to Ohm's law, requiring electron timescales and motions to be fully resolved. Consequently, we apply the same algorithm to fluid electrons and protons. This is accomplished using the modular design of the fluid SHARP code where each fluid species is represented by initialising a fluid code class. Each instance of this code class is initialized using the values of the mass and the charges of their respective particle species. The algorithms which define the evolution of each particle species are implemented as functions of the fluid class. This allows us to setup simulations with multiple species, all of which are evolved with the same numerical algorithms, with little effort.
In figure 2 the main loop of the fluid-PIC algorithm is presented. It can be seen that the usual PIC-algorithm loop of electromagnetic update, interpolation to particle position, particle push, and field deposition is retrieved when no fluid species is initialised. On the other hand, without PIC particles, we retrieve a multispecies fluid plasma code. While our fluid-PIC algorithm can simulate an arbitrary mixture of species, it is most efficient if fluids are used for background species and particles for non-thermal particle distributions. Possibilities for task parallelization are shown in figure 2 by dashed lines, which allows maximizing computation-communication overlap.
Our fluid implementation is included within the SHARP code, which uses a fifth-order spline function for deposition and back-interpolation for PIC species (Shalaby et al. , 2021. The PIC part of the code does not make use of filtering grid quantities and results in comparatively small numerical heating per time step, which (if present) would affect the reliability of the simulation results on long timescales (see section 5 in ). This property is important because we are specifically interested in studying microphysical effects on long timescales with our fluid-PIC code. Due to the modularity of our code, each part can be tested individually. These tests, ranging from the uncharged fluid solver to full fluid-PIC simulations, are shown in the next section.

Code validation tests
In this section, we present the results of various code tests. We start with two shocktube tests in section 3.1 before we show that our code is able to accurately capture all six branches of the two-fluid dispersion relation (Section 3.2). We describe code tests of Langmuir wave damping (Section 3.3) and of two interacting Alfvén waves generating   Table 1: Parameters adopted for the shock tube tests described in section 3.1. 0 divides the domain into two halves, where values to the left of 0 ( < 0 ) are initialized by the parameters with subscript l. Similarly, subscript r indicates parameters to the right of 0 .
a new, longitudinal wave along the magnetic field (Section 3.4). In section 3.5, we test the entire fluid-PIC code with a simulation of the gyrotropic CR streaming instability, where PIC CRs are streaming in a stationary electron-proton fluid background. Finally, we demonstrate the successful parallelization strategy of our code by performing scaling tests in Section 3.6.

Shock tube
As the fluid approximation will be primarily used for background plasmas without excessive gradients, the accuracy of resolving sharp discontinuities is of secondary importance in practical applications. Still, we stress test our implementation of the fluid equations to ensure its numerical robustness and to compare the numerical dispersion for Test 1, as shown in figure 3a, is a modified Sod shock tube test. The sonic rarefaction wave on the left-hand side as well as the shock front on the right are well resolved without noticeable oscillations. The contact discontinuity in the middle introduces small oscillations in the density and is smeared out more than the shock front. While the Roe and HLLC solvers yield almost the same results, the HLLC solver is slightly better at resolving the sonic point at the head (to the left) of the sonic rarefaction wave, which the Roe solver can only resolve because an entropy fix is applied. Figure 3b shows a test of a stationary contact discontinuity with a shock front of a high Mach number travelling to the right and a rarefaction wave to the left. It can be seen, that while the HLLC method introduces more oscillations, it is also better at resolving the contact discontinuity.
In low-density flows the Roe solver is not suitable because it is not robust without further modifications (Einfeldt et al. 1991), making the HLLC method slightly more robust while the Roe method is slightly less dispersive. However, for most practical applications studied here, both methods produce similar results.

Two-fluid dispersion relation
For an ideal two-fluid plasma the dispersion relation can be solved for six different wave branches (Stix 1992). We show the solutions to the dispersion relation of a two-fluid  Figure 5: The linear dispersion relations of a Langmuir wave with immobile ions. Shown are, on the left-hand side, the real frequency components and, on the right-hand side, the negative imaginary frequency components (which are responsible for damping). The crosses present data points obtained from simulations with the respective closure while the theoretical result is shown with a solid line. The relative error between simulation and theoretical results ( sim − theor )/ theor is shown in the lower panels. For reference, the red crosses display the data points as given in table 1 of . plasma in figure 4 for a realistic mass ratio of i = 1836 e and i = B i /[ 2 0 /(2 0 )] = 0.2 in an isothermal plasma. 0 is oriented along the -axis and the Alfvén velocity is A = 0 /( 0 i i ) 1/2 = 5.83 × 10 −3 . Multiple simulations at different wavenumbers have been initialized that have all six wave modes simultaneously present and were run for a total time of 14/min ( ), where denotes the wave frequencies, which are always completely real for an ideal fluid. Consequently, the waves should be undamped and any possible damping introduced is because of numerical dissipation. Initial conditions for all of our fluid simulations as well as theoretical predictions are computed using an extended algorithm based on the dispersion solver by Xie (2014), which can take into account the effects of both heat flux closures. A Fourier analysis in time has been performed and the six largest local extrema are shown as crosses in Fig. 4. It can be seen, that the simulation results are in good agreement with the analytical results. In the Fourieranalysis the largest relative errors of at most 7 per cent in occur in the large-scale part of the ion-acoustic branch as well as close to the cut-off frequency of the lower LCP branch. In comparison to this, the largest relative errors in the upper three branches are more than one magnitude less.

Langmuir wave damping
The electrostatic wave modes are directly subject to linear Landau damping, and thus present a good test for the heat flux closures. To test this, we initialize standing Langmuir waves in an electron plasma with immobile ions. We use the same grid layouts as in table 1 of , supplemented with fluid simulations run at / D ∈ {0.1, 0.2, 0.3} with a resolution of λ/Δ = 68 cells per wavelength and a domain size of length = 10λ wavelengths. The wavenumber associated with the Debye length is the ratio of plasma frequency to thermal velocity, i.e. D = p / 1/2 . The amplitude of the wave is chosen, such that the density fluctuation to background ratio is fixed to / 0 = 10 −3 .
In order to find the numerical dispersion relation we perform curve fitting with the Powell algorithm on the time series for times up to 80 −1 p , while the simulations at / D = 0.01 and 0.05 with small damping are analysed up to 240 −1 p . The computation Figure 6: Two different Alfvén waves, with magnetic and velocity vectors 1 , 2 and 1 , 2 , propagate transversally along the -axis, where the electromagnetic vectors rotate (counter-)clockwise around it. Because of their phase difference Δ the overall Lorentz force ( 1 + 2 ) × ( 1 + 2 ) in -direction is non-zero, thereby generating the longitudinal wave shown in dark yellow.
of the heat fluxes for the 31 and 32 closures is performed using the FFT-based method. The results are shown in figure 5, where the ideal gas closure and the kinetic results are also depicted for reference.
Generally, it can be seen, that at small scales the closures show larger deviations from each other, which is also where the fluid description starts breaking down naturally as the particle distribution is not in equilibrium. At larger scales, the various descriptions of Landau damping converge and approach zero. The numerical relative error of the fluid code is small and stays below 0.003 per cent for real frequencies and below 0.02 per cent for decay rates in this setup. The simulation at / D = 0.05 performs worse than the one at / D = 0.1 due to the significantly lower resolution. The error in decreases at second-order with increasing spatial resolution, as shown in appendix B.

Interacting Alfvén waves
A single Alfvén wave is purely transversal and not directly affected by Landau damping. However, two or more Alfvén waves drive a longitudinal electrostatic wave, which is susceptible to Landau damping, see figure 6. This leads to particle heating as a result of the collisionless damping of the Alfvén wave, also known as non-linear Landau damping.
Restricting ourselves to a setup of pairwise interacting waves, we can identify two distinct cases. In the first case counter-propagating waves are interacting. In consequence, both waves damp, lose energy to the longitudinal wave and subsequently heat the particles. In the second case the waves are co-propagating. Here the wave with the smaller wavelength will not only transfer energy to the particles, but also to the other Alfvén wave. Lee & Völk (1973) describe this mechanism in detail and formulate the following coupled set of differential equations while adopting a measure for the magnetic energy  The coupling between the differential equations is implicit because the damping coefficient has the dependency Γ 1 ∝ 2 . For the counter-propagating case with an isothermal ion-electron-plasma in the high beta limit i = 2 0 i B i / 2 0 = 2 1, where 0 is the background magnetic field strength, the damping rate Γ is approximately equal for both wave polarizations with similar frequencies and may be approximated by (Holcomb 2019) Note that Γ 2 is found by substituting the subscripts 1 → 2 and 2 → 1. In figure 7 we show simulations of a linearly polarized Alfvén wave, which consists of two counter-propagating waves of equal amplitude. The pure fluid simulations are shown with a box size of = 252 / i and wavelengths λ = /3. Right and left polarized waves are initialized with phase velocities RCP / = 0.0342 and LCP / = 0.0318 with a perpendicular magnetic field amplitude of = 0.1 0 . A reduced mass-ratio of i / e = 100 is adapted here.
Our simulations are carried out with the different heat flux closures 32 and 31 , as shown in figure 7. Both closures reproduce the theoretical predictions quite well. A PIC simulation with similar parameters has been shown in figure 6.4 by , which reproduces half of the predicted damping rate until ∼ 2 and shows a quenching of the damping rate afterwards. In comparison to kinetic simulations, there is no saturation of the Landau-damping effect in fluids. This is because the distribution of the fluid particles is always assumed to be roughly Maxwellian and resonant particles are not depleted as a function of time. Hence, Landau fluid is implicitly assumed to have small thermalization timescale in comparison to the damping timescale. On the other hand, PIC simulations are plagued by Poisson noise and an insufficient resolution of velocity space might lead to a reduced Landau damping rate.

Gyrotropic CR streaming instability
To test the entire code, we run CR streaming instability simulations, where electron and ion CRs are modelled with the PIC method and the background electron and ion plasmas are modelled as fluids. The initial CR momentum distribution for ions (electrons) is assumed to be a gyrotropic distribution with a non-vanishing (zero) pitch angle, while both CR electrons and ions are assumed to drift at the same velocity dr . Namely, the phase space distributions for the electron and ion CR species ∈ {e, i} are given by (Shalaby et al. 2021) cr, ( , ) = cr, where = (1 − 2 dr / 2 − 2 ⊥, / 2 ) −1/2 is the Lorentz factor and ⊥, is the perpendicular component of the CR velocity. We choose ⊥,e = 0 and ⊥,i = 13.1 A , where the ion Alfvén velocity is given by A = 0 /( 0 i i ) 1/2 = 0.01 with the background magnetic field pointing along the spatial direction, and dr of 5 A resulting in a pitch angle for the ions of tan −1 ( ⊥,i / dr ) = 69.1 • . The thermal background species are isothermal with the temperatures B /( 2 ) = 10 −4 and a mass ratio i / e = 1836. We use a periodic box of length = 10 971.5 / p and resolution Δ = 0.1 / p . The CR to background ratio number density ratio = cr,i / i = 0.01.
We run two simulations where the background plasmas are modelled as fluids. The first one uses an ideal gas closure without accounting for Landau damping (FPIC ideal gas) while we include the heat flux source term in the second simulation to mimic the impact of linear Landau damping using the 31 closure of equation (2.42) (FPIC Landau 31 ). We compare these two fluid-PIC simulations against PIC simulations where both CRs and background plasmas are modelled as PIC species. The number of CR ions per cell is pc = 25 (75) and we call this simulation "PIC normal (high) pc " (Shalaby et al. 2021). Like the "PIC normal pc " simulation, the fluid-PIC simulations also use 25 particles per cell for modelling CRs.
Growth rates of the instability in the linear regime can be computed from the linear cold background plasma dispersion relation (Holcomb & Spitkovsky 2019;Shalaby et al. 2022): (3.4) The non-relativistic and relativistic cyclotron frequencies of each species are given by Ω s,0 = 0 / and Ω s = Ω s,0 / s respectively. The wavelength of the most unstable wave mode at the gyroscale is λ g = 2 ( dr − A )/Ω i , which is properly captured in our setup using a box size of ∼ 10.15λ g . We show the amplification of the perpendicular magnetic field components as a function of time for this unstable setup in figure 8 for various simulations. It shows that the noise level of the fluid-PIC simulations is orders of magnitude lower in comparison to the "PIC normal pc " resolution, even though the number of CR particles per cell is the same. Especially up to the saturation point ( Ω i ∼ 10) the fluid-PIC simulation compares more favourably to the PIC results with lower noise than to the PIC simulation with fewer pc . After saturation, i.e. when Alfvén waves at many scales have built up and their interaction has created an electrostatic field, these waves start to lose some energy to Landau damping of the electrostatic waves (see section 3.4). At that point, the Landau closure becomes relevant. Qualitatively the ideal gas closure has no efficient mechanism for dissipating such electrostatic waves, resulting in a prolonged growth period leading to saturation at higher values at the cascading and intermediate scales. Utilization of a Landau closure leads to some damping, albeit it is quantitatively smaller than in the PIC simulations. While figure 5 indicates faster damping for the Landau closures in comparison to the kinetic results in the electron electrostatic branches, damping in the ion-acoustic branch might be underestimated in the Landau closures. We have compared the expected damping between kinetic and Landau fluid in the ion-acoustic branch for multiple wavenumbers, which confirmed that this is a likely scenario. The accuracy of this approximation is not the same at all scales, which can be seen in figure 9, where the magnetic field amplifications at various ranges of scales are compared. Especially in the highly Landau-damped scales, differences between fluid-PIC and PIC emerge. At ion gyro scales, where most of the magnetic energy is stored at saturation, there is a good agreement over the entire time period. Exponential growth at every scale is also in good agreement between PIC and fluid-PIC simulations at all scales. The initial exponential growth can also be compared to the expected growth rates from the linear dispersion relation. The growth rates of the two local maxima are plotted alongside the simulated data, one at the intermediate scales around = 4.91 i and one at the gyro scale at = 0.38 i . The intermediate scale starts an inverse cascade to larger scales almost immediately, which causes a reduced growth rate in comparison to the expectation from linear theory. By contrast, the gyro scale instability follows linear expectations to very good approximation.
While our fluid-PIC and PIC results are promisingly similar, differences after the saturation level might be attributed to multiple reasons. First, the Landau closures do not exactly reproduce the correct damping, and therefore will deviate quantitatively. Second, due to the high electron temperature chosen, relativistic effects might occur in PIC, but not in the non-relativistic fluid that we assumed for the background plasma. Third, the PIC method might exhibit more numerical dissipation at the given pc in comparison to the fluid method. However, figure 9 seems to indicate numerical convergence at the intermediate and gyro scale.
Even though our simulations were run at unrealistically high , the background particles did not deviate significantly from the Maxwellian distribution at the end of the simulation time. This indicates, that a fluid description for background species is indeed a valid approach for this setup, especially for smaller, more realistic values of .

Computational scaling
We show the strong scaling properties of our fluid-PIC code in Fig. 10. The tests were run on Intel Cascade 9242 processors with 96 processors per node at the HLRN Emmy cluster. Simulations with 3000 processors or more typically cause severe bottlenecks due to the latency and/or the finite bandwidth of input/ouput operations. For this number of processors the Fourier-based closures are roughly 20 per cent more costly in comparison to the ideal gas closures. This is in stark contrast to pure PIC simulations, which scale with the inverse ratio of CR-to-background density −1 , consequently the fluid-PIC algorithm leads to a speed-up of a factor of 100 for the simulation performed in section 3.5, which adopted unrealistically large . The bottleneck in the communication procedure of our implementation is currently the "Ialltoallv" MPI routine, which is not optimized for hierarchical architecture networks as of now. Further optimizations to this might provide fruitful in increasing the code's scalability further if necessary.
The fluid-PIC simulations in section 3.5 used only pc = 25 and seem to be sufficiently resolved. For such a low particle number, the FFT is the bottleneck for scalability because the overlap of communication and computation is small, i.e. we measure a 260 per cent increase in time with 2880 processors, while at 192 processors the increase is below 20 per cent. This indicates that scalability of fluid-only simulations is dominated quickly by the FFT, while the cost is almost negligible for fluid-PIC simulations. Still, simulations with only a few particles per cell are computationally inexpensive so that there is no reason for performing such a simulation on thousands of processors. Furthermore, the example of a mono-energetic cold CR beam is not very demanding regarding the phasespace resolution. More realistic scenarios include power law distributions for the CR population as well as larger spatial density inhomogeneities, both resulting in an increased requirement for the number of particles per cell in order to accurately resolve the velocity phase-space distribution along the entire spatial domain.

Conclusion
In this paper, we introduce a new technique termed fluid-PIC, which uses Maxwell's equations to self-consistently couple the PIC method to the fluid equations. This tech-nique is particularly aimed at simulating energetic particles like CRs interacting with a thermal plasma. This enables us to resolve effects on electron time and length scales and to emulate Landau damping in the fluid by incorporating appropriate closures for the divergence of the heat flux. The underlying building blocks of our implementation are the SHARP 1D3V PIC-code extended by a newly developed fluid module and the overall algorithm is second-order accurate in space and time. While an ideal fluid does not exhibit Landau damping, we have implemented two different Landau fluid closures and studied their performance. Here we summarize our main findings: • We developed a stable multi-species fluid code that is coupled to explicit PIC algorithm.
In order to couple multi-fluid equations to Maxwell's equations, very often implicit and semi-implicit methods have been used for stability reasons. However, the resulting interdependency between all fluids complicates their coupling to explicit PIC methods.
To ensure numerical stability, Riemann solvers that provide some numerical diffusion are used. However, we demonstrate that the level of numerical diffusivity needs to be carefully controlled so that it does not numerically damp small-amplitude plasma waves or quench plasma instabilities. Most importantly, our new fluid-PIC code fully resolves the electron timescales, precluding the need to adopt any simplifying assumptions to the electrical field components or to Ohm's law. • We compare various Landau fluid closures and demonstrate that local closures only produce reliable results close to a characteristic scale while they are prone to fail in multi-scale problems. By contrast, semi-local spatial filters or global (Fourier-based) methods to estimate Landau fluid closures produce reliable results for a large range of scales. Most importantly, we demonstrate that the inclusion of communication intensive (Fourier-based) fluid closures only have a minimal impact on our code performance (through the usage of non-blocking background communication) because the majority of the computational workload is taken up by the much more cost-intensive PIC module. This enables us to make use of the more accurate Fourier-based Landau closure for the fluid instead of relying on local approximations only. • In numerical tests, our implementation of the multi-species fluid module showed excellent agreement with theoretical frequencies and damping rates of Langmuir waves, oscillation frequencies of various two fluid wave modes, as well as the non-linear Landau damping of Alfvén waves. • First simulations of the CR streaming instability with our combined fluid-PIC code provide very good agreement with the results of pure PIC simulations, especially for the growth rates and saturation levels of the gyro-scale and intermediate-scale instabilities. This success is achieved at a substantially lower Poisson noise of the background plasma at the same number of computational CR particles per cell. Most importantly, the numerical cost of the fluid-PIC simulation is reduced by the CRto-background number density ratio. However, we find that the late-time behaviour of the CR streaming instability differs for our fluid-PIC and PIC simulations. More work is needed to understand the reason for this, which could be either resulting from (i) numerical damping due to Poisson noise resulting from the finite number of PIC particles, (ii) missing relativistic (electron) effects in our non-relativistic fluid dynamics, or (iii) missing physics in our fluid closures that may be underestimating other relevant collisionless wave damping processes.
Three possible future extensions of the algorithm are left open here. (i) Extending the fluid formulation with a full pressure tensor, (ii) extending the code to two or three spatial dimensions, and (iii) the inclusion of direct interaction terms between the various fluids to explicitly incorporate scattering processes such as ion-neutral damping. The novel fluid-PIC framework greatly extends the computationally limited parameter space accessible to pure PIC methods whilst not compromising on some of the most important microphysical plasma effects. This opens up many possibilities for studying CR physics in physically relevant parameter regimes, such as the growth and saturation of the CR streaming instability in different environments, and including the effect of partial ionization, ionneutral damping and inhomogeneities of the background plasma.
The low-order polynomials are evaluated at the left-hand interface of a given cell via:  where Γ eff = +Γ = 4/(4− ) 4.66 can be interpreted as the effective adiabatic index of the fluid. The evolution of sound waves of a non-electromagnetic fluid in the linear regime is governed by the linear term Γ eff 0 ∇· while the term Γ 1 ∇· adds non-linearity to this equation. In the linear approximation, the speed of sound becomes s = (Γ eff 0 / 0 ) 1/2 which coincides with the typical expression for the sound speed s = (Γ 0 / 0 ) 1/2 in the limit of = 0. This implies that the speed of sound is increased for the 31 closure even if direct Landau damping is not present ( = 0). Interestingly, the effective adiabatic index and the speed of sound are independent of the choice of Γ. If direct Landau damping, as described by the 31 closure, is affecting the fluid (i.e., ≠ 0), then the effective adiabatic index attains somewhat smaller values in comparison to + Γ while the wave frequency becomes complex because of the associated damping. Both are still independent of the choice of Γ.
This has consequences for simulations that model mildly relativistic fluids. If a simulation setup includes a fluid with an associated speed of sound near the speed of light s , then a simulation that uses this setup with the 31 closure can become unstable because s can now exceed the speed of light because of the aforementioned reason.