An introductory guide to fluid models with anisotropic temperatures. Part 2. Kinetic theory, Padé approximants and Landau fluid closures

In Part 2 of our guide to collisionless fluid models, we concentrate on Landau fluid closures. These closures were pioneered by Hammett and Perkins and allow for the rigorous incorporation of collisionless Landau damping into a fluid framework. It is Landau damping that sharply separates traditional fluid models and collisionless kinetic theory, and is the main reason why the usual fluid models do not converge to the kinetic description, even in the long-wavelength low-frequency limit. We start with a brief introduction to kinetic theory, where we discuss in detail the plasma dispersion function $Z(\unicode[STIX]{x1D701})$ , and the associated plasma response function $R(\unicode[STIX]{x1D701})=1+\unicode[STIX]{x1D701}Z(\unicode[STIX]{x1D701})=-Z^{\prime }(\unicode[STIX]{x1D701})/2$ . We then consider a one-dimensional (1-D) (electrostatic) geometry and make a significant effort to map all possible Landau fluid closures that can be constructed at the fourth-order moment level. These closures for parallel moments have general validity from the largest astrophysical scales down to the Debye length, and we verify their validity by considering examples of the (proton and electron) Landau damping of the ion-acoustic mode, and the electron Landau damping of the Langmuir mode. We proceed by considering 1-D closures at higher-order moments than the fourth order, and as was concluded in Part 1, this is not possible without Landau fluid closures. We show that it is possible to reproduce linear Landau damping in the fluid framework to any desired precision, thus showing the convergence of the fluid and collisionless kinetic descriptions. We then consider a 3-D (electromagnetic) geometry in the gyrotropic (long-wavelength low-frequency) limit and map all closures that are available at the fourth-order moment level. In appendix A, we provide comprehensive tables with Padé approximants of $R(\unicode[STIX]{x1D701})$ up to the eighth-pole order, with many given in an analytic form.

6. APPENDIX 86 6.1. Higher order Padé approximants of R(ζ) 86 6.1.1. 5-pole approximants of R(ζ) 86 6.1.2. 6-pole approximants of R(ζ) 87 6.1.3. 7-pole approximants of R(ζ) 89 6.1.4. 8-pole approximants of R(ζ) 92 6.2. Precision of R(ζ) approximants 95 6.3. 5th-order moment 98 6.4. 6th-order moment 99 6.5. Operator (E + 1 c v × B) · ∇ v f 0 for gyrotropic f 0 102 6.6. The general kinetic f (1) distribution (effects of non-gyrotropy) 104 6.6.1. Case ψ = 0, propagation in the x-z plane 109 The incorporation of kinetic effects, such as Landau damping, into a fluid description naturally requires some knowledge of kinetic theory. There are many excellent plasma physics books available, for example Akhiezer et al. (1975), Stix (1992), Swanson (1989), Gary (1993), Gurnett & Bhattacharjee (2005), Fitzpatrick (2015) and many others. These books cover numerous topics in kinetic theory that need to be addressed, if a plasma physics book wants to be considered complete. However, the topics that are required for the construction of advanced fluid models are often covered only briefly, or not covered at all. For example, the Padé approximation of the Maxwellian plasma dispersion function Z(ζ) or the plasma response function R(ζ) = 1 + ζZ(ζ), which is a crucial technique for the construction of collisionless fluid closures valid for all ζ, is not addressed by any of the cited plasma books.
A researcher interested in collisionless fluid models that incorporate kinetic effects has to follow for example Hammett & Perkins (1990); Hammett et al. (1992); Snyder et al. (1997); Passot & Sulem (2003); Goswami et al. (2005); Passot & Sulem (2006, 2007; Passot et al. (2012); Sulem & Passot (2015) and references therein. The first three cited references are written in the guiding-center reference frame (gyrofluid), which is a very powerful approach that enables the derivation of many results in an elegant way. However, the calculations in guiding-center coordinates can be very difficult to follow. The other cited references are written in the usual laboratory reference frame (Landau fluid), but, the kinetic effects considered are of an even higher-degree of complexity and the papers can be very difficult to follow as well. There are other subtle differences between gyrofluids and Landau fluids and the vocabulary is not strictly enforced.
Additionally, the cited papers assume that the reader is already fully familiar with the nuances of the kinetic description, such as the definition of the plasma dispersion function Z(ζ) and the very confusing sign of the parallel wavenumber sign(k ), that almost every plasma book appears to treat slightly differently. This brief guide, which is a companion paper to "A Brief Guide to Fluid Models with Anisotropic Temperatures. Part 1: CGL Description and Collisionless Fluid Hierarchy", attempts to be a simple introductory paper to the collisionless fluid models, and we focus on the Landau fluid approach. The text is designed to be read as "lecture notes", and may be regarded as a detailed exposition of Hunana et al. (2018).
In Section 2, we introduce kinetic theory briefly, and we consider only aspects that are necessary for the construction of advanced fluid models that contain Landau damping. We focus on the integral e −x 2 x−x0 dx that we call the Landau integral. We discuss how this integral is expressed through the plasma dispersion function Z(ζ) and we discuss in detail the perhaps only technical (but very important) difference between defining ζ = ω/(k v th ) and ζ = ω/(|k |v th ). Only the latter choice allows one to use the original plasma dispersion function of Fried & Conte (1961), and the former choice requires that the Z(ζ) is redefined.
In Section 3, we consider a 1D electrostatic geometry. We discuss the concept of the Padé approximation to the plasma dispersion function Z(ζ) and the plasma response function R(ζ). We introduce a new classification scheme for approximants R n,n ′ (ζ) that we believe is slightly more natural than the classification scheme introduced by Martín et al. (1980) or the scheme of Hedrick & Leboeuf (1992). Nevertheless, we provide conversion relations that allow to convert one notation into the other. We verify the numerical values in Table 1 of Hedrick & Leboeuf (1992) analytically and find a typo in one coefficient of the quite important Z 3,1 (ζ) approximant previously used to construct closures. We proceed by mapping all plausible Landau fluid closures that can be constructed at the level of 4th-order moment. For a brief summary of possible closures, see (316)-(317). For the sake of clarity, all closures are provided in Fourier space as well as in real space. Writing the closures in real space emphasizes the non-locality of collisionless closures, since all closures contain the Hilbert transform, which in real space should be calculated correctly by integration along the magnetic field lines. We compare the precision of the obtained closures by calculating the dispersion relation of the ion-acoustic mode at wavelengths that are much longer than the Debye length. For some closures, an interesting property is observed in that the resulting fluid dispersion relation is analytically equivalent to the kinetic dispersion relation, once R(ζ) is replaced by the R n,n ′ (ζ) approximant, and such closures are viewed as "reliable", or physicallymeaningful. Subsequently, all unreliable closures were eliminated; see the discussion below (317). The closure with the highest power series precision is the R 5,3 (ζ) closure.
We note that electron Landau damping of the ion-acoustic mode can be correctly captured, even if the electron inertia in the electron momentum equation is neglected (the ratio m e /m p still enters the electron heat flux and the 4th-order moment r). The dispersion relation of such a fluid model is of course not analytically equivalent to the kinetic dispersion relation (after R(ζ) is replaced by the R n,n ′ (ζ)), however, such a fluid model provides great benefit for direct numerical simulations, since the electron motion does not have to be resolved. In Figure 2 we plot solutions for selected fluid models without the electron inertia. In Figure 3, the electron inertia is retained, and we replot the fluid model with the R 5,3 (ζ) closure to show that the differences are negligible. We also plot additional closures and discuss a regime where the electron temperature is much larger than the proton temperature, and where closures with higher asymptotic precision yield better accuracy. We then investigate the precision of the obtained closures by using the example of the Langmuir mode, see Figures 4 and 5. These calculations were only noted but not presented in Hunana et al. (2018).
In Section 4, we consider a 3D electromagnetic geometry in the gyrotropic limit, and map all plausible Landau fluid closures at the 4th-order moment level. In a 3D electromagnetic geometry, the most difficult part of the calculations actually consists in obtaining the perturbed distribution function f (1) , since in the laboratory reference frame that we use here, one needs to first calculate the fully kinetic integration around the unperturbed orbit. Only then, the correct gyrotropic limit (where the gyroradius and the frequency ω are small) can be obtained. The integration around the unperturbed orbit can be found in many plasma books, and can be found in the Appendix. An alternative and very illuminating derivation of f (1) is by using the guiding-center reference frame. By writing the collisionless Vlasov equation in the guiding-center limit and by prescribing from the beginning that the magnetic moment has to be conserved at the leading order, the same f (1) is obtained in a perhaps more intuitive way. The various terms in f (1) can be identified with the conservation of the magnetic moment, the electrostatic Coulomb force (which yields Landau damping) and the magnetic mirror force (which yields transit-time damping). Usually Landau damping and its magnetic analogue, transit time damping, are summarily described as Landau damping, and we note that 3D Landau fluid models contain both of these collisionless damping mechanisms.
We show that the closures for the q and r moments are the same as for the q and r moments in 1D geometry. The closure for r ⊥⊥ in the gyrotropic limit is simply r ⊥⊥ = 0. One therefore needs to consider only closures for the q ⊥ and r ⊥ moments. For a summary of the q ⊥ and r ⊥ closures, see (658)-(659). We did not compare the dispersion relation of the resulting fluid models with the fully kinetic dispersion relation in the gyrotropic limit and therefore we can not conclude which closures are "reliable". Nevertheless, by briefly considering parallel propagation along B 0 , one closure was eliminated since it produced a growing higher-order mode. There is only one static closure available for the perpendicular heat flux q ⊥ , which is constructed with the R 1 (ζ) approximant. As discussed later in the Appendix, the simple R 1 (ζ) = 1/(1 − i √ πζ) is a quite imprecise approximant of R(ζ). This has the important implication that 3D Landau fluid simulations should not be performed with static heat fluxes, and time-dependent heat flux equations have to be considered. The closure with the highest power-series precision for r ⊥ in the gyrotropic limit is constructed with R 3,0 (ζ).
In the Appendix, we provide tables of Padé approximants of R(ζ) up to the 8-pole approximation, and many solutions are provided in an analytic form. The case of 1D geometry is then pursued further, and selected closures with 5thorder and 6th-order moments are constructed. For an impatient reader, the entire text can be perhaps summarized with Figure 9, where the Landau damping of the ion-acoustic mode is plotted for dynamic closures with the highest power-series precision that can be constructed at a given fluid moment level. For the 3rd-order moment (the heat flux) it is R 4,2 (ζ), for the 4th-order moment it is R 5,3 (ζ), for the 5th-order moment it is R 6,4 (ζ), and for the 6th-order moment it is R 7,5 (ζ) (we also briefly checked that for the 7th-order moment it will be R 8,6 (ζ)). In Figure 10, we also plot solutions for the Langmuir mode with the R 7,5 (ζ) closure. Additionally, it was verified that all these closures are "reliable".
The remarkable result that the reliable 1D closures reproduce the exact kinetic dispersion relation once R(ζ) is replaced by R n,n ′ (ζ) leads us to the conjecture that there exist reliable fluid closures that can be constructed for even higher-order moments, i.e. satisfying the kinetic dispersion relation exactly, once R(ζ) is replaced by the R n,n ′ (ζ) approximant. Furthermore, for a given n-th order fluid moment, the reliable closure with the highest power-series precision is the dynamic closure constructed with R n+1,n−1 (ζ). Indeed, for higher order fluid moments one should be able to construct closures with higher order R n+1,n−1 (ζ) approximants that will converge to R(ζ) with increasing precision. Thus, one can reproduce the linear Landau damping in the fluid framework to any desired precision, which establishes the convergence of fluid and kinetic descriptions.

A BRIEF INTRODUCTION TO KINETIC THEORY
In this section we introduce some building blocks of kinetic theory starting from the simple case of wave propagation along a mean magnetic field B 0 in a homogeneous plasma. Such an approach allows us to introduce the plasma dispersion function and the hierarchy of linearized kinetic moments, preparing the ground for the next section where various hierarchy closures will be described in detail. The collisionless Vlasov equation in CGS units reads It is often illuminating to work in the cylindrical coordinate system, where the particle velocity and the gyrating (azimuthal) angle φ = arctan(v y /v x ). The reason is, that it very nicely clarifies the meaning of gyrotropy, where the distribution function and the expressions that follow, are independent of the angle φ. The velocity gradient in the cylindrical coordinate system reads where the unit vectorsv so the velocity gradient is A straightforward calculation with B 0 = (0, 0, which further implies Now we need to expand the Vlasov equation (1) around some equilibrium distribution function f 0 , i.e., the entire distribution function is separated to two parts as f = f 0 + f (1) . For the distribution function, we drop the species index r. The magnetic field is separated as B = B 0 + B (1) , where B 0 = B 0ẑ , and the electric field as E = E 0 + E (1) , but since there is no large-scale electric field in your system, the E 0 = 0. The most important principle that is usually not emphasized enough, is that the kinetic velocity v is an independent quantity, and is not linearized. The entire Vlasov equation reads or equivalently by using the r-species cyclotron frequency Ω r = q r B 0 /(m r c) The Vlasov equation is now expanded (i.e. linearized) by assuming that the "(1)" components are small, and that terms containing 2-small "(1)" quantities can be neglected. At the leading order, the situation is similar as many times before, i.e., at very low frequencies (ω ≪ Ω r ) and very long spatial scales, the term proportional to Ω r dominates and must be by itself equal to zero where in the last step we used already calculated identity (7). The obtained result implies that at the longest spatial scales, the distribution function cannot depend on the azimuthal angle φ, or in another words, the distribution function must be isotropic in the perpendicular velocity components and can depend only on v 2 x + v 2 y = v 2 ⊥ , i.e., the distribution function must be gyrotropic. The second most important principle for doing the linear kinetic hierarchy is to realize, that the hierarchy is linear, and all the quantities will have to be linearized. Additionally, we are interested only in a simplified case where the plasma is perturbed around a homogeneous equilibrium state, and we can assume that the equilibrium f 0 does not depend on time and position, so that ∂f 0 /∂t = 0 and ∇f 0 = 0. Therefore, the distribution f 0 contains only density n 0 that is not (t, x) dependent, or in another words f (x, v, t) = f 0 (v) + f (1) (x, v, t). Perhaps a different way of looking at it is that the f 0 must satisfy the leading-order Vlasov equation which at long spatial scales and low frequencies further implies gyrotropy (10) and E 0 = 0, together with ∂f 0 /∂t + v · ∇f 0 = 0. Terms that contain 2-small (1) quantities in (8) can be neglected, and by putting the f (1) contributions to the left hand side and the f 0 contributions to the right hand side yields This is the starting equation that expresses f (1) with respect to f 0 and that is used in plasma physics books to derive the kinetic dispersion relation for waves in hot magnetized plasmas. The second term on the left hand side v · ∇f (1) , introduces the simplest forms of Landau damping. The most complicated term, by-far, is the 3-rd term Ω r (v×B 0 /B 0 )·∇ v f (1) , since it introduces non-gyrotropic f (1) effects. This term introduces the complicated integration around the unperturbed orbit with associated sums over expressions containing Bessel functions, that are found in the full kinetic dispersion relations. It is this 3-rd term that makes the collisionless damping (and the kinetic theory) a very complicated process, even at the linear level. Without this 3rd term, life would much easier, and Landau fluid models would be an excellent match for a full kinetic description, at least at the linear level. The 3-rd term is obviously equal to zero if the f (1) distribution function is assumed to be strictly gyrotropic (see (7)). Or, we can just neglect the term by hand, assuming that we are at low-frequencies and that ω ≪ Ω, meaning, if we perform an "overly-strict", and a bit ad-hoc-done low-frequency limit. However, as we will see later in the 3D geometry section, it turns out that even if a strictly gyrotropic f (1) is assumed, the 3-rd term can not be just eliminated from the onset. To obtain the correct f (1) in the gyrotropic limit, the 3-rd term has to be retained, the integration around the unperturbed orbit performed, and only then the term can be eliminated in a limit. It is emphasized that sophisticated Landau fluid models of (Passot & Sulem (2007)), that we do not address here, do not neglect this 3rd term and these models do not assume the f (1) to be gyrotropic. It is exactly the deviations from gyrotropy that introduces the Bessel functions found in kinetic theory and sophisticated Landau fluid models. Now, for a moment we do not perform any calculations, and just reformulate the important equation (12). The (1)-st order fields are typically transformed to Fourier space (∼ e ik·x−iωt ), but we will postpone that for now. By defining operator that represents a rate of change along an unperturbed orbit (zero-order trajectory), the equation is rewritten as To obtain the f (1) , one therefore has to calculate the integral of the above equation, where also the integration of the r.h.s. must be naturally done along the zero-order trajectory (along the unperturbed orbit) in order to cancel the d/dt on the l.h.s. The integration is denoted with prime quantities, and the integral is performed along dt ′ . If the integral is performed from time t ′ = t 0 to t ′ = t, the integration of the l.h.s. yields f (1) (x, v, t) − f (1) (x, v, t 0 ), i.e. the result depends on the initial condition at time t 0 . To remove this dependence, the integral is performed from t 0 = −∞ and it is typically stated that in this case the initial condition f (1) (x, v, −∞) can be neglected. (This is however not that obvious and for example Stix have a rather long discussion in this regard on page 249). The distribution function f (1) (x, v, t) is therefore obtained by performing integral The calculation of this integral is cumbersome because of the required change of coordinates. We want to get the final f (1) expression and we will repeat the algebra how to obtain it, but before doing that, let's consider the simplest possible case.
2.1. The simplest case: 1D geometry, Maxwellian f 0 Let's consider a particular situation, when (for whatever reason) the 3rd term on the l.h.s of equation (12) disappears, i.e. let's briefly consider which according to (10) implies that f (1) is gyrotropic (it does not depend on the angle φ). Let's also consider the even more special case in which f 0 is isotropic. In such a case, that is a specific case of (16), the direction of B (1) does not matter at all for f 0 and naturally To quickly double-check the correctness of the above expression, for isotropic f 0 (v) the velocity gradient is given by Fourier transforming the first-order quantities and ∂ ∂t → −iω, ∇ → ik yields which allows us to obtain expression for f (1) in the form Even though not necessary, it is useful to express the (electrostatic) electric field through the scalar potential E (1) = −∇Φ, which in Fourier space reads E (1) = −ikΦ, yielding Now we want to integrate the f (1) , and obtain the linear "kinetic" moments for density, velocity (current), pressure (temperature), heat flux, and the 4-th order moment r (or the correctionr). To continue, we have to prescribe some distribution function f 0 . The 3D (isotropic) Maxwellian distribution is where the isotropic v 2 = v 2 x + v 2 y + v 2 z and α r = m r /(2T r ) = 1/v 2 thr . For simplicity, let's drop the species index r, except for the charge q r . The velocity gradient Therefore, for a Maxwellian Before continuing, let's slightly re-arrange the above expression for f (1) and add 0 = ω − ω to the numerator, otherwise we will have to do this each time, when calculating the higher order moments. The rearrangement yields For clarity, let's simplify even further and discuss the simplest possible 1D case, for a 1D Maxwellian distribution Here we consider fluctuations along the magnetic field B 0 and the wavenumber is therefore denoted as k . Note that the case is strictly 1D, and the velocity fluctuations are along the B 0 as well. For example from MHD perspective, we are therefore considering the parallel propagating ion-acoustic mode. The f (1) for a Maxwellian f 0 is expressed as (dropping all the species indices 'r' except for the charge q r ) Now we are ready to calculate the velocity integrals. Let's start with the density n (1) , by integrating By using the prescribed Maxwellian f 0 , the second integral is rewritten as We purposely wrote the integral with instead of the usual ζ, since we want to define ζ slightly differently. The integral is related to the famous plasma dispersion function Z(ζ), that is responsible for the famous Landau damping. Each plasma physics book devotes many pages to the discussion of Landau damping, that was first correctly described by Landau, by considering an initial value problem and using Laplace transforms. It was later shown by Von Kampen, that the Landau damping can be indeed obtained by using the Fourier analysis. We refer the reader for example to books by Swanson, Stix, Akhiezer, Gary, Gurnett and Bhattacharjee, Nicholson, Fitzpatrick, etc. Let's call the integral (30) the "Landau integral". Nevertheless, the very-well-known secret is, that even if one is armed with all these excellent books, the Landau damping effect can still be very confusing (even at the linear level). We did not find any secret recipe that explains the Landau damping in a simplified and different way, and the reader is referred to the thick plasma physics books. Here we want to concentrate only how to express the integral (30) through the plasma dispersion function.
Since the Landau integral can be very confusing and boring to explain, to increase the "pedagogical" value of this text, let us talk a bit more freely on the next few pages. The plasma dispersion function can be defined with a short definition In the definition of x 0 , the thermal speed v th is always a positive real number, and we do not have to worry about it. Now, considering the specific case k > 0 and Im(ω) > 0, where we indeed have Im(x 0 ) > 0, we can directly use the plasma dispersion function and the result of the Landau integral (30) is n 0 x 0 Z(x 0 ). For this case, we are done. Really ? Yes, there is nothing else we can do for this case, we calculated the Landau integral. Reeeaallyy ?? Yes, because the Landau integral can not be analytically "calculated", the integral can not be expressed through elementary functions, unless the Z(ζ) function is somehow simplified, for example by expansion for cases |ζ| ≪ 1 or |ζ| ≫ 1, or by considering the weak damping limit when Im(x 0 ) is small (see plasma physics books). We are not interested in these limits and the Z(ζ) function has to be calculated numerically or looked up in the table. We are really done here ! 1 So why is the Landau integral so confusing for the other cases ? It is exactly because of that -that basically nothing gets "really calculated".
2.2. The dreadful Landau integral e −x 2 x−x0 dx There are many reasons why the "Landau integral" (30) can be so confusing. The first reason is, 1) that the integral (30) can not be expressed by using only elementary functions. If we did not arrive at this integral in the middle of a thick plasma physics book, but instead, encounter it during our undergraduate studies of complex analysis, we would perhaps not have such a respect to this integral, and immediately attempted to calculate it, by using the residue theorem. The integral appears to be so simple. Instead of calculating ∞ −∞ , we would calculate a different integral over a closed contour in complex plane C . That integral can be calculated by using the residue theorem, that states that C = 2πi Res, if the big path that encircles all the poles is counter-clockwise. 2 An equivalent statement is that the integral is equal to C = −2πi Res, if the big path that encircles all the poles is clockwise. 3 In our case, there is always just one pole, at x = x 0 , and the residue of e −x 2 x−x0 evaluated at x = x 0 is actually very simple, it is always regardless of the value of x 0 , since for a general function f (x), the residue Res x=x0 f (x) x−x0 = f (x 0 ). However, to make the result C useful for the calculation of our integral on the real axis ∞ −∞ , we need to separate the closed contour integral to C = ∞ −∞ + arc , where the arc represents the big half-circle at infinitely large radius. To preserve the direction of integration along the real axis ∞ −∞ , if the pole is in the upper complex plane, i.e. if Im(x 0 ) > 0, we need to close the big arc contour in the upper half of complex plane counter-clockwise. Similarly, if the pole is in the lower complex plane, i.e. if Im(x 0 ) < 0, we need to close the big arc contour clockwise. These are just very basic results taught in basic complex analysis classes, that yields 2 useful versions of the residue theorem that people remember: one for the residues in the upper half of complex plane C = 2πi Res, and one for the residues in the lower half of the complex plane C = −2πi Res. These 2 versions are just for mnemotechnical help that preserves the direction of integration on the real axis, and any direction of integration can of course be chosen in either plane, if the correct signs are observed. For completeness, we note that there is also another version of the residue theorem, that can be sometimes very useful in complex analysis if a large number of poles is encountered, and that states "The sum of all residues, including the one at infinity, is zero". This residue theorem is not addressed here.
We therefore easily managed to do the closed integral C , however, in contrast to typical examples presented in basic complex analysis classes, the arc integral arc does not disappear. The problem is, that the function f (x) = e −z 2 is a very strongly decaying function on the Real axis (for z = x → ±∞), however, this is not true at all in the complex plane. Considering the purely Imaginary axis z = ±iy, the function e −z 2 = e +y 2 is a very strongly diverging function as y increases, and the arc integral arc cannot be neglected ! This is a very sad news, since now we clearly see, that with arc = 0, we will not be able to use the complex analysis to actually "calculate" the Landau integral (30).
We note that the well-known Gaussian integral I = ∞ −∞ e −x 2 dx = √ π, is typically calculated in the Real plane by means of a trick which consists in evaluating I 2 in polar coordinates, ∞ 0 e −r 2 rdr. The Gaussian integral can still be calculated in Complex plane by using the residue theorem, even though quite sophisticated tricks are required. 4 1 In old times, a good barber would loudly shout: The next in line for shaving! 2 As noted in the footnote of Appendix A of the book by Swanson, page 363, the rumor has it that the famous Cauchy's residue theorem, is actually due to Cauchy's dog, that usually went around leaving residues at every existing pole.
3 The direction of integration really does matter and produces a minus sign, similarly as the direction of integration of real functions matter, and a −a = − −a a . This fact can be further clarified by C = ∞ −∞ + arc and considering many functions for which the arc-integral vanishes, so C = ∞ −∞ , and the direction of integration does matter, at least when the pole is "in sight". 4 For example, by considering e iπz 2 / sin(πz)dz, calculated along lines with 45 • angle with the real axis, that encircle the pole at z = 0. and where the residue Res z=0 = 1/π. The second reason why Landau damping is confusing is, 2) the necessity of analytic continuation. The third reason is very closely related to the second and it is 3) The analytic continuation has to be done differently for k > 0 and for k < 0. The big result of Landau (1946) can be summarized as follows: if k > 0, the integration has to be done that the path always passes below the pole x = x 0 . Therefore, starting with the basic case in the upper complex plane Im(x 0 ) > 0, nothing has to be done and the integration is just along the real axis. Now, if the pole is moved to the real axis, so Im(x 0 ) = 0, one needs to go around that pole with a tiny half-circle from below. This creates a contribution of 1/2 times 2πi times the residue at that pole, so the contribution is πie −x 2 0 . If the pole x 0 is moved further down to the lower complex plane, a full circle around the pole is required to enclose it from below, which yields a contribution of 2πie −x 2 0 . The situation is demonstrated in the left panel of Figure 1. The integral (30) for k > 0 is therefore "calculated" as For the Cauchy principal value, we prefer the original French pronunciation "Valuer Principal", abbreviated as V.P. The above result is completely consistent with the definition of the plasma dispersion function, since the plasma dispersion function was developed exactly to describe this integral. One starts with the definition in the upper complex plane (32), and analytically continues this function to a lower complex plane, according to To save space in scientific papers and plasma physics books, the definition of Z(ζ) is often abbreviated as (32), i.e. only as a first line of (35), with a powerful statement that for Im(ζ) < 0 the function is analytically continued. That statement indeed completely defines Z(ζ), since the powerful complex analysis tells us that an analytic continuations of a function, if it exists, is unique. Another abbreviated definition is by essentially writing down only the second (middle) line of (35). This is the most useful 1-line abbreviation because one can immediately recognize, how the sign(k ) was treated (as we will see soon). However, such a definition of Z(ζ), with only specifying it for Im(ζ) = 0, would not be a complete definition of that function, and no powerful statement how the function is extended above & below from the x-axis is available. So plasma physicists found a very smart workaround, how not to write the Im(ζ) = 0 restriction in the second line of (35) and how to completely define the Z(ζ) with this 1-line statement. Let's still consider the case k > 0, where our x 0 and ζ are equivalent. It is often stated (e.g. Stix,bottom of page 190), that "the principal value of an integral through an isolated singular point may be considered the average of the two integrals that pass just above and just below the point". For example, for a specific situation when x 0 lies on the real x-axis, integrating along horizontal line below the x-axis yields the first line of (35), and integration along horizontal line above the x-axis yields the third line of (35) since when the pole is encountered we have to pass it from below. An average of the first line and third line of (35) yields the second line. The idea can now be generalized to an entire complex plane, for all values of Im(x 0 ), where two integrals are done. One integral along horizontal line that passes below the x 0 point (where nothing has to be done) and one integral along horizontal line that passes above the x 0 point (and where a deformation that passes below the point has to be performed, accounting for the full residuum). Average of these two integrals yields an abbreviated Z(ζ) definition for all values of ζ in the form where the integration is said to go through the pole. Of course, no integration can be really done "through" a singular point, and what the wording means is that the integration is done along the horizontal axis that goes through x 0 , i.e. the integration is along horizontal axis Im(x 0 ). It is possible to look at it from another (perhaps more illuminating) perspective. Consider the situation in which x 0 is somewhere in the upper half of the complex plane. One can perform the integral along the real axis, so that the first line of (35) applies. Let's call this result c 1 . Alternatively, one can perform the integral along the horizontal line that passes through Im(x 0 ) (with the required tiny half-circle passing below x 0 ), and (36) applies. Let's call this result c 2 . This two different integrals must be equal. Why? Because one can plot two vertical lines (passing through Re(x 0 ) = ±∞) that together with the two horizontal integration lines, enclose an area that does not contain any pole, and integration around all four lines (in a circular direction, let's say counter-clockwise) must yield zero. The two integrals along the vertical lines cancel each other, yielding that c 1 − c 2 = 0, the minus sign in front of c 2 appears since the integration along c 2 was now done in the opposite direction. Even though perhaps a bit confusing when seen at first, the definition (36) is very useful, and when encountered, it should be just interpreted as an abbreviated definition of (35).
Unfortunately, the plasma dispersion function was obviously developed only with the case k > 0 in mind. The Landau result requires that for k < 0, the path of integration always encircles the pole from above, see the right panel of Figure 1. For k < 0, the Landau integral is defined as The two different cases for k > 0 and k < 0 can be easily combined together by using the sign of the wavenumber k function, that is equal to +1 for k > 0, and equal to −1 for k < 0. However, one needs to forget the sign of Im(x 0 ), and arrange the results only with respect to the sign of Im(ω). The Landau integral with Obviously, it is the sign of Im(ω), and not the sign of Im(x 0 ), that is the "natural language" of the Landau integral. However, the connection to the plasma dispersion function Z(ζ) is unnecessarily difficult. Sometimes, the definition of the plasma dispersion function is then altered so that the above expression is satisfied. Stix for example uses in addition to the usual Z(ζ), also a different function Z 0 (ζ) that can be defined with respect to the sign of Im(ω) instead of the sign of Im(ζ), where as noted on page 202, Z 0 (ζ) = Z(ζ) for k > 0, and, Again, the function Z 0 (ζ) can be defined in an abbreviated form as the first line of (39), with analytic continuation for Im(ω) ≤ 0 (Stix,page 206,eq. 91). The second possible abbreviated definition of Z 0 (ζ), valid for all values of Im(ω), is the trick with the principal value (Stix,page 206,eq. 92) where the integration is done along the horizontal line Im(ω), i.e. the integration path goes "through" the pole.
With the use of this new function Z 0 (ζ) of Stix, we can therefore express the dreadful Landau integral for all values of k as However, we do not like this formulation with Z 0 . Here we insist on using the original plasma dispersion function Z. In our opinion, the most elegant solution, is the one that is used for example in the book by Peter Gary and in some Landau fluid papers, and that is to use |k | in the definition of ζ, by defining This amazingly convenient definition simplifies the expressions and represents "natural language" of the plasma dispersion function. We note that |k | = sign(k )k , and also k = sign(k )|k |. With the new definition of ζ, for k > 0 obviously nothing is changed since |k | = k . However, for k < 0, where in the last step we used −1 = sign(k ). By examining the first and the last expression, the result is also obviously valid for k > 0, and therefore for all k . Or alternatively, (perhaps more confusingly, but keeping an exact track of the sign(k )), for all values of k which is the same result as the one obtained above. The definition of ζ (42) therefore yields This result allows us to use the original plasma dispersion function definition (35), and express the dreadful Landau integral for all k simply as Now we calculated the Landau integral to our satisfaction, and we can continue with the calculation of the linear kinetic hierarchy. Wait. We had basically the same result several pages back ! For the case k > 0 and Im(ω) > 0. The Landau integral was just expressed through the plasma dispersion function, basically the same result as is done now, there is just one sign(k ) in front of the integral and one in the definition of ζ. Are you suggesting, that all these calculations, contour drawings and discussions, we did all of these things just to get a sign right ? Affirmative. The Landau integral is all about chasing minus signs, but to get the correct signs is very important. This is exactly the reason why the Landau damping is so confusing, and why it needed the genius of Landau to correctly figure it out. Nevertheless, that the Landau damping (Landau 1946) is indeed very confusing can be understood from the fact, that the effect was questioned for almost 20 years before it was experimentally verified by Malmberg & Wharton (1966). To conclude, and to summarize the differences between plasma physics books of Stix and Peter Gary, we have two equivalent recipes to "calculate" the Landau integral, that can be written as where the A.C. stands for analytic continuation. It is important to emphasize that some plasma books, as for example by Gurnett and Bhattacharjee, take a different approach and call the function Z 0 (ζ) simply as Z(ζ), as is obvious from their expressions for Z(ζ) (pages 347-348) that contain the sign(k ). Of course, this approach is fully kosher, however, one needs to be extra careful when adopting a numerical routine for the plasma dispersion function. The second choice in (47) appears inconvenient, however, it is not, since the expression (30) contains The book by Peter Gary, and many Landau fluid papers prefer the second choice, since this small trick with redefining ζ allows the use of the original plasma dispersion function Z(ζ), that was tabulated by Fried & Conte (1961). 5 We prefer it too, and therefore, the integral that we will use frequently in the kinetic hierarchy is and obviously x 0 = sign(k )ζ. Now we are able to finish the calculation of the density n (1) , eq. (29), that yields The result can also be expressed by using the derivative Z ′ (ζ) = −2(1 + ζZ(ζ)). The quantity 1 + ζZ(ζ) appears very frequently in kinetic calculations with Maxwellian distribution and it is called the plasma response function For a different (general) distribution function f 0 , the plasma response function R(ζ) can be defined according to what is obtained after calculating the density n (1) = − qrn0 T (0) ΦR(ζ). The name is very appropriate, since the R(ζ) describes, how plasma with some distribution function "responds" to an applied electric field (or a scalar potential).

Short afterthoughts, after the Landau integral
Why some "analytic continuation" has to be done ? Even though we did not manage to express the Landau integral (30) through elementary functions, the integral appears to be well-defined in both upper and lower halves of the complex planes, regardless where the x 0 is. And it indeed is. So why the analytic continuation ? The very-deep reason why the analytic continuation is necessary, is that the integral is not continuous when crossing the real axis Im(x 0 ) = 0 in the complex plane. 6 If a function is not continuous, it is not analytic (a fancy well-defined language that says that derivatives to all orders are not defined, basically meaning that it matters from what direction that point is approached in the complex plane, very similarly to a derivative of function |x| on real axis). And if a function is analytic in some area, and not analytic outside of that area, we can sometimes push/extend the area of where the function is analytic, to/through the area where the function is not analytic, therefore the term "analytic continuation".
Why is the analytic continuation so important, why is it a big problem that the integral is not continuous when crossing the real axis ? Because it directly relates to the causality principle, that is, if something happens, then the response to this incident must come after, and not before, the time in which that incident happened. This can be perhaps more intuitively addressed by performing the Laplace transforms in time (instead of the Fourier transforms), and considering an initial value problem, as was done by Landau. For more information, see plasma physics books, for example Stix, Chapter 3 on causality etc.

Easy Landau integrals
x−x0 dx We want to calculate moments in velocity space all the way up to the 4th-order moment r, and (including the 3D geometry) we will need integrals only up to n = 5. To this aim, we will use frequently eq. (49), where we find convenient to use x 0 and ζ instead of chasing the sign(k ), and in the end we will just use the definition x 0 = sign(k )ζ. We already saw that the 0-th order moment was Let us now calculate the higher order moments. Since we talked so much on the last pages, we will remain silent for a moment and we will just enjoy the calculation: the book, which could be confusing, is the exclusion of the √ 2 in the definition of the thermal speed v th . However, some Landau fluid papers (Hammett & Perkins (1990); Snyder et al. (1997)) use the same definition without the √ 2.
That was easy ! If we ever need a higher order, we will just blindly calculate and we do not worry right now if this general case can be expressed in some smarter way. Now we know how to calculate the kinetic Landau integrals, so let's use this knowledge, to calculate the first few integrals of the linear "kinetic hierarchy". With the previous integrals already calculated, the calculation of the linear kinetic hierarchy is an easy process. However, it is important to emphasize, that the hierarchy is linear, and must be calculated as such. Again, as emphasized before, the kinetic velocity v is an independent quantity, and is not linearized. The total density n = f dv, and at the first order of course n 0 = f 0 dv. The expansion n 0 + n (1) = (f 0 + f (1) )dv implies n (1) = f (1) dv. The density n (1) , already calculated in (50), was therefore calculated correctly, and using the plasma response function The velocity moment is nu = vf dv and at the first order n 0 u 0 = vf 0 dv. In our specific case, because we do not consider any drifts in the distribution function, u 0 = 0. Expanding (n 0 + n (1) )(u 0 + u (1) ) = v(f 0 + f (1) )dv and neglecting the nonlinear quantity n (1) u (1) , yields n 0 u (1) = vf (1) dv. The velocity moment calculates and canceling n 0 and using 1/ The definition of the scalar pressure is p = m (v − u) 2 f dv and at the first order p 0 = m v 2 f 0 dv, because again u 0 = 0. The quantity (v − u) 2 = v 2 − 2vu + u 2 is linearized as v 2 − 2vu (1) , and expanding p 0 + p (1) = m (v 2 − 2vu (1) )(f 0 + f (1) )dv, further linearizing by neglecting u (1) f (1) , and using u 0 = 0 yields p (1) = m v 2 f (1) dv. The pressure calculates and dividing by p 0 and using p 0 = n 0 T (0) to calculate mn 0 /(p 0 α) = 2, the pressure moment reads We will also need the temperature T (1) . The general temperature is defined T = p/n, i.e. the definition is nonlinear. The process of linearization is essentially like doing a derivative and dividing by T (0) = p 0 /n 0 yields T (1) If one does not like the "derivative", the same result is obtained by writing p = T n instead, and linearizing (p 0 +p (1) ) = (T (0) + T (1) )(n 0 + n (1) ). Which after subtracting p 0 = T (0) n 0 , neglecting T (1) n (1) , yields p (1) = T (1) n 0 + T (0) n (1) , which after dividing by p 0 yields (64). The temperature is therefore easily calculated as The scalar heat flux is defined as q = m (v − u) 3 f dv and at the first order q 0 = m v 3 f 0 dv, and for our case , yields one contribution that is very easy to overlook, and that is of the same order as the expected m v 3 f (1) dv, and that is proportional to m v 2 f 0 dv = p 0 . Therefore, the linearized heat flux q (1) must be correctly calculated according to The first term calculates where we used α −3/2 = (2T (0) /m) 2T (0) /m. And the entire heat flux (66) then reads The scalar 4th order moment is defined as r = m (v − u) 4 f dv and at the first order of course r 0 = m v 4 f 0 dv, since again u 0 = 0. Also, r 0 = 3p 2 0 /ρ 0 , where ρ 0 = mn 0 . The quantity (v − u) 4 is linearized as v 4 − 4v 3 u (1) . Expanding The 4th order moment calculates where we have used 1/α 2 = 4T (0)2 /m 2 , and mn 0 /α 2 = 4p 2 0 /ρ 0 .
The entire nonlinear r is decomposed as r = 3p 2 /ρ + r. The first term can be linearized in a number of ways, and of course, all techniques must yield the same result, since linearization must be unique. For example by using the derivative the term is easily linearized as p 2 ρ lin.

Exploring possibilities of a closure
Let's summarize the obtained linear hierarchy so that we can directly see the similarities. Let's also for a moment introduce back the species index r, so that we are completely clear with an emphasis that the charge q r should not be confused with the heat flux q (1) r . The ζ r = ω/(|k |v thr ) and the thermal speed v thr = 2T To better understand what is meant by "a closure", let's first examine what is not a closure. Let's examine the density n (1) equation. Since in this specific example we used the electrostatic electric field E (1) = −∇φ, the only Maxwell equation left is the ∇ · E (1) = 4π r q r n r , where q r is the charge and n r is the total density. Linearization of this equation, and using the natural charge neutrality that must be satisfied at the 0-th order r q r n 0r = 0, yields ∇ · E (1) = 4π r q r n (1) r , or written with the scalar potential −∇ 2 Φ = 4π r q r n (1) r , and transformed to Fourier space k 2 Φ = 4π r q r n (1) r . We consider 1D propagation parallel to B 0 with wavenumber k , and to be consistent, we therefore continue with k and which can be rewritten as Even though the system is now "closed", the eq. (85) does not represent a fluid closure, and should be viewed only as a kinetic "dispersion relation". To have a non-trivial solution for the potential Φ, the expression inside of the bracket must be equal to zero. By declaring that k = 0 (the case k = 0 is trivial since we need some wavenumber), we can divide by k 2 . By using the definition of the Debye length of r-species λ Dr = 1/k Dr , where k 2 Dr = 4πn 0r q 2 r /T r , one obtains a dispersion relation 7 If one replaces here k → k, the expression is actually equivalent to a multi-species dispersion relation, usually found in plasma physics books under the electrostatic waves in hot unmagnetized 8 plasmas, with Maxwellian f 0r . See for example Gurnett & Bhattachrjee,page 353,eq. (9.4.18). We are not interested here in studying unmagnetized plasmas, and instead, we will just remember (86) as the dispersion relation of the parallel propagating (to B 0 ) electrostatic mode in magnetized plasma, since this mode indeed does not contain any magnetic field fluctuations.
Let's consider only the proton and electron species, r = p, e, so that For a general case, the dispersion relation has to be solved numerically, and again, can not be much simplified, unless one wants to consider long wavelength limit k λ De ≪ 1, where only the expression inside of the big brackets can be used. The solution contains the usual Langmuir waves, that are obtained by neglecting the ion term (by making the ions immobile) and by expanding the R(ζ e ) in the limit |ζ e | ≫ 1, i.e. in the limit when the wave phase speed ω/k is much larger than the electron thermal speed v the . Langmuir waves propagate with speeds that are higher than the electron plasma frequency ω pe = 4πn e0 e 2 /m e , which for us are extremely high frequencies. The solution also contains the "ion-acoustic mode", which in plasma books is obtained in the limit |ζ p | ≫ 1 and |ζ e | ≪ 1, i.e. in the limit where the wave phase speed is much larger than the proton thermal speed, ω/k ≫ v thp , but also where the phase speed is much smaller than the electron thermal speed, ω/k ≪ v the , for the result see for example Gurnett and Bhattacharjee,page 356,.
So what about the limit |ζ p | ≪ 1, when the phase speed is much smaller than the proton thermal speed ω/k ≪ v thp ? The ion-acoustic mode does not exist in this limit ? Well, that is not a very well formulated question, because one has to consider under what circumstances such a limit would be appropriate. We have seen that even in the classical long wavelength fluid limit, in the CGL description or in the MHD description, the phase speeds do not become smaller and smaller, the phase speeds ω/k are just non-dispersive and constant. In the CGL description (with cold electrons), the parallel propagating ion-acoustic mode has a phase speed ω/k = ±C , where the parallel sound speed C 2 = 3p p /m p , which is of course never true. By considering the CGL parallel sound speed (which does not include the Landau damping), we can estimate the lowest possible value of |ζ p | to be roughly in the neighborhood of |ζ p | min ≈ C CGL /v th p = 3/2, or in another words |ζ p | min ≈ 1. Unfortunately, there is no expansion of the Z(ζ) for |ζ| ≈ 1 and the result has to be found only numerically.
So what constitutes a closure ? We will use the following definition: Express the last retained moment through lower-order moments in such a way, that the kinetic R(ζ) function is eliminated, and that the closure is valid for all ζ values.
3.2.1. Preliminary closures for |ζ| ≪ 1 As explained above, the limit |ζ| ≪ 1 is actually a bit unphysical for the proton species, and is physically plausible only for the electron species. Nevertheless, briefly exploring the linear kinetic hierarchy in this limit allows us to explore what kind of closures might be possible. In this limit, the plasma dispersion function can be expanded as and the plasma response function as and where for small ζ, the e −ζ 2 is naturally expanded as yielding |ζ| ≪ 1 : For our purposes it is sufficient to keep the series only up to ζ 3 , i.e. to work with the precision o(ζ 3 ). The expressions entering the kinetic hierarchy in equations (77)- (83) are An interesting observation is that for small ζ, moments n (1) , p (1) and r (1) are finite, and moments u (1) , T (1) , q (1) and r (1) are proportional to ζ and therefore small. We want to make a simple closure for the heat flux q (1) or the 4th order correction r (1) , and thus, let's concentrate on the moments that are small. To clarify how the closure is performed, let's write them down only up to the precision o(ζ 2 r ), so If we further restrict ourselves to only precision o(ζ r ) and neglect the ζ 2 r terms, we can find an amazing result that we can express the heat flux q The above result is of upmost importance, because it emphasizes the major difference between collisionless and collisional systems. At this point, the result is derived only with the assumption |ζ| ≪ 1, even though we will see later that the result is not restricted to this limit, and the result has a much wider applicability. The result is the famous expression for collisionless heat flux, that here reads q ∼ −isign(k )T , which is in strong contrast to the usual collisional heat flux q ∼ −∇ T that in Fourier space reads q ∼ −ik T . We will come to this expression later.
With the precision o(ζ r ), other obvious possibilities are to express q (1) r with respect to velocity u (1) r , or to express However, if we did so much work that we consider the 4th order moment, it would be a shame not to increase the precision to o(ζ 2 r ). Obviously, we need to use a combination of at least 2 different lower order moments. For example, by trying The proportionality constants α q , α T are easily obtained by separation to two equations for ζ r and ζ 2 r that must be Playing with the algebra little bit (for example p 0r /m r = T (0) r n 0r /m r = v 2 thr n 0r /2), the two equations can be solved easily for the unknown quantities α q , α T , and the final result is There are naturally other possibilities and with the precision o(ζ 2 r ), one can search for closures o(ζ 2 r ) : where the first choice yields a closure and the other two choices yield For completeness, one can easily find a closure for r (1) r with precision o(ζ 3 r ) (after updating (101)-(104) to precision o(ζ 3 r )) by searching for a solution o(ζ 3 r ) : and the solution reads We purposely kept the species index r in the calculations, to clearly show that the closures are performed for each species separately, and no Maxwell equations or other physical principles are used. The equations would be perhaps easier to read without the index r.

Exploring the case |ζ| ≫ 1
For large value of |ζ|, we need to use an asymptotic expansion of the plasma dispersion function that reads The term with σ comes directly from the definition of Z(ζ) and there is not much one can further do about it, since there is no further asymptotic expansion for exp(−ζ 2 ) when ζ is large. The term is zero in the upper half of complex plane (σ = 0). When very close to the real axis, i.e. when σ = 1, the term mainly contributes to the imaginary part of Z(ζ) (even though only very weakly) and for the real part of Z(ζ), it's contribution can be neglected. However, when deeply down in the lower half of complex plane, the term can become very large (for example if ζ = −iy, exp(−ζ 2 ) = exp(y 2 ) and if y is large the term obviously explodes). Deeply down in the lower complex plane the term is a real trouble, and even some kinetic solvers such as WHAMP (Rönnmark 1982) have trouble with calculations when the damping is too large.
We will see shortly, that for our purposes the term can be completely neglected, but let's keep it for a moment. The expansion of the Maxwellian plasma response function therefore reads Let's calculate the kinetic hierarchy, at least up to 1/ζ 4 . After a short inspection, one immediately sees that the hierarchy calculates a bit differently than in the previous case, and to get the 4th order moments with the precision o(1/ζ 4 ), it is important to keep all the terms up to ∼ 1/ζ 8 in the R(ζ) expression, since the 4th order moments contain ζ 4 R(ζ) terms. The expressions entering the kinetic hierarchy dully calculate where for brevity we suppressed the proportionality constants, including the sign(k ). Interestingly, the velocity u (1) decreases the slowest, only as 1/ζ. The n (1) , p (1) , r (1) and also the temperature T (1) , decrease as 1/ζ 2 . The heat flux q (1) decreases as 1/ζ 3 and the cumulant r (1) decreases the fastest, as 1/ζ 4 . This is not good news, since it is obvious that the direct closures that were easily obtained for the small ζ case, can not be easily done here.
To understand how the terms contribute to the real frequency and damping, it is useful to separate ζ = x + iy and calculate expressions with y being small, i.e. the weak growth rate (actually weak damping) approximation. The exponential term entering (124) can be approximated as and the fractions of ζ are approximately etc. For large x, the exponential term (132) is strongly suppressed as e −x 2 (with oscillations e i2xy ). Additionally, the real part of (132) is proportional to y, which is also small, and its contribution to the real part of R(ζ) can be therefore completely neglected. The imaginary part of the exponential term (132) has to be kept, if one wants to match the approximate kinetic dispersion relations from plasma books (usually calculated in the weak growth rate/damping approximation), for example for the damping of the Langmuir mode or the ion-acoustic mode. However, even smart plasma physics books have trouble to analytically reproduce the full kinetic dispersion relations that have to be solved numerically, see for example figures in Gurnett & Bhattacharjee on pages 341 & 355, that compare the analytic and full solutions for the Langmuir mode and the ion-acoustic mode. The trouble is that the damping can become large, and the entire approach with the weak damping invalid. If kinetic plasma books have trouble to analytically reproduce the damping with full accuracy under these conditions, we would be naive to think that we can do better with a fluid model and we know we cannot be analytically exact for |ζ| ≫ 1 if the damping is too large. If the damping is way-too large, and the imaginary frequency starts to be comparable to real frequency, the mode will be damped away very quickly.
In fact, even the well known kinetic solver WHAMP, neglects this term in calculation of Z(ζ) for large ζ values, as can be verified in the WHAMP full manual (Rönnmark 1982) from the asymptotic expansion of Z(ζ), eq. III-6 on page 10, and the discussion of numerical errors on page 13. The WHAMP solver uses an 8-pole Padé approximant of Z(ζ), which is a very precise approximant, and imprecision starts to show up only if the damping become too large. For example in the very damped regime when the Im(ζ) = −Re(ζ)/2, the error in real and imaginary values of Z(ζ) is still less than 2-3 %, where the calculation should be stopped (in less damped regime, the precision is much higher).
If a full kinetic solver can neglect the exponential term for large ζ values, we can surely neglect it as well. It should be emphasized that the term is neglected only for large ζ values (i.e. in the asymptotic expansion), the exponential term is otherwise fully retained and enters the Padé approximation through the power series expansion for small ζ. To summarize, the "ideal" large ζ asymptotic behavior that we would like to obtain reads (141)

A brief introduction to Padé approximants
Padé approximants, i.e. Padé series approximation/expansion, is a very powerful mathematical technique, comparable to the usual Taylor series and the Laurent series. Nevertheless, for some unknown reason, Padé series seems to somehow disappear from the modern educational system that a typical physicist encounter. The lack of Padé series in classes is even more surprising, if one realizes that the technique is in fact very simple, and anybody can fully grasp it in very short time. We therefore make a quick introduction to the technique here.
Padé series consist of approximating a function as a ratio of two polynomials. If a power series (e.g. Taylor series) of a function f(x) is known around some point with coefficients c n , the goal is to express it as a ratio of two polynomials The choice of b 0 = 1 is an ad-hoc choice and the entire decomposition can be done without it, leading to the same results at the end. Multiplying the left hand side by the denominator 1 + b 1 x + b 2 x 2 + · · · , and grouping the x n contributions together, that must be satisfied independently, leads to the system of equations etc. The necessary condition for the system being solvable, is that the number of variables is equivalent to the number of equations. Therefore, if we want to approximate function f(x) with a ratio of two polynomials P m /Q n , of degrees m and n, we will need the Taylor series on the left hand side of (144) up to the order m + n. The Padé approximation is sometimes denoted as R m,n or using a function f (x) that is being approximated as f (x) m,n or [f (x)] m,n . If the Padé approximation exists, it is unique. For example, the function e x has a Taylor series around the point x = 0 Let's say we want to approximate e x as a ratio of two polynomials of 0-th and 1-st order e x ≈ a 0 /(1 + b 1 x), i.e. we want to find the Padé approximant [e x ] 0,1 . Respecting the n + m rule, the approximation therefore consist of equating that leads to the system of equations a 0 = c 0 = 1; yielding the Padé approximation To feel confident with the Padé approximations, let's find another approximant of e x , for example [e x ] 1,2 . The system is written as and yields a system of equations a 0 = 1; which have a solution b 1 = −2/3, b 2 = 1/6, a 1 = 1/3, and the Padé approximant It is just a straightforward algebraic exercise to find other Padé approximations, for example etc. Similarly, it is easy to find Padé approximations to a function e −x , and for example (obviously) The approximations were derived from Taylor expansion of e −x around x = 0, and all 3 choices naturally have the correct limit lim x→0 e −x = 1. However, we can see that by choosing the degree of the Padé approximation, we can also control what the function is doing for large values of x. For example, for large values of x the Padé approximations (154) go to 0, −1 and +∞. Obviously, the smart choice is [e −x ] 1,2 which approximately reproduces the behavior of e −x also for large x. The usefulness of Padé approximation becomes especially apparent when considering analytically difficult functions, for example the e −x 2 , where the "smart" lowest Padé approximants are Therefore, depending on the required precision of a physical problem, instead of working with e −x 2 (which for example does not have an indefinite integral that can be expressed in elementary functions), one can approximate the function e −x 2 for all x, as 1/(1 + x 2 ), that is much easier to work with. Curiously, the reader might recognize that the 1/(1 + x 2 ) is the Cauchy distribution function, often used in plasma physics books to get better understanding of the complicated Landau damping. The Cauchy distribution therefore can be thought of as the simplest Padé approximation of the Maxwellian distribution. Now we are ready to use the Padé approximation for the plasma dispersion function Z(ζ) or the plasma response function R(ζ). We do not have to explore all the possibilities, and we can immediately pick up only the smart choices. For large ζ (by neglecting the exponential term as discussed in the previous section), at the first order Z(ζ) ∼ 1/ζ and R(ζ) ∼ 1/ζ 2 , and both functions approach zero as ζ increases. Obviously, a smart choice worth exploring will always be a Padé approximant [ ] m,n where n > m. In fact, we can be even more specific. We know the asymptotic behavior for large ζ, and obviously, even smarter choice is to concentrate only on approximants [Z(ζ)] n−1,n and [R(ζ)] n−2,n , since such a choice will naturally lead to the correct asymptotic behavior ζ ≫ 1 : Any other choice is not really interesting and therefore, the usual 2-digit notation of the Padé approximation becomes redundant. We can just use 1-digit notation with "n", that represents the degree of a chosen polynomial in the denominator, and we can omit writing the (n − 1) and (n − 2), since this will always be the case (except for the R 1 (ζ)).
Note that one can directly work with Padé approximants for both Z n (ζ) and R n (ζ), and that in general according to definitions (157), the approximants are not automatically equivalent. The difference is as if one does approximations to a function f (x) or its derivative f ′ (x). Usually in papers, the approximant Z n (ζ) is calculated, and R n (ζ) is just defined according to R n (ζ) = 1 + ζZ n (ζ). One can choose another (and better approach in our opinion) and to calculate directly approximants R n (ζ), and if really required (which should not be the case), obtain Z n (ζ) approximants as Z n (ζ) = (R n (ζ) − 1)/ζ. Moreover, we can do even better than (157). We shall not be satisfied just by approximating the asymptotic trend ∼ 1/ζ for ζ ≫ 1, and hope for the best. For large ζ, the correct asymptotic expansions are Z(ζ) → −1/ζ and R(ζ) → −1/(2ζ 2 ). By prescribing a n−1 /b n = −1 for Z(ζ), and a n−2 /b n = −1/2 for R(ζ), we will obtain correct asymptotic behavior of these functions, at least at the first order. By doing this, we are not "destroying" the Padé approximation, since it is easy to argue that if an n-pole approximation is determined to be sufficient for small ζ values, we can just add one more pole and use that one to control the asymptotic behavior for large ζ values. Of course, we will always use at least the first term in the expansion for ζ ≪ 1, that yields a 0 = i √ π for Z n (ζ) and a 0 = 1 for R n (ζ), otherwise the functions will have incorrect values at ζ = 0. The "smart" choices worth considering therefore can be summarized as and have a property to correctly match the Z and R functions at ζ = 0 and, have the correct first order asymptotic expansion at ζ ≫ 1. The 1-pole approximant R 1 (ζ) is an exception, and can be defined only as R 1 (ζ) = 1/(1 + b 1 ζ). This function obviously cannot have correct asymptotic expansion ∼ 1/ζ 2 and the only possibility is to use ζ The 1-pole approximant Z 1 (ζ) can be obtained directly from the definition (158), that yields and that has correct asymptotic behavior Z 1 (ζ) → −1/ζ for large ζ values, even though it has only precision o(ζ 0 ) for small ζ values. Perhaps curiously, in this case R 1 (ζ) = 1 + ζZ 1 (ζ) exactly. Alternatively, if the precision for small ζ is more important than the exact asymptotic expansion for large ζ, it is possible to increase the Z 1 precision to o(ζ 1 ) and write Z 1 (ζ) = i √ π/(1 − 2iζ/ √ π). In this case R 1 (ζ) = 1 + ζZ 1 (ζ) exactly, and the functions are equal only for small ζ and only with precision o(ζ 1 ).
Right now, in the definition (158), we just used 1 pole for the asymptotic series of Z(ζ) and R(ζ), but one can naturally use more poles. By opening the possibility to increase the number of matching asymptotic points in the n-pole Padé approximation (158), the number of possible approximants for a given n naturally increases. To keep track of all the possibilities, we obviously need some kind of classification scheme. It is useful to modify the usual 1-index Padé series notation for Z n (ζ) and R n (ζ) functions (that only specify the number of poles), to a two index notation Z n,n ′ (ζ), R n,n ′ (ζ). Now we have a wide range of possibilities how to define n, n ′ and there is no clear "natural" winner.
There are two different existing notations (likely more), introduced by Martín et al. (1980) and by Hedrick & Leboeuf (1992), that consider Z n,n ′ (ζ) Padé approximants. The first reference defines n = number of points (equations) used in the power series expansion, and n ′ = number of points (equations) used in the asymptotic series expansion. Even though perhaps clear, for example 3-pole approximants in this notation are expressed as Z 5,1 , Z 4,2 , Z 3,3 etc, so to get the number of poles (which is the most important information), one has to calculate (n + n ′ )/2. When using a lot of different approximants, this notation is a bit confusing and is rarely used.
The notation of Hedrick & Leboeuf (1992) can be interpreted as defining Z n,n ′ with n = number of poles (which we like), and n ′ = number of additional poles in the asymptotic expansion that is used, compared to some "minimally interesting" or "basic" definition Z n , that can be denoted as Z n,0 (which we like too). The problem with the notation of Hedrick & Leboeuf (1992) is with the definition of the "basic" Z n,0 , since the number of asymptotic points used in the definition of Z n,0 keeps changing with n (and is actually equal to n). The notation is physically motivated, but the motivation is difficult to follow. The Z 2,0 is defined with 2 asymptotic points, Z 3,0 with 3 asymptotic points and so on. This can be easily deduced from their definitions of Z 2 − Z 5 , as we will discuss later. We find this notation confusing.
Importantly, both mentioned notations consider the Padé approximants to Z(ζ). We do not really care about Z(ζ), since all the kinetic moments are formulated with R(ζ) at this stage. We want to calculate direct Padé approximants to R(ζ), which is actually slightly less analytically complicated for a given n. Here we define the 2-index Padé approximation to the plasma response function R(ζ) simply as R n,0 (ζ) = 1 + a 1 ζ + a 2 ζ 2 + · · · + a n−2 ζ n−2 i.e. as having asymptote −1/(2ζ 2 ) for large ζ, and notation R n,n ′ (ζ) means that n ′ additional asymptotic points are used compared to the basic definition R n,0 (ζ). The notation feels natural, and the n ′ = 0 index helps us to orient in the hierarchy of many possible R(ζ) approximants. It is easy to remember that this asymptotic profile is the minimum "desired" profile that correctly captures the 0-th order (density) moment, and any profile with less asymptotic points should be avoided if possible. The R n,0 (ζ) has power series precision o(ζ 2n−3 ) and asymptotic series precision Of course, we want to make the R n,n ′ (ζ) and Z n,n ′ (ζ) definitions fully consistent, and Z n,n ′ (ζ) is defined so that is satisfied. This dictates that in comparison to Z n (ζ) definition (158), two additional asymptotic points must be used to define the Z n,0 (ζ). We have no other choice and when calculating the Z n,n ′ (ζ), we have to start counting from n ′ = −2, and we define When calculating the hierarchy of plasma dispersion functions Z(ζ), the −2 index is actually a nice reminder that we are two asymptotic points short of the "desired" profile (161) for the plasma response function R(ζ). We want to feel fully confident that we understand both Padé approximants R(ζ) and Z(ζ), and we will calculate 2-pole and 3-pole approximants for both functions. For 4-pole approximants and above, we will only work with R(ζ). Padé approximants were also used for other interesting physical problems, such as developing analytic models for the Rayleigh-Taylor and Richtmyer-Meshkov instability Zhou (2017a,b).

2-pole approximants of R(ζ) and Z(ζ)
Let's be patient and go slowly. A general 2-pole Padé approximant to R(ζ) is where a 0 = 1. The asymptotic expansion for large ζ values calculates and must be matched with the asymptotic expansion (124) Matching the first point implies b 2 = −2a 0 , and this is how R 2,0 (ζ) is defined. Then matching with 2 equation for the small ζ expansion, eq. (93), the classical Padé approach yields To match additional asymptotic point (and to potentially find R 2,1 (ζ)), dictates that b 1 = 0. However, the resulting function R 2,1 (ζ) = 1/(1 − 2ζ 2 ) does not have any imaginary part for real valued ζ, and does not represent valuable approximation to R(ζ) and this approximant is eliminated. Let's now explore possible 2-pole approximations of Z(ζ). A general 2-pole approximant is defined as and has the following asymptotic expansion for large ζ values The Z(ζ) has asymptotic expansion so by matching with 1/ζ implies b 2 = −a 1 (as already used previously) that defines Z 2,−2 (remember, we are starting to count with n ′ = −2). By further matching with 1/ζ 2 implies b 1 = −a 0 , that defines Z 2,−1 , and by further matching with 1/ζ 3 implies a 1 = 2, that defines Z 2,0 . The calculation is continued by matching with the power series for small ζ values, i.e. by using the classical Padé approach, that is described as and the solution is Continuing with Z 2,−1 (ζ), i.e. by using one more additional asymptotic term that dictates b 1 = −a 0 , the matching with the power series yields Similarly, considering Z 2,0 (ζ) yields Obviously, R 2,0 (ζ) = 1 + ζZ 2,0 (ζ) exactly.

3-pole approximants of R(ζ) and Z(ζ)
A general 3-pole approximant of R(ζ) is The asymptotic expansion calculates so that For R 3,0 (ζ) this implies b 3 = −2a 1 , for R 3,1 (ζ) additionally b 2 = −2a 0 , and for R 3,2 (ζ) also b 1 = 3a 1 . The asymptotic expansions (178) can become very long for higher orders of ζ, especially when more poles are considered. It is beneficial to write down the following scheme, where in each line, we advance the matching with one more asymptotic point: In the last expression the a 1 → ∞ since a 0 = 1, implying the R 3,3 (ζ) does not make sense and it is not defined. The scheme can be very quickly verified by using Maple (or Mathematica) software, by using command asympt(expression(ζ), ζ, n), where ζ is the variable, and n prescribes the precision of the expansion that is calculated up to the o(ζ −n ) order. Now by matching with the power series for small ζ values and the solutions are and has the following asymptotic expansion By matching the first asymptotic term implies b 3 = −a 2 , which defines Z 3,−2 (ζ). For Z 3,−1 (ζ) the second term is matched as well and b 2 = −a 1 . For Z 3,0 (ζ) the third term is also matched and b 1 = −a 0 + a 2 /2. To go higher requires higher order expansion (190). It is again easier to write down the asymptotic expansion scheme step by step Matching these results with an expansion for small ζ values is done according to and the solutions are Of course, the following relations now hold exactly R 3,2 (ζ) = 1 + ζZ 3,2 (ζ). (209) 3.3.3. 4-pole approximants of R(ζ) and Z(ζ) As before, the procedure of matching with asymptotic expansion yields (for simplicity already assuming a 0 = 1) (212) where the last relation imply a possible approximant R 4,5 (ζ) = (1− 2 3 ζ 2 )/(1−4ζ 2 + 4 3 ζ 4 ). However, such an approximant is not well behaved (it has zero imaginary part for real valued ζ) and the R 4,5 (ζ) is eliminated. Matching with the power series is performed according to and the results are From the 4-pole approximants, perhaps the most known one is R 4,3 (ζ) used for example by Hammett & Perkins (1990), Passot & Sulem (2007) etc., and which can be written in a convenient form Here we do not double check the derivation of the Z 4 (ζ) approximants "from scratch", and for a given R 4 coefficients, the Z 4 coefficients are of course easily obtained by For completeness, the corresponding results are 3.4. Conversion of our 2-index R n,n ′ (ζ) notation to other notations For clarity, we provide conversion tables of Padé approximants in the notation of Martín et al. (1980) and Hedrick & Leboeuf (1992) to our notation. Comparing our analytic results to those of Martín et al. (1980) (introducing superscript M), can be done easily according to and the general conversion can be written as The Table 1 of Martín et al. (1980) can be now easily verified, which reveals a small obvious typo in their Z M 4,2 , where the coefficient p 2 is missing the imaginary i number.
To compare our results to those of Hedrick & Leboeuf (1992), it is useful to calculate asymptotic expansions of their Z n definitions (that is defined as Z n,0 ), that calculate where "HL" stands for Hedrick & Leboeuf (1992). As one can see, the number of asymptotic points used in their basic definition of Z n , increases with the number of poles n. Compared to our definition, their Z 2,0 is defined as having another asymptotic point (for a total of 2), Z 3,0 has another asymptotic point (for a total of 3), Z 4,0 another one (for a total of 4), and so on. Essentially, in their notation the basic Z n,0 is defined as having "n" asymptotic points, and asymptotic precision o(1/ζ n ). The conversion between their and our notation is easy, and or the general conversion can be written as Z HL n,n ′ = Z n,n ′ +n−3 .
We checked the Table 1 of Hedrick & Leboeuf (1992) that provides coefficients for the Padé approximants (241)- (244) and we can confirm that the table is essentially correct, except for one coefficient. 9 The coefficient where a simple typo is suspected, is the coefficient a 1 in Z 3,1 . Rewriting our 3-pole approximant R 3 (ζ) to the form used by Passot & Sulem (2007) and Hedrick & Leboeuf (1992) (that corresponds to the Z HL 3 as written in (238) ) yields which further yields and our approximants are For the a 1 coefficient in R 3,1 , both Hedrick & Leboeuf (1992) and Passot & Sulem (2007) use a 1 = 2.23990 instead of the correct a 1 = 2.32990. The differences are of course small. Nevertheless, the new correct value explains the observation made by Passot & Sulem (2007), in the paragraph below their Figure 1, where they write: "It is conspicuous that R 3,2 provides a fit that is slightly better for small ζ, but turns out to be globally less accurate than R 3,1 ." Authors obviously noticed that something is not right, since for small ζ, the R 3,1 has precision o(ζ 2 ) and R 3,2 only o(ζ), so the R 3,1 should be more precise. And it indeed is, authors were just misguided by the wrong value of a 1 introduced by Hedrick & Leboeuf (1992).

Landau fluid closures -fascinating closures for all ζ
Now, let's use various Padé approximations of the plasma response function R(ζ), and calculate the kinetic moments. Let's start with the simplest choice of replacing the exact R(ζ) with approximant R 1 (ζ) = 1/(1 − i √ πζ). Let's drop the index r. The linear kinetic moments calculate We are looking for a closure, and we want to express either q (1) or r (1) , as a linear combination of lower order moments.
To immediately see possible closures, it is always useful to pull out the denominator of the Padé approximant out (as done above), and concentrate only at the expressions inside the big brackets. Also, similarly to the closures explored for small ζ, it is useful to forget the n (1) , p (1) and r (1) moments, and just concentrate at the u (1) , T (1) , q (1) and r (1) moments. Nevertheless, we will keep the n (1) moment, since it helps us to understand the expressions and to somehow "maintain the touch with reality". It is therefore sufficient to write where we have suppressed writing everything in front of the 1/(1 − i √ πζ), including the minus signs (it does not mean that these were neglected, full expressions are considered, we are just not writing down the full expressions, which helps in spotting the possible closures). By exploring the expressions inside of the brackets, it is obvious that it is impossible to express q (1) or r (1) as a linear combination of lower order moments that eliminate ζ dependence. Moreover, for large ζ, the moments q (1) ∼ ζ 2 and r (1) ∼ ζ 3 , which does not make physical sense, since these quantities should converge to zero with increasing ζ, as explored in the ζ ≫ 1 limit, see eqs. (137)-(143). The R 1 (ζ) approximant therefore does not yield a closure. Let's try 2-pole approximants. The R 2,0 (ζ) approximant yields linear moments R 2,0 (ζ) : and again, no closure for q (1) or r (1) is possible. We note that the approximant R 2,1 (ζ), that was eliminated because it is not a good approximant of R(ζ) yields and the possible closure is to express q (1) through u (1) . The full expressions for these moments are that yields a closure R 2,1 (ζ) : which is equivalent to the closure (106), that was obtained for small ζ with the precision o(ζ). This closure is therefore disregarded. Let's try the 3-pole Padé approximants. The moments with R 3,1 (ζ) approximant are proportional to and there is a possibility to express q (1) through the combination of the lower moments T (1) and u (1) . The full expressions of these moments are where we have used a convenient notation D for the denominator of the plasma response function, and the closure is which is equivalent to the (117) closure (which was obtained for small ζ with the precision o(ζ 2 )). Continuing with the next approximant R 3,2 (ζ), the moments calculate It is possible to express 1) q (1) through T (1) ; 2) r (1) through the combination of u (1) and q (1) ; 3) r (1) through the combination of u (1) and T (1) . The first choice yields a closure R 3,2 (ζ) : that is equivalent to the (105) closure obtained for small ζ with the precision o(ζ). This is indeed the famous simplest possible Landau fluid closure that expresses the collisionless heat flux with respect to temperature, and it equivalent to eq. (7) of Hammett & Perkins (1990). 10 The closure is written here in Fourier space. The important part is the isign(k ) that typically written as ik /|k |, and that in Real space rewrites as a Hilbert transform, which we will address later. The R 3,2 (ζ) was obtained with o(ζ) power series expansion, and o(1/ζ 4 ) asymptotic series expansion. How good is this closure ? By exploring expressions (287)-(291), the quantities n (1) , u (1) , T (1) have all correct asymptotic expansion for large ζ (including the proportionality constants), however, the heat flux decreases only as q (1) ∼ 1/ζ 2 instead of the correct ∼ 1/ζ 3 , see eq. (141). For large ζ, the heat flux is therefore overestimated by this simple closure, which typically leads to an overestimation of the Landau damping in fluid models that use this simplest closure. Nevertheless, the closure is very beneficial because it clarifies the distinction between the collisional and collisionless heat flux. The other two possible closures with R 3,2 (ζ) are R 3,2 (ζ) : 10 With their later found constant χ 1 = 2/ √ π, and remembering that their thermal speeds are defined as v th = T (0) /m, whereas ours are v th = 2T (0) /m. and one can go from (294) to (295) by using (293). Obviously, it would be also possible to construct a closure π v th n 0 T (0) sign(k ), and where α q and α T are related by satisfying 2n 0 v th sign(k )α q + i √ πα T = −iv 2 th n 0 3π+8 2 √ π , i.e. one could consider a closure with a free parameter, which we will not consider. Additionally, all constructed closures should be checked with respect to obtained dispersion relations, and closures (294), (295) will be later disregarded as not well behaved.
For R 4,2 (ζ), the kinetic moments calculate The only possibility is to express r (1) through a combination of u (1) , T (1) and q (1) , and the solution is which is equivalent to the closure (121), that was obtained for small ζ with precision o(ζ 3 ). Obviously such a closure is precise for small values of ζ, however for large values of ζ, the asymptotic behavior of q (1) ∼ ζ −2 and r (1) ∼ ζ −1 , instead of the correct ζ −3 , ζ −4 profiles (see eqs. (141), (143)), and these quantities will be therefore overestimated. Nevertheless, the solution is interesting and we are not aware of it being reporting in any literature.
Continuing with R 4,3 (ζ), the kinetic moments calculate It is possible to express r (1) through the combination of q (1) and T (1) and the result is which is equivalent to the closure (113), that was obtained for small ζ with the precision o(ζ 2 ). The heat flux has a correct asymptotic behavior q (1) ∼ ζ −3 (even though with incorrect proportionality constant), and the quantity r (1) ∼ ζ −2 instead of the correct ∼ ζ −4 . The closure was first reported by Hammett & Perkins (1990), and is equivalent to the (non-numbered) expression between their eq. (10) and (11). Continuing with R 4,4 (ζ) approximant, the kinetic moments are (let's stop writing down n (1) from now on since we know we can get it from u (1) ) R 4,4 (ζ) : It is possible to express r (1) through q (1) and the closure is The result is equivalent to the (109) closure, that obtained for small ζ with precision o(ζ). This very simple closure has only precision o(ζ), however, it does have the correct asymptotic behavior of the heat flux q (1) ∼ −3/(2ζ 3 ) (including the proportionality constant), and r (1) ∼ ζ −3 that is closer to the correct ζ −4 than the previous closure.
3.6. Table of moments (u , T , q , r ) for various Padé approximants To clearly see possibilities of a closure, it is useful to create the following summarizing table, that is self-explanatory after reading the previous section, i.e. all the proportionality constants (including the minus signs) are suppressed. Even though the table here is created for 1D geometry, we will see that exactly the same table is constructed for 3D geometry, where all the quantities are given a "parallel" sub-index, i.e. u (1) → u (1) , T (1) → T (1) , q (1) → q (1) and r (1) → r (1) . The table is therefore useful to spot all the possible closures that can be constructed in 1D geometry for quantities q (1) , r (1) , as well as in 3D geometry for quantities q (1) and r (1) . The approximants R 2,1 , R 4,5 , R 6,9 and R 8,13 are marked with an asterisk "*", since these are not well-behaved and are provided only for completeness, these approximants should be disregarded.
New closures should be always checked. Later on, we will consider propagation of the ion-acoustic mode, satisfying kinetic dispersion relation (452). We believe that a good "reliable" closure of a fluid model obtained with R n,n ′ (ζ) approximant, should yield a fluid dispersion relation that is equivalent to (452), after R(ζ) is replaced with R n,n ′ (ζ) (equivalent to the numerator of (452) once both terms are written with the common denominator). Closures that satisfy this requirement are marked with " " in the above table. Closures that do not satisfy this requirement were eliminated, and can be further split to two categories. Both eliminated categories actually appear to describe the ionacoustic mode with the same accuracy as a corresponding "reliable" closure satisfying (452), however, the difference is in the higher-order modes. The first category of eliminated closures, marked with "x", produces higher-order modes with positive growth rate, and these closures can not be used for numerical simulations. The second category, marked with "!", produces higher-order modes that are damped, and these closures can still be useful. However, there is no guarantee that these closures will behave well in different circumstances (for example when used in 3D geometry) and these closures were therefore eliminated.

Going back from Fourier space to Real space -the Hilbert transform
The quasi-static Landau fluid closures explored in the previous section, were constructed in Fourier space. For direct numerical simulations that can use Fourier transforms (that are usually restricted to periodic boundaries), or for solving dispersion relations ω(k), this is the easiest and natural way how to implement these closures. Nevertheless, it is very beneficial to see how these advanced fluid closures translate to Real space.
Provided all equations are linear (and homogeneous), transformation between Real and Fourier space is usually very easy and so far we just needed where we did not even bother to write the hat symbol on the quantities in Fourier space, since it was obvious and not necessary.
With equations encountered in simple fluid models, transformation back to Real space is easy and one can usually just flip the direction of the arrow in relations (318). However, the constructed Landau fluid closures contain an unusual operator isign(k ) = ik /|k |. How does this operator transforms to Real space ? Considering just spatial 1D transformation between coordinates z ↔ k , a general function Fourier transforms according to where the first equation is the inverse/backward Fourier transform and the second equation is the forward Fourier transform. As usual, we often do not bother to write the hat symbols on quantities in Fourier space. The location of the normalization factor 1/(2π) is an ad-hoc choice, but one has to be consistent, especially when calculating convolutions. As a first step, we need to calculate F −1 of a function sign(k ). However, if such an integral is calculated directly, one will find out that the result is not clearly defined. It is beneficial to use a small trick, where instead of a function sign(k ), one considers function sign(k )e −α|k | , where α is some small positive constant α > 0. And after the calculation, one performs the limit α → 0. The considered function is sign(k )e −α|k | = −e +αk ; k < 0; and the integral calculates further yielding By taking the limit α → 0, Such a limit is not well defined, and at least to our knowledge, it does not exist. One can try to use the well-known representation for the Dirac delta function lim a→∞ = 1 πz sin(az) = δ(z), but we did not manage to obtain anything useful. Obviously, some "damping" of the considered function is required that technically suppressed the limit to zero, and this was achieved by the e −α|k | function that was removed later. Nevertheless, the transformation (324) is considered a solid mathematical result, and the reader is referred to more advanced mathematical texts than our simplistic guide.
To look at the result (324) from another perspective, it is noteworthy that In basic mathematics textbooks considering the Fourier transform of sign(k ), the reader is sometimes cheated, and the calculation is done by using a theorem for Fourier transforming a derivative of a general function f (k ), that in our notation here calculates (using integration by parts) The usual theorem found in basic textbooks neglects the first term, and is written as since it is implicitly assumed that the functionf (k ) decreases to zero for high k , i.e. that lim k →±∞f (k ) = 0. This assumption is obviously not true for the functionf (k ) = sign(k ), which stays ±1 with increasing k . Nevertheless, by using the simplified rule (328) and the identity (326), it is easy to show that that indeed verifies (324). However, the reader is implicitly tricked in those basic textbooks, to make the F −1 isign(k ) look easy. If the non-simplified rule (327) is used, the undefined limit (325) is obtained, and to suppress the limit, some damping function similar to the e −α|k | has to be assumed. Again, the result (324) is a solid mathematical result and let's move on. In Landau fluid closures, the operator isign(k ) acts on a variablef (k ), and to transform this to Real space, we need to use a convolution theorem for Fourier transforms. To make sure that we get the normalization factors right, let's calculate it in detail. The convolution between two real functions is defined as For brevity, let's temporarily suppress the parallel subscript on k and use only k. By decomposing the function f (z−z ′ ) to waves (using the inverse Fourier transform), and changing the order of integrals For normalizations (319), (320), the required convolution theorem therefore reads and of course, f (z) * g(z) = g(z) * f (z). Now it is straightforward to calculate how the isign(k )f (k ) transforms to Real space The convolution of 1 πz with a function f (z) is a famous transformation, called the Hilbert transform. According to the definition (330), the convolution 1 z * f (z) should be defined as However, because of the singularity 1 z−z ′ , such an integral will likely not exist, and the convolution integral is defined with a principal value. The definition of the Hilbert transform "H" that is acting on a function f (z) reads The use of the Hilbert transform allows a very elegant notation, how the isign(k )f (k ) transforms to Real space, it is according to Performing a lot of calculations, we like shortcuts, and the quantity isign(k ) can be viewed as an operator, that is acting on many possiblef (k ) variables, such as the velocity u (1) , the heat flux q (1) , etc. (see the Landau fluid closures). Therefore, in addition to the usual shortcuts (318), we can write an elegant shortcut for the operator isign(k ), that is very useful for advanced fluid models when transforming from Fourier to Real space, and that reads I.e., the operator isign(k ) in Fourier space, is the negative Hilbert transform operator in Real space. Curiously, doesn't the Hilbert transform integral z−z ′ dz ′ reminds us something ? What about, if we prescribe the quantity f (z ′ ) to be a Maxwellian f (z ′ ) = e −z ′2 ? Oh yes, this is the dreadful Landau integral ! This is how the plasma dispersion function was essentially defined. This is indeed the reason, why the paper by Fried and Conte 1961, that is well-known for tabulating the properties of the plasma dispersion function, has a full title: "The Plasma Dispersion Function. The Hilbert Transform of the Gaussian." Now we are ready to reformulate the Landau fluid closures in Real space. Purely for convenience, often in modern Landau fluid papers another operator H is defined that is equivalent to the negative Hilbert transform, and that absorbs the minus sign, i.e.
This "H" operator is therefore defined as and allows us to write or in the operator shortcut isign(k ) → H.
This new definition is of course not necessary. However, it is often used in Landau fluid papers, and there is indeed some logic behind it. First of all, we do not have to remember another minus sign, and we will make less typos, perhaps. Second, the definition is consistent with another "spatial" operator Fourier shortcut ik → ∂ z . Third, the Landau integral and the plasma dispersion function were defined with integrals f (x) x−x0 dx, and not as f (x) x0−x dx, and the H operator therefore can feel more natural than H. Whatever the choice, we now talked about it detail, and all possible confusion between H and H should be clarified. We will use the H operator henceforth.

Quasi-static closures in Real space
With our new shortcut (340) as discussed above, the transformation of closures from Fourier space to Real space is very easy. For example, the heat flux closure obtained for R 3,2 (ζ) that in Fourier space reads q (1) = −i 2 √ π n 0 v th sign(k )T (1) , is transferred to Real space as Again, the H operator shows its slight advantage over H operator, because it is easy to remember that for the usual collisional heat flux q ∼ −∂ z T , whereas for the collisionless heat flux q ∼ −HT . Let's rewrite the Hilbert transform a bit further, so that we can clearly see what this distinction means physically. Rewriting the principal value using the substitution z ′ = −y in the first integral (so that dz ′ = −dy and −∞ → ∞, −ǫ → ǫ), and renaming back y → z ′ , the H operator reads Instead of remembering the limit, it is more elegant to write the final result as V.P. ∞ 0 . In the simplest closure (341), the collisionless heat flux is therefore expressed with respect to the temperature as R 3,2 (ζ) : which is equivalent to the eq. (8) of Hammett & Perkins (1990). Writing the Hilbert transform and the collisionless heat flux in this form is very useful, because it reveals what the Hilbert transform of the temperature means physically. The equation says that to obtain the heat flux in Real space, one has to calculate integrals -and sum the differences between temperatures according to (345) -along the entire considered coordinate z. Here we calculated the expressions in the linear setting/approximation, and in reality, the integrals (345) should be performed along the magnetic field lines. What is perhaps the most non-intuitive and most surprising about the expression (345), that the expression is telling us that the entire temperature profile along a magnetic field line is important, since it will be encountered in the integral (345). Therefore, the collisionless heat flux q(z) at some spatial point z, depends on the temperature difference between that point, and the temperature along the entire magnetic field line. This effect is summarized with an appropriate word of non-locality of the collisionless heat flux, since it is in strong contrast with the usual collisional heat flux, that depends only at the local gradient of the temperature at that point. For time-evolving systems, this effect is also directly associated with the "isotropization" of temperature along the magnetic field lines.
To rewrite the other quasi-static closures that were explored in the previous section to the Real space is trivial, and for example the quasi-static closure (309) of Hammett and Perkins 1990 obtained with R 4,3 (ζ) reads The closure (302) obtained with 4-pole approximant R 4,2 (ζ) is rewritten to Real space as the closure (315) obtained with R 4,4 (ζ) is rewritten as and the closure (286) obtained with R 3,1 (ζ) reads The closures (294), (295) obtained with R 3,2 (ζ) read R 3,2 (ζ) : R 3,2 (ζ) : however (as mentioned before), these closures will be eliminated. To summarize, we obtained altogether 7 quasi-static closures. Additionally, there is 1 closure (276) that was disregarded since it was obtained with approximant R 2,1 (ζ) that is not a well-behaved approximant.

Time-dependent (dynamic) closures
In addition to the "quasi-static" closures explored above (sometimes called simply "static"), it is possible to construct a different class of closures that we can call "time-dependent" closures, or "dynamic" closures. For example, for the approximant R 4,3 (ζ), the temperature T (1) and the heat flux q (1) read where D is the denominator of R 4,3 (ζ) defined in (303). Calculating the ratio using the definition ζ = ω |k |v th and multiplying by |k |v th and n 0 v th sign(k ), allows us to formulate a closure that is further rewritten as To go back to Real space, we need a recipe for the inverse Fourier transform of operator |k |, that acts on a general quantityf (k ). The transform calculates easily by using |k | = −(ik )isign(k ) and writing that allows us to write a useful shortcut The closure (356) therefore transforms to Real space as and represents the time-dependent evolution equation for the heat flux. The last step in these type of Landau fluid closures is to recover Galilean invariance, that is achieved by substituting ∂/∂t with the convective derivative d/dt, and the final closure reads To easily compare this expression with the existing literature, a small rearrangement yields The expression is equal for example to equation (57) in Passot & Sulem (2003) for the parallel heat flux q (where in that paper v th = T (0) /m is used, whereas ours here is v th = 2T (0) /m). The time-dependent closure (360) was obtained with the approximant R 4,3 (ζ). Interestingly, if the derivative d/dt is neglected, the closure is equivalent to the quasi-static closure (341) obtained with R 3,2 (ζ) (which can be easily seen in Fourier space, or by using HH = −1). Also, it is useful to compare the time-dependent (360) with the quasi-static closure (346), that was obtained for the same approximant R 4,3 (ζ). To compare these closures, we need to use a time-dependent heat flux equation where the closure for r will be applied. In Part 1 of this guide, we derived nonlinear "fluid" equation for the parallel heat flux q (see Part 1, Section "Collisionless damping in fluid models -Landau fluid models"). Quickly rewriting it in the 1D parallel geometry that we use here yields (dropping the parallel subscript) where for brevity ∂/∂z = ∂ z . The equation can be of course obtained by direct integration of the 1D Vlasov equation ∂f /∂t + v∂ z f + (q r /m r )E∂f /∂v = 0, as done by Hammett & Perkins (1990), and prescribing Maxwellian by r = 3p 2 /ρ + r. The equation is nonlinear and to compare closures that were done at the linear level, we need to linearize the heat flux equation. This eliminates the 2nd and the last term, the 4rd term is linearized as 3(p 0 /m)∂ z T (1) , and since p 0 /m = n 0 v 2 th /2, the linearized equation reads This is just a 1D linear heat flux equation, where no closure was imposed yet. The quantities q (1) , r (1) , T (1) were not calculated from kinetic theory by using approximants to R(ζ), etc. The equation was obtained by a general "fluid approach", that we heavily used before we started to consider kinetic calculations (perturbations around Maxwellian are assumed here because of the prescribed r). The equation (363) greatly clarifies relations between the quasi-static and time-dependent Landau fluid closures. For example, by using the quasi-static closure (346) in the heat flux equation (363), the time-dependent closure (359) is immediately recovered. Often, time-dependent closures can not be straightforwardly constructed by a simple division of two moments as done above. It is useful to learn a new technique that will allow us to see and construct possible closures in a quicker way. Let's explore the closure (356). It is apparent that whenever we attempt to use ∂/∂t of some moment (in this case ∂q (1) /∂t), it is logical to also use the same moment without the time derivative (q (1) ) in the construction of the considered closure, i.e., in this case we search for a closure where α q , α T need to be determined. By using expressions for q (1) , T (1) , the above closure is separated to 2 equations for ζ and ζ 2 that must be satisfied independently if the closure is valid for all ζ, and solving these 2 equations yields The closure therefore reads and is of course equivalent to (356).
We are now ready to construct all other possible time-dependent closures. Still considering R 4,3 (ζ) approximant, another possible closure is R 4,3 (ζ) : which when separated into 3 equations for ζ, ζ 2 and ζ 3 that must be each satisfied yields the closure reads and this closure will be eliminated. Another closure with R 4,3 (ζ) can be constructed as so the closure reads and this closure will be eliminated as well. The time-dependent closures (370) and (367) are of course closely related, and one can go from one to another by using the quasi-static closure (346) that expresses r (1) as a combination of T (1) and q (1) . Continuing with R 4,2 (ζ) approximant, it is possible to construct the following closure R 4,2 (ζ) : and the result is consistent with using the quasi-static closure (347) in the linearized heat flux equation (363). Continuing with R 4,4 (ζ), it is possible to construct R 4,4 (ζ) : ; α T = 3 2 n 0 v th sign(k ), and the closure reads The obtained closure is related to the quasi-static closure (348), since by using the quasi-static closure (348) in the linear heat flux equation (363), the time-dependent closure (375) is recovered. Another closure with R 4,4 (ζ) is And yet another closure with R 4,4 (ζ) The closure (379) is related to the closure (377), because one can use the quasi-static closure (348) to express r (1) through q (1) , however, the closure (379) will be eliminated. The R 4,5 (ζ) was eliminated because it is not a well-behaved approximant (see discussion above), nevertheless, for completeness the following closure can be constructed R 4,5 (ζ) : With R 3,2 (ζ), the following time-dependent closure can be constructed and similarly, yet another one R 3,2 (ζ) : however, the last closure will be eliminated.
Continuing with the approximant R 5,6 (ζ), the kinetic moments calculate R 5,6 (ζ) : which yields a closure The R 5,6 (ζ) approximant has precision o(ζ), o(1/ζ 8 ). Even though the precision for small ζ is relatively low, the closure correctly reproduces the asymptotic behavior r (1) ∼ −6/ζ 4 (including the proportionality constant). Finally, it is indeed possible to construct a closure with precision o(ζ 4 ), by using R 5,3 (ζ). The approximant R 5,3 (ζ) is defined as R 5,3 (ζ) = 1 + a 1 ζ + a 2 ζ 2 + a 3 ζ 3 where the constants a 1 , a 2 , a 3 , b 1 are given in the Appendix. Using this approximant, the kinetic moments calculate R 5,3 (ζ) : It is possible to search for a closure and the solution is The correctness of the algebra can be quickly checked by prescribing b 1 = 3a 1 + 3a 3 , which immediately recovers the closure (391)-(392) that was obtained for R 5,4 (ζ) with only asymptotic expansion coefficients (and the power series coefficients unspecified), which yields α u = 0. The R 5,3 (ζ) closure reads By using the calculated coefficients from the Appendix, the closure in Fourier and Real space then reads where the perhaps complicated appearing proportionality constants (that come from the Padé approximation), are just constants, that are numerically evaluated as For numerical simulations, we of course recommend to re-calculate these constants from the above analytic expressions, to fully match the numerical precision of the considered simulation. For complete clarity, the fully expressed closure in Real space reads This is the only closure with precision o(ζ 4 ), and the asymptotic precision is o(1/ζ 5 ). To conclude, we altogether obtained 13 time-dependent closures. Additionally, we also obtained 1 time-dependent closure for R 4,5 (ζ) that was disregarded since the R 4,5 (ζ) is not a well-behaved approximant.

Parallel ion-acoustic (sound) mode, cold electrons
After all the calculations, it is advisable to verify if we obtained anything useful. Let's consider only the proton species, make the electrons cold and neglect electron inertia, so we have only 1-fluid model. Let's continue to work in physical units and later we will switch to normalized units. From Part 1 of this guide, the linearized fluid equations (obtained by direct integration of the Vlasov equation for a general distribution function f ) can be written in physical units as where the fluctuating parallel temperature T (1) is linearized as The superscript (1) on quantities n (1) , u (1) z , p (1) , q (1) (and T (1) ) signifies that these are just fluctuating quantities, the superscript does not mean here, that these quantities are obtained by integration over the kinetic f (1) . This fluid model is accompanied by a closure for r (1) , and that one was obtained from linear kinetic theory by integrating over the kinetic f (1) . Let's choose the R 4,3 (ζ) closure, eq. (309) Now the model is closed, and calculating the determinant yields the following dispersion relation By examining the expression, an obvious substitution offers itself that transforms the polynomial to a completely dimensionless form The ζ is obviously a very useful quantity, and one could rewrite the fluid equations (420)-(423) directly with this quantity. The polynomial (427) can be solved numerically, and the approximate solutions are (writing only 3 decimal digits) The first solution is the ion-acoustic (sound) mode and the second solution is "a higher-order mode". Both solutions are highly damped, and the higher-order mode has actually higher damping rate than its real frequency. We can now also see how important was to keep track of the sign(k ), since the modes are damped for both k > 0 and k < 0. If we have ignored the sign(k ), we would obtain that for k < 0 the sound mode has a positive growth rate and is unstable, which would be unphysical. Of course, each closure will yield a different dispersion relation. Exploring the simplest closure with quasi-static heat flux q (1) obtained with R 3,2 (ζ), the equations (420)-(422) are closed by which yields a polynomial Numerical solutions are showing that in this case the higher-order mode does not propagate and is purely damped. The ion-acoustic mode has a dispersion relation and is also very damped. We examine two more closures. The most precise quasi-static closure (302) obtained with R 4,2 (ζ) yields the analytic dispersion relation and the solutions are the first one being the ion-acoustic mode. Finally, the only available o(ζ 4 ) closure (419) obtained with R 5,3 (ζ) yields analytic dispersion relation where the coefficients are specified in (415) the first being the ion-acoustic mode. Let's compare the obtained results. Perhaps curiously, it appears that as the precision of closures increases, so does increases the real frequency and the damping rate of the ion-acoustic mode, and the differences are quite significant. So what is the correct kinetic result, i.e., how close did we get to the kinetic theory ? That is not as easy question as it appears to be. By opening kinetic books, there is no such a discussion as long-wavelength limit of the ion-acoustic mode, when the electrons are cold. Even the exact numerical solutions are usually considered only for T e /T p ≥ 1, see for example Figure 9.18 on page 355 in Gurnett and Bhattacharjee.
Let's examine the analytic dispersion relations (427), (432), (435) and (438), that were obtained with approximants R 4,3 (ζ), R 3,2 (ζ), R 4,2 (ζ) and R 5,3 (ζ). After some time, one notices that the dispersion relations exactly match the denominators of the associated Padé approximants ! Or in another words, without doing any calculations whatsoever, it appears that if a closure of a 1D fluid model is available for a R n,n ′ (ζ) approximant, the dispersion relation is equivalent to the denominator of that R n,n ′ (ζ). How is this possible ? The explanation is simple, if one considers the electrostatic kinetic dispersion relation for the proton and electron species where the electron Debye length was rewritten with the proton Debye length λ De = λ Dp T (0) e /T (0) p . By completely neglecting the electron term, we essentially prescribed that the proton term is very large (for both real and imaginary parts), so that 1 R(ζ p ) = 0.
The above expression can be considered electrostatic dispersion relation of proton-electron plasma, where the electrons are completely cold. The reason why such an expression cannot be found in any plasma book is, that from the kinetic perspective, such an expression cannot be solved and is ill-defined. The function R(ζ) is directly related to the derivative of Z(ζ) according to Z ′ (ζ) = −2R(ζ). Infinitely large R(ζ) means that Z(ζ) has infinitely large derivatives, i.e. that Z(ζ) is not continuous and, not analytic, which contradicts the entire definition of Z(ζ) and how the function was constructed. However, when Padé approximants of these functions are considered, and when R(ζ) is replaced by R n,n ′ (ζ), so that 1 R n,n ′ (ζ p ) = 0, such an expression does make sense, and is equivalent to the denominator of R n,n ′ (ζ) being zero, i.e., it directly yields the dispersion relations of the considered fluid models.

The proton Landau damping does not disappear at long-wavelengths
There are several extremely interesting phenomena worth discussing. 1) In the dispersion relation for the ionacoustic (sound) mode (429), the usual phase speed ω/k is constant, implying that the Landau damping (of the parallel propagating sound mode) does not disappear, however long-wavelengths are considered. With cold electrons as considered here, the parallel sound mode is always heavily damped, and disappears in a few wavelengths, even on large astrophysical scales. 2) The equations (420)-(423) do not even contain the parallel electric field E . This might sound surprising, but the parallel electric field completely disappears at long wavelengths, even though the Landau damping (as expressed through the constant phase speed), does not disappear. The parallel electric field does not disappear, if electrons have finite temperature, it also enters (very weakly), if the electron inertia is included. In the 1D linearized geometry considered here, the contributions will be 3) The presence of Landau damping in the long-wavelength limit is exactly the reason why usual fluid models such as MHD or even much more sophisticated CGL description, do not converge to the collisionless kinetic theory, whatever long-wavelengths and low-frequencies are considered. There is always a mismatch in dispersion relations when the phase speed is plotted, that depending on plasma parameters, can be quite large. This does not concern only the damping rate (which in MHD and CGL is of course zero), the differences in the real frequency, which is always coupled to the imaginary frequency (for example through the polynomial (427) for that specific closure), can be large too. 4) If the heat flux q (1) is prescribed to be zero, i.e. if a CGL model is prescribed, the dispersion relation of the parallel propagating sound mode is determined only by the parallel velocity eq. (421) and parallel pressure eq. (422), yielding the CGL result ω 2 = 3 2 v 2 th k 2 , so that ω CGL = ±|k |v th 3 2 = ±|k |v th 1.225.
As can be seen, the CGL is much closer to the kinetic calculations, which can be interpreted as a consequence of having the "correct" γ = 3 and not the MHD value of 5/3.

Proton Landau damping, influence of isothermal electrons
Let's prescribe electrons to be isothermal, with some finite electron temperature, but let's neglect the electron inertia, since its effect is insignificant at long wavelengths. The reader should be familiar with isothermal electrons from Part I of this text, the proton momentum equation is changed to (458), the electron pressure equation reads where for brevity, we define the ratio of electron and proton temperature as τ . Using the R 4,3 (ζ) closure as before, the coupled dispersion relation reads The above expression is equivalent to eq. (A6) in Hunana et al. (2011). 11 The R 4,2 (ζ) closure yields dispersion relation and the R 5,3 (ζ) closure yields dispersion relation Let's concentrate on the ion-acoustic mode and forget the higher-order mode (which is always highly damped), and obtain solutions for few different τ values This is excellent, as in kinetic books, with increasing electron temperature, the Landau damping of the ion-acoustic mode disappears. Compared to kinetic calculations, the total Landau damping is here of course underestimated, especially for high electron temperatures, since here in the fluid model, only the proton Landau damping is contributing, and the electron Landau damping is turned off. Let's turn it on.

Proton and Electron Landau damping
Considering wavelengths much longer than the Debye length, the exact kinetic dispersion relation reads where the electron thermal velocity is of course much higher than the proton thermal velocity (unless the electrons are cold), and by using the abbreviated so that and the exact kinetic dispersion relation reads τ R(ζ p ) + R ζ p µ/τ = 0.
Let's see how close did we get. One of the greatest advantages of Landau fluid models is that we do not have to resolve electron motion to obtain the correct form of electron Landau damping at long wavelengths, and the electron inertia in the electron momentum equation can be neglected. The correct electron-proton mass ratio can enters equations for the electron heat flux q e and the 4th-order moment r e , and the electron inertia influence the solutions only insignificantly. However, let's keep the electron inertia for a moment. The equations for the proton species read and the electron inertia represents the last term in (458). The electron equations are written in a form so that they are normalized with respect to the proton pressure Note that the electron fluid speed u (1) zp (so we omitted the index p). The fluid equations are accompanied by a closure from kinetic theory, for example the R 4,3 (ζ) closure The equations (457)-(464) now represents a fluid description of the ion-acoustic mode, and contain both proton and electron Landau damping. It is rather mesmerizing, that the relatively complicated dispersion relation of this fluid model, can be shown to be equivalent to the "simple looking" kinetic dispersion relation i.e. equivalent to the full kinetic dispersion relation (456), where the exact R(ζ) is replaced with the R 4,3 (ζ) approximant (by transferring the proton and electron terms of R 4,3 (ζ) in the expression (465) to the common denominator and making the resulting nominator of that expression equal to zero, Maple is great in this regard). Nevertheless, here we want clearly demonstrate that the electron inertia can be neglected and the electron Landau damping still nicely captured, and we use fluid dispersion relations obtained from the system (457)-(464), where the last term in (458) is neglected. It is important to normalize properly and for example the R 4,2 (ζ) closure for electrons In the table below, we compare these fluid solutions of the quasi-static R 4,3 (ζ) and the R 4,2 (ζ) closures, and the time-dependent R 5,3 (ζ) closure to the exact kinetic solutions, calculated from (452), for various electron temperatures.  are very precise in the very important regime, where the electron temperature τ ranges between τ = 1 and τ = 5. The R 5,3 (ζ) closure is the most globally precise closure. If static closures are preferred, the comparison between R 4,2 (ζ) and R 4,3 (ζ) is more difficult to summarize, the R 4,2 (ζ) is definitely preferred in the regime τ ∈ [1, 5] and perhaps also for τ ∈ [25, 60], however, the Hammett and Perkins closure R 4,3 (ζ) is the better fit in the regime τ ∈ [5, 25] and also for τ ∈ [60, 100]. We checked that the inclusion of electron inertia is insignificant for all 3 fluid closures, and by eye inspection, it appears that the largest global difference is seen for the R 5,3 (ζ) closure, roughly for τ ∈ [30, 60], making the closure (very slightly) more precise. In Figure (3), we calculate the other selected obtained closures. We use the full dispersion relations with electron inertia included. The figure shows, that if static closures are preferred, for value of roughly τ > 15, the best closure is actually the static closure R 4,4 (ζ). The most precise closure for τ > 15 is by far the time-dependent R 5,6 (ζ) closure, which achieves an excellent accuracy for high values of tau. If a global accuracy for all values of τ is required, our favorite closures are R 5,3 (ζ) and R 5,4 (ζ). With the help of Maple software, we analytically investigated dispersion relations of all the obtained fluid closures, and we investigated if the resulting dispersion relation (including the electron inertia) is equivalent to the kinetic result (452), after replacing the R(ζ) with R n,n ′ (ζ), i.e. if the fluid dispersion relation is equivalent to the numerator of All the closures considered in this subsection satisfied this requirement, however, some other previously obtained closures did not. We concluded, that satisfying (467) should be indeed considered as strong requirement for a physically meaningful closure, and closures that did not satisfy this requirement were therefore eliminated. The results are summarized in the subsection 3.6 "Table of moments (u , T , q , r ) for various Padé approximants", eqs. (316), .

Electron Landau damping of the Langmuir mode
In addition to the ion-acoustic mode, let's calculate the Landau damping for the second (perhaps first) typical example, how Landau damping is addressed in plasma physics book, the Langmuir mode. Focusing on the electron species and making the proton species cold and "very heavy" with m p → ∞, i.e. immobile with u (1) zp = 0, the proton species completely decouple from the system, and their role is just to conserve the leading-order charge neutrality n 0p = n 0e = n 0 . Since we haven't dealt with such a system so far (not even in Part 1 of the text), let's write down the basic equation nicely in Real space in physical units. Neglecting the electron heat flux, the basic system of linearized equations reads By using the general electrostatic Maxwell's equation for the current (including the displacement current) that in our specific 1D linear case considered here reads prescribes the electric field time evolution, and the system of equations is closed. By applying ∂/∂t to the momentum equation (469), the equations can be combined, yielding a wave equation where the electron plasma frequency ω 2 pe = 4πe 2 n 0 /m e . This wave equation describes the basic plasma physics mode, known as the Langmuir mode, and the dispersion relation is If we ignored the displacement current ∂E /∂t, the ω 2 pe term would be absent. By dividing with ω 2 pe and by using the Debye length λ De = 1/k De where k 2 De = 4πe 2 n 0e /T (0) e , so that the Debye length λ 2 De = T (0) e /(m e ω 2 pe ), the dispersion relation (474) reads Obviously, the electron plasma frequency and the electron Debye length are the natural normalizing units of this system, and one should use normalized quantities ω/ω pe and k λ De . A useful relation also is λ 2 De = v 2 th e /(2ω 2 pe ). Often, in plasma physics books, the CGL adiabatic index γ = 3 in the above two equations, is substituted with a general adiabatic index γ e , so that a more "general" case can be considered. This is especially useful if the Langmuir waves, which are the basic waves of plasma physics, are introduced early on (in an early chapter of a book), where the correct CGL value of γ = 3 is difficult to introduce. Again, we have an advantage of not being a plasma book, and we are not describing the general electrostatic case, we are describing the fully electromagnetic case, but we are focusing only on one mode -the electrostatic mode that propagates parallel to B 0 . In the view presented here, and as elaborated in Part I of the text, playing with adiabatic indices, does not make much sense. No adiabatic index can match the CGL and the MHD, the CGL is always different from MHD, even for isotropic distribution function with ⊥ . Therefore, we are not introducing any adiabatic index, and the correct CGL value is used, and fixed to 3. Instead, we introduce the electron heat flux and get closer to the kinetic theory in a much more sophisticated way.

The basic linearized fluid equations in Fourier space read
and are accompanied for example by the R 4,3 (ζ) closure The dispersion relation of this fluid model reads (suppressing e in the electron Debye length λ De ) where The exact kinetic dispersion relation reads 1 + 1 k 2 λ 2 D R(ζ e ) = 0.
To clearly understand the obtained solutions, let's solve the simple R 3,2 (ζ e ) dispersion relation (484)  The first mode is the Langmuir mode, and the second mode is a purely damped higher-order mode. In the complete limit k λ D → 0, the Langmuir mode becomes undamped with a solution ζ e = 1/( √ 2|k |λ D ), which corresponds to oscillations with electron plasma frequency ω = ω pe ; and the higher-order mode has a solution ζ e = −2i/ √ π. Considering the weak damping limit ζ e = x + iy, where x ≫ y, at the leading order ζ 2 e = x 2 + i2xy and ζ 3 e = x 3 + i3x 2 y, which when used in the dispersion relation (484) that is separated to real and imaginary parts yields and for x ≫ y at the leading order which approximates the above numerical solutions reasonably well up to let's say k λ D = 0.3, and from (480) the x, y expressions are equivalent to For k λ D ≪ 1, the Landau damping of the Langmuir mode goes to zero, however, the damping rate is very overestimated. The approximate kinetic result found in plasma physics books, see for example Gurnett and Bhattacharjee page 349, has of course the same real frequency, however the damping rate reads For k λ D ≪ 1, i.e. approaching long wavelengths, the exponential term suppresses the Landau damping much quicker than our result (486). To understand the discrepancy, let's quickly consider how the kinetic result (487) was obtained. The result is obtained by considering asymptotic expansion |ζ e | ≫ 1 of the exact kinetic dispersion relation (481), which in the weak growth rate approximation (see eq. (124) with σ = 1) reads By using ζ e = x + iy with x ≫ y and k λ D ≪ 1 yields at the leading order x 2 = (1 + 3k 2 λ 2 D )/2k 2 λ 2 D which agrees with (485) and the damping rate is y = − √ πx 4 e −x 2 , which recovers (487). The e −x 2 term in the damping rate comes from the last term in (488), and as discussed previously, this term is neglected in the asymptotic expansion when constructing the Padé approximants of R(ζ) (it is however included in the power series expansion), explaining the discrepancy.
The damping rate of the Langmuir mode is plotted in Figure 4 and the real frequency in Figure 5, where solutions of various fluid models are compared with exact kinetic dispersion relation (481), depicted as the black solid line. Additionally, the asymptotic kinetic solution (487) from plasma physics books is plotted as the black dotted line. Figure 4 is plotted in log-log scale and Figure 5 uses linear scales. It is shown that for k λ D > 0.2, fluid models can reproduce the damping of the Langmuir mode quite accurately, and the most accurate closure is R 5,3 (ζ). This closure also reproduces the real frequency of the Langmuir mode very accurately and actually better than the asymptotic kinetic solution (487).  Nevertheless, as discussed above, because of the missing exponential factor in fluid models, the Landau damping becomes very overestimated at scales k λ D < 0.2, i.e. it is the long-wavelength limit (and not the short-wavelength limit) that represents trouble. This is because in the long-wavelength limit, the frequency of Langmuir mode does not go to zero but approaches electron plasma frequency ω pe , and so the phase speed ω r /k (and the variable ζ e ) becomes large and for k λ D → 0 goes to infinity, where the fluid closures become imprecise. Landau fluid simulations of the Langmuir mode should be therefore restricted to scales k λ D > 0.2. At longer wavelengths, some closures can actually become ill-posed and instead of Landau damping, can produce a small positive growth rate. For example, if one insists on numerical simulations in the domain below k λ D < 0.2, the closures that have to be eliminated are closures R 4,2 (ζ), R 5,3 (ζ), R 5,4 (ζ), since they produce a small positive growth rate. We briefly checked, and all other closures seems to be well-behaved all the way up to k λ D = 10 −4 . At even longer scales, such as k λ D = 10 −5 , two other closures become ill-posed, the R 5,5 (ζ) and R 5,6 (ζ), and the remaining closures R 3,1 (ζ), R 3,2 (ζ), R 4,3 (ζ), R 4,4 (ζ) do not appear to have a length-scale restriction. It is useful to note that this is not only a problem of Landau fluid closures, but at long-wavelengths, it is actually the kinetic theory itself that becomes very difficult to solve, and in the region k λ D < 0.1, we were often not able to obtain correct numerical solution when solving the exact dispersion relation (481).

3D GEOMETRY (ELECTROMAGNETIC)
Considering gyrotropic f 0 , let's remind ourselves the linearized Vlasov equation (12), that reads We want to describe the simplest kinetic effects and we demand that f (1) must be gyrotropic as well, so ∂f (1) /∂φ = 0. This eliminates the third term on the left hand side of (489) that is responsible for complicated non-gyrotropic effects with associated Bessel functions. However, even without this term the equation still appears to be complicated. For gyrotropic f 0 , the operator on the right hand side can be shown to be (see later in the text, written in Fourier space) Written in the cylindrical co-ordinate system which yields The Vlasov equation in Fourier space now reads This equation is not very useful. If the equation is divided by the (ω − v · k) to obtain f (1) , and integration over 2π 0 dφ is attempted, leads to integrals that are not well defined. On the other hand, if (495) is directly integrated over dφ (each side separately), almost all the terms disappear since 2π 0 cos(φ − ψ)dφ = 0 etc., except and the system reduces to the simplest case of Landau damping that we have already described in detail (even though only in 1D geometry). We could divide (496) by (ω − v k ), integrate the system in 3D geometry and consider Landau fluid closures, but this would be a bit boring right now. We want to get a bit more kinetic effects out of the system. We need a different approach and we need to obtain a better gyrotropic limit for f (1) . It turns out that to obtain the correct gyrotropic limit for f (1) , the 3rd term in the Vlasov equation (489) cannot be just straightforwardly neglected. The term has to be kept there, the relatively complicated integration around the unperturbed orbit has to be performed (see Appendix), and only then the term can be removed in a limit. This is very similar to other mathematical techniques that were encountered earlier, for example when calculating the Fourier transform of sign(k ), where instead of that function, one needs to consider sign(k )e −α|k | , and only after the calculation the term is removed with the limit α → 0. Without the additional term e −α|k | that was removed later, the calculations were not clearly defined, and a very similar situation is encountered now. Nevertheless, it is indeed mind boggling that the complicated integration around the unperturbed orbit has to be performed to recover the gyrotropic limit. This is exactly why the 3D case is so much more complicated than the previously studied 1D case, even though the Landau fluid closures will not be more complicated at all, as we will see later. An alternative approach that we will discuss only very briefly, is to use the guiding center variables where the gyrotropic limit is recovered perhaps more naturally. However, we will skip a huge amount of calculations that lead to do the guiding center approach, so the amount of complexity is probably similar at the end.

Gyrotropic limit for f (1)
We need to consider the full kinetic f (1) with all non-gyrotropic effects, that is obtained in the Appendix, the eq. (800). This f (1) contains all the information of linear kinetic theory, with associated Bessel functions J n (λ r ), where λ r = k ⊥ v ⊥ /Ω r and Ω r = q r B 0 /(m r c). To get rid of the Bessel functions (and to get closer to the gyrotropic state), one needs to consider the limit that the gyroradius is small, which corresponds to limit λ r ≪ 1. Additionally, we will need to consider ω/Ω r ≪ 1. Before we proceed further, we find it illuminating to separate the n = 0 resonance from all the other expressions, without performing any approximations, i.e we want to separate By using the z-component of the induction equation ∂B/∂t = −c∇ × E written in Fourier space (824) (that is an equation of general validity not introducing any simplifications), the general f (1) eq. (800) is slightly rewritten as Separating the n = 0 case directly yields Note that nJ n (x)/x = (J n−1 (x) + J n+1 (x))/2, which when evaluated for n = 0 is zero exactly, since J −1 (x) + J 1 (x) = 0 exactly. Since there is no dependence on angles φ, ψ inside of the big brackets, the sum can be summed (or put to its original form where it came from) Very interestingly, for one B z term, the complicated denominator ω − k v cancels out, yielding This is an exact kinetic expression for f (1) corresponding to n = 0 resonances, that is accompanied by an expression for all the other resonances f (1) | n =0 (that is equivalent to (498) where n = 0 is added below the sum with n). Now considering the limit λ r ≪ 1, the Bessel functions J 0 (λ r ) = 1, J ′ 0 (λ r ) = −λ r /2, the exponential term disappears, which yields the final f (1) in the gyrotropic limit that reads or alternatively The same expression is obtained by directly picking up the m = 0, n = 0 contributions from the general (498). Up to replacing B z with |B|, the expression agrees for example with eq. (19) of Ferrière & André (2002), and is of course equivalent to expressions of Snyder et al. (1997) (formulated in the gyrofluid formalism). In those works, the expression is derived perhaps more elegantly, in the so-called guiding-center limit of the Vlasov equation (see Kulsrud (1983)). The difference between B z and |B| arises, because the fully kinetic f (1) in (498) is linearized completely. As we will see shortly, the expression has a very nice physical interpretation, where the first term comes from the conservation of the magnetic moment µ, the second term comes from the magnetic mirror force and the third term comes from the Coulomb force. Note that to obtain the gyrotropic limit (502), we did not have to perform the low frequency limit ω/Ω r ≪ 1, however, by only picking up the n = 0 resonances, we basically cheated. The power series expansion of the Bessel functions for n ≥ 0 reads (with integer n) where the second expression can be easily replaced by J −n (x) = (−1) n J n (x). The first few terms are and the derivatives of these functions read and the derivatives can be also calculated by using identity J ′ n (x) = (J n−1 (x) − J n+1 (x))/2. In the full equation (498), the term with E x , E y components contains J m (x)(J n−1 (x) + J n+1 (x)), so for m = 0 terms with resonances n = ±1 do not disappear in the limit λ r ≪ 1. Similar situation is for the B z components (which for n = 0 is actually easier to reformulate to the original formulation without the B z induction equation to recover the correct limit). If we like it or not, to get rid of these terms and to obtain the gyrotropic limit (502), one has to do the low frequency limit ω ≪ Ω r as well.
A few notes are in order. 1) If we now calculate the kinetic moments with f (1) described by (502), which was obviously obtained in the low-frequency limit, and find possible fluid closures for the heat fluxes q , q ⊥ or the 4th-order moments r , r ⊥ , r ⊥⊥ , such a fluid model will not become necessarily restricted only to a low frequency regime ω ≪ Ω. At the linear level, the parallel propagating ion-cyclotron and whistler modes are completely independent from the Landau fluid closures, and these modes remain undamped. 12 For example Figure 7 in Part 1 remains unchanged, and the simplest ion-cyclotron resonance where ω → Ω for high wavenumbers (neglecting FLRs), will not be suddenly "removed" by using a low-frequency Landau fluid closure. All figures for the (strictly) parallel firehose instability remain unchanged, and the same applies to the perpendicular fast mode.
2) There is nothing "esoteric" about ion-cyclotron resonances. Similarly to the kinetic effect of Landau damping, the ion-cyclotron resonances just represent some integral, which indeed has some wave-particle "resonance", i.e. the integral has some singularity in the denominator. Similarly to Landau damping, in the case of bi-Maxwellian f 0 this singularity can be expressed through the plasma dispersion function Z(ζ) (similar generalizations exist for a bi-Kappa distribution etc.). The variable ζ is only modified to include the resonances, and for n = ±1 one can work with or for general n with ζ n = (ω + nΩ)/(|k |v th ). No new discussion how to treat this singularity is required. The singular point x 0 in the complex plane is only moved to some other location, and all the previous discussion about the Landau integral fully applies. We could potentially integrate over all the ion-cyclotron resonances and obtain expressions for the heat flux or the 4th-order moments (with the same techniques as plasma physics books do, even though they usually stop at the 1st-order velocity moment, since it is enough to obtain the kinetic dispersion relation). Even though complicated in detail, these would be just standard kinetic calculations. The difference between advanced fluid and kinetic description is, that we need to find a closure after all of these kinetic calculations. I.e., we need to find a way to express the last considered moment through lower order moments, that the closure is valid for all the ζ values, for example, by using the Padé approximation for R(ζ). Such a closure remains elusive for the ion-cyclotron resonances.
3) Advanced fluid models are not restricted to work with f (1) in the gyrotropic limit (502). In Landau fluid models of Passot & Sulem (2007), no assumption about the size of the gyroradius is made, and only the low-frequency condition is used and therefore, the f (1) of these fluid models contain Bessel functions J n (λ r ). The integrals over dv ⊥ are slightly more difficult, and for example if a term proportional to J 0 (λ)J 0 (λ)f 0 is encountered, the integration over dv ⊥ where the new parameter b (which should not be confused with the magnetic field unit vectorb) is Calculations like this lead to the functions Γ 0 (b) = e −b I 0 (b) and Γ 1 (b) = e −b I 1 (b). We note that the limit b → 0 yields Γ 0 (b) → 1 and Γ 1 (b) → 0.

Coulomb force & mirror force (Landau damping & transit-time damping)
The gyrotropic limit (502) has a very meaningful physical interpretation. To clearly understand what kind of forces are present in such a system, one needs to consider that a particle quickly gyrates around its slower moving center, called the "guiding center", and express the full velocity of a particle v as being composed of the quick gyration v gyro , and a motion of the guiding center, that is further decomposed to its free motion parallel to the magnetic field line v b , and all the possible drifts of the guiding center : the ExB drift u E , the grad-B drift, the curvature drift, the polarization drift etc. The plasma physics books by Fitzpatrick and Gurnett & Bhattacharjee (2005) have detailed introductions about single-particle motions in the presence of Lorentz force, where the drifts of the guiding center are calculated. Then one should follow the gyrofluid approach, and by performing integrals over dφ (gyro-averaging) and by expanding for example with respect to Larmor radius, one should get the "guiding center limit" of the Vlasov equation and the expression for f (1) . One should follow Kulsrud (1983);Snyder et al. (1997) etc. Very useful paper is also by Ferrière & André (2002), that explores the discrepancy between the usual CGL and the long-wavelength low-frequency kinetic theory in great detail and that we follow here.
Without going through the lengthy derivation, it can be shown that at the leading order it is sufficient to consider the motion of the guiding center with velocity where the perpendicular equation of motion satisfies the conservation of the magnetic moment and the parallel equation of motion satisfies where E =b · E and d/dt = ∂/∂t + v · ∇ = ∂/∂t + (v b + u E ) · ∇. The first term on the right hand side of the above equation is the Coulomb force, responsible for acceleration of particles along the magnetic field lines. The second term is the magnetic mirror force, responsible for trapping of particles in the magnetic bottle. The third term is just a force associated with the ExB drift of the gyrocenter. The similarity of the Coulomb force and the magnetic mirror force can be emphasized by using the scalar potential Φ and rewriting E = −∇Φ, which yields The similarity is immediately apparent, one just needs to replace the charge of the particle with its magnetic moment q → µ and replace Φ → |B|. Therefore, in a similar way as a charged particle reacts to electric field, a gyrating particle has a magnetic moment that reacts with the gradient of the strength (absolute value) of the magnetic field. The damping effects associated with the Coulomb force are called Landau damping. The damping effects associated with the mirror force are called transit-time damping. Therefore, it is often stated that the transit-time damping is a "magnetic analogue" of Landau damping. Often, the two effects are not separated since both represent the n = 0 particle resonance and one talks only about Landau damping. Nevertheless, it is emphasized that Landau fluid models in 3D geometry contain both damping mechanisms, and these models contain both the Coulomb force and the mirror force. 13 The equations of motion (512), (513) should be used in gyro-averaged Vlasov equation and the equation should be expanded f = f 0 + f (1) . We are interested only in linear solutions, and we can simplify. To avoid discussing compatibility conditions for f 0 (see Kulsrud (1983)), we can just simply claim, that f 0 does not have any time or spatial dependence. By further noticing that linearization of u Noticing that the ExB drift u E is always perpendicular to the direction of B (and also E) impliesb · u E = 0, and the last term in the dv /dt equation (513) rewrites which at the linear level disappears, since The magnetic mirror force contains Similarly, the dv ⊥ /dt equation (512) contains d|B|/dt =b · dB/dt that linearizes as d|B|/dt yielding the final expression for f (1) in real space which when Fourier transformed recovers the f (1) in the gyrotropic limit (502). Instead of fully linearized equations with B z , one can also work with |B|, i.e. one can write the leading order equations of motion as which yields analogous equations (521), (502) where B z is just replaced by |B|.

Kinetic moments for Bi-Maxwellian f 0
Since in the Vlasov expansion the gyrotropic f 0 was assumed to dependent only on v, i.e. f 0 (v 2 ⊥ , v 2 ) and be x, t independent, the fluid velocity u is removed from the distribution function and the "pure" bi-Maxwellian is where α = m or in the language of thermal speeds, We prefer the α notation instead of the thermal speed v th , since in long analytic calculations, there is a less chance of an error. We work without the species index r except for charge q r and mass m r . It is straightforward to calculate that Instead of E z , we will work with the scalar potential Φ as before The f (1) that we want to integrate reads or alternatively expressed with temperatures Now we want to calculate the linear "kinetic" moments over this distribution function. The kinetic moments are We have so far avoided integration in the cylindrical co-ordinate system, and all the previous integral were done in Cartesian co-coordinate system. In the cylindrical system, d 3 v = v ⊥ dv ⊥ dv dφ and the integral with respect to v ⊥ is from 0 to ∞. The Gaussian integrals are x 4 e −ax 2 dx = 3 8a 2 π a ; ∞ 0 x 5 e −ax 2 dx = 1 a 3 ; ∞ 0 x 6 e −ax 2 dx = 15 16a 3 π a ; ∞ 0 x 7 e −ax 2 dx = 3 a 4 .
Therefore, integrating over dv ⊥ dφ is straightforward and and similarly and these are all of integrals over dv ⊥ that are needed right now. The basic integrals (without singularity) calculate and each integral yields further 3 cases from (537) just by multiplying, so Let's remind ourselves how we did calculate integrals over dv Now it is easy to calculate the kinetic moments.

DENSITY
The density calculates so that the ratio and the final result reads

PARALLEL VELOCITY
The parallel velocity calculates so that

PARALLEL PRESSURE
The parallel pressure calculates The parallel heat flux calculates or alternatively

PERPENDICULAR HEAT FLUX
The perpendicular heat flux calculates or alternatively q (1) The perpendicular heat flux q ⊥ (similarly to the perpendicular temperature T (1) ⊥ ), is also not directly influenced by the Landau damping (∼ E z ), even though it is influenced by the transit-time damping (∼ B z ).

4TH-ORDER MOMENT R ⊥⊥
The 4th-order moment r ⊥⊥ calculates and the result is

4TH-ORDER MOMENT DEVIATION R ⊥⊥
The 4th-order moment deviation r (1) ⊥⊥ calculates (for example linearizing r ⊥⊥ = 2 mr p ⊥ T ⊥ + r ⊥⊥ with definitions r (0) or equivalently r (1) which yields r (1) This is an excellent news, since we will not have to consider closures for r ⊥⊥ . All the obtained kinetic moments are consistent for example with the much more general results of Passot & Sulem (2007), and can be verified by performing limit b ≡ k 2 ⊥ r 2 L → 0 in those expressions (note that for the potential Φ used here, a variable Ψ is used instead, that we also discuss in the last section of our Appendix).

Landau fluid closures in 3D
Let's separate the kinetic moments to two groups. The first group: And the second group: One immediately notices that the moments in the first group, are extremely similar to the moments we obtained in the simplified case of 1D geometry, where we neglected the transit-time damping, i.e. in the system (77)-(83). In fact, the system is completely the same, if the variable Bz Therefore there is nothing more we can do here, and all the discussion and closures from 1D geometry, applies here in 3D geometry to closures for q (1) and r (1) without any changes. So for example, and similarly for all the other closures that we considered in the 1D geometry. ⊥ contain only powers ζ and ζ 2 . On one hand, this is good news since the analytic calculations are simpler and we will explore all possible cases of closure very quickly. On the other hand, this means that we will be able to use only relatively low-order Padé approximants to R(ζ), implying that the closures will be less accurate.
To easily spot closures, it is perhaps beneficial to use and the moments read Before proceeding with Padé approximants, it is very beneficial to briefly consider the limit ζ ≪ 1, where the R(ζ) → 1. And a problem is immediately apparent. The quantities q (1) ⊥ and r (1) ⊥ are small and converge to zero, however, this is in general not true for the perpendicular temperature T (1) ⊥ , where the result depends on the temperature anisotropy ratio a p . With anisotropic mean temperatures (a p = 1), the quantity T (1) ⊥ will remain finite and will not converge to zero. The quantity T (1) ⊥ , at least as is written now, is therefore not suitable for construction of closures. Or in another words, the technique with Padé approximants of R(ζ) will not work, since the technique is based on matching the expressions for all ζ values. To consider closures, we have to separate this finite contribution, so that the Padé technique can be used, i.e. by writing and by moving the finite contribution to the left hand side Therefore, instead of looking for closures with T (1) ⊥ , we have to look for closures with a quantity that is proportional to the left hand side of this equation, that we call T ⊥ (T written with "mathcal" command in latex), and for clarity written with the full notation where on the left is the definition of the new quantity, and on the right is the kinetic moment that this new quantity satisfies. Only now we are ready to use the Padé approximants of R(ζ) and construct closures.

1-POLE CLOSURE
By using approximant R 1 (ζ), the moments calculate and the heat flux q (1) ⊥ can be directly expressed through T ⊥ according to and using full notation and transforming to real space Up to the replacement of B z with |B|, the closure is equivalent for example to eq. (40) of Snyder et al. (1997) (their thermal speeds do not contain the factors of 2). The closure is similar to the corresponding closure for the parallel heat flux (608), and for isotropic temperatures the term ∼ B z disappears. The closure is therefore very useful for understanding of the collisionless heat flux, however, the closure is not very accurate and for ζ ≫ 1, the heat flux (618) does not disappear and instead, converges to an asymptotic value. Alternatively, since later on, the normalization is always done with respect to parallel quantities and when the temperature T (1) ⊥ is expressed through the pressure and density, it is useful to note the difference between

2-POLE CLOSURES
Continuing with the R 2,0 (ζ) approximant, the moments calculate The r (1) ⊥ can be expressed through q (1) ⊥ and the closure reads or in real space The closure with R 2,0 (ζ) is naturally more precise that the closure with R 1 (ζ), and both q (1) ⊥ and r (1) ⊥ at least converge to zero for ζ ≫ 1. The closure is equivalent to eq. (35) of Snyder et al. (1997).
There are 3 another closures that can be constructed with R 2,0 (ζ), all of them time-dependent. The first one is obtained by searching for (ζ + α q )q 4.5.
⊥ is created to clearly see the possibilities of a closure. All the proportionality constants (including the minus signs) and including the common denominator of R(ζ), are suppressed here. The approximants R 2,1 , R 4,5 , R 6,9 , R 8,13 are marked with an asterisk "*", since these are not well-behaved approximants and are provided only for completeness, these approximants should be disregarded.

ACKNOWLEDGMENTS
We acknowledge support of the NSF EPSCoR RII-Track-1 Cooperative Agreement No. OIA-1655280 "Connecting the Plasma Universe to Plasma Technology in Alabama", led by Gary P. Zank. Anna Tenerani acknowledges support of the NASA Heliophysics Supporting Research Grant #80NSSC18K1211. PH thanks Thierry Passot, Monica Laurenza, Elena Khomenko, Manuel Collados Vera and Nikola Vitas for many useful discussions. Significant effort has been made to eliminate all the misprints from the equations. However, we will amend possible misprints, if found, in a corrigendum.
where as a minimum choice, we match the first asymptotic term by b 6 = −2a 4 , which defines R 6,0 (ζ). The procedure of matching with the asymptotic expansion yields step by step where the approximant R 6,9 (ζ) is not a good approximant (no imaginary part for real ζ), and is eliminated. Matching with the power series is performed according to . . .
As expected, the very simple approximant R 1 (ζ) is unprecise for larger values of ζ, and above ζ > 1, the ReR 1 (ζ) even has a wrong sign. Nevertheless, the approximant is still a good approximant for small ζ ≪ 1 values, and it is also very valuable from a theoretical perspective, since it is the only approximant that provides a quasi-static closure for the perpendicular heat flux q ⊥ . This has one important implication : If one renders the approximant R 1 as not satisfactory (which is true unless ζ ≪ 1 or at least ζ < 1), 3D simulations with fluid models that contain Landau damping can be only performed with time-dependent heat flux equations. All other approximants in Figure 1 perform reasonably well, and the most precise is R 3,0 (ζ), followed by R 3,1 (ζ). Figure 7 shows selected 4-pole and 5-pole approximants for which we were able to obtain closures. Unfortunately, approximants R 4,4 (ζ) and R 5,6 (ζ) show a bit unpleasant behavior, and the associated closures obtained with these approximants are therefore difficult to recommend, unless the considered domain is ζ ≪ 1 or ζ ≫ 1, or more specifically, at least ζ < 0.5 or ζ > 2. The behavior is not surprising, since approximants R 4,4 (ζ) and R 5,6 (ζ) have the maximum available number of poles devoted to the asymptotic expansion ζ ≫ 1, without being ill-posed. The closures are therefore specifically suitable for ζ ≫ 1 regime, for example in the low-temperature limit, or, in the high-frequency (actually high phase speed) limit (since ζ = ω |k |v th ). For R 4,4 (ζ), the corresponding closures are the quasi-static closure (348) and time-dependent closures (375), (377), (379). For R 5,6 (ζ), the corresponding closure is time-dependent (405) and naturally, this is the most precise closure in the ζ ≫ 1 regime, with precision o(ζ −8 ). Noticeably, the asymptotic precision is even better than the R 8,3 (ζ) approximant used in the WHAMP code, which has a precision o(ζ −5 ). 14 All other approximants in Figure 7 are very precise in the entire considered range of ζ. To clearly see the precision, it is useful to calculate relative errors Im R n,n ′ (ζ) − R(ζ) R(ζ) 100%; Re R n,n ′ (ζ) − R(ζ) R(ζ) 100%, which we define this way instead of for example Re(R n,n ′ (ζ) − R(ζ))/ReR(ζ), since the real part of R(ζ) is going through zero. The R 1 (ζ) approximant is excluded from the table since its relative error of the imaginary part increases with ζ. We omit if errors are positive or negative and the results are: error Im % 0.06 0.03 0.021 0.018 0.022 0.028 0.04 0.08 0.17 0.43 error Re % 0.06 0.03 0.022 0.020 0.020 0.027 0.04 0.07 0.17 0.46 The numbers of course do not reveal the entire story, since the maximum error can occur for different ζ values. For example, from the plots of ImR(ζ) in Figure 6, the approximant R 3,0 (ζ) captures the maximum (the peak around ζ ∼ 1) with much better accuracy than the approximant R 3,1 (ζ). However, according to the above table, the R 3,1 (ζ) appears to be more precise globally. The discrepancy is easily understood from Figure 8 , where % errors of both approximants are plotted with respect to ζ.

5th-order moment
Let's work in the 1D geometry and continue with the hierarchy. In Part 1 of this text, we called the n-th order moment X (n) . However, when linearizing, we want to use our (1) superscript as before. Therefore, here we move the (n) index of the n-th order moment down, and refer to the n-th moment simply as X n . The fifth-order moment X 5 = m (v − u) 5 f dv is linearized according to and direct calculation yields (dropping species index r everywhere except on charge q r )
Now, importantly, by using this equation, it is directly shown that the above static closures with X (1) 5 , are equivalent to time-dependent (dynamic) closures with r (1) obtained for the same R(ζ) approximants, closures (419), (395), (400), (405). The process can be viewed as a verification procedure. Indeed, it should be always possible to double check a dynamic closure, by calculating a static closure at the next moment with the same Padé approximant.
The closure has precision o(ζ 5 ), o(ζ −6 ). It was verified that the closure is reliable, i.e. it satisfies (452) once R(ζ) is replaced by R 6,4 (ζ). The closure is plotted in Figure 9 with orange line. Figure 9. Landau damping of the ion-acoustic mode, calculated with exact R(ζ) -black line; R4,2(ζ) -green line; R5,3(ζ)blue line; R6,4(ζ) -orange line; and R7,5(ζ) -red line. The solutions represent the most precise dynamic closures that can be constructed for the 3rd-order moment (heat flux), 4th-order moment, 5th-order moment, and 6th-order moment. It was analytically verified that all closures are "reliable", i.e. equivalent to the kinetic dispersion relation once R(ζ) is replaced by the associated R n,n ′ (ζ) approximant. The next most precise closure constructed for the 7th-order moment is R8,6(ζ), which is not plotted, but we checked that the solution is basically not distinguishable (by eye) from the exact R(ζ) solution. Figure shows that it is possible to reproduce Landau damping in the fluid framework to any desired precision.
The closure has precision o(ζ 6 ), o(ζ −7 ), and it was verified that the closure is reliable. The closure is plotted in Figure  9 with red line.

CONCLUSION
In general, for a given X n , the most precise (power series) closures are of course dynamic closures, and we have seen that for the 3rd-order moment it is R 4,2 (ζ), for the 4th-order moment it is R 5,3 (ζ), for the 5th-order moment it is R 6,4 (ζ), and for the 6th-order moment it is R 7,5 (ζ). Therefore, it is reasonable to make a conjecture that for an nth-order moment X n , the most precise closure will be constructed with approximant R n+1,n−1 (ζ).
The dynamic closures above are directly related to the most precise (power series) static closures that can be constructed, and we have seen that for the 3rd-order moment it is with approximant R 3,1 (ζ), for the 4th-order moment it is R 4,2 (ζ), for the 5th-order moment it is R 5,3 (ζ), and for the 6th-order moment it is R 6,4 (ζ), and therefore for an nth-order moment, it will be with approximant R n,n−2 (ζ). Regardless if dynamic or static closures are used, this implies that one can reproduce the (linear) Landau damping phenomenon in the fluid framework, to any desired precision, which establishes convergence of fluid and kinetic descriptions.
The convergence was shown here in 1D (electrostatic) geometry, by considering the long-wavelength low-frequency ion-acoustic mode. Nevertheless, the 1D closures have general validity, and are of course valid also for the Langmuir mode, that we considered in section 3.14. However, see the discussion about limitations of the Langmuir mode modeling at the end of that section, since the closures can become unstable for k λ D < 0.2, i.e. in the long-wavelength limit.
For a curious reader, the damping and real frequency of the Langmuir mode obtained with R 7,5 (ζ), are plotted in Figure 10. If one wants to pursue a proof of our conjecture, the general Landau integral with x n can be calculated, for example by considering separate cases for "n" being odd and even. The result can be expressed as x − x 0 dx = ζ n−1 R(ζ) + and it is valid for n ≥ 3. Alternatively, one could say that the result is valid for n ≥ 1 and that the sums are zero when the upper index is negative. One can write expressions for the general n-th moment X (1) n , and the moment is proportional to ζ n R(ζ). Therefore, considering static closures where the X (1) n is expressed through all the lower-order moments X (1) m ; m = 1 . . . n − 1 (for even moments the deviations X (1) have to be considered), it is obvious that the closure has to be achieved with n-th order Padé approximant of R(ζ). Similarly, considering dynamic closures where the ζX (1) n ∼ ζ n+1 R(ζ) is expressed through all the lower-order moments, the closure has to be achieved with (n+1)-th order Padé approximant of R(ζ). To finish the proof, one needs to show that the number of required asymptotic points corresponds to R n,n−2 (ζ) and R n+1,n−1 (ζ), and that such a closure is "reliable".
The next logical step would be to establish such analytic convergence of fluid and kinetic descriptions in 3D electromagnetic geometry in the gyrotropic limit. However, in 3D, for a given n-th order tensor X n , the number of its gyrotropic moments is equal to 1 + int[n/2] and increases with n. Therefore, it might be much more difficult to show the convergence in 3D, even though the convergence should still exist. and which allows us to calculate Or in another words, in cylindrical co-ordinate system the f 0 is φ independent and ∂f 0 /∂φ = 0, so that the velocity gradient 6.6. The general kinetic f (1) distribution (effects of non-gyrotropy) The calculation is actually not that difficult once the coordinate change is figured out, as elaborated in the plasma physics books by Stix, Swanson, Akheizer etc. In the general equation (15) the (1) quantities must be Fourier transformed according to and the equation (15) rewrites In the cylindrical coordinate system with velocity (2) and the wave-vector The integration is changed to be done with respect to variable The time t is a constant here and since dτ = −dt ′ , the integration reads which at time τ = 0 satisfies the initial condition. Now by straightforward calculation (and by using sin(a) cos(b) − cos(a) sin(b) = sin(a − b)), the exponential factor is transformed as so that The complicated expressions encountered in the kinetic dispersion relations originate in using identity There are two such exponents, and therefore the linear kinetic theory contains two independent summations, usually one through "n" and one through "m" (which should not be confused with mass), i.e.
where the Im(ω) > 0 requirement is obtained, because k , v , Ω are real numbers, n is an integer, and none of these can have an imaginary part. For the other 3 integrals we need if Im(a) > 0, and similarly if Im(a) > 0.
The 3 required integrals therefore calculate and all 3 results require Im(ω) > 0. This strictly appearing restriction is removed later by the analytic continuation, once the fluid integrals over the f (1) are calculated. We therefore managed to finish the integration of (777) along the unperturbed orbit and our latest full result for f (1) reads Obviously, the result is not very pretty, and we would like to pull somehow out the denominator ω − k v − nΩ from all the expressions, so that the "resonances" are grouped together. The trouble is the shifted (n − 1)Ω and (n + 1)Ω. However, all expressions are preceded by ∞ n=−∞ . It is therefore easy to shift the summation by one index, where terms that contain (n − 1)Ω require shift n = l + 1, and terms that contain (n + 1)Ω require shift n = l − 1. The transformation is easy to calculate, for example the terms proportional to E x transform as