Hostname: page-component-77f85d65b8-g4pgd Total loading time: 0 Render date: 2026-03-26T23:47:24.460Z Has data issue: false hasContentIssue false

GX: a GPU-native gyrokinetic turbulence code for tokamak and stellarator design

Published online by Cambridge University Press:  23 September 2024

N.R. Mandell*
Affiliation:
Princeton Plasma Physics Laboratory, Princeton, NJ 08543, USA Plasma Science and Fusion Center, Massachusetts Institute of Technology, Cambridge, MA, USA
W. Dorland
Affiliation:
University of Maryland, College Park, MD 20742, USA Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742, USA
I. Abel
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742, USA
R. Gaur
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742, USA Princeton University, Princeton, NJ 08543, USA
P. Kim
Affiliation:
University of Maryland, College Park, MD 20742, USA Princeton University, Princeton, NJ 08543, USA
M. Martin
Affiliation:
Thea Energy, Princeton, NJ 08542, USA
T. Qian
Affiliation:
Princeton University, Princeton, NJ 08543, USA
*
Email address for correspondence: noah.mandell@typeoneenergy.com

Abstract

GX is a code designed to solve the nonlinear gyrokinetic system for low-frequency turbulence in magnetized plasmas, particularly tokamaks and stellarators. In GX, our primary motivation and target is a fast gyrokinetic solver that can be used for fusion reactor design and optimization along with wide-ranging physics exploration. This has led to several code and algorithm design decisions, specifically chosen to prioritize time to solution. First, we have used a discretization algorithm that is pseudospectral in the entire phase space, including a Laguerre–Hermite pseudospectral formulation of velocity space, which allows for smooth interpolation between coarse gyrofluid-like resolutions and finer conventional gyrokinetic resolutions and efficient evaluation of a model collision operator. Additionally, we have built GX to natively target graphics processors (GPUs), which are among the fastest computational platforms available today. Finally, we have taken advantage of the reactor-relevant limit of small $\rho _*$ by using the radially local flux-tube approach. In this paper we present details about the gyrokinetic system and the numerical algorithms used in GX to solve the system. We then present several numerical benchmarks against established gyrokinetic codes in both tokamak and stellarator magnetic geometries to verify that GX correctly simulates gyrokinetic turbulence in the small $\rho _*$ limit. Moreover, we show that the convergence properties of the Laguerre–Hermite spectral velocity formulation are quite favourable for nonlinear problems of interest. Coupled with GPU acceleration, which we also investigate with scaling studies, this enables GX to be able to produce useful turbulence simulations in minutes on one (or a few) GPUs and higher fidelity results in a few hours using several GPUs. GX is open-source software that is ready for fusion reactor design studies.

Information

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press
Figure 0

Figure 1. Normalized growth rates (a) and real frequencies (b) as a function of the normalized binormal wavenumber $k_y{\rho _i}$ from GX (blue circles and dotted lines) and GS2 (yellow squares) for ‘Cyclone base case’ (CBC) parameters using a Boltzmann electron response.

Figure 1

Figure 2. Normalized growth rates (i) and real frequencies (ii) as a function of the normalized binormal wavenumber $k_y{\rho _i}$ from GX (blue circles and dotted lines), GS2 (yellow squares) and stella (green inverted triangles) for CBC parameters using a kinetic electron response in the electrostatic limit ($\beta _\mathrm {ref} = 10^{-5}$). Ion scales (ITG and TEM) are shown in (a) and electron scales (ETG) are shown in (b).

Figure 2

Figure 3. Normalized growth rates (a) and real frequencies (b) as a function of the reference beta $\beta _\mathrm {ref}$ from GX (blue circles and dotted lines) and GS2 (yellow squares) for CBC parameters using a kinetic electron response, taking $k_y \rho _i = 0.3$. Both codes show the transition from ITG instability at low $\beta _\mathrm {ref}$ to kinetic ballooning mode (KBM) instability at high $\beta _\mathrm {ref}$.

Figure 3

Figure 4. (a) Time traces of ion heat flux $Q_i$, normalized to gyro-Bohm units, $Q_{\rm GB}= n_i T_i v_{\rm ti}\rho _i^2/a^2$, for the CBC with a Boltzmann electron response from GX with moderate (blue) and coarse (green) velocity resolution, and GS2 (yellow). (b) Time-averaged gyro-Bohm-normalized ion heat flux, $Q_i/Q_{\rm GB}$, as a function of the normalized inverse ITG scale length, $a/L_{Ti}$.

Figure 4

Figure 5. Time traces of ion (solid) and electron (dashed) heat flux, normalized to ion gyro-Bohm units, for the CBC with kinetic electrons from GX with high velocity resolution (blue) and moderate velocity resolution (green) and GS2 (yellow).

Figure 5

Figure 6. Normalized growth rates (a) and real frequencies (b) as a function of the normalized binormal wavenumber $k_y$ from GX (blue circles and dotted lines) and stella (yellow squares) for W7-X bean flux-tube geometry and ITG parameters using a Boltzmann electron response.

Figure 6

Figure 7. Time traces of ion heat flux $Q_i$, normalized to gyro-Bohm units, $Q_{\rm GB}= n_i T_i v_{\rm ti}\rho _i^2/a^2$, for the W7-X bean flux-tube geometry with a Boltzmann electron response. Results are shown from GX with moderate (blue) and coarse (red) velocity resolution, stella (yellow) and GENE (green).

Figure 7

Table 1. Velocity-space convergence and performance for the nonlinear CBC with Boltzmann electrons. Here $N_m$ is the number of Hermite modes and $N_\ell$ is the number of Laguerre modes. Other resolution parameters were $N_x=192,\ N_y=64,\ N_z=24$. Accurate ion heat flux calculations can be obtained with resolution as coarse as $( N_\ell, N_m)\geq (4,6)$ (above the double bar). Each simulation was run to $t = 1000 a/v_{\rm ti}$, and we report the total wallclock time in minutes, the time per timestep in seconds, and the average timestep size $\langle \Delta t\rangle$ (normalized to $a/v_{\rm ti}$), which changes with $N_\ell$ and $N_m$ due to linear stability constraints. The number of NVIDIA A100 GPUs used for each calculation is listed in the final column. The resolutions shown in figure 4 are marked with a $\star$.

Figure 8

Figure 8. Laguerre (a) and Hermite (b) free energy spectra, $W_g$, for the nonlinear CBC with a Boltzmann electron response with varying velocity-space resolution. All the spectra are nearly identical at the scales that contribute dominantly to the heat flux (small $\ell$ and $m$), which is consistent with the excellent convergence observed in table 1.

Figure 9

Table 2. Velocity-space convergence and performance for the nonlinear CBC with kinetic electrons. Here $N_m$ is the number of Hermite modes and $N_\ell$ is the number of Laguerre modes. Other resolution parameters were $N_x=192,\ N_y=64,\ N_z=24$. Reasonably accurate heat flux calculations (within 15 % of the highest resolution case) can be obtained with resolution as coarse as$( N_\ell, N_m)\geq (4,16)$ (above the double bar). Each simulation was run to $t = 600 a/v_{\rm ti}$, and we report the total wallclock time in minutes, the time per timestep in seconds, and the average timestep size $\langle \Delta t\rangle$ (normalized to $a/v_{\rm ti}$). The number of NVIDIA A100 GPUs used for each calculation is also shown. The resolutions used for the GX calculations shown in figure 5 are marked with a $\star$.

Figure 10

Figure 9. Laguerre (i) and Hermite (ii) spectra for ions (a) and electrons (b) for the nonlinear CBC with kinetic electrons with varying velocity-space resolution.

Figure 11

Figure 10. Scaling of time per timestep (in seconds) versus the number of radial grid points, $N_x$, for nonlinear CBC with Boltzmann electrons on a single GPU. Ideal linear scaling is shown with the dashed line, while the dot–dashed line shows $N_x \log N_x$ scaling (the expected scaling when FFTs dominate the algorithm).

Figure 12

Figure 11. Strong scaling study for the nonlinear CBC with kinetic electrons showing the time per timestep (a) and the scaling efficiency (b) as a function of number of GPUs used, with fixed resolution for each curve. In each case the ideal scaling is shown with a dashed black line. The Laguerre–Hermite resolution parameters are listed in the legend as $(N_\ell, N_m)$. The other resolution parameters are $N_x=192,\ N_y=64, N_z=24,\ N_\mathrm {species}=2$.

Figure 13

Figure 12. Weak scaling study for the nonlinear CBC with kinetic electrons showing the time per timestep (a) and the scaling efficiency (b) as a function of number of GPUs used, with the number of Hermite modes $N_m$ scaling with the number of GPUs as $N_m=8N_\mathrm {gpu}$. We show results for both $N_\ell =4$ (blue) and $N_\ell =8$ (yellow). The ideal scaling is shown with dashed lines. The other resolution parameters are $N_x=192,\ N_y=64,\ N_z={24}, N_\mathrm {species}=2$.

Figure 14

Table 3. Fundamental normalizing quantities.

Figure 15

Table 4. Normalizing the quantities for the independent variables.

Figure 16

Table 5. Normalizing quantities for dependent variables.

Figure 17

Figure 13. Time-averaged gyro-Bohm-normalized ion (solid blue circles) and electron (dashed yellow squares) heat fluxes as a function of $A\Delta t$, with $A$ the scaling factor for the damping filter and $\Delta t$ the timestep size.

Figure 18

Figure 14. Time-averaged gyro-Bohm-normalized ion (solid) and electron (dashed) heat fluxes as a function of $z_\mathrm {width}/L_z$, with $z_\mathrm {width}$ the width of the damping region and $L_z$ the parallel length of the flux tube. Colours indicate different values of parallel resolution $N_z$.