Optimization of nonlinear turbulence in stellarators

We present new stellarator equilibria that have been optimized for reduced turbulent transport using nonlinear gyrokinetic simulations within the optimization loop. The optimization routine involves coupling the pseudo-spectral GPU-native gyrokinetic code GX with the stellarator equilibrium and optimization code DESC. Since using GX allows for fast nonlinear simulations, we directly optimize for reduced nonlinear heat fluxes. To handle the noisy heat flux traces returned by these simulations, we employ the simultaneous perturbation stochastic approximation (SPSA) method that only uses two objective function evaluations for a simple estimate of the gradient. We show several examples that optimize for both reduced heat fluxes and good quasi-symmetry as a proxy for low neoclassical transport. Finally, we run full transport simulations using the T3D stellarator transport code to evaluate the changes in the macroscopic profiles.


Introduction
Stellarators are one of the most promising designs for magnetic confinement fusion as they are inherently steady-state and are less susceptible to current-driven instabilities (Helander 2014).However, stellarators have historically been plagued by large collisionally-induced losses of particles and energy, associated with cross-field drifts of particles caused by the inhomogeneity and curvature of the magnetic field.In principle, stellarators can be optimized to reduce these collisional losses.In practice, impressive improvements in plasma confinement have been obtained.For example, experiments at the W7-X stellarator have shown greatly reduced neoclassical transport (Beidler et al. 2021).Furthermore, advances in stellarator optimization techniques have led to designs with precise quasisymmetry (QS) (Landreman & Paul 2022;Landreman et al. 2022) and quasi-isodynamicity (QI) (Goodman et al. 2022).
After neoclassical losses are minimized, confinement is limited by the anomalous transport of heat and particles by turbulence.This turbulence is driven by plasma microinstabilities on length scales of the gyroradius.For example, the ion-temperature P. Kim et al. gradient (ITG) instability is believed to be a major cause for ion-temperature clamping in W7-X, preventing heating of ions above 2 keV (Beurskens et al. 2021) (along with weak energy exchange between electrons and ions).While there have been recent attempts to optimize stellarators to reduce turbulence-induced transport, they have mainly relied on proxies based solely on the magnetic geometry (Mynick et al. 2010;Proll et al. 2016;Roberg-Clark et al. 2022) or linear simulations (Jorge et al. 2023).However, linear physics may not accurately predict nonlinear saturation mechanisms that ultimately determine the rate of heat and particle loss (McKinney et al. 2019).Unfortunately, using nonlinear analysis for optimization is usually very challenging.Gyrokinetics (Antonsen Jr. & Lane 1980;Catto 1978;Frieman & Chen 1982) is one of the most commonly-used models to study turbulence in magnetic confinement fusion devices, and is also the model used for this paper.Typical nonlinear gyrokinetics simulations usually require hundreds to thousands of CPU hours, making them infeasible to use within an optimization loop.
In this work, we demonstrate the ability to reduce turbulent losses by optimizing stellarator configurations using nonlinear turbulence simulations directly rather than relying on proxies.In order to run nonlinear simulations inside the optimization loop, we use the new GPU-native gyrokinetic code GX (Mandell et al. 2018(Mandell et al. , 2022)).GX utilizes pseudo-spectral methods in velocity space.GPU acceleration combined with flexible velocity resolution allows for nonlinear GX simulations that only take minutes to run.For this work, we focus on ITG turbulence, which contributes to major energy losses that limit plasma confinement, and so is one of the most important microinstabilities to consider for reactor design (Kotschenreuther et al. 1995b;Horton 1999;Helander et al. 2013).To that end, we will be running electrostatic simulations with a Boltzmann (adiabatic) response assumed for the electrons.
The optimization is performed using the stellarator equilibrium and optimization code DESC (Dudt & Kolemen 2020;Panici et al. 2023;Conlin et al. 2023;Dudt et al. 2023).DESC also uses pseudo-spectral methods and directly solves the ideal MHD forcebalance equation to compute the magnetic equilibrium.The quantities computed from the resulting magnetic fields are used as input for GX.Stochastic optimization methods are used to robustly handle the noisy landscape of turbulent heat fluxes.
The paper is organized as follows.The stochastic optimization method used in this work is described in Section 2. Results are shown in Section 3, with analysis on the effects of global magnetic shear for reduced turbulence explored in Section 4. Transport simulations using the T3D stellarator transport code (Qian et al. 2022) are shown in Section 5. Finally, the conclusions follow in Section 6.

Optimization Methods
In this optimization routine, we seek to minimize the nonlinear heat flux returned by GX.Specifically, GX returns as output the time-trace of the heat flux (normalized to gyro-Bohm units).An example of some heat traces is shown in Figure 5a.At the beginning of the simulation, linear growth of the fastest-growing instability dominates.However, eventually nonlinear effects cause the heat flux to decrease and saturate to a statistical steady-state.We use the time-average of this steady-state flux as our heat flux objective f Q .To take the time-average, we take the second half of the time-trace and compute the weighted Birkoff average To run these simulations, GX requires a set of geometric quantities that can be computed from numerical equilibria.Given that the magnetic field is written in Clebsch form B = ∇ψ × ∇α, the set of quantities needed are where b = B/B, α is the straight field line label, κ = b • ∇b is the curvature, and z is some coordinate representing the distance along the field line.For these simulations, the geometric toroidal angle ϕ is used for z.All of these quantities are easily computed using utility functions in DESC.
One issue with directly minimizing nonlinear heat fluxes is that their time-traces are often very noisy.Even the resulting time averages are usually very noisy in parameter space.This can be seen in Fig. 1 showing the time-averaged nonlinear heat flux when scanning over the Z m,n = Z 0,−1 boundary mode of the initial equilibrium used in this study.Optimizers designed for smooth objectives may easily get stuck in local minima and make very little progress.To help with this issue, we use the simultaneous perturbation stochastic approximation (SPSA) method (Spall 1987) to minimize the heat fluxes.In this algorithm, the i th component of the gradient at point x n is approximated as where c n is a random perturbation vector whose components are sampled from a Rademacher distribution, and k is a finite-difference step-size.Therefore, we are effectively using the finite difference method in all directions at once.Finally, for this method, rather than specifying stopping tolerances, we instead specify a maximum number of iterations.
The main feature of the SPSA method is that it only requires two objective function measurements per iteration.This makes it suitable for high-dimensional optimization problems whose objective functions are expensive to compute.SPSA can also robustly handle noisy objective functions, and so has been used for simulation optimization, including Monte-Carlo simulations (Chan et al. 2003).
In order to still use automatic differentiation and smooth optimization methods for other objectives like quasisymmetry residuals, we split the optimization routine into two parts.We first minimize the turbulent heat flux using stochastic gradient descent.Next, we minimize the quasisymmetry residuals using a least-squares method that utilizes automatic differentiation.This order is chosen arbitrarily, but the reverse ordering can also work.The two objectives are thus (2.4) where f Q is the heat flux from GX, and A is the aspect ratio.For these optimizations we target an aspect ratio of A target = 8.We use the two-term quasisymmetry objective, where a magnetic field is quasisymmetric if the quantity is a flux function.Computationally, we evaluate the equivalent form at several different flux surfaces and try to minimize the resulting values.For this study, we always choose flux surfaces at ρ = 0.6, 0.8, and 1.We target quasi-helical (QH) symmetry, so that the helicity is (M, N ) = (1, 4).
The GX simulation parameters used in the optimization loop and for post-processing, along with their justification, are in Appendix B.
Finally, the optimization routine is performed in stages.Each stage increments the maximum boundary Fourier mode being optimized over.For example in the first stage only boundary modes with modes m, n satisfying |m| ⩽ 1 and |n| ⩽ 1 are used as optimization variables.In the next stage boundary modes with |m| ⩽ 2 and |n| ⩽ 2 are used.The m = 0, n = 0 mode is excluded to prevent the major radius from changing.This reduces the number of optimization variables at the beginning of the optimization and "warm-starts" each successive stage.This type of method has been used with great success for previous stellarator optimization results, including for optimizing quasisymmetry (Landreman & Paul 2022).After each stage, we start the next stage with an equilibrium chosen from the previous stage that had achieved both low heat flux and low quasisymmetry error.No strict rule is used, and the choice depended on several factors like the value of the gradient estimate, and the ι profile.

Turbulence Optimization
As an initial test of our stochastic optimization method, we begin by solely optimizing for turbulence.The optimization routine is the same as described in the previous section, except instead of optimizing for both quasisymmetry and aspect ratio in the second half of each iteration, we only optimize for the target aspect ratio.Usually, combining the turbulence and aspect ratio objectives as is done in the first part is insufficient to reach our target aspect ratio.We have to force the optimizer to take more aggressive steps due to the simulation noise, which makes it more difficult to maintain the desired aspect ratio.For this optimization (and all of the examples in this work), the initial equilibrium is an approximately quasi-helically symmetric equilibrium with an aspect ratio of 8 and 4 field-periods.
Figure 2 shows the normalized nonlinear heat flux across each iteration of the optimization process.Due to the stochastic nature of optimization routine, the initial gradient estimates are very poor, leading to an increase in the heat flux in the first several iterations.However, as the optimization continued, there is eventually a rapid decrease in the heat flux as the gradient approximation begins to track the true gradient.The optimizer continues to steadily decrease the heat flux for most of the remaining iterations.Note that since we specify a maximum number of iterations rather than stopping tolerances, the optimization ends even when the heat flux had increased slightly at the end.
A scan of the nonlinear heat fluxes across ρ and optimized cross-sections of the flux surfaces are shown in Fig. 3.For this simulation (and all of the simulations in this section), the resolution is increased to the values in Table 3.As seen in those plots, some surfaces see moderate to drastic improvements to the nonlinear heat flux.In particular, the ρ = 0.4 surfaces has a reduction of about an order of magnitude.However, other surfaces see much less improvement, such as at ρ = 0.2 and ρ = 0.8.It is not completely well-understood why that is, but it could be related to the fact that only the ρ = √ 0.5 surface is being optimized.This does, reveal a potential weakness in the optimization strategy that will be addressed in future work.Nevertheless, the following sections show P.Kim et al. much more uniform improvement.Interestingly, based off of the surface boundary plots, the cross-sections of the optimized equilibria seem much less strongly-shaped than in the initial equilibrium.Instead, the magnetic axis has significantly more torsion.
The contours of |B| in Boozer coordinates are plotted in Fig. 4. Surprisingly, the contours seem to resemble those from quasi-isodynamic equilibria despite not including a QI term in the objective functions.This may be related to the fact that it seems like the optimizer favored a high mirror ratio.Additionally, it is well-known that QI stellarators can be optimized to have the maximum-J property, which has numerous benefits including improved confinement of fast ions, neoclassical confinement of thermal ions (Sánchez et al. 2023;Velasco et al. 2023) and enhanced stabilization against trapped-electron modes (TEM) (Proll et al. 2012;Helander et al. 2013;Helander 2014).It's been further theorized and shown in gyrokinetic simulations of W7-X that possessing the maximum-J property can also be beneficial for ITG turbulence as well (Proll et al. 2022).However, the TEM studies required modeling the full kinetic electron dynamics.In this study, we only use a Boltzmann response for the electrons.

Combined Turbulence-Quasisymmetry Optimization
Next, we include the two-term quasisymmetry objective in the second part of the optimization loop.The final heat flux traces and optimized cross-sections of the flux surfaces are shown in Fig. 5.The cross sections in Fig. 5b show relatively modest changes in the shape of optimized stellarator.However, the time-trace (simulated at the ψ/ψ b = 0.5 surface) in Fig. 5a shows about a factor of 3 decrease in the nonlinear heat flux.
To ensure that the nonlinear heat flux was reduced throughout the plasma volume, we again ran simulations at radial locations of ρ = 0.2, 0.4, 0.6, and 0.8.The resulting heat fluxes as well as the maximum symmetry breaking residuals across ρ are plotted in Fig. 6.As seen in the plots, despite optimizing only for the ρ = √ 0.5 surface, the heat flux is reduced throughout the plasma volume.With regard to quasisymmetry, for ρ < 0.5, the optimized equilibrium has smaller maximum symmetry-breaking modes than the initial equilibrium.However, this reverses in the outer surfaces.This is unexpected considering that the quasisymmetry residuals were computed at ρ = 0.6, 0.8 and 1.Nevertheless, the degree of symmetry-breaking is still comparable to WISTELL-A (Bader et al. 2020), another optimized QH stellarator.Indeed, the |B| plots in Fig. 7 show contours characteristic of a quasi-helically symmetric stellarator (plotted at the ψ = 0.5 surface).
Unlike in tokamaks, different field lines in stellarators experience different curvatures, magnetic fields, etc. and so may have very different fluxes (Dewar & Glasser 1983;Faber et al. 2015).Therefore, we also run simulations across α at the radial location ρ = √ 0.5.The resulting scan is shown in Fig. 8a.At each α simulated, the optimized stellarator P.Kim et al.It is interesting to see that, for the initial geometry, there are large variations in the heat flux.This might indicate that some flux tubes are too short to sample enough area on the flux surface or that there is some coupling between the flux tubes.However, it should be noted that the ι of this equilibrium (plotted in Fig. 9) is close to 5/4.Recent work has shown a very large variation across α on low-order rational surfaces (Buller et al. 2023).In comparison, the optimized equilibrium has an ι that is slightly above 0.9.This could have negative impacts on the optimization routine as the optimizer may choose to approach equilibria with ι near low-order rationals.This could then lead to not only misleading heat fluxes, but also poor MHD stability.More work is needed to investigate the limitations of flux-tube codes, the effects of rational surfaces, and the resulting consequences for optimization.In a fusion reactor, since the transport is very stiff, the temperature profiles may evolve to approach the critical temperature gradient (the temperature gradient at which the heat flux is zero).Therefore, a decrease in the heat flux is less useful if the critical temperature gradient also decreases significantly.To verify that the critical temperature gradient did not decrease, we also ran several nonlinear simulations with different temperature gradients at ρ = √ 0.5.The results are shown in Fig. 8b which shows that the critical temperature gradient does not seem to change significantly.It would be more beneficial if the critical temperature gradient had also increased.Specifically targeting the critical temperature gradient will be the focus of future work.
We also run linear simulations for the initial and optimized equilibria, with the growth rates shown in Fig. 11a   k x = 0. On the other hand, it has lower growth rates at lower k y , and one would expect that the nonlinear heat flux scales like γ/⟨k 2 ⊥ ⟩ (where the angle brackets indicate a flux-surface average) (Mariani et al. 2018).Therefore, this might result in the observed lower nonlinear heat fluxes.However, this trend reverses for the simulations at k x = 0. Furthermore, recent work has shown that both peak growth rates and the quasilinear γ/⟨k 2 ⊥ ⟩ estimate are both poor proxies for the nonlinear heat flux (Buller et al. 2023).As a preliminary example, we replicate the plots from Fig. 10 but with a quasi-linear estimate of the form where ⟨k 2 ⊥ ⟩ is Here, ϕ is the electrostatic potential, √ g is the Jacobian, and the integration is along the simulated field line.This expression is similar to the one used in Mariani et al. (2018).
The resulting plots are shown in Fig. 11.We have enforced that min(f QL ) = 0, which assumes that only growing modes contribute to the flux.As seen in the plots, while the optimized configuration has a smaller f QL at smaller k y at k x = 0, it has a much larger f QL at k x = 0.4.This makes it challenging to use such a quasilinear estimate of the heat flux as a proxy, necessitating our approach of directly computing the nonlinear heat flux.Finally, we compare the optimized stellarator to another configuration solely optimized for quasisymmetry.The plot of the heat fluxes across different surfaces is shown in Fig 12 .The heat fluxes for the precise QH equilibrium are higher than those from the approximately QH equilibrium used as the initial point.This shows that just optimizing for quasisymmetry can be detrimental for turbulence, and stresses the need to optimize for both.

Optimization with Multiple field lines
As described in the previous section, several physical and computational factors can lead to large variations in the heat fluxes at different α.To avoid this issue, we rerun the optimization loop but simulate on field lines of both α = 0 and α = πι/4.This makes the new objective function Figure 12: The heat fluxes across ρ for a precise QH equilibrium (red) and the turbulence optimized equilibrium (blue).
This new objective function serves both as a way of reducing the heat flux across multiple field lines while also testing our stochastic optimizer against additional objectives.Similar situations include trying to optimize across different flux surfaces, different temperature/density gradients, etc.
The plots in Fig. 13 show the cross-sections, maximum symmetry breaking modes, and heat fluxes across ρ and α for the initial equilibrium and the final optimized equilibria after optimizing at one and two field lines.While this new equilibrium has higher heat fluxes than when optimizing for just a single field line, it instead has a lower maximum symmetry breaking modes and better quasisymmetry.
The scans across multiple α in Fig. 13d show about a 50% variation in the heat flux compared to the α = 0 point.Unfortunately, this is larger than the approximately 30% variation in the single-field line case, and comparable to the relative variation in the initial equilibrium.Nevertheless, the heat fluxes across each α are still 2-3x smaller than P.Kim et al. in the initial equilibrium and this shows that the stochastic optimizer is still effective with additional terms in the objective function.More investigation is needed to more effectively and precisely optimize with multiple field lines.

Fluid Approximation
In this final case, we change the simulation parameters within the optimization loop to approach the fluid limit in GX.This is based off of previous work that showed a fluid approximation with only a few velocity moments can accurately match the true heat fluxes at higher velocity resolution at moderate collision frequencies (Buck et al. 2022).That study was motivated by a recently developed three-field model for the density, temperature, and momentum to approximate growth rates and nonlinear heat fluxes (Hegna et al. 2018).
To approximate this model, we reduce the velocity resolution to 4 Hermite moments and 2 Laguerre moments, as in the gyrofluid model.While normally this requires closure relations like in (Beer 1995;Mandell et al. 2018), we instead increase the collisionality to damp the high wavenumber modes and increase the temperature gradient to a/L T = 5.We employ a Dougherty collision operator (Dougherty 1964).
The plots of the maximum-symmetry breaking modes and the heat flux across ρ for the kinetic and fluid cases are shown in Fig. 14.While the new equilibrium does not achieve lower heat fluxes than in the kinetic case, it still achieves about a factor of 2 reduction compared to the initial equilibrium except at ρ = 0.8 despite using very different physical parameters.The fluid case retains comparable levels of quasisymmetry as well.These results indicate that the optimization routine is robust against varying levels of fidelity.This opens new possibilities of potentially varying the fidelity across iterations.This can either decrease the computational cost further or allow us to increase other resolution parameters.

Effects of Magnetic Shear on Reduced Turbulence
It is well-known since Greene & Chance (1981) that sufficient positive global magnetic shear can be destabilizing for intermediate to high wavenumber instabilities like the ballooning mode.An analogous relationship for transport was investigated in Kotschenreuther et al. (1995a), where it was found that while increased shear can reduce thermal transport, it can also negatively impact the critical gradient.
To further investigate these effects, we optimized for several more precisely quasisymmetric equilibria, but with different target shears.The nonlinear heat flux traces are shown in Fig. 15.The shear = 0.01 case is when there is no shear target.While increasing the shear slightly does reduce the heat flux, continuing to increase it eventually increases the heat flux again, consistent with the findings in Kotschenreuther et al. (1995a).It is somewhat surprising that the heat flux is this sensitive to the shear since the shear is very small.For comparison, the global magnetic shear for W7-X in these units is approximately 0.16.Future work will better address the relationship between shear and turbulence in stellarators.This relationship can be very important as very low global magnetic shear is characteristic of vacuum quasisymmetric stellarators (Landreman & Paul 2022).While the shear increases for finite β QS stellarators (Landreman et al. 2022), it is still small compared to shear in tokamaks.

Transport Simulations
To study how the changes in nonlinear heat fluxes affected the macroscopic profiles, we use the T3D transport code (Qian et al. 2022).T3D uses the same algorithm as Trinity (Barnes et al. 2010), but is written in Python and adapted to work with stellarator geometries.T3D takes advantage of the separation of scales in the local δf gyrokinetic model where the distribution function is written as F = F 0s + δf s .It can be shown that F 0s is a Maxwellian and it is assumed that the perturbation δf scales like δf ∼ ϵF 0s where ϵ = ρ/a.The Maxwellian F 0s evolves slower than the perturbation ∂F 0s /∂t ∼ ϵ 2 ∂δf s /∂t.This allows for T3D to evolve the macroscopic density, pressure, and temperature profiles using heat and particle fluxes computed by gyrokinetic codes like GX.
We run T3D simulations for the initial approximately QH equilibrium and optimized equilibrium from Section 3.2.Both equilibria were scaled to the same major radius and on-axis magnetic field as W7-X.An adiabatic response is still assumed for the electrons, and neoclassical contributions are ignored to isolate the effects from turbulence.The plots of the final steady-state temperature profiles for both equilibria are shown in Figure 16.
From the plot of the temperature profiles, there is a clear increase in the temperature for the optimized equilibrium.The temperature at the innermost simulated point increased by approximately 10%.This seems lower than expected considering that the optimized equilibrium had lower heat fluxes by about a factor of 3. The true heat flux scales like , where Q sim is the heat flux from the simulation.Since v th ∼ T 1/2 and ρ ∼ v th , the heat flux scales like Q ∼ Q sim T 5/2 .For equivalent sources, T ∼ Q 2/5 sim .Therefore, we would expect the temperature to increase by a factor of Q 0.4 ∼ 1.55.However, the simulations from Section 3.2 assumed a temperature gradient of a/L T i = 3.From the plot of the temperature gradients, most points are simulated with lower gradients, and the equilibria have the same critical temperature gradient (as seen in Figure 8b).Nevertheless, the improvement is still very encouraging.Future work will investigate using simulations to optimize for a higher critical gradient or for specific target profiles.

Conclusions
In this work, we directly optimized stellarators for reduced nonlinear heat fluxes and good quasisymmetry by coupling GX and DESC.By directly running nonlinear simulations we include the nonlinear saturation mechanisms that determine the steady-state heat flux, and so avoid potential limitations of linear or quasilinear models.The SPSA method is used to handle the noisy heat flux objective and to cheaply estimate the gradient.The newly optimized equilibria show factors of 2-4 improvement in the nonlinear heat flux across several flux surfaces and multiple field lines while also having comparable or improved quasisymmetry.We also investigated how global magnetic shear affects the nonlinear heat flux and found that slightly increasing it can significantly reduce the flux.However, increasing the shear too much led to increases in the heat flux.Our T3D simulations showed that by reducing the heat fluxes, the steady-state temperature profiles did increase slightly.Unfortunately, the transport is sufficiently stiff so that macroscopic profiles will approach the critical gradient.Since the initial and optimized equilibria have similar critical gradients, this limits the improvements in the temperature profile.Nevertheless, these results demonstrate that we can efficiently include nonlinear gyrokinetic simulations within the optimization loop.Future work will include adding an objective for MHD stability, optimizing for the nonlinear critical gradient and also directly optimize for desired macroscopic profiles.More improvements will also be made to the SPSA optimizer to make it more effective near minima.Currently, near the minima the true gradient is small enough to be comparable to the simulation noise.This leads to poorer convergence near the minima, and so requires more sophisticated optimization methods.Consequently, we will implement other optimization methods that have been effective for stochastic optimization, such as Bayesian optimization.DESC then solves for the spectral coefficients R lmn , Z lmn , and λ lmn that minimize this force error.DESC also utilizes automatic differentiation through JAX (Bradbury et al. 2018) in its optimization routines.Briefly, in automatic differentiation, the chain rule is applied through the code, allowing DESC to compute exact derivatives faster and more accurately than derivatives from finite differencing routines.However, at the time of this writing this requires writing the objective function in native Python, and GX is written in C++/CUDA.Furthermore, it is unclear how effective it would be to differentiate through the timestepping of such a noisy system.Therefore, this would require extensive code developments for both DESC and GX, and so AD will not be used when optimizing for reduced turbulence.For the GX simulations within the optimization loop, we use the simulation parameters listed in Table 1.

Appendix B. Simulation Parameters used in Optimization
The number of poloidal turns, parallel resolution, and number of simulated modes (proportional to n x and n y ) are chosen to enable cheaper simulations.However, one poloidal turn has been used in previous nonlinear W7-X gyrokinetic benchmark studies (González-Jerez et al. 2022a).Furthermore, from a quasilinear estimate the heat flux   et al. 2018).Consequently, higher wave number modes contribute less to the heat flux.The gradients are typical for experimental profiles for W7-X (Beurskens et al. 2021) and have been previously used for benchmark, transport, and optimization studies (González-Jerez et al. 2022b;Bañón Navarro et al. 2023;Roberg-Clark et al. 2022).Finally, it's been demonstrated that GX simulations can yield accurate heat flux traces using the specified number of Hermite and Laguerre polynomials (Mandell et al. 2022).To check our results, the post-processing simulations are run at higher resolution.

B.2. Fluid Approximation Optimization
When running the optimization at fluid resolution, the simulation parameters are identical except those listed in Table 2.The velocity resolution is decreased to that of the gyrofluid model, and the collision frequency is increased.Since there is far greater dissipation in the model, we also increase the temperature gradient.

B.3. Post-processing Simulations
To ensure our final results are well-converged, we increase the resolution of the postprocessing simulations to those listed in Table 3.The scans over ρ, α, and a/L T of course include additional values.

B.4. DESC Equilibria Parameters
The parameters used for the equilibria are shown in this table.These are considered "moderate" resolution, and are typical values used for optimization.This is a vacuum case, so both the pressure and current are zero.When optimized for precise QH symmetry, the resulting equilibrium is similar to the QH equilibirum in (Landreman & Paul 2022).

B.5. Optimization Timings
The table below shows the approximate timings of different parts of the optimization loop using a single Tesla V100 GPU on the Princeton Traverse cluster.All values are in minutes.In total, each optimization completed in about 20 hours.It takes slightly more  than 12 hours on a single NVIDIA A100 GPU on the NERSC Perlmutter cluster.The "optimization" timings are for a single iteration.
P.Kim et al.

Figure 1 :
Figure 1: The time-averaged nonlinear heat flux computed by GX when scanning across the Z 0,−1 boundary mode.

Figure 2 :
Figure 2: The normalized heat flux across each iteration.The dashed lines represent an increase in the maximum boundary mode number being optimized.

Figure 3 :
Figure 3: (a) A scan of the nonlinear heat flux for the initial (red) and optimized (blue) equilibria across different flux surfaces.(b) The cross-sections of the boundary magnetic flux surface of the initial (red) and optimized (blue) configurations.The star in (a) indicates the ρ = √ 0.5 surface that was chosen for the optimization loop.

Figure 4 :Figure 5 :
Figure 4: The |B| contours in Boozer coordinates, which resemble contours of a QI equilibrium.

Figure 6 :
Figure 6: Scans of the nonlinear heat flux (a) and maximum symmetry breaking modes (b) across different radial locations.The star in (a) indicates the ρ = √ 0.5 surface that was chosen for the optimization loop.

Figure 7 :
Figure 7: The |B| contours in Boozer coordinates for the initial (a) and final (b) equilibria.

Figure 8 :
Figure 8: The heat fluxes across different field lines α (a) and a/L T (b) for the initial (red) and optimized (blue) configurations.The star indicates the α = 0 field line and the a/L T = 3 temperature gradient that were chosen for the optimization loop.

Figure 9 :
Figure 9: The ι profiles for the initial (red) and optimized (blue) equilibria.
at k x = 0 and Fig. 11b at k x = 0.4.Although the optimized equilibria have lower heat fluxes, they have a significantly higher peak growth rate at P.Kim et al.

Figure 10 :
Figure 10: The linear growth rates across different k y at k x = 0 (a) and k x = 0.4 (b)

Figure 11 :
Figure 11: The quasilinear estimate for the initial (red) and optimized (blue) configurations at k x = 0 (left) and k x = 0.4 (right).

Figure 13 :
Figure 13: (a) The cross sections for the initial equilibrium (red), the single field line optimized equilibrium (blue)from Section 3.2, and the new two field line optimized equilibrium.(b) The maximum symmetry-breaking modes for all three equilibria and WISTELL-A.(c) The heat flux scans across ρ.(d) The heat flux scans across α.The stars in (c) and (d) indicate parameters that were chosen for optimization.

Figure 14 :
Figure 14: The maximum symmetry-breaking modes (a) and the heat flux scan across ρ (b) for the initial equilibrium, the kinetic case from Section 3.2, and the fluid case (as well as WISTELL-A for the left plot).The star in (b) indicates the ρ = √ 0.5 surface that was chosen for the optimization loop.
P.Kim et al.

Figure 15 :
Figure 15: Nonlinear heat flux traces for quasisymmetric equilibria with different shear targets.

Figure 16 :
Figure 16: The final temperature (left) and temperature gradient (right) profiles for the initial and optimized equilibria.

Table 1 :
P.Kim et al. as part of P.K.'s Summer 2022 DOE SULI internship under award number DE-AC02-09CH11466.N.R.M was supported by the DOE Fusion Energy Sciences Postdoctoral Research Program administered by the Oak Ridge Institute for Science and Education (ORISE) for the DOE via Oak Ridge Associated Universities (ORAU) under DOE contract number DE-SC0014664, and by the Laboratory Directed Research and Development Program of the Princeton Plasma Physics Laboratory under U.S. Department of Energy contract number DE-AC02-09CH11466.R J is supported by the Portuguese FCT-Fundação para a Ciência e Tecnologia, under Grant 2021.02213.CEECIND.This work has been carried out within the framework of the EUROfusion Consortium, funded by the European Union via the Euratom Research and Training Programme (Grant Agreement No 101052200 -EUROfusion).Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission.Neither the European Union nor the European Commission can be held responsible for them.Instituto Superior Técnico activities also received financial support from FCT through Projects UIDB/50010/2020 and UIDP/50010/2020.Computations were performed on the Traverse and Stellar clusters at Princeton/PPPL as well as the Perlmutter cluster at NERSC.P.Kim et al.The simulation parameters used for the GX simulations within the optimization loop for Sections 3.1-3.3.

Table 2 :
The simulation parameters used for the GX simulations within the optimization loop for the fluid case in Section 3.4 scales like 1/⟨k 2 ⊥ ⟩ (where the angle brackets indicate a flux-surface average) (Mariani

Table 3 :
The simulation parameters used for the GX simulations for post-processing.

Table 4 :
The resolution parameters for the DESC equilibria in this study.