Hostname: page-component-7dd5485656-zlgnt Total loading time: 0 Render date: 2025-10-23T04:13:27.423Z Has data issue: false hasContentIssue false

An adjoint-based method for optimising MHD equilibria against the infinite-n, ideal ballooning mode

Published online by Cambridge University Press:  31 October 2023

Rahul Gaur*
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park 20740, MD, USA
Stefan Buller
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park 20740, MD, USA
Maximilian E. Ruth
Affiliation:
Center for Applied Mathematics, Cornell University, Ithaca 14850, NY, USA
Matt Landreman
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park 20740, MD, USA
Ian G. Abel
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park 20740, MD, USA
William D. Dorland
Affiliation:
Institute for Research in Electronics and Applied Physics, University of Maryland, College Park 20740, MD, USA Department of Physics, University of Maryland, College Park 20740, MD, USA
*
Email address for correspondence: rgaur@umd.edu
Rights & Permissions [Opens in a new window]

Abstract

We demonstrate a fast adjoint-based method to optimise tokamak and stellarator equilibria against a pressure-driven instability known as the infinite-$n$ ideal ballooning mode. We present three finite-$\beta$ (the ratio of thermal to magnetic pressure) equilibria: one tokamak equilibrium and two stellarator equilibria that are unstable against the ballooning mode. Using the self-adjoint property of ideal magnetohydrodynamics, we construct a technique to rapidly calculate the change in the eigenvalue, a measure of ideal ballooning instability. Using the SIMSOPT optimisation framework, we then implement our fast adjoint gradient-based optimiser to minimise the eigenvalue and find stable equilibria for each of the three originally unstable equilibria.

Information

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

1. Introduction

Magnetic confinement is currently considered the most promising way (Baalrud et al. Reference Baalrud, Ferraro, Garrison, Howard, Kuranz, Sarff, Scime and Solomon2020) to achieve the United States’ goal of building a pilot fusion power plant that generates net electricity before 2040 (National Academies of Sciences et al. 2021). Most advanced fusion reactor designs today are based on two main designs that use magnetic confinement: tokamaks and stellarators. These devices work by using strong magnetic fields to keep a hot, dense plasma at their centre. The main difference between tokamaks and stellarators lies in their geometric shape. Tokamaks are symmetric about a fixed axis, whereas stellarators are not. Due to the difference in axisymmetry, the tokamak equilibria are two-dimensional (2D) axisymmetric and the stellarator equilibria are three-dimensional (3D).

For a fixed magnetic field strength, the power density $P$ of a fusion device scales as $\beta ^2$, where $\beta$ is the ratio of the plasma pressure to the magnetic pressure. Since current tokamaks and stellarators are low-$\beta$ devices, one way to improve the efficiency of these devices is to increase the operating $\beta$. However, doing so creates a large pressure gradient from the centre to the edge of the device, which is a source of a variety of pressure-driven, magnetohydrodynamic (MHD) and kinetic instabilities. An important category of pressure-driven instabilities are the high-$n$ ballooning modes.

In tokamaks, there have been numerous studies that have used the ideal ballooning mode to determine the plasma beta limit. This led to the development of codes such as EPED (Snyder et al. Reference Snyder, Burrell, Wilson, Chu, Fenstermacher, Leonard, Moyer, Osborne, Umansky and West2007) that can determine edge pressure profile with reasonable accuracy. In stellarators, the ideal ballooning mode might not cause a significant loss of energy, but Tang, Connor & Hastie (Reference Tang, Connor and Hastie1980) and Aleynikova et al. (Reference Aleynikova, Zocco, Xanthopoulos, Helander and Nührenberg2018) have shown that it is related to a kinetic instability known as the kinetic ballooning mode (KBM). Recent articles have shown the detrimental effects of KBM turbulence on stellarators for finite beta values (Aleynikova et al. Reference Aleynikova, Zocco, Xanthopoulos, Helander and Nührenberg2018; McKinney et al. Reference McKinney, Pueschel, Faber, Hegna, Ishizawa and Terry2021). The KBM is similar to the ideal ballooning mode, albeit with additional kinetic effects. Therefore, the ideal ballooning mode could be used as a proxy for KBM stability.

Numerous studies have been conducted to optimise tokamaks (Miller & Moore Reference Miller and Moore1979; Bernard & Moore Reference Bernard and Moore1981) and stellarators (Sanchez et al. Reference Sanchez, Hirshman, Ware, Berry and Spong2000a; Gates et al. Reference Gates, Boozer, Brown, Breslau, Curreli, Landreman, Lazerson, Lore, Mynick and Neilson2017) against ideal MHD instabilities. However, such calculations can be computationally costly and time consuming. In this paper, we use the self-adjoint property of ideal MHD to devise an adjoint-based method that speeds up the optimisation of 2D and 3D equilibria against the infinite-$n$, ideal ballooning mode. Adjoint-based methods have been extensively used for aeronautical design (Giles & Pierce Reference Giles and Pierce2000) and recently in the context of stellarator optimisation (see Paul, Landreman & Antonsen (Reference Paul, Landreman and Antonsen2021) and references therein). Using this technique, we can speed up the process of finding equilibria that are stable against the ballooning mode.

The remainder of this paper is divided as follows: in § 2, we briefly describe the fundamentals of a general 3D MHD equilibrium and follow it with the details of the VMEC equilibrium solver (Hirshman & Whitson Reference Hirshman and Whitson1983) in § 2.1. Using VMEC, we obtain and present the details of one 2D axisymmetric equilibrium in § 2.2 and two 3D equilibria in§§ 2.3 and 2.4. In § 3, we present the physical, mathematical and numerical details used to solve the infinite-$n$, ideal ballooning equation. We then analyse the susceptibility of the chosen local equilibria to the ideal ballooning instability. In § 3.3, we explain the self-adjoint property of the ideal ballooning equation. We briefly explain how the ballooning eigenvalue can be used as a proxy to stabilise the equilibria against the KBM. Using the self-adjoint technique, we formulate an adjoint method which we elucidate and test in § 4. In § 5, we present the details of the overall optimisation process and present an adjoint-based optimiser using the SIMSOPT (Landreman et al. Reference Landreman, Medasani, Wechsung, Giuliani, Jorge and Zhu2021) framework. In the penultimate section, we present our results, comparing the optimised stable equilibria with their initial, unstable counterparts. Finally, in § 7 we summarise our work and discuss possible ways in which it can be extended.

2. Ideal MHD equilibrium

In this section, we start with the general form of a 3D, divergence-free magnetic field. We use this form to represent the magnetic field in tokamaks and stellarators. After that, we briefly describe the steady-state, ideal MHD, force-balance equation. In § 2.1, we explain the details of solving the ideal MHD force-balance equation using the VMEC code. Finally, we present the details of three MHD equilibria in §§ 2.22.4 that we will use throughout this study.

A divergence-free magnetic field $\boldsymbol {B}$ can be written in the Clebsch form (D'haeseleer et al. Reference D'haeseleer, Hitchon, Callen and Shohet2012)

(2.1)\begin{equation} \boldsymbol{B} = \boldsymbol{\nabla}\alpha_{t} \times \boldsymbol{\nabla} \psi_{p}. \end{equation}

The form (2.1) is generally used for tokamak equilibria. For stellarators, we use the following equivalent representation

(2.2)\begin{equation} \boldsymbol{B} = \boldsymbol{\nabla} \psi \times \boldsymbol{\nabla}\alpha_{s}. \end{equation}

We focus on solutions whose magnetic field lines lie on closed nested toroidal surfaces, known as flux surfaces. For tokamaks, we label the flux surfaces with their enclosed poloidal flux $\psi _{p}$ whereas for stellarators, we use the enclosed toroidal flux $\psi$. On each flux surface, lines of constant $\alpha _{t}$ and $\alpha _{s}$ coincide with the magnetic field lines in tokamaks and stellarators, respectively. Thus, the variables $\alpha _{t}$ and $\alpha _{s}$ are known as field line labels.

To facilitate the calculation of various physical quantities from a general equilibrium solver, we use multiple coordinate systems. We use the right-handed cylindrical coordinate system $(R, \zeta, Z)$ where $R$ and $Z$ are the radial and vertical distances from the origin and $\zeta$ is the azimuthal angle around the symmetry axis. We also define a curvilinear coordinate system comprising the $\mathrm {PEST}$ coordinates $(\psi _{p}, \zeta, \theta )$ where $\psi _{p}$ is the flux surface label, $\zeta$ is the cylindrical azimuthal angle and $\theta$ is the ‘straight-field-line’ poloidal angle (D'haeseleer et al. Reference D'haeseleer, Hitchon, Callen and Shohet2012) such that $\alpha _{t} = \zeta - q(\psi _{p})( \theta - \theta _0)$. Similarly, for 3D equilibria, we use the coordinate system $(\psi, \zeta, \theta )$ and $\alpha _{s} = \theta - \iota (\psi )(\zeta - \zeta _0)$. The pitch of the magnetic field line on a flux surface is described by the safety factor

(2.3)\begin{equation} q(\psi) = \frac{1}{\iota(\psi)} \equiv \frac{{\rm d} \psi}{{\rm d} \psi_{p}} = \frac{1}{(2{\rm \pi})^2} \oint \,{\rm d}\zeta \oint \,{\rm d}\theta \frac{\boldsymbol{B}\boldsymbol{\cdot} \boldsymbol{\nabla}\zeta}{\boldsymbol{B}\boldsymbol{\cdot} \boldsymbol{\nabla} \theta}, \end{equation}

where $\iota$, the rotational transform, is the inverse of the safety factor. Using (2.1) or (2.2) for the magnetic field, one has to solve the steady-state ideal MHD force balance equation in the absence of a background flow

(2.4)\begin{equation} \boldsymbol{j} \times \boldsymbol{B} = \boldsymbol{\nabla} p, \end{equation}

where $p$ is the plasma pressure, and $\boldsymbol {j}$ is the plasma current given by the Ampere's law

(2.5)\begin{equation} \mu_0 \boldsymbol{j} = \boldsymbol{\nabla} \times \boldsymbol{B}, \end{equation}

where $\mu _0$ is the coefficient of permeability in vacuum. For axisymmetric equilibria, simplifying (2.4) yields the Grad–Shafranov equation (Shafranov Reference Shafranov1957; Grad & Rubin Reference Grad and Rubin1958). For a general 3D equilibrium, such an equation does not exist. However, we can solve for the axisymmetric and 3D cases using a general numerical equilibrium solver. In the following section, we explain how we use the numerical solver VMECFootnote 1 to obtain both 2D axisymmetric and 3D equilibria.

2.1. Numerical equilibrium solver

We generate numerical equilibria using the 3D equilibrium code VMEC (Hirshman & Whitson Reference Hirshman and Whitson1983). The code works by minimising the integral

(2.6)\begin{equation} W = \int\left(\frac{p}{\gamma -1} + \frac{B^2}{2\mu_0}\right) \,{\rm d}V, \end{equation}

subject to multiple geometric constraints (Kruskal & Kulsrud Reference Kruskal and Kulsrud1958). For our study, we used the fixed-boundary mode of VMEC. The fixed-boundary mode takes the shape of the boundary surface denoted by the cylindrical coordinates $R_{b}$ and $Z_{b}$ in terms of the Fourier-decomposed poloidal ($\varTheta$) and toroidal ($\zeta$) modes

(2.7)\begin{equation} \left. \begin{gathered} R_{b} = \sum_{n}\sum_{m} \hat{R}_{b}(m, n) \exp({\rm i}(m \varTheta - n \zeta)), \\ Z_{b} = \sum_{n}\sum_{m} \hat{Z}_{b}(m, n) \exp({\rm i}(m \varTheta - n \zeta)), \end{gathered} \right\} \end{equation}

where $m$ and $n$ are integers. We also provide VMEC with the coefficients of the polynomials representing the global radial pressure $p(s)$ and the rotational transform $\iota (s)$ as a function of the normalised toroidal flux $s$, and the total toroidal or poloidal flux enclosed by the boundary. The poloidal angle $\varTheta$ used by VMEC is related to the straight field line $\theta$ by the following equation

(2.8)\begin{equation} \varTheta = \theta + \varLambda, \end{equation}

where

(2.9)\begin{equation} \varLambda = \sum_{n}\sum_{m} \hat{\varLambda}(m, n) \exp({\rm i}(m \varTheta - n \zeta)). \end{equation}

For a boundary shape, pressure, rotational transform and enclosed toroidal flux, it solves for the flux surfaces. To achieve this, it uses a steepest descent method to minimise the energy integral in (2.6) on each surface for fixed $p$, $\iota$, and enclosed flux subject to various topological constraints imposed by the ideal MHD. In a more compact form, VMEC solves

(2.10)\begin{equation} \min_{R, Z, \varLambda} W[R, Z, \varLambda; p, \iota, \psi(s=1)], \quad \text{s.t. } R(s=1) = R_{b}, Z(s=1) = Z_{b}. \end{equation}

After running the code, we obtain the shape of the flux surfaces, the magnetic field and a set of important physical quantities. The characteristic physical quantities that we use in this work are defined as follows.

  1. (i) The total enclosed toroidal flux by the boundary $\psi _{b} = \int \,\textrm {d}V \boldsymbol {B}\boldsymbol {\cdot }\boldsymbol {\nabla } \zeta$.

  2. (ii) The normalising magnetic field $B_{N} = \psi _{b}/({\rm \pi} a_{N}^2)$ where $a_{N} = \sqrt {\mathcal {A}_{b}/{\rm \pi} }$ is the effective minor radius and $\mathcal {A}_{b}$ is the average area enclosed by the boundary (averaged over $\zeta$).

  3. (iii) The ratio of the total plasma pressure to the magnetic pressure on the magnetic axis $\beta _{\mathrm {ax}} = 2 \mu _0 p(s=0)/B_{N}^2$.

  4. (iv) The aspect ratio $A$ and the normalised minor radius $a_{N}$ of the device.

  5. (v) The radius of curvature of the boundary $R_c(\theta ) = ({\textrm {d}^2\!R}/{\textrm {d}Z^2})/(1 + ({\textrm {d}R}/{\textrm {d}Z})^2)^{3/2}$ where $R$ and $Z$ are the cylindrical coordinates used to parametrise the boundary.

  6. (vi) The volume-averaged, normalised plasma pressure $\langle \beta \rangle = \int \,\textrm {d}V p/\int \,\textrm {d}V B^2$ where $\textrm {d}V$ is the differential volume element.

  7. (vii) The total enclosed toroidal current $j_{\zeta }$ = $\lvert \int \,\textrm {d}V\, (\boldsymbol {j}\boldsymbol {\cdot } \boldsymbol {\nabla } \zeta )\rvert$.

  8. (viii) The volume-averaged magnetic field $\langle B \rangle = \int \,\textrm {d}V B/\sqrt {V}$ where $\textrm {d}V$ is the differential volume element.

  9. (ix) The mean rotational transform $\bar {\iota } = \int \,\textrm {d}s \, \iota /\int \,\textrm {d}s$.

Using VMEC, we generate three equilibria: an axisymmetric equilibrium with a DIII-D-like boundary shape and two 3D equilibria, modified NCSX- and modified Henneberg-QA. In the following sections, we provide important details for each of these equilibria.

2.2. 2D axisymmetric equilibrium

In this study, the first equilibrium that we choose is a high-$\beta$, axisymmetric, DIII-D-like equilibrium with a negative triangularity boundary: a boundary that looks like an inverted-D. Negative triangularity equilibria have generally been found to have enhanced confinement Marinoni et al. (Reference Marinoni, Austin, Hyatt, Walker, Candy, Chrystal, Lasnier, McKee, Odstrčil and Petty2019) and stability against edge localised modes. We then choose a negative triangularity equilibrium from Gaur et al. (Reference Gaur, Abel, Dickinson and Dorland2023) where it is shown to be unstable against the ideal ballooning mode. With this equilibrium as an initial state, we run our ideal ballooning stability optimisation to find a stable equilibrium while maintaining a boundary shape with negative triangularity. The input pressure, the rotational transform and the boundary shape profile for this equilibrium are shown in figure 1.

Figure 1. Plots of the inputs to the VMEC code for the DIII-D-like design: (a) the pressure, (b) the rotational transform as a function of the normalised toroidal flux $s$ and (c) the cross section of the boundary.

Using these inputs, we run VMEC to obtain the global MHD equilibrium. Our optimiser sometimes finds solutions that meet the ideal ballooning stability constraints in a trivial fashion. For example, the optimiser may give us a large aspect ratio or decrease the minor radius causing the volume averaged $B$ to increase, causing the $\beta$ to decrease. To avoid these trivial solutions, we have to impose additional constraints on important characteristic physical quantities to prevent them from changing significantly. The values of the relevant physical quantities for this equilibrium are provided in table 1.

Table 1. Relevant physical quantities for the DIII-D-like equilibrium.

2.3. Modified NCSX equilibrium

The second equilibrium we select is the 3D equilibrium for the NCSX design (Zarnstorff et al. Reference Zarnstorff, Berry, Brooks, Fredrickson, Fu, Hirshman, Hudson, Ku, Lazarus and Mikkelsen2001; Fu et al. Reference Fu, Isaev, Ku, Mikhailov, Redi, Sanchez, Subbotin, Cooper, Hirshman and Monticello2007). This equilibrium is designed to have a hidden symmetry known as quasisymmetry (Boozer Reference Boozer1983; Garren & Boozer Reference Garren and Boozer1991) where the strength of the magnetic field $|\boldsymbol {B}|$ does not change along a special coordinate. QuasisymmetryFootnote 2 is a useful property because it ensures orbit confinement, which helps improve energetic particle confinement, a major issue in stellarators. The pressure, rotational transform and boundary shape profile for this equilibrium are shown in figure 2.

Figure 2. Plots of the inputs to the VMEC code for the modified NCSX design: (a) the pressure, (b) the rotational transform as a function of the normalised toroidal flux $s$ and (c) the cross section of the boundary. Note the large negative shear until $s \approx 0.85$.

The values of equilibrium-dependent quantities output by VMEC are presented in table 2.

Table 2. Important physical quantities for the modified NCSX equilibrium.

2.4. Modified Henneberg-QA

The final equilibrium we study is the modified Henneberg-QA design (Henneberg et al. Reference Henneberg, Drevlak, Nührenberg, Beidler, Turkin, Loizu and Helander2019). This equilibrium is also designed to have quasisymmetry for a wide variety of pressure profiles. The pressure, rotational transform and boundary shape profile for this equilibrium are shown in figure 3.

Figure 3. Plots of the inputs to the VMEC code for the modified Henneberg-QA design: (a) the pressure, (b) the rotational transform as a function of the normalised toroidal flux $s$ and (c) the cross section of the boundary.

For reasons explained in the previous section, we impose additional constraints on some of the physical quantities. The values of these equilibrium-dependent parameters that we use as constraints in § 6 are presented in table 3.

Table 3. Relevant physical quantities for the modified Henneberg-QA design.

In the following section, we describe the ideal ballooning stability and analyse these equilibria by measuring their instability against the ideal ballooning mode.

3. The infinite-$n$ ideal ballooning mode

In this section, we present the details of the infinite-$n$ ideal ballooning mode. In § 3.1, we briefly describe its theoretical foundation and mathematical formulation. In § 3.2, we present a numerical technique used to solve the ideal ballooning equation. In § 3.3, we describe the mathematical properties that we use to formulate an adjoint-based method and accelerate optimisation against the ideal ballooning mode. In the final section, we describe how optimisation against the ideal ballooning mode can speed up optimisation against an electromagnetic mode seen in kinetic plasma turbulence, known as the KBM.

3.1. Physical and mathematical description

This work involves a detailed analysis of three equilibria against an important MHD instability, the infinite-$n$ ideal ballooning instability (Connor, Hastie & Taylor Reference Connor, Hastie and Taylor1979; Dewar & Glasser Reference Dewar and Glasser1983): a field-aligned, pressure-driven Alfvén wave that grows when the destabilising pressure gradient in the region of ‘bad’ curvature exceeds the stabilising effect of field-line bending. Using magnetic field unit vector $\boldsymbol {b} = \boldsymbol {B}/B$, the region of ‘bad’ curvature is defined as a region of a flux surface where $(\boldsymbol {b}\boldsymbol {\cdot }\boldsymbol {\nabla } \boldsymbol {b}) \boldsymbol {\cdot } \boldsymbol {\nabla } p > 0$.

The ideal ballooning equation

(3.1) \begin{align} \frac{1}{\mathcal{J}}\frac{\partial}{\partial \theta}\left( \frac{\lvert \boldsymbol{\nabla}\alpha_{t} \rvert^2 }{\mathcal{J}\, B^2} \frac{\partial \hat{X}}{\partial \theta}\right) + 2 \frac{1}{B^4} \frac{{\rm d}(\mu_0 p)}{{\rm d}\psi}\left[ \boldsymbol{B}\times\boldsymbol{\nabla}\left(\mu_0 p + \frac{B^2}{2}\right)\boldsymbol{\cdot}\boldsymbol{\nabla}{\alpha_{t}}\right] \hat{X} ={-}\rho \omega^2 \frac{\lvert\boldsymbol{\nabla}\alpha_{t}\rvert^2}{B^2} \hat{X}, \end{align}

is a second-order eigenvalue differential equation that calculates the perturbation $\hat {X}(\theta )$ along the ballooning coordinate $\theta$ and its eigenvalue (or the square of the growth rate) $-\omega ^2$. In (3.1), $\rho$ is the plasma mass density, $\mathcal {J} = (\boldsymbol {B}\boldsymbol {\cdot } \boldsymbol {\nabla } \theta )^{-1}$ and the rest of the terms are defined in § 2. This equation is solved subject to the following condition on the eigenfunction

(3.2)\begin{equation} \lim_{\theta \rightarrow \pm \infty} \hat{X}(\theta; \psi, \alpha_{t}, \theta_0) = 0, \end{equation}

where $\theta _0$ is the ballooning parameter.Footnote 3 Details of the derivation of the ideal ballooning equation are given in Connor et al. (Reference Connor, Hastie and Taylor1979) and Dewar & Glasser (Reference Dewar and Glasser1983).

The ballooning equation balances the stabilising field-line bending term and the destabilising pressure gradient with the inertia of the resulting Alfvén wave, oscillating with a frequency $\omega$. Note that (3.1) depends on $\psi$ (or $\psi _{p}$) as a parameter, and we can compute the coefficients from the equilibrium quantities on each surface. Before solving (3.1) numerically, we normalise and write the ballooning equation on a field line (fixed $\alpha _{t}$) as

(3.3)\begin{equation} \frac{{\rm d}}{{\rm d}\theta}{g} \frac{{\rm d} \hat{X}}{{\rm d}\theta} + {c} \hat{X} = \hat{\lambda} {f} \hat{X}, \end{equation}

where

(3.4) \begin{equation} \left. \begin{gathered} {g} = (\boldsymbol{b}\boldsymbol{\cdot}\boldsymbol{\nabla}_{N} \theta) \frac{\lvert \boldsymbol{\nabla}_{N}\alpha_{t} \rvert^2}{B/B_{N}},\\ {c} = \frac{2}{(\boldsymbol{b}\boldsymbol{\cdot}\boldsymbol{\nabla}_{N} \theta)}\, \frac{1}{(B/B_{N})^{4}}\, \frac{{\rm d} (\mu_0 p/B_{N}^2)}{{\rm d}\psi_{N}}\left[\boldsymbol{b} \times \boldsymbol{\nabla}_{N} \left(\frac{2\mu_0 p + B^2}{2 B_{N}^2}\right) \boldsymbol{\cdot} \boldsymbol{\nabla}_{N} \alpha_{t} \right],\\ {f} = \frac{1}{(\boldsymbol{b}\boldsymbol{\cdot}\boldsymbol{\nabla}_{N} \theta)}\, \frac{\lvert \boldsymbol{\nabla}_{N} \alpha_{t} \rvert ^2}{(B/B_{N})^3},\\ \hat{\lambda} ={-}\left(\frac{\omega a_{N}}{v_{A}}\right)^2,\quad v_{A} = \frac{B_{N}}{\sqrt{4{\rm \pi} \rho}}, \end{gathered} \right\} \end{equation}

where $v_{A}$ is the Alfvén speed and the values and definitions of the effective minor radius $a_{N}$ and the normalising magnetic field $B_{N}$ are the normalising length and magnetic field strength, respectively, defined in § 2.1. The normalised gradient $\boldsymbol {\nabla }_{N} = a_{N}\boldsymbol {\nabla }$. In the next section, we present the numerical procedure used to solve the ideal ballooning equation.

3.2. Numerical implementation and eigenvalues of the selected equilibria

In this section, we briefly discuss the numerical technique used to solve the ballooning equation (3.3). Our numerical technique is virtually identical to that used by Sanchez et al. (Reference Sanchez, Hirshman, Whitson and Ware2000b) in their ballooning solver COBRAVMEC. After briefly explaining the details of our solver, we present the maximum eigenvalue as a function of the normalised toroidal flux $s$ for the three equilibria we presented in § 2.

The ideal ballooning equation is a second-order ordinary differential equation with real-valued coefficients. This eigenvalue equation is discretised using a second-order accurate, central-finite-difference scheme

(3.5)\begin{equation} {g}_{j+1/2}\frac{(\hat{X}_{j+1} - \hat{X}_{j})}{ \Delta \theta^2} - {g}_{j-1/2}\frac{(\hat{X}_{j} - \hat{X}_{j-1})}{\Delta \theta^2} + ({c}_j - \hat{\lambda} {f}_j)\hat{X}_j = 0, \quad j = 0\ldots N-1 \end{equation}

subject to the boundary conditions

(3.6)\begin{equation} \hat{X}_0 = 0,\quad \hat{X}_{N} = 0, \end{equation}

where $N$ is an odd number of uniformly spaced points in the ballooning space, $\theta _{b}$ is a finite user-selected value that determines the extent of the eigenfunction such that $\theta _j \in [-\theta _{b}, \theta _{b}]$ and $\Delta \theta = \theta _{j+1} - \theta _{j}$. First-order derivatives are evaluated at half points $j-1/2, j-3/2$ instead of grid points to ensure numerical stability. For a fixed poloidal and toroidal resolution, the time taken by our solver is proportional to $\theta _{b}$. Therefore, it is important to find the right balance between speed and accuracy. Throughout this work, we chose the domain limit $\theta _{b} = 5{\rm \pi}$ for all calculations, as we found it to be a sufficient range to capture the ballooning eigenfunction. We observed that the values $\theta _{b} > 5{\rm \pi}$ made a relatively small difference from the value obtained of $\hat {\lambda }$. The discrete set of (3.5) is written in the form of a matrix equation

(3.7)\begin{equation} \boldsymbol{\mathsf{A}} \hat{X} = \hat{\lambda} \hat{X}, \end{equation}

where the exact matrix $\boldsymbol{\mathsf{A}}$ is provided in Appendix A. We then solve (3.7) to find the largest eigenvalue using an Arnoldi iterative scheme using the scipy.linalg.eigs solver in Python and refine the accuracy of the largest eigenvalue in the grid spacing $\Delta \theta$ using variational refinement

(3.8)\begin{equation} \hat{\lambda} = \frac{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left({c} \lvert \hat{X}\rvert^2 - {g} \left\lvert \dfrac{{\rm d}\!\hat{X}}{{\rm d}\theta}\right\rvert^2 \right)}{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta\, {f}\lvert \hat{X}\rvert^2}, \end{equation}

where the derivative $\textrm {d}\hat {X}/\textrm {d}\theta$ is calculated using a fourth-order accurate finite-difference scheme and the integral is performed using a fourth-order accurate Simpson's rule ($1/3$ rule) with scipy.integrate.simps.Footnote 4 Note that we only solve for and refine the largest eigenvalue of (3.3) and not the entire eigenvalue spectrum.

3.3. Properties of the ideal ballooning equation

The ideal ballooning (3.3) is a linear equation that can be written as

(3.9)\begin{equation} \mathcal{L}\hat{X} = \hat{\lambda} \hat{X}, \end{equation}

where the linear operator

(3.10)\begin{equation} \mathcal{L} \equiv \frac{1}{f}\frac{{\rm d}}{{\rm d}\theta}{g}\frac{{\rm d}}{{\rm d}\theta} + \frac{c}{f}, \end{equation}

and the coefficients ${g, c, f}$ are real-valued functions along a field line. Mathematically, the solutions of (3.9) form the basis for the Hilbert space equipped with the following inner product

(3.11)\begin{equation} \langle \hat{X}_1, \hat{X}_2 \rangle = \int_{-\infty}^{\infty} \,{\rm d}\theta \hat{X}_1^{*} \hat{X}_2 , \end{equation}

and are square integrable, i.e.$\langle \hat {X}, \hat {X} \rangle < \infty$. Owing to the self-adjoint nature of ideal MHD (Freidberg Reference Freidberg2014), for solutions $\hat {X}_1$ and $\hat {X}_2$ of (3.9) the operator $\mathcal {L}$ satisfies the following property:

(3.12)\begin{equation} \langle \mathcal{L} \hat{X}_1, \hat{X}_2 \rangle = \langle \hat{X}_1, \mathcal{L}\hat{X}_2 \rangle, \end{equation}

where we have used the boundary condition $\lim _{\theta \rightarrow \pm \infty } \hat {X}_1 = \lim _{\theta \rightarrow \pm \infty } \hat {X}_2 = 0$. Using (3.12), one can show that all eigenvalues $\hat {\lambda }$ (3.1) will be real numbers. Therefore, $\omega = \pm \textrm {i}\sqrt {\hat {\lambda }}$ will be purely real, an oscillating mode, or purely imaginary, a growing mode. We refer to oscillating modes as stable and to growing modes as unstable. We use this powerful property in § 4 to formulate an adjoint method, a technique that can speed up the calculation of the gradient of $\hat {\lambda }_{\mathrm {max}}$ on each flux surface.

3.4. Relation to the KBM

In this section, we explain how the ideal ballooning equation is directly related to an important mode of gyrokinetic plasma turbulence known as the KBM. Unlike ideal MHD which is a fluid theory, a single fluid with properties that vary in configuration space, a gyrokinetic model takes into account the distribution of different ion and electron species in both configuration and velocity space. Using a gyrokinetic model, Tang et al. (Reference Tang, Connor and Hastie1980) and Aleynikova et al. (Reference Aleynikova, Zocco, Xanthopoulos, Helander and Nührenberg2018) have shown that for devices with a large aspect ratio, modes with wavenumbers $k_y \ll 1/\rho _{i}$, where $\rho _i = e B/m_{i}$ is the ion gyroradius and $m_{i}$ is the ion mass, the gyrokinetic model can be reduced to the ideal ballooning equation with corrections that depend on $k_y\rho _{i}$:

(3.13)\begin{equation} \frac{{\rm d}}{{\rm d}\theta}{g} \frac{{\rm d} \hat{X}}{{\rm d}\theta} + {c} \hat{X} = \omega (\omega_{*,s} - \omega) {f} \hat{X}, \end{equation}

where

(3.14)\begin{equation} \omega_{*,s} = c_{0} \, k_y \rho_{i}, \end{equation}

where $c_0$ is a constant on a flux surface and $k_y \rho _{i}$ is the normalised wavenumber of the mode in the long wavelength limit, i.e.$k_y \rho _i \rightarrow 0$, we recover the ideal ballooning equation exactly. this means that an ideal ballooning unstable mode is also unstable to the KBM. In fact, the KBM is an ideal ballooning mode with kinetic effects.

Using a simple mixing-length argument, one can qualitatively argue that the turbulence heat flux diffusion is

(3.15)\begin{equation} D \sim \frac{\mathrm{Im}(\omega)}{k_y^2}, \end{equation}

where $\mathrm {Im}(\omega )$ is the imaginary part of $\omega$ also called the growth rate. This implies that low-wavenumber turbulence has the highest rate of diffusion and leads to poor plasma confinement. Hence, even if ideal ballooning unstable modes do not lead to disruption in stellarators, they could lead to a large heat flux transport through the KBM channel. Since calculating KBM growth rates using a microstability code is expensive, one can use ideal ballooning stability as a necessary condition for KBM stability to optimise against the KBM.Footnote 5 For tokamaks, this is one of the fundamental ideas currently used in the EPED code (Snyder et al. Reference Snyder, Burrell, Wilson, Chu, Fenstermacher, Leonard, Moyer, Osborne, Umansky and West2007) to predict the plasma pressure profile in the pedestal region.

In summary, in this section, we have explained the mathematical, physical and numerical methods used to solve the ideal ballooning equation. We have also explained the self-adjoint property of the ideal ballooning equation and the crucial link between the ideal ballooning mode and KBM. In the next section, we use the self-adjoint property of the ideal ballooning equation to outline and test the adjoint method.

4. Developing an adjoint method

In this section, we derive and explain the process of calculating the gradients of the ideal ballooning eigenvalue $\hat {\lambda }$ using an adjoint method. Using this method, we find the maximum eigenvalue $\hat {\lambda }_\textrm {max}$ on each flux surface. We then elucidate how this technique is faster than the conventional gradient-based method and illustrate this by plotting gradients from a typical optimisation run and calculating the speed-up.

To find $\hat {\lambda }_\textrm {max}$ on each flux surface, we need the gradient of the eigenvalue of a general function $\mathcal {H}$ such that $\mathcal {H}$ is maximised if and only if $\hat {\lambda } = \hat {\lambda }_\textrm {max}$. Mathematically, this problem can be defined as follows

(4.1)\begin{equation} \max \mathcal{H}(\hat{\lambda}, \tilde{\boldsymbol{p}}, \hat{\boldsymbol{p}}), \quad \text{s.t.} \ \mathcal{G}(\hat{\lambda}, \hat{X}, \tilde{\boldsymbol{p}}, \hat{\boldsymbol{p}}) \equiv \mathcal{L} \hat{X} - \hat{\lambda} \hat{X} = 0, \end{equation}

where $\hat {\lambda }$ is the eigenvalue, $\hat {X}$ is the eigenfunction, $\tilde {\boldsymbol {p}}$ is the state vector that contains all the equilibrium parameters such as the boundary shape and $\hat {\boldsymbol {p}} = (\alpha _{t}, \theta _0)$ is a vector that contains the parameters of the ideal ballooning equation, $\mathcal {H}$ is an objective function and $\mathcal {G}$ is the ideal ballooning operator. To maximise $\mathcal {H}$ on a flux surface for a given equilibrium, i.e.for a fixed $\tilde {\boldsymbol {p}}$, we need the gradient

(4.2)\begin{equation} \frac{{\rm d} \mathcal{H}}{{\rm d} \hat{\boldsymbol{p}}} = \left.\frac{\partial \mathcal{H}}{\partial \hat{\lambda}}\right\lvert_{\hat{\boldsymbol{p}}} \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} + \left.\frac{\partial \mathcal{H}}{\partial \hat{\boldsymbol{p}}}\right\lvert_{\hat{\lambda}}. \end{equation}

The most expensive term to calculate in (4.2) is the gradient of the eigenvalue $\lambda$.Footnote 6 To obtain that, we expand the equation $\textrm {d} \mathcal {G}/\textrm {d} \hat {\boldsymbol {p}} = 0$ at a constant $\tilde {\boldsymbol {p}}$

(4.3)\begin{equation} -\frac{\partial \mathcal{G}}{\partial \hat{\lambda}}\frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} = \frac{\partial \mathcal{G}}{\partial \hat{X}} \frac{\partial \hat{X}}{\partial \hat{\boldsymbol{p}}} + \left.\frac{\partial \mathcal{G}}{\partial \hat{\boldsymbol{p}}}\right\lvert_{\hat{X}, \hat{\lambda}}. \end{equation}

This equation can be explicitly written with the help of (3.10)

(4.4)\begin{equation} \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} \hat{X} = (\mathcal{L}-\hat{\lambda}) \frac{\partial \hat{X}}{\partial \hat{\boldsymbol{p}}} + \frac{\partial \mathcal{L}}{\partial \hat{\boldsymbol{p}}}\hat{X}. \end{equation}

To simplify (4.4) further, we multiply it by the eigenfunction $\hat {X}^{*}$ and integrate it throughout the domain $\theta \in [-\theta _{b}, \theta _{b}]$. Upon doing that, we use the self-adjoint property (3.12) and work through the algebra (given in Appendix B) to obtain the adjoint relation

(4.5)\begin{equation} \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} = \frac{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left(\dfrac{\partial {c}}{\partial \hat{\boldsymbol{p}}} \lvert \hat{X}\rvert^2 - \dfrac{\partial {g}}{\partial \hat{\boldsymbol{p}}}\left\lvert \dfrac{{\rm d}\hat{X}}{{\rm d}\theta}\right\rvert^2 - \hat{\lambda} \dfrac{\partial {f}}{\partial \hat{\boldsymbol{p}}} \lvert \hat{X} \rvert^2 \right)}{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta\, {f}\lvert \hat{X}\rvert^2}. \end{equation}

To obtain $\partial \hat {\lambda }/\partial \hat {\boldsymbol {p}}$ using a central-finite-difference scheme, one has to solve the ideal ballooning equation $2 n_{\hat {\boldsymbol {p}}} = 4$ times at each optimisation step, where $n_{\hat {\boldsymbol {p}}}$ is the length of the vector $\hat {\boldsymbol {p}}$. However, using the adjoint relation (4.5), we only have to solve it once per optimisation step, as long as we can calculate the gradients of geometry-related quantities ${g}$, ${c}$ and ${f}$ four times. Since gradients of ${g}$, ${c}$ and ${f}$ can be calculated roughly two orders of magnitude faster than solving the ideal ballooning equation, we speed up the gradient calculation by approximately a factor of four. Therefore, we use the adjoint relation (4.5) to calculate the gradient of $\hat {\lambda }$. In this study, we choose

(4.6)\begin{equation} \mathcal{H}(\hat{\lambda}, \tilde{\boldsymbol{p}}, \hat{\boldsymbol{p}}) = \hat{\lambda}. \end{equation}

Applying this fact to (4.2) and using (4.5),

(4.7)\begin{equation} \frac{{\rm d} \mathcal{H}}{{\rm d}\hat{\boldsymbol{p}}} = \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} = \frac{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left(\dfrac{\partial {c}}{\partial \hat{\boldsymbol{p}}} \lvert \hat{X}\rvert^2 - \dfrac{\partial {g}}{\partial \hat{\boldsymbol{p}}}\left\lvert \dfrac{{\rm d}\hat{X}}{{\rm d}\theta}\right\rvert^2 - \hat{\lambda} \dfrac{\partial {f}}{\partial \hat{\boldsymbol{p}}} \lvert \hat{X} \rvert^2 \right)}{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta\, {f}\lvert \hat{X}\rvert^2}. \end{equation}

This relation gives us the derivative of the ballooning objective function at any point $\hat {\boldsymbol {p}} = (\alpha _{t}, \theta _0)$. Note that in this work we use (4.7) combined with a local, gradient-based algorithm to find $\hat {\lambda }_{\mathrm {max}}$ on a flux surface. However, this method is valid and, under appropriate conditions, can be extended to the equilibrium parameters $\tilde {\boldsymbol {p}}$. The details of the extended adjoint method are given in Appendix B.1.

In the next section, we present comparison between adjoint and the regular finite-difference-based methods and present tests comparing the gradients of $\hat {\lambda }$ from these two methods.

4.1. Comparing adjoint gradients with a finite-difference method

In this section, we first compare the values of the gradients of $\hat {\lambda }_{\mathrm {max}}$ from the adjoint method with their values obtained using a central-finite-difference method. We take a typical optimisation loop in the modified NCSX case and show a gradient comparison in figure 4. As you can see, the gradients obtained using an adjoint method match well with the gradients obtained with a finite-difference method.

Figure 4. (a) Comparison between the gradients of eigenvalue $\hat {\lambda }_{\alpha _{t}} = \partial \hat {\lambda }/\partial \alpha _{t}$ and $\hat {\lambda }_{\theta _0} = \partial \hat {\lambda }/\partial \theta _0$ obtained using a finite-difference scheme against those obtained using an adjoint method. The quantity $\mathrm {iter}$ is the number of iterations taken by the local optimiser on a flux surface before finding $\hat {\lambda }_{\mathrm {max}}$. The gradients match well for around four orders of magnitude. The discrepancy between the adjoint and finite difference $\hat {\lambda }_{\alpha _{t}}$ is due to the finite resolution of the $\texttt {VMEC}$ run. (b) Different grids used to calculate the gradient of the eigenvalue $\hat {\lambda }$ on a flux surface. A finite-difference scheme requires four points, whereas an adjoint method only requires one point. This gives us a four times speed-up.

To show the computational speedup, we also compare the time taken by an adjoint method with the regular finite-difference-based method. For the 30 iterations shown in figure 4(a), the adjoint method was about $4$ times faster than the finite-difference method. Indeed, the most expensive part of the gradient calculation is the ballooning solver. As shown in the illustration in figure 4(b), for a second-order accurate central-difference scheme, an adjoint method only needs a single call to the ballooning solver, whereas the finite-difference solver needs four. In principle, a speed-up factor of up to four should be possible.

5. Details of the optimisation process

In this section, we explain the optimisation process to find equilibria that are stable against the ideal ballooning mode. In § 5.1, we describe the process of using an adjoint method to find the maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ on each flux surface. In § 5.2, we explain how we use $\hat {\lambda }_{\mathrm {max}}$ and other penalty terms to construct the overall objective function $\mathcal {F}$. Finally, in § 5.3, we explain how we search for ballooning stable equilibria while minimising $\mathcal {F}$ using the SIMSOPT framework.

5.1. Finding $\hat {\lambda }_{\mathrm {max}}$ on each flux surface

To calculate the ballooning objective function we find the maximum $\hat {\lambda }$ on each flux surface. To do that, we solve (3.3) on several flux surfaces, multiple field lines on each surface and numerous values of $\theta _0$ on each field line. We calculate $\hat {\lambda }_{\mathrm {max}}$ on $\mathrm {ns} = 16$ flux surfaces for each equilibrium. For the 3D equilibria, we scan $n_{\alpha _{t}} = 42$ field lines in the range $\alpha _{t} = [-{\rm \pi}, {\rm \pi})$. Since all field lines are identical in a 2D axisymmetric equilibrium, we scan only one field line, i.e.$n_{\alpha } = 1$ for the 2D equilibrium. On each field line, we scan $n_{\theta _0} = 21$ values of $\theta _0$ in the range $\theta _0 = [-{\rm \pi} /2, {\rm \pi}/2)$. The maximum $\hat {\lambda }$ from a coarse grid scan gives us a value close to the global maximum. From the maximum $\hat {\lambda }$ of the coarse grid, we launch a local gradient-based optimiser to find the global maximum eigenvalue. This process is explained using the illustration figure 5. Using this process, we obtain $\hat {\lambda }_{\mathrm {max}}$ as a function of the normalised toroidal flux $s$. Figure 6 shows the plot of $\hat {\lambda }_{\mathrm {max}}$ against $s$ for the three chosen equilibria.

Figure 5. Process of finding the globally maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ on the flux surface $s = 0.8$ of the NCSX equilibrium. We start by first finding the maximum $\hat {\lambda }$ on a discrete grid (marked by crosses) of $\alpha _{t}$ and $\theta _0$. From the discrete maximum $\hat {\lambda }$, we search for the global maximum eigenvalue using a local optimiser. In the inset, we show the approximate path taken by the optimiser to reach $\hat {\lambda }_{\mathrm {max}}$. This process is repeated for all the flux surfaces.

Figure 6. Plots of $\hat {\lambda }_{\mathrm {max}}$ against the normalised toroidal flux $s$ for the three chosen equilibria in (ac) and the eigenfunctions $\hat {X}$ at the maximum $\hat {\lambda }_\textrm {max}$ for each equilibrium in the (df): (a) DIII-D $\hat {\lambda }_\textrm {max}$, (b) NCSX $\hat {\lambda }_\textrm {max}$, (c) Henneberg-QA $\hat {\lambda }_\textrm {max}$, (d) DIII-D $\hat {X}(s = 0.95)$, (e) NCSX $\hat {X}(s = 0.8)$ and( f) Henneberg-QA$\hat {X}(s = 0.84)$. The decay of the eigenfunction could be a result of the Anderson localisation of ballooning modes, as discussed by Redi et al. (Reference Redi, Johnson, Klasky, Canik, Dewar and Cooper2002).

For each new equilibrium, on all $\mathrm {ns}=16$ flux surfaces, the local optimiser takes an average of $20$ iterations to find $\hat {\lambda }_{\mathrm {max}}$. Moreover, as described in figure 4(b), at each step, the use of a finite-difference method requires four evaluations of the eigenvalue $\hat {\lambda }$. This means that on average, we have to call the ballooning solver $1280$ ($16 \times 20 \times 4$) times. This is a computationally expensive step that we speed up using our adjoint-based method.

5.2. Finding ballooning-stable equilibria

Once we have found $\hat {\lambda }_{\mathrm {max}}$, we seek an equilibrium stable to the ideal ballooning mode by minimising $\hat {\lambda }_{\mathrm {max}}$ on each flux surface. To do so, we need to define an objective function that depends on $\hat {\lambda }_{\mathrm {max}}$ such that minimising the objective function should allow us to achieve a stable equilibrium. Moreover, during optimisation, once a flux surface is stabilised against the ideal ballooning mode, our objective function should ignore that particular surface. This would be useful as we do not want to penalise a stable equilibrium. To this end, we design the following ideal ballooning objective function

(5.1)\begin{equation} f_{\mathrm{ball}} = \sum_{j=1}^{\mathrm{ns}} \mathrm{ReLU}(\hat{\lambda}_{\mathrm{max},j} - \hat{\lambda}_{\mathrm{th}, j}), \end{equation}

where $\mathrm {ns}$ is the total number of surfaces and

(5.2)\begin{equation} \mathrm{ReLU}(x)= \begin{cases} 0, & \text{if } x\leq 0,\\ x, & x> 0, \end{cases} \end{equation}

is the rectified linear unit operator: an operator that sets all the non-positive values to zero and $\hat {\lambda }_{\mathrm {th},j}$ is the threshold below which we declare a surface ideal ballooning stable. The value $\hat {\lambda }_{j} = 0$ on the $j$th surface implies marginal stability but we choose $\hat {\lambda }_{\mathrm {th},j} = 0.0001$ to ensure that all surfaces are slightly away from marginal ideal ballooning stability. An equilibrium is ideal ballooning stable if $f_{\mathrm {ball}} = 0$.

It is also important to prevent the optimiser from minimising $f_{\mathrm {ball}}$ in a trivial way. For example, for 2D equilibria, going to a larger aspect ratio value stabilises the ideal ballooning mode. For 3D equilibria, the optimiser can sometimes reduce the minor radius, which, for a fixed toroidal flux, causes the magnetic field to increase. This lowers the overall $\beta$ and consequently the unstable curvature drive term. Similarly, if we allow rotational transforms to increase freely, the optimiser can sometimes create large gradients of $\iota$, generating large currents which are suboptimal. To avoid achieving such trivial solutions and uninteresting equilibria, we add a combination of the following penalty terms to the optimiser:

  1. (i) $f_{\mathrm {asp}} = (A - A_0)$ to penalise any deviation from the aspect ratio of the initial equilibrium;

  2. (ii) $f_{\mathrm {minr}} = (a_{N} - a_{{N}0})$ to penalise any deviation from the minor radius of the initial equilibrium;

  3. (iii) $f_{\langle B \rangle } = (\langle B \rangle - \langle B\rangle _0)$ to penalise any deviation of the volume-averaged magnetic field from its value in the initial equilibrium;

  4. (iv) $f_{R_{c}} = \int \,\textrm {d}\theta \, \mathrm {ReLU}({-R_{c}})$, where $R_{c}$ is the radius of curvature of the boundary; this term penalises any boundary shapes that are curved into the plasma

Using the ballooning objective function (5.1) and one or more of the penalty terms described previously, we can get the overall objective function $\mathcal {F}$. Given a vector of input parameters $\boldsymbol {p}$, our goal is to solve

(5.3)\begin{equation} \min_{\boldsymbol{p}} \mathcal{F}(\boldsymbol{p}), \quad \text{s.t.} \ f_{\mathrm{ball}} = 0. \end{equation}

We achieve this with the help of the $\texttt {SIMSOPT}$ (Landreman et al. Reference Landreman, Medasani, Wechsung, Giuliani, Jorge and Zhu2021) package. The implementation details of the optimisation are described in the next subsection.

5.3. Optimisation with the $\mathtt {SIMSOPT}$ package

In this subsection, we discuss the implementation-related details of an adjoint ballooning solver with the SIMSOPT (Landreman et al. Reference Landreman, Medasani, Wechsung, Giuliani, Jorge and Zhu2021) package. First, we briefly explain how an optimisation problem can be solved using SIMSOPT. Next, we go into the details of how we solve the ideal ballooning optimisation and how the use of an adjoint method can speed up this process.

The SIMSOPT package is a optimisation framework containing a suite of codes that, along with the $\texttt {VMEC}$ code, have been used to optimise3D equilibria for various properties such as energetic fast-particle confinement, quasisymmetry, simpler magnetic coil geometry and neoclassical transport. The user specifies the input parameters (also referred to as degrees of freedom (DoFs)) and the objective function $\mathcal {F}$ and $\texttt {SIMSOPT}$ can perform a gradient-based or gradient-free nonlinear least squares optimisation.

As an example, let us construct an optimisation problem to stabilise an equilibrium while penalising the change in the aspect ratio and the minor radius of the boundary

(5.4)\begin{equation} \mathcal{F} = f_{\mathrm{asp}}^2 + f_{\mathrm{minr}}^2 + f_{\mathrm{ball}}^2. \end{equation}

To do so, we use gradient-based optimisation in SIMSOPT, where one calculates $\partial \mathcal {F}/\partial \boldsymbol {p}$ to update the parameter vector at the $i$th iteration, $\boldsymbol {p}_i$ as

(5.5)\begin{equation} \boldsymbol{p}_{i+1} = f\left(\boldsymbol{p}_{i}, \frac{\partial \mathcal{F}}{\partial \boldsymbol{p}_i}\right). \end{equation}

This is done until the optimiser reaches a local minima, i.e.a region in the parameter space where $\partial \mathcal {F}/\partial \boldsymbol {p} = 0$ or the relative change in the gradient is small enough. Typically, one has to evaluate the gradient of $\mathcal {F}$ hundreds of times during an optimisation loop before finding a local minimum. In this study, evaluating $f_{\mathrm {ball}}$ is the most expensive step. Because the speed of the optimisation is limited by the rate at which we can compute $f_{\mathrm {ball}}$, we have used an adjoint method to calculate $\hat {\lambda }_{\mathrm {max}}$ which gives us $f_{\mathrm {ball}}$.

6. Results

In this section, we present the results of our study. In § 6.1, we compare the initial and optimised 2D axisymmetric equilibrium. In §§ 6.2 and 6.3, we do the same for the modified NCSX and modified Henneberg-QA equilibria, respectively. We also compare the values of relevant physical quantities in the initial and optimised equilibria.

6.1. Stabilising the DIII-D-like equilibrium

For the 2D axisymmetric case, we start with a high-$\beta$ equilibrium with a negative triangularity boundary. Due to axisymmetry, the 2D boundary does not depend on the toroidal angle $\zeta$. In other words, only $n = 0$ are needed in (2.7) to represent the boundary of a 2D equilibrium. Therefore, the number of modes needed to specify a 2D boundary is much lower than that for a typical 3D boundary. In this problem, we pick the six largest Fourier modes as our DoFs: $\hat {R}_{b}(0, 1), \hat {R}_{b}(0, 2), \hat {R}_{b}(0, 3), \hat {Z}_{b}(0, 1), \hat {Z}_{b}(0, 3), \hat {Z}_{b}(0, 5)$. The full objective function is

(6.1)\begin{equation} \mathcal{F} = f_{\mathrm{asp}}^2 + f_{R_{c}}^2 + f_{\mathrm{minr}}^2 + f_{\langle B \rangle}^2 + f_{\mathrm{ball}}^2 , \end{equation}

where all terms except $f_{\mathrm {ball}}$ are penalty terms to prevent the optimiser from producing a trivial solution. After this, we start with the negative triangularity equilibrium described in § 2.2 and run SIMSOPT to find multiple equilibria that are stable against the ideal ballooning mode, i.e. equilibria with $f_{\mathrm {ball}} = 0$. We present one of the optimised equilibria in figure 7. We also compare the values of equilibrium-dependent quantities in table 4.

Figure 7. (a) Maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ of the initial and optimised DIII-D-like equilibrium. The optimised equilibrium is stable. (b) Boundary shape of the initial and final equilibria. Note the negative triangularity of the initial equilibrium and the positive triangularity of the optimised equilibrium.

Table 4. Comparison between relevant physical quantities of the initial and optimised DIII-D equilibrium.

We observe that the optimiser is moving toward a positive triangularity equilibrium, indicating that, for the similar values of the relevant parameters (given in table 4) positive triangularity high-$\beta$ equilibria are more stable than their negative triangularity counterparts. Our findings are consistent with recent observations by Davies, Dickinson & Wilson (Reference Davies, Dickinson and Wilson2022), Nelson, Paz-Soldan & Saarelma (Reference Nelson, Paz-Soldan and Saarelma2022) and Saarelma et al. (Reference Saarelma, Austin, Knolker, Marinoni, Paz-Soldan, Schmitz and Snyder2021) that negative triangularity equilibria are more unstable against the ideal ballooning mode compared with positive triangularity equilibria. This behaviour prevents the formation of a steep pressure gradient, which limits the operational beta value of negative triangularity equilibria but also prevents the occurrence of edge localisedmodes.

6.2. Stabilising the NCSX equilibrium

The first 3D equilibrium we optimise is an unstable NCSX equilibrium. Since the boundary has a 3D shape, we have to use both toroidal and poloidal modes to change its shape. For this demonstration, we choose $72$ boundary modes listed in table 5 as DoFs. In table 5, $[i, j]$ denotes all integers between $i$ and $j$ (including $i$ and $j$). The total number of DoFs, $72$, is much larger than the axisymmetric case. After choosing these DoFs, we choose the following objective function

(6.2)\begin{equation} \mathcal{F} = 0.5 f_{\mathrm{minr}}^2 + 0.5 f_{\langle B \rangle}^2 + 50 f_{\mathrm{ball}}^2 , \end{equation}

We run SIMSOPT with this configuration to obtain multiple equilibria with $f_{\mathrm {ball}} = 0$. We have plotted a comparison of one of these equilibria with the initial equilibrium in figure 8.

Table 5. Boundary shape DoFs for the NCSX case.

Figure 8. (a) Maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ of the initial and optimised modified NCSX equilibrium. (b) Boundary shape of the initial and final equilibria at three different values of the toroidal angle $\zeta$. The dotted curves correspond to the initial cross sections whereas the solid curves are the final cross sections. Note how sensitive the maximum eigenvalue is to the boundary shape.

The optimiser successfully stabilised the equilibrium through boundary shaping. From the change in the boundary shape, it is evident that the ideal ballooning stability is sensitive to the boundary shape for the NCSX equilibrium. We present a comparison of the important equilibrium-dependent quantities for the initial and optimised equilibria in table 6.

Table 6. Comparison between relevant physical quantities of the initial and optimised NCSX equilibrium.

6.3. Stabilising the modified Henneberg-QA equilibrium

As a final example, we present a modified Henneberg-QA equilibrium. In this example, we allow the boundary coefficients given in table 7 to change freely.

Table 7. Boundary shape DoFs for the modified Henneberg-QA case. We have chosen the exact same coefficients as the NCSX case.

For this problem, we choose the following objective function

(6.3)\begin{equation} \mathcal{F} = 0.1 f_{\mathrm{minr}}^2 + 0.2 f_{\langle B \rangle}^2 + 50 f_{\mathrm{ball}}^2 . \end{equation}

After choosing the DoFs and the objective function, we run SIMSOPT and obtain multiple stable equilibria. We compare the stabilisedequilibrium with the initial, unstable equilibrium in figure 9 and the equilibrium-dependent quantities in table 8. We can see in figure 9(a) that the boundary shape deforms in a way to reduce the curvature on the outboard side, stabilising the ideal ballooning mode.

Figure 9. (a) Maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ of the initial and optimised modified Henneberg-QA equilibrium. (b) Boundary shape of the initial and final equilibria at three different positions of the toroidal angle $\zeta$. The dotted curves correspond to the initial cross sections whereas the solid curves are the final cross sections.

Table 8. Comparison between relevant physical quantities of the initial and optimised modified Henneberg-QA equilibrium.

7. Summary and conclusions

We began this work by briefly explaining the various curvilinear coordinate systems that we used to fully define a general 3D ideal MHD equilibrium. In § 2, we generated three different equilibria, one axisymmetric 2D and two 3D, and briefly described their properties.

Upon generating the equilibria, in § 3, we have provided a physical description of the infinite-$n$ ideal ballooning mode and explain the numerical methods used to calculate the maximum eigenvalues on a given flux surface. Using these numerical methods, we evaluated the stability of the three equilibria against the ideal ballooning mode. We also described the self-adjoint property of the ideal ballooning mode and its relation to the KBM.

Using the self-adjoint property explained in § 3, we have developed an adjoint method in § 4 and explained how to use it to speed up the calculation of the maximum ballooning eigenvalue $\hat {\lambda }_{\mathrm {max}}$ on each surface. To demonstrate the efficiency and accuracy of the adjoint method, we have also presented a comparison of gradients between an adjoint method and a simple finite-difference scheme. We found that the adjoint method is up to four times faster than the finite-difference scheme.

In § 5, we describe the details of the overall optimisation process and how we accomplish that using the SIMSOPT code. After implementing the optimisation, we presented the results in § 6. We have presented the specific details of the objective function and the DoFs for each equilibrium and stabilised the initial, ideal ballooning unstable equilibria. We have briefly described the physical mechanisms that stabilise the ideal ballooning mode.

This work presents many avenues for future research. A key step forward is to extend our technique to include all equilibrium-dependent parameters $\tilde {\boldsymbol {p}}$ as explained in Appendix B.1. One could also use the exact same method to optimise stellarators and tokamaks against low-$n$, unstable ideal MHD modes, as explained in Appendix B.2. Since solving for low-$n$ ideal MHD codes is much more computationally expensive, the advantage of using an adjoint method would be even greater. Another possible direction would be to use an adjoint method to get derivatives of the ballooning eigenvalue with respect to the plasma shape. Finally, one could use the ideal ballooning optimiser as a tool that could help optimise an equilibrium against KBMs and help us look for equilibrium-dependent proxies for MHD or kinetic instabilities.

Acknowledgements

We thank Professor D. Bindel, Dr A. Bader and Dr B. Faber for helpful discussions and encouragement. R.G. thanks R. Conlin for his valuable suggestions at the APS-DPP 2022 conference.

Editor P. Helander thanks the referees for their advice in evaluating this article.

Funding

This work was supported by the US Department of Energy, Office of Science and Office of Fusion Energy Sciences under award numbers DE-SC0018429 and DE-FG02-93ER54197. This research used resources from the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility and the Stellar cluster at Princeton University. This material is based on work supported by the National Science Foundation Graduate Research Fellowship under grant numberDGE-1650441.

Declaration of interests

The authors report no conflict of interest.

Data availability statement

The Zenodo link containing all the equilibria and optimisation code used in this paper is freely available at https://zenodo.org/record/8092287.

Appendix A. The discretised ideal ballooning equation

After applying the boundary conditions $\hat {X}_0 = \hat {X}_N = 0$ to (3.5), we can rewrite the ballooning equation as

(A1)\begin{equation} (\boldsymbol{\mathsf{A}} - \lambda \mathbb{I}) \hat{X} = 0, \end{equation}

with $\mathbb {I}$ being the identity matrix and the asymmetric tridiagonal matrix $\boldsymbol{\mathsf{A}}$ with the following form:

(A2) \begin{align} \left[\begin{matrix} \dfrac{{g}_{1/2} + {g}_{3/2} - (\Delta \theta)^2 {c}_{1}}{(\Delta \theta)^2 {f}_{1}} & -\dfrac{{g}_{3/2}}{(\Delta \theta)^2 {f}_1} & 0 & 0 & \ldots\\ -\dfrac{{g}_{3/2}}{(\Delta \theta)^2 {f}_2} & \dfrac{{g}_{3/2} + {g}_{5/2} - (\Delta \theta)^2 {c}_{2}}{(\Delta \theta)^2 {f}_{2}} & -\dfrac{{g}_{5/2}}{(\Delta \theta)^2 {f}_2} & 0 & \ldots\\ & & & \ddots\\ & & & & \ddots\\ 0 & 0 & 0 & 0 & \ldots\\ 0 & 0 & 0 & 0 & \ldots\\ \end{matrix}\right.\nonumber\\ \left.\begin{matrix} 0 & 0\\ 0 & 0\\ \vdots&\vdots & \\ \vdots&\vdots & \\ \dfrac{{g}_{N-5/2} + {g}_{N-3/2} - (\Delta \theta)^2 {c}_{N-2}}{h^2 {f}_{N-2}} & -\dfrac{{g}_{N-3/2}}{(\Delta \theta)^2 {f}_{N-2}}\\ -\dfrac{{g}_{N-3/2}}{(\Delta \theta)^2 {f}_{N-1}} & \dfrac{{g}_{N-3/2} + {g}_{N-1/2} - (\Delta \theta)^2 {c}_{N-1}}{(\Delta \theta)^2 {f}_{N-1}}\\ \end{matrix}\right] . \end{align}

Appendix B. Details of the adjoint ideal ballooning calculation

In this appendix, we derive (4.5) starting with (3.3). To do that, we use the self-adjoint property of ideal MHD given in (3.12) as well as the Dirichlet boundary conditions satisfied by the eigenfunction,

(B1)\begin{equation} \hat{X}(\theta ={\pm} \theta_{b}) = \hat{X}^{*}(\theta ={\pm} \theta_{b}) = 0. \end{equation}

Defining $\boldsymbol {p} = (\tilde {\boldsymbol {p}}, \hat {\boldsymbol {p}})$ as the union of all the parameters of the problem, we start by taking the derivative $\partial /\partial \boldsymbol {p}$ of (3.3),

(B2)\begin{equation} \frac{{\rm d}}{{\rm d}\theta} \frac{\partial {g}}{\partial \boldsymbol{p}} \frac{{\rm d} \hat{X}}{{\rm d}\theta} + \frac{{\rm d}}{{\rm d}\theta} {g} \frac{{\rm d} }{{\rm d}\theta}\frac{\partial \hat{X}}{\partial \boldsymbol{p}} + \frac{\partial {c}}{\partial \boldsymbol{p}} \hat{X} + c\frac{\partial \hat{X}}{\partial \boldsymbol{p}} = \frac{\partial \hat{\lambda}}{\partial \boldsymbol{p}} {f} \hat{X} + \hat{\lambda} {f} \frac{\partial \hat{X}}{\partial \boldsymbol{p}} + \hat{\lambda} \hat{X} \frac{\partial {f}}{\partial \boldsymbol{p}}. \end{equation}

Multiplying with $\hat {X}^{*}$ on both sides, integrating throughout the domain,

(B3)$$\begin{gather} \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta\, \hat{X}^{*}\frac{{\rm d}}{{\rm d}\theta} \frac{\partial {g}}{\partial \boldsymbol{p}} \frac{{\rm d} \hat{X}}{{\rm d}\theta} + \int_{-\theta_{b}}^{\theta_{b}} \hat{X}^{*}\frac{{\rm d}}{{\rm d}\theta} {g} \frac{{\rm d}}{{\rm d}\theta}\frac{\partial \hat{X}}{\partial \boldsymbol{p}} + \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta\, \frac{\partial {c}}{\partial \boldsymbol{p}} \lvert \hat{X} \rvert^2 + \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \, {c} \hat{X}^{*} \frac{\partial \hat{X}}{\partial \boldsymbol{p}}\nonumber\\ = \frac{\partial \hat{\lambda}}{\partial \boldsymbol{p}} \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta {f} \lvert \hat{X} \rvert^2 + \hat{\lambda} \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \frac{\partial {f}}{\partial \boldsymbol{p}} \lvert \hat{X}\rvert^2 + \hat{\lambda} \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta {f} \hat{X}^{*} \frac{\partial \hat{X}}{\partial \boldsymbol{p}}. \end{gather}$$

Using integration by parts, (B1) and rearranging (B3), we can write

(B4)$$\begin{gather} \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left(\dfrac{\partial {c}}{\partial \boldsymbol{p}}\lvert \hat{X}\rvert^2 - \dfrac{\partial g}{\partial \boldsymbol{p}}\left\lvert \dfrac{{\rm d}\hat{X}}{{\rm d}\theta}\right\rvert^2 - \hat{\lambda} \dfrac{\partial {f}}{\partial \boldsymbol{p}} \lvert \hat{X}\rvert^2 \right)- \frac{\partial \hat{\lambda}}{\partial \boldsymbol{p}} \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta {f} \lvert \hat{X} \rvert^2\nonumber\\ = \int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left( -\frac{{\rm d}}{{\rm d}\theta} {g} \frac{{\rm d}\hat{X}^{*}}{{\rm d}\theta} - {c} \hat{X}^{*} + \hat{\lambda} {f} \hat{X}^{*} \right) \frac{\partial \hat{X}}{\partial \boldsymbol{p}}. \end{gather}$$

Due to the self-adjoint property of ideal MHD, the right-hand side of (B4) is zero. The rest of the equation can be arranged so that

(B5)\begin{equation} \frac{\partial \hat{\lambda}}{\partial \boldsymbol{p}} = \frac{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left(\dfrac{\partial {c}}{\partial \boldsymbol{p}}\lvert \hat{X}\rvert^2 - \dfrac{\partial {g}}{\partial \boldsymbol{p}}\left\lvert \dfrac{{\rm d}\hat{X}}{{\rm d}\theta}\right\rvert^2 - \hat{\lambda} \dfrac{\partial {f}}{\partial \boldsymbol{p}} \lvert \hat{X}\rvert^2 \right)}{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta {f}\lvert \hat{X}\rvert^2}. \end{equation}

Therefore, to calculate $\partial \hat {\lambda }/\partial \boldsymbol {p}$, we only need the gradients of the geometric coefficients ${g}, {c}$, ${f}$, the eigenfunction $\hat {X}$ and the eigenvalue $\hat {\lambda }$ of the ballooning equation; we have to solve the ideal ballooning equation only once. This speeds up the optimisation loop, as it is much faster to obtain the gradient of the geometric coefficients than to solve the ballooning equation multiple times.

B.1. Extending our adjoint-based technique to equilibrium-dependent DoFs

In this paper, we have used an adjoint method to find the maximum eigenvalue $\hat {\lambda }_\textrm {max}$ with respect to the parameters $\hat {\boldsymbol {p}}$ of the ballooning equation. It is possible to extend our method to minimise $f_\textrm {ball}$ to the equilibrium parameters $\tilde {\boldsymbol {p}}$ under the appropriate conditions. We define the problem and find the pertinent conditions in this appendix. We want to find

(B6)\begin{equation} \min f_{\rm{ball}}(\hat{\lambda}_{\mathrm{max}}, \tilde{\boldsymbol{p}}, \hat{\boldsymbol{p}}), \quad \text{s.t.} \ \mathcal{G}(\hat{\lambda}, \hat{X}, \tilde{\boldsymbol{p}}, \hat{\boldsymbol{p}}) \equiv \mathcal{L}\hat{X} - \hat{\lambda} \hat{X} = 0, \end{equation}

where all symbols are defined in § 4 and the ballooning objective function $f_\textrm {ball}$ is defined in (5.1). To minimise $f_\textrm {ball}$ with respect to the equilibrium parameters, we need

(B7) \begin{equation} \frac{{\rm d} f_{\rm{ball}}}{{\rm d} \tilde{\boldsymbol{p}}} =\left. \frac{\partial f_{\rm{ball}}}{\partial \hat{\lambda}_{\rm{max}}}\right\lvert_{\tilde{\boldsymbol{p}}, \hat{\boldsymbol{p}}}\frac{\partial \hat{\lambda}_{\rm{max}}}{\partial \tilde{\boldsymbol{p}}}+ \left.\frac{\partial f_{\rm{ball}}}{\partial \hat{\boldsymbol{p}}}\right\lvert_{\hat{\lambda}_{\rm{max}}, \tilde{\boldsymbol{p}}}\frac{\partial \hat{\boldsymbol{p}}}{\partial \tilde{\boldsymbol{p}}} + \left.\frac{\partial f_{\rm{ball}}}{\partial \tilde{\boldsymbol{p}}}\right\lvert_{\hat{\lambda}_{\rm{max}}, \hat{\boldsymbol{p}}}. \end{equation}

The most expensive term to calculate in (B7) is the gradient of the eigenvalue $\lambda$. To obtain that, we take the derivative of the operator $\mathcal {G}$ with respect to $\lambda$,

(B8)\begin{equation} -\frac{\partial \mathcal{G}}{\partial \hat{\lambda}}\frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} = \frac{\partial \mathcal{G}}{\partial \hat{\lambda}}\frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} + \frac{\partial \mathcal{G}}{\partial \hat{X}} \frac{\partial \hat{X}}{\partial \hat{\boldsymbol{p}}} + \frac{\partial \mathcal{G}}{\partial \hat{X}} \frac{\partial \hat{X}}{\partial \tilde{\boldsymbol{p}}} + \frac{\partial \mathcal{G}}{\partial \hat{\boldsymbol{p}}} + \frac{\partial \mathcal{G}}{\partial \tilde{\boldsymbol{p}}}. \end{equation}

We also express $\hat {\lambda }$ around a point $\boldsymbol {p}_0 = (\tilde {\boldsymbol {p}}_0, \hat {\boldsymbol {p}}_0)$ in the state space as a Taylor series,

(B9)\begin{equation} \hat{\lambda} = \hat{\lambda}(\boldsymbol{p}_0) + \frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} \boldsymbol{\cdot} \delta \tilde{\boldsymbol{p}} + \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} \boldsymbol{\cdot} \delta \hat{\boldsymbol{p}} + \frac{\partial}{{\partial \tilde{\boldsymbol{p}}} }\frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} \boldsymbol{:} \delta \tilde{\boldsymbol{p}} \delta \tilde{\boldsymbol{p}} + \frac{\partial}{{\partial \hat{\boldsymbol{p}}} }\frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} \boldsymbol{:} \delta \hat{\boldsymbol{p}} \delta \hat{\boldsymbol{p}} + \textit{O}(|\delta\boldsymbol{p}|^3), \end{equation}

and assume that the optimiser takes a step size $|\delta \boldsymbol {p}|$ that is smaller than the radius of convergence of Taylor series (B9),

(B10)\begin{equation} \left\lvert\frac{\partial}{{\partial \tilde{\boldsymbol{p}}} }\frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} \boldsymbol{:} \delta \tilde{\boldsymbol{p}} \delta \tilde{\boldsymbol{p}} \right\rvert \ll \left\lvert \frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} \boldsymbol{\cdot} \delta \tilde{\boldsymbol{p}} \right\rvert,\quad \left \lvert\frac{\partial}{{\partial \hat{\boldsymbol{p}}} }\frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} \boldsymbol{:} \delta \hat{\boldsymbol{p}} \delta \hat{\boldsymbol{p}} \right\rvert \ll \left\lvert \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} \boldsymbol{\cdot} \delta \hat{\boldsymbol{p}} \right\rvert. \end{equation}

Using (B9) and (B10)

(B11)\begin{equation} \frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} = \frac{\partial \hat{\lambda}}{\partial \hat{\boldsymbol{p}}} + \frac{\partial \hat{\lambda}}{\partial \tilde{\boldsymbol{p}}} \boldsymbol{\cdot} \frac{\partial \tilde{\boldsymbol{p}}}{\partial \hat{\boldsymbol{p}}}. \end{equation}

Next, we use the fact that $\partial \hat {\lambda }/\partial \hat {\boldsymbol {p}} = 0$ at $\hat {\lambda } = \hat {\lambda }_{\mathrm {max}}$, and that our choice of $f_\textrm {ball}$ only explicitly depends on $\hat {\lambda }_\textrm {max}$. Using the explicit form of the linear operator from (3.1), we multiply (B8) by $X^{*}$ and integrate throughout the domain, to rewrite (B7) as

(B12)\begin{equation} \frac{{\rm d} f_{\rm{ball}}}{{\rm d}\boldsymbol{p}} = \sum_{j=1}^{\rm{ns}} \mathrm{ReLU}^{\prime}(\hat{\lambda}_{\mathrm{max},j})\frac{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta \left(\dfrac{\partial {c}}{\partial \boldsymbol{p}} \lvert \hat{X}\rvert^2 - \dfrac{\partial g}{\partial \boldsymbol{p}} \left\lvert \dfrac{{\rm d}\hat{X}}{{\rm d}\theta}\right\rvert^2 - \hat{\lambda} \dfrac{\partial {f}}{\partial \boldsymbol{p}} \lvert \hat{X}\rvert^2 \right)}{\int_{-\theta_{b}}^{\theta_{b}} \,{\rm d}\theta\, {f}\lvert \hat{X}\rvert^2}, \end{equation}

where $\boldsymbol {p} = (\tilde {\boldsymbol {p}}, \hat {\boldsymbol {p}})$ is the union of all the parameters of the problem and $\mathrm {ReLU}^{\prime }$Footnote 7 is the derivative of the $\mathrm {ReLU}$ operator such that

(B13)\begin{equation} \mathrm{ReLU}^{\prime}(x)= \begin{cases} 0, & \text{if } x < 0,\\ 1, & x> 0. \end{cases} \end{equation}

Calculating the derivative of the geometric coefficients ${g}$, ${c}$ and ${f}$ with respect to the equilibrium parameter vector $\tilde {\boldsymbol {p}}$ is not straightforward in VMEC and may lack the requisite accuracy for an adjoint method to work. However, an equilibrium solver such as DESC (Dudt & Kolemen Reference Dudt and Kolemen2020; Conlin et al. Reference Conlin, Dudt, Panici and Kolemen2022; Panici et al. Reference Panici, Conlin, Dudt and Kolemen2022) that is designed to calculate these gradients along with the geometric coefficients accurately may enable us to utilise the full potential of this adjoint-based method. Since the speed up obtained with an adjoint method is linearly proportional to the length of the vector $\tilde {\boldsymbol {p}}$, using (B12) we can, in principle, speed up the calculation of $d f_\textrm {ball}/\textrm {d}\tilde {\boldsymbol {p}}$ by an order of magnitude for 2D axisymmetric equilibria and by two orders or magnitude for 3D equilibria.

B.2. Extending our adjoint technique to low-$n$, ideal MHD solvers

Note that this process can be applied to any ideal MHD eigenvalue solver. For fluctuations that are not confined to a flux surface, one can solve for a fluctuation of the form

(B14)\begin{equation} X = \sum_{m,n} \hat{X}_{m, n}(\psi) {\rm e}^{{\rm i}(m\theta-n\zeta)}, \end{equation}

where $X = \{X_{\psi }, X_{\alpha }\}$ are components of fluctuation $X$ perpendicular to the equilibrium magnetic field line, and $m$ and $n$ are the poloidal and toroidal mode numbers, respectively. We solve for $\hat {X}(\psi )$, using codes such as $\texttt {ELITE}$ and $\texttt {GATO}$ (Bernard, Helton & Moore Reference Bernard, Helton and Moore1981) for axisymmetric equilibria or $\texttt {TERPSICHORE}$ (Anderson et al. Reference Anderson, Cooper, Gruber, Merazzi and Schwenn1990), $\texttt {CAS-3D}$ (Schwab Reference Schwab1993) for 3D equilibria. For $\texttt {GATO}$, $\texttt {TERPSICHORE}$, and $\texttt {CAS-3D}$, the ideal MHD energy principle is used to solve the matrix equation

(B15)\begin{equation} \mathcal{A} X = \lambda \mathcal{B} X, \end{equation}

where $\mathcal {A}$ and $\mathcal {B}$ are real symmetric matrices. Currently, solving such an equation using these codes takes at least a few minutes. For such a problem, we can repeat the process explained at the beginning of this appendix to obtain the gradient,

(B16)\begin{equation} \frac{\partial \lambda}{\partial \boldsymbol{p}} = X^{\mathrm{T}}\left.\left( \frac{\partial \mathcal{A}}{\partial \boldsymbol{p}} - \lambda \frac{\partial \mathcal{B}}{\partial \boldsymbol{p}}\right)X \right/ {X^{\mathrm{T}}\mathcal{B} X}, \end{equation}

for all modes. Equation (B16) is similar to the Hellman–Feynman theorem (Hellmann Reference Hellmann1933; Feynman Reference Feynman1939). For axisymmetric equilibria, combining gradient information with fast equilibrium solvers such as $\texttt {EFIT}$ (Lao et al. Reference Lao, John, Stambaugh, Kellman and Pfeiffer1985) can help mitigate real-time disruption. One could also couple this adjoint approach with an optimiser to find low-$n$, ideal MHD stable equilibria.

Footnotes

1 The main idea of this work is independent of the equilibrium solver. Our technique should also work with any other equilibrium solver.

2 We do not optimise equilibria for quasisymmetry in this work.

3 In the context of infinite-$n$ ideal ballooning mode analyses, there is a value of the ballooning parameter $\theta _0$ at which the ballooning mode is the least stable. To find this value, we treat $\theta _0$ as a free parameter and scan its values to find $\theta _0$ for which $\omega ^2$ is the most negative.

4 Our Python code is freely available at github.com/rahulgaur104/ideal-ballooning-optimizer.

5 Note that ideal ballooning stability is a necessary but not sufficient condition for KBM stability. A mode can be stable against the ideal ballooning mode, but unstable against the KBM.

6 Note that the derivative of an eigenvalue is only well-defined when the eigenvalue is isolated. Optimisation problems with stringent penalty terms can lead the optimiser to points with multiplicity (Lewis & Overton Reference Lewis and Overton1996).

7 The derivative of the $\mathrm {ReLU}$ operator is not well-defined at $x=0$. We may have to replace it with an activation function that is continuous with a well-defined derivative. For example, we could use the logistic function $1/(1 + \textrm {e}^{-c x})$ with a large positive real number $c$.

References

Aleynikova, K., Zocco, A., Xanthopoulos, P., Helander, P. & Nührenberg, C. 2018 Kinetic ballooning modes in tokamaks and stellarators. J. Plasma Phys. 84, 745840602.CrossRefGoogle Scholar
Anderson, D.V., Cooper, W.A., Gruber, R., Merazzi, S. & Schwenn, U. 1990 TERPSICHORE: a three-dimensional ideal magnetohydrodynamic stability program. In Scientific Computing on Supercomputers II (ed. J.T. Devreese & E.V. Camp), vol. 159, pp. 159–174. Plenum Press, New York.CrossRefGoogle Scholar
Baalrud, S., Ferraro, N., Garrison, L., Howard, N., Kuranz, C., Sarff, J., Scime, E. & Solomon, W. 2020 A community plan for fusion energy and discovery plasma sciences. Report of the 2019–2020 American Physical Society Division of Plasma Physics Community Planning Process.Google Scholar
Bernard, L.C., Helton, F.J. & Moore, R.W. 1981 GATO: an MHD stability code for axisymmetric plasmas with internal separatrices. Comput. Phys. Commun. 24, 377380.CrossRefGoogle Scholar
Bernard, L.C. & Moore, R.W. 1981 Systematic optimization of tokamaks for ideal magnetohydro- dynamic stability. Phys. Rev. Lett. 46, 1286.CrossRefGoogle Scholar
Boozer, A.H. 1983 Transport and isomorphic equilibria. Phys. Fluids 26, 496499.CrossRefGoogle Scholar
Conlin, R., Dudt, D.W., Panici, D. & Kolemen, E. 2022 The DESC stellarator code suite part II: Perturbation and continuation methods. J. Plasma Phys. 89, 955890305.Google Scholar
Connor, J.W., Hastie, R.J. & Taylor, J.B. 1979 High mode number stability of an axisymmetric toroidal plasma. Proc. R. Soc. Lond. A 365, 1.Google Scholar
Davies, R., Dickinson, D. & Wilson, H.R. 2022 Kinetic ballooning modes as a constraint on plasma triangularity in commercial spherical tokamaks. Plasma Phys. Control. Fusion. 64, 105001.CrossRefGoogle Scholar
Dewar, R.L. & Glasser, A.H. 1983 Ballooning mode spectrum in general toroidal systems. Phys. Fluids 26, 3038.CrossRefGoogle Scholar
D'haeseleer, W.D., Hitchon, W.N.G., Callen, J.D. & Shohet, J.L. 2012 Flux Coordinates and Magnetic Field Structure: A Guide to a Fundamental Tool of Plasma Theory. Springer Science & Business Media.Google Scholar
Dudt, D.W. & Kolemen, E. 2020 DESC: a stellarator equilibrium solver. Phys. Plasmas 27, 102513.CrossRefGoogle Scholar
Feynman, R.P. 1939 Forces in molecules. Phys. Rev. 56, 340.CrossRefGoogle Scholar
Freidberg, J.P. 2014 Ideal MHD. Cambridge University Press.CrossRefGoogle Scholar
Fu, G.Y., Isaev, M., Ku, L., Mikhailov, M., Redi, M.H., Sanchez, R., Subbotin, A., Cooper, W.A., Hirshman, S.P., Monticello, D.A., et al. 2007 Ideal magnetohydrodynamic stability of the NCSX. Fusion Sci. Technol. 51, 218231.CrossRefGoogle Scholar
Garren, D.A. & Boozer, A.H. 1991 Magnetic field strength of toroidal plasma equilibria. Phys. Fluids B 3, 28052821.CrossRefGoogle Scholar
Gates, D.A., Boozer, A.H., Brown, T., Breslau, J., Curreli, D., Landreman, M., Lazerson, S.A., Lore, J., Mynick, H., Neilson, G.H., et al. 2017 Recent advances in stellarator optimization. Nucl. Fusion 57, 126064.CrossRefGoogle Scholar
Gaur, R., Abel, I.G., Dickinson, D. & Dorland, W.D. 2023 Microstability of $\beta\sim 1$ tokamak equilibria. J. Plasma Phys. 89, 905890112.CrossRefGoogle Scholar
Giles, M.B. & Pierce, N.A. 2000 An introduction to the adjoint approach to design. Flow Turbul. Combust. 65, 393415.CrossRefGoogle Scholar
Grad, H. & Rubin, H. 1958Proceedings of the Second United Nations International Conference on the Peaceful uses of Atomic Energy, p. 400. Geneva : 1958: UN.Google Scholar
Hellmann, H. 1933 Zur rolle der kinetischen elektronenenergie für die zwischenatomaren kräfte. Z. Phys. 85 (3).CrossRefGoogle Scholar
Henneberg, S.A., Drevlak, M., Nührenberg, C., Beidler, C.D., Turkin, Y., Loizu, J. & Helander, P. 2019 Properties of a new quasi-axisymmetric configuration. Nucl. Fusion 59, 026014.CrossRefGoogle Scholar
Hirshman, S.P. & Whitson, J.C. 1983 Steepest-descent moment method for three-dimensional magnetohydrodynamic equilibria. Phys. Fluids 26, 3553.CrossRefGoogle Scholar
Kruskal, M.D. & Kulsrud, R.M. 1958 Equilibrium of a magnetically confined plasma in a toroid. Phys. Fluids 1, 265.CrossRefGoogle Scholar
Landreman, M., Medasani, B., Wechsung, F., Giuliani, A., Jorge, R. & Zhu, C. 2021 SIMSOPT: a flexible framework for stellarator optimization. J. Open Source Softw. 6, 3525.CrossRefGoogle Scholar
Lao, L.L., John, H.S., Stambaugh, R.D., Kellman, A.G. & Pfeiffer, W. 1985 Reconstruction of current profile parameters and plasma shapes in tokamaks. Nucl. Fusion 25, 1611.CrossRefGoogle Scholar
Lewis, A.S. & Overton, M.L. 1996 Eigenvalue optimization. Acta Numer. 5, 149–190.CrossRefGoogle Scholar
Marinoni, A., Austin, M.E., Hyatt, A.W., Walker, M.L., Candy, J., Chrystal, C., Lasnier, C.J., McKee, G.R., Odstrčil, T., Petty, C.C., et al. 2019 H-mode grade confinement in L-mode edge plasmas at negative triangularity on DIII-D. Phys. Plasmas 26, 042515.CrossRefGoogle Scholar
McKinney, I.J., Pueschel, M.J., Faber, B.J., Hegna, C.C., Ishizawa, A. & Terry, P.W. 2021 Kinetic-ballooning-mode turbulence in low-average-magnetic-shear equilibria. J. Plasma Phys. 87, 905870311.CrossRefGoogle Scholar
Miller, R.L. & Moore, R.W. 1979 Shape optimization of tokamak plasmas to localized magnetohydrodynamic modes. Phys. Rev. Lett. 43, 765.CrossRefGoogle Scholar
National Academies of Sciences, Engineering, Medicine, et al. 2021 Bringing fusion to the U.S. grid.Google Scholar
Nelson, A.O., Paz-Soldan, C. & Saarelma, S. 2022 Prospects for H-mode inhibition in negative triangularity tokamak reactor plasmas. Nucl. Fusion 62, 096020.CrossRefGoogle Scholar
Panici, D., Conlin, R., Dudt, D.W. & Kolemen, E. 2022 The DESC stellarator code suite part I: Quick and accurate equilibria computations. J. Plasma Phys. 89, 955890303.Google Scholar
Paul, E.J., Landreman, M. & Antonsen, T. 2021 Gradient-based optimization of 3D MHD equilibria. J. Plasma Phys. 87, 905870214.CrossRefGoogle Scholar
Redi, M.H., Johnson, J.L., Klasky, S., Canik, J., Dewar, R.L. & Cooper, W.A. 2002 Anderson localization of ballooning modes, quantum chaos and the stability of compact quasiaxially symmetric stellarators. Phys. Plasmas 9, 19901996.CrossRefGoogle Scholar
Saarelma, S., Austin, M.E., Knolker, M., Marinoni, A., Paz-Soldan, C., Schmitz, L. & Snyder, P.B. 2021 Ballooning instability preventing the H-mode access in plasmas with negative triangularity shape on the DIII–D tokamak. Plasma Phys. Control. Fusion 63, 105006.CrossRefGoogle Scholar
Sanchez, R., Hirshman, S.P., Ware, A.S., Berry, L.A. & Spong, D.A. 2000 a Ballooning stability optimization of low-aspect-ratio stellarators. Plasma Phys. Control. Fusion 42, 641.CrossRefGoogle Scholar
Sanchez, R., Hirshman, S.P., Whitson, J.C. & Ware, A.S. 2000 b COBRA: an optimized code for fast analysis of ideal ballooning stability of three-dimensional magnetic equilibria. J. Comput. Phys. 161, 576.CrossRefGoogle Scholar
Schwab, C. 1993 Ideal magnetohydrodynamics: global mode analysis of three-dimensional plasma configurations. Phys. Fluids B 5, 31953206.CrossRefGoogle Scholar
Shafranov, V.D. 1957 On equilibrium magnetohydrodynamic configurations. Zh. Eksp. Teor. Fiz. 33, 710.Google Scholar
Snyder, P.B., Burrell, K.H., Wilson, H.R., Chu, M.S., Fenstermacher, M.E., Leonard, A.W., Moyer, R.A., Osborne, T.H., Umansky, M., West, W.P., et al. 2007 Stability and dynamics of the edge pedestal in the low collisionality regime: physics mechanisms for steady-state ELM-free operation. Nucl. Fusion 47, 961.CrossRefGoogle Scholar
Tang, W.M., Connor, J.W. & Hastie, R.J. 1980 Kinetic-ballooning-mode theory in general geometry. Nucl. Fusion 20, 1439.CrossRefGoogle Scholar
Zarnstorff, M.C., Berry, L.A., Brooks, A., Fredrickson, E., Fu, G.Y., Hirshman, S., Hudson, S., Ku, L.P., Lazarus, E., Mikkelsen, D., et al. 2001 Physics of the compact advanced stellarator NCSX. Plasma Phys. Control. Fusion 43, A237.CrossRefGoogle Scholar
Figure 0

Figure 1. Plots of the inputs to the VMEC code for the DIII-D-like design: (a) the pressure, (b) the rotational transform as a function of the normalised toroidal flux $s$ and (c) the cross section of the boundary.

Figure 1

Table 1. Relevant physical quantities for the DIII-D-like equilibrium.

Figure 2

Figure 2. Plots of the inputs to the VMEC code for the modified NCSX design: (a) the pressure, (b) the rotational transform as a function of the normalised toroidal flux $s$ and (c) the cross section of the boundary. Note the large negative shear until $s \approx 0.85$.

Figure 3

Table 2. Important physical quantities for the modified NCSX equilibrium.

Figure 4

Figure 3. Plots of the inputs to the VMEC code for the modified Henneberg-QA design: (a) the pressure, (b) the rotational transform as a function of the normalised toroidal flux $s$ and (c) the cross section of the boundary.

Figure 5

Table 3. Relevant physical quantities for the modified Henneberg-QA design.

Figure 6

Figure 4. (a) Comparison between the gradients of eigenvalue $\hat {\lambda }_{\alpha _{t}} = \partial \hat {\lambda }/\partial \alpha _{t}$ and $\hat {\lambda }_{\theta _0} = \partial \hat {\lambda }/\partial \theta _0$ obtained using a finite-difference scheme against those obtained using an adjoint method. The quantity $\mathrm {iter}$ is the number of iterations taken by the local optimiser on a flux surface before finding $\hat {\lambda }_{\mathrm {max}}$. The gradients match well for around four orders of magnitude. The discrepancy between the adjoint and finite difference $\hat {\lambda }_{\alpha _{t}}$ is due to the finite resolution of the $\texttt {VMEC}$ run. (b) Different grids used to calculate the gradient of the eigenvalue $\hat {\lambda }$ on a flux surface. A finite-difference scheme requires four points, whereas an adjoint method only requires one point. This gives us a four times speed-up.

Figure 7

Figure 5. Process of finding the globally maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ on the flux surface $s = 0.8$ of the NCSX equilibrium. We start by first finding the maximum $\hat {\lambda }$ on a discrete grid (marked by crosses) of $\alpha _{t}$ and $\theta _0$. From the discrete maximum $\hat {\lambda }$, we search for the global maximum eigenvalue using a local optimiser. In the inset, we show the approximate path taken by the optimiser to reach $\hat {\lambda }_{\mathrm {max}}$. This process is repeated for all the flux surfaces.

Figure 8

Figure 6. Plots of $\hat {\lambda }_{\mathrm {max}}$ against the normalised toroidal flux $s$ for the three chosen equilibria in (ac) and the eigenfunctions $\hat {X}$ at the maximum $\hat {\lambda }_\textrm {max}$ for each equilibrium in the (df): (a) DIII-D $\hat {\lambda }_\textrm {max}$, (b) NCSX $\hat {\lambda }_\textrm {max}$, (c) Henneberg-QA $\hat {\lambda }_\textrm {max}$, (d) DIII-D $\hat {X}(s = 0.95)$, (e) NCSX $\hat {X}(s = 0.8)$ and( f) Henneberg-QA$\hat {X}(s = 0.84)$. The decay of the eigenfunction could be a result of the Anderson localisation of ballooning modes, as discussed by Redi et al. (2002).

Figure 9

Figure 7. (a) Maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ of the initial and optimised DIII-D-like equilibrium. The optimised equilibrium is stable. (b) Boundary shape of the initial and final equilibria. Note the negative triangularity of the initial equilibrium and the positive triangularity of the optimised equilibrium.

Figure 10

Table 4. Comparison between relevant physical quantities of the initial and optimised DIII-D equilibrium.

Figure 11

Table 5. Boundary shape DoFs for the NCSX case.

Figure 12

Figure 8. (a) Maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ of the initial and optimised modified NCSX equilibrium. (b) Boundary shape of the initial and final equilibria at three different values of the toroidal angle $\zeta$. The dotted curves correspond to the initial cross sections whereas the solid curves are the final cross sections. Note how sensitive the maximum eigenvalue is to the boundary shape.

Figure 13

Table 6. Comparison between relevant physical quantities of the initial and optimised NCSX equilibrium.

Figure 14

Table 7. Boundary shape DoFs for the modified Henneberg-QA case. We have chosen the exact same coefficients as the NCSX case.

Figure 15

Figure 9. (a) Maximum eigenvalue $\hat {\lambda }_{\mathrm {max}}$ of the initial and optimised modified Henneberg-QA equilibrium. (b) Boundary shape of the initial and final equilibria at three different positions of the toroidal angle $\zeta$. The dotted curves correspond to the initial cross sections whereas the solid curves are the final cross sections.

Figure 16

Table 8. Comparison between relevant physical quantities of the initial and optimised modified Henneberg-QA equilibrium.