## 1 Introduction

### 1.1 Stellarators

The stellarator is a class of magnetic-confinement fusion devices that uses a magnetic field to confine plasmas inside a toroidal vessel. In order to avoid prompt losses of fusion-produced alpha particles, and to avoid large neoclassical transport, the magnetic field must be able to confine collisionless orbits on relatively long time scales. This can be achieved in an ‘omnigenous’ field, in which particles drift towards the edge of the plasma in equal measure as they drift inwards (Hall & McNamara Reference Hall and McNamara1975; Cary & Shasharina Reference Cary and Shasharina1997; Helander Reference Helander2014), thus yielding zero time-averaged radial displacement. While this property was once considered practically impossible to achieve in stellarators, recent efforts in stellarator design (Landreman & Paul Reference Landreman and Paul2022) were able to find fields with nearly perfect omnigenity.

Stellarators have several compelling properties as fusion reactor candidates, the most important one being the absence of plasma disruptions. In comparison with tokamaks, the plasmas in stellarators are relatively stable to kink modes and tearing modes thanks to the fact that the net toroidal plasma current is usually very small (Boozer Reference Boozer2021). It does not vanish entirely, even in the absence of active current drive, due to the bootstrap current arising in response to plasma density and temperature gradients. This bootstrap current can be eliminated, however, by making the stellarator ‘quasi-isodynamic’ (QI). A QI field is, by definition, omnigenous and has poloidally closed contours of constant field strength. Trapped particles thus generally precess in the poloidal direction. In a QI field, the bootstrap current vanishes in the limit of low collision frequency (Helander & Nührenberg Reference Helander and Nührenberg2009; Helander, Geiger & Maaßberg Reference Helander, Geiger and Maaßberg2011), which is helpful not only for avoiding current-driven instabilities, but also for facilitating island-divertor operation. A mathematical description of QI fields is given in § 1.5.

For these reasons, QI stellarators are a uniquely attractive option for a fusion reactor design, and are partially why the largest stellarator ever constructed, Wendelstein 7-X in Germany, was designed as an attempt at a QI stellarator.

### 1.2 Omnigenity and the second adiabatic invariant

Mathematically, omnigenity can be understood through the so-called ‘second adiabatic invariant’ ($\mathcal {J}$), which is a constant of motion for magnetically trapped particles. A particle in a toroidal magnetic field is defined as trapped if it is unable to enter regions with magnetic field strength $B\geq B_*$ within that field, and therefore will ‘bounce’ along some field line $\alpha$ between points $l_1$ and $l_2$ at which the magnetic field strength $B=B_*$, and between which $B< B_*$. This particle's $\mathcal {J}$ can be written as

where $m$ is the particle's mass, $v_\parallel$ its velocity parallel to the magnetic field line and $l$ is the geometric length along said field line. A trapped particle will in general be well confined if its second adiabatic invariant $\mathcal {J}$ is conserved when the particle remains on the same flux surface $s=\psi /\psi _\textrm {edge}$, where $\psi _\textrm {edge}$ is the toroidal magnetic flux, $\psi$, at the plasma boundary (Helander Reference Helander2014). In other, words, contours of constant $\mathcal {J}$ should coincide with flux surfaces. In the absence of an electric field within the flux surface, a particle with total energy $\mathcal {H} = m v^2/2$ has $v_\parallel$ given by

where $\mu = mv_\perp ^2/2B$ is another invariant of the particle's motion, and $v_\perp$ is the particle's velocity perpendicular to the magnetic field line. Since $\mathcal {H}$, $\mu$ and $m$ are constants for collisionless particles, and because $v_\parallel =0$ when $B=B_*$, (1.2) can be substituted into (1.1) to yield

where $\lambda \equiv 1/B_*=\mathcal {H}/\mu$. A particle's drift across field lines $\Delta \alpha$, and across flux surfaces $\Delta \psi \propto \Delta s$, are given by $\mathcal {J}$ as follows (Helander Reference Helander2014):

*a,b*)\begin{equation} \Delta\alpha ={-}\frac{1}{Ze} \partial_\psi\mathcal{J}, \quad \Delta\psi = \frac{1}{Ze} \partial_\alpha \mathcal{J}, \end{equation}

where $Ze$ is the charge of the particle in question and $e$ is the elementary charge. We use the notation $\partial _x \equiv \partial /\partial x$ for any variable $x$.

Because the prefactor in (1.3) is a constant of a particle's motion, it is often useful to think in terms of $\tilde {\mathcal {J}}(s,\alpha,\lambda ) \equiv \mathcal {J} / \sqrt {2\mathcal {H}m}$, which can now be understood as a property of the magnetic field, rather than a property of a particle. Note that, while $\tilde {J}$ is still a function of $\lambda$ (a particle property), one can also express $\lambda =1/B_*$ (a magnetic field property). Evaluating $\tilde {\mathcal {J}}$ over various $s$ and $\lambda$ can thus be used to evaluate how well a field confines trapped particles. We can now connect $\mathcal {J}$ to the concept of omnigenity: in a perfectly omnigenous field, $\partial \tilde {\mathcal {J}}/\partial \alpha |_{s,\lambda }=0\ \forall \, s,\lambda,\alpha$.

### 1.3 Maximum-$\mathcal {J}$ configurations

Fields with the so-called ‘maximum-$\mathcal {J}$’ property, where $\partial _\psi \mathcal {J} < 0$ (such that $\mathcal {J}$ decreases as a function of $\psi$) have supressed trapped electron mode (TEM) driven turbulence. This can be understood from the arguments from Helander *et al.* (Reference Helander, Beidler, Bird, Drevlak, Feng, Hatzky, Jenko, Kleiber, Proll and Turkin2012) and Proll *et al.* (Reference Proll, Plunk, Faber, Görler, Helander, McKinney, Pueschel, Smith and Xanthopoulos2022), which consider a plasma instability which moves a particle radially by $\Delta \psi >0$. The energy required to do so, $\Delta E$, can be found through the conservation of both $\mathcal {J}$ and the cumulative energy of the particle and instability ($\Delta \mathcal {H} + \Delta E = 0$)

Because $\partial _\mathcal {H}\mathcal {J} < 0$ (see (1.3)), we see that $\Delta E$ will be negative when $\partial _\psi \mathcal {J} < 0$. A negative $\Delta E$ means that the instability that caused the particle's radial displacement loses energy, and is therefore stabilized. For this reason, maximum-$\mathcal {J}$ is a desirable property in stellarators. Because $\mathcal {J}$ tends to be larger when $B$ is smaller (see (1.3)), maximum-$\mathcal {J}$ is often also called minimum-$B$.

### 1.4 Optimization

Stellarator designs can be extremely complicated, and thus are typically found using numerical optimization algorithms, which require the use of a ‘target function.’ This target function should take some input (in this case, a stellarator design) and output a number that describes how ‘good’ this stellarator design is. A novel choice of target function, which is described in § 2.1, was crucial to achieving the results presented.

There are many possible approaches to stellarator optimization (Henneberg *et al.* Reference Henneberg, Hudson, Pfefferlé and Helander2021). For these optimizations, we employed the ideal magnetohydrodynamics (MHD) code VMEC (Hirshman, van RIJ & Merkel Reference Hirshman, van RIJ and Merkel1986). In MHD, the magnetic field within a plasma is completely described by the shape of the plasma's boundary (as well as its current and pressure profiles, although these quantities are set to zero in this work). VMEC is thus able to take a plasma boundary as an input and output the corresponding magnetic field. This boundary shape is described by Fourier coefficients, which map toroidal and poloidal angles to physical coordinates $R$ and $Z$, taking the ‘centre’ of the torus's ‘doughnut hole’ as the origin.

These Fourier coefficients, $\boldsymbol {x}$, are the inputs to our target function, $f$. We employed a nonlinear optimization algorithm, which uses numerically calculated finite-difference gradients to attempt to find the values of $\boldsymbol {x}$ that result in the smallest possible target function output.

### 1.5 Conditions for QI

In order to design a QI target function, one must understand the properties of a perfectly QI magnetic field. For the purposes of these optimizations, there are three conditions that describe the ‘ideal’ perfectly QI magnetic field, corresponding to a geometry with $n_{fp}$ field periods with $n_{fp}$ a natural number (Cary & Shasharina Reference Cary and Shasharina1997):

(i) There should be only one minimum in magnetic field strength along any field line spanning a single field period. While this is not strictly necessary, it is a useful constraint to impose on a target function.

(ii) All contours of constant $B$ must close poloidally, meaning that

(

*a*) $\mathcal {B}_\textrm {min}(s)$ – the minimum $B$ on each flux surface $s$ – must be present exactly once on each field line along a single toroidal field period; and(

*b*) $\mathcal {B}_\textrm {max}(s)$ – the maximum $B$ on each flux surface $s$ – must on every field line be located at the toroidal angle $\varphi =\{0,2{\rm \pi} /n_{fp}\}$ in Boozer coordinates (Boozer Reference Boozer1981). In other words, contours of $\mathcal {B}_\textrm {max}(s)$ must be straight, vertical lines in the $\theta -\varphi$ plane.

(iii) The ‘bounce distance’, $\delta$ in Boozer coordinates ($\theta,\varphi$) along a field line between consecutive points with equal $B=B_*$ (‘branches’ Helander & Nührenberg Reference Helander and Nührenberg2009) should be independent of the field line label, $\alpha$, for any given flux surface $s$ ($\partial \delta /\partial \alpha |_{B,s}=0 \, \forall \, \alpha,B,s$).

A schematic of a QI field is shown in figure 1. A frequently studied subclass of omnigenity, known as quasisymmetry, requires that these contours be straight lines in Boozer coordinates. If $B$-contours are straight lines, and close poloidally, then the field has ‘quasi-poloidal (QP) symmetry.’ A QP symmetry is impossible close to the magnetic axis (Helander Reference Helander2014), and fields with precise QP symmetry have proven difficult to find through optimization (Landreman & Paul Reference Landreman and Paul2022), but because the conditions for QI are less strict, it offers a promising alternative.

## 2 Target function

In this work, we used the SIMSOPT optimization suite (Landreman *et al.* Reference Landreman, Medasani, Wechsung, Giuliani, Jorge and Zhu2021; SIMSOPT Development Team 2021), to interface between VMEC, our new target function, and the trust-region reflective optimization algorithm in scipy (Virtanen *et al.* Reference Virtanen, Gommers, Oliphant, Haberland, Reddy, Cournapeau, Burovski, Peterson, Weckesser and Bright2020). It is with these tools that we are able to optimize stellarator equilibria.

### 2.1 Targetting QI fields

Using the conditions detailed in § 1.5, this target seeks to construct a closely related, perfectly QI field from an input field, and then penalizes the difference between the two (see (2.10)). At each optimization step, a new perfectly QI field is constructed.

In this work, we sample $n_\alpha$ field lines uniformly between $0$ and $2{\rm \pi}$, and $n_\varphi$ values of $\varphi$ between $0$ and $2{\rm \pi} /n_{fp}$ in Boozer space. We will call each $B(\varphi )|_{\alpha,s}$ a ‘well.’ We then apply a set of three numerical transformations to these wells to create an artificial set of closely related, perfectly QI wells. An example of what it might look like to turn a set of non-QI wells into a set of QI wells is shown in figure 2.

Note that there is no guarantee that the constructed QI wells (figure 2*b*) are physically realizable. In fact, Cary & Shasharina (Reference Cary and Shasharina1997) points out that an analytic QI field is impossible to achieve perfectly (although it may be possible to approach it arbitrarily closely). In this case, it does not matter that these wells are non-physical; what matters is that the difference between the original wells (figure 2*a*) and the constructed wells (figure 2*b*) describes how far the wells deviate from a QI field. Furthermore, as the input field becomes more QI, the target field will become closer to a ‘physical’ field. Below are the steps taken to find a closely related set of QI wells, analogous to those shown in figure 2.

#### 2.1.1 Part 1: the squash

To construct a QI well from an input well, we first find the toroidal angle of its minimum magnetic field strength, $B_\textrm {min}$, and call it $\varphi _\textrm {min}$. Note that, in this input well, $B_\textrm {min}$ is not necessarily the same as $\mathcal {B}_\textrm {min}$ (the global minimum $B$ on the surface). In a perfectly QI field, condition (i) implies that $B_l = B(\varphi <\varphi _\textrm {min})$ and $B_r = B(\varphi >\varphi _\textrm {min})$ should be monotonically decreasing towards $\varphi _\textrm {min}$.

Because each well is defined by an array of points, we can enforce this condition by ‘flattening’ any points that are non-monotonic. Specifically, if B$_\mathtt {l}$[i] is the array of field strengths for $\varphi <\varphi _\textrm {min}$ where $\texttt {i}=0$ corresponds to $\varphi =0$ and $\texttt {i}=n_i$ corresponds to $\varphi =\varphi _\textrm {min}$, we loop through B$_\mathtt {l}$[i] from i=$\{0,1,2,3,\ldots,n_i-1\}$ and, if B$_\mathtt {l}$[i+1] > B$_\mathtt {l}$[i], we set B$_\mathtt {l}$[i+1] = B$_\mathtt {l}$[i]. A similar operation for $B_r$ yields a well that is perfectly monotonic on either side of its minimum. An example of this operation is shown in figure 3(*b*).

It is worth noting that, while this transformation does create monotonic wells, it is a very crude way of doing so. One can think of any number of ways to do this (for instance, one could sort the points in $B_l$ and $B_r$ in descending and ascending order respectively). It is unclear what the ‘best’ way of doing this would be.

#### 2.1.2 Part 2: the stretch

Our new well looks slightly more QI than the original, but further tranformations are still needed. Condition (ii) requires that $B_\textrm {min}=\mathcal {B}_\textrm {min}$, and $B(\varphi =0,2{\rm \pi} /n_{fp})=\mathcal {B}_\textrm {max}$. To enforce this condition in our new well, we ‘stretch’ $B_l$ and $B_r$ as follows:

We here define $\mathcal {B}_\textrm {min}$ and $\mathcal {B}_\textrm {max}$ to be the smallest and largest $B$ that are found on the flux surface.

After applying this transformation, these wells will satisfy condition (ii), as can be seen in figure 3(*c*).

While we apply a ‘linear’ stretch in this work, one could choose a different transformation that yields $\textrm {d}B/\textrm {d}\varphi =0$ at $\varphi =0,2{\rm \pi} /n_{fp}$, possibly resulting in more realistic wells. It is unclear if this approach would work better.

The entirety of the $n_{fp}=1$ configuration's optimization (§ 4.1), and the early stages of the $n_{fp}=2$ configuration's optimization (§ 4.2) used the approximation that Boozer $\varphi =0$ was the same as VMEC $\phi =0$ so as to avoid a full Boozer coordinate transformation, although this approximation failed for the $n_{fp}=3$ optimization (§ 4.3).

#### 2.1.3 Part 3: the shuffle

Finally, we are ready to address condition (iii), which requires that all distances between consecutive branches are equal at constant $B_*$ for all $\alpha$. To do this, we first find the weighted-mean bounce distance for each field line on the surface in our new well, $\langle \delta \rangle _\alpha (B_*)$ for various values of $B_*$, which can be done numerically. The weightings $w$ in this mean, which were used to improve the smoothness of the optimization algorithm, are inversely proportional to the severity of the prior squash and stretch steps

where $B_{(a)}$ and $B_{(c)}$ are the original and stretched wells, respectively. While at first glance these weights may appear to (possibly) diverge to infinity, which would be a problem, this is extremely unlikely, as it would require the maxima and minimum $B$ of the well to exactly equal $\mathcal {B}_\textrm {max}$ and $\mathcal {B}_\textrm {min}$. However, if this is a concern, one could simply modify $w(\alpha )$ to add some small number to the denominator.

With these weights, we can now write

for bounce points $\varphi _1(B_i,\alpha )$ and $\varphi _2(B_i,\alpha )$, where here $\langle \cdot \rangle _\alpha$ is used to denote that the average is taken over $\alpha$. To be QI, our well will satisfy $\delta (B_i,\alpha )=\langle \delta \rangle _\alpha (B_i)$ for all $B_i$ and $\alpha$. The motivation for using a weighted mean is to attempt to choose $\langle \delta \rangle _\alpha$ such that it most closely resembles the bounce distances of the original wells. Wells that have been stretched and squashed significantly may no longer resemble their original forms, and so their new bounce distances should not be as influential than bounce distances from wells that did not require significant stretching or squashing.

To force our new well into being perfectly QI, we define a transformation that shifts (‘shuffles’) the bounce points left and right as follows:

There are many methods by which $\Delta \varphi _1$ and $\Delta \varphi _2$ can be chosen, so long as

but they should ideally be chosen so as to minimize the difference between the shuffled well and the stretched well. One must also take care that stellarator symmetry is preserved (if the optimizations are stellarator symmetric), and also must ensure

for $B_j< B_i$. In this work, we shuffled the points around $B_\textrm {min}$ first, and moved ‘up’ the well until $B_\textrm {max}$ had also been shuffled. For each set of bounce points, we chose $\Delta \varphi _1 = \Delta \varphi _2 = (\langle \delta \rangle _\alpha - \delta )/2$. If this transformation violated (2.7) or (2.8), both points were shifted by $\varphi _1(B_i) - \varphi _1(B_j) + 10^{-5}$ or $\varphi _2(B_j) - \varphi _2(B_i) - 10^{-5}$, respectively, so as to correct this violation.

After this modification, the resulting field, which we call $B_\textrm {QI}(s,\alpha,\varphi )$, should be completely QI. An example of such a well is shown in figure 3(*d*). We have now constructed a ‘target field’ that is close to our input field. To penalize deviations from QI, we finally construct our penalty function, $f$, as the difference between the scaled original field and the scaled constructed, QI field

where the prefactor $1/(\mathcal {B}_\textrm {max} - \mathcal {B}_\textrm {min})^2$ makes the penalty dimensionless.

### 2.2 Elongation target

Recent efforts to find QI fields have resulted in very elongated flux surfaces, such that the semi-minor radius ($a$) of a poloidal cross-section is significantly smaller than its semi-major radius ($b$) (Camacho Mata, Plunk & Jorge Reference Camacho Mata, Plunk and Jorge2022; Jorge *et al.* Reference Jorge, Plunk, Drevlak, Landreman, Lobsien, Camacho and Helander2022). In fact, close to the magnetic axis, QI can be achieved to arbitrary accuracy by making the flux surfaces infinitely elongated in the direction perpendicular to the curvature vector. The magnetic drift then becomes tangential to these surfaces almost everywhere. High elongation is, however, undesirable in stellarators (as we discuss in § 6), and so we included a penalty function to keep elongation under control.

Because stellarator boundaries are three-dimensional, the shape of a poloidal cross-section depends on the toroidal angle $\phi$. Hence, to calculate elongation, we must sample several cylindrical angles, and expect to find a different elongation at each angle.

Crucially, the angle (relative to the magnetic axis) at which the poloidal cross-section is taken is also an important factor when calculating elongation. To understand this, we first define the vector normal to the poloidal cross-section as $\boldsymbol {n_x}$, and the vector tangent to the magnetic axis as $\boldsymbol {t_a}$. If $\boldsymbol {n_x}$ and $\boldsymbol {t_a}$ are not parallel, then the shape of the cross-sections will change. Because VMEC uses the cylindrical angle $\phi$ to define its boundary, $\boldsymbol {n_x}$ and $\boldsymbol {t_a}$ will necessarily not be parallel in configurations with a non-planar magnetic axis. We hence must take care that, for a given cross-section which intersects the magnetic axis at a point in real space $\boldsymbol {p_a}$ at VMEC angle $\phi _a$, each point that defines the edge of said cross-section $\boldsymbol {p_x}(\phi _x)$ satisfies the relation

We define our cross-sections by numerically finding points on the boundary at various values of $\phi _x$ using (2.11) for several VMEC $\theta$ coordinates between $0$ and $2{\rm \pi}$.

Because these poloidal cross-sections are not, in general, ellipses, and can take any number of unusual shapes, the definition of $a$ and $b$ are somewhat ambiguous. We address this problem by defining ‘effective’ semi-major and -minor axes by finding the cross-section's circumference, $C_\sigma$, and its area, $A_\sigma$, and using them find the ‘effective elongation’ $\varepsilon \equiv b/a$ of the ellipse with the same circumference and area. In this work, we penalized only the maximum elongation.

### 2.3 Additional targets

Two other properties that the optimizer could (and did) exploit to make $f_\textrm {QI}\rightarrow 0$ are arbitrarily large aspect ratios $A$, and/or mirror ratios

These are undesirable properties in a stellarator, so we set our penalty function to be

where $\varDelta _*$, $\varepsilon _*$ and $A_*$ are the maximum acceptable threshold values of the mirror ratio, maximum elongation and aspect ratio, respectively, and are chosen somewhat arbitrarily. By setting $w_\textrm {QI} \ll w_\varDelta \sim w_\varepsilon \sim w_A$, we ensure the results of these optimizations do not exceed these thresholds.

The mirror ratio $\varDelta$ is calculated by numerically finding $\mathcal {B}_\textrm {max}$ and $\mathcal {B}_\textrm {min}$ on flux surface $s=1/51$, as this was the innermost flux surface calculated in VMEC in the early stages of these optimizations. The elongation $\varepsilon$ is calculated by finding the approximate boundary cross-sections on planes normal to the magnetic axis at various physical angles $\phi$, and finding the elongation of the ellipse with the same area-to-circumference ratio of each cross-section. VMEC's aspect ratio is taken as $A$.

Note that during the stretch step, (2.1) allows a built-in way to limit the mirror ratio, which must be done when optimizing for QI. On some surface reasonably near the axis, simply replacing $\mathcal {B}_\textrm {min}$ and $\mathcal {B}_\textrm {max}$ with values that result in a desired mirror ratio may suffice, although this was not done in this work.

## 3 Initial conditions

The optimization method presented in this work is sensitive to the choice of initial conditions. Because of this, we chose initial conditions by exploiting the ‘near-axis expansion’ (Landreman, Sengupta & Plunk Reference Landreman, Sengupta and Plunk2019; Jorge, Sengupta & Landreman Reference Jorge, Sengupta and Landreman2020; Plunk, Landreman & Helander Reference Plunk, Landreman and Helander2021), which allows us to construct a first-order plasma boundary given by a rotating ellipse, based around a carefully chosen magnetic axis. According to the near-axis expansion, magnetic axes in QI fields should have zero curvature at points where the field is at its minimum or maximum (Camacho Mata *et al.* Reference Camacho Mata, Plunk and Jorge2022). There is also reason to believe (Camacho Mata *et al.* Reference Camacho Mata, Plunk and Jorge2022) that careful control over the axis’ torsion can dramatically improve these constructions, although it is not accounted for in this approach.

To find the $n_{fp}=2$ configuration presented, we looked to § 6 in Camacho Mata *et al.* (Reference Camacho Mata, Plunk and Jorge2022) as a starting point, which includes a two field-period configuration constructed from the near-axis expansion. Realizing this configuration with high fidelity requires a large number of poloidal (mpol) and toroidal (ntor) modes in VMEC (around 10 and 90, respectively), which is problematic for optimization. Hence, we simply truncated our Fourier spectrum to remove all Fourier coefficients with mode numbers >2. The result was a configuration with very poor QI, but with a similar boundary and axis shape to its high-mode counterpart. The initial condition for the $n_{fp}=3$ configuration was a VMEC input file from the previously completed $n_{fp}=2$ optimization, with the number of field periods increased, truncating the boundary's Fourier spectrum to include only the first mpol$=$ntor$=2$.

In the next section, we describe a novel approach that was used to generate the initial condition for the $n_{fp}=1$ configuration presented. This initial condition was effective for $n_{fp}=1$, but not for $n_{fp}>1$.

### 3.1 Heuristic construction

In this section, we present a heuristic model for the boundary shape of a configuration that is close to QI or QP symmetry. The goal is to find a boundary shape with very small number of Fourier modes in the usual representation in cylindrical coordinates, which could be used to initialize optimization. At the same time, the model may give some insight as to the character of the space of solutions, e.g. how they could be parameterized. It also indicates the minimal set of Fourier modes to use for the first optimization stage. The approach here is not rigorous, as the method in Plunk, Landreman & Helander (Reference Plunk, Landreman and Helander2019) is, but has the advantages of having no differential equations to solve and giving a boundary shape described by very few Fourier modes. The principles of the model are as follows:

(i) The maximum $B$ will occur at $\phi =0$ and the minimum $B$ will occur at $\phi ={\rm \pi} /n_{fp}$.

(ii) The curvature of the magnetic axis will need to vanish at these values of $\phi$, or else $\mathcal {B}_{\max }$ and $\mathcal {B}_{\min }$ on flux surfaces to first order in $\sqrt {\psi }$ would be points instead of poloidally closed curves.

(iii) The flux surface shapes surrounding the axis will be ellipses that rotate as $\phi$ increases. The minor axis of the ellipse will approximately align with the axis curvature vector, to minimize the poloidal variation of $B$. This can be understood from the near-axis relation $B_{1}=\kappa X_{1}B_{0}$, derived in Garren & Boozer (Reference Garren and Boozer1991), where $B = B_0 + r B_1 + O(r^2)$, $B_0(\varphi )$ is the field strength on axis, $r \propto \sqrt {\psi }$ is a surface label, $\kappa (\varphi )$ is the axis curvature and $r X_1(\theta,\varphi )$ is the extent of the surfaces in the direction of the axis normal vector. We want $B_{1}$ to be small so it does not strongly modify the quasi-poloidally symmetric field associated with the mirror in $B_{0}$.

(iv) The magnetic axis will have some torsion, and this torsion and the rotating elongation will contribute constructively to $\iota$.

(v) The cross-sectional area of the flux surfaces will vary toroidally, in such a way as to give the desired mirror ratio.

We proceed by first finding a magnetic axis with the desired points of vanishing curvature, then defining a surface surrounding it with the appropriate rotating elongation, and finally adjusting this surface to provide a mirror term.

First, for the magnetic axis shape, we adopt the minimal model

*a,b*)\begin{equation} R_{ax}\left(\phi\right)=R_{0,0}+R_{0,2}\cos\left(2n_{fp}\phi\right),\quad Z_{ax}\left(\phi\right)=Z_{0,2}\sin\left(2n_{fp}\phi\right). \end{equation}

Note the factors of 2, introduced here because the symmetry of the flux surface shapes will be different from the symmetry of the axis itself. For instance a configuration with $n_{fp}=1$ has a single $\mathcal {B}_{\max }$ and single $\mathcal {B}_{\min }$, meaning there are two points of vanishing curvature on the axis, and so generally the axis can be symmetric under rotation by ${\rm \pi}$. Let $\boldsymbol {r}(\phi )=[R_{ax}\cos \phi,R_{ax}\sin \phi,Z_{ax}]$ denote the position vector along the axis. The axis curvature can be computed from $\kappa =|\boldsymbol {r}'\times \boldsymbol {r}''|/|\boldsymbol {r}'|^{3}$, where primes denote $\textrm {d}/\textrm {d}\phi$. For the curve described by (3.1*a*,*b*), the curvature at $\phi =0$ and $\phi ={\rm \pi} /n_{fp}$ is

Therefore the curvature vanishes at these points when

Note that there is no constraint on the coefficient $Z_{0,2}$, which can be adjusted to vary the axis torsion and hence vary $\iota$. If desired, the number of Fourier modes in the axis shape (3.1*a*,*b*) could be increased, and $\kappa (\phi )$ could be made to vanish to higher order (Camacho Mata *et al.* Reference Camacho Mata, Plunk and Jorge2022; Jorge *et al.* Reference Jorge, Plunk, Drevlak, Landreman, Lobsien, Camacho and Helander2022), but the expressions here are sufficient for the minimal model here.

Next, we establish a rotating ellipse surrounding the axis

where $\hat {R}=a\cos \vartheta$, $\hat {Z}(\vartheta )=b\sin \vartheta$, $a$ is the semi-major radius of the ellipse (half the major diameter) and $b$ is the semi-minor radius of the ellipse. The variable $\vartheta$ used is not a true poloidal angle, and can be understood as the angle about the rotating ellipse, in the ellipse's ‘rest frame’, such that $\vartheta =0$ and $\vartheta ={\rm \pi} /2$ are always along its semi-major and -minor axes, respectively. The constant $k$ describes the number of rotations of the ellipse per field period; we will consider $k=1$ for simplicity, but other values could be considered. Also for simplicity we have assumed in this construction that the ellipse rotates at a uniform speed with respect to $\phi$, and that the elongation is independent of $\phi$. Neither of these choices is likely to be precisely optimal, but they are good enough for this simplistic model.

We now switch to a true poloidal angle $\theta =\vartheta -n_{fp}\phi$, and introduce the elongation $\varepsilon =a/b$. If $\varepsilon >1$, then the ellipses will be oriented so the thin dimension is roughly aligned with the curvature vector, which will minimize $B_{1}$. Applying some trigonometric identities, we arrive at

It is convenient to compute $a$ and $b$ in terms of an effective aspect ratio $A$. The average minor radius conventionally used in the stellarator community, $a_{\textrm {eff}},$ is defined such that ${\rm \pi} a_{\textrm {eff}}^{2}$ equals the toroidal average of the cross-sectional area in the $R$–$Z$ plane. In our case, this area is ${\rm \pi} ab$, so $a_{\textrm {eff}}=\sqrt {ab}$. Then the aspect ratio $A$ is approximately $A=R_{0,0}/a_{\textrm {eff}}=R_{0,0}/(b\sqrt {\epsilon })$, so

*a,b*)\begin{equation} b=\frac{R_{0,0}}{A\sqrt{\epsilon}},\quad a=\frac{R_{0,0}\sqrt{\epsilon}}{A}. \end{equation}

The remaining task is to adjust the surface shape to create a mirror term. This is achieved by adding a toroidal variation to the size of the ellipse; flux conservation then implies that an increase in the area of the ellipse corresponds to a proportional decrease in $B$. Specifically, we add the term $-\xi a\cos \theta \cos (n_{fp}\phi )$ to $R$, and add $-\xi b\sin \theta \cos (n_{fp}\phi )$ to $Z$, for a constant $\xi$. The motivation for these terms is that they result in the $\phi =0$ cross-sectional dimensions being reduced by a factor $(1-\xi )$

*a,b*)\begin{equation} R\left(\theta,0\right)=R_{ax}\left(0\right)+a\left(1-\xi\right)\cos\theta,\quad Z\left(\theta,0\right)=b\left(1-\xi\right)\sin\theta, \end{equation}

while the $\phi ={\rm \pi} /n_{fp}$ cross-sectional dimensions are increased by a factor $(1+\xi )$

*a,b*)\begin{equation} R\left(\theta,{\rm \pi}/n_{fp}\right)=R_{ax}\left({\rm \pi}/n_{fp}\right)+a\left(1+\xi\right)\cos\theta,\quad Z\left(\theta,{\rm \pi}/n_{fp}\right)=b\left(1+\xi\right)\sin\theta. \end{equation}

(The choice of terms to add to $R$ and $Z$ to achieve this kind of effect is not unique, but the choice here is convenient as it introduces no $\theta -3n_{fp}\phi$ Fourier modes.) Flux conservation gives

Rearranging, we can write the relative size of the mirror term (see (2.12)) as

which can be inverted to give

Applying a trigonometric identity for the new terms associated with the mirror field, we finally arrive at

This expression is supplemented with (3.8*a*,*b*) and (3.13) to compute $a$, $b$ and $\xi$ in terms of $A$, $\varepsilon$, $\varDelta$ and $R_{0,0}$.

The basic input parameters to this boundary shape are the average major radius $R_{0,0}$, the aspect ratio $A$, the elongation $\varepsilon$, the mirror ratio $\varDelta$ and the vertical extent of the axis $Z_{0,2}$. Presumably, larger $\varDelta$ helps with obtaining a good quality of QP symmetry, since for a larger mirror ratio, $B_{1}$ will distort the vertical $B$ contours less. But if $\varDelta$ is too large, the reduced $B$ at $\phi ={\rm \pi} /n_{fp}$ becomes low enough that the confinement degrades there. Increasing $\varepsilon$ helps with reducing $B_{1}$, thereby improving the QP symmetry. Also, increasing $\varepsilon$ and $Z_{0,2}$ will increase $\iota$, which is good since it presumably increases the $\beta$ limit, but at the expense of increased shaping which may degrade the QP symmetry and make coils more difficult. Since both $\varepsilon$ and $Z_{0,2}$ should strongly affect $\iota$, it appears that there is significant flexibility in $\iota$.

One property of this construction that is noteworthy is that $n=2 n_{fp}$ modes are required, both for $m=0$ and $m=1$. This means that if one tries to initially optimize in a small parameter space, consisting of very few Fourier modes, at least these modes must be included. In particular one should not truncate the parameter space to only modes with $n\le n_{fp}$.

A natural generalization of this model is the following. The surfaces can be ellipses in the plane perpendicular to the magnetic axis, with the minor axis of the ellipse oriented along the normal vector of the magnetic axis. The boundary surface can then be computed in standard cylindrical coordinates using the method of § 4.2 in Landreman *et al.* (Reference Landreman, Sengupta and Plunk2019). This variant of the procedure has the advantage of being slightly more rigorous in the minimization of $B_{1}$ and in the determination of the ellipse's rotation at each $\phi$, but at the cost of more Fourier modes in the boundary shape.

## 4 Results

To demonstrate the robustness of this approach, we present three different optimized vacuum configurations which show excellent QI properties. The three-dimensional representation of these configurations’ boundaries and the corresponding magnetic field strength can be seen in figure 4. We evaluated all of these fields using the code SPEC (Hudson *et al.* Reference Hudson, Dewar, Dennis, McGann, von Nessi and Lazerson2012; Qu *et al.* Reference Qu, Pfefferlé, Hudson, Baillod, Kumar, Dewar and Hole2012) and found that the VMEC solutions are undisturbed by magnetic islands. In all cases, the rotational transform $\iota$ is larger at the magnetic axis than at the plasma boundary.

When performing these optimizations, we began by optimizing the non-zero modes of the initial condition. We then gradually optimized larger VMEC modes, increasing the toroidal mode number ntor twice as quickly as the poloidal mode number mpol. This is motivated by the fact that QI equilibria generated from near-axis expansion solutions generally require more toroidal modes than poloidal modes.

All configurations presented in this paper, the QI target function presented above, and most of the codes used to create this paper's figures can be found at Goodman (Reference Goodman2022). Some plotting routines included were taken from Landreman (Reference Landreman2021).

### 4.1 One field period

We first present a precisely QI configuration with a single field period, inspired by the success of the configuration published in Jorge *et al.* (Reference Jorge, Plunk, Drevlak, Landreman, Lobsien, Camacho and Helander2022). At first, this configuration was optimized with no rotational transform target, but the resulting field was found to have magnetic islands when run through SPEC, due to its rotational transform passing through the low-order rational 3/5. Further optimizing to avoid this resulted in a nearly flat rotational transform profile just above $\iota =0.601$. To generate a starting point, we used the method detailed in § 3.1.

The aspect ratio of this configuration, calculated in VMEC, is around $A=10$, the mirror ratio $\varDelta$ is just below $0.19$ and the maximum flux surface elongation of this configuration is just above $\varepsilon =5.5$. These values match their respective threshold values $A_*$, $\varDelta _*$ and $\varepsilon _*$ very closely.

### 4.2 Two field periods

This optimized configuration has $\varDelta =0.22$, $A=10$ and maximum $\varepsilon =6$. Optimizing with no target for $\iota$ resulted for $\iota \in [0.45,0.55]$. While SPEC did not detect any islands in this configuration, we nonetheless continued to optimize this configuration with an explicit penalty for low-order rationals in $\iota$ to find $\iota \in [0.61,0.62]$ to avoid rational surfaces. These are both significant departures from the rotational transform of the configuration used as an initial condition, which sat around $\iota =0.1$, although the shape of the axis looks extremely similar. We discuss this change in § 5.5.

### 4.3 Three field periods

This configuration has $\varDelta =0.2$, $A=8$ and maximum $\varepsilon =6.2$. The rotational transform $\iota$ lies between 0.75 and 0.8, chosen to avoid low-order rational surfaces. As an initial condition, we used the fully optimized $n_{fp}=2$ configuration, with an increased $n_{fp}$ and decreased number of VMEC modes (mpol = ntor = 2).

## 5 Analysis

### 5.1 Confinement

Compared with most earlier stellarators, all three configurations presented in this work have excellent confinement properties, both in vacuum and with finite pressure. In figure 5, we show both the neoclassical transport magnitude at low collisionality $\varepsilon _\textrm {eff}^{3/2}$ as described in Beidler *et al.* (Reference Beidler, Allmaier, Isaev, Kasilov, Kernbichler, Leitold, Maaßberg, Mikkelsen, Murakami and Schmidt2011) (calculated using the method from Drevlak *et al.* Reference Drevlak, Heyn, Kalyuzhnyj, Kasilov, Kernbichler, Monticello, Nemov, Nührenberg and Reiman2003), and their fusion-born alpha particle confinement (calculated using the SIMPLE code Albert, Kasilov & Kernbichler Reference Albert, Kasilov and Kernbichler2020), of the optimized vacuum configurations. To be consistent with Landreman & Paul (Reference Landreman and Paul2022), we scaled these configurations to the size of the ARIES-CS device, i.e. to have an effective minor radius of $1.7$ m (as calculated by VMEC) and field strength $B_{0,0}(s=0)=5.7$ T, and simulated 5000 collisionless alpha particles with energy $3.5$ MeV started isotropically on the $s=0.25$ surface using the SIMPLE code (Albert *et al.* Reference Albert, Kasilov and Kernbichler2020).

It is interesting to understand how these configurations, and their confinement properties, change when a pressure profile $p(\psi )$ is added to them. In this work, we use a linear pressure profile $p(s) \propto 1 - s$, and use volume ($V$) averaged $\beta$,

to describe differences in pressure normalization, where $\mu _0$ is the vacuum permeability constant.

Because they were optimized in vacuum, we expect neoclassical transport to worsen when more pressure is added (which matches the observed trends in figure 6). For the same reason, $\partial _\alpha \mathcal {J}$ will increase at higher $\beta$, which would lead to more fast particles being lost, all else being equal. However, other competing factors create a non-trivial correlation between fast-particle losses and $\beta$. This can be understood by first noting that the plasma is, broadly speaking, diamagnetic, meaning that the magnetic field strength roughly decreases with added plasma pressure, and thus also tends to increase with radius (in some average sense) when $\textrm {d}p/\textrm {d}s < 0$. Because $\mathcal {J}$ increases when $B$ decreases, we thus expect $\partial _s\mathcal {J}$ to become more negative as plasma pressure builds up. In other words, increased pressure, in general, makes it easier to find maximum-$\mathcal {J}$ configurations, in which $\partial _s\mathcal {J} < 0$.

Now consider a configuration which is minimum-$\mathcal {J}$ in vacuum (as is typically the case), meaning that $\mathcal {J}(s_1) < \mathcal {J}(s_2)\,\forall \,\alpha,\lambda,s_1< s_2$. If enough pressure is added, we expect the configuration to change such that $\mathcal {J}(s_1) > \mathcal {J}(s_2)$. Due to the intermediate value theorem, there must therefore be some ‘catastrophic’ $\beta$ at which $\mathcal {J}(s_1) = \mathcal {J}(s_2)$. At this $\beta$, we thus expect more particles to be lost, since particles can drift radially whilst conserving $\mathcal {J}$, even if $\partial _\alpha \mathcal {J}$ is small.

Despite increased $\partial _\alpha \mathcal {J}$ (see figure 7), we still see an improvement in fast-particle confinement when $\beta$ is sufficiently large (around 2 % and 3.5 % for the $n_{fp}=2$ and $n_{fp}=3$ configurations respectively), as shown in figure 6. This is again caused by the plasma being essentially diamagnetic, such that an increase in $p(s)$ on some flux surface $s$ generally results in decreased $B(s)$. Thus, increased $\beta$ tends to increase $|\partial _s B|$. This creates a ‘grad-B’ drift (with a non-zero poloidal component) given by

Thus, increased $\beta$ generally causes the particles’ poloidal precession to increase. This means that particles that may otherwise drift outwards radially and be lost, can – at high enough $\beta$ – poloidally precess quickly enough that their inwards drifts and outwards drifts can negate each other, and the particle can remain confined. For this reason, one can generally expect better fast-particle confinement at high $\beta$ (Lotz *et al.* Reference Lotz, Merkel, Nuhrenberg and Strumberger1992; Velasco *et al.* Reference Velasco, Calvo, Mulas, Sanchez, Parra and Cappa2021). Physically, changes in poloidal precession are also the cause of the aforementioned catastrophic $\beta$, but it is instructive to think of these two phenomena separately in this case.

To summarize, there are three competing phenomena at play:

Due to these three factors, the relationship between $\beta$ and fast-particle confinement is therefore non-trivial and worth investigating.

### 5.2 Properties of $\mathcal {J}$

For the above reasons, it is instructive to understand how $\tilde {\mathcal {J}}$ varies with respect to $s$, $\alpha$ and $\lambda$. To extract this information from a VMEC configuration, we begin by finding the VMEC coordinates corresponding to various field lines on a single flux surface using an algorithm used in the stella code Barnes, Parra & Landreman (Reference Barnes, Parra and Landreman2019). With these, we linearly interpolate the magnetic field strength along each field line, and use a numerical root-finding algorithm to find bounce points $\phi _1$ and $\phi _2$ at which $B=B_*$ for some $B_\textrm {min}< B_*< B_\textrm {max}$. For a single pair of bounce points along a single field line, we rewrite (1.3) in terms of parameters available in $\texttt {VMEC}$

Our algorithm to compute $\tilde {\mathcal {J}}$ takes care to consider which wells along neighbouring field lines are ‘connected’ to each other, which allows us to trace field lines over several toroidal turns. It also means that the possibility of transitioning particles is accounted for in our calculation of $\tilde {\mathcal {J}}$ along various field lines.

Although $\partial _\alpha \tilde {\mathcal {J}}$ is small for these configurations, it is not exactly zero, and thus a single $B_*$ on a flux surface will be associated with various values of $\tilde {J}$, each corresponding to a different field line. In figure 7, we plot $\tilde {\mathcal {J}}$ as a function of $s$, for various bounce field strengths $B_*\equiv 1/\lambda$.

From figure 7, one can deduce three important properties:

(i) The variation of $\tilde {\mathcal {J}}$ between various field lines on a single flux surface can be seen by the width of the line, which is bounded between $\max _\alpha (\tilde {\mathcal {J}}(s,B_*))$ and $\min _\alpha (\tilde {\mathcal {J}}(s,B_*))$. In a perfectly omnigenous field, these lines would therefore have zero thickness.

(ii) The degree to which a configuration is maximum-$\mathcal {J}$ can be seen by the slope of the lines. In a completely maximum-$\mathcal {J}$ configuration, all lines on these plots would have negative slopes.

(iii) Particularly large losses occur when $\partial _s\mathcal {J} = 0$. In this plot, the particles that are lost through this mechanism are those whose $B_* \equiv 1/\lambda$ lines have slopes around zero. The shaded grey (often elephant-shaped) regions show which values of $\tilde {\mathcal {J}}$ have particle losses, which, for these plots, coincide with regions where $\partial _s\mathcal {J} = 0$.

It is worth mentioning that the size of the grey region is not necessarily proportional to the number of fast particles lost. For instance, it may be the case that all particles are confined, except for a small number of those with $\tilde {\mathcal {J}}\simeq 0$, and a small number with $\tilde {\mathcal {J}}\simeq 1$. This would result in a grey region that fully covers the plot, despite only a small number of particles being lost.

Note in figure 7 that the $n_{fp}=1$ configuration degrades much more quickly when the pressure is increased to $\beta =1\,\%$ (as shown by the dramatically increased thickness of the lines of constant $B_*$), and it fails converge in VMEC for $\beta >1\,\%$. Its magnetic field is, apparently, sensitive to perturbations arising from the diamagnetic and Pfirsch–Schlüter (PS) currents arising from the pressure gradient. Configurations with a larger number of field periods are less sensitive to the PS current, the reason for which can be qualitatively understood from the fact that this current closes within one field period of any QI device (Helander Reference Helander2014). In a tokamak, the PS current runs in one direction on the outboard side of the torus and in the other direction on the inboard side. In contrast, it ‘turns around’ in each field period of a QI stellarator and never crosses the $B_\textrm {max}$ contour. If the number of periods is large, the current thus has to change direction frequently and therefore does not affect the magnetic geometry very much. In a configuration with only one field period, the PS current traverses one full toroidal turn around the torus before changing directions, and therefore has a larger effect on the magnetic geometry than if $n_{fp} > 1$. The other two configurations, therefore, are much more resilient to changes in pressure.

Another interesting phenomenon that can be seen from figures 6 and 7 is that there can exist in QI configurations a catastrophic value of $\beta$ at which $\partial _s\mathcal {J}(\lambda ) \rightarrow 0$ for a wide range of $\lambda$, resulting in significant fast-particle losses. From figures 6 and 7, we see that this happens at $\beta \simeq 1\,\%$ in the new $n_{fp}=2$ configuration, and at $\beta \simeq 2\,\%$ in the new $n_{fp}=3$ configuration. In a practical stellarator design, one may want to ensure this catastrophic $\beta$ has a small value, so that (i) the field transitions to maximum-$\mathcal {J}$ quickly, and (ii) that the increase in alpha particle losses happens well before thermonuclear ignition.

### 5.3 Neoclassical mono-energetic transport coefficients

In this section, we investigate the neoclassical mono-energetic coefficients for radial transport, $D_{11}^\star$, and bootstrap current, $D_{31}^\star$ (as defined in Beidler *et al.* Reference Beidler, Allmaier, Isaev, Kasilov, Kernbichler, Leitold, Maaßberg, Mikkelsen, Murakami and Schmidt2011). The authors of Helander & Nührenberg (Reference Helander and Nührenberg2009) and Helander *et al.* (Reference Helander, Geiger and Maaßberg2011) show that exactly QI configurations have a very small bootstrap current. The bootstrap coefficient for all three new QI configurations is shown in figure 8 at half-minor radius. As expected, it is very small compared with an equivalent tokamak with the same aspect ratio, elongation and $\iota$ for a wide range of collisionalities $\nu ^\star \equiv R_\textrm {maj}\nu /(\iota v)$. They are also smaller than the non-QI configurations in Beidler *et al.* (Reference Beidler, Allmaier, Isaev, Kasilov, Kernbichler, Leitold, Maaßberg, Mikkelsen, Murakami and Schmidt2011), and are comparable to the QIPC configuration (Mikhailov *et al.* Reference Mikhailov, Shafranov, Subbotin, Isaev, Nührenberg, Zille and Cooper2002). Here, $R_\textrm {maj}$ is the stellarator's effective major radius, and $\nu$ is the collision frequency of a particle with velocity $v$.

The radial transport coefficient $D_{11}^\star$ is shown in figure 9 for the one field-period QI configuration at half-minor radius. The $E=0$ curve shows how far the transport has been reduced in the $D_{11}^\star \propto 1/\nu ^\star$ regime, reflecting the low value of $\epsilon _\mathrm {eff}$. In addition, for relatively low normalized radial electric fields, $E/vB$, in comparison with existing devices, the transport is reduced even further. These regimes where transport is much affected by the radial electric field (approaching $D_{11}^\star \propto \sqrt {\nu }$ or $D_{11}^\star \propto \nu ^\star$ at low $\nu ^\star$) mainly affect the ion transport, because the thermal speed of ions is much lower than that of electrons, so that $E/vB$ is larger for ions. There is also very little sign of a plateau regime. The high-collisionality PS regime ($\nu ^\star \gtrsim 1$) can almost be extrapolated down to the low values of $D_{11}^\star$ appearing at $\nu ^\star \sim 10^{-3}$ where the $D_{11}^\star \propto 1/\nu ^\star$ regime begins. In combination, all these properties of $D_{11}^\star$ mean that one can have a situation for ions where (with the usual signs of the gradients, $\textrm {d}n_i/\textrm {d}r<0$ and $\textrm {d}T_i/\textrm {d}r<0$) thermo-diffusion drives the particles inwards, and the energy flux is more strongly driven by the logarithmic density gradient $\textrm {d}\ln n_i/\textrm {d}r$ than by the logarithmic temperature gradient $\textrm {d}\ln T_i/\textrm {d}r$.

### 5.4 Trapped electron modes and available energy

In configurations that are both perfectly QI and maximum-$\mathcal {J}$ (meaning $\partial _\psi \mathcal {J} < 0, \forall \, \psi$), it has been shown that density-gradient-driven collisionless TEMs are both linearly and nonlinearly stable. The linear stability criterion, derived from the gyrokinetic equations in Proll *et al.* (Reference Proll, Helander, Connor and Plunk2012) and Helander, Proll & Plunk (Reference Helander, Proll and Plunk2013), reduces to the condition that precessing trapped particles should not be in resonance with a drift wave.

Following the method of Proll (Reference Proll2014), this phenomenon can be understood by first using (1.4*a*,*b*), which tells us that the bounce-averaged presessional frequency of trapped particles $\textrm {d}\alpha /\textrm {d}t$ in an omnigenous field is

where $\tau _{b}$ is the particle's bounce time.

Drift waves are driven by density or temperature perturbations in the plasma and propagate with a frequency proportional to

where $n$ denotes the number density and $T$ the plasma temperature. Because plasma density tends to decrease as a function of $\psi$, it is usually the case that $\textrm {d}\ln (n) / \textrm {d}s < 0$. Hence, we find that

TEM-driven turbulence occurs when $\bar {\omega }_\textrm {bnc}$ and $\omega _\textrm {dia}$ are in resonance. This resonance is impossible if (5.8) is negative, which occurs in a maximum-$\mathcal {J}$ ($\partial _\psi \mathcal {J} < 0$) omnigenous field.Footnote ^{1} The configurations presented become completely maximum-$\mathcal {J}$ at high $\beta$, and we may expect trapped electron-driven turbulence to decrease as a function of $\beta$ (Connor, Hastie & Martin Reference Connor, Hastie and Martin1983).

The relationship between TEM turbulence and maximum-$\mathcal {J}$ configurations was expanded upon in Helander (Reference Helander2020), where the so-called ‘available energy’ (*Æ*) of trapped electrons was calculated. *Æ* measures the maximal amount of thermal energy that may be liberated from the plasma distribution function, subject to some constraints.

In order to compare the devices fairly, we opt to calculate the fraction of the total thermal energy of the electrons that is available. The expression for the *Æ* per unit volume in a flux tube may be extracted from Mackenbach, Proll & Helander (Reference Mackenbach, Proll and Helander2022), and we shall denote it as ${\unicode{x00E6} }(s)$. Importantly, this quantity is dependent on the geometry, and the profiles of both temperature and density at each flux surface $s$. If one multiplies $ {\unicode{x00E6} }(s)$ by $\mathrm {d} V / \mathrm {d} s$, we have constructed the *Æ* per $s$, the integral of which assigns a single scalar Æ to each device. Finally, upon normalizing by the total thermal energy of the device, we have constructed a dimensionless scalar which measures the fraction of total thermal energy which is available

Importantly, there is one free parameter in *Æ*, namely the length scale over which energy is available. Following arguments of (Mackenbach *et al.* Reference Mackenbach, Proll and Helander2022), we take this to be the standard gyroradius (although formally it may have additional dependencies), given by

where $B_0$ is a reference magnetic field strength.

To compare devices across various types of TEM turbulence, we choose to analyse three different profiles for the density and temperature. One corresponds to a purely density gradient driven TEM with a flat temperature profile, for which the aforementioned stability criterion holds. We furthermore construct another set of temperature and density profiles, for which we have

It is known that at $\eta >2/3$ the stability criterion for density gradient driven TEMs no longer holds, and maximum-$\mathcal {J}$ devices no longer enjoy (non)linear stability (at $\eta =2/3$ we thus operate at marginality). To go far beyond marginality, we investigate a pure temperature-gradient-driven TEM as well, in which the density profile is flat.

The result may be seen in figure 10, where highly non-trivial behaviour of the various devices may be seen as we increase $\beta$ and change the plasma profiles. All devices experience stabilizing effects (as measured by this quantity) as one increases the plasma $\beta$, which makes the device more maximum-$\mathcal {J}$. Interestingly, this trend downwards is not monotonic, as some devices experience a rise in ${{\unicode{x00C6} }}/E_{th}$ beyond some $\beta$. Upon a closer investigation, this can be attributed to the breaking of omnigeneity, which generally increases Æ. Hence there seems to exist a certain Æ-optimum for some devices, beyond which increasing the plasma-$\beta$ degrades the omnigenous properties to an extent that it outweighs the increase in the maximum-$\mathcal {J}$ property.

It is furthermore of interest to notice that the $n_{fp}=2$ device seems to enjoy the lowest ${{\unicode{x00C6} }}/E_{th}$ across the widest range of profiles and plasma-$\beta$s. This is partly due to the magnitude of the bounce-averaged precessional drift in this device, which we have found to be lower than in a comparable device such a $n_{fp}=3$. *Æ* is weighted by this magnitude, which partly explains this phenomenon. A second device which exhibits favourable behaviour across the chosen profiles and $\beta$ values is the precise-QA (quasi-axisymmetric) device, which has a so-called ‘vacuum magnetic well’, which is beneficial for MHD stability (Greene Reference Greene1998; Mercier Reference Mercier2011), and should not be confused with the wells described in § 2.1. Although this device's low *Æ* may be initially surprising, one can show that, in certain asymptotic regimes, *Æ* decreases with vertical elongation and furthermore decreases with increased magnetic well (Rodriguez & Mackenbach Reference Rodriguez and Mackenbach2023). Since this configuration exhibits large vertical elongation and a magnetic well, it enjoys this reduction in Æ.

We finally note that a closer comparison could take variations of the length scale over which energy is available into account, which may alter some found trends. Seeing the stark difference in performance at $\beta \sim 2\,\%$, one could reasonably expect the trends to be robust, as long as the variation of this length scale is of order unity.

### 5.5 Axis modifications

Because optimizations with $n_{fp}>1$ were extremely sensitive to initial conditions, it is useful to understand how these optimizations shaped the magnetic axis, such that we can learn to construct better QI configurations using the near-axis expansion (NAE). In this section, we showcase comparisons between the axis used for the NAE construction, the axis shapes in the optimized VMEC equilibria, and the axis of the optimization's starting point. For brevity, we compare these properties only for the $n_{fp}=2$ configuration presented, as its optimized properties (namely, its rotational transform) deviated most significantly from its initial condition.

The shape of a magnetic axis can be understood through either (i) the axes’ curvatures $\kappa$ and torsions $\tau$, or (ii) the $R$ and $Z$ coordinates of the axes at various toroidal angles. As in Camacho Mata *et al.* (Reference Camacho Mata, Plunk and Jorge2022), we define curvature and torsion as

where $\boldsymbol {t_a}$, $\boldsymbol {b_a}$, $\boldsymbol {n_a}$ and $l_a$ are the axis’ tangent vector, binormal vector, normal vector, and arclength, respectively. We also define the relative $R$ and $Z$ differences between two magnetic axes as

where (opt) and (init) correspond to the optimized configuration's axis and the initial condition's axis respectively. Plots showcasing these differences are shown in figure 11. The presence of zeros of curvature is an intrinsic requirement for QI configurations in the near-axis formalism (Plunk *et al.* Reference Plunk, Landreman and Helander2019). Although Fourier mode truncation results in this property being lacking in the configuration used as an initial point for the optimization, the resulting optimized configurations have a curvature function that is approximately zero at points of maxima and minima of $B$ ($\phi =0,{\rm \pi} /n_{fp}$). This indicates that the near-axis solutions, which impose this feature analytically, are optimal configurations in this region of the solution space. The curvature is made small over a wider interval, which should help with the reduction of the first-order correction to the magnetic field $B_1$, according to the near-axis theory.

Another striking feature on the axis shapes of the optimized configurations is the presence of regions where torsion increases sharply. This behaviour is expected where curvature approaches zero, but in these cases it is also present in regions of non-zero curvature. We suspect that this is because, at finite aspect ratio, the configurations constructed using the NAE do not exactly represent an MHD equilibrium, and thus are only approximately reconstructed when using an equilibrium solver. As a consequence, the axis shape used for the construction before being run through VMEC (labelled ‘NA’ in figure 11) and the one found by VMEC (labelled ‘full’ in figure 11) are not identical even when a large number of Fourier modes are used in VMEC. This is evident when comparing their curvature and torsion. Peaked torsion profiles, like the ones observed in figure 11, are thus commonly observed when using numerical equilibrium solvers. According to near-axis theory a sign change is expected in the Frenet frame at locations of zero curvature (Camacho Mata *et al.* Reference Camacho Mata, Plunk and Jorge2022). Numerical approximations of such curves achieve this flip by a rapid ‘rotation’ of the axis, explaining the large torsion where the curvature approaches zero. We also observe an increase in the optimized configuration's integrated torsion (it is more negative) compared with the initial condition, even when ignoring the regions of peaked torsion around $\mathcal {B}_{\mathrm {max}}$ and $\mathcal {B}_{\mathrm {min}}$. This explains the large increase in rotational transform seen in the optimized $n_{fp}=2$ configuration ($\iota \sim 0.6$) compared with the rotational transform of the near-axis construction ($\iota \sim 0.1$).

## 6 Discussion

We have demonstrated that, using our novel target function, excellent QI quality is achievable in stellarators with reasonable aspect ratios, mirror ratios and elongations. To our knowledge, the property of quasi-isodynamicity is satisfied to higher accuracy than in any previous publications. As a result, our configurations have extremely low neoclassical transport, fast particles simulated in them are well confined and their bootstrap currents are extremely small. When enough pressure is added, the two and three field-period configurations become maximum-$\mathcal {J}$ while maintaining excellent omnigenity, and should thus benefit from TEM stability and little turbulence associated with such modes. We have also found that this maximum-$\mathcal {J}$ transition can lead to significant fast-particle losses, and so it is beneficial to ensure it happens at a low plasma $\beta$, as was done in Sánchez *et al.* (Reference Sánchez, Velasco, Calvo and Mulas2022).

The three stellarators presented are visually very different from each other, and have different numbers of field periods, $n_{fp}$, indicating the robustness of this target function. When optimizing configurations with $n_{fp}=1$, we had very good results with every initial condition we tried, but the success of optimizations with $n_{fp}>1$ was quite sensitive to the chosen initial conditions. Future investigation into this phenomenon is warranted, since this trend is consistent with near-axis construction results, which seem to degrade with increased field periods (Camacho Mata *et al.* Reference Camacho Mata, Plunk and Jorge2022; Jorge *et al.* Reference Jorge, Plunk, Drevlak, Landreman, Lobsien, Camacho and Helander2022). These near-axis constructions of this nature seem to work as promising initial conditions for optimization for $n_{fp}>1$.

In this work, we directly penalized mirror ratio and elongation, as excessive values of these quantities are undesirable for fusion reactors. However, it is not clear what exactly the highest allowable value of these parameters should be, and if this is the best way to penalize them. High mirror ratios are undesirable because they imply an inefficient use of magnetic energy (much of it concentrated in small regions) and lead to large forces on the coils that generate the fields. A target function that directly accounts for these parameters, or simply penalizing $\varepsilon _\textrm {eff}$, may be preferable alternatives. High flux surface elongations are undesirable because they increase the plasma's area-to-volume ratio. Further, they also tend to increase flux surface compression which increases instability growth rates (Helander & Plunk Reference Helander and Plunk2021; Stroteich *et al.* Reference Stroteich, Xanthopoulos, Plunk and Schneider2022). High elongation also complicates coil optimization. More carefully chosen targets which do not rely on arbitrary threshold values may be better suited to future stellarator designs.

Our $n_{fp}=1$ configuration has almost zero shear. The other two have some positive shear, but still very little, which is detrimental to MHD stability. More troublingly, if left unchecked, all optimizations attempted would converge on rotational transform profiles centred on low-order rationals. Thankfully, these configurations seemed to be fairly robust to changes in $\iota$, and so these low-order rationals were fairly easy to avoid with additional target functions, even after optimizations had already converged reasonably well.

Applications and extensions of this approach to optimization are boundless. Including both MHD and gyrokinetic stability metrics in the target function will be necessary when designing a viable fusion reactor. Finding a set of coils with reasonable properties would also be a valuable project, as was done in Jorge *et al.* (Reference Jorge, Goodman, Landreman, Rodrigues and Wechsung2023). The optimization approach in this paper also seems to be sensitive to initial conditions: we were unable to find any good configurations $n_{fp}>1$, except for those initialized from a solution found using NAE. However, not all near-axis solutions were effective initial conditions. Hence, a more robust and reliable way to generate near-axis solutions as starting points for optimization would be a valuable asset.

## Supplementary material

Supplementary material is available at https://zenodo.org/record/7220257.

## Acknowledgements

We are grateful to C. Albert, P. Costello, M. Drevlak, A. Johansson, G. Pechstein, R. Nies and J. Proll for useful conversations. We additionally thank C. Nührenberg and D. Spong, who provided the other QI configurations shown. Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them.

*Editor Peter Catto thanks the referees for their advice in evaluating this article.*

## Funding

Support from the SIMSOPT team is also gratefully acknowledged. Optimizations and simulations were performed with the Cobra and Raven HPCs (Garching) respectively. A.G., M.L. and K.C.M are supported by a grant from the Simons Foundation (560651). R.J. is supported by the Portuguese FCT – Fundação para a Ciência e Tecnologia, under grant 2021.02213.CEECIND and by a grant by Alexander-von-Humboldt-Stiftung, Bonn, Germany, through a research fellowship. This publication is part of the project ‘Shaping turbulence – building a framework for turbulence optimisation of fusion reactors’, with Project No. OCENW.KLEIN.013 of the research program ‘NWO Open Competition Domain Science’ which is financed by the Dutch Research Council (NWO). This work has been carried out with in the framework of the EUROfusion Consortium, funded by the European Union via the Euratom Research and Training Programme (Grant Agreement No. 101052200 – EUROfusion).

## Declaration of interests

The authors report no conflict of interest.