1. Introduction
In Paper II we convert the analytic formulation of the variational principle derived in Paper I (Freidberg, Cerfon & Lee Reference Freidberg, Cerfon and Lee2015) into a form suitable for numerical analysis. A code has been written based on this analysis that allows us to quickly and accurately calculate the dependence of elongation ${\it\kappa}$ and triangularity ${\it\delta}$ on inverse aspect ratio ${\it\varepsilon}$ for various values of poloidal beta ${\it\beta}_{p}$ , wall radius $b/a$ and feedback parameter ${\it\gamma}{\it\tau}_{w}$ . These scaling dependences provide useful information for the optimization of plasma shape against axisymmetric $n=0$ MHD instabilities, which are the cause of vertical disruptions.
For perspective, it is worth noting that there have been many numerical investigations of $n=0$ MHD stability for a plasma surrounded by a perfectly conducting wall (Laval & Pellat Reference Laval and Pellat1973; Wesson & Sykes Reference Wesson and Sykes1975; Becker & Lackner Reference Becker and Lackner1977) or with a resistive wall (Wesson Reference Wesson1975, Reference Wesson1978; Lazarus et al. Reference Lazarus, Chu, Ferron, Helton, Hogan, Kellman, Lao, Lister, Osborne and Snider1991). In these studies, the growth rate of the mode is obtained either by directly solving the equations of motion or by minimizing ${\it\delta}W$ . These studies have provided valuable insight into the vertical stability of a tokamak, including design guidelines for optimizing performance. However, they have not focused on including the effect of feedback on the scaling of maximum elongation with aspect ratio, which is the main goal of the present paper.
In comparison to previous studies, our results are obtained using a somewhat more realistic model of the wall geometry. On the other hand, our results are somewhat more restrictive in that we use only the wellknown Solov’ev profile for the equilibrium (Solov’ev Reference Solov’ev1968). The Solov’ev profile provides accurate scaling with respect to plasma pressure and shape, but is limited in its ability to take into account the effect of current profile on stability; that is the internal inductance per unit length is always of the order of $l_{i}\sim 0.4$ for all of our results. Still, the general scaling relations are accurate (see for instance Bernard et al. Reference Bernard, Berger, Gruber and Troyon1978) and, importantly, the profile leads to significant savings in computer time. The savings result from the fact that the Green’s theorem for the solution of the vacuum region can also be utilized in the plasma region, thereby reducing the 2D stability problem into a 1D problem.
An outline of the analysis is as follows. The numerical formulation of the variational principle is based on a combination of Fourier analysis and the application of Green’s theorem. The analysis is carried out in terms of the perturbed magnetic flux. A substantial simplification occurs for the Solov’ev profiles because the perturbed poloidal magnetic field in the plasma turns out to be a vacuum field; that is, the perturbed toroidal current is zero. In this case, the standard volume integral for the plasma energy ${\it\delta}W_{F}$ can be converted to a simple surface integral, thus transforming the 2D problem into a 1D problem. This is not true for more general profiles.
The basic strategy is to introduce Fourier expansions for the flux and its normal derivative on two surfaces, the plasma and wall. The corresponding Fourier amplitudes are the unknowns in the problem. Furthermore, the normal derivative amplitudes are related to the flux amplitudes through the solution of the vacuum flux equation, (i.e. ${\rm\Delta}^{\ast }{\it\psi}=0$ ), a step that is conveniently carried out using Green’s theorem.
The end result is a classic minimizing principle that consists of the ratio of quadratic terms in the Fourier amplitudes subject to a series of linear constraints arising from the application of Green’s theorem. Also, the matrix elements contain the resistive wall feedback parameter ${\it\gamma}{\it\tau}_{w}$ , which appears in a simple linear form. The calculation thus reduces to a standard linear algebra problem in which, after some analysis, all the matrices are shown to be real and symmetric.
A summary of our results with respect to the effect of feedback on vertical stability is as follows. For values of ${\it\gamma}{\it\tau}_{w}$ similar to present day high performance tokamaks, we find that the addition of feedback substantially increases the achievable elongation, typically from about 1.17 to 2.06 at ${\it\varepsilon}\approx 0.3$ . Equally important, we show that the achievable value of ${\it\kappa}$ decreases as ${\it\varepsilon}$ gets smaller for any value of ${\it\gamma}{\it\tau}_{w}$ . In addition, we find that at each value of maximum elongation ( ${\it\kappa}_{max}$ ), there is a corresponding value of optimized triangularity ( ${\it\delta}$ ) whose scaling is also determined as a function of ${\it\varepsilon}$ . Theoretical predictions of ${\it\kappa}_{max}$ are slightly higher than experimental observations for high performance discharges, as measured by high average pressure. Theoretical ${\it\delta}$ values are noticeably lower. The explanation is likely associated with the fact that high performance involves not only $n=0$ MHD stability, but also $n\geqslant 1$ MHD modes described by ${\it\beta}_{N}$ in the Troyon limit, and transport as characterized by ${\it\tau}_{E}$ . Operation away from the $n=0$ MHD optimum may still lead to higher performance if there are more than compensatory gains in ${\it\beta}_{N}$ and ${\it\tau}_{E}$ . Unfortunately, while the empirical scaling of ${\it\beta}_{N}$ and ${\it\tau}_{E}$ with the elongation ( ${\it\kappa}$ ) has been determined, the dependence on ${\it\delta}$ has still not been quantified. This information is needed in order to perform more accurate overall optimizations in future experimental designs.
The presentation of the analysis and results begins with § 2, where we convert the Lagrangian integral derived in Paper I into a set of surface integrals by making use of the Solov’ev profile. In § 3, the surface integrals are simplified by expressing them in the form of a symmetric matrix $\unicode[STIX]{x1D652}$ and a vector variable of poloidal Fourier mode amplitudes of the perturbed fluxes and their normal derivatives. In § 4, the constraints between the perturbed fluxes and their normal derivatives are obtained by utilizing the wellknown Green’s function for a vacuum region. In § 5, we describe how numerical solutions are obtained by iterating the plasma parameters in order to make the minimum eigenvalue of $\unicode[STIX]{x1D652}$ in the subspace of the constraints equal to zero. The eigenvalues are efficiently calculated using a QR decomposition. In § 6, the parameter space of the numerical calculations is chosen by introducing (i) a reasonably realistic wall geometry model, and (ii) a reference case of numerical input parameters determined by examining high performance experimental discharges from several tokamaks. Finally, the numerical results and discussion are given in §§ 7 and 8, respectively.
2. The starting point
The starting point for the analysis is the Lagrangian integral for the variational principle repeated here for convenience,
The first goal in our analysis is to convert all volume integrals into surface integrals. This task is accomplished by noting that, for $n=0$ modes, the perturbed poloidal fields can be expressed in terms of the perturbed flux in the standard manner. Thus, for each region of interest (i.e. plasma, inner vacuum and outer vacuum regions) it follows that
with ${\it\psi}$ satisfying
Clearly, $p^{\prime \prime }=F^{2^{\prime \prime }}=0$ for the vacuum regions.
Next use the identity
The divergence theorem now allows us to convert all volume integrals into surface integrals making use of the differential surface element relation
where $\text{d}l$ is the differential poloidal arc length,
Here $\text{d}l$ and $\text{d}\hat{l}$ are the differential arc lengths along the plasma and wall surfaces, respectively. Note that the required continuity of the perturbed fluxes across both the plasma and wall interfaces
has been used to simplify (2.6). We point out that (2.6) is valid for arbitrary profiles. The simplification associated with Solov’ev profiles occurs later in the analysis.
3. Fourier analysis
The task now is to evaluate $L$ by substituting Fourier series with unknown coefficients for each of the dependent variables. Ultimately, the desired relation between elongation and aspect ratio is obtained by standard variational techniques; that is, we set ${\it\delta}L=0$ by varying the Fourier coefficients while simultaneously satisfying the constraint $L=0$ by iterating ${\it\kappa}$ and ${\it\delta}$ . Remember that $L=0$ because the modes of interest are slow enough that we can neglect the inertial effects.
The task of setting ${\it\delta}L=0$ separates into two parts. In the first part, Fourier series are introduced for both the fluxes and their normal derivatives. In the second part, constraint relations between the coefficients in the fluxes and their normal derivatives are obtained by means of Green’s theorem. In this section we focus on the first part of the calculation.
The analysis begins by introducing a simple scaling transformation of actual poloidal arc length into an arc length angle. Specifically we write
Here $L_{P},L_{W}$ are the circumferences of the plasma and wall surfaces, respectively. This transformation is convenient because $0\leqslant {\it\chi}\leqslant 2{\rm\pi}$ and $0\leqslant \hat{{\it\chi}}\leqslant 2{\rm\pi}$ , making it easy to impose poloidal periodicity. The angles ${\it\chi},\hat{{\it\chi}}$ are easily determined numerically once the surface coordinates have been specified.
Next, we introduce Fourier series for each of the basic unknowns. For vertical instabilities where $\boldsymbol{n}\boldsymbol{\cdot }{\bf\xi}$ has even $Z$ symmetry, it follows that the fluxes should be expanded in sine series,
where $R_{0}$ is the geometric centre of the device, as already introduced in Paper I and illustrated in figure 1. As shown shortly, the factors in front of the summations simplify the algebra. Each of the unknown normal derivatives is also expanded in a Fourier sine series,
With the required expansions now in hand, we can combine (2.6), (3.2), and (3.3) to obtain an expression for the normalized Lagrangian integral $\bar{L}=({\it\mu}_{0}R_{0}/{\rm\pi}^{2})L$ in terms of the Fourier amplitudes. A short calculation yields
where ${\bf\psi}$ etc. are the vectors of the Fourier amplitudes and the elements of the matrix $\unicode[STIX]{x1D645}$ can be written as
Note the different fonts used for matrices. Also, the precise definition of the wall diffusion time is given by
Equation (3.4) can be rewritten in the following compact form
Here, $\boldsymbol{x}^{\text{T}}=[{\bf\psi},\hat{{\bf\psi}},\boldsymbol{u},\hat{\boldsymbol{u}},\hat{\boldsymbol{v}},\hat{\hat{\boldsymbol{v}}}]$ and $\unicode[STIX]{x1D652}$ is the symmetric matrix
Each of the elements in $\unicode[STIX]{x1D652}$ is an $M\times M$ matrix with $M$ the number of Fourier amplitudes maintained in the expansions. The total dimensions of $\unicode[STIX]{x1D652}$ are thus $6M\times 6M$ .
4. Application of Green’s theorem
The normal derivatives of the fluxes are related to the fluxes themselves through the solution to the vacuum equation ${\rm\Delta}^{\ast }{\it\psi}=0$ . (The condition ${\rm\Delta}^{\ast }{\it\psi}=0$ is also true in the plasma region for Solov’ev profiles, and this leads to a much simpler numerical formulation plus savings in computer time.) Since the relationships are needed only on the plasma and wall surfaces, a convenient approach is to utilize Green’s theorem with the observation point located on either of the surfaces. The procedure is demonstrated below, starting with the plasma region. The end results are four linear constraint relations between the various Fourier amplitudes.
The plasma region
In the plasma region, the 2D Green’s theorem with the observation point on the plasma surface (i.e. the integration surface) can be obtained from the basic identity
For vacuum fields the flux and 2D Green’s function satisfy
In these expressions, unprimed and primed coordinates refer to the observation and integration points, respectively
The 2D Green’s function is closely related to the flux function for a circular loop of wire. Specifically, the vector potential due to a wire loop, satisfies
The solution is
Here $K(k)=\int _{0}^{{\rm\pi}/2}(\text{d}{\it\theta}/\sqrt{1k^{2}\sin ^{2}{\it\theta}})$ , $E(k)=\int _{0}^{{\rm\pi}/2}\sqrt{1k^{2}\sin ^{2}{\it\theta}}\,\text{d}{\it\theta}$ are complete elliptic integrals. Thus, if we set ${\it\mu}_{0}I=1$ , we see that $RA_{{\it\phi}}=G$ ,
Also needed in the analysis is the normal derivative (in integration coordinates) of the Green’s function evaluated on the plasma surface. A short calculation yields
Note that ${\dot{Z}}^{\prime },{\dot{R}}^{\prime }$ denote $\text{d}Z({\it\chi}^{\prime })/\text{d}{\it\chi}^{\prime }$ and $\text{d}R({\it\chi}^{\prime })/\text{d}{\it\chi}^{\prime }$ indicating that we have switched integration variables from $l^{\prime }$ to ${\it\chi}^{\prime }$ .
The next step is to apply Stokes theorem to (4.1) with the observation point on the plasma surface
In this expression we need to be careful about the signs. The main point is that Stoke’s theorem requires $\text{d}\boldsymbol{l}^{\prime }$ to rotate in a right handed sense. Now, in the usual $R,{\it\phi},Z$ coordinate system as shown in figure 1, this implies that $\text{d}\boldsymbol{l}^{\prime }$ rotates in the clockwise direction. However, it is convenient and familiar to have ${\it\chi}^{\prime }$ rotate in the counterclockwise direction. Thus, if we define a unit tangential vector $\boldsymbol{t}^{\prime }$ pointing in the counterclockwise direction it then follows that
Here, $\boldsymbol{n}^{\prime }$ is the outward pointing unit normal vector. With this sign convention (4.7) reduces to
A similar expression holds for the wall surface.
The calculation continues by substituting the Fourier expansions into (4.9) and then carrying out a Fourier analysis. A straightforward calculation leads to
where the matrix elements $\unicode[STIX]{x1D608}_{mn}^{11}$ and $\unicode[STIX]{x1D609}_{mn}^{11}$ of $\unicode[STIX]{x1D63C}^{11}$ and $\unicode[STIX]{x1D63D}^{11}$ are given by
For the matrix format the first superscript on $\unicode[STIX]{x1D63C}^{11}$ denotes the observation point while the second denotes integration point. The index 1 refers to the plasma surface, and the index 2 refers to the wall. This holds for all matrices that follow.
The outer vacuum region
The analysis of the outer vacuum region is very similar to that of the plasma. One simply has to switch surfaces and take into account the opposite sign of the outward surface normal. The basic equation for the outer vacuum region, assuming regularity at infinity, is given by
On this surface, the Green’s function and its normal derivative are given by
The expressions are the same as for the plasma region except that $L_{P}\rightarrow L_{W}$ in the second equation. Fourier analysis then leads to the following relation between Fourier coefficients
The inner vacuum region
The inner vacuum region is slightly more complicated to analyse because of the coupling of surface vectors between the plasma and wall surfaces. In this region, Green’s theorem must be used twice, once with the observation point on the plasma surface and once on the wall surface. The two basic equations are given by
Observation point on the plasma:
Observation point on the wall:
After carrying out the Fourier analysis, we arrive at two coupled equations for the Fourier amplitudes
or in matrix form
The newly introduced matrix elements are defined by
Note that because of the symmetry $G(R,Z,R^{\prime },Z^{\prime })=G(R^{\prime },Z^{\prime },R,Z)$ it follows that $\unicode[STIX]{x1D609}_{mn}^{12}=\unicode[STIX]{x1D609}_{nm}^{21}$ implying that $\unicode[STIX]{x1D63D}^{21}=(\unicode[STIX]{x1D63D}^{12})^{\text{T}}$ .
The four constraint relations given by (4.10), (4.14), and (4.18) can now be written in a compact form as
where $\unicode[STIX]{x1D63E}^{\text{T}}$ is a $4M\times 6M$ matrix given by
5. The numerical solution
The numerical solution to the problem under consideration requires finding stationary solutions to $\boldsymbol{x}^{\text{T}}\boldsymbol{\cdot }\unicode[STIX]{x1D652}\boldsymbol{\cdot }\boldsymbol{x}=0$ subject to the constraints $\unicode[STIX]{x1D63E}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{x}=\mathbf{0}$ . A convenient way to proceed mathematically is to recast the Lagrangian formulation in terms of a minimizing principle by introducing a normalization constraint $\boldsymbol{x}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{x}=1$ . Standard linear algebra analysis shows that the Lagrangian formulation is equivalent to (see for instance Trefethen & Bau (Reference Trefethen and Bau1997))
The solution procedure requires a determination of the eigenvalues ${\it\lambda}_{j}$ of $\unicode[STIX]{x1D652}$ subject to the constraints $\unicode[STIX]{x1D63E}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{x}=\mathbf{0}$ . The selfconsistency requirement $\boldsymbol{x}^{\text{T}}\boldsymbol{\cdot }\unicode[STIX]{x1D652}\boldsymbol{\cdot }\boldsymbol{x}=0$ corresponds to finding (by iteration) a set of plasma parameters ${\it\varepsilon},{\it\kappa},{\it\delta},{\it\beta}_{p},b/a,{\it\gamma}{\it\tau}_{w}$ such that the minimum (i.e. most negative) eigenvalue just happens to satisfy ${\it\lambda}_{min}=0$ .
Practically speaking, once we are able to solve the eigenvalue problem subject to the constraints, we can then fix ${\it\beta}_{p},b/a,{\it\gamma}{\it\tau}_{w}$ , choose a value for ${\it\varepsilon}$ and then iterate to find the largest value of ${\it\kappa}$ and corresponding ${\it\delta}$ for which ${\it\lambda}_{min}=0$ . In this way the desired curve of ${\it\kappa}={\it\kappa}({\it\varepsilon})$ can be generated.
Finding the eigenvalues of a symmetric matrix $\unicode[STIX]{x1D652}$ subject to a set of linear constraints $\unicode[STIX]{x1D63E}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{x}=\mathbf{0}$ is a wellknown problem in linear algebra. A good way to accomplish this task is by means of a QR orthogonal decomposition (Trefethen & Bau Reference Trefethen and Bau1997) of the constraint matrix $\unicode[STIX]{x1D63E}$ and the introduction of a new set of orthonormal basis vectors $\boldsymbol{z}$ in place of $\boldsymbol{x}$ (Golub & Underwood Reference Golub and Underwood1970). The details of the procedure are given in appendix A. A summary of the required steps, in the proper sequence, is as follows:

(i) Compute (for example using MATLAB (2014)) the QR decomposition of $\unicode[STIX]{x1D63E}$ ,
(5.2) $$\begin{eqnarray}\unicode[STIX]{x1D63E}=\unicode[STIX]{x1D64C}^{\text{T}}\boldsymbol{\cdot }\left\begin{array}{@{}c@{}}\unicode[STIX]{x1D64D}\\ \cdots \\ \mathbf{0}\end{array}\right.\end{eqnarray}$$Here, the properties and dimensions of the matrices are as follows: $\unicode[STIX]{x1D64D}$ is a $4M\times 4M$ invertible upper triangular matrix, $\mathbf{0}$ is a $2M\times 4M$ null space matrix and $\unicode[STIX]{x1D64C}$ is a $6M\times 6M$ orthonormal matrix satisfying $\unicode[STIX]{x1D64C}^{\text{T}}\boldsymbol{\cdot }\unicode[STIX]{x1D64C}=\unicode[STIX]{x1D644}$ . The symbol $\cdots$ appearing here and in appendix A is used to indicate the separation between block matrices. Hereafter, we assume that $\unicode[STIX]{x1D64C}$ and $\unicode[STIX]{x1D64D}$ are known matrices. 
(ii) Introduce a new set of orthonormal basis vectors $\boldsymbol{z}$ in place of $\boldsymbol{x}$ ,
(5.3) $$\begin{eqnarray}\boldsymbol{x}=\unicode[STIX]{x1D64C}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{z}=\unicode[STIX]{x1D64C}^{\text{T}}\boldsymbol{\cdot }\left\begin{array}{@{}c@{}}\boldsymbol{z}_{4}\\ \boldsymbol{ z}_{2}\end{array}\right,\end{eqnarray}$$where $\boldsymbol{z}_{4}$ contains the first $4M$ elements of $\boldsymbol{z}$ while $\boldsymbol{z}_{2}$ contains the remaining $2M$ elements. Both $\boldsymbol{x}$ and $\boldsymbol{z}$ contain a total of $6M$ elements. The analysis in appendix A shows that the constraint relation forces $\boldsymbol{z}_{4}=\mathbf{0}$ . 
(iii) Compute the matrix
(5.4) $$\begin{eqnarray}\unicode[STIX]{x1D64C}\boldsymbol{\cdot }\unicode[STIX]{x1D652}\boldsymbol{\cdot }\unicode[STIX]{x1D64C}^{\text{T}}=\left\begin{array}{@{}cc@{}}\unicode[STIX]{x1D652}_{11} & \unicode[STIX]{x1D652}_{12}\\ \unicode[STIX]{x1D652}_{12}^{\text{T}} & \unicode[STIX]{x1D652}_{22}\end{array}\right.\end{eqnarray}$$Here, $\unicode[STIX]{x1D652}_{11}$ is $4M\times 4M$ , $\unicode[STIX]{x1D652}_{22}$ is $2M\times 2M$ and $\unicode[STIX]{x1D652}_{12}$ is $4M\times 2M$ . Actually only $\unicode[STIX]{x1D652}_{22}$ is needed. 
(iv) The desired eigenvalues are obtained from the simplified matrix problem
(5.5) $$\begin{eqnarray}{\it\lambda}=\frac{\boldsymbol{z}_{2}^{\text{T}}\boldsymbol{\cdot }\unicode[STIX]{x1D652}_{22}\boldsymbol{\cdot }\boldsymbol{z}_{2}}{\boldsymbol{z}_{2}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{z}_{2}}.\end{eqnarray}$$The resulting eigenvalue problem automatically takes into account the constraints. Also $\unicode[STIX]{x1D652}_{22}$ is symmetric. Its dimensions, $2M\times 2M$ , are much smaller than the original $\unicode[STIX]{x1D652}$ whose size is $6M\times 6M$ . Finding the eigenvalues of $\unicode[STIX]{x1D652}_{22}$ is a standard numerical problem. In this work, we simply accomplish this task by calling the function ‘eig’ in MATLAB.
The numerical problem has now been fully formulated.
6. The numerical inputs
The procedure just described has been implemented in a numerical code that is quick, efficient and accurate. The parameter space of interest is large, consisting of six physically relevant dimensionless quantities: ${\it\varepsilon}$ , ${\it\kappa}$ , ${\it\delta}$ , ${\it\beta}_{p}$ , $b/a$ and ${\it\gamma}{\it\tau}_{w}$ . The strategy for presenting the results in a compact and understandable form is as follows. First, as a preparatory step we discuss the precise definition of the normalized wall radius parameter $b/a$ . Our definition is somewhat different from the usual conformal wall parameter $b/a$ . It is more physically realistic in that it holds the normalized gap between the inner midplane wall and the plasma, denoted by ${\it\Delta}_{i}/a$ , fixed as the wall area gets larger. Second, after reviewing some experimental data from different tokamaks we define reference values for ${\it\beta}_{p}$ , $b/a$ and ${\it\gamma}{\it\tau}_{w}$ . Once the reference case is established, we compute curves of maximum ${\it\kappa}$ and corresponding optimized ${\it\delta}$ as a function of ${\it\varepsilon}$ , separately varying ${\it\beta}_{p}$ , $b/a$ , and ${\it\gamma}{\it\tau}_{w}$ .
Definition of the normalized wall radius $b/a$
As explained in Paper I, the plasma surface is determined by inverting the implicit equation ${\it\Psi}=0$ given by the Solov’ev equilibrium. On this surface, the outer equatorial point has coordinates $(R,Z)=(R_{0}+a,0)$ , the top point has coordinates $(R,Z)=(R_{0}{\it\delta}a,{\it\kappa}a)$ and the inner equatorial point has coordinates $(R,Z)=(R_{0}a,0)$ . Now, our wall model has a shape similar to the plasma. It is characterized by three free input parameters: the normalized inner midplane gap ${\it\Delta}_{i}/a$ , the normalized outer midplane gap ${\it\Delta}_{o}/a$ and the normalized vertical gap ${\it\Delta}_{v}/a$ . These in turn are easily related to the more familiar normalized wall radius $b/a$ and wall elongation ${\it\kappa}_{w}$ . The geometry is illustrated in figure 1. In the numerical studies, two of the three gap parameters, ${\it\Delta}_{i}/a$ and ${\it\Delta}_{v}/a$ , are held fixed. Changing the wall radius corresponds to varying only the outer gap; that is, the single parameter ${\it\Delta}_{o}/a$ or equivalently $b/a$ . This choice of variation is motivated by experimental observations (McCracken et al. Reference McCracken, Lipschultz, LaBombard, Goetz, Granetz, Jablonski, Lisgo, Ohkawa, Stangeby and Terry1997), which show that the impurity influx in divertor tokamaks from the outboard midplane area is substantially greater than from the inboard side. Consequently, in order to achieve better impurity isolation in future experiments, it may be necessary to increase the outboard midplane gap ${\it\Delta}_{o}/a$ .
The specific shape of our wall is denoted by the coordinates $\hat{R}$ , $\hat{Z}$ and is given by
Note that the average horizontal wall radius $b/a$ and wall elongation ${\it\kappa}_{w}$ are related to the gap widths and plasma elongation by
The parameters $\hat{R}_{0}$ and $\hat{{\it\delta}}_{0}$ can also be expressed in terms of the gap widths by utilizing the assumption that the maximum heights of both the wall and plasma occur at the same $R$ .
Also, for the numerics, it is convenient to normalize and parameterize the wall coordinates as follows: $\hat{R}=R_{0}\hat{X}$ , $\hat{Z}=R_{0}{\hat{Y}}$ with
Once the gap widths and plasma geometry are specified, the wall coordinates given by (6.4) are completely determined. From this, it is then a straightforward task to calculate the angular arc length coordinate $\hat{{\it\chi}}$ on the wall surface.
The reference case
The next step is to define a reference case. The goal is to determine a typical set of values for the parameters of interest: ${\it\beta}_{p}$ , $b/a$ and ${\it\gamma}{\it\tau}_{w}$ . To accomplish this task, we examine the data for several major large tokamak experiments as shown in table 1: ASDEX Upgrade (AUG) (Ryter et al. Reference Ryter, Alexander, Gruber, Vollmer, Becker, Gehre, Gentle, Lackner, Leuterer and Peeters1996), Alcator CMod (CMod) (Greenwald et al. Reference Greenwald, Boivin, Bombarda, Bonoli, Fiore, Garnier, Goetz, Golovato, Graf and Granetz1997), DIIID (Petty et al. Reference Petty, Luce, Burrell, Chiu, de Grassie, Forest, Gohil, Greenfield, Groebner and Harvey1995), JET (Balet, Campbell & Christiansen Reference Balet, Campbell and Christiansen1995), NSTX (Sabbagh et al. Reference Sabbagh, Kaye, Menard, Paolettia, Bellb, Bellb, Bialeka, Bitterb, Fredricksonb and Gatesb2001), and ITER (Aymar et al. Reference Aymar, Barabaschi and Shimomura2002).
Each set of data corresponds to a high performance (i.e. high pressure) discharge. Observe first that a reasonable value of poloidal beta for the reference case can be chosen as
Next, by combining the experimental data with machine drawings, we conclude that the measured inner and outer gap widths are approximately equal (i.e. ${\it\Delta}_{o}={\it\Delta}_{i}$ ) and are set to the values listed in the tables. Also listed is the corresponding value of $b/a$ as calculated from (6.2). From the table we then assume that the reference values for the gaps and wall radius are given by
The reference wall elongation is an additional, but not independent, geometric parameter which enters the analysis but is more difficult to estimate. The walls have different shapes and the spacing between plasma and wall is different on the top and bottom because of the divertor. Even for a single experiment, it is not clear how to relate the wall shapes in the drawings to the simplified up–down symmetric wall parameter ${\it\kappa}_{w}$ .
To circumvent this difficulty, we assume that for each experiment the vertical gap is three times the measured inner horizontal gap to allow for a larger vertical space to accommodate the divertor: ${\it\Delta}_{v}/a=3{\it\Delta}_{i}/a$ . Thus, for the table, the reference case and all future numerical studies, the wall elongation is given by
Here ${\it\kappa}$ is arbitrary, ${\it\Delta}_{i}/a$ is specified either experimentally or at its reference value and $b/a$ is obtained from (6.2).
Using the data in table 1 we have carried out numerical calculations to determine the value of ${\it\gamma}{\it\tau}_{w}$ that leads to a numerical eigenvalue ${\it\lambda}_{\mathit{min}}=0$ for each experiment. By construction, this defines the elongation at high performance that the feedback system, characterized by ${\it\gamma}{\it\tau}_{w}$ , can safely stabilize. By comparing the ${\it\gamma}{\it\tau}_{w}$ data from the different experiments, but omitting DIIID, we deduce that a typical value of the feedback parameter is
Interestingly, the value of ${\it\gamma}{\it\tau}_{w}$ for DIIID is substantially higher than for the other experiments and the question is ‘Why?’ We suggest that a much stronger feedback system (i.e. a much larger ${\it\gamma}{\it\tau}_{w}$ ) is needed to achieve the high triangularity ${\it\delta}=0.85$ for the listed shot. Furthermore, this stronger feedback is possible in DIIID since the feedback coils are located inside the TF coils, much closer to the plasma. In future fusion grade experiments, this will probably not be possible because of neutron radiation. This is the reason why a low weight is given to DIIID when estimating a ‘reference’ value for ${\it\gamma}{\it\tau}_{w}$ . The DIIID data is discussed in more detail shortly.
Lastly, we note that we could also compute a reference internal inductance $l_{i}$ from the data, with a value of the order of 0.4, as can be seen in table 1. This value is below the values usually obtained in experiments, and cannot be varied much in our model because of the fixed Solov’ev profiles. For the modes under consideration, we do not expect this limitation to have qualitative consequences on the scaling relations we calculate (Bernard et al. Reference Bernard, Berger, Gruber and Troyon1978), but it has quantitative implications, as we discuss shortly.
Having defined the reference case, we now proceed with a series of numerical calculations to shed insight onto the scaling of maximum achievable ${\it\kappa}$ versus ${\it\varepsilon}$ as a function of the experimental parameters, including the feedback system.
7. The numerical results
The reference case
To establish a baseline we calculate curves of maximum ${\it\kappa}={\it\kappa}({\it\varepsilon})$ and the corresponding ${\it\delta}={\it\delta}({\it\varepsilon})$ for the reference case. To do this, we set ${\it\beta}_{p}=1$ , ${\it\Delta}_{i}/a=0.1$ , ${\it\Delta}_{v}/a=0.3$ , ${\it\Delta}_{o}/a=0.1$ (corresponding to $b/a=1.1$ ) and ${\it\gamma}{\it\tau}_{w}=1.5$ . The value of ${\it\kappa}_{w}$ is set in accordance with (6.7). The desired scaling curves are obtained by choosing a value for ${\it\varepsilon}$ and then iterating on ${\it\kappa}$ and ${\it\delta}$ such that the eigenvalue ${\it\lambda}_{\mathit{min}}=0$ for each pair of values. The result can then be plotted as a curve of ${\it\kappa}$ versus ${\it\delta}$ for the given ${\it\varepsilon}$ , as shown in figure 2. We see that there is an optimized value of ${\it\delta}$ for which ${\it\kappa}$ is a maximum. Even so, the expanded scale indicates that the maximum is relatively flat in the vicinity of the optimum.
The procedure is repeated for a range of ${\it\varepsilon}$ , thereby generating a curve of maximum ${\it\kappa}={\it\kappa}({\it\varepsilon})$ and corresponding ${\it\delta}={\it\delta}({\it\varepsilon})$ which is illustrated in figure 3. Observe that, in agreement with previous work (Wesson Reference Wesson1978) and experimental data, the maximum achievable elongation increases as the aspect ratio becomes tighter. As $a/R_{0}$ increases from 0.1 to 0.8, the maximum ${\it\kappa}$ increases from 1.89 to 2.88. The optimum triangularity also increases as the aspect ratio gets tighter, but in a stronger way. Over the same range of $a/R_{0}$ , the triangularity increases from 0.05 to 0.65. At $a/R_{0}=0.3$ , the maximum elongation and optimum triangularity have the values ${\it\kappa}=2.06$ and ${\it\delta}=0.18$ .
The values of ${\it\kappa}$ in figure 3 are in general slightly higher than those listed in table 1. The experimental values of ${\it\delta}$ are substantially higher. This may be explained by the fact that the values in table 1 correspond to high performance as measured by high average pressure. However, high performance is not determined solely by $n=0$ MHD considerations. Kink stability and transport play a comparably important role, and experimental data indicates that the highest experimental pressure may be achieved by operating at a larger value of ${\it\delta}$ than the $n=0$ MHD optimum because of more than compensatory gains in ${\it\beta}_{N}$ and ${\it\tau}_{E}$ (Lomas & JET Team Reference Lomas2000). We will return to this point in § 8 of the article.
A second important effect is associated with the fact that any given experiment has a fixed wall shape. Thus, the typical way to increase elongation is by shrinking the minor radius of the plasma. The effective increase in wall radius leads to a higher resistive wall growth rate requiring more feedback and the smaller plasma volume leads to reduced performance because of smaller ${\it\tau}_{E}$ . Both lead to a reduced ${\it\kappa}$ .
A final contributing factor is associated with the fact that the equilibrium Solov’ev current profile used in our analysis is somewhat broader than typical experimental profiles. Specifically, whereas the Solov’ev internal inductance is always approximately $l_{i}\approx 0.4$ , the more peaked experimental profiles have internal inductances that typically lie in the range $l_{i}\sim 0.5{}1.0$ . This implies that the Solov’ev profile has a higher current density close to the wall than the experimental profiles, and therefore is more strongly affected by wall stabilization. The result is a slightly higher ${\it\kappa}$ for the Solov’ev profile.
Based on this discussion, we see that the numerical results presented here and below should be viewed in the context of future experimental designs where the wall to plasma radius can remain fixed as the plasma geometry is varied. Even so, if the designs are based primarily on empirical ${\it\tau}_{E}$ scaling, the impact of triangularity will not be accurately taken into account.
Having established and discussed the reference case, we now focus on the scaling of maximum elongation with various physical parameters.
Scaling with ${\it\beta}_{p}$
In the first set of studies, as well as all that follow, we fix ${\it\Delta}_{i}/a=0.1$ , ${\it\Delta}_{v}/a=0.3$ . The initial studies focus on scaling with ${\it\beta}_{p}$ . As such we fix ${\it\Delta}_{o}/a=0.1$ (which is equivalent to $b/a=1.1$ ) and ${\it\gamma}{\it\tau}_{w}=1.5$ . The value of ${\it\kappa}_{w}$ is again set in accordance with (6.7).
The desired scaling curves are calculated by repeating the procedure described for the reference case but for various values for ${\it\beta}_{p}$ . In figure 4, a set of curves is shown for four values of ${\it\beta}_{p}=0$ , 0.5, 1, 1.5. An examination of these curves indicates only a weak scaling of ${\it\kappa}$ with ${\it\beta}_{p}$ . Noticeable differences occur only for tight aspect ratios, $a/R_{0}>0.5$ . With regard to triangularity, observe that the optimum ${\it\delta}$ increases with increasing ${\it\beta}_{p}$ although the values, even at ${\it\beta}_{p}=1.5$ , are still below the peak performance values given in table 1, presumably because of the reasons discussed with the reference case.
A possible reason for the larger triangularity as ${\it\beta}_{p}$ increases is as follows. As ${\it\beta}_{p}$ increases, the contribution to the toroidal current density at the outermidplane $R>R_{0}$ becomes larger than the current density at the innermidplane $R<R_{0}$ . Since the outer midplane toroidal curvature is unfavourable, its effect is minimized by reducing the area on the outside of the plasma. This is accomplished by increasing the triangularity. Hence ${\it\delta}$ increases with increasing ${\it\beta}_{p}$ . However, ${\it\delta}$ cannot become too large because of the corresponding increase in unfavourable poloidal curvature at the vertical tips of the plasma.
Scaling with $b/a$
In the second set of studies we fix ${\it\beta}_{p}=1$ , ${\it\gamma}{\it\tau}_{w}=1.5$ and vary the wall radius $b/a$ . As previously stated we do this by setting ${\it\Delta}_{i}/a=0.1$ , ${\it\Delta}_{v}/a=0.3$ and varying the outer gap parameter ${\it\Delta}_{o}/a$ . The values of $b/a$ and ${\it\kappa}_{w}$ are then determined from (6.2).
Following the procedure described above, we compute curves of ${\it\kappa}={\it\kappa}({\it\varepsilon})$ and the corresponding ${\it\delta}={\it\delta}({\it\varepsilon})$ for various ${\it\Delta}_{o}/a$ . These curves are illustrated in figure 5 for the values ${\it\Delta}_{o}/a=0.1$ , 0.3, 0.5 or equivalently $b/a=1.1$ , 1.2, 1.3. A comparative plot of the geometries for each elongation is shown in figure 6.
The numerical results show that, as expected, moving the wall further out leads to a lower maximum elongation. However, the decrease in maximum elongation is smaller than the increase in wall radius. Specifically, for any ${\it\varepsilon}$ , a change in $b/a=0.2$ leads to an approximate change in ${\it\kappa}\approx 0.1$ . Also, the change in triangularity is small, at approximately 0.05 over the whole range of ${\it\varepsilon}$ for the same change in $b/a=0.2$ .
The presumable explanation is that, even though the outer part of the wall is being moved further away from the plasma, the strong resistive wall image currents stay approximately the same on the inner, top and bottom of the first wall since these gaps have been held fixed. In other words, the effectiveness of the feedback system is not primarily driven by the proximity of the outer wall to the plasma. One might wonder whether larger decreases in maximum ${\it\kappa}$ would occur by instead increasing the inner or upper/lower gaps. This turns out to not be the case based on separate numerical studies that we have carried out (but which for brevity are not reported here). The conclusion is that the maximum ${\it\kappa}$ depends significantly on the size of the gap but not its location.
Scaling with ${\it\gamma}{\it\tau}_{w}$
The final set of numerical studies examines the scaling with the feedback parameter ${\it\gamma}{\it\tau}_{w}$ . For these studies we fix the wall gaps to ${\it\Delta}_{i}/a=0.1$ , ${\it\Delta}_{v}/a=0.3$ , ${\it\Delta}_{o}/a=0.1$ and beta poloidal to ${\it\beta}_{p}=1$ . These are the reference values. The values of $b/a$ and ${\it\kappa}_{w}$ are again determined from (6.7).
Curves are generated of ${\it\kappa}={\it\kappa}({\it\varepsilon})$ and the corresponding ${\it\delta}={\it\delta}({\it\varepsilon})$ for ${\it\gamma}{\it\tau}_{w}=0$ , 1, 2, 3 as shown in figure 7. Observe that the curve for ${\it\gamma}{\it\tau}_{w}=0$ represents an experiment without a vertical stability feedback system. It therefore approximates the results for earlier natural elongation studies (see for example Hakkarainen et al. (Reference Hakkarainen, Betti, Freidberg and Gormely1990)). The achievable elongations are indeed quite modest, for example ${\it\kappa}=1.17$ , ${\it\delta}=0.17$ for $a/R_{0}=0.3$ .
For higher values of ${\it\gamma}{\it\tau}_{w}$ , we see that increases in the feedback system capabilities lead to substantial increases in the maximum achievable elongation. Again, for $a/R_{0}=0.3$ , the maximum ${\it\kappa}$ increases from 1.17 to 2.77 as ${\it\gamma}{\it\tau}_{w}$ increases from 0 to 3. The optimized triangularity is insensitive to ${\it\gamma}{\it\tau}_{w}$ for small to moderate ${\it\varepsilon}$ , but decreases appreciably for tight aspect ratios.
A final quite interesting point concerns a different aspect of triangularity as evidenced in the data from DIIID in table 1. To illustrate the point, we have carried out a series of calculations assuming a starting point with values of ${\it\kappa}=2.37$ and ${\it\delta}=0.20$ at ${\it\varepsilon}=0.35$ from the ${\it\gamma}{\it\tau}_{w}=2$ curve. We then vary ${\it\delta}$ holding ${\it\kappa}$ and ${\it\varepsilon}$ fixed. At each new ${\it\delta}$ , we recompute the value of ${\it\gamma}{\it\tau}_{w}$ required to make the eigenvalue ${\it\lambda}_{\mathit{min}}=0$ . This results in a curve of ${\it\gamma}{\it\tau}_{w}$ versus ${\it\delta}$ , as shown in figure 8. In other words, how much must the feedback capability be increased to stabilize a triangularity that is away from its optimum value? We see that the minimum in ${\it\gamma}{\it\tau}_{w}$ is relatively flat in the vicinity of ${\it\gamma}{\it\tau}_{w}=2$ but that a large increase is needed for high triangularities. For example, to achieve a triangularity of 0.71 requires a doubling of the feedback capacity to ${\it\gamma}{\it\tau}_{w}=4$ even though the elongation has remained unchanged. Some insight into this strong behaviour can be obtained by noting that the ratio of the pressure driven term to the line bending term in the ideal MHD ${\it\delta}W_{F}$ scales as
The $1{\it\delta}^{2}$ factor arises from increasing unfavourable poloidal curvature at the top and bottom of the plasma as ${\it\delta}$ becomes larger. This leads to increased instability requiring a larger feedback capability, which is consistent with the DIIID data.
Figure 9 shows the values of elongation and triangularity observed in a large sample of discharges in the DIIID experiment. We can see that the maximum elongation ( ${\it\kappa}=2.3$ ) occurs for moderate triangularity, at approximately ${\it\delta}=0.5$ instead of the highest achievable triangularity, which is around ${\it\delta}=0.9$ . The existence of an optimal triangularity for the maximum elongation obtained in experiments agrees with the theoretical prediction in this paper.
8. Discussion
We have calculated the scaling of maximum elongation and corresponding optimized triangularity as a function of inverse aspect ratio for various plasma parameters. The scaling trends are as one might have expected:

(i) In general, the maximum achievable elongation and optimized triangularity increase as the aspect ratio becomes tighter.

(ii) At fixed aspect ratio, the maximum elongation ${\it\kappa}_{\mathit{max}}$ , is relatively insensitive to ${\it\beta}_{p}$ except for ${\it\varepsilon}\rightarrow 1$ . For tight aspect ratios, ${\it\kappa}_{\mathit{max}}$ decreases. The optimum triangularity monotonically increases with both ${\it\varepsilon}$ and ${\it\beta}_{p}$ .

(iii) When the outer midplane wall is moved further away from the plasma, ${\it\kappa}_{\mathit{max}}$ decreases, although not by that much. There are still strong image currents on the inner, upper and lower walls to keep the stability largely intact. Also, there is a small increase in triangularity.

(iv) There are large gains in ${\it\kappa}_{\mathit{max}}$ as the feedback capability ${\it\gamma}{\it\tau}_{w}$ is increased. This is accompanied by a smalltomodest decrease in triangularity. One interesting feature is that as the triangularity increases away from its optimum value towards ${\it\delta}\rightarrow 1$ , the required ${\it\gamma}{\it\tau}_{w}$ for stability increases rapidly because of the corresponding increase in unfavourable poloidal curvature at the upper and lower tips of the plasma.
Overall, the theoretical predictions of ${\it\kappa}_{\mathit{max}}$ are slightly higher than those observed experimentally for the high performance shots in table 1. The explanation is likely associated with two effects, both of which effectively increase the experimental wall radius, thereby reducing the achievable ${\it\kappa}_{\mathit{max}}$ : (i) shrinking the plasma minor radius to increase plasma elongation and (ii) more peaked current profiles than in the Solov’ev model.
A second important theoretical prediction concerns the optimized values of ${\it\delta}$ , which are noticeably smaller than the observations. The suggestion is that high performance, as measured by high pressure, is not solely dependent on $n=0$ MHD stability. Kink stability and transport play a comparably important role in maximizing performance. As shown in figure 9, the total stored energy is maximized by maximizing the triangularity, and is lower if the plasma is in a highly elongated configuration with a correspondingly lower optimized triangularity. Gains in ${\it\beta}_{N}$ and ${\it\tau}_{E}$ may more than compensate reductions in ${\it\kappa}_{\mathit{max}}$ by operation away from the optimum ${\it\delta}$ . This hypothesis is also supported by experimental results obtained on the JET tokamak (Lomas & JET Team Reference Lomas2000).
The dependence of plasma confinement on triangularity remains poorly understood to this day. Confinement may improve at high triangularity due to reduced MHD turbulence associated with the $n=1$ ballooningkink mode (Eriksson & Wahlberg Reference Eriksson and Wahlberg2001). On the other hand, it was found in experiments on the TCV tokamak that increasing triangularity led to reduced plasma confinement (Weisen et al. Reference Weisen, Moret, Franke, Furno, Martin, Anton, Behn, Dutch, Duval and Hofmann1997), which was explained by the increase of driftwave turbulent transport for high triangularity (Camenen et al. Reference Camenen, Pochelon, Behn, Bottino, Bortolon, Coda, Karpushov, Sauter and Zhuang2007). Unfortunately, the present empirical scaling relations for ${\it\tau}_{E}$ do not explicitly take triangularity into account. Characterizing the complicated effect of triangularity on confinement may therefore be an important challenge for the transport community in the future.
Acknowledgements
The authors would like to thank Drs J. Menard (PPPL) and S. Kaye (PPPL) for their help in understanding the NSTX data as well as Professor D. Whyte (MIT) for providing the motivation for this work and for many insightful conversations. J.P.L. and A.C. were supported by the US Department of Energy, Office of Science, Fusion Energy Sciences under award numbers DEFG0286ER53223 and DESC0012398. J.P.F. was partially supported by the US Department of Energy, Office of Science, Fusion Energy Sciences under award number DEFG0291ER54109. M.G. was supported by US Department of Energy award DEFC0299ER54512.
Appendix A. Linear algebra for $n=0$ stability
The stability problem can be written in a classic eigenvalue form as follows
Here $\unicode[STIX]{x1D652}$ is an $6M\times 6M$ symmetric matrix and $\boldsymbol{x}$ is a vector of length $6M$ . Also included is the ${\it\gamma}{\it\tau}_{w}$ term which enters as an $M\times M$ diagonal matrix contribution to $\unicode[STIX]{x1D652}$ . The mathematical goal is to find the eigenvalues ${\it\lambda}_{j}$ of $\unicode[STIX]{x1D652}$ subject to the Green’s function constraints:
The matrix $\unicode[STIX]{x1D63E}$ has $6M$ rows and $4M$ columns (i.e. $\unicode[STIX]{x1D63E}$ is a $6M\times 4M$ matrix) and has a rank $4M$ . The physical goal requires finding the maximum, ${\it\kappa}={\it\kappa}({\it\varepsilon},{\it\beta}_{p},b/a,{\it\gamma}{\it\tau}_{w})$ and corresponding ${\it\delta}={\it\delta}({\it\varepsilon},{\it\beta}_{p},b/a,{\it\gamma}{\it\tau}_{w})$ , such that the smallest (i.e. most negative) eigenvalue satisfies ${\it\lambda}_{min}=0$ .
Golub and Underwood have proposed an efficient and elegant method to treat this mathematical problem (Golub & Underwood Reference Golub and Underwood1970). The idea is to take into account the constraint relation by carrying out a $QR$ orthogonal decomposition of the constraint matrix $\unicode[STIX]{x1D63E}$ . This allows us to exactly factor out the $4M$ zero eigenvalues arising from the constraint relations, leaving us with a $2M\times 2M$ eigenvalue problem. The $QR$ orthogonal decomposition (called with the function ‘qr’ in MATLAB) of $\unicode[STIX]{x1D63E}$ can be written as
where, as mentioned in the main text, the symbol $\cdots$ is used to represent the separation between matrix blocks. The properties and dimensions of the matrices, using the notation $m=6M,n=4M$ , and $p=mn=2M$ are as follows: $\unicode[STIX]{x1D64D}$ is an $n\times n$ invertible upper triangular matrix, $\mathbf{0}$ is a $p\times n$ null space matrix and $\unicode[STIX]{x1D64C}$ is an $m\times m$ orthonormal matrix satisfying $\unicode[STIX]{x1D64C}^{\text{T}}\boldsymbol{\cdot }\unicode[STIX]{x1D64C}=\unicode[STIX]{x1D644}$ . Note that since $\unicode[STIX]{x1D64C}$ is a square matrix, it follows that $\unicode[STIX]{x1D64C}^{\text{T}}=\unicode[STIX]{x1D64C}^{1}$ .
The next step in the procedure, assuming that $\unicode[STIX]{x1D64C}$ is known, is to introduce a new set of basis vectors $\boldsymbol{z}$ in place of $\boldsymbol{x}$ defined by
Here, $\boldsymbol{z}_{n}$ contains the first $n$ elements of $\boldsymbol{x}$ while $\boldsymbol{z}_{p}$ contains the remaining $p$ elements. Clearly, both $\boldsymbol{x}$ and $\boldsymbol{z}$ each contain $m$ elements. The usefulness of the transformation becomes apparent when rewriting the constraint relation in terms of $\boldsymbol{z}$ ,
Now, using the orthonormal properties of $\unicode[STIX]{x1D64C}$ it follows that
Equation (A 5) thus reduces to
Carrying out the matrix multiplication leads to the simple result
Since $\unicode[STIX]{x1D64D}$ is invertible it has an inverse. Therefore, operating on the left of (A 8) with $(\unicode[STIX]{x1D64D}^{\text{T}})^{1}$ yields
The $QR$ decomposition has led to a set of basis vectors in which the constraint relation is satisfied by the simple step of setting the first $n$ elements of $\boldsymbol{z}$ identically to zero.
We can take this result into account by rewriting the basis vector transformation given by (A 4) as follows
Observe that $\unicode[STIX]{x1D644}_{p}$ is an identity matrix of dimension $p\times p$ which appears only in the lower righthand corner of the total $m\times m$ matrix $\hat{\unicode[STIX]{x1D644}}$ . This is a convenient way to suppress the appearance of $\boldsymbol{z}_{n}$ .
The original eigenvalue problem defined by (A 1) and (A 2) can now be simplified by eliminating $\boldsymbol{x}$ in terms of $\boldsymbol{z}$
The critical point to recognize is that the constraint $\unicode[STIX]{x1D63E}^{\text{T}}\boldsymbol{\cdot }\boldsymbol{x}=0$ is automatically satisfied in this representation. That is, introduction of $\hat{\unicode[STIX]{x1D644}}$ eliminates the contribution of $\boldsymbol{z}_{n}$ and is equivalent to setting $\boldsymbol{z}_{n}=\mathbf{0}$ , which is the constraint condition expressed in terms of $\boldsymbol{z}$ .
The numerator and denominator in (A 11) can be greatly simplified. Using the properties of $\unicode[STIX]{x1D64C}$ and $\hat{\unicode[STIX]{x1D644}}$ we see that the denominator can be written as
Next, in the numerator write
where $\unicode[STIX]{x1D652}_{11}$ is $n\times n$ , $\unicode[STIX]{x1D652}_{22}$ is $p\times p$ and $\unicode[STIX]{x1D652}_{12}$ is $n\times p$ . Since the starting $m\times m$ matrix $\unicode[STIX]{x1D64C}\boldsymbol{\cdot }\unicode[STIX]{x1D652}\boldsymbol{\cdot }\unicode[STIX]{x1D64C}^{\text{T}}$ is symmetric, the matrices $\unicode[STIX]{x1D652}_{11}$ and $\unicode[STIX]{x1D652}_{22}$ must also be symmetric. Using this information we see that the numerator of (A 11) reduces to
Of the total matrix $\unicode[STIX]{x1D64C}\boldsymbol{\cdot }\unicode[STIX]{x1D652}\boldsymbol{\cdot }\unicode[STIX]{x1D64C}^{\text{T}}$ only $\unicode[STIX]{x1D652}_{22}$ need be extracted.
The original eigenvalue problem including constraints has now been reduced to the desired form
It has been reduced from an $m\times m$ to a $p\times p$ eigenvalue problem for the symmetric matrix $\unicode[STIX]{x1D652}_{22}$ . Once the eigenvectors have been determined, the original vector $\boldsymbol{x}$ is determined by substituting into (A 10).