Hostname: page-component-77f85d65b8-pkds5 Total loading time: 0 Render date: 2026-04-12T13:19:34.915Z Has data issue: false hasContentIssue false

An adjoint formulation of energetic particle confinement

Published online by Cambridge University Press:  30 March 2026

Christopher J. McDevitt*
Affiliation:
Nuclear Engineering Program, Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
Jonathan S. Arnaud
Affiliation:
Nuclear Engineering Program, Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
*
Corresponding author: Christopher J. McDevitt, cmcdevitt@ufl.edu

Abstract

An adjoint formulation of energetic particle confinement in axisymmetric tokamak geometry is derived and evaluated using a physics-informed neural network (PINN). The PINN estimates the mean escape time of energetic ions by solving an inhomogeneous adjoint of the drift kinetic equation with a Lorentz collision operator, yielding predictions of fast ion loss in tokamak geometry due to direct ion orbit loss and collisional transport. To our knowledge, this is the first time a PINN has been used to solve the drift kinetic equation in tokamak geometry, a challenging problem due to the large time scale separation between the rapid transit time of energetic ions and their slow collisional time scale. It is shown that a careful and intentional design of a PINN is able to learn the mean escape time across the majority of the plasma volume, suggesting a path towards constructing a rapid surrogate for use within a broader optimisation framework.

Information

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

1. Introduction

Energetic particles are a ubiquitous aspect of magnetic fusion plasmas. Such particles emerge as a result of ion heating schemes including neutral beam injection (Hemsworth, Tanga & Antoni Reference Hemsworth, Tanga and Antoni2008) and ion cyclotron resonance heating (ICRH) (Adam Reference Adam1987; Kazakov et al. Reference Kazakov2017). For a burning deuterium–tritium plasma, the confinement of energetic alpha particles is critical for the sustainment of the fusion plasma. In addition, when such particles escape, they can cause substantial damage to plasma-facing components (Fasoli et al. Reference Fasoli2007; Bonofiglo et al. Reference Bonofiglo2024). Obtaining an understanding of the confinement time of such particles is thus vital to the operation of a magnetic fusion device.

In the present work, an initial step towards a framework through which the confinement time of an energetic ion or electron can be efficiently estimated is developed. While evaluating the confinement time of an energetic particle is often of greatest importance for non-axisymmetric magnetic fields arising from error fields (Goldston, White & Boozer Reference Goldston, White and Boozer1981), magnetic islands (Carolipio et al. Reference Carolipio, Heidbrink, Forest and White2002), energetic particle modes (Heidbrink & White Reference Heidbrink and White2020), or for a three-dimensional (3-D) stellarator equilibrium (Galeev et al. Reference Galeev, Sagdeev, Furth and Rosenbluth1969), this work will consider the simpler problem of an axisymmetric tokamak equilibrium. Our motivation for considering this simpler problem is to investigate the potential for evaluating metrics of energetic particle transport via the solution to an inhomogeneous adjoint problem. As described later, the inhomogeneous adjoint of the drift kinetic equation provides a direct means of computing the average confinement time of an ion or electron as a function of the particle’s initial phase space position (Särkimäki Reference Särkimäki2020), a quantity we will refer to as the mean escape time. While this adjoint problem can be solved by a variety of numerical methods, we will consider a physics-informed neural network (PINN) (Raissi, Perdikaris & Karniadakis Reference Raissi, Perdikaris and Karniadakis2019) in the present work. Physics-informed neural networks correspond to a machine learning approach capable of integrating both physics and data into the training of a neural network. A distinguishing feature of a PINN is that it can be extended to learn the parametric dependence of the solution of a partial differential equation (PDE; Sun et al. Reference Sun, Gao, Pan and Wang2020). Noting that optimising the transport properties of energetic particle orbits often requires exploring a broad range of magnetic field configurations, obtaining a rapid surrogate capable of computing quantities of interest such as the mean escape time of energetic particles provides a powerful tool for assessing energetic particle transport.

Our aim in the present paper is not to provide such a parametric solution, but to instead investigate the ability of a PINN to solve the drift kinetic equation for an energetic ion population. The extension to a parametric solution will be treated in a future work. Specifically, while PINNs have previously been used to solve the relativistic (Arnaud, Mark & McDevitt Reference Arnaud and McDevitt2024; McDevitt, Arnaud & Tang Reference McDevitt and Tang2025; Arnaud, Tang & McDevitt Reference Arnaud, Tang and McDevitt2025) and non-relativistic (McDevitt & Tang Reference McDevitt and Tang2024) Fokker–Planck equations, the large time scale separation between the rapid bounce or transit times and slow collision time of energetic particles in tokamak geometry represents a far more challenging problem. Using advanced optimisation algorithms (Kiyani et al. Reference Kiyani, Shukla, Urbán, Darbon and Karniadakis2025; Urbán et al. Reference Urbán, Stefanou and Pons2025; Wang et al. Reference Wang, Bhartari, Li and Perdikaris2025), we aim to show that PINNs are able to robustly resolve several features of this challenging system, even in the absence of data. As described further later, as the energy of the ions is increased, leading to a greater time scale separation between the fast bounce/transit motion and slow collision time of energetic ions, our implementation of a PINN struggles to quantitatively describe the escape time of the best confined ion orbits, although it does resolve the solution in the outer region of the tokamak including phase space regions of prompt ion loss (Hinton & Chu Reference Hinton and Chu1985).

As an alternative means of evaluating the fast ion mean escape time, we introduce a new drift-kinetic solver based on the JAX framework. While in the present work, this solver is only used to verify solutions from the PINN, future work will leverage particle data generated by this drift-kinetic solver to help accurately resolve the mean escape time of well-confined ion orbits, along with facilitating the training of the PINN over a broad range of plasma parameters. This extended PINN framework, trained using both physics and particle data, would thus provide an efficient surrogate for fast ion confinement appropriate for use inside optimisation loops. The remainder of this paper is organised as follows. Section 2 derives an adjoint problem for the mean escape time of an energetic particle. Brief overviews of the physics-informed deep learning framework and JAX particle-based solver are given in § 3. The mean escape time of an energetic ion species is computed in § 4. Conclusions and a discussion of future directions are given in § 5.

2. Adjoint of steady-state drift kinetic equation

In this section, we will define an adjoint problem based on the inhomogeneous steady-state drift kinetic equation. Our starting point will be to define the Green’s function $F_s (\boldsymbol{z}; \boldsymbol{z}_0 )$ for the steady-state drift kinetic (Helander & Sigmar Reference Helander and Sigmar2002):

(2.1a) \begin{equation} \boldsymbol{\nabla }\boldsymbol{\cdot }\big( \boldsymbol{\dot {X}} F_s\big) + \frac {\partial }{\partial v_\Vert } \left ( \dot {V}_\Vert F_s \right ) - C_s \left ( F_s \right ) = \delta \left ( \boldsymbol{z} - \boldsymbol{z}_0 \right )\!, \end{equation}

where $\boldsymbol{z} = ( \boldsymbol{x}, v_\Vert , \mu )$ is the phase space position of the particle and the characteristic equations are given by

(2.1b) \begin{align} \boldsymbol{\dot {X}} &= v_\Vert \boldsymbol{\hat {b}} + \boldsymbol{v}_{\textit{EB}} + \boldsymbol{v}_{\textit{Ds}} , \end{align}
(2.1c) \begin{align} \dot {V}_\Vert &= \frac {q_s}{m_s} \boldsymbol{\hat {b}} \boldsymbol{\cdot }\boldsymbol{E} - \mu \boldsymbol{\hat {b}} \boldsymbol{\cdot }\boldsymbol{\nabla }B . \end{align}

Here, the subscript $s$ is the species index, $\mu \equiv v^2_\perp /( 2B )$ is the magnetic moment, $v_\Vert$ is the velocity parallel to the magnetic field, $C_s ( F_s )$ is a collision operator, taken to be a Lorentz collision operator

(2.2) \begin{equation} C_s \left ( F_s \right ) = \frac {\nu _D}{2} \frac {\partial }{\partial \xi } \left [ \left ( 1-\xi ^2 \right ) \frac {\partial F}{\partial \xi}\right]\!, \end{equation}

and we have defined the $\boldsymbol{E} \times \boldsymbol{B}$ and magnetic drifts as

(2.3a) \begin{align} \boldsymbol{v}_{\textit{EB}} &= \frac {\boldsymbol{E} \times \boldsymbol{B}}{B^2} , \end{align}
(2.3b) \begin{align} \boldsymbol{v}_{\textit{Ds}}&= \frac {\left ( v^2_\Vert + \mu B\right )}{\omega _{\textit{cs}}} \boldsymbol{\hat {b}} \times \boldsymbol{\nabla }\ln B . \end{align}

The magnetic drifts have been simplified by the approximation $\boldsymbol{\nabla }\times \boldsymbol{\hat {b}} \approx \boldsymbol{\hat {b}} \times \boldsymbol{\nabla }\ln B$ , appropriate in the limit of a ‘vacuum plasma’. As boundary conditions, we will enforce $F_s = 0$ at the spatial boundaries whenever $U_r \equiv \boldsymbol{r} \boldsymbol{\cdot }\boldsymbol{\dot {X}} \lt 0$ , where $\boldsymbol{\hat {r}}$ is the radial unit vector and we have assumed a circular spatial boundary, but leave $F_s$ unconstrained for $U_r \equiv \boldsymbol{r} \boldsymbol{\cdot }\boldsymbol{\dot {X}} \geqslant 0$ , i.e.

(2.4) \begin{equation} F_s \left ( r=r_{\textit{wall}}\right ) = \begin{cases} 0, & U_r \lt 0, \\ \text{unconstrained}, & U_r \geqslant 0, \end{cases} \end{equation}

where $r_{\textit{wall}}$ is the location of the surrounding wall, taken as the plasma minor radius $a$ , for simplicity. With these boundary conditions, particles are not allowed to enter the spatial domain, but particles are able to exit. The resulting Green’s function can thus be physically interpreted as the steady-state particle distribution resulting from a unit source located at $\boldsymbol{z} = \boldsymbol{z}_0$ .

To derive an adjoint problem for the mean escape time of an energetic particle, we begin by multiplying the left-hand side of (2.1a ) by the solution to the adjoint equation $T_s$ , which after successive integrations by parts, yields

(2.5) \begin{align} &\int \text{d}^3v \text{d}^3 x T_s \left[\boldsymbol{\nabla }\boldsymbol{\cdot }\big( \boldsymbol{\dot {X}}F_s \big) + \frac {\partial }{\partial v_\Vert } \left ( \dot {V}_\Vert F_s \right ) - C_s \left ( F_s \right ) \right ] \nonumber \\ &\quad = \int \text{d}^3v \,\text{d}^3 x F_s \left [ -\boldsymbol{\dot {X}} \boldsymbol{\cdot }\boldsymbol{\nabla }T_s - \dot {V}_\Vert \frac {\partial T_s}{\partial v_\Vert } - C^*_s \left ( T_s \right ) \right ] \nonumber \\ &\qquad + \int \text{d}^3v \int \text{d} \varGamma \left . \left [ U_r T_s F_s \right ] \right |_{r=r_{\textit{wall}}}. \end{align}

Here, the first term on the right-hand side is the Green’s function times the adjoint of the steady state drift kinetic equation, and the second term on the right-hand side describes fluxes through the spatial boundary $\varGamma$ . The boundary condition on the Green’s function $F_s$ (2.4) implies the spatial boundary term will vanish when $U_r \lt 0$ , where setting the boundary conditions on $T_s$ when $U_r \geqslant 0$ allows for distinct adjoint problems to be defined.

An adjoint problem for the mean escape time (Grasman et al. Reference Grasman2013; Liu et al. Reference Liu, Brennan, Boozer and Bhattacharjee2017) of an energetic particle can be derived by considering the inhomogeneous adjoint equation

(2.6) \begin{equation} \boldsymbol{\dot {X}} \boldsymbol{\cdot }\boldsymbol{\nabla }T_s + \dot {V}_\Vert \frac {\partial T_s}{\partial v_\Vert } + C^*_s \left ( T_s \right ) = -1 . \end{equation}

Substituting (2.6) along with the steady-state Green’s function defined by (2.1a ) into the right- and left-hand sides, respectively, of (2.5), yields

(2.7) \begin{align} T_s \left ( \boldsymbol{x}_0, \boldsymbol{v}_0\right ) &= N_s + \int \text{d}\varGamma \int \text{d}^3v \left . \left [ U_r T_s F_s \right ] \right |_{r_{\textit{wall}}}\!, \end{align}

where $\varGamma$ is the surface of the surrounding wall and $N_s\equiv \int \text{d}^3v \int \text{d}^3x F_s$ is the number of particles remaining in the domain due to the unit source on the right-hand side of (2.6). At the spatial boundary we will take

(2.8) \begin{equation} T_s \left ( r=r_{\textit{wall}}\right ) = \begin{cases} \text{unconstrained}, & U_r \leqslant 0, \\ 0, & U_r \gt 0. \end{cases} \end{equation}

This boundary condition follows since a particle born on the spatial boundary with an outward directed velocity would be lost immediately, such that $T_s=0$ , however, for an inward directed velocity, this particle would not be immediately lost. For the boundary condition defined by (2.8) and noting that $F_s = 0$ for $U_r\lt 0$ , (2.4) yields

(2.9) \begin{align} T \left ( \boldsymbol{x}_0, \boldsymbol{v}_0\right ) &= N_s . \end{align}

Further noting that $N_s$ will scale directly with the confinement time of the particles, $T_s( \boldsymbol{x}_0, \boldsymbol{v}_0)$ will act as a measure of how well particles injected at a given phase space location are confined on average, a quantity often referred to as the mean escape time or expected exit time (Grasman et al. Reference Grasman2013). We note that for the present example, all ions will escape for $t\to \infty$ , such that the mean escape time will be finite throughout the domain.

3. Physics-constrained deep learning framework

The primary focus of the remainder of this work will be the solution of the adjoint problem derived in § 2. A brief overview of our implementation of a PINN is given in § 3.1, where the reader is referred to Cuomo et al. (Reference Cuomo, Di Cola, Giampaolo, Rozza, Raissi and Piccialli2022), Wang et al. (Reference Wang, Sankaran, Wang and Perdikaris2023), Toscano et al. (Reference Toscano, Oommen, Varghese, Zou, Ahmadi Daryakenari, Wu and Karniadakis2025) for additional information on this rapidly expanding area. In addition, a newly developed GPU accelerated particle-based solver for the drift kinetic equation is described in § 3.2. For the present work, this particle-based solver will be used to validate the PINN’s solutions to the drift kinetic equation.

3.1. Physics-informed neural networks

Physics-constrained machine learning methods (Karniadakis et al. Reference Karniadakis, Kevrekidis, Lu, Perdikaris, Wang and Yang2021) seek to incorporate physical constraints into the training of a neural network. Physics-informed neural networks correspond to a prominent example of this area. An attractive feature of PINNs is that they can be used to solve PDEs and ordinary differential equations (ODEs) directly in the absence of data (van Milligen, Tribaldos & Jiménez Reference van Milligen, Tribaldos and Jiménez1995; Raissi et al. Reference Raissi, Perdikaris and Karniadakis2019). They also provide a mechanism through which physical constraints can be used to enrich sparse data sets (Cai et al. Reference Cai, Mao, Wang, Yin and Karniadakis2021; Mathews et al. Reference Mathews, Hughes, Terry and Baek2022; McDevitt, Fowler & Roy Reference McDevitt, Fowler and Roy2024). In its simplest form, the loss function of a PINN can be written as

(3.1) \begin{equation} \text{Loss} = \frac {1}{N_{\textit{data}}} \sum ^{N_{\textit{data}}}_{i=1} \left [ T_i - T \left ( \boldsymbol{z}_i; \boldsymbol{\lambda }_i \right )\right ]^2 + \frac {w_{\textit{PDE}}}{N_{\textit{PDE}}} \sum ^{N_{\textit{PDE}}}_{i=1} \mathcal{R}^2 \left ( \boldsymbol{z}_i; \boldsymbol{\lambda }_i \right)\!, \end{equation}

where $T$ is the quantity we seek to predict, the mean escape time for the current work; $\mathcal{R}$ is the residual of the governing PDE; $w_{\textit{PDE}}$ is a scalar weight applied to the PDE bias of the loss; $\boldsymbol{z} = ( \boldsymbol{x}, \boldsymbol{v} )$ are the independent variables; and $\boldsymbol{\lambda }$ describes physical parameters, such as collisionality or inverse aspect ratio. The first term represents a bias due to data, where the data may be boundary conditions, or synthetic or experimental data. The second term is a bias due to the governing equations, which, for the present work, will be the inhomogeneous adjoint of the steady-state drift kinetic equation. In the absence of synthetic or experimental data, the first term will only contain the boundary conditions. Successfully minimising the loss will thus correspond to identifying a solution to the PDE while satisfying the boundary conditions. Unlike traditional PDE solvers, a parametric solution to a PDE can be straightforwardly obtained by training across a broad range of physics parameters $\boldsymbol{\lambda }$ . Thus, while the training time of such a model may be substantially longer than a traditional PDE solver, once trained, the fast inference time of a PINN allows for fast online prediction times across a broad range of plasma conditions. This property can be used for the development of fast surrogate models, where some recent examples include runaway electron generation (McDevitt Reference McDevitt, Fowler and Roy2023; Arnaud et al. Reference Arnaud, Tang and McDevitt2025), MHD equilibrium (Jang et al. Reference Jang, Kaptanoglu, Gaur, Pan, Landreman and Dorland2024), plasma thrusters (Luo et al. Reference Luo, Ren, Chen, Mao, Zhang, Wang and Tang2025) or non-thermal ion distributions (McDevitt & Tang Reference McDevitt and Tang2024).

Key elements in obtaining accurate results from PINNs correspond to (i) the identification of an appropriate neural network architecture, (ii) the phase space weighting of the residual and (iii) the selection of an efficient optimisation algorithm for minimising the loss. Regarding item (i), we will employ a fully connected feedforward neural network, but with an ‘output layer’ that enforces positivity as a hard constraint. In so doing, this both prevents certain unphysical solutions from being predicted by the PINN and, perhaps more importantly, narrows the space of solutions that the optimiser searches across. Defining $T_{\textit{NN}}$ to be the output of the hidden layers of the neural network, a simple transformation that enforces positivity is given by

(3.2) \begin{equation} T_s = \exp \left(T_{\textit{NN}}\right)\!. \end{equation}

In addition to enforcing positivity, the exponential dependence of $T_s$ on the output of the hidden layers $T_{\textit{NN}}$ provides a natural means of capturing the large disparity in escape times for energetic ions. Specifically, ions born near the plasma core are expected to be well confined, where they will only be lost by the relatively slow process of collisional transport, in contrast to those born near the edge, which may be lost on a very short time scale by direct orbit loss.

An additional challenge that arises is ensuring that the PINN is able to robustly connect the boundary condition (2.8) applied at the edge of the tokamak to the solution throughout the core plasma. Noting that we expect the escape time $T_s$ to be small near the edge, training points in this edge region can be given greater emphasis by weighting the residual of the PDE by

(3.3) \begin{equation} \mathcal{F} \left ( \boldsymbol{z}; \boldsymbol{\lambda } \right ) = \frac {A}{A+T_s\left ( \boldsymbol{z}; \boldsymbol{\lambda } \right ) } , \end{equation}

where $A$ is a constant, such that the loss can be written as

(3.4) \begin{equation} \text{Loss} = \frac {1}{N_{bdy}} \sum ^{N_{bdy}}_{i=1} \left [ T_{s,i} - T_s \left ( \boldsymbol{z}_i; \boldsymbol{\lambda }_i \right ) \right ]^2 + \frac {w_{\textit{PDE}}}{N_{\textit{PDE}}} \sum ^{N_{\textit{PDE}}}_{i=1}\left [ \mathcal{F} \left ( \boldsymbol{z}_i; \boldsymbol{\lambda }_i \right ) \mathcal{R} \left ( \boldsymbol{z}_i; \boldsymbol{\lambda }_i \right ) \right ]^2\!. \end{equation}

Here, the boundary conditions defined by (2.8) are enforced by the first term, $\mathcal{R}$ is the residual of the inhomogeneous adjoint defined by (2.6) and the choice of $A$ sets how much emphasis is given to the edge region. In particular, noting that near the edge, $T_s$ will be of the order of a transit time, since ions on loss orbits will directly escape from the plasma, if $A$ is chosen to satisfy $A \ll 1$ , (3.3) will scale as $\propto 1/T_s$ such that the edge region will be heavily weighted, but the inner region of the plasma with $T_s \gg 1$ will receive minimal weighting. We thus expect this choice to allow for the outer region of the plasma to be accurately described at the expense of poor accuracy in the inner region. In contrast, by choosing a larger value of $A$ , improved accuracy for the inner region can be achieved at the expense of having a looser connection to the edge boundary condition, which is vital for achieving a physically meaningful solution. We have found that values of $A$ between 10 and 100 allow for robust convergence of the PINN, where all results in the present paper are for $A=100$ . In addition, the relative weight of the PDE residual compared with the boundary condition is set by the scalar $w_{\textit{PDE}}$ weighing the first term in (3.4). We have found $w_{\textit{PDE}}=100$ provides relatively rapid convergence of the loss.

The last key component of our PINN implementation will be the choice of an effective optimisation algorithm. While the ADAM optimiser (Kingma & Ba Reference Kingma and Ba2014) is used ubiquitously across a range of machine learning applications, this first-order optimisation algorithm often is not able to achieve low losses for challenging PDEs. A common strategy is thus to use ADAM for the first phase of training, after which limited memory BFGS (L-BFGS) (Liu & Nocedal Reference Liu and Nocedal1989), a second-order optimisation algorithm, is used to more tightly converge the loss. Motivated by Kiyani et al. (Reference Kiyani, Shukla, Urbán, Darbon and Karniadakis2025), Wang et al. (Reference Wang, Bhartari, Li and Perdikaris2025), and Urbán et al. (Reference Urbán, Stefanou and Pons2025) and, we have adopted a different optimisation strategy that we have found to yield substantially improved accuracy. The first optimiser that we apply corresponds to the quasi-second-order optimiser SOAP (Vyas et al. Reference Vyas, Morwani, Zhao, Kwun, Shapira, Brandfonbrener, Janson and Kakade2024). This scalable optimiser exhibits similar computational performance as ADAM, but can yield losses up to an order of magnitude smaller than those achievable by ADAM across a broad range of challenging applications of PINNs (Wang et al. Reference Wang, Bhartari, Li and Perdikaris2025). For the second stage of optimisation, we use the self-scaled Broyden (SSBroyden) method (Al-Baali, Spedicato & Maggioni Reference Al-Baali, Spedicato and Maggioni2014). Like L-BFGS, SSBroyden approximates the Hessian, but introduces additional degrees of freedom into the optimisation algorithm that have yielded substantial improvements in training across a range of complex PDEs (Kiyani et al. Reference Kiyani, Shukla, Urbán, Darbon and Karniadakis2025; Urbán et al. Reference Urbán, Stefanou and Pons2025). While each training epoch is substantially more computationally demanding compared with ADAM or SOAP for the SciPy implementation we are currently employing, we have found it results in substantially faster convergence of the loss for the adjoint problem treated in this work. The primary limitations of SSBroyden as an optimisation routine is that the SciPy implementation used in the present work does not efficiently use GPUs or support mini-batching. These limitations substantially increase the training time of the SSBroyden phase of training and limit the number of training points employed.

3.2. JONTA: a particle-based drift kinetic solver

Particle-based drift kinetic simulations of fast ion orbits are computationally intensive (White & Chance Reference White and Chance1984; Hirvijoki et al. Reference Hirvijoki, Asunta, Koskela, Kurki-Suonio, Miettunen, Sipilä, Snicker and Äkäslompolo2014) requiring substantial computational resources and efficient software. While this has predominately been done with Fortran or C++ libraries, with GPU acceleration facilitated by libraries such as Kokkos Trott et al. (Reference Trott2021), maintaining the support of these solvers can be cumbersome, due to its dependency on external backend libraries to effectively use high-performance computing (HPC) hardware. Perhaps most importantly, given that PINNs are trained on GPUs and libraries with a large user base, this motivates the development of a traditional drift kinetic solver that runs on GPUs and uses the same libraries that are used by PINNs, facilitating the operation of the broader framework that contains an interplay between the PINN and a drift kinetic solver.

With the aforementioned discussion, we present Just anOther fuNcTionAl pusher (JONTA), which is built on the PyTorch (Paszke et al. Reference Paszke2019) and JAX (Bradbury et al. Reference Bradbury2018) libraries that run on GPU accelerated hardware. For the present work, we will use the JAX backend and employ a four stage Runge–Kutta (RK4) integration scheme, which is part of a broader suite containing differentiable integrator methods. Guiding-centre equations are evolved with RK4 and collisions are implemented with a Monte-Carlo operator described later. The version of JONTA used in this manuscript together with the PINN training script are available on github.com/cmcdevitt2/DeepPlasma.

JONTA will be used to solve the guiding centre equations defined by (2.1) together with a Monte Carlo collision operator describing pitch-angle scattering. The guiding centre equations can be written as

(3.5a) \begin{align} \dot {X} &= \hat {\boldsymbol{x}} \boldsymbol{\cdot }\dot {\boldsymbol{X}} , \end{align}
(3.5b) \begin{align} \dot {Z} &= \hat {\boldsymbol{z}} \boldsymbol{\cdot }\dot {\boldsymbol{X}} , \end{align}
(3.5c) \begin{align} \dot {v} = \frac {\partial v}{\partial v_\Vert } \frac {\text{d}v_\Vert }{\text{d}t} +& \frac {\partial v}{\partial \mu } \frac {\text{d}\mu }{\text{d}t} + \frac {\partial v}{\partial \boldsymbol{X}} \boldsymbol{\cdot }\frac {\text{d}\boldsymbol{X}}{\text{d}t} , \end{align}
(3.5d) \begin{align} \dot {\xi } = \frac {\partial \xi }{\partial v_\Vert } \frac {\text{d}v_\Vert }{\text{d}t} &+ \frac {\partial \xi }{\partial \mu } \frac {\text{d}\mu }{\text{d}t} + \frac {\partial \xi }{\partial \boldsymbol{X}} \boldsymbol{\cdot }\frac {\text{d}\boldsymbol{X}}{\text{d}t} , \end{align}

where we have chosen cylindrical coordinates $( R, Z )$ , defined $X \equiv R-R_0$ and made the transformation from $( \boldsymbol{x}, v_\Vert , \mu )$ to $( \boldsymbol{x}, \xi , v )$ . Assuming an axisymmetric tokamak geometry with circular flux surfaces, and noting the relations $\xi ^2 =v^2_\Vert / ( v^2_\Vert + 2\mu B)$ and $v^2=v^2_\Vert + 2\mu B$ , (3.5) can be written as

(3.6a) \begin{align} \dot {X} &= \hat {\boldsymbol{x}} \boldsymbol{\cdot }v_\Vert \boldsymbol{\hat {b}} = - \xi v \frac {Z}{qR_0} , \end{align}
(3.6b) \begin{align} \dot {Z} = \hat {\boldsymbol{z}} \boldsymbol{\cdot }v_\Vert \boldsymbol{\hat {b}} + \hat {\boldsymbol{z}} \boldsymbol{\cdot }\boldsymbol{v}_{\textit{Ds}} & = \xi v \frac {X}{qR_0} - \frac {1}{2} \rho ^*_i \left ( 1 + \xi ^2 \right ) v^2 \frac {a}{R_0} , \end{align}
(3.6c) \begin{align} \dot {v} &= 0 , \end{align}
(3.6d) \begin{align} \dot {\xi } & = \frac {\partial \xi }{\partial v_\Vert } \frac {\text{d}v_\Vert }{\text{d}t} + \frac {\partial \xi }{\partial \boldsymbol{X}} \boldsymbol{\cdot }\frac {\text{d}\boldsymbol{X}}{\text{d}t} = -\frac {1}{2} \left ( 1 - \xi ^2 \right ) v \frac {Z}{qR_0} \frac {a}{R_0} \frac {B}{B_0} . \end{align}

Here, we normalised time to $a/v_{Ti}$ , where $a$ is the plasma minor radius taken to be equal to $r_{\textit{wall}}$ , speed $v$ has been normalised to $v_{Ti}$ , space to $a$ , and we have defined $\rho ^*_i = \rho _i/a$ , $\rho _i = v_{Ti}/\omega ^{(0)}_{c}$ , $\omega ^{(0)}_c = ZeB_0/m_i$ , where $B_0$ is the magnetic field on axis, and $q$ is the safety factor. For convenience, we have neglected the electric field, and assumed the limit of a small inverse aspect ratio and low- $\beta$ plasma. Specifically, the magnetic field was taken to have the form:

(3.7) \begin{align} \boldsymbol{B} &= I \left ( \psi \right ) \boldsymbol{\nabla }\varphi + \boldsymbol{\nabla }\varphi \times \boldsymbol{\nabla }\psi , \end{align}
(3.8) \begin{align} B_\varphi \left ( r, \theta \right ) &= \frac {B_0}{1+ \epsilon \cos \theta } , \end{align}
(3.9) \begin{align} B_\theta \left ( r, \theta \right ) &= \frac {B_\theta \left ( r\right )}{1+ \epsilon \cos \theta } , \end{align}

with

(3.10) \begin{align} R &= R_0 + r \cos \theta , \end{align}
(3.11) \begin{align} q &= \frac {r B_0}{R_0 B_\theta \left ( r \right )} , \end{align}

and the poloidal flux function is related to the poloidal magnetic field by $\text{d}\psi /\text{d} r = R_0 B_\theta ( r )$ . The magnitude of the total magnetic field can be conveniently written as

(3.12) \begin{align} \frac {B \left ( r,\theta \right )}{B_0} = \frac {R_0}{R} \sqrt {1+ \left ( \frac {r}{q R_0} \right )^2} \approx \frac {R_0}{R} , \end{align}

where in the last line, we take $( a/R_0 )^2 \ll 1$ .

Equations (3.6a ) and (3.6b ) describe the spatial evolution of the energetic ions due to parallel streaming and a vertical drift arising from the grad-B and curvature drifts. Our motivation for neglecting the electric field is that in this initial study, we will assume stationary and homogeneous density, temperature and current profiles, such that no electric field is expected to arise. In the absence of the electric field, the collisionless velocity space dynamics are set by the mirror force, which only impacts the pitch of the ions such that the speed $v$ will be constant. While a more complete set of guiding centre equations can be easily adopted, for this initial study, (3.6) will prove sufficient to capture several non-trivial aspects of the collisionless orbit of energetic ions.

Figure 1. Change of toroidal canonical momentum for (a) 20 keV ions and (b) 50 keV ions. Ten million deuterium ions were initialised randomly across the spatial and pitch domains. The time step was varied from $\Delta t = 0.2$ (blue curve) to $\Delta t = 0.1$ (orange curve) and $\Delta t = 0.05$ (green curve). The tokamak was assumed to have a minor radius of $a=0.5\;[\text{m}]$ , an inverse aspect ratio of $a/R_0 = 1/3$ , a magnetic field of $B_0 = 2\;[\text{T}]$ and a constant safety factor $q=2$ .

Aside from the speed $v$ , this reduced set of guiding centre equations conserves toroidal canonical momentum in the absence of collisions, which can be written as

(3.13) \begin{equation} p_\varphi = m_i R \frac {B_\varphi }{B} v_\Vert - Ze \psi . \end{equation}

For circular flux surfaces and making analogous approximations as the guiding centrer equations (3.6) allows the normalised toroidal canonical momentum to be expressed as

(3.14) \begin{equation} \frac {p_\varphi }{R_0m_iv_{Ti}} = \frac {R}{R_0} \xi v - \frac {1}{2}\frac {1}{\rho ^*_i} \frac {a}{R_0} \frac {r^2}{q} , \end{equation}

where we have taken the safety factor to be a constant for simplicity. Figure 1 provides a numerical demonstration of the conservation of toroidal canonical momentum conservation for ions with 20 and 50 keV. Here, while the error in toroidal canonical momentum conservation grows with time, after a time $2\times 10^6 a/v_{Ti}$ , the error remains below $10^{-2}$ for a time step of $\Delta t = 0.1 a/v_{Ti}$ for 20 keV ions and $\Delta t = 0.05 a/v_{Ti}$ for 50 keV ions. An error of less than $1\,\%$ will be sufficiently accurate for the present study. JONTA simulations described in the remainder of this work will use a time step of $\Delta t = 0.1 a/v_{Ti}$ when studying 20 keV ions, and $\Delta t = 0.05 a/v_{Ti}$ for 50 keV ions.

The collision operator is taken to be a Lorentz operator (2.2), where the deflection frequency in the limit of $v \gg v_{Ti}$ can be expressed as

(3.15) \begin{equation} \frac {a\nu _D}{v_{Ti}} = \left ( a\frac {\hat {\nu }_{si}}{v_{Ti}} \right ) \frac {v^3_{Ti}}{v^3} = \pi \left ( n_i r^2_e a\right ) Z^2_s Z^2_i \left ( \frac {m_e c^2}{T_i} \right )^2 \ln \varLambda \frac {v^3_{Ti}}{v^3} , \end{equation}

where $r_e = 2.8179\times 10^{-15}\;[\text{m}]$ is the classical electron radius, $Z_i$ is the charge state of the background ions and $Z_s$ is the charge state of the energetic ion. The Lorentz collision operator is implemented by the Monte Carlo equivalent (Boozer & Kuo-Petravic Reference Boozer and Kuo-Petravic1981)

(3.16) \begin{equation} \xi _{n+1} = \xi _n \left ( 1 - \nu _D \Delta t \right ) \pm \sqrt {\left ( 1 - \xi ^2_n \right ) \nu _D \Delta t} , \end{equation}

where $\Delta t$ is the collisional time step, which is taken to be the same as the collisionless time step. While the Lorentz collision operator does not account for the slowing down of energetic ions, a critical aspect of describing energetic particle evolution, pitch-angle scattering, will result in collisional cross field transport. The addition of a more complete set of physical processes including 3-D magnetic fields, energetic particle modes and collisional slowing down will be pursued in future work.

Figure 2. (a) Comparison of neoclassical diffusivity computed from JONTA (red ‘x’ markers) and the analytic expression given by (3.17) (solid blue curve). (b) Ratio of the analytic expression for the neoclassical diffusivity to JONTA’s prediction. (c) Example fit of squared radial ion displacement versus time (dashed blue curve) with the numerically computed evolution (solid red curve) for particles with an initial radius of ${\sim} 0.51$ . One million total marker particles were used, randomly distributed over pitch and radius. Spatially resolved estimates of transport were made by binning marker particles based on their initial location. The parameters were chosen to be $a=2\;[\text{m}]$ , $a/R_0 = 1/6$ , $B_0 = 10\;[\text{T}]$ , $n_i = 10^{18}\;[\text{m}^{-3}]$ , $q=2$ , $\ln \varLambda = 15$ , and the ions were assumed to have an energy of $20\;[\text{keV}]$ .

To verify our implementation of the Monte Carlo collision operator, we compared JONTA predictions of neoclassical transport with an analytic expression for the neoclassical diffusivity. Using the radial coordinate $r=\sqrt {X^2+Z^2}$ , rather than the toroidal flux $\psi _t=r^2 B_0/2$ , the expression for the neoclassical diffusivity derived by Rosenbluth, Hazeltine & Hinton (Reference Rosenbluth, Hazeltine and Hinton1972), Boozer & Kuo-Petravic (Reference Boozer and Kuo-Petravic1981) can be written as

(3.17) \begin{equation} \frac {D_{\textit{neo}}}{av_{Ti}} = 0.689 \sqrt {2\varepsilon } q^2 \left ( \frac {R_0}{r} \right )^2 \left ( \frac {m_s v^2}{2T_i} \right ) \left ( \frac {a \nu _D}{v_{Ti}} \right ) , \end{equation}

where $\varepsilon = r/R_0$ and $\nu _D$ is the pitch-angle scattering frequency defined in (3.15). To estimate the neoclassical transport predicted by JONTA, we will consider a large tokamak device at low density. Specifically, we will take the density to be $n_i=10^{18}\;[\text{m}^{-3}]$ , the minor radius $a=2\;[\text{m}]$ , a small inverse aspect ratio of $a/R_0=1/6$ and a large toroidal magnetic field of $B_0 = 10\;[\text{T}]$ . By considering a low density plasma, this ensures the system is in the banana transport regime throughout the majority of the device, whereas the large device size and strong magnetic field results in the banana orbits of the fast ions being much smaller than the system size, a necessary condition for estimating an average diffusivity. A comparison of the neoclassical diffusivity estimated by JONTA with (3.17) is shown as a function of minor radius in figure 2. Here, good agreement is evident between the computed neoclassical diffusivity and the approximate analytic expression. In particular, the maximum deviation is roughly 12 %, with a typical deviation less than 10 %. Note that we have removed the inner 20 % of the plasma radius from the comparison, since the de-trapping time scales with $\nu _D/\varepsilon$ and is thus very short near the magnetic origin, such that this region is not in the banana transport regime.

3.3. Example ion orbits and coordinate conventions

Figure 3. Circular flux surface geometry used in this work and example collisionless ion orbits. (a) Co-current ions with a passing ion with initial pitch $\xi =0.8$ (red curve) and a trapped ion with initial pitch $\xi =0.3$ (green curve). (b) Counter-current ions with a passing ion with initial pitch $\xi =-0.8$ (red curve) and a trapped ion with initial pitch $\xi =-0.3$ (green curve). Flux surface contours are shown in blue. The black marker indicates the initial position of the ions and arrows indicate their direction. The grad-B drift points downward in our example geometry. We assumed a deuterium ion with $20\;\text{keV}$ , a tokamak with a minor radius of $0.5\;\text{m}$ , inverse aspect ratio $a/R_0=1/3$ , constant $q$ -profile of $q=2$ and an on-axis magnetic field strength of $B_0=2\;\text{T}$ .

Before discussing solutions to the adjoint drift kinetic equation, it will be useful to describe the magnetic geometry employed. A plot of the circular flux surface geometry assumed, together with four example ion orbits, are shown in figure 3. Here, the blue contours indicate magnetic flux surfaces, and the red and green curves indicate example passing and trapped ion orbits, respectively. The toroidal magnetic field and current are assumed to come out of the page, with the poloidal magnetic field in the counter-clockwise direction. Passing ions with a positive pitch (co-current) will thus rotate in the counter-clockwise direction when their motion is projected onto the poloidal plane as they propagate along a magnetic field line, with a grad-B drift vertically downward. Ions with a negative pitch (counter-current) will move in a clockwise sense when moving along field lines, but will still have a vertically downward grad-B drift. From figure 3, it is apparent that co-current ions (figure 3 a) born on the weak field side will tend to be better confined compared with counter-current ions (figure 3 b) born at the same location, due to the magnetic drifts taking the counter-current ions closer to the spatial boundary. Furthermore, a counter-current passing ion when scattered into the trapped region will move onto an orbit that takes it closer to the plasma edge, as evident from figure 3(b), such that these ions will be more susceptible to being lost.

4. Mean escape time

Figure 4. Loss histories for (a) 20 keV and (b) 50 keV ions. Solid lines indicate the training loss whereas ‘x’ markers indicate the test loss. Six million training points were used and two million test points. The same seed for the pseudorandom number generator was used in both cases.

4.1. Training details of the PINN

This section will describe solutions for the mean escape time of a fast ion population at 20 and 50 keV using the PINN implementation described in § 3.1, along with a comparison to solutions evaluated with JONTA. Loss histories for ions with 20 and 50 keV are shown in figure 4. For these cases, we assumed a midsize tokamak with a minor radius of $a=0.5\;\text{m}$ , aspect ratio $R_0/a = 3$ and deuterium fast ions. The background deuterium plasma is assumed to have a temperature of $1\;\text{keV}$ , a density of $n_i = 10^{20}\;\text{m}^{-3}$ and the safety factor is taken to be $q=2$ . A representative Coulomb logarithm of $\ln \varLambda = 15$ was used in the collision operator. Each of these quantities were taken to be constant across the plasma radius, though non-trivial spatial profiles could easily be introduced. A fully connected feedforward network with seven hidden layers each with a width of fifty neurons was used. A network of this size, together with six million training points, used the majority of the 180 GB of VRAM present on the NVIDIA Blackwell 200 GPU used to train the network. While we have observed improvements in predictions of the PINN by increasing the size of the network, we found the present network size and number of training points offers a good balance in ensuring dense coverage over the ion phase space, while providing a sufficiently expressive network to approximate the solution. For a network of this size, the inference time on an M3 MacBook Pro was measured to be two microseconds per prediction.

The SOAP optimiser was used for the first 50 000 epochs, with SSBroyden used for the remaining 50 000 epochs. Six million training points were used with two million uniformly distributed test points applied. An adaptive residual based sampling method was employed during the SOAP phase of training (Wu et al. Reference Wu, Zhu, Tan, Kartha and Lu2023), such that training points are redistributed near regions with large residuals. From figure 4, it is evident that SOAP drives a slow decay of the loss, with SSBroyden driving a sharper decline. The periodic spikes in the training loss during the SOAP phase are due to the training points being resampled every 8000 epochs. As a result of training points being concentrated in regions with large residuals, the test loss is consistently lower than the training loss, except for the last 10 000 epochs of training for the case of 20 keV ions. This crossing of the test and training loss is due to a large residual forming at some isolated regions immediately adjacent to $r=a$ and will be discussed further later. We note that the SOAP optimiser being employed in this paper is able to efficiently train on a GPU, whereas the SciPy implementation of SSBroyden that we have employed primarily uses a CPU. As a result, the vast majority of the training time is spent during the SSBroyden phase of optimisation.

Figure 5. (a) Mean escape time for an ion initialised in the counter-current direction with a pitch of $\xi = -0.8$ , after different stages of training. The profile of the ion’s escape time is shown in panels (a–c), with the residual indicated in panels (d–f). The first column is after the SOAP phase of training (50 000 epochs), the second column is 10 000 into the SSBroyden phase (60 000 net epochs) and the final column is after 50 000 epochs of SSBroyden (100 000 epochs total). The quantity plotted is $\text{log}_{10}( 1+ T)$ , yielding a $\text{log}_{10}$ scale for large values of $T$ , but vanishing when $T=0$ . Deuterium ions with $20\;\text{keV}$ were assumed.

Considering the evolution of the predicted solution during training, figure 5(a) shows the predicted mean escape time for a 20 keV ion after the SOAP optimisation phase, 10 000 epochs into the SSBroyden phase (figure 5 b) and the final predicted mean escape time (figure 5 c). A reference solution from JONTA is shown in figure 10(d) for comparison. It is evident that SOAP is able to approximately capture the structure of the solution near the tokamak edge, but fails to recover the long mean escape time of ions initially located at small minor radii. This is due to the large discrepancy in time scales between ions that begin on loss orbits, and are thus lost on a transit time scale, and those initially located deep in the plasma interior. These latter ions must collisionally diffuse out of the plasma and are thus lost on a much longer time scale, roughly $10^5$ slower for the present example. After 10 000 epochs with SSBroyden, the training and test loss of the residual and boundary terms drop further, and it is evident from figure 5 that the relatively long confinement time of ions initially located in the interior is better captured, though its value is still substantially under predicted. Finally, after 50 000 epochs of SSBroyden, the predicted escape time of ions initially located in the interior slowly increases, though the final value is still below that predicted by the particle-based code JONTA. It has been verified that additional training epochs with SSBroyden leads to a more accurate recovery of the mean escape time of the best confined ion orbits, though as described later, begins to overfit the solution near the plasma edge at isolated locations.

Figure 6. Test loss distribution for 20 keV ions. A uniform random distribution with one million points was used. The size of the markers is proportional to the magnitude of the residual.

Turning to the distribution of training losses, two-dimensional projections of the resulting test error are shown in figure 6, with the magnitude of the markers set to be proportional to the magnitude of the residual. Hence, darker regions represent regions with the largest residuals. From figure 6, the outer $30\,\%$ of the tokamak contained the largest losses. This follows due to this region containing direct loss orbits, whose detailed form can be difficult to resolve. It is also evident from figure 6(d), where the magnitude of the residuals are plotted as a function of the minor radius, that for $r\approx a$ , a small number of very large residuals are present, with the largest having a magnitude of $0.589$ located at $r/a \approx 0.998$ and $\theta \approx 3.041$ . These large residuals at the very edge of the plasma are responsible for the increase in the test loss during the last 10 000 epochs of training, but as discussed further later, do not substantially impact the accuracy of the solution throughout the majority of the plasma volume. Finally, regions with $r/a \gt 0.99$ and poloidal angles on either the outboard ( $-\Delta \theta \lt \theta \lt \Delta \theta$ ) or inboard ( $\theta \gt -\pi + \Delta \theta$ and $\theta \lt \pi - \Delta \theta$ ) midplanes were set to zero, leading to empty regions in figure 6(a). For this example, we set $\Delta \theta = 0.1$ . As described further later, these regions contain discontinuities in the solution arising from changes in boundary conditions between the lower and upper halves of the tokamak, which cannot be precisely resolved. By removing training points from this region, this allows the PINN to find a solution that smoothly transitions across these narrow regions. We note that the largest residuals in the test loss are located at poloidal angles just outside of these regions, and are thus due to the sharp variation of the solution between the upper and lower regions of the tokamak near the plasma edge (see figure 9 a).

Figure 7. (a) Escape time for an ion initially located at $\theta = 0$ . The quantity plotted is $\text{log}_{10}( 1+ T)$ , yielding a $\text{log}_{10}$ scale for large values of $T$ , but vanishing when $T=0$ . The white curve indicates the trapped-passing boundary. (b) Residual of (2.6). A deuterium ion with $20\;\text{keV}$ was assumed.

Figure 8. Slices of escape time in units of $\log _{10}(1+T)$ for different initial pitch values: (a) for $\xi =0.8$ ; (b) for $\xi =0.3$ ; (c) for $\xi =-0.3$ ; and (d) for $\xi =-0.8$ . The solid white curve indicates the location of the trapped-passing boundary, where ions born with major radii $R$ to the right of the white curve are trapped.

4.2. Phase space structure of the mean escape time

The phase space distribution of the fast ion mean escape time can be conveniently illustrated by taking slices in the three-dimensional $( R, Z, \xi )$ space. The mean escape time of ions born with $\theta = 0$ (weak field side) is shown in figure 7 on a $\log _{10}$ scale. Here, counter-current ions ( $\xi \lt 0$ ) born in the edge of the plasma are lost quickly due to direct orbit loss, with both co- and counter-current ions initially located deeper in the plasma having much longer mean escape times. Further noting that the white contour indicates the trapped-passing boundary, it is evident that co-current passing ions (positive pitches above the white contour) have far better confinement in comparison to counter-current ions. Similarly, in the trapped region, ions that begin with $\xi \gt 0$ are much better confined compared with ions with $\xi \lt 0$ . Comparing the left and right panels of figure 3, this difference in confinement results from trapped ions born with $\xi \lt 0$ at $\theta = 0$ having orbits that extend out to larger radii (figure 3 b), in contrast to trapped ions born with $\xi \gt 0$ (figure 3 a). The ions with the worst confinement are those just inside the trapped-passing boundary and $\xi \lt 0$ , which are expected to have the largest banana width and thus be most susceptible to orbit loss.

Figure 8 shows slices of major radius $R$ and vertical height $Z$ for different pitches. These slices allow for the difference in confinement of co- and counter-current ions to be investigated more clearly. Considering passing co-current ions ( $\xi =0.8$ ), these ions will propagate in the counter-clockwise direction in figure 8(a). Noting that ion orbits drift downward, this will result in ions that are initially on the outboard side not being on direct loss orbits, in contrast to those born on the inboard side. This leads to a notable in–out asymmetry of the fast ion population, as evident in figure 8(a). This trend is reversed for passing ions that begin in the counter-current direction (figure 8 d). A more subtle feature is that counter-current ions that are not on direct loss orbits are more likely to be lost in comparison to co-current ions. This is evident by the orange region in figure 8(d). The relatively poor confinement of these counter-current ions is due to the collisional trapping of these ions resulting in the orbit exploring larger minor radii (compare the red and green curves in figure 3 b). Hence, passing counter-current ions that are within a banana width of the plasma edge are far more susceptible to being lost to the plasma compared with passing co-current ions.

Figure 9. Slices of escape time in units of $\log _{10}(1+T)$ for different initial radii: (a) for $r/a=0.999$ ; (b) for $r/a=0.95$ ; (c) for $r/a=0.9$ ; and (d) for $r/a=0.8$ . The while curves are the trapped-passing boundary.

Cross-cuts of the pitch and poloidal angle distribution of loss times at different radii are shown in figure 9. Here, when approaching $r/a=1$ , a discontinuity appears in the solution. This is evident in figure 9(a), where for $\theta \approx 0$ and $\theta \approx \pm \pi$ , the solution changes rapidly when traversing these discontinuous regions. The origin of this discontinuity is that for ions in the lower half of the tokamak (i.e. $-\pi \lt \theta \lt 0$ ) and $r/a=1$ , they are lost immediately. In contrast, ions located just above $\theta = 0$ or $\theta = \pm \pi$ are often not on direct loss orbits and, thus, can be confined for a long time. For example, co-current ions just above $\theta = 0$ will drift inward and, thus, have a finite escape time, whereas those just below $\theta = 0$ are lost immediately to the boundary. An analogous discontinuity in the solution is present for $\theta = \pm \pi$ for counter-current ions. To relax these discontinuities, we have removed all training and test points from these regions, such that the PINN is able to approximate these discontinuities with sharply varying, but continuous solutions, as evident in figure 9(a). Moving radially away from the $r/a=1$ boundary, figure 9(b–d) shows the solution at radial locations between $r/a = 0.95$ and $r/a=0.8$ , indicating vastly different mean escape times of ions initially located at different pitch and poloidal angles. In particular, the phase space region of prompt ion loss, which initially occupies roughly half of the $( \xi , \theta )$ domain for $r/a\approx 1$ , shrinks to a narrow sliver just inside the trapped-passing boundary for $r/a=0.8$ . This is indicated in figure 9(d), where the solid white curve indicates the trapped-passing region.

Figure 10. Slices of escape time in units of $\log _{10}(1+T)$ for different initial pitch values and an energy of $20\;\text{keV}$ computed from the JONTA code: (a) for $\xi =0.8$ ; (b) for $\xi =0.3$ ; (c) for $\xi =-0.3$ ; and (d) for $\xi =-0.8$ . Ten million markers were used, where the particles were integrated for $t_{final} = 2\times 10^6$ .

4.3. Comparison with Monte Carlo solutions

The mean escape time can also be computed by the particle-based solver JONTA described in § 3.2. Here, marker particles are initialised randomly across the spatial domain at a given pitch. Once a particle’s minor radius exceeds $r_{\textit{wall}}$ , its escape time and location are saved. By pushing a large number of marker particles, the mean escape time can be computed as a function of the ion’s initial $R$ and $Z$ . Noting the conceptual simplicity of this approach, we will use it to validate the predictions of the PINN. Figure 10 shows four slices in the $( R, Z )$ plane at constant pitch computed from JONTA. Here, ten million marker particles are randomly distributed at a given pitch, with their escape times recorded after they escape from the domain. These escape times are then binned, allowing for the mean escape time to be computed over the $( R, Z )$ plane for a given pitch. The particles were evolved until $t_{final} = 2\times 10^6$ , a time at which nearly all of the original particles remained in the domain. Four cross-cuts are shown in figure 10. These four cross-cuts are for the same physics parameters and pitches as figure 8. Good agreement between the mean escape time predicted by JONTA and the PINN are evident. In particular, both approaches are able to recover the rapid escape of ions that are initiated on direct loss orbits (blue regions), ions lost due to collisional trapping/detrapping directly onto a loss orbit (yellow and green regions) and ions initially located near the magnetic origin that must slowly diffuse out (red regions). The JONTA simulations also reveal a limitation of our implementation of a PINN. In particular, while the PINN is able to distinguish the complex edge structure of confined versus unconfined orbits, with excellent agreement with predictions of JONTA, it struggles to accurately capture the very long mean escape time for ions initially located deep inside the plasma. Specifically, the dark red region evident in figure 10 of the JONTA data is not present in the PINN’s solution (see figure 8), indicating roughly a factor of two and a half discrepancy in predicted confinement for these 20 keV well-confined ions. The reason for this quantitative inaccuracy is that the PINN struggles to simultaneously resolve the fast direct orbit dynamics together with the slow collisional transport of the core ions, though as indicated in figure 5, this discrepancy slowly shrinks as the model is trained. The resolution of this disparate physics will require (i) the inclusion of data in training the PINN, e.g. using data taken directly from JONTA for example, (ii) allowing the PINN to train for more epochs which has been verified to slowly reduce the discrepancy, or (iii) modifications to the training of the PINN. These extensions will be discussed further later.

Figure 11. Comparison of escape time in units of $\log _{10}(1+T)$ predictions from the PINN (top row) and JONTA (bottom row). Ions were assumed to have an initial energy of $50\;\text{keV}$ : (a, e) for $\xi =0.8$ ; (b, f) for $\xi =0.3$ ; (c, g) for $\xi =-0.3$ ; and (d, h) for $\xi =-0.8$ . Ten million markers were used in the JONTA simulations, where the particles were integrated for $t_{final} = 10^7$ .

Now, considering an energetic ion population with $50\;\text{keV}$ , similar physical trends are evident, but the larger ion orbits lead to a larger number of ions beginning on direct loss orbits (see figure 11). In addition, the higher energy of the particles leads to a lower collision time and thus energetic ions initially located in the core have a longer escape time. This latter property results in the problem exhibiting greater stiffness, with a faster bounce time but slower collision time, resulting in the PINN struggling to accurately resolve the escape time of the best confined orbits. This is evident in figure 11, where cross-sections in the $( R, Z )$ plane computed both with JONTA and the PINN are compared. While the PINN fails to capture the precise mean escape time of the best confined particles, it does succeed in accurately capturing the phase space structure of the mean escape time across the edge region, where ions have the shortest mean escape times. It thus acts as an effective tool for delineating well-confined regions of phase space from those where ions are directly lost or lost after a small number of collisional scattering events. We note that while for the case of $20\;\text{keV}$ ions a factor of roughly two and a half difference in the mean escape time was present for the best confined ions, for $50\;\text{keV}$ ions, roughly a factor of five difference in the prediction of the best confined orbits is evident. This larger deviation in the prediction of the best confined ions is due to the increased time scale separation between the collisionless ion orbits and the slow collision time. As discussed further in § 5, while further training or hyperparameter tuning can likely reduce these discrepancies, a robust means of improving the predictions of the PINN will be the incorporation of small quantities of data.

5. Conclusions

The inhomogeneous adjoint of the drift kinetic equation was used to evaluate the mean escape time of energetic ion populations. This quantity, computed in the limit of an axisymmetric tokamak plasma, captures fast ion losses due to direct orbit loss and collisional transport, thus providing a metric for energetic particle confinement. A PINN was used to evaluate the phase space dependence of the mean escape time, a challenging task due to the strong time scale separation between the fast bounce time of energetic ions and their slow collision rate. While the offline training time of the PINN is substantial, the online inference time is rapid, typically microseconds per prediction. Although the PINN failed to quantitatively capture the mean escape time of the best confined ion orbits, it is able to robustly recover the phase space structure of the mean escape time, and thus acts as a means of distinguishing well-confined orbits and phase space regions of prompt ion loss. We also note that the present analysis did not include fast ion slowing down, where the slowing down time for the 20 and 50 keV ions considered in this work would be far shorter than the mean escape times computed in this work for the best confined orbits.

For this proof-of-principle demonstration of the proposed framework, we considered several simplifications to the complete system. Perhaps the primary simplifications were the assumptions of (i) an axisymmetric magnetic geometry, (ii) a simplified collision operator, and (iii) idealised plasma profiles and shaping. Regarding item (i), since PINNs to date have not been used to solve the drift kinetic equation, beginning with an axisymmetry geometry was a logical starting point for assessing this approach’s strengths and weaknesses. We anticipate extending the present approach in future work to 3-D magnetic geometry (Law, Cerfon & Peherstorfer Reference Law, Cerfon and Peherstorfer2022), though such an extension will likely require incorporating particle data into the training of the neural network. The particle-based solver JONTA will provide a convenient tool through which such data can be generated, where by providing an accurate calculation of the mean escape time at specific phase space locations, these may be included in the first term of the loss function defined by (3.1). By providing high fidelity data points at internal locations, for example, in regions with large residuals or for the best confined ion orbits where the present implementation of a PINN has struggled, this will likely expedite the training of the PINN while providing a systematic means of improving its accuracy. Our aim will be to leverage data as a means of both treating more challenging scenarios, such as non-axisymmetric geometry, along with facilitating the learning of a parametric solution to the drift kinetic equation, enabling the mean escape time to be evaluated for a broad range of plasma conditions.

An additional simplification was the use of a Lorentz collision operator, which does not account for the slowing down of energetic particles, a quantity of critical interest when assessing the amount of energy deposited by energetic ions into the bulk plasma. Despite this limitation, including Lorentz collisions allows for the collisional transport of fast ions to be assessed. Thus, while not providing an accurate indication of the mean escape time for well-confined energetic particles, since such particles will likely slow down substantially during their long confinement time, it does provide a means of rapidly assessing particles on direct loss orbits, or those that are directly lost after being collisionally scattered into or out of the trapped particle region. The predictions of the present work will thus be most relevant to the outer region of the tokamak, where energetic particles are lost on relatively short time scales. Finally, homogeneous plasma profiles and a circularly shaped plasma were assumed in the present work. This was done for convenience, where we anticipate including more realistic plasma profiles to not pose a substantial challenge. Furthermore, since PINNs are mesh free and are thus well suited to treating shaped magnetic geometry (van Milligen et al. Reference van Milligen, Tribaldos and Jiménez1995; Kaltsas & Throumoulopoulos Reference Kaltsas and Throumoulopoulos2022; Jang et al. Reference Jang, Kaptanoglu, Gaur, Pan, Landreman and Dorland2024), we do not expect the treatment of shaped axisymmetric geometry to pose a major hurdle. Similarly, strong radial electric fields that emerge in the pedestal region (Brzozowski et al. Reference Brzozowski2019) along with realistic equilibrium density and temperature profiles could easily be incorporated.

Due to the strong time scale separation, evaluating fast ion transport is computationally intensive, creating a bottleneck in many optimisation workflows. For the JONTA simulations presented in this paper, the shortest simulation required one and a half hours on a single NVIDIA Blackwell 200 GPU, with the longest requiring slightly over a day. Noting that each JONTA simulation was for a fixed value of pitch, computing four distinct pitches (figures 10 and 11) thus required several hours to days of computing time. A PINN, which encompasses all values of pitch, and thus only a single PINN must be trained, required approximately three days to train, where approximate results obtained after the SOAP phase of training typically take a few hours. Both approaches are thus computationally intensive, where while the PINN is able to recover smooth predictions of the mean escape time, it is unable to accurately compute the best confined ions. JONTA, in contrast, requires roughly ten million markers to obtain a relatively smooth, well-resolved solution. We note that while the PINN also used a single NVIDIA Blackwell 200 GPU, the implementation of the SSBroyden optimiser does not make efficient use of the GPU, and thus was mostly trained using a CPU, which was the primary bottleneck in the training of the PINN.

We also note that while the present paper has focused on the mean escape time of energetic ions, several straightforward extensions are possible. One such extension would be to treat a broader range of fast ion metrics. Metrics of interest include the probability that an ion impacts a specific region of the wall from a given initial phase space location (Salewski et al. Reference Salewski2011; Särkimäki Reference Särkimäki2020) (see also Yang et al. (Reference Yang, Liu, Del-Castillo-Negrete, Cao and Zhang2026) for a runaway electron example), or the slowing down probability, which corresponds to the probability that an ion slows down to the thermal energy. Both of these fast ion metrics can be straightforwardly constructed from the steady-state homogeneous adjoint equation by applying appropriate boundary conditions. The evaluation of this broader class of fast ion metrics will be left for future work. Finally, while we have considered energetic ions in the present formulation, extensions to the relativistic drift kinetic equation describing runaway electron (RE) evolution and transport can be developed following an analogous approach. Such extensions will allow for ‘neoclassical’ corrections to RE generation rates to be inferred (Rosenbluth & Putvinski Reference Rosenbluth and Putvinski1997; McDevitt & Tang Reference McDevitt and Tang2019; Arnaud & McDevitt Reference Arnaud and McDevitt2024), or account for spatial transport due to processes such as the Ware pinch (Nilsson et al. Reference van Milligen, Tribaldos and Jiménez2015; McDevitt, Guo & Tang Reference McDevitt, Guo and Tang2019). While we anticipate that the extremely short bounce time of REs compared with their long collision times will pose a substantial challenge compared with the purely momentum space treatment employed by Arnaud et al. (Reference Arnaud and McDevitt2024), McDevitt et al. (Reference McDevitt and Tang2025), we note that the impact of ‘neoclassical’ effects on RE generation rates are most important near the critical energy for electrons to run away (McDevitt & Tang Reference McDevitt and Tang2019; Arnaud & McDevitt Reference Arnaud and McDevitt2024), an energy that often ranges between several keV to tens of keV during a tokamak disruption. At these modest energies, the time scale separation between the transit or bounce time of an RE and the collision time will be drastically reduced. The extension of the present framework to REs will be pursued in future work.

Acknowledgements

The authors acknowledge the University of Florida Research Computing for providing computational resources that have contributed to the research results reported in this publication.

Editor Tünde Fülöp thanks the referees for their advice in evaluating this article.

Funding

This work was supported by the Department of Energy, Office of Fusion Energy Sciences under awards DE-SC0024634 and DE-SC0024649, and the University of Florida Division of Sponsored Research Projects.

Data availability statement

The version of JONTA used in this manuscript, together with the PINN training script are available on github.com/cmcdevitt2/DeepPlasma.

Declaration of interests

The authors report no conflict of interest.

References

Adam, J. 1987 Review of tokamak plasma heating by wave damping in the ion cyclotron range of frequency. Plasma Phys. Control. Fusion 29, 443.10.1088/0741-3335/29/4/001CrossRefGoogle Scholar
Al-Baali, M., Spedicato, E. & Maggioni, F. 2014 Broyden’s quasi-Newton methods for a nonlinear system of equations and unconstrained optimization: a review and open problems. Optim. Methods Softw. 29, 937954.10.1080/10556788.2013.856909CrossRefGoogle Scholar
Arnaud, J.S., Mark, T. & McDevitt, C.J. 2024 A physics-constrained deep learning surrogate model of the runaway electron avalanche growth rate. J. Plasma Phys. 90, 905900409.10.1017/S0022377824000679CrossRefGoogle Scholar
Arnaud, J.S. & McDevitt, C.J. 2024 The impact of collisionality on the runaway electron avalanche during a tokamak disruption. Phys. Plasmas 31, 062509.10.1063/5.0198338CrossRefGoogle Scholar
Arnaud, J.S., Tang, X.-Z. & McDevitt, C.J. 2025 A runaway electron avalanche surrogate for partially ionized plasmas. Nucl. Fusion 65, 106013.10.1088/1741-4326/ae00dbCrossRefGoogle Scholar
Bonofiglo, P. et al. 2024 Alpha particle loss measurements and analysis in jet dt plasmas. Nucl. Fusion 64, 096038.10.1088/1741-4326/ad69a1CrossRefGoogle Scholar
Boozer, A.H. & Kuo-Petravic, G. 1981 Monte carlo evaluation of transport coefficients. Phys. Fluids 24, 851859.10.1063/1.863445CrossRefGoogle Scholar
Bradbury, J. et al. 2018 Jax: composable transformations of python+ numpy programs.Google Scholar
Brzozowski, R.W. et al. 2019 A geometric model of ion orbit loss under the influence of a radial electric field. Phys. Plasmas 26, 042511.10.1063/1.5075613CrossRefGoogle Scholar
Cai, S., Mao, Z., Wang, Z., Yin, M. & Karniadakis, G.E. 2021 Physics-informed neural networks (pinns) for fluid mechanics: a review. Acta Mechanica Sin. 37, 17271738.10.1007/s10409-021-01148-1CrossRefGoogle Scholar
Carolipio, E., Heidbrink, W., Forest, C. & White, R. 2002 Simulations of beam ion transport during tearing modesin the diii-d tokamak. Nucl. Fusion 42, 853.10.1088/0029-5515/42/7/308CrossRefGoogle Scholar
Cuomo, S., Di Cola, V.S., Giampaolo, F., Rozza, G., Raissi, M. & Piccialli, F. 2022 Scientific machine learning through physics–informed neural networks: Where we are and what’s next. J. Sci. Comput. 92, 88.10.1007/s10915-022-01939-zCrossRefGoogle Scholar
Fasoli, A. et al. 2007 Physics of energetic ions. Nucl. Fusion 47, S264.10.1088/0029-5515/47/6/S05CrossRefGoogle Scholar
Galeev, A.A., Sagdeev, R., Furth, H. & Rosenbluth, M. 1969 Plasma diffusion in a toroidal stellarator. Phys. Rev. Lett. 22, 511.10.1103/PhysRevLett.22.511CrossRefGoogle Scholar
Goldston, R.J., White, R. & Boozer, A.H. 1981 Confinement of high-energy trapped particles in tokamaks. Phys. Rev. Lett. 47, 647.10.1103/PhysRevLett.47.647CrossRefGoogle Scholar
Grasman, J. et al. 2013 Asymptotic Methods for the Fokker–Planck Equation and the Exit Problem in Applications. Springer Science & Business Media.Google Scholar
Heidbrink, W. & White, R. 2020 Mechanisms of energetic-particle transport in magnetically confined plasmas. Phys. Plasmas 27, 030901.CrossRefGoogle Scholar
Helander, P. & Sigmar, D.J. 2002 Collisional Transport in Magnetized Plasmas. Cambridge University Press.Google Scholar
Hemsworth, R., Tanga, A. & Antoni, V. 2008 Status of the iter neutral beam injection system. Rev. Sci. Instrum. 79, 02C109.CrossRefGoogle ScholarPubMed
Hinton, F. & Chu, M. 1985 Neoclassical ion transport through the separatrix in divertor tokamaks. Nucl. Fusion 25, 345.10.1088/0029-5515/25/3/010CrossRefGoogle Scholar
Hirvijoki, E., Asunta, O., Koskela, T., Kurki-Suonio, T., Miettunen, J., Sipilä, S., Snicker, A. & Äkäslompolo, S. 2014 Ascot: solving the kinetic equation of minority particle species in tokamak plasmas. Comput. Phys. Commun. 185, 13101321.CrossRefGoogle Scholar
Jang, B., Kaptanoglu, A.A., Gaur, R., Pan, S., Landreman, M. & Dorland, W. 2024 Grad–shafranov equilibria via data-free physics informed neural networks. Phys. Plasmas 31, 032510.10.1063/5.0188634CrossRefGoogle Scholar
Kaltsas, D. & Throumoulopoulos, G. 2022 Neural network tokamak equilibria with incompressible flows. Phys. Plasmas 29, 022506.10.1063/5.0073033CrossRefGoogle Scholar
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S. & Yang, L. 2021 Physics-informed machine learning. Nat. Rev. Phys. 3, 422440.10.1038/s42254-021-00314-5CrossRefGoogle Scholar
Kazakov, Y.O. et al. 2017 Efficient generation of energetic ions in multi-ion plasmas by radio-frequency heating. Nat. Phys. 13, 973978.10.1038/nphys4167CrossRefGoogle Scholar
Kingma, D.P. & Ba, J. 2014 Adam: a method for stochastic optimization. arXiv preprint arXiv: 1412.6980.Google Scholar
Kiyani, E., Shukla, K., Urbán, J.F., Darbon, J. & Karniadakis, G.E. 2025 Optimizing the optimizer for physics-informed neural networks and Kolmogorov–Arnold networks. Comput. Meth. Appl. Mech. Engng 446, 118308.10.1016/j.cma.2025.118308CrossRefGoogle Scholar
Law, F., Cerfon, A. & Peherstorfer, B. 2022 Accelerating the estimation of collisionless energetic particle confinement statistics in stellarators using multifidelity monte carlo. Nucl. Fusion 62, 076019.10.1088/1741-4326/ac4777CrossRefGoogle Scholar
Liu, C., Brennan, D.P., Boozer, A.H. & Bhattacharjee, A. 2017 Adjoint method and runaway electron avalanche. Plasma Phys. Control. Fusion 59, 024003.CrossRefGoogle Scholar
Liu, D.C. & Nocedal, J. 1989 On the limited memory bfgs method for large scale optimization. Math. Program 45, 503528.10.1007/BF01589116CrossRefGoogle Scholar
Luo, W., Ren, J.-x., Chen, Z., Mao, R., Zhang, G., Wang, Y. & Tang, H. 2025 Parametric study of plasma swirling flow based on magnetohydrodynamics informed deep learning. Phys. Fluids 37, 077106.10.1063/5.0264343CrossRefGoogle Scholar
Mathews, A., Hughes, J.W., Terry, J.L. & Baek, S.-G. 2022 Deep electric field predictions by drift-reduced braginskii theory with plasma-neutral interactions based on experimental images of boundary turbulence. Phys. Rev. Lett. 129, 235002.10.1103/PhysRevLett.129.235002CrossRefGoogle ScholarPubMed
McDevitt, C.J. 2023 A physics-informed deep learning model of the hot tail runaway electron seed. Phys. Plasmas 30, 092501.10.1063/5.0164712CrossRefGoogle Scholar
McDevitt, C.J., Arnaud, J.S. & Tang, X.-Z. 2025 A physics-constrained deep learning treatment of runaway electron dynamics. Phys. Plasmas 32, 042503.10.1063/5.0253370CrossRefGoogle Scholar
McDevitt, C., Fowler, E. & Roy, S. 2024 Physics-constrained deep learning of incompressible cavity flows. In AIAA SCITECH, Forum, p. 1692.Google Scholar
McDevitt, C.J., Guo, Z. & Tang, X. 2019 Spatial transport of runaway electrons in axisymmetric tokamak plasmas. Plasma Phys. Control. Fusion 61, 024004.10.1088/1361-6587/aaf4d1CrossRefGoogle Scholar
McDevitt, C. & Tang, X.-Z. 2019 Runaway electron generation in axisymmetric tokamak geometry. EPL (Europhys. Lett.) 127, 45001.10.1209/0295-5075/127/45001CrossRefGoogle Scholar
McDevitt, C.J. & Tang, X.-Z. 2024 A physics-informed deep learning description of knudsen layer reactivity reduction. Phys. Plasmas 31, 062701.10.1063/5.0207372CrossRefGoogle Scholar
Nilsson, E., Decker, J., Fisch, N.J. & Peysson, Y. 2015 Trapped-electron runaway effect. J. Plasma Phys. 81, 475810403.10.1017/S0022377815000446CrossRefGoogle Scholar
Paszke, A. et al. 2019 Pytorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32.Google Scholar
Raissi, M., Perdikaris, P. & Karniadakis, G.E. 2019 Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686707.10.1016/j.jcp.2018.10.045CrossRefGoogle Scholar
Rosenbluth, M., Hazeltine, R. & Hinton, F.L. 1972 Plasma transport in toroidal confinement systems. Phys. Fluids 15, 116140.10.1063/1.1693728CrossRefGoogle Scholar
Rosenbluth, M. & Putvinski, S. 1997 Theory for avalanche of runaway electrons in tokamaks. Nucl. Fusion 37, 1355.10.1088/0029-5515/37/10/I03CrossRefGoogle Scholar
Salewski, M. et al. 2011 On velocity space interrogation regions of fast-ion collective thomson scattering at iter. Nucl. Fusion 51, 083014.10.1088/0029-5515/51/8/083014CrossRefGoogle Scholar
Sun, L., Gao, H., Pan, S. & Wang, J.-X. 2020 Surrogate modeling for fluid flows based on physics-constrained deep learning without simulation data. Comput. Method. Appl. Mech. Engng 361, 112732.CrossRefGoogle Scholar
Särkimäki, K. 2020 Efficient and rigorous evaluation of fast particle losses in non-axisymmetric tokamak plasmas. Nucl. Fusion 60, 036002.10.1088/1741-4326/ab60d0CrossRefGoogle Scholar
Toscano, J.D., Oommen, V., Varghese, A.J., Zou, Z., Ahmadi Daryakenari, N., Wu, C. & Karniadakis, G.E. 2025 From pinns to pikans: recent advances in physics-informed machine learning. Mach. Learn. Comput. Sci. Engng 1, 143.Google Scholar
Trott, C.R. et al. 2021 Kokkos 3: Programming model extensions for the exascale era. IEEE Trans. Parallel Distrib. Syst. 33, 805817.10.1109/TPDS.2021.3097283CrossRefGoogle Scholar
Urbán, J.F., Stefanou, P. & Pons, J.A. 2025 Unveiling the optimization process of physics informed neural networks: how accurate and competitive can pinns be? J. Comput. Phys. 523, 113656.10.1016/j.jcp.2024.113656CrossRefGoogle Scholar
van Milligen, B.P., Tribaldos, V. & Jiménez, J. 1995 Neural network differential equation and plasma equilibrium solver. Phys. Rev. Lett. 75, 3594.10.1103/PhysRevLett.75.3594CrossRefGoogle Scholar
Vyas, N., Morwani, D., Zhao, R., Kwun, M., Shapira, I., Brandfonbrener, D., Janson, L. & Kakade, S. 2024 Soap: improving and stabilizing shampoo using adam. arXiv preprint arXiv: 2409.11321.Google Scholar
Wang, S., Bhartari, A.K., Li, B. & Perdikaris, P. 2025 Gradient alignment in physics-informed neural networks: a second-order optimization perspective. arXiv preprint arXiv: 2502.00604.Google Scholar
Wang, S., Sankaran, S., Wang, H. & Perdikaris, P. 2023 An expert’s guide to training physics-informed neural networks. arXiv preprint arXiv: 2308.08468.Google Scholar
White, R.t & Chance, M. 1984 Hamiltonian guiding center drift orbit calculation for toroidal plasmas of arbitrary cross section. Tech. Rep. Princeton Plasma Physics Lab. (PPPL).10.2172/5168290CrossRefGoogle Scholar
Wu, C., Zhu, M., Tan, Q., Kartha, Y. & Lu, L. 2023 A comprehensive study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks. Comput. Meth. Appl. Mech. Engng 403, 115671.10.1016/j.cma.2022.115671CrossRefGoogle Scholar
Yang, M., Liu, Y., Del-Castillo-Negrete, D., Cao, Y. & Zhang, G. 2026 Generative ai models for learning flow maps of stochastic dynamical systems in bounded domains. J. Comput. Phys. 544, 114434.10.1016/j.jcp.2025.114434CrossRefGoogle Scholar
Figure 0

Figure 1. Change of toroidal canonical momentum for (a) 20 keV ions and (b) 50 keV ions. Ten million deuterium ions were initialised randomly across the spatial and pitch domains. The time step was varied from $\Delta t = 0.2$ (blue curve) to $\Delta t = 0.1$ (orange curve) and $\Delta t = 0.05$ (green curve). The tokamak was assumed to have a minor radius of $a=0.5\;[\text{m}]$, an inverse aspect ratio of $a/R_0 = 1/3$, a magnetic field of $B_0 = 2\;[\text{T}]$ and a constant safety factor $q=2$.

Figure 1

Figure 2. (a) Comparison of neoclassical diffusivity computed from JONTA (red ‘x’ markers) and the analytic expression given by (3.17) (solid blue curve). (b) Ratio of the analytic expression for the neoclassical diffusivity to JONTA’s prediction. (c) Example fit of squared radial ion displacement versus time (dashed blue curve) with the numerically computed evolution (solid red curve) for particles with an initial radius of ${\sim} 0.51$. One million total marker particles were used, randomly distributed over pitch and radius. Spatially resolved estimates of transport were made by binning marker particles based on their initial location. The parameters were chosen to be $a=2\;[\text{m}]$, $a/R_0 = 1/6$, $B_0 = 10\;[\text{T}]$, $n_i = 10^{18}\;[\text{m}^{-3}]$, $q=2$, $\ln \varLambda = 15$, and the ions were assumed to have an energy of $20\;[\text{keV}]$.

Figure 2

Figure 3. Circular flux surface geometry used in this work and example collisionless ion orbits. (a) Co-current ions with a passing ion with initial pitch $\xi =0.8$ (red curve) and a trapped ion with initial pitch $\xi =0.3$ (green curve). (b) Counter-current ions with a passing ion with initial pitch $\xi =-0.8$ (red curve) and a trapped ion with initial pitch $\xi =-0.3$ (green curve). Flux surface contours are shown in blue. The black marker indicates the initial position of the ions and arrows indicate their direction. The grad-B drift points downward in our example geometry. We assumed a deuterium ion with $20\;\text{keV}$, a tokamak with a minor radius of $0.5\;\text{m}$, inverse aspect ratio $a/R_0=1/3$, constant $q$-profile of $q=2$ and an on-axis magnetic field strength of $B_0=2\;\text{T}$.

Figure 3

Figure 4. Loss histories for (a) 20 keV and (b) 50 keV ions. Solid lines indicate the training loss whereas ‘x’ markers indicate the test loss. Six million training points were used and two million test points. The same seed for the pseudorandom number generator was used in both cases.

Figure 4

Figure 5. (a) Mean escape time for an ion initialised in the counter-current direction with a pitch of $\xi = -0.8$, after different stages of training. The profile of the ion’s escape time is shown in panels (a–c), with the residual indicated in panels (d–f). The first column is after the SOAP phase of training (50 000 epochs), the second column is 10 000 into the SSBroyden phase (60 000 net epochs) and the final column is after 50 000 epochs of SSBroyden (100 000 epochs total). The quantity plotted is $\text{log}_{10}( 1+ T)$, yielding a $\text{log}_{10}$ scale for large values of $T$, but vanishing when $T=0$. Deuterium ions with $20\;\text{keV}$ were assumed.

Figure 5

Figure 6. Test loss distribution for 20 keV ions. A uniform random distribution with one million points was used. The size of the markers is proportional to the magnitude of the residual.

Figure 6

Figure 7. (a) Escape time for an ion initially located at $\theta = 0$. The quantity plotted is $\text{log}_{10}( 1+ T)$, yielding a $\text{log}_{10}$ scale for large values of $T$, but vanishing when $T=0$. The white curve indicates the trapped-passing boundary. (b) Residual of (2.6). A deuterium ion with $20\;\text{keV}$ was assumed.

Figure 7

Figure 8. Slices of escape time in units of $\log _{10}(1+T)$ for different initial pitch values: (a) for $\xi =0.8$; (b) for $\xi =0.3$; (c) for $\xi =-0.3$; and (d) for $\xi =-0.8$. The solid white curve indicates the location of the trapped-passing boundary, where ions born with major radii $R$ to the right of the white curve are trapped.

Figure 8

Figure 9. Slices of escape time in units of $\log _{10}(1+T)$ for different initial radii: (a) for $r/a=0.999$; (b) for $r/a=0.95$; (c) for $r/a=0.9$; and (d) for $r/a=0.8$. The while curves are the trapped-passing boundary.

Figure 9

Figure 10. Slices of escape time in units of $\log _{10}(1+T)$ for different initial pitch values and an energy of $20\;\text{keV}$ computed from the JONTA code: (a) for $\xi =0.8$; (b) for $\xi =0.3$; (c) for $\xi =-0.3$; and (d) for $\xi =-0.8$. Ten million markers were used, where the particles were integrated for $t_{final} = 2\times 10^6$.

Figure 10

Figure 11. Comparison of escape time in units of $\log _{10}(1+T)$ predictions from the PINN (top row) and JONTA (bottom row). Ions were assumed to have an initial energy of $50\;\text{keV}$: (a, e) for $\xi =0.8$; (b, f) for $\xi =0.3$; (c, g) for $\xi =-0.3$; and (d, h) for $\xi =-0.8$. Ten million markers were used in the JONTA simulations, where the particles were integrated for $t_{final} = 10^7$.