Fluid and kinetic studies of tokamak disruptions using Bayesian optimization

When simulating runaway electron dynamics in tokamak disruptions, fluid models with lower numerical cost are often preferred to more accurate kinetic models. The aim of this work is to compare fluid and kinetic simulations of a large variety of different disruption scenarios in ITER. We consider both non-activated and activated scenarios; for the latter, we derive and implement kinetic sources for the Compton scattering and tritium beta decay runaway electron generation mechanisms in our simulation tool Dream (Hoppe et al., Comput. Phys. Commun., vol. 268, 2021, 108098). To achieve a diverse set of disruption scenarios, Bayesian optimization is used to explore a range of massive material injection densities for deuterium and neon. The cost function is designed to distinguish between successful and unsuccessful disruption mitigation based on the runaway current, current quench time and transported fraction of the heat loss. In the non-activated scenarios, we find that fluid and kinetic disruption simulations can have significantly different runaway electron dynamics, due to an overestimation of the runaway seed by the fluid model. The primary cause of this is that the fluid hot-tail generation model neglects superthermal electron transport losses during the thermal quench. In the activated scenarios, the fluid and kinetic models give similar predictions, which can be explained by the significant influence of the activated sources on the runaway dynamics and the seed.


Introduction
Plasma-terminating disruptions are one of the main challenges facing tokamak fusion reactors.Disruptions are off-normal events, leading to a sudden cooling of the plasmaa thermal quench (TQ) -associated with an increase in the plasma resistivity, causing the plasma current to decay over the longer current quench (CQ).The toroidal current cannot change significantly on the short TQ time scale, and therefore an inductive electric field is produced, which can accelerate electrons to relativistic energies (Helander et al. 2002;Breizman et al. 2019).There are several problems associated with disruptions, including heat loads on the plasma facing components and mechanical stresses due to electromagnetic forces (Hollmann et al. 2015).One of the most significant problems, however, is the generation of highly energetic runaway electron (RE) beams.
Runaway generation is expected to be a major problem in the next generation of machines with high plasma currents, such as ITER (Hender et al. 2007).The reason is that avalanche multiplication of seed REs is exponentially sensitive to the initial plasma current (Rosenbluth & Putvinski 1997).In addition, during deuterium-tritium operation, REs can also be generated by Compton scattering of γ photons and tritium beta decay.These activated generation sources will represent two irreducible sources of energetic electrons and are therefore expected to be the dominant RE seed generation mechanisms in future reactor-scale tokamaks when hot-tail and Dreicer generation have been successfully mitigated.An uncontrolled, localized loss of a relativistic runaway beam may result in damage to the first wall.One of the most prominent mitigation methods is massive material injection (MMI) of a combination of radiating impurities (e.g.neon) and hydrogen isotopes, either in gaseous or solid state.Shattered pellet injection is the reference concept for the disruption mitigation system in ITER.However, as of yet, the problem of designing a successful disruption mitigation strategy is still unsolved.
Several sophisticated simulation codes have been developed in order to study disruptions.One such code is Dream (Hoppe et al. 2021a), that is primarily developed for the study of REs.Dream has several options for simulating electron dynamics -ranging from fluid to fully kinetic models.Dream has already been used to study several aspects of tokamak disruption mitigation -including finding parameter spaces without REs for SPARC (Tinguely et al. 2021;Izzo et al. 2022;Tinguely et al. 2023a), STEP (Berger et al. 2022) and ITER (Pusztai et al. 2023).These studies used RE fluid models, i.e. the momentum-space runaway dynamics is replaced by formulae for the steady-state runaway generation rates as functions of the background parameters.
Kinetic plasma models are physically more rigorous and can, in certain scenarios, yield significantly different results compared with those obtained from fluid models.In kinetic models, the Dreicer and hot-tail RE generation mechanisms are naturally included because the dynamics in the electron velocity space is modelled by the Fokker-Planck collision operator.In certain cases, there may be substantial discrepancies between the kinetic and fluid generation rates.For example, in Dream, the hot-tail generation mechanism is implemented as a generation rate based on a simplified model (Smith & Verwichte 2008), which is derived under the assumption that there are no magnetic perturbations.This model can significantly overestimate the RE generation rate when magnetic perturbations are present.Pusztai et al. (2023) used the fluid RE model in Dream to explore the injected density space of MMI, with Bayesian optimization, to investigate scenarios for disruption mitigation in ITER.However, such a comprehensive exploration of the parameter space has not been performed using kinetic modelling.Even though isolated studies using kinetic models have been done (Vallhagen et al. 2022;Svenningsson et al. 2021), a thorough exploration of the MMI injected density space would not be feasible for a fully kinetic model, due to the many simulations needed combined with the high computational cost of fully kinetic simulations.Using an isotropic kinetic model for such exploration is however feasible, i.e. the model which evolves a distribution function which is analytically averaged over pitch angle.Such a procedure is justified for the mildly superthermal region where the complicated hot-tail dynamics takes place, while the relativistic electrons can again be treated in a simplified manner.The isotropic kinetic model in Dream is several orders of magnitude faster than the corresponding fully kinetic model (Hoppe et al. 2021a).
The aim of this article is to compare fluid and kinetic disruption simulations for an ITER-like disruption scenario.We explore the injected density space of MMI, using Bayesian optimization for sample selection.We start by deriving kinetic expressions for the RE generation sources from tritium beta decay and Compton scattering ( § 2).Then we proceed by constructing an informative cost function ( § 3.2) for efficient sample selection during the Bayesian optimization ( § 3).
Our results show that in the non-activated scenarios, the fluid and kinetic models yield significantly different optima ( § 4).This can be explained mainly by the generation rate of the runaway seed being overestimated when using the fluid model.In the activated scenarios, the fluid and kinetic models yield more similar results.In these cases, the Compton generation mechanism is dominant, which is well represented in the fluid model.

Kinetic runaway electron generation sources
Runaway electron generation is an inherently kinetic process, as it depends on details in the velocity distribution function that are determined by a balance between collisional friction, electric-field acceleration and radiation processes.With the kinetic models in Dream, the generation of REs from the Dreicer and hot-tail mechanisms is included in the dynamics modelled by the Fokker-Planck collision operator, while REs generated via avalanche multiplication are accounted for using a semi-analytical growth rate by Hesslow et al. (2019a) which has been benchmarked against kinetic simulations.
The activated generation processes, Compton scattering and tritium beta decay, provide energetic electrons that are continually generated over a range of electron velocities.Under the assumption that these generation mechanisms are isotropic in velocity space, kinetic sources can be derived using the same reasoning as for their fluid counterparts (Martín-Solís et al. 2017).For tritium beta decay, this assumption is justified since the emission angle of electrons during this process is random and uniformly distributed.For Compton scattering, gamma photons will be emitted from the walls and enter the plasma from all sides in a nearly isotropic manner, even traversing walls, because the plasma is optically thin to the photons.Therefore the resulting electron distribution can be assumed to be approximately uncorrelated with the pitch angle.

Tritium beta decay
The kinetic source rate due to tritium decay rate (df /dt) T can be derived using the expression for the energy spectrum of electrons emitted through beta decay (Krane 1988) Here W = m e c 2 γ is the total energy, W = m e c 2 (γ − 1) is the kinetic energy, γ = p 2 + 1 is the Lorentz factor and p is the momentum normalized to m e c.Since the maximum energy of the emitted electrons is W max = 18.6 keV ≪ m e c 2 , we may take the nonrelativistic limit of the Fermi function (Fermi 1934), with the normalized speed β = p/γ, the charge of the final state nucleus Z f (for tritium decay Z f = 2), and the fine structure constant α ≈ 1/137.Since the source is isotropic, we may write Using dW/dp = m e c 2 β, we find The requirement that the source integrated over the entire momentum space yields the tritium decay rate, (ln 2) n T /τ T -where τ T ≈ 4500 days is the half-life for tritiumdetermines the proportionality factor in (2.4) such that with the constant where p max is the momentum corresponding to W max .

Compton scattering rate
The kinetic Compton scattering rate can be calculated using the Compton cross-section and the gamma energy flux spectrum emitted by a given wall material.When using the kinematic relation between the photon energy W γ , and the energy W and deflection angle θ of the scattered electron, we can write where W ′ γ = W γ − W is the scattered photon energy.The Compton scattering process is described by the Klein-Nishina differential cross-section with r e = e 2 /4πε 0 m e c 2 being the classical electron radius.Thus, in a plasma with the total electron density n e,tot , i.e. the electron population of both free and bound electrons, and isotropic photon distribution, we can write where Γ γ (W γ ) is the gamma energy flux spectrum.The lower integration limit W γ0 can be derived from equation (2.7) as (2.10) with the total flux Γ flux = 10 18 /(m 2 s).This applies for an H-mode discharge at 15 MA and 500 MW fusion power.For an L-mode case, when the plasma energy is lower, the total flux would be approximately 25 % of that of the H-mode (Martín-Solís et al. 2017).
Part of the flux is caused by the time-integrated activation of the walls, and the typical ratio between the prompt gamma flux from fusion neutrons and the gamma flux after the fusion reactions cease is 1000.These calculations were performed for a beryllium first wall.ITER currently plans to have a first wall made of tungsten, which might change the photon flux and spectrum.
The runaway production rate from Compton scattering can be obtained as where we have used Both the kinetic tritium and Compton sources are consistent with the fluid runaway generation rate (Martín-Solís et al. 2017) when the runaway region is set to be above the critical momentum for RE generation p c .

Bayesian optimization of simulated disruptions
In this paper, we will compare disruption simulations using an isotropic reduced kinetic model, described in Appendix B.2 of (Hoppe et al. 2021a), and a computationally less expensive fluid model.The comparison of fluid and kinetic simulations for a wide variety of disruption scenarios was facilitated using Bayesian optimization.The disruption model and simulation details used are described in § 3.1, while the Bayesian optimization is detailed in § 3.3.In § 3.2, the design and motivation of the cost function is presented.

Simulation set-up
The simulations are performed using the numerical tool Dream (Disruption Runaway Electron Analysis Model).Dream simulates the plasma evolution of tokamak disruptions self-consistently, including the evolution of the plasma temperature, current densities, electron and ion densities, ion charge states, and electric field.The simulations are performed with a multi-fluid plasma model, with the possibility of simulating some, or all, subsets of the total electron population kinetically.Dream divides the electrons into up to three populations.
Both in the fluid and the kinetic simulations performed here, the bulk and RE populations are treated as distinct fluids.The bulk electrons are characterized by their density n e , temperature T e and the Ohmic current density j Ω .The REs are described by their density n re , which also yields the runaway current density j re = ecn re , because they are assumed to move with the speed of light parallel to the magnetic field.In reactor-scale tokamak disruptions, this approximation, i.e. v re = c, is typically valid, as supported by (Buchholz 2023).The total current density j Ω + j re is governed by the evolution of the poloidal flux ψ(r) (Hoppe et al. 2021a).For the electrical conductivity σ which enters in this evolution, we use the model described by Redl et al. (2021), that is valid for arbitrary plasma shaping and collisionality, and takes into account the effects of trapping.Neutral collisions -which are only relevant at very low post-TQ temperatures that yield strongly unfavourable outcomes -are not accounted for in the resistivity, as they are not relevant in the parameter ranges of interest (Vallhagen et al. 2020).
In the kinetic simulations, there is an additional population of superthermal electrons that are represented by the distribution function f hot , which is evolved using a pitchangle averaged Fokker-Planck equation (Hoppe et al. 2021a).This division of the hot and cold populations is especially suitable when there are two distinct Maxwellian electron populations present in the plasma, as is the case when cold materials are injected into a hot plasma during massive material injection.
The temperature of the cold population is determined by a time dependent energy balance equation, accounting for Ohmic heating, line and recombination radiation and bremsstrahlung, as well as a radial heat transport, according to equation ( 43) of (Hoppe et al. 2021a).The ionization, recombination and radiation rates are taken from the ADAS database for neon and the AMJUEL database † for deuterium.AMJUEL contains coefficients which account for opacity to Lyman radiation, which is important at high deuterium densities (Vallhagen et al. 2022) ‡.The cold electron density is constrained by the condition of quasi-neutrality.
In the kinetic model, the Dreicer and hot-tail generation mechanisms are naturally present as velocity space particle fluxes, while in the fluid model, they are both implemented as generation rates.The Dreicer generation rate is computed by a neural network (Hesslow et al. 2019b) if the input parameters are within the training interval.However, for low electric fields, outside the training interval (E/E D < 0.01), the values given by the neural network are inaccurate.In such cases, an analytic extrapolation is performed to recover a similar asymptotic behaviour as to the Connor-Hastie formula (Connor & Hastie 1975) Here, A and b are positive parameters fitted to the dependence of the neural network on the electric field for E/E D > 0.01, with the density and temperature fixed at the values of the current time step.They are fitted under the constraint that (dn re /dt) Dreicer (E/E D ) should be once differentiable for all values of E/E D .
The hot-tail generation rate is evaluated from the model distribution function of Smith & Verwichte (2008) and using the formula derived by Svenningsson (2020) -it is presented in Appendix C.4 of (Hoppe et al. 2021a).In both fluid and kinetic models, the avalanche generation is determined by the model by Hesslow et al. (2019a).
In the fluid simulations, the activated generation sources from tritium decay and Compton scattering are evaluated according to Vallhagen et al. (2020); in the kinetic simulations we use the sources derived in § 2. During the TQ, the photon flux is set to Γ flux = 10 18 /(m 2 s) to be consistent with the prediction of the photon flux at ITER for an H-mode discharge at 15 MA and 500 MW fusion power (Martín-Solís et al. 2017).However, with the drop in plasma temperature, the fusion power ceases.For this reason, the photon flux after the TQ is decreased by a factor of 10 3 , as was done by Vallhagen et al. (2024).
The disruption simulations are performed in an ITER-like tokamak scenario with either a deuterium plasma of density 10 20 m −3 for the non-activated scenarios, or a deuteriumtritium plasma with the same total ion density and 50-50 % isotope concentrations.In both scenarios, the fuel density is spatially uniform.Discharges without tritium are simulated for 200 ms, while activated discharges are simulated for 400 ms, in order to capture the interesting runaway current dynamics in both scenarios.† Documentation for the AMJUEL database: www.eirene.de/Documentation/amjuel.pdf.
‡ The effective ionization loss rate, including the line radiation emitted during the bound-bound transitions leading up to the ionization, was adjusted to enforce that it only takes values larger than the ionization potential for each ionization event.This adjustment was only active at electron densities ∼ 10 22 m −3 and temperatures ≲ 1 eV, close to the limits of the validity range for the provided polynomial fit.
The magnetic field B 0 = 5.3 T on axis, the resistive wall time τ w = 0.5 s, the major radius R 0 = 6 m and the plasma minor radius a = 2 m.Pusztai et al. (2023) evaluated the effective radius of the first toroidally closed conducting structure was evaluated by matching the poloidal magnetic energy to that in simulated ITER discharges, yielding b = 2.833 m (Vallhagen et al. 2024).However, during disruptions, the plasma can be vertically displaced, resulting in a continuous contraction of the confined region, and a corresponding reduction of the poloidal magnetic field energy reservoir that contributes to runaway generation.In this case a more tightly fit wall (b = 2.15 m) might be more appropriate.Simulations are therefore performed with both of these values of the conducting wall radius, to bracket a plausible range of magnetic to kinetic energy conversion.Furthermore, the geometry is determined by realistic and radially varying shaping parameters congruent with the Miller model equilibrium (Miller et al. 1998), used also by Pusztai et al. (2023).
At t = 0, the plasma current I p = 15 MA with current density profile ĵ(r) = [1 − (r/a) 2 ] 0.41 , the plasma temperature is parabolic with T (r = 0) = 20 keV, and the plasma is fully ionized.The deuterium and neon of the MMI are assumed to consist of cold atom populations, deposited instantaneously and homogeneously, as was done by Pusztai et al. (2023).
To account for the effect of the stochastic magnetic field during the TQ, we employ spatially and temporally constant magnetic perturbations of δB/B = 0.5 %.Such a value of δB/B is consistent with representative transport levels in MHD simulations of ITER disruptions (Hu et al. 2021).Although we do not consider different values of δB/B here, we note that lower values tend to reduce conducted heat losses as well as the runaway seed, thereby increasing the region in parameter space that yields tolerable disruptions, while the location of the optimum is less affected, consistent with the findings of (Pusztai et al. 2023).Following the Rechester-Rosenbluth model (1978), this magnetic perturbation leads to a transport of hot electrons, REs and electron heat, with a diffusion coefficient D ∝ v ∥ R 0 (δB/B) 2 .Here, v ∥ is replaced by v te = 2T e /m e (local electron thermal speed) for the heat transport and by the speed of light c for the RE transport, where we assume a parallel streaming of the REs along the perturbed field lines.As noted by Pusztai et al. (2023), this approach gives an upper bound on the effect of the RE transport as it does not account for the energy and angular dependence of the RE distribution, as well as finite Larmor radius and orbit width effects.In the kinetic simulations, the parallel velocity dependence for the hot electrons is determined by f hot .
The transport of superthermal and runaway electrons is switched off during the current quench, to signify that the magnetic flux surfaces are reformed.We still use a small but finite heat transport corresponding to δB/B = 0.04 % to avoid the development of nonphysical narrow hot Ohmic channels (Putvinski et al. 1997;Fehér et al. 2011).As long as such channels do not try to form, this low transport level is subdominant to radiative heat losses at post-TQ temperatures.In the absence of a clear, widely accepted definition of the end of the TQ, in the simulations, we define the end of the thermal quench by the mean temperature falling below 20 eV.We discard simulations where the TQ condition has not been fulfilled within 20 ms, which is a factor of ∼ 10 greater than the anticipated TQ time of 1-2 ms according to (Hollmann et al. 2015).In such cases, we can assume that the TQ was not successfully completed and other complications might arise, such as reheating and unsuccessful current quench.
We note, that the criterion used to switch the energetic electron transport off and reduce the heat transport is a source of uncertainty, as the time it takes for flux surfaces to reform might be delayed somewhat after the temperature has reached its post-TQ value.Clearly, longer windows of runaway loss are expected to reduce the final RE current.
However, the presence of a relatively small reformed region may be sufficient to support significant RE currents (Tinguely et al. 2023b).

Cost function
The cost function of an optimization problem is critical for the optimization to yield meaningful results and, here, the cost function should reliably quantify the risks associated with tokamak disruptions, in the form of a single scalar.First, the runaway current should be accounted for in the cost function.We will consider two plausible choices for representing the runaway current contribution to the cost function -the maximum runaway current I max re = max t I re (t) and the runaway current at the time when it makes up 95 % of the plasma current I 95 % re = I re (t Ire=0.95Ip ), which we will call the 95-percent runaway current.To account for the advantages and disadvantages of both options (discussed in Appendix A), we choose to use the first occurrence of I 95 % re unless I max re occurs before I 95 % re or if I re (t) < 0.95I p (t) throughout the simulation.This will be called the representative runaway current I repr re .Additionally, any remaining Ohmic current at the end of the simulation has the potential of being converted into runaway current, as well as indicating if the termination of the discharge was successful.Consequently it should also be included in the cost function.Based on (Lehnen 2021), the upper safe operational limit of the runaway current is 150 kA, which will also be used as the upper safe operational limit for the Ohmic current.
Another indication of the success of the disruption mitigation system is the duration of the CQ (CQ time).In this work, the CQ time is evaluated through extrapolation, as τ CQ = (t I Ω =3 MA − t I Ω =12 MA )/0.6 if the Ohmic current drops below 3 MA during the simulation and, otherwise, τ CQ = (t final − t I Ω =12 MA )/(0.8 − I final Ω /I t=0 p ).Here t I Ω =3 MA (t I Ω =12 MA ) is the time when I Ω is 20 % (80 %) of the initial plasma current.As stated by Hollmann et al. (2015), its safe operational interval is [50,150] ms for 15 MA discharges.
Finally, the last figure of merit considered is the transported heat loss fraction η cond , meaning the fraction of the initial plasma kinetic energy which has been lost from the plasma due to energy transport.It is accounted for using the Rechester-Rosenbluth transport in the collisionless limit, which means that the radial heat diffusion is caused by parallel streaming of electrons along perturbed field lines (Rechester & Rosenbluth 1978), as detailed in § 3.1.For a safe disruption, the transported heat load fraction should be below 10 %, according to Hollmann et al. (2015).
Our cost function L(I repr re , I final Ω , τ CQ , η cond ) is designed to yield a value below one if all components are within their intervals of safe operation.To achieve this, I repr re , I final Ω and η cond are normalized to their respective safe operational limits.The CQ time is shifted and normalized according to |τ CQ − 100 ms|/50 ms, in order to achieve the same behaviour for both the lower and upper limit of safe operation.The four figures of merit can be combined using different orders of p-norms, and here we have chosen to use the Euclidean norm (p = 2).In Appendix A we discuss the impact of different choices for p. = 50 kA and η cond = 3 %), whereas the latter would be preferable in an actual disruption.More specifically, the relevant consideration for the CQ time is that it is between 50 ms and 150 ms (and preferably with some margin for uncertainties), not necessarily that it is as close to the middle of this safe interval (i.e. 100 ms) as possible.For the RE current however, it is preferable that it is as close to zero as possible, even if RE currents up to 150 kA are tolerable.Furthermore, minimizing I repr re was in this work deemed more important than minimizing η cond .To achieve this order of significance when the figures of merit are within their safety intervals, we introduced non-linear (but smooth) rescaling functions f i , which take the form for the currents k I = 1 and x I = I/150 kA, for the conducted heat load k η cond = 3 and x η cond = η cond /10 %, and for the CQ time k τCQ = 6 and x τCQ = |τ CQ − 100 ms|/50 ms.
Note that this implementation of f i is differentiable.
The final expression for the cost function can thus be written as Each of the components are weighted equally, with weight C = 0.5, to ensure values below one for scenarios of safe operation.
For some simulations, the cost function cannot be evaluated -mainly due to lack of convergence or incomplete thermal quench within 20 ms.In these cases, the cost function was set to a high value (L = 75 ≫ 1) in order to decrease the probability of the optimizer exploring the surrounding area further.

Optimization set-up
Bayesian optimization is an optimization strategy which uses Bayesian inference to estimate a probability distribution function for the entire objective (or cost) function based on a prior set of samples (input-output pairs).This is achieved by assuming that all unknowns of the system to be inferred are random variables.
To make viable predictions for the objective function, an assumption of the nature of its probability distribution is needed.For this, Bayesian optimization uses stochastic processes, which are infinite collections of random variables (Garnett 2023).A common stochastic process used for Bayesian optimization is the Gaussian process (GP), in which the random variables are distributed according to multivariate Gaussian distributions.In the context of Bayesian optimization, a GP is specified by a mean function µ(x), which determines the expected objective function value at any point x of the search space, and a covariance function ϕ(x, x ′ ), which measures the correlation between points x and x ′ (Garnett 2023).Consequently, this optimization method not only yields an optimum, but also an estimate of the objective function itself in the form of the mean function µ(x) as well as its uncertainty.
During the optimization, the optimizer advances by taking samples of the objective function and successively updating its sample set and subsequently its GP estimate of the probability distribution function.The strategy for choosing which point to sample next is defined by the acquisition function, which uses the GP estimate to determine which point has the highest potential of improving the current optimum.
In this work, the Bayesian optimization was performed using the Python package by Nogueira (2014) which uses GPs.The Matérn kernel with smoothness parameter ν = 3/2 was used for the GPs and the expected improvement acquisition function was used as the optimization strategy (Garnett 2023).

Disruption mitigation optimization
The goal of this section is to compare fluid and kinetic simulations of MMI scenarios for a range of injected material densities.Since the RE dynamics are different for discharges with and without tritium, we will do this comparison both with and without activated sources, in § 4.2 and § 4.1, respectively.While the main focus will be on the differences between the models, we will also discuss the simulated disruption outcomes, including a comparison of our results with those of Pusztai et al. (2023).We will furthermore comment on the impact of choice of wall radius on the results.The code used to produce the results in this section, as well as the optimization data, can be found at Zenodo †.

Optimization of non-activated discharges
The non-activated scenarios are characterized by only having the Dreicer and hot-tail generation mechanisms as seed generation sources ‡, which tend to be more susceptible to runaway mitigation measures, as we shall see.For the non-activated scenarios, the optimization was performed on the MMI densities within log 15,20] for the fluid model.For the kinetic model, the intervals were set as log 17.5, 20].The total injected number of atoms of ∼ 10 25 corresponding to the upper end the deuterium density ranges, 10 22 m −3 , is an approximate upper bound on material assimilation using shattered pellet injection at ITER (Vallhagen et al. 2024).When using the fluid (kinetic) model, the optimization was performed using 250 (200) disruption simulations.
Figures 1(a) and 1(b) show the results from the optimization of the non-activated scenario for both the fluid and kinetic models, respectively.The most prominent difference between the results of the fluid and kinetic models is that there exists a region of safe operation (as indicated by blue shades in the figure) for the simulations done with the † Simulation data for this article: https://doi.org/10.5281/zenodo.10998363‡ Note, that in the non-activated cases we assume that the gamma flux from the wall -which might be induced by D-D neutron bombardment -is sufficiently small that the contribution of the corresponding Compton seed to the final runaway current is negligible compared with that of the non-activated sources.We find that the main difference between the two models is the runaway current.This is illustrated in figure 2 17.5, 20] square, but for the kinetic simulations, it covers all but the upper right corner.Thus, with the kinetic model, we find that the band of safe operation that can be observed in figure 1.b corresponds to the overlap of I repr re < 150 kA and η cond < 10 %, which are the two competing components.Similarly, the optimum found using the fluid model is located at a point between the region of safety for I repr re and η cond , since here there is no overlap.It is worth noting, however, that such a region of overlap of I repr re < 150 kA and η cond < 10 % might not be possible with a tungsten wall, as the additional high-Z impurity can have a more significant negative impact concerning the runaway current (Walkowiak et al. 2024), than the improvement the extra radiative losses represent for η cond .
The shapes of the region of safety for both I final Ω and τ CQ are also noticeably different, though not to the same degree as for I repr re .This is simply a consequence of the difference in runaway current.In cases when less of the Ohmic current is transformed into runaway current, the Ohmic current is higher at the end of the simulation, and the CQ time is therefore longer.This causes both of the regions of I final Ω < 150 kA and 50 ms < τ CQ < 150 ms to shift upwards when the I repr re < 150 kA region expands.The choice of plasma model does not have a significant impact on the conducted heat load -for both models, the safe region covers the upper right corner of the region explored with the kinetic model.
The results shown in figures 1 and 2 correspond to simulations with a wall radius of 2.833 m, but the results are qualitatively similar for a wall radius of 2.15 m -the corresponding landscapes as presented in figure 2  is that with the smaller wall radius, the currents are in general lower and the CQ times are shorter.In figure 3, the plasma and runaway currents are presented for the optimum found using the kinetic model, as well as fluid simulations for the same parameters using a = 2.833 m and a = 2.15 m.With the larger wall radius, I repr re ≈ 4 MA, while the smaller wall radius (which also uses a resistive wall time of 0.5 s) yields I repr re ≈ 400 kA.This dramatic reduction in the representative runaway current caused by placing the conducting wall near the plasma, thereby reducing the magnetic energy reservoir for runaway acceleration, is consistent with recent findings reported by Pusztai et al. (2023) and Vallhagen et al. (2024).The reason to consider such a reduced wall radius -compared with the nominal one matching the total available magnetic energy content in ITER -is to emulate the reduction of the magnetic energy that is converted to runaway acceleration in a vertical displacement event, as the confined region is gradually scraped off from the plasma channel.This approach can give a lower bound on the representative runaway current.
The difference in runaway current for the two models is dominated by differences in the modelling of the hot-tail RE-generation mechanism, which is highly dependent on the energy distribution of the superthermal electrons.In the kinetic simulations, f hot describes the energy distribution of the superthermal electrons, which enables the hot-tail generation to be included in the velocity flux determined by the Fokker-Planck collision operator.This energy distribution of the superthermal electrons is not evolved in the fluid simulations and, therefore, a model for the energy distribution based on the simulated plasma parameters is needed.Such a model, with the purpose of being used to determine the hot-tail generation rate for fluid simulations, was derived by Smith & Verwichte (2008).They neglect the electric field and diffusion terms in the Fokker-Planck equation, and instead only retain collisional friction, which gives the solution where the critical momentum p fl c is defined as the momentum at which the electric field acceleration and collisional friction balance each other; see equation (C.24) in (Hoppe et al. 2021a).† The optimum found using the kinetic model is a clear example of the fluid and kinetic models yielding significantly different RE-current dynamics, as shown in figure 3, due to the difference in hot-tail generation implementation in the two models.For the same MMI densities in a fluid simulation, we obtain a several MA runaway current, where the seed REs are exclusively generated by the hot-tail mechanism, while the kinetic simulation gives practically zero runaway current.In figure 4, the distribution function derived by Smith & Verwichte (2008) and the distribution function evolved in the kinetic simulation are plotted just before and at end of the TQ, for r = 0.85 m which is the radial position with the highest hot-tail generation rate in the fluid simulation.The two distribution functions f hot and f fl hot are similar initially, as shown in figure 4.a.The discrepancy at p ≳ 1 can be fully explained by f hot being a Maxwell-Jüttner distribution function, while f fl hot is a non-relativistic Maxwell-Boltzmann distribution.For this radius, the hot-tail generation peaks at the end of the TQ in the fluid simulation and then the two distribution functions are significantly different, as shown in figure 4.b.Here f fl hot ≫ f hot around p fl c (by many orders of magnitude; note the logscale on a wide range), which can be explained by the kinetic simulations accounting for a radial diffusive transport of f hot due to magnetic perturbations, while the evaluation of f fl hot does not.A generalization of the fluid hot-tail rate, accounting for transport of the hot electrons, would not be straightforward, due to the velocity dependence of the Rechester-Rosenbluth transport.The neglect of this transport will lead to a significant overestimation of the runaway seed due to the factor of f fl hot (t, p fl c ) in equation (4.2).This was confirmed by comparing the fluid and kinetic simulations with the radial transport removed.As shown in figure 1, the optima found with both the fluid and kinetic models are located at high deuterium densities.As noted in § 3.1, at high deuterium densities, the effects of accounting for opacity to Lyman radiation become important.When not accounting for opacity to Lyman radiation at these optimum, the temperature decays faster, resulting in a significantly larger runaway current, which is in agreement with the conclusions of Vallhagen et al. (2022).
In conclusion, the model distribution function used by the fluid model lacks electric field acceleration and radial transport effects, and it is the transport that dominates the differences observed between the two models.We note, that previous work showed that if the losses due to magnetic perturbations are neglected, i.e. taking into account all the hot-tail electrons, the final runaway current is overestimated by a factor of approximately four in ASDEX Upgrade (Hoppe et al. 2021b), which is in qualitative agreement with our conclusions here.

Optimization of activated discharges
In activated scenarios, the RE seed is expected to be dominated by the generation of REs from tritium beta decay and Compton scattering in cases when the hot-tail and Dreicer generation sources have been successfully mitigated.The same intervals and number of samples were used for the optimization of the activated scenarios as for the non-activated scenarios, except that the neon density interval was changed to log n Ne /[1 m −3 ] ∈ [15, 17.5] for the optimization with the kinetic model, in order to include regions of lower runaway current.
Figure 5 shows the result from the optimizations of the activated scenarios.There are no regions of safe operation, neither when using the fluid nor the kinetic model -in fact, L > 10 (based on the mean prediction µ) everywhere for both models, and both models yield almost identical results.From the optimization using the fluid model, there are two local minima, which both are around L = 15.Since the cost function was designed to distinguish safe from unsafe disruptions, and this distinction happens around L ∼ 1, it is not as reliable at determining which of the two disruptions is better if their cost function Table 1: Disruption figures of merits for fluid and kinetic simulations for parameters indicated in figure 6 with the same markers.The optimization using the kinetic model presented in figure 5.b has not explored the region around the global minimum found with the fluid model, thus its minimum has the same location as the other local minimum found using the fluid model.However, the location of this global optimum did not change when performing a similar optimization using the kinetic model which covered regions surrounding both optima found using the fluid model.
That the two plasma models yield very similar results is also clear from figure 6, where the regions of safe operation for each figure of merit are plotted.The two models yield qualitatively the same results, since the shapes of the safe regions are remarkably alike to the point where it is difficult to distinguish differences due to plasma models from stochastic differences of the sampled locations.
In table 1, fluid and kinetic simulations are compared for different combinations of the deuterium and neon densities, including the two optima (the locations in n D -n Ne space are indicated by symbols in figure 6).The difference between the kinetic and fluid models are in most cases just a few percent, and the relative difference is larger when the figures of merit are small.The largest relative error occurs for the conducted heat load, since it is consistently a few percent larger in the kinetic case; this can be explained by the kinetic simulation capturing the transport of the hot electrons.† In absolute terms, only the runaway current varies significantly between the kinetic and fluid results -for large runaway current cases, the difference can be ∼ 150 kA, while the relative difference is small.As shown in table 1, the kinetic simulations have a higher runaway current in all cases but one.The reason for the kinetic model yielding larger runaway current is indirectly due to the activated sources.The direct generation of electrons above the critical momentum from the activated sources is practically identical for both the fluid and kinetic simulations.In the kinetic simulations however, the hot electrons generated by the activated sources with momenta below the critical momentum will cause an increase in RE generation due to flux in velocity space.
The dynamics of the RE seed generation in the activated scenarios are substantially different from those in the non-activated scenarios.In the non-activated case, the seed generation was dominated by the hot-tail mechanism.For an activated scenario, f hot (which includes not only the hot-tail, but also superthermal electrons generated by the Dreicer, Compton scattering and tritium decay mechanisms) will not be depleted due to magnetic perturbations, because there will be a constant generation of hot electrons due to the activated sources.This means that the runaway seed generation will not necessarily be overestimated with the fluid model, as it was in the non-activated case.
In figure 7, the distribution functions f hot and f fl hot are plotted at the beginning and end of the TQ, at r = 1.05 m, for the optimal activated scenario found using the fluid model.
We see that, in contrast to the non-activated case, f fl hot ≪ f hot around p fl c at the end of the TQ for the activated case (note the scale of the y-axis).The fluid simulation should still overestimate the runaway seed generation rate due to the neglect of superthermal electron transport in the fluid hot-tail model, which should lead to f fl hot ≫ f hot as in the non-activated cases.However, the fluid model does not capture the additional dynamics in the superthermal momentum range caused by the generation of hot electrons from the activated sources, as discussed above, and this explains why f fl hot ≪ f hot in the activated scenarios.Furthermore, for the activated cases, the inaccurate runaway seed generation will have a smaller impact on the runaway current evolution since the activated sources, and particularly the Compton source, will constitute a significant fraction of the seed.For both optima found in the activated scenarios, the Compton scattering generation mechanism dominates the runaway seed generation by several orders of magnitude.
The physical model and methodology of this study are similar to that of Pusztai et al. (2023), but the cost function landscapes and the found optima are significantly different from those presented in that work.The most significant difference compared with the simulations presented by Pusztai et al. (2023)  where Θ(τ ) = 1 2 [1 + tanh(τ /3.3 ms)].In (4.3a),I max re is used instead of I repr re , but this does not have a significant impact on the results.Rather, differences between how τ CQ and η cond are treated are responsible for most of the difference in the results.
First, the step function behaviour of θ(τ CQ ) limits the optimization of the other parameters to only the region of safe operation for τ CQ .All samples with τ CQ / ∈ [50, 150] ms will have L IP > 100, which is an order of magnitude larger than the optimum found by Pusztai et al. (2023), making it difficult to distinguish the impact of the other components on the cost function value.
Second, the transported heat fraction is weighted one order of magnitude higher than the runaway current and final Ohmic current.This means that η cond = 10 % will have the same impact on L IP as I max re = 1.5 MA, but while η cond = 10 % is marginally tolerable, I max re = 1.5 MA is not.The cost function differences explain the difference in landscape of L IP compared with L here.Furthermore, it explains why no case with I max re < 4 MA is found by Pusztai et al. (2023) -the cases in table 1 with I max re < 4 MA have τ CQ > 150 ms and η cond ≈ 80 %.This yields L IP ∼ 200 which is a rather high value of the cost function according to the scale in figure 1 of (Pusztai et al. 2023).
Apart from the difference in cost function landscape, the optimum found using the fluid model is shifted to higher values of n Ne and lower values of n D compared with the optimum found by Pusztai et al. (2023).The main reason for this shift is the decrease of the photon flux from Γ flux = 10 18 /(m 2 s) to Γ flux = 10 15 /(m 2 s) after the TQ, which was not accounted for by Pusztai et al. (2023).This change in the photon flux leads to a delay in the runaway current ramp up, resulting in a longer decay time for the Ohmic current.Therefore, τ CQ is increased from 49 ms to 67 ms, and thus would have a large impact also on L IP .
An important similarity between our result and those of Pusztai et al. (2023) is that we do not find any n D -n Ne combinations yielding tolerable or close to tolerable figures of merit in the activated case.For the full explored region, L ≫ 1, and as we mentioned The optimal samples found during the optimizations using the fluid (kinetic) model is indicated by a black star (diamond), while the other cases in table 1 are indicated as black markers of different shapes.Samples located outside of the plotted domains are placed at the edges.earlier, the cost function was not designed to be particularly informative at these values.
To study correlations and trade-offs between different figures of merit, another approach is needed.
We can use that each simulation which we have performed corresponds to a point in the four dimensional parameter space spanned by our four figures of merit, i.e.I repr re , I final Ω , τ CQ , η cond .In figure 8, these points have been projected on two-dimensional subspaces spanned by pairs of the figures of merit, which illustrates potential tradeoffs between each pair.Figure 8(a-c) clarify that for scenarios with I repr re < 1 MA, we get τ CQ > 400 ms and η cond > 75 %.More specifically, the correlation between I repr re and η cond suggests that it is not possible to achieve moderate values for the runaway current and the transported heat fraction simultaneously -when I repr re < 4 MA, then η cond > 75 %, and vice versa.This is related to the almost rectangular envelope bounding the point cloud from the side of low values in the I repr re -η cond panel.Disregarding the runaway current, there are some points of safety overlap for the other three figures of merit; see the lower panels of figure 8.All simulations with τ CQ ∈ [50, 150] ms have I final Ω < 150 kA, and since there is a slight overlap between τ CQ ∈ [50, 150] ms and η cond < 10 %, here I final Ω < 150 kA as well.This can be related back to figure 6, where we see a small region of overlap for safe values of τ CQ and η cond , as well as the safe region for τ CQ being almost fully covered by the safe region for I final Ω .
Being able to visualize correlations between the cost function components and which regions of the n D -n Ne space correspond to safe regions of the figures of merits is among the benefits of using a stochastic and exploratory optimization algorithm such as Bayesian optimization.This feature has enabled the exploration of how kinetic and fluid simulations compare for a large variety of scenarios, while simultaneously yielding information on the disruption evolution not just for the most optimal case but for the full region of interest in the n D -n Ne space.Such comprehensive comparison or thorough exploration of MMI of deuterium and neon would not have been possible with a more traditional optimization algorithm, such as gradient descent or the comparison of just a few scenarios.

Conclusions
We have compared kinetic and fluid simulations of disruptions with massive material injection for an ITER-like tokamak set-up.With this objective in mind, we derived and implemented kinetic runaway electron generation sources for the Compton scattering and tritium beta decay mechanisms.Furthermore, we investigated how the hot-tail generation compares between the two models, both for activated and non-activated scenarios.Comprehensive fluid-kinetic comparisons were facilitated using Bayesian optimization of the injected densities of deuterium and neon, which enabled exploration of a large variety of different disruption scenarios.
We found that while kinetic and fluid disruption simulations representing activated scenarios generated similar results, there were major differences for the non-activated scenarios, and this was caused by differences in the fluid and kinetic hot-tail generation dynamics.Since the kinetic model evolves the distribution function of the superthermal electrons in the simulations, it takes transport of these particles due to magnetic perturbations into account, while the model distribution used to derive the hot-tail generation rate used in the fluid simulations does not.This has a high impact on the non-activated simulations since this transport leads to a depletion of the distribution function, which limits the runaway seed generation in the kinetic simulations.
In the activated case, however, there is a constant generation of superthermal electrons in the kinetic simulations due to the activated sources, which acts against the depletion due to transport, and thus makes the difference between the fluid and kinetic models less severe.Furthermore, the additional generation of runaway electrons from the activated sources reduces the impact of the hot-tail and Dreicer generation on the magnitude of the runaway current during the CQ.This limits the significance of the overestimation of the runaway seed generation due to the neglect of superthermal electron transport in the fluid hot-tail model.
The cost function used for the Bayesian optimization was carefully designed, with the feature of safe disruption simulations corresponding to cost function values below unity.This had a significant impact on the cost function landscape in the explored MMI density region and the optima found during the optimization, when compared to a similar optimization done by Pusztai et al. (2023).
Outstanding issues that remain to be investigated include the effect of vertical displacements events and other instabilities on the runaway electron dynamics.Additionally, there might be remnant magnetic perturbations even after the thermal quench, which will affect the final runaway current.Finally, the injected material will in reality not be distributed uniformly.The material injection can also be divided into several stages, and recent studies indicate that multi-pellet injections are advantageous for disruption mitigation (Vallhagen et al. 2024).The impact of radial inhomogeneities in the assimi-lated material were, to some extent, explored by Pusztai et al. (2023), finding beneficial effects of outward peaking neon and inward peaking deuterium distributions.However, such inhomogeneities might be less consequential for the runaway current evolution in the presence of an effective particle transport during the TQ (Vallhagen et al. 2024).
The final consideration in the design of our cost function is the relative importance of the components.With what we have discussed so far, a disruption with I repr re = 120 kA and τ CQ = 100 ms would yield the same value of L as I repr re = 0 kA and τ CQ = 60 ms (if I final Ω

Figure 1 :
Figure 1: Logarithmic contour plots of the cost function estimate µ for the non-activated scenario, generated using (a) the fluid, and (b) the kinetic model in Dream.Note that the colour mapping is adapted such that blue shades represent regions of safe operation.The black star (diamond) indicates the optimal samples found using the fluid (kinetic) model, while the black dots indicate all optimization samples.The grey area covers the region of incomplete TQ.

Figure 2 :
Figure 2: Regions of safe operation (shaded) for the non-activated case with regards to I repr re (red), I final Ω (blue), τ CQ (green) and η cond (yellow).Panel (a) uses fluid simulations, and panel (b) uses kinetic simulations.The optimal samples are indicated by a star in panel (a) and a diamond in panel (b).Note that only in the kinetic case, there is a finite intersection of all the regions of safe operation.
, where the safe regions of each of the cost function components are indicated by the coloured areas.For the fluid model, the safe region of I repr re only covers the lower part of the log n D /[1 m −3 ] ∈ [21, 22] and log n Ne /[1 m −3 ] ∈ [

Figure 3 :
Figure 3: Total plasma current (solid) and runaway current (dashed) of the optimum found using the kinetic model, simulated with (a) the kinetic model and (b) the fluid model and a = 2.833 m, as well as (c) the fluid model and a = 2.15 m.

Figure 4 :
Figure 4: The distribution function for the optimal non-activated case, found using the kinetic model (a) just before and (b) at the end of the TQ.The distribution function f hot (dashed) has been evolved in the kinetic simulation and f fl hot (solid) is the model distribution used in the fluid simulation, given by equation (4.1).The latter is based on values of the plasma parameters evolved in the fluid simulation.

Figure 5 :
Figure 5: Logarithmic contour plots of the cost function estimate µ for the activated scenario, generated using (a) the fluid, and (b) the kinetic model in Dream.The black star (diamond) indicates the optimal samples found using the fluid (kinetic) model, while the black dots indicate all optimization samples.

Figure 6 :
Figure 6: Regions of safe operation (shaded) for the activated case with regards to I repr re

Figure 7 :
Figure 7: The distribution function for the optimal activated case found using the fluid model (a) just before and (b) at the end of the TQ.The distribution function f hot (dashed) has been evolved in the kinetic simulation and f fl hot (solid) is the model distribution used in the fluid simulation, given by equation (4.1).The latter is based on values of the plasma parameters evolved in the fluid simulation.

Figure 8 :
Figure 8: Projections of the simulation dataset to all the two-dimensional subspaces of the figure of merit space (I repr re , I final Ω , τ CQ , η cond ), corresponding to the optimizations of the activated scenario.Both kinetic (blue) and fluid (red) simulations are shown.The intervals of safe operation for each cost function component is indicated by the black lines.This figure illustrates the trade-off between the different cost function components.The optimal samples found during the optimizations using the fluid (kinetic) model is indicated by a black star (diamond), while the other cases in table 1 are indicated as black markers of different shapes.Samples located outside of the plotted domains are placed at the edges.

Figure 9 :
Figure 9: (a) A runaway plateau is reached during the disruption, such that the runaway current is slowly increasing and I max re depends on the simulation length.If the simulation is stopped at 200 ms (150 ms), the blue (purple) star marks I max re .(b) Criterion for I 95 % re

Figure 10 :Figure 11 :
Figure 10: Illustration of the relation between p-norm and accuracy of L ⩽ 1 implying safety.For this example, the cost function consists of three components f 1 , f 2 and f 3 .The blue box represents the safe operational region and the red surface implies the L = 1 surface.Note that the red surface intersects the axes at 1/C.