
How does ion temperature gradient turbulence depend on magnetic geometry? Insights from data and machine learning

Published online by Cambridge University Press:  07 August 2025

Matt Landreman*
Affiliation:
Institute for Research in Electronics & Applied Physics, University of Maryland, College Park, MD 20742, USA
Jong Youl Choi
Affiliation:
Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Caio Alves
Affiliation:
Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Prasanna Balaprakash
Affiliation:
Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Michael Churchill
Affiliation:
Princeton Plasma Physics Laboratory, Princeton, NJ 08540, USA
Rory Conlin
Affiliation:
Institute for Research in Electronics & Applied Physics, University of Maryland, College Park, MD 20742, USA
Gareth Roberg-Clark
Affiliation:
Max Planck Institute for Plasma Physics, Wendelsteinstraße 1, Greifswald 17491, Germany
*Corresponding author: Matt Landreman, mattland@umd.edu

Abstract

Magnetic geometry has a significant effect on the level of turbulent transport in fusion plasmas. Here, we model and analyse this dependence using multiple machine learning methods and a dataset of ${\gt}200\,000$ nonlinear gyrokinetic simulations of ion-temperature-gradient turbulence in diverse non-axisymmetric geometries. The dataset is generated using a large collection of both optimised and randomly generated stellarator equilibria. At fixed gradients and other input parameters, the turbulent heat flux varies between geometries by several orders of magnitude. Trends are apparent among the configurations with particularly high or particularly low heat flux. Regression and classification techniques from machine learning are then applied to extract patterns in the dataset. Due to a symmetry of the gyrokinetic equation, the heat flux and regressions thereof should be invariant to translations of the raw features in the parallel coordinate, similar to translation invariance in computer vision applications. Multiple regression models including convolutional neural networks (CNNs) and decision trees can achieve reasonable predictive power for the heat flux in held-out test configurations, with the CNNs achieving the highest accuracy. Using Spearman correlation, sequential feature selection and Shapley values to measure feature importance, it is consistently found that the most important geometric lever on the heat flux is the flux surface compression in regions of bad curvature. The second most important geometric feature relates to the magnitude of geodesic curvature. These two features align remarkably well with surrogates that have been proposed based on theory, while the methods here allow a natural extension to more features for increased accuracy. The dataset, released with this publication, may also be used to test other proposed surrogates, and we find that many previously published proxies do correlate well with both the heat flux and stability boundary.

Information

Type
Research Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. Translation-invariance of the gyrokinetic-quasineutrality system. (a) A periodic translation in $z$ is applied to all of the $z$-dependent inputs to the gyrokinetic system, (2.11). Only four of the seven are shown for simplicity, but all are translated. (b) Average heat flux is unchanged by the translation.
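This invariance is straightforward to verify numerically with a toy example: any statistic formed by averaging over the periodic $z$ grid is unchanged when all $z$-dependent inputs are rolled together. The quadratic "flux" functional below is a hypothetical stand-in for the gyrokinetic-quasineutrality system, not the actual operator:

```python
import numpy as np

def toy_flux(inputs):
    """Hypothetical z-averaged 'heat flux': a quadratic functional of
    the z-dependent inputs (a stand-in for the gyrokinetic system)."""
    return np.mean(np.prod(inputs, axis=0) ** 2)

rng = np.random.default_rng(0)
nz = 96
inputs = rng.normal(size=(7, nz))  # seven z-dependent inputs, as in (2.11)

# Apply the same periodic translation in z to every input
shifted = np.roll(inputs, 17, axis=1)

q0 = toy_flux(inputs)
q1 = toy_flux(shifted)
assert np.isclose(q0, q1)  # the z-average is translation invariant
```

Any regression model for the heat flux should respect this same symmetry of its inputs.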

Figure 2. Examples of the rotating-ellipse equilibria included in the dataset. Two-dimensional plots show the cross-sections at which the toroidal angle is 0, $1/4$, $1/2$ and $3/4$ of a field period. Three-dimensional images show each configuration from two angles, with colour indicating $|B|$ (red = high, blue = low) and field lines in black. Left columns: configurations in which the boundaries are centred on a circle. Right columns: configurations in which the boundaries are centred on a curve with torsion.

Figure 3. Examples of the QUASR quasi-axisymmetric and quasi-helically symmetric equilibria included in the dataset. Two-dimensional plots show the cross-sections at which the toroidal angle is 0, $1/4$, $1/2$ and $3/4$ of a field period. Three-dimensional images show each configuration from two angles, with colour indicating $|B|$ (red = high, blue = low) and field lines in black.

Figure 4. Examples of the equilibria generated with random boundary Fourier modes. Two-dimensional plots show the cross-sections at which the toroidal angle is 0, $1/4$, $1/2$ and $3/4$ of a field period. Three-dimensional images show each configuration from two angles, with colour indicating $|B|$ (red = high, blue = low) and field lines in black.

Figure 5. Distribution of heat fluxes for the fixed-gradient and varied-gradient datasets. In the latter, 30 % of the simulations were stable, with $Q \approx 0$. Simulations with $Q\lt 0.1$, considered stable, are included in the leftmost bar.

Figure 6. Some trends are apparent among flux tubes with very low or very high heat flux. The columns show six flux tubes from the $n_{fp}=3$ equilibria with random boundary Fourier modes, the first three stable, the last three with very high $Q$ at the same gradients. The top six rows are the inputs to the gyrokinetic-quasineutrality system ($B^{-3}\boldsymbol{B}\times \boldsymbol{\nabla} B\boldsymbol{\cdot} \boldsymbol{\nabla} y$ is omitted for simplicity since it is similar to $B^{-2}\boldsymbol{B}\times \boldsymbol{\kappa }\boldsymbol{\cdot} \boldsymbol{\nabla} y$), while the bottom row shows the contribution to the total heat flux versus $z$.

Table 1. List of hyperparameters and their search ranges explored using DeepHyper.

Figure 7. Surrogate model architecture to learn heat flux averages using a structured neural network. It consists of three main components: (a) feature extraction with Conv1D; (b) global average pooling; and (c) fully connected layers.
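The three components of this architecture can be sketched in a few lines of numpy as a minimal forward pass with made-up weights, not the tuned model. Circular padding is chosen here so the convolution respects the periodic $z$ grid, which together with global average pooling makes the output exactly invariant to the translations of figure 1:

```python
import numpy as np

def conv1d(x, w, b):
    """1-D convolution with circular ('wrap') padding, respecting the
    periodic z grid. x: (c_in, nz), w: (c_out, c_in, k), plus ReLU."""
    c_out, c_in, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)), mode="wrap")
    nz = x.shape[1]
    out = np.empty((c_out, nz))
    for i in range(nz):
        out[:, i] = np.tensordot(w, xp[:, i:i + k], axes=([1, 2], [0, 1])) + b
    return np.maximum(out, 0.0)

rng = np.random.default_rng(1)

# (a) Feature extraction: 7 z-dependent geometric input channels
x = rng.normal(size=(7, 96))
wc, bc = 0.1 * rng.normal(size=(16, 7, 5)), np.zeros(16)
h = conv1d(x, wc, bc)

# (b) Global average pooling over z -> translation-invariant features
pooled = h.mean(axis=1)

# (c) Fully connected layers producing the scalar heat-flux prediction
w1, b1 = 0.1 * rng.normal(size=(8, 16)), np.zeros(8)
w2, b2 = 0.1 * rng.normal(size=(1, 8)), np.zeros(1)
q_pred = w2 @ np.maximum(w1 @ pooled, 0.0) + b2
```

Global average pooling is what discards the absolute position in $z$: each Conv1D feature map shifts along with the input, and its $z$-average is unchanged.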

Figure 8. Results of the DeepHyper hyperparameter search. Each point represents a model with a unique hyperparameter configuration explored by DeepHyper, plotted against its completion time (x-axis) and performance score (y-axis). The search was conducted using 64 GPUs over approximately nine hours, evaluating a total of 443 models. The top 100 highest-performing models, selected for the final ensemble, are highlighted in red. The bottom plot shows the histogram of model sizes (number of parameters on the x-axis) for all 443 models explored by DeepHyper and the top 100 selected models.

Figure 9. Prediction performance of an ensemble of the top 100 models selected from DeepHyper for the varied-gradient dataset. After exploring 443 models by DeepHyper, the top 100 were chosen based on their performance and evaluated against a test dataset of 9785 samples. Each dot represents the mean prediction of the ensemble, while the vertical bars indicate $\pm$ 1 standard deviation. The ensemble achieved an overall $R^2$ score of 0.989, demonstrating strong predictive accuracy and stability.
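The quantities plotted here (the ensemble mean, the $\pm 1$ standard deviation bars and the $R^2$ score) can be computed as follows, using randomly generated stand-in predictions rather than the actual DeepHyper ensemble outputs:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-ins: 100 models' predictions on 9785 test samples,
# each a noisy copy of a fabricated ground truth
n_models, n_test = 100, 9785
y_true = rng.normal(size=n_test)
preds = y_true + 0.1 * rng.normal(size=(n_models, n_test))

# Ensemble mean prediction and +/- 1 standard deviation across models
y_mean = preds.mean(axis=0)
y_std = preds.std(axis=0)

# Coefficient of determination R^2 of the ensemble-mean prediction
ss_res = np.sum((y_true - y_mean) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

Averaging over the ensemble suppresses the independent errors of the individual models, which is why the ensemble-mean $R^2$ exceeds that of any single member.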

Table 2. Geometric features from § 5.1 with highest absolute magnitude of Spearman correlation to the nonlinear heat flux at fixed temperature and density gradient. Here, $\Theta$ denotes the Heaviside function.
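Spearman correlation is the Pearson correlation of the ranks of the two variables, which makes it sensitive to any monotonic dependence, not only linear ones. A minimal implementation, ignoring ties (the standard scipy.stats.spearmanr handles ties by averaging ranks):

```python
import numpy as np

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.
    (Ties are ignored here; scipy.stats.spearmanr averages tied ranks.)"""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return (rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry))

rng = np.random.default_rng(3)
feature = rng.normal(size=1000)
# A strictly monotonic (but nonlinear) relation gives rho = 1 exactly,
# whereas the ordinary Pearson correlation would be below 1
q = np.exp(feature)
rho = spearman(feature, q)
```

This robustness to monotonic rescaling is convenient here, since the heat flux varies over several orders of magnitude between geometries.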

Figure 10. Scores for regression (left) and classification (right) in forward sequential feature selection, showing improvement as the first few features are added. Each point shows the mean score on held-out data using five-fold cross-validation.
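Forward sequential feature selection is a greedy loop: starting from an empty set, at each step add whichever remaining feature most improves the score on held-out data. A sketch, with a simple linear-fit $R^2$ scorer standing in for the XGBoost/five-fold cross-validation procedure used here:

```python
import numpy as np

def fsfs(X, y, n_select, score):
    """Greedy forward sequential feature selection."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_select):
        best = max(remaining, key=lambda j: score(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected

def r2_linear(Xs, y):
    """Stand-in scorer: R^2 of an ordinary least-squares fit (the text
    uses XGBoost scores from five-fold cross-validation instead)."""
    A = np.column_stack([Xs, np.ones(len(y))])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 10))          # 10 candidate features
y = 3.0 * X[:, 2] + X[:, 7] + 0.1 * rng.normal(size=500)
order = fsfs(X, y, n_select=3, score=r2_linear)
# The two informative features (columns 2 and 7) are selected first
```

The score-versus-step curve in this figure corresponds to evaluating the scorer on the selected subset after each iteration of the loop.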

Figure 11. Accuracy of regression models improves to a point as more features are included from sequential feature selection. Each panel shows the performance of the XGBoost regression on 20 % held-out test data from the varied-gradient dataset.

Table 3. First five features from § 5.1 selected with forward sequential feature selection. Results are shown both for classification of stability vs instability, and for regression on the logarithm of the heat flux $Q$. Results are also shown for both the gradient-boosted decision tree package XGBoost and for 10-nearest-neighbors (10NN). Here, $\Theta$ denotes the Heaviside function.

Table 4. Top-scoring features from steps 3–4 of forward sequential feature selection, for regression on the heat flux using the varied-gradient dataset with XGBoost. At each step, there are many features which are variations on a theme that have nearly identical $R^2$ score. Here, $\Theta$ denotes the Heaviside function.

Figure 12. Comparison of regression methods for the heat flux in the varied-gradient dataset. In all cases, the feature set is $a/L_T$, $a/L_n$ and the top 10 geometric features from FSFS with XGBoost.

Figure 13. Distribution of Shapley values for regression with the varied-gradient dataset, using an XGBoost fit with top 12 features from FSFS. Features are listed in decreasing importance as measured by mean magnitude of the Shapley values.
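The importance ordering used in this figure, the mean magnitude of the Shapley values per feature, is a one-liner once the Shapley value matrix is available. The array below is synthetic; in practice it would come from, e.g., shap.TreeExplainer applied to the fitted XGBoost model:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic Shapley value matrix of shape (n_samples, n_features);
# column j is scaled so that column 0 carries the largest values
shap_values = rng.normal(size=(1000, 12)) * np.arange(12, 0, -1)

# Mean magnitude of the Shapley values per feature measures importance
importance = np.abs(shap_values).mean(axis=0)
order = np.argsort(importance)[::-1]    # features, most important first
```

Because each row of Shapley values sums to the deviation of that sample's prediction from the mean, this ranking attributes the model output additively across features.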

Figure 14. Comparison of the true heat flux with the single geometric feature $f_Q$ in (5.1) at fixed gradients, showing a significant correlation. No regression model is used here.

Figure 15. Comparing several proposed ITG objectives using three scores. The classification and regression scores are computed using XGBoost with three features: the single geometric feature, $a/L_T$ and $a/L_n$.

Figure 16. Dependence of $Q$ on the resolution parameter nz for a selection of flux tubes with associated gradients from the varied-gradient dataset. The seven tubes, shown by different colours, all have the highest value of $n_{fp}$ in the dataset ($n_{fp}=8$), giving short scales in $z$. The vertical dotted line marks the value used for the main dataset, nz $=96$. Variation of $Q$ with increasing resolution is small compared with the differences between geometries and gradients, indicating sufficient convergence.

Figure 17. Evidence of sufficient convergence with respect to numerical resolution parameters. For 100 randomly sampled entries in the varied-gradient dataset, every resolution parameter is varied by a factor of 2 or 10. The box sizes in $x$ and $y$ are denoted $x0$ and $y0$ in the legend.

Figure 18. Comparison of boundary conditions in $z$ for 100 randomly sampled flux tubes from the varied-gradient dataset. The heat flux is insensitive to the choice of boundary condition if the number of Fourier modes in $x$ is increased in each twist-and-shift calculation to match the same maximum $k_x$ as the periodic calculations, as is done here.