Unsupervised classification of fully kinetic simulations of plasmoid instability using self-organizing maps (SOMs)

Published online by Cambridge University Press:  29 May 2023

Sophia Köhne
Affiliation:
Institut für Theoretische Physik, Ruhr-Universität Bochum, Universitätstraße 150, 44801 Bochum, Germany
Elisabetta Boella
Affiliation:
Physics Department, Lancaster University, Bailrigg, Lancaster LA11NN, UK; Cockcroft Institute, Sci-Tech Daresbury, Warrington WA44AD, UK
Maria Elena Innocenti*
Affiliation:
Institut für Theoretische Physik, Ruhr-Universität Bochum, Universitätstraße 150, 44801 Bochum, Germany
* Email address for correspondence: mariaelena.innocenti@rub.de

Abstract

The growing amount of data produced by simulations and observations of space physics processes encourages the use of methods rooted in machine learning for data analysis and physical discovery. We apply a clustering method based on self-organizing maps to fully kinetic simulations of plasmoid instability, with the aim of assessing the suitability of such maps as a reliable analysis tool for both simulated and observed data. We obtain clusters that map well, a posteriori, to our knowledge of the process; the clusters clearly identify the inflow region, the inner plasmoid region, the separatrices and regions associated with plasmoid merging. Self-organizing map-specific analysis tools, such as feature maps and the unified distance matrix, provide us with valuable insights into both the physics at work and specific spatial regions of interest. The method appears to be a promising option for the analysis of data, both from simulations and from observations, and could also potentially be used to trigger the switch to different simulation models or resolution in coupled codes for space simulations.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Table 1. Comparison of the execution times of the parallel CUDA/C++ implementation CUDA-SOM (Mistri 2018) and the serial Python implementation miniSom (Vettigli 2018), both trained for 5 epochs with different sample sizes $m$. The data points are extracted from the upper current sheet of the simulation described in § 3. The SOM parameters used for the training are initial learning rate $\eta _0=0.5$ and initial neighbourhood radius $\sigma _0=0.2 \times \max (x,y)$.
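To make the two training parameters in the caption concrete, the following is a minimal sketch of a single online SOM update in plain Python. It is not the CUDA-SOM or miniSom code: the Gaussian neighbourhood function, the reading of $\max (x,y)$ as the larger side of the node grid, and the names `som_update`, `rows` and `cols` are assumptions made for illustration.

```python
import math

def som_update(weights, x, eta, sigma):
    """Apply one online SOM update in place and return the BMU coordinates.

    weights: dict mapping (row, col) -> list of floats
    x: one input sample; eta: learning rate; sigma: neighbourhood radius
    """
    # Best matching unit: the node whose weight vector is closest to the sample
    bmu = min(weights, key=lambda rc: math.dist(weights[rc], x))
    for rc, w in weights.items():
        # Gaussian neighbourhood (assumed), measured on the map grid
        d2 = (rc[0] - bmu[0]) ** 2 + (rc[1] - bmu[1]) ** 2
        h = math.exp(-d2 / (2.0 * sigma ** 2))
        # Pull each node towards the sample, scaled by eta and h
        weights[rc] = [wi + eta * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

# Hypothetical 3 x 4 map with one-dimensional weights
rows, cols = 3, 4
grid = {(r, c): [float(r + c)] for r in range(rows) for c in range(cols)}
eta0 = 0.5                       # initial learning rate from the caption
sigma0 = 0.2 * max(rows, cols)   # assumed reading of 0.2 x max(x, y)
som_update(grid, [2.0], eta0, sigma0)
```

In an actual training run both the learning rate and the radius decay from their initial values over the epochs; the decay schedule is not stated in the caption and is therefore left out.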

Figure 1. Out-of-plane magnetic field component ($B_z$, a), one diagonal ($p_{xx,e}$, b) and one non-diagonal ($p_{xz,e}$, c) electron pressure term, and the out-of-plane electron current ($J_{z,e}$, d) at $\varOmega _{ci}t=320$. Magnetic field lines are superimposed in black. Electromagnetic fields, pressure components and currents are in units of $m_i c \omega _{pi} / e$, $n_0 m_i c^2$ and $e n_0 c$, respectively.

Figure 2. Violin plots of the distribution, after scaling, of an outlier-poor ($B_y$) and an outlier-rich ($J_{z,e}$) feature. The MinMax, Standard and Robust scalers are used.
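The difference between the three scalers on an outlier-rich feature can be sketched in a few lines of plain Python. The functions below follow the usual definitions (range scaling to $[0,1]$, zero mean with unit population standard deviation, and median centring with interquartile-range scaling); they stand in for the library scalers named in the caption, and the sample values are invented for illustration.

```python
import statistics

def minmax_scale(v):
    # Map the smallest value to 0 and the largest to 1
    lo, hi = min(v), max(v)
    return [(x - lo) / (hi - lo) for x in v]

def standard_scale(v):
    # Subtract the mean and divide by the population standard deviation
    mu, sd = statistics.fmean(v), statistics.pstdev(v)
    return [(x - mu) / sd for x in v]

def robust_scale(v):
    # Subtract the median and divide by the interquartile range
    q1, med, q3 = statistics.quantiles(v, n=4, method="inclusive")
    return [(x - med) / (q3 - q1) for x in v]

feature = [1.0, 2.0, 3.0, 4.0, 100.0]   # made-up values with a single outlier

print(minmax_scale(feature))    # the outlier squeezes the other values near 0
print(standard_scale(feature))
print(robust_scale(feature))    # the bulk keeps a spread of order one
```

With min-max scaling the single outlier compresses the other four values into a narrow band near zero, whereas median and interquartile-range scaling leaves the bulk of the distribution spread out.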

Figure 3. Trained SOM nodes coloured according to their $k$-means clusters (a,c,e) and UDM maps (b,d,f) for the data scaled with MinMaxScaler (a,b), StandardScaler (c,d), RobustScaler (e,f). Cluster boundaries are drawn in black in (b,d,f).
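As a rough illustration of what a UDM encodes, the sketch below assigns to each node the mean feature-space distance to its adjacent nodes on a rectangular grid. The 4-connected neighbourhood, the use of the mean rather than the sum, and the name `u_matrix` are assumptions; the caption does not specify which variant was used.

```python
import math

def u_matrix(weights, rows, cols):
    """Mean distance between each node's weights and those of its grid neighbours.

    weights[r][c] is the weight vector of the node at grid position (r, c).
    """
    udm = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Distances to the up/down/left/right neighbours that exist
            dists = [
                math.dist(weights[r][c], weights[rr][cc])
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            udm[r][c] = sum(dists) / len(dists) if dists else 0.0
    return udm

# Hypothetical 1 x 4 map: two tight pairs of nodes separated by a large jump
toy = [[[0.0], [0.25], [5.0], [5.25]]]
print(u_matrix(toy, 1, 4))
```

High values mark nodes that sit on a boundary between dissimilar groups of nodes, which is why cluster boundaries tend to follow ridges in such maps.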

Table 2. The CUDA-SOM hyperparameters used in SOM training.

Figure 4. Upper current sheet simulated data at $\varOmega _{ci} t=320$, coloured according to the $k$-means cluster the BMU of each point is clustered into. Data scaled with MinMaxScaler, StandardScaler and RobustScaler are depicted in (a), (b) and (c) respectively.
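The two-stage labelling described in this caption (data point to BMU, BMU to $k$-means cluster) can be sketched as follows. The flat list of node weights, the precomputed `node_labels` and the function name are hypothetical; the $k$-means step itself is assumed to have been run on the node weights beforehand.

```python
import math

def classify_points(points, node_weights, node_labels):
    """Give each data point the cluster label of its best matching unit."""
    labels = []
    for p in points:
        # BMU: index of the node whose weight vector is closest to the point
        bmu = min(range(len(node_weights)),
                  key=lambda i: math.dist(node_weights[i], p))
        labels.append(node_labels[bmu])
    return labels

# Hypothetical trained nodes and their k-means labels
node_weights = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
node_labels = [0, 0, 1]
points = [[0.9, 0.1], [9.0, 9.0], [0.1, 0.0]]
print(classify_points(points, node_weights, node_labels))
```

Because clustering is done on the nodes rather than on the data points, the expensive step scales with the map size and each point only needs a nearest-node lookup.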

Figure 5. The $B_z$ (a), $p_{xx,e}$ (b), $p_{xz,e}$ (c) and $J_{z,e}$ (d) values associated with the trained SOM nodes, for data points scaled with the RobustScaler. Cluster boundaries are drawn in black.

Figure 6. Identification in the simulated plane of regions of interest in the feature maps; the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Figure 7. Identification in the simulated plane of regions of interest in the feature maps: the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Figure 8. Identification in the simulated plane of regions of interest in the feature maps; the colour black is used to highlight nodes of interest in (a) and associated points in the simulation in (b).

Figure 9. Data point distribution in the $\beta _{\parallel, e}$ vs $T_{\perp,e}/ T_{\parallel,e}$ plane. (a) All upper current sheet simulated points at initialization. (b–f) Points associated with clusters 0, 1, 2, 3 and 4 at $\varOmega _{ci}t= 320$. The colours highlight the number of points per pixel. The isocontours of growth rates $\gamma /\varOmega _{ce} = 0.001, 0.1, 0.2$ for the resonant electron firehose instability and of growth rates $\gamma /\varOmega _{ce} = 0.01, 0.1$ for the whistler temperature anisotropy instability are depicted in the upper and lower semiquadrants.

Table 3. Percentage $R$ of samples classified in the same $k$-means cluster as with a map trained for five epochs, initial learning rate $\eta _0 = 0.5$, initial neighbourhood radius $\sigma _0 = 0.2\times \max (x,y)$, random initialization seed 42 and number of nodes $q= 71\times 93$ when the number of epochs, $\eta _0, \sigma _0$, random initialization seed and $q$ are varied. In all cases, the scaler used is RobustScaler. The inflow cluster, cluster 0, is excluded in the calculation of the matching factor, since we are primarily interested in clusters mapping to the plasmoid region.
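A matching percentage of this kind can be sketched as below. Two assumptions are made that the caption does not spell out: cluster labels from the two maps have already been aligned (since $k$-means numbering is arbitrary between runs), and a sample is excluded when the reference map places it in cluster 0. The function name is hypothetical.

```python
def matching_percentage(ref_labels, new_labels, excluded=(0,)):
    """Percentage of samples given the same label by both maps.

    Samples whose reference label is in `excluded` are left out of the count.
    Labels are assumed to be aligned between the two clusterings.
    """
    kept = [(a, b) for a, b in zip(ref_labels, new_labels) if a not in excluded]
    if not kept:
        return float("nan")
    return 100.0 * sum(a == b for a, b in kept) / len(kept)

# Invented labels for six samples
ref = [0, 0, 1, 1, 2, 3]
new = [0, 1, 1, 2, 2, 3]
print(matching_percentage(ref, new))
```

In this invented example four samples remain after the two reference-cluster-0 samples are dropped, and three of them agree.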

Figure 10. Clustering of the upper current sheet simulated data, with the SOM used as a reference in table 3 in (a) and the SOMs trained with the hyperparameters listed in the rows coloured in blue and red in (b,c).

Figure 11. Out-of-plane electron current $J_{z,e}$ (a) and clustering of the upper current sheet data points (b) at $\varOmega _{ci}t=160$. The SOM used for the clustering has been trained at $\varOmega _{ci}t= 320$.