
Plasma image classification using cosine similarity constrained convolutional neural network

Published online by Cambridge University Press:  16 December 2022

Michael J. Falato
Affiliation:
Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Bradley T. Wolfe
Affiliation:
Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Tali M. Natan
Affiliation:
Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Xinhua Zhang
Affiliation:
Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Ryan S. Marshall
Affiliation:
California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
Yi Zhou
Affiliation:
California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
Paul M. Bellan
Affiliation:
California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125, USA
Zhehui Wang*
Affiliation:
Los Alamos National Laboratory, Los Alamos, NM 87545, USA
*
Email address for correspondence: zwang@lanl.gov

Abstract

Plasma jets are widely investigated both in the laboratory and in nature. Astrophysical objects such as black holes, active galactic nuclei and young stellar objects commonly emit plasma jets in various forms. With the availability of data from plasma jet experiments resembling astrophysical plasma jets, classification of such data would potentially aid not only the investigation of the underlying physics of the experiments but also the study of astrophysical jets. In this work we use deep learning to process all of the laboratory plasma images from the Caltech Spheromak Experiment spanning two decades. We find that cosine similarity can aid in feature selection, classify images through comparison of feature vector direction and serve as a loss function for training AlexNet for plasma image classification. We also develop a simple vector direction comparison algorithm for binary and multi-class classification. Using our algorithm we demonstrate 93 % accurate binary classification to distinguish unstable columns from stable columns and 92 % accurate five-way classification on a small, labelled data set which includes three classes corresponding to varying levels of kink instability.

Information

Type
Research Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press

Figure 1. We conduct classification using the cosine similarity between feature vectors extracted from images. For classification to work, the vectors must align more closely for images in the same class than for images in different classes. Classification then reduces to testing whether a vector lies within a particular angular range in the binary case (shown above), or to choosing the closest target region in the multi-class case. In this example, the similar image would be classified into the same class as the target image, while the dissimilar image would be classified out of the class.
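The comparison described above can be sketched in a few lines. This is an illustrative numpy example with made-up three-component feature vectors and an arbitrary threshold, not the feature extraction used in the paper:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical feature vectors: a target image, a similar image and a
# dissimilar one.
target = np.array([0.9, 0.1, 0.3])
similar = np.array([0.8, 0.2, 0.25])
dissimilar = np.array([0.1, 0.9, 0.0])

# An image is placed in the target's class when its feature vector lies
# within an angular range of the target vector (a similarity threshold).
threshold = 0.9
print(cosine_similarity(target, similar) > threshold)     # in class
print(cosine_similarity(target, dissimilar) > threshold)  # out of class
```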


Figure 2. An example image from each of the five classes in the small data set. The classes are haze, spider, column, kink and sphere. Each class contains 226, 111, 130, 295 and 390 images, respectively. Note that the images feature a false colour map.


Figure 3. An example of an image from the kink class with the four distributions used to extract features. Note that the image features a false colour map. Distributions $P(x,y)$, $P(x)$ and $P(y)$ are distributions of the intensity (a), while $P(v)$ is a histogram of pixel values (b). Features are extracted from statistical and information-theoretic quantities of these four distributions.
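A minimal numpy sketch of the four distributions, assuming a grayscale image stored as a 2-D array; the bin count and function names are illustrative, and the mutual information $I(x,y)$ is shown as one example feature:

```python
import numpy as np

def image_distributions(img, n_bins=16):
    """Extract the four distributions of figure 3 from a 2-D intensity
    image: the joint P(x, y), the marginals P(x) and P(y), and P(v),
    a histogram of pixel values. All are normalized to sum to one."""
    p_xy = img / img.sum()      # joint intensity distribution
    p_x = p_xy.sum(axis=0)      # marginal over rows -> distribution over x
    p_y = p_xy.sum(axis=1)      # marginal over columns -> distribution over y
    p_v, _ = np.histogram(img, bins=n_bins)
    p_v = p_v / p_v.sum()
    return p_xy, p_x, p_y, p_v

def mutual_information(p_xy, p_x, p_y):
    """I(x, y) = sum p(x,y) log[p(x,y) / (p(x) p(y))], one example of an
    information-theoretic feature."""
    outer = np.outer(p_y, p_x)          # product of marginals
    mask = p_xy > 0                     # avoid log(0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / outer[mask])))

rng = np.random.default_rng(0)
img = rng.random((32, 32))              # stand-in for a plasma image
p_xy, p_x, p_y, p_v = image_distributions(img)
print(mutual_information(p_xy, p_x, p_y))
```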


Figure 4. Example of a cosine similarity matrix indicating a potential feature selection for class-one versus non-class-one (spider versus non-spider) binary classification. The two features used as vector components to construct the matrix were the mutual information, $I(x,y)$, and the mean of the $P(y)$ distribution. Classes are labelled 0 through 4, corresponding to the classes shown in figure 2. Since the matrix is symmetric by construction, the bottom left half is set to the minimum value and crossed out.
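A matrix of this kind can be constructed by normalizing each feature vector and taking pairwise dot products; a hypothetical numpy sketch with four two-component vectors standing in for the extracted features:

```python
import numpy as np

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarities between row vectors, with the
    redundant lower triangle set to the minimum value as in figure 4."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    m = v @ v.T
    m[np.tril_indices_from(m, k=-1)] = m.min()
    return m

# Hypothetical two-component feature vectors, e.g. (mutual information,
# mean of P(y)) for four images: two similar pairs.
feats = np.array([[0.80, 0.10],
                  [0.70, 0.15],
                  [0.10, 0.90],
                  [0.05, 0.80]])
m = cosine_similarity_matrix(feats)
print(np.round(m, 2))
```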


Algorithm 1. Binary classification with cosine similarity. Given: iterations $N$, a number $\gamma$, a training set $U$, a test set $P$ and a vector-valued function $f(x)$.


Figure 5. A diagram of the binary classification algorithm based on cosine similarity. We start with a ‘training’ stage, in which enough vectors are sampled from the spider class to obtain an approximate average cosine similarity, $\mu$, of members in the class, along with a standard deviation, $\sigma$, of the distribution of cosine similarity values in the class. With this information we introduce a test image to the algorithm and compute its average cosine similarity, $\mu ^{*}$, with the training samples. A threshold function, $\tau (\mu, \mu ^{*}, \sigma )$, which considers an image a member of the class if $\mu ^*$ is greater than $\mu - \gamma \sigma$, then determines whether the image is in the spider class or not. Note that the plasma images feature a false colour map. This algorithm achieves the results shown in figure 6 with the features from figure 4.
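The two stages can be sketched as follows, assuming feature vectors have already been extracted; the synthetic three-component vectors and sample sizes here are illustrative stand-ins, not the paper's features:

```python
import numpy as np

def cos_sim(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def fit_class_statistics(train_vectors):
    """'Training' stage: estimate the mean (mu) and standard deviation
    (sigma) of the pairwise cosine similarities within the class."""
    sims = [cos_sim(u, v)
            for i, u in enumerate(train_vectors)
            for v in train_vectors[i + 1:]]
    return float(np.mean(sims)), float(np.std(sims))

def classify(test_vector, train_vectors, mu, sigma, gamma=1.0):
    """Test stage: average similarity mu_star of the test vector to the
    training samples, then the threshold test mu_star > mu - gamma*sigma."""
    mu_star = np.mean([cos_sim(test_vector, v) for v in train_vectors])
    return bool(mu_star > mu - gamma * sigma)

# Synthetic stand-ins for extracted feature vectors: the class clusters
# about a single direction; the outsider points elsewhere.
rng = np.random.default_rng(1)
centre = np.array([1.0, 0.5, 0.2])
train = [centre + 0.05 * rng.standard_normal(3) for _ in range(50)]
mu, sigma = fit_class_statistics(train)

print(classify(centre + 0.01 * rng.standard_normal(3), train, mu, sigma))
print(classify(np.array([-0.2, 1.0, -0.8]), train, mu, sigma))
```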


Figure 6. Confusion matrix of the results of the binary classification algorithm on the features from figure 4 for spider versus non-spider classification. The matrix displays the fractional accuracy for each class in the first row and the fractional inaccuracy for each class in the second row. The algorithm achieves an average accuracy of 86 %, and the total number of images classified is 250.


Figure 7. Differences in the activations between the cosine-similarity-trained AlexNet (labelled as CS) versus the cross-entropy-trained AlexNet (CE). Gradient-weighted class activation mapping (Grad-CAM) was performed on the five representative plasma images from figure 2 for both versions of AlexNet. Here GT stands for ‘ground truth’. The second and third rows display the outputs of the Grad-CAM method on the first row for the corresponding models. Note that the ground-truth images are displayed using a false colour map.


Figure 8. A diagram of the cosine embedding training on the final ReLU layer of AlexNet. We use the base AlexNet model from Torchvision and remove the final linear layer, outputting a 4096-dimensional vector for each image. These vectors are used as feature vectors for the loss.
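The loss applied to a pair of such feature vectors, with a label indicating same class ($y = 1$) or different class ($y = -1$), can be written out independently of any framework. This numpy sketch mirrors the definition used by PyTorch's `torch.nn.CosineEmbeddingLoss`, with random stand-ins for the 4096-dimensional vectors:

```python
import numpy as np

def cosine_embedding_loss(x1, x2, y, margin=0.0):
    """Pull same-class pairs (y = +1) toward alignment and push
    different-class pairs (y = -1) apart, as in
    torch.nn.CosineEmbeddingLoss."""
    cos = np.dot(x1, x2) / (np.linalg.norm(x1) * np.linalg.norm(x2))
    if y == 1:
        return 1.0 - cos
    return max(0.0, cos - margin)

# Stand-ins for the 4096-dimensional vectors from AlexNet's final ReLU layer.
rng = np.random.default_rng(0)
a = rng.random(4096)
same = a + 0.1 * rng.standard_normal(4096)   # nearly aligned pair
other = rng.random(4096)                     # unrelated vector

print(cosine_embedding_loss(a, same, y=1))    # small: vectors aligned
print(cosine_embedding_loss(a, other, y=-1))  # penalizes residual alignment
```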


Figure 9. Cosine similarity matrices computed after performing transfer learning on AlexNet using a cosine embedding loss or a cross-entropy loss. Note that the vectors used to construct the matrices were taken from the final ReLU layer of each network and that the matrices were computed using images from the test set only. The matrices correspond to (a) the cross-entropy-trained model, (b) the cosine-embedding-trained model and (c) the model with the cosine embedding trained on the final ReLU layer. Since the matrices are symmetric by construction, the bottom left half is set to the minimum value and crossed out.


Figure 10. Confusion matrices of the test results using both the binary classification algorithm (a) and the multi-class modification of the binary classification algorithm (b) on feature vectors obtained from AlexNet when trained on vector embeddings instead of the final layer. For the kink versus non-kink binary classification matrix, the fractional accuracy for each class is displayed in the first row and the fractional inaccuracy for each class in the second row. For the five-way classification results, the diagonal values show the fractional accuracy of each class, while the off-diagonal values show the fractional inaccuracies, displaying the fraction of each class on the $x$ axis that was classified incorrectly as a class on the $y$ axis. The total number of images classified in each case is 285.


Algorithm 2. Multi-class classification with cosine similarity. Given: a number of classes $I$, a set of classes $C = \{C_i\}_{i \in I}$, a training set $U_i$ for each class, a test set $P$ and a vector-valued function $f(x)$.
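The multi-class variant reduces to choosing the class with the largest average similarity $\mu ^{*}_i$; a sketch with synthetic clustered vectors standing in for extracted features:

```python
import numpy as np

def cos_sim(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def classify_multiclass(test_vector, training_sets):
    """Assign the test vector to the class whose training samples it is,
    on average, most cosine-similar to (the closest target region)."""
    mu_stars = [np.mean([cos_sim(test_vector, v) for v in train])
                for train in training_sets]
    return int(np.argmax(mu_stars))

# Synthetic feature vectors for three hypothetical classes, each
# clustered about its own direction.
rng = np.random.default_rng(2)
centres = [np.array([1.0, 0.0, 0.0]),
           np.array([0.0, 1.0, 0.0]),
           np.array([0.0, 0.0, 1.0])]
training_sets = [[c + 0.05 * rng.standard_normal(3) for _ in range(30)]
                 for c in centres]

print(classify_multiclass(np.array([0.9, 0.1, 0.05]), training_sets))  # class 0
```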


Figure 11. (a) The tSNE for 2000 sampled images from each class. The tSNE parameters of perplexity and verbosity are set to 350 and 2, respectively. We note that when the 4096-dimensional vectors are reduced to two dimensions the data appear to be generally separated appropriately. (b) We conduct the same tSNE with normalized feature vectors, where instead of five lines of classes, we see five clusters of classes. The classes are labelled as 0, 1, 2, 3, 4 corresponding to the labelling convention from the previous sections in this work.
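The effect of normalization is easy to see: vectors that share a direction but differ in magnitude coincide after projection onto the unit sphere, which is why the radial lines in (a) collapse into the clusters in (b). A small numpy illustration:

```python
import numpy as np

def normalize_rows(vectors):
    """Project feature vectors onto the unit sphere, so that only
    direction, the quantity cosine similarity measures, remains."""
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

# Two vectors with the same direction but different magnitudes map to
# the same point after normalization.
v = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
u = normalize_rows(v)
print(np.allclose(u[0], u[1]))  # True
```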


Figure 12. (a) Distribution of 2000 randomly sampled vectors in the kink class about the average vector of the samples. (b) The closest five images to the average vector, images randomly sampled from $0^{\circ }$ to $10^{\circ }$ and images randomly sampled from $25^{\circ }$ to $35^{\circ }$ are displayed. Here 75 % of images in the class are within the $0^{\circ }$ to $10^{\circ }$ range, while the remainder of the data lie outside of this range.
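The angular distribution in (a) can be computed by measuring each vector's angle from the normalized average vector; a numpy sketch with synthetic samples in place of the kink-class feature vectors:

```python
import numpy as np

def angles_from_mean(vectors):
    """Angle (in degrees) of each feature vector from the average vector
    of the sample, as used for the distribution in figure 12(a)."""
    mean = np.mean(vectors, axis=0)
    mean = mean / np.linalg.norm(mean)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cos = np.clip(unit @ mean, -1.0, 1.0)   # clip guards arccos rounding
    return np.degrees(np.arccos(cos))

# Synthetic stand-ins for the 2000 sampled kink-class feature vectors.
rng = np.random.default_rng(3)
samples = np.array([1.0, 0.5, 0.2]) + 0.05 * rng.standard_normal((2000, 3))
theta = angles_from_mean(samples)
print(np.mean(theta <= 10.0))  # fraction of samples within 10 degrees
```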


Table 1. Equations to obtain features from § 2.2.