
Convolutional kernel-based classification of industrial alarm floods

Published online by Cambridge University Press:  27 November 2024

Gianluca Manca*
Affiliation:
Institute of Automation Technology, Helmut-Schmidt-University Hamburg, Hamburg, Germany; Industrial AI, ABB Corporate Research Center, Ladenburg, Germany
Alexander Fay
Affiliation:
Institute of Automation Technology, Helmut-Schmidt-University Hamburg, Hamburg, Germany
Corresponding author: Gianluca Manca; Email: gianluca.manca@de.abb.com

Abstract

Alarm flood classification (AFC) methods are crucial in assisting human operators to identify and mitigate the overwhelming occurrences of alarm floods in industrial process plants, a challenge exacerbated by the complexity and data-intensive nature of modern process control systems. These alarm floods can significantly impair situational awareness and hinder decision-making. Existing AFC methods face difficulties in dealing with the inherent ambiguity in alarm sequences and the task of identifying novel, previously unobserved alarm floods. As a result, they often fail to accurately classify alarm floods. Addressing these significant limitations, this paper introduces a novel three-tier AFC method that uses alarm time series as input. In the transformation stage, alarm floods are subjected to an ensemble of convolutional kernel-based transformations (MultiRocket) to extract their characteristic dynamic properties, which are then fed into the classification stage, where a linear ridge regression classifier ensemble is used to identify recurring alarm floods. In the final novelty detection stage, the local outlier probability (LoOP) is used to determine a confidence measure of whether the classified alarm flood truly belongs to a known or previously unobserved class. Our method has been thoroughly validated using a publicly available dataset based on the Tennessee-Eastman process. The results show that our method outperforms two naive baselines and four existing AFC methods from the literature in terms of overall classification performance as well as the ability to optimize the balance between accurately identifying alarm floods from known classes and detecting alarm flood classes that have not been observed before.
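The three-tier pipeline described in the abstract can be sketched end to end in plain NumPy. This is a minimal illustration, not the paper's implementation: random {-1, 2} kernels with PPV pooling stand in for the full MultiRocket ensemble, a single closed-form ridge regression on one-hot targets stands in for the classifier ensemble, the LoOP novelty stage is omitted, and all data and names are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 20 binary alarm floods, 4 alarm variables,
# 50 time steps, two classes that differ in which alarms become active.
n, c, t = 20, 4, 50
X = np.zeros((n, c, t))
y = np.repeat([0, 1], n // 2)
X[:10, 0, 10:40] = 1.0                    # class 0: alarm variable A active
X[10:, 2, 5:45] = 1.0                     # class 1: alarm variable C active
X += rng.random((n, c, t)) < 0.02         # sparse spurious activations

# Stage 1: convolutional kernel transform (stand-in for MultiRocket).
n_kernels = 32
kernels = rng.choice([-1.0, 2.0], size=(n_kernels, 9))

def transform(batch):
    feats = []
    for series in batch:
        f = [(np.convolve(ch, w, mode="same") > 0).mean()   # PPV pooling
             for ch in series for w in kernels]
        feats.append(f)
    return np.array(feats)

F = transform(X)                          # shape (20, 4 * 32)

# Stage 2: linear ridge regression classifier (closed form, one-hot targets).
lam = 1.0
Y = np.eye(2)[y]
W = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ Y)
pred = (F @ W).argmax(axis=1)
acc = (pred == y).mean()
```

In the paper's setup, a third stage would then convert the classified flood's position in feature space into a local outlier probability, flagging floods that likely belong to a previously unobserved class.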

Information

Type
Research Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. Two alarm sequences “A” and “B”. The alarm variable column shows the name of the process variable (XMEAS) or manipulated variable (XMV) and the activated alarm type (L: low, H: high). The black lines between the two sequences connect pairs of identical alarm variables.


Figure 2. Two types of alarm data representations for alarm variables A to D. (a) Alarm series. The solid blue lines represent alarm variable time trends. A higher level represents an active alarm. (b) Alarm sequence. The solid blue lines indicate alarm activations.


Figure 3. Formalized process description of the proposed convolutional kernel-based alarm subsequence identification method (CASIM).


Figure 4. Application of MultiRocket to a binary alarm series and its first-order difference representation, both of which consist of alarm variables A to D and have a sampling rate of 1/s. The convolution operation employs two kernels $ {W}_1=\left[2,2,-1,-1,2,-1,-1,-1,-1\right] $ and $ {W}_2=\left[-1,-1,2,2,-1,-1,-1,-1,2\right] $ in addition to two dilation factors $ {d}_1=\left\lfloor {2}^0\right\rfloor $ and $ {d}_2=\left\lfloor {2}^{\log_2\left(n/8\right)}\right\rfloor $ and a set of selected alarm variables $ K=\left\{A,B,C,D\right\} $. The combination of kernels and dilation factors yields four convolution outputs per alarm series representation. All used kernels utilize zero padding. MultiRocket computes four features per convolution output using the bias $ b=0 $ and four pooling operators: proportion of positive values (PPV), mean of positive values (MPV), mean of indices of positive values (MIPV), and longest stretch of positive values (LSPV).
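The four pooling operators named in the caption (PPV, MPV, MIPV, and LSPV) can be written compactly in NumPy. This is a sketch of their usual definitions in the Rocket/MultiRocket family; the function name and the edge-case return values are illustrative assumptions.

```python
import numpy as np

def multirocket_pool(z, b=0.0):
    """PPV, MPV, MIPV, and LSPV features of one convolution output z, given bias b."""
    pos = z > b
    ppv = pos.mean()                                          # proportion of positive values
    mpv = z[pos].mean() if pos.any() else 0.0                 # mean of positive values
    mipv = np.flatnonzero(pos).mean() if pos.any() else -1.0  # mean of indices of positives
    lspv = run = 0                                            # longest stretch of positives
    for p in pos:
        run = run + 1 if p else 0
        lspv = max(lspv, run)
    return ppv, mpv, mipv, float(lspv)

features = multirocket_pool(np.array([1.0, -2.0, 3.0, 4.0, -1.0, 2.0]))
```

For the example output above, PPV counts four of six values as positive, MPV averages those four values, MIPV averages their indices, and LSPV finds the longest run of consecutive positives.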


Figure 5. Convolution operation applied to a univariate series $ X $ of seven integer values. The convolution operation employs two kernels of length three, $ {W}_1 $ and $ {W}_2 $, and three dilation factors $ d $. The combination of kernels and dilation factors yields six convolution outputs $ {z}_0 $, which are calculated by multiplying the respective kernel weights by the aligned series values and then summing the results. (a) Original series without zero padding. (b) Modified series with zero padding. Red zeros represent padded values.
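The dilated convolution shown in the figure can be sketched directly. Here `dilated_conv` is an illustrative helper (not from the paper) that multiplies each kernel weight by series values spaced `d` apart and sums the products, with optional zero padding as in panel (b).

```python
import numpy as np

def dilated_conv(x, w, d, pad=True):
    """Convolve series x with kernel w at dilation d; optionally zero-pad
    so that the output has the same length as the input."""
    x = np.asarray(x, dtype=float)
    span = (len(w) - 1) * d               # distance spanned by the dilated kernel
    if pad:
        x = np.pad(x, span // 2)          # the red zeros in panel (b)
    return np.array([sum(w[j] * x[i + j * d] for j in range(len(w)))
                     for i in range(len(x) - span)])

# Difference-style kernel on a series of seven integers, as in the figure.
z = dilated_conv([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], d=1)
```

Without padding, the output shrinks by `(len(w) - 1) * d` values; with padding, boundary positions mix real series values with zeros.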


Figure 6. Concept of expanding windows for training the proposed CASIM for online classification of evolving alarm subsequences, applied to an exemplary alarm series with window $T$, step $s$, and interval length $w$. (a) Application of expanding windows to segment a historical alarm subsequence for off-line training. (b) Training a set of stage instances using the window $ \mathit{\mathsf{T}}\left({s}_2\right) $ as input.
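The expanding-window segmentation can be sketched as a simple generator. The roles of `w` and `s` follow the caption; the channels-by-time array layout and prefix slicing are illustrative assumptions.

```python
import numpy as np

def expanding_windows(series, w, s):
    """Yield expanding prefixes of a (channels x time) alarm series:
    the first window has length w, and each next window grows by step s."""
    end = w
    while end <= series.shape[-1]:
        yield series[..., :end]
        end += s

windows = list(expanding_windows(np.arange(20).reshape(2, 10), w=4, s=2))
```

Each window would then train one set of stage instances, so that at runtime an evolving alarm flood can be matched against the instances trained on a comparable amount of elapsed time.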


Figure 7. Piping and instrumentation diagram (P&ID) of the Tennessee-Eastman process (TEP) (Arroyo, 2017; Bathelt et al., 2015; Downs & Vogel, 1993; Manca & Fay, 2021b).


Figure 8. Three examples of consecutive abnormal situations. The solid blue lines represent the time trends of the alarm variables. The lower level for each alarm variable represents a low alarm, and the higher level represents a high alarm. The red dotted lines represent the initiation of a root cause disturbance. The green dashed-dotted lines represent the return to a normal operation (following Manca et al. (2021) and Manca and Fay (2021b)).


Figure 9. t-distributed stochastic neighbor embedding (t-SNE) representations of the 310 alarm subsequences in the Tennessee–Eastman process alarm management dataset presented in Manca and Fay (2021b). Each symbol represents a unique alarm subsequence, with the color coding and shape indicating the alarm subsequence’s class. The t-SNE representation employs two distinct alarm subsequence distance measures. (a) Alarm set-based Euclidean distances. (b) Alarm activation duration-based Euclidean distances.
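The two distance measures behind the embeddings can be sketched as follows. The toy binary matrices and the exact feature constructions (ever-active alarm sets, total activation durations) are illustrative assumptions consistent with the caption, not the paper's data.

```python
import numpy as np

# Two toy alarm subsequences as binary (alarm variable x time step) matrices.
A = np.array([[0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1]])
B = np.array([[0, 0, 0, 0, 0],
              [1, 1, 1, 0, 0]])

# (a) Alarm set-based distance: Euclidean distance between binary vectors
# indicating which alarm variables were ever active.
d_set = np.linalg.norm(A.any(axis=1).astype(float) - B.any(axis=1).astype(float))

# (b) Activation duration-based distance: Euclidean distance between the
# per-alarm total activation durations (in samples).
d_dur = np.linalg.norm(A.sum(axis=1) - B.sum(axis=1))
```

A pairwise matrix of such distances could then be embedded in two dimensions, e.g. with `sklearn.manifold.TSNE(metric="precomputed")`, to produce plots like those in the figure.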


Figure 10. Violin plots that depict the true positive rate using the examined alarm flood classification methods across all tests. The median is represented by a red line. The range is depicted by two blue horizontal lines. The probability distribution is shown by the black boundary lines.


Figure 11. Median true positive rate of the examined alarm flood classification methods across all tests. The performance measurements were collected at relative time intervals post initiation of the corresponding alarm subsequences in the test.


Figure 12. Violin plots that depict the nearest-neighbor distances (WDI-1NN and EAC-1NN), the reverse posterior class probabilities (MBW-LR and ACM-SVM), and the outlier probabilities (CASIM) over all tests. The distributions of alarm subsequences belonging to the known and novel classes are represented by blue and red shape fills, respectively. The median is represented by a red line. The range is depicted by two blue horizontal lines.


Figure 13. Average performance using the classification and detection stages of the four existing alarm flood classification methods and the proposed CASIM over all detection threshold parameter settings and tests. (a) True positive rate. (b) True negative rate. (c) Balanced accuracy.


Figure 14. Balanced accuracy over all detection threshold parameter settings and tests using different parameter settings and versions of the proposed CASIM. (a) Ten randomly instantiated CASIM instances. (b) Different number of estimators $ {n}_{\mathrm{clf}} $. (c) Different numbers of features $ {n}_{\mathrm{feat}} $. (d) CASIM, CASIM-V1, and CASIM-V2.
