Anomaly detection in a fleet of industrial assets with hierarchical statistical modeling

Published online by Cambridge University Press:  30 December 2020

Maharshi Dhada*
Affiliation:
Department of Engineering, Institute for Manufacturing, University of Cambridge, Cambridge, CB3 0FS, United Kingdom
Mark Girolami
Affiliation:
Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom; The Alan Turing Institute, London, NW1 2DB, United Kingdom
Ajith Kumar Parlikad
Affiliation:
Department of Engineering, Institute for Manufacturing, University of Cambridge, Cambridge, CB3 0FS, United Kingdom
*Corresponding author. E-mail: mhd37@cam.ac.uk

Abstract

Anomaly detection in asset condition data is critical for reliable industrial asset operations. However, statistical anomaly classifiers require a certain amount of normal-operation training data before they achieve acceptable accuracy, and such data are often unavailable in the early periods of an asset's operation. This paper addresses the problem using a hierarchical model for the asset fleet that systematically identifies similar assets and enables collaborative learning within clusters of similar assets. The general behavior of similar assets is represented using higher-level models, from which the parameters describing individual asset operations are sampled. Hierarchical models enable the individuals in a population comprising statistically coherent subpopulations to learn collaboratively from one another. Results obtained with the hierarchical model show a marked improvement in anomaly detection for assets with small amounts of data, compared with modeling each asset independently or using a single model common to the entire fleet.
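As an illustration of how an asset with few observations can borrow strength from its cluster, the sketch below applies the standard conjugate update for a Gaussian mean under a Normal-Inverse-Wishart prior. It is not the estimation procedure used in the paper; the function name `shrunk_mean` and the prior-strength parameter `kappa0` are hypothetical, and the cluster-level mean is assumed to be already known.

```python
def shrunk_mean(samples, cluster_mean, kappa0):
    """Posterior mean of an asset's mean vector under a conjugate prior.

    samples      : list of observation vectors for one asset (may be empty)
    cluster_mean : prior mean shared by the cluster of similar assets
    kappa0       : prior strength, in units of pseudo-observations (assumed)
    """
    n = len(samples)
    dim = len(cluster_mean)
    if n == 0:
        # With no data, the asset simply inherits the cluster-level mean.
        return list(cluster_mean)
    # Sample mean of the asset's own data, per dimension.
    xbar = [sum(s[k] for s in samples) / n for k in range(dim)]
    # Weighted average: the asset's own data dominate as n grows.
    return [(kappa0 * cluster_mean[k] + n * xbar[k]) / (kappa0 + n)
            for k in range(dim)]


if __name__ == "__main__":
    print(shrunk_mean([(4.0, 4.0)], (0.0, 0.0), kappa0=1.0))
    print(shrunk_mean([(4.0, 4.0)] * 3, (0.0, 0.0), kappa0=1.0))
```

With a single observation the estimate sits halfway between the cluster mean and the observation; as the asset accumulates data, the estimate moves toward the asset's own sample mean.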

Information

Type
Research Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Open Practices
Open data
Copyright
© The Author(s), 2020. Published by Cambridge University Press

Figure 1. Graphical representation of modeling an asset’s data as multivariate Gaussian.

Figure 2. Graphical representation of hierarchically modeled fleet data. Individual asset data are modeled as multivariate Gaussians, whose mean and covariance parameters are sampled from higher-level Normal-Inverse-Wishart distributions.

Figure 3. A schematic representation describing how the normal and anomalous data were generated for the experiments. The procedure is shown here for a two-dimensional dataset as an example.

Table 1. The values of various parameters introduced in Section 3.

Table 2. An example of condition data for a medium data category asset.

Figure 4. Clustering produced by the EM algorithm when the assets (the low data category assets in c and d) have only five and six data points. Incorrectly clustered assets are marked with dotted red circles.

Table 3. Various $ \alpha $ levels used while plotting the ROCs, and the corresponding $ {D}_{md} $ values for the current experiment.
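The relationship between an $ \alpha $ level and a distance threshold can be sketched as follows, assuming that $ {D}_{md} $ denotes a Mahalanobis distance threshold and that $ \alpha $ is the probability that a normal-operation point exceeds it (both are assumptions here, since the table itself is not reproduced). The sketch is restricted to two dimensions, where the squared Mahalanobis distance of a Gaussian point follows a chi-square distribution with two degrees of freedom and the threshold has the closed form $ \sqrt{-2\ln \alpha } $. The function names are hypothetical.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from a Gaussian (mean, cov)."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


def threshold_2d(alpha):
    """Distance exceeded with probability alpha by a 2-D Gaussian point."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return math.sqrt(-2.0 * math.log(alpha))


def is_anomaly(x, mean, cov, alpha):
    """Flag x as anomalous if it lies beyond the alpha-level threshold."""
    return mahalanobis_2d(x, mean, cov) > threshold_2d(alpha)


if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))
    print(threshold_2d(0.05))
    print(is_anomaly((3.0, 4.0), (0.0, 0.0), identity, alpha=0.05))
```

Sweeping `alpha` over a grid and recording the true and false positive rates at each threshold is one way to trace out a ROC curve; in higher dimensions the threshold would come from the chi-square quantile with the matching degrees of freedom rather than from the closed form used here.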

Figure 5. An example receiver operator characteristic (ROC) for asset id 52, evaluated on the testing dataset with $ l $ and $ L $ equal to 0 and 10, respectively.

Figure 6. Areas under the receiver operator characteristic curves (AUCs) measured for the experiment cases. The subset of assets across which the AUCs are measured is indicated in the corresponding captions. For all four plots, the deviation for anomalous data in the testing dataset was set at $ 1 $ and $ 10 $ for $ l $ and $ L $, respectively.

Figure 7. Box plots presenting the effect of gradually increasing the amount of data held by the low data category assets. The captions denote the corresponding deviations in the testing dataset.

Figure 8. Box plots presenting the effect of gradually increasing the data across all assets, when they all had the same amount of data. The corresponding testing dataset deviations are denoted in the captions.

Figure 9. Box plots presenting the $ {D}_B $ recorded across the assets belonging to the low data category. A lower value of $ {D}_B $ signifies that the given Gaussians are more similar.
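A similarity measure of this kind can be sketched as below, assuming that $ {D}_B $ denotes the Bhattacharyya distance between two Gaussians (an assumption here, since the defining equation is not reproduced). The sketch uses the standard closed form for Gaussians, restricted to two dimensions so that the matrix inverse and determinants can be written out with the standard library alone. The function names are hypothetical.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested sequences."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def bhattacharyya_2d(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between two 2-D Gaussians."""
    # Average covariance, Sigma = (Sigma1 + Sigma2) / 2.
    avg = [[(cov1[i][j] + cov2[i][j]) / 2.0 for j in range(2)]
           for i in range(2)]
    d_avg, d1, d2 = det2(avg), det2(cov1), det2(cov2)
    if min(d_avg, d1, d2) <= 0:
        raise ValueError("covariances must be positive definite")
    dx, dy = mean1[0] - mean2[0], mean1[1] - mean2[1]
    # Quadratic form (mu1 - mu2)^T Sigma^{-1} (mu1 - mu2) for a 2x2 Sigma.
    quad = (avg[1][1] * dx * dx
            - (avg[0][1] + avg[1][0]) * dx * dy
            + avg[0][0] * dy * dy) / d_avg
    # Mean-separation term plus covariance-mismatch term.
    return quad / 8.0 + 0.5 * math.log(d_avg / math.sqrt(d1 * d2))


if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))
    print(bhattacharyya_2d((0.0, 0.0), identity, (0.0, 0.0), identity))
    print(bhattacharyya_2d((0.0, 0.0), identity, (2.0, 0.0), identity))
```

Identical Gaussians give a distance of zero, and the value grows as either the means separate or the covariances diverge, which matches the reading that a lower value indicates more similar Gaussians.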

Figure C1. Box plots presenting the effect of gradually increasing the amount of data held by the low data category assets. The captions denote the corresponding deviations in the testing dataset.

Figure C2. Box plots presenting AUCs recorded across the assets belonging to the low data category. The corresponding testing dataset deviations are denoted in the captions.

Figure C3. Box plots presenting AUCs recorded across the assets belonging to the low data category. The corresponding testing dataset deviations are denoted in the captions.

Figure D1. Box plots presenting area under the receiver operator characteristic curves (AUCs) recorded across the assets belonging to the low data category, but for a narrower range of means.
