
A machine learning approach using autoencoders to perform quality control on meteorological data

Published online by Cambridge University Press:  26 January 2026

Teresa Kristine Spohn*
Affiliation:
Observations Division, Met Eireann, Ireland
Eoin Walsh
Affiliation:
Research Division, Met Eireann, Ireland
Kevin Horan
Affiliation:
Maynooth University, Ireland
John O’Donoghue
Affiliation:
University of Limerick, Ireland
Tim Charnecki
Affiliation:
Observations Division, Met Eireann, Ireland
Merlin Haslam
Affiliation:
Observations Division, Met Eireann, Ireland
Sarah Gallagher
Affiliation:
Observations Division, Met Eireann, Ireland
*Corresponding author: Teresa Kristine Spohn; Email: teresa.spohn@met.ie

Abstract

As the volume of meteorological observations continues to grow, automating the quality control (QC) process is essential for timely data delivery. This study evaluates the performance of three machine learning algorithms—autoencoder, variational autoencoder, and long short-term memory (LSTM) autoencoder—for detecting anomalies in air temperature data. Using expert-quality-controlled data as ground truth, all models demonstrated anomaly detection capability, with the LSTM outperforming others due to its ability to capture temporal patterns and minimize false positives. When applied to raw data, the LSTM achieved 99.6% accuracy in identifying valid observations and replicated 79% of manual flags, with only five false negatives and six false positives over a full year. Its sensitivity to subtle meteorological changes, such as those caused by rainfall or cloud cover, highlights its robustness. The LSTM’s performance using a three-day timestep, combined with basic QC checks in SaQC (System for Automated Quality Control), suggests a scalable and effective solution for automated QC at Met Éireann, with potential for expansion to include additional variables and multi-station generalization.
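The agreement figures quoted in the abstract come from comparing automated flags against manual flags. As a minimal sketch of how such a comparison can be tallied, the snippet below counts agreement between two flag series and derives two rates. It assumes the conventional definitions (a false positive is an automated flag with no manual flag, a false negative is the reverse); the paper's own definitions may differ. The function names and the toy flag series are hypothetical and are not taken from the study.

```python
def confusion_counts(manual, automated):
    """Count agreement between two equal-length lists of booleans (True = flagged)."""
    if len(manual) != len(automated):
        raise ValueError("flag series must have the same length")
    pairs = list(zip(manual, automated))
    return {
        "tp": sum(m and a for m, a in pairs),              # flagged by both
        "fp": sum((not m) and a for m, a in pairs),        # automated only
        "fn": sum(m and (not a) for m, a in pairs),        # manual only
        "tn": sum((not m) and (not a) for m, a in pairs),  # flagged by neither
    }


def summary_rates(counts):
    """Share of manual flags replicated, and share of valid points left unflagged."""
    flagged = counts["tp"] + counts["fn"]
    valid = counts["tn"] + counts["fp"]
    return {
        "flag_recall": counts["tp"] / flagged if flagged else float("nan"),
        "valid_accuracy": counts["tn"] / valid if valid else float("nan"),
    }


# Toy example (made-up flags, one value per observation)
manual = [False, False, True, True, False, False, True, False]
automated = [False, True, True, False, False, False, True, False]

counts = confusion_counts(manual, automated)
rates = summary_rates(counts)
print(counts)
print(rates)
```

Reporting the raw counts alongside the rates is useful when anomalies are rare, because a high accuracy on valid observations can coexist with a modest share of replicated flags.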

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Workflow diagram for proposed automated QC system.

Figure 2. Architecture of a standard autoencoder.

Figure 3. Air temperature time series of Athenry QC data (dark blue line, left axis), with red dots marking points where the mean squared error (MSE) exceeded the 99th percentile threshold for the autoencoder (AE), variational autoencoder (VAE), and long short-term memory (LSTM) autoencoder. The MSE (light blue) and the 99th percentile threshold (light red) are shown on the right axis.

Figure 4. Quality-controlled Athenry air temperature time series for 2014 (blue line) with red dots indicating points where the mean squared error exceeded the 99th percentile between the original and reconstructed data using the LSTM with different timesteps (in number of minutes).

Table 1. Performance metrics for training and testing of the models

Figure 5. Raw one-minute air temperature data for Athenry 2014 with red dots where MSE was above the 99th percentile threshold after fitting the LSTM autoencoder with 4320 timesteps (trained on quality-controlled data).

Table 2. Comparison between manual and automated flagging on raw air temperature data from Athenry in 2014

Figure 6. Examples of the rapid temperature drops (highlighted in grey), which the LSTM flagged as anomalous in the quality-controlled data.

Table 3. LSTM flagged anomalies with contributing factors and potential causes

Table 4. Mean, median, and standard deviation of the differences between air temperature sensor 1 and sensor 2 for each year at Athenry, including the combination of years used for training the LSTM (all years except 2014)

Author comment: A machine learning approach using autoencoders to perform quality control on meteorological data — R0/PR1

Comments

Dear Editor,

Please consider this application paper, “A Machine Learning Approach to Quality Control on Meteorological Data” for publication in your journal, Environmental Data Science. The automation of quality control procedures using machine learning is of great interest to the field of Meteorology and Environmental Science generally, as it not only provides high quality data faster than previously possible, but detects anomalies that human experts are unable to see. My co-authors and I hope that this paper will help further the development of these systems.

Best Regards,

Teresa K. Spohn

Review: A machine learning approach using autoencoders to perform quality control on meteorological data — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

The paper explores the use of machine learning for automated quality control (QC) of meteorological observational data, focusing on anomaly detection in air temperature. Three models are evaluated: autoencoder (AE), variational autoencoder (VAE), and LSTM autoencoder (LSTM-AE). Using a dataset previously quality-controlled by experts, the authors report that the LSTM-AE performs best, largely due to its ability to capture temporal patterns and maintain a low false positive rate.

The topic of this paper generally fits the scope of Engineering Data Science as an application paper, as it addresses the use of data-intensive methods for ensuring reliability of meteorological data, which has downstream impacts on climate studies and engineering applications. However, the paper in its current form does not fully meet the standards of novelty and rigor expected. The focus on only three related deep learning methods (all autoencoder-based) limits the methodological contribution, and the writing style often lacks academic precision. Thus, while the paper is within scope, significant revision is needed before it could be considered suitable for publication.

The paper addresses an important problem and applies relevant machine learning approaches to the quality control of meteorological observations. While the overall concept is sound, the manuscript currently lacks sufficient clarity in its methodological description, particularly regarding the anomaly detection workflow and the definition and application of thresholds. This limits the reproducibility and transparency of the study. Moreover, the evaluation is restricted to autoencoder-based methods, without consideration of other anomaly detection approaches, computational efficiency, or additional performance metrics, which constrains the methodological contribution. Overall, the study shows technical promise, but significant clarification and expansion are needed for the work to be considered scientifically sound.

The paper would benefit from a more formal academic writing style. At present, some phrases and expressions are somewhat informal, and several sentences are long, repetitive, or ambiguous, which affects readability. Several citations (e.g., at lines 56, 168, 178, and 183) appear unusual in format or usage. These should be carefully reviewed and revised to ensure consistency with academic standards. Revising these parts in a more concise and academic tone would improve the clarity and overall presentation. The length of the paper appears appropriate.

General Suggestions:

1. Introduction

Section 1.1 contains excessive exposition on the general definition and categories of machine learning, which is not necessary for the target audience of this journal. Readers are already familiar with ML basics; the focus should be on prior QC-related applications.

Section 1.3 is currently titled “QC Using Machine Learning”, but this is misleading because Section 1.2 already discusses applications of ML for QC in meteorological data. The title should be made more specific, for example “QC Using ML in Met Éireann”, to clearly distinguish it from Section 1.2.

2. Methods

The description of methods in Section 2 needs significant clarification. While the text includes details about software packages, it does not clearly explain the overall workflow of applying autoencoders for anomaly detection. A clearer methodological description is necessary for readers to follow the study design and rationale.

3. Threshold Definition

Line 304 mentions that anomalies are defined based on the “99th percentile”, but it is not clear which distribution this refers to. If it is the 99th percentile of the MSE reconstruction error, then applying it to test datasets of the same size should yield a consistent number of exceedances. However, the paper reports varying anomaly counts, which is confusing. The same issue arises in Section 3.2. Please clarify the precise procedure and rationale.
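The distinction at issue can be illustrated with a minimal sketch. It assumes the threshold is the 99th percentile of per-sample reconstruction MSE; whether the authors fix it on one dataset or recompute it per dataset is not stated in this report. All data below are synthetic and the function names are hypothetical. A threshold recomputed on each dataset flags about 1% of points by construction, whereas a threshold fixed on one dataset and reused on another yields a count that depends on the second dataset's error distribution.

```python
import random
import statistics


def percentile_threshold(errors, pct=99):
    """Return the pct-th percentile of a list of reconstruction errors."""
    # quantiles(n=100) returns the 99 cut points between percentiles
    return statistics.quantiles(errors, n=100, method="inclusive")[pct - 1]


def count_exceedances(errors, threshold):
    """Number of errors strictly above the threshold."""
    return sum(e > threshold for e in errors)


rng = random.Random(0)

# Synthetic per-sample MSE values standing in for a clean reference dataset
train_mse = [rng.gauss(0, 1) ** 2 for _ in range(10_000)]

# Same-sized synthetic dataset with 50 injected large errors (made-up anomalies)
test_mse = [rng.gauss(0, 1) ** 2 for _ in range(9_950)]
test_mse += [rng.uniform(50, 100) for _ in range(50)]

fixed_threshold = percentile_threshold(train_mse)  # set once, then reused
own_threshold = percentile_threshold(test_mse)     # recomputed on the new data

n_train = count_exceedances(train_mse, fixed_threshold)
n_fixed = count_exceedances(test_mse, fixed_threshold)
n_own = count_exceedances(test_mse, own_threshold)

print(f"reference data, fixed threshold:   {n_train}")
print(f"new data, fixed threshold:         {n_fixed}")
print(f"new data, recomputed threshold:    {n_own}")
```

Under the fixed-threshold reading, varying anomaly counts across same-sized datasets are expected; under the recomputed reading they are not, which is why the procedure needs to be stated explicitly.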

Detailed suggestions:

1. The structure of Section 1.3 is confusing. For example, in line 178 the paper cites a 2024 study as “more recently,” whereas in line 194 it cites a 2023 study as “most recently.” Consider revising the chronology and phrasing to make the timeline of prior work clearer.

2. In line 214, the text refers to “all three studies,” but it is unclear which three studies are meant, as Section 1.3 mentions more than three works. Please clarify.

3. The sentence spanning lines 256–261 is difficult to follow and should be rewritten for clarity.

4. In lines 262–263, the sequence “LSTM AE, AE, and VAE” does not match the sequence in the corresponding figure in Section 3. Please align the text with the figure for consistency.

5. Some figure elements, particularly axis labels, are too small. The x-axis appears to have sufficient space, so the text does not need to be tilted. Improving font size and orientation would enhance readability.

6. In line 240, the term “Final phase” is used to refer to the second phase of the study. However, the first phase is referred to as “First phase”, and using “Final phase” for the second phase is somewhat inconsistent and potentially confusing. It is recommended to use a parallel and consistent naming scheme for the phases, such as “Second phase” or “Phase 2”.

7. In line 335, there is a typo: “LSTEM” should be corrected to “LSTM”.

Review: A machine learning approach using autoencoders to perform quality control on meteorological data — R0/PR3

Conflict of interest statement

Reviewer declares none.

Comments

This article presents a study on three autoencoder-based model architectures combined with the SaQC library for automated quality control (QC) of one-minute air temperature data from the TUCSON Athenry station. The research builds upon similar methods previously explored by the authors' institution while expanding data coverage, investigating model parameter tuning, and analyzing model performance characteristics.

While the application to station-specific data appears novel, I have concerns regarding the positioning of this work within the broader context of machine learning applications in QC. The authors reference similar methodologies employed by ECMWF for meteorological surface and satellite data, albeit at different temporal scales. The manuscript would benefit from more explicit differentiation of their work’s unique contributions and use case compared to existing machine learning QC implementations in the meteorological domain.

Detailed Comments:

Title and Abstract

The article title should be more specific by stating “Autoencoders” rather than the generic term “Machine Learning,” which fails to convey the unique methodological approach employed in this study.

The abstract requires substantial revision as it lacks mention of key findings and conclusions from sections 3.2 onwards, resulting in an incomplete representation of the research outcomes. The abstract should incorporate the main takeaways, limitations, and shortcomings discovered throughout the study to provide readers with a comprehensive overview of the work’s contributions and constraints.

Introduction

The introduction to machine learning concepts appears unnecessary given that this manuscript targets a data science-focused journal audience. The paragraph discussing anomaly detection within this general ML section should be relocated to the section specifically addressing ML applications in QC, where it more appropriately contextualizes how the methodology applies to quality control processes.

Sections 1.2 and 1.3 require reorganization to improve logical flow and chronological structure. I suggest the following restructuring:

- Section 1.2: Focus on QC processes, their concepts, and requirements

- Section 1.3: Discuss the evolution of QC techniques and implementations by leading organizations

- Section 1.4: Address machine learning aspects and Met Éireann’s recent work as a transition to the Methods section

TitanLib is mentioned across two paragraphs in section 1.3 without adequate explanation. The manuscript should clarify what this library encompasses and its relationship to machine learning methodologies to avoid reader confusion.

Regarding the cited previous Met Éireann studies: I was unable to locate these as published works through standard academic databases. If these represent unpublished or internal studies, please clarify their publication status and indicate whether they provide baseline results that can be compared with the current study’s outcomes.

Method

The methodology section requires expanded explanation of the chosen machine learning models, including:

- Describe how each model operates. How are your features engineered into them?

- Justification for model selection

- Discussion of respective advantages, disadvantages, and underlying assumptions

- Inclusion of guiding diagrams or flowcharts to facilitate replication by other researchers.

This additional context would significantly enhance the manuscript’s methodological transparency and reproducibility.

Results

The results section would benefit from addressing several analytical gaps:

The substantial performance differences between the autoencoder (AE) and the LSTM/VAE models warrant discussion of potential underlying causes or mechanisms driving these disparities.

The relationship between Table 1 and Figure 3 is confusing and requires clarification. It is unclear whether the 678 minutes/data points flagged by the LSTM correspond to the data presented in the figure.

Mathematical consistency should be maintained throughout. The statement “Of these 21 estimates, the LSTM did not flag 16 as they were close enough to the predicted value to not exceed the threshold. The remaining 6 minutes...” contains values (16 + 6 = 22) that exceed the stated total of 21. While this may reflect rounding, I suggest adhering to consistency to avoid confusion.

Discussion

The rationale for selecting the LSTM autoencoder as the preferred model from the three architectures tested is not clearly established from the section 3.1 results. The manuscript should explicitly state whether this selection is theoretically driven by LSTM’s sequential data processing capabilities or based on empirical performance metrics.

Additionally, a comparative analysis of the three models' computational efficiency would enhance the discussion. This should include metrics such as training time, inference time, and computational resource requirements, providing practical insights for implementation considerations.
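One possible way to collect such figures is sketched below, using only the Python standard library. The helper `profile_call` and the stand-in function `train_stub` are hypothetical and only illustrate the measurement pattern; they are not part of the manuscript's code. Note that `tracemalloc` tracks memory allocated through Python only, so GPU memory or memory held by native libraries would need a separate tool.

```python
import time
import tracemalloc


def profile_call(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    finally:
        # Always record timing and stop tracing, even if fn raises
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def train_stub(n_samples):
    """Hypothetical stand-in for a model's training call."""
    return [i * 0.5 for i in range(n_samples)]


_, train_seconds, train_peak = profile_call(train_stub, 100_000)
print(f"train: {train_seconds:.4f} s, peak {train_peak / 1e6:.2f} MB")
```

Wrapping each model's training and inference calls in the same helper gives figures that are comparable across the three architectures, provided they are run on the same hardware and data.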

Recommendation: A machine learning approach using autoencoders to perform quality control on meteorological data — R0/PR4

Comments

No accompanying comment.

Decision: A machine learning approach using autoencoders to perform quality control on meteorological data — R0/PR5

Comments

No accompanying comment.

Author comment: A machine learning approach using autoencoders to perform quality control on meteorological data — R1/PR6

Comments

Dear Editors and Reviewers,

My co-authors and I thank you for your feedback and the opportunity to revise our manuscript. We found the comments very helpful and have done our best to address each one. The paper is much better now than before, and we hope it will meet with your approval for publication.

The following changes have been made to the document:

Title and Abstract:

- Title updated to “A Machine Learning Approach Using Autoencoders to Perform Quality Control on Meteorological Data”

- Abstract rewritten to include details of key findings and highlight the novelty of using autoencoders with SaQC

Introduction and Context:

- Removed general machine learning exposition

- Re-organised the remainder of section 1

- Elaborated on Titan

- Clarified publication status of prior Met Éireann studies

Methods Section:

- Expanded model descriptions, to include justification of model selection

- Added guiding diagrams and flowcharts

- Clarified anomaly threshold definition

- Explained why anomaly counts vary across datasets despite consistent thresholds

Results Section:

- Clarified relationship between Table 1 and Figure 3

- Fixed mathematical inconsistency in the 21 vs. 16 + 6 flagged minutes

- Discussed performance differences between AE, VAE, and LSTM-AE

- Aligned model order in text and figures

- Improved figure readability:

  - Increased font size of axis labels

  - Made x-axis labels horizontal

  - Ensured figures are legible and consistent

Discussion Section:

- Justified selection of LSTM-AE using theoretical rationale

- Added table of performance metrics

Writing Style and Formatting:

- Revised text for academic tone and clarity including:

  - Shortening long or ambiguous sentences

  - Removed informal phrasing

- Fixed citation formatting

- Fixed timeline inconsistencies

Best Regards,

Teresa Spohn

Postdoctoral Researcher

Met Eireann

Review: A machine learning approach using autoencoders to perform quality control on meteorological data — R1/PR7

Conflict of interest statement

Reviewer declares none.

Comments

The authors have provided a thorough and well-considered revision in response to the reviewers’ comments. The revised manuscript shows clear improvement in structure, clarity, and academic tone. The methods section now presents a more coherent workflow and better justification for model choices, with additional diagrams and explanations. The clarification of anomaly threshold definition and the explanation for varying anomaly counts across datasets effectively address previous concerns. The results and discussion are more logically connected, and figure readability has been enhanced.

Overall, the paper now demonstrates solid technical correctness and scientific soundness. The study is relevant, the methodology is clearer, and the revisions have substantially improved the quality of presentation. Some minor issues remain — the overall language could benefit from light polishing for conciseness, and a few figures or tables could still be furthe

Recommendation: A machine learning approach using autoencoders to perform quality control on meteorological data — R1/PR8

Comments

No accompanying comment.

Decision: A machine learning approach using autoencoders to perform quality control on meteorological data — R1/PR9

Comments

No accompanying comment.