
Can machine learning models trained using atmospheric simulation data be applied to observation data?

Subject: Earth and Environmental Science

Published online by Cambridge University Press:  24 February 2022

Daisuke Matsuoka*
Affiliation:
Research Institute for Value-Added-Information Generation (VAiG), Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Yokohama, Japan
*Corresponding author. Email: daisuke@jamstec.go.jp

Abstract

Atmospheric simulation data offer richer information than observational data in terms of spatiotemporal resolution, spatial dimension, and the number of physical quantities; however, such simulations do not perfectly reproduce real atmospheric conditions. In addition, the abundance of simulation data is an advantage for machine-learning-based image classification in atmospheric science. In this study, a machine learning model for tropical cyclone detection trained on simulation data was applied to satellite observation data. The classification performance on the observation data was significantly lower than that obtained on the simulation data. Owing to the large gap between the simulation and observation data, a practically useful classification model could not be trained on the simulation data alone. Thus, the representation capability of the simulation data must be analyzed and brought closer to that of the observation data before such models can be applied to real problems.

Information

Type
Research Article
Information
Result type: Negative result
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. Examples of (a) observation and (b) simulation data; left: TCs; right: nonTCs.


Table 1. Numbers of positive and negative examples in training and test data


Figure 2. Classification performance of ObsCNN and SimCNN: (a) precision–recall curve and (b) recall for each category.


Figure 3. Visualization results of vital regions in the CNN trained on (a) observation data (ObsCNN) and (b) simulation data (SimCNN).


Figure 4. Dimension reduction and two-dimensional projection applied to observed and simulated TCs (only TY2 and TY3) using (a) UMAP and (b) t-SNE.

Reviewing editor:  Jacob Carley NOAA Center for Weather and Climate Prediction, NCEP/Environmental Modeling Center, 5830 University Research Cour, College Park, Maryland, United States, 20740
This article has been accepted because it is deemed scientifically sound, has appropriate controls and methodology, and is statistically valid; it was sent for additional statistical evaluation and met the required revisions.

Review 1: Can machine learning models trained using atmospheric simulation data be applied to observation data?

Conflict of interest statement

Reviewer declares none

Comments

Comments to the Author: This manuscript investigates and compares a deep-learning approach to tropical cyclone detection trained on either satellite observations or model output. The authors found that using model output as the training dataset to detect TCs is less accurate than using satellite observations. The manuscript is generally well written but needs some further clarification.

Major comments:

1. Are the criteria for detecting/identifying TCs the same for the satellite observations and the NICAM model output? The 10-m maximum wind can be used in a model to define a TC, while the Dvorak technique is commonly used to determine TC categories from cloud patterns in satellite visible/IR images. If the criteria differ, how would the interpretation of the results be affected?

2. Fig. 2b shows similar recall for SimCNN and ObsCNN for strong typhoons (TY3), but Fig. 3b shows a much more off-center pattern for SimCNN for TY3 than ObsCNN shows in Fig. 3a. How do the authors explain this contradiction?

Minor comments

1. Table 1: the numbers of positive and negative cases in the training dataset are always the same for both the model and the observations. Is this a requirement or just a coincidence?

2. Page 3: the notation for recall/precision. "TN denotes true negative" should be "FN denotes false negative".
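The reviewer's correction follows from the standard metric definitions: recall uses false negatives in its denominator, not true negatives. A minimal sketch (illustrative only, not the authors' code; the function names and example counts are hypothetical):

```python
# Standard definitions of the two metrics referenced in the comment.
# TP: true positives, FP: false positives, FN: false negatives.
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are detected: TP / (TP + FN)."""
    return tp / (tp + fn)

# Example: 80 detected TCs are correct, 20 detections are spurious,
# and 40 real TCs are missed.
print(precision(80, 20))  # 0.8
print(recall(80, 40))     # 0.666...
```

Note that the true-negative count appears in neither formula, which is why "TN denotes true negative" must be a typo for "FN denotes false negative".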

3. Page 4: the third line from the bottom reads "in both the TC categories". Which two categories does "both" refer to?

Presentation

Overall score 2.7 out of 5
Is the article written in clear and proper English? (30%)
3 out of 5
Is the data presented in the most useful manner? (40%)
3 out of 5
Does the paper cite relevant and related articles appropriately? (30%)
2 out of 5

Context

Overall score 3.5 out of 5
Does the title suitably represent the article? (25%)
4 out of 5
Does the abstract correctly embody the content of the article? (25%)
4 out of 5
Does the introduction give appropriate context? (25%)
3 out of 5
Is the objective of the experiment clearly defined? (25%)
3 out of 5

Analysis

Overall score 3 out of 5
Does the discussion adequately interpret the results presented? (40%)
3 out of 5
Is the conclusion consistent with the results and discussion? (40%)
3 out of 5
Are the limitations of the experiment as well as the contributions of the experiment clearly outlined? (20%)
3 out of 5

Review 2: Can machine learning models trained using atmospheric simulation data be applied to observation data?

Conflict of interest statement

Reviewer declares none.

Comments

Comments to the Author: I think more depth of discussion is needed:

(1) As in Fig. 2(a), what is the classification performance of ObsCNN when the test SimData is given?

(2) Comparing TY2 and TY3 in Fig. 3(c) indicates that the typhoon's eye in SimData seemed not to work as a feature related to the category, while that in ObsData did.

In other words, we can infer that the distribution is different between the SimData and ObsData.

To show the above clearly, I suggest using t-SNE or UMAP to visualize each category distribution of the SimData and ObsData, respectively.
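The suggested visualization could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the random arrays stand in for high-dimensional feature vectors (e.g., flattened images or CNN activations) of SimData and ObsData, and all sizes are hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-ins for high-dimensional feature vectors of each dataset; in
# practice these would come from the simulated and observed TC samples.
sim_feats = rng.normal(loc=0.0, scale=1.0, size=(200, 64))  # SimData
obs_feats = rng.normal(loc=0.5, scale=1.0, size=(200, 64))  # ObsData

feats = np.vstack([sim_feats, obs_feats])
labels = np.array(["Sim"] * 200 + ["Obs"] * 200)  # for coloring a scatter plot

# Project to two dimensions; perplexity must be smaller than the sample count.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(feats)
print(emb.shape)  # (400, 2)
```

Plotting `emb` colored by `labels` (and by TC category) would show whether the simulated and observed distributions overlap. The `umap-learn` package's `umap.UMAP` class exposes the same `fit_transform` interface, so swapping UMAP for t-SNE is a one-line change.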

Presentation

Overall score 4 out of 5
Is the article written in clear and proper English? (30%)
4 out of 5
Is the data presented in the most useful manner? (40%)
4 out of 5
Does the paper cite relevant and related articles appropriately? (30%)
4 out of 5

Context

Overall score 4 out of 5
Does the title suitably represent the article? (25%)
4 out of 5
Does the abstract correctly embody the content of the article? (25%)
4 out of 5
Does the introduction give appropriate context? (25%)
4 out of 5
Is the objective of the experiment clearly defined? (25%)
4 out of 5

Analysis

Overall score 3 out of 5
Does the discussion adequately interpret the results presented? (40%)
2 out of 5
Is the conclusion consistent with the results and discussion? (40%)
4 out of 5
Are the limitations of the experiment as well as the contributions of the experiment clearly outlined? (20%)
3 out of 5