Neural representation of the stratospheric ozone chemistry

Published online by Cambridge University Press:  04 December 2023

Helge Mohn*
Affiliation:
Climate Sciences | Atmospheric Physics, Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany; Center for Industrial Mathematics, University of Bremen, Bremen, Germany
Daniel Kreyling
Affiliation:
Climate Sciences | Atmospheric Physics, Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany
Ingo Wohltmann
Affiliation:
Climate Sciences | Atmospheric Physics, Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany
Ralph Lehmann
Affiliation:
Climate Sciences | Atmospheric Physics, Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany
Peter Maass
Affiliation:
Center for Industrial Mathematics, University of Bremen, Bremen, Germany
Markus Rex
Affiliation:
Climate Sciences | Atmospheric Physics, Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany
*Corresponding author: Helge Mohn; Email: helge.mohn@awi.de

Abstract

In climate modeling, the stratospheric ozone layer is typically considered only in a highly simplified form due to computational constraints. For climate projections, it would be advantageous to include the mutual interactions between stratospheric ozone, temperature, and atmospheric dynamics to accurately represent radiative forcing. The overarching goal of our research is to replace the ozone layer in climate models with a machine-learned neural representation of the stratospheric ozone chemistry that allows for a particularly fast, yet accurate and stable simulation. We created a benchmark data set from pairs of input and output variables that we stored from simulations of the ATLAS Chemistry and Transport Model. We analyzed several variants of multilayer perceptrons suitable for physical problems to learn a neural representation of a function that predicts 24-h ozone tendencies based on input variables. We performed a comprehensive hyperparameter optimization of the multilayer perceptron using Bayesian search and Hyperband early stopping. We validated our model by replacing the full chemistry module of ATLAS and comparing computation time, accuracy, and stability. We found that our model was faster than the full chemistry module by a factor of 700. Within a 2-year simulation run, the accuracy of our model compares favorably to the full chemistry module; the model also outperforms a previous polynomial approach for fast ozone chemistry and reproduces seasonality well in both hemispheres. In conclusion, the neural representation of stratospheric ozone chemistry resulted in a simulated ozone layer that showed high accuracy, a significant speed-up, and stability in a long-term simulation.

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Figure 1. A vision for a potential feedback loop: AI surrogate models could allow decision-makers to base their actions on more reliable forecasts of Earth’s climate.

Figure 2. Prediction step that employs Neural-SWIFT’s MLPs. Here, $ t $: a 24-h time step in model time, $ {X}_t $: parameters describing the state of one air parcel for this time step, $ {X}_t^{\mathrm{Ozone}} $: ozone volume mixing ratio of this air parcel at this time step, $ \Delta {X}_t^{\mathrm{Ozone}} $: 24-h ozone tendency calculated by the MLP for this air parcel and time step, and $ {X}_{t+24\mathrm{h}}^{\mathrm{Ozone}} $: updated ozone volume mixing ratio of this air parcel for the next time step $ t+24\mathrm{h} $.
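
The caption of Figure 2 suggests an additive update of the form $ {X}_{t+24\mathrm{h}}^{\mathrm{Ozone}}={X}_t^{\mathrm{Ozone}}+\Delta {X}_t^{\mathrm{Ozone}} $. The following is a minimal sketch of such a step, not the authors' implementation. The function names, the array shapes, and the toy tendency function standing in for the trained MLP are all assumptions made for illustration.

```python
import numpy as np

def predict_step(state, ozone, tendency_model):
    """Advance ozone by one 24-h step, assuming an additive update.

    state:  (n_parcels, n_features) parameters describing each air parcel
    ozone:  (n_parcels,) ozone volume mixing ratio at time t
    tendency_model: callable returning the 24-h ozone tendency per parcel
    """
    delta_ozone = tendency_model(state, ozone)  # 24-h tendency
    return ozone + delta_ozone                  # value at t + 24 h

def toy_tendency(state, ozone):
    """Hypothetical stand-in for the MLP: a simple 10 % relaxation."""
    return -0.1 * ozone

state = np.zeros((3, 4))            # placeholder parcel parameters
ozone = np.array([1.0, 2.0, 3.0])   # placeholder mixing ratios
next_ozone = predict_step(state, ozone, toy_tendency)
```

In a simulation, this step would be applied once per model day to every air parcel, with the output of one step serving as the ozone input of the next.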

Figure 3. Schematic of Neural-SWIFT’s machine learning pipeline.

Figure 4. Comparison of two MLP architectures that employ different input functions in each node of the hidden layers ($ 1,\dots, L $): (Architecture 1) a linear input function and (Architecture 2) a quadratic residual input function. The “General” scheme (top) represents a complete MLP, while the two schemes below it each show a single hidden layer. Here, $ {x}_{\mathrm{in}} $: input vector of activations of the previous layer, $ W $: weight matrix, $ b $: bias vector, $ \sigma $: activation function, $ {x}_{\mathrm{out}} $: output vector of activations of this layer, and $ {\mathcal{N}}_{\Theta}(X) $: neural network output.
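
To make the difference between the two input functions concrete, here is a hedged sketch of a single hidden layer of each kind. The linear layer follows directly from the symbols in the caption. The quadratic residual layer assumes the commonly used form $ \sigma \left(\left({W}_2{x}_{\mathrm{in}}\right)\circ \left({W}_1{x}_{\mathrm{in}}\right)+{W}_1{x}_{\mathrm{in}}+b\right) $, which may differ in detail from the layer drawn in the figure; the names `W1` and `W2` and the choice of `tanh` as activation are illustrative.

```python
import numpy as np

def linear_layer(x_in, W, b, sigma=np.tanh):
    """Architecture 1: x_out = sigma(W x_in + b)."""
    return sigma(W @ x_in + b)

def quadratic_residual_layer(x_in, W1, W2, b, sigma=np.tanh):
    """Architecture 2 (assumed form): linear term plus an elementwise
    product of two linear projections of the same input."""
    linear_term = W1 @ x_in
    quadratic_term = (W2 @ x_in) * linear_term
    return sigma(quadratic_term + linear_term + b)

rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(5, 4))
b = rng.normal(size=5)

out_linear = linear_layer(x, W1, b)
out_quadratic = quadratic_residual_layer(x, W1, W2, b)
```

Under this assumed form, setting `W2` to zero recovers the linear layer, which is why the quadratic term can be read as a residual on top of the standard input function.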

Table 1. Comparing four different architectures of multilayer perceptrons

Figure 5. Comparison of four multilayer perceptron architectures: residuals of the cost function (horizontal axis) on the full-year testing data (normalized values). For each variant, nine network sizes were tested (number of layers: {3, 5, 7}; number of neurons per layer: {256, 512, 768}) to minimize the effect of network size on the comparison.
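
The nine network sizes per variant are the Cartesian product of the two value sets given in the caption. A short sketch of how such a grid could be enumerated follows; the dictionary keys are hypothetical names, not taken from the paper.

```python
from itertools import product

layer_counts = (3, 5, 7)          # number of layers, from the caption
neuron_counts = (256, 512, 768)   # neurons per layer, from the caption

# One configuration per (layers, neurons) pair: 3 x 3 = 9 network sizes.
network_sizes = [
    {"n_layers": n_layers, "n_neurons": n_neurons}
    for n_layers, n_neurons in product(layer_counts, neuron_counts)
]
```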

Figure 6. Results of the sensitivity analysis. Different sets of input variables (left) were each used to train an MLP. The residuals with respect to the normalized testing data of the whole year are shown (see cost function in equation (2)). The architecture and training setup were the same for all models, and training data from all twelve months were used (number of layers: 6, number of neurons per layer: 733, $ {\omega}_{L1} $: 6, $ \omega $: 4). (Orange) set used by Kreyling et al. (2018), (green) Neural-SWIFT’s choice of input variables, and (blue) other sets.

Table 2. Results of the hyperparameter search

Table 3. Selected input and output variables

Figure 7. Schematic of the implementation of Neural-SWIFT in ATLAS or climate models.

Table 4. Computation time

Figure 8. Monthly means (April 2000) of the (a) stratospheric ozone column and (b) zonal mean stratospheric ozone volume mixing ratios after 18 months of simulation. The binning used 1° latitude–longitude bins for (a) and zonal means in bins of 1,000 m pressure altitude and 5° equivalent latitude for (b). Only the bins in which Neural-SWIFT was applied are shown.

Figure 9. Spatial pattern of the standard deviation, in Dobson Units (DU), of the time series at each location over the 2-year simulation period. The binning used 1° latitude–longitude bins. (Top) Our method Neural-SWIFT, (middle) full chemistry module, and (bottom) $ \left[\mathrm{Neural}\hbox{-} \mathrm{SWIFT}\right]-\left[\mathrm{Full}\ \mathrm{chemistry}\right] $.

Figure 10. Comparison to polynomial SWIFT. The figure depicts the daily evolution of the mean absolute differences between the full chemistry module and two methods, polynomial SWIFT (gray) and Neural-SWIFT (black). Three variables are presented: (a) stratospheric ozone column, (b) ozone volume mixing ratio, and (c) 24-h ozone tendency. The differences were calculated by first binning the data (compare Figure 8), using 1° latitude–longitude bins for (a) and bins of 1,000 m pressure altitude and 5° equivalent latitude for (b) and (c). Subsequently, the daily mean of the absolute differences was calculated, with the bins weighted by their surface area (see equation (4)). Note that the mean score does not include bins within the polar vortex (polar SWIFT module).
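
The daily score described in this caption is an area-weighted mean of absolute differences over the valid bins. Equation (4) is not reproduced here, so the sketch below assumes the usual weighting of latitude–longitude bins by the cosine of the bin-center latitude; the function name, the argument names, and the boolean mask for excluded bins (for example, polar-vortex bins) are illustrative.

```python
import numpy as np

def area_weighted_mae(model, reference, lat_centers_deg, include=None):
    """Area-weighted mean absolute difference on a latitude-longitude grid.

    model, reference: (n_lat, n_lon) binned fields for one day
    lat_centers_deg:  (n_lat,) latitude of each bin center in degrees
    include:          optional (n_lat, n_lon) boolean mask of bins to keep
    """
    # Assumed weighting: bin area is proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat_centers_deg))[:, None] * np.ones_like(model, dtype=float)
    abs_diff = np.abs(model - reference)
    valid = np.isfinite(abs_diff)        # skip empty (NaN) bins
    if include is not None:
        valid &= include                 # skip excluded bins
    return np.sum(weights[valid] * abs_diff[valid]) / np.sum(weights[valid])

lats = np.array([-60.0, 0.0, 60.0])
reference = np.full((3, 4), 300.0)       # placeholder columns in DU
model = reference + 2.0                  # uniform offset of 2 DU
score = area_weighted_mae(model, reference, lats)
```

With a uniform offset, the weighted mean equals the offset regardless of the weights, which gives a quick sanity check for the normalization.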

Table 5. Error metrics of Figure 10

Figure 11. Time evolution of error-Q (black line) and its standard deviation (gray), calculated as defined in equation (5). The binning used 1,000 m pressure altitude and 5° equivalent latitude. The bins were weighted according to their surface area (equation (4)). Bins of the polar vortex (polar SWIFT module) are not included in the mean score.

Figure 12. (First row) Shown are zonal mean values (Binning: 3° equivalent latitude) of stratospheric ozone columns in Dobson Units (DU) over time of the results of a 2-year simulation using the novel artificial neural networks of Neural-SWIFT. The areas which also covered the polar vortex were removed and are shown in gray. The results are evaluated by difference plots (comparing to a simulation run that used the full chemistry module of ATLAS): (second row) $ \left[\mathrm{Neural}\hbox{-} \mathrm{SWIFT}\right]-\left[\mathrm{Full}\ \mathrm{chemistry}\right] $ and (third row) $ \frac{\left[\mathrm{Neural}\hbox{-} \mathrm{SWIFT}\right]-\left[\mathrm{Full}\ \mathrm{chemistry}\right]}{\left[\mathrm{Full}\ \mathrm{chemistry}\right]} $.

Figure A1. Monthly zonal mean stratospheric ozone volume mixing ratios (parts per million) from January to June 1999 are shown.

Figure A2. Monthly zonal mean stratospheric ozone volume mixing ratios (parts per million) from July to December 1999 are shown.

Figure A3. Monthly zonal mean stratospheric ozone volume mixing ratios (parts per million) from January to June 2000 are shown.

Figure A4. Monthly zonal mean stratospheric ozone volume mixing ratios (parts per million) from July to December 2000 are shown.

Figure B1. (First row) Shown are zonal mean values (Binning: 3° equivalent latitude) of stratospheric ozone columns in Dobson Units (DU) over time of the results of a 2-year simulation using the previous polynomial approach of SWIFT. The areas which also covered the polar vortex were removed and are shown in gray. The results are evaluated by difference plots (comparing to a simulation run that used the full chemistry module of ATLAS): (second row) $ \left[\mathrm{Polynomial}\ \mathrm{SWIFT}\right]-\left[\mathrm{Full}\ \mathrm{chemistry}\right] $ and (third row) $ \frac{\left[\mathrm{Polynomial}\ \mathrm{SWIFT}\right]-\left[\mathrm{Full}\ \mathrm{chemistry}\right]}{\left[\mathrm{Full}\ \mathrm{chemistry}\right]} $.

Figure C1. Search for the number of layers and number of neurons per layer. The color scale shows the result of the cost function with respect to the normalized testing data.

Figure C2. Learning rate and mini-batch size.

Figure C3. SIREN-specific hyperparameters (see Table 1): omega of the first layer and omega of the other layers.
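
For readers unfamiliar with the two omega hyperparameters, the sketch below shows where they would enter a sinusoidal (SIREN-style) network, assuming the standard formulation in which each hidden layer computes $ \sin \left(\omega \left(W{x}_{\mathrm{in}}+b\right)\right) $ with one frequency factor for the first layer and another for the remaining hidden layers. The default values 6 and 4 are the ones quoted in the caption of Figure 6; the layer sizes, the random weights, and the function names are placeholders, and no SIREN-specific weight initialization is attempted.

```python
import numpy as np

def siren_layer(x_in, W, b, omega):
    """One sinusoidal layer: sin(omega * (W x_in + b))."""
    return np.sin(omega * (W @ x_in + b))

def siren_forward(x, params, omega_first=6.0, omega_hidden=4.0):
    """Forward pass: sinusoidal hidden layers, linear output layer.

    params: list of (W, b) pairs; the last pair is the output layer.
    """
    h = x
    for i, (W, b) in enumerate(params[:-1]):
        omega = omega_first if i == 0 else omega_hidden
        h = siren_layer(h, W, b, omega)
    W_out, b_out = params[-1]
    return W_out @ h + b_out

rng = np.random.default_rng(0)
params = [
    (rng.normal(size=(8, 4)), np.zeros(8)),   # first hidden layer
    (rng.normal(size=(8, 8)), np.zeros(8)),   # second hidden layer
    (rng.normal(size=(1, 8)), np.zeros(1)),   # linear output layer
]
output = siren_forward(rng.normal(size=4), params)
```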