Understanding cirrus clouds using explainable machine learning

Published online by Cambridge University Press: 04 July 2023

Kai Jeggle*
Affiliation:
Institute for Atmospheric and Climate Science, ETH Zurich, Zurich, Switzerland
David Neubauer
Affiliation:
Institute for Atmospheric and Climate Science, ETH Zurich, Zurich, Switzerland
Gustau Camps-Valls
Affiliation:
Image Processing Laboratory, Universitat de València, València, Spain
Ulrike Lohmann
Affiliation:
Institute for Atmospheric and Climate Science, ETH Zurich, Zurich, Switzerland
*Corresponding author: Kai Jeggle; Email: kai.jeggle@env.ethz.ch

Abstract

Cirrus clouds are key modulators of Earth’s climate. Their dependencies on meteorological and aerosol conditions are among the largest uncertainties in global climate models. This work uses 3 years of satellite and reanalysis data to study the link between cirrus drivers and cloud properties. We use a gradient-boosted machine learning model and a long short-term memory (LSTM) network with an attention layer to predict the ice water content ($ \mathrm{IWC} $) and ice crystal number concentration ($ {N}_i $). The models show that meteorological and aerosol conditions can predict cirrus properties with $ {R}^2=0.49 $. Feature attributions are calculated with SHapley Additive exPlanations (SHAP) to quantify the link between meteorological and aerosol conditions and cirrus properties. For instance, the minimum concentration of supermicron-sized dust particles required to cause a decrease in ice crystal number concentration predictions is $ 2\times {10}^{-4} $ mg/m$ {}^3 $. The last 15 hr before the observation predict all cirrus properties.

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Figure 1. Illustrative example of data for 1 hr. DARDAR-Nice satellite observations along the satellite overpass are displayed as gray dots. A vertical profile of 56 layers, each 300 m thick, is available for each satellite observation. To keep the diagram uncluttered, only one vertical level is shown. Red crosses mark observations containing cirrus clouds. Blue dashes represent the corresponding 48 hr of backtrajectories calculated for each cirrus observation. Along each trajectory, meteorological and aerosol variables are traced.

Table 1. Summary of data used in this study.

Figure 2. Architecture of the LSTM-based model with integrated attention method to predict $ \mathrm{IWC} $ and $ {N}_i $ of cirrus clouds with the temporal data set. ReLU, rectified linear unit.

Table 2. Performance metrics evaluated on independent test data for the regression task of predicting cirrus cloud microphysical properties (CMP) using meteorological and aerosol variables.

Figure 3. XAI evaluation metrics for SHAP, LIME, and a random baseline feature attribution of the XGBoost model. For the stability metrics RIS (a) and ROS (b), lower values indicate more stable explanations, that is, more robust toward small changes of the feature values. Estimated faithfulness (c) indicates whether features with high importance attributed by the feature attribution method are important for the prediction performance, where 1 denotes perfect estimated faithfulness.

Figure 4. Mean absolute SHAP values of each feature for $ \mathrm{IWC} $ (a) and $ {N}_i $ (b). The higher the value, the higher the contribution to the prediction, that is, the more important the feature.
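
The ranking described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the study applies the SHAP library to an XGBoost model, whereas the example below computes exact Shapley values by brute-force coalition enumeration for a hypothetical three-feature toy model, and then averages their absolute values over a few samples. The names `toy_model`, `baseline`, and `samples`, and the use of a single baseline sample to represent "absent" features, are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch, not a statement about the paper).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight of a coalition with `size` members
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def mean_abs_attribution(predict, samples, baseline):
    """Mean absolute Shapley value per feature across all samples."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for i, value in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(value)
    return [t / len(samples) for t in totals]


def toy_model(features):
    """Hypothetical stand-in for a trained regressor."""
    temperature, dust, updraft = features
    return 2.0 * temperature - 1.0 * dust + 0.5 * temperature * updraft


baseline = [0.0, 0.0, 0.0]
samples = [[1.0, 1.0, 2.0], [-1.0, 0.5, 0.0], [0.5, 2.0, 1.0]]
importance = mean_abs_attribution(toy_model, samples, baseline)
print(importance)  # one non-negative importance score per feature
```

For each sample, the attributions sum to the difference between the model prediction and the baseline prediction, which is why the mean absolute value can be read as the average size of a feature's contribution. Brute-force enumeration scales exponentially with the number of features, so it is only practical for toy cases like this one.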

Figure 5. (a–f) Partial SHAP dependence plots for representative features showing the mean SHAP value per feature value for $ \mathrm{IWC} $ in blue color and $ {N}_i $ in red color as solid lines. The shaded area represents the standard deviation. Additionally, the marginal distribution of the feature is displayed in gray above each plot, and the marginal SHAP value distributions are displayed on the right of each plot. Note that the absolute SHAP values of $ \mathrm{IWC} $ and $ {N}_i $ are not directly comparable as the SHAP value indicates the contribution of a feature to the prediction value, and the two variables have different yet similar distributions. The plots show only feature values with at least 5,000 occurrences in the test data set. (g) Mean of attention weight per timestep. Attention weights are learned during the training process of the LSTM model and represent the relative importance of the input data per timestep.
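
The quantity in panel (g) can be sketched as follows, assuming a softmax attention in which each sample's per-timestep scores are normalized to weights that sum to 1 and the weights are then averaged over samples. The exact attention formulation of the LSTM model is not given in this section, so the functions and the `score_rows` values below are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_attention_per_timestep(score_rows):
    """Average the attention weights over samples, one value per timestep."""
    weights = [softmax(row) for row in score_rows]
    n_steps = len(score_rows[0])
    return [sum(w[t] for w in weights) / len(weights) for t in range(n_steps)]


# Hypothetical raw scores: two samples, three timesteps each
score_rows = [[0.0, 1.0, 2.0], [0.5, 0.5, 3.0]]
mean_weights = mean_attention_per_timestep(score_rows)
print(mean_weights)
```

Because every row of weights sums to 1, the averaged weights also sum to 1 and can be read as the relative importance of each timestep.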

Supplementary material: PDF

Jeggle et al. supplementary material