Advancing interpretability of machine-learning prediction models

Published online by Cambridge University Press:  25 October 2022

Laurie Trenary*
Affiliation:
Department of Atmospheric, Oceanic, and Earth Science and Center for Ocean-Land-Atmosphere Studies, George Mason University, Fairfax, Virginia 22030, USA
Timothy DelSole
Affiliation:
Department of Atmospheric, Oceanic, and Earth Science and Center for Ocean-Land-Atmosphere Studies, George Mason University, Fairfax, Virginia 22030, USA
*Corresponding author. E-mail: ltrenary@gmu.edu

Abstract

This paper proposes an approach to diagnosing the skill of a machine-learning prediction model based on finding combinations of variables that minimize the normalized mean square error of the predictions. This technique is attractive because it compresses the positive skill of a forecast model into the smallest number of components. The resulting components can then be analyzed much like principal components, including the construction of regression maps for investigating sources of skill. The technique is illustrated with a machine-learning model of week 3–4 predictions of western US wintertime surface temperatures. The technique reveals at least two patterns of large-scale temperature variations that are skillfully predicted. The predictability of these patterns is generally consistent between climate model simulations and observations. The predictability is determined largely by sea surface temperature variations in the Pacific, particularly the region associated with the El Niño-Southern Oscillation. This result is not surprising, but the fact that it emerges naturally from the technique demonstrates that the technique can be helpful in “explaining” the source of predictability in machine-learning models.
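
The idea of finding combinations of variables that minimize the normalized mean square error can be illustrated with a minimal sketch. It assumes that the normalized mean square error of a linear combination is the ratio of its mean square forecast error to its mean square verification value, in which case the minimizing combinations follow from a generalized eigenvalue problem. The function name `skill_components`, the array layout, and the assumption that inputs are anomalies are illustrative choices, not the authors' code.

```python
import numpy as np

def skill_components(forecast, verification):
    """Sketch: find linear combinations that minimize normalized MSE.

    forecast, verification: arrays of shape (n_samples, n_variables),
    assumed here to be anomalies (time mean already removed).
    Returns (nmse, weights): nmse is sorted from most to least skillful,
    and weights[:, k] is the combination belonging to nmse[k].
    """
    error = forecast - verification
    n_samples = verification.shape[0]

    # Second-moment matrices of the error and of the verification.
    cov_err = error.T @ error / n_samples
    cov_ver = verification.T @ verification / n_samples

    # Solve cov_err q = nmse * cov_ver q by whitening with the
    # Cholesky factor L of cov_ver (cov_ver = L L^T).
    chol = np.linalg.cholesky(cov_ver)
    tmp = np.linalg.solve(chol, cov_err)           # L^{-1} cov_err
    whitened = np.linalg.solve(chol, tmp.T).T      # L^{-1} cov_err L^{-T}
    whitened = 0.5 * (whitened + whitened.T)       # enforce symmetry

    # eigh returns eigenvalues in ascending order, so the first
    # component has the smallest normalized MSE.
    nmse, vectors = np.linalg.eigh(whitened)

    # Map eigenvectors back to weights in the original variables.
    weights = np.linalg.solve(chol.T, vectors)
    return nmse, weights
```

Under this assumed definition, a component with a value below one has smaller mean square error than a forecast of zero anomaly, and projecting the data onto the columns of `weights` gives time series that can be analyzed much like principal components.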

Information

Type
Methods Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open materials
Copyright
© The Author(s), 2022. Published by Cambridge University Press
Figure 1. Multi-model normalized mean square error (NMSE, black) recovered from skill component analysis and multi-model 5% significance level (red). Significance is estimated by the Monte Carlo method using 5,000 iterations. Analysis is performed using independently sampled data (once per winter) over the entire multi-model record. A mode is considered significant if its NMSE is less than one.

Figure 2. Patterns for the (a) 1st and (b) 2nd leading skill components recovered from multi-model CMIP6 data. Predictions are made by the same CMIP6-single-task model.

Figure 3. Patterns for the 1st skill component for (a) observations and (b, c) two different randomly selected 19-year segments of data from the GFDL-ESM4 model.

Figure 4. Correlation between the prediction and verification data for the (a) 1st and (b) 2nd leading skill components. These correlations are found by projecting both prediction and verification data onto the leading eigenvectors recovered from skill component analysis (SCA) and correlating the resulting time series. Predictions for CMIP6 and observations are made by the same CMIP6-single-task forecast model. The black vertical bars show the 5th–95th percentile range of correlations for predictions within the specified CMIP6 model. The individual CMIP6 models are sampled to have the same number of years as the observations (19 years). The black asterisk denotes the mean correlation. For observation-based estimates, the time series of the SCA components are found by projecting the observational forecast and verification data onto the CMIP6-derived eigenvectors. The correlation for predictions using observational data for 2000–2018 is shown as the dashed line. The skill component analysis was performed using 50 Laplacian time series.
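
The projection-and-correlation step described in this caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names, the argument `weights` (a hypothetical eigenvector expressed in the same variables as the data), and the choice to sample contiguous, randomly placed 19-year segments are all assumptions, since the caption does not say how the segments are drawn.

```python
import numpy as np

def component_correlation(forecast, verification, weights):
    """Project both data sets onto one weight vector and correlate.

    forecast, verification: arrays of shape (n_years, n_variables).
    weights: array of shape (n_variables,), a hypothetical eigenvector.
    """
    forecast_ts = forecast @ weights
    verification_ts = verification @ weights
    return float(np.corrcoef(forecast_ts, verification_ts)[0, 1])

def segment_correlation_range(forecast, verification, weights,
                              n_years=19, n_draws=1000, seed=0):
    """Correlation spread over randomly placed segments of n_years.

    Assumption: segments are contiguous and start at a uniformly
    random year. Returns ((5th, 95th) percentiles, mean correlation).
    """
    rng = np.random.default_rng(seed)
    n_total = forecast.shape[0]
    corrs = np.empty(n_draws)
    for i in range(n_draws):
        # The upper bound is exclusive, so the segment always fits.
        start = rng.integers(0, n_total - n_years + 1)
        seg = slice(start, start + n_years)
        corrs[i] = component_correlation(
            forecast[seg], verification[seg], weights)
    low, high = np.percentile(corrs, [5, 95])
    return (float(low), float(high)), float(corrs.mean())
```

Sampling the longer model record down to the length of the observational record makes the model-based spread directly comparable with the single correlation available from 19 years of observations.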

Figure 5. Regression of the leading skill component derived from the CMIP6 single-task models onto sea surface temperature from (a) multi-model CMIP6 data and (b) observations.