
Earth System Data Cubes: Avenues for advancing Earth system research

Published online by Cambridge University Press:  02 January 2025

David Montero*
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103, Leipzig, Germany
Guido Kraemer
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany
Anca Anghelea
Affiliation:
European Space Research Institute (ESRIN), European Space Agency (ESA), 00044, Frascati, Italy
César Aybar
Affiliation:
Image Processing Laboratory, Universitat de València, 46980, València, Spain; Water Competence Center (CCA), 15086, Lima, Perú
Gunnar Brandt
Affiliation:
Brockmann Consult GmbH, 21029, Hamburg, Germany
Gustau Camps-Valls
Affiliation:
Image Processing Laboratory, Universitat de València, 46980, València, Spain
Felix Cremer
Affiliation:
Max Planck Institute for Biogeochemistry, 07745, Jena, Germany
Ida Flik
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany
Fabian Gans
Affiliation:
Max Planck Institute for Biogeochemistry, 07745, Jena, Germany
Sarah Habershon
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany
Chaonan Ji
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany
Teja Kattenborn
Affiliation:
Sensor-based Geoinformatics, University of Freiburg, 79106, Freiburg, Germany
Laura Martínez-Ferrer
Affiliation:
Image Processing Laboratory, Universitat de València, 46980, València, Spain
Francesco Martinuzzi
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany; Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Leipzig University, 04105, Leipzig, Germany
Martin Reinhardt
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany
Maximilian Söchting
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany; Image and Signal Processing Group, Leipzig University, 04109, Leipzig, Germany
Khalil Teber
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany
Miguel D. Mahecha
Affiliation:
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, 04103, Leipzig, Germany; Institute for Earth System Science & Remote Sensing, Leipzig University, 04103, Leipzig, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103, Leipzig, Germany; Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Leipzig University, 04105, Leipzig, Germany; Department of Remote Sensing, Helmholtz Centre for Environmental Research (UFZ), 04318, Leipzig, Germany
*Corresponding author: David Montero; Email: david.montero@uni-leipzig.de

Abstract

Recent advancements in Earth system science have been marked by the exponential increase in the availability of diverse, multivariate datasets characterised by moderate to high spatio-temporal resolutions. Earth System Data Cubes (ESDCs) have emerged as one suitable solution for transforming this flood of data into a simple yet robust data structure. ESDCs achieve this by organising data into an analysis-ready format aligned with a spatio-temporal grid, facilitating user-friendly analysis and diminishing the need for extensive technical data processing knowledge. Despite these significant benefits, the completion of the entire ESDC life cycle remains a challenging task. Obstacles are not only of a technical nature but also relate to domain-specific problems in Earth system research. There exist barriers to realising the full potential of data collections in light of novel cloud-based technologies, particularly in curating data tailored for specific application domains. These include transforming data to conform to a spatio-temporal grid with minimum distortions and managing complexities such as spatio-temporal autocorrelation issues. Addressing these challenges is pivotal for the effective application of Artificial Intelligence (AI) approaches. Furthermore, adhering to open science principles for data dissemination, reproducibility, visualisation, and reuse is crucial for fostering sustainable research. Overcoming these challenges offers a substantial opportunity to advance data-driven Earth system research, unlocking the full potential of an integrated, multidimensional view of Earth system processes. This is particularly true when such research is coupled with innovative research paradigms and technological progress.

Information

Type
Survey Paper
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. Representations of different storage systems for gridded data in Earth system research: Image collections (left), information-preserving data cubes (centre), and Earth system data cubes (ESDCs, right). Differences in these abstract representations have deep implications for data storage systems, accessibility, interoperability and metadata definitions.

Figure 2. ESDC life cycle. The inner circle represents data processing tasks, and the outer circles represent ancillary tasks that run parallel to the processing steps, involving activities such as data exploration, visualisation, dissemination, and metadata generation. The outermost circle of the diagram illustrates the readiness level of the processed ESDCs at specific points within the cycle.

Figure 3. Abstract representation illustrating the connection between three Earth system variables in a hARDC+ (from top to bottom: anomalies in air temperature, soil moisture, and gross primary production). The arrows illustrate the interactions that can be modelled, e.g., predictive modelling (top to bottom) or interpretation (bottom to top), depending on the use case of interest.

Figure 4. Abstract representation illustrating the process of sampling high-resolution mini cubes for further analysis by considering vegetation land covers and extreme events detected via a global ESDC. Note that sample mini cubes are specified in the spatial and temporal ranges of the detected extreme events (also considering their occurrence).
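
The sampling idea in Figure 4 can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the function name `sample_minicube_bounds`, the `half_width` parameter, and the toy masks are hypothetical. It assumes that extreme events are available as a boolean cube with dimensions (time, lat, lon) on the coarse global grid, and that vegetation land cover is a boolean (lat, lon) mask on the same grid.

```python
import numpy as np

def sample_minicube_bounds(extreme_mask, vegetation_mask, n_samples,
                           half_width=1, seed=0):
    """Sample index bounds of mini cubes around extreme events on vegetated cells.

    extreme_mask: bool array (time, lat, lon), True where an extreme was detected.
    vegetation_mask: bool array (lat, lon), True for vegetated land cover.
    half_width: hypothetical spatial half-size of a mini cube, in grid cells.
    """
    n_lat, n_lon = vegetation_mask.shape
    # Candidate cells: vegetated and hit by at least one extreme event
    cells = np.argwhere(extreme_mask.any(axis=0) & vegetation_mask)
    if len(cells) == 0:
        return []
    rng = np.random.default_rng(seed)
    n = min(n_samples, len(cells))
    picked = cells[rng.choice(len(cells), size=n, replace=False)]
    bounds = []
    for i, j in picked:
        # Temporal range: first to last time step flagged as extreme at this cell
        steps = np.flatnonzero(extreme_mask[:, i, j])
        bounds.append({
            "time": (int(steps.min()), int(steps.max())),
            "lat": (int(max(i - half_width, 0)), int(min(i + half_width, n_lat - 1))),
            "lon": (int(max(j - half_width, 0)), int(min(j + half_width, n_lon - 1))),
        })
    return bounds

# Toy example: two events on vegetated cells, one on a non-vegetated cell
extreme = np.zeros((12, 6, 8), dtype=bool)
extreme[3:6, 2, 1] = True   # vegetated cell
extreme[7:9, 4, 3] = True   # vegetated cell
extreme[8, 4, 6] = True     # non-vegetated cell, must be ignored
vegetation = np.zeros((6, 8), dtype=bool)
vegetation[:, :4] = True

mini_cubes = sample_minicube_bounds(extreme, vegetation, n_samples=5)
```

In practice, the returned index ranges would be translated into coordinates and used to request high-resolution data for each mini cube; that step depends on the data source and is left out here.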

Figure 5. Comparison of air temperature at 2 m from ERA5 with and without weighting on the global mean time series computation. This rather trivial example shows how radically wrong any computation can be if the spherical nature of planet Earth is ignored.
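
The effect described in Figure 5 can be reproduced with a small synthetic example. The sketch below does not use ERA5 data; it assumes a regular latitude-longitude grid and a made-up temperature field that depends only on latitude. On such a grid, cell area is proportional to the cosine of latitude, so the weights are taken as cos(lat). The function name `global_mean` and the toy field are assumptions for illustration.

```python
import numpy as np

def global_mean(field, lat_deg, weighted=True):
    """Global mean of a (lat, lon) field on a regular latitude-longitude grid."""
    field = np.asarray(field, dtype=float)
    if not weighted:
        # Treats every grid cell as equally large, which over-represents the poles
        return float(field.mean())
    # Cell area on a regular lat-lon grid is proportional to cos(latitude)
    weights = np.cos(np.deg2rad(lat_deg))
    zonal_mean = field.mean(axis=1)
    return float(np.sum(weights * zonal_mean) / np.sum(weights))

# Cell-centre coordinates of a 2.5 degree grid
lat = np.linspace(-88.75, 88.75, 72)
lon = np.linspace(1.25, 358.75, 144)

# Synthetic temperature in degrees Celsius: warm equator, cold poles
temperature = (-30.0 + 60.0 * np.cos(np.deg2rad(lat)))[:, None] * np.ones((1, lon.size))

unweighted = global_mean(temperature, lat, weighted=False)
area_weighted = global_mean(temperature, lat, weighted=True)
```

For this synthetic field the unweighted mean is several degrees lower than the area-weighted one, because the cold polar rows count as much as the much larger tropical rows.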

Figure 6. Split-apply-combine: split an ESDC along arbitrary axes, apply a function $f$ to each sub-cube, and then combine the results along the same axes that have been used to split the original ESDC.
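
The split-apply-combine pattern in Figure 6 can be written down compactly for a plain NumPy array. The following is a minimal sketch, not the implementation of any particular ESDC library: the function name `split_apply_combine` and the toy cube with axes (time, lat, lon) are assumptions, and a real data cube would usually carry named dimensions and be processed chunk by chunk.

```python
import numpy as np

def split_apply_combine(cube, f, split_axes):
    """Split `cube` along `split_axes`, apply `f` to each sub-cube, recombine.

    The result has the split axes first, followed by any axes returned by `f`.
    """
    split_axes = tuple(split_axes)
    n_split = len(split_axes)
    # Move the split axes to the front so that one index tuple selects one sub-cube
    moved = np.moveaxis(cube, split_axes, tuple(range(n_split)))
    split_shape = moved.shape[:n_split]
    # Apply: evaluate f independently on every sub-cube
    results = [f(moved[idx]) for idx in np.ndindex(*split_shape)]
    # Combine: stack the results back along the axes used for splitting
    stacked = np.asarray(results)
    return stacked.reshape(split_shape + stacked.shape[1:])

# Toy cube with axes (time, lat, lon)
cube = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

# Split along lat and lon: f sees one time series per grid cell
temporal_mean = split_apply_combine(cube, np.mean, split_axes=(1, 2))

# Split along time: f sees one map per time step
spatial_mean = split_apply_combine(cube, np.mean, split_axes=(0,))
```

Because each sub-cube is processed independently, the apply step is a natural candidate for parallel or out-of-core execution.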

Figure 7. Interactions within an example ESDC in Lexcube, showcasing a geographical map on the front side and Hovmöller diagrams depicting temporal changes on the lateral sides. The ESDC allows for interactive subset operations on any side.