Environmental Data Science publishes its first papers

Today marks the release of the first batch of articles in Environmental Data Science (EDS). We are thrilled to celebrate Earth Month with this first release of articles!

Congratulations are in order for all the authors! Many thanks are due to the amazing Editorial Board and Advisory Board members of EDS, along with the fantastic staff at Cambridge University Press.

We launched EDS to provide a much-needed outlet to publish open-access research at the interface of data science and the environment. This line of work is extremely urgent, considering climate change and the toll it is already taking on our communities. My hometown of Boulder is reeling from its second wildland fire in under three months, one of which destroyed 1000 homes, and both of which occurred in the “off-season” for such events. It is well past time to harness the collective progress and enthusiasm from advances in data science (especially in AI and machine learning) in the service of addressing climate change and preserving our environment for years to come. Kudos to the authors published today for doing so!

This first batch contains multiple EDS article types: data papers, application papers, and perspectives (for a full list of possible submission types, please see the EDS Instructions for Authors). We list them here by article type, along with the impact statements that the authors have provided to describe the significance of their work.

Application Papers


Modeling and simulating spatial extremes by combining extreme value theory with generative adversarial networks

Younes Boulaguiem, Jakob Zscheischler, Edoardo Vignotto, Karin van der Wiel, and Sebastian Engelke

Spatially co-occurring climate extremes such as heavy precipitation events or temperature extremes can have devastating impacts on human and natural systems. Modeling complex spatial dependencies between climate extremes in different locations is notoriously difficult, and traditional approaches from the field of extreme value theory are relatively inflexible. We show that combining extreme value theory with a deep learning model (generative adversarial networks) can represent complex spatial dependencies between extremes well. Hence, instead of running expensive climate models, the approach can be used to sample many instances of spatially co-occurring extremes with a realistic dependence structure, which may be used for climate risk modeling and stress testing of climate-sensitive systems.

Predicting years with extremely low gross primary production from daily weather data using Convolutional Neural Networks

Aris Marcolongo, Mykhailo Vladymyrov, Sebastian Lienert, Nadav Peleg, Sigve Haug, and Jakob Zscheischler

Understanding and predicting extreme climate-related impacts is crucial to constrain climate risk. This is a difficult task because multiple time scales and interactions between impact drivers are typically involved. Here, we employ Convolutional Neural Networks (CNNs) and test their ability to predict years with extremely low carbon uptake (a proxy for vegetation mortality) from daily weather data. The employed CNNs can distinguish well between normal years and years with extremely low carbon uptake, with prediction power increasing from high to low latitudes. This highlights that deep learning can be used to learn very complex relationships between daily weather data and extreme impacts.

Forecasting commodity returns by exploiting climate model forecasts of the El Niño Southern Oscillation

Vassili Kitsios, Lurion De Mello, and Richard Matear

By combining skillful climate model forecasts of the El Niño/La Niña cycle with econometric time series analysis methods, one can improve the ability to predict price changes in impacted commodities. This in turn helps producers, policymakers, and regulators better anticipate and manage their future risk induced by climate variability.

Limitations of econometric evaluation of nonrandomized residential energy efficiency programs: A case study of Northern California rebate programs

Evan D. Sherwin, Russell M. Meyer, and Inês M.L. Azevedo

Many energy utilities offer residential energy efficiency rebate programs to reduce energy consumption and resulting environmental impacts. We find that, for a particular set of rebate programs for energy-efficient household appliances and services, common econometric methods indicate that participating households tend to increase electricity consumption after applying for rebates. Thus, it might appear that these efficiency programs did not actually save energy. However, additional utility data and a household survey suggest that the observed increase was likely measuring the “effect” of buying a new appliance. In such circumstances, energy savings estimates based on engineering models may be more appropriate than econometric methods. This illustrates the importance in policy evaluation of picking the right quantitative tool for the job.

Data Paper

PaleoRec: A sequential recommender system for the annotation of paleoclimate datasets

Shravya Manety, Deborah Khider, Christopher Heiser, Nicholas McKay, Julien Emile-Geay, and Cody Routson

Studying how climate has changed in the past allows placing the current trends into their geological context. Extracting this information from the geological records requires the use of diverse data, each with its own idiosyncrasies. Therefore, compiling these records requires both expertise and time to annotate. To facilitate this task, a recommender system, PaleoRec, was deployed on the main annotation interface for these data.

Perspectives

Evolution of machine learning in environmental science – a perspective

William W. Hsieh

This perspective paper reviews the evolution and growth of machine learning (ML) models in environmental science. The opaque nature of ML models led to decades of slow growth, but exponential growth commenced around the mid-2010s. Novel ML models which have contributed to this exponential growth (e.g. deep convolutional neural networks, encoder-decoder networks and generative adversarial networks) are reviewed, as well as approaches to merging ML models with physics-based models.

Why we need to focus on developing ethical, responsible, and trustworthy artificial intelligence approaches for environmental science

Amy McGovern, Imme Ebert-Uphoff, David John Gagne, and Ann Bostrom

This position paper discusses the need for the environmental sciences community to ensure that they are developing and using artificial intelligence (AI) methods in an ethical and responsible manner. This paper is written at a general level, meant for the broad environmental sciences and earth sciences community, as the use of AI methods continues to grow rapidly within this community.

Note that many of the papers link to replication materials hosted on GitHub, Zenodo and other open repositories within the Data Availability Statement of the article. Open Data and Open Materials badges are visible on the article to indicate the existence of this related material.

We hope you enjoy this first batch, and feel free to share widely – that’s the beauty of open access. Finally, we would like to thank everyone reading this for your part in growing the EDS community!

April 13th, 2022, Boulder, Colorado, USA

Claire Monteleoni, Editor-in-Chief

For more information about EDS, see the Instructions for Authors and the Five Reasons to Submit to EDS.

Comments

  1. Environmental Data Science needs real reach. Data from the oceans of the world, its poles, and its atmospheric strata now require diligent choices about accessing local scientific communities. It doesn’t take very much; education will benefit from this work in years to come.