Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests

Published online by Cambridge University Press:  21 November 2024

Lisa Bald*
Affiliation:
Environmental Informatics, Faculty of Geography, Philipps-University Marburg, Marburg, Germany
Alice Ziegler
Affiliation:
Environmental Informatics, Faculty of Geography, Philipps-University Marburg, Marburg, Germany
Jannis Gottwald
Affiliation:
tRackIT Systems GmbH, Cölbe, Germany
Tiziana L. Koch
Affiliation:
Remote Sensing Laboratories, Department of Geography, University of Zürich, Zürich, Switzerland; Land Change Science, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
Marvin Ludwig
Affiliation:
Institute Landscape Ecology, University of Münster, Münster, Germany
Hanna Meyer
Affiliation:
Institute Landscape Ecology, University of Münster, Münster, Germany
Stephan Wöllauer
Affiliation:
Environmental Informatics, Faculty of Geography, Philipps-University Marburg, Marburg, Germany; Faculty of Resource Management, HAWK University of Applied Sciences and Arts Goettingen, Goettingen, Germany
Dirk Zeuss
Affiliation:
Environmental Informatics, Faculty of Geography, Philipps-University Marburg, Marburg, Germany
Nicolas Frieß
Affiliation:
Environmental Informatics, Faculty of Geography, Philipps-University Marburg, Marburg, Germany
*Corresponding author: Lisa Bald; Email: lisa.bald@uni-marburg.de

Abstract

In the context of the ongoing biodiversity crisis, understanding forest ecosystems, their tree species composition, and especially the successional stages of their development is crucial. Together, these factors shape the biodiversity within forests and thereby influence the ecosystem services that forests provide, yet this information is not readily available on a large scale. Remote sensing techniques offer promising solutions for obtaining area-wide information on tree species composition and their successional stages. While optical data are often freely available in appropriate quality over large scales, obtaining light detection and ranging (LiDAR) data, which provide valuable information about forest structure, is more challenging. LiDAR data are mostly acquired by public authorities across several years and are therefore heterogeneous in quality. This study aims to assess whether heterogeneous LiDAR data can support area-wide modeling of forest successional stages at the tree species group level. Different combinations of spectral satellite data (Sentinel-2) and heterogeneous airborne LiDAR data, collected by the federal government of Rhineland-Palatinate, Germany, were utilized to model up to three different successional stages of seven tree species groups. When incorporating heterogeneous LiDAR data into random forest models with spatial variable selection and spatial cross-validation, significant accuracy improvements of up to 0.23 were observed. This study shows the potential of not dismissing heterogeneous LiDAR data that initially seem unusable for ecological studies. We advocate for a thorough examination to determine their usefulness for model enhancement. A practical application of this approach is demonstrated in the context of mapping successional stages of tree species groups at a regional level.

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. General modeling workflow applied to every tree species group to model its specific successional stages. The modeling uses two data sets as predictors: LiDAR data and multispectral Sentinel-2 data. From the LiDAR data, various indices are derived in the Remote Sensing Database (RSDB; Wöllauer et al., 2020). The Sentinel-2 data were processed within the “Software Framework for Operational Radiometric Correction for Environmental Monitoring” (FORCE; version 3.7.10; Frantz, 2019), and spectral indices were additionally calculated. Forest inventory data served as reference data and thus as the response variable of the models. We built three model types, namely structural, hybrid, and spectral models, using different combinations of the predictor data sets.

Figure 2. (a) Location of the study area (orange) in Europe. (b) The study area is confined to the forest mask (dark green) derived from the Copernicus high-resolution layer. Data: EEA, 2022; GeoBasis-DE/LVermGeoRP, 2022; OpenStreetMap, 2023.

Figure 3. Spatial distribution of the tree species groups (a) and successional stages (b) from the forest inventory. As the forest inventory only surveys state-owned forests the depicted polygons represent only a subset within the whole study area. Data: Landesforsten Rheinland-Pfalz (2014).

Table 1. Sentinel-2 bands and indices used in this study. Images for these bands and indices were calculated from monthly composites for 2019 to 2021 for January, March, April, May, June, September, and October. See Table A2 in the appendix for the complete formulas to calculate the spectral indices

Figure 4. Properties of the heterogeneous LiDAR data of Rhineland-Palatinate. The year in which the data were recorded and the point density calculated directly from the LiDAR data set, based on 100 m pixels, are depicted. A transition from lower point densities in earlier years to higher point densities in later years is visible. Data: GeoBasis-DE/LVermGeoRP (2022).

Table 2. Overview of LiDAR indices characterizing the vegetation calculated with the Remote Sensing Database (RSDB; Wöllauer et al., 2020; see Appendix A4 for RSDB labels)

Figure 5. Model performances. The left column of the plots shows the results of models applied to the test data sets only using Sentinel-2 variables (spectral models), and the middle column shows the results using Sentinel-2 and LiDAR variables (hybrid models). Each colored plot shows the confusion matrices of the testing for one tree species. Labels from the reference data are shown on the x-axis and the predicted values on the y-axis in percent. For example, the bar for the maturing phase (yellow), indicating the model classification, should be as large as possible in the first row (maturing) of each plot. All classifications in the same row, but in the other phases (blue and green) are misclassified. The right column shows the differences in accuracy for each class between the spectral models and the hybrid models. All values are rounded to two decimals.
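
The percentages and per-class differences described in this caption can be reproduced in outline as follows. This is only a minimal Python sketch, not the authors' code: the function names are hypothetical, the toy labels are invented, and normalizing each reference class to 100% is an assumption about how the figure was built. The stage names follow Table A8.

```python
from collections import Counter

# Successional stage names as listed in Table A8.
STAGES = ["qualification", "dimensioning", "maturing"]


def confusion_percent(reference, predicted, classes=STAGES):
    """Confusion matrix in percent, normalized per reference class.

    Returns {reference_class: {predicted_class: percent}}. Each inner
    dict sums to 100 when that reference class occurs at least once.
    """
    counts = Counter(zip(reference, predicted))
    matrix = {}
    for ref in classes:
        total = sum(counts[(ref, pred)] for pred in classes)
        matrix[ref] = {
            pred: round(100.0 * counts[(ref, pred)] / total, 2) if total else 0.0
            for pred in classes
        }
    return matrix


def per_class_gain(spectral, hybrid, classes=STAGES):
    """Correctly classified percentage per class, hybrid minus spectral."""
    return {c: round(hybrid[c][c] - spectral[c][c], 2) for c in classes}


# Invented toy labels for eight test pixels (illustration only).
reference = ["maturing"] * 4 + ["dimensioning"] * 4
spectral_pred = ["maturing", "maturing", "dimensioning", "dimensioning",
                 "dimensioning", "dimensioning", "dimensioning", "maturing"]
hybrid_pred = ["maturing", "maturing", "maturing", "dimensioning",
               "dimensioning", "dimensioning", "dimensioning", "dimensioning"]

spectral_cm = confusion_percent(reference, spectral_pred)
hybrid_cm = confusion_percent(reference, hybrid_pred)
gain = per_class_gain(spectral_cm, hybrid_cm)
print(gain)
```

A positive value in `gain` means the hybrid model classified a larger share of that class correctly than the spectral model did on the same toy data.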

Figure 6. The ranks of variable groups from variable selection. Boxplots display the ranks of variables selected during feature selection for different variable groups. Colored dots show the rank separated by tree species group. The boxes in the plot show the interquartile range of the ranks, with the median marked by a vertical line within each box. The whiskers extend to the minimum and maximum ranks without outliers. The data within the boxes indicate the average rank at which variables from each variable group were selected during the feature selection process in model tuning. As each variable group consists of several variables (see Tables 1 and 2), each model might be represented in each group multiple times. Numbers in y-axis labels indicate how many variables belong to each specific group. Black y-axis labels indicate that variables are Sentinel-2 variables, while gray labels are LiDAR-derived variables. Variable importance for each variable individually for each model is provided in Figure A4 in the appendix. *Each Sentinel-2 variable is available for 7 months.

Table 3. Count of chosen variables during feature selection. Numbers in row names indicate how many variables belong to the specific variable group. Black variables are Sentinel-2 variables while gray variables are LiDAR-derived variables. (a) Spectral models. (b) Hybrid models. *Each Sentinel-2 variable is available for 7 months

Figure 7. Area-wide map of tree species groups and their successional stages for Rhineland-Palatinate. Leaflet available at: https://envima.github.io/LidarForestModeling/.

Figure 8. Detailed map of LiDAR survey borders. Variations in the tree species group specific successional stages are indicated with colors in two map sections in plots (a) and (c). These exemplary areas were chosen at boundaries of LiDAR scenes with large temporal differences. In plot (b), the years of acquisition of the LiDAR data set are shown, with lighter colors for older data and darker colors for more recent data.

Table A1. Number of available pixels and polygons per tree species group after balancing

Table A2. Formulas for Sentinel-2 bands and indices used in this study. Images for these bands and indices were calculated from monthly composites for 2019–2021 for January, March, April, May, June, September, and October. For the full workflow to create the Sentinel-2 data refer to Bhandari et al. (2024)

Table A3. Sensors used for the acquisition of LiDAR data in each year and the beam divergence of the respective sensor. A divergence of 0.18 mrad corresponds to an increase in beam diameter of 18 cm per 1000 m distance
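
The conversion quoted in this caption follows from the small-angle approximation: the increase in beam diameter is roughly the divergence in radians multiplied by the range. A minimal Python sketch of that arithmetic, with a hypothetical helper name and ignoring the beam's diameter at the sensor exit:

```python
def beam_spread_cm(divergence_mrad: float, range_m: float) -> float:
    """Approximate increase in beam diameter (cm) after travelling range_m.

    Small-angle approximation: spread [m] = divergence [rad] * range [m].
    The initial (exit) diameter of the beam is not included.
    """
    divergence_rad = divergence_mrad / 1000.0  # mrad -> rad
    spread_m = divergence_rad * range_m
    return spread_m * 100.0  # m -> cm


# The example from the caption: 0.18 mrad over 1000 m.
print(round(beam_spread_cm(0.18, 1000.0), 2))  # 18.0
```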

Table A4. Overview of structural LiDAR indices with the names of the Remote Sensing Database (RSDB; Wöllauer et al., 2020). The formulas of the calculation of each index can be found under their RSDB name at: https://github.com/environmentalinformatics-marburg/rsdb/wiki/Point-cloud-indices

Figure A1. Pseudocode balancing.

Table A5. Number of available forest inventory polygons for each successional stage and tree species group

Figure A2. The left column of the plots shows the test results of models using only Sentinel-2 variables (spectral models), and the middle column shows test results using Sentinel-2 and LiDAR variables (hybrid models). The right column shows the test results of the models using only LiDAR variables (structural models). Each plot shows the confusion matrices of the testing for one tree species group. Observed values are shown on the x-axis and the predicted values on the y-axis in percent. For example, the bar for the maturing phase (yellow) should be as large as possible in the first row (maturing) of each plot. All values that end up in the other phases (blue and green) are misclassified pixels. Values are rounded to two decimals.

Table A6. Significance of t-test for each variable group

Figure A3. Variable importance of the tree species groups model. Sentinel-2 variables are labeled black and LiDAR variables are labeled gray on the y-axis.

Table A7. Confusion matrix for the tree species groups model with a total accuracy of 0.81 on test data. The values indicate the classified pixels as percentage

Table A8. Confusion matrix for the hierarchical modeling of tree species groups and successional stages with letters indicating the successional stages (qualification: q, dimensioning: d, and maturing: m). The values indicate the classified pixels as a percentage

Table A9. The most important species in the tree species groups in the forest inventory data

Figure A4. Variable importance of the hybrid and spectral models of tree species group specific successional stages. For each tree species group, one model using only Sentinel-2 data (spectral, on the left) and one using Sentinel-2 and LiDAR data (hybrid, on the right) is depicted. Sentinel-2 variables are labeled black and LiDAR variables are labeled gray on the y-axis.

Author comment: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R0/PR1

Comments

November 22, 2023

Claire Monteleoni

Editor-in-Chief

Environmental Data Science

Dear Editor,

We are excited to submit our application paper, “Leveraging readily available heterogeneous LiDAR data to enhance modeling of successional stages at tree species level in temperate forests”, for consideration for publication in Environmental Data Science.

The paper presents our findings on the utilization of heterogeneous LiDAR data to model and map tree species specific forest successional stages on a large regional scale. Our research addresses an underutilized area that involves the wealth of available but often overlooked heterogeneous data, most often collected by government administrations, which, despite its potential, remains largely untapped in the realm of ecological modeling. By exploring the integration of heterogeneous LiDAR data alongside multi-spectral optical satellite data, we reveal significant improvements in performance accuracy on spatially independent test data. We believe our work will be of interest to the readers of Environmental Data Science due to the beneficial utilization of a heterogeneous dataset, which is of interest for interdisciplinary researchers using LiDAR data in ecological modeling and nature conservation. The study’s insights into harnessing readily available but heterogeneous datasets for ecological modeling could inspire a broader understanding of employing diverse data for conservation strategies in the community.

We confirm that this manuscript is original and has not been published elsewhere in part or in entirety and is not under consideration by another journal. We do not have any conflicts of interest to declare.

Sincerely,

and on behalf of all authors,

Lisa Bald

Department of Geography, Environmental Informatics, Philipps-University Marburg, Deutschhausstraße 12, 35032 Marburg, Germany

Phone: +49 6421 28-25323

Email: lisa.bald@uni-marburg.de

Review: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

Paper deals with the modelling of successional stages in different species using LiDAR. I found it very interesting, useful and in general well written, although some sentences can be rewritten for clarity (see the specific comments below). It is also very cool that the map is shared in a browser.

My main concern is why so many indices were used when most of them are probably highly correlated. Can you address that issue? Maybe it would be profitable to remove the ones with the highest correlations, as 19 indices is quite a lot, particularly if they are not later assessed separately in terms of variable importance. The assessment of the importance of all of the indices at once, as shown in Figure 6, is not very informative in my opinion. Furthermore, it would be helpful to include their equations as well, probably in the appendix (as, for example, there is one version of NDWI which is actually the same as NDMI).
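
The reviewer's suggestion of removing highly correlated indices can be illustrated with a simple greedy filter. The following Python sketch is not the procedure used in the manuscript (which relies on spatial variable selection); the function names, the 0.9 threshold, and the toy values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    `features` maps a name to a list of values. Iteration order decides
    which member of a highly correlated pair survives.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Invented per-pixel values for three hypothetical indices.
toy = {
    "index_a": [0.2, 0.4, 0.6, 0.8, 0.9],
    "index_b": [0.1, 0.2, 0.3, 0.4, 0.45],  # exactly index_a / 2
    "index_c": [0.5, 0.1, 0.4, 0.2, 0.3],
}
print(drop_correlated(toy))
```

Because the filter is order-dependent, listing the indices one most wants to retain first (for example, the easiest to interpret) determines which of two redundant indices is dropped.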

Specific comments:

Page 2:

Line 45 – You use “the data is” here, but “data are” in other parts of the manuscript

Lines 47-50 - this is surprising for me. Could you refer to more studies which stated that such data are not reliable?

Page 4:

Figure 1 – explain RSDB abbreviation

Page 5:

Figure 2: I suggest some modifications here. Maybe use more visible countries’ borders on the left map (maybe it can be in bigger scale, just showing the neighbouring/closest countries?), smaller scale bars and labels. In the map on the right could you please put coordinates labels like 6.5, 7, 7.5... and 49.5, 50.0 etc...? And use the same coordinate labels on each map (figures 3,4,7,8). Also, maybe it would be better to depict forest mask with green colour? And why is the area border so coarse?

Page 6:

Figure 3: Maybe you could also include second map showing the distribution of successional stages according to forest inventory data. Also, maybe in the background, you could add the extent of forest mask?

Line 44: And you used “data was” here

Page 7:

Lines 48-49: please refer to the table 2 here

Page 10:

Line 3: Maybe add that “reference data were balanced”?

Line 5: Again, you used “data was balanced here”, while “data were...” in line 3.

Lines 7-9 Please rephrase for clarity

Page 11:

Lines 18-21: Include the numbers (accuracy) here

Lines 21-23: I think it is more an element of the discussion

Line 25: Maybe also put the most important numbers regarding structural model here

Page 12:

Line 50: maybe “Figure 6 displays boxplots for both the spectral and hybrid models”?

Page 15:

Table 3 – please explain a and b in the table caption

Page 18:

Lines 13-14: What happened to the area of “other deciduous trees species” on the map? It is just not shown as forests at all?

Lines 40-42 starting with “However, the unsystematic ...”- please rephrase, I think there are too many commas in this sentence

Lines 42-43: This sentence starting with “At least more focus...” also sounds a bit strange

Page 19:

Line 2: “recent” does not suit well to the reference from 2009 ;)

Line 45: rather “larch tree species group”

Line 52: But SPOT is not LiDAR, I suggest rewriting this part

Page 20:

Line 5: I don’t know if species can benefit from the additional LiDAR data, I think it is beneficial rather for their mapping accuracy.

Lines 13-15: Probably at some point the difference in acquisition dates may have impact as some trees simply convert from one successional group to another in the meantime, right?

Review: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R0/PR3

Conflict of interest statement

Reviewer declares none.

Comments

Peer reviewing – Environmental Data Science EDS-2023-0078

Manuscript title: “Leveraging readily available heterogeneous LiDAR data to enhance modeling of successional stages at tree species level in temperate forests”

The reviewed manuscript by Bald et al. is a significant contribution to the field, as it investigates the potential of freely available and heterogeneous LiDAR data, in conjunction with the spectral signal from satellite images (e.g., Sentinel-2), to characterize forest succession in the federal state of Rhineland-Palatinate, Germany. The authors' modeling workflow, which incorporated LiDAR and Sentinel-2 data, demonstrated that a combination of LiDAR metrics, spectral bands, and vegetation indices from Sentinel-2 could effectively predict the successional stage of some tree species, outperforming the inventory forest data. This suggests that combining structural canopy metrics and optical characteristics holds promise for tree species classification and distribution maps. However, it’s worth noting that the hybrid model had limitations, particularly for certain tree species (other deciduous trees that were removed from the model) and for faster-growing tree species.

General comment

The use of heterogeneous and freely available LiDAR data, as demonstrated in this manuscript, holds great promise and should be encouraged. The three-dimensional structural data such as LiDAR provides a valuable tool for forest managers and conservationists to monitor the temporal change of various ecosystem processes, such as carbon flux, thermal regulation, herbivory, etc., on large spatial scales. Therefore, the tree species classification and distribution maps, as generated by this study, can significantly enhance decision-making strategies for forest management. The authors also alluded to the potential of their research in understanding forest successional stages as indicators of forest biodiversity; however, the mechanisms by which forest successional stages can promote biodiversity and even mitigate the negative effects of climate change and disturbances could be addressed deeper. The few lines in the introduction and discussion undervalued the potential of such analysis to estimate ecosystem functions after disturbance and measure ecosystem recovery and resilience against human and natural perturbations. Highlighting the implications of such a study for biodiversity and the functioning of forest ecosystems would improve the overarching idea of using such free data.

It would be interesting to know what exactly heterogeneous LiDAR data means. A short description of the LiDAR settings (i.e., the density of points, beam divergence, wavelength, footprint, etc.) would allow comparison with other freely available LiDAR data and make the study reproducible. In the results section, I greatly appreciated Fig 5 as it provided a deeper insight into model performance by showing the three metrics per tree species group and successional stage. Even though F1 scores for the hybrid model mostly showed good classification performance, the contribution of dimensioning and maturing classes was dominant in at least four of six species, meaning that likely classification between the notable decline in the height and early growth phase needs to be carefully considered. The contribution of single LiDAR predictor variables could be explored further to understand the complexity of the classification between qualification and dimensioning. For instance, the standard deviation of height or CH can indicate for which stage the classification would perform better. As a suggestion only: the intensities provided by the LiDAR data have been shown to be a promising tool for describing the ecophysiological performance of tree species. Such data could improve the classification as it can distinguish between tree species' development stages. I particularly appreciate the interactive map with the final classification and distribution of tree species and the description of metrics used in the models (GitHub platform).

Minor aspects

Line 22: Fig. 3. The map’s coordinates are too small compared to the other maps.

Line 47: such as instead

Lines 13 to 18: In the section Balancing data, the two sentences are quite hard to read. You might consider rewriting them.

Lines 9 to 13: In section 4.1, would there be a reason to use intensities of the LiDAR to improve the differences of deciduous tree species? Or the problem of the extremely limited number of polygons would not allow for enough replicates to test it?

Lines 31 and 32: Section 4.1: Could the authors elaborate on how mapping successional stages can be essential for biodiversity?

Recommendation: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R0/PR4

Comments

Dear Dr. Bald,

I apologize for the delay in returning your paper. It was challenging to find reviewers willing to dedicate time to it, a recurrent issue these days. However, we now have feedback from two reviewers.

Reviewer One recommends major revisions, primarily concerned about the high number of correlated vegetation indices used. Reviewer Two is pleased with your manuscript, praising it for its significant contribution to the field and the effective use of heterogeneous LiDAR and Sentinel-2 data to predict successional stages of tree species.

I personally find Reviewer One’s comment relevant but minor, and I am confident you can address it swiftly. Additionally, I suggest making the title more concise and clearer. The writing is sometimes overly complex, and the paper could benefit from more straightforward wording.

It’s also important to note that while the code is available, the study is not reproducible, as statements like listVRT = list.files("data/001_raw_data/force/data/mosaic/") in your GitHub repo do not help if there is no folder with data and no indication of an open repository. Make sure that the revision contains a reproducible workflow, or explain how the study could be replicated in both the methods and the data availability statement.
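
One way to address this kind of concern is to resolve the data location from a single documented setting and to fail with a clear message when the data are missing. The following Python sketch is only an illustration and not taken from the authors' repository: the function name, the LIDAR_FOREST_DATA environment variable, and the "*.vrt" file pattern are assumptions, while the sub-path mirrors the one quoted above.

```python
import os
from pathlib import Path


def list_mosaic_files(data_root=None, pattern="*.vrt"):
    """Return sorted mosaic files below a configurable data root.

    The root comes from the argument or from the (hypothetical)
    LIDAR_FOREST_DATA environment variable, so no machine-specific
    location is baked into the script.
    """
    root = data_root or os.environ.get("LIDAR_FOREST_DATA")
    if root is None:
        raise RuntimeError("Set LIDAR_FOREST_DATA or pass data_root.")
    # Sub-path as in the statement quoted by the editor.
    mosaic_dir = Path(root) / "001_raw_data" / "force" / "data" / "mosaic"
    if not mosaic_dir.is_dir():
        raise FileNotFoundError(
            f"{mosaic_dir} not found; obtain the data first "
            "(see the data availability statement)."
        )
    return sorted(mosaic_dir.glob(pattern))
```

A reader who has downloaded the data to any folder could then call `list_mosaic_files("/path/to/data")`, and a reader without the data gets an error that points to where it can be obtained.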

Overall, I see great potential in your approach being transferable to other regions, thus providing essential information for forest management decision-making.

Therefore, I recommend that you address the issues raised by the reviewers and myself as thoroughly as possible. If done well, I see no obstacles in recommending your paper for publication.

Best regards,

Miguel Mahecha

Decision: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R0/PR5

Comments

No accompanying comment.

Author comment: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R1/PR6

Comments

July 12, 2024

Claire Monteleoni

Editor-in-Chief

Environmental Data Science

Dear Editor,

We are excited to submit our revised manuscript, “Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests” to Environmental Data Science.

Thank you for your efforts in securing reviewers for our manuscript despite the current challenges. We have carefully addressed the feedback provided by you and the reviewers and hope that these revisions meet the journal’s standards.

We confirm that this manuscript is original and has not been published elsewhere in part or in entirety and is not under consideration by another journal. We do not have any conflicts of interest to declare.

Sincerely,

and on behalf of all authors,

Lisa Bald

Department of Geography, Environmental Informatics, Philipps-University Marburg, Deutschhausstraße 12, 35032 Marburg, Germany

Phone: +49 6421 28-25323

Email: lisa.bald@uni-marburg.de

Recommendation: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R1/PR7

Comments

Dear authors, thank you for the excellent revision! Best wishes, Miguel Mahecha

Decision: Leveraging heterogeneous LiDAR data to model successional stages at tree species level in temperate forests — R1/PR8

Comments

No accompanying comment.