A machine learning approach to mapping canopy gaps in an indigenous tropical submontane forest using WorldView-3 multispectral satellite imagery

Summary
Selective logging in tropical forests may lead to deforestation and forest degradation, so accurate mapping of it will assist in forest restoration, among other ecological applications. This study aimed to track canopy tree loss due to illegal logging of the important hardwood tree Ocotea usambarensis in a closed-canopy submontane tropical forest by evaluating the mapping potential of the very-high-resolution WorldView-3 multispectral dataset using random forest (RF) and support vector machine (SVM) with radial basis function kernel classifiers. The results show average overall accuracies of 92.3 ± 2.6% and 94.0 ± 2.1% for the RF and SVM models, respectively. Average kappa coefficients were 0.88 ± 0.03 for RF and 0.90 ± 0.02 for SVM. The user’s and producer’s accuracies for both classifiers were in the range of 84–100%. This study further indicates that vegetation indices derived from bands 5 and 6 helped detect canopy gaps in the study area. Both variable importance measurement in the RF algorithm and pairwise feature selection proved useful in identifying the most pertinent variables in the classification of canopy gaps. These findings could allow forest managers to improve methods of detecting canopy gaps at larger scales using remote sensing data and relatively little additional fieldwork.


Introduction
Tropical rainforests cover c. 7% of the globe, and they contain more than 53 000 tree species compared to c. 124 in temperate Europe (Slik et al. 2015). Vital environmental processes such as the water cycle, soil conservation, carbon sequestration and habitat protection are regulated by tropical tree species. Kenya's rainforest cover, mostly montane forests, is fragmented into patches that are being degraded (NEMA 2010). The Mount Kenya Forest Reserve (MKFR) sustains a variety of biodiversity and affords vital ecosystem services, as well as being a major forested water catchment area in Kenya (NEMA 2010). Intense growth of the human population around the MKFR and increasing poverty levels have led to forest degradation there due to illegal logging of important timber trees, especially Ocotea usambarensis, which is a hardwood tree with excellent decay and insect resistance. In 2000-2001, O. usambarensis was the most highly priced timber in Kenya. The State of the Environment report by the National Environment Management Authority (NEMA) listed O. usambarensis as endangered (NEMA 2010). Normally, Ocotea trees mature over 60-70 years and are relatively large with spreading crowns and stem diameters in the range 3.75-9.50 m. The tree has low seed viability and poor regeneration, produces seeds only every 10 years and germinates sporadically.
Tracking of selective logging (SL) in tropical forests is important due to its effects on biodiversity and other forest attributes, including ecosystem services, the microclimate and carbon pools (Dalagnol et al. 2019). Landscape-level spatial assessments of canopy gaps have primarily used ground-based methods, and, due to the amount of effort and expense needed to acquire the measurements, the areas covered by these surveys are small and not contiguous spatially. Remote sensing (RS) is a more cost-effective and less laborious method for modelling canopy gaps than using ground-based methods (Malahlela et al. 2014). However, SL can be challenging to detect and quantify because of its partial disturbance of the forest canopy and small scale of impact (Dalagnol et al. 2019). Three main types of very-high-resolution (VHR) earth observation data have been used to detect canopy gaps due to SL in tropical forests (i.e., optical, LiDAR (light detection and ranging) and radar (radio detection and ranging)). LiDAR technology has made the detection of small canopy gaps possible; for example, Asner et al. (2013) explored canopy height models from a single LiDAR data acquisition, while Andersen et al. (2014) used simple differencing of LiDAR canopy height models to detect disappearing tree crowns with exceptional accuracy. Ellis et al. (2016) used LiDAR data from a single acquisition to detect SL by estimating aboveground biomass, while Rex et al. (2020) used LiDAR data acquired before and after logging to estimate the change in aboveground biomass due to logging. However, LiDAR covers relatively small spatial extents and data acquisition costs are high. Therefore, researchers may opt to integrate data from different RS systems; for example, Dalagnol et al. (2019) combined airborne LiDAR and VHR satellite data to quantitatively assess and validate canopy gaps. 
Automated mapping using time-series approaches applied to calibrated synthetic aperture radar (SAR) data has been successful in detecting SL (Baldauf & Köhl 2009). Hethcoat et al. (2021) assessed the effectiveness of SAR data for monitoring tropical SL, but SAR-based biomass estimates have lower precision at the same resolution than optical data. Traditional aerial photography was successfully used for mapping canopy gaps before the introduction of high-resolution optical data (Malahlela et al. 2014); however, aerial photography data acquisition is cost-intensive, although technological advances have revitalized the use of aerial photography through unmanned aerial vehicles (UAVs). Spaias et al. (2016) used a hyperspectral camera on a UAV to detect and quantify small-scale canopy gaps in a tropical forest, although the amount of spatial and spectral data gained made the data processing computationally demanding, especially where cloud-computing resources were lacking. Ota et al. (2019) used digital aerial photographs acquired before and after logging to estimate the change in aboveground biomass linked to SL, while Kamarulzaman et al. (2022) used UAV data to detect forest canopy gaps attributed to SL. However, digital aerial photographs acquired using UAVs cover relatively small spatial extents.
Machine learning (ML) for the classification of RS data has been applied in mapping SL with increasing success. Dalagnol et al. (2019) used a random forest (RF) model to detect tree loss with an average precision of 64%. Hethcoat et al. (2019) reported a detection rate of logged pixels of c. 90% using RF. Kamarulzaman et al. (2022) compared conventional and ML classifiers. The support vector machine (SVM) and artificial neural network (ANN) classifiers attained higher overall accuracy of 85%. Using ordinary least squares regression and ML approaches (RF, k-nearest neighbour, SVM and ANN), Rex et al. (2020) monitored the change in aboveground biomass due to SL. Hethcoat et al. (2020, 2021) developed RF models and used logging records to detect SL. Therefore, RF, SVM and ANN approaches, which have superior image handling capabilities, have been mostly used in this area. The development of VHR multispectral sensors such as WorldView-2 is critical for discriminating between tree canopies and vegetated gaps (Malahlela et al. 2014), because some of the inherent features of hyperspectral data, such as carotenoids and chlorophyll-sensitive bands, are preserved in WorldView-2/3 multispectral data (Mutanga et al. 2012). Visual interpretation of VHR multidate satellite imagery represents a promising way to detect canopy gaps with fairly low uncertainty (Dalagnol et al. 2019). Nonetheless, spatially accurate tree-scale validation data are not readily available, so automated approaches using VHR satellite data to accurately map canopy gaps over large and remote areas remain scarce (Dalagnol et al. 2019). The primary focus of this study is exploring the potential of WorldView-3 imagery to develop an SL monitoring system capable of detecting canopy gaps over large spatial extents in a closed-canopy submontane tropical forest using RF and SVM models.

Study area
This study was conducted in a c. 264-ha area in the MKFR (Fig. 1), which covers c. 213 083 ha and encircles the 7150-ha Mount Kenya National Park (MKNP), which begins at 3100 m and extends to the highest point, Batian Peak, at 5199 m (Lange & Bussmann 1998). The phonolites from main volcanic events c. 2 million years ago form the bedrock of the study area, but the inorganic body of the soils originates from a later coverage of volcanic ashes and pyroclastic rocks (Lange & Bussmann 1998). The mountain lies on the equator (latitude 0°10´S, longitude 37°20´E) and forms one of the most pristine mountain ecosystems globally and a remarkable landscape due to its peaks with rugged glacier-clad summits and diverse forests. The precipitation pattern consists of long rains from March to May and short rains from October to December (Supplementary Fig. S1, available online). The mountain shows a marked vegetational gradient dictated by altitude and rainfall amount. The lower tree line of the forest belt is due to agricultural and pastoral activities. Ocotea usambarensis, which never constitutes pure stands and prefers humid Nitisols and Acrisols, forms the evergreen submontane forests on the southern, southeastern and eastern slopes of Mount Kenya between 1500 and 2500 m altitude (Lange & Bussmann 1998).

Acquisition and pre-processing of satellite data
WorldView-3 data were acquired on 15 September 2019 for detecting canopy gaps in selectively logged sites. WorldView-2 imagery acquired on 30 January 2014 and Google Earth were used for historical comparison. To remove the haze component caused by additive scattering from the RS data, the dark object subtraction method was applied (Chavez 1988). The WorldView-2 satellite captures panchromatic images (450-800 nm) with a spatial resolution of 0.46 m and multispectral images with eight visible-near-infrared (VNIR) bands (400-1040 nm) at 1.84-m resolution, while WorldView-3 acquires panchromatic images with a spatial resolution of 0.30 m, multispectral imagery with eight VNIR bands at 1.2 m and eight shortwave-infrared (SWIR) bands (1195-2365 nm) at 3.7-m spatial resolution. Additionally, there are 12 clouds, aerosols, vapours, ice and snow (CAVIS) bands with a spatial resolution of 30 m. Only WorldView-3 VNIR bands were used because SWIR bands covering the study area exhibited extensive cloud cover. The satellite data were pansharpened to obtain new bands with the spectral resolution of the multispectral bands and the spatial resolution of the panchromatic band.
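
The haze correction described above can be illustrated with a minimal sketch. It assumes that the dark-object value of a band is simply its minimum pixel value; Chavez (1988) describes more careful ways of choosing the haze value, so this is a simplification and not the processing chain used in the study. The function name and the pixel values are hypothetical.

```python
def dark_object_subtraction(band):
    """Subtract the darkest pixel value of a band from every pixel.

    Assumption: the band minimum stands in for the haze (dark-object) value.
    Results are floored at zero so that no pixel becomes negative.
    """
    dark_value = min(min(row) for row in band)
    return [[max(pixel - dark_value, 0) for pixel in row] for row in band]


# Hypothetical digital numbers for a 3 x 3 window of one VNIR band
band = [
    [112, 130, 145],
    [108, 160, 171],
    [119, 125, 190],
]
corrected = dark_object_subtraction(band)
```

Each band would be corrected separately, because additive scattering differs between wavelengths.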

Acquisition of field data
Ground truth points were collected in February 2020 using a handheld Global Positioning System (eTrex® 20 GPS Receiver; Garmin, Olathe, KS, USA) and a pansharpened WorldView-3 image (pixel = 0.30 m). Seventy vegetated gaps formed after illegal logging of Ocotea trees were located in the field (Fig. 1). In the WorldView-3 imagery, the canopy gaps were either partially/fully illuminated or not illuminated at all (Fig. S2b & c). Since the human-made canopy gaps shared similar reflectance characteristics with natural canopy gaps, GPS coordinates of 301 vegetated gaps and 301 shaded gaps were collected, including the 70 canopy gaps formed after illegal logging of Ocotea trees. A vegetated gap is a forest canopy gap with low vegetation inside it, which is the initial stage of vegetation recovery from forest disturbance. Coordinates of the approximate locations of gap centres were recorded and then overlaid on the pansharpened WorldView-3 image. Using a geographic information system (GIS; ArcGIS® v. 10.3; ESRI, Redlands, CA, USA), points were set on the vegetated and shaded canopy gap pixels on the WorldView-3 imagery, and by following the edges of the pixels, the points were made into polygons.

Colbert M Jackson and Elhadi Adam

Locations of closed forest canopy could be identified using the WorldView-3 imagery; thus, a total of 301 polygons were extracted. Canopy gaps formed after the logging of Ocotea trees had their dimensions collected in the field, such as their dripline measurements, maximum length and compass orientation, together with the maximum width perpendicular to the length. Using the GIS, a map of canopy gaps was generated from ground survey data. The accuracy of canopy gap delineation using RS was assessed by comparing the delineated gaps with the dimensions collected from the field. The ground reference data were then randomly split into training and test datasets of 70% and 30%, respectively.
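
As an illustration of the random 70/30 split, the following sketch shuffles a list of reference polygons and cuts it at 70%. The text does not state whether the split was stratified by class, so a simple unstratified split is assumed here; the function name, the seed and the placeholder polygon records are hypothetical.

```python
import random


def split_reference(samples, train_fraction=0.7, seed=0):
    """Randomly split reference samples into training and test subsets."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = samples[:]          # copy, leaving the input list untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder records: 301 polygons for each of the three classes
polygons = (
    [("vegetated_gap", i) for i in range(301)]
    + [("shaded_gap", i) for i in range(301)]
    + [("forest_canopy", i) for i in range(301)]
)
train, test = split_reference(polygons)
```

Repeating the split with different seeds gives the repeated train/test partitions over which accuracy measures can be averaged.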

Feature extraction and selection
To minimize the effects of data saturation when mapping canopy gaps in dense forests, methods such as vegetation indices (VIs), image transform algorithms, texture measures and spectral mixture analysis have been utilized previously (Malahlela et al. 2014, Dalagnol et al. 2019). Table 1 presents 55 features (23 means, 23 standard deviations (SDs), 8 ratios and 1 brightness feature) extracted from pansharpened WorldView-3 imagery, and these features are subsequently referred to as variables in the analyses. The VIs were chosen from those sensitive to greenness and plant senescence (Malahlela et al. 2014). To accurately extract shaded gaps, the shadow detection index (SDI) of Shahi et al. (2014) was used in the modelling. Shade is associated with canopy gaps as nearby trees appear on the edges of gaps (Dalagnol et al. 2019). To reduce the redundancy and intercorrelation among the list of potential features, a subset of the best-performing features was extracted from the initial 55 features before classification; for these, approximate optimal thresholds were determined from the reference data using histograms and then adjusted to determine thresholds that resulted in the highest matching accuracy compared to the reference data.

Spectral separability
Spectral separability measures the distance between two signatures, and the separability between any combinations of variables can be used in the classification. Only the subset of best-performing features was used in the analysis. The digitized vegetated gaps, shaded gaps and forest canopy samples (Fig. S3) were assigned spectral information using the mean pixel values within their polygons and then used to generate respective signature files. The transformed divergence (TD) index and Jeffries-Matusita (J-M) distance separability measures were used. Divergence (D) is calculated from the mean and variance-covariance matrices of the data representing feature classes (Kavzoglu & Mather 2000):

$$D_{ij} = \tfrac{1}{2}\,\mathrm{tr}\left[(\Sigma_i - \Sigma_j)(\Sigma_j^{-1} - \Sigma_i^{-1})\right] + \tfrac{1}{2}\,\mathrm{tr}\left[(\Sigma_i^{-1} + \Sigma_j^{-1})(\mu_i - \mu_j)(\mu_i - \mu_j)^{T}\right]$$

The TD is introduced to reduce the impact of well-separated classes that may raise the average divergence value and make the divergence measure misleading (Kavzoglu & Mather 2000):

$$TD_{ij} = c\left[1 - \exp\left(-\frac{D_{ij}}{8}\right)\right]$$

where tr[·] is the trace of a matrix, which is the sum of the diagonal elements of the matrix, and $\Sigma_i$ and $\Sigma_j$ are the variance-covariance matrices of classes i and j; $\mu_i$ and $\mu_j$ are the corresponding mean vectors; c is a constant value defining the range of TD values. The J-M distance between distributions of two classes $\omega_i$ and $\omega_j$ has been defined as follows (Richards & Jia 1999):

$$JM_{ij} = 2\left(1 - e^{-B_{ij}}\right)$$

where $B_{ij}$ is the Bhattacharyya distance (Kailath 1967) computed as:

$$B_{ij} = \frac{1}{8}(\mu_i - \mu_j)^{T}\left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1}(\mu_i - \mu_j) + \frac{1}{2}\ln\left(\frac{\left|\frac{\Sigma_i + \Sigma_j}{2}\right|}{\sqrt{|\Sigma_i|\,|\Sigma_j|}}\right)$$

where $\mu_i$ and $\mu_j$ are the mean reflectances of classes i and j, $\Sigma_i$ and $\Sigma_j$ correspond to their covariance matrices, with $|\Sigma_i|$ and $|\Sigma_j|$ being the determinants of $\Sigma_i$ and $\Sigma_j$, respectively, ln is the natural logarithm function and T is the transposition function. The J-M distance improves the Bhattacharyya distance by normalizing it to between 0 and 2.
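
The two separability measures can be sketched for the simplest case of a single variable, where each covariance matrix reduces to a variance and the trace and determinant reduce to ordinary arithmetic. This is an illustrative simplification rather than the multivariate computation used in the study; the function names, the choice c = 2 and the class statistics are assumptions made for the example.

```python
import math


def divergence(mean_i, var_i, mean_j, var_j):
    """Divergence between two classes described by one variable each."""
    covariance_term = 0.5 * (var_i - var_j) * (1.0 / var_j - 1.0 / var_i)
    mean_term = 0.5 * (1.0 / var_i + 1.0 / var_j) * (mean_i - mean_j) ** 2
    return covariance_term + mean_term


def transformed_divergence(mean_i, var_i, mean_j, var_j, c=2.0):
    """Transformed divergence; c sets the upper bound (assumed to be 2 here)."""
    d = divergence(mean_i, var_i, mean_j, var_j)
    return c * (1.0 - math.exp(-d / 8.0))


def bhattacharyya(mean_i, var_i, mean_j, var_j):
    """Bhattacharyya distance for two univariate normal distributions."""
    pooled = 0.5 * (var_i + var_j)
    mean_term = 0.125 * (mean_i - mean_j) ** 2 / pooled
    covariance_term = 0.5 * math.log(pooled / math.sqrt(var_i * var_j))
    return mean_term + covariance_term


def jeffries_matusita(mean_i, var_i, mean_j, var_j):
    """J-M distance, bounded between 0 (identical) and 2 (fully separable)."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya(mean_i, var_i, mean_j, var_j)))


# Hypothetical class statistics (mean, variance) for one variable
jm_overlapping = jeffries_matusita(0.30, 0.010, 0.32, 0.010)
jm_separable = jeffries_matusita(0.10, 0.001, 0.90, 0.001)
td_separable = transformed_divergence(0.10, 0.001, 0.90, 0.001)
```

Values near 2 indicate that a variable separates the two classes well, while values near 0 indicate strong overlap.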

Pairwise feature comparison
A correlation matrix was computed from the reference samples (Table 2). Similarity scores were calculated between each pair of variables to determine whether or not two variables were co-referent. Pairwise comparisons are in the form of a matrix: $C = [c_{kp}]_{n \times n}$, where $c_{kp}$ is the pairwise comparison rating for the kth and pth criteria. The matrix C is reciprocal; that is, $c_{pk} = c_{kp}^{-1}$, and all of its diagonal elements are unity; that is, $c_{kp} = 1$ for $k = p$ (Malczewski 2016).
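
A pairwise correlation screen of this kind can be sketched as follows. The example computes Pearson correlations between every pair of variables and lists the pairs whose absolute correlation exceeds a threshold. The 0.70 threshold follows the value quoted in the Discussion; the use of Pearson's coefficient, the function names and the sample values are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlated_pairs(features, threshold=0.70):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    names = sorted(features)
    pairs = []
    for index, name_a in enumerate(names):
        for name_b in names[index + 1:]:
            r = pearson(features[name_a], features[name_b])
            if abs(r) > threshold:
                pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Hypothetical per-polygon values for three variables
features = {
    "band7_mean": [0.30, 0.35, 0.40, 0.45, 0.50],
    "band8_mean": [0.32, 0.36, 0.41, 0.47, 0.52],
    "sdi_mean": [0.90, 0.20, 0.70, 0.10, 0.50],
}
redundant = correlated_pairs(features)
```

Pairs returned by the screen carry largely the same information, so one member of each pair is a candidate for removal before classification.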

RF and SVM models
The results in this study were obtained by training the RF and SVM models in R software (R Core Team, Vienna, Austria). In RF, to differentiate between predefined categories, decision trees recursively partition the source set into subsets with bagged samples by univariate splits at internal nodes (Breiman 2001). Before running the model, the number of decision trees (ntree) and the number of predictor variables (mtry) randomly selected at each node are defined. The RF model aggregates predictions from all decision trees, then the majority vote of all trees assigns a final class for unknown features (Breiman 2001). Using the grid search method, the mean decrease in accuracy (MDA) was used to extract a subset of the best-performing variables (Breiman 2001). The MDA shows how much accuracy the model loses by excluding each variable; therefore, the higher the MDA value, the more important the variable in the model.
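
The idea behind the MDA can be sketched with a permutation test on any fitted classifier. In Breiman's formulation the permutation is applied to the out-of-bag samples of each tree; the version below is a simplified stand-in that permutes one variable across a single data set and averages the resulting loss of accuracy. The threshold rule used as the classifier, the function names and the sample values are all hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class matches the reference label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def mean_decrease_accuracy(predict, rows, labels, feature, repeats=20, seed=0):
    """Average drop in accuracy after shuffling the values of one variable."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [row[feature] for row in rows]
        rng.shuffle(column)  # break the link between this variable and the labels
        permuted = [dict(row, **{feature: value}) for row, value in zip(rows, column)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / repeats


def predict(row):
    """Toy classifier: a fixed threshold on a shadow index (illustration only)."""
    return "shaded_gap" if row["sdi"] > 0.5 else "forest_canopy"


rows = [
    {"sdi": 0.9, "noise": 0.2}, {"sdi": 0.8, "noise": 0.7},
    {"sdi": 0.7, "noise": 0.1}, {"sdi": 0.6, "noise": 0.9},
    {"sdi": 0.4, "noise": 0.3}, {"sdi": 0.3, "noise": 0.8},
    {"sdi": 0.2, "noise": 0.5}, {"sdi": 0.1, "noise": 0.6},
]
labels = ["shaded_gap"] * 4 + ["forest_canopy"] * 4

mda_sdi = mean_decrease_accuracy(predict, rows, labels, "sdi")
mda_noise = mean_decrease_accuracy(predict, rows, labels, "noise")
```

A variable that the classifier relies on shows a positive drop, whereas a variable it ignores shows none.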
The SVM assigns a class from one of the two possible labels when test data are introduced after the training phase (Vapnik 2000). The SVM separates the original data while maximizing the margin between classes and minimizing the misclassification error (Vapnik 2000). An advantage of ML classifiers such as SVM is that they are suited for extreme case binary classification. For any two distinguishable classes with k samples represented by $(x_1, y_1), \ldots, (x_k, y_k)$, where $x \in \mathbb{R}^n$ is a vector in an n-dimensional space and y is a class label with values of +1 or −1, SVM will look for an optimal hyperplane defined by $w = (w_1, \ldots, w_n)$ and b, such that (Huang et al. 2008):

$$y_i\,(w \cdot x_i + b) \geq 1, \quad i = 1, \ldots, k$$

The hyperplane can be located by minimizing the norm of w or the following function under the above inequality constraint (Huang et al. 2008):

$$F(w) = \tfrac{1}{2}\,(w \cdot w)$$

Using kernel functions, SVM applies non-linear decision boundaries and introduces a cost parameter C and gamma parameter γ to quantify the penalty of misclassification errors and to give the curvature weight of the decision boundary, respectively. The robust radial basis function was selected as it has fewer parameter values to predefine. A parameter search must be done to select the best C and γ for a certain classification problem. Therefore, the γ parameter needs to be predefined (Huang et al. 2008):

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right)$$

The cost parameter C also needs to be predetermined for the canopy gap-mapping problem. A cross-validation quantitative analysis of pairs of values for the C and γ parameters was carried out. The combination of parameters with the lowest error was chosen to train the algorithm. Both RF and SVM classifiers were trained using 70% of the ground reference data, and for robust classification results the ten-fold cross-validation method was repeated ten times.
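
The role of γ in the radial basis function kernel, and the set of parameter pairs that a grid search would evaluate, can be sketched as follows. The candidate values for C and γ are hypothetical, since the grid used in the study is not reported; the feature vectors are likewise invented for illustration.

```python
import itertools
import math


def rbf_kernel(x_i, x_j, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-variable feature vectors for three samples
sample_a = [0.30, 0.42]
sample_b = [0.31, 0.40]   # close to sample_a
sample_c = [0.80, 0.05]   # far from sample_a

similarity_near = rbf_kernel(sample_a, sample_b, gamma=0.1)
similarity_far = rbf_kernel(sample_a, sample_c, gamma=0.1)

# Hypothetical candidate values; every (C, gamma) pair would be scored by
# cross-validation and the pair with the lowest error retained.
cost_candidates = [0.1, 1, 10, 100]
gamma_candidates = [0.01, 0.1, 1]
parameter_grid = list(itertools.product(cost_candidates, gamma_candidates))
```

Larger γ values make the kernel fall off faster with distance, which produces a more sharply curved decision boundary.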

Measures of model performance
The performance of the RF and SVM models was evaluated using 30% of the ground truth data. Confusion matrices with overall accuracy, kappa coefficient and producer's and user's accuracies were computed and averaged over ten repetitions. Overall accuracy is computed by dividing the number of correctly classified pixels by the total number of pixels, while producer's accuracy is the percentage of a particular class on the ground that is indicated as such on the classified map (Mutanga et al. 2012). The user's accuracy shows the probability that a pixel indicated as a specific feature on the classification map actually represents that feature on the ground (Mutanga et al. 2012). The kappa coefficient is the difference between the observed accuracy and the agreement that would have been expected by chance (Foody & Mathur 2004).
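
The accuracy measures described above can be computed from a confusion matrix as in the following sketch. It assumes that rows hold the reference classes and columns the mapped classes, and it uses the standard Cohen's kappa, in which the difference between observed and chance agreement is divided by one minus the chance agreement. The function name and the counts are hypothetical and are not taken from Table 3.

```python
def accuracy_metrics(matrix):
    """Overall accuracy, kappa, producer's and user's accuracies.

    matrix[i][j] is the number of samples of reference class i
    that were mapped as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    reference_totals = [sum(matrix[i]) for i in range(n)]
    mapped_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = correct / total
    chance = sum(r * m for r, m in zip(reference_totals, mapped_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)
    producers = [matrix[i][i] / reference_totals[i] for i in range(n)]
    users = [matrix[j][j] / mapped_totals[j] for j in range(n)]
    return overall, kappa, producers, users


# Hypothetical counts: vegetated gaps, forest canopy, shaded gaps
confusion = [
    [28, 2, 0],
    [1, 29, 0],
    [0, 0, 30],
]
overall, kappa, producers, users = accuracy_metrics(confusion)
```

Producer's accuracy is read along the rows (omission errors) and user's accuracy along the columns (commission errors).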

Explanatory power of the variables extracted from the WorldView-3 bands
The most important variables, as indicated by the highest MDA values, were the brightness feature, the means of the WorldView-3 VNIR bands, the chlorophyll absorption ratio index (CARI), the modified chlorophyll absorption ratio index (MCARI), the carotenoid reflectance index 2 (CRI-2), the normalized pigment chlorophyll index (NPCI), the plant senescence reflectance index (PSRI), the red-edge position index (REPI), the SDI and the SD of the anthocyanin reflectance index (ARI; Fig. 2).

Optimization of the RF and SVM models
The iteration closest to the model mean produced default mtry and ntree values of 14 and 500, respectively, with an out-of-bag error rate of 0.074 (Fig. S4). Using the same approach, the SVM model produced 0.1 and 10 for gamma and cost, respectively, yielding a cross-validation error of 0.060.

Spectral separability
The mean spectral reflectance curves of the training data were extracted from pixels of the 17 best-performing variables and plotted with their SDs (Fig. S5). Some of the variables, such as the means of bands 1-5, the NPCI, the PSRI and the SD of the ARI exhibited considerable spectral overlaps across the three classes. Only the means of bands 7 and 8 and the SDI helped separate the three classes beyond 1 SD of uncertainty. The brightness feature and the means of band 6, the CARI and the MCARI show considerable overlaps between forest canopy and vegetated gaps. The spectral separability (Table S1) between the forest canopy and vegetated gaps was low in both TD index and J-M distance by the means of the WorldView-3 VNIR bands, but the VIs indicate that the forest canopy and vegetated gaps were clearly separable. The RF model also evaluated the ability of each variable to detect vegetated gaps, shaded gaps and tree crowns, and the mean of band 4 was critical in the identification of the three classes compared to the other variables (Fig. S6). The mean of band 5 was crucial in detecting vegetated gaps, as were the means of the CARI, the CRI-2, the MCARI, the PSRI, the REPI and bands 1, 3 and 6. The mean of band 5 also helped detect shaded gaps. The means of the brightness feature, bands 2 and 7, the SDI, the NPCI and the SD of the ARI helped to detect the forest canopy.

Pairwise feature comparison
Generally, the means of the WorldView-3 VNIR bands were highly correlated, providing redundant information (Table 2). The same applied for the means of the VIs, such as the NPCI, the CARI, the MCARI, the CRI-2 and the PSRI. The highest negative correlations were between the means of the REPI and the CRI-2, the MCARI, the NPCI and the PSRI. High negative correlations were also recorded between the mean of the SDI and those of the CARI and the MCARI. The lowest positive correlation, 0.001, was between the MCARI and bands 1 and 2. The means of the SDI and the MCARI were perfectly positively correlated, while the means of the REPI and the CARI were perfectly negatively correlated. A high correlation was found between the SD of the ARI and the means of the MCARI and the SDI.

Table 2. Pairwise correlations between the best-performing variables extracted from the WorldView-3 visible-near-infrared bands, computed from the reference samples. (See Table 1 for acronym definitions.)

Logging feature detectability
In the WorldView-2 and Google Earth imagery covering the study area, 66 Ocotea trees could be identified, and the canopy gaps created after these trees were logged were identified with high confidence in the WorldView-3 imagery. High confidence for gap identification meant a marked change in image pixels. Figure S7 shows the means of the CARI, the REPI and the MCARI extracted from WorldView-2 and WorldView-3. The circles in Fig. S7 compare the VI values before and after the SL events.

Model performance
The average overall accuracies for the RF and SVM models were similar (i.e., 92.3 ± 2.6% and 94.0 ± 2.1%, respectively; Table 3). The average kappa coefficients were 0.88 ± 0.03 for the RF model and 0.90 ± 0.02 for the SVM model. The user's accuracy ranges were 82-100% for the RF model and 86-100% for the SVM model. The producer's accuracy ranges were 83-100% for the RF model and 86-100% for the SVM model. Generally, the shaded gaps class showed the highest user's and producer's accuracies, primarily because the other two classes represent vegetation. Therefore, the non-vegetation class was spectrally distinguished from the vegetation classes. In general, forest canopy had lower user's and producer's accuracies, and this could be attributed to the range of reflectance characteristics of tree crowns in the WorldView-3 imagery.

Classification maps
In the post-processing stage, it was necessary to transform the classification maps so that they had only two classes, namely canopy gaps and forest canopy. As such, the shaded and vegetated gap classes were merged into one 'canopy gaps' class. In general, the classified maps indicated that SL of Ocotea trees caused mostly small-scale but spatially widespread disturbances in the MKFR (Fig. 3). The McNemar test returned a Z value of 0.88, which is below the critical value of 1.96; thus, there was no significant difference at the 95% confidence level between the confusion matrices of the two classifiers.
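
The McNemar comparison of two classifiers can be sketched from the counts of test samples on which the classifiers disagree. The example uses the common form without continuity correction, Z = (f12 − f21) / sqrt(f12 + f21), which is assumed here because the exact variant applied in the study is not stated. The function name and the counts are hypothetical.

```python
import math


def mcnemar_z(only_first_correct, only_second_correct):
    """McNemar Z statistic from the two discordant counts.

    only_first_correct: samples classified correctly by the first
    classifier but wrongly by the second, and vice versa.
    """
    discordant = only_first_correct + only_second_correct
    if discordant == 0:
        return 0.0  # the classifiers never disagree
    return (only_first_correct - only_second_correct) / math.sqrt(discordant)


# Hypothetical discordant counts for SVM versus RF on the test samples
z_value = mcnemar_z(only_first_correct=9, only_second_correct=6)
significant = abs(z_value) >= 1.96  # two-sided test at the 95% confidence level
```

Only the samples on which the two classifiers disagree contribute to the statistic, so two classifiers with similar overall accuracy can still differ significantly if their errors fall on different samples.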

Discussion
Using VHR multispectral satellite data, field data and ML algorithms showed great potential for monitoring SL in this tropical forest. Generally, the index means outperformed the SD and ratio variables in terms of the detection of canopy gaps because spectral variability in areas of canopy gaps was only due to shadows from surrounding trees and/or low vegetation. Dalagnol et al. (2019) reported that the most important variables for tree loss detection were the SDs of the reflectance VNIR bands (especially the red band) and the shadow fraction. Therefore, marked increases in spectral variability in tree loss areas were due to shadows cast by nearby trees, the non-photosynthetic vegetation and exposed soil. In Malahlela et al. (2014), the observed improved results were associated with the use of the red-edge band of the WorldView-2 sensor. The current study has identified VIs derived from bands 5 and 6 as crucial in the detection of canopy gaps due to SL. They are therefore transferrable to other tropical, closed-canopy ecosystems with different species compositions where ground-truth data are not available. The WorldView-3 satellite also has a SWIR sensor, which provides rich data for precisely identifying and characterizing forest landscape features, further enhancing WorldView-3's capacity to monitor canopy gaps. Although variable selection makes modelling simpler and faster to fit and predict, the MDA in the RF model is unable to detect false correlations; therefore, it may be biased because larger values are normally exaggerated and vice versa. Hur et al. (2017) developed an approach to overcome this based on using the Shapley value method on RF regression, but more experiments need to be conducted with other data types in order to confirm this.
Only correlation values >0.70 were significant (Table 2), thus providing redundant information, although ML algorithms can effectively handle this collinearity. In a negative correlation, one variable has the opposite effect compared to that of the other; therefore, the higher the absolute correlation coefficient, the more the variables might be critical during classification.
These results are an indication of the good agreement between the classification of logging of Ocotea trees and field data. However, error matrices only estimate the classification accuracy depending on the samples collected from the field; therefore, only biased conclusions can be drawn from such data (Foody & Mathur 2004). Future research will explore other metrics of model performance, such as balanced accuracy, bias score, precision, recall and F-score.

Fig. 2. The relative importance of the variables derived from WorldView-3 visible-near-infrared bands in discriminating vegetated and shaded gaps and forest canopy as measured by random forest classifiers using the mean decrease in accuracy. (See Table 1 for acronym definitions.)

The logging of Ocotea trees has led to canopy gaps that have stimulated the regrowth of secondary forest dominated by fast-growing species, mostly Macaranga kilimandscharica and Neoboutonia macrocalyx.
Mapping canopy gaps in tropical forests using optical RS remains a challenge because of persistent cloud cover, further compounded by unreliable cloud and cloud shadow detection algorithms. Other data sources such as SAR, which can capture images of Earth's surface regardless of smoke, darkness or cloud cover, should be explored. Trees that had dropped their leaves or lianas on top of tree crowns that experience sudden dieback may wrongly be interpreted as canopy gaps. Therefore, the application of time series methods that use seasonal models should be explored. The method used in this study might not be applicable in sparser forests because the only notable changes detected here were the disappearances of the shadows of the logged trees. In addition, a tree can be felled in the direction of an existing gap, meaning that the existing gap may undergo only an insignificant increase in size. Researchers should aim to develop accurate methods to detect such canopy gaps.
Nevertheless, data integration, which is heavily dependent on the compatibility of multi-source RS data, specifically the consistency in spatial, spectral, temporal and radiometric resolutions, accompanied by more sophisticated data fusion techniques, is key to the measurement and mapping of forest attributes (Jackson & Adam 2020). Furthermore, LiDAR may serve as a sampling technique when trying to scale up the impacts of SL events to larger regions. In its absence, other sources of publicly available training data can be used. The development of advanced automated methods for processing LiDAR data would lower data-processing costs, allowing for data acquisition over extensive areas (Jackson & Adam 2020). Further studies should make use of UAVs to cover larger areas at reduced costs, with caution taken regarding the current challenges posed by UAVs. Furthermore, airborne/spaceborne hyperspectral imagery covering extensive geographical areas may be obtained for such research through the use of hyperspectral imaging satellites, which are to be launched in the next few years (e.g., NASA's Surface Biology and Geology (SBG) mission and the Carbon Mapper constellation of satellites).
Supplementary material. For supplementary material accompanying this paper visit https://doi.org/10.1017/S0376892922000339.

Table 3. Confusion matrices for the random forest and support vector machine classifiers for the respective models whose overall accuracy was closest to the average overall accuracy.