Feature Extraction and Classification of Canopy Gaps Using GLCM- and MLBP-Based Rotation-Invariant Feature Descriptors Derived from WorldView-3 Imagery

Abstract: Accurate mapping of selective logging (SL) serves as the foundation for additional research on forest restoration and regeneration, species diversification and distribution, and ecosystem dynamics, among other applications. This study aimed to model canopy gaps created by illegal logging of Ocotea usambarensis in the Mt. Kenya Forest Reserve (MKFR). A texture-spectral analysis approach was applied to exploit the potential of WorldView-3 (WV-3) multispectral imagery. First, texture properties were explored in the sub-band images using fused grey-level co-occurrence matrix (GLCM)- and local binary pattern (LBP)-based texture feature extraction. Second, the texture features were fused with colour using the multivariate local binary pattern (MLBP) model. The G-statistic and Euclidean distance similarity measures were applied to increase accuracy. The random forest (RF) and support vector machine (SVM) were used to identify and classify distinctive features in the texture and spectral domains of the WV-3 dataset. The variable importance measurement in RF ranked the relative influence of sets of variables in the classification models. Overall accuracy (OA) scores for the respective MLBP models were in the range of 80–95.1%. The respective user's accuracy (UA) and producer's accuracy (PA) for the univariate LBP and MLBP models were in the range of 67–75% and 77–100%, respectively.


Introduction
Tropical forests cover c. 7% (c. 2 billion ha) of the earth's terrestrial environment and house about half of all biodiversity; they serve various economic, social, and environmental functions [1]. However, most conservation initiatives have not been successful because in many regions tropical forests are being cleared for timber and the expansion of agricultural land [2]. Unsustainable selective logging (SL) is probably the single biggest factor contributing to the global degradation of tropical forests [3]. SL reduces forest density when the sparsely distributed, most valuable trees are cut, creating canopy gaps without necessarily leaving any visible logging infrastructure [4,5]. Previous studies on the estimation of deforestation rates in tropical forests generally ignored the effects of SL [6]. Recently, however, researchers have emphasised the contribution of illegal logging and SL to the rates of deforestation [7]. Kenya's tropical forest cover is mainly composed of montane forests. For decades, the Mt. Kenya Forest Reserve (MKFR) has been subjected to illegal logging for its commercially valuable reserves of indigenous timber, especially the endangered Ocotea usambarensis, a hardwood tree sought for its excellent decay and insect resistance [8,9]. O. usambarensis has large bole diameters of between 3.75 and 9.5 m. Its seeds are produced every 10 years, germination is intermittent, and the tree takes a long time to reach maturity, i.e., c. 60 to 70 years [10].
Because SL affects biodiversity, ecosystem services, microclimate, and carbon pools, tracking this activity in tropical forests is crucial [11]. Canopy gaps are usually small (<1000 m²) [12]. In the past, gap size was determined by ground-based techniques [13]. Field surveys can be challenging, especially in rough terrain, and they are prone to error. Furthermore, the results of ground surveys can be subjective, e.g., Nakashizuka et al. [14]. In field surveys, it can also be difficult to distinguish fine canopy gaps and gaps where the understory vegetation is dense. A more appropriate solution is to observe forests from above rather than from below, through remote sensing (RS) [15]. Many methods applied to map SL in tropical forests used low/medium spatial resolution datasets that have a high rate of false detections [16]. The effects of SL remain detectable on medium spatial resolution images for only 1 to 3 years [17]. Therefore, the amount of forest degradation that is not detected using low/medium spatial resolution datasets is unknown [16]. Recently, very high-resolution (VHR) RS datasets, i.e., <1 m per pixel, have caught the interest of researchers studying SL in tropical forests [11,18–22]. Satellite and airborne VHR data are appropriate for precisely delineating forest canopy gaps, as well as individual tree crowns [11]. Accurate quantification of canopy gaps from disappearing tree crowns makes a crucial contribution to calculating carbon densities of forests, as well as to modelling the effects of forest degradation on tropical biodiversity. Currently, none of the algorithms used to compute carbon densities in forests account for canopy gaps. The accuracy of carbon estimates can be improved, provided canopy gaps are accurately identified.
The spectral information in multispectral RS data is limited, whereas the textural features of an RS dataset can reveal the spatial correlation among pixels and so detect change in the structure of vegetation [23,24]. Unlike pixel-based techniques, texture-based classification techniques consider how a pixel relates to its neighbourhood [25,26]. The application of texture analysis in RS studies has reported great achievements [27]. Texture in images is the frequency of change in the tone of pixels [27]. In RS, different approaches have been applied to extract textural features from images [26,28]. Haralick et al. [29] originally showed that texture measures, e.g., statistical metrics, can be extracted from the grey-level co-occurrence matrix (GLCM). Model-based approaches classify textured images according to probability distributions in random fields, e.g., Cross and Jain [30], and local linear transformations, e.g., Unser [31]. He and Wang [32] relied on the texture spectrum. These approaches have limited application because of their computational and time complexities [25,33]. Their spatial analysis is mostly applied to small neighbourhoods, on a single scale [33]. This difficulty has been addressed through the development of multichannel-based image analysis [26,28,33]. A textured image is normally reduced into characteristic feature images by applying, e.g., wavelet, Gabor, or neural network-based filters [26,28]. Thus, a high-dimensional textural pattern can be modelled with just a few feature statistics [26,28]. Among the texture models, variants of the local binary pattern (LBP) such as the multivariate local binary pattern (MLBP) [27] and the multivariate advanced local binary pattern (MALBP) [26] are computationally convenient for RS images [27]. The LBP model was developed by Ojala et al. [34] for grey-level images. Very high-resolution RS data and GLCM analysis have been successfully used in mapping tropical forests [35–38].
The visual interpretation of VHR multi-date RS data is a good way to detect and quantify gaps in forests with fairly low uncertainty [11]. Nonetheless, spatially precise validation data and automated approaches based on VHR-RS datasets that detect canopy gaps with high precision over extensive areas are both lacking [11]. Although the GLCM and LBP models have been applied in other contexts, they have not been used to study canopy gaps, especially in montane tropical forests using VHR-RS data. Therefore, this study aimed to use the fused GLCM-/MLBP-based approach to test whether canopy gaps from illegal logging of O. usambarensis in a highly heterogeneous montane tropical forest can be accurately mapped using a WorldView-3 (WV-3) dataset. The performance of the basic LBP model and its variant, i.e., the MLBP model, was compared. The study also aimed to provide a framework for carrying out similar studies over larger spatial extents in the future. High-resolution (HR) WorldView-2 (WV-2) imagery and Google Earth were used to provide historical data. To achieve high classification accuracies, the ability to combine/discriminate between samples is crucial [34]; therefore, two similarity measures were used, i.e., the G-statistic and Euclidean distance. Due to their excellent performance and clear logic in handling RS data, the random forest (RF) and support vector machine (SVM) were used to classify canopy gaps in the study area.

Study Area
The Mt. Kenya Forest Reserve (MKFR) was established in 1932 under the management of the Department of Forest, now known as the Kenya Forest Service, with the primary goal of preserving and developing the forest reserve. This included creating plantations to replace harvested indigenous stands, regulating resource access, and preserving the forest industry [8]. The forest reserve, located in Central Kenya, covers c. 213,083 ha and spans a range of elevation, slope, and aspect positions [39]. The snow-capped mountain lies at latitude 0°10′ S and longitude 37°20′ E [40]. In 1997, the mountain received the UNESCO World Heritage Site designation [9]. The study covers approximately 264 ha of Chuka Forest in Tharaka Nithi County. Chuka Forest is part of the Mt. Kenya ecosystem and encompasses approximately 21,740 ha (Figure 1).
Altitude and differences in the amount of rainfall received have resulted in a pronounced vegetation gradient on Mt. Kenya. Mt. Kenya's lower slopes are characterised by montane forest, including the species Newtonia buchananii, Podocarpus latifolia, Croton megalocarpus, Nuxia congesta, Olea europaea ssp. africana, Juniperus procera, Calodendrum capense, and Ocotea usambarensis. O. usambarensis also occurs in the sub-montane forests on the extremely humid eastern, southern, and south-eastern slopes at 1500 to 2500 m [41].

Acquisition and Pre-Processing of Satellite Data
This study used a WorldView-3 (WV-3) multispectral dataset acquired on 15 September 2019 to detect canopy gaps, while WorldView-2 (WV-2) data acquired on 30 January 2014 and historical imagery in Google Earth provided a historical reference for logging (Figure 2). The satellite data of MKFR were provided by Swift Geospatial, Pretoria, South Africa. The panchromatic band of the WV-2 was captured with a spatial resolution of 0.46 m, whereas the WV-3 captures panchromatic data at 0.3 m [42,43]. The multispectral images (8 visible-near-infrared bands; VNIR) of the WV-2 were acquired at 1.84 m [42], whereas the WV-3 captures VNIR data at 1.2 m [43]. The WV-3 also acquires eight shortwave-infrared (SWIR) bands with a pixel size of 3.7 m, and eight CAVIS (clouds, aerosol, vapour, ice, and snow) bands at 30 m [43]. The FLAASH module of ENVI 5.3 was used to atmospherically calibrate the images by converting the digital numbers (DN) to top-of-atmosphere reflectance. The WV-3 data were co-registered with the WV-2 image so that features could be matched between the two datasets; this produced an average root mean square error (RMSE) of 3.41 m. Before calculating textural features, the VNIR bands were down-scaled to 0.3 m pixels, with each 1.2 m pixel sub-divided into 16 pixels [38]. This method allowed texture information to be extracted without introducing the uncertainties of pansharpened VNIR bands [44].
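The sub-division of each 1.2 m VNIR pixel into 16 pixels of 0.3 m can be illustrated with a minimal sketch. It assumes plain value replication (nearest-neighbour style), which the text does not state explicitly; the function name `subdivide_pixels` and the toy band are hypothetical.

```python
import numpy as np

def subdivide_pixels(band, factor=4):
    """Split every pixel into factor x factor sub-pixels that carry the same value.

    Assumption: values are replicated, not interpolated.
    With factor=4, a 1.2 m pixel becomes 16 pixels of 0.3 m.
    """
    band = np.asarray(band)
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# Toy 2 x 2 band standing in for a 1.2 m VNIR band
band = np.array([[10, 20],
                 [30, 40]], dtype=np.uint16)
fine = subdivide_pixels(band)
print(fine.shape)
```

Because the values are only replicated, no new spectral information is created; the finer grid simply lets texture windows be defined at the 0.3 m spacing.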

Acquisition of Field Data
A Global Positioning System receiver (eTrex® 20 GPS Receiver; Garmin, Olathe, KS, USA) and a false-colour composite (853-RGB) of the WV-3 images were used to locate canopy gaps in the field in February 2020. The three bands were among the best-performing bands in Jackson and Adam [45]; therefore, they were used in this study's analysis. Additionally, the 853-RGB is a well-known band combination for analysing vegetation [43]. In the WV-3 image, gaps were fully illuminated, partially illuminated, or not illuminated. In the study area, the human-made canopy gaps had the same reflectance as natural canopy gaps. The canopy gaps were vegetated, i.e., the gaps had low vegetation in them. This was the initial stage of vegetation recovery from disturbance. GPS coordinates of 100 vegetated gaps and 100 shaded gaps per image block were collected and overlaid on the WV-3 image using a geographic information system (ArcGIS® v. 10.3; ESRI, Redlands, CA, USA). The pixels of the vegetated gaps, as well as those of the shaded gaps, were extracted from the WV-3 imagery. The spatial resolution of the WV-3 data enabled forest canopies to be derived as references; thus, 100 samples of tree crowns were extracted per block. The ground reference data were randomly split into 70% training data and 30% test data.
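The random 70/30 split of the reference samples can be sketched as follows. The split is applied per class here, which is an assumption (the text only says the data were split randomly); the sample identifiers, the seed, and the function name `split_70_30` are hypothetical.

```python
import random

def split_70_30(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples reproducibly and cut them into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# 100 hypothetical reference samples per class for one image block
classes = ("vegetated_gap", "shaded_gap", "forest_canopy")
reference = {name: [f"{name}_{k}" for k in range(100)] for name in classes}

# Assumption: splitting inside each class keeps the classes balanced
splits = {name: split_70_30(ids) for name, ids in reference.items()}
for name, (train, test) in splits.items():
    print(name, len(train), len(test))
```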
Appropriate image block sizes were selected to calculate texture features. Regions in large blocks show a mixture of textures, while small blocks may not contain enough pixels to compute a texture measure reliably [27]. In this study, six non-overlapping subset images (each 1400 × 1400 pixels) were generated from the WV-3 imagery covering the study area. The three classes (vegetated gaps, shaded gaps, and forest canopy) were easily differentiated in the WV-3 imagery (Figure 3).
Pixels were sampled randomly, covering areas close to the class edges and centres; pixels around class edges were vital in aiding the classifier's edge detection of textural features [38]. To attain high classification accuracy, the size of the reference samples corresponding to the texture classes was kept the same, i.e., each consisted of 20 × 20 pixels (Table 1). The same number of reference points was collected for the vegetated gaps, shaded gaps, and forest canopy because data imbalance reduces the accuracy and performance of the classifier [46].
Table 1. Texture classes.
Vegetated gap: Low-lying vegetation in the forest canopy.
Shaded gap: Gaps in the forest canopy that are darker because of the shadows cast by the nearby tree crowns.
Forest canopy: The topmost layer of a forest, mostly tree crowns with a few emergent trees having heights that shoot above the canopy.
Dimensions of canopy gaps created by the logging of Ocotea trees were measured in the field, including the dripline measurements, maximum length, compass orientation, and maximum breadth [47]. Points directly below the dripline were noted and, since it was impossible to cover the entire canopy dripline, the boundaries were generalised to some extent. A map of the gaps was created from the ground data using ArcGIS. Comparing the measurements gathered in the field with the remote sensing (RS) measurements enabled the accuracy of the delineated canopy gaps to be evaluated.

Feature Extraction and Selection
Texture analysis is an effective component of classification for high-resolution (HR) images, and it is convenient to use because image segmentation is not needed [48]. A crucial property of texture is the repetitive nature of the pattern(s) in an area [26]. The spectral information in images has been frequently used in interpreting and analysing images; however, the same object may reflect differently in different parts of an image, and different objects may reflect the same [49]. This affects the accuracy of image analysis. Improvement in the spatial resolution of RS images has revealed more spatial structures and texture features, which has led to increased classification accuracy.
According to Cohen and Spies [50], texture features drawn from HR images can be applied in forestry research. Lucieer et al. [27] and Suruliandi and Jenicka [26] noted that significantly high classification accuracies have been achieved using textural information. The texture-spectral analysis approach used in this study evaluated the widely used grey-level co-occurrence matrix (GLCM) texture measures and the multivariate local binary pattern (MLBP), an extension of the state-of-the-art texture descriptor local binary pattern (LBP). The GLCMs are theoretically simple and easy to implement, and they generate fewer features [51].
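To make the GLCM texture measures concrete, the sketch below builds a symmetric, normalised co-occurrence matrix for one pixel offset and derives homogeneity, contrast, entropy, angular second moment, and correlation from it. It is an illustration under stated assumptions, not the study's implementation: the quantisation to four grey levels, the single horizontal offset, and the function names `glcm` and `glcm_features` are all hypothetical choices.

```python
import numpy as np

def glcm(img, levels, dr=0, dc=1):
    """Symmetric, normalised grey-level co-occurrence matrix for offset (dr, dc)."""
    img = np.asarray(img, dtype=int)
    rows, cols = img.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(max(0, -dr), min(rows, rows - dr)):
        for c in range(max(0, -dc), min(cols, cols - dc)):
            i, j = img[r, c], img[r + dr, c + dc]
            counts[i, j] += 1
            counts[j, i] += 1  # count the pair in both directions
    return counts / counts.sum()

def glcm_features(p):
    """Five common GLCM statistics computed from a normalised matrix p."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    nonzero = p[p > 0]
    if sd_i * sd_j > 0:
        cor = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j)
    else:
        cor = 1.0  # assumption: a constant image is treated as fully correlated
    return {
        "HOM": (p / (1.0 + (i - j) ** 2)).sum(),
        "CON": (((i - j) ** 2) * p).sum(),
        "ENT": -(nonzero * np.log(nonzero)).sum(),
        "ASM": (p ** 2).sum(),
        "COR": cor,
    }

# Toy 4 x 4 image already quantised to 4 grey levels
toy = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
p = glcm(toy, levels=4)
features = glcm_features(p)
print(features)
```

A common way to reduce direction dependence is to compute the statistics for several offsets (for example 0°, 45°, 90°, and 135°) and average them; whether the study did exactly this is not stated here.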
The LBP texture model was put forward by Ojala et al. [34]. The LBP operator thresholds P evenly spaced neighbouring pixels on a circle of radius R at the value of the centre pixel. It is capable of detecting uniform patterns for any angular space quantisation and spatial resolution. The LBP was fused with rotation-invariant GLCM measures, i.e., homogeneity, contrast, entropy, angular second moment, and correlation, which led to the following features: LBP/HOM, LBP/CON, LBP/ENT, LBP/ASM, and LBP/COR for bands 3 (Green), 5 (Red), and 8 (Near Infrared 2) of the WV-3 image. The values of these features were computed and allocated to the image pixels, thus revealing textural patterns. The histogram of the joint LBP and GLCM feature occurrence therefore formed the final texture feature.
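The thresholding step of the LBP operator can be sketched for P = 8 and R = 1. The example below uses the "uniform" rotation-invariant code that is commonly paired with LBP, and it approximates the circle by the 3 × 3 square neighbourhood without interpolation; both are simplifying assumptions, and the name `lbp_riu2` is hypothetical.

```python
import numpy as np

# The 8 neighbours in circular order as (row, column) offsets
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def lbp_riu2(img):
    """Rotation-invariant uniform LBP code for every interior pixel.

    Uniform patterns (at most two 0/1 transitions around the circle) get the
    number of neighbours >= centre (0..8); all other patterns get the code 9.
    """
    img = np.asarray(img, dtype=float)
    centre = img[1:-1, 1:-1]
    h, w = centre.shape
    bits = np.stack([
        (img[1 + dr:1 + dr + h, 1 + dc:1 + dc + w] >= centre).astype(int)
        for dr, dc in OFFSETS
    ])
    # Number of 0/1 changes when walking once around the circle
    transitions = np.abs(bits - np.roll(bits, 1, axis=0)).sum(axis=0)
    ones = bits.sum(axis=0)
    return np.where(transitions <= 2, ones, len(OFFSETS) + 1)

# Toy patch: a bright edge above a darker region
edge = np.array([[9, 9, 9],
                 [1, 5, 1],
                 [1, 1, 1]])
codes = lbp_riu2(edge)
histogram = np.bincount(codes.ravel(), minlength=10)
print(codes, histogram)
```

Rotating the patch leaves the code unchanged, which is the property that makes the histogram of codes usable as a rotation-invariant texture signature.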
The LBP operator describes the texture of a single band. To improve classification accuracy, Lucieer et al. [27] applied the LBP texture measure to colour images by proposing a multivariate texture model, i.e., the MLBP operator, which describes local pixel relations in three bands [26,27]. Three 3 × 3 matrices describe the local texture in individual bands, while six 3 × 3 matrices compare texture among bands. In the MLBP model, the univariate GLCM measure (e.g., HOM, homogeneity) was extended to multivariate homogeneity (MHOM), i.e., comprising the individual independent homogeneities HOM3, HOM5, and HOM8 representing bands 3 (Green), 5 (Red), and 8 (Near Infrared 2), respectively. The global texture pattern description was derived by combining the MLBP and MHOM in a 2-D histogram, in which the x-axis denotes MLBP and the y-axis denotes MHOM. To incorporate colour into the MLBP model, the same procedure was repeated for the MLBP and the remaining GLCM feature composites, i.e., the respective composites of contrast (MCON), entropy (MENT), angular second moment (MASM), and correlation (MCOR).
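The three within-band and six cross-band neighbourhood comparisons can be sketched as a single count per pixel. This is a simplified reading of the multivariate operator described above: it sums, over all nine band pairings, the number of the 8 neighbours that are at least as bright as the centre, giving a code between 0 and 72. The function name `mlbp` and the toy bands are hypothetical.

```python
import numpy as np

OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def mlbp(bands):
    """Multivariate LBP count for a stack of three bands with shape (3, H, W).

    For each centre band and each neighbour band (9 pairings, of which 3 are
    within-band and 6 are cross-band), count the neighbours >= centre.
    """
    bands = np.asarray(bands, dtype=float)
    _, height, width = bands.shape
    h, w = height - 2, width - 2
    code = np.zeros((h, w), dtype=int)
    for centre_band in bands:
        centre = centre_band[1:-1, 1:-1]
        for neighbour_band in bands:
            for dr, dc in OFFSETS:
                neighbour = neighbour_band[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
                code += (neighbour >= centre)
    return code

# Toy stacks standing in for bands 3, 5, and 8
identical = np.full((3, 3, 3), 5.0)
stepped = np.stack([np.full((3, 3), v) for v in (0.0, 1.0, 2.0)])
print(mlbp(identical), mlbp(stepped))
```

The resulting codes can then be binned together with a second measure, such as a multivariate homogeneity value, to form the 2-D histogram used as the texture signature.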

Similarity and Separability between Training Signatures
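Two measures were used to compare the similarity between training signatures, i.e., the G-statistic and Euclidean distance. The G-statistic is defined as follows [52]:

$$G = 2\left[\sum_{s,m}\sum_{i=1}^{tb} f_i \log f_i - \sum_{s,m}\left(\sum_{i=1}^{tb} f_i\right)\log\left(\sum_{i=1}^{tb} f_i\right) - \sum_{i=1}^{tb}\left(\sum_{s,m} f_i\right)\log\left(\sum_{s,m} f_i\right) + \left(\sum_{s,m}\sum_{i=1}^{tb} f_i\right)\log\left(\sum_{s,m}\sum_{i=1}^{tb} f_i\right)\right]$$

where sample s corresponds to a histogram of the texture measure distribution, while model m corresponds to a histogram of a reference area; tb is the number of bins and f_i represents the probability for bin i. The Euclidean distance is calculated as follows [53]:

$$d(x, y) = \sqrt{\sum_{i=1}^{n}\left(x_i - y_i\right)^2}$$

where x is the vector of the first spectral signature, y is the vector of the second spectral signature, and n is the number of image bands.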
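Both similarity measures are short to compute once the signatures are available as arrays. The sketch below assumes the G-statistic in its two-sample log-likelihood-ratio form over a sample histogram s and a model histogram m, which is the form usually paired with LBP histograms; the function names and the toy inputs are hypothetical.

```python
import numpy as np

def xlogx(values):
    """Element-wise v * log(v), with the convention 0 * log(0) = 0."""
    v = np.asarray(values, dtype=float)
    safe = np.where(v > 0, v, 1.0)
    return np.where(v > 0, v * np.log(safe), 0.0)

def g_statistic(s, m):
    """Two-sample G-statistic between a sample and a model histogram."""
    table = np.vstack([s, m]).astype(float)   # rows: s and m, columns: bins
    return float(2.0 * (xlogx(table).sum()
                        - xlogx(table.sum(axis=1)).sum()   # per-histogram totals
                        - xlogx(table.sum(axis=0)).sum()   # per-bin totals
                        + xlogx(table.sum())))             # grand total

def euclidean_distance(x, y):
    """Euclidean distance between two spectral signature vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(((x - y) ** 2).sum()))

same = g_statistic([4, 6, 10], [4, 6, 10])
different = g_statistic([10, 0], [0, 10])
distance = euclidean_distance([3, 0], [0, 4])
print(same, different, distance)
```

Identical histograms give a G-statistic of zero, and the value grows as the two histograms diverge, so a lower score indicates more similar texture.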

Training of Random Forest and Support Vector Machine Classifiers
The random forest (RF) models contain bootstrapped ensembles of decision trees; they can handle large numbers of independent variables while still reporting high classification accuracy [54]. In this research, for the LBP model, the RF classifier was trained using 15 WV-3 metrics for the image blocks as predictors to classify the samples as vegetated gap, shaded gap, or forest canopy. For the MLBP model, 5 WV-3 metrics were used to train the RF classifier. During the training process, the learning parameters of the RF classifier, i.e., the number of predictor variables tried at each split (mtry) and the number of decision trees (ntree), were optimised to obtain the best possible settings. Each tree is grown on a randomised bagged subset of the training data, and the samples left out of that subset are used to cross-validate the tree's result. This enables the RF models to develop an "out-of-bag" (OOB) accuracy and measures of the significance of specific input variables in the model [54]. The 10-fold cross-validation (CV) technique was used to extract the optimal parameters applied in the training phase of the RF models. The mean decrease in accuracy (MDA) and the mean decrease in Gini (MDG) indices of variable importance were used [54]. The MDA computes the added error rate associated with excluding an input variable from a tree, while the MDG calculates the forest-wide average reduction in node impurities from splits on a variable [55]. The higher the MDA and MDG indices, the more influential the corresponding variables. For a robust selection of features, a combined ranking of both indices was applied in the RF models [55].
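A combined ranking of the two importance indices can be sketched in a few lines. The text does not say how the two rankings were combined, so the mean of the two ranks is used here as an assumption; the importance scores are invented purely for illustration, and the function names are hypothetical.

```python
def rank_descending(scores):
    """Map each variable to its rank, where rank 1 is the largest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def combined_ranking(mda, mdg):
    """Order variables by the mean of their MDA rank and MDG rank.

    Assumption: ties in the mean rank are broken alphabetically.
    """
    mda_rank, mdg_rank = rank_descending(mda), rank_descending(mdg)
    mean_rank = {name: (mda_rank[name] + mdg_rank[name]) / 2 for name in mda}
    return sorted(mean_rank, key=lambda name: (mean_rank[name], name))

# Invented importance scores for three of the texture variables
mda_scores = {"B3_LBP/HOM": 0.30, "B5_LBP/CON": 0.20, "B8_LBP/ENT": 0.10}
mdg_scores = {"B3_LBP/HOM": 5.0, "B5_LBP/CON": 9.0, "B8_LBP/ENT": 1.0}
ranking = combined_ranking(mda_scores, mdg_scores)
print(ranking)
```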
The support vector machine (SVM) models apply a supervised binary classifier that is able to classify linearly inseparable pixels; support vectors are the samples nearest to the separating hyperplane [55]. The SVM models find support vectors with an optimal margin near the separating hyperplane [56]. SVM models use kernel functions to apply decision boundaries that are not linear, which introduces the gamma (γ) and cost (C) parameters. An approach similar to that used for the RF was applied to select the optimal pair of C and γ. The cost parameter determines the penalty for misclassification errors, while gamma gives the curvature weight of the decision boundary. The radial basis function (RBF) kernel was chosen for this study.
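The role of γ in the RBF kernel, K(x, y) = exp(−γ‖x − y‖²), and the idea of searching over (C, γ) pairs can be illustrated with a small sketch. The candidate values in the grid are hypothetical and are not the ones tuned in the study; the function name `rbf_kernel` is also an illustrative choice.

```python
import math
from itertools import product

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Larger gamma makes similarity fall off faster, i.e. a more curved boundary
near = rbf_kernel((0.0, 0.0), (1.0, 1.0), gamma=0.1)
far = rbf_kernel((0.0, 0.0), (1.0, 1.0), gamma=1.0)

# Hypothetical candidate values; each (C, gamma) pair would be scored by CV
costs = [0.1, 1, 10, 100]
gammas = [0.01, 0.1, 1]
grid = list(product(costs, gammas))
print(near, far, len(grid))
```

Each pair in the grid would be evaluated with cross-validation, and the pair with the lowest error retained, mirroring the tuning procedure described for the RF.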
The classification of the texture features using the RF and SVM classifiers involved two phases. First, the classifiers were trained using the global histograms of the known samples and their class labels; the two should be consistent with the classifiers' corresponding pair of classes. Second, the global histograms of the unknown samples were input to the RF and SVM classifiers, which search for the class label of a test sample by comparing its global histogram with those of the training samples. The RF and SVM classification models used 70% and 30% of the ground data as training and test data, respectively.

Classification Post-Processing
In the classification post-processing stage, the shaded gap and vegetated gap classes were merged into one class, canopy gaps. Therefore, the final map was composed of two classes only: canopy gaps and forest canopy. Morphological filters were implemented with ArcGIS tools: thin corridors and small spaces amongst the forest canopy were removed using the Shrink tool, while the Expand tool was applied to enlarge the classified raster gaps. The Focal Statistics tool was used to minimise pixelation and eliminate remaining trees within gaps. The Majority filter retained the most frequently occurring value. Gap polygons with an area < 100 m² were eliminated using the Select function because only gaps ≥ 100 m² are significant for carbon dynamics [57], and to rule out small gaps that were not likely caused by felled O. usambarensis trees.
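The area threshold can be illustrated on a binary gap raster. The study performed this step on polygons in ArcGIS; the sketch below is only a raster analogue, assuming 0.3 m pixels and 4-connected regions, with a hypothetical function name and toy mask.

```python
from collections import deque
import numpy as np

def remove_small_gaps(mask, pixel_size_m=0.3, min_area_m2=100.0):
    """Keep only 4-connected gap regions whose area is at least min_area_m2."""
    mask = np.asarray(mask, dtype=bool)
    keep = np.zeros_like(mask)
    seen = np.zeros_like(mask)
    rows, cols = mask.shape
    pixel_area = pixel_size_m ** 2
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first search collects one connected gap region
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            region = []
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr, nc] and not seen[nr, nc]:
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            if len(region) * pixel_area >= min_area_m2:
                for r, c in region:
                    keep[r, c] = True
    return keep

# Toy raster: one 40 x 40 pixel gap (144 m²) and one 5 x 5 pixel gap (2.25 m²)
gaps = np.zeros((50, 50), dtype=bool)
gaps[0:40, 0:40] = True
gaps[44:49, 44:49] = True
filtered = remove_small_gaps(gaps)
print(int(gaps.sum()), int(filtered.sum()))
```

At 0.3 m resolution one pixel covers 0.09 m², so the 100 m² threshold corresponds to regions of at least 1112 pixels.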

Measures of Model Performance
Accuracy assessment is used to determine whether pixels identified in the field are classified as they should be [58]. The performance of the two machine learning (ML) classifiers was assessed on 30% of the ground data. Confusion matrices, together with the overall accuracy (OA), kappa coefficient (κ), producer's accuracy (PA), and user's accuracy (UA) derived from them, were produced and averaged over ten iterations. The OA is calculated by dividing the number of correctly classified pixels by the total number of pixels and is typically expressed as a percentage [59]. The PA is the proportion of pixels of a particular class on the ground that are labelled as that class on the classification map [59]. The UA is the likelihood that a pixel labelled as a given class on the classified map belongs to that class on the ground [60]. The kappa coefficient (κ) measures the agreement beyond chance, based on the difference between the observed and the expected accuracy. Therefore, the classification accuracies and kappa statistics computed from error matrices were used to evaluate how the classifiers performed.
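The four measures follow directly from a confusion matrix. The sketch below uses the counts of the first confusion matrix reported for image block D, with rows as classified classes and columns as reference classes (forest canopy, shaded gap, vegetated gap); the function name `accuracy_metrics` is hypothetical.

```python
import numpy as np

def accuracy_metrics(cm):
    """Overall, user's, and producer's accuracy plus kappa from a confusion matrix.

    Rows are classified classes; columns are reference classes.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    diagonal = np.diag(cm)
    classified_totals = cm.sum(axis=1)
    reference_totals = cm.sum(axis=0)
    oa = diagonal.sum() / total
    ua = diagonal / classified_totals
    pa = diagonal / reference_totals
    # Agreement expected by chance from the marginal totals
    expected = (classified_totals * reference_totals).sum() / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, ua, pa, kappa

cm = [[25, 3, 4],
      [3, 26, 1],
      [2, 1, 25]]
oa, ua, pa, kappa = accuracy_metrics(cm)
print(round(100 * oa, 1), np.round(100 * ua), np.round(100 * pa), round(kappa, 3))
```

The rounded user's accuracies (78, 87, and 89%) reproduce the values given with that matrix, which confirms the row and column orientation assumed here.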

Similarity and Separability between Training Signatures
The G-statistic and Euclidean distance made it possible to avoid erroneous assumptions regarding the distribution of features. The G-statistic score indicates the probability that two samples are from the same population: a higher score means that the probability of two samples being from the same population is low, and vice versa. Likewise, the Euclidean distance is 0 if two signatures are alike, and it is higher for signatures showing little similarity. The results (Table 2) indicated that forest canopy and vegetated gaps had the lowest Euclidean distance between them, with a value of 41. The G-statistic value between newly created gaps and shaded gaps was the highest at 9.01, while that between forest canopy and vegetated gaps was the lowest (0.38), followed by forest canopy and newly created gaps. The Euclidean distance separability measure followed a similar pattern. Generally, the texture classes, i.e., shaded gaps, vegetated gaps, and tree crowns, had good separation.

Optimisation of Random Forest and Support Vector Machine Classifiers
The results of the optimisation of the RF and SVM parameters for the six image blocks are listed in Table 3. The average importance scores (Figure 4) identified the most important variables: band 3's LBP/HOM and LBP/CON, band 5's LBP/HOM and LBP/CON, and band 8's LBP/ENT and LBP/ASM. The least important variables, which showed the lowest average importance scores, were band 3's LBP/ENT and LBP/ASM, and band 5's LBP/ENT and LBP/ASM.

Model Performance
The confusion matrices in Table 4 show the results of the RF and SVM classifiers for the MLBP/MHOM, the MLBP/MCON, the MLBP/MENT, the MLBP/MASM, and the MLBP/MCOR models for image block D, which recorded an average overall accuracy of 86.88 ± 5.1% and 89.78 ± 3.7% for RF and SVM classifiers, respectively. Image block E's RF classification attained an average overall accuracy of 87.00 ± 5.1% while the SVM's was 87.28 ± 5.8%. The SVM classification of the MLBP/MCON was the highest at 95.1%. The respective univariate LBP measures provided overall accuracies (OAs) in the range of 67–75%.

Table 4. Error matrices (rows: classified class; columns: reference class; FC: forest canopy; SG: shaded gaps; VG: vegetated gaps).

First matrix
Class    FC    SG    VG    Total    UA (%)
FC       25     3     4       32        78
SG        3    26     1       30        87
VG        2     1    25       28        89
Total    30    30    30       90

Second matrix
Class    FC    SG    VG    Total    UA (%)
FC       25     3     3       31        81
SG        3    27     1       31        87
VG        2     0    26       28        93
Total    30    30    30       90

Table 5. For each classifier, the table reports the average classification accuracy in the form µ ± σ, where µ is the mean and σ is the standard deviation of the OA.

Image Classification
A subset of the classification results using MLBP/MHOM for image block D, whose RF and SVM model optimisation parameters recorded some of the lowest OOB and CV errors, respectively, is shown in Figure 5. Visual interpretation alone indicates that the classes are mapped correctly. The multivariate local binary pattern (MLBP) model distinguishes the classes well because it assigns distinct pattern codes to different local texture patterns. The boundaries of the extracted canopy gaps are overlaid on the ground truth canopy gap areas, which are shown in red.

Discussion
When moderate-resolution remote sensing (RS) imagery is used to detect canopy gaps from selective logging (SL), these gaps may be spectrally confused with gaps caused by natural disturbances such as windfall. Remote sensing (RS) methods applied to Landsat datasets can only detect selectively logged areas at moderately high intensities, i.e., >20 m³ ha⁻¹; 3–7 trees ha⁻¹. These methods are incapable of quantifying the magnitude and duration of logging damage in regions undergoing lower logging intensities, i.e., <20 m³ ha⁻¹ [61]. Because SL gaps occur at the sub-pixel scale, the broad spectral range of Landsat wavelengths cannot detect subtle forest changes. Pansharpening of multispectral images and upscaling of spatial resolutions are some of the commonly applied techniques to enhance lower-resolution imagery. In intensive SL analysis, high spatial resolution (5–10 m pixel size) images enable the detection of tree-fall gaps, log landings, and logging roads. However, for SL where only individual trees are targeted, the application of very-high-resolution (VHR) remotely sensed datasets is viable for detecting and mapping disappearing tree crowns.
This study aimed to discover whether grey-level co-occurrence matrix (GLCM)- and multivariate local binary pattern (MLBP)-based rotation-invariant feature descriptors derived from VHR WorldView-3 (WV-3) imagery may be extended and used for canopy gap classification in a tropical sub-montane forest. The study applied a local binary pattern (LBP) model fused with a GLCM model, whereby the rotation-invariant LBP operator was used to obtain the LBP images of subsets of images extracted from a WV-3 scene covering the study area. Then, five GLCM measures of the LBP images were calculated to describe the image texture features. The LBP texture measure was extended to colour images by applying a multivariate texture model, the multivariate local binary pattern (MLBP) operator. Due to the robustness of the model, the classes were found to be separable. For the LBP model, a uniformity measure was applied to show the uniformity of the neighbourhood's pixel values; according to Ojala et al. [34], >90% of the patterns in a textured image are uniform.
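The calculation of GLCM measures from an LBP image can be sketched as below. This is an illustrative Python example under stated assumptions, not the study's implementation: it uses a symmetric, normalised GLCM for a single pixel offset and the common Haralick-style definitions of homogeneity (HOM), contrast (CON), entropy (ENT), angular second moment (ASM), and correlation (COR). The function names, the offset, and the toy image are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalised GLCM of an integer image for offset (dx, dy).

    Assumption: image is a list of rows with values in range(levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1  # so that the matrix is symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_measures(p):
    """Return HOM, CON, ENT, ASM, and COR of a normalised symmetric GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)  # row and column means coincide
    var = sum((i - mean) ** 2 * v for i, _, v in cells)
    cov = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "HOM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "CON": sum((i - j) ** 2 * v for i, j, v in cells),
        "ENT": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "ASM": sum(v * v for _, _, v in cells),
        # Correlation is undefined for a constant image; 1.0 is used here.
        "COR": cov / var if var > 0 else 1.0,
    }


# Hypothetical two-level checkerboard: every horizontal pair is (0, 1) or (1, 0).
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
features = glcm_measures(glcm(checkerboard, levels=2))
```

For a rotation-invariant uniform LBP image with P = 8, the codes range from 0 to 9, so levels=10 would be the natural choice when applying this sketch to LBP images.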
A circular neighbourhood set comprising 8 neighbouring pixels (P = 8) at a radius of 1 (R = 1) was used. Large P and R values are appropriate for describing large-scale textures, and small values for fine-scale textures [38]. The circular symmetrical neighbour set approach is more robust and delivers more accurate results [51]. According to Clausi [62], different combinations of values of P and R in neighbourhood sets might offer meaningful texture descriptions. Larger window sizes enable classifiers to extract rich textural information around a pixel, which could improve accuracy; however, a larger window size might reduce sensitivity to class edges and eventually smooth over the image [62].
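A minimal Python sketch of a rotation-invariant uniform LBP operator with P = 8 and R = 1 is given below for illustration. It assumes the standard formulation in which a neighbour contributes 1 when its value is greater than or equal to the centre value, off-grid neighbours are bilinearly interpolated, and patterns with more than two 0/1 transitions receive the code P + 1. The function names are hypothetical, and border pixels are simply skipped.

```python
import math


def _bilinear(image, y, x):
    """Bilinearly interpolate image (a list of rows) at fractional (y, x)."""
    rows, cols = len(image), len(image[0])
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, rows - 1), min(x0 + 1, cols - 1)
    fy, fx = y - y0, x - x0
    top = image[y0][x0] + fx * (image[y0][x1] - image[y0][x0])
    bottom = image[y1][x0] + fx * (image[y1][x1] - image[y1][x0])
    return top + fy * (bottom - top)


def lbp_riu2(image, P=8, R=1.0):
    """Rotation-invariant uniform LBP codes for the interior pixels.

    Uniform patterns (at most two 0/1 transitions around the circle) are
    coded by their number of 1-bits (0..P); all other patterns get P + 1.
    """
    rows, cols = len(image), len(image[0])
    margin = int(math.ceil(R))
    # Offsets are rounded so axis-aligned neighbours fall exactly on pixels.
    offsets = [(round(-R * math.sin(2 * math.pi * p / P), 10),
                round(R * math.cos(2 * math.pi * p / P), 10))
               for p in range(P)]
    codes = []
    for r in range(margin, rows - margin):
        row_codes = []
        for c in range(margin, cols - margin):
            centre = image[r][c]
            bits = [1 if _bilinear(image, r + dy, c + dx) >= centre else 0
                    for dy, dx in offsets]
            transitions = sum(bits[p] != bits[(p + 1) % P] for p in range(P))
            row_codes.append(sum(bits) if transitions <= 2 else P + 1)
        codes.append(row_codes)
    return codes


# Hypothetical 3 x 3 patches: an edge, and the same edge rotated by 90 degrees.
edge = [[9, 9, 9], [0, 5, 0], [0, 0, 0]]
edge_rotated = [[0, 0, 9], [0, 5, 9], [0, 0, 9]]
print(lbp_riu2(edge), lbp_riu2(edge_rotated))  # [[3]] [[3]]
```

Both patches receive the same code because the code depends only on the number of neighbours at or above the centre value, not on where the run of such neighbours starts, which is what makes the descriptor rotation invariant.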
Previously, very-high-resolution (VHR) earth observation (EO) data have been used to detect SL in tropical forests. For example, Asner et al. [63] used canopy height models (CHMs) from a single LiDAR (light detection and ranging) data acquisition, while Andersen et al. [5] used simple differencing of CHMs to successfully detect disappearing tree crowns. Ellis et al. [18] and Rex et al. [21] used single-date and bi-temporal LiDAR data, respectively, to estimate aboveground biomass (AGB) in selectively logged forests. Dalagnol et al. [11] combined airborne LiDAR and VHR satellite data to quantitatively assess and validate canopy gaps due to tree loss, reporting an average precision of 64%. Baldauf and Köhl [64] applied automated mapping using time-series approaches to detect SL using calibrated SAR (synthetic aperture radar) data. Before the introduction of high-resolution (HR) optical data, costly traditional aerial photography was used for mapping canopy gaps; technological advances have revitalised its use through unmanned aerial vehicles (UAVs). Spaias et al. [19] used UAV data acquired using a hyperspectral camera to detect canopy gaps in a tropical forest; where cloud-computing resources are lacking, the amount of spatial and spectral data acquired may make the data processing computationally demanding. Ota et al. [20] used bi-temporal digital aerial photographs (DAPs) to compute the change in AGB due to logging. Kamarulzaman et al. [22] used UAV data to detect forest canopy gaps from SL; the support vector machine (SVM) and artificial neural network (ANN) classifiers achieved a higher overall accuracy (85%) than conventional classifiers. However, LiDAR and UAV data cover relatively small spatial extents.
The accuracy values reported here show that good classification results were obtained. GLCM features perform better with ≤10 classes, in which case they can outperform more powerful methods [51]. This research used just five co-occurrence descriptors, although Di Ruberto et al. [65] state that a higher number of co-occurrence features could obtain excellent results. The respective univariate LBP measures provided classification accuracies in the range of 67–75%. The multivariate LBP models gave higher classification accuracies (Table 5). Although the MLBP/MHOM recorded the lowest out-of-bag (OOB) and cross-validation (CV) errors, the MLBP/MCON with a CV error of 0.112 (gamma and cost values of 0.1 and 100, respectively) for the SVM model outperformed all the other models, recording the highest classification accuracy of 95.1% for image block E. The lowest classification accuracy (80.0%) was recorded by the RF's MLBP/MASM for image block F. The MLBP/MASM performed poorly in all RF and SVM models. The overall accuracies (OAs) reported could have been even higher were it not for confusion between the classes.
The fusion of textural and spectral information from three WV-3 bands performed better than the basic models. Future research will aim to modify the model to include more than three bands, and even to extend it to hyperspectral data. This will make it possible to explore the contribution of separate colour bands in texture analysis and to investigate novel combinations of colour and texture for classification. Adding more bands would increase complexity and computational demands, and might not significantly increase the amount of textural information [27]. Nevertheless, the expected gain in classification accuracy may justify the additional cost.
Persistent cloud cover in tropical forests presents a major challenge when mapping canopy gaps using optical RS; this is made worse by the absence of reliable cloud and cloud shadow detection algorithms. This greatly limited the size of the study area.

Conclusions
Accurate mapping of canopy gaps is of great importance to forest managers because it guides on-the-ground conservation and restoration projects and management applications. The results reported in this study show that canopy gaps from illegal logging of Ocotea usambarensis were mapped with high accuracy. The study used features that integrate both the texture and spectral distributions of a very-high-resolution WorldView-3 dataset, an approach that considers the cross-band relations. In order to increase classification accuracy, the G-statistic and Euclidean distance measures were used to discriminate between the samples. The framework used in this study could allow forest managers to develop improved methods of mapping canopy gaps at larger spatial extents, using remotely sensed data and very little or no fieldwork; currently, it can only be applied as a guide and cannot be generalised. Future research will aim to find a technique for combining more than three bands of different kinds of remote sensing data.