Identiﬁcation of Silvicultural Practices in Mediterranean Forests Integrating Landsat Time Series and a Single Coverage of ALS Data

: Understanding forest dynamics at the stand level is crucial for sustainable management. Landsat time series have been shown to be effective for identiﬁcation of drastic changes, such as natural disturbances or clear-cuts, but detecting subtle changes requires further research. Time series of six Landsat-derived vegetation indexes (VIs) were analyzed with the BFAST (Breaks for Additive Season and Trend) algorithm aiming to characterize the changes resulting from harvesting practices of different intensities (clear-cutting, cutting with seed-trees, and thinning) in a Mediterranean forest area of Spain. To assess the contribution of airborne laser scanner (ALS) data and the potential implications of it being after or before the detected changes, two scenarios were deﬁned (based on the year in which ALS data were acquired (2010), and thereby detecting changes from 2005 to 2010 (before ALS data) and from 2011 to 2016 (after ALS data). Pixels identiﬁed as change by BFAST were attributed with change in VI intensity and ALS-derived statistics (99th height percentile and forest canopy cover) for classiﬁcation with random forests, and derivation of change maps. Fusion techniques were applied to leverage the potential of each individual VI change map and to reduce mapping errors. The Tasseled Cap Brightness (TCB) and Normalized Burn Ratio (NBR) indexes provided the most accurate results, the latter being more precise for thinning detection. Our results demonstrate the suitability of Landsat time series and ALS data to characterize forest stand changes caused by harvesting practices of different intensity, with improved accuracy when ALS data is acquired after the change occurs. Clear-cuttings were more readily detectable compared to cutting with seed-trees and thinning, detection of which required fusion approaches. This methodology could be implemented to produce annual cartography of harvesting practices, enabling more accurate statistics and spatially explicit identiﬁcation of forest operations.


Introduction
Sustainable forest management is needed for simultaneous production of socioeconomic benefits and reduction of the climate change effects, minimizing deforestation and forest degradation, protecting soils, and preserving biological diversity and water resources [1]. Achieving sustainable management requires knowledge of forest disturbance and overall dynamics, as this information aids in understanding the current state of forests and their response to changes [2]. Since forests are in continuous change, forestry experts have joined efforts to develop reliable and timely systems for monitoring change across done regarding the potential of integrating different VIs to improve the characterization of abrupt and subtle changes driven by different forest management practices [35].
Many studies aimed at detecting changes driven by forest management practices have been conducted in boreal forests [16,18,19,36,37] or tropical forests [22,27,38]. In Mediterranean environments, there are also some examples of clear-cutting detection [15,39,40], but these experiences are isolated examples compared to the large number of studies conducted in other environments. The research deficit in Mediterranean forests is in part due to a greater difficulty in detecting silvicultural interventions motivated by the smaller extent of the changes [1,41]. The smaller the changes the less accuracy in their detection is achieved [17]. Reference [16] reported that for forest management practices larger than 5 hectares, accuracies of 93% were achieved but for those smaller than 2 hectares the accuracy dropped to 66%. As illustrative examples, reference [42] analyzed the characteristics of the Canadian forest harvesting, reporting that by the late 1990s clear-cuts were generally larger than 50 ha across most of Canada, whereas [17] claimed that clear-cut areas averaged about 3 ha in a forest area in France, and [39] reported that most clear-cuts in Italy are in the range of 1-3 ha. In light of these differences, the changes occurring in Mediterranean forests can be consistently different from those in boreal forests, resulting in spectral trajectories with different patterns that could hinder the detection of changes.
The goal of this study was to test the capacity of the Landsat time series and BFAST algorithm to detect and characterize a range of change types over a Mediterranean forest in northern Spain. Thus, the following specific objectives were addressed: (1) To assess the performance of BFAST and a range of VIs to identify areas subject to changes related to silvicultural practices; (2) To characterize the detected changes by identifying harvesting practices of varying intensity, from clear cuts to cutting with seed trees and thinning; and (3) To test whether the inclusion of ALS metrics evaluated before or after the detected change provides greater accuracies in the classification of harvesting practices.

Study Area and Data
Our study focused in the Urbión Model Forest area, the largest continuous forested area (177,394 ha) in the Iberian Peninsula [43], which is located in Burgos and Soria provinces (northern Spain) ( Figure 1). The Urbión Model Forest is situated in a mountainous area between 900 to 2000 m above sea level. Despite the cold and long winters associated with the altitude and continental character of the region, the climate is mainly characterized by warm and dry summers typical of the Mediterranean climate [44]. The annual average temperature is 9.3 • C and annual precipitation ranges from 546 to 631 mm.
The region hosts important timber industries and thereby it is an area with intense forest management, where changes due to sanitary issues are minimal. Pinus pinaster Ait. and Pinus sylvestris L. are the dominant species, coexisting with other species such as Fagus sylvatica L., Quercus ilex L., Quercus faginea Lam. and Pinus nigra J.F. Arnold [43]. In addition, this region encompasses some protected areas, e.g., Cañón del Río Lobos Natural Park, where Juniperus thurifera L. (Iberian Juniper) can be found.

ALS Data
Airborne laser scanning (ALS) data covering the entire region of interest were acquired during the leaf-on 2010 campaign by the Spanish National Programme of Aerial Orthophotography [45]. The acquisitions have a minimum density of 0.5 points per m 2 and vertical RMSE < 0.4 m. The ALS dataset was processed with FUSION software [46] to generate a 2-m digital elevation model, enabling the estimation of height above ground for each vegetation point. Points below 2 m were not considered in calculating two forest structural metrics: the 99th height percentile (p99) and the forest canopy cover (fcc).

ALS Data
Airborne laser scanning (ALS) data covering the entire region of interest were acquired during the leaf-on 2010 campaign by the Spanish National Programme of Aerial Orthophotography [45]. The acquisitions have a minimum density of 0.5 points per m 2 and vertical RMSE < 0.4 m. The ALS dataset was processed with FUSION software [46] to generate a 2-m digital elevation model, enabling the estimation of height above ground for each vegetation point. Points below 2 m were not considered in calculating two forest structural metrics: the 99th height percentile (p99) and the forest canopy cover (fcc).

Landsat Time Series
Landsat data (path/row 201/031) covering the study area were downloaded from the United States Geological Survey archive. A total of 472 Landsat Collection 1 Level 2 surface reflectance images acquired during the period 1984 to 2016, with cloud cover less than 80%, were analyzed. Fully compatible time series data were used to facilitate reliable change detection. Pixels with clouds and cloud shadows were vetted with Fmask [47], and an additional buffer was used around the identified cloud and cloud shadows to remove edge contamination. Six Vegetation Indexes (VIs) ( Table 1) were computed for each of the cloud-cleaned Landsat images: Normalized Difference Vegetation Index (NDVI), Normalized Difference Moisture Index (NDMI), Normalized Burn Ratio (NBR), Tasselled Cap Brightness (TCB), Tasselled Cap Greenness (TCG) and Tasselled Cap Wetness (TCW).

Landsat Time Series
Landsat data (path/row 201/031) covering the study area were downloaded from the United States Geological Survey archive. A total of 472 Landsat Collection 1 Level 2 surface reflectance images acquired during the period 1984 to 2016, with cloud cover less than 80%, were analyzed. Fully compatible time series data were used to facilitate reliable change detection. Pixels with clouds and cloud shadows were vetted with Fmask [47], and an additional buffer was used around the identified cloud and cloud shadows to remove edge contamination. Six Vegetation Indexes (VIs) ( Table 1) were computed for each of the cloud-cleaned Landsat images: Normalized Difference Vegetation Index (NDVI), Normalized Difference Moisture Index (NDMI), Normalized Burn Ratio (NBR), Tasselled Cap Brightness (TCB), Tasselled Cap Greenness (TCG) and Tasselled Cap Wetness (TCW).

Detection of Harvesting Practices
The BFAST algorithm was applied over forest pixels for identification of change and no-change. Those areas identified as change were later classified as different harvesting practices thanks to a reference database manually delineated.

Forest Mask
A forest mask was created to ensure that the changes detected were in fact forest disturbances and no other land cover dynamics. Random forests models [54], implemented in the R package RandomForest [55], were used to identify pixels that had once been forest Remote Sens. 2021, 13, 3611 5 of 17 throughout the study period. Landsat images from 2005 and 2016 (with <80% cloud cover) were used to create annual composites for those two years. For each pixel, the observation with highest NDVI was selected as the composite value. Furthermore, some annual NDVI and NBR statistics were derived from each year's imagery. The optimal predictors for classification were identified with VSURF (variable selection using random forests) [56] aiming to improve the classification performance. The final classification model included NIR and SWIR of the 2005 and 2016 composites, the minimum and average NDVI and NBR of both years, and the green spectral band of 2005 as predictors.

BFAST Implementation
The BFAST algorithm was applied to identify changed and non-changed pixels. BFAST integrates an iterative decomposition of time series into trend, seasonal and remainder components, with methods for detecting changes within time series [26]. This methodology defines two periods within a time series: a historic period, considered as a stable time interval in which no abrupt changes have occurred, and a monitoring period, which will be analyzed for change detection. The historic period can be manually defined, based on the knowledge of the area, or automatically detected by the algorithm [29]. The data of the historic period were used to fit a model, which is extrapolated to the monitoring period to assess if the new observations conform or not with the stability of the model. To observe whether or not the extrapolated model remains stable in the monitoring period, the algorithm applies a measure of discrepancy called moving sums (MOSUMs) of the residuals [29]. A structural breakpoint is declared when the MOSUMs exceeds the 95% significance boundary. BFAST is applied at pixel level generating two outputs, the breakpoint timing (when the breakpoint is detected) and the change magnitude. The change magnitude is estimated deriving the difference between the median of the fitted model and the new observations during the monitoring period [29]. Positive values of change magnitude indicate an abrupt increase in activity, while negative values of change magnitude indicate an abrupt decrease in activity [17]. While the timing is only provided for the breakpoints, the change magnitude information is given in a spatially continuous way.
Landsat-based VIs time series were analyzed with BFAST Monitor method [29] using the bfastSpatial package in R (https://github.com/loicdtx/bfastSpatial, accessed on 30 November 2020) and maps of change were generated. We explored two different scenarios to assess the potential implications of having ALS data after or before the detected changes. These scenarios consisted of the definition of two monitoring periods of the same length (5 years) but with a different temporal location of the ALS data (acquired 2010) relative to the monitoring period. (Figure 2). In scenario A the monitoring period is 2005-2010; in scenario B the monitoring period is 2011-2016. With this design, in scenario A the changes detected are those from 2005 to 2010, and in scenario B the changes detected are those from 2011 to 2016. The historic period was determined automatically using the ROC approach implemented in BFAST [29]. A "harmonic" seasonal single-order model was fitted for each historic period (1984-2004 for scenario A, and 1984-2010 for scenario B). The trend component was omitted as it tends to generate false breakpoints and to inflate change in magnitude values [27].

Selection of Training Points
All pixels included in the forest mask were classified into no change or change related to harvesting practices: clear-cutting, cutting with seed-trees and thinning ( Figure 3). These three harvesting practices are characterized by a different proportion of cut down trees: clear-cutting involves the removal of all the trees in a stand, cutting with seed-trees implies preserving a small number of dispersed trees needed for natural regeneration, and thinning is an intermediate harvest in which a variable proportion of trees is removed to reduce the stand density and to enhance the quality and growth of the remaining trees. The thinning class included a range of cutting intensities in the area, from light (15%) to heavy (60%) thinning. We did not consider potential subtle changes associated with sanitary issues Remote Sens. 2021, 13, 3611 6 of 17 during the studied period as they were not relevant according to our knowledge of the study area. Reference polygons of variable size (0.3-9.4 ha) were manually digitized based on contemporaneous high-resolution imagery from Google Earth and PNOA (with imagery available from 2005, 2007, 2009, 2011, 2014 and 2017). From each polygon sample points were randomly selected, making up a 400 training samples (100 points per class). The reason behind the decision to create 100 samples per class was to ensure class balance avoiding overestimation of the most representative classes ( [57]) and to ensure a large number to accommodate the data dimensions ( [58]).

Selection of Training Points
All pixels included in the forest mask were classified into no change or change related to harvesting practices: clear-cutting, cutting with seed-trees and thinning ( Figure 3). These three harvesting practices are characterized by a different proportion of cut down trees: clear-cutting involves the removal of all the trees in a stand, cutting with seed-trees implies preserving a small number of dispersed trees needed for natural regeneration, and thinning is an intermediate harvest in which a variable proportion of trees is removed to reduce the stand density and to enhance the quality and growth of the remaining trees. The thinning class included a range of cutting intensities in the area, from light (15%) to heavy (60%) thinning. We did not consider potential subtle changes associated with sanitary issues during the studied period as they were not relevant according to our knowledge of the study area. Reference polygons of variable size (0.3-9.4 ha) were manually digitized based on contemporaneous high-resolution imagery from Google Earth and PNOA (with imagery available from 2005, 2007, 2009, 2011, 2014 and 2017). From each polygon sample points were randomly selected, making up a 400 training samples (100 points per class). The reason behind the decision to create 100 samples per class was to ensure class balance avoiding overestimation of the most representative classes ( [57]) and to ensure a large number to accommodate the data dimensions ( [58]).

Selection of Training Points
All pixels included in the forest mask were classified into no change or change related to harvesting practices: clear-cutting, cutting with seed-trees and thinning ( Figure 3). These three harvesting practices are characterized by a different proportion of cut down trees: clear-cutting involves the removal of all the trees in a stand, cutting with seed-trees implies preserving a small number of dispersed trees needed for natural regeneration, and thinning is an intermediate harvest in which a variable proportion of trees is removed to reduce the stand density and to enhance the quality and growth of the remaining trees. The thinning class included a range of cutting intensities in the area, from light (15%) to heavy (60%) thinning. We did not consider potential subtle changes associated with sanitary issues during the studied period as they were not relevant according to our knowledge of the study area. Reference polygons of variable size (0.3-9.4 ha) were manually digitized based on contemporaneous high-resolution imagery from Google Earth and PNOA (with imagery available from 2005, 2007, 2009, 2011, 2014 and 2017). From each polygon sample points were randomly selected, making up a 400 training samples (100 points per class). The reason behind the decision to create 100 samples per class was to ensure class balance avoiding overestimation of the most representative classes ( [57]) and to ensure a large number to accommodate the data dimensions ( [58]).

Random Forests Models
Random Forest (RF) classification models were calibrated with the training sample of 400 points described in Section 2.2.3. An individual RF classification model was fitted for each VI analyzed and each scenario (i.e., six RF models per scenario). ALS metrics (fcc and p99) calculated in Section 2.1.1 as well as the change magnitude derived from BFAST (Section 2.2.2) were used as predictor variables for the calibration of the RF models ( Figure 4). In addition to the twelve change maps generated from this classification process (six change maps per scenario), an overall classification model (hereafter called ALL) per scenario was calibrated by employing the ALS metrics and the six VI pools of change magnitudes. Therefore, finally there were a total of 14 change maps. Because RF classifiers consist of a combination of decision trees, where each tree contributes with a prediction and the final class is the most voted by all the decision trees [54], a reliability measure was defined using the percentage of votes (see Section 2.4).
The BFAST output breakpoint timing was not included as a predictor variable for two reasons: (1) it does just marginally improve the accuracy of the classified maps [59] and (2) its inclusion could incur large omission errors due to the existence of changes not declared as structural breakpoints. There could be subtle changes (as thinning operations) that cause a slight decrease not sufficient to surpass the algorithm threshold that defines a structural breakpoint [60].

Validation
The validation process was designed to identify the overall accuracy of the change detection process, and the confusion among change classes in each scenario, via confusion matrices built at the pixel level. Each change map accuracy was individually assessed with independent validation datasets independent from the data used to develop the classification, avoiding any spatial overlapping between the training and validation datasets. The validation dataset consisted of 496 samples for scenario A and 515 for scenario B, allocated following a probabilistic stratified design. The four classes of change (no-change, clear-cutting, cutting with seed-trees and thinning) served as strata, resulting in 348 random points for no-change areas and 148 within change areas for scenario A, and 370 points for no-change and 145 within change areas for the scenario B. High-resolution imagery from Google Earth and PNOA were used to visualize the validation datasets, which were labeled with the type of change (without change year attribution). The BFAST output breakpoint timing was not included as a predictor variable for two reasons: (1) it does just marginally improve the accuracy of the classified maps [59] and (2) its inclusion could incur large omission errors due to the existence of changes not declared as structural breakpoints. There could be subtle changes (as thinning operations) that cause a slight decrease not sufficient to surpass the algorithm threshold that defines a structural breakpoint [60].

Validation
The validation process was designed to identify the overall accuracy of the change detection process, and the confusion among change classes in each scenario, via confusion matrices built at the pixel level. Each change map accuracy was individually assessed with independent validation datasets independent from the data used to develop the classification, avoiding any spatial overlapping between the training and validation datasets. The validation dataset consisted of 496 samples for scenario A and 515 for scenario B, allocated following a probabilistic stratified design. The four classes of change (no-change, clearcutting, cutting with seed-trees and thinning) served as strata, resulting in 348 random points for no-change areas and 148 within change areas for scenario A, and 370 points for no-change and 145 within change areas for the scenario B. High-resolution imagery from Google Earth and PNOA were used to visualize the validation datasets, which were labeled with the type of change (without change year attribution).

Fusion Maps
To leverage the potential of each individual VI change map for identification of harvesting practices and to reduce mapping errors [21,28], we created new maps of change by applying three different fusion approaches ( Figure 5). Specific rules establishing rankings of frequency and the reliability measure (i.e., percentage of votes in the RF, see Section 2.3.1) of the class assigned in each VI change map were implemented in R code to select fusion classes. When there was a tie in the ranking, the VI change class predicted with the greatest accuracy was preferred. The three fusion techniques implemented are described as follows, where F MAX and F SUM use also the percentage of votes as an aggregation criteria, as in [61]: (iii) F SUM. A fusion map was created by selecting the overall most voted class, i.e., summing reliability measures of the all VI change maps. For example, in Figure 5 clear-cutting is selected, as the total reliability measure of its three selecting VI (60%) exceeds the reliability measure of the thinning class (50%).
aggregation criteria, as in [61]: (i) FUSION. A fusion map was created by selecting the most frequent class amongst the six VIs change maps. For example, in Figure 5 clear-cutting is selected, as it happens in 3 out of 6 VI maps; (ii) F MAX. A fusion map was created by selecting the class with the greatest reliability measure in any of the VI change maps. For example, in Figure 5 thinning is chosen, since the reliability measure of TCG (50%) is the greatest; and (iii) F SUM. A fusion map was created by selecting the overall most voted class, i.e., summing reliability measures of the all VI change maps. For example, in Figure 5 clear-cutting is selected, as the total reliability measure of its three selecting VI (60%) exceeds the reliability measure of the thinning class (50%).

VIs Overall Accuracy Performance
Classifications into the categories no change, clear-cutting, cutting with seed-trees, and thinning, yielded overall accuracies ranging from 70% to 85% ( Figure 6). Accuracies were greater for scenario A among the individual VIs tested. TCB and NBR stood out from the rest of the VIs, with overall accuracies over 85%. TCB achieved its best performance (85.69% versus 74.32%) in scenario A (with ALS metrics at the end of the BFAST monitoring period), while NBR improved from 80.24% in scenario A to 85.02% in scenario B (with ALS metrics at the beginning of the BFAST monitoring period). Among the VIs tested, the NDMI and NDVI performed well in both scenarios, showing greater accuracy values for scenario A. The least accurate VI was the TCG with overall accuracy values around 70% and the TCW performed particularly poorly in scenario A (70.56%).

VIs Overall Accuracy Performance
Classifications into the categories no change, clear-cutting, cutting with seed-trees, and thinning, yielded overall accuracies ranging from 70% to 85% ( Figure 6). Accuracies were greater for scenario A among the individual VIs tested. TCB and NBR stood out from the rest of the VIs, with overall accuracies over 85%. TCB achieved its best performance (85.69% versus 74.32%) in scenario A (with ALS metrics at the end of the BFAST monitoring period), while NBR improved from 80.24% in scenario A to 85.02% in scenario B (with ALS metrics at the beginning of the BFAST monitoring period). Among the VIs tested, the NDMI and NDVI performed well in both scenarios, showing greater accuracy values for scenario A. The least accurate VI was the TCG with overall accuracy values around 70% and the TCW performed particularly poorly in scenario A (70.56%).

Performance in Classification of Forestry Practices
Omission and commission errors differed substantially across harvesting practices with a general increase in errors in scenario B (Figure 7). TCB performed better than the

Performance in Classification of Forestry Practices
Omission and commission errors differed substantially across harvesting practices with a general increase in errors in scenario B (Figure 7). TCB performed better than the other VIs in identifying the clear-cutting in scenario A, with 15% commission error and 8.11% omission error. However, in scenario B, NBR outperformed TCB, achieving 31.25% and 5.38% of commission and omission errors, respectively. TCB and NBR had similar performance in detecting thinning areas. TCB yielded better results in scenario A, with commission and omission errors of 35.65% and 22.35%, respectively, whereas NBR was the most accurate in scenario B, with commission and omission errors of 40.24% and 26.87%, respectively. As for the cutting with seed-trees class, the NBR was the most accurate VI in both scenarios, with commission and omission errors of 44.68% and 0% in scenario A and 55.56% and 13.04% in scenario B. Figure 6. Overall accuracy (OA) results (%) for the detection of different harvesting practices using six VIs (NDVI, NDMI, NBR, TCB, TCG and TCW); (left): scenario A, when ALS data were acquired at the end of the monitoring period; (right): scenario B, when ALS data were acquired at the beginning of the monitoring period.

Performance in Classification of Forestry Practices
Omission and commission errors differed substantially across harvesting practices with a general increase in errors in scenario B (Figure 7). TCB performed better than the other VIs in identifying the clear-cutting in scenario A, with 15% commission error and 8.11% omission error. However, in scenario B, NBR outperformed TCB, achieving 31.25% and 5.38% of commission and omission errors, respectively. TCB and NBR had similar performance in detecting thinning areas. TCB yielded better results in scenario A, with commission and omission errors of 35.65% and 22.35%, respectively, whereas NBR was the most accurate in scenario B, with commission and omission errors of 40.24% and 26.87%, respectively. As for the cutting with seed-trees class, the NBR was the most accurate VI in both scenarios, with commission and omission errors of 44.68% and 0% in scenario A and 55.56% and 13.04% in scenario B.  The accuracy assessment revealed that the no-change class had larger omission errors (mostly caused by confusion between non-change and thinning) than commission errors, whereas the other classes were more affected by commission error (Figure 7). The nochange class showed commission and omission errors of 4.75% and 13.51%, respectively, in scenario A, and of 5.37% and 9.46% in scenario B. Clear-cutting was the most accurately classified of the three harvesting practices; in this case, commission errors originated from its confusion with no-change but specially with cutting with seed-trees. Clear-cutting and cutting with seed-trees are expected to produce a similar decrease in the VI magnitude which leads to its misclassification. However, when the cutting-seed tree occurred prior to the ALS acquisition, ALS data detect the trees left for natural regeneration, and contribute to improve their distinction (Figure 8).
classified of the three harvesting practices; in this case, commission errors originated from its confusion with no-change but specially with cutting with seed-trees. Clear-cutting and cutting with seed-trees are expected to produce a similar decrease in the VI magnitude which leads to its misclassification. However, when the cutting-seed tree occurred prior to the ALS acquisition, ALS data detect the trees left for natural regeneration, and contribute to improve their distinction (Figure 8).

Fusion Maps
The overall accuracy of the FUSION and FSUM maps were greater than the overall accuracy of FMAX and all individual VI maps. FUSION had an overall accuracy of 88.51% in scenario A and 87.16% in scenario B. Among the three harvesting practices, thinning and cutting with seed-trees benefited more from the fusion approach than clear-cutting (Figure 9). Commission errors of both thinning and cutting with seed-trees were smaller with fusion maps than with any of the VIs individually assessed. For instance, FUSION

Fusion Maps
The overall accuracy of the FUSION and FSUM maps were greater than the overall accuracy of FMAX and all individual VI maps. FUSION had an overall accuracy of 88.51% in scenario A and 87.16% in scenario B. Among the three harvesting practices, thinning and cutting with seed-trees benefited more from the fusion approach than clearcutting (Figure 9). Commission errors of both thinning and cutting with seed-trees were smaller with fusion maps than with any of the VIs individually assessed. For instance, FUSION decreased the commission error of thinning from 35.65% to 28.83% (Table 2) in scenario A (where TCB was the best performing individual VI) and from 40.24% to 29.17% in scenario B (where NBR was the best performing individual VI). The decrease of commission errors in the cutting with seed-trees class is only observed in scenario A (from 44.68% to 25.71%). None of the fusion approaches improved the detection of clear-cutting by individual VIs.
decreased the commission error of thinning from 35.65% to 28.83% (Table 2) in scenario A (where TCB was the best performing individual VI) and from 40.24% to 29.17% in scenario B (where NBR was the best performing individual VI). The decrease of commission errors in the cutting with seed-trees class is only observed in scenario A (from 44.68% to 25.71%). None of the fusion approaches improved the detection of clear-cutting by individual VIs.

Discussion
In this study, we applied methods for identification of forest changes in Mediterranean forests, and classification into three harvesting practices. The TCB and NBR were the most successful VIs among the six essayed [15,16,21,62]. Nevertheless, other VIs like the Normalized Difference Fraction Index (NDFI) and the TCA [63] have shown promising

Discussion
In this study, we applied methods for identification of forest changes in Mediterranean forests, and classification into three harvesting practices. The TCB and NBR were the most successful VIs among the six essayed [15,16,21,62]. Nevertheless, other VIs like the Normalized Difference Fraction Index (NDFI) and the TCA [63] have shown promising results for detection of changes of low intensity [27,28,64] in other areas and their performance might be tested in our Mediterranean study site.
Clear-cutting was, as expected, the type of change best characterized, with omission and commission errors similar to those achieved by other authors (8.11% and 15% for scenario A and 5.38% and 31.25% for scenario B). Reference [16] reported an omission error of 16.2% for clear-cutting detection in a boreal area in Canada, and clear-cutting was classified in Finland with commission errors ranging between 7-11.6% [18]. Clear-cutting in France was detected with 19% and 54% of omission and commission error, respectively [17], while in Italy, reference [40] achieved omission errors ranging between 16-55% and commission errors smaller than 15%. The similar spectral response caused by clear-cutting and cutting with-seed trees explained in part the commission errors obtained. However, cleared shrublands were also misclassified as clear-cutting, probably due to inaccuracies in the forest mask. It is important to work with an accurate forest mask, to get the best performance from the algorithm [59,60] especially in Mediterranean conditions [40]. Furthermore, the size of Landsat pixels (30 m) makes the derivation of forest masks challenging, particularly in heterogeneous or open forests [28]. The spatial resolution also has an impact on change detection accuracy, particularly for small clearcut patches [17]. Although a general increase in the harvest patch median size has been observed across Europe [41], Spain shows a median patch size similar to France or Italy, smaller than those observed in northern countries [41].
The low spectral change response that causes thinning operations is more likely to be classified as non-change [16,62], justifying some of the omission errors of this class. Our validation dataset was based on visual interpretation, which may have also impacted the accuracy results of the thinning class. Clear-cutting and intense thinning practices imply a lasting change easily interpreted, while thinning of low intensity might cause canopy openings that rapidly close [65]. Because high-resolution images were only available every two or three years, this temporal resolution might not be sufficient to discern this forestry practice; some of the commission errors could in fact be thinning operations correctly classified by Landsat and BFAST, but misinterpreted in the reference high-resolution images. Reference [66] stated the difficulty to detect non-stand replacing disturbances such as thinning when only annual images were at their disposal. Reference [19] reported that the low thinning intensity that characterized its reference data had a greater effect on the accuracy results as they are more difficult to identify. Besides, our classification scheme did not include subtle disturbances such as insect defoliation or stress and some of these instances could fall into the thinning class [33]. In any case, this approach performance and its accuracy assessment will be better if field points are available as reference [27].
As in [37,62], omission did not exceed commission in the no-change class, being consistent among the VIs analyzed, and proving that it is possible to accurately characterize the forest areas in which no harvesting practices were conducted. The interest of this identification is in connection with ALS-based forest estimates. Even though ALS data benefits have been well-documented [7,67], there is an important handicap related to their temporal resolution. ALS practical usefulness might be reduced in areas where changes happened after data acquisition [68]. Therefore, it is important to identify the unchanged areas in which ALS data continue to be operational, and to assess the use of other metrics to predict current forest attributes in the areas where harvesting practices were conducted [69,70].
The accuracy results obtained for cutting with seed-trees and thinning with individual VI were lower than the results reported in [16] and suggested the convenience to apply fusion techniques to increase accuracy. Among the fusion techniques applied, the three approaches achieved similar accuracy results, albeit slightly greater for the FUSION approach in which percentage of votes is not considered. Further research is recommended to assess the importance of this criterion when fusing methods are applied. Reference [71] applied the fusion techniques based on the percentage of votes to create a forest land cover map, and achieved higher accuracies, but they did not test our FUSION approach.
ALS-derived metrics complement the magnitude metric derived from BFAST by improving the identification of harvesting practices especially when ALS data are acquired after the detected change. The integration of ALS data and Landsat time series enables one to better differentiate those harvesting practices more likely to cause a similar spectral response. Thanks to the increased availability of multi-temporal, wall-to-wall ALS data, in some countries like Spain, an integration of data from different sensors may provide a suitable alternative for detecting harvesting practices more accurately. In the absence of ALS data acquired after the change, RADAR data may be a valuable source of information to discern these forest harvesting practices thanks to its capacity to estimate forest height [72].
Change intensity and ALS metrics were the only predictor variables considered to determine the agent of change. Further work is recommended to fit classification models with change persistence and post-disturbance regrowth [16,73,74] as predictor variables, aiming to better discriminate the forest harvesting practices. In this sense, the potential of using bitemporal ALS data for forest harvesting classification has also been demonstrated [3]. The Spanish second complete ALS coverage is expected to be ready by the end of 2021; these data could also be incorporated for classification purposes. Finally, given the benefits derived from the fusion approaches, different VI synergies should be explored to assess the best VI combination to create more accurate change maps [28].
Our results demonstrate that intense changes such as clear-cuts can be mapped altogether with the same accuracy regardless of the ALS data acquisition date (Figure 7). Hence, Landsat time series can be used solely for drastic change detection when distinguishing between clear-cutting and cutting with seed-trees is not required. In this regard, since the spectral response caused by these changes lasts several years [66], the use of inter-annual time series could be an alternative [75]. Nevertheless, caution should be exercised when considering inter-annual time series since the spectral signal recovery may vary depending on the forest conditions. Clear cuts in Mediterranean forests have shown faster recovery times than clear cuts in boreal forests [15]. Unlike drastic change detection, subtle changes caused by thinning are better characterized with intra-annual time series [18]; thus, time-series algorithms with one image per year frequency are not adequate for thinning detection [31]. Dense time series integrating images from different sensors [76] pose an opportunity to increase the accuracy of the detected changes [23]. Additionally, the BFAST monitoring algorithm has been recently implemented in the Google Earth Engine, supporting the replication of the methodology over large areas and alleviating the users from downloading and processing bulky files [77].
The results obtained in this study confirmed the suitability of integrating one ALS coverage, Landsat time series and BFAST to detect a range of harvesting practices in a Mediterranean study region. Since information about harvesting is important for carbon cycling reports [78], the methodology developed in this work could be implemented to produce annual cartography of harvesting practices enabling more accurate statistics and spatially explicit identification of these operations. Complete and updated national scale cartography of harvesting practices is currently missing, as it is not produced for the study area nor at a national scale [15,79]. Besides, the results obtained can be used in the future to develop and adapt forestry management policies to ensure sustainable management of exploited forest areas [80,81].

Conclusions
In this study, six Landsat-based VIs time series were analyzed with the BFAST algorithm, as a means for characterizing changes resulting from harvesting practices of different intensities in a Mediterranean forest area. Fusion approaches were assessed and ALS metrics were included as predictor variables to improve the change characterization. The results demonstrated the suitability of Landsat time series and ALS data to detect changes caused by harvesting practices of different intensity, but with greater accuracy when ALS data were acquired after the change occurred. TCB and NBR were the VIs with better performance, while TCG performed the worst and NBR performed particularly well for thinning classification. Clear-cuttings are more readily detectable compared to cutting with seed-trees and thinning, which requires fusion approaches to increase mapping accuracy. The results are relevant for countries that aim at monitoring their forest interventions and reporting harvest area statistics.