Using Space-Time Features to Improve Detection of Forest Disturbances from Landsat Time Series

Current research on forest change monitoring using medium spatial resolution Landsat satellite data aims for accurate and timely detection of forest disturbances. However, producing forest disturbance maps that have both high spatial and temporal accuracy is still challenging because of the trade-off between spatial and temporal accuracy. Timely detection of forest disturbance is often accompanied by many false detections, and existing approaches for reducing false detections either compromise the temporal accuracy or amplify the omission error for forest disturbances. Here, we propose to use a set of space-time features to reduce false detections. We first detect potential forest disturbances in the Landsat time series based on two consecutive negative anomalies, and subsequently use space-time features to confirm forest disturbances. A probability threshold is used to discriminate false detections from forest disturbances. We demonstrated this approach in the UNESCO Kafa Biosphere Reserve located in the southwest of Ethiopia by detecting forest disturbances between 2014 and 2016. Our results show that false detections are reduced significantly without compromising temporal accuracy. The user’s accuracy was at least 26% higher than the user’s accuracies obtained when using only temporal information (e.g., two consecutive negative anomalies) to confirm forest disturbances. We found the space-time features related to change in spatio-temporal variability, and spatio-temporal association with non-forest areas, to be the main predictors for forest disturbance. The magnitude of change and two consecutive negative anomalies, which are widely used to distinguish real changes from false detections, were not the main predictors for forest disturbance. Overall, our findings indicate that using a set of space-time features to confirm forest disturbances increases the capacity to reject many false detections, without compromising the temporal accuracy.


Introduction
Current research on forest change monitoring aims for accurate and timely detection of forest disturbances.Many change detection approaches have therefore been proposed in recent years to improve the detection of forest disturbances from medium spatial resolution satellite data (e.g., Landsat) [1][2][3][4][5][6][7][8].The proposed approaches have addressed many key challenges, including seasonality in dry forests [9][10][11].However, it is still challenging to produce forest disturbance maps which have both high spatial and temporal accuracy; there is always a trade-off [1,6,7,10].A shorter temporal detection delay is typically associated with high producer's accuracy and low user's accuracy [10].Low user's accuracy implies that timely detection of forest disturbances is accompanied by many false detections.False detections in satellite time series may occur because of insufficient deseasonalisation or because of the presence of sensors' artefacts and remnant atmospheric contaminations [1,12].Yet, existing approaches lack the capacity to reject false detections without compromising the temporal accuracy.In cases where high user's accuracy has been reported, the omission error for small-scale forest disturbances was high [8].In short, amplified false detections and high omission error for small-scale forest disturbances are two major challenges that currently hinder accurate and timely detection of forest disturbances.
Recent studies attempted to remedy the problem of false detections by using a threshold on the magnitude of change to discriminate real disturbances from false detections [9,13].The use of a threshold on the magnitude of change is based on the assumption that, when compared to real changes, false detections have a smaller magnitude of change.This assumption is problematic, however, because forest disturbances that involve partial removal of forest cover may also have a small magnitude of change [14].Small-scale and gradual forest disturbances, which are common especially in Africa and caused mainly by small-holder agriculture expansion, domestic firewood and charcoal extractions [15], also tend to have a small magnitude of change [13].Recent studies show that false detections are particularly common in areas where small-scale forest disturbances, caused by small-holder agriculture and human settlements expansion, are prevalent [16].Separating forest disturbances from false detections using a threshold on the magnitude of change in such areas may therefore lead to high omission error for small-scale forest disturbances.The difficulty in separating false detections and forest disturbances may explain why small-scale forest disturbances are still a major constraint to accurate mapping and quantification of forest loss [17].It is therefore important that new change detection approaches improve the detection of small-scale forest disturbances in order to achieve accurate mapping and quantification of forest loss.
Alternative to magnitude of change, some change detection approaches use consecutive anomalies to confirm forest disturbance.Essentially, forest disturbance is only confirmed if there are at least two or more consecutive negative anomalies in the time series [3,[6][7][8]].An observation is considered a negative anomaly if it is below a specified threshold [18].The approach of consecutive anomalies is based on the assumption that the presence of multiple consecutive negative anomalies in the time series may be a strong indicator of forest disturbance because the impact of forest disturbances are typically persistent through time [1].Consecutive negative anomalies are also considered more robust against noise in the time series than a threshold on the magnitude of change [1,3,7].When detecting forest disturbances in a timely manner using a consecutive negative anomaly approach, however, a threshold for defining an observation as an anomaly has to be less strict [6,10].Unfortunately, a less strict threshold amplifies false detections [6].In addition to consecutive anomalies, other approaches use temporal metrics derived from pixel-time series of Landsat spectral bands and indices to calculate the likelihood of forest loss [8].Forest loss is confirmed when there are two consecutive observations at the pixel showing a higher likelihood of forest loss [8].This approach is particularly novel because the likelihood of forest loss is calculated based on a set of spectral and temporal metrics.Overall, this forest change detection approach produced a forest disturbance map with high user's accuracy in Peru, but the omission error for small-scale forest disturbances was high [8].
Solving the problems of false detections and omission of small-scale forest disturbances remains challenging mainly because existing forest change detection approaches only rely on temporal and spectral information; they ignore spatial information.Yet, forest disturbances, by default, are spatio-temporal processes [19].Oftentimes, new disturbances, especially those induced by humans, have both a spatial and temporal association to old disturbances [19].This spatio-temporal association arises because new loggings, for example, are more likely to occur within the neighbourhood of previously logged areas or logging routes than in areas which are far from previous disturbances.Essentially, a set of "space-time features" can be extracted from satellite data cubes and they can subsequently be used as indicators for forest disturbance in order to discriminate false detections from real disturbances.To the best of our knowledge, this idea of using space-time features to discriminate false detections from real disturbances has not yet been investigated.
The objective of this paper was to investigate whether the detection of forest disturbances from Landsat time series can be improved by using space-time features.We propose a new approach that first detects potential forest disturbances in the pixel-time series based on two-consecutive negative anomalies, and we subsequently use space-time features to confirm forest disturbances.The forest disturbance maps generated using the space-time features approach were compared to maps generated using two-consecutive negative anomaly only to detect forest disturbances, and using two-consecutive negative anomalies and the magnitude of change to confirm forest disturbances.We hypothesised that the space-time features approach would produce forest disturbance maps which are spatially more accurate.Throughout this paper, forest disturbance is defined as forest loss which constitutes either full or partial removal of forest cover, which is visible from Landsat multi-spectral data.

Study Area
Our study focused on detecting forest disturbances in the UNESCO Kafa Biosphere Reserve, situated in the southwest of Ethiopia (Figure 1), where a moist Afromontane broadleaf evergreen forest with moderate seasonality exists [13].The forest in this area is subject to small-scale and gradual disturbance processes, caused mainly by small-holder agriculture, human settlements expansion, industrial coffee plantations, and domestic firewood and charcoal extractions [20,21].Previous studies in the UNESCO Kafa Biosphere Reserve show that separating forest disturbances from false detections is difficult, especially those that involve partial removal of the forest cover [13,16,22].This study area is therefore ideal for evaluating the importance of using space-time features to achieve accurate and timely detection of forest disturbances.We defined forest disturbance as full or partial removal of forest cover, which is visible from Landsat multi-spectral images.
The objective of this paper was to investigate whether the detection of forest disturbances from Landsat time series can be improved by using space-time features.We propose a new approach that first detects potential forest disturbances in the pixel-time series based on two-consecutive negative anomalies, and we subsequently use space-time features to confirm forest disturbances.The forest disturbance maps generated using the space-time features approach were compared to maps generated using two-consecutive negative anomaly only to detect forest disturbances, and using two-consecutive negative anomalies and the magnitude of change to confirm forest disturbances.We hypothesised that the space-time features approach would produce forest disturbance maps which are spatially more accurate.Throughout this paper, forest disturbance is defined as forest loss which constitutes either full or partial removal of forest cover, which is visible from Landsat multi-spectral data.

Study Area
Our study focused on detecting forest disturbances in the UNESCO Kafa Biosphere Reserve, situated in the southwest of Ethiopia (Figure 1), where a moist Afromontane broadleaf evergreen forest with moderate seasonality exists [13].The forest in this area is subject to small-scale and gradual disturbance processes, caused mainly by small-holder agriculture, human settlements expansion, industrial coffee plantations, and domestic firewood and charcoal extractions [20,21].Previous studies in the UNESCO Kafa Biosphere Reserve show that separating forest disturbances from false detections is difficult, especially those that involve partial removal of the forest cover [13,16,22].This study area is therefore ideal for evaluating the importance of using space-time features to achieve accurate and timely detection of forest disturbances.We defined forest disturbance as full or partial removal of forest cover, which is visible from Landsat multi-spectral images.

Data and Methods
Figure 2 provides the overview of the data used, and the methods followed.To detect forest disturbances, we first pre-processed a total of 172 terrain corrected (L1T) multi-spectral images from Landsat-7 Enhanced Thematic Mapper (ETM+, 116 images) and Landsat-8 Operational Land Imager (OLI, 56 images) acquired between January 2010 and June 2016.From these images, we derived a normalised difference vegetation index (NDVI; [23,24], Section 3.1.1).Second, we produced a benchmark forest mask (Section 3.1.2),and used it to mask non-forest areas in the NDVI time series.Third, we normalised NDVI time series spatially to reduce seasonality (Section 3.1.3).Fourth, we flagged pixels with two consecutive negative anomalies in Landsat NDVI time series as potentially disturbed (Section 3.2).Fifth, we used three different approaches to confirm forest disturbance (Section 3.3).In the first approach (Section 3.3.1),forest disturbances were immediately confirmed once there were two consecutive negative anomalies in the pixel-time series (Two-consecutive anomaly approach).In the second approach (Section 3.3.2),we used two consecutive negative anomalies and the magnitude of change to calculate the probability of forest disturbance (Two consecutive anomalies + magnitude of change approach).In the third approach (Section 3.3.3),we used several space-time features from the local data cube of a flagged pixel to calculate the probability of forest disturbance (Space-time feature approach).Finally, for each approach, we estimated the spatial and temporal accuracy for forest disturbance maps (Section 3.4).

Data and Methods
Figure 2 provides the overview of the data used, and the methods followed.To detect forest disturbances, we first pre-processed a total of 172 terrain corrected (L1T) multi-spectral images from Landsat-7 Enhanced Thematic Mapper (ETM+, 116 images) and Landsat-8 Operational Land Imager (OLI, 56 images) acquired between January 2010 and June 2016.From these images, we derived a normalised difference vegetation index (NDVI; [23,24], Section 3.1.1).Second, we produced a benchmark forest mask (Section 3.1.2),and used it to mask non-forest areas in the NDVI time series.Third, we normalised NDVI time series spatially to reduce seasonality (Section 3.1.3).Fourth, we flagged pixels with two consecutive negative anomalies in Landsat NDVI time series as potentially disturbed (Section 3.2).Fifth, we used three different approaches to confirm forest disturbance (Section 3.3).In the first approach (Section 3.3.1),forest disturbances were immediately confirmed once there were two consecutive negative anomalies in the pixel-time series (Two-consecutive anomaly approach).In the second approach (Section 3.3.2),we used two consecutive negative anomalies and the magnitude of change to calculate the probability of forest disturbance (Two consecutive anomalies + magnitude of change approach).In the third approach (Section 3.3.3),we used several space-time features from the local data cube of a flagged pixel to calculate the probability of forest disturbance (Space-time feature approach).Finally, for each approach, we estimated the spatial and temporal accuracy for forest disturbance maps (Section 3.4).//landsat.usgs.gov/CDR_LSR.php).USGS use Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) algorithm [25] to generate Landsat-7/ETM+ surface reflectance products.Landsat 8 OLI surface reflectance products are generated using Landsat 8 OLI surface reflectance algorithm [26].Clouds and cloud shadows in Landsat-7/ETM+ and Landsat-8/OLI images were masked using the CFmask cloud-shadow mask product [27].CFmask products are distributed together with the Landsat SR CDR Products.Our reference period was 2010-2013, whereas the monitoring period was 2014-2016.

Benchmark Forest Mask
A benchmark forest mask is essential when monitoring forest cover disturbances from satellite time series in order to avoid confusing forest cover loss with other land cover dynamics (e.g., cropping cycles; [9,13]).Similar to [13], we used a supervised maximum likelihood classifier to generate a forest mask from Landsat-8 OLI multi-spectral image time series for 2013.In UNESCO Kafa Biosphere Reserve, there are complex patchy mosaic transitional areas, such as cropland demarcated with planted trees [13].In such patchy mosaic transitional areas, vegetation dynamics, not related to forest disturbance (e.g., crop harvest), may occur, and can potentially be detected as forest disturbances, thus amplifying the commission error.To avoid detecting such vegetation dynamics as forest disturbance, we removed patches which were smaller than 0.54 ha from the forest mask.The final forest mask product contained 3,034,809 pixels (≈273,133 ha) that were classified as forest.

Spatial Normalisation to Reduce Seasonality in NDVI Time Series
The moist Afromontane broadleaf evergreen forest in the UNESCO Kafa Biosphere Reserve exhibits moderate seasonality [13].If not accounted for appropriately, seasonality may disguise forest disturbances in the time series.Here, we used spatial normalisation [9] to reduce seasonality in Landsat NDVI time series.It has been demonstrated that spatial normalisation significantly reduces seasonality [9,10] in satellite time series.Seasonality is reduced by dividing the NDVI value for each pixel with the 95th percentile (P 95 ) computed from the NDVI values of the pixels within the spatial neighbourhood of such pixel.The spatial window that defines the neighbourhood should be larger than the size of forest disturbance events one wants to detect, to ensure that there are forest pixels within the window to avoid normalising against a value from the disturbed area [6].The spatial window of 15 by 15 pixels (≈20 ha) we used in this study was deemed sufficient because forest disturbance patches in the UNESCO Kafa Biosphere Reserve tend to be small [13,20].

Detecting Negative Anomalies in Landsat NDVI Data Cubes
We detected negative anomalies in the local data cubes of spatially normalised Landsat NDVI time series using a method that detects forest disturbances as extreme events in Landsat data cubes [6].The local data cube is defined around each pixel in the image time series and its temporal and spatial extents are user-defined [6].We used a local data cube with a spatial extent of the 15 × 15 Landsat pixels.The temporal extent was from 2010 to 2016.The local data cube is split into a reference cube and a monitoring cube.The reference cube contains historical observations, whereas the monitoring cube contains newly acquired observations that need to be assessed for forest disturbance [6].Negative anomalies in the monitoring period were identified using a threshold computed from the observations available in the reference period of the local data cube.The threshold corresponds to a user-specified percentile (e.g., 5th percentile), and an observation was a negative anomaly if it was below the specified percentile [18].Here, we evaluated five percentile thresholds (1st, 2th, 3rd, 4th and 5th).A pixel was flagged as potentially disturbed if it had two consecutive negative anomalies in its time series.Three different approaches were used to confirm negative anomalies as forest disturbance (Section 3.3).Observations in the monitoring period were assessed sequentially rather than simultaneously to mimic a near real-time monitoring scenario.

Two-Consecutive Anomaly Approach
With this approach, a pixel was immediately confirmed as disturbed once two consecutive negative anomalies were present in the pixel-time series.This approach was proposed in our earlier work, and it has been demonstrated at two sites, one in a dry tropical forest in Bolivia and the other in a humid evergreen forest in Brazil [6].

Two Consecutive Anomalies + Magnitude of Change Approach
We used the two consecutive negative anomalies (C anomaly ) and the magnitude of change (M change ) (Table 1) to confirm forest disturbance.The aim here was to assess whether using magnitude of change together with two consecutive negative anomalies would improve discrimination of forest disturbances from false detections when compared to a scenario whereby forest disturbance is confirmed using two consecutive negative anomalies only.We used a random forest model (n = 1311, trees = 501; [28]) to calculate the probability of disturbance using two consecutive negative anomalies and magnitude of change as predictors of forest disturbance.We used a probability 0.5 as the threshold for accepting negative anomalies as forest disturbance.Random forest model was trained using a training data set acquired through visual interpretation of Landsat multi-spectral images.The training sample pixels were selected through probability sampling, using a forest disturbance map generated using the two-consecutive anomaly approach (Section 3.3.1).

Space-Time Feature Approach
We used all 17 space-time features (see Table 1) extracted from the local data cube of Landsat NDVI time series to calculate the probability of forest disturbance once two consecutive anomalies are signalled at a pixel.The probability of forest disturbance was calculated using random forest model (n = 1311, trees = 501; [28]).A pixel was declared disturbed if the probability of disturbance was equal to or greater than 0.5.The importance of each space-time feature for predicting forest disturbance was determined using the conditional variable importance approach [29] that accounts for correlation between variables, and accommodates variables of different data types.We used the same training data set as in Section 3.3.2 to train the random forest model.
Out of 17 features, one (SD rc ) is measuring spatio-temporal variability in the reference period, three (Po cum , Pr cum , SD Trend ) are measuring change in spatial variability over time, six (Po extremes , Po patch , Po step , Pr patch , Pr extremes , Pr step ) are measuring spatio-temporal association with pixels experiencing negative anomalies, and three (CB nf , N nf , P nf ) are measuring spatial association with non-forest areas.The remaining four features, i.e., V rc , Q thresh , M change , C anomaly , represent the number of observations in the local data cube over the reference period, the anomaly threshold for identifying negative anomalies, the magnitude of change and two consecutive negative anomalies, respectively.Spatial variability and spatio-temporal variability can be measured using various metrics (e.g., the range, standard deviation; [30][31][32]).Here, we used the standard deviation as a measure of spatial variability and spatio-temporal variability.Spatial variability was calculated at each time slice within the local data cube, whereas spatio-temporal variability was calculated by taking into account both the spatial and temporal information in the local data cube.Many space-time features can be extracted from data cubes of satellites image time series.In this study, we chose the space-time features that can be extracted easily in an automated manner from the local data cube to ensure that the change detection framework can be used for near real-time forest change monitoring.

Estimating the Spatial and Temporal Accuracy
We estimated the spatial and temporal accuracy for forest cover disturbance using 2343 sample pixels generated through stratified probability sampling [33].The sample pixels and the strata were generated using a forest disturbance map produced by Hamunyela et al. (in preparation).This forest disturbance map was generated using the change detection framework presented in this paper, but it Remote Sens. 2017, 9, 515 8 of 17 was based on the combination of Landsat and Sentinel-2 data.The area proportion for each stratum is shown in Table 2.The largest stratum (No change I) represents forest areas which were 60 m away from areas where disturbances were detected.Stratum No change II was a 60 m buffer around the areas where disturbances were detected.A buffer zone was created around disturbances in order to improve the estimation of the omission error because changes and misclassifications are expected at the edges of the forest.Stratum Change I represents areas where disturbances were detected between 2014 and 2015, while stratum Change II represents areas where disturbances were detected between 2015 and 2016.The area was classified as change if the probability of forest disturbance was greater than or equal to 0.5.The allocation of sample size to each stratum was based on the approach recommended by [34,35] that ensures a reliable estimation of overall accuracy, producer's and user's accuracy for classes with small area proportions.Based on this sample allocation, we calculated the area-adjusted overall accuracy, producer's accuracy and user's accuracy.In addition, we calculated the area bias (the difference between user's and producer's accuracy), and the temporal accuracy expressed as a median temporal detection delay in days.The temporal detection delay was defined as the number of days between the date of disturbance as per reference data and the acquisition date for the image in which a disturbance was confirmed.Similar to [3,9,11,13,36], reference data were collected from Landsat multi-spectral images through visual interpretation of images.This approach, however, has major limitations because forest disturbances which are smaller than the spatial resolution of Landsat images might not be visible, but might be detected.It is for this reason that we complemented the visual interpretation of Landsat images with very high resolution imagery available in Google Earth (https://www.google.com/earth/)and Bing Maps (http://www.bing.com/maps/),but images are not temporally dense like Landsat.

Assessing the Effect of Probability Threshold on Spatial Accuracy
In Sections 3.3.2and 3.3.3,we used a probability threshold of 0.5 to discriminate false detections from real changes.In some instances, the user may have a different monitoring objective.For example, a user may prefer a map with the lowest area bias.We assessed how the probability threshold affects the spatial accuracy for forest disturbance maps produced using the space-time feature approach and the two consecutive anomalies + magnitude of change approach.We chose the maps which were generated using the 5th percentile that achieved the highest producer's accuracy and the shortest temporal detection delay.More specifically, we assessed how overall accuracy, user's accuracy, producer's accuracy vary as a function of the probability of the threshold for accepting negative anomalies as forest disturbance.We increased the probability threshold from 0 to 0.95, at an interval of 0.05.

Results
The accuracies for the space-time feature approach and the two consecutive anomalies + magnitude of change approach are based on the forest disturbance maps generated using 0.5 as the probability threshold for accepting anomalies as forest disturbance.The effect of the probability threshold on the overall accuracy, user's accuracy and producer's accuracy for maps generated using the space-time features approach and the two consecutive anomalies + magnitude of change approach is shown in Section 4.2.In Section 4.3, the area estimates for disturbed forest are presented.The ranking of the space-time features based on their conditional variable importance are presented in Section 4.4.

Spatial and Temporal Accuracy
The space-time feature approach yielded the highest user's accuracy across all the percentile thresholds (>70%, Figure 3d).More specifically, for each percentile threshold, the user's accuracy achieved from the space-time feature approach was at least 26% higher than the user's accuracy achieved either from a two-consecutive anomaly approach or from the magnitude of change approach.This gain in the user's accuracy was particularly high (36.4%)when using the least strict percentile threshold (5th percentile).The difference in the user's accuracy between the two-consecutive anomaly approach and the two consecutive anomalies + magnitude of change approach was marginal across all the percentile thresholds.Across all percentile thresholds, the user's accuracy achieved from the two-consecutive anomaly approach and the two consecutive anomalies + magnitude of change approach was below 60%.Compared to the space-time feature approach, the two-consecutive anomaly approach and the two consecutive anomalies + magnitude of change approach achieved the higher producer's accuracy when using the least strict percentile thresholds (4th and 5th percentile, Figure 3c), but the space-time feature approach achieved the highest overall accuracy (Figure 3a), and the lowest area bias (Figure 3b).More specifically, the producer's accuracy for the space-time feature approach was about 9% lower than those of the two-consecutive anomaly approach and the two consecutive anomalies + magnitude of change approach when using the least strict percentile threshold.For all approaches, the least strict percentiles achieved the shortest temporal detection delay (32 days) and highest producer's accuracy (>79%).For the space-time feature approach, the differences in the producer's accuracy and median temporal detection delay between the most strict percentile threshold (5th percentile) and least strict percentile threshold (1st percentile) were 47.7% and 137 days, respectively.Figure 4 shows an example of a false detection, related to remnant clouds, which was accepted when using the two-consecutive anomaly approach and the magnitude of change approach, but rejected by the space-time feature approach.

The Effect of the Probability Threshold on Spatial Accuracy
For the space-time approach, the user's accuracy for map produced using the 5th percentile threshold increased by 69.8%, reaching 99.1% when increasing the probability threshold from 0 to 0.95.The producer's accuracy decreased by 75.8 %, reaching 15.4%.The crossover point for the user's and producer's accuracy was approximately at the probability threshold of 0.55 (Figure 5).At this probability threshold, the user's accuracy was 78.3% while the producer's accuracy was 76.8%.In contrast, the user's and producer's accuracy did not reach the crossover point when using the two consecutive anomalies + magnitude of change approach (Figure 5).For this approach, the lowest area bias (17.3%) was achieved at the probability threshold of 0.95, and it was in favour of the producer's accuracy.For the magnitude of change approach, the user's accuracy only increased by 18.3%, reaching 47.6% when increasing the probability threshold from 0 to 0.95; and the producer's accuracy decreased by 26.3%, reaching 64.3%.

Area Estimates for Disturbed Forest
Using the map produced from the space-time feature approach, the 5th percentile threshold and probability threshold of 0.55 (Figure 6), we estimated that about 17,915 ha of forest in the UNESCO Kafa Biosphere Reserve has been disturbed between 2014 and 2016.The disturbances involved either full or partial removal of forest canopy.Disturbances were mainly concentrated at the edges of the forest (Figure 6).

Space-Time Features Important for Predicting Forest Disturbance
Our results show that cumulative sums of residuals for spatial variability (Pocum, and Prcum), the anomaly threshold for identifying anomalous observations (Qthresh), the variability in the reference cube of a local data cube (SDrc) and the number of non-forest pixels within the local data cube in the reference period (CBnf) were the most important predictors for forest disturbance (Figure 7).These

Area Estimates for Disturbed Forest
Using the map produced from the space-time feature approach, the 5th percentile threshold and probability threshold of 0.55 (Figure 6), we estimated that about 17,915 ha of forest in the UNESCO Kafa Biosphere Reserve has been disturbed between 2014 and 2016.The disturbances involved either full or partial removal of forest canopy.Disturbances were mainly concentrated at the edges of the forest (Figure 6).

Space-Time Features Important for Predicting Forest Disturbance
Our results show that cumulative sums of residuals for spatial variability (Pocum, and Prcum), the anomaly threshold for identifying anomalous observations (Qthresh), the variability in the reference cube of a local data cube (SDrc) and the number of non-forest pixels within the local data cube in the reference period (CBnf) were the most important predictors for forest disturbance (Figure 7).These

Area Estimates for Disturbed Forest
Using the map produced from the space-time feature approach, the 5th percentile threshold and probability threshold of 0.55 (Figure 6), we estimated that about 17,915 ha of forest in the UNESCO Kafa Biosphere Reserve has been disturbed between 2014 and 2016.The disturbances involved either full or partial removal of forest canopy.Disturbances were mainly concentrated at the edges of the forest (Figure 6).

Space-Time Features Important for Predicting Forest Disturbance
Our results show that cumulative sums of residuals for spatial variability (Po cum , and Pr cum ), the anomaly threshold for identifying anomalous observations (Q thresh ), the variability in the reference cube of a local data cube (SD rc ) and the number of non-forest pixels within the local data cube in the reference period (CB nf ) were the most important predictors for forest disturbance (Figure 7).These

Discussion
In this paper, we proposed the use of space-time features to improve forest change monitoring using Landsat time series.Space-time features are used to confirm forest disturbances when negative anomalies are detected in the pixel-time series.Our results show that using space-time features to confirm forest disturbances reduces false detections significantly, with minimal negative effect on producer's accuracy for forest disturbance.Contrary to previous studies, our results also show that using two consecutive negative anomalies to confirm forest disturbances does not reduce false detections sufficiently even when using strict percentile thresholds to identify the anomalies.Instead of reducing false detections, strict percentile thresholds amplified the omission error, and led to a longer temporal detection delay.These results suggest that the criterion of two consecutive anomalies is not robust enough to reject false detection as previously thought [1,6,7].Two consecutive anomalies might be sufficient in areas where forest disturbances occur at large scale and in an abrupt manner (e.g., [6,7]), but not sufficient when dealing with complex small-scale forest disturbances [13,20].Our results also show that using the magnitude of change, in addition to two consecutive negative anomalies, to confirm forest disturbances does not reduce false detections sufficiently.The two consecutive anomalies + magnitude of change approach produced spatial and temporal accuracies largely similar to those of the two-consecutive anomalies approach.The false detections remained high despite using strict probability thresholds, implying that false detections also had high probability of disturbance when using magnitude of change as a predictor of forest disturbance.Overall, these findings suggest that temporal information alone might not be sufficient to reduce false detections, especially when aiming for accurate and timely detection of small-scale and gradual forest disturbances.
We observed that forest disturbances were mainly detected on the periphery of the forest.This pattern of forest disturbance is consistent with the findings of the previous studies [13,22] that mapped forest disturbances in this area using Landsat time series.The drivers of forest disturbance, especially expansion of small-holder agricultural areas and human settlements, are likely to occur on edges of the forest especially when there has been ongoing forest conservation and restoration activities in the area because it might be easier to illegally disturb forest edges than to clear new large intact forest area.

Importance of Space-Time Features for Accurate and Timely Detection of Forest Disturbances
Space-time features provided an opportunity to timely detect anomalies without amplifying false detections.This capacity to reject many false detections explains why the difference in the user's accuracy between the maps produced using the least strict percentile threshold (5th percentile) and the maps produced using most strict percentile threshold (1st percentile) was only (8%).This difference was greater than 17% when using only two consecutive negative anomalies or magnitude of change to confirm forest disturbances.These results suggest that, although using less strict percentile thresholds may initially flag many pixels as potentially disturbed, a set of space-time features provides sufficient information to accurately discriminate false detections from real changes.At the same time, by using least strict percentile thresholds, we can reduce the omission error and the temporal detection delay.When aiming for the lowest area bias (1.5%), our spatial accuracies (overall accuracy = 96.2%,user's accuracy = 76.8%,producer's accuracy = 78.3%)were within the range of those reported in other studies that attempted to map forest disturbances at annual or sub-annual scales in recent years [6,7,9,11,13,37,38].However, our high spatial accuracies were not achieved at the expense of temporal accuracy.

Important Space-Time Features for Forest Disturbance Predicting
Our analysis for conditional variable importance shows that space-time features which are related to spatial variability were the most important predictors for forest disturbances.More specifically, cumulative sum of residuals for spatial variability at the time step before the focal pixel is flagged as potentially disturbed, the anomaly threshold, the variability of the observations in the reference cube, and the cumulative sum of residuals for spatial variability at the time step the focal pixel is flagged as potentially disturbed were the most important predictors for forest disturbance.All these variables are computed using both spatial and temporal information, and are measuring spatio-temporal variability.The high importance of the variables related to spatio-temporal variability is not surprising because spatio-temporal variability is likely to increase once the forest becomes disturbed.The number of non-forest pixels within the local data cube was also an important predictor of forest disturbances.This is expected because new human-induced forest disturbance typically occurs closer to old disturbances in time and space [19].Thus, negative anomalies detected at forest pixels which have spatial association with non-forest areas are most likely to signify forest disturbance.Magnitude of change and two consecutive negative anomalies, which are widely used to distinguish real changes from false detections [1,6,7,9,13], were not the strongest predictors for forest disturbance.Note, however, that our magnitude of change is computed differently from those used in other studies [9,13].In our study area, forest disturbances are known to occur at a small-scale and in a gradual manner [13].They may therefore have small magnitude of changes.This may explain why the magnitude of change was not a strong predictor of forest disturbance.
Although some space-time features were less important than others, it is important to highlight here that the importance of each space-time feature as a predictor for forest disturbance may vary from one area to another, depending on the nature of the forest disturbance.In areas where the forest is cleared on a large scale and involves sudden and full forest cover removal (e.g., [7,9,36]), the magnitude of change might be a strong predictor of forest disturbance.Nonetheless, the space-time feature approach is flexible because the probability of disturbance is calculated using a supervised machine learning algorithm [28].However, an elaborate training data set to train the learning algorithm is needed to ensure the applicability of this approach to different forest areas.The requirement for training data is therefore disadvantageous because training data for forest disturbances are often lacking in many parts of the world [36], especially for small-scale forest disturbances which are difficult to acquire through visual interpretation of satellite images.Nonetheless, with the advent of community-based forest monitoring [39,40], observations on forest disturbances acquired through citizen science and local experts could be used for training purposes [20,22,40,41].Availability of daily high-resolution images from nanosatellites sensors (e.g., Planet's satellites) may also improve the acquisition of training data.

Space-Time Features Approach in the Era of Multi-Sensor Forest Disturbances
The approach of using space-time features that is presented in this paper is generic because the features we proposed here can be computed from image time series of any satellite sensor (e.g., Sentinel-2), including Synthetic Aperture Radar (SAR) sensors whose data has been widely used for forest monitoring in recent years [7,10,37,42,43].Our approach can also be applied to other optical satellite data metrics.We can, for example, use all spectral bands, and extract space-time features from each spectral band.Such multi-spectral space-time feature can then be used to predict the probability of forest disturbances.In this way, we can concurrently exploit spatial, temporal and spectral information in satellite data to accurately and timely map forest disturbance.It should be noted, however, that a clear understanding of how reflectance in each spectral channel reacts to forest disturbance is needed to extract consistent and meaningful space-time features.

Conclusions
This paper demonstrated that using space-time features extracted from Landsat NDVI data cubes to confirm forest disturbance increases the capacity to reject many false detections, without compromising the temporal accuracy.The use of space-time features is based on the idea that forest disturbances, by default, are spatio-temporal processes, hence information on spatio-temporal association and variability in satellite time series should be considered to accurately and efficiently detect forest disturbances.The following space-time features were the strongest predictors for forest disturbance: (1) the change in spatio-temporal variability, (2) spatio-temporal association with non-forest, and (3) variability in the reference period.Magnitude of change and two consecutive negative anomalies, which are widely used to distinguish real changes from false detections, were not the main predictors of forest disturbance.These findings indicate that space-time features have potential to improve the detection of forest disturbances, especially small-scale and gradual forest disturbances whose magnitude of change is often small.Detection of small scale forest disturbances may further improve when using satellite sensors, like Sentinel-2, that provide increased spatial and temporal details.

Figure 1 .
Figure 1.Overview of the UNESCO Kafa Biosphere Reserve located in southwest of Ethiopia.Areas in green represent the benchmark forest map.The benchmark forest map was generated using a supervised maximum likelihood classifier.

Figure 1 .
Figure 1.Overview of the UNESCO Kafa Biosphere Reserve located in southwest of Ethiopia.Areas in green represent the benchmark forest map.The benchmark forest map was generated using a supervised maximum likelihood classifier.

Figure 2 .
Figure 2. The workflow for this study.

Figure 2 .
Figure 2. The workflow for this study.

Figure 3 .
Figure 3. Overall accuracy (a); area bias (b); producer's accuracy (c); user's accuracy (d); and median temporal detection delay (e) for forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series.Forest disturbances were confirmed using three approaches: the two-consecutive anomalies approach, the two consecutive anomalies + magnitude of change approach and the space-time feature approach.The accuracies are shown as a function for the percentile thresholds for declaring anomalous observations in local data cubes.The vertical bars in (a,c,d) indicate the 95% confidence interval.

Figure 3 .
Figure 3. Overall accuracy (a); area bias (b); producer's accuracy (c); user's accuracy (d); and median temporal detection delay (e) for forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series.Forest disturbances were confirmed using three approaches: the two-consecutive anomalies approach, the two consecutive anomalies + magnitude of change approach and the space-time feature approach.The accuracies are shown as a function for the percentile thresholds for declaring anomalous observations in local data cubes.The vertical bars in (a,c,d) indicate the 95% confidence interval.

Figure 4 .
Figure 4.An example of a false detection (red dotted line), related to remnant clouds, which was rejected (probability of disturbance = 0.13) when using the space-time features.The probability of disturbance was 0.51 when only using the magnitude of change to predict forest disturbance.The magnitude of change was −0.22.The red dot represents the pixel (location: 7°30′5.042″N,36°1′30.959″E)whose time series is shown, and the black areas in the image chips represent areas which were masked as clouds or cloud shadow using CFmask.Image chips which were covered completely with no data are skipped.

Figure 5 .
Figure 5. Overall accuracy, producer's accuracy and user's accuracy for forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series using the 5th percentile while changing the probability threshold for accepting two consecutive negative anomalies as forest disturbance.

Figure 4 .
Figure 4.An example of a false detection (red dotted line), related to remnant clouds, which was rejected (probability of disturbance = 0.13) when using the space-time features.The probability of disturbance was 0.51 when only using the magnitude of change to predict forest disturbance.The magnitude of was −0.22.The red dot represents the pixel (location: 7 • 30 5.042"N, 36 • 1 30.959"E)whose time series is shown, and the black areas in the image chips represent areas which were masked as clouds or cloud shadow using CFmask.Image chips which were covered completely with no data are skipped.

Figure 4 .
Figure 4.An example of a false detection (red dotted line), related to remnant clouds, which was rejected (probability of disturbance = 0.13) when using the space-time features.The probability of disturbance was 0.51 when only using the magnitude of change to predict forest disturbance.The magnitude of change was −0.22.The red dot represents the pixel (location: 7°30′5.042″N,36°1′30.959″E)whose time series is shown, and the black areas in the image chips represent areas which were masked as clouds or cloud shadow using CFmask.Image chips which were covered completely with no data are skipped.

Figure 5 .
Figure 5. Overall accuracy, producer's accuracy and user's accuracy for forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series using the 5th percentile while changing the probability threshold for accepting two consecutive negative anomalies as forest disturbance.

Figure 5 .
Figure 5. Overall accuracy, producer's accuracy and user's accuracy for forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series using the 5th percentile while changing the probability threshold for accepting two consecutive negative anomalies as forest disturbance.
space-time features were two times more important than consecutive negative anomalies (C anomaly ) and the magnitude of change (M change ).Remote Sens. 2017, 9, 515 12 of 17 space-time features were two times more important than consecutive negative anomalies (Canomaly) and the magnitude of change (Mchange).

Figure 6 .
Figure 6.An example of forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series (insert A).The probability disturbance was calculated using the space-time features extracted from Landsat NDVI data cubes based on the 5th percentile threshold and 0.55 probability threshold.The base image is the surface reflectance in the red channel (Band 4) for Landsat-8/OLI acquired on 10 March 2016.The black patches represent areas masked for clouds and cloud shadows.

Figure 7 .
Figure 7. Conditional variable importance for each space-time feature used to predict forest disturbances in the UNESCO Kafa Biosphere Reserve from data cubes of Landsat normalised difference vegetation index (NDVI) time series.Training data (n = 1311) was used to determine the conditional variable importance.

Figure 6 .
Figure 6.An example of forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series (insert A).The probability disturbance was calculated using the space-time features extracted from Landsat NDVI data cubes based on the 5th percentile threshold and 0.55 probability threshold.The base image is the surface reflectance in the red channel (Band 4) for Landsat-8/OLI acquired on 10 March 2016.The black patches represent areas masked for clouds and cloud shadows.

Figure 6 .
Figure 6.An example of forest disturbances detected in the UNESCO Kafa Biosphere Reserve between 2014 and 2016 from Landsat NDVI time series (insert A).The probability disturbance was calculated using the space-time features extracted from Landsat NDVI data cubes based on the 5th percentile threshold and 0.55 probability threshold.The base image is the surface reflectance in the red channel (Band 4) for Landsat-8/OLI acquired on 10 March 2016.The black patches represent areas masked for clouds and cloud shadows.

Figure 7 .
Figure 7. Conditional variable importance for each space-time feature used to predict forest disturbances in the UNESCO Kafa Biosphere Reserve from data cubes of Landsat normalised difference vegetation index (NDVI) time series.Training data (n = 1311) was used to determine the conditional variable importance.

Figure 7 .
Figure 7. Conditional variable importance for each space-time feature used to predict forest disturbances in the UNESCO Kafa Biosphere Reserve from data cubes of Landsat normalised difference vegetation index (NDVI) time series.Training data (n = 1311) was used to determine the conditional variable importance.

Table 1 .
Space-time features extracted from local data cubes for Landsat NDVI time series to predict the probability of forest disturbances given two consecutive negative anomalies at a particular pixel.The spatial extent of the local data cube was 15 by 15 Landsat pixels.T 1 refers to the time step where the first negative anomaly of the two consecutive negative anomalies is detected, whereas T 2 is the time step for the second negative anomaly.The space-time features are grouped by category in the table.

Table 2 .
Area proportions, number of pixels and allocated sample size for each stratum.