Monitoring Deforestation in Rainforests Using Satellite Data : A Pilot Study from Kalimantan , Indonesia

Monitoring large forest areas is presently feasible with satellite remote sensing as opposed to time-consuming and expensive ground surveys as alternative. This study evaluated, for the first time, the potential of using freely available medium resolution (30 m) Landsat time series data for deforestation monitoring in tropical rainforests of Kalimantan, Indonesia, at sub-annual time scales. A simple, generic, data-driven algorithm for deforestation detection based on a consecutive anomalies criterion was proposed. An accuracy assessment in the spatial and the temporal domain was carried out using high-confidence reference sample pixels interpreted with the aid of multi-temporal very high spatial resolution image series. Results showed a promising spatial accuracy, when three consecutive anomalies were required to confirm a deforestation event. Recommendations in tuning the algorithm for different operational use cases were provided within the context of satisfying REDD+ requirements, depending on whether spatial accuracy or temporal accuracy need to be optimized.


Introduction
Forests are central to the breakthrough 2015 Paris climate agreement. Item 2 under Article 5 reads "Parties are encouraged to take action to implement and support, including through results-based payments, the existing framework as set out in related guidance and decisions already agreed under the Convention for: policy approaches and positive incentives for activities relating to reducing emissions from deforestation and forest degradation, and the role of conservation, sustainable management of forests and enhancement of forest carbon stocks in developing countries; and alternative policy approaches, such as joint mitigation and adaptation approaches for the integral and sustainable management of forests, while reaffirming the importance of incentivizing, as appropriate, non-carbon benefits associated with such approaches." [1]. Deforestation and forest degradation account for about 10% of global CO 2 emissions [2][3][4]. While a large-scale and long-term energy transition away from fossil fuels is expected to take still a few decades, it is estimated that tropical forest conservation and restoration could reduce up to half of total carbon emissions [5]. Particularly, the tropical region of insular South East Asia (hereafter SE Asia), notably the country of Indonesia, the number of viable (i.e., cloud-free) observations in areas with pervasive cloud cover, such as Indonesia. Such approach is more suitable to detect changes in humid tropical forests, where the time window to detect disturbance events is very short [20], because the vegetation regrowth happens immediately after disturbances. However, the opportunity of detecting deforestation at sub-annual scales by the "dense time series" approach (hereafter referred as DTS approach) comes with the challenge of separating noise from the real change spectral signal. This is because DTS class of algorithms need to be designed to intelligently deal with the realm of noise (in contrast to the "annual series" approach based on annual best-quality image mosaic) inherent in satellite data due to unmasked clouds and imperfections in image geo-registration and atmospheric corrections [21].
To our knowledge, the performance of LTS data and DTS algorithm for deforestation detection has not yet been evaluated in the tropical rainforests of insular SE Asia, likely due to the lack of available multitemporal ground data or VHSR images for verification of the algorithm results. This present study assessed the potential of using LTS data and DTS algorithm approach for the detection of deforestation events at sub-annual scales in Indonesian tropical rainforests. The study targeted abrupt (discrete) deforestation events, considering that forest clearing to make way for commercial plantation is the main deforestation driver in Indonesia. A data-driven deforestation detection algorithm was proposed based on a simple decision boundary (i.e., detection threshold) for signalling a potential deforestation event, followed by a consecutive anomalies criterion (CAC) for confirming a deforestation event.
To avoid uncertainty in reference data interpreted from Landsat images themselves, the spatial and temporal deforestation detection accuracies were assessed using human-interpreted reference data collected strictly with the aid of multitemporal VHSR images (hereafter "VHSR series").

Case Study Area
Kalimantan, also known as Indonesian Borneo, is the part (c. three quarter) of Borneo, the world's third largest island, that belongs to the country of Indonesia. In early 1970s, an estimated 76% of Borneo was covered by old-growth forests [10]. Since then, over the period 1973-2015 industrial-scale selective logging and plantation industries in the island have resulted in the estimated loss of 34% of the island's old-growth forest cover, 26% (14.4 MHa) in Kalimantan, leaving a remaining 26 MHa total forest area in Kalimantan, with 64% of it intact forests [7]. Further, historically the forest loss has also occurred within proposed and existing protected areas [22]. The soils in Kalimantan range from highly fertile to virtually sterile types supporting different vegetation types, with the most common moderately weathered inceptisols; however the soil distribution is not well known [23]. Kalimantan has a tropical climate classified as Af by the Köppen-Geiger system, with the average temperature of 26.7 • C and average annual rainfall of 2992 mm. Forest landscape restoration (FLR) activities in Indonesia have only recently gained momentum; within the framework of the Bonn Challenge launched in 2011 (http://www.bonnchallenge.org/), Indonesia has committed a total national restoration target of 29 MHa (https://infoflr.org/countries/indonesia).
In this pilot study, as an emphasis was given to high-confidence reference data for accuracy assessment of deforestation detection, the study did not target a specific study area. Rather, constrained by the available VHSR images, case study locations were selected within Kalimantan ( Figure 1) which (1) were forested in the year 2000; (2) showed deforestation activities; (3) have the longest timespan between the earliest and latest VHSR image; and (4) have the most number of VHSR images. The benchmark forest map for the year 2000 was based on the map of Indonesia time-sequential primary forest extent and loss [8] publicly available at [24].
VHSR images obtained through Digital Globe viewing services were used in collecting the high-confidence reference samples (Table A1, Appendix A). There was a lack of historical VHSR images from before and after deforestation in Google Earth application over forests of Kalimantan mega-island. The VHSR images were available as rectangular (north-oriented) subsets of approximately 1.1 × 1.1 km. VHSR series at six locations (Table A1, Appendix A, Figure 1a) were selected based on the above-mentioned criteria; Figure 1b-e shows an example of one of the selected VHSR series. In this study, the deforestation monitoring was started on 1 January 2000. The end of monitoring period was set around the date of the latest VHSR image available (Table A1, Appendix A).  Figure 1. (a) Location of six case study area where multitemporal very high spatial resolution images (VHSR, green stars) were available. The geographic extent of the VHSR images is given in Table A1, Appendix A. The map of primary intact forest area in 2000 was obtained from [24] (class values 2, 4, 5, 6, 7, 8, 9 in the original map). The map of oil palm concessions [25] and logging concessions [26] were obtained from [27], made openly available by Global Forest Watch, and produced by the Indonesia Ministry of Environment and Forestry. VHSR images of one scene is shown in (b-d) (dates shown in lower left corner) overlain with interpreted reference Landsat sample pixels area (red squares). The VHSR images were obtained through Digital Globe viewing service.

Satellite Data
The Google Earth Engine (EE) online platform [28] was used to retrieve all available Landsat surface reflectance images in the USGS Landsat archive, as temporal stacks for the study locations (Table A1, Appendix A), including Landsat-5 TM (1984-2012), Landsat-7 ETM+ (1999-2016), and Landsat-8 OLI (2013-2016) sensors. EE is a cloud-based platform that facilitates easy access to Google's high-performance computing resources and the multi-petabyte catalog of satellite imagery and geospatial datasets in its public data archive. EE JavaScript Application Programming Interface [29] was used to retrieve the temporal stacks of Landsat surface reflectance images, subset the images to the case study VHSR scenes, mask the images to remove non-clear pixels, compute spectral vegetation indices, and finally export the pre-processed images for further time series analysis in R statistical software [30]. To remove non-clear pixels, the images were masked for clouds, shadows, and water bodies using the CFmask (C Version of Function Of Mask) layer [31] provided in the surface reflectance product. The surface reflectance product (also known as Analysis Ready Data, ARD) was produced from the standard L1T product (radiometrically calibrated and orthorectified), and atmospherically corrected using the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS [32,33]) for TM and ETM+, and Landsat Surface Reflectance Code (LaSRC [32,34]) for OLI.
The Landsat missions carry a multispectral passive optical sensor which measures radiation reflected from Earth's surface in several discrete broad electromagnetic channels called spectral bands. The common spectral bands are similar across TM, ETM+, and OLI, namely blue, green, red, near-infrared (NIR), first shortwave-infrared (SWIR1), and second shortwave-infrared (SWIR2) [35]. Images in these bands have a nominal 30 m spatial resolution.
The feasibility of sub-annual deforestation detection is constrained by the intra-annual observation density. For most of the study period  at least two Landsat systems were in orbit, providing collection opportunity, and thus possible data temporal resolution of 8 days. This temporal resolution, however, has not always been realized globally due to selective (non-systematic) scene acquisition plan, Landsat system health issues, differing data reception capabilities of archiving agencies such as USGS [36], Scan-Line-Corrector failure of the ETM+ sensor, and cloud cover. Therefore, the intra-annual availability of cloud-free Landsat observations was examined, as the mean and standard deviation of cloud-free observation numbers per year and per pixel. This was done during the period 1999-2012 when Landsat-5 TM and Landsat-7 ETM+ were in orbit, and during 2013-2016 when Landsat-7 ETM+ and Landsat-8 OLI were in orbit. In the latter, cloud-free observation availability from Landsat-8 OLI alone was assessed, to assess the relative advantage of combining data from different Landsat satellites. The period prior to 1999 was not assessed as only Landsat-5 TM was in orbit and scene acquisition was very limited.

Deforestation Detection Algorithm
The proposed algorithm for deforestation detection consisted of the following steps. First, an ordinary linear regression model was fit to the time series observations in the period defined as stable history period (t = 1, ..., n). The root mean square error of this history model (RMSE history ) was recorded. Second, this history model was projected into the future i.e., to predict observations in the subsequent period defined as monitoring period (t = n + 1, ..., N). Third, the discrepancies between the model predictions (ŷ s ) and the observations (y s ) in the monitoring period was monitored as the residual (y s −ŷ s ). The monitoring was done sequentially to emulate a real monitoring scenario. An observation was signaled as an anomaly (i.e., a breakpoint) when the absolute residual exceeded the following decision boundary: where k is a factor that controls how sensitive/conservative is the algorithm in detecting an anomaly and RMSE history is the RMSE of the history model in that pixel's time series. Different values for k were assessed with regards to its impact on deforestation detection accuracy. Fourth, a consecutive anomalies criterion (CAC, Appendix B) was applied, that is the algorithm was designed to not immediately confirm a deforestation event when a single anomaly was detected, but instead require the anomalies to be signaled consecutively for cons times. This was done to prevent false positive detection of deforestation events, since in the preliminary experiments, it was observed that anomalies were frequently declared at noise observations (mainly due to remaining undetected clouds) in the monitoring period. In addition, the consecutive anomalies were required to occur within two years time window as it was observed that the NDMI signal recovers and stabilizes within two or three years since the deforestation event. In summary, the proposed algorithm has only two parameters namely k and cons with straightforward effects. The algorithm is therefore based on the "minimum user input" approach [37].
As input to the deforestation detection algorithm, four commonly used broadband vegetation indices (VI) were initially examined, namely the normalized difference vegetation index (NDVI) [38], enhanced vegetation index (EVI) [39], normalized difference moisture index (NDMI), also known as normalized difference water index [40] and normalized burn ratio (NBR) [41]. It was observed that NDMI shows high sensitivity (most clear signal) in response to deforestation events in the study area, with signal change magnitude most visibly larger than ephemeral noise. NDMI was calculated as follows: According to Schultz et al. [42] who used LTS for deforestation detection across tropical forest sites in Brazil, Ethiopia and Vietnam, "wetness"-related VI such as NDMI (the SWIR band used is sensitive to canopy water content) performed better than "greenness"-related VI such as NDVI and EVI (the red band used is sensitive to pigment content). They attributed the lower accuracy of the greenness VI to their inability to properly isolate the change signal from noise in LTS. Therefore, NDMI was chosen for deforestation detection in this study. Figure 2 illustrates the proposed algorithm. An observation is flagged as an anomaly (red-circled) if its residual (vertical dashed grey line) is larger than the decision boundary (solid green lines). Lowering k (Figure 2a) moves the boundary lines closer to the model prediction line (solid blue line), and as a result, more observations are flagged as anomalies; in other words, the algorithm becomes more sensitive (less conservative). Note that while forest canopy removals cause a decrease in NDMI (i.e., a negative anomaly), here both the negative and positive anomalies were processed with the intention to make the algorithm more generic with respect to different spectral inputs. Different spectral inputs would either decrease (e.g., NDMI, NDVI, NBR, and Tasseled Cap Wetness [43]) or increase (e.g., SWIR1, red, and Tasseled Cap Brightness [43]) in response to forest canopy removals. This is considering different forest types or different forest disturbance types might require different optimal spectral variables. q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q q q q q q (a) 1 9 8 8 1 9 8 9 1 9 9 0 1 9 9 1 1 9 9 2 1 9 9 3 1 9 9 4 1 9 9 5 1 9 9 6 1 9 9 7 1 9 9 8 Observation date NDMI q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q q q q q qq q q q q q q q q q q q q q q q q q q q (b) 1 9 8 8 1 9 8 9 1 9 9 0 1 9 9 1 1 9 9 2 1 9 9 3 1 9 9 4 1 9 9 5 1 9 9 6 1 9 9 7 1 9 9 8  Furthermore, to prevent the history model from being impacted by noisy historical observations, a noise removal procedure to the historical observations similar to [44] was tested. That is, for each observation y t in the history period (t = 1, ..., n), the procedure compared y t against its immediate temporal neighbors (y t−1 and y t+1 ). An observation y t was determined as an outlier if the differences between y t and its temporal neighbors were less than 1% of the values of the neighbours [44] . If y t was an outlier, its value was replaced with the average of the values of its temporal neighbors, if the time span between the temporal neighbors is not more than one year; otherwise y t was assigned as a missing observation.

Spatial and Temporal Accuracy Assessment
In this pilot study, instead of assessing the spatially-representative accuracy of deforestation detection over large areas, the aim was to assess the accuracy using high-confidence reference data interpreted with the aid of the available VHSR series. Furthermore, this allowed us to identify lower-impact (lower-magnitude) deforestation events, in addition to the higher-impact large clearing. Therefore, the study intentionally selected Landsat pixels within which the status of the forests (i.e., whether deforested or remaining as forests) can be visually ascertained in VHSR images. In total, 435 reference samples (Landsat pixels) were collected across the six VHSR scenes (Figure 1a), comprising 227 "deforestation" samples and 208 forests-remaining-forests samples (hereafter "non-deforestation" samples). Here deforestation event was defined as removals of the forest canopy which are visible in very high spatial resolution (VHSR, pixel size < 5 m [45]) satellite images. The deforestation samples included 56 (24.7%) small clearing samples (i.e., sub-pixel canopy removal such as dirt/logging road opening) which are often the initial sign of large-scale deforestation and illegal logging activities [46]. The number of reference samples, as constrained by the available cloud-free VHSR multitemporal images over forests in the notoriously cloudy studied region, served for the initial validation of the deforestation detection algorithm in this pilot study.
The count (sample)-based spatial accuracy of deforestation detection was assessed. Note that the count-based accuracy does not necessarily give a measure of map (area) accuracy, in which the areal representativeness of the samples need to be accounted for [47], which was not possible and was not the aim in this pilot study. Producer's accuracy (PA), user's accuracy (UA), and overall accuracy (OA) was calculated. OA is the proportion of all correctly predicted samples from all samples: where, TP is the number of true positive detection, i.e., correctly predicted deforestation, TN is number of true negative detection, i.e., correctly predicted non-deforestation, FP is number of false positive detection, i.e., incorrectly predicted deforestation (non-deforestation in reference data), FN is number of false negative detection, i.e., incorrectly predicted non-deforestation (deforestation in reference data). Class-specific PA is one minus error of omission (FN), while class-specific U A is one minus error of commission (FP): An algorithm may seem to correctly detect the change, but it detects the change at incorrect date (i.e., false TP) due to noise. It is therefore essential to also assess temporal accuracy in addition to spatial accuracy. The temporal accuracy was estimated as median temporal lag (MTL), i.e., the median number of days between reference date of deforestation (T re f ) and algorithm-detected date of deforestation (T con f , confirmed date if CAC is applied). MTL was calculated for correctly predicted deforestation (TP) samples. Deforestation detected (by algorithm) earlier than the reference date was treated as FP (also in spatial accuracy calculation). Therefore, MTL was calculated as follows: T re f was visually identified from the LTS with the aid of VHSR series, which is the only alternative in the absence of ground data. T re f was identified as the date of the Landsat observation at which the deforestation signal is first visible in the LTS. This gives reference date precision close to the effective temporal resolution (cloud-free observation density) of LTS data, and thus matches the maximum precision of the algorithm applied to LTS data. Finally, a comparison between results of the proposed algorithm and the operational annual forest cover change map [6] accessible in Global Forest Watch (GFW) online platform (http://globalforestwatch.org/) was conducted. This represents a first per-pixel validation of the annual forest cover change map, in Indonesian rainforests, taking advantage of the multitemporal VHSR images available in this study. It is important to ensure an appropriate comparison between the accuracy of the proposed algorithm, and the accuracy of the GFW annual forest cover change map, as assessed with the reference data collected in this study. The reference data were collected by interpreting the LTS profile together with the available multitemporal VHSR images. Therefore, the LTS were monitored for change up to the date of the latest available VHSR image. For example, for the first VHSR scene (Table A1, Appendix A), change was monitored up to 15 August 2015. That means, changes that might have happened after 15 August 2015, but still within the year 2015, was not being monitored. Therefore, to compare with the GFW annual forest cover change map, reference data which changes were identified (visually) up to the year before the year of the latest available VHSR image (in this example, 2014) were used. These were 399 out of the 435 original reference samples. This is because, for example in the case of the first scene that was monitored up to 15 August 2015, the GFW annual forest cover loss detected in the year 2015 would include changes that happened until the end of the year 2015. The algorithm proposed in this present study was therefore re-applied to the LTS up to the end of the year before the year of the latest available VHSR image. In the GFW annual forest cover change map, the pixels which have the loss year identified after the year before the year of the latest available VHSR image were reclassified as no-change.

Availability of Landsat Cloud-Free Observations
During 1999 to 2012, most areas in Kalimantan had on average (14 years average) 4-8 cloud-free Landsat observations per year (Figure 3a, in yellow). This indicates the potential for sub-annual detection of deforestation in this region, despite the pervasive cloud cover. This intra-annual observation density likely does not allow to reliably construct a seasonal model component in addition to the linear trend model. In the same area, the standard deviation of 2-4 observations (Figure 3d, in orange), however, indicates that in a given year there could be very few (2) or even zero cloud-free observation. The lower cloud-free observations availability in the middle part of the island (2-4 observations, Figure 3a in orange) was observed to spatially coincide with high elevation areas (based on the Shuttle Radar Topography Mission Digital Elevation dataset). A higher number of cloud-free observations (more than 8, in green and blue) can be seen within the scene overlap areas. The scene overlap areas, however, also had a higher standard deviation of cloud-free observation count (4)(5)(6)(7)(8). During 2013-2016, utilizing data from a single OLI sensor alone, large areas had only 2-4 cloud-free observations per year (Figure 3c, in orange). Combining data from two sensors, i.e., OLI and ETM+ increased (doubled) cloud-free observations to 4-12 (Figure 3b, in yellow and green) as compared to using OLI alone.  Further, the temporal accuracy (delay) of deforestation detection depends on the temporal separation between the available cloud-free observations reported above (i.e., the effective data temporal resolution). Out of the 435 reference sample pixels assessed in this study, 60% of the cloud-free LTS observations in the whole period (since 1984) had an immediate previous temporal neighbor with up to 56 days separation ( Figure 4). Less than 20% of cloud-free observations had previous temporal neighbor apart by 8 (nominal revisit interval of two Landsat sensors) or 16 days (nominal revisit interval of one Landsat sensor). Thus, the temporal detection accuracy of 8-16 days is very likely not possible in insular SE Asia. Some cloud-free observations were apart by more than one year from the previous cloud-free observations in the time series, which were likely the observations from early the 1990s because Landsat-5 mission did not have a systematic scene acquisition plan.

Demonstration of Deforestation Detection Algorithm
No substantial cross-sensor bias was observed relative to the deforestation change signal (different colors of points in Figure 5). In the cases of large forest clearing, followed by conversion to oil palm plantation or natural revegetation, the proposed algorithm correctly detected deforestation near (with some delay) the reference date (Figures 5a-d and 6a-d). In the cases of forests remaining forests (non-deforestation), the proposed algorithm also correctly registered no deforestation event (true negative) (Figures 5e,f and 6e,f). The anomalies were not confirmed as a deforestation event since they did not occur for consecutively three times. In the cases of small clearing (sub-pixel canopy removal), the proposed algorithm on the other hand failed (false negative) to detect the subtler change signal in (Figures 5i and 6i), but succeeded in Figures 5h and 6h. Lowering k might allow the detection of a subtle change in Figures 5i and 6i, but as a consequence, more observations would have been flagged as potential change and thus might lead to false positive detection at noise observation.        Figure 6) are: forests cleared and converted into oil palm plantation (a,b); forests cleared and naturally revegetated (c,d); forests remaining forests, i.e., non-deforestation (e,f); forests cleared and converted into settlements (e.g., built-up structure) (g); and small clearings in forests (h,i). Observations (points) are colored by sensor, i.e., TM in orange, ETM+ in blue, and OLI in magenta. History noise removal is not applied. Vertical black line denotes the start of monitoring, vertical dashed red line denotes algorithm-detected deforestation, red circle marks flagged a potential (not necessarily confirmed) deforestation event, vertical dashed green line denotes the reference date of deforestation, the solid blue line is the ordinary linear regression model fitted to historical observations. forests remaining forests i.e., non-deforestation (e,f); forests cleared and converted into settlements (e.g., built-up structure) (g); and small clearings in forests (h,i). The VHSR images were obtained through Digital Globe viewing service.

Spatial and Temporal Accuracies
The results of accuracy assessment with the available high-confident reference samples showed that concerning spatial accuracy (Figure 7a-c), there was a clear trade-off between PA and UA of deforestation class. Increasing k (thus conservative-ness of the algorithm) led to increasing UA (fewer commission errors), but simultaneously led to decreasing PA (more omission errors). An optimal algorithm seeks to strike a balance between UA and PA. The application of CAC (cons > 1, Figure 7b,c) provided higher spatial accuracy (i.e., higher UA, PA and OA at the cross-over point between UA and PA) than when it was not applied (cons = 1, Figure 7a). In turn, when CAC was applied, requiring three anomalous observations to confirm a deforestation event (i.e., cons = 3, Figure 7c) provided higher spatial accuracy than requiring only two anomalies (i.e., cons = 2, Figure 7b). Concerning temporal accuracy (Figure 7d-f), however, the detection delay when requiring three anomalies (Figure 7f) was more than twice as long (in other words, temporal accuracy was twice as low) as when requiring two anomalies ( Figure 7e). Thus, there was another clear trade-off between spatial and temporal accuracies. cons > 3 were not evaluated as it would lead to too long detection delay. Note this result was obtained without applying history noise removal, which only provided minor improvement (Table 1), but could cost considerably more computational time. The highest spatial accuracy was obtained using cons = 3 and k = 4 ( Table 1). Lowering cons from 3 to 2 (optimal k = 5.5, history noise removal not applied) reduced the overall (spatial) accuracy by 5.1%, but also reduced detection delay, i.e., median temporal lag (MTL) from 112 days (2 observations) to 40 days (1 observation).  (1)), cons is the number of consecutive anomalies required to confirm a deforestation event. Table 1. Deforestation detection accuracies of the proposed algorithm assessed with 435 reference samples. cons is the number of consecutive anomalies required to confirm a deforestation event. k is the factor which controls the decision boundary of anomaly detection (Equation (1)). MTL is the median temporal lag between the date of deforestation event confirmed by the algorithm, and reference date, given in number of days and number of observations. NA: not applicable. OA is overall accuracy, UA and PA are user's and producer's accuracies of deforestation class. The accuracy of the Global Forest Watch (GFW) annual forest cover change map was found to be high (Table 2). However, the accuracy of the proposed algorithm was found to be higher (Table 3), based on the validation with the reference data available in this study. More importantly, there was a good agreement between the proposed algorithm result and GFW annual forest cover change map ( Table 4), despite that the proposed algorithm provided detection by dates i.e., as satellite observations become available. Table 2. Error matrix between reference data and GFW annual forest cover change map.

Non-Deforestation Deforestation Sum UA (%) PA (%) OA (%)
Predicted  Table 3. Error matrix between reference data and the detection result of the proposed algorithm.

Non-Deforestation Deforestation Sum UA (%) PA (%) OA (%)
Predicted  Table 4. Comparison between GFW annual forest cover change map and the detection result of the proposed algorithm.

Non-Deforestation Deforestation Sum
This study Non-deforestation  211  2  213  Deforestation  19  167  186  Sum  230  169  399 The wall-to-wall application of the proposed algorithm to the available VHSR scenes showed that, firstly, the areas of forests remaining forests were generally correctly predicted as being not deforested in all the three VHSR scenes (Figure 8). There were however still some occurrences of false positive error.
In Figure 8d, pixels in gray were detected to be deforested already in 2001, which was supported by the VHSR evidence (Figure 1b). This provided information on the history of the forest area: a road was first opened in early 2000, but the large clearcut deforestation took place only after 2010 (the blue-ish pixels) to open the land for oil palm plantation. The opening of roads visible in the latest VHSR ( Figure 8a) however appeared erroneously detected later than expected, and the road pixels showed varying detection dates.
In Figure 8b, the deforestation events associated with the road opening were detected (Figure 8e) earlier (orange-ish pixels), correctly. Pixels in blue corresponded to areas that were visibly clearcut in VHSR image dated 6 July 2011 (not shown here). In the 2011 VHSR image, the areas to the left of the area cleared by 2011 were still forested, and were cleared in the next available VHSR dated 8 August 2015 (not shown here); the algorithm correctly detected the deforestation events in 2013-2015 (purple, magenta and pink pixels).
In Figure 8f, it was encouraging to see the small (sub-pixel) clearings (canopy removals) in the lower right corner of Figure 8c were detectable. However, the small clearings in the lower left part of the area failed to be detected. The deforestation map also indicated the speed and spatial direction of deforestation: the purple areas cleared in 2013, followed by the pink areas which were cleared in 2014. The pixels in cyan on top (algorithm informed deforestation in 2010) were visibly cleared in the VHSR evidence dated 25 March 2012 (not shown here). In Figure 9f, pixels in yellow were detected to be deforested in 2009, which was supported by the VHSR evidence (Figure 9b). Pixels in blue were also visibly deforested between 2011 and 2012 ( Figure 9c,d), as correctly detected by the algorithm. Afterward, more areas were cleared by 2014 (Figure 9e), correctly shown in purple in the deforestation map.
Finally, the spectral (NDMI) change magnitude associated with omission (FN) and commission errors (FP) in deforestation class was examined (Figure 10a). This was done to critically analyze the errors and thus to identify potential modifications that may improve the accuracy of the algorithm. Figure 10b shows the magnitude as a factor of RMSE history . Note the values of |magnitude|/RMSE history of FP, noise, TP, and first flagged (first time a later-confirmed deforestation event was flagged as an anomaly) detection were above 4 because the spatially optimal k = 4 is used.
Several observations can be made. First, a large range of magnitude was registered for the observations flagged as noise. Second, noise magnitude overlapped with magnitude of TP detection. Third, magnitude associated with FP error overlapped with magnitude of TP detection. Together, these suggest, by the nature of the noise in LTS in the study region, confirming change (deforestation) by just the magnitude information, without consecutive anomalies criterion, is decidedly prone to FP error.   Figure 10. Distribution of (a) spectral change magnitude, and (b) spectral change magnitude as a factor of history model RMSE (RMSE history ) associated with commission error (false positive, FP), noise (i.e., the residual of observations flagged as anomaly but did not lead to a confirmed deforestation event), true positive (TP), "1st flagged" (i.e., the residual of first-time anomaly which led to confirm the deforestation), omission error (FN), and true negative (TN) in deforestation detection by the proposed algorithm with optimal (i.e., highest spatial accuracy) cons = 3 and k = 4. In FP and TP cases, the magnitude is calculated as the median of consecutive anomalous observations which led to a confirmed deforestation. In FN and TN cases (deforestation not detected), the magnitude is calculated as the median of all observations in the monitoring period.
Fourth, magnitude of FN error was similar to magnitude of TN detection, suggesting the FN samples were mostly the lower-magnitude deforestation samples (i.e., small clearing/sub-pixel canopy removal). Indeed, out of the total 15 FN samples, 11 were small clearing samples, which was about 20% of small clearing samples in reference dataset (56). This, however, means the other 80% of small clearing samples were correctly predicted.
Fifth, the magnitude of TP detection varied largely, up to 19 ×RMSE history , with the magnitude values mostly distributed between 6 ×RMSE history and 10 ×RMSE history . Sixth, the |magnitude|/RMSE history of the first anomaly which led to a confirmed deforestation ("1st flagged") was distributed in the same values as the magnitude of TP. These last two points suggest, the number of required consecutive anomalies (cons) can be designed to be spatially and temporally adaptable: when |magnitude|/RMSE history is much higher than the initial k = 4 (with initial cons = 3), allowing immediate (cons = 1) deforestation confirmation or just two observations (cons = 2) to confirm deforestation event may reduce detection delay (MTL) without incurring FP error.

Discussion
This study evaluated, for the first time, the capability of a methodology based on Landsat time series (LTS) data and a dense time series (DTS) algorithm for deforestation monitoring in the tropical rainforests of insular SE Asia. Based on the accuracy assessment using the available high-confident reference samples, the best result (cons = 3, k = 4) achieved a satisfactory spatial accuracy (UA and PA of deforestation class, and OA above 90%, history noise removal not applied), however with a considerable time lag between the deforestation event and the detection (MTL = 112 days). Results showed that shorter detection delay (MTL = 40 days) could be achieved by decreasing the required number of consecutive anomalies (cons) from three (cons = 3) to two (cons = 2), at the expense of slightly lower spatial accuracy (UA and PA of deforestation class, and OA above 87%, history noise removal not applied). There was therefore an unavoidable trade-off between spatial and temporal accuracies. However the cons value could be set depending on use cases, whether a higher priority is placed on spatial (i.e., confident detection) or temporal accuracy (i.e., immediate detection) [48].
Two primary use cases of the deforestation monitoring methodology can be considered in Indonesian context, namely for law enforcement and carbon accounting. The law enforcement use case concerns two types of situations; firstly, the situation when an immediate action is required to stop the newly-detected illegal deforestation event from spreading to the surrounding forest area, and secondly, the situation when the deforestation information is needed to identify the liable or responsible deforestation actors. In the first situation, the highest temporal accuracy i.e., the monitoring system capable of providing the fastest deforestation alert is required. The results in this study showed that, when the consecutive anomalies criterion was disabled (i.e., by setting cons = 1), a high commission error rate of more than 30% was produced (Figure 7). Arguably, the monitoring system should also be sufficiently robust against false alert (i.e., false positive detection/commission error), such as when the decision involves dispatching forest guards to the deforestation location, which would be very costly operation. Thus, it is recommended to set cons to 2 which provided MTL: 40 days and acceptable commission error rate of 13%. In the attribution of deforestation actors, spatial accuracy is more important, and in fact one may wait to see the resulting size of the deforested area and the speed at which it spreads, to identify whether the likely deforestation actors are small-scale smallholder farmer or industrial-scale concession owner. Thus, it is recommended to set cons to 3 to obtain the optimal spatial accuracy.
The carbon accounting use case concerns the estimation of activity data (i.e., size of deforested area) needed to estimate the amount of greenhouse gas (GHG) emissions (emission amount = activity data × emission factor), both for establishing the "baseline" reference emissions level and evaluating future emissions level [13]. This is required in the REDD+ MRV reporting. For this, the highest spatial accuracy is mandatory to allow the most accurate estimation of activity data. The temporal accuracy is only needed at the level of reporting frequency/period such as required by REDD+. Thus, for this use case it is recommended to set cons to 3 to obtain the optimal spatial accuracy.
The proposed methodology relied on a simple, generic, data-driven deforestation monitoring algorithm, requiring minimum user input and relatively easy recalibration as necessary. Such algorithm has two useful properties. Firstly, it provides better interpretability and transparency than algorithms based on machine learning with a complex set of learned rules or response-predictor relationships. The proposed algorithm has only two tunable parameters, i.e., factor k and cons (Equation (1)), with a straightforward effect on spatial and temporal accuracies (Figure 7). k controls the minimum spectral change magnitude that would be flagged as potential deforestation, and consequently controls the trade-off between UA (i.e., commission error) and PA (i.e., omission error). An end user therefore can tune k based on knowledge of the type of deforestation or disturbance phenomenon in the area of interest, i.e., whether they are predominantly high impact (high magnitude e.g., complete canopy removal (stand replacing), thus set higher k) or low impact (low magnitude e.g., partial (sub-pixel) canopy removal (non-stand replacing), thus set lower k) in nature. cons on the other hand can be tuned based on the relative importance of spatial accuracy (confident detection) versus temporal accuracy (fast detection), since it was shown that there was an unavoidable trade-off between spatial and temporal accuracies. The second useful property of the proposed data-driven algorithm is that it does not require or depend on a training step, unlike machine learning algorithms. Algorithms that build a predictive model based on machine-learned relationship between probability of disturbance/deforestation and spectral change magnitude would, by design, limit the detection to the "severity" of disturbance/deforestation in the data used to train the machine learning algorithms [21]. The proposed algorithm can therefore be more easily transferred to other local conditions, i.e., other tropical regions, by straightforward tuning of k or/and cons. Further, in line with the MRV reporting principle of transparency, and for accountability, beyond the data transparency, arguably there is a need for methodological transparency as well [16,49].
A main issue in the dense time series approach of deforestation detection is dealing with false positives, i.e., noise that has been falsely identified as deforestation signal. In the proposed algorithm, the consecutive anomalies criterion was applied to prevent false positives based on the logic that real change produces a more persistent change signal, as compared to the ephemeral noise. This logic has also been used in other continuous change detection methods [50,51]. Other significant efforts to solve this issue have been put forward, for example, first, by modelling the probability of change from the spectral change magnitude at the identified breakpoints in the time series [37,52], second, by checking if the temporal anomaly is also a spatial anomaly with respect to the neighboring pixels [51], or, third, by removing the noise in the time series i.e., the remnant clouds (undetected by the operational CFmask algorithm [31]), using a multiTemporal mask (Tmask) algorithm [53]. To be successful, the first approach requires the breakpoint to be identified near the time of actual deforestation event, the second approach requires that the neighbouring pixels (in a specified optimum window size) are relatively homogeneous, undisturbed forests, and are not data gap (e.g., cloud or shadow pixel), while the third approach requires a sufficient cloud-free observation density to estimate the time series model. Considering the challenges in applying the aforementioned approaches in insular SE Asia as one of the cloudiest regions on Earth, with highly fragmented forest landscape, the approach based on a "consecutive anomalies criterion" (CAC) was chosen in this present study, where cons = 2 was found to provide above 85% spatial accuracy, with median detection delay of 40 days. Similarly, the operational Global Land Analysis and Discovery (GLAD) Landsat-based humid tropical forest disturbance alerts, accessible through the Global Forest Watch platform, confirms the alerts as tree cover loss if remain unconfirmed until two or more out of four consecutive observations [54]. This near-real-time alert system (hence also capable of sub-annual monitoring) is operational across the pan-tropical forests, however it has not been evaluated (i.e., formally validated) in insular SE Asia yet. Unfortunately, a comparison between the proposed algorithm and the GLAD alert system (e.g., concerning the false alert rate and detection delay) could not be performed in this present study, because the alert data (detection day) is available from 2015 [55], while all but one sample in the reference data available in this present study, and most area visible in the available VHSR images, had change visually confirmed before 2015. Nevertheless, since the validation exercise in this present study showed the spatial accuracy of the proposed sub-annual detection algorithm was found higher than the operational annual forest cover change map [6], this preliminarily indicates that simpler algorithm and input spectral metrics based on the standard USGS surface reflectance product, can achieve a comparable accuracy to the more complex machine learning algorithms employing a large number of multitemporal spectral metrics, and image pre-processing. Further, the simpler, data-driven (minimum user input) algorithm such as demonstrated in the present study has the added benefit of a greater transparency and interpretability.
As an emphasis was given to high-confidence multitemporal reference data, it should be acknowledged that the validation data used in this pilot study was not a large and a probability sample, and thus the sample-based (i.e., not adjusted with area proportion between deforested and non-deforested area [47]) accuracy reported was only indicative and not representative for the large forest area in Kalimantan mega-island. Nevertheless, wall-to-wall application of the algorithm in the area covered by the available VHSR series showed a promising wall-to-wall accuracy (Figures 8 and 9). Further works are needed to collect a spatially and temporally representative probability-based reference sample for the larger Kalimantan forest area. A satisfactory spatial accuracy from such large scale assessment would provide statistical confidence in using LTS data and the algorithm to produce estimates of the size of deforestation and forests remaining forests area in the larger Kalimantan area. This would however be a challenging undertaking due to lack of multitemporal VHSR images over Kalimantan, such as that observed in the open Google Earth application. Collecting such a comprehensive reference sample database can benefit from crowdsource mapping activities [56], where citizen scientists can be trained to interpret and annotate LTS breakpoints and segments, with visual aid of Landsat image chips (colour and geometric pattern), and when available, VHSR images from multiple dates. Crowdsourcing would also help in collecting samples of remnant (un-masked) clouds which was the main source of commission error in deforestation detection using optical satellite data. This cloud sample data can be used to improve the existing cloud masking algorithm.
In addition to large area accuracy assessment, further works can utilize the presented methodology and other change information it can provide for two purposes. Firstly, the spectral-temporal information [57,58] in LTS can be used to overcome the limitation of using solely spectral information for the task of differentiating natural forests and plantation forests [46], as well as differentiating primary natural forests and secondary natural forests. Distinguishing natural forests from plantation forests, and subsequently identifying in which of the two land use types tree cover loss and gain occurred, allow to better quantify the changes in forest carbon stock and other non-carbon values of forests [59,60]. Mapping natural forests and plantation forests has usually been done by or involved visual human-interpretation of Landsat imagery [7,61,62]. An automated method would, however, increase the objectivity and comparability of the mapping results [63], as well as prove beneficial for routine natural forests and plantation forests map generation to assess their area changes through time. It was observed that in forests that were converted into oil palm plantation, after the deforestation event the spectral signal (i.e., NDMI) tended to stabilize at lower value than pre-disturbance (natural forests) level ( Figure 5). The forests that underwent a natural revegetation on the other hand showed the NDMI returned to pre-disturbance level. This suggests the slope of the post-disturbance recovery segment is potentially a useful discriminatory feature between natural forests and plantation forests. The slope of regrowth can be estimated by fitting a linear model to the observations starting from the date when the deforestation event was confirmed by the algorithm. Alternatively, the regrowth monitoring algorithm based on whether the spectral signal returns to pre-disturbance stable level [37] is promising too. That is, if regrowth is detected, it indicates a natural revegetation post-disturbance; if regrowth is not detected, it indicates the land has been planted with oil palm post-disturbance. However, accurate detection of deforestation date is a prerequisite to both approaches. Discriminating primary (old-growth, intact) natural forests from secondary natural forests on the other hand necessitates extending the LTS temporal depth by also integrating satellite data from the Multispectral Scanner System (MSS) on board Landsat-1 through Landsat-5. This is because in Indonesia history, deforestation and forest degradation started to intensify in the 1970s (personal communication with Prof. Mari Pangestu, former Minister of Trade for Indonesia). The MSS sensors acquired imagery from 1972 to 1999. Inclusion of MSS data will also improve the sparse TM observation density before 2000. Harmonizing MSS data and data from later sensors (i.e., TM, ETM+, and OLI used in this present study) is however challenging due to differences in the spatial, spectral, and radiometric resolution and quality [64].
Secondly, the information on timing of disturbance/deforestation event can be used to infer the age of a secondary forest stand, based on the length of time that has passed since the disturbance/deforestation event. Age may serve as a better predictor of tree heights and diameter-and thus wood volume, biomass, and in turn aboveground carbon stock density and increment [65][66][67]-than leaf area to which optical satellite data is directly sensitive. Age may also improve the estimation of high biomass beyond the saturation level of estimates from optical and microwave data [46]. The carbon uptake capacity of trees in forests (and plantations) also likely vary with their age. This is important as carbon accummulation/carbon stock increment is one of the largest source of uncertainty in carbon accounting [16]. A map of forest stand age potentially allows to create a spatially-explicit carbon stock density and increment, and consequently spatially-explicit emission factors. Such map may reduce the uncertainty of using fixed emission factors estimated based on allometric relationships calibrated with a limited number of trees [16]. Forest stand age also indicates the current successional stage, and thus the biodiversity status [65], which is important for conservation consideration (i.e, the "+" in REDD+).
While the level of spatial accuracy achieved in this study was promising, the achieved temporal accuracy can not be considered adequate for near-real-time detection of deforestation. This was however due to the data itself i.e., the LTS cloud-free observation density in the region, rather than the error of the algorithm. Looking forward, however, it can be well expected that the temporal accuracy of deforestation detection in insular SE Asia would improve following three recent developments in satellite data provisions. First, the addition of new data stream from European Space Agency's now fully-operational (i.e., global and systematic acquisitions realized [68]) Sentinel-2A (launched June 2015) and Sentinel-2B (launched March 2017) potentially improves the cloud-free observation density. A recent study showed the average revisit interval of the combined Landsat-8, Sentinel-2A and Sentinel-2B will be about 2.8-3.8 days (as compared to 16 days when using Landsat-8 only) near the equator between 0 • and about 5 • latitude (Kalimantan lies between approximately 5 • N and 5 • S) (Figure 10b in [69]). Second, the operational global production of Sentinel-2 analysis-ready surface reflectance product has been planned to start from mid-March this year [70]. Third, works are well underway towards the provision of spatially and spectrally harmonized Landsat and Sentinel-2 surface reflectance product [71]. Together, these developments will soon provide freely available medium resolution (10-30 m) satellite observations with an unprecedented frequency, and it remains to be seen how much would, the substantial improvement in revisit interval, bring improvement in cloud-free observation density [72], and thus temporal accuracy of deforestation detection, and capability for near-real-time monitoring [73], in Kalimantan, Indonesia and the wider insular SE Asia region.

Conclusions
This study evaluated, for the first time, the potential of freely available dense Landsat Time Series (LTS) data for deforestation detection in tropical rainforests of Kalimantan, Indonesia, at sub-annual time scales. Results showed, firstly, regarding data, the cloud-free observation density provided by combined LTS data from TM, ETM+, and OLI sensors indicated the feasibility of sub-annual deforestation mapping and monitoring in the Kalimantan mega-island, despite the region's persistent cloud cover. Secondly, regarding the deforestation detection, the pilot validation indicated a promising spatial accuracy. Therefore, the presented methodology is promising for use cases in which high spatial accuracy is a priority, such as to improve the quality of Activity Data needed for calculating emissions (and removals) from land use, land-use change, and forestry (LULUCF) under UNFCCC requirements, in the context of REDD+ at a subnational level. The promising spatial accuracy provides motivation for further studies to collect a larger reference sample size, representative for the larger Kalimantan mega-island forest area, in order to evaluate the robustness of the presented methodology for producing large area estimates of deforested area, as well as for large area deforestation monitoring. Such a large scale assessment would only be feasible with the help of crowdsourcing activities for collecting the reference data through visual interpretation of LTS and VHSR. At the same time, reference data of the remnant clouds in the satellite image can be built to further improve cloud masking algorithm, and hence reducing noise in LTS. Further works are also needed to assess the performance of the proposed methodology in the detection of forest degradation i.e., non-stand replacing forest disturbances, for which sizable reference data were not available in this study. Detecting various types of forest disturbances, with the associated different ranges of spectral change magnitude, likely benefits from the use of multiple spectral variables (indices and individual spectral bands) other than the NDMI utilized for primarily deforestation detection in this study. The achieved temporal accuracy (median temporal lag of 40-112 days) was deemed not adequate for near-real-time monitoring purpose. Looking forward, however, it can be expected that the new data stream (freely available) from the now fully-operational Sentinel-2A and Sentinel-2B would improve the cloud-free observation density, and thus the temporal accuracy of deforestation detection. Finally, beyond deforestation mapping and monitoring, the results indicated the potential of utilizing the spectral-temporal features in the dense LTS data to automate the task of mapping natural forests and plantations. Results from this are expected to be useful for many stakeholders who are interested in forest management and in palm oil production, but also for the local community affected by the consequences of deforestation.
Author Contributions: H., A.K. and P.Y. conceived and designed the study; H. analyzed the data; A.K., V.M. and P.Y. assisted in obtaining satellite data and information on study area; M.R., A.K., V.M., P.Y. and S.P. supported H. in the analyses; H. prepared the first version of the manuscript; all authors participated in discussing the results and writing the final paper.
Funding: This study was funded by Aalto University (grant 91160122). Part of the research was developed in the Young Scientists Summer Program (YSSP) at the International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria with financial support from the Academy of Finland. This work was supported by the RESTORE+ project (www.restoreplus.org) which is part of the International Climate Initiative (IKI), supported by the Federal Ministry for the Environment, Nature Conservation and Nuclear Safety (BMU) based on a decision adopted by the German Bundestag; it received support from the project "Delivering Incentives to End Deforestation: Global Ambition, Private/Public Finance, and Zero-Deforestation Supply Chains" funded by the Norwegian Agency for Development Cooperation under agreement number QZA-0464 QZA-16/0218. Furthermore, the work has been funded by IIASA's Tropical Futures Initiative (TFI).