Accuracy Assessments of Local and Global Forest Change Data to Estimate Annual Disturbances in Temperate Forests

: Forest disturbances are generally estimated using globally available forest change maps or locally calibrated disturbance maps. The choice of disturbance map depends on the trade-o ﬀ s among the detection accuracy, processing time, and expert knowledge. However, the accuracy di ﬀ erences between global and local maps have still not been fully investigated; therefore, their optimal use for estimating forest disturbances has not been clariﬁed. This study assesses the annual forest disturbance detection of an available Global Forest Change map and a local disturbance map based on a Landsat temporal segmentation algorithm in areas dominated by harvest disturbances. We assess the forest disturbance detection accuracies based on two reference datasets in each year. We also use a polygon-based assessment to investigate the thematic accuracy based on each disturbance patch. As a result, we found that the producer’s and user’s accuracies of disturbances in the Global Forest Change map were 30.1–76.8% and 50.5–90.2%, respectively, for 2001–2017, which corresponded to 78.3–92.5% and 88.8–97.1%, respectively in the local disturbance map. These values indicate that the local disturbance map achieved more stable and higher accuracies. The polygon-based assessment showed that larger disturbances were likely to be accurately detected in both maps; however, more small-scale disturbances were at least partially detected by the Global Forest Change map with a higher commission error. Overall, the local disturbance map had higher forest disturbance detection accuracies. However, for forest disturbances larger than 3 ha, the Global Forest Change map achieved comparable accuracies. In conclusion, the Global Forest Change map can be used to detect larger forest disturbances, but it should be used cautiously because of the substantial commission error for small-scale disturbances and yearly variations in estimated areas and accuracies.


Introduction
Forest disturbances largely influence the forest structure, biodiversity, and global carbon cycle [1][2][3][4][5]; hence, characterizing the spatial and temporal dynamics of forest disturbances is a key component of regional-and national-scale forest management [6,7]. Satellite data provide valuable information for deriving forest changes over large scales because of spatially and temporally consistent observations. Landsat data have especially been used to monitor deforestation, forest degradation, and forest disturbances in annual or more frequent time steps [8,9]. After the opening of the United States Geological Survey Landsat archives, many studies attempted to exploit long-term Landsat time-series data for forest change characterization at a 30-m spatial resolution [10]. Accordingly, automated Figure 1. Study area. The land and forest areas were provided by the National Land Numerical Information (https://nlftp.mlit.go.jp/ksj/index.html). The sample grids and pixels represent the sample distribution in the accuracy assessment.

Global Forest Change Map
The Global Forest Change dataset (version 1.7) developed by Hansen et al. [36] was used. It contains tree cover in 2000, forest loss, forest gain, and loss year layers. The forest loss year layer was clipped to the study area. The loss year 2001-2017 was used to represent the forest disturbances in the study period. A forest mask was not applied to the forest loss herein.

Local Disturbance Map
A local disturbance map developed by Shimizu et al. [57] in this region was used. The map was developed by using a random forest (RF) classification based on the LandTrendr temporal segmentation [11,58] executed on the Google Earth Engine (GEE) [59]. The methods used to develop the disturbance map are briefly described herein. First, the Landsat surface reflectance (SR) data from 1984 to 2018 were used to generate annual median composites. After converting the annual SR median composites to tasseled cap and other spectral indices, the LandTrendr temporal segmentation was implemented to extract the predictor variables in each year. The predictor variables and training samples were then used for the RF classification of four land cover change classes (i.e., forest harvesting, other disturbances, stable forest, and other cover changes). The derivation of the predictor variables and the implementation of RF classification were based on a direct prediction method presented in [60]. The forest harvesting and other disturbance patches less than four pixels were removed to reduce the commission errors. If two-year consecutive forest harvesting or other disturbances existed in the same pixel location, those in the latter year were removed. In this study, the four change classes were reclassed to disturbance and no-disturbance. The disturbances that occurred in 2001-2017 were used to coincide with the study period of the Global Forest Change map.

Landsat, Sentinel-2, and SRTM DEM for Reference Data Collection and Model Building
Landsat 5, 7, and 8 SR data from 2000 to 2018 with a cloud cover less than 70% were used to generate annual median composites. These composite images were downloaded from GEE. The individual scenes of Landsat SR data from 2013 to 2018 were also downloaded. The annual median composites of the Sentinel-2 top of atmosphere (TOA) reflectance data were generated and downloaded from GEE using the Sentinel-2 data from 2013 to 2018 with less than 70% cloud cover. The Normalized Burn Ratio (NBR), Tasseled Cap Wetness (TCW), and Angle (TCA) [61] were calculated for visual interpretation in Section 2.5 using Landsat annual median composites 2000-2018. The NBR was used to calculate the predictor variables in the mixed-effect models (Section 2.7). The elevation data from the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) [62] in GEE was used to calculate the topographic elevation and the slope in the mixed-effect models.

Reference Sample Collection
Two reference sample data were collected for accuracy assessments: one for the pixel-based reference data in 2001-2017 at a 30-m spatial resolution and the other for the reference data of 6 × 6 km grids in 2013-2017 at a 10-m spatial resolution. Higher spatial resolution data are desirable for accurate accuracy assessment and the investigation of the effects of mixed pixels of disturbance detection; however, high-spatial-resolution data are only available for recent years. Therefore, in this study, Remote Sens. 2020, 12, 2438 5 of 20 the 30-m spatial resolution reference data were collected to assess the entire study period using Landsat data. The reference data of the 10-m spatial resolution was collected using the Sentinel-2 data for recent years. All pixels in both reference data were used to assess the accuracy of the annual forest disturbance detection. The sample grid reference data was further used for the polygon-based assessment and mixed-effect modeling. Pixel-based reference samples were collected using a stratified random sampling approach. Based on the original local disturbance map, a total of 9000-pixel locations were selected from four strata (i.e., 2500 for harvested, 500 for other disturbed, 5000 for the stable forest, and 1000 for other changes pixels) for visual interpretation. Although Arévalo et al. [63] indicated that a sample selection of different sample locations in each change period can achieve better precision, the reference samples in this study were collected in the same location across all years because of the labeling efficiency in the visual interpretation [47,64]. We implemented the visual interpretation to label the change classes (i.e., disturbance or no-disturbance) of these samples in each year in 2001-2017. We used the Landsat time series composite images in each year, the trajectory of the spectral indices of the Landsat time series, and high-resolution satellite images in Google Earth to label disturbance or no-disturbance ( Figure 2). As a result, 9000 reference pixels were collected at a 30 m spatial resolution in each year. The map classes of both maps were different from the strata; thus, a ratio estimator [65] was used to calculate the unbiased accuracy metrics in the accuracy assessment that follows.
Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 20 To assess the accuracy of annual disturbance detection and use in mixed-effect models, the sample grid reference data was collected using 6 × 6 km grids at a 10 m spatial resolution. The accuracy assessment using the sample data in grids and blocks was widely implemented in previous studies (e.g., [66][67][68]). In this study, the study area was first divided into 6 × 6 km grids, resulting in 1408 grids that overlapped with the land area. The cumulative disturbance area of the local disturbance map from 2013 to 2017 was calculated in each grid. Then, a total of 35 grids that overlapped with more than 20% of the land area were randomly selected from the four strata corresponding to the disturbance area ratios of 0%, 0-1%, 1-3%, and >3% (i.e., 10, 10, 8, and 7 grids, respectively).
All annual forest disturbances from 2013 to 2017 in the selected grids were manually delineated To assess the accuracy of annual disturbance detection and use in mixed-effect models, the sample grid reference data was collected using 6 × 6 km grids at a 10 m spatial resolution. The accuracy assessment using the sample data in grids and blocks was widely implemented in previous studies (e.g., [66][67][68]). In this study, the study area was first divided into 6 × 6 km grids, resulting in 1408 grids that overlapped with the land area. The cumulative disturbance area of the local disturbance map from 2013 to 2017 was calculated in each grid. Then, a total of 35 grids that overlapped with more than Remote Sens. 2020, 12, 2438 6 of 20 20% of the land area were randomly selected from the four strata corresponding to the disturbance area ratios of 0%, 0-1%, 1-3%, and >3% (i.e., 10, 10, 8, and 7 grids, respectively).
All annual forest disturbances from 2013 to 2017 in the selected grids were manually delineated using Sentinel-2 composite images ( Figure 3). Because no Sentinel-2 data was available before Sentinel-2A was launched in 2015, the forest disturbances from 2013 to 2015 were delineated using the Sentinel-2 composite images in 2016. Then, the timing of each forest disturbance was attributed using Landsat images. We assumed that this procedure rarely caused a disturbance omission because the forest disturbances in this region were visible from the satellite images for several years after the disturbance events. Manually delineated disturbances were confirmed using high-resolution satellite images in Google Earth. The Automated Registration and Orthorectification Package (AROP) [69] was used in each grid to reduce the miss registration between the disturbance detection based on the Landsat images and the reference data based on the Sentinel-2 images. The near-infrared band of the Landsat and Sentinel-2 composite images was used to align the Sentinel-2 images to the Landsat images in the AROP. The delineated results were rasterized to a 10 m spatial resolution. Water surfaces, such as rivers and seas, were masked in the reference data. The total valid area in the 35 reference grids accounted for 2.9% of the study area.

Accuracy Assessments of Disturbance Detection for the Entire Study Area
The area-based thematic accuracies of both the Global Forest Change map and local disturbance map were assessed based on two reference data. For the accuracy assessment using the 10 m spatial resolution grid reference data of 2013-2017, both maps were resampled to a 10 m spatial resolution using the nearest neighbor method to evaluate the effects of the mixed pixels. All pixels in both reference data were used for the calculation. The producer's accuracy (PA) and the user's accuracy (UA) of the forest disturbance were calculated as follows using the ratio estimator [65,70]: where H is the number of strata, N h is the total number of pixels in the stratum h, y h and x h are the sample means of y u and x u , respectively, at the stratum h, and y u and x u are the numerator and the dominator in the equation defined for each accuracy metrics using pixel u, respectively [71]. The variance ofR was calculated as follows: where n h is the number of sample pixels in the stratum h, s 2 yh and s 2 xh are the sample variances of y u and x u in the stratum h, respectively, andX and s xyh are defined as follows:

Accuracy Assessments of Disturbance Detection for the Entire Study Area
The area-based thematic accuracies of both the Global Forest Change map and local disturbance map were assessed based on two reference data. For the accuracy assessment using the 10 m spatial resolution grid reference data of 2013-2017, both maps were resampled to a 10 m spatial resolution using the nearest neighbor method to evaluate the effects of the mixed pixels. All pixels in both reference data were used for the calculation. The producer's accuracy (PA) and the user's accuracy (UA) of the forest disturbance were calculated as follows using the ratio estimator [65,70]:

Polygon-Based Accuracy Assessment
Similar to that in the study of Linke et al. [54], the polygon-based assessment was implemented herein to calculate a polygon-based thematic accuracy using the 35-grid reference data. Both the Global Forest Change map and the local disturbance map were resampled to a 10 m spatial resolution using the nearest neighbor method. The spatial overlap proportion at each reference and the detected disturbance patch were used to identify the correct detection. Each reference disturbance patch was assessed as regards whether the spatial overlap proportion with the detected disturbance in the Global Remote Sens. 2020, 12, 2438 8 of 20 Forest Change or local disturbance map was larger than the threshold, and vice versa. The thresholds for the spatial overlap proportion from 0% to 100% were then investigated. The higher thresholds indicate a stricter assessment for identifying the correct disturbance, whereas the lower thresholds denote a relaxed assessment. We calculated the pseudo-PA, which is the number of correctly detected disturbance patches divided by the total number of reference disturbance patches, and the UA, which is the number of correctly detected disturbance patches divided by total number of detected disturbance patches, for both maps. We also evaluated the pseudo-PA and the UA at different disturbance size classes using the spatial overlap threshold of 50%, following the approach of Linke et al. [54]. The disturbance patches located at the edge of each grid were removed to avoid the effects of the disturbance that occurred outside the grids.

Mixed-Effect Modeling
A Generalized Linear Mixed Model (GLMM) with a logit link and a binomial family was used based on the variables derived from the 10 m reference grid data to investigate the influencing factors of the correct disturbance detection. A total of 10% of disturbance pixels in the reference data were randomly selected as the GLMM input. The response variable was "correctly detected" or "not detected" by the Global Forest Change map and local disturbance map in each year. The explanatory variables were the pre-NBR value (i.e., NBR value in the previous year), delta NBR value (i.e., the difference of the NBR value from the previous year), topographic elevation, slope, disturbance patch size of the disturbance, and distance from the disturbance edge. Each disturbance patch was used as a random effect following a similar approach by Gibson et al. [72]. The explanatory variables were calculated at a 10 m spatial resolution. The explanatory variables were selected by a step-wise elimination based on the Akaike Information Criterion (AIC). The "glmmML" package was used to implement the GLMM in R statistical software [73]. Figure 4 shows the PAs and the UAs of the disturbance in the accuracy assessment based on the pixel-based reference data at a 30 m spatial resolution from 2001-2017 (the reader is advised to see the error matrices provided in the supplementary tables for details). The PAs and the UAs of the disturbance of the Global Forest Change map varied for each year and ranged from 30.7% to 76.8% and 50.6% to 90.2%, respectively, whereas those for the local disturbance map were 78.3-92.5% and 88.8-97.1%, respectively. Large differences were also found in the Global Forest Change map across the years. The PA and the UA of the disturbance for the entire period of 2001-2017 were 62.4% (95% confidence interval of ±6.6%) and 65.5% (±10.1%), respectively, for the Global Forest Change map and 87.4% (±7.5%) and 94.1% (±2.4%), respectively, for the local disturbance map. Figure 5 summarizes the disturbance area estimated from the Global Forest Change map, local disturbance map, and reference data. The disturbance areas of both maps were calculated from a pixel-counting approach in the study area. The total disturbance area estimated in the reference data accounted for 3.2% of the study area in the entire study period (i.e., 0.2% annual average).  disturbance of the Global Forest Change map varied for each year and ranged from 30.7% to 76.8% and 50.6% to 90.2%, respectively, whereas those for the local disturbance map were 78.3-92.5% and 88.8-97.1%, respectively. Large differences were also found in the Global Forest Change map across the years. The PA and the UA of the disturbance for the entire period of 2001-2017 were 62.4% (95% confidence interval of ±6.6%) and 65.5% (±10.1%), respectively, for the Global Forest Change map and 87.4% (±7.5%) and 94.1% (±2.4%), respectively, for the local disturbance map.  Figure 5 summarizes the disturbance area estimated from the Global Forest Change map, local disturbance map, and reference data. The disturbance areas of both maps were calculated from a pixel-counting approach in the study area. The total disturbance area estimated in the reference data accounted for 3.2% of the study area in the entire study period (i.e., 0.2% annual average).     The total numbers of disturbance patches in the reference sample grids in the Global Forest Change map, local disturbance map, and reference data were 2067 (1699.5 ha), 1167 (1773.5 ha), and 1233 (1851.6 ha), respectively. The average and median disturbance patch sizes of the reference data were 1.50 ha and 0.75 ha, respectively. The number of disturbance patches of less than 1 ha accounted for 60.7% of the total disturbance patches in the reference data. On the other hand, the total area of disturbance of less than 1 ha accounted for 19.2% of the total disturbance area. Figure 6 shows the PAs and the UAs of the disturbance in the accuracy assessment based on the grid reference data at a 10 m spatial resolution for 2013-2017 (the reader is advised to see the error matrices provided in the supplementary tables for details). The Global Forest Change map achieved PAs and UAs of 53.2-69.3% and 65.5-80.3%, respectively (Figure 6), while the local disturbance map achieved PAs and UAs of 65.4-78.6% and 75.0-82.4%, respectively. The PA and UA of the disturbance for the entire period of 2013-2017 were 62.2% (±0.3%) and 71.8% (±0.3%), respectively, for the Global Forest Change map and 72.5% (±0.2%) and 78.9% (±0.2%), respectively, for the local disturbance map. The pseudo-PAs and UAs of the disturbance decreased with the higher spatial overlap thresholds (Figure 8a). The PAs of all the disturbance sizes of the Global Forest Change map at 0% threshold were larger than those of the local disturbance map. However, for thresholds larger than or equal to 20%, the local disturbance map achieved higher PAs than the Global Forest Change map. The UAs of the local disturbance map were consistently larger than those of the Global Forest Change map at the spatial overlap thresholds of 0-90%. The PA and UA of the Global Forest Change map at the spatial overlap thresholds of 0%, 20%, and 50% were 86.0%/77.4%, 76.1%/73.2%, and 54.1%/58.5%, respectively, while those of the local disturbance map were 77.3%/96.0%, 76.4%/95.4%, and 67.2%/87.2%, respectively. When divided into a disturbance size larger than or equal to 1 ha ( Figure 8b) and less than 1 ha (Figure 8c), the PAs and UAs of the disturbance larger than or equal to 1 ha were generally higher than those of less than 1 ha. For a disturbance larger than or equal to 1 ha, the UAs of both disturbance maps were almost the same at the thresholds 0-50% with differences of less than 4%. The PAs were generally smaller in the Global Forest Change map than in the local disturbance map. The PA and UA of the Global Forest Change map at the spatial overlap thresholds of 0%, 20%, and 50% were 94.7%/98.9%, 88.1%/98.7%, and 73.9%/92.8%, respectively, for a disturbance larger than or equal to 1 ha, while those for the local disturbance map were 97.5%/98.9%, 96.5%/98.7%, and 86.0%/97.1%, respectively. For a disturbance of less than 1 ha, the PAs of the disturbance at thresholds lower than 30% were higher in the Global Forest Change map, indicating that the disturbance was more likely detected by this map. The PA and UA of the Global Forest Change map at the spatial overlap thresholds of 0%, 20%, and 50% were 80.3%/72.4%, 68.2%/67.4%, and 41.1%/50.6%, respectively, while those for the local disturbance map were 64.1%/94.1%, 63.3%/93.3%, and 55.0%/81.0%, respectively, for a disturbance smaller than 1 ha. Figure 7 depicts the distribution of the disturbance patch size in the reference sample grids. The total numbers of disturbance patches in the reference sample grids in the Global Forest Change map, local disturbance map, and reference data were 2067 (1699.5 ha), 1167 (1773.5 ha), and 1233 (1851.6 ha), respectively. The average and median disturbance patch sizes of the reference data were 1.50 ha and 0.75 ha, respectively. The number of disturbance patches of less than 1 ha accounted for 60.7% of the total disturbance patches in the reference data. On the other hand, the total area of disturbance of less than 1 ha accounted for 19.2% of the total disturbance area. The pseudo-PAs and UAs of the disturbance decreased with the higher spatial overlap thresholds (Figure 8a). The PAs of all the disturbance sizes of the Global Forest Change map at 0% threshold were larger than those of the local disturbance map. However, for thresholds larger than or equal to 20%, the local disturbance map achieved higher PAs than the Global Forest Change map. The UAs of the local disturbance map were consistently larger than those of the Global Forest Change map at the spatial overlap thresholds of 0-90%. The PA and UA of the Global Forest Change map at the spatial overlap thresholds of 0%, 20%, and 50% were 86.0%/77.4%, 76.1%/73.2%, and 54.1%/58.5%, respectively, while those of the local disturbance map were 77.3%/96.0%, 76.4%/95.4%, and 67.2%/87.2%, respectively. When divided into a disturbance size larger than or equal to 1 ha ( Figure  8b) and less than 1 ha (Figure 8c), the PAs and UAs of the disturbance larger than or equal to 1 ha were generally higher than those of less than 1 ha. For a disturbance larger than or equal to 1 ha, the UAs of both disturbance maps were almost the same at the thresholds 0-50% with differences of less than 4%. The PAs were generally smaller in the Global Forest Change map than in the local disturbance map. The PA and UA of the Global Forest Change map at the spatial overlap thresholds of 0%, 20%, and 50% were 94.7%/98.9%, 88.1%/98.7%, and 73.9%/92.8%, respectively, for a disturbance larger than or equal to 1 ha, while those for the local disturbance map were 97.5%/98.9%, 96.5%/98.7%, and 86.0%/97.1%, respectively. For a disturbance of less than 1 ha, the PAs of the   The pseudo-PAs and UAs using the overlap threshold of 50% increased with the increasing disturbance size class (Figure 9). The PAs and UAs of the local disturbance map were higher than or equal to those of the Global Forest Change map. The PAs of the Global Forest Change map and the local disturbance map for the disturbance size class of 0-0.5 ha were 34.4% and 40.9%, respectively. The PA and the UA exceeded 80% for a disturbance larger than 3 ha in the Global Forest Change map and larger than 1 ha in the local disturbance map. The pseudo-PAs and UAs using the overlap threshold of 50% increased with the increasing disturbance size class (Figure 9). The PAs and UAs of the local disturbance map were higher than or equal to those of the Global Forest Change map. The PAs of the Global Forest Change map and the local disturbance map for the disturbance size class of 0-0.5 ha were 34.4% and 40.9%, respectively. The PA and the UA exceeded 80% for a disturbance larger than 3 ha in the Global Forest Change map and larger than 1 ha in the local disturbance map.

Accuracy Assessment Using the Sample Grid Reference Data for 2013-2017
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 20 Figure 9. PA and UA of the disturbance at each disturbance patch size class for the Global Forest Change map and the local disturbance map in the polygon-based assessment using a 50% overlap threshold.

GLMM
The GLMM results showed that the pre-NBR value, delta NBR value, disturbance patch size, and distance from the disturbance edge were significant explanatory variables in both maps (Table  1). A larger disturbance patch size and the distance from the disturbance edge were related to a higher correct detection, whereas higher pre-NBR and delta NBR values were associated with a lower correct detection. Figure 10 shows the distribution of each variable for the correct detection and the omission of the disturbance pixels. Figure 11 illustrates the disturbance and no-disturbance detection rate for the distance from the disturbance edge. Inside the reference disturbance patches, the correct detection rate increased with increasing distance from the disturbance edge until 120 m. The correct detection rate became larger than 90% at distances of 40 and 70 m for local and Global Forest Change maps, respectively. Outside the reference disturbance patches, the correct detection rate became larger than 90% at a distance of 30 m in both disturbance maps.

GLMM
The GLMM results showed that the pre-NBR value, delta NBR value, disturbance patch size, and distance from the disturbance edge were significant explanatory variables in both maps (Table 1). A larger disturbance patch size and the distance from the disturbance edge were related to a higher correct detection, whereas higher pre-NBR and delta NBR values were associated with a lower correct detection. Figure 10 shows the distribution of each variable for the correct detection and the omission of the disturbance pixels. Figure 11 illustrates the disturbance and no-disturbance detection rate for the distance from the disturbance edge. Inside the reference disturbance patches, the correct detection rate increased with increasing distance from the disturbance edge until 120 m. The correct detection rate became larger than 90% at distances of 40 and 70 m for local and Global Forest Change maps, respectively. Outside the reference disturbance patches, the correct detection rate became larger than 90% at a distance of 30 m in both disturbance maps.

Discussion
The choice of the Global Forest Change map or the local disturbance map for estimating forest disturbances depends on the trade-offs between the accuracy and the usability of these disturbance maps. The accuracy of the annual disturbance detection provides an insight into the trade-offs and efficient use of disturbance maps. This study assessed the annual disturbance detection by investigating the detection accuracy in the entire study area, polygon-based accuracy, and influencing factors for correct detection. These detailed assessments provided complementary information to determine the possible use of different disturbance maps to estimate the annual forest disturbances.
The accuracies of the annual disturbance detection were higher in the local disturbance map than in the Global Forest Change map in both reference data. The detected disturbance area and the estimated accuracy in the Global Forest Change map considerably differed for 2001-2017. This result is in line with the findings of previous studies that indicated the differences in accuracy and estimated area in different years [43,46]. The better performance of the local disturbance map is not surprising because the local disturbance map used in this study was based on the LandTrendr segmentation, which enhanced the temporal information, and was calibrated to this study area. This indicates that developing a local disturbance map in the study area is a simple approach for improving the disturbance detection accuracy. However, a previous study showed that the disturbance detection accuracy using the local and Global Forest Change maps varied regionally [46]. This might be because the disturbance detection accuracy is affected by climatic and vegetation factors, such as satellite data availability due to clouds and vegetation recovery after disturbances. Furthermore, the accuracy of the local disturbance map is affected by change detection algorithms. Cohen et al. [74] demonstrated that the detection results were largely different in each detection algorithm. In this context, change detection algorithms should be suitable for the study area in terms of regional disturbance processes and data availability. The integration of several algorithms using ensemble learning (e.g., [75,76]) is one possible solution for minimizing the influences of algorithm selection.
The accuracy metrics calculated using pixel-based and sample grid reference data were largely different from each other in both disturbance maps in the 2013-2017 period. The accuracy metrics based on the pixel-based reference data generally had higher accuracies (Figures 4 and 6) that can be attributed to the different spatial resolutions of the two reference data (i.e., 10 and 30 m). Both disturbance maps were originally developed at a 30 m spatial resolution; hence, the mixed pixels containing the disturbance boundaries were assessed in the validation using the reference data with the 10 m spatial resolution, but not in reference data with the 30 m spatial resolution. The differences can be substantial for small-scale disturbances because of the lower correct detection rate near the edge of the disturbances (Figure 11). In areas dominated by small-scale disturbances, such as those in this study, these differences largely affect the accuracy metrics of the entire study area. In addition, the pixel-based reference data collected with stratified random sampling in the local disturbance map might reduce the inclusion of small-scale disturbances. This can lead to optimistically estimated accuracies because disturbances of less than four pixels were removed from the local disturbance map. Note that the reference data collected at the 10 m spatial resolution might be influenced by the image registration error between Landsat and Sentinel-2 and the miss delineation of the disturbance shapes in the visual interpretation process. A further rigorous accuracy assessment for the mixed pixels should be conducted using high-resolution satellite images or other remote sensing data, such as aerial photographs.
The polygon-based assessment revealed that the Global Forest Change map can detect disturbances at least partially, even though the commission error was also high. The PAs of the disturbances of less than 1 ha in the local disturbance map were lower than those of the Global Forest Change map in lower spatial overlap thresholds because disturbance patches of less than four pixels were removed to reduce the commission error in the local disturbance map. However, if the 50% spatial overlap is required for correct detection, the local disturbance map has better accuracies even for small-scale disturbances. The PA and UA of the disturbance exceeded 80% for the disturbance larger than 3 ha in the Global Forest Change data and that larger than 1 ha in the local disturbance map. These results indicate that both disturbance maps can feasibly detect larger disturbances at a 50% spatial overlap. If the UA of 80% is sufficient to reduce false change detection, both maps can be used to detect disturbances larger than 1 ha and 0.5 ha, respectively. The difficulty of detecting small-scale disturbances ranging from 1 to 3 ha was addressed in previous studies [42,45,54]. Although the polygon-based assessment did not depend on the entire study area but rather on the sample grid reference data, we assumed that the estimates had little bias because the total area of the sample grids was large and accounted for 2.9% of the entire study area.
The delta NBR, pre-NBR, disturbance patch size, and distance from the disturbance edge were the significant factors influencing the disturbance detection. The delta NBR represents the difference in the vegetation condition of two consecutive years; hence, selecting it as a significant explanatory variable for disturbance detection was not surprising and confirmed the findings of the previous studies (e.g., [56,77]). The disturbance patch size also affected the detection accuracy in the various types of disturbance in previous studies (e.g., [55]). This might be indirectly affected by the mixed pixels in near-disturbance edges. Previous studies investigated the possible effects by adjacent pixels in MODIS at a 500 m resolution [78] and used the omission error near the disturbance edge for the accuracy assessment (e.g., [79]). The present study confirmed the relationship of the omission error at a finer spatial resolution of 30 m. Although Hamunyela et al. [44] indicated that elevation is an influencing factor for disturbance detection in the Global Forest Change map in Tanzania, this was not confirmed herein. This discrepancy might have been affected by several factors, such as regional data availability, cloud cover condition, elevation range, and vegetation. Although the results of this study did not confirm, these factors should be considered when using disturbance maps in such regions. Disturbance agents, such as fire, windthrow, and harvesting, can affect the detection accuracy because different disturbance agents have different disturbance sizes, vegetation recovery conditions, and magnitudes of change [13,22]. These factors should be analyzed to assess possible errors in regions with various disturbance agents.
The local disturbance map can be generated using satellite data with more frequent steps or a higher spatial resolution (e.g., Sentinel-2 data). Note that in this study, the pixel size of the Landsat data for detecting small-scale disturbances was coarse, and this might be alleviated by using high-spatial-resolution data. The use and integration of local disturbance maps using other satellite data should be investigated in different regions.

Conclusions
This study investigated the accuracy of annual disturbance detection using the Hansen Global Forest Change map and the local disturbance map in temperate forests dominated by harvest related disturbances. The local disturbance map generally achieved higher and more stable accuracies in each year compared with the Global Forest Change map. Therefore, the generation of a locally calibrated map is a valid approach if higher accuracy is required. The accuracy assessments revealed that the accuracies and the detected disturbance areas of the Global Forest Change map differed across years. These varying accuracies and area estimates should be considered when the Global Forest Change map is used for regional disturbance mapping. As expected, smaller disturbances were difficult to detect using both disturbance maps, resulting in a lower disturbance detection accuracy in areas dominated by small-scale disturbances. The mixed-effects models also indicated that the disturbance size was an influencing factor for correct detection in both maps. Although the Global Forest Change map can partially detect disturbances even if they are less than 1 ha, the map should be cautiously used because of its high commission error in small-scale disturbances. However, for disturbances larger than 3 ha, both disturbance maps achieved higher and comparable accuracies. In this context, the Global Forest Change map can be used for detecting larger disturbances. In areas dominated by small-scale disturbances, the Landsat data for detecting disturbances were coarse. High-spatial resolution data, such as Sentinel-2 data, might be used for achieving sufficient detection accuracies.