Mean Composite Fire Severity Metrics Computed with Google Earth Engine Offer Improved Accuracy and Expanded Mapping Potential

Abstract: Landsat-based fire severity datasets are an invaluable resource for monitoring and research purposes. These gridded fire severity datasets are generally produced with pre- and post-fire imagery to estimate the degree of fire-induced ecological change. Here, we introduce methods to produce three Landsat-based fire severity metrics using the Google Earth Engine (GEE) platform: the delta normalized burn ratio (dNBR), the relativized delta normalized burn ratio (RdNBR), and the relativized burn ratio (RBR). Our methods do not rely on time-consuming a priori scene selection but instead use a mean compositing approach in which all valid pixels (e.g., cloud-free) over a pre-specified date range (pre- and post-fire) are stacked and the mean value for each pixel over each stack is used to produce the resulting fire severity datasets. This approach demonstrates that fire severity datasets can be produced with relative ease and speed compared to the standard approach in which one pre-fire and one post-fire scene are judiciously identified and used to produce fire severity datasets. We also validate the GEE-derived fire severity metrics using field-based fire severity plots for 18 fires in the western United States. These validations are compared to Landsat-based fire severity datasets produced using only one pre- and post-fire scene, which has been the standard approach in producing such datasets since their inception. Results indicate that the GEE-derived fire severity datasets generally show improved validation statistics compared to parallel versions in which only one pre-fire and one post-fire scene are used, though some of the improvements in some validations are largely negligible. We provide code and a sample geospatial fire history layer to produce dNBR, RdNBR, and RBR for the 18 fires we evaluated. Although our approach requires that a geospatial fire history layer (i.e., fire perimeters) be produced independently and prior to applying our methods, we suggest that our GEE methodology can reasonably be implemented on hundreds to thousands of fires, thereby increasing opportunities for fire severity monitoring and research across the globe.


Introduction
The degree of fire-induced ecological change, or fire severity, has been the focus of countless studies across the globe [1][2][3][4][5]. These studies often rely on gridded metrics that use pre- and post-fire imagery to estimate the amount of fire-induced change; the most common metrics are the delta normalized burn ratio (dNBR) [6], the relativized delta normalized burn ratio (RdNBR) [7], and the relativized burn ratio (RBR) [8]. These metrics generally have a high correspondence (r² ≥ 0.65) to field-based measures of fire severity [9][10][11][12], making them an attractive alternative to expensive and time-consuming collection of post-fire field data. These satellite-inferred fire severity metrics are often produced using Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), and Operational Land Imager (OLI) imagery due to their combined temporal depth (1984-present) and global coverage, although they can be produced from other sensors such as the Moderate Resolution Imaging Spectroradiometer (MODIS) [13] and Sentinel-2A [14].
However, producing satellite-inferred fire severity datasets can be challenging, particularly if severity data are needed for a large number of fires (>~20) or over broad spatial extents. For example, expertise in remote sensing technologies and software is necessary, which means either involving a remote-sensing specialist or investing substantial time in learning such technologies and software. Furthermore, fire severity datasets have traditionally been produced using one pre-fire and one post-fire Landsat image [15,16], which requires careful attention to scene selection. Image selection can be time consuming because it involves identifying scenes with no clouds covering the fire of interest and avoiding scenes affected by a low sun angle and those with mismatched phenology between pre- and post-fire conditions [6,17]. Even with careful image selection, some images (those from Landsat ETM+ acquired after 2003) and the resulting gridded severity datasets will have missing data due to the failure of the Scan Line Corrector [18].
Challenges in producing satellite-inferred severity datasets have likely hampered development of regional to national fire severity products in many countries. The exception is in the United States (US), where Landsat-derived severity metrics have been produced for all 'large' fires (those ≥400 ha in the western US and ≥250 ha in the eastern US) that have occurred since 1984 [19]. This effort, undertaken by the US government, is called the Monitoring Trends in Burn Severity (MTBS) program and has mapped the perimeter and severity of over 20,000 fires. The MTBS program has provided data for numerous scientific studies ranging from those involving <10 fires [20][21][22] to those involving >1000 fires [2,23,24] and for topics such as fuel treatment effectiveness, climate change impacts, and time series analyses [25][26][27][28]. The fire severity datasets produced by the MTBS program have clearly advanced wildland fire research in the US. Although some studies involving the trends, drivers, and distribution of satellite-inferred fire severity are evident outside of the US [4,5,15,29,30], the number and breadth of such studies are relatively scarce and restricted compared to those conducted in the US. We suggest that, if spatially and temporally comprehensive satellite-inferred severity metrics were more widely available in other countries or regions, opportunities for fire severity monitoring and research would increase substantially.
In this paper, we present methods to quickly and easily produce Landsat-derived fire severity metrics (dNBR, RdNBR, and RBR). These methods are implemented within the Google Earth Engine (GEE) platform. As opposed to the standard approach in which one pre-fire and one post-fire Landsat scene are identified and used to produce these fire severity datasets, we use a mean compositing approach in which all valid pixels (e.g., cloud-free) over a pre-specified date range are stacked and the mean value for each pixel over each stack is calculated. Consequently, there is no need for a priori scene selection, which substantially speeds up the time necessary to produce fire severity datasets. The main caveat, however, is that a fire history GIS dataset (i.e., polygons of fire perimeters) must be available and produced independent of this process. Where fire history datasets are currently available or can easily be generated, our methods provide a means to produce satellite-inferred fire severity products similar to those distributed by the MTBS program. We also validate the severity metrics produced with our GEE methodology by evaluating the correspondence of dNBR, RdNBR, and RBR to a field-based measure of severity and measure the classification accuracy when categorized as low, moderate, and high severity. These validations were conducted on 18 fires in the western US [8] and were compared to parallel validations of fire severity datasets using one pre-fire and post-fire scene.
Code and a sample fire history GIS dataset are provided to aid users in replicating and implementing our methods.

Processing in Google Earth Engine
We produced the following Landsat-based fire severity metrics for each of the 18 fires that are described in Section 2.2; the perimeter of each fire was obtained from the MTBS program [19]. All fire severity metrics are based on the normalized burn ratio (NBR; Equation (1)) and include the: (i) Delta normalized burn ratio (dNBR; Equation (2)) [6]; (ii) relativized delta normalized burn ratio (RdNBR; Equation (3)) [7]; and (iii) relativized burn ratio (RBR; Equation (4)) [8]. These are produced using Landsat TM, ETM+, and OLI imagery.
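For reference, the four indices can be written in their commonly published forms [6-8]. This is a sketch under stated assumptions: the scaling of dNBR by 1000 and the 0.001 floor in the RdNBR qualifier follow the standard definitions in the cited literature and are not stated explicitly in the text here.

```latex
\begin{align}
\mathrm{NBR} &= \frac{\mathrm{NIR} - \mathrm{SWIR}}{\mathrm{NIR} + \mathrm{SWIR}} \tag{1}\\[1ex]
\mathrm{dNBR} &= \left(\mathrm{NBR}_{\mathrm{prefire}} - \mathrm{NBR}_{\mathrm{postfire}}\right) \times 1000 \tag{2}\\[1ex]
\mathrm{RdNBR} &=
\begin{cases}
\dfrac{\mathrm{dNBR}}{\left|\mathrm{NBR}_{\mathrm{prefire}}\right|^{0.5}}, & \left|\mathrm{NBR}_{\mathrm{prefire}}\right| \ge 0.001\\[2ex]
\dfrac{\mathrm{dNBR}}{0.001^{0.5}}, & \left|\mathrm{NBR}_{\mathrm{prefire}}\right| < 0.001
\end{cases} \tag{3}\\[1ex]
\mathrm{RBR} &= \frac{\mathrm{dNBR}}{\mathrm{NBR}_{\mathrm{prefire}} + 1.001} \tag{4}
\end{align}
```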
where NIR (Equation (1)) is the near infrared band and SWIR (Equation (1)) is the shortwave infrared band. The NBR_prefire qualifier in RdNBR (Equation (3)) is necessary because the equation fails when NBR_prefire equals zero and produces very large values when it approaches zero. Within GEE, mean pre- and post-fire NBR values (Equation (1)) across a pre-specified date range (termed a 'mean composite') were calculated per pixel across the stack of valid pixels (e.g., cloud- and snow-free pixels). For fires that occurred in Arizona, New Mexico, and Utah, the date range is April through June; for all other fires, the date range is June through September (Figure 1). These date ranges are based on various factors including the fire season, expected snow cover, expected cloud cover, and latitude. We used the Landsat Surface Reflectance Tier 1 datasets, which include, among their bands, a quality assessment mask that identifies pixels with clouds, shadow, water, and snow. This mask is produced by a multi-pass algorithm (called 'CFMask') based on decision trees and is described in detail by Foga et al. [31]. Pixels identified as cloud, shadow, water, or snow were excluded when producing the mean composite pre- and post-fire NBR. The resulting pre- and post-fire NBR mean composite images are then used to calculate dNBR, RdNBR, and RBR (Equations (2)-(4)). Our mean compositing approach thus eliminates the need for a priori scene selection.
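The per-pixel logic of the mean composite can be illustrated with a minimal sketch in plain Python. It is not the GEE code distributed with the paper; the function names and sample values are hypothetical, masked observations are represented as None, and the factor of 1000 and the 0.001 floor are assumptions taken from the commonly published forms of the metrics.

```python
from typing import Optional, Sequence


def nbr(nir: float, swir: float) -> float:
    """Normalized burn ratio for a single observation (Equation (1))."""
    return (nir - swir) / (nir + swir)


def mean_composite(stack: Sequence[Optional[float]]) -> Optional[float]:
    """Mean NBR over the valid observations of one pixel.

    None marks an observation masked as cloud, shadow, water, or snow.
    Returns None when no valid observation remains in the stack.
    """
    valid = [v for v in stack if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


def dnbr(pre: float, post: float) -> float:
    """Delta NBR; the scaling by 1000 is an assumed convention."""
    return (pre - post) * 1000.0


def rdnbr(dnbr_value: float, pre: float, floor: float = 0.001) -> float:
    """Relativized dNBR; |pre| is floored (assumed 0.001) to avoid division by zero."""
    return dnbr_value / (max(abs(pre), floor) ** 0.5)


def rbr(dnbr_value: float, pre: float) -> float:
    """Relativized burn ratio."""
    return dnbr_value / (pre + 1.001)


# Hypothetical NBR stacks for one pixel; None = masked observation.
pre_stack = [0.62, None, 0.58, 0.60]
post_stack = [0.10, 0.14, None, None]

pre_nbr = mean_composite(pre_stack)    # mean of three valid values
post_nbr = mean_composite(post_stack)  # mean of two valid values
pixel_dnbr = dnbr(pre_nbr, post_nbr)
print(pixel_dnbr, rdnbr(pixel_dnbr, pre_nbr), rbr(pixel_dnbr, pre_nbr))
```

Because every valid observation contributes to the mean, a single unusual scene has less influence on the result than it would in a two-scene difference.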
We also produced alternative versions of each severity metric in which we account for potential phenological differences between pre- and post-fire imagery, also known as the 'dNBR offset' [6]. The dNBR offset is the average dNBR of pixels outside the burn perimeter (i.e., unburned) and is intended to account for differences between pre- and post-fire imagery that arise due to varying conditions in phenology or precipitation between respective time periods. Incorporating the dNBR offset is advisable when making comparisons among fires [7,8]. For each fire, we determined the dNBR offset by calculating the mean dNBR value across all pixels located 180 m outside of the fire perimeter; informal testing indicated that a 180 m distance threshold adequately quantifies dNBR differences among unburned pixels. The offset is incorporated by subtracting the fire-specific dNBR offset from each dNBR raster [17]. The dNBR (with the offset) is then used to produce RdNBR and RBR (Equations (3) and (4)).
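As a hedged illustration of the offset step, the sketch below treats the unburned ring and the burned area as plain lists of dNBR values. The function names and numbers are hypothetical, and the spatial selection of pixels 180 m outside the perimeter is assumed to have been done beforehand.

```python
from typing import List, Sequence


def dnbr_offset(ring_dnbr: Sequence[float]) -> float:
    """Mean dNBR of the unburned pixels in the ring outside the fire perimeter."""
    if not ring_dnbr:
        raise ValueError("no unburned ring pixels available for the offset")
    return sum(ring_dnbr) / len(ring_dnbr)


def apply_offset(fire_dnbr: Sequence[float], offset: float) -> List[float]:
    """Subtract the fire-specific offset from every dNBR value of the fire."""
    return [value - offset for value in fire_dnbr]


# Hypothetical values: three unburned ring pixels and two burned pixels.
ring = [20.0, 30.0, 40.0]
burned = [480.0, 130.0]

offset = dnbr_offset(ring)
adjusted = apply_offset(burned, offset)
print(offset, adjusted)
```

The adjusted dNBR values would then feed into RdNBR and RBR in place of the unadjusted dNBR.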

Validation
We aimed to determine whether our GEE methodology (specifically the mean compositing method) produced Landsat-based fire severity datasets with equivalent or higher validation statistics than severity datasets produced using one pre-fire and one post-fire scene (i.e., the standard approach since these metrics were introduced). This validation has three components (described below), all of which rely on 1681 field-based severity plots covering 18 fires in the western US that burned between 2001 and 2011; these are the same plots and fires that were originally evaluated by Parks et al. [8] (Figure 1; Table 1). The field data represent the composite burn index (CBI) [6], which rates factors such as surface fuel consumption, soil char, vegetation mortality, and scorching of trees. CBI is rated on a continuous scale from zero to three, with CBI = 0 reflecting no change due to fire and CBI = 3 reflecting the highest degree of fire-induced ecological change. The fires selected by Parks et al. [8] and used in this study (Table 1) met the following criteria: (i) they had at least 40 field-based CBI plots; and (ii) at least 15% of the plots fell into each class representing low, moderate, and high severity. Of the 1681 field-based CBI plots, 30% are considered low severity (CBI < 1.25), 41% are moderate severity (CBI ≥ 1.25 and < 2.25), and 29% are high severity (CBI ≥ 2.25).
The first validation evaluates the correspondence of each severity metric to the CBI data for each fire. Exactly following Parks et al. [8], we extracted GEE-derived dNBR, RdNBR, and RBR values using bilinear interpolation and then used nonlinear regression in the R statistical environment [32] to evaluate the performance of each severity metric. Specifically, we quantified the correspondence of each severity metric (the dependent variable) to CBI (the independent variable) as the coefficient of determination, which is the R² of a linear regression between predicted and observed severity values. We conducted this analysis for each fire and reported the mean R² across the 18 fires. We then conducted a parallel analysis but used MTBS-derived severity datasets. This parallel analysis allows for a robust comparison of severity datasets produced using one pre-fire and one post-fire image (e.g., MTBS-derived metrics) with the mean compositing approach as achieved with GEE. This validation was conducted on the severity metrics without and with the dNBR offset.
Our second validation is nearly identical to that described in the previous paragraph, but plot data from all 18 fires were combined (n = 1681). That is, instead of evaluating on a per-fire basis, we evaluated the plot data from all fires simultaneously. Following Parks et al. [8], this evaluation used a five-fold cross-validation. That is, five evaluations were conducted with 80% of the plot data used to train each nonlinear model and the remaining 20% used to test each model. The resulting coefficients of determination (R²) and standard errors for the five testing datasets were averaged.
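The two building blocks of these validations, the R² between predicted and observed values and the five-fold split, can be sketched as follows. This is plain Python rather than the R workflow used in the study; the function names and the random seed are hypothetical, and the nonlinear model itself is deliberately left out because its functional form is not given here.

```python
import random
from typing import List, Sequence, Tuple


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """R2 of a simple linear regression between predicted and observed values.

    For a one-predictor linear regression this equals the squared
    Pearson correlation, which is what is computed here.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return (cov * cov) / (var_o * var_p)


def five_fold_indices(n: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle plot indices and return five (train, test) index splits.

    Each split trains on roughly 80% of the plots and tests on the rest;
    every plot appears in exactly one test fold.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::5] for k in range(5)]
    splits = []
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, test))
    return splits


splits = five_fold_indices(1681)
print([len(test) for _, test in splits])
```

In a full workflow, a model would be fitted on each training split, R² computed on the matching test split with r_squared, and the five values averaged.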
The third validation evaluates the classification accuracy when categorizing the satellite- and field-derived severity datasets into three discrete classes representing low, moderate, and high severity. To do so, we grouped the CBI plot data into severity classes using well-recognized CBI thresholds: low severity corresponds to CBI values ranging from 0-1.24, moderate severity from 1.25-2.24, and high severity from 2.25-3.0 [7]. We then identified thresholds specific to each metric (with and without incorporating the dNBR offset) corresponding to the low, moderate, and high CBI thresholds using nonlinear regression models as previously described. However, the nonlinear models used to produce low, moderate, and high severity thresholds for this evaluation used all 1681 plots combined and did not use the cross-validated versions. We measured the classification accuracy (i.e., the percent correctly classified) with 95% confidence intervals using the 'caret' package [34] in the R statistical environment [32]. We also produced confusion matrices for each severity metric and report the user's and producer's accuracy for each severity class (low, moderate, and high).
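The classification step can be illustrated with a small plain-Python sketch. The CBI thresholds are those given in the text; the metric thresholds (100 and 300) and the sample plots are purely hypothetical placeholders, not the values in Table 7, and confidence intervals are not computed here.

```python
from typing import Dict, Sequence

CLASSES = ("low", "moderate", "high")


def cbi_class(cbi: float) -> str:
    """Severity class from the composite burn index."""
    if cbi < 1.25:
        return "low"
    if cbi < 2.25:
        return "moderate"
    return "high"


def metric_class(value: float, low_mod: float, mod_high: float) -> str:
    """Severity class from a satellite metric and two class thresholds."""
    if value < low_mod:
        return "low"
    if value < mod_high:
        return "moderate"
    return "high"


def confusion_matrix(reference: Sequence[str],
                     classified: Sequence[str]) -> Dict[str, Dict[str, int]]:
    """Counts indexed as matrix[classified_class][reference_class]."""
    matrix = {c: {r: 0 for r in CLASSES} for c in CLASSES}
    for ref, cls in zip(reference, classified):
        matrix[cls][ref] += 1
    return matrix


def overall_accuracy(matrix: Dict[str, Dict[str, int]]) -> float:
    """Fraction of plots whose classified class matches the reference class."""
    total = sum(sum(row.values()) for row in matrix.values())
    return sum(matrix[c][c] for c in CLASSES) / total


def users_accuracy(matrix: Dict[str, Dict[str, int]], cls: str) -> float:
    """Correct plots divided by all plots classified as cls."""
    row_total = sum(matrix[cls].values())
    return matrix[cls][cls] / row_total if row_total else float("nan")


def producers_accuracy(matrix: Dict[str, Dict[str, int]], cls: str) -> float:
    """Correct plots divided by all reference plots in cls."""
    col_total = sum(matrix[c][cls] for c in CLASSES)
    return matrix[cls][cls] / col_total if col_total else float("nan")


# Hypothetical plots: field CBI and a satellite metric value per plot.
cbi_values = [0.5, 1.0, 1.5, 2.0, 2.5, 2.8]
metric_values = [50.0, 150.0, 200.0, 250.0, 350.0, 280.0]

reference = [cbi_class(v) for v in cbi_values]
classified = [metric_class(v, 100.0, 300.0) for v in metric_values]
matrix = confusion_matrix(reference, classified)
print(overall_accuracy(matrix), users_accuracy(matrix, "moderate"))
```

User's accuracy answers how often a mapped class is right on the ground, whereas producer's accuracy answers how often a ground class is captured by the map, which is why the two can diverge for the same class.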
Finally, it is worth noting that we did not directly use the fire severity datasets distributed by the MTBS program. Our reasoning is that the MTBS program does not distribute the RBR. Furthermore, the MTBS program incorporates the dNBR offset into the RdNBR product but does not distribute RdNBR without the dNBR offset. The MTBS program does, however, distribute the imagery used to produce each fire severity metric. In order to make valid comparisons to the GEE-derived datasets, we opted to use the pre- and post-fire imagery distributed by the MTBS program to produce dNBR, RdNBR, and RBR, with and without the dNBR offset, for each of the 18 fires. All processing of MTBS-derived fires was accomplished with the 'raster' package [35] in the R statistical environment [32].

Google Earth Engine Implementation and Code
We provide sample code and a geospatial fire history layer to produce a total of six raster datasets (dNBR, RdNBR, and RBR; with and without the dNBR offset) for each of the 18 previously described fires. This code produces severity datasets that are clipped to a bounding box representing the outer extent of each fire. We designed the code to use imagery from one year before and one year after each fire occurs and to use a pre-specified date range for image selection for each fire, as previously described. These parameters can easily be modified to suit the needs of different users, ecosystems, and fire regimes.
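To make the date parameters concrete, the following sketch derives the pre- and post-fire image windows for a fire from its year and state. It is plain Python and not the distributed GEE code; the function names and the use of two-letter state codes as a fire attribute are assumptions made for illustration.

```python
import calendar
from datetime import date
from typing import Dict, Tuple

# States that use the April-June window, per the date ranges described earlier.
SOUTHWEST_STATES = {"AZ", "NM", "UT"}


def season_months(state: str) -> Tuple[int, int]:
    """First and last month of the image selection window for a state."""
    return (4, 6) if state in SOUTHWEST_STATES else (6, 9)


def image_windows(fire_year: int, state: str) -> Dict[str, Tuple[date, date]]:
    """Pre-fire window (year before) and post-fire window (year after)."""
    start_month, end_month = season_months(state)

    def window(year: int) -> Tuple[date, date]:
        last_day = calendar.monthrange(year, end_month)[1]
        return date(year, start_month, 1), date(year, end_month, last_day)

    return {"pre": window(fire_year - 1), "post": window(fire_year + 1)}


# Hypothetical fire that burned in New Mexico in 2006.
print(image_windows(2006, "NM"))
```

Adapting the approach to another ecosystem would amount to changing the month pairs or the year offsets in this kind of lookup.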

Results
Using GEE, we were able to quickly produce dNBR, RdNBR, and RBR (with and without the dNBR offset) for the 18 fires analyzed. The entire process was completed in approximately 1 h; fires averaged about 15,000 hectares in size and ranged from 723 to 60,000 hectares. This timeframe included a few minutes of active, hands-on time and about 60 min of GEE computational processing. This timeframe should be considered a very rough estimate, however, because GEE processing time varies widely among fires (larger fires require more computational processing) and because production time depends on available resources shared among users within GEE's cloud-based computing platform [36]. Nonetheless, processing is fast and requires little human labor.
The mean compositing approach, in conjunction with the exclusion of pixels classified as cloud, shadow, snow, and water, resulted in a variable number of valid Landsat scenes used in producing each pre- and post-fire NBR image. The average number of stacked pixels used to produce pre- and post-fire NBR was about 11. This varied by fire and ranged from 2 to 20 for pre-fire NBR and from 6 to 20 for post-fire NBR.
Our first validation, in which correspondence between CBI and each severity metric was computed independently for each fire, shows little improvement of the GEE-derived over the MTBS-derived fire severity metrics (Table 2). When the correspondence between CBI and each severity metric for 1681 plots covering 18 fires was evaluated simultaneously using a five-fold cross-validation (our second evaluation), the R² was consistently higher for the GEE-derived fire severity datasets as compared to the MTBS-derived datasets (Table 3; Figure 2). Furthermore, the inclusion of the dNBR offset increased the correspondence to CBI for all fire severity metrics except for GEE-derived RdNBR (Table 3). All terms in the nonlinear regressions for all severity metrics (those with and without the dNBR offset) were statistically significant (p < 0.05) in all five folds of the cross-validation.

Table 3. R² of the five-fold cross-validation of the correspondence between CBI and each MTBS- and GEE-derived fire severity metric for 1681 plots across 18 fires; standard error shown in parentheses. The values characterize the average of five folds and represent the severity metrics excluding and including the dNBR offset.

The GEE-derived fire severity datasets generally resulted in an improvement over the comparable MTBS-derived datasets in terms of overall classification accuracy (Table 4); inclusion of the dNBR offset provided additional improvement for the most part (Table 4). The only exception is for the GEE-derived RdNBR, in which the classification accuracy was slightly lower when using the dNBR offset (Table 4). The confusion matrices for each fire severity metric (with and without the dNBR offset) indicate that the user's and producer's accuracies are usually higher with the GEE-derived metrics compared to the MTBS-derived metrics (Tables 5 and 6). The thresholds we used to classify plots as low, moderate, or high severity are shown in Table 7; these may be useful for others who implement our GEE methodology and want to classify the resulting datasets.

Figure 2. Plots show each MTBS- (top row) and GEE-derived (bottom row) severity metric and the corresponding field-based CBI. All severity metrics include the dNBR offset. Red lines show the modeled fit of the nonlinear regressions for all 1681 plots. The model fits and the resulting R² shown here were not produced using cross-validation and therefore may differ slightly from the results shown in Table 3. Extreme RdNBR values are not shown to improve visual appearance of the RdNBR panels.

Table 4. Classification accuracy (percent correctly classified) and 95% confidence intervals (CI) for the three fire severity metrics (with and without the dNBR offset). Each fire severity metric is classified into categories representing low, moderate, and high severity based on index-specific thresholds (see Table 7) and compared to the same classes based on composite burn index thresholds.

Table 5. Confusion matrices for classifying as low, moderate, and high severity using the severity metrics computed without the dNBR offset. Confusion matrices for MTBS-derived metrics are on the left and confusion matrices for GEE-derived metrics are on the right. UA: user's accuracy; PA: producer's accuracy.

Table 6. Confusion matrices for classifying as low, moderate, and high severity using the severity metrics computed with the dNBR offset. Confusion matrices for MTBS-derived metrics are on the left and confusion matrices for GEE-derived metrics are on the right. UA: user's accuracy; PA: producer's accuracy.

Discussion
The Google Earth Engine (GEE) methodology we developed to produce Landsat-based measures of fire severity is an important contribution to wildland fire research and monitoring. For example, our methodology will allow those who are not remote sensing experts, but have some familiarity with GEE, to quickly produce fire severity datasets (Figure 3). This benefit is due to the efficiency and speed of the cloud-based GEE platform [37,38] and because no a priori scene selection is necessary. Furthermore, compared to the standard approach in which only one pre-fire and one post-fire scene are used, the GEE mean composite fire severity datasets exhibit higher validation statistics in terms of the correspondence (R²) to CBI and higher classification accuracies for most severity classes. This suggests that mean composite severity metrics more accurately represent fire-induced ecological change, likely because the compositing method is less biased by pre- and post-fire scene mismatch and image characteristics inherent in standard processing. The computation and incorporation of the dNBR offset within GEE further improves, for the most part, the validation statistics of all metrics.
The improvements in the validation statistics of the GEE-derived severity metrics over the MTBS-derived severity metrics, when evaluated on a per-fire basis, are largely negligible (see Table 2). This suggests that if practitioners and researchers are interested in only one fire [20,39], it does not matter if fire severity metrics are produced using the mean compositing approach or using one pre-fire and one post-fire image (e.g., MTBS). It is also worth noting that the improvements in the validation statistics of the GEE-derived severity metrics over the MTBS-derived severity metrics, when all plots are evaluated simultaneously, are not statistically significant in most cases. That is, the overall classification accuracy of the GEE-derived metrics overlaps the 95% confidence intervals of the MTBS-derived metrics in all comparisons except that of RdNBR without the dNBR offset (Table 4). Although the user's and producer's accuracies are oftentimes higher for the GEE-derived severity metrics (Tables 5 and 6), this is not the case for all severity classes. In particular, the producer's accuracy (but not the user's accuracy) is generally higher for the MTBS-derived metrics when evaluating the high severity class. Nevertheless, the modest improvement in most validation statistics of the GEE-derived metrics, together with the framework and code we distribute in this study, will likely provide the necessary rationale and tools for producing fire severity datasets in countries that do not have national programs tasked with producing such datasets (e.g., MTBS in the United States).
The Monitoring Trends in Burn Severity (MTBS) program in the US, which produces and distributes Landsat-based fire severity datasets [19], has enabled scientists to conduct research involving hundreds to thousands of fires [2,24,40,41]. Outside of the US, where programs similar to MTBS do not exist, most fire severity research is limited to only a handful of fires, the exceptions being Fang et al. [15] (n = 72 fires in China) and Whitman et al. [42] (n = 56 fires in Canada). We suggest that the GEE methodology we developed will allow users in regions outside of the US to efficiently produce fire severity datasets for hundreds to thousands of fires in their geographic areas of interest, thereby providing enhanced opportunities for fire severity monitoring and research. Although fire history datasets (i.e., georeferenced fire perimeters) are a prerequisite for implementing our GEE methodology, such datasets have already been produced and used in scientific studies in Portugal [43], Spain [44], Canada [45], portions of Australia [46], southern France [47], the Sky Island Mountains of Mexico [48], and likely elsewhere. Therefore, the GEE methods developed here provide a common platform for assessing fire-induced ecological change and can provide more opportunities for fire severity monitoring and research across the globe.
The fires we analyzed primarily burned in conifer forests and were embedded within landscapes composed of similar vegetation. As such, our approach to incorporating the dNBR offset, which used pixels in a 180 m 'ring' around the fire perimeter, may not be appropriate everywhere, and we urge caution in landscapes in which fires burn vegetation that is not similar to that of the surrounding lands. For example, our methods for calculating and implementing the dNBR offset would not be appropriate if a fire burned a forested patch that was surrounded by completely different vegetation such as shrubland or agriculture. In such cases, we recommend that fire severity datasets exclude the dNBR offset as it may not improve burn assessments. Similarly, the low, moderate, and high severity thresholds identified in this study (Table 7) are likely only applicable to forested landscapes in the western US, and other thresholds may be more suitable to other regions of the globe and in different vegetation types. Finally, our choice of developing post-fire imagery from the period one year after the fire may not be appropriate for all ecosystems. Arctic tundra ecosystems, for example, might be better represented by imagery derived immediately after the fire or after snowmelt but prior to green-up the year following the fire [49]. The GEE approach can be easily modified to select dates that best suit each ecosystem.

Conclusions
In this paper, we present a practical and efficient methodology for producing three Landsat-based fire severity metrics: dNBR, RdNBR, and RBR. These methods rely on Google Earth Engine and provide expanded potential in terms of fire severity monitoring and research in regions outside of the US that do not have a dedicated program for mapping fire severity. In validating the fire severity metrics, our goal was not to compare and contrast individual metrics (e.g., dNBR vs. RBR) [11,12] nor to critique products produced by the MTBS program. Instead, we aimed to evaluate differences between the GEE-based mean compositing approach and the standard approach in which one pre-fire and one post-fire Landsat scene are used to produce severity datasets. The GEE-based severity datasets generally achieved higher validation statistics in terms of correspondence to field data and overall classification accuracy. The inclusion of the dNBR offset generally provided additional improvements in these validation statistics for most fire severity metrics regardless of whether they were MTBS- or GEE-derived. This provides further evidence that inclusion of the dNBR offset should be considered when multiple fires are of interest [8,17]. Our evaluation included fires over a large spatial extent (the western US) and with varied fire regime attributes, ranging from those that are predominantly surface fire regimes to those that are stand-replacing regimes. Consequently, the higher validation statistics reported here for the GEE-derived composite-based fire severity datasets should provide researchers and practitioners with increased confidence in these products.
Author Contributions: S.A.P. conceived of the study, conducted the statistical validations, and wrote the paper. L.M.H. aided in designing the study, developed GEE code, and contributed to manuscript writing. R.A.L. aided in designing the study and contributed to manuscript writing. M.A.V. and N.P.R. developed GEE code and contributed to manuscript writing.
Funding: This research was partially funded by an agreement between the US Geological Survey and US Forest Service.