1. Introduction
What is a fire? Defining the spatial and temporal boundaries of fire events is critical for understanding the drivers and trends in fires [
1], ecological consequences [
2], and adaptation implications [
3]. Answering this question is fundamental to defining fire regimes, and the spatial and temporal characteristics of fire events in a strict sense [
4,
5,
6], i.e., size, frequency, intensity, severity, seasonality, duration, and rate of spread. Remote sensing has increased our capacity to quantify some of these characteristics at large spatial scales, such as frequency, intensity, size, and severity [
7,
8,
9]. However, there is an even greater potential to inform our understanding of changing fire and resilience of ecosystems and society if we are able to delineate events in remote sensing fire products that preserve the temporal characteristics. We can then better understand whether ecosystem state transitions depend on fire intensity and speed or how communities in the wildland-urban interface (WUI) may be vulnerable to rapid fire spread.
There are generally three classes of information that satellite sensors capture about fire activity: active fires based on thermal threshold exceedance [
10,
11,
12], fire radiative power as a metric of heat flux [
13,
14,
15], and a burned area derived from a change detection algorithm [
16,
17,
18], sometimes also informed by active fire detections [
19]. These fire properties are estimated at the pixel level, which ranges in size for these products from 10s to 1000s of meters. To explore fire activity patterns, these pixel-level detections are aggregated in some way, necessitating the assumption of homogeneous fire characteristics across that pixel. Global burned area products tend to underestimate total burned area due to missing small fires [
20] and within-fire burned area due to underestimation of burned areas within an event [
21]. Further, global scale studies explore total burned area summed across larger units or the density of hot pixels as a metric of fire frequency [
8,
22,
23,
24], which leaves an understanding of actual events missing. Given the abundance of satellite fire data (e.g.,
Table 1), and that they do not “see” the same aspects of fire [
12], we fundamentally need landscape-scale event delineation to integrate across products and build a greater understanding of how fire regimes vary at regional and global scales [
25]. With event-level delineations, we can then also calculate a critical but less understood property of fire regimes—the daily fire spread rate. MODIS-based burned area products [
26,
27] use sub-daily images to estimate the date a pixel burned. As such, they are uniquely suited to provide estimates of fire spread rate and duration, but only if we can say which pixels are all part of the same event. There have been some attempts to characterize fire spread using active fire products, but the code and resulting data products are not publicly available [
28]. Defining events from MODIS-based products enables capturing fire events, from small to large, at a global scale, providing key metrics on fire regimes and how they are changing.
There are several different approaches for delineating fire events based on proximity of burned area or hot pixels in space and time. Some studies have clustered the MODIS active fire hotspots (MODIS MOD14) to delineate events in Europe and northern Africa [
29] and Indonesian tropical rainforests [
26,
27] to understand what drives large fires. Others have used clustering of MODIS burned area (MODIS MCD64) pixels [
7,
8,
30,
31]. Most studies require pixel adjacency (
Table 1), but more relaxed spatial criteria facilitate exploring fires that have unburned patches within their perimeters, which are critical refugia necessary for regeneration [
21]. This approach is also less likely to over-segment events that are imperfectly detected, due to low fire severity or cloudiness, for example.
Given the number of studies that use the MODIS burned area product (e.g., studies in
Table 1) and emerging new fire data products [
12,
17,
25,
26,
27,
32] that conduct some sort of event delineation as part of the processing, there is a great need to develop an open and well-documented algorithm for defining fire events from remotely sensed detections of fire. Moreover, event delineation enables joining of different data products to build a more complete picture of regional and global fire. Better delineation of the boundaries of events could lead to better estimates of total burned area, as well as exploration of derived spatiotemporal metrics around events that constitute the fire regime (e.g., event size, event shape, ignition point, unburned refugia within a fire, and fire spread rate). Many of the algorithms that have been developed previously were used and optimized for one specific analysis, and the code was not published for further development and reuse [
7,
33,
34,
35]. Furthermore, decisions were often made that lessened the computational cost, but relied on assumptions that are often not universally applicable. Most notably, data were often aggregated into a single yearly layer, which resulted in the artificial aggregation of pixels that burned multiple times in one year, and the artificial segmentation of events that started in one year and ended in the next [
34,
35,
36].
Further, there is a need to better validate the temporal and spatial thresholds, as this selection can substantially alter the number of detected fire events. Fire metrics can be sensitive to how boundaries are delineated [
37]. Moreover, we expect the optimum temporal and spatial thresholds to vary based on size distribution and spread differences that will vary across ecoregions (e.g., fast, large grassland fires vs. small, slow temperate forest fires) and land use types (e.g., agricultural fires vs. deforestation fires). Even so, ground-based delineations of fire perimeters also have their challenges, and incident command delineations may overestimate wildfire perimeters, as delineating unburned patches is difficult on the ground. Additionally, multiple fire patches may start independently and in proximity (e.g., when a lightning storm starts multiple events), which then merge into one fire complex.
Here, we (i) develop an open, refined, and adaptable algorithm for defining events; (ii) derive events and companion metrics for fires in the CONUS from the MODIS MCD64 burned area product, based on the optimum spatial and temporal thresholds; (iii) validate the MODIS-derived events against the Monitoring Trends in Burn Severity (MTBS) product, which is manually derived from Landsat imagery [
38]; and (iv) demonstrate how defining events enables us to explore additional metrics of the fire regime across the US. Here, we define an event [
39] as a geographic concept with delineated spatial and temporal boundaries around a specific phenomenon that is homogeneous in some property and distinct from adjacent areas. The algorithm is designed in a way that makes it adaptable to data source, regional context, and even event type: the spatiotemporal criteria can be altered, and it can be used with newer burned area or active fire products (e.g., Fire_cci based on MODIS images at 250 m resolution [
26] or VIIRS [
12]), or even different types of phenomena (e.g., bark beetle outbreaks, or floods).
2. Materials and Methods
2.1. Study Area and Data Acquisition and Processing
The study area was CONUS. We chose this study area because of the availability of other fire datasets like MTBS [
38], which we were able to use to gauge the accuracy of our aggregation of burned pixels to events from the MCD64 dataset. We used the MODIS Collection 6 MCD64 burned area product [
27] [available at
ftp://fuoco.geog.umd.edu/MCD64A1/C6/]. These data contain five layers at 500-m resolution: burn date, first day, last day, a quality assessment, and error. The data are available worldwide, via a sinusoidal projection that is divided into 648 tiles (268 of which are terrestrial), each with 2400 rows and columns at 463-m resolution. We downloaded the entire monthly time series available for each tile that overlaps with CONUS, and extracted the burn date layer.
2.2. Accounting for Pixels That Burn More Than Once Per Year (Intra-Annual Reburns)
Some other studies that have aggregated pixels into fire events from the MODIS burned area product have aggregated the input data to a yearly time-step [
34,
35], taking either the earliest or latest burn date in the case of pixels that burn twice in one year. This assumes a minimal occurrence of pixels that actually burn twice in one year (e.g., the land burns first in spring and then again in fall). Aggregating the monthly data to yearly time steps makes the processing of the data much less complex and computationally costly (i.e., it allows for a two-dimensional moving window). However, aggregation at a yearly timescale presents two problems. First, the occurrence of pixels that burn more than once within a year would result in separate events being collapsed, resulting in an underestimate of burned area for the study area and an overestimate of duration. Second, fires that burn from one year to the next become arbitrarily split into two events.
Prior efforts have justified ignoring intra-year or intra-season reburns based on an occurrence of around 1% [
34,
35]. However, we found that when we examined the study area tile by tile, some areas experienced much greater rates of intra-year reburns. To investigate whether reburned pixels would have a confounding effect on our data, we examined the occurrence of pixels that burned multiple times per year for each of the tiles overlapping CONUS for each year. We converted each monthly tile in CONUS to binary (1 for burned, 0 for unburned), summed the monthly layers for each pixel within each year, and calculated the percentage of pixels that burned more than once per year, per tile. For all of CONUS (2001–2018) except the tile that contains Florida, there were a total of 12,676 pixels that burned more than once in a given year, or about 0.48% of pixels. The tile that includes Florida (h10v06), however, had a rate of 5% (sd 2.3%) of pixels that burned multiple times per year (
Table 2). We suspect that this high reburn occurrence is due to the year-round growing season combined with year-round occurrence of lightning strikes and human ignition pressure. Intra-year reburns would present a problem if this algorithm were expanded globally, because there are many ecosystems, especially in the tropics, with year-round growing seasons combined with year-round anthropogenic ignition sources.
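The reburn check described above can be illustrated with a minimal sketch. It is not the published FIRED code: the function name, the array layout (12 monthly layers per tile-year), the coding of unburned pixels as values of 0 or less, and the choice of burned pixels as the denominator are all assumptions made for illustration.

```python
import numpy as np


def reburn_percentage(monthly_burn_dates):
    """Percentage of burned pixels in one tile-year detected in more than one month.

    monthly_burn_dates: array of shape (12, rows, cols). Values > 0 are burn
    dates (day of year); values <= 0 mean unburned or no data (assumed coding).
    The denominator is the number of pixels that burned at least once (assumption).
    """
    burned = (np.asarray(monthly_burn_dates) > 0).astype(np.uint8)  # 1 = burned, 0 = unburned
    burns_per_pixel = burned.sum(axis=0)  # number of months with a detection
    n_burned = np.count_nonzero(burns_per_pixel >= 1)
    n_reburned = np.count_nonzero(burns_per_pixel > 1)
    if n_burned == 0:
        return 0.0
    return 100.0 * n_reburned / n_burned


# Tiny synthetic tile-year (2 x 2 pixels), for illustration only.
year = np.zeros((12, 2, 2), dtype=np.int16)
year[2, 0, 0] = 75    # pixel (0, 0) burns in March ...
year[9, 0, 0] = 290   # ... and again in October
year[6, 0, 1] = 200   # pixel (0, 1) burns once
year[7, 1, 1] = 230   # pixel (1, 1) burns once
pct = reburn_percentage(year)
```

In this synthetic example, one of the three burned pixels is detected in two different months, so it counts as an intra-annual reburn.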
Because of the relatively high reburn occurrence, and also due to concern over segmenting winter fires into multiple events, we decided not to aggregate the input rasters by year or fire season. Instead, we created a space-time cube for each tile spanning the entire monthly time series, where the Julian day of the year for each pixel in each month layer was converted to the number of days since 1 January 1970.
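As an illustration of this date conversion, the following sketch maps a year and day of year onto a continuous day count starting on 1 January 1970. The function name is hypothetical, and the snippet uses only the Python standard library rather than the authors' implementation.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def days_since_epoch(year, day_of_year):
    """Convert a (year, day-of-year) burn date to days since 1 January 1970."""
    burn_date = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    return (burn_date - EPOCH).days


# The last day of 2013 and the first day of 2014 become consecutive integers,
# so a fire burning across the new year is not split by the calendar.
last_2013 = days_since_epoch(2013, 365)
first_2014 = days_since_epoch(2014, 1)
```

Because the resulting values are continuous across month and year boundaries, a temporal threshold in days can be applied directly to differences between them.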
2.3. Defining Events with a Flexible, Fast Algorithm
We created a flexible, fast algorithm that automatically downloads, processes, defines events, and calculates summary statistics for the entire coterminous United States (likely within ~30 min on a normal laptop). To define events, we used a three-dimensional moving window to aggregate burned pixels into distinct events. The algorithm takes as input a spatial variable, representing the number of pixels, and a temporal variable, representing the number of days, within which to group burn detections. It then aggregates by assigning each burned pixel an event identification number.
The data processing script downloads the entire time series of HDF files from the ftp server, extracts the burn date layer from each monthly tile, and adds them to a three-dimensional netCDF data cube. We used this data structure to maximize efficiency and speed. The event perimeter script reads the netCDF file for each tile, where each band represents one month, and for each burned pixel the date of fire detection is represented as the number of days since 1 January 1970. The netCDF file is converted into a three-dimensional array, and the moving window traverses the array. To avoid unnecessary computation, we did not check cells in which there was no burned area assignment throughout the study period.
For each cell of the three-dimensional array where at least one fire detection occurred, the program creates a mask identifying all burned pixels that fall within the spatial and temporal range of the current cell. If the current cell is already part of an existing event, any new burned pixels are assigned the event ID for that event. If it is a new event, the current cell and all overlapping cells are given the next sequential event ID. If there are multiple event IDs within the mask, it means two perimeters have grown together and they are merged into the first event ID. After the event perimeters are delineated within each tile, all event perimeters that potentially overlap with an adjacent tile are flagged. After all tiles are processed, the flagged events are partitioned and those that overlap spatially and temporally are merged. Finally, events across all tiles are merged into a final dataset and given a new sequential event ID.
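The grouping logic in this paragraph can be sketched for a single tile as follows. This is a simplified illustration under stated assumptions, not the FIRED implementation: the function name and array layout are hypothetical, unburned cells are assumed to be coded as 0, the window searches every band rather than a limited temporal slice, and the edge-flagging and cross-tile merging steps are omitted.

```python
import numpy as np


def delineate_events(cube, spatial, temporal):
    """Assign an event ID to every burned cell of a (bands, rows, cols) cube.

    cube: burn dates as days since 1 January 1970, with 0 meaning unburned
    (assumed coding). spatial is the window half-width in pixels and temporal
    the window half-width in days. Returns an array of event IDs (0 = unburned).
    """
    cube = np.asarray(cube, dtype=np.int64)
    events = np.zeros(cube.shape, dtype=np.int64)
    _, rows, cols = cube.shape
    next_id = 1
    for b, r, c in np.argwhere(cube > 0):  # skip cells that never burned
        day = cube[b, r, c]
        r0, r1 = max(r - spatial, 0), min(r + spatial + 1, rows)
        c0, c1 = max(c - spatial, 0), min(c + spatial + 1, cols)
        window = cube[:, r0:r1, c0:c1]
        # Burned cells within the spatial and temporal range of the current cell.
        mask = (window > 0) & (np.abs(window - day) <= temporal)
        ids = events[:, r0:r1, c0:c1]  # view into the event ID array
        existing = np.unique(ids[mask])
        existing = existing[existing > 0]
        if existing.size == 0:
            event_id = next_id  # start a new event
            next_id += 1
        else:
            event_id = existing[0]
            for other in existing[1:]:  # perimeters have grown together
                events[events == other] = event_id
        ids[mask] = event_id
    return events


# One row of 12 pixels: a three-pixel chain, a nearby but much later burn,
# and a distant burn (synthetic values for illustration).
cube = np.zeros((1, 1, 12), dtype=np.int64)
cube[0, 0, 0], cube[0, 0, 2], cube[0, 0, 4] = 100, 103, 106
cube[0, 0, 5] = 200
cube[0, 0, 10] = 104
events = delineate_events(cube, spatial=2, temporal=5)
n_events = np.unique(events[events > 0]).size
```

Merged events leave gaps in the ID sequence in this sketch, which is one reason a final renumbering step, as described above, is useful.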
2.4. Sensitivity Analysis: Identifying the Optimal Spatiotemporal Parameters for Delineating CONUS Fire Events
To find which combination of spatial and temporal parameters best defined fire events for CONUS, we assessed how well the FIRED outputs matched fire perimeters from MTBS [
38]. MTBS is a dataset of fire perimeters from 1984–2016 derived from Landsat satellite data. It has a minimum size threshold of 404 ha in the western US and 202 ha in the eastern US (separated by the 97th meridian). It documents 21,673 fire events throughout the entire US, and 13,741 in the overlapping study area and timeframe, beginning in 2001. One problematic feature of the MTBS data for this comparison is that fire complexes are not dealt with uniformly. Fire complexes are “two or more individual incidents located in the same general area which are assigned to a single incident commander or unified command [
40].” In some cases, each fire patch is assigned its own ID number and is represented as a single perimeter, and in other cases these complexes are lumped into a multipolygon with a single ID number. To address this issue, we split all multipolygons into single polygons, assigned unique ID numbers to each polygon, and then calculated the area for each individual polygon. This way, our sensitivity analysis would objectively assess how individual polygons matched, without the confounding factor of aggregated multipolygons.
We ran the fire event classifier for all spatiotemporal combinations between 1–15 days and 1–15 pixels (463–6945 m), resulting in 225 spatiotemporal combinations for CONUS. For each combination we matched the FIRED events that were >404 ha in the west and >202 ha in the eastern US to the associated MTBS wildfire perimeter.
An accuracy assessment was conducted for each spatiotemporal combination of the FIRED events, based on how well they matched the MTBS events. For each unique fire polygon in the MTBS database, we extracted the ID numbers for each FIRED event overlapping the MTBS polygon. Then, for each unique FIRED event, we extracted each MTBS ID that overlapped. We then calculated the ratio of the number of unique MTBS events that contained a FIRED event to the number of unique FIRED events that contained at least one MTBS event, with the optimum value being one. We used this ratio to approximate the spatio-temporal combination that minimized both over- and under-segmentation of the FIRED events based on known MTBS fire perimeters.
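A minimal sketch of this ratio, assuming the spatial overlay has already been reduced to a list of (MTBS ID, FIRED ID) pairs for overlapping polygons. The function names, the input format, and the example values are hypothetical and are not results from the study.

```python
def segmentation_ratio(overlaps):
    """Ratio of unique MTBS events to unique FIRED events among overlapping pairs.

    overlaps: iterable of (mtbs_id, fired_id) tuples, one per overlapping
    polygon pair (assumed input format). A value of 1 is the optimum.
    """
    pairs = list(overlaps)
    mtbs_ids = {mtbs_id for mtbs_id, _ in pairs}
    fired_ids = {fired_id for _, fired_id in pairs}
    if not fired_ids:
        raise ValueError("no overlapping events to compare")
    return len(mtbs_ids) / len(fired_ids)


def closest_to_one(ratios):
    """Return the (pixels, days) key whose ratio is closest to the optimum of 1."""
    return min(ratios, key=lambda key: abs(ratios[key] - 1.0))


# MTBS fire 1 overlaps two FIRED events; MTBS fire 2 overlaps one.
ratio = segmentation_ratio([(1, "a"), (1, "b"), (2, "c")])

# Made-up ratios for three (pixels, days) combinations, for illustration only.
best = closest_to_one({(1, 1): 0.55, (5, 11): 0.98, (15, 15): 1.40})
```

On this reading, a ratio below one suggests that FIRED splits fires that MTBS treats as single events, and a ratio above one suggests the opposite.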
Based on the ratio that minimized both over- and under-segmentation, we estimated an optimal combination for the US of 5 pixels (2315 m) and 11 days. We calculated commission and omission errors for both the FIRED events and the MTBS events.
2.5. Calculating Statistics for Each Event, and Daily Statistics within Events
Once the optimal spatial-temporal aggregation level was identified, we created two vector products for CONUS: one where individual pixels were aggregated to polygons representing each fire event, and one where individual pixels were aggregated to each date within each event. For the event-level vector product, we calculated ignition location and date, duration, spread rate (burned area/duration), burned area, date of maximum growth (the date with the highest burned area per event), area burned on the dates of maximum and minimum growth, and the mean daily area burned for each event. We also extracted the mode of the International Geosphere-Biosphere Programme land cover classification from the MODIS MCD12Q1 landcover product for the year before the fire [
41], and the Commission for Environmental Cooperation’s level 1–3 ecoregions [
42], for each event (
Table 3). Ecoregions are areas where soil, climate, vegetation, and other properties of ecosystems are generally similar. The Commission for Environmental Cooperation has a nested product, with 3 levels of progressively finer grained ecoregion delineations. For the daily-level vector product, we calculated the daily burned area, cumulative burned area per day, days since ignition, mode landcover per day, and mode ecoregion per day, in addition to the metrics calculated for the event-level product (
Table 4). In addition, the algorithm has a third output: a table with each burned pixel as a single row, with coordinates, burn date, and the event identification number derived from the algorithm. This raw output is provided so the end-user can use and manipulate the raw data in any way they see fit.
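To illustrate how some of these event-level attributes could be derived from the pixel table, here is a short sketch. The function name, the returned field names, the nominal 463 m pixel size, and the duration convention (last burn day minus first burn day, plus one) are assumptions for illustration, not definitions taken from the FIRED code.

```python
from collections import Counter

PIXEL_AREA_HA = 463.0 ** 2 / 10_000  # nominal 463 m pixel, about 21.4 ha


def event_statistics(burn_days):
    """Summarize one event from the burn dates of its pixels.

    burn_days: one burn date (days since 1 January 1970) per burned pixel.
    """
    pixels_per_day = Counter(burn_days)
    first_day, last_day = min(pixels_per_day), max(pixels_per_day)
    duration = last_day - first_day + 1  # assumed convention, in days
    total_area = len(burn_days) * PIXEL_AREA_HA
    # Day with the most burned pixels; the earliest day wins ties.
    max_day = max(pixels_per_day, key=lambda d: (pixels_per_day[d], -d))
    return {
        "ignition_day": first_day,
        "duration_days": duration,
        "total_area_ha": total_area,
        "spread_rate_ha_per_day": total_area / duration,  # burned area / duration
        "max_growth_day": max_day,
        "max_growth_area_ha": pixels_per_day[max_day] * PIXEL_AREA_HA,
    }


# Six pixels burning over four days (synthetic values).
stats = event_statistics([100, 100, 101, 101, 101, 103])
```

The same per-day pixel counts could also feed the daily-level product, for example as daily and cumulative burned area.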
2.6. Comparison of FIRED Events to MTBS Events and the National Interagency Fire Center Estimates
In order to understand how well the FIRED algorithm delineated event size, we compared the estimates of burned area from FIRED events to the estimates of burned area for MTBS events for the subset of events that were captured by both products. Because MTBS does not account for unburned patches within a fire perimeter when they calculate burned area, many burned area estimates reported by MTBS are likely overestimations. Thus, comparing the area burned by the two products represents a trade-off between imperfect satellite detection from MODIS and imperfect burned area reporting in the perimeters that drive selection by the MTBS product. With those caveats in mind, we co-located those events captured by both products (i.e., they overlapped in space and time) and compared estimated area burned at the event level using two approaches. First, to compare all fire events, we created a linear regression model where the FIRED-determined area burned predicted MTBS-determined area burned. Second, to understand how that relationship varied with size class, we binned the fire events into 50 equal size classes and created a linear model on each subset. The expectation was that FIRED-based burned areas would be consistently less than the MTBS-based burned areas. In addition, due to lower burn detection by MODIS for smaller fires [
32], we expected the models at smaller size classes to explain less of the variation than for large sizes. We also acquired the total yearly burned area and fire counts from the National Interagency Fire Center (NIFC) for CONUS to understand how FIRED and MTBS products compared to the aggregation of all reported wildfires (note, NIFC does not include intentional land use fires or prescribed burns).
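The two regression approaches can be sketched with ordinary least squares as below. The function names are hypothetical, the data are synthetic rather than study results, and the size classes are formed here as equal-count bins of events sorted by FIRED area, which is only one possible reading of "equal size classes".

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept, with R squared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot


def fit_by_size_class(x, y, n_classes):
    """Fit a separate line within each size class (equal-count bins, assumed)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)  # sort events by FIRED-estimated area
    return [fit_line(x[idx], y[idx]) for idx in np.array_split(order, n_classes)]


# Synthetic burned areas in hectares: FIRED as predictor, MTBS as response.
fired_area = np.array([50.0, 120.0, 300.0, 800.0, 1500.0, 4000.0, 9000.0, 20000.0])
mtbs_area = 1.2 * fired_area + 30.0
slope, intercept, r2 = fit_line(fired_area, mtbs_area)
per_class = fit_by_size_class(fired_area, mtbs_area, n_classes=2)
```

Comparing the R squared values across classes is one way to check the expectation that the relationship explains less variation for smaller fires.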
2.7. Data and Code Availability
4. Discussion
Remote sensing has fundamentally changed our ability to quantify fire, and has consequently challenged how we define fire events. The active fire, burned area, and fire radiative power and severity products [
12,
14,
15,
17,
18,
27,
38] have expanded how we can conceptualize fire regimes. Key to translating this wealth of information is defining fire events in space and time so that we can understand how modern fire regimes are changing. Parallel efforts such as the Global Fire Atlas (based on the MODIS MCD64 product [
27]) have converged on identifying the same need, with a key motivation to improve global fire modeling [
30]. We argue that the need is more profound: to understand how fire regimes are changing at regional to global scales, we need an open and flexible methodology to identify events and integrate fire data across sources based on these events. This event-based approach could be utilized to derive events in any satellite product to build a more complete picture of fire.
There are several beneficial aspects of our approach that yield more appropriate delineation of multi-year events, small fires, complexes, and intra-annual reburns, while also providing key output metrics, e.g., daily fire spread and pre-fire landcover. The primary difference between FIRED and other algorithms is that FIRED uses the entire monthly time series as a space-time cube input, upon which a three-dimensional moving window is applied, compared to aggregating fire seasons or years into one layer, upon which a two-dimensional moving window is applied. This enables proper identification of intra-year reburns (
Table 1), and ensures that fires at the end or beginning of months or years are not arbitrarily split into multiple events (
Figure 3). Second, because the FIRED database is based on the MODIS MCD64 burned area product, it includes fire events as small as 21 ha (i.e., the size of a single MCD64 pixel). The MCD64 burned area product is also informed by MOD14 active fire detections [
27], which may capture events smaller than 21 ha. The MOD14 active fire product has been shown to theoretically detect fires as small as 4 m², although such detections are rare (~90% omission error) [
32]. Small fire events greater than 12.6 hectares are more likely the events that are captured by the MOD14 active fire product (10% omission error), and by extension the MODIS MCD64 burned area product [
32], and therefore are reflected in FIRED. The MTBS database, in contrast, has a minimum threshold of 202 ha east and 404 ha west of the 97th meridian. Having small fires expands our ability to understand how fire size and burned area are changing, beyond just the large events [
43]. Smaller events are difficult to capture systematically, but we know these events can be incredibly important in the US, contributing large additional burned areas and emissions [
20,
44]. Third, the daily-level product preserves the daily-scale information (i.e., daily polygons and ensuing metrics) for the larger events. This elucidates whether large fire events are actually complexes of smaller independently ignited fire patches, or if the large event is truly the product of a single ignition location (e.g., the Rim Fire in
Figure 4). This also allows users to link daily-level burned area data within a defined event to daily or even sub-daily covariates (e.g., climate variables). Daily polygons should be used carefully, as there is uncertainty associated with the burn dates estimated by the MCD64 product (Giglio 2018). They found that 44% of burned grid cells were detected on the same day of an active fire, and 68% within 2 days. Fourth, this product provides several attributes that are new pieces of information, refined across CONUS. For example, fire spread rate is a unique attribute, derived from events, which is a critical piece of information not easily accessed in other datasets (e.g., MTBS). FIRED also provides the landcover for the year before the fire for each event, a coarse metric of fuels information that is critical for understanding ecosystem impacts and resilience. This annual landcover information could enable exploration of when fire precipitates rapid vegetation transitions, particularly as woody plant-dominated systems may lose their resilience to fire against a backdrop of warmer and drier climates [
45,
46]. Last, FIRED is also the only automated, satellite-derived data product we are aware of that captures intra-annual reburns. Intra-annual reburns will perhaps become more prevalent in the future as the decline of resilience in some ecosystems leads to an acceleration of disturbance regimes [
47,
48], particularly if novel ecosystems result from invasive, flammable plants [
7,
49].
Another key advantage of this approach is that the algorithm is open-source and flexible; we hope for community input and we expect it to improve over time. The spatio-temporal criteria can be altered based on other information, regionally-specific fire perimeters such as Canada’s National Burned Area Composite (
https://cwfis.cfs.nrcan.gc.ca/datamart), or known delineations of intentional land use fires or prescribed burns. Further, we anticipate that this algorithm has wide applicability to other fire products and other efforts to build events based on any geospatial data that has both spatial and temporal information. Previous studies, including this team’s previous efforts [
7], have not made their workflow and code publicly available, limiting the potential to facilitate community development of an integrated, evolving global fire database.
With the plethora of remote sensing data about fire and fire effects, there is a great need to delineate events at large regional and global scales. There are at least three other recent studies that have created fire events from the MODIS burned area product (
Table 1), two of which [
34,
36] have created global fire event databases. In addition to the global efforts, Frantz et al. [
35] created an algorithm based on a study area in sub-Saharan Africa that uses a top-down multilevel segmentation strategy that starts by defining potential ignition points and gradually refines the individual object membership. All three efforts use an approach that starts by identifying potential ignition points and grows objects from the ignition point using only adjacent pixels. The code for the algorithm created by Andela et al. [
34] is not publicly available, and the code created by Frantz et al. [
35] is available upon request. Laurent et al. [
36] created a publicly available database, and the code is also available upon request. Their output data contains what they term fire patch functional traits, including patch area and other morphological features, but does not preserve daily fire spread information or polygons containing the perimeter shapes of the derived events. Our approach differs in that we use a spatiotemporal window that can capture isolated burned pixels that may be part of the same event, but may be isolated because of the inability of the MODIS sensor to detect burned area in the area between patches due to cloudiness, low vegetation density, low severity, or unburned patches (i.e., fire refugia) that are important elements of an event. It is worth noting that the spatial-temporal thresholds we derived (i.e., 11-day window and a 5-pixel distance) are much greater than those used in most previous studies (e.g., [
12,
34] but see Frantz et al. [
35]), leading to less artificial truncation, or oversplitting, of events. For example, the Rim Fire that occurred in California in 2013 was delineated into more than 10 separate events by the Global Fire Atlas algorithm, whereas our algorithm delineated a single event that more closely matches the MTBS delineation (
Figure 4). Future improvements could include (i) validation with smaller events, such as those contained in the US-based National Incident Feature Service dataset, formerly Geomac [
50] or others; (ii) estimates of uncertainty around start and end dates of the fire events; (iii) regionally varying thresholds based on fire regime characteristics; and (iv) development of an optimization process that does not rely on already existing fire perimeter polygons. In the current study, we were able to use the MTBS database to define the optimum spatial and temporal parameters for delineating events in CONUS. Unfortunately, these types of data do not exist for many parts of the world. We attempted to scale the FIRED product to the entire globe, and found that our spatial and temporal parameters were inappropriate, particularly for the savanna biome where very high proximity of fires in space and time led to severe aggregation of events. This highlights a substantial need for global fire perimeter data [
51], or development of an optimization approach that does not rely on these external data.
This is a unique moment in the history of fire science, given the abundance of fire data across spatial scales, that requires the fire science community to better coordinate efforts on fire data harmonization challenges and opportunities. We see great potential to build a community-driven, fire data infrastructure that we term OneFire. OneFire is a coordinated architecture that would enable a community of researchers and stakeholders to use, repurpose, and contribute to fire data, code, and workflows. The vision for OneFire is that it will be a coordinated, community-inspired data architecture that connects and integrates the many global, national, and regional fire databases. This is no small task, but integrating these datasets is key to unlocking a transformation in fire science and rapidly accelerating new discoveries about why fire regimes are changing and how societies and ecosystems are vulnerable. There is an enormous amount of data and work relevant for fire science that could be leveraged, if only it were open, reproducible, and scalable. For example, we anticipate that a newly published ICS-209-PLUS dataset (an integrated database of over 120,000 incident command reports for the U.S. from 1999–2014) could be connected to MODIS FIRED events to join physical attributes with social impact and response on a daily scale [
52]. Social media information around wildfires could also be leveraged and provide a view of social response that previously would not have been possible [
53,
54]. Additional satellite sensors and their derived products, e.g., active fire, could be leveraged to expand the detections per event and add other key properties like fire radiative power. Key elements of a vision for OneFire include (i) identified fire events across many datasets utilizing the FIRED event-builder algorithm or a synergistic approach that delineates events in space and time; (ii) integration workflows that then connect those same events across data sources to build a fuller suite of attributes around commonly identified events; (iii) data and computational infrastructure that allows for community contributions of data, code, and compute environments; (iv) formal linkages to other important climate, environment, and social data sources that provide insights into driving forces or responses; and (v) support for community building, engagement, and training that facilitates large, diverse team science. Ultimately, no single sensor is going to provide all the information we need about fires, and we can never anticipate all the ways that such an integrated source of fire information will get used. OneFire will help us build a fuller, global picture of fire.