Improving on MODIS MCD64A1 Burned Area Estimates in Grassland Systems: A Case Study in Kansas Flint Hills Tall Grass Prairie

Uncertainty in satellite-derived burned area estimates is especially high in grassland systems, which are some of the most frequently burned ecosystems in the world. In this study, we compare burned area estimates for a region with the highest fire activity in North America, the Flint Hills of Kansas, USA, using the Moderate Resolution Imaging Spectroradiometer (MODIS) MCD64A1 burned area product and a customization of that product built on a major ground-truthing effort by the Kansas Department of Health and Environment (the KDHE-MODIS customization). Local-scale ground-truthing and the KDHE-MODIS product suggest that MODIS burned area estimates underpredicted fire occurrence by 28% over a 19-year period in the Flint Hills ecoregion. Between 2001 and 2019, the MODIS product indicated <1 million acres burned on average, far below the KDHE-MODIS customization (mean = 2.6 million acres). MODIS also showed that <1% of the Flint Hills burned in each of 5 years between 2001 and 2019 (2001, 2002, 2007, 2012 and 2013), whereas the KDHE-MODIS customization showed that this never happened in any single year. KDHE-MODIS also captured some areas of the Flint Hills that burned every year (19 times in 19 years), a pattern that is well known from field inventory data, whereas the maximum fire occurrence in MODIS was 14 times in 19 years. Finally, MODIS never captured >8% burned area for any given year in the Flint Hills, even in the years when fire activity was highest (2008, 2009, 2011, 2014). Based on these results, coupling MODIS burned area computations with local-scale ground-truthing efforts has the potential to significantly improve fire occurrence estimates and reduce uncertainty in other grassland and savanna regions.


Introduction
Satellite data have become the general standard for advancing our understanding of global ecological processes such as fire activity [1], carbon cycles [2], air quality [3], and climate [4]. High-quality satellite data are now freely available at moderate spatiotemporal resolution through Earth observation datasets such as those from the Moderate Resolution Imaging Spectroradiometer (MODIS). The usefulness of freely available satellite-derived datasets for advancing global scientific pursuits is unmatched [5,6]. As we tackle some of Earth's biggest challenges, e.g., understanding current and future land cover dynamics in times of climate uncertainty, our reliance on satellite data is likely to increase. Therefore, where possible, quality control measures are important for inference, interpretation, and application when these global datasets are used for government planning and policy interventions.
In recent years, the discipline of fire ecology has increased its reliance on remotely sensed data for basic and applied questions in the field. Fire ecology has transformed substantially from a discipline based on field sampling with broad data extrapolation to one utilizing satellite data. Extrapolation is common in ecology; however, the process relies on a model and is therefore a source of error [7]. Satellite data can overcome the extrapolation problem, within the limits set by the data resolution (grain) and the error associated with the model used to develop the satellite product. In exchange, satellite data offer a geographic extent that has revolutionized how we study global patterns. Satellite-derived data quality has vastly improved since the first satellite to send data from space was launched in the mid-1900s. Today, several satellite-derived fire data products are freely available, among which MODIS is one of the most popular [8], and these products have enabled a plethora of global fire research.
While satellite data are increasingly used to inform policy, monitor landscape dynamics and mitigate undesired landscape changes in the Earth system, reducing uncertainty in model output remains a central and necessary pursuit if we are to take advantage of continued advances in remotely sensed data and their applications. This is especially true in grasslands and savannas, where satellite-derived fire products such as burned area are generally underestimated because the recovery period of these systems is rapid and small fires (<100 ha) often go undetected [9]. In North America, the detection of 2003–2005 fires increased from East (60%) to West (89%), with the central region (Great Plains grasslands) in between (80%). Here, failure to detect burned areas accurately was correlated with increased cloud cover, forest types, landscape pattern, fuel load and fire behavior [10]. The MODIS burned area product utilizes active fire detection, and therefore a fire may go undetected if it is inactive during the sensor overpass [10]. This highlights the need to improve these global fire data products by reducing uncertainty through local customizations that draw on on-site observers, especially in grassland systems.
For fire researchers in grassland and savanna ecosystems, there has been a consistent outcry over the need to improve the reliability of remotely sensed quantification for the simplest of fire metrics, such as the presence or absence of fire at a given time and at a reliable spatial resolution (i.e., mapping burned areas and fire occurrence). These are the most pyric systems in the world, and high fire activity areas offer some stability for studying the causes of these fire regimes, e.g., ignition sources and feedbacks with herbivory [11][12][13]. In grassland systems, fires are often of low intensity and severity, and because these systems recover more rapidly than others such as forests, many fires go undetected. Forest fires are generally of higher intensity due to fuel loads [14][15][16] and have longer post-disturbance recovery periods associated with forest succession dynamics [17]. Furthermore, validation of grassland and savanna fires in Africa using alternative satellite-derived fire products at finer resolution showed higher commission rates in grassland and savanna systems [18], and such products generally produced the weakest relationships for savanna and grassland regions [19,20]. The general underestimation of burned grassland areas has prompted local efforts to improve upon global fire data products so that they better capture spatial information on fire occurrence in areas that rely on accurate fire data for local-scale state planning. The resulting maps help local stakeholders understand fire attributes such as burned areas, patterns, seasons and ignition sources, and support planning for future fire management. These maps are generally detailed and are often used to inform management decisions such as grazing rotations and to monitor fuel loads to reduce wildfire risk [21].
In the Flint Hills ecoregion within the tallgrass prairie of North America, organizations such as the Kansas Department of Health and Environment (KDHE) in the USA develop annual MODIS-based customized burned area fire scars by incorporating field data with the MODIS active fire datasets (MCD64A1) [22]. To our knowledge, this represents one of the first efforts to improve upon global spatial fire occurrence data at landscape scales in grassland systems, and it provides an opportunity for other grassland and savanna regions of the world to learn from this effort.
In this paper, we compare differences in data capture for fire occurrence between the MODIS burned area product, MCD64A1 [23], and the KDHE-MODIS customization of this dataset between 2001 and 2019. The Flint Hills fire regime is well studied, e.g., [24][25][26][27], making it an ideal location to perform this test. Quantifying satellite-derived fire data is important for this rangeland-based economy given the management and control of threats such as woodland expansion [25] and air quality concerns associated with smoke emissions during fire season [24]. We discuss the implications of our findings for the Flint Hills ecoregion and the potential to improve upon global fire datasets using regional customizations of global products.

Study Area
The Flint Hills ecoregion is situated mostly in Kansas, USA, and is one of the last remaining tallgrass prairies in North America. It is embedded in a mosaic of land cover types within the Great Plains, USA, but is dominated mostly by grassland (Figure 1). The Flint Hills has a long history of fire research [28]. Its fire frequency is intact and well studied, with some areas burning annually and others hardly at all [25,26]. No other grassland region in North America maintains its fire frequency regime at regional scales as observed in the Flint Hills. Woodland expansion has increased in the region over the past few decades [29] and has become more of a concern in recent years [30]. Rainfall in the region ranges from 220 to 1300 mm with a mean annual precipitation of about 940 mm [31] and thus has the potential to sustain relatively high levels of woody plants [32].
Remote Sens. 2020, 13, x

Fire Data
Fire data were obtained from two sources: the Moderate Resolution Imaging Spectroradiometer (MODIS) burned area product Collection 6, MCD64A1 [23], and the KDHE-MODIS customization of the MODIS MCD64A1 burned area product (a combination of Collections 5 and 6). The MODIS MCD64A1 Collection 6 burned area product has made substantial improvements in detecting global burned areas (by 26%) compared to Collection 5. The overall technique produces composite imagery summarizing persistent changes in the time series of a burn-sensitive vegetation index, which is used to estimate probabilistic thresholds suitable for classifying individual 500-m grid cells as either burned or unburned [23]. The KDHE-MODIS MCD64A1 customization data set was created using MODIS MCD64A1 burned area satellite images and the surface reflectance layer following the methods described in Mohler and Goodin [26], and it includes previously developed fire scars created by Mohler and Goodin [26]. Collection 5 (MCD45) of the active fire datasets was used from 2001 until March 31, 2017. Thereafter, Collection 6 (MCD64A1) was used for the KDHE-MODIS burned area product development. For the purposes of this study, fires that fell within a bounding box (−97°30′3.24″ to −95°29′60″ W, 35°59′60″ to 40°0′22″ N) were considered, making up an area of about 49,221 km². Throughout the manuscript, we use the term "MODIS" when referring to MODIS MCD64A1 and "KDHE-MODIS" when referring to the KDHE-MODIS MCD64A1 customization.
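For readers who want to reproduce the spatial subset, the bounding box above has to be converted from degrees, minutes and seconds to decimal degrees before most GIS tools can use it. The following is a minimal Python sketch of that conversion. The function names `dms_to_decimal` and `in_bounding_box` are hypothetical helpers introduced here for illustration and are not part of the original workflow, which was carried out in R and Google Earth Engine.

```python
def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to decimal degrees.

    The sign of `degrees` is applied to the whole value, so western
    longitudes are passed as negative degrees with positive minutes and
    seconds. (Assumption: degrees is never a negative zero.)
    """
    sign = -1.0 if degrees < 0 else 1.0
    return sign * (abs(degrees) + minutes / 60.0 + seconds / 3600.0)


# Bounding box corners as stated in the text (W longitudes are negative).
west = dms_to_decimal(-97, 30, 3.24)
east = dms_to_decimal(-95, 29, 60)
south = dms_to_decimal(35, 59, 60)
north = dms_to_decimal(40, 0, 22)


def in_bounding_box(lon, lat):
    """Return True when a point in decimal degrees lies inside the study box."""
    return west <= lon <= east and south <= lat <= north
```

Under this conversion the box spans roughly −97.5009° to −95.5° in longitude and 36.0° to 40.0061° in latitude.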

Data Analysis
For the KDHE-MODIS MCD64A1 customization, daily surface reflectance data from both the Aqua MODIS and Terra MODIS satellites were used for training on a near-daily basis when clear-sky conditions were present across the Flint Hills study area during the prescribed burn season. The burn season varies from year to year, but it predominantly begins in mid-February to mid-March and ends in late April to early May; the remainder of the year was not analyzed even though some fire may be present. Analysis was completed for portions of the Flint Hills under appropriate conditions (i.e., clear skies). If data were not available for a specific day for quality assurance reasons (e.g., cloud cover), that day was skipped and the next valid daily data set was used. Following Mohler and Goodin [26], a supervised classification was performed with a minimum distance of 250 m during the burn season to ensure producer accuracy of ≥90% when burned areas were mapped within two weeks of an active fire. Thereafter, land parcels (pixels) were further classified as burned or unburned by field staff when they were in the field or near predicted burned land parcels. Using both datasets, fire frequency was calculated and a fire scar map was compiled for the Flint Hills and its surrounding areas beyond the extent of the level 3 ecoregion [33]. The MODIS data were extracted using Google Earth Engine [34] and resampled to 250 m to match the resolution of the KDHE-MODIS customized burned area product.
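The resampling step can be illustrated with a small sketch. The text does not state which resampling method was used, so the example below assumes nearest-neighbour resampling on aligned grids with an exact 2:1 ratio (500 m to 250 m). Under that assumption each coarse cell is simply copied into a 2 × 2 block of finer cells. The function name `resample_nearest_2x` and the toy grid are hypothetical.

```python
def resample_nearest_2x(grid):
    """Split each coarse cell into a 2 x 2 block of finer cells.

    With an exact 2:1 ratio and aligned grids, nearest-neighbour
    resampling reduces to copying the coarse value into its four
    child cells, so the burned fraction of the map is unchanged.
    """
    fine = []
    for row in grid:
        fine_row = []
        for value in row:
            fine_row.extend([value, value])  # duplicate along columns
        fine.append(fine_row)
        fine.append(list(fine_row))          # duplicate along rows
    return fine


# Toy 500 m burn mask (1 = burned, 0 = unburned); illustrative values only.
coarse = [[1, 0],
          [0, 1]]
fine = resample_nearest_2x(coarse)  # 4 x 4 grid at 250 m
```

Because values are copied rather than interpolated, the resampled layer stays binary and can be compared cell by cell with a 250 m product.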
We compared fire occurrence and burned area between the MODIS and KDHE-MODIS customized burned area products from 2001 to 2019. Fire occurrence comparisons, i.e., agreement and disagreement, were conducted by comparing burned vs. unburned pixels between the two products over the entire time period (irrespective of fire frequency). Burned areas, however, were compared annually. All analyses were conducted in R v. 3.6.1 [35] using the raster [36] and diffeR [37] packages.
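The presence/absence comparison amounts to a cross-tabulation of two binary maps. The sketch below shows the idea in plain Python. It is not the authors' R code and does not reproduce the diffeR interface; the function name `compare_burn_maps` and the toy pixel values are hypothetical.

```python
def compare_burn_maps(modis, kdhe):
    """Cross-tabulate two binary burn maps given as flat pixel lists.

    Each value is 1 if the pixel burned at least once during the study
    period and 0 otherwise. Returns each category as a percentage of all
    pixels, so the four values sum to 100.
    """
    if len(modis) != len(kdhe) or not modis:
        raise ValueError("maps must be non-empty and of equal length")
    counts = {"both_burned": 0, "both_unburned": 0,
              "kdhe_only": 0, "modis_only": 0}
    for m, k in zip(modis, kdhe):
        if m and k:
            counts["both_burned"] += 1
        elif not m and not k:
            counts["both_unburned"] += 1
        elif k:
            counts["kdhe_only"] += 1   # burned in KDHE-MODIS, missed by MODIS
        else:
            counts["modis_only"] += 1  # burned in MODIS, absent from KDHE-MODIS
    total = len(modis)
    return {key: 100.0 * n / total for key, n in counts.items()}


# Toy example with ten pixels; values are illustrative, not study data.
modis_map = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
kdhe_map = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]
result = compare_burn_maps(modis_map, kdhe_map)
```

The two agreement categories and the two disagreement categories partition the study area, which is why the percentages for any pair of maps sum to 100.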

Results
On average, MODIS recorded <1 million acres burned, while the KDHE-MODIS customization improved on that by 58% (>2 million acres, Figure 2). MODIS detection of burned area was <1% in 5 years (2001, 2002, 2007, 2012 and 2013), while the KDHE-MODIS customization showed that no year recorded <1%. MODIS data missed 5 years of fire data even on the most frequently burned pixels. MODIS also never captured a single year with >8% burned area, even in the years when the Flint Hills experienced its highest fire activity (2008, 2009, 2011, 2014).
Major differences in mean fire return intervals (MFRI = years of observation / number of fires) were observed in the core area that burns most frequently, with MODIS = 1.35 years and KDHE-MODIS customized = 1 year (Figure 3). Comparing the two burned area products on fire presence/absence only, 18% agreement in burned areas was observed, while agreement in non-burned areas was 50%. In areas where MODIS recorded no fires, the KDHE-MODIS customization ground-truthed 28% fire activity. Conversely, in areas where the KDHE-MODIS customization recorded no fires, MODIS recorded 4% fire activity (Figure 3). Fire metrics in the region are highly contingent on the extent of what is considered the Flint Hills and the municipal counties that make up this ecoregion. When we consider the environmentally based ecoregion boundary as the "Flint Hills proper", about 5% of the Flint Hills burns on average according to MODIS and 23% according to the KDHE-MODIS customization. If we consider an extended area that includes the 20 counties as the "greater Flint Hills region" (−97°30′3.24″ to −95°29′60″ W, 35°59′60″ to 40°0′22″ N), which matches the extent of the KDHE-MODIS customized fire data, these values decrease to 3.5% (MODIS) and 14.5% (KDHE-MODIS customization). Annual acres burned are reported in Figure 2.
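As a worked illustration of the MFRI definition, the sketch below computes per-pixel fire frequency from a stack of annual burn masks and then divides the years of observation by that count. The helper names `fire_frequency` and `mean_fire_return_interval` are hypothetical, and treating never-burned pixels as having an infinite return interval is an assumption made here rather than something stated in the text.

```python
def fire_frequency(annual_masks):
    """Count the number of years each pixel burned.

    `annual_masks` is a list with one flat list of 0/1 values per year.
    """
    return [sum(pixel_years) for pixel_years in zip(*annual_masks)]


def mean_fire_return_interval(years_observed, n_fires):
    """MFRI = years of observation / number of fires.

    Pixels that never burned have no defined interval; infinity is
    returned for them (an assumption of this sketch).
    """
    if n_fires == 0:
        return float("inf")
    return years_observed / n_fires


# Values from the text: 19 fires in 19 years vs. a maximum of 14 in 19 years.
mfri_kdhe = mean_fire_return_interval(19, 19)   # 1.0 year
mfri_modis = mean_fire_return_interval(19, 14)  # about 1.357 years

# Toy stack of three annual masks for four pixels (illustrative only).
freq = fire_frequency([[1, 0, 1, 0],
                       [1, 0, 0, 0],
                       [1, 1, 0, 0]])
```

The value 19/14 ≈ 1.357 corresponds to the 1.35 years reported for MODIS in the most frequently burned core area.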

Discussion
Our study compares fire occurrence between two related datasets in one of the last remaining grassland regions of the world with a relatively intact fire frequency regime. No other region in the Great Plains burns as often as the Flint Hills, and it is one of a few regions in the world where fire can be studied at ecoregion/landscape scales. Our reliance on satellite data has increased substantially in recent years, and field-based assessments of fire accuracy are important to reduce model uncertainty associated with satellite-derived data products. Overall, KDHE's customization of the MODIS MCD64A1 burned area product showed major improvement in burned area detection compared to MODIS without customization. The main reason for the improvement on the MCD64A1 burned area product is that the KDHE-MODIS customization relies on field data, with experts clarifying which parcels have burned and which have not. As such, without field-based accuracy assessments, we are underestimating satellite-derived fire activity even in one of the world's most frequently burned grassland areas.
MODIS missed five years of data in the most frequently burned pixels. Compared to historical fire return estimates, which are believed to be somewhere between one and five years [38], 85% of the greater Flint Hills region burns less often than expected. This value increased by 9% without the KDHE-MODIS customization. The KDHE-MODIS customization captured the core area that maintained the 1-year MFRI for which the Flint Hills is well known [26]. Even in the most frequently burned areas, the MODIS MFRI was 1.35 years. Nevertheless, outside the core area, most of the Flint Hills does not reach the historical fire return interval, and this may have greater implications in the future for threats such as woodland expansion.
A general expectation based on historical records is that one third, or about two million acres, of the Flint Hills burns annually. This value varies depending on what we consider as the Flint Hills, i.e., Flint Hills proper vs. greater Flint Hills. If we consider the greater Flint Hills with the 20 municipal counties that contribute to it, then about 2.6 million acres burn on average, which corresponds to 14.5%. In comparison, about two million acres burn in the Flint Hills proper (ecoregion boundary) per annum, which corresponds to 23% according to the KDHE-MODIS customization. Utilizing MODIS fire data without field-based customization would severely underestimate this value, as MODIS never recorded >8% burned area in any year. Grassland is probably the land cover type whose fires are most underpredicted in the world, because recovery following a fire is rapid and small fires may be easily missed. Limitations in predicting fire activity have major drawbacks for rangeland-based economies such as those in the Flint Hills, especially with regard to the control of woodland expansion [25] and air quality concerns associated with smoke emissions [24] during fire season (March to May) [39]. Understanding fire patterns is essential for land planning to combat and control these threats to the Flint Hills.
The value of ground-truthing the MCD64A1 burned area product is unparalleled in this frequently burned grassland region. Furthermore, grassland fires generally pass rapidly from the burning phase to the smoldering phase, which is poorly captured by MODIS if fires are not large enough (≥100 ha) [40]. Recent advances in monitoring have coupled remote sensing data with ground-truthed information by integrating products into a cloud-computing, artificial intelligence environment [41]. A similar process may help to combine ground-truth data with remote sensing information to improve the characterization of fire occurrence in grasslands. Other satellite sources offer alternative fire products at fine resolution (10–60 m), such as the European Space Agency (ESA) Copernicus Sentinel 2 mission launched in June 2015, which has already shown an improvement in predicting fire activity in Africa [9].

Conclusions
Accurate fire data are essential to maintain one of the last remaining tallgrass prairies on Earth, and MODIS is unmatched at providing accessible spatiotemporal global fire data. Our main findings showed that local-scale ground-truthing of the MCD64A1 product improved burned area estimates by 28% over a 19-year period in the Flint Hills ecoregion. Obtaining accurate fire data is a challenge for grassland and savanna researchers globally. We are aware of departures in fire return intervals and the decrease in global fire activity [42]. Based on current research trends [43,44], future conservation efforts will rely heavily on satellite-based data. Reducing model uncertainty in satellite fire data will allow us to make better decisions and will improve our ability to map, visualize and stay informed about changes to global fire regimes. Therefore, we need local customization to improve our global fire data and reduce uncertainty.