1. Introduction
Climate change, with its severe heat waves, prolonged droughts, forceful wind gusts, and changes in precipitation patterns, is causing more frequent and intense wildfires [1]. For example, 13 of the largest wildfires on record in California occurred within the last ten years, with nine of them in 2020 and 2021 alone [2]. Wildfires cause major disturbances to natural ecosystems and human communities, including loss of human lives [3]. Fire detection and monitoring are thus crucial for mitigating wildfire damage.
One approach to detect and monitor fires is to use images from Geostationary Earth Orbit (GEO) satellites [4]. This approach can detect fires in near real time since GEO satellites capture images every 5–15 min. Indeed, fire detection products have been developed for most GEO satellite systems, mainly using threshold-based algorithms [5,6]. Notable examples of GEO satellites and sensors used for fire detection are the Advanced Baseline Imager (ABI) onboard the Geostationary Operational Environmental Satellite–R Series (GOES-R) [7], the Himawari-8/9 that is equipped with the Advanced Himawari Imager (AHI) [8], and the Geo-Kompsat-2A carrying the Advanced Meteorological Imager (AMI) [9]. Over the past few years, several efforts have been made to enhance the fire detection capabilities of current GEO-based algorithms. In particular, several authors developed sophisticated machine learning-based fire detection algorithms [10,11,12]. Hence, evaluating the performance of a given fire detection algorithm is a key component, both for comparing the accuracy of different methods and for developing improved ones.
The omission and commission error rates are two common quantities used to evaluate fire detection algorithms [13]. The omission error rate is defined as the probability that the algorithm does not detect a fire. The commission error rate, also known as the false alarm rate, is the probability that the algorithm declares a fire even though there is no fire at that location [13]. Other related evaluation measures are the precision and recall of a fire detection algorithm, and its corresponding F1-score [14].
To estimate these quantities, it is essential to have a reliable ground truth reference for comparison. One of the most accurate approaches is to use ground-based in situ fire reports as a reference, as collected by disaster response entities such as a National Forest Service [12]. However, such reports may have limited spatial coverage and are often not publicly available. A different approach is to use the fire product output from images of Low Earth Orbit (LEO) satellites as ground truth. LEO satellites have higher spatial resolution than GEO satellites, and their fire products are considered much more accurate [15,16]. This approach has been applied with several LEO satellites and their imaging systems, including the Visible Infrared Imaging Radiometer Suite (VIIRS), Moderate Resolution Imaging Spectroradiometer (MODIS), Landsat, and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) [13,17,18,19,20,21,22,23,24,25].
However, as noted in several studies [26,27], there can be spatial misalignments in the locations of objects as they appear in LEO and GEO images. The fundamental drivers of these GEO-LEO spatial misalignments are related to physical and geometric constraints. A primary factor is the parallax effect. GEO sensors are located over the equator and often observe regions at large zenith angles. In contrast, LEO sensors might view the same area from a near-nadir position or from a large viewing angle in a completely different direction. This discrepancy in viewing geometry means that the apparent position of features is physically shifted, causing pixel locations to be displaced on the Earth’s surface [26,27].
Figure 1 illustrates this effect, showing how a single fire event observed by both LEO and GEO satellites from different viewing angles results in an apparent spatial displacement of the fire location. Beyond the parallax effect, other major sources of spatial misalignments are the sensor’s Point Spread Function (PSF) and the pixel remapping process [5]. The PSF is an inherent optical property of the instrument. Due to diffraction, the radiant energy of a fire disperses outside the center field of view, spreading the thermal signal into adjacent detectors. In addition, the remapping operation of satellite pixels onto a fixed Earth grid requires interpolation, which smears the radiant energy of a sub-pixel fire across adjacent pixels, further distorting its apparent location.
Figure 2 shows an example of misalignments between the locations of fires detected from GEO and LEO images in three regions—California-Oregon, Amazonas, and Patagonia. The GEO images were captured by the GOES 16/17 ABI, whereas the VIIRS instrument acquired the corresponding LEO images. The figure shows that, although both instruments observe the same fires, there are significant spatial displacements of several kilometers in their locations. Hence, blindly using LEO detections as ground truth may be highly problematic for evaluating GEO fire detection algorithms.
An additional complication is that misalignments between the locations of GEO and LEO fire detections do not follow a systematic pattern. This is illustrated in Figure 3, which shows, for two different fire events in the Amazonas, their locations as detected by GOES 16 ABI and VIIRS. Even though the two fires, which occurred on 17 August 2022, were approximately in the same region, their ABI-VIIRS displacements are in different directions (east direction in one, and north direction in the other). This is just one example, as similar inconsistencies appear in all regions considered in this study.
These misalignments raise important questions at the focus of this study: How common are misalignments between ABI and VIIRS fire detections? And how should the accuracy of a fire detection algorithm be assessed in light of such misalignments?
Several studies that evaluated fire detection methods using LEO detections as ground truth did not account for these misalignments [11,13,28]. In these studies, a GEO fire detection is considered a false alarm if there are no LEO detections inside the GEO pixel. Other studies applied either a circular buffer around each GEO fire-detected pixel [18,21,22,23] or a square window [20,24,25]. In these studies, a GEO fire pixel is considered a correct detection if there is at least one LEO fire detection in its buffer area. Otherwise, the GEO fire-detected pixel is labeled as a false alarm.
The primary objective of this study is to quantify the prevalence of GEO–LEO misalignments and assess their impact on the evaluation of GEO fire detection accuracy. To this end, this study uses the GOES-16/17 ABI Fire Detection and Characterization (FDC) product as the GEO-based data source, and the VIIRS active fire product as the LEO-based reference in three different regions. This study shows that spatial misalignments between the location of fire detections from GEO and LEO sensors significantly impact the accuracy estimates of GEO-based fire detection algorithms. It is further shown that such misalignments commonly occur across various latitudes and ecosystem types.
2. Materials and Methods
This section outlines the data and the methods used to quantify the misalignments between GOES ABI and VIIRS fire detections and to evaluate the GOES-FDC algorithm. First, the study areas and data sources are described. Next, the pre-processing steps performed to spatially align these datasets are detailed, including corrections for the bow-tie effect, the rasterization of VIIRS fire detections, and the non-maximal suppression step. Subsequently, the methodology for quantifying spatial misalignments and evaluating algorithm performance using various buffer sizes is presented. All analyses in this study are performed at the pixel level.
2.1. Study Area
Three distinct regions of varying latitudes were included in the dataset of this study: high latitude north in the California-Oregon region (western United States), equatorial latitude in Brazil’s Amazonas State, and high latitude south in Patagonia (Figure 4). These areas all experience wildfires and have different environmental and climatic conditions.
2.2. Data Sources
For each of the three regions, a dataset of temporally matched fire product pairs from the LEO VIIRS and the GEO GOES-16/17 ABI was constructed for the whole of 2022 (GOES-17 for California-Oregon and GOES-16 for the other two regions). Only GOES ABI images acquired within ±5 min of the corresponding VIIRS images were considered. Figure 5 illustrates the distribution of time offsets between the paired GOES-ABI and VIIRS images across the three study regions. As shown, the median time offset is 1 min, and it is assumed that fire progression within this time frame would not impact the misalignment and evaluation analyses.
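The temporal pairing rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `pair_within_window` and the timestamps are hypothetical, and each VIIRS acquisition is assumed to be matched to its single nearest GOES ABI acquisition.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def pair_within_window(viirs_times, goes_times, max_offset=timedelta(minutes=5)):
    """Pair each VIIRS time with the nearest GOES time if it lies within max_offset.

    Hypothetical helper: the nearest-neighbor matching rule is an assumption.
    """
    goes_sorted = sorted(goes_times)
    pairs = []
    for tv in viirs_times:
        i = bisect_left(goes_sorted, tv)
        # Only the acquisitions just before and just after tv can be the nearest.
        candidates = goes_sorted[max(i - 1, 0): i + 1]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda tg: abs(tg - tv))
        if abs(nearest - tv) <= max_offset:
            pairs.append((tv, nearest))
    return pairs


# Hypothetical acquisition times (not taken from the dataset).
viirs = [datetime(2022, 8, 17, 17, 42), datetime(2022, 8, 17, 5, 7)]
goes = [
    datetime(2022, 8, 17, 5, 0),
    datetime(2022, 8, 17, 5, 15),
    datetime(2022, 8, 17, 17, 40),
    datetime(2022, 8, 17, 17, 50),
]
pairs = pair_within_window(viirs, goes)
```

In this toy input, the 17:42 VIIRS overpass is paired with the 17:40 GOES image (2 min offset), while the 05:07 overpass is dropped because its nearest GOES image is 7 min away.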
For VIIRS, the collection 2, level-2, VNP14IMG active fire product with 375 m resolution [29] was downloaded. Only “high” and “nominal” confidence VIIRS fire detections were retained, while “low” confidence cases were omitted.
For GOES ABI, the FDC product was downloaded [5]. The GOES FDC has several mask codes for its detected fire pixels, as detailed in Table 1. As the table shows, some fire pixel categories are referred to as “temporally filtered”. A fire-detected pixel is considered temporally filtered if it was also detected as a fire at least once in the past 12 h.
The fire detection categories with the lowest false alarm rates are “processed” and “saturated”, both with and without temporal filtering [5,18]. Thus, in this study, only pixels with the corresponding mask codes (10, 11, 30, and 31) were used to assess the commission error and the prevalence of misalignments.
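As a minimal sketch of this selection step, the snippet below builds a boolean fire image from an FDC mask array by keeping only the four mask codes named above. The function name `select_fire_pixels` and the toy array are hypothetical, and reading the mask from the FDC product files is assumed to have been done elsewhere.

```python
import numpy as np

# Mask codes retained in this study: processed and saturated fire pixels,
# with and without temporal filtering.
LOW_FALSE_ALARM_CODES = (10, 11, 30, 31)


def select_fire_pixels(fdc_mask, codes=LOW_FALSE_ALARM_CODES):
    """Return a boolean image that is True where the FDC mask code is in `codes`."""
    return np.isin(np.asarray(fdc_mask), codes)


# Toy 2 x 2 mask; the values 13 and 0 stand in for any other mask code.
toy_mask = np.array([[10, 13],
                     [31, 0]])
fire = select_fire_pixels(toy_mask)
```

Passing a different `codes` tuple gives the broader fire definition used later for the omission analysis, where any fire mask code in Table 1 counts as a detection.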
The constructed dataset contains 156,952 VIIRS fire detections (Table 2) and 59,522 matching GOES FDC fire detections, of which 14,645 are from the “processed”, “temporally filtered processed”, “saturated”, and “temporally filtered saturated” categories (Table 3). The dataset size is comparable to those used in previous GEO fire detection evaluations [13,18].
Appendix A and Appendix B provide additional details about the fire products and their source materials.
2.3. Removal of Duplicate VIIRS Detections—The Bow-Tie Effect
The VIIRS instrument has a whiskbroom scanner, in which pixel size increases toward the edges of each scan. As a result, pixels near the scan edges overlap with those from previous scan lines. This phenomenon is known as the “bow-tie effect” [30]. VIIRS removes some of the duplicated pixels in its standard processing pipeline. However, several residual “bow-tie” pixels remain. These duplicates may lead to an incorrect estimation of the probability of fire detection by GOES FDC. Hence, these VIIRS duplicate fire detections need to be removed. This issue is detailed in Appendix C.
2.4. Adjustment of VIIRS Pixel Size as a Function of View Zenith Angle
The nominal spatial resolution of VIIRS active fire pixels is 375 m at nadir. However, the actual pixel size increases with the View Zenith Angle (VZA) of the VIIRS scan. To account for this increase, the actual pixel size in meters (
) was calculated as a function of the VZA [
31] as follows:
where
is the nominal pixel size at nadir (375 m), and
is the VZA in degrees. This correction ensures that the spatial resolution of each VIIRS fire detection reflects its true surface coverage. The corrected pixel sizes were subsequently used during the rasterization of VIIRS detections onto the GOES-ABI grid.
2.5. Rasterization of VIIRS
As VIIRS and GOES ABI have different spatial resolutions and projections, the size-adjusted VIIRS fire detections were rasterized into the GOES ABI grid using the “rasterize” function from the “Rasterio” library [32] in Python (version 3.9.23). The rasterized output is a GOES ABI grid where each pixel is assigned the number of VIIRS pixels that intersect it. A GOES ABI pixel was assigned a zero value if it had no intersection with VIIRS pixels. Figure 6 illustrates this process.
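To make the counting semantics concrete, the following NumPy-only sketch is a simplified stand-in for the Rasterio call, not the authors' implementation. It assumes a regular grid in projected coordinates (meters) with a top-left origin and treats each VIIRS pixel as an axis-aligned square, whereas the real GOES ABI fixed grid and VIIRS footprints are more complex. The function name `count_footprints` and all numbers are hypothetical.

```python
import numpy as np


def count_footprints(centers, sizes, x0, y0, cell, shape):
    """Count, per grid cell, how many square footprints intersect it.

    centers: (x, y) footprint centers in meters; sizes: side lengths in meters.
    (x0, y0) is the top-left grid corner and `cell` the cell size in meters.
    Assumption: axis-aligned squares on a regular projected grid.
    """
    rows, cols = shape
    counts = np.zeros(shape, dtype=int)
    for (x, y), s in zip(centers, sizes):
        half = s / 2.0
        # Index range of the cells touched by the footprint (rows grow southward).
        c_min = int(np.floor((x - half - x0) / cell))
        c_max = int(np.floor((x + half - x0) / cell))
        r_min = int(np.floor((y0 - (y + half)) / cell))
        r_max = int(np.floor((y0 - (y - half)) / cell))
        # Clip to the grid and skip footprints that fall entirely outside it.
        r_min, r_max = max(r_min, 0), min(r_max, rows - 1)
        c_min, c_max = max(c_min, 0), min(c_max, cols - 1)
        if r_min > r_max or c_min > c_max:
            continue
        counts[r_min:r_max + 1, c_min:c_max + 1] += 1
    return counts


# Toy 3 x 3 grid of 2 km cells; the second footprint straddles two cells.
counts = count_footprints(
    centers=[(1000.0, 5000.0), (2000.0, 5000.0), (5000.0, 1000.0)],
    sizes=[375.0, 400.0, 375.0],
    x0=0.0, y0=6000.0, cell=2000.0, shape=(3, 3),
)
```

The straddling footprint is counted in both cells it touches, which is exactly the situation the later non-maximal suppression step is meant to handle.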
2.6. Prevalence of Misalignments
The prevalence of misalignments between the locations of fires detected by GOES FDC vs. VIIRS in a given dataset is estimated as follows: Let
m be the total number of FDC fire detections in the dataset. Each detection
, located at GOES ABI grid point
, is assigned to one of three possible categories: If there is at least one VIIRS fire detection inside the FDC pixel
, then the pixel is assigned to the first category with
. Otherwise,
, which may be due to misalignment. In this case, the presence of VIIRS fire detections is checked in a
buffer window around the pixel
. If there is a VIIRS fire detection inside this buffer, then
(category two); otherwise,
(category three). Subsequently, the following sums are computed:
Next, the total number of GOES FDC false alarms is computed:
Finally, the percentages of these three categories are computed as follows:
Equation (
6) provides the percentage of FDC fire detections that are misaligned with those of VIIRS.
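The three-way categorization can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the function name `misalignment_prevalence` and the category labels are hypothetical, and the buffer is taken to be a square window of side `2 * half_width + 1` pixels, with `half_width=1` chosen only for the example.

```python
import numpy as np


def misalignment_prevalence(fdc_fire, viirs_count, half_width=1):
    """Percentages of FDC detections that are aligned, misaligned, or false alarms.

    fdc_fire: boolean FDC fire image; viirs_count: rasterized VIIRS counts
    on the same grid. The window size is an assumed, user-chosen parameter.
    """
    fdc_fire = np.asarray(fdc_fire, dtype=bool)
    has_viirs = np.asarray(viirs_count) > 0
    h = half_width
    # Zero-padding lets the window be sliced near the image border.
    padded = np.pad(has_viirs, h, constant_values=False)
    n_aligned = n_misaligned = 0
    fire_rows, fire_cols = np.nonzero(fdc_fire)
    for r, c in zip(fire_rows, fire_cols):
        if has_viirs[r, c]:
            n_aligned += 1      # category one: VIIRS fire inside the FDC pixel
        elif padded[r:r + 2 * h + 1, c:c + 2 * h + 1].any():
            n_misaligned += 1   # category two: VIIRS fire only inside the buffer
    m = len(fire_rows)
    n_false = m - n_aligned - n_misaligned  # category three: no VIIRS fire nearby
    if m == 0:
        return {"aligned": 0.0, "misaligned": 0.0, "false_alarm": 0.0}
    return {
        "aligned": 100.0 * n_aligned / m,
        "misaligned": 100.0 * n_misaligned / m,
        "false_alarm": 100.0 * n_false / m,
    }


# Toy 5 x 5 example with four FDC detections.
fdc = np.zeros((5, 5), dtype=bool)
fdc[0, 0] = fdc[2, 2] = fdc[4, 4] = fdc[0, 4] = True
viirs = np.zeros((5, 5), dtype=int)
viirs[0, 0] = 3   # coincides with an FDC detection
viirs[2, 3] = 1   # one pixel east of the FDC detection at (2, 2)
result = misalignment_prevalence(fdc, viirs, half_width=1)
```

In the toy input, one detection is aligned, one is misaligned by a single pixel, and two have no VIIRS detection within the window, giving 25%, 25%, and 50%.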
2.7. Misalignment-Aware GOES FDC Fire Evaluation
This section describes a misalignment-aware methodology for evaluating the accuracy of a GEO fire detection algorithm using LEO-based detections as ground truth. While this study applies the evaluation to the GOES-16/17 FDC product and VIIRS, the methodology is flexible and can be applied to other GEO/VIIRS combinations. The corresponding Python code is publicly available at the following GitHub repository (https://github.com/asafvanunu/GEO_LEO_evaluation, accessed on 1 March 2026). As described in Section 2.2, the assumed input is a dataset of GEO/VIIRS images of the same region, observed at matching times (within a ±5-min difference).
The first step is to use the VIIRS detections to construct a ground-truth reference of FDC fire pixel locations for each FDC image in the dataset. As discussed in several works [13,18,22], due to its higher spatial resolution, VIIRS can detect small fires much more accurately than the GOES FDC. Hence, when evaluating the GOES FDC product, one user-defined parameter is the threshold t. This threshold specifies the minimum number of VIIRS fire detections required within an FDC pixel for it to be labeled as a fire pixel. Indeed, one quantity of interest in evaluating the GOES FDC output is its probability of detecting a fire as a function of the fire’s area, as measured by the number of VIIRS detections of the fire; see, for example, Li et al. [13].
The fact that there may be spatial misalignments between the fire locations of the FDC product and VIIRS makes this evaluation not straightforward. Suppose, for example, that we wish to estimate this probability with at least
VIIRS detections. Due to spatial misalignments, the VIIRS detections may be spread over two adjacent FDC pixels. This is illustrated in the bottom right corner of
Figure 7 (left panel), where 6 VIIRS detections are spread with 4 in one FDC pixel and 2 in an adjacent one. To account for this issue, the following non-maximal suppression process is applied to the output image of the rasterize function in
Section 2.5. We note that non-maximal suppression is a family of techniques widely used in low-level computer vision tasks, particularly to handle sub-pixel misalignments [33].
In our setting, let
be the number of VIIRS fires inside an FDC pixel in a given location, also denoted as a cell. Let
t be the user-chosen threshold for the number of VIIRS detections. A cell with a value
is labeled as a fire pixel. To address sub-pixel misalignments, if a cell has a value
and a neighboring cell
has fewer but non-zero number of detections
, then the value at the original cell is updated to their sum
. If
, the cell is labeled as a fire pixel. If two neighboring cells have the same number of VIIRS detections,
, and
, they are both labeled as fire pixels. Such duplicated fire pixels are treated in post-processing to ensure no double-counting when calculating the omission error rate. The precise details of this non-maximal suppression step are described in
Appendix D. The rasterized image of VIIRS fire detections after non-maximal suppression is considered as the reference ground truth.
Figure 7 illustrates the process for a VIIRS rasterized image as input and a threshold
. With the non-maximal suppression procedure, the resulting output image contains eight fire pixels, whereas only three are present without a correction.
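The merging rule described above can be sketched as follows. The precise procedure is given in Appendix D, so this is only an approximation under explicit assumptions: neighbors are taken from the 8-connected neighborhood, a sub-threshold cell is merged with its largest non-zero neighbor that does not exceed it, and a cell is labeled as fire when its (merged) value is at least t. The function name `nms_label` and the threshold used in the example are hypothetical.

```python
import numpy as np


def nms_label(counts, t):
    """Label fire cells after merging sub-threshold counts with a neighbor.

    Returns (fire, merged, tie): the boolean fire image, the updated counts,
    and a flag for cells labeled as fire through an equal-valued neighbor.
    Assumptions: 8-neighborhood, largest eligible neighbor, 'value >= t'.
    """
    counts = np.asarray(counts)
    rows, cols = counts.shape
    merged = counts.copy()
    tie = np.zeros(counts.shape, dtype=bool)
    for r in range(rows):
        for c in range(cols):
            v = int(counts[r, c])
            if v == 0 or v >= t:
                continue  # empty cells and cells already above threshold are unchanged
            best = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        n = int(counts[rr, cc])
                        if 0 < n <= v:
                            best = max(best, n)
            if best:
                merged[r, c] = v + best
                # Equal neighbors both reach the sum, creating a duplicated pair.
                tie[r, c] = (best == v) and (v + best >= t)
    fire = merged >= t
    return fire, merged, tie


# Six detections split 4 + 2 over adjacent cells, with an assumed threshold of 5.
split = np.array([[0, 0, 0],
                  [0, 4, 2],
                  [0, 0, 0]])
fire, merged, tie = nms_label(split, t=5)
```

Here the cell holding four detections absorbs its neighbor's two and is labeled as fire, while the neighbor itself stays below the threshold. Neighbor lookups use the original counts, so the result does not depend on the order in which cells are visited.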
Next, true positive (TP) and false positive (FP) FDC pixels are identified, taking into account the potential FDC-VIIRS spatial misalignments.
To interpret the spatial extent of these buffer windows, it is necessary to consider the varying pixel size of the GOES ABI sensor. The nominal 2 km nadir resolution degrades as the view zenith angle increases [
34]. Consequently, the area covered by a specific buffer size varies significantly between the equatorial Amazonas region (∼2 km pixel size) and the higher-latitude regions of California-Oregon and Patagonia (∼4 km pixel size).
Table 4 details the approximate ground dimensions for each buffer window size used in this study.
Accordingly, a buffer window is applied around each FDC fire detection as follows. Let the FDC image be a binary image that equals 1 at pixel locations where the FDC product detects a fire and 0 otherwise. In addition, for a given threshold t, consider the VIIRS image obtained after the rasterization and non-maximal suppression steps. An FDC fire-labeled pixel is considered a TP if the buffer window around it contains a VIIRS fire-labeled pixel, and an FP if no VIIRS fire-labeled pixels are present inside the buffer window.
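A minimal sketch of this matching step is given below, assuming a square window of side `2 * half_width + 1` pixels and boolean input images on the same grid. The helper names `dilate` and `count_tp_fp` are hypothetical, and the dilation is written with plain NumPy slicing rather than any particular image-processing library.

```python
import numpy as np


def dilate(mask, half_width):
    """True wherever some True pixel of `mask` lies within the square window."""
    mask = np.asarray(mask, dtype=bool)
    h = half_width
    rows, cols = mask.shape
    padded = np.pad(mask, h, constant_values=False)
    out = np.zeros_like(mask)
    # OR together all shifted copies of the mask inside the window.
    for dr in range(2 * h + 1):
        for dc in range(2 * h + 1):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out


def count_tp_fp(fdc_fire, viirs_fire, half_width):
    """Count FDC fire pixels with (TP) and without (FP) a VIIRS fire pixel in the buffer."""
    fdc_fire = np.asarray(fdc_fire, dtype=bool)
    near_viirs = dilate(viirs_fire, half_width)
    tp = int(np.sum(fdc_fire & near_viirs))
    fp = int(np.sum(fdc_fire & ~near_viirs))
    return tp, fp


# Toy 4 x 4 example: one FDC detection sits next to a VIIRS fire pixel.
fdc = np.zeros((4, 4), dtype=bool)
fdc[0, 0] = fdc[2, 2] = True
viirs = np.zeros((4, 4), dtype=bool)
viirs[0, 1] = True
no_buffer = count_tp_fp(fdc, viirs, half_width=0)
with_buffer = count_tp_fp(fdc, viirs, half_width=1)
```

Without a buffer both detections count as false positives, whereas a one-pixel buffer turns the detection adjacent to the VIIRS fire pixel into a true positive.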
Finally, false negative (FN) events are computed as follows. If the buffer window around a VIIRS fire pixel does not contain an FDC fire-labeled pixel, the pixel at that location is considered an FN pixel, also known as an omission error. For this computation, an FDC pixel is considered as fire if it belongs to any of the fire mask codes in Table 1. Because the non-maximal suppression step aggregates detections from adjacent grid cells, it can cause an artificial duplication of fire pixels. Specifically, if two adjacent cells contain an identical number of VIIRS detections and their combined sum exceeds the user-defined threshold t, both cells are updated to the aggregated value. If these adjacent cells are missed by the GOES FDC product, counting them separately would artificially inflate the omission error rate. To avoid this double-counting, a fractional FN weight of 0.5 is assigned to each of these aggregated cells. Consequently, the two adjacent cells together contribute exactly one FN to the final evaluation metrics. The precise details of this correction step are provided in Appendix D. The remaining pixels are considered true negative (TN).
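The weighted FN count can be sketched as below. The exact correction is specified in Appendix D, so this is an approximation under assumptions: the buffer is a square window of side `2 * half_width + 1` pixels, and `duplicated` is a boolean image (hypothetical name) flagging the cells that were labeled as fire through an equal-valued neighbor. The helper names `dilate` and `weighted_fn` are also hypothetical.

```python
import numpy as np


def dilate(mask, half_width):
    """True wherever some True pixel of `mask` lies within the square window."""
    mask = np.asarray(mask, dtype=bool)
    h = half_width
    rows, cols = mask.shape
    padded = np.pad(mask, h, constant_values=False)
    out = np.zeros_like(mask)
    for dr in range(2 * h + 1):
        for dc in range(2 * h + 1):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out


def weighted_fn(viirs_fire, fdc_any_fire, duplicated, half_width):
    """Sum FN weights over VIIRS fire pixels with no FDC fire pixel in the buffer.

    Duplicated cells weigh 0.5 each, so a missed pair counts as one FN.
    """
    viirs_fire = np.asarray(viirs_fire, dtype=bool)
    near_fdc = dilate(fdc_any_fire, half_width)
    missed = viirs_fire & ~near_fdc
    weights = np.where(np.asarray(duplicated, dtype=bool), 0.5, 1.0)
    return float(np.sum(weights[missed]))


# Toy 4 x 4 example: a duplicated pair at (0, 0)-(0, 1) and a single fire at (3, 3).
viirs = np.zeros((4, 4), dtype=bool)
viirs[0, 0] = viirs[0, 1] = viirs[3, 3] = True
dup = np.zeros((4, 4), dtype=bool)
dup[0, 0] = dup[0, 1] = True
fdc_none = np.zeros((4, 4), dtype=bool)
fdc_one = np.zeros((4, 4), dtype=bool)
fdc_one[3, 2] = True
fn_all_missed = weighted_fn(viirs, fdc_none, dup, half_width=1)
fn_one_found = weighted_fn(viirs, fdc_one, dup, half_width=1)
```

With no FDC detections, the duplicated pair contributes one FN and the isolated fire another, for a total of 2.0. Once an FDC detection appears next to the isolated fire, only the pair remains missed and the total drops to 1.0.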
The computation of FN, TP, and FP with a buffer window of size
is demonstrated in
Figure 8. The red pixels annotated as GFP are detections made by the FDC fire product. In contrast, the green pixels annotated as VFP are VIIRS potential fire pixels after the non-maximal suppression step.
2.8. Accuracy Assessment
The accuracy of a GEO fire detection algorithm is quantified by its recall, precision, and F1 score. These are common quality measures used to assess classifiers. In the context of fire detection, recall quantifies the ability of a GEO fire algorithm to detect the ground truth fire pixels. Precision measures the fraction of actual fire pixels out of all those declared as fire by the GEO algorithm. The F1 score is the harmonic mean of precision and recall. Each accuracy measure is in the range [0,1], with higher values indicating better performance.
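Given the number of TP, FP, and FN pixels for a set of n pairs of FDC/VIIRS images, as computed in Section 2.7 above, the recall, precision, and F1 score are calculated as follows: First, the total number of TP, FN, and FP pixels across all images in the dataset is computed. Let k denote the index of a single image. The total values are
$$\mathrm{TP} = \sum_{k=1}^{n} \mathrm{TP}_k, \qquad \mathrm{FN} = \sum_{k=1}^{n} \mathrm{FN}_k, \qquad \mathrm{FP} = \sum_{k=1}^{n} \mathrm{FP}_k.$$
The accuracy measures are then estimated by
$$\mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad \mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$
In addition, via a bootstrap procedure [35], 95% confidence intervals are computed for each measure with 1500 repetitions.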
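As an illustration of this assessment step, the sketch below pools per-image counts into recall, precision, and F1, and then computes percentile bootstrap intervals. Two choices here are assumptions, since the text does not specify them: image pairs are used as the resampling unit, and the interval is taken from the empirical percentiles of the resampled scores. The function names and the toy counts are hypothetical.

```python
import random


def scores(counts):
    """Pool per-image (TP, FP, FN) counts and return recall, precision, and F1."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


def bootstrap_ci(counts, n_boot=1500, alpha=0.05, seed=0):
    """Percentile bootstrap intervals, resampling image pairs with replacement."""
    rng = random.Random(seed)
    n = len(counts)
    resampled = [scores(rng.choices(counts, k=n)) for _ in range(n_boot)]
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    intervals = {}
    for name in ("recall", "precision", "f1"):
        values = sorted(s[name] for s in resampled)
        intervals[name] = (values[lo_idx], values[hi_idx])
    return intervals


# Hypothetical per-image (TP, FP, FN) counts; FN may be fractional (0.5 weights).
toy_counts = [(8, 2, 2), (6, 1, 4), (10, 0, 1)]
point = scores(toy_counts)
ci = bootstrap_ci(toy_counts)
```

For the toy counts, the pooled totals are TP = 24, FP = 3, and FN = 7, so precision is 24/27 and recall is 24/31. Fixing the seed makes the intervals reproducible between runs.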
4. Discussion
In this study, VIIRS fire detections were used to evaluate the performance of the GOES-16/17 ABI FDC product. However, such an evaluation is not straightforward due to spatial misalignments between GEO and LEO fire detections. To address this issue, this study presents a scheme that incorporates a buffer window and a non-maximal suppression step to account for these misalignments. The scheme is generalizable and can be applied to other GEO/VIIRS sensor combinations, as most GEO systems share similar spatial, multispectral, and temporal characteristics. This study’s results demonstrate the importance of this approach in mitigating the impact of spatial misalignments.
To evaluate the GOES FDC performance, a large dataset of fire detections was constructed, covering three regions of different latitudes and containing approximately 157,000 VIIRS fire detections and 15,000 GOES FDC fire pixels (including processed and saturated categories). The evaluation process treats each pixel-level detection as an independent observation. While sequential detections may correspond to the same fire event across time, each acquisition is treated independently as it represents a new image taken by the satellite.
A key finding is that FDC-VIIRS misalignments occur in approximately 12% of the GOES FDC fire detections. This holds for both GOES satellites (GOES 16/17) and across latitudes and ecosystem types. To the best of our knowledge, no previous work has quantified this issue.
A second key finding is the importance of incorporating a buffer window around the FDC fire detections. For example, without a buffer, the precision of the GOES FDC in the three study regions is estimated to be in the range of 0.74 to 0.79. In contrast, with a buffer, the estimates are significantly higher, ranging from 0.85 to 0.93. In addition, implementing a buffer window yields significantly improved recall and F1 score estimates, effectively mitigating the impact of FDC-VIIRS misalignments. These findings underscore the need to account for misalignments to ensure reliable fire detection assessments.
The precision values remain relatively constant for buffer window sizes
and higher. This finding indicates that pixels falsely detected as fire by GOES FDC (without matching VIIRS detections) were indeed FP. The
buffer window achieves the highest precision, recall, and F1 score. However, applying such a large window increases the risk of matching unrelated fires. As shown in
Table 4, in high-latitude regions like California-Oregon and Patagonia, a
window covers a substantial area of approximately
km. Matching fires over such large distances likely introduces false associations between independent fire events. Therefore, a smaller window size is recommended to avoid such errors. Recall values are low when using a VIIRS fire detection threshold
. This is regardless of the buffer window size. These low values highlight the limitation of the GOES FDC to detect small fires.
In light of the findings above, the use of a spatial buffer window to evaluate GOES FDC performance is highly important. As discussed in Section 1, direct GEO/LEO pixel-to-pixel comparisons are inherently complicated by geometric and optical distortions. Standard ABI data currently lack terrain correction, meaning the parallax effect can induce apparent geolocation offsets of several kilometers in different regions [26,27]. Furthermore, the sensor’s PSF smears a sub-pixel fire’s radiant energy across adjacent pixels. The GOES FDC algorithm explicitly assumes a PSF where only 75% of the signal originates from the center field of view for the 3.9 μm band, and only 51% for the 11.2 μm band [5]. This dispersion is further exacerbated by the data remapping process. Therefore, a spatial buffer is required to mitigate spatial uncertainty and ensure an accurate assessment of the true detection capability of the GOES FDC product.
The large percentage of spatial misalignments in FDC/VIIRS image pairs has significant implications for the evaluation and training of machine learning-based GEO fire detection algorithms. Typically, VIIRS data are used as ground truth for training GEO-based machine learning fire detection algorithms [10,11,36]. However, to the best of our knowledge, most machine learning works to date have not considered misalignments between GEO and LEO sensors. Constructing a ground truth reference without taking the GEO/LEO misalignments into account introduces label noise into the training dataset. Standard training on a dataset with label noise often yields sub-optimal classifiers [37]. Hence, there is a need to account for spatial misalignments also in training machine learning models. Another interesting future direction is to extend this work beyond pixel-level evaluation. The methodology proposed in this study can be utilized for burned area analysis, using aggregated fire pixels as an estimation for the total burned area. This transition from pixel-level validation to event-level assessment would provide a more comprehensive understanding of fire dynamics over time.