About Validation-Comparison of Burned Area Products

This paper proposes a validation-comparison method for burned area (BA) products. The technique considers: (1) bootstrapping of scenes for validation-comparison and (2) permutation tests for validation. The research focuses on the tropical regions of Northern Hemisphere South America and Northern Hemisphere Africa and studies the accuracy of the BA products: MCD45, MCD64C5.1, MCD64C6, Fire CCI C4.1, and Fire CCI C5.0. The first and second parts consider methods based on random matrix theory for zone differentiation and multiple ancillary variables such as BA, the number of burned fragments, ecosystem type, land cover, and burned biomass. The first method studies the zone effect using bootstrapping of Riemannian, full Procrustes, and partial Procrustes distances. The second method explores the validation by using distance permutation tests under uncertainty. The results refer to Fire CCI 5.0 with the best BA description, followed by MCD64C6, MCD64C5.1, MCD45, and Fire CCI 4.1. It was also found that biomass, total BA, and the number of fragments affect the BA product accuracy.


Introduction
Burning of biomass is one of the main sources of aerosol and greenhouse gas emissions in the atmosphere [1]. This influences strongly carbon cycles and plant succession processes [2]. According to Giglio et al. [3], each year on average 348 million hectares of biomass are burned around the world. For the most part, this quantity is detected by means of satellite remote sensing in the optical region: red (620-670 nm), NIR (1230-1250 nm), and SWIR (2105-2155 nm) [4]. However, in geographical areas like the tropics, the identification of burning on this spectral zone is complex due to extensive cloud cover [5] and the size of the fires [6], probably due to the moisture content in the forests [7], and in the case of the savannas, the rapid movement (and short-lived) of fire and relatively low fuel levels [8], which limit the detection of fires. These elements have an impact on emissions estimates and on reports of burned area (BA). Chuvieco et al. [9] reported a greater amount of area burned in the world compared to those of Giglio et al. [3] and estimated that between 324 and 416 million hectares are burned annually around the world.
Tropical regions have the largest deposits of aboveground biomass [10][11][12]. They also involve a high percentage of the biomass burned in the world [13], contributing to the main concentrations of This work proposes a method for validation and comparison of five BA products with reference data by including the location, biomass, land cover, fire size, and the number of fires in 44 scenes of the tropics in Northern Hemisphere South America (NHSA) and Northern Hemisphere Africa (NHAF). The method includes two parts: (1) a comparison and validation of BA products by bootstrapping; (2) randomness and permutation tests for validation of BA products.

Study Area
Giglio et al. [42] divided the continents into regions of fire using a study of five temporal metrics as shown in Figure 2. In this paper we will focus on two regions: Northern Hemisphere South America (NHSA) and Northern Hemisphere Africa (NHAF). Both regions are highly affected by fires [43][44][45][46][47][48].  This work proposes a method for validation and comparison of five BA products with reference data by including the location, biomass, land cover, fire size, and the number of fires in 44 scenes of the tropics in Northern Hemisphere South America (NHSA) and Northern Hemisphere Africa (NHAF). The method includes two parts: (1) a comparison and validation of BA products by bootstrapping; (2) randomness and permutation tests for validation of BA products.

Study Area
Giglio et al. [42] divided the continents into regions of fire using a study of five temporal metrics as shown in Figure 2. In this paper we will focus on two regions: Northern Hemisphere South America (NHSA) and Northern Hemisphere Africa (NHAF). Both regions are highly affected by fires [43][44][45][46][47][48].
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 23 (a) fires of 1 pixel (b) A single fire of pixels (c) Separated fires that sum pixels This work proposes a method for validation and comparison of five BA products with reference data by including the location, biomass, land cover, fire size, and the number of fires in 44 scenes of the tropics in Northern Hemisphere South America (NHSA) and Northern Hemisphere Africa (NHAF). The method includes two parts: (1) a comparison and validation of BA products by bootstrapping; (2) randomness and permutation tests for validation of BA products.
Contributions about sample size and error for validation scenes can be found in Supplementary Material Section S1.1. Likewise, the location of sample scenes are described in Supplementary Material Section S1.2. This information was not included in the paper to avoid any confusion with the main Remote Sens. 2020, 12, 3972 4 of 23 purpose of validation-comparison of BA products. However, the addressed methods given in the Supplementary Material consider innovations that an interested reader can use in similar verification works. For example, they hold for quality validation of cartographic products and accuracy assessment of land cover products.

Reference Data
In 2009, Boschetti, et al. [24], defined a standard protocol that is used by the CEOS Cal-Val (attached to the CEOS-WGCV) for the process of validating the quality of the cartographic products. An adjustment to this protocol was implemented to restrict the fire perimeters for the years 2007-2008, a period with a significant increment of fires in NHSA and NHAF regions according to report emissions GFED model (Global Fire Emissions Database, see https://www.globalfiredata.org/). In Supplementary Material (SM) Section S1.3, we propose a method for processing the reference data with a hybrid method combining the BAMS model, based on LANDSAT 5 (TM) and 7 (ETM+) [50], with cloud masking, cloud shadows and water masks based on temporal stability proposed by Valencia et al. [40]. The last one was based on the LEDAPS model by Masek et al. [51]. The results were verified with a good level of confidence according to Claverie et al. [52].
The addressed hybrid method derived by Valencia et al. [37] is based on the fact that the LEDAPS cloud mask is more accurate than BAMS cloud mask (see SM Section S1.3). Moreover, it also considers a validation of the seeds obtained by the BAMS model, because some polygons can appear as burned areas due to changes in soil moisture.

BA Products Considered for Validation and Comparison
The BA products evaluated in this study (Table 1)  Generated from MODIS 500 m Collection 5 images, and information about thermal anomalies in the same sensor MCD64C5.1 [3] University of Maryland

MCD64A1 Collection 6
Generated from MODIS 500 m Collection 6 images, and information about thermal anomalies in the same sensor plus information from the VIIRS sensor MCD64C6 [4] University of Maryland

Ancillary Information
The analysis of this work includes quantitative and qualitative variables in order to describe the ecosystems and the BA phenomenon. We consider the spatial location of the LANDSAT scenes (WRS2 path and row), land cover, continent, biomass, average fire size, and number of burned fragments. Additionally, vegetation cartographic layers studied by Avitabile et al. [10], Olson et al. [54], and ESA [55] were also included in our research (see Table 2).

Statistical Methods for Validation and Comparison of BA Products
In the validation process, the error matrix and the associated statistics provide percentages of omission, commission, agreement and discrepancy and measurements of global accuracy (see SM Section S2 validation results using descriptive statistics of error matrices). According to Figure 1, the descriptive statistics of SM Section S2.1 (Equations (3)-(8)) do not include the topology and dynamics of the fire. In general, they cannot measure effects about location of the reference images, their interdependence and relationship with other important variables such as: number of burned fragments, amount of BA, type of ecosystem or amount of biomass.
This section provides two methods for validation and comparison of BA products based on random matrix theory, in order to consider some ancillary variables and georeferenced data. The exposition is organized as follows: Section 2.3.1 describes the perturbation model, then Section 2.3.2 defines the implemented distances, and Section 2.3.3 derives the two proposed methods of the paper.

Perturbation Model and Desirable Properties for a Matrix Distance in the Validation Context
First the perturbation model is described. In this setting the information of M variables (error matrix, land cover, continent, biomass, average fire size, and/or number of burned fragments) registered by reference data and the BA products in K scenes (location), can be storage in K × M matrices. In the model is assumed that there are much more scenes than variables (K M). Additionally, the LANDSAT information is considered as the reference data. Let L = (l i,j ) be the corresponding K × M matrix, storing the values provided by reference data. In this case l i,j corresponds to the information about the j-th variable in the i-th zone, where i = 1, . . . , K and j = 1, . . . , M. Reference data collected by LANDSAT means that L is a deterministic matrix instead of a random matrix.
Remote Sens. 2020, 12, 3972 6 of 23 Now, let P 1 , . . . , P q , the corresponding K × M matrices associated to the measurements registered by q amount of BA products in the same K scenes and M variables. In this case, the matrices P r are perturbations of L. Then, our model takes the form: where E r , r = 1, . . . , q, is a K × M perturbation matrix with unknown elements. The validation process uses distances between two matrices for searching the "best" product (P m ) closest to L, compared with the remaining BA products. In other words, the method finds the minimum value of d(P r , L), r = 1, . . . , q, where d(P r , L) denotes the distance of P r and L in certain space. Then, in our notation d(P m , L) < d(P s , L), for all s m.
For a validation context, and under the assumption that there are much more scenes than variables (K M), the distance d(P r , L) between the K × M matrices P r and L must satisfy the following two important properties: Property 1: The distance d(P r , L) strongly depends on the perturbations (small or large) of the reference data matrix L. In other words, let P u = L + E u a certain BA product; if the perturbation matrix E u tends to a null matrix, under a set tolerance given by the precission of the data, then d(P u , L) must be small. Otherwise, if E u is a big perturbation of L, then d(P u , L) must be large relative to the first case.
Property 2: The distance d(P r , L) is sensible to permutations of rows (scenes). Technically, let P u = L + E u and P v = L σ + E v , where L σ denotes the same matrix L but with permuted rows according to certain permutation σ. If the permuted scenes have similar information in all the variables, then d(P u , L) ≈ d(P v , L), otherwise the distances must be large, relative to the first case. This restriction is important because allows a georeferenced validation including the location and information of the variables of interest in such scenes.

Riemannian, Full and Partial Procrustes Distances: Definitions and Comparison
There are several important distances satisfying the above properties. Three of them emerging from the shape theory are the Riemannian distance, the full Procrustes distance, and the partial Procrustes distance [40,56]. The shape theory belongs to the spectral theory, an old and very popular technique in science. Such theory appears, for example, in eigenvalue problems, principal component analysis for clustering, or principal components regression for variable selection in linear models, etc. [36]. Recently, Riemannian distance has been used in experimental physics and chemistry studies in order to discriminate populations on plasma physics evolution [57], determination of equilibrium structures in nanoclusters [58], and spectroscopy shape analysis [59], showing the applicability to scientific studies.
Next, the Riemannian distance (RD), the full Procrustes distance (FPD), and the partial Procrustes distance (PPD) are defined according to [56].
The Riemannian distance d(P r , L) between any BA product, represented by the matrix P r and reference data, represented by L is an angle in radians ranged from zero (equality between L and P r ) to π 2 (maximum difference between L and P r ). Explicitly: where λ 1 ≥ λ 2 ≥ · · · ≥ λ N−1 ≥ |λ N | are the square roots of the eigenvalues of the matrix λ N < 0 if and only if P r I K − 1 k 1 K 1 K 1 K L < 0. For notations in these expressions, A and B represent matrices of adequate orders, where A and |A| denote the transpose and the determinant of the matrix A, respectively; meanwhile I K and 1 K , represents the identity matrix of order K and a column vector of A second useful comparison is the full Procrustes distance. It oscillates between 0 and 1, and is given by: The third comparison function used in this paper is the partial Procrustes distance which ranges from 0 to √ 2, and is given by: Thus, the validation with reference data (LANDSAT) of q BA products are summarized in q angles which can be order for establishing hierarchies of the products. Meanwhile, the full Procrustes distance is the Sine of the Riemannian distance; the partial Procrustes distance is the duplication of the sine of the half Riemannian distance. As shown in Figure 3, the trigonometrical relation between distances the full (Equation (3)) and partial Procrustes (Equation (4)) respect to the Riemannian distance are very similar for small angles (<0.90 approximately). It means that they behave similarly in presence of BA products near to reference data. Our interest pursues the same conclusion about the best product using the three distances. When the products are very different from reference data, the Riemannian, full, and partial Procrustes distances are near to π 2 , 1 or √ 2, respectively, and this case is not of interest because the randomness of the measure exceeds the expertise of the product. A second useful comparison is the full Procrustes distance. It oscillates between 0 and 1, and is given by: The third comparison function used in this paper is the partial Procrustes distance which ranges from 0 to √2 , and is given by: Thus, the validation with reference data (LANDSAT) of BA products are summarized in angles which can be order for establishing hierarchies of the products. Meanwhile, the full Procrustes distance is the Sine of the Riemannian distance; the partial Procrustes distance is the duplication of the sine of the half Riemannian distance. As shown in Figure 3, the trigonometrical relation between distances the full (Equation (3)) and partial Procrustes (Equation (4)) respect to the Riemannian distance are very similar for small angles (<0.90 approximately). It means that they behave similarly in presence of BA products near to reference data. Our interest pursues the same conclusion about the best product using the three distances. When the products are very different from reference data, the Riemannian, full, and partial Procrustes distances are near to , 1 or √2, respectively, and this case is not of interest because the randomness of the measure exceeds the expertise of the product. The three distances behave similar in presence of BA products close to reference data (LANDSAT).  SM Section S3 shows a verification of the addressed properties by a simulation of the perturbation model. It shows the behavior of the three distances around the referred Properties 1 and 2 described Remote Sens. 2020, 12, 3972 8 of 23 in Section 2.3.1. Several simulations of the perturbation model were implemented in order to clarify each property.

Proposed Methods Based on Distances of Matrices
Following SM Section S3, we could continue with a number of simulations by changing the distributions in E r , and checking the behavior of the Riemannian, full, and partial Procrustes distances d(L, P r ), but we do not know the true distributions of the perturbations. Instead, we use a method with the same power of simulations but considering resampling of the data that we really know. The bootstrap technique learns from the data and provides empirical distributions for the perturbations and the matrix distances d(L, P r ), as previously used in other contexts of remote sensing validation [40].

First Method: Comparison and Validation of BA Products by Bootstrap of Riemannian, Full or Partial Procrustes Distances
A simple comparison among the products is obtained by ordering the distances of all possible pairs of the five matrices related to each BA product: MCD64C6, MCD64C5, MCD45, Fire-CCI 4.1 (MERIS) and Fire-CCI 5.0 (MODIS). Then, we can establish similar or different performance according to the variables included in the columns. Now, a notorious discrepancy of two products can be explained from different sources. This can be attributed to dissimilarity in a few scenes or numerous scenes. One of the techniques refers to a more complex study of bootstrap exploring the distributions of the distances.
The method for validation works as follows: (1) Consider q = 5 real BA products measure in the same scenes and variables of reference data. For each P r , r = 1, . . . , q, we extract a sample with replacement of the K scenes in the M variables of interest and permute the rows of the product P r , according that resample. (2) Let P r,1 be the new resample product by rows.
(3) Compute d(L, P r,1 ). (4) We repeat t = 1,000,000 times the above steps, then we obtain d(L, P r,1 ), . . . , d(L, P r,t ), and we can find the empirical probability density function f (x) of the Riemannian, Full or Partial Procrustes distances, under resampling of the rows. Then, inference and probability computations can be performed by using the real Riemannian or Procrustes distances d(L, P r ).
The above method can also be used for comparison among the products. In this case we find empirical distributions for all the possible pairs of BA products P s , P r , r = 1, . . . , q; s = 1, . . . , q; r s.
The bootstrap method is carried out under resampling of the rows (scenes) allowing big differences (or similarities) without removing extreme data. If the discrepancy corresponds to few scenes, the resample tends to repeat equalities with similar performance, and there are more small differences than large ones. Thus, the distribution tends to be leptokurtic (unimodal, peaked, and narrow) and biased to the right, because the extremal differences are rare, then one can conclude that both BA products are statistically similar in most validation scenes, and a confidence interval of 95% and median estimation can be provided for the explanations and conclusions. Otherwise, when the discrepancy obeys several scenes, the bootstrap method tends to replace concordance scenes with extremely different performances and the distribution of the distance tends to be spread (platykurtic) and biased to the left. This indicates that both products are statistically different, and the addressed interval and median locations can support that conclusion. In the same way, it can explain the behavior of two products with a Riemannian, full, or partial Procrustes distances close to zero and the expected empirical distribution via bootstrap method. With the same philosophy, given that LANDSAT provides the reference data, a comparison with each BA product by using the bootstrap method reflects the statistical validation of the corresponding product according to the variables and scenes of interest. Certainly, the selection of well-defined variables, summarizing information of the local and proper characteristics of the BA, provides robust comparisons among the BA products and reference data validation. When a variable does not represent the field truth, the bootstrap method does not separate properly the corresponding performance. It is important to highlight that the bootstrap method is not constrained by Gaussian distributions, and is thus a robust statistical technique flexible to the distance distribution emerged from each pair under comparison or validation.

Second Method: Randomness and Permutation Tests for Validation of BA Products
The bootstrapping under zone resampling is a method for comparison and validation without deleting extreme discrepancies and similarity. Now, as a complementary analysis, one can focus on the implicit randomness of the measures by using a recent powerful statistical technique called permutation test [36]. The method explores the randomness in the measures of the BA products, in the scenes and ancillary variables under consideration. Namely, how much of the collected data of a BA product obeys the randomness than the expert theory behind the remote sensing technology.
The method is described as follows: (1) Consider a BA product P r , r = 1, . . . , q and the usual reference data matrix L.
(2) Compute the true Riemannian, full, or partial distance d(L, P r ).
(3) Let P σ 1 the same matrix P but with permuted rows according to certain permutation σ 1 .
(4) Compute d(L, P σ 1 ). (5) Repeat t = 100.000 times the steps (3) and (4). Then find the random Riemannian, Full or Partial Procrustes distances d(L, P σ 1 ), . . . , d(L, P σ t ) are obtained. Thus, the associated empirical probability density function g(x) of the Riemannian, full, or partial Procrustes distances are found. (6) Compute the probability p that a random product reaches the true distance, i.e., P(x < d(L, P r )) = p. (7) Set the significant level of the permutation test as α = 0.01. If p < α, then the permutation test rejects the hypothesis that the collected data by product P r are generated by randomness. Otherwise if p α the permutation test concludes that the data registered by P r are widely random and the true Riemannian, full, or partial Procrustes distances with L can be obtained easily by a non expert technology. (8) Repeat the algorithm for all the products P r , r = 1, . . . , q. The smallest p-value provides the "best" product according to the Property 1 of the Riemannian, full, or partial Procrustes distance described in the perturbation model P u = L + E u of Section 2.3.1.

Sample Size and Location of Sample Scenes
The application of SM Sections S1.1 and S1.2 provides 44 areas of validation on both continents: 17 in NHSA and 27 in NHAF. These scenes are marked by the black boxes in Figure 4a,b. The procedure and the results to calculate the sample size, provide identification of the validation scenes, and support the processing for the generation of the reference information can be found in SM Section S1.

First Method: Comparison and Validation of BA Products via Riemannian, Full and Partial Procrustes Distance Bootstraping
Now, we focus on the problem of comparison and validation of BA products by bootstrap of the Riemannian, full, and partial distances. Figure 5. Best global product using Bootstrap validation processes via matrix distances: (a) Riemannian, (b) Full Procrustes, (c) Partial Procrustes. The functions represent the smoothing of the frequency histogram of the three distances. In all cases, the magenta color (L-F) distribution ratifies that the Fire CCI 5.0 product is very near to the reference data (LANDSAT images). This distribution has a leptokurtic behavior, and the mean is also closer to the zero value. was obtained by applying the method of Section 2.3.3 (First Method). It reports the empirical probability density functions and 95% confidence intervals of the Riemannian distance bootstraping between the five pairs of validations BA products and reference data. In this case, we have considered the biomass sum, total BA, number of fires, P11, P12, P21, and P22 in the 44 scenes. Here, P11 is the proportion of BA according to both the product and the reference validation data (LANDSAT), P12 is the commission error area proportion, P21 is the omission error area proportion and P22 is the unburned area proportion according to both the product and the reference validation data (See SM Sections S2 and S2.1).

First Method: Comparison and Validation of BA Products via Riemannian, Full and Partial Procrustes Distance Bootstraping
Now, we focus on the problem of comparison and validation of BA products by bootstrap of the Riemannian, full, and partial distances. Figure 5. Best global product using Bootstrap validation processes via matrix distances: (a) Riemannian, (b) Full Procrustes, (c) Partial Procrustes. The functions represent the smoothing of the frequency histogram of the three distances. In all cases, the magenta color (L-F) distribution ratifies that the Fire CCI 5.0 product is very near to the reference data (LANDSAT images). This distribution has a leptokurtic behavior, and the mean is also closer to the zero value. was obtained by applying the method of Section 2.3.3 (First Method). It reports the empirical probability density functions and 95% confidence intervals of the Riemannian distance bootstraping between the five pairs of validations BA products and reference data. In this case, we have considered the biomass sum, total BA, number of fires, P11, P12, P21, and P22 in the 44 scenes. Here, P11 is the proportion of BA according to both the product and the reference validation data (LANDSAT), P12 is the commission error area proportion, P21 is the omission error area proportion Each pair in the figure is shortly denoted, for example, L-F, corresponds to the density of all the Riemannian distances of reference data with the Fire-CCI 5.0 product under the 1,000,000 resampling experiment.
Note that similar validation conclusions of Riemannian distance (Figure 5a) are reached by using the full Procrustes distance (Figure 5b) and the partial Procrustes distance (Figure 5c).
These results suggest that the same method, based on Riemannian (Full or Partial Procrustes) distance bootstrapping, can be used for comparison and validation of several variables and subsets of the scenes. For example, we considered the validation and comparison of BA products with ecosystem variables, including biomass on each land cover, etc., and the most important results can be seen in SM Section S4 Table S4-1. SM Section S4 Table S4-1 gives the results of 1050 simulations under several combinations of variables and regions. Each combination involves 10 pairs for comparison among the products and five pairs for validation with reference data. The selected variables include: Total BA, four components of error matrix, biomass sum, number of fires, fire size mean, three land cover types and two BIOME Olson types. Each simulation summarizes an intensive computational Riemannian distance bootstrapping of 100,000 resamples with replacement of the scenes. For space reasons, instead of providing the 1050 distribution functions as depicted in Figure 5. Best global product using Bootstrap validation processes via matrix distances: (a) Riemannian, (b) Full Procrustes, (c) Partial Procrustes. The functions represent the smoothing of the frequency histogram of the three distances. In all cases, the magenta color (L-F) distribution ratifies that the Fire CCI 5.0 product is very near to the reference data (LANDSAT images). This distribution has a leptokurtic behavior, and the mean is also closer to the zero value. a, we report the Riemannian distance mean of the corresponding empirical probability density function for each possible pair. The mean explains sufficiently the comparison of validation. A reported mean near to zero reflects equality for the pair because the distribution tends to be leptokurtic. Otherwise, a large mean is explained by highly platykurtic distribution due to strong differences in the leverage of the corresponding scenes of the pair. Note that the reported means for validation also define an order for the product performance. Similar computations for full and partial Procrutes distances were not included in the SM but they arrived at the same conclusions. Note that similar validation conclusions of Riemannian distance (Figure 5a) are reached by using the full Procrustes distance (Figure 5b) and the partial Procrustes distance (Figure 5c).
These results suggest that the same method, based on Riemannian (Full or Partial Procrustes) distance bootstrapping, can be used for comparison and validation of several variables and subsets of the scenes. For example, we considered the validation and comparison of BA products with ecosystem variables, including biomass on each land cover, etc., and the most important results can be seen in SM Section S4 Table S4-1. SM Section S4 Table S4-1 gives the results of 1050 simulations under several combinations of variables and regions. Each combination involves 10 pairs for comparison among the products and five pairs for validation with reference data. The selected variables include: Total BA, four components of error matrix, biomass sum, number of fires, fire size mean, three land cover types and two BIOME Olson types. Each simulation summarizes an intensive computational Riemannian distance bootstrapping of 100,000 resamples with replacement of the scenes. For space reasons, instead of providing the 1050 distribution functions as depicted in Figure 5. Best global product using Bootstrap validation processes via matrix distances: (a) Riemannian, (b) Full Procrustes, (c) Partial Procrustes. The functions represent the smoothing of the frequency histogram of the three distances. In all cases, the magenta color (L-F) distribution ratifies that the Fire CCI 5.0 product is very near to the reference data (LANDSAT images). This distribution has a leptokurtic behavior, and the mean is also closer to the zero value. a, we report the Riemannian distance mean of the corresponding empirical probability density function for each possible pair. The mean explains sufficiently the comparison of validation. A reported mean near to zero reflects equality for the pair because the distribution tends to be leptokurtic. Otherwise, a large mean is explained by highly platykurtic distribution due to strong differences in the leverage of the corresponding scenes of the pair. Note that the reported means for validation also define an order for the product performance. Similar computations for full and partial Procrutes distances were not included in the SM but they arrived at the same conclusions.

Second Method: Randomness and Permutation Tests for Validation of BA Products
In this section, we have applied the method of Section 2.3.3 (Second Method) for measuring the implicit randomness of the validation under different scenarios. Again, we present only the results about the Riemannian distance. The full and partial Procrustes distance gave similar results.
The Figure 6a-f depicts the empirical probability density function of the Riemannian distances associated to 100,000 random permutations of all the 44 scenes under study. Here. P1, P2, P3, P4, P5, and L stand for MCD64C5.1, MCD64C6, MCD45, Fire CCI 4.1, Fire CCI 5.0, and reference data. Also, we use the notation P11, P12, P21, and P22 for the elements of the error matrix. SB, TBA, and NF denote the sum of biomass, total BA, and number of fires, respectively. First, consider validation processes based on error matrices (Figure 6a). According to the discussion Figure 6a, the permutation tests fails for all the products, because the probability behind the observed true Riemannian distance with reference data is near to 1. However, once the sum of biomass and number of fires are considered, the permutation tests changes drastically by rejecting the null hypothesis. Then an order for the BA products can be established in terms of small p-values, see Figure 6b. Now, Figure 6c preserves a similar conclusion about the best two products, under the number of fires and total BA. When the sum of biomass and total BA are considered, both permutation tests of Figure 6c,d arrive at the same order for the BA products. Finally, Figure 6e shows the expected unified conclusion of Figure 6b-d for the three ancillary variables in the permutation tests. Note that the order, under a 0.0000 p-value and small distances, is preserved in the three referred combinations. The best product is Fire CCI 5.0, followed by MCD64C6. Observe that Figure 6f is consequent with Figure 6a  6b-d for the three ancillary variables in the permutation tests. Note that the order, under a 0.0000 p-value and small distances, is preserved in the three referred combinations. The best product is Fire CCI 5.0, followed by MCD64C6. Observe that Figure 6f is consequent with Figure 6a,e. The inclusion of error matrix does not represent a significant improvement of the permutation tests under sum of biomass, total BA, and number of fires. The order of the performances preserved the conclusion of Figure 6e.  A similar conclusion to Figure 6e is obtained for the full Procrustes distance under the variables sum of biomass, total BA, and number of fires (see Figure 7a). Figure 7b shows a similar behavior of the permutation test of partial Procrustes distance under the same variables.
The results about the full and partial Procrustes distances are very similar. We highlight the best representative tests in Figure 6e.
The validation analysis is complemented with several permutation tests under different variables and continents. In order to include morphological and ecosystem information of the fires, we will focus on SB, TBA, and NF, but we will present the error matrix (P11, P22, P21, P12), (SB, NF). and (TBA, NF) for correctness (see Table 3.).
Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 23 A similar conclusion to Figure 6e is obtained for the full Procrustes distance under the variables sum of biomass, total BA, and number of fires (see Figure 7a). Figure 7b shows a similar behavior of the permutation test of partial Procrustes distance under the same variables. The results about the full and partial Procrustes distances are very similar. We highlight the best representative tests in Figure 6e.
The validation analysis is complemented with several permutation tests under different variables and continents. In order to include morphological and ecosystem information of the fires, we will focus on SB, TBA, and NF, but we will present the error matrix (P11, P22, P21, P12), (SB, NF). and (TBA,NF) for correctness (see Table 3.).
Explicitly, Table 3 and SM Section S5 (Table S5-1) give 260 permutation tests for validation of BA products under several combinations of variables and regions. The combinations include: validation scenes, three ecosystem types, and two BIOME Olson type. Each cell of the table collects the p-value of the Riemannian distance permutation test associated with an intensive computational exercise of 100.000 permutations of the scenes of the product. The p-value reports the probability that a non-expert product ruled by uncertainty can reach the same precision of the observed value in the Explicitly, Table 3 and SM Section S5 (Table S5-1) give 260 permutation tests for validation of BA products under several combinations of variables and regions. The combinations include: validation scenes, three ecosystem types, and two BIOME Olson type. Each cell of the table collects the p-value of the Riemannian distance permutation test associated with an intensive computational exercise of 100.000 permutations of the scenes of the product. The p-value reports the probability that a non-expert product ruled by uncertainty can reach the same precision of the observed value in the validation scenes. The statistics of each test is given by the observed Riemannian distance between the BA product and reference data.
Certainly, there are several combinations of variables and scenes that can be considering for permutation tests and the corresponding conclusions can promote a future work. SM Section S5 shows some of those combinations involving continent, ecosystem type and Olson's biomes. Table 3. p-values of permutation test for validation under different scenarios and variables. The statistic of the test is the true Riemannian distance (in parenthesis) between the product and reference matrices. Note that the true Riemannian distances near to zero are associated with p-values close to zero. Rejecting the permutation test with such small p-value means that a random generator cannot obtain the same observed values for the product. The opposite conclusion can be addressed when true Riemannian distance is very far from zero, then the permutation test, with a p value near to 1, cannot be rejected. It means that the data measured by the product can be easily obtained by a random generator, without any technological expertise. P-values of the table also serves for ordering the product respect the performance under the variables of interest. The tests also allow ordering among statistical significant groups of variables.

Discussion and Assessment of Findings
The results of the paper have verified that comparison of BA products and their validation with reference data depend strongly of the quality of sampling, processing of the reference images, fire morphology, ecosystem variables, and constraints of the statistical models. Each issue can promote a deep investigation. However, we tried to propose a method with new insights of each one.

Sample Size Simulation, Zone Allocation and Descriptive Statistic Validation
For completeness and relations with traditional analysis, the SM Sections S1, S1.1-S1.3, S2, and S2.1 provide a number of results about information processing, sample size, validation zone allocation, and descriptive statistics with error matrix.
In both continents, the selected sample size was in agreement with the expert recommendation of Boschetti et al. [16].
Descriptive statistics of the error matrices given in SM Section S2 show high omission and commission errors. Then, it is impossible to order the global performance of the products. Each analysis using that matrix is highly randomly explained. As the reader can check in Supplementary Figures  S2-1, S2-2, S2-3, and S2-4 and Supplementary Tables S2-1-2-5, the global behavior of the statistics is ruled by high randomness. As we have discussed, they lack of any geographical, morphological and ecosystem information for validation. The proposed methods include those variables and allow a classification for the products.
We have also noted that the classical method for a 500-m resolution in BA products is validated with LANDSAT images of 30-m resolution. Then, a significant underestimation appeared in the descriptive statistics Bias and RelBias, because small burned areas are not included. Although the Fire CCI Collection 4.1 product has 300 m of spatial resolution, a quality inherited from the MERIS sensor, a significantly lower temporal resolution was found compared to the data from MODIS. Using MODIS, information about the earth surface and atmospheric conditions can be obtained 2 times per day and twice at night, if we consider the swath its Terra and Aqua platforms over the course of a day. This also influenced the results.
By relating statistics like Oe, Ce, and RelB with the amount of burned area (in terms of LANDSAT burned pixels or reference data), it can be seen that all products have a tendency towards omission, and also that, in areas with a lower number of burned pixels, the products tend to omit 100% of the area that is actually burned.
The results obtained using traditional statistics indicated that high values for product accuracy when employing the global overall accuracy (OA) statistic are accompanied by little variability between the products analyzed. This reflects problems with the OA metric, as no one product is more representative than another. It can be observed that, in areas with few burnt pixels, all of the models used here are affected by the precision of the "unburned" class which causes an overestimation of the overall accuracy. Similar overvaluation using OA has been reported by References [2,6,30,60].

First Method: Comparison and Validation of BA Products via Riemannian, Full and Partial Procrustes Distance Bootstraping
The first contribution gives an alternative method for comparison of BA products and validation with reference data (LANDSAT images) by a stochastic modeling of the leverage of the scenes in each pair of interest. We have noted that storing the information of BA products and reference in a matrix can retain the dependency between the scenes and variables with respect to the Riemannian, full, or partial Procrustes distances. Given that the Riemannian distance is an angle (ranged from 0 • to 90 • ) in a hypersphere then we can define an order for the validation of the BA products. The Full and Partial Procrustes distances have also a simple interpretation because they are similar with the Riemannian distance for BA products near to reference data. In that case, the products are not governed by randomness and all the conclusions are unified. When a product is very far from the reference data, the three distances are large and the comparison among then is not of interest. The distances were sensitive to changes in the matrix entries. In the simulations and the empirical distributions with real data, we have noted strong discrepancies and similarities of products with reference data, but it depended on the variables and continent, such that a useful order could be obtained for each case. The spatial analysis in the matrix distance setting showed a certain advantage over the use of the addressed descriptive statistics of bias, Relbias, OE, CE, AB, etc.
We have found also that ancillary variables have improved statistically the validation procedure, (see Supplementary Table S4-1). The results for comparison between the products and validation with reference data show the performance of the proposed method under several variables.
In general, the validation techniques based on Riemannian distance bootstraping identified Fire CCI 5.0 as the most accurate product, according to the empirical probability density function and the 95% confidence interval of distances. The list continues with MCD64C6, MCD64C5.1, MCD45 and Fire-CCI 4.1, Figure 5. Best global product using Bootstrap validation processes via matrix distances: (a) Riemannian, (b) Full Procrustes, (c) Partial Procrustes. The functions represent the smoothing of the frequency histogram of the three distances. In all cases, the magenta color (L-F) distribution ratifies that the Fire CCI 5.0 product is very near to the reference data (LANDSAT images). This distribution has a leptokurtic behavior, and the mean is also closer to the zero value, and SM (Supplementary Table  S4-1 and S5-1). Note that the results for the validation of the BA products allow several classifications according to their performance by continents and ancillary variables. For example, in terms of the new variable total BA, the validation process refers to Fire CCI 5.0. The order continues with MCD64C6, MCD64C5.1, MCD45, and Fire CCI 4.1, (see Supplementary Table S4-1). However, if it is included the error matrix variables, all the products differ strongly from the reference values. Under the error matrix variables, if an order is required, despite the strong differences with reference data, the shape theory quotes that MCD45 is the nearest product to reference in NHSA. Then, the order follows with Fire-CCI 4.1, MCD64C6, MCD64C5.1, and Fire CCI 5.0. In NHAF, the selected variables maintain the discrepancies with reference data, but the order changes as follows: Fire CCI 5.0, MCD45, MCD64C5.1, MCD64C6, and Fire CCI 4.1 (see Supplementary Table S4-1).
If a new variable is incorporated, such as burned biomass, the Riemannian distance probability density function of Fire CCI 5.0 and reference data gets more leptokurtic, the mean tends to zero, and 95% confidence intervals are narrow. Meanwhile, the distribution of the remaining pairs tends to be more spread, multimodal, and platykurtic. If an order is required the list continues with: MCD64C6, MCD64C5.1, MCD45 and Fire CCI 4.1 (Supplementary Table S4-1). The same analysis by continent shows the following decreasing performance in NHSA: Fire CCI 5.0, MCD64C6, MCD64C5.1, Fire CCI 4.1, and MCD45 (see Supplementary Table S4-1). In NHAF, the results of the validation asses statistically that Fire CCI 5.0 is the nearest product of reference data. In fact, the remaining products are far from Fire CCI 5.0; the results propose the following decreasing order: Fire CCI 4.1, MCD64C6, MCD64C5.1, and MCD45 (see Supplementary Table S4-1).
The results agree with the technology behind Fire CCI 5.0 and the remaining BA products. When a product with different spatial resolution (250, 300, or 500 m) is validated with the reference, the spatial and thematic quality of the product influence the statistical results. In this case, BA product with less positional and thematic accuracy, incorporates more uncertainty.
The use of the shape theory also allowed for the inclusion of land cover variables and their relationship with the burned areas. It can be check that the cover in savannas was well detected by the five validated products, but Fire CCI 5.0 received the major statistical significance, followed by MCD64C6 (see SM S4 Table S4-1). Similarly, soils with crops mixed variables had a similar behavior for validation in NHAF. In fact, Fire CCI 5.0 and MCD products head the list of performance validation, but the results usually moved Fire CCI 4.1 Collection to the end. For the detection and quantification of fires in perennial forests, the statistical methods ratify the same conclusion: Fire CCI 5.0 is followed by similar MCD products and Fire CCI 4.1 is placed at the end. However, the statistical results in some scenes and variables give a wide gap when they are compared strictly with reference data.
The statistical methods also included validation of BA products under ecosystem variables. The results address a major statistical significance for Fire CCI 5.0 in savannas of NHSA; the remaining products share a similar detection resolution. Several validation scenes included the ecosystem type Sahela Acacia Savanna, the validation process provided a similar statistical significance to all the BA products. However, in the West Savanna ecosystem, the product Fire CCI 5.0 reached the main statistical significance, followed by both collections of MCD64; and according to the Riemannian distance bootstrapping the data collected by products MCD45 and Fire CCI 4.1 where located at the end of the statistical classification (see Supplementary Table S4-1).
Once the number of burned fragments is included in the validation, the collections of Fire CCI receives the main statistical significance in NHSA and NHAF. Similarly, if the burned biomass, then Fire CCI 5.0 resembles the reference data with the major statistical significance. Supplementary Table S4-1 also considers a combination of cover type variables with biomass, then the statistical results increase the significance for the BA products classification. Each one of the 1050 simulations enrich the analysis for comparison or validation and opens a future research.
When the Riemannian distance was replaced by Partial and Full Procrustes distances, the results were similar, then a robust conclusion about validation and comparison can be supported.

Second Method: Permutation Tests and Implicit Randomness of the Products in the Validation Process
We focused only in the validation process, but the technique can also be applied in comparison of products. The approach concerned the quantification of the randomness in the validation process. The permutation test of the Riemannian, full, or partial Procrustes distance of each pair product reference data suggested a measure of the randomness in the validation process under several combinations of variables and regions. It is based on the fact that a matching of expert data of a BA product with reference cannot be explained by uncertainty. It is a measure of the quality of the product and it is resistant to the usual Gaussian constraint. This second contribution of our study also provided several combinations of ecosystem variables with number of fires (related with fire morphology). It opens a perspective for validation beyond the general descriptive statistics of the proportions given in the error matrices.
The work also derived dozens of results which can be used for future validation routines. The addressed bootstrap with resampling of scenes and permutation tests for random explanation, consolidate two complementary methods for the validation of a BA product with reference data.
We highlight that 57 of the 65 permutation tests based on error matrices variables do not reject the null hypothesis of random explanation of the reference. Table 3 and Supplementary Table S5-1 show that the observed Riemannian distance (the value in parenthesis) between the BA product and reference data is large; then an arbitrary permutation of the scenes in the product can reach easily the same distance with the reference, and then the p-value of the test is high. Only a few pairs in ecosystem Sahelian Acacia Savanna rejected the null hypothesis with a 5% level.
However, if we use ancillary variables SB, TBA, NF, most of the permutation tests gives at least a product with low p-value (observed Riemannian distance near to zero) and a statistically significant order of the products can be given. For example, Fire CCI 5.0 resembled the reference data with a statistically significant level of 5% in most of the combinations. MCD64C6 was the second product with a meaningful statistical performance. When all the ecosystem and Biome Olson variables are included, the permutation tests of Table 3 provide a statistical significance for at least one product. Only the validation of products under the Sahelian Acacia Savanna ecosystem in NHAF cannot reject the corresponding null hypothesis.
Permutations test are useful for suggesting meaningful variables. If we consider the number of fires and total BA, the Riemannian distances of the products are closer to reference data (see Figure 6c). If we remove the strong randomness of P11, P22, P12, and P21 the differences among the products became clear and are agree with the results of the previous section based on bootstrapping) Figure 6b-f). The same results were obtained with the full and partial Procrustes distances, as shown in Figure 7a,b.
Finally, a few combinations in Table 3 and Supplementary Table S5-1 provide problems for the validation of all the products. Thus, a research concerning the detection of some ecosystem and Biome Olson types will open interesting perspectives in future.
We highlight that the results given in Sections 3.2 and 3.3 agree with simulations of the perturbation model given in SM Section S3.

Conclusions
This work proposed two methods for validation-comparison of BA products: (1) a Riemannian, full and partial Procrustes distance distribution bootstrapping was used for the comparison of BA products and their validation with reference data, and (2) a second technique for the validation of BA products was studied under Riemannian, full, and partial Procrustes distance permutation tests. The description of the Riemannian, full, and partial Procrustes distance was also examined under controlled simulations, and the real data verified the behavior under perturbations and row resample of the reference data matrix.
Each part of the method studied the literature context of the existing techniques. Part 1 studied the empirical probability density function of Riemannian, full, and partial distance of the matrices indexed by the validation scenes and new ancillary variables. Then, part 2 considered the implicit randomness of the products under permutation of the validation scenes. A list of hierarchies for BA products was obtained by the three distances under several combinations of ancillary variables, continents, land cover, and biome Olson type.
Error matrix and associated descriptive statistics are based on general proportions of omission, commission, agreement and disagreement. These proportions cannot describe the morphological, ancillary and location variables. They offer a local validation that hardly defines the best product when there are different validation scenes with morphological variables and wide ecosystems. Additionally, they do not include the spatial location of the validation scenes, eliminating the statistical richness provided by georeferenced systems. When the proportions of the error matrix are included together with the ancillary variables in our method, the proportions lose weight in the comparison-validation because the new descriptors are georeferenced and describe the morphological truth and the particular ecosystem.
Our methods suggested the inclusion of the scene location, the number of fires by pixel and several ancillary variables such as biomass, ecosystem type and biome Olson type. Parts (1) and (2) summarized the data in matrices collecting the scenes and variables, then the products and reference are compared by using three different distances. The distances behave equally under BA products near to the reference. When the products are far from the reference, the distances are large, and the randomness rules their performance and explanation of the truth. Part 1 measured the leverage of the validation scenes by studying the empirical probability density function of a bootstrap for the Riemannian, Full and Partial Procrustes distances. Then, the best product emerged from unimodal, peaked, and narrow distributions and an order for the performance can be proposed. Further, part 2 explored the validation under arbitrary permutations of the validation scenes with the same distances. A test for implicit randomness of the products can be performed. Then, a p-value near to zero refers a BA product with the best performance under the studied variables. Hundreds of intensive computational simulations were performed under several regions, ancillary variables, and distances. Both methods suggested that Fire CCI 5.0 has the major statistical significance under several combinations of ecosystem, biomass, BA, and number of fires. MCD64C6 led the statistical significance in some experiments. In general, all the computations have collected multiple similarities and differences of the BA products that can be explored in a future research.
Moreover, new insights involving the dynamics and connexity of fires will be part of a subsequent work. The methods for validation and comparison here derived and those mentioned in the Supplementary Material can be used in other contexts of remote sensing, such as the accuracy assessment of land cover products and quality validation of cartographic products.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/12/23/3972/s1, from S1 to S5, Figure S1 Omission and commission of BA products. Green squares at zero RelB% correspond to the reference area. The points represent the statistics of each validation zone. The RelB statistic of a product gives the tendency to omission (−) or commission (+), Figure S2-2. Patterns of omission and commission for each BA product. The size of each point represents the coincident BA (m 2 ) between the BA product and the reference. The points represent the statistics of each validation zone. The square in green shows the reference with commission/omission error free. Observe the extreme concentration of errors above the 50%, Figure S2-3. Comparison between omission and commission by continent. The points represent the statistics of each validation zone. Note again the high concentration of errors above the 50%., Figure S2-4. Representativeness of validated products according to the type of ecosystem. Ecosystem type reference information extracted from the Olson biomes. This graph shows the amount of total BA in km 2 detected by each of the validated products and their reference (LANDSAT). Note that in the NHSA the ecosystem "Llanos" is the most affected by burning. While in the NHAF the most affected ecosystem was "Northern Congolian forest savanna mosaic ", Figure S3-1. Simulated Riemannian distance of 93 products perturbed from matrix L from a sample of 10.000 non-central t-distribution of 2 degrees of freedom and non centrality parameter of 3% to 93%. (a) Property (1) of Riemannian distance, red, black and green symbols corresponds to minimum, maximum and median values. (b) Property (2) of Riemannian distance, Figure S3-2. Simulated Full Procrustes Distance of 93 products perturbed from matrix L from a sample of 10.000 non central t distribution of 2 degrees of freedom and non centrality parameter of 3% to 93%. (a) Property (1) of Full Procrustes Distance, red, black and green symbols corresponds to minimum, maximum and median values. (b) Property (2) of Full Procrustes Distance, Figure S3-3. Simulated Partial Procrustes distance of 93 products perturbed from matrix L from a sample of 10.000 non-central t-distribution of 2 degrees of freedom and non centrality parameter of 3% to 93%. (a) Property (1) of Partial Procrustes Distance, red, black and green symbols corresponds to minimum, maximum and median values. (b) Property (2) of Partial Procrustes Distance,