Classification of North Africa for Use as an Extended Pseudo Invariant Calibration Sites (EPICS) for Radiometric Calibration and Stability Monitoring of Optical Satellite Sensors

Pseudo invariant calibration sites (PICS) have been extensively used for the radiometric calibration and temporal stability monitoring of optical satellite sensors. Due to limited knowledge about the radiometric stability of North Africa, only a limited number of sites in the region are used for this purpose. This work presents an automated approach to classify North Africa for its potential use as an extended PICS (EPICS) covering vast portions of the continent. An unsupervised classification algorithm identified 19 “clusters” representing distinct land surface types was used; three clusters were identified with spatial uncertainties within approximately 5% in the shorter wavelength bands and 3% in the longer wavelength bands. A key advantage of the cluster approach is that large numbers of pixels are aggregated into contiguous homogeneous regions sufficiently distributed across the continent to allow multiple imaging opportunities per day, as opposed to imaging a typical PICS once during the sensor’s revisit period. This potential increase in temporal resolution could result in increased sensitivity for the quicker identification of changes in sensor response.


Introduction
For over 45 years, data from Earth observing satellite sensors have been used to increase the understanding of long-term global change. Thus, the quality of data produced by previous and currently operational sensors is a primary concern, as it critically depends on accurate radiometric calibration for each sensor. Accurate radiometric calibration also allows meaningful comparisons of data acquired by different sensors to be combined into a continuous record [1]. Radiometric calibration needs to be performed both before launch (to establish an initial operating state) and at regular intervals after launch (to account for degradations in response due to launch stresses, aging and effects relating to conditions in near-Earth orbit). Sensors often include onboard calibrators (e.g., solar diffuser panels, lamps, thermal blackbody radiators) intended to provide data for radiometric calibration and long-term stability monitoring.
Since an onboard calibrator experiences conditions similar to the sensor, its data will likely show similar degradation in radiometric response. Onboard calibrator systems can increase the design, building, and operating costs of a sensor and are thus not included in all satellites, especially those designed with shortened operating lifetimes. Alternatively, image data from locations on the earth's surface exhibiting minimal temporal change and other desirable properties have been identified and are increasingly used as a calibration data source; these sites are known as pseudo-invariant calibration the year; this class included all soils, rocks, and sands. A similar global land cover product was also generated for Terra MODIS using a supervised classification algorithm, with training datasets derived from higher resolution image data and ground-based surface reflectance measurements when available [28]. It also identified a generic "barren" class across North Africa.
To the present time, the "barren" classification is still used to group all soil, sand, and rock land cover types into a single land cover type, as most applications using land cover data do not require such a fine distinction. However, this "barren" land cover also includes spatially and temporally stable ground cover types that are highly desirable with respect to PICS-based radiometric calibration. Consequently, the "barren" land cover type identified in North Africa should be resolved into subclasses, which would allow the evaluation of their radiometric intensity and stability for potential use in calibration across wider portions of a sensor's dynamic range.

Current Approach for Extending PICS
Spectral properties of North Africa have been only studied superficially due to lack of its significance to the scientific community. As its majority of regions are stable, it should be studied more rigorously in order to utilize these regions for radiometric calibration of optical satellite sensors. So, the primary concern of this work is to characterize the temporal variability throughout all of North Africa from Egypt to the Atlantic Ocean (latitude: 36 • N to 15 • N and longitude: 18 • W to 35 • E longitude), rather than characterizing limited regions based on manual procedures for identifying specific "optimal" regions. Along with the temporal variability of North Africa, its spectral properties are also studied in order to develop North Africa as continental-scale EPICS. To perform this analysis, a dataset from a well calibrated sensor is necessary so Landsat 8 OLI images acquired over three years, from launch through March 2017, were used. Then, the temporally stable pixels of every image were classified according to the unique spectral response using a classification algorithm. Pixels with spectrally similar responses were then aggregated into continental scale regions. These regions were then considered suitable for future evaluation of their potential for sensor calibration and stability monitoring which will be described in a future paper.
These EPICS can be used for all the radiometric calibration purposes for which traditional PICS have been used such as long term trending, cross-calibration, and developing an absolute calibration model. The key advantage of the EPICS over traditional PICS is its spatial extent which provides a significantly large number of calibration data points than the traditional PICS. The increased number of calibration data points enables the ability to detect the sensor degradation quickly and with more sensitivity than the traditional PICS. It can also help to build the larger cross calibration data set contributing to achieve a cross calibration quality similar to that of individual PICS in a significantly shorter time interval. Similarly, EPICS based absolute calibration model can significantly increase the temporal resolution of the calibration opportunities to a daily or nearly daily basis. This paper is organized as follows. Section 1 provides a brief overview of the topic. Section 2 presents the materials and methods used in the analysis. Section 3 demonstrate the resulting cluster classification map of North Africa and its validation. Section 4 discusses different thresholds used in the classification of North Africa, cluster properties and potential directions for future research into this topic. Finally, Section 5 provides a brief summary of this work.

Google Earth Engine (GEE)
Google Earth Engine (GEE) is a cloud based platform which provides access to global geospatial datasets and significant processing power [29]. By using the GEE, a user can process petabytes of image data currently contained in the Landsat and other image archives. Images within GEE are stored as tiles in their original projections, spatial resolutions, and bit depth, and arranged into collections for each sensor; this avoids potential issues with degradation of data quality. The image tiles are also placed into a pyramid whose levels consist of the image data down sampled by a factor of two. Whenever users request any particular portion of an image, only the tile(s) from the appropriate pyramid level are retrieved, which significantly reduces the required computational time.

SDSU Derived Data Product
Landsat 8 OLI data were chosen for this work, as the analyses of Knight et al. [30] and Markham et al. [31] reported an OLI reflectance calibration accuracy of approximately 3%. The key spectral and spatial characteristics of the OLI are presented in Table 1. Through the use of the GEE system, all pixels of North Africa acquired by the OLI over the three-year period were screened for clouds/cloud shadows and saturated pixels. These data were spatially resampled from the native resolution of 30 × 30 meters to 300 × 300 meters. A temporal mean, standard deviation, and uncertainty (ratio of temporal standard deviation to temporal mean) were calculated for each pixel location, along with a count of the number of measurements that went into the statistics calculations. These summary statistics data were then "stacked" into data cubes and "tiled" into 1 • latitude by 1 • longitude images. The image and pixel sizes were chosen as a practical trade-off between the amount of data retrievable from GEE and the current processing and storage capabilities of the South Dakota State University (SDSU) Image Processing Laboratory systems. In addition, resampling the tiles to 300 m spatial resolution did not significantly degrade the information obtainable from them as much of North Africa is relatively spatially uniform.  2  492  60  30  Green  3  561  57  30  Red  4  654  37  30  NIR  5  865  28  30  Cirrus  9  1373  20  30  SWIR1  6  1609  85  30  SWIR2  7  2201  187  30  Panchromatic  8 590 172 15 Figure 1 shows the concept behind a processed data cube. It consists of 25 layers. Layers 1-8 are the temporal mean statistics image for each VNIR/SWIR multispectral band, ordered by band number starting with the Coastal/Aerosol band and ending with the Cirrus band. Layers 9-16 and 17-24 are the standard deviation and uncertainty images, respectively, again ordered by band number. The final layer consists of the number of resampled image pixels used to calculate the summary statistics. Remote Sens. 2019, 11, x FOR PEER REVIEW 5 of 18

Mosaic Image of North Africa
Once the processed data tiles were retrieved from GEE and processed locally into data cubes, the temporal mean images were mosaicked to generate composite images of North Africa. The resulting mosaics covered an area from 36°N to 15°N latitude and 18°W to 35°E longitude. In all, approximately 1003 data cubes were used to generate the North Africa mosaics. As part of the mosaicking process, each pixel in a temporal mean tile was screened for a temporal uncertainty of 5% or less, and sufficient resampled pixel count (25 or greater, as it represented approximately one third of the potential cloud-free acquisitions). Helder et al. [2] used a threshold of 3% temporal stability to identify the individual PICS. Since this work focuses on finding continental scale PICS, the temporal variability criteria was relaxed to 5% in order to account for extra variability likely arising due to the large spatial extent of the PICS.

Classification of North African Land Cover
Landcover maps identifying spectrally similar pixels can be generated through supervised or unsupervised classification algorithms. In supervised classifications (e.g. decision trees, maximum likelihood, and minimum distance [32,33]), reference "training" datasets are used to perform the classification. In unsupervised classifications (e.g. isodata and k-means [34]), standard image processing techniques are directly applied to the image data to perform the classification. There is no consensus regarding the superiority of supervised or unsupervised methods used by themselves; the suitability of the classification method depends on the available reference data. Other research [35,36] suggests that combination of supervised and unsupervised methods can provide more accurate land cover classification results.
This work performs, for the first time, a higher-resolution classification of land cover across all of North Africa. K-means unsupervised classification was chosen as the classification method due to i) its computational simplicity, which is critical when performing analyses at a continental scale [37,38]; and ii) the lack of prior knowledge of spectral information for that region. The K-means algorithm is presented below: • Step 1: Estimate initial mean values for K = 2 clusters:

Mosaic Image of North Africa
Once the processed data tiles were retrieved from GEE and processed locally into data cubes, the temporal mean images were mosaicked to generate composite images of North Africa. The resulting mosaics covered an area from 36 • N to 15 • N latitude and 18 • W to 35 • E longitude. In all, approximately 1003 data cubes were used to generate the North Africa mosaics. As part of the mosaicking process, each pixel in a temporal mean tile was screened for a temporal uncertainty of 5% or less, and sufficient resampled pixel count (25 or greater, as it represented approximately one third of the potential cloud-free acquisitions). Helder et al. [2] used a threshold of 3% temporal stability to identify the individual PICS. Since this work focuses on finding continental scale PICS, the temporal variability criteria was relaxed to 5% in order to account for extra variability likely arising due to the large spatial extent of the PICS.

Classification of North African Land Cover
Landcover maps identifying spectrally similar pixels can be generated through supervised or unsupervised classification algorithms. In supervised classifications (e.g., decision trees, maximum likelihood, and minimum distance [32,33]), reference "training" datasets are used to perform the classification. In unsupervised classifications (e.g., isodata and k-means [34]), standard image processing techniques are directly applied to the image data to perform the classification. There is no consensus regarding the superiority of supervised or unsupervised methods used by themselves; the suitability of the classification method depends on the available reference data. Other research [35,36] suggests that combination of supervised and unsupervised methods can provide more accurate land cover classification results.
This work performs, for the first time, a higher-resolution classification of land cover across all of North Africa. K-means unsupervised classification was chosen as the classification method due to (i) its computational simplicity, which is critical when performing analyses at a continental scale [37,38]; and (ii) the lack of prior knowledge of spectral information for that region. The K-means algorithm is presented below:

•
Step 1: Estimate initial mean values for K = 2 clusters: Step 2: Calculate the Euclidean distances between each mosaicked image pixel and initial cluster means: Step 3: Classify pixels based on the minimum distance to a cluster: Step 4: Calculate the new cluster mean: • Step 5: If max(|(µ ij − µ ijnew )|) > 0.0001 replace the old cluster mean with the new cluster mean calculated in Step 4, then return to Step 2. Otherwise, proceed to Step 6. • Step 6: Calculate the spatial uncertainty of all pixels within each cluster.

•
Step 7: If the maximum spatial uncertainty of any cluster is greater than 5%, increase the number of clusters by one and return to Step 1. Otherwise, terminate the algorithm.
K-means algorithm was initially "seeded" with cluster means represented by the temporal means of two randomly chosen pixels from the mosaicked images. The Euclidean distance between cluster means (two randomly chosen pixels) and each pixel of the mosaicked image pixels is calculated in Step 2. Third step classifies all the pixels of North Africa based on their minimum distance to the cluster's mean. The average of the new classified pixels is calculated and considered as the new cluster mean in Step 4. The difference between the current cluster mean and the new cluster mean is computed and if the difference is more than the threshold 0.0001 then current cluster mean is replaced by newly computed cluster mean and again start from Step 2. A mean difference threshold of 0.0001 was used to determine whether new clusters should be created; this threshold value was chosen to ensure that final convergence of the classification remained independent of the chosen starting pixels. For cluster mean differences of less than the difference threshold, the spatial uncertainties of the current set of clusters was calculated. If any cluster's spatial uncertainty exceeded 5% in any band, the current cluster count was incremented by 1, and the classification process was repeated; otherwise, the classification process terminated at the current cluster count. A 5% spatial uncertainty threshold was selected in order to balance aggregation within cluster regions (i.e., minimizing the creation of smaller cluster regions that are more widely spread) while ensuring lower spatial uncertainty. For this work, the classification process identified 19 distinct clusters. Figure 2 shows the resulting red/green/blue composite mosaicked image of North Africa after filtering for temporally unstable pixels and pixel count threshold. The white areas represent pixels Remote Sens. 2019, 11, 875 7 of 18 excluded due to temporal instability and pixels identified as having clouds/cloud shadows and/or small water bodies. The black pixels represent "fill" over regions beyond the African continent.  Figure 3 shows the 19 distinct clusters identified in the classification. All of the identified clusters have contiguous regions widely distributed across the continent, potentially allowing a much higher imaging frequency rather than an imaging frequency of several days for most moderate resolution sensors over a single PICS. Figure 4 shows a bar graph of the resulting pixel counts for each cluster. Among the clusters, Cluster 3 has the largest number of pixels (approximately 4.3 million). Cluster 13 (the light green graph in Figure 3 has approximately 3 million pixels. Although Cluster 3 has the largest pixel count, its pixels are more widely distributed in non-contiguous regions. The Cluster 13 pixels, on the other hand, are grouped into large contiguous regions widely distributed across North Africa, making this cluster a potential candidate for EPICS-based calibration.    Figure 3 shows the 19 distinct clusters identified in the classification. All of the identified clusters have contiguous regions widely distributed across the continent, potentially allowing a much higher imaging frequency rather than an imaging frequency of several days for most moderate resolution sensors over a single PICS. Figure 4 shows a bar graph of the resulting pixel counts for each cluster. Among the clusters, Cluster 3 has the largest number of pixels (approximately 4.3 million). Cluster 13 (the light green graph in Figure 3 has approximately 3 million pixels. Although Cluster 3 has the largest pixel count, its pixels are more widely distributed in non-contiguous regions. The Cluster 13 pixels, on the other hand, are grouped into large contiguous regions widely distributed across North Africa, making this cluster a potential candidate for EPICS-based calibration.    Figure 3 has approximately 3 million pixels. Although Cluster 3 has the largest pixel count, its pixels are more widely distributed in non-contiguous regions. The Cluster 13 pixels, on the other hand, are grouped into large contiguous regions widely distributed across North Africa, making this cluster a potential candidate for EPICS-based calibration.   Table 2 shows the final spatial uncertainties (ratio of spatial standard deviation to spatial mean) estimated for each cluster when the classification process was terminated. Due to more pronounced atmospheric scatter in shorter wavelengths, the spatial uncertainties are generally higher in the shorter wavelength bands (the exception being Cluster 4, which exhibited greater uncertainty in the SWIR 1 band). Despite its widespread extent, the Cluster 13 spatial uncertainties are within 5% in the Coastal/Aerosol and Blue bands, and within 3% in the longer wavelength bands. As a result, it should be quite useful for radiometric calibration and radiometric stability assessment.   Table 2 shows the final spatial uncertainties (ratio of spatial standard deviation to spatial mean) estimated for each cluster when the classification process was terminated. Due to more pronounced atmospheric scatter in shorter wavelengths, the spatial uncertainties are generally higher in the shorter wavelength bands (the exception being Cluster 4, which exhibited greater uncertainty in the SWIR 1 band). Despite its widespread extent, the Cluster 13 spatial uncertainties are within 5% in the Coastal/Aerosol and Blue bands, and within 3% in the longer wavelength bands. As a result, it should be quite useful for radiometric calibration and radiometric stability assessment.  Figure 5 shows the multispectral signature of each cluster represented by the mean TOA reflectance and ±1σ standard deviation.  Figure 5 shows the multispectral signature of each cluster represented by the mean TOA reflectance and ±1σ standard deviation. At shorter wavelengths, the surface cover is less reflective; the range of TOA reflectance is narrower, and the standard deviations of different clusters exhibit greater overlap. Conversely, the surface cover is more reflective at longer wavelengths; the range of TOA reflectance is wider, and the standard deviations of different clusters exhibit much less overlap, suggesting that the clusters represent distinct surface cover types.

Cluster Spectral Signatures
Cluster 4 (represented by the bottom reddish brown plot) has the lowest overall reflectance levels, while cluster 2 (represented by the top yellow plot) has the highest reflectance levels. This distribution allows sensor calibration across wider portions of the dynamic range, especially at longer wavelengths.

Comparison of Traditional PICS and Cluster 13 behaviour
As Cluster 13 spread across the continent, it also includes one of the most widely used traditional PICS i.e. Libya 4, so the temporal behavior of Cluster 13, using selected 16 paths and rows, is compared with Libya 4 temporal behavior as shown in Figure 6. For comparison, Libya 4 CNES (French: Centre national d'études spatiales) region of interest (ROI) is chosen as it is regarded as one of the best regions by CNES based on the long term trending of North African and Arabian desert [2,39]. In Figure 6, the left and right figure present the temporal trend of Libya 4 and Cluster 13, since launch to mid-August 2018, using Landsat 8 OLI. The behavior of Cluster 13 is the same as the behavior of traditional PICS, Libya 4, but it provides a significantly large number of acquisitions than Libya 4 within the same interval of time, i.e., Libya 4 provides 110 could-free observations since launch to mid-August 2018 whereas Cluster 13 provides nearly daily observation and offers 1434 cloud-free observations within the same interval of time. The temporal mean (± 2 sigma) of Cluster 13 and Libya 4 are also compared in Figure 7. As Libya 4 temporal mean lies within the uncertainty bar of Cluster 13 temporal mean, the temporal mean of both Libya 4 and Cluster 13 are same. This implies that despite its continental scale spatial extent, Cluster 13 has the same response like the Libya 4 CNES ROI. At shorter wavelengths, the surface cover is less reflective; the range of TOA reflectance is narrower, and the standard deviations of different clusters exhibit greater overlap. Conversely, the surface cover is more reflective at longer wavelengths; the range of TOA reflectance is wider, and the standard deviations of different clusters exhibit much less overlap, suggesting that the clusters represent distinct surface cover types.
Cluster 4 (represented by the bottom reddish brown plot) has the lowest overall reflectance levels, while cluster 2 (represented by the top yellow plot) has the highest reflectance levels. This distribution allows sensor calibration across wider portions of the dynamic range, especially at longer wavelengths.

Comparison of Traditional PICS and Cluster 13 Behaviour
As Cluster 13 spread across the continent, it also includes one of the most widely used traditional PICS i.e., Libya 4, so the temporal behavior of Cluster 13, using selected 16 paths and rows, is compared with Libya 4 temporal behavior as shown in Figure 6. For comparison, Libya 4 CNES (French: Centre national d'études spatiales) region of interest (ROI) is chosen as it is regarded as one of the best regions by CNES based on the long term trending of North African and Arabian desert [2,39]. In Figure 6

Validation of North African Classification
To validate the North Africa classification results, cloud-free, full-resolution Landsat-7 ETM+ scenes imaging Cluster 13 were obtained. To reduce the amount of data to process, images were limited to 9 distinct WRS-2 path/row combinations from different acquisition dates; these path/row combinations maximized the intersection between the ETM+ images and Cluster 13 region boundaries and provided near-daily imaging frequency during the ETM+'s 16-day revisit period. Binary masks were generated for each path/row image and were used to exclude i) pixels identified in other clusters; and Cluster 13 pixels with 5% or greater temporal uncertainty. For the purposes of this work, the masks were created at 30m spatial resolution in order to match the spatial resolution of ETM+; the mask generation procedure is sufficiently general in nature and can generate masks at any required spatial resolution for any sensor.

Validation of North African Classification
To validate the North Africa classification results, cloud-free, full-resolution Landsat-7 ETM+ scenes imaging Cluster 13 were obtained. To reduce the amount of data to process, images were limited to 9 distinct WRS-2 path/row combinations from different acquisition dates; these path/row combinations maximized the intersection between the ETM+ images and Cluster 13 region boundaries and provided near-daily imaging frequency during the ETM+'s 16-day revisit period. Binary masks were generated for each path/row image and were used to exclude i) pixels identified in other clusters; and Cluster 13 pixels with 5% or greater temporal uncertainty. For the purposes of this work, the masks were created at 30m spatial resolution in order to match the spatial resolution of ETM+; the mask generation procedure is sufficiently general in nature and can generate masks at any required spatial resolution for any sensor.

Validation of North African Classification
To validate the North Africa classification results, cloud-free, full-resolution Landsat-7 ETM+ scenes imaging Cluster 13 were obtained. To reduce the amount of data to process, images were limited to 9 distinct WRS-2 path/row combinations from different acquisition dates; these path/row combinations maximized the intersection between the ETM+ images and Cluster 13 region boundaries and provided near-daily imaging frequency during the ETM+'s 16-day revisit period. Binary masks were generated for each path/row image and were used to exclude (i) pixels identified in other clusters; and Cluster 13 pixels with 5% or greater temporal uncertainty. For the purposes of this work, the masks were created at 30 m spatial resolution in order to match the spatial resolution of ETM+; the mask generation procedure is sufficiently general in nature and can generate masks at any required spatial resolution for any sensor. Figure 8 shows the resulting temporal TOA reflectance trend of Cluster 13 using selected WRS-2 paths and rows. All the Landsat 7 images used in this analysis were screened for 5% cloud cover due to which the density of Landsat 7 data sample increases after 2014. The error bar of each data points represents its spatial uncertainty. The cluster means, spatial uncertainty, temporal uncertainty and the corresponding statistics from the temporal trend are listed in Table 3. The overall cluster means and temporal trend from different WRS2 paths and rows are essentially the same for most of the band except NIR band where there is an offset of 0.01 reflectance unit. The lifetime time series of Cluster 13 TOA reflectance using Landsat 7 shows a trend in some bands and also exhibits apparent seasonal effects in the longer wavelength bands, indicating the necessity for additional drift and BRDF correction. It is possible that the observed reflectance differences are due to (i) no application of BRDF correction to the data; (ii) no application of drift correction to the data, and (iii) potential outliers in the data. As mentioned earlier, all pixels used as input to the K-means algorithm exhibited temporal uncertainties of 5% or less. Not surprisingly, the Cluster 13 trend data also exhibits approximately 5% or less temporal uncertainty. Figure 8 shows the resulting temporal TOA reflectance trend of Cluster 13 using selected WRS-2 paths and rows. All the Landsat 7 images used in this analysis were screened for 5% cloud cover due to which the density of Landsat 7 data sample increases after 2014. The error bar of each data points represents its spatial uncertainty. The cluster means, spatial uncertainty, temporal uncertainty and the corresponding statistics from the temporal trend are listed in Table 3. The overall cluster means and temporal trend from different WRS2 paths and rows are essentially the same for most of the band except NIR band where there is an offset of 0.01 reflectance unit. The lifetime time series of Cluster 13 TOA reflectance using Landsat 7 shows a trend in some bands and also exhibits apparent seasonal effects in the longer wavelength bands, indicating the necessity for additional drift and BRDF correction. It is possible that the observed reflectance differences are due to i) no application of BRDF correction to the data; iii) no application of drift correction to the data, and ii) potential outliers in the data. As mentioned earlier, all pixels used as input to the K-means algorithm exhibited temporal uncertainties of 5% or less. Not surprisingly, the Cluster 13 trend data also exhibits approximately 5% or less temporal uncertainty.
In general, the image-estimated spatial uncertainties are larger than the corresponding classification-estimated spatial uncertainties; however, they are within 5% in all bands. The large image-estimated spatial uncertainty is due to the reason that the selected path/row images intersect sub-regions within Cluster 13 exhibiting more heterogeneous reflectance values, resulting in a larger standard deviation and corresponding spatial uncertainty.    In general, the image-estimated spatial uncertainties are larger than the corresponding classification-estimated spatial uncertainties; however, they are within 5% in all bands. The large image-estimated spatial uncertainty is due to the reason that the selected path/row images intersect sub-regions within Cluster 13 exhibiting more heterogeneous reflectance values, resulting in a larger standard deviation and corresponding spatial uncertainty.

Discussion
This work has developed a novel procedure for classification of Landsat-8 OLI image data of North Africa, demonstrating the region's potential for use as an extended PICS. The classification, based on a standard k-means unsupervised classification, is computationally simple and does not require prior knowledge of the region's spectral characteristics. The proposed procedure allows efficient storage and processing of large amounts of OLI image data by resampling them to 300 m spatial resolution; due to the relative uniformity of the region, resampling does not lead to significant loss of spectral information that would adversely impact the resulting classification. The procedure relies on a maximum 5% temporal uncertainty threshold to account for additional variability resulting from use of a much larger region size. Overall, the proposed procedure represents a significant advance over previous efforts that relied heavily upon visual inspection to identify much smaller candidate PICS regions [2,24], and can be applied "as is" to any sensor. The proposed procedure identified 19 distinct "clusters" of pixels exhibiting temporal uncertainty of 5% or less. Processing in the procedure terminated when the spatial uncertainty within any given cluster was 5% or less across all bands. The 5% spatial uncertainty threshold represents a practical compromise to avoid the extremes of (i) lower spatial uncertainty within a larger number of clusters; or (ii) higher spatial uncertainty within a smaller number of clusters, either of which would yield a less accurate calibration.
Previous landcover classifications had identified North Africa as "barren", containing less than 10% vegetation and composed primarily of some combination of sand, bare soil, and rock [27,28,40]. The "barren" classification has largely remained to the present time, as the scientific community did not demand a finer classification. However, this landcover classification applies to many current PICS extensively used for sensor radiometric calibration and performance analysis, such as Libya 4, Mauritania 1, and Algeria 1 [5,8,21,22]. Choi et al. [41] studied the spectral stability of the Libya 4, Libya 1 and Mauritania 2 PICS using average deviation (AD) and spectral angle mapper (SAM) where average deviation is the absolute reflectance difference and SAM is the spectral angle between the test and reference spectrum [42,43]. This is the earliest attempt to study the spectral similarity of PICS and in this study, they found that ROIs chosen parallel to the along-track direction were spectrally stable for spectral angle mapper values up to ±2 degrees, with an average deviation of ±1.7% in reflectance scale. In contrast to previous work, this work presents the classification of North Africa by studying each and every pixel and group together the pixels with similar characteristics in order to develop extended pseudo invariant calibration sites.
Clusters found by the unsupervised classification algorithm possess different spectral characteristics. Even if the spectral signature of clusters appears to be similar in multiple bands, they are distinct if they are observed across all the bands. The spectral separation between different clusters is wavelength dependent, as shown in Figure 5. Longer wavelengths have very distinct spectral separation; the SWIR1 band has the most distinct spectral separation of 0.33 reflectance units whereas the coastal band has poor spectral separation, just separated by 0.07 reflectance units. So, on the basis of different intensity levels, North Africa offers a better dynamic range for sensor calibration at longer wavelengths than at shorter wavelengths as shown in Figure 5.
Factors to consider when performing a cluster-based analysis include the cluster size (represented by the number of pixels in the cluster) and its distribution across a region, as well as the degree of contiguity of the cluster pixels. The "optimal" cluster for calibration and performance monitoring balances a lower spatial uncertainty, a larger size and distribution (in order to cover a larger geographic area), and a greater degree of pixel contiguity (in order to minimize effects due to geometric misregistration between images); this becomes particularly important for sensors with higher (<30 m) spatial resolution.
Two of the clusters identified by the proposed procedure (Clusters 2 and 3) exhibit similar levels of spatial uncertainty at longer wavelengths. With respect to cluster size and distribution across the region, Cluster 3 would be considered optimal; however, with respect to cluster contiguity, Cluster 2 would be considered optimal. For sensors with lower geometric accuracy, Cluster 2 would be preferred for radiometric calibration and sensor performance analysis.
Cluster 3 has the largest number of pixels (approximately 4.3 million) as shown in Figure 4, however, as mentioned above, its pixels are relatively non-contiguous. In contrast, Cluster 13 possesses a similar intensity level to Cluster 3, and contains a slightly smaller number of pixels that are almost as widely distributed across northern Africa. However, its spatial uncertainty is smaller than Cluster 3's across most bands, and its pixels form much more contiguous subregions. While other clusters might be preferable to Cluster 13 with respect to spatial uncertainty and cluster characteristics in specific bands (e.g., Clusters 5 and 15 in the Red and SWIR2 bands), Cluster 13 exhibits favorable characteristics across the majority of bands, making it an ideal bright target cluster for calibration purposes.
Similarly, Clusters 4 and 6 provide a lower intensity level that would help widen the dynamic range for PICS based calibration. Between them, Cluster 4 is slightly darker, but it has a spatial uncertainty of approximately 8% to 9% across all bands; in addition, it contains the smaller number of pixels forming less contiguous subregions. Cluster 6 would be preferred due to its larger number of pixels, more contiguous sub-regions and lower spatial uncertainty.
Some of the key advantages that this classification of North Africa offered to the radiometric calibration community are highlighted as follows: • Historically, PICS-based calibration work used bright desert targets due to there being no universally recognized set of darker PICS exhibiting sufficient temporal and spatial stability.
As the proposed procedure identifies clusters with 5% or better temporal stability, it offers the potential for improving calibration accuracy by extending the dynamic range over which calibration can be performed.

•
Previously, PICS such as Libya 4 has only one image acquisition corresponding to the satellite revisit cycle but Cluster 13 found by the classification of North Africa is observed in nearly a daily fashion. As Libya 4 lies within Cluster 13, it is similar to observing Libya 4 in a daily manner in contrast to 16 days' period which helps to quickly detect the drift of any satellite sensors with greater sensitivity. • One of the major application of PICS is the cross-calibration of optical satellite sensors [14,16,44]. Previously, limited cross calibration opportunities were available as PICS are observed once in 16 days. However, Cluster 13 provides more cross-calibration opportunities as it spreads across the continent which helps to decrease the cross-calibration gain and bias uncertainties between any optical satellite sensor pairs. In addition, it helps to achieve a cross calibration quality similar to that of individual PICS in a significantly shorter time interval. • Several researchers have developed data driven absolute calibration model in order to simulate the TOA reflectance of an individual PICS, such as Libya 4 [19][20][21][22][23]. The number of Libya 4 observations is limited due to orbital pattern and cloud cover. As Cluster 13 has a significantly large number of observations than an individual PICS, it enhances the model's ability to predict the TOA reflectance more accurately as more training datasets are available for developing the EPICS based absolute calibration model. Furthermore, EPICS based absolute calibration model can increase the temporal resolution of calibration opportunities to a daily or nearly daily basis for any optical satellite sensor.
This paper only presents initial cluster classification results. Since Cluster 13 has more than 3 million pixels, offers lower spatial uncertainty across most bands, and forms more contiguous subregions widely distributed across North Africa, it stands out as a viable first candidate for EPICS based calibration. Future directions for this work would investigate the use of cluster regions, beginning with Cluster 13, to perform long-term stability assessment and sensor cross calibration. The following sections discuss these potential cluster-based applications in greater detail.

Long Term Monitoring of Sensor Radiometric Stability
Cluster-based analyses of long-term sensor stability can potentially benefit from a significant increase in temporal resolution over that provided by individual PICS, since a cluster is considered as a group of spectrally similar pixels, with each cluster containing contiguous regions distributed across the continent. Preliminary investigation into this topic has begun with Cluster 13, as its contiguous regions tended to be the largest in size among all the clusters and were more widely distributed. Figure 9 shows the intersection of Landsat WRS-2 path/rows (black lines) and Cluster 13 region boundaries (red). In all, 25 WRS-2 path/rows imaging these regions were identified, potentially allowing multiple image acquisitions per day within the 16-day revisit period. This potential increase in temporal resolution could result in quicker detection of sensor response changes over time, which would be especially desirable for sensors with shorter life expectancies and greater variability in detector response. contiguous regions tended to be the largest in size among all the clusters and were more widely distributed. Figure 9 shows the intersection of Landsat WRS-2 path/rows (black lines) and Cluster 13 region boundaries (red). In all, 25 WRS-2 path/rows imaging these regions were identified, potentially allowing multiple image acquisitions per day within the 16-day revisit period. This potential increase in temporal resolution could result in quicker detection of sensor response changes over time, which would be especially desirable for sensors with shorter life expectancies and greater variability in detector response.

Hyperspectral Data Availability for Cross Calibration
A cluster-based cross calibration approach would need to have accurate knowledge of the cluster's expected hyperspectral response, in order to derive appropriate scaling factors to compensate for spectral response differences between a well calibrated "reference" sensor and the sensor to be calibrated. In the absence of ground truth measurements, this knowledge can be derived from available hyperspectral image data. Figure 10 shows a simplified diagram of the five largest contiguous regions of Cluster 13 across North Africa (represented by the blue color) and a map of the available Hyperion coverage over the same regions (represented by the red color). Hyperion is a hyperspectral pushbroom sensor imaging the Earth's surface in the 400-2500 nm portion of the solar spectrum, in 242 overlapping bands with a spectral resolution of approximately 10 nm; 196 of these bands are well calibrated. It images a 7 km by 100 km swath at a spatial resolution of 30 m [45][46][47][48]. Preliminary investigation has identified a set of 731 Hyperion scenes imaging these regions. After filtering the set for 10% or less cloud cover and sensor view angles within ± 5° of nadir, 101 scenes were identified that could be used to determine the hyperspectral profile. As Cluster 13 regions are spectrally similar (as determined by the classification process), it could be expected that the hyperspectral profile of a region may be used as a "representative" profile for the entire cluster.

Hyperspectral Data Availability for Cross Calibration
A cluster-based cross calibration approach would need to have accurate knowledge of the cluster's expected hyperspectral response, in order to derive appropriate scaling factors to compensate for spectral response differences between a well calibrated "reference" sensor and the sensor to be calibrated. In the absence of ground truth measurements, this knowledge can be derived from available hyperspectral image data. Figure 10 shows a simplified diagram of the five largest contiguous regions of Cluster 13 across North Africa (represented by the blue color) and a map of the available Hyperion coverage over the same regions (represented by the red color). Hyperion is a hyperspectral pushbroom sensor imaging the Earth's surface in the 400-2500 nm portion of the solar spectrum, in 242 overlapping bands with a spectral resolution of approximately 10 nm; 196 of these bands are well calibrated. It images a 7 km by 100 km swath at a spatial resolution of 30 m [45][46][47][48]. Preliminary investigation has identified a set of 731 Hyperion scenes imaging these regions. After filtering the set for 10% or less cloud cover and sensor view angles within ± 5 • of nadir, 101 scenes were identified that could be used to determine the hyperspectral profile. As Cluster 13 regions are spectrally similar (as determined by the classification process), it could be expected that the hyperspectral profile of a region may be used as a "representative" profile for the entire cluster. Remote Sens. 2019, 11, x FOR PEER REVIEW 15 of 18

Conclusions
This work demonstrated the potential of using the whole of North Africa as a continental PICS. Traditionally, only a few locations in North Africa have been used due to the lack of knowledge about the temporal stability of most of North Africa. This work analyzed each and every pixel of North Africa and found that the majority of North Africa is temporally stable and could potentially be used for the radiometric calibration of satellite sensors.
Conventionally, all these pixels were broadly categorized as a "barren" surface type which includes soil, sand, and rock. This work developed a first-ever high-resolution classification of this "barren" surface type of North Africa using Landsat 8 OLI image data and sub classified it into finer classes using an unsupervised classification algorithm. The algorithm identified 19 clusters each representing a distinct surface type having different spectral characteristics. The range of TOA reflectance at shorter wavelengths is narrow, but wider at longer wavelengths which offers a wide dynamic range to calibrate optical satellite sensors.
All of the identified clusters contained a large number of pixels grouped in contiguous regions potentially useable for sensor calibration and stability assessment. Among all the clusters, Cluster 13 possessed the largest contiguous region, lowest overall spatial uncertainty and was also more widely distributed across North Africa. Due to their spatial extent, most of the clusters allowed multiple imaging opportunities per day in contrast to a typical once during the sensor's revisit time. Multiple imaging opportunities obtained through a cluster, or EPICS, approach increases the temporal resolution of a calibration time series which leads to increased sensitivity for quicker identification of changes in sensor response.
While some obvious next steps need to take place, as indicated in the discussion sections, the results presented in this paper are very promising. They represent an advance towards the development of an extended PICS (EPICS) with large areal extent, and the ability to evaluate any sensor system on a daily basis using a potentially continent-wide site that behaves as a point site. The cluster-based approach could lead to a new way to calibrate and monitor sensor radiometric performance on orbit. More importantly, this approach could address a very new and serious problem of calibration monitoring for shorter-lived sensor missions, such as low-cost small satellitebased systems, where more pronounced degradation in radiometric performance is possible.

Conclusions
This work demonstrated the potential of using the whole of North Africa as a continental PICS. Traditionally, only a few locations in North Africa have been used due to the lack of knowledge about the temporal stability of most of North Africa. This work analyzed each and every pixel of North Africa and found that the majority of North Africa is temporally stable and could potentially be used for the radiometric calibration of satellite sensors.
Conventionally, all these pixels were broadly categorized as a "barren" surface type which includes soil, sand, and rock. This work developed a first-ever high-resolution classification of this "barren" surface type of North Africa using Landsat 8 OLI image data and sub classified it into finer classes using an unsupervised classification algorithm. The algorithm identified 19 clusters each representing a distinct surface type having different spectral characteristics. The range of TOA reflectance at shorter wavelengths is narrow, but wider at longer wavelengths which offers a wide dynamic range to calibrate optical satellite sensors.
All of the identified clusters contained a large number of pixels grouped in contiguous regions potentially useable for sensor calibration and stability assessment. Among all the clusters, Cluster 13 possessed the largest contiguous region, lowest overall spatial uncertainty and was also more widely distributed across North Africa. Due to their spatial extent, most of the clusters allowed multiple imaging opportunities per day in contrast to a typical once during the sensor's revisit time. Multiple imaging opportunities obtained through a cluster, or EPICS, approach increases the temporal resolution of a calibration time series which leads to increased sensitivity for quicker identification of changes in sensor response.
While some obvious next steps need to take place, as indicated in the discussion sections, the results presented in this paper are very promising. They represent an advance towards the development of an extended PICS (EPICS) with large areal extent, and the ability to evaluate any sensor system on a daily basis using a potentially continent-wide site that behaves as a point site. The cluster-based approach could lead to a new way to calibrate and monitor sensor radiometric performance on orbit. More importantly, this approach could address a very new and serious problem of calibration monitoring for shorter-lived sensor missions, such as low-cost small satellite-based systems, where more pronounced degradation in radiometric performance is possible.