Vicarious Methodologies to Assess and Improve the Quality of the Optical Remote Sensing Images: A Critical Review

Abstract: Over the past decade, the number of optical Earth-observing satellites performing remote sensing has increased substantially, dramatically increasing the capability to monitor the Earth. This growth is primarily driven by improved technology, miniaturization of components, and reduced manufacturing and launch costs. These satellites often lack the on-board calibrators that large satellites utilize to ensure high-quality (radiometric, geometric, spatial, etc.) scientific measurements. To address this issue, this work presents the “best” vicarious image quality assessment and improvement techniques for optical satellites that lack an on-board calibration system. In this article, image quality categories have been explored, and essential quality parameters (absolute and relative calibration, aliasing, etc.) have been identified. For each parameter, appropriate characterization methods are identified along with their specifications or requirements. In cases of multiple methods, recommendations have been made based on the strengths and weaknesses of each method. Furthermore, processing steps have been presented, including examples. Essentially, this paper provides a comprehensive study of the criteria that need to be assessed to evaluate remote sensing satellite data quality, and the best vicarious methodologies to evaluate identified quality parameters such as coherent noise and ground sample distance.


Introduction
Over the past few decades, technological advancement has dramatically reduced the size and cost of microelectronics. This has driven commercial companies to build and launch constellations of satellites (the Planet cubesat constellation, SkySat constellation, etc. [1,2]) that are able to image the entire Earth in a single day. These satellites not only play an important role in monitoring global environmental change but also in detecting disasters such as forest fires, volcanoes, earthquakes and oil spills [3,4]. To use the data from any Earth-observing satellite, accurate radiometric and geometric calibration, and high spatial quality, in terms of minimal blurring, aliasing, etc., should be ensured. For instance, crop health observation, yield prediction and ocean color monitoring require an accurate radiometric quantity such as radiance or reflectance, while object identification in a remotely sensed image sometimes requires high spatial resolution. Additionally, high geometric accuracy, specifically multi-temporal image-to-image registration accuracy, is essential to monitor physical changes (changes in shorelines, ice sheets, etc.) on the Earth's surface. Prior to launching any Earth-observing sensor, radiometric, geometric, spatial and spectral parameters are usually characterized and calibrated. However, launch stress can change these pre-launch calibration parameters, making on-orbit (vicarious) characterization necessary.

Radiometry
'Radiometry is the science and technology of the measurement of radiation from all wavelengths within the optical spectrum' [17]. Remote sensing of the Earth relies on electromagnetic radiation propagating to the remote sensing sensor. One aspect of a remote sensing sensor's performance is its radiometric characteristics, which include: radiometric resolution or dynamic range, accuracy of the radiometric quantity (reflectance or radiance) on an absolute scale, radiometric response change over time, differentiable signal in the presence of noise, etc. Radiometric resolution refers to the amount of information contained in each pixel, which is expressed in units of bits. In other words, radiometric resolution defines the sensitivity to the magnitude of the electromagnetic energy recorded by the imaging sensor, and it is decided prior to designing an imaging system. To understand the radiometric behavior of a space-borne system, the following set of key parameters needs to be characterized: (1) signal-to-noise ratio; (2) absolute radiometric calibration; (3) relative radiometric calibration; (4) radiometric stability; (5) artifacts; (6) linearity of the response; (7) polarization sensitivity [18]. Of these seven quality parameters, the linearity of the response and polarization sensitivity are characterized pre-launch; typically, specialized on-board calibrators are required for on-orbit assessment of these two parameters, and vicarious approaches are difficult [19]. As an example, MODIS and Landsat 8 polarization sensitivity have been measured using a polarized light source and a sheet polarizer, as can be seen in [20,21]. The main challenge of measuring polarization sensitivity may be providing polarized light to the focal plane. Polarization sensitivity and linearity of the response will not be considered in this article.
This paper, in Section 3, presents the details and best practices for determining absolute radiometric calibration, radiometric stability, relative radiometric calibration, signal-to-noise ratio, and artifacts.

Spatial
The spatial quality of a remote-sensing satellite system relies on several aspects of the imaging system. Spatial performance can be expressed in terms of ground sample distance (GSD), modulation transfer function (MTF), aliasing, light rejection and internal scattering, and ghosting [22]. The GSD describes the spacing between adjacent pixel centers, and the MTF provides information about the amount of blurring that arises because of the imaging components' non-ideal behavior. These two parameters define the spatial resolution of a remote sensing system. Spatial resolution is one of the most important parameters for remote sensing applications, since it determines the amount of detail an image can provide [23]. Consequently, GSD and MTF estimation is necessary to assess the spatial quality of a remote sensing data product. The aliasing phenomenon arises due to an insufficient sampling rate, which is unable to record high-frequency scene features [24]. Aliasing appears as patterns in the imagery that not only degrade the visual quality of the image but also reduce the precision of remotely sensed data. For those reasons, aliasing detection and removal is essential in satellite imagery. Light rejection and internal scattering is another spatial performance indicator, which is system-design-dependent [25]. Ghosting becomes evident when unexpected signals come from outside the field-of-view of the sensor, resulting in degraded spatial image quality [26]. In Section 4 of this article, three spatial quality assessment parameters, namely MTF, aliasing and GSD, are explored, and the best evaluation methods are presented.

Geometry
In remote sensing, geometry refers to geometric precision, which is measured by registration accuracy, geolocation accuracy (also known as geodetic accuracy) and cartographic registration [22,27,28]. Typically, two types of registration accuracy are assessed for an Earth-observing imaging system: band-to-band and image-to-image registration accuracy. The geolocation accuracy provides information about the geometric performance of the satellite in orbit; specifically, it gives the positional offset between the actual position on the surface of the Earth and the satellite-determined position. Cartographic registration, also known as geometric accuracy, is the measured positional offset between an actual location on the ground and that location in the geolocation-corrected satellite image. In order to account for geometric distortions, geometric calibration is performed prior to the launch of a spacecraft, but vibration during launch, moisture loss in vacuum and variation in the thermal environment can change the calibration parameters, requiring frequent on-orbit geometric calibration [29]. High geometric accuracy is required in numerous applications, such as change detection, multi-sensor data fusion and classification [30–32]. Therefore, on-orbit geometric calibration is necessary to obtain a high-geometric-quality remote sensing observation. This paper (in Section 5) delineates best practices for determining registration accuracy (band-to-band and image-to-image) and geodetic accuracy.

Radiometric Quality Parameters and Methods
This section presents radiometric quality parameters, methods, and prioritization of the methods along with examples.

Absolute Calibration
Absolute radiometric calibration allows the conversion of image digital numbers (DN) to physical units such as radiance. Since DNs from different sensors have no meaningful relationship, the conversion of image DNs to spectral radiances is crucial in remote sensing, as it enables comparison between measurements from different sensors. Consequently, absolute radiometric calibration is essential to the remote sensing data user community. A remote sensing imaging sensor is calibrated prior to launch in the laboratory and post-launch while the satellite is operating in orbit. A space-borne satellite system's calibration may be subject to change due to the degradation of the electronic instruments over time, variation in filter transmittance and spectral response, etc., requiring frequent post-launch radiometric calibration.
Numerous post-launch absolute calibration techniques have been developed over the past decades; they can be broadly classified as on-board and vicarious methods [33–43]. On-board calibrators (OBCs) are common in large government sensors (usually more than 1000 kg in mass) such as the Landsat series, Sentinel 2A and 2B, and the Terra and Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) sensors. However, many satellites, especially small satellites, do not contain OBCs; accordingly, they rely on scene-based vicarious absolute calibration methods. Every aforementioned absolute calibration method has its strengths, weaknesses, and accuracies, which define the suitability of the method for a given sensor. Table 1 provides the strengths, weaknesses and Système International (SI) traceability of six vicarious absolute calibration methods. The traceability of an absolute calibration method varies depending on the wavelength, atmospheric condition, number of observations used, etc. Typically, the traceability varies within the ranges shown in Table 1 (details can be seen in the associated references). Sections 3.1.1 to 3.1.6 briefly describe the methods and their associated traceabilities, including examples and their appropriateness for remote sensing satellite absolute calibration. The stable spectral characteristics, high reflectance, and minimal atmospheric effects make pseudo-invariant calibration sites (PICS) popular for radiometric calibration. Moreover, PICS-based calibration is one of the least expensive calibration methods, since it only requires PICS imagery acquired by the to-be-calibrated sensor. Consequently, researchers have been using PICS data to monitor temporal stability and cross-calibrate satellite sensors for a long period of time [44–47]. However, the usage of PICS as an absolute radiometric calibration target for Earth-observing satellites was not explored until 2013.
In 2013, at South Dakota State University (SDSU), Helder et al. developed a Libya 4 PICS model based on Terra MODIS and Earth Observing-1 (EO-1) Hyperion observations to show the potential of PICS for absolute calibration [48]. In 2014, Mishra et al. reported improvements in Helder's Libya 4 PICS absolute calibration model [33]. The improved Libya 4 empirical model's accuracy is between 2% and 3%, and its precision is between 1% and 2%. In 2019, Bipin et al. extended the work presented in [33] by applying the APICS model to five other PICS [49]. The results show that the models for the Egypt 1, Libya 1 and Sudan 1 PICS have approximately 2-3% accuracy and 1-2% precision. However, the Niger 1 and Niger 2 models are less accurate (approximately 7%) with similar precision. One major drawback of the traditional APICS model is that it is restricted to a limited viewing geometry of ±20 degrees; in other words, it has limited BRDF capability.
Recently, PICS have been extended to cover a vast portion of North Africa; these extended sites are named 'clusters' [50]. At the SDSU Image Processing Laboratory (SDSU IPLab), an extended PICS absolute calibration (ExPAC) model has been developed for one of the clusters (namely, cluster 13) with extensive BRDF capability that overcomes the shortcomings of the APICS model [41]. The ExPAC model shows around 4% accuracy in shorter-wavelength bands (i.e., the coastal aerosol and blue bands of Landsat 8) and from 1% to 2% accuracy in longer-wavelength bands (visible to shortwave infrared). Precision, or random variability, in the ExPAC model is of the same order as the accuracy. Therefore, depending on the absolute calibration accuracy and precision requirements, the solar and viewing conditions of the sensor while collecting imagery, and the capability of collecting imagery, the Libya 4, Libya 1, Egypt 1 and Sudan 1 APICS models and/or the SDSU IPLab ExPAC model can be used to perform absolute radiometric calibration of a satellite sensor.
The APICS model, described in [33,49] by Mishra et al. and Bipin et al., was developed using Terra MODIS and Hyperion observations of PICS. Terra MODIS was selected as a source for absolute calibration because it is one of the best-calibrated sensors, with 2% uncertainty in Top-of-Atmosphere (TOA) reflectance scale [51], and the Hyperion hyperspectral sensor was used as a source of hyperspectral profile due to its 3-5% accuracy and around 2% repeatability (prior to 2012 data) [52,53]. The first step of developing the absolute calibration model is to cross-calibrate the Hyperion TOA reflectance to Terra MODIS TOA reflectance. The cross-calibration or scale factor was calculated with Hyperion and MODIS simultaneous nadir overpass scenes. An empirical BRDF model for solar zenith angle was developed using nadir-looking Terra MODIS observations. As explained in [49], to account for the varying view zenith angle, a view zenith BRDF model was developed from a spectrally cleaner (high transmittance and reflectance) Hyperion band.
Nischal et al. expressed the absolute calibration model as follows [33]:

ρ_PICS(λ) = k(λ) · ρ_h(λ) · [1 + m_1(λ) · SZA] · [1 + m_2(λ) · VZA + m_3(λ) · VZA²] · f_A(t)

Here, ρ_PICS is the model-predicted TOA reflectance. ρ_h(λ) represents the Hyperion TOA reflectance for the selected PICS. k(λ) is the scaling factor to normalize the Hyperion spectrum ρ_h(λ) to MODIS. m_1(λ) represents the slope coefficient of the BRDF model for solar zenith angle normalization, which is obtained from Terra MODIS observations. m_2(λ) and m_3(λ) are the linear and quadratic BRDF model coefficients derived from Hyperion data for view zenith angle normalization. SZA and VZA are the solar zenith and view zenith angles of the selected sensor, respectively. f_A(t) represents the atmospheric model, which can be ignored since its effect is negligible [54].
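As an illustrative sketch, the per-band model prediction described above can be computed numerically. The functional form here follows the coefficient descriptions in the text (scaled Hyperion reflectance adjusted by a linear SZA term and linear-plus-quadratic VZA terms, with the atmospheric factor taken as 1); the coefficient values are hypothetical placeholders, not published APICS coefficients.

```python
def apics_toa_reflectance(rho_h, k, m1, m2, m3, sza_deg, vza_deg):
    """Sketch of an APICS-style per-band prediction: scaled Hyperion
    reflectance adjusted by a linear SZA term and linear + quadratic
    VZA terms; the atmospheric factor f_A(t) is taken as 1 (negligible)."""
    return k * rho_h * (1.0 + m1 * sza_deg) * (1.0 + m2 * vza_deg + m3 * vza_deg ** 2)

# Hypothetical single-band coefficients (illustrative only)
rho_pred = apics_toa_reflectance(rho_h=0.45, k=1.02, m1=-0.001, m2=0.0005,
                                 m3=-1e-5, sza_deg=30.0, vza_deg=5.0)
```

In practice, the coefficients would be fitted per band from the Terra MODIS and Hyperion time series described above, and SZA/VZA would come from the image metadata of the sensor being calibrated.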
In order to present the comparison of the absolute-calibration-model-predicted and measured reflectances of Landsat 8 Operational Land Imager (OLI) over Egypt 1, Bipin et al. illustrated the NIR band (Band 5 of Landsat 8 OLI) reflectance, which can be seen in Figure 2. Figure 3 shows the percentage difference between OLI measurements and model predictions, along with the accuracy and precision of the model-predicted TOA reflectances. The accuracy, which is described by the Root-Mean-Squared Error (RMSE), and the precision, which is described by the standard deviation, between measurements and predictions are about 0.88% and 0.87%, respectively. These results are within the accuracy (3%) and precision (2%) of Nischal's Libya-4 PICS model. Visual observation of Figure 2 shows that model-predicted reflectances follow seasonal variations in the actual measurements to a certain extent.
The above-explained Egypt 1 model accuracy depends on Terra MODIS and Hyperion calibration uncertainty, as the model was developed from Terra MODIS and Hyperion observations, and on Landsat 8 OLI calibration uncertainty. Atmosphere, instrument response deviation, and higher-order BRDF effects could be the factors behind the 0.87% random uncertainty, or precision, between model prediction and Landsat 8 observation. This uncertainty cannot be eliminated, but its effect can be reduced by taking multiple measurements. Typically, the precision or random uncertainty decreases by 1/√N, N being the number of observations or measurements [55].
The high absolute radiometric accuracy and precision of the APICS and ExPAC models make them practical methods for absolute calibration of an in-orbit satellite sensor; accordingly, researchers have been using the APICS model for many years (see, for instance, Barsi et al. [54,56]). At this time, the APICS and ExPAC models have been proven as compelling absolute calibration methods for Earth-observing satellite sensors. However, the improved BRDF capability of the ExPAC model over the APICS model makes the ExPAC model more attractive compared to the existing APICS model. Consequently, the ExPAC model would be a better option for calibrating (on an absolute radiometric scale) any space-borne remote sensing satellite.
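The 1/√N reduction in random uncertainty noted above can be checked with a small simulation. This is a generic statistical sketch using the 0.87% per-observation figure from the text, not actual sensor data:

```python
import random
import statistics

random.seed(7)
SIGMA = 0.87  # per-observation random uncertainty, in percent (from the text)

def precision_of_mean(n_obs, n_trials=2000):
    """Empirical scatter (standard deviation) of the mean of n_obs
    noisy measurements, estimated over n_trials repeated experiments."""
    means = [statistics.fmean(random.gauss(0.0, SIGMA) for _ in range(n_obs))
             for _ in range(n_trials)]
    return statistics.stdev(means)

# Averaging 16 observations should shrink the scatter by roughly 1/sqrt(16) = 4x
single = precision_of_mean(1)
avg16 = precision_of_mean(16)
```

With 16 averaged observations, the empirical precision comes out close to 0.87/4 ≈ 0.22%, matching the 1/√N rule.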

Radiometric Calibration Network
The Radiometric Calibration Network (RadCalNet) is a network of instrumented radiometric calibration sites, which has been developed for calibrating multiple sensors to a common reference. The Committee on Earth Observation Satellites (CEOS) Working Group on Calibration and Validation (WGCV) Infrared and Visible Optical Sensors Subgroup (IVOS) implemented RadCalNet and publicly opened the data in June 2018 [57]. Currently, RadCalNet sites are located at Railroad Valley Playa in the US (RVUS), La Crau in France (LCFR), Gobabeb in Namibia (GONA), and Baotou in China (BTCN). The BTCN calibration site contains two categories of targets: artificial reflectance targets, which are suitable for high-spatial-resolution sensors (e.g., within 10 m), and desert targets, suitable for moderate-spatial-resolution satellites (e.g., the Landsat satellites at 30 m) [58]. In RadCalNet, the traditional reflectance-based vicarious calibration approach (details in Section 3.1.4) has been automated to obtain high-temporal-resolution calibration (compared to the traditional vicarious approach). The automatic instrumentation of RadCalNet provides bottom-of-atmosphere (BOA) measurements and an estimate of the propagated TOA reflectance, along with their associated uncertainties. Among the sites, the RVUS and GONA sites show high accuracies in TOA reflectance measurement, with 3% to 4% uncertainties, whereas the LCFR and BTCN sites are less stable, and their respective uncertainties vary from 2% to 6% and from 4% to 4.5% [34,39].
Open data, spatially homogeneous sites, measurements every 30 min from 9:00 to 15:00 local standard time, frequent stability monitoring, and quality-controlled and processed TOA reflectance make RadCalNet a suitable method for absolute radiometric calibration of an on-orbit satellite sensor viewing the sites. Therefore, numerous satellites are being calibrated using the publicly available RadCalNet dataset [59–62]. However, there is a reported limitation in using these data, as they are provided only for a nadir view, causing a viewing-angle effect for non-nadir-viewing sensors [34]. To address this issue, Bouvet et al. suggested an approach based on simulating off-nadir TOA reflectances to match the viewing angle of the sensor of interest [40].
RadCalNet data can be obtained from the web portal referenced in [63]. Calibration of a sensor of interest using RadCalNet consists of the following steps:

1. Extract RadCalNet TOA reflectance and uncertainties from the above-mentioned web portal for the same dates and times as the sensor of interest imaging the selected site;
2. Calculate the test sensor TOA reflectance for the chosen site, including uncertainties;
3. Interpolate the RadCalNet TOA reflectances to the sensor overpass time to account for the time differences between the two measurements explained in steps 1 and 2;
4. To match the spectral resolution of the sensor to the RadCalNet TOA reflectance, interpolate the RadCalNet TOA reflectance (at 1 nm) to the selected sensor's TOA reflectance;
5. Normalize the RadCalNet TOA reflectance to the corresponding multispectral value of the selected sensor for direct comparison:

ρ_R = Σ(λ_1 to λ_2) [ρ_λ · RSR_λ] / Σ(λ_1 to λ_2) [RSR_λ]

where RSR_λ represents the relative spectral response function of the sensor of interest, ρ_λ is the hyperspectral TOA reflectance of the RadCalNet site, λ_1 and λ_2 are the minimum and maximum wavelengths of the band at 10 nm steps, and ρ_R is the RadCalNet-predicted TOA reflectance in the specific sensor band;
6. Compare the output from the selected sensor (step 2) with the RadCalNet TOA reflectance (step 5) and calculate the associated uncertainties.
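The band normalization in step 5 amounts to an RSR-weighted average of the hyperspectral reflectance. A minimal sketch, using hypothetical 10 nm samples rather than actual RadCalNet data or a real sensor RSR:

```python
def band_average(rho_hyperspectral, rsr):
    """RSR-weighted band average (step 5):
    rho_R = sum(rho * RSR) / sum(RSR) over the band's wavelength samples."""
    num = sum(r * w for r, w in zip(rho_hyperspectral, rsr))
    return num / sum(rsr)

# Hypothetical 10 nm samples across a narrow NIR-like band (illustrative only)
rho = [0.41, 0.42, 0.43, 0.42, 0.41]   # RadCalNet-style hyperspectral TOA reflectance
rsr = [0.10, 0.80, 1.00, 0.75, 0.08]   # sensor relative spectral response, RSR_lambda
rho_R = band_average(rho, rsr)
```

The resulting ρ_R is the single band value compared against the test sensor's measurement in step 6.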
Jing et al. applied the above steps to compare the predicted TOA reflectance of three RadCalNet sites-RVUS, LCFR and GONA-with Landsat 7, Landsat 8, Sentinel 2A and 2B observations [34].
In conclusion, RadCalNet is a very promising method for remote sensing satellite absolute radiometric calibration. This method of absolute calibration can be performed with one cloud-free image of a RadCalNet site acquired by the test sensor. However, multiple images should be used to obtain lower calibration uncertainty, as random uncertainty decreases with an increasing number of observations.

Cross-Calibration
Cross-calibration is a post-launch absolute calibration method in which a sensor is calibrated against a well-calibrated satellite sensor, typically referred to as the 'reference' sensor. Currently, there are multiple well-calibrated sensors operating on orbit: Landsat 7, Landsat 8, Sentinel 2A and 2B, and MODIS, to name a few. Image data from all the mentioned sensors are free to use, which makes this method less expensive compared to the traditional reflectance-based vicarious approach (details are in Section 3.1.4). The primary weakness of this method can be the requirement for simultaneous nadir overpass (SNO) or near-coincident observation (NCO) scene pairs. Additionally, there will be multiple sources of uncertainty regardless of whether the SNO or NCO approach is used, as delineated in the forthcoming text. Despite some inevitable drawbacks, calibrating a sensor against a well-calibrated sensor is an attractive option for absolute radiometric calibration, given that multiple well-calibrated sensors offer highly accurate (on a radiometric scale) scientific observations. Typical steps to perform cross-calibration are: (i) reference sensor selection; (ii) cross-calibration approach selection, and data selection based on the approach; (iii) spectral response mismatch correction; (iv) cross-calibration gain and bias estimation; and (v) uncertainty estimation.
The first step of cross-calibration is to select a well-calibrated reference sensor. In order to choose a reference sensor, a few parameters, such as radiometric accuracy, spatial resolution and temporal resolution, can be considered. Table 2 provides the radiometric accuracy on the TOA reflectance scale, spatial resolution and temporal resolution of Landsat 7, Landsat 8, Sentinel 2A, Sentinel 2B and MODIS. The sensors are built to meet or exceed these accuracies; in most cases, they exceed their absolute radiometric accuracy specifications. Among the five sensors, MODIS appears to be the most radiometrically accurate, and it images the Earth every 1 to 2 days. However, the spatial resolution of MODIS is very low (from 250 to 1000 m), which suggests the requirement of large regions of interest for cross-calibration. The two high-spatial-resolution Sentinel sensors each have a revisit time of 10 days, but their published absolute radiometric accuracy is between 3% and 5%. Even though Landsat 7 and 8 image the Earth at the same spatial and temporal resolution, Landsat 8 provides observations with a published accuracy on the order of 3% or less on the TOA reflectance scale. Therefore, Landsat 8 and/or the Sentinel sensors could be selected as a 'reference' sensor for cross-calibrating a satellite sensor. The second step of cross-calibration is the selection of an approach for performing cross-calibration between two sensors. Cross-calibration is usually performed via two approaches: (i) using simultaneous nadir overpass observations, and/or (ii) using near-coincident observations, from the to-be-calibrated and reference sensors.

Simultaneous Nadir Overpass (SNO) Approach
Simultaneous observations are referred to as SNO events, which occur when both the reference and to-be-calibrated sensors image a target at the same time. Since the observations are obtained simultaneously, a direct comparison between the measurements (such as image digital numbers or radiometric quantities, e.g., reflectance or radiance) yields the absolute calibration parameters: gains and biases. Compared to the NCO approach, this cross-calibration method introduces low absolute calibration uncertainty, due to the concurrent scene pair usage, which results in consistent view and solar geometry between the two sensors (details can be seen in Section 3.1.3.3). However, SNO events between a well-calibrated sensor and a to-be-calibrated sensor can be infrequent.

Near Coincident Observation (NCO) Approach
The NCO approach to cross-calibration relies on observations acquired at different times (e.g., minutes, hours or days apart) by the reference and to-be-calibrated sensors. Different image acquisition times can cause multiple disparities that are independent of the sensors' inherent radiometric response differences. These disparities include atmospheric condition, solar geometry, target feature variation, etc. Consequently, in order to employ the NCO approach, the calibration target should be stable in terms of radiometric response and atmospheric condition, and should have a nearly Lambertian nature to reduce the differences due to sun position changes. Desert sites accommodate some of these characteristics, such as a stable atmosphere and radiometric response, and spatial uniformity, which make them suitable targets for cross-calibration [67,68]. In order to perform radiometric stability monitoring (details are in Section 3.2) and absolute radiometric calibration, researchers have identified more than twenty such desert sites; they are popularly known as PICS [69–72]. These traditional PICS, usually referred to as Libya 4, Sudan 1, Niger 1, Libya 1, etc., have been used in many studies to perform cross-calibration [35,73–75]. Recently, Shrestha et al. [76] cross-calibrated Landsat 8 OLI and Sentinel-2A MSI using coincident and near-coincident scene pairs of cluster 13 (one of the extended PICS, or EPICS). At the SDSU IPLab, a novel cross-calibration technique, namely trend-to-trend cross-calibration, has been developed that utilizes near-coincident observations over the same or different portions of the EPICS [30]. The continental footprint of EPICS provides a much more frequent possibility of near-simultaneous imaging by the reference and to-be-calibrated sensors, and hence a higher NCO cross-calibration opportunity compared to traditional PICS-based cross-calibration.
Evidently, near coincident observations between a reference and a to-be-calibrated sensor will occur more frequently than SNO events, but the aforementioned disparities in the NCO approach can result in higher cross-calibration uncertainties (details can be seen in the forthcoming text) compared to the SNO-event-based approach.

Spectral Response Mismatch and Uncertainty of SNO and NCO Approach
For both the SNO and NCO approaches, spectral response mismatch should be corrected, which can be considered the third step of cross-calibration. Often, the spectral response functions of the to-be-calibrated and reference sensors exhibit differences. These differences must be accounted for to compare the sensors' radiometric responses. The process of correcting the spectral response differences is known as spectral band adjustment factor (SBAF) correction. The details of performing SBAF correction can be seen in [46]. The next step (step iv) is to estimate the cross-calibration gain and offset through a regression analysis of reflectances/radiances or digital numbers. Usually, uncertainty is estimated and reported with the cross-calibration gain and offset, which can be the fifth step of cross-calibration.
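As an illustration of the SBAF idea, the sketch below simulates each sensor's band-averaged reflectance by weighting a hyperspectral target profile with that sensor's relative spectral response (RSR); the sloped spectrum and the Gaussian RSRs are invented for illustration, not real sensor data.

```python
import numpy as np

def band_average(rho, rsr):
    """Band-averaged reflectance a sensor would report: RSR-weighted
    mean of the hyperspectral profile (uniform wavelength grid)."""
    return np.sum(rho * rsr) / np.sum(rsr)

def sbaf(rho, rsr_reference, rsr_target):
    """Spectral Band Adjustment Factor: multiplying the target sensor's
    band reflectance by this factor adjusts it to the reference band."""
    return band_average(rho, rsr_reference) / band_average(rho, rsr_target)

# Invented example: a sloped NIR reflectance spectrum and two slightly
# shifted Gaussian RSRs standing in for the two sensors' responses.
wl = np.linspace(820.0, 900.0, 401)               # wavelength, nm
rho = 0.30 + 0.001 * (wl - 820.0)                 # hyperspectral profile
rsr_ref = np.exp(-0.5 * ((wl - 865.0) / 10.0) ** 2)   # reference band
rsr_tgt = np.exp(-0.5 * ((wl - 860.0) / 12.0) ** 2)   # to-be-calibrated band

factor = sbaf(rho, rsr_ref, rsr_tgt)
print(round(factor, 4))
```

Because the target band sits at a slightly shorter wavelength on a rising spectrum, the factor comes out just above 1, pulling the target sensor's reflectance up to what the reference band would have seen.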
Uncertainty arises in every cross-calibration step and contributes to the overall cross-calibration uncertainty. A few sources of uncertainty, such as reference sensor calibration uncertainty and SBAF uncertainty, are inevitable in both the SNO and NCO methods of cross-calibration. Uncertainty due to site instability, solar and viewing geometry changes (due to the time difference and/or positional difference of the sensors during image acquisition), and atmospheric differences are the major sources of uncertainty in the NCO method.
Over the years, much research has been carried out to explore the primary sources and major contributors to the uncertainty of the NCO-based cross-calibration approach. For instance, Chander et al. investigated cross-calibration uncertainties due to differences in spectral response, spectral resolution, spectral filter shift, geometric misregistration, and spatial resolution; the results show that SBAF uncertainty is the dominant source [77]. In [78], Pinto et al. presented a way to evaluate SBAF's inherent uncertainties using the Monte Carlo simulation method, which can be exploited to estimate the SBAF uncertainty. As an additional example, while cross-calibrating Sentinel-2A MSI with Landsat 8 OLI, Farhad presented an uncertainty analysis showing that the reference sensor's calibration uncertainty, atmospheric variability, target non-uniformity and RSR difference are the major contributors to the overall uncertainty [73]. Regardless of whether SNO-event-based or NCO-based absolute cross-calibration is used, the cross-calibration uncertainty will be greater than the reference sensor's inherent uncertainty, and the other sources of uncertainty will vary depending on the above-mentioned factors and the number of scene pairs used. From this discussion, it is evident that the SNO approach could be the 'best' method when the lowest possible cross-calibration uncertainty is required.
Li et al. presented cross-calibration of Sentinel-2 MSI with Landsat 8 OLI using an NCO event over the Saharan desert [79]. Figure 4 shows the TOA reflectance comparison of Landsat 8 OLI and Sentinel-2 MSI after SBAF correction. Due to scale issues, the cirrus band is shown at the bottom right corner. The black 1:1 line shows the agreement between OLI and MSI observations. The gains and offsets can be obtained through band-by-band linear regression. Details of the result are presented in [79]. Employing the above-explained approach, cross-calibration can be performed from one cloud-free image pair (SNO or NCO event). Multiple scene pairs can decrease the cross-calibration uncertainty, since random uncertainty decreases with the square root of the number of scene pairs, as long as there are no systematic errors in either the to-be-calibrated or reference sensor.
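The band-by-band regression behind gains and offsets of this kind can be sketched as follows; the matched DN/reflectance pairs are synthetic, generated from a known gain and offset so that the fit demonstrably recovers them.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic matched scene-pair statistics: reference-sensor TOA
# reflectance (after SBAF correction) versus to-be-calibrated DN.
true_gain, true_offset = 2.0e-5, 0.01
dn_target = rng.uniform(5000, 20000, size=50)
rho_reference = (true_gain * dn_target + true_offset
                 + rng.normal(0.0, 5e-4, size=50))   # random scatter

# Least-squares fit of rho_reference = gain * DN + offset.
gain, offset = np.polyfit(dn_target, rho_reference, 1)
print(f"gain = {gain:.3e}, offset = {offset:.4f}")
```

With 50 pairs and small scatter, the recovered gain and offset land very close to the true values; halving the scatter's random contribution requires roughly four times as many scene pairs, per the square-root relationship noted above.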

Traditional Reflectance-Based Vicarious Calibration
Traditional Reflectance-Based Vicarious Calibration (TRBVC) is a post-launch absolute calibration method that relies on in situ measurements of surface reflectance and atmospheric conditions while the satellite images the calibration target. The possibility of low calibration uncertainty (given appropriately modeled atmospheric conditions and surface reflectance measurements) and its independence as a calibration method (similar to the instrumented RadCalNet approach, though TRBVC can be performed over vegetative, desert, and other targets) are some of the advantages of the TRBVC method [80]. The TRBVC approach requires not only ground instruments to measure target reflectance and atmospheric conditions, but also experienced field personnel, which can be expensive and labor-intensive. Since this approach hinges on deploying field personnel to take measurements, it can be lengthy, in contrast to the automated RadCalNet approach. Consequently, this method of absolute calibration is less attractive than the APICS/ExPAC model, RadCalNet, and cross-calibration for frequently monitoring a satellite system. However, frequent field campaigns give SI-traceable knowledge (independent of on-board calibrators) of absolute radiometric accuracy, and hence a greater degree of comprehension about an in-orbit sensor. Therefore, traditional reflectance-based vicarious calibration is one of the options for monitoring the absolute calibration of an Earth-observing satellite.
Traditional reflectance-based vicarious calibration can be performed through the following steps [80,81]: (i) surface bidirectional reflectance factor (BRF) calculation, (ii) atmospheric measurements, and (iii) TOA spectral radiance propagation. The surface reflectance is measured by transporting a portable hyperspectral spectroradiometer across the entire site in a predetermined pattern, and reference panel measurements are taken throughout the site at predetermined points. The ratio of the site measurement to the reference panel measurement gives the BRF of the site. In order to determine atmospheric transmission and radiance, an automated solar radiometer is used that tracks the sun throughout the day and measures the incoming solar irradiance extinction due to atmospheric absorption and scattering. Finally, the surface BRF and atmospheric measurements are used as inputs to the MODerate resolution atmospheric TRANsmission (MODTRAN) or Second Simulation of a Satellite Signal in the Solar Spectrum (6S) radiative transfer code. From the MODTRAN or 6S-predicted TOA radiance or reflectance of the test sensor, the gains can be calculated as

G_L,λ = L_λ/(DN_λ − b_λ) and G_ρ,λ = ρ_λ/(DN_λ − b_λ),

where G_L,λ and G_ρ,λ are the gains for a specific spectral band, L_λ and ρ_λ are the band-integrated MODTRAN or 6S predicted TOA radiance and reflectance, respectively, DN_λ represents the average DN for all the pixels of the test site measured by the test sensor for a given band, and b_λ is the average offset for a given band.

The reflectance-based approach was employed by the University of Arizona (UA), South Dakota State University (SDSU), and NASA Goddard Space Flight Center (GSFC) at test sites in Nevada, California, Arizona and South Dakota, USA [82]. In this study, both UA and GSFC used desert sites, whereas SDSU took measurements over a vegetation target to obtain the BRF. To show the consistency between the OLI measurements and the vicarious ground-based method, both reflectances and radiances were calculated, and the percentage difference was obtained as percent difference = (Measured − OLI)/OLI, where Measured represents the ground-based TOA radiance/reflectance measurement and OLI is the Landsat 8 OLI TOA radiance/reflectance measurement.
Figure 5 shows the percentage difference in TOA spectral radiance and reflectance for eight OLI spectral bands [82]. The UA and SDSU results for the coastal aerosol (443 nm) and blue (483 nm) bands appear to be off by about 5% due to very low surface reflectance and atmospheric effects. However, bands 3 to 8 show consistency between the two types of site, and OLI agrees within the uncertainties of the vicarious calibration method, which are on the order of 2.5% to 3.0%. The vertical bars associated with each point represent the 1σ standard deviation of the measurements.
The test site surface BRF calculation is one of the components of the reflectance-based approach, and the uncertainty of the BRF calculation is dominated by reference panel characterization. Uncertainty in atmospheric characterization, MODTRAN or 6S calculation, and solar zenith angle measurement are the other major contributors to the total uncertainty. The in situ vicarious absolute calibration uncertainty can be between 1.5% and 2.5%, depending on the wavelengths, measurement device, operators, etc. [42]. Additionally, the precision of the reflectance-based vicarious approach in the mid-visible wavelength range is between 2.5% and 3.5% [83].
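Assuming the linear sensor model used throughout this section, the gain computation at the end of the TRBVC procedure reduces to a ratio of the radiative-transfer-predicted TOA radiance to the offset-corrected mean DN; all numbers below are hypothetical.

```python
def trbvc_gain(predicted_toa_radiance, mean_scene_dn, mean_offset_dn):
    """Band gain from the reflectance-based approach: ratio of the
    MODTRAN/6S-predicted TOA radiance to the offset-corrected mean DN
    of the test-site pixels (linear sensor response assumed)."""
    return predicted_toa_radiance / (mean_scene_dn - mean_offset_dn)

# Hypothetical numbers for one band: predicted radiance in
# W/(m^2 sr um), mean test-site DN, and mean dark-offset DN.
gain = trbvc_gain(predicted_toa_radiance=182.5,
                  mean_scene_dn=14500.0,
                  mean_offset_dn=1000.0)
print(round(gain, 6))  # radiance per DN
```

The reflectance gain follows identically, with the predicted TOA reflectance in the numerator.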

Lunar Calibration
The moon, our closest celestial neighbor, is an exceedingly stable reflector of sunlight, and its reflectivity, as observed by an on-orbit satellite, depends on the lunar angles and orientation. The collection of lunar imagery rests on the ability of an Earth-observing satellite to change its imaging direction and point at the lunar surface from Earth orbit. The positional relationship between sun, moon, and satellite also defines the capability of a satellite to capture moon images. The lack of atmosphere, a dynamic range similar to Earth scenes, and the absence of maintenance requirements, unlike the RadCalNet or in situ traditional vicarious calibration sites, are a few of the advantages of using the moon as a radiometric calibration source [84]. In order to exploit the potential of the moon as a calibration target, in 2005 at the United States Geological Survey (USGS), Kieffer et al. developed a lunar irradiance model based on observations by the RObotic Lunar Observatory (ROLO) [85]. In spite of the numerous advantages of a lunar calibration approach, it is not typically used for absolute calibration due to its 5% to 10% model uncertainty on an absolute scale [37]. The complex satellite-sun-moon positional relationship, and the orientation and phase of the moon, make developing an absolute calibration model (better than the current accuracy level) very challenging. Currently, there are three on-going projects at the National Institute of Standards and Technology (NIST), the National Physical Laboratory (NPL) and the NASA Langley Research Center to improve the absolute calibration accuracy significantly [86][87][88]. The low absolute accuracy of the current lunar model and the requirement of changing the Earth-imaging direction are the major constraints on using lunar calibration. Therefore, the development of a new, accurate lunar model would make this absolute calibration method a viable future option for Earth-observing satellites.

Deep Convective Clouds
Extremely bright Deep Convective Clouds (DCC) have nearly Lambertian reflectance and are situated at the top of the atmosphere, specifically at the tropopause, where atmospheric effects are minimal, which makes them a potential target for radiometric calibration [89]. Consequently, radiative transfer models or a reference sensor such as SCIAMACHY or MODIS have been used in the past to predict DCC reflectance and perform absolute calibration [90,91]. However, the absolute calibration accuracy from DCC is at the 5% level [38], and DCCs are usually identified using thermal bands. Furthermore, this calibration method is typically used for low-resolution, larger-footprint sensors. Many satellites do not have thermal bands, and many of them are medium-to-high spatial resolution sensors (less than 30 m). Nonetheless, the DCC method needs investigation for such satellites to ascertain its suitability for absolute radiometric calibration.

Radiometric Stability
The radiometric stability of an imaging instrument is a measure of how the instrument's radiometric response changes over time. It is one of the important quality parameters, since the radiometric stability of an imaging instrument defines the detectability of very small Earth surface changes. In order to ensure a sensor's radiometric stability, numerous precautionary measures are taken during sensor development, launch, and space operation. In spite of these steps, in the harsh space environment the radiometric response of an imaging sensor can change, due to temperature variation, voltage level changes, and radiation, on every possible timescale, requiring on-orbit assessment. To assess an on-orbit sensor's radiometric stability performance, two different types of radiometric stability have been monitored in numerous studies: (i) short-term, and (ii) long-term radiometric stability [66,[92][93][94][95]. There is no clear definition separating short-term stability from long-term stability.
Short-term stability can be defined as stability of the radiometric response within a single orbit. It is assessed by exploiting on-board calibrators, specifically stimulation lamps [92,96,97]. Stability characterization within a single orbit from Earth or celestial scenes has not been reported to date; thus, the short-term stability of a satellite cannot currently be characterized from Earth/celestial scenes. Long-term stability can be defined as stability of the radiometric response beyond a single orbit. Sometimes, it is reported as sensor degradation or drift per year [22,94,98,99]. Long-term radiometric stability is typically monitored using OBCs, lunar observations, and PICS observations [43,54,100,101]. In the absence of OBCs, PICS-image-based and lunar-observation-based methods are the practical options for a remote sensing satellite's long-term stability assessment. The next subsection explains the long-term stability characterization approaches for an Earth-observing satellite.

Long-Term Stability
This subsection delineates three long-term radiometric stability assessment techniques, and their strengths and weaknesses.

Lunar Observation-Based Method
As explained in Section 3.1.5, stable sunlight reflectivity and little-to-no atmospheric distortion between the satellite's orbit and the moon make lunar-based stability assessment an attractive approach. Lunar irradiance measurements over time reveal the temporal behavior of the imaging sensor. The USGS lunar model is also used for stability monitoring, due to its better than 1% relative precision [85] for any phase angle within ±90 degrees [102,103]. Thus, the lunar-based method can be utilized in the absence of OBCs, or as a secondary method to track OBC degradation. As explained in [103], to perform a lunar-based stability assessment, a few corrections must be applied: distance corrections, oversampling correction, phase angle corrections, libration corrections, and noise reduction. Lastly, a lunar irradiance time series is calculated and assessed for radiometric stability.
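Of the corrections listed above, the distance correction is the most mechanical: the measured lunar irradiance is normalized to standard Sun-Moon (1 AU) and Moon-observer (384,400 km) distances via the inverse-square law. A minimal sketch, with made-up observation distances:

```python
STANDARD_SUN_MOON_AU = 1.0
STANDARD_MOON_OBSERVER_KM = 384_400.0

def distance_corrected_irradiance(measured_irradiance,
                                  sun_moon_au,
                                  moon_observer_km):
    """Normalize a lunar irradiance measurement to the standard
    Sun-Moon and Moon-observer distances (inverse-square law)."""
    return (measured_irradiance
            * (sun_moon_au / STANDARD_SUN_MOON_AU) ** 2
            * (moon_observer_km / STANDARD_MOON_OBSERVER_KM) ** 2)

# Hypothetical observation taken slightly closer than standard:
e_std = distance_corrected_irradiance(
    measured_irradiance=2.0e-3,      # arbitrary irradiance units
    sun_moon_au=0.99,
    moon_observer_km=380_000.0)
print(e_std)
```

Because the observation was taken closer than the standard geometry, the normalized irradiance comes out slightly lower than the measured value; a time series of such normalized irradiances is what is trended for stability.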
Over the years, this method of long-term stability assessment has been applied to several satellites, such as Landsat 8, MODIS, SeaWiFS, and PROBA-V [66,[101][102][103]. These satellites have used different observation frequencies (for instance, once or twice per lunar cycle). The required number of observations hinges on the magnitude of the trend to be detected over a timespan (details are given in the forthcoming text). However, moon imagery collection is limited by the lunar phases.

PICS Based Radiometric Stability Monitoring
Earth surfaces that exhibit minimal change over time are usually referred to as invariant targets. Multi-temporal image data over those invariant targets reveal the temporal behavior of the imaging sensor, i.e., the long-term radiometric stability of the sensor. To be a pseudo-invariant calibration site, or PICS, according to several investigators [81,[104][105][106], a target should have properties such as temporal stability, spatial uniformity, and a Lambertian nature, and should be located away from waterbodies and urban and industrial areas. Even though no Earth surface possesses all of these properties, a few imaging targets exhibit most of them, allowing many researchers to use PICS for monitoring the long-term radiometric stability of a remote sensing satellite sensor. The PICS-based stability monitoring method is utilized as a standalone approach [44,107,108], or along with OBCs and a lunar-based method [95,100,109]. In the harsh space environment, not only can the sensor degrade, but the OBCs can also degrade over time, requiring another method to track the performance of both the sensor and the on-board calibrator. The lunar approach is sometimes used as a secondary method, but to capture a lunar image, the imaging direction of the satellite must be altered, and many satellites do not have the ability to maneuver the instrument. The PICS-based method, however, can be easy to implement and low-cost, since it requires only routine imagery of the target and the expertise to analyze the observations. This makes the PICS-based method an attractive option, as a standalone approach or as a secondary method to OBCs, for assessing the long-term stability of a spaceborne imaging system. Therefore, PICS can be used confidently for the post-launch long-term radiometric stability assessment of a satellite sensor.
Over the past few decades, numerous PICS have been used to monitor the temporal degradation of on-orbit imaging sensors. Almost all the exploited PICS are among the twenty North African and Saudi Arabian desert sites recommended by Cosnefroy et al. in 1996 [69]. Much research has been carried out in the past decades aiming to find new invariant calibration targets, especially on a global scale beyond Africa and Arabia [72,110,111]. As mentioned earlier, Shrestha et al. recently extended PICS (EPICS) to all of North Africa by developing the cluster approach [50], and Hasan et al. showed a 3% temporal uncertainty for cluster 13, demonstrating the potential of the cluster for long-term stability monitoring [112]. The key advantage of EPICS-based stability monitoring is the ability to assess radiometric stability from daily/near-daily observations, but the temporal uncertainty of EPICS can be higher than that of traditional PICS observations.
In 2019, Bacour et al. showed that the twenty desert PICS identified by Cosnefroy et al. are still 'optimal' [72]. At the CEOS IVOS-19 meeting, six of the twenty PICS were selected as pseudo-invariant standard 'reference' calibration targets [113]. They are popularly known as [114] Libya 4, Libya 1, Mauritania 1, Mauritania 2, Algeria 3, and Algeria 5. Details of these six PICS, including latitude and longitude, altitude, and climatological parameters, can be seen in [115]. For many years, numerous researchers have exploited the six CEOS-endorsed PICS to monitor the calibration trends of remote sensing satellites. Among all the PICS, Libya 4 appears to be the most stable site, with 1% to 1.5% temporal uncertainty [116]. Consequently, Libya 4 is the most frequently used PICS. Of the six CEOS PICS, the two Mauritania sites exhibited more than 3% temporal uncertainty in Landsat 7 ETM+ observations [33], which may be because of their closeness to the West African coastline [98]. Therefore, one must be cautious about selecting these two PICS for stability monitoring.
The PICS-based long-term stability assessment approach consists of a few simple steps. The following steps of trending have been adopted from [44]:

1. Site selection: depending on the requirements and constraints, such as temporal stability, magnitude of the trend to be detected, Lambertian nature, etc., a PICS should be selected;
2. Region of Interest (ROI) selection: a spatially homogenous ROI must be selected from the chosen PICS scenes. For example, the six CEOS PICS ROI extents can be seen in [115], and the SDSU IPLAB 'optimal' ROI coordinates (at Libya 4, Libya 1, Sudan 1, Egypt 1, Niger 1, Niger 2) are shown in [117];
3. TOA reflectance or radiance calculation: at-sensor reflectance or radiance should be calculated using calibration parameters;
4. Outlier rejection: cloud-contaminated TOA reflectance or radiance must be ignored;
5. BRDF normalization: to remove the seasonality, BRDF normalization should be applied;
6. Trend detection: plotting BRDF-normalized TOA reflectance or radiance and observing the change over time.
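The normalization and trend-detection steps can be sketched on a synthetic time series; the drift, seasonal amplitude, and noise below are invented, and the BRDF normalization is reduced to dividing out a fitted sinusoidal seasonal model, standing in for the angular BRDF models used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic PICS-like series: 16-day revisit over 8 years, with a
# seasonal (BRDF) signal, a -0.2 %/year sensor drift, and noise.
t_years = np.arange(0.0, 8.0, 16.0 / 365.25)
seasonal = 1.0 + 0.03 * np.sin(2 * np.pi * t_years)
drift = 1.0 - 0.002 * t_years
rho = 0.55 * seasonal * drift + rng.normal(0.0, 0.003, t_years.size)

# "BRDF normalization": fit and divide out the seasonal model.
design = np.column_stack([np.ones_like(t_years),
                          np.sin(2 * np.pi * t_years),
                          np.cos(2 * np.pi * t_years)])
coef, *_ = np.linalg.lstsq(design, rho, rcond=None)
rho_norm = rho / (design @ coef)

# Trend detection: linear fit of the normalized series.
slope, intercept = np.polyfit(t_years, rho_norm, 1)
drift_pct_per_year = 100.0 * slope / intercept
print(f"drift: {drift_pct_per_year:+.2f} %/year")
```

With the seasonal signal divided out, the linear fit recovers a drift close to the injected −0.2 %/year; in a real analysis the slope would be reported with its 2σ uncertainty, as in the Sentinel-2A example below.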
Barsi et al. applied the above-mentioned steps to Libya 4, Algeria 3 and Sudan 1 PICS data from Sentinel-2A MSI, aiming to observe the stability of the sensor over time [54]. Figure 6 illustrates the lifetime trend of the coastal aerosol band for the three PICS. At Libya 4, the Sentinel-2A MSI coastal aerosol band shows a −0.14 ± 0.73 (±2σ) %/year drift, which is displayed in Figure 7. The slope over time for all the other bands has also been calculated and is presented in Figure 7 to assess the temporal stability; the results show general agreement among the three sites, within 2-sigma uncertainty, in seven MSI bands.
The amount of trend that can be detected rests on several factors, such as the timespan of the dataset, the temporal frequency of the data, the temporal variability in the data, and autocorrelation in the data [118][119][120]. These factors vary, to a certain extent, depending on the capability of the imaging sensor, the atmospheric conditions while collecting the imagery, and the length of the dataset at a certain PICS. The natural variability in the target, which contributes to temporal variability, impedes the ability to detect a statistically significant trend. Statistically significant calibration trend detection is only possible when the to-be-detected trend surpasses the natural variability in the PICS. Since all the aforementioned factors depend on atmospheric conditions, temporal resolution, and the length of the available dataset, the minimum detectable trend will vary accordingly. This article presents the approach to detecting the minimum trend, considering 'known' temporal variability, autocorrelation, and length of the dataset. Alternatively, the necessary minimum data record length can also be determined when the magnitude of the minimum trend to be detected is known.
According to Weatherhead et al. [121,122], the number of years (N) required to detect a trend of magnitude ω_0 at the 95% confidence level and with 50% probability can be estimated by

N = [(3.3 σ_N/|ω_0|) √((1 + φ)/(1 − φ))]^(2/3),

where σ_N is the variability in the monthly averaged data time series and φ is the 1-month lag autocorrelation of the monthly averaged data. This equation can be used to determine: (i) the length of data necessary to detect a trend of a certain magnitude, and (ii) the magnitude of the trend that can be detected with a specific timespan of data. The units of σ_N and ω_0 must be the same; for instance, if one expects ω_0 in percent, σ_N must be expressed in percent, by dividing it by the mean of the monthly averaged time series.
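The Weatherhead et al. estimate can be evaluated directly; the sketch below assumes the standard published form of the formula, with σ_N and ω_0 expressed in the same (here, percent) units, and hypothetical PICS statistics.

```python
def years_to_detect_trend(sigma_n, omega_0, phi):
    """Number of years to detect a trend of magnitude omega_0 (per year)
    at the 95% confidence level with 50% probability, given the monthly
    variability sigma_n and 1-month lag autocorrelation phi (Weatherhead
    et al. formulation). sigma_n and omega_0 must share units."""
    return ((3.3 * sigma_n / abs(omega_0))
            * ((1.0 + phi) / (1.0 - phi)) ** 0.5) ** (2.0 / 3.0)

# Hypothetical PICS statistics: 2% monthly variability, a 0.5%/year
# trend to detect, and moderate month-to-month autocorrelation.
n_years = years_to_detect_trend(sigma_n=2.0, omega_0=0.5, phi=0.2)
print(round(n_years, 1))
```

As the formula suggests, halving the site variability or doubling the trend magnitude shortens the required record, while stronger autocorrelation lengthens it.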
The above-explained approach was applied by Bhatt et al. [123] to detect a statistically significant minimum trend in VIIRS Libya 4 monthly observations. Recently, in 2019, Hasan et al. demonstrated the trend detection possibilities using temporally rich cluster 13 and traditional Libya 4 PICS observations [112].
PICS-based radiometric stability monitoring has a proven ability to track the long-term stability of a remote sensing sensor through normal Earth-observation-based trend detection. Therefore, for sensors without on-board calibrators or the ability to collect lunar imagery, PICS-based long-term radiometric stability monitoring can be the primary approach.

Relative Radiometric Calibration
The process of quantifying the radiometric response variation of each detector relative to the others is known as relative radiometric calibration. In an ideal situation, each detector of a camera system would give exactly the same output when exposed to the same amount of electromagnetic radiation. However, this ideal state does not exist, due to minute variations in detector manufacturing, variability in electronic gain and offset, and differences in spectral and linear responses. Consequently, every detector in a linear array imaging system exhibits different behavior, causing noticeable striping artifacts in the collected imagery. In order to address these problems, the imaging sensor is usually characterized in a simulated space environment prior to launch. However, launch stress, ultraviolet radiation and temporal degradation are a few of the factors that can cause non-uniformity in detector response while the satellite is operating in orbit. Hence, in-orbit relative radiometric calibration and correction must be performed to ensure high-quality image data. Over the years, numerous methods have been employed to remove detector-level artifacts. They can be classified into two broad categories: (i) on-board and (ii) Earth-scene-based methods. On-board calibrators, such as lamps or diffuser panels, are used in several remote sensing systems as a uniform radiance source for detector-to-detector non-uniformity characterization [51,66,124]. In the absence of on-board calibrators, or as a method to monitor on-board instrument degradation, Earth-imagery-based methods are utilized to quantify detector-to-detector response variation. Yaw (side-slither) maneuvers and lifetime statistics are two of the most popular vicarious approaches for on-orbit relative radiometric calibration of a linear array imaging system from Earth scenes [125,126].
The strengths and weaknesses of these two methods, along with the target types, are presented in Table 3 (details can be seen in Sections 3.3.1 and 3.3.2).

Table 3. Relative calibration methods.

Yaw or Side-Slither Maneuver [125,[127][128][129]
  Strengths: can be time-efficient compared to the statistics approach; one cloud-free acquisition may be enough.
  Weaknesses: satellite must have the ability to maneuver; loss of normal image data; requires a uniform imaging target.
  Target type: visible and near-infrared bands: Greenland and Dome C of Antarctica; SWIR bands: Sahara Desert and Arabian Peninsula sites.

Lifetime Statistics [126]
  Strengths: no need to maneuver; normal Earth-scene-based approach.
  Weaknesses: requires a substantial number of images.
  Target type: any surface type.

Yaw or Side-Slither Maneuver
A pushbroom imager, which is a linear detector array, scans the imaging target one row at a time as the spacecraft moves, forming an image in which each column is generated by a single detector. In the side-slither maneuver, by contrast, the focal plane of the system is rotated ninety degrees about its yaw axis, so that the imaging direction is parallel to the direction of spacecraft motion rather than perpendicular to it, as in normal pushbroom operation. Thus, every detector of the array images the same area of the ground and measures an identical amount of electromagnetic radiation. However, obtaining an ideal side-slither scan requires a perfectly parallel spacecraft and array position, no optical distortion, etc. The roll, pitch and yaw stability of the spacecraft must be maintained to a certain extent to minimize the ground variability of the projection of each detector [125]. More details of the side-slither or yaw-axis maneuver technique can be seen in [130].
The capability to maneuver, the loss of normal image data, and the requirement for a uniform imaging target are the primary constraints on collecting appropriate data for detector-to-detector non-uniformity characterization. On the other hand, the lack of any on-board hardware requirement and greater time efficiency than some traditional methods are among the major strengths of the side-slither technique for relative radiometric calibration [131]. The side-slither technique has shown significant improvement in image quality for the RapidEye [125], Landsat 8 [128], QuickBird [129] and Pleiades-HR [132] sensors. In 2014, Pesta et al. demonstrated that side-slither-based relative gain correction of Landsat 8 OLI imagery was comparable to the diffuser-based approach [128]. Therefore, this method of relative calibration can be used with confidence.
Radiometric and spatial uniformity, a high signal-to-noise ratio, and an area large enough to provide a sufficient amount of data are among the major criteria for selecting a side-slither imaging target for relative gain characterization. However, no single target is bright in all bands; hence, different targets should be used for different bands. For instance, visible and near-infrared bands can be calibrated using Greenland and Dome C of Antarctica [128], whereas the Sahara Desert and Arabian Peninsula sites provide high enough spectral radiance for accurate relative gain characterization of the SWIR bands [125,127-129]. Theoretically, one cloud-free side-slither acquisition may be enough for relative calibration, assuming the sensor or detector is stable.
Linearity in detector response and uniform radiance to each detector are the two major assumptions that must be fulfilled for non-uniformity characterization. Detector linearity can be assessed by finding the relationship between the input radiance to the focal plane and the detector output of the imaging system. At the onset, in the side-slither or 'flat fielding' approach, homogenous flat-field areas are selected as an appropriate target from bright images, and each detector response is then shifted to line up. Each detector's bias, calculated from dark images, is subtracted from the raw signal, and the relative gains are obtained by column-averaging the bias-removed side-slither collect and dividing by the average signal level of the scene [125]. Finally, these relative gains are applied to each detector's offset-removed output, which results in corrected images. Improvement due to relative gain correction should be assessed both qualitatively and quantitatively. Qualitative image enhancement can be evaluated by visual inspection, and quantitative improvement is usually appraised by calculating banding and streaking metrics. Banding is a phenomenon that appears due to the deviation of a group of detectors' responses from the average array response. Streaking becomes apparent in an image when a single detector's response deviates strikingly from those of its neighboring detectors. These artifacts often become evident in fairly homogenous scenes at different radiance levels. Banding and streaking requirements are usually established during the development phase of the satellite system. The banding and streaking metrics can be defined in multiple ways; the QuickBird and Landsat 8 OLI banding and streaking equations are presented in [127,133]. Figure 8 shows an example of banding and streaking for the red band of a desert scene from the QuickBird sensor [127].
The left figure presents radiometrically corrected column averages, which are used to calculate the banding, and the right figure illustrates percentage streaking, which is less than 0.12% for all the detectors in the red band. The presented results for all the QuickBird bands met the banding and streaking requirements.
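The gain estimation and correction steps described above can be sketched in a few lines. This is a simplified illustration, not the exact processing of [125] or [127]; the function names are invented, and the streaking formula shown (each detector's percentage deviation from the mean of its two neighbors) is one common formulation rather than the specific QuickBird equation:

```python
import numpy as np

def side_slither_relative_gains(side_slither_dn, dark_dn):
    """Estimate per-detector relative gains from a side-slither collect.

    side_slither_dn: 2D array (lines x detectors), raw DN; each column
        corresponds to one detector viewing the same ground track.
    dark_dn: 2D array of dark (shutter or night) DN used to estimate bias.
    """
    bias = dark_dn.mean(axis=0)                   # per-detector bias
    corrected = side_slither_dn - bias            # bias-subtracted collect
    per_detector = corrected.mean(axis=0)         # column average per detector
    relative_gains = per_detector / per_detector.mean()  # normalize to scene average
    return relative_gains, bias

def apply_relative_correction(image_dn, relative_gains, bias):
    """Flat-field an image: subtract bias, divide by relative gain."""
    return (image_dn - bias) / relative_gains

def streaking(column_means):
    """Percent streaking for interior detectors: deviation of each
    detector's mean response from the average of its two neighbors."""
    d = np.asarray(column_means, dtype=float)
    return 100.0 * np.abs(d[1:-1] - 0.5 * (d[:-2] + d[2:])) / d[1:-1]
```

A corrected image should then show near-zero streaking over a homogenous target, which is how the quantitative improvement is judged.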
From the above discussion, it is evident that the side-slither maneuver is a well-defined and established method of relative calibration. Therefore, this method can be a suitable option for relative gain estimation of an Earth-observing satellite. If the satellite system lacks an on-board calibrator or maneuver mechanism, then a lifetime statistics approach can be utilized for relative radiometric calibration.

Lifetime Image Statistics
The lifetime image statistics method of relative radiometric calibration relies on modification of the histogram observed by each individual detector in an imaging system and is applicable to pushbroom, whiskbroom and frame cameras. Histogram modification, an Earth-imagery-based technique, rests on the assumption that all detectors see the same image information in a statistical sense. This assumption holds for deriving the parameters (e.g., relative gains) from a single scene of a whiskbroom sensor, since each detector of a whiskbroom imaging sensor can be assumed to measure the same signal levels in a statistical sense. The assumption may not hold for a single scene from a pushbroom sensor or frame camera, since every detector of a pushbroom instrument or frame imager does not see the same image information. However, the assumption can be used for pushbroom or frame sensors if the range of image data is extended over many scenes. Detector-level non-uniformity characterization is therefore usually performed from multiple scenes acquired over the course of the sensor's lifetime. Relative gains used to be characterized using all the available data [126], but non-uniformity characterization from a subset of images has provided acceptable results [134]. One advantage of this is that it allows frequent relative gain characterization, which provides useful knowledge about sensor degradation. Nowadays, an Earth-observing sensor generally collects several hundred scenes each day, which means a sheer volume of data must be queried for scene statistics, and that can take a substantial amount of time. However, the lack of any on-board instrument requirement, no need to maneuver, and the use of normal Earth scenes are the major strengths of this relative calibration method.
Consequently, in the absence of on-board calibrators and yaw axis maneuver capability, the statistics method is an option for remote sensing satellite relative radiometric calibration.
Based on the aforementioned assumption, Angal first presented relative gain characterization and correction for a pushbroom sensor, the Advanced Land Imager (ALI) [126]. The application of relative gains derived from 20,000 ALI scenes showed a significant improvement in image quality. In 2010, Shrestha developed a method to identify the best type of images for the lifetime statistics approach based on image mean and standard deviation [134]. High mean and high standard deviation (HMHSD) images provided the best relative gain estimates, and the algorithm requires fewer HMHSD scenes than normal scenes to stabilize the relative gains. Additionally, relative gain estimates based on mean statistics performed better than the standard deviation-based approach. This work also presented an approach to determine the number of required scenes, though with some open questions that suggest the need for future work. In 2017, Anderson et al. described a procedure for estimating relative gains from normally acquired high mean and low standard deviation (HMLS) Earth scenes [135]. Moreover, a method to obtain the number of required scenes was also presented based on Landsat 8 OLI data, and a minimum of 1200 HMLS images was used as a threshold to calculate relative gains for all OLI bands. Finally, the scene statistics results were compared with the diffuser results obtained from Greenland images, and the statistics results outperformed the diffuser results both qualitatively and quantitatively. From the work presented by Anderson et al. [135], it is evident that HMLS Earth scenes are appropriate, and the number of scenes required can be calculated based on the presented approach.
Relative radiometric calibration using the statistics method can be performed as follows: (i) calculate the mean DN detector by detector, then subtract the bias, for all the available scenes; (ii) calculate the mean DN for each detector across all the available scenes; (iii) calculate the standard deviation for each detector across all the available scenes; (iv) compute the global mean DN (from step ii) and the global standard deviation (from step iii); and (v) calculate the relative gain by dividing either the mean DN from step ii by the global mean from step iv, or the standard deviation from step iii by the global standard deviation from step iv. Relative gain correction can then be applied in two steps: (i) subtract the mean bias from the raw image data for each detector, and (ii) multiply each bias-corrected image pixel by the reciprocal of its detector's relative gain.
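The steps above can be sketched as follows. This is a hedged illustration of the mean-based variant only, with assumed array shapes (each scene as lines × detectors) and invented function names:

```python
import numpy as np

def lifetime_relative_gains(scenes, bias):
    """Mean-based lifetime-statistics relative gains.

    scenes: iterable of 2D DN arrays (lines x detectors).
    bias: 1D per-detector bias estimated from dark data.
    """
    # Steps (i)-(ii): per-scene, per-detector mean DN after bias
    # subtraction, then the mean over all available scenes.
    scene_means = np.array([(s - bias).mean(axis=0) for s in scenes])
    detector_mean = scene_means.mean(axis=0)
    # Step (iv): global mean over the whole detector array.
    global_mean = detector_mean.mean()
    # Step (v), mean-based variant: detector mean over global mean.
    return detector_mean / global_mean

def correct(image, gains, bias):
    """Correction: subtract bias, then scale by the reciprocal gain."""
    return (image - bias) * (1.0 / gains)
```

The standard-deviation-based variant of step (v) would replace the per-detector means with per-detector standard deviations and normalize by the global standard deviation.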
The above-explained relative calibration steps were followed to obtain relative gains for each detector; the correction was then applied to an ALI Arizona acquisition, and the original and corrected images are illustrated in Figure 9 [136]. Figure 9a shows the band 7 original image along with the stretched Sensor Chip Assembly (SCA) 3 image at the bottom. The vertical striping artifacts are evident in the shown images. Figure 9b,c presents images calibrated using pre-launch coefficients and using the relative gains, respectively. The stretched regions (at the bottom) of the two figures clearly show the striping reduction. Compared to the pre-launch coefficients, the relative gain approach evidently provided better correction of detector-level non-uniformity.

Signal-to-Noise Ratio
Signal-to-noise ratio (SNR) is defined as the ratio of the signal to the random variability in the signal, which is known as the 'noise' of a system. It is a measure of the useful information obtained by an instrument. In remote sensing, SNR is an image-quality assessment indicator that evaluates the radiometric performance of an imaging system. Noise is an inevitable part of any instrument, and remote sensing satellites are no exception. Thus, the SNR of a satellite should be estimated to assess the quality of its data product. Since SNR usually varies with signal level, it must be reported in such a way that it clearly signifies the imaging quality of a sensor. For instance, the SNRs of two imaging systems can be estimated under the same illumination conditions, i.e., the same radiance level, which allows one to compare the SNR of one satellite with another. As an example, Landsat 7 and 8 report the SNR at typical and maximum radiances, which can be seen in [92]. Several sensors' SNRs have been estimated by exploiting their corresponding on-board calibrators [137-139]. Over the years, numerous Earth-imagery-based SNR estimation methods have also been developed for remote sensing sensors. Among the popular Earth-imagery-based SNR estimation methods are the homogenous area (HA) [140] and local means and local standard deviations (LMLSD) [141] methods. Each of these methods has its own strengths and weaknesses (summarized in Table 4), which are presented in the forthcoming subsections, along with their appropriateness for remote sensing satellite SNR estimation.

Table 4. SNR estimation methods.

Homogenous Area [140,142]
- Strengths: relatively easy to compute; normal Earth scenes can be used.
- Weaknesses: almost impossible to find an absolutely homogenous surface in satellite imagery.
- Target type: dry lake, desert, snow, dense vegetation.

Local Mean and Local Standard Deviation [141]
- Strengths: can be automated; does not require large homogenous areas, only many small homogenous regions.
- Weaknesses: noise must be mainly additive; the image should contain many small homogenous areas.
- Target type: targets with many small homogenous areas.

Homogenous Area Method
One of the simplest SNR estimation approaches relies on the mean and standard deviation calculated within a manually selected homogenous area; the ratio of the mean to the standard deviation gives an estimate of the SNR. A few of the strengths of this method are: (i) it is relatively easy to implement (compared to complex statistical approaches), (ii) it requires only normal Earth scenes containing a homogenous area, and (iii) it gives the SNR from directly computed parameters (mean and standard deviation) rather than from multiple steps of statistical processing. However, the near impossibility of finding an absolutely homogenous surface in satellite imagery, inevitable atmospheric variability, and the manual selection of the homogeneous region are the major weaknesses of the HA method. Since manually selected homogenous areas almost always contain image features, there is a possibility of SNR underestimation. Nevertheless, as presented by Ren et al. in [142], homogenous targets such as dry lakes, deserts, snow and dense vegetation can give a reasonable estimate of the SNR. Moreover, rivers and forests have also been used to estimate the SNRs of different satellite sensors [143,144]. To report SNRs at multiple signal levels, several types of target should be used. The homogenous area should not be very small, since a small area would not give the best estimate of the SNR, nor too large, because a large area will increase surface and atmospheric variability, which will underestimate the SNR [142]. The accuracy of this method may vary depending on the surface type used. The SNR of the Landsat 8 reflective bands estimated with the HA method over dry lake, desert, snow and dense vegetation shows overestimation on the order of 50 to 250 SNR and underestimation on the order of 20 to 50 SNR (at the Landsat 8 OLI typical radiances mentioned earlier), which suggests the shortcomings of this method compared to the on-board approach [142].
Nevertheless, this method can be an option for the satellite systems without OBCs.
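As a minimal sketch of the HA method (assuming a NumPy image array and a hand-picked homogenous region given as row/column slices):

```python
import numpy as np

def snr_homogeneous_area(image, row_slice, col_slice):
    """SNR from a manually selected homogenous region of an image:
    the ratio of the region's mean to its standard deviation."""
    roi = np.asarray(image, dtype=float)[row_slice, col_slice]
    return roi.mean() / roi.std(ddof=1)
```

In practice the region boundaries are chosen by visual inspection, which is exactly the manual step that limits this method.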

Local Means and Local Standard Deviations Method
Local means and local standard deviations (LMLSD) is an Earth-imagery-based SNR estimation method which exploits the fact that remote sensing images usually contain numerous small homogenous areas. Gao presented this method in 1993 [141] and showed that the SNR estimated by LMLSD is similar to the SNR estimated by the HA method. This process of SNR estimation can be automated, and it does not require large homogenous areas, only many small homogenous regions. However, the method was developed assuming that the noise in the images is mainly additive, which must be verified before proceeding. One way to inspect this assumption is to plot the calculated local means against the local standard deviations of the small homogenous blocks: if the majority of the local means are clustered around a single standard deviation, the noise can be considered mainly additive.
As explained in [141,145,146], the first step of this method is to divide an image into small blocks of 3 × 3, 5 × 5, ..., 21 × 21 pixels. Local means (LMs) and local standard deviations (LSDs) are then calculated for each block, and the ratio of LM to LSD is the estimated SNR for each block. A histogram of the estimated SNRs for each block size contains the SNR information of the entire image; the SNR at the peak of the histograms (where most of the peaks converge to a single SNR value) represents the SNR estimate of the image. This approach has been applied to Gaofen-1, Landsat 8 OLI, Landsat 7 ETM+ and Terra/Aqua MODIS observations, and the results have been compared to assess the performance of Gaofen-1 [146]. The estimated SNRs of the Gaofen-1 red and NIR bands are approximately 75 and 35, respectively, as can be seen in Figure 10.
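A simplified, non-authoritative implementation of the block statistics and histogram-peak step might look like this (the tiling scheme, the use of a single block size, and the names are assumptions; the published method repeats this over several block sizes):

```python
import numpy as np

def snr_lmlsd(image, block=5, bins=50):
    """LMLSD sketch: tile the image into block x block regions, compute the
    local-mean / local-standard-deviation ratio per block, and take the
    histogram peak as the image SNR estimate."""
    img = np.asarray(image, dtype=float)
    h = img.shape[0] - img.shape[0] % block
    w = img.shape[1] - img.shape[1] % block
    tiles = (img[:h, :w]
             .reshape(h // block, block, w // block, block)
             .transpose(0, 2, 1, 3)
             .reshape(-1, block * block))
    lm = tiles.mean(axis=1)                      # local means
    lsd = tiles.std(axis=1, ddof=1)              # local standard deviations
    snr = lm[lsd > 0] / lsd[lsd > 0]
    hist, edges = np.histogram(snr, bins=bins)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])       # SNR at the histogram peak
```

Blocks falling on scene structure produce low SNR values that populate the histogram tail, which is why the peak, rather than the mean, is taken as the estimate.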
The accuracy of this method may be assessed by comparing the SNR of Landsat 8 OLI estimated with the LMLSD approach against the OBC approach, assuming the OBC approach represents the actual SNR of Landsat 8 OLI. The LMLSD and OBC results presented in [146] and [137], respectively, suggest that the LMLSD method overestimated the SNR by approximately 50-100 at OLI-typical radiances from 14 to 40 W/(m² sr µm), ranging from the coastal to the NIR band.

Since this method of SNR estimation does not require scenes with a large homogenous area, and it can be automated, it can be an option for Earth-observing satellite SNR estimation. However, if the assumption of mainly additive noise is not met, another block-counting approach, presented in [147], can be used, which works well when the image noise is mainly multiplicative.

Artifacts
Since there is no clear standard definition of image artifacts, Roman-Gonzalez defined artifacts as 'artificial structures that represent a structured perturbation of the signal' [148]. Image artifacts can be generated by design problems, detector saturation, and on-board processing unit errors. They can also arise during image compression and data transmission. Morfitt et al. present some of the Landsat 8 OLI artifacts: spectral cross-talk, stray light, bright target recovery, impulse noise, coherent noise and radiometric non-uniformity [92]. Since coherent noise and striping noise (caused by radiometric non-uniformity) are present in almost all imaging systems, they are addressed in the forthcoming subsections.

Striping Noise
Striping noise is an anomaly in remote sensing images. It primarily occurs due to the inconsistent response of multiple detectors. As explained in Section 3.3, relative radiometric calibration and correction is performed to remove detector-to-detector non-uniformity, which reduces striping in the imagery. Even though relative gain correction reduces striping artifacts, it is impossible to remove the striping completely due to uncertainties in the relative gain estimation method. Moreover, rapid changes in the space environment affect the in-orbit performance of the detectors, which causes striping artifacts in the images. Since striping artifacts reduce data quality and limit applications such as classification [149], object segmentation [150] and sparse unmixing [151], the image should be de-striped before being provided to users. In order to identify or monitor striping noise in the imagery, homogenous areas such as deserts, Greenland scenes or water bodies can be used. As mentioned earlier, relative calibration is not usually performed frequently, and it may leave some degree of striping. For that reason, an image de-striping algorithm can be applied to remove the striping artifacts. Based on the methodology, image de-striping algorithms can be divided into three categories: (i) statistics-based, (ii) filter-based, and (iii) variational de-striping methods.
As mentioned in Section 3.3, statistics methods cannot be used to derive the parameters (e.g., gains) from a single scene of a pushbroom/frame imager, since many images are required to meet the assumption that 'each detector sees the same image information in a statistical sense'. Consequently, the statistics method may not be appropriate for striping noise reduction, since a frequently usable de-striping algorithm is expected. Filter-based methods are simple and computationally fast, but image structures sometimes contain the same frequencies as the stripes, so a filter-based method may remove scene content, which is the major disadvantage of this approach [152]. The variational de-striping method reduces this limitation of the filter-based method [153]. However, it is not able to eliminate scene content removal completely, and it can be complex and computationally slower than a filter-based approach [154]. Therefore, a simple and computationally faster (compared to the variational de-striping approach) filter-based method can be appropriate for striping noise removal, even though it occasionally removes image content. An example of striping noise removal from Landsat 7 ETM+ imagery using a filter-based approach can be seen in [155].
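As a minimal illustration of the filter-based idea (an assumed, simplified scheme, not the algorithm of [155]): treat the high-frequency residual of the column-mean profile as the stripe pattern and subtract it from every row:

```python
import numpy as np

def destripe_columns(image, window=31):
    """Filter-based de-striping sketch for column-oriented stripes.

    Smooth the column-mean profile with a moving average; the residual
    (column mean minus smoothed profile) is treated as the per-column
    stripe offset and subtracted from the whole image.
    """
    img = np.asarray(image, dtype=float)
    col_means = img.mean(axis=0)
    kernel = np.ones(window) / window
    smooth = np.convolve(col_means, kernel, mode='same')
    stripes = col_means - smooth
    return img - stripes          # broadcast: remove per-column offsets
```

Note the zero-padding of `np.convolve` biases the smoothed profile near the image borders, and any real scene structure sharing the stripe frequency is attenuated too, which is exactly the scene-content-removal weakness discussed above.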

Coherent Noise
Multiple electromagnetic signals in any electronic instrument can interfere with each other and create coherent noise. In remote sensing imagery, coherent noise (CN) appears as a periodic pattern at a single frequency or in a narrow frequency range. Figure 11 shows coherent noise in a Landsat 7 ETM+ level-1 band 3 image and a Landsat 5 Thematic Mapper (TM) level-1 band 5 image [156]. Coherent noise not only degrades visual image quality, but also affects the relative uncertainty of the data [92]. Moreover, the presence of coherent noise causes errors in atmospheric corrections when they are performed using dark pixels [157], and in water quality studies [158]. Consequently, coherent noise should be characterized and removed from the image data. Coherent noise magnitude varies from instrument to instrument, and it is typically reported as a zero-to-peak or peak-to-peak noise magnitude in DN. For example, the CN magnitudes of the Landsat 4 Multispectral Scanner (MSS), Landsat 5 TM and Landsat 7 ETM+ were measured at about 0.6 DN (zero-to-peak), about 0.5 DN (zero-to-peak) and about 3 DN (peak-to-peak), respectively [159-161]. CN can also be reported as a contrast level, which is the frequency-domain noise amplitude normalized to the dynamic range of the images collected for CN characterization [92]. This type of noise is most visible in relatively homogenous areas of an image [160]. Therefore, night ocean scenes, images of dark water bodies, shutter collects, or desert scenes are appropriate for characterizing coherent noise. Typically, periodic coherent noise is detected and removed in the frequency domain: finding the noise frequency(ies) is the first step of the process, and an appropriate filter is then used to remove the noise from the image [159].
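The detect-then-filter procedure can be sketched as below, assuming column-oriented imagery with a single dominant noise frequency along the line direction and a simple band-zeroing notch (real processing chains use more careful filter designs):

```python
import numpy as np

def detect_noise_frequency(image):
    """Locate the strongest periodic component along the line (row)
    direction. Returns a normalized frequency in cycles/pixel."""
    img = np.asarray(image, dtype=float)
    # Remove per-column means so the DC term does not dominate the spectrum.
    spec = np.abs(np.fft.rfft(img - img.mean(axis=0), axis=0)).mean(axis=1)
    freqs = np.fft.rfftfreq(img.shape[0])
    return freqs[int(np.argmax(spec))]

def notch_filter(image, noise_freq, bandwidth=0.01):
    """Zero a narrow frequency band around the detected coherent noise."""
    img = np.asarray(image, dtype=float)
    spec = np.fft.rfft(img, axis=0)
    freqs = np.fft.rfftfreq(img.shape[0])
    spec[np.abs(freqs - noise_freq) < bandwidth, :] = 0.0
    return np.fft.irfft(spec, n=img.shape[0], axis=0)
```

Zeroing the band also deletes any legitimate scene content at that frequency, so the notch should be kept as narrow as the noise allows.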

Spatial Quality Parameters and Methods
This section presents spatial image quality parameters and vicarious techniques to estimate modulation transfer function, calculate ground sampling distance and identify aliasing.

Modulation Transfer Function
Modulation Transfer Function (MTF) is a spatial image quality metric that measures the sharpness of the image generated by a linear, shift-invariant imaging system. MTF characteristics are typically estimated prior to the launch of a remote sensing satellite; however, vibrations during launch, the transition from air to vacuum, the varying thermal state of the sensor and/or changes in material properties over time may change the MTF characteristics of the sensor [162]. Consequently, MTF estimation is necessary while an imaging system is operating on-orbit. Over the years, numerous MTF estimation methods have been developed, and they can be divided into artificial (human-made) target-based and natural-target-based methods. They can also be divided based on the properties of the target, such as edge [163], pulse or line [164,165], and impulse methods [166]. In this article, MTF estimation methods are classified based on the target properties. Table 5 presents a summary of the strengths and weaknesses of each method, including target types. The following subsections delineate the methods along with their advantages and disadvantages, and provide recommendations on the methods for remote sensing satellite MTF estimation.

Edge Method
The edge method exploits sharp edges in images acquired by a remote sensing sensor to estimate the amount of blur in the imagery. Homogeneity on either side of the edge, to maintain a low noise level, and a high-contrast edge are required to obtain a reasonable estimate of the MTF [167]. Moreover, the edge should be straight to ensure that only the system performance is estimated; otherwise, edge alignment will be necessary during data processing. If possible, edge analysis in two directions (horizontal and vertical) from two perpendicular planes should be performed. Helder et al. suggested that the width of the edge should be between 6 and 10 pixels, and its height should be greater than 20 pixels [166].
Additional details of the edge target requirements can be seen in [168]. An edge target can be either natural, such as the Moon [169], agricultural field boundaries [170], or sea/icefield transitions [171], or artificial, such as painted concrete (often designed as a checkerboard pattern, as seen in Figure 12a), tarps [172], or parking lots. The natural or artificial MTF target is selected based on the trade-offs between sensor GSD, uniformity in the bright and dark regions, and line sharpness. The MTF of a low-resolution sensor is usually estimated using the edge of the Moon, since a sufficiently large edge target is not available in an Earth scene and would be expensive to build [173][174][175]. Some of the artificial and natural MTF targets can be seen on the USGS test site webpage [176]; however, recent imagery (from Google Earth, Google Maps and Bing Maps) reveals that most of the artificial sites are not viable for MTF estimation at this time. Thus, in order to perform MTF estimation using the edge method, one may have to construct an artificial target; the size of the MTF target will vary depending on the sensor GSD, and homogeneity in the bright and dark regions and line sharpness must be achieved for a reasonable estimate of the MTF. Figure 12b illustrates the steps used to estimate the MTF from an edge-target system response [177]: as explained by Helder et al. in [178], the obtained edge spread function (ESF) is differentiated to obtain the line spread function (LSF), and the normalized magnitude of the Fourier transform of the LSF results in the MTF.
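The ESF-to-LSF-to-MTF processing chain described above can be sketched in a few lines. The Gaussian blur width and oversampling spacing below are illustrative assumptions, not parameters from the cited studies; a real analysis would start from an aligned, oversampled edge profile extracted from imagery:

```python
import numpy as np

def mtf_from_edge(esf, dx=1.0):
    """Estimate the MTF from an edge spread function (ESF): differentiate
    the ESF to obtain the line spread function (LSF), then take the
    magnitude of its Fourier transform and normalize so MTF(0) = 1."""
    lsf = np.gradient(esf, dx)               # ESF -> LSF
    mtf = np.abs(np.fft.rfft(lsf))
    mtf = mtf / mtf[0]                       # unity at zero frequency
    freqs = np.fft.rfftfreq(len(lsf), d=dx)  # cycles per pixel
    return freqs, mtf

# Synthetic example: an ideal edge blurred by an assumed Gaussian
# (sigma = 1.5 pixels), oversampled at 0.25-pixel spacing.
dx = 0.25
x = np.arange(-12, 12, dx)
lsf_true = np.exp(-x**2 / (2 * 1.5**2))
esf = np.cumsum(lsf_true) * dx               # blurred edge profile
freqs, mtf = mtf_from_edge(esf, dx)
# The MTF starts at 1 and falls off with frequency, as expected for a blurred edge.
```

With a strongly blurred edge like this one, the estimated MTF is already near zero at the pixel-level Nyquist frequency (0.5 cycles/pixel).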

Pulse or Line Method
The pulse method utilizes a target made up of a bright area surrounded by dark areas, which appears as a step pulse. In this method, the input to the system is a step pulse, and the sensor output is referred to as the pulse response function (PRF). The ratio of the magnitude of the Fourier transform (FT) of the PRF to the magnitude of the FT of the step pulse gives the MTF. This method is known as the pulse method since the system input is a step pulse; however, when the width of the pulse is extremely narrow, it can be referred to as a line method [165]. As artificial or "human-made" targets, tarp-built pulses and bridges have been used in the past to estimate the MTF [164,180]. A few requirements of the pulse target are as follows: the pulse target must maintain homogeneity throughout the pulse and edges, and the width of the pulse must be chosen carefully so that a zero-crossing point of the sinc function (the FT of the input step pulse) does not occur at the Nyquist frequency [11]. To avoid this, Helder et al. in [166] suggested a pulse width of 3 GSD as optimal. More details of the pulse target requirements can be seen in [168]. Since zero-crossing points are present in the sinc function, and there is almost certainly noise in the system, the division in Fourier space will produce errors in the MTF estimate at certain frequencies. Therefore, care must be taken while performing this method. Despite the advantage of obtaining the LSF directly from the sensor output, this method can be difficult to utilize, since targets are expensive to build and must be deployed effectively, which requires experienced personnel. As mentioned in the edge method section, an artificial pulse target should be developed based on the sensor GSD, and uniformity and high contrast or sharpness between the dark and bright regions should be ensured for a reasonable MTF estimate. As a natural target, bridges can be used for MTF estimation of moderate-spatial-resolution sensors (from 10 to 60 m GSD) [181].
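The zero-crossing constraint can be checked numerically: the Fourier transform of a rectangular pulse of width w (in GSD units) has zeros at k/w cycles per GSD, so a 2-GSD pulse places a zero exactly at the Nyquist frequency (0.5 cycles/GSD), while the 3-GSD width suggested by Helder et al. keeps the zeros clear of it. A small sketch (the widths and frequency limit are illustrative):

```python
def sinc_zeros(width_gsd, fmax):
    """Zero-crossing frequencies (cycles/GSD) of the Fourier transform of
    a rectangular pulse of the given width (in GSD units), up to fmax."""
    zeros, k = [], 1
    while k / width_gsd <= fmax:
        zeros.append(k / width_gsd)
        k += 1
    return zeros

nyquist = 0.5                  # cycles per GSD
bad = sinc_zeros(2.0, 1.0)     # 2-GSD pulse: a zero lands exactly at Nyquist
good = sinc_zeros(3.0, 1.0)    # 3-GSD pulse: zeros at 1/3 and 2/3, clear of Nyquist
```

Dividing by the sinc in Fourier space near one of these zeros amplifies noise without bound, which is why the pulse width must be chosen so that no zero coincides with the frequencies of interest.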

Impulse Method
The MTF of an imaging system can also be estimated from the impulse response of the system; in this method, the input to the system is an impulse (hence the name 'impulse method'), and the output is a point spread function (PSF). The ratio of the Fourier transforms of the PSF and the impulse is then normalized to obtain the corresponding MTF. A full, two-dimensional (2D) PSF estimate from the system output reduces the complexity and uncertainty of calculating the MTF, which is the main advantage of this method over the edge and pulse methods. However, the obtained PSF is almost certainly noisy, which limits the accuracy of the estimated MTF [182]. An impulse target is simply a point source, which can be either artificially built or natural (e.g., stars). As artificial targets, researchers have used active sources such as spotlights [183,184] and passive sources such as convex mirrors [166] as inputs to the imaging satellite. Stars have been used as natural point targets for several satellites' MTF estimation [185][186][187], and MTF estimation using celestial targets has the advantages of a lack of atmosphere and the availability of celestial scenes from any orbit. However, locating stars with appropriate spacing and changing the imaging direction of the satellite are the major constraints of using stars as point targets. Even though artificial targets are expensive and time-consuming to build, as shown by Rangaswamy [188] and Leger [183], these targets are appropriate for high-spatial-resolution sensors' MTF estimation. Compared to the edge and pulse methods, the impulse method can be advantageous since it provides a full two-dimensional understanding of the MTF, but several artificial point sources are needed to obtain a full 2D MTF, which might be a challenging task to accomplish.
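Once a 2D PSF has been measured, the 2D MTF follows from a direct normalization of its Fourier transform magnitude. In the sketch below, the Gaussian PSF is a synthetic stand-in for a measured point-source response (its width is an arbitrary assumption), since a real PSF would come from imaging a point target:

```python
import numpy as np

def mtf2d_from_psf(psf):
    """Normalized 2D MTF from a measured point spread function:
    magnitude of the 2D FFT, shifted so zero frequency is centered,
    and scaled so that MTF = 1 at zero frequency."""
    psf = psf / psf.sum()                              # unit volume
    mtf = np.abs(np.fft.fftshift(np.fft.fft2(psf)))
    return mtf / mtf.max()

# Synthetic 33x33 Gaussian PSF (assumed shape, for illustration only)
y, x = np.mgrid[-16:17, -16:17]
psf = np.exp(-(x**2 + y**2) / (2 * 2.0**2))
mtf = mtf2d_from_psf(psf)
# mtf peaks at 1 at the center (zero frequency) and decays outward.
```

In practice, the measured PSF would first be denoised or model-fitted, since noise in the PSF propagates directly into the high-frequency MTF estimate, as noted above.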
High-spatial-resolution satellites' MTF can be estimated from either artificial or natural point targets, depending on the capability of the sensor and other factors such as cost and experienced personnel for target building.

Aliasing
Aliasing is a type of spatial artifact that arises from an insufficient sampling rate; it can occur because of under-sampling during analog-to-digital conversion or resampling. Insufficient sampling fails to capture high-frequency scene content; as a result, repeated patterns, such as jaggedness on line features, thin structures and edges, become prominent near high-frequency components [189,190]. These repeated patterns are known as aliasing artifacts, which reduce image quality and affect every subsequent application that uses the aliased data. For instance, the impacts of spatial aliasing on data fusion are reported in [191], and on sea-ice thickness measurement in [192]. Thus, in order to ensure high-spatial-quality data from an Earth-observing satellite, spatial aliasing should be detected and removed before the data are delivered to the user. Aliasing depends not only on the spatial content of the scene but also on the sensor MTF [193]. Since aliasing hinges on image features, scenes that contain high-frequency content (i.e., numerous edges and lines) are appropriate for visually detecting aliasing. However, there may be aliasing that is not visible to the naked eye. In order to detect both visible and invisible aliasing, Coulange et al. proposed an algorithm based on suspicious colocalizations of Fourier transform coefficients [194]. This method detects aliasing from only one image and reduces aliasing while preserving high-frequency image details, which are the major advantages of the colocalization approach over the other frequency-domain aliasing detection and correction approaches presented in [190,195,196].
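The folding behavior that produces these artifacts can be demonstrated in one dimension: a scene frequency above the Nyquist limit reappears at a lower, folded frequency after sampling. The sampling rate and signal frequency below are arbitrary illustrative choices, not values from any cited sensor:

```python
import numpy as np

fs = 10.0        # sampling rate (samples per unit distance); Nyquist = fs/2 = 5
f_true = 7.0     # scene frequency ABOVE the Nyquist limit
n = 100
x = np.arange(n) / fs
signal = np.sin(2 * np.pi * f_true * x)      # under-sampled sinusoid

# Locate the dominant frequency in the sampled data
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(n, d=1 / fs)
f_observed = freqs[np.argmax(spectrum)]
# Under-sampling folds the 7.0 component about Nyquist:
# it is observed at fs - f_true = 3.0, not at 7.0.
```

The 7.0-cycle component is indistinguishable, after sampling, from a 3.0-cycle component; in imagery the same folding turns fine periodic detail into the coarser repeated patterns described above.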

Ground Sampling Distance
Ground Sampling Distance (GSD) is the center-to-center distance between adjacent pixels in an image. It is one of the most popular spatial quality indicators of a remote sensing sensor, since it quantifies the spatial resolution of an imaging system [197]. The GSD provides information about the detectable objects in the imagery; for example, a low GSD reveals small objects in the images. Since all applications that make use of spatial image information require accurate knowledge of the GSD, GSD measurement is necessary for potential users of the data. The GSD can be calculated from the relationship GSD = (p·H)/(f·cos θ) among the detector pixel pitch p, the focal length of the instrument f, the altitude of the satellite H, and the look angle θ [198]. If the altitude of the satellite or any other parameter is not known, then the GSD can be estimated from an image using the known distance between two points on the ground: the number of pixels between the two points is counted, and the ratio of the ground distance to the number of pixels gives the GSD estimate [199].
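Both routes to the GSD reduce to one-line computations. The pixel pitch, focal length, altitude, and ground-distance values below are hypothetical, chosen only to exercise the formulas:

```python
import math

def gsd_from_geometry(pixel_pitch_m, focal_length_m, altitude_m, look_angle_deg=0.0):
    """GSD = (p * H) / (f * cos(theta)), per the relationship in the text."""
    theta = math.radians(look_angle_deg)
    return pixel_pitch_m * altitude_m / (focal_length_m * math.cos(theta))

def gsd_from_ground_distance(distance_m, n_pixels):
    """GSD estimated from a known ground distance spanning n_pixels."""
    return distance_m / n_pixels

# Hypothetical sensor: 6.5 um pixel pitch, 0.65 m focal length, 500 km altitude
gsd = gsd_from_geometry(6.5e-6, 0.65, 500e3)       # 5.0 m at nadir
# Cross-check from imagery: a known 1500 m baseline spanning 300 pixels
gsd_img = gsd_from_ground_distance(1500.0, 300)    # 5.0 m
```

Note that off-nadir viewing (θ > 0) increases the effective GSD, since cos θ < 1 in the denominator.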

Geometric Quality Parameters and Methods
This section presents geometric quality parameters and vicarious techniques to assess registration accuracy and geodetic accuracy.

Registration Accuracy
The registration accuracy of a spaceborne imaging system refers to the closeness of intra-band and multi-temporal image-to-image pixel registration. Band-to-band and image-to-image misregistration create significant problems in change detection, spatio-temporal fusion, classification accuracy, etc.
Change detection is sensitive to image-to-image registration error: Dai et al. showed that, to attain less than 10% error in change detection, a registration accuracy of better than one-fifth of a pixel is required [30]. Over the years, spatio-temporal fusion techniques have gained popularity, since they can address the problem of coarse spatial and temporal resolution [200]. The research in [31] showed that image-to-image registration error significantly impacts the spatio-temporal fusion accuracy between MODIS and Landsat 7 ETM+ images. Low band-to-band registration accuracy strikingly reduces image sharpness and leads to misclassification [32]. The impact of band-to-band misregistration on science data products is higher in non-homogeneous areas than over spatially homogeneous targets [201]. It is evident from the above discussion that the use of remotely sensed data requires high band-to-band and image-to-image registration accuracy.

Band-To-Band Registration Accuracy
Band-to-band registration (BBR) accuracy is a measure of the alignment among the bands of a scene acquired by an imaging sensor. As stated earlier, numerous applications require high BBR accuracy; consequently, BBR accuracy characterization is an important quality parameter for any remote sensing system. A few methods of band-to-band calibration for remote sensing sensors have been reported. For instance, an on-board calibrator is used for Terra MODIS intra-band calibration [202], and the result from the on-board calibrator is validated using a ground scene approach [203]. The absolute BBR accuracy of these two methods has not been reported. The measured band alignment deviation between the on-board and ground scene approaches is approximately 20 m on average for the visible-to-NIR bands of Terra MODIS. On-board calibrators are not an attractive option for this task, since they increase the cost and complexity of the system. Furthermore, a ground scene approach requires constructing specific dark areas over a bright target; details can be seen in [203]. Another BBR accuracy assessment approach utilizes lunar observations; however, the lunar approach is primarily used for assessing the stability of BBR [204]. The stability of BBR has been assessed from lunar observations of MODIS and the Visible Infrared Imaging Radiometer Suite (VIIRS), and the uncertainties of the MODIS lunar-based and on-board approaches are found to be of the same order (though the actual uncertainty is unknown) [205,206]. The required change in imaging direction and the correction of seasonal variations in the Moon's appearance are among the major constraints of the lunar method. Additionally, this method is usually used for low-spatial-resolution, larger-footprint sensors. For those reasons, the lunar-based method is not a practical option for Earth-observing satellites.
Cross-correlation is an image-feature-based BBR accuracy assessment approach. The easy-to-implement, computationally fast cross-correlation method works well for satellite images, since a satellite image patch contains sufficient signal without too much high-frequency content [207]. Additionally, geometric and radiometric distortions are usually kept to a minimum in remote sensing datasets, which is a requirement for a cross-correlation approach. Thus, this method has been used for high-resolution remote sensing sensor BBR accuracy assessments in recent years [28,208,209]. A minimal inter-band spectral difference is one of the major preconditions of the cross-correlation method. Consequently, cloud-free acquisitions of desert sites with little-to-no vegetation are typically used for BBR accuracy assessment [208]. For instance, Landsat 8 OLI BBR accuracy was measured using 18 cloud-free desert scenes [208], and Landsat 7 ETM+ BBR calibration was performed with 27 Earth scenes scattered over several desert sites [209]. Based on the studies in [208,209], it is clear that quite a few cloud-free desert scenes are required to perform this method. A cross-correlation approach was used to assess EO-1 ALI BBR performance, and the results were compared with Landsat 7 ETM+ performance; the positional offset between ALI and ETM+ bands was found to be approximately 0.08 pixel (for the 30 m ALI and ETM+ bands) [28]. The absolute accuracy of the cross-correlation approach has not been reported in any of the above-mentioned references; however, the ALI and ETM+ comparison indicates the efficacy of the cross-correlation approach. Therefore, cross-correlation can be a suitable approach for remote sensing satellites' BBR accuracy assessment.
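A minimal sketch of the cross-correlation idea is shown below, using a synthetic random patch in place of a real desert scene and recovering only integer-pixel offsets; operational BBR assessments such as those cited above use oversampled correlation surfaces to reach subpixel (e.g., 0.08-pixel) precision:

```python
import numpy as np

def band_offset(band_a, band_b):
    """Integer-pixel misregistration of band_a relative to band_b via
    FFT-based cross-correlation (assumes a circular shift model, which
    holds well for small offsets within a large patch)."""
    a = band_a - band_a.mean()
    b = band_b - band_b.mean()
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices to signed shifts
    shifts = [p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape)]
    return tuple(shifts)   # (row_shift, col_shift)

# Synthetic check: shift a random "scene" patch by (2, -3) pixels
rng = np.random.default_rng(0)
patch = rng.random((64, 64))
shifted = np.roll(patch, (2, -3), axis=(0, 1))
offset = band_offset(shifted, patch)   # recovers (2, -3)
```

The same machinery applies to two spectral bands of one scene, provided the inter-band spectral difference is small, which is precisely why near-featureless-spectrum desert scenes are preferred.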

Image-To-Image Registration Accuracy
Image-to-image registration (IIR) accuracy is a measure of the alignment among multi-temporal images of the same target acquired by an imaging sensor. Image registration accuracy impacts every application that uses a temporal remote sensing dataset. In order to improve IIR accuracy, many methods of image registration have been developed over the past decades. They can be classified into two broad categories: (i) area-based and (ii) feature-based methods. Area-based approaches compare spatial patterns of the intensity values in small image subsets [210], whereas feature-based methods rely on the identification of salient spatial features such as edges or unique shapes [211,212]. One of the major disadvantages of a feature-based approach is that the target features may change over time, which will lead to poor accuracy assessment. An area-based method, on the other hand, works without detecting prominent objects and thus is not affected by changes in image features. However, area-based methods are computationally slower than feature-based approaches, although faster modern computers can overcome that problem. Therefore, an area-based method, such as image correlation [213], would be suitable for image registration accuracy assessment. An area-based image correlation approach was used to assess the IIR accuracy of Landsat 7 ETM+ and Landsat 8 OLI; the IIR accuracy of both sensors is within the 12 m specification, which might be an indication of the effectiveness of the area-based method [208,209].

Geodetic Accuracy
Geodetic accuracy is usually referred to on an absolute scale, and it is a measure of the geolocation accuracy of the image data created by the imaging system of a spacecraft [208]. Geolocation accuracy specifies the geometric performance of the satellite system operating in orbit, and it is measured using highly accurate Ground Control Points (GCPs). These GCPs are located in geometric calibration sites, and their actual coordinates are known. For instance, the USGS EROS range in Sioux Falls, South Dakota, USA, has over 400 highly accurate GCPs which can be used to assess the geodetic accuracy of a space-borne imaging system. A calibration site with highly accurate GCPs, along with their coordinates, and a test image of that site from the satellite sensor are the two requirements for assessing geodetic accuracy. The steps used to measure absolute geodetic accuracy are fairly simple and can be seen in [28]. Briefly, the GCPs are first identified in the test images, and the locations of the identified GCPs are compared with their actual known locations. Then, the mean, root-mean-squared, and standard deviation of the along- and across-track errors can be calculated for each of the tested scenes to compare scene-by-scene performance. The uncertainty of the GCP-based approach depends on the number of control points used to estimate geodetic accuracy; however, the quantitative impact of the number of control points on uncertainty is unknown [28].
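The per-axis error statistics described above reduce to a few lines of computation. The GCP coordinates below are hypothetical values (in metres, as (along-track, across-track) pairs) fabricated purely to exercise the calculation:

```python
import numpy as np

def geodetic_accuracy(measured_xy, true_xy):
    """Per-axis mean, root-mean-squared, and sample standard deviation of
    the along- and across-track errors between GCP locations identified
    in a test image and their surveyed coordinates."""
    err = np.asarray(measured_xy, float) - np.asarray(true_xy, float)
    mean = err.mean(axis=0)
    rmse = np.sqrt((err ** 2).mean(axis=0))
    std = err.std(axis=0, ddof=1)          # sample std deviation
    return mean, rmse, std

# Hypothetical image-derived vs. surveyed positions for four GCPs
measured = [(10.0, 5.0), (12.0, 4.0), (11.0, 6.0), (9.0, 5.0)]
true     = [( 8.0, 3.0), (10.0, 3.0), (10.0, 4.0), (8.0, 4.0)]
mean_err, rmse, std = geodetic_accuracy(measured, true)
```

Computing these statistics per scene allows the scene-by-scene comparison mentioned above; a nonzero mean error indicates a systematic geolocation bias, while the standard deviation captures scatter about that bias.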
Another approach to absolute geodetic accuracy assessment uses Global Land Survey (GLS) scenes that contain control points. Test sites with GCPs provide better accuracy than GLS control points; however, test sites with highly accurate GCPs are not available on a global scale. Consequently, a GLS-based approach can be an option for assessing the absolute geodetic performance of an Earth-observing satellite. Landsat 8 absolute geodetic accuracy using the GCP and GLS approaches differs by about 30 m, with the GCP approach yielding higher geodetic accuracy than the GLS approach [208].

Conclusions
This article presents a critical review of image-quality criteria and the best vicarious methodologies to assess and improve the optical images of Earth-observing satellites, exploring radiometric, geometric, and spatial quality categories. Knowledge of these quality categories is critical, since the data user community expects radiometrically, geometrically and spatially accurate data. Signal-to-noise ratio, absolute calibration, relative calibration, radiometric stability, and image artifacts are found to be the primary on-orbit radiometric characterization parameters. The spatial quality of remote sensing images is defined by the modulation transfer function, ground sampling distance and aliasing. Registration and geodetic accuracy are the geometric quality evaluation criteria for a spaceborne imaging system. Each of the parameters can be assessed or quantified by multiple methods. In this work, the best quality assessment and improvement methods have been identified, including their strengths, weaknesses, and requirements (such as the type and number of images required to perform the task). Additionally, methods have been recommended based on their strengths and weaknesses, and the processing steps of each method are outlined, along with examples.
As mentioned throughout this article, the quality of the observations generated by Earth-observing satellites is essential for every subsequent application. Therefore, this complete review of remote sensing image quality and best-practice methods will help satellite owners and operators decide which quality assessment methods to rely on, and will help data users understand the different quality criteria so that they are aware of the quality of the scientific observations.