## 1. Introduction

The presence of clouds, atmospheric absorption, weather effects and sensor-introduced noise can be important causes of distorted and missing data in satellite imagery. Therefore, numerous gap-filling and smoothing techniques have been developed in the past few years. Next, we cite some of these techniques. Timesat [

1,

2] and Hants [

3] are very popular because of their good performance and free access. Both methods are mathematical procedures based on filtering and harmonic analysis and use multitemporal imagery for filling gaps, but do not use external auxiliary information. These approaches can be applied to long series of data and smooth complete series of images without any prior specification of the gaps to be filled. The moving weighted harmonic analysis (MWHA) [

4] is another method based on a modification of the previous cited Hants technique that incorporates a moving support domain to assign the weights for all the points, therein greatly simplifying the determination of the frequency number. The spatially- and temporally-weighted regression (STWR) method [

5] is more strongly focused on filling cloud gaps. This approach requires a one-by-one identification of the gaps and utilizes a simple least-squares linear regression to capture the temporal trends characterizing invariant similar pixels. Vuolo et al. [

6] developed a proposal for smoothing and gap-filling high-resolution multi-spectral time series using the Whittaker smoother and derived information about the neighborhood. Gapfill [

7] is a recent alternative that uses specific ordering among images and quantile regression for the gap-filling process. This technique achieves a great performance because it produces smoothed images of high quality; however, it is slow when filling large gaps, and it needs a prior specification of the gaps to be filled.

The MODIS and LANDSAT missions are carrying out similar programs for processing common remote sensing data derived from multi-temporal series of images in a routine manner. The normalized difference vegetation index (

NDVI) and the day and night land surface temperature (LST) are two examples of such data. Smoothing

NDVI is frequently conducted with the maximum value composite (MVC) procedure [

8]. It assigns the maximum value of the time series of pixels across the composite period. Alternative techniques include the use of a bidirectional reflectance distribution function (BRDF-C), the constrained maximum value composite (CV-MVC) [

9], the mean value iteration filter (MVI), the changing-weight filter (CW) and the interpolation for data reconstruction (IDR) technique [

10]. For day and night LST, it is common to average the cloud-free pixels over the composite period [

11]. Composite images are then of weekly or biweekly temporal resolution. However, even after the composition process, residual noise can still arise in images, and therefore, a smoothing procedure using high-spatio-temporal-resolution ground-truth data is proposed in this paper. Frequently, remote sensing data are proxy variables in mathematical or statistical models for improving the predictions of ground-truth data. For example, land surface temperature from MODIS MOD11A2 was used as a proxy variable in a spatio-temporal regression model for smoothing temperatures in Croatia [

12], but used a reduced number of images and rain gauge stations. To improve the estimation of the area occupied by olive trees, a unit level linear mixed model was proposed, whereby the auxiliary variable came from multispectral Landsat 7 ETM images [

13]. Different geostatistical models were checked with soil data collected with automated sensor networks to provide the spatio-temporal interpolation of soil water, temperature and electrical conductivity in the Cook Agronomy Farm dataset [

14]. A distributed lag model for assessing the impact of current and previous 16-day rainfall anomalies was proposed based on the enhanced vegetation index as a proxy for above-ground net productivity in South Africa [

15]. A complete reconstruction of time series of daily LST data in central Europe from 2003 to 2016 was achieved in [

16], where remote sensing emissivity and elevation were used as external covariates in the spatio-temporal interpolation process. The authors used air temperature ground-truth data to check the predicted values. In this case, various remote sensing data were the explanatory variables of the models used for improving the predictions in other remote sensing data. Research wherein ground-truth data are used as proxy variables wherein remote sensing data including the target variable are more scarce. Only recently, we have found a cubic spline model that has been combined with a weighted least squares regression [

17] for extracting the seasonality characterizing the land surface temperature and a space-state model (SSM) that has been introduced for detecting changes in trends of surfaces occupied by different categories of

NDVI in continental Spain during 2011, 2012, and 2013 [

18].

In this work, we propose to merge multi-temporal remote sensing data with ground-truth data using a thin plate spline smoothing method with external auxiliary variables. Thin plate splines can be seen as a generalization of a regression model wherein the linear relationship between the response variable and the coordinates is substituted by a non-parametric function providing a more flexible fitting [

19]. Under certain conditions, the thin-plate splines are formally equivalent to kriging [

20], and because the measure of smoothness is invariant under the rotation of the coordinates, thin-plate splines are specially suited for spatial data. The thin-plate splines have an important advantage with regard to universal kriging because they do not need to fit variograms. Moreover, the estimators of the thin-plane splines can be obtained in a closed form by solving a linear system of

n equations, where

n is the number of data points. The approach used here is based on fitting second-order splines to the anomalies obtained from the median of the target gap image neighborhood, therein using the temporal and spatial dependences of the previous and subsequent images across different years for smoothing potential outliers. Using only the coordinates, but without external covariates, the performance of thin-plate splines for filling gaps was evaluated in a simulation study, which showed that this technique is faster and more accurate than Timesat, Hants and Gapfill when filling different sizes of gaps in day and night LST and is equally competitive when filling

NDVI gaps [

21]. Here, we improve this proposed method by including auxiliary variables from ground-truth data, which must be fitted at the same spatio-temporal resolution as remote sensing data. We also focus on smoothing outliers instead of filling gaps, and we compare the performance of the newly-proposed method with a spatio-temporal state-space model using the same ground-truth data. We check the performance through a simulation study whereby time series of three different remote sensing datasets are altered and later smoothed with these models.

Figure 1 shows the flowchart of the paper. First, meteorological and remote sensing data are taken. Second, daily ground-truth information needs to be averaged over eight- and 16-day periods to match the corresponding day and night LST and

NDVI time periods. These data are interpolated in the study region of Navarre (Spain) (see

Figure 2) through a thin-plate spline model with altitude as the external covariate. This is a preliminary task because these data are a small discrete set, and greater spatial resolution is recommended for predicting. For the simulation study, remote sensing data are altered with different sizes of distortions, i.e., 5%, 10%, 15% and 20%, and different magnitudes are used according to the derived variables. Third, we fit different models for every outlier outbreak: a state-space model with covariates (SSMWc), a thin-plate spline model with covariates (TpsWc), a state-space model without covariates (SSMWoc) and a thin-plate spline model without covariates (TpsWoc). All of these methods provide multi-temporal smoothed series of the three remote sensing datasets. Finally, to compare model performances, we obtain the square root of the mean squared prediction error, which will be summarized in graphs and tables.

## 2. Data

Navarre is a region of approximately 10,000 km

${}^{2}$ located in the north of Spain (see

Figure 2). Elevations vary between 200 and 2500 meters in the highest zone of the Pyrenees, located in Northeastern Navarre. Valleys and mountains are ubiquitous in the north, and small hills are common in the central part of the province. The northwest of Navarre is humid, but not highly so, and the northeast is a mountainous region with elevations between 1459 m and 2438. The central area is characterized by a temperate Mediterranean climate, with a tendency towards a continental climate. The south is mainly flat; the climate is Mediterranean and continental with dry summers; temperatures exhibit large annual variations; there is minimal and irregular rainfall; and northerly winds are frequent. Clearly, the climate varies across the province, and large weather differences can be experienced on the same day.

In this study, we have drawn classical variables, such as the maximum temperature and humidity, from 48 rain gauge stations. The data are of daily temporal resolution, from which weekly and biweekly mean data are derived. The average distance among rain gauge stations is approximately 15 km.

The Moderate Resolution Imaging Spectroradiometer (MODIS) provided the day and night LST composite images from Version-5 MOD11A2 (Terra) and the

NDVI composite images from both Version-5 MOD13A2 (Terra) and Version-5 MYD13A2 (Aqua); see the URL [

9] for details. We have focused on composite images of the Navarre region from the 2011 to 2015 time period. Each of the day LST and night LST remote sensing datasets require 230 tiles for enclosing Navarre. These datasets correspond to 46 scenes with a temporal resolution of 8 days every year for 5 years. Terra and Aqua have different starting dates each year because Terra starts on the first day of January and Aqua starts on the ninth day of January. Therefore, to achieve two composite

NDVI images per month every year, we have retrieved 23 images from Aqua and one image from Terra, fitted to November. In total, 120 scenes with a 16-day temporal resolution have been captured across 5 years of study. The spatial resolution of each tile is equal to

$166\times 154$ (25,564) pixels of approximately 1 km

${}^{2}$. Inside the Navarre borders, 11,691 pixels are enclosed. All images have been downloaded from [

22] in Hierarchical Data Format (HDF). This format helps catalog geo-referenced images, but makes data processing more difficult. Then, all images were transformed into TIFF format to be processed by the R software [

23].

NDVI is a very useful index for agricultural mapping, yield monitoring and measuring changes in ecosystems, land use and climate at both global and regional scales [

24,

25,

26]. It is obtained in two important bands, the red (

R) and near-infrared (

NIR), and is calculated as

$NDVI=\left(\right(NIR)-R)/\left(\right(NIR)+R)$ [

27].

NDVI has a theoretical maximum of one, and its relationship to vegetation characteristics, such as biomass, productivity, percent cover and leaf area index, is asymptotically nonlinear as it approaches one.

NDVI is less sensitive to ground characteristics at higher values and essentially saturates when the leaf area index is greater than one [

28]. Barren land, sand and snow usually present very low

NDVI values, and sparse vegetation presents values of approximately between 0.2 and 0.5. High

NDVI values, roughly from 0.6 to 0.9, correspond to dense vegetation from crops at their peak growth stage and tropical vegetation.

The MODIS LST product is derived from two thermal infrared (TIR) channels: 31 (10.78 to 11.28

$\mathsf{\mu}$m) and 32 (11.77 to 12.27

$\mathsf{\mu}$m). More information can be found in [

29]. Assuming that the signal difference in the two TIR bands is caused by differential absorption of radiation in the atmosphere, the correction of atmospheric effects is performed with the split-window algorithm [

30]. Using prior knowledge of the land cover type classification based on the MODIS land cover product (MOD12C1), this algorithm corrects for emissivity effects. Uncertainty in LST estimates increases when significant variations in temperature occur at the 5-km

${}^{2}$ scale [

31]. Errors in LST retrieval may be larger in bare soil and highly heterogeneous areas due to large uncertainties in surface emissivities and when the column water vapor content is high [

31]. An eight-day composite period was chosen because twice this period is the exact ground track repeat period of the Terra platform. LST over eight days is the averaged LSTs of the MOD11A2 product for eight days. Processing these images will preserve the original temperature in Kelvin, but the day and night LST figures given in this work will be expressed in Celsius.

Both derived variables come from composite images, meaning that they have been pre-processed to reduce noise coming from atmospheric and electronic effects and that their accuracy has also been checked via ground-truth data at a coarse spatial resolution. However, when downscaling these data to small regions and comparing them with ground-truth data, we can observe, in some seasons, typically in autumn and winter, some abnormal results.

Figure 3 shows from the top to bottom the boxplots of the day LST and night LST remote sensing data and those of the mean maximum temperature (

${T}_{max}$) and the mean humidity (

${H}_{mean}$) in Navarre during 2011 for 46 time periods. Each boxplot depicts the median in the horizontal line and the dispersion of every image in a particular stage. Within each box, data between the first and third quantiles are plotted. Dots outside vertical extensions might be, but are not necessarily outliers. The meteorological variables

${T}_{max}$ and

${H}_{mean}$ have been chosen because they are the most highly correlated with day and night LST and

NDVI during the time periods.

Figure 4 shows the boxplots of

NDVI in the 24 time periods of 2011. In

Figure 3 and

Figure 4, all the variables show a clear pattern of seasonality, yet in

${H}_{mean}$, larger variability is observed. Day and night LST and

${T}_{max}$ show a parallel concave shape and therefore a positive correlation across time periods;

${H}_{mean}$ is more convex, showing a negative correlation with day and night LST.

NDVI presents a different seasonality pattern, which is negatively correlated with

${T}_{max}$ and positively correlated with

${H}_{mean}$ across the time periods. In summer, all the variables are more stable and show less variability. Similar phenology patterns are found across the study period of 2011- to 2015.

Figure 5 shows Navarre images in the third week of 2011. At the top, day LST and

${T}_{max}$ (in Celsius) are shown, and at the bottom, night LST (in Celsius) and

${H}_{mean}$ (in percentages) are presented. We can also see the large similarity among these patterns on these dates. The dots on the right images correspond to the rain gauge stations used as ground-truth data.

Figure 6 shows the altitude of Navarre and the

NDVI on the second fortnight of February 2014 because

NDVI includes 24 periods. In this figure, we can also check why similar patterns correspond to a high correlation.

## 4. Results and Discussion

To check the contribution of the ground-truth data for improving the quality of satellite imagery, a simulation study is conducted in Navarra using 120 composite images of

NDVI, 230 composite images of day LST and 230 composite images of night LST between 2011 and 2015. In this simulation study, we randomly include four different sizes of outlier outbreaks with

$5\%,10\%,15\%$ and

$20\%$ abnormal observations in each image. The distortion consists of altering 50% of the

NDVI raw data, 5% of the day LST raw data and 5% of the night LST raw data. The distortion percentages look different, but all represent approximately 50% of the range of the three remote sensing datasets used in this paper. The performance of the methods is evaluated with the square root of the mean squared prediction error, defined by:

where

${z}_{ijkl}$ and

${\widehat{z}}_{ijkl}$ are the original and predicted derived variables, respectively;

${n}_{4}=$ 11,691 is the number of pixels inside the Navarre borders; and

${T}_{m}$ is the number of images in the

m-th climatological season across the five years (see

Section 3.2). The index

$j=TpsWc,TpsWoc,SSMWc,SSMWoc$ indicates the type of model;

k is the type of distortion; and

l is the type of derived variable.

$RMSE(j,k,l,m)$ is calculated and plotted in

Figure 7 and

Figure 8.

Figure 7 depicts the RMSE of day LST and night LST for the two versions of Tps, i.e., TpsWc (blue with covariates) and TpsWoc (purple without covariates), and SSM, i.e., SSMWc (red with covariates) and SSMWoc (green without covariates). In all cases, TpsWc outperforms the remaining models, as it achieves the lowest RMSE.

Figure 8 shows the RMSE obtained after smoothing

NDVI. TpsWc is again the best method in all the climatological seasons and again presents a greater difference with regard to the SSM with and without covariates (SSMWc and SSMWoc). Nevertheless, differences between thin-plate splines with and without covariates are apparently less important than in day LST and night LST, but only because the range of the

NDVI variable is smaller than that of LST. Summarizing, all the figures show lower RMSE when using covariates in both thin-plate splines (Tps) and state-space models (SSM).

Table 1 shows the RMSE reduction percentage when both models include ground-truth information compared to when not including this information. The maximum percentage reduction is approximately 20% in night LST for TpsWc and in

NDVI for SSMWc; however, TpsWc clearly provides the lowest values of RMSE with and without covariates.

Figure 9 shows the effects of distorting the images and smoothing them in the fourth time period of November 2011. At the top and from left to right, we see the distorted image of LST with an outlier outbreak of 5%, the TpsWc smoothed image and the SSMWc smoothed image with altitude,

${T}_{max}$ and

${H}_{mean}$ as the ground-truth covariates. At the bottom, the distorted image with an outlier outbreak of 20% in the same time period and the derived smoothed images are shown. Both models remove distorted data, and SSMWc seemingly better smooths the images, but TpsWc better preserves the original image pattern. The top of

Figure 10 shows the boxplots of the distorted images, and at the bottom, the boxplots of the smoothed images in the 46 periods of 2011 are shown. We can see how the phenology of the remote sensing data is preserved after smoothing.

The simulation study shows an important decrease in the RMSE calculated with both the TpsWc and SSMWc methods; however, unless we alter the ground-truth data, we cannot evaluate what occurs in terms of RMSE when the covariates exhibit higher or lower correlations with the dependent variable. This is why we have defined a new artificial covariate linearly correlated with the remote sensing data. This correlation takes on values of $0.66,0.75,0.83,0.92$ and 1. After introducing at random 20% abnormal observations in the time series of raw images, we smoothed them using TpsWc and SSMWc with the artificial covariate. When the correlation increases, a greater reduction in the RMSE is observed, although specific values are not shown here to preserve space. On average, each time we increase the linear correlation of the artificial variable with the remote sensing data by one tenth, a 3 to 4% RMSE reduction in the TpsWc and SSMWc models is observed. Therefore, using ground-truth data for increasing the quality of composite images is a recommended option in both models, although TpsWc achieves better results, as it has the lowest RMSE. The mean running time of day and night LST for processing 230 images is approximately 13 min with TpsWc and 11 min with SSWc on a PC with an Intel Core i7-4790 (Intel Corporation, Santa Clara, CA, USA) 3.60 GHz with 4 cores and 16 GB of RAM; therefore, both models are very fast models.

## 5. Conclusions

In this work, we propose to increase the quality of satellite images through the use of ground-truth data from rain gauge stations in different stochastic models. A preliminary analysis for assessing the quality of day LST, night LST and NDVI remote sensing data in Navarre (Spain) during 2011 to 2015 has revealed that these composite images preserve the temporal and seasonal patterns for each year; however, fall and winter represent the most likely periods whereby these data could be more vulnerable to atmospheric and electronic errors, and some atypical observations can emerge. One way of avoiding abnormal values in remote sensing data is accommodating ground-truth data at high temporal and spatial resolutions when applying statistical models for smoothing. However, models involving both types of data and that are able to manage spatial and temporal stochastic dependences remain scarce. In this study, we propose a new method called TpsWc that is based on smoothing the median anomalies of the remote sensing data with a thin-plate spline model wherein the anomalies of the ground-truth data are the external covariates. The method is compared with a version without covariates (TpsWoc) and a state-space model with (SSMWc) and without covariates (SSMWoc).

The new approach (TpsWc) encompasses the temporal dependence among multi-temporal satellite images because it smooths the anomalies of the previous and subsequent images across years, and it accommodates the spatial dependence fitting non-parametric functions of the coordinates with a thin-plate spline. Ground-truth data are included as external covariates in the model after subtracting the median. We have conducted a simulation study wherein different outlier outbreaks have been randomly introduced in the original images. The study finds that TpsWc is the best option for all the variables, although SSWc presents a greater reduction in the RMSE for NDVI when using covariates. Moreover, we have found that when the covariates are more strongly correlated with the remote sensing data, a greater reduction in the root mean squared prediction error is achieved. Finally, the phenology of all the variables is preserved, but the potential outliers are removed in all the remote sensing data.

The thin-plate splines and state-space models used in this paper are applied in time-consuming procedures, and the running time increases when we increase the spatial and temporal resolutions of the images. Therefore, improving these procedures from a computational point of view remains a matter of further research, because there are many alternatives for improving the quality of the satellite images, but only a few such methods utilize ground-truth data.