A Comparison of Three Temporal Smoothing Algorithms to Improve Land Cover Classification: A Case Study from Nepal

Time series land cover data statistics often fluctuate abruptly due to seasonal impact and other noise in the input imagery. Temporal smoothing techniques are used to reduce the noise in time series data used in land cover mapping. The effects of smoothing may vary based on the smoothing method and land cover category. In this study, we compared the performance of Fourier transformation smoothing, the Whittaker smoother and a linear-fit averaging smoother on Landsat 5, 7 and 8 based yearly composites to classify land cover in Province No. 1 of Nepal. The performance of each smoother was tested based on whether it was applied on the image composites or on the land cover primitives generated using the random forest machine learning method. The land cover data used in the study span the years 2000 to 2018. Probability distributions were examined to check the quality of the primitives, and the accuracy of the final land cover maps was assessed. The best results were found with Whittaker smoothing for stable classes and Fourier smoothing for the other classes. The results also show that classification using a properly selected smoothing algorithm outperforms a classification based on the unsmoothed data set. The final land cover map, generated by combining the best results obtained from the different smoothing approaches, increased our overall accuracy from 79.18% to 83.44%. This study shows that smoothing can result in a substantial increase in the quality of the results and that the smoothing approach should be carefully considered for each land cover class.


Introduction
Thematic land cover and forest maps are important sources of information for managers and policy makers for local and national-level policies and development strategies [1][2][3][4]. Important decisions regarding land use allocation, food production, hydrological modelling and natural resource management need to be supported by accurate information on spatial and temporal land cover dynamics, which are occurring at a tremendous rate [3,[5][6][7]. Remote sensing is a widely-accepted method for land cover classification. Since commercial high resolution images, free medium resolution images and open source cloud computing platforms are becoming more available, using machine learning (ML) techniques to classify land cover types with remotely-sensed data has gained popularity [8,9].
Dealing with noise is one of the main challenges in satellite image classification schemes. Satellite images can contain noise from different sources. Atmospheric effects while the image is taken, sensor degradation and thermal noise, and processing of signals received by sensors are some of the sources that can introduce noise to the images [10]. Noise might propagate through the classification scheme and negatively affect the performance of the model and quality of the outputs [10,11]. Pixel-based classifications in particular were found to be especially sensitive to noise as a considerable proportion of the signal of a pixel might come from the surrounding pixels [12,13]. Various techniques have been proposed to reduce the impact of noise. These methods include pre-processing of the input images or post-processing the results [12].
Even without these processes, random forest, among other machine learning methods, has been found to adapt to and reduce the impacts of noise, because, unlike other algorithms, it uses bootstrapping and random split construction [14]. Random forest has been found to perform well in similar classification problems [15]. While convolutional neural networks (CNNs) have been gaining momentum in the remote sensing community, the field is still young and needs more research [16]. Another method to reduce noise and improve classification accuracy is the use of time-series remote sensing data [17,18]. However, one caveat of this technique is that image noise and sensor-introduced noise from individual scenes compound when information from multiple scenes is combined: if even one scene is noisy, it can impact the whole series [19,20]. The issue of noise can also be remedied by smoothing the data, either early in the process or before the final assemblage of a land cover map, so that noise and inconsistencies are filtered out.
Smoothing is a method of identifying the underlying structure of data, such as trends [21,22]. Multiple studies have explored the use of smoothing algorithms on time-series data to reduce noise. For example, Lunetta et al. [20] explored the use of Fourier transformation on normalized difference vegetation index (NDVI) values; Chen et al. [23] and Cao et al. [24] applied smoothing to vegetation index data using a modified Savitzky-Golay filter; and Jonsson and Eklundh [25,26] developed the TIMESAT software package, which can apply Savitzky-Golay, asymmetric Gaussian, and double-logistic algorithms to smooth NDVI time-series data from different sensors. However, these smoothing algorithms were found to be mathematically complex and computationally intensive, so various groups have proposed methodologies that simplify the process. For example, Sakamoto et al. [27] and Geerken et al. [28] explored the use of Fourier transforms for smoothing NDVI data, where the raw data is converted to harmonics and the frequencies corresponding to noise are removed. Eilers [29] introduced the Whittaker smoother, which uses a simple algorithm that runs quickly yet still balances the fidelity of noisy data with the smoothness of the resulting curve [30]. The Whittaker smoother has also recently been adopted in Google Earth Engine (GEE) to smooth MODIS EVI time series data [31], and Khanal et al. [32] used it to map built-up areas of Kathmandu with improved results.
Techniques for smoothing remote sensing data have been widely-used; however, there are few studies that compare the smoothing algorithms themselves. Shao et al. [33] focused on smoothing MODIS-NDVI data to support land cover classification. However, since classifications using ML algorithms use more than just NDVI data, studies on how smoothing multiple input features and then training models with this data are needed. Similarly, Atkinson et al. [34] compared Fourier, Whittaker Smoother, Asymmetric Gaussian and Double Logistic smoothing approaches on phenology. Other studies have been done to test the performance of multiple smoothers on MODIS-LAI products [35] and hyper-spectral imagery [36] as well. Performance of some smoothers have also been tested in the context of land cover but with smoothers such as majority filter, Gaussian smoothing and bilateral filtering [37].
This study follows the Regional Land Cover Monitoring System (RLCMS) approach to land cover classification [38,39]. RLCMS is a joint USAID and NASA collaborative project that provides support to dedicated development and sustainable landscape projects [40]. This approach introduces the concept of primitives, which can be described as biophysical layers that represent the information required to segregate land cover types [38,41,42], for example, tree canopy cover for forest/non-forest classification. The primitives are prepared from composites that are pre-processed with cloud/shadow masking, BRDF correction and terrain correction. ML techniques such as random forest are then used to generate the primitives in the form of probability layers [38,40]. These primitives are then used, in accordance with the land cover typology, in a decision tree that outputs the required land cover map. While a land cover type can depend on multiple primitives, in this study we focus on definitions that allow a single primitive per type, as our focus is on studying the impact of smoothing rather than building an extensive land cover map.
Saah et al. [38] applied the Whittaker smoothing algorithm on primitive layers before assembling the final land cover map in order to stabilize the time series and remove noise. However, they did not explore and evaluate the impact of including smoothing at other stages of the image processing. Moreover, specific primitives or land cover classes may perform better with a different smoothing algorithm. Therefore, this study explores the impact of temporal smoothing on image classification accuracy by following the RLCMS methodology and testing the approaches across all classes at multiple stages of the workflow. This was done by applying three different smoothing algorithms to eight land cover classes. We compared the final accuracy of applying smoothing during pre-processing and post-processing. This study will help guide future work on land cover mapping in choosing appropriate smoothing methodologies to increase the accuracy of the final results.

Study Area
Province No. 1, the easternmost province of Nepal, was selected as the study area. The province stretches from approximately 86.1358° E to 88.2217° E and 26.3298° N to 28.1310° N. This area was chosen because it has extensive topographical and land cover type variation within a short north-south distance: the south contains the flat plains of the Terai region, which progress to hills and snowy mountains in the north. The 2010 land cover map of Nepal (Figure 1) shows that the study region contains all land cover types presented by Uddin et al. [43].

Land Cover Typology
The land cover maps of Nepal prepared by Uddin et al. [43] were used to develop the land cover typology for this study. From the 12 classes presented in the Uddin et al. land cover map, we derived eight classes to further examine. For this study all forests were consolidated into a single class. Additionally, we created a new class "river bed" in order to obtain better segregation from classes such as built-up and barren land. When not separated, river beds have a tendency to be classified as either built-up or barren land with no real consistency. The complete typology used for this study is shown in Table 1.

Reference Data Collection
The reference data were labelled through visual interpretation in Collect Earth [44,45], where the plot size was set to equal the 30 m spatial resolution of the Landsat images used in land cover mapping. Each plot had 49 points, which were labelled according to the observed land cover type. The plots were generated on a systematic grid with 2 km between sample plots, and additional reference points were added through opportunistic sampling for underrepresented classes. The land cover label for each plot was derived from the majority type occurring within the plot. The training and validation points were randomly split from this reference data set, with 86,093 points used for training and 4299 for validation. The distributions of these points can be seen in Appendix A. The maximum number of validation points per class was capped to avoid the accuracy being skewed by a large number of points in a more accurate class. Because of persistent cloud cover, only 3677 of the 4307 validation points were used to perform the accuracy assessment.

Land Cover Classification
The overall methodology of classification and comparison using various approaches is shown in Figure 2.

Optical Image Processing
Landsat 5, 7 and 8 surface reflectance images in GEE were used to prepare time series composites for the years 2000 to 2018. First, each scene was pre-processed by shadow masking, cloud masking, and then applying BRDF and topographic corrections [40]. The pre-processed scenes were used to create yearly composites. To prepare these composites, six bands from each scene were used: blue, green, red, NIR, SWIR1 and SWIR2. The closest real pixel to the median, referred to as the 'medoid', and the 20th and 80th percentiles of those bands were stored in a composite. Additionally, the 20th and 80th percentiles of NDVI and NDWI were stored, and information from auxiliary layers such as the JRC global surface water and SRTM DEM layers was added to the composite. The resulting layer, along with the labelled reference data, was used to create primitives for each land cover type. The primitives were generated using a random forest classifier with 100 trees. Feature importance was also calculated to avoid overfitting. These primitives were then run through a decision tree classifier to obtain the final land cover maps.
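The compositing itself was done in GEE; as a rough single-pixel illustration of the medoid idea (our own simplification, not the paper's code), the observation closest to the per-band median can be selected like this:

```python
import numpy as np

def medoid_composite(observations):
    """Pick the observation closest to the per-band median.

    observations: array of shape (n_scenes, n_bands) holding the valid
    (cloud-free) reflectance values of one pixel across a year.
    Returns the actual scene values ("real pixel") nearest to the median.
    """
    med = np.median(observations, axis=0)                 # per-band median
    dists = np.sum((observations - med) ** 2, axis=1)     # squared distance to median
    return observations[np.argmin(dists)]
```

Unlike a plain median, the medoid is guaranteed to be a spectrally consistent observation that actually occurred, which is why it is preferred for multi-band composites.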

Application of Smoothing
Three smoothing algorithms were tested using two different approaches, while everything else was kept constant. In the first approach, smoothing was applied during pre-processing on the composites, where all spectral bands were smoothed temporally over our study period from 2000 to 2018. These smoothed bands were then combined with the reference data to train a random forest classifier, which was used to create primitives for each class. This technique was repeated with each of the three smoothing algorithms. In the second approach, primitives for each class were first prepared using unsmoothed composites and the reference data. Each probability layer was then smoothed temporally to obtain a smooth time series of primitives for each class. This technique was also repeated for each of the three smoothing algorithms. The GEE code for these implementations can be found in Appendices B.1 and B.2. This process resulted in seven sets of primitives: three from applying each smoothing algorithm to the composites, three from applying each algorithm to the primitives, and one from the original data with no smoothing applied.
The three algorithms that were used are detailed below:

Whittaker Smoothing
The Whittaker smoother is a popular smoothing approach that has been shown to work well with evenly-spaced data [29,46]. The Whittaker smoother can be considered a special case of B-spline smoothing where the number of knots equals the number of data points [47]. Studies conducted by Hermance et al. [48,49] have successfully demonstrated its use in signal processing. The Whittaker smoother works by fitting a smooth series to a discrete data set. For a noisy series y and its corresponding smooth series z, the fidelity to the data and the roughness of z have to be balanced. The fidelity S is the sum of squared differences,

S = Σ_i (y_i − z_i)²  (1)

and the roughness R of the smooth data is expressed through second-order differences,

R = Σ_i (Δ²z_i)² = Σ_i (z_i − 2z_{i−1} + z_{i−2})²  (2)

With the goal of minimizing the balanced combination of the two, the combination Q can be expressed as

Q = S + λR  (3)

where λ controls the degree of smoothness: the larger λ is, the smoother z is. Minimizing Q leads to the linear system noted in matrix form in Equation (4):

(I + λDᵀD) z = y  (4)

where I is the identity matrix, λ is the smoothing degree (the larger λ is, the smoother z will be), and D is the differential matrix with m − 2 rows and m columns, d = 2 being the order of differences. Each row i of D contains the pattern 1 −2 1, shifted such that d_{i,i} = 1, d_{i,i+1} = −2, d_{i,i+2} = 1 and all other values are 0. For example, for m = 5:

D = ⎡ 1 −2  1  0  0 ⎤
    ⎢ 0  1 −2  1  0 ⎥
    ⎣ 0  0  1 −2  1 ⎦

This matrix equation can be solved for z to obtain the smooth series.
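A minimal dense-matrix sketch of Equation (4) in Python (our own illustration; a production GEE implementation would differ and would use sparse matrices for long series) might look like:

```python
import numpy as np

def whittaker_smooth(y, lmbda=10.0, d=2):
    """Solve (I + lambda * D'D) z = y for the smooth series z.

    y: 1-D noisy series (evenly spaced); lmbda: smoothing degree
    (larger -> smoother); d: order of differences (2, as in the text).
    """
    y = np.asarray(y, dtype=float)
    m = len(y)
    I = np.eye(m)
    D = np.diff(I, n=d, axis=0)          # (m - d) x m matrix with rows 1 -2 1
    z = np.linalg.solve(I + lmbda * (D.T @ D), y)
    return z
```

As in the text, increasing the smoothing degree (here `lmbda`) flattens the output further; a constant series passes through unchanged because its second differences are zero.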

Fourier Smoothing
Fourier smoothing is another popular smoothing approach, based on the Fourier transformation of the data. It has been used in multiple scenarios [25,33,34,50,51]. First, the raw data are converted to the frequency domain by applying a discrete Fourier transform (DFT). The DFT is defined by Equation (5) [52]:

A_n = (1/N) Σ_{k=0}^{N−1} X_k e^{−i2πnk/N}  (5)

where A_n is the nth coefficient of the DFT, X_k is the kth value in the time series, N is the number of values in the time series, and i = √−1. It can also be written as Equation (6):

A_n = (1/N) Σ_{k=0}^{N−1} X_k W^{nk},  W = e^{−i2π/N}  (6)

However, since computations with the complex exponential are not yet straightforward in GEE, the transform is re-expressed using Euler's formula as Equation (7):

A_n = (1/N) Σ_{k=0}^{N−1} X_k [cos(2πnk/N) − i sin(2πnk/N)]  (7)

The real and imaginary parts were handled separately. From the results, the coefficients at indices greater than the specified smoothing degree were set to zero. After reducing the higher harmonics to zero, the resulting values were converted back to the time domain by applying the inverse DFT. In essence, the whole data set is first fitted with a set of cyclic functions, and the higher harmonics are then discarded to remove overfitting, leaving a smoothed data set. The Fourier smoother, along with the Whittaker smoother, was found to be among the best smoothers for classification in a study by Shao et al. [33].
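A compact sketch of this procedure using numpy's FFT (which handles internally the complex arithmetic that the GEE implementation must expand via Euler's formula) could look like:

```python
import numpy as np

def fourier_smooth(y, degree=3):
    """Smooth a series by discarding DFT harmonics above `degree`.

    y: 1-D time series; degree: highest harmonic kept (the smoothing
    degree). This is our own numpy illustration, not the paper's GEE code.
    """
    y = np.asarray(y, dtype=float)
    coeffs = np.fft.rfft(y)                 # forward DFT (real-valued input)
    coeffs[degree + 1:] = 0                 # zero out the higher harmonics
    return np.fft.irfft(coeffs, n=len(y))   # inverse DFT back to time domain
```

Because only high harmonics are removed, low-frequency cyclic structure in the series is preserved exactly, which matches the observation below that Fourier smoothing preserves the shape of the time series.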

Linear Fit Smoothing
Linear-fit smoothing is what we refer to as a simple derivative of moving-window averaging, in which a window, i.e., a subset of the whole data set, is used to compute smooth average values [53]. In this case, instead of taking the average over the moving window, the data within each window are used to estimate a linear fit; the fitted values are then averaged across overlapping windows to obtain the final smoothed results.
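A simple sketch of this window-wise linear-fit averaging (our own illustration of the description above; the window length is a free parameter) could be:

```python
import numpy as np

def linear_fit_smooth(y, window=5):
    """Moving-window linear-fit smoothing.

    For every window (contiguous subset of the series), fit a straight
    line; each index is then assigned the average of the fitted values
    from all windows that cover it.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    fitted_sum = np.zeros(n)
    counts = np.zeros(n)
    for start in range(n - window + 1):
        idx = np.arange(start, start + window)
        slope, intercept = np.polyfit(idx, y[idx], 1)   # least-squares line
        fitted_sum[idx] += slope * idx + intercept
        counts[idx] += 1
    return fitted_sum / counts
```

A perfectly linear series is returned unchanged, since every window fit recovers the same line; noisy deviations from local linear trends are averaged away.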
The nature and smoothing effect of these algorithms can be explored through Figure 3a,b, which show the smoothers applied at different points exhibiting different types of time series behaviour. From the plots it can be seen that Fourier smoothing tends to preserve the shape of the time series the most, while Whittaker smoothing gives the most flattened result. The smoothing degree of each algorithm was decided by trial and error: multiple values were tested and the one that gave the best result was selected. The parameters used for these smoothing algorithms are listed in Table 2.

Final Land Cover Map Generation and Accuracy Assessment
Primitives were assessed based on how well each could distinguish the feature it represented from the others. From the seven sets of primitives, the best primitive for each land cover type was picked based on this performance; these were used to create an eighth set of primitives. Land cover maps were prepared using an assemblage logic implemented as a decision tree classifier. For each set of primitives, the same assemblage logic was applied to produce the land cover maps (see Appendices B.3 and C). The decision tree was kept constant so that only smoothing affected the change in accuracy of the land cover maps. The accuracy of, and change in accuracy between, each land cover type was compared in order to understand the impact of the different algorithms on different land cover types.
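The threshold-ordered decision-tree assemblage described above (and in Appendix B.3) can be sketched as follows; the class names, thresholds and numeric codes here are hypothetical, not the study's actual configuration:

```python
import numpy as np

def assemble_land_cover(primitives, thresholds, class_codes):
    """Threshold-based decision-tree assemblage of primitive layers.

    primitives: ordered dict of class name -> 2-D probability array;
    the order defines the decision tree, with the first class on top.
    A pixel receives the first class whose probability exceeds its
    threshold, so classes higher in the tree take precedence.
    """
    shape = next(iter(primitives.values())).shape
    out = np.zeros(shape, dtype=np.uint8)       # 0 = unclassified
    assigned = np.zeros(shape, dtype=bool)
    for name, prob in primitives.items():
        hit = (prob >= thresholds[name]) & ~assigned
        out[hit] = class_codes[name]
        assigned |= hit
    return out
```

If a pixel exceeds the threshold for two primitives, the one placed higher in the ordering wins, mirroring the assembler behaviour described in the appendix.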

Results
The primitive assessment (Figure 4) shows the performance of the six different smoothing approaches on each of the eight land cover types (snow, forest, cropland, wetland, riverbed, settlement, barren and grassland). A primitive is said to have better separability when the two box plots generated from it, one representing the probability distribution of its corresponding land cover type and the other representing all other types, are narrower or lie further apart. Narrower plots represent a more consistent probability distribution, while a larger distance between the box plots means a clearer separation between the primitive's corresponding type and the others. This results in fewer omission and commission errors when a proper threshold is applied for classification. Therefore, better separation is used to identify better primitives. We found that for snow and forest there is a clear separation between the class itself and the other classes, followed by wetland, settlement and riverbed. For grassland and cropland there is more overlap between the probability distributions. Figure 4 also shows that Fourier smoothing results in the best separation for snow, wetland, riverbed, barren land and grassland. For the other land cover types, Whittaker performed better than Fourier. Linear-fit performed better than Fourier for forest, cropland and settlement, but not as well as Whittaker; for the other types it performed better than Whittaker, but it did not consistently outperform either Whittaker or Fourier. Furthermore, smoothing on the composites generally yields narrower distributions for the snow, wetland and riverbed classes than smoothing on the primitives, whereas for forest, cropland and settlement the distributions are narrower when smoothing is applied on the primitives rather than on the composites.
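The box-plot comparison could also be approximated numerically; the following score is our own assumption for illustration, not a metric from the study. It grows with the gap between the two distributions' medians and shrinks as their spreads (IQRs) widen:

```python
import numpy as np

def separability_score(p_class, p_others):
    """Illustrative numeric analogue of the visual box-plot comparison.

    p_class: probabilities for pixels of the primitive's own class;
    p_others: probabilities for pixels of all other classes.
    A larger median gap and narrower IQRs both raise the score.
    """
    gap = abs(np.median(p_class) - np.median(p_others))
    iqr = lambda x: np.percentile(x, 75) - np.percentile(x, 25)
    spread = iqr(p_class) + iqr(p_others)
    return gap / spread if spread > 0 else float("inf")
```

Well-separated, tight distributions (e.g. a snow primitive) score high; overlapping ones (e.g. grassland vs. cropland) score low.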
An example of the improvement gained by smoothing, assessed visually, is shown in Figure 5. Classification of forest without smoothing results in an inconsistent, sudden drop in forest cover for the year 2014 (Figure 5b,f). Such temporal dynamics of deforestation and regrowth would not be expected over so short a time frame [54]. After temporal smoothing, our results show a more consistent change in land cover types, which is more in line with the expected pathways.
The best performing smoothing algorithms were then used in the land cover assemblage. Figure 6 shows the accuracy of the different land cover classes before smoothing and after combining the best primitives post-smoothing. Snow and grassland show the least improvement, while settlement, cropland and forest show larger improvements. It can also be noted that smoothing yields larger improvements for categories with little temporal variability; classes that can be expected to have more temporal dynamics, such as snow, riverbeds and wetlands, show the least improvement.
The final accuracy assessment (Table 3) demonstrates how smoothing improved the results. The overall accuracy increased from 79.18% to 83.44%. It is important to note that this result means that using smoothed data sets, given that the smoothing parameters are optimized for the input data set, is better than using no smoothing at all. However, to ensure that smoothing does not introduce errors, the nature of the land cover type in question needs to be properly identified and the appropriate smoothing approach needs to be taken. Consequently, to optimize the accuracy of the final maps, our results showed that it is best to use different algorithms based on the land cover type and then combine the best primitives to prepare the final land cover map.

Comparison of class-wise accuracy before applying any smoothing and after selecting the best primitives from the various smoothing results. In the plot, accuracy extends from 0% (at the centre) to 100% (at the outer edges).

Discussion
Three different smoothing algorithms were applied to eight different land cover classes. We found that the improvements vary amongst categories. We argue that this difference in performance is caused by the nature of the classes. In the case of Nepal, barren land often interchanges with grassland, and wetland (mainly rivers) often interchanges with riverbed. Forest, cropland and settlement are more constant classes, which change either once or infrequently. It therefore appears that Whittaker smoothing works best on land cover types with clear differentiation between them, whereas Fourier can be used on land cover types that change more frequently. Fourier transformation showed good results for frequently changing land cover types because it breaks a data series down into a combination of cyclic equations; by doing this it preserves many of the highs and lows of the data while still removing noise. Fourier smoothing therefore works best when applied on the composite layer rather than on the primitives, as the random forest method can then use the noise-free composite.
In this study, we explored the use of smoothing on remote sensing composites and primitive layers. However, most studies (e.g., [20,23,24]) apply categorical classification on composites based on indices directly and do not have the intermediate primitive layers as used in this study. Moreover, the study was applied on a relatively small area with distinct climatological and topographic characteristics that determine the spatio-temporal land cover dynamics. More testing is needed to investigate if these findings also apply to other regions. It should also be noted that combining our smoothing algorithms with change detection algorithms such as LandTrendr [55] and Breaks For Additive Seasonal and Trend (BFAST) [56] might help improve land cover mapping accuracy as these algorithms are specifically designed to detect changes.
A main issue in generating land cover maps from annual composites is the number of valid observations (i.e., cloud-free pixels in a year). The intra-annual distribution of pixels varies from year to year and, as a consequence, the phenology is not necessarily fully represented in each annual composite. This might be captured in the ML model if multiple years are covered, but smoothing provides additional noise reduction. This study shows that a smoothing algorithm should be selected based on the dynamics of a class. For example, Whittaker smoothing shows better results for stable classes, while Fourier shows good results for dynamic classes when applied on composites. In general, it was found that, for stable classes, smoothing on the primitives gives better performance than smoothing on the composite layers. Primitives are confidence layers generated by the ML algorithm, so smoothing a set of primitives uses temporal information to improve the estimates generated by the ML model rather than the observations made by the sensors. Stable classes show better performance after smoothing because it flattens out the inter-annual dynamics in surface reflectance or class probability. For highly dynamic classes, temporal segmentation algorithms might be an alternative approach to capture rapid changes.
We noticed an interesting phenomenon where the changes in average probability were largely insignificant, but the lengths of the error bars were significantly reduced. This means that smoothing does not necessarily increase the average confidence of the overall classification but still improves the inter-class separability by reducing the deviation among pixels. These results align with Shao et al. [33], who applied the Fourier and Whittaker algorithms to an NDVI layer used for classification. Mingwei et al. [57] applied Fourier transformation to MODIS-derived vegetation indices and found good correlation, which seems to contradict our results. However, it should be noted that that study examined phenology, where values change cyclically, whereas we examine annual layers in which we do not expect much back-and-forth change.

Conclusions
This study tested the use of temporal smoothing as a means to improve land cover maps prepared with a machine learning algorithm. Here, the Fourier, linear-fit and Whittaker smoothing algorithms were tested on image composites and on primitives representing individual land cover types over time. Overall, we found that using temporal smoothing with machine learning improves the results compared to no smoothing. For classes that frequently undergo changes, Fourier smoothing produced the best results when applied on composite images; conversely, for classes that do not undergo regular changes, Whittaker gave the best results when applied on primitives. Land cover types that do not undergo frequent changes generally improved more with smoothing; in some cases, the accuracy decreased for classes that show frequent changes. The overall accuracy of the maps produced by our approach improved from 79.18% to 83.44% when we combined the best techniques for each land cover type. We therefore recommend that, when creating land cover maps from satellite data, researchers identify the dynamics of each land cover type at the target temporal resolution and then apply the appropriate algorithm. In cases where smoothing needs to be applied to the whole land cover rather than to primitives from just one class, we recommend smoothing each class separately and then using the smoothed layers to prepare the final land cover map.

Acknowledgments: We express our gratitude to David Molden, Director General, and Eklabya Sharma, Deputy Director General, of ICIMOD for the overall guidance. This study was partially supported by core funds of ICIMOD contributed by the governments of Afghanistan, Australia, Austria, Bangladesh, Bhutan, China, India, Myanmar, Nepal, Norway, Pakistan, Switzerland, and the United Kingdom. This paper has been prepared under the SERVIR-Hindu Kush Himalaya (SERVIR-HKH) initiative funded by NASA and USAID.
Our special gratitude goes to Ghulam Rasul, Regional Programme Manager, MENRIS and Birendra Bajracharya, Chief of Party, SERVIR-HKH for their encouragement and very active support in bringing out this work. We also thank the SERVIR RLCMS team for their assistance. Finally, we would like to thank those who reviewed the articles and provided valuable suggestions to improve the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest. The views and interpretations in this publication are those of the authors. They are not necessarily attributable to ICIMOD and do not imply the expression of any opinion by ICIMOD concerning the legal status of any country, territory, city, or area of its authority or concerning the delimitation of its frontiers or boundaries or the endorsement of any product.

In order to test the impact of smoothing in land cover classification, we first need to examine smoothing the surface reflectance itself as well as smoothing the intermediate results. In this step we smooth the primitives using different algorithms, namely Whittaker, Fourier and linear fit. The export generates an image with time-series data and RMSE on different bands. A different script is later required to split them into images.

Now that we have satisfactory primitives, we run them through the assembler to obtain a land cover map. The assembler prepares a decision tree based on user-specified thresholds, which can be tuned based on visual assessment as well as the primitive assessment plots. The order of primitives in the list denotes the order in which primitives are placed in the decision tree, with the first primitive placed on top and so forth. This means that if a pixel has a high probability in two primitives (according to the specified thresholds), the final class will be based on the primitive that is higher up in the decision tree.