1. Introduction
In remote sensing, the 21st century is often called “the era of big data”: a large amount of data, much of it freely available, is accessible. Processing such volumes, especially for time-series analysis, calls for tools that handle them quickly and efficiently. Traditional desktop solutions, where data must be downloaded and preprocessed before any analysis can be performed, can run into time and performance limitations. As a consequence, cloud platforms have been developed. They store not only images and data archives but also provide the computing power needed for data processing [1,2].
In forest monitoring, multispectral optical satellite data have proven to be a very effective data source. Optical data nevertheless have shortcomings, above all their sensitivity to cloud cover. Electromagnetic waves in the microwave spectrum can penetrate clouds, fog and light rain and do not depend on sunlight, so sensors operating at these wavelengths can observe the Earth’s surface both day and night. They can also complement optical data in monitoring landscape changes (e.g., in [3]). Since the launch of the first Sentinel-1 satellite in 2014, which provides freely available C-band synthetic aperture radar (SAR) data, interest in SAR data has grown and new methods have been developed. Systematic sensing of the Earth’s surface with constant geometric characteristics of the sensors has brought many advantages over previous radar missions.
The interaction between the SAR signal and the surface depends mainly on the characteristics of the studied object, e.g., its shape, roughness and dielectric properties (mainly moisture) [4], as well as on the sensor characteristics, i.e., wavelength, incidence angle and polarization [5,6,7,8]. The characteristics of the studied object differ for every land cover class and can be influenced by environmental factors such as moisture content [9] or changes in temperature [10,11]. The radar incidence angle increases from near range to far range, so the same land cover class yields different backscatter values at different incidence angles. A correction based on the radar incidence angle alone is valid only in ideal situations where the effect of terrain can be neglected, e.g., over relatively flat areas (e.g., in [12]) or over the sea (e.g., in [13]). In areas with rugged terrain, on the other hand, the incidence angle of the electromagnetic wave is affected by the slope and aspect of the terrain. Thus, the effect of the terrain needs to be considered and eliminated, or at least minimized, for appropriate evaluation and interpretation of the obtained backscatter. This factor is most apparent in the intensity of the received signal when images from different orbits and paths of the satellite are combined (ascending vs. descending, or adjacent paths covering a given area). Analyses in mountainous areas that do not consider the slope and orientation of the terrain, i.e., that use only flat-Earth approximations, lead to errors [14]. This is especially important for forests, as most of the forests of Central Europe are located in mountainous areas, particularly in national parks and protected areas. In time-series analysis, this effect appears as a persistent fluctuation of the backscatter values acquired from successive paths. To ensure the highest possible temporal resolution of satellite data in time-series analysis, it is necessary to use all available acquisition orbits and paths over an observed area. Therefore, to obtain accurate time-series information that represents the real status or change of an area, it is necessary to eliminate the influence of the terrain.
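The dependence of the incidence angle on terrain slope and aspect can be illustrated with a short geometric sketch. It is not taken from the study; it only evaluates the angle between the surface normal and the direction to the sensor, and the function name, argument names and angle conventions (listed in the docstring) are assumptions made for this illustration.

```python
import math


def local_incidence_angle(incidence_deg, slope_deg, aspect_deg, sensor_azimuth_deg):
    """Angle in degrees between the terrain normal and the direction to the sensor.

    Assumed conventions (illustrative, not from the paper):
    - incidence_deg: incidence angle over flat ground, measured from the vertical
    - slope_deg: terrain slope, 0 means flat
    - aspect_deg: downslope direction, clockwise from north
    - sensor_azimuth_deg: direction from the ground point towards the sensor,
      clockwise from north
    """
    theta = math.radians(incidence_deg)
    alpha = math.radians(slope_deg)
    d_phi = math.radians(aspect_deg - sensor_azimuth_deg)
    # Dot product of the unit surface normal and the unit vector to the sensor.
    cos_lia = (math.cos(alpha) * math.cos(theta)
               + math.sin(alpha) * math.sin(theta) * math.cos(d_phi))
    # Guard against rounding slightly outside [-1, 1].
    cos_lia = max(-1.0, min(1.0, cos_lia))
    return math.degrees(math.acos(cos_lia))


# A 10 degree slope seen at 40 degrees incidence, with the sensor to the east (90).
facing_sensor = local_incidence_angle(40.0, 10.0, 90.0, 90.0)   # slope tilted towards the sensor
facing_away = local_incidence_angle(40.0, 10.0, 270.0, 90.0)    # slope tilted away from the sensor
flat_ground = local_incidence_angle(40.0, 0.0, 0.0, 90.0)
```

Under these conventions, a slope tilted towards the sensor lowers the local incidence angle and a slope tilted away raises it, which is why the same forest can show different backscatter from ascending and descending paths.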
There are several techniques to correct topographic effects. According to the relevant literature, the first approaches were based on the cosine square correction [13,15,16,17,18]. Currently, the most widely used and best-known algorithm for correcting the topographic effect is the Radiometric Terrain Correction developed by Small [19], which is available in desktop software such as the SNAP Sentinel-1 Toolbox. With this method, not only the geometry but also the radiometry of the scene is corrected for terrain influences [7]. The method uses accurate knowledge of the acquisition geometry of the image and a digital elevation model (DEM) to estimate the local illuminated area of each image pixel [19]. The local illuminated area is then used to normalize the backscatter value.
Some recent studies have removed the dependency of backscatter on the radar incidence angle or on the local incidence angle (LIA) with so-called regression-based normalizations, in which all image pixels are corrected to the same reference incidence angle (e.g., in [20,21,22,23]). The prerequisite for a successful regression-based normalization is to find the relationship between backscatter and incidence angle, i.e., to obtain the slope coefficient of the regression line. Regression-based normalization is used mainly in the so-called pixel-based method, where the slope coefficient is calculated for each image pixel separately from several satellite images acquired in a specified time range (e.g., in [20,21,23,24,25]). As the pixel-based method can be computationally intensive for large areas, Widhalm et al. [26] developed the so-called slope approach, which aims to normalize the backscatter–LIA dependence using only single images. In that approach, the slope coefficients of the linear regression were determined for each land cover type separately and then used in the regression-based normalization; the resulting coefficients are valid only for specific high-latitude areas. However, the authors found that their approach is not suitable for mountainous areas, as the linear relationship between backscatter and LIA was not valid there. A nonlinear relationship over sloped terrain was also found in [27] and stated in [12].
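The idea of a regression-based normalization can be sketched in a few lines. This is a generic illustration, not the implementation of any cited study: it assumes a linear model of backscatter (in dB) against incidence angle, fits the slope by ordinary least squares, and shifts every value to a common reference angle. The function names, the sample values and the 40° reference angle are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def normalize_to_reference(sigma0_db, angle_deg, slope_b, ref_angle_deg):
    """Remove the fitted angular trend so all values refer to ref_angle_deg."""
    return [s - slope_b * (t - ref_angle_deg)
            for s, t in zip(sigma0_db, angle_deg)]


# Hypothetical samples: backscatter falls by 0.1 dB per degree of incidence angle.
angles = [20.0, 30.0, 40.0, 50.0, 60.0]
sigma0 = [-6.0, -7.0, -8.0, -9.0, -10.0]

intercept_a, slope_b = fit_line(angles, sigma0)
normalized = normalize_to_reference(sigma0, angles, slope_b, ref_angle_deg=40.0)
```

With perfectly linear input, every normalized value collapses to the backscatter expected at the reference angle (here -8 dB). With real data the residual scatter remains, and the slope is fitted with its sign, so a negative relationship yields a negative `slope_b`.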
In exploring the backscatter–LIA dependence, it is important to note that different land cover types have different backscatter–LIA relationships, as found in previous studies [16,26,28,29]. Hinse et al. [16] found a flat angular behavior over deciduous and coniferous forests (0.01 and 0.02 dB/degree, respectively) and a steeper one over agricultural lands and cornfields (0.15 and 0.11 dB/degree, respectively). In [28,29], the backscatter–LIA dependence was compared for tropical forest, tree savannah, Sahelian area and desert classes; tropical forest areas had a flatter angular behavior (0.06 dB/degree) than the other land cover types. Widhalm et al. [26] calculated the regression slope coefficients of the backscatter–LIA dependence for 56 sample classes; the values ranged from 0.07 to 0.3 dB/degree, with forests ranging from 0.07 to 0.17 dB/degree.
Cloud computing is a modern approach to processing remote sensing data. Google Earth Engine (GEE) has become the most popular cloud-based platform, with a rapid yearly increase in usage for a wide range of applications [1]. GEE (https://earthengine.google.com, last accessed 26 April 2021) contains “multi-petabyte” analysis-ready data combined with computing power located in the cloud. It can be accessed through an Internet-accessible application programming interface (API) and a web-based interactive development environment (IDE), so users can access data and perform analyses through any web browser. It is freely accessible for research, education and nonprofit use [2]. For radar satellite data in GEE, however, there are only limited options for correcting topographic effects. Vollrath et al. [30] published a method to eliminate the effects of terrain on radar satellite data in the GEE environment. That methodology is based on two physical reference models used to correct the backscattering behavior. The first model was developed for vegetation monitoring, where volume scattering is the dominant scattering type, and the second for areas dominated by surface scattering. However, this method does not reflect the different backscatter–LIA relationships of specific land cover classes.
The presented study introduces a land cover-specific correction of the local incidence angle using linear regression analysis in GEE, called the land cover-specific local incidence angle correction (LC-SLIAC). The method is designed to correct one specific land cover class; in this study, it corrects backscatter over forests. The rationale of the LC-SLIAC method is based on previous studies that demonstrated different backscatter–LIA relationships for different land cover classes (e.g., [16,26,28,29]). With this algorithm, the terrain effects are eliminated both in each individual scene and in the time-series curve. The method’s suitability was tested on time series of coniferous and deciduous forests in selected areas of Central Europe. The main difference between LC-SLIAC and most of the above-mentioned regression-based normalization approaches is that a single image is used to calculate the dependence of the backscatter of a selected forest type on LIA. This is possible thanks to available land cover databases that contain information on the spatial distribution of a given land cover type. Therefore, the backscatter–LIA dependency can be calculated separately for each SAR scene. A further methodological aspect is the use of site- and path-specific reference incidence angles for time-series analysis.
In this context, the main aim of the study is to introduce the LC-SLIAC method and to test its suitability for eliminating the effect of the terrain. The accuracy of the proposed method is tested by statistical evaluation and comparison of the backscatter of forest areas before and after the correction for forests with different characteristics: different elevation, terrain slope and orientation, and different characteristics of the LIA. Moreover, the effectiveness of the method is tested within a three-month time-series analysis of coniferous and deciduous forests.
4. Discussion
In this study, the land cover-specific LC-SLIAC method, which aims to eliminate the effects of terrain on backscatter values over forested areas, was developed and tested. The method can correct the terrain effects of individual images and therefore allows images with different imaging geometries, i.e., from all available acquisition orbits and paths over a selected area, to be combined in a time-series analysis. The traditional way of obtaining comparable images is to use images with the same imaging geometry, i.e., from exactly the same path, orbit and satellite (e.g., in [45,46,47]). A drawback of using only data from the same orbital plane is the low temporal resolution. Considering all available images over a given area can dramatically increase the temporal resolution, especially for the Sentinel-1 mission, which comprises a constellation of two satellites sharing the same orbital plane. For the SAs in Central Europe, a maximum of four different paths were available per SA, which resulted in around 60 images in three months instead of 15. An essential decision before processing a large amount of SAR data for time-series analysis is the choice of suitable software and hardware. The cloud-based GEE platform, which hosts both the data and the processing power, was used and tested in this study. Sentinel-1 C-band data were selected because of their effectiveness in analyzing problems in forest ecosystems (e.g., in [7,10,46,48]). The spatial resolution of Sentinel-1 data has been shown to be suitable for forest monitoring. Reiche et al. [48] found that Sentinel-1 data make it possible to monitor disturbed areas larger than 0.2 ha. Moreover, in the latest update of their algorithm, the minimum mapping unit for disturbance monitoring was set to 0.1 ha [49].
The relevance of the proposed LC-SLIAC method using C-band SAR satellite data was demonstrated in 16 study areas from different parts of Central Europe. The main task of the methodology was to describe the relationship between LIA and backscatter values by calculating regression line coefficients. Regression coefficients were also used in previous studies in the traditional pixel-based method (e.g., in [20,23,24]), where they were calculated for each image pixel separately from the backscatter and incidence angle values obtained from all available images in the selected time range. In that case, each pixel has its own unique regression coefficients. One disadvantage of the pixel-based regression method is that the number of different incidence angle values included in the regression analysis depends on the number of different satellite tracks available over the studied area. In Central Europe, only 2–4 Sentinel-1 tracks are available over a given area. In high-latitude areas, where Sentinel-1 Extra-Wide (EW) swath data are available, there can be on average 4.9 to 7.7 tracks per pixel (e.g., in [26]). Some satellites offer more acquisition paths over an image pixel, e.g., Envisat with a maximum of 15 different paths over an area in Germany (in [23]). However, according to common rules of thumb, the reasonable minimum sample size for regression analysis with one predictor is ~50 (e.g., [50] or [51]). By using more measurements from the same path (i.e., from several acquisition times), the amount of data for each pixel can be increased (as done, e.g., in [23,24,25]), which results in several backscatter values for each individual incidence angle value. In this case, the backscatter values for a given incidence angle differ mainly because of the growing season or changing climatic conditions during the year. In contrast, the regression line coefficients in the proposed LC-SLIAC method were calculated separately for each image from selected forest areas located in the SA. To obtain an appropriate number of forest points for the regression analysis, these forest areas were selected from 1000 randomly generated points over the SA using the forest mask derived from a combination of CLC and GFC. On average, 395 forest points were included in the regression analysis (with a minimum of 138 and a maximum of 668). Each image therefore has its own unique regression coefficients, so seasonality in the data does not influence the coefficients. The two land cover databases were combined to reduce the possibility of biasing the linear relationship by outliers that represent non-forest areas due to errors in the land cover databases. To further reduce the possibility of selecting non-forest areas or forest pixels with a low density of trees, only pixels with at least 50% tree canopy cover (from the GFC) were selected. The remaining outliers were excluded using Tukey’s fences. The accuracy assessment of the forest masks showed an overall accuracy of more than 90% for this approach.
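The outlier screening with Tukey’s fences can be sketched as follows. The multiplier and the variable the fences were applied to are not stated in this section, so the conventional k = 1.5 and a filter on backscatter values are assumptions here; the function names and the sample values are hypothetical.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k * IQR and Q3 + k * IQR.

    k = 1.5 is the conventional multiplier and an assumption of this sketch.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that lie within Tukey's fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if lower <= v <= upper]


# Hypothetical backscatter samples (dB) at forest points; -2.0 stands for a
# point wrongly labeled as forest in the land cover mask.
sigma0_samples = [-12.0, -11.0, -10.5, -10.0, -9.5, -9.0, -8.5, -2.0]
kept = drop_outliers(sigma0_samples)
```

In this example only the -2.0 dB sample falls outside the fences; the remaining seven values are kept for the regression.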
The regression analyses performed on selected forest areas showed a negative linear relationship between backscatter and LIA for each SA. The negative linear behavior supports the existence of terrain effects on backscatter values and therefore reinforces the importance of their elimination. SAs with different terrain characteristics (i.e., mean elevation) and LIA characteristics (LIA range and LIA IQR) were used to compare the effect of LIA on backscatter for different types of SAs. Generally, a stronger correlation, a higher slope coefficient b and a higher R² were found over more tilted SAs with a high LIA range and LIA IQR. The mean elevation of the selected forest areas, on the other hand, had a lower influence on the results. In SAs with relatively flat terrain (mean terrain slope <4°), with LIA range <22° and LIA IQR <4°, a low R² was observed (2–6%), and the regression was not statistically significant at the 1% significance level. Excluding these SAs, LIA explained on average 55% and 58% of the variation in backscatter for VV and VH polarization, respectively. Among the remaining SAs, the lowest R² of 15% (VH) was found in the coniferous forests of Žďárské vrchy PLA, which has the lowest LIA range (29°), and the maximum of 74% (VV) was found in the coniferous forests of the Low Tatras NP (SA8), which has the highest LIA range (66°). The achieved R² values are similar to the results in [52] (65% of the variation in backscatter was explained by terrain topography) and higher than the results in [53] (5–50%). In addition, these findings contrast with studies that found a nonlinear relationship between backscatter and LIA over tilted terrain [12,27].
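For readers less familiar with the statistics quoted above, the following sketch shows how a regression slope and the coefficient of determination R² are obtained for a simple linear regression of backscatter on LIA. It is a generic illustration with hypothetical sample values, not the study’s code.

```python
def slope_and_r_squared(lia_deg, sigma0_db):
    """Slope (dB/degree) and R^2 of an ordinary least squares line."""
    n = len(lia_deg)
    mean_x = sum(lia_deg) / n
    mean_y = sum(sigma0_db) / n
    sxx = sum((x - mean_x) ** 2 for x in lia_deg)
    syy = sum((y - mean_y) ** 2 for y in sigma0_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lia_deg, sigma0_db))
    slope = sxy / sxx
    # For one predictor, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Hypothetical forest samples from a single scene.
lia = [25.0, 35.0, 45.0, 55.0]
sigma0 = [-7.0, -8.0, -8.0, -9.0]
slope_b, r2 = slope_and_r_squared(lia, sigma0)
```

Here the slope is -0.06 dB/degree and R² is 0.9, i.e., LIA would explain 90% of the backscatter variation in this toy sample. A wider LIA range gives the regression more leverage, which is consistent with the higher R² reported for the more tilted SAs.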
The slope coefficients b of the linear regression in this study were steep and the correlation was relatively strong in all SAs except the low LIA SAs. Earlier studies based on C-band data, e.g., [28] or [29], detected a flat angular behavior over forests (regression line slope b = 0.06 dB/degree); however, these studies used ERS-1 wind scatterometer data with a spatial resolution of 50 km, for which it was almost impossible to avoid the mixed-pixel problem in the regression analysis. In the presented study, the regression line slope b was higher than 0.2 dB/degree over the High and Low Tatra NPs (SA6 and SA8), which have the highest LIA range and LIA IQR, while the lowest slope coefficient b was obtained in the low LIA SAs. Comparing the resulting statistics between coniferous and deciduous SAs after excluding the low LIA SAs, the mean slope coefficient b and the coefficient of determination R² of coniferous forests were higher than those of deciduous forests by 0.02 dB/degree and ~5%, respectively. The influence of the mean LIA range or LIA IQR was minimal in this case, as the two groups differed in mean LIA range and LIA IQR by only ~1°. These results correspond to the findings in [16], where higher values were also found for conifers. They also support the assumption behind the land cover-specific correction, namely that the influence of LIA on backscatter depends on the land cover class. Discrimination between coniferous and deciduous forests was also found in previous studies (e.g., [21,52,54]). Backscatter over forested areas was higher in VV than in VH polarization, which was also observed in earlier studies (e.g., [20,21,55]). It can be explained by the higher attenuation of VV polarization by vegetation cover compared to VH, as was also found in [8].
Figure 6 and Figure 7 show that the LC-SLIAC method eliminates the effect of LIA on backscatter values. Statistical comparisons of the uncorrected and corrected forest areas involved in the regression analysis showed a reduction in backscatter variance after correction in all SAs. In low LIA SAs with gentle terrain slopes, the decrease in variance was very low and not statistically significant. A high (>40%, with a maximum of 74%) and statistically significant decrease in variance was found in all other areas, which are characterized by moderate to steep terrain slopes. After the exclusion of the low LIA SAs, the remaining SAs showed a mean reduction in variance of 56% and 60% for VH and VV polarization, respectively. Such a reduction can be considered suitable for further analyses and is higher than in earlier studies where the terrain was considered moderate and the cosine square correction or other methods were used: a maximum variance decrease of 10% after the correction was found in [16], a decrease of 5–13% in [15] and a decrease of approximately 20% in [40]. However, the variance after the correction remained high (around 3.5 dB) in the SAs with the most tilted terrain, while it was around 2 dB for the other SAs. The high variance of backscatter in the forest areas after correction can be caused by the generally high heterogeneity of forest vegetation (different types of trees, growth stages or tree densities involved in the regression analysis), which is especially pronounced at higher altitudes, as was also mentioned in [56].
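The variance reduction reported above can be expressed as a simple percentage. The sketch below assumes the sample variance and a plain relative difference; the statistical test used to judge significance is not reproduced here, and the function name and sample values are hypothetical.

```python
import statistics


def variance_reduction_percent(before_db, after_db):
    """Relative decrease in sample variance after correction, in percent."""
    var_before = statistics.variance(before_db)
    var_after = statistics.variance(after_db)
    return 100.0 * (var_before - var_after) / var_before


# Hypothetical backscatter values (dB) of the same forest points
# before and after an LIA correction.
uncorrected = [-6.0, -7.0, -8.0, -9.0, -10.0]
corrected = [-7.5, -8.0, -8.0, -8.0, -8.5]
reduction = variance_reduction_percent(uncorrected, corrected)
```

In this toy case the variance drops from 2.5 to 0.125, a reduction of 95%. Real forest samples retain more scatter after correction because of the vegetation heterogeneity discussed above.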
The time-series analysis aimed to achieve better results by using the mean LIA obtained from different paths; the investigation was done only for the selected case study with its near surroundings (using a 20 m buffer). The validation of the proposed correction method in a three-month time-series analysis using all available Sentinel-1 tracks showed that the LIA correction is most effective, and statistically significant, in CSs with an LIA range ≥27°. For CSs with an LIA range ≤10°, the correction was barely apparent and the change in variance was not statistically significant, whereas for CS9, with an LIA range of 27°, the change was already statistically significant. Because only six CSs were tested in this study, more CSs need to be tested in the future to identify the threshold LIA range at which the change becomes statistically significant using LC-SLIAC. These results are similar to those in [16], where semi-empirical cosine-based methods produced no considerable correction for pixels with LIA <26° and terrain slope <6°. Similarly, the different correction method used in [12] had a minimal effect over areas with an incidence angle range <10°. However, some fluctuations remained in the time series after the LC-SLIAC correction, which can be attributed to random short-term variations caused by environmental factors, such as increased moisture, or to the different reflectivity of the forest at different acquisition times (approximately 5 a.m. versus 5 p.m.). This different reflectivity can be caused, for instance, by a change in temperature or moisture between two measurements, a different nature of the leaves, etc. Regarding the influence of precipitation, Frison et al. [55] did not find any relationship between precipitation and backscatter values over forested areas using Sentinel-1 data. They explained this behavior by the difference between the measured precipitation and the precipitation that can be retained by the leaves or needles of a tree. Tanase et al. [57] found that the C-band was less sensitive to vegetation water content (with variations of about 1 dB) than the P- and L-bands.
Although an increasing number of studies use SAR data for time-series analyses of forests (e.g., in [21,55,58,59]), there is still a need for an interface and a tool that is not computationally intensive and in which multi-annual SAR time series can be created. As GEE is an open-access interface for non-commercial use, the LC-SLIAC method was implemented in GEE and is freely available as a function that can be loaded with the following call:
require('users/danielp/LC-SLIAC:LC-SLIAC').
A limitation of this study is that the methodology is designed for application over forested areas. For a successful correction, an appropriate number of forest areas must be included in the regression analysis of backscatter and LIA; the correction should therefore be applied over areas with a higher share of forests. In follow-up studies, it would be appropriate to test and statistically evaluate the results in areas with different land cover types. Another limitation is that the methodology was tested only in countries of Central Europe, for which the CLC dataset is available. Within Europe, it would be worth implementing the Copernicus High-Resolution Layers for forests [60] or testing the method in countries outside the EU. Other global (e.g., the 100 m Copernicus Global Land Cover layers [61]), regional or national land cover databases can also be used. Instead of using land cover databases, it would also be useful to perform a supervised classification of satellite data to determine the extent of forested areas in the studied area. In this study, a freely available DEM was used, i.e., the SRTM DEM with a 30 m resolution. It will therefore be important to test and validate the LC-SLIAC method using a more accurate and higher-resolution DEM. Moreover, the simplified shadow and layover masking used in this study cannot account for passive shadow and layover regions, which are important in eliminating non-valid pixels over tilted terrain.
There are still unanswered questions connected with this study. Future research should explain the reason for the short-term fluctuations of backscatter values between subsequent dates; apply the methodology to long-term time series to detect seasonality or changes in forests and to understand the character of the seasonal activity; and assess in detail the relationship between radar time-series behavior and the characteristics of the studied area (terrain slope, aspect, elevation, LIA range or characteristics of the vegetation). For long-term time-series analysis, it should be helpful to implement radar indicators (e.g., polarimetric radar indices) to monitor the condition of forests and compare them with the most widely used vegetation indices (NDVI, NDMI) derived from optical data, or to test the impact of precipitation on the evaluation of long-term time series and propose methods for its correction for C-band SAR data. In addition, the planned radar missions Biomass with P-band (2021), NISAR with L- and S-band (2022) and TanDEM-L with L-band (2022) can bring new opportunities for forest exploration and thus new challenges in data processing and analysis.