Comparison between Classic Methods and Deep Learning Approach in Detecting Changes of Waterbodies from Sentinel-1 Images

Abstract: Climate change has directly impacted Earth’s habitats, resulting in various adverse effects, such as the desiccation of water bodies. Identifying such changes through field observations is time-consuming and costly. Remote sensing techniques have made it much easier to monitor changes in the environment. Radar satellites, unlike optical sensors, can acquire data in all weather conditions, regardless of the time of day. These data can provide valuable information about the environment and surface roughness. Various methods have been proposed for detecting changes, which can be divided into classic and deep learning methods. Classic methods use only per-pixel image information, such as radar backscatter, and cannot extract spatial information. Sentinel-1 (S1) is an Earth observation radar mission that provides free access to SAR (Synthetic Aperture Radar) images. This study compares the performance of two classic methods, a ratio index (RI) and a Markov random field (MRF), with a deep learning network in detecting changes. As the deep network, an Inception CNN (convolutional neural network) is presented as an enhancement of the original CNN. To evaluate the methods, two S1 images of Lake Poopó, located in the Altiplano Mountains in Oruro Department, Bolivia, are used as the primary dataset. The results were assessed using three evaluation metrics: Overall Accuracy (O.A), Missed Error (M.E), and Kappa Coefficient (K). The Inception CNN performed best in all metrics, with O.A, K, and M.E values of 97.35%, 90.28%, and 9%, respectively. The ratio index performed poorly, with O.A, K, and M.E values of 83.27%, 29.05%, and 75.03%, respectively. These results indicate that the Inception CNN provides better performance in detecting changes from S1 images.


Introduction
Climate change has significantly influenced human and animal habitats. One notable example is the shrinkage of water zones. Identifying changes in water zones is crucial for making informed decisions in environmental protection and management [1]. Identifying such changes through field observations is time-consuming and expensive. Remote sensing techniques have made this monitoring considerably easier. Remote sensing images provide rich information about the Earth's surface [2]. Unlike optical satellites, radar satellites can acquire data in all weather conditions, day and night. These data are sensitive to surface roughness and can provide comprehensive information about the environment. Water zones exhibit minimal surface roughness, particularly in the absence of strong winds, so they appear as dark areas in radar images [3]. In remote sensing, detection algorithms can be classified into two groups: classical methods and deep learning models.
Classical methods rely on backscattered information, often leading to unsatisfactory results with low accuracy. Liang et al. [4] presented a new local hierarchical regional thresholding method for delineating water using SAR images. Zhang et al. [5] introduced a novel approach to assessing flood extent using multi-temporal Sentinel-1 data. An automatic thresholding procedure generates an initial land and water classification. Then, a fuzzy logic-based method refines the initial classification. Experiments demonstrate that using different polarizations as image bands cannot provide better results. To tackle this issue, incorporating contextual information enhances the accuracy and reliability of the classification outcomes [6]. Wang et al. [7] combined the threshold segmentation method with Markov random fields (MRF) and integrated simulated annealing (SA) into the process of image noise reduction. As a result, the water extraction method demonstrates high classification accuracy. In another study, Song et al. [8] introduced a method for selecting features from SAR images, which relied on the correlation of sparse coefficients. The aim was to enhance the precision of change detection (CD). However, these conventional methods still need to be improved in terms of extracting spatial information properly.
Deep learning models have the advantage of effectively extracting spectral information without being constrained by the limitations of classical approaches. Aghdami-Nia et al. [9] developed an automatic coastline extraction framework by modifying the standard U-Net model to enhance sea-land segmentation. In another study, Lin et al. [10] proposed a novel approach utilizing a Fully Convolutional Neural Network to detect water in Sentinel-1 SAR images accurately. The overall detection performance is enhanced by incorporating the spatial information of neighboring pixels and analyzing the corresponding pixel intensities.
This study investigates the performance of classical methods and deep networks in CD using Sentinel-1 images to determine which approach yields superior results. The Ratio Index (RI) is employed as a fundamental classical method, while the MRF is utilized as an enhanced version of this method. In addition, an improved form of CNN called Inception CNN is introduced as a deep network to detect waterbody changes effectively. This network can account for the different scales of image objects.
The remainder of this paper is structured as follows: Section 2 introduces the research methodology. Section 3 presents the experimental results. Finally, Section 4 summarizes the conclusions.

Methodology
In this section, we present the three mentioned CD methods. An overview of the workflow is shown in Figure 1, which illustrates the stepwise process of CD. In general, the research method has four steps. Initially, the images undergo preprocessing, including geocoding, radiometric calibration, and filtering using the Lee Sigma filter. Afterward, the preprocessed images are subjected to the three methods to produce the desired difference image. The final step is to evaluate the change maps created. In the following sections, these methods are introduced in detail.

Ratio Index
If $I_{t_1}$ and $I_{t_2}$ denote the SAR intensity images acquired at times $t_1$ and $t_2$, the RI, which takes the form of a log-ratio index, is defined as follows:

$$\mathrm{RI} = \log\left(\frac{I_{t_1} + \mathrm{eps}}{I_{t_2} + \mathrm{eps}}\right)$$

where eps is a small constant (often called "epsilon") that avoids computational issues arising from division by zero. It makes the index more robust, especially when the values of $I_{t_1}$ and $I_{t_2}$ tend towards zero. This study sets eps to 5, and the Otsu thresholding technique [11] is applied to the RI image to generate the change map.
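The RI and thresholding steps can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the Otsu threshold is a plain histogram-based version, and thresholding the absolute value of the RI (so that changes in either direction are flagged) is an assumption, since the text does not state which side of the threshold is labeled as change.

```python
import numpy as np

def ratio_index(i_t1, i_t2, eps=5.0):
    """Log-ratio change index: log((I_t1 + eps) / (I_t2 + eps))."""
    i_t1 = np.asarray(i_t1, dtype=np.float64)
    i_t2 = np.asarray(i_t2, dtype=np.float64)
    return np.log((i_t1 + eps) / (i_t2 + eps))

def otsu_threshold(values, bins=256):
    """Return the bin center that maximizes the between-class variance."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist.astype(np.float64)
    w0 = np.cumsum(weights)            # pixel count at or below each bin
    w1 = w0[-1] - w0                   # pixel count above each bin
    s0 = np.cumsum(weights * centers)  # cumulative intensity sum
    m0 = s0 / np.maximum(w0, 1e-12)    # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1e-12)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between)]

def change_map(i_t1, i_t2, eps=5.0):
    """Binary change map (assumption: threshold the magnitude |RI|)."""
    magnitude = np.abs(ratio_index(i_t1, i_t2, eps))
    return magnitude > otsu_threshold(magnitude)
```

Calling `change_map(img_2018, img_2020)` on two co-registered intensity arrays (hypothetical variable names) returns a boolean array in which `True` marks pixels flagged as changed.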

MRF
The MRF algorithm is an influential image-processing technique employed to model and analyze intricate structures within images. Using probability theory, the MRF estimates the likelihood of a particular state occurring at each pixel. Consider a change index image represented as a collection of N pixel vectors $X = \{x_1, x_2, \ldots, x_N\}$. The labels of the difference image are denoted by $L = \{l_1, l_2\}$. The maximum a posteriori (MAP) estimation determines the pixels' labels. For a given pixel $x$, the formulation can be described as follows [12]:

$$\hat{c} = \arg\max_{c \in L} P(c \mid x) = \arg\max_{c \in L} P(x \mid c)\, P(c)$$

where $P(x \mid c)$ represents the conditional probability distribution under the Gaussian distribution model, and $P(c)$ denotes the prior probability distribution of the label layer. Based on the Bayesian inference principle, the maximum of the posterior probability is obtained by minimizing the total energy function. Further details can be found in reference [12].
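To make the link between MAP estimation and energy minimization concrete, the following sketch minimizes a two-class energy with iterated conditional modes (ICM). Several choices here are assumptions that the text does not specify: the Potts prior with weight `beta`, the 4-pixel neighborhood, the synchronous ICM update, and the function name `mrf_icm`. Reference [12] may use a different optimizer and neighborhood.

```python
import numpy as np

def mrf_icm(x, init_labels, beta=1.0, n_iter=5):
    """Refine a binary label map by minimizing a Gaussian + Potts energy."""
    x = np.asarray(x, dtype=np.float64)
    labels = np.asarray(init_labels).astype(np.int64)
    for _ in range(n_iter):
        # Edge padding: border pixels reuse their own label for missing neighbors.
        padded = np.pad(labels, 1, mode="edge")
        neighbors = [padded[:-2, 1:-1], padded[2:, 1:-1],
                     padded[1:-1, :-2], padded[1:-1, 2:]]
        energies = []
        for c in (0, 1):
            vals = x[labels == c]
            if vals.size == 0:          # fall back to global statistics
                vals = x.ravel()
            mu, sigma = vals.mean(), vals.std() + 1e-6
            # Data term: negative log of the Gaussian likelihood P(x | c).
            data = 0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma)
            # Prior term: Potts penalty for each neighbor whose label is not c.
            prior = beta * sum((n != c).astype(np.float64) for n in neighbors)
            energies.append(data + prior)
        new_labels = np.argmin(np.stack(energies), axis=0)
        if np.array_equal(new_labels, labels):
            break                        # converged
        labels = new_labels
    return labels
```

The number of iterations and the neighborhood definition are exactly the kind of settings that have to be tuned by hand in this family of methods.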

Inception CNN
Deep learning models such as CNNs are applied to image recognition, classification, and CD. These networks enable accurate predictions or classifications by automatically learning and extracting relevant features from input images. The distinguishing characteristic of CNNs is their use of convolution operations. Convolution involves sliding a small kernel over the input image to extract spatial information. With deeper layers, CNNs can generate increasingly complex features. The operations carried out in a convolutional layer can be described as follows:

$$z_l^k = \sum_{n=1}^{m_{l-1}} w_l^{k,n} * x_{l-1}^n + b_l^k$$

where $z_l^k$ denotes the kth output feature map of layer $l$, $m_l$ represents the number of convolutional filters in layer $l$ of the network, and $x_{l-1}^n$ corresponds to the nth input feature map of layer $l$. $b_l^k$ represents the bias vector, and $w_l^{k,n}$ is the filter connecting the nth feature map in the previous layer $(l-1)$ to the kth feature map in layer $l$. The symbol $*$ denotes the convolution operator [13].
Using a fixed kernel size in the initial layers of CNNs can lead to disregarding the varying scale of objects in an image. To address this, the Inception module has been applied in this study. The Inception module aims to capture features at multiple spatial scales using parallel convolutional operations of different filter sizes within the same layer. This allows the model to learn and combine diverse features simultaneously. The Inception module simultaneously applies max pooling and three convolutions to the input data. All generated feature maps are merged to serve as inputs for the next layer.
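The forward pass of such a module can be sketched in plain NumPy. The kernel sizes of the three parallel convolutions are not given in the text; 1 × 1, 3 × 3, and 5 × 5 are assumed here, following the common Inception design, and the function names, the ReLU activation, and the 3 × 3 stride-1 pooling window are likewise illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_same(x, w, b):
    """'Same' cross-correlation. x: (H, W, Cin), w: (k, k, Cin, Cout), b: (Cout,)."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))              # zero padding
    win = sliding_window_view(xp, (k, k), axis=(0, 1))    # (H, W, Cin, k, k)
    return np.einsum("hwcij,ijco->hwo", win, w) + b

def max_pool_same(x, k=3):
    """Stride-1 max pooling that keeps the spatial size."""
    p = k // 2
    xp = np.pad(x.astype(np.float64), ((p, p), (p, p), (0, 0)),
                constant_values=-np.inf)
    win = sliding_window_view(xp, (k, k), axis=(0, 1))    # (H, W, C, k, k)
    return win.max(axis=(-2, -1))

def inception_module(x, weights):
    """Run the parallel branches and merge them along the channel axis.

    weights: list of (kernel, bias) pairs, one per convolution branch.
    """
    branches = [np.maximum(conv2d_same(x, w, b), 0.0) for w, b in weights]
    branches.append(max_pool_same(x))
    return np.concatenate(branches, axis=-1)
```

For a 25 × 25 × 2 input and three branches with 16 filters each, the merged output has 3 × 16 + 2 = 50 channels, because the pooling branch keeps the two input channels.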
The proposed deep network receives the stacked bi-temporal SAR VV polarization images as input and produces the change map in the output layer. Patch-based processing is the fundamental approach to utilizing image data in CNNs. Therefore, the input image is divided into patches of 25 × 25 × 2, which are used as input for the network. The numbers of filters are arranged in the following order: [16, 32, 64, 128, 256], and the kernel size is set to 3 × 3. The learning rate is set to 0.001, and Adam is used as the optimizer. The network architecture is shown in Figure 2.
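The patch preparation step might look like the sketch below. The text does not say whether patches overlap; this example assumes one 25 × 25 × 2 patch centered on every pixel, with reflect padding at the image borders, and `extract_patches` is a hypothetical helper name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patches(img_t1, img_t2, patch=25):
    """Stack two VV images and return one patch per pixel.

    Output shape: (H * W, patch, patch, 2), ordered row by row.
    """
    stack = np.stack([img_t1, img_t2], axis=-1).astype(np.float32)  # (H, W, 2)
    half = patch // 2
    # Assumption: reflect padding so that border pixels also get a full patch.
    padded = np.pad(stack, ((half, half), (half, half), (0, 0)), mode="reflect")
    win = sliding_window_view(padded, (patch, patch), axis=(0, 1))  # (H, W, 2, p, p)
    # Move the channel axis last; reshape copies the windows into one array.
    return np.moveaxis(win, 2, -1).reshape(-1, patch, patch, 2)
```

Because every pixel yields its own patch, memory use grows with H × W × 25 × 25 × 2 values, so a full Sentinel-1 scene would normally be processed in tiles or batches.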

Study Area and Dataset
To compare the performance of the classical methods and the proposed deep network, two Sentinel-1 SAR images of Lake Poopó were used. The lake is located in the Oruro Department of Bolivia in South America, at a longitude of 67°02′50.4″W and a latitude of 18°49′26.84″S. The images were acquired on 9 July 2018 and 15 August 2020. Figure 3 illustrates the location of the study area.

Result Analysis
Figure 4 shows the outcomes of the RI, MRF, and Inception CNN methods. The ground truth changes were obtained manually by visually examining the changes in the study area. The RI, which takes the pixel-wise ratio of the two images to obtain changes, does not provide satisfactory results. The poor performance of this method stems from its use of only the polarization information and a thresholded output. The MRF algorithm improved the detection results by considering the pixels' neighborhood. However, finding and selecting an appropriate number of iterations and window sizes can be time-consuming and challenging. On the other hand, Inception CNN can extract deep spatial features from the image pixels. In addition, the trained network has a high level of automation compared to classical methods. This results in a notable enhancement in CD performance.
To assess the change results quantitatively, the following evaluation indices were used: Overall Accuracy (O.A), Missed Error (M.E), and Kappa Coefficient (KC). Based on the evaluation indices presented in Table 1, the proposed deep network achieved the highest O.A in detecting waterbody changes, at 97.35%. In contrast, the RI exhibited the worst performance, at 83.27%.
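The three indices can be computed from the confusion matrix of a binary change map, as in the sketch below. The paper does not give the formulas; the definition of M.E as the fraction of true change pixels that were not detected is an assumption based on common usage in the CD literature, and `evaluate_change_map` is a hypothetical helper name.

```python
import numpy as np

def evaluate_change_map(pred, truth):
    """Return O.A, M.E, and Kappa for binary change maps (True = change)."""
    pred = np.asarray(pred, dtype=bool).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    tp = int(np.sum(pred & truth))     # changed, detected
    tn = int(np.sum(~pred & ~truth))   # unchanged, correctly ignored
    fp = int(np.sum(pred & ~truth))    # false alarms
    fn = int(np.sum(~pred & truth))    # missed changes
    n = tp + tn + fp + fn
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (oa - pe) / (1.0 - pe) if pe < 1.0 else 0.0
    # Assumption: M.E = missed changes / all true changes.
    me = fn / (tp + fn) if (tp + fn) > 0 else 0.0
    return {"OA": oa, "ME": me, "KC": kappa}
```

Reporting Kappa alongside O.A matters when unchanged pixels dominate the scene, since a map that misses most changes can still score a high O.A.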

Conclusions
The advancement of remote sensing techniques has made it easier to monitor environmental changes, such as the depletion of water zones. This progress has significantly enhanced our ability to understand and address ecological transformations. This study compared the performance of classical methods and a deep learning approach in identifying water zone changes from Sentinel-1 images. As examples of classical methods, the research employed RI and MRF. Inception CNN was employed as the deep learning network to enhance the CD performance. The MRF algorithm improved detection results by taking pixel neighborhoods into account. However, determining suitable numbers of iterations and window sizes is time-consuming. On the other hand, Inception CNN integrates a multi-scale approach directly within its architecture.

Figure 1. The flowchart of generating change results (CD stands for Change Detection).

Figure 3. The geographical location of the study area in South America, specifically Bolivia, along with the employed SAR images. (a) VV polarization image acquired in 2018, and (b) VV polarization image acquired in 2020.

Table 1. Accuracy assessment of three methods in generating a water zone change map.
