Recognizing the Shape and Size of Tundra Lakes in Synthetic Aperture Radar (SAR) Images Using Deep Learning Segmentation

Abstract: Permafrost tundra contains more than twice as much carbon as is currently in the atmosphere, and it is warming six times as fast as the global mean. Tundra lake dynamics are a robust indicator of global climate processes, yet they are still not well understood. Satellite data, particularly from synthetic aperture radar (SAR), are a suitable tool for recognizing tundra lakes and monitoring their changes. However, manual analysis of lake boundaries is slow and inefficient; therefore, reliable automated algorithms are required. To address this issue, we propose a two-stage approach comprising deep-learning-based semantic segmentation by U-Net, followed by instance segmentation based on a watershed algorithm for separating touching and overlapping lakes. This separation is essential for accurately estimating the size and shape of an individual lake. We evaluated the performance of the proposed approach on lakes manually extracted from tens of C-band SAR images from Sentinel-1, which were collected over the Yamal Peninsula and Alaska in the summer months of 2015–2022. An accuracy of 0.73, in terms of the Jaccard similarity index, was achieved. Lake perimeter, area and fractal dimension were estimated from the framework output for hundreds of SAR images, and lake sizes were found to be lognormally distributed. The evaluation of the results indicated the efficiency of the proposed approach for accurate automatic estimation of tundra lake shapes and sizes, and its potential for further studies of tundra lake dynamics in the context of global climate change, aimed at revealing new factors that could cause the planet to warm or cool.


Introduction
As permafrost thaws, some portion of its organic matter will be decomposed by microorganisms, emitting large amounts of greenhouse gases. This permafrost carbon-climate feedback is the largest terrestrial feedback to climate change, and one of the most likely to occur. It is increasingly recognized that the magnitude and timing of the permafrost carbon-climate feedback depend largely on the reorganization of hydrology (the distribution of wet and dry surfaces) and the associated changes in redox conditions in permafrost landscapes [1–3]. One of the major hydrologic transitions that can occur during permafrost thaw is surface collapse due to the melting of ground ice supporting the soil profile. This abrupt, non-linear process, termed thermokarst, has proven exceedingly difficult to map, let alone predict [4–6]. In lowland landscapes, thermokarst can transform dry land into a lake.
In some areas of the permafrost zone, current or drained thermokarst lakes occupy 40-80% of the landscape surface, often forming characteristic features [7,8]. If permafrost thaw continues after lake formation, loss of the frozen ground supporting the water body can trigger drainage of lakes to groundwater or connected river networks, decreasing the overall number and area of large lakes, but increasing the number of residual small lakes. Generally, high latitudes of the permafrost zone are experiencing an increase in lake area, while lower latitudes are seeing a decrease, though trends are spatiotemporally complex [9–13]. Integrating permafrost lake formation and drainage into Earth system models is a high priority in the permafrost community, for improving predictions of the permafrost climate feedback [14–17].
The abundance and size distribution of tundra lakes are important spatial characteristics of the landscape for understanding the dynamics of permafrost thawing and related processes in tundra ecosystems. In addition, the complexity of the shapes of tundra lake patterns can be rigorously quantified by employing these quantities. Many studies [18–20] have recently revealed that quantification of size distribution and fractal features requires a high volume of remote sensing data on the land surface. Statistical and geometrical parameters, such as size distribution and fractal dimension, could serve as a metric of the accuracy of tundra lake image segmentation when real physical parameters are limited or unavailable.
Several studies have addressed the problem of automated tundra lake recognition from satellite optical data. As reported in [21], the traditional approach is to use image segmentation techniques, in which images are divided into regions based on selected features; for example, the authors of [22] applied a cloud-removal technique, the function of mask (Fmask) algorithm, to Landsat imagery of a tundra surface. Automatic segmentation of water bodies in satellite imagery, to estimate lake size distributions, was considered in [23]. In [17], the authors applied robust linear trends based on the Theil-Sen regression algorithm, and a random forest (RF) classification technique fed with precalculated spatiotemporal trend information, to analyze lake dynamics across Northern permafrost regions. Supervised classification using a maximum likelihood classifier, combined with digital change detection, was investigated in [24] to identify areas of high erosion or deposition of permafrost.
Although data sets of permafrost surface features based on passive-microwave satellite data do exist [25,26], their spatial resolution is relatively coarse (20-50 km), and thus only allows for estimation of inundation fractions and their variations [27,28]. Accurate monitoring of water bodies requires data with a higher spatial resolution, on the order of tens of meters per pixel [29]. Some optical instruments acquire high-resolution images (typically 1-15 m), but they depend on cloud cover conditions and the illumination of the surface, which significantly limits their use, particularly in polar regions. Thus, a number of studies have focused on synthetic aperture radar (SAR) data for recognizing permafrost surface features and quantifying their dynamics [30–35], including InSAR techniques [36–38]. SAR backscatter intensity strongly depends on the dielectric constant of the surface, and is very different for water and ice [39]. Thus, thanks to its high spatial resolution and these signature differences, SAR is a suitable tool for tundra lake recognition in many cases.
Although the manual analysis of SAR data is sensor-independent and relatively robust against misclassification [35,40,41], it is time-consuming; therefore, automated methods are preferable. Some studies have proposed using machine learning for characterizing Arctic tundra landscapes, including lakes [30,42]. To the best of our knowledge, state-of-the-art deep learning methods have not previously been applied to the problem of automatic tundra lake recognition from SAR; thus, one of the primary aims of this study was to investigate the potential of deep learning for accurate recognition of tundra lake shape and size. C-band Sentinel-1 images, freely available under the Copernicus Program, were used to develop a machine learning methodology capable of segmenting the images into two classes: tundra lakes and background. To assess the applicability of the method, we analyzed the generated time series of segmented Sentinel-1 images to quantify the spatial features of lakes in the Arctic region.
The rest of the paper is organized as follows. Section 2 describes the satellite SAR data processing, the selected test sites, the reference data preparation and the tundra lake recognition workflow. The results obtained in this study are presented in Section 3, followed by a discussion in Section 4. The main findings and implications are summarized in the conclusions in Section 5.

Satellite SAR Images and Test Sites
For our study, we selected two test sites in the Arctic region characterized by the presence of tundra lakes: the Yamal Peninsula and Alaska. The regions are located far from each other, which allowed for an improved understanding of the performance of our approach in different environmental conditions. For the two test sites, we downloaded hundreds of Sentinel-1 Level-1 Ground Range Detected (GRD) dual-polarized products from the Alaska SAR Facility data archive, for the summer months of 2015-2022, when the lakes reach their maximum development. The spatial boundaries of the collected images are depicted on a map in Figure 1. Then, we computed the SAR backscattering coefficient $\sigma^0$ from the digital numbers $DN$ as

$$\sigma^0 = \frac{DN^2}{A^2},$$

where $A$ was a calibration constant taken from a look-up table in the auxiliary calibration files (for details, see [43]). Then, $\sigma^0$ was conventionally converted to dB, $\sigma^0_{dB} = 10 \cdot \log_{10}(\sigma^0)$, for both HV and HH polarizations. Finally, the data were projected onto a Polar Stereographic grid with a pixel size of 50 m.

Radar backscattering $\sigma^0$ from water and ice is strongly angle-dependent, due to the specificities of SAR system geometry [39,44,45]; hence, information on the incidence angle ($I_a$) is very useful for SAR data analysis. Contrary to polarization and frequency, $I_a$ is not fixed, and varies across the image. The backscatter intensity from a homogeneous surface varies with $I_a$, and decreases across a SAR image from near range (low $I_a$) to far range (high $I_a$). To compensate for this effect, most studies apply a global $I_a$ correction with a constant slope before the classification task [46–48]; in doing so, however, they neglect the known physical differences in how backscatter varies with $I_a$ for different surface types. To take these specificities into account, we used the incidence angle of every pixel in our recognition framework. The Sentinel-1 metadata contain values of $I_a$ at certain points, which we extracted and interpolated onto the image grid by linear interpolation. Then, the values of $\sigma^0_{HV}$ (Figure 2A), $\sigma^0_{HH}$ (Figure 2B) and incidence angle $I_a$ (Figure 2C) were normalized to the range 0-1 by min-max normalization, $(\mathrm{value} - \min)/(\max - \min)$, and then used to form a 3-channel image (Figure 2D). This data structure was used in both the training and segmentation stages.
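The preprocessing chain described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the processing code used in the study: it assumes the standard Sentinel-1 look-up-table calibration $\sigma^0 = DN^2/A^2$, and the function names, the `eps` floor that guards the logarithm, and the handling of constant-valued channels are all hypothetical choices made for the example.

```python
import numpy as np

def calibrate_sigma0(dn, a):
    """Linear backscatter from digital numbers, assuming sigma0 = DN^2 / A^2."""
    dn = np.asarray(dn, dtype=np.float64)
    a = np.asarray(a, dtype=np.float64)
    return dn ** 2 / a ** 2

def to_db(sigma0, eps=1e-10):
    """Convert linear backscatter to dB; eps (an assumed floor) avoids log10(0)."""
    return 10.0 * np.log10(np.maximum(sigma0, eps))

def min_max(x):
    """Scale an array to the range 0-1; a constant array maps to zeros."""
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

def make_three_channel(dn_hv, dn_hh, a_hv, a_hh, inc_angle):
    """Stack normalized HV, HH and incidence angle into an (H, W, 3) image."""
    hv = min_max(to_db(calibrate_sigma0(dn_hv, a_hv)))
    hh = min_max(to_db(calibrate_sigma0(dn_hh, a_hh)))
    ia = min_max(inc_angle)
    return np.stack([hv, hh, ia], axis=-1)
```

Here `inc_angle` is expected to be already interpolated onto the image grid; the interpolation of the metadata tie points is left out of the sketch.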

Manual Mapping of Tundra Lakes
For the training of the U-Net, we prepared binary images containing manually annotated regions of tundra lakes. The annotation was performed by expert visual analysis, using the VGG Image Annotator software [49]. The lakes were mapped in the Yamal and Alaska test sites (Figure 1) from tens of images, and stored as binary TIFF files, where a pixel intensity of 255 corresponded to a lake, and 0 to the background. Overall, we generated sets of 668 and 296 images of 512 × 512 pixels containing labeled lakes for the Yamal and Alaska sites, respectively [50]. An example of an image generated from manual recognition of tundra lakes is shown in Figure 3.

Automatic Tundra Lake Recognition from SAR Images
In this study, we propose a two-stage algorithm framework for tundra lake recognition from SAR images. The algorithm combines semantic and instance segmentation. Semantic segmentation is the first stage: it associates every pixel of an image with a class label, such as a lake or other surface feature, and treats multiple objects of the same class as a single entity. By contrast, instance segmentation, which is the second stage, treats multiple objects of the same class as distinct individual instances. Combining both methods is useful for separating touching or overlapping individual objects (in our case, the lakes) for unambiguous estimation of their size and shape.

The majority of semantic segmentation methods use encoder-decoder deep neural network structures, most of which are based on the U-Net architecture proposed in [51], which is an improved version of the fully convolutional network for medical images [52], and which has also been applied to water body imagery [53,54]. The underlying technique of U-Net involves skip connections, which improve gradient flow and allow information to be transferred between the down-sampling and up-sampling paths by connecting each pair of encoder-decoder layers. In our algorithm framework, the U-Net took an input SAR image, classified each pixel in the image into tundra lake/background classes, and produced a binary image. The semantic segmentation by the U-Net was followed by instance segmentation with a classical watershed algorithm [55,56], which is a region-based technique that relies on image morphology. The general scheme of our tundra lake recognition workflow is shown in Figure 4. The framework operated with Sentinel-1 data synthesized into 3-channel images, as described in Section 2.1. Because low-pass filtering prior to classification and detection strongly improves recognition results [30], we applied an anisotropic diffusion filter to suppress speckle [57,58]. For training of the U-Net, the Adam optimizer [59] was employed with default parameters.

Lakes often overlap or touch each other; therefore, special methods are required for accurate estimation of their shapes. To address this issue, we complemented our recognition framework with instance segmentation, which was the second stage. Here, we aimed at labeling each pixel of an image with a corresponding instance: in our case, the instance was an individual tundra lake. The watershed algorithm described in [56] was used. A chessboard metric was chosen, as it provided more robust results than other widely used metrics [60,61]. This step of our framework was implemented using a combination of several computer vision methods from the Python-based OpenCV framework [62]. In the following text, we complement the description of the processing methods with references to the corresponding OpenCV functions.
First, we treated isolated groups of a few pixels as artifacts, and discarded them by morphological opening (cv2.morphologyEx). Then, we identified sure background areas by convolving the image with a kernel, K, that replaced the image pixel at the anchor point position with the maximal value under the kernel (cv2.dilate), which caused the tundra lake areas to extend their coverage. This pushed the lake boundaries into the background, so that the pixels remaining outside the dilated lakes corresponded with certainty to the background (pixel value = 0). To find the sure foreground area, the distance transform was used (cv2.distanceTransform), which assigns each pixel inside the foreground regions its distance from the closest background pixel, followed by thresholding of the obtained values. The threshold of the distance transform was chosen empirically, and was 15% of its maximum value. Higher values might have resulted in the discarding of small lakes. Then, we estimated the ambiguous pixels (pixel value = −1) by subtracting the sure foreground from the dilated lake area. Pixel labeling was performed using cv2.connectedComponents for the three categories: foreground (1); background (0); and ambiguous (−1). Finally, to define the instances, which were individual lakes, watershed segmentation based on the derived markers was applied (cv2.watershed). All the described steps are summarized in the pseudo-code listing in Algorithm 1.
Algorithm 1.
for x = 1, ..., X do
    Segment x with M into binary image BI_x, where a pixel value of 255 corresponds to the lakes, and 0 to the background.
end
Merge {BI_x}, x = 1, ..., X, into BI.
Remove small noisy areas in BI by morphological opening with a kernel of 3 × 3 pixels.
Identify the sure background area by dilating the pixels x_{=255} a few times, to extend the lake boundary into the background.
Calculate the chessboard distance transform P_DT for each pixel x.
Estimate the foreground pixels x_FG by thresholding P_DT at 15% of max(P_DT).
Estimate the ambiguous pixels as x_A = x_{=255} − x_FG.
Define the pixels that correspond to a certain lake (instance) by watershed segmentation, given as x_Li = f(BI, x_A).

Measuring Fractal Features of Segmented Lakes
The formation of geological objects entails different complex natural phenomena, and it is quite remarkable that some of them have identical fractal dimensions, as was shown for clouds in the pioneering paper [63]. Fractal theory [64] can be used as a method for studying partially correlated (over many scales) spatial phenomena that are continuous but not differentiable. This theory helps quantify complex shapes or boundaries, and relates them to underlying processes that may affect pattern complexity. For simple objects like circles and polygons, the perimeter P scales as the square root of the area A. However, for complex planar regions, with fractal curves as their boundaries,

$$P \sim A^{D/2},$$

where the exponent D is the fractal dimension of the boundary curve.
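As a short worked check of the area-perimeter scaling $P \sim A^{D/2}$ (added here for illustration, with $c$ an unspecified shape-dependent constant), a circle of radius $r$ recovers $D = 1$, and taking logarithms shows how $D$ could be read off as twice the slope of a log-log plot of perimeter against area:

```latex
% Circle: a smooth boundary gives D = 1.
P = 2\pi r, \qquad A = \pi r^{2}
\quad\Longrightarrow\quad
P = 2\sqrt{\pi}\, A^{1/2}, \qquad D = 1 .

% General case: D is twice the slope of log P versus log A.
P = c\, A^{D/2}
\quad\Longrightarrow\quad
\log P = \frac{D}{2}\,\log A + \log c .
```

For a planar boundary, $D$ lies between 1 (smooth curve) and 2 (plane-filling curve); the Koch curve, for example, has $D = \log 4 / \log 3 \approx 1.26$.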

Box Counting Method
The box-counting method is a computational method for extracting data from real pixel images for further analysis. The methodology consists of breaking the whole image into smaller pieces, and counting the boxes with some particular feature.
In fractal analysis, we count the number of boxes N(R) that cover the fractal, and observe how it changes as the size of the box decreases. The number of boxes should approximately follow a power law,

$$N(R) \sim R^{-d},$$

where R is the size of the box, and d is called the Kolmogorov capacity, or simply the box-counting dimension.
There are other methods for estimating fractal dimensions, such as the area-perimeter or divider methods, which are also used in geological research [65]. To apply them, we need to detect boundaries with high precision, which depends on the segmentation method and the resolution. The box-counting method, by contrast, is a simple computational method that gives robust complexity characteristics for a mixed land-lakes object.
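A minimal NumPy sketch of box counting on a binary mask is given below, assuming the usual power law $N(R) \sim R^{-d}$ so that $d$ is the negative slope of $\log N$ against $\log R$. The function names and the default set of box sizes are hypothetical, and whether the mask holds filled lakes or only their boundaries is left to the caller.

```python
import numpy as np

def box_count(mask, size):
    """Count the size x size boxes that contain at least one True pixel."""
    h, w = mask.shape
    # Pad with False so that both dimensions are multiples of the box size.
    ph, pw = (-h) % size, (-w) % size
    padded = np.pad(mask, ((0, ph), (0, pw)), mode="constant",
                    constant_values=False)
    big_h, big_w = padded.shape
    blocks = padded.reshape(big_h // size, size, big_w // size, size)
    return int(blocks.any(axis=(1, 3)).sum())

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate d from a least-squares fit of log N(R) against log R."""
    mask = np.asarray(mask).astype(bool)
    if not mask.any():
        raise ValueError("mask contains no foreground pixels")
    counts = np.array([box_count(mask, s) for s in sizes], dtype=np.float64)
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope
```

Because the smallest usable box is one pixel, the fit only samples a limited range of scales, so the estimate is sensitive to the chosen sizes.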

Tundra Lakes Recognition by U-Net
To assess the performance of our framework (described in Section 2.3) for lake recognition, we applied it to Sentinel-1 Extra Wide Swath (EW) data acquired over the Yamal Peninsula on 23 August 2016 at 12:49:16. First, the data in HV- and HH-polarization mode were calibrated, and the incidence angle information was extracted. Then, we projected them onto a polar stereographic grid with a grid step size of 50 m. A pseudo-RGB image was synthesized from the data, and is shown in Figure 5A. It should be noted that this image was not used in the training stage. Then, we applied the pre-trained U-Net to extract the tundra lakes from the data, and a binary image was generated (Figure 5B). Before the automatic recognition, we manually labeled the tundra lakes in the image, and used them as ground truth labels (Figure 5C). On visual inspection, we generally saw good agreement between the automatically and manually extracted lakes. However, the lake borders produced by our framework looked smoother and more natural than those manually extracted. To show the difference between the data, we subtracted the ground truth from the U-Net output; the result is shown in Figure 5D. The blue areas correspond to lakes that were not detected by our U-Net but were recognized by an expert; conversely, the red areas correspond to areas that were recognized as lakes by the U-Net, but not in the manually derived image. In total, four lakes were not detected by the U-Net model (the blue blobs in Figure 5D). The other differences mostly corresponded to lake border areas. As mentioned above, the lake borders in the image generated by U-Net were more natural, and closer to what we see on the satellite image. Most of the very small lakes, the size of a few pixels, were not recognized either by U-Net or by expert analysis, because they were too small relative to the SAR image resolution.

Instance Segmentation of the Lakes and Noise Filtering
To demonstrate the advantage of our tundra lake recognition concept, let us consider an example of the lake recognition products obtained from a Sentinel-1 image taken over the northern coast of Alaska (157.0301° W, 70.93083° N) on 20 October 2016 at 04:02:51. The result of segmentation by U-Net is shown in Figure 6A, the result after removal of noisy areas in Figure 6B, and the final result obtained with instance segmentation using the watershed algorithm in Figure 6C. In the zoomed area in Figure 6D, we can recognize three lakes, but lake 2 and lake 3 overlap, which means that if we analyzed their shapes they would be considered as one lake. This is not desirable, as we want to derive their spatial characteristics separately. Thanks to the instance segmentation stage of our framework, we can separate the two overlapping lakes by recognizing their borders (see Algorithm 1) and then applying the watershed algorithm. Finally, the lakes are separated and ready for further analysis, such as estimation of their shapes and sizes (Figure 6F).

Fractal Dimension
The fractal dimension computed by the box-counting method is shown in Figures 7 and 8. The plots clearly show that, apart from some deviation, the fractal dimension remained almost constant. The fluctuations can be explained by the sensitivity of the computation to the choice of box size, which is limited by the finite size of the image pixels.

Size Distribution
We calculated the size distribution of the lakes on historical maps by means of the Python image-processing library scikit-image (skimage). The standard procedures counted the number of pixels associated with land and lakes, which led to the histograms in Figures 9 and 10.
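The pixel-counting step can be illustrated independently of the library used for labeling. The sketch below uses only NumPy and assumes a label image in which each lake carries its own positive integer, a 50 m pixel size as stated for the projected grid, and a lognormal model summarized by the mean and standard deviation of the log-areas; the function names and the `ignore` argument are hypothetical.

```python
import numpy as np

def lake_areas_km2(labels, ignore=(0,), pixel_size_m=50.0):
    """Area of every labeled lake in km^2, skipping the labels in `ignore`."""
    ids, counts = np.unique(np.asarray(labels), return_counts=True)
    keep = ~np.isin(ids, ignore)
    # Each pixel covers pixel_size_m^2 square meters; 1 km^2 = 1e6 m^2.
    return counts[keep] * (pixel_size_m ** 2) / 1e6

def lognormal_params(areas):
    """Mean and sample standard deviation of log(area)."""
    logs = np.log(np.asarray(areas, dtype=np.float64))
    return logs.mean(), logs.std(ddof=1)
```

If the labels come from a watershed output that also contains border (−1) and background (1) codes, passing `ignore=(-1, 0, 1)` would exclude them. A histogram of `np.log(areas)` that looks approximately Gaussian is what a lognormal size distribution implies.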

Discussion
Some of the most recent comprehensive comparisons between the original U-Net [51] and its modifications, such as [66,67], indicate that architectural changes to U-Net are potentially unnecessary for images formed from coherent signals, and are associated with increased complexity and slower speed for marginal performance gains [68]. Thus, in this study, we used the original U-Net architecture. In future work, state-of-the-art modifications of U-Net can be assessed, to improve the performance.
Binary cross-entropy, computed from the predicted values and the ground truth, is typically selected as the loss function for segmentation tasks with U-Net. However, the targets generally occupy a small part of the entire training image, so that the minimization of cross-entropy tends to be biased towards the area outside the target (the background). Thus, in our recognition framework, we used the Jaccard similarity index as the loss function, to avoid false-positive detection and to increase binary classification accuracy.
There are many challenges in quantifying changes in the shape and patterning of tundra lakes [69], because, in contrast to mathematical fractals (like the Koch snowflake or the Sierpinski gasket) in mathematical space, we have to restrict ourselves to the finite size of pixels in images. Novel techniques of image analysis may successfully tackle these scaling challenges. The time series of fractal dimensions found here show that the fractal features of tundra lakes are fairly constant in time (see Figures 7 and 8). We did not observe an increase in shape complexity, even when a lake's size was growing. In [70], the authors analyzed Landsat images of the Western Siberia region using region-growing segmentation techniques. While the average fractal dimension of tundra lakes lay within the interval found in our work, there were a few spikes, probably due to issues with the technique used. For the two sites, Alaska and Yamal, we showed that the sizes of the lakes recognized in Sentinel-1 satellite images were lognormally distributed (see Figures 9 and 10). In [22], tundra lakes with sizes from 0.002 km² to 50 km² were segmented from high-resolution Sentinel-2 satellite images covering 725,000 km² of the East Siberia region. The authors found that tundra lakes in that region were also lognormally distributed. The fractal dimension and size distributions exhibited universal characteristics that did not necessarily depend on the details of the driving mechanisms and associated complexities; thus, they could be used in tundra landscape modeling [71,72]. Pattern recognition, and the quantification of fractal dimension, are also important in the field of biomedical image analysis [73,74]. We believe that the methods of fractal pattern recognition developed here will be useful in other areas where system complexity needs to be quantified.

Conclusions
This study addressed the problem of automated tundra lake recognition from SAR images for calculating lake shape and size. We proposed a two-stage recognition framework comprising semantic (1) and instance (2) segmentation, which operated with Sentinel-1 images. In the first stage, the tundra lake areas were extracted by U-Net, and a segmentation accuracy of 0.73, in terms of the Jaccard similarity, was achieved. The second stage, the watershed algorithm, was aimed at separating the touching and overlapping lakes produced in the first stage, for unambiguous estimation of the shape and size of an individual lake. The performance of the proposed method was demonstrated on a generated dataset of manually labeled lakes from tens of images taken over the Yamal Peninsula and Alaska in the summer months of 2015-2022. The approach was powerful and reliable in terms of recognition accuracy, computational efficiency and degree of automation.
The applicability of the proposed framework was assessed on a generated multiyear dataset of segmented binary images containing tundra lakes derived from Sentinel-1 data. Lake sizes and shapes were estimated from the framework output, and showed high potential for the analysis of tundra lake changes in the context of global climate. However, the model's performance over a wider range of environmental conditions and regions still needs to be demonstrated. Possible directions for improving the recognition of lakes include training over different areas and seasons, and using SAR texture characteristics. State-of-the-art U-Net architectures could be incorporated into the framework and their performance gain evaluated, but at the cost of added complexity and reduced speed. Although the new approach is used here for SAR images from Sentinel-1, it has great potential for images taken from other satellite platforms, including optical imagery.
The study of the fractal characteristics of networks of tundra lakes, and of their size distributions, is essential for the representation of permafrost, vegetation and landscape dynamics in climate models. The analysis of new SAR imagery of tundra lakes by a novel deep learning method showed that the average fractal dimension of the recognized lake patterns was very stable over the seven-year time interval, and that all the changes lay within the limits of accuracy. The lake sizes were lognormally distributed for every year in the studied time interval. We expect that the proposed approach to tundra lake recognition will be in demand for climate emulators that are based on high-quality data analysis and on revealing complex features of the systems [75].

Figure 1. Sentinel-1 images collected over the two test sites: the Alaska test site is shown in red, and the Yamal test site in blue.

Figure 3. A pseudo-RGB Sentinel-1 image (R: HV-polarization; G: HH-polarization; B: incidence angle) taken over the Yamal Peninsula on 12 August 2015 at 12:40:52, containing tundra lakes with different shapes and sizes (A), and a binary image with manually extracted tundra lakes (B), where the white color corresponds to the tundra lake areas.
With the Adam optimizer at its default parameters, a batch size of sixteen was used, and the network was trained for 200 epochs. As a loss function, the Jaccard similarity index, or Intersection over Union (IoU = Area of Intersection / Area of Union), was used. This metric operates with the Area of Intersection, which is the area where the predicted image overlaps the ground truth image, and the Area of Union, which is the total area covered by the predicted image and the ground truth image together. For the training, image patches of 512 × 512 pixels containing SAR data, and the corresponding masks of tundra lake labels, were used. After the U-Net training, an accuracy of 0.73, in terms of IoU, was achieved. The output of the semantic segmentation stage with U-Net was a binary image containing pixels with values of 0 (background) and 255 (tundra lakes).
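The IoU described above can be written compactly for binary masks. The sketch below is illustrative only: `jaccard_index` computes the metric on hard masks, while `soft_jaccard_loss` is one common differentiable relaxation for probability maps. The source does not state which formulation was used for training, so the soft version, the `eps` smoothing term and the convention for two empty masks are assumptions.

```python
import numpy as np

def jaccard_index(pred, truth):
    """IoU of two binary masks; any nonzero pixel counts as lake."""
    pred = np.asarray(pred) > 0
    truth = np.asarray(truth) > 0
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(np.logical_and(pred, truth).sum() / union)

def soft_jaccard_loss(prob, truth, eps=1e-7):
    """1 - soft IoU for predicted probabilities in [0, 1] and a 0/1 mask."""
    prob = np.asarray(prob, dtype=np.float64)
    truth = np.asarray(truth, dtype=np.float64)
    inter = (prob * truth).sum()
    union = prob.sum() + truth.sum() - inter
    return 1.0 - (inter + eps) / (union + eps)
```

Unlike pixel-wise cross-entropy, both quantities ignore pixels that are background in the prediction and in the ground truth, so a large empty background does not dominate the score.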

Figure 5. Results of tundra lake recognition from Sentinel-1 data taken over the Yamal Peninsula (69.4366° E, 68.5102° N) on 23 August 2016 at 12:49:16 by our algorithm framework: (A) input pseudo-RGB image, where R = SAR backscattering at HV-polarization, G = SAR backscattering at HH-polarization and B = incidence angle; (B) predicted tundra lakes (white); (C) ground truth image with manually mapped lakes; (D) the difference between ground truth and predicted data, where red indicates pixels corresponding to lakes in the framework output but not in the ground truth, blue corresponds to pixels recognized as lakes in the ground truth but not in the framework output, and white stands for agreement between the data.

Figure 6. (A) Segmented tundra lakes at the Alaska test site (157.0301° W, 70.93083° N) from 20 October 2016 at 04:02:51, obtained by U-Net; (B) after filtering of small "noisy lakes"; followed by (C) instance segmentation, where white areas correspond to tundra lakes, and colored blobs stand for individual segmented lakes (instances). The white rectangles in (A-C) correspond to a zoomed area containing three lakes labeled as (1), (2) and (3); (D) raw U-Net output in the zoomed area, where lake 2 and lake 3 overlap (one instance); (E) lake areas after morphological opening, with the lake borders obtained by the watershed segmentation indicated in red; (F) results of instance segmentation with the watershed algorithm, where each lake (instance) is depicted in a different color.

Figure 7. Box plot of the fractal dimension (fd) for the time period 2015-2022; the studied region is in Alaska.

Figure 8. Box plot of the fractal dimension (fd) for the time period 2015-2022; the studied region is in Yamal.

Figure 9. Histogram of the number of recognized lakes on images from Alaska vs the log of lake size (area).

Figure 10. Histogram of the number of recognized lakes on images from Yamal vs the log of lake size (area).