Deriving Large-Scale Coastal Bathymetry from Sentinel-2 Images Using an HIGH-Performance Cluster: A Case Study Covering North Africa’s Coastal Zone

Coasts are areas of vitality because they host numerous activities worldwide. Despite their major importance, the knowledge of the main characteristics of the majority of coastal areas (e.g., coastal bathymetry) is still very limited. This is mainly due to the scarcity and lack of accurate measurements or observations, and the sparsity of coastal waters. Moreover, the high cost of performing observations with conventional methods does not allow expansion of the monitoring chain in different coastal areas. In this study, we suggest that the advent of remote sensing data (e.g., Sentinel 2A/B) and high performance computing could open a new perspective to overcome the lack of coastal observations. Indeed, previous research has shown that it is possible to derive large-scale coastal bathymetry from S-2 images. The large S-2 coverage, however, leads to a high computational cost when post-processing the images. Thus, we develop a methodology implemented on a High-Performance cluster (HPC) to derive the bathymetry from S-2 over the globe. In this paper, we describe the conceptualization and implementation of this methodology. Moreover, we will give a general overview of the generated bathymetry map for NA compared with the reference GEBCO global bathymetric product. Finally, we will highlight some hotspots by looking closely to their outputs.


Introduction
Coastal morphology plays vital role in the global environment. It could be considered as a barrier between land and sea. Coastal regions with shallow water are subject to permanent changes. This impacts their vulnerability to coastal flooding. Moreover, it also can harm economies-e.g., the Netherlands invests on average EUR 35,000,000.00 per year to nourish the coast [1]. Moreover, the depth and shapes of the underwater bathymetry are important for the maritime sectors [2]. Bathymetric maps are also used in biological oceanography, because the depth of the sea is linked to biological characteristics of marine ecosystems. Therefore, generating high-resolution bathymetry maps is an added value to several domains. Despite this major importance, it is very challenging to create bathymetric maps with both accurate spatial resolution and large spatial coverage. It is both time-consuming and very expensive [3]. In recent decades, single-beam echosounding and many other technologies such as multi-beam echosounding, Light Detection and Ranging (LIDAR), have been used to produce the bathymetry of coastal regions at different resolutions and accuracy [3][4][5]. Classical studies of surveying coastal regions, generally use acoustical techniques which consist of measuring the distance between the device and the bottom of the sea. This technique offers a good accuracy but it is very slow and covers very limited areas. An alternative that can scan a wide area with a good spatial resolution is LIDAR. It uses infrared and green laser transmitter and post-flight data processing techniques to generate survey depths with high accuracy [6,7]. However, this technique is more expensive, especially when the study area is large. To overcome technical and economical issues of in situ observations, the new generation of spaceborne optical remotesensing sensors could be a good alternative. It is characterized by high spatial resolution and regular revisit time (varying from a few days to 16 days) while covering the entire globe [8]. Several studies are currently moving in this direction, and try to use remote sensing data to retrieve the water depth in coastal regions. Hamylton et al. [9] compared two approaches to estimate the bathymetry at 5 m of resolution. This study was established at both the Lizard Island and Sykes Reef sites, by using WorldView-2 images. Chybicki [10] demonstrated that an inversion of Sentinel 2A/B radiance is useful to derive an estimation of the shore bathymetry (for areas with a depth lesser than 18 m in South Baltic coastline. Other researchers used similar method to derive the bathymetry such as [11,12]. This method is efficient for nearshore areas but has a principal limitation of this method is the water turbidity which degrades the quality of the results. Thus, the objective of this article is to present a large-scale implementing an High Performance Cluster (HPC) methodology to derive the bathymetry from high spatial resolution images (of type Sentinel 2A/B) from a regional, continental to global scale. We present an approach that processes many images simultaneously with a computation time around 1.5 h per image per CPU (central processing unit). At first stage, we describe physical laws that control the estimation of the water depth followed by a detailed HPC-workflow to implement the bathymetry derivation-code. In the results section we present a study case applied in North African's coasts. Detailed description of computational time and memory while running the code over the tiles will be given. Finally, in discussion section we discuss the limitations of the methodology, and present a recommendation to run it over the world with specific IT resources.

Study Area
The arid coasts of North Africa, extending over more 6000 km from Morocco Atlantic Coast to the Nile Delta ( Figure 1), are undergoing pronounced shoreline retreats and coastal flooding that are reported as a consequence of the ongoing sea level rise resulting from global warming. The coastal zone plays an important role in the economic development of this region [13]. Indeed, more than 60% of the population live in coastal cities, and 90% of the country's industries are based along the coast [14,15]. Of particular interest are the abnormal shoreline dynamics for deltaic and sandy beaches, which are severely impacted by abrupt decadal variabilities in both climatic and anthropogenic drivers resulting in their increased vulnerability to disturbances from coastal hazards. Unfortunately, the evolution, distribution and impacts of these drivers remain largely unquantified, let alone understood, for these extensive arid coasts that harbor the major portion of North Africa's population as well as unique and fragile marine ecosystems.

Sentinel 2A/B Images Retrieval
Sentinel 2A and 2B are two twin polar-orbiting satellites launched in 2015 and 2017, respectively, as part of the European Copernicus program [16]. They are designed for the operational monitoring of atmosphere, land and ocean. With a wide-swath and a multispectral imager (MSI) with 13 spectral bands (from 443 nm to 2190 nm), a high spatial resolution imagery varying from 10 m for the majority of the bands covering the visible, and very near infrared (VNIR), to 60 m for the short wave infrared (SWIR). In addition to this variety of bands and the high spatial resolution, Sentinel 2A/B has a temporal resolution reaching a maximum of 5 days (more you are far than equator less the temporal resolution [17]). They allow the observation and monitoring of land, atmosphere and ocean every 5 days.
In this study, 287 Sentinel 2A/B tiles cover the North Africa coastal zone ( Figure 1) were used. First, we queried all available Sentinel 2A/B (S2-L1C) images between January 2015 and January 2020 in the PEPS collection of the Theia Land data center. For the analysis in this work we limit the computations to ten scenes of cloud-free Sentinel 2A/B images during a 5 year span from 2015 to 2020. A sensitivity analysis in the original method article [18] and for West-African hotspots [19], shows that the wave power (as a function of wave height and period) as well as directional spread of the wave field affect depth estimation. Generally, the more powerful, narrow-spread wave fields allow for a more accurate estimation of the water depth.
Considering this, 10 cloud free images with most powerful waves are selected in this study. Since we are working on 287 tiles and in each tile there are 10 images, we integrated 2870 images in the HPC. If our area (the selected image) is totally covered by water, it has an execution time of 72 core-hours (36 CPUs used for two hours). If the image has a terrestrial part this time decreases. This is why we fixed walltime for running the Portable Batch System (PBS) in 2 h.

Global Water Mask
Since it is not our objective to estimate water depths on lands, the land is filtered from the image by using a land-mask. The set of routines is fed by a water mask which not only prohibits depth estimation at land but it is also used to compute the distance to the shore that is used in adaptive tile-sizes during the computation. Here we use an existing global water mask with an initial resolution of 300 m. It was generated by European Space Agency Climate Change by combining SAR images and recent Landsat-derived products. The water mask was re-sampled to a resolution of 500 m in order to be compatible with our outputs: water depths are estimated on 500 m resolution.

Physical Description
Ocean waves, when they are about to arrive in the coastal zone, are free moving waves. Only in intermediate h int = L 2 to shallow water h sh = L 20 , the propagation of these waves is limited by, among other physical processes, the water depth. As the water depth reduces towards shore, waves increasingly "feel" the bottom by increasing bottom-friction until the waves eventually break close to shore. Depth domination of the wave propagation can be described with through a mathematical relationship; the dispersion relation for free surface waves.
To solve (1), one is after a pair of wave celerity (c), wave length (L) or number (k) and wave period (T) or frequency (ω). Here we use the approach following Bergsma et al. [18] to find wave phase shifts (leading to celerity c) per wave number (k) in different detectorbands using a Radon-Transform based Fourier slicing techniques. The great benefit of this Radon-Transform based technique is the limited dependence on image resolution to estimate wave propagation while maintaining computation performance.

IT Workflow
To minimize the execution time, each Sentinel 2A/B image is split into 36 subsets (6 × 6). This ensure that all the available CPUs are used at the same time. Thereafter, we move a sub-window over the pixels that we want to process (depending on the output resolution). In this sub-window the physical equations described in Section 3.1. The approach of this decomposition is illustrated in Figure 2. The size of each sub-window depends primarily on the distance to the shore. It's equation is given as follows: κ = min 2.5, log 10 (D shore ) 2 2 (2) where, κ is the factor regulating the size of the window. D shore is the distance to the shore and W xs represents the window length (The size of the window is W xs × W xs . This widow is applied for each study bands. L win is constant fixed in 200 m in this study. Once the pixel is surrounded by the window, a Radon Transform (RT) is applied as described in [18]: R sub I represents the sinogram calculated over the sub-window domain. This sinogram allows to extract the main direction of the wave by taking the maximum variance or standard deviation over all directions. Once the direction is found, a fast-Fourier transform DFT is performed over beam with the selected angle to obtain the phase per band (φ) (following the methodology of [18]). Using two images with a slight δ(t), one can compute the phase shift Φ and find celerity (c) by linking λ, Φ using (5) where, δ(t) represents the difference time in seconds (s) between the acquisition detector bands. Here only the 10-m resolution bands are used, but possible Radon-augmentation enables the use of all bands [18]. The extract the depth (D) a numerical implementation of Equation (1) is implemented. A graphical description of the algorithm that summarizes the workflow is given in Figure 3. The IT implementation of the physical code is presented in Section 3.1.

HPC Implementation
A High Performance Computing (HPC) techniques refers to a supercomputer with a high level of performance as compared to a general-purpose computer. The basic elements of HPC techniques consists of node, queue, and a job. A node is a single physical or logical computer with one or more processors. Queuing systems are responsible for managing job requests which are shell generally scripts submitted by users. A job is a collection of instructions that a user initiates. Each job reserves specific resources in term of Randomaccess memory (RAM) and CPUs. To sum up, the computations are performed by the cluster, by submitting a job request to a specific batch queue. The scheduler will assign your job to a compute node in the order determined by the policy on that queue and the availability of an idle compute node.
In our case, the different simulations were established in the CNES-HPC cluster. All available cores are allocated with 36 Central Processing Units (CPU) and 70 Gb of RAM.
The data are collected from the datalake; thereafter, each node treat separately one image (Figure 4). Figure 4. Schematic overview of a HPC architecture. From the right to the left: loading of the data from the datalake Sentinel-1, 2, ..., n. Then in each node (the yellow box) the image is split into 36 sub-images that are treated in parallel. Each output is merged and then saved in the datalake.

North Africa Coastal Bathymetry Showcase
Water depth maps are at 500 m of spatial resolution. Figure 5 shows the result of the computation for the North Africa coastal region. Bathymetry is computed until the 2% of potential estimated depth [17] using ERA5 hindcast (ECMWF). This covers most of the shallow shelves, bays (Gabes bay) and delta (Nile). This potential varies from shallow in the Mediterranean Sea, due to limited fetch to deeper waters along the open Atlantic Coast. S2Shores results show in general a good agreement with GEBCO, the referent global bathymetry product, combining numerous surveys and previous altimetry-based coarse satellite estimations. However, it was found that GEBCO has substantial issues in coastal and shallow waters, and often generates some bumps due to interpolation of different source data-set and ends up at the shore with a linear interpolation to the shoreline. A more thorough validation from an unrivaled local survey remains invaluable in terms of ground truth and is yet to be done in this area. However, [19] conducted in Senegal a comparison with echo-sounder surveys and our estimated showed an accuracy of meters (RMSE is varying between 2 and 5 m, i.e., 10-20%).
The implemented method allow to represent different types coasts varying from shallow depth to deep depth as shown in Figure 6 for the three different coasts in Morocco; Djerba island, Tunisia; and Egypt. In these cases the maximum depth, or deep water limit is reached around 45 m.

Conclusions and Way Forward
By positioning itself on a global coverage, S2SHORES covers different coastal zone environments around the world with a sensor resolution of 10 m (Sentinel 2A/B) over a variable band from the shore and down to a depth limit of about 50 m [19]. The approach can be extended to other missions (Pleiades, World-View, SPOT6, etc.) of higher resolution for high precision need at hot-spots [20]. Sentinel 2A/B assets are the free data and the long-term nature of the Sentinel 2A/B program. Indeed, Sentinel-2A and B cover the periods 2015-2022 and 2017-2024, respectively, and their successors Sentinel-2 C and D to ensure the continuity of the mission are already scheduled for the next seven years. We show here that the recent availability of new global high resolution products such as Sentinel 2A/B (ESA) COPERNICUS constellation, combined with the latest methodology of bathymetry retrieval, can be applied globally: offering a new vision with uniform method. The results presented here are a pathway toward new EOS coastal products. Besides the approach employed and the scores, the new possibilities are evident. There is a current strong and increasing demand for such global product, after the decades old supremacy of state-of-the-art global relatively coarse resolution and rather inappropriate at the coast. This for coastal engineering, management and planning, risk forecasting and mitigation abut also scientific advance. The way forward include the optimal use of the regular, and increasing, revisit of these earth observation satellite to monitor coastal changes under climate change.
Author Contributions: M.W.B. and G.T. did the analyses and wrote the manuscript; R.A., E.W.J.B. and C.J.D. participated to the ideas, revised the manuscript at different stages and collaborated on the ideas. All authors have read and agreed to the published version of the manuscript.
Funding: Several sources were used to achieve this work. We thank especially CNES and IRD.
Data Availability Statement: The S2Shores maps are available at CNES datalake. Please contact R.A. (rafael.almar@ird.fr) if you are looking for their use.