SynthWakeSAR: A Synthetic SAR Dataset for Deep Learning Classification of Ships at Sea

The classification of vessel types in SAR imagery is of crucial importance for maritime applications. However, the ability to use real SAR imagery for deep learning classification is limited, due to the general lack of such data and/or the labor-intensive nature of labeling them. Simulating SAR images can overcome these limitations, allowing the generation of an infinite number of datasets. In this contribution, we present a synthetic SAR imagery dataset with ship wakes, which comprises 46,080 images for ten different real vessel models. The variety of simulation parameters includes 16 ship heading directions, 6 ship velocities, 8 wind directions, 2 wind velocities, and 3 incidence angles. In addition, we extensively investigate the classification performance for noise-free, noisy, and denoised ship wake scenes. We utilize the standard AlexNet architecture and employ training from scratch. To achieve the best classification performance, we conduct Bayesian optimization to determine hyperparameters. Results demonstrate that the classifications of vessel types based on their SAR signatures are highly efficient, with maximum accuracies of 96.16%, 92.7%, and 93.59%, when training using noise-free, noisy, and denoised datasets, respectively. Thus, we conclude that the best strategy in practical applications should be to train convolutional neural networks on denoised SAR datasets. The results show that the versatility of the SAR simulator can open up new horizons in the application of machine learning to a variety of SAR platforms.


Introduction
Synthetic aperture radar (SAR) technologies have shown remarkable progress in recent years, and the availability of remotely sensed data of the sea surface is continuously growing. Several spaceborne SAR missions (e.g., COSMO-SkyMed, TerraSAR-X, NovaSAR-1, ICEYE) have introduced a new generation of satellites exploiting SAR to provide spatial resolutions that were previously unavailable. The corresponding SAR datasets are especially useful for analyzing ship wakes, not only because of the high level of detail available but also because of the lower satellite orbital altitude (e.g., in comparison to Sentinel-1), which decreases the range-to-velocity (R/V) ratio, one of the key factors in SAR image degradation.
In addition, the application of artificial intelligence and machine learning (deep learning in particular) has also reached a significant level of maturity, with many methods having been developed in the field of object detection, segmentation, and classification in remote sensing images [1,2]. The main benefit of using SAR images, compared with other remote sensing methods, is that they yield information for wide areas under challenging weather conditions, day or night. Accurate analysis of SAR imagery is important not only for the recognition of ships themselves, but also for the detection and characterization of their wakes. Although ships are made visible primarily by strong radar backscattering, they do not always appear in SAR images, especially in images with lower SAR resolution (e.g., Sentinel-1). Instead, the ship wake is the usual indicator of the ship's presence, while the imaged ship position is also often shifted to some extent with respect to the wake location as a result of the Doppler effect.
Ship wakes provide key information for the surveillance of maritime traffic (e.g., illegal commercial activities) and are also useful in classifying the characteristics of the wake-generating vessel and, hence, estimating the ship heading and velocity [3,4]. A detailed description of the SAR wake imaging mechanism is presented in [5-7]. In addition, the availability of automatic identification system (AIS) data enables the integration of such information for machine learning development, since it can constitute the ground truth for ship identification. The main issue is the limited availability of large amounts of both types of data, which are the primary inputs required for building reliable training datasets. The use of synthetic SAR imagery can fill this gap, providing a theoretically infinite set of images for multiple sea conditions, ship models, and SAR platforms. It is important to note that in this case, the a priori simulation parameters remove the need for AIS data. In addition, this avoids the laborious process of matching SAR images to the AIS data [8].
The earliest applications of deep learning for ship detection [9] and classification [10] in SAR images were proposed only a few years ago. Thereafter, the main efforts of the community have focused on the acquisition of real SAR datasets of ships, and a few such datasets have been presented [8,11-19]. However, it is important to note that most of these datasets were created for detection tasks (some include segmentation) and only some of them can be used for ship classification [8,12,13,18]. The first studies focusing on the application of deep learning for the detection of ship wakes in SAR were [20], where detected ship wakes were used for ship velocity estimation, and [6], where a real SAR dataset containing ship wakes was proposed. Recently, the concept of using simulated SAR images of marine vehicles with wake patterns for deep learning applications was also mentioned [21].
The main objective of our paper is to draw the attention of the research community to the benefits of using synthetic SAR datasets for classification and detection tasks. The wake system represents a unique signature for each individual ship. Nevertheless, attempting the acquisition of all possible real SAR image variants for each ship would be a gargantuan task, as many factors must be taken into account, such as different ship velocities and different sea states. The use of an available and versatile SAR image simulator [7,22] allows the generation of an unlimited number of different scenarios, overcoming these limitations. Thus, in our work, for the first time, we present and make openly available a synthetic dataset of SAR images containing ship wakes for classification purposes. It includes 46,080 SAR images for ten different ship models. We also analyze for the first time the best algorithm training strategy, by comparing the alternatives of using noise-free, noisy, and denoised images for the ship identification task.
The communication is organized as follows: Section 2 presents the SAR imagery modeling details and structure of the dataset and then describes the parameters of the deep learning network that we employed. In Section 3, the classification results and comparisons between different training strategies are discussed. A conclusion, with future work directions and applications, is outlined in Section 4.

Materials and Methods
A complete description of the SAR imagery simulation methodology with all the relevant mathematical details is available in [7], with the corresponding open-source package (MATLAB) available via the University of Bristol Research Data Repository [22].

Ship Wake Modeling in SAR Imagery
A SAR image of a ship wake is built from two parts: wind- and ship-generated wave components. They form the complete surface elevation model through their superposition, $Z = Z_{\mathrm{sea}} + Z_{\mathrm{ship}}$. The first part, $Z_{\mathrm{sea}}$, is modeled using the linear theory of surface waves as a summation of many independent harmonic waves with Rayleigh-distributed amplitude $A$. The amplitude is based on the sea wave spectrum $S(k)$ and the directional spreading function $D(k, \theta)$. In this work, we used the JONSWAP spectrum $S_J$ [23] with fetch size $F = 20$ km and the cosine-type spreading function of Longuet-Higgins et al., $D_{LH}$ [24], with parameter $S = 8$. We utilized two wind velocities, $V_{w1} = 3$ m/s and $V_{w2} = 6$ m/s. This choice follows from the fact that the Kelvin wake is best observed in SAR images for a calm sea state ($V_w \le 3$ m/s), while cusp waves can still be observed at relatively high wind velocities (6-10 m/s) [25]. We also selected eight different wind directions.
The second component of the SAR image, $Z_{\mathrm{ship}}$, is modeled as a Kelvin wake and is based on the Michell thin ship theory with its further approximated form of fluid velocity potential described in [4,7]. Based on freely available information at www.marinetraffic.com, we selected ten real ships (cargo, tanker, passenger vessel, high speed craft, fishing vessel) and modeled them using the parameters shown in Table 1. Similar to the approach taken for wind velocity, to account for factors influencing wake visualization, we limited the minimum ship velocity to $V_{s1} = 5$ m/s for all ship models. This is because a higher ship velocity produces a stronger wake signature in the radar backscatter. In addition, in [26], it was shown that faster ships are more easily detectable in SAR images. In order to provide balanced training samples for each ship, we interpolated the ship velocities in equal steps between the minimum velocity $V_{s1}$ and the maximum velocity $V_{s6}$ (unique for each ship), providing six velocities per class (Table 1).
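The equal interpolation of ship velocities described above can be sketched in a few lines of Python. This is only an illustration: the function name `velocity_grid` and the two maximum velocities are hypothetical placeholders (the real maxima are listed in Table 1), and only the minimum of 5 m/s and the count of six velocities come from the text.

```python
def velocity_grid(v_min, v_max, n=6):
    """Return n equally spaced velocities from v_min to v_max (both included)."""
    if n < 2:
        raise ValueError("need at least two velocities")
    step = (v_max - v_min) / (n - 1)
    return [v_min + i * step for i in range(n)]


V_S1 = 5.0  # m/s, common minimum ship velocity stated in the text

# Hypothetical maximum velocities for two classes; the real values are in Table 1.
hypothetical_v_max = {"ship_A": 10.0, "ship_B": 15.0}

# One six-element velocity grid per ship class.
grids = {name: velocity_grid(V_S1, v_max)
         for name, v_max in hypothetical_v_max.items()}

for name, grid in grids.items():
    print(name, grid)
```

Because every class starts at the same minimum but ends at its own maximum, the step size differs between classes, which is what spreads the velocities of different ships apart.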
Interpolating between each ship's own minimum and maximum velocity also ensures a greater difference between the velocities of different ships and, as a result, gives a greater diversity of wake signatures across the data. One of the most significant parameters influencing the SAR imaging of ship wakes is the ship heading direction relative to the SAR platform flight direction. Indeed, depending on the ship's heading, waves of the Kelvin system may or may not be observable in the SAR image. Therefore, we used a considerable number of ship heading directions (16) to create a wider range of realistic SAR images of ship wakes.
SAR images were simulated as normalized radar cross-section (NRCS) images, with tilt and hydrodynamic modulations and velocity bunching. The scene size of 0.96 × 0.96 km was chosen to include enough wake detail for all modeled vessels and because it is a convenient input size for the deep convolutional neural network (CNN). The simulation parameters are similar to those of the TerraSAR-X platform. Finally, all SAR images are scaled to the same intensity range by nonlinear normalization [10].
An integral part of a real SAR image is speckle noise, which can significantly suppress the wake details (Figure 1a,b). If we consider real SAR images as a basis for forming the training dataset, the question is whether it is more beneficial to use (i) noisy images for training and then noisy images as input to the classification, or (ii) denoised images for training and again denoised images for classification. Although we do not use real SAR images in this study, this issue is very important, because synthetic data can potentially be used as a training dataset for classification tasks in real SAR images. To answer this question and to determine the best strategy for network training, we prepared three datasets: (i) noise-free images $I$, (ii) noisy images $I_n$, and (iii) denoised images $I_d$.
The three datasets contain the same scenes and differ only in the noise component (absent, present, or filtered). For simplicity and without loss of generality, we chose to employ a K-distributed intensity speckle model [7]. Finally, because it is time-consuming to apply advanced denoisers (e.g., BM3D or Bayesian filters [27,28]) to large datasets, for illustration purposes we utilized a simple median filter of size 5 × 5. An example of simulated SAR images is presented in Figure 1.
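The relationship between the noise-free, noisy, and denoised versions of one scene can be sketched as follows. This is a minimal standard-library illustration, not the simulator's implementation: it assumes the common construction of K-distributed intensity as a product of two unit-mean gamma variates (speckle and texture), and the parameters `looks` and `texture_order`, the replicated-edge handling, and the flat test image are all assumptions chosen for the example.

```python
import random
import statistics


def k_speckle(image, looks=1.0, texture_order=4.0, seed=0):
    """Multiply a noise-free intensity image by unit-mean K-distributed noise.

    The noise is the product of a gamma speckle term and a gamma texture term,
    each with mean 1 (assumed parameter values, not taken from the paper).
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        noisy_row = []
        for value in row:
            speckle = rng.gammavariate(looks, 1.0 / looks)
            texture = rng.gammavariate(texture_order, 1.0 / texture_order)
            noisy_row.append(value * speckle * texture)
        noisy.append(noisy_row)
    return noisy


def median_filter(image, size=5):
    """Apply a size x size median filter, replicating pixels at the borders."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        filtered_row = []
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            filtered_row.append(statistics.median(window))
        filtered.append(filtered_row)
    return filtered


clean = [[1.0] * 16 for _ in range(16)]   # stand-in for a noise-free scene I
noisy = k_speckle(clean)                  # counterpart of I_n
denoised = median_filter(noisy, size=5)   # counterpart of I_d
```

All three arrays describe the same scene, so any index that selects an image in one dataset selects its counterpart in the other two.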

Dataset Structure
The structure of the dataset is illustrated schematically in Figure 2. The number of synthetic SAR images per class follows from the combination of simulated parameters: 6 ship velocities $V_s$ × 16 ship heading directions $D_s$ × 2 wind velocities $V_w$ × 8 wind directions $D_w$ × 1 polarization (HH) × 3 incidence angles $\theta_r$. Thus, each class contains 4608 images, and the overall number of images in the dataset for 10 classes is 46,080 (10 ship models, two for each of the five ship categories; Table 1).
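The image counts can be checked by enumerating the parameter grid directly. In this short sketch the parameter values are replaced by index ranges, since only the number of values per parameter matters for the count; the variable names are illustrative.

```python
from itertools import product

N_SHIP_VELOCITIES = 6
N_SHIP_HEADINGS = 16
N_WIND_VELOCITIES = 2
N_WIND_DIRECTIONS = 8
N_POLARIZATIONS = 1      # HH only
N_INCIDENCE_ANGLES = 3
N_CLASSES = 10           # ten ship models

# Every combination of ship velocity, heading, wind velocity, and wind direction
# for a single incidence angle (the grid shown in Figure 2).
per_angle = list(product(range(N_SHIP_VELOCITIES), range(N_SHIP_HEADINGS),
                         range(N_WIND_VELOCITIES), range(N_WIND_DIRECTIONS)))

images_per_class = len(per_angle) * N_POLARIZATIONS * N_INCIDENCE_ANGLES
total_images = images_per_class * N_CLASSES

print(len(per_angle), images_per_class, total_images)  # 1536 4608 46080
```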

Figure 2.
The structure of the synthetic SAR dataset for each class (for a single incidence angle $\theta_r$) with a cross combination of ship velocities $V_s$, ship heading directions $D_s$, wind velocities $V_w$, and wind directions $D_w$, with 1536 combinations overall. The values are given in Table 1 and in the text.

CNN Architecture
To evaluate the proposed dataset, we employ one of the most well-known neural network architectures, AlexNet [29]. The network comprises eight learned layers, of which the first five are convolutional and the last three are fully connected. We modified two parameters in this network, as shown in Figure 3: we used 1 image channel instead of 3 for the input imagery, and the final layer was updated for 10 output classes instead of 1000. We also resized all images by interpolation to 227 × 227 pixels. In contrast to the large majority of studies, which use a pre-trained AlexNet (transfer learning), we employed the untrained AlexNet architecture (learning from scratch). As the untrained network does not include optimized weights and biases, the hyperparameters must be determined prior to training. Tuning these hyperparameters is a difficult and time-consuming task. The optimal combination of hyperparameters was derived via Bayesian optimization by maximizing the validation accuracy. We specified a range of values for each hyperparameter for all datasets ($I$, $I_n$, $I_d$), and 30 trials per dataset were evaluated. Table 2 provides the initial range of values and the estimated optimal values for each dataset. For all calculations, we used the stochastic gradient descent with momentum (SGDM) optimizer, a batch size of 256, a maximum of 50 epochs, and a network validation frequency of 108. Additionally, to prevent overfitting, data augmentation was performed as follows: a random translation within the range [−4, 4] pixels along the X and Y axes, and a random rotation within the range [−5, 5] degrees. Three trained networks are presented in this study, corresponding to the noise-free dataset $I$ (I-Net), the noisy dataset $I_n$ ($I_n$-Net), and the denoised dataset $I_d$ ($I_d$-Net).
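To see how a 227 × 227 single-channel input propagates through the five convolutional layers, the feature-map sizes can be traced with simple arithmetic. The sketch below assumes the standard AlexNet kernel sizes, strides, paddings, and filter counts from [29]; the exact settings of Figure 3 are not restated in the text, so the layer table here is an assumption rather than the authors' configuration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, output channels or None for pooling)
# Assumed standard AlexNet settings.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def trace_shapes(input_size=227, in_channels=1):
    """Return (name, spatial size, channels) after every layer."""
    size, channels = input_size, in_channels
    shapes = []
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, channels))
    return shapes


shapes = trace_shapes()
_, final_size, final_channels = shapes[-1]
flattened_features = final_size * final_size * final_channels

for name, size, channels in shapes:
    print(f"{name}: {size} x {size} x {channels}")
print("features into first fully connected layer:", flattened_features)
```

Under these assumptions, changing the input from 3 channels to 1 only affects the weights of the first convolution, and changing the output from 1000 to 10 classes only affects the last fully connected layer; the intermediate feature-map sizes are unchanged.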

Results and Discussion
The proposed dataset was analyzed in two respects: (i) the performance in classifying ship types based on their SAR image signatures, and (ii) the best classification strategy in terms of using noise-free, noisy, or denoised training datasets.
All datasets were randomly partitioned into a training set (60%), validation set (20%), and test set (20%). It is important to note that, in order to cross-validate different datasets, the same scenes were assigned to the training, validation, and test sets in all datasets ($I$, $I_n$, $I_d$). For example, this allows the network trained on the noise-free dataset $I$ (I-Net) to be evaluated, in terms of classification accuracy, by substituting the noise-free test set with the corresponding test sets from the noisy $I_n$ and denoised $I_d$ datasets. Let us start with the overall comparison of the trained networks and their performance per class. Figure 4 shows the confusion matrices calculated for all trained networks applied to their matching datasets (I-Net: $I$, $I_d$-Net: $I_d$, $I_n$-Net: $I_n$) for the test sets. The accuracies follow the intuition that "less noise leads to better performance" (I-Net: 96.16%, $I_d$-Net: 93.59%, $I_n$-Net: 92.7%).
Table 3 summarizes the classification accuracy results for the different trained networks. Evaluations were only carried out for combinations potentially applicable to real SAR images. This is because real radar images always include speckle noise, so that, for example, the use of networks trained on the noisy $I_n$ and denoised $I_d$ datasets ($I_n$-Net and $I_d$-Net) for ship identification in the noise-free dataset $I$ is irrelevant. In this sense, the estimation of the accuracy of I-Net on dataset $I$ also seems redundant, but we present it for an overall comparison of the triad I-Net, $I_n$-Net, and $I_d$-Net. In summary, the following strategies were investigated: (i) the noise-free-trained network I-Net evaluated with the noise-free, noisy, and denoised datasets; (ii) the $I_n$-Net and $I_d$-Net networks applied to the noisy and denoised datasets.
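One simple way to obtain a partition that is identical across the three datasets is to shuffle image indices once and reuse the resulting index lists for $I$, $I_n$, and $I_d$. The sketch below is an assumption about how such a split could be implemented, not the authors' procedure; the function name `shared_split` and the seed are hypothetical.

```python
import random


def shared_split(n_images, train=0.6, val=0.2, seed=0):
    """Shuffle indices once and cut them into train/validation/test lists."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = round(n_images * train)
    n_val = round(n_images * val)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


# The same three index lists would be applied to I, I_n, and I_d alike.
train_idx, val_idx, test_idx = shared_split(46080)

BATCH_SIZE = 256
iterations_per_epoch = len(train_idx) // BATCH_SIZE

print(len(train_idx), len(val_idx), len(test_idx), iterations_per_epoch)
```

With 27,648 training images and a batch size of 256, one epoch takes exactly 108 iterations. If the validation frequency of 108 quoted for training is counted in iterations (an assumption, since the unit is not stated), validation would run once per epoch.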
Interestingly, the maximum accuracy was achieved for $I_d$-Net with the $I_d$ dataset, but only on the training set (99.16%). The minimum accuracy of 48.79% occurred when $I_d$-Net was used on the $I_n$ dataset (test set), which confirms the significant influence of noise on the classification process. However, among the cross-dataset cases, I-Net gives the better accuracy when applied to $I_n$ and $I_d$ (75.9% and 73.18%), compared with $I_n$-Net applied to $I_d$ (69.93%) and $I_d$-Net applied to $I_n$ (48.79%). Furthermore, in terms of potential applicability to real SAR images, the best accuracy was achieved for $I_d$-Net with the $I_d$ dataset (93.59%). In practice, however, this could also depend on the denoising method, since only simple median filtering was employed here, as previously mentioned (Section 2.1). From this perspective, training directly on a noisy dataset can be considered an alternative approach, since $I_n$-Net with the $I_n$ dataset also achieved a good accuracy of 92.7%. Hence, one can conclude that the two strategies that can be applied when using our synthetic SAR dataset of ship wakes are to train on either (i) the denoised $I_d$ dataset, or (ii) the noisy $I_n$ dataset. The latter has the advantage of reducing the additional image processing time (by excluding denoising). This is possible due to the generation of a large number of synthetic radar images using multiple simulation scenarios.
For visualization purposes, Figure 5 also shows 25 randomly selected test images ($I_n$-Net: $I_n$) with predicted classes and predicted probabilities of these classes. It is readily noticeable that images with less distinguishable ship wake details are less accurately classified.

Conclusions
Synthetic aperture radar has been used for over fifty years to image waves on the ocean's surface. The many theoretical developments achieved in the hydrodynamic modeling of the sea surface and the effects on SAR image formation now allow the generation of very realistic synthetic SAR datasets. This can enable the use of machine learning in the classification of vessels. In this study, we introduced and analyzed the first such dataset to help overcome the well-known limitation of the lack of a sufficient number of labeled real SAR images with ship wakes for deep learning classification. This work addressed two aspects: (i) classification of ship types on the basis of their wake signatures in synthetic SAR images, and (ii) analysis of the classification strategies in terms of using noise-free, noisy, and denoised datasets. In contrast to the usual practice of using pre-trained networks, we employed the untrained AlexNet CNN architecture and performed training from scratch. We demonstrated that, even with a small number of epochs (50), the networks reached a high level of accuracy: 98.68%, 97.82%, and 99.16% on the training sets and 96.16%, 92.7%, and 93.59% on the test sets (noise-free, noisy, and denoised datasets, respectively).
One should keep in mind that the ship velocity affects the amplitude of the wakes and, consequently, their visualization in the radar image. The same applies to wind velocity, but the general principle is that a bigger amplitude for wakes and smaller for ambient sea waves is better for wake visualization. This creates uncertainties in the choice of ship velocity for simulation, as for the same velocity and constant amplitude of ambient sea waves, one ship's wake will not be visible while another will. This means that the training dataset may contain images where only the sea waves are represented, which may have an impact on classification accuracy. However, it also applies to the concept of a 'boundary condition' [7], where due to similar size (wavelength) and amplitude of the sea and ship waves, wake signatures can disappear or be less noticeable in the SAR image. This question should therefore be explored further, bearing in mind that with the increase in the number of ship models, the problem becomes more complicated. Another major issue that should be studied is the impact that similar wake signatures, corresponding however to different vessels, have on classification accuracy. Finally, and perhaps most importantly, the application to classify ships in real data should be investigated, either by direct use of the presented trained networks or after some form of transfer learning.
To summarize, we highlight that there are a number of advantages to using synthetic SAR datasets for classifying vessels. Since simulations allow the generation of the necessary amount of data, they solve the imbalanced-data problems often experienced with real data that have a skewed class distribution. Automation also means that synthetic data generation is much faster than the usual manual processing of real SAR images. Furthermore, the use of known parameters for simulations can replace AIS data, which also avoids the typical, laborious process of integrating AIS data with SAR images. Ultimately, the versatility of our SAR simulator allows the building of datasets corresponding to different SAR platforms.