Hyperspectral Marine Oil Spill Monitoring Using a Dual-Branch Spatial–Spectral Fusion Model

Abstract: Marine oil spills are a major concern in marine environmental monitoring, and optical remote sensing serves as a vital means for marine oil spill detection. However, optical remote sensing imagery is susceptible to interference from sunglints and shadows, leading to diminished spectral differences between oil films and seawater. This makes it challenging to accurately extract the boundaries of oil–water interfaces. To address these issues, this paper proposes a model based on the graph convolutional architecture and spatial–spectral information fusion for oil spill detection in real oil spill incidents. The model is experimentally evaluated using both spaceborne and airborne hyperspectral oil spill images. The results demonstrate that the developed model achieves higher oil spill detection accuracy than the Graph Convolutional Network (GCN) and the CNN-Enhanced Graph Convolutional Network (CEGCN) across two hyperspectral datasets collected from the Bohai Sea. Moreover, the developed model still achieves the best oil spill detection performance of the three methods with only 1% of the training samples. Similar conclusions are drawn from the oil spill hyperspectral data collected from the Yellow Sea. These results validate the efficacy and robustness of the proposed model for marine oil spill detection.


Introduction
Marine oil spills are unforeseen events that arise from accidents or operational errors during petroleum exploration, development, and transportation, constituting the most emblematic and severe instances of marine environmental contamination. These spills pose significant threats to the marine environment, marine organisms, and human economic activities. Harmful substances can be transferred through the food chain, leading to long-term challenges in mitigating the resulting impacts [1][2][3][4][5]. Under suitable conditions, oil spills can also trigger harmful algal blooms, giving rise to more extensive negative ecological effects [6].
With the increasing demand for petroleum energy in China, offshore petroleum resource development, transportation, and storage activities are growing steadily, resulting in an escalating risk of offshore oil spills. As a major importer of crude oil, China relies on maritime transportation for 90% of its oil supply. In recent years, China has experienced frequent marine oil spill incidents. For instance, in 2010, an oil spill incident occurred as the result of an explosion in the Dalian Xingang oil pipeline. In 2011, there was an oil spill incident in the Penglai 19-3 oilfield [7]. Additionally, in 2013, an oil spill incident was caused by a leakage and explosion in the Sinopec pipeline at Huangdao [8,9]. The year 2018 witnessed an oil spill incident due to a collision involving the "SANCHI" tanker at the mouth of the Yangtze River [10,11]. Moreover, in 2019, an oil spill event
(2) For the extraction of spectral information, a residual graph convolution approach is designed, focusing on the spectral information of each layer and performing residual calculations to highlight the main spectral information; (3) To evaluate the effect of the algorithm proposed in this study, we compared the oil spill detection results obtained by the proposed method on spaceborne and airborne hyperspectral images with those of two other graph convolutional network models. The proposed algorithm achieves the best performance in marine oil spill detection.

Proposed Method
Due to the impact of sensor hardware, flight attitude, as well as clouds, wind, and waves, optical images of oil spills on the sea surface may suffer from problems such as bad lines, stripes, sunglints, and shadows, causing indistinct spectral differences between the oil film and the background seawater. To address these issues, this study constructs a model named Graph Convolutional-DS-UNET Neural Network (DUNET), which integrates dual-branch spatial and spectral information based on the graph convolutional architecture, for the detection of marine oil spills in real oil spill incidents. The DUNET model consists of three main modules: spectral feature extraction, spatial feature extraction, and spatial-spectral feature fusion. The overall framework of DUNET is illustrated in Figure 1. First, linear iterative clustering is used to construct superpixel maps through continuous iteration. Then, in the spectral branch, a residual Graph Convolutional Network (GCN) is used to extract spectral information. At the same time, in the spatial branch, feature enhancement modules are used to enhance the preprocessed images and remove redundancy, and a deep separable UNET network is used to learn deep and shallow spatial information. Finally, the spectral and spatial features of oil spills extracted from the two branches are fused, and the cross-entropy loss function is used for optimization, so as to classify the extracted hyperspectral oil spill features.
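As a rough illustration of the superpixel step, the sketch below assumes that a segmentation label map has already been produced by some clustering routine and shows how superpixels could be turned into graph nodes. The function names `superpixel_nodes` and `superpixel_neighbors`, the use of mean spectra as node features, and the 4-connected neighbor rule are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np


def superpixel_nodes(image, segments):
    """Mean spectrum per superpixel.

    image: (H, W, B) hyperspectral cube; segments: (H, W) integer labels
    assumed to run contiguously from 0 to K-1 (every label is used).
    Returns a (K, B) array of node features.
    """
    bands = image.shape[-1]
    flat_seg = segments.ravel()
    k = int(flat_seg.max()) + 1
    sums = np.zeros((k, bands))
    np.add.at(sums, flat_seg, image.reshape(-1, bands))  # accumulate spectra per label
    counts = np.bincount(flat_seg, minlength=k)
    return sums / counts[:, None]


def superpixel_neighbors(segments):
    """Boolean (K, K) mask that is True where two superpixels share a pixel border."""
    k = int(segments.max()) + 1
    adj = np.zeros((k, k), dtype=bool)
    # Compare horizontally and vertically adjacent pixels (4-connectivity).
    pairs = (
        (segments[:, :-1], segments[:, 1:]),
        (segments[:-1, :], segments[1:, :]),
    )
    for left, right in pairs:
        differ = left != right
        adj[left[differ], right[differ]] = True
        adj[right[differ], left[differ]] = True
    return adj
```

The two outputs correspond to the vertex features and the edge set of the superpixel graph that the spectral branch operates on.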

Residual GCN Spectral Feature Extraction Module
The spectral feature extraction module designed based on GCN uses the spatial and spectral correlation between adjacent superpixel blocks to obtain the one-to-many relationship of features in space by constructing a superpixel map. Assume an undirected graph $G = (V, E)$, where $V$ and $E$ represent the vertex set and edge set, respectively. We can then define the adjacency matrix $A$, which describes the relationship between each pair of vertices in the graph. Each element of the adjacency matrix can usually be represented by Formula (1), where $\gamma$ is the parameter that controls the width of the Radial Basis Function, and the vectors $x_i$ and $x_j$ represent the spectral characteristics related to vertices $v_i$ and $v_j$. After $A$ is calculated, the corresponding Laplacian matrix $L$ can be solved, as shown in Formula (2),
where $I_N$ represents the identity matrix, and $D$ represents the diagonal degree matrix of the adjacency matrix $A$. More robust graph structure data can be obtained by normalizing the Laplacian matrix. After a series of matrix transformations, the expression for GCN is obtained (Formula (3)), where $H^{(l)}$ represents the output of layer $l$, $\sigma(\cdot)$ represents the activation function, and $W$ represents the weight.
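The display formulas referenced in this subsection are not reproduced in the present text. A plausible reconstruction, assuming the standard GCN formulation commonly used for superpixel graphs, is sketched below. The symbol $\gamma$, the neighborhood set $N(\cdot)$, and the self-loop terms $\tilde{A}$ and $\tilde{D}$ are assumptions rather than the authors' confirmed notation.

```latex
% Formula (1): RBF-weighted adjacency between neighbouring superpixels
A_{ij} =
\begin{cases}
\exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right), & x_i \in N(x_j) \ \text{or} \ x_j \in N(x_i) \\
0, & \text{otherwise}
\end{cases}

% Formula (2): graph Laplacian and its symmetric normalized form
L = D - A, \qquad
L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2} = I_N - D^{-1/2} A D^{-1/2}, \qquad
D_{ii} = \sum_{j} A_{ij}

% Formula (3): layer-wise GCN propagation with self-loops
H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right), \qquad
\tilde{A} = A + I_N, \quad \tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}
```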
In order to address the problems of vanishing gradients and overfitting in GCN, a residual GCN is proposed based on the ResNet network structure; its expression is given in Formula (4). The spectral feature information extracted by the residual GCN is denoted $H_{spectral}$.
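To make the spectral branch more concrete, the following is a minimal NumPy sketch of one residual graph convolution layer. It assumes an RBF-weighted adjacency, the usual self-loop normalization, a ReLU activation, and an identity skip connection of the form $H^{(l+1)} = \sigma(\hat{A} H^{(l)} W^{(l)}) + H^{(l)}$. The function names, the value `gamma=0.2`, and the exact placement of the skip connection are illustrative assumptions and may differ from the authors' implementation.

```python
import numpy as np


def rbf_adjacency(x, neighbors, gamma=0.2):
    """RBF-weighted adjacency for superpixel nodes.

    x: (N, B) node spectra; neighbors: (N, N) boolean neighbor mask.
    gamma is an illustrative width parameter, not a value from the paper.
    """
    mask = np.asarray(neighbors, dtype=bool)
    mask = mask | mask.T               # keep an edge if either node lists the other
    np.fill_diagonal(mask, False)      # self-loops are added during normalization
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist) * mask


def normalize_adjacency(a):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_tilde = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def residual_gcn_layer(h, a_hat, w):
    """One graph convolution followed by an identity skip connection.

    h: (N, F) node features; a_hat: (N, N) normalized adjacency;
    w: (F, F) weights (square, so the skip connection shapes match).
    """
    propagated = np.maximum(a_hat @ h @ w, 0.0)  # ReLU activation
    return propagated + h
```

The skip connection lets each layer pass its input through unchanged, which is the usual motivation for residual designs when stacking several graph convolutions.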

Deep Separable U-Net Spatial Feature Extraction Module
Common network structures usually use pooling layers to reduce the image size and thereby enlarge the receptive field. However, some information is inevitably lost in this process. The advantage of dilated convolution (also called expansion convolution) is that it can expand the receptive field while avoiding information loss. It retains the internal data structure and avoids the use of downsampling. It can also capture multi-scale context information by setting different dilation rates.
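The effect of the dilation rate on the receptive field can be checked with a few lines of arithmetic. The sketch below uses the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels, and assumes stride-1 layers stacked without pooling. The dilation rates 1, 2, and 4 are illustrative and are not taken from the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the given dilations."""
    field = 1
    for dilation in dilations:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Three plain 3 x 3 layers versus three dilated 3 x 3 layers (rates 1, 2, 4).
plain = receptive_field(3, [1, 1, 1])
dilated = receptive_field(3, [1, 2, 4])
```

With the same number of weights per layer, the dilated stack covers a 15-pixel window instead of 7, without any downsampling.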
Depthwise Convolution and Pointwise Convolution can be combined to extract feature information from the Euclidean space of oil films. The number of feature maps after the Depthwise Convolution is the same as the number of channels in the input layer, so the number of feature maps cannot be expanded. Moreover, it performs convolution operations independently on each channel in the input layer, and therefore cannot effectively utilize the feature information of different channels at the same spatial position. Pointwise Convolution is thus needed to combine these feature maps to generate new feature maps. The operation of Pointwise Convolution is very similar to conventional convolution, with a convolution kernel size of 1 × 1 × M, where M is the number of channels in the previous layer. The convolution operation here therefore weights and combines the previous feature maps in the depth direction to generate new feature maps.
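The two-step structure described above can be sketched in NumPy as follows. This is a simplified illustration (no padding, stride 1, no bias), and the function names are hypothetical. `param_counts` compares the weight count of a standard k × k convolution with that of the depthwise-plus-pointwise pair, which is the standard argument for why this factorization is lighter.

```python
import numpy as np


def depthwise_conv(x, kernels):
    """Per-channel 'valid' cross-correlation.

    x: (M, H, W) input; kernels: (M, k, k), one kernel per input channel.
    Returns (M, H - k + 1, W - k + 1): the channel count is unchanged.
    """
    m, height, width = x.shape
    k = kernels.shape[-1]
    out_h, out_w = height - k + 1, width - k + 1
    out = np.zeros((m, out_h, out_w))
    for i in range(k):
        for j in range(k):
            out += kernels[:, i, j][:, None, None] * x[:, i:i + out_h, j:j + out_w]
    return out


def pointwise_conv(x, weights):
    """1 x 1 convolution mixing channels: x (M, H, W), weights (N, M) -> (N, H, W)."""
    return np.tensordot(weights, x, axes=([1], [0]))


def param_counts(k, m, n):
    """Weights in a standard k x k conv versus depthwise + pointwise (M in, N out)."""
    standard = k * k * m * n
    separable = k * k * m + m * n
    return standard, separable
```

For example, with a 3 × 3 kernel, 32 input channels, and 64 output channels, `param_counts(3, 32, 64)` gives 18432 weights for the standard convolution against 2336 for the separable version.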
In this paper, DS-UNET is designed by improving the U-Net network to make it more suitable for detecting marine oil spills. Deep separable convolutions are introduced at the input and output ends, and dilated convolutions are introduced in the convolutional layers to make the network lightweight and reduce computational costs. The encoding module consists of two deep separable convolutional modules, two 3 × 3 convolutional modules, and one 2 × 2 max pooling module. The decoding module is composed of one upsampling convolutional layer, feature concatenation, two 3 × 3 convolutional modules, and two deep separable convolutional modules. Among them, deep separable convolution reduces the computational cost and the number of parameters. The structure of the deep separable U-Net network is shown in Figure 2. The decoder outputs the deep and shallow spatial features extracted by the encoder, which are concatenated to fully extract the spatial information $H_{spatial}$ of the oil spill hyperspectral image.

Feature Fusion
The features obtained from the two branches are fused by feature concatenation, namely the spectral information obtained by the residual GCN spectral feature extraction module and the deep and shallow spatial information obtained by the deep separable U-Net spatial feature extraction module. Finally, the extracted oil film feature $H$ (Formula (5)) is output. The feature vectors are input into the softmax function, and the final oil spill detection result is obtained by optimizing the cross-entropy loss function, as shown in Formula (6),
where $Y$ represents the true value, $P$ represents the predicted value of each pixel, and $P_{ic}$ represents the probability that pixel $i$ belongs to class $c$, which is calculated using the softmax function. $C$ and $N$ represent the total number of categories and the number of samples in the training dataset, respectively.
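The fusion and loss steps can be sketched as follows, assuming that Formula (5) is a channel-wise concatenation of the two branch outputs and that Formula (6) takes the usual cross-entropy form $-\sum_{i=1}^{N}\sum_{c=1}^{C} Y_{ic} \log P_{ic}$. The linear classifier weights `w` and `b`, the function names, and the unnormalized sum are assumptions for illustration, since the text does not specify these details.

```python
import numpy as np


def fuse(h_spectral, h_spatial):
    """Concatenate per-pixel spectral and spatial features along the feature axis."""
    return np.concatenate([h_spectral, h_spatial], axis=-1)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def classify(h, w, b):
    """Hypothetical linear classifier on the fused features, followed by softmax."""
    return softmax(h @ w + b)


def cross_entropy(y_onehot, p, eps=1e-12):
    """Sum of -Y_ic * log(P_ic) over N samples and C classes."""
    return -np.sum(y_onehot * np.log(p + eps))
```

Here `y_onehot` has shape (N, C) with a single 1 per row, and `p` is the matching (N, C) matrix of class probabilities.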

Data
The Bohai Sea, China's only inland sea, covers an area of 78,000 square kilometers, accounting for only 2.6% of China's marine territory. It is only weakly influenced by ocean currents and has a poor water exchange capacity, while facing complex, high-intensity human activities from multiple sources. As a result, pollutants are difficult to purify rapidly, making it a key area for monitoring pollutant emissions.
In this study, airborne hyperspectral imagery and spaceborne hyperspectral imagery obtained from two marine oil spill incidents in the Bohai Sea were selected as the data sources.

Hyperion Spaceborne Hyperspectral Data
The spaceborne hyperspectral imagery used in this study was Hyperion data acquired on 6 May 2007, in the Liaodong Bay of China (shown by the red pentagram in Figure 3). Hyperion is one of the three sensors mounted on the EO-1 satellite and is the first spaceborne civilian imaging spectrometer. The Hyperion imagery covers a spectral range of 400~2500 nm, spanning spectra from visible light to shortwave infrared, with a total of 242 spectral bands. The spectral resolution is 10 nm, and the spatial resolution is 30 m. Due to limitations in radiometric calibration and significant influence from water vapor and signal-to-noise ratio, only 175 bands (426~926 nm, 933~1346 nm, 1427~1810 nm, 1942~2385 nm) are actually usable in this study. The Hyperion imagery (Figure 4a) used in our experiments has dimensions of 444 × 400 pixels and contains information on oil slicks, seawater, and vessel tracks. However, the imagery also suffers from stripe noise and bad lines, severely affecting oil spill detection.

AISA+ Airborne Hyperspectral Data
Unmanned aerial vehicles offer advantages such as high flexibility and strong real-time capabilities, compensating for the drawbacks of satellite remote sensing, such as lag and low accuracy. The airborne hyperspectral imagery used in this study was AISA+ data acquired by the China Marine Surveillance North Sea Aviation detachment on 23 August 2011, in the Penglai 19-3 oilfield (shown by the orange pentagram in Figure 3). The AISA+ imagery covers a spectral range of 400~970 nm, spanning the spectra from visible light to near-infrared, with a total of 258 spectral bands. The spectral resolution is 5 nm, and the sensor's field of view is 39.7°, with a spatial resolution of 1.41 m at a flying altitude of 1000 m. The AISA+ imagery (Figure 4b) used for oil spill detection in the Penglai 19-3 oilfield has dimensions of 444 × 364 pixels and was acquired at an altitude of approximately 700 m, giving a spatial resolution of approximately 0.99 m. The imagery contains information on oil slicks, seawater, platforms, and ships. However, it is also affected by sunglints that can impact oil spill detection.
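The two spatial resolutions quoted above are consistent with each other under a simple assumption: for a fixed sensor and field of view looking at nadir, the ground pixel size scales linearly with flying altitude. The short check below uses a hypothetical helper, `scaled_resolution`, to reproduce the 0.99 m figure from the 1.41 m reference value.

```python
def scaled_resolution(ref_resolution_m, ref_altitude_m, altitude_m):
    """Ground pixel size at a new altitude, assuming linear scaling with altitude."""
    return ref_resolution_m * altitude_m / ref_altitude_m


# 1.41 m at 1000 m scaled to the approximate 700 m acquisition altitude.
resolution_at_700_m = scaled_resolution(1.41, 1000.0, 700.0)
```

The result is about 0.987 m, which rounds to the approximately 0.99 m stated for the Penglai 19-3 acquisition.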

Ground Truth Data
The ground truth image of oil spill distribution for the Liaodong Bay incident (Figure 4c) was produced through manual visual interpretation based on the acquired hyperspectral images and expert knowledge. The ground truth image for the Penglai 19-3 oilfield incident (Figure 4d) was generated through manual visual interpretation based on a combination of on-site aerial photographs, hyperspectral images, and expert knowledge. Aerial photographs were taken using cameras mounted on Chinese Marine Surveillance aircraft, while the hyperspectral imagery was obtained synchronously using the AISA+ imaging spectrometer installed on the same aircraft. The ground truth image for the Liaodong Bay oil spill incident includes oil slicks, seawater, and background, while the ground truth image for the Penglai 19-3 oilfield incident includes oil slicks, seawater, platforms, and ships.

Experimental Setup
For each dataset, we randomly selected 5% of the samples for training, 5% for validation, and the remaining 90% for testing. All experiments were performed on an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory (NVIDIA, Santa Clara, CA, USA). Table 1 lists the number of training, validation, and testing samples for the two datasets.
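A random 5%/5%/90% split of labeled pixels could be implemented along the following lines. The sketch assumes that sampling is done separately within each class so that small classes such as oil film are represented in every subset; the text only states that samples were selected randomly, so the per-class strategy, the function name `split_indices`, and the fixed seed are assumptions.

```python
import numpy as np


def split_indices(labels, train_frac=0.05, val_frac=0.05, seed=0):
    """Randomly split labeled pixels into train/validation/test index arrays.

    labels: 1-D array of class labels for the labeled pixels.
    Sampling is done per class (an assumption, see the lead-in).
    """
    rng = np.random.default_rng(seed)
    train, val, test = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = max(1, int(round(train_frac * idx.size)))
        n_val = max(1, int(round(val_frac * idx.size)))
        train.append(idx[:n_train])
        val.append(idx[n_train:n_train + n_val])
        test.append(idx[n_train + n_val:])
    return np.concatenate(train), np.concatenate(val), np.concatenate(test)
```

The same function can produce the 1% and 3% training subsets used in the later experiments by changing `train_frac`.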

Experimental Results
The proposed method was applied to the two oil spill hyperspectral datasets, and two graph convolutional network models, including the GCN [64] and the CNN-Enhanced Graph Convolutional Network (CEGCN) [65], were selected for comparison. To ensure fairness in the comparative experiment, the three methods were evaluated using the same set of hyperparameters. The oil spill detection results of the proposed method and the other two algorithms on the two hyperspectral datasets are shown in Figure 5.
From Figure 5, it can be observed that the oil spill detection capabilities of the proposed algorithm and the other two algorithms differ. The DUNET model achieves the closest results to the ground truth images for both datasets, with clear oil film boundaries and minimal misclassification. The CEGCN model performs slightly worse, with some oil films misclassified as seawater. The GCN model exhibits the largest differences from the ground truth images, particularly in its lack of attention to boundaries and details, as indicated by the dashed circles in Figure 5a,b. The detection results demonstrate the effectiveness of the DUNET model in utilizing both spatial and spectral information through dual branches and feature fusion. To quantitatively compare the oil spill detection capabilities of the proposed model and the other two methods, four metrics were used to evaluate the accuracy of the oil spill detection results, namely detection accuracy, overall accuracy, average accuracy, and the Kappa coefficient.
Overall, the DUNET model outperforms the CEGCN and GCN models in terms of oil spill detection accuracy, overall accuracy, average accuracy, and Kappa coefficient on the two datasets (Table 2). Specifically, for the Hyperion dataset of Liaodong Bay, the DUNET model achieves an oil spill detection accuracy of 84.02%, which is 3.05% and 18.33% higher than that of the CEGCN and GCN models, respectively. The overall accuracy and average accuracy of the DUNET model are 99.17% and 94.42%, respectively, surpassing the CEGCN model by 0.62% and 1.05%, as well as the GCN model by 0.80% and 6.34%. As for the AISA+ dataset of the Penglai 19-3 oilfield, the DUNET model achieves an oil spill detection accuracy of 95.95%, an improvement of 2.57% and 6.93% compared to the CEGCN and GCN models, respectively. The overall accuracy and average accuracy are 96.50% and 94.80%, surpassing the CEGCN model by 1.98% and 1.82%, as well as the GCN model by 4.92% and 5.52%. The Kappa coefficients of the DUNET model on the two datasets are higher than 0.9 and better than those of the CEGCN and GCN models, indicating strong agreement between the classification results of the DUNET model and the ground truth images. In summary, the proposed DUNET model achieves the best performance in oil spill detection compared to the CEGCN and GCN models.
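For reference, the four metrics can be computed from a confusion matrix as sketched below. The definitions of overall accuracy, average accuracy, and the Kappa coefficient are the standard ones; treating "detection accuracy" as the per-class accuracy of the oil class is an assumption, since the text does not define it explicitly. Function names are illustrative.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def accuracy_metrics(cm):
    """Return per-class accuracy, overall accuracy, average accuracy, and Kappa."""
    cm = cm.astype(float)
    total = cm.sum()
    per_class = np.diag(cm) / cm.sum(axis=1)   # fraction of each true class recovered
    overall = np.trace(cm) / total
    average = per_class.mean()
    # Agreement expected by chance, from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return per_class, overall, average, kappa
```

Under the stated assumption, the oil spill detection accuracy would be the entry of `per_class` that corresponds to the oil film class.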

Impact of Different Proportions of Training Samples on Oil Spill Detection Performance
To further validate the robustness of the proposed model under different proportions of training samples, in this section, we randomly select 1%, 3%, and 5% of the training samples from the two datasets for model training. The oil spill detection results of the three methods based on different proportions of training samples are shown in Figure 6.
Overall, the three methods exhibit different classification performance under different proportions of training samples. For the two hyperspectral datasets, as the proportion of training samples increases, the oil spill detection accuracy of the three methods gradually improves. However, the proposed DUNET method consistently achieves the highest detection accuracy, even with only 1% training samples, outperforming both CEGCN and GCN in oil film detection. It can also be seen that, as the proportion of training samples increases, the oil spill detection accuracy of the spatial-spectral fusion method tends to stabilize, especially on the airborne AISA+ dataset. This could be attributed to the higher spectral and spatial resolution, as well as the larger dimensionality, of the AISA+ data. Furthermore, we compared the running times of the DUNET, CEGCN, and GCN models under different proportions of training samples, as shown in Table 3. For the same proportion of training samples, the GCN model exhibited the shortest training time, followed by the CEGCN model, while the DUNET model had a slightly longer training time. The GCN model also had the shortest test time, while the test times of the CEGCN and DUNET models were similar.

Application on Oil Spill Image in the Yellow Sea
To further validate the effectiveness and applicability of the proposed method, in this section, we apply the developed model to AISA+ hyperspectral data acquired during an oil spill incident in the Yellow Sea, namely the Dalian Xingang oil spill incident on 6 August 2010. The location of the oil spill image is indicated by the blue pentagram in Figure 3.

Oil Spill Detection Results
The AISA+ image (Figure 7a) used for the oil spill detection experiment in Dalian Xingang has dimensions of 256 × 384 pixels and contains information about both the oil slick and seawater. However, the image also contains striping and bad lines that can interfere with oil spill detection. A ground truth image of the oil spill distribution (Figure 7b) was created through the combination of aerial photographs and manual visual interpretation. For the Dalian dataset, we randomly selected 5% of the samples for training, 5% for validation, and the remaining 90% for testing. Table 4 lists the number of training, validation, and testing samples for the Dalian dataset. We conducted oil spill detection experiments on the AISA+ hyperspectral data in Dalian using the developed method and two other deep learning models, and the detection results are shown in Figure 8.
In order to quantitatively demonstrate the oil spill detection capabilities of the proposed model and the other two methods, four indicators, namely detection accuracy, overall accuracy, average accuracy, and Kappa coefficient, were used to evaluate the accuracy of the oil spill detection results, as shown in Table 5. Similar to the previous conclusions, the developed DUNET model achieves the highest detection accuracy, overall accuracy, average accuracy, and Kappa coefficient on the Dalian oil spill airborne hyperspectral dataset, followed by the CEGCN model, while the GCN model performs the worst. The DUNET model achieves an oil spill detection accuracy of 98.34% on the Dalian AISA+ dataset, which is an improvement of 1.15% and 6.76% compared to the CEGCN and GCN models, respectively. The overall accuracy and average accuracy are both 98.30%, surpassing the CEGCN model by 0.80% and 0.81%, as well as the GCN model by 8.80% and 8.71%. The Kappa coefficient of the DUNET model reaches 0.9659 on the Dalian AISA+ dataset, outperforming the CEGCN and GCN models by 0.0159 and 0.1760, respectively.
This indicates that the developed oil spill detection model is applicable to oil spill scenarios in other marine areas, effectively detecting oil spills in airborne hyperspectral images under striping effects.

Analysis of Different Proportions of Training Samples
To further validate the robustness of the proposed model under different proportions of training samples, in this section, we randomly select 1%, 3%, and 5% of the training samples from the Dalian oil spill dataset for model training. The oil spill detection results of the proposed model and the other two deep learning methods based on different proportions of training samples are shown in Figure 9. Overall, as the proportions of training samples increase, the oil spill detection accuracy of all three methods gradually improves. However, the proposed DUNET method consistently achieves the highest accuracy, even under the condition of only 1% training samples, outperforming both CEGCN and GCN in oil spill detection.

Analysis of Spectral Resolution of Sensor
Airborne hyperspectral sensors have high spatial and spectral resolution, strong maneuverability, and the ability to quickly respond to emergency situations, providing unparalleled advantages in obtaining timely information on marine oil spills. However, due to weather conditions and limited endurance, the airborne hyperspectral data acquired during oil spill incidents are often scarce. On the other hand, spaceborne hyperspectral sensors have the advantage of acquiring data continuously and consistently on a global scale. However, they have lower spatial resolution, are prone to cloud interference, and have long revisit periods and lower signal-to-noise ratios.
The AISA+ airborne hyperspectral data used in this study has a spectral resolution of 5 nm and a spatial resolution of 0.99 m. The Hyperion spaceborne hyperspectral data has a spectral resolution of 10 nm and a spatial resolution of 30 m. Although Hyperion covers a wider spectral range, from 400 to 2500 nm, spanning visible light to shortwave infrared, compared to the spectral range of the AISA+ imagery (400~970 nm), it is worth noting that the oil spill detection accuracy based on AISA+ imagery consistently exceeds 90%, while the accuracy based on Hyperion imagery remains below 85%. This indicates that the oil spill detection accuracy based on AISA+ imagery, despite noticeable sunglint interference, is higher than that based on Hyperion imagery due to its higher spectral resolution. Furthermore, for data with higher spectral resolution, as the proportions of training samples gradually increase, the oil spill detection accuracy based on the spectral-spatial fusion method tends to stabilize.

Conclusions
China, as a major importer of crude oil, relies heavily on maritime transportation, with 90% of its crude oil imports being transported by sea. This inevitably increases the risk of marine oil spills. Timely and accurate monitoring of oil spill locations and extents plays a crucial role in effectively responding to and managing such unpredictable oil spill incidents. Hyperspectral remote sensing has been widely adopted for monitoring marine oil spills due to its superior spectral and spatial resolution capabilities. However, optical remote sensing images are susceptible to sunglints and shadows, which can reduce the spectral differences between oil films and seawater, especially in accurately extracting the edge information of oil-water interfaces. In light of these challenges, this study developed a GCN-based model that integrates spatial and spectral information from dual branches for hyperspectral oil spill detection in real marine scenarios.
The main conclusions drawn from this study are as follows: (1) Compared to the GCN and CEGCN, the proposed DUNET model achieved the best oil spill detection accuracy on two hyperspectral datasets from the Bohai Sea, confirming its effectiveness in hyperspectral oil spill detection. (2) The performance of the developed model in oil spill detection remains optimal, even with only 1% of the training samples, demonstrating its robustness. (3) When applied to hyperspectral oil spill data from the Yellow Sea, the developed model exhibited superior detection accuracy compared to the other two algorithms, further validating its applicability.
The findings highlight the effectiveness of the proposed model in different datasets and different proportions of training samples, emphasizing its potential significance in supporting oil spill monitoring and emergency response efforts in marine environments.

Data Availability Statement:
The data used in this study are available on request from the first author.