1. Introduction
Water plays an irreplaceable role in activities such as biodiversity conservation, agriculture, tourism, and industry, among others. However, despite being the most abundant liquid on the planet, quality usable water is severely scarce. This problem has become more acute in recent years due to climate change [
1], requiring huge investments and difficult treatments for finding new water sources and acquiring quality water. The United Nations reflected the need for cooperation between worldwide organizations in Sustainable Development Goal (SDG) 6 [
2]. This goal aims to ensure the availability and sustainable management of water and sanitation for all. SDG target 6.3 addresses another problem related to the release of hazardous chemicals and materials.
Approximately half of all the wastewater generated worldwide is released without treatment [
3]. This, in addition to the accidental release of residues such as organic matter, oil spills, heavy metals, and even radioactive substances, makes the situation a real environmental hazard. If these sources of contamination are not detected and treated in time, they can spread and cover the whole water surface, contaminating it directly and affecting its biodiversity, or indirectly promoting the appearance of invasive species or algae blooms [
4], which, with time, can make the water unhealthy for human use. To avoid reaching this state of environmental crisis, water quality values must remain within water quality standards [
5], and governments and organizations need to continuously monitor water masses. Monitoring is the preemptive measure against water contamination and degradation, as recovery is a process that takes several years [
6].
Traditional water quality monitoring approaches focus on taking manual measures and analyzing samples in laboratories, requiring a lot of effort and human resources [
7]. Recently, traditional methods are being replaced with satellite methods that observe only the water surface [
8] or intelligent robots, such as submarines and surface or aerial vehicles, that can be equipped with water quality sensors and robotic actuators [
7]. Thus, vehicles are able to perform tasks ranging from exploration to actuation on water masses [
9], involving detection, chasing and cleaning pollutants, and other monitoring tasks in real time. Furthermore, the time taken in laboratories to analyze samples induces a delay that, in the case of emergent contaminants, can cause a public health problem [
10] that is not present when monitoring with autonomous vehicles. Therefore, it is envisioned that autonomous vehicles will play a crucial role in SDG target 6.6 in protecting water bodies [
2].
The improvements in battery autonomy and computation power have made autonomous vehicles able to take intelligent actions. Thus, tasks that previously required an operator remotely controlling the vehicle are being replaced with a programmed movement policy that dictates the behavior of the vehicle [
11]. The objective of these policies is to provide vehicles with target points or waypoints to travel to, which makes obtaining a policy a path planning problem. Another objective is to optimize the monitoring task assigned, which can be exploration or actuation, while taking into account factors present in the vehicle, such as battery, sensing, and actuating constraints. Thus, the path planning problem needs to take into account information about the environment. Developing a policy becomes a complex challenge due to the highly dynamic scenario of water masses. Since water is a fluid affected by several forces that facilitate the movement of particles through the whole mass, determining how a mass of water and its properties evolve through time is difficult. Therefore, vehicles need to adapt to this environment, about which they have no prior knowledge. As a consequence, information is gathered by the vehicles during their mission and processed on board or at a base station that the vehicle is able to communicate with, which rules out offline planning.
Advances in neural networks have enabled solutions to the Adaptive Informative Path Planning (AIPP) problem based on deep architectures [
12]. As a clear example, Deep Reinforcement Learning (DRL) approaches are able to solve the informative path planning problem, providing a valid policy with which vehicles are able to carry out the designated task [
12], offering more robust and scalable solutions that adapt to the complexities and uncertainties of the environment. There are various optimization tasks, but regarding AIPP algorithms, some previous works have focused on water quality monitoring, contamination phenomenon exploration, and search tasks [
13]. Among these previous works, some have focused on the contamination detection of algae blooms [
14] and oil spills [
15] using autonomous surface vehicles (ASVs), which are also called agents in the field of AIPP [
16], equipped with specialized sensors [
17,
18]. However, most of the previous works made the assumption of lentic waters [
19]. This means that the evolution of water properties and contaminants is slow and it can be considered that they do not change throughout a monitoring mission. However, this is not the case in larger water bodies like seas, rivers, and larger lakes, where the scenario is highly dynamic due to currents and wind, among other factors. Therefore, water quality conditions may evolve as fast as or faster than the actuation of the vehicles, and consequently, measurements can easily become outdated. Although the use of multiple vehicles can alleviate the problem by increasing the data samples [
20] or processing the age of the data collected [
21], the obtained models will not reflect the real evolution of water quality parameters. Therefore, the planning actions will be sub-optimal in real scenarios.
The aim of a contamination model is to solve the estimation problem, obtaining the whole contamination map from partial observations and estimating its evolution. Several contamination models have been studied in the past [
22]. In [
23], the evolution of a contaminant in a river was modeled using mathematical hydrodynamic equations and solving the inverse model, reducing the potential harm caused by pollution accidents. In [
24], several numerical models based on advection–dispersion equations or transport models for vulnerability assessment were used. However, characterizing the evolution in a larger water mass like a lake with partial observations cannot be explicitly described with equations, as it is affected by several chaotic effects. Bayesian contamination models like the Gaussian process [
25] are able to provide valid solutions to the static problem, with the downside of a high computational cost that increases with the number of samples. In [
26], a contamination model was obtained using a variational autoencoder (VAE) neural network, providing a solution that scales better with the number of water samples. In the same paper, the results showed that a good contamination model is able to provide an improvement in policy performance of approximately 50%. Thus, offering a forecasting module that provides not only the present state of the contamination but also a prediction of its future state is likely to improve policy performance even further.
This paper proposes a variational autoencoder architecture based on the popular UNet network [
27] combined with a prior and posterior convolutional neural network (CNN) architecture [
28]. In [
26], a similar architecture was proposed for the static case. In this paper, it is extended to the dynamic case, analyzing its capacity to estimate future distributions. This network was trained in a model-free manner, using only simulated interactions of the agents with the environment. The aim of the simulator is to provide a spatio-temporal distribution of pollutants in water bodies, replicating an oil spill accident. The simulator was used to create training and test datasets. The proposed VAE-UNet architecture is intended as a tool that any AIPP algorithm can use to plan ahead. The VAE works as a model that can provide accurate information about the contamination state from partial observations. Simultaneously, it captures the time-dependent behavior of the contamination, providing foresight for future contamination states.
To summarize, this paper contributes the following:
A novel VAE neural network following the U-Net architecture that aims to provide future state estimations of water pollutant evolution.
A comparison of the network performance against a naive baseline prediction.
A further study of the limitations and overfitting of the suggested architecture.
This paper is organized as follows:
Section 2 presents the materials and methods and describes the problem that the proposed VAE aims to solve, as well as how the environment, contamination, agents, and simulator are set up. Lastly, the architecture of the VAE is discussed. In
Section 3, the results of the VAE obtained will be analyzed, and the model’s behavior will be compared with that of a naive model. In
Section 4, the main contributions of this paper and future lines of work will be discussed.
3. Results
Experiments were performed using a 97 × 93 node scenario in a circular shape. The simulator was used offline to create datasets for
and
as inputs of the network and
as the ground truth of the scenario and the value against which the output of the variational autoencoder (VAE) is compared. To evaluate the performance of the VAE in different environments, the simulator was configured to create scenarios where oil spill evolution can present three different behaviors. In linear dispersion, currents and wind move the particles in a general direction while allowing erratic behaviors, as seen in
Figure 11c. In circular expansion, wind and current forces are minimal and the sources have a high flow of particles, so the spill grows in a circular shape, as seen in
Figure 11a. Triangular diffusion is a rule-based behavior that mimics an oil spill caused by a flow coming from a broken cross pipe that follows a cross shape, as seen in
Figure 11b. The oil is released quickly and slows down once it has entered the water body. This last environment presents the most artificial behavior but adds more complexity to the problem. The simulator (
https://github.com/AloePacci/cpp_oil_simulator, accessed on 11 January 2025) and VAE (
https://github.com/AloePacci/VAEPOCTEWE, accessed on 11 January 2025) codes are available on GitHub.
The VAE was configured to have five window frames
as inputs, spaced uniformly five time steps apart, ranging from
to
, and another five frames as output ranging from
to
. Several datasets were created for each of the oil spill behaviors; 20,000 different contamination scenarios were synthesized for training, 4000 for testing and 200 for validation. These include monitoring situations with agents equipped with electrochemical sensors, with influence radius
, and agents equipped with cameras, with influence radius
. An example of a contamination instance dataset can be seen in
Figure 12.
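The windowing described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' data pipeline: the function names are hypothetical, and it assumes that the five input frames end at the current step t while the five output frames start at t (so that both windows span 20 time steps, consistent with the predictions "20 time steps into the future" reported later).

```python
from typing import List, Sequence, Tuple


def frame_offsets(n_frames: int = 5, stride: int = 5) -> Tuple[List[int], List[int]]:
    """Return (input_offsets, output_offsets) relative to the current step t.

    Assumption: inputs end at t and outputs start at t.
    """
    span = (n_frames - 1) * stride
    inputs = [k * stride - span for k in range(n_frames)]  # -20, -15, ..., 0
    outputs = [k * stride for k in range(n_frames)]        # 0, 5, ..., 20
    return inputs, outputs


def make_sample(observed: Sequence, truth: Sequence, t: int,
                n_frames: int = 5, stride: int = 5):
    """Slice one (input, target) pair around step t from two aligned sequences."""
    ins, outs = frame_offsets(n_frames, stride)
    if t + ins[0] < 0 or t + outs[-1] >= len(truth):
        raise IndexError("step t does not leave room for both windows")
    x = [observed[t + o] for o in ins]   # partial observations fed to the network
    y = [truth[t + o] for o in outs]     # ground-truth frames to be predicted
    return x, y
```

With these defaults, a simulated episode needs at least 41 recorded steps to yield a single sample, which is one practical consequence of the chosen window length.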
All simulations and training were carried out on a server running Ubuntu 22.04.4 LTS (Universidad de Sevilla, Sevilla, Spain), equipped with an Intel Dual Xeon Gold 5220R CPU 2.20 GHz, 192 GB of RAM and two GPUs: Nvidia Quadro A4000 48 GB and Nvidia RTX 3090 25 GB. Training loss was calculated using Equation (
17) and forgetting factor
. The hyperparameters
and the learning rate
were optimized utilizing Optuna [
38] to minimize the reconstruction loss
in order to address the final objective of model accuracy.
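To make the role of the forgetting factor concrete, the following sketch weights the per-frame reconstruction error so that frames further in the future contribute less. It is an assumption-laden illustration: Equation (17) is not reproduced in this section, so the geometric weighting, the function names, and the omission of the other VAE loss terms are all hypothetical choices made here for clarity.

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[float]]


def mse(pred: Grid, truth: Grid) -> float:
    """Mean squared error between two equally sized 2D grids."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            count += 1
    return total / count


def discounted_reconstruction_loss(preds: Sequence[Grid], truths: Sequence[Grid],
                                   gamma: float) -> Tuple[float, List[float]]:
    """Sum per-frame MSE weighted by gamma**k (assumed form of the forgetting factor).

    Returns the weighted total and the unweighted per-frame losses, so the
    effect of the weighting can be inspected frame by frame.
    """
    per_frame = [mse(p, t) for p, t in zip(preds, truths)]
    weighted = sum((gamma ** k) * loss for k, loss in enumerate(per_frame))
    return weighted, per_frame
```

Returning the unweighted per-frame values alongside the total makes it possible to compare how much each prediction horizon contributes before and after the weighting is applied.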
Given the datasets and different combinations of agents and oil spill evolution behaviors, different networks were trained to calculate the cross losses and validate the effectiveness and generalizing capabilities of the proposed VAE. Thus, results were divided into a fleet of agents characterized by an influence radius and another characterized by an influence radius . For each fleet of agents, four different models were trained: one for each of the oil spill behaviors for cross validation, and one containing a fusion of all three oil spill behaviors, from now on called the generalized network. The weights were chosen for the network at the epoch that showed the lowest test loss.
3.1. Performance Metrics
As mentioned previously, the aim of the VAE-UNet is to provide future states of oil spill contamination. The baseline taken for comparison to evaluate the performance of the network is the static approach, where the environment is considered non-dynamic and the expected future state of the contamination position is equal to the current one. The loss at time step 0 in areas recently visited by agents is minimal. However, in areas with data measured several steps ago, or with predictions at future timestamps, the error of this approach increases rapidly. Visual inspection of this loss establishes the baseline as a solution that underperforms but still provides a valid estimation.
Figure 13 shows the results of evaluating the reconstruction loss MSE
before using the baseline approach. The loss incurred by the VAE-UNet was calculated with respect to the baseline and expressed in a percentage value of the baseline loss.
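The comparison against the static baseline can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the paper: the function names are hypothetical, maps are represented as nested lists, and the baseline simply reuses the current estimate for every future frame.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def mse(pred: Grid, truth: Grid) -> float:
    """Mean squared error between two equally sized 2D grids."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            count += 1
    return total / count


def relative_loss_percent(model_preds: Sequence[Grid], truths: Sequence[Grid],
                          current_estimate: Grid) -> List[float]:
    """Per-frame model loss as a percentage of the static-baseline loss.

    The static baseline predicts `current_estimate` for every future frame.
    Values below 100 mean the model beats the baseline at that horizon.
    """
    result = []
    for pred, truth in zip(model_preds, truths):
        base = mse(current_estimate, truth)
        model = mse(pred, truth)
        # A zero baseline loss makes the ratio undefined; report NaN instead.
        result.append(100.0 * model / base if base > 0 else float("nan"))
    return result
```

Expressing the loss as a ratio keeps results comparable across fleets whose baselines differ in absolute magnitude.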
3.2. Fleet with
This fleet is characterized by an influence radius
and is able to take measurements in the nodes where the agents are currently located, like the one used in [
39] equipped with electrochemical sensors. It is made up of four different agents that are able to move through three cells each time step. An example of the dataset can be seen in
Figure 14.
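How an influence radius turns the full contamination map into a partial observation can be sketched as below. The details are assumptions made for illustration: the function name, the Euclidean distance on grid indices, and the sentinel value used for unobserved nodes are hypothetical and may differ from the simulator's actual implementation.

```python
from typing import List, Sequence, Tuple


def observe(truth: Sequence[Sequence[float]], agents: Sequence[Tuple[int, int]],
            radius: int, unknown: float = -1.0) -> List[List[float]]:
    """Return a partial map: ground-truth values within `radius` of any agent,
    and `unknown` everywhere else.

    Assumptions: Euclidean distance measured in grid cells; a radius of 0
    observes only the node where the agent is located.
    """
    rows, cols = len(truth), len(truth[0])
    partial = [[unknown] * cols for _ in range(rows)]
    for ar, ac in agents:
        # Only scan the bounding box of the disc, clipped to the map limits.
        for r in range(max(0, ar - radius), min(rows, ar + radius + 1)):
            for c in range(max(0, ac - radius), min(cols, ac + radius + 1)):
                if (r - ar) ** 2 + (c - ac) ** 2 <= radius ** 2:
                    partial[r][c] = truth[r][c]
    return partial
```

Calling the same function with a larger radius mimics a fleet that senses an area around each vehicle rather than a single node.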
The network's training loss curve shows a steep descent that stabilizes around epoch 100, as seen in
Figure 15. A value of 200 epochs for training was considered sufficient.
Figure 16 shows the relation between the three different losses during training. In analyzing the reconstruction loss
curve, the loss associated with each of the future estimations shows a similar value despite the loss reduction applied to future predictions. This supports the earlier assumption that the further the network looks into the future, the higher the loss.
In
Table 2, the reconstruction loss value of each trained network is shown. The network trained using only triangular diffusion data shows the lowest loss. This could be due to the simplicity of the contamination behavior in this case. It is followed by circular expansion, which is also simple, as it presents no wind or current effects. Lastly, the linear dispersion case shows the highest loss. This could be due to the high variety and random evolution of this contamination behavior. The generalized network, trained on data from all the different datasets, presents an intermediate value.
Once the different networks were trained, their performances were evaluated.
Table 3 shows the reconstruction loss results of evaluating each trained network against each of the different validation datasets. In view of the results, the solution presented in this paper is able to provide a prediction of the evolution of an oil spill with an error of less than 10% of that of the naive baseline approach for each assigned contamination behavior. This shows that the network is able to predict oil spill evolution with high accuracy in environments similar to those it was trained on.
To test the adaptability of the trained VAE-UNet to unseen behavior, each network was evaluated against datasets in which the oil spill behaves very differently from the dataset it was trained on. The results show that the VAE performs better than the baseline prediction in all individual cases, except for the circular expansion network evaluated on the triangular diffusion dataset, which underperformed. This occurs mostly due to overfitting, as there is no wind or current effect in the circular expansion dataset. The opposite can be seen in the linear dispersion network, where contamination particles evolve in diverse ways, allowing for better adaptability and lower cross-losses. A special case is the triangular diffusion dataset, where the generalized VAE achieves a lower error than the network trained specifically on triangular diffusion. This could be due to the loss hyperparameters being optimized for the generalized case and a better understanding of particle behavior due to a more varied dataset.
Lastly,
Table 4 shows the evolution of
along the different prediction timestamps for each trained VAE-UNet over the complete validation dataset. The values show that the further into the future the VAE-UNet predicts, the higher the prediction error. This increase is larger for the VAE-UNets trained on a single contamination behavior, as they were not trained on the whole dataset. However, the loss value remains low, presenting a better estimation than the baseline. The
Supplementary Material includes a video showing the evolution of the VAE-UNet as the fleet explores the map.
3.3. Fleet with
The fleet is characterized by an influence radius
; this could be the case of agents equipped with spectral sensors like the ones present in [
40]. It is made up of four different agents that are able to move through three cells each time step. An example of this fleet’s dataset can be seen in
Figure 12. This fleet is able to provide contamination information about nodes adjacent to the agents in a four-node radius, providing more information than the one equipped with electrochemical sensors, resulting in a lower loss, as seen in
Figure 17. The analysis of the reconstruction loss for each time step in
Figure 18 shows similar values despite the loss reduction. This result reinforces the assumption that the further into the future the estimation, the higher the error committed by the VAE in the prediction.
Table 5 presents the training and testing reconstruction losses. The losses follow the same relationships as for the previous fleet. However, due to having more information, the magnitude of the loss is lower.
The performance of the network was evaluated, and the results are shown in
Table 6. Because this fleet gathers more information, the static approach selected as the baseline yields an estimation loss that is six times lower on average. The VAE is able to process the new information to provide better estimations. However, even though the absolute loss values were reduced, the cross-losses show a performance worse than that of the baseline in environments different from those the networks were trained on. This overfit is most pronounced in the circular expansion case, trained without the influence of wind or currents. The opposite is seen in the linear dispersion case, which presents a more varied training dataset and environmental effects.
Table 7 shows the evolution of
along the different prediction timestamps through the complete validation dataset. The overfit can be easily seen in the circular expansion network. The rest of the trained networks present loss values lower than those of the baseline. The generalized network presents the lowest reconstruction loss value, being the network trained with the most varied dataset. These results indicate that the more varied the dataset, the better and more robust the network, leading to better adaptation to unknown environments and lower losses.
A visual representation of the VAE-UNet generalized network for the fleet equipped with electrochemical sensors can be seen in
Figure 19, showing the partial input
and the output of the VAE
against the ground-truth data
Y and the difference between both
. As mentioned previously, the agent policies are not the objective of this study.
Figure 20 shows an example where agents have yet to discover the oil spill contamination. The VAE-UNet predicts contamination to be in an erroneous position. This illustrates the effect that the fleet's information-gathering performance, present both during testing and training, has on the modeling of the contamination.
Figure 21 shows the evolution of the different losses along three different oil spill environments. Initially, reconstruction loss increases until the contamination area is detected and then decreases sharply. The generalized network presents instances where the fleet with
presents a lower loss than
. This is due to the planner policy of the agents that provides different monitoring information to each fleet.
4. Conclusions
This paper proposes a variational autoencoder to predict the evolution of oil spill contamination in water bodies from partial observations. To assess performance, it was tested on several scenarios presenting three different simulated oil contamination environments: circular expansion, presenting minimal wind and current forces; triangular diffusion, where contamination is exposed to biased currents; and linear dispersion, allowing random behaviors with high wind and current effects. Furthermore, the test was duplicated using two fleets of autonomous surface vehicles with different monitoring capabilities: a fleet equipped with electrochemical sensors able to take point measurements and a fleet equipped with spectral cameras able to monitor an area close to the vehicle.
According to
Table 4 and
Table 7, the validation results of the proposed generalized VAE show a prediction loss as low as 3.51% of the baseline at the current time, for the fleet equipped with electrochemical sensors, and as high as 8.21% of the baseline 20 time steps into the future, for the fleet equipped with spectral cameras. The magnitude of this loss increases the further into the future the predictions are made. The overfit of the network to its training data was tested using networks trained with datasets presenting only one of the three available environments. The results show a lower loss in the specific environments and a higher loss in different ones. A further study showed that this overfit decreases when the network is trained with a more varied dataset, with validation losses as high as six times the baseline for the circular expansion case in the fleet equipped with spectral cameras, or 11.73% of the baseline in the linear dispersion case with the fleet equipped with electrochemical sensors. Thus, the proposed generalized network trained with a varied dataset is expected to perform well in new environments.
The gathering performance of the agents affects the proposed VAE in two different ways. The fleet equipped with spectral cameras is able to cover a wider monitoring area, providing more monitoring data and allowing for a reconstruction loss six times lower on average. Furthermore, the wider coverage allows for detecting the contamination position with more certainty. The path-planning policy is random, presenting cases where the vehicles have not detected any contamination and the prediction erroneously locates the contamination. Thus, in a monitoring scenario, the initial losses of the proposed VAE show an underperforming solution.
Future work follows two lines of investigation. On the one hand, an analysis of the effect of the proposed VAE-UNet structure on informative path planning should be performed, providing numeric data on the effects of taking the future state of contamination particles into account in agent policy estimation. On the other hand, the limits of the VAE should be addressed, evaluating the effects of different agent policies on prediction accuracy, as well as the input requirements regarding the number and age of window frames.