Quality of Monitoring Optimization in Underwater Sensor Networks through a Multiagent Diversity-Based Gradient Approach

Due to the complexity of the underwater environment, conventional measurement and sensing methods developed for land are difficult to apply directly underwater. For seabed topography in particular, long-distance, accurate detection by electromagnetic waves is not possible. Therefore, various types of acoustic and even optical sensing devices have been used for underwater applications. Mounted on submersibles, these underwater sensors can accurately cover a wide underwater range. In addition, sensor technology continues to be adapted and optimized according to the needs of ocean exploitation. In this paper, we propose a multiagent approach for optimizing the quality of monitoring (QoM) in underwater sensor networks. Our framework optimizes the QoM by resorting to the machine learning concept of diversity. We devise a multiagent optimization procedure that both reduces the redundancy among the sensor readings and maximizes their diversity in a distributed and adaptive manner. The mobile sensor positions are adjusted iteratively using gradient-type updates. The overall framework is tested through simulations based on realistic environmental conditions. The proposed approach is compared to other placement approaches and is found to achieve a higher QoM with a smaller number of sensors.


Introduction
Quality of monitoring (QoM) [1][2][3][4] of information in the underwater domain [5,6] is a recent concept that has attracted the attention of researchers in the field of wireless sensors and connected objects. In recent years, there has been an upsurge of interest in underwater wireless sensor networks (UWSNs). UWSNs are made up of several autonomous sensor nodes, which are scattered underwater to carry out detection tasks and collect different properties of the underwater environment [6]. UWSNs admit a large set of applications, including, for instance, monitoring the living conditions of fish by measuring temperature, humidity, pH, and CO2 concentrations in order to associate those metrics with the amount of fish produced under these conditions. [...] weigh less than 3 kg and are used as an alternative to a diver, specifically in places where a diver might not physically enter, such as a sewer, pipeline, or small cavity. Mini-class AUVs weigh around 15 kg and are also used as a diver alternative. General-class AUVs have less than 5 HP of propulsion and carry manipulators, grippers, and a sonar unit used in light survey applications. Their maximum depth is less than 1000 m. Light-work classes typically have less than 50 HP of propulsion. They carry manipulators and are made from polymers such as polyethylene rather than conventional alloys. Heavy-work classes typically have less than 220 HP of propulsion, can carry at least two manipulators, and have a working depth of up to 3500 m. These networks contain multiple sensor nodes and vehicles deployed underwater. Autonomous underwater vehicles and devices are used to collect data from these sensor nodes, and different sizes and designs can be used depending on the mission. Challenges of underwater communication include long propagation delays, limited bandwidth, and sensor failures. In addition, UWSN nodes are equipped with batteries of limited capacity that cannot be recharged or replaced.
The difficulty of conserving energy in UWSNs has motivated the development of underwater communication and networking techniques [18][19][20][21][22]. Monitoring in the underwater domain consists of evaluating the water quality, which is a task of utmost importance. Water quality plays a vital role in fish farming. Good water quality helps farmers ensure maximum fish growth, guarantees a high-quality product, and minimizes disease and death rates. All these factors increase fish production and consequently influence national and international economic growth. Water quality is judged through many parameters. In aquaculture, each parameter has an interval of standard values [23]; if the value of a parameter exceeds these limits, the water quality is affected. The main parameters of aquaculture water are dissolved oxygen (DO), ammonia (NH3, NH4+), nitrite (NO2−), nitrate, turbidity, pH, and temperature [24]. Many factors can influence water quality, such as biological, physical, and human activities; they make it a very complex, nonlinear, and dynamic system. This kind of system cannot be managed with classical methods, and an outdated classification model does not help to achieve better results. Therefore, introducing new technologies based on artificial intelligence and machine learning can be an effective solution.
In this paper, we resort to a multiobjective function to quantify the quality of monitoring in underwater wireless sensor networks. The multiobjective function incorporates two objectives: minimizing the covariance of sensor readings, which reflects the idea of reducing redundancy, and maximizing the diversity among the sensor readings, which reflects the idea of choosing positions that unveil novel information not described by the rest of the sensors. In this sense, the QoM problem can be seen as an instance of an optimal sensor coverage problem. The novelties in this paper revolve around the following:
• The first objective is to maximize the coverage of the sensor network by optimizing the placement of sensors. This involves designing a distributed algorithm that enables the sensors to communicate with each other and cooperate to achieve maximum coverage of the area of interest.
• The second objective is to reduce the redundancy of the sensor network by considering not only the correlation between sensor readings but also the diversity of the readings, using the concept of determinantal point processes.
The closest work to ours is due to Weiler et al. [25]. That work investigates adjusting the positions of sensors based on a gradient descent approach. Its objective function contains only one term, which takes the inverse of the overall covariance between the sensors and the points to sense. The reason for choosing the inverse is to penalize regions that are covered by more than one sensor and thus better take into account the marginal contribution of each sensor. The objective function is intuitive; however, it does not have a sound physical meaning. In our work, we take diversity into account as a new term in the objective function, without inverting the overall sum of covariance. Our framework contributes to filling this gap in AUV solutions. In this study, we use AUV robots because of their ability to move underwater without any external intervention. AUVs are underwater robots that are typically used in mining areas, agriculture applications, and so forth [26]. They are one of the most significant tools for the exploration and application of marine resources [27,28]. An AUV is a self-piloting vehicle that performs a task; it is usually equipped with an onboard artificial intelligence system and a set of programmed commands, which can be modified remotely based on data or information broadcast by the vehicle's sensors [29]. The AUVs in our network move horizontally by ejecting water and vertically by using a buoyancy control system.
The remainder of the paper is organized as follows. Section 2 reviews related work. In Section 3, the proposed solution is detailed. In Section 4, a series of experiments and simulation results are presented to show the validity and relevance of the proposed approach. Section 5 gives some results and comparisons with similar works, and Section 6 concludes this article while giving some directions for future work. In our network, we consider that the AUVs move in 2D, horizontally and vertically, by implementing a gene regulatory network (GRN), which belongs to a set of methods widely used in fields such as swarm robotics [25][26][27].

Related Work
The application of advanced information and communication technology, such as the Internet of Things (IoT) and various machine learning methods, to better manage the behavior of autonomous underwater vehicles (AUVs) is becoming a trend in the pursuit of better QoM. For example, the processing and visualization of water quality data can be carried out remotely and in real time using these AUVs. An example of this application is presented in [30], where an underwater environment monitoring system based on UWSNs is introduced. This system was designed to collect large quantities of data without interruption. Further work is presented in [31], which introduces advanced wireless protocols developed for the IoT in order to highlight their adaptability to the WSN applications used in water quality monitoring. Many spatial coverage algorithms are surveyed in [32], with a very detailed comparison. For instance, the authors in [18] propose a top-down positioning scheme (TPS) for acoustic UWSNs that ensures the quality of service of the new reference nodes during the determination phase of well-located nodes, based on the gradient method. Furthermore, that work presents a new method of estimating the 3D Euclidean distance, which helps nonlocalized nodes find more reference nodes in order to become localized.
A distributed coverage control scheme is described in [33], where a density function describes the frequency of random events and mobile sensors operate within a restricted range specified by a probabilistic model. The algorithm used in this work is gradient-based; it needs only local information at each sensor and maximizes the joint detection probabilities of random events. For the coverage control problem, communication costs are calculated according to two data collection scenarios: the first treats the network as one that collects data from a single source, and the second treats it as one with multisource data. To model the cost of communication, the authors use the same form of energy consumption.
Our work builds on the work by Detweiler et al. [34], where the authors deployed a gradient-based decentralized controller that dynamically adjusts the depth of a submarine sensor network to improve the QoM. In contrast to our work, which also uses the concept of diversity, the latter work only involves optimizing a particular redundancy function based on the correlation between the different sensors. This solution was implemented to solve the problem of monitoring chromophoric dissolved organic matter (CDOM) in the Neponset River, which feeds into Boston Harbor. The study proved that the controller converges to a local minimum. This controller is adapted to a network of submarine sensors capable of adjusting their depths. The results of simulations and experiments verified the functionality and performance of this system and the algorithm presented. The SALMON (Sea Water Quality Monitoring and Management) [35] presented a concept of a guidance system using AUVs to detect and perform automated analysis of several water quality parameters.
In order to model the quality of surveillance, ref. [36] focused on the theoretical study of spatial and temporal correlations due to the various physical phenomena of wireless sensor deployment in nature. Two schemes were proposed to reveal the time and space dependence under centralized and distributed settings to maximize the overall QoM based on sensing scheduling. The same authors proposed another study in [37] using the nondecreasing submodular function to measure the QoM, but this time they took into account the correlation in the detected data in order to define distributed scheduling schemes that are used to determine a high QoM in a ring cycle sensor array.
In [25], the authors proposed RDBF, which is a relative remote routing protocol that takes into consideration energy saving while minimizing delays in transmission. This work was based on the use of an aptitude factor to determine the degree of relevance of a node to participate in transmitting packets. This aptitude test helps reduce needless transfers by the nodes, which helps reduce power consumption and end-to-end delay, in addition to reducing redundancy by controlling transfer time of multiple senders. However, none of the existing studies use the concept of diversity from machine learning to deal with issues of quality of monitoring in the underwater environment.
In recent years, the need for controlling robots through artificial intelligence and, more particularly, machine learning instead of explicit programming has increased. Several methods have addressed this demand using genetic algorithms, neural networks, and other artificial intelligence (AI) or machine learning methods to control some of the functionality of robots [28]. The majority of multirobot systems rely on a default programmed algorithm, which cannot be applied in a dynamic environment characterized by unpredictable change; the robot system therefore has to adapt to environmental changes and take into account the local perception of each robot. The authors of [29] proposed the Hierarchical Gene Regulatory Network (H-GRNe) for Adaptive Multirobot Pattern Formation, a two-layer gene regulatory network (GRN) model that adapts the generation and formation of multirobot patterns. In this model, the adaptation of pattern generation is conducted in the first layer, and the generated patterns then drive the robots in the second layer through a decentralized control mechanism. The authors accompanied their study with simulations in a changing environment, which demonstrated the efficiency of H-GRNe in forming the desired pattern as well as its strong adaptation to robot failure. The AUVs used in this network apply the cellular adhesion molecules (CAM) combined with GRN controllers proposed by [33]. This model is based on the control of GRN-CAM hydrons, which refers, in our case, to a group of AUVs.

An Optimization Function for Water Quality which Minimizes Sensor Redundancy and Maximizes Diversity
This section details the proposed solution, highlighting the characteristics of the proposed architecture and how it is implemented. We consider N AUVs at locations P_i = (x_i, y_i, z_i), with i = 1, ..., N. We assume that the sensors move in a two-dimensional plane defined by the x and z axes, with a fixed y coordinate, as seen in Figure 2, which reduces the three-dimensional position to p_i = (x_i, z_i). We further assume that the correlation between pairs of sensors decreases with their distance as a Gaussian function, not necessarily isotropically. Consequently, we postulate that the covariance Cov(p_i, p_j) between two sensors i and j is given by Equation (1), where σ_x and σ_z are the (spatial) correlation decay rates in the x and z directions, respectively.
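As an illustration of the covariance model described above, the following minimal sketch builds the pairwise covariance matrix for a handful of sensors. It assumes the conventional anisotropic Gaussian form Cov(p_i, p_j) = exp(−(x_i − x_j)^2 / (2σ_x^2) − (z_i − z_j)^2 / (2σ_z^2)), which is one form consistent with the description. The 1/(2σ^2) scaling, the function name `covariance_matrix`, and the sensor positions are assumptions made for illustration only.

```python
import numpy as np


def covariance_matrix(x, z, sigma_x, sigma_z):
    """Pairwise covariance between sensors located at (x_i, z_i).

    ASSUMPTION: anisotropic Gaussian decay with a 1 / (2 sigma^2) scaling.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    dx = x[:, None] - x[None, :]  # matrix of x_i - x_j
    dz = z[:, None] - z[None, :]  # matrix of z_i - z_j
    return np.exp(-dx**2 / (2.0 * sigma_x**2) - dz**2 / (2.0 * sigma_z**2))


# Three hypothetical sensor positions; the sigma values are those quoted
# for the first environment in the experimental section.
L = covariance_matrix([0.0, 1.0, 4.0], [0.0, 0.5, 2.0],
                      sigma_x=2.074, sigma_z=0.917)
print(np.round(L, 3))
```

Under this assumed form the matrix is symmetric with a unit diagonal, and nearby sensors are more strongly correlated than distant ones.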

Gradient Based on Covariance
Since we want to minimize redundancy among the sensors, we need to minimize the overall pairwise correlation between sensors. In other words, we minimize the function H(p_1, ..., p_N), which sums the pairwise covariances Cov(p_i, p_j) over the sensors. The minimum of H(p_1, ..., p_N) fulfills the stationarity conditions ∂H/∂x_i = 0 and ∂H/∂z_i = 0 for i = 1, ..., N, yielding Equation (4).
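The redundancy term and its gradient can be sketched as follows. This is a hedged illustration, not the paper's exact expressions: it assumes H is the sum of Cov(p_i, p_j) over unordered pairs i < j, it assumes the Gaussian covariance with a 1/(2σ^2) scaling, and all function names are hypothetical. The analytic gradient is compared against central finite differences as a sanity check.

```python
import numpy as np


def cov(p, sigma):
    """Assumed Gaussian covariance matrix and pairwise differences p_i - p_j."""
    d = p[:, None, :] - p[None, :, :]  # shape (N, N, 2)
    return np.exp(-0.5 * np.sum((d / sigma) ** 2, axis=2)), d


def redundancy(p, sigma):
    """H: sum of covariances over unordered sensor pairs (assumption)."""
    C, _ = cov(p, sigma)
    return np.sum(np.triu(C, k=1))


def redundancy_grad(p, sigma):
    """Gradient of H with respect to every sensor position, shape (N, 2)."""
    C, d = cov(p, sigma)
    # dC_ij/dp_i = -(p_i - p_j) / sigma^2 * C_ij; the j = i term is zero.
    return -np.sum(C[:, :, None] * d, axis=1) / sigma**2


rng = np.random.default_rng(0)
p = rng.uniform(0.0, 3.0, size=(5, 2))  # hypothetical positions (x_i, z_i)
sigma = np.array([2.074, 0.917])        # (sigma_x, sigma_z)

g = redundancy_grad(p, sigma)

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
num = np.zeros_like(p)
for i in range(p.shape[0]):
    for k in range(2):
        hi, lo = p.copy(), p.copy()
        hi[i, k] += eps
        lo[i, k] -= eps
        num[i, k] = (redundancy(hi, sigma) - redundancy(lo, sigma)) / (2 * eps)

print("max abs difference:", np.max(np.abs(g - num)))
```

Each row of the gradient depends only on the differences between one sensor and the others, which is what makes a per-agent update possible.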

Gradient of Diversity
Minimizing the function H(p_1, ..., p_N) alone leads to a solution that indeed minimizes redundancy but does not guarantee that the sensors cover the maximum amount of information in the system. In other words, one also needs to take into account the diversity covered by the set of sensors. Assuming all the information of the system can be encoded in the linear correlations observed in the system, the determinant of the covariance matrix L between pairs of sensors is a proper measure of this total amount of information, since it reflects the total variance of the data collected by the set of sensors. The idea of using the determinant as a measure of diversity is also found in the theory of determinantal point processes. We therefore consider the covariance matrix L with elements L_ij = Cov(p_i, p_j), as defined in Equation (1), and seek the maximum of its determinant, which is a solution of the stationarity conditions ∂det(L)/∂x_i = 0 and ∂det(L)/∂z_i = 0 for i = 1, ..., N. These conditions can be written as Equation (6), where G is a matrix with elements G_ij = G_ji = −(x_i − x_j)^2 − (z_i − z_j)^2 and ∘ denotes the Hadamard product; see the section "Derivation of the Gradient of Diversity" for the full derivation of Equation (6).
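The gradient of the diversity term can be sketched numerically as follows. The code relies on Jacobi's formula, d det(L) = det(L) tr(L^{-1} dL), and on the relative gain array R = L ∘ L^{-T}. The Gaussian covariance with a 1/(2σ^2) scaling, the function names, and the sensor positions are assumptions for illustration, so the constants may differ from those in Equation (6).

```python
import numpy as np


def cov(p, sigma):
    """Assumed Gaussian covariance matrix and pairwise differences p_i - p_j."""
    d = p[:, None, :] - p[None, :, :]
    return np.exp(-0.5 * np.sum((d / sigma) ** 2, axis=2)), d


def diversity(p, sigma):
    """Diversity measured as det(L)."""
    return np.linalg.det(cov(p, sigma)[0])


def diversity_grad(p, sigma):
    """Gradient of det(L) with respect to every sensor position, shape (N, 2)."""
    L, d = cov(p, sigma)
    R = L * np.linalg.inv(L).T  # relative gain array, L ∘ L^{-T}
    # dL/dp_i is nonzero only in row i and column i, hence the factor 2.
    return -2.0 * np.linalg.det(L) * np.sum(R[:, :, None] * d, axis=1) / sigma**2


# Four hypothetical, well-separated sensor positions (x_i, z_i).
p = np.array([[0.0, 0.0], [2.0, 1.0], [5.0, 0.5], [7.0, 2.5]])
sigma = np.array([2.074, 0.917])

g = diversity_grad(p, sigma)

# Central finite differences as an independent check.
eps = 1e-6
num = np.zeros_like(p)
for i in range(p.shape[0]):
    for k in range(2):
        hi, lo = p.copy(), p.copy()
        hi[i, k] += eps
        lo[i, k] -= eps
        num[i, k] = (diversity(hi, sigma) - diversity(lo, sigma)) / (2 * eps)

print("det(L) =", diversity(p, sigma))
print("max abs difference:", np.max(np.abs(g - num)))
```

Because L has a unit diagonal and is positive definite for distinct positions, det(L) lies between 0 and 1 under this model, which fits the normalization used in the weighted objective.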

Weighted Objective Function
We now combine both the redundancy H and the diversity, measured by det(L), in the same weighted objective function F, where both terms are normalized to take values between 0 and 1, and a parameter ω tunes how much the redundancy term dominates over the diversity term. For simplicity, we define [...]. Equation (9), together with Equations (4) and (6), closes the optimization problem of extracting the set of locations (x_i, z_i) of the N sensors that optimizes the redundancy and diversity together. Note that the gradient controller in Equation (9) converges to a critical point of F.
At this juncture, we are ready to present our multiagent algorithm for optimizing the above objective function. From Equation (9), the optimization problem can be solved numerically through a simple iterative gradient scheme. Namely, let t denote a discrete time instant. We update the position of each sensor i recursively: the position at time t + 1 is obtained from the position at time t by a gradient step scaled by λ, where λ is a learning parameter.
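The iterative update can be sketched as below. Several choices here are assumptions rather than the paper's definitions: the objective is taken as F = ω · H_norm − (1 − ω) · det(L), with H normalized by the number of pairs; positions are kept inside the region by clipping; and a tiny diagonal term is added to L as a numerical safeguard. All names are hypothetical.

```python
import numpy as np

SIGMA = np.array([2.074, 0.917])  # (sigma_x, sigma_z) quoted for environment 1
BOX = np.array([8.0, 3.0])        # extent of the region along x and z


def cov(p):
    """Assumed Gaussian covariance matrix and pairwise differences p_i - p_j."""
    d = p[:, None, :] - p[None, :, :]
    L = np.exp(-0.5 * np.sum((d / SIGMA) ** 2, axis=2))
    # Tiny diagonal term keeps L invertible if two sensors coincide (assumption).
    return L + 1e-9 * np.eye(len(p)), d


def objective(p, omega):
    """Assumed F = omega * normalized H - (1 - omega) * det(L); lower is better."""
    L, _ = cov(p)
    n = len(p)
    h = np.sum(np.triu(L, k=1)) / (n * (n - 1) / 2)
    return omega * h - (1.0 - omega) * np.linalg.det(L)


def gradient(p, omega):
    """Gradient of the assumed F with respect to all positions, shape (N, 2)."""
    L, d = cov(p)
    n = len(p)
    grad_h = -np.sum(L[:, :, None] * d, axis=1) / SIGMA**2 / (n * (n - 1) / 2)
    R = L * np.linalg.inv(L).T  # relative gain array
    grad_det = -2.0 * np.linalg.det(L) * np.sum(R[:, :, None] * d, axis=1) / SIGMA**2
    return omega * grad_h - (1.0 - omega) * grad_det


def optimize(p0, omega, lam=0.1, steps=2000):
    """Repeated gradient steps p(t+1) = p(t) - lam * grad F, clipped to the box."""
    p = p0.copy()
    for _ in range(steps):
        p = np.clip(p - lam * gradient(p, omega), 0.0, BOX)
    return p


rng = np.random.default_rng(0)
p_init = rng.uniform(0.0, 1.0, size=(6, 2)) * BOX  # uniformly random start
p_final = optimize(p_init, omega=0.2)
print("F before:", objective(p_init, 0.2), "F after:", objective(p_final, 0.2))
```

In this sketch every row of the gradient is the update for one agent, although evaluating the determinant term requires the full matrix L, so the agents must exchange their positions at each step.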

Derivation of the Gradient of Diversity
From Equation (1), it is easy to obtain the derivative of L with respect to x_i. We then apply the Jacobi formula for the derivative of a determinant. In doing so, we use the fact that the diagonal entries of (A ∘ B)C^T and (A ∘ C)B^T coincide, and the fact that (dG/dx_i)^T = dG/dx_i because of symmetry.
The matrix R = L ∘ L^{−T} is commonly known in the field of control theory as the relative gain array and admits many applications in that field. The derivative with respect to z_i is obtained similarly.
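For reference, the chain of identities behind this derivation can be written out as follows. This is a sketch under the assumption that L = exp(G) entrywise, with G as defined in the section on the gradient of diversity (so that the decay rates are absorbed into the coordinates). It is meant as a reading aid, and the constants may differ from those in the original displayed equations.

```latex
\begin{align}
\frac{\partial L}{\partial x_i}
  &= L \circ \frac{\partial G}{\partial x_i}, \\
\frac{\partial \det(L)}{\partial x_i}
  &= \det(L)\,\operatorname{tr}\!\left(L^{-1}\,\frac{\partial L}{\partial x_i}\right)
  && \text{(Jacobi's formula)} \\
  &= \det(L)\,\operatorname{tr}\!\left(L^{-1}\left(L \circ \frac{\partial G}{\partial x_i}\right)\right) \\
  &= \det(L)\sum_{j,k}\left(L \circ L^{-T}\right)_{jk}
     \left(\frac{\partial G}{\partial x_i}\right)_{jk} \\
  &= -4\,\det(L)\sum_{j} R_{ij}\,(x_i - x_j),
  \qquad R = L \circ L^{-T}.
\end{align}
```

The last step uses the fact that the entries of ∂G/∂x_i vanish outside row i and column i, where they equal −2(x_i − x_j), together with the symmetry of R.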

Proof of the Convergence of the Gradient Controller
We define the gradient controller as in Equation (13). To prove that our gradient controller (Equation (13)) converges to a critical point of F_ζ, we must verify the following four properties:
1. F_ζ must be differentiable;
2. The gradient of F_ζ with respect to each p_i must be locally Lipschitz;
3. F_ζ must have a lower bound;
4. F_ζ must be radially unbounded or the trajectories of the system must be bounded.
While this assures convergence to a critical point of F ζ , small perturbations to the system will cause the gradient controller to converge to a local minimum and not a local maximum or saddle point of the cost function.
H and det(L) verify properties 1, 2, 3, and 4. We also use the fact that the sum of two Lipschitz functions is Lipschitz. Therefore, F_ζ verifies all four properties [33].

Numerical Implementation and Experiments
To test the performance of our algorithm, we adopt the same environment parameters describing the concentration of CDOM specific to the depth of the Neponset River caused by the tide found in [25]. Each underwater environment is characterized by σ s and σ d . Although those parameters were not explicitly given by the authors in their studies [25,34], we resort to a separability in the exponential function describing the covariance in order to extract them directly from Figure 4 in [34] via curve fitting.
For the first environment, used in [34], we have σ_s = 2.074 as the correlation decay parameter along X (σ_x in Equation (1)) and σ_d = 0.917 as the correlation decay parameter along Z (σ_z in Equation (1)).
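The extraction of a decay parameter by curve fitting can be sketched as follows. Because the assumed Gaussian covariance is separable, each axis can be fitted on its own: along one axis, ln Cov = −d^2 / (2σ^2), so σ follows from a least-squares line through the origin in the variables (d^2, ln Cov). The data below are synthetic and generated from the quoted σ_s; they are not the values read from Figure 4 in [34], and the function name `fit_sigma` is hypothetical.

```python
import numpy as np


def fit_sigma(distances, covariances):
    """Fit sigma in cov = exp(-d^2 / (2 sigma^2)) by log-linear least squares."""
    d2 = np.asarray(distances, dtype=float) ** 2
    log_c = np.log(np.asarray(covariances, dtype=float))
    slope = np.sum(d2 * log_c) / np.sum(d2 * d2)  # line through the origin
    return np.sqrt(-1.0 / (2.0 * slope))


# Synthetic, noise-free samples along the X direction (illustration only).
true_sigma = 2.074
d = np.linspace(0.5, 6.0, 12)
c = np.exp(-d**2 / (2.0 * true_sigma**2))

sigma_hat = fit_sigma(d, c)
print("recovered sigma:", sigma_hat)
```

With noisy measurements the same fit gives an estimate rather than an exact value, and a nonlinear fit on the untransformed data may be preferable when small covariances are poorly resolved.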
We use a grid of length 8 km along the X direction and 3 m along the Z direction. Furthermore, we use a learning rate λ = 0.1. Choosing an excessively large value of the learning parameter λ prevents proper convergence and can make the system oscillate, whereas choosing too small a value makes the convergence sluggish. We now report the experimental results for different numbers of sensors. Our second environment is characterized by σ_s = 1.977 along X and σ_d = 1.198 along Z, and yields results similar to those of environment 1; for the sake of brevity, the results for the second environment are reported only in Appendix A. Although we conducted a large set of experiments for different numbers of sensors and different parameters of the multiobjective function, we report only a few representative results, as the conclusions are similar across experiments. For the objective function, we report results for two representative cases: ω = 0.8, where the multiobjective function weights the covariance minimization term more, and ω = 0.2, where it favors the diversity maximization term more.
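The sensitivity to the learning rate λ noted above can be seen on a toy problem. The sketch below applies gradient steps to the one-dimensional quadratic f(x) = (k/2) x^2, which is not the paper's objective; the curvature k = 4 and the three step sizes are hypothetical values chosen only to show the three regimes.

```python
def descend(lam, curvature=4.0, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x^2 and return x."""
    x = x0
    for _ in range(steps):
        x = x - lam * curvature * x  # gradient of f is curvature * x
    return x


for lam in (0.001, 0.1, 0.6):
    print(f"lambda = {lam}: x after 50 steps = {descend(lam):.3e}")
```

Each step multiplies x by (1 − λk). A very small λ leaves x almost unchanged after 50 steps, a moderate λ drives it quickly to zero, and a λ with λk > 2 flips the sign at every step while the magnitude grows.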

Case of 10 Sensors
In this scenario, we deploy 10 sensors at uniformly random initial positions and run our scheme using ω = 0.8 and ω = 0.2. Recall that ω = 0.8 places more weight on the covariance, while ω = 0.2 places more weight on the diversity. Figure 3 shows the covariance after running our algorithm for 10^4 iterations. We can clearly see that ω = 0.2 yields the minimal covariance and the fastest convergence rate. Figure 4 shows that the corresponding diversity is also largest for ω = 0.2. We therefore conclude that choosing ω = 0.2 gives both lower covariance and higher diversity. In other words, introducing the diversity term also helps to reduce the covariance, seemingly because it allows the optimization to avoid some local minima.
The final positions are depicted in Figure 5. We visually observe that the positions obtained with ω = 0.2 cover the whole area, while with ω = 0.8 the sensors are positioned only in the middle and at the top of the network.

Case of 20 Sensors
Now, we describe the experiment for 20 sensors. We use the same values of ω = 0.2 and ω = 0.8 and show the graphs for covariance, diversity, and final positions.
The covariance is depicted in Figure 6, where the convergence seems faster for ω = 0.2 than for ω = 0.8. The diversity is depicted in Figure 7 for both ω = 0.2 and ω = 0.8; we can see that ω = 0.2 gives a higher diversity. The final positions for this case study are presented in Figure 8, and we can visually verify that the sensors are adequately positioned with ω = 0.2, despite the increase in the number of sensors.

Further Discussion
As mentioned in Section 4, the convergence speed seems faster for ω = 0.2 than for ω = 0.8 in both cases (10 and 20 sensors). Comparing the performance in the two environments, we note that for ω = 0.2 the diversity seems to increase faster than for ω = 0.8, and the covariance seems to be minimal. Comparing the two case studies (10 and 20 sensors) for ω = 0.2 and ω = 0.8, we notice that the covariance decreases rapidly only for ω = 0.8, while the diversity is affected more strongly for ω = 0.2 than for ω = 0.8. The final sensor positions show that ω = 0.2 gives more coverage of the study area than ω = 0.8.
In Table 1, we give an overview of several papers published on quality of monitoring (QoM) in underwater sensors, each proposing a different approach and method for optimizing sensor placement and data collection in underwater environments. These papers demonstrate that there are various approaches and methods for optimizing QoM in underwater sensor networks, and the performance of these methods can depend on factors such as the optimization algorithm, the network topology, and the environmental conditions. It is important to carefully consider these factors when designing and deploying underwater sensor networks and to evaluate the performance of different approaches using appropriate metrics and benchmarks.

Conclusions and Future Work
In this paper, a new optimization algorithm based on covariance and diversity is presented to optimize the QoM. Moreover, we presented the challenges associated with each of these blocks and how they were tackled by several relevant papers in the literature. This was performed in a systematic way, by focusing on the methods, conclusions, and higher level decisions of each paper. More specifically, we can conclude the following. (1) Input features should convey useful information about the propagation problem at hand, while also having small correlation between them. (2) Dimensionality reduction techniques can help identify the dominant propagation-related input features by removing redundant ones. (3) Increasing the number of training data by presenting the ML model with more propagation scenarios improves its accuracy. As future work, we propose to investigate different aspects:
• The impact of varying the number of agents on the performance of the system: While our approach showed promising results in improving the performance of QoM, it is important to understand how the number of agents affects the overall system performance. Future work could focus on varying the number of agents and evaluating the resulting impact on the performance of the system.
• Agent selection: Future work could explore the development of a more efficient algorithm for agent selection that reduces computational costs while still achieving high-quality optimization results.
• Evaluating the impact of environmental factors on the performance of the system: The performance of underwater sensor networks is often affected by various environmental factors such as water temperature, salinity, and turbidity. Future work could investigate how these environmental factors affect the performance of the multiagent diversity-based gradient approach optimization and identify ways to mitigate their impact.
• Extending the optimization to other QoM metrics: The multiagent diversity-based gradient approach optimization has been mainly focused on optimizing the energy efficiency of underwater sensor networks. Future work could explore the extension of the optimization approach to other QoM metrics such as latency, throughput, and reliability. As future work, we could also try to jointly optimize the communication cost and quality of monitoring.
• Machine learning techniques such as reinforcement learning and deep learning have shown promising results in optimizing various aspects of underwater sensor networks. Future work could explore the integration of these techniques with our optimization approach to further improve the performance of the system.
• Three-axis models: To improve our study and move closer to the real world, a three-axis model will be considered in future works.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Environment 2
Our second environment was obtained from [34] and is characterized by σ_s = 1.97786755059 as the correlation decay parameter along X and σ_d = 1.19859333137 as the correlation decay parameter along Z. We obtain results similar to those of environment 1.