Learning-Based Task Offloading for Marine Fog-Cloud Computing Networks of USV Cluster

In recent years, unmanned surface vehicles (USVs) have made important advances in civil, maritime, and military applications. With the continuous improvement of autonomy, the increasing complexity of tasks, and the emergence of various types of advanced sensors, higher requirements are imposed on the computing performance of USV clusters, especially for latency-sensitive tasks. However, during marine operations, the relative movement of the USV cluster nodes and the network topology of the cluster cause the wireless channel states to change rapidly, and the computing resources of cluster nodes may become available or unavailable at any time. These conditions are difficult to predict accurately in advance. Therefore, we propose an optimized offloading mechanism based on classic multi-armed bandit (MAB) theory. This mechanism enables USV cluster nodes to make offloading decisions dynamically by learning the potential computing performance of their neighboring team nodes, so as to minimize the average computation task offloading delay. We further propose an optimized algorithm, named the Adaptive Upper Confidence Boundary (AUCB) algorithm, and design corresponding simulations to evaluate its performance. The algorithm enables the USV cluster to adapt effectively to marine vehicular fog computing networks while balancing the trade-off between exploration and exploitation (EE). The simulation results show that the proposed algorithm converges quickly to the optimal computation task offloading combination strategy under both heavy and light input data loads.


With the continuous improvement of unmanned system autonomy and the increasing complexity of tasks, higher requirements are imposed on the computing performance of USV clusters, especially for latency-sensitive tasks [1]. The main reasons include the following aspects:
• In complex mission operations, USV team nodes must be assigned different roles, depending on limited platform size, energy, and payloads. For example, some nodes need to carry more payloads to complete the detection and sensing functions for the cluster, so no additional computing resources can be configured on them. Other nodes can be configured with stronger computing resources but fewer other payloads. Assigning different roles to cluster nodes makes the functions of unmanned nodes more specific and more fully utilized.
• At level 1 of unmanned system autonomy, cluster nodes do not need strong computing resources. All sensor data is transmitted to the remote control center, and the operator judges the situation and sets the corresponding commands. As task complexity increases, cluster node autonomy must be improved; the remote control mode is far from meeting the requirements of complex tasks and cannot exploit the advantages of unmanned systems.
• At the same time, cluster nodes need to be equipped with more advanced sensors, which produce more sensor data, to meet the higher detection and sensing requirements of complex tasks. This large amount of data cannot be handled by the remote control center, because doing so would introduce a large delay and would not allow cluster nodes to respond to situational changes in a short time.
In the past few years, we focused on research into the collaborative autonomy and control of USVs, and we achieved level 2 of unmanned system autonomy. More recently, we have begun to research new communication technologies, because we find that existing maritime communication technologies cannot support the autonomy that USVs need to cope with more complex tasks, especially latency-sensitive tasks.
The picture below shows a type of USV that we have developed, which has significant applications in some areas. The USVs are undergoing marine testing, as shown in Figure 1.

For the above reasons, it is important to make full use of the computing resources of the USV cluster. At the same time, combining them with the computing resources of the remote cloud can significantly improve the overall computing performance.
However, because of the limited marine communication conditions (such as bandwidth limitation and channel quality) between the USV cluster and the remote cloud, the transmission costs are relatively high if a large number of complex computing tasks are offloaded to a centralized cloud with rich computing resources.
Although the application of marine unmanned systems is only at its beginning, its development potential is strong, and the capacity of marine unmanned platforms is increasing.

The main contributions of this paper are as follows:
• This paper proposes a marine fog-cloud computing architecture for USV clusters and analyzes the network dataflow.
• For typical application scenarios, we consider a marine vehicular fog-cloud computing model of moving USV cluster nodes.
• An optimized learning-based computation offloading mechanism based on classic MAB theory is proposed. This mechanism enables USV cluster nodes to make decisions dynamically by learning the potential offloading performance of their neighboring team nodes, so as to minimize the average computation task offloading delay.
• Furthermore, we propose an optimized algorithm, named the AUCB algorithm, and design corresponding simulations to evaluate its performance under typical conditions.

Related Work
The research of unmanned systems involves a deep integration of many disciplines such as systems engineering, control engineering, information, and communication engineering. It is also the frontier research field and future development direction of marine science and technology.
With the rapid development of information technology, technological advances have paved the way for the emergence of complex services. Computation task offloading is attractive for Internet of Things (IoT) and edge computing. Typically, task offloading can occur between sensors, edge devices, fog nodes, or IoT nodes [2].
In order to improve the task scheduling efficiency of the marine collaborative edge systems and reduce communication cost, an optimized algorithm is proposed in [3].
SDN and fog computing have been integrated into maritime broadband communication systems to minimize the total weight delay of the stand-alone scheduling scenario and to achieve the minimum delay for weighted upload packets in [4].
In the marine mobile computing environment, the network status changes at any time during task execution, and the migration decision problem in the mobile fog environment needs to be transformed into a runtime offloading decision-making problem.
If the migration decision is made without considering the dynamics of the marine mobile fog environment, the strategy will result in incorrect migration, and ultimately affect the comprehensive dynamic computing performance of the USV fog cluster.
As far as the scope of this paper is concerned, there are currently many studies on static computation offloading decision-making. In the system development stage, the offloading strategy is formulated through program analysis, and after development is complete, the strategy no longer changes. For example, the problems studied in [5][6][7][8] are static offloading decisions.
Existing research on mobile device edge computing mostly uses idle mobile devices around user terminals to complete the offloading of computing tasks through device-to-device (D2D) communications. The corresponding research is carried out in [9,10].
Reference [11] designs the computation offloading strategy under renewable energy supply and the opportunistic-based mobile self-organizing cloudlet offloading strategy.
USV clusters work in a challenging communications environment, and the reliable communication system is a vital factor for them to operate safely and accomplish mission tasks. It needs to be able to ensure the reliability and effectiveness of the communication mechanism under the premise of coping with the ever-changing adverse factors, even in the harsh electromagnetic and network environment.
Therefore, in a dynamic environment, the migration strategy design should be formulated according to the current situation, and the migration strategy should change constantly.
Driven by the latest developments in artificial intelligence, the fog radio access networks are seen as a potential architecture to support IoT services. A joint mode selection and resource management algorithm based on deep reinforcement learning is proposed in [12].
F-RAN is an emerging architecture that takes advantage of edge computing and distributed storage in edge devices. Reference [13] proposes a NOMA-based F-RAN architecture with powerful edge computing capabilities to meet the heterogeneous requirements of mobile vehicular systems.
In a fog computing network, a mobile device can offload its data or computation intensive tasks to a fog node in its vicinity. Based on theoretical analysis, a multi-objective optimization problem is proposed in [14] with the objective of minimizing energy consumption, processing delays and communication costs for each mobile terminal.
The vehicular edge computing network integrates the computing resources of nearby vehicles and provides computing services. A learning-based offloading mechanism is proposed in [15].
However, obtaining an optimal strategy in such a dynamic system is challenging. In addition to immediate rewards, reinforcement learning (RL) also considers long-term goals, which are important for time-varying dynamic systems. Reference [16] proposes an RL-based optimized algorithm to solve the task allocation problem in wireless mobile edge computing networks.
Reference [17] proposes a task offloading algorithm based on deep Q-network. The algorithm can learn to develop an optimal offloading strategy without relying on the prior knowledge of dynamic statistics.
Reference [18] studies computation migration systems, cloud service operators, and cloud resource operators from three aspects: migration decision-making, task access control, and energy-efficient resource management. The research fully considers the problems and challenges that the mobile cloud environment brings to the computation offloading system.
Fog computing is expected to provide low latency computing services at the edge of the network for the IoT systems. Reference [19] proposes a computation task offloading algorithm to simulate the competition between IoT terminals and to distribute the limited computing resources of the fog nodes.
Fog computing can provide latency-sensitive service for terminals and reduce power consumption and traffic congestion. It achieves efficient resources utilization and better performance [20].
Reference [21] performs a strict comparative analysis of the fog computing and the conventional cloud computing in the IoT environment. The results show that the performance of fog computing is superior to traditional cloud computing with the increase in the number of applications requiring real-time services.
Reference [22] proposes a new fog computing model that can alleviate the potential problems of dedicated computing infrastructure and the slow response in cloud computing. The results show that fog computing can greatly improve the performance of the analysis service compared to using only the cloud model.
Reference [23] investigates the joint allocation of radio and computing resources to optimize system performance and improve user satisfaction, and proposes a matching game mechanism to provide a distributed solution for the joint resource management.
We adopt many ideas from related research on resource management, communication, and other aspects in [1,24].
Reference [25] proposes a fog-cloud computing architecture for unmanned aerial vehicles and carries out the corresponding computing performance analysis. A utility-aware data transmission mechanism for delay tolerant networks is proposed in [26], which considers the internal properties of nodes and external contacts.

Computing Architecture
In this section, we will explain several existing problems of marine communication, and discuss the proposed model and expected goals.
During the execution of a marine mission, the USV cluster must be able to perform several important functions, including information distribution, task coordination, team role allocation, dynamic team formation, and interaction with team members or external devices.
The cluster nodes can use wireless communication link for information distribution. They can distribute dynamic mission information to other cooperative nodes (unmanned or not).
However, the marine communication environment has special characteristics (such as large scale, limited bandwidth, and communication delay) and requires the following considerations:
• For most computing tasks, the USV cluster nodes responsible for situational awareness generate a large amount of raw data that needs to be processed, and the processed data is several orders of magnitude smaller than the raw data.
• The computing performance of the remote cloud is much higher than that of each USV cluster node. However, because of the limited communication conditions between the marine cluster and the remote cloud, offloading all raw data to the cloud for calculation would cause a very large transmission delay and is unrealistic.
• Although the computing performance of a cluster node is lower than that of a remote cloud server, considerable computing performance can still be obtained by making full use of the parallel computing performance of the cluster nodes and the communication bandwidth advantage within the cluster coverage.
• In general, for most computing tasks, USVs are energy insensitive.
Therefore, we propose a marine fog-cloud computing architecture and develop an optimized learning-based offloading mechanism to address the problems mentioned above.
The marine fog-cloud computing architecture can provide a promising solution to these problems. In the proposed computing architecture, task offloading to the distributed USV cluster nodes enables the full use of underutilized computing resources to mitigate the communication load, and reduce processing delays.
Generally, the moving USV cluster nodes can be classified into task nodes (TaNs) and computation nodes (CoNs). TaNs generate computation tasks that need to be offloaded to available CoNs of USV fog cluster, or remote cloud, as shown in Figure 2.
Each node of the USV fog cluster has networked structural features such as decentralization and role equality. By optimizing the utilization of fog node resources, the overall performance of the network is optimized. The fog coordinator mainly performs role allocation, task coordination, dynamic team formation, and radio frequency communication between the fog cluster and the remote cloud.
Although marine communication conditions are currently limited, when the fog cluster has few nodes or its computing resources are not particularly rich, the fog-cloud computing architecture can still obtain considerable overall computing performance by fully utilizing the computing resources of the remote cloud.
When the computing resources available to the fog cluster nodes are abundant, the remote cloud is less important to the overall computing performance. Therefore, the fog-cloud computing architecture takes both of these situations into account to achieve acceptable computing benefits.
Vehicular fog computing integrates fog computing and vehicular networks and is expected to provide real-time services for latency-sensitive tasks [27].
In this paper, the proposed task allocation and resource management algorithms of fog-cloud computing architecture are optimized to reduce processing delay and communication load, and take full advantage of the computing performance of the fog-cloud computing architecture.

Marine Fog-Cloud Computing Network Dataflow
With the continuous improvement of unmanned system's autonomy and the increasing complexity of tasks, higher requirements are imposed on the computing performance of USV cluster, especially for latency sensitive tasks.
Many tasks, such as teamwork keeping, resource allocation, and cluster coordination, must be completed at the marine cluster level to meet latency requirements, as shown in Figure 3. Figure 3 also shows the computation offloading delays considered in the proposed computing model, including transmission delay and computation delay.
In addition, because industrial big data is often unstructured, it is pruned and refined by local fog before being sent to the remote cloud [28].
In general, the transmission delay includes the delay in which the raw data is migrated from TaNs to the CoNs, and the delay in returning the processed data. The transmission delay is affected by the amount of data, transmission power, channel state, etc.
The computing delay refers to the delay when the CoNs complete the data computation, which is usually determined by the raw data size and the computing performance or available resources of the CoNs.
However, during task execution, the channel state and the computing performance or available resources of the computation node cannot be known to the task node in advance, when making a computation migration strategy.
On the other hand, the signaling overhead would be very high if each task node requested the parameters of all candidate computation nodes in each time period.
Therefore, we need to develop an effective scheduling mechanism or algorithm that takes full advantage of the computing performance of the fog-cloud computing architecture and reduces processing delay and communication load when making offloading decisions, without additional signaling overhead.
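One family of mechanisms that fits this requirement is bandit learning, in which a task node relies only on the delays it observes after each offloading decision instead of querying candidate nodes. The following minimal sketch uses the classic UCB1 index adapted to delay minimization. It is an illustration under stated assumptions, not the AUCB algorithm proposed in this paper: the class name `UCB1Offloader`, the node labels, the exploration constant, and the simulated Gaussian delays (roughly normalized to the range 0 to 1) are all hypothetical.

```python
import math
import random


class UCB1Offloader:
    """Classic UCB1 adapted to delay minimization (illustrative, not AUCB)."""

    def __init__(self, node_ids, exploration=2.0):
        self.counts = {n: 0 for n in node_ids}        # times each node was chosen
        self.mean_delay = {n: 0.0 for n in node_ids}  # running mean of observed delay
        self.exploration = exploration                # assumed exploration constant
        self.t = 0                                    # number of decisions made so far

    def select(self):
        """Pick the node to offload to in the current time period."""
        self.t += 1
        # Try every candidate once before trusting the estimates.
        for node, count in self.counts.items():
            if count == 0:
                return node

        # Lower index is better: mean delay minus an exploration bonus
        # that shrinks as a node is chosen more often.
        def index(node):
            bonus = math.sqrt(self.exploration * math.log(self.t) / self.counts[node])
            return self.mean_delay[node] - bonus

        return min(self.counts, key=index)

    def update(self, node, observed_delay):
        """Fold the delay observed after offloading into the running mean."""
        self.counts[node] += 1
        self.mean_delay[node] += (observed_delay - self.mean_delay[node]) / self.counts[node]


def simulate(true_mean_delay, rounds=2000, seed=0):
    """Run the learner against hypothetical nodes with noisy, fixed mean delays."""
    rng = random.Random(seed)
    agent = UCB1Offloader(list(true_mean_delay))
    for _ in range(rounds):
        node = agent.select()
        delay = max(0.0, rng.gauss(true_mean_delay[node], 0.05))
        agent.update(node, delay)
    return dict(agent.counts)


if __name__ == "__main__":
    # Hypothetical mean delays (normalized) of three candidate computation nodes.
    print(simulate({"con_a": 0.9, "con_b": 0.4, "con_c": 0.7}))
```

In the simulated run, the node with the lowest mean delay is selected most often, while the other nodes are still sampled occasionally, which is the exploration-exploitation trade-off in miniature.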

Typical Application Scenarios of Marine Fog-Cloud Computing
This section describes typical application scenarios of the marine fog-cloud computing architecture for USV clusters. These application scenarios require low latency of computing services.
The fog-cloud computing architecture, together with efficient algorithm design, can be used to optimize these computing services.

Marine Situational Awareness (MSA)
The USV's MSA systems can obtain sensing information from onboard sensors, or receive cluster information through the communication module, including task collaboration data from teammates, and convert the information into a generic situation picture. This is important for the unmanned system's collaborative autonomy.
Four different levels of requirements should be achieved:
Level 1, object evaluation: completes the fusion and elimination of data from onboard sensors or from external collaborative data sources of heterogeneous nodes, and generates an overall situational map within the task framework.
Level 2, situational assessment: evaluates the fused general contextual picture to identify object attributes within the coverage of the teammates' sensors (e.g., hostile, friendly, neutral) and to prioritize emerging or potential threat-related targets.
Level 3, predictive consciousness: identifies the motion parameters and future trajectories of possible threats and assesses their intentions.
Level 4, process optimization: determines, when information from an MSA system does not meet the operation requirements, what other means can be taken to obtain important information, such as requesting a mission plan from another platform or from a communication network [29].
The MSA system is also capable of collecting and maintaining other information, such as sea states and weather conditions. All of this information is provided to the mission plan to formulate an autonomic strategy that guides the behavior of the USV cluster nodes.
With the emergence of advanced sensors, a large amount of data is generated for situational analysis. USV cluster nodes need to intelligently analyze, process, and convert related data into a general situation picture in the mission framework for collaborative nodes. The processed information can also be distributed to remote monitoring and control centers.

Autonomic Strategy Formulation
Autonomic strategy formulation determines and establishes the goals, mission and objectives of USV cluster. It enables the USV cluster to identify and select the appropriate plan among all available strategies that can achieve its goals.
With the continuous adoption of intelligent algorithms, including reinforcement learning, the autonomy level of USV clusters is constantly improving. At the same time, increasingly complex real-time tasks require the USV cluster to make autonomous decisions immediately in response to changing internal or external environmental requirements. This is extremely important for unmanned systems: in addition to accomplishing tasks better, it enables USV clusters to respond effectively to complex and changing environments, embodying the intelligence of unmanned systems.
Although the computing performance of USV cluster nodes is still relatively low compared with cloud computing, it is continuously improving with the development of information technology.
Especially for latency-sensitive tasks, making full use of the USV cluster's computing resources is an important way to obtain considerable computing performance, particularly when marine communication conditions (bandwidth, channel quality, etc.) between the cluster and the remote cloud are limited.

Dynamic Team Formation
Dynamic team formation accommodates the composition and rebuilding of the USV cluster as required, including teammate identification, team structures, and role allocation within the mission framework.
It is very important for USV clusters to cope with changes to refactor the team organization, including team member loss or failure, new team members joining, current task changes, and other conditions. As conditions change, the structure of the team will be dynamically adjusted and the roles will be re-allocated to form new interactions.
In the process of dynamic team formation of multiple heterogeneous platforms, team members are regrouped according to the requirements of tasks, forming a new organization to complete a separate task, or merging the organizations to form a new joint task team.
The heterogeneous fog cluster can dynamically perform computation migration/offloading, and utilize the dynamic resources of mobile platforms in the network to provide computing services. The marine fog cluster should also allow mobile nodes to join and leave at any time.
The computing task is completed through the cooperation of the mobile platforms, and the dynamic performance of the edge computing service is maintained.
Through the dynamic mobile edge computation migration and offloading strategy, the computing performance of USV fog cluster nodes is optimized under the conditions of limited maritime communication resources. The marine fog-cloud computing architecture reduces unnecessary offloading of computing tasks and communication tasks and maintains good dynamic computing performance.
In summary, a satisfactory computation offloading strategy can fully utilize the heterogeneous, diversified storage and computing resources of USV cluster nodes, and should also consider the mobility of the cluster nodes.

Joint Mission Evaluation
Joint mission evaluation can be used to provide a higher-level operational assessment of large-scale joint operations and diversification of mission implementation.
It should have the ability to integrate multiple mission nodes, multi-dimensional situational information space, operation patterns, etc., to build modular joint mission evaluation space.
After operations are completed, it is not difficult for a USV cluster to evaluate the results of a simple task. However, for large-scale, multi-factor, multi-node operations, the USV cluster needs to assess the degree of task completion comprehensively, based on a large amount of multi-dimensional observation data, and generate task evaluation results.
Usually, mission evaluation should be completed in a short period of time and yield clear evaluation results, so it is also time sensitive.
The task evaluation results are then sent to the remote monitoring center, which reduces communication load, data processing, and communication delay.

Massive Data of Heterogeneous Sensors and Computing-Intensive Tasks
With the continuous improvement of unmanned system's autonomy, and the increasing complexity of tasks, various types of advanced sensors have emerged. At the same time, higher requirements are imposed on the computing performance of USV cluster, especially for latency sensitive tasks.
The following functions should be achieved: multi-sensor data fusion, re-configurability of sensor weighting, adaptability of fault sensors and erroneous data, intelligent heterogeneous data association, etc.

Multi-Sensor Information Fusion
In complex and uncertain marine surface environments, USVs must be able to integrate multi-sensor data to maintain accurate and continuous sensing of the surrounding conditions and to transform sensory data into meaningful information within the task framework.
At the same time, this feature provides the USVs with the adaptability to perform tasks in dynamic, complex situations.

Reconfigurability of Sensor Weighting
This feature refers to the re-configurability of the sensor weight of USVs during task execution. When USV's heterogeneous sensor networks are used for multi-sensor data fusion processing, each sensor onboard may have different weights for different applications and different tasks. The sensor management system must have the ability to dynamically reconfigure sensor weights to achieve good performance of the USV platforms for different mission operations.
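As a minimal sketch of what reconfigurable weighting could look like, the example below fuses scalar readings with a normalized weighted average and switches between two weight profiles. The sensor names, readings, and weight values are hypothetical, and a weighted average is only one possible fusion rule; the text does not specify the fusion method actually used.

```python
def normalize_weights(weights):
    """Rescale non-negative sensor weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one sensor must have a positive weight")
    return {name: w / total for name, w in weights.items()}


def fuse(readings, weights):
    """Weighted average of scalar readings, using only sensors that reported."""
    active = {name: w for name, w in weights.items() if name in readings}
    norm = normalize_weights(active)
    return sum(norm[name] * readings[name] for name in norm)


# Hypothetical range-to-target estimates (metres) from three onboard sensors.
readings = {"radar": 102.0, "lidar": 100.0, "camera": 110.0}

# Weight profile for a long-range tracking task that favours radar.
tracking_profile = {"radar": 0.6, "lidar": 0.3, "camera": 0.1}

# Reconfigured profile, e.g. when the camera is degraded: its weight is set to zero.
degraded_profile = {"radar": 0.5, "lidar": 0.5, "camera": 0.0}

fused_tracking = fuse(readings, tracking_profile)
fused_degraded = fuse(readings, degraded_profile)
```

Keeping the weights in a plain mapping means a new task profile can be swapped in at run time without touching the fusion routine.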

Adaptability of Faulty Sensors and Erroneous Information
In many cases, USVs need to work in a heterogeneous sensor network due to the requirements of different tasks. This is also one of the important parameters to be considered when designing a multi-sensor data fusion system.
For this reason, the sensor management system must adapt dynamically to sensor failures and erroneous data, thereby enhancing the ability of the USV sensor network to cope with changes.

Intelligent Heterogeneous Data Association
In general, multi-sensor data fusion systems based on heterogeneous sensor networks must be able to process data from different sensors simultaneously.
Since the combination of heterogeneous sensors may change during task execution, the data combination also changes. Intelligent heterogeneous data associations must be performed before multi-sensor data fusion and input to the decision-making module of the USVs.

Computation Tasks Offloading
For the above typical application scenarios, we consider a marine vehicular fog-cloud computing system in which USV cluster nodes can be classified into two categories: TaNs and CoNs as shown in Figure 2. TaNs generate computation tasks that need to be offloaded to CoNs of USV fog cluster, or remote cloud.
TaNs can also complete part of the computing tasks themselves if conditions permit. The role of a node as TaN or CoN is not fixed during task execution; it depends on whether the onboard computation resources are shareable and sufficient. CoNs are employed as fog computing nodes to provide computation services.
For each TaN of the fog cluster, the surrounding CoNs taking part in the same marine operation within its communication range Cr can be considered as candidate nodes for computation task allocation.
The TaN can obtain the dynamic information of each available CoN, including USV ID, position, and speed, provided by the automatic identification system (AIS). The computation task is migrated to several of the available CoNs of the fog cluster, or to the remote cloud, according to the task offloading algorithm.
In the proposed framework, task migration decisions are made in a distributed manner. Each TaN makes its computation task migration decisions independently, in order to avoid large additional signaling overhead.
In the above typical application scenarios, we focus on a representative task offloading process in marine operations over a total of T time periods. In each discrete time period t, the TaN generates tasks, makes computation offloading decisions, selects several available CoNs or the remote cloud, performs computation task offloading, and receives the processed results.
Denote the available CoNs set by N(t), and we should note that N(t) may change with time since USV cluster nodes are moving during execution of operations.
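The construction of the candidate set N(t) described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `NodeReport` record, the flat metric coordinate frame, and the `shareable` flag are hypothetical names introduced here, not part of the proposed framework.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Hypothetical record built from one position report of a cluster node."""
    usv_id: str
    x_m: float       # position in an assumed local planar frame, in metres
    y_m: float
    shareable: bool  # assumed flag: onboard computing resources can be shared


def candidate_cons(tan_xy, reports, comm_range_m):
    """Return IDs of nodes that can act as CoNs for a TaN located at tan_xy."""
    tan_x, tan_y = tan_xy
    return [
        r.usv_id
        for r in reports
        if r.shareable
        and math.hypot(r.x_m - tan_x, r.y_m - tan_y) <= comm_range_m
    ]


# Illustrative values only.
reports = [
    NodeReport("usv-4", 300.0, 400.0, True),
    NodeReport("usv-7", 3000.0, 0.0, True),
    NodeReport("usv-9", 100.0, 0.0, False),
]
print(candidate_cons((0.0, 0.0), reports, comm_range_m=1000.0))
```

Re-running the filter in every time period gives a set that grows and shrinks as nodes move, which is the behavior of N(t) that the learning algorithm has to cope with.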

Computation Delay
In the proposed mechanism, the size of the input raw data offloaded from the TaN to CoN n in time period t is denoted by $x_t$ (in bits), and the size of the processed output data fed back to the TaN is denoted by $y_t$ (in bits). The computing performance parameter $\omega_t$ indicates the number of CPU cycles required to process each bit of data, so the total workload in time period t is $x_t \omega_t$ [30]. For each CoN n, the maximum computing capability is denoted by F(n) (in CPU cycles per second). In general, multiple computing tasks may be processed simultaneously, and the computing capability available to the TaN in time period t is denoted by F(t, n). Therefore, the computation delay $D_c(t, n)$ of CoN n is
$$D_c(t, n) = \frac{x_t \omega_t}{F(t, n)} \tag{1}$$
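As a quick numerical check of Equation (1), the following Python sketch divides the workload by the available capability. The function name and the values of the cycles-per-bit parameter and the available capability are illustrative assumptions; only the 2 Mbit input size echoes a value used later in the simulations.

```python
def computation_delay(x_bits, omega_cycles_per_bit, f_cycles_per_s):
    """Computation delay in seconds: workload divided by available capability."""
    if f_cycles_per_s <= 0:
        raise ValueError("available computing capability must be positive")
    return x_bits * omega_cycles_per_bit / f_cycles_per_s


# 2 Mbit input, assumed 1000 cycles per bit, assumed 2 GHz available.
print(computation_delay(2e6, 1000, 2e9))  # 1.0
```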

Transmission Delay
However, in the real systems of USV fog cluster, F(t, n) cannot be known to the TaN in advance.
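The uplink transmission rate between the TaN and CoN n ∈ N(t) is denoted by $R^{(u)}_{t,n}$. The uplink channel state between the TaN and CoN n is denoted by $H^{(u)}_{t,n}$, and the interference power at CoN n is $I^{(u)}_{t,n}$. Denote the transmission power by P, the channel bandwidth by W, and the noise power by $\delta^2$. The uplink transmission rate between the TaN and CoN n is then
$$R^{(u)}_{t,n} = W \log_2\left(1 + \frac{P H^{(u)}_{t,n}}{\delta^2 + I^{(u)}_{t,n}}\right)$$
The downlink transmission rate can be given by
$$R^{(d)}_{t,n} = W \log_2\left(1 + \frac{P H^{(d)}_{t,n}}{\delta^2 + I^{(d)}_{t,n}}\right)$$
where $H^{(d)}_{t,n}$ is the downlink channel state between CoN n and the TaN, and $I^{(d)}_{t,n}$ is the interference power at the TaN. The total transmission delay $D_t(t, n)$ can be written as
$$D_t(t, n) = \frac{x_t}{R^{(u)}_{t,n}} + \frac{y_t}{R^{(d)}_{t,n}}$$
In the real systems of a USV fog cluster, neither $R^{(u)}_{t,n}$ nor $R^{(d)}_{t,n}$ can be known to the TaN in advance.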
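The transmission delay can be illustrated with a short Python sketch. It assumes the standard Shannon-capacity form for both link rates, built from the quantities named above (bandwidth, transmission power, channel state, noise power, interference power); the function names and the numbers are illustrative and are not taken from the simulations.

```python
import math


def shannon_rate(bandwidth_hz, tx_power, channel_gain, noise_power, interference):
    """Achievable rate in bit/s under the assumed Shannon-capacity model."""
    sinr = tx_power * channel_gain / (noise_power + interference)
    return bandwidth_hz * math.log2(1.0 + sinr)


def transmission_delay(x_bits, y_bits, rate_up, rate_down):
    """Uplink time for the input data plus downlink time for the result."""
    return x_bits / rate_up + y_bits / rate_down


# Illustrative values: a signal-to-interference-plus-noise ratio of 3
# over a 1 MHz channel.
rate = shannon_rate(1e6, tx_power=1.0, channel_gain=3.0,
                    noise_power=0.5, interference=0.5)
print(rate)  # 2000000.0
print(transmission_delay(2e6, 2e5, rate, rate))
```

Because channel state and interference change from one time period to the next, a TaN cannot evaluate these expressions ahead of time, which is why the delays have to be learned from observations.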

Offloading Delay
In time period t, the total offloading delay D(t, n) is the sum of the computation delay and the transmission delay:
$$D(t, n) = D_c(t, n) + D_t(t, n)$$
If conditions permit, TaNs can also perform part of the computation themselves. For a node n ∈ A, where A denotes the set of TaNs, no data has to be transmitted, so only the computation delay $D_c(t, n)$ is counted.

Problem Formulation
The TaN formulates a task migration strategy $A_t$ that allocates a computation amount to each selected CoN so as to minimize the offloading delay. For a task migration strategy $A_t$, the offloading delay $D(t, A_t)$ is the maximum of the delays of the available computing nodes:
$$D(t, A_t) = \max\{D(t, 1), D(t, 2), \ldots, D(t, n)\}$$
The problem of minimizing the average offloading delay over the T time periods can therefore be formulated as
$$\text{P1}: \min_{A_1, \ldots, A_T} \frac{1}{T} \sum_{t=1}^{T} D(t, A_t)$$
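The max-based delay of a combination strategy can be made concrete with a small Python sketch. The node labels and delay values are hypothetical, and averaging over periods reflects the stated goal of minimizing the average offloading delay.

```python
def strategy_delay(node_delays):
    """Delay of one strategy: parallel offloading ends when the slowest node ends."""
    if not node_delays:
        raise ValueError("a strategy must use at least one computing node")
    return max(node_delays.values())


def average_delay(per_period_delays):
    """Average offloading delay over the observed time periods."""
    return sum(per_period_delays) / len(per_period_delays)


# Hypothetical per-node delays (seconds) in two consecutive periods.
period_1 = {"cloud": 2.4, "con-4": 1.9, "con-7": 2.1}
period_2 = {"con-4": 2.0, "con-7": 1.8}
delays = [strategy_delay(period_1), strategy_delay(period_2)]
print(delays)
print(average_delay(delays))
```

Since the slowest node dominates, a good strategy tends to give less data to nodes with higher per-bit delay, which is why estimating each node's performance matters.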
If the exact values of relevant parameters, such as computing capability F(t, n), uplink transmission rates R (u) t,n and downlink transmission rates R (d) t,n of all available CoNs, can be known to TaN in advance, it is not difficult to calculate the offloading delay D(t, n).
However, in real systems and marine operations, the relative movement of the USV cluster nodes makes the state and interference of the wireless communication channel change rapidly with time, and the resources of CoN n may be shared by several computing tasks simultaneously. Therefore, the transmission rate and computing capability also change rapidly with time, and these parameters are difficult to predict in advance. Furthermore, if each TaN requested the relevant parameters of all CoNs in each time period t, the signaling overhead between TaN and CoN would be very high.
When making the current offloading decisions, TaNs do not know the performance of the CoN in advance. We design an optimized learning-based computation task offloading mechanism that helps TaNs learn the offloading performance of available CoNs through historical data.

Problem Analysis
We develop an optimized learning-based offloading algorithm which enables the TaN to learn the offloading performance of available CoNs to minimize the average delay. For a given type of computing task, we assume that the input data size may vary while the ratio of the output data size to the input data size stays constant over time. This is a reasonable assumption when the computing task types are the same. Therefore, let $y_t / x_t = \alpha_0$ and $\omega_t = \omega_0$ for ∀t.
Then the delay of offloading one bit of data to CoN n in time period t can be defined as
$$u(t, n) = \frac{1}{R^{(u)}_{t,n}} + \frac{\alpha_0}{R^{(d)}_{t,n}} + \frac{\omega_0}{F(t, n)}$$
which reflects the average offloading performance of CoN n. The offloading delay is then
$$D(t, n) = x_t \, u(t, n)$$
In a real system, the size of the input data $x_t$ is known to the TaN when it makes the offloading decision at time t. However, for ∀n ∈ N(t), it is impossible to know the exact value of u(t, n) or its distribution in advance. The TaN therefore has to learn a relatively accurate estimate.
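The reduction to a per-bit delay can be checked numerically. The Python sketch below assumes that the transmission delay is the uplink time for the input plus the downlink time for the output, and that the computation delay is workload divided by available capability; all parameter values are illustrative.

```python
def per_bit_delay(rate_up, rate_down, f_cycles_per_s, alpha0, omega0):
    """Delay in seconds of offloading one bit: uplink, downlink, and computation."""
    return 1.0 / rate_up + alpha0 / rate_down + omega0 / f_cycles_per_s


def offloading_delay(x_bits, u):
    """Total offloading delay for x_bits of input at per-bit delay u."""
    return x_bits * u


# Illustrative values: 2 Mbit/s links, 2 GHz available, output/input ratio 0.1,
# 1000 CPU cycles per bit.
x = 2e6
u = per_bit_delay(2e6, 2e6, 2e9, alpha0=0.1, omega0=1000)
total = offloading_delay(x, u)

# The same delay computed term by term: computation plus transmission.
d_c = x * 1000 / 2e9
d_t = x / 2e6 + (0.1 * x) / 2e6
print(total, d_c + d_t)
```

The two printed values agree up to floating-point rounding, which shows why learning the single quantity u(t, n) per node is enough once the input size is known.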

Optimized Algorithms
The problem is similar to the MAB problem. In our proposed framework, the TaN can be considered the player, and each available CoN corresponds to an action with an unknown loss distribution. The player then decides which combination of bets should be taken to minimize the average loss.
The main challenge of the classic MAB problem is to balance the trade-off between exploration and exploitation (EE): the player must explore different actions to obtain relatively accurate estimates of each distribution, while exploiting the actions that currently appear best.
Several excellent algorithms have been proposed to solve MAB problems, such as the UCB1 and UCB2 algorithms based on the upper confidence bound (UCB). The MAB mechanism has been adopted in wireless communications to learn unknown environments, including channel access and mobility management [31].
However, several differences from the classic MAB problem need to be addressed. First, we need a combination offloading strategy over USV fog cluster nodes to achieve higher performance, rather than only one best node. Second, the set of available CoNs N(t) changes with time, whereas the number of actions is fixed in classic MAB problems. CoNs may enter or leave the communication coverage of the TaN during task execution, and the computing resources of candidate CoNs may become unavailable, which leads to a dynamic selection space. Therefore, existing solutions need to be adapted to fully and effectively utilize the empirical information of the remaining CoNs.
Finally, in the classic MAB problem, the performance loss is weighted equally in each time period. In the proposed algorithm, we introduce a weighting factor for the input data size $x_t$. It enables the algorithm to exploit more when $x_t$ is high and explore more when $x_t$ is low, which reduces exploration costs and achieves balanced computing performance.
For each computing node, we set the upper and lower computing thresholds of the input data size to $x^+_p$ and $x^-_p$. When $x_t$ is higher than $x^+_p$, the computing capability is saturated. When it is lower than $x^-_p$, the computing capability is excessive.
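One plausible way to turn the two thresholds into a weighting factor is a clamped linear normalization, sketched below in Python. This mapping is an assumption made for illustration; the text does not specify the exact form of the weighting factor, and the function names and threshold values are hypothetical.

```python
def normalized_load(x_bits, x_low, x_high):
    """Map the input size to [0, 1]: 0 at or below x_low, 1 at or above x_high."""
    if x_high <= x_low:
        raise ValueError("upper threshold must exceed lower threshold")
    return min(1.0, max(0.0, (x_bits - x_low) / (x_high - x_low)))


def exploration_weight(x_bits, x_low, x_high):
    """Assumed weight on exploration: large for light loads, zero when saturated."""
    return 1.0 - normalized_load(x_bits, x_low, x_high)


# Hypothetical thresholds of 1 Mbit and 3 Mbit.
for x in (0.5e6, 2e6, 5e6):
    print(x, normalized_load(x, 1e6, 3e6), exploration_weight(x, 1e6, 3e6))
```

With this choice, a lightly loaded period costs little if a poorly known node is tried, so exploration is encouraged, while a heavily loaded period relies on the nodes with the best current estimates.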
Therefore, we propose an AUCB algorithm, as follows (see Algorithm 1).

10: Offload the task to CoNs:
11: $A_t = \arg\min D(t, A_t)$
12: Observe delay $D(t, A_t)$
13: Update $\bar{u}_{t,a_t} \leftarrow \dfrac{\bar{u}_{t-1,a_t} \, k_{t-1,a_t} + u(t, a_t)}{k_{t-1,a_t} + 1}$
14: Update $k_{t,a_t} \leftarrow k_{t-1,a_t} + 1$
15: end if
16: end for

The proposed algorithm considers the appearance time of a new CoN n and the input data size $x_t$. It dynamically adjusts the exploration weight and introduces load-awareness and occurrence-awareness into task offloading. The computation offloading strategy is made on line 11. In the algorithm design, when the system communication characteristics change significantly, the computation offloading count is reset.
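The update steps of the algorithm (running mean of the observed per-bit delay, then the selection count) can be sketched in Python as follows. The selection index is an assumption: it uses a generic confidence-bound form with a load-dependent exploration term and a per-node appearance time, and is not claimed to be the exact rule of Algorithm 1. The class name, the `beta` parameter, and the single-node selection are hypothetical simplifications.

```python
import math


class ConStats:
    """Learning state kept by the TaN for one CoN."""

    def __init__(self, first_seen):
        self.first_seen = first_seen  # period in which the CoN appeared
        self.count = 0                # k: number of times the CoN was selected
        self.mean_u = 0.0             # running mean of the observed per-bit delay

    def update(self, observed_u):
        # Running-mean update followed by the count increment.
        self.mean_u = (self.mean_u * self.count + observed_u) / (self.count + 1)
        self.count += 1


def index(stats, t, load, beta=2.0):
    """Assumed optimistic index: estimated delay minus an exploration bonus."""
    if stats.count == 0:
        return -math.inf  # a never-tried CoN is explored first
    age = max(t - stats.first_seen, 1)
    weight = max(0.0, 1.0 - load)  # load in [0, 1]; heavier load, less exploration
    return stats.mean_u - math.sqrt(beta * weight * math.log(age) / stats.count)


def select(all_stats, t, load):
    """Pick the CoN with the smallest index (single-node simplification)."""
    return min(all_stats, key=lambda name: index(all_stats[name], t, load))


stats = {"con-4": ConStats(0), "con-7": ConStats(0)}
for _ in range(5):
    stats["con-4"].update(1.0)  # delays assumed normalized to a comparable scale
    stats["con-7"].update(2.0)
print(select(stats, t=10, load=1.0))
```

Measuring the age from each node's own appearance time lets a newly arrived CoN be explored without inflating the bonus of nodes that are already well known. In practice the delays would need to be scaled so that the bonus and the mean are of comparable magnitude.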

Performance Analysis
In real systems, a reliable allocation algorithm should keep the performance loss $\rho_T$ stable and acceptable without knowing the relevant parameters in advance. The average delay of the optimal solution within the T time periods is denoted by $D^*$.
Similarly, the percentage of performance loss is denoted by $\delta_T$. In the next section, we design several simulations to evaluate the performance of the proposed AUCB algorithm under both heavy and light input data loads, which vary across time.
The idea of bandit algorithms is to measure how much regret a choice brings; the less regret, the better. In the MAB problem, the indicator used to judge an algorithm is its cumulative regret, or performance loss. By running the selected algorithms for the same number of rounds, we can compare their cumulative regret growth rates. A better algorithm shows slower cumulative regret growth than the others.
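The regret-based comparison can be illustrated with a short Python sketch. It assumes that the regret of one period is the achieved delay minus the delay of the optimal strategy in that period, and that the relative loss is the gap between average delays divided by the optimal average; these definitions, the function names, and the numbers are assumptions made for illustration.

```python
from itertools import accumulate


def cumulative_regret(delays, optimal_delays):
    """Running sum of per-period gaps between achieved and optimal delay."""
    return list(accumulate(d - o for d, o in zip(delays, optimal_delays)))


def relative_loss(delays, optimal_delays):
    """Gap between average achieved and average optimal delay, as a fraction."""
    avg = sum(delays) / len(delays)
    avg_opt = sum(optimal_delays) / len(optimal_delays)
    return (avg - avg_opt) / avg_opt


# Hypothetical delays (seconds) of a learning algorithm and of the optimum.
achieved = [3.0, 2.0, 2.5]
optimal = [2.0, 2.0, 2.0]
print(cumulative_regret(achieved, optimal))  # [1.0, 1.0, 1.5]
print(relative_loss(achieved, optimal))      # 0.25
```

Plotting the cumulative regret of several algorithms over the same number of periods gives the growth-rate comparison described above.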

Simulations
We design corresponding simulations to evaluate the offloading delay and performance loss of the proposed algorithm. The corresponding parameters are set to reasonable, simplified values to ease the calculation.
In our simulations, $x_t$ and u(t, n) vary across time in a natural way from an initial value. In the series of $u(t_0, n)$, the first entry is the cloud node and the others are fog cluster nodes. Next, we analyze the performance of the proposed algorithm from multiple perspectives and assess the impact of changes in the main parameters. The main parameter values used in the simulations are shown in Table 1. Table 2 shows the performance settings of the CoNs and the available or unavailable time of their computing resources.
Figure 4a shows that the input data size varies across time in a natural way from an initial value of 2 Mbits. For most of the time, the data input load is heavy. Figure 4b shows the performance of the selected algorithms under diverse resource available times of CoNs. In our simulations, we set up different scenes to evaluate the effect of available time and unavailable time.
In the first epoch, two CoNs indexed by 4 and 7 are available, while in the next epoch the remote cloud node is unavailable. The results show that the proposed algorithm learns the task offloading performance of newly appeared CoNs more quickly, and effectively utilizes the performance of the remaining CoNs when one node is unavailable. It reduces performance loss by about 70% compared to the UCB1 algorithm. The average task offloading delay is shown in Figure 4c, where the average performance of AUCB converges to the optimal offloading performance faster than UCB1 and the other selected algorithms.

Performance of Selected Algorithms under Comprehensive Conditions
Figure 5a also shows that the input data size varies across time, but for most of the time the data input load is light. Figure 5b shows the performance of the selected algorithms under comprehensive conditions without any changes of cluster nodes. Under this condition, the proposed algorithm and UCB1 achieve similar results, but when the data input load rises, the proposed algorithm immediately outperforms UCB1. The simulations show that the proposed algorithm achieves better performance than the other algorithms and reaches near-optimal values faster, especially under heavy data load conditions.

Figure 6a shows that different settings of the computing thresholds may affect the performance of the proposed algorithm at the beginning of the simulation, but as time progresses the difference becomes negligible. Figure 6b shows that the proposed algorithm achieves near-optimal computation offloading performance. At the end of the simulation period T, its performance loss is less than 8% of the optimal delay, which is superior to the other selected algorithms. The simulation results show that the proposed AUCB algorithm can still effectively balance the trade-off between exploration and exploitation under both heavy and light input data load conditions. The algorithm performance is relatively good compared to the optimal solution and provides a bounded deviation.

Performance Loss Comparison of Computing Thresholds Settings
By introducing the resource available or unavailable time of CoNs and the normalized input data size, the AUCB algorithm is both load-aware and occurrence-aware. The computing performance is also relatively good.

Conclusions
In this work, we propose a marine fog-cloud computing architecture for USV clusters, and study the computation task offloading problem in the marine vehicular computing system architecture. We develop an optimized learning-based computation task offloading mechanism based on classic MAB theory. It enables vehicles to learn the potential offloading performance of their neighboring team nodes to minimize average computation task offloading delay. The simulation results show that computing performance of proposed AUCB algorithm is relatively good under both heavy and light input data load conditions. Next, we will build a more detailed model or framework, and the corresponding simulation environment, to support the optimization of the proposed algorithm in order to obtain better performance. At that stage, the simulation platform will satisfy more complex tasks' requirements, while retaining the further expansion and exploration capabilities.