Multi-Granularity Mission Negotiation for a Decentralized Remote Sensing Satellite Cluster

Abstract: Satellite remote sensing is developing towards micro-satellite clusters, which brings new challenges to mission assignment and planning for the cluster. A multi-agent system (MAS) is commonly used, but the time delay caused by communication and computation is rarely considered. To solve this problem, a neural-network-based multi-granularity negotiation method under a decentralized architecture is proposed. Firstly, we divide negotiation into three levels of granularity, which work in different modes. Secondly, a neural network is trained to help the satellite select the best level in real time. Through experiments, we compared satellites working at each of the three levels of granularity with satellites using the multi-granularity decision. As a result of our experiments, a lower cost-effectiveness ratio was obtained, which indicates that the multi-granularity negotiation method proposed in this paper is practical.


Introduction
Satellite remote sensing aims to obtain information from the Earth's surface. It has been widely used in geography, Earth science, meteorology, military, etc. [1]. In traditional imaging missions, a single low Earth orbit satellite is often used to take images of multiple targets multiple times. However, the increasing number of missions and the higher time resolution requirements call for more satellite members [2]. Besides, different missions require different kinds of payload, but multiple payloads can hardly be carried by one satellite. Furthermore, multiple payloads can hardly work together at the same time in the narrow imaging time window. Therefore, current remote sensing missions rely more and more on large satellite clusters, which brings new challenges to mission assignment and planning [3].
A multi-agent system (MAS) is the main solution to the distributed autonomy problem for multiple satellites. Agent technology was first used in DeepSpace-1 (DS-1) [4,5]. Although the MAS applied in DS-1 served a single-satellite system, after its success MAS started to be widely used in satellite cluster task planning. Campbell [6][7][8] summarized the work of predecessors and defined four different working modes of a satellite cluster: traditional mode, top-down mode, centralized mode, and distributed mode. The traditional mode and the top-down mode were used in the early satellite clusters. For instance, He [9] proposed an edge computing framework including a central node and several edge nodes, using a constructive heuristic algorithm based on the density of residual tasks. Although these algorithms have achieved good results, the central node (ground station or central master satellite) bears a heavy workload.
Current research mainly focuses on the latter two modes: the centralized mode and the distributed mode. Cheng [10] developed a satellite cluster negotiation and mission assignment method by using a contract net in the centralized mode. He proposed three negotiation models: acquaintance's trust-based announcing bidding, adaptive bidding with swarm intelligence, and a multi-attribute decision-based fuzzy evaluation bidding method. Wang [11] discussed the cooperative multicenter problem in satellite imaging scheduling and proposed a cooperative co-evolutionary algorithm, and then proposed a novel fixed-length binary encoding mechanism for mission assignment. Iacopino [12] designed a self-organizing multi-agent structure belonging to the distributed mode. The system can adapt to changes in the problem and can synchronize the satellites' plans in order to avoid duplication. In order to improve cluster consistency, Zhao [13] designed a distributed time consensus protocol for multi-agent systems with general linear dynamics on a directed graph. Buzzi [14] proposed an agent-based framework to study the self-organization of the satellite cluster, which can establish a temporary alliance to change the modularity of the system.
Since a remote sensing mission is usually divided into multiple stages (discovery, identification, confirmation, tracking, and monitoring) and one stage can further be divided into multiple steps (collection, transmission, processing, etc.), a micro-satellite is not able to finish a mission all by itself. Current research has begun to focus on the cooperation of satellite clusters toward finishing one complex mission instead of merely using one satellite. Globus [15] used multiple variants of the genetic algorithm, hill climbing, simulated annealing, squeaky wheel optimization, and iteration to realize the coordination of multiple satellites and multiple observations of the same target. Araguz [16] designed autonomous operational schemes for Earth-observing swarms of nano-satellites, in which all satellites cooperate to observe multiple targets. Gallud [17] presented an agent-based simulation framework in which satellites could work together to perform a set of observational tasks. His framework supported allocating targets to satellites and collective sensing of one target by multiple satellites. He [18] considered three important sections of planning (stored sequence of tasks, action sequence, and the satellite status) to deal with scientific event discovery, satellite faults, cloud obscuration, and emergencies.
The main problem of the remote sensing mission lies in the conflict between resource consumption and mission accomplishment. Researchers have considered many factors. Nag Sreeja [19] proposed a modular framework that combines orbital mechanics, attitude control, and scheduling optimization for CubeSats. Wang [20] discussed the impact of clouds in the case of joint probabilities and established a sample average approximation (SAA) model accordingly. Wang [21] also proposed a branch and cut algorithm based on lazy constraint generation. Although these kinds of limited onboard resources have been considered, the influence of time delay on negotiation has received less attention. The time delay is mainly caused by limited communication ability and computational resources. When a satellite keeps tracking a target, it has to perform a continuous attitude maneuver, which affects the pointing accuracy of its antenna and decreases the communication quality. Besides, tracking a target produces a large number of images, the processing of which is heavy work for a single micro-satellite. A common approach is to transmit those images to other micro-satellites. However, image transmission occupies a large amount of communication bandwidth, so less bandwidth is left for negotiation. Lastly, image processing itself is computationally expensive, so a micro-satellite can hardly negotiate with others while processing images. In conclusion, a large time delay may delay the negotiation and may finally lead to negotiation failure.
In this paper, we propose a neural-network-based multi-granularity negotiation method under a decentralized architecture, which belongs to the distributed mode defined in [6][7][8]. Firstly, we propose a multi-granularity negotiation model. The negotiation process is divided into three levels of granularity: low-level granularity, middle-level granularity, and high-level granularity. In each granularity, satellites transmit different kinds of information and use different negotiation processes and assignment algorithms. Secondly, we establish a neural network to intelligently select the appropriate level of granularity. The neural network outputs the success probabilities of negotiation in each level of granularity. A simple decision model based on these probabilities is then used to select the level in real time.
The remainder of the paper is organized as follows. In Section 2, the modeling process of the multi-granularity negotiation method is introduced considering three aspects: mission preprocesses for all levels of granularity, multi-granularity partitioning of the multi-agent negotiation model, and intelligent real-time level selection of negotiation granularity. In Section 3, experiments to verify this method in different conditions are described. The proposed method is discussed in Section 4. Finally, the conclusions of the study are given in Section 5.

Proposed Methods
To finish negotiation with limited communication resources, a multi-granularity negotiation method was developed. Firstly, the traditional multi-agent negotiation model is divided into three levels of granularity: low-level granularity, middle-level granularity, and high-level granularity. Mission selection algorithms were designed for each level of granularity. A fixed consistency algorithm is used to realize the decentralization of the satellite cluster. After that, we designed different negotiation processes and consistency mechanisms to simplify the negotiation process at each granularity. Secondly, samples with different target numbers, bandwidths, and transmission hop numbers were collected to train a neural network. Finally, the best neural network was established to select the best granularity onboard. The flowchart of the multi-granularity negotiation method proposed in this paper is illustrated in Figure 1.

Time Window Calculation
Suppose that there are $n$ targets. The set of all targets is defined as $T = \{t_1, t_2, \ldots, t_n\}$, where $t_i$ holds the information of the $i$th target, including latitude, longitude, height, and motion characteristics. The motion characteristics comprise course, angular velocity, speed, and acceleration, any of which may be unknown. Calculating the time window means finding the mission start time $T^S_i$ and the mission end time $T^E_i$ such that, for all $t \in [T^S_i, T^E_i]$, two constraints are satisfied: the optical visibility constraint and the geometric visibility constraint. The optical visibility constraint for the $i$th target, $C^o_i(t)$, is expressed in terms of $R_i$, the position vector of $t_i$ in the inertial coordinate system (obtained by integrating the target's motion over time), and $R_{Sun}$, the position vector of the Sun in the inertial coordinate system. The geometric visibility constraint for the $i$th target, $C^g_i(t)$, is expressed in terms of $\alpha$, the angle between the target's position vector $R_i$ and its projection $\hat{R}_i$ onto the orbital plane of the satellite, and $\alpha_{max}$, the maximum value of $\alpha$, which depends on $R$, the satellite's position vector in the inertial coordinate system. $\theta_{max}$ in Equation (3) is the maximum attitude maneuver angle of the satellite, and the desired attitude maneuver angle is denoted by $\theta$.
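The window search itself can be illustrated independently of the exact constraint expressions. The following is a minimal sketch, assuming the two visibility constraints are available as boolean functions of time and are evaluated on a discrete time grid; the names `find_time_window`, `optical_ok`, and `geometric_ok` are hypothetical and do not come from the paper.

```python
from typing import Callable, Optional, Sequence, Tuple


def find_time_window(
    times: Sequence[float],
    optical_ok: Callable[[float], bool],
    geometric_ok: Callable[[float], bool],
) -> Optional[Tuple[float, float]]:
    """Return (start, end) of the first contiguous run of sample times at
    which both visibility constraints hold, or None if no such time exists.

    Assumption: the constraints are sampled on a grid; the true window
    boundaries lie within one grid step of the returned values.
    """
    start = None
    end = None
    for t in times:
        if optical_ok(t) and geometric_ok(t):
            if start is None:
                start = t  # first visible sample opens the window
            end = t        # extend the window to the latest visible sample
        elif start is not None:
            break          # visibility lost: the first window is complete
    if start is None:
        return None
    return (start, end)
```

A finer time grid narrows the uncertainty on $T^S_i$ and $T^E_i$ at the price of more constraint evaluations, which matters when the onboard computer is also processing images.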

Profit Calculation
One remote sensing mission can be divided into five stages: discovery, identification, confirmation, tracking, and monitoring. In this paper, a different initial profit $p^{ini}_j$ is defined as a constant for each stage of the mission, as shown in Table 1. The mission profit $P_{i,j}$ is expressed in terms of $p^{ini}_j$ and two coefficients, where $i$ is the index of the target and $j$ is the index of the stage, with $j \in \mathbb{N}^+$, $1 \le j \le 5$. $A^t_i$ is a constant coefficient that varies with the type of the target, and $A^p_i$ is a constant coefficient that varies with the position of the target. In this model, the profit therefore depends only on the target and the mission stage, not on the satellite that executes the mission.

Cost Calculation
Mission cost consists of three parts: time cost, memory cost, and power cost. The time cost $c^t_i$ is defined in terms of $T^S_i$, the start time of the time window of target $i$. The mission duration $t^d_{i,j}$ corresponds to the $j$th stage of the mission for the $i$th target. The memory cost per second is treated as a constant determined by the camera, written as MEM, so the memory cost $c^m$ changes with the mission duration:

$$c^m = \mathrm{MEM} \cdot t^d_{i,j}.$$

Similarly, the average satellite power is treated as a constant, POW, and the power cost $c^p$ also changes with the mission duration:

$$c^p = \mathrm{POW} \cdot t^d_{i,j}.$$

The total cost for the $i$th target during its $j$th stage is the sum of these three costs, i.e., $c_{i,j} = c^t + c^m + c^p$.
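As a small worked illustration of the cost model, the sketch below sums the three cost terms for one mission stage. It assumes that the memory and power costs scale linearly with the mission duration (a constant rate multiplied by the duration) and takes the time cost as a precomputed input, since its exact expression is not reproduced here. The function and parameter names are hypothetical.

```python
def mission_cost(time_cost: float, duration: float,
                 mem_rate: float, power_rate: float) -> float:
    """Total cost c_{i,j} = c^t + c^m + c^p for one mission stage.

    Assumptions (not confirmed by the paper's equations):
      - memory cost = mem_rate (MEM, per second) * duration
      - power cost  = power_rate (POW, average power) * duration
      - all three terms are already expressed in comparable units
    """
    memory_cost = mem_rate * duration
    power_cost = power_rate * duration
    return time_cost + memory_cost + power_cost


# Example: time cost 10, duration 20 s, MEM = 0.5 per second, POW = 2.0
print(mission_cost(10.0, 20.0, 0.5, 2.0))  # prints 60.0
```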

Multi-Granularity Partition of the Multi-Agent Negotiation Model
The framework in this paper is decentralized. In blockchain technology, there are several consistency algorithms, including RAFT [22], PBFT [23], DPOS [24], RIPPLE [24], POW [25], POS [26], etc. We applied RAFT, PBFT, and RIPPLE in this paper. In RAFT, there are three roles: leader, candidate, and follower. At the start, all nodes in the network are followers. Then some nodes become candidates and initiate a vote. After voting, only one leader is elected, which manages the network by synchronizing logs and stays connected to all followers through heartbeats. Followers only replicate the log. Unlike RAFT, PBFT introduces a result verification mechanism. It can tolerate Byzantine faults by using the phases REQUEST, PRE-PREPARE, PREPARE, COMMIT, and REPLY. In RIPPLE, more than one leader is elected. The consensus process is divided into two parts: consistency within the leader group and global consistency.
There are three roles of satellites set in this system: leader node (Leader), temporary leader node (TLeader), and follower node (Follower). The Leader is elected to organize the negotiation. The Leader periodically processes the mission requests from Follower and releases negotiation results to Followers. The Leader establishes a leader group consisting of itself and TLeaders. They firstly reach an agreement in the group by group vote. After being verified by TLeaders the negotiation result can be sent to Followers. Followers originate missions themselves and negotiate with each other under the Leader's control.

Low-Level Granularity
In low-level granularity, the satellites are allowed to negotiate only once to save communication resources. The negotiation process consists of the following steps, shown in Figure 2 (the TLeaders' assistance process is ignored in this granularity and is introduced in high-level granularity):

• Step 1: Preprocessing the missions. The Follower first preprocesses the missions by calculating the time window, profit, and cost.
• Step 2: Select a set of desired mission stages. The desired mission stages are selected by the decision algorithm in the Follower.
• Step 3: Transmit the mission set. The set of desired mission stages $MS_{desire}$ and their costs are broadcast, and the Leader records the mission set.
• Step 4: Leader process. The Leader finishes the mission assignment using a greedy algorithm. According to the profit calculation model, the profit of each mission stage is the same for every satellite. That means each mission stage is assigned to the satellite with the lowest cost.
• Step 5: Announce temporary result. The Leader announces a temporary result to all Followers.
• Step 7: Result release. The Leader records the selections from the Followers who vote in favor. Then the Leader releases the final negotiation result.
• Step 8: Execute. The Follower checks the result again, then picks the intersection of the set of assigned missions $MS_{actual}$ and the set of desired mission stages $MS_{desire}$, and then executes the mission stages in the intersection.
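The Leader's greedy assignment in Step 4 can be sketched as follows. This is a minimal illustration, assuming each Follower broadcasts a cost for every mission stage it desires and that ties are broken in favor of the first bidder seen; the data layout and the name `greedy_assign` are hypothetical.

```python
from typing import Dict, Hashable, Tuple

StageKey = Tuple[int, int]  # (target index i, stage index j)


def greedy_assign(
    bids: Dict[Hashable, Dict[StageKey, float]]
) -> Dict[StageKey, Hashable]:
    """Assign every requested mission stage to the satellite with the lowest cost.

    bids maps a satellite ID to {(target, stage): cost}. Because the profit of a
    stage is the same for every satellite, minimizing cost per stage is enough.
    Assumption: on equal cost, the satellite encountered first keeps the stage.
    """
    best: Dict[StageKey, Tuple[Hashable, float]] = {}
    for satellite, stages in bids.items():
        for stage_key, cost in stages.items():
            if stage_key not in best or cost < best[stage_key][1]:
                best[stage_key] = (satellite, cost)
    return {stage_key: sat for stage_key, (sat, _) in best.items()}


bids = {
    "sat1": {(1, 1): 5.0, (2, 1): 3.0},
    "sat2": {(1, 1): 4.0},
}
print(greedy_assign(bids))  # {(1, 1): 'sat2', (2, 1): 'sat1'}
```

In Step 8, each Follower would then execute only the stages that appear both in this assignment and in its own desired set.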
Under this process, each Follower needs to decide its desired mission set by itself. This becomes a group knapsack problem with a time constraint, in which $s_{i,j}$ is the mission selection result: $s_{i,j} = 1$ if the $j$th stage of the $i$th target is selected; otherwise $s_{i,j} = 0$. Both $i$ and $i'$ are target IDs in the selected mission set. Dynamic programming is chosen to solve the problem. In the low-level granularity, the communication resource is the most significant factor. Because of its low consumption, the RAFT algorithm was selected as the consistency algorithm at this level. The algorithm's parameters are adapted for use in space. The modifications include:

• The heartbeat interval is longer. Due to the time-varying characteristics of satellite networks, the connection may be unstable. Frequent heartbeat broadcasts may waste bandwidth. Additionally, network delay may cause the connection between the Leader and the Followers to be misjudged. A longer interval gives a larger tolerance.
• The election interval is longer. In RAFT, an election happens when the Leader is unreachable, so the election interval is tied to the heartbeat interval. Therefore, the election interval is longer too.
• The negotiation result is not released in a heartbeat message. In RAFT, a longer heartbeat interval would delay the mission response. Thus, the negotiation result release packet is designed to be independent of the heartbeat. It is usually transmitted periodically, and it also supports emergency transmission to improve real-time performance for emergency missions.
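The Follower's mission selection mentioned above (a group knapsack problem with a time constraint, solved by dynamic programming) can be illustrated with the sketch below. It is one plausible reading rather than the paper's exact formulation: it assumes at most one stage is chosen per target, each target has a single time window, the windows of selected targets must not overlap, costs are non-negative integers, and the objective is the total profit under a cost budget. All names are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List


def select_stages(targets: List[Dict], budget: int) -> float:
    """Maximum total profit of a feasible selection.

    Each target is a dict with 'start', 'end' (time window) and 'stages',
    a list of (profit, integer_cost) options; at most one option per target.
    Selected windows must not overlap and the total cost must be <= budget.
    """
    order = sorted(targets, key=lambda t: t["end"])
    ends = [t["end"] for t in order]
    # f[i][c]: best profit using the first i targets (by end time), cost <= c
    f = [[0.0] * (budget + 1) for _ in range(len(order) + 1)]
    for i, tgt in enumerate(order, start=1):
        # p: number of earlier targets whose window ends before this one starts
        p = bisect_right(ends, tgt["start"], 0, i - 1)
        for c in range(budget + 1):
            best = f[i - 1][c]  # option 1: skip this target
            for profit, cost in tgt["stages"]:
                if cost <= c:   # option 2: take one stage of this target
                    best = max(best, f[p][c - cost] + profit)
            f[i][c] = best
    return f[len(order)][budget]


targets = [
    {"start": 0, "end": 10, "stages": [(5.0, 3)]},
    {"start": 5, "end": 15, "stages": [(6.0, 3)]},             # overlaps the first
    {"start": 10, "end": 20, "stages": [(4.0, 2), (7.0, 5)]},
]
print(select_stages(targets, budget=5))  # prints 9.0
```

Dropping the overlap test (using `f[i - 1]` in place of `f[p]`) gives the plain group knapsack, which is the form the selection takes once the time constraint is relaxed.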

Middle-Level Granularity Model
Middle-level granularity corresponding to a normal communication environment is designed to balance bandwidth usage and the negotiation effect.
Similar to low-level granularity, the satellites are divided into one Leader and several Followers. The process can be divided into the following steps shown in Figure 3:

• Step 1: Preprocessing the missions. Same as Step 1 in low-level granularity.
• Step 2: Select a set of mission stages. Unlike Step 2 in low-level granularity, the set of mission stages contains both desired and candidate mission stages.
• Step 3: Transmit the mission set. Unlike Step 3 in low-level granularity, the broadcast mission stages include their execution costs and time windows, and the Leader records the set.
• Step 4: Leader process. Same as Step 4 in low-level granularity.
• Step 5: Announce temporary result. Same as Step 5 in low-level granularity.
• Step 6: Result check and vote. The Followers working in the middle-level granularity are responsible for their own mission sets. The assigned missions $MS_{actual}$ must all be in the submitted mission set. The total cost of the mission set must be less than the satellite's capability $c_{max}$. In addition, the assigned mission set must contain enough missions. The Follower votes in favor if the above constraints are satisfied; otherwise, it votes against. The vote result is denoted $V$.
• Step 7: Vote process. If all Followers vote in favor or the number of negotiation failures is too high, the Leader stops the negotiation and goes to Step 8. If not, the Leader adjusts the set of mission stages, still using a greedy algorithm, and then goes back to Step 5.
• Step 8: Result release. The Leader releases the final negotiation result.
• Step 9: Execute. Same as Step 8 in low-level granularity.
In this process, the Follower also chooses some spare mission stages. Thus, the time constraint no longer applies, and the selection becomes a group knapsack problem.

Dynamic programming is still chosen to solve the problem. In the middle-level granularity, the RAFT algorithm is still valid. The usage is the same as the consistency algorithm in low-level granularity.
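The Follower's check-and-vote rule in Step 6 can be sketched as a single predicate. This is a minimal illustration under stated assumptions: "enough missions" is modeled as a hypothetical lower bound `min_missions`, and a boolean return value stands in for the vote result (in favor or against). The function and parameter names are not from the paper.

```python
from typing import Dict, Hashable, Iterable


def follower_vote(
    assigned: Iterable[Hashable],
    submitted: Iterable[Hashable],
    costs: Dict[Hashable, float],
    c_max: float,
    min_missions: int,
) -> bool:
    """Return True to vote in favor of the temporary result, False to vote against.

    Checks, in order:
      1. every assigned mission stage was in the Follower's submitted set;
      2. the total cost of the assigned stages is less than c_max;
      3. at least min_missions stages were assigned (assumed threshold).
    """
    assigned = list(assigned)
    if not set(assigned) <= set(submitted):
        return False
    total_cost = sum(costs[m] for m in assigned)
    if not total_cost < c_max:
        return False
    return len(assigned) >= min_missions


costs = {"a": 2.0, "b": 3.0, "c": 4.0}
print(follower_vote(["a", "b"], ["a", "b", "c"], costs, c_max=6.0, min_missions=2))  # True
```

The Leader would collect these votes in Step 7 and either release the result or adjust the assignment and announce it again.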

High-Level Granularity Model
High-level granularity is designed to obtain a better negotiation effect in a good communication environment. Unlike the other two levels of granularity, the satellites are divided into a Leader group and several Followers, and the Leader group consists of a Leader and TLeaders. The process can be divided into the following steps, shown in Figure 4:

• Step 1: Apply to join the Leader group. There is one Leader in the system, which acts as the Leader for all levels of granularity. In order to increase fault tolerance, a Follower with a direct connection (only one hop to the Leader) applies to join the Leader group. After being authorized by the Leader, it becomes a TLeader.
• Step 2: Preprocessing the missions. Same as before.
• Step 3: Select a set of mission stages. The set of mission stages, which contains all the supported mission stages, is selected by the decision algorithm in the Followers and TLeaders.
• Step 4: Transmit the mission set. These mission stages are broadcast, including their execution costs and time windows, and both the Leader and the TLeaders record the set.
• Step 5: Leader process. The Leader and TLeaders assign missions using a greedy algorithm, considering the extra constraint of the maximum cost of each Follower.
• Step 6: Announce temporary result. The Leader first announces a temporary result in the Leader group, and the group votes on it. The temporary result is then announced if it is confirmed by most TLeaders. If not, they go back to Step 5 to process again.
• Step 7: Result check and vote. Same as before.
• Step 8: Vote process. If all Followers vote in favor or the number of negotiation failures is too high, the Leader stops the negotiation and goes to Step 9. If not, the Leader adjusts the set of mission stages, still using a greedy algorithm, and then goes back to Step 6.
• Step 9: Result release. As in Step 6, the Leader first releases the final negotiation result in the leader group. After being confirmed, the negotiation result is released to all nodes.
In this process, the nodes choose as many mission stages as possible: all mission stages that satisfy the cost-effectiveness ratio requirement are chosen. In the high-level granularity, the RIPPLE and PBFT algorithms are combined to improve system stability and negotiation accuracy. The modifications include:
• Leader group member selection method. In the RIPPLE algorithm, the Leader group members are selected from all Followers. Considering the communication environment, all Followers satisfying a condition (a direct, one-hop connection to the Leader, as in Step 1) should apply to join the group.
• The PBFT verification mechanism is introduced into the leader group. The PBFT verification mechanism can effectively guard against node failure or disturbance. This is well adapted to satellites because of their limited computation resources. Verification in the leader group reduces the possibility of releasing erroneous information. In other words, the leader group sacrifices its communication bandwidth and computational resources for other nodes working in lower granularity.

Figure 4. High-level granularity negotiation process.
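The confirmation loop inside the leader group (Steps 5 and 6) can be sketched as follows. The sketch takes "confirmed by most TLeaders" literally as a strict majority, which is an assumption; a full PBFT quorum rule is stricter and is not reproduced here. The callables `propose` and `collect_votes` and the round limit `max_rounds` are hypothetical placeholders for the Leader's assignment step and the TLeaders' verification.

```python
from typing import Callable, Optional, Sequence, TypeVar

Result = TypeVar("Result")


def majority_confirms(votes: Sequence[bool]) -> bool:
    """True if strictly more than half of the TLeader votes are in favor."""
    return sum(votes) * 2 > len(votes)


def negotiate_in_leader_group(
    propose: Callable[[int], Result],
    collect_votes: Callable[[Result], Sequence[bool]],
    max_rounds: int = 3,
) -> Optional[Result]:
    """Repeat 'assign (Step 5) then group vote (Step 6)' until confirmed.

    Returns the confirmed temporary result, or None if no proposal is
    confirmed within max_rounds (an assumed stopping rule).
    """
    for round_index in range(max_rounds):
        result = propose(round_index)
        if majority_confirms(collect_votes(result)):
            return result
    return None
```

Only a confirmed result leaves the leader group, which is how the group spends its own bandwidth and computation to shield the Followers from erroneous releases.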

Real-Time Level Selection of Negotiation Granularity
The satellite needs to select an appropriate level of granularity. Selecting a higher level brings large data transmissions and complex negotiation processes, which may cause negotiation failure. Selecting a lower level can improve the success rate of negotiation but may yield poorer negotiation results.
Level selection is also difficult because the selections of the satellites interact and are unknown to each other. We collected a large number of samples relating level selection to negotiation effectiveness. A neural network was trained to describe the relationship between the communication environment and the success probabilities of negotiation in each level of granularity. Thus, the satellite can select a level onboard in real time based on these probabilities.
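A decision rule on top of the predicted probabilities could look like the sketch below. The paper only states that a simple decision based on the probabilities is used, so the specific rule here is an assumption: pick the highest level whose predicted success probability reaches a threshold, and otherwise fall back to the level with the highest probability. The threshold value and the function name are hypothetical.

```python
def select_level(p_low: float, p_mid: float, p_high: float,
                 threshold: float = 0.8) -> str:
    """Choose a negotiation granularity from predicted success probabilities.

    Assumed rule: prefer the highest level that is likely enough to succeed,
    since higher levels give better negotiation results when they complete.
    """
    candidates = [("high", p_high), ("middle", p_mid), ("low", p_low)]
    for name, probability in candidates:
        if probability >= threshold:
            return name
    # No level reaches the threshold: take the most likely one.
    return max(candidates, key=lambda item: item[1])[0]


print(select_level(p_low=0.95, p_mid=0.85, p_high=0.60))  # prints middle
```

Such a rule needs only a few comparisons, so the onboard cost of level selection is dominated by evaluating the neural network.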

Neural Network Description
The transmission of negotiation information is related to the network status and the size of the information, which are selected as the input parameters of the neural network. In this paper, the network status is described by two parameters: the remaining communication bandwidth and the hop number to the Leader. The size of the information is described by the number of targets $n$. The outputs of the neural network are the success probabilities of negotiation in each level of granularity, recorded as $P_L$, $P_M$, and $P_H$.
Suppose there are $N_L$ layers, all fully connected. The $k$th layer has $N^k_n$ nodes, all connected to the nodes of the $(k-1)$th layer with parameters $w_k$ and $b_k$. The activation function is optimized during training. The mean squared error (MSE) was chosen to evaluate network performance; it is computed from $y_i$, the vector of real values from the sample, and $\hat{y}_i$, the vector of predicted values output by the neural network.
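The structure described above can be made concrete with a small pure-Python sketch of a fully connected forward pass and the MSE. The weights in the example are arbitrary placeholders, not trained values. The activation names `radbas` and `purelin` follow the MATLAB-style names used later in the paper, with `radbas(x) = exp(-x^2)` and `purelin(x) = x`. Averaging the squared error over all output elements is an assumed normalization.

```python
import math
from typing import Callable, List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float], Callable[[float], float]]


def radbas(x: float) -> float:
    """Radial basis activation: exp(-x^2)."""
    return math.exp(-x * x)


def purelin(x: float) -> float:
    """Linear (identity) activation."""
    return x


def forward(inputs: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Propagate inputs through fully connected layers (weights, biases, activation)."""
    out = list(inputs)
    for weights, biases, activation in layers:
        out = [activation(sum(w * x for w, x in zip(row, out)) + b)
               for row, b in zip(weights, biases)]
    return out


def mse(targets: Sequence[Sequence[float]],
        predictions: Sequence[Sequence[float]]) -> float:
    """Mean squared error averaged over every output element of every sample."""
    squared = [(y - y_hat) ** 2
               for t, p in zip(targets, predictions)
               for y, y_hat in zip(t, p)]
    return sum(squared) / len(squared)


# Inputs: remaining bandwidth, hop number to the Leader, number of targets.
# Placeholder network: 3 inputs -> 2 radbas nodes -> 3 purelin outputs (P_L, P_M, P_H).
layers = [
    ([[0.1, -0.2, 0.05], [0.3, 0.1, -0.1]], [0.0, 0.1], radbas),
    ([[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]], [0.0, 0.0, 0.0], purelin),
]
probabilities = forward([0.4, 2.0, 8.0], layers)
```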

Neural Network Training
The neural network training is divided into two steps: finding the best activation function and determining the neural network structure. In the first step, a two-layer model is used as the initial neural network structure, and the output layer's activation function is set to purelin. Fourteen activation functions and different numbers of hidden-layer nodes are traversed to find the best three activation functions. In the second step, a three-layer structure is used, and the number of nodes in each layer is traversed over 4, 6, 8, 16, 32, 64, 128, and 256. By comparing the MSE, the best structure of the neural network is found.
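The second training step amounts to an exhaustive search over activation functions and node counts. The sketch below shows only the search loop; the scoring function `evaluate`, which would train a network and return its MSE, is a hypothetical callable supplied by the caller, and the number of hidden layers is left as a parameter because the exact layer count being traversed is an assumption.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

NODE_COUNTS = [4, 6, 8, 16, 32, 64, 128, 256]  # node numbers traversed in the paper

Candidate = Tuple[float, str, Tuple[int, ...]]  # (mse, activation, nodes per layer)


def best_structure(
    activations: Sequence[str],
    hidden_layers: int,
    evaluate: Callable[[str, Tuple[int, ...]], float],
) -> Optional[Candidate]:
    """Return the (mse, activation, nodes) combination with the lowest MSE.

    Ties keep the first combination encountered.
    """
    best: Optional[Candidate] = None
    for activation in activations:
        for nodes in product(NODE_COUNTS, repeat=hidden_layers):
            score = evaluate(activation, nodes)
            if best is None or score < best[0]:
                best = (score, activation, nodes)
    return best
```

With three activation functions and two traversed layers this loop evaluates 3 × 8 × 8 = 192 candidates, which is small enough to run offline on the ground before the selected network is uploaded.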

Targets and Mission Status
We randomly generated a set of targets and missions. The target's initial position was between 2°N and 2°S. The target's motion was also random, including a course angle $\psi_i$ between −180° and 180° and a velocity $v_i$ between 0 and 30 knots. The status of the mission, defined by the numbers shown in Table 2, was random, between 0 and 5.

Table 2. Definition of the mission status.
Status  Definition
0       not discovered
1       discovered
2       identified
3       confirmed
4       tracking
5       monitoring

Finally, we obtained the set of targets and mission information shown in Table 3; the rest of them are listed in Appendix A.

Table 3. Targets and mission information.
Initial position     Course angle  Velocity (knots)  Mission status
148.65°E, 1.62°N     −51°          8.6               0
···                  ···           ···               ···
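The random scenario generation described in this subsection can be sketched with the standard library. The latitude, course, velocity, and status ranges follow the text; the longitude range is not stated in the paper, so it is exposed as a hypothetical parameter, and uniform sampling is an assumption.

```python
import random
from typing import Dict, Tuple


def random_target(rng: random.Random,
                  lon_range: Tuple[float, float] = (-180.0, 180.0)) -> Dict[str, float]:
    """Draw one target with a random initial position, motion, and mission status.

    lat_deg:     between 2 deg S (-2) and 2 deg N (+2)
    lon_deg:     assumed range, not given in the paper
    course_deg:  course angle between -180 and 180 degrees
    speed_knots: velocity between 0 and 30 knots
    status:      integer mission status between 0 and 5 (see Table 2)
    """
    return {
        "lat_deg": rng.uniform(-2.0, 2.0),
        "lon_deg": rng.uniform(*lon_range),
        "course_deg": rng.uniform(-180.0, 180.0),
        "speed_knots": rng.uniform(0.0, 30.0),
        "status": rng.randint(0, 5),
    }


rng = random.Random(0)  # fixed seed so a scenario can be regenerated
targets = [random_target(rng) for _ in range(8)]
```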

Imaging Satellite
Satellite parameters are shown in Table 4. There are ten isomorphic satellites in the scenario, with a phase interval of −5°. At the beginning of the simulation, mission 1 was assigned to satellite 4. Therefore, satellite 4 executed a tracking mission stage, which affects the negotiation.

Sample Collection
We traversed the three levels of granularity for satellite 6 to satellite 10, the bandwidth usage values shown in Table 5, and the target numbers shown in Table 6.

Table 5. Simulation value of bandwidth usage.
Index            1    2    3    4    5    6    7
Bandwidth usage  60%  65%  70%  75%  80%  85%  90%

We collected $3^5 \times 7 \times 8 = 13{,}608$ samples. Negotiation results can be obtained from the Leader node and converted into connectivity probabilities of each level of granularity as the output of the neural network. The remaining communication bandwidth $Bw$ and the number of targets $n$ are also inputs of the simulation. The hop number to the Leader node of each satellite, $Hn$, can be read from the simulation result.
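The sample count follows directly from the traversal: three levels for each of five satellites, seven bandwidth usage values, and eight target numbers. A short check, with hypothetical names, enumerates the level combinations explicitly:

```python
from itertools import product

LEVELS = ("low", "middle", "high")


def count_samples(n_satellites: int, n_bandwidths: int, n_target_counts: int) -> int:
    """Number of simulation runs when every satellite tries every level."""
    level_combinations = sum(1 for _ in product(LEVELS, repeat=n_satellites))
    return level_combinations * n_bandwidths * n_target_counts


print(count_samples(n_satellites=5, n_bandwidths=7, n_target_counts=8))  # prints 13608
```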

Training Result
We tested 14 different activation functions (elliotsig, hardlim, hardlims, logsig, netinv, poslin, purelin, radbas, radbasn, satlin, satlins, softmax, tansig, and tribas) for the first layer. The activation function of the second layer was purelin. After training, the error distributions of the best ten neural networks for the first step in Section 2.3.2 are shown in Figure 5, and their parameters are shown in Table 7.
The best three activation functions (radbas, softmax, satlin) were identified. The convergence process of the best neural network is shown in Figure 6.
Then we used these three activation functions and the three-layer structure to find out the best neural network. After training, the best ten neural networks' distributions of error for the second step were recorded for Figure 7, and their parameters are shown in Table 8.

Experimental Results
All parameters in Tables 5 and 6 were chosen to verify the neural-network-based multi-granularity negotiation method. We collected and analyzed the profits and costs based on negotiation results. The profit shown in Figure 9 and the cost-effectiveness ratio shown in Figure 10 are most relevant.

Discussion
Onboard resources are limited, which means that the resources of a satellite cluster, such as communication and computation resources, may face conflicting demands from the mission and from negotiation. This easily causes negotiation failure. Therefore, we need a method to balance the gains from negotiation against the negotiation delay. We proposed a method in which the negotiations are divided into three levels of granularity with different working modes. As the results show, these three levels of granularity perform differently, which means this division is effective and can be used in different situations. Moreover, the satellite selects the best level onboard in real time and achieves better performance than each of the three levels. Additionally, the satellite can select the level onboard without many resources. The cost-effectiveness ratio of the satellite cluster is increased. It is worth noting that the three levels of granularity were designed by the authors based on the authors' task planning algorithm. That means the number and mode of the granularity levels may be different for other task planning algorithms.
We used machine learning to train a neural network. As shown in the training result, the MSE of the neural network reaches 0.002, which is not very good for the training itself; this may be caused by the small sample set of 112 kinds of samples. Those kinds of samples are the statistical result of the 13,608 samples. However, this error is satisfactory in application, because the probability estimate does not require high accuracy. Nonlinear fitting and small-sample learning may obtain better results, but may not change the selection of granularity.

Conclusions
Limited onboard communication and computational resources are significant factors for negotiation, but they have not received enough attention. Cooperation usually exists in a micro-satellite cluster, which means the effect of the time delay caused by communication and computation cannot be neglected.
In this paper, a neural-network-based multi-granularity negotiation method under a decentralized architecture is proposed and shown to be effective. The advantages of the proposed method are summarized as follows: 1. The three levels of granularity we defined work for different situations. We combine the advantages of the levels and use the best level of granularity according to the situation, which brings about better profits and a higher cost-effectiveness ratio. 2. Complex situation analysis for granularity selection also brings about time delay; therefore, a neural network is trained for granularity selection in real time. 3. The framework of the satellites is decentralized, which means it is suitable for a large satellite cluster containing faulty nodes and malicious nodes.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A