DNN-Based Computation Offloading for LEO Satellite Edge Computing

Abstract: Large low earth orbit (LEO) satellite networks can achieve global coverage with low latency. In addition, mobile edge computing (MEC) servers can be mounted on LEO satellites to provide computation offloading services for users in remote areas. We model a multi-user multi-task system and formulate the joint problem of users' offloading decisions and bandwidth allocation as a mixed integer programming problem that minimizes the system utility function, defined as the weighted sum of system energy consumption and delay. Because this problem cannot be solved effectively by general optimization methods, we propose a deep learning-based offloading algorithm for LEO satellite edge computing networks that generates offloading decisions through multiple parallel deep neural networks (DNNs) and stores newly generated optimal offloading decisions in memory to improve all DNNs, yielding near-optimal offloading decisions. Moreover, we theoretically derive the optimal bandwidth allocation scheme for the users' bandwidth allocation problem. The simulation results show that the proposed algorithm converges within a small number of training steps, obtains the best system utility function values among the compared algorithms under different system parameters, and incurs a very satisfactory time cost for both the system and the DNNs.


Introduction
As an important research direction for future 6G communication, low earth orbit (LEO) satellites have wide coverage and can serve ground users anywhere on the earth [1][2][3]. With the rapid development of communication technology, massive amounts of data must be processed, and users have different requirements for different tasks in different scenarios; for example, some tasks demand ultra-low processing latency [4].
However, the limited computing power of the users alone is not enough to handle these computing tasks. In this context, mobile edge computing (MEC) [5] has received a lot of attention. By placing edge servers with strong computing power close to the users, the users can offload computing tasks to nearby edge servers for computation and thereby obtain a better quality of experience (QoE).
Full coverage is an important goal for future 6G communication. However, providing computing services in remote and rural areas is challenging due to the lack of communication infrastructure. Moreover, in the event of natural disasters, the ground communication infrastructure is easily damaged, and the ground network cannot provide computing services for users. Although LEO satellites can achieve full coverage of ground users, the native computing power on LEO satellites is limited, whereas MEC servers have powerful computing capabilities and can respond quickly to received tasks. Therefore, placing MEC servers on LEO satellites enhances the satellites' computing power. In such special scenarios, MEC servers are deployed on the LEO satellites, and users offload tasks to the LEO satellites to obtain lower delay and energy consumption [6,7]. With the rapid development of artificial intelligence, many learning algorithms have been used in MEC networks to generate offloading decisions and allocate resources [8,9]. Therefore, we consider applying deep learning to the LEO satellite edge computing system.
The main contributions of this paper are as follows:
• We design a LEO satellite edge computing system in which a LEO satellite equipped with an MEC server serves ground users; the effect of free-space loss on the channel gain is also considered in the simulation.
• We investigate a multi-user multi-task LEO satellite edge computing network. For the users' offloading decision problem, we use multiple parallel deep neural networks (DNNs) to generate candidate offloading decisions. For the users' bandwidth allocation problem, we derive the optimal bandwidth allocation scheme of the system.
• We conduct extensive simulation experiments. The results show the reliability and performance superiority of the proposed algorithm: compared with the other algorithms, it obtains the best system utility function values, achieves a good convergence effect, and has a low average time cost.

Related Works
In recent years, satellite edge computing and artificial intelligence technologies have received considerable attention. In this section, we review the relevant literature.
Li et al. [10] propose the "LEO-MEC" scenario, that is, placing edge servers on LEO satellites. The authors mainly consider two problems: the service request scheduling problem, which affects the resource utilization of the system, and the service placement problem, which is coupled with the scheduling decisions. The authors use the OPTI toolbox to solve this problem and achieve better performance than the contrasting algorithms.
Tang et al. [11] propose a structure that combines satellite edge computing with cloud computing, formulate the optimization problem as one of minimizing system energy consumption, and use a distributed algorithm to obtain satisfactory system energy consumption.
Gao et al. [12] apply virtual network function (VNF) technology to the scenario of satellite edge computing, define the VNF placement problem as a game problem, and propose an algorithm based on game theory to solve the placement problem, so that the entire system can obtain maximum benefit.
Zhang et al. [13] study the allocation of computing and communication resources for computing tasks in satellite networks. For this complex mixed integer programming problem, they propose a low-complexity algorithm based on game theory and many-to-one matching theory, which greatly reduces the task execution delay.
Because the Space-Air-Ground Integrated Network (SAGIN) can provide full coverage on a global scale and manage resources efficiently, much of the literature considers combining SAGIN with edge computing.
Yu et al. [14] introduce edge computing-enabled SAGINs (EC-SAGINs), which can provide Internet of Vehicles services to users in remote areas in special extreme scenarios. In addition, they propose a pre-classification scheme and a deep imitation learning (DIL)-based decision algorithm that enables the satellite to complete tasks as quickly as possible with minimum resource usage.
Cheng et al. [15] study the task offloading scheme of ground users in the SAGIN architecture. As the environment in SAGIN is constantly changing, they propose a reinforcement learning-based task scheduling algorithm that adapts to the changing environment. Through reasonable resource allocation, a better user QoE is obtained, and the convergence of the system is also improved.
Mao et al. [16] propose an aerospace-assisted hybrid cloud-edge computing framework that minimizes the maximum computing delay among end-users through joint optimization of the system's communication and computing resources. Under the premise of guaranteed convergence, they propose an alternating optimization algorithm that achieves better convergence and lower computing delay.
As an important technology in the field of artificial intelligence, deep learning has received increasing attention. It has been applied to many different fields with considerable success [17,18]. In wireless communication, deep learning also solves many computational problems [19], such as resource allocation [20,21], signal detection [22,23], interference alignment [24], and caching [25].
Some recent studies have applied deep reinforcement learning (DRL) to MEC systems to solve the computation offloading and resource allocation problems and have made notable progress. In [26], Chen et al. use DRL to solve the joint optimization of computation offloading and resource allocation in MEC systems. In [27], Seid et al. propose a DRL-based model-free collaborative computation offloading and resource allocation scheme that achieves optimal system performance.
In deep learning, the deep neural network (DNN) is a classical and effective network structure [28]; the biggest difference between it and other neural network structures is that its layers are fully connected. For the shortage of edge computing resources, Dong et al. [29] propose an algorithm based on adaptive DNN partitioning, which obtains the best overall cost. To address the limited resources of servers in edge computing systems, Tang et al. [30] jointly optimize DNN partitioning and resource allocation and obtain the desired efficiency through an iterative alternating optimization algorithm. To solve the offloading decision and bandwidth allocation problems in edge computing systems, Yang et al. [31] propose a distributed DNN offloading algorithm (DDOA) that generates offloading decisions through multiple parallel DNNs and jointly optimizes the weighted sum of delay and energy consumption. For the bandwidth allocation problem, the DDOA algorithm divides the bandwidth using the orthogonal frequency division multiple access (OFDMA) technique, and the simulation results show its performance superiority.
The DDOA algorithm is one of the comparative algorithms in our paper. The differences between the algorithm proposed in this paper and the DDOA algorithm are as follows. The DDOA algorithm considers a mobile fog computing scenario in the ground network, while this paper considers placing an MEC server on a LEO satellite to provide computing services for users anywhere on the ground. Although channel gain is mentioned in the DDOA algorithm, the specific channel condition is not given; in this paper, the free-space loss of the satellite-ground link is considered. In the DDOA algorithm, each user's tasks are executed sequentially, so each user's total task processing delay is the sum of the processing delays of all tasks. In this paper, by contrast, local execution and offloading proceed in parallel: while some tasks are processed locally, other tasks can be offloaded to the LEO satellite for processing. Therefore, each user's total task processing delay is the maximum of the user's total local processing delay and total edge processing delay. In the DDOA algorithm, bandwidth is divided using the orthogonal frequency division multiple access (OFDMA) technique, and the bandwidth allocated to each user is proportional to the size of the offloaded tasks. In this paper, we take into account the task sizes, the generated offloading decisions, the channel gains, the transmit powers, and the noise power, and derive the optimal bandwidth allocation scheme using the Lagrangian multiplier method.
In order to compare the related works with each other and with the current work, we summarize them in Table 1.

Table 1. Comparison of related works.

Reference | Method | Advantages | Disadvantages
[10] | OPTI toolbox | Optimal objective value | High toolbox requirements
[11] | Alternating direction method of multipliers | Reduced energy consumption | Complexity is not low
[12] | Game approach | High network payoff | Unstable
[13] | Game-theoretic approach and matching theory | Low weighted-sum latency | Lacks energy consumption consideration
[14] | Deep imitation learning | Low task completion time | Low accuracy
[15] | Deep reinforcement learning | Low cost | High energy consumption
[16] | Alternating optimization | Low computing delay | Lacks consideration of optimal bandwidth allocation

Compared with the above references, the proposed algorithm jointly optimizes the system delay and energy consumption, generates near-optimal offloading decisions through DNNs, derives the optimal bandwidth allocation scheme through the Lagrangian multiplier method, and has a stable convergence effect and a low time cost. Compared with the comparative algorithms, the proposed algorithm obtains the best system utility function values.

System Model
This paper considers a LEO satellite edge computing system consisting of N users and an MEC server placed on the LEO satellite, as shown in Figure 1. We assume that each user has M independent computing tasks, each of which can be executed locally or offloaded to the LEO satellite equipped with the MEC server for computation. We denote the offloading decision as x_{nm} ∈ {0, 1}, where x_{nm} = 1 indicates that user n offloads computing task m to the MEC server on the LEO satellite for execution, and x_{nm} = 0 indicates that user n's task m is executed locally.

Local Execution
When the users choose to execute the computing tasks locally, we denote the size of the m-th computing task generated by user n as l_{nm} and the energy consumption required by the users to process each bit as e_{local}. Therefore, the energy consumption of user n locally executing task m is given by:

$$E^{local}_{nm} = e_{local}\, l_{nm} \qquad (1)$$

We denote the processing time of each data bit processed locally by user n as t_{local}. Therefore, the delay for user n to process task m locally is:

$$T^{local}_{nm} = t_{local}\, l_{nm} \qquad (2)$$

Therefore, the total delay for user n to process its tasks locally is:

$$T^{local}_{n} = \sum_{m=1}^{M} (1 - x_{nm})\, t_{local}\, l_{nm} \qquad (3)$$
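To make the local-execution model concrete, the following minimal Python sketch evaluates (1)-(3) for one user. The function name and the per-bit time constant `T_LOCAL` are illustrative placeholders of ours; only the 3.25 × 10^{-7} J/bit energy value appears later in the simulation section.

```python
import numpy as np

E_LOCAL = 3.25e-7   # J/bit, local energy per bit (value from the Simulation Results section)
T_LOCAL = 1e-7      # s/bit, assumed local processing time per bit (illustrative)

def local_cost(l, x):
    """Local energy, Equation (1), and total local delay, Equation (3), for one user.

    l : (M,) array of task sizes in bits
    x : (M,) binary offloading decisions (0 = local, 1 = offload)
    """
    l_loc = (1 - x) * l                 # bits processed on the device
    energy = E_LOCAL * l_loc.sum()      # sum of E_nm^local over local tasks
    delay = T_LOCAL * l_loc.sum()       # local tasks run sequentially, Equation (3)
    return energy, delay

# Example: three 20 Mbit tasks, the second one offloaded
print(local_cost(np.array([2e7, 2e7, 2e7]), np.array([0, 1, 0])))
```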

Edge Execution
When the users choose edge execution, they offload the computing tasks to the LEO satellite, and the tasks are processed on the MEC server. Limited by the size of the solar panels, the available energy of LEO satellites is a very precious resource. Therefore, we consider the energy consumption of the MEC server on the LEO satellite when processing the tasks [32][33][34][35][36]. Because the results returned by the MEC server are small, we do not consider the energy consumption and delay of returning the computation results to the users [37,38].
We divide the total energy consumption of ground users offloading tasks to the MEC server on the LEO satellite into two parts: the energy consumption E^{trans}_{nm} of transmitting the task to the MEC server, and the energy consumption of the MEC server processing the task, which is modeled as a linear function of the task size l_{nm} with per-bit coefficient e_{sat}. The total energy consumption of offloading user n's task m to the MEC server is:

$$E^{edge}_{nm} = E^{trans}_{nm} + \varphi\, e_{sat}\, l_{nm} \qquad (4)$$

where φ is the weight of the MEC server energy consumption.
In this system, we define p_n as the transmit power of the n-th user when offloading tasks. The transmission rate of user n follows from the Shannon formula:

$$r_n = b_n \log_2\!\left(1 + \frac{h_n p_n}{\sigma}\right)$$

where b_n is the bandwidth allocated to user n, h_n is the channel gain of the channel occupied by the n-th user, and σ is the noise power. We assume that the satellite-to-ground link loss in this system is mainly free-space loss.
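As a quick illustration of the rate model, the helper below evaluates the Shannon formula directly; the function name is ours.

```python
import numpy as np

def transmission_rate(b_n, h_n, p_n, sigma):
    """Uplink rate r_n = b_n * log2(1 + h_n * p_n / sigma).

    b_n   : bandwidth allocated to user n (Hz)
    h_n   : linear channel gain (dominated by free-space loss)
    p_n   : transmit power (W)
    sigma : noise power (W)
    """
    return b_n * np.log2(1.0 + h_n * p_n / sigma)
```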
The transmission delay for user n to offload task m to the MEC server is as follows:

$$t^{trans}_{nm} = \frac{l_{nm}}{r_n} \qquad (5)$$

so the corresponding transmission energy is E^{trans}_{nm} = p_n t^{trans}_{nm}. In addition, the processing delay of the MEC server is given by:

$$t^{sat}_{nm} = \frac{l_{nm}}{f_{sat}} \qquad (6)$$

We assume that there is one MEC server on the LEO satellite, and the processing rate of the MEC server is f_{sat}.
The total delay when user n offloads computing tasks to the MEC server on the LEO satellite for execution is:

$$T^{edge}_{n} = \sum_{m=1}^{M} x_{nm}\left( t^{trans}_{nm} + t^{sat}_{nm} \right) \qquad (7)$$

Simultaneous task transmission and MEC task execution are not considered in this paper because the focus is on the offloading decision and bandwidth allocation problems [39][40][41][42][43]. In addition, in LEO satellite edge computing, if simultaneous task transmission and MEC task execution were considered, the MEC servers would need to process more tasks in one time slot and would consume more energy per unit time, placing a great burden on the resource-constrained MEC servers on LEO satellites.
In this paper, we assume that all computing tasks that need to be offloaded can be finished during the LEO satellite coverage time.

Problem Formulation
We define the system utility function J as the weighted sum of the energy consumption and delay required to process the tasks:

$$J = \sum_{n=1}^{N} \left[ \phi \sum_{m=1}^{M} \left( (1-x_{nm})\, E^{local}_{nm} + x_{nm}\, E^{edge}_{nm} \right) + \max\!\left( T^{local}_{n},\, T^{edge}_{n} \right) \right] \qquad (8)$$

where ϕ represents the weight between the energy consumption and delay in the system utility function. By setting the parameter ϕ, we can express the trade-off between delay and energy consumption more intuitively. The optimization objective of the DDOA algorithm [31] is the Delay-Energy Weighted Sum (DEWS) metric ∇ = ω E + (1 − ω) T, where ω ∈ (0, 1) is a weighting factor, E is the total energy consumption, and T is the total delay. In this paper, apart from taking each user's total task processing delay as the maximum of the user's total local processing delay and total edge processing delay, the relationship between J and ∇ is J = ∇/(1 − ω), and the relation between ϕ and ω is ϕ = ω/(1 − ω). If the optimization objective J in this paper is multiplied by (1 − ω), it becomes the DEWS objective proposed in [31].
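The sketch below evaluates the utility (8) for a whole decision matrix. It is a minimal reference implementation under our reconstruction of the model (per-bit MEC energy coefficient `e_sat` and the per-user max of local and edge delays), not the authors' code.

```python
import numpy as np

def system_utility(l, x, b, h, p, sigma, phi, varphi, e_local, t_local, e_sat, f_sat):
    """Utility J of Equation (8): weighted energy plus per-user max(local, edge) delay.

    l, x    : (N, M) task sizes in bits and binary offloading decisions
    b, h, p : (N,) per-user bandwidth (Hz), channel gain, transmit power (W)
    """
    off = (x * l).sum(axis=1)                      # offloaded bits per user
    loc = ((1 - x) * l).sum(axis=1)                # locally processed bits per user
    r = b * np.log2(1.0 + h * p / sigma)           # uplink rates
    t_trans = np.divide(off, r, out=np.zeros_like(off, dtype=float), where=r > 0)
    E = e_local * loc + p * t_trans + varphi * e_sat * off   # local + transmit + MEC energy
    T = np.maximum(t_local * loc, t_trans + off / f_sat)     # parallel local/edge execution
    return float(np.sum(phi * E + T))
```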
We formulate the joint optimization of task energy consumption and delay as the problem of minimizing the system utility function:

$$(\mathrm{P1}): \quad \min_{\{x_{nm}\},\, \{b_n\}} J \qquad (9)$$

subject to

$$\sum_{n=1}^{N} b_n \le B \qquad (9a)$$
$$b_n \ge 0, \quad \forall n \qquad (9b)$$
$$x_{nm} \in \{0, 1\}, \quad \forall n, m \qquad (9c)$$

Constraint (9a) states that the total uplink bandwidth allocated to all users cannot exceed the maximum bandwidth B. Constraint (9b) indicates that the bandwidth b_n allocated to each user is non-negative. Constraint (9c) is the binary constraint on the offloading decisions x_{nm}.
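For very small instances (here N = M = 3, so 2^9 = 512 candidates), (P1) can be solved exactly by enumeration, which is a useful correctness check for the learned decisions. The sketch below assumes a `utility_fn` such as the `system_utility` helper above and some bandwidth rule `bandwidth_fn`; the exponential blow-up in N × M is exactly why the paper resorts to DNNs.

```python
import itertools
import numpy as np

def brute_force_p1(l, utility_fn, bandwidth_fn):
    """Enumerate all 2^(N*M) offloading decisions and keep the best (tiny N, M only).

    l            : (N, M) task sizes
    utility_fn   : (x, b) -> system utility J
    bandwidth_fn : x -> (N,) bandwidth allocation satisfying (9a)-(9b)
    """
    N, M = l.shape
    best_x, best_J = None, np.inf
    for bits in itertools.product((0, 1), repeat=N * M):
        x = np.array(bits).reshape(N, M)
        b = bandwidth_fn(x)
        J = utility_fn(x, b)
        if J < best_J:
            best_x, best_J = x.copy(), J
    return best_x, best_J
```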

Proposed Algorithm
For the mixed integer programming problem (P1) with a nonlinear objective function, we use a deep learning method. The structure of the proposed algorithm is shown in Figure 2. In general, we input the computing tasks into the DNNs, generate corresponding candidate offloading decisions through the DNNs, and allocate bandwidth to each user through the derived optimal bandwidth allocation scheme. The system utility function values are then calculated from the resulting offloading decisions and bandwidth allocation, and the offloading decisions corresponding to the best system utility function value are output as the optimal offloading decisions of this iteration. Finally, the optimal offloading decisions and the sizes of the computing tasks are stored in memory for training the DNNs.

The algorithm consists of two processes that execute alternately: the generation of the offloading decisions and the update of the offloading strategy. The generation of offloading decisions relies on the DNNs: they take the sizes of the computing tasks as input and output the corresponding candidate offloading decisions. If the offloading decisions generated in a time slot place all tasks in local computation, the bandwidth allocation algorithm is not executed, and b_1, b_2, ..., b_N are set to 0. If the offloading decisions generated in a time slot include edge offloading, the system allocates bandwidth to each transmission channel through our bandwidth allocation algorithm. We then calculate the system utility function values, and the system selects the offloading decisions with the smallest system utility function value as the output.
In the offloading strategy update phase, the input computing task sizes and the output offloading decisions are stored in the memory structure, and training samples are drawn from the memory to train the DNNs. As these two processes iterate, the performance of the DNNs gradually improves.

The specific process of our algorithm is as follows. For the deep learning process, we take the computing task sizes l as the input of the DNNs and output the candidate binary offloading decisions. There are d DNNs, so d candidate binary offloading decisions are generated. Our purpose is to design an offloading strategy function π_d: for the d-th DNN, once the sizes l of all computing tasks of the N users are input, it quickly generates the corresponding candidate offloading decisions x_d. The strategy is expressed as:

$$x_d = \pi_d(l) \qquad (10)$$

When the corresponding offloading decisions have been obtained through the DNNs, the optimization problem (P1) reduces to the optimal bandwidth allocation problem (P2) for each user. Problem (P2) is formulated as:

$$(\mathrm{P2}): \quad \min_{\{b_n\}} J \big|_{x = x_d} \qquad (11)$$

subject to (9a) and (9b). For the bandwidth allocation problem, the constraints of (P2) are only related to the bandwidth allocation of the system. When the bandwidth constraints are satisfied, the system utility function value depends on the bandwidth only through the transmission delay of user n offloading its task m to the edge server, so (P2) becomes the optimization problem (P3):

$$(\mathrm{P3}): \quad \min_{\{b_n\}} \sum_{n=1}^{N} \frac{l_{nm}}{r_n} x_{nm} \qquad (12)$$

subject to (9a) and (9b). In this paper, the number of users is N and each user has M different tasks; we assume that each user generates one computing task per time slot, and each iteration consists of M time slots, so all tasks are processed within one iteration. Applying the Lagrangian multiplier method to the N computing tasks generated in the m-th time slot, the Lagrangian function of problem (P3) can be expressed as:

$$L(b, \lambda) = \sum_{n=1}^{N} \frac{l_{nm}\, x_{nm}}{b_n \log_2(1 + h_n p_n/\sigma)} + \lambda \left( \sum_{n=1}^{N} b_n - B \right) \qquad (13)$$

In the above formula, l_{nm} is the size of the m-th computing task generated by the n-th user, x_{nm} is the offloading decision for that task, b_n is the bandwidth allocated to the n-th user, and r_n is the transmission rate of the n-th user. To obtain the optimal solution of problem (P3), the KKT conditions must be satisfied:

$$\frac{\partial L}{\partial b_n} = -\frac{l_{nm}\, x_{nm}}{b_n^2 \log_2(1 + h_n p_n/\sigma)} + \lambda = 0, \quad \lambda \left( \sum_{n=1}^{N} b_n - B \right) = 0, \quad \lambda \ge 0 \qquad (14)$$

By solving the KKT conditions (14), we obtain the optimal bandwidth allocation for each user in the system:

$$b_n^* = \frac{\sqrt{ l_{nm}\, x_{nm} / \log_2(1 + h_n p_n/\sigma) }}{\sum_{k=1}^{N} \sqrt{ l_{km}\, x_{km} / \log_2(1 + h_k p_k/\sigma) }}\; B \qquad (15)$$

The specific derivation of (15) can be found in Appendix A. Thus far, we have obtained the optimal bandwidth allocation scheme given the offloading decisions.
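A direct implementation of the closed-form allocation (15) follows. It is a sketch of the derived square-root-proportional rule, with an all-local guard as described in the algorithm structure; the function name is ours.

```python
import numpy as np

def optimal_bandwidth(l_m, x_m, h, p, sigma, B):
    """Closed-form solution (15) of (P3): b_n* ∝ sqrt(l_nm * x_nm / log2(1 + h_n p_n / sigma)).

    l_m, x_m : (N,) sizes and decisions of the N tasks generated in time slot m
    h, p     : (N,) channel gains and transmit powers; B: total uplink bandwidth (Hz)
    """
    g = np.log2(1.0 + h * p / sigma)   # per-user spectral efficiency
    root = np.sqrt(l_m * x_m / g)
    if root.sum() == 0:                # all tasks local in this slot: b_n = 0 for all n
        return np.zeros_like(root)
    return B * root / root.sum()       # meets sum(b_n) = B with b_n >= 0
```

One can verify the first-order condition numerically: perturbing any b_n away from (15) while keeping the total at B increases the objective of (P3).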
During the DNN update process, the system uses a gradient descent algorithm to minimize the average cross-entropy loss function and update the parameters θ_d of the d DNNs. Algorithm 1 summarizes the proposed algorithm for computation offloading and bandwidth allocation in LEO satellite edge computing.
Specifically, the input of our algorithm is the computing task sizes of the i-th iteration, and the output is the optimal offloading decisions of the i-th iteration. First, the d DNNs are initialized with random parameters θ_d and the memory structure is emptied. Then, we input the computing task sizes l_i of the i-th iteration into the d DNNs to generate d candidate offloading decisions x_d, allocate bandwidth to each user by the optimal bandwidth allocation scheme (15), and calculate the system utility function values J_d using the candidate offloading decisions and the optimal bandwidth allocation scheme. From the d system utility function values, we select the candidate offloading decisions corresponding to the smallest value as the optimal offloading decisions of the i-th iteration and output x*_i. Finally, we store the computing task sizes l_i and the optimal offloading decisions x*_i of the i-th iteration in the memory to train the DNNs and improve the system performance.

Algorithm 1: Computation offloading and bandwidth allocation for LEO satellite edge computing
1: Input: computing task sizes l_i; Output: optimal offloading decisions x*_i
2: Initialize the d DNNs with random parameters θ_d and empty the memory structure
3: for each iteration i do
4: Input l_i into the d DNNs
5: Generate d candidate offloading decisions x_d = π_d(l_i)
6: Allocate bandwidth to each user by the optimal bandwidth allocation scheme (15)
7: Calculate the system utility function values J_d by the candidate offloading decisions
8: and the optimal bandwidth allocation scheme
9: Select the best solution x*_i = arg min_{x_d} J_d and update the memory structure by adding
10: (l_i, x*_i) into the memory structure and output the optimal offloading decisions x*_i
11: in this iteration
12: if i % interval = 0 then
13: Uniformly sample d batches of the training samples from the memory
14: Train the DNNs and update θ_d using a gradient descent algorithm
15: end if
16: Update i to i + 1
17: end for
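The following PyTorch sketch mirrors Algorithm 1 under stated assumptions: the number of parallel DNNs `D`, the memory size, the batch size, the training interval, and the learning rate are illustrative choices not specified at this point in the paper, while the 9-120-80-9 layer sizes match the simulation section.

```python
import random
import torch
import torch.nn as nn

N, M, D = 3, 3, 3                              # users, tasks per user, parallel DNNs (D assumed)
MEMORY_SIZE, BATCH, INTERVAL = 1024, 128, 10   # illustrative hyperparameters

def make_dnn():
    # 9 -> 120 -> 80 -> 9 with ReLU hidden layers and a sigmoid output, as in the paper
    return nn.Sequential(nn.Linear(N * M, 120), nn.ReLU(),
                         nn.Linear(120, 80), nn.ReLU(),
                         nn.Linear(80, N * M), nn.Sigmoid())

dnns = [make_dnn() for _ in range(D)]
opts = [torch.optim.Adam(net.parameters(), lr=1e-3) for net in dnns]
loss_fn = nn.BCELoss()                         # average cross-entropy against stored decisions
memory = []                                    # list of (task sizes, best decisions)

def iteration(i, l_flat, utility_of):
    """One pass of Algorithm 1: generate D candidates, keep the best, train periodically.

    l_flat     : (N*M,) tensor of task sizes
    utility_of : callable mapping a {0,1} decision tensor to the utility J
    """
    with torch.no_grad():
        candidates = [(net(l_flat) > 0.5).float() for net in dnns]  # quantize DNN outputs
    best = min(candidates, key=utility_of)                          # x*_i = argmin_d J_d
    memory.append((l_flat, best))
    if len(memory) > MEMORY_SIZE:
        memory.pop(0)
    if i % INTERVAL == 0 and len(memory) >= BATCH:
        for net, opt in zip(dnns, opts):                            # one sampled batch per DNN
            batch = random.sample(memory, BATCH)
            ls = torch.stack([s[0] for s in batch])
            xs = torch.stack([s[1] for s in batch])
            opt.zero_grad()
            loss_fn(net(ls), xs).backward()
            opt.step()
    return best
```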

Simulation Results
In this section, we verify the superiority of our proposed algorithm through extensive experiments, including an analysis of convergence, a comparative analysis of the system utility function values, and a time cost analysis of the system and the DNNs. We assume that there are three users with computing needs on the ground side, and each user has three computing tasks. The wireless channel follows the free-space path loss model: L_p = 32.4 + 20 lg(D) + 20 lg(F) dB, where the distance D is 200 km and the carrier frequency F is 30,000 MHz. We set the user transmit power p = 10 mW and the noise power σ = 10^{-13} W. The task sizes obey a uniform random distribution on (10 MB, 30 MB), the local processing energy consumption is 3.25 × 10^{-7} J/bit, the transmission energy consumption when offloading is 1.42 × 10^{-7} J/bit, and the total uplink bandwidth of the system is 150 MHz. Each DNN consists of three fully connected layers: the activation functions of the first and second layers are rectified linear units (ReLU), and the activation function of the third layer is the sigmoid function. The weight matrices of the three layers are of sizes 9 × 120, 120 × 80, and 80 × 9; that is, the input layer takes the 9 task sizes (N × M = 9), the two hidden layers contain 120 and 80 neurons, and the output layer produces 9 offloading decisions. The optimizer is the Adam optimizer.
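The channel parameters above translate into a channel gain as follows. This short script reproduces the free-space loss computation; unit antenna gains are an assumption of ours, since the paper does not state them.

```python
import numpy as np

D_km, F_MHz = 200.0, 30_000.0        # link distance and carrier frequency (paper values)

# Free-space path loss in dB with D in km and F in MHz
L_p = 32.4 + 20 * np.log10(D_km) + 20 * np.log10(F_MHz)
h = 10 ** (-L_p / 10)                # linear channel gain, assuming unit antenna gains

p, sigma = 10e-3, 1e-13              # 10 mW transmit power, 1e-13 W noise power
print(f"L_p = {L_p:.2f} dB, h = {h:.3e}, SNR factor h*p/sigma = {h * p / sigma:.3e}")
```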
The effect of the number of training steps on the convergence of the DNNs is shown in Figure 3. It can be seen from Figure 3 that, as the number of training steps increases, the training loss of the DNNs rapidly tends to 0, and the DNNs achieve a good convergence effect within a small number of training steps. We compare against three algorithms: the DDOA algorithm [31], the full edge offloading algorithm, and the random offloading algorithm. The DDOA algorithm uses multiple parallel DNNs to generate offloading decisions but does not consider the optimal bandwidth allocation of the system. The full edge offloading algorithm offloads all computing tasks to the MEC server for execution. The random offloading algorithm randomly assigns each computing task to local computation or to the MEC server.
When we set ϕ and φ to different values, the utility function values of the system are shown in Figures 4 and 5. As the parameter ϕ or φ increases, the system utility function values increase, which is reasonable because the system utility function values are proportional to the parameters ϕ and φ. From Figures 4 and 5, we can see that, among the four algorithms, our proposed algorithm obtains the best system utility function values. Since we generate near-optimal offloading decisions and adaptively allocate the system bandwidth, the simulation results meet our expectations.

Figures 6 and 7 compare the system utility function values of the different offloading algorithms under different LEO satellite MEC server processing rates f_sat and local per-bit processing times t_local. As shown in Figure 6, as the MEC server processing rate f_sat increases, the system utility function values of all four offloading algorithms decrease, because a higher MEC server processing rate reduces the processing time during offloading. The downward trend of the full offloading algorithm is especially obvious because its total processing time on the LEO satellite depends only on the MEC server processing rate. As can be seen from Figure 7, the system utility function values of the other three offloading algorithms increase with t_local, because t_local is the time required by the local devices to process each data bit: a larger t_local means the tasks take more time to process locally. The system utility function values of the full offloading algorithm do not change with t_local because they are independent of the local processing capability, so this result is reasonable. Among the four offloading algorithms, our proposed algorithm achieves the best system utility function values, because it generates the offloading decisions with the lowest system utility function values through the DNNs and allocates the system bandwidth adaptively.

From Figure 8, we can see that our proposed system only needs about 0.12 s on average to complete an iteration. For each input computing task, our DNNs only need about 0.5 ms on average to obtain the corresponding offloading decisions, so the time cost is very satisfactory.

Discussion
As can be seen from the above simulation results, our proposed algorithm converges within a small number of training steps, and its time cost is very satisfactory, because offloading decisions are generated by multiple parallel DNNs, which yields both the desired convergence performance and a low time cost. Compared with the other algorithms, our proposed algorithm obtains the best system utility function values because it not only generates near-optimal offloading decisions through the DNNs but also uses the derived optimal bandwidth allocation scheme.
This work deepens the understanding of MEC servers mounted on LEO satellites, verifies the feasibility of the system by simulating the satellite-to-ground channel environment, and demonstrates the strong performance of the proposed algorithm through simulations. This research provides theoretical support for the future application of edge computing on LEO satellites.
Our future research direction is the application of edge computing in the satellite-ground collaborative network. With the wide coverage and strong computing power of such a network, computing services can be provided to users anywhere on the earth, users' computing requests can be answered quickly, and a better QoE can be delivered.

Conclusions
In this paper, we propose a scenario in which an MEC server is mounted on a LEO satellite to form an edge computing LEO satellite, where user-generated computing tasks can be computed locally or offloaded to the MEC server for execution. We consider a multi-user multi-task scenario, define the system utility function as the weighted sum of system energy consumption and delay, and formulate the optimization problem as minimizing this utility function. We propose a deep learning-based offloading algorithm to obtain near-optimal offloading decisions and allocate bandwidth to each user through the derived optimal bandwidth allocation scheme, thereby obtaining the best system utility function values. The simulation results show that the proposed algorithm obtains the best system utility function values with a low average time cost and achieves a good convergence effect. The satellite-ground collaborative network has a large amount of dynamic, time-varying resources that are difficult to optimize. Our future work is to use deep reinforcement learning, with its good global search capability, to solve the computation offloading problem in the multi-user, multi-satellite, multi-task scenario in the satellite-ground collaborative network, together with the bandwidth allocation problem in this more complex setting.