Research on Task-Oriented Computation Offloading Decision in Space-Air-Ground Integrated Network

Abstract: In Space-Air-Ground Integrated Networks (SAGIN), computation offloading is a new way to improve the processing efficiency of node tasks and to ease the limits on computing and storage resources. To address the large delay and energy consumption cost of task computation offloading, which is caused by the complex and variable network offloading environment and the large volume of offloading tasks, a computation offloading decision scheme based on Markov and Deep Q Networks (DQN) is proposed. First, we select the optimal offloading network based on the movement characteristics of the task offloading process in the network. Then, the task offloading process is transformed into a Markov state transition process to build a model of the computation offloading decision process. Finally, delay and energy consumption weights are introduced into the DQN algorithm to update the computation offloading decision process, and the optimal low-cost offloading decision is obtained according to the task attributes. The simulation results show that, under equal weights and when the offloading task volume exceeds 500 Mbit, the proposed scheme reduces the total delay and energy consumption cost by 68.33% compared with the traditional Lyapunov-based offloading decision scheme and by 11.21% compared with the classical Q-learning algorithm. Moreover, compared with offloading to edge nodes or backbone nodes of the network alone, the proposed mixed offloading model can satisfy more than 100 task requests with low energy consumption and low delay. The computation offloading decision proposed in this paper can therefore effectively reduce the delay and energy consumption of task computation offloading in the Space-Air-Ground Integrated Network environment, and can select the optimal offloading sites to execute the tasks according to the characteristics of the tasks themselves.


Introduction
The Space-Air-Ground Integrated Network (SAGIN) is a large-scale, complex heterogeneous network system formed by interconnecting the space-based information network, the internet, and the mobile communication network, which closely combines the satellite network with the terrestrial network. It collects, transmits, and processes space information in real time to meet the demand for communication services with global coverage. The Space-Air-Ground Integrated Network is a strategic and fundamental critical infrastructure network, which can drive the development of China's emerging industries. In military and civilian information applications and in the era of the Internet of everything, it plays a crucial role in meeting the demand for massive information services and will be the inevitable trend of future network development [1]. The space-ground integrated network is a large, complex, hierarchical heterogeneous network, which mainly consists of three parts: the space-based backbone network, the space-based edge access network, and the ground-based backbone network. Figure 1 shows this architecture.
Figure 1. Architecture of the Space-Earth Integration Network. The structure of the Space-Air-Ground Integrated Network is given from the perspective of network nodes.
The space-based backbone network is the core infrastructure of the space-ground integrated information network and meets the needs of various types of communications through the integration of space and ground networks.
GEO satellites offer high coverage, high transmission rates, and dynamic adaptation, but they are limited by fixed orbits and high task processing delay. Low- and medium-orbit satellites have the advantages of wide distribution and short distance, and ground base stations have strong computing and storage capabilities. Using air, space, and ground resources collaboratively can achieve multi-dimensional strategic information services. However, because the topology of the Space-Air-Ground Integrated Network changes dynamically, the number of tasks to be processed is huge, and the number of satellites is limited, there are many problems, such as the limited computing resources of network nodes and untimely task processing, which cause some important tasks to fail [4]. How to make the network nodes in different orbits cooperate closely and collaborate in depth so that complex or large-scale tasks can be processed efficiently is an urgent problem to be solved.
At present, computation offloading technology [5] is widely used in the Internet of things and in onboard networks. It has become an important means of supporting real-time processing and long endurance for terminal services, mainly by solving the problems of insufficient computing capacity and limited storage resources of computing nodes. Computation offloading is the key technology in Mobile Edge Computing (MEC): a task is offloaded to other network sites with plentiful computational resources and powerful computing capabilities, and after the task is completed, the results are passed back to the offloading terminal device, which saves cost on the terminal device and eases the resource limits of the mobile system [6]. Therefore, in the Space-Air-Ground Integrated Network, computation offloading can be used to decompose compute-intensive tasks. Selecting any computing node in the network that can take on the offloaded task as the offloading site can effectively alleviate the computing and storage pressure on network nodes and improve the processing efficiency of huge and complex tasks. The general process for task computation offloading in the Space-Air-Ground Integrated Network is [7]:
(1) Analyze whether any terminal located in the space-based backbone network, the edge access network, or the ground-based backbone network requires task computation offloading.
(2) Get the resources of the network.
(3) Analyze the state of the wireless channel, and use the computation offloading decision to determine whether the task is executed locally or needs to be offloaded to other service devices for execution.
(4) According to the decision instructions, split the tasks that need to be offloaded into several separate subtasks; the subtasks that do not support offloading are executed by the local terminal, and the subtasks that can be offloaded are offloaded to other network nodes in the Space-Air-Ground Integrated Network.
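Step (4) of the general process can be illustrated with a minimal Python sketch. The `Subtask` class, its `offloadable` flag, and the function name `partition_subtasks` are hypothetical names introduced here for illustration only; they are not taken from the paper or from reference [7].

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subtask:
    name: str
    offloadable: bool  # hypothetical flag: whether this subtask supports offloading


def partition_subtasks(subtasks: List[Subtask], offload: bool) -> Tuple[List[Subtask], List[Subtask]]:
    """Split subtasks into (executed locally, sent to other network nodes).

    `offload` is the outcome of the offloading decision in step (3).
    """
    if not offload:
        # The decision was to execute everything on the local terminal.
        return list(subtasks), []
    local = [s for s in subtasks if not s.offloadable]
    remote = [s for s in subtasks if s.offloadable]
    return local, remote


subtasks = [Subtask("sense", False), Subtask("filter", True), Subtask("classify", True)]
local, remote = partition_subtasks(subtasks, offload=True)
print([s.name for s in local])   # ['sense']
print([s.name for s in remote])  # ['filter', 'classify']
```

The sketch only covers the split itself; resource discovery in step (2) and channel analysis in step (3) are assumed to have happened before `partition_subtasks` is called.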
The design of computation offloading mainly includes offloading decisions and the optimization of algorithms [8]. We consider the different capabilities of the computing nodes in the space-based backbone network, the edge access network, and the ground-based backbone network of the Space-Air-Ground Integrated Network, and study in depth the computation offloading decisions and decision optimization algorithms that suit the integrated network environment, in order to improve the execution efficiency of huge and complex tasks in complex network environments.

Related Research
In recent years, computation offloading decisions [9] that aim to reduce delay and energy consumption have become one of the hot topics of computation offloading research. They mainly consider whether user terminals need to offload, how much to offload, and where to offload. Zhang et al. [10] applied auction theory to transform the computation offloading problem into a matching problem, and proposed a multi-round sealed sequence combined auction mechanism to reduce the delay in the computation offloading process. Kao and Krishnamachari [11] minimized the application delay under a resource utilization constraint, formulated computation offloading as an NP-hard problem, and proposed an FPTAS online learning algorithm to address computation offloading delay in dynamic environments. Kamoun et al. [12] proposed a pre-computed offline solution to optimize the computation offloading process so that the terminal energy consumption could be minimized. Zhao et al. [13] proposed an optional offloading scheme based on a lightweight request and access framework to reduce the energy consumption of devices. In Reference [14], a computation offloading decision based on the Lyapunov function was proposed. However, considering the dynamic topology of the network, this method is not suitable for randomly changing wireless channel models. Du et al. [15] proposed a low-complexity suboptimal algorithm, which used semidefinite relaxation and randomization to obtain the offloading decision for computation offloading problems in mixed fog/cloud computing systems. Liu et al. [16] transformed the computation offloading problem into a convex optimization problem, and proposed an integrated fog and cloud computation offloading strategy, which offloaded a series of applications to nearby fog nodes or the cloud center and reduced the system delay. Ren et al. [17] proposed an optimal joint communication and computing resource allocation algorithm for the resource allocation problem in computation offloading. Further, a segmentation-optimized partial offloading model was used to segment the data and achieve optimal resource matching under low delay. Chen et al. [18] proposed a mixed-integer nonlinear computing task offloading decision, which transformed the low-delay computation offloading problem into the problems of task offloading placement and task resource allocation. This scheme effectively reduced the delay of the computation offloading process in ultra-dense networks. Yu et al. [19] aimed to reduce the delay of the computation offloading process in the actual Internet of things and proposed a fully polynomial-time approximation computation offloading strategy to allocate resources rationally for applications and shorten the computation processing time. In Reference [20], a maximum-energy-saving task priority algorithm was proposed, which divided the tasks that need to be offloaded and applied a greedy algorithm to reduce energy consumption in the computation offloading process. Chen et al. [21] considered the computation offloading problem of mobile edge servers with energy harvesting devices and proposed a task offloading strategy based on the Lyapunov algorithm and a greedy scheduling algorithm. When the server received a task offloading request, a centralized and distributed resource scheduling strategy was adopted to reduce energy consumption. Pham et al. [22] balanced the delay and energy consumption cost, and used the bisection method to allocate computational resources to the mobile terminals that need to offload, while maximizing the number of offloaded tasks and effectively reducing the total system cost.
In Reference [23], delay and energy consumption were minimized by combining the OREO online algorithm and the Lyapunov algorithm to dynamically optimize the server cache and the task computation offloading process in scenarios where the system state is unknown. In Reference [24], a distributed optimization task offloading algorithm for mobile edge computing was proposed to optimize the selection of offloading terminals, the transmission power, and the CPU frequency, in order to reduce system delay and energy consumption. Among the latest computation offloading techniques, Naouri et al. [25] proposed a three-layer task offloading framework named DCC for mobile terminals. It distinguishes between types of application tasks and reduces delay by shunting tasks with higher computational requirements to the cloudlet and cloud layers, and by executing tasks with low computational and high communication cost on the device layer. Zhang et al. [26] investigated task offloading in virtual vehicle servers and proposed a fully distributed, infrastructure-less congestion avoidance and traffic optimization system for vehicular ad-hoc networks in urban environments, called DIFTOS. It divides the city map into servers with a hierarchical structure, and its effectiveness and scalability have been demonstrated in simulations under different traffic conditions. Most of the existing computation offloading decision schemes have been applied in mobile edge computing scenarios for mobile terminals or in-vehicle networks, but they do not consider the time-varying factors of the network, the communication quality of the channel, or the attributes of the tasks. For tasks of very large volume, they are still unable to effectively reduce delay and energy consumption in the task offloading process.
The Space-Air-Ground Integrated Network has a complicated and changeable environment, and the amount of offloading tasks is huge, so the existing computation offloading decision schemes are not applicable. Considering the structure of the Space-Air-Ground Integrated Network, the network channel states, the task characteristics, and the performance of the terminal devices, and combining the Markov state transition principle [27,28] with neural networks, we propose a computation offloading decision method based on Markov and DQN. This method segments the task data, abstracts the computation offloading decision process of the task into the state transition process of a Markov model, uses the improved DQN algorithm to update the model, and finally selects the optimal offloading decision according to the attributes of the tasks and the performance of the offloading sites.

Computation Offloading Decision Process
In this section, based on the Markov decision process principle, the tasks are offloaded to the ground-based backbone nodes, the edge access nodes, or the space-based backbone nodes according to their own locations and requirements. Figure 2 shows the model of the computation offloading decision process based on the Markov decision process. The offloading terminals (offloaders) of the Space-Air-Ground Integrated Network are the terminal devices at various positions in the network, and the offloading sites (offloadees) form the set of computing nodes from which tasks can choose. The terminal devices in the network move at high speed. When high-intensity tasks are processed, the signal strength of the wireless network coverage of the offloading sites weakens because of the prolonged processing time. Therefore, the huge and complex task is divided into several independent subtasks using a divide-and-conquer approach, and the continuous task processing is abstracted into the state transition process of the Markov model to reduce the effects caused by the dynamic change of the position of the terminal devices. Taking the offloading tasks (offloader tasks) in the edge access network as an example, the entire Markov decision process is constructed. Define $V_i^n$ as the i-th task of the current terminal device n.
The locations of the offloading terminals and the offloading execution terminals are not fixed. During task offloading execution, the wireless channel performance from the terminals to the servers in the space-based backbone network, the ground-based backbone network, or the edge access network is constantly changing, and the higher the transmission delay, the more difficult it is to guarantee the transmission quality of the wireless channel. Therefore, it is necessary to fully consider the factors that affect the network status during task offloading execution. To minimize the delay and energy consumption of task computation offloading, the current task offloading network must be kept as the optimal network. When the terminal devices leave the scope of the offloading network in which the offloading task is currently executed, or a new offloading network that meets the offloading conditions is added, the original offloading network will change. At this time, network switching should be considered to complete the task offloading. Figure 3 shows the flow chart of network switching in the process of task computation offloading in the Space-Air-Ground Integrated Network.
Figure 3. Network Switching Flowchart. This figure represents the network selection and switching process during task offloading. Network switching is always present during the whole task offloading process, always keeping the network where the current task is offloaded as the optimal network, so that the delay and energy cost of task offloading are minimized.
When the computational resources of the offloading sites do not meet the offloading conditions, it is necessary to select new offloading sites, perform network switching, or abandon task computation offloading and execute the task on the local terminal device. However, because of the complex environment of the Space-Air-Ground Integrated Network, the offloading network is prone to interruption during the switching process. To avoid network interruption caused by a change of offloading network, the optimal offloading sites must be selected before the computation offloading process starts, so as to construct the optimal offloading network and reduce the probability of network switching in the decision process. Figure 4 shows the mobile model of the task offloading process. In the model, Point O represents the terminal position where the current task of the network is offloaded, Point A represents the initial position of the offloading site when task offloading starts, and Points B, C, and D are positions along the movement of the offloading site. When the offloading site moves to the boundary at D before the task offloading calculation has been completed, the task offloading may fail. Next, we choose the offloading network by calculating the residence time of the offloading sites within the network. From the Friis equation for calculating the transmission loss in free space [29] and the wireless signal strength attenuation model based on cubic spline interpolation [30], the signal strength $P_A$ of the offloading terminal at Point A can be obtained, as shown in Equation (1), which depends on a signal strength reference value and a wireless network signal attenuation factor. In Equation (3), OC is the perpendicular to the offloading terminal's moving path. According to the triangle side length equation, the triangle similarity theorem, and the known triangle side lengths, the distance AD can finally be calculated.
The resident time $T_j^{resident}$ of the terminal device in the wireless network is shown in Equation (4). A task i of the offloading terminal is offloaded to the offloading site j for execution, and the maximum calculation amount $f_{maxgive}$ provided by the offloading site j to the terminal device in the network is calculated as in Equation (5), where $T_i^s$ represents the transmission time of offloading the task i data to the specified offloading site, $T_i^r$ represents the data return time after the task is calculated and processed at the offloading site, and $f_j$ represents the computing capacity of the offloading node j.
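The following Python sketch illustrates how a resident time and a maximum calculation amount could be computed. It is not the paper's Equations (4) and (5), which are not reproduced here. It assumes a circular coverage area, constant-velocity movement, and that the usable computing time is the resident time minus the transmission and return times, multiplied by the computing capacity of the site. The function names and parameters are hypothetical.

```python
import math
from typing import Tuple


def residence_time(pos: Tuple[float, float], vel: Tuple[float, float], radius: float) -> float:
    """Time until a node at `pos` (relative to the coverage centre), moving with
    constant velocity `vel`, leaves a circular coverage area of the given radius.

    Assumption: circular coverage and straight-line movement.
    """
    px, py = pos
    vx, vy = vel
    speed_sq = vx * vx + vy * vy
    dist_sq = px * px + py * py
    if dist_sq > radius * radius:
        return 0.0  # already outside the coverage area
    if speed_sq == 0.0:
        return math.inf  # a stationary node never leaves
    p_dot_v = px * vx + py * vy
    # Positive root of |pos + vel * t| = radius (the exit point of the chord).
    disc = p_dot_v * p_dot_v - speed_sq * (dist_sq - radius * radius)
    return (-p_dot_v + math.sqrt(disc)) / speed_sq


def max_give(t_resident: float, t_send: float, t_return: float, capacity: float) -> float:
    """Assumed form of the maximum calculation amount: the time left for computing
    after transmission and result return, multiplied by the site's capacity."""
    usable = max(0.0, t_resident - t_send - t_return)
    return usable * capacity


# Node enters at (-3, 4) on a circle of radius 5 and moves along +x at unit speed:
# the perpendicular distance to the path is 4, so the chord length is 2 * 3 = 6.
t_res = residence_time((-3.0, 4.0), (1.0, 0.0), 5.0)
print(t_res)                           # 6.0
print(max_give(t_res, 1.0, 0.5, 2.0))  # 9.0
```

In this example the chord geometry mirrors the role of the perpendicular OC in the mobile model: the half-chord follows from the Pythagorean relation between the radius and the perpendicular distance.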
In the task computation offloading decision process, the computing capacity of all offloading sites in the current network is judged according to $T_j^{resident}$ and $f_{maxgive}$; the $f_{maxgive}$ values are sorted, and the optimal offloading network is then selected. Control parameters are used to represent the system state of the Markov decision process model in the offloading network. The entire system state is composed of the set of Markov decision processes. A single Markov decision process includes the system state set, the action set, the state transition probability, the cost function, and the strategy space. At each decision moment of task offloading, the control parameters of the system correspond to the actions in the Markov decision process. The state transition probability of the system controls the dynamic changes of the state and control parameters at each stage. Further, the evaluation metrics for the final task computation offloading are given by the cost function.
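The sorting step can be sketched as follows. The sketch assumes that a site is a feasible candidate when its maximum calculation amount is at least the computing power required by the task; this feasibility rule, the function name `rank_sites`, and the site labels are assumptions made for illustration.

```python
from typing import Dict, List


def rank_sites(max_give_by_site: Dict[str, float], required: float) -> List[str]:
    """Keep the sites whose maximum calculation amount covers the requirement
    and return them sorted from largest to smallest amount."""
    feasible = {site: f for site, f in max_give_by_site.items() if f >= required}
    return sorted(feasible, key=feasible.get, reverse=True)


# Hypothetical amounts for three candidate offloading sites.
candidates = {"edge_1": 9.0, "space_1": 20.0, "ground_1": 4.0}
print(rank_sites(candidates, required=5.0))  # ['space_1', 'edge_1']
```

The first entry of the returned list would then be the preferred site under this simplified rule; an empty list corresponds to the case where no site meets the offloading conditions and the task stays on the local terminal.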
In the three-layer network architecture of the Space-Air-Ground Integrated Network, the offloading sites that meet the offloading condition are selected in the Markov decision process, where $w_i^n$ represents the computing power required for the offloading task. Among the states, the newly added state $x_{0,0}$ is the initial state of task computation offloading, and the state index $i \in [0, k+6+5]$ represents the set of task offloading locations, where $k$ represents the optional offloading sites in the edge access network, 6 represents the offloading sites of the space-based backbone nodes, and 5 represents the offloading sites of the ground-based backbone nodes. The state $s$ is determined by the channel state and the location of the task offloading site. Assuming that the initial channel state is good, the initial state of the system is expressed as , the task computation offloading is completed.

DQN-Based Computation Offloading Decision Algorithm
In the Markov computation offloading decision process, the strategy space contains a very large number of strategies, and the best state-action path must be found as the final optimal decision. The offloading environment performs a state transition according to the action of the terminal selecting a certain offloading site in the network, and the reward value is calculated through the reward function. If the reward value is positive, the terminal is more likely to select that offloading site thereafter; if the reward value is negative, it is less likely to do so. In this process, a large number of state-action pairs are generated. In this section, the DQN algorithm is used to process these state-action pairs, and the convolutional neural network and the Q-learning algorithm are fused to solve the task offloading decision process model, find the best action corresponding to each state, and complete the optimal task computation offloading.
Using the Q-learning algorithm, the intelligent terminal transfers the state of the offloading tasks through the real-time action value and the cumulative reward value. During the task offloading process, the channel state in the network is estimated in advance and the computing capacity of the terminal is monitored. During the Q-learning process, the processing status of the task is represented as a state-action pair $(s, a)$. In the Space-Air-Ground Integrated Network offloading environment, both the offloading terminals and the offloading sites are mobile. Equations (4) and (5) in the Markov computation offloading decision process yield the offloading sites that can theoretically complete the calculation task. Among these sites, the one with the smallest sum of delay and energy consumption cost is selected, and the computation offloading decision taken at this time is optimal. Traditional measures of offloading decision performance treat either delay or energy consumption as a single constraint and minimize the other to select the optimal offloading decision. To balance the cost of delay and energy consumption while taking into account the characteristics of different offloading tasks, two weight parameters [31], each with a maximum value of 1, are set for the cost of the task, representing the weights of delay and energy consumption, respectively. Each task that executes offloading selects the values of the two weights according to its characteristic attributes. For example, for military emergency tasks, the corresponding weight can be set larger.
In the optimization goals of task computation offloading, $m$ represents the number of subtasks of the offloading task $v$ of the terminal device $n$, $i$ represents the number of currently executed subtasks ($0 \le i \le m$), and $C$ represents the weighted total cost of delay and energy consumption. After the above-mentioned optimal strategy is obtained, its total cost must also be compared with the total cost of executing the task locally. If the offloading cost is greater than the local execution cost, the task is executed locally; otherwise, it is offloaded. The edge computing nodes, space-based backbone nodes, and ground-based backbone nodes in the Space-Air-Ground Integrated Network differ in computing and storage performance. Therefore, separate calculation equations for delay and energy consumption are given for the different offloading sites, as shown in Equations (6) to (11), and the reward function is shown in Equation (15). The Q value is updated according to Equations (16) and (17), which involve a learning rate. The larger the learning rate is, the faster the convergence will be. However, the value should not be too large; otherwise, the convergence will be unstable and the optimal computation offloading decision will be affected.
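The weighted cost and the comparison with local execution can be sketched in Python as follows. The linear form of the cost (delay weight times delay plus energy weight times energy) is an assumption consistent with the description of a weighted total cost; the names `w_delay`, `w_energy`, `weighted_cost`, and `choose_execution`, as well as the numeric values, are hypothetical.

```python
from typing import Dict, Tuple


def weighted_cost(delay: float, energy: float, w_delay: float, w_energy: float) -> float:
    """Assumed weighted total cost: w_delay * delay + w_energy * energy."""
    if not (0.0 <= w_delay <= 1.0 and 0.0 <= w_energy <= 1.0):
        raise ValueError("each weight must lie in [0, 1]")
    return w_delay * delay + w_energy * energy


def choose_execution(
    site_costs: Dict[str, Tuple[float, float]],  # site -> (delay, energy)
    local_cost: Tuple[float, float],             # (delay, energy) of local execution
    w_delay: float,
    w_energy: float,
) -> Tuple[str, float]:
    """Pick the cheapest feasible site, then fall back to local execution
    only if offloading is strictly more expensive."""
    totals = {s: weighted_cost(d, e, w_delay, w_energy) for s, (d, e) in site_costs.items()}
    best_site = min(totals, key=totals.get)
    local_total = weighted_cost(local_cost[0], local_cost[1], w_delay, w_energy)
    if totals[best_site] > local_total:
        return "local", local_total
    return best_site, totals[best_site]


# Hypothetical (delay, energy) pairs for two sites and for local execution.
sites = {"edge": (2.0, 4.0), "backbone": (0.5, 8.0)}
local = (6.0, 2.0)
print(choose_execution(sites, local, 0.5, 0.5))    # ('edge', 3.0)
print(choose_execution(sites, local, 0.75, 0.25))  # ('backbone', 2.375)
```

With equal weights the edge site is cheapest, while a larger delay weight shifts the choice to the backbone site with the shorter delay, which is the kind of task-dependent behavior the weights are meant to produce.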
Since Q-learning cannot store all possible states of the complex Space-Air-Ground Integrated offloading network in a Q table, a neural network is introduced to estimate the $Q(s, a)$ values generated in Q-learning, which reduces the state space in the computation offloading decision process and accelerates the convergence of the offloading process. Equations (16) and (17) are used to calculate the $Q(s, a)$ value of a partial action of a single task during the task offloading decision period t, which is stored as an experience value in the memory bank Ω and used as the training data set of the convolutional neural network. Then the vector containing the $Q(s, a)$ value of each action is output through two fully connected layers of the network. The neural network is used to convert a single Q value into a Q network, and then backpropagation and gradient descent are used to minimize the loss function, as shown in Equation (18). Finally, Equations (8) and (11) are used to choose the optimal offloading decision to execute the task.
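The ingredients described above can be sketched in Python using only the standard library. The sketch uses the textbook Q-learning update and a mean squared temporal-difference loss, which may differ in detail from the paper's Equations (16) to (18); the discount factor `gamma`, the class `ReplayMemory`, and all function names are assumptions, and no neural network is trained here.

```python
import random
from collections import deque
from typing import List, Sequence, Tuple

Transition = Tuple[int, int, float, int, bool]  # (state, action, reward, next_state, done)


class ReplayMemory:
    """Fixed-size memory bank of past transitions, sampled at random."""

    def __init__(self, capacity: int) -> None:
        self.buffer = deque(maxlen=capacity)  # oldest experience is dropped first

    def store(self, transition: Transition) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)


def td_target(reward: float, next_q_values: Sequence[float], gamma: float, done: bool) -> float:
    """Textbook target: r + gamma * max_a' Q(s', a'), or r at a terminal state."""
    return reward if done else reward + gamma * max(next_q_values)


def q_update(q: float, target: float, learning_rate: float) -> float:
    """Textbook Q-learning step: move Q(s, a) toward the target."""
    return q + learning_rate * (target - q)


def mse_loss(predicted: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between predicted Q values and their targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(predicted)


target = td_target(1.0, [0.5, 2.0], gamma=0.5, done=False)
print(target)                             # 2.0
print(q_update(0.0, target, 0.5))         # 1.0
print(mse_loss([1.0, 3.0], [2.0, 1.0]))   # 2.5
```

Random sampling from the memory is what breaks the correlation between consecutive experiences; in a full DQN the `mse_loss` value would be minimized by gradient descent over the network weights rather than computed on fixed lists.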
The DQN computation offloading algorithm uses a deep convolutional neural network to nonlinearly approximate the action-value function in Q-learning during the computation offloading decision process, adopts the experience replay method to train the Q-learning process, and randomly extracts experience data from the state space to learn. This method breaks the correlation between past experiences so that it can quickly find the best task computation offloading decision path in the strategy space of the Space-Air-Ground Integrated Network. The specific computation offloading algorithm process is shown in Algorithm 1.
Algorithm 1 Computation Offloading Algorithm Based on DQN
Input: task amount; training cycle; the offloading terminal state-action value; location of the offloading terminals and the offloading sites; movement speed of the offloading terminals;
Output: total cost of task offloading C
1. initialize the computation offloading service request of the offloading terminals; initialize the offloading terminal state-action value and read the task amount;
2. calculate the resident time of the offloading terminal inside the network.
complete the task computation offloading;
23. send the data to the offloading terminal;
24. end if
25. until the task amount is exhausted;
26. all task data processing completed.

Results
The Space-Air-Ground Integrated Network simulation scene is set as follows. According to the 6 + 5 + x structure of the Space-Air-Ground Integrated Network, two space-based backbone nodes, two ground-based backbone nodes, and six edge access nodes were selected for simulation. That is, there are one offloading terminal and ten computation offloading sites, among which four simulate the computation offloading sites of space-based or ground-based backbone nodes (B_offloadee), and the rest are computation offloading sites of edge computing nodes (E_offloadee), which are randomly distributed at different latitudes. All nodes are connected through the wireless network. The factors that affect the task computation offloading decision include the offloading terminal, the offloading sites, and the attributes of the task itself, as shown in Table 1. Considering the diversity of offloading sites among space-based backbone nodes, ground-based backbone nodes, and edge access nodes in the network, the performance of the ten offloading sites is set as shown in Table 2. The simulation in this paper used Python 3.7 for algorithm simulation and analysis, and the code depends on libraries including TensorFlow, Keras, NumPy, SciPy, Matplotlib, and CUDA. We used Spyder, the integrated development environment (IDE) that comes with Anaconda, to perform the simulation verification, and the specific results are as follows.
Firstly, taking the weighted total cost of delay and energy consumption as the indicator, the optimization performance of the DQN computation offloading algorithm is compared with that of traditional computation offloading algorithms, including Lyapunov-based methods and classical Q-learning-based methods. Figure 5 compares the total cost when the delay and energy consumption weights are 0.5 each, which corresponds to offloading tasks that need to balance delay and energy consumption. Figure 6 compares the total delay and energy cost of the three computation offloading methods for delay-sensitive offloading tasks. The curves of the three computation offloading methods in Figures 5 and 6 show an increasing trend, indicating that the total delay and energy consumption cost of task offloading in the integrated network increases with the amount of tasks to be offloaded, while the offloading strategy based on the DQN computation offloading algorithm proposed in this paper can significantly reduce the delay and energy consumption cost. Secondly, using the number of network switches as the indicator, we compare and analyze the impact of the DQN computation offloading method, the Lyapunov algorithm, and the Q-learning algorithm on the task offloading decision process in the Space-Air-Ground Integrated Network environment. In the process of offloading, network switching increases the cost of task execution. To reduce the cost of delay and energy consumption, the number of network switches should be kept as small as possible. The results of this comparison are shown in Figure 7. The curves of all three strategies show an increasing trend, that is, the number of network switches performed by all three strategies gradually increases as the number of tasks increases in the Space-Air-Ground Integrated Network.
The Lyapunov-based computation offloading strategy always has the highest number of network switches, and the DQN-based computation offloading strategy proposed in this paper has a significant advantage when the number of offloading tasks is high, with the lowest number of network switches and thus the lowest additional cost during the computation offloading of tasks. Then, using delay and energy consumption as indicators, and according to the characteristics of the three-layer architecture of the Space-Air-Ground Integrated Network and the terminal's dynamic movement, we verify the effectiveness of the Markov computation offloading decision process in making full use of the three types of computing nodes in the network. Task offloading is classified into three modes: offloading everything to edge computing nodes, offloading everything to backbone nodes, and mixed computation offloading across edge and backbone nodes. Figure 8 compares the energy consumption cost of mixed computation offloading under a given work request volume with that of the modes in which one particular type of offloading site is selected alone, while Figure 9 compares the delay cost. The curves for all three task offloading modes show an upward trend as the workload (i.e., the number of task requests) increases. The cost of the mixed computation offloading mode proposed in this paper rises slowly compared with the mode that offloads everything to the backbone nodes, while the mode that offloads everything to the edge nodes tends to saturate when the workload is high. Therefore, the mixed computation offloading mode across edge and backbone nodes in this paper has lower energy and delay cost.
Based on the simulation results for the algorithm comparison, the number of network switches, and the task offloading mode, the DQN-based offloading decision scheme can be applied to the Space-Air-Ground Integrated Network, which has a large number of tasks to be offloaded, a complex offloading environment, and limited network resources, and it can effectively reduce the delay and energy consumption during the task offloading process.
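The weighted total cost used as the indicator above can be illustrated with a minimal sketch. It assumes that the delay and energy terms have already been normalized to comparable scales and that the two weights sum to 1; neither assumption is stated in this section, and the names `OffloadOutcome` and `weighted_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffloadOutcome:
    """Delay and energy of one offloading decision (assumed normalized)."""
    delay: float
    energy: float


def weighted_cost(outcome, delay_weight=0.5):
    """Weighted sum of delay and energy.

    Assumption: the energy weight is 1 - delay_weight, so delay_weight=0.5
    gives the equal-weight case and a larger value models a
    delay-sensitive task.
    """
    if not 0.0 <= delay_weight <= 1.0:
        raise ValueError("delay_weight must lie in [0, 1]")
    energy_weight = 1.0 - delay_weight
    return delay_weight * outcome.delay + energy_weight * outcome.energy


# Hypothetical outcome: the same decision costs less for a delay-sensitive
# task when its delay term is the smaller of the two.
outcome = OffloadOutcome(delay=2.0, energy=4.0)
balanced_cost = weighted_cost(outcome)                          # equal weights
delay_sensitive_cost = weighted_cost(outcome, delay_weight=0.8)
```

With this form, changing `delay_weight` is the only adjustment needed to move between the balanced setting of Figure 5 and a delay-sensitive setting such as that of Figure 6.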

Discussion
The main objective of this paper is to address the low computation offloading efficiency and the high delay and energy consumption cost in the Space-Air-Ground Integrated Network, which result from the complex and variable network offloading environment, the large volume of offloading tasks, and the limited computational and storage resources of the network nodes, and to investigate a computation offloading solution that can adapt to these characteristics of the integrated network. In the following, we discuss and analyze the findings on the computation offloading algorithms, the number of network switches, and the task offloading modes, and then explain and validate the effectiveness and applicability of our proposed DQN-based computation offloading decision scheme.
For the comparison of computation offloading algorithms, the Lyapunov algorithm is the most widely used in computation offloading technology; it can maintain the stability of the offloading system when the volume of tasks increases and prevent the system performance from deteriorating sharply. The classical Q-learning algorithm can solve the Markov decision process model by maximizing the cumulative reward obtained by state-action pairs, but it has limited applicability and is inefficient in a large-scale state space. We compared the system cost of these three computation offloading algorithms for different volumes of offloading tasks to verify the effectiveness of the proposed algorithm in updating the computation offloading decision process in the Space-Air-Ground Integrated Network. We set different delay and energy consumption weights for the tasks to be offloaded on the offloading terminal. As can be seen from Figures 5 and 6, for task offloading requirements with equal weighting of delay and energy consumption, and for offloading task volumes above 500 Mbit, the proposed computation offloading strategy reduces the total cost by 68.33% on average compared with the traditional Lyapunov-based computation offloading decision scheme, and by 11.21% compared with the Markov decision method based on the classical Q-learning algorithm. Further, for delay-sensitive offloading tasks, in which the delay weight is greater than the energy consumption weight, all three computation offloading algorithms can effectively reduce the total cost of the offloading strategy. The total cost of the offloading strategy under all three algorithms rises as the volume of offloading tasks increases.
In contrast, the upward trend of the DQN computation offloading algorithm slows down significantly, which effectively reduces the total delay-energy consumption cost of the task offloading strategy. In Figure 6, when the offloading task volume is less than 160 Mbit, the total cost of the offloading strategy under the DQN computation offloading algorithm is slightly higher than that under the Q-learning algorithm and close to that under the Lyapunov algorithm. This is because the DQN computation offloading algorithm is more complex than the other two algorithms, so its overhead is noticeable at small task volumes. However, the volume of tasks to be offloaded in the Space-Air-Ground Integrated Network is huge, and in that regime the offloading strategy based on the DQN computation offloading algorithm effectively reduces delay and energy consumption.
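The idea of solving a Markov decision process through the maximum cumulative reward can be sketched with tabular Q-learning on a toy offloading problem. This is an illustration only, not the DQN scheme of this paper: the table stands in for the neural network, and the three-segment task, the per-segment costs, the two sites, and the switch penalty are all hypothetical values chosen for the example.

```python
import random

SITES = ("edge", "backbone")
# Hypothetical weighted delay-energy cost of each task segment at each site.
SEGMENT_COST = [
    {"edge": 1.0, "backbone": 3.0},
    {"edge": 2.0, "backbone": 1.5},
    {"edge": 1.0, "backbone": 3.0},
]
SWITCH_PENALTY = 1.0  # hypothetical extra cost when the site changes


def step_cost(segment, previous_site, site):
    """Cost of running one segment, plus a penalty if the site changed."""
    cost = SEGMENT_COST[segment][site]
    if previous_site is not None and previous_site != site:
        cost += SWITCH_PENALTY
    return cost


def train_q_table(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Undiscounted tabular Q-learning; state = (segment, previous site)."""
    rng = random.Random(seed)
    q = {}  # (segment, previous_site, site) -> estimated return

    def best_value(segment, previous_site):
        if segment == len(SEGMENT_COST):  # terminal state: nothing left
            return 0.0
        return max(q.get((segment, previous_site, s), 0.0) for s in SITES)

    for _ in range(episodes):
        previous_site = None
        for segment in range(len(SEGMENT_COST)):
            if rng.random() < epsilon:  # explore
                site = rng.choice(SITES)
            else:  # exploit the current estimates
                site = max(SITES,
                           key=lambda s: q.get((segment, previous_site, s), 0.0))
            reward = -step_cost(segment, previous_site, site)
            target = reward + best_value(segment + 1, site)
            key = (segment, previous_site, site)
            old = q.get(key, 0.0)
            q[key] = old + alpha * (target - old)
            previous_site = site
    return q


def greedy_policy(q):
    """Follow the highest-valued site for each segment in turn."""
    policy, previous_site = [], None
    for segment in range(len(SEGMENT_COST)):
        site = max(SITES, key=lambda s: q.get((segment, previous_site, s), 0.0))
        policy.append(site)
        previous_site = site
    return policy


learned_policy = greedy_policy(train_q_table())
```

In this toy setting the cheapest site for the second segment taken in isolation is the backbone, but choosing it would incur two switch penalties, so a policy that maximizes cumulative reward keeps all three segments at the edge. The table grows with every distinct state, which is the scaling limitation of Q-learning that motivates replacing it with a learned approximator such as DQN.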
For the comparison of the number of network switches during task offloading under different computation offloading methods, as shown in Figure 7, the Lyapunov-based computation offloading method has the highest number of switches and the steepest upward trend. When the offloading task volume is small, the Q-learning-based computation offloading method has the fewest network switches; when the offloading task volume exceeds 130 Mbit, the DQN-based computation offloading method has the fewest. The Lyapunov algorithm lacks the learning capability of the Q-learning and DQN algorithms and cannot monitor the offloading environment in real time to learn and update its policy, which results in the highest number of network switches during task offloading. Therefore, compared with the Q-learning and DQN algorithms, the Lyapunov algorithm is not suitable for the complex Space-Air-Ground Integrated Network. When the number of offloading tasks is small, the learning environment of the offloading network is relatively simple, and the Q-learning algorithm benefits from its lower complexity relative to the DQN algorithm. However, considering the complexity, the dynamic characteristics, and the large number of tasks in the Space-Air-Ground Integrated Network, the DQN-based computation offloading method can select the optimal computation offloading decision and offloading network, ensuring fewer network switches and effectively reducing the additional cost of the computation offloading process.
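The switch count compared above can be made concrete with a short sketch. It assumes that an offloading run is recorded as the ordered list of networks selected for successive task segments and that a switch is any change between consecutive selections; the function name and the network labels are hypothetical.

```python
def count_network_switches(selected_networks):
    """Count how often consecutive offloading steps use different networks."""
    return sum(
        1
        for previous, current in zip(selected_networks, selected_networks[1:])
        if previous != current
    )


# Hypothetical trace of one offloading run across the three network layers.
trace = ["ground", "ground", "air", "space", "space", "air"]
switches = count_network_switches(trace)
```

For the trace shown, the changes are ground to air, air to space, and space to air, so `switches` is 3. If each switch adds a fixed cost, this count enters the total cost directly, which is why a policy that learns to avoid unnecessary switches lowers the additional cost.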
For the three task offloading modes of the Space-Air-Ground Integrated Network, as shown in Figure 8, edge-only computation offloading has the lowest energy consumption for workloads between 20 and 80 requests, where it is close to that of mixed computation offloading across edge and backbone nodes, but it saturates when the workload exceeds 80 requests. Backbone-only computation offloading has the highest energy consumption and the fastest growth. In contrast, the mixed computation offloading mode across edge and backbone nodes keeps energy consumption low while sustaining high workloads, which effectively meets the demands of a large number of offloading tasks in the Space-Air-Ground Integrated Network. Figure 9 shows that the mixed mode across edge and backbone nodes has the lowest delay cost, and that its delay cost increases more slowly than that of the other offloading modes as the workload increases. Specifically, when the number of network work requests exceeds 100, the proposed mixed offloading approach is 61.54% lower in energy consumption and 71.01% lower in delay cost than full offloading to the network backbone nodes, and it is able to satisfy a large number of task offloading requests without saturation. Therefore, mixed computation offloading across edge and backbone nodes is more suitable for the task offloading environment of the Space-Air-Ground Integrated Network than offloading to a single type of site.
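Percentages such as "61.54% lower" are read here as relative reductions against the backbone-only baseline. The formula is not given in this section, so the following sketch assumes the usual definition, and the input values are hypothetical, not measured results.

```python
def relative_reduction(baseline_cost, proposed_cost):
    """Percentage by which proposed_cost is lower than baseline_cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (baseline_cost - proposed_cost) / baseline_cost


# Hypothetical costs: a baseline of 100 units against 38.46 units for the
# mixed mode corresponds to a reduction of about 61.54 percent.
energy_reduction = relative_reduction(baseline_cost=100.0, proposed_cost=38.46)
```

Under this reading, a 61.54% reduction means the mixed mode consumes a little over a third of the backbone-only energy at the same workload.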

Conclusions
In the Space-Air-Ground Integrated Network, the offloading environment is complex and variable, the volume of offloading tasks is huge, and nodes in different orbits differ in storage resources and computing capacity. In view of this, this paper theoretically analyzes the status of the offloading terminals and the offloading sites and the selection of the offloading network, and proposes a computation offloading decision scheme based on the Markov decision process and the DQN algorithm. Specifically, the scheme segments the task data, selects the optimal offloading network and the optimal offloading sites, transforms the task processing into the state transition process of the Markov model, and uses the DQN computation offloading algorithm to optimize the Markov decision process and seek the optimal offloading strategy. The experiments show that, when the offloading task volume exceeds 500 Mbit and delay and energy consumption are weighted equally, the proposed computation offloading strategy reduces the total cost by 68.33% on average compared with the traditional Lyapunov-based computation offloading decision scheme and by 11.21% on average compared with the Markov decision method based on the classical Q-learning algorithm. When the offloading task volume exceeds 130 Mbit, the number of network switches required by the proposed scheme to select the optimal computation offloading decision and offloading network is significantly lower than that of the Lyapunov-based and classical Q-learning-based computation offloading methods. Moreover, when the number of network work requests exceeds 100, the proposed mixed offloading approach can satisfy a large number of task offloading requests, and its energy cost and delay cost are 61.54% and 71.01% lower, respectively, than those of full offloading to the network backbone nodes, while it avoids the saturation observed when offloading only to the edge nodes.
Therefore, the proposed computation offloading scheme is suitable for the Space-Air-Ground Integrated Network: it reduces the number of offloading network switches, effectively reduces the delay and energy consumption cost of task computation offloading, and improves network resource utilization.
The Space-Air-Ground Integrated Network represents the trend of future networks: a large system that integrates communication and aerospace telemetry to achieve full network coverage. In previous research, computation offloading techniques that enable efficient task processing have been applied to mobile phone terminals and vehicle terminals. In the future, as terminal devices continue to develop and user demands grow, all terminal devices will become part of the Space-Air-Ground Integrated Network, so the rational application of computation offloading techniques to this network is of great theoretical significance and practical value. In this paper, the existing computation offloading techniques are improved and applied to the Space-Air-Ground Integrated Network, starting from the structural characteristics of the network. On the one hand, according to the delay and energy consumption requirements of task offloading in the Space-Air-Ground Integrated Network, tasks can be matched with offloading sites to effectively reduce delay and energy consumption while meeting those requirements. On the other hand, the mixed computation offloading model is able to cope with the huge volume of task processing requests expected in the future Internet of Everything era under limited network resources. In principle, it can contribute significantly to the efficiency of task processing, thus addressing the challenges that the inherent shortcomings of the Space-Air-Ground Integrated Network pose to the future development of the Internet.